ftl_mason | ildikov: Based on the installer logs, it appears that the manifest for k8s dashboard was referencing cert-manager where kubernetes-dashboard was expected. | 00:09 |
---|---|---|
ftl_mason | I'm going to create a Launchpad account and log an issue there. | 00:10 |
ildikov | Oh, interesting | 00:10 |
ildikov | Thanks for looking into it! | 00:10 |
ftl_mason | However, it doesn't look like I'm out of the woods yet. I was able to login to the controller's web UI, but when I got there and reviewed the alerts it appeared there was an issue with the controller. It recommended that I lock and unlock the controller, which I did. This triggered a reboot, but the VM won't come back up. It's just stuck at the bootloader. | 00:11 |
ftl_mason | I'm running VirtualBox on Ubuntu 22.04. I don't know if the problem is VirtualBox or StarlingX, but thus far my experience setting up a controller has been really poor. I'm 3 full days into this and still don't have a functioning controller. | 00:13 |
ftl_mason | I'm not sure in which direction I should head now. I'm thinking that perhaps the VM route is a dead end right now and I should just try bare-metal? I think I have enough resources kicking around my office that I can build an AIO-SX system on bare metal and hopefully spin up an AIO-SX subcloud on another machine. | 00:16 |
ftl_mason | However, I'm struggling a bit to understand exactly what the network interface configuration needs to look like to accomplish that. I've seen some suggestions that it can all be done on a single NIC with VLANs, some suggestions that I need multiple discrete networks and some other permutations. I'll try going through the documentation again and see if I can get this working on bare metal. | 00:19 |
ftl_mason | Right now the main driver behind this effort is simply to find out what the real-world minimum requirements are for an AIO-SX subcloud. The documentation suggests that I need a minimum of a Xeon-D and a lot of RAM, yet I heard from a few people at the conference that it'll run in a VM with as little as 8GB of RAM and 4 vCPUs. I want to know what I can actually get away with. | 00:23 |
ftl_mason | I have to run out for a few min. If anyone responds, I'll continue the conversation shortly. | 00:24 |
ftl_mason | I'm back. | 00:44 |
ildikov | Hmm, that’s not good :( | 01:05 |
ildikov | I’ve never done the install myself | 01:05 |
ildikov | OutBackDingo: have you or your team ever done the virtual setup? ^^ | 01:06 |
ildikov | Or maybe able to suggest something on network config for bare metal? | 01:06 |
OutBackDingo | yes we have on kvm, and then i filed patches for it, which i could never complete | 01:06 |
OutBackDingo | since then its been "closed" | 01:07 |
ildikov | Ok, so not everything got fixed then | 01:07 |
ildikov | I know Bruno’s team picked some of that up, but I think he signed off for the day already | 01:08 |
OutBackDingo | no... but we did have it up on kvm, it was a matter of only changing the machine model for alternative operating system ... ie... fedora | 01:08 |
OutBackDingo | let me finish what im into and read the conversation, see if i can help | 01:09 |
OutBackDingo | give me like 30 mins | 01:09 |
ildikov | Sounds good, thank you! | 01:09 |
ftl_mason | OutBackDingo: Thank you! | 01:09 |
OutBackDingo | ftl_mason: so first whats the environment kvm or vbox ? | 01:10 |
OutBackDingo | VirtualBox on Ubuntu 22.04 | 01:11 |
ftl_mason | I'm willing to do whatever works. Both Eddy and Bruno have recommended the VirtualBox route, as it seems like that's where most of the work has been. | 01:11 |
OutBackDingo | pffft :) | 01:11 |
OutBackDingo | to be quite honest, the kvm way just worked out of the box.... previously, not sure why any of that would have changed as all their work was vbox based | 01:12 |
ftl_mason | It's possible it's just me doing something odd or misunderstanding the documentation. | 01:13 |
OutBackDingo | and what git repo did you use | 01:13 |
ftl_mason | https://opendev.org/starlingx/virtual-deployment | 01:13 |
OutBackDingo | and what doc ? | 01:14 |
OutBackDingo | just so i can see what you followed | 01:14 |
ftl_mason | I also tried this one https://github.com/zbsarashki/stx-labs-openInfraVancouver2023/tree/main | 01:14 |
ftl_mason | The URL I just posted shows the instructions that I followed from that repo. | 01:15 |
ftl_mason | These are the instructions that I followed for the Pybox install. This install worked just fine, but when I locked and unlocked the VM it triggered a reboot and the VM wouldn't boot after that. https://opendev.org/starlingx/virtual-deployment/src/branch/master/virtualbox/pybox | 01:16 |
ftl_mason | I haven't played around with VirtualBox to see if I can get the VM to boot again. | 01:16 |
OutBackDingo | https://github.com/zbsarashki/stx-labs-openInfraVancouver2023/tree/main/libvirt looks viable... however you also have already installed vbox, which honestly i dont use... never saw the need for it as kvm just worked | 01:17 |
OutBackDingo | and this is the repo and guide that we based our fedora deployment from https://github.com/zbsarashki/stx-labs-openInfraVancouver2023/tree/main/libvirt | 01:19 |
OutBackDingo | so it should also work fine on ubuntu out of the box | 01:19 |
ftl_mason | I followed these instructions for my first attempt https://docs.starlingx.io/r/stx.8.0/deploy_install_guides/release/virtual/aio_simplex.html | 01:19 |
ftl_mason | Which really didn't work. | 01:19 |
ftl_mason | I must be missing something really obvious then. | 01:20 |
OutBackDingo | mmm possibly... | 01:21 |
OutBackDingo | let me try something right quick | 01:21 |
ftl_mason | Sure | 01:21 |
OutBackDingo | ok... | 01:54 |
OutBackDingo | this works so far https://github.com/zbsarashki/stx-labs-openInfraVancouver2023/tree/main/libvirt | 01:54 |
OutBackDingo | ./setup_configuration.sh -i /var/lib/libvirt/images/pool/starlingx-intel-x86-64-cd.iso -c simplex | 01:55 |
OutBackDingo | and you have to change stxbr to madbr | 01:56 |
OutBackDingo | in the setup_configuration.sh | 01:56 |
OutBackDingo | ```❯ brctl show | 01:56 |
OutBackDingo | bridge name      bridge id          STP enabled  interfaces | 01:56 |
OutBackDingo | br-98a1f1554240  8000.0242824763d4  no           veth143a4e4 | 01:56 |
OutBackDingo | docker0          8000.02429a98242f  no | 01:56 |
OutBackDingo | madbr1           8000.460c19283202  no           vnet54 | 01:56 |
OutBackDingo | madbr2           8000.3eac164718e6  no           vnet55 | 01:56 |
OutBackDingo | madbr3           8000.ba6c78ed84cc  no           vnet56 | 01:56 |
OutBackDingo | madbr4           8000.929c5f4a71a8  no           vnet57 | 01:56 |
OutBackDingo | virbr0           8000.525400ca2351  yes``` | 01:56 |
ftl_mason | Ok, I'll try that later this evening. I'll let you know how I make out. Thanks for your help! | 01:57 |
brunomuniz | Ok, I had to go through the logs to make sure I read everything http://eavesdrop.openstack.org/irclogs/%23starlingx/ | 07:13 |
brunomuniz | Regarding the VM being stuck at the bootloader, I remember some reports related to that and us switching mostly to graphical install type to avoid that (via the "--install-mode graphical" parameter). IIRC on existing VMs, connecting to the serial port via "socat" would also unstick the VM. =| | 07:17 |
haoii | how do I connect to the serial port via socat? | 07:20 |
brunomuniz | Something like "socat <address> stdio,raw,escape=0x1d,echo=0,icanon=0" | 07:24 |
brunomuniz | You can find the address on the virtualbox config (either via CLI or GUI) | 07:25 |
brunomuniz | "vboxmanage list vms --long | grep 'UART 1'" to take a quick look at what you might have there. | 07:27 |
haoii | vboxmanage list vms --long | grep 'UART 1' UART 1: I/O base: 0x03f8, IRQ: 4, attached to pipe (server) '/tmp/STX8-AIOSX_serial', 16550A UART 1: I/O base: 0x03f8, IRQ: 4, attached to pipe (server) '/tmp/hli_StarlingX-controller-0_serial', 16550A | 07:28 |
haoii | it is the bottom one I am trying to fix, the VirtualBox starts but it stuck in black screen. | 07:29 |
brunomuniz | Try the socat with this /tmp/hli_StarlingX-controller-0_serial address and hit enter when it connects (it should just hang there for a while) | 07:30 |
haoii | I am sorry, this is not my field of competence. There will be some dumb questions, but I do not understand what the serial address is. | 07:34 |
brunomuniz | There might be some dumb answers as well :) | 07:40 |
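To make the "address" less mysterious: it is the pipe path that VirtualBox prints in the UART line. A minimal sketch for pulling those paths out of the `vboxmanage list vms --long` output follows. The helper name `uart_pipe_paths` is hypothetical (not part of VirtualBox or pybox), and it assumes the serial port is attached as a host pipe in server mode, as in the output pasted above.

```shell
# Sketch: print the host-pipe path of serial port 1 for each VM.
# uart_pipe_paths is a hypothetical helper name, not a VirtualBox or pybox command.
# It reads `vboxmanage list vms --long` output on stdin, one path per output line.
uart_pipe_paths() {
    grep 'UART 1' \
        | grep -o "attached to pipe (server) '[^']*'" \
        | sed "s/^attached to pipe (server) '//; s/'\$//"
}

# Usage on a host with VirtualBox installed:
#   vboxmanage list vms --long | uart_pipe_paths
```

Each printed path is what gets handed to socat as the serial address.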
brunomuniz | Think of it as just a way to interact with the VMs console. | 07:40 |
brunomuniz | (the same thing that you usually see when you start a VM in VirtualBox and it opens up in a window - if you're on a GUI environment) | 07:41 |
haoii | Ah, I understand. It seems that socat needs two addresses, what is the second address I should give it? | 07:44 |
brunomuniz | I just do an address and then a set of parameters for the connection | 07:49 |
brunomuniz | In my case, for example, I just did "socat TCP4:localhost:10001 stdio,raw,escape=0x1d,echo=0,icanon=0"... | 07:51 |
brunomuniz | For you it should be "socat /tmp/hli_StarlingX-controller-0_serial stdio,raw,escape=0x1d,echo=0,icanon=0" | 07:52 |
brunomuniz | (the escape sequence would be CTRL+] then <enter>, so you don't get stuck in yet another place) | 07:53 |
haoii | I still got a black screen. Have not gotten anything on the screen on this one. Have one which I tried with libvirt, but there I have other problems:p | 07:59 |
brunomuniz | It takes some time to restart. If you connect to the serial port via socat then restart the VM (on another terminal or via GUI) do you see anything? | 08:01 |
brunomuniz | (sorry, brb) | 08:04 |
brunomuniz | Regarding the virtualization tool, we're doing most of the work in VirtualBox because we got feedback from the community in at least two different meetings back in March/April that it's what most devs use, so it made sense for us to put most of our work there (also being the one with an existing automation that hadn't been touched in a while) | 08:27 |
haoii | The socat then stopped. It was running as long as the VM was running, but did not give anything back. | 08:27 |
brunomuniz | But I don't have any favorites. In a perfect world we would have a customizable automation/installation tool that could interface with either virtualbox or libvirt on the underlying OS. | 08:28 |
brunomuniz | Were you able to reconnect right after the VM restarted? | 08:30 |
brunomuniz | It didn't show anything even with the VM booting up? That's weird. | 08:30 |
brunomuniz | If you try VirtualBox again, try the pybox thing with "--install-mode graphical". This solved problems like this for us before, although we're not sure exactly what the problem was, tbh. | 08:32 |
haoii | okay, I might try that then. Thanks! | 08:34 |
haoii | Now I remembered my PyBox error: 2023-07-13 10:43:41,326: Expecting text within 3.0 minutes: Press | 08:47 |
haoii | this step fails, as no text appears on the screen. | 08:47 |
haoii | Can not find any error within the VirtualBox GUI | 08:47 |
haoii | Is this known to anyone, and is there a possible cause for it? | 08:48 |
haoii | This time I ran the installation with --install-mode graphical instead of serial | 08:51 |
haoii | still had the same error, so assume it is not related to the installation mode | 08:51 |
haoii | been following this guide, which was linked here one of the previous days: https://opendev.org/starlingx/virtual-deployment/src/branch/master/virtualbox/pybox#installation-and-usage | 09:02 |
brunomuniz | This comes from a function that expects a given text to appear on the console. There's a few calls that use the 3 minutes timeout, mostly related to logging in for the first time and then changing the password. Can you paste the logs from a few lines above so we know what the code was doing? | 10:06 |
brunomuniz | Have you defined the password with "--password <something>"? | 10:06 |
brunomuniz | The instructions (if you copy and paste) will use an environment variable called $STX_INSTALL_PASSWORD. | 10:07 |
brunomuniz | The current version, however, does some basic validation of the password (size, special chars etc) to make sure it will be accepted later by the OS, so I'm thinking it should be something else. | 10:14 |
brunomuniz | The logs right above this error should point us to what specifically is failing. | 10:15 |
daniel-caires | About the VM not booting, it seems to be a problem with VirtualBox when a Host Pipe is used. A review is currently in progress that will change the serial port from host pipe to TCP ( https://review.opendev.org/c/starlingx/virtual-deployment/+/887301v ). But you can do this manually: once you change to TCP, put a port as the path, then your VM should boot normally | 11:10 |
daniel-caires | As Bruno said using graphical as --install-mode you will see the VM booting otherwise it will be blank for about 6 minutes while it boots | 11:12 |
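The manual switch from host pipe to TCP can also be done from the CLI instead of the GUI. The sketch below only prints the command so it can be reviewed before running it. It assumes the `--uartmode1 tcpserver <port>` form of `vboxmanage modifyvm` (check `vboxmanage modifyvm --help` on your version) and that the VM is powered off. `uart_tcp_cmd` and the VM name `my-vm` are hypothetical, and 10001 is just the port from the socat example earlier in the log.

```shell
# Sketch: print (not run) the command that re-attaches serial port 1
# of a VM to a TCP server port instead of a host pipe.
# uart_tcp_cmd is a hypothetical helper; $1 is the VM name, $2 the TCP port.
uart_tcp_cmd() {
    printf 'vboxmanage modifyvm "%s" --uartmode1 tcpserver %s\n' "$1" "$2"
}

# Usage (review the printed command, then run it with the VM powered off):
#   uart_tcp_cmd my-vm 10001
```

After the change, the console would be reached with the TCP form of the socat command, e.g. `socat TCP4:localhost:10001 ...`.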
haoiii | Got past my error; it seemed to be a memory error. Now the installation ran very far, beyond the ansible playbook step. Now I want to retry, but it won't let me. Is there a good tool for sending the logs? I assume they are not wanted here, as they are a bit long. | 11:15 |
daniel-caires | There are a few stages that can only be run once, as they make some configurations that the system won't allow making twice. | 11:21 |
haoiii | yes it wont let me retry the pybox installation process | 11:22 |
daniel-caires | I often use --snapshot parameter so I can return to after some stage and run the one that failed again | 11:22 |
daniel-caires | from the very beginning? | 11:22 |
haoiii | yes, I tried to delete the VM to see if it helped, but it didnt. So now I have neither the VM nor the snapshots... | 11:23 |
daniel-caires | that's weird. If anyone knows somewhere where the log can be posted so we can take a look | 11:25 |
daniel-caires | Although I think it did happen once to me, but it got resolved once I changed the labname | 11:26 |
haoiii | Ah that helped:) | 11:27 |
daniel-caires | Great! | 11:27 |
daniel-caires | Just one thing about the socat, may be irrelevant but the full command that works for me is "socat UNIX-CONNECT:'<tmp/address>' stdio,raw,escape=0x1d,echo=0,icanon=0" if using host pipe as serial mode | 11:33 |
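Putting the two socat variants from this log side by side: a pipe path gets the `UNIX-CONNECT:` prefix, while a TCP address such as `TCP4:localhost:10001` is passed through unchanged. The sketch below just prints the resulting command. `serial_console_cmd` is a hypothetical helper name, and treating anything that starts with `/` as a host pipe is an assumption of this sketch.

```shell
# Sketch: build the socat command for a VM serial console.
# serial_console_cmd is a hypothetical helper; it prints the command, it does not run it.
# Assumption: an argument starting with "/" is a host-pipe path, anything else
# is already a complete socat address (for example TCP4:localhost:10001).
serial_console_cmd() {
    case $1 in
        /*) addr="UNIX-CONNECT:$1" ;;
        *)  addr=$1 ;;
    esac
    printf 'socat %s stdio,raw,escape=0x1d,echo=0,icanon=0\n' "$addr"
}

# Usage:
#   serial_console_cmd /tmp/hli_StarlingX-controller-0_serial
#   serial_console_cmd TCP4:localhost:10001
```

With `escape=0x1d`, CTRL+] ends the session, as noted earlier in the log.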
dpereira_ | haoiii, in the future you can use https://paste.opendev.org/ to send your log output. | 11:37 |
haoiii | thanks! | 11:46 |
haoiii | https://paste.opendev.org/show/beFo6fjdPsy7Pjq0B68R/ | 12:21 |
haoiii | So here is the logs of the error I get, after the ansible playbook stage of the pybox StarlingX installation | 12:21 |
haoiii | seems to be ssh related | 12:22 |
daniel-caires | ssh is not something I'm very familiar with, so I may not be of much help but let's try :) It seems it didn't find the sshpass folder, did you install it? And rsync as well | 12:28 |
haoiii | rsync is installed it says, did install ssh now. Still can't locate the folder. | 12:33 |
brunomuniz[m] | (hopefully) all the dependencies are installed with: sudo apt install virtualbox socat git rsync sshpass openssh-client python3-pip python3-venv | 12:36 |
haoiii | Yes of course I did that step before launching the installation, so it should be fine, but apparently there is some problem. | 12:37 |
brunomuniz[m] | It's not locating the known_hosts file, apparently, right? | 12:41 |
brunomuniz[m] | Does the command work by itself? The "ssh-keygen -f "/home/hli/.ssh/known_hosts" -R [127.0.0.1]:3122"? | 12:41 |
haoiii | but that is done on the VM or on the host machine? | 12:44 |
brunomuniz[m] | That's done on the host machine. | 12:44 |
brunomuniz[m] | Try to backup whatever is there in your known_hosts file with something like "mv /home/hli/.ssh/known_hosts /home/hli/.ssh/known_hosts.bkp" | 12:44 |
brunomuniz[m] | I assume hli is your username on the host machine. | 12:44 |
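A slightly more defensive variant of that backup step is sketched below. It copies the file rather than moving it, so the original stays in place (delete it afterwards to get the same effect as the `mv`). `backup_known_hosts` is a hypothetical helper name, and the default path assumes the usual `~/.ssh/known_hosts` location.

```shell
# Sketch: back up known_hosts before ssh-keygen -R touches it.
# backup_known_hosts is a hypothetical helper name.
# $1 is the file to back up; defaults to ~/.ssh/known_hosts (assumed location).
backup_known_hosts() {
    src=${1:-$HOME/.ssh/known_hosts}
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi
    # -p keeps mode and timestamps; the copy gets the .bkp suffix used above
    cp -p "$src" "$src.bkp"
}

# Usage on the host machine:
#   backup_known_hosts /home/hli/.ssh/known_hosts
#   ssh-keygen -f /home/hli/.ssh/known_hosts -R '[127.0.0.1]:3122'
```

A non-zero return means the file is missing, which would match the "not locating the known_hosts file" symptom.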
brunomuniz[m] | haoiii: I believe you're not available anymore, but I just found a situation that might be similar to what you're facing. I can explain my conclusions and I can also take a look at your full logs if you want, to confirm my theory. | 15:28 |
brunomuniz[m] | In the mean time, you can try two things. | 15:28 |
brunomuniz[m] | 1) vboxmanage natnetwork start --netname NatNetwork; vboxmanage natnetwork stop --netname NatNetwork | 15:29 |
brunomuniz[m] | This was something that we noticed last week (and I found other reports about VirtualBox not being able to handle the port forwards to a Nat Network). | 15:30 |
brunomuniz[m] | Right now I noticed that the port forwards were working just fine (via netstat I could see the ports listening on my host) but only after recreating the whole NatNetwork on my system (delete then re-create) my SSH connection (which relies on a portforward to the NatNetwork) started working again. | 15:31 |
brunomuniz[m] | So, the second thing would be: | 15:32 |
brunomuniz[m] | 2) Recreate the NatNetwork in VirtualBox from scratch. I did it via GUI, but I can paste a one-liner that does it if you need. | 15:32 |
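The "ports listening on my host" check can be scripted before and after either step. The sketch below filters the listing for one port. `port_is_listening` is a hypothetical helper name, and it assumes the local address is the fourth column, as in `ss -ltn` and `netstat -ltn` output on Linux. Port 3122 is the forwarded SSH port from the ssh-keygen command earlier in the log. As noted above, a listening port does not prove the forward works, so this only rules out the simplest failure.

```shell
# Sketch: succeed if a given TCP port appears as a listening local port.
# port_is_listening is a hypothetical helper; it reads `ss -ltn` (or
# `netstat -ltn`) output on stdin and assumes column 4 is "address:port".
port_is_listening() {
    awk -v port="$1" '
        { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
        END { exit !found }
    '
}

# Usage on the host machine:
#   ss -ltn | port_is_listening 3122 && echo "3122 is listening"
```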