opendevreview | Verification of a change to openstack/ironic-python-agent master failed: [trivial] Fix typo in __init__.py https://review.opendev.org/c/openstack/ironic-python-agent/+/822049 | 00:04 |
---|---|---|
*** sshnaidm is now known as sshnaidm|afk | 02:45 | |
arne_wiebalck | Good morning, Ironic! | 07:27 |
jingvar | \o | 07:36 |
rpittau | good morning ironic! o/ | 07:44 |
*** amoralej|off is now known as amoralej | 08:23 | |
janders | hey arne_wiebalck rpittau and Ironic o/ | 09:12 |
arne_wiebalck | hey janders o/ | 09:13 |
arne_wiebalck | hey rpittau and jingvar o/ | 09:13 |
ajya | Hi, happy Friday! Can 2nd core take a look at this https://review.opendev.org/c/openstack/ironic/+/821576 ? It's tiny. | 09:19 |
rpittau | hey arne_wiebalck janders ajya :) | 09:34 |
rpittau | Happy Friday! | 09:34 |
rpittau | ajya: done | 09:35 |
ajya | thanks, rpittau | 09:35 |
holtgrewe | arne_wiebalck: for feedback, I was able to create a UEFI buildable image with dib but it does not boot when installed on a software RAID1 | 10:17 |
arne_wiebalck | holtgrewe: it does boot w/o RAID? | 10:21 |
holtgrewe | arne_wiebalck: yes | 10:23 |
holtgrewe | but let me triple-check | 10:24 |
arne_wiebalck | holtgrewe: can you see on the console where it gets stuck? | 10:25 |
arne_wiebalck | holtgrewe: also, check the deploy logs (on the conductor in /var/log/ironic/deploy) to see if the IPA complained about anything | 10:25 |
arne_wiebalck | holtgrewe: which release is this again? | 10:26 |
holtgrewe | arne_wiebalck: OS xena installed with kayobe/kolla and I'm trying to boot a CentOS7.9 image. | 10:28 |
arne_wiebalck | holtgrewe: does the image have support for md and the rootfs UUID as metadata ? | 10:30 |
holtgrewe | arne_wiebalck: good questions. I'm running the installation on a non-software RAID right now. I can answer the first question after looking whether mdadm is present, right? How would I find the answer to your second question? | 10:31 |
holtgrewe | arne_wiebalck: The conductor has the following log entry. Could not get 'rootfs_uuid' property for image a4aa2287-a543-4e2e-ada5-57f4e279ed18 from Glance for node 9f0fe673-02a7-403a-b30f-f9dc30fa2ac3. KeyError: 'rootfs_uuid'. | 10:33 |
holtgrewe | OK, so I gather your meta data question refers to the meta data in OS/glance? | 10:36 |
*** redrobot6 is now known as redrobot | 10:37 | |
holtgrewe | OK, mdadm package missing inside *sigh*, so the answer to your first question is no, obviously | 10:38 |
arne_wiebalck | holtgrewe: sorry, got distracted | 10:42 |
arne_wiebalck | holtgrewe: yes, mdadm/kernel module in the image | 10:42 |
arne_wiebalck | holtgrewe: `openstack image show` should have `rootfs_uuid` as a property (this is maybe not needed for UEFI since the EFI content is copied over) | 10:44 |
holtgrewe | arne_wiebalck: `disk-image-create -p mdadm` should be enough to get md support for CentOS7.9 as the kernel contains the module | 10:46 |
holtgrewe | I guess. | 10:46 |
arne_wiebalck | holtgrewe: yes | 10:51 |
holtgrewe | arne_wiebalck: Ah, I can set the --root-label with disk-image builder. Should I just pass in a random UUID there? | 10:51 |
arne_wiebalck | holtgrewe: it should be the UUID that is used in the image | 10:52 |
arne_wiebalck | holtgrewe: so, if you have an instance, grab the UUID and make it a property | 10:52 |
arne_wiebalck | holtgrewe: Ironic will need this UUID to find the rootfs and mount it | 10:52 |
arne_wiebalck | holtgrewe: to run grub2-install | 10:53 |
holtgrewe | OK, rebuilding the image now. | 10:54 |
holtgrewe | arne_wiebalck: so this is what blkid would give me for the file system to be mounted at "/"? It tells me >>/dev/nbd0p3: LABEL="cloudimg-rootfs" UUID="967cd880-ab18-4bd4-a92c-976131bb6ab3" TYPE="ext4" PARTLABEL="root" PARTUUID="19b7fce5-ba7b-4bbb-9769-fd50e2a57137"<< | 10:58 |
holtgrewe | or is it the PTUUID of the device? no, you said file system | 10:59 |
arne_wiebalck | "967cd880-ab18-4bd4-a92c-976131bb6ab3" | 11:01 |
holtgrewe | Great. I think that I can even set this explicitly when providing DIB_BLOCK_DEVICE_CONFIG. | 11:01 |
holtgrewe | It looks like dib is a pretty sharp tool in my box after all... | 11:02 |
arne_wiebalck | holtgrewe: oh, wasn't aware, nice | 11:05 |
* holtgrewe is cleaning machine and applying RAID configuration ... | 11:15 | |
* holtgrewe is deploying the image... | 11:19 | |
holtgrewe | https://paste.openstack.org/show/811743/ | 12:08 |
holtgrewe | arne_wiebalck: these are my notes, is this helpful enough for your docs? | 12:08 |
holtgrewe | yikes, it boots into dracut ... https://paste.openstack.org/show/811744/ | 12:09 |
holtgrewe | *sigh* one more round | 12:09 |
holtgrewe | I guess the problem is the kernel command line >>BOOT_IMAGE=/boot/vmlinuz-3.10.0-1160.49.1.el7.x86_64 root=LABEL=img-rootfs ro console=tty0 crashkernel=auto net.ifnames=0 console=ttyS0 console=tty0 console=ttyS0,115200 no_timer_check nofb nomodeset gfxpayload=text | 12:13 |
holtgrewe | yeah, you have to provide this to disk-image-create | 12:18 |
arne_wiebalck | holtgrewe: we have `rd.auto` to auto-assemble the md devices | 12:37 |
arne_wiebalck | holtgrewe: and `root=UUID=` point to the rootfs UUID | 12:38 |
holtgrewe | arne_wiebalck: ok, adding rd.auto=1 now | 12:49 |
* holtgrewe oh the suspense | 13:04 | |
* arne_wiebalck is crossing fingers | 13:11 | |
holtgrewe | arne_wiebalck: sadly, no | 13:16 |
holtgrewe | cat /proc/cmdline => BOOT_IMAGE=/boot/vmlinuz-3.10.0-1160.49.1.el7.x86_64 root=LABEL=root_fs ro console=tty0 crashkernel=auto net.ifnames=0 console=ttyS0 console=tty0 console=ttyS0,115200 no_timer_check nofb nomodeset gfxpayload=text rd.auto=1 | 13:16 |
holtgrewe | no /dev/md0 | 13:16 |
arne_wiebalck | holtgrewe: you logged into the instance? | 13:17 |
holtgrewe | I'm stuck in dracut after boot on serial console. | 13:17 |
holtgrewe | no mdadm in dracut | 13:18 |
*** amoralej is now known as amoralej|lunch | 13:19 | |
arne_wiebalck | holtgrewe: yep, that is needed ... IIRC we had this removed by accident as well at some point | 13:20 |
holtgrewe | ok, looks like I want to have the dracut-regenerate element | 13:23 |
holtgrewe | here we go again | 13:28 |
holtgrewe | Wherever this goes, thanks a lot! I would never have gotten that far without you. | 13:28 |
arne_wiebalck | holtgrewe: np, let's see if we can make it work :) | 13:29 |
holtgrewe | arne_wiebalck: heh, I'm learning things that I never intended to learn... all I want is to do some science ;-) | 13:35 |
arne_wiebalck | holtgrewe: what is the context of your Ironic deployment if I may ask? | 13:36 |
holtgrewe | I'm replacing the infrastructure of our "mid size" HPC system. It used to be proxmox for VM and xcat for bare metal. | 13:38 |
holtgrewe | it's about 250 nodes plus 3 ... mid-sized ceph clusters | 13:39 |
holtgrewe | only a "few" PB of storage of HDD and only a "few" 100 TB of NVME (ceph) | 13:40 |
holtgrewe | Nothing compared to CERN but quite something for our life science context. | 13:40 |
holtgrewe | Plus we are running a number of data management and analysis systems that used to run in a separate proxmox cluster. | 13:43 |
holtgrewe | OK, reinstall appears to be through, I did not do the cleaning step this time, maybe that was a mistake. | 13:44 |
holtgrewe | "md/raid1:md127: not clean -- starting background reconstruction" | 13:46 |
holtgrewe | and a thousand times "dracut-initqueue[969]: Warning: dracut-initqueue timeout - starting timeout scripts" | 13:48 |
holtgrewe | I think the clean step would have been necessary | 13:49 |
arne_wiebalck | probably, we even recreate the s/w RAID on every cleaning | 13:50 |
holtgrewe | yeah, that's also kind of for free, the expensive part is booting the machine into IPA, the raid operations are instantaneous | 13:53 |
holtgrewe | if you were to wipe the disks that would probably be dominating time | 13:54 |
holtgrewe | Not to forget the "cooling off" time between making a node available and it being actually available by nova. ;-) You have to love distributed systems with async calls. | 13:56 |
arne_wiebalck | depends on how you wipe (shred may take days, secure erase may only take seconds) | 13:56 |
arne_wiebalck | holtgrewe: thanks for the infra overview! | 13:56 |
arne_wiebalck | holtgrewe: and for the paste with the steps! | 13:57 |
holtgrewe | I'll create a new paste once I have it working end-to-end. | 13:57 |
arne_wiebalck | holtgrewe: cool, ty | 13:58 |
*** amoralej|lunch is now known as amoralej | 14:01 | |
rpittau | hey if anyone has a minute I added all the classifier patches to ironic-week-prio, they require just an approval and should be a very quick review, thanks! | 14:14 |
rpittau | mmm I have the terrible suspicion that something's off either with pbr or with pip | 14:19 |
rpittau | probably pip | 14:20 |
rpittau | orrr could be setuptools also | 14:21 |
holtgrewe | arne_wiebalck: this works now https://paste.openstack.org/show/811746/ | 14:37 |
arne_wiebalck | holtgrewe: it booted off the s/w RAID now? | 14:40 |
holtgrewe | arne_wiebalck: I'm 99% certain | 14:43 |
holtgrewe | I have to rebuild the image with a devuser now | 14:43 |
holtgrewe | to figure out why cloud-init is not working | 14:43 |
holtgrewe | it got over the dracut | 14:43 |
holtgrewe | earlier, I forgot to put the dracut-regenerate element which is why it did not work | 14:44 |
arne_wiebalck | holtgrewe: right | 14:48 |
arne_wiebalck | holtgrewe: sounds like progress, though | 14:48 |
holtgrewe | in ~5min I should know whether it really worked | 14:50 |
holtgrewe | and then on to the next problem | 14:50 |
TheJulia | good morning | 14:50 |
TheJulia | Happy Friday! | 14:53 |
holtgrewe | arne_wiebalck: I can confirm that it now booted from an md raid1 array! | 15:50 |
arne_wiebalck | holtgrewe: awesome | 15:50 |
arne_wiebalck | Good morning, TheJulia o/ | 15:50 |
holtgrewe | so the latest notes paste should be fine | 15:50 |
holtgrewe | Now on to the next riddle... how is the ironic host supposed to get its IP address in "flat" network mode? | 15:51 |
rpittau | bye everyone, have a great weekend! o/ | 15:51 |
TheJulia | holtgrewe: not sure I grok the question you're seeking to answer | 15:53 |
TheJulia | the baremetal node, or.... the conductor host? | 15:53 |
holtgrewe | TheJulia: I got my baremetal/ironic node to boot a UEFI image at last via nova. I attached a port with a static IP as I usually would for VMs. I put configdrive userdata via Ansible as I usually would. The node boots up and has dhcp running. | 15:55 |
holtgrewe | Should it get its IP via dhcp from the ironic neutron agent as it does on deployment? | 15:55 |
TheJulia | holtgrewe: generally that is how people do it | 15:57 |
holtgrewe | TheJulia: OK... so at least I understood how it should work. Now I can try to figure out where things go wrong. Thanks. | 16:01 |
holtgrewe | OK... enough for today. Thanks all, have a nice weekend o/ | 16:08 |
TheJulia | holtgrewe: okay, have a wonderful weekend! | 16:10 |
arne_wiebalck | holtgrewe: o/ | 16:10 |
holtgrewe | I'll also set up UEFI + software RAID1 with Rocky 8.x once I've got my GPFS upgrade through and will share the command lines that worked for me. | 16:11 |
holtgrewe | Is the format from my paste above enough? | 16:11 |
holtgrewe | I could also wrap this in some explanatory text if it helps you. If you point me at the right place in the repositories I can also add a documentation patch. | 16:13 |
holtgrewe | anyway, off for today | 16:13 |
*** holtgrewe is now known as holtgrewe^gone | 16:13 | |
arne_wiebalck | holtgrewe^gone: if you would do that, that would be great ofc | 16:13 |
arne_wiebalck | holtgrewe^gone: should go to the admin section in the ironic repo | 16:14 |
opendevreview | Julia Kreger proposed openstack/ironic-tempest-plugin master: WIP: An idea for rbac positive/negative testing https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/819165 | 16:41 |
opendevreview | Merged openstack/ironic master: Fix redfish update_firmware for newer Sushy https://review.opendev.org/c/openstack/ironic/+/821576 | 17:53 |
*** amoralej is now known as amoralej|off | 17:55 | |
arne_wiebalck | bye everyone, see | 18:05 |
arne_wiebalck | you next week o/ | 18:05 |
opendevreview | Merged openstack/ironic-python-agent master: [trivial] Fix typo in __init__.py https://review.opendev.org/c/openstack/ironic-python-agent/+/822049 | 18:22 |
-opendevstatus- NOTICE: The review.opendev.org server is being rebooted to validate a routing configuration update, and should return to service shortly | 22:28 |
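
Note on the software RAID thread above: the two checks discussed (taking the root filesystem UUID from `blkid` to set as the `rootfs_uuid` image property, and making sure the kernel command line carries `root=UUID=` and `rd.auto`) can be sketched in Python. This is a minimal, non-authoritative sketch under stated assumptions: the helper names `parse_blkid_line`, `image_set_command` and `check_cmdline` are hypothetical and not part of Ironic or diskimage-builder, and the sample strings are copied from the log. It only builds the `openstack image set` argument list; it does not run it.

```python
import re
import shlex

# Sample strings copied from the log above (10:58 blkid output, 13:16 /proc/cmdline).
BLKID_LINE = (
    '/dev/nbd0p3: LABEL="cloudimg-rootfs" '
    'UUID="967cd880-ab18-4bd4-a92c-976131bb6ab3" TYPE="ext4" '
    'PARTLABEL="root" PARTUUID="19b7fce5-ba7b-4bbb-9769-fd50e2a57137"'
)
CMDLINE = (
    "BOOT_IMAGE=/boot/vmlinuz-3.10.0-1160.49.1.el7.x86_64 root=LABEL=root_fs ro "
    "console=tty0 crashkernel=auto net.ifnames=0 console=ttyS0 console=tty0 "
    "console=ttyS0,115200 no_timer_check nofb nomodeset gfxpayload=text rd.auto=1"
)


def parse_blkid_line(line):
    """Split one line of blkid output into (device, {KEY: value})."""
    device, _, rest = line.partition(":")
    fields = dict(re.findall(r'(\w+)="([^"]*)"', rest))
    return device.strip(), fields


def image_set_command(image_id, rootfs_uuid):
    """Build (not run) the CLI call that stores the UUID as an image property."""
    return [
        "openstack", "image", "set",
        "--property", f"rootfs_uuid={rootfs_uuid}",
        image_id,
    ]


def check_cmdline(cmdline, rootfs_uuid):
    """Return a list of problems found in a kernel command line."""
    args = cmdline.split()
    problems = []
    if f"root=UUID={rootfs_uuid}" not in args:
        problems.append("root= does not point at the rootfs UUID")
    if not any(a == "rd.auto" or a.startswith("rd.auto=") for a in args):
        problems.append("rd.auto is missing, md devices may not be assembled")
    return problems


if __name__ == "__main__":
    device, fields = parse_blkid_line(BLKID_LINE)
    # Image ID taken from the conductor log entry quoted at 10:33.
    cmd = image_set_command("a4aa2287-a543-4e2e-ada5-57f4e279ed18", fields["UUID"])
    print(shlex.join(cmd))
    for problem in check_cmdline(CMDLINE, fields["UUID"]):
        print("cmdline:", problem)
```

Note that the image still needs mdadm in its initramfs (the dracut-regenerate element mentioned at 13:23); a command-line check like this cannot detect that.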