Wednesday, 2024-05-08

*** logan_ is now known as Guest482405:45
cidgood morning \o08:59
*** iurygregory_ is now known as iurygregory11:21
iurygregorygood morning Ironic, cid o/11:21
cid\o11:48
Sandzwerg[m]JayF: I'd be interested to talk about ironic documentation although I'm not a new user. but I'm off till next week Tuesday so maybe we can coordinate after that?12:06
TheJuliagood morning!12:51
iurygregorygood morning TheJulia =)12:55
JayFSandzwerg[m]: just drop me a line in an email and I'll connect you with Dave13:11
JayFReverbverbverb: ^^13:11
opendevreviewcid proposed openstack/ironic master: Flexible IPMI credential persistence method configuration  https://review.opendev.org/c/openstack/ironic/+/91722913:57
opendevreviewJulia Kreger proposed openstack/metalsmith master: DNM/WIP: Create missing allocation on list?  https://review.opendev.org/c/openstack/metalsmith/+/91866414:40
opendevreviewcid proposed openstack/ironic master: Flexible IPMI credential persistence method configuration  https://review.opendev.org/c/openstack/ironic/+/91722916:22
opendevreviewcid proposed openstack/ironic master: Flexible IPMI credential persistence method configuration  https://review.opendev.org/c/openstack/ironic/+/91722916:35
opendevreviewcid proposed openstack/ironic master: Flexible IPMI credential persistence method configuration  https://review.opendev.org/c/openstack/ironic/+/91722917:30
mnaserhenlo ironic team. I got a little curveball here18:01
mnaserI’m dealing with an environment where I am using multipath for the root device18:03
opendevreviewDan Smith proposed openstack/ironic master: [devstack] Upload images with --file instead of stdin  https://review.opendev.org/c/openstack/ironic/+/91869018:03
TheJuliaokay, whats up?18:03
mnaserIt seems like ipa does not see it but rather sees all the devices as separate. I’m wondering if there’s something I have to flip somewhere18:03
mnaserSo sda/sdb/sdc etc rather than multipathing the device18:04
TheJuliamnaser: what is your hint, and is mutltipath-tools in the agent image?18:04
TheJuliamultipath-tools18:04
mnaserI am using the latest zed ipa but I found this maybe — https://bugs.launchpad.net/ironic-python-agent/+bug/203109218:04
mnaserI’m using the dib images of ipa in tarballs.o.o — I wonder if those are built without it?18:05
mnaserAlso no hinting in this case18:05
TheJuliaI don't know if the ones on tarballs have it18:05
mnaserahhh so multipath isn’t built into ipa image by default18:06
TheJuliaif it is present, the devices attached to multipathing gets excluded from the possible options18:06
mnaserThe system managed to deploy but it booted out of /dev/sdg or something18:06
TheJuliayeah, you'll need to build an IPA image most likely18:06
TheJulianot surprising18:06
mnaserAhhhhh, that’s lovely. That seems pretty simple then18:06
TheJuliamnaser: so18:07
TheJuliasdg should likely be fine, and would likely end up being the same device if multipath-tools is present18:07
TheJuliahaving multipath-tools largely allows it to de-duplicate and avoid locked paths18:07
mnaseryeah and I think the root path should be /dev/dm-0 or something with multipath, no?18:08
TheJuliaeh, the root device selection logic will still take place18:09
TheJuliaso it might be something like /dev/dm-5 or /dev/mpath000518:10
TheJuliait is always the smallest device >4GB which looks like a disk18:14
TheJuliamnaser: if you want the very first device which initializes, a root device hint will be needed18:18
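A minimal sketch of the default selection described above (with no root device hint, the smallest disk-like device above 4GB wins). This is not IPA's actual code: the `BlockDevice` type, its field names, `pick_default_root_device`, and the exact threshold constant are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumed threshold: the chat only says ">4GB"; the exact value and
# unit used by IPA are not taken from its source here.
MIN_ROOT_SIZE_BYTES = 4 * 1024**3


@dataclass(frozen=True)
class BlockDevice:
    """Hypothetical, trimmed-down view of a block device."""
    name: str
    size_bytes: int
    is_disk: bool = True


def pick_default_root_device(
    devices: Sequence[BlockDevice],
) -> Optional[BlockDevice]:
    """Return the smallest disk-like device above the threshold, if any."""
    candidates = [
        d for d in devices
        if d.is_disk and d.size_bytes > MIN_ROOT_SIZE_BYTES
    ]
    # min() with default=None avoids raising when nothing qualifies.
    return min(candidates, key=lambda d: d.size_bytes, default=None)


devices = [
    BlockDevice("/dev/dm-0", 500 * 1024**3),
    BlockDevice("/dev/dm-5", 100 * 1024**3),
    BlockDevice("/dev/sr0", 1 * 1024**3, is_disk=False),
]
chosen = pick_default_root_device(devices)
```

With this rule the result depends only on size, not on which device initialized first, which is why a root device hint is needed to pin a specific device.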
mnaserI think it is /dev/mpatha actually18:19
TheJulialikely18:19
mnasercause once the system booted, it says "mpatha: ignored map" or something18:19
TheJuliaquite possible18:20
TheJuliaIt is not something I look at much these days18:20
mnasermultipathing for root os over fc is also a bit magical here :)18:20
mnaserit is a first time for me18:20
TheJuliaand last time I did multipathing heavily before OpenStack we were dealing with mostly 4GB FibreChannel18:20
mnaseryou don't miss it? :)18:22
TheJuliaSome days I do, then I get home lab gear like the 40GB switch besides my desk and want to throw things18:22
mnaserhttps://review.opendev.org/c/openstack/ironic-python-agent-builder/+/83609418:23
mnaserI think this is a good step 118:23
mnaserTheJulia: "brrrrrr"18:23
TheJuliaOn the plus side, I had to order a QSFP+ card and there was a second one in the box18:24
* TheJulia is not complaining18:24
TheJulia... although now I need to order another cable18:24
TheJuliamnaser: that does mix in the iscsi stuffs and some of the other initiators, but it should be fine to build a dib image. Now that I think about it, I don't think the upstream image is built with that element.18:26
mnaserTheJulia: the IPA is happy now but the system booted with /dev/sda3 mounted at / and /dev/sdf1 at /boot/efi lol -- so I think there's something to fix here in my ubuntu dib image18:58
TheJuliaoh yes18:58
TheJuliavery yes18:58
TheJuliavery very yes18:58
mnaserwhich is `ubuntu vm block-device-efi dhcp-all-interfaces cloud-init` only .. hmm18:58
TheJuliaYour image needs to have multipath-tools as well 18:58
mnaserit does :<18:59
mnasermultipathd seems to be logging: "mpatha: ignoring map" a few times18:59
mnaseroh interesting,.. "multipath: error getting device" in dmesg19:02
opendevreviewcid proposed openstack/ironic master: wip: Provision aarch64 fake-bare-metal-vms  https://review.opendev.org/c/openstack/ironic/+/91544119:02
TheJuliamnaser: interesting...19:03
* TheJulia wonders how many people are getting these emails: https://code.launchpad.net/~liushuyu-011/ubuntu/+source/vitrage-tempest-plugin/+git/vitrage-tempest-plugin/+merge/46574519:07
mnaserTheJulia: I wonder if multipath is missing in the initramfs on the final image19:07
mnaserTheJulia: bingo.  `multipath-tools-boot`.19:09
cidgood night \o19:28
fricklerTheJulia: I've received those, too, but I cannot tell why. will ask in #launchpad19:51
TheJuliafrickler: thanks!19:52
clarkbfrickler: TheJulia a canonical employee added 'openstack-team' to https://code.launchpad.net/~liushuyu-011/ubuntu/+source/vitrage-tempest-plugin/+git/vitrage-tempest-plugin/+merge/465745 and requested openstack-team review it. I think openstack-team includes all our subgroups. Then when everyone "unsubscribes"19:55
clarkb(which I think isn't actually doing anything) we all get additional notifications. The fix is to remove openstack-team from the review on that merge request19:55
clarkbI don't know how to do that, but maybe if I login to lp it will let me do so?19:56
fricklerclarkb: I don't see that subscription, maybe it was cleared up already?19:56
TheJuliaI figured that, but yeah....19:56
fricklerseems there is also no log of actions19:56
clarkbfrickler: oh yes I have refreshed and openstack-team is removed now19:57
clarkbso ya I think someone already dealt with it properly via launchpad rather than spamming each other19:57
frickler7586 active members, that's a nice spam :)19:59
fricklerwould it make sense to delete that team? it seems to hold only a lot of very outdated information https://launchpad.net/~openstack (guess we should move that discussion to #opendev or maybe #openstack-tc, too)20:05
clarkbI don't think we can delete that team20:22
clarkbor at least it would require careful planning as I think stuff is tied to it (as its the root of the tree)20:22
clarkbwe can prune the tree though20:23
clarkbI see the conversation went to the tc channel /me takes it there20:23
JayFlooks like cid's aarch64 change actually booted a vm. that's just about all it did, but we changed the error to (unknown, no more logs) ... my hunch would be that ipxe is croaking? I suspect we have to figure out how to get logs out as the next step?20:30
clarkblooks like conductor complains about a callback timeout being hit. Possible that emulated aarch64 is simply slow?20:39
opendevreviewJulia Kreger proposed openstack/metalsmith master: Create missing allocation on list  https://review.opendev.org/c/openstack/metalsmith/+/91866420:43
JayFclarkb: I hypothesize that is likely because the aarch64 VM didn't boot anything of value20:45
JayFeven though we know, at least, that libvirt started it without error from the *absence* of angry logs20:45
JayFso I'd look at ipxe or ipa20:45
TheJuliastevebaker[m]: my metalsmith change above relates to the issue we keep getting reported downstream, lmk what you think20:47
stevebaker[m]TheJulia: hrm, it's not ideal for a list operation to have the side effect of creating records. The only alternative I can think of is that the error message should print a batch of allocation create commands to run to fix the situation20:52
JayFstevebaker[m]: that first half is what I thought when I saw the change, too -- that sounds like a good suggestion20:53
JayFor even in gentoo, sometimes it'll do that "you need to add this to your config in order for this to succeed" but you can pass a flag to have it do it for you instead20:53
TheJuliabasically the customers are steamrolling past the documentation guidelines20:53
TheJuliawhich is to ensure a good state prior to doing anything20:54
TheJuliaso, yeah20:54
TheJuliaso dunno, it's not good all around20:55
JayFprinting the commands with a "Ensure you've done the prework suggested in documentation" or similar would probably be nice20:56
JayFlist commands causing creates to happen is nasty, I didn't +2 it earlier because of that (but didn't wanna block a fix for a software tool I don't use)20:56
TheJuliamy worry is they will just turn around and go "f-that" and file a support case anyway20:57
TheJuliaand we can only know it per node at that point... :\20:57
JayFLet me ask the question a different way20:57
* TheJulia shrugs20:57
TheJuliaI'll revise it with suggested commands tomorrow20:57
JayFis there a way we could put that prework, the "ensure you're in a good state" into a separate command?20:57
JayFwhich would essentially do the list+create allocation stuff20:58
JayFand the error could be like "go run prepare"20:58
JayFif it's safe enough to do on list, why wouldn't it be safe enough to do in a command (or with a --fix-it flag)20:58
TheJuliaNot at this point20:58
TheJuliabut that is what I would have expected upfront when I found the cause20:58
TheJuliabut of course not20:58
TheJuliaI think the challenge is people just expect it to work and it doesn't if their initial state was bad21:01
JayFA technology user? Expecting things to work without putting in effort 😱21:05
JayFlolsob :(21:05
mnaserTheJulia: just for updates, installing multipath-tools-boot fixed things on an existing machine after a reboot. a new image with it didn't get networking.  my guess is cloud-init is somehow not picking up the configdrive partition21:23
TheJuliamnaser: that is... crazy21:25
mnaserhmm... or maybe not, cause I have dhcp-all-interfaces and I can't ping that system21:26
mnaserbut.. ipa stuff runs just fine21:26
stevebaker[m]TheJulia: JayF A new "metalsmith reconcile" command which creates missing allocations would be useful, and the metalsmith list error could suggest using it. But that is a bit of work for a deprecated tool21:29
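A rough sketch of the alternative discussed above: the list path stays read-only and, on finding deployed nodes without an allocation, only produces suggested commands for the operator to run. `NodeSummary` and `suggest_allocation_commands` are hypothetical names, not metalsmith code, and the exact CLI form (backfilling via `openstack baremetal allocation create --node`, and reusing the node name as the allocation name) is an assumption that should be checked against the installed client.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class NodeSummary:
    """Hypothetical, trimmed-down view of a node; not a real ironic model."""
    name: str
    deployed: bool
    allocation_uuid: Optional[str]


def suggest_allocation_commands(nodes: Iterable[NodeSummary]) -> List[str]:
    """Return one suggested command per deployed node lacking an allocation.

    Nothing is created here; the caller only prints the suggestions.
    """
    return [
        # Assumed command shape; verify the flags before relying on it.
        f"openstack baremetal allocation create --node {n.name} --name {n.name}"
        for n in nodes
        if n.deployed and n.allocation_uuid is None
    ]


nodes = [
    NodeSummary("node-0", deployed=True, allocation_uuid=None),
    NodeSummary("node-1", deployed=True, allocation_uuid="abc"),
    NodeSummary("node-2", deployed=False, allocation_uuid=None),
]
suggestions = suggest_allocation_commands(nodes)
```

The same function could back either an error message on list or a separate command that applies the suggestions, which keeps the side effect opt-in.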
TheJuliamnaser: I thought the point of dhcp-all-interfaces was kind of moot with networkmanager these days, then again local configuration drives are not checked by default by cloud-init either, you have to tell it to :(21:31
mnaserTheJulia: then it must be something else that is causing the system to not get any networking at all?21:57
TheJuliaAre you providing a config drive with networking?21:58
mnaserTheJulia: I'm not, but this is through nova, so I had kinda assumed it did on my behalf?22:02
mnaser(but also... this thing worked fine until I added `-p multipath-tools-boot` into the dib command..)22:02
TheJuliamnaser: not by default, whether one is sent depends on how it is configured. See nova.conf for that.22:03
TheJuliaHmmm, fine until multipath-tools-boot is a good hint that maybe the boot sequence is being changed22:04
mnaserTheJulia: well I can see in the console the system booted just fine but I can't ping it and clearly no metadata (cause it starts with `ubuntu login:`)22:06
JayFIs that a two-way console? Could you login to it and see if there's a configdrive anywhere to be found?22:06
TheJuliaso it breaks metadata then22:07
TheJuliadid you pass a cloud-init data sources list for your image build?22:07
mnaserJayF: it is.. but I just don't have a password and I'm not sure of the best way to do a dance to get into the box22:07
mnaserTheJulia: nope :X22:07
JayFmnaser: fair22:07
TheJuliamnaser: heh, yeah, remember what I said about cloud init really not looking to config drives at all unless you explicitly tell it ? :)22:07
TheJuliamnaser: add DIB_CLOUD_INIT_DATASOURCES="ConfigDrive" to your environment for building the image22:08
TheJuliahttps://github.com/openstack/diskimage-builder/tree/master/diskimage_builder/elements/cloud-init-datasources22:08
TheJuliaand add the cloud-init-datasources element22:09
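A small sketch of how the pieces mentioned in this conversation could be assembled into one build invocation: the element list mnaser quoted, the `-p multipath-tools-boot` package flag, the `cloud-init-datasources` element, and the `DIB_CLOUD_INIT_DATASOURCES` variable. The helper name `build_dib_invocation` is hypothetical, the code only constructs the command and environment rather than running a build, and any output name or extra options a real build needs are left out.

```python
import os
import shlex
from typing import Dict, List, Mapping, Tuple

# Elements quoted in the chat, plus the datasources element suggested above.
ELEMENTS = [
    "ubuntu",
    "vm",
    "block-device-efi",
    "dhcp-all-interfaces",
    "cloud-init",
    "cloud-init-datasources",
]


def build_dib_invocation(
    base_env: Mapping[str, str],
) -> Tuple[List[str], Dict[str, str]]:
    """Return the command and environment for the image build (not executed)."""
    env = dict(base_env)
    # Tell cloud-init in the built image to consult the config drive.
    env["DIB_CLOUD_INIT_DATASOURCES"] = "ConfigDrive"
    cmd = ["disk-image-create", "-p", "multipath-tools-boot", *ELEMENTS]
    return cmd, env


cmd, env = build_dib_invocation(os.environ)
print(shlex.join(cmd))
# To actually build, something like:
#   subprocess.run(cmd, env=env, check=True)
```

Keeping the environment variable next to the element list makes it harder to add the element and forget the variable, or the reverse.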
mnaseralright let me see if that does it22:10
mnaserit might be it but I still feel like it's so odd that this change broke it22:13
mnaserbuilding..22:13
TheJuliaI'm just thinking that might be a way around it if that ubuntu package broke networking entirely22:13
mnaseroh yeah that's an idea22:14
mnaserTheJulia: actually if anything it might be useful to see what happened22:16
TheJulialooks like it puts some udev rules in, but that shouldb't break networking22:16
TheJuliashouldn't break that is22:16
mnaserif it brings up a little bit of networking enough to get in22:16
TheJuliayup22:16
TheJuliaif you do have a working config drive with network data, it might just work22:16
mnaseryeah I have config drive enabled in nova22:17
mnaserI will need that no matter what because bonds22:17
mnasernot even just bonds.. bonds with vlans on top, so it'll be some fun to setup :)22:18
TheJuliaeek, so yeah, you're going to need it all then because default state you really shouldn't have working dhcp at the lowest level if you're fully ml2 integrated22:18
mnaserTheJulia: well it technically works because the LACP fallback is pointing to the network it's booting on, but the bond has that network tagged22:52
TheJuliaOnly if you have a native vlan on the port, for provisioning it is an access port22:53
mnaserwell, lacp fallback with native vlan for provisioning, then lacp'd port trunk'd22:56
opendevreviewMerged openstack/ironic master: Fix spurious CI job failures around partition images  https://review.opendev.org/c/openstack/ironic/+/91846623:05
mnaserTheJulia: well that did the truck23:12
mnasertrick23:12
mnaserbut truck too I guess23:12
opendevreviewJulia Kreger proposed openstack/ironic stable/2024.1: Fix spurious CI job failures around partition images  https://review.opendev.org/c/openstack/ironic/+/91864523:17
