*** logan_ is now known as Guest4824 | 05:45 | |
cid | good morning \o | 08:59 |
---|---|---|
*** iurygregory_ is now known as iurygregory | 11:21 | |
iurygregory | good morning Ironic, cid o/ | 11:21 |
cid | \o | 11:48 |
Sandzwerg[m] | JayF: I'd be interested to talk about ironic documentation although I'm not a new user. but I'm off till next week Tuesday so maybe we can coordinate after that? | 12:06 |
TheJulia | good morning! | 12:51 |
iurygregory | good morning TheJulia =) | 12:55 |
JayF | Sandzwerg[m]: just drop me a line in an email and I'll connect you with Dave | 13:11 |
JayF | Reverbverbverb: ^^ | 13:11 |
opendevreview | cid proposed openstack/ironic master: Flexible IPMI credential persistence method configuration https://review.opendev.org/c/openstack/ironic/+/917229 | 13:57 |
opendevreview | Julia Kreger proposed openstack/metalsmith master: DNM/WIP: Create missing allocation on list? https://review.opendev.org/c/openstack/metalsmith/+/918664 | 14:40 |
opendevreview | cid proposed openstack/ironic master: Flexible IPMI credential persistence method configuration https://review.opendev.org/c/openstack/ironic/+/917229 | 16:22 |
opendevreview | cid proposed openstack/ironic master: Flexible IPMI credential persistence method configuration https://review.opendev.org/c/openstack/ironic/+/917229 | 16:35 |
opendevreview | cid proposed openstack/ironic master: Flexible IPMI credential persistence method configuration https://review.opendev.org/c/openstack/ironic/+/917229 | 17:30 |
mnaser | henlo ironic team. I got a little curveball here | 18:01 |
mnaser | I’m dealing with an environment where I am using multipath for the root device | 18:03 |
opendevreview | Dan Smith proposed openstack/ironic master: [devstack] Upload images with --file instead of stdin https://review.opendev.org/c/openstack/ironic/+/918690 | 18:03 |
TheJulia | okay, whats up? | 18:03 |
mnaser | It seems like ipa doesn't see it but rather sees all the devices as separate. I’m wondering if there’s something I have to flip somewhere | 18:03 |
mnaser | So sda/sdb/sdc etc rather than multipathing the device | 18:04 |
TheJulia | mnaser: what is your hint, and is mutltipath-tools in the agent image? | 18:04 |
TheJulia | multipath-tools | 18:04 |
mnaser | I am using the latest zed ipa but I found this maybe — https://bugs.launchpad.net/ironic-python-agent/+bug/2031092 | 18:04 |
mnaser | I’m using the dib images of ipa in tarballs.o.o — I wonder if those are built without it? | 18:05 |
mnaser | Also no hinting in this case | 18:05 |
TheJulia | I don't know if the ones on tarballs have it | 18:05 |
mnaser | ahhh so multipath isn’t built into ipa image by default | 18:06 |
TheJulia | if it is present, the devices attached to multipathing get excluded from the possible options | 18:06 |
mnaser | The system managed to deploy but it booted out of /dev/sdg or something | 18:06 |
TheJulia | yeah, you'll need to build an IPA image most likely | 18:06 |
TheJulia | not surprising | 18:06 |
mnaser | Ahhhhh, that’s lovely. That seems pretty simple then | 18:06 |
TheJulia | mnaser: so | 18:07 |
TheJulia | sdg should likely be fine, and would likely end up being the same device if multipath-tools is present | 18:07 |
TheJulia | having multipath-tools largely allows it to de-duplicate and avoid locked paths | 18:07 |
mnaser | yeah and I think the root path should be /dev/dm-0 or something with multipath, no? | 18:08 |
TheJulia | eh, the root device selection logic will still take place | 18:09 |
TheJulia | so it might be something like /dev/dm-5 or /dev/mpath0005 | 18:10 |
TheJulia | it is always the smallest device >4GB which looks like a disk | 18:14 |
TheJulia | mnaser: if you want the very first device which initializes, a root device hint will be needed | 18:18 |
mnaser | I think it is /dev/mpatha actually | 18:19 |
TheJulia | likely | 18:19 |
mnaser | cause once the system booted, it says "mpatha: ignored map" or something | 18:19 |
TheJulia | quite possible | 18:20 |
TheJulia | It is not something I look at much these days | 18:20 |
mnaser | multipathing for root os over fc is also a bit magical here :) | 18:20 |
mnaser | it is a first time for me | 18:20 |
TheJulia | and last time I did multipathing heavily before OpenStack we were dealing with mostly 4GB FibreChannel | 18:20 |
mnaser | you don't miss it? :) | 18:22 |
TheJulia | Some days I do, then I get home lab gear like the 40GB switch beside my desk and want to throw things | 18:22 |
mnaser | https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/836094 | 18:23 |
mnaser | I think this is a good step 1 | 18:23 |
mnaser | TheJulia: "brrrrrr" | 18:23 |
TheJulia | On the plus side, I had to order a QSFP+ card and there was a second one in the box | 18:24 |
* TheJulia is not complaining | 18:24 | |
TheJulia | ... although now I need to order another cable | 18:24 |
TheJulia | mnaser: that does mix in the iscsi stuffs and some of the other initiators, but it should be fine to build a dib image. Now that I think about it, I don't think the upstream image is built with that element. | 18:26 |
mnaser | TheJulia: the IPA is happy now but the system booted with /dev/sda3 mounted at / and /dev/sdf1 at /boot/efi lol -- so I think there's something to fix here in my ubuntu dib image | 18:58 |
TheJulia | oh yes | 18:58 |
TheJulia | very yes | 18:58 |
TheJulia | very very yes | 18:58 |
mnaser | which is `ubuntu vm block-device-efi dhcp-all-interfaces cloud-init` only .. hmm | 18:58 |
TheJulia | Your image needs to have multipath-tools as well | 18:58 |
mnaser | it does :< | 18:59 |
mnaser | multipathd seems to be logging: "mpatha: ignoring map" a few times | 18:59 |
mnaser | oh interesting,.. "multipath: error getting device" in dmesg | 19:02 |
opendevreview | cid proposed openstack/ironic master: wip: Provision aarch64 fake-bare-metal-vms https://review.opendev.org/c/openstack/ironic/+/915441 | 19:02 |
TheJulia | mnaser: interesting... | 19:03 |
* TheJulia wonders how many people are getting these emails: https://code.launchpad.net/~liushuyu-011/ubuntu/+source/vitrage-tempest-plugin/+git/vitrage-tempest-plugin/+merge/465745 | 19:07 | |
mnaser | TheJulia: I wonder if multipath is missing in the initramfs on the final image | 19:07 |
mnaser | TheJulia: bingo. `multipath-tools-boot`. | 19:09 |
cid | good night \o | 19:28 |
frickler | TheJulia: I've received those, too, but I cannot tell why. will ask in #launchpad | 19:51 |
TheJulia | frickler: thanks! | 19:52 |
clarkb | frickler: TheJulia a canonical employee added 'openstack-team' to https://code.launchpad.net/~liushuyu-011/ubuntu/+source/vitrage-tempest-plugin/+git/vitrage-tempest-plugin/+merge/465745 and requested openstack-team review it. I think openstack-team includes all our subgroups. Then when everyone "unsubscribes" | 19:55 |
clarkb | (which I think isn't actually doing anything) we all get additional notifications. The fix is to remove openstack-team from the review on that merge request | 19:55 |
clarkb | I don't know how to do that, but maybe if I login to lp it will let me do so? | 19:56 |
frickler | clarkb: I don't see that subscription, maybe it was cleared up already? | 19:56 |
TheJulia | I figured that, but yeah.... | 19:56 |
frickler | seems there is also no log of actions | 19:56 |
clarkb | frickler: oh yes I have refreshed and openstack-team is removed now | 19:57 |
clarkb | so ya I think someone already dealt with it properly via launchpad rather than spamming each other | 19:57 |
frickler | 7586 active members, that's a nice spam :) | 19:59 |
frickler | would it make sense to delete that team? seems to hold a number of very outdated information only https://launchpad.net/~openstack (guess we should move that discussion to #opendev or maybe #openstack-tc, too) | 20:05 |
clarkb | I don't think we can delete that team | 20:22 |
clarkb | or at least it would require careful planning as I think stuff is tied to it (as its the root of the tree) | 20:22 |
clarkb | we can prune the tree though | 20:23 |
clarkb | I see the conversation went to the tc channel /me takes it there | 20:23 |
JayF | looks like cid's aarch64 change actually booted a vm. that's just about all it did, but we changed the error to (unknown, no more logs) ... my hunch would be that ipxe is croaking? I suspect we have to figure out how to get logs out as the next step? | 20:30 |
clarkb | looks like conductor complains about a callback timeout being hit. Possible that emulated aarch64 is simply slow? | 20:39 |
opendevreview | Julia Kreger proposed openstack/metalsmith master: Create missing allocation on list https://review.opendev.org/c/openstack/metalsmith/+/918664 | 20:43 |
JayF | clarkb: I hypothesize that is likely because the aarch64 VM didn't boot anything of value | 20:45 |
JayF | even though we know, at least, that libvirt started it without error from the *absence* of angry logs | 20:45 |
JayF | so I'd look at ipxe or ipa | 20:45 |
TheJulia | stevebaker[m]: my metalsmith change above relates to the issue we keep getting reported downstream, lmk what you think | 20:47 |
stevebaker[m] | TheJulia: hrm, it's not ideal for a list operation to have the side effect of creating records. The only alternative I can think of is that the error message should print a batch of allocation create commands to run to fix the situation | 20:52 |
JayF | stevebaker[m]: that first half is what I thought when I saw the change, too -- that sounds like a good suggestion | 20:53 |
JayF | or even in gentoo, sometimes it'll do that "you need to add this to your config in order for this to succeed" but you can pass a flag to have it do it for you instead | 20:53 |
TheJulia | basically the customers are steamrolling past the documentation guidelines | 20:53 |
TheJulia | which is to ensure a good state prior to doing anything | 20:54 |
TheJulia | so, yeah | 20:54 |
TheJulia | so dunno, its not good all around | 20:55 |
JayF | printing the commands with a "Ensure you've done the prework suggested in documentation" or similar would probably be nice | 20:56 |
JayF | list commands causing creates to happen is nasty, I didn't +2 it earlier because of that (but didn't wanna block a fix for a software tool I don't use) | 20:56 |
TheJulia | my worry is they will just turn around and go "f-that" and file a support case anyway | 20:57 |
TheJulia | and we can only know it per node at that point... :\ | 20:57 |
JayF | Let me ask the question a different way | 20:57 |
* TheJulia shrugs | 20:57 | |
TheJulia | I'll revise it with suggested commands tomorrow | 20:57 |
JayF | is there a way we could put that prework, the "ensure you're in a good state" into a separate command? | 20:57 |
JayF | which would essentially do the list+create allocation stuff | 20:58 |
JayF | and the error could be like "go run prepare" | 20:58 |
JayF | if it's safe enough to do on list, why wouldn't it be safe enough to do in a command (or with a --fix-it flag) | 20:58 |
TheJulia | Not at this point | 20:58 |
TheJulia | but that is what I would have expected upfront when I found the cause | 20:58 |
TheJulia | but of course not | 20:58 |
TheJulia | I think the challenge is people just expect it to work and it doesn't if their initial state was bad | 21:01 |
JayF | A technology user? Expecting things to work without putting in effort 😱 | 21:05 |
JayF | lolsob :( | 21:05 |
mnaser | TheJulia: just for updates, installing multipath-tools-boot fixed things on an existing machine after a reboot. a new image with it didn't get networking. my guess is cloud-init is somehow not picking up the configdrive partition | 21:23 |
TheJulia | mnaser: that is... crazy | 21:25 |
mnaser | hmm... or maybe not, cause I have dhcp-all-interfaces and I can't ping that system | 21:26 |
mnaser | but.. ipa stuff runs just fine | 21:26 |
stevebaker[m] | TheJulia: JayF A new "metalsmith reconcile" command which creates missing allocations would be useful, and the metalsmith list error could suggest using it. But that is a bit of work for a deprecated tool | 21:29 |
TheJulia | mnaser: I thought the point of dhcp-all-interfaces was kind of moot with networkmanager these days, then again local configuration drives are not checked by default by cloud-init either, you have to tell it to :( | 21:31 |
mnaser | TheJulia: then it must be something else that is not causing the system to get any networking at all? | 21:57 |
TheJulia | Are you providing a config drive with networking? | 21:58 |
mnaser | TheJulia: I'm not, but this is through nova, so I had kinda assumed it did on my behalf? | 22:02 |
mnaser | (but also... this thing worked fine until I added `-p multipath-tools-boot` into the dib command..) | 22:02 |
TheJulia | mnaser: not by default, depends on how it is configured if to send one or not. See nova.conf for that. | 22:03 |
TheJulia | Hmmm, fine until multipath-tools-boot is a good hint that maybe the boot sequence is being changed | 22:04 |
mnaser | TheJulia: well I can see in the console the system booted just fine but I can't ping it and clearly no metadata (cause it starts with `ubuntu login:`) | 22:06 |
JayF | Is that a two-way console? Could you login to it and see if there's a configdrive anywhere to be found? | 22:06 |
TheJulia | so it breaks metadata then | 22:07 |
TheJulia | did you pass a cloud-init data sources list for your image build? | 22:07 |
mnaser | JayF: it is.. but I just don't have a password and I'm not sure of the best way to do a dance to get into the box | 22:07 |
mnaser | TheJulia: nope :X | 22:07 |
JayF | mnaser: fair | 22:07 |
TheJulia | mnaser: heh, yeah, remember what I said about cloud init really not looking to config drives at all unless you explicitly tell it ? :) | 22:07 |
TheJulia | mnaser: add DIB_CLOUD_INIT_DATASOURCES="ConfigDrive" to your environment for building the image | 22:08 |
TheJulia | https://github.com/openstack/diskimage-builder/tree/master/diskimage_builder/elements/cloud-init-datasources | 22:08 |
TheJulia | and add the cloud-init-datasources element | 22:09 |
mnaser | alright let me see if that does it | 22:10 |
mnaser | it might be it but I still feel like it's so odd that this change broke it | 22:13 |
mnaser | building.. | 22:13 |
TheJulia | I'm just thinking if that might be a way around it that ubuntu package broke networking entirely | 22:13 |
mnaser | oh yeah that's an idea | 22:14 |
mnaser | TheJulia: actually if anything it might be useful to see what happened | 22:16 |
TheJulia | looks like it puts some udev rules in, but that shouldb't break networking | 22:16 |
TheJulia | shouldn't break that is | 22:16 |
mnaser | if it brings up a little bit of networking enough to get in | 22:16 |
TheJulia | yup | 22:16 |
TheJulia | if you do have a working config drive with network data, it might just work | 22:16 |
mnaser | yeah I have config drive enabled in nova | 22:17 |
mnaser | I will need that no matter what because bonds | 22:17 |
mnaser | not even just bonds.. bonds with vlans on top, so it'll be some fun to setup :) | 22:18 |
TheJulia | eek, so yeah, you're going to need it all then because default state you really shouldn't have working dhcp at the lowest level if you're fully ml2 integrated | 22:18 |
mnaser | TheJulia: well it technically works because the LACP fallback is pointing to the network it's booting on, but the bond has that network tagged | 22:52 |
TheJulia | Only if you have a native vlan on the port, for provisioning it is an access port | 22:53 |
mnaser | well, lacp fallback with native vlan for provisioning, then lacp'd port trunk'd | 22:56 |
opendevreview | Merged openstack/ironic master: Fix spurious CI job failures around partition images https://review.opendev.org/c/openstack/ironic/+/918466 | 23:05 |
mnaser | TheJulia: well that did the truck | 23:12 |
mnaser | trick | 23:12 |
mnaser | but truck too I guess | 23:12 |
opendevreview | Julia Kreger proposed openstack/ironic stable/2024.1: Fix spurious CI job failures around partition images https://review.opendev.org/c/openstack/ironic/+/918645 | 23:17 |
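
Note on the root device discussion (18:06 to 18:18): the default behaviour described in the log, "the smallest device >4GB which looks like a disk", with devices attached to multipathing excluded when multipath-tools is present, can be sketched roughly as below. This is only an illustration of that description, not ironic-python-agent's actual code: `BlockDevice`, `pick_root_device`, and the `multipath_member` flag are hypothetical names invented for the example, and the 4 GiB constant is an assumed reading of ">4GB".

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed reading of the ">4GB" threshold mentioned in the log.
MIN_ROOT_SIZE_BYTES = 4 * 1024**3


@dataclass(frozen=True)
class BlockDevice:
    """Hypothetical stand-in for a discovered device; not an IPA class."""
    name: str
    size_bytes: int
    is_disk: bool = True
    # True for an individual path (e.g. sda) that sits underneath a multipath map.
    multipath_member: bool = False


def pick_root_device(devices: List[BlockDevice]) -> Optional[BlockDevice]:
    """Return the smallest disk-like device above the threshold, skipping
    individual multipath paths. Returns None if nothing qualifies."""
    candidates = [
        d for d in devices
        if d.is_disk
        and not d.multipath_member
        and d.size_bytes > MIN_ROOT_SIZE_BYTES
    ]
    # min() keeps the first device on a size tie, so input order matters there.
    return min(candidates, key=lambda d: d.size_bytes, default=None)


GIB = 1024**3
with_multipath = [
    BlockDevice("sda", 100 * GIB, multipath_member=True),
    BlockDevice("sdb", 100 * GIB, multipath_member=True),
    BlockDevice("dm-0", 100 * GIB),
    BlockDevice("sdc", 500 * GIB),
]
chosen = pick_root_device(with_multipath)
print(chosen.name if chosen else "no candidate")
```

Without the multipath information every path looks like an independent disk, which matches the symptom at 18:04 (sda/sdb/sdc seen separately). As noted at 18:18, choosing anything other than the smallest qualifying device needs a root device hint.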