Monday, 2024-09-16

cardoeThis is getting crazy on this RBAC change I keep trying to help get over the line. https://review.opendev.org/c/openstack/ironic/+/928283 timeout on downloading an image now04:16
fricklera) why is ironic using an outdated cirros version? b) why are you downloading the image in CI instead of using a pre-cached copy?06:19
opendevreviewTakashi Kajinami proposed openstack/ironic master: Drop SQLALCHEMY_WARN_20  https://review.opendev.org/c/openstack/ironic/+/92939606:21
opendevreviewTakashi Kajinami proposed openstack/ironic master: Drop SQLALCHEMY_WARN_20  https://review.opendev.org/c/openstack/ironic/+/92939606:25
kubajjgood morning ironic! o/06:42
dtantsurfrickler: we try to use the cached version, but the code is somewhat fragile since it has to fall back to downloading to support local devstack runs09:22
dtantsurthis is my general impression of the issue, the details may differ09:22
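A minimal sketch of the "use the cached copy, fall back to downloading" behaviour dtantsur describes. This is not the devstack plugin's actual code: the function name, the directory layout, and the `download` callback are all hypothetical and serve only to illustrate the pattern.

```python
from pathlib import Path
from typing import Callable


def resolve_image(name: str, cache_dir: Path, dest_dir: Path,
                  download: Callable[[str, Path], None]) -> Path:
    """Return a path to the image, preferring a pre-cached copy.

    All names here are illustrative assumptions, not devstack's API.
    """
    cached = cache_dir / name
    if cached.is_file():
        # CI case: the image was baked into the node image ahead of time.
        return cached
    # Local devstack case: nothing was pre-cached, so fetch it ourselves.
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / name
    download(name, target)
    return target
```

The fragility mentioned above comes from the second branch: once the fallback exists, any mismatch between the expected and cached file name silently turns a CI run into a network download.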
dtantsurJayF: wdyt about landing https://review.opendev.org/c/openstack/ironic-inspector/+/928783? I'm not fond of it (mostly because I don't understand what exactly has changed to cause it) but I also don't want to spend too much time debugging CI for a deprecated project12:31
JayFMy question for stuff like this is always do we have any idea if it's broken in the real world13:15
JayFIf this is just devstack shenanigans, +2. If we don't know ... That's more hairy 13:15
JayFfrickler: iirc newer cirros broke us at one point and we pinned it. I'm onboard to revisit13:26
TheJuliagood morning13:29
TheJuliaWe've hit and pinned like 5-6 times through my entire time in openstack13:29
TheJuliaI think some of that was added complexity of trying to take it and make a partition image which we can likely revisit13:31
TheJuliaSorry, my brain got nerd sniped by product management at 6 AM sharp13:31
dtantsurJayF: as far as my limited understanding goes, it's very specific to how the inspector's devstack plugin is organized13:53
TheJuliaHow to know when the day is going to be interesting: When someone puts something so crazy in writing you have to respond with "I'm sorry, what?!" as a reply13:57
dtantsurMonday :D14:01
cardoemorning all... here's my weekly "ready for workflow" spam...14:54
cardoehttps://review.opendev.org/c/openstack/ironic/+/927780 https://review.opendev.org/c/openstack/ironic/+/928106 https://review.opendev.org/c/openstack/ironic/+/929364 https://review.opendev.org/c/openstack/ironic/+/929272 https://review.opendev.org/c/openstack/ironic-lib/+/928778 https://review.opendev.org/c/openstack/ironic-lib/+/92877614:57
TheJulia:)14:57
TheJulia<314:57
* TheJulia tries to get a slow cooked piece started for dinner in 10 hours14:59
cardoeA couple that don't have a +2 but are likely not that controversial and can be a quick review https://review.opendev.org/c/openstack/sushy/+/929055 https://review.opendev.org/c/openstack/ironic/+/92764515:00
JayFI'll note, Muhammad Ahmad from https://review.opendev.org/c/openstack/ironic/+/929364 reached out to me on linkedin about contributing more to Ironic. I'll respond today but that was nice context around why that patch was posted15:01
JayFWho is running meeting this morning?15:01
opendevreviewTakashi Kajinami proposed openstack/ironic-inspector master: Drop SQLALCHEMY_WARN_20  https://review.opendev.org/c/openstack/ironic-inspector/+/92953315:02
JayFI'm going to run it in lieu of anyone else :)15:02
JayF#startmeeting ironic15:02
opendevmeetMeeting started Mon Sep 16 15:02:27 2024 UTC and is due to finish in 60 minutes.  The chair is JayF. Information about MeetBot at http://wiki.debian.org/MeetBot.15:02
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:02
opendevmeetThe meeting name has been set to 'ironic'15:02
TheJuliaI ran it last week :)15:02
JayF#chair TheJulia15:02
opendevmeetCurrent chairs: JayF TheJulia15:02
JayFHi all, welcome to the Ironic meeting, we hold our meetings under the Openinfra CoC15:02
JayF#topic Announcements/Reminders15:02
masgharo/15:03
JayF#note  Standing reminder to review patches tagged ironic-week-prio and to hashtag any patches ready for review with ironic-week-prio: https://tinyurl.com/ironic-weekly-prio-dash15:03
dtantsuro/15:03
JayF#note This is R-2 week. Next week final releases must be cut and the integrated OpenStack release is early october.15:03
JayF#link https://releases.openstack.org/dalmatian/schedule.html15:03
TheJuliaWhich means we need to cut *this* week15:04
cido/15:04
JayFTheJulia: the linked thing seems to indicate we could do it next week15:04
JayFTheJulia: but I am very +1 to getting it done in advance15:04
TheJuliaJayF: they will force us for this week most likely, it has happened nearly every time15:04
TheJuliaanxiety overrules all15:04
JayFThat's my motto15:05
JayFlol15:05
JayFSo let's cut one this week then, sgtm15:05
JayF#note the next OpenInfra PTG which will take place October 21-25, 2024 virtually! Registration is now open! 15:05
JayF#link https://ptg.openinfra.dev/ 15:05
JayF#link https://etherpad.opendev.org/p/ironic-ptg-october-202415:05
JayFPlease add your topics to the etherpad, and review them async so we can keep the PTG live sessions to the point and on topic as much as possible.15:06
JayFThat's all on the agenda for announcements15:06
JayF#topic Review Ironic CI Status15:06
dtantsurRe releases: rpittau comes back on ?Wednesday?15:07
dtantsurRe CI: we put out SOOO many fires last week15:07
JayFyeah15:07
JayFOur CI has bitrotted significantly, in a handful of different ways15:07
dtantsurBroken inspector should be fixed by https://review.opendev.org/c/openstack/ironic-inspector/+/92878315:07
dtantsurthen... hopefully....15:07
JayFwe wanted to fix some of it with a replacement for ipa; we didn't do that this cycle15:07
JayFthere's a lot of random failures like Jens alluded to earlier, including downloading cirros from the internet15:07
JayFI feel like CI can consume infinite time, but between releases I think it'd pay off to try and tackle as many of these intermittent issues as we can15:08
TheJuliaI'm wondering if, instead of looking at CI during the PTG through the lens of fix/reduction, we need a radical re-envisioning, or a determination of what we can do to make it easier for ourselves to have more stable/reliable CI15:08
TheJulianote, that is a big picture thought, we don't have to discuss now()15:09
JayFI agree a lot15:09
JayFincluding documenting our specific CI matrix, what jobs cover what15:09
* dtantsur has deja vu15:09
TheJuliadtantsur: I said similar in the existing ptg notes15:09
dtantsurand in the past.. and in the past.. and in the past15:10
JayFWe have a few topics on PTG for this: replacing dnsmasq to help remove that periodic CI failure, training up folks on how to troubleshoot CI so it's less of a strain on the longer term folks15:10
TheJuliaall this has happened before, all this will happen again15:10
* dtantsur nods wisely15:10
JayFdtantsur: we've been doing this for ten years, I'd be amazed if we don't do this every two years15:10
JayFand I'm not sure it's a bad thing :D 15:10
* TheJulia channels https://en.battlestarwikiclone.org/wiki/Pythia15:10
JayFLooks like we'll have a handful of things to chat about CI during PTG 15:11
JayFDiscussion topics in the meeting agenda are old; am not going to cover those15:11
JayF#topic Bug Deputy updates15:12
JayFcid and masghar had volunteered to be deputies over a couple of weeks, at least when the agenda was updated15:12
JayFis there anything to note?15:12
masgharI forgot to update the agenda, apologies15:12
* JayF notes the agenda was last updated 8/26/2024 so that promise is beyond expired15:12
masgharNo new bugs for September15:12
JayF++ Thanks15:12
JayFDo we have a volunteer to be the next bug deputy?15:12
cidNo new bugs15:13
JayFI can probably take it if nobody else is signing up?15:13
JayFI haven't in a month+15:13
cidmasghar, we can both keep an eye until after release (?)15:13
masgharI wouldnt mind it at all :)15:13
cid++15:13
JayFI'll be around if you have any questions15:13
masgharPerfect, thanks15:14
JayF#note CID and masghar agree to bug deputy thru release 15:14
cidJayF: ty15:14
JayF#topic RFE Review15:14
JayFthere are no RFEs posted for review15:14
JayF#topic Open Discussion15:14
TheJuliabraaaaains15:14
TheJuliacardoe has some items requiring attention15:15
TheJuliaI can take a look after the meeting15:15
JayFI have a small thing for here, honestly probably a plug more than anything else -- I made a short video about OSSA-2024-003, https://www.youtube.com/watch?v=cz71C4tW3Pw -- feel free to point anyone at it who has questions15:15
JayFTheJulia: I landed most of them already I think15:15
cardoeJayF: what I linked? That was me scrapping about 20 minutes ago.15:15
JayFYeah, I've been busy15:16
JayFAnything else for open discussion?15:16
TheJuliauhhh15:16
TheJuliaAvailability in weeks leading up to the PTG>15:16
TheJulia?15:16
JayFI'll note it's unlikely I'll be here a week from today15:16
JayFas it's my birthday and I usually try not to work15:16
cidJayF: Happy birthday in advance15:17
masghar++ ^15:17
TheJuliaAs an FYI, the first whole week in october I'll be in the EU time zone for meetings. The week after that I'll be in Indiana for OpenInfra days15:17
TheJuliaThe week after that is the PTG15:18
JayFI might be out at right around that time, too15:18
* JayF checks calendar15:18
JayFYeah like 10/3 - 10/9 or so15:18
TheJuliaof further note, the week after the ptg, I have jury duty too15:18
JayFMy brother is getting married :)15:18
JayF** Oct 3 - Oct 9 (I know that was a us-ian way to write that)15:19
dtantsurI'll be out for a few days around Oct 115:19
JayF#note Many cores out first two weeks of October; please set expectations appropriately for getting code landed15:19
TheJulia++15:20
JayFLast call for items for Open Discussion?15:20
TheJuliaUhh15:20
TheJuliaOpenInfra Days NA - Indiana, about a month from now()15:20
TheJuliaI will have a project update/feedback session, if anyone else will be there15:21
kubajjo/15:21
TheJuliao/ kubajj 15:21
JayFo/15:21
JayFTheJulia: nice; if you want I can try to attend remotely but will not be on site 15:21
TheJuliaI don't know if they are recording forum sessions15:22
TheJuliasince I'm sort of splitting the time into high level what is going on and then requirements gathering15:22
JayFYeah, that's what I figured, just tossing out the willingness out there15:22
JayFI know having you and dtantsur on the line when rpittau and I did BM SIG @ CERN was super helpful15:23
JayF(and I think we've incorporated much of that actionable feedback into this release \o/)15:23
TheJuliaI'll see what I can do, but won't know likely until the day of15:24
opendevreviewMerged openstack/ironic master: Remove default override for RBAC config options  https://review.opendev.org/c/openstack/ironic/+/92828315:24
JayFYou know how to get ahold of me if you need :D 15:24
TheJulia\o/15:24
JayFLast call (part deux) for open discussion items?15:25
dtantsurI can haz a review request?15:25
dtantsurthe nc-si / no-power-off spec is sad and lonely: https://review.opendev.org/c/openstack/ironic-specs/+/92665415:26
TheJuliadtantsur: we can haz cheezeburger?15:26
dtantsurmmmm do want15:26
TheJuliadtantsur: I can take a look in a day or two, I need to get some other stuff sorted first15:26
JayFdtantsur: I have some concerns but I think the only way to resolve them is "don't support this hardware" which isn't reasonable15:27
JayFdtantsur: so I'll take a look15:27
TheJuliaI just independently had a crazy idea without looking at the spec, I wonder how similar it might be?!15:27
dtantsurConcerns should at least be listed, mitigated and documented15:27
TheJulia++15:27
JayFYeah, mainly I think there are cases we can leak a running  ramdisk15:28
dtantsurThe spec is kinda boring, so a crazy idea might turn out something completely different15:28
JayFwhich are ~unavoidable, in the same way the "we flip the power and hope the machine comes up" is unavoidable15:28
TheJuliaJayF: for network booting?15:28
JayFlike think about15:28
JayFwhat happens if the agent completely goes away15:28
JayFthe "timeout" bug gets 100000x worse if you can't reboot via oob somehow15:28
JayFthat goes from "a deploy fails" to "a human must touch this machine"15:28
dtantsurYou can reboot, you cannot power off15:29
TheJuliaoh yeah, possible, I was more thinking these cases are likely vmedia based15:29
JayFdtantsur: you can reboot /via oob/?15:29
dtantsurYes, sure. Only power off is a problem.15:29
JayFdtantsur: if so that will obviate like 90% of my concerns15:29
TheJuliaI think we always try inband first15:29
JayFokay nice, I was thinking the inband method too 15:29
dtantsurWe do try in-band power off first, which will also need to be changed to a reboot15:29
TheJuliaoh, yeah15:30
TheJuliaugh15:30
JayFso it's not like the interface to power off doesn't exist15:30
TheJuliaor maybe smarter?!15:30
JayFit's like, power off for this hardware isn't a usual thing to do?15:30
dtantsurPlease check the spec, it answers all these questions :)15:30
TheJuliachanging running default across a fleet is slightly problematic15:30
JayFI've read the spec15:30
TheJuliaack15:30
* TheJulia read a much earlier version, I think15:30
JayFprobably same15:30
* TheJulia needs to refresh the brain15:30
TheJuliaOh, have we posted our prelude?15:30
JayFYeah it does not specifically say it's oob, I think there are some unsaid assumptions in there, I'll give it a quality review once meeting is over15:31
TheJuliaThere is a moral imperative for a silly prelude15:31
JayFTheJulia: nope, that's another thing on my list for today if nobody else pushes it15:31
dtantsurHappy to elaborate wherever needed15:31
JayFI am not a silly writer15:31
JayFif you want a boring prelude I will write one15:31
TheJuliaokay, I'll see what I can hammer out after my next meeting15:31
* JayF prefers boring for docs that will be read by multiple cultures / esl folks15:31
JayFI don't trust myself to know what is funny for everyone :)15:32
TheJuliaThat might be better15:32
* dtantsur cannot English well enough for a funny prelude (at least the one that most people will find funny)15:32
* TheJulia thinks dtantsur underestimates his english skills15:32
JayFI was going to suggest we highlight the operator-facing enhancements:15:32
JayFsame15:32
TheJuliaor German has replaced them15:32
JayFlike 2-3 features specifically asked for at BM SIG have landed this release, ironic-guest-metadata, runbooks15:32
JayFwe have a lot of operator-happiness changes in this one15:32
dtantsurTheJulia: I do have a mess in my head nowadays :D15:33
kubajjbtw, I discovered a bug (that I caused) in the RAID skip_block_devices and am working on fixing it15:33
TheJuliaDid docs for runbooks merge? I ask because I had someone send me a feature list and it lacked it15:33
TheJuliakubajj: oh, good to know. Is there a bug in launchpad?15:33
JayFcid: ^ re: runbook docs15:34
JayFI think they did?15:34
JayFI think they were alongside15:34
kubajjTheJulia: I'll have a look, not sure15:34
TheJulianothing on runbooks in https://docs.openstack.org/ironic/latest/admin/15:34
cidJayF: yes, unless something was left out, it was all part of the same patch.15:34
TheJuliaand that is where I have folks looking for "features"15:35
TheJulianor https://docs.openstack.org/ironic/latest/user/15:35
TheJuliaI bet it is just not linked in15:35
JayFUnless objection, going to close the meeting as I think this is rapidly flowing into normal conversation?15:35
JayFme too TheJulia 15:35
TheJuliaJayF: sgtm15:36
JayF#endmeeting15:36
opendevmeetMeeting ended Mon Sep 16 15:36:07 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:36
opendevmeetMinutes:        https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-09-16-15.02.html15:36
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-09-16-15.02.txt15:36
opendevmeetLog:            https://meetings.opendev.org/meetings/ironic/2024/ironic.2024-09-16-15.02.log.html15:36
JayFTheJulia: it was added inline to servicing/cleaning docs15:36
JayFwhich tbh makes sense to me versus a dedicated page based on the feedback from the docs review15:37
TheJuliaJayF: ahh15:37
JayFhttps://docs.openstack.org/ironic/latest/admin/cleaning.html#runbooks-for-manual-cleaning https://docs.openstack.org/ironic/latest/admin/servicing.html#using-runbooks-for-servicing15:37
TheJuliakind of yeah, somehow the pm folks looking delineated cleaning/servicing15:37
JayFthere's no specific guide, outside of client/api docs, on how to create them15:37
JayFbut that seems mostly OK to me?15:37
TheJuliaWe likely need something user centric on performing actions15:39
TheJuliaotherwise the bar is a bit high15:39
TheJuliaand somewhat expectedly so, but an easier on-ramp is always better15:39
JayFthat's basically what dave said15:40
JayFwe have a lot of "you can do x"15:40
JayFbut not a lot of "if you want [thing], do these steps:"15:41
JayFso our docs assume basic context going in that you have an idea what people normally use these tools for15:41
JayFwhich may not be true for all folks15:41
JayFrealistically the docs that were added for runbooks fall into this trap too, but I don't expect folks to overhaul the docs for a feature :)15:42
TheJuliaDefinitely not true for folks who haven't touched ironic in ages and very much not true for people stumbling upon it for the first time or with a "oh, its a thing" context15:42
JayFyeah, exactly15:42
TheJuliatrue, yeah15:43
TheJuliakubajj: if you can just create something, that would be helpful. Doesn't have to be much, just a skeleton of context15:49
kubajjTheJulia: working on it, trying to remember what is the current state before I started fixing stuff :D15:51
TheJuliaheh15:52
TheJuliathat happens :)15:52
TheJuliaI think we had someone notice something else about 5 months ago15:52
TheJuliayou might just want to double check git blame15:52
JayFcardoe: so the other day, you were asking about monitoring/metrics for Ironic iirc, yeah?15:54
cardoeyeah15:54
JayFYou know about the built-in Ironic/IPA metrics support?15:55
JayFAPI/Conductor/IPA will send metrics to statsd; for support w/the prometheus-exporter, you can only send from conductor15:55
TheJuliaconductor is also two sided, prometheus-exporter stuff was originally for scraping node data up15:56
TheJuliabut it now captures the metrics stuffs too if so configured15:56
JayFso statsd metrics: only application performance metrics15:56
JayFprom metrics: conductor perf metrics + node data?15:56
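For readers unfamiliar with what "send metrics to statsd" means on the wire, here is a small sketch of the plain-text statsd datagram format (`name:value|type`). It does not use Ironic's metrics code; the metric names in the usage note and the helper names are made up for illustration, and 8125 is assumed only because it is the conventional statsd UDP port.

```python
import socket


def format_timer(name: str, millis: float) -> bytes:
    """Encode a statsd timer sample, e.g. b'some.metric:12.5|ms'."""
    return f"{name}:{millis:g}|ms".encode("ascii")


def format_counter(name: str, count: int = 1) -> bytes:
    """Encode a statsd counter increment, e.g. b'some.metric:1|c'."""
    return f"{name}:{count}|c".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1",
                port: int = 8125) -> None:
    """Fire-and-forget one datagram at a statsd listener (assumed address)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

A caller would do something like `send_metric(format_timer("example.conductor.do_thing", 12.5))`. Because the transport is UDP, nothing fails if no statsd daemon is listening, which is part of why these are application performance metrics only.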
TheJulia(in other words, if you want more/other data, we need to understand that for the PTG) :)15:57
kubajjTheJulia: will have a look15:57
JayFand speak now because statsd support is potentially on the chopping block15:58
JayFwith ironic-lib getting parted out15:58
TheJulia... and no we're not going to take ironic-lib to Gotham Garage15:58
kubajjTheJulia: this? https://review.opendev.org/c/openstack/ironic-python-agent/+/91585815:59
TheJulia(https://www.imdb.com/title/tt8893550/)15:59
dtantsurspeaking of ironic-lib: https://review.opendev.org/c/openstack/ironic-python-agent/+/92877916:00
TheJuliakubajj: I think there was something else16:02
dtantsurand once we get the inspector job fixed, I think the next step should be https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/92828416:03
opendevreviewMerged openstack/ironic-lib master: Remove code already migrated to Ironic and IPA  https://review.opendev.org/c/openstack/ironic-lib/+/92877616:03
dtantsurI wonder why Ubuntu renamed its OVMF files to OVMF_CODE_4M.fd16:08
dtantsurwtf is 4M? should I be worried?16:08
JayFwhat package is that in?16:08
JayFhttps://launchpad.net/ubuntu/+source/qemu/+changelog might have details16:08
dtantsurJayF: ovmf16:08
dtantsur"I've also read about KVM guys defaulting to the 4M firmware image, to prevent UEFI update failures, as these cert lists become too long to store within the "old" 2M images." - from googling16:09
opendevreviewMerged openstack/ironic master: Make sure qemu-img command is available in debian/suse  https://review.opendev.org/c/openstack/ironic/+/92927216:09
dtantsurthis_is_fine.jpg16:09
opendevreviewMerged openstack/ironic master: CI: Remove scope enforced ci jobs  https://review.opendev.org/c/openstack/ironic/+/92810616:11
opendevreviewMerged openstack/ironic master: Remove skip check for Python 3.6  https://review.opendev.org/c/openstack/ironic/+/92778016:11
JayFdtantsur: https://review.opendev.org/c/openstack/ironic-specs/+/926654/2#message-fb51f6d8cfaf50a89ab6261b9869aa0d03e98fc3 as promised16:11
kubajjTheJulia: I've drafted this: https://bugs.launchpad.net/ironic/+bug/2080871 (managed to fix 1-3 and testing the 4th one) - problem is that deployment is taking ages, but I have access to our qa nodes now16:12
opendevreviewDmitry Tantsur proposed openstack/bifrost master: WIP: start working on Ubuntu 24.04 support  https://review.opendev.org/c/openstack/bifrost/+/92889516:16
dtantsurJayF: thanks! Typed some quick responses, please check. Will follow-up with everything else tomorrow.16:22
JayFdtantsur: tl;dr my reply is "rescue is the model"16:23
JayFdtantsur: it's already a place where we swap networking on a running IPA, and we can use the same pattern, just instead of putting it up on the new network, we batten down the hatches in case it stays running16:24
dtantsurokay, so basically on receiving $something, IPA stops its API service and heartbeater?16:24
Pcmalih_Hello this is test message16:25
JayFhi Pcmalih_ 16:25
JayFdtantsur: yeah, basically equivalents of `service iptables panic` and `for i in devs; do ip link set $i down`16:25
JayFdtantsur: obviously writing that code will be tricky but I think we can do it 16:26
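A rough sketch of the `for i in devs; do ip link set $i down` half of JayF's suggestion, written so that it only builds the commands instead of running them. The function name is hypothetical, skipping the loopback device is an assumption of this sketch, and (as dtantsur notes just below) none of this is guaranteed to work when IPA runs inside a container.

```python
from pathlib import Path
from typing import List


def link_down_commands(
        sys_net: Path = Path("/sys/class/net")) -> List[List[str]]:
    """Build one `ip link set dev <name> down` command per interface.

    Interfaces are discovered from sysfs; `lo` is skipped (an assumption).
    The commands are returned, not executed, so callers can log or
    review them before deciding to run anything.
    """
    devices = sorted(p.name for p in sys_net.iterdir() if p.name != "lo")
    return [["ip", "link", "set", "dev", dev, "down"] for dev in devices]
```

Running the result would be a matter of passing each list to `subprocess.run(cmd, check=True)` with sufficient privileges inside the ramdisk.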
dtantsurJayF: mmm, I don't quite like running system actions16:26
JayFdtantsur: we do it in rescue16:26
dtantsurnone of these will work on our ramdisks, for instance16:26
JayFthis is a ramdisk think though16:26
JayF**thing16:26
dtantsurbecause IPA is in a container16:26
JayFI'm talking about /in the ramdisk/16:26
JayFIPA is not a container in any upstream ramdisk16:26
JayFand we pulled it out explicitly to enable this16:26
Pcmalih_Hi @Jay, I have a question about ironic used in kolla-ansible yoga/stable. The VIF attach command is not effective. When I try to deploy an image it says no VIF found. 16:27
JayFbecause I got tired of patching container runtimes breaking our bios updates :)16:27
dtantsurwell, it is for us and potentially more people16:27
JayFPcmalih_: hey, can I get a little context? I'm wondering if we work together based on versions/16:27
JayFThen allow the container to do those things? 16:27
dtantsurbut okay, I can do some best effort network shutdown after making sure IPA is silent and not reachable16:27
JayFthe thing I realize now though16:27
JayFand what I *thought* you were alluding to16:27
JayFis the other direction16:27
JayFwhen we have an installed thing, when would we switch to IPA network16:28
JayFwe can't do that nearly as safely because we cannot be assured the OS is no longer running16:28
dtantsursigh, yeah16:28
dtantsuralso: how do we send the reboot command to IPA that has been shut down?16:28
dtantsurI need to think this all through once again. but I think there is a reason why I ended up with the spec like this.16:28
JayFyou don't, you do it oob16:28
JayFWell the thing is, for flat networking this is scary, and now I think it's scary for multitenant networking16:29
dtantsurwe very explicitly start with in-band to make sure we don't lose any information16:29
JayFthe A++ solution is bidirectional conversation with the switch16:29
JayFwhere we can see switchport go down as a proxy for reboot16:29
JayFeven though that can be fooled16:29
dtantsurOOB is a hard reset. unless we want to get into the business of using soft reboot16:29
JayFdtantsur: prepare_for_network_and_reboot() { sync;sync;sync;sync;sync;sync;./kill-the-network.bash } kinda-/s16:30
JayFI'll note the original IPA reboot was "echo s > /proc/sysrq-trigger && echo b > /proc/sysrq-trigger" :)16:30
JayF(not that I think that's good, just saying that it's a continuum of tradeoffs)16:30
dtantsurAre we ready to stop using in-band reboots for *all cases* then? And rely on sync to work?16:31
dtantsurThe last thing I want to do is to catch data integrity bugs that only happen when this flag is on16:31
JayFWhy all cases to cater to a single weird hardware behavior? 16:31
dtantsurYou say it's safe to do, so why keep the complexity16:31
JayFyeah, that's why I wouldn't wanna for all cases :/ 16:31
JayFno, I said it's a continuum of tradeoffs16:31
JayFif we don't have the interfaces we need to do everything right, it's a choice of what things we do less-right16:31
dtantsurwell, so far nobody asked for the neutron interface support16:32
dtantsurso it's weird to try make it work at the expense of the people who actually want this feature...16:32
JayFthe flat support was originally a concern for me before we even went down this path 16:32
opendevreviewMerged openstack/ironic-lib master: Remove the rest of the rootwrap machinery  https://review.opendev.org/c/openstack/ironic-lib/+/92877816:32
JayF[DEFAULT]/its_ok_if_ipa_keeps_running=True16:32
dtantsurI like your idea of detecting the running IPA though16:32
JayFI'm not 100% serious, but right now I'd almost want this behavior hidden behind a flag for safety16:32
dtantsurthis can very well be done16:32
JayFI don't like the idea that anyone who could update driver_info (a project admin or manager in some cases) could open the security crack to try and jam a crowbar into :)16:33
dtantsurFlag - maybe. Although, in the RBAC universe, if you can update driver_info, you can do so many things.16:33
dtantsurI mean.. wanna know how our folks made it work for now?16:33
JayFif you tell me, will I consider it a security bug?16:34
dtantsurthey literally install a Redfish proxy between Ironic and hardware that intercepts power offs and turns them into reboots16:34
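To make the proxy idea concrete, here is a sketch of the one transformation such a proxy would need: rewriting the body of a Redfish `ComputerSystem.Reset` action so that power-off requests become restarts. The log does not say how the real proxy is built; the function name and the exact mapping of `ResetType` values below are assumptions of this sketch.

```python
import json

# Assumed mapping from power-off reset types to their restart equivalents.
OFF_TO_RESTART = {
    "ForceOff": "ForceRestart",
    "GracefulShutdown": "GracefulRestart",
}


def rewrite_reset(path: str, body: bytes) -> bytes:
    """Return the request body, with power-offs turned into restarts.

    Anything that is not a ComputerSystem.Reset POST body, or that asks
    for a reset type other than a power-off, passes through unchanged.
    """
    if not path.rstrip("/").endswith("/Actions/ComputerSystem.Reset"):
        return body
    payload = json.loads(body)
    replacement = OFF_TO_RESTART.get(payload.get("ResetType"))
    if replacement is None:
        return body
    payload["ResetType"] = replacement
    return json.dumps(payload).encode("utf-8")
```

A complete proxy would also have to answer power-state reads consistently, which is one reason the approach is fragile.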
JayFconsider that before making your decision :)16:34
JayFthat is actually a slick as hell solution16:34
dtantsur:D16:34
JayFI don't dislike that at all16:34
dtantsuralso super fragile16:34
JayFI might like that better than your spec LOL16:34
JayFwell yeah, it gets the fragility outta ironic where you wanna put it LOL16:34
JayF"we work around it right now by making it not your problem" - sgtm /s ;)16:35
dtantsurfor some definition of "you"... we broke this hack at some point, and it became my problem16:35
JayFPcmalih_: You disconnected, did you see: hey, can I get a little context? I'm wondering if we work together based on versions/16:35
dtantsurmy point is: if you can change driver_info, you can do anything, including fishing for BMC passwords or changing a lot of Ironic logic16:35
JayFdtantsur: I mean, yeah, and you know philosophically I try to save all of us time with the hope it gets paid back to upstream, but I do think it's different when we're talking about taking on potential security risk16:35
JayFthat's where my head is when I say this16:36
JayFremember you're talking to someone who just lost a month to testing patches and doing security fixes16:36
dtantsurWhat's the threat model? A person who can change driver_info but is not trusted to make an informed decision on powering off vs rebooting?16:36
pcmalih_@JayF: I am able to enroll Supermicro using Redfish driver. It enrolls, and goes to manageable and available, but the deploy command failed with a no vif error.  I did create the vif using vif attach and it's visible using the show node command. I'm using kolla-ansible (manually enabled ironic)16:37
opendevreviewMerged openstack/ironic-inspector master: Try enabling port 5050 on PUBLIC_BRIDGE  https://review.opendev.org/c/openstack/ironic-inspector/+/92878316:37
opendevreviewMerged openstack/ironic master: Bring back the metal3-integration job  https://review.opendev.org/c/openstack/ironic/+/92928616:37
JayFa tenant who has understanding of this can potentially prevent a reboot from working, causing the tenant OS to be alive in IPA16:37
dtantsurhow?16:37
JayFI've 100% experienced, personally, hardware that out of the box you could essentially make the BMC useless if you loaded the right set of commands16:37
JayFthis was 10 years ago+IPMI and I hope it doesn't exist still, but ever since then I've been very suspect of BMC actions that we can't verify actually happened16:38
JayFpower off [did we power off?] power on [did we power on?] mitigates this in a way a single reboot we can't confirm does not16:38
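JayF's point about verification can be sketched as follows: a power cycle split into "off, confirm, on, confirm" gives two observable checkpoints, whereas a single reboot request gives none. This is an illustration only, not Ironic's power-sync code; `set_state` and `get_state` are hypothetical callbacks standing in for BMC calls.

```python
import time
from typing import Callable


def wait_for_state(get_state: Callable[[], str], wanted: str,
                   attempts: int = 10, delay: float = 0.0) -> bool:
    """Poll get_state() until it reports `wanted` or attempts run out."""
    for _ in range(attempts):
        if get_state() == wanted:
            return True
        time.sleep(delay)
    return False


def verified_power_cycle(set_state: Callable[[str], None],
                         get_state: Callable[[], str],
                         attempts: int = 10, delay: float = 0.0) -> bool:
    """Power off, confirm off, power on, confirm on.

    Returns False as soon as either confirmation fails, so a BMC that
    silently ignores the power-off request is detected.
    """
    set_state("off")
    if not wait_for_state(get_state, "off", attempts, delay):
        return False
    set_state("on")
    return wait_for_state(get_state, "on", attempts, delay)
```

As dtantsur replies below, this only helps while the BMC reports its state honestly; a compromised BMC could return the expected value without changing the power state.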
dtantsurif you can mess with the BMC, you can work around this limitation too (just make the frontend return the right value without actually changing the power state)16:39
JayFthat's not how the exploit I personally experienced worked16:40
dtantsurif what you're looking for is a flag to disable this feature, I'm happy to provide it16:40
JayFI'm looking to have the conversation to see if two smart people with an audience can figure out the safe path through the woods :D16:40
pcmalih_ I am able to enroll Supermicro using Redfish driver. It enrolls, and goes to manageable and available, but the deploy command failed with a no vif error.  I did create the vif link by using the vif attach command and it's visible using the show node command. I'm using kolla-ansible (manually enabled ironic)16:41
dtantsurI don't think there is a 100% safe path to use this feature and also uphold the guarantees of the neutron interface - this is why I opted to fail validation rather than putting people at risk16:41
JayFpcmalih_: I'm not sure exactly how to get past that; I would assume something about the tenant networking side is weirdly misconfigured16:41
JayFI think it's scary for flat case, you just basically make me go from "it's less scary for neutron" -> "it's more scary for neutron" :D 16:41
dtantsurwe already have quite a few knobs for scary features like disabling cleaning or running firmware updates16:42
JayFdtantsur: let me ponder it, as it is, I'd say we'd put it behind a flag that details the security risk in multitenant environments; but I don't wanna give up on thinking of another path16:42
JayFdtantsur: if you remember back, I was upset when we added those, too ;) 16:42
* TheJulia exits meeting with scrambled eggs for brains16:42
JayFI am consistently paranoid :D 16:42
TheJuliaso. much.chatting.16:43
JayFpcmalih_: I feel like your nick is slightly familiar but I can't place it, do we know each other?16:43
pcmalih_For tenant networking, can you please confirm whether the neutron dns range in kolla-ansible needs to be the same as the physical BMC network or the neutron virtual network created by openstack?16:43
dtantsurJayF: I'll also ponder this conversation with a fresher brain, i.e. tomorrow16:43
JayFThat's probably a better question for the k-a folks, but I can have a look. 16:43
pcmalih_It's my first time chatting here but I have attempted to chat in the past. I was part of the blueprint paper released recently. 16:44
TheJuliapcmalih_: welcome!16:45
pcmalih_This vif unavailability issue is acting as blocker  16:46
JayFyeah, it's just tough trying to answer questions about years-old releases in install automation I don't know about16:46
JayFI'll have a look but please set expectations appropriately with that in mind :)16:47
pcmalih_Ah.. but for the sake of clarity. What is the expected value for neutron dhcp subnet? 16:50
JayFthat's what I'm looking for friend, I just don't know k-a code so it's not immediate16:50
JayFthat doesn't always 1:1 map with an ironic concept 16:50
JayFdtantsur: I documented this conversation with a link to the logs and a tldr of my take on the conclusion for now in the spec, so it's not lost16:51
dtantsurthx16:51
pcmalih_@JayF: do you mean I need to use Bifrost etc. instead of kolla-ansible?16:51
JayFThink of Ironic as raw material :) kolla-ansible, bifrost, metal3, openstack-ansible, etc all use it in slightly different ways16:51
JayFI know a lot about the raw material but not so much about this shape of it :D16:52
pcmalih_Okay so what is the role of VIF? What configurations it expects?16:52
JayFpcmalih_: jay in kolla-ansible on  unmaintained/yoga via  v3.11.9  \n ❯ rg neutron_dhcp_subnet -- is the field named something different?16:53
JayFpcmalih_: well that's just the thing: it depends on context; if using neutron network interface you'd have (I believe?) a vif in the cleaning *and* tenant networks16:53
TheJuliawhat is going on with vif availability?16:54
TheJuliathe quick tl;dr please16:54
pcmalih_Does it need ML2 driver configuration for the vif to be detected by the node?16:57
TheJuliavifs are not detected by the node, ironic gets told the user required vif mapping(s)16:58
TheJuliaironic creates/deletes some others as needed for cleaning/deployment/rescue16:59
TheJuliabut it always cleans those up16:59
TheJuliaThere were some issues around that, but I think I got them all fixed by Train16:59
*** awb_ is now known as awb17:00
TheJuliaAnd being one of the people who spent a lot of time chasing down and trying to remedy those issues, I'm curious to understand *exactly* what is taking place17:01
pcmalih_@TheJulia: would you please send a link to those fixes in Train? We also find Train to be the working one. But yoga/stable is strucking my neck! 17:01
TheJulia(how is that as service, when the board chair chimes in between emails)17:01
* JayF adds another train ironic installation to his sadness board /s17:02
TheJuliahttps://review.opendev.org/q/I8d683d2d506c97535b5a8f9a5de4c070c7e887df was one of them, but that was in queens days17:05
TheJuliawhich makes me wonder if there is something entirely separate happening outside of our control17:05
pcmalih_Unmaintained/yoga with bifrost works fine but stable/yoga with k-a is failing with vif not found 17:07
pcmalih_Same machine, same driver etc17:07
TheJuliaso kolla-ansible is failing where reporting vif not found?17:08
pcmalih_@TheJulia: yes17:08
TheJuliaSo yoga could be an RBAC configuration issue, but that "where" is going to be critical to understand17:09
TheJuliasince Wallaby is where we and other projects started evolving the RBAC model heavily17:10
pcmalih_@TheJulia yes k-a is failing to detect VIF 17:12
TheJuliapcmalih_: so I don't understand what you mean when you say that17:13
TheJuliaif you can supply a complete error or snippet of log, that would help orient the discussion17:13
* TheJulia sighs17:13
Pcmalih_The VIF list command didn’t show anything, even if I create a VIF using the vif create command. 17:16
TheJuliaso if you're using ironic, and you're asking for a vif list, you see nothing?17:16
TheJuliawhat user roles are you running your account with?17:16
Pcmalih_Yes openstack baremetal node vif list17:17
Pcmalih_User role is Admin. 17:18
TheJuliaSo the tl;dr is you do "openstack baremetal node vif create x y z", and then you do "openstack baremetal node vif list" and there is nothing?17:19
TheJuliano intermediate step/action taken?17:20
TheJuliaPcmalih_: can you verify you have the reader role? https://github.com/openstack/ironic/blob/unmaintained/yoga/ironic/common/policy.py#L833-L84017:22
Pcmalih_First I enable ironic in k-a, set ironic_dnsmasq_interface to the physical interface, and ironic_dnsmasq_dhcp_range to the range of IPs and subnet created by neutron where dhcp_agent and subnet are attached.17:23
Pcmalih_Set ironic_cleaning_network to public (the default created network). Also set ironic_dnsmasq_boot_file to pxelinux.017:24
Pcmalih_Then create the node and set its redfish and node parameters. Then create a neutron interface and attach it to the node with the vif create command. Then do vif list. 17:25
TheJuliawow17:27
TheJuliaPcmalih_: your proxy/router/browser dislikes the web irc client interface17:28
Pcmalih_My web client session keeps disconnecting after a few minutes17:29
TheJuliayeah17:29
JayFPcmalih_: I suggest a free sub to irccloud.com; that might work better for you17:29
TheJuliaEvery 2.5 minutes most likely, your network/router/proxy is likely slaying the sockets17:30
TheJuliasince that is the maximum "theoretical" stale keepalive time17:30
TheJuliaPcmalib: JayF just recommended using irccloud, it is free and has extra logic to handle things like breaking connections17:30
Pcmalib@TheJulia: I may have missed your message after [22:25] <Pcmalih_> Then create the node and set its redfish and node parameters. Then create a neutron interface and attach it to the node with the vif create command. Then do vif list17:31
TheJuliaPcmalib: so, we need you to confirm the roles your user has from keystone because honestly, it sort of sounds like you're missing the reader role17:31
JayFPcmalib: fwiw, also https://meetings.opendev.org/irclogs/%23openstack-ironic/ exists to find things you might have missed :) although I do recommend a better client17:32
TheJuliaso if secure rbac is enabled by kolla-ansible on that version and you somehow don't have the reader role, you *should* end up with an empty list or an access denied17:32
* TheJulia doesn't remember exactly which but thinks it is an empty list inherently if you have a valid user17:32
PcmalibDo you want me to manually add readers role in adminrc file and use that?17:34
TheJuliauhh, it is whatever is in keystone, not what is defined in adminrc17:34
TheJuliaso you have to look at the rights keystone has on file17:34
TheJuliabecause ironic turns around, takes your token, goes and asks keystone for "what is it really"17:34
TheJuliaand uses *that*17:34
TheJuliacan't trust an authenticated user as an inherently authorized user17:35
JayFopenstack user show $user ## might tell you the roles17:36
JayFthat is a command that exists, unsure if it gives you what you need17:36
TheJuliaI think it does17:36
TheJuliaI think17:36
TheJuliait has been a while since I've invoked it17:36
jssfr`openstack role assignment list --user $user --names`17:36
jssfris what you're looking for. I don't think `user show` shows the roles.17:37
TheJuliajssfr: you're a lifesaver!17:37
* JayF says loudly to nobody in particular that he'd like a bag full of money17:37
JayFI just figured while people were dropping in and granting wishes :D 17:37
JayF(ty jss)17:37
Pcmalib@Julia Appreciate that and will give it a try today asap. 17:38
* TheJulia goes and turns on the surround sound and starts Rammstein Radio on Pandora17:38
TheJulia.... I was supposed to touch code today, wasn't I ?17:38
jssfryou're welcome17:38
jssfralso pro tip: don't mix up `openstack role remove` and `openstack role delete`.17:38
jssfrthe latter will have you find out the hard way ~~how to restore data from mariadb WALs~~ if your backups work.17:38
Pcmalib@JayF how do I tag someone in a reply? I am trying my luck with the @ keyword17:39
TheJuliajssfr: I'm twitching already17:39
JayFmost clients just need the person's name said to highlight17:39
JayFthe @blah vs blah: vs ... are all just aesthetic choices17:39
TheJuliafor example, I type jay<tab> and irccloud turns that into JayF, and I then just add a :17:39
JayFirccloud adds the ': ' for me, too17:40
JayFis that configurable?17:40
TheJuliait might be17:40
JayFnot that I care, I'm just intrigued if we have different behavior17:40
TheJuliaI've been using irccloud for like 10 years, so I might be missing some settings17:40
TheJuliaoh, if it is the beginning of a line, it does add it automatically17:40
TheJuliaSo, Time for Ethyl 2-cyanoacrylate ?17:42
clarkbweechat will only append : if it happens at the beginning of a message17:42
TheJuliaso next irc release would be 26.1 right?17:44
clarkbIRC is specified in RFCs, which are simply numbered and don't have a relationship to previous versions? Anyway https://modern.ircdocs.horse/ looks like a collection of things that could be the next rfc (but that probably won't happen)17:48
JayFIRCv3 is a thing that exists, and irccloud supports it17:52
JayFthe irccloud network is an IRCv3 network17:53
JayFif more things used v3 we'd have less need for matrix and friends tbh17:53
TheJuliaerr, not IRC, Ironic17:54
TheJuliasorry, brain needs migraine meds17:54
opendevreviewJulia Kreger proposed openstack/ironic master: Add Prelude for end of cycle release  https://review.opendev.org/c/openstack/ironic/+/92956417:56
TheJuliaJayF: ^17:56
TheJuliaedit at will17:57
JayFI wonder if it's worth specifically calling out that was feedback given in a BM SIG that we followed up on 17:58
JayFseems weird for a prelude but maybe an oppo to show people the value of participation17:58
TheJuliaI focused mainly on features in our reno, I didn't get back to the sig etherpad17:59
* TheJulia takes migraine meds17:59
* TheJulia also takes the meds which should have been taken with breakfast18:00
* JayF +2s it18:00
JayFI couldn't find a way to make it sound not-bad to do what I suggested18:01
JayFand you covered the things I was specifically thinking should be in there18:01
TheJuliaI know the block bios stuff was in that etherpad18:01
* TheJulia expects operators to scream in joy at being able to block people from adding bios nodes18:01
JayFI was surprised to hear people wanted that18:02
JayFcid: ^ she's talking about your feature, btw18:02
TheJuliaWhen you think about it, it makes a lot of sense18:02
TheJuliabecause users get stuck in their ways and may not really grok the difference there18:02
JayFIt basically allows operators with clouds where multiple people are enrolling to remove a decoder ring :)18:02
TheJuliaI can tell you, with my more PM-ey hat on, I'm acutely worried about baremetal hardware users' perceptions as we move forward over the next couple years18:03
TheJuliasince eventually CentOS/RHEL won't be bios bootable at all anymore too18:03
TheJuliaYup18:03
* cid starts reading 2+ hours of chat log to get up to speed with the discussion.18:03
JayFno need cid, just pointing out TheJulia said opers were gonna jump for joy over your features18:03
TheJuliacid: eh, maybe 20 minutes :)18:03
JayFspecifically the allowed boot modes stuff18:04
TheJuliacid: thank you for putting in the effort on that!18:04
JayFTheJulia: looking at the etherpad; we filed 3 RFEs, one complete, one in progress (awaiting feedback from CERN actually), one unmoved18:04
JayFWe have an email out to CERN to help us figure out behavior of stress-ng with >1 GPU to enable burnin18:05
TheJuliastevebaker[m]: speaking of, I bet we might want to start asserting that config in openstack-k8s-operators at some point since I think we're going to hit a major pain point when we get to our next major version if we don't toggle it out of the gate18:05
TheJuliawhat was the unmoved one?18:05
JayFhttps://bugs.launchpad.net/ironic/+bug/206908318:05
JayFtl;dr enable burnin as an inspection thing18:05
TheJuliaahh, yeah18:06
JayFso you can get metrics on speed and catch anomalous hardware18:06
TheJuliaand that makes sense only with the rest of the pipeline there18:06
TheJulia++18:06
TheJuliamakes tons of sense18:06
TheJuliaI... wonder if someone like NobodyCam might be able to make use of similar18:06
JayFoh, we should mention image types in there too18:07
* JayF does it18:08
JayFyeah nevermind it's in upgrade18:09
JayFand I don't want our happy prelude to have sad padlocks18:09
* cid feels a sense of accomplishment, having run through all of the chats now.18:35
cidJayF, TheJulia: Oh, ty! It really was a team effort.18:36
JayFyep, you've been here the whole release18:36
JayFit's nice to take a minute, take a deep breath, and realize you achieved something18:36
JayFthen get back to work! :D 18:36
JayFlol18:36
cidIndeed18:37
opendevreviewcid proposed openstack/ironic master: [WIP] Add inspection rules  https://review.opendev.org/c/openstack/ironic/+/91830320:58
cidmasghar: ^^ (take another look when you can).21:08
* cid EOD21:08
TheJuliag'night21:12
TheJulia:)21:12
opendevreviewcid proposed openstack/ironic master: [WIP] Add inspection rules  https://review.opendev.org/c/openstack/ironic/+/91830321:23
opendevreviewJulia Kreger proposed openstack/ironic master: trivial: fix http result code on ImageInvalid  https://review.opendev.org/c/openstack/ironic/+/92957421:32
stevebaker[m]TheJulia: Hey I have one comment on the prelude21:43
TheJuliastevebaker[m]: oh?21:44
JayFTheJulia: that patch ^^ makes me wonder if there's ever a world where we would hook up image safety check to a validate call21:44
stevebaker[m]just clarity21:44
TheJuliaJayF: that would make me tableflip and drink tons21:44
TheJuliasince that is a fundamental change to the meaning of the call21:44
TheJuliagiven it returns a bunch of data today which is cursory21:44
JayFfair21:44
TheJuliaanyway21:44
opendevreviewJulia Kreger proposed openstack/ironic master: Add Prelude for end of cycle release  https://review.opendev.org/c/openstack/ironic/+/92956421:46
TheJuliastevebaker[m]: done21:46
stevebaker[m]TheJulia: cool, do you care that you lost the backtick formatting on that edit?21:48
TheJuliaanyway, ugh21:48
opendevreviewJulia Kreger proposed openstack/ironic master: Add Prelude for end of cycle release  https://review.opendev.org/c/openstack/ironic/+/92956421:49
stevebaker[m]sweet, approved :)21:51
JayFreadded my +2 as well, thanks Julia \o/21:55
opendevreviewMerged openstack/ironic master: Add Prelude for end of cycle release  https://review.opendev.org/c/openstack/ironic/+/92956422:14
