Monday, 2023-08-14

kubajjGood morning Ironic! o/07:32
masgharGood morning!08:38
mmalchukmorning Ironic o/09:04
mmalchukrpittau any updates on CI ?09:04
iurygregorygood morning ironic 11:21
dtantsurTheJulia: FRESH BRAINZZ! (mine is ready once you are, but I'll need to leave earlier today)12:21
TheJuliabraiinnz!13:02
TheJuliagood morning!13:10
TheJuliadtantsur: so do we see the same level in OpenStack CI at all? Or was that just a particularly bad example?13:16
TheJuliaI sort of have a theory as to it, but still trying to wrap my head around it13:17
dtantsurTheJulia: you mean, the metal3 failure? I haven't collected statistics yet. It's not permanent for sure.13:26
TheJuliayeah, so I noticed some socket errors and huge interaction latencies appear, that being said a good chunk of it seems to be caused by locking, all sort of originating after the ipmi power status check13:28
TheJuliaI guess, I am curious what Nordix's CI is backed by, but I'm just lacking context there13:33
dtantsurI think we use a mix of redfish and ipmi13:34
TheJuliayeah13:38
TheJuliainterestingly, found an instance in zuul's logs, but nowhere near as severe as the one you linked13:38
TheJuliamore "i'm slightly grumpy..." than anything else13:39
TheJuliaI feel like we might be leaving the db locked/orphaning something on getting a node, but I might just be hyper focusing... I feel like I need to go look at the sqlalchemy sqlite driver13:44
iurygregorydtantsur, we also found the issue downstream I think it was in one of the bugs that rpittau was working on13:52
dtantsuriurygregory: mm, which one?13:59
iurygregorypm =) 14:02
*** JasonF is now known as JayF14:07
dtantsurah, well.. this was related to outdated code downstream14:09
iurygregoryyeah, but since we are still seeing issues upstream I think it can probably happen downstream again ...14:10
dtantsurPossibly, I'm literally talking with someone with similar symptoms (although I'm not sure they have Riccardo's fix).14:11
* dtantsur is on the verge of desperation with sqlite, sqlalchemy and our usage of them..14:11
TheJuliadtantsur: up for a chat in say 10 minutes?14:20
dtantsurTheJulia: possibly in 10 more, finishing something here right now14:32
TheJuliano worries14:33
dtantsurTheJulia: free now14:40
TheJuliahttps://meet.google.com/mec-txxy-aqi14:40
dtantsurTheJulia: https://review.opendev.org/c/openstack/ironic/+/88783514:49
JayFprepare to nest your meetings ;)14:59
JayF#startmeeting ironic15:00
opendevmeetMeeting started Mon Aug 14 15:00:57 2023 UTC and is due to finish in 60 minutes.  The chair is JayF. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
opendevmeetThe meeting name has been set to 'ironic'15:00
iurygregoryo/15:01
masgharo/15:01
dtantsuro/15:01
TheJuliao/15:01
JayFGood morning, welcome to the Ironic meeting. A reminder we operate under the OpenInfra Foundation CoC https://openinfra.dev/legal/code-of-conduct15:01
kubajjo/15:01
JayF#topic Announcements/Reminder15:01
JayF Standing reminder to review patches tagged ironic-week-prio and to hashtag any patches ready for review with ironic-week-prio: https://tinyurl.com/ironic-weekly-prio-dash15:01
JayFthank you for cleaning up that dashboard, btw, it's nice to see it actually shrink as things land haha15:02
JayFNote that the next Bobcat milestone is in 10 days; the non-client library freeze15:02
JayF#note Reminder PTG will take place virtually 2023-10-23 through 2023-10-27. Please document any items for discussion here15:03
JayF#link15:03
JayF#undo15:03
opendevmeetRemoving item from minutes: #link 15:03
JayF#link https://etherpad.opendev.org/p/ironic-ptg-october-202315:03
JayFThat's a little over two months away but it will sneak up15:03
JayFThat's all I've got for announcements.15:04
JayFNo action items from last meeting, skipping that item.15:04
JayF#topic Review Ironic CI status 15:04
JayFrpittau: do you have an update on how bifrost CentOS job is doing?15:04
iurygregoryI think he is out today15:04
JayFI also think TheJulia and dtantsur are working through some metal3 locking issues; I'm unsure if that impacts our gate.15:04
JayFack ty iurygregory I'm sure he'll update the channel tomorrow15:04
dtantsurNot much so far, seems performance-dependent15:05
TheJuliavery performance dependent, unfortunately15:05
JayFOur CI is very good at finding those kinds of issues :-|15:06
TheJuliaI went through our logs for recent metal3-integration jobs, and found nothing as horrible as the nordix job dmitry linked to me last week15:06
JayFOh, so performance the *other way* lol15:06
TheJuliaunfortunately, it seems15:07
JayFThank you all for looking at that, if there's anything I can help review or fix let me know.15:07
JayFIs there anything else notable about CI before we move on?15:07
JayFMoving on. 15:08
JayF#topic Ongoing 2023.2 Workstreams15:08
JayF#link https://etherpad.opendev.org/p/IronicWorkstreams2023.215:08
JayFWe're reaching the last weeks of the cycle. If we want something in Bobcat we have to move soon :D 15:09
JayFI know service steps is close and needs review attention.15:09
dtantsurwe could use more eyes on masghar's changes, first and foremost https://review.opendev.org/c/openstack/ironic/+/88755415:09
masgharThank you ^^15:09
JayFgood stuff, ty masghar it's on my list15:10
masgharShould I tag it ironic-weekly-priority?15:10
JayF hashtag: ironic-week-prio15:10
JayFand it will show up in review dashboards15:10
masgharAlright, thanks15:10
JayFyou have to "Show all" for the hashtag field to show up15:10
masgharI was trying to find it thanks15:11
* dtantsur needs to leave now, will respond to any pings tomorrow15:11
JayFI'm going to move on15:11
JayF#topic OpenStack User Survey updates15:11
JayFThe OpenStack User Survey is asking for projects to review project-specific questions15:11
JayFwhich people are shown if they choose "yep, I use [project]"15:11
JayFAFAICT, right now Ironic has no questions.15:12
JayFI've started drafting some to add here:15:12
JayF#link https://etherpad.opendev.org/p/ironic-user-survey-questions-202315:12
JayFplease give feedback/brainstorm/etc in that Etherpad15:12
JayFin the next day or two I'll be consolidating this down and submitting it; we only have until Friday; so please prioritize this if you care to have input15:12
TheJuliaSuggested a question to help size the usage15:14
JayFPerfect, we'll have discussion there in the Etherpad15:15
JayFThere are no RFEs currently submitted for review so we're on to 15:15
JayF#topic Open Discussion15:15
JayF#link https://bugs.launchpad.net/ironic/+bug/203097615:16
JayFscottsol discovered a security bug in Ironic, which appears to impact most openstack projects15:16
JayFwhere sensitive data is being placed in notifications15:16
JayFthere is a draft PR up to fix it: 15:16
JayF#link https://review.opendev.org/c/openstack/oslo.messaging/+/89109615:16
JayFI say "draft" but only because it's not been reviewed or merged yet. It's confirmed to fix the issue and pass tests.15:17
JayFPlease take due notice of the bug, and keep an eye out for the OSS{A,N} coming down the pipe when completed15:17
JayFThat's all I have for open discussion; anyone wanna talk about this or anything else?15:17
TheJuliaAlso, as additional context, it is when notifications are logged to the message bus as opposed to an actual log file.15:17
JayFLast call before I close up the meeting.15:20
TheJuliaI got nothing15:20
JayF#endmeeting 15:20
opendevmeetMeeting ended Mon Aug 14 15:20:59 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:20
opendevmeetMinutes:        https://meetings.opendev.org/meetings/ironic/2023/ironic.2023-08-14-15.00.html15:20
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/ironic/2023/ironic.2023-08-14-15.00.txt15:20
opendevmeetLog:            https://meetings.opendev.org/meetings/ironic/2023/ironic.2023-08-14-15.00.log.html15:20
opendevreviewJakub Jelinek proposed openstack/ironic master: WIP: Introduce default kernel/ramdisks by arch  https://review.opendev.org/c/openstack/ironic/+/89081915:24
opendevreviewMerged openstack/ironic master: Support sha256/sha512 with the ilo firmware upgrade logic  https://review.opendev.org/c/openstack/ironic/+/88216416:36
JayFTheJulia: You had mentioned something about syncing up on service steps; I am out this afternoon with medical stuff so if you want to do it, it's gotta be nowish or over the next two hours16:41
TheJuliaI can in about 15 if that works?16:42
JayFsure16:43
opendevreviewJulia Kreger proposed openstack/ironic master: Slow down the sqlite retry  https://review.opendev.org/c/openstack/ironic/+/89133316:59
opendevreviewJulia Kreger proposed openstack/ironic master: Log upon completion of power sync  https://review.opendev.org/c/openstack/ironic/+/89133416:59
opendevreviewJulia Kreger proposed openstack/ironic master: Don't yield on power sync at the end of the work  https://review.opendev.org/c/openstack/ironic/+/89133516:59
TheJuliaJayF: https://meet.google.com/ewi-rybd-mub16:59
kubajjdtantsur, TheJulia or anybody who has any opinion about ironic config, I drafted a functional implementation of the deploy_kernel_by_arch here: https://review.opendev.org/c/openstack/ironic/+/89081918:31
JayFthanks kubajj; what great timing I was just asking Julia to look at that for ya literally 90 seconds ago :D 18:31
kubajjI will add similar behaviour for rescue_..._by_arch and add the original parameters back in a hierarchical manner as JayF suggested18:32
kubajjThanks JayF:)18:33
JayFyou will also, potentially, depending on timing of it landing, need to do the same for service (Ironic node service is what I just reviewed for Julia)18:33
JayFI'd say that's likely18:33
JayFbut in a perfect world, you just respect mode in that method and *_kernel/*_ramdisk will work18:34
TheJuliakubajj: feedback posted from my point of view on the configuration change18:35
kubajjTheJulia: thanks18:37
TheJuliaCI is not happy today :(18:44
iurygregoryprobably a newbie question, but the code of conduct for openstack-discuss is the one from openinfra right?21:11
JayFAbsolutely21:12
iurygregorytks JayF 21:12
opendevreviewJulia Kreger proposed openstack/ironic-python-agent master: Handle the node being locked  https://review.opendev.org/c/openstack/ironic-python-agent/+/89135721:32
TheJuliadtantsur: so it looks like a couple different things happened which cascaded. One of them being the agent is kind of aggressive about re-querying ironic when ironic says "node is locked right now". But "why" the node was locked looks to be rooted, in this case, back with introspection stuck in its internal method for ~1200 seconds in one case. Agent did actually log sort of the right thing, except what happened is it kept 21:32
TheJuliarequesting over and over, some of which started to stack up waiting for locks to interact on the file and db. 21:32
TheJuliadtantsur: so I think tomorrow, the key question is "why did that task hang out for so long!?" and then maybe just come to the conclusion if it was environmental... or not.21:33
JayFShould we make IPA retry less aggressively, too?21:34
TheJulialook at the change I just posted for ipa :)21:36
TheJuliait is more a cascading result as opposed to the root cause though21:36
TheJuliabut 287 attempts to hit the /v1/lookup endpoint21:36
TheJuliain the nordix logs dmitry linked from last week21:37
TheJulia*part* of that is that introspection, *did* fail it seems21:37
TheJuliajust don't see why, at least yet21:37
TheJuliafrom some of the logging I found, it basically retried every 15 seconds, which in that system it was taking 7-14 seconds just to do the needful in some cases when it *could* proceed21:40
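A minimal sketch of a less aggressive retry for the "node is locked" case described above, assuming exponential backoff with jitter. `NodeLocked`, `lookup_with_backoff`, and all default values except the 15-second starting interval are hypothetical; this is not the change posted for ironic-python-agent.

```python
import random
import time


class NodeLocked(Exception):
    """Hypothetical error raised when the API reports the node as locked."""


def lookup_with_backoff(lookup, max_attempts=8, base_delay=15.0,
                        max_delay=120.0, sleep=time.sleep,
                        rng=random.random):
    """Call lookup() until it succeeds, backing off while the node is locked.

    The delay doubles after each locked response and is capped at
    max_delay. Up to 25% random jitter is added so that many agents
    do not retry in lockstep.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return lookup()
        except NodeLocked:
            if attempt == max_attempts:
                # Out of attempts: surface the error to the caller.
                raise
            sleep(delay + delay * 0.25 * rng())
            delay = min(delay * 2, max_delay)
```

With a fixed 15-second interval, a request that itself takes 7-14 seconds leaves the server almost no idle time; growing the interval gives held locks a chance to clear before the next attempt arrives.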
TheJuliagoing to head out, dtantsur let's try and sync tomorrow when I get online, I've got a few other small patches up, but tl;dr ci is not happy today due to connectivity/unrelated issues22:11
opendevreviewMerged openstack/ironic master: Fix several issues in the lock/release database code  https://review.opendev.org/c/openstack/ironic/+/88783523:09
*** dmellado81918 is now known as dmellado819123:17
