Wednesday, 2025-07-16

opendevreviewClif Houck proposed openstack/ironic master: Add a new 'vendor' field to the Port object  https://review.opendev.org/c/openstack/ironic/+/95496600:26
opendevreviewMerged openstack/ironic-python-agent master: Fix missing [mdns] options  https://review.opendev.org/c/openstack/ironic-python-agent/+/95418301:56
rpittaugood morning ironic! o/06:10
queensly[m]Good morning08:06
opendevreviewMerged openstack/ironic master: Imported Translations from Zanata  https://review.opendev.org/c/openstack/ironic/+/95484408:35
tkajinamrpittau, so giving it another thought I noticed the real problem triggered by python 3.9 removal is that it breaks compatibility with c9s, not ubuntu jammy, because c9s uses python 3.9 as its default.08:57
rpittautkajinam: yeah, you're right, I was wondering about jammy too08:58
rpittauin any case, we're using python 3.12 on CS908:58
tkajinamI'm unsure why https://review.opendev.org/c/openstack/bifrost/+/955061 does not break c9s jobs, but I see these jobs attempt to install ironic services in c9s (with py3.9 used). If these attempt to install master, not stable releases, then removing py39 support may kill these08:58
tkajinamah, ok08:58
tkajinamhmm. I know we use py3.12 for devstack jobs but I wasn't sure if the same switch has been done for bifrost jobs08:59
tkajinamthat's what I was about to ask08:59
rpittautkajinam: nvm, I confused it with another thing; we actually tried to use Py 3.12 in bifrost but to no avail09:02
rpittauwe'll have to abandon CS9 and switch to CS10 when ready09:02
rpittauwe're pinning UC for Py3.9 compatibility for the time being09:03
rpittauI think I'm going to move the cs9 jobs to non-voting 09:23
tkajinamrpittau, yeah or use 2025.1 branch for c9s jobs10:36
rpittautkajinam: or even both, we're not supposed to support cs9/py3.9 during this cycle anyway 10:51
tkajinamyeah10:51
dtantsurTheJulia: if futurist relies on eventlet in code that is not explicitly called GreenSomething, it's a bug, and we can fix it11:12
dtantsur(I think I still have +2 on futurist heh)11:12
opendevreviewRiccardo Pittau proposed openstack/bifrost master: Updated pinned upper-constraints for Python 3.9  https://review.opendev.org/c/openstack/bifrost/+/95518111:17
rpittauwe'll have to start pinning jobs to cs10 compatible nodes, starting to see "Fatal glibc error: CPU does not support x86-64-v3"11:25
iurygregorythis is fine11:26
opendevreviewVerification of a change to openstack/ironic master failed: Advanced vmedia deployment test ops  https://review.opendev.org/c/openstack/ironic/+/89801012:10
TheJuliadtantsur: so, the eventlet use looks like it could be excised. The thread exhaustion I'm seeing really makes me wonder :\13:05
TheJuliarpittau: are we seeing that leak into cs9 ? or are we trying to run cs10 now without config to pin the jobs to providers?14:34
opendevreviewMerged openstack/ironic master: Advanced vmedia deployment test ops  https://review.opendev.org/c/openstack/ironic/+/89801014:34
rpittauTheJulia: I'm seeing that when trying to build a CS10 image on a noble node14:34
TheJuliaoh yeah, that makes sense14:34
TheJuliaand would be expected, we need to explicitly pin those nodes14:34
rpittauunfortunately it does14:34
rpittauyeah14:35
rpittaubtw I'm seeing this error when trying to run ironic with Python 3.12 "RLock(s) were not greened, to fix this error make sure you run eventlet.monkey_patch() before importing any other modules."14:36
rpittauI believe it's a bug in an older version of eventlet, but if anyone has any hints, that would be great :)14:36
TheJuliayeah, the packaged version is too old14:43
TheJuliawe'll need to get that to ?0.30?14:43
TheJuliaI think14:43
cardoeTheJulia: I had mentioned disabling a port due to a bad link or something to you the other day... it seems like maybe we should add that into this? https://review.opendev.org/c/openstack/ironic-specs/+/94086114:47
rpittauTheJulia: thanks, I think I will give it a try with the latest, we're currently running 0.33.114:55
TheJuliacardoe: could you elaborate on what you mean?14:59
TheJuliaI may finally be escaping the shop soon too :)14:59
* TheJulia watches magical wash machine do its thing14:59
TheJuliaokay, it looks like we are orphaning threads from the periodic worker launches, and because of that the futurist code path just keeps creating new workers. I guess it was written for eventlet-loaded systems where it might orphan them or consider them abandoned, but the overall model is different. The fix could be as simple as just calling stop on the current thread executor. I think what we need to do as a first step is actually save a name 15:25
TheJuliaon each thread worker15:25
dtantsurOrphaning threads, omg15:46
TheJuliain the eventlet world, it's sort of entirely free. It does look possible to nuke a thread, although futurist doesn't quite like that and new periodics seem to not do so well :)15:48
TheJuliabut that is also me going "is this idle, if so, get rid of it"15:48
TheJuliadtantsur: and that is combined with the default model in futurist being "let's add a new thread"15:49
opendevreviewClif Houck proposed openstack/ironic master: Add a new 'vendor' field to the Port object  https://review.opendev.org/c/openstack/ironic/+/95496615:58
* dtantsur dives into the futurist code16:04
* dtantsur stares at vim in disbelief16:07
dtantsurTheJulia: well, it absolutely does grow to max_workers on each submission because ThreadWorkers only exit on shutdown16:08
TheJuliaoh, they never actually exit16:09
TheJuliaunless you *ask* them to exit16:09
dtantsurthey do not indeed16:09
TheJuliaand there are totally valid reasons there16:09
dtantsurhttps://github.com/openstack/futurist/blob/master/futurist/_futures.py#L154-L15516:09
dtantsurself._workers never shrinks16:09
TheJuliaindeed16:10
TheJuliayup16:10
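The behaviour described above (a worker is added on each submission until max_workers is reached, and workers exit only on shutdown) can be sketched with a toy pool. This is a stdlib-only illustration, not futurist's code: the class GrowOnlyPool and all of its names are hypothetical.

```python
import queue
import threading


class GrowOnlyPool:
    """Toy pool (hypothetical): grows on every submit, never shrinks."""

    def __init__(self, max_workers):
        self.max_workers = max_workers
        self._queue = queue.Queue()
        self._workers = []

    def submit(self, fn, *args):
        self._queue.put((fn, args))
        # Spawn a new worker on each submission until the cap is reached,
        # even if the existing workers are sitting idle.
        if len(self._workers) < self.max_workers:
            worker = threading.Thread(target=self._run, daemon=True)
            self._workers.append(worker)
            worker.start()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:  # only an explicit shutdown ends a worker
                break
            fn, args = item
            try:
                fn(*args)
            finally:
                self._queue.task_done()

    def shutdown(self):
        # One sentinel per worker, then wait for all of them to exit.
        for _ in self._workers:
            self._queue.put(None)
        for worker in self._workers:
            worker.join()


pool = GrowOnlyPool(max_workers=4)
for _ in range(10):
    pool.submit(lambda: None)
pool._queue.join()  # wait until all ten tasks have been processed

worker_count = len(pool._workers)
alive_while_idle = sum(w.is_alive() for w in pool._workers)

pool.shutdown()
alive_after_shutdown = sum(w.is_alive() for w in pool._workers)
```

In this toy, all workers stay alive after the work is done and go away only once shutdown() is called, which is the pattern being discussed.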
dtantsurTheJulia: I highly suspect that the intention of MAX_IDLE_FOR here https://github.com/openstack/futurist/blob/master/futurist/_thread.py#L84 was to exit the thread once it's reached16:10
dtantsuri.e. line 86 should be dropped16:11
TheJuliaquite possibly16:12
dtantsurThat, of course, can cause races with the growth logic16:12
dtantsurA saner idea, probably, is to replace this logic https://github.com/openstack/futurist/blob/master/futurist/_futures.py#L154-L155 with something based on the queue size and allow shrinking16:14
opendevreviewNahian Pathan proposed openstack/sushy master: Support expanded Chassis and Storage for redfish  https://review.opendev.org/c/openstack/sushy/+/95521116:14
dtantsurI have a strong deja vu, I literally wrote https://github.com/cherrypy/cheroot/issues/190#issuecomment-2883903045 already16:15
opendevreviewNahian Pathan proposed openstack/sushy master: Support expanded Chassis and Storage for redfish  https://review.opendev.org/c/openstack/sushy/+/95521116:18
opendevreviewNahian Pathan proposed openstack/sushy master: Support expanded Chassis and Storage for redfish  https://review.opendev.org/c/openstack/sushy/+/95521116:22
dtantsurTheJulia: while oslo folks are hopefully thinking, let's take a step back. Okay, the workers count will necessarily reach max_workers and stop growing. That should not prevent you from submitting more work though.16:26
dtantsurI wonder if the slow-down is simply due to the crazy number of threads doing roughly nothing16:27
TheJuliaI think the other issue is we're trying to create more by default to meet the new request, and then we drop into the backup worker path. I guess a starting point is to try and record a name to at least help us rationalize what is going on16:34
TheJuliaand then sort of make sure work is cleaning itself up16:34
TheJulia*alternatively* we could just have a periodic which individually launches threads16:34
TheJuliawhich would address power sync specifically and then close that out16:35
TheJuliathat might be awful though16:35
dtantsurFrom the openstack-oslo discussion, maybe I just fix futurist..16:38
dtantsurYeah, the reserved pool is probably biting us here16:39
dtantsurHmm or not. The rejection logic uses the queue size, not the workers number16:39
opendevreviewNahian Pathan proposed openstack/sushy master: Support expanded Chassis and Storage for redfish  https://review.opendev.org/c/openstack/sushy/+/95521516:45
opendevreviewNahian Pathan proposed openstack/sushy master: Support expanded Chassis and Storage for redfish  https://review.opendev.org/c/openstack/sushy/+/95521116:53
TheJulianah, it's the rejection upon trying to schedule new work that is biting us17:05
TheJuliaso what happens17:05
TheJuliathe power sync periodic starts17:06
TheJuliait basically tries to launch 8 threads to work the queue17:06
TheJuliafuturist rejects it because it can't create 8 more17:06
TheJuliaso we get, for example, 217:06
TheJuliawhich then begins to create this overall state where okay, we were looping every 6-8 minutes reliably for power sync on say 5k nodes. Suddenly that is 13+ minutes17:07
TheJuliabecause we only got like half the workers we wanted when we reach the end of the normal worker pool17:07
TheJuliahopefully that makes sense17:07
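A back-of-the-envelope sketch of the slowdown described above, assuming the loop time scales inversely with the number of workers actually obtained. The per-node cost is a hypothetical value picked so that 8 workers give roughly 7 minutes for 5k nodes; it is not a measurement.

```python
def loop_minutes(nodes, seconds_per_node, workers):
    """Estimated duration of one full power sync pass, in minutes."""
    return nodes * seconds_per_node / workers / 60


NODES = 5000
SECONDS_PER_NODE = 0.672  # hypothetical per-node cost, not measured

full_pool = loop_minutes(NODES, SECONDS_PER_NODE, workers=8)
half_pool = loop_minutes(NODES, SECONDS_PER_NODE, workers=4)

print(f"8 workers: ~{full_pool:.0f} min, 4 workers: ~{half_pool:.0f} min")
```

Under that assumption, getting half of the requested workers doubles the loop time, which is in line with the jump from 6-8 minutes to 13+ minutes mentioned above.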
TheJuliaI'm going to go take some meds and hopefully hit the road shortly17:08
TheJulia(in any event, I suspect futurist could likely pass/set a name on a thread as an optional argument, I'll give it a spin locally, most likely tomorrow, just so we can make debugging easier)17:08
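For comparison, the standard library's concurrent.futures.ThreadPoolExecutor already accepts a thread_name_prefix argument, which shows how named workers help when debugging. This is a stdlib sketch only; whether and how futurist would expose something similar is the open question above, and the "power-sync" prefix is just an illustrative value.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def current_name():
    # Report the name of the worker thread running this task.
    return threading.current_thread().name


with ThreadPoolExecutor(max_workers=2, thread_name_prefix="power-sync") as pool:
    names = [pool.submit(current_name).result() for _ in range(3)]

print(names)
```

With a prefix set, the names reported by threading.enumerate() or a thread dump point back to the pool that owns each worker.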
dtantsurTheJulia:  safe travels! I don't think I understand why it rejects work when there is still capacity.. but it's a bit late, I need a fresh head for that17:16
opendevreviewNahian Pathan proposed openstack/sushy master: Support expanded Chassis and Storage for redfish  https://review.opendev.org/c/openstack/sushy/+/95521117:17
TheJuliaCool cool, sort of still trying to understand it myself17:27
cardoeTheJulia: sorry. so that's in relation to knowing a port is bad, for example18:42
cardoewe don't wanna schedule on a disconnected port18:42
cardoeSo a node has maintenance mode but a port does not.18:42
iurygregoryin case someone is interested in nic firmware updates https://review.opendev.org/c/openstack/ironic/+/953394 o/19:39
opendevreviewNahian Pathan proposed openstack/sushy master: Support expanded Chassis and Storage for redfish  https://review.opendev.org/c/openstack/sushy/+/95521119:46
opendevreviewNahian Pathan proposed openstack/sushy master: Support expanded Chassis and Storage for redfish  https://review.opendev.org/c/openstack/sushy/+/95521121:30
opendevreviewQueensly Kyerewaa Acheampongmaa proposed openstack/sushy-tools master: Validate JSON content type before parsing manager PATCH requests  https://review.opendev.org/c/openstack/sushy-tools/+/95494522:28
opendevreviewSteve Baker proposed openstack/networking-generic-switch master: Add security group support to netmiko_sonic  https://review.opendev.org/c/openstack/networking-generic-switch/+/95525223:01
