Tuesday, 2025-05-06

opendevreviewMichael Still proposed openstack/nova master: libvirt: Add extra spec for sound device.  https://review.opendev.org/c/openstack/nova/+/92612605:23
opendevreviewMichael Still proposed openstack/nova master: Protect older compute managers from sound model requests.  https://review.opendev.org/c/openstack/nova/+/94077005:23
opendevreviewMichael Still proposed openstack/nova master: libvirt: Add extra specs for USB redirection.  https://review.opendev.org/c/openstack/nova/+/92735405:23
zigohaleyb: Thanks.08:06
zigohaleyb: Can you give me your email address and the one of Billy (wolsen) please ?08:06
zigohaleyb: I would welcome you guys to join #debian-openstack, and #debian-openstack-commits, which is where I discuss OpenStack packaging.08:07
opendevreviewArnaud Morin proposed openstack/nova master: Fix limit when instances are stuck in build_requests  https://review.opendev.org/c/openstack/nova/+/94780409:45
opendevreviewElod Illes proposed openstack/osc-placement stable/2025.1: Add bindep.txt for ubunutu 24.04 support  https://review.opendev.org/c/openstack/osc-placement/+/94886712:06
gibisean-k-mooney[m] dansmith FYI: the first real complication with threading https://review.opendev.org/c/openstack/oslo.service/+/945720/comments/fb5d6632_f0eaf10212:18
elodilleshi stable maintainers, may i get a +2+W for this simple clean cherry pick that fixes osc-placement's stable/2025.1 gate? o:)  https://review.opendev.org/c/openstack/osc-placement/+/94886712:20
sean-k-mooneygibi: that's not really related to threading12:21
sean-k-mooneythat's related to multiprocessing and how cotyledon is handling workers12:22
gibisean-k-mooney: the python doc says do not mix threading with os.fork12:22
gibiwe do12:22
gibiand I see the bad results of it12:22
sean-k-mooneywell we were not but we are now12:23
gibiif I force os.spawn as suggested then I see the pickling error due to a lambda in oslo.config12:23
sean-k-mooneyso even without the eventlet removal work12:23
sean-k-mooneyif we enable the cotyledon backend it would fail in the same way12:24
sean-k-mooneyat least in the places we are spawning threads today, if they interacted with a thread pool12:25
sean-k-mooneyim not sure if we really should be using worker processes instead of worker threads12:26
gibiwe have a limited number of thread pools and those are started in the worker process, which I guess makes it no problem. Now that our default ThreadPoolExecutor is also threaded, and that is started in the master process, os.fork becomes a problem12:27
gibiI will see if the only pickling error is the oslo.config lambda, and if I can patch that out, but if not then we have a nice complication on our hands.12:28
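The pickling error gibi mentions can be reproduced in isolation. This is an illustrative sketch (the lambda here stands in for the one in oslo.config, it is not oslo.config's actual code): pickle serializes functions by reference (module plus qualified name), and a lambda cannot be looked up that way again in a newly spawned process.

```python
import pickle

# A lambda assigned to a name, similar in spirit to the one in
# oslo.config that breaks the "spawn" start method: pickle cannot
# serialize it, so spawning a worker that needs this state fails.
callback = lambda conf: conf

try:
    pickle.dumps(callback)
    result = "pickled"
except (pickle.PicklingError, AttributeError, TypeError):
    result = "unpicklable"

print(result)  # prints unpicklable
```

This is why os.fork "works" (the child inherits the object without serialization) while the spawn start method trips over it.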
sean-k-mooneywell my point is i think this is perhaps a bug on oslo.service12:28
sean-k-mooneysince the new backend does not behave the same as the old processlauncher12:29
gibiit is not a bug it is a behavior of os.fork. The child inherits the parent's state12:29
sean-k-mooneyalthough maybe it is the same https://github.com/openstack/oslo.service/blob/master/oslo_service/backend/eventlet/service.py#L55612:29
sean-k-mooneygibi: sure but the fork is meant to happen before we ever create any of the executors12:30
gibinope12:30
gibilet me link it to you...12:30
sean-k-mooneythe thread pools are not initialised until their first use right?12:31
gibiright12:31
gibithis is the point of the fork https://github.com/openstack/nova/blob/a5bcaf69b1a80d4d02fe092900471a6e7a28e292/nova/cmd/scheduler.py#L5112:31
sean-k-mooneywhich happens after the fork if there is one12:31
gibibut Service.create() is before that 12:31
gibiand that already depends on our executors and therefore initialize them12:31
sean-k-mooneywhich threadpool is causing the error12:31
sean-k-mooneyis it the oslo one or one of the ones we are creating12:32
gibiin the scheduler startup both the default and the scatter gather pool is initialized before the fork by Service().create12:32
sean-k-mooneyok i would not expect either to be used at this point12:33
sean-k-mooneythe service create is going to hit cell012:33
sean-k-mooneyfor the scheduler at least12:33
gibiService.create calls https://github.com/openstack/nova/blob/a5bcaf69b1a80d4d02fe092900471a6e7a28e292/nova/service.py#L258 which calls scatter-gather12:33
sean-k-mooneyah i see12:34
gibithe default pool is a bit more complicated but also used for the Scheduler's async init 12:34
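The scatter-gather pattern being discussed can be sketched roughly as follows. This is a simplified illustration, not nova's actual implementation (the helper name and pool size are made up): a query is fanned out to every cell on a shared module-level pool, which is exactly the kind of state that a later os.fork() would inherit without its worker threads.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# A module-level pool, lazily shared by all scatter-gather calls.
# If a call like Service.create()'s version check runs before the
# worker fork, this pool is created in the master process.
_pool = ThreadPoolExecutor(max_workers=4)

def scatter_gather(cells, fn, timeout=10):
    # Fan the query out to each cell and collect per-cell results.
    futures = {_pool.submit(fn, cell): cell for cell in cells}
    results = {}
    for fut in as_completed(futures, timeout=timeout):
        results[futures[fut]] = fut.result()
    return results

print(scatter_gather(["cell0", "cell1"], str.upper))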
sean-k-mooneyok so we have a few options to address that i guess.12:35
sean-k-mooneywe could move all this init logic later, into the workers12:35
sean-k-mooneyyou mentioned something about not using os.fork and using spawn?12:35
gibisean-k-mooney: it depends on if we can kill the master from the worker in a consistent way. the raise_if_old_compute should stop the master process12:36
gibisean-k-mooney: we can force os.spawn12:36
gibibut that hits a pickling error when it spawns the worker12:36
sean-k-mooneydo we have the equivalent of a thread join for processes12:36
gibithe interface exists I'm not sure about the semantic12:37
sean-k-mooneyi.e. can we have the master wait for all the child process to exit12:37
gibialso current default behavior is that if the worker dies the master creates a new worker12:37
sean-k-mooneyi would not expect that but ok12:37
sean-k-mooneyi assume that's handled in oslo.service?12:38
gibieither oslo.service or cotyledon12:38
gibione of those12:38
gibiI saw the logic once so I can find it again if needed12:38
elodilles(thanks gibi for the +2+W o/)12:38
gibibut bottom line, in some cases we want to respawn the worker, but in other cases we want to kill the master12:39
gibiso we need an active information flow from worker to master12:39
sean-k-mooneyi do not really expect either to take actions like that. if things die i expect that to propagate up, and systemd or whatever process is managing nova would handle that12:39
gibito influence the logic, if we move the version check to the worker12:39
gibisystemd handles the master process12:39
opendevreviewArnaud Morin proposed openstack/nova master: Fix limit when instances are stuck in build_requests  https://review.opendev.org/c/openstack/nova/+/94780412:39
gibiso if the master dies then systemd restarts it12:40
gibibut if a single worker dies, killing the rest of the workers and then letting the master die so systemd restarts it seems dangerous without a graceful shutdown12:40
sean-k-mooneyright and i think the master process should just spawn the child processes and wait for them to exit. any recreation of the child processes i would expect to be handled in nova12:40
sean-k-mooneyits tricky however.12:41
sean-k-mooneyis this one of the known open issues for the eventlet removal12:41
gibiI can a) try to patch oslo.config to support os.spawn or b) try to re-init the executors in the worker if I can detect that I'm a worker or c) move the version check to the worker and try to find a way to signal the master not to respawn but to exit12:43
gibiI linked the issue to the #openstack-eventlet-removal as well so maybe Herve will have some ideas too12:44
sean-k-mooneylooking at the eventlet oslo service code12:44
sean-k-mooneythe respawn logic was hardcoded12:45
sean-k-mooneyso while i dont really expect that to happen it seems like it's how it has always worked12:45
stephenfindansmith: Could you clarify what we're trying to say about nova-metadata-wsgi under local/global deployments here? https://docs.openstack.org/nova/latest/admin/cells.html#nova-metadata-api-service12:45
sean-k-mooneygibi: we do have another option12:45
sean-k-mooneywhich is to not use oslo to create the workers but to spawn them on a process pool ourselves12:46
stephenfinAre we simply saying that `[api] local_metadata_per_cell` should be false for global and true for local? Because with the removal of the eventlet server, there's no way to run the metadata API in the same service as the compute (REST) API now so "standalone" doesn't make much sense12:47
sean-k-mooneygibi: that would allow us to get back futures from the process pools and we can then catch the exception and decide what to do based on that12:47
stephenfin(because it's always "standalone")12:47
gibiyeah that is option d). But I guess other projects will have similar problems so a nova-local solution is not a nice one12:47
sean-k-mooneygibi: well option e) would be to do d) in oslo12:48
gibisean-k-mooney: :)12:48
gibistephenfin: you are probably correct12:48
sean-k-mooneygibi: neutron may have some ideas, or we could look at one of the services that has been using cotyledon or whatever it is for years12:48
gibiyeah12:49
sean-k-mooneystephenfin: local metadata per cell should be set to true if you have configured the metadata agent deployed by neutron to use the local nova metadata api endpoint12:49
sean-k-mooneystephenfin: but i think what the doc is saying is12:50
stephenfin> The nova metadata API service must not be run as a standalone service, using the nova-metadata-wsgi service, in this case.12:50
sean-k-mooneyif you do that you need to deploy the dedicated nova-metadata-api wsgi app12:51
sean-k-mooneyim not sure why the combined one could not work however12:51
stephenfinThere's no combined one12:51
stephenfinNot in WSGI. Only eventlet (now removed)12:51
sean-k-mooneyoh then that would be why i guess12:51
sean-k-mooneyack12:51
gibiyepp the combined one was eventlet only12:52
gibiin the nova-operator we always start the metadata wsgi as separate pod either on top or on cell level12:52
sean-k-mooneyok i guess we were just trying to say if you want to run per cell metadata you need to deploy additional metadata apis (at least 1 per cell)12:53
stephenfingibi: tbc, when you say on top or cell-level you're referring to the global and local deployment topologies described in that doc, yeah?12:53
gibiyeapp12:53
sean-k-mooneyyep12:53
stephenfinokay, sweet12:53
gibistephenfin: do you see a need for clarifying things in our doc?12:53
stephenfinI think so. The sentence I quoted above is confusing IMO12:54
stephenfin> The nova metadata API service must not be run as a standalone service, using the nova-metadata-wsgi service, in this case.12:54
stephenfinTo me, that says "you cannot use nova-metadata-wsgi" service if deploying in a global configuration12:54
sean-k-mooneythat was not true even before we removed the combined one12:55
sean-k-mooneybut now it can be simplified12:55
stephenfinand since we no longer provide nova-metadata (i.e. the eventlet one), that would suggest global deployments weren't possible anymore12:55
sean-k-mooneywe only have the wsgi service on master12:55
sean-k-mooneyand only the split endpoints12:55
stephenfinyeah, hence my concern12:55
stephenfinbut it sounds like it's just worded weirdly12:55
opendevreviewMerged openstack/osc-placement stable/2025.1: Add bindep.txt for ubunutu 24.04 support  https://review.opendev.org/c/openstack/osc-placement/+/94886712:55
sean-k-mooneywe used to have separate console scripts for nova-api and nova-api-metadata12:55
stephenfinand I can rephrase like so12:55
sean-k-mooneyi think gibi dropped the references to the console scripts when they were removed12:56
stephenfin> The ``api.local_metadata_per_cell`` option must be set to ``False``12:56
sean-k-mooneybut kept the verbiage related to the pbr wsgi scripts12:56
stephenfinOr drop the sentence entirely12:56
gibiyeah I tried to clean up the doc when the eventlet server was removed but probably missed things12:56
sean-k-mooneyit needs to be set based on the topology12:57
gibiyeah, if the metadata is deployed globally (top level) then local_metadata_per_cell needs to be false, but when metadata is deployed to each cell then it needs to be true12:57
sean-k-mooneyit's technically optional to set this to true for the per cell deployment by the way. it defaults to false12:58
sean-k-mooneymeaning if you deploy per cell metadata they can look up metadata for other cells if they have access to the api db to do so12:58
gibisean-k-mooney: it depends, if the local metadata has cell db access then sure it can be set to false12:58
gibiyeah that12:58
gibisean-k-mooney: cotyledon has the worker respawn implemented here https://github.com/sileht/cotyledon/blob/be444189de32a8c29c7107a9b02da44248a7e64a/cotyledon/_service_manager.py#L254-L25612:59
sean-k-mooneyhttps://docs.openstack.org/nova/latest/configuration/config.html#api.local_metadata_per_cell is basically a configuration to prevent that cross cell lookup12:59
gibiyepp13:00
gibioption b) aka re-initing the executor after fork works, based on the process name stored on the executor compared to the current process name. But this is ugly and a nova-only solution. Fortunately we will move all our spawns to executors so at least it's easily applicable for now13:14
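Option b) can be sketched in a few lines. This is an illustrative stand-in, not nova's actual patch (nova compares the stored process name; for simplicity the sketch uses the pid, and the class name is made up):

```python
import os
from concurrent.futures import ThreadPoolExecutor

class ForkAwarePool:
    """Rebuild the pool when we detect we are in a forked child.

    A forked child inherits the executor object but none of its
    worker threads, so submitted work would never run; recreating
    the executor on first use in the child avoids that.
    """

    def __init__(self, max_workers=4):
        self._max_workers = max_workers
        self._pid = os.getpid()
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def submit(self, fn, *args, **kwargs):
        if os.getpid() != self._pid:
            # We crossed a fork boundary: start a fresh pool.
            self._pid = os.getpid()
            self._pool = ThreadPoolExecutor(max_workers=self._max_workers)
        return self._pool.submit(fn, *args, **kwargs)

pool = ForkAwarePool()
print(pool.submit(lambda: 40 + 2).result())  # prints 42
```

The ugliness gibi mentions is visible here: every executor access has to pay for the pid check, and every wrapper around a pool needs the same dance.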
opendevreviewMerged openstack/osc-placement stable/2025.1: Update .gitreview for stable/2025.1  https://review.opendev.org/c/openstack/osc-placement/+/94375713:38
dansmithgibi: that move to spawn soon is surprising to me, as I suspect it will be hella slow, especially for openstack code14:02
dansmithgibi: I'm not sure where we care about process workers, other than the api services in standalone mode.. in wsgi mode we shouldn't be forking at all, right?14:02
gibiwe don't spawn much, just at the start to get the worker processes (or when they die)14:02
gibidansmith: this is scheduler. The default behavior of oslo.service is to fork the workers14:03
dansmithoh I guess conductor needs process workers too14:03
gibiconductor too14:03
dansmithdo we need to do that for scheduler though? we did under eventlet for parallelism, but I'm not sure either does going forward14:04
gibithreading in python 3.12 is still limited by the GIL, so having a way to spawn processes for scaling makes sense, as that way we can saturate more CPU cores if needed14:05
dansmithsure, but there are lots of multithreaded python programs providing reasonable performance with the GIL :)14:06
gibiI'm not 100% sure but I feel that oslo.service with worker=1 still forks a worker proc from the master proc14:06
dansmithI suspect neither are cpu-intensive enough to really need full parallelism14:06
dansmithokay14:06
gibibut I think I understand you, we might not need to fork, just use a single proc with big enough thread pools. I'm not sure this is something that's supported by oslo.service out of the box.14:09
gibicould be an improvement request14:09
gibito provide alternative way to avoid the fork14:09
dansmithyeah, just might be worth consideration14:10
dansmithconductor especially I suspect is mostly waiting on mysql and rabbit connections anyway.. scheduler *might* do enough work traversing lots of host objects or something, but I suspect not anymore with placement14:11
gibiI think we had issues with the NUMA filter where I needed to add caching to help with the execution time14:14
gibiso meh14:14
opendevreviewMerged openstack/osc-placement stable/2025.1: Update TOX_CONSTRAINTS_FILE for stable/2025.1  https://review.opendev.org/c/openstack/osc-placement/+/94375814:17
opendevreviewBalazs Gibizer proposed openstack/nova master: Translate scatter-gather to futurist  https://review.opendev.org/c/openstack/nova/+/94796614:17
opendevreviewBalazs Gibizer proposed openstack/nova master: Use futurist for _get_default_green_pool()  https://review.opendev.org/c/openstack/nova/+/94807214:18
opendevreviewBalazs Gibizer proposed openstack/nova master: Replace utils.spawn_n with spawn  https://review.opendev.org/c/openstack/nova/+/94807614:18
opendevreviewBalazs Gibizer proposed openstack/nova master: Add spawn_on  https://review.opendev.org/c/openstack/nova/+/94807914:18
opendevreviewBalazs Gibizer proposed openstack/nova master: Move ComputeManager to use spawn_on  https://review.opendev.org/c/openstack/nova/+/94818614:18
opendevreviewBalazs Gibizer proposed openstack/nova master: Move ConductorManager to use spawn_on  https://review.opendev.org/c/openstack/nova/+/94818714:18
opendevreviewBalazs Gibizer proposed openstack/nova master: Make nova.utils.pass_context private  https://review.opendev.org/c/openstack/nova/+/94818814:18
opendevreviewBalazs Gibizer proposed openstack/nova master: Rename DEFAULT_GREEN_POOL to DEFAULT_EXECUTOR  https://review.opendev.org/c/openstack/nova/+/94808614:18
opendevreviewBalazs Gibizer proposed openstack/nova master: Make the default executor configurable  https://review.opendev.org/c/openstack/nova/+/94808714:18
opendevreviewBalazs Gibizer proposed openstack/nova master: Print ThreadPool statistics  https://review.opendev.org/c/openstack/nova/+/94834014:18
opendevreviewBalazs Gibizer proposed openstack/nova master: WIP: allow service to start with threading  https://review.opendev.org/c/openstack/nova/+/94831114:18
opendevreviewBalazs Gibizer proposed openstack/nova master: DNM:Run nova-next with n-sch in threading mode  https://review.opendev.org/c/openstack/nova/+/94845014:18
opendevreviewStephen Finucane proposed openstack/nova master: setup: Remove pbr's wsgi_scripts  https://review.opendev.org/c/openstack/nova/+/90268814:18
stephenfinsean-k-mooney: gibi: Context for my question earlier ^14:18
gibidansmith: sean-k-mooney: so I think I fixed the fork issue by re-initing the executor in the worker processes https://review.opendev.org/c/openstack/nova/+/947966/8/nova/utils.py#1276 not nice but seems to work. (I noticed the issue when testing slow / never finishing scatter-gather scenarios locally)14:19
dansmithyeah that's not great14:21
gibistephenfin: thanks, I will take a look14:21
dansmithgibi: is there any way I could get you to stop putting many lines of comments in the middle of a condition like that? seems like a pattern that has been emerging and it completely breaks my brain when trying to reason about the logic like that :(14:21
gibifeel free to leave a comment and I will move the comment 14:22
gibiI tried to be as close to the condition as possible but I can move it a bit further14:22
dansmithgibi: I may have missed context from the above conversation about where this gets initialized so early, but is moving that an option?14:28
gibiit is the service version check that happens before the fork14:28
gibiand that uses scatter-gather14:28
gibi(and also any init that happens at Service.create() today)14:29
gibiwe can move it but we have no way to signal to the master process that a worker wants to kill the master if the version check fails14:29
gibiI need to drop for a bit, I will be back for the nova meeting 14:29
dansmithah, I see14:29
dansmithseems like we could maybe just disable the pooling there, or just destroy the threadpool at the end of service.create before we return to allow the fork to be clean?14:30
gibidisable pooling needs infra as the version check uses the generic scatter-gather. I would not duplicate that14:31
gibidestroying the pool before the fork could work14:31
gibiprobably a bit cleaner than the re-init14:31
gibibut14:31
gibiI need to test it14:32
dansmithfor the disable pooling suggestion, I meant something like "use a temporary pool instead of the global one" - I know we need *a* pool there14:32
dansmithyou could also have a "sequential thread pool" that just runs each cell query synchronously in order instead of the threads maybe too14:33
dansmithbut yeah, just destroying after we're done with that check would be simple and clean, I suspect14:33
dansmithI will miss the nova meeting today, btw. see you later14:34
sean-k-mooneygibi: if nothing else it confirms that the issue is with how it's forking and not reinitialising it 14:36
sean-k-mooneygibi: dansmith i was in a meeting, but would marking the pool as thread local help14:36
sean-k-mooneythey would not be shared with the child process, correct. the only problem with that is if work on a thread pool wanted to add something to the same thread pool14:37
sean-k-mooneymy thinking is we should not have that pattern in general14:37
sean-k-mooneyso if we mark the module level thread pool as thread local then on fork it will be initialised to empty14:38
sean-k-mooneyand the first attempt to use it will init it in the child process without any sharing of state14:38
sean-k-mooneyhaving a way to disable scatter gather for that first call on startup would also be an option 14:39
sean-k-mooneydansmith: by "sequential thread pool" you mean a sequential executor right, like the one we used in tests14:40
dansmithyes14:41
sean-k-mooneyok futurist already provides that https://github.com/openstack/futurist/blob/master/futurist/_futures.py#L22714:41
sean-k-mooneyso that would be an option for the early init14:42
* dansmith nods14:42
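The sequential executor idea can be approximated in a few lines. This is a hedged sketch of the concept, not futurist's actual implementation (the class name here is made up): each callable runs inline in the caller's thread, and the result comes back wrapped in a standard Future.

```python
from concurrent.futures import Future

class SequentialExecutor:
    """Run each submitted callable inline in the caller's thread.

    Because no worker threads are ever created, using something
    like this for the early pre-fork scatter-gather would leave
    nothing thread-related for a forked child to inherit.
    """

    def submit(self, fn, *args, **kwargs):
        fut = Future()
        try:
            fut.set_result(fn(*args, **kwargs))
        except BaseException as exc:
            fut.set_exception(exc)
        return fut

print(SequentialExecutor().submit(lambda: 2 + 2).result())  # prints 4
```

The caller still gets the normal futures interface, so the scatter-gather code does not need to special-case the sequential path.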
sean-k-mooneygibi: i commented with links to the thread local approach but i have not tested that change, it might be worth trying with a DNM patch or locally14:50
sean-k-mooneyim just not sure if it's the correct pattern to apply to this type of problem14:50
sean-k-mooneyi.e. can we convert all module global state to thread local storage if we have this problem or would that only work in this case14:51
sean-k-mooneywe dont really want to have to have thread local db engine facades for example14:51
opendevreviewStephen Finucane proposed openstack/placement master: setup: Remove pbr's wsgi_scripts  https://review.opendev.org/c/openstack/placement/+/91958214:54
stephenfinsean-k-mooney: Think you can remove your -1 on that now? ^14:54
sean-k-mooneyprobably, we had that support in epoxy right14:55
sean-k-mooneyso we should be able to remove it now if we wanted to14:55
stephenfinYep14:58
sean-k-mooneystephenfin: im still hesitant to remove this entirely for the simple reason that apache mod_wsgi does not support using the module approach supported by uwsgi/gunicorn14:58
sean-k-mooneywith that said14:58
sean-k-mooneycan the wsgi module be used directly14:58
sean-k-mooneyi.e. can you just point mod_wsgi at https://github.com/openstack/placement/blob/master/placement/wsgi/api.py14:59
sean-k-mooneythat's effectively what was generated by pbr right14:59
sean-k-mooneyso for mod_wsgi you would just point to that in site packages ?15:00
sean-k-mooneyok so not quite https://termbin.com/5n9815:02
sean-k-mooneybut for the mod_wsgi usecase the answer is yes15:02
sean-k-mooneycare to call that out in the release note?15:02
sean-k-mooneyhum 15:05
sean-k-mooneypython3 -c "import placement.wsgi.api; print(placement.wsgi.api.__file__);"15:05
sean-k-mooneyso because we dont guard that with if __main__15:05
sean-k-mooneythat import actually runs some of the code15:06
sean-k-mooneystephenfin: anyway im not sure if we need to explain how to locate /opt/stack/placement/placement/wsgi/api.py on the system15:07
sean-k-mooneybut instead of /opt/stack/data/venv/bin/placement-api they just need to use ^15:07
sean-k-mooneyif this was not devstack that would be in /usr/lib/python3/dist-packages/ or /usr/lib/python3/site-packages/15:09
UgglaNova meeting in ~50mn15:10
UgglaNova meeting in ~10mn, time for you to grab a cup of coffee. 15:49
gibiIf I drink a coffee now then I won't sleep until 2 in the morning15:53
gibisean-k-mooney: the thread local has a problem that if we have a periodic that wants to run a scatter-gather then the thread holding the periodic will see a different scatter-gather pool than the main one15:54
gibiso I will look into the destroy pool before fork idea15:55
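gibi's objection to thread-local pools can be demonstrated in isolation. A minimal sketch (illustrative only, not nova code): an attribute set on a `threading.local` in the main thread is simply invisible to any other thread, so a periodic task on its own thread would lazily create, and use, a different pool.

```python
import threading

# A thread-local "pool" set on the main thread is invisible to
# other threads: a periodic task running on its own thread would
# not see it and would end up with a separate pool of its own.
local = threading.local()
local.pool = "main-pool"

seen = []

def periodic_task():
    # getattr falls back to None because this thread never set .pool
    seen.append(getattr(local, "pool", None))

t = threading.Thread(target=periodic_task)
t.start()
t.join()

print(seen)  # prints [None]
```

So while thread-locals would neatly sidestep the fork inheritance, they fragment the pool across every thread that touches it.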
sean-k-mooneyi think there is a generic hook we can implement for that15:57
sean-k-mooneywe have some module level reset functionality that we use for mutable config15:58
sean-k-mooneyim thinking of things like atexit()15:58
sean-k-mooneythere might be a pre/post fork hook we could register to do that15:58
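The stdlib does provide exactly such a hook: `os.register_at_fork()` (Python 3.7+, POSIX only) accepts `before`, `after_in_parent`, and `after_in_child` callbacks. A hedged sketch of using it to reset a lazily-created module-level pool (the helper names are made up):

```python
import os
from concurrent.futures import ThreadPoolExecutor

_pool = None

def get_pool():
    # Lazily create the shared pool on first use.
    global _pool
    if _pool is None:
        _pool = ThreadPoolExecutor(max_workers=4)
    return _pool

def _reset_pool():
    # In a forked child the inherited executor has no live worker
    # threads, so drop it and let the next get_pool() rebuild it.
    global _pool
    _pool = None

# POSIX-only hook: run the reset automatically in every forked child.
if hasattr(os, "register_at_fork"):
    os.register_at_fork(after_in_child=_reset_pool)

print(get_pool().submit(lambda: "ok").result())  # prints ok
```

This keeps the reset out of the hot path (unlike checking the pid on every submit) at the cost of relying on fork actually being used, which is the behavior under discussion.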
gibihttps://github.com/sileht/cotyledon/blob/be444189de32a8c29c7107a9b02da44248a7e64a/cotyledon/_service_manager.py#L156 cotyledon has hooks16:00
Uggla#startmeeting nova16:02
opendevmeetMeeting started Tue May  6 16:02:07 2025 UTC and is due to finish in 60 minutes.  The chair is Uggla. Information about MeetBot at http://wiki.debian.org/MeetBot.16:02
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:02
opendevmeetThe meeting name has been set to 'nova'16:02
bauzas\o16:02
UgglaHello everyone16:02
Ugglaawaiting a moment for people to join.16:02
elodilleso/16:02
fwieselo/16:02
gmaano/16:03
Ugglathanks bauzas for last week's meeting.16:04
bauzasnp16:04
gibio/16:04
Uggla#topic Bugs (stuck/critical) 16:04
Uggla#info No Critical bug16:04
Uggla#topic Gate status 16:05
Uggla#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:05
Uggla#link https://etherpad.opendev.org/p/nova-ci-failures-minimal16:05
Uggla#link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status16:05
Uggla#info Please look at the gate failures and file a bug report with the gate-failure tag.16:05
Uggla#info Please try to provide meaningful comment when you recheck16:05
UgglaIf I understood correctly, the gate was blocked by this https://review.opendev.org/c/openstack/nova/+/948392 last week.16:06
gibijepp16:06
gibifix is landed16:06
UgglaIt is landed now.16:06
UgglaThe gate looks good. Please tell me if I'm wrong.16:06
sean-k-mooneythe cyborg and barbican jobs are still broken i think16:06
gibiI have been hit by https://bugs.launchpad.net/glance/+bug/2109428 multiple times recently. Not a blocker but definitely a source of rechecks16:06
gibisean-k-mooney: yepp those non-votings are broken16:07
gibibugs are filed16:07
sean-k-mooneyi may try and find time to go fix them16:07
sean-k-mooneythey are both broken on the lack of a pyproject.toml16:07
gibiyepp16:07
Ugglagood to know thx gibi 16:07
gibihttps://bugs.launchpad.net/barbican/+bug/210958416:07
gibihttps://bugs.launchpad.net/openstack-cyborg/+bug/210958316:07
sean-k-mooneyif i can fix it without having to install the service locally i might give it a try16:07
sean-k-mooneyi can reference those bugs16:08
gibithanks16:08
gmaanI think we need to make pyproject.toml for all projects in openstack otherwise they will break slowly at some point16:08
gibigmaan: I agree16:08
Ugglashould we track this ?16:09
sean-k-mooneygmaan: yes we will16:09
sean-k-mooneyUggla: nova is mostly done, stephen and i tried to do this 2 years ago16:09
sean-k-mooneyto get ahead of things breaking16:10
sean-k-mooneyso nova and placement are done16:10
sean-k-mooneyi need to check os-*16:10
sean-k-mooneybut for the libs we are responsible for it's trivial16:10
sean-k-mooneyok os-vif is still pending16:11
Ugglaany bug / blueprint to refer to this work ?16:11
sean-k-mooneyill start working on them and ping folks to review16:11
gmaan++16:11
Ugglasean-k-mooney++16:11
Ugglaanything else ?16:12
Ugglamoving on to next item.16:12
Uggla#topic tempest-with-latest-microversion job status 16:12
Uggla#link https://zuul.opendev.org/t/openstack/builds?job_name=tempest-with-latest-microversion&skip=016:12
UgglaI have just discussed with gmann about it.16:12
Ugglagmann is progressing on this periodic job.16:13
Ugglagmaan, I let you give a quick status if you wish.16:13
gmaansure16:14
gmaanit is not worse than what i suspected, only 23 tests failing16:14
gmaanI started fixing those one by one16:14
gmaan#link https://review.opendev.org/q/topic:%22latest-microversion-testing%2216:14
gmaanhypervisor are fixed, I will fix a few more today and this week16:14
sean-k-mooneydo you know if some of them are invalid failures16:15
gmaanthat is ^^ topic I am adding all changes to, feel free to review/comment16:15
sean-k-mooneyi.e. the test is depending on an older microversion behavior 16:15
sean-k-mooneyand should be skipped when using latest16:15
gmaansean-k-mooney: not invalid, either we need to fix schema or cap the test with min/max microversions16:15
sean-k-mooneyok ya that's actually what i was wondering16:15
gmaansean-k-mooney: yeah, for example hypervisor uptime test should run till 2.8716:15
sean-k-mooneycan we express in tempest16:15
sean-k-mooneythat this test has a max rather than just a min version requirement16:16
gmaanyes with 'max_microversion'16:16
sean-k-mooneyack16:16
sean-k-mooneyah i see, that's how you're fixing it in https://review.opendev.org/c/openstack/tempest/+/94849016:16
sean-k-mooneycool16:16
gmaanyeah ^^16:16
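The min/max cap gmaan describes boils down to a range check on the requested microversion. A standalone sketch of the idea (illustrative helper names, not tempest's actual code; tempest expresses this via `min_microversion`/`max_microversion` test attributes):

```python
# Sketch of capping a test by microversion: a test like the
# hypervisor uptime one runs only while the requested microversion
# is within [min, max] (2.87 being the cap in gmaan's example).
def _key(version):
    major, minor = version.split(".")
    return (int(major), int(minor))

def should_run(requested, min_microversion="2.1", max_microversion="2.87"):
    return _key(min_microversion) <= _key(requested) <= _key(max_microversion)

print(should_run("2.87"), should_run("2.88"))  # prints True False
```

Note the numeric compare on each component: a plain string comparison would order "2.9" after "2.87" incorrectly.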
gmaansometimes we need to refactor a test but it's not a big deal.16:17
gmaanthat is all on these, feel free to review. as I am the only active core in tempest I will merge them but keep them open for some time if anyone would like to review16:17
gmaanmost probably, i will keep all fixes open till the job is green 16:18
Ugglathx gmaan 16:18
Uggla#topic Release Planning 16:19
Uggla#link https://releases.openstack.org/flamingo/schedule.html16:19
UgglaThe patch about nova deadlines has been merged so I think it is ok.16:19
UgglaPlease let me know if it is not the case or if something is wrong.16:20
Uggla#topic Review priorities16:20
Uggla#link https://etherpad.opendev.org/p/nova-2025.2-status16:21
UgglaI'd like to progress on openapi and I'll try to check with stephenfin about it.16:22
Uggla#topic Stable Branches16:22
Ugglaelodilles, the mic is yours16:23
elodillesthanks Uggla , so16:23
elodilles#info stable/2023.2 (bobcat) is End of Life, branch is deleted (tag: bobcat-eol)16:23
elodilles#info maintained stable branches: stable/2025.1, stable/2024.2, stable/2024.116:23
elodillesdown to 3 maintained branches again ;)16:23
elodilles#info nova stable release from stable/2024.1 is out (29.2.1)16:24
elodilles#info not aware of any stable gate failure16:24
elodilles#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:24
elodilleswe had a broken gate on stable/2025.1 osc-placement,16:24
elodillesbut it is now fixed16:25
elodillesthanks all for the help :)16:25
elodillesand i think that's all from me16:25
elodillesUggla: back to you16:25
Ugglaelodilles, fyi 947847: nova: Release 2024.2 Dalmatian 30.0.1 | https://review.opendev.org/c/openstack/releases/+/947847 should be ok now.16:25
UgglaAs I have done 948811: Add uc check alternative method | https://review.opendev.org/c/openstack/requirements/+/94881116:25
elodillesUggla: thanks for working on that!16:26
Ugglathe patch is on master with a backport to the stable branch.16:26
elodillesi've just added a comment on your patch o:)16:26
Ugglaso it needs reviews.16:26
Ugglabut at least the verify is +116:27
opendevreviewsean mooney proposed openstack/os-vif master: add pyproject.toml to support pip 23.1  https://review.opendev.org/c/openstack/os-vif/+/89994616:27
Uggla#topic vmwareapi 3rd-party CI efforts Highlights16:28
fwiesel #info No updates16:28
fwieselUggla: Back to you16:28
Ugglafwiesel, btw welcome back, I hope you are good.16:28
Uggla#topic Gibi's news about eventlet removal.16:29
fwieselThanks! Just still catching up with everything.16:29
gibio/16:29
Uggla#link Series: https://gibizer.github.io/categories/eventlet/16:29
gibiso last week we saw nova-scheduler running with native threading. There is a blogpost about it16:29
bauzashad no time to review those patches yet :(16:29
gibithis week I started testing the non happy path of the scatter-gather16:29
* bauzas hardly crying16:30
gibigood news: both mysql server and pymysql client can be configured with timeouts to avoid hanging gather threads16:30
gibiI will add a new doc about it in tree16:30
gibibad news, the oslo.service threading backend uses forks16:31
gibiwhich does not play nice with our global threadpools initialized before the service workers are forked off16:31
gibiwe have workarounds16:31
gibidetails are in https://review.opendev.org/c/openstack/oslo.service/+/945720/comments/fb5d6632_f0eaf10216:31
gibithere are two patches to look at16:31
gibihttps://review.opendev.org/c/openstack/nova/+/948064?usp=search now with test coverage16:32
gibiand16:32
gibihttps://review.opendev.org/c/openstack/nova/+/948437?usp=search16:32
gibithis week I will work on cleaning up the long series to make more patches ready to review16:32
gibithat is all16:32
gibiUggla: back to you16:32
sean-k-mooneygmaan: Uggla: just a quick update on the pyproject.toml change if i may16:33
sean-k-mooneyhttps://etherpad.opendev.org/p/pep-517-and-pip-23 is my old ether pad to track that work16:33
Ugglathx gibi 16:33
Ugglasean-k-mooney, sure go ahead16:33
sean-k-mooneyand only the os-vif change above is not merged for nova16:33
sean-k-mooneyso i rebased and approved that16:33
gmaanjust saw, ++16:33
sean-k-mooneyonce that is landed we should be good16:33
Uggla\o/16:34
sean-k-mooneyanyway that was all on that topic16:34
Ugglagibi, just a question: you said the oslo.service threading backend uses forks; does that mean real processes and not threads?16:35
gibiUggla: no, we want worker processes with thread pools16:35
gibibut it uses os.fork to create the worker from the main process16:35
sean-k-mooneycontext is that oslo.service16:35
sean-k-mooneyallows you to have multiple workers16:35
gibifork copies the state of the process 16:36
sean-k-mooneywe use that in the scheduler and conductor16:36
sean-k-mooneybut not nova-compute or the api16:36
gibiwe would need os.spawn to get a totally new worker process16:36
gibiwithout inheriting state16:36
Ugglaok I think I have understood the problem.16:37
gibithe workaround is to reset the problematic state16:37
gibiafter the fork16:37
gibibut I think the real solution would be os.spawn16:37
gibibut changing to that is not super simple as oslo.config has non pickleable lambdas :/16:37
gibidetails are in the linked gerrit comment16:37
gibiand are up in the today IRC log16:38
dansmithwe can reset after the fork (as you're doing) or reset before the fork, as I suspect will also work and be less janky16:38
gibiyepp16:38
gibiI will try dansmith's suggestion next16:38
gibiwhile we wait for the oslo folks to respond16:38
sean-k-mooneyhttps://github.com/openstack/oslo.config/blob/d6e5c96d6dbeec0db974dfb8afc8e508b74861e5/oslo_config/cfg.py#L138716:38
dansmithI agree the real solution is spawn, but we probably don't want to wait16:38
sean-k-mooneythat is their only use of lambda, i think, by the way16:38
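The spawn blocker gibi mentions can be demonstrated standalone (this is illustrative Python, not oslo.config code): a spawn-style worker start pickles the objects the worker needs, and a lambda cannot be pickled, while a plain module-level function can.

```python
import pickle

# Stand-in for oslo.config's internal lambda; any lambda behaves the same way.
callback = lambda value: value + 1

try:
    pickle.dumps(callback)   # roughly what a spawn-based worker start must do
    lambda_pickles = True
except Exception:            # PicklingError: functions pickle by qualified name,
    lambda_pickles = False   # and "<lambda>" cannot be looked back up

# Replacing the lambda with a named module-level function removes the blocker,
# because pickle can find it by name on unpickling.
def named_callback(value):
    return value + 1

roundtripped = pickle.loads(pickle.dumps(named_callback))
```

This is why swapping the lambda out of oslo.config would be a prerequisite for moving oslo.service to a spawn-based backend.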
gibidansmith: yepp we won't wait, we will go with the workaround, and adapt when oslo makes the move16:39
dansmith+116:39
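The failure mode and workaround discussed above can be sketched standalone (illustrative Python, not nova or oslo.service code): worker threads do not survive os.fork(), so a pool created before the fork is dead state in the child, and the fix is to rebuild the pool around the fork.

```python
import concurrent.futures
import os

# Parent creates a global pool, analogous to nova's global threadpools that
# exist before the service workers are forked off.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)
_pool.submit(lambda: None).result()  # force a worker thread to start

pid = os.fork()
if pid == 0:
    # Child: the inherited pool's worker threads no longer exist here.
    # The workaround is to discard it and build a fresh pool after the fork
    # (resetting before the fork, as suggested above, also works).
    _pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)
    ok = _pool.submit(lambda: 40 + 2).result() == 42
    os._exit(0 if ok else 1)

_, status = os.waitpid(pid, 0)
child_ok = os.waitstatus_to_exitcode(status) == 0
_pool.shutdown()
```

Note that recent CPython warns on os.fork() in a multi-threaded process for exactly this reason, which is why the spawn start method is the cleaner long-term answer.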
Ugglamoving on16:40
Uggla#topic Open discussion 16:40
Ugglafwiesel wants to propose something about cross-hypervisor resize.16:41
gibiI guess that was last week's topic16:41
Ugglayep but I understood you wanted to discuss it more.16:42
gibior maybe a continuation16:42
fwieselYes, but there was limited feedback. I mean, if no one feels strongly about it, then I can go forward with a blueprint.16:42
fwieselWe would like to allow mobility between two hypervisors, and I was thinking the cross-cell migration might already cover it to a large degree.16:43
fwieselThe question, though, is whether that is a use case you would feel is worthwhile to support or not.16:44
dansmithI think it _has_ to be more like cross-cell migration than regular16:44
dansmithI'm a bit mixed on whether or not I think this is worth it, because I suspect there will be a lot of gotchas in the image properties16:45
dansmithand because you can pretty much do this with snapshot yourself now16:45
dansmithand because we won't really be able to test it regularly, I don't think16:45
gibiyeah testing this will be painful16:46
fwieselOkay, got it. That's fine.16:46
sean-k-mooneydansmith: specifically, like updating hw_vif_model from, say, the vmware one to virtio if they don't happen to have common values already16:46
dansmithyeah, all that kind of stuff16:46
fwieselWell, we would set those in the flavours and that would override that.16:47
sean-k-mooneyso general question: is there anything we know of that would prevent you from shelving, modifying them in glance, and unshelving?16:47
fwieselNo, it is more a usability issue.16:47
fwieselFor our users.16:48
dansmithsean-k-mooney: if the flavor you booted from was vmware-specific you can't unshelve to a libvirt-y one right?16:48
sean-k-mooneyfwiesel: so historically flavors are for amounts and image properties are for changing how the devices are presented16:48
fwieselAnd we were thinking of using shared NFS shares to avoid going through image upload, etc...16:48
sean-k-mooneyfwiesel: so in the past the precedent was: don't put device models in flavors16:48
sean-k-mooneydansmith: correct16:48
sean-k-mooneydansmith: what I'm thinking is we discussed allowing resize in the shelved state16:49
dansmithto me, snapshot, tweak, boot fresh is the best pattern here16:49
sean-k-mooneyso the workflow would be shelve, resize, update image properties, and then unshelve16:49
dansmithsean-k-mooney: that's a whole other conversation16:49
fwieselBut I got it... hard to test so, no takers... Perfectly understandable.16:49
sean-k-mooneyya I'm just confirming if there is a way to do that today with the api we have, and I think the gap is resize while shelved16:49
opendevreviewBalazs Gibizer proposed openstack/nova master: DNM:Run nova-next with n-sch in threading mode  https://review.opendev.org/c/openstack/nova/+/94845016:50
fwieselThanks for the feedback.16:50
sean-k-mooneybut ya, snapshot, detach volumes/ports, and boot new vms with those is a valid workflow too16:50
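The shelve-based workflow sketched above might look like this with the openstack CLI. The server name, property value, and the "-shelved" image-name lookup are assumptions for illustration, the sequence needs an authenticated cloud, and the resize step is deliberately absent since that is the gap identified.

```shell
# Illustrative sketch; requires OpenStack credentials, names are assumed.
openstack server shelve myvm

# Shelving snapshots the instance into a Glance image (conventionally named
# "<server>-shelved"); adjust device-model properties on that image.
IMAGE=$(openstack image list --name "myvm-shelved" -f value -c ID)
openstack image set --property hw_vif_model=virtio "$IMAGE"

# A resize while shelved would go here, but that is not supported today.
openstack server unshelve myvm
```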
UgglaShall I say it is ok for fwiesel to propose something that goes in that direction ?16:52
fwieselMy understanding is, I won't propose things as it is doable with the existing API and implementing it within Nova is hard to test and maintain.16:53
dansmithI think anyone can propose anything they want :)16:53
dansmithfwiesel: ++16:54
Ugglaoh ok good.16:54
UgglaWe are almost at the top of the hour. Are you ok triaging a couple of bugs or would you prefer to do it next week ?16:56
Ugglasean-k-mooney, dansmith ^^16:58
Ugglaok no answer so you might be busy. So closing the meeting.16:59
Ugglathanks all16:59
gibiUggla: thanks17:00
fwieselthanks everyone17:00
Uggla#endmeeting17:00
opendevmeetMeeting ended Tue May  6 17:00:10 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)17:00
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2025/nova.2025-05-06-16.02.html17:00
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2025/nova.2025-05-06-16.02.txt17:00
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2025/nova.2025-05-06-16.02.log.html17:00
bauzasthanks Uggla17:00
Ugglathank you. We will try to do the bug scrubbing next week17:00
dansmithsorry, multitasking17:01
elodillesthanks Uggla o/17:01
gibisean-k-mooney: I stop for today but I will get back to forking tomorrow :)17:04
sean-k-mooneygibi: no worries17:19
sean-k-mooneyi started at 7am because i woke at 5, so i should also finish for today17:19
opendevreviewMerged openstack/os-vif master: add pyproject.toml to support pip 23.1  https://review.opendev.org/c/openstack/os-vif/+/89994620:48
opendevreviewMerged openstack/nova master: [quota]Refactor group counting to scatter-gather  https://review.opendev.org/c/openstack/nova/+/94806422:40

Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!