Monday, 2025-04-28

opendevreviewDongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova  https://review.opendev.org/c/openstack/nova/+/94830301:06
opendevreviewDongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova  https://review.opendev.org/c/openstack/nova/+/94830301:25
opendevreviewDongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova  https://review.opendev.org/c/openstack/nova/+/94830301:42
opendevreviewDongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova  https://review.opendev.org/c/openstack/nova/+/94830301:56
opendevreviewMasanori Kuroha proposed openstack/nova master: Copy applied provider.yaml  https://review.opendev.org/c/openstack/nova/+/94830402:47
opendevreviewMasanori Kuroha proposed openstack/nova master: Copy applied provider.yaml  https://review.opendev.org/c/openstack/nova/+/94830403:05
opendevreviewDongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova  https://review.opendev.org/c/openstack/nova/+/94830304:28
opendevreviewDongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova  https://review.opendev.org/c/openstack/nova/+/94830305:00
opendevreviewDongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova  https://review.opendev.org/c/openstack/nova/+/94830305:41
opendevreviewDongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova  https://review.opendev.org/c/openstack/nova/+/94830306:04
opendevreviewBalazs Gibizer proposed openstack/nova master: [quota]Refactor group counting to scatter-gather  https://review.opendev.org/c/openstack/nova/+/94806406:08
opendevreviewBalazs Gibizer proposed openstack/nova master: WIP:Translate scatter-gather to futurist  https://review.opendev.org/c/openstack/nova/+/94796606:08
opendevreviewBalazs Gibizer proposed openstack/nova master: Use futurist for _get_default_green_pool()  https://review.opendev.org/c/openstack/nova/+/94807206:08
opendevreviewBalazs Gibizer proposed openstack/nova master: Replace utils.spawn_n with spawn  https://review.opendev.org/c/openstack/nova/+/94807606:08
opendevreviewBalazs Gibizer proposed openstack/nova master: Add spawn_on  https://review.opendev.org/c/openstack/nova/+/94807906:08
opendevreviewBalazs Gibizer proposed openstack/nova master: Move ComputeManager to use spawn_on  https://review.opendev.org/c/openstack/nova/+/94818606:08
opendevreviewBalazs Gibizer proposed openstack/nova master: Move ConductorManager to use spawn_on  https://review.opendev.org/c/openstack/nova/+/94818706:08
opendevreviewBalazs Gibizer proposed openstack/nova master: Make nova.utils.pass_context private  https://review.opendev.org/c/openstack/nova/+/94818806:08
opendevreviewBalazs Gibizer proposed openstack/nova master: Rename DEFAULT_GREEN_POOL to DEFAULT_EXECUTOR  https://review.opendev.org/c/openstack/nova/+/94808606:08
opendevreviewBalazs Gibizer proposed openstack/nova master: Make the default executor configurable  https://review.opendev.org/c/openstack/nova/+/94808706:08
opendevreviewBalazs Gibizer proposed openstack/nova master: WIP: allow service to start with threading  https://review.opendev.org/c/openstack/nova/+/94831106:08
opendevreviewDongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova  https://review.opendev.org/c/openstack/nova/+/94830306:31
mkuroha@sean @gibi Hi, this patch is the initial implementation for the behavior change of additional.traits in provider.yaml that was discussed at the PTG: https://review.opendev.org/c/openstack/nova/+/948304  I would appreciate it if you could review the current code mm08:21
bauzasgibi: so the goal for this week is about testing scatter-gather for the scheduler ?08:48
gibimkuroha: thanks for proposing that patch. It looks pretty clean and simple, so I like it. How do you feel about this approach?08:52
gibibauzas: yeah, I want to see the first light locally that nova-scheduler starts without monkey patching and with all the spawn calls actually taking native threads from a threadpool. If it starts and does not immediately fail, that is already a success. Then I will try throwing tempest at it and see if it works, does not hang, does not lose threads from the pool, etc. 08:54
gibibauzas: it needs to depend on the still open (and not fully working) oslo.service patch so this will be mostly local trials and maybe CI runs if I can extend devstack to pass an env variable to nova-scheduler at start08:55
gibiI expect that this means less actual reviewable patches proposed, and more WIP patches iterated08:56
gibithis week.08:56
gibiAlso this week I'm off Thursday, Friday08:57
gibiso it is a short week anyhow08:57
gibiso you can review the long series for spotting conceptual errors but that series is mostly WIP and expected to change based on the feedback provided by the local testing of scatter-gather in the scheduler in native threading mode.08:58
gibiI will write up a post about how to replicate the local testing setup I have so others can try08:58
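For readers unfamiliar with the pattern discussed above, here is a minimal sketch of scatter-gather on a native thread pool. It uses only the standard library `concurrent.futures` module, not futurist or nova's actual helpers; the function names `scatter_gather` and `count_instances` and the cell names are hypothetical and exist only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def scatter_gather(executor, targets, fn, timeout=60):
    """Run fn(target) for every target on the pool and gather results per target."""
    # Scatter: one task per target, each running on a native pool thread.
    futures = {executor.submit(fn, target): target for target in targets}
    # Gather: wait for all tasks, but no longer than the timeout.
    done, not_done = wait(futures, timeout=timeout)
    results = {}
    for fut in done:
        exc = fut.exception()
        # Record the exception object instead of raising, so one failing
        # target does not hide the results of the others.
        results[futures[fut]] = exc if exc is not None else fut.result()
    for fut in not_done:
        fut.cancel()
        results[futures[fut]] = TimeoutError("no response within timeout")
    return results


def count_instances(cell):
    # Hypothetical per-cell query; a real one would hit that cell's database.
    fake_db = {"cell1": 3, "cell2": 5}
    return fake_db[cell]


with ThreadPoolExecutor(max_workers=4) as pool:
    results = scatter_gather(pool, ["cell1", "cell2", "cell3"], count_instances)
```

In this sketch a failing or slow target is reported as an exception object in the result map rather than aborting the whole gather, which is one possible design and not necessarily what the series under review does.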
bauzasgibi: cool, sorry wasn't nagged by a notification when you pinged me09:10
bauzasgibi: I'll also be on PTO on Thur and Fri09:10
bauzasgibi: so tbc, that's this gerrit branch that I could look at ? https://review.opendev.org/c/openstack/nova/+/948064/309:11
gibiyes that is the one I'm trying to use to test things locally09:12
bauzasnoted09:12
gibithere are two independent patches. A small delete https://review.opendev.org/c/openstack/nova/+/947260?usp=search09:12
gibiand the replacement series here https://review.opendev.org/c/openstack/nova/+/94721209:13
gibion the latter I expect Kamil to propose more similar patches this week09:13
bauzasI already reviewed the hacking patch 09:13
gibiyeah, I just noticed. Thanks09:14
bauzasfor the moment, the removals and hacking patches were easy to review and without any concerns09:14
gibiif you look at the recheck history of https://review.opendev.org/c/openstack/nova/+/947212 you see that our gate is not super happy09:14
gibiI filed https://bugs.launchpad.net/glance/+bug/210942809:15
bauzasbut the futurist branch needs more brain time for me :)=09:15
gibibauzas: yeah the long branch is more complicated09:15
gibiand less hashed out09:15
bauzasyeah saw the libvirt failure09:15
bauzasseems a new regression IMHO09:15
gibialso we reverted a pyroute2 bump over the weekend https://bugs.launchpad.net/glance/+bug/210942809:16
gibisorry wrong link09:16
gibihttps://review.opendev.org/c/openstack/requirements/+/94828309:16
gibibut the author of pyroute2 already responded in https://bugs.launchpad.net/os-vif/+bug/2109396 so that is nicely moving along09:17
bauzasthanks for the info09:17
gibinow that I'm back to relying on the upstream CI, I will spend more time caring for it and filing bugs j)09:18
gibi:)09:18
mkurohagibi: Thank you for the confirmation!!:) I have one concern: I'm wondering if there is any concern in the behavior when /etc/nova/provider.yaml is deleted while /applied_provider_config/provider.yaml is present. (Current:  the additional.traits of the resource provider and the /applied_provider_config/provider.yaml file remain.). 09:31
mkurohaAdditionally, in this patch, are there any tests other than nova's functional tests that should be added? (This is almost my first patch for nova, and I'm not very familiar with it, so I would like to confirm it.)09:31
gibimkuroha: good point about the edge case of deleting /etc/nova/provider.yaml. I think the expected behavior in this case is that all the traits listed in the applied file are removed and then the applied file is deleted at the end.09:33
gibimkuroha: the functional test looks good to me. Adding one for the deleted provider.yaml case would be nice too. If you feel that the merge_provider_config or other changes could benefit from some unit test then that is also OK to add. 09:35
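The edge case described above (provider.yaml deleted while the applied copy still exists) could be handled roughly as in the sketch below. This is an illustration of the stated expectation only, not the code in the patch: `cleanup_applied_config` is a hypothetical helper, the trait sets are passed in directly instead of being parsed from YAML, and the file names are placeholders.

```python
import os
import tempfile


def cleanup_applied_config(config_path, applied_path, provider_traits, applied_traits):
    """Return the traits the provider should keep after handling a deleted config.

    Hypothetical helper: if the operator's config file is gone but an applied
    copy remains, drop every trait recorded in the applied copy, then delete
    the applied copy as the last step.
    """
    if os.path.exists(config_path) or not os.path.exists(applied_path):
        # Nothing to clean up: the config is still there, or nothing was applied.
        return set(provider_traits)
    remaining = set(provider_traits) - set(applied_traits)
    os.remove(applied_path)  # delete the applied file only after the traits are removed
    return remaining


with tempfile.TemporaryDirectory() as tmp:
    config = os.path.join(tmp, "provider.yaml")  # never created, i.e. "deleted"
    applied = os.path.join(tmp, "applied_provider.yaml")
    open(applied, "w").close()
    traits = cleanup_applied_config(
        config, applied, {"CUSTOM_GOLD", "HW_CPU_X86_AVX2"}, {"CUSTOM_GOLD"}
    )
    applied_still_exists = os.path.exists(applied)
```

A functional test for the deleted provider.yaml case would assert the same two outcomes: the previously applied traits are gone and the applied file no longer exists.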
gibibut for me the current PoC would be enough to write and approve a spec09:36
gibiso I suggest writing a small spec based on the PoC, so that the rest of the team can approve it 09:37
mkurohagibi: I see, thank you for all your comments!! I will start working on fixing the spec side :)09:40
gibimkuroha: thanks for the effort!09:40
opendevreviewElod Illes proposed openstack/nova stable/2025.1: FUP Remove unnecessary PCI check  https://review.opendev.org/c/openstack/nova/+/94662310:13
opendevreviewMerged openstack/nova master: Replace eventlet sleep with time.sleep  https://review.opendev.org/c/openstack/nova/+/94721210:17
sean-k-mooneymelwitt: dansmith we probably need to go fix https://bugs.launchpad.net/nova/+bug/207661411:39
sean-k-mooneyis there any reason not to delete all the online migrations before the bobcat one on master?11:39
sean-k-mooneythe previous one was for victoria when upgrading to wallaby11:40
sean-k-mooneyso no modern cloud should be running anything other than the latest one. even if you're coming from yoga11:41
sean-k-mooneyor wallaby11:41
dansmithsean-k-mooney: removing the super old ones seems fine.. are you suggesting removing the one from victoria?13:52
sean-k-mooneythat im unsure13:52
sean-k-mooneyi was thinking yes13:52
dansmitheither way, that fake deleted instance marker thing probably needs to be cleaned in its own migration and never done again13:52
sean-k-mooneyand only keeping the last one13:52
sean-k-mooneybut i wanted your input on where to draw the line13:52
dansmithkeeping the bobcat one you mean13:52
sean-k-mooneyya so we should keep that one13:52
sean-k-mooneybecause you can still go from antelope to caracal13:52
sean-k-mooneybut victoria is very much eol13:53
sean-k-mooneyso i think we could drop that on master13:53
dansmithyeah13:53
sean-k-mooneythe reason this is on my mind13:54
sean-k-mooneyis there is a mailing list thread13:54
sean-k-mooneyon slup upgrades13:54
sean-k-mooneyand apparently this is impacting folks so i wanted to highlight this again so we can figure out how to address this13:55
dansmithyeah13:56
sean-k-mooneyby the way i noticed that we only run the online migrations13:59
sean-k-mooneyif we pass a var to grenade13:59
sean-k-mooneyhttps://opendev.org/openstack/grenade/src/branch/master/projects/60_nova/upgrade.sh#L83-L8614:00
sean-k-mooneyi have not checked yet but i wonder if we are passing FORCE_ONLINE_MIGRATIONS in our ci jobs14:00
dansmithonline migrations are supposed to be post-upgrade idle time work, so the default is to not run them in grenade to make sure they aren't *required* to make things work14:00
sean-k-mooneyoh ok14:00
sean-k-mooneyso that's intentional14:00
dansmithyeah14:00
sean-k-mooneyi guess we would only set that for a blocker data migration?14:01
sean-k-mooneyi was just trying to think is there any way we could catch the issue in ci14:01
dansmithno, a blocker migration is something that blocks the schema upgrade, before we've upgraded code14:02
dansmithso the only way we'd hit that is if we were somehow doing a multi-step upgrade in ci14:02
dansmithsean-k-mooney: a functional reproducer for this would be best I think,14:03
sean-k-mooneyok i was not sure if we had both blocker migration for schema and data or not14:03
dansmithas it requires going through old migrations and new ones right?14:03
dansmithlike, you have to have run the vif one years ago and *then* run the more recent ones with that deleted entry still in there?14:03
sean-k-mooneymaybe we might need to tweak or avoid some of the db fixture14:03
dansmithalso doesn't a db archive clean this up?14:03
dansmithoh nm, it gets re-created even if we don't have anything to migrate I guess? ugh, that was such a mistake14:04
sean-k-mooneyi think, we might replicate the fake instance issue14:04
sean-k-mooneyif we created the schema with the latest version14:04
sean-k-mooneyand then did the online migration twice14:05
dansmithyeah, twice14:05
sean-k-mooneybut im not sure14:05
dansmithor we can just nuke that old stuff and move on14:05
sean-k-mooneyso trying to replicate this manually might be a good first step just to make sure we can replicate this in say devstack by hand14:05
sean-k-mooneythen we can comment out the old migrations and see if it works14:06
sean-k-mooneyalthough i think we will need to clean up the fake instance14:06
sean-k-mooneyim trying not to get nerd sniped into actually fixing this myself 14:06
sean-k-mooneywhich is why i have not tried that yet14:06
sean-k-mooneybut if others dont have time i can maybe try that14:07
dansmith$ nova-manage db online_data_migrations14:08
dansmithRunning batches of 50 until complete14:08
dansmithERROR nova.objects.instance [None req-f81c2435-02e7-4cb1-83be-41c1d58c0114 None None] [instance: 730e0a06-3164-4096-8713-4d1f87d3260b] Unable to migrate instance because host None with node None not found: nova.exception.ComputeHostNotFound: Compute host None could not be found.14:08
dansmithconfirmed14:08
dansmithI don't think we need to go overboard here, just nuke that ancient migration and move on right?14:09
sean-k-mooneyright the only thing i was thinking of doing differently was maybe deleting or skipping the marker instance in the migration that is failing14:09
sean-k-mooneyso maybe 2 patches 1 to make the compute node migration not fail if the fake instance is there by skipping that14:10
sean-k-mooneythen a second one to drop the old migrations14:10
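The first of the two patches floated above (make the compute node migration tolerate the leftover marker instance by skipping it) might look roughly like this sketch. Everything here is an assumption for illustration: instances are plain dicts, `is_marker` guesses that the marker row is soft-deleted with no host or node, and `migrate_compute_ids` is a hypothetical stand-in for the real migration. The `(found, done)` return value mirrors the convention online data migrations use for reporting progress.

```python
def is_marker(instance):
    # Assumption: the leftover marker row is soft-deleted and has no host or node.
    return bool(instance["deleted"]) and instance["host"] is None and instance["node"] is None


def migrate_compute_ids(instances, node_ids):
    """Fill in compute_id for instances that lack it, skipping marker rows.

    Returns (found, done): how many rows needed migrating and how many were migrated.
    """
    found = done = 0
    for inst in instances:
        if inst.get("compute_id") is not None or is_marker(inst):
            continue  # already migrated, or a marker row that must never be migrated
        found += 1
        key = (inst["host"], inst["node"])
        if key in node_ids:
            inst["compute_id"] = node_ids[key]
            done += 1
    return found, done


instances = [
    {"host": "cmp1", "node": "cmp1", "deleted": 0, "compute_id": None},
    {"host": None, "node": None, "deleted": 1, "compute_id": None},  # marker row
    {"host": "gone", "node": "gone", "deleted": 0, "compute_id": None},
]
found, done = migrate_compute_ids(instances, {("cmp1", "cmp1"): 7})
```

With the marker row skipped, it no longer counts as work that can never complete, so repeated runs of the migration are not blocked by it.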
sean-k-mooneyunless you feel like we could backport dropping the migration to caracal? Antelope?14:11
sean-k-mooneyi guess caracal because we run the migration for the dest release14:11
opendevreviewDan Smith proposed openstack/nova master: Remove buggy VIF migration from Stein  https://review.opendev.org/c/openstack/nova/+/94832814:14
dansmiththis ^ plus archive should be enough to get yourself out of a jam, so it seems like that's the best thing to do in the short term to me14:14
dansmithand yes, I think we could backport that being so small a release or two,14:15
dansmithbut keeping it small like that also means unmaintainers could snag it as well14:15
sean-k-mooneyam i guess. do you agree we should follow up with removing the other old ones in a different commit14:15
sean-k-mooneydansmith: by the way i would assume/hope there was some unit test for that that we should drop14:16
dansmithif I apply that ^ and run archive once, then my migrations complete successfully14:16
dansmithsean-k-mooney: yeah probably, we'll find out14:17
sean-k-mooneyack14:17
gibisean-k-mooney, dansmith: fyi with https://review.opendev.org/c/openstack/nova/+/948311 and https://review.opendev.org/c/openstack/oslo.service/+/948310 nova-scheduler can be started with native threading. It is able to start up without crashing. Reaching "DB_Driver: join new ServiceGroup member aio to the scheduler group". It does not receive the RPC messages yet for some unknown reason. But it is 14:52
gibiprogress :)14:52
dansmithcool14:52
sean-k-mooneythats an odd failure mode but cool14:52
gibithe oslo.service threading backend is not yet in good shape but I was able to provide a good bunch of feedback to the author14:52
sean-k-mooneydid you change oslo.messaging executor to threaded14:53
sean-k-mooneygibi: basicaly this https://github.com/openstack/nova/blob/master/nova/rpc.py#L220-L22614:54
gibisean-k-mooney: not yet explicitly. I thought that is automatic14:54
gibiohh14:54
sean-k-mooneyit is if we dont pass one to use14:54
gibihehe14:54
sean-k-mooneybut nova does14:54
gibithen that is the next step14:54
gibithanks for the pointer14:55
sean-k-mooneyi would personally prefer to continue to pass one explicitly14:55
sean-k-mooneybut in theory either works14:55
sean-k-mooneyi think the automatic switch can fail in some cases but im unsure14:55
gibiwoot, scheduling works. VM booted with threaded scheduler14:57
sean-k-mooneynice14:57
gibinow I have a long list of cleanups and a long list of testing to do in the series while the oslo folks clean up their patch14:58
sean-k-mooneyif you can patch the new revision with that change we can get some ci results and see if 1 it works in ci and 2 if there are any obvious timing differences14:58
sean-k-mooneyi know there will be run to run variance but we can at least see if it massively changed things 14:59
*** gmaan_pto is now known as gmaan14:59
sean-k-mooneyit would be good to look at the memory usage from those runs too14:59
sean-k-mooneyim not expecting anything dramatic in either case but it will be a nice datapoint15:00
gibiI need to extend devstack to be able to pass an env variable to nova-scheduler service from the zuul config to enable the threaded mode15:01
sean-k-mooneyah well you could hardcode it for now to on15:01
gibiahh true15:01
sean-k-mooneybut ok ya we need to be able to pass it via the systemd service file15:01
gibia DNM patch with hardcoded logic15:01
gibiahh meeting :/15:01
opendevreviewBalazs Gibizer proposed openstack/nova master: WIP: allow service to start with threading  https://review.opendev.org/c/openstack/nova/+/94831115:30
*** elodille1 is now known as elodilles15:31
opendevreviewBalazs Gibizer proposed openstack/nova master: WIP: allow service to start with threading  https://review.opendev.org/c/openstack/nova/+/94831115:37
gibisean-k-mooney: https://review.opendev.org/c/openstack/nova/+/948311/comment/0de434cf_ac26e3db/ this should trigger CI with nova-scheduler running in threading mode. 15:39
sean-k-mooneydansmith: looks like it passed unit tests... at least on arm15:40
sean-k-mooneyit failed the functional test so at least we have some testing for that15:42
sean-k-mooneyi guess testing online data migration does make more sense to do in functional test 15:42
sean-k-mooneyalthough we could have had a test that just mocked each of the migration functions and asserted all of the expected ones were called but we evidently dont15:43
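The kind of unit test mentioned above (mock every migration function and assert each expected one was called) can be sketched with `unittest.mock` as follows. The runner `run_online_migrations` is a simplified, hypothetical stand-in for the real command and the two mocks stand in for the registered migration functions; only the testing technique is the point.

```python
from unittest import mock


def run_online_migrations(context, max_count, migrations):
    """Hypothetical runner: call every migration once and sum the (found, done) counts."""
    total_found = total_done = 0
    for migration in migrations:
        found, done = migration(context, max_count)
        total_found += found
        total_done += done
    return total_found, total_done


# Each mock stands in for one migration function and reports (found, done).
mocks = [mock.Mock(return_value=(2, 2)), mock.Mock(return_value=(0, 0))]
totals = run_online_migrations("ctx", 50, mocks)

# The check itself: every expected migration was called exactly once
# with the context and batch size that the runner was given.
for migration_mock in mocks:
    migration_mock.assert_called_once_with("ctx", 50)
```

Such a test only catches a migration being dropped from or added to the list. It would not reproduce a data-dependent failure like the one discussed here, which is why a functional test is the better fit for that.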
sean-k-mooney... looks like the cyborg job is broken, it's not able to find the wsgi console script15:47
sean-k-mooneypartly because they have not converted to pyproject.toml + having the wsgi entry point being called via a module path15:47
opendevreviewBalazs Gibizer proposed openstack/nova master: WIP: allow service to start with threading  https://review.opendev.org/c/openstack/nova/+/94831116:07
sean-k-mooneygibi: the ovs hybrid plug job passed so that's promising https://zuul.opendev.org/t/openstack/build/4770c559c7514ad48463b7c198f31fa117:06
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/buildset/1cd96c3f1a6e405eba4934bc783382db some of the other fast tempest tests have also passed 17:07
sean-k-mooneygibi: is there any log message or anything like that that we can point to to know it's in threading mode17:08
sean-k-mooneygibi: ah we should see https://review.opendev.org/c/openstack/nova/+/948311/4/nova/monkey_patch.py#89 if we are monkey patched17:22
sean-k-mooneyand this warning if we are not https://review.opendev.org/c/openstack/nova/+/948311/4/nova/monkey_patch.py#11917:22
sean-k-mooneyi do not see either printed17:24
opendevreviewDan Smith proposed openstack/nova master: Remove buggy VIF migration from Stein  https://review.opendev.org/c/openstack/nova/+/94832817:25
gibisean-k-mooney: yeah I think somehow logging is not properly initialized. If I run nova-scheduler in devstack from a terminal then I see the warning. I will look into this. In parallel I put together some extra logging about the stats of our pools during use. 17:31
sean-k-mooneyack17:31
sean-k-mooneywhen we are monkey patching i dont think we hav initalised oslo17:32
sean-k-mooneyor logign fully17:32
sean-k-mooney* or rather logging fullyt17:32
* sean-k-mooney cant type today17:32
sean-k-mooneygibi: what's probably happening is this is likely going to stdout17:33
sean-k-mooneybecause we have not setup the redirection to the journal yet17:33
sean-k-mooneyim not sure we have parsed our config yet when we are monkey patching like this17:34
opendevreviewBalazs Gibizer proposed openstack/nova master: Print ThreadPool statistics  https://review.opendev.org/c/openstack/nova/+/94834017:35
gibithis will print semi periodically ^^17:36
gibiso we will see the type and the state of the executors17:36
sean-k-mooney ack17:36
sean-k-mooneyok that includes the executor name in the stats17:37
sean-k-mooneyya that should work17:37
opendevreviewBalazs Gibizer proposed openstack/nova master: Print ThreadPool statistics  https://review.opendev.org/c/openstack/nova/+/94834017:37
gibiI stop here today17:38
gibiit was a good day :)17:38
sean-k-mooneygibi: ack o/17:39
gibio/17:39
sean-k-mooneywe should have some nice logs to review in the morning17:39
opendevreviewBalazs Gibizer proposed openstack/nova master: Print ThreadPool statistics  https://review.opendev.org/c/openstack/nova/+/94834018:33
opendevreviewBalazs Gibizer proposed openstack/nova master: Print ThreadPool statistics  https://review.opendev.org/c/openstack/nova/+/94834018:59
opendevreviewDan Smith proposed openstack/nova master: Remove buggy VIF migration from Stein  https://review.opendev.org/c/openstack/nova/+/94832819:17
opendevreviewBalazs Gibizer proposed openstack/nova master: Print ThreadPool statistics  https://review.opendev.org/c/openstack/nova/+/94834020:04
opendevreviewIvan Anfimov proposed openstack/nova master: Remove deprecated [workarounds] enable_numa_live_migration  https://review.opendev.org/c/openstack/nova/+/90542621:49
