opendevreview | DongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova https://review.opendev.org/c/openstack/nova/+/948303 | 01:06 |
---|---|---|
opendevreview | DongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova https://review.opendev.org/c/openstack/nova/+/948303 | 01:25 |
opendevreview | DongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova https://review.opendev.org/c/openstack/nova/+/948303 | 01:42 |
opendevreview | DongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova https://review.opendev.org/c/openstack/nova/+/948303 | 01:56 |
opendevreview | Masanori Kuroha proposed openstack/nova master: Copy applied provider.yaml https://review.opendev.org/c/openstack/nova/+/948304 | 02:47 |
opendevreview | Masanori Kuroha proposed openstack/nova master: Copy applied provider.yaml https://review.opendev.org/c/openstack/nova/+/948304 | 03:05 |
opendevreview | DongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova https://review.opendev.org/c/openstack/nova/+/948303 | 04:28 |
opendevreview | DongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova https://review.opendev.org/c/openstack/nova/+/948303 | 05:00 |
opendevreview | DongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova https://review.opendev.org/c/openstack/nova/+/948303 | 05:41 |
opendevreview | DongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova https://review.opendev.org/c/openstack/nova/+/948303 | 06:04 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [quota]Refactor group counting to scatter-gather https://review.opendev.org/c/openstack/nova/+/948064 | 06:08 |
opendevreview | Balazs Gibizer proposed openstack/nova master: WIP:Translate scatter-gather to futurist https://review.opendev.org/c/openstack/nova/+/947966 | 06:08 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Use futurist for _get_default_green_pool() https://review.opendev.org/c/openstack/nova/+/948072 | 06:08 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Replace utils.spawn_n with spawn https://review.opendev.org/c/openstack/nova/+/948076 | 06:08 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Add spawn_on https://review.opendev.org/c/openstack/nova/+/948079 | 06:08 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Move ComputeManager to use spawn_on https://review.opendev.org/c/openstack/nova/+/948186 | 06:08 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Move ConductorManager to use spawn_on https://review.opendev.org/c/openstack/nova/+/948187 | 06:08 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Make nova.utils.pass_context private https://review.opendev.org/c/openstack/nova/+/948188 | 06:08 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Rename DEFAULT_GREEN_POOL to DEFAULT_EXECUTOR https://review.opendev.org/c/openstack/nova/+/948086 | 06:08 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Make the default executor configurable https://review.opendev.org/c/openstack/nova/+/948087 | 06:08 |
opendevreview | Balazs Gibizer proposed openstack/nova master: WIP: allow service to start with threading https://review.opendev.org/c/openstack/nova/+/948311 | 06:08 |
opendevreview | DongHun, Kim proposed openstack/nova master: Add support for memory allocation mode='immediate' in Nova https://review.opendev.org/c/openstack/nova/+/948303 | 06:31 |
mkuroha | @sean @gibi Hi, this patch is the initial implementation for the behavior change of additional.traits in provider.yaml that was discussed at the PTG: https://review.opendev.org/c/openstack/nova/+/948304 I would appreciate it if you could review the current code mm | 08:21 |
bauzas | gibi: so the goal for this week is about testing scatter-gather for the scheduler ? | 08:48 |
gibi | mkuroha: thanks for proposing that patch. It looks pretty clean and simple, so I like it. How do you feel about this approach? | 08:52 |
gibi | bauzas: yeah, I want to see the first light locally that nova-scheduler starts without monkey patching and with all the spawn calls actually taking native threads from a threadpool. If it starts and does not immediately fail, that is already a success. Then I will try throwing tempest at it and see if it works, does not hang, does not lose threads from the pool, etc. | 08:54 |
gibi | bauzas: it needs to depend on the still open (and not fully working) oslo.service patch so this will be mostly local trials and maybe CI runs if I can extend devstack to pass env variable to nova-scheduler at start | 08:55 |
gibi | I expect that this means less actual reviewable patches proposed, and more WIP patches iterated | 08:56 |
gibi | this week. | 08:56 |
gibi | Also this week I'm off Thursday, Friday | 08:57 |
gibi | so it is a short week anyhow | 08:57 |
gibi | so you can review the long series for spotting conceptual errors but that series is mostly WIP and expected to change based on the feedback provided by the local testing of scatter-gather in the scheduler in native threading mode. | 08:58 |
gibi | I will write up a post about how to replicate the local testing setup I have so others can try | 08:58 |
bauzas | gibi: cool, sorry wasn't nagged by a notification when you pinged me | 09:10 |
bauzas | gibi: I'll also be on PTO on Thur and Fri | 09:10 |
bauzas | gibi: so tbc, that's this gerrit branch that I could look at ? https://review.opendev.org/c/openstack/nova/+/948064/3 | 09:11 |
gibi | yes that is the one I'm trying to use to test things locally | 09:12 |
bauzas | noted | 09:12 |
gibi | there are two independent patches. A small delete https://review.opendev.org/c/openstack/nova/+/947260?usp=search | 09:12 |
gibi | and the replacement series here https://review.opendev.org/c/openstack/nova/+/947212 | 09:13 |
gibi | on the latter I expect Kamil to propose more similar patches this week | 09:13 |
bauzas | I already reviewed the hacking patch | 09:13 |
gibi | yeah, I just noticed. Thanks | 09:14 |
bauzas | for the moment, the removals and hacking patches were easy to review and without any concerns | 09:14 |
gibi | if you look at the recheck history of https://review.opendev.org/c/openstack/nova/+/947212 you see that our gate is not super happy | 09:14 |
gibi | I filed https://bugs.launchpad.net/glance/+bug/2109428 | 09:15 |
bauzas | but the futurist branch needs more brain time for me :)= | 09:15 |
gibi | bauzas: yeah the long branch is more complicated | 09:15 |
gibi | and less hashed out | 09:15 |
bauzas | yeah saw the libvirt failure | 09:15 |
bauzas | seems a new regression IMHO | 09:15 |
gibi | also we reverted a pyroute2 bump over the weekend https://bugs.launchpad.net/glance/+bug/2109428 | 09:16 |
gibi | sorry wrong link | 09:16 |
gibi | https://review.opendev.org/c/openstack/requirements/+/948283 | 09:16 |
gibi | but the author of pyroute2 already responded in https://bugs.launchpad.net/os-vif/+bug/2109396 so that is nicely moving along | 09:17 |
bauzas | thanks for the info | 09:17 |
gibi | now that I'm back to relying on the upstream CI, I will spend more time caring for it and filing bugs j) | 09:18 |
gibi | :) | 09:18 |
mkuroha | gibi: Thank you for the confirmation!!:) I have one concern: I'm wondering if there is any concern in the behavior when /etc/nova/provider.yaml is deleted while /applied_provider_config/provider.yaml is present. (Current: the additional.traits of the resource provider and the /applied_provider_config/provider.yaml file remain.). | 09:31 |
mkuroha | Additionally, in this patch, are there any tests other than the nova's functional tests that should be added? (This is my almost first patch for nova, and I'm not very familiar with it, so I would like to confirm it.) | 09:31 |
gibi | mkuroha: good point about the edge case of deleting /etc/nova/provider.yaml. I think the expected behavior in this case is that all the traits that are listed in the applied file are removed and then the applied file is deleted at the end. | 09:33 |
gibi | mkuroha: the functional test looks good to me. Adding one for the deleted provider.yaml case would be nice too. If you feel that the merge_provider_config or other changes could benefit from some unit test then that is also OK to add. | 09:35 |
gibi | but for me the current PoC would be enough to write and approve a spec | 09:36 |
gibi | so I suggest writing a small spec based on the PoC, so that the rest of the team can approve it | 09:37 |
mkuroha | gibi: I see, thank you for all your comments!! I will start working on fixing the spec side :) | 09:40 |
gibi | mkuroha: thanks for the effort! | 09:40 |
opendevreview | Elod Illes proposed openstack/nova stable/2025.1: FUP Remove unnecessary PCI check https://review.opendev.org/c/openstack/nova/+/946623 | 10:13 |
opendevreview | Merged openstack/nova master: Replace eventlet sleep with time.sleep https://review.opendev.org/c/openstack/nova/+/947212 | 10:17 |
sean-k-mooney | melwitt: dansmith we probably need to go fix https://bugs.launchpad.net/nova/+bug/2076614 | 11:39 |
sean-k-mooney | is there any reason not to delete all the online migrations before the bobcat one on master? | 11:39 |
sean-k-mooney | the previous one was for victoria when upgrading to wallaby | 11:40 |
sean-k-mooney | so no modern cloud should be running anything other than the latest one. even if you're coming from yoga | 11:41 |
sean-k-mooney | or wallaby | 11:41 |
dansmith | sean-k-mooney: removing the super old ones seems fine.. are you suggesting removing the one from victoria? | 13:52 |
sean-k-mooney | that im unsure | 13:52 |
sean-k-mooney | i was thinking yes | 13:52 |
dansmith | either way, that fake deleted instance marker thing probably needs to be cleaned in its own migration and never done again | 13:52 |
sean-k-mooney | and only keeping the last one | 13:52 |
sean-k-mooney | but i wanted your input on where to draw the line | 13:52 |
dansmith | keeping the bobcat one you mean | 13:52 |
sean-k-mooney | ya so we should keep that one | 13:52 |
sean-k-mooney | because you can still go from antelope to caracal | 13:52 |
sean-k-mooney | but victoria is very much eol | 13:53 |
sean-k-mooney | so i think we could drop that on master | 13:53 |
dansmith | yeah | 13:53 |
sean-k-mooney | the reason this is on my mind | 13:54 |
sean-k-mooney | is there is a mailing list thread | 13:54 |
sean-k-mooney | on SLURP upgrades | 13:54 |
sean-k-mooney | and apparently this is impacting folks so i wanted to highlight this again so we can figure out how to address this | 13:55 |
dansmith | yeah | 13:56 |
sean-k-mooney | by the way i noticed that we only run the online migrations | 13:59 |
sean-k-mooney | if we pass a var to grenade | 13:59 |
sean-k-mooney | https://opendev.org/openstack/grenade/src/branch/master/projects/60_nova/upgrade.sh#L83-L86 | 14:00 |
sean-k-mooney | i have not checked yet but i wonder if we are passing FORCE_ONLINE_MIGRATIONS in our ci jobs | 14:00 |
dansmith | online migrations are supposed to be post-upgrade idle time work, so the default is to not run them in grenade to make sure they aren't *required* to make things work | 14:00 |
sean-k-mooney | oh ok | 14:00 |
sean-k-mooney | so that's intentional | 14:00 |
dansmith | yeah | 14:00 |
sean-k-mooney | i guess we would only set that for a blocker data migration? | 14:01 |
sean-k-mooney | i was just trying to think is there any way we could catch the issue in ci | 14:01 |
dansmith | no, a blocker migration is something that blocks the schema upgrade, before we've upgraded code | 14:02 |
dansmith | so the only way we'd hit that is if we were somehow doing a multi-step upgrade in ci | 14:02 |
dansmith | sean-k-mooney: a functional reproducer for this would be best I think, | 14:03 |
sean-k-mooney | ok i was not sure if we had both blocker migration for schema and data or not | 14:03 |
dansmith | as it requires going through old migrations and new ones right? | 14:03 |
dansmith | like, you have to have run the vif one years ago and *then* run the more recent ones with that deleted entry still in there? | 14:03 |
sean-k-mooney | maybe we might need to tweak or avoid some of the db fixtures | 14:03 |
dansmith | also doesn't a db archive clean this up? | 14:03 |
dansmith | oh nm, it gets re-created even if we don't have anything to migrate I guess? ugh, that was such a mistake | 14:04 |
sean-k-mooney | i think, we might replicate the fake instance issue | 14:04 |
sean-k-mooney | if we created the schema with the latest version | 14:04 |
sean-k-mooney | and then did the online migration twice | 14:05 |
dansmith | yeah, twice | 14:05 |
sean-k-mooney | but im not sure | 14:05 |
dansmith | or we can just nuke that old stuff and move on | 14:05 |
sean-k-mooney | so trying to replicate this manually might be a good first step just to make sure we can replicate this in say devstack by hand | 14:05 |
sean-k-mooney | then we can comment out the old migrations and see if it works | 14:06 |
sean-k-mooney | although i think we will need to clean up the fake instance | 14:06 |
sean-k-mooney | im trying not to get nerd sniped into actually fixing this myself | 14:06 |
sean-k-mooney | which is why i have not tried that yet | 14:06 |
sean-k-mooney | but if others dont have time i can maybe try that | 14:07 |
dansmith | $ nova-manage db online_data_migrations | 14:08 |
dansmith | Running batches of 50 until complete | 14:08 |
dansmith | ERROR nova.objects.instance [None req-f81c2435-02e7-4cb1-83be-41c1d58c0114 None None] [instance: 730e0a06-3164-4096-8713-4d1f87d3260b] Unable to migrate instance because host None with node None not found: nova.exception.ComputeHostNotFound: Compute host None could not be found. | 14:08 |
dansmith | confirmed | 14:08 |
dansmith | I don't think we need to go overboard here, just nuke that ancient migration and move on right? | 14:09 |
sean-k-mooney | right the only thing i was thinking of doing differently was maybe deleting or skipping the marker instance in the migration that is failing | 14:09 |
sean-k-mooney | so maybe 2 patches: 1 to make the compute node migration not fail if the fake instance is there by skipping that | 14:10 |
sean-k-mooney | then a second one to drop the old migrations | 14:10 |
sean-k-mooney | unless you feel like we could backport dropping the migration to caracal? Antelope? | 14:11 |
sean-k-mooney | i guess caracal because we run the migration for the dest release | 14:11 |
opendevreview | Dan Smith proposed openstack/nova master: Remove buggy VIF migration from Stein https://review.opendev.org/c/openstack/nova/+/948328 | 14:14 |
dansmith | this ^ plus archive should be enough to get yourself out of a jam, so it seems like that's the best thing to do in the short term to me | 14:14 |
dansmith | and yes, I think we could backport that being so small a release or two, | 14:15 |
dansmith | but keeping it small like that also means unmaintainers could snag it as well | 14:15 |
sean-k-mooney | am i guess. do you agree we should follow up with removing the other old ones in a different commit | 14:15 |
sean-k-mooney | dansmith: by the way i would assume/hope there was some unit test for that that we should drop | 14:16 |
dansmith | if I apply that ^ and run archive once, then my migrations complete successfully | 14:16 |
dansmith | sean-k-mooney: yeah probably, we'll find out | 14:17 |
sean-k-mooney | ack | 14:17 |
gibi | sean-k-mooney, dansmith: fyi with https://review.opendev.org/c/openstack/nova/+/948311 and https://review.opendev.org/c/openstack/oslo.service/+/948310 nova-scheduler can be started with native threading. It is able to start up without crashing. Reaching "DB_Driver: join new ServiceGroup member aio to the scheduler group". It does not receive the RPC messages yet for some unknown reason. But it is | 14:52 |
gibi | progress :) | 14:52 |
dansmith | cool | 14:52 |
sean-k-mooney | thats an odd failure mode but cool | 14:52 |
gibi | the oslo.service threading backend is not yet in good shape but I was able to provide a good bunch of feedback to the author | 14:52 |
sean-k-mooney | did you change oslo.messaging executor to threaded | 14:53 |
sean-k-mooney | gibi: basically this https://github.com/openstack/nova/blob/master/nova/rpc.py#L220-L226 | 14:54 |
gibi | sean-k-mooney: not yet explicitly. I thought that is automatic | 14:54 |
gibi | ohh | 14:54 |
sean-k-mooney | it is if we dont pass one to use | 14:54 |
gibi | hehe | 14:54 |
sean-k-mooney | but nova does | 14:54 |
gibi | then that is the next step | 14:54 |
gibi | thanks for the pointer | 14:55 |
sean-k-mooney | i would personally prefer to continue to pass one explicitly | 14:55 |
sean-k-mooney | but in theory either works | 14:55 |
sean-k-mooney | i think the automatic switch can fail in some cases but im unsure | 14:55 |
gibi | woot, scheduling works. VM booted with threaded scheduler | 14:57 |
sean-k-mooney | nice | 14:57 |
gibi | now I have a long list of cleanups and a long list of testing to make in the series while the oslo folks clean up their patch | 14:58 |
sean-k-mooney | if you can patch the new revision with that change we can get some ci results and see 1 if it works in ci and 2 if there are any obvious timing differences | 14:58 |
sean-k-mooney | i know there will be run to run variance but we can at least see if it massively changed things | 14:59 |
*** gmaan_pto is now known as gmaan | 14:59 | |
sean-k-mooney | it would be good to look at the memory usage from those runs too | 14:59 |
sean-k-mooney | im not expecting anything dramatic in either case but it will be a nice datapoint | 15:00 |
gibi | I need to extend devstack to be able to pass an env variable to nova-scheduler service from the zuul config to enable the threaded mode | 15:01 |
sean-k-mooney | ah well you could hardcode it for now to on | 15:01 |
gibi | ahh true | 15:01 |
sean-k-mooney | but ok ya we need to be able to pass it via the systemd service file | 15:01 |
gibi | a DNM patch with hardcoded logic | 15:01 |
gibi | ahh meeting :/ | 15:01 |
opendevreview | Balazs Gibizer proposed openstack/nova master: WIP: allow service to start with threading https://review.opendev.org/c/openstack/nova/+/948311 | 15:30 |
*** elodille1 is now known as elodilles | 15:31 | |
opendevreview | Balazs Gibizer proposed openstack/nova master: WIP: allow service to start with threading https://review.opendev.org/c/openstack/nova/+/948311 | 15:37 |
gibi | sean-k-mooney: https://review.opendev.org/c/openstack/nova/+/948311/comment/0de434cf_ac26e3db/ this should trigger CI with nova-scheduler running in threading mode. | 15:39 |
sean-k-mooney | dansmith: looks like it passed unit tests... at least on arm | 15:40 |
sean-k-mooney | it failed the functional test so at least we have some testing for that | 15:42 |
sean-k-mooney | i guess testing online data migration does make more sense to do in functional test | 15:42 |
sean-k-mooney | although we could have had a test that just mocked each of the migration functions and asserted all of the expected ones were called but we evidently dont | 15:43 |
sean-k-mooney | ... looks like the cyborg job is broken, it's not able to find the wsgi console script | 15:47 |
sean-k-mooney | partly because they have not converted to pyproject.toml + having the wsgi entry point being called via a module path | 15:47 |
opendevreview | Balazs Gibizer proposed openstack/nova master: WIP: allow service to start with threading https://review.opendev.org/c/openstack/nova/+/948311 | 16:07 |
sean-k-mooney | gibi: the ovs hybrid plug job passed so that's promising https://zuul.opendev.org/t/openstack/build/4770c559c7514ad48463b7c198f31fa1 | 17:06 |
sean-k-mooney | https://zuul.opendev.org/t/openstack/buildset/1cd96c3f1a6e405eba4934bc783382db some of the other fast tempest tests have also passed | 17:07 |
sean-k-mooney | gibi: is there any log message or anything like that that we can point to to know it's in threading mode | 17:08 |
sean-k-mooney | gibi: ah we should see https://review.opendev.org/c/openstack/nova/+/948311/4/nova/monkey_patch.py#89 if we are monkey patched | 17:22 |
sean-k-mooney | and this warning if we are not https://review.opendev.org/c/openstack/nova/+/948311/4/nova/monkey_patch.py#119 | 17:22 |
sean-k-mooney | i do not see either printed | 17:24 |
opendevreview | Dan Smith proposed openstack/nova master: Remove buggy VIF migration from Stein https://review.opendev.org/c/openstack/nova/+/948328 | 17:25 |
gibi | sean-k-mooney: yeah I think somehow logging is not properly initialized. If I then run nova-scheduler in devstack from a terminal I see the warning. I will look into this. In parallel I put together some extra logging about the stats of our pools during use. | 17:31 |
sean-k-mooney | ack | 17:31 |
sean-k-mooney | when we are monkey patching i dont think we have initialised oslo | 17:32 |
sean-k-mooney | or logign fully | 17:32 |
sean-k-mooney | * or rather logging fullyt | 17:32 |
* sean-k-mooney cant type today | 17:32 | |
sean-k-mooney | gibi: what's probably happening is this is likely going to stdout | 17:33 |
sean-k-mooney | because we have not setup the redirection to the journal yet | 17:33 |
sean-k-mooney | im not sure we have parsed our config yet when we are monkey patching like this | 17:34 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Print ThreadPool statistics https://review.opendev.org/c/openstack/nova/+/948340 | 17:35 |
gibi | this will print semi periodically ^^ | 17:36 |
gibi | so we will see the type and the state of the executors | 17:36 |
sean-k-mooney | ack | 17:36 |
sean-k-mooney | ok that includes the executor name in the stats | 17:37 |
sean-k-mooney | ya that should work | 17:37 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Print ThreadPool statistics https://review.opendev.org/c/openstack/nova/+/948340 | 17:37 |
gibi | I stop here today | 17:38 |
gibi | it was a good day :) | 17:38 |
sean-k-mooney | gibi: ack o/ | 17:39 |
gibi | o/ | 17:39 |
sean-k-mooney | we should have some nice logs to review in the morning | 17:39 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Print ThreadPool statistics https://review.opendev.org/c/openstack/nova/+/948340 | 18:33 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Print ThreadPool statistics https://review.opendev.org/c/openstack/nova/+/948340 | 18:59 |
opendevreview | Dan Smith proposed openstack/nova master: Remove buggy VIF migration from Stein https://review.opendev.org/c/openstack/nova/+/948328 | 19:17 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Print ThreadPool statistics https://review.opendev.org/c/openstack/nova/+/948340 | 20:04 |
opendevreview | Ivan Anfimov proposed openstack/nova master: Remove deprecated [workarounds] enable_numa_live_migration https://review.opendev.org/c/openstack/nova/+/905426 | 21:49 |
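
For readers following the scatter-gather discussion in this log, here is a minimal sketch of the general pattern on native threads. It is not nova's implementation: it uses the standard library's `concurrent.futures` instead of futurist, and the names `scatter_gather`, `DID_NOT_RESPOND` and the cell labels are hypothetical, chosen only for illustration.

```python
import concurrent.futures
import time

# Hypothetical sentinel marking a target that did not answer in time.
DID_NOT_RESPOND = object()


def scatter_gather(cells, fn, timeout=1.0, max_workers=4):
    """Run fn(cell) for every cell on native threads and collect results.

    Returns a dict mapping each cell to its result, to the exception it
    raised, or to DID_NOT_RESPOND if it missed the timeout.
    """
    results = {}
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=max_workers)
    try:
        # Scatter: submit one task per cell to the thread pool.
        futures = {cell: executor.submit(fn, cell) for cell in cells}
        # Gather: wait for all tasks, but no longer than the timeout.
        done, _ = concurrent.futures.wait(futures.values(), timeout=timeout)
        for cell, future in futures.items():
            if future not in done:
                results[cell] = DID_NOT_RESPOND
            elif future.exception() is not None:
                results[cell] = future.exception()
            else:
                results[cell] = future.result()
    finally:
        # Do not block the caller on stragglers; they finish in the background.
        executor.shutdown(wait=False)
    return results
```

A caller would pass a per-cell query function and inspect the returned dict for the sentinel or for exception instances. Unlike greenthreads, a native thread that misses the timeout cannot be killed, so the straggler keeps occupying a pool worker until it returns, which is one reason to watch pool statistics during testing.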