| *** mhen_ is now known as mhen | 02:53 | |
| tonyb | review.o.o is down. When I checked, it was in shutoff state. That's the second time recently. One guess is that the hypervisor host's OOM killer is ... well, killing the VM (it is a memory hog). | 06:47 |
|---|---|---|
| tonyb | If that happened, and nova notices, does nova take any action? Like maybe logging a fake/minimal server event | 06:48 |
| tonyb | Maybe it looks like a "power-update" to nova.compute.api::external_instance_event and emits an event here: https://opendev.org/openstack/nova/src/commit/32f58e8ad6a7ff896cc6ae8a361e3a18f5b35c9a/nova/compute/api.py#L5994 ? | 07:12 |
| gibi | tonyb: nova logs the following from a periodic task if it detects that the power state of the VM is different from the power state of the domain on the hypervisor https://github.com/openstack/nova/blob/32f58e8ad6a7ff896cc6ae8a361e3a18f5b35c9a/nova/compute/manager.py#L11134 | 08:00 |
| gibi | then aligns the power state in the DB | 08:01 |
| tonyb | gibi: In this case I don't have access to the nova logs | 08:02 |
| gibi | I assume after the OOM kill libvirt will report that the VM is stopped. If so then nova will use the normal VM stop api to stop the VM properly https://github.com/openstack/nova/blob/32f58e8ad6a7ff896cc6ae8a361e3a18f5b35c9a/nova/compute/manager.py#L11190 | 08:02 |
| tonyb | gibi: I do see a server event: https://paste.opendev.org/show/bvD4k4YfaLIc7qYkc3dc/ | 08:02 |
| gibi | during the normal stop we record events to the DB and send notification on the message bus https://github.com/openstack/nova/blob/32f58e8ad6a7ff896cc6ae8a361e3a18f5b35c9a/nova/compute/manager.py#L3465-L3470 | 08:04 |
| gibi | tonyb: yepp that server event seems like the one happening if nova detects that the VM is in stopped state on the hypervisor | 08:04 |
| gibi | but active in the DB | 08:04 |
| gibi | and aligns the DB | 08:04 |
| tonyb | gibi: Thanks | 08:05 |
| gibi | note that the same would happen if somebody logs into the VM and shuts it down from within the guest | 08:06 |
| gibi | nova does not really know why the VM is suddenly stopped on the hypervisor | 08:06 |
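For illustration, a minimal sketch of the sync behaviour gibi describes above. All names are simplified stand-ins, not nova's actual internals (the real logic is in the nova/compute/manager.py code linked earlier):

```python
import enum


class PowerState(enum.Enum):
    RUNNING = enum.auto()
    SHUTDOWN = enum.auto()


def sync_power_state(db_state: PowerState, hv_state: PowerState, stop_vm) -> None:
    """Periodic task: align nova's DB with what the hypervisor reports."""
    if db_state == hv_state:
        return  # views agree, nothing to do
    if hv_state == PowerState.SHUTDOWN:
        # The domain stopped behind nova's back (OOM kill, in-guest
        # shutdown, ...). Nova cannot tell why, so it calls the normal
        # stop API, which cleans up properly and records a server
        # event/action in the DB -- the event tonyb pasted above.
        stop_vm()
```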
| tonyb | Ah okay. I can check if that was it but given it's review.o.o I doubt it | 08:06 |
| tonyb | Yup. I was just wanting to verify that the event "mostly agrees" with the theory that the OOM-killer fired | 08:07 |
| gibi | OK | 08:09 |
| gibi | sounds like it agrees | 08:09 |
| tonyb | gibi: perfect thanks | 08:36 |
| *** bogdando_ is now known as bogdando | 09:08 | |
| LarsErikP | Hi. I reported this, after talking to melwitt back in september. Could anyone take a look? https://bugs.launchpad.net/nova/+bug/2125730 | 09:26 |
| opendevreview | Balazs Gibizer proposed openstack/nova master: Libvirt event handling without eventlet https://review.opendev.org/c/openstack/nova/+/965949 | 09:46 |
| opendevreview | Balazs Gibizer proposed openstack/nova master: Run nova-compute in native threading mode https://review.opendev.org/c/openstack/nova/+/965467 | 09:46 |
| opendevreview | Balazs Gibizer proposed openstack/nova master: Compute manager to use native thread pools https://review.opendev.org/c/openstack/nova/+/966016 | 09:46 |
| opendevreview | Balazs Gibizer proposed openstack/nova master: Run nova-compute in native threading mode https://review.opendev.org/c/openstack/nova/+/965467 | 09:49 |
| *** sambork_ is now known as sambork | 12:11 | |
| sean-k-mooney | LarsErikP: I think dansmith fixed that but I don't know in which release | 12:43 |
| sean-k-mooney | my guess is this is because nova in caracal does not have the fix | 12:43 |
| sean-k-mooney | caracal is now unmaintained so this bug is likely not valid from a core team point of view, but the fix might be backportable | 12:44 |
| sean-k-mooney | assuming the change we merged for master resolves the issue | 12:45 |
| sean-k-mooney | https://review.opendev.org/c/openstack/nova/+/930754 | 12:46 |
| sean-k-mooney | that is in epoxy but not in older releases | 12:46 |
| LarsErikP | sean-k-mooney: ah ok. that makes sense. thanks | 13:16 |
| jlejeune | hello guys, can I have some reviews please for my topics: https://review.opendev.org/q/topic:%22bug/2085135%22 and https://review.opendev.org/q/topic:%22bug-2044235%22 ? | 13:22 |
| jlejeune | just a matter of backports | 13:22 |
| opendevreview | Balazs Gibizer proposed openstack/nova master: Libvirt event handling without eventlet https://review.opendev.org/c/openstack/nova/+/965949 | 14:25 |
| opendevreview | Balazs Gibizer proposed openstack/nova master: Run nova-compute in native threading mode https://review.opendev.org/c/openstack/nova/+/965467 | 14:25 |
| dansmith | bauzas: in case you didn't see the mention, can you see my proposal here please? https://review.opendev.org/c/openstack/nova/+/941795/comments/fc49c07d_c06afef1 | 14:27 |
| bauzas | dansmith: nope sorry, working hard on something else, f*** agile :) | 14:51 |
| bauzas | dansmith: I definitely want to review again the vtpm series, will do it hopefully today | 14:51 |
| dansmith | bauzas: alright, well, that link is to a specific proposal for merging the bottom part now, which I think you've mostly reviewed already | 14:52 |
| gibi | dansmith: thanks for the review on switching the default to native threading. Did you miss that it has two pre-req patches in the series? | 14:54 |
| dansmith | gibi: oh, no I saw it, but I got distracted before I followed that down, sorry about that | 14:55 |
| gibi | no worries. Take your time :) | 14:55 |
| dansmith | I had skimmed them looking to see if the api/__init__ was being added in this series or was already in place | 14:57 |
| gibi | it was in place. Actually I needed to improve and move from from cmd/__init__ to cmd/<entrypoint2025-11-03 14:29:01.144038 | ubuntu-noble | oslo_service.backend.exceptions.BackendAlreadySelected: Backend already set to 'eventlet', cannot reinitialize with 'threading'> | 14:58 |
| gibi | oops | 14:59 |
| gibi | copy-paste buffer grrr | 14:59 |
| gibi | it was in place. Actually I needed to improve and move from from cmd/__init__ to cmd/<entrypoint>.py | 14:59 |
| gibi | so I can make the default different per entry point | 14:59 |
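A sketch of what that per-entry-point selection could look like, assuming the oslo.service backend API implied by the BackendAlreadySelected traceback pasted above (the module path and choice of backend are illustrative):

```python
# nova/cmd/<entrypoint>.py (illustrative path): select the backend before
# importing anything that creates threading/socket primitives.
from oslo_service import backend

# A second, conflicting selection raises BackendAlreadySelected, which is
# how a mismatched default in another entry point gets caught.
backend.init_backend(backend.BackendType.THREADING)
```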
| dansmith | you're planning to do that for api? | 15:00 |
| gibi | but the wsgi entry point is different from the cmd one | 15:00 |
| dansmith | right, but we _could_ have an entry point somewhere else for api/wsgi that does the patching before it imports... | 15:00 |
| gibi | as both wsgi services we have are switching defaults at the same time, I did not need to touch the api part | 15:00 |
| dansmith | yeah, I'm more concerned about just something else importing something from api to get a constant or something and screwing up | 15:01 |
| gibi | I cannot just delay the monkey_patch call in the api wsgi, as other imports might start using primitives that should have been monkey patched | 15:04 |
| gibi | if another service imports from api then we assume that the other service already went through monkey patching. And oslo.service does not allow changing the backend type once it is set, so we will catch it | 15:05 |
| dansmith | you can if you control the order of the importing in the entry point | 15:05 |
| gibi | when the second monkey patch call tries to change the backend due to a differing default | 15:05 |
| gibi | dansmith: that is what we do, we added the monkey patch as early as possible in the import list | 15:06 |
| gibi | that is our control | 15:06 |
| dansmith | I'm not sure if you're not understanding what I'm saying, or don't think it's possible, or think it's a bad idea.. either way, I'm not arguing...just commenting about my opinion | 15:07 |
| gibi | import wsgi from api first executes __init__, so we need to be in __init__ if __init__ has other imports like oslo.log that create a lock | 15:07 |
| gibi | I guess I'm missing what you would like to see in the case of api/__init__ | 15:08 |
| gibi | s/import wsgi from api/from api import wsgi/ | 15:08 |
| gibi | I think it is not possible to move the monkey_patch to api/wsgi from api/__init__ | 15:10 |
| gibi | as api/__init__ has other imports | 15:10 |
| sean-k-mooney | you could monkey patch before the other import | 15:11 |
| dansmith | this ^ | 15:12 |
| sean-k-mooney | you might need to add a noqa or similar comment | 15:12 |
| dansmith | but I don't want to argue | 15:12 |
| sean-k-mooney | but we can if we need to | 15:12 |
| gibi | but I monkey patch before all the imports | 15:12 |
| gibi | :) | 15:12 |
| sean-k-mooney | I have not been following the conversation, by the way | 15:13 |
| gibi | the first line of nova/api/openstack/__init__.py is the monkey patch | 15:13 |
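For illustration, the ordering constraint being discussed, shown with eventlet-style patching (a sketch, not the actual contents of nova/api/openstack/__init__.py):

```python
# The patch must be the very first statement in the package __init__,
# before any other import can grab an unpatched stdlib primitive.
import eventlet

eventlet.monkey_patch()

# Only now is it safe to import modules that create locks at import time,
# e.g. oslo.log as gibi mentioned above.
from oslo_log import log as logging  # noqa: E402
```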
| gibi | I think dansmith would like that to be moved somewhere else | 15:13 |
| sean-k-mooney | ah yes, I remember discussing the #noqa comment for that | 15:13 |
| gibi | at the moment I'm not sure where to move our nova.api.openstack.compute.wsgi:init_application to be able to move the monkey_patch | 15:14 |
| sean-k-mooney | so ya https://github.com/openstack/nova/blob/master/nova/api/__init__.py and https://github.com/openstack/nova/blob/master/nova/__init__.py are empty today | 15:14 |
| sean-k-mooney | well the entry point to that is https://github.com/openstack/nova/blob/master/nova/wsgi/osapi_compute.py | 15:15 |
| sean-k-mooney | across openstack we are meant to have <package>.wsgi.<api endpoint> | 15:15 |
| sean-k-mooney | so you could monkey patch here https://github.com/openstack/nova/blob/master/nova/wsgi/__init__.py but I'm not sure it is required | 15:16 |
| gibi | hm but our setup.cfg points to nova-api-wsgi = nova.api.openstack.compute.wsgi:init_application | 15:16 |
| sean-k-mooney | that is the wsgi_scripts entry | 15:16 |
| sean-k-mooney | but that is being deleted | 15:16 |
| sean-k-mooney | it will generate the same script effectively | 15:17 |
| sean-k-mooney | we agreed at the ptg to finally delete https://github.com/openstack/nova/blob/master/setup.cfg#L87-L89 | 15:17 |
| gibi | but until that setup.cfg entry goes away I need to make sure that works. So today I cannot move the monkey patch to nova/wsgi because that will not be executed if the setup.cfg-defined entry point is used | 15:17 |
| gibi | I agree that once the setup.cfg does not have such entry point defined and only the nova/wsgi is used then I can move the monkey_patch | 15:18 |
| sean-k-mooney | right but let's just find and merge stephenfin's patch | 15:18 |
| sean-k-mooney | then you can move it if needed | 15:18 |
| gibi | OK | 15:18 |
| sean-k-mooney | https://review.opendev.org/c/openstack/nova/+/902688 | 15:19 |
| sean-k-mooney | he updated the docs too | 15:19 |
| sean-k-mooney | so it's a slightly longer review than I was planning but I'll re-review that today/tomorrow | 15:20 |
| gibi | so that patch suggests the deployers do the right thing and use nova/wsgi | 15:20 |
| sean-k-mooney | yep which is how devstack has worked for 2 releases now | 15:21 |
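A sketch of the module-style WSGI entry point being discussed (nova/wsgi/osapi_compute.py is the real file sean-k-mooney links; this body is illustrative, showing only the patch-before-import ordering and reusing the assumed oslo.service backend API from earlier):

```python
# Illustrative nova/wsgi/<service>.py-style module: pick the backend
# first, then import and build the WSGI application.
from oslo_service import backend

backend.init_backend(backend.BackendType.THREADING)

# Safe to import the rest of nova only after the backend is selected.
from nova.api.openstack.compute import wsgi  # noqa: E402

# WSGI servers (uwsgi, mod_wsgi, ...) look up this module-level object.
application = wsgi.init_application()
```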
| gibi | I need to respin my defaulting series anyhow so I can rebase on top of stephenfin and move the monkey patch accordingly | 15:25 |
| stephenfin | sean-k-mooney: if you have time, would you be able to continue and grab the other 3 patches on top of that one? https://review.opendev.org/c/openstack/nova/+/953703 | 15:37 |
| sean-k-mooney | my 1:1 just finished so sure. | 16:00 |
| gibi | stephenfin: could you quickly fix the versioning issue in https://review.opendev.org/c/openstack/nova/+/902688/9#message-72481479904c69ce6f3adbcdfb26143c6f85756a then we can land that | 16:10 |
| sean-k-mooney | gibi: ah the versionchanged line | 16:13 |
| sean-k-mooney | ya that should be 33.0.0 | 16:13 |
| opendevreview | Stephen Finucane proposed openstack/nova master: setup: Remove pbr's wsgi_scripts https://review.opendev.org/c/openstack/nova/+/902688 | 16:14 |
| opendevreview | Stephen Finucane proposed openstack/nova master: Migrate mypy configuration to pyproject.toml https://review.opendev.org/c/openstack/nova/+/953703 | 16:14 |
| opendevreview | Stephen Finucane proposed openstack/nova master: Migrate codespell configuration to pyproject.toml https://review.opendev.org/c/openstack/nova/+/953704 | 16:14 |
| opendevreview | Stephen Finucane proposed openstack/nova master: Migrate setup configuration to pyproject.toml https://review.opendev.org/c/openstack/nova/+/953705 | 16:14 |
| opendevreview | Stephen Finucane proposed openstack/nova master: pre-commit: Bump versions https://review.opendev.org/c/openstack/nova/+/966089 | 16:14 |
| stephenfin | gibi: sean-k-mooney: done | 16:14 |
| sean-k-mooney | stephenfin: did you see https://review.opendev.org/c/openstack/nova/+/953704/4/pyproject.toml | 16:14 |
| sean-k-mooney | I'm fine with not changing it, just noted it was false before | 16:15 |
| gibi | stephenfin: thanks | 16:15 |
| sean-k-mooney | as in I'm fine with the patch as is, just making sure you saw it | 16:15 |
| stephenfin | sean-k-mooney: replied on both | 16:18 |
| sean-k-mooney | thanks | 16:20 |
| Zhan[m] | hi friends, I'm trying to work on some improvements for pre-copy live migrations - to reduce VMs' network connectivity loss time. I put my ideas and research in https://bugs.launchpad.net/nova/+bug/2128665 and would love some feedback before I start coding and making changes. does anyone mind taking a look? I can create a blueprint if it's needed. thanks!! | 16:29 |
| sean-k-mooney | Zhan[m]: you are effectively asking to trigger post-live-migration very slightly earlier | 16:45 |
| sean-k-mooney | when pre-copy triggers a pause of the guest. | 16:45 |
| sean-k-mooney | that only happens if you have set force-complete as the timeout action or if you manually do that | 16:45 |
| sean-k-mooney | we do this for post-copy migration already when it enters the post-copy phase | 16:46 |
| sean-k-mooney | I'm not sure how safe it would be to do that in this case vs the current migration completion event | 16:46 |
| sean-k-mooney | Zhan[m]: this likely needs more than a bug, i.e. perhaps a spec | 16:46 |
| sean-k-mooney | or at least more consideration than a quick review of your proposal, to think through the implications for upgrades and the different failure modes | 16:47 |
| Zhan[m] | understood, I will go ahead and create a spec. wanna get some early feedback to see if this idea makes sense or not. | 16:49 |
| Zhan[m] | we are not using post-copy (I know that we'd avoid this with post-copy) so this is just for pre-copy | 16:49 |
| sean-k-mooney | we might be able to use a slightly earlier event | 16:49 |
| Zhan[m] | from my understanding, the paused event should be the earliest? (basically when the process is paused for switch-over) | 16:50 |
| sean-k-mooney | yes but we need to think about the recovery cases if there is a failure after pause | 16:51 |
| sean-k-mooney | we currently do it in complete because we know we can't roll back at that point | 16:51 |
| sean-k-mooney | Zhan[m]: just to confirm, you are currently using pre-copy not post-copy | 16:52 |
| sean-k-mooney | is there a reason you are not using post-copy? | 16:52 |
| Zhan[m] | > if there si a failure after pause... (full message at <https://matrix.org/oftc/media/v1/media/download/AQcXzaU3yYgbieAMs5LdQxhfkMQjWLQ8n98jM_lIB8gwWH9ApwZ_vdawDM9LNz0brOl_UevcPCkv3f-67_yYcUZCeal-T3mAAG1hdHJpeC5vcmcvRFZtSG9LWGxZaFNCZ0hrYnpXbVh0WnNi>) | 16:55 |
| Zhan[m] | I think this change can help regardless, maybe in some environments, pre-copy migrations are preferred. | 16:58 |
| Zhan[m] | We've also seen some crazy wait time (20+ seconds) between when the VM is paused and when libvirt says the migration is done (the completion event), so that's where our motivation comes from. | 17:05 |
| gibi | I saw working live migration with native threaded compute \o/ | 17:48 |
| gibi | https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_071/openstack/07142d96394445fc9d4d5ad58b963223/testr_results.html | 17:48 |
| sean-k-mooney | gibi: nice | 18:12 |
| sean-k-mooney | gibi: do you know what is causing the volume issues | 18:14 |
| gibi | not yet | 18:14 |
| sean-k-mooney | well it's less broken than it could have been | 18:14 |
| gibi | I'm still not convinced that the lifecycle events are actually working between libvirt and nova in the gate. Locally it is working for me so I need to look more tomorrow | 18:14 |
| sean-k-mooney | I assume this is lvm? | 18:14 |
| sean-k-mooney | ok ya it is cinder lvm | 18:15 |
| gibi | yeah it is not ceph I'm sure | 18:16 |
| gibi | but I drop now. Continue from here tomorrow | 18:17 |
| sean-k-mooney | gibi: looks like there are a bunch of multipathd errors but I can't tell at a glance if that's brick being brick or actually important | 18:17 |
| sean-k-mooney | gibi: when we stop monkeypatching nova-compute | 18:19 |
| sean-k-mooney | we are also unmonkeypatching os-brick | 18:19 |
| sean-k-mooney | I hope that is not related | 18:19 |
| sean-k-mooney | I do see that some of the iscsiadm commands failed so hopefully it's an unrelated ci failure and a recheck will just work | 18:20 |
| sean-k-mooney | most of the time the iscsiadm discovery is working so it could be load/timing related | 18:20 |
| sean-k-mooney | Zhan[m]: multi-line matrix messages are hard to read on the irc side as we only get the first line and a link to the rest of the message | 18:49 |
| sean-k-mooney | Zhan[m]: but no, I think the issue you are hitting is specific to pre-copy migration | 18:49 |
| Zhan[m] | ah apologies, first time using this | 18:50 |
| sean-k-mooney | thats ok | 18:50 |
| sean-k-mooney | we also get a new message every time you edit a message | 18:50 |
| Zhan[m] | yes this is specific to pre-copy. When I was reading https://docs.openstack.org/nova/2024.1/configuration/config.html#libvirt.live_migration_permit_post_copy, I think it's saying that post-copy won't be used until we hit timeout right? | 18:51 |
| sean-k-mooney | no, post-copy will be enabled by libvirt without nova's intervention or control | 18:52 |
| sean-k-mooney | the way post-copy works is qemu does an initial pre-copy phase while the instance is running | 18:52 |
| sean-k-mooney | it then internally decides if it is making enough forward progress to proceed with a standard pre-copy migration | 18:53 |
| sean-k-mooney | if not, after the 2nd pre-copy pass completes it will swap to post-copy mode where the vm will be resumed on the dest | 18:54 |
| sean-k-mooney | we trigger off the suspend call in the post-copy case | 18:54 |
| sean-k-mooney | Zhan[m]: effectively what nova does is tell qemu it is allowed to use post-copy, and it makes the decision of when to do it | 18:56 |
| Zhan[m] | thanks, that's interesting to know, I was thinking we are controlling this with Nova when reading the config | 18:56 |
| sean-k-mooney | the config option just tells nova to pass the flag to libvirt to give it permission to use post-copy | 18:57 |
| Zhan[m] | but coming back to this, do you think this is something that can be improved for pre-copy? | 18:58 |
| Zhan[m] | I'll definitely play around with post-copy too. | 18:58 |
| sean-k-mooney | I'm unsure. https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/neutron-new-port-binding-api.html is the spec where we defined the current semantics for port activation during live migration | 18:59 |
| sean-k-mooney | if we change when we do the activation (it may be ok) we need to review the libvirt documentation and the lifecycle of migration events | 19:00 |
| Zhan[m] | yes I will list relevant events, job stats, and related stuff in the spec, as well as any edge cases I can think of. | 19:02 |
| sean-k-mooney | Zhan[m]: today the only way to do the activation sooner is to do post_live_migration sooner which may or may not be safe | 19:02 |
| sean-k-mooney | failing that, we need to register a specific additional event handler just for the port activation case | 19:02 |
| sean-k-mooney | that should be possible, but post_live_migration would have to handle the fact that the binding is activated earlier, and the cleanup code for error/abort would too | 19:03 |
| Zhan[m] | Currently, I'm thinking we just do the networking part - activate port bindings earlier like what's done previously | 19:03 |
| sean-k-mooney | well, when we activate it on the dest, network backends are meant to start routing traffic to the dest as well | 19:03 |
| sean-k-mooney | in the past with ovn that would have resulted in packets received on the source host being dropped | 19:04 |
| sean-k-mooney | today we replicate the packets to the inactive port binding specified in the migrating_to field in the port binding profile | 19:04 |
| Zhan[m] | I'm thinking of adding additional handling to the rollback function in case the live migration fails later. | 19:04 |
| Zhan[m] | right, we're not using ovn so that might be something to consider too... | 19:05 |
| sean-k-mooney | ya so this is why I'm saying we might need a spec for this | 19:05 |
| sean-k-mooney | because whatever we do has to work correctly for all neutron network backends | 19:05 |
| Zhan[m] | no, I'm not objecting to writing a spec, I'll work on that. I'm just asking how we feel about this (like if it's worth a try at all) | 19:06 |
| sean-k-mooney | by the way, to be clear, VIR_DOMAIN_JOB_COMPLETED is what triggers post live migration when doing pre-copy today | 19:07 |
| Zhan[m] | yeah indeed, lemme get that spec out, as it seems possible as long as we handle the cases correctly | 19:07 |
| Zhan[m] | yeah, but what I found is that the query of job status will hang when libvirt is cleaning up the domain | 19:07 |
| sean-k-mooney | ya so you're suggesting using VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED, right | 19:08 |
| Zhan[m] | so even though it is saying "VIR_DOMAIN_JOB_COMPLETED -> job finished, but cleanup is not finished", it is very likely that the queryJobInfo by nova will be blocked until the cleanup is done | 19:08 |
| sean-k-mooney | similar to VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY | 19:08 |
| Zhan[m] | yes | 19:08 |
| sean-k-mooney | so VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED would be sent when the vm is paused in pre-copy or when you do force-complete via the api | 19:08 |
| sean-k-mooney | or the migration timeout action does force-complete | 19:09 |
| sean-k-mooney | what we would need to confirm is that VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED is not emitted if auto-converge is enabled | 19:09 |
| sean-k-mooney | if you set live_migration_permit_auto_converge | 19:10 |
| Zhan[m] | why not? auto converge is just to make sure that the migration will eventually finish right? | 19:10 |
| sean-k-mooney | it allows qemu to pause the vcpu temporarily | 19:10 |
| sean-k-mooney | to allow pre-copy to make progress | 19:10 |
| Zhan[m] | we have auto converge on and I do see VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED gets generated | 19:10 |
| sean-k-mooney | right but my point is it's only safe to use VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED | 19:11 |
| Zhan[m] | the VM needs to be paused before the switch-over, regardless of converge or not, right? | 19:11 |
| sean-k-mooney | if it's only sent once | 19:11 |
| sean-k-mooney | when we are never going to resume on this host again | 19:11 |
| Zhan[m] | good point, I believe it is only sent once, I'll double check that | 19:11 |
| sean-k-mooney | if VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED is a point of no return (we will never resume on the source after that) then | 19:12 |
| sean-k-mooney | I think we can move to using that | 19:12 |
| sean-k-mooney | but we will obviously need to test that | 19:13 |
| Zhan[m] | but even if it's resumed later (likely because the migration failed), libvirt will report an error and we can still roll back the port bindings to the source? | 19:13 |
| sean-k-mooney | Zhan[m]: since you're using calico you're probably also interested in https://bugs.launchpad.net/nova/+bug/2033681 | 19:14 |
| sean-k-mooney | Zhan[m]: not if we have invoked post-live-migration | 19:14 |
| sean-k-mooney | which was my point before: if we are not adding a special additional handler | 19:14 |
| sean-k-mooney | and just calling post-live-migration like we do for post-copy | 19:14 |
| sean-k-mooney | then we need to be 100% sure it can't roll back and that it's safe to start cleaning up the source on the nova side | 19:15 |
| sean-k-mooney | if we only do the port binding activation it's probably ok | 19:15 |
| sean-k-mooney | what we might do is move the port binding activation logic to its own method and have that trigger on either VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY or VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED | 19:16 |
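A sketch of that unified trigger: the two libvirt suspend-detail constants are real, but the handler and callback names are hypothetical, and whether VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED is truly a point of no return is exactly the open question above:

```python
import libvirt

# Suspend details after which (we assume) the guest is never resumed on
# the source, so dest port bindings could be activated early.
EARLY_ACTIVATION_DETAILS = (
    libvirt.VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED,   # pre-copy switch-over pause
    libvirt.VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY,   # post-copy phase entered
)


def handle_suspended(detail: int, activate_dest_port_bindings) -> None:
    """Hypothetical handler: one code path for pre- and post-copy."""
    if detail in EARLY_ACTIVATION_DETAILS:
        # Do not wait for VIR_DOMAIN_JOB_COMPLETED (and the slow source
        # qemu cleanup): start routing traffic to the destination now.
        activate_dest_port_bindings()
```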
| Zhan[m] | If we have invoked post live migration then it must mean the migration succeeded, so we are ok to move everything to dest | 19:16 |
| Zhan[m] | right, that's what I'm thinking | 19:16 |
| Zhan[m] | I do have some code in hand and will be starting testing while writing the spec, I can link it there | 19:17 |
| sean-k-mooney | https://docs.openstack.org/nova/latest/reference/live-migration.html | 19:19 |
| Zhan[m] | ok I think we are on the same page, lemme write that spec and put details there | 19:19 |
| sean-k-mooney | the main concern I have is you are changing the current workflow | 19:19 |
| sean-k-mooney | by adding another phase for network activation | 19:19 |
| Zhan[m] | yes, because before, for precopy, we only activate bindings when we see the migration is good | 19:19 |
| sean-k-mooney | well, when it's complete | 19:20 |
| Zhan[m] | complete and good (Ig :P) | 19:20 |
| sean-k-mooney | the libvirt part where we block should typically be very very short | 19:20 |
| sean-k-mooney | unless you are actively copying memory for the guest | 19:21 |
| Zhan[m] | no... in our environment we saw some crazy numbers like 20s+ | 19:21 |
| Zhan[m] | or even a minute | 19:21 |
| Zhan[m] | between the PAUSED event and the job completed event | 19:21 |
| Zhan[m] | because libvirt waits for the source QEMU process to be cleaned up before saying job is completed | 19:21 |
| sean-k-mooney | Zhan[m]: https://docs.openstack.org/nova/latest/configuration/config.html#workarounds.enable_qemu_monitor_announce_self | 19:21 |
| sean-k-mooney | by the way do you enable ^ | 19:22 |
| sean-k-mooney | Zhan[m]: right, but that should only happen if qemu is copying memory | 19:22 |
| sean-k-mooney | I guess the qemu instance cleanup could take a long time if it's waiting for io to complete | 19:22 |
| sean-k-mooney | but libvirt should not unpause the dest instance until the source is stopped. | 19:23 |
| Zhan[m] | what I found was that the kernel was spinning to clean up the memory resource | 19:23 |
| Zhan[m] | which took quite a lot of time | 19:23 |
| Zhan[m] | I dont think we enabled that, lemme take a look | 19:24 |
| Zhan[m] | the memory resource -> the memory space that the VM uses, not by migration. if the VM is using a huge number of pages then the cleanup can take longer. | 19:24 |
| sean-k-mooney | Zhan[m]: does the guest use hugepages or anonymous 4k pages, out of interest | 19:24 |
| sean-k-mooney | I'm just wondering why that takes so long in your case | 19:25 |
| Zhan[m] | we use THP, but we figured that KSM and this bug(https://lore.kernel.org/all/20240626191830.3819324-1-yang@os.amperecomputing.com/) was causing it to not use hugepages as expected. | 19:25 |
| sean-k-mooney | ah ok | 19:26 |
| sean-k-mooney | THP is awesome if it works and terrible otherwise | 19:26 |
| sean-k-mooney | especially when combined with ksm | 19:26 |
| sean-k-mooney | it causes a lot of non-determinism | 19:26 |
| Zhan[m] | exactly lol, but yeah, that's why we also wanna tweak nova to just skip the cleanup | 19:27 |
| sean-k-mooney | hmm, I was not aware of that bug but reading it that's definitely going to slow things down | 19:28 |
| sean-k-mooney | is that fixed upstream? | 19:29 |
| Zhan[m] | yes, what QEMU does in migration is that it will read the page first and then write | 19:29 |
| Zhan[m] | it's fixed in 6.13 IIRC | 19:29 |
| Zhan[m] | so all of the following writes done by the VM will be regular pages | 19:29 |
| Zhan[m] | which is painful.... | 19:29 |
| sean-k-mooney | ah ok | 19:30 |
| Zhan[m] | https://github.com/torvalds/linux/commit/5c00ff742bf5caf85f60e1c73999f99376fb865d yeah 6.13 | 19:30 |
| sean-k-mooney | 6.13 is reasonably new so that's probably not in a lot of distro kernels | 19:30 |
| sean-k-mooney | ubuntu has 6.14 in the HWE kernels but it defaults to 6.11 for 24.04 | 19:31 |
| sean-k-mooney | rhel 10 is 6.12 + backports | 19:31 |
| Zhan[m] | yeah but anyways, nova doesn't really need to wait for the cleanup to be done. | 19:31 |
| Zhan[m] | if it's doing cleanup, then it means the VM has been resumed on dest | 19:32 |
| sean-k-mooney | ya | 19:32 |
| Zhan[m] | which we can start at least setting up the network | 19:32 |
| sean-k-mooney | if we can trigger from that point we likely can just go directly to post-live-migrate | 19:32 |
| sean-k-mooney | but if we can't reliably trigger on the cleanup starting then we can explore just doing the network activation | 19:33 |
| Zhan[m] | yeah just the network I feel is fine, everything else can be pushed back | 19:34 |
| Zhan[m] | what VM users ultimately care about is just network connectivity and if the VM is running or not | 19:34 |
| Zhan[m] | which in that case the VM is running so we should setup the network | 19:34 |
| sean-k-mooney | what I would suggest is pushing a patch to hack https://github.com/openstack/nova/blob/master/nova/virt/libvirt/host.py#L274-L321 | 19:36 |
| sean-k-mooney | oh | 19:36 |
| sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/virt/libvirt/host.py#L295-L298 | 19:36 |
| Zhan[m] | right, I'm thinking if the migration fails, we rollback the port bindings back to source | 19:36 |
| sean-k-mooney | Zhan[m]: so based on that comment we can't rely on that only being sent on success | 19:37 |
| sean-k-mooney | which means we have to call guest.get_job_info() | 19:37 |
| Zhan[m] | right but that get_job_info will hang until cleanup is done | 19:37 |
| sean-k-mooney | right | 19:37 |
| sean-k-mooney | but we can't just assume it's ok to activate the network on the dest based on VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED | 19:38 |
| Zhan[m] | right, but if the migration fails in the end, then we can just roll back the port binding change to the source | 19:38 |
| sean-k-mooney | no, because if the migration failed that does not mean the guest crashed | 19:39 |
| sean-k-mooney | it can unpause on the source | 19:39 |
| sean-k-mooney | so we don't want to have to roll back the port binding in that case | 19:39 |
| Zhan[m] | hmm, why can't we? | 19:40 |
| sean-k-mooney | I was going to suggest modifying this to just trigger post-live-migration and see what happens in ci | 19:41 |
| sean-k-mooney | https://github.com/openstack/nova/commit/aa87b9c288d316b85079e681e0df24354ec1912c | 19:41 |
| sean-k-mooney | well because of https://bugs.launchpad.net/nova/+bug/1788014 | 19:42 |
| Zhan[m] | right that is what I put in that bug report | 19:42 |
| Zhan[m] | the question is "As a result, failed live migrations will inadvertantly trigger activation of the port bindings on the destination host, which deactivates the source host port bindings, and then _rollback_live_migration will delete those activated dest host port bindings and leave the source host port bindings deactivated." | 19:43 |
| Zhan[m] | so let's just rollback the port binding changes in _rollback_live_migration? | 19:43 |
| Zhan[m] | for failure cases like this. | 19:43 |
| sean-k-mooney | if the migration failure does not kill the vm and it resumes, activating the port binding on the dest will deactivate them on the source and some network backends will stop delivering packets to the source | 19:44 |
| sean-k-mooney | so we would be adding network downtime in the failure case that did not exist before | 19:44 |
| sean-k-mooney | however that may or may not be acceptable | 19:44 |
| sean-k-mooney | I'm going to quickly try hacking something to see how it behaves, one sec | 19:46 |
| Zhan[m] | I tested this (rollback when failure) in my environment, it works properly, like I see port binding gets updated on dest, then I wrote a function to roll back (basically delete, create, and re-activate bindings on source), then the VM regained network connectivity and is still on source. | 19:49 |
| Zhan[m] | yes we would add more downtime for failure cases, but I would think that failure doesn't really happen that often (correct me if I'm wrong), so we would be trading downtime in something that happens less frequently for benefits in something that happens more frequently (the migration success case). | 19:50 |
| opendevreview | sean mooney proposed openstack/nova master: [DNM] testing early activation https://review.opendev.org/c/openstack/nova/+/966106 | 19:52 |
| sean-k-mooney | Zhan[m]: in the happy path I think that is effectively what you want to do | 19:53 |
| Zhan[m] | yes | 19:53 |
| Zhan[m] | the only concern I think is the unhappy path :) | 19:54 |
| sean-k-mooney | ya, so the reason I'm debating this in my head is this is already complicated and I'm not sure we want to make it more complicated by adding another event handler | 19:56 |
| Zhan[m] | To ease the process a little bit, so basically: 1. activate port bindings when paused 2. If migration ok later, all is well 3. If migration not ok later, delete, recreate, reactivate the port binding on the source side | 19:57 |
| sean-k-mooney | well you just activate them again and delete the dest ones | 19:58 |
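A control-flow sketch of those three steps plus the correction above; every helper here is a hypothetical placeholder standing in for the real nova/neutron plumbing, and step 3 is the added-downtime trade-off discussed earlier:

```python
def live_migrate_with_early_activation(wait_for_pause, wait_for_completion,
                                       activate_dest_bindings,
                                       activate_source_bindings,
                                       delete_dest_bindings):
    """Hypothetical flow: callers inject the real event/API plumbing."""
    wait_for_pause()              # e.g. VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED
    activate_dest_bindings()      # 1. activate on the dest at switch-over
    try:
        wait_for_completion()     # 2. happy path: VIR_DOMAIN_JOB_COMPLETED
    except Exception:
        # 3. unhappy path: reactivate the source binding and delete the
        # dest one. This adds network downtime on failure that does not
        # exist in today's workflow.
        activate_source_bindings()
        delete_dest_bindings()
        raise
```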
| Zhan[m] | It is complicated :( I spent quite some time digging through all this too | 19:58 |
| sean-k-mooney | but you also have to cater for neutron backends that don't support multiple port bindings | 19:58 |
| sean-k-mooney | because we have not made those unsupported yet | 19:58 |
| Zhan[m] | I'm not sure on that, I got exceptions when I tried to re-enable the source bindings and I saw some fields were missing in the DB, so that's why I recreated the bindings, but this is not important, the idea is the same. | 19:59 |
| sean-k-mooney | there are some network backends that don't support the multiple port binding extension and in those cases we only have one and we just update the binding:host_id field | 19:59 |
| sean-k-mooney | I guess it depends on whether they are deleted or not | 20:00 |
| Zhan[m] | right, if the backend doesn't support it, I think the port binding extension will not be used, right? | 20:00 |
| sean-k-mooney | but ya, we have to recreate them if they are. I thought activating them only deactivated the source one but didn't delete them | 20:00 |
| Zhan[m] | yeah in that case migrate_instance_start won't do anything, so | 20:00 |
| Zhan[m] | they are not deleted, but as far as what I'm seeing in the DB, some metadata is deleted, so hmm, not very sure on that | 20:01 |
| sean-k-mooney | so what I would like is, if we do have a dedicated port binding activation phase, we use it for both pre- and post-copy | 20:01 |
| sean-k-mooney | that is how it works today | 20:01 |
| sean-k-mooney | what I don't want to have is 1 way to activate them for pre-copy, 1 way for post-copy and 1 way for backends that don't support it | 20:02 |
| Zhan[m] | ok, so like extracting it out of the post live migration and event handler functions? | 20:02 |
| sean-k-mooney | right | 20:02 |
| sean-k-mooney | we would pull out the networking bits and do them always in a new function that would run before post_live_migration, and move post_live_migration to happen always on the migration complete event | 20:03 |
| Zhan[m] | I see, let's put that maybe as a stage 2 thing? I'll work on this first and see how we refactor this process later | 20:03 |
| sean-k-mooney | well, we would want to have both before merging any code | 20:04 |
| Zhan[m] | obviously the spec will come first | 20:04 |
| sean-k-mooney | but ya, spec first to figure out exactly how it should look | 20:04 |
| sean-k-mooney | is this something you think you would have time to work on this cycle? | 20:04 |
| Zhan[m] | ok let's start with the spec first, thanks for looking into this too!! | 20:04 |
| Zhan[m] | I'll see and get back to you on this. | 20:05 |
| sean-k-mooney | ack no worries | 20:05 |
| Zhan[m] | Ok, so I'll be working on the spec now and submit it soonish, but the code (the refining part) may come later in Q1, considering we may have some back and forth on the spec. | 20:21 |
| Zhan[m] | I'll put the goals of the spec as 1. this change 2. putting the networking bits in the right place. | 20:21 |
| sean-k-mooney | Zhan[m]: we have selected December 4th or 8th for spec freeze | 20:26 |
| sean-k-mooney | with feature freeze around the middle of February | 20:26 |
| sean-k-mooney | just for context so you are not surprised | 20:27 |
| sean-k-mooney | normally spec freeze would be around January 8th | 20:27 |
| Zhan[m] | understood, thanks for the heads up. In this case I think I'll submit the spec before Dec 4 and I can open a PR in Q1 but it can definitely wait until the freeze is over. | 20:34 |
| sean-k-mooney | well, it's better to submit it early, not late, as if it comes in mid to late Q1 it will be unlikely to land this cycle | 20:44 |
| *** mnaser[m] is now known as mnaser | 20:46 | |
| sean-k-mooney | the official release is Wednesday, April 1, 2026. feature freeze is February 26, 2026 | 20:46 |
| sean-k-mooney | so if you only put the patch up for review in late January or early February we likely won't have time to review it | 20:46 |
| Zhan[m] | Ok, I'll see if I can make it there, if not, what will be the next cycle? | 20:50 |