Thursday, 2026-02-19

opendevreviewGhanshyam Maan proposed openstack/nova master: Use 2nd RPC server in compute operations  https://review.opendev.org/c/openstack/nova/+/97558802:57
opendevreviewGhanshyam Maan proposed openstack/nova master: Prepare resize/cold migration for graceful shutdown  https://review.opendev.org/c/openstack/nova/+/97718204:34
opendevreviewGhanshyam Maan proposed openstack/nova master: Use 2nd RPC server in compute operations  https://review.opendev.org/c/openstack/nova/+/97558805:02
opendevreviewGhanshyam Maan proposed openstack/nova master: Prepare resize/cold migration for graceful shutdown  https://review.opendev.org/c/openstack/nova/+/97718205:46
opendevreviewGhanshyam Maan proposed openstack/nova master: Use 2nd RPC server in compute operations  https://review.opendev.org/c/openstack/nova/+/97558806:14
opendevreviewGhanshyam Maan proposed openstack/nova master: Prepare resize/cold migration for graceful shutdown  https://review.opendev.org/c/openstack/nova/+/97718206:24
opendevreviewGhanshyam Maan proposed openstack/nova master: Use 2nd RPC server in compute operations  https://review.opendev.org/c/openstack/nova/+/97558807:03
opendevreviewGhanshyam Maan proposed openstack/nova master: Prepare resize/cold migration for graceful shutdown  https://review.opendev.org/c/openstack/nova/+/97718207:11
opendevreviewDominik proposed openstack/nova master: NUMA Topology with Resource Providers: Libvirt NUMA Migrate  https://review.opendev.org/c/openstack/nova/+/97117708:54
gibisean-k-mooney: hi, there is another iothreads fix from lajoskatona that ready to land https://review.opendev.org/c/openstack/nova/+/97593409:22
opendevreviewSilvan Kaiser proposed openstack/nova master: libvirt: partial revert, Quobyte driver supported again  https://review.opendev.org/c/openstack/nova/+/97730009:44
opendevreviewSilvan Kaiser proposed openstack/nova master: libvirt: partial revert, Quobyte driver supported again  https://review.opendev.org/c/openstack/nova/+/97730009:46
*** tkajinam_ is now known as tkajinam10:13
sean-k-mooneygibi: ah yes i didn't pull it down since they respun it but i'll take a look shortly10:31
sean-k-mooneythat was on my todo list, thanks for the reminder10:31
sean-k-mooneygibi: i'll have finished testing this in about 15 minutes and then i'll submit it but how do you feel about adding whitebox to check? we talked about it a few times but it actually caught this bug10:42
sean-k-mooneyi mentioned it offhand to jparker too but i recently became aware that the linux kernel has the ability to emulate/report multiple numa nodes as well10:43
sean-k-mooneywith not much effort we could properly test live migration with and without cpu pinning and hugepages and numa in the first party ci10:44
sean-k-mooneywe have most of it already in whitebox we just need to run it and tweak the job slightly if we want to add numa10:44
sean-k-mooneyi need to tweak my old patch that we reverted for the basic testing of cpu_shared_set and repropose that for nova-alt-config10:45
opendevreviewribaudr proposed openstack/nova master: FUP Add HW_PCI_LIVE_MIGRATABLE trait to PCI resource providers  https://review.opendev.org/c/openstack/nova/+/97731010:51
sean-k-mooneylajoskatona: gibi  tested and approved it will hopefully merge later today and unblock https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/97522511:13
sean-k-mooneyit has also been tested transitively via that change11:14
lajoskatonasean-k-mooney, gibi: thanks11:17
priteauLots of jobs hitting post_failure in 2025.1 :( ModuleNotFoundError: No module named 'typing_extensions'11:24
priteauThat's the testtools issue11:25
priteauIs there a patch for this already?11:26
gibipriteau: as far as I know fix landed in testtools and we are waiting for a new release from testtools to get it11:30
gibipriteau: https://github.com/testing-cabal/testtools/pull/57011:35
priteauhttps://pypi.org/project/testtools/11:36
priteauReleased: 2 minutes ago11:36
priteauBut we would need a bump of upper-constraints?11:39
priteau2024.1 has testtools===2.7.111:39
tkajinampriteau, no11:40
tkajinampriteau, the failing task was common for all branches and doesn't use u-c11:41
tkajinams/was/is/11:41
priteauOK11:43
priteauSo recheck should be enough?11:43
fricklerlikely we will need to build new images, as IIUC the broken venv is preconfigured there11:52
opendevreviewribaudr proposed openstack/nova master: FUP Add HW_PCI_LIVE_MIGRATABLE trait to PCI resource providers  https://review.opendev.org/c/openstack/nova/+/97731012:02
tkajinamfrickler, seems so, looking at the failure still appearing12:25
opendevreviewribaudr proposed openstack/nova master: FUP Add HW_PCI_LIVE_MIGRATABLE trait to PCI resource providers  https://review.opendev.org/c/openstack/nova/+/97731012:52
priteaufrickler: ah, this is why I couldn't see a fresh installation of testtools12:56
gibidansmith: gmaan: after I implemented the single long task executor I realized that we have a logical problem https://review.opendev.org/c/openstack/nova/+/977251/1#message-ff7b23fb4a19eee42289ff428220461ae3df2da8  The recommended limit for the concurrent live migration is wildly different from the number of parallel builds13:08
gibiI don't think deployers will accept this approach to have 1 lm - 1 build or 10 lm - 10 build, config.13:09
gibiwe recommend a single lm but that is not a useful value for concurrent builds13:09
gibithis experiment shows me that we might not be able to avoid the complexity in https://review.opendev.org/c/openstack/nova/+/975924/113:10
sean-k-mooneygibi: ya those need to be split13:12
sean-k-mooneywe could still use a semaphore for the rate limiting13:13
sean-k-mooneyand just have the pool limit be separate13:13
sean-k-mooneywe can warn if the semaphore config option exceeds the pool size13:13
sean-k-mooneybut i think that's an ok compromise13:13
opendevreviewribaudr proposed openstack/nova master: FUP Add HW_PCI_LIVE_MIGRATABLE trait to PCI resource providers  https://review.opendev.org/c/openstack/nova/+/97731013:33
gibisean-k-mooney: we cannot really take the semaphore approach for live migration as we rely on cancellability of live migrations. If I use the shared executor and take a semaphore within the lm task then lm task waiting for the semaphore becomes non cancellable13:34
gibian lm task waiting in an executor queue is cancellable13:35
sean-k-mooneyit depends on how we do cancellation i guess13:35
gibiwe cancel futures13:35
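A minimal sketch of the cancellability point above, using only the standard library `concurrent.futures` rather than nova's own executors. The function and variable names are hypothetical. It assumes a future blocked on a semaphore inside a worker counts as running, so `Future.cancel()` fails, while a future still queued behind a one-worker executor can be cancelled.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def cancel_waiting_task(use_semaphore):
    """Return whether the second, not-yet-working task could be cancelled."""
    limit = threading.Semaphore(1)    # hypothetical "1 live migration" limit
    release = threading.Event()       # lets the first task finish
    entered = threading.Semaphore(0)  # counts tasks picked up by a worker

    def live_migration():
        entered.release()
        if use_semaphore:
            # Shared pool: the limit is enforced inside the worker thread.
            with limit:
                release.wait(timeout=5)
        else:
            # Dedicated pool: the limit is the pool size itself.
            release.wait(timeout=5)

    workers = 2 if use_semaphore else 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pool.submit(live_migration)
        entered.acquire(timeout=5)      # first task is now running
        second = pool.submit(live_migration)
        if use_semaphore:
            entered.acquire(timeout=5)  # second is on a worker, blocked on limit
        cancelled = second.cancel()     # only succeeds for queued futures
        release.set()
    return cancelled
```

With the semaphore the waiting task has already left the queue, so `cancel_waiting_task(True)` returns `False`; with a dedicated single-worker executor the waiting task is still queued and `cancel_waiting_task(False)` returns `True`.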
sean-k-mooneyright but live migration supports direct cancellation via the api as well13:36
gibisure we can re-engineer the lm abort logic but then that is complexity 13:36
sean-k-mooneyso even if it's in progress we can actually cancel it13:36
sean-k-mooneyya so for now i guess we could keep those executors separate13:36
sean-k-mooneybut i don't think we want a separate executor per config option right13:37
sean-k-mooneyat least not unless they idle at 013:37
gibiahh there are different abort modes of cancellation of lm, but we basically lose the abort mode for lms that are not yet running due to the semaphore (limit'l13:37
gibi(limit)13:37
sean-k-mooneyyes you can force complete or abort13:38
gibiif the lm is really executing then we abort via the driver13:38
sean-k-mooneysorry i have not had time to properly load context13:38
sean-k-mooneyyou are stating https://review.opendev.org/c/openstack/nova/+/975924/1 wont work right13:38
sean-k-mooneyor are you saying it will and we cant avoid the complexity13:39
gibiI'm saying I think we probably need the complexity from https://review.opendev.org/c/openstack/nova/+/975924/1 as a shared executor in https://review.opendev.org/c/openstack/nova/+/977251/1#message-ff7b23fb4a19eee42289ff428220461ae3df2da8 does not work well13:39
sean-k-mooneyack13:39
sean-k-mooneyi have not reviewed either patch yet hence why i'm asking for your gut feeling on which one is more likely to work13:40
gibihaving a shared executor for build and snapshot might work, but we would need a separate for lm. Or we need the complexity from the wrapper that implements limits per task type. 13:41
sean-k-mooneyi dont hate the idea of TaskTypeLimiterExecutorWrapper13:41
sean-k-mooneyi actually think we will want to have a task/priority aware executor in the future13:42
sean-k-mooneyi think that is necessary complexity rather than overengineering given our usecases13:42
sean-k-mooneybut i have only really read the doc strings at this point13:42
sean-k-mooneygibi: honestly i don't think https://review.opendev.org/c/openstack/nova/+/975924/1/nova/utils.py is that complex and i would personally extend it to add a priority field to the task or type struct or both13:46
sean-k-mooneypriority on type would mean all tasks of this type have a default priority we register when we register the type13:47
sean-k-mooneythat would allow us to still work properly if the concurrency on the executor is less than the limits for all the individual types13:48
sean-k-mooneywe can technically do that in your version as well as proposed13:48
sean-k-mooneybut obviously we might want to express some preference beyond arrival time in when the next task is enqueued13:49
sean-k-mooneygibi: if this is done correctly by the way this can just live in futurist13:50
sean-k-mooneybut im ok with the idea of building this out in nova first13:50
gibiyeah13:53
gibigiven the closeness of FF I'm trying to make a way forward with a split approach. build and snapshot in a shared executor, and lm having its own executor. 13:58
gibibut I think the right approach is the TaskTypeLimiterExecutorWrapper13:58
gibiwe just don't have time13:58
sean-k-mooneylooking at it briefly i would be ok with proceeding in that direction too13:58
sean-k-mooneythat's obviously not a rigorous code review but directionally i think the trade offs you are making make sense13:59
gibiyeah if I would have 2 weeks I would go all in on the wrapper14:01
gibibut if I want to be realistic and want nova-compute with threading in G then the split approach is less risky14:01
opendevreviewBalazs Gibizer proposed openstack/nova master: [compute]Use single long task executor  https://review.opendev.org/c/openstack/nova/+/97725114:11
opendevreviewBalazs Gibizer proposed openstack/nova master: Run nova-compute in native threading mode  https://review.opendev.org/c/openstack/nova/+/96546714:11
dansmithgibi: you're saying you don't think it's reasonable to lift the live migration limit to match what is probably the higher build limit?14:42
dansmithlive migrations are controlled by the admins, so I guess I don't see that as a fatal compromise for the time being,14:43
dansmithbut even if you go with the semaphore approach, I think you can still handle cancelation if the tasks each immediately check to see if they've been canceled after they acquire the semaphore and exit if so, no? the operation will appear to be a bit sticky until the (or a) current one finishes, but I imagine that's happening today...14:44
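A minimal sketch of the check-after-acquire idea suggested above, again with the standard library only. The names `lm_limit`, `run_live_migration`, and `demo_cancel_after_acquire` are hypothetical and not taken from nova or either patch. It assumes an abort request is recorded in a per-task `threading.Event` that the task inspects as soon as it holds the semaphore.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical limit of one concurrent live migration.
lm_limit = threading.Semaphore(1)


def run_live_migration(cancel_requested, work):
    """Wait for the semaphore, then bail out if an abort arrived meanwhile."""
    with lm_limit:
        if cancel_requested.is_set():
            return "cancelled"
        work()
        return "completed"


def demo_cancel_after_acquire():
    release = threading.Event()
    first_started = threading.Event()
    cancel_first = threading.Event()
    cancel_second = threading.Event()

    def first_work():
        first_started.set()
        release.wait(timeout=5)

    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(run_live_migration, cancel_first, first_work)
        first_started.wait(timeout=5)  # first task holds the semaphore
        second = pool.submit(run_live_migration, cancel_second, lambda: None)
        cancel_second.set()            # abort requested while second waits
        release.set()                  # first finishes, second gets its turn
        return first.result(timeout=5), second.result(timeout=5)
```

The cancelled task only notices the request once the current holder releases the semaphore, which is the "sticky" behaviour described in the message, and it still occupies a worker thread while it waits.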
UgglaFYI: https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/VTLDDSXUHPKON3WOKNMATFL7ERHFPCYB/15:35
sean-k-mooneyUggla: my understanding was the bug scrub was previously meant to happen in the channel after the meeting15:36
sean-k-mooneynot part of it 15:36
sean-k-mooneybut sure. for watcher we do bug scrubbing in the meeting each week and it works fine15:36
gmaanbauzas: this is also ready  for your review https://review.opendev.org/c/openstack/nova/+/97558615:36
gmaangibi: bauzas: can either of you +w this one, I thought it was merged https://review.opendev.org/c/openstack/nova/+/975242/4 15:37
gmaangibi: dansmith sean-k-mooney: for single executor, adding semaphore for lm makes it more complex than the approach gibi mentioned. I also thought the same. to have separate executor for lm and shared for all other long running tasks 15:38
Ugglabauzas, gibi, elodilles, gmaan see 16:35 msg.15:38
gmaanUggla: yeah, I saw the email too. this cycle, gate is giving the best feeling of FF15:39
dansmithgmaan: the only problem there is that 20 live migrations waiting on semaphores will consume all the builder threads, but the point is to limit the activity a bit, so I think that should be okay15:39
Ugglasean-k-mooney, I would like to have a sync point during the meeting to restart it. Because I was lazy on that topic... :(15:40
opendevreviewBence Romsics proposed openstack/nova master: WIP Functional reproducer for #2051685  https://review.opendev.org/c/openstack/nova/+/97733115:41
gmaandansmith: but as we recommend only 1 live migration at a time (currently), by common executor, will 10 (or say 5) parallel live migration be successful and on time ? I am wondering on that side too15:42
opendevreviewBence Romsics proposed openstack/nova master: WIP Functional reproducer for #2051685  https://review.opendev.org/c/openstack/nova/+/97733115:43
dansmithgmaan: you can limit parallel migrations manually by not issuing more than one of them at a time, but a semaphore (set to 1) for live migrations will limit them to one at a time... it's just that if you start 10 in parallel, 9 will consume threads from the pool that will prevent builds from happening too15:43
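A minimal sketch of the thread consumption described above, assuming a shared standard library `ThreadPoolExecutor` with a semaphore set to 1 for live migrations. The pool size of 2 and all names are hypothetical. Every live migration waiting on the semaphore holds a worker, so a build submitted afterwards stays queued until a worker frees up.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def demo_build_starvation():
    lm_limit = threading.Semaphore(1)   # one live migration at a time
    release = threading.Event()         # lets the active migration finish
    on_worker = threading.Semaphore(0)  # counts migrations holding a worker

    def live_migration():
        on_worker.release()
        with lm_limit:
            release.wait(timeout=5)

    def build():
        return "built"

    with ThreadPoolExecutor(max_workers=2) as pool:
        migrations = [pool.submit(live_migration) for _ in range(2)]
        for _ in migrations:
            on_worker.acquire(timeout=5)  # both workers are now occupied
        build_future = pool.submit(build)
        # One migration is active, one only waits, yet the build cannot start.
        blocked = not build_future.running() and not build_future.done()
        release.set()
        result = build_future.result(timeout=5)
    return blocked, result
```

Here `demo_build_starvation()` returns `(True, "built")`: the build only runs after the migrations drain, even though just one of them was doing real work.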
gmaanohk15:44
dansmiththe suggestion of putting them both into one pool and requiring them to be the same was a compromise to avoid the complexity that gibi seemed to not think was achievable in a short period of time.. that compromise is, of course, a compromise and has some restrictions15:45
dansmithif we're not okay with those, then we could just punt and try to do the complex thing first15:46
dansmithI tend to want an incremental approach where not everything will be perfect in the first round15:46
gmaanlive migrations are doable by manager user also not just admin, but still 10 as default should be ok and if operators make it lower then they know the limitation. if needed we can make it 20 as default ?15:49
dansmithI thought the default for the limit was 10 and we _recommended_ they set it to one?15:50
gmaank, one will be issue i think15:53
sean-k-mooneythe default for live migration is 1 and we recommend it to be one 16:12
sean-k-mooneybut for the long lived pool 10 is probably a reasonable default16:12
gmaanI am saying with single limit for executor, build is default to 10 so that will go on live migration also16:13
sean-k-mooneyi would prefer to still have a way to limit live migration with a semaphore if possible as i don't like the idea of changing the default by proxy to 1016:16
sean-k-mooneybut i don't think that is a conflicting request16:17
sean-k-mooneywe just keep the existing config option and semaphore for live migration 16:17
opendevreviewKoya Watanabe proposed openstack/nova-specs master: Repropose instance-metadata-tag-protection  https://review.opendev.org/c/openstack/nova-specs/+/97733916:30
opendevreviewKoya Watanabe proposed openstack/nova-specs master: Repropose newly instance metadata/tag protection feature  https://review.opendev.org/c/openstack/nova-specs/+/97733916:34
sean-k-mooneygibi: melwitt  will ye have time to look at https://review.opendev.org/c/openstack/nova/+/975859/4 and https://review.opendev.org/c/openstack/nova/+/975872/5 next week? its a bug fix so it is not as time sensitive as features so if not that is ok16:55
gibisorry I had to step away. I don't think 10 concurrent live migration is a good thing, as well as limiting the concurrent builds to 1. The semaphore approach makes it complicated to cancel queued live migrations waiting on the semaphore (=> complexity), but also 10 lm waiting on the semaphore will block any new build requests as the semaphore is taken on the worker from the Executor. I understand this 16:55
gibiwas a compromise but I think this is too big of a compromise. As a short term I'm now pushing for build and snapshot sharing an executor and live migration having its own (as today). Then after G I would do the per task type limit with an executor wrapper16:55
gibiand sorry again, but I have to step away again :/ (but I will read back)16:56
gmaanI am good that way for G and it is simple. semaphore approach still consumes the executor so it does not really solve the operation limitation. If we are going to make it smarter in future then this approach is ok for now.17:03
sean-k-mooneythis approach being gibi's current patch with 1 pool for live migration and another for the rest of the background tasks and no semaphore?17:06
sean-k-mooneyi'm ok with that for this release as well just making sure that we are talking about the same thing17:07
opendevreviewGhanshyam Maan proposed openstack/nova master: Add manager graceful shutdown, timeout, and wait  https://review.opendev.org/c/openstack/nova/+/97558617:08
gmaanbauzas: gibi thanks for review on graceful shutdown changes, I fixed the bauzas comment in this change itself instead of followup because anyways I need to change the other changes in that series.17:11
gmaanthis one https://review.opendev.org/c/openstack/nova/+/97558617:11
opendevreviewMerged openstack/nova master: Add 2nd RPC server for compute service  https://review.opendev.org/c/openstack/nova/+/97524217:36
opendevreviewmelanie witt proposed openstack/nova master: TPM: fixups for live migration of `host` secret security  https://review.opendev.org/c/openstack/nova/+/97631618:09
opendevreviewmelanie witt proposed openstack/nova master: TPM: support live migration of `deployment` secret security  https://review.opendev.org/c/openstack/nova/+/92577118:09
opendevreviewmelanie witt proposed openstack/nova master: TPM: bump service version to enable live migration  https://review.opendev.org/c/openstack/nova/+/97572418:09
opendevreviewmelanie witt proposed openstack/nova master: TPM: test live migration between hosts with different security  https://review.opendev.org/c/openstack/nova/+/95262918:09
opendevreviewmelanie witt proposed openstack/nova master: TPM: add late check for supported TPM secret security  https://review.opendev.org/c/openstack/nova/+/95697518:09
opendevreviewmelanie witt proposed openstack/nova master: TPM: enable conversion of secret security modes via resize  https://review.opendev.org/c/openstack/nova/+/96205218:09
opendevreviewmelanie witt proposed openstack/nova master: DNM vtpm tempest  https://review.opendev.org/c/openstack/nova/+/95747718:09
gmaangibi: In my graceful shutdown change, i saw test_submit_second_while_delaying_first failing in threading job  AssertionError: 1.997275639999998 not greater than 2.018:57
gmaanhttps://1069208da190f941bbcb-6faf22591116ac424591f44dbeb2cb9b.ssl.cf1.rackcdn.com/openstack/0cc7f4aa239e491493e73f88b45c8986/testr_results.html18:57
gmaannot sure you or sean-k-mooney talked about it but just to let you know that this is happening in more places18:57
sean-k-mooneygmaan: melwitt mentioned it a day or two ago i think18:58
gmaanohk18:58
sean-k-mooneyi didn't look at it in too much detail but we likely need to mock time slightly differently18:58
sean-k-mooneythat or use the assertAlmostEqual thing for floats18:59
sean-k-mooneyit's definitely a semi flaky test but i have not seen it fail much18:59
gmaank, I just saw it in my change but not anywhere else19:00
gmaanits for delay task StaticallyDelayingCancellableTaskExecutorWrapper with delay of 2 so task should not finish before 2 if we check float then it will fail too19:01
sean-k-mooneywell sleep should not be less than the amount but i have not looked at the code19:03
gmaanor maybe it is just a matter of the next line capturing the monotonic time after the task is submitted. maybe we can capture time before task is submitted 19:03
gmaanlet me propose the change and see if that make sense19:03
sean-k-mooneyack19:03
sean-k-mooneyim just finishing up but ill be around for a few more mins so if you push it before i wrap for the weekend ill review it quickly19:04
gmaank, give me few min19:04
sean-k-mooneywe could just assert >=1.9 19:09
sean-k-mooneybut ya its the order of https://github.com/openstack/nova/blob/7a303bc1e28e9426f2f6d9898a18edda34bb8dd9/nova/tests/unit/test_utils.py#L2166-L216719:09
sean-k-mooneyit's already submitted to the executor at that point when we record the time19:10
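A minimal sketch of the ordering issue discussed above. It is not the nova test itself: `measure_elapsed` is a hypothetical helper and a plain `time.sleep` stands in for the delaying executor wrapper. It assumes the delay begins as soon as the task is submitted, so a start timestamp taken after `submit()` can miss the first fraction of the delay.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_elapsed(record_before_submit, delay=0.05):
    """Measure how long a delayed task appears to take to the caller."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        if record_before_submit:
            # Start the clock first: the measurement covers the whole delay.
            start = time.monotonic()
            future = pool.submit(time.sleep, delay)
        else:
            # Racy order: the worker may already be sleeping when the clock
            # starts, so the measured time can come out just under the delay.
            future = pool.submit(time.sleep, delay)
            start = time.monotonic()
        future.result()
        return time.monotonic() - start
```

Recording the start time before submitting makes an assertion such as `elapsed >= delay` safe, whereas the racy order can occasionally yield a value like the 1.997 seen against an expected 2.0.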
opendevreviewGhanshyam Maan proposed openstack/nova master: Fix the flasky test test_submit_second_while_delaying_first  https://review.opendev.org/c/openstack/nova/+/97735619:13
gmaansean-k-mooney: melwitt gibi ^^19:13
sean-k-mooneythat is exactly the fix i was expecting so +2 :)19:14
gmaanthanks19:14
gmaanand it seems py310 job also green19:14
sean-k-mooneyalready or locally19:15
gmaanalready, i can see it is passing in my change19:15
sean-k-mooneylocally i guess because it has not run yet in the gate19:15
sean-k-mooneyhttps://zuul.openstack.org/status?change=97735619:15
gmaanin other change i mean which is still in gate19:15
sean-k-mooneyoh ok19:16
sean-k-mooneyis the grenade issue fixed out of interest19:16
sean-k-mooneyModuleNotFoundError: No module named 'typing_extensions'19:17
gmaanwhich one? is there new one 19:17
sean-k-mooneythat failing in watcher 19:17
gmaanoh, i can see grenade job also passing in my nova change19:17
gmaanI think it was same in py310 also, i did not dig into it but same error19:17
sean-k-mooneyoh ok i thought i saw folks talking about this in one of the irc channels19:18
sean-k-mooneyi think it's a testtools issue and we were waiting on the new release19:18
sean-k-mooneywhich happened earlier today19:18
gmaanyeah19:18
sean-k-mooneyok well that will either pass or not but should be resolved one way or the other by monday19:19
gmaanthis is merged so all green https://review.opendev.org/c/openstack/nova/+/975242/419:20
gmaanyou can recheck maybe19:20
gmaanwatcher should be ok too19:20
sean-k-mooneyjoan already did19:20
gmaank19:20
sean-k-mooneyand we merged some stuff too i just was not sure if the grenade issue had been resolved19:20
gmaank, at least for now but I am sure its going to be more failure in FF week :) if it start it happen at worst :)19:21
melwittawesome gmaan 19:21
opendevreviewGhanshyam Maan proposed openstack/nova master: Add manager graceful shutdown, timeout, and wait  https://review.opendev.org/c/openstack/nova/+/97558619:49
opendevreviewGhanshyam Maan proposed openstack/nova master: Use 2nd RPC server in compute operations  https://review.opendev.org/c/openstack/nova/+/97558820:20
opendevreviewMerged openstack/nova master: Fix the flasky test test_submit_second_while_delaying_first  https://review.opendev.org/c/openstack/nova/+/97735620:21
lothHey all, I'm having some trouble getting multi-cells working. Is anyone on that is familiar with it?20:40
opendevreviewGhanshyam Maan proposed openstack/nova master: Prepare resize/cold migration for graceful shutdown  https://review.opendev.org/c/openstack/nova/+/97718223:25
