Monday, 2024-07-29

*** tobias-urdin|pto is now known as tobias-urdin06:31
*** bauzas_ is now known as bauzas06:52
jlejeunebauzas: hello, I would need a one more review on https://review.opendev.org/c/openstack/nova/+/925014 please :)07:12
*** bauzas_ is now known as bauzas07:24
*** bauzas_ is now known as bauzas07:56
*** bauzas_ is now known as bauzas08:21
opendevreviewMerged openstack/nova stable/2023.1: Fix test_vmdk_bad_descriptor_mem_limit and test_vmdk_bad_descriptor_mem_limit_stream_optimized  https://review.opendev.org/c/openstack/nova/+/92501408:28
jlejeunethanks bauzas :)08:29
*** bauzas_ is now known as bauzas10:04
ralonsohhi folks! please check this list of backports: https://review.opendev.org/q/Ie731045629f0899840a4680d21793a16ade9b98e10:07
ralonsohthis is affecting the stable branches CI in Neutron right now10:07
ralonsohthe master patch, merged 3 days ago, solved the issue for the master branch10:08
ralonsohthanks in advance!10:08
sean-k-mooneybauzas: gibi: can try an dland these backports today https://review.opendev.org/q/topic:%22remove-ami-override%2210:51
sean-k-mooneythis will be needed by neutorn and i want to include them in the release shas for the new release for the cve which i hope to base on what is merged by EOD10:52
fricklerelodilles: ^^ maybe? or as a last resort I can undust my release-team-inherited stable-maint hat10:58
sean-k-mooneyfrickler: i think elodilles  is on pto so its ok to wait for others11:01
sean-k-mooneyfrickler: the timeline i have in my head is prepare the branches to be tagged today by merging any pending backport we think are reasonabe11:01
sean-k-mooneythen tomorrow ill submit the release patches adn we can review them in the nova meeting11:02
sean-k-mooneyif no objections at that point then we will proceed with the actual release11:02
sean-k-mooneyfrickler: if really needed i can self approve my own backport or dansmith can approve them  i was just avoiding ping people before they come online11:03
gibisean-k-mooney: I'm +2 on the backports11:16
sean-k-mooneygibi: thanks if you and bauzas: are both happy i can add +w as required once the preceddeing patch is merged11:19
gibisounds good to me11:21
sean-k-mooneyby the way i hada weirdly passive agressinve interaction with someone over the weekend. i dont know why they were getting so defensive when i asked simple questions but they abandoned there patch 11:21
sean-k-mooneyhttps://review.opendev.org/c/openstack/nova/+/92503711:21
sean-k-mooneyapparently if we have mdevs allocated volume or possible also interface  detach11:22
sean-k-mooneycan fail https://bugs.launchpad.net/nova/+bug/207421911:22
sean-k-mooneybauzas: we should create a functional repoducer and or see if we can replicate this in your devstack mdev series11:23
sean-k-mooneymy guess is they work for Indiana University and just decieded that actully fixing this properly upstream was too much effrot and they are just going to carry the patch downstream11:26
sean-k-mooneybut if this is a real problem i wouls like to actully fix it for everyone properly11:27
sean-k-mooneyhowever we shoudl acutlly create a repoducer before doing that11:27
elodillessean-k-mooney: actually i'm back from today on, though i just started picking up the pace o:)11:43
sean-k-mooneyelodilles: well welcome back and dont burn yourself out.11:54
elodillesthx, i will try no to o:)12:00
opendevreviewRajesh Tailor proposed openstack/nova master: Add regression test case for bug 1978573  https://review.opendev.org/c/openstack/nova/+/91465312:21
opendevreviewRajesh Tailor proposed openstack/nova master: Restore original AZ when unshelve fails  https://review.opendev.org/c/openstack/nova/+/91110812:21
sean-k-mooneyelodilles: i dont really want to delay the release for the cve too long but if there are any backports you think we should include in the release im happy to review12:26
kashyapsean-k-mooney: On that MDEV patch review, for a new contributor, I wouldn't use language like: "any attempt to rely on something like that would not pass code review."12:31
sean-k-mooneykashyap: they are not really new htey have submitted patches before over a few releases and abandoned most of them12:32
sean-k-mooneykashyap: but they were also implying that we were not following best pratices and relying on bad coding style. 12:32
kashyapI see; I've just seen that they've also contributed in the past a patch.12:33
sean-k-mooneywhen i didnt recognise the name i looked to see if they were new and saw they ahd previous patches but i tought it was odd they abandoned most of them12:34
kashyapYeah, I saw that comment.  Maybe they could've said, "_if_ we're relying on missing attributes ..."12:34
sean-k-mooneyperhaps they seams to ignore the content of my comment hosever and get very defensive12:35
sean-k-mooneyi on 2 ocations asked them to create a repoducer to demonstrate the orgainal bug and confrim there patch actully fixed it12:36
kashyapBut is parsing the alias field here wrong?  I'm not 100% sure yet12:36
sean-k-mooneythe bug is that we dont check that all the objects have it before using it12:45
sean-k-mooneyand only some of them do12:45
sean-k-mooneyyou can only request MDEVs via the flavor12:45
sean-k-mooneyso we dont actuly supprot detaching or attaching them to a running domain12:46
sean-k-mooneywhich is why we did not add the alias field to them12:46
sean-k-mooneynow we can just do that and that will fix the current bug12:46
sean-k-mooneywe could also fix this in other ways12:46
sean-k-mooneywithout a repoducer its hard to prevent a futrue regressions12:47
kashyapYeah, I see that bit - that we don't support detach/attach mdevs to a running domain.12:47
kashyapI agree; repro will help here, and you've asked as much on the review12:47
sean-k-mooneyso the fact we dont supprot it is just an aside, its why we didnt think it was required when we started using the aliases12:48
sean-k-mooneyif we fix this by adding the filed12:48
sean-k-mooneythen we have a wackamole problem fo doing this for each possibel object that could be returened12:48
sean-k-mooneyif we fix this but only checking the aliase if the object has one then we fix it for all device types12:48
sean-k-mooneyin the previous case of a PF it was correct to add teh alias filed because we do supprot attaching/detaching those to a running guest12:49
sean-k-mooneyso there patch is not wrong jsut incompelte12:49
sean-k-mooneykashyap: is this somehting your interested in fixing?12:50
sean-k-mooneyi was trying to highlight it to bauzas in case they wanted to fix it or where seeign it in there mtty serise12:51
kashyapsean-k-mooney: I'm not sufficiantly clued into it; I was just curious based on your remark here.  (Reading back your comments)12:51
kashyapHmm, you do raise a good point on the whack-a-mole problem of each object12:51
kashyapPff; I'd probably not risk to introduce new regressions at this stage, and given capacity.12:51
sean-k-mooneyah ok, so the underliying problem is we actully woudl ideally have a new api call in libvirt12:51
kashyapHmm, really?12:52
kashyapAh, "ideally", you said.  What's the current API call?12:52
sean-k-mooneyya so ideally libvirt would have an api to get a domain device element given an alias12:52
sean-k-mooneysince it does not we get all the devices and do the filtering ourselves12:52
kashyapLet me check on #virt (OFTC) if they're open to it12:54
sean-k-mooneyso this is where the actly bug is in nova https://github.com/openstack/nova/blob/master/nova/virt/libvirt/guest.py#L41912:54
sean-k-mooneydev.alias is only in some of our devices not all12:54
sean-k-mooneyso we coudl fix it in the base clase we use or we shoudl do a hasattr check before using it12:55
kashyapDo we know a list of devices that libvirt _does_ return here?12:55
sean-k-mooneyits all devices in the domain after we parse it12:56
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/guest.py#L419-L45212:56
kashyapI'm looking at get_all_devices() 12:56
sean-k-mooneyso we could fix this by adding alias to https://github.com/openstack/nova/blob/master/nova/virt/libvirt/config.py#L300012:57
sean-k-mooneythese are not really object returned form libvirt12:58
sean-k-mooneythey are our config objects12:58
sean-k-mooneywe construct them form parsing the xml12:58
sean-k-mooneybut we check the libvirt pyton binding before doing this to see if there was a fucntion we could use more directly and didnt find one at the time12:58
sean-k-mooneywhat we would do ideally is just as libvirt for the singel device element based on the alias12:59
sean-k-mooneyinstead of pulling the entire domain. and parsing out all the devics to try and find the specific one13:00
kashyapYeah, I know these (the config.py stuff) are all libvirt guest-config classes.13:02
kashyapMaybe you want to summarize / copy paste your comment from here on the change for the record.13:02
sean-k-mooneynot really :) i could but since they have abandoned it i feel like they are nto going to work on it any more13:03
sean-k-mooneyso i wanted to see if one of use wanted to pick it up13:03
sean-k-mooneykashyap: im expectign bauzas mtty patches to repodcue this13:03
sean-k-mooneyso as part of using mtty to test in the gate we will likely need to fix this anyway13:03
kashyapYou think so?13:03
kashyapGot a link?13:04
sean-k-mooneyso if this is failing the way they reported it woudl cause volume detach to fail if libvirt report mdevs13:04
sean-k-mooneyso i woudl expect the volume tempest test to be failin on https://review.opendev.org/q/topic:%22mtty_support%2213:05
sean-k-mooneythe fact it does not makes me want the repodcures even more13:06
sean-k-mooneyoh...13:06
sean-k-mooneyok so i wonder if this only failes if you have both a vgpu and a cinder volume13:07
kashyapHmm13:07
sean-k-mooneyon the same vm13:07
sean-k-mooneywe dont have result anymore but it did actully fail13:08
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/42a251fe1d174a2abef7016b96fe0dc413:08
kashyapOhh, so the vGPU + Cinder volume is the combo13:10
opendevreviewMerged openstack/nova stable/2024.1: Remove AMI snapshot format special case  https://review.opendev.org/c/openstack/nova/+/92498213:14
sean-k-mooneykashyap: i think so but again they didnt provide a clear repoducer in the bug so its hard to tell exactly13:21
sean-k-mooneybut i think that has to be the case looking at the code13:21
sean-k-mooneywe are looping over the devices in the guest domain and the error is in the mdev class13:22
sean-k-mooneyso it must only be an issue when comibining both features13:22
sean-k-mooneywhich would expliain why we dont see this today13:22
dansmithsean-k-mooney: gibi: I thought we were into "written by a core, backported by a core, can be single-vote-landed by a core" on stable now13:30
sean-k-mooneydansmith: i guess, it does not come up often13:31
sean-k-mooneybut in either case i was going to proxy your/my +2 by end of day if noone else said go for it13:32
dansmithack13:34
opendevreviewRajesh Tailor proposed openstack/nova master: Fix KeyError on assisted snapshot call  https://review.opendev.org/c/openstack/nova/+/90078313:45
opendevreviewRajesh Tailor proposed openstack/nova master: Handle InstanceExists exception for duplicate instance  https://review.opendev.org/c/openstack/nova/+/86093813:57
elodillessean-k-mooney: i'm OK to propose stable releases anytime & i don't have any important backport on my radar14:00
sean-k-mooneyelodilles: bauzas asked me to include https://review.opendev.org/c/openstack/nova/+/924514 and its backports14:04
sean-k-mooneyi think we can include https://review.opendev.org/c/openstack/nova/+/923689 also on 2024.114:05
elodillesACK, I'll wait for them then o/14:06
bauzascool14:06
sean-k-mooneyelodilles: i have not realy looked at the older branch contet much14:07
sean-k-mooneyi do see some nice to haves so i may do some stable review today14:07
sean-k-mooneybut i dont see anythign super urgent14:08
elodillesACK, i'll check them too14:09
bauzassean-k-mooney: can you summarize the mdev issue you said ?14:34
bauzas-ETOOMUCHCONTENTTODIGEST14:34
bauzasfwiw, I need to resume my mtty series now that devstack supports 24.0414:35
sean-k-mooneybauzas: it appears that if a gues has an mdev  and a cinder volume we cant detach t he cinder volume because the get_device_by_alis code cant hanel the fact that we dont have an aias field in the class that modeles the mdev device in the guest xml14:36
sean-k-mooneyi.e. you cant detach a cinder volume for a guest with a vgpu14:37
sean-k-mooneywe can fix that in the way propsoed in the patch but it missed the underly ing bug and did not include a repoducer14:37
bauzasi see14:37
sean-k-mooneythe underlying bug is the get_device_by_alias code assumes all geust devices have an alias filed14:38
bauzasso people need to rebuild the instance with a flavor without VGPU first before detaching14:38
sean-k-mooneyso we shoudl fix that14:38
bauzasagreed14:38
bauzasis this https://review.opendev.org/c/openstack/nova/+/925037 ?14:38
sean-k-mooneybauzas: they might just need to stop the instnace14:38
sean-k-mooneyyep14:39
sean-k-mooneythat patch will work but it does not have enouch test coverage 14:39
sean-k-mooneyand as i said the bug is reallly here https://github.com/openstack/nova/blob/7a7427691e0bd4818bb7a2c5f5371e0244addbbb/nova/virt/libvirt/guest.py#L41914:39
bauzasthe real problem is that we didn't had any functest for that14:40
sean-k-mooneythat if assumes all devices in get_all_devices will have an alias field14:40
bauzasbecause mdev support predates alias 14:40
sean-k-mooneyyep14:40
sean-k-mooneyso ideally we woudl create a fucntional repoducer14:40
bauzasbut when we merged aliases series, we created a regression14:40
bauzasyup 14:40
bauzaswe need a functtest14:40
sean-k-mooneyand then fix both the underlying bug and optionall add supprot ofr alias ot mdevs even if not used today14:41
sean-k-mooneywe can fix the actual bug in 1 of 2 ways14:41
sean-k-mooneyeither check if dev has an alais field before usign it14:41
sean-k-mooneyor make sure that all device will have that eihter by extending the base class or checking all child class have it14:42
sean-k-mooneyso either add it here https://github.com/openstack/nova/blob/7a7427691e0bd4818bb7a2c5f5371e0244addbbb/nova/virt/libvirt/config.py#L56 or add a hasattr check14:43
bauzasI'm fine with the first option14:46
bauzasthe second option is better in case of *another* regression14:46
bauzaswant me to help with the reproducer ?14:46
bauzasI just have 3.5 days to work before my PTO and I really want my mtty series to be reviewable before I leavred14:47
opendevreviewPavlo Shchelokovskyy proposed openstack/nova master: Replace -- as 3-4 char in hostname with -  https://review.opendev.org/c/openstack/nova/+/89307215:23
*** efried1 is now known as efried15:25
opendevreviewMerged openstack/nova stable/2023.2: Remove AMI snapshot format special case  https://review.opendev.org/c/openstack/nova/+/92499915:33
*** efried1 is now known as efried15:33
*** efried1 is now known as efried15:53
*** bauzas_ is now known as bauzas16:01
opendevreviewMerged openstack/nova stable/2023.2: Fixed an error when caching multiple images in aggregate  https://review.opendev.org/c/openstack/nova/+/90401716:09
sean-k-mooneybauzas: i rehcecked the last patch in your mtty seriese and it didnt repoduce the cinder issue but it did fail with https://paste.opendev.org/show/bexn6H6KYRrAlhNONtVn/16:20
sean-k-mooneythere is a bug somewehre but im not sure that https://review.opendev.org/c/openstack/nova/+/918420/ is even current16:22
opendevreviewsean mooney proposed openstack/nova stable/2023.1: Fixed an error when caching multiple images in aggregate  https://review.opendev.org/c/openstack/nova/+/90401816:28
opendevreviewMerged openstack/nova stable/2023.2: Added context manager for instance lock  https://review.opendev.org/c/openstack/nova/+/91139416:39
opendevreviewMerged openstack/nova stable/2023.2: Separate OSError with ValueError  https://review.opendev.org/c/openstack/nova/+/91139516:44
sean-k-mooneybauzas: if you have time to work on it do17:25
sean-k-mooneyperhaps just try that usecase "mdev + cinder volume in the same vm" when working on the mtty stuff17:26
sean-k-mooneyand see if you see the same issue17:26
sean-k-mooneyif you do then we can confirm the bug17:26
sean-k-mooneyand create a repoducer17:26
sean-k-mooneyin the fucntional tests17:26
opendevreviewMerged openstack/nova stable/2023.1: Fixed an error when caching multiple images in aggregate  https://review.opendev.org/c/openstack/nova/+/90401819:51
*** bauzas_ is now known as bauzas20:16
opendevreviewPavlo Shchelokovskyy proposed openstack/nova master: Replace -- as 3-4 char in hostname with -  https://review.opendev.org/c/openstack/nova/+/89307221:08

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!