Monday, 2025-02-10

*** elodilles is now known as elodilles_pto06:33
*** tkajinam is now known as Guest863909:04
sean-k-mooneygibi: i forgot to hit send, the numa code and pci code expect that its  workign with an upleveled object at least for the instance numa objectss10:54
sean-k-mooneyso i disagre that that how nova was designed10:55
sean-k-mooneyit might have been how dansmith intenetd we use ovo in nova10:55
sean-k-mooneybut in practice its not the mental model i or others had when writing the numa features10:55
sean-k-mooneywe always wote the code assuming the obejct would be version matched to the current processes class defintion10:58
gibisean-k-mooney: yeah I would like to use our ovos in a way that if a field is defined in the `fields` dict of the class then i) the ovo has that field always ii) the ovo hides all the necessary version forward (or backward) migration logic from the caller11:00
gibiThis is not the first time such assumption of mine gets invalidated11:01
sean-k-mooneywe should never loose data by upleveling, (unless we stop supporting a feature)11:01
sean-k-mooneyi dont think its invlaide11:01
sean-k-mooneyi think dansmith has been overcome by events11:01
sean-k-mooneyi.e. even if there view poitn was the orginal intent it not how we have been using them in the code for years11:02
sean-k-mooneymigration is fine by the way becasue we pass data with dedicated objects for live migration 11:02
sean-k-mooneyso we not passing an instance numa toplogy object form one compute to another11:03
sean-k-mooneywere are always gettign it form teh conductor11:03
sean-k-mooneyor constucting it locally11:03
sean-k-mooneythe conductor will backlevel it if the compute is older but the compute can use the latest version it knows about in all cases11:04
gibiI see11:04
sean-k-mooneyoh i did push my comments on teh follow up https://github.com/openstack/nova/blob/e27bbe72e0d293e55c30d4f90ca0afcf47427419/nova/compute/manager.py#L955-L95811:05
sean-k-mooneycool i was going to check _get_numa_constriats in hardware.py to see what we set it too11:06
sean-k-mooneyjust since you might not know this. we use the instance numa toplogy in 2 ways.11:06
sean-k-mooneyfor the schduler we partially popuplated it with the "requests" form the flavor/image11:07
sean-k-mooneyand then later we actully fully populated it with the actul cores/numa assignmentes on the compute node11:07
sean-k-mooneyso in the schduler 2 we jsut have "i woudl like 2 numa nodes please" and on the ocmpute we actully also record "sure you can have 1 an 3"11:08
sean-k-mooneythat may or may not be relevent jsut want to let you knwo that the api/sechduler  use only a subset of the files. we could have had 2 obejct but we didnt for historical reasons i dont remember11:10
sean-k-mooneywell no, the intent was to evenutally have the scheduler/conductor do the numa claim in the db but we never did that so the idea was we woudl pass the fully populated numa toplogy obejct to the comptue isntead of claiming the speciric cores ectra on the compute11:12
sean-k-mooneybut we decied to build placement instead so never got around to that11:13
gibiwe have a case to return cpu_policy None from https://github.com/openstack/nova/blob/e27bbe72e0d293e55c30d4f90ca0afcf47427419/nova/virt/hardware.py#L156211:14
sean-k-mooneyhttps://github.com/openstack/nova/blob/e27bbe72e0d293e55c30d4f90ca0afcf47427419/nova/virt/hardware.py#L229411:14
sean-k-mooneyyep  that exactly what i rememebred and wanted to confrim11:15
gibianyhow I pushed a fix that considers None the same as SHARED11:15
sean-k-mooneyyep i saw11:16
gibiit seems we don't have the bot active posting the incoming patchsets11:16
sean-k-mooneyi still got the emails11:16
sean-k-mooneyi think there was a gerrit restart at teh weekend maybe the bot need to be restarted?11:16
gibiyeah11:17
sean-k-mooneythe last review email was from friday as far as i can tell11:17
sean-k-mooneyany idea how to do that?11:17
gibinope11:18
gibiprobably infra knows11:18
sean-k-mooneyok i was going to say letst just ask there11:18
sean-k-mooneyif i open a chat to opendevreview it does nto prompt with anythign so i guess its not contolled like that11:21
opendevreviewRajesh Tailor proposed openstack/nova master: Add support for showing image properties in server show response  https://review.opendev.org/c/openstack/nova/+/93964913:59
gibihey the bot is back \o/14:06
sean-k-mooneyyep fungi restarted it. there was an unplanned gerrit outage over the weekend i think and it just didnt reconnect14:06
fungiyeah, if gerrit vanishes like happened that time (nova spontaneously stopped the vm, still waiting to hear back from the provider on a root cause), the client doesn't notice the server has gone away and continues listening on a tcp connection that will never receive new packets14:08
sean-k-mooneynova only updates it to stopped if it sees it stopped on the libvirt side14:09
sean-k-mooneyor if it recived an api request to do so14:09
sean-k-mooneyso proably an issue on the underlying host like an OOM kills would be my guess14:09
sean-k-mooneyif they cant root cause it tell them to file a nova but and let use know and we can maybe point them in the right direction14:10
bauzashmmmpffff, I got a similar nova-next and nova-multicell failure for my patch 15:10
bauzashttps://6944f33f3ef66c20bc22-320386062a8fef96051148fe5e7af6b1.ssl.cf5.rackcdn.com/940642/5/check/nova-next/c60809e/testr_results.html15:10
bauzashttps://8ea52cf6ed28836394cc-227b19b7b0e554c64ba32d661a445fcc.ssl.cf1.rackcdn.com/940642/5/check/nova-multi-cell/13c8443/testr_results.html15:11
bauzaslooks like our lordy gate is b0rken15:11
MengyangZhang[m]<sean-k-mooney> "Mengyang Zhang: i suspect the..." <- Hi Sean, after giving another thought about this. I don't think we need to add this min version check. Migrating instances between compute hosts with different versions should be fine. 15:43
MengyangZhang[m]During the live migration process, pre_live_migration is called on the source host, which calls the Cinder API to re-create volume attachments, updating connection_info and attachment_id on all volume BDMS to reflect the new destination host attachment and returns the information as migrate_data. Then nova performs live migration by calling LibvirtDriver(driver.ComputeDriver)'s live_migration method to execute the live15:43
MengyangZhang[m]migration, utilizing migrate_data to update the current XML.15:43
MengyangZhang[m]* Hi Sean, after giving another thought about this. I don't think we need to add this min version check. Migrating instances between compute hosts with different versions should be fine.... (full message at <https://matrix.org/oftc/media/v1/media/download/ASNbn5eOSN0eZUGb2nl-I79qeQk8MrG28vSQcbe97Ljgln-5krYH4sKDCO_d1rvB2o3m5nhyeSYEkc3QXYITUeZCeVOKSligAG1hdHJpeC5vcmcvd0NYdFZjcU1wQ3d6cXBvbVdUWGdFWURI>)15:47
opendevreviewribaudr proposed openstack/nova master: Add managed flag to PCI device specification  https://review.opendev.org/c/openstack/nova/+/93764915:49
opendevreviewribaudr proposed openstack/nova master: Update driver to deal with managed flag  https://review.opendev.org/c/openstack/nova/+/93840515:49
opendevreviewFabian Wiesel proposed openstack/nova master: WIP: Re-create volume backed root before destroy  https://review.opendev.org/c/openstack/nova/+/94109515:49
bauzasgood news are that apparently nova-next runs are not really stopped https://zuul.openstack.org/builds?job_name=nova-next&skip=016:03
sean-k-mooneystopped?16:04
sean-k-mooneyby what?16:04
sean-k-mooneyoh you saw a few failures16:04
bauzassean-k-mooney: see my above comment ^16:04
sean-k-mooneyMengyangZhang[m]: migration woudl fail in pre-livemigrate16:05
sean-k-mooneyactully no it would not16:05
sean-k-mooneyMengyangZhang[m]: migrating form an upgraded compute to an non upgraded one will generate the correct config16:06
sean-k-mooneybut then if you hard reboot the qos config would be lost16:06
sean-k-mooneyso we woudl need to have the migration fail in that case16:07
opendevreviewMerged openstack/nova master: api: Allow min/max_version arguments to response  https://review.opendev.org/c/openstack/nova/+/93636416:46
opendevreviewMerged openstack/nova master: trivial: Remove legacy API artifact  https://review.opendev.org/c/openstack/nova/+/93737716:46
mikalWas there a zuul outage yesterday or something? CI doesn't seem to have run on https://review.opendev.org/c/openstack/nova/+/924844 after about 10 hours.18:50
sean-k-mooneythere was yesterday at some point i think19:23
sean-k-mooneynot today that im aware of19:23
sean-k-mooneymikal: it looks like the last buidl for that was https://zuul.openstack.org/buildset/5704a9aeddb44126966dad2b267a626319:27
sean-k-mooneyso around "2025-02-09 00:09:08"19:28
sean-k-mooneymikal: i see your recheck from 30 mins ago but i dont see anyign in zuul for that19:29
sean-k-mooneyi wonder is there a merge confligt with the depends on agaisnt my patch19:30
mikalYeah, I did that recheck because I couldn't find any evidence that zuul had actually run a check on the latest patchset.19:32
mikalThat recheck did cause the vmware test to run at least, but I agree I can't see anything in zuul.19:32
sean-k-mooneylet me rebase my patch quickly19:33
sean-k-mooneyv4 is not based on teh latest verion of its parent https://review.opendev.org/c/openstack/nova/+/940873/419:33
mikalI fixed a typo in one of yours to remove a stray pep8 error in my test results. I wonder if that is the problem?>19:33
opendevreviewsean mooney proposed openstack/nova master: Dont deploy n-spice on compute nodes.  https://review.opendev.org/c/openstack/nova/+/94087319:33
sean-k-mooneymaybe, when you did that you did not update the follow up patch which is the one your referncing19:34
mikalYeah, I did it quickly in the web UI and didn't really think it through...19:34
sean-k-mooneyok now its running19:35
sean-k-mooneythat was all that was wrong it seams19:35
sean-k-mooneyhttps://zuul.openstack.org/status?change=92484419:36
mikalOh nice. Thank you.19:36
mikalgmann left some comments on the tempest patch, so I'll address those today now that I have a test run to observe.19:36
sean-k-mooneyya i clicked into it but didnt look at the code yet19:37
sean-k-mooneyi dont think ether of the issues will prevent the test form running but the feedback maskes sense19:38
sean-k-mooneyat least  with my almost 0 knowladge of tempest convetions if gmann says it shoudl be done a certin way i tend to agree :)19:38
mikalI see your zero knowledge and raise you negative knowledge!19:39
mikalI just cargo culted the previous VNC test.19:39
sean-k-mooneywell you actully implemnted part fo the spice handshake it seams19:41
sean-k-mooneyso you proably deserve more credit then your giving yourself19:41
mikalHeh. Well yeah, the scaffolding is all cargo culted. The SPICE protocol bit was a cut and paste from the Kerbside code and some tweaks.19:42
JayFgood artists borrow; great artists steal :) 19:42
mikalHeh19:43
mikalsean-k-mooney: has there been any more discussion of an os-traits release? I think that blocks the other patchset for now.19:43
sean-k-mooneyare the traits patches merged19:44
gmannsean-k-mooney: commented in devstack change to enable tempest cofig option also, https://review.opendev.org/c/openstack/devstack/+/94083819:45
sean-k-mooneyhttps://review.opendev.org/c/openstack/os-traits/+/94041819:45
sean-k-mooneygmann: did you see my nova job changes?19:46
gmannsean-k-mooney this one?   https://review.opendev.org/c/openstack/nova/+/94087319:46
sean-k-mooneyyep19:46
gmannyeah, but that does not set tempest config19:47
sean-k-mooneyso that is what enabled the testing in the nova check pipelien19:47
sean-k-mooneyright so ill update devstack with the change you susggeted19:47
gmanneither you can set tempest option also in job or better is devstack set it based on nova configuration 19:47
sean-k-mooneyi was orginling thinkign of configuring that in the job as a followup19:47
gmanni see19:47
sean-k-mooneybut im fine with devstack19:47
gmannthat will be easy if we want to test it in more jobs or so19:48
sean-k-mooneysure so i dont neesisaly want to kick the current run out of the gate on the other hand the job has not started yet its still queue19:49
sean-k-mooneyso will i just do it now or do it after the current execution completes so we have one test run?19:49
sean-k-mooneyi guess in its current state it wont run the new test19:50
sean-k-mooneyso its proably better to update devstack now19:50
gmannyeah19:50
gmannat least to merge the tempest tests (which is almost ready), we need to see test running somewhere. 19:51
sean-k-mooneyyep, do you have an example of how we set this in devstack normally19:52
sean-k-mooneyis it in lib/tempest or lib/nova19:52
mikalsean-k-mooney: no, the os-traits change hasn't seen any more review recently.19:52
sean-k-mooneyah found it19:52
gmannsean-k-mooney: here https://github.com/openstack/devstack/blob/master/lib/tempest#L41719:53
gmannsean-k-mooney: you can add both condition 'is_service_enabled n-spice || [ "$NOVA_SPICE_ENABLED" != False ]' there same as how nova is configuring 19:53
sean-k-mooney... whitespace https://review.opendev.org/c/openstack/devstack/+/940838/3/lib/tempest19:55
sean-k-mooneyso i need ot update that again anyway, i can do that if you liek but tempest is normmaly only installed on the contoler where the spice/serial proxy runs19:55
gmannsean-k-mooney: I think you should add "$NOVA_SPICE_ENABLED" != False also like done in lib/nova19:56
sean-k-mooneyif i add || [ "$NOVA_SPICE_ENABLED" != False ]19:56
sean-k-mooneywill it break on the compute19:56
sean-k-mooneyon i guess tempet wont be enabeld on the compute19:56
sean-k-mooneyand if it was then it would be correct to do this ther too19:56
sean-k-mooneyok ya ill fix that19:56
sean-k-mooneygmann: ok https://review.opendev.org/c/openstack/devstack/+/940838 should work i think19:59
gmannsean-k-mooney: yup. looks good20:00
sean-k-mooneyok ill recheck mikal patch then to get a run with that enabled20:00
sean-k-mooneyonce i find the tab...20:00
sean-k-mooneycool its running again20:01
gmannthanks20:02
sean-k-mooney39 itmes in check, my guess is we will have results in about 3 hours, maybe 2 if we get lucky with ci capsity20:03
sean-k-mooneymikal: feel free to recheck or update any or my patches as needed. ill proably finish in the ned 30 mins or so20:04
opendevreviewSylvain Bauza proposed openstack/nova master: Fix verifying all the alloc requests from a multi-create  https://review.opendev.org/c/openstack/nova/+/84678620:47
mikalsean-k-mooney: no worries, thanks yet again.21:09
sean-k-mooneyits runnng tempet by the way21:10
sean-k-mooneywe are just waiting for it to report back now so my estimate is off it will be doen in less then an hour form now maybe 20 mins21:10
sean-k-mooneyi also never finish when i intened too...21:11
sean-k-mooneymikal: ... 2025-02-10 21:14:19.686874 | controller | {3} setUpClass (tempest.api.compute.servers.test_spice.SpiceDirectConsoleTestJSON) ... SKIPPED: SPICE console feature is disabled.21:18
sean-k-mooneyso either i messed up this ran with an old verison of the devstack plugin21:19
sean-k-mooneylets let is complete and we can check what is wrong21:19
sean-k-mooneyits still runing temepst i just saw that in the job log21:19

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!