*** elodilles is now known as elodilles_pto | 06:33 | |
*** tkajinam is now known as Guest8639 | 09:04 | |
sean-k-mooney | gibi: i forgot to hit send, the numa code and pci code expect that it's working with an upleveled object, at least for the instance numa objects | 10:54 |
sean-k-mooney | so i disagree that that's how nova was designed | 10:55 |
sean-k-mooney | it might have been how dansmith intended we use ovo in nova | 10:55 |
sean-k-mooney | but in practice it's not the mental model i or others had when writing the numa features | 10:55 |
sean-k-mooney | we always wrote the code assuming the object would be version matched to the current process's class definition | 10:58 |
gibi | sean-k-mooney: yeah I would like to use our ovos in a way that if a field is defined in the `fields` dict of the class then i) the ovo has that field always ii) the ovo hides all the necessary version forward (or backward) migration logic from the caller | 11:00 |
gibi | This is not the first time such assumption of mine gets invalidated | 11:01 |
sean-k-mooney | we should never lose data by upleveling (unless we stop supporting a feature) | 11:01 |
sean-k-mooney | i don't think it's invalid | 11:01 |
sean-k-mooney | i think dansmith has been overcome by events | 11:01 |
sean-k-mooney | i.e. even if their viewpoint was the original intent, it's not how we have been using them in the code for years | 11:02 |
sean-k-mooney | migration is fine by the way because we pass data with dedicated objects for live migration | 11:02 |
sean-k-mooney | so we're not passing an instance numa topology object from one compute to another | 11:03 |
sean-k-mooney | we are always getting it from the conductor | 11:03 |
sean-k-mooney | or constructing it locally | 11:03 |
sean-k-mooney | the conductor will backlevel it if the compute is older, but the compute can use the latest version it knows about in all cases | 11:04 |
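The "conductor backlevels, compute uses its latest known version" pattern described above can be sketched roughly as follows. This is a deliberately simplified stand-in, not the real oslo.versionedobjects API: the class name, field names, and version numbers are all hypothetical.

```python
# Simplified model of object backleveling: when sending an object to an
# older compute, the conductor drops any fields that version doesn't know.
class FakeNUMATopology:
    # version -> set of fields known at that version (hypothetical)
    FIELDS_BY_VERSION = {
        "1.0": {"cells"},
        "1.1": {"cells", "emulator_threads_policy"},
    }
    VERSION = "1.1"

    def __init__(self, **kwargs):
        self.data = dict(kwargs)

    def backlevel(self, target_version):
        """Drop fields an older consumer does not know about."""
        allowed = self.FIELDS_BY_VERSION[target_version]
        return {k: v for k, v in self.data.items() if k in allowed}

obj = FakeNUMATopology(cells=[0, 1], emulator_threads_policy="share")
print(obj.backlevel("1.0"))  # -> {'cells': [0, 1]}
```

Note this also illustrates sean-k-mooney's point that backleveling can lose data (the 1.1-only field is dropped), which is why the compute itself always works with the newest version it understands.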
gibi | I see | 11:04 |
sean-k-mooney | oh i did push my comments on the follow up https://github.com/openstack/nova/blob/e27bbe72e0d293e55c30d4f90ca0afcf47427419/nova/compute/manager.py#L955-L958 | 11:05 |
sean-k-mooney | cool i was going to check _get_numa_constraints in hardware.py to see what we set it to | 11:06 |
sean-k-mooney | just since you might not know this. we use the instance numa topology in 2 ways. | 11:06 |
sean-k-mooney | for the scheduler we partially populate it with the "requests" from the flavor/image | 11:07 |
sean-k-mooney | and then later we actually fully populate it with the actual cores/numa assignments on the compute node | 11:07 |
sean-k-mooney | so in the scheduler we just have "i would like 2 numa nodes please" and on the compute we actually also record "sure you can have 1 and 3" | 11:08 |
sean-k-mooney | that may or may not be relevant, just want to let you know that the api/scheduler use only a subset of the fields. we could have had 2 objects but we didn't for historical reasons i don't remember | 11:10 |
sean-k-mooney | well no, the intent was to eventually have the scheduler/conductor do the numa claim in the db, but we never did that, so the idea was we would pass the fully populated numa topology object to the compute instead of claiming the specific cores etc. on the compute | 11:12 |
sean-k-mooney | but we decided to build placement instead so never got around to that | 11:13 |
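The two-phase use of the instance NUMA topology described above (scheduler carries only the request, compute later records the actual host assignment) might be modelled like this; `NUMACell` and its fields are hypothetical illustrations, not nova's real objects.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NUMACell:
    requested: bool = True
    host_node: Optional[int] = None  # only filled in on the compute node

def schedule_request(n_cells):
    # scheduler phase: just "i would like N numa nodes please"
    return [NUMACell() for _ in range(n_cells)]

def claim_on_compute(cells, host_nodes):
    # compute phase: record the actual host NUMA node assignment
    for cell, node in zip(cells, host_nodes):
        cell.host_node = node
    return cells

cells = claim_on_compute(schedule_request(2), [1, 3])
print([c.host_node for c in cells])  # -> [1, 3]
```

The same object serving both phases (rather than two separate objects) matches the historical design sean-k-mooney describes.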
gibi | we have a case to return cpu_policy None from https://github.com/openstack/nova/blob/e27bbe72e0d293e55c30d4f90ca0afcf47427419/nova/virt/hardware.py#L1562 | 11:14 |
sean-k-mooney | https://github.com/openstack/nova/blob/e27bbe72e0d293e55c30d4f90ca0afcf47427419/nova/virt/hardware.py#L2294 | 11:14 |
sean-k-mooney | yep that's exactly what i remembered and wanted to confirm | 11:15 |
gibi | anyhow I pushed a fix that considers None the same as SHARED | 11:15 |
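The fix gibi mentions (treating a cpu_policy of None the same as SHARED) amounts to a normalization along these lines. The helper name here is hypothetical; the policy strings are written to match nova's CPUAllocationPolicy values, but this is a sketch, not the actual patch.

```python
# Hedged sketch: map an unset (None) cpu_policy to the default "shared"
# policy, so callers don't have to special-case None.
CPU_POLICY_SHARED = "shared"
CPU_POLICY_DEDICATED = "dedicated"

def effective_cpu_policy(policy):
    """Treat an unset policy the same as the shared policy."""
    return CPU_POLICY_SHARED if policy is None else policy

assert effective_cpu_policy(None) == CPU_POLICY_SHARED
assert effective_cpu_policy(CPU_POLICY_DEDICATED) == CPU_POLICY_DEDICATED
```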
sean-k-mooney | yep i saw | 11:16 |
gibi | it seems we don't have the bot active posting the incoming patchsets | 11:16 |
sean-k-mooney | i still got the emails | 11:16 |
sean-k-mooney | i think there was a gerrit restart at the weekend, maybe the bot needs to be restarted? | 11:16 |
gibi | yeah | 11:17 |
sean-k-mooney | the last review email was from friday as far as i can tell | 11:17 |
sean-k-mooney | any idea how to do that? | 11:17 |
gibi | nope | 11:18 |
gibi | probably infra knows | 11:18 |
sean-k-mooney | ok i was going to say let's just ask there | 11:18 |
sean-k-mooney | if i open a chat to opendevreview it does not prompt with anything so i guess it's not controlled like that | 11:21 |
opendevreview | Rajesh Tailor proposed openstack/nova master: Add support for showing image properties in server show response https://review.opendev.org/c/openstack/nova/+/939649 | 13:59 |
gibi | hey the bot is back \o/ | 14:06 |
sean-k-mooney | yep fungi restarted it. there was an unplanned gerrit outage over the weekend i think and it just didn't reconnect | 14:06 |
fungi | yeah, if gerrit vanishes like happened that time (nova spontaneously stopped the vm, still waiting to hear back from the provider on a root cause), the client doesn't notice the server has gone away and continues listening on a tcp connection that will never receive new packets | 14:08 |
sean-k-mooney | nova only updates it to stopped if it sees it stopped on the libvirt side | 14:09 |
sean-k-mooney | or if it received an api request to do so | 14:09 |
sean-k-mooney | so probably an issue on the underlying host, like an OOM kill, would be my guess | 14:09 |
sean-k-mooney | if they can't root cause it, tell them to file a nova bug and let us know and we can maybe point them in the right direction | 14:10 |
bauzas | hmmmpffff, I got a similar nova-next and nova-multicell failure for my patch | 15:10 |
bauzas | https://6944f33f3ef66c20bc22-320386062a8fef96051148fe5e7af6b1.ssl.cf5.rackcdn.com/940642/5/check/nova-next/c60809e/testr_results.html | 15:10 |
bauzas | https://8ea52cf6ed28836394cc-227b19b7b0e554c64ba32d661a445fcc.ssl.cf1.rackcdn.com/940642/5/check/nova-multi-cell/13c8443/testr_results.html | 15:11 |
bauzas | looks like our lordy gate is b0rken | 15:11 |
MengyangZhang[m] | <sean-k-mooney> "Mengyang Zhang: i suspect the..." <- Hi Sean, after giving another thought about this. I don't think we need to add this min version check. Migrating instances between compute hosts with different versions should be fine. | 15:43 |
MengyangZhang[m] | During the live migration process, pre_live_migration is called on the source host, which calls the Cinder API to re-create volume attachments, updating connection_info and attachment_id on all volume BDMS to reflect the new destination host attachment and returns the information as migrate_data. Then nova performs live migration by calling LibvirtDriver(driver.ComputeDriver)'s live_migration method to execute the live | 15:43 |
MengyangZhang[m] | migration, utilizing migrate_data to update the current XML. | 15:43 |
opendevreview | ribaudr proposed openstack/nova master: Add managed flag to PCI device specification https://review.opendev.org/c/openstack/nova/+/937649 | 15:49 |
opendevreview | ribaudr proposed openstack/nova master: Update driver to deal with managed flag https://review.opendev.org/c/openstack/nova/+/938405 | 15:49 |
opendevreview | Fabian Wiesel proposed openstack/nova master: WIP: Re-create volume backed root before destroy https://review.opendev.org/c/openstack/nova/+/941095 | 15:49 |
bauzas | good news is that apparently nova-next runs are not really stopped https://zuul.openstack.org/builds?job_name=nova-next&skip=0 | 16:03 |
sean-k-mooney | stopped? | 16:04 |
sean-k-mooney | by what? | 16:04 |
sean-k-mooney | oh you saw a few failures | 16:04 |
bauzas | sean-k-mooney: see my above comment ^ | 16:04 |
sean-k-mooney | MengyangZhang[m]: migration would fail in pre-livemigrate | 16:05 |
sean-k-mooney | actually no it would not | 16:05 |
sean-k-mooney | MengyangZhang[m]: migrating from an upgraded compute to a non upgraded one will generate the correct config | 16:06 |
sean-k-mooney | but then if you hard reboot the qos config would be lost | 16:06 |
sean-k-mooney | so we would need to have the migration fail in that case | 16:07 |
opendevreview | Merged openstack/nova master: api: Allow min/max_version arguments to response https://review.opendev.org/c/openstack/nova/+/936364 | 16:46 |
opendevreview | Merged openstack/nova master: trivial: Remove legacy API artifact https://review.opendev.org/c/openstack/nova/+/937377 | 16:46 |
mikal | Was there a zuul outage yesterday or something? CI doesn't seem to have run on https://review.opendev.org/c/openstack/nova/+/924844 after about 10 hours. | 18:50 |
sean-k-mooney | there was yesterday at some point i think | 19:23 |
sean-k-mooney | not today that i'm aware of | 19:23 |
sean-k-mooney | mikal: it looks like the last build for that was https://zuul.openstack.org/buildset/5704a9aeddb44126966dad2b267a6263 | 19:27 |
sean-k-mooney | so around "2025-02-09 00:09:08" | 19:28 |
sean-k-mooney | mikal: i see your recheck from 30 mins ago but i don't see anything in zuul for that | 19:29 |
sean-k-mooney | i wonder is there a merge conflict with the depends-on against my patch | 19:30 |
mikal | Yeah, I did that recheck because I couldn't find any evidence that zuul had actually run a check on the latest patchset. | 19:32 |
mikal | That recheck did cause the vmware test to run at least, but I agree I can't see anything in zuul. | 19:32 |
sean-k-mooney | let me rebase my patch quickly | 19:33 |
sean-k-mooney | v4 is not based on the latest version of its parent https://review.opendev.org/c/openstack/nova/+/940873/4 | 19:33 |
mikal | I fixed a typo in one of yours to remove a stray pep8 error in my test results. I wonder if that is the problem? | 19:33 |
opendevreview | sean mooney proposed openstack/nova master: Dont deploy n-spice on compute nodes. https://review.opendev.org/c/openstack/nova/+/940873 | 19:33 |
sean-k-mooney | maybe, when you did that you did not update the follow up patch, which is the one you're referencing | 19:34 |
mikal | Yeah, I did it quickly in the web UI and didn't really think it through... | 19:34 |
sean-k-mooney | ok now its running | 19:35 |
sean-k-mooney | that was all that was wrong it seems | 19:35 |
sean-k-mooney | https://zuul.openstack.org/status?change=924844 | 19:36 |
mikal | Oh nice. Thank you. | 19:36 |
mikal | gmann left some comments on the tempest patch, so I'll address those today now that I have a test run to observe. | 19:36 |
sean-k-mooney | ya i clicked into it but didn't look at the code yet | 19:37 |
sean-k-mooney | i don't think either of the issues will prevent the test from running but the feedback makes sense | 19:38 |
sean-k-mooney | at least with my almost 0 knowledge of tempest conventions, if gmann says it should be done a certain way i tend to agree :) | 19:38 |
mikal | I see your zero knowledge and raise you negative knowledge! | 19:39 |
mikal | I just cargo culted the previous VNC test. | 19:39 |
sean-k-mooney | well you actually implemented part of the spice handshake it seems | 19:41 |
sean-k-mooney | so you probably deserve more credit than you're giving yourself | 19:41 |
mikal | Heh. Well yeah, the scaffolding is all cargo culted. The SPICE protocol bit was a cut and paste from the Kerbside code and some tweaks. | 19:42 |
JayF | good artists borrow; great artists steal :) | 19:42 |
mikal | Heh | 19:43 |
mikal | sean-k-mooney: has there been any more discussion of an os-traits release? I think that blocks the other patchset for now. | 19:43 |
sean-k-mooney | are the traits patches merged | 19:44 |
gmann | sean-k-mooney: commented in devstack change to enable tempest config option also, https://review.opendev.org/c/openstack/devstack/+/940838 | 19:45 |
sean-k-mooney | https://review.opendev.org/c/openstack/os-traits/+/940418 | 19:45 |
sean-k-mooney | gmann: did you see my nova job changes? | 19:46 |
gmann | sean-k-mooney this one? https://review.opendev.org/c/openstack/nova/+/940873 | 19:46 |
sean-k-mooney | yep | 19:46 |
gmann | yeah, but that does not set tempest config | 19:47 |
sean-k-mooney | so that is what enabled the testing in the nova check pipeline | 19:47 |
sean-k-mooney | right, so i'll update devstack with the change you suggested | 19:47 |
gmann | either you can set the tempest option in the job too, or better, have devstack set it based on nova configuration | 19:47 |
sean-k-mooney | i was originally thinking of configuring that in the job as a followup | 19:47 |
gmann | i see | 19:47 |
sean-k-mooney | but im fine with devstack | 19:47 |
gmann | that will be easy if we want to test it in more jobs or so | 19:48 |
sean-k-mooney | sure, so i don't necessarily want to kick the current run out of the gate; on the other hand the job has not started yet, it's still queued | 19:49 |
sean-k-mooney | so will i just do it now or do it after the current execution completes so we have one test run? | 19:49 |
sean-k-mooney | i guess in its current state it won't run the new test | 19:50 |
sean-k-mooney | so it's probably better to update devstack now | 19:50 |
gmann | yeah | 19:50 |
gmann | at least to merge the tempest tests (which is almost ready), we need to see test running somewhere. | 19:51 |
sean-k-mooney | yep, do you have an example of how we set this in devstack normally | 19:52 |
sean-k-mooney | is it in lib/tempest or lib/nova | 19:52 |
mikal | sean-k-mooney: no, the os-traits change hasn't seen any more review recently. | 19:52 |
sean-k-mooney | ah found it | 19:52 |
gmann | sean-k-mooney: here https://github.com/openstack/devstack/blob/master/lib/tempest#L417 | 19:53 |
gmann | sean-k-mooney: you can add both condition 'is_service_enabled n-spice || [ "$NOVA_SPICE_ENABLED" != False ]' there same as how nova is configuring | 19:53 |
sean-k-mooney | ... whitespace https://review.opendev.org/c/openstack/devstack/+/940838/3/lib/tempest | 19:55 |
sean-k-mooney | so i need to update that again anyway, i can do that if you like, but tempest is normally only installed on the controller where the spice/serial proxy runs | 19:55 |
gmann | sean-k-mooney: I think you should add "$NOVA_SPICE_ENABLED" != False also like done in lib/nova | 19:56 |
sean-k-mooney | if i add || [ "$NOVA_SPICE_ENABLED" != False ] | 19:56 |
sean-k-mooney | will it break on the compute | 19:56 |
sean-k-mooney | oh i guess tempest won't be enabled on the compute | 19:56 |
sean-k-mooney | and if it was then it would be correct to do this there too | 19:56 |
sean-k-mooney | ok ya i'll fix that | 19:56 |
sean-k-mooney | gmann: ok https://review.opendev.org/c/openstack/devstack/+/940838 should work i think | 19:59 |
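The guard being discussed (enable the tempest SPICE option when the n-spice service is enabled or NOVA_SPICE_ENABLED is set) can be sketched as plain shell. Here `is_service_enabled` is stubbed with a trivial substring check (the real devstack helper is more involved), and the `echo` stands in for devstack's `iniset` write to tempest.conf.

```shell
#!/usr/bin/env bash
# Defaults mirroring a minimal devstack-like environment (hypothetical values).
NOVA_SPICE_ENABLED=${NOVA_SPICE_ENABLED:-False}
ENABLED_SERVICES=${ENABLED_SERVICES:-"n-api,n-cpu"}

# Stub of devstack's is_service_enabled: succeed if the service name
# appears in the comma-separated ENABLED_SERVICES list.
is_service_enabled() {
    [[ ",$ENABLED_SERVICES," == *",$1,"* ]]
}

# The condition gmann suggested, matching how lib/nova gates spice setup.
if is_service_enabled n-spice || [ "$NOVA_SPICE_ENABLED" != False ]; then
    echo "spice_console=True"   # real devstack would iniset tempest.conf here
else
    echo "spice_console=False"
fi
```

With the defaults above neither condition holds, so the tempest option stays disabled; enabling the n-spice service or setting NOVA_SPICE_ENABLED flips it on.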
gmann | sean-k-mooney: yup. looks good | 20:00 |
sean-k-mooney | ok ill recheck mikal patch then to get a run with that enabled | 20:00 |
sean-k-mooney | once i find the tab... | 20:00 |
sean-k-mooney | cool its running again | 20:01 |
gmann | thanks | 20:02 |
sean-k-mooney | 39 items in check, my guess is we will have results in about 3 hours, maybe 2 if we get lucky with ci capacity | 20:03 |
sean-k-mooney | mikal: feel free to recheck or update any of my patches as needed. i'll probably finish in the next 30 mins or so | 20:04 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Fix verifying all the alloc requests from a multi-create https://review.opendev.org/c/openstack/nova/+/846786 | 20:47 |
mikal | sean-k-mooney: no worries, thanks yet again. | 21:09 |
sean-k-mooney | it's running tempest by the way | 21:10 |
sean-k-mooney | we are just waiting for it to report back now, so my estimate was off; it will be done in less than an hour from now, maybe 20 mins | 21:10 |
sean-k-mooney | i also never finish when i intend to... | 21:11 |
sean-k-mooney | mikal: ... 2025-02-10 21:14:19.686874 | controller | {3} setUpClass (tempest.api.compute.servers.test_spice.SpiceDirectConsoleTestJSON) ... SKIPPED: SPICE console feature is disabled. | 21:18 |
sean-k-mooney | so either i messed up or this ran with an old version of the devstack plugin | 21:19 |
sean-k-mooney | let's let it complete and we can check what is wrong | 21:19 |
sean-k-mooney | it's still running tempest, i just saw that in the job log | 21:19 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!