*** tjgresha has quit IRC | 00:00 | |
*** betherly has quit IRC | 00:04 | |
*** tetsuro has joined #openstack-nova | 00:14 | |
*** tbachman has quit IRC | 00:23 | |
*** dakshina-ilangov has quit IRC | 00:24 | |
*** lbragstad has quit IRC | 00:45 | |
openstackgerrit | Merged openstack/nova stable/rocky: Replace openstack.org git:// URLs with https:// https://review.openstack.org/646686 | 00:46 |
---|---|---|
*** tbachman has joined #openstack-nova | 00:51 | |
*** tetsuro has quit IRC | 00:58 | |
*** Cardoe has quit IRC | 01:01 | |
*** medberry has joined #openstack-nova | 01:01 | |
*** Cardoe has joined #openstack-nova | 01:02 | |
*** gyee has quit IRC | 01:03 | |
*** medberry has quit IRC | 01:05 | |
*** betherly has joined #openstack-nova | 01:11 | |
*** whoami-rajat has joined #openstack-nova | 01:12 | |
*** tiendc has joined #openstack-nova | 01:13 | |
*** betherly has quit IRC | 01:16 | |
*** mriedem has quit IRC | 01:18 | |
*** openstackgerrit has quit IRC | 01:30 | |
*** nicolasbock has quit IRC | 01:45 | |
*** openstackgerrit has joined #openstack-nova | 01:51 | |
openstackgerrit | melanie witt proposed openstack/nova master: Use futurist.GreenThreadPoolExecutor in scatter_gather_cells https://review.openstack.org/650172 | 01:51 |
*** betherly has joined #openstack-nova | 01:53 | |
*** betherly has quit IRC | 01:57 | |
*** hongbin has joined #openstack-nova | 02:29 | |
*** owalsh_ has joined #openstack-nova | 02:34 | |
*** owalsh has quit IRC | 02:34 | |
*** cfriesen has quit IRC | 02:37 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova master: Use the functional test test_parallel_evacuate_with_server_group https://review.openstack.org/649963 | 02:44 |
*** betherly has joined #openstack-nova | 02:45 | |
openstackgerrit | Li Zhouzhou proposed openstack/nova stable/queens: SRIOV pci_numa_policy dosen't working when create instance with 'cpu_policy' and 'numa_nodes https://review.openstack.org/651429 | 02:45 |
*** betherly has quit IRC | 02:50 | |
*** psachin has joined #openstack-nova | 03:01 | |
*** mmethot has quit IRC | 03:09 | |
*** mmethot has joined #openstack-nova | 03:09 | |
*** udesale has joined #openstack-nova | 03:26 | |
*** ricolin has joined #openstack-nova | 03:31 | |
*** hongbin has quit IRC | 03:42 | |
*** imacdonn_ has quit IRC | 04:03 | |
*** imacdonn_ has joined #openstack-nova | 04:03 | |
*** sridharg has joined #openstack-nova | 04:04 | |
*** dave-mccowan has quit IRC | 04:15 | |
*** betherly has joined #openstack-nova | 04:18 | |
*** betherly has quit IRC | 04:23 | |
*** openstackstatus has quit IRC | 04:35 | |
*** openstackstatus has joined #openstack-nova | 04:36 | |
*** ChanServ sets mode: +v openstackstatus | 04:36 | |
*** tetsuro has joined #openstack-nova | 04:59 | |
*** betherly has joined #openstack-nova | 05:00 | |
*** ratailor has joined #openstack-nova | 05:01 | |
*** betherly has quit IRC | 05:04 | |
*** pcaruana has joined #openstack-nova | 05:06 | |
*** Luzi has joined #openstack-nova | 05:17 | |
*** ricolin has quit IRC | 05:30 | |
*** sidx64 has joined #openstack-nova | 05:42 | |
*** shilpasd has joined #openstack-nova | 05:45 | |
*** gerrykopec_ has quit IRC | 05:53 | |
kaisers | efried: Thanks for looking. I'm busy with other stuff atm but will follow up on your feedback later on! | 06:15 |
*** janki has joined #openstack-nova | 06:19 | |
*** bhagyashris_ has joined #openstack-nova | 06:33 | |
*** ralonsoh has joined #openstack-nova | 06:36 | |
*** luksky has joined #openstack-nova | 06:45 | |
*** liuyulong_ has joined #openstack-nova | 06:50 | |
*** slaweq has joined #openstack-nova | 07:00 | |
*** rpittau|afk is now known as rpittau | 07:05 | |
*** awalende has joined #openstack-nova | 07:08 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove deprecated 'default_flavor' config option https://review.openstack.org/645476 | 07:09 |
*** tesseract has joined #openstack-nova | 07:09 | |
*** tesseract has quit IRC | 07:10 | |
*** tesseract has joined #openstack-nova | 07:10 | |
*** ivve has joined #openstack-nova | 07:12 | |
*** owalsh_ is now known as owalsh | 07:17 | |
* alex_xu very much enjoys replying to the comments from sean-k-mooney | 07:24 | |
*** tssurya has joined #openstack-nova | 07:26 | |
*** tosky has joined #openstack-nova | 07:28 | |
*** psachin has quit IRC | 07:32 | |
*** pcaruana has quit IRC | 07:34 | |
*** pcaruana has joined #openstack-nova | 07:35 | |
*** psachin has joined #openstack-nova | 07:35 | |
*** lpetrut has joined #openstack-nova | 07:41 | |
*** boxiang has quit IRC | 07:45 | |
*** ccamacho has joined #openstack-nova | 07:45 | |
*** boxiang has joined #openstack-nova | 07:46 | |
openstackgerrit | Adrian Chiris proposed openstack/nova master: libvirt: auto detach/attach sriov ports on migration https://review.openstack.org/629589 | 07:47 |
sean-k-mooney | alex_xu: which comment was that? | 08:00 |
sean-k-mooney | i didn't get time to review as many specs yesterday as i would have liked, but i assume one of the cpu frequency specs? | 08:00 |
sean-k-mooney | alex_xu: ah i see you replied on https://review.openstack.org/#/c/649882/ i'll take another look in an hour or so | 08:02 |
*** tkajinam has quit IRC | 08:04 | |
*** priteau has joined #openstack-nova | 08:05 | |
*** phasespace has quit IRC | 08:08 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova-specs master: Add host and hypervisor_hostname flag to create server https://review.openstack.org/645458 | 08:09 |
*** luksky has quit IRC | 08:11 | |
*** evrardjp has quit IRC | 08:18 | |
*** evrardjp has joined #openstack-nova | 08:19 | |
*** Luzi has quit IRC | 08:21 | |
*** ttsiouts has joined #openstack-nova | 08:27 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova-specs master: Add host and hypervisor_hostname flag to create server https://review.openstack.org/645458 | 08:27 |
*** ricolin has joined #openstack-nova | 08:30 | |
*** derekh has joined #openstack-nova | 08:35 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova-specs master: Add host and hypervisor_hostname flag to create server https://review.openstack.org/645458 | 08:44 |
*** luksky has joined #openstack-nova | 08:50 | |
*** liuyulong_ has quit IRC | 08:50 | |
alex_xu | sean-k-mooney: thanks! | 08:58 |
*** dtantsur|afk is now known as dtantsur | 08:58 | |
alex_xu | sean-k-mooney: I mean all of them, they are good comments :) | 08:58 |
alex_xu | sean-k-mooney: I saw you like the extra spec defined in https://review.openstack.org/#/c/651024/5/specs/train/approved/rmd-plugin-energy-efficiency-core-p-states-control.rst@310, but I don't quite understand it. It would be great if you had an example for it; if it is good, I'm happy to change as well | 09:02 |
alex_xu | for now, I need to go offline to pick up my daughter | 09:02 |
sean-k-mooney | ok o/ | 09:02 |
alex_xu | thanks! | 09:02 |
sean-k-mooney | and ya i like it more as it's a finite set | 09:03 |
sean-k-mooney | using arbitrary traits makes me a little uneasy | 09:03 |
alex_xu | ah, i see now | 09:03 |
sean-k-mooney | it could be fine just need to get used to the idea | 09:03 |
alex_xu | but I don't want to assume there are only two priorities (high and low), the user may have their own configuration, they can have some custom traits | 09:04 |
alex_xu | anyway, I have to go, catch up with you later | 09:04 |
sean-k-mooney | they may, but that is an interoperability headache between clouds. sure, don't be late for dad taxi service :) | 09:05 |
*** ttsiouts has quit IRC | 09:18 | |
*** ttsiouts has joined #openstack-nova | 09:19 | |
*** ttsiouts has quit IRC | 09:23 | |
*** tetsuro has quit IRC | 09:24 | |
*** phasespace has joined #openstack-nova | 09:30 | |
*** ociuhandu has quit IRC | 09:35 | |
*** rchurch has quit IRC | 09:44 | |
*** rchurch has joined #openstack-nova | 09:45 | |
kashyap | alex_xu: sfinucan: After some more thinking last night, I'm updating that `cpu_model_list` spec a bit more | 09:46 |
kashyap | Err, stephenfin ^ :-) | 09:47 |
kashyap | stephenfin: On the name, I'm responding on the change...but quickly here: I didn't even think of the "Hungarian Notation" until you mentioned :-0 | 09:47 |
kashyap | But I see your point, though. | 09:47 |
*** rcernin has quit IRC | 09:47 | |
kashyap | IMHO, it's just unambiguously clear to go with the _list; instead of playing with aliases and such. | 09:48 |
stephenfin | kashyap: Yeah, I really don't think it's necessary. oslo.config is powerful enough to let us handle the migration issues, as I noted | 09:48 |
kashyap | I've seen far too many such subtle and confusing mistakes in config attributes in the past. :-( | 09:48 |
stephenfin | Perhaps, but you're going to have to use aliases anyway, assuming you want to remove the current value at some point in the future | 09:48 |
sean-k-mooney | kashyap: using _list may be clear but it breaks convention | 09:49 |
kashyap | I see. It's not merely technical. But I'm still not convinced from the human / operator "visual clarity" angle | 09:49 |
kashyap | sean-k-mooney: What convention? We should break "conventions" when it makes sense. | 09:49 |
stephenfin | I think that matters less than you think. People don't set nova.conf files, tools do | 09:49 |
sean-k-mooney | the convention that we just use the plural form to denote lists in the config | 09:49 |
stephenfin | Be that TripleO, Kolla, or something else | 09:49 |
kashyap | Yeah, realize it as much. I want to wait a bit more to convince myself | 09:50 |
sean-k-mooney | kashyap: i'm not arguing _list is wrong, i'm just saying i don't think it's that important if it's _list or plural | 09:50 |
kashyap | (I'm not convinced. I'll think a bit more and write on the review.) But there's more important bits in that spec to tackle, though | 09:51 |
stephenfin | sean-k-mooney: If I resize an instance, does that instance move host? | 09:51 |
kashyap | (The more I think about this topic, the more subtle questions I can think of...) | 09:51 |
stephenfin | s/does/can/ | 09:51 |
sean-k-mooney | not in all cases but it is allowed to move | 09:51 |
sean-k-mooney | rebuilds are always to the same host, but resize defaults to migration; you can enable resize to same host in the config, in which case we don't add the current host to the ignored host list | 09:52 |
*** boxiang has quit IRC | 09:52 | |
*** boxiang has joined #openstack-nova | 09:53 | |
*** bhagyashris_ has quit IRC | 09:54 | |
*** ttsiouts has joined #openstack-nova | 09:57 | |
sean-k-mooney | stephenfin: does that make sense ^ | 09:58 |
stephenfin | sean-k-mooney: It sure does | 09:58 |
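
The resize rule sean-k-mooney describes at 09:52 can be sketched in a few lines of Python. This is a simplified illustration rather than nova's actual scheduling code: the function name `resize_ignored_hosts` and its parameters are hypothetical, and the only real name borrowed here is the nova.conf option `allow_resize_to_same_host`.

```python
def resize_ignored_hosts(current_host, allow_resize_to_same_host=False):
    """Return the hosts a resize request should skip (illustrative only).

    By default a resize behaves like a migration, so the instance's
    current host is excluded. When resizing to the same host is enabled
    in the config, the current host is not added to the ignored list.
    """
    if allow_resize_to_same_host:
        return []
    return [current_host]


# Hypothetical host name, used only for this example.
print(resize_ignored_hosts("compute-1"))
print(resize_ignored_hosts("compute-1", allow_resize_to_same_host=True))
```

Note that an empty ignored list only means the current host is a candidate again; it does not force the resize to stay there, which matches "not in all cases but it is allowed to move".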
*** ttsiouts has quit IRC | 10:07 | |
*** tiendc has quit IRC | 10:08 | |
*** tbachman has quit IRC | 10:13 | |
*** tssurya has quit IRC | 10:31 | |
*** cdent has joined #openstack-nova | 10:33 | |
*** brinzhang has joined #openstack-nova | 10:48 | |
*** erlon has joined #openstack-nova | 10:48 | |
*** tbachman has joined #openstack-nova | 10:50 | |
*** nicolasbock has joined #openstack-nova | 10:52 | |
yaawang | bauzas: stephenfin Thanks for your review. I agree with your comments about not exposing the driver's features in the REST api. So how about putting these into the flavor extra flags and image metadata? Use these properties to determine whether to use 'auto converge' or 'post copy' during live migration. | 11:00 |
*** jamesdenton has quit IRC | 11:03 | |
*** nowster has joined #openstack-nova | 11:06 | |
*** ttsiouts has joined #openstack-nova | 11:06 | |
*** tssurya has joined #openstack-nova | 11:08 | |
*** brinzhang has quit IRC | 11:11 | |
*** erlon has quit IRC | 11:16 | |
*** jamesdenton has joined #openstack-nova | 11:20 | |
trident | Are there any specific reasons why block migration is not implemented with the LVM backend? Any specific issues with getting it to work or just that nobody has implemented support for it? | 11:22 |
*** udesale has quit IRC | 11:24 | |
*** udesale has joined #openstack-nova | 11:24 | |
*** cdent_ has joined #openstack-nova | 11:29 | |
*** cdent has quit IRC | 11:31 | |
*** cdent_ is now known as cdent | 11:31 | |
*** aloga has quit IRC | 11:35 | |
*** aloga has joined #openstack-nova | 11:36 | |
*** sidx64 has quit IRC | 11:39 | |
kashyap | stephenfin: Idea: Is this any palatable: `acceptable_cpu_models`? (It closely captures the intended meaning _and_ we can avoid "Hungarian") | 11:47 |
*** dpawlik has joined #openstack-nova | 11:47 | |
kashyap | (I am wary of bike-shedding, while knowing that "naming is hard", so want to avoid mistakes when I can. :-)) | 11:47 |
kashyap | (Context to others: By "Hungarian", /me was referring to the "Hungarian Notation" -- https://en.wikipedia.org/wiki/Hungarian_notation) | 11:48 |
*** sidx64 has joined #openstack-nova | 11:50 | |
*** rabel has joined #openstack-nova | 11:57 | |
rabel | hi there. a hypervisor has a negative disk_available_least value but free space on the disk (local storage, qcow2 images), and now it does not allow live migrations to this hypervisor. is it sufficient to increase disk_allocation_ratio to make this hv available for live migration again? | 11:58 |
kashyap | Maybe `acceptable_cpu_models` is not acceptable; it breaks the "symmetry" of existing config options, which all start with: cpu_{mode, model|models|model_list, extra_flags} | 11:59 |
*** udesale has quit IRC | 11:59 | |
*** maciejjozefczyk has joined #openstack-nova | 11:59 | |
*** eharney has joined #openstack-nova | 12:06 | |
*** tbachman has quit IRC | 12:06 | |
*** chhagarw has joined #openstack-nova | 12:06 | |
*** sidx64 has quit IRC | 12:06 | |
*** ttsiouts has quit IRC | 12:08 | |
*** ttsiouts has joined #openstack-nova | 12:09 | |
*** sidx64 has joined #openstack-nova | 12:09 | |
*** priteau has quit IRC | 12:12 | |
*** ttsiouts has quit IRC | 12:13 | |
*** tbachman has joined #openstack-nova | 12:13 | |
*** priteau has joined #openstack-nova | 12:14 | |
sean-k-mooney | kashyap: enabled_cpu_models would be similar to enabled_filters for the schduler | 12:16 |
*** dave-mccowan has joined #openstack-nova | 12:17 | |
sean-k-mooney | but also i think both are verbose and we can use something shorter | 12:17 |
kashyap | A little verbose is OK, so long as it conveys intended meaning. I'll stick to `cpu_model_list` for now. It is just unambiguous. | 12:18 |
*** sidx64 has quit IRC | 12:19 | |
*** avolkov has joined #openstack-nova | 12:21 | |
sean-k-mooney | kashyap: it's also a choice that several people dislike, but as i said i don't care enough to rathole on it | 12:22 |
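
Whichever name wins, the alias-based migration stephenfin mentions at 09:48 boils down to "read the new option, fall back to the old one". The sketch below mimics that idea with the standard library only; it does not use oslo.config, and the function `read_cpu_models` is hypothetical. The new option name `cpu_models` is just one of the candidates from the discussion, assumed here for illustration.

```python
import configparser


def read_cpu_models(conf_text, section="libvirt",
                    new_name="cpu_models", old_name="cpu_model"):
    """Return the configured CPU models as a list (illustrative only).

    The new, list-valued option takes precedence; if it is absent, the
    old single-valued option is read as a one-element list.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section(section):
        return []
    if parser.has_option(section, new_name):
        raw = parser.get(section, new_name)
    elif parser.has_option(section, old_name):
        raw = parser.get(section, old_name)
    else:
        return []
    # Split on commas and drop empty entries and surrounding whitespace.
    return [item.strip() for item in raw.split(",") if item.strip()]


# An old-style config keeps working after the rename.
print(read_cpu_models("[libvirt]\ncpu_model = Haswell\n"))
```

A fallback like this is why the choice between `_list` and a plural name is mostly about readability: existing deployments keep working either way until the old name is removed.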
*** awalende has quit IRC | 12:26 | |
*** awalende has joined #openstack-nova | 12:26 | |
*** sidx64 has joined #openstack-nova | 12:31 | |
*** awalende has quit IRC | 12:31 | |
rabel | nvm my question. it seems we were hitting https://bugs.launchpad.net/nova/+bug/1708572 | 12:36 |
openstack | Launchpad bug 1708572 in OpenStack Compute (nova) "Unable to live-migrate : Disk of instance is too large" [High,Fix released] - Assigned to Matthew Booth (mbooth-9) | 12:36 |
*** udesale has joined #openstack-nova | 12:36 | |
*** tbachman has quit IRC | 12:37 | |
*** sidx64 has quit IRC | 12:38 | |
*** tbachman has joined #openstack-nova | 12:42 | |
*** awaugama has joined #openstack-nova | 12:45 | |
alex_xu | kashyap: in the initial spec discussion, we said we can't do the check without a hypervisor-literate api https://review.openstack.org/#/c/620959/5..9/specs/stein/approved/cpu-model-selection.rst@63 | 12:50 |
alex_xu | let me see what we changed now :) | 12:50 |
*** ricolin has quit IRC | 12:50 | |
* kashyap clicks and re-reads | 12:50 | |
*** ttsiouts has joined #openstack-nova | 12:51 | |
*** ricolin has joined #openstack-nova | 12:51 | |
*** lbragstad has joined #openstack-nova | 12:58 | |
*** udesale has quit IRC | 13:01 | |
*** lpetrut has quit IRC | 13:01 | |
*** mriedem has joined #openstack-nova | 13:02 | |
*** udesale has joined #openstack-nova | 13:03 | |
*** irclogbot_0 has joined #openstack-nova | 13:03 | |
kashyap | alex_xu: Being dragged into a meeting; will respond there | 13:05 |
*** sidx64 has joined #openstack-nova | 13:06 | |
*** altlogbot_1 has joined #openstack-nova | 13:07 | |
*** sidx64 has quit IRC | 13:08 | |
*** sidx64 has joined #openstack-nova | 13:09 | |
*** sidx64 has quit IRC | 13:10 | |
alex_xu | kashyap: no hurry | 13:11 |
*** dklyle has joined #openstack-nova | 13:17 | |
*** maciejjozefczyk has quit IRC | 13:19 | |
*** mvkr has quit IRC | 13:19 | |
*** maciejjozefczyk has joined #openstack-nova | 13:21 | |
*** maciejjozefczyk has quit IRC | 13:21 | |
sean-k-mooney | stephenfin: i think https://review.openstack.org/#/c/648123/4 looks good now | 13:28 |
sean-k-mooney | alex_xu: efried care to take a look ^ i think this is a fairly simple fix | 13:28 |
efried | sean-k-mooney: Wow, that has more words in the title that I don't understand than that I do. | 13:30 |
efried | sean-k-mooney: If you get desperate I'll take a look, but you should find someone more familiar with the subject matter. | 13:30 |
sean-k-mooney | efried: no worries | 13:31 |
alex_xu | sean-k-mooney: I can do tomorrow, I don't think I have clear mind now | 13:31 |
sean-k-mooney | alex_xu: specs have that effect | 13:31 |
alex_xu | hah :) | 13:32 |
*** sidx64 has joined #openstack-nova | 13:34 | |
*** phasespace has quit IRC | 13:36 | |
alex_xu | it is really bad to see jaypipes's strong -1 just after deciding to end the day :) | 13:38 |
*** sidx64 has quit IRC | 13:39 | |
alex_xu | anyway, I will leave it to tomorrow | 13:40 |
*** ratailor has quit IRC | 13:43 | |
*** tbachman has quit IRC | 13:47 | |
*** tbachman has joined #openstack-nova | 13:47 | |
artom | sean-k-mooney, you OK with the tests refactor being in the same patch as the actual fix? | 13:51 |
artom | I'm kinda leaning towards splitting that in two patches? | 13:51 |
*** mvkr has joined #openstack-nova | 13:53 | |
sean-k-mooney | artom: are you referring to https://review.openstack.org/#/c/648123/4 ? | 13:57 |
artom | sean-k-mooney, yeah. Just left a revirw | 13:58 |
artom | *review | 13:58 |
sean-k-mooney | artom: i did ask myself that question and the answer was "meh" i'll see if anyone else cares | 13:58 |
*** mlavalle has joined #openstack-nova | 13:58 | |
sean-k-mooney | for backporting it would likely be simpler to have them as separate patches i guess | 13:58 |
sean-k-mooney | because the test refactor was related to the change i was ok with it, but normally i would have preferred them to be split | 13:59 |
artom | sean-k-mooney, well I have a -1 for other small nits, we'll let others make that call for us :D | 14:01 |
artom | Since we both appear to be "meh" :) | 14:01 |
*** mchlumsky has joined #openstack-nova | 14:01 | |
sean-k-mooney | artom: on the api question there are times where this works today and it ends up in the api with the mac but not stats | 14:02 |
sean-k-mooney | artom: so i dont think its an api change | 14:03 |
artom | sean-k-mooney, that's what I had understood as well | 14:03 |
artom | And the test does assert one of the fields is None, so we're probably covered | 14:03 |
*** amodi has joined #openstack-nova | 14:04 | |
sean-k-mooney | oh i didnt put https://bugzilla.redhat.com/show_bug.cgi?id=1649688#c8 into the upstream bug | 14:04 |
openstack | bugzilla.redhat.com bug 1649688 in openstack-nova "[osp13] nova diagnostics command is not working with all instances" [High,Assigned] - Assigned to fpalin | 14:04 |
sean-k-mooney | actually there might be a slight api change in that before nic_details was empty and now it has a mac | 14:05 |
*** phasespace has joined #openstack-nova | 14:05 | |
sean-k-mooney | but i dont think that is versioned but good question | 14:06 |
artom | I don't think it'd have been empty before, rather an error 50 | 14:07 |
artom | 500 | 14:07 |
sean-k-mooney | no it was empty when i tried to reproduce the bug on my local system | 14:08 |
sean-k-mooney | if the vf has a netdev before it's plugged, libvirt added the target element and we did not raise an error but you had no data at all | 14:08 |
artom | Hrmm, in any case, None is still allowed in fields | 14:09 |
artom | So nic_details = None is valid, as is nic_detail = {mac: blah, rest of fields: None} | 14:09 |
sean-k-mooney | well we don't have any tests asserting the api behavior so i think there is some flexibility here. we can either say it's fine as it is and this was a bug that was fixed and it does not need a microversion, or there is no testing therefore it was broken by default and this fixed it :) | 14:13 |
*** janki has quit IRC | 14:13 | |
artom | Well we have an api-ref and a when-to-microversion flow chart | 14:13 |
sean-k-mooney | but i don't think it's being too cheeky to say it does not need a microversion | 14:13 |
artom | And fixing 500 errors is NOT a new microversion | 14:14 |
sean-k-mooney | we do | 14:14 |
sean-k-mooney | correct | 14:14 |
artom | I was just concerned the fields would be *missing* outright | 14:14 |
*** openstackgerrit has quit IRC | 14:14 | |
artom | Which is moot, because it doesn't appear to be the case, but would have required a new microversion, or more likely an updated patch that didn't have that problem | 14:14 |
sean-k-mooney | yep, well, this is why we have code review. it's a valid concern; i think we are ok however. | 14:15 |
* sean-k-mooney runs for lunch brb | 14:16 | |
*** slaweq has quit IRC | 14:16 | |
mriedem | boxiang: just a few small things to address in your spec here https://review.openstack.org/#/c/645458/ | 14:24 |
mriedem | but i know it's late so feel free to address those comments tomorrow | 14:24 |
*** openstackgerrit has joined #openstack-nova | 14:26 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Bump to hacking 1.1.0 https://review.openstack.org/651553 | 14:26 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hacking: Resolve E731 issues https://review.openstack.org/651554 | 14:26 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hacking: Resolve W503 issues https://review.openstack.org/651555 | 14:26 |
*** awalende has joined #openstack-nova | 14:27 | |
gibi | mriedem: hi! when you implemented heal allocation in nova-manage did you think about adding some kind of end-to-end test for it? I'm wondering how complex it would be to add something to grenade for the heal port allocation but I don't have enough grenade knowledge to see if it is feasible at all | 14:31 |
*** awalende has quit IRC | 14:31 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova master: Make evacuation respects anti-affinity rule https://review.openstack.org/649963 | 14:33 |
mriedem | when i implemented it (in rocky? queens?) we were only healing basic resource class allocations like VCPU/MEMORY_MB/DISK_GB and only if they didn't exist, which by that time wouldn't be a problem if using the FilterScheduler - the command was really built to get people off the CachingScheduler which didn't use placement, and we didn't have a grenade job that used the CachingScheduler, so no i didn't really put much thought into integration testing for that command | 14:33 |
gibi | mriedem: ack, thanks | 14:34 |
mriedem | we could have a post-test hook script run it from the nova-next job, but we'd have to fake the missing allocations by creating a server then manually removing its allocations and then running the command and verifying they were healed | 14:35 |
mriedem | which...we can just do that in functional in-tree tests | 14:35 |
gibi | mriedem: yeah, I'm doing that in the func test already | 14:35 |
mriedem | bw provider ports is a whole other ball of wax - can we even test those in the gate? | 14:35 |
mriedem | i mean we can test that you can create a min bw policy on a network and create ports on that network and make sure the allocations and such are all wired up | 14:36 |
mriedem | but i'm not sure we can test that the bw is actually limited within the guest in the gate | 14:36 |
mriedem | just making sure nova+neutron+placement are all doing what they are supposed to (from the gate) would be a good step anyway | 14:36 |
gibi | mriedem: yeah I think we have such tempest test on review | 14:37 |
gibi | let me look up the link | 14:37 |
*** cfriesen has joined #openstack-nova | 14:37 | |
gibi | mriedem: https://review.openstack.org/#/c/629253/ | 14:37 |
gibi | for the heal port allocation we would need to create a bound port in rocky that would need allocation in stein | 14:38 |
gibi | unfortunately in rocky only sriov ports supported min bw in neutron, and sriov is a no-no in the gate | 14:38 |
gibi | I can trick the system by adding the min bw policy to the port _after_ it was bound but that feels hackish already | 14:39 |
mriedem | agree | 14:39 |
gibi | I think I will stick to the functional test env for heal port allocation | 14:39 |
gibi | and I might do some manual test in a devstack where I have sriov | 14:40 |
*** sridharg has quit IRC | 14:42 | |
*** ttsiouts has quit IRC | 14:44 | |
*** ttsiouts has joined #openstack-nova | 14:45 | |
mriedem | melwitt: are you ok to re-approve the cross-cell resize spec https://review.openstack.org/#/c/642807/ ? I saw you were just +1. I'd like to get the blueprint re-approved so i can start putting the bottom tail of the (huge) series into runways to get some reviews going. | 14:46 |
*** ttsiouts has quit IRC | 14:47 | |
*** ttsiouts has joined #openstack-nova | 14:47 | |
*** artom has quit IRC | 14:50 | |
*** sidx64 has joined #openstack-nova | 14:50 | |
*** luksky has quit IRC | 14:51 | |
stephenfin | mriedem: Fancy revisiting this? https://review.openstack.org/#/c/650018/ | 14:53 |
*** slaweq has joined #openstack-nova | 14:54 | |
mriedem | "I've spoken with Lars Kurth (XenProject advisory board chair) and he agrees that at this point in time, this config can be removed." | 14:55 |
mriedem | so we have no CI for libvirt+xen | 14:55 |
mriedem | and no plans to | 14:55 |
mriedem | ok | 14:55 |
*** sidx64 has quit IRC | 14:57 | |
*** sidx64_ has joined #openstack-nova | 14:57 | |
mriedem | stephenfin: back at you https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bug/1823781 | 14:57 |
stephenfin | ack | 14:58 |
stephenfin | mriedem: Out of curiosity, does this look familiar to you? http://paste.openstack.org/show/P6AlRRpBWnYwFfWWVJFl/ It's a stable/rocky deployment | 14:59 |
mriedem | yes i think so, sec | 15:00 |
mriedem | actually no nvm, was thinking of something else pci related | 15:01 |
mriedem | it's blowing up on this? "n_id = compute_node.id" | 15:02 |
mriedem | i only see _setup_pci_tracker called with an existing compute node record where id should be set | 15:02 |
mriedem | unless you guys have a fork | 15:02 |
stephenfin | Not that I'm aware of. I had been doing some work in that area but none of that is merged yet :/ https://review.openstack.org/#/c/649559/ | 15:04 |
stephenfin | But yeah, looks like that. I don't know if that attribute should be set or not. I can't see why it wouldn't but all things Ironic really aren't my strong suit | 15:05 |
melwitt | mriedem: I was thinking dansmith would want to sign off on it | 15:06 |
* stephenfin is currently tracing paths that create/modify ComputeNode objects to see where this might be missed | 15:06 | |
*** tbachman has quit IRC | 15:08 | |
mriedem | stephenfin: i don't think that error has anything to do with ironic | 15:08 |
stephenfin | Perhaps, but it's only happening with baremetal nodes so I suspect it's something to do with how we're initializing the Ironic driver | 15:13 |
stephenfin | but that's a grasp | 15:13 |
*** maciejjozefczyk has joined #openstack-nova | 15:14 | |
mriedem | so the driver returns a list of nodenames for that host and those get passed one at a time to RT https://github.com/openstack/nova/blob/stable/rocky/nova/compute/manager.py#L7893 | 15:15 |
*** jangutter_ is now known as jangutter | 15:15 | |
mriedem | should eventually call through to here https://github.com/openstack/nova/blob/stable/rocky/nova/compute/resource_tracker.py#L743 | 15:16 |
mriedem | if the compute node already exists in the db you'd get here on first startup https://github.com/openstack/nova/blob/stable/rocky/nova/compute/resource_tracker.py#L567 | 15:17 |
mriedem | this should also be ok https://github.com/openstack/nova/blob/stable/rocky/nova/compute/resource_tracker.py#L530 | 15:18 |
*** tbachman has joined #openstack-nova | 15:18 | |
mriedem | otherwise we'd create the node in the db https://github.com/openstack/nova/blob/stable/rocky/nova/compute/resource_tracker.py#L584 | 15:18 |
mriedem | which should set the id on the cn object | 15:18 |
mriedem | https://github.com/openstack/nova/blob/stable/rocky/nova/objects/compute_node.py#L322 | 15:19 |
mriedem | looks like you've got some debugging to do if you can recreate it | 15:19 |
*** dave-mccowan has quit IRC | 15:23 | |
*** dave-mccowan has joined #openstack-nova | 15:24 | |
*** munimeha1 has joined #openstack-nova | 15:25 | |
*** gyee has joined #openstack-nova | 15:26 | |
*** ivve has quit IRC | 15:28 | |
*** ccamacho has quit IRC | 15:28 | |
*** efried has quit IRC | 15:28 | |
*** artom has joined #openstack-nova | 15:29 | |
*** _alastor_ has joined #openstack-nova | 15:29 | |
*** _alastor_ has quit IRC | 15:30 | |
*** _alastor_ has joined #openstack-nova | 15:30 | |
mriedem | lyarwood: melwitt: dansmith: bauzas: looks like we can start approving stable/stein backports now so i'll start going through those and removing the WIP on mine | 15:32 |
mriedem | we need to roll through some of those to get things flushed out down through pike for pike-em | 15:32 |
dansmith | ack | 15:32 |
melwitt | ah, yep | 15:32 |
melwitt | I have a bug fix that needs to go back to pike but it hasn't even merged on master yet and having trouble getting review https://review.openstack.org/611974 | 15:33 |
cdent | mriedem: does a server always get buried to cell0 when it fails to find a host? Is it possible to say "only bury if you requested a build at a microversion that will allow an unbury"? | 15:34 |
mriedem | melwitt: i definitely don't know that code so would need at least lyarwood and/or mdbooth to sign off | 15:34 |
mriedem | cdent: "does a server always get buried to cell0 when it fails to find a host?" yes | 15:34 |
*** gerrykopec has joined #openstack-nova | 15:34 | |
mriedem | "Is it possible to say "only bury if you requested a build at a microversion that will allow an unbury"?" i think that gets really complicated fast | 15:35 |
cdent | i imagine so | 15:35 |
cdent | what was the original purpose of burying? | 15:35 |
dansmith | yeah that's not okay | 15:35 |
mriedem | because it's more like (1) only bury if using a microversion to allow unburying except (2) unless you didn't request personality files etc | 15:35 |
melwitt | there's this one too but people were meh about reviewing it too https://review.openstack.org/582408 | 15:35 |
mriedem | cdent: need to be able to show the servers from the API even though they are in error state and not on a host, | 15:36 |
dansmith | cdent: it's still the original purpose, which is that if we didn't pick a host, we didn't pick a cell, thus we have nowhere to store it | 15:36 |
mriedem | and the api pulls servers from a cell db | 15:36 |
mriedem | we have build_requests but those are short-lived | 15:36 |
cdent | ah, right, I keep forgetting that a server that is an ERROR is still a server (which is a decision I've never quite understood) | 15:36 |
mriedem | melwitt: it's failing pep8 :) | 15:37 |
bauzas | mriedem: ack, sorry on a meeting | 15:37 |
*** tbachman has quit IRC | 15:38 | |
melwitt | sigh | 15:38 |
openstackgerrit | Surya Seetharaman proposed openstack/nova master: Block automatic transport_url update for cell0 https://review.openstack.org/605414 | 15:40 |
openstackgerrit | Merged openstack/nova-specs master: Re-propose cross-cell-resize spec for Train https://review.openstack.org/642807 | 15:46 |
*** efried has joined #openstack-nova | 15:47 | |
*** sidx64_ has quit IRC | 15:51 | |
*** artom has quit IRC | 15:51 | |
*** rpittau is now known as rpittau|afk | 15:53 | |
*** artom has joined #openstack-nova | 15:59 | |
*** tesseract has quit IRC | 16:03 | |
*** francoisp_ has joined #openstack-nova | 16:03 | |
*** tbachman has joined #openstack-nova | 16:04 | |
*** ociuhandu has joined #openstack-nova | 16:08 | |
*** ociuhandu has quit IRC | 16:10 | |
*** psachin has quit IRC | 16:14 | |
*** efried is now known as efried_rollin | 16:14 | |
*** dtantsur is now known as dtantsur|afk | 16:15 | |
bauzas | melwitt: thanks for reviewing https://review.openstack.org/#/c/650963/ I replied to your question | 16:22 |
melwitt | bauzas: a-ha, thanks | 16:23 |
*** derekh has quit IRC | 16:28 | |
melwitt | lyarwood, mdbooth: begging for reviews on this again, disk unit stuff https://review.openstack.org/611974 | 16:30 |
*** ivve has joined #openstack-nova | 16:30 | |
lyarwood | melwitt: ack saw above, I'll take a look once I've finished with all of this downstream paperwork... bugzilla-- | 16:30 |
melwitt | oh cool, sorry for the double ping | 16:30 |
lyarwood | np at all | 16:31 |
mriedem | efried_rollin: want to drop the -W on https://review.openstack.org/#/c/649600/ ? | 16:34 |
openstackgerrit | Theodoros Tsioutsias proposed openstack/nova-specs master: Enable rebuild for instances in cell0 https://review.openstack.org/648686 | 16:35 |
openstackgerrit | melanie witt proposed openstack/nova master: Add functional recreate test for bug 1764556 https://review.openstack.org/562041 | 16:37 |
openstack | bug 1764556 in OpenStack Compute (nova) ""nova list" fails with exception.ServiceNotFound if service is deleted and has no UUID" [Medium,In progress] https://launchpad.net/bugs/1764556 - Assigned to melanie witt (melwitt) | 16:37 |
openstackgerrit | melanie witt proposed openstack/nova master: Add functional regression test for bug 1778305 https://review.openstack.org/582407 | 16:37 |
openstackgerrit | melanie witt proposed openstack/nova master: Don't generate service UUID for deleted services https://review.openstack.org/582408 | 16:37 |
openstack | bug 1778305 in OpenStack Compute (nova) "Nova may erronously look up service version of a deleted service, when hostname have been reused" [Undecided,In progress] https://launchpad.net/bugs/1778305 - Assigned to melanie witt (melwitt) | 16:37 |
*** ttsiouts has quit IRC | 16:40 | |
*** ttsiouts has joined #openstack-nova | 16:41 | |
*** ttsiouts has quit IRC | 16:45 | |
mriedem | mlavalle: did you sort out the NoValidHost error you were getting the other night? | 16:46 |
mlavalle | mriedem: yes, thanks for following up. it was the image filter | 16:46 |
*** altlogbot_1 has quit IRC | 16:46 | |
*** phasespace has quit IRC | 16:47 | |
mriedem | mlavalle: ok so you have some image with metadata that didn't match any hosts i'm guessing? | 16:47 |
dansmith | mriedem: were you going to un-WIP your stein backports? | 16:47 |
mriedem | dansmith: just finished | 16:47 |
mlavalle | mriedem: exactly and since I have only one compute, that was the end of that request | 16:48 |
dansmith | okay gerrit didn't notify me | 16:48 |
mriedem | mlavalle: did enabling debug in the scheduler and checking the logs tell you what was up? or did you have to then look at the code too? | 16:48 |
stephenfin | mriedem: fwiw, your assumption that the issue I pointed out earlier wasn't ironic specific seems to have proven out. I spotted this after http://paste.openstack.org/show/fsWdgjpwsycPzdqPXNLI/ | 16:49 |
stephenfin | Not sure why that's happening (I suspect something HA'y) but that would explain why ID, an auto-incremented primary key field (I assume) wasn't set | 16:49 |
stephenfin | I found a similar issue reported by MarkusZ years ago that I need to look into more, but that's tomorrow's work https://bugs.launchpad.net/nova/+bug/1566783/comments/4 | 16:50 |
openstack | Launchpad bug 1566783 in OpenStack Compute (nova) "On service start, nova-compute fails to register itself to the DB" [Low,Confirmed] | 16:50 |
mriedem | stephenfin: that's a failure to create the compute node in the db because some other node with the same uuid already exists, but you wouldn't have a compute_nodes table record without an id | 16:50 |
stephenfin | Just closing that loop :) | 16:50 |
mriedem | so i still don't understand how you'd get that object action error on loading ComputeNode.id | 16:50 |
*** ivve has quit IRC | 16:51 | |
mlavalle | mriedem: without enabling debug, the conductor log gave the filter that was failing. at the end the conductor logs how many hosts each filter returns. All the filters looked something like: in: 1, out 1. The image filter was: in: 1, out: 0 | 16:51 |
* mlavalle is paraphrasing. I don't remember the exact notation | 16:51 | |
stephenfin | mriedem: I figured the initial object creation was failing so we're left with a Python object but no underlying DB entry | 16:51 |
mriedem | mlavalle: sure, just wondering if it was really hard to figure out but sounds like it wasn't | 16:51 |
mriedem | stephenfin: we shouldn't even get to _setup_pci_tracker in that case though | 16:52 |
mlavalle | mriedem: nope, just looking closer at the logs. I guess when I pinged you, it was Friday night and I was tired..... Or maybe I just needed an excuse to say hi ;-) | 16:52 |
mriedem | stephenfin: i.e. if this fails https://github.com/openstack/nova/blob/stable/rocky/nova/compute/resource_tracker.py#L584 you're not going to be calling https://github.com/openstack/nova/blob/stable/rocky/nova/compute/resource_tracker.py#L589 | 16:53 |
mriedem | the question would be why would we even try to create the compute node record if it already exists by uuid | 16:53 |
*** wolverineav has joined #openstack-nova | 16:53 | |
mriedem | because https://github.com/openstack/nova/blob/stable/rocky/nova/compute/resource_tracker.py#L567 should find the existing cn in the db | 16:53 |
mriedem | unless, | 16:54 |
mriedem | the hypervisor_hostname is different | 16:54 |
stephenfin | Could it be racy if you've multiple compute services running? | 16:54 |
sean-k-mooney | i was going to say we lookup the compute service by hostname to get the uuid | 16:54 |
mriedem | it could race if you've got multiple compute services trying to manage the same node yeah | 16:55 |
*** wolverineav has quit IRC | 16:56 | |
sean-k-mooney | or multiple nodes with the same hostname? | 16:56 |
mriedem | stephenfin: do you have this fix? https://github.com/openstack/nova/commit/57566f4c8d2929c25e76564883369a7c6eda720a | 16:56 |
*** wolverineav has joined #openstack-nova | 16:56 | |
* stephenfin checks | 16:57 | |
mriedem | sean-k-mooney: i'm not sure what you're saying | 16:57 |
mriedem | "compute service by hostname to get the uuid" is wrong though in this case | 16:57 |
mriedem | compute nodes in the db are 1:1 with ironic nodes, and in this case we're looking up the compute node by nodename, i.e. hypervisor_hostname, i.e. ironic node.uuid | 16:57 |
mriedem | not the nova-compute service hostname | 16:57 |
sean-k-mooney | oh this is ironic never mind | 16:57 |
stephenfin | Damn it, I've to go (tag rugby match in 15 minutes). Sorry /o\ | 16:58 |
mriedem | back to the pile | 16:58 |
stephenfin | yuuuup | 16:58 |
* mriedem gets lunch | 16:59 | |
sean-k-mooney | i was going to say if there are two hosts with the same hostname in the cloud then they both end up sharing the same compute service and hypervisor entry in the db and everything is broken. | 16:59 |
sean-k-mooney | thats not related to that however | 16:59 |
cfriesen | off-the-wall question here...with the rise of containerized openstack services it's a bit of a pain to tweak the code since killing the running service kills the container. has anyone considered a way to "hot reload" new code without killing the running nova process? | 17:00 |
sean-k-mooney | cfriesen: as in via a SIG_HUP | 17:00 |
sean-k-mooney | i dont think so | 17:01 |
sean-k-mooney | whats the issue with restarting the container | 17:01 |
cfriesen | sean-k-mooney: you have to make a new image, then tag it, and upload it to the repository, then tell helm to use the new image... | 17:01 |
cdent | or starting a different container to take over | 17:02 |
sean-k-mooney | cdent: oh i thought you meant just execing into the container and patching it | 17:02 |
sean-k-mooney | cfriesen: ^ | 17:02 |
cfriesen | sean-k-mooney: that's what I'd like to do | 17:02 |
cfriesen | sean-k-mooney: but if you stop nova-compute (for example) the container dies and a new unpatched one starts up | 17:02 |
cdent | but the nova process is the only one | 17:02 |
sean-k-mooney | cfriesen: in that case build your container from git repos instead of tarballs | 17:02 |
*** chhagarw has quit IRC | 17:03 | |
sean-k-mooney | cfriesen: that depends on how you run the container | 17:03 |
cdent | cfriesen: it's not impossible to reload python code in the same process, but it is very icky | 17:03 |
sean-k-mooney | your changes inside a container don't disappear on a docker restart nova_compute | 17:03 |
sean-k-mooney | cdent: and eventlet monkeypatching makes it even harder | 17:04 |
cdent | yes | 17:04 |
jangutter | cfriesen: docker containers ? I've hotpatched them, done 'docker commit' and 'docker restart'. | 17:04 |
jangutter | cfriesen: ugly, but hey, so am I. | 17:04 |
sean-k-mooney | jangutter: or docker exec then edit it and docker restart | 17:05 |
jangutter | sean-k-mooney: yep, that too. | 17:05 |
cfriesen | jangutter: do you know how that behaves when the container is part of a pod that is part of a deployment or statefulset? | 17:05 |
sean-k-mooney | cfriesen: it works fine in a pod | 17:05 |
jangutter | cfriesen: does the answer involve a singularity that slowly consumes all in its path? | 17:06 |
sean-k-mooney | you are not deleting the container so the deployment does not recreate the pod | 17:06 |
cfriesen | sounds promising...will have to try that. | 17:07 |
sean-k-mooney | cfriesen: but modifying a docker container is generally considered heresy. in k8s land it's even more vilified | 17:07 |
jangutter | cfriesen: protip, pass '-u root' when doing the exec since you don't always have permissions. | 17:07 |
jangutter | cfriesen, sean-k-mooney: yeah, there's no guarantee any of the changes will persist or not. at best it's just something you do before you destroy the system. | 17:08 |
sean-k-mooney | cfriesen: if you were using kolla-ansible they have a dev mode where they clone the git repo on the host and just bind mount it into the container i think | 17:08 |
*** mvkr has quit IRC | 17:09 | |
dansmith | mriedem: am I missing what bug this actually fixes? https://review.openstack.org/#/c/647713/1 | 17:09 |
sean-k-mooney | the intended workflow being edit on host then docker restart nova_compute | 17:09 |
cfriesen | sean-k-mooney: yeah, that'd be cleaner | 17:10 |
cdent | cfriesen: is this sort of live code tweaking a thing you find common? | 17:12 |
cdent | I mean I can imagine it often enough in test settings, experiments, and the like | 17:13 |
cdent | but in a deployed world aren't the choices usually: observe, kill, make a new one? | 17:13 |
mnaser | there's not really any pythonic api to manage cells, right? | 17:14 |
mnaser | or any api at all | 17:14 |
*** slaweq has quit IRC | 17:14 | |
mnaser | I am looking at adding multicell support in openstackansible, one of the things to introduce is the addition of some way to define cells | 17:15 |
cfriesen | cdent: this would be for tweaking stuff on lab systems. I agree for areas where devstack is sufficient that'd be better. | 17:15 |
cdent | cfriesen: I guess I was thinking a lab system is one of the main areas where "you have to make a new image, then tag it, and upload it to the repository, then tell helm to use the new image..." wouldn't be too much of a big deal, and you might have some simple(-ish) tooling to support it | 17:18 |
cdent | I agree it is a PITA | 17:18 |
dansmith | mnaser: you mean via the external API I assume? there isn't by design | 17:18 |
dansmith | mnaser: or do you mean something like being able to import nova-manage and use its routines? | 17:19 |
mnaser | dansmith: yeah.. or I was thinking maybe we could do something like.. import nova.cmd.manage; manage.add_cell() or whatever | 17:19 |
cfriesen | cdent: tool improvements would help, I agree. | 17:19 |
mnaser | #2 (because ansible can just have python scripts run) | 17:19 |
dansmith | mnaser: yeah, that could be a thing, but isn't currently | 17:19 |
mnaser | fair enough, so for now probably best to stick to parsing CLI output | 17:19 |
dansmith | mnaser: tbh, I think you probably don't want to import and run in your own namespace, since talking to cells requires pivoting global oslo.db state and other things | 17:20 |
mnaser | dansmith: fair enough.. so probably better off just doing CLI operations | 17:21 |
dansmith | mnaser: the cli commands aim to be pretty idempotent, but if we need --robot output formatting or something, then we should do that | 17:21 |
*** udesale has quit IRC | 17:23 | |
openstackgerrit | Chris Dent proposed openstack/nova master: WIP: Use update_provider_tree in vmware virt driver https://review.openstack.org/651615 | 17:26 |
*** priteau has quit IRC | 17:31 | |
*** boxiang has quit IRC | 17:43 | |
openstackgerrit | Dakshina Ilangovan proposed openstack/nova-specs master: Nova LLC allocation - RMD plugin for RDT CAT https://review.openstack.org/651233 | 17:45 |
*** luksky has joined #openstack-nova | 17:47 | |
*** boxiang has joined #openstack-nova | 17:48 | |
*** ricolin has quit IRC | 17:49 | |
mriedem | dansmith: it's a perf improvement to avoid a call to get allocations for the instance created by the scheduler when we already have those in the Selection object in scope | 17:56 |
dansmith | mriedem: ..right I got that ;) | 17:56 |
dansmith | is that really substantial enough to backport? | 17:56 |
mriedem | it's just new code in stein and missed the cutoff so it's not controversial imo | 17:58 |
mriedem | we wouldn't die without it no | 17:58 |
openstackgerrit | Dakshina Ilangovan proposed openstack/nova-specs master: Nova local resource management that uses RMD https://review.openstack.org/651130 | 17:59 |
*** tjgresha has joined #openstack-nova | 17:59 | |
*** tjgresha has quit IRC | 18:09 | |
*** awalende has joined #openstack-nova | 18:11 | |
*** awalende has quit IRC | 18:15 | |
*** tosky has quit IRC | 18:21 | |
*** wolverineav has quit IRC | 18:25 | |
*** wolverineav has joined #openstack-nova | 18:25 | |
*** wolverineav has quit IRC | 18:29 | |
*** wolverineav has joined #openstack-nova | 18:29 | |
*** ivve has joined #openstack-nova | 18:38 | |
*** slaweq has joined #openstack-nova | 18:46 | |
*** slaweq has quit IRC | 18:52 | |
*** cdent has quit IRC | 18:54 | |
openstackgerrit | Dakshina Ilangovan proposed openstack/nova-specs master: Resource Management Daemon - Base Enablement https://review.openstack.org/651130 | 18:57 |
-openstackstatus- NOTICE: Restarting Gerrit on review.openstack.org to pick up new configuration for the replication plugin | 19:05 | |
*** wolverineav has quit IRC | 19:05 | |
*** wolverineav has joined #openstack-nova | 19:06 | |
mriedem | melwitt: i re-read the counting quotas from placement spec for train again and noticed a few things that could be cleaned up https://review.openstack.org/#/c/645302/ | 19:06 |
melwitt | mriedem: ok, thanks | 19:06 |
*** tssurya has quit IRC | 19:07 | |
melwitt | mriedem: to your other comment (I'll reply on the review too), we're not counting unmigrated qfd instance mappings because we're falling back to legacy counting if unmigrated qfd instance mappings are detected | 19:07 |
melwitt | (as opposed to counting unmigrated and being potentially wrong in the count) | 19:08 |
mriedem | right i gathered that from the exchange between you and surya but i hadn't looked at the code yet that does the fallback logic | 19:09 |
*** wolverineav has quit IRC | 19:10 | |
*** artom has quit IRC | 19:11 | |
melwitt | oh, I see | 19:12 |
melwitt | I misunderstood what you meant by saying you needed to load the context of how it will be used | 19:12 |
openstackgerrit | Dakshina Ilangovan proposed openstack/nova-specs master: Resource Management Daemon - Last Level Cache https://review.openstack.org/651233 | 19:20 |
*** eharney has quit IRC | 19:28 | |
*** awalende has joined #openstack-nova | 19:28 | |
*** tbachman has quit IRC | 19:29 | |
efried_rollin | mriedem: done | 19:37 |
*** efried_rollin is now known as efried | 19:37 | |
*** owalsh has quit IRC | 19:43 | |
*** wolverineav has joined #openstack-nova | 19:45 | |
*** ralonsoh has quit IRC | 19:48 | |
*** wolverineav has quit IRC | 19:49 | |
*** owalsh has joined #openstack-nova | 19:50 | |
melwitt | efried, mriedem: working on this old service uuid upgrade bug, IIUC because of this change https://review.openstack.org/620711 if we do a service delete followed by a service create, the running compute service will never create the resource provider and compute node again, for the same 'host', because it knows about it already in the RT | 19:51 |
*** tbachman has joined #openstack-nova | 19:51 | |
melwitt | even after another update_available_resource interval | 19:51 |
efried | hm, I thought someone fixed that recently. | 19:52 |
melwitt | the func tests (written before that change landed) now fail because can't get the RP created again | 19:52 |
efried | oh, this is if you delete and recreate the service without actually stopping the running process? | 19:53 |
melwitt | I rebased the set today, so I should have all of the latest changes | 19:53 |
melwitt | correct | 19:53 |
efried | Yes, this was definitely addressed recently, but I thought someone killed the change for reasons. Let me find it... | 19:53 |
mriedem | https://review.openstack.org/#/c/641899/ ? | 19:53 |
efried | Yup, that's the one. melwitt ^ | 19:54 |
melwitt | looking | 19:54 |
melwitt | I'm not even getting to that code though, it looks like that would at least try to create the provider | 19:55 |
*** amodi has quit IRC | 19:55 | |
efried | melwitt: Does that have the wrong bug number associated with it? | 19:55 |
efried | (I'm asking, why didn't you find it if you were looking into this already?) | 19:56 |
efried | duplicate bugs? | 19:56 |
melwitt | oh they're saying after a service restart the RP create fails | 19:56 |
* efried shakes fist at PTL gods. "What have you done to me, that I care more about the paperwork than the solution??" | 19:57 | |
melwitt | efried: oh sorry, I'm working on refreshing patches for an unrelated bug | 19:57 |
melwitt | and ran into this problem of the RP will never be created again while the service is running | 19:57 |
melwitt | because the func tests I'm working with are now failing because of it | 19:57 |
efried | I mean, it makes sense, because we have the provider tree cache, and no reason to think it has expired, so we never go looking for the provider to notice it's gone so we never create it. | 19:58 |
efried | This is what SIGHUP was supposed to be for. | 19:58 |
melwitt | ok, so it's a known thing that if an operator deletes and creates a service with the same hostname, they're supposed to SIGHUP now | 19:59 |
efried | And this would be a doc issue: "If you delete and recreate the service, SIGHUP (or restart ffs) your n-cpu process" | 19:59 |
efried | Well, I'm saying if it's not documented as such, that would be my first proposed solution. I'm not positive it's going to fix the problem (even when SIGHUP works, which it still doesn't afaik) but it'd be the first thing I'd try. | 19:59 |
mriedem | fwiw we already have a note about that in the API reference https://developer.openstack.org/api-ref/compute/?expanded=delete-compute-service-detail#delete-compute-service | 19:59 |
melwitt | ok. I can simulate that in the test, but wanted to mention it in case it wasn't desired/expected | 20:00 |
melwitt | thanks | 20:00 |
efried | cool, the doc actually says to stop the thing, which makes freakin sense to me. | 20:00 |
efried | Maybe I'm being too simplistic in my thinking | 20:01 |
efried | but how does an operator expect "delete a thing but keep it running" to ever work? | 20:01 |
*** ivve has quit IRC | 20:01 | |
melwitt | no, it's fine, I just saw a change and didn't have the context about it | 20:01 |
melwitt | well, they delete it and create it again, so they expect it to be running after they create it, in this scenario | 20:02 |
efried | sorry, tbc I'm not griping at you; I guess I'm kind of wtf-ing that there's a bug about this. | 20:02 |
efried | Perhaps we should have delete kill the service | 20:02 |
efried | I think that may be what cdent was suggesting in the above patch | 20:02 |
melwitt | I can't remember why operators have done that, something to do with an upgrade. cfriesen ran into it before I think | 20:03 |
efried | ...or make the delete operation fail if the service is running, to force the operator to do it. | 20:03 |
melwitt | (delete service, create service, for same hostname) | 20:03 |
efried | Oh, I'm *sure* there's a good reason to do that ^ | 20:04 |
efried | :) | 20:04 |
efried | but expecting the running process to survive intact and work properly through it... | 20:04 |
efried | perhaps it used to work by pure luck and now they're facing RBB. | 20:05 |
cfriesen | I think we probably hit it during upgrade...our upgrade mechanism was to delete the standby controller, then reinstall it with the new load (and the same hostname as before), then migrate the DB over and cut over to the new controller. | 20:06 |
melwitt | cfriesen: that sounds like what I recall | 20:07 |
cfriesen | hmm..that wouldn't delete the nova-compute service though. | 20:07 |
*** wolverineav has joined #openstack-nova | 20:08 | |
*** owalsh has quit IRC | 20:08 | |
melwitt | cfriesen: yeah this was deleting the nova-compute service for a host, then recreating it. deleting the nova-compute service record would kill the compute node record too | 20:08 |
*** owalsh has joined #openstack-nova | 20:08 | |
melwitt | and you're saying your upgrade routine would not do that | 20:08 |
*** igordc has joined #openstack-nova | 20:08 | |
cfriesen | I don't think so...trying to find the context for https://bugs.launchpad.net/nova/+bug/1764556 | 20:10 |
openstack | Launchpad bug 1764556 in OpenStack Compute (nova) ""nova list" fails with exception.ServiceNotFound if service is deleted and has no UUID" [Medium,In progress] - Assigned to melanie witt (melwitt) | 20:10 |
melwitt | cfriesen: it's ok, don't spend time on it. we were just talking about the whole service delete and create thing. behavior changed semi recently | 20:11 |
melwitt | and I thought I remembered you might have done something like that in the past | 20:11 |
cfriesen | okay, so it looks like for that bug we *did* delete a compute node while on Newton, then migrated to Pike, then created a new node with the same name. May have been an artificial test though, not actually required for upgrade. | 20:11 |
cfriesen | upgraded to Pike, rather | 20:12 |
melwitt | gotcha | 20:12 |
mriedem | the delete compute service while it's still running was never really thought out i don't think | 20:16 |
mriedem | we've only semi recently tried to shore up a lot of that nonsense | 20:16 |
mriedem | it's only been noticed as more of a problem since we started building onto it with things like host mappings and resource providers | 20:16 |
* melwitt nods | 20:17 | |
mriedem | i.e. this is semi recent https://github.com/openstack/nova/blob/d42a007425d9adb691134137e1e0b7dda356df62/nova/api/openstack/compute/services.py#L247 | 20:17 |
mriedem | and this https://github.com/openstack/nova/blob/d42a007425d9adb691134137e1e0b7dda356df62/nova/api/openstack/compute/services.py#L270 | 20:17 |
mriedem | and this https://github.com/openstack/nova/blob/d42a007425d9adb691134137e1e0b7dda356df62/nova/api/openstack/compute/services.py#L275 | 20:17 |
melwitt | ah yeah, I remember that first one | 20:18 |
*** awalende has quit IRC | 20:19 | |
*** awalende has joined #openstack-nova | 20:19 | |
*** slaweq has joined #openstack-nova | 20:23 | |
openstackgerrit | Merged openstack/nova stable/stein: doc: Capitalize keystone domain name https://review.openstack.org/650600 | 20:23 |
openstackgerrit | Merged openstack/nova stable/stein: Add functional regression test for bug 1669054 https://review.openstack.org/649319 | 20:24 |
openstack | bug 1669054 in OpenStack Compute (nova) stein "RequestSpec.ignore_hosts from resize is reused in subsequent evacuate" [Medium,In progress] https://launchpad.net/bugs/1669054 - Assigned to Matt Riedemann (mriedem) | 20:24 |
*** awalende has quit IRC | 20:24 | |
openstackgerrit | Merged openstack/nova stable/stein: doc: Fix openstack CLI command https://review.openstack.org/648412 | 20:24 |
openstackgerrit | melanie witt proposed openstack/nova master: Add functional recreate test for bug 1764556 https://review.openstack.org/562041 | 20:28 |
openstack | bug 1764556 in OpenStack Compute (nova) ""nova list" fails with exception.ServiceNotFound if service is deleted and has no UUID" [Medium,In progress] https://launchpad.net/bugs/1764556 - Assigned to melanie witt (melwitt) | 20:28 |
openstackgerrit | melanie witt proposed openstack/nova master: Add functional regression test for bug 1778305 https://review.openstack.org/582407 | 20:28 |
openstackgerrit | melanie witt proposed openstack/nova master: Don't generate service UUID for deleted services https://review.openstack.org/582408 | 20:28 |
openstack | bug 1778305 in OpenStack Compute (nova) "Nova may erronously look up service version of a deleted service, when hostname have been reused" [Medium,In progress] https://launchpad.net/bugs/1778305 - Assigned to melanie witt (melwitt) | 20:28 |
*** wolverineav has quit IRC | 20:31 | |
openstackgerrit | Chris Friesen proposed openstack/nova stable/rocky: Add missing libvirt exception during device detach https://review.openstack.org/651637 | 20:31 |
*** priteau has joined #openstack-nova | 20:33 | |
openstackgerrit | Chris Friesen proposed openstack/nova stable/queens: Add missing libvirt exception during device detach https://review.openstack.org/651639 | 20:34 |
*** phasespace has joined #openstack-nova | 20:36 | |
openstackgerrit | Chris Friesen proposed openstack/nova stable/pike: Add missing libvirt exception during device detach https://review.openstack.org/651642 | 20:36 |
*** wolverineav has joined #openstack-nova | 20:40 | |
*** wolverineav has quit IRC | 20:43 | |
*** wolverineav has joined #openstack-nova | 20:43 | |
*** ivve has joined #openstack-nova | 20:48 | |
*** priteau has quit IRC | 20:48 | |
*** wolverineav has quit IRC | 20:48 | |
*** mvkr has joined #openstack-nova | 20:52 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Use InstanceList.get_count_by_hosts when deleting a compute service https://review.openstack.org/651647 | 20:54 |
*** nicolasbock has quit IRC | 21:05 | |
*** wolverineav has joined #openstack-nova | 21:12 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add archive_deleted_rows wrinkle to cross-cell functional test https://review.openstack.org/651650 | 21:12 |
*** nicolasbock has joined #openstack-nova | 21:13 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add archive_deleted_rows wrinkle to cross-cell functional test https://review.openstack.org/651650 | 21:15 |
*** slaweq has quit IRC | 21:21 | |
*** wolverineav has quit IRC | 21:24 | |
*** ivve has quit IRC | 21:25 | |
*** wolverineav has joined #openstack-nova | 21:25 | |
*** wolverineav has quit IRC | 21:26 | |
*** wolverineav has joined #openstack-nova | 21:26 | |
openstackgerrit | Matt Riedemann proposed openstack/nova-specs master: FUP for I68498afd481f7291a6102928d7999b4be49ded7a https://review.openstack.org/651653 | 21:33 |
*** eharney has joined #openstack-nova | 21:37 | |
*** slaweq has joined #openstack-nova | 21:38 | |
*** mriedem has quit IRC | 21:40 | |
*** slaweq has quit IRC | 21:42 | |
*** wolverineav has quit IRC | 21:46 | |
openstackgerrit | Merged openstack/nova stable/stein: Override the 'get' method in DriverBlockDevice class https://review.openstack.org/647646 | 21:47 |
openstackgerrit | Merged openstack/nova stable/stein: Don't warn on network-vif-unplugged event during live migration https://review.openstack.org/650060 | 21:47 |
*** wolverineav has joined #openstack-nova | 21:48 | |
*** wolverineav has quit IRC | 21:52 | |
openstackgerrit | Merged openstack/nova stable/stein: libvirt: disconnect volume when encryption fails https://review.openstack.org/650931 | 21:55 |
openstackgerrit | Merged openstack/nova stable/stein: Move create of ComputeAPI object in websocketproxy https://review.openstack.org/649374 | 22:01 |
*** eharney has quit IRC | 22:06 | |
*** rcernin has joined #openstack-nova | 22:06 | |
*** tjgresha has joined #openstack-nova | 22:15 | |
*** luksky has quit IRC | 22:17 | |
*** mlavalle has quit IRC | 22:19 | |
*** wolverineav has joined #openstack-nova | 22:25 | |
*** tosky has joined #openstack-nova | 22:26 | |
*** wolverineav has quit IRC | 22:31 | |
*** betherly has joined #openstack-nova | 22:32 | |
openstackgerrit | Colleen Murphy proposed openstack/nova stable/rocky: Move create of ComputeAPI object in websocketproxy https://review.openstack.org/649375 | 22:35 |
*** betherly has quit IRC | 22:37 | |
*** munimeha1 has quit IRC | 22:41 | |
*** tosky has quit IRC | 22:47 | |
*** slaweq has joined #openstack-nova | 22:49 | |
*** whoami-rajat has quit IRC | 22:51 | |
*** tkajinam has joined #openstack-nova | 22:53 | |
*** slaweq has quit IRC | 22:53 | |
*** wolverineav has joined #openstack-nova | 23:06 | |
*** wolverineav has quit IRC | 23:11 | |
*** betherly has joined #openstack-nova | 23:12 | |
*** betherly has quit IRC | 23:17 | |
*** david-lyle has joined #openstack-nova | 23:36 | |
*** dklyle has quit IRC | 23:36 | |
*** david-lyle has quit IRC | 23:46 | |
*** artom has joined #openstack-nova | 23:48 | |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Revert "Wait for network-vif-plugged on resize revert" https://review.openstack.org/639396 | 23:51 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: [WIP] Revert resize: wait for external events in compute manager https://review.openstack.org/644881 | 23:51 |
*** rcernin has quit IRC | 23:52 | |
*** owalsh has quit IRC | 23:57 |
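
An illustrative aside on the NoValidHost exchange at 16:46-16:52: mlavalle found the culprit by reading the per-filter host counts ("in: 1, out: 0"). The sketch below is not nova's scheduler code. It only shows the idea of recording how many hosts enter and leave each filter. The filter names, host fields, and the `run_filters` helper are all hypothetical.

```python
def run_filters(hosts, filters):
    """Apply each (name, predicate) filter in order.

    Returns the surviving hosts and a report of (name, hosts_in, hosts_out)
    so the filter that eliminated the last host is easy to spot.
    """
    report = []
    for name, predicate in filters:
        before = len(hosts)
        hosts = [h for h in hosts if predicate(h)]
        report.append((name, before, len(hosts)))
        if not hosts:
            # Nothing left to filter; later filters are skipped.
            break
    return hosts, report


# Hypothetical single-compute deployment, as in the conversation.
hosts = [{"name": "compute1", "arch": "x86_64", "enabled": True}]

# Hypothetical filters; the second one mimics image metadata that
# matches no host.
filters = [
    ("EnabledFilter", lambda h: h["enabled"]),
    ("ImageArchFilter", lambda h: h["arch"] == "aarch64"),
]

remaining, report = run_filters(hosts, filters)
for name, n_in, n_out in report:
    print(f"{name}: in: {n_in}, out: {n_out}")
```

With only one compute host, the first filter whose count drops to zero is the whole story, which matches what mlavalle describes.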
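
An illustrative aside on the cells discussion at 17:14-17:21: mnaser and dansmith settle on driving the CLI and parsing its output rather than importing nova-manage code. A minimal sketch of that parsing step follows, assuming the command prints a prettytable-style ASCII table. The `parse_cli_table` helper and the sample text are hypothetical, and the sample shows only a subset of columns rather than captured output.

```python
def parse_cli_table(text):
    """Parse a prettytable-style ASCII table into a list of dicts.

    The first row that starts and ends with '|' is treated as the header;
    border lines such as '+-----+-----+' are ignored.
    """
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Hypothetical sample. In practice the text might come from something like
# subprocess.run(["nova-manage", "cell_v2", "list_cells"], ...).stdout
SAMPLE = """
+-------+--------------------------------------+----------+
|  Name |                 UUID                 | Disabled |
+-------+--------------------------------------+----------+
| cell0 | 00000000-0000-0000-0000-000000000000 |  False   |
| cell1 | 11111111-2222-3333-4444-555555555555 |  False   |
+-------+--------------------------------------+----------+
"""

cells = parse_cli_table(SAMPLE)
cell_names = [cell["Name"] for cell in cells]
print(cell_names)
```

Parsing a human-oriented table is fragile, which is presumably why dansmith floats a machine-readable output option if the tooling needs one.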