*** mgoddard has quit IRC | 00:06 | |
*** wolverineav has joined #openstack-nova | 00:07 | |
*** wolverineav has quit IRC | 00:12 | |
*** tetsuro has joined #openstack-nova | 00:13 | |
*** mgoddard has joined #openstack-nova | 00:22 | |
*** antonym has joined #openstack-nova | 00:26 | |
mnaser | melwitt: https://review.openstack.org/#/c/649204/2 I don't think I have time to actually get this to merge, but I tried working on it and I think it works.. so threw it up there if you wanna get it merged or if someone else wants to pick it up | 00:28 |
---|---|---|
melwitt | mnaser: I don't think that helps with the brokenness from novnc though | 00:29 |
mnaser | melwitt: oh yeah it doesn't, but I did the shuffling so I figured if someone was interested in making that code base a bit neater.. :P | 00:30 |
melwitt | self.path won't have the token in it by the time it gets to the plugin | 00:30 |
*** tetsuro has quit IRC | 00:31 | |
melwitt | mnaser: ok. I did similar locally to come to the same conclusion. the only thing I notice offhand is I think you removed the scheme validation, at best you'd have to have it run in a different order than before (currently it's scheme, token, the rest). with the plugin it would have to be token, scheme, the rest | 00:33 |
*** tetsuro has joined #openstack-nova | 00:35 | |
*** hongbin has joined #openstack-nova | 00:37 | |
*** wolverineav has joined #openstack-nova | 00:46 | |
*** tiendc has joined #openstack-nova | 00:47 | |
*** wolverineav has quit IRC | 00:51 | |
*** markvoelker has joined #openstack-nova | 00:55 | |
*** igordc has quit IRC | 01:02 | |
*** luksky has quit IRC | 01:08 | |
*** ricolin has joined #openstack-nova | 01:29 | |
*** whoami-rajat has joined #openstack-nova | 01:31 | |
*** brinzhang has joined #openstack-nova | 01:35 | |
*** BjoernT has quit IRC | 01:37 | |
*** brinzhang has quit IRC | 01:47 | |
*** brinzhang has joined #openstack-nova | 01:48 | |
*** BjoernT has joined #openstack-nova | 01:49 | |
*** BjoernT has quit IRC | 01:51 | |
*** BjoernT has joined #openstack-nova | 01:52 | |
*** lbragstad has joined #openstack-nova | 02:02 | |
*** nicolasbock has quit IRC | 02:05 | |
*** rcernin_ has joined #openstack-nova | 02:05 | |
*** rcernin has quit IRC | 02:06 | |
*** mrhillsman_afk is now known as mrhillsman | 02:10 | |
*** rcernin_ has quit IRC | 02:12 | |
*** openstackgerrit has joined #openstack-nova | 02:13 | |
openstackgerrit | Brin Zhang proposed openstack/nova-specs master: Specifying az when restore shelved server https://review.openstack.org/624689 | 02:13 |
*** rcernin has joined #openstack-nova | 02:15 | |
*** wolverineav has joined #openstack-nova | 02:44 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: WIP: Add a live migration regression test https://review.openstack.org/641200 | 02:53 |
*** takashin has joined #openstack-nova | 02:54 | |
*** cfriesen has quit IRC | 02:55 | |
*** hongbin has quit IRC | 02:57 | |
*** hongbin has joined #openstack-nova | 02:58 | |
*** BjoernT has quit IRC | 03:05 | |
*** hongbin has quit IRC | 03:12 | |
*** phasespace has quit IRC | 03:14 | |
*** psachin has joined #openstack-nova | 03:14 | |
*** hongbin has joined #openstack-nova | 03:14 | |
*** samueldmq has quit IRC | 03:15 | |
openstackgerrit | Seyeong Kim proposed openstack/nova stable/rocky: Share snapshot image membership with instance owner https://review.openstack.org/643853 | 03:19 |
*** wolverineav has quit IRC | 03:22 | |
*** wolverineav has joined #openstack-nova | 03:23 | |
*** hongbin has quit IRC | 03:31 | |
*** spsurya has joined #openstack-nova | 03:31 | |
*** brinzhang has quit IRC | 03:45 | |
*** brinzhang has joined #openstack-nova | 03:46 | |
*** cfriesen has joined #openstack-nova | 04:04 | |
*** udesale has joined #openstack-nova | 04:11 | |
*** krypto has joined #openstack-nova | 04:27 | |
*** wolverineav has quit IRC | 04:28 | |
*** wolverineav has joined #openstack-nova | 04:32 | |
*** alex_xu has quit IRC | 04:46 | |
*** alex_xu has joined #openstack-nova | 04:53 | |
*** ileixe has joined #openstack-nova | 04:55 | |
*** ratailor has joined #openstack-nova | 04:59 | |
*** wolverineav has quit IRC | 05:01 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Add a live migration regression test https://review.openstack.org/641200 | 05:12 |
*** cfriesen has quit IRC | 05:15 | |
openstackgerrit | Seyeong Kim proposed openstack/nova stable/rocky: Share snapshot image membership with instance owner https://review.openstack.org/643853 | 05:23 |
*** lbragstad has quit IRC | 05:33 | |
openstackgerrit | Seyeong Kim proposed openstack/nova stable/rocky: Share snapshot image membership with instance owner https://review.openstack.org/643853 | 05:34 |
*** pcaruana has joined #openstack-nova | 05:35 | |
*** tbachman has quit IRC | 05:38 | |
*** tbachman has joined #openstack-nova | 05:39 | |
*** ratailor has quit IRC | 05:45 | |
*** ratailor has joined #openstack-nova | 05:50 | |
*** tbachman has quit IRC | 05:54 | |
*** markvoelker has quit IRC | 05:58 | |
openstackgerrit | Brin Zhang proposed openstack/nova-specs master: Specifying az when restore shelved server https://review.openstack.org/624689 | 05:59 |
*** openstackgerrit has quit IRC | 06:09 | |
*** sridharg has joined #openstack-nova | 06:12 | |
*** openstackgerrit has joined #openstack-nova | 06:20 | |
openstackgerrit | Artem Vasilyev proposed openstack/nova master: systemd detection result caching nit fixes https://review.openstack.org/649229 | 06:20 |
kaisers | mdbooth: Yep, saw that. Will follow up | 06:22 |
openstackgerrit | Artem Vasilyev proposed openstack/nova master: systemd detection result caching nit fixes https://review.openstack.org/649229 | 06:24 |
*** markvoelker has joined #openstack-nova | 06:29 | |
*** mdbooth_ has joined #openstack-nova | 06:35 | |
*** mdbooth has quit IRC | 06:39 | |
*** ivve has joined #openstack-nova | 06:43 | |
*** slaweq has joined #openstack-nova | 06:44 | |
*** dpawlik has joined #openstack-nova | 06:44 | |
*** tesseract has joined #openstack-nova | 07:02 | |
*** belmoreira has joined #openstack-nova | 07:05 | |
*** tosky has joined #openstack-nova | 07:09 | |
*** kashyap has quit IRC | 07:10 | |
*** awalende has joined #openstack-nova | 07:11 | |
*** luksky has joined #openstack-nova | 07:15 | |
*** tssurya has joined #openstack-nova | 07:19 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Fix a deprecation warning https://review.openstack.org/649234 | 07:21 |
*** ircuser-1 has quit IRC | 07:23 | |
*** rpittau|afk is now known as rpittau | 07:23 | |
*** ccamacho has joined #openstack-nova | 07:24 | |
*** krypto has quit IRC | 07:32 | |
*** yan0s has joined #openstack-nova | 07:34 | |
*** ccamacho has quit IRC | 07:42 | |
*** balszoll has joined #openstack-nova | 07:43 | |
*** ccamacho has joined #openstack-nova | 07:49 | |
*** helenafm has joined #openstack-nova | 07:50 | |
*** ralonsoh has joined #openstack-nova | 07:51 | |
*** kashyap has joined #openstack-nova | 07:54 | |
*** takashin has left #openstack-nova | 08:00 | |
*** sidx64_ has joined #openstack-nova | 08:03 | |
*** tetsuro has quit IRC | 08:12 | |
*** ttsiouts has joined #openstack-nova | 08:14 | |
openstackgerrit | Merged openstack/nova master: Pass --nic when creating servers in evacuate integration test script https://review.openstack.org/649036 | 08:15 |
*** wolverineav has joined #openstack-nova | 08:16 | |
*** wolverineav has quit IRC | 08:20 | |
*** tkajinam has quit IRC | 08:21 | |
*** tetsuro has joined #openstack-nova | 08:22 | |
*** zbr|pto is now known as zbr | 08:23 | |
*** sidx64_ has quit IRC | 08:26 | |
*** tetsuro has quit IRC | 08:29 | |
*** cdent has joined #openstack-nova | 08:29 | |
*** xek has joined #openstack-nova | 08:30 | |
*** priteau has joined #openstack-nova | 08:34 | |
*** tetsuro has joined #openstack-nova | 08:36 | |
*** derekh has joined #openstack-nova | 08:41 | |
openstackgerrit | Michael Still proposed openstack/nova master: Remove fake_libvirt_utils from connection tests. https://review.openstack.org/642557 | 08:46 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove fake_libvirt_utils from snapshot tests. https://review.openstack.org/642558 | 08:46 |
openstackgerrit | Michael Still proposed openstack/nova master: Make privsep.chown mocking for libvirt snapshot tests less magic. https://review.openstack.org/642134 | 08:46 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove fake_libvirt_utils from virt driver tests. https://review.openstack.org/643894 | 08:46 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove fake_libvirt_utils from libvirt imagebackend tests. https://review.openstack.org/643895 | 08:46 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove remaining vestiges of fake_libvirt_utils from unit tests. https://review.openstack.org/643896 | 08:46 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove fake_libvirt_utils users in functional testing. https://review.openstack.org/644793 | 08:46 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove usused umask argument to virt.libvirt.utils.write_to_file https://review.openstack.org/645086 | 08:46 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove write_to_file. https://review.openstack.org/645087 | 08:46 |
*** manjeets has quit IRC | 08:56 | |
openstackgerrit | Yongli He proposed openstack/nova master: Clean up orphan instances virt driver https://review.openstack.org/648912 | 08:57 |
openstackgerrit | Yongli He proposed openstack/nova master: Clean up orphan instances https://review.openstack.org/627765 | 08:57 |
*** manjeets has joined #openstack-nova | 08:57 | |
*** manjeets has quit IRC | 09:05 | |
*** davidsha has joined #openstack-nova | 09:10 | |
*** rcernin has quit IRC | 09:11 | |
*** ccamacho has quit IRC | 09:14 | |
*** davidsha has quit IRC | 09:22 | |
*** balszoll has quit IRC | 09:26 | |
*** sapd1_x has joined #openstack-nova | 09:26 | |
*** davidsha has joined #openstack-nova | 09:28 | |
openstackgerrit | Seyeong Kim proposed openstack/nova stable/rocky: Share snapshot image membership with instance owner https://review.openstack.org/643853 | 09:31 |
*** dtantsur|afk is now known as dtantsur | 09:33 | |
openstackgerrit | Kashyap Chamarthy proposed openstack/nova-specs master: Re-propose the spec to allow specifying a list of CPU models https://review.openstack.org/642030 | 09:36 |
*** manjeets has joined #openstack-nova | 09:38 | |
*** tbachman has joined #openstack-nova | 09:40 | |
*** maciejjozefczyk has quit IRC | 09:42 | |
*** Sundar has joined #openstack-nova | 09:43 | |
*** maciejjozefczyk has joined #openstack-nova | 09:45 | |
*** sapd1_x has quit IRC | 09:46 | |
*** zigo has joined #openstack-nova | 09:55 | |
*** ccamacho has joined #openstack-nova | 10:00 | |
*** priteau has quit IRC | 10:07 | |
*** priteau has joined #openstack-nova | 10:12 | |
openstackgerrit | Michael Still proposed openstack/nova master: Style corrections for privsep usage. https://review.openstack.org/648615 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Hacking N362: Don't abbrev/alias privsep import https://review.openstack.org/649190 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Improve test coverage of nova.privsep.path. https://review.openstack.org/648601 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Improve test coverage of nova.privsep.fs. https://review.openstack.org/648602 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Improve test coverage of nova.privsep.fs, continued. https://review.openstack.org/648603 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Add test coverage for nova.privsep.libvirt. https://review.openstack.org/648616 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Add test coverage for nova.privsep.qemu. https://review.openstack.org/649191 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Privsepify ipv4 forwarding enablement. https://review.openstack.org/635431 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Remove unused FP device creation and deletion methods. https://review.openstack.org/635433 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Privsep the ebtables modification code. https://review.openstack.org/635435 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Move adding vlans to interfaces to privsep. https://review.openstack.org/635436 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Move iptables rule fetching and setting to privsep. https://review.openstack.org/636508 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Move dnsmasq restarts to privsep. https://review.openstack.org/639280 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Move router advertisement daemon restarts to privsep. https://review.openstack.org/639281 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Move calls to ovs-vsctl to privsep. https://review.openstack.org/639282 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Move setting of device trust to privsep. https://review.openstack.org/639283 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Move final bridge commands to privsep. https://review.openstack.org/639580 | 10:13 |
openstackgerrit | Michael Still proposed openstack/nova master: Cleanup the _execute shim in nova/network. https://review.openstack.org/639581 | 10:13 |
*** wolverineav has joined #openstack-nova | 10:17 | |
*** ttsiouts has quit IRC | 10:20 | |
*** ttsiouts has joined #openstack-nova | 10:21 | |
*** wolverineav has quit IRC | 10:21 | |
*** ttsiouts has quit IRC | 10:26 | |
sean-k-mooney | stephenfin: bauzas care to hit this https://review.openstack.org/#/c/622972/16 | 10:30 |
*** tbachman has quit IRC | 10:47 | |
stephenfin | done | 10:49 |
sean-k-mooney | :) thank you | 10:50 |
NewBruce | sean-k-mooney - DM’ing you with an update on the port binding issue when live migrating RDO -> OSA | 10:54 |
sean-k-mooney | NewBruce: oh cool so you made some progress root causing the interaction issues | 10:54 |
*** nicolasbock has joined #openstack-nova | 10:58 | |
NewBruce | so, i filled that with a bunch of extra debugs - if im migrating from compute28 -> compute29 (both OSA and successful) | 11:05 |
NewBruce | 2019-02-12 13:09:33.458 59488 INFO nova.network.neutronv2.api [req-8e14a273-fc18-461e-a2de-b8bad835dcd8 a3bee416cf67420995855d602d2bccd3 a564613210ee43708b8a7fc6274ebd63 - default default] [BRUCE](A-8-2): _setup_migration_port_profile: host_id = cc-compute29-kna1 | 11:05 |
*** tiendc has quit IRC | 11:05 | |
*** erlon_ has joined #openstack-nova | 11:06 | |
*** ttsiouts has joined #openstack-nova | 11:06 | |
*** _alastor_ has quit IRC | 11:08 | |
*** udesale has quit IRC | 11:10 | |
jaypipes | stephenfin: you around? I'm having a hell of a time trying to run tests in nova now that tox requirements have changed. | 11:12 |
jaypipes | stephenfin: nothing I seem to do will get me out of the hell that is this: [jaypipes@uberbox nova]$ tox -efunctional | 11:12 |
jaypipes | ERROR: tox version is 2.5, required is at least 3.1.1 | 11:12 |
jaypipes | I've sudo -H pip install -U pip tox, I've apt purge'd python-tox | 11:12 |
jaypipes | and still /usr/local/bin/tox persists and is pointing somewhere old | 11:13 |
jaypipes | and I have no idea why this crap has to just suddenly break :) | 11:13 |
*** janki has joined #openstack-nova | 11:14 | |
jaypipes | stephenfin: cdent has graciously helped me. needed to `sudo -H pip uninstall tox && sudo -H pip install tox && reload bash...` | 11:18 |
sean-k-mooney | jaypipes: ya i was just going to say that you probably need to uninstall first | 11:32 |
sean-k-mooney | i had more or less the same issue too, as i started with tox from my package manager and needed to swap to pip later | 11:32 |
*** cdent has quit IRC | 11:36 | |
*** ttsiouts has quit IRC | 11:44 | |
*** ttsiouts has joined #openstack-nova | 11:45 | |
*** tetsuro has quit IRC | 11:48 | |
*** tetsuro has joined #openstack-nova | 11:49 | |
*** ttsiouts has quit IRC | 11:49 | |
*** ttsiouts has joined #openstack-nova | 11:53 | |
*** phasespace has joined #openstack-nova | 11:54 | |
stephenfin | jaypipes: Good to hear you got sorted. FWIW, the current guidelines suggest not using 'sudo pip install' since it's way too likely to break distro packages. You need to prepend (or append, I don't recall the order) '~/.local/bin' to PATH then 'pip install --user' | 12:03 |
stephenfin | jaypipes: fwiw, I was reluctant to merge the patches that required tox 3.1.1+ but I think we decided the benefits outweighed the costs | 12:03 |
* stephenfin uses system tox, reno, etc. wherever possible | 12:04 | |
*** tbachman has joined #openstack-nova | 12:05 | |
* sean-k-mooney avoids system packages whenever possible and prefers pip packages or developer repos/ppas over distro ones in general | 12:08 | |
sean-k-mooney | that said it depend on what the thing is that im installing | 12:08 |
sean-k-mooney | stephenfin: can i get your input on http://logs.openstack.org/33/647733/2/check/nova-tox-functional/72500de/testr_results.html.gz | 12:08 |
sean-k-mooney | our functional notification tests are asserting behavior of the payloads as json dicts | 12:09 |
sean-k-mooney | 1st, at first glance it feels wrong to call these functional tests, but i have not looked at the code to see how they work so i'll put that aside for a minute | 12:10 |
*** sapd1_x has joined #openstack-nova | 12:10 | |
sean-k-mooney | should i a.) update these to use the new version of the object b.) convert them to do assertion using the objects or c.) force them to use the old version somehow? | 12:11 |
sean-k-mooney | by the way these are semi valid failures | 12:13 |
sean-k-mooney | as i am updating the allowed values in a field in the image metadata object, but unlike everywhere else in nova we apparently version bump composed objects in this specific case. | 12:15 |
sean-k-mooney | https://review.openstack.org/#/c/647733/ | 12:15 |
*** manjeets has quit IRC | 12:16 | |
*** wolverineav has joined #openstack-nova | 12:17 | |
jaypipes | thx stephenfin | 12:21 |
*** wolverineav has quit IRC | 12:22 | |
*** markvoelker has quit IRC | 12:25 | |
*** markvoelker has joined #openstack-nova | 12:25 | |
sean-k-mooney | jaypipes: by the way i dont know if you have time to review the last two patches in the sriov migration blueprint https://review.openstack.org/#/q/topic:bp/libvirt-neutron-sriov-livemigration+(status:open) | 12:26 |
jaypipes | sean-k-mooney: hmm, my favorite topics, merged into one. | 12:26 |
sean-k-mooney | it would be nice to get that squared away before the ptg | 12:26 |
sean-k-mooney | haha all that is missing is numa | 12:27 |
jaypipes | and FPGAs. | 12:28 |
sean-k-mooney | live migration with numa affined fpga exposed by sriov passthrough... at some point it just gets easier to move the damn server | 12:29 |
sean-k-mooney | it is one of the upsides of ironic | 12:30 |
artom | Or you know, register a new corporation and buy them new machines. | 12:30 |
sean-k-mooney | im just going to pop out to grab lunch so ill brb | 12:32 |
*** jmlowe has quit IRC | 12:34 | |
openstackgerrit | Artem Vasilyev proposed openstack/nova master: systemd detection result caching nit fixes https://review.openstack.org/649229 | 12:37 |
*** tbachman has quit IRC | 12:39 | |
*** brinzhang has quit IRC | 12:39 | |
*** tbachman has joined #openstack-nova | 12:41 | |
stephenfin | sean-k-mooney: They look like valid errors to me. I'm guessing we should update the notification samples but gibi_off or mriedem are probably the people to ask | 12:42 |
*** belmoreira has quit IRC | 12:43 | |
*** belmoreira has joined #openstack-nova | 12:47 | |
*** eharney has joined #openstack-nova | 12:50 | |
*** mriedem has joined #openstack-nova | 12:52 | |
*** cdent has joined #openstack-nova | 12:55 | |
*** tetsuro has quit IRC | 12:57 | |
mriedem | cdent: question inline https://review.openstack.org/#/c/649068/ | 12:58 |
*** dikonoor has joined #openstack-nova | 13:04 | |
* cdent goes to look | 13:06 | |
*** mriedem has quit IRC | 13:08 | |
*** mriedem has joined #openstack-nova | 13:09 | |
*** ygk_12345 has joined #openstack-nova | 13:09 | |
kashyap | Can anyone remind me again, has upstream Git master opened for Train, yet? (/me is dazed after PTO) | 13:14 |
sean-k-mooney | yes | 13:14 |
sean-k-mooney | like a week or two ago | 13:14 |
kashyap | Thanks | 13:15 |
sean-k-mooney | it opens after RC1 but we don't tend to merge big things for a few weeks after | 13:15 |
gibi_off | sean-k-mooney: for notifications we only emit the latest version | 13:17 |
gibi_off | sean-k-mooney: so please update the sample file according to the change in the object | 13:17 |
gibi_off | sean-k-mooney: the tests are functional as they call the nova API and assert if notifications are received | 13:18 |
sean-k-mooney | gibi_off: ok but would they not avoid this issue if they parsed the notification into a python object and then checked that | 13:18 |
sean-k-mooney | gibi_off: also if you're off today you're doing it wrong :) but thanks i'll update it | 13:19 |
kashyap | efried: When you get a moment, since Train has forked, might want to put this through: https://review.openstack.org/#/c/641981/ | 13:19 |
efried | kashyap: done, thanks for the reminder. | 13:20 |
gibi_off | sean-k-mooney: the parsed object would have different version | 13:21 |
gibi_off | sean-k-mooney: also nova cannot assume how the notifications are parsed by the consumer so we assert the json we emit | 13:21 |
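A minimal sketch of why asserting on the emitted JSON catches a version bump, assuming a payload shaped like a serialized versioned object. The `nova_object.*` keys follow the serialized-object layout; the event type, the `ExamplePayload` name, the field values and the `diff_paths` helper are all hypothetical and are not taken from nova's sample files or test code.

```python
import json

# Hypothetical stored sample; not an actual nova notification sample file.
SAMPLE_JSON = """
{
  "event_type": "example.event",
  "payload": {
    "nova_object.name": "ExamplePayload",
    "nova_object.version": "1.0",
    "nova_object.data": {"hw_video_model": "vga"}
  }
}
"""


def diff_paths(emitted, sample, prefix=""):
    """Return the slash-separated paths at which two JSON-like values differ."""
    if isinstance(emitted, dict) and isinstance(sample, dict):
        paths = []
        for key in sorted(set(emitted) | set(sample)):
            path = f"{prefix}/{key}"
            if key not in emitted or key not in sample:
                # A key present on only one side is a mismatch in itself.
                paths.append(path)
            else:
                paths.extend(diff_paths(emitted[key], sample[key], path))
        return paths
    return [] if emitted == sample else [prefix or "/"]


sample = json.loads(SAMPLE_JSON)

# Simulate the service emitting a payload whose object version was bumped.
emitted = json.loads(SAMPLE_JSON)
emitted["payload"]["nova_object.version"] = "1.1"

print(diff_paths(emitted, sample))
```

Comparing the raw dicts flags the version key directly, which is the behaviour described above: the stored sample has to be updated whenever the emitted object changes.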
sean-k-mooney | ah good points | 13:21 |
kashyap | efried: Tack! | 13:22 |
*** lbragstad has joined #openstack-nova | 13:22 | |
kashyap | (Swedish for "thanks", if people don't want to look it up :D) | 13:22 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/stein: Add functional regression test for bug 1669054 https://review.openstack.org/649319 | 13:25 |
openstack | bug 1669054 in OpenStack Compute (nova) "RequestSpec.ignore_hosts from resize is reused in subsequent evacuate" [Medium,In progress] https://launchpad.net/bugs/1669054 - Assigned to Matt Riedemann (mriedem) | 13:25 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/stein: Do not persist RequestSpec.ignore_hosts https://review.openstack.org/649320 | 13:25 |
*** lpetrut has joined #openstack-nova | 13:26 | |
*** priteau has quit IRC | 13:28 | |
*** trident has quit IRC | 13:30 | |
*** ratailor has quit IRC | 13:32 | |
*** trident has joined #openstack-nova | 13:33 | |
*** hongbin has joined #openstack-nova | 13:35 | |
jaypipes | mnaser: around? can you please execute both of the queries in this pastebin and show me the output please? http://paste.openstack.org/show/748718/ | 13:36 |
*** awalende has quit IRC | 13:37 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/rocky: Add functional regression test for bug 1669054 https://review.openstack.org/649325 | 13:37 |
openstack | bug 1669054 in OpenStack Compute (nova) stein "RequestSpec.ignore_hosts from resize is reused in subsequent evacuate" [Medium,In progress] https://launchpad.net/bugs/1669054 - Assigned to Matt Riedemann (mriedem) | 13:37 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/rocky: Do not persist RequestSpec.ignore_hosts https://review.openstack.org/649326 | 13:37 |
*** awalende has joined #openstack-nova | 13:37 | |
*** BjoernT has joined #openstack-nova | 13:39 | |
*** BjoernT has quit IRC | 13:39 | |
*** awalende has quit IRC | 13:42 | |
*** helenafm has quit IRC | 13:44 | |
*** bbowen__ has joined #openstack-nova | 13:48 | |
yonglihe | mriedem: alex_xu: patch split, and new unit test added. and... i need to fix 2 of them. https://review.openstack.org/#/c/627765/13. thanks. | 13:49 |
*** awaugama has joined #openstack-nova | 13:49 | |
*** beagles is now known as beagles_dentist | 13:50 | |
alex_xu | yonglihe: sorry for not getting to it for a while, will try tomorrow | 13:51 |
yonglihe | thanks, great news for me. | 13:52 |
*** eharney_ has joined #openstack-nova | 13:52 | |
efried | kashyap: I actually knew that :) | 13:53 |
kashyap | efried: Nice. Then I can try a few more phrases later on, then :D | 13:53 |
efried | In Swedish? That would be pretty much the extent of it for me. Hit me with some other languages though | 13:54 |
*** eharney has quit IRC | 13:55 | |
efried | kashyap: full disclosure, I wasn't sure of the spelling. The same pronunciation has the same meaning (but slightly different spellings) in Swedish, Danish, and Norwegian. | 13:55 |
kashyap | efried: Yeah, you're right -- they're all North Germanic languages | 13:55 |
kashyap | So to put the spelling thing to rest: Tack (Swedish), Takk (Norwegian), and Tak (Danish) :D | 13:56 |
*** BjoernT has joined #openstack-nova | 13:56 | |
ygk_12345 | hi all | 13:57 |
ygk_12345 | i'm having issues with spinning up vms on a compute node. they are forever in the scheduling state | 13:58 |
ygk_12345 | it is a rocky setup with OSA | 13:58 |
*** lpetrut has quit IRC | 14:00 | |
kashyap | ygk_12345: Hi, as noted in PM, I think you might get better responses on the more generic #openstack channel, where more admins / operators might hang out. | 14:04 |
efried | mriedem: Here's an interesting one: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Background%20on%20this%20error%20at%3A%20http%3A%2F%2Fsqlalche.me%2Fe%2Frvf5%5C%22 | 14:05 |
efried | (e.g. http://logs.openstack.org/50/638150/7/check/openstack-tox-py27/345d1d1/testr_results.html.gz) | 14:05 |
efried | Something happens every day around 1am that makes sqla race like hell. | 14:05 |
*** tssurya has quit IRC | 14:05 | |
*** ttsiouts has quit IRC | 14:07 | |
*** ttsiouts has joined #openstack-nova | 14:08 | |
*** sapd1_x has quit IRC | 14:08 | |
*** ttsiouts has quit IRC | 14:10 | |
*** ttsiouts has joined #openstack-nova | 14:10 | |
artom | sean-k-mooney, hah, midair collision :) In any case, thanks for the sanity check | 14:12 |
artom | sean-k-mooney++ | 14:12 |
artom | Doh, wrong channel | 14:12 |
*** tssurya has joined #openstack-nova | 14:14 | |
ygk_12345 | any nova expert here ? | 14:15 |
ygk_12345 | tried in #openstack channel but no one could help me | 14:15 |
*** smcginnis_pto is now known as smcginnis | 14:16 | |
efried | ygk_12345: Have you had a look in the logs yet? | 14:17 |
ygk_12345 | efried: yes | 14:17 |
ygk_12345 | efried: it doesn't seem to indicate that the build started | 14:18 |
ygk_12345 | efried: http://paste.openstack.org/show/748717/ | 14:18 |
ygk_12345 | efried: it is in the scheduling state forever | 14:18 |
*** wolverineav has joined #openstack-nova | 14:18 | |
efried | ygk_12345: What about the n-sch and/or n-cpu logs? Anything appear to be hanging/repeating in there? | 14:20 |
ygk_12345 | efried: nothing this is the log I got so far | 14:20 |
ygk_12345 | efried: I see this "claim resources in the placement API for instance c2093503-d2d0-4401-8956-6a68a7d6e0dc claim_resources /openstack/venvs/nova-18.1.5/lib/python2.7/site-packages/nova/scheduler/utils.py:934" | 14:21 |
openstackgerrit | Chris Dent proposed openstack/nova master: Don't report 'exiting' when mapping cells https://review.openstack.org/649340 | 14:22 |
*** sapd1_x has joined #openstack-nova | 14:22 | |
*** mlavalle has joined #openstack-nova | 14:22 | |
*** helenafm has joined #openstack-nova | 14:23 | |
*** wolverineav has quit IRC | 14:23 | |
efried | ygk_12345: that was in the n-sch log presumably. Nothing after that? | 14:25 |
cdent | ygk_12345: did you discover hosts in your cells? | 14:25 |
ygk_12345 | cdent: efried yes | 14:25 |
ygk_12345 | cdent: how to validate it ? | 14:25 |
ygk_12345 | cdent: regarding cells ? | 14:25 |
*** dpawlik has quit IRC | 14:26 | |
openstackgerrit | Stephen Finucane proposed openstack/nova-specs master: Standardize CPU resource tracking https://review.openstack.org/555081 | 14:27 |
cdent | listing your hypervisors | 14:27 |
cdent | I think you need to look at your n-cpu logs | 14:27 |
ygk_12345 | cdent: all the hypervisors are enabled | 14:27 |
mriedem | efried: i think that's this http://status.openstack.org/elastic-recheck/#1793364 | 14:27 |
ygk_12345 | cdent: where to look at the n-cpu logs ? | 14:28 |
cdent | ygk_12345: it depends on how you installed your openstack, but somewhere you have a nova-compute process, on the host which is or manages your hypervisor | 14:28 |
ygk_12345 | cdent: i see only nova-compute log in the hypervisor. Also I cant find any entry with the vm uuid in the nova-compute log | 14:29 |
cdent | nova-compute log and n-cpu log are the same thing | 14:29 |
cdent | the name depends on how things were installed | 14:30 |
ygk_12345 | cdent: ok then I dont find any vm uuid entries there and all seems to be fine on the hypervisor | 14:30 |
cdent | ygk_12345: then either in the conductor log or the scheduler log there should be some kind of error, after the timestamp of the "claim resources" | 14:32 |
cdent | grepping for the vm uuid won't be sufficient, you need to look through the log for an error or warning | 14:32 |
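A rough sketch of that kind of scan, assuming log lines start with a date, a time, a process id and a level, as in the `2019-02-12 13:09:33.458 59488 INFO ...` line quoted earlier in this log. The `problems_after` helper and the sample lines are made up for illustration and are not real conductor or scheduler output.

```python
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
LEVELS = {"WARNING", "ERROR", "CRITICAL"}


def problems_after(lines, start):
    """Return lines at WARNING level or above, logged at or after `start`."""
    cutoff = datetime.strptime(start, STAMP_FORMAT)
    hits = []
    for line in lines:
        parts = line.split(None, 4)
        if len(parts) < 4:
            continue  # too short to carry date, time, pid and level
        try:
            stamp = datetime.strptime(" ".join(parts[:2]), STAMP_FORMAT)
        except ValueError:
            continue  # line does not start with a timestamp
        if stamp >= cutoff and parts[3] in LEVELS:
            hits.append(line)
    return hits


# Made-up sample lines; dates, pids and messages are placeholders.
SAMPLE_LOG = [
    "2019-04-02 14:20:59.000 1234 ERROR nova.example [-] earlier, ignored",
    "2019-04-02 14:21:00.100 1234 DEBUG nova.scheduler.utils [-] claim resources",
    "2019-04-02 14:22:05.250 1234 ERROR nova.example [-] made-up failure",
    "a line without a timestamp",
]

for hit in problems_after(SAMPLE_LOG, "2019-04-02 14:21:00.100"):
    print(hit)
```

The cutoff would be the timestamp of the "claim resources" entry, so anything the scheduler or conductor complained about afterwards shows up even when the line never mentions the instance UUID.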
openstackgerrit | Hamdy Khader proposed openstack/nova master: Fix port update of host_id in case of baremetal instance https://review.openstack.org/649345 | 14:33 |
ygk_12345 | cdent: I find this in conductor log "Setting instance to ERROR state.: MessagingTimeout: Timed out waiting for a reply to message ID ef508c01e69841ae9f84356b7463165c" | 14:35 |
ygk_12345 | cdent: Failed to compute_task_build_instances: Timed out waiting for a reply to message ID ef508c01e69841ae9f84356b7463165c: MessagingTimeout: Timed out waiting for a reply to message ID ef508c01e69841ae9f84356b7463165c | 14:35 |
sean-k-mooney | ygk_12345: well that is the down call to the compute to spawn the instance | 14:36 |
sean-k-mooney | i think | 14:36 |
cdent | ygk_12345: okay that's progress (in the sense of useful info). either your messaging bus (rabbitmq) is too busy or your conductor and compute host are unable to talk to one another over that bus | 14:36 |
sean-k-mooney | so on the compute you should be able to grep for ef508c01e69841ae9f84356b7463165c | 14:37 |
cdent | this might mean the compute node is misconfigured | 14:37 |
*** ivenszambrano has joined #openstack-nova | 14:37 | |
ygk_12345 | cdent: sean-k-mooney let me check | 14:37 |
sean-k-mooney | cdent: as in listening to the wrong exchange | 14:37 |
cdent | ygk_12345: I'm sorry but I've got to go, good luck with it | 14:37 |
*** cdent has quit IRC | 14:37 | |
ygk_12345 | sean-k-mooney: i dont find any log entry on the compute node with that message ID | 14:38 |
ygk_12345 | sean-k-mooney: do u suspect its the rabbitmq issue with compute node ? | 14:39 |
sean-k-mooney | that would seem like the next most likely option | 14:39 |
ygk_12345 | sean-k-mooney: how to check that ? | 14:39 |
ygk_12345 | sean-k-mooney: but I dont think we made any changes to rabbitmq | 14:40 |
sean-k-mooney | um, first you need to check the amqp settings in your nova.conf on both the conductor and compute node and make sure they are the same | 14:40 |
sean-k-mooney | the next thing to do would be to look at the topic queue and see if you can see the pending message | 14:40 |
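One way to do the first of those checks, sketched under the assumption that both services take their messaging settings from `transport_url` in the `[DEFAULT]` section of nova.conf. The host names, credentials and the `same_transport` helper below are hypothetical; a real deployment may list several rabbit hosts in one URL or set further options in `[oslo_messaging_rabbit]`, which this plain string comparison does not look at.

```python
import configparser


def transport_url(conf_text):
    """Return transport_url from [DEFAULT] of a nova.conf-style text, or None."""
    # interpolation=None so literal '%' characters in values are left alone;
    # strict=False so repeated options do not raise.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.defaults().get("transport_url")


def same_transport(conductor_conf, compute_conf):
    """True when both configs set transport_url to the same value."""
    first = transport_url(conductor_conf)
    second = transport_url(compute_conf)
    return first is not None and first == second


# Hypothetical config fragments; hosts and credentials are placeholders.
CONDUCTOR = """
[DEFAULT]
transport_url = rabbit://nova:secret@rabbit1.example:5672/nova
"""
COMPUTE_OK = CONDUCTOR
COMPUTE_BAD = """
[DEFAULT]
transport_url = rabbit://nova:secret@rabbit9.example:5672/other
"""

print(same_transport(CONDUCTOR, COMPUTE_OK))
print(same_transport(CONDUCTOR, COMPUTE_BAD))
```

In practice the two texts would be read from the nova.conf on the conductor host and on the compute host; a mismatch in host list or virtual host would explain a compute node that never sees the build request.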
*** ccamacho has quit IRC | 14:41 | |
ygk_12345 | sean-k-mooney: i see that all the rabbit servers are pingable from the compute node | 14:41 |
mnaser | jaypipes: around right now, ill do the explain | 14:44 |
ygk_12345 | sean-k-mooney: any idea ? | 14:44 |
mnaser | jaypipes: http://paste.openstack.org/show/748727/ | 14:46 |
sean-k-mooney | ygk_12345: have you logged into rabbitmq to see if the messages are still in the exchange queues | 14:46 |
ygk_12345 | sean-k-mooney: i dont see any queues around when I did list_queues | 14:47 |
ygk_12345 | sean-k-mooney: there r three rabbit containers | 14:47 |
sean-k-mooney | well presumably you are using the clustering plugin so from any of them you should see the same view | 14:48 |
ygk_12345 | sean-k-mooney: yes haproxy | 14:48 |
sean-k-mooney | haproxy is not the same thing | 14:48 |
ygk_12345 | sean-k-mooney: i see no error messages in the compute log | 14:48 |
sean-k-mooney | haproxy sits in front of rabbitmq and load-balances across the rabbit instances. but the 3 rabbitmq instances also need to be in a cluster. | 14:49 |
ygk_12345 | sean-k-mooney: yes it is openstack-ansible setup | 14:49 |
dansmith | this sounds like it's very not dev discussion? maybe you guys could move to #openstack? | 14:50 |
sean-k-mooney | perhaps, although ygk_12345 i think the openstack ansible channel might be able to help more | 14:51 |
sean-k-mooney | there is obviously some issue between the compute node and the conductor after the compute node filled up its disk and it appears to be related to rabbitmq | 14:52 |
ygk_12345 | sean-k-mooney: ok | 14:53 |
*** liuyulong has joined #openstack-nova | 14:53 | |
*** ygk_12345 has quit IRC | 14:54 | |
mnaser | btw, novnc has broken us and refuses to revert the thing that broke us: https://github.com/novnc/noVNC/pull/1220 | 14:56 |
mnaser | so CI currently installs novnc from packaging but anyone using the actual latest novnc will be broken | 14:56 |
*** janki has quit IRC | 14:56 | |
*** sridharg has quit IRC | 14:56 | |
*** ileixe has quit IRC | 14:57 | |
*** lpetrut has joined #openstack-nova | 14:59 | |
dansmith | mnaser: so it looks like we need to do something on our end then | 14:59 |
jaypipes | mnaser: that's what I was afraid of... thanks. | 15:00 |
mnaser | dansmith: yeah, im not sure how we "do something on our end" because of the design architecture, it doesn't give you a way to pass that info, unless we implement our own vnc.html and store it in repo and serve it.. overlaid on top of the vnc code, it starts to be iffy | 15:01 |
mnaser | I pinned the novnc release in openstack ansible to avoid this but yeah. | 15:01 |
mnaser | jaypipes: :< you're welcome | 15:01 |
dansmith | mnaser: yeah, I'm guessing that is what they're saying, that we should provide our own implementation of the client (html) if we're going to do the auth part | 15:02 |
*** cfriesen has joined #openstack-nova | 15:02 | |
mnaser | dansmith: the weird thing is that the auth part, they have their own "token" stuff, including something called token_plugins which you can implement | 15:02 |
mnaser | but even if you implement a token plugin, you can't even use novnc without rewriting things.. I don't get it. | 15:03 |
dansmith | without rewriting an html file you mean right? | 15:04 |
*** yan0s has quit IRC | 15:04 | |
dansmith | passing the token in path means we just get the token at the websocket url, right? | 15:06 |
dansmith | that'd be the right place to do the auth, so why is that a problem? | 15:06 |
*** tbachman_ has joined #openstack-nova | 15:07 | |
*** tbachman has quit IRC | 15:08 | |
*** tbachman_ is now known as tbachman | 15:08 | |
dansmith | (looks further) yeah, that's where we're getting the token for our server side websocket | 15:08 |
dansmith | so I'm not sure why that's not a solution | 15:09 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/queens: Add functional regression test for bug 1669054 https://review.openstack.org/649362 | 15:10 |
openstack | bug 1669054 in OpenStack Compute (nova) stein "RequestSpec.ignore_hosts from resize is reused in subsequent evacuate" [Medium,In progress] https://launchpad.net/bugs/1669054 - Assigned to Matt Riedemann (mriedem) | 15:10 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/queens: Do not persist RequestSpec.ignore_hosts https://review.openstack.org/649363 | 15:10 |
kashyap | efried++ | 15:13 |
kashyap | (Hmm, should probably get a karma bot in here.) | 15:13 |
*** lpetrut has quit IRC | 15:14 | |
*** beagles_dentist is now known as beagles | 15:15 | |
dansmith | no, we shouldn't. | 15:17 |
mnaser | dansmith: what if we did ?path=token -- thoughts? | 15:19 |
mnaser | then we can drop that whole path parsing thing and just grab all query params | 15:20 |
dansmith | mnaser: you mean path=$token instead of path=token=$token? | 15:20 |
mnaser | yeah | 15:20 |
kashyap | dansmith: Was only joking; where is your characteristic sense of humor... | 15:20 |
dansmith | you don't want to defeat the ability to actually use a path as part of that and have a proxy in between do you? | 15:20 |
dansmith | path=$token seems far more hacky than actually providing a legit url as the path | 15:21 |
mnaser | yeah, that is valid | 15:21 |
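The `path=?token=$token` idea discussed above can be sketched as follows. This is a minimal illustration, not nova's actual code: the helper names `build_console_url` and `token_from_ws_path` and the example host are assumptions. It shows the token being URL-encoded into the `path` query parameter on the client side and then read back from the query string of the websocket request path on the server side.

```python
from urllib.parse import parse_qs, quote, urlparse


def build_console_url(base_url, token):
    """Hypothetical helper: embed '?token=<token>' as the noVNC 'path' value.

    The inner '?token=...' is percent-encoded so that it survives as a
    single query parameter value ('%3Ftoken%3D<token>').
    """
    inner = "?token=" + token
    return base_url + "?path=" + quote(inner, safe="")


def token_from_ws_path(ws_path):
    """Hypothetical helper: pull the token out of the websocket request path.

    Returns None when the path carries no 'token' query parameter.
    """
    query = urlparse(ws_path).query
    values = parse_qs(query).get("token", [])
    return values[0] if values else None
```

Keeping a real path-plus-query in `path` (rather than a bare token) leaves room for a proxy in between that routes on the path, which is the concern raised in the discussion.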
mnaser | the only concern I have now is *hopefully* that works for both old and new novnc | 15:22 |
* mnaser doesn't have time right now to follow up on a lot of this | 15:22 | |
dansmith | not sure why that matters, since we control our package versions (and it's up to the distro to match), but as far as I can tell, they didn't change the path behavior | 15:22 |
dansmith | or claim not to | 15:22 |
*** belmorei_ has joined #openstack-nova | 15:22 | |
*** belmorei_ has quit IRC | 15:23 | |
*** eharney_ has quit IRC | 15:23 | |
mnaser | dansmith: nova's CI actually installs novnc from distro pkgs which is probably why this wasn't caught | 15:25 |
mnaser | and yeah, the path behavior seems to always have been there and continued to be there | 15:25 |
dansmith | oh? I thought we had an u-c for it | 15:25 |
mnaser | nope, discovered this yesterday | 15:25 |
mnaser | we have u-c for Websockify probably | 15:26 |
mnaser | but novnc isn't even a python package so yeah | 15:26 |
*** helenafm has quit IRC | 15:26 | |
mnaser | dansmith: https://github.com/openstack-dev/devstack/blob/358cc122c3a6d30bf043b3e478790fd2773e9a88/.zuul.yaml#L220 | 15:26 |
dansmith | okay | 15:27 |
dansmith | yeah I guess that makes sense actually | 15:27 |
sean-k-mooney | mnaser: oh your right we set NOVNC_FROM_PACKAGE=True | 15:27 |
openstackgerrit | Mohammed Naser proposed openstack/nova master: wip: start using ?path=%3Ftoken%3D=<token> https://review.openstack.org/649372 | 15:35 |
mnaser | I don't have time to follow up on this too much but I guess its a start for someone to go through and start a discussion | 15:35 |
*** ivve has quit IRC | 15:35 | |
dansmith | seems like melwitt has some interest at least | 15:36 |
*** BjoernT has quit IRC | 15:36 | |
mnaser | yeah, so maybe that's the initial copy pasta which will probably fail | 15:37 |
* mnaser needs to go back to the fine art of openstack upgrades | 15:37 | |
cfriesen | sean-k-mooney: do you know if nova is supposed to handle PCI passthrough for Intel QAT devices? | 15:39 |
sean-k-mooney | yes | 15:39 |
sean-k-mooney | it has worked since before icehouse | 15:39 |
sean-k-mooney | you can do both pf and vf passthrough | 15:39 |
sean-k-mooney | i think intel QAT devices were perhaps the first usecase that pci passthrough was enabled for. i know nics came after it | 15:40 |
cfriesen | sean-k-mooney: we're hitting a weird issue where they're hittting the SRIOV_VF clause in LibvirtDriver._get_device_type() and failing in pci_utils.get_ifname_by_pci_address(). Are we configuring something wrong? | 15:41 |
sean-k-mooney | cfriesen: do you have a physnet set in the pci whitelist for that device | 15:42 |
cfriesen | sean-k-mooney: not sure, can check. Should it have one? | 15:42 |
sean-k-mooney | no | 15:42 |
sean-k-mooney | that should only be set on nics | 15:42 |
sean-k-mooney | QAT we expect the type of the device to be PCI for both pf and vf in the pcimanager | 15:43 |
openstackgerrit | Colleen Murphy proposed openstack/nova stable/stein: Move create of ComputeAPI object in websocketproxy https://review.openstack.org/649374 | 15:43 |
openstackgerrit | Colleen Murphy proposed openstack/nova stable/rocky: Move create of ComputeAPI object in websocketproxy https://review.openstack.org/649375 | 15:43 |
sean-k-mooney | but it's possible that the code we added for the bandwidth based scheduling is causing issues | 15:43 |
*** sapd1_x has quit IRC | 15:44 | |
sean-k-mooney | cfriesen: https://github.com/openstack/nova/blob/master/nova/pci/request.py#L16-L39 | 15:44 |
cfriesen | sean-k-mooney: sweet, thanks | 15:45 |
sean-k-mooney | "product_id": "0443" is the VF and "product_id": "0442" is the pf | 15:46 |
*** ccamacho has joined #openstack-nova | 15:46 | |
sean-k-mooney | cfriesen: yep QAT was the thing that introduced pci passthrough https://github.com/openstack/nova/commit/fe67148234dba42468793f33c2ca83ce0616e824 so it really should work still | 15:47 |
cfriesen | sean-k-mooney: I expect it's a config issue. | 15:48 |
sean-k-mooney | cfriesen: it's possible but the pci_utils.get_ifname_by_pci_address call was introduced for stein for bandwidth based scheduling so there could be a bug. i don't have qat devices to test with | 15:49 |
*** hamzy has quit IRC | 15:49 | |
sean-k-mooney | well the function existed before, we just use it in more places now | 15:49 |
efried | sean-k-mooney: this look kosher to you: https://review.openstack.org/#/c/635533/ | 15:50 |
efried | dansmith: ^ if you please? | 15:51 |
sean-k-mooney | xen is not really my thing but ill take a look | 15:51 |
*** krypto has joined #openstack-nova | 15:51 | |
efried | sean-k-mooney: it's not a xen thing, really an ssl thing. | 15:52 |
*** belmoreira has quit IRC | 15:53 | |
dansmith | it's not an ssl thing, it's a python/oslo thing AFAICT | 15:54 |
sean-k-mooney | ya its processutils exit code checking | 15:55 |
sean-k-mooney | but yes on second look it looks sane to me | 15:55 |
sean-k-mooney | efried: is there any chance that the password could be logged to standard error if openssl did not like it | 15:56 |
openstackgerrit | Eric Fried proposed openstack/nova master: Add minimum value in max_concurrent_live_migrations https://review.openstack.org/648302 | 15:57 |
efried | sean-k-mooney: I would seriously hope there is no ssl command in existence that will echo back a password to you :) | 15:58 |
efried | dansmith: only peripherally. They're just changing from "anything on stderr means we should fail" to "nonzero return code means we should fail". | 15:58 |
dansmith | ...right | 15:58 |
*** BjoernT has joined #openstack-nova | 15:59 | |
sean-k-mooney | efried: yes | 16:00 |
dansmith | efried: the patch asserts that before, writing anything to stderr would cause it to raise, even if it exited with zero, right? | 16:00 |
dansmith | I don't see where in oslo that behavior happens | 16:00 |
dansmith | (it'd be pretty dumb, which is why I'm curious) | 16:00 |
sean-k-mooney | dansmith: https://github.com/openstack/oslo.concurrency/blob/master/oslo_concurrency/processutils.py#L414-L424 | 16:01 |
efried | dansmith: It wasn't in oslo, they were triggering the exception here (in nova) based on stderr being nonempty. | 16:02 |
dansmith | sean-k-mooney: that's not what I'm asking about | 16:02 |
dansmith | efried: oh? I thought they were asserting that was oslo... | 16:02 |
dansmith | oh RuntimeError, I see | 16:02 |
sean-k-mooney | they are asserting that if openssl wrote anything to stderr then nova would raise the runtime error | 16:03 |
*** ttsiouts has quit IRC | 16:04 | |
dansmith | efried: that makes it much more openssl-related than I thought, I just zeroed in on the process handling.. so I dunno, probably need someone much more familiar with openssl to validate those semantics | 16:04 |
sean-k-mooney | i'm guessing that if openssl ever decided to add a deprecation warning or something this caused issues for them | 16:04 |
*** ttsiouts has joined #openstack-nova | 16:04 | |
dansmith | right, I definitely get that | 16:04 |
sean-k-mooney | it looks like they just want to be a little more graceful and assume openssl follows the standard unix convention that if it returned an exit code of 0 it succeeded | 16:05 |
sean-k-mooney | which i think is reasonable | 16:05 |
sean-k-mooney | did the bug have an explicit example | 16:05 |
dansmith | sean-k-mooney: yes, that's all obvious :) | 16:06 |
sean-k-mooney | yes, so the patch is sane, to answer efried's original question | 16:07 |
sean-k-mooney | the bug was reported as a deprecation warning as i guessed | 16:07 |
sean-k-mooney | |RuntimeError: OpenSSL error: *** WARNING : deprecated key derivation used. | 16:07 |
dansmith | the original code didn't look at the return code for something security-related and so changing that behavior is potentially pretty impactful | 16:07 |
sean-k-mooney | |Using -iter or -pbkdf2 would be better. | 16:07 |
*** ttsiouts has quit IRC | 16:08 | |
sean-k-mooney | that is possibly true, yes. in this specific instance of the RuntimeError that was raised i think it's safe, but you are concerned that there could be other cases where it would not be | 16:08 |
*** ttsiouts has joined #openstack-nova | 16:08 | |
*** tssurya has quit IRC | 16:09 | |
dansmith | I'm saying they used to fail if anything was written to stderr and now they won't | 16:09 |
sean-k-mooney | i would hope openssl would not exit with code 0 for anything other than success but i don't know that for certain even if i believe it very likely | 16:09 |
dansmith | and depending on what is going on here, that could be, you know, a big deal | 16:09 |
sean-k-mooney | yes | 16:09 |
dansmith | depends on the command and what is going on | 16:09 |
sean-k-mooney | well we can see the command it fixed | 16:10 |
dansmith | I'm saying someone needs to go make that determination, IMHO and not just blindly approve this | 16:10 |
sean-k-mooney | we are encrypting input text with a shared key using aes-128-cbc | 16:10 |
sean-k-mooney | fair im still wondering why we are doing this via the shell in the first place | 16:11 |
*** imacdonn has joined #openstack-nova | 16:12 | |
*** wolverineav has joined #openstack-nova | 16:19 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/pike: Fix functional tests for USE_NEUTRON https://review.openstack.org/649385 | 16:22 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/pike: Add functional regression test for bug 1669054 https://review.openstack.org/649386 | 16:22 |
openstack | bug 1669054 in OpenStack Compute (nova) stein "RequestSpec.ignore_hosts from resize is reused in subsequent evacuate" [Medium,In progress] https://launchpad.net/bugs/1669054 - Assigned to Matt Riedemann (mriedem) | 16:22 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/pike: Do not persist RequestSpec.ignore_hosts https://review.openstack.org/649387 | 16:22 |
openstackgerrit | Merged openstack/nova master: Correct lower-constraints.txt and the related tox job https://review.openstack.org/622972 | 16:23 |
openstackgerrit | Merged openstack/nova master: Adding tests to demonstrate bug #1821824 https://review.openstack.org/647957 | 16:23 |
openstack | bug 1821824 in OpenStack Compute (nova) "Forbidden traits in flavor properties don't work" [Undecided,In progress] https://launchpad.net/bugs/1821824 - Assigned to Magnus Bergman (magnusbe) | 16:23 |
dansmith | looks like dimitri did a fairly decent analysis of the openssl code, which is good, but I also wonder if we couldn't just ignore lines that start with "WARNING" or something from the stderr and retain some of the original behavior | 16:24 |
*** rpittau is now known as rpittau|afk | 16:24 | |
dansmith | anyway, I don't really have time to dig into it super deep | 16:24 |
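The two policies debated above (fail on any stderr output versus fail only on a nonzero exit code, optionally still flagging non-warning stderr lines) can be sketched like this. It is an illustration using the standard library only, not the patch under review; `run_checked` and `unexpected_stderr` are hypothetical names, and matching on the substring "WARNING" is an assumption about how such a filter might look.

```python
import subprocess
import sys


def run_checked(cmd, input_text=""):
    """Run cmd and fail only when the exit code is nonzero.

    stderr is returned to the caller instead of being treated as an error,
    so a deprecation warning on stderr does not abort the operation.
    """
    proc = subprocess.run(cmd, input=input_text, capture_output=True,
                          text=True)
    if proc.returncode != 0:
        raise RuntimeError("command failed with exit code %d: %s"
                           % (proc.returncode, proc.stderr.strip()))
    return proc.stdout, proc.stderr


def unexpected_stderr(stderr):
    """Return the non-empty stderr lines that do not mention WARNING.

    A caller wanting to keep some of the stricter behaviour could raise
    when this list is non-empty.
    """
    return [line for line in stderr.splitlines()
            if line.strip() and "WARNING" not in line]
```

Note that the warning quoted earlier spans two lines, and the second one ("Using -iter or -pbkdf2 would be better.") does not contain "WARNING", so a simple per-line filter like this would still flag it. That is one reason the stricter variant needs someone to check what openssl actually emits.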
*** dtantsur is now known as dtantsur|afk | 16:24 | |
openstackgerrit | Merged openstack/nova master: Add placement as required project to functional py36 and 37 https://review.openstack.org/649068 | 16:25 |
*** wolverineav has quit IRC | 16:25 | |
*** ccamacho has quit IRC | 16:32 | |
openstackgerrit | Merged openstack/nova master: libvirt: Use 'writeback' QEMU cache mode when 'none' is not viable https://review.openstack.org/641981 | 16:34 |
*** ircuser-1 has joined #openstack-nova | 16:35 | |
*** BjoernT has quit IRC | 16:35 | |
cfriesen | sean-k-mooney: apparently nova-compute chokes on startup with this particular QAT hardware even with no whitelist/alias entries in nova.conf. | 16:35 |
*** _alastor_ has joined #openstack-nova | 16:37 | |
*** BjoernT has joined #openstack-nova | 16:37 | |
*** _alastor_ has quit IRC | 16:39 | |
*** _alastor_ has joined #openstack-nova | 16:39 | |
sean-k-mooney | cfriesen: ok then this is likely related to gibi_off's change to auto lookup the netdev name for bandwidth based scheduling | 16:41 |
cfriesen | sean-k-mooney: yeah, confirmed that we're dying in the code gibi_off added in Dec 2018. | 16:42 |
cfriesen | sean-k-mooney: we're hitting this with the standard embedded Intel QAT, so it's going to cause grief with standard hardware | 16:44 |
sean-k-mooney | it's probably this bit https://github.com/openstack/nova/commit/c02e213d507c830427a86d6a4bb4f7a2f5158590#diff-f4019782d93a196a0d026479e6aa61b1R5938 | 16:45 |
cfriesen | sean-k-mooney: the issue is that there is no "net" in the device path (i.e. /sys/bus/pci/devices/<pci_addr>/net) | 16:45 |
sean-k-mooney | ya | 16:46 |
sean-k-mooney | so https://github.com/openstack/nova/blob/c02e213d507c830427a86d6a4bb4f7a2f5158590/nova/virt/libvirt/driver.py#L5938-L5940 | 16:46 |
sean-k-mooney | should only be executed for VF that are network devices | 16:46 |
*** davidsha has quit IRC | 16:47 | |
sean-k-mooney | that should be a simple fix | 16:47 |
sean-k-mooney | but we will need to land it in RC2 or backport to stein before thursday to include it in the GA release | 16:47 |
sean-k-mooney | cfriesen: we are expecting qat to hit the final return however | 16:49 |
sean-k-mooney | so there is something else going on | 16:49 |
cfriesen | sean-k-mooney: there are VFs for this device, so I was assuming we're enumerating the VFs | 16:50 |
*** BjoernT has quit IRC | 16:50 | |
sean-k-mooney | yeah, but we are only meant to report it as type SRIOV_VF if it's a nic | 16:50 |
sean-k-mooney | all non nic VFs are meant to be TYPE_PCI | 16:51 |
cfriesen | sean-k-mooney: where is that code? | 16:51 |
sean-k-mooney | i'm looking for it now but it's the only thing that prevented you getting a qat device instead of a nic VF when you have a neutron port of vnic_type direct in the past | 16:53 |
*** dikonoor has quit IRC | 16:55 | |
*** amodi has quit IRC | 16:58 | |
cfriesen | sean-k-mooney: it kind of looks like _get_pcidev_info() is calling self._host.device_lookup_by_name() to get the XML for the device. Is it possible libvirt is doing something different? | 17:01 |
openstackgerrit | Artem Vasilyev proposed openstack/nova master: systemd detection result caching nit fixes https://review.openstack.org/649229 | 17:03 |
sean-k-mooney | is that where you are having the failure? | 17:03 |
sean-k-mooney | cfriesen: can you post a copy of the error to paste.openstack.org | 17:04 |
cfriesen | sean-k-mooney: yeah, nova-compute startup. here's the starlingx bug, the nova stuff is partway down: https://bugs.launchpad.net/starlingx/+bug/1821938 | 17:04 |
openstack | Launchpad bug 1821938 in StarlingX "No nova hypervisor can be enabled on workers with QAT devices" [High,Triaged] | 17:05 |
*** hamzy has joined #openstack-nova | 17:05 | |
sean-k-mooney | cfriesen: thanks | 17:05 |
*** ttsiouts has quit IRC | 17:05 | |
cfriesen | sean-k-mooney: extra info: http://paste.openstack.org/show/748734/ | 17:05 |
*** ttsiouts has joined #openstack-nova | 17:06 | |
sean-k-mooney | ya so this is not failing because of https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L6067 | 17:06 |
cfriesen | we don't have the resources to fix this in the near future, got other stuff to deal with | 17:06 |
sean-k-mooney | its failing because of https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L6047 | 17:06 |
sean-k-mooney | because it does not have a netdev | 17:06 |
cfriesen | sean-k-mooney: correct, but you were wondering why we were going down the VF path for things that weren't nics | 17:07 |
sean-k-mooney | which is because of gibis change | 17:07 |
sean-k-mooney | ya but the other filtering i'm thinking of could be elsewhere | 17:07 |
sean-k-mooney | i think we just need to put a guard around that call | 17:07 |
cfriesen | sean-k-mooney: looks like _get_device_capabilities() also assumes that SRIOV_VF is a NIC | 17:08 |
sean-k-mooney | yes it does | 17:09 |
sean-k-mooney | although it is reading from libvirt | 17:09 |
sean-k-mooney | instead of sysfs | 17:09 |
sean-k-mooney | so it's probably fine | 17:09 |
*** dpawlik has joined #openstack-nova | 17:09 | |
cfriesen | sean-k-mooney: _get_pcinet_info calls get_net_name_by_vf_pci_address() | 17:10 |
cfriesen | so I think it'll choke | 17:10 |
*** ttsiouts has quit IRC | 17:10 | |
sean-k-mooney | ill quickly hack something up one sec | 17:10 |
*** eharney has joined #openstack-nova | 17:11 | |
sean-k-mooney | cfriesen: https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L6008-L6010 i think will guard it for that case | 17:12 |
sean-k-mooney | actually no it won't | 17:12 |
*** ralonsoh has quit IRC | 17:12 | |
sean-k-mooney | actually it should be fine | 17:14 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/pci/utils.py#L205 | 17:14 |
sean-k-mooney | the exception is caught internally | 17:14 |
cfriesen | ah, yes | 17:14 |
sean-k-mooney | am is that the case in the starlingx code? | 17:15 |
cfriesen | should be, but we were choking earlier in _get_device_type() | 17:15 |
*** dpawlik has quit IRC | 17:15 | |
sean-k-mooney | oh i see the issue | 17:16 |
sean-k-mooney | we are doing | 17:16 |
sean-k-mooney | 'parent_ifname': | 17:16 |
sean-k-mooney | pci_utils.get_ifname_by_pci_address( | 17:16 |
sean-k-mooney | pci_address, pf_interface=True), | 17:16 |
sean-k-mooney | so we are calling pci_utils.get_ifname_by_pci_address directly so we don't catch the exception | 17:17 |
cfriesen | yep | 17:17 |
sean-k-mooney | whereas the other code calls get_net_name_by_vf_pci_address which does | 17:17 |
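The failure mode described above (a VF with no `net` directory under its sysfs device path, looked up without a guard) can be sketched as below. This is a standalone illustration and not nova's code: `DeviceNotFound`, `ifname_by_pci_address` and `parent_ifname_or_none` are hypothetical stand-ins for the real exception and helpers, and the `sysfs_root` parameter exists only so the sketch can be exercised against a temporary directory.

```python
import os


class DeviceNotFound(Exception):
    """Illustrative stand-in for the lookup failure raised by the real code."""


def ifname_by_pci_address(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the first netdev name listed under <pci_addr>/net.

    Raises DeviceNotFound when the device has no 'net' directory, which is
    the case for VFs that are not network devices.
    """
    net_dir = os.path.join(sysfs_root, pci_addr, "net")
    try:
        return os.listdir(net_dir)[0]
    except (OSError, IndexError):
        raise DeviceNotFound(pci_addr)


def parent_ifname_or_none(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Guarded lookup: a missing netdev is expected for non-NIC VFs."""
    try:
        return ifname_by_pci_address(pci_addr, sysfs_root)
    except DeviceNotFound:
        # Non-NIC VFs (for example QAT) have no netdev, so there is no
        # parent interface name to report; this is not an error.
        return None
```

Catching only the specific lookup exception, rather than a blanket `Exception`, keeps unrelated failures visible.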
*** ricolin has quit IRC | 17:17 | |
sean-k-mooney | ok this is a simple fix, i'll do it now and mark it as closing the bug | 17:17 |
sean-k-mooney | efried: i'm going to try and fix https://bugs.launchpad.net/starlingx/+bug/1821938 can we land it in stein? | 17:18 |
openstack | Launchpad bug 1821938 in StarlingX "No nova hypervisor can be enabled on workers with QAT devices" [High,Triaged] | 17:18 |
cfriesen | I think one of my coworkers is going to open a nova bug | 17:18 |
sean-k-mooney | ok ill add the starlingx bug as a related bug so | 17:18 |
sean-k-mooney | or do ye want to fix it | 17:18 |
cfriesen | go for it | 17:19 |
*** erlon has joined #openstack-nova | 17:20 | |
cfriesen | I think my guy is having lunch. :) I'll send you the nova bug when I get the number. | 17:20 |
*** artom has quit IRC | 17:22 | |
stephenfin | mriedem: https://review.openstack.org/626949 | 17:24 |
* stephenfin -> home | 17:24 | |
openstackgerrit | Jared Winborne proposed openstack/nova master: Leave the brackets on Ceph Monitor IPv6 addresses for libguestfs https://review.openstack.org/649405 | 17:26 |
*** KH-Jared has joined #openstack-nova | 17:28 | |
*** jmlowe has joined #openstack-nova | 17:28 | |
KH-Jared | I fully expect my change didn't follow some proper practice on the change I just submitted, guess I just have to wait and find out what that is at this point | 17:31 |
*** wolverineav has joined #openstack-nova | 17:34 | |
*** amodi has joined #openstack-nova | 17:36 | |
openstackgerrit | sean mooney proposed openstack/nova master: gracefuly handel none nic VFs https://review.openstack.org/649409 | 17:37 |
*** eharney has quit IRC | 17:37 | |
sean-k-mooney | cfriesen: ^ | 17:37 |
*** jmlowe has quit IRC | 17:37 | |
mriedem | KH-Jared: commented | 17:37 |
sean-k-mooney | i need to run the unit tests and see if any fail and also add a new one | 17:38 |
KH-Jared | ty mriedem | 17:39 |
sean-k-mooney | mriedem: not sure if you were following but ^ is a fix for https://bugs.launchpad.net/starlingx/+bug/1821938 | 17:39 |
openstack | Launchpad bug 1821938 in StarlingX "No nova hypervisor can be enabled on workers with QAT devices" [High,Triaged] | 17:39 |
mriedem | sean-k-mooney: i wasn't | 17:39 |
sean-k-mooney | think we could land it in stein if i get it ready soon | 17:39 |
mriedem | idk wtf a qat device is | 17:39 |
sean-k-mooney | Intel QuickAssist crypto card | 17:40 |
mriedem | looks like something gibi_off should be aware of | 17:40 |
sean-k-mooney | it was the first pci passthrough device we supported in nova | 17:40 |
sean-k-mooney | mriedem: ya so gibi_off added the auto parent interface name lookup feature i suggested for the bandwidth based scheduling | 17:41 |
sean-k-mooney | but we missed that it should have handled VFs that were not nics | 17:41 |
sean-k-mooney | so it raises an exception if you have a pci device that supports sriov but is not a nic | 17:41 |
sean-k-mooney | like a QAT device or a GPU that uses sriov like amd do | 17:42 |
mriedem | well i don't know about all that wackiness but i know you need a test and you could avoid blanket ignoring Exception if you changed to handle PciDeviceNotFoundById | 17:43 |
mriedem | and add a comment about why it's ok to ignore | 17:43 |
mriedem | run a spellchecker on your commit message as well :) | 17:44 |
sean-k-mooney | yep ill do all of the above. i was copying what we do here more or less https://github.com/openstack/nova/blame/2384c41b781a84de98d0932f44d4b3c544c3fe3d/nova/pci/utils.py#L205 | 17:44 |
*** tbachman has quit IRC | 17:45 | |
mriedem | you also need a nova bug for that and tag it with stein-rc-potential | 17:47 |
mriedem | and inform the PTL | 17:47 |
sean-k-mooney | yes cfriesen or one of his coworkers is filing the bug but i guess i can just do that, and i pinged efried but i think he is away or having lunch | 17:48 |
*** BjoernT has joined #openstack-nova | 17:48 | |
sean-k-mooney | or ignoring me that is valid too | 17:48 |
mriedem | he's out cracking skulls over lunch break i'm sure | 17:48 |
*** MasterofJOKers has quit IRC | 17:48 | |
*** MasterofJOKers has joined #openstack-nova | 17:49 | |
sean-k-mooney | cool i'll get all of this done in the next hour or so. i'm assuming this qualifies by the way for the rc/stein release | 17:49 |
dansmith | this is stein ptl anyway right? | 17:49 |
sean-k-mooney | oh that would be melwitt then | 17:50 |
efried | I'm back | 17:50 |
efried | sean-k-mooney: I'm not stable anyway | 17:50 |
efried | take that however you like | 17:50 |
dansmith | nice | 17:50 |
sean-k-mooney | efried: cfriesen/starlingx noticed you can't start nova compute on hosts with qat integrated into the cpu/chipset | 17:51 |
efried | clearly we just need to get rid of the pci subsystem | 17:51 |
sean-k-mooney | clearly. the fix is trivial and i'm cleaning it up now. i'll ping people when it's all ready | 17:52 |
KH-Jared | trying to make sure I'm handling my change in the best way. I saw two options for making the addresses happy for libguestfs without changing how they were provided to libvirt, make stripping the brackets optional or add them back if it looked like IPv6. Adding them back seemed easier but improper, since it would be removing and adding the brackets for no purpose, so I was going to go with trying to leave the brackets, | 17:52 |
KH-Jared | optionally | 17:52 |
dansmith | sean-k-mooney: and you're going to spell check the snot out of it right? | 17:52 |
dansmith | sean-k-mooney: maybe two or three times just to be sure? | 17:52 |
sean-k-mooney | yes :) | 17:52 |
*** ivenszambrano has quit IRC | 17:54 | |
efried | sean-k-mooney: So you want that bug assigned to you? | 17:54 |
*** tbachman has joined #openstack-nova | 17:54 | |
sean-k-mooney | well i or cfriesen will file a nova one and ya it can be assigned to me | 17:54 |
sean-k-mooney | the current bug is against starlingx | 17:55 |
*** psachin has quit IRC | 17:55 | |
efried | sean-k-mooney: You can add 'affects' to the same bug, nah? | 17:55 |
mordred | mriedem: https://review.openstack.org/626949 - patch to osc regarding live migration and arguments | 17:56 |
efried | I guess the nova fix has to be ported to the stx fork? | 17:56 |
*** hongbin has quit IRC | 17:57 | |
*** hongbin has joined #openstack-nova | 17:57 | |
mriedem | mordred: yeah i added you since i know you're at least aware of there being a few changes for that same issue | 17:57 |
sean-k-mooney | so i was thinking about just adding nova as a component of the same bug but im not sure how that works with the rc potential tag | 17:57 |
mriedem | mordred: L18 https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps | 17:57 |
*** BjoernT has quit IRC | 17:58 | |
*** bbowen_ has joined #openstack-nova | 18:00 | |
openstackgerrit | Artem Vasilyev proposed openstack/nova master: systemd detection result caching nit fixes https://review.openstack.org/649229 | 18:01 |
*** eharney has joined #openstack-nova | 18:02 | |
*** jonspw has joined #openstack-nova | 18:02 | |
mriedem | someone tell me how this makes sense: | 18:02 |
mriedem | pike: https://github.com/openstack/nova/blame/1c6f99dc9aacaea78242561df35957bb711c6161/nova/tests/unit/conductor/test_conductor.py#L1360 | 18:02 |
mriedem | ocata: https://github.com/openstack/nova/blame/9b76fc7a0a1afbd9f2cd0d5786c37138c1b820f1/nova/tests/unit/conductor/test_conductor.py#L1377 | 18:02 |
mriedem | the code is different, but the commit on the left in the blame is the same | 18:02 |
*** bbowen__ has quit IRC | 18:02 | |
dansmith | code looks the same to me | 18:03 |
dansmith | wait, I opened the same thing twice :D | 18:03 |
mriedem | heh | 18:04 |
mriedem | oh i know, | 18:04 |
mriedem | something just removed code which is why it's not showing up in the pike blame | 18:04 |
dansmith | yeah, looks like it | 18:05 |
openstackgerrit | Artem Vasilyev proposed openstack/nova master: systemd detection result caching nit fixes https://review.openstack.org/649229 | 18:05 |
dansmith | request_spec={} et al | 18:05 |
mriedem | yeah | 18:05 |
*** wolverineav has quit IRC | 18:06 | |
mriedem | bingo | 18:07 |
mriedem | https://github.com/openstack/nova/commit/e211fca55a11c80058d5d78e31dc3ad466d7edfd#diff-df5e04ccff7072ded89c488a5649639e | 18:07 |
mriedem | https://www.youtube.com/watch?v=YqAyz1coj44 | 18:08 |
dansmith | heh | 18:08 |
*** cdent has joined #openstack-nova | 18:09 | |
*** BjoernT has joined #openstack-nova | 18:11 | |
*** jmlowe has joined #openstack-nova | 18:11 | |
*** wolverineav has joined #openstack-nova | 18:14 | |
*** wolverineav has quit IRC | 18:14 | |
*** artom has joined #openstack-nova | 18:14 | |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Docs: emulator threads: clarify expected behavior https://review.openstack.org/649416 | 18:14 |
artom | stephenfin, sean-k-mooney ^^ from this morning's downstream discussion | 18:15 |
sean-k-mooney | oh the Horror https://review.openstack.org/#/c/649416/1/doc/source/user/flavors.rst@576 | 18:15 |
openstackgerrit | Merged openstack/nova master: Do not persist RequestSpec.ignore_hosts https://review.openstack.org/647512 | 18:16 |
sean-k-mooney | first glance ignoring the space it looks fine | 18:17 |
sean-k-mooney | ill wait for the docs job to finish | 18:17 |
*** wolverineav has joined #openstack-nova | 18:20 | |
*** BjoernT_ has joined #openstack-nova | 18:20 | |
*** BjoernT has quit IRC | 18:22 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/ocata: Add functional regression test for bug 1669054 https://review.openstack.org/649419 | 18:25 |
openstack | bug 1669054 in OpenStack Compute (nova) stein "RequestSpec.ignore_hosts from resize is reused in subsequent evacuate" [Medium,In progress] https://launchpad.net/bugs/1669054 - Assigned to Matt Riedemann (mriedem) | 18:25 |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/ocata: Do not persist RequestSpec.ignore_hosts https://review.openstack.org/649420 | 18:25 |
*** BjoernT_ has quit IRC | 18:26 | |
*** tbachman has quit IRC | 18:27 | |
*** BjoernT has joined #openstack-nova | 18:27 | |
*** cdent has quit IRC | 18:30 | |
openstackgerrit | Artem Vasilyev proposed openstack/nova master: systemd detection result caching nit fixes https://review.openstack.org/649229 | 18:31 |
*** spsurya has quit IRC | 18:32 | |
*** tbachman has joined #openstack-nova | 18:33 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/stein: Error out migration when confirm_resize fails https://review.openstack.org/649421 | 18:33 |
openstackgerrit | sean mooney proposed openstack/nova master: gracefully handle non-nic VFs https://review.openstack.org/649409 | 18:35 |
*** tbachman_ has joined #openstack-nova | 18:36 | |
*** bbowen__ has joined #openstack-nova | 18:37 | |
*** tbachman has quit IRC | 18:37 | |
*** tbachman_ is now known as tbachman | 18:37 | |
*** mdbooth has joined #openstack-nova | 18:39 | |
*** bbowen_ has quit IRC | 18:39 | |
*** mdbooth_ has quit IRC | 18:42 | |
openstackgerrit | Eric Fried proposed openstack/nova master: docs: Rework all things metadata'y https://review.openstack.org/640730 | 18:47 |
*** wolverineav has quit IRC | 18:49 | |
*** wolverineav has joined #openstack-nova | 18:51 | |
*** tbachman has quit IRC | 18:53 | |
*** wolverineav has quit IRC | 18:55 | |
*** artom has quit IRC | 18:58 | |
*** dpawlik has joined #openstack-nova | 18:58 | |
*** wolverineav has joined #openstack-nova | 18:59 | |
*** dpawlik has quit IRC | 19:02 | |
*** wolverineav has quit IRC | 19:02 | |
*** wolverineav has joined #openstack-nova | 19:03 | |
*** gmann is now known as gmann_afk | 19:04 | |
*** tesseract has quit IRC | 19:04 | |
openstackgerrit | sean mooney proposed openstack/nova master: Libvirt: gracefully handle non-nic VFs https://review.openstack.org/649409 | 19:07 |
sean-k-mooney | melwitt: efried i think ^ is now correct? | 19:08 |
*** wolverineav has quit IRC | 19:08 | |
sean-k-mooney | i have added nova to the existing bug and added the stein-rc-potential tag but if you would like a new bug filed i can do that too. | 19:09 |
sean-k-mooney | im going to have dinner but if you want any changes let me know | 19:10 |
efried | sean-k-mooney: assuming stx doesn't care about our stein-rc-potential tag, I think this should be fine. | 19:10 |
efried | Thanks sean-k-mooney | 19:10 |
sean-k-mooney | cfriesen: any comment on ^ | 19:10 |
cfriesen | looking | 19:11 |
sean-k-mooney | i think it should be ok but we should discuss it at the ptg too. e.g. how to track cross-project bugs like this between openstack and starlingx | 19:11 |
*** owalsh_ has joined #openstack-nova | 19:15 | |
*** owalsh has quit IRC | 19:16 | |
cfriesen | sean-k-mooney: looks reasonable, but why not put the result['parent_ifname'] assignment in the try block and get rid of the initial assignment to None and the conditional? | 19:16 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova-specs master: Spec for API policy updates https://review.openstack.org/547850 | 19:18 |
cfriesen | efried: one of our people in charge thinks adding nova (and the tag) to that bug is a good solution for tracking joint issues | 19:21 |
efried | wfm. mriedem may have a stronger opinion. | 19:21 |
cfriesen | (i.e. StarlingX, just to be clear) | 19:21 |
*** erlon_ has quit IRC | 19:22 | |
efried | cfriesen: what's not clear to me is whether anything needs to be done on the stx side at all. | 19:22 |
cfriesen | on our end it'll likely just be picking up a new load once it's fixed in nova and validating the issue is gone | 19:22 |
sean-k-mooney | cfriesen: ya i think from a source code point of view it will be a cherrypick or rebase | 19:25 |
sean-k-mooney | there is obviously the testing aspect too. | 19:25 |
cfriesen | sean-k-mooney: with your fix we're seeing logs every minute: nova.pci.utils [req-d9c8620e-7990-4b3c-a6b0-88a131852e47 - - - - -] No net device was found for VF 0000:3d:02.2: PciDeviceNotFoundById: PCI device 0000:3d:02.2 not found | 19:26 |
*** BjoernT has quit IRC | 19:27 | |
sean-k-mooney | that is from the other capabilities function | 19:27 |
sean-k-mooney | not adding a second message every minute was why i added the pass in the except block instead of logging | 19:27 |
cfriesen | might make sense to quiet that down, but that's not quite so urgent. | 19:29 |
sean-k-mooney | cfriesen: that comes from here https://github.com/openstack/nova/blob/master/nova/pci/utils.py#L224-L225 | 19:29 |
cfriesen | it's 48 logs every minute | 19:29 |
*** bbobrov has quit IRC | 19:30 | |
sean-k-mooney | cfriesen: ya that has been there for 2 or 3 releases | 19:30 |
*** awaugama has quit IRC | 19:30 | |
sean-k-mooney | im going to work on a followup to fix some comments i noticed; i can remove that warning too. | 19:30 |
sean-k-mooney | or make it a debug message | 19:30 |
cfriesen | debug might be good | 19:30 |
sean-k-mooney | there is nothing that an operator can do to silence it if they have non-nic VFs currently, and it's not helpful in that case | 19:31 |
cfriesen | agreed | 19:31 |
*** bbobrov has joined #openstack-nova | 19:31 | |
sean-k-mooney | anyway the fact that your nova-compute agent didnt die means the fix is at least minimally working. | 19:32 |
sean-k-mooney | i might look at optimising this code a bit too. currently we call _get_pcidev_info from _get_pci_passthrough_devices before applying the pci whitelist so we look at way more devices than we need to | 19:35 |
*** dpawlik has joined #openstack-nova | 19:43 | |
*** wolverineav has joined #openstack-nova | 19:44 | |
*** jmlowe has quit IRC | 19:47 | |
*** dpawlik has quit IRC | 19:47 | |
*** dpawlik has joined #openstack-nova | 19:48 | |
*** wolverineav has quit IRC | 19:49 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix comment in test_attach_with_multiattach_fails_not_available https://review.openstack.org/649440 | 19:50 |
openstackgerrit | Merged openstack/nova master: Fix a deprecation warning https://review.openstack.org/649234 | 19:52 |
*** dpawlik has quit IRC | 19:53 | |
efried | mriedem: How about this one: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22ModuleNotFoundError%3A%20No%20module%20named%20'memcache'%5C%22 | 19:54 |
efried | Seems to hit several different jobs, but always grenade | 19:54 |
efried | mriedem: https://review.openstack.org/#/c/649096/ ? | 19:56 |
mriedem | efried: known issue | 19:57 |
mriedem | yeah related to that bug | 19:58 |
mriedem | i need to update the e-r query, or you can | 19:58 |
mriedem | to include build_name:"neutron-grenade-multinode" OR build_name:"neutron-grenade-dvr-multinode" | 19:58 |
efried | mriedem: there's also grenade-py3 | 19:58 |
efried | let me try | 19:58 |
mriedem | the query is already restricted to just grenade-py3 | 19:58 |
mriedem | so need message:"..." AND tags:"screen" AND (build_name:"grenade-py3" OR build_name:"..."...) | 19:59 |
efried | k, hadn't pulled it up yet. | 19:59 |
mriedem | melwitt: dansmith: duh duh duh https://bugs.launchpad.net/devstack/+bug/1822873 | 19:59 |
openstack | Launchpad bug 1822873 in devstack "stack fails if NOVA_NUM_CELLS > 1 and n-novnc enabled" [Undecided,New] | 19:59 |
efried | the bug says the module not found is etcd | 19:59 |
mriedem | efried: it's a whole bunch of packages, | 19:59 |
*** wolverineav has joined #openstack-nova | 19:59 | |
mriedem | i spent half of yesterday digging into that | 19:59 |
dansmith | mriedem: duh duh duhntcare | 20:00 |
efried | okay, I see the message is just ModuleNotFound, cool. | 20:00 |
mriedem | but it's also just busted networks getting to pypi | 20:00 |
mriedem | efried: just need to update the existing query for http://status.openstack.org/elastic-recheck/#1820892 | 20:00 |
efried | mriedem: on it | 20:00 |
*** markmcclain has quit IRC | 20:01 | |
* mriedem gives more money to mnaser to create a new vm to stack w/o n-novnc | 20:01 | |
*** wolverineav has quit IRC | 20:01 | |
*** wolverineav has joined #openstack-nova | 20:01 | |
mnaser | mriedem: if you launch it in sjc1, you'd be launching it against stein :) | 20:02 |
*** jmlowe has joined #openstack-nova | 20:03 | |
*** bbowen__ has quit IRC | 20:04 | |
efried | mriedem: Okay, so would there have been a better way for me to determine that this was 1820892 ? | 20:06 |
*** xek has quit IRC | 20:08 | |
*** BjoernT has joined #openstack-nova | 20:12 | |
*** igordc has joined #openstack-nova | 20:15 | |
melwitt | mriedem: dammit novnc | 20:17 |
*** artom has joined #openstack-nova | 20:17 | |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Docs: emulator threads: clarify expected behavior https://review.openstack.org/649416 | 20:17 |
*** markvoelker has quit IRC | 20:23 | |
eandersson | Does anyone recall the reason why the functionality to search based on metadata was removed from the api? | 20:29 |
eandersson | > search_options["metadata"] = '{"my_key" : "bla" }' | 20:29 |
eandersson | https://review.openstack.org/#/c/408571/ | 20:29 |
eandersson | Also, is the alternative to use tags? | 20:29 |
*** pcaruana has quit IRC | 20:30 | |
*** whoami-rajat has quit IRC | 20:30 | |
*** hamzy has quit IRC | 20:41 | |
efried | eandersson: I have no idea, but did you look at the spec? http://specs.openstack.org/openstack/nova-specs/specs/ocata/implemented/add-whitelist-for-server-list-filter-sort-parameters.html | 20:43 |
efried | That appears to answer the first question. alex_xu could probably answer the second. | 20:44 |
mriedem | mnaser: i only see ca-ymq-2 | 20:45 |
mnaser | huh? sjc1 is a separate region | 20:45 |
mnaser | OS_REGION_NAME=sjc1 | 20:46 |
eandersson | I can't find any good documentation for tags, but maybe I just suck at googling :D | 20:46 |
eandersson | nova cli has things like > nova server-tag-add | 20:48 |
eandersson | but openstackcli has no mention of tags | 20:48 |
eandersson | > https://docs.openstack.org/python-openstackclient/latest/cli/command-objects/server.html#server-create | 20:49 |
mriedem | mnaser: oh i thought it was a zone | 20:49 |
mriedem | eandersson: openstack server set i think | 20:49 |
mriedem | https://docs.openstack.org/python-openstackclient/latest/cli/command-objects/server.html#server-set | 20:49 |
mriedem | oh nvm | 20:49 |
mriedem | no tags support there either yet | 20:49 |
mriedem | consulting https://etherpad.openstack.org/p/compute-api-microversion-gap-in-osc | 20:49 |
mriedem | https://review.openstack.org/#/c/569386/ | 20:50 |
eandersson | That explains it | 20:50 |
mriedem | eandersson: if you're going to be in denver https://www.openstack.org/summit/denver-2019/summit-schedule/events/23665/closing-compute-api-feature-gaps-in-the-openstack-cli | 20:51 |
eandersson | I will | 20:52 |
mriedem | i've started the etherpad for that session https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps | 20:52 |
*** dpawlik has joined #openstack-nova | 20:54 | |
*** dpawlik has quit IRC | 20:58 | |
*** amodi has quit IRC | 21:01 | |
*** ceryx has joined #openstack-nova | 21:03 | |
*** priteau has joined #openstack-nova | 21:04 | |
*** erlon has quit IRC | 21:06 | |
*** slaweq has quit IRC | 21:07 | |
NewBruce | hej mriedem - we look to have stumbled on a bug in Neutron's Port Binding API, in a particular area of the code which you will probably be more intimately familiar with than I | 21:09 |
NewBruce | https://bugs.launchpad.net/nova/+bug/1822884 (cc sean-k-mooney) | 21:10 |
openstack | Launchpad bug 1822884 in OpenStack Compute (nova) "live migration fails due to port binding duplicate key entry in post_live_migrate" [Undecided,New] | 21:10 |
openstackgerrit | melanie witt proposed openstack/nova stable/stein: Add doc on VGPU allocs and inventories for nrp https://review.openstack.org/649454 | 21:12 |
NewBruce | if you have a minute, id love your opinion - im trying to decide how much more debugging is worth before we move over to cold migrations | 21:12 |
*** bbowen__ has joined #openstack-nova | 21:14 | |
*** wolverineav has quit IRC | 21:16 | |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Docs: emulator threads: clarify expected behavior https://review.openstack.org/649416 | 21:16 |
mriedem | NewBruce: i don't know what the differences are between RDO and OSA which would cause issues here | 21:17 |
mriedem | as you said, "This is unexpected, as even in the RDO to RDO case, both nodes are Rocky and so the new process should be in use." | 21:18 |
NewBruce | hrmm, no - that seems to be our sticking point - i was chatting with sean-k-mooney and also mnaser on this | 21:18 |
mriedem | are you sure you're not hitting a case where one node is really queens? | 21:18 |
*** wolverineav has joined #openstack-nova | 21:18 | |
*** wolverineav has quit IRC | 21:18 | |
*** wolverineav has joined #openstack-nova | 21:19 | |
NewBruce | yeah, absolutely sure - as I am using a single test source node which is absolutely Rocky as well as the target | 21:19 |
mriedem | is 'Port Bindings Extended' showing up in the neutron api extension list in both the rdo and osa case? | 21:21 |
mriedem | if rdo is rocky, i don't know why rdo->rdo would use the old flow | 21:22 |
mriedem | unless the 'Port Bindings Extended' neutron extension is not showing up | 21:22 |
NewBruce | i can confirm that on the target it is enabled | 21:23 |
mriedem | the live migration task should only do the new flow with the 2nd deactivated port binding if (1) the neutron 'Port Bindings Extended' API extension is available, and (2) both the source and dest compute service versions are >= 37 | 21:23 |
mriedem | *35 | 21:23 |
*** slaweq has joined #openstack-nova | 21:24 | |
eandersson | mriedem, solution was to just list all VMs in the project and then just filter on metadata :p | 21:24 |
*** ttsiouts has joined #openstack-nova | 21:24 | |
eandersson | until tags is supported everywhere (openstackclient, terraform etc) | 21:24 |
NewBruce | yeah, so we looked at that as well; | 21:25 |
NewBruce | | cc-compute10-kna1 | nova-compute | 35 | | 21:25 |
NewBruce | | cc-compute29-kna1 | nova-compute | 35 | | 21:25 |
NewBruce | source = 10 (RDO Rocky) Target = 29 (OSA Rocky) | 21:25 |
*** derekh has quit IRC | 21:26 | |
NewBruce | however we do have other compute nodes in the same environment which have not been upgraded to Rocky yet; (service version 30) | 21:26 |
mriedem | NewBruce: do you have [upgrade_levels]/compute pinned to queens or something? | 21:27 |
openstackgerrit | Jared Winborne proposed openstack/nova master: Leave the brackets on Ceph Monitor IPv6 addresses for libguestfs https://review.openstack.org/649405 | 21:28 |
*** wolverineav has quit IRC | 21:28 | |
*** slaweq has quit IRC | 21:28 | |
NewBruce | [upgrade_levels] | 21:29 |
NewBruce | compute = auto | 21:29 |
*** krypto has quit IRC | 21:29 | |
mriedem | so that will pin to the lowest nova-compute service version in the deployment while you're upgrading, | 21:29 |
mriedem | so if you have queens computes, it will be using the queens rpc versions and backlevel the migrate_data object | 21:29 |
mriedem | which likely is dropping the vifs information for the 2nd deactivated port binding | 21:30 |
NewBruce | Aha…. | 21:30 |
mriedem | that's my guess anyway | 21:30 |
*** wolverineav has joined #openstack-nova | 21:30 | |
NewBruce | ok, so can we override that? | 21:30 |
NewBruce | again, mnaser / sean-k-mooney suggested that might be the case and to upgrade the entire compute - which we were in the process of, however another error in a node caused a bit of strife due to broken libevent .so files in OVS so we halted | 21:31 |
mriedem | well you can pin the compute rpc api version to a specific release (queens) or even rpc api version (5.0) but that could break things where the controller is sending versions of objects to older queens computes that won't understand those changes | 21:32 |
NewBruce | also, we have other sites where we have performed the same procedure, without issue | 21:32 |
mriedem | ultimately it looks like https://github.com/openstack/nova/blob/stable/rocky/nova/conductor/tasks/live_migrate.py#L41 is faulty in that it doesn't account for pinned RPC versions | 21:32 |
NewBruce | i have a test machine, so i can override to at least test it | 21:32 |
mriedem | dansmith: yeah ^? | 21:32 |
*** luksky has quit IRC | 21:32 | |
mriedem | NewBruce: the kill switch would be to disable the 'Port Bindings Extended' neutron API extension until you're fully upgraded to rocky | 21:33 |
*** BjoernT has quit IRC | 21:33 | |
dansmith | mriedem: if conductor is new enough then the new objects are fine | 21:33 |
NewBruce | which would force into the old flow | 21:33 |
mriedem | dansmith: sounds like conductor in this case is rocky but some computes are queens | 21:33 |
dansmith | mriedem: which is fine | 21:33 |
NewBruce | dansmith correct | 21:33 |
mriedem | because of the bounce back thingy? | 21:33 |
dansmith | yes | 21:33 |
dansmith | the rpc pin has nothing to do with the objects | 21:33 |
dansmith | it's only really method signatures | 21:33 |
mriedem | sure | 21:33 |
mriedem | but, | 21:34 |
*** slaweq has joined #openstack-nova | 21:34 | |
mriedem | conductor is doing some stuff in neutron based on what the compute service versions are, | 21:34 |
mriedem | and then setting fields in the objects that get passed to compute, | 21:34 |
*** wolverineav has quit IRC | 21:34 | |
*** wolverineav has joined #openstack-nova | 21:34 | |
mriedem | and then compute has logic that is based on if those new fields are set or not, | 21:34 |
mriedem | which it sounds like they might not be | 21:34 |
mriedem | tldr compute is probably not doing the right thing | 21:35 |
dansmith | well, if the backport is right then it should be okay, unless the conductor is doing something against neutron that the compute can't possibly handle properly | 21:35 |
dansmith | but if so, we broke upgrades | 21:36 |
mriedem | mnaser: did you hit https://bugs.launchpad.net/nova/+bug/1822884 when upgrading to rocky and doing live migrations with mixed computes (queens and rocky?) | 21:38 |
openstack | Launchpad bug 1822884 in OpenStack Compute (nova) "live migration fails due to port binding duplicate key entry in post_live_migrate" [Undecided,New] | 21:38 |
*** slaweq has quit IRC | 21:38 | |
mriedem | mnaser: or do you upgrade neutron after nova? | 21:38 |
mnaser | we upgrade neutron after nova usually mriedem | 21:38 |
mnaser | I’m very intimately familiar with that bug though. | 21:38 |
mriedem | first i've heard of it :/ | 21:39 |
mriedem | mnaser: ok but that's why you didn't hit it | 21:39 |
mnaser | I’ve been trying to help NewBruce nail it down forever. | 21:39 |
mnaser | But I’ve kinda struggled at finding the code path to replicate it | 21:39 |
*** BjoernT has joined #openstack-nova | 21:39 | |
mnaser | There are other environments where there is mixed compute AND Rocky network and this doesn’t happen | 21:40 |
mnaser | Right NewBruce ? | 21:40 |
NewBruce | yeah thats right mnaser | 21:40 |
mriedem | right he said, "2. OSA -> OSA uses the new flow (two entries which are cleaned up)" | 21:40 |
NewBruce | this is the first site we’ve seen it on... | 21:40 |
NewBruce | mriedem right; so if i grab a VM on an OSA node, and migrate it to an OSA node, watch ml2_port_bindings - ill see one port, briefly two ports, the profiles change, and then the second port entry removed | 21:41 |
NewBruce | as described in the blueprint | 21:41 |
NewBruce | testing RDO - RDO, i don’t see that behaviour - just a single port entry in ml2_port_bindings throughout the entire migration | 21:42 |
mnaser | mriedem: NewBruce runs a few regions and this one is the only one where it works. With both queens to queens | 21:42 |
mnaser | Shit, what’s upgrade levels set to in the rdo notes? | 21:42 |
mnaser | Nodes | 21:42 |
mnaser | Sorry. I’m on mobile. | 21:43 |
NewBruce | mnaser auto | 21:43 |
NewBruce | mriedem and i chatted above that will pin it to the lowest version, as you suspected previously | 21:43 |
mnaser | ok, so yeah, I’m pretty torn on why it would actually do that in one environment and not another. | 21:43 |
mriedem | "all control nodes and net nodes are running OSA (Rocky), some compute are running RDO (Queens), some are RDO (Rocky) and the remaining are OSA (Rocky)." | 21:43 |
NewBruce | ill double check some other regions and see if they are different there | 21:44 |
mnaser | NewBruce: by any chance maybe the other regions aren’t pinned to lower version maybe? | 21:44 |
mnaser | And so this is why we don’t catch it? | 21:44 |
NewBruce | mriedem correct / mnaser ill double check | 21:44 |
NewBruce | quick (not exhaustive) check, upgrade levels is auto | 21:46 |
NewBruce | we have the same mix of service versions there as well (30 / 35) | 21:47 |
mnaser | Yet that issue somehow doesn’t happen there | 21:49 |
mnaser | So off | 21:49 |
mnaser | Odd | 21:49 |
mriedem | NewBruce: it fails in _post_live_migration right/ | 21:50 |
mriedem | ? | 21:50 |
mnaser | I think so. Rather than deleting the old binding and activating the new one, it tries to update the port binding | 21:51 |
NewBruce | mriedem correct | 21:51 |
NewBruce | mnaser correct | 21:51 |
mnaser | Which is what it would be with old port binding method. | 21:52 |
mriedem | right in the old method there is just one port binding | 21:52 |
mriedem | and we change the host on it | 21:52 |
NewBruce | mriedem and thats the exact behavior we see in RDO - RDO | 21:55 |
mnaser | So rocky to rocky and it does old port binding | 21:55 |
mnaser | NewBruce: can you restart a nova-compute with rdo and debug=true and double check the value of upgrade_levels in the output on startup of Oslo CFC | 21:56 |
mnaser | Cfg | 21:56 |
mnaser | In case rdo is doing weird stuff | 21:56 |
NewBruce | sure | 21:57 |
sean-k-mooney | sound like ye are making some progress on this | 21:57 |
*** wolverineav has quit IRC | 21:57 | |
NewBruce | [upgrade_levels] | 21:57 |
NewBruce | compute = auto | 21:57 |
*** tbachman has joined #openstack-nova | 21:58 | |
NewBruce | debug=true | 21:58 |
sean-k-mooney | is the theory that RDO and OSA are not using the same compute RPC version due to their configs even though they are running more or less the same code | 22:00 |
*** wolverineav has joined #openstack-nova | 22:00 | |
sean-k-mooney | and as a result RDO is using the old mechanism while osa is using the new mechanism? | 22:00 |
*** wolverineav has quit IRC | 22:00 | |
*** wolverineav has joined #openstack-nova | 22:01 | |
NewBruce | mnaser | 22:03 |
NewBruce | nova-compute.log:2019-04-02 23:59:08.448 10483 DEBUG oslo_service.service [req-4f4874a7-a967-49d1-a643-c59b856c5c61 - - - - -] upgrade_levels.compute = auto log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3032 | 22:03 |
mnaser | I mean I dug through the conductor code a lot and it checks the source and dest to be above or at a certain later | 22:03 |
mnaser | Level | 22:04 |
mnaser | As far as I know the conductor creates the port bindings | 22:04 |
sean-k-mooney | not in all cases | 22:04 |
sean-k-mooney | it can be created by the compute node | 22:04 |
mnaser | Oh really? So I think in the new flow, the new compute node creates it right? | 22:05 |
mriedem | there is already an active port binding for the source host when you start the live migration, | 22:05 |
mriedem | in the new flow, conductor will create an inactive port binding for the dest host | 22:05 |
mriedem | and saves information about that new dest host port binding on the LiveMigrateData.vifs field that gets passed around to the computes | 22:06 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L280-L284 | 22:06 |
mnaser | Right. That’s what I thought. I even asked NewBruce to set the “dont live migrate until vif is plugged” option that was introduced in rocky as false and moved to true in Stein | 22:06 |
sean-k-mooney | yes, we do the version check, then bind the ports on the destination | 22:06 |
mriedem | mnaser: that's unrelated to this | 22:07 |
mriedem | i mean it was part of the same bp | 22:07 |
mriedem | but doesn't rely on the active/inactive port bindings | 22:07 |
sean-k-mooney | mnaser: in the old flow, in post live migration on the dest, the compute node updated the port binding | 22:07 |
mriedem | correct that is the call here https://github.com/openstack/nova/blob/stable/rocky/nova/compute/manager.py#L6836 | 22:08 |
mnaser | NewBruce: can you update conductor to perhaps output the content of every statement in the if that sean-k-mooney linked? | 22:08 |
mnaser | And do and rdo to rdo migration and see which one of them is evaluating to false and not doing port bindings? | 22:09 |
mriedem | that code in post_live_migration_at_destination won't update the port binding host if it's already the specified host https://github.com/openstack/nova/blob/stable/rocky/nova/network/neutronv2/api.py#L3088 | 22:09 |
mnaser | Cause rdo to rdo seems to do old school bindings from what I understand | 22:09 |
mriedem | which is why it's a no-op with the new flow | 22:09 |
mriedem | because we've already activated the dest host port binding before we get to https://github.com/openstack/nova/blob/stable/rocky/nova/network/neutronv2/api.py#L3088 | 22:09 |
NewBruce | mnaser yeah, thats no problem (to update the conductor and test) | 22:10 |
sean-k-mooney | right, but if we create an inactive binding on the dest and then somehow end up running the old code in post live migrate, the host it gets back from neutron will be the source node, as that will still be the active binding, right | 22:11 |
mriedem | NewBruce: in case you're not familiar, this is the code on the source compute that activates the dest port binding once we've switched to post-copy | 22:11 |
mriedem | https://github.com/openstack/nova/blob/stable/rocky/nova/compute/manager.py#L1136-L1160 | 22:11 |
sean-k-mooney | so if we are crossing the streams here that check might be a little weird | 22:11 |
NewBruce | mriedem right; yeah i have filled that with a ton of debug messages as well :D (have the logs saved down in my diaries if they become useful) | 22:12 |
mnaser | I think once we find out why rdo-rdo uses old flow even if upgrade levels are auto.. it might help | 22:13 |
cfriesen | when doing spec re-approvals for T for stuff that didn't make it in S, do I just copy from stein/approved into train/approved? | 22:14 |
mordred | mriedem: cool - thanks for the context on the migrate | 22:14 |
sean-k-mooney | mriedem: that code assumes we get the lifecycle event | 22:15 |
*** priteau has quit IRC | 22:15 | |
mriedem | cfriesen: https://specs.openstack.org/openstack/nova-specs/readme.html#previously-approved-specifications | 22:15 |
sean-k-mooney | if we dont we wont activate it. | 22:15 |
sean-k-mooney | until post live migration, so maybe that could be what's happening in the RDO to OSA flow | 22:15 |
cfriesen | mriedem: you mean I'm supposed to read the readme? | 22:15 |
cfriesen | :) | 22:16 |
mriedem | sean-k-mooney: that code is just a best effort to try and activate the dest host port binding to reduce network downtime, as the comment says, " Otherwise the ports are bound to the destination host # in post_live_migration_at_destination." | 22:16 |
*** wolverineav has quit IRC | 22:16 | |
sean-k-mooney | ya but if we dont activate here then https://github.com/openstack/nova/blob/stable/rocky/nova/network/neutronv2/api.py#L3088 will be true | 22:16 |
mnaser | not gonna lie, bugs like this makes me wish we had one agent that did it all :P | 22:17 |
mnaser | I’ll walk myself out | 22:17 |
mriedem | sean-k-mooney: sure, and honestly i thought that was still supposed to work | 22:17 |
mriedem | i don't know enough about what's happening within neutron to cause that duplicate primary key error | 22:17 |
sean-k-mooney | it probably should work but i dont think we have tested it. | 22:18 |
mriedem | the new flow deals with port bindings on the port bindings resources in the port bindings API, the old flow just deals with the port resource and its binding:host_id attribute | 22:18 |
NewBruce | i will post a large log dump into the launchpad… brb | 22:18 |
mnaser | mriedem: nova tries to update the old binding to point to the new host, but because a new binding already exists, it blows up | 22:18 |
mnaser | Cause you can’t have a binding with same port/host combo | 22:18 |
sean-k-mooney | we might be able to reproduce the neutron issue with a functional test | 22:19 |
mriedem | mnaser: yeah, i just thought neutron was handling that for us | 22:19 |
mnaser | i guess maybe that’s where the bug lives | 22:19 |
*** wolverineav has joined #openstack-nova | 22:19 | |
mriedem | i.e. i thought if we changed the ports binding:host_id value and there was already a port binding for that host, but was inactive, neutron would automatically activate it | 22:19 |
sean-k-mooney | create a port binding, create an inactive port binding on another host, try to update the original binding to the destination instead of activating | 22:19 |
sean-k-mooney | mriedem: i dont think it does that | 22:20 |
mnaser | Yep that would be a reproducer sean-k-mooney | 22:20 |
mriedem | sean-k-mooney: hell i could just push a patch to comment this out https://github.com/openstack/nova/blob/stable/rocky/nova/compute/manager.py#L1155 and our live migration job should blow up | 22:20 |
* mriedem pushes | 22:20 | |
mriedem | NewBruce: while you're at it, dump the libvirtd and qemu-kvm package versions for both rdo and osa rocky nodes in the bu | 22:21 |
mriedem | *bug | 22:21 |
NewBruce | (i’ve just added to the ml2_port_binding trace to the launchpad as well as a debug version of the logs) | 22:22 |
openstackgerrit | Chris Friesen proposed openstack/nova-specs master: Re-propose emulated virtual TPM spec to train https://review.openstack.org/649463 | 22:25 |
*** rcernin has joined #openstack-nova | 22:25 | |
*** ttsiouts has quit IRC | 22:27 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: DNM: Test theory about bug 1822884 https://review.openstack.org/649464 | 22:28 |
openstack | bug 1822884 in OpenStack Compute (nova) "live migration fails due to port binding duplicate key entry in post_live_migrate" [Undecided,New] https://launchpad.net/bugs/1822884 | 22:28 |
mriedem | mnaser: NewBruce: sean-k-mooney: ^ we'll find out | 22:28 |
*** ttsiouts has joined #openstack-nova | 22:28 | |
mnaser | Let’s wait and see | 22:29 |
NewBruce | mriedem posted version info | 22:29 |
sean-k-mooney | i might be able to reproduce something on the neutron side too if i extend https://github.com/openstack/neutron/blob/master/neutron/tests/fullstack/test_ports_rebind.py but honestly im not sure that neutron has multi-service functional tests like nova has, where we use a fake message bus and run everything in the one process | 22:30 |
NewBruce | mriedem will test it out | 22:30 |
mriedem | NewBruce: our ci system will test it | 22:30 |
mriedem | if that's the bug, the nova-live-migration job should explode | 22:31 |
* mnaser has maintenance from 12am to 3am. M | 22:31 | |
mriedem | NewBruce: and it does look like your libvirt/qemu versions are different between rdo and osa so i wonder if that has something to do with the event getting sent or not | 22:31 |
mnaser | I’ll head off for a lil bit to doze and catch up on this buffer | 22:31 |
mriedem | reminds me that i still have https://review.openstack.org/#/c/594527/ | 22:32 |
*** ttsiouts has quit IRC | 22:32 | |
mriedem | and https://review.openstack.org/#/c/594139/1 | 22:33 |
NewBruce | yep, id better get some shut eye soon too and check back in the morning. mnaser do you still want the debug from sean-k-mooney post earlier? | 22:33 |
NewBruce | mriedem any value in testing the service values? / disabling binding-extended | 22:34 |
NewBruce | ? | 22:34 |
mriedem | hmm, well we have a nova-grenade-live-migration job but would need to have that running on a stable/rocky change because then one compute would be queens and one would be rocky, or we could just pin rpc to queens in a job on master... | 22:35 |
sean-k-mooney | i don't think neutron provides a way to disable binding-extended via config so it would need a code change | 22:35 |
mriedem | sean-k-mooney: there is no api for that? | 22:35 |
mriedem | i guess not https://developer.openstack.org/api-ref/network/v2/index.html#id5 | 22:35 |
sean-k-mooney | i have looked before, i'll check but i don't think so | 22:35 |
*** BjoernT has quit IRC | 22:38 | |
NewBruce | i think we can run and upgrade across all the compute nodes anyway; the issue we had with libevent seems to be isolated and easily tested for …. ok, let's see how that job comes out anyway | 22:38 |
NewBruce | cheers | 22:38 |
*** slaweq has joined #openstack-nova | 22:38 | |
mriedem | posted https://review.openstack.org/649470 for the upgrade_levels/compute=queens pin simulation | 22:38 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Update tests from fake_libvirt_util mocks https://review.openstack.org/649471 | 22:39 |
sean-k-mooney | mriedem: we expect this to show up in both the nova-live-migration and the grenade version right? | 22:41 |
sean-k-mooney | actually i'll check it in the morning or i'll fall asleep watching devstack | 22:42 |
sean-k-mooney | night o/ | 22:42 |
*** slaweq has quit IRC | 22:43 | |
openstackgerrit | Merged openstack/nova-specs master: Re-propose emulated virtual TPM spec to train https://review.openstack.org/649463 | 22:45 |
*** tkajinam has joined #openstack-nova | 22:55 | |
*** gmann_afk is now known as gmann | 23:11 | |
*** hongbin has quit IRC | 23:17 | |
mriedem | ooooo hot dog i've got a devstack single node env with 2 non-cell0 cells, 2 computes on cell1 and 1 on cell2 | 23:19 |
mriedem | and it's pretty easy | 23:19 |
*** tosky has quit IRC | 23:24 | |
*** mlavalle has quit IRC | 23:33 | |
openstackgerrit | Merged openstack/nova master: libvirt: vzstorage: Use 'writeback' QEMU cache mode https://review.openstack.org/643376 | 23:41 |
openstackgerrit | Merged openstack/nova master: libvirt: smbfs: Use 'writeback' QEMU cache mode https://review.openstack.org/643377 | 23:42 |
openstackgerrit | Merged openstack/nova master: Fix comment in test_attach_with_multiattach_fails_not_available https://review.openstack.org/649440 | 23:42 |
*** wolverineav has quit IRC | 23:45 | |
mriedem | heh just ran into bug 1781286 again | 23:47 |
openstack | bug 1781286 in OpenStack Compute (nova) "CantStartEngineError in cell conductor during reschedule - get_host_availability_zone up-call" [Medium,Triaged] https://launchpad.net/bugs/1781286 | 23:47 |
mriedem | we should maybe think about fixing that... | 23:47 |
mriedem | also, if things fail during server create rescheduling in conductor, chances are pretty good we don't set the instance to error status and it's stuck in build status | 23:48 |