openstackgerrit | Alex Xu proposed openstack/nova master: [DNM] Print some logs https://review.opendev.org/682766 | 00:08 |
---|---|---|
*** lbragstad has quit IRC | 00:19 | |
*** sapd1_x has joined #openstack-nova | 00:19 | |
*** lbragstad has joined #openstack-nova | 00:19 | |
*** yingjun has joined #openstack-nova | 00:31 | |
*** TxGirlGeek has joined #openstack-nova | 00:47 | |
*** gyee has quit IRC | 00:50 | |
*** markvoelker has quit IRC | 01:08 | |
*** slaweq has joined #openstack-nova | 01:11 | |
*** slaweq has quit IRC | 01:16 | |
*** tbachman has quit IRC | 01:19 | |
*** markvoelker has joined #openstack-nova | 01:20 | |
*** yedongcan has joined #openstack-nova | 01:27 | |
*** tbachman has joined #openstack-nova | 01:33 | |
openstackgerrit | Arthur Dayne proposed openstack/nova master: Fix block disk attchment failure https://review.opendev.org/682772 | 01:43 |
*** sapd1_x has quit IRC | 01:50 | |
openstackgerrit | Arthur Dayne proposed openstack/nova master: Fix block disk attachment failure https://review.opendev.org/682772 | 01:59 |
openstackgerrit | Arthur Dayne proposed openstack/nova master: Fix block disk attachment failure https://review.opendev.org/682772 | 02:10 |
*** mdbooth has quit IRC | 02:23 | |
*** tbachman has quit IRC | 02:25 | |
*** BjoernT has joined #openstack-nova | 02:26 | |
*** mkrai has joined #openstack-nova | 02:40 | |
*** larainema has joined #openstack-nova | 02:49 | |
*** mkrai has quit IRC | 02:50 | |
*** hemna has joined #openstack-nova | 02:52 | |
*** BjoernT has quit IRC | 03:01 | |
*** BjoernT has joined #openstack-nova | 03:02 | |
*** markvoelker has quit IRC | 03:09 | |
*** slaweq has joined #openstack-nova | 03:11 | |
*** slaweq has quit IRC | 03:16 | |
*** yingjun has quit IRC | 03:24 | |
*** dave-mccowan has quit IRC | 03:36 | |
*** hemna has quit IRC | 03:40 | |
*** sapd1_x has joined #openstack-nova | 03:45 | |
*** ricolin has joined #openstack-nova | 03:54 | |
*** etp has joined #openstack-nova | 04:14 | |
*** sapd1_x has quit IRC | 04:16 | |
*** etp has quit IRC | 04:28 | |
*** ociuhandu has joined #openstack-nova | 04:30 | |
*** janki has joined #openstack-nova | 04:34 | |
*** ociuhandu has quit IRC | 04:35 | |
*** mkrai has joined #openstack-nova | 04:39 | |
*** udesale has joined #openstack-nova | 04:45 | |
*** pcaruana has joined #openstack-nova | 04:46 | |
*** BjoernT has quit IRC | 04:48 | |
*** TxGirlGeek has quit IRC | 04:49 | |
*** avolkov has joined #openstack-nova | 04:52 | |
*** tbachman has joined #openstack-nova | 04:56 | |
*** jaosorior has quit IRC | 04:57 | |
*** jaosorior has joined #openstack-nova | 04:57 | |
*** Luzi has joined #openstack-nova | 04:59 | |
*** mkrai has quit IRC | 05:00 | |
*** tbachman_ has joined #openstack-nova | 05:02 | |
*** tbachman has quit IRC | 05:02 | |
*** tbachman_ is now known as tbachman | 05:02 | |
*** markvoelker has joined #openstack-nova | 05:10 | |
*** slaweq has joined #openstack-nova | 05:11 | |
*** markvoelker has quit IRC | 05:14 | |
*** slaweq has quit IRC | 05:16 | |
*** Tianhao_Hu has joined #openstack-nova | 05:39 | |
*** Tianhao_Hu has left #openstack-nova | 05:39 | |
*** mkrai has joined #openstack-nova | 05:51 | |
*** lpetrut has joined #openstack-nova | 05:54 | |
*** ttsiouts has joined #openstack-nova | 06:00 | |
*** mjozefcz has joined #openstack-nova | 06:01 | |
*** ratailor has joined #openstack-nova | 06:02 | |
*** ttsiouts has quit IRC | 06:04 | |
*** tbachman has quit IRC | 06:04 | |
openstackgerrit | Luyao Zhong proposed openstack/nova master: libvirt: Enable driver configuring PMEM namespaces https://review.opendev.org/679640 | 06:08 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: doc: attaching virtual persistent memory to guests https://review.opendev.org/680300 | 06:08 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: objects: use all_things_equal from objects.base https://review.opendev.org/681397 | 06:08 |
*** slaweq has joined #openstack-nova | 06:11 | |
*** slaweq has quit IRC | 06:16 | |
openstackgerrit | garyk proposed openstack/nova master: Deconstruct the mother of all locks https://review.opendev.org/682242 | 06:17 |
*** igordc has joined #openstack-nova | 06:17 | |
openstackgerrit | Luyao Zhong proposed openstack/nova master: objects: use all_things_equal from objects.base https://review.opendev.org/681397 | 06:18 |
*** igordc has quit IRC | 06:19 | |
*** xek_ has joined #openstack-nova | 06:22 | |
*** ratailor has quit IRC | 06:25 | |
*** ratailor has joined #openstack-nova | 06:27 | |
*** xek_ has quit IRC | 06:30 | |
*** jawad_axd has joined #openstack-nova | 06:47 | |
*** slaweq has joined #openstack-nova | 06:55 | |
*** rcernin has quit IRC | 06:57 | |
*** cshen has joined #openstack-nova | 06:57 | |
*** mdbooth has joined #openstack-nova | 07:01 | |
*** trident has quit IRC | 07:08 | |
*** luksky has joined #openstack-nova | 07:13 | |
*** trident has joined #openstack-nova | 07:19 | |
*** takashin has joined #openstack-nova | 07:29 | |
*** mmedvede has quit IRC | 07:31 | |
*** ivve has joined #openstack-nova | 07:32 | |
*** rpittau|afk is now known as rpittau | 07:32 | |
*** arxcruz has quit IRC | 07:32 | |
*** damien_r has joined #openstack-nova | 07:33 | |
*** mmedvede has joined #openstack-nova | 07:34 | |
*** arxcruz has joined #openstack-nova | 07:36 | |
openstackgerrit | Yongli He proposed openstack/nova master: clean up orphan instances https://review.opendev.org/627765 | 08:01 |
*** ralonsoh has joined #openstack-nova | 08:04 | |
*** tkajinam has quit IRC | 08:04 | |
bauzas | good morning Nova | 08:15 |
*** ttsiouts has joined #openstack-nova | 08:16 | |
gibi | bauzas: good morning! | 08:19 |
* kashyap waves | 08:20 | |
openstackgerrit | Arthur Dayne proposed openstack/nova master: Fix block disk attachment failure https://review.opendev.org/682772 | 08:21 |
openstackgerrit | Balazs Gibizer proposed openstack/nova stable/pike: Stabilize unshelve notification sample tests https://review.opendev.org/676677 | 08:24 |
*** takashin has left #openstack-nova | 08:31 | |
*** ociuhandu has joined #openstack-nova | 08:37 | |
*** derekh has joined #openstack-nova | 08:40 | |
*** ociuhandu has quit IRC | 08:41 | |
*** cshen has quit IRC | 08:48 | |
*** cshen has joined #openstack-nova | 08:48 | |
*** ratailor has quit IRC | 08:51 | |
cshen | Good morning. I have a short question. Is the "rpc_conn_pool_size = 30" setting in nova.conf on compute nodes a single pool shared by all workers, or does each worker have its own RPC connection pool (size=30)? | 08:51 |
*** ociuhandu has joined #openstack-nova | 08:52 | |
*** ratailor has joined #openstack-nova | 08:52 | |
*** ociuhandu has quit IRC | 08:53 | |
*** KeithMnemonic1 has joined #openstack-nova | 08:55 | |
*** KeithMnemonic has quit IRC | 08:55 | |
*** KeithMnemonic1 has quit IRC | 08:55 | |
*** KeithMnemonic1 has joined #openstack-nova | 08:56 | |
*** pcaruana has quit IRC | 08:57 | |
bauzas | cshen: good question | 08:57 |
bauzas | cshen: the rpc parameter you mention comes from oslo.messaging | 08:57 |
bauzas | cshen: while the notion of workers comes from oslo.service | 08:57 |
*** ociuhandu has joined #openstack-nova | 08:58 | |
bauzas | cshen: since workers are fully seen as Linux processes, I'd guess the pool size is per worker | 08:58 |
bauzas | but I could be wrong | 08:59 |
*** pcaruana has joined #openstack-nova | 09:01 | |
*** priteau has joined #openstack-nova | 09:07 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: docs: Rewrite host aggregate, availability zone docs https://review.opendev.org/667133 | 09:08 |
stephenfin | bauzas: You think you could take a look at that (unfortunately large) doc patch today/this week? ^ | 09:09 |
stephenfin | I mostly need a sanity check to make sure I'm not telling any lies | 09:09 |
bauzas | stephenfin: I surely can try to take a look at it :) | 09:09 |
* bauzas clicks | 09:09 | |
stephenfin | aaaaand _another_ random job has failed for the cpu-resources series. FFS. | 09:10 |
stephenfin | This time nova-live-migration with something from devstack-gate, which I thought we weren't using anymore | 09:10 |
bauzas | oh, facepalm, dude | 09:10 |
bauzas | stephenfin: I haven't paid attention to the series tbh | 09:11 |
bauzas | I mean, once they were approved | 09:11 |
* bauzas doesn't have a customer care service department :p | 09:11 | |
stephenfin | I count 13 rechecks on the base patch since approval :( | 09:12 |
stephenfin | and all of them for different reasons | 09:12 |
stephenfin | talk about a moving target | 09:12 |
brinzhang | stephenfin: these patches still haven't merged :( | 09:13 |
brinzhang | as you said, the failure reason is different every time | 09:14 |
brinzhang | Is it because Zuul is under too much load? | 09:15 |
stephenfin | I think it's just a lot of things combined | 09:26 |
stephenfin | TripleO hammering the gate, issues with a package, pressure of the gate, large series with many patches that means increased chances of hitting a bug, etc. | 09:27 |
cshen | bauzas: thanks for the answer. How many nova workers can I see on the compute node? | 09:31 |
* bauzas fucked up a large part of an afternoon working out why his functional test wasn't working, only to find a PEBKAC | 09:31 | |
cshen | bauzas: I can see only 1 nova process running on my compute node now. | 09:32 |
*** panda|ruck|off is now known as panda|ruck | 09:37 | |
*** tesseract has joined #openstack-nova | 09:44 | |
*** dtantsur|afk is now known as dtantsur | 09:49 | |
*** markvoelker has joined #openstack-nova | 09:59 | |
openstackgerrit | Merged openstack/nova stable/stein: Fix non-existent method of Mock https://review.opendev.org/676838 | 09:59 |
openstackgerrit | Merged openstack/nova stable/stein: Fix wrong assertions in unit tests https://review.opendev.org/677388 | 09:59 |
*** sapd1_x has joined #openstack-nova | 10:00 | |
*** Tianhao_Hu has joined #openstack-nova | 10:02 | |
*** Tianhao_Hu has left #openstack-nova | 10:02 | |
*** markvoelker has quit IRC | 10:04 | |
*** openstackgerrit has quit IRC | 10:06 | |
*** ociuhandu has quit IRC | 10:07 | |
*** zbr has quit IRC | 10:11 | |
*** ttsiouts has quit IRC | 10:12 | |
*** zbr has joined #openstack-nova | 10:13 | |
*** ttsiouts has joined #openstack-nova | 10:13 | |
*** ociuhandu has joined #openstack-nova | 10:14 | |
*** markvoelker has joined #openstack-nova | 10:16 | |
*** ttsiouts has quit IRC | 10:17 | |
*** sapd1_x has quit IRC | 10:18 | |
*** ociuhandu_ has joined #openstack-nova | 10:18 | |
*** markvoelker has quit IRC | 10:20 | |
*** ociuhandu has quit IRC | 10:22 | |
*** openstackgerrit has joined #openstack-nova | 10:46 | |
openstackgerrit | Merged openstack/nova stable/stein: Retrun 400 if invalid query parameters are specified https://review.opendev.org/676026 | 10:46 |
*** mkrai has quit IRC | 10:54 | |
*** mkrai_ has joined #openstack-nova | 10:54 | |
*** ttsiouts has joined #openstack-nova | 10:55 | |
*** mkrai has joined #openstack-nova | 10:58 | |
*** mkrai_ has quit IRC | 11:00 | |
*** udesale has quit IRC | 11:01 | |
*** udesale has joined #openstack-nova | 11:02 | |
*** mkrai has quit IRC | 11:04 | |
*** shilpasd has joined #openstack-nova | 11:14 | |
*** pcaruana has quit IRC | 11:19 | |
*** psachin has joined #openstack-nova | 11:25 | |
openstackgerrit | Arthur Dayne proposed openstack/nova master: Fix block disk attachment failure https://review.opendev.org/682772 | 11:27 |
*** pcaruana has joined #openstack-nova | 11:28 | |
*** ratailor has quit IRC | 11:30 | |
*** Luzi has quit IRC | 11:35 | |
aspiers | https://www.suse.com/c/improving-trust-in-the-cloud-with-openstack-and-amd-sev/ | 11:39 |
aspiers | https://blog.adamspiers.org/2019/09/13/improving-trust-in-the-cloud-with-openstack-and-amd-sev/ | 11:39 |
aspiers | which contain a special shout-out to efried and all the Red Hatters here for your help :) | 11:40 |
sean-k-mooney | aspiers: are you not meant to be on PTO? | 11:40 |
aspiers | yes I am | 11:40 |
aspiers | but I wanted to get it out this week | 11:40 |
aspiers | sean-k-mooney: I tweaked that sentence on snooping based on your feedback, thanks | 11:40 |
sean-k-mooney | fair enough | 11:40 |
aspiers | hope it's more accurate now | 11:40 |
* aspiers goes back to vacation | 11:40 | |
sean-k-mooney | aspiers yep and the footnote looks good too | 11:42 |
sean-k-mooney | enjoy | 11:43 |
*** udesale has quit IRC | 11:44 | |
*** brault has joined #openstack-nova | 11:49 | |
*** Luzi has joined #openstack-nova | 11:49 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Trigger real BuildAbortException during migrate with bandwidth https://review.opendev.org/682876 | 11:53 |
*** brault has quit IRC | 11:53 | |
gibi | bauzas: a test improvement for the bandwidth functional tests ^^ | 11:54 |
bauzas | gibi: cool, will look | 11:54 |
gibi | bauzas: thanks | 11:54 |
gibi | bauzas: the next patch I will cook up is your request to test migration with pinned RPC version | 11:55 |
*** markvoelker has joined #openstack-nova | 12:05 | |
*** lpetrut has quit IRC | 12:06 | |
*** brault has joined #openstack-nova | 12:06 | |
*** janki has quit IRC | 12:07 | |
*** yedongcan has left #openstack-nova | 12:09 | |
*** awalende has joined #openstack-nova | 12:19 | |
*** derekh has quit IRC | 12:24 | |
*** redrobot has joined #openstack-nova | 12:26 | |
*** raghavendrat has joined #openstack-nova | 12:27 | |
*** luksky has quit IRC | 12:27 | |
*** raghavendrat has left #openstack-nova | 12:27 | |
*** brault has quit IRC | 12:34 | |
openstackgerrit | sean mooney proposed openstack/nova master: conf: Deprecate 'devname' field of '[pci] passthrough_whitelist' https://review.opendev.org/670585 | 12:35 |
*** damien_r has quit IRC | 12:40 | |
efried | ++ thanks aspiers! | 12:45 |
efried | stephenfin, alex_xu: thanks for splitting out the series. In retrospect (hindsight 20/20 and all that) we should have done it long ago, but I didn't want to lose all the +Vs we already had up the series since the gate was so slow. | 12:47 |
openstackgerrit | Tao Yang proposed openstack/nova master: Add missing parameter https://review.opendev.org/682886 | 12:48 |
*** nweinber has joined #openstack-nova | 12:50 | |
*** mriedem has joined #openstack-nova | 12:51 | |
*** brault has joined #openstack-nova | 12:52 | |
*** janki has joined #openstack-nova | 12:52 | |
*** janki has quit IRC | 12:53 | |
*** yaawang has quit IRC | 12:56 | |
openstackgerrit | sean mooney proposed openstack/nova master: make config drives sticky bug 1835822 https://review.opendev.org/669738 | 12:56 |
openstack | bug 1835822 in OpenStack Compute (nova) "vms loose acess to config drive with CONF.force_config_drive=True after hard reboot" [Medium,In progress] https://launchpad.net/bugs/1835822 - Assigned to sean mooney (sean-k-mooney) | 12:56 |
*** brault has quit IRC | 12:58 | |
mriedem | dansmith: if you're looking for an easy way to get back into things and be productive we need stable reviews https://review.opendev.org/#/q/status:open+project:openstack/nova+branch:stable/stein | 12:59 |
*** tbachman has joined #openstack-nova | 13:03 | |
*** derekh has joined #openstack-nova | 13:04 | |
*** ociuhandu_ has quit IRC | 13:06 | |
shilpasd | dansmith: hi, want your review on spec for isolate aggregate https://review.opendev.org/#/c/675384/ | 13:06 |
*** ricolin_ has joined #openstack-nova | 13:08 | |
*** ricolin has quit IRC | 13:11 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: make config drives sticky bug 1835822 https://review.opendev.org/669738 | 13:11 |
openstack | bug 1835822 in OpenStack Compute (nova) "vms loose acess to config drive with CONF.force_config_drive=True after hard reboot" [Medium,In progress] https://launchpad.net/bugs/1835822 - Assigned to sean mooney (sean-k-mooney) | 13:11 |
*** ricolin_ is now known as ricolin | 13:12 | |
mriedem | need another core on https://review.opendev.org/#/c/669738/ since it fixes a regression introduced in train | 13:12 |
sean-k-mooney | mriedem: thanks, i thought the api comment was for the flavor, not the images; i would have used the api otherwise | 13:14 |
sean-k-mooney | and ya i forgot to change the vm name | 13:14 |
sean-k-mooney | mdbooth: do we actually need the follow-up patch https://review.opendev.org/#/c/485930/10 ? i don't think we do, as the original bug is actually already fixed, but i think you wanted to refactor how this was done in general | 13:15 |
*** hemna has joined #openstack-nova | 13:16 | |
sean-k-mooney | i.e. once https://review.opendev.org/#/c/669738/ is merged i think we can just abandon the follow up patch unless we just want to do it for refactoring reasons | 13:16 |
*** eharney has joined #openstack-nova | 13:17 | |
*** ociuhandu has joined #openstack-nova | 13:18 | |
*** luksky has joined #openstack-nova | 13:19 | |
*** ttsiouts has quit IRC | 13:19 | |
sean-k-mooney | bauzas: gibi if ye are around care to review https://review.opendev.org/#/c/669738/ | 13:19 |
*** eharney has quit IRC | 13:19 | |
*** ttsiouts has joined #openstack-nova | 13:20 | |
*** awalende has quit IRC | 13:20 | |
*** ttsiouts has quit IRC | 13:20 | |
*** eharney has joined #openstack-nova | 13:20 | |
*** ttsiouts has joined #openstack-nova | 13:21 | |
*** tbachman has quit IRC | 13:28 | |
*** munimeha1 has joined #openstack-nova | 13:28 | |
*** psachin has quit IRC | 13:33 | |
dansmith | mriedem: cool, thanks.. I love being spoon fed | 13:33 |
dansmith | shilpasd: ack | 13:33 |
efried | dansmith: you got that one, then? I can stop reading? | 13:34 |
mriedem | dansmith: i'll burp and change your dipee for $2 | 13:34 |
mriedem | *dipey? | 13:34 |
efried | I'm gonna go with dipey. | 13:34 |
dansmith | efried: what one? You're already +2 on the spec revision | 13:34 |
efried | dansmith: The config drive regression | 13:35 |
*** ociuhandu has quit IRC | 13:35 | |
dansmith | efried: I was responding to mriedem asking me to hit the stable queue | 13:36 |
efried | okay, I was referring to 8:12:31 AM - mriedem: need another core on https://review.opendev.org/#/c/669738/ since it fixes a regression introduced in train | 13:37 |
efried | I'll keep reading | 13:37 |
dansmith | so, it looks like some of the stuff we +Wd before FF still haven't merged.. ISTR efried saying we'd reassess if things haven't merged by monday, and now it's wednesday | 13:40 |
dansmith | assume we're backing off from the recheck grind? | 13:40 |
efried | no, I hadn't planned to. | 13:40 |
bauzas | I think it's okay | 13:40 |
bauzas | at least until next Thursday... :D | 13:40 |
dansmith | how close to freeze are we going to keep doing that? | 13:40 |
efried | Ask me again on Monday | 13:41 |
* dansmith shakes his head | 13:41 | |
*** martinkennelly has joined #openstack-nova | 13:45 | |
*** brault has joined #openstack-nova | 13:48 | |
*** BjoernT has joined #openstack-nova | 13:48 | |
*** brault has quit IRC | 13:48 | |
*** Luzi has quit IRC | 13:49 | |
mriedem | gibi: need to deal with some ironic-isms in https://review.opendev.org/#/c/666857/ | 13:52 |
gibi | mriedem: ack, looking | 13:52 |
mriedem | gibi: also noted that we'll get a warning on every fresh compute start-up | 13:52 |
gibi | mriedem: is it OK to start up the nova-compute with ironic setup if the ironic-api is not available? | 13:53 |
mriedem | gibi: that's what we've always had since ironic in devstack is a plugin and comes after nova | 13:54 |
*** artom has quit IRC | 13:54 | |
mriedem | but i think in general it's good for the nova code to be resilient to when the hypervisor isn't available, | 13:55 |
gibi | mriedem: I see. Then I will log on that error but otherwise ignore it | 13:55 |
mriedem | e.g. in the libvirt driver case it auto-disables the service record so you can't schedule to that node | 13:55 |
mriedem | gibi: yeah i think log and return from the method is what i'd expect | 13:55 |
gibi | mriedem: ok | 13:55 |
*** tbachman has joined #openstack-nova | 14:00 | |
*** JamesBenson has joined #openstack-nova | 14:02 | |
shilpasd | dansmith: thanks for ack | 14:03 |
*** xek_ has joined #openstack-nova | 14:06 | |
efried | "There is no script for 398 version" <== I thought this means I need to rebuild my testenv, but that didn't fix it. | 14:06 |
dansmith | efried: clean your directory of pyc files | 14:08 |
efried | thanks dansmith | 14:09 |
*** dtantsur is now known as dtantsur|afk | 14:10 | |
*** zhubx has quit IRC | 14:16 | |
*** zhubx has joined #openstack-nova | 14:17 | |
*** openstackgerrit has quit IRC | 14:21 | |
*** mkrai has joined #openstack-nova | 14:28 | |
*** mkrai has quit IRC | 14:29 | |
*** mkrai_ has joined #openstack-nova | 14:30 | |
*** ociuhandu has joined #openstack-nova | 14:30 | |
dansmith | melwitt: were you going to update this? https://review.opendev.org/#/c/662095 | 14:33 |
*** TxGirlGeek has joined #openstack-nova | 14:33 | |
mriedem | alex_xu: is anyone working on renaming "Intel_Zuul" so it shows up as 3rd party CI? | 14:33 |
mriedem | i.e. comments in patches will be hidden by default | 14:33 |
sean-k-mooney | https://docs.openstack.org/infra/system-config/third_party.html#creating-a-service-account covers how to configure it, but they just need to change the account's full name to have CI in it | 14:36 |
*** brault has joined #openstack-nova | 14:36 | |
efried | dtroyer: ^^ Would you be able to (poke someone to) make a PR to make that adjustment? | 14:38 |
sean-k-mooney | its why everyone ignores my "Sean Mooney CI" account | 14:39 |
efried | I never know which one to use when adding you to a change. Is your RH email the right one? | 14:39 |
sean-k-mooney | use the non ci one | 14:40 |
sean-k-mooney | but yes the redhat one | 14:40 |
sean-k-mooney | the ci one was for testing ci jobs locally when i was planning to run my own third-party ci | 14:41 |
*** mdbooth has quit IRC | 14:42 | |
sean-k-mooney | i might someday set that up again but it was burning like 100-200 euro a month in power, something like that anyway | 14:42 |
mriedem | lyarwood: can we drop these backports? https://review.opendev.org/#/q/I15a7c13edf78884ec223fd531a78a341106b41b8+status:open | 14:43 |
mriedem | we have no recreate and it's a very latent thing | 14:43 |
dtroyer | efried: will do | 14:44 |
efried | thank you dtroyer | 14:44 |
lyarwood | mriedem: ack sure | 14:45 |
*** jawad_axd has quit IRC | 14:46 | |
*** tbachman has quit IRC | 14:46 | |
*** jawad_axd has joined #openstack-nova | 14:46 | |
*** jawad_axd has quit IRC | 14:46 | |
lyarwood | mriedem: I'll do the same for https://review.opendev.org/#/q/topic:bug/1834048+(status:open+OR+status:merged) - iirc the downstream reporter never got back to us about this. | 14:46 |
*** mlavalle has joined #openstack-nova | 14:47 | |
*** priteau has quit IRC | 14:47 | |
*** luksky has quit IRC | 14:48 | |
efried | sean-k-mooney, mriedem: merging https://review.opendev.org/#/c/669738/ (config drive fix) | 14:48 |
mriedem | lyarwood: yup thanks | 14:49 |
sean-k-mooney | efried: cool, im not sure if mdbooth replied but i dont think the follow up patch is needed so we might want to abandon it? i did not rebase it because i dont know what value it adds. | 14:49 |
efried | oh, that one's been around for a whiiile. | 14:50 |
efried | I haven't looked at it recently. | 14:50 |
mriedem | with the size and breadth of that change the commit message would need a hell of a lot more detail | 14:50 |
*** priteau has joined #openstack-nova | 14:51 | |
*** priteau has quit IRC | 14:56 | |
*** tbachman has joined #openstack-nova | 14:56 | |
*** jaosorior has quit IRC | 14:57 | |
*** TxGirlGe_ has joined #openstack-nova | 14:57 | |
*** priteau has joined #openstack-nova | 14:58 | |
*** TxGirlGeek has quit IRC | 15:00 | |
melwitt | dansmith: guh, yeah I had intended to but it keeps getting left on the backburner | 15:01 |
dansmith | melwitt: okay, seems worth finishing | 15:01 |
melwitt | yeah, agreed :( | 15:01 |
melwitt | if anyone else wants to update it, of course feel free too | 15:02 |
melwitt | I'll try to do it after my calls today | 15:02 |
*** artom has joined #openstack-nova | 15:03 | |
gibi | sean-k-mooney: I think I found possible race in the test in https://review.opendev.org/#/c/669738/ | 15:04 |
*** TxGirlGe_ has quit IRC | 15:06 | |
sean-k-mooney | i think the state does change but ill check | 15:07 |
sean-k-mooney | maybe it's the task state | 15:07 |
gibi | sean-k-mooney: task_state changes but I think the status doesn't | 15:07 |
*** openstackgerrit has joined #openstack-nova | 15:08 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Error out interrupted builds https://review.opendev.org/666857 | 15:08 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Pull up compute node queries to init_host https://review.opendev.org/682680 | 15:08 |
gibi | sean-k-mooney: even if status changes during the reboot, if it starts from ACTIVE and ends in ACTIVE then you cannot be sure why the wait returned | 15:08 |
*** arxcruz is now known as arxcruz|ruck | 15:08 | |
sean-k-mooney | looking at https://docs.openstack.org/nova/latest/reference/vm-states.html you seem to be correct | 15:08 |
*** gyee has joined #openstack-nova | 15:08 | |
gibi | mriedem: fixed your comments in https://review.opendev.org/#/c/666857 | 15:09 |
sean-k-mooney | gibi: efried so we should stop https://review.opendev.org/#/c/669738/7 merging and i should fix that | 15:10 |
gibi | sean-k-mooney: a rebase would pull it out from the gate | 15:10 |
sean-k-mooney | i just set workflow -1 will that work? | 15:11 |
sean-k-mooney | i just removed eric's +W | 15:11 |
gibi | I don't know what is stronger the W+ or the W- | 15:11 |
gibi | sean-k-mooney: ohh you can do that? | 15:11 |
sean-k-mooney | if you are the patch owner | 15:11 |
sean-k-mooney | or a core | 15:11 |
efried | whoah | 15:12 |
efried | I didn't know that | 15:12 |
sean-k-mooney | ya we block remove -* i think | 15:12 |
efried | anyway, that's not going to pull it out of the check queue, which is where it's sitting right now. But it should prevent it going into the gate. | 15:12 |
*** xek__ has joined #openstack-nova | 15:12 | |
efried | if it was already in the gate you would have to rebase to pull it. | 15:12 |
bauzas | mriedem: I'm pretty done with my audit command (just needing to work on unittests) but I'd like to provide a functional test about some migration error | 15:12 |
*** zul has joined #openstack-nova | 15:12 | |
efried | If you care to save on gate resources you could rebase to pull it out of check | 15:12 |
bauzas | mriedem: just to make it clear, there are cases where the migration goes into ERROR and then we have orphaned allocations, right? | 15:13 |
efried | if the patch itself needs changes. | 15:13 |
sean-k-mooney | efried: that will kick off new jobs | 15:13 |
sean-k-mooney | ill just try to fix it before it starts running | 15:13 |
efried | yes, but at the back of the queue :) | 15:13 |
bauzas | mriedem: I'd like to mimic some behaviour | 15:13 |
sean-k-mooney | ok | 15:13 |
sean-k-mooney | ill do that | 15:13 |
efried | I think actually you might be able to pull it completely by abandoning and restoring | 15:13 |
efried | um, restore might start it again | 15:14 |
efried | so you could abandon until you're ready to push new ps | 15:14 |
*** xek_ has quit IRC | 15:14 | |
*** TxGirlGe_ has joined #openstack-nova | 15:15 | |
kashyap | sfinucan: Before the GA, this should go in, yeah? The typo-that-matters in driver.py: https://review.opendev.org/#/c/682267/3/nova/virt/libvirt/driver.py | 15:16 |
mriedem | gibi: sean-k-mooney: commenting on sean's patch | 15:17 |
*** ivve has quit IRC | 15:18 | |
mriedem | commented | 15:19 |
mriedem | seems the only issue is reboot | 15:19 |
*** belmoreira has quit IRC | 15:19 | |
mriedem | sorry i was looking at unshelve | 15:19 |
*** cshen has quit IRC | 15:20 | |
openstackgerrit | Shilpa Devharakar proposed openstack/nova-specs master: Update spec: filtering of alloc candidates by forbidden aggregates https://review.opendev.org/675384 | 15:21 |
mriedem | i think in either case waiting for task_state to be None is sufficient, but waiting on notifications also works | 15:21 |
gibi | mriedem: replied | 15:22 |
mriedem | bauzas: for that audit command i would have started with functional tests months ago | 15:26 |
mriedem | unit tests don't mean much for stuff like this | 15:26 |
*** TxGirlGeek has joined #openstack-nova | 15:26 | |
mriedem | in fact, i don't like unit tests really for any of these types of operational commands | 15:26 |
bauzas | mriedem: I can send you a new revision now, FWIW | 15:26 |
bauzas | mriedem: but I just wonder what to test from a migration usecase | 15:26 |
bauzas | but I think I'll just mock a failing migration that leaves an allocation record | 15:26 |
mriedem | there are notes about migration in the bug report aren't there? | 15:26 |
bauzas | correct, there are examples where we found orphaned allocations | 15:27 |
mriedem | i don't think mocking is really good here, i'm pretty sure we have known ways to hit stuff like that, with evacuate | 15:27 |
mriedem | https://review.opendev.org/#/c/678100/ has a couple of related bugs where we orphan allocations/providers | 15:28 |
bauzas | what I basically do is to compare whether I can find an active migration against an existing allocation | 15:28 |
bauzas | will look at the change ^ | 15:28 |
bauzas | and you know what ? I'll release the change now so people can voice against it | 15:28 |
mriedem | kashyap: i've tagged the bug with train-rc-potential | 15:29 |
kashyap | mriedem: Ah, thank you. Didn't think of it | 15:29 |
kashyap | IIRC, stephenfin wanted to hold off on it until the 'cpu-resources' merges | 15:29 |
mriedem | so like the day of RC1 | 15:30 |
mriedem | got it | 15:30 |
kashyap | (Due to conflicts, etc.) | 15:30 |
* stephenfin shakes fist at gate | 15:30 | |
kashyap | mriedem: :D I mentioned it exactly for that -- in case it doesn't merge, that typo shouldn't fall through the cracks | 15:30 |
*** liuyulong has joined #openstack-nova | 15:30 | |
kashyap | (And you've helpfully fixed it, by the tag) | 15:30 |
mriedem | i sure hope everyone rechecking the vpmem and pcpu series is actually looking at every failure to make sure we're not recheck grinding in some regression | 15:30 |
stephenfin | I am. We're seeing mirror issues, InnoDB issues, and something with devstack_gate and null bytes | 15:31 |
mriedem | kashyap: i don't see you or aspiers as +1 on that bug fix | 15:31 |
stephenfin | None of which I know how to fix :( | 15:31 |
kashyap | mriedem: Damn, I have it sit in a tab, let me do that | 15:32 |
*** markvoelker has quit IRC | 15:35 | |
*** david-lyle has quit IRC | 15:35 | |
*** dklyle has joined #openstack-nova | 15:35 | |
openstackgerrit | Sylvain Bauza proposed openstack/nova master: WIP: Add a placement audit command https://review.opendev.org/670112 | 15:36 |
*** zbr has quit IRC | 15:36 | |
*** ttsiouts has quit IRC | 15:36 | |
*** brinzhang has quit IRC | 15:37 | |
*** zbr has joined #openstack-nova | 15:37 | |
*** brault has quit IRC | 15:38 | |
*** dtantsur|afk is now known as dtantsur | 15:39 | |
*** panda|ruck is now known as panda | 15:39 | |
*** trident has quit IRC | 15:41 | |
shilpasd | dansmith: i have addressed your minor nit on the isolated aggregate spec at https://review.opendev.org/#/c/675384/ | 15:41 |
mriedem | i guess that wasn't related to aspiers so ignore the ping | 15:44 |
dansmith | shilpasd: you only changed it in that one place | 15:47 |
*** ociuhandu has quit IRC | 15:47 | |
dansmith | shilpasd: it should be changed elsewhere too, except maybe in work items.. | 15:47 |
shilpasd | dansmith: okay, will do that | 15:47 |
*** mjozefcz has quit IRC | 15:48 | |
*** ociuhandu has joined #openstack-nova | 15:48 | |
*** ociuhandu has quit IRC | 15:49 | |
*** ociuhandu has joined #openstack-nova | 15:49 | |
*** trident has joined #openstack-nova | 15:53 | |
mloza | Hello, I'm getting tons of 'Unexpected exception in API method: MigrationNotFound:' in nova-api. Here is the full log http://paste.openstack.org/show/777427/ | 15:53 |
*** markvoelker has joined #openstack-nova | 15:53 | |
mloza | I don't see pending migrations in `nova migration-list` | 15:54 |
mloza | I'm using stable/stein branch | 15:54 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: libvirt: Get the CPU model, not 'arch' from get_capabilities() https://review.opendev.org/682267 | 15:55 |
openstackgerrit | sean mooney proposed openstack/nova master: make config drives sticky bug 1835822 https://review.opendev.org/669738 | 15:58 |
openstack | bug 1835822 in OpenStack Compute (nova) "vms loose acess to config drive with CONF.force_config_drive=True after hard reboot" [Medium,In progress] https://launchpad.net/bugs/1835822 - Assigned to Matt Riedemann (mriedem) | 15:58 |
sean-k-mooney | rebase to kick it down the queue | 15:58 |
kashyap | mriedem: Thanks for updating | 15:58 |
*** TxGirlGeek has quit IRC | 15:58 | |
openstackgerrit | Merged openstack/nova master: Remove an unused file and a related description https://review.opendev.org/681955 | 16:00 |
openstackgerrit | Shilpa Devharakar proposed openstack/nova-specs master: Update spec: filtering of alloc candidates by forbidden aggregates https://review.opendev.org/675384 | 16:01 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Add func test for 'required' PCI NUMA policy https://review.opendev.org/682941 | 16:02 |
artom | sean-k-mooney, stephenfin ^^ Dunno if it's that useful given that we test the 'legacy' policy which essentially does the same, but since I noticed it was missing while doing downstream reviews... | 16:02 |
*** henriqueof has quit IRC | 16:02 | |
sean-k-mooney | the legacy policy works differently if you have no device numa info on the host | 16:03 |
*** rpittau is now known as rpittau|afk | 16:03 | |
mriedem | artom: is that from you investigating that numa live migration testing thread in the ML? | 16:03 |
artom | mriedem, nope, from reviewing downstream backports | 16:04 |
artom | mriedem, he still hasn't posted logs from that ML post, btw | 16:04 |
artom | Makes it kinda hard to RCA :) | 16:04 |
sean-k-mooney | mriedem: its from https://review.opendev.org/#/c/674072/ we have a test backport for the customer to try, to see if that actually works for them | 16:04 |
sean-k-mooney | i was writing the spec for this but im fixing the config drive test currently | 16:05 |
sean-k-mooney | mriedem: we need to backport https://review.opendev.org/#/c/624444/ first and artom noticed that 'required' did not have test coverage | 16:06 |
*** jawad_axd has joined #openstack-nova | 16:07 | |
*** zbr is now known as zbr|ruck | 16:07 | |
artom | Not in func tests, anyways | 16:07 |
sean-k-mooney | right there are unit tests | 16:07 |
efried | dansmith: The isolated aggs spec update lgtm https://review.opendev.org/#/c/675384/ want me to fast approve? | 16:07 |
mriedem | unit tests are generally insufficient for any of these kinds of features | 16:07 |
mriedem | at this point in nova, unit tests are generally insufficient for just about anything | 16:08 |
artom | mriedem, the same could be said for func tests ;) | 16:08 |
mriedem | artom: for low level hardware stuff sure | 16:08 |
mriedem | i wouldn't say that in general | 16:08 |
artom | mriedem, I mean low level hardware stuff, like PCI or NUMA | 16:08 |
mriedem | so the broken feature that landed in queens is just getting rolled out to the customer that wanted it now? | 16:09 |
mriedem | 18 months later...right on time | 16:09 |
sean-k-mooney | yes | 16:09 |
mriedem | and queens is your current LTS? | 16:09 |
artom | Long live the queen, indeed | 16:09 |
sean-k-mooney | newton will be end of life in december and queens is our next LTS release downstream | 16:09 |
sean-k-mooney | so people are upgrading | 16:10 |
mnaser | this maybe more of a libvirt question, but i'll ask.. `cpu_mode` defaults to `host-model` in nova when using kvm, but the libvirt definition seems to be `<cpu mode='custom' match='exact' check='full'> ... </cpu>` (this is what i gather from virsh dumpxml) | 16:11 |
*** jawad_axd has quit IRC | 16:11 | |
mriedem | mnaser: meet kashyap | 16:11 |
mriedem | <3 | 16:11 |
*** xek__ has quit IRC | 16:11 | |
* mriedem is like cupid | 16:11 | |
mnaser | reading the nova code, it seems that nova is doing the right thing(tm) | 16:11 |
sean-k-mooney | mnaser: libvirt is also meant to default to host model if we say nothing, for what its worth | 16:12 |
mnaser | right, im just checkin an issue where nested virt is available, virsh capabilities shows vmx | 16:12 |
mnaser | but vms dont spawn with it | 16:12 |
sean-k-mooney | but i dont think it always did which is why nova does. | 16:12 |
efried | I'm going to be taking most of the rest of the week off (family in town). Will check in periodically. Email if you need to reach me. o/ | 16:12 |
*** efried is now known as efried_pto | 16:13 | |
mnaser | so i can add it to the list of extra flags (or try and solve it upstream :]) | 16:13 |
sean-k-mooney | do you have a warning about svm not available in the qemu instance log | 16:13 |
mriedem | mnaser: there are a bunch of recently updated docs here as well https://docs.openstack.org/nova/stein/admin/configuration/hypervisor-kvm.html#specify-the-cpu-model-of-kvm-guests | 16:13 |
mriedem | if that helps | 16:13 |
mnaser | ya i went over those, i can totally do passthrough or the other options, but right now my combination should just work ideally | 16:13 |
mnaser | man it would be nice if we split out some of the libvirt driver code | 16:14 |
mnaser | so github search indexes it | 16:14 |
mnaser | :p | 16:14 |
sean-k-mooney | qemu does not always detect the host model correctly and sometimes exposes the amd version of vmx (svm) if you use host model | 16:14 |
sean-k-mooney | and nested virt is enabled | 16:14 |
mnaser | yeah but the libvirt definition i see here clearly says use a custom set of stuff forbidding fallback too | 16:14 |
mnaser | with a bunch of "requires" too | 16:15 |
sean-k-mooney | can you paste the full xml somewhere | 16:15 |
mnaser | i dont know if virsh dumpxml shows the thing that virsh determines or the thing that it was fed | 16:15 |
sean-k-mooney | i dont think we should be listing a lot of features in the xml | 16:15 |
mnaser | http://paste.openstack.org/show/777429/ | 16:15 |
*** shilpasd has quit IRC | 16:16 | |
sean-k-mooney | do you have the extra cpu flags config set out of interest | 16:16 |
mnaser | https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L10171-L10174 | 16:16 |
mnaser | nope none are set | 16:16 |
mnaser | but that.. | 16:16 |
mriedem | stephenfin: https://review.opendev.org/#/c/681938/ now depends on your fix so once that passes i'll review your fix | 16:16 |
mnaser | that doesnt look right | 16:17 |
stephenfin | mriedem: You saw https://review.opendev.org/#/c/682111/, I assume? | 16:17 |
mriedem | i didn't dig into it | 16:17 |
mnaser | https://github.com/openstack/nova/blob/master/nova/virt/libvirt/utils.py#L533-L539 | 16:17 |
mnaser | so that gives it back qemu64 | 16:17 |
sean-k-mooney | mnaser: can you set cpu-mode="host-model" explicitly? | 16:17 |
mnaser | then we start with https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L10158-L10166 | 16:18 |
mriedem | stephenfin: artom said i can't trust functional tests for numa stuff so... | 16:18 |
*** jawad_axd has joined #openstack-nova | 16:18 | |
mnaser | sean-k-mooney: yeah i think i might try that next | 16:18 |
mnaser | i think the config docs aren't up to date in that case | 16:18 |
* mriedem waits for stephenfin to tell me it's not numa stuff | 16:18 | |
mnaser | unset/none != host-model | 16:18 |
sean-k-mooney | mnaser: that code path is meant to be equivalent but i think its actually not | 16:18 |
stephenfin | it's not NUMA stuff | 16:18 |
* stephenfin obliges | 16:18 | |
mriedem | HA | 16:18 |
stephenfin | https://openstack.fortnebula.com:13808/v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/zuul_opendev_logs_ebb/682111/1/check/nova-tox-functional/ebbc041/testr_results.html.gz | 16:18 |
stephenfin | Also the reason that patch is WIP (the test doesn't belong in that file) | 16:19 |
sean-k-mooney | mnaser: we would bail out here normally https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L10152-L10156 | 16:19 |
mnaser | except if its not set, which the config docs say: If ``virt_type="kvm|qemu"``, it will default to ``host-model``, otherwise it will default to ``none``. | 16:19 |
stephenfin | but yeah, with that we see the exact issue I was seeing locally | 16:19 |
mnaser | https://github.com/openstack/nova/blob/stable/stein/nova/conf/libvirt.py#L539-L540 | 16:20 |
stephenfin | I also fixed the other thing kashyap spotted and pointed out earlier | 16:20 |
mnaser | which meant either the config docs should be updated, or the correct behaviour should be done | 16:20 |
sean-k-mooney | ya but i think on master the new support for multiple custom cpu models has changed things | 16:21 |
mnaser | what it seems to be doing is actually grabbing qemu64 (base) and then looping over all the features of baseline cpu and adding them as extra features | 16:21 |
sean-k-mooney | mnaser: it used to be the same | 16:21 |
mnaser | this is a stein deployment so maybe we just forgot to document the config change | 16:21 |
sean-k-mooney | the change i was referring to only landed on master recently but we might have accidentally changed the behavior in rocky or stein | 16:22 |
sean-k-mooney | and not updated the config | 16:22 |
*** jawad_axd has quit IRC | 16:22 | |
mnaser | ok yeah no this code isnt there in master | 16:22 |
* mnaser goes back | 16:22 | |
sean-k-mooney | this is what we used to do https://github.com/openstack/nova/blob/stable/stein/nova/virt/libvirt/driver.py#L3912-L3931 | 16:22 |
mnaser | i wonder if we ended up running newer than stein in some messed up wrong way :X | 16:23 |
* mnaser checks | 16:23 | |
sean-k-mooney | mnaser: you would have to be running master from last week to have that new code i think | 16:23 |
*** gmann is now known as gmann_afk | 16:23 | |
mnaser | ok then def not | 16:23 |
sean-k-mooney | mnaser: did you pull the wheels from pypi/tarballs.openstack.org by mistake? | 16:24 |
mnaser | no i just double checked and its the right stuff | 16:24 |
sean-k-mooney | oh you were also looking at the traits code not the xml generation code | 16:25 |
mnaser | ah gotcha | 16:26 |
sean-k-mooney | the config is still the same https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L4150-L4181 | 16:26 |
openstackgerrit | melanie witt proposed openstack/nova stable/stein: Add reno about nova-api eventlet monkey-patching and rabbitmq https://review.opendev.org/662095 | 16:26 |
openstackgerrit | Merged openstack/nova stable/stein: neutron: refactor nw info cache refresh out of associate_floating_ip https://review.opendev.org/682181 | 16:27 |
openstackgerrit | Merged openstack/nova stable/stein: Trap and log errors from _update_inst_info_cache_for_disassociated_fip https://review.opendev.org/682182 | 16:27 |
openstackgerrit | Merged openstack/nova stable/stein: Find instance in another cell during floating IP re-association https://review.opendev.org/682183 | 16:27 |
openstackgerrit | Merged openstack/nova stable/stein: Log notifications if assertion in _test_live_migration_force_complete fails https://review.opendev.org/681743 | 16:27 |
mriedem | stephenfin: you mean https://review.opendev.org/#/c/682267/ right? | 16:28 |
stephenfin | yup | 16:29 |
sean-k-mooney | mnaser: those feature flags still look wrong to me | 16:29 |
mriedem | i definitely think we need some functional testing for the libvirt driver that sets cpu extra flags b/c that's 2 regressions when using that config option in the last week or so | 16:29 |
mriedem | and configuring cpu extra flags seems pretty important for red hat customers... | 16:29 |
sean-k-mooney | mriedem: it implies we are getting to the extra_flag section or libvirt added them | 16:29 |
stephenfin | Agreed. I want to flesh out that functional test of mine and use that | 16:29 |
stephenfin | Just need to figure out where to put it now | 16:29 |
sean-k-mooney | mriedem: sorry that was for mnaser | 16:29 |
mnaser | sean-k-mooney: yeah it doesn't look right | 16:29 |
mnaser | but i dont know if virsh dumpxml does some magic | 16:29 |
mnaser | and doesnt actually print out what it got from nova | 16:29 |
mriedem | stephenfin: maybe just a new simple libvirt functional test that isn't in the test_numa module or whatever | 16:30 |
sean-k-mooney | mnaser: was that config from the nova compute log or libvirt via virsh | 16:30 |
mnaser | no virsh | 16:30 |
mnaser | so i havent blamed nova yet | 16:30 |
sean-k-mooney | virsh does not | 16:30 |
sean-k-mooney | libvirt transforms the xml we pass in | 16:30 |
mnaser | yeah so this might be transformed | 16:30 |
sean-k-mooney | e.g. it does not actually use the xml we provide. | 16:30 |
stephenfin | yeah, I'd say so. I just need to check how much of the mocking I need to carry across | 16:30 |
mnaser | though i wonder why it wouldn't keep host-model there | 16:30 |
* stephenfin will set aside tomorrow morning to do just that | 16:30 | |
mnaser | hmm i remember virsh dumpxml used to have a "dumpable" version | 16:31 |
sean-k-mooney | mnaser: ya i dont know either. this is where i delegate to danpb or kashyap as i read C but dont really want to have to find where that happens | 16:31 |
sean-k-mooney | mnaser: its not a virsh issue | 16:31 |
mnaser | ok fair nuff | 16:31 |
sean-k-mooney | when we create the xml and define the domain libvirt parses it and saves an updated version to disk and in memory | 16:32 |
*** ivve has joined #openstack-nova | 16:32 | |
sean-k-mooney | virsh talks to libvirt over a unix socket and virsh dumpxml gives you the rendered version | 16:32 |
mnaser | oooh | 16:33 |
sean-k-mooney | libvirt when it ingests the xml fills in a bunch of things like guest pci addresses that we dont set | 16:33 |
mnaser | wait so i can do cat on the local disk then | 16:33 |
sean-k-mooney | there is a copy saved to disk somewhere but you could also just check the nova compute log | 16:33 |
sean-k-mooney | the xml is saved in it | 16:33 |
mnaser | yeah not running $world with DEBUG on | 16:33 |
mnaser | so that'd have to be a little exercise :p | 16:33 |
mnaser | its only logged when running under debug afaik | 16:34 |
sean-k-mooney | yes it is | 16:34 |
mnaser | time to google for the millionth time "how to run an openstack instance on a specific node" | 16:34 |
sean-k-mooney | anyway i would check the qemu instance log and see what the error is | 16:34 |
openstackgerrit | Merged openstack/nova-specs master: Update spec: filtering of alloc candidates by forbidden aggregates https://review.opendev.org/675384 | 16:35 |
sean-k-mooney | cat /var/log/libvirt/qemu/instanace... | 16:35 |
mnaser | yeah but i think its a layer before that because the qemu process doesnt have the flag | 16:35 |
mnaser | "-cpu IvyBridge-IBRS,ss=on,pcid=on,hypervisor=on,arat=on,tsc_adjust=on,md-clear=on,stibp=on,ssbd=on,xsaveopt=on,pdpe1gb=on" | 16:35 |
sean-k-mooney | which flag is missing? | 16:35 |
sean-k-mooney | i thought the vm was crashing? | 16:36 |
mnaser | vmx (and the fact that things are manually added) | 16:36 |
mnaser | nope, no crashes at all, just literally a flag missing | 16:36 |
mnaser | virsh capabilities shows vmx | 16:36 |
mnaser | but booting instances without defining cpu_model does not | 16:36 |
sean-k-mooney | right but did you turn nested virt on in the host kernel | 16:36 |
mnaser | does not boot it with vmx then | 16:36 |
mnaser | yep | 16:36 |
sean-k-mooney | ok i normally use host-passthrough with nested virt so the flag is there | 16:37 |
mnaser | `virsh capabilities` wouldn't report vmx there if it wasnt on either afaik | 16:37 |
mnaser | yeah, im trying to avoid that one | 16:37 |
sean-k-mooney | the model you're using might just not include it | 16:37 |
sean-k-mooney | you could just add it | 16:37 |
mnaser | cat /sys/module/kvm_intel/parameters/nested => Y | 16:37 |
sean-k-mooney | with the config | 16:37 |
mnaser | sean-k-mooney: it actually does though -- http://paste.openstack.org/show/777431/ | 16:37 |
mnaser | thats the thing | 16:37 |
mnaser | i could add it manually but im trying to see if its something we can improve upstream :> | 16:38 |
sean-k-mooney | the host does | 16:38 |
*** dtantsur is now known as dtantsur|afk | 16:38 | |
mnaser | doesnt the output of virsh capabilities show the model used by libvirt? | 16:38 |
mnaser | in this case IvyBridge-IBRS which does include vmx | 16:38 |
sean-k-mooney | that does not mean the IvyBridge-IBRS has the flag | 16:38 |
mnaser | wait what | 16:38 |
mnaser | seriously | 16:38 |
sean-k-mooney | yes | 16:38 |
sean-k-mooney | there is an xml file you can check | 16:39 |
mnaser | i thought virsh capabilities was what would host-model be | 16:39 |
*** jmlowe has quit IRC | 16:39 | |
sean-k-mooney | no | 16:39 |
mnaser | cpu_map.xml | 16:39 |
sean-k-mooney | its literally the host capabilities unfiltered | 16:39 |
sean-k-mooney | the model it reports is the closest one to your host cpu | 16:39 |
*** TxGirlGe_ has quit IRC | 16:40 | |
mnaser | well | 16:40 |
mnaser | that explains all my confusion | 16:40 |
mnaser | then i need to add the extra cpu flag | 16:40 |
mnaser | cpu_map.xml shows nothing actually has vmx flag | 16:40 |
mnaser | i could _swear_ that it used to be there before | 16:40 |
* mnaser goes to check | 16:40 | |
sean-k-mooney | where is cpu_map by the way i was looking for it locally | 16:40 |
mnaser | sean-k-mooney: /usr/share/libvirt/cpu_map.xml | 16:41 |
mnaser | (on centos at least) | 16:41 |
*** derekh has quit IRC | 16:41 | |
sean-k-mooney | ah /usr/share, i was looking in /var/lib | 16:41 |
mnaser | ok im really confused as to how it used to work before but i guess it did | 16:41 |
* mnaser sighs | 16:41 | |
sean-k-mooney | it could be a libvirt change | 16:42 |
sean-k-mooney | or you were using host-passthrough and did not know/notice/remember | 16:42 |
mnaser | no as much as host-passthrough is great im very anti that because live migrations | 16:43 |
sean-k-mooney | you can use it with live migrations | 16:43 |
sean-k-mooney | but you need to group things by hardware type | 16:43 |
mnaser | yeah it gets tricky for new hardware | 16:44 |
sean-k-mooney | by the way host-model is not really better | 16:44 |
sean-k-mooney | if you use host model and you move from ivybridge to haswell then reboot the vm it wont be able to migrate back | 16:44 |
mnaser | yeah im perfectly ok with migrating from old to new but not new to old | 16:44 |
sean-k-mooney | if you really care about migration you should use custom and select a version across all nodes | 16:44 |
mnaser | it enables my use case of "upgrading things" | 16:44 |
*** mkrai_ has quit IRC | 16:45 | |
*** liuyulong has quit IRC | 16:47 | |
*** macz has joined #openstack-nova | 16:51 | |
stephenfin | mriedem: Can you cut an sqlalchemy-migrate release? | 16:52 |
stephenfin | I'm seeing logs about a contextual_connect popping up and I assume that's not helping with our "we generate too many logs" gate issues. A patch from zzzeek is on master | 16:53 |
*** tesseract has quit IRC | 16:53 | |
sean-k-mooney | stephenfin: we cant in train | 16:55 |
sean-k-mooney | but we can for U | 16:55 |
mriedem | stephenfin: i've got a release pending for U | 16:56 |
sean-k-mooney | stephenfin: i think mriedem may have fixed that | 16:56 |
mriedem | https://review.opendev.org/#/c/682656/ | 16:56 |
mriedem | i see what you mean in https://7adee8d979f9b27778af-fc266c4961a026b4ec86218d0c17f3b6.ssl.cf5.rackcdn.com/523559/10/check/openstack-tox-py36/55f9ad4/job-output.txt and https://ce11c0ee0a74ee8ebb7e-9c035ca54e1e355b36ad4d338836f375.ssl.cf2.rackcdn.com/523559/10/check/nova-tox-functional-py36/ac3beea/job-output.txt though | 16:57 |
mriedem | stephenfin: in nova we can add a warnings filter to only log that once | 16:57 |
mriedem | in the WarningsFixture we have | 16:57 |
sean-k-mooney | stephenfin: speaking of that im updating their tox file here https://review.opendev.org/#/c/682515/ if you care to take a look. the native coverage support still seems to not work. i spent an hour working on it so i did the same hack we have in nova | 16:57 |
mriedem | stephenfin: so let's do that in nova in Train and if it's fixed in sqla-migrate we can bump mins when that gets released in U | 16:58 |
mriedem | you could push the nova patch under bug 1813147 | 16:58 |
openstack | bug 1813147 in OpenStack Compute (nova) "p35 jobs are failing with subunit.parser ... FAILED" [High,In progress] https://launchpad.net/bugs/1813147 - Assigned to Balazs Gibizer (balazs-gibizer) | 16:58 |
mriedem | this is the patch you were talking about btw https://review.opendev.org/#/c/671040/ | 16:59 |
*** brault has joined #openstack-nova | 17:00 | |
openstackgerrit | melanie witt proposed openstack/nova master: Add note about needing noVNC >= v1.1.0 with using ESX https://review.opendev.org/682946 | 17:01 |
*** luksky has joined #openstack-nova | 17:06 | |
melwitt | KeithMnemonic1: ^ (late) | 17:11 |
KeithMnemonic1 | lol | 17:15 |
*** mgariepy has quit IRC | 17:16 | |
sean-k-mooney | im not sure why i try to make my tests robust... | 17:18 |
sean-k-mooney | gibi: mriedem http://paste.openstack.org/show/777439/ | 17:20 |
sean-k-mooney | so you know why status is not meant to be HARD_REBOOT ... | 17:20 |
*** mgariepy has joined #openstack-nova | 17:22 | |
sean-k-mooney | im going to add back in self._wait_for_state_change(self.api, shelved_server, 'ACTIVE') | 17:22 |
sean-k-mooney | or active_server in this case | 17:23 |
mriedem | idk what you are doing there | 17:24 |
sean-k-mooney | im using the fake notifier to wait for the reboot end notification | 17:24 |
mriedem | but why do you need a whole separate _get_server_info | 17:25 |
sean-k-mooney | the same one that is used in the allocation functional tests | 17:25 |
mriedem | self._wait_for_state_change(self.api, active_server, 'ACTIVE') does the same thing you added | 17:25 |
sean-k-mooney | because i was no longer checking for the server being active | 17:25 |
sean-k-mooney | right but the whole point was the status was meant to always be active | 17:25 |
sean-k-mooney | or the previous patch would have been correct | 17:26 |
sean-k-mooney | its actually HARD_REBOOT | 17:26 |
mriedem | after the task_state is None? | 17:26 |
mriedem | if you wait for ACTIVE and task_state=None you shouldn't have a problem | 17:26 |
sean-k-mooney | no im not checking for that im waiting for the notification | 17:26 |
sean-k-mooney | ya i can do that instead | 17:26 |
mriedem | i look forward to at least 5 more patch sets for this fix | 17:27 |
sean-k-mooney | i like how this is literally all because im swapping the order of two lines | 17:27 |
*** ricolin has quit IRC | 17:27 | |
mriedem | pick one or the other i don't think it matters, the task_state is None here https://github.com/openstack/nova/blob/c67057dff34a0054977ae3873d33313c0617b308/nova/compute/manager.py#L3529 and the notification is here https://github.com/openstack/nova/blob/c67057dff34a0054977ae3873d33313c0617b308/nova/compute/manager.py#L3536 | 17:29 |
mriedem | for the purpose of your test either is sufficient | 17:29 |
mriedem | if you do wait for the task_state to be None, use https://github.com/openstack/nova/blob/c67057dff34a0054977ae3873d33313c0617b308/nova/tests/functional/integrated_helpers.py#L247 | 17:29 |
mriedem | don't write some new function | 17:29 |
mriedem | using _wait_for_server_parameter will get you the latest copy of the server with the config drive value in the api so i'd use that myself | 17:30 |
sean-k-mooney | yep i can do that | 17:30 |
openstackgerrit | Merged openstack/nova stable/stein: Fix the server group "policy" field type in api-ref https://review.opendev.org/662224 | 17:30 |
sean-k-mooney | but the reason i pinged you on irc was it looks like our fixtures are not doing the right thing | 17:31 |
openstackgerrit | Merged openstack/nova stable/stein: Fixing broken links https://review.opendev.org/681401 | 17:31 |
openstackgerrit | Merged openstack/nova stable/stein: libvirt: stub logging of host capabilities https://review.opendev.org/682210 | 17:31 |
openstackgerrit | Merged openstack/nova stable/stein: Fix rebuild of baremetal instance when vm_state is ERROR https://review.opendev.org/680869 | 17:31 |
sean-k-mooney | well either we really do go from ACTIVE->HARD_REBOOTING->ACTIVE, which is what i originally thought, in which case its not a race, or the fixture is returning the wrong value. ill look at that after i do the task state check | 17:33 |
mriedem | you're seeing HARD_REBOOTING b/c of the task_state not being None | 17:34 |
sean-k-mooney | for rebuild im getting the same behavior "esttools.matchers._impl.MismatchError: 'ACTIVE' != 'REBUILD'" | 17:34 |
mriedem | does your test properly stub out the fake notifier in setUp? | 17:34 |
sean-k-mooney | yes | 17:34 |
mriedem | this is why you saw HARD_REBOOT https://github.com/openstack/nova/blob/c67057dff34a0054977ae3873d33313c0617b308/nova/api/openstack/common.py#L53 | 17:35 |
mriedem | b/c the task_state was set | 17:35 |
sean-k-mooney | i copied the stub and the example usage from the integrated helpers | 17:35 |
*** zbr|ruck is now known as zbr | 17:36 | |
sean-k-mooney | mriedem: but that should not be reported in the status field it should be in OS-EXT-STS:task_state | 17:37 |
sean-k-mooney | or is that not how that works | 17:37 |
sean-k-mooney | there is also OS-EXT-STS:vm_state | 17:37 |
mriedem | that's not how that works | 17:38 |
mriedem | status is a mix of vm_state and task_state | 17:38 |
mriedem | OS-EXT-STS:vm_state and OS-EXT-STS:task_state are separate | 17:38 |
sean-k-mooney | yes and there is a third field status | 17:38 |
sean-k-mooney | wait for state change checks status | 17:39 |
sean-k-mooney | so is status task_state when its not equal to None and vm_state otherwise? | 17:39 |
sean-k-mooney | because that seems to be the behavior of the fixture | 17:40 |
mriedem | no | 17:40 |
mriedem | see the link i just posted above | 17:40 |
mriedem | https://github.com/openstack/nova/blob/c67057dff34a0054977ae3873d33313c0617b308/nova/api/openstack/common.py#L117 | 17:41 |
sean-k-mooney | ya im reading that now | 17:41 |
sean-k-mooney | it uses a map to do it but its similar to what i said. but there is not a race in the current version then | 17:43 |
*** jmlowe has joined #openstack-nova | 17:44 | |
sean-k-mooney | anyway to be extra safe ill push with the wait for state change + notifications since that works | 17:46 |
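Annotation on the exchange above: mriedem's point is that the API `status` field is derived from the combination of `vm_state` and `task_state`, so a server whose `vm_state` is still active can report `HARD_REBOOT` or `REBUILD` while a task is in flight. A minimal sketch of that kind of lookup follows. The table contents and the function name `status_from_state` are illustrative assumptions, not a copy of the mapping in `nova/api/openstack/common.py`.

```python
# Hypothetical, trimmed-down mapping for illustration only; the real table
# is the one mriedem links to in nova/api/openstack/common.py.
_STATE_MAP = {
    "active": {
        "default": "ACTIVE",
        "rebooting_hard": "HARD_REBOOT",
        "rebuilding": "REBUILD",
    },
    "stopped": {"default": "SHUTOFF"},
}


def status_from_state(vm_state, task_state=None):
    """Derive the user-facing status from vm_state plus task_state."""
    # Unknown vm_state values fall back to a single UNKNOWN entry.
    task_map = _STATE_MAP.get(vm_state, {"default": "UNKNOWN"})
    # A task_state with no specific entry (including None) uses the default.
    return task_map.get(task_state, task_map["default"])
```

Under this sketch, waiting for status `ACTIVE` is equivalent to waiting for the task_state to clear, which matches the advice to wait for `ACTIVE` with `task_state=None`.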
artom | mriedem, I've had an intuition for https://review.opendev.org/#/c/641453/2 | 17:48 |
artom | Comments inline | 17:48 |
openstackgerrit | sean mooney proposed openstack/nova master: make config drives sticky bug 1835822 https://review.opendev.org/669738 | 17:49 |
openstack | bug 1835822 in OpenStack Compute (nova) "vms loose acess to config drive with CONF.force_config_drive=True after hard reboot" [Medium,In progress] https://launchpad.net/bugs/1835822 - Assigned to sean mooney (sean-k-mooney) | 17:49 |
*** jmlowe has quit IRC | 17:53 | |
*** ociuhandu has quit IRC | 17:56 | |
*** mmethot_ has joined #openstack-nova | 17:57 | |
*** mmethot_ has quit IRC | 17:58 | |
*** mmethot_ has joined #openstack-nova | 17:59 | |
*** mmethot has quit IRC | 18:00 | |
mriedem | artom: replied but i still don't really get it | 18:00 |
mriedem | we know that migrate_data.bdms isn't getting set during pre_live_migration | 18:01 |
mriedem | which should happen on the dest | 18:01 |
artom | mriedem, to be honest neither do I | 18:01 |
mriedem | it's probably something really dumb and i'm just overlooking it | 18:01 |
*** priteau has quit IRC | 18:01 | |
artom | I mean, that's 4 of us, at this point? | 18:02 |
artom | We can't all be dumb | 18:02 |
mriedem | oh we can | 18:02 |
mriedem | and will be gdi | 18:02 |
artom | I checked the dumb stuff like args out of order | 18:02 |
artom | I admire your stubbornness | 18:02 |
mriedem | wonder if the partial with source_bdms is somehow at fault | 18:02 |
*** brault has quit IRC | 18:03 | |
mriedem | OH | 18:03 |
mriedem | i SEE it | 18:03 |
artom | mriedem, don't think so - the failure is "in" the migration | 18:04 |
mriedem | I SEE IT | 18:04 |
artom | When it's updating the instance XML, migrate_data.bdms is not there | 18:04 |
artom | TELL US | 18:04 |
mriedem | IT'S FULL OF STARS | 18:05 |
mriedem | commented inline to explain the problem, | 18:05 |
mriedem | but tl;dr we didn't return the migrate_data that we got back from the dest to pass to the driver on the source | 18:05 |
mriedem | so we passed our stale copy | 18:05 |
artom | "Thanks Artom, you helped me figure it out." | 18:06 |
artom | I'm going to use that to fall asleep now | 18:06 |
artom | Better than any lullaby | 18:06 |
*** mmethot_ has quit IRC | 18:06 | |
artom | Ah, I get it | 18:07 |
artom | *facepalm* | 18:07 |
artom | Side effects suck | 18:07 |
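Annotation on the bug mriedem describes above: a helper rebinds its own `migrate_data` argument to the object returned from the destination, but never returns it, so the caller keeps passing its stale copy to the driver. The sketch below reproduces that pattern with plain dicts. Every function name here is hypothetical and the dict stands in for the real migrate data object; it is not the nova code under review.

```python
def dest_pre_live_migration(migrate_data):
    """Stand-in for the call to the destination host.

    Like an RPC round trip, it hands back a new object rather than
    mutating the one that was passed in.
    """
    updated = dict(migrate_data)
    updated["bdms"] = ["vol-1"]  # illustrative value set on the dest
    return updated


def _pre_step_buggy(migrate_data):
    # Rebinding the argument only changes the local name; nothing is
    # returned, so the caller never sees the updated object.
    migrate_data = dest_pre_live_migration(migrate_data)


def _pre_step_fixed(migrate_data):
    # Return what came back from the destination.
    return dest_pre_live_migration(migrate_data)


def live_migrate_buggy(migrate_data):
    _pre_step_buggy(migrate_data)
    return migrate_data  # stale copy: this is what the driver would get


def live_migrate_fixed(migrate_data):
    migrate_data = _pre_step_fixed(migrate_data)
    return migrate_data  # updated copy, including "bdms"
```

The buggy path fails silently, which fits the symptom in the log: the failure only shows up later, when the instance XML is updated and `bdms` is missing.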
mriedem | pycharm might have even warned me locally that i was overwriting a method arg | 18:08 |
*** mmethot has joined #openstack-nova | 18:08 | |
mriedem | i know it does if you shadow imports | 18:08 |
mriedem | which is helpful in cases where we pass around a thing called 'context' and have a module import of nova.context | 18:08 |
artom | IIRC importing nova.context you still have to access it with nova.context | 18:09 |
artom | As opposed to from nova import context | 18:09 |
mriedem | i mean the latter | 18:10 |
mriedem | from nova import context | 18:10 |
mriedem | ... | 18:10 |
mriedem | instances = objects.Instance.get_by_instance_uuid(context, instance_uuid) | 18:10 |
artom | Yep, that would suck | 18:10 |
mriedem | we do it all over the place, but i think it's only bit us in the ass once that i know of | 18:11 |
*** mmethot has quit IRC | 18:11 | |
*** ociuhandu has joined #openstack-nova | 18:11 | |
sean-k-mooney | passing the module instead of a context object | 18:11 |
sean-k-mooney | that would have interesting side effects if it did not explode | 18:12 |
* artom context switches to an internal escalation | 18:12 | |
*** zhubx has quit IRC | 18:15 | |
*** boxiang has joined #openstack-nova | 18:15 | |
*** ociuhandu has quit IRC | 18:16 | |
*** mmethot has joined #openstack-nova | 18:16 | |
sean-k-mooney | if youre really unlucky and only do the assignment to the module in a unit test you can end up with the tests passing or failing depending on the order that they are run | 18:17 |
sean-k-mooney | i remember helping to debug an intermittent failure in the neutron gate which was caused by accidentally assigning to the module in a test | 18:18 |
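Annotation on the shadowing discussion above: with `from nova import context` at module level, a function that takes a `context` parameter shadows the module, and a function that forgets the parameter silently picks up the module instead. The sketch below imitates this without nova installed: `types.ModuleType` stands in for the imported module, and `RequestContext`, `get_instance`, and both handlers are hypothetical names for illustration.

```python
import types

# Stand-in for `from nova import context`: a module object bound to the
# module-level name `context`.
context = types.ModuleType("context")


class RequestContext:
    def __init__(self, user_id):
        self.user_id = user_id


context.RequestContext = RequestContext


def get_instance(ctxt, uuid):
    """Pretend lookup that needs a real request context."""
    if isinstance(ctxt, types.ModuleType):
        raise TypeError("got the context module, not a RequestContext")
    return {"uuid": uuid, "user_id": ctxt.user_id}


def handler_ok(context, uuid):
    # The parameter shadows the module inside this function, so this
    # passes the request context as intended.
    return get_instance(context, uuid)


def handler_forgot_param(uuid):
    # No local `context` here, so the name resolves to the module.
    # No NameError is raised to flag the mistake.
    return get_instance(context, uuid)
```

The explicit `isinstance` check is only there to make the failure visible in the sketch; without it the module would be passed along until something tried to use it as a request context.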
*** martinkennelly has quit IRC | 18:18 | |
mloza | hello, is it safe to clean up failed migrations (those whose status is confirmed or error) in the migrations table of the nova database? | 18:20 |
sean-k-mooney | well those that are in confimed are not failed | 18:22 |
sean-k-mooney | if you mean can you clean up old migration records in general i believe so although im not sure if they are needed for the audit logs. | 18:23 |
sean-k-mooney | the main side effect of cleaning up old migrations that are confirmed or error should be just that they are no longer available for querying via the api. but maybe mriedem can double check that is correct? | 18:23 |
sean-k-mooney | i dont know if we have a nova manage command for that but i can see that they would build up over time | 18:24 |
sean-k-mooney | at least until the vms are deleted in any case | 18:24 |
mriedem | there is no audit log for migrations, in case you're thinking of instance actions | 18:27 |
mriedem | which are different | 18:27 |
mriedem | and yeah the downside to removing complete migration records is losing any history on those and yeah they won't be archived/pruned until the instance is deleted, | 18:28 |
sean-k-mooney | yes i was trying to think of what side effect it could have | 18:28 |
mriedem | we don't have any nova-manage command to archive old migration records - i don't think you can even do that b/c of a foreign key reference to the instance | 18:28 |
mriedem | mloza: so to answer your question i don't think you can do that until the instances for those migrations are also deleted, | 18:29 |
mriedem | and if you have deleted those instances, then simply running the archive/prune commands will remove them | 18:29 |
sean-k-mooney | it depends i guess on how we specified the foreign key constraint | 18:30 |
sean-k-mooney | e.g. is it set to cascade on vm deletion or is it a restrict key that prevents deletion until the instance is deleted | 18:31 |
mriedem | we don't have any fkeys that do cascading deletes | 18:31 |
sean-k-mooney | this would be specified in the models.py right? | 18:31 |
sean-k-mooney | we had a customer delete migration recently so i think you can do it | 18:32 |
mriedem | it's in the models yes | 18:32 |
mriedem | sean-k-mooney: are you fixing that config drive functional test or what? | 18:32 |
mloza | sorry, I meant only status error. Just found the confirmed status are for migration and resize action | 18:33 |
sean-k-mooney | i pushed a version | 18:33 |
*** ociuhandu has joined #openstack-nova | 18:33 | |
mloza | mriedem: ok, i'll just the migration records in the db for now | 18:33 |
sean-k-mooney | https://review.opendev.org/#/c/669738/9/nova/tests/functional/regressions/test_bug_1835822.py@66 | 18:33 |
mloza | leave* | 18:33 |
sean-k-mooney | i added the wait for notification and kept the state change wait | 18:34 |
sean-k-mooney | so it cant race on still being active | 18:34 |
sean-k-mooney | when we start to wait | 18:34 |
sean-k-mooney | mloza: ok the foreign_key constraint just ensures the instance_uuid exists in the instance table, the migration instance uuid matches, and the instance is not deleted | 18:36 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L805-L809 | 18:36 |
sean-k-mooney | so it won't prevent you deleting the old migration records while the vm still exists | 18:36 |
*** openstackgerrit has quit IRC | 18:37 | |
*** ociuhandu has quit IRC | 18:38 | |
mloza | sean-k-mooney: any side effects of deleting the old migration records that are in status error while the vm still exists? | 18:39 |
sean-k-mooney | the only one i'm aware of is that you will no longer be able to retrieve the migration info since you deleted it. | 18:40 |
*** openstackgerrit has joined #openstack-nova | 18:41 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Refactor pre-live-migration work out of _do_live_migration https://review.opendev.org/641453 | 18:41 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Mark "block_migration" arg deprecation on pre_live_migration method https://review.opendev.org/682963 | 18:41 |
mloza | understand. I don't need it for auditing | 18:41 |
mloza | Thanks | 18:41 |
mriedem | don't take that as a recommendation that it's ok | 18:41 |
mriedem | messing with the db directly is not really a supported thing, so beware | 18:42 |
mriedem | especially with migration-based allocation consumers since queens i could see us (nova) relying on migration records existing to determine a type of consumer of resource allocations in placement | 18:42 |
mriedem | like in the audit command that bauzas is working on | 18:42 |
mriedem | mloza: ^ | 18:42 |
sean-k-mooney | right, from a downstream perspective i think we would normally ask you to file a support exception first and assess if it's safe | 18:43 |
mriedem | for example, let's say a migration failed and we failed to cleanup allocations in placement properly, then placement might be saying there are more consumers of resources than there actually are, and without some kind of audit tooling to determine that - which would use migration records - you might have a hard time sorting that all out | 18:43 |
mriedem | that kind of orphan issue can lead to a situation where you expect to be able to land new VMs on a host but placement filters them out saying there is no room | 18:44 |
sean-k-mooney | yes. our customer issue was related to FFU and failed evacuations leaving evacuate migration records in pre-migrating | 18:45 |
sean-k-mooney | in that case in queens we deleted their vms when the compute agent started up | 18:45 |
sean-k-mooney | that said they didn't properly check the evacuations before unfencing the node | 18:46 |
mriedem | that said you didn't provide them a tool to check when it was safe to unfence the ndoe | 18:47 |
mriedem | *node | 18:47 |
mriedem | especially for the $$$ they are giving you | 18:47 |
sean-k-mooney | true but the db records showed they used --force --host to do a nova evac to the same host | 18:48 |
sean-k-mooney | which is why the evacuation failed | 18:48 |
sean-k-mooney | you can't evacuate to the same host that you are evacuating from. we should however have an api check for that | 18:49 |
sean-k-mooney | this came up last week so we havent really gotten around to figuring out what needs to be done upstream yet | 18:49 |
mriedem | well, w/o --force the scheduler would have kicked it out | 18:50 |
sean-k-mooney | yep | 18:50 |
mriedem | as long as the compute was still disabled | 18:50 |
mriedem | we should probably deprecate the --force option on the evacuate cli https://docs.openstack.org/python-novaclient/latest/cli/nova.html#nova-evacuate | 18:50 |
sean-k-mooney | if the compute was active it would have kicked it out too | 18:50 |
sean-k-mooney | mriedem: didnt you already do that? | 18:50 |
mriedem | the force parameter was deprecated in the api | 18:50 |
mriedem | but it's still there on earlier microversions | 18:50 |
mriedem | added in 2.29 and removed in 2.67 | 18:51 |
sean-k-mooney | ah ok i remember talking about not supporting it in the osc version | 18:51 |
mloza | mriedem: I guess I'm hitting that orphan issue. I possibly cleared the migration records before. No wonder every time I launch a new VM it always goes to a specific host. | 18:51 |
sean-k-mooney | speaking of which i should update that patch. | 18:51 |
mloza | What would be a remedy to this issue? | 18:52 |
sean-k-mooney | mloza: what release are you running | 18:52 |
mloza | stable/stein | 18:52 |
mriedem | sean-k-mooney: please tell me you ran pep8 on this https://review.opendev.org/#/c/669738/8.. | 18:52 |
sean-k-mooney | ok i was wondering if you had the old failed builds behavior. | 18:52 |
mriedem | https://review.opendev.org/#/c/669738/ | 18:52 |
sean-k-mooney | i think i did | 18:54 |
sean-k-mooney | i know i had to fix the ordering of the imports | 18:54 |
mriedem | mloza: this is a work in progress but you might be able to use this https://review.opendev.org/#/c/670112/ - that would also provide valuable feedback on what it reports | 18:54 |
mriedem | mloza: you can use https://docs.openstack.org/osc-placement/latest/ to see what is consuming resource allocations on your nodes (resource providers) | 18:55 |
mriedem | i would start by investigating a particular host that you think should be available but placement is saying it's not | 18:55 |
mloza | will take a look | 18:56 |
mriedem | so using this https://docs.openstack.org/osc-placement/latest/cli/index.html#resource-provider-show with the --allocations option | 18:56 |
mriedem | the allocations are keyed by consumer uuid, which in the case of nova since queens can be migrations or instances | 18:57 |
mriedem | migrations for the source host during a migration, instances for the dest host | 18:57 |
mriedem | so let's say you evacuated all instances from a down host but placement is still reporting allocations against that host - those allocation consumers are likely migration records | 18:57 |
mriedem | and were orphaned | 18:57 |
mriedem | there are a couple of related bugs in this patch https://review.opendev.org/#/c/678100/ | 18:58 |
*** cshen has joined #openstack-nova | 19:01 | |
*** jmlowe has joined #openstack-nova | 19:03 | |
*** cshen has quit IRC | 19:06 | |
*** mriedem has quit IRC | 19:15 | |
*** mriedem has joined #openstack-nova | 19:16 | |
mriedem | melwitt: how are you feeling about this? https://review.opendev.org/#/c/541420/ i split the refactor out and without the compat code for old cinder it's quite a bit simpler than the last time you looked. | 19:18 |
*** belmoreira has joined #openstack-nova | 19:23 | |
*** ociuhandu has joined #openstack-nova | 19:24 | |
*** eharney has quit IRC | 19:39 | |
*** ociuhandu has quit IRC | 19:40 | |
*** jmlowe has quit IRC | 19:42 | |
*** artom has quit IRC | 19:47 | |
*** ralonsoh has quit IRC | 19:56 | |
*** hoonetorg has quit IRC | 20:01 | |
KeithMnemonic1 | I have a customer running into an odd issue with multipath during migration (Pike). To make a long story short the LUN assignments are getting mixed up somewhere. geguileo mentioned it possibly could be this https://review.opendev.org/#/c/551302/ I did a quick look at cherry-picking and it seems there are a ton of dependencies. Does that seem accurate or does someone have some magic to cherry-pick it to | 20:03 |
KeithMnemonic1 | pike? | 20:03 |
*** larainema has quit IRC | 20:06 | |
*** pcaruana has quit IRC | 20:10 | |
*** jmlowe has joined #openstack-nova | 20:12 | |
*** belmoreira has quit IRC | 20:13 | |
KeithMnemonic1 | mriedem, melwitt any thoughts on my question above | 20:15 |
mriedem | coincidentally that came up yesterday on pike https://review.opendev.org/#/c/670016/ | 20:16 |
mriedem | ^ will not fix your issue though | 20:16 |
mriedem | looking at the conflicts on the queens backport i'm not surprised that there would be conflicts going to pike, but w/o doing it myself i don't know how bad it is, | 20:17 |
mriedem | red hat probably doesn't care to backport that since i don't think they are supporting pike | 20:17 |
mriedem | efried_pto: are you going to be running the meeting tomorrow morning or should someone else? | 20:23 |
KeithMnemonic1 | mriedem i was using aspiers git-deps and it was a long list. I am not experienced enough to know if there are any shortcuts | 20:39 |
*** eharney has joined #openstack-nova | 20:43 | |
KeithMnemonic1 | I was looking to confirm with someone if my initial look was correct (that it is too cumbersome) or if it is not that bad | 20:48 |
mriedem | i can peek in a bit | 20:50 |
melwitt | mriedem: I dunno, haven't looked at it in a long time. will go through it today | 20:51 |
*** ociuhandu has joined #openstack-nova | 20:56 | |
melwitt | KeithMnemonic1, mriedem: I can confirm that we haven't backported https://review.opendev.org/#/c/551302/ beyond queens downstream either, so I also don't know what to expect as far as how gnarly the backport will be without actually trying to do it | 20:56 |
KeithMnemonic1 | I took a look and it seemed gnarly but was hoping for a second opinion | 20:58 |
mriedem | is this a new failure in building docs locally? | 21:00 |
mriedem | WARNING: RSVG converter command 'rsvg-convert' cannot be run. Check the rsvg_converter_bin setting | 21:00 |
KeithMnemonic1 | melwitt a plug for aspiers this is what I used to gauge the effort https://github.com/aspiers/git-deps | 21:00 |
KeithMnemonic1 | but that pulls in every dependency and i am not experienced enough to know if any could be skipped | 21:04 |
mriedem | ah i see https://github.com/openstack/nova/commit/16b9486bf7e91bfd5dc48297cee9f54b49156c93 | 21:05 |
*** avolkov has quit IRC | 21:09 | |
*** nweinber has quit IRC | 21:11 | |
mriedem | we need librsvg2-bin in bindep.txt | 21:15 |
*** gbarros has joined #openstack-nova | 21:17 | |
*** brault has joined #openstack-nova | 21:19 | |
mriedem | KeithMnemonic1: do you have a paste of the dependencies? i wouldn't be surprised if a bunch of the conflicts are due to like mox removal patches | 21:19 |
*** brault has quit IRC | 21:23 | |
*** JamesBenson has quit IRC | 21:25 | |
mriedem | melwitt: if you have a fedora or centos system available can you confirm that librsvg2-tools provides the rsvg-convert command? | 21:28 |
mriedem | which is i think just: yum provides rsvg-convert | 21:28 |
mriedem | right? | 21:28 |
melwitt | let me check | 21:29 |
*** JamesBenson has joined #openstack-nova | 21:30 | |
openstackgerrit | Matt Riedemann proposed openstack/nova-specs master: Re-propose cross-cell-resize spec for Ussuri https://review.opendev.org/683002 | 21:31 |
melwitt | mriedem: you are correct, comes from librsvg2-tools | 21:33 |
mriedem | cool, thanks. that's what cinder had in their bindep but i've only got ubuntu and that's all that zuul uses for the pdf jobs | 21:33 |
melwitt | looked on centos7.3 | 21:33 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add librsvg2* to bindep https://review.opendev.org/683003 | 21:34 |
mriedem | looks like we just had a zuul restart | 21:34 |
*** gbarros has quit IRC | 21:34 | |
*** JamesBenson has quit IRC | 21:35 | |
mriedem | and it's busted, "Unable to freeze job graph: 'dict_keys' object does not support indexing" | 21:40 |
mriedem | on that note, looks like i'm out of there for the day | 21:40 |
*** panda has quit IRC | 21:41 | |
*** mriedem is now known as mriedem_afk | 21:41 | |
*** panda has joined #openstack-nova | 21:42 | |
*** TxGirlGeek has joined #openstack-nova | 21:47 | |
mriedem_afk | KeithMnemonic1: quickly glancing at that pike backport from queens, there are a bunch of conflicts but the big ones look like they are due to the multiattach volume support added in queens but those aren't functional dependencies, so i don't think those would actually impact backporting the fix to pike except it makes the backport harder | 21:49 |
*** takashin has joined #openstack-nova | 21:50 | |
*** eharney has quit IRC | 22:04 | |
*** mlavalle has quit IRC | 22:22 | |
*** markvoelker has quit IRC | 22:24 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (13) https://review.opendev.org/576020 | 22:34 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (13) https://review.opendev.org/576020 | 22:34 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (14) https://review.opendev.org/576027 | 22:36 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (14) https://review.opendev.org/576027 | 22:36 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (15) https://review.opendev.org/576031 | 22:39 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (15) https://review.opendev.org/576031 | 22:39 |
*** ociuhandu has quit IRC | 22:40 | |
*** munimeha1 has quit IRC | 22:40 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (16) https://review.opendev.org/576299 | 22:41 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (16) https://review.opendev.org/576299 | 22:41 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (17) https://review.opendev.org/576344 | 22:42 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (17) https://review.opendev.org/576344 | 22:42 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (18) https://review.opendev.org/576673 | 22:43 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (18) https://review.opendev.org/576673 | 22:43 |
openstackgerrit | Eric Fried proposed openstack/nova master: libvirt: Enable driver configuring PMEM namespaces https://review.opendev.org/679640 | 22:44 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (19) https://review.opendev.org/576676 | 22:45 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (19) https://review.opendev.org/576676 | 22:45 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (20) https://review.opendev.org/576689 | 22:46 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (20) https://review.opendev.org/576689 | 22:46 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (21) https://review.opendev.org/576709 | 22:47 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (21) https://review.opendev.org/576709 | 22:47 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (22) https://review.opendev.org/576712 | 22:48 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (22) https://review.opendev.org/576712 | 22:48 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Add TODO note for mox removal https://review.opendev.org/576758 | 22:49 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Add TODO note for mox removal https://review.opendev.org/576758 | 22:49 |
*** mriedem_afk has quit IRC | 22:53 | |
*** tkajinam has joined #openstack-nova | 23:02 | |
openstackgerrit | Takashi NATSUME proposed openstack/python-novaclient master: Add a check for --config-drive option on nova boot https://review.opendev.org/653683 | 23:08 |
*** luksky has quit IRC | 23:13 | |
*** rcernin has joined #openstack-nova | 23:16 | |
*** mriedem has joined #openstack-nova | 23:17 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/pike: WIP: Avoid redundant initialize_connection on source post live migration https://review.opendev.org/683008 | 23:30 |
*** BjoernT has quit IRC | 23:30 | |
mriedem | KeithMnemonic1: ^ is the pike backport with runtime code conflicts handled, test module conflicts are not handled (yet) but the commit message calls out what caused the conflicts so it's just a matter of resolving those test module conflicts | 23:31 |
* alex_xu knocks the desk with head when found the ci fail again | 23:32 | |
sean-k-mooney | can we actually split the cpu resources series from the vpmem series at this point | 23:34 |
sean-k-mooney | it was intended to prevent merge conflicts but i think it has ended up hurting more than it helped | 23:34 |
sean-k-mooney | although at this point i guess the vpmem series has merged? | 23:35 |
sean-k-mooney | looks like we are now on the first useful patch of the pcpu series | 23:36 |
alex_xu | sean-k-mooney: we already split | 23:36 |
sean-k-mooney | today? | 23:37 |
sean-k-mooney | what was the failure out of interest | 23:37 |
alex_xu | yesterday | 23:38 |
alex_xu | I only saw two bugs, one is test_model_sync, another one is db timeout | 23:39 |
sean-k-mooney | sorry i thought you were referring to https://review.opendev.org/#/c/672693/ | 23:40 |
sean-k-mooney | it's going to fail in the gate because devstack failed on the second node in the live migration job | 23:41 |
sean-k-mooney | looks like the nova compute service db entry for the subnode never got created properly | 23:42 |
alex_xu | yea, I thought those two in vpmems, just checking what fail in cpu resource | 23:42 |
efried_pto | mriedem: I'll try to be around for the meeting tomorrow; but if I'm not, you wanna run it? | 23:43 |
mriedem | sure | 23:43 |
mriedem | alex_xu: zuul had 2 restarts today | 23:43 |
mriedem | so that reset everything | 23:44 |
efried_pto | alex_xu: totally bug 1823251 again. | 23:44 |
openstack | bug 1823251 in OpenStack Compute (nova) "Spike in TestNovaMigrationsMySQL.test_walk_versions/test_innodb_tables failures since April 1 2019 on limestone-regionone" [High,Confirmed] https://launchpad.net/bugs/1823251 | 23:44 |
efried_pto | I'm trying hard to nail it down, but so far no luck. | 23:44 |
alex_xu | same bug after reset? | 23:44 |
efried_pto | dunno, I just rebased and re+Wed like half an hour ago. | 23:44 |
efried_pto | correction: 1h ago. | 23:45 |
*** JamesBenson has joined #openstack-nova | 23:46 | |
efried_pto | (only rebased to requeue since it was failing the constraints job bogusly - on the aforementioned bug) | 23:46 |
efried_pto | now of course it's going to have to wait twelve freaking hours to get a node again. | 23:46 |
mriedem | you know, it might be worth just skipping TestNovaMigrationsMySQL until all of these things are merged | 23:46 |
efried_pto | Agreeeeed | 23:47 |
mriedem | unskip before rc1 | 23:47 |
efried_pto | that would increase our merge percentage by an order of magnitude. | 23:47 |
mriedem | do it | 23:47 |
efried_pto | you wanna propose it quick I'll fast approve it. | 23:47 |
*** efried_pto is now known as mriedem1 | 23:47 | |
mriedem | merging all of this stuff in the middle of next week is crazy | 23:47 |
*** mriedem1 is now known as efried_pto | 23:47 | |
*** ivve has quit IRC | 23:48 | |
efried_pto | yeah, I'd really rather merge it *this* week. | 23:48 |
efried_pto | Hell, I would have rather merged it last Thursday | 23:48 |
mriedem | was just about to say ^ | 23:48 |
mriedem | i'm in the middle of something so you go ahead | 23:48 |
alex_xu | i can do it | 23:50 |
mriedem | well let me see here | 23:50 |
sean-k-mooney | when the compute service joins the service group i should see an entry in the cell conductor corresponding to the compute node entry, correct? | 23:50 |
*** JamesBenson has quit IRC | 23:50 | |
openstackgerrit | Dustin Cowles proposed openstack/nova master: Provider Config File: YAML file loading and schema validation https://review.opendev.org/673341 | 23:50 |
openstackgerrit | Dustin Cowles proposed openstack/nova master: Provider Config File: Function to further validate and retrieve configs https://review.opendev.org/676029 | 23:50 |
openstackgerrit | Dustin Cowles proposed openstack/nova master: WIP: Provider Config File: Merge provider configs to provider tree https://review.opendev.org/676522 | 23:50 |
mriedem | alex_xu: i've got it | 23:50 |
alex_xu | mriedem: ok, so I can help to +2 | 23:51 |
efried_pto | alex_xu: ++ | 23:51 |
sean-k-mooney | hum maybe not | 23:52 |
sean-k-mooney | that compute node does not have the db config so it would have to go via the conductor but i guess it may not be logged or rabbit could have dropped the message | 23:55 |
sean-k-mooney | oh never mind, i misread the devstack message | 23:56 |
sean-k-mooney | it finished registering the compute service and then the neutron openvswitch agent did not start | 23:56 |
*** cfriesen has quit IRC | 23:57 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Temporarily skip TestNovaMigrationsMySQL https://review.opendev.org/683009 | 23:58 |
sean-k-mooney | wait is this complaining baour the compute agent or neutron openvswitch agent | 23:59 |
sean-k-mooney | https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_5b9/672693/24/gate/nova-live-migration/5b959a8/logs/subnode-2/devstacklog.txt.gz | 23:59 |
sean-k-mooney | /baour/about/ | 23:59 |
mriedem | alex_xu: efried_pto: there you go https://review.opendev.org/683009 | 23:59 |
*** MarkMielke has quit IRC | 23:59 | |
alex_xu | mriedem: thanks | 23:59 |
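
Editor's sketch for the 18:30-18:36 foreign key exchange: sean-k-mooney asked whether the key cascades or restricts, and mriedem said nova has no cascading deletes. The toy example below shows the general SQL behaviour of a non-cascading foreign key, using Python's built-in `sqlite3`. The two-table schema and the values `inst-1`, `error`, and `confirmed` are assumptions made up for illustration; this is not nova's real schema, and the log does not establish whether nova declares a database-level constraint here at all.

```python
import sqlite3

# Toy schema (hypothetical, not nova's): migrations rows reference instances rows.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.execute("CREATE TABLE instances (uuid TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE migrations ("
    " id INTEGER PRIMARY KEY,"
    " instance_uuid TEXT REFERENCES instances(uuid),"  # no ON DELETE CASCADE
    " status TEXT)"
)
conn.execute("INSERT INTO instances VALUES ('inst-1')")
conn.execute(
    "INSERT INTO migrations (instance_uuid, status) VALUES ('inst-1', 'error')"
)
conn.execute(
    "INSERT INTO migrations (instance_uuid, status) VALUES ('inst-1', 'confirmed')"
)

# Deleting the referencing (child) row is allowed while the instance exists.
deleted = conn.execute("DELETE FROM migrations WHERE status = 'error'").rowcount

# Deleting the referenced (parent) row is refused while a child still points at it.
try:
    conn.execute("DELETE FROM instances WHERE uuid = 'inst-1'")
    parent_delete_blocked = False
except sqlite3.IntegrityError:
    parent_delete_blocked = True

remaining = conn.execute("SELECT COUNT(*) FROM migrations").fetchone()[0]
print(deleted, parent_delete_blocked, remaining)
```

In this toy setup the constraint protects the parent row, not the child, which matches sean-k-mooney's reading that the key would not stop someone deleting a migration record. mriedem's warning at 18:41-18:44 still applies: being able to delete a row does not make it safe or supported.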
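
Editor's sketch for mriedem's 18:57 explanation that placement allocations are keyed by consumer uuid, and that since queens a consumer can be either an instance or a migration. The helper below sorts the consumers found on one resource provider into those two kinds, plus a third bucket for consumers that match neither. The function name `classify_consumers` and all of the uuids and resource amounts are hypothetical; real data would come from placement and the nova database, which this sketch does not query.

```python
def classify_consumers(allocations, instance_uuids, migration_uuids):
    """Group allocation consumers on one provider by what kind of record they match.

    allocations: dict mapping consumer uuid -> resources held on the provider.
    instance_uuids / migration_uuids: sets of uuids known to nova.
    """
    result = {"instance": [], "migration": [], "unknown": []}
    for consumer in sorted(allocations):
        if consumer in instance_uuids:
            result["instance"].append(consumer)
        elif consumer in migration_uuids:
            result["migration"].append(consumer)
        else:
            # No matching record: nothing left to say what this consumer was.
            result["unknown"].append(consumer)
    return result


# Hypothetical allocations reported against a single host.
allocations = {
    "inst-a": {"VCPU": 2, "MEMORY_MB": 4096},
    "mig-1": {"VCPU": 2, "MEMORY_MB": 4096},
    "gone-9": {"VCPU": 1, "MEMORY_MB": 512},
}
report = classify_consumers(allocations, {"inst-a"}, {"mig-1"})
print(report)
```

The `unknown` bucket is the situation mriedem warned about at 18:43: if the migration record behind a consumer has been removed from the database, there is nothing left to identify the allocation as a leftover from a failed migration.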