opendevreview | Nobuhiro MIKI proposed openstack/nova-specs master: Add PXB support for libvirt https://review.opendev.org/c/openstack/nova-specs/+/869416 | 02:35 |
*** blarnath is now known as d34dh0r53 | 07:01 | |
opendevreview | Aaron S proposed openstack/nova master: Add further workaround features for qemu_monitor_announce_self https://review.opendev.org/c/openstack/nova/+/867324 | 11:03 |
*** dasm|off is now known as dasm | 14:14 | |
stephenfin | gibi: No point rechecking jobs that exhibit this failure | 14:46 |
stephenfin | tox.tox_env.python.api.NoInterpreter: could not find python interpreter matching any of the specs functional-py39 | 14:46 |
gibi | ahh | 14:46 |
stephenfin | it's another tox 4 bug | 14:46 |
gibi | nice | 14:46 |
stephenfin | https://github.com/tox-dev/tox/issues/2811 | 14:46 |
gibi | what can we do? | 14:47 |
stephenfin | I happened to expose it by fixing another bug that resulted in us using the wrong interpreter version | 14:47 |
stephenfin | :( | 14:47 |
sean-k-mooney | i have been seeing it since before your fix | 14:48 |
sean-k-mooney | but ya all the gates are currently blocked | 14:48 |
stephenfin | yeah, most likely on projects without base_python set | 14:48 |
sean-k-mooney | i saw it on nova yesterday and i think on older builds from during the week | 14:49 |
sean-k-mooney | we have base_python set | 14:49 |
sean-k-mooney | we dont actually need to have it set anymore since we are python3 only | 14:49 |
stephenfin | the fix to tox merged yesterday so it was probably that | 14:49 |
sean-k-mooney | ya the release happened 19 hours ago but i thought the builds were older than that | 14:50 |
sean-k-mooney | i saw it on gibi's series | 14:50 |
sean-k-mooney | im wondering if we should repin to tox<4.0 temporarily | 14:51 |
sean-k-mooney | we can probably wait another week but if we cant resolve the issue by the end of next week i think we should | 14:52 |
gibi | sean-k-mooney, stephenfin: is there a mail thread about the gate block on the ML yet or should I send one? | 14:59 |
stephenfin | There isn't. The fix is here https://github.com/tox-dev/tox/pull/2828 though if you want to send one and point to that | 14:59 |
gibi | I will send one | 14:59 |
dansmith | sean-k-mooney: tox has been slowly breaking everything for weeks now.. pinning to <4 temporarily seems futile | 15:01 |
sean-k-mooney | dansmith: im currently trying to fix some os-vif tox issues related to ubuntu 22.04 | 15:08 |
sean-k-mooney | 4.0 is after that on my list | 15:09 |
sean-k-mooney | dansmith: we had a pin in place until recently to prevent the gate block | 15:16 |
dansmith | yeah and we're pinning on stable, I'm just saying I don't think _temporarily_ pinning and expecting things to stabilize is realistic | 15:16 |
sean-k-mooney | i understand why they removed it but i dont think we should block the gate while we are fixing it | 15:16 |
dansmith | it's not like they broke a bunch of backwards compat in 4.0 and now things are stable.. they *keep* breaking things | 15:17 |
dansmith | also for the reason that tox will auto-upgrade itself in certain scenarios (which is like ....) | 15:18 |
sean-k-mooney | i havent really had issues with tox but also havent been using it much in the last while as i have not really been coding in python for a few months | 15:18 |
sean-k-mooney | dansmith: apparently you can force it to install itself in a venv and use that version to run things | 15:18 |
dansmith | sean-k-mooney: it will do that itself if it decides to | 15:18 |
dansmith | but only the latest, not a specific version | 15:19 |
sean-k-mooney | not according to Brian Rosmaita's latest email | 15:19 |
sean-k-mooney | you can force the version via requires in tox.ini | 15:19 |
dansmith | " it doesn't ensure that the available tox is that version." | 15:20 |
dansmith | oh, there's two pins, with different behaviors | 15:20 |
dansmith | he's talking about requires, but there are projects with ensure | 15:21 |
dansmith | it's really a mess | 15:22 |
sean-k-mooney | yep | 15:23 |
sean-k-mooney | the reason i was suggesting we wait a week is at that point we would be 4 weeks from FF | 15:23 |
sean-k-mooney | and dont really want to still have the gates blocked by this at that point | 15:23 |
sean-k-mooney | i.e. lets see if we can fix it next week and if not pin it so we can continue merging things and work on it in parallel | 15:24 |
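[editor's note: an illustrative sketch of the `requires` pin discussed above, not a quote of any project's actual tox.ini. On recent tox releases, an unsatisfied `requires` triggers provisioning: tox installs the requested version into a `.tox/.tox` venv and re-executes itself under it, which matches the "install itself in a venv" behaviour mentioned earlier.]

```ini
[tox]
# pin the tox version used to run this project's environments;
# if the installed tox does not satisfy this, tox provisions a
# matching version in .tox/.tox and re-runs itself under it
requires =
    tox<4
```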
sean-k-mooney | gmann: stephenfin i have got the os-vif functional tests working locally | 18:02 |
sean-k-mooney | it looks like we need CAP_DAC_OVERRIDE on ubuntu 22.04 | 18:03 |
sean-k-mooney | without that vsctl and some other commands fail | 18:03 |
sean-k-mooney | CAP_NET_ADMIN used to work | 18:03 |
sean-k-mooney | i have some other changes locally so im going to see if they are required or not | 18:04 |
sean-k-mooney | its probably because i am not a member of the openvswitch group but it also fails for ip link commands | 18:04 |
sean-k-mooney | so i think this has to do with distro packaging and how the groups are configured | 18:05 |
sean-k-mooney | so while i dont like adding CAP_DAC_OVERRIDE that is probably what we will need to do | 18:05 |
sean-k-mooney | what im less happy about is this is only required when using the vsctl ovs backend which is deprecated | 18:06 |
sean-k-mooney | so i might use a different privsep context based on the driver to limit the scope of the change. | 18:07 |
sean-k-mooney | actually i think the change is less invasive than that and only in the test code | 18:20 |
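[editor's note: background on the fix being discussed. CAP_DAC_OVERRIDE lets a process bypass discretionary file permission checks, which is why it works around the openvswitch group membership issue above. A standard Linux check (not from the log) to see which capabilities a process actually holds:]

```shell
# Print the capability sets of the current process (Linux).
# CapEff is the effective set that permission checks consult;
# CAP_DAC_OVERRIDE is capability bit 1, CAP_NET_ADMIN is bit 12.
grep '^Cap' /proc/self/status
```

The hex masks can be decoded into capability names with `capsh --decode=<mask>` from libcap, where the distro packages it.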
opendevreview | sean mooney proposed openstack/os-vif master: add CAP_DAC_OVERRIDE to test privsep contexts https://review.opendev.org/c/openstack/os-vif/+/869500 | 18:34 |
sean-k-mooney | gibi: stephenfin gmann ^ i think that will fix the functional job and unblock https://review.opendev.org/c/openstack/os-vif/+/868420 and https://review.opendev.org/c/openstack/os-vif/+/861468 | 18:35 |
sean-k-mooney | once we have those 3 commits merged we may want to consider an os-vif release | 18:36 |
gmann | sean-k-mooney: thanks, will keep eyes on gate result | 18:54 |
darkhorse | Hi team, I would like to resize a shelved_offloaded instance. The use case is that when an instance with a pci device is shelved_offloaded and the device is broken, it fails to unshelve. However, as a user, I would like to recover the data in the instance, so I would like to change the flavor to a new one that does not have pci cards. | 19:09 |
darkhorse | Is there a quick workaround to this? | 19:09 |
darkhorse | Thank you in advance for any help! | 19:10 |
opendevreview | Danylo Vodopianov proposed openstack/nova master: Napatech SmartNIC support https://review.opendev.org/c/openstack/nova/+/859577 | 19:19 |
dansmith | melwitt: around? | 19:24 |
melwitt | dansmith: o/ | 19:35 |
dansmith | hey so, | 19:35 |
dansmith | I don't really know what I was going to say | 19:35 |
dansmith | part of it was that now that I've implemented compute undelete later in the series, | 19:35 |
dansmith | I'm failing a couple more regression tests, | 19:35 |
dansmith | but around ironic because of all the hash ring rebalance weirdness | 19:36 |
dansmith | unfortunately I think those are going to have to change a bit as well, which makes me nervous, | 19:36 |
dansmith | but they're asserting things like "this shouldn't create another thing, but it does, so assert that it happens, and then assert that it goes away later" sort of stuff | 19:36 |
dansmith | which is kinda expected and kinda why we're doing this, and the ironic hash-ring-ectomy thing | 19:37 |
dansmith | but I dunno, I guess I just want to say... I hope you're going to check all my work :P | 19:37 |
melwitt | dansmith: ack, that's the intent (to check everything) :) thanks for the heads up | 19:40 |
dansmith | I know | 19:40 |
dansmith | but this rabbit hole is deep, the tea tastes funny, and everyone is wearing strange hats | 19:41 |
melwitt | haha, I hear you (and am not surprised) | 19:42 |
opendevreview | Danylo Vodopianov proposed openstack/nova master: Napatech SmartNIC support https://review.opendev.org/c/openstack/nova/+/859577 | 19:49 |
sean-k-mooney | darkhorse: no quick workaround. resize is currently not supported while shelved_offloaded but we have discussed that it could be supported in the future | 19:52 |
sean-k-mooney | darkhorse: i think artom has already fixed the issue with pci devices and shelve however | 19:53 |
sean-k-mooney | so i dont think that happens any more | 19:53 |
darkhorse | sean-k-mooney: do you have a link to the patchset? when you say the issue is fixed, does that mean you can unshelve instances even if pci card is broken or unavailable? | 20:06 |
opendevreview | Danylo Vodopianov proposed openstack/nova master: Napatech SmartNIC support https://review.opendev.org/c/openstack/nova/+/859577 | 20:12 |
artom | sean-k-mooney, IIUC darkhorse wants to resize a shelved_offloaded instance | 20:14 |
artom | Which IIUC is... not a thing? | 20:14 |
artom | As in, you have to unshelve first, and then resize? | 20:14 |
artom | And yeah, unshelve with PCI has been backported to... I want to say Ussuri? | 20:14 |
artom | Or maybe wallaby | 20:14 |
darkhorse | sean-k-mooney: If you can share the link of the discussion of the resize support for shelved instances, it would be helpful. I will take a look and work on it. | 20:14 |
darkhorse | artom: Do you mean you can unshelve instance when pci card is broken/unavailable in Ussuri or Wallaby? | 20:15 |
artom | darkhorse, https://review.opendev.org/q/Icfa8c1d6e84eab758af6223a2870078685584aaa | 20:16 |
artom | wallaby | 20:16 |
darkhorse | artom: We are operating on xena. So if I understood you correctly, all I need to do to allow users to unshelve a pci instance even if the card is broken/unavailable is to backport this patch to xena, is that correct? | 20:19 |
artom | darkhorse, no, you should be set. Xena is after wallaby :) | 20:19 |
artom | When the master patch merged, master was xena | 20:20 |
artom | darkhorse, hold on though - define "card is broken/unavailable"? | 20:20 |
artom | The unshelve will attempt to find a PCI card that fits the port (if it's a Neutron SRIOV port) or the flavor | 20:20 |
artom | But... if no such cards are available, then it will (legitimately) fail to schedule | 20:21 |
darkhorse | artom: not neutron SRIOV but fpga device. | 20:21 |
artom | So flavor PCI passthrough... | 20:21 |
darkhorse | right! | 20:21 |
artom | That should just... work. Off the top of my head I don't recall any issues with PCI and unshelve | 20:22 |
darkhorse | artom: no it fails to unshelve because the pci device is unavailable. | 20:23 |
artom | Unavailable how? It got pulled from the server? :) | 20:23 |
darkhorse | in that case, i would like to either snapshot or resize the instance so that I don't lose the data inside it. | 20:23 |
darkhorse | either because the card is occupied by another instance or physically broken | 20:24 |
darkhorse | artom: did i answer your question? | 20:32 |
artom | darkhorse, ah, I think I see. If you can't unshelve the instance because the cloud lacks the resources the instance needs (in this case, a PCI card), you'd like to be able to boot it regardless with its disk intact, just without the PCI device | 20:34 |
artom | So a shelved_offloaded instance lives as an image in Glance | 20:34 |
artom | IIRC you should just be able to boot a new instance from that image? | 20:34 |
artom | If keeping the same UUID is important to you though, you're out of luck I believe :( | 20:35 |
darkhorse | artom: the point is i want to recover the data inside the instance. if i boot a new instance, i think i am not able to get the data? | 20:36 |
artom | If it's been shelved offloaded, its disk has been uploaded to Glance as an image. | 20:37 |
artom | But... if you want data to persist, the "real" solution is to use volumes | 20:37 |
darkhorse | artom: will you elaborate? i was thinking of snapshotting or resizing with a new flavor that does not have pci so that i can unshelve. | 20:38 |
artom | darkhorse, elaborate on which aspect? Volumes, or booting from the Glance image? | 20:39 |
darkhorse | 1. if booting from glance image will save the data 2. volumes | 20:40 |
darkhorse | artom:1. if booting from glance image will save the data 2. volumes | 20:41 |
artom | darkhorse, it's been a while since I've done this, but a shelved_offloaded instance will have its disk uploaded as an image in Glance | 20:42 |
artom | I believe you can just boot from that image with `openstack server create --image <image uuid> <etc>` | 20:42 |
artom | And for volumes... you create a volume, attach it to your instance | 20:42 |
artom | Inside the guest you mount it as /data or whatever | 20:43 |
artom | And then anything in /data will live on the volume, so even if the instance is deleted, that volume persists and can be attached to other instances | 20:43 |
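[editor's note: the two recovery paths artom outlines, sketched with standard openstack CLI commands. Every name and UUID below is a placeholder; the image is the shelve snapshot Nova uploaded to Glance when the instance was offloaded.]

```shell
# Path 1: recover the data by booting a new server from the
# shelve snapshot image (placeholder names/UUIDs throughout).
openstack image list
openstack server create --image <shelve-snapshot-uuid> \
    --flavor <flavor-without-pci> recovered-vm

# Path 2, for next time: keep important data on a volume so it
# survives the instance.
openstack volume create --size 50 data-vol
openstack server add volume recovered-vm data-vol
# inside the guest: create a filesystem once, then mount it, e.g.
#   mkfs.ext4 /dev/vdb && mount /dev/vdb /data
```

The volume path matches artom's point: data on an attached volume persists independently of the instance's lifecycle.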
darkhorse | artom: the point is to recover the data in the instance. what should i do when instance is not able to get unshelved? | 20:44 |
artom | I'm not sure how much more clear I can be | 20:45 |
artom | <artom> darkhorse, it's been a while since I've done this, but a shelved_offloaded instance will have its disk uploaded as an image in Glance | 20:45 |
artom | <artom> I believe you can just boot from that image with `openstack server create --image <image uuid> <etc>` | 20:45 |
darkhorse | artom: ok thank you! let me try that. | 20:46 |
opendevreview | Danylo Vodopianov proposed openstack/os-vif master: MTU support for DPDK port added https://review.opendev.org/c/openstack/os-vif/+/859574 | 21:01 |
*** dasm is now known as dasm|off | 21:37 | |
opendevreview | Ghanshyam Mann proposed openstack/python-novaclient master: DNM: test tox<4 pinning in stable branches https://review.opendev.org/c/openstack/python-novaclient/+/869516 | 23:57 |
opendevreview | Ghanshyam Mann proposed openstack/osc-placement stable/zed: DNM: test tox<4 pinning in stable branches https://review.opendev.org/c/openstack/osc-placement/+/869517 | 23:59 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!