*** dasm|off is now known as Guest9353 | 04:02 | |
opendevreview | Amit Uniyal proposed openstack/placement master: Bugtracker link update https://review.opendev.org/c/openstack/placement/+/876768 | 05:38 |
---|---|---|
frickler | melwitt: that at least sounds plausible, thanks for checking | 06:03 |
opendevreview | Tobias Urdin proposed openstack/nova master: Remove libvirt tunnelled migration https://review.opendev.org/c/openstack/nova/+/879021 | 06:18 |
opendevreview | Tobias Urdin proposed openstack/nova master: Remove libvirt tunnelled migration https://review.opendev.org/c/openstack/nova/+/879021 | 09:34 |
bauzas | if people are OK, I'd like to add a functest for testing cross_az_attach https://review.opendev.org/c/openstack/nova/+/878948 | 09:35 |
gibi | bauzas: I will be absent from the vPTG from 15:00 UTC hopefully only for half an hour due to a downstream call | 09:41 |
bauzas | ack | 09:44 |
bauzas | we'll discuss with glance | 09:44 |
bauzas | see the agenda I posted on the ML | 09:44 |
bauzas | we'll discuss with glance *at that time* | 09:44 |
* bauzas runs errand by now | 09:45 | |
gibi | ack | 10:15 |
opendevreview | yatin proposed openstack/nova master: [DNM] Test lower tb cache https://review.opendev.org/c/openstack/nova/+/868419 | 10:27 |
tobias-urdin | i have a conundrum that i cannot wrap my head around, i've been looking at the possibility of removing the need for remotefs (rsync/scp over ssh) when doing live migrations, that should be possible but requires some RPC changes to get rid of testing if instance dir is on shared storage and some other stuff, just out of curiosity i checked the | 10:41 |
tobias-urdin | remotefs usage all around and we also use it for config drive migrations (as qemu (libvirt blocks us) does not allow migrating read-only ISO source file), fetching kernel and ramdisk, copying vtpm data, copying cached images from other compute nodes(?) etc | 10:41 |
tobias-urdin | so while what I want to do, to remove the need for SSH key distribution for live migrations is possible, getting it completely removed in libvirt driver seems hard, because I cannot wrap my head around what we could replace that logic with in terms of copying files around | 10:42 |
tobias-urdin | if only libvirt had a copy file implementation in their protocol we wouldn't need anything else for remote access and could get tls etc, theoretically could be done with block devices using storage pools but that would be even worse imo | 10:43 |
* tobias-urdin brain hurts | 10:43 | |
jrosser | tobias-urdin: key distribution can be avoided by using signed keys | 11:13 |
zigo | bauzas: You remember I couldn't do some live-migration with some of my VMs? It turns out that: | 11:21 |
zigo | - It only happens with SOME images, like Rocky Linux 8.7 | 11:21 |
zigo | - upgrading both source and destination from Qemu 5.2 to 7.2 (from bullseye-backports) fixes the issue ! \o/ | 11:21 |
tobias-urdin | jrosser: yeah, but there's also more to it than that, what about lateral movement between compute nodes; being able to essentially wipe instance disks of another compute node, or what about an operating system with no ssh running, sure could run ssh in a container and expose nova dir but then point #1 still applies, hard problem | 11:23 |
sean-k-mooney1 | zigo: glad it's fixed but odd that it depended on the image | 11:38 |
zigo | sean-k-mooney1: According to the people from #qemu, it's likely that the image is using some new feature of the virtio stuff. | 11:39 |
zigo | What's weird is that with Rocky Linux 9, I didn't have the issue... | 11:39 |
sean-k-mooney1 | it's probably using the transitional virtio device or something like that | 11:39 |
sean-k-mooney1 | as the driver is probably negotiating an older feature set | 11:40 |
*** sean-k-mooney1 is now known as sean-k-mooney | 11:40 | |
zigo | sean-k-mooney1: I'll upgrade all my cluster to the newer version of Qemu and will migrate (non-live) the problematic instances ... | 11:40 |
zigo | Still very annoying, but lucky, only very few instances are affected. | 11:41 |
kashyap | zigo: Ahh, so upgrading the source and dest QEMU to 7.2 fixes -- did you test it? | 12:29 |
kashyap | Well, you did test it, otherwise, you wouldn't put that "\o/" | 12:31 |
bauzas | zigo: sorry was taking a bit of time off before the vPTG | 12:36 |
bauzas | (after the usual child taxi for lunch :p ) | 12:36 |
bauzas | as I said to the PTGbot, we start at 1pm UTC with the neutron x-p session in the Neutron room (juno, ie. https://www.openstack.org/ptg/rooms/juno ) | 12:39 |
artom | sean-k-mooney, I'll be late for that (son's dentist appointment), can you cover the delete_on_termination stuff? | 12:47 |
artom | IIRC you're the only other person with the context | 12:47 |
sean-k-mooney | am yes i can | 12:49 |
artom | Cheers! | 12:49 |
*** whoami-rajat__ is now known as whoami-rajat | 13:28 | |
bauzas | dvo-plv: are you in the neutron room ? | 13:30 |
bauzas | dvo-plv: we're discussing your topic now | 13:30 |
dvo-plv | yes, thank you | 13:37 |
bauzas | break now until 3pm UTC and then please join the cinder room : https://bluejeans.com/556681290 | 14:42 |
bauzas | also, I had a topic about Xena EM, we'll discuss this at next meeting | 14:42 |
bauzas | (unless people have concerns by now, fer sur) | 14:43 |
elodilles | bauzas: ack, we can have a couple of words about it o/ | 14:43 |
bauzas | elodilles: tbh, I need to look at the current open changes | 14:44 |
sean-k-mooney | bauzas: what room should we be in next | 14:44 |
sean-k-mooney | well now | 14:44 |
bauzas | sean-k-mooney: cinder room, 3pm UTC | 14:45 |
bauzas | we have a break now | 14:45 |
elodilles | bauzas: yes, there are plenty of open patches: https://review.opendev.org/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/placement+OR+project:openstack/nova)+branch:stable/xena | 14:45 |
bauzas | that was my ask before we left | 14:45 |
elodilles | the question is though whether anyone sees anything that should be part of the 'final-before-em' release of stable/xena | 14:46 |
dansmith | sean-k-mooney: what is the qemu security issue that started causing detach issues that you mentioned on the list? | 14:46 |
sean-k-mooney | dansmith: the original motivation for the change in qemu was related to a security issue i believe | 14:46 |
sean-k-mooney | i don't actually know the details | 14:46 |
dansmith | sean-k-mooney: but what's the change? I wasn't aware anything changed (intentionally) | 14:46 |
bauzas | was in libvirt 8, right? | 14:46 |
* bauzas missed the regression index | 14:47 | |
sean-k-mooney | it was undefined behavior if you could retry detach while it was in progress | 14:47 |
sean-k-mooney | they intentionally made it an error and have it abort the in-progress detach | 14:47 |
bauzas | elodilles: I can raise the question to the nova team by email | 14:47 |
elodilles | bauzas: yepp, that is perfectly OK i think | 14:47 |
bauzas | elodilles: and we could conclude on the next weekly meeting | 14:47 |
sean-k-mooney | in old versions of qemu it would actually try detaching again | 14:47 |
*** Guest9353 is now known as dasm | 14:48 | |
elodilles | bauzas: ++ | 14:48 |
dansmith | sean-k-mooney: that's not actually causing us trouble though right? if we needed to retry the detach it probably wasn't working anyway, right? | 14:48 |
bauzas | elodilles: or the next one, I've seen the deadline for approving | 14:48 |
dansmith | sean-k-mooney: ah meaning we're never sending the acpi event anymore after the first one? | 14:48 |
sean-k-mooney | correct, after the first one it never gets sent again but also our retry mechanism would stop the detach from proceeding | 14:49 |
sean-k-mooney | i.e. if it was just slow detaching and we retried, it would abort the detach | 14:49 |
elodilles | bauzas: though we should not postpone the release close to the transition, otherwise if we hurry and merge things and something will be broken then we cannot fix it anymore ;) | 14:49 |
dansmith | hmm, okay | 14:49 |
sean-k-mooney | dansmith: gibi swapped us from blind retries on a timeout/interval to trying to use qemu events | 14:49 |
sean-k-mooney | or maybe that was lee | 14:50 |
sean-k-mooney | in either case that was not enough to resolve this issue | 14:50 |
dansmith | okay, I guess I didn't know about this other detail | 14:50 |
sean-k-mooney | it just seems like we need to kick the vms several times to get the detach to work | 14:50 |
bauzas | elodilles: yeah that's understandable, we shall not be lazy | 14:51 |
dansmith | if it's a matter of the first event getting missed or something, that definitely *could* support the "use a real distro" argument I guess | 14:51 |
dansmith | if the first one fails, is there some way the "in progress"-ness gets reset such that allowing a retry *ever* works? | 14:51 |
bauzas | elodilles: despite (tbh), I'm like ~0% interested about the Xena branch :) | 14:51 |
bauzas | actually, EM is helping my work :) | 14:51 |
sean-k-mooney | dansmith: currently i believe there is no way to reset the state without restarting the vm | 14:52 |
sean-k-mooney | https://gitlab.com/libvirt/libvirt/-/issues/309 | 14:52 |
dansmith | does running the guest agent allow us to do it that way instead of just acpi? | 14:52 |
dansmith | thanks, I'll brush up on that bug | 14:53 |
sean-k-mooney | good question i do not think so but maybe | 14:53 |
sean-k-mooney | i have never really looked at what the guest agent can actually do | 14:53 |
sean-k-mooney | i know it has some filesystem apis to freeze them | 14:53 |
sean-k-mooney | dansmith: the other thing to keep in mind is it's not always acpi | 14:54 |
sean-k-mooney | when you change to q35 we started to use pcie native hotplug instead | 14:54 |
sean-k-mooney | that proved to be buggy so they went back to acpi | 14:54 |
dansmith | for disks? but I'm using acpi as a stand-in.. I guess for disks I figured it was an eject request or something | 14:54 |
sean-k-mooney | im not sure if they have changed back to native pcie hotplug or if it still uses acpi | 14:54 |
sean-k-mooney | there are 2 ways to signal it; for the pc machine type it used ahci interrupts because you only had a pci bus not pcie | 14:55 |
sean-k-mooney | pcie has its own hotplug mechanism and qemu tried to use that instead | 14:55 |
elodilles | bauzas: we had a xena release this year, so at least we are mostly good ;) | 14:55 |
sean-k-mooney | then they hit bugs and went back to ahci | 14:56 |
sean-k-mooney | for virtio-blk each disk is a pci device | 14:56 |
sean-k-mooney | for virtio-scsi then they are not | 14:56 |
sean-k-mooney | they are scsi devices connected to the controller | 14:56 |
bauzas | reminder : we restart in 3 mins, cinder room | 14:57 |
sean-k-mooney | speaking of, we could try using virtio-scsi i guess | 14:57 |
sean-k-mooney | just set hw_disk_bus=scsi in devstack | 14:57 |
sean-k-mooney | i don't think that helps as i think it's the ahci path that's buggy but it's an option to try | 14:58 |
sean-k-mooney | dansmith: sorry for the context dump :) | 14:58 |
whoami-rajat | https://redhat.bluejeans.com/556681290 | 15:00 |
sean-k-mooney | dansmith: ^ are you joining that by the way | 15:01 |
dansmith | sean-k-mooney: nope, tc now.. | 15:01 |
sean-k-mooney | ah ok | 15:01 |
sean-k-mooney | we can recap the direct image location conversation | 15:02 |
bauzas | senrique: oh you just joined, cool thanks | 15:46 |
*** whoami-rajat__ is now known as whoami-rajat | 15:53 | |
bauzas | senrique: a few docs so :) | 15:55 |
bauzas | senrique: this is our overall process workflow https://docs.openstack.org/nova/latest/contributor/process.html#how-do-i-get-my-code-merged | 15:56 |
senrique | bauzas, hey :) | 15:56 |
bauzas | tl;dr: create a blueprint on https://blueprints.launchpad.net/nova/ | 15:56 |
bauzas | then, once you think you have time to attend a specific nova meeting, add your topic to the weekly meeting agenda https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting (in the open discussion last topic) | 15:57 |
bauzas | and then, we'll discuss it on the subsequent meeting, asking you a few details and if we agree, then we just approve your blueprint desing (from a procedural pov) | 15:57 |
bauzas | design* even | 15:58 |
bauzas | then, you're free to work on the implementation patches and ask for reviews | 15:58 |
bauzas | senrique: one example is https://meetings.opendev.org/meetings/nova/2023/nova.2023-01-17-16.00.log.html#l-287 | 15:59 |
bauzas | break until 1620UTC | 16:10 |
bauzas | and then, see you back on https://www.openstack.org/ptg/rooms/diablo | 16:10 |
* gibi is back for the break, best timing ever | 16:11 | |
senrique | thank you bauzas!! | 16:19 |
opendevreview | sean mooney proposed openstack/nova master: [DNM] testing enableind discard by default https://review.opendev.org/c/openstack/nova/+/879077 | 18:41 |
opendevreview | Merged openstack/nova stable/xena: Accept both 1 and Y as AMD SEV KVM kernel param value https://review.opendev.org/c/openstack/nova/+/843938 | 19:04 |
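The detach behaviour discussed between 14:46 and 14:52 can be illustrated with a toy model. This is only a sketch of the semantics as described in the log (a second detach request during an in-progress detach is an error, aborts that detach, and later requests no longer reach the guest); it is not verified QEMU or libvirt behaviour. `ToyDevice`, `DetachInProgress`, `blind_retry`, and `wait_for_event` are hypothetical names invented for this example, and none of them are Nova, libvirt, or QEMU APIs.

```python
class DetachInProgress(Exception):
    """Raised by the toy device when a detach is requested twice."""


class ToyDevice:
    """Hypothetical stand-in for a guest disk; not a libvirt or QEMU API."""

    def __init__(self, ticks_to_detach):
        self.ticks_left = ticks_to_detach  # guest time needed to finish a detach
        self.in_progress = False
        self.stuck = False  # assumption from the log: no reset without a VM restart
        self.attached = True

    def request_detach(self):
        if self.stuck:
            return  # per the log, later requests are never sent to the guest
        if self.in_progress:
            # Per the log: the retry is an error and aborts the in-progress detach.
            self.in_progress = False
            self.stuck = True
            raise DetachInProgress("retry aborted the in-progress detach")
        self.in_progress = True

    def tick(self):
        """Advance guest time one step; return True when the detach completes."""
        if not self.in_progress:
            return False
        self.ticks_left -= 1
        if self.ticks_left > 0:
            return False
        self.in_progress = False
        self.attached = False
        return True


def blind_retry(device, retry_every, max_ticks):
    """Re-send the detach request on a fixed interval, ignoring progress."""
    device.request_detach()
    for tick in range(1, max_ticks + 1):
        if device.tick():
            return True
        if tick % retry_every == 0:
            try:
                device.request_detach()
            except DetachInProgress:
                pass  # the slow but healthy detach has just been aborted
    return False


def wait_for_event(device, max_ticks):
    """Send one request, then only wait for the completion event."""
    device.request_detach()
    for _ in range(max_ticks):
        if device.tick():
            return True
    return False


if __name__ == "__main__":
    print("event wait, slow guest:", wait_for_event(ToyDevice(5), max_ticks=20))
    print("blind retry, slow guest:", blind_retry(ToyDevice(5), retry_every=2, max_ticks=20))
```

In this model a guest that needs five ticks to detach succeeds when the caller only waits, and fails when the caller retries every two ticks. As noted in the log, moving to events was not enough on its own to resolve the real issue, so the sketch only shows why blind retries can make a slow detach worse.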