14:04:14 #startmeeting cinder
14:04:14 Meeting started Wed Aug 13 14:04:14 2025 UTC and is due to finish in 60 minutes. The chair is jbernard. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:04:14 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:04:14 The meeting name has been set to 'cinder'
14:04:27 jungleboyj rosmaita smcginnis tosky whoami-rajat m5z e0ne geguileo eharney jbernard hemna fabiooliveira yuval tobias-urdin adiare happystacker dosaboy hillpd msaravan sp-bmilanov Luzi sfernand simondodsley: courtesy reminder
14:04:34 #link https://etherpad.opendev.org/p/cinder-flamingo-meetings
14:04:37 o/
14:04:37 #topic roll call
14:04:42 o/
14:04:44 o/
14:04:45 o/
14:04:47 o/
14:04:51 o/
14:05:02 o/
14:05:47 o/
14:05:59 o/
14:06:00 o/
14:06:05 o/
14:06:05 o/
14:06:20 o-/
14:06:24 o/
14:07:35 hello everyone
14:08:10 i see a nova spec being added to the agenda, while that happens some quick reminders
14:08:37 Monday, Aug 18 is our midcycle/review session at 1400 UTC (this slot)
14:09:24 #link https://releases.openstack.org/flamingo/schedule.html
14:09:40 * sp-bmilanov is adding things at the last moment
14:10:20 (I can elaborate a bit more when I have the floor)
14:10:44 sp-bmilanov: sure, go ahead
14:11:48 hi, thanks, so we hit a case where, during a live migration of an instance, the OOM killer kills the nova agent at the source, but in just the right moment so that the instance is now running at the destination
14:11:57 but it is not reflected in OpenStack's state
14:12:11 and when you boot the instance again, it gets started on the source hypervisor
14:12:35 leading to data corruption for the volumes that are attached at both the source and destination
14:12:42 yikes
14:13:12 I've brought this up with Nova, hence the spec, but I was wondering if there is something more we can do to eliminate double-attachments as a whole
14:14:00 the StorPool storage system can force-detach from all but the instance that is being powered on (the linked change), so at least you don't have data corruption
14:14:25 (https://review.opendev.org/c/openstack/os-brick/+/940245)
14:14:47 what I wanted to discuss is: does it make sense to generalize this a bit and ship it as something other drivers can implement?
14:15:23 (something like a hook that storage systems, where possible, can implement to check that certain invariants hold)
14:16:50 ok, wrong link -- https://review.opendev.org/c/openstack/os-brick/+/957117
14:17:53 and the context in which it can be called -- https://review.opendev.org/c/openstack/nova/+/957119
14:18:43 currently, it makes a lot of assumptions, might not cover edge cases, etc, but the spirit of the change is "consult the storage system that it's ok to boot"
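
(For illustration, a minimal sketch of the generalized hook being proposed above, assuming a backend that can report and revoke attachments. Nothing here is existing os-brick or StorPool API; get_attached_hosts() and force_detach() are hypothetical backend capabilities, and a real version would need to handle multiattach volumes, which the meeting flags as an open question:)

    class ExclusiveAttachmentMixin:
        """Hypothetical mixin for backends that can revoke attachments."""

        def get_attached_hosts(self, volume_id):
            """Backend-specific: list hosts the volume is exported to."""
            raise NotImplementedError

        def force_detach(self, volume_id, host):
            """Backend-specific: revoke the export for one host."""
            raise NotImplementedError

        def ensure_exclusive_attachment(self, volume_id, expected_host):
            """Force-detach volume_id everywhere except expected_host.

            Meant to run right before an instance is powered on, so a
            half-finished live migration cannot leave the volume writable
            on two hypervisors at once.
            """
            for host in self.get_attached_hosts(volume_id):
                if host != expected_host:
                    # Revoking the export makes stale I/O from the old
                    # attachment fail instead of corrupting data.
                    self.force_detach(volume_id, host)
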
14:19:14 this will certainly require nova's input,
14:19:31 the source of the bug is restarting on the source when an instance is in a transitional state, no?
14:20:27 in a transitional state from the PoV of OpenStack, yes, from what I've gathered, the instance was fully migrated on the destination, there was just some bookkeeping and cleanup left on the source
14:21:06 we support multiattach, so there are at least some cases where eliminating this would not be desired... (im reading as fast as i can, there's quite a bit to think about)
14:21:42 yep, no rush, I just wanted to get the idea out there
14:21:59 and yes, I am not sure how this will mesh with multi-attach
14:22:03 sp-bmilanov: if finalizing cleanup is necessary, wouldn't bringing up the instance again on the source be the wrong thing to do?
14:23:18 yep, but from what I gathered, a restart was issued, which made OpenStack recreate the instance at the source
14:24:07 (there are also discussions to look into maybe adding more obstacles if OpenStack detects that a migration has failed)
14:24:18 more obstacles to get a VM restarted
14:24:20 but i would hope a restart on a migrated instance would be able to correct for this case
14:24:57 it's a known defect, the source doesn't check if the instance is running on the destination
14:25:01 im resistant to adding additional logic to cinder if nova is capable of improving the restart logic (as a general idea)
14:25:44 (the spec will aim to address this and maybe describe that the source checking the destination for running instances is the way to go)
14:25:49 i think the source of the problem is the best place to address it, instead of adding safeguards in cinder to work around it (if possible)
14:26:41 makes sense
14:26:49 i want to hear nova's take on it
14:28:22 I will bring it up again on the Nova weekly next week
14:28:44 sp-bmilanov: what is the lp bug for this? im not seeing it
14:29:26 https://bugs.launchpad.net/nova/+bug/2092391
14:29:56 ahh, sean has some comments on the spec, need to look at that closer
14:29:58 sp-bmilanov: thanks
14:30:50 you're welcome, yes, and https://meetings.opendev.org/irclogs/%23openstack-nova/%23openstack-nova.2025-07-29.log.html https://meetings.opendev.org/irclogs/%23openstack-nova/%23openstack-nova.2025-07-22.log.html
14:31:26 I need to update the spec proposal with what we've discussed in the meets, but the focus in the spec is the cleanup enhancement
14:31:46 #action sp-bmilanov to raise restart/migration issue in nova meeting and report back
14:32:19 sp-bmilanov: it could be that nova cannot handle it any better, i just want to understand
14:32:59 it can, and in that specific case it will solve the issue, but I was wondering if a more general approach would be even better
14:33:22 in order to avoid a potential next double-attach-start situation
14:33:51 "in that specific case it will solve the issue" it = a potential enhancement to the cleanup
14:35:28 ok, im interested in addressing the immediate issue first
14:35:42 ack
14:36:13 im skeptical of solutions in need of problems, but it's certainly something we can discuss, perhaps you have a reproducer or some additional analysis
14:37:11 #topic open discussion
14:37:21 hey
14:37:30 when is feature freeze for 2025.2?
14:38:02 https://releases.openstack.org/flamingo/schedule.html#f-ff
14:38:02 did I miss it already?
14:38:06 Hello,
14:38:06 R-5 i believe
14:38:07 https://review.opendev.org/c/openstack/cinder/+/951829
14:38:11 hi i just wanted to know: how is the review state for the image encryption patches? Is there anything we still need to do?
14:38:23 yuval: not yet
14:38:37 28.8?
14:38:40 Aug 25 - Aug 29
14:38:42 Luzi: i need another core reviewer
14:39:25 Hey everyone, I have what I think is a basic question here but is there detailed documentation on how to modify an existing gerrit merge request submitted by someone else for cinder specifically? I've checked here https://docs.openstack.org/project-team-guide/review-the-openstack-way.html#modifying-a-change and here https://docs.opendev.org/opendev/infra-manual/latest/developers.html#updating-a-change but con
14:39:34 https://review.opendev.org/c/openstack/cinder/+/955054 raised the patch 10 days back. Please help with the review
14:39:55 Luzi: i was hoping to get it merged early so that we could address any regressions should they arise, but everyone is quite busy this cycle. im still hoping to get it merged, ill try to raise it in Monday's meeting
14:39:58 folks, will we talk about https://review.opendev.org/c/openstack/os-brick/+/955379 (agenda)?
14:40:10 thank you jbernard
14:40:16 Need review for the Cinder plugin registration https://review.opendev.org/c/openstack/cinder/+/951829. I have updated the patch and removed the OS version, as it had GDPR implications.
14:41:23 anthonygamboa: merge request, as in: a patch submitted for review?
14:42:18 anthonygamboa: anyone can update an existing patch, you can use git-review and the author and uploader will then show different values
14:43:04 anthonygamboa: but it will kind of overwrite someone else's work, so make sure you're being helpful and it's the right action to take
14:44:02 pedrovlf: sure, what's up?
14:45:22 Hi jbernard, we would like a review on the patch we submitted, cc: hillpd
14:45:44 We found an issue in os-brick where multiple iSCSI logins per path aren't handled correctly. The current logic stops after finding the first session. In practice, we found that this can prevent the clean-up of devices configured under other sessions.
14:46:14 pedrovlf: yes, im aware, i haven't had time to look at it yet
14:46:25 We have a proposed fix and were looking for feedback on this solution. Re: https://review.opendev.org/c/openstack/os-brick/+/955379
14:46:30 okay, thanks
14:47:00 also, we described the full situation in https://bugs.launchpad.net/os-brick/+bug/2116553
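
(For illustration, the shape of the problem pedrovlf describes: a lookup that returns after the first matching iSCSI session will miss devices created under additional logins to the same portal and target IQN. This is not the code from the os-brick patch, only a sketch of collecting every matching session from standard "iscsiadm -m session" output; see the review link above for the actual fix:)

    import re
    import subprocess

    # "iscsiadm -m session" lines look like:
    #   tcp: [7] 192.0.2.10:3260,1 iqn.2004-04.com.example:target0 (non-flash)
    SESSION_RE = re.compile(
        r'^\w+: \[(?P<sid>\d+)\] (?P<portal>\S+?),\d+ (?P<iqn>\S+)')

    def find_matching_sessions(portal, iqn):
        """Return every session id for (portal, iqn), not just the first."""
        out = subprocess.run(['iscsiadm', '-m', 'session'],
                             capture_output=True, text=True).stdout
        sessions = []
        for line in out.splitlines():
            m = SESSION_RE.match(line)
            if m and m.group('portal') == portal and m.group('iqn') == iqn:
                # Keep scanning: multiple logins per path are possible, and
                # each session may have its own devices needing cleanup.
                sessions.append(m.group('sid'))
        return sessions
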
14:47:57 ok, the patch may well be in fine shape, it just needs someone with time to take a look
14:48:36 Hello Joe, can you please review this Cinder plugin registration: https://review.opendev.org/c/openstack/cinder/+/951829
14:48:46 pedrovlf, hillpd: we can raise this in Monday's session
14:48:48 * jungleboyj is looking at that one.
14:49:14 We have sent mail regarding the previous review comments.
14:49:34 thank you jbernard, let us know if you need any other detail
14:49:36 yep, i see it, it's in the review queue as well
14:49:41 Sandip: ^
14:49:52 pedrovlf: ok, thanks for reaching out
14:49:57 Thanks Joe.
14:54:01 ok, last call for problems, issues, dumpster fires, etc
14:54:19 good news is always welcome too :)
14:56:27 thanks everyone!
14:56:30 #endmeeting