14:00:49 <jbernard> #startmeeting cinder
14:00:49 <opendevmeet> Meeting started Wed Jul 30 14:00:49 2025 UTC and is due to finish in 60 minutes.  The chair is jbernard. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:49 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:49 <opendevmeet> The meeting name has been set to 'cinder'
14:01:03 <jbernard> jungleboyj rosmaita smcginnis tosky whoami-rajat m5z e0ne geguileo eharney jbernard hemna fabiooliveira yuval tobias-urdin adiare happystacker dosaboy hillpd msaravan sp-bmilanov Luzi sfernand simondodsley: courtesy reminder
14:01:13 <jayaanand> hi
14:01:14 <jbernard> #topic roll call
14:01:22 <jbernard> o/ hello everyone
14:01:27 <gireesh> o/
14:01:29 <Sai> o/
14:01:55 <nileshthathagar> hi
14:02:45 <jungleboyj> o/
14:03:02 <simondodsley> o/
14:04:04 <hvlcchao1> o/
14:04:46 <whoami-rajat__> Hi
14:04:51 <jbernard> #link https://etherpad.opendev.org/p/cinder-flamingo-meetings
14:04:52 <vdhakad> hi
14:05:43 <jbernard> welcome everyone,
14:05:51 <jbernard> this may be a short meeting, we'll see
14:06:08 <jbernard> #link https://releases.openstack.org/flamingo/schedule.html
14:06:16 <jbernard> ^ current flamingo schedule
14:07:26 <jbernard> whoami-rajat__: if you're here, i know simondodsley has/had some questions about the openstack sdk and some functions that seem to be missing
14:07:45 <simondodsley> yes, yes...
14:07:50 <jbernard> whoami-rajat__: i thought you might be most qualified to respond
14:08:14 <whoami-rajat__> Sure
14:09:43 <simondodsley> There is a list of cinder features not present in the SDK so I can't (easily) create Ansible modules for these in the OpenStack collection.
14:09:54 <simondodsley> I'd like to know if these will ever be added?
14:10:05 <simondodsley> The most concerning is lack of support for QoS
14:11:22 <simondodsley> I created this: https://etherpad.opendev.org/p/cinder-openstacksdk-gaps but there is a much bigger sheet on this subject as well
14:12:04 <whoami-rajat__> Openstacksdk is definitely missing some support, i was working on getting it up to date but my work ended after reaching parity with openstackclient which was some releases ago
14:12:31 <whoami-rajat__> Regarding the current state, i had prepared some google docs at that time which show what we have and what we don't
14:12:45 <whoami-rajat__> Currently i am not actively working on it due to other priority work
14:13:20 <whoami-rajat__> And we as a team decided that an intern could pick it up, but i havent heard about any outreachy or any other type of intern in a while
14:14:39 <whoami-rajat__> So to answer your question, i can't guarantee that the work will be completed, but i can surely spare some cycles for specific important operations if it's blocking work for vendors - that will require my manager's approval first
14:15:22 <simondodsley> As OpenStack becomes more important and gains greater visibility, the automation side of the house is coming under scrutiny, and the lack of Ansible modules for basic functions is an issue. QoS is a pretty fundamental requirement for MSPs, so this hole is glaring
14:15:51 <simondodsley> I'm happy to write the Ansible modules, but I need the SDK to have the functions to call
14:17:07 <whoami-rajat__> I understand and agree with your concerns, and it's not that hard to add the support; it's just time consuming to code and test
14:17:23 <jbernard> re an intern, i think outreachy is having some funding pressure, it's my impression that we were not likely to have an intern candidate in the near future
14:17:25 <simondodsley> Happy to work with you on this
14:18:04 <whoami-rajat__> Anyways, we can talk about it after the meeting. if you can share a list of a few important operations (like qos), i can initiate a discussion to get some time working on it
14:18:31 <whoami-rajat__> jbernard: ah, i thought so, thanks for the update
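For context on the gap under discussion: basic volume operations are already reachable through openstacksdk's block_storage proxy, while QoS specs (one of the etherpad items) are not, so an Ansible module author currently has to drop down to python-cinderclient. A minimal sketch of that workaround, assuming a clouds.yaml entry named "mycloud" and assuming the qos_specs calls behave as shown:

```python
# Sketch only: illustrates why the missing SDK support matters for automation.
# The openstacksdk calls below exist today; the python-cinderclient qos_specs
# usage is an assumption about that client's API, not a verified recipe.
import openstack
from cinderclient import client as cinder_client

conn = openstack.connect(cloud="mycloud")  # hypothetical clouds.yaml entry

# Volume listing is covered by the SDK's block_storage proxy.
for volume in conn.block_storage.volumes():
    print(volume.name, volume.status)

# QoS specs are not exposed by the SDK (per the etherpad gap list), so the
# same keystoneauth session has to be reused with python-cinderclient.
cinder = cinder_client.Client("3", session=conn.session)
qos = cinder.qos_specs.create("gold", {"total_iops_sec": "1000"})
print(qos.id)
```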
14:20:32 <jbernard> we don't have much in the way of topics today
14:20:36 <vdhakad> @Brian/ @core-members, IBM third party CI is up, requesting reviews on [IBM SVf driver] Adding support for temporary volumegroup | https://review.opendev.org/c/openstack/cinder/+/925450
14:21:06 <simondodsley> Add it to the list: https://etherpad.opendev.org/p/cinder-flamingo-reviews
14:22:05 <harsh> hi.. added a few more reviews for IBM SVf drivers in the list under the name "harsh". They are also part of the same volumegroup feature.
14:22:15 <Anoop_Shukla> Heads up - the NetApp team is working on support for the ASAr2 platform and there will be some PRs coming up for review. We are looking to support basic volume operations in Flamingo. Upcoming releases will have other operations supported.
14:22:52 <jbernard> vdhakad: the ci results would be more helpful if we could see what tests were run and what the outcome was, as opposed to the stdout dump
14:23:13 <simondodsley> So on that subject...
14:23:20 <jbernard> im personally looking at tobias' patches this week
14:23:47 <jbernard> specifically adding az support to backups
14:23:54 <simondodsley> Part of the issue with seeing the logs through gerrit is that Zuul has deprecated a parameter that allowed gerrit to get the correct URL location of the logs
14:24:33 <simondodsley> so it will only point to the zuul webserver url - this means anyone with a CI behind a firewall won't be able to expose the logs
14:24:35 <jbernard> simondodsley: have you inquired about what they envision as the proper way to see logs?
14:25:04 <simondodsley> they say that the zuul dashboard is the correct way - they completely ignored corporate security...
14:25:29 <simondodsley> the other option is to set up some reverse proxy, but again corporate security may have a negative opinion of that
14:25:38 <jbernard> do they have any advice? or we're on our own for that?
14:26:07 <simondodsley> the only advice is 'use a reverse proxy' - not particularly helpful
14:26:59 <simondodsley> we have managed to get the logs into a public github repo with a custom zuul role, but the issue is getting the gerrit response to contain that url
14:27:38 <vdhakad> jbernard: List of test cases can be found here: https://github.com/vp0410/IBM-Cinder-CI/blob/main/925450/37/fc_summary.log
14:27:44 <jbernard> simondodsley: it looks like vdhakad was successful, im looking at this one: https://review.opendev.org/c/openstack/cinder/+/925450
14:28:03 <jbernard> vdhakad: ahh, thank you
14:28:18 <jbernard> simondodsley: ^ could you do something like that?
14:28:35 <harsh> yes. the stdout has the test case summary for both iscsi and fc.
14:28:43 <simondodsley> would love to - vdhakad: can we talk about your config offline?
14:29:00 <jbernard> harsh: yep, i see the summary now, i had missed that earlier
14:29:34 <harsh> :) np
14:29:40 <vdhakad> simondodsley: Sure. I'll include vivek from my team, he understands the config best.
14:29:48 <simondodsley> thanks
14:30:14 <jbernard> if that works out, it would be really nice to capture that in a doc somewhere
14:31:43 <jbernard> #topic open discussion
14:34:22 <jbernard> if no one has anything else to add, we can end early
14:35:11 <Anoop_Shukla> I had a question about DR support on Cinder.
14:35:17 <jbernard> Anoop_Shukla: sure
14:35:59 <Anoop_Shukla> VMWare has a tool called SRM (Site Recovery Manager) that a lot of Virtualization customers are used to..
14:36:03 <gireesh> There is a list of pending patches for review; requesting the core team to look into those
14:36:17 <Anoop_Shukla> A lot of customers migrating from VMWare are asking for similar support on OpenStack
14:36:57 <Anoop_Shukla> Are there any integrations/tools that can provide similar DR (VM level recovery between two sites with two OpenStack deployments)?
14:37:07 <jbernard> Anoop_Shukla: im not familiar with what features SRM offers, but we have support for replication in participating drivers
14:37:17 <harsh> in a previous meeting we discussed having a review day for non-S reviews. Are there any updates on that?
14:37:34 <Anoop_Shukla> Right. But today recovery of the VMs is manual..
14:38:07 <jbernard> harsh: ahh, will send a mail about that today, thanks for reminding me
14:38:33 <harsh> :) np, thanks
14:38:39 <whoami-rajat__> Anoop_Shukla: yeah there is no automated way, but i have proposed documentation that will guide you step by step through recovering your workloads
14:38:58 <Anoop_Shukla> Okay. Is there a link to that document?
14:39:36 <whoami-rajat__> I will need to find it as i am afk, jbernard i remember you reviewing it, is it possible for you to post the link here?
14:39:45 <Anoop_Shukla> We wanted to see if we can hack something using metadata between two deployments where the recovery can be automated between two sites by creating a recovery plan of some sort..
14:41:39 <whoami-rajat__> Okay i found it https://review.opendev.org/c/openstack/cinder/+/950859
14:41:39 <jbernard> https://review.opendev.org/c/openstack/cinder/+/950859
14:41:43 <jbernard> ^ i think that's the one
14:41:45 <whoami-rajat__> Thanks
14:41:46 <Anoop_Shukla> FYI VMWare Site Recovery Manager Documentation: https://www.vmware.com/docs/site-recovery-manager-technical-overview
14:42:04 <whoami-rajat__> Which backend are you using Anoop_Shukla ?
14:42:09 <Anoop_Shukla> 👍
14:42:52 <Anoop_Shukla> We are planning for NetApp driver
14:43:07 <whoami-rajat__> Ack, fc?
14:43:18 <Anoop_Shukla> iSCSi to start with..
14:43:41 <whoami-rajat__> Okay, ideally the procedure should work for all backends, but it's using ceph rbd as a reference
14:43:43 <gireesh> this is replication at the storage level; Anoop's point is a DR solution across 2 different sites (between 2 openstack clusters)
14:44:13 <gireesh> and that will be at the Nova level, DR for vms
14:44:16 <Anoop_Shukla> Yes
14:44:49 <whoami-rajat__> it's related to DR using the rbd mirror utility and it covers the scenarios for boot from volume and external data volumes
14:44:59 <gireesh> I think we don't have this solution available in OpenStack, and now that OpenStack is picking up fast, customers are asking for this kind of solution in OpenStack
14:45:16 <whoami-rajat__> For testing i deployed both clusters in single node but i have tested DR across two different openstack clusters as well
14:45:26 <gireesh> if one site goes down, how will my vms be automatically migrated to the other site?
14:46:09 <whoami-rajat__> There is no automatic migration, everything happens manually with the failover operations
14:46:35 <whoami-rajat__> Openstack is not as tightly coupled as vmware and supporting this kind of DR automation is a complex task in OpenStack
14:47:32 <Anoop_Shukla> Yes. That is where we need some sort of communication/pairing between two OpenStack deployments (maybe at the Nova level). If the metadata can be synced between two different site deployments, then since storage can be failed over to site B storage, it would just be a matter of recovering the VM and creating the instance on the secondary OpenStack deployment.
14:47:44 <whoami-rajat__> Though the failover operation does migrate all volumes for a particular backend, the vm recovery remains manual
14:48:16 <Anoop_Shukla> Agreed. But if site A is lost, VM/Instance metadata is lost..
14:49:42 <whoami-rajat__> I don't think any such effort is being pursued upstream
14:50:15 <Anoop_Shukla> Okay
14:51:49 <Anoop_Shukla> Are there any third party tools available to help on these use cases?
14:53:20 <whoami-rajat__> Replication has been a feature upstream for several releases, so i assume someone would have built tooling around it, though i am not aware of any
14:55:09 <simondodsley> FYI there is, in the very early stages, a project to create an SRM-like tool for OpenStack
14:55:48 <Anoop_Shukla> Oh..is there a link/documentation available for that?
14:56:59 <simondodsley> Not yet - still in early stages - it will be based on a commercial project initially, leveraging cinder replication, but also using Ansible to perform some of the failover tasks
14:57:19 <Anoop_Shukla> okay
14:58:14 <Anoop_Shukla> There is also a solution for MetroCluster available on VMware called vMSC - which provides active-active failover across sites.. via multipathing
14:58:36 <simondodsley> Pure also has ActiveCluster which does the same thing
14:59:30 <Anoop_Shukla> Right. NetApp also has active sync at the storage layer. But the RTO is completely dependent on how long it takes for the VMs to be recovered on the secondary site..
14:59:42 <Anoop_Shukla> without host level support it's hard to get 0 RTO
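To make the earlier point about manual recovery concrete: cinder's replication failover is a per-backend admin operation (the `cinder failover-host` CLI), and everything after it - reattaching volumes and recreating instances on the secondary site - is left to the operator or external tooling. A rough sketch of that flow; the services.failover_host method, the "site-a" cloud entry, and the "hostgroup@netapp" backend name are assumptions for illustration, not a verified procedure:

```python
# Rough sketch of the manual DR flow discussed above: fail a replicated
# backend over to its secondary target, after which VM recovery is manual.
# The services.failover_host call and the backend/cloud names below are
# assumptions; the CLI equivalent is `cinder failover-host`.
import openstack
from cinderclient import client as cinder_client

conn = openstack.connect(cloud="site-a")  # hypothetical cloud entry for site A
cinder = cinder_client.Client("3", session=conn.session)

# Fail the replicated backend over to its secondary replication target.
cinder.services.failover_host(host="hostgroup@netapp", backend_id="secondary")

# After failover the volumes are served from site B, but instance recovery
# stays manual: the operator (or tooling such as Ansible) still has to
# rebuild or recreate the VMs on the secondary OpenStack deployment.
```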
15:00:56 <jbernard> ok, we're at time
15:01:12 <jbernard> #endmeeting