14:00:49 <jbernard> #startmeeting cinder
14:00:49 <opendevmeet> Meeting started Wed Jul 30 14:00:49 2025 UTC and is due to finish in 60 minutes. The chair is jbernard. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:49 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:49 <opendevmeet> The meeting name has been set to 'cinder'
14:01:03 <jbernard> jungleboyj rosmaita smcginnis tosky whoami-rajat m5z e0ne geguileo eharney jbernard hemna fabiooliveira yuval tobias-urdin adiare happystacker dosaboy hillpd msaravan sp-bmilanov Luzi sfernand simondodsley: courtesy reminder
14:01:13 <jayaanand> hi
14:01:14 <jbernard> #topic roll call
14:01:22 <jbernard> o/ hello everyone
14:01:27 <gireesh> o/
14:01:29 <Sai> o/
14:01:55 <nileshthathagar> hi
14:02:45 <jungleboyj> o/
14:03:02 <simondodsley> o/
14:04:04 <hvlcchao1> o/
14:04:46 <whoami-rajat__> Hi
14:04:51 <jbernard> #link https://etherpad.opendev.org/p/cinder-flamingo-meetings
14:04:52 <vdhakad> hi
14:05:43 <jbernard> welcome everyone,
14:05:51 <jbernard> this may be a short meeting, we'll see
14:06:08 <jbernard> #link https://releases.openstack.org/flamingo/schedule.html
14:06:16 <jbernard> ^ current flamingo schedule
14:07:26 <jbernard> whoami-rajat__: if you're here, i know simondodsley has/had some questions about the openstack sdk and some functions that seemed to be missing
14:07:45 <simondodsley> yes, yes...
14:07:50 <jbernard> whoami-rajat__: i thought you might be most qualified to respond
14:08:14 <whoami-rajat__> Sure
14:09:43 <simondodsley> There is a list of cinder features not present in the SDK, so I can't (easily) create Ansible modules for these in the OpenStack collection.
14:09:54 <simondodsley> I'd like to know if these will ever be added?
14:10:05 <simondodsley> The most concerning is the lack of support for QoS
14:11:22 <simondodsley> I created this: https://etherpad.opendev.org/p/cinder-openstacksdk-gaps but there is a much bigger sheet on this subject as well
14:12:04 <whoami-rajat__> Openstacksdk is definitely missing some support, i was working on getting it up to date but my work ended after reaching parity with openstackclient, which was some releases ago
14:12:31 <whoami-rajat__> Regarding the current state, i had prepared some google docs at that time which show what we have and what we don't
14:12:45 <whoami-rajat__> Currently i am not actively working on it due to other priority work
14:13:20 <whoami-rajat__> And we as a team decided that an intern could pick it up, but i haven't heard about any outreachy or any other type of intern in a while
14:14:39 <whoami-rajat__> So to answer your question, i can't guarantee that the work will get done; surely i can spare some cycles for specific important operations if it's blocking work for vendors - but that will require my manager's approval first
14:15:22 <simondodsley> As OpenStack becomes more important and gains greater visibility, the automation side of the house is coming under scrutiny, and the lack of Ansible modules for basic functions is an issue.
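
For context on the gap discussed above: Ansible's openstack.cloud modules are built on openstacksdk, so Cinder features the SDK does not expose (QoS specs among them) cannot easily be wrapped in a module. Below is a minimal sketch of the interim workaround of calling python-cinderclient directly instead of the SDK; the auth URL, credentials, and QoS/volume-type names are placeholders, not values from the meeting.

```python
# Illustrative only: managing QoS specs via python-cinderclient as a stopgap
# while openstacksdk lacks these calls. Endpoint and credentials are placeholders.
from keystoneauth1.identity import v3
from keystoneauth1.session import Session
from cinderclient import client as cinder_client

auth = v3.Password(
    auth_url="https://keystone.example.com/v3",  # placeholder cloud
    username="admin", password="secret", project_name="admin",
    user_domain_name="Default", project_domain_name="Default",
)
cinder = cinder_client.Client("3", session=Session(auth=auth))

# Create a QoS spec and associate it with a volume type -- the kind of
# operation noted as missing from the SDK (and therefore from Ansible) today.
qos = cinder.qos_specs.create("gold-iops", {"total_iops_sec": "10000"})
vtype = cinder.volume_types.find(name="gold")
cinder.qos_specs.associate(qos, vtype.id)

print(qos.id, cinder.qos_specs.get_associations(qos))
```
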
QoS is a pretty fundamental requirement for MSPs, so this hole is glaring
14:15:51 <simondodsley> I'm happy to write the Ansible modules, but I need the SDK to have the functions to call
14:17:07 <whoami-rajat__> I understand and agree with your concerns, and it's not that hard to add the support, it's just time consuming to code and test
14:17:23 <jbernard> re an intern, i think outreachy is having some funding pressure, it's my impression that we are not likely to have an intern candidate in the near future
14:17:25 <simondodsley> Happy to work with you on this
14:18:04 <whoami-rajat__> Anyway, we can talk about it after the meeting; if you can share a list of a few important operations (like qos), i can initiate a discussion to get some time working on it
14:18:31 <whoami-rajat__> jbernard: ah, i thought so, thanks for the update
14:20:32 <jbernard> we don't have much in the way of topics today
14:20:36 <vdhakad> @Brian/ @core-members, IBM third party CI is up, requesting reviews on [IBM SVf driver] Adding support for temporary volumegroup | https://review.opendev.org/c/openstack/cinder/+/925450
14:21:06 <simondodsley> Add it to the list: https://etherpad.opendev.org/p/cinder-flamingo-reviews
14:22:05 <harsh> hi.. added a few more reviews for IBM SVf drivers in the list under the name "harsh". They are also part of the same volumegroup feature.
14:22:15 <Anoop_Shukla> Heads up - the NetApp team is working on support for the ASAr2 platform and there will be some PRs upcoming for review. We are looking to support basic volume operations in Flamingo; the upcoming release will have other operations supported.
14:22:52 <jbernard> vdhakad: the ci results would be more helpful if we could see what tests were run and what the outcome was, as opposed to the stdout dump
14:23:13 <simondodsley> So on that subject...
14:23:20 <jbernard> im personally looking at tobias' patches this week
14:23:47 <jbernard> specifically adding az support to backups
14:23:54 <simondodsley> Part of the issue with seeing the logs through gerrit is that Zuul has deprecated a parameter that allows gerrit to get the correct URL location of the logs
14:24:33 <simondodsley> so it will only point to the zuul webserver url - this means anyone with a CI behind a firewall won't be able to expose the logs
14:24:35 <jbernard> simondodsley: have you inquired about what they envision as the proper way to see logs?
14:25:04 <simondodsley> they say that the zuul dashboard is the correct way - they completely ignored corporate security...
14:25:29 <simondodsley> the other option is to set up some reverse proxy, but again corporate security may have a negative opinion of that
14:25:38 <jbernard> do they have any advice? or are we on our own for that?
14:26:07 <simondodsley> the only advice is 'use a reverse proxy' - not particularly helpful
14:26:59 <simondodsley> we have managed to get the logs into a public github repo with a custom zuul role, but the issue is getting the gerrit response to contain that url
14:27:38 <vdhakad> jbernard: List of test cases can be found here: https://github.com/vp0410/IBM-Cinder-CI/blob/main/925450/37/fc_summary.log
14:27:44 <jbernard> simondodsley: it looks like vdhakad was successful, im looking at this one: https://review.opendev.org/c/openstack/cinder/+/925450
14:28:03 <jbernard> vdhakad: ahh, thank you
14:28:18 <jbernard> simondodsley: ^ could you do something like that?
14:28:35 <harsh> yuo. the stdout has the list of all the testcase summaries for both iscsi and fc.
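
On the third-party CI logs discussed above: one way a CI behind a firewall can expose results (as the IBM setup linked above appears to do) is to push a per-change test summary into a public GitHub repository and reference that URL from the gerrit comment rather than an unreachable internal Zuul dashboard. Below is a rough sketch of just the publishing step, assuming the summary file already exists and a public repo with push credentials is checked out locally; the repo name, directory layout, and paths are illustrative, not the actual IBM or Pure configuration.

```python
# Hypothetical helper for a third-party CI: copy a tempest summary for a
# change/patchset into a public git checkout, push it, and return the URL
# to embed in the gerrit vote. Repo, layout, and credentials are placeholders.
import shutil
import subprocess
from pathlib import Path

def publish_summary(change: str, patchset: str, summary_path: str,
                    repo_dir: str = "/opt/ci-logs-repo") -> str:
    repo = Path(repo_dir)
    dest = repo / change / patchset
    dest.mkdir(parents=True, exist_ok=True)
    shutil.copy(summary_path, dest / "fc_summary.log")

    # Plain git operations; assumes the checkout already has push credentials.
    subprocess.run(["git", "-C", str(repo), "add", "."], check=True)
    subprocess.run(["git", "-C", str(repo), "commit", "-m",
                    f"CI results for {change}/{patchset}"], check=True)
    subprocess.run(["git", "-C", str(repo), "push", "origin", "main"], check=True)

    # Public URL to place in the gerrit comment instead of the Zuul link.
    return (f"https://github.com/example-org/ci-logs/blob/main/"
            f"{change}/{patchset}/fc_summary.log")

if __name__ == "__main__":
    print(publish_summary("925450", "37", "/var/log/ci/fc_summary.log"))
```
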
14:28:43 <simondodsley> would love to - vdhakad: can we talk about your config offline?
14:28:44 <harsh> yes *
14:29:00 <jbernard> harsh: yep, i see the summary now, i had missed that earlier
14:29:34 <harsh> :) np
14:29:40 <vdhakad> simondodsley: Sure. I'll include vivek from my team, he understands the config best.
14:29:48 <simondodsley> thanks
14:30:14 <jbernard> if that works out, it would be really nice to capture that in a doc somewhere
14:31:43 <jbernard> #topic open discussion
14:34:22 <jbernard> if no one has anything else to add, we can end early
14:35:11 <Anoop_Shukla> I had a question about DR support on Cinder.
14:35:17 <jbernard> Anoop_Shukla: sure
14:35:59 <Anoop_Shukla> VMWare has a tool called SRM (Site Recovery Manager) that a lot of virtualization customers are used to..
14:36:03 <gireesh> There is a list of pending patches for review; requesting the core team to look into those
14:36:17 <Anoop_Shukla> A lot of customers migrating from VMWare are asking for similar support on OpenStack
14:36:57 <Anoop_Shukla> Are there any integrations/tools that can provide similar DR (VM level recovery between two sites with two OpenStack deployments)?
14:37:07 <jbernard> Anoop_Shukla: im not familiar with what features SRM offers, but we have support for replication in participating drivers
14:37:17 <harsh> in a previous meeting we discussed having a review day for non-S reviews. Are there any updates on that?
14:37:34 <Anoop_Shukla> Right. But today recovery of the VMs is manual..
14:38:07 <jbernard> harsh: ahh, will send a mail about that today, thanks for reminding me
14:38:33 <harsh> :) np, thanks
14:38:39 <whoami-rajat__> Anoop_Shukla: yeah there is no automated way, but i have proposed documentation that will guide you step by step to recover your workloads
14:38:58 <Anoop_Shukla> Okay. Is there a link to that document?
14:39:36 <whoami-rajat__> I will need to find it as i am afk; jbernard, i remember you reviewing it, is it possible for you to post the link here?
14:39:45 <Anoop_Shukla> We wanted to see if we can hack something together using metadata between two deployments, where the recovery can be automated between two sites by creating a recovery plan of some sort..
14:41:39 <whoami-rajat__> Okay i found it https://review.opendev.org/c/openstack/cinder/+/950859
14:41:39 <jbernard> https://review.opendev.org/c/openstack/cinder/+/950859
14:41:43 <jbernard> ^ i think that's the one
14:41:45 <whoami-rajat__> Thanks
14:41:46 <Anoop_Shukla> FYI VMWare Site Recovery Manager documentation: https://www.vmware.com/docs/site-recovery-manager-technical-overview
14:42:04 <whoami-rajat__> Which backend are you using Anoop_Shukla ?
14:42:09 <Anoop_Shukla> 👍
14:42:52 <Anoop_Shukla> We are planning for the NetApp driver
14:43:07 <whoami-rajat__> Ack, fc?
14:43:18 <Anoop_Shukla> iSCSI to start with..
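
Background on the replication feature jbernard mentions above: cinder's volume replication exposes a per-backend failover that an operator triggers manually, and that manual step is what the rest of the DR discussion below turns on. A hedged sketch of driving a planned failover from python-cinderclient follows; verify the services.failover_host binding and any required API microversion against your client version, and treat the host and backend_id values as placeholders for your deployment.

```python
# Sketch only: triggering a manual replication failover for one cinder backend.
# Assumes python-cinderclient's services.failover_host binding and admin creds;
# auth URL, host, and backend_id are placeholders.
from keystoneauth1.identity import v3
from keystoneauth1.session import Session
from cinderclient import client as cinder_client

auth = v3.Password(
    auth_url="https://keystone.site-a.example.com/v3",
    username="admin", password="secret", project_name="admin",
    user_domain_name="Default", project_domain_name="Default",
)
cinder = cinder_client.Client("3", session=Session(auth=auth))

# Fail the replicated backend over to its secondary target; roughly what
# `cinder failover-host cinder-host@netapp-backend --backend_id secondary` does.
cinder.services.failover_host("cinder-host@netapp-backend", backend_id="secondary")

# Note: this moves the volumes only. Instances/VMs are not touched, so
# recovering them on the secondary site remains a manual step, as discussed below.
```
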
14:43:41 <whoami-rajat__> Okay, ideally the procedure should work for all backends, but it's using ceph rbd as a reference
14:43:43 <gireesh> this is replication at the storage level; Anoop's point is a DR solution across 2 different sites (between 2 openstack clusters)
14:44:13 <gireesh> and that would be at the Nova level, DR for vms
14:44:16 <Anoop_Shukla> Yes
14:44:49 <whoami-rajat__> it's related to DR using the rbd mirror utility, and it covers the scenarios for boot-from-volume and external data volumes
14:44:59 <gireesh> I think we don't have this solution available in OpenStack, and now that OpenStack is picking up fast, customers are asking for this kind of solution in OpenStack
14:45:16 <whoami-rajat__> For testing i deployed both clusters on a single node, but i have tested DR across two different openstack clusters as well
14:45:26 <gireesh> if one site goes down, how will my vms automatically migrate to the other site?
14:46:09 <whoami-rajat__> There is no automatic migration, everything happens manually with the failover operations
14:46:35 <whoami-rajat__> Openstack is not as tightly coupled as vmware, and supporting this kind of DR automation is a complex task in OpenStack
14:47:32 <Anoop_Shukla> Yes. That is where we need some sort of communication/pairing between two OpenStack deployments (maybe at the Nova level). If the metadata can be synced between two different site deployments, then since storage can be failed over to the site B storage, it would just be a matter of recovering the VM and creating the instance on the secondary OpenStack deployment.
14:47:44 <whoami-rajat__> Though the failover operation does migrate all volumes for a particular backend, the vm recovery remains manual
14:48:16 <Anoop_Shukla> Agreed. But if site A is lost, VM/instance metadata is lost..
14:49:42 <whoami-rajat__> I don't think any such effort is being pursued upstream
14:50:15 <Anoop_Shukla> Okay
14:51:49 <Anoop_Shukla> Are there any third party tools available to help with these use cases?
14:53:20 <whoami-rajat__> Replication has been a feature upstream for several releases, so i assume someone would have built tooling around it, though i am not aware of any
14:55:09 <simondodsley> FYI there is, in the very early stages, a project to create an SRM-like tool for OpenStack
14:55:48 <Anoop_Shukla> Oh.. is there a link/documentation available for that?
14:56:59 <simondodsley> Not yet - still in the early stages - it will be based on a commercial project initially and leverage cinder replication, but also use Ansible to perform some of the failover tasks
14:57:19 <Anoop_Shukla> okay
14:58:14 <Anoop_Shukla> There is also a solution for MetroCluster available on VMware called vMSC - which provides active-active failover across sites.. via multipathing
14:58:36 <simondodsley> Pure also has ActiveCluster, which does the same thing
14:59:30 <Anoop_Shukla> Right. NetApp also has active sync at the storage layer. But the RTO is completely dependent on how long it takes for the VMs to be recovered on the secondary site..
14:59:42 <Anoop_Shukla> without host level support it's hard to get 0 RTO
15:00:56 <jbernard> ok, we're at time
15:01:12 <jbernard> #endmeeting