14:00:03 <rosmaita> #startmeeting cinder
14:00:04 <openstack> Meeting started Wed Jan 29 14:00:03 2020 UTC and is due to finish in 60 minutes. The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:07 <openstack> The meeting name has been set to 'cinder'
14:00:12 <rosmaita> #topic roll call
14:00:14 <lseki> hi
14:00:19 <eharney> hey
14:00:27 <LiangFang> hi
14:00:43 <rosmaita> greetings thierry
14:00:59 <rosmaita> #link https://etherpad.openstack.org/p/cinder-ussuri-meetings
14:01:03 <ttx> hi! just lurking :)
14:01:09 <whoami-rajat> Hi
14:01:12 <raghavendrat> hi
14:01:12 <sfernand> hi
14:01:33 <jungleboyj> o/
14:01:42 <rosmaita> looks like a good turnout
14:01:51 <rosmaita> #topic announcements
14:01:53 <tosky> o/
14:02:29 <rosmaita> i've been meaning to mention that, as you may have noticed, i'm not as good as jay was about keeping notes in the agenda etherpad
14:02:37 <rosmaita> so if you miss a meeting and want to know what went on
14:02:43 <rosmaita> you need to look at the meeting log
14:02:53 <rosmaita> otherwise, you may think nothing happened!
14:03:00 <rosmaita> ok, first real announcement
14:03:09 <jungleboyj> :-) I can try to get back to doing notes.
14:03:12 <rosmaita> #link https://etherpad.openstack.org/p/cinder-ussuri-meetings
14:03:20 <rosmaita> that wasn't what i meant
14:03:23 <enriquetaso> o/
14:03:33 <rosmaita> rocky goes to "extended maintenance" status next month
14:03:40 <smcginnis_> I think the meeting logs are the best. Especially with the use of #action, #info, etc.
14:03:52 <jungleboyj> smcginnis_: :-)
14:03:52 <whoami-rajat> jay for notes ++
14:04:01 <rosmaita> yeah, jungleboyj i'd kind of like to push people to using the meeting logs
14:04:11 <rosmaita> ok but about rocky going to EM ...
14:04:13 <rosmaita> final release must happen before 24 February
14:04:19 <jungleboyj> rosmaita: Ok. Sounds good.
14:04:20 <enriquetaso> you are doing great rosmaita
14:04:27 <rosmaita> doesn't look like there are any/many outstanding patches for rocky
14:04:28 <enriquetaso> :P
14:04:50 <whoami-rajat> rosmaita, i think one is mine
14:04:53 <rosmaita> so this is really a notice that if there *is* something that looks like it should be backported, please propose it soon
14:05:22 <rosmaita> whoami-rajat: right, i will keep an eye on that one
14:05:52 <whoami-rajat> rosmaita, thanks
14:06:07 <rosmaita> so we'll do the final rocky release 20 Feb
14:06:42 <rosmaita> second announcement:
14:06:43 <rosmaita> spec freeze on Friday 31 January (must be merged by 23:59 UTC)
14:06:48 <rosmaita> that's this friday
14:06:54 <kaisers> hi
14:06:57 <rosmaita> looks like we have 3 specs still in play for ussuri
14:07:06 <rosmaita> they are on the agenda later
14:07:28 <rosmaita> #topic Continued discussion about 3rd Party CI
14:07:37 <rosmaita> thanks to jungleboyj for keeping on top of this
14:07:44 <rosmaita> jungleboyj: you have the floor
14:07:51 <jungleboyj> :-)
14:08:14 <jungleboyj> Thanks. So, we started this topic last week and it seemed we needed to continue the discussion this week.
14:08:30 <jungleboyj> Or actually I guess it was during the virtual mid-cycle.
14:08:44 <rosmaita> last week as well
14:08:55 <LiangFang> :)
14:09:01 <jungleboyj> Anyway, I sent an e-mail to the mailing list and also targeted the CI e-mails for failing vendors.
14:09:13 <jungleboyj> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-January/012151.html
14:09:30 <jungleboyj> We got some responses as you can see in the etherpad.
14:09:31 <raghavendrat> hi, i am from HPE. we are trying to bring up our CI
14:09:45 <rosmaita> raghavendrat: that is good to hear
14:09:46 <raghavendrat> its in progress
14:09:47 <jungleboyj> raghavendrat: Awesome.
14:09:52 <jungleboyj> Thank you for being here.
14:10:26 <jungleboyj> Thanks to ttx for working with the OSF to reach out to vendors as well.
14:10:53 <jungleboyj> So, the list of additional drivers to be unsupported has shrunk.
14:11:12 <jungleboyj> The question that is left, however, is what do we do now?
14:11:33 <jungleboyj> Do we need to re-address what we are doing with 3rd Party CI?
14:11:40 <rosmaita> we had floated the idea last week about maybe just unsupporting but not removing drivers
14:11:53 <rosmaita> i think smcginnis had a good point that you can't do that for very long
14:12:03 <rosmaita> as libraries get updated, you will start to get failures
14:12:41 <jungleboyj> True. We are at the point that we have unsupported/removed nearly half the drivers over the last couple of releases.
14:12:50 <rishabhhpe> Hi, I am from HPE. we are trying to set up CI, but facing some difficulties. is there any documentation available, or automated scripts to bring up the setup in a single shot?
14:12:51 <rosmaita> i am hoping the Software Factory project may help with CI
14:13:06 <smcginnis_> An alternative being that we could move them to a different repo with a noop CI job.
14:14:22 <ttx> yeah, only keep CI-tested ones in mainline, and use a separate repo for everything else
14:14:25 <jungleboyj> rosmaita: Do we have someone working on setting up an example of how to use that?
14:14:38 <rosmaita> rishabhhpe: take a look at https://softwarefactory-project.io/docs/index.html
14:14:38 <ttx> The current doc is certainly lacking
14:15:00 <jungleboyj> smcginnis_: It seems keeping them somewhere is somewhat better than totally removing.
14:15:03 <e0ne> hi
14:15:06 <rosmaita> jungleboyj: tosky was speaking with someone in the cinder channel the other day about it
14:15:32 <rosmaita> i forget who though, but they were setting up a cinder CI
14:15:33 <smcginnis_> jungleboyj: Then if a distro wants to include them: "apt install openstack-cinder openstack-cinder-unsupported-drivers"
14:16:01 <jungleboyj> Ok. So, that is an option.
14:16:18 <tosky> I just jumped in a discussion started by rosmaita :)
14:16:47 <tosky> smcginnis_: you don't need to move them into a separate repository for distributions to split the packages
14:16:54 <rosmaita> basically, for the Software Factory situation, we need someone to actually set it up for cinder and then report back
14:17:05 <whoami-rajat> it was Hitachi i guess rosmaita tosky
14:17:11 <smcginnis_> tosky: Effect, not cause. ;)
14:17:21 <rosmaita> there is a community around Software Factory, and RDO is using it for CI, so it is pretty solid
14:17:35 <rishabhhpe> rosmaita: ok
14:17:36 <rosmaita> whoami-rajat: ty, that's right, it was Hitachi
14:17:45 <tosky> smcginnis_: moving code around complicates the usage of the history; my suggestion would be to keep them in-tree and mark them somehow with some annotation
14:18:15 <smcginnis_> tosky: That's what we have today.
14:18:21 <jungleboyj> smcginnis_: ++
14:18:22 <smcginnis_> The issue raised is that will eventually break.
14:18:27 <rosmaita> i guess we could blacklist them from tests?
14:18:29 <tosky> smcginnis_: but with the removals part
14:18:42 <smcginnis_> So the options are either to remove them completely, or move them somewhere out of the way.
14:18:42 <eharney> putting drivers in a separate repo also means you have to figure out how to keep dependencies in sync, or nobody will actually be able to install the unsupported drivers
14:18:58 <smcginnis_> eharney: Yeah, it just moves the problem really.
14:18:59 <jungleboyj> :-(
14:19:23 <jungleboyj> And since the vendors aren't maintaining them then it is unlikely anyone is going to do that work.
14:20:00 <e0ne> jungleboyj: +1
14:20:10 <m5z> maybe we could move them to an unsupported list and remove them when any dependency fails?
14:20:23 <tosky> smcginnis_: wouldn't it be possible to disable the setuptools entry points (if they are used; at least for sahara we used them)
14:20:23 <tosky> IMHO, and from past experience with sahara, either everything should stay in-tree as it is, or each driver should have its own repository from the start
14:20:23 <tosky> any other solution is looking for trouble :)
14:20:24 <smcginnis_> m5z: Was just thinking that.
14:20:32 <smcginnis_> That might be a good compromise.
14:21:02 <rosmaita> m5z: that is a good idea
14:21:14 <rosmaita> i'd prefer to just have one repo
14:21:20 <jungleboyj> rosmaita: ++
14:21:21 <smcginnis_> But then it's a fire and we can't wait to see if they get an update to any dependencies.
14:21:32 <smcginnis_> But probably better than just nuking them right away.
14:21:47 <jungleboyj> :-)
14:22:36 <rosmaita> maybe we could have unsupported -> unit test failures -> removal before next release
14:22:46 <lseki> ++
14:22:52 <rosmaita> we would blacklist as soon as we hit unit test failures
14:22:54 <smcginnis_> We couldn't do removal before next release.
14:22:55 <sfernand> ++
14:23:08 <smcginnis_> It would have to be removal before we can merge anything else, because suddenly the gate is borked.
14:23:18 <jungleboyj> Yeah.
14:23:43 <rosmaita> if we blacklisted the tests, wouldn't that unblock the gate?
14:23:44 <jungleboyj> So, it goes away in that release, but that is ok because it was already unsupported.
14:24:25 <smcginnis_> rosmaita: So add a SkipTest to get around it right away, then remove by ~milestone-3 if not fixed?
14:24:37 <smcginnis_> I think I'd rather just remove it at that point.
14:24:59 <jungleboyj> Yeah, not sure of the value of delaying the removal.
14:24:59 <rosmaita> well, the skip test would give them a final few weeks to get it done
14:25:00 <smcginnis_> They can always propose a revert if dependencies are fixed, but considering it is already unsupported, that's not likely.
14:25:09 <jungleboyj> Fair enough.
14:25:58 <m5z> smcginnis_: +1
14:26:52 <rosmaita> ok, so we would remove an unsupported driver from the tree immediately upon it causing test failures in the gate
14:26:55 <jungleboyj> smcginnis_: That is like what we are currently doing.
14:27:17 <smcginnis_> jungleboyj: We wouldn't remove it the cycle after marking unsupported though.
14:27:25 <smcginnis_> Only as soon as it starts causing failures.
14:27:30 <eharney> i think in a lot of cases we can opt to just fix the failing tests ourselves -- this is part of why it's useful to keep them in the tree
14:27:46 <smcginnis_> That might be an option if it's something trivial.
14:28:07 <rosmaita> we could keep that as an unadvertised option
14:28:18 <jungleboyj> smcginnis_: eharney ++
14:28:18 <eharney> yeah
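[Note: a minimal sketch of the SkipTest idea smcginnis_ raises above, assuming a hypothetical FooDriver test class -- illustration only; the actual mechanics were left to rosmaita's write-up.]

    import unittest

    # Skip the unsupported driver's whole test class to unblock the gate,
    # pending removal by ~milestone-3 if the breakage is not fixed.
    @unittest.skip("FooDriver is unsupported and fails with updated "
                   "dependencies; scheduled for removal")
    class FooDriverTestCase(unittest.TestCase):
        def test_create_volume(self):
            # would exercise the hypothetical FooDriver here
            pass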
14:28:56 <rosmaita> alright, this sounds good ... i will write up something for us to look at before we announce this
14:29:02 <rosmaita> but i think it's a good direction
14:29:09 <smcginnis_> Soooo... if we adopt this policy, are we going to revert some of the removals we've already done?
14:29:15 <ttx> I see a lot of value in the CI we run ourselves (for "open source software" drivers). I'm unsure of the real value of 3rd-party CI for us. It's really a service for the vendors, to help them check they are not broken by changes
14:29:34 <rosmaita> smcginnis_: foos uwarion
14:29:45 <ttx> So i'm unsure we should support or unsupport them based on availability of CI
14:29:49 <rosmaita> did not mean to say that
14:29:56 <smcginnis_> ttx: It's also good for the project as a whole, as it prevents cases where someone installs cinder and has a lot of trouble getting it to run.
14:30:07 <smcginnis_> That looks just as bad for cinder as it does for the vendor.
14:30:20 <ttx> smcginnis_: assuming that the 3rd-party CI actually tests the driver
14:30:22 <smcginnis_> Sometimes more so, because they think it's cinder's problem, not the vendor's problem.
14:30:36 <smcginnis_> ttx: Yes, but that's what I'm saying.
14:30:38 <rosmaita> yeah, i would prefer to keep 3rd party CI
14:31:00 <smcginnis_> We need 3rd party CI, or we need to remove non-open drivers from tree.
14:31:09 <jungleboyj> rosmaita: It is at least an indication that the vendor is engaged.
14:31:11 <ttx> yeah
14:31:34 <rosmaita> smcginnis_: i guess we should consider re-instating the drivers removed during this cycle
14:31:43 <jungleboyj> And I think that there should be some incentive to stay engaged.
14:31:54 <ttx> those are the two options. But I'd say the more difficult we make 3rd-party CI, the less likely it is to report useful results
14:32:21 <smcginnis_> It's been a constant headache, but as a whole, I think our 3rd party CI has been useful.
14:32:22 <rosmaita> ttx: that is why we are pushing Software Factory
14:32:26 <ttx> So the two options really are... simplify 3rd-party CI setup, or remove drivers that require special hardware from the tree
14:32:33 <jungleboyj> Well, that is the thing being worked in parallel: making 3rd Party CI easier.
14:32:44 <ttx> rosmaita: I agree, just trying to reframe why :)
14:33:11 <smcginnis_> It certainly can be simple: https://github.com/j-griffith/sos-ci
14:33:22 <smcginnis_> Just everyone wants to duplicate how infra works.
14:33:23 <jungleboyj> :-)
14:33:49 <jungleboyj> I thought at some point infra was pushing people to do that?
14:34:00 <smcginnis_> I don't think so.
14:34:08 <smcginnis_> This has been a headache for them too.
14:34:25 <jungleboyj> Ok. Yeah, I was surprised when they came back with that. I was unaware.
14:34:36 <rosmaita> ok, we need to wrap this up for today
14:34:45 <smcginnis_> Yeah, let's move along.
14:34:53 <smcginnis_> rosmaita: Want to summarize the plan?
14:34:59 <rosmaita> i think we made some progress
14:35:00 <jungleboyj> rosmaita: Please.
14:35:06 <raghavendrat> one query: what's the end date ... when would drivers be marked as unsupported/removed?
14:35:19 <rosmaita> unsupported would be same as now
14:35:33 <rosmaita> removal would be when the first failure in our gate occurs
14:35:38 <jungleboyj> rosmaita: ++
14:35:51 <rosmaita> i will write something up for us to review
14:36:04 <smcginnis_> #link https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers#Non-Compliance_Policy
14:36:21 <rosmaita> #action rosmaita write up summary of what we decided or edit ^^
14:36:31 <raghavendrat> ok. will have a look and also keep close watch
14:36:59 <rosmaita> you may want to reach out to the hitachi people and combine efforts on Software Factory
14:37:06 <jungleboyj> Sounds good. Should I revert the removals that I pushed up this cycle?
14:37:11 <rosmaita> check the openstack-cinder channel log for yesterday
14:37:18 <raghavendrat> ok
14:37:39 <rosmaita> jungleboyj: i would hold off until after we are absolutely sure about this
14:37:52 <rosmaita> (just in case someone thinks of a major objection we haven't considered)
14:37:52 <smcginnis_> Upgrade checkers too.
14:37:58 <rosmaita> right
14:38:06 <rosmaita> thanks jungleboyj and ttx
14:38:08 <jungleboyj> Ok. So, continue discussion.
14:38:24 <rosmaita> #topic Spec: Volume local cache
14:38:30 <LiangFang> hi
14:38:32 <jungleboyj> Thank you guys.
14:38:34 <rosmaita> #link https://review.opendev.org/#/c/684556/
14:38:55 <LiangFang> should we do a microversion change for this?
14:39:00 <rosmaita> my questions have been addressed except for the microversion one
14:39:09 <rosmaita> https://review.opendev.org/#/c/684556/12/specs/ussuri/support-volume-local-cache.rst@180
14:39:45 <eharney> i'm not sure "volume details" is the right place for that information, unless i'm misunderstanding what that refers to
14:39:57 <eharney> it should be part of the connection info etc., not the volume metadata?
14:40:09 <LiangFang> it is in connection info
14:40:22 <rosmaita> well, the volume-type extra specs will have the cacheable property
14:40:25 <LiangFang> cinder fills in the fields in that
14:40:38 <eharney> "volume details" sounds like it would appear on "cinder show" etc.
14:41:06 <rosmaita> yes, that's how it sounded to me
14:41:19 <LiangFang> sorry for misleading
14:42:39 <LiangFang> should I change the wording "volume details", and then keep the microversion unchanged?
14:42:59 <rosmaita> yes
14:43:10 <LiangFang> ok, thanks
14:43:14 <rosmaita> no microversion impact if the API response doesn't change
14:43:44 <rosmaita> ok, other than that, i think eharney and geguileo had a bunch of comments on earlier versions of the spec
14:43:45 <LiangFang> ok
14:44:07 <rosmaita> would be good if you could make sure the current version addresses your concerns
14:44:52 <rosmaita> LiangFang: did you have any questions?
14:45:10 <LiangFang> no more questions now :) thanks
14:45:20 <rosmaita> ok, great
14:45:32 <rosmaita> #topic src_backup_id
14:45:41 <rosmaita> #link https://review.opendev.org/#/c/700977/
14:45:59 <rosmaita> this is close to being done
14:46:16 <rosmaita> we talked last week about whether it could be a bug instead of a spec
14:46:28 <smcginnis_> Yeah, I still think this should just be dropped as a spec. Just add it.
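[Note: a sketch of what the src_backup_id proposal records, per the spec under review -- the key name comes from the review, the UUID values are hypothetical and truncated; the caveat about volume metadata is raised next.]

    # When a volume is created from a backup, the proposal has cinder
    # record the source backup's ID in the volume's metadata:
    volume = {
        'id': '3e59ae33-...',  # hypothetical volume UUID (truncated)
        'metadata': {
            'src_backup_id': '9c1b0487-...',  # hypothetical backup UUID
        },
    }
    # Volume metadata is user-editable, so this key may be deleted or
    # changed after creation and cannot be treated as authoritative.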
14:46:30 <rosmaita> but eric brought up a point about us using volume metadata for the field
14:46:42 <rosmaita> i think that needs to be documented
14:46:57 <rosmaita> mainly, that operators can't rely on it being there or accurate
14:47:13 <rosmaita> but otherwise, i think the proposal is fine
14:47:27 <rosmaita> also there was an issue about which id is used for incrementals
14:47:33 <rosmaita> it's addressed in the spec
14:47:58 <rosmaita> so, this will just need quick reviews once it's revised
14:48:07 <rosmaita> but i don't think there's anything controversial
14:48:27 <rosmaita> #topic Spec: 'fault' info in volume-show response
14:48:37 <rosmaita> #link https://review.opendev.org/#/c/689977/
14:48:56 <rosmaita> this is probably not ready
14:49:15 <rosmaita> it's still not clear why the user messages won't work
14:49:33 <rosmaita> and i don't like the idea of adding another DB table until we are sure it's necessary
14:49:39 <eharney> yeah, i still don't have a sense of why we want to add this when we already have a system that attempts to mostly do the same thing
14:49:54 <jungleboyj> ++
14:50:09 <eharney> there are probably some subtle differences, but i suspect the answer is to just improve what we have rather than creating a new API for this
14:50:15 <rosmaita> eharney: ++
14:50:25 <rosmaita> i will keep an eye on it for revisions
14:50:41 <whoami-rajat> seems like it's inspired by nova instances having a 'fault' property
14:51:19 <rosmaita> yes, it's just not clear to me that it's going to provide the info the proposer is looking for
14:51:19 <eharney> we currently have a scheme that ties faults to operations rather than to the object being acted on
14:51:26 <eharney> it's different, but seems to work well
14:51:49 <eharney> if you want something like nova faults, you can already query our user messages by volume id
14:52:29 <rosmaita> well, i left enough comments asking for specific answers about what exactly can't be done
14:52:34 <rosmaita> so we'll see what happens
14:52:37 <whoami-rajat> yep, agreed. it's different but works
14:52:45 <rosmaita> #topic sqlalchemy update to 1.3.13 breaks cinder
14:52:54 <rosmaita> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-January/012210.html
14:53:06 <rosmaita> ok, so the situation is that one of our unit tests fails
14:53:33 <rosmaita> i took a look, and it turns out what we're doing in the test *only* happens in that test
14:53:41 <rosmaita> so we could fix this by just changing the test
14:53:56 <rosmaita> or by slightly modifying the db.sqlalchemy.api
14:54:31 <rosmaita> i am inclined to just change the test at this point
14:54:53 <rosmaita> because the db api change loads the glance metadata into each volume object
14:54:55 <eharney> geguileo fixed some DetachedInstanceError problems a while ago; i wonder if this is a similar bug in our objects code that is just being revealed in tests now
14:55:13 <rosmaita> that could be
14:55:42 <rosmaita> most of the time when we want the glance info, we just make a call to get it, we don't expect it in the volume object
14:56:47 <rosmaita> i'll grep the logs for geguileo's fix and see whether it's the same kind of thing
14:56:58 <rosmaita> because i guess we'd do the same fix now to be consistent
14:57:15 <rosmaita> ok, i'll take a look and then update my patch
14:57:37 <geguileo> the issue is usually us trying to do a lazy load when we no longer have the transaction in place...
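[Note: a minimal standalone illustration of the lazy-load problem geguileo is describing -- hypothetical simplified models, not cinder's actual schema or code.]

    from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import Session, relationship

    Base = declarative_base()

    class Volume(Base):
        __tablename__ = 'volumes'
        id = Column(Integer, primary_key=True)
        name = Column(String(64))
        # default lazy='select': loading is deferred until first access
        glance_metadata = relationship('VolumeGlanceMetadata')

    class VolumeGlanceMetadata(Base):
        __tablename__ = 'volume_glance_metadata'
        id = Column(Integer, primary_key=True)
        volume_id = Column(Integer, ForeignKey('volumes.id'))
        key = Column(String(64))
        value = Column(String(255))

    engine = create_engine('sqlite://')
    Base.metadata.create_all(engine)
    session = Session(bind=engine)
    session.add(Volume(id=1, name='vol-1'))
    session.commit()

    vol = session.query(Volume).get(1)
    session.close()  # vol is now detached; no transaction behind it

    # The relationship was never loaded, so this access triggers a lazy
    # load with no session -> sqlalchemy.orm.exc.DetachedInstanceError
    vol.glance_metadata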
14:57:41 <rosmaita> i'm not sure how anxious the requirements team is to get sqlalchemy 1.3.13 into u-c
14:57:56 <geguileo> it works if it happens fast enough, but that's not usually the case iirc
14:58:12 <rosmaita> maybe that's why it's suddenly broken
14:58:21 <rosmaita> they may have optimized some code
14:58:30 <rosmaita> and now it can't happen fast enough
14:58:36 <geguileo> in other words, it's usually bad code in cinder, something that could happen in a production env
14:59:04 <rosmaita> as far as i can tell, this particular pattern is only used in that one unit test
14:59:19 <whoami-rajat> i think the bot automatically updates u-c when a lib is released.
14:59:25 <whoami-rajat> i mean puts up a patch for it
14:59:48 <rosmaita> looks like we are out of time
15:00:01 <rosmaita> thanks everyone! will try to have some open discussion next week
15:00:04 <jungleboyj> Thanks!
15:00:05 <whoami-rajat> thanks!
15:00:07 <rosmaita> but the CI discussion was helpful
15:00:16 <raghavendrat> thanks
15:00:27 <rosmaita> #endmeeting