14:00:31 #startmeeting glance
14:00:32 Meeting started Thu Feb 25 14:00:31 2021 UTC and is due to finish in 60 minutes. The chair is abhishekk. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:33 #topic roll call
14:00:34 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:36 The meeting name has been set to 'glance'
14:00:41 #link https://etherpad.openstack.org/p/glance-team-meeting-agenda
14:00:52 o/
14:01:04 o/
14:01:06 o/
14:01:09 * lbragstad lingers
14:01:30 let's wait a couple of minutes for jokke
14:01:33 o/
14:01:38 cool
14:02:02 #topic release/periodic jobs update
14:02:07 Final release of non-client libraries - next week
14:02:11 TODO: release notes and release patch for glance_store
14:02:30 I will do that on Monday
14:02:59 Request to go through the open glance_store patches and see what we can get in over the next two days
14:03:16 Milestone 3 - 2 weeks away
14:03:47 We have lots of open patches, kindly review them as well
14:04:27 Also, we have a gate situation
14:04:50 Jobs which pass during check are failing in gate, or vice versa
14:05:14 We need a plan to get all important patches in
14:05:26 I am open to ideas on how we should proceed
14:05:44 is it some test of ours, or just general test flakiness?
14:05:57 general test flakiness
14:06:16 I've had a lot of rechecks to do lately, but it's all been unrelated stuff
14:06:18 ack, okay
14:06:21 most of the failures are related to volume
14:06:28 yeah, I see a lot of those
14:06:37 leading up to m3, things will likely get worse with load, as they always do
14:06:46 and a couple of times I saw a timeout on the functional-py36 job as well
14:07:00 so I think in the short term, extra vigilance around babysitting and rechecking is what we have to do
14:07:26 116 things in the check queue at 6am PST is pretty high load
14:07:28 Yep, today I have added recheck around 9-10 times already
14:07:48 yeah, this is why you can never put stuff off until late in the cycle :/
14:07:55 Can we offload some jobs from check or gate?
14:08:21 I think the ones that are failing for us are all base jobs, right?
14:08:26 or at least turn them non-voting
14:08:35 I guess so
14:08:48 yeah, I'm not in favor of disabling tests to make it through :/
14:09:20 getting everything +W'd that we can, so we have as much time as possible to get them through the gate, is critical
14:09:37 I think your tasks set is all +W, and I +W'd a bunch of RBAC stuff yesterday
14:09:49 yes
14:10:09 dansmith: I really hope that mail of yours about the infra job runs continues to get attention and that everyone at some point gets a bit more critical about their tests
14:10:09 we might also have 2 new patches coming for RBAC
14:10:24 jokke: there has been some movement already, but... yeah
14:10:55 jokke: the natural uptick in load at the end of the cycle makes it hard to "feel" like much has happened, just because of the timing
14:11:02 I know this might not be favorable either, but if we run up against the wall on m3 and a patch is tripping on gate and checks for a week trying to get in, would glance issue an exception?
14:11:31 lbragstad, you mean FFE?
14:11:37 lbragstad: I hope so, especially for something we've been happy with and are just not able to get through the gate.. that's really what FFE is for, IMHO
14:11:37 right
14:11:48 dansmith: Oh I know. I wasn't referring to it as a measure of the current situation, but if we get less pointless testing from everyone, it will eventually make life easier in the long run
14:12:08 yup
14:12:43 lbragstad, +1 for FFE
14:12:45 * jokke remembers the good old times when people were horrified by a test job passing the 1hr mark ;)
14:13:28 So, we need to keep eyes on the patches and add rechecks
14:13:39 I will be off the grid this weekend,
14:13:43 I will try to keep watch over the weekend
14:13:44 otherwise I would be here to keep rechecking
14:13:52 but once I'm back I'll surely help with the babysitting
14:13:59 I will volunteer for this week
14:14:02 abhishekk: I guess I will take the late shift on that
14:14:09 cool
14:14:20 dansmith: going camping or something along those lines?
14:14:36 we have around 15+ patches in line
14:14:44 jokke: undisclosed location, classified. (but yes :)
14:15:00 we will give priority to RBAC, Tasks and Distributed Image Import first
14:15:19 if you need another babysitter, feel free to add me as a reviewer
14:15:23 I will create an etherpad to track all important patches
14:15:24 dansmith: sounds great, I wish I could do that (damn lockdown keeps me within 5km of the house or significant fines will be issued)
14:15:30 lbragstad, cool, thank you
14:15:52 Moving ahead
14:16:00 #topic Glance secure RBAC
14:16:06 We are going to implement Project scope in this cycle as Experimental
14:16:42 lbragstad is suggesting to use the oslo_policy config option to flag this work as EXPERIMENTAL
14:17:00 certainly open for debate here
14:17:02 #link https://review.opendev.org/c/openstack/glance/+/776588/4
14:17:06 Reviews:
14:17:10 #link https://review.opendev.org/q/topic:%2522secure-rbac%2522+(status:open+OR+status:merged)+project:openstack/glance
14:17:19 #link https://review.opendev.org/q/topic:%2522secure-rbac%2522+(status:open+OR+status:merged)+project:openstack/glance-tempest-plugin
14:17:26 so,
14:17:33 lbragstad: are you sure that only logs once?
14:17:44 in nova when we added something like that, it was on every policy check
14:17:45 dansmith: yeah - I tested it manually
14:18:09 but - we should add a test to make sure that's the case
14:18:18 okay, so that probably happens once every child spawn and recycle, right?
14:18:26 better than each request, but still maybe too verbose if it's constantly setting off alarms at runtime
14:18:31 yeah
14:18:51 lbragstad: abhishekk: my problem with using the oslo config option for that is that the config file has no indication that turning it on would be experimental from a glance perspective, and as we all know, people do not read release notes for stuff like that
14:18:54 so - we'll need a way to detect that
14:20:03 we could name the config option do_experimental_thing and then just use the deprecated_param= thing later
14:20:10 if they're opting in now, and we're going to log a warning,
14:20:25 it's probably not too much to ask that they change to the real config option later
14:20:38 however, I'm not overly concerned about them turning something on and not knowing it's experimental,
14:20:47 if we have it in the config help, and log a warning, etc
14:21:08 jokke, same for me, but the default value of that is also false
14:21:46 glance ships a default config with all the comments, right?
14:21:53 Well, my original point of overriding the oslo config option was to be able to override it to off only. So document that you need to have both on for it to work; after we're happy with it, we remove our experimental config option and then it will be just the oslo one dictating the behaviour
14:21:54 yes
14:22:10 abhishekk: so if it's marked experimental in there, surely that's good enough, no?
14:22:26 I think so
14:22:26 or wait,
14:22:33 is this a conf option we import from oslo?
14:22:38 yes
14:22:41 dansmith: correct
14:22:45 right, right, okay
14:22:50 `glance-api.conf [oslo_policy] enforce_new_defaults`
14:22:58 that's why it's not showing in our config files as experimental
14:23:00 which defaults to False
14:23:12 yup, got it
14:23:40 and I haven't seen us overriding that config value anywhere
14:23:49 no, please don't do that
14:23:52 if it was, I'd have had no need to get a specific option in there ... our conf file is a mile long as it is
14:24:12 my only hesitation would be creating a different configuration experience for operators
14:24:18 abhishekk: actually we do that ... I'll dig out an example for you
14:24:23 what we can do is log a warning, or refuse to start the service, if the former is True and the latter is False
14:24:33 e.g., I have to set x in all these config files, but I have to remember to set x and y in glance
14:25:06 but - maybe that's not a huge concern since they're already opting into a pretty big config change across services
14:25:10 abhishekk: we could, and that's far better than mutating config ourselves, but.. I feel like a warning log and renos is enough
14:25:23 :)
14:25:40 lbragstad: especially since this is opt-in, I really think we don't need to jump through too many hoops
14:26:44 the documentation for enforce_new_defaults is pretty clear
14:27:41 it's explicit in saying it enforces only the behavior of the new defaults, and if I'm an operator, I should probably understand those before I flip that bit
14:27:57 yeah
14:28:00 Ok, I think we should have one config option in glance with proper help text and then log warnings on startup if either is ON
14:28:27 abhishekk: what does the glance config option do?
14:29:10 dansmith, when I added that config option, I defined the policies based on that config option
14:29:25 i.e. if that is False then it was enforcing the current policies
14:29:30 abhishekk: so either the oslo.policy thing will enable the new behavior, or the new glance thing?
14:29:48 and if it is True it will enforce the new policies
14:30:15 can you make it do that without mutating the oslo_policy config?
14:30:45 yes, I was doing something like this
14:31:33 https://review.opendev.org/c/openstack/glance/+/763208/4/glance/policies/tasks.py#106
14:32:30 if `glance-api.conf [oslo_policy] enforce_new_defaults=True` but the glance option is False, the policies wouldn't opt into the new behavior, right?
14:32:30 hmm, okay, that's better than mutating config but I think that might be a little confusing for the operator if I'm reading it right,
14:32:36 but at least it will get their attention
14:32:46 how about this option:
14:33:09 lbragstad, that is what we are thinking
14:33:38 We add a new enable_experimental_policies=False config.. if they enable the oslo thing without setting this to true, we refuse to start with a reason..
basically either both are off (current behavior) or you have to turn both on if you want the new behavior, and if they disagree, fail to start
14:33:52 Sounds better to me
14:34:19 I think if the operator asks for a thing in policy, ignoring that and starting with different rules is risky behavior
14:34:30 since this is security related
14:34:44 Yep
14:35:11 I'm fine with that
14:35:15 So refusing to start if the oslo option is True while the glance option is False is the better way
14:35:42 lbragstad, sounds good?
14:35:47 yep
14:35:50 great
14:35:56 Moving ahead
14:36:02 lbragstad: can you slap that into your stuff or do you need someone to work on that?
14:36:38 I'm a little buried trying to figure out the metadef api stuff, but I can probably get around to it by EOD
14:36:56 Also I think we should move this patch out of the tree and keep it independent
14:36:58 that _should_ be like the config option definition + maybe 2-3 lines in glance/cmd/api.py
14:37:01 okay, we can talk after
14:37:02 abhishekk: agree
14:37:25 jokke, correct
14:37:31 abhishekk: just move on with the tree, knowing we'll merge the safety thing before release
14:37:59 dansmith, ack
14:38:31 The later topics are related to review requests only
14:38:50 one more thing about RBAC before we move on
14:39:01 shoot
14:39:32 Should we bump up the oslo_policy logging level by a lot in tests, or do we have some fixture we could use to filter the deprecations out of its logging?
14:40:34 It's massive, to the point that my terminal slows the test runs because of it (if I `grep -iv deprecat` the output, my tests actually run faster)
14:40:58 yeah, I briefly tried and failed to squelch those myself
14:41:02 it's quite annoying
14:41:34 and likely hitting the gate hard everywhere too, as there it causes disk IO, not just terminal output
14:42:36 I think there is a way to avoid deprecation warnings, right?
14:42:45 glance doesn't generate enough load on the gate to really matter about that, but if other projects are doing the same, then yeah
14:43:56 dansmith: well we do see sporadic timeouts in our functional tests where it's highest ... I'm not sure if it slows the run down enough to time out, but I would not be against trying to eliminate it if possible
14:44:29 I doubt it, our functional tests are much smaller than, say, a nova run with no deprecations :)
14:44:42 but it's still waste, for sure
14:44:52 if it is going to help and reduce time, I think we should do that
14:45:19 last 15 minutes, moving ahead
14:45:31 #topic Task show API
14:45:39 https://review.opendev.org/c/openstack/glance/+/763739 (base patch)
14:45:55 I think 2 patches in this tree need approval
14:46:05 So please have a look
14:46:17 #topic Distributed Image Import
14:46:44 jokke is working on his approach and by Monday we should have a clear picture about this
14:47:04 Just to highlight
14:47:18 we have the staging cleaning patch ready and in good shape as well
14:48:10 I did a bunch of refactoring/cleanup from what's in gerrit at the moment. Chasing one bug atm that's breaking loads of unit tests in exactly the same way. Once I get that nailed down, I will push the new PS
14:48:30 Cool, thank you for the update
14:48:59 Moving ahead, to Open discussion
14:49:07 #topic Open discussion
14:49:26 Most of us will not be around tomorrow
14:49:51 But I will keep a check on the patches and add rechecks if required
14:50:18 That's it from me for today
14:51:00 abhishekk: on your tasks thing,
14:51:11 the only one I see not approved is the version bump at the top, is there more?
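
For reference, a minimal sketch of the startup guard agreed on above (roughly 14:33-14:37): it assumes the option keeps the enable_experimental_policies name proposed in the meeting and that the check sits near service startup (glance/cmd/api.py was suggested); the actual patch may well look different.

    from oslo_config import cfg
    from oslo_policy import policy

    CONF = cfg.CONF

    # Glance-side opt-in flag; the name comes from the meeting discussion and
    # may differ in the final patch.
    experimental_opts = [
        cfg.BoolOpt('enable_experimental_policies',
                    default=False,
                    help='EXPERIMENTAL: opt in to the new secure-RBAC policy '
                         'defaults. Must be enabled together with '
                         '[oslo_policy] enforce_new_defaults.'),
    ]
    CONF.register_opts(experimental_opts)

    # Creating the enforcer registers the [oslo_policy] options on CONF, so
    # enforce_new_defaults can be read below.
    _enforcer = policy.Enforcer(CONF)


    def check_policy_opt_in(conf=CONF):
        """Refuse to start if the two opt-in flags disagree.

        Either both are off (current behaviour) or both are on (experimental
        behaviour); a mismatch is treated as a configuration error.
        """
        oslo_flag = conf.oslo_policy.enforce_new_defaults
        glance_flag = conf.enable_experimental_policies
        if oslo_flag != glance_flag:
            raise SystemExit(
                'Refusing to start: [oslo_policy] enforce_new_defaults=%s but '
                'enable_experimental_policies=%s; the new policy defaults are '
                'experimental, so both options must be set together.'
                % (oslo_flag, glance_flag))

This stays in the shape estimated in the discussion: one config option definition plus a few lines of checking before the API service starts.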
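On the oslo_policy deprecation noise raised around 14:39, one possible (untested) approach would be a small test fixture along these lines; it assumes the messages are emitted through the standard warnings module and/or the oslo_policy logger, which would need to be confirmed against the actual test output.

    import logging
    import warnings

    import fixtures


    class PolicyDeprecationSilencer(fixtures.Fixture):
        """Quiet oslo.policy deprecation chatter during test runs."""

        def _setUp(self):
            # Ignore warnings that look like policy deprecation messages.
            # NOTE: the message pattern is a guess, and resetwarnings() is
            # coarse (it clears all filters on cleanup).
            warnings.filterwarnings(
                'ignore', message='Policy .* was deprecated')
            self.addCleanup(warnings.resetwarnings)

            # Also raise the oslo_policy logger level in case the messages
            # are routed through logging rather than warnings.
            logger = logging.getLogger('oslo_policy')
            old_level = logger.level
            logger.setLevel(logging.ERROR)
            self.addCleanup(logger.setLevel, old_level)

A base test class could opt in with self.useFixture(PolicyDeprecationSilencer()); whether that shortens the functional runs enough to matter would need measuring, per the discussion.
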
14:51:17 veeery quick question: what is the oldest branch you consider supported? queens? pike? (background: I've noticed it would be useful to remove a legacy job defined in openstack-zuul-jobs, but it needs to disappear from all branches where it's used)
14:51:34 dansmith, there is one client patch as well
14:51:41 not knowing glance's procedures there I've been waiting for someone else to comment, but obviously my tempest test for that depends on it, so I'm okay with it if everyone else is :)
14:51:46 let me fetch the link for you
14:51:52 ah, I'll look at the topic
14:52:09 #link https://review.opendev.org/c/openstack/python-glanceclient/+/776403
14:52:32 abhishekk: sorry I totally missed that, I'll hit that after
14:52:44 dansmith, sorry, the topic is different on this one
14:52:46 cool
14:53:03 abhishekk: I think it'd be good to get a glance elder on the api version bump before I do
14:53:04 I liked how I am printing the output for this one :D
14:53:35 tosky: IIRC anything before Ussuri is in extended maintenance and not maintained by us. So if they break and are not being fixed by the group who wanted the extension, we should be retiring them
14:54:01 jokke: EM is before train (train is still fully supported), but I get it, thanks
14:54:12 dansmith, ack
14:54:23 cinder already officially retired ocata and pike, just in case
14:54:28 tosky: right, latest-2, not current-2
14:55:32 so for such a job, if I wanted to remove all occurrences of the job, should I a) remove that legacy tempest job from such old branches or b) copy the legacy job in-tree if it works and the native job can't be backported?
14:55:32 5 minutes
14:56:17 I think option a
14:56:26 tosky: which branches are affected?
14:56:54 so, it's basically the follow-up of https://review.opendev.org/c/openstack/glance_store/+/749235
14:57:21 it's definitely not a high-priority task, just a bit of cleanup when there is some time for it
14:57:22 tosky: oh, so it's basically all stable branches?
14:57:26 yep
14:57:43 that job should probably work on stein and rocky, but I'm not sure about earlier releases
14:58:46 I'd say option b) up to train; before train, whatever EM wants to do with them ... if they don't want to run tempest on those branches (in case the v3 job just isn't a drop-in replacement) we should retire those branches then
14:59:20 it's pointless to keep the branches around if they are not tested
14:59:55 okidoki, thanks
15:00:00 I will try to backport and see what happens
15:00:10 Thank you all
15:00:25 continue on openstack-glance if required
15:00:31 #endmeeting