14:00:01 <abhishekk> #startmeeting glance
14:00:02 <opendevmeet> Meeting started Thu Jan 20 14:00:01 2022 UTC and is due to finish in 60 minutes. The chair is abhishekk. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:02 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:02 <opendevmeet> The meeting name has been set to 'glance'
14:00:04 <abhishekk> #topic roll call
14:00:09 <abhishekk> #link https://etherpad.openstack.org/p/glance-team-meeting-agenda
14:00:10 <abhishekk> o/
14:02:33 <abhishekk> Waiting for others to show
14:03:27 <rosmaita> o/
14:03:37 <rajiv> Hey
14:04:12 <abhishekk> cool, let's start,
14:04:21 <abhishekk> maybe others will show up in between
14:04:37 <abhishekk> #topic release/periodic jobs update
14:04:46 <abhishekk> Milestone 3 is 6 weeks from now
14:04:53 <abhishekk> Possible targets for M3
14:05:03 <abhishekk> Cache API
14:05:04 <abhishekk> Stores detail API
14:05:04 <abhishekk> Unified limits usage API
14:05:04 <abhishekk> Append existing metadef tags
14:05:28 <abhishekk> So these are some important work items we are targeting for M3
14:05:51 <abhishekk> Will ping for reviews as and when they are up
14:05:57 <abhishekk> Non-Client library release - 5 weeks
14:06:20 <abhishekk> We do need to release glance-store by next week with V2 clone fix
14:06:35 <abhishekk> Periodic jobs all green
14:06:47 <abhishekk> #topic Cache API
14:07:14 <abhishekk> Cache API base patch is up for review, couple of suggestions from dansmith, I will fix them
14:07:25 <abhishekk> Tempest coverage is in progress
14:07:35 <abhishekk> #link https://review.opendev.org/c/openstack/glance/+/825115
14:08:14 <abhishekk> I am thinking to cover more cache APIs and scenarios, will be open for reviews before next meeting
14:08:29 <abhishekk> #topic Devstack CephAdmin plugin
14:08:38 <abhishekk> #link http://lists.openstack.org/pipermail/openstack-discuss/2022-January/026778.html
14:09:24 <abhishekk> There will be efforts to create a new cephadmin devstack plugin
14:09:45 <abhishekk> I will sync with victoria for more information
14:10:19 <abhishekk> from glance perspective, we need to make sure that this plugin will deploy ceph with single store as well as multistore configuration
14:10:40 <abhishekk> that's it from me for today
14:11:00 <abhishekk> rosmaita, do you have any inputs to add about cephadm plugin?
14:11:38 <rosmaita> no, i think sean moody's response to vkmc's initial email is basically correct
14:12:09 <rosmaita> that is, do the work in the current devstack-plugin-ceph, don't make a new one
14:13:02 <abhishekk> yes, I went through it
14:13:56 <abhishekk> let's see how it goes
14:14:02 <abhishekk> #topic Open discussion
14:14:12 <abhishekk> I don't have anything to add
14:14:16 <jokke_> I guess it's just a matter of changing devstack to deploy with the new tooling Ceph introduced
14:14:27 <jokke_> not sure if there's anything else really to it for now
14:14:57 <abhishekk> likely
14:16:16 <abhishekk> anything else to discuss or we should wrap this up?
14:16:24 <jokke_> abhishekk: I saw you had revived the cache management api patch but didn't see any of your negative tests you held it from merging last cycle ... we're still expecting new ps for that?
14:16:48 <abhishekk> jokke_, yes, I am working on those
14:17:04 <jokke_> I still have no idea what you meant with that so can't tell if I just missed them, but there was nothing added
14:17:36 <jokke_> kk
14:17:47 <abhishekk> Nope, I haven't pushed those yet as I am facing some issues
14:18:30 <abhishekk> Like one scenario for example
14:18:47 <abhishekk> create image without any data (queued status)
14:19:07 <abhishekk> add that image to queue for cache and it's getting added to queued
14:19:41 <abhishekk> So I am thinking whether we add some validation there (like non-active images should not be added to queue)
14:21:02 <jokke_> up to you ... I tried to get the API entry point moved last cycle and was very clear that I had no interest to change the actual logic that happens in the caching module ... IMHO those things should be bugfixes and changed on their own patches
14:21:11 <jokke_> but you do as you wish with them
14:21:22 <abhishekk> ack
14:21:50 <abhishekk> sounds good
14:22:00 <abhishekk> anything else to add ?
14:22:12 <abhishekk> croelandt, ^
14:22:19 <jokke_> it makes sense to fix issues like that and the bug I filed asap for the new API endpoints so we're not breaking them right after release ;)
14:22:44 <jokke_> but IMO they are not related to moving the endpoints from the middleware to actual api
14:23:44 <croelandt> abhishekk: nope :D
14:24:02 <abhishekk> yes they are not, but I am just thinking to do it at this point only
14:24:39 <abhishekk> croelandt, ack
14:24:45 <dansmith> o/
14:24:54 <abhishekk> hey
14:25:09 <abhishekk> we are done for today
14:25:16 <dansmith> sweet :)
14:25:21 <abhishekk> dansmith, do you have anything to add ?
14:25:25 <rajiv> hi, i would like to follow up on this bug : https://bugs.launchpad.net/python-swiftclient/+bug/1899495
14:25:26 <dansmith> nope
14:25:52 <abhishekk> I have cache tempest base work up, if you have time, please have a look
14:26:19 <rosmaita> i must say, it is nice to see all this tempest work for glance happening
14:26:19 <dansmith> I saw yesterday yep
14:26:24 <dansmith> rosmaita: ++
14:27:19 <abhishekk> rajiv, unfortunately didn't get time to go through it much
14:27:49 <jokke_> rajiv: I just read Tim's last comment on it
14:28:15 <jokke_> rajiv: have you actually confirmed that scenario that it happens when there are other images in the container?
14:28:25 <abhishekk> I just need input whether we wait for default cache periodic time (5 minutes) or set it in zuul.yaml to less time
14:28:28 <rajiv> jokke_: yes, i replied to the comment, we have already implemented it but it didn't help
14:28:59 <jokke_> rajiv: ok, so the 500 is coming from swift, not from Glance?
14:30:31 <rajiv> since i have nginx in the middle in my containerised setup, i am unable to validate the source
14:31:35 <jokke_> kk, I'll try to give it another look and see if I can come up with something that could work based on Tim's comment
14:32:04 <rosmaita> rajiv: looking at your last comment in the bug, i think it's always possible to get a 5xx response even though we didn't list them in the api-ref
14:33:08 <rajiv> 409 for sure comes from swift/client.py but 500 from glance
14:33:40 <jokke_> Ok, that's what I was asking, so the 500 is coming from glance, swift correctly returns 409
14:33:50 <rajiv> 2022-01-20 02:02:01,536.536 23 INFO eventlet.wsgi.server [req-7cd63508-bed1-4c5f-b2cc-7f0e93907813 60d12fe738fe73aeea4219a0b3b9e55c8435b55455e7c9f144eece379d88f252 a2caa84313704823b7321b3fb0fc1763 - ec213443e8834473b579f7bea9e8c194 ec213443e8834473b579f7bea9e8c194] 10.236.203.62,100.65.1.96 - - [20/Jan/2022 02:02:01] "DELETE /v2/images/5f3c87fd-9a0e-4d61-88f9-301e3f01309d HTTP/1.1" 500 430 28.849376
14:34:10 <abhishekk> rajiv, any stack trace ?
14:34:45 <rajiv> abhishekk: not more than this :(
14:34:52 <abhishekk> ack
14:35:04 <rajiv> 2022-01-20 02:02:01,469.469 23 ERROR glance.common.wsgi [req-7cd63508-bed1-4c5f-b2cc-7f0e93907813 60d12fe738fe73aeea4219a0b3b9e55c8435b55455e7c9f144eece379d88f252 a2caa84313704823b7321b3fb0fc1763 - ec213443e8834473b579f7bea9e8c194 ec213443e8834473b579f7bea9e8c194] Caught error: Container DELETE failed: https://objectstore-3.eu-de-1.cloud.sap:443/v1/AUTH_a2caa84313704823b7321b3fb0fc1763/glance_5f3c87fd-9a0e-4d61-88f9-301e3f01309d 409 Conflict [
14:35:27 <jokke_> so we do always expect to whack the container. I'm wondering if we really do store one image per container and it doesn't get properly deleted or if there is a chance of having multiple images in that one container and it's really just cleanup we fail to catch
14:35:56 <rajiv> it's 1 container per image
14:36:09 <rajiv> and segments of 200MB inside the container
14:36:14 <jokke_> I thought it should
14:36:27 <jokke_> so it's really a problem of the segments not getting deleted
14:36:51 <rajiv> yes, our custom code retries deletion 5 times in case of a conflict
14:37:12 <rajiv> and wait time was increased from 1 to 5 seconds, but had no luck
14:37:37 <rajiv> code : https://github.com/sapcc/glance_store/blob/stable/xena-m3/glance_store/_drivers/swift/store.py#L1617-L1639
14:38:47 <jokke_> I wonder what would happen if we instead of trying to delete the object and then the container we just asked swiftclient to delete the container recursively
14:39:05 <jokke_> and let it deal with it, would the result be the same
14:39:17 <rajiv> yes, i tried this as well but had same results
14:39:24 <jokke_> ok, thanks
14:39:48 <rajiv> does the code need to be time.sleep(self.container_delete_timeout) https://github.com/sapcc/glance_store/blob/stable/xena-m3/glance_store/_drivers/swift/store.py#L1637
14:39:55 <abhishekk> no
14:40:25 <abhishekk> https://github.com/sapcc/glance_store/blob/2cb722c22a085ee9cdf77d39e37d2955f48811c3/glance_store/_drivers/swift/store.py#L37
14:40:33 <rajiv> i see a similar spec in cinder, hence i asked : https://github.com/sapcc/glance_store/blob/stable/xena-m3/glance_store/_drivers/cinder.py#L659
14:40:34 <jokke_> let's try to get on the next swift weekly and see if they have any better ideas why this happens and how to get around it now when we know that it's for sure 1:1 relation and it's really swift not deleting the segments
14:41:16 <rajiv> abhishekk: ack
14:41:23 <abhishekk> wait
14:42:19 <abhishekk> this is wrong coding practice but it will work
14:42:53 <abhishekk> Let's move this to the glance IRC channel
14:43:35 <rajiv> sure
14:43:41 <abhishekk> thank you all
14:43:50 <abhishekk> #endmeeting
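
Editor's note: the retry behaviour rajiv describes (retry the container delete up to 5 times on a 409 Conflict, waiting 5 seconds between attempts) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the linked glance_store fork: `ConflictError`, `delete_with_retry`, and the `delete_container` callable are hypothetical names, and no real swiftclient call is used.

```python
import time


class ConflictError(Exception):
    """Hypothetical stand-in for a 409 Conflict raised by an object store client."""


def delete_with_retry(delete_container, retries=5, wait_seconds=5.0, sleep=time.sleep):
    """Call delete_container(), retrying on ConflictError.

    delete_container is any zero-argument callable (an assumption; the real
    driver would wrap its own client call). Returns the number of attempts
    used. Re-raises ConflictError once the retries are exhausted.
    """
    for attempt in range(1, retries + 1):
        try:
            delete_container()
            return attempt
        except ConflictError:
            if attempt == retries:
                # Out of retries: surface the conflict to the caller.
                raise
            # Give the backend time to finish removing segments.
            sleep(wait_seconds)
```

As noted in the meeting, retrying and waiting longer did not resolve the conflict in rajiv's deployment, so a loop like this only helps when the segments are eventually removed.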