#openstack-meeting-4 log

14:00:01 <jokke_> #startmeeting glance
14:00:02 <openstack> Meeting started Thu Jan 10 14:00:01 2019 UTC and is due to finish in 60 minutes.  The chair is jokke_. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:05 <openstack> The meeting name has been set to 'glance'
14:00:06 <jokke_> #topic roll-call
14:00:28 <jokke_> #link https://etherpad.openstack.org/p/glance-team-meeting-agenda
14:00:31 <jokke_> o/
14:00:38 <GregWaines> o/
14:01:03 <rosmaita> o/
14:01:45 <jokke_> I think we have as much quorum as we can, just noticed message from Abhishek that he is not feeling well and will need to skip the meeting
14:02:05 <jokke_> #topic updates
14:02:21 <jokke_> So we're hitting Milestone 2 today
14:02:30 <LiangFang> o/
14:02:37 * rosmaita wants to note in the minutes that we all hope Abhishek gets well soon!
14:03:02 <jokke_> Not as big of a deal as it used to be while we were still releasing milestone releases, but good reminder that we're running very thin on time for Stein!
14:03:11 <jokke_> rosmaita++
14:04:03 <jokke_> Abhishek has been huge help working with me to get the visibility change going through tests, which sparked one of the topics for today
14:04:52 <jokke_> but first I want to bring the oslo discussion back on the table
14:05:03 <jokke_> #topic using oslo libraries
14:05:39 <jokke_> So this is one example, https://review.openstack.org/#/c/625232/ (not by any means to blame anyone for the situation), and we've discussed this before
14:06:38 <jokke_> We should not use oslo just for the sake of using oslo. The uuidutils is just extremely good example as it was proposed that we import it to replace oneliner from stdlib just because it exists
14:07:25 <jokke_> I was digging into it and looking the oslo code and it's literally just few years old wrapper to do exactly the same thing with different name we do directly from stdlib
14:07:40 <rosmaita> yeah, last time i looked, there wasn't much in it
14:07:50 <jokke_> so I marked the bug as "Won't fix" and -2d the patch
14:07:51 <rosmaita> and i doubt openstack will change the uuid implementation
14:08:00 <jokke_> indeed
14:08:17 <jokke_> nor we can do even if someone gets the brilliant idea to do it for some other project
14:08:44 <jokke_> so lets avoid the overhead of someone getting the idea of changing that and keep using stdlib for our uuid needs
14:09:03 <jokke_> we execute few lines less code and get the exactly same result
14:10:26 <jokke_> and I think this applies to the other oslo libs like I've said before. I'm all for using oslo if it provides any level of benefit for us and makes sense, but please lets not have these bugs/patches of "Lets use oslo because it exists"
14:11:10 <rosmaita> i may put up a patch to https://docs.openstack.org/glance/latest/contributor/minor-code-changes.html to mention this case
14:11:39 <jokke_> like we saw with oslo.context it is possible that something changes there without us realizing and we spend days trying to figure out why all of our gates are red
14:11:43 <LiangFang> jokke_: ++
14:11:58 <LiangFang> if new code, may use oslo
14:12:57 <jokke_> LiangFang: and even then it better be for some good reason. I won't be using oslo.someutils.print just because someone got print wrapper merged to oslo that makes no sense for us ;)
14:14:00 <jokke_> rosmaita: that would be great to have in dev docs indeed
14:14:08 <jokke_> next topic
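A minimal sketch of the stdlib one-liner discussed under this topic, assuming it is used to mint image IDs. The helper name new_image_id is hypothetical and is not taken from the Glance codebase; per jokke_'s description above, the oslo helper is a thin wrapper that ends up making the same stdlib call.

```python
import uuid


def new_image_id():
    """Return a random (version 4) UUID as a dashed string, stdlib only."""
    return str(uuid.uuid4())


image_id = new_image_id()
# Round-trip through uuid.UUID to confirm the string is a well-formed v4 UUID.
parsed = uuid.UUID(image_id)
print(image_id, parsed.version)
```

Keeping the call in the stdlib means there is no extra import path that can change underneath the project, which is the overhead the discussion wants to avoid.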
14:14:32 <jokke_> #topic MVP/new features and releasing them experimental
14:15:02 <jokke_> so we have lots of stuff coming around we are releasing first as experimental and trying to work the kinks out over multiple cycles
14:15:52 <jokke_> I just wanted to probe the feeling should we put some DEFAULT level config option like allow_experimental [True|False] to make it more obvious for operators
14:16:38 <jokke_> and if it's false we just refuse to start the service if it's configured to use those experimental features (like image import was and multiple backends still is)
14:17:58 <jokke_> and whatever approach we take on the edge caching/split braining, I think that might be in this list for a cycle or two before we have got it to the point we're confident to tell our customers to go ahead and run it in production
14:18:41 <rosmaita> do any other projects have a similar thing?
14:18:42 <LiangFang> should we have separate option for each experimental feature?
14:19:44 <jokke_> rosmaita: I do not know about other OpenStack projects, but I have seen it elsewhere as some audits are quite picky about running experimental/deprecated features
14:21:36 <jokke_> LiangFang: we're documenting it pretty decently in release notes and config options I think, but we really don't have any clear indication to the operators if they are running experimental code or not other than going through all those few thousands of lines of config files and/or poking the api and cross-referencing that to the docs
14:21:47 <rosmaita> i don't have a strong opinion, though i do think just one general config option, not one for each feature
14:21:59 <GregWaines> I think it's a good idea to have an 'explicit' configurable option around the use of experimental features
14:22:01 <rosmaita> because we'll want it 'on' all the time in devstack, i think
14:22:32 <jokke_> I was more thinking of just having single switch operator could flip and see what their deployment tooling is doing and having indication in the logs that they are indeed relying on the experimental feature
14:23:25 <rosmaita> either that or we patch oslo.config to have an 'EXPERIMENTAL' option, like they do now for 'DEPRECATED'
14:23:28 <jokke_> GregWaines: that was way better worded version of my multiple lines of trying to explain what I was looking for ;)
14:24:06 <GregWaines> :)
14:24:23 <rosmaita> then it would be in both logs and config
14:24:32 <jokke_> rosmaita: that would be great way to identify them, then it's just the matter of communicating it clearly to the operator (and having that flip switch to prevent service start if such exists)
14:25:47 <rosmaita> the problem with the switch is that some options will have values even if they're not being used
14:25:58 <rosmaita> i'm thinking of the way import was introduced
14:26:03 <jokke_> I think it just would be something that could make life easier. Maybe I should send short survey out and collect feedback from field and form proposal based on that?
14:26:54 <rosmaita> jokke_: yeah, i should be clear.  (1) we should do something, but (2) i'm not sold on the general flip switch
14:27:18 <rosmaita> you could survey for ideas, though you never know what you'll get!
14:28:02 <rosmaita> the last thing we want is really complicated code to analyze whether some features are in use
14:28:08 <jokke_> I was more thinking of asking "do you want flipswitch" "Are you ok if we just log clearly when such feature is enabled on start" :P
14:28:19 <jokke_> and write spec(lite) based on that
14:28:40 <jokke_> not necessarily asking to get 13 different ideas what to do from 12 responses :D
14:28:44 <rosmaita> well, i wouldn't ask "do you want flipswitch" until we have a good idea how it would be implemented
14:29:10 <rosmaita> but i guess if we use only sample defaults for these options
14:29:16 <rosmaita> they would all be None
14:29:32 <rosmaita> makes it a PITA to actually turn the stuff on, though
14:29:54 <rosmaita> so the tradeoff is: easy for operator to turn off, but more config to turn on
14:30:22 <rosmaita> so maybe the flipswitch code won't be too bad
14:30:32 <jokke_> I think we have always had those features turned off by default when they get introduced as experimental
14:31:19 <jokke_> so it's easy to do it on the same logic, it just checks which way the flipswitch is, combined with the condition to enable the feature vs. just checking the feature specific
14:32:16 <jokke_> more of say we have some deployment tooling right now deploying glance with multiple backends configured, it's huge amount of hunting to check if that's the case for someone who doesn't know the details
14:32:48 <jokke_> but if there is flip for allow_experimental it's really easy to flip it over and see what blows up in the testing environment
14:33:53 <jokke_> and it really hasn't been a problem when we have maybe 1, that's easy to grep from configs. But I have a feeling that the edge boom is going to bring more than one or three different things parallel that may stay experimental for a while
14:34:54 <jokke_> especially as most of the requests seem to be "Something like this but we're not exactly sure about the requirements yet"
14:35:17 <rosmaita> it may be that we should do both (a) introduce EXPERIMENTAL in oslo.config and (b) the flip switch
14:35:26 <rosmaita> but i will shut up noe
14:35:28 <rosmaita> *now
14:35:42 <jokke_> yeah I do not see those two being by any means mutually exclusive
14:36:03 <jokke_> ok last thing from me
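A rough sketch of the flip switch discussed under this topic, under stated assumptions rather than as an actual Glance or oslo.config implementation. Only the option name allow_experimental comes from the discussion; EXPERIMENTAL_OPTIONS, ExperimentalFeatureError and check_experimental are hypothetical names, a plain dict stands in for the real configuration object, and enabled_backends is used purely as an illustrative stand-in for an experimental option. An option counts as "in use" when it is set to something other than None, which mirrors the "they would all be None" idea above and does not solve the case rosmaita raises of options that carry values while unused.

```python
import logging

LOG = logging.getLogger(__name__)

# Hypothetical registry of option names that gate experimental features.
EXPERIMENTAL_OPTIONS = {"enabled_backends"}


class ExperimentalFeatureError(RuntimeError):
    """Raised when experimental options are set but not allowed."""


def check_experimental(conf):
    """Return the experimental options in use, or refuse to continue.

    conf is a plain dict standing in for the service configuration.
    """
    in_use = sorted(
        name for name in EXPERIMENTAL_OPTIONS if conf.get(name) is not None
    )
    if in_use and not conf.get("allow_experimental", False):
        raise ExperimentalFeatureError(
            "experimental options configured but allow_experimental is "
            "False: " + ", ".join(in_use)
        )
    # Log each experimental option so operators can find it in the logs.
    for name in in_use:
        LOG.warning("Experimental option in use: %s", name)
    return in_use


conf = {"allow_experimental": True, "enabled_backends": "fast:file"}
print(check_experimental(conf))
```

Running the check once at service start gives both behaviours from the discussion: a clear log line when an experimental feature is enabled, and a refusal to start when the switch is off.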
14:36:09 <jokke_> #topic registry removal
14:36:47 <jokke_> Soo, the v1 registry was sort of gone with v1 api being gone and the v2 registry was deprecated to be removed in Stein
14:37:17 <jokke_> and this was one of those things that is possibly making the life difficult with the visibility patch
14:38:02 <jokke_> so Just a reminder if someone has interest and cycles, we can remove registry in Stein and we could do with quite a bit of cleanup for the related code and documentation
14:38:39 <LiangFang> devstack may also need to adjust
14:38:53 <jokke_> I think there is still quite a bit of v1 codebase left because of the registry being so lovely tangled together there
14:39:22 <jokke_> LiangFang: I think/hope that devstack has been running glance quite a few cycles already without registry
14:39:39 <LiangFang> OK
14:40:03 <jokke_> IIRC they were quite eager not to deploy it and have glance talking directly to the db as soon as it was possible when v1  got phased out
14:40:37 <jokke_> any correction to that assumption is more than welcome
14:41:27 <jokke_> that was all from me this week and all we had in the agenda
14:41:34 <jokke_> #topic open discussion
14:42:06 <LiangFang> sudo systemctl status devstack@g-reg.service
14:42:18 <LiangFang> seems devstack still sets up the registry
14:42:28 <jokke_> LiangFang: ok, Thank You!
14:42:38 <GregWaines> wrt open discussion ... interested in brief discussion on the glance edge caching spec
14:42:39 <jokke_> So that needs to get sorted if we remove it :D
14:43:20 <jokke_> GregWaines: sure, as promised I've read the spec with care couple of times and sent you+others in that mailchain of ours some thoughts
14:43:40 <rosmaita> someone does need to take a look, the registry config is set up in https://github.com/openstack-dev/devstack/blob/master/lib/glance but i don't know if it's used in the default; may only be there to support when v1 is enabled
14:43:55 <GregWaines> Yeah I saw that ... thanks for your input.
14:44:14 <GregWaines> Agree with you that reviews by Dan and Bogdan have been helpful ... a handful of good suggestions
14:44:19 <jokke_> I was super happy seeing that Dan also reviewed it. Nice to have bit out of box viewpoints as well
14:44:29 <jokke_> ++
14:44:45 <GregWaines> I am going to update the spec based on their comments ... just while it is fresh in my mind and not to lose their input
14:44:48 <GregWaines> but
14:45:02 <GregWaines> also want to look at the suggestions in your email
14:45:16 <GregWaines> I'm open to looking at the multiple backend approach again
14:45:29 <GregWaines> is there a detailed SPEC or proposal on that ?
14:45:44 <GregWaines> i've seen what we have on the edge-computing wiki ... but that's pretty light
14:46:00 <jokke_> so the multiple backends work is being done and it is currently experimental ;P
14:46:17 <GregWaines> is there a spec ?
14:46:30 <jokke_> I think it's unfortunately one of those things where the edge format is not actually well documented anywhere
14:46:45 <jokke_> and I'm more than willing to work out proposal for it next week
14:47:15 <rosmaita> GregWaines: there is a spec for multiple backends in rocky/implemented
14:47:15 <GregWaines> understood ... who is working on it ? ... maybe I can get more details directly
14:47:26 <rosmaita> (at least there should be)
14:47:28 <jokke_> http://specs.openstack.org/openstack/glance-specs/specs/rocky/implemented/glance/multi-store.html
14:47:39 <rosmaita> and there it is
14:47:56 <GregWaines> ok ... i'll take a look at the rocky/implemented spec ... but yeah suspect that central --> edge may not be explicitly discussed in that
14:48:05 <GregWaines> but suspect what you were suggesting was to build off that
14:48:27 <rosmaita> GregWaines: i think you are correct about that
14:48:42 <jokke_> but in nutshell my idea that has been fluently ignored in every single discussion around the edge model has been to utilize that with site local backends
14:49:23 <jokke_> and instead of sending pull requests to the edge sites for the images that are needed there to be cached pushing the image to the local backend in that site
14:49:52 <jokke_> and having db replication taking care of having up-to-date metadata
14:50:12 <GregWaines> understand ... historically, our decision to do the pull approach was to really build off the caching that was already supported in glance
14:50:21 <jokke_> just because the database guys have been solving this very issue for long time and they are doing it really well
14:50:37 <jokke_> yeah, I totally understand that perspective
14:50:54 <GregWaines> ( our original approach did not have the metadata synching ... so did not have the issues being discussed in the spec right now )
14:51:00 <jokke_> And I really want to avoid getting here https://i2.wp.com/ecbiz168.inmotionhosting.com/~perfor21/performancemanagementcompanyblog.com/wp-content/uploads/2014/04/intrinsic-round-wheels-already-in-wagon.gif :P
14:51:10 <GregWaines> or meta data pulling
14:51:32 <rosmaita> GregWaines: i have not looked at your fork, where were you storing the metadata locally?
14:52:14 <jokke_> it definitely has its perks but there is also lots of quirks and overhead by polling service to sync something that is not designed to be synced outside :D
14:52:14 <GregWaines> no ... in our original approach, we did not support functionality when disconnected from the central site ... so just always got the metadata from central site and only cached the image
14:52:25 <rosmaita> ok, gotcha
14:53:03 <GregWaines> but we have a lot of use cases with our commercial product that are using Distributed Cloud and connectivity is not 100% reliable and need functionality to work when disconnected
14:53:15 <jokke_> GregWaines: and honestly the caching is still great way to speed up booting multiple VMs from backend that has slow connectivity
14:53:38 <rosmaita> the problem with push is that the edge sites are the ones who will know when they have been disconnected, so i think pull is actually better
14:53:48 <rosmaita> or at least a "hey, i am ready for a push"
14:53:55 <GregWaines> that was our thinking
14:53:56 <GregWaines> although
14:54:24 <GregWaines> we actually were thinking of possibly supporting both a push and pull model ... as some customers / use cases may align better with one than the other
14:55:12 <rosmaita> right, i can see that
14:55:27 <jokke_> yup
14:55:32 <rosmaita> it depends on how much maintenance we want the central glance to do
14:56:06 <rosmaita> because i can see these edge sites being very ephemeral, so it will be hard to know a site has disappeared vs. a bad network disruption
14:56:22 <jokke_> so just quick recap of what I had in mind would provide both in same package
14:57:00 <GregWaines> anyways ... good discussions ... wanted to let you know that a) i will update spec to capture recent comments b) will look at multiple backend approach to better understand that c) will follow up on other questions in erno's email
14:57:32 <jokke_> _if_ we replicate the db and make possible (which I think oslo.db makes very easy) for glance-api to talk to 2 db (one for reads one for writes) we would have the metadata always there, we would have the locally stored images always there
14:57:34 <rosmaita> jokke_: i wonder whether you should patch Greg's spec with an "alternatives" section mapping out what you think
14:58:08 <jokke_> and the caching would still work for images that are needed to be fetched from the central store
14:58:53 <GregWaines> yeah makes sense
14:58:54 <rosmaita> i am worried about using db replication vs. a well-defined API, mainly because i wonder how long doing a glance-edge-on-the-cheap will last ... it may be that the "normal" vs. "edge" glances are going to be a bit different as this develops
14:59:12 <jokke_> GregWaines: I'll try to get my idea to easier digestible format next week when I'm back home and I'll send it to you and we can have quick call to brainstorm/walk it through
14:59:22 <GregWaines> sounds good
14:59:51 <jokke_> rosmaita: the problem is that we don't have well defined api and we're planning to proxy api calls which has been quite a big nono in past for the community
15:00:01 <GregWaines> wrt DB replication vs API replication ... Just FYI ... StarlingX does have an API Synchronization Framework for using REST APIs to synchronize data between two clouds
15:00:27 <rosmaita> that may be useful
15:00:30 <jokke_> but we're out of time
15:00:32 <rosmaita> next topic:  we do need to prioritize giving devstack some love, for example eliminate https://github.com/openstack-dev/devstack/blob/88f8c7f02d7553d373abcab91e7af1d9e7334773/lib/glance#L60
15:00:34 <jokke_> thanks all!
15:00:38 <GregWaines> thanks
15:00:53 <jokke_> lets keep this going and find the solution to fix all edge issues! :P
15:00:57 <jokke_> #endmeeting