16:01:41 <smcginnis> #startmeeting Cinder
16:01:46 <openstack> Meeting started Wed May 24 16:01:41 2017 UTC and is due to finish in 60 minutes.  The chair is smcginnis. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:47 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:47 <xyang1> Hi
16:01:49 <openstack> The meeting name has been set to 'cinder'
16:01:50 <smcginnis> ping dulek duncant eharney geguileo winston-d e0ne jungleboyj jgriffith thingee smcginnis hemna xyang1 tbarron scottda erlon rhedlind jbernard _alastor_ bluex karthikp_ patrickeast dongwenjuan JaniceLee cFouts Thelo vivekd adrianofr mtanino karlamrhein diablo_rojo jay.xu jgregor lhx_ baumann rajinir wilson-l reduxio wanghao thrawn01 chris_morrell watanabe.isao tommylikehu mdovgal ildikov wxy
16:01:53 <Swanson> Hello.
16:01:56 <hemna> @!
16:01:56 <pewp> hemna (╯°□°)╯︵ ┻━┻
16:01:56 <smcginnis> viks ketonne abishop sivn breitz
16:01:57 <cFouts> Hi
16:01:57 <_alastor_> o/
16:01:59 <smcginnis> Well that was funny. :)
16:02:00 <jungleboyj> Hello again o/
16:02:00 <e0ne> was that the shortest kolla meeting? :)
16:02:01 <scottda> Hi again
16:02:02 <eharney> hi
16:02:03 <wxy|> o/
16:02:03 <e0ne> hi
16:02:05 <abishop> o/
16:02:07 <rawanh> Hi
16:02:15 <diablo_rojo_phon> Hello :)
16:02:18 <patrickeast> o/
16:02:27 <smcginnis> #topic Announcements.
16:02:43 <smcginnis> Two weeks left until Pike-2!
16:02:50 <smcginnis> #link https://etherpad.openstack.org/p/cinder-spec-review-tracking Review focus
16:02:59 <geguileo> hi!
16:03:02 <DuncanT> We should run our meeting as fast as that
16:03:04 <rajinir> o/
16:03:07 <smcginnis> Cores, please take a look through the new drivers in the etherpad to help them along. ^
16:03:17 <smcginnis> DuncanT: Hah
16:03:34 <smcginnis> Specs could use a pass through as well.
16:04:02 <smcginnis> PTG planning is starting for September in Denver.
16:04:20 <smcginnis> ML discussion right now on rooms. Feel free to add anything there if you have any unput.
16:04:24 <smcginnis> or input
16:04:46 <smcginnis> #topic Versioned Objects - RTFM
16:04:51 <smcginnis> geguileo: All yours :)
16:05:13 <geguileo> I wanted to remind people that we must Read The Fantastic Manuals
16:05:23 <geguileo> primarily regarding OVOs
16:05:24 <smcginnis> Fantastic... right. :)
16:05:28 <geguileo> lol
16:05:42 <geguileo> because we keep merging things that break rolling upgrades
16:05:49 <tommylikehu> oh, yes :)
16:05:51 <geguileo> with things that are explained in the docs
16:06:15 <geguileo> I have updated the error we get on the tests when we have to bump an OVO version
16:06:20 <geguileo> but long story short
16:06:27 <smcginnis> geguileo: I think that error message should help.
16:06:35 <e0ne> geguileo: what about CI for it?
16:06:47 <geguileo> we only need to bump a version if we add fields, remove fields, or add new values to a field that would break old versions
16:07:00 <geguileo> And if we bump a version we need backporting code
16:07:03 <hemna> this is the problem with OVO though, almost nobody understands them.
16:07:23 <geguileo> e0ne: Well, we cannot have CI for everything in this case
16:07:24 <smcginnis> Unit tests should catch issues, but folks need to know to write them.
16:07:38 <geguileo> e0ne: Because it means we must pass all possible payloads between services
16:07:50 <geguileo> that are in different versions
16:08:14 <geguileo> So the easy rule to remember is what I just said
16:08:25 <geguileo> so if we bump a version we need to add backporting code as mentioned in the docs
16:08:30 <geguileo> nothing more
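
The rule geguileo describes (bump the version when a field is added, and ship backporting code with the bump) can be sketched roughly as below. This is plain Python and deliberately does not use the real oslo.versionedobjects API; the Volume class, its version history, the FIELDS_ADDED_IN table and the cluster_name field are all hypothetical names made up for illustration. The rolling upgrades devref remains the reference for how Cinder actually does this.

```python
# Illustrative sketch only: plain Python, not the oslo.versionedobjects API.
# All class, field, and version names here are hypothetical.


def version_tuple(version):
    """Turn '1.2' into (1, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


class Volume:
    # Hypothetical history: 1.0 was the initial version,
    # 1.1 added the 'cluster_name' field (hence the version bump).
    VERSION = "1.1"
    # Each field mapped to the version that introduced it.
    FIELDS_ADDED_IN = {"id": "1.0", "size": "1.0", "cluster_name": "1.1"}

    def __init__(self, **fields):
        self.fields = fields

    def to_primitive(self, target_version=None):
        """Serialize, dropping fields the receiver is too old to know about.

        This is the "backporting code": without it, a service still
        running 1.0 would receive a field it cannot handle.
        """
        target = version_tuple(target_version or self.VERSION)
        return {
            name: value
            for name, value in self.fields.items()
            if version_tuple(self.FIELDS_ADDED_IN[name]) <= target
        }


vol = Volume(id="vol-1", size=10, cluster_name="cluster-a")
new_payload = vol.to_primitive()                      # for an upgraded service
old_payload = vol.to_primitive(target_version="1.0")  # for a not-yet-upgraded one
```

The point of the sketch is only that the sender, not the old receiver, is responsible for stripping what the older version does not understand.
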
16:08:56 <geguileo> after this rant I'm done with the topic, unless someone has a question
16:09:27 <tommylikehu> geguileo:  thanks
16:09:57 <smcginnis> geguileo: Thanks. I think with the better message and better core awareness we can hopefully catch these.
16:10:06 <mdovgal> hi folks)
16:10:19 <hemna> geguileo, do we have an example review of what not to do?
16:10:19 <diablo_rojo> Thanks for the re-education geguileo :)
16:10:19 <smcginnis> geguileo: Do we have details in a devref that explains unit testing, etc., for OVO changes?
16:10:22 <xyang1> geguileo: can you provide a link to the Fantastic Manual?
16:10:28 <jungleboyj> geguileo:  Thanks.
16:10:35 <geguileo> smcginnis: yup
16:10:51 <smcginnis> #link https://docs.openstack.org/developer/cinder/devref/rolling.upgrades.html#rpc-payload-changes-oslo-versionedobjects
16:11:03 <xyang1> Thanks
16:11:13 <geguileo> hemna: yes, it's in the meeting agenda
16:11:18 <geguileo> the patch with the issue
16:11:21 <hemna> geguileo, ok.
16:11:28 <geguileo> and the patch with the fix
16:11:39 <geguileo> code is actually really simple
16:11:41 <hemna> geguileo, I have a bug that I'd like to discuss with you regarding OVO breaking something.  we can take it offline
16:12:00 <geguileo> hemna: ok, we can discuss after the meeting
16:12:09 <hemna> np
16:12:14 <smcginnis> #link https://review.openstack.org/455131 Missed patch
16:12:25 <smcginnis> #link https://review.openstack.org/466254 Fix
16:12:55 <smcginnis> geguileo: OK, thanks for bringing it up and fixing it.
16:13:00 <geguileo> np
16:13:24 <smcginnis> #topic Open Discussion
16:13:31 <smcginnis> Anything else to bring up today?
16:13:47 <jgriffith> Geesh, I showed up for nuthin
16:13:51 <jgriffith> :)
16:13:52 <smcginnis> hah
16:13:54 <jungleboyj> Should we mention mysql vs mysql+pymysql ?
16:14:07 <smcginnis> jgriffith: Want to give a little infomercial on your recently merged stuff?
16:14:17 <jgriffith> Sure!
16:14:27 <smcginnis> jungleboyj: Sure, after jgriffith I guess.
16:14:40 <jungleboyj> smcginnis:  Sounds good.
16:14:43 <jgriffith> For those that didn’t see it and may be interested:  https://review.openstack.org/#/c/467304/
16:15:00 <jgriffith> That adds a standalone cinder deployment template into Cinder
16:15:33 <jgriffith> It leverages the LOCI project for image building and basically tries to get rid of the notion that “openstack/cinder” is hard to deploy
16:16:03 <jgriffith> It’s super easy if you have an external backend, and slightly more complex if you use LVM (slightly)
16:16:25 <jgriffith> Anyway, I tried to document things pretty well. I would LOVE it if people were interested and took a look at it
16:16:37 <smcginnis> jgriffith: I'd love to see a devref or something with a little detail of how you are using this for development instead of devstack.
16:16:40 <jgriffith> There’s certainly opportunity to grow it and integrate it more with the source tree
16:16:47 <smcginnis> +1
16:16:52 <xyang1> jgriffith: is keystone still required for this?
16:16:54 <jgriffith> smcginnis for sure, I’d be happy to do that
16:16:58 <jgriffith> xyang1 nope
16:17:17 <xyang1> jgriffith: great
16:17:17 <jgriffith> No keystone, but that can be added and at some point should perhaps be documented
16:17:26 <jungleboyj> jgriffith:  Thanks for working on that.  Think it is a great add.
16:17:35 <smcginnis> Yeah, like you said, this is a start.
16:17:42 <jgriffith> Keep in mind the current cinderclient on PyPI doesn’t have standalone support so you’ll need to build your own
16:17:43 <tommylikehu> jgriffith:  should I start from this link? https://review.openstack.org/#/c/467304/1/contrib/block-box/README.md
16:17:43 <smcginnis> Could be cool to see where it goes.
16:17:53 <jgriffith> Nudge nudge smcginnis
16:18:04 <jgriffith> tommylikehu yup
16:18:07 <e0ne> jgriffith: it looks awesome! I want to try it asap
16:18:13 <jgriffith> And if folks have issues/problems hit me up
16:18:19 <smcginnis> jgriffith: Do we just need a new release?
16:18:30 <jgriffith> smcginnis yes :)
16:18:57 <jungleboyj> smcginnis:  Takes a hint quickly.  ;-)
16:19:02 <smcginnis> jgriffith: OK, I can look at that soon.
16:19:03 <jgriffith> So something else that is going on and getting a bit more traction is the idea of creating a docker and kubernetes plugin for Cinder
16:19:04 <smcginnis> Hah!
16:19:26 <jgriffith> I know I’ve mentioned this before, but there’s an official OpenStack project for this effort
16:20:00 <jgriffith> It would be really great if all of us vendors got together on this and made something really robust and solid
16:20:11 <patrickeast> jgriffith: got links for those?
16:20:13 <smcginnis> jgriffith: Pointers to that?
16:20:26 <jgriffith> And also FYI this effort is part of what drove the interest for the block-box stuff
16:20:28 <e0ne> jgriffith: what project do you mean?
16:20:34 <jgriffith> fuxi
16:20:40 <jgriffith> Part of kuryr
16:20:46 <smcginnis> For the benefit of English speakers, that X makes a "sh" sound, so y'all can stop giggling.
16:20:47 <tommylikehu> oh fuxi !
16:20:55 <jgriffith> Although I’m not sure why it’s not a sub-project of Cinder, but that’s a whole different thing
16:20:59 <jgriffith> smcginnis :)
16:21:28 <jgriffith> Anyway… there’s a bp that I’m working on to convert that code to golang
16:21:33 <xyang1> smcginnis: so now you know how to pronounce my name:)
16:21:37 * jungleboyj giggles anyway.
16:21:43 <smcginnis> xyang1: Hah!
16:21:48 <e0ne> xyang1: :)
16:21:48 <tommylikehu> xyang1:  lol
16:21:48 <jgriffith> And, dims has worked up a first pass at a kubernetes flex volume plugin to use cinder
16:21:50 <xyang1> :)
16:22:09 <smcginnis> xyang1: I'll probably still not get it quite right, but I'm learning. ;)
16:22:19 <jgriffith> So for those of us at organizations where we are working on both sides of the aisle it might be interesting/worthwhile to just make Cinder the answer
16:22:29 <jungleboyj> xyang1:  It is like King , right?  ;-)
16:22:43 <Swanson> Shing?
16:22:44 <xyang1> :)
16:22:50 <jungleboyj> jgriffith:  ++
16:22:51 <smcginnis> jgriffith: +1, I think these are great efforts.
16:22:57 <xyang1> Swanson: that sounds better
16:23:16 <e0ne> jgriffith:  +1
16:23:22 <hemna> also, fwiw, I've used this stuff already and it's really great :)
16:23:37 <e0ne> hemna: does it work? :)
16:23:39 <jgriffith> hemna :)
16:23:42 <smcginnis> Great infomercial, we even have endorsements. :D
16:23:57 <jgriffith> So something to think about if you have time… ways to leverage this in your development cycle etc
16:24:27 <e0ne> jgriffith: can we use it for functional tests for brickclient?
16:24:32 <jgriffith> It would be really neat if there’s interest and it works well for people we could publish some sort of blog like “tips from the team” or whatever
16:24:42 <jungleboyj> Cool.
16:24:51 <jgriffith> e0ne Yes, it works really well for isolated tests
16:25:02 <e0ne> jgriffith: because I'm not sure when needed patch will be merged to devstack
16:25:09 <Swanson> xyang1, Probably going to continue with X-ing.
16:25:26 <e0ne> jgriffith: ok. I'll play with it and let you know if I have any issues
16:25:30 <jgriffith> e0ne so there’s effort to do containers in gate but I’m not very familiar with where it’s at or how it’s going
16:25:49 <hemna> e0ne, yes :)
16:25:50 <jgriffith> I believe there are some projects that have it working so we could look into those and how to use their templates etc
16:26:01 * jungleboyj is laughing at Swanson
16:26:24 <jgriffith> Anyway, that’s it from me.. thanks smcginnis and hemna and all
16:26:54 <smcginnis> jgriffith: Thanks for putting that up there. Could be hyoooj.
16:27:09 <jungleboyj> huoooj
16:27:14 <smcginnis> OK, jungleboyj brought this up:
16:27:15 <jungleboyj> hyoooj ?
16:27:16 <smcginnis> #link http://lists.openstack.org/pipermail/openstack-operators/2017-May/013500.html
16:27:40 <smcginnis> arnewiebalck__ did an awesome job sticking with this error.
16:27:53 <smcginnis> Finally came down to the DB connection string in cinder.conf.
16:27:59 <jungleboyj> smcginnis:  ++
16:28:22 <jgriffith> No way!!
16:28:22 <smcginnis> So if you have an older deployment that was installed before the guides switched to recommending pymysql, you could be running into deadlocks.
16:28:34 <smcginnis> So don't do that.
16:28:56 <jgriffith> I was just trying to help a customer with this exact problem the other day!
16:28:56 <jungleboyj> jgriffith:  Was that sarcastic ?
16:29:10 <smcginnis> jgriffith: Almost perfect timing. :)
16:29:15 <jungleboyj> jgriffith:  Ok, so not.  I was frustrated I didn't narrow it down to that faster.
16:29:15 <smcginnis> jgriffith: Had you figured it out yet?
16:29:17 <jgriffith> jungleboyj no, it was a ‘crap, wish I’d known this about 3 days ago’
16:29:23 <jgriffith> smcginnis NOPE!
16:29:25 <jungleboyj> jgriffith:  +1000
16:29:26 <hemna> damn
16:29:27 <Swanson> Neat.
16:29:32 <hemna> that was the deadlock cause?
16:29:33 <smcginnis> jgriffith: Well there you go... :)
16:29:34 <jgriffith> I just manually cleaned their cloud up
16:29:44 <jgriffith> And said “don’t do that again”
16:29:50 <jungleboyj> jgriffith:  Yeah, it will happen again if they don't make that change.
16:29:51 <smcginnis> hemna: Yep, basically the native driver completely blocks and causes problems.
16:30:09 <jgriffith> jungleboyj but I told them “don’t do that again”. Doesn’t that count :)
16:30:11 <hemna> that's such an easy mistake to make
16:30:16 <smcginnis> So I believe the plan out of this is oslo.log will add a warning if it sees it being used.
16:30:24 <hemna> is there a way for us to puke at startup if we detect that config db string ?
16:30:29 <smcginnis> So at least there's a slim chance it will be noticed and fixed.
16:30:30 <jungleboyj> jgriffith: Oh yes, because we never try something that hurt before again.  :-)
16:30:42 <hemna> it should be more than a warning.   we should fail to start IMHO
16:30:43 <jgriffith> jungleboyj :)
16:31:08 <jungleboyj> So, arnewiebalck__  Was going to work with oslo to get a warning output.
16:31:13 <smcginnis> hemna: There was some talk of just internally switching it over to mysql+pymysql, but there was some concern about just switching it.
16:31:35 <clarkb> there are tradeoffs involved
16:31:36 <jungleboyj> I suppose we could step it up and puke or like smcginnis said, just ignore it.
16:31:53 <clarkb> also you can't assume that a user using the mysql driver has pymysql installed
16:31:55 <hemna> if it's a known issue, we shouldn't allow the setting to be used.
16:31:59 <hemna> as it will cause problems.
16:32:02 <smcginnis> hemna: It is apparently more performant, so they didn't want to block it completely if a deployer had some good reason for sticking with it.
16:32:04 <clarkb> hemna: again there are tradeoffs
16:32:17 <clarkb> smcginnis: right that, the C is faster
16:32:20 <smcginnis> clarkb: +1
16:32:25 <hemna> smcginnis, even though that more performant config will cause deadlocks and failures ?!
16:32:41 <hemna> something doesn't make sense to me.
16:32:49 <smcginnis> hemna: Only if their usage causes a lot of simultaneous operations to be performed.
16:32:56 <hemna> we are really fast at getting you to deadlock!
16:33:05 <jungleboyj> Yay!
16:33:38 <eharney> does anyone actually know why it deadlocks?  i.e. is it a risk for all cinder configs, or only when setup in certain ways?
16:33:45 <hemna> I dunno man, this is cloud sw.  it's supposed to work with parallel actions.
16:34:30 <clarkb> eharney: because the mysql driver FFIs out to C, eventlet must block. aiui if that blocking happens at an unfortunate time you can deadlock
16:34:39 <eharney> ahh ok
16:34:46 <clarkb> eharney: pymysql doesn't have this problem because it is a pure python driver so eventlet just greenthreads it like everything else
16:34:51 <smcginnis> In this case, they had been running fine and only would run into problems when trying to delete 10+ volumes at the same time.
16:35:33 <Swanson> smcginnis, And that happens how often? All the time?
16:35:39 <hemna> so....when you want speed because you are trying to do lots of parallel operations, you are almost guaranteed a deadlock
16:35:41 <jungleboyj> The good news is that we are aware of it now and know how to fix it quickly.
16:35:52 <hemna> but when you don't care and only do 1 operation at a time.......
16:35:59 <smcginnis> Swanson: Frequently I think.
16:35:59 <jungleboyj> *Do dooo dooooh*  The more you know ....
16:36:25 <jungleboyj> Swanson:  Yeah, it sounds like it is done fairly frequently.
16:36:25 <hemna> so the reason why you use the C client is for speed, but it will cause problems.
16:36:35 <Swanson> I vote for making it not break in the first place.
16:36:38 <hemna> seems like we should do more than warn.
16:37:02 <clarkb> hemna: yes I think we are saying everyone should use pymysql and we will warn you if you don't. However, people have used the other driver for years and there may be valid reasons for them to continue to do so
16:37:06 <jungleboyj> arnewiebalck__:  Was able to get around it by removing a lock, but that also seems potentially dangerous.
16:37:31 <smcginnis> jungleboyj: And the problem went away once he updated his connection string.
16:37:37 <clarkb> hemna: there are also people that use the other oracle driver I think
16:37:48 <smcginnis> Well, feel free to chime in on that thread if you have other input.
16:37:51 <clarkb> hemna: so if you force everyone to use pymysql there will likely be undesired fallout involved
16:37:54 <Swanson> clarkb, so if this is setup a long time ago would someone be looking for the warning?
16:38:02 <jgriffith> Seems to me that documenting it and warning is good step
16:38:16 <hemna> clarkb, I was just suggesting blocking mysql:// and using mysql+pymysql:// instead
16:38:22 <hemna> not blocking everything
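
A minimal sketch of the kind of check being discussed here: look at the scheme of the [database] connection URL and flag a bare mysql:// with no explicit driver. The function name, the message text, and the example URLs (credentials and host are placeholders) are all assumptions for illustration; this is not the check any OpenStack library actually implements. It returns a warning rather than refusing to start, in line with the warn-and-document position taken in the discussion.

```python
def mysql_driver_warning(connection):
    """Return a warning string for a bare 'mysql://' URL, else None.

    Hypothetical helper: only inspects the URL scheme, it does not
    import or probe any database driver.
    """
    # 'mysql+pymysql://user@host/db' -> scheme 'mysql+pymysql'
    scheme = connection.split("://", 1)[0].lower()
    # Split 'dialect+driver'; driver is '' when none is given.
    dialect, _, driver = scheme.partition("+")
    if dialect == "mysql" and not driver:
        return ("connection uses 'mysql://' with no explicit driver; "
                "consider 'mysql+pymysql://'")
    return None


# Placeholder credentials and host, for illustration only.
old_style = "mysql://cinder:secret@dbhost/cinder"
new_style = "mysql+pymysql://cinder:secret@dbhost/cinder"

for url in (old_style, new_style):
    message = mysql_driver_warning(url)
    if message:
        print("WARNING:", message)
```

Note that this only catches the missing-driver case hemna describes; it says nothing about other C-based drivers, which clarkb points out have the same blocking behavior.
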
16:38:26 <jgriffith> Puking on startup for people that have configured a certain way (taking that ability away) is no bueno
16:38:33 <clarkb> hemna: but the other driver also suffers the same problem
16:38:37 <jgriffith> And the fact is that people will want to make that trade off
16:38:38 <clarkb> hemna: as it runs in C too?
16:38:45 <jungleboyj> jgriffith:  Have to start somewhere.
16:38:45 <clarkb> anyways it's a tradeoff and so communication is important
16:39:02 <smcginnis> Hence the post to the -operators ML and not just to -dev.
16:39:09 <hemna> seems dangerous.
16:39:15 <Swanson> If they want the 911 they'll have to risk putting it in a tree, I suppose.
16:39:38 * jungleboyj is laughing
16:39:42 <hemna> I think the warning will simply be missed and ignored
16:39:45 <hemna> until there is a problem
16:40:11 <hemna> anyway, I'll shut up. :)
16:41:06 <Swanson> Wow, and he did.
16:41:27 <hemna> Swanson, shh!
16:41:40 <jungleboyj> smcginnis:  Thanks for bringing this up.  Glad we were able to help jgriffith
16:41:43 <smcginnis> Hah! Anything else for today?
16:42:10 <jungleboyj> Huge thanks to arnewiebalck__  for chasing it and Gerhard for finding the solution.
16:43:40 <smcginnis> OK, thanks everyone.
16:43:47 <smcginnis> #endmeeting