#openstack-meeting log

16:00:16 <smcginnis> #startmeeting Cinder
16:00:16 <openstack> Meeting started Wed Jul 12 16:00:16 2017 UTC and is due to finish in 60 minutes.  The chair is smcginnis. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:17 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:20 <openstack> The meeting name has been set to 'cinder'
16:00:26 <Swanson> Hello.
16:00:42 <patrickeast> Hi
16:00:45 <tommylikehu> hello!
16:00:49 <arnewiebalck> hi
16:01:06 <diablo_rojo_phon> Hello :)
16:01:11 <jungleboyj> o/
16:01:13 <geguileo> hi!
16:01:24 <xyang1> Hi
16:01:28 <smcginnis> Hello all.
16:01:29 * geguileo is missing the ping  };-)
16:01:30 <wxy-> hi
16:01:48 <smcginnis> ping dulek duncant eharney geguileo winston-d e0ne jungleboyj jgriffith thingee smcginnis hemna xyang1 tbarron scottda erlon rhedlind jbernard _alastor_ bluex karthikp_ patrickeast dongwenjuan JaniceLee cFouts Thelo vivekd adrianofr mtanino karlamrhein diablo_rojo jay.xu jgregor lhx_ baumann rajinir wilson-l reduxio wanghao thrawn01 chris_morrell watanabe.isao,tommylikehu mdovgal ildikov wxy
16:01:53 <smcginnis> viks ketonne abishop sivn
16:01:55 <geguileo> thanks!  ;-)
16:02:01 <smcginnis> :)
16:02:09 <smcginnis> I need a macro or something
16:02:10 <_pewp_> hemna ヘ(°￢°)ノ
16:02:12 <bswartz> .o/
16:02:21 <e0ne> hi
16:02:51 <tbarron> hi
16:02:58 <abishop> o/
16:03:07 <smcginnis> OK, guess we can get going.
16:03:12 <smcginnis> #topic Announcements
16:03:21 <smcginnis> #link https://etherpad.openstack.org/p/cinder-spec-review-tracking Review focus
16:03:42 <smcginnis> Still some open work on merged specs for Pike in there.
16:03:47 <smcginnis> And we're running really low on time.
16:04:05 <smcginnis> So if at all possible, please help with reviews and updates.
16:04:26 <smcginnis> I hope to be a little more proactive this time around and move merged specs that don't fully land.
16:04:34 <smcginnis> Since we've just left them for the most part in the past.
16:04:40 <jungleboyj> smcginnis:  ++
16:04:44 <smcginnis> Then people come along a few releases later and ask why it's not working.
16:05:25 <e0ne> smcginnis: I didn't find API-WG decision as was asked for 'Backup Service Enabled API' - so let's move it to Queens
16:05:55 <lhx_> o/ hi
16:06:00 <smcginnis> e0ne: OK, great. Want to propose that?
16:06:02 <rajinir> o/
16:06:22 <e0ne> smcginnis: I need to look on it a bit deeper forst
16:06:28 <e0ne> s/forst/first
16:06:33 <smcginnis> e0ne: OK, sounds good.
16:06:35 <smcginnis> #link https://etherpad.openstack.org/p/cinder-ptg-queens Planning etherpad for PTG
16:06:48 <smcginnis> We have a topic planning etherpad started for the PTG.
16:07:04 <smcginnis> Please add any topics you think would be good to discuss face to face there.
16:07:14 <smcginnis> Hopefully many of you can attend.
16:07:42 <smcginnis> Oh, even if you don't have a topic, please add your name if you are planning to attend so we can get an idea of who will be there.
16:07:44 <e0ne> even if you can't attend - please, add topics you are interested in
16:07:55 <diablo_rojo_phon> If you don't think you can, please apply for tsp.
16:07:56 <smcginnis> e0ne: ;)
16:08:13 <smcginnis> diablo_rojo_phon: Jumping the gun! :D
16:08:15 <smcginnis> #link https://www.eventbrite.com/e/project-teams-gathering-denver-2017-tickets-33219389087 PTG registration
16:08:19 <e0ne> I think if topic is really important - it should be discussed
16:08:27 <smcginnis> #link https://openstackfoundation.formstack.com/forms/travelsupportptg_denver
16:08:27 <e0ne> maybe with some hangout session too
16:08:38 <smcginnis> Travel support program is accepting applications.
16:08:41 <hemna> diablo_rojo_phon, tsp link?
16:08:44 <diablo_rojo_phon> I know it's just important :)
16:08:46 * e0ne still is not sure about PTG attendance
16:08:58 <smcginnis> If you are involved but are unable to get company funding, please apply for travel support.
16:09:03 <smcginnis> e0ne: :(
16:09:14 <smcginnis> We will definitely try to stream again.
16:09:20 <diablo_rojo_phon> hemna smcginnis: has it on the agenda
16:09:22 <smcginnis> Hopefully with decent audio.
16:09:26 <hemna> oh yah it still costs $100 to attend
16:09:49 <e0ne> hemna: $100 + hotel + flight to US :(
16:09:56 <hemna> yah
16:10:00 <hemna> thanks for the links
16:10:09 <Swanson> Swim.
16:10:17 <smcginnis> diablo_rojo_phon: Does the TSP cover anything with the registration cost?
16:10:42 <diablo_rojo_phon> Tap can cover all of it.
16:10:44 <diablo_rojo_phon> Tsp
16:10:50 <smcginnis> Sweet.
16:10:56 <diablo_rojo_phon> Reg, flight and hotel
16:10:56 <smcginnis> So definitely apply if you need it.
16:11:01 <diablo_rojo_phon> Or some subset
16:11:27 <smcginnis> #topic Update on mysql/pymysql issues with oslo.db
16:11:34 <smcginnis> No name on this one. Was that you arnewiebalck ?
16:11:43 <jungleboyj> smcginnis:  Sorry, that was me.
16:11:50 <arnewiebalck> nope
16:11:52 <jungleboyj> Just for awareness.
16:11:57 <arnewiebalck> not me, I mean :)
16:12:13 <smcginnis> arnewiebalck: Well, it was you. ;)
16:12:14 <jungleboyj> I talked to the Oslo team about this and they agreed that it was something to improve.
16:12:38 <smcginnis> #link https://bugs.launchpad.net/oslo.db/+bug/1692956
16:12:38 <openstack> Launchpad bug 1692956 in oslo.db "Warn about potentially misconfigured connection string " [Undecided,Fix committed]
16:12:44 <_alastor_> o/
16:12:54 * smcginnis marks _alastor_ as tardy
16:12:59 <jungleboyj> They will add a warning to logs when the config option is used and they will update the help text to indicate that the option can cause deadlocks.
16:13:07 <jungleboyj> gcb was going to push the patch up.
16:13:15 <smcginnis> jungleboyj: OK, great.
16:13:23 <_alastor_> smcginnis: sorry teach :P
16:13:29 <jungleboyj> Looks like he already has a patch linked in there.
16:13:43 <smcginnis> I know there was some debate about just internally switching it to the pymysql connection, but didn't want to change behavior on people.
16:13:44 <jungleboyj> arnewiebalck:  You mind taking a look and making sure it looks good to you?
16:13:58 <arnewiebalck> sure, will do
16:14:16 <jungleboyj> arnewiebalck:  Thank you sir.
16:14:23 <jungleboyj> I will go look at the patch in the bug as well.
16:15:07 <jungleboyj> That was all on that from me.  :-)
16:15:22 <smcginnis> jungleboyj: Thanks for the update.
16:15:48 <smcginnis> #info oslo.db proposal is to log warning and improve help text.
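The kind of check described in this topic could look roughly like the sketch below. It assumes the misconfiguration in question is a `mysql://` connection string that does not name the PyMySQL driver (`mysql+pymysql://`), which is inferred from the bug title and the remark about switching to the pymysql connection. The function name and warning text are hypothetical; this is not the oslo.db patch.

```python
import logging

LOG = logging.getLogger(__name__)


def warn_if_default_mysql_driver(connection: str) -> bool:
    """Log a warning when a connection string names no explicit MySQL driver.

    Hypothetical helper: returns True if a warning was logged, else False.
    """
    # The part before "://" is "<dialect>" or "<dialect>+<driver>".
    scheme = connection.partition("://")[0].lower()
    if scheme == "mysql":
        LOG.warning(
            "Connection string uses 'mysql://' without an explicit driver; "
            "consider 'mysql+pymysql://' instead."
        )
        return True
    return False
```

This only warns and leaves the configured value untouched, which matches the concern raised in the meeting about not changing behavior on people.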
16:15:58 <jungleboyj> smcginnis:  Welcome.
16:16:01 <smcginnis> #topic Documentation Migration
16:16:07 <smcginnis> It's the jungleboyj show today. :)
16:16:17 <jungleboyj> :-)
16:16:23 <e0ne> :)
16:16:27 <smcginnis> #link https://review.openstack.org/481756
16:16:29 <jungleboyj> Just wanted to make people aware of how this is progressing.
16:16:34 <smcginnis> #link https://review.openstack.org/481847
16:16:42 <smcginnis> #link https://review.openstack.org/481848
16:16:43 <jungleboyj> So, the main admin-guide content has been migrated and merged.
16:16:54 <jungleboyj> We have some cli documentation that hasn't merged yet.
16:17:14 <jungleboyj> Found that the detailed driver config info hasn't been moved yet either.
16:17:22 <smcginnis> So in case anyone missed what was going on, it's been decided all documentation is moving out of openstack-manuals into the individual projects.
16:17:32 <smcginnis> So we can update docs along with patches.
16:17:41 <smcginnis> So first step is getting things moved over.
16:17:45 <jungleboyj> smcginnis:  Oh yeah, background is important.  :-)
16:17:50 <smcginnis> Then we can improve any formatting and other issues.
16:17:58 <smcginnis> jungleboyj: ;)
16:18:06 <jungleboyj> This is all because of the brain drain from the documentation team.
16:18:19 <smcginnis> Which reminds me - no real need for DocImpact tags now.
16:18:27 <jungleboyj> hemna:  Is super excited that we get to maintain our own documentation now.
16:18:33 <smcginnis> As if there is a DocImpact, the patch should just include the doc updates.
16:18:35 <jungleboyj> smcginnis:  Good note.
16:18:45 <tommylikehu> wow
16:18:47 <geguileo> I raised the subject of whether it makes sense or not having openstack commands in all the docs when we don't actually maintain that client...
16:19:01 <geguileo> Does anybody else think this is kind of "weird"?
16:19:01 <smcginnis> All English grammar and style questions can go to hemna
16:19:03 <smcginnis> :D
16:19:03 <jungleboyj> smcginnis:  ++ So, now reviewers need to enforce doing documentation updates in their patches.
16:19:18 <bswartz> geguileo: +1
16:19:27 <diablo_rojo_phon> smcginnis: :)
16:19:27 <smcginnis> geguileo: Yeah, I kind of agree there. But not sure if there's a way around that now.
16:19:31 <ildikov> jungleboyj: +1, that's one of the key aspects here
16:19:42 <geguileo> smcginnis: OK, so it is what it is :-(
16:20:06 <jungleboyj> ildikov: :-)  I will be the documentation Czar and would appreciate help enforcing the need for doc updates.
16:20:16 <geguileo> jungleboyj: or they could go in a different patch dependent on the code one, but they should be available before merging the code
16:20:16 <smcginnis> Yeah, I think things like the admin guide should show openstack CLI commands, since that's what we want end users to move to.
16:20:27 <smcginnis> Even though we don't really have much to do with that.
16:20:37 <jungleboyj> geguileo:  That is good too.
16:20:38 <smcginnis> geguileo: +1
16:21:01 <smcginnis> jungleboyj: Sorry, I kind of hijacked that. Anything else?
16:21:03 <jungleboyj> geguileo:  We need to move the cinderclient content right now.
16:21:15 <ildikov> jungleboyj: with the Ceilometer team we've been for moving the admin-guide for a while to be able to handle the doc updates along with the code changes
16:21:26 <jungleboyj> I don't know that we want that to go away but we do need to work on moving people to OpenStack client.  Let me think about that.
16:22:19 <jungleboyj> Anyway, so I will be pushing up a patch for the driver config stuff soon.  Then we need to enable handling Sphinx warnings as errors.
16:22:20 <xyang1> jungleboyj: are you just talking about doc or the cinderclient code?
16:22:30 <jungleboyj> xyang1:  Just the doc.
16:22:34 <xyang1> Ok
16:23:06 <jungleboyj> There are a lot of docstring issues in our code that I am going to have to push patches up to resolve.  Will do that in bite size pieces before enabling errors.
16:23:21 <jungleboyj> Some have seen me -1 patches for docstring issues.
16:23:49 <tommylikehu> jungleboyj:  :)
16:23:52 <jungleboyj> I need to work on better understanding what is right and wrong.  Hope to have that together for everyone to look at before next week's meeting and then can answer questions.
16:24:17 <jungleboyj> I just know right now what causes the doc build to fail.  How to fix it.  Want to find if there are better ways to avoid the warnings.
16:24:45 <smcginnis> jungleboyj: I think we all need to learn what is correct formatting, but once we are able to enable warnings as errors, at least it will be pretty obvious.
16:25:04 <jungleboyj> If you find missing documentation please let me know.
16:25:23 <smcginnis> jungleboyj: Careful what you ask for. :)
16:25:32 <jungleboyj> smcginnis:  ++ Yeah,  that will help.
16:25:51 <jungleboyj> I think people will need to get in the habit of doing a docs build with their changes.
16:26:05 <jungleboyj> Or we could make it a part of pep8 maybe?
16:26:30 <eharney> it's already a separate job, doesn't need to be done as part of pep8
16:26:39 <diablo_rojo_phon> I would just run their tox check build or whatever after
16:26:41 <e0ne> jungleboyj: we already have doc job. can we re-use it?
16:26:45 <diablo_rojo_phon> Not combine them
16:26:56 <ildikov> jungleboyj: the docs job is fairly quick, so it should be fine
16:27:19 <ildikov> jungleboyj: it's only the matter of raising awareness on that it counts from now on
16:27:19 <jungleboyj> diablo_rojo_phon:  Then people will need to remember to do 'tox -e docs' before they do a review.
16:27:32 <jungleboyj> ildikov:  Right.
16:27:41 <ildikov> jungleboyj: people will learn from the docs job failures, I wouldn't add it to pep8 either
16:27:47 <smcginnis> eharney: I wonder if we should have a "tox -e pregitreview" target that does fast8, docs, and py27 or something.
16:27:54 <eharney> people can just run "tox", dragging docs into pep8 is not the right thing to do
16:28:08 <smcginnis> Kind of a "here are the things you should really run before proposing a patch".
16:28:16 <jungleboyj> smcginnis:  That would be nice.
16:28:20 <eharney> smcginnis: shouldn't that be the list of what runs by default when no environment is specified...?
16:28:48 <smcginnis> eharney: Well, I think that does full pep8, not fast8, and both py27 and 35.
16:29:04 <smcginnis> eharney: And now some jerk wants to add py36 as well.
16:29:05 <smcginnis> :P
16:29:12 <eharney> lol
16:29:16 <smcginnis> hehe
16:29:44 <smcginnis> Anyway... anything else to cover jungleboyj?
16:30:05 <jungleboyj> Anyway, we can bikeshed on that piece when I have the doc builds all working.  :-)
16:30:11 <smcginnis> +1
16:30:27 <jungleboyj> smcginnis:  Not right now.  Appreciate everyone's support getting through the migration.
16:30:41 <jungleboyj> And being aware that the docs have a new level of importance.
16:30:51 <smcginnis> jungleboyj: Thanks!
16:31:02 <smcginnis> #topic Gathering of thin provisioning stats in Ocata
16:31:07 <smcginnis> arnewiebalck: OK, now it's you.
16:31:25 <arnewiebalck> Ok :)
16:31:58 <arnewiebalck> As mentioned yesterday, we upgraded to Ocata and hit the problem that the provisioning stats gathering broke the upgrade.
16:32:39 <arnewiebalck> The prob is that it cycles through our 4000+ volumes and that takes too long.
16:33:13 <arnewiebalck> So, we had to disable it to upgrade.
16:33:41 <geguileo> arnewiebalck: is that on a specific backend?
16:33:43 <smcginnis> #link https://github.com/openstack/cinder/commit/d4fd5660736a1363a4e78480b116532c71b5ce49
16:33:49 <arnewiebalck> Ceph.
16:33:52 <eharney> it's a ceph issue
16:34:08 <geguileo> eharney: I thought so, but wanted to be sure
16:34:18 <arnewiebalck> I have to admit I’m also struggling with the overall idea behind this.
16:34:30 <geguileo> 'cause I remember Jon Bernard mentioning that that was slow
16:35:03 <geguileo> arnewiebalck: I believe it was the only mechanism to get the data (though I don't remember the specifics)
16:35:13 <arnewiebalck> From what I see the code queries Ceph for every volume to get the image size.
16:35:24 <arnewiebalck> Cinder knows these sizes already.
16:35:35 <arnewiebalck> It’s not getting the actual usage from what I see..
16:35:51 <eharney> it only knows the provisioned size already, not the amount of data actually written/consumed
16:36:12 <arnewiebalck> The code doesn’t give you that either.
16:36:25 <arnewiebalck> It gives you the allocated size.
16:36:29 <eharney> the goal of this is to gather that to be able to calculate the overprovisioning ratio
16:36:47 <geguileo> arnewiebalck: it does give you the real size
16:37:28 <eharney> there was lengthy discussion in reviews about this code (i think the first attempt was wrong and later fixed), so i hope it's doing the right thing at this point
16:37:28 <arnewiebalck> geguileo: I don’t think so. We patched the code and it gives you the allocated size, not the used space of each volume.
16:37:30 <geguileo> it iterates over the diffs to calculate the real size
16:38:37 <arnewiebalck> geguileo: even if it does this, what you want to know in the end is how much space is used in the pool, no?
16:38:54 <geguileo> and that is the sum of all the diffs
16:39:23 <arnewiebalck> geguileo: right (if you get the real size :-) )
16:39:32 <geguileo> arnewiebalck: correct
16:39:42 <eharney> assuming that there isn't other data written into the pool than just the cinder volumes
16:39:52 <arnewiebalck> in our case you do 4000 calls to Ceph on startup
16:39:57 <arnewiebalck> and on each create
16:40:01 <arnewiebalck> and on each delete
16:40:01 <geguileo> jbernard: could you chime in?
16:40:23 <jbernard> i agree with the direction, as long as pools are cinder-exclusive
16:40:28 <jbernard> else, stats will be misleading
16:40:29 <geguileo> arnewiebalck: no, not on each create (afaik)
16:40:36 <smcginnis> Seems like there should be a more efficient way than making 4000 calls.
16:40:48 <geguileo> smcginnis: I don't think there is
16:40:49 <arnewiebalck> geguileo: ok ok, I got excited ;)
16:40:52 <jbernard> and we need to preserve allocated reporting, and not virtual
16:41:01 <eharney> there was also a pending optimization to move to diff_iterate2 which hasn't been tried afaik
16:41:05 <jbernard> but i think we're all on the same page about that, from reading the backlog
16:41:11 <geguileo> smcginnis: the pool could be used for other volumes
16:41:47 <jbernard> geguileo: then a per-pool call will be inaccurate
16:41:51 <arnewiebalck> geguileo: if it is used by other volumes, it will be very difficult to do the over subscription correctly
16:41:59 <geguileo> jbernard: yes, that's what I meant
16:42:10 <geguileo> jbernard: that there's no easier way, because it could be shared
16:42:39 <jbernard> if it can, then we must iterate
16:42:41 <geguileo> arnewiebalck: yes, but it's possible
16:43:00 <jbernard> or, document that we prefer it not be
16:43:04 <arnewiebalck> geguileo: sounds pretty complicated
16:43:06 <jbernard> or add a setting
16:43:13 <jbernard> but lets not do that
16:43:25 <geguileo> arnewiebalck: unless we explicitly prevent it somehow it's possible
16:43:58 <jbernard> geguileo: the admin would have to adhere to a policy, and deploy as such
16:44:42 <smcginnis> I don't suppose a call could be added for ceph to take a collection and return the result in one call?
16:44:51 <geguileo> jbernard: Or we could check on start that all volumes belong to cinder (or look like they do) and report a warning that data will be inaccurate if not
16:45:00 <smcginnis> Not sure passing 4000 IDs is much better. Or possible.
16:45:04 <jbernard> smcginnis: it could, but we'd have to iterate there
16:45:15 <jbernard> smcginnis: and it would take time to adopt
16:45:19 <arnewiebalck> Why isn’t it enough to know how full the pool is?
16:45:38 <arnewiebalck> In the end, the admin needs to take action when some threshold of real usage is used, no?
16:46:08 <smcginnis> Yeah, seems like you would want to know the pool usage total, not just the cinder usage. So if it's used by non-cinder you can actually take some of that into account.
16:46:09 <arnewiebalck> No matter what filled the pool.
16:46:30 <jungleboyj> smcginnis:  Good point.  :-)
16:46:31 <geguileo> arnewiebalck: it looks like it got broken and you are right
16:46:39 <geguileo> I was looking at the original code
16:46:53 <geguileo> but this broke it: https://review.openstack.org/#/c/410884/5/cinder/volume/drivers/rbd.py
16:46:57 <arnewiebalck> geguileo: ok, thx for checking!
16:47:02 <eharney> yeah, i also thought the current code still did diff_iterate, apparently not
16:47:25 <geguileo> arnewiebalck: it's adding the total size so it's like you say, not doing what it should
16:47:50 <geguileo> So 2 issues, right now it's not returning the right data
16:48:12 <geguileo> And it's inefficient
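The difference between the two numbers being debated (provisioned size versus space actually written) can be illustrated with the sketch below. It does not call Ceph at all: it assumes each image's extents arrive as `(offset, length, exists)` tuples, the shape a `diff_iterate`-style callback would be expected to report, and every name in it is hypothetical rather than taken from the RBD driver.

```python
from typing import Dict, Iterable, Tuple

Extent = Tuple[int, int, bool]  # (offset, length, exists), assumed shape

MiB = 1024 ** 2
GiB = 1024 ** 3


def image_used_bytes(extents: Iterable[Extent]) -> int:
    """Sum the lengths of extents that actually hold data."""
    return sum(length for _offset, length, exists in extents if exists)


def pool_usage(images: Dict[str, Tuple[int, Iterable[Extent]]]) -> Tuple[int, int]:
    """Return (provisioned_bytes, used_bytes) over all images.

    `images` maps an image name to (provisioned_size, extents).
    """
    provisioned = 0
    used = 0
    for size, extents in images.values():
        provisioned += size                # what Cinder already knows
        used += image_used_bytes(extents)  # what the diffs would add up to
    return provisioned, used


example = {
    "volume-a": (10 * GiB, [(0, 4 * MiB, True), (4 * MiB, 4 * MiB, False)]),
    "volume-b": (20 * GiB, []),
}
provisioned, used = pool_usage(example)
print(provisioned // GiB, used // MiB)
```

Summing only the first element of each pair reproduces the behavior described as broken (total provisioned size); summing the extents gives the written size, at the cost of one pass per image, which is the scaling concern raised for 4000+ volumes.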
16:49:08 <smcginnis> arnewiebalck: Not just inefficient. It breaks large deployments, right?
16:49:36 <arnewiebalck> smcginnis: c-vol for that pool didn’t start
16:49:37 <eharney> presumably it breaks them by causing a timeout somewhere to be exceeded that could be raised in config?
16:49:41 <geguileo> arnewiebalck: the problem is not knowing how full the pool is, but how much space WE are using
16:49:53 <arnewiebalck> service-list reported that c-vol as XXX
16:52:41 <smcginnis> Maybe enough for the meeting? Sounds like there will need to be some follow up discussion later.
16:52:41 <arnewiebalck> geguileo: You would use a pool for something else than just Cinder volumes (and not have a separate pool)?
16:53:05 <geguileo> arnewiebalck: or you could have 2 different cinder-volume services using the same pool
16:53:06 <arnewiebalck> smcginnis: Shall we open a bug for the follow-up?
16:53:21 <smcginnis> arnewiebalck: Sounds like that might be good if the current bug doesn't cover all of it.
16:53:25 <e0ne> arnewiebalck: +1 for bug for it
16:54:11 <arnewiebalck> smcginnis: you mean https://bugs.launchpad.net/cinder/+bug/1698786
16:54:12 <openstack> Launchpad bug 1698786 in Cinder "cinder-volume fails on start when rbd pool contains partially deleted images" [Undecided,In progress] - Assigned to Ivan Kolodyazhny (e0ne)
16:54:12 <arnewiebalck> ?
16:55:19 <arnewiebalck> geguileo: I can see how you use 2 pools for 1 service, but the other way round?
16:55:32 <geguileo> arnewiebalck: I've seen it done
16:55:47 <geguileo> I'm not saying it makes sense, buuuuut, I've seen it
16:55:51 <arnewiebalck> geguileo: was there an explanation ? ;)
16:56:02 <arnewiebalck> geguileo: ah, I see :-D
16:56:19 <smcginnis> arnewiebalck: Yeah, that's what I was thinking of.
16:56:40 <smcginnis> Any other things we need to discuss yet? 4 minutes.
16:57:42 <smcginnis> OK, let's wrap up then. Thanks everyone.
16:57:44 <e0ne> arnewiebalck, smcginnis: it's a different bug
16:58:10 <arnewiebalck> e0ne: ok, I’ll open one then
16:58:27 <smcginnis> #endmeeting