16:07:49 <jungleboyj> #startmeeting cinder
16:07:50 <openstack> Meeting started Wed Dec  5 16:07:49 2018 UTC and is due to finish in 60 minutes.  The chair is jungleboyj. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:07:51 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:07:53 <openstack> The meeting name has been set to 'cinder'
16:08:08 <whoami-rajat> Hi again
16:08:12 <rosmaita> o/
16:08:16 <e0ne> hi
16:08:31 <jungleboyj> courtesy ping:  jungleboyj diablo_rojo, diablo_rojo_phon, rajinir tbarron xyang xyang1 e0ne gouthamr thingee erlon tpsilva ganso patrickeast tommylikehu eharney geguileo smcginnis lhx_ lhx__ aspiers jgriffith moshele hwalsh felipemonteiro lpetrut lseki _alastor_ whoami-rajat yikun rosmaita
16:08:48 <geguileo> o/
16:08:53 <jungleboyj> #topic announcements
16:08:55 <_alastor_> o/
16:08:56 <rajinir> hi
16:09:51 <jungleboyj> So, the only announcement I have is that I have sent the User Survey feedback responses that we put together off to Superuser.
16:10:13 <jungleboyj> #link https://docs.google.com/document/d/1d4_nUfuZacGABG2hqqCz3hC238uoVqdloi_qVGRNoRY/edit?usp=sharing
16:10:50 <jungleboyj> Nicole made it sound like they have quite a few things to process yet so it will probably take them a bit to get to it.  So, if you want to take another look and/or make comments, please do so.
16:11:33 <abishop> o/
16:11:37 <jungleboyj> Any questions or comments there?
16:12:13 <rosmaita> nope, looks good, thanks for putting it together
16:12:26 <jungleboyj> rosmaita:  No problem.  Thanks for looking at it.
16:12:38 <jungleboyj> One other announcement I suppose.
16:12:46 <jungleboyj> We are coming up on milestone 2 for the release.
16:13:11 <jungleboyj> We have a couple of driver submissions, so we should try to find some bandwidth to review those in time.
16:13:21 <jungleboyj> Would greatly appreciate any assistance doing reviews there.
16:14:03 <jungleboyj> A list of the ones I know about can be seen here:
16:14:07 <jungleboyj> #link https://etherpad.openstack.org/p/cinder-spec-review-tracking
16:15:24 <jungleboyj> Ok.  I think that is it for announcements.
16:15:55 <jungleboyj> #topic shared_targets_online_data_migration fails when cinder-volume service not running
16:16:05 <jungleboyj> imacdonn:  You here?
16:16:52 <jungleboyj> Guess not.
16:17:15 <jungleboyj> #link https://bugs.launchpad.net/cinder/+bug/1806156
16:17:16 <openstack> Launchpad bug 1806156 in Cinder "shared_targets_online_data_migration fails when cinder-volume service not running" [Undecided,New]
16:17:26 <geguileo> jungleboyj: I mentioned that issue at the PTG
16:17:36 <geguileo> when complaining about the issues on the shared_targets implementation
16:17:53 <jungleboyj> geguileo:  Ok, I thought it at least sounded familiar
16:17:56 <geguileo> and I added the comment from the first link
16:18:15 <geguileo> to our code, to make sure we don't do it again: https://github.com/openstack/cinder/commit/2cd5957c5e891f0bc5cf57253c0f5e18b330e954
16:19:23 <jungleboyj> Ok.  So what do we need to do to fix the existing problem?
16:19:41 <geguileo> it's a pita
16:20:03 <geguileo> because there is no way to do this offline
16:20:28 <jungleboyj> Ugh.  Ok.
16:21:17 <geguileo> well, there may be a way
16:21:40 <jungleboyj> Can we fix shared targets as suggested in the bug to avoid the limitation?
16:22:00 <jungleboyj> Given that it looks like we didn't follow the spec.
16:23:07 <geguileo> well, that part of the spec is also against the whole rolling upgrades/online migration concept
16:23:32 <jungleboyj> :-(
16:23:40 <geguileo> in theory the volumes should have been migrated as they are accessed
16:23:59 <jungleboyj> Ok.
16:24:04 <geguileo> and there should have been a mechanism to do the online migrations without the services being online
16:24:56 <jungleboyj> I feel like this is going to be a problem that we are going to keep hitting in other places in the future.
16:25:40 <geguileo> well, if we implement things right it won't be a problem
16:25:47 <jungleboyj> So, how should we move forward?
16:25:49 <jungleboyj> He he.
16:25:55 <jungleboyj> Isn't that always the case?
16:26:22 <geguileo> lol
16:26:48 <geguileo> if we don't care about overloading the DB on start, we can do the proposed solution
16:27:12 <jungleboyj> So, the root of the problem on the upgrade here is the fact that we are trying to query the shared_targets setting during the upgrade?
16:27:31 <jungleboyj> Accessing the volumes through the volume service?
16:27:58 <geguileo> yup
16:28:29 <jungleboyj> Ok.  So, the proposed solution isn't unprecedented.  Right?  We query volumes for other things at volume service startup.
16:28:39 <jungleboyj> Checking if they are in deleting status and stuff like that.
16:28:47 <geguileo> well, it's against what should be done  XD
16:28:52 <geguileo> I don't know if there's precedent
16:29:08 <jungleboyj> geguileo:  Why is it against what should be done?
16:29:12 <geguileo> the solution is basically to implement a data migration on start
16:29:29 <geguileo> on EVERY start of the service
16:29:47 <geguileo> FOREVER
16:29:56 <jungleboyj> Ok.  I wonder how Nova has dealt with this?
16:30:05 * jungleboyj is scared by the all caps forever
16:30:30 <geguileo> unless we use a config option or store something in the DB to disable it once it's done
16:30:41 <jungleboyj> geguileo:  That was what I was going to ask.
16:30:48 <geguileo> because you can upgrade from N to N+2 and will still want that field to be set correctly
16:30:57 <jungleboyj> We have access to the DB obviously during the migration.
16:31:23 <geguileo> the proper solution would have been to make the migration online when the volumes were loaded
16:31:33 <jungleboyj> Seems like we should be able to indicate to the DB that a migration has happened and that the volume service needs to do something on the next start.
16:31:48 <geguileo> and store in the service row if the backend required shared or not
16:32:00 <geguileo> and then on the next release use that field from the DB to do it
16:32:32 <jungleboyj> But we can't go back and fix that at this point.
16:32:39 <geguileo> nop
16:32:57 <geguileo> somebody has to go and figure out a decent solution
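
One candidate, combining the ideas raised above (run the migration at service start, but guard it with a marker stored on the service row so it is not repeated forever), might look roughly like the sketch below. Every name here (shared_targets_migrated, the capabilities lookup) is hypothetical — this is not actual Cinder code, just an illustration of the approach discussed:

    from cinder import objects

    def _maybe_migrate_shared_targets(self, context):
        # Hypothetical one-time startup migration for the volume manager.
        service = objects.Service.get_by_args(
            context, self.host, 'cinder-volume')
        # Assumed marker field on the service row; once set, later
        # restarts skip the potentially expensive volume scan.
        if getattr(service, 'shared_targets_migrated', False):
            return
        # Whether this backend actually uses shared targets.
        shared = self.driver.capabilities.get('shared_targets', True)
        for volume in objects.VolumeList.get_all_by_host(context, self.host):
            if volume.shared_targets != shared:
                volume.shared_targets = shared
                volume.save()
        service.shared_targets_migrated = True
        service.save()
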
16:33:48 <jungleboyj> Ok.
16:34:23 <jungleboyj> I need to understand the urgency of this problem.  Guess that isn't clear yet.
16:34:41 <geguileo> we should ask in the bug who/what is affected by this
16:34:56 <jungleboyj> Do we need to solve it today, or can we put this on the agenda for the PTG and try to work it out when some of us are in a room?
16:35:08 <jungleboyj> Ok, yeah, hard to understand that without imacdonn here.
16:36:07 <jungleboyj> Let's start there.  I will update the bug after the meeting and see if we can get more input.  We'll follow up in next week's meeting when hopefully Iain can be here.
16:36:14 <jungleboyj> Sound like a plan?
16:36:32 <geguileo> +1
16:36:35 <jungleboyj> Cool.
16:36:51 <jungleboyj> #action jungleboyj to update the bug and we will follow up in next week's meeting
16:37:36 <jungleboyj> #topic remove policy checks at DB layer?
16:37:40 <jungleboyj> rosmaita:
16:37:45 <rosmaita> i'll be quick.  i thought removing the policy checks at the db layer was a no-brainer, but i am having second thoughts
16:37:53 <jungleboyj> :-)
16:37:56 <rosmaita> anyway, since it's already on the agenda, let me explain what's up
16:38:05 <rosmaita> here's the use case: an operator wants to have a "read-only" admin who can do audits but not make changes
16:38:15 <rosmaita> if you try to do this in the policy file:
16:38:15 <rosmaita> "some-delete-call": "rule:admin_api"
16:38:15 <rosmaita> "some-get-call": "rule:admin_api or role:observer-admin"
16:38:23 <rosmaita> the some-get-call fails when a non-admin user with role:observer-admin makes the call
16:38:35 <rosmaita> it's traceable to the db layer where we have a decorator @require_admin_context
16:38:45 <rosmaita> so we could eliminate that decorator ... but it decorates 97 functions in db/sqlalchemy/api.py
16:38:54 <rosmaita> that seems pretty risky
16:39:04 <rosmaita> plus, i've been thinking about this for a while, and it turns out that doing that won't completely address the use case anyway
16:39:22 <rosmaita> the reason why not is that if you want your read-only admin to see stuff like the admin metadata on a volume, that person has to be a "real" admin (in the sense of having a property defined in context_is_admin in the policy file that will make is_admin:True hold)
16:39:35 <rosmaita> so to handle that situation, you need to do an unsafe workaround that i was hoping to fix
16:39:46 <rosmaita> it's unsafe because if you want a read-only admin, you have to give that person a role that fits into how context_is_admin is defined, which makes that person a serious admin
16:39:58 <rosmaita> and then you have to plug up all the holes by adding something like "not role:observer-admin and ..." to each of the policies you *don't* want them to have
16:40:13 <rosmaita> i don't see any way around that
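
Concretely, the workaround rosmaita is describing would look something like this in policy.json — the rule names are placeholders, not Cinder's real policy targets:

    {
        "context_is_admin": "role:admin or role:observer-admin",
        "admin_api": "is_admin:True",

        "some-get-call": "rule:admin_api",
        "some-delete-call": "rule:admin_api and not role:observer-admin",
        "some-update-call": "rule:admin_api and not role:observer-admin"
    }

The observer-admin gets is_admin:True (so things like volume admin metadata are visible), and every mutating policy then has to be individually fenced off with "not role:observer-admin".
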
16:40:22 <rosmaita> anyway, i wrote up an etherpad while i was trying to figure this out
16:40:33 <rosmaita> #link https://etherpad.openstack.org/p/different-types-of-admin
16:40:41 <rosmaita> i'd appreciate it if anyone interested could read through and see if there's something i missed or if i'm being stupid
16:40:47 <jungleboyj> Yikes.  That all sounds kind of scary for what would be a limited use case.
16:40:53 <rosmaita> but i think my proposal at this point would be to "fix" this via documentation
16:41:02 <rosmaita> and not touch the code at all
16:41:07 <geguileo> rosmaita: does the context store the roles of the caller?
16:41:25 <rosmaita> geguileo: yes, pretty sure it does
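
(The request context does carry the caller's roles, and that list is what expressions like role:observer-admin are evaluated against. A self-contained oslo.policy illustration, with a made-up rule name:)

    from oslo_config import cfg
    from oslo_policy import policy

    enforcer = policy.Enforcer(cfg.CONF)
    enforcer.register_default(policy.RuleDefault(
        'some-get-call', 'role:admin or role:observer-admin'))

    # creds mirrors what Cinder derives from the request context;
    # the 'roles' list is what "role:..." checks match against.
    print(enforcer.enforce('some-get-call', {},
                           {'roles': ['observer-admin']}))  # True
    print(enforcer.enforce('some-get-call', {},
                           {'roles': ['member']}))          # False
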
16:41:38 <jungleboyj> rosmaita:  At this point in time that sounds like the safest solution.
16:41:41 <eharney> is it really a limited use case?  i think the use case is "anyone who wants to adjust policy to give users access to certain things they don't normally have"
16:41:41 <geguileo> rosmaita: "fix" by documentation means that it's not possible? or what does it mean?
16:42:11 <rosmaita> the workaround works, i could just write up an example of how to do this
16:42:24 <jungleboyj> eharney:  I hadn't interpreted it that broadly.
16:42:30 <rosmaita> the key thing is that it's the operator's responsibility to test carefully
16:42:35 <geguileo> rosmaita: the workaround is to add all those "not role:observer" rules?
16:42:48 <rosmaita> geguileo: yes
16:42:58 <rosmaita> and possibly more, depending on how fine-grained you want it
16:43:02 <geguileo> rosmaita: what about setting a Cinder conf for the admin-observer role name?
16:43:10 <eharney> i think it is that broad unless i missed something here?
16:43:12 <geguileo> rosmaita: then check it in Cinder?
16:43:30 <rosmaita> i think that wouldn't work in the long run
16:43:40 <rosmaita> because you might also want a creator-only-admin
16:43:48 <rosmaita> (i mention this in the etherpad)
16:44:02 <rosmaita> i think what we have now is flexible enough
16:44:09 <geguileo> I think that's different
16:44:10 <rosmaita> operators just have to be careful
16:44:34 <geguileo> or do you mean you want to allow 1 person to do an admin create call that allows them to create volumes like an admin?
16:45:11 <rosmaita> geguileo: what i mean is an admin who can create everything an admin can, but can't delete anything
16:45:26 <rosmaita> sort of a role for interns :)
16:45:40 <jungleboyj> He he he.
16:45:52 <geguileo> rosmaita: is that a real use case somebody has asked for?
16:46:07 <rosmaita> yes
16:46:09 <geguileo> because I know the admin observer role is requested by many people
16:46:37 <rosmaita> yeah, the observer came up this morning in the glance channel
16:46:38 <geguileo> the admin create only role seems less needed (from my point of view)
16:47:08 <rosmaita> rackspace has the admin/observer/creator defined
16:47:17 <rosmaita> even for normal user accounts
16:47:27 <geguileo> I think the "admin can create but cannot delete" case should be resolved as you discussed, with the "not role:" rules
16:47:57 <geguileo> create is one thing, another is create as an admin
16:48:01 <jungleboyj> rosmaita: Could we start by taking something like rackspace has and document it in our documentation and see how people feel about that?
16:48:23 <jungleboyj> Point people to that if they are asking.
16:48:32 <rosmaita> jungleboyj: i don't know if we want to go that far, but i have a writeup for observer-admin
16:48:44 <rosmaita> don't want to go beyond that without a lot of tests!
16:48:47 <geguileo> I think we should have the observer role without so much work...
16:48:48 <jungleboyj> Then if there is enough demand readdress the risk/reward of changing Cinder.
16:48:54 <rosmaita> although there's always the "no warranty" disclaimer
16:49:00 <jungleboyj> :-)
16:50:00 <jungleboyj> rosmaita: So, start by formalizing that documentation?
16:50:03 <rosmaita> we can do all those now, it's just that it's a little error-prone
16:50:20 <rosmaita> jungleboyj: ok, i can put up a patch
16:50:39 <jungleboyj> rosmaita:  Let's start there.
16:50:43 <jungleboyj> Any objections?
16:51:15 <jungleboyj> Ok.  Cool.
16:51:20 <rosmaita> here's that etherpad again: https://etherpad.openstack.org/p/different-types-of-admin
16:51:29 <jungleboyj> rosmaita: Thanks for bringing this up.
16:51:36 <rosmaita> np
16:51:43 <jungleboyj> #action rosmaita to push up a patch with documentation
16:52:32 <jungleboyj> #topic update on possible mid-cycle
16:52:50 <jungleboyj> #link https://etherpad.openstack.org/p/cinder-stein-mid-cycle-planning
16:53:05 <jungleboyj> So, I brought this up with my management and we have budget to do this.
16:53:12 <jungleboyj> So, Lenovo can host.
16:53:29 <rosmaita> nice!
16:53:42 <jungleboyj> The caveat is that our campus is undergoing a lot of construction so we would have to deal with that and possible parking challenges.
16:54:09 <jungleboyj> Is that a big enough deal to make anyone else in RTP volunteer to host?
16:55:07 <jungleboyj> Bueller .... Bueller
16:55:20 <rosmaita> i am up for parking challenges
16:55:24 <jungleboyj> Guess that is a no.
16:55:53 <jungleboyj> Ok, like I said last time, this would probably be pretty bare bones, but we would have a place to meet and internet.
16:56:54 <jungleboyj> And the week proposed works for those people in RTP?
16:57:09 <jungleboyj> eharney: rosmaita _hemna jbernard
16:57:31 <jungleboyj> I know that smcginnis and I can get there as we planned the date together.
16:57:32 <rosmaita> yes for me
16:58:04 <jungleboyj> Ok.  I will keep moving forward with the process at Lenovo then.
16:58:24 <jungleboyj> Ok.  Final topic.
16:58:43 <jungleboyj> 3rd Party CI requirements for connectors
16:58:49 <jungleboyj> mszwed:  You here?
16:58:52 <mszwed> Hi
16:58:55 <mszwed> I'm currently working with Mellanox on a continuous integration setup for the new SPDK NVMe-oF volume and target drivers. I want to know which tests I should run on this setup. My first approach was to run the same tests as for the current nvmet driver, but there are also different opinions.
16:59:23 <jungleboyj> #link https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers#Volume_.26_Connector_Drivers
16:59:56 <jungleboyj> So, I think the answer is:  tox -e all -- volume
16:59:58 <eharney> i'm not sure that faq is correct...
17:00:14 <jungleboyj> Ugh.  We are out of time already.
17:00:15 <eharney> and we clearly need to sort this out since there have been numerous questions lately about what tests are needed
17:00:25 <jungleboyj> eharney:  Agreed.
17:00:35 <mszwed> ok
17:00:38 <eharney> it needs to instruct people to run the cinder tempest plugin tests too :/
17:00:52 <jungleboyj> Let's make this the first topic for next week.
17:01:13 <jungleboyj> eharney:  Right, but I don't think we can require that right now as we haven't spread the word on that yet.
17:01:45 <jungleboyj> mszwed: Can you shoot for what I shared above and we will discuss more next week?
17:01:52 <mszwed> sure
17:01:58 <jungleboyj> Great.  Thank you!
17:02:03 <mszwed> :)
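
Spelled out as commands, the interim guidance for the CI amounts to roughly the following. The install step and the cinder-tempest-plugin test regex are assumptions about the plugin's layout, not something confirmed in the meeting:

    # The volume tests from the 3rd-party CI wiki page:
    cd /opt/stack/tempest
    tox -e all -- volume

    # Plus the cinder tempest plugin tests eharney mentions; the plugin
    # must be installed so tempest discovers them (regex is an assumption):
    pip install git+https://opendev.org/openstack/cinder-tempest-plugin
    tempest run --regex cinder_tempest_plugin
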
17:02:08 <jungleboyj> Thanks everyone for joining.
17:02:14 <jungleboyj> Sorry for the network issues I had.
17:02:20 <whoami-rajat> Thanks.
17:02:20 <jungleboyj> #endmeeting