15:00:21 <bswartz> #startmeeting manila
15:00:22 <openstack> Meeting started Thu Sep 14 15:00:21 2017 UTC and is due to finish in 60 minutes. The chair is bswartz. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:23 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:25 <openstack> The meeting name has been set to 'manila'
15:00:29 <gouthamr> o/
15:00:35 <bswartz> hello all
15:00:38 <cknight> Hi
15:00:39 <ganso> hello
15:00:41 <zhongjun> hi
15:01:29 <bswartz> I'll give people a moment to gather because of possible travel
15:01:34 <tbarron> hi
15:01:50 <bswartz> some of us are here at PTG in Denver, crashing the cinder sessions
15:02:01 <gouthamr> #curling
15:03:00 <xyang1> Hi
15:03:04 <bswartz> also I'm on sketchy wifi so if I drop please be patient
15:03:25 <bswartz> it's been good for the most part but I have seen dropouts this week
15:03:44 <bswartz> #topic announcements
15:03:58 <bswartz> so first of all, next week is Manila PTG
15:04:28 <toabctl> hey
15:04:52 <bswartz> the etherpad link is in the channel topic, and we plan to meet for 2 days, possibly all day, but we'll try to wrap things up early each day if we can for the benefit of people in far away time zones
15:05:13 <bswartz> I plan to use webex for audio because that's worked well in the past
15:05:36 <bswartz> there is sad news from xyang1
15:05:56 <xyang1> Hi, I was just laid off by dell emc
15:06:08 <bswartz> :-(
15:06:08 <tbarron> xyang1: omg! :(
15:06:09 <ganso> xyang1: :(
15:06:16 <zhongjun> oh no
15:06:17 <xyang1> My group no longer needs open source contributors :(
15:06:43 <gouthamr> xyang1: sorry to hear that
15:06:50 <xyang1> If anyone knows of an opportunity, please let me know
15:07:15 <xyang1> My personal email: xingyang105@gmail.com
15:08:17 <bswartz> it sounds like xyang wants to continue her role in openstack, so any companies that want an experienced contributor on staff have a great opportunity to snap her up
15:08:33 <bswartz> we have a packed agenda today though so I'm going to try to move quickly
15:08:38 <dustins> \o
15:08:44 <bswartz> #agenda https://wiki.openstack.org/wiki/Manila/Meetings
15:08:51 <amito-infinidat> o/
15:08:57 <bswartz> #topic Add total count information in our list APIs
15:09:05 <bswartz> zhongjun: you're up
15:09:08 <zhongjun> bswartz: thanks
15:09:12 <zhongjun> #link https://review.openstack.org/#/c/501934/
15:09:12 <zhongjun> This feature can be used for showing how many resources a user or tenant has in the web portal's summary section.
15:09:12 <zhongjun> It was already discussed in the cinder meeting
15:09:24 <xyang1> In addition to openstack, I have also started contributing to kubernetes. So I am interested in opportunities there as well. Thanks!
15:09:33 <zhongjun> The cinder proposal will match the API WG guidelines
15:09:33 <zhongjun> Will we also follow the API WG proposal?
15:09:45 <bswartz> link?
15:09:55 <zhongjun> #link https://github.com/openstack/api-wg/blob/64e3e9b07272f50353429dc51d98524642ab6d67/guidelines/counting.rst#L12
15:10:21 <bswartz> I think this is a good idea
15:10:50 <tbarron> #link https://review.openstack.org/#/c/500665/
15:10:58 <bswartz> it amounts to a performance optimization, and though it's not great REST, if there's a standard within the community that's good enough for me
15:11:26 <markstur> xyang1: :(
15:12:01 <gouthamr> zhongjun: thanks for bringing this in. I like it too, tommylikehu and you should propose this to api-sig
15:12:31 <tommylikehu> gouthamr: propose what?
15:12:47 <bswartz> gouthamr: it sounds like the proposal comes FROM the API WG
15:13:00 <bswartz> I wasn't aware of it, but it seems reasonable to me
15:13:00 <gouthamr> ? not that i'm aware of..
15:13:07 <gouthamr> ohh
15:13:07 <tommylikehu> xyang1: :)
15:13:08 <zhongjun> bswartz: okay, so we'll just continue to do this work
15:13:09 <gouthamr> yes
15:13:48 <bswartz> okay, I also don't expect this to be a ton of work to code, or to review
15:13:58 <bswartz> we will want functional test coverage of course
15:13:59 <tbarron> +1
15:14:12 <zhongjun> bswartz: sure
15:14:18 <bswartz> and I recommend writing a manila spec which largely just refers to the API WG document
15:14:31 <gouthamr> +1
15:14:48 <bswartz> the manila spec should enumerate all of the object types for which counts will be added
15:15:28 <bswartz> okay, next topic
15:15:35 <zhongjun> bswartz: it's a simple spec, just adding some manila API description
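
A minimal sketch of the response shape the API WG counting guideline describes, where a with_count=true query parameter adds a filtered total alongside the (possibly paginated) page. The helper and field names below are illustrative assumptions, not Manila's merged implementation:

    # Illustrative only: the list body gains a top-level 'count' when the
    # caller asks for it, independent of limit/offset pagination.
    def build_list_response(resource_name, page, total_count=None):
        """Build a list API body; include 'count' only when requested."""
        body = {resource_name: page}
        if total_count is not None:
            # 'count' is the filtered total, so a dashboard can render
            # summaries like "showing 2 of 1234 shares" cheaply.
            body['count'] = total_count
        return body

    # e.g. GET /v2/{project_id}/shares?limit=2&with_count=true
    print(build_list_response(
        'shares',
        [{'id': 'share-1'}, {'id': 'share-2'}],
        total_count=1234))
    # -> {'shares': [{'id': 'share-1'}, {'id': 'share-2'}], 'count': 1234}
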
15:15:42 <bswartz> #topic Register and Document Policy in Code
15:15:54 <bswartz> this is zhongjun again
15:16:03 <zhongjun> bswartz: Our current policy system is a little chaotic
15:16:03 <zhongjun> Do we need to rewrite those policies:
15:16:22 <zhongjun> #link: https://etherpad.openstack.org/p/manila-ptg-queens
15:16:27 <bswartz> so I haven't been talking to people about this TC goal here in Denver
15:16:35 <bswartz> errr
15:16:46 <bswartz> I _have_ been talking to people about this TC goal here in Denver
15:16:59 <zhongjun> bswartz: Or we could just implement the policies in code and keep the original policy content as a first step.
15:17:16 <bswartz> this is something we clearly want to do
15:17:17 <zhongjun> bswartz: ?
15:17:44 <zhongjun> bswartz: oh, you talked with people about this
15:17:47 <bswartz> the general agreement is to migrate the existing policies into code first
15:17:49 <tbarron> separating those concerns seems like a good idea: policy in code, revamp policy content
15:18:09 <bswartz> save any changes to the policies themselves for later changes
15:18:25 <zhongjun> bswartz: okay, we should just follow those steps
15:19:06 <bswartz> we don't need to go into great detail here because the TC goal doc is available, and zhongjun has volunteered to do the work
15:19:28 <bswartz> we could revisit this topic at PTG if anyone has issues
15:19:50 <bswartz> for this one I'm not sure we need a spec
15:20:01 <bswartz> would a manila-specific spec any any value here?
15:20:12 <bswartz> s/any/add/
15:21:01 <gouthamr> redo policies = new spec <--- but this can wait as mentioned earlier
15:21:17 <bswartz> I can't think of a reason to, so let's just do this work as part of the TC goal program
15:21:23 <bswartz> next up
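
A minimal sketch of the policy-in-code pattern the TC goal asks for, using oslo.policy's DocumentedRuleDefault so defaults and documentation live in the source tree. The rule names and check strings below are placeholders, not Manila's final policy set:

    from oslo_policy import policy

    BASE_POLICY_NAME = 'share:%s'

    # Placeholder rules: the actual work is enumerating Manila's existing
    # policy.json entries and carrying their current check strings over
    # unchanged, leaving any changes to policy *content* for later.
    share_policies = [
        policy.DocumentedRuleDefault(
            name=BASE_POLICY_NAME % 'get_all',
            check_str='rule:default',
            description='List shares.',
            operations=[{'method': 'GET', 'path': '/shares'}]),
        policy.DocumentedRuleDefault(
            name=BASE_POLICY_NAME % 'delete',
            check_str='rule:admin_or_owner',
            description='Delete a share.',
            operations=[{'method': 'DELETE', 'path': '/shares/{share_id}'}]),
    ]


    def list_rules():
        # Aggregated by a top-level policies module and fed both to the
        # enforcer (register_defaults) and to the sample-policy generator.
        return share_policies
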
15:21:32 <bswartz> #topic Dynamic Log Level
15:21:48 <tbarron> #link https://review.openstack.org/#/c/445885/
15:21:51 <zhongjun> bswartz: Do we need to add a REST API to control services' log levels dynamically?
15:21:52 <tbarron> ^^ cinder
15:22:06 <zhongjun> tbarron: thanks
15:22:16 <bswartz> ugh
15:22:23 <tbarron> ?
15:22:24 <bswartz> I don't like that idea
15:22:36 <tbarron> it seems quite useful, why not?
15:22:40 <bswartz> why can't we just reread the conf file for these options?
15:23:00 <tommylikehu> bswartz: reread and restart?
15:23:07 <ganso> bswartz: I believe the benefit is not having to restart the services
15:23:18 <bswartz> it's a bad idea to have things in the conf file overridden by API calls, because after a restart they will revert back
15:23:19 <zhongjun> bswartz: how would we reread it?
15:23:46 <tbarron> maybe a PTG topic then
15:23:50 <ganso> I've heard about the oslo guys implementing something that allows reloading CONF files
15:23:51 <bswartz> we need the feature cinder has to update the conf without restarting
15:23:54 <tommylikehu> it does not matter because we can get the latest values.
15:24:16 <tbarron> long conversation here
15:24:30 <bswartz> okay, should we postpone this one to PTG?
15:24:33 <zhongjun> bswartz: Oslo.config has supported this for a few cycles now, but it still has some problems
15:25:01 <zhongjun> bswartz: It could be hard to manage log levels when we have multiple nodes and multiple services.
15:25:18 <bswartz> my stance is that this can and should be achieved by rereading the conf file without restarting, but I'm open to hearing reasons why that's a bad idea
15:25:51 <bswartz> zhongjun: no harder than keeping the rest of the conf files consistent across a deployment
15:25:57 <zhongjun> bswartz: we'd no longer be sure what log level a service is running at a given time.
15:26:13 <ganso> zhongjun: why?
15:26:15 <bswartz> I assume that's a solved problem under any decent deployment tool
15:27:10 <bswartz> I'm familiar with how puppet keeps config files synched -- I presume other technologies work at least as well
15:27:17 <tbarron> PTG topic please
15:27:20 <tbarron> full agenda
15:27:22 <bswartz> okay moving on....
15:27:33 <zhongjun> ganso: there may not be a way to get the current log level from a running service
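
Since the conf-reread alternative comes up repeatedly above, here is a rough sketch of how it can work with oslo.config's mutable options: on SIGHUP the process re-reads its config files and only options declared mutable=True pick up the new values, without a restart. The option name here is illustrative, not an existing Manila option:

    import signal

    from oslo_config import cfg

    CONF = cfg.CONF
    CONF.register_opts([
        cfg.StrOpt('log_level',
                   default='INFO',
                   mutable=True,   # eligible for runtime mutation
                   help='Illustrative per-service log level knob.'),
    ])


    def _reload_config(signum, frame):
        # Re-read the config files given on the command line and apply
        # changes to mutable options without restarting the process.
        CONF.mutate_config_files()


    if __name__ == '__main__':
        CONF(project='manila')
        signal.signal(signal.SIGHUP, _reload_config)
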
15:27:40 <bswartz> #topic Install guide testing
15:27:45 <zhongjun> tbarron: ok
15:27:55 <bswartz> So this came up in cinder yesterday
15:28:18 <bswartz> since the docs are now in our repo, there is an expectation that we QA the install guides
15:28:49 <bswartz> the install guides are a pretty bare-bones way to install openstack -- without any deployment tools
15:29:17 <bswartz> testing them is as easy as reading the doc, following the instructions, and seeing if you end up with a working manila installation
15:29:28 <bswartz> the only downside is that it's a manual process
15:29:45 <bswartz> and there are slightly different instructions per Linux flavor
15:30:23 <bswartz> because the package names and the directory paths vary slightly between centos/ubuntu/opensuse
15:30:51 <bswartz> if we find any bugs we should file them in LP and fix them like any code bugs
15:31:40 <bswartz> and we may want to consider cleaning up those docs and reducing duplication if it exists (haven't checked yet) by using sphinx includes
15:32:28 <bswartz> also, bugs in the install guide should be fixed and backported to pike, so the sooner we can get install guide testing done the better
15:32:35 <zhongjun> bswartz: Is there a link?
15:33:08 <bswartz> I'm looking for volunteers to test each platform
15:33:10 <bswartz> #link https://docs.openstack.org/manila/pike/install/
15:34:17 <bswartz> okay, I just wanted to mention that
15:34:26 <bswartz> we might be able to do some of that work during PTG
15:34:26 <gouthamr> there are a bunch of new bugs showing up
15:34:36 <bswartz> or before PTG if people can find time
15:34:45 <bswartz> I'll move on
15:34:50 <tbarron> let's settle at PTG, but I can probably look at the CentOS part
15:34:58 <bswartz> #topic Automatic generation of docs for configuration options
15:35:11 <bswartz> this is another docs topic that came up here in denver
15:35:24 <zhongjun> I can probably look at the ubuntu part :)
15:35:57 <bswartz> there are now tools to autogenerate the tables of config opts from the latest live source code for documentation purposes
15:36:33 <bswartz> so a change to a manila config option could be automatically reflected in the documentation when it merges
15:37:17 <bswartz> the Cinder guys will be updating their docs to use this mechanism, and I suggest we follow suit
15:37:40 <tbarron> +1
15:37:55 <bswartz> from what I heard yesterday we might need to add some small code hooks for the generator so it knows what options are relevant to each driver (for example)
15:38:16 <bswartz> but it sounds like a very cool sphinx plugin
15:38:23 <bswartz> on to my last topic
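
The "small code hooks" mentioned above would most likely follow the standard oslo.config generator pattern: a list_opts() callable returning (group, options) pairs, advertised through an oslo.config.opts entry point that the doc tooling can walk. Everything below is an illustrative assumption about the shape of such a hook, not Manila's actual option layout:

    import itertools

    from oslo_config import cfg

    # Illustrative option lists; in practice these are imported from the
    # modules that already define them (drivers, managers, etc.).
    example_manager_opts = [
        cfg.IntOpt('example_retry_count', default=3,
                   help='Illustrative manager option.'),
    ]
    example_driver_opts = [
        cfg.StrOpt('example_backend_host',
                   help='Illustrative driver option.'),
    ]


    def list_opts():
        # The generator and the generated doc tables walk these pairs to
        # learn which options belong to which group (or driver).
        return [
            ('DEFAULT', itertools.chain(example_manager_opts)),
            ('example_backend', itertools.chain(example_driver_opts)),
        ]

    # setup.cfg hook (illustrative):
    # [entry_points]
    # oslo.config.opts =
    #     manila.example = manila.opts:list_opts
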
15:38:30 <bswartz> #topic Timeouts
15:38:58 <bswartz> I haven't made much progress with the infra folks here
15:39:05 <bswartz> #link https://review.openstack.org/#/c/493092/
15:39:46 <bswartz> they seem allergic to the idea of increasing our job timeouts
15:40:06 <tbarron> should we all just add more timed-out job failures here?
15:40:14 <bswartz> on the plus side they do seem committed to trying to fix the underlying problem
15:40:53 <bswartz> well, there was a time when they claimed the underlying problem was fixed, so I abandoned my change
15:41:03 <bswartz> then the timeouts came back, so I restored it
15:41:15 <bswartz> it's possible they'll fix another underlying problem
15:41:26 <bswartz> but we should keep collecting evidence
15:41:36 <tbarron> to recognize these, look for "timeout -s 9" in the console log
15:41:40 <tbarron> nonvoting jobs too
15:41:46 <bswartz> links to job logs of timed-out jobs serve 2 purposes:
15:41:51 <tbarron> e.g. http://logs.openstack.org/66/502666/8/check/gate-manila-tempest-dsvm-mysql-generic-ubuntu-xenial-nv/35914d5/console.html#_2017-09-13_14_08_22_943668
15:41:59 <gouthamr> bswartz tbarron: We were always bordering on that timeout we had in project-config
15:42:02 <bswartz> 1) they help infra narrow down where the real problem lies
15:42:25 <gouthamr> bswartz tbarron: in the past we worked around it by splitting tests into multiple classes
15:42:43 <bswartz> 2) having a large amount of evidence supports our case that manila is disproportionately affected by this issue
15:42:57 <gouthamr> i don't think infra's slow-nodes issue has anything to do with our request to increase the timeout in general
15:43:04 <gouthamr> so we can get more time to run the tests
15:43:16 <gouthamr> however, slow nodes take 40+ minutes to Devstack smh
15:43:29 <bswartz> well, I can modify my patch to request a more modest increase with a different justification
15:44:19 <gouthamr> s/has anything/should have anything
15:44:23 <bswartz> in that case, we should find some examples of jobs passing with very little margin before the timeout, that are NOT on the so-called "very slow nodes"
15:44:37 <tbarron> gouthamr: isn't there an overall time budget, and if it takes too long to build then the tests have even less time to run?
15:44:48 <bswartz> tbarron: exactly
15:44:57 <gouthamr> tbarron: yes.. that's the one bswartz is toggling in his patch
15:45:04 <bswartz> we have less margin for error than other projects' jobs
15:45:23 <gouthamr> tbarron: imo the overall budget was low... and limiting our enthusiasm to add a ton of tests
15:45:36 <tbarron> gouthamr: I see, you are just saying that we don't need to bump the actual *test* timeout vs the overall budget time
15:45:51 <bswartz> is there a second timeout?
15:46:01 <gouthamr> tbarron: yep... there are three timeouts
15:46:15 <bswartz> I'm talking about the job timeout before zuul simply aborts
15:46:36 <gouthamr> bswartz: an overall zuul job timeout (this one is funny, 10 minutes are reserved for post operations inside this timeout)
15:46:44 <gouthamr> a test suite timeout
15:46:55 <bswartz> can we change the other timeouts on our own?
15:46:56 <gouthamr> and a per-test resource wait timeout
15:47:19 <bswartz> oh yeah, we can change the resource wait timeouts, but those only tend to matter during failures
15:47:39 <gouthamr> bswartz: yes... the third one is a tempest timeout option, the second one: there's a DEVSTACK_GATE opt for that
15:47:49 <ganso> resource failures should be kept at default, as increasing them will largely impact the duration of tests
15:48:25 <bswartz> ganso: we could decrease them and expect fewer timeouts overall, albeit more failures
15:48:28 <ganso> s/resource failures/resource wait timeouts
15:48:33 <gouthamr> bswartz: remember that these are nested timeouts
15:48:55 <gouthamr> bswartz: so your patch needs to be accepted imo
15:49:03 <bswartz> we may need to revisit this topic next week
15:49:23 <bswartz> my request is just to keep a list of links to jobs that demonstrate the problem
15:49:40 <tbarron> anyone have elastic search fu to catch all these automatically?
15:49:49 <zhongjun> gouthamr: Do we have many nested timeouts?
15:49:54 <bswartz> my logstash-fu is not good enough to simply query a list of all of the failures we're interested in
15:49:55 <gouthamr> yes
15:49:59 <ganso> bswartz: usually if something fails due to resource wait timeouts, then something is very wrong with the deployment and more stuff is likely to fail
15:50:09 <ganso> bswartz: s/stuff/tests
15:50:20 <gouthamr> i have a base link, but can't toggle the logstash options via API
15:50:47 <tbarron> we want a query that shows where setup time plus test time runs up against the overall timeout
15:50:56 <bswartz> ganso: yes, I'm less concerned about the failure cases than the success cases
15:51:13 <tbarron> these are the cases we can't address w/o infra help
15:51:32 <bswartz> if it's broken you're going to have to push another patch anyway -- what I want to get away from is rechecks
15:51:49 <bswartz> yes
15:52:02 <bswartz> okay, before we run out of time
15:52:06 <bswartz> #topic open discussion
15:52:10 <bswartz> anything else for today?
15:52:30 <tbarron> the ocata driverfixes branch is proposed
15:52:43 <bswartz> link?
15:52:44 <tbarron> #link https://review.openstack.org/#/c/504032/
15:53:21 <bswartz> ack, I will review that
15:53:55 <gouthamr> zhongjun: see https://review.openstack.org/#/c/493092/ <--- this is to extend the Zuul job timeout, which is essential
15:53:55 <gouthamr> tbarron: +1 thanks
15:53:57 <gouthamr> tbarron: i'm talking to eharney about our unit test issues
15:54:13 <bswartz> alright, thanks everyone
15:54:14 <gouthamr> he thinks we can get away without changing requirements and zuul..
15:54:30 <zhongjun> gouthamr: thanks
15:54:31 <tbarron> gouthamr: thanks
15:54:32 <gouthamr> he's trying to get unittest jobs running on driverfixes branches in cinder
15:54:44 <bswartz> and thanks to PTG wifi for not failing during this meeting
15:54:52 <gouthamr> yeah, good wifi
15:54:55 <amito-infinidat> indeed
15:55:06 <bswartz> I'll see you next Wednesday morning for PTG
15:55:21 <bswartz> #endmeeting