14:00:25 #startmeeting nova_scheduler
14:00:26 Meeting started Mon Dec 5 14:00:25 2016 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:27 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:29 The meeting name has been set to 'nova_scheduler'
14:00:37 Good UGT morning! Who's here?
14:00:40 o/
14:00:41 \o
14:01:36 The agenda is in the usual place: https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:02:07 Hmmm... quiet so far. Have the holiday doldrums started early this year?
14:02:45 3 is a crowd
14:02:48 * _gryf waves late
14:03:59 maybe we need to do some louder pinging of jaypipes and bauzas ?
14:04:33 I really hate that practice. We all have calendars, don't we?
14:04:54 Mine even have alarms!
14:04:57 cdent: U;n gere
14:05:05 cdent: guh, typing.
14:05:10 cdent: I'm here :)
14:05:13 Go home jaypipes - you're drunk
14:05:19 <_gryf> :D
14:05:29 edleafe: no, just on two meetings simultaneously :)
14:05:37 jaypipes: :)
14:05:52 Well, let's get started
14:05:59 * cdent sits comfortably
14:06:02 #topic Specs & Reviews
14:06:27 I listed several of the current bits of code being worked on in the Agenda
14:06:36 Any questions/comments about any of them?
14:06:52 edleafe: not from me.
14:07:07 is nice to see so much going on
14:07:27 My only question was on https://review.openstack.org/#/c/362766/. cdent, is that out of the running for Ocata?
14:07:46 I have no idea.
14:08:25 Well, Matt pointed out that the BP wasn't approved, and then didn't approve it
14:08:34 because he wanted a spec
14:08:39 I read that as "not gonna happen in Ocata"
14:08:50 and then the spec didn't get created because there wasn't consensus on what was wanted
14:09:02 so an etherpad was created on which people were going to decide what they wanted
14:09:06 BPs can be approved before the spec is done
14:09:08 and it seemed to stall there
14:09:32 Link to the etherpad?
14:09:35 edleafe: yeah, just not a priority.
14:09:36 so my interpretation is that the current state of opinion is "meh, we'll wait and see"
14:09:44 https://etherpad.openstack.org/p/placement-optional-db-spec
14:09:44 cdent: ya
14:09:58 #link https://etherpad.openstack.org/p/placement-optional-db-spec
14:10:43 Yeah, it seems to have come full circle: from optional in Newton, mandatory in Ocata, to "meh"
14:10:56 (my current opinion is that we should try hard to make it not matter: placement database wherever it lives should be self healing)
14:11:47 OK, let's move on then
14:11:51 #topic Bugs
14:12:00 cdent: You added https://bugs.launchpad.net/nova/+bug/1647316 this morning
14:12:00 Launchpad bug 1647316 in OpenStack Compute (nova) "scheduler report client sends allocations with value of zero, violating min_unit" [Medium,Triaged] - Assigned to Chris Dent (cdent)
14:12:05 If you like, I can take that
14:12:28 yeah, EmilienM is working on puppet integration and discovered that while doing tests with ceph-based storage
14:12:42 edleafe: have at if you have time/interest
14:12:48 I agree that if the amount is zero, there shouldn't be an allocation created
14:13:17 Done. Assigned it to me
14:13:25 ✔
14:13:32 Any other bugs we need to discuss?
14:13:56 yeah, there's one thing:
14:14:15 ok
14:14:33 this https://review.openstack.org/#/c/395194/ is a proposed fix for https://bugs.launchpad.net/nova/+bug/1635182
14:14:33 Launchpad bug 1635182 in OpenStack Compute (nova) "[placement] in api code repeated need to restate json_error_formatter is error prone" [Low,In progress] - Assigned to Pushkar Umaranikar (pushkar-umaranikar)
14:15:14 however, all the proposed fixes have failed to actually work because webob has proven difficult. so what's needed are some additional eyes to help discern additional strategies
14:15:52 cdent: no disagreements from me on that. I trust you on API-level stuff more than my own opinion :)
14:16:28 * cdent is feeling pretty "meh" about webob
14:16:38 cdent: yeah
14:16:43 not a fan either.
14:16:54 mehs all around
14:17:05 cdent: so my quick impression: is this something that is needed with the nova stack because it is so dense, but your simpler WSGI implementation in placement makes this unnecessary?
14:17:21 anyway, if people have some ideas, that would be great
14:17:29 edleafe: which "this" do you mean?
14:18:06 the json_formatter addition
14:18:29 that's there to get placement api to follow the api-wg errors guideline
14:18:39 nova-api doesn't follow that
14:19:33 cdent: ah, ok. So what is the downside of adding the json_formatter to the exception raising?
14:20:14 boilerplate that's easy to forget. it's not the end of the world, would just be nice if there was a way to do it once
14:20:26 (without awful monkey hacks)
14:21:05 ok, so the objection is one of elegance, not merely necessity
14:21:17 (which I totally get)
14:21:40 elegance + wanting to make sure all exceptions do in fact follow the guideline
14:21:55 s/e$/e elegantly/
14:22:18 it would be totally fair to say "meh, too hard, let's work on other stuff"
14:22:20 I don't see those as different. If they are, that's not very elegant.
:)
14:23:02 #topic Open discussion
14:23:05 to remotable or not to remotable, that is the question: https://review.openstack.org/#/c/404279/
14:23:14 cdent: care to start?
14:23:34 (on json_formatter) If Pushkar is able to come up with something, then cool, otherwise may as well let it die
14:23:42 yeah, without bauzas here it might be hard though
14:24:12 basically most of us here have already said that we think it would be reasonable to get rid of remotable on the resource provider objects that the placement api uses
14:24:23 The idea as I understand it is that we don't need remotable, we don't foresee ever needing remotable, so let's simplify our objects
14:24:28 this is because the intent is that the http api be the one interface
14:24:34 and thus we don't need remotable
14:24:42 cdent: agreed on that point
14:24:54 in irc discussion bauzas objected, essentially saying "but what if we change our mind"
14:24:55 it does seem more like a nova-ism
14:25:05 to which I responded "YAGNI"
14:25:06 with inter-service RPC calls being the norm
14:25:56 I like the idea of removing it if for no other reason than to discourage the sort of calls that are common in Nova
14:25:59 yeah, the idea is that api won't need/want multiple services to perform its function
14:26:10 and thus RPC is out
14:26:25 thus we should remove unused code (even if it "doesn't hurt anything")
14:26:48 the other side of the argument is that placement may grow over time, and we may need RPC within placement
14:26:59 Again, YAGNI seems about right
14:27:56 cdent: yeah, kill the remotable methods, IMHO.
14:28:04 _gryf or macsz any thoughts?
14:28:20 #action edleafe jaypipes _gryf macsz to weigh in on https://review.openstack.org/#/c/404279/
14:28:36 cool
14:28:41 <_gryf> I think that if we need any rpc later, do the appropriate work later :)
14:28:47 Since bauzas isn't here, let's discuss on the review
14:28:51 _gryf: ++
14:28:59 <_gryf> i would focus on non remotable now
14:29:29 Any other topics for Opens?
14:29:49 <_gryf> edleafe, I've sent a patch for Add a retry loop to ResourceClass creation
14:29:54 <_gryf> check it put
14:29:59 <_gryf> *out
14:30:06 _gryf: yes, I saw it. Thanks for pitching in
14:30:16 <_gryf> there are two things
14:30:24 <_gryf> which might bother me and cdent
14:30:37 <_gryf> one is how many retries we would need
14:30:54 <_gryf> and those exceptions
14:30:59 <_gryf> thinkgy
14:31:04 <_gryf> thingy*
14:31:18 * _gryf hands are frozen
14:31:35 I like keeping the # of retries high enough that we'll never hit it, but having it there prevents the infinite loop that cdent was concerned about
14:31:56 <_gryf> edleafe, 100? 10,000 ? :)
14:32:07 I'd love some insights into how to distinguish the duplicate entry
14:32:12 (i recognize that the odds of the loop are really low, but still)
14:32:19 10 seems fine
14:32:31 <_gryf> that was my shot either
14:33:50 Also, I believe that DBDuplicateEntry should always have the 'columns' attribute
14:34:09 so I don't see the need for the hasattr that Matt suggested
14:34:19 * _gryf will check that, otherwise edleafe want to take it cover from here
14:34:34 _gryf: sure, I can pick it back up
14:34:39 <_gryf> cool
14:34:48 I was planning on working on it this morning anyway
14:35:04 was happy to see you already did most of the work :)
14:35:17 <_gryf> anything I can help with, you can ping me on #openstack-nova
14:35:42 The only other thing I think we need is a log message if we hit the retry limit.
14:35:51 that way we know if 10 is too low
14:36:20 <_gryf> +1
14:36:41 I know the calling methods will probably log the error, but this way when we have evidence that 10 is not too low, we can remove that specific log entry
14:37:15 Otherwise we are just guessing as to how often these conflicts will happen
14:37:57 * _gryf also wondering, if that retry count needs a config option
14:38:17 cdent: I think if there is a script to add a bunch of Ironic nodes, and it has to add nested RPs for each, it could be hit that way
14:38:32 _gryf: no, please, not a config option!
14:38:32 :)
14:38:35 _gryf: no config!
14:38:38 jinx
14:38:40 <_gryf> ok… :)
14:38:56 This isn't anything an operator should ever have to think about
14:39:11 It's just a failsafe to prevent infinite loops
14:39:25 :)
14:39:27 So maybe let's make it 100 - we should never hit it then
14:39:35 and 640K will always be enough
14:39:39 :)
14:39:45 <_gryf> yeah :)
14:40:16 OK, I'll get to that later today.
14:40:26 Anyone have anything else for Opens?
14:40:31 no sir
14:40:45 * _gryf neither
14:41:14 OK, thanks everyone!
14:41:15 #endmeeting
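
Editor's note on the retry loop discussed under Open discussion: the shape agreed on in the meeting (a hard-coded retry cap acting only as a failsafe, a check of the duplicate-entry exception's 'columns' attribute to tell collisions apart, and a log message when the cap is hit) might look roughly like the sketch below. This is not the patch under review. DuplicateEntry is a hypothetical stand-in for oslo.db's DBDuplicateEntry so the example runs on its own, and create_with_retry, next_id, insert, MaxRetriesExceeded and the "id" column name are illustrative assumptions, as is the premise that the race is over the generated id.

```python
import logging

LOG = logging.getLogger(__name__)

# Failsafe only: high enough that it should never be reached in practice,
# and deliberately not a config option.
MAX_RETRIES = 100


class DuplicateEntry(Exception):
    """Hypothetical stand-in for oslo.db's DBDuplicateEntry."""

    def __init__(self, columns):
        super().__init__("duplicate entry on %s" % ", ".join(columns))
        self.columns = columns


class MaxRetriesExceeded(Exception):
    """Raised when every attempt collided with a concurrent writer."""


def create_with_retry(name, next_id, insert, max_retries=MAX_RETRIES):
    """Insert a record, choosing a fresh id whenever the id collides.

    next_id() returns a candidate id; insert(candidate, name) performs
    the write and raises DuplicateEntry on a uniqueness violation.
    """
    for _attempt in range(max_retries):
        candidate = next_id()
        try:
            insert(candidate, name)
            return candidate
        except DuplicateEntry as exc:
            if "id" not in exc.columns:
                # The name itself already exists; retrying cannot help.
                raise
            # Another writer took this id first; loop and pick a new one.
    # Logged here so there is evidence if the cap ever proves too low.
    LOG.warning("Gave up creating %s after %d attempts", name, max_retries)
    raise MaxRetriesExceeded(name)
```

Keeping the cap as a module constant rather than an option matches the "no config" position in the log: an operator never needs to tune it, and the warning is the only signal required to revisit the number.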