14:00:15 #startmeeting nova_scheduler
14:00:16 Meeting started Mon Apr 10 14:00:15 2017 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:17 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:19 The meeting name has been set to 'nova_scheduler'
14:00:25 o/
14:00:28 Good UGT Morning, all!
14:00:35 o/
14:00:42 ya, here.
14:00:45 \o
14:00:50 o/
14:01:29 well, let's get into it.
14:01:30 #topic Specs & Reviews
14:01:45 \o
14:01:47 Traits
14:01:48 #link Add Traits API to Placement https://review.openstack.org/#/c/376200/
14:02:35 That looks like it's ready to go
14:02:50 Any comments?
14:03:11 apparently not
14:03:15 Next up: os-traits
14:03:16 #link os-traits reorg: https://review.openstack.org/#/c/448282/
14:03:45 I added my automation on top of the NIC code
14:03:52 ed's automation and de-register +1
14:04:27 Again, this series looks ready to go
14:04:32 Next up: Placement API
14:04:33 #link Improvements to PUT and POST https://review.openstack.org/#/c/447625/
14:04:58 looks like it just needs its +W back
14:05:03 That's got a ton of +2s, but no +W
14:05:07 Any reason why?
14:05:31 oh, wait - the rechecks
14:05:42 It did, but got rebased
14:05:45 bauzas had +W'd it earlier
14:06:05 kaboom
14:06:10 thx
14:06:13 or rather +W'aboom
14:06:27 heh
14:07:24 Moving on: Claims in placement spec
14:07:24 #link WIP placement doing claims: https://review.openstack.org/437424
14:07:27 #link Forum session proposed for claims in the scheduler: http://forumtopics.openstack.org/cfp/details/63
14:07:30 We'll be having a hangout in a few hours to discuss this in more depth to help move it forward
14:08:37 i need to follow up with fifield
14:09:50 So please join the hangout at 1600 UTC. mriedem will post the link in -nova, right?
14:10:07 umm,
14:10:11 it's super secret
14:10:19 but i can
14:10:30 FWIW, I haven't yet updated the spec, holding until the clarification
14:10:34 We Follow the 3.5 Opens!
14:11:05 :)
14:11:27 ok, next up: Show scheduler hints in server details
14:11:28 #link Show sched. hints in server details: https://review.openstack.org/440580
14:11:30 There are mixed opinions on whether this belongs in all server detail responses, or only when specifically requested.
14:11:40 if we did this,
14:11:45 i agree it should be a subresource as sdague noted
14:11:50 IMO it's unnecessary clutter
14:11:55 it sounds like gibi wants to abandon now though
14:12:12 ah, ok. Hadn't looked in a few days
14:13:15 unless something changes in the next couple of days,
14:13:19 it will probably die on the vine for pike
14:13:55 Moving on: Nested Resource Providers
14:13:55 #link Nested RPs: https://review.openstack.org/#/c/415920/
14:13:56 The Nested Resource Provider series is still on hold until traits is done, but will this effort be revived soon?
14:14:04 jaypipes: ^^
14:14:24 since traits is pretty darn close
14:14:48 edleafe: it's rebased locally, but will need to be rebased again with the 1.6 microversion that the traits API adds.
14:15:19 edleafe: I'd really like to get the os-traits patches merged.
14:15:50 jaypipes: there don't seem to be any objections to them, right?
14:16:03 what do they need to get over the finish line?
14:16:06 edleafe: just needs core reviews AFAIK
14:16:19 ok, cores, you know who you are!
14:16:37 i don't
14:16:49 i've starred that last change, i haven't reviewed any of the traits stuff,
14:16:57 but i can try to get to it with some gentle prodding
14:17:00 this afternoon
14:17:21 * edleafe gets out the virtual cattle prod
14:17:30 I think you need a sheep prod, if you want gentle
14:18:29 I've got a spec to add to the list, which i forgot to put on the agenda, once we get to the end
14:19:10 OK, then, last agenda item: Use local-scheduler spec
14:19:10 #link Add use-local-scheduler spec https://review.openstack.org/#/c/438936/
14:19:13 Seems like a bit of a misnomer - more like "merge scheduler into conductor"
14:19:40 i think that one is premature for pike, last i checked
14:20:03 agreed
14:20:06 well, it could be good to have it for cellsv2
14:20:14 it's in "backlog" so okay
14:20:25 but yeah, pike seems early
14:20:26 oh, didn't realize it was backlog
14:20:43 i moved it
14:20:51 it seems we are not yet ready for this, either way
14:20:55 agreed or not
14:20:55 the recent discovery of the instance info upcalls from compute scared me for this one
14:21:26 kept the comments, will address them later, after the spec freeze
14:21:51 ok, then - cdent, you had a spec to add?
14:22:03 yeah, wrote it up last week after realizing my api change needed one
14:22:11 #link placement idempotent put resource class: https://review.openstack.org/#/c/453732/
14:22:33 there's already code for that. the change is relatively straightforward, and it has a small but relevant data integrity impact
14:23:53 OK, then let's get some specs-core reviewers to go over that one
14:24:05 Anything else for specs & reviews?
14:24:16 so this is just create-or-update, right?
14:24:24 mriedem: yes
14:24:37 just to mention, there are pending placement-api-ref changes in progress
14:24:49 get your reviews in early if you want to help set the tone or whatever
14:25:02 it's probably going to be a many-stage process
14:25:23 #link placement-api-ref https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:cd/placement-api-ref
14:26:01 mriedem: actually it's not really update, more create-or-verify, as there's no body
14:26:38 "make sure this exists"
14:27:22 cdent: be sure to add those to the agenda if there are any issues
14:27:45 Anything else for specs & reviews?
14:27:47 edleafe: none, yet, but will
14:27:53 <_gryf> edleafe, regarding the "Provide detailed error information for placement API" (https://review.openstack.org/#/c/418393/) blueprint, it seems that I will not be able to finish this
14:28:55 _gryf: is it a time constraint, or an issue with the BP itself?
14:29:08 <_gryf> edleafe, neither (or both)
14:29:19 :)
14:29:28 _gryf isn't working on openstack anymore, right?
14:29:41 so it's a person constraint
14:29:48 mriedem: yes
14:29:48 <_gryf> edleafe, I've been assigned to different tasks within my division
14:30:00 _gryf: ah, understood.
14:30:17 * cdent shakes his tiny fist
14:30:31 <_gryf> so I simply have no time to finish it.
14:30:41 * _gryf is really sorry
14:30:54 _gryf: don't be sorry
14:30:57 never.
14:31:02 just to be clear, my fist is shaking at capitalism (or something), not _gryf
14:31:15 cdent: it's life, and life can be hard
14:31:18 well, it looks like a worthwhile BP.
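(Editor's note: the "create-or-verify" PUT semantics cdent describes above — a bodyless PUT to a resource class URL that means "make sure this exists" — can be sketched as below. The function, the in-memory store, and the exact status codes are an illustrative assumption here, not the actual placement handler code under review.)

```python
def put_resource_class(store, name):
    """Sketch of an idempotent, bodyless PUT /resource_classes/{name}.

    Returns an HTTP-style status code:
      201 if the resource class was created,
      204 if it already existed (verify only, nothing changes),
      400 if the name is not a custom resource class.
    """
    if not name.startswith("CUSTOM_"):
        return 400  # only custom classes may be created by clients
    if name in store:
        return 204  # already exists: repeat PUTs are a no-op
    store.add(name)
    return 201


store = set()
assert put_resource_class(store, "CUSTOM_GOLD_SSD") == 201  # created
assert put_resource_class(store, "CUSTOM_GOLD_SSD") == 204  # idempotent repeat
assert put_resource_class(store, "VCPU") == 400             # standard class, rejected
```

The data-integrity point from the discussion falls out of the no-body design: since the request carries nothing to overwrite, a concurrent or repeated PUT can never clobber existing state.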
14:31:35 * bauzas is teaching his 6-yr-old daughter about such things
14:31:41 if we could get closure on the spec,
14:31:44 and agreement to a plan,
14:31:48 then that would be a good clean break,
14:31:52 and someone else could pick it up
14:32:05 having said that, i haven't reviewed it
14:32:09 we could put that forward for the nova 101 session at the Forum
14:32:45 I guess some folks could be interested in working on operator-related issues
14:32:48 * johnthetubaguy wonders about the consistent API errors work, and the logging working group
14:32:59 that too, of course, but let's not diverge
14:33:11 I just think that blueprint is worth being shared with volunteers
14:33:27 well, that is one of the consistent API errors bits; just thinking about coordinating those efforts
14:33:42 johnthetubaguy: +2
14:33:47 an API-WG spec is maybe enough, just curious
14:33:55 sorry, didn't mean to sound that enthusiastic
14:34:01 I really just meant +1
14:34:06 * edleafe has fat fingers
14:34:06 lol
14:34:42 paying more is always appreciated :)
14:35:20 johnthetubaguy: got your point, maybe the above doesn't require a spec then?
14:35:35 it needs a spec, it's an API change
14:35:50 just curious about coordination
14:35:53 sorry, I'm unclear
14:35:57 but we should move on if no one has an update
14:36:25 johnthetubaguy: something for the api-wg meeting?
14:36:35 probably
14:36:36 I was confused by the fact that I thought we had a spec covering that already
14:37:10 OK, moving on then
14:37:14 #topic Bugs
14:37:21 None added to the agenda
14:37:35 Anyone have a bug that they would like to focus on?
14:37:38 i sort of have something
14:37:48 due to time,
14:37:58 i haven't written a functional test for https://bugs.launchpad.net/nova/+bug/1679750 but someone could
14:37:59 Launchpad bug 1679750 in OpenStack Compute (nova) "Allocations are not cleaned up in placement for instance 'local delete' case" [Undecided,New]
14:38:20 we have some local delete functional tests now under nova/tests/functional/regressions that could be copied as a start
14:38:31 not a huge issue,
14:38:48 plus, i guess it depends on whether we actually decide to delete allocations in placement from the api during a local delete scenario
14:39:21 we could also handle cleaning up the allocations when the host comes back up
14:39:36 which would keep nova-api from needing to have a dependency on placement, which might be ideal
14:40:04 mriedem: stuff like this was the reason for the periodic updates, no?
14:40:07 anywho, ^ is a quagmire for someone to drown in
14:40:23 edleafe: maybe?
14:40:36 we kind of know that things can get out of sync from time to time
14:40:36 standard cleanup quandary, it would seem... :)
14:40:37 however,
14:40:51 i assume the periodic updates in the compute rely on actually having instances running on that host, right?
14:41:03 since those are tied to the allocations as the consumer_id
14:41:19 if we delete the instances in the api, then i doubt anything in the compute is looking up those deleted instances and cleaning up allocations
14:41:20 the allocations for the entire host are retrieved as well, and compared
14:41:22 they are zombies
14:41:33 if this is a non-issue, i'm happy
14:41:34 (if I recall correctly, ymmv, etc)
14:41:58 that's something that has been kind of hand-wavy.
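(Editor's note: the "zombie" allocations described above are records whose consumer_id points at an instance that was deleted via the API while its compute host was down. The cleanup-on-host-restart idea amounts to comparing the host's allocations against the instances the host still knows about. A minimal sketch of that comparison, with hypothetical data shapes rather than nova's actual audit code:)

```python
def find_orphaned_allocations(host_allocations, live_instance_uuids):
    """Return consumer_ids that hold allocations on this host but no
    longer correspond to a live (non-deleted) instance.

    host_allocations: dict mapping consumer_id -> allocation records
    live_instance_uuids: set of instance UUIDs the compute host knows about
    """
    return {consumer_id for consumer_id in host_allocations
            if consumer_id not in live_instance_uuids}


# Example: inst-b was locally deleted via the API while the host was down.
allocations = {
    "inst-a": {"VCPU": 2, "MEMORY_MB": 2048},
    "inst-b": {"VCPU": 1, "MEMORY_MB": 512},
}
live = {"inst-a"}
orphans = find_orphaned_allocations(allocations, live)
assert orphans == {"inst-b"}  # these allocations should be removed from placement
```

Running this reconciliation on the compute side when the host comes back up is what keeps nova-api free of a direct dependency on placement, per the discussion.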
One of the reasons we want to add a bunch of functional tests that cover these scenarios
14:41:58 but i'm fairly sure we don't have a functional test for the scenario to be able to tell
14:42:07 jinx
14:42:17 hence why i was going to write that first
14:42:20 but time
14:42:23 thank goodness edleafe said it; it would be bad form to call jinx on other people
14:42:47 ok, i'm done
14:42:49 #agreed Bad form to call jinx on other people
14:43:06 #topic Open discussion
14:43:20 So... what's on your mind?
14:44:27 although i plan to be lurking for a bit, i do plan to work on nova related things going forward, as part of osic; most likely scheduler
14:44:40 so hi everyone
14:44:45 * johnthetubaguy waves at jimbaker
14:44:53 yay, new people
14:44:58 \o/
14:44:59 * edleafe hands jimbaker a cookie
14:45:22 thanks everyone!
14:45:56 cdent: I spoke to jimbaker about your email on the placement performance
14:46:13 ah, excellent
14:46:14 yep, so that looks like a good way into the project
14:46:21 touches a few things
14:46:40 I think as a first pass just observing placement under load to see how it behaves is a great start
14:46:53 I've never taken it beyond a single node :(
14:47:17 hell,
14:47:22 then from those observations some more rigorous testing ought to reveal itself
14:47:30 comparing boot times between the filter scheduler with placement and without would be a nice start
14:47:31 I mentioned fake virt; no idea how good that is right now for this purpose
14:47:33 johnthetubaguy asked me to look at workload characterization with respect to placement performance - so i guess we will look at more than one node after all
14:47:40 mriedem: +1
14:48:35 i'm not sure if rally/osprofiler would tell us enough info before/after using placement in the scheduler or not
14:48:41 but you could do that with newton pretty easily
14:48:58 as placement in the scheduler is optional in newton; if it's not there, the scheduler falls back to the pre-placement code
14:49:03 true
newton does both, good idea
14:49:12 doing this comparison in ocata would be tricky
14:49:39 got it
14:49:58 ping me if you need details on how to fake that out for a test run in newton
14:50:13 i think even single node to start would be fine
14:50:19 no
14:50:23 ocata does both
14:50:27 mriedem, thanks. and in general, i think instrumentation is a good thing. i'm sure i will get it wrong at first
14:50:40 newton can do placement, but with fewer features
14:50:48 bauzas: oh, that's right
14:50:49 good point
14:50:53 right, the filter scheduler doesn't start requesting providers until ocata
14:50:54 jimbaker: yeah, sorry, so start with ocata
14:50:59 while ocata can exactly do both, if you run newton computes
14:51:09 ocata will fall back if placement isn't available
14:51:16 or not new enough
14:51:17 but my general sense of things is that better instrumentation results in better implementations, and a better ecosystem in general
14:51:23 ah, so just kill the endpoint in keystone to fall back
14:51:35 oh man, jimbaker has the exact same pidgin color as mriedem
14:52:05 johnthetubaguy: or goof the placement creds in nova.conf
14:52:06 hashing to a small number of bits... got to love it
14:52:15 anyway
14:52:16 mriedem: that's simpler
14:52:19 we can discuss the details later
14:52:31 again, thanks for the ideas, and i know who to find here!
14:52:35 johnthetubaguy: I'd just use newton computes with ocata scheduler
14:52:37 reminds me i need to figure out how to get a new devstack vm in the new job
14:52:41 bench them
14:52:47 * macsz has the same color problem with mriedem, johnthetubaguy and _gryf
14:52:55 and then start deploying placement and roll up ocata computes
14:53:25 bauzas: that seems much more complicated for someone that's new
14:53:31 bauzas: that sounds complicated
14:53:36 throwing upgrades into the mix would just make this harder
14:53:41 fair enough
14:53:49 jimbaker: we can talk in -nova after the meeting
14:53:53 which should end right now!
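(Editor's note: the before/after comparison proposed above — boot instances with placement in the scheduling path, then break the placement endpoint or creds so the ocata scheduler falls back, and compare boot times — needs only a simple timing harness. The sketch below is a generic stand-in: `time_runs` is a hypothetical helper, and the two boot functions are placeholders for driving real boots, not nova code.)

```python
import statistics
import time


def time_runs(fn, runs=5):
    """Time repeated calls of fn; return (mean, stdev) in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


# Placeholders for "boot an instance" in each configuration; in a real
# run these would call out to nova with placement enabled vs. disabled.
def boot_with_placement():
    time.sleep(0.001)


def boot_without_placement():
    time.sleep(0.001)


mean_w, dev_w = time_runs(boot_with_placement)
mean_wo, dev_wo = time_runs(boot_without_placement)
print(f"with placement:    {mean_w:.4f}s +/- {dev_w:.4f}")
print(f"without placement: {mean_wo:.4f}s +/- {dev_wo:.4f}")
```

Reporting a standard deviation alongside the mean matters here, since single-node boot times are noisy enough that a one-shot comparison could easily mislead.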
14:54:03 makes it hard to switch back to retest, but there are lots of ways
14:54:04 yup, 3 hours of meetings back to back
14:54:09 mriedem, +1
14:54:09 hey, we still have 6 minutes!
14:54:15 mriedem: good hinting
14:54:19 and I'm just at the end of my first hour
14:54:26 * edleafe was just letting you guys hash it out
14:54:26 so, please save my soul
14:54:30 #endmeeting