14:00:13 #startmeeting nova_scheduler 14:00:14 Meeting started Mon Jul 17 14:00:13 2017 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot. 14:00:15 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 14:00:18 The meeting name has been set to 'nova_scheduler' 14:00:21 o../ 14:00:23 Anyone around? 14:00:36 o/ 14:02:11 * bauzas waves hand 14:02:20 o/ 14:02:21 * edleafe waves foot 14:02:37 i took me many seconds to go from /join to joined 14:02:39 edleafe: going all Daniel Day Lewis today? 14:02:44 * cdent shakes his tired tiny fist once more 14:02:53 ok, let's get started 14:02:59 #topic Specs & Reviews 14:03:08 #link Amend spec for Custom Resource Classes in Flavors https://review.openstack.org/#/c/481748/ 14:03:20 This seems ok, but still a little fuzzy on the details 14:03:47 I read through your comments, edleafe. 14:03:56 I'm working on the code for this, but it seems setting up a devstack on a VM to think it's running Ironic is no small thing 14:04:01 I'm wondering if I can help clarify anything for you on there? 14:04:16 Matt has responded to you, but we can discuss your remaining concerns here. 14:04:50 jaypipes: it wasn't clear that the flavor for the instance wouldn't have the extra_specs in it 14:05:10 any help with ironic needed? /me did not quite understand the comment above 14:05:17 edleafe: you mean for *existing* instances? 14:05:19 jaypipes: also, I figured out the whole node name confusion (I think) 14:05:38 jaypipes: well, yeah - that's the only thing the spec is concerned with 14:06:24 edleafe: k, are you clear on that now though? you mentioned still being a little fuzzy on some things. 14:06:30 IOW, I didn't know if the update that the operators did to flavors was propagated to existing instances 14:06:39 ah. 14:06:45 no, it definitely is not. 14:06:49 jaypipes: the wording in the spec is fuzzy. I've since figure a bunch out 14:07:16 dtantsur: I could use some help testing ironic behavior with only a devstack VM 14:07:20 edleafe: k. what's the remaining fuzziness we can help with? 14:07:31 edleafe: sure, ping me after the meeting 14:07:43 dtantsur: ok, will do 14:07:56 jaypipes: it's the wording of the spec, not my brain 14:08:17 edleafe: heh, I know. and I'm asking how we can reword the spec to make it more clear. 14:08:32 jaypipes: I guess that's not a big priority, is it? 14:08:42 not sure... 14:08:56 would be good to get the spec amendment finalized and agree on its clarity. 14:08:59 I know what I need to do. Now to fo it... 14:09:07 do it, even 14:09:16 edleafe: that's good with me, then. 14:09:25 jaypipes: if you want I could try to push a revision to the spec 14:09:42 tbh, I'd love to make sure we can land something about custom RCs for Pike 14:09:49 edleafe: go for it. 14:09:53 the spec being amended or not doesn't seem to me super urgent 14:10:06 bauzas: a good spec makes it a lot easier to review the code 14:10:09 bauzas: agreed. code is more important. as long as edleafe is clear, I'm good. 14:10:27 cdent: sure, I'm talking of merging that spec amendment 14:10:29 ok, maybe I can revise it once I get working code 14:10:37 Let's move on 14:10:39 #link Claims in the Scheduler - sereis starting with https://review.openstack.org/#/c/483564/ 14:10:39 edleafe++ 14:10:54 jaypipes: status update? 14:10:56 edleafe: tbh, I'll review your implementation changes 14:11:41 edleafe: bottom two patches in that series are approved. just fixing up the p_sums thing now in the third. 14:11:56 edleafe: reviews on the final patch in that series that does the claiming would be great 14:12:05 great - thanks for fixing that 14:12:17 * bauzas shakes his boots 14:12:40 next up: 14:12:41 #link Devstack to use resource classes by default https://review.openstack.org/#/c/476968/ 14:12:50 dtantsur: anything to say about that one? 14:14:11 I was out for 2 weeks, so not really, sorry 14:14:43 dtantsur: ok, ping us in -nova if we can help 14:14:53 #link Update allocation-candidates.yaml for gabbi 1.35 https://review.openstack.org/#/c/484356/ 14:14:56 sure, I'll get to it eventually 14:14:57 dtantsur: pfft. they let you take vacation?! 14:15:03 That looks straightforward enough 14:15:35 jaypipes: naive people, right? 14:15:56 next up: 14:15:57 #link Add traits to the ResourceProviders filters https://review.openstack.org/#/c/474602/ 14:16:14 That's largely on hold until the AllocationCandidate stuff merges 14:16:24 dtantsur: :) 14:16:37 Then I can probably duplicate the code there, since they are essentially the same query 14:16:39 edleafe: doesn't really need to be, though... 14:17:07 edleafe: meaning the resource providers query stuff can be reviewed and merged since it's not used by claims in the scheduler but is still useful. 14:17:39 jaypipes: sure, but it would simplify reviewing to have them together 14:17:46 understood. 14:18:09 and since it isn't in the critical path, no rush 14:18:27 moving on 14:18:29 #link Nested Resource Providers - series starting with https://review.openstack.org/#/c/470575/ 14:18:35 Still in a coma 14:18:42 :) 14:18:46 indeed. 14:19:18 next up: 14:19:20 #link Traits support in the Allocation Candidates https://review.openstack.org/478464/ 14:19:41 I haven't had a chance to review this yet. Is it a dupe of https://review.openstack.org/#/c/474602/? 14:20:10 IOW, could that be the AllocCand version of the RP version? 14:20:42 if so, we should ensure that they are working the same way 14:20:51 agreed 14:21:30 or, frankly, it looks like both patches could build on a common patch that adds support for a helper method that returns all resource providers having all traits in a list. 14:21:45 similar to the _get_all_shared_with() helper method. 14:21:59 jaypipes: yeah, I was thinking something like that 14:22:04 edleafe: instead of both patches adding what amounts to the same subquery. 14:22:16 but without first seeing the code, I didn't want to assume 14:22:23 yeah, I'll review both. 14:22:39 ok, great 14:22:43 a common patch return a subquery? 14:23:21 yes, since we need to add a filter for traits in both 14:23:27 alex_xu: a common patch that returns a list of resource provider IDs would be my preference instead of making the _get_all_with_shared() function even bigger. but I'll comment on the patches. 14:23:28 ah, I see 14:23:43 jaypipes: ok, thanks 14:24:08 Last up on the agenda: 14:24:09 #link Placement api-ref docs https://review.openstack.org/#/q/topic:cd/placement-api-ref+status:open 14:24:23 Looks like steady progress there 14:24:25 yup 14:24:31 slow but steady 14:25:00 Anything else for specs and/or reviews? 14:25:20 not from me. 14:25:20 I wanted to ask about that bug fix I did: https://review.openstack.org/#/c/484162/ 14:25:24 for https://bugs.launchpad.net/nova/+bug/1704574 14:25:25 Launchpad bug 1704574 in OpenStack Compute (nova) "[placement] attempt to put allocation to resource provide that does not host class of resource causes 500" [Medium,In progress] - Assigned to Chris Dent (cdent) 14:25:42 not sure about the completeness or location of the fix 14:25:56 wait your turn!!!! 14:26:00 #topic Bugs 14:26:01 #link Placement bugs https://bugs.launchpad.net/nova/+bugs?field.tag=placement 14:26:12 #link fix 500 error when allocating to bad class https://launchpad.net/bugs/1704574 14:26:15 #link Proposed fix for 1704574 https://review.openstack.org/#/c/484162/ 14:26:16 I never understand these topics. 14:26:27 * edleafe mutters about cdent jumping the gun 14:26:34 other story of my life 14:27:04 Do you want to summarize your concern about the fix? 14:28:28 there’s three concerns, I guess: 14:28:49 a) I don’t understand why the other sanity checks on the data aren’t catching the problem 14:29:11 b) it seems like the early set() checks might be capable of also checking for this problem 14:29:21 c) ed identified yet another KeyError that may also need more effective handling 14:29:42 a) is probably more correctly stated as “why didn’t we already see this?” 14:29:50 cdent: reviewed. 14:30:22 cdent: guess we haven't wandered too far from the happy path 14:31:07 (sorry my internets are being very slow) 14:31:40 jaypipes: those uuids are the result of uuidgen, they are just in there because in ‘latest’ microversion we need them 14:32:34 cdent: but if you put in some nonsense uuid like 'aaaaa-aaaa-...' would it still work? 14:32:37 cdent: k 14:32:51 jaypipes: can you explain why the issue only shows up when there are at least two resoure classes (one of them bad)? 14:32:52 IOW, is there anything 'magical' about those uuids? 14:32:59 edleafe: no, they are uuids 14:33:10 I don’t understand what is being asked? 14:33:25 cdent: I meant "defined elsewhere and need to match that other definition" 14:34:53 no, that’s why I’m using static uuids instead of $ENVIRON 14:35:01 cdent: I'm looking at that now... not sure off top of my head. 14:36:16 cdent: also, shouldn't these be 400s not 409s? 14:36:20 in any case, that change is a good example of what I’m intending to do while you guys are doing the meat: searching for bugs, edges, etc 14:36:47 no, I guess 409 is appropriate. the structure of the request is correct. 14:36:51 jaypipes: I thought that too, but was trying to be consistent with the 409s that were already there and could not decide 14:37:13 I _think_ 409 is saying “the resource’s state of inventory is not aligned with what you are asking for" 14:37:37 right 14:37:49 and like I said, the request structure is proper, so 400 isn't appropriate. 14:38:26 409 seems OK, as long as it has a good error message so we can distinguish the different failures 14:39:49 So... anything else on bugs? 14:40:42 cdent: ah... I figured it out. 14:40:46 folks, I'll need to bail out in a very few 14:40:52 ok bauzas 14:41:17 #link https://review.openstack.org/480379 14:41:36 ^ just waiting for review 14:41:54 oops, there is commment from ken'ichi 14:42:17 ok, thanks for the link alex_xu. Added to my review list 14:42:23 edleafe: thanks 14:42:44 cdent: line 1600 adds providers with *any inventory of any requested resource* to the provs_with_inv set. however, if that provider only has some and not all of the inventory requested, it passes the check in lines 1602-1608 but will fail the check on line 1616 14:43:28 cdent: so, before your patch, if you were to try allocating for two resource classes, neither of which existed for a resource provider, you'd get back the 409 result. 14:43:56 cdent: it's just when you request two resource classes and one of them actually does exist that the 500 is returned 14:44:18 alex_xu: ah, yeah, I will re-review that patch as well. 14:44:25 jaypipes: thanks 14:44:41 jaypipes: yeah, that’s what I surmised, but couldn’t think of a good way to catch it in the earlier block (especially not on sunday’s brain) 14:44:52 heh, yeah :) 14:45:02 ok, folks, I need to run to a dentist appt. fun! 14:45:04 ciao 14:45:26 Then let's move on to... 14:45:27 #topic Open discussion 14:45:45 got anything to discuss? 14:46:25 not for discussion, but a request, if possible: 14:46:29 https://review.openstack.org/#/c/449257/ 14:46:38 and the following patch 14:46:46 https://review.openstack.org/#/c/451777/ 14:47:02 To modify the pci_passthough filter 14:47:30 #link Neutron port patch https://review.openstack.org/#/c/449257/ 14:47:30 Reviews will be welcome 14:47:53 #link Another Neutron port patch https://review.openstack.org/#/c/451777/ 14:48:15 ralonsoh: thanks - added to the review queue 14:48:21 me too 14:48:28 thanks! 14:49:18 Anyone have anything else? 14:49:57 no sir 14:50:47 ok, then, enjoy the rest of what's left of your day! And maybe even the day after that! 14:50:51 #endmeeting