*** mriedem has quit IRC | 00:13 | |
*** mriedem has joined #openstack-placement | 00:19 | |
*** takashin has joined #openstack-placement | 00:43 | |
openstackgerrit | Tetsuro Nakamura proposed openstack/placement master: Use set instead of list https://review.openstack.org/639887 | 02:47 |
openstackgerrit | Tetsuro Nakamura proposed openstack/placement master: Refactor _get_trees_matching_all() https://review.openstack.org/639888 | 02:47 |
openstackgerrit | Tetsuro Nakamura proposed openstack/placement master: Adds debug log in allocation candidates https://review.openstack.org/639889 | 02:47 |
*** mriedem has quit IRC | 03:32 | |
openstackgerrit | Merged openstack/placement master: Remove redundant second cast to int https://review.openstack.org/639203 | 06:55 |
*** tssurya has joined #openstack-placement | 08:19 | |
*** takashin has left #openstack-placement | 08:30 | |
*** helenafm has joined #openstack-placement | 08:37 | |
*** rubasov has quit IRC | 08:50 | |
gibi | bauzas: why does the driver use both mdev and mdev_types for listing devices? | 08:54 |
gibi | https://github.com/openstack/nova/blob/337b24ca41d2297cf5315d31cd57458526e1e449/nova/virt/libvirt/host.py#L900 | 08:54 |
gibi | https://github.com/openstack/nova/blob/337b24ca41d2297cf5315d31cd57458526e1e449/nova/virt/libvirt/host.py#L893 | 08:54 |
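For reference, a minimal sketch of the two libvirt lookups those links point at: the 'mdev_types' capability lists parent devices that can create mediated devices (the physical GPUs), while 'mdev' lists mediated devices that already exist, which is presumably why the driver queries both. This is a standalone example, not the nova code itself:

    # Both lookups go through libvirt's node-device listing API.
    import libvirt

    conn = libvirt.openReadOnly('qemu:///system')
    # Devices able to create mdevs (e.g. pci_0000_84_00_0 for a pGPU)
    mdev_capable = conn.listDevices('mdev_types', 0)
    # Mediated devices that currently exist (e.g. mdev_<uuid>)
    existing_mdevs = conn.listDevices('mdev', 0)
    print(mdev_capable, existing_mdevs)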
*** ttsiouts has joined #openstack-placement | 09:11 | |
*** e0ne has joined #openstack-placement | 09:32 | |
openstackgerrit | Chris Dent proposed openstack/placement master: Docs: extract testing info to own sub-page https://review.openstack.org/639628 | 09:34 |
*** rubasov has joined #openstack-placement | 10:09 | |
*** rubasov has quit IRC | 10:19 | |
*** rubasov has joined #openstack-placement | 10:26 | |
*** cdent has joined #openstack-placement | 11:00 | |
cdent | jaypipes: if you get a brief moment to swoop in with an opinion on classmethods v module level methods in https://review.openstack.org/#/c/639391/ that would be handy as we continue to remove more code. no need, now, for a big review | 11:07 |
cdent | just an opinion on the quibble that eric and I are enjoying having | 11:08 |
*** ttsiouts has quit IRC | 11:10 | |
*** ttsiouts has joined #openstack-placement | 11:10 | |
*** ttsiouts has quit IRC | 11:15 | |
*** e0ne has quit IRC | 11:30 | |
*** e0ne has joined #openstack-placement | 11:36 | |
*** ttsiouts has joined #openstack-placement | 12:09 | |
bauzas | efried: jaypipes: I'm working on fixing https://review.openstack.org/#/c/636591/5/nova/virt/libvirt/driver.py@583 | 13:31 |
jaypipes | bauzas: k | 13:31 |
jaypipes | cdent: ack | 13:31 |
cdent | thanks | 13:31 |
bauzas | efried: jaypipes: but for that, I need to get allocations for all instances from a compute | 13:31 |
bauzas | efried: jaypipes: so I saw https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L1887 but it looks like I can't use it? | 13:32 |
bauzas | because I would like to pass allocations to https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1202 | 13:33 |
jaypipes | bauzas: isn't the set of allocations passed to the reshape method? | 13:42 |
jaypipes | bauzas: oh, I see, you're trying to do this on restart after you've successfully reshaped.. | 13:43 |
jaypipes | bauzas: question: are mdev UUIDs guaranteed to persist in a consistent fashion across reboots? | 13:45 |
bauzas | jaypipes: unfortunately not :( | 13:51 |
bauzas | they disappear when rebooting the host | 13:52 |
bauzas | they're not persisted by the kernel | 13:52 |
bauzas | :( | 13:52 |
bauzas | hence why we have https://review.openstack.org/#/c/636591/5/nova/virt/libvirt/driver.py@579 | 13:52 |
bauzas | we basically recreate mdevs there | 13:52 |
jaypipes | bauzas: that's crazy. | 13:58 |
jaypipes | bauzas: in any case, I've added a comment to that patch in the function in question. please see that comment. | 13:58 |
jaypipes | bauzas: BTW, since nvidia invented the whole mdev system, what do *their* developers advise? | 13:59 |
jaypipes | bauzas: or are their developers basically not engaged at all? | 13:59 |
bauzas | jaypipes: humpf | 14:00 |
bauzas | I mean, nvidia wants to get paid | 14:00 |
bauzas | but all the devs I know are just working on their own kernel driver | 14:00 |
bauzas | anyway | 14:01 |
bauzas | jaypipes: I just looked at your comment | 14:01 |
bauzas | jaypipes: so we know the instance UUID and even the mdev UUID | 14:01 |
bauzas | jaypipes: that's what self._get_all_assigned_mediated_devices() does => it looks at the guest XMLs | 14:02 |
bauzas | *all* the guest XMLs | 14:02 |
jaypipes | bauzas: heh | 14:02 |
bauzas | and as you can see, we get the instance UUID | 14:03 |
jaypipes | bauzas: ok, so if we have that information, what do we need the allocs dicts for? | 14:03 |
bauzas | unfortunately we still need them, because we don't know which RP is used by the consumer | 14:03 |
bauzas | in case we have two pGPUs, we have two children RPs | 14:03 |
bauzas | so I need to get the allocation to know the RP UUID | 14:04 |
bauzas | if not, I'll create a new mdev for this instance, but maybe not on the same pGPU | 14:04 |
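To make that lookup concrete: given an instance's allocations from placement, the provider holding its VGPU allocation is the pGPU the recreated mdev should go back to. A minimal sketch, assuming the body shape of GET /allocations/{consumer_uuid} (allocations keyed by provider UUID, each with a resources dict); the helper name is illustrative:

    def vgpu_provider_for_instance(alloc_body):
        """Return the RP UUID holding the instance's VGPU allocation, if any."""
        for rp_uuid, alloc in alloc_body.get('allocations', {}).items():
            if 'VGPU' in alloc.get('resources', {}):
                return rp_uuid
        return None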
jaypipes | bauzas: oh, I'm sorry, I thought the mdev UUID *was* the provider UUID? | 14:05 |
bauzas | unfortunately no | 14:05 |
jaypipes | ah, no... I guess not, there are >1 mdevs corresponding to a single resource of VGPU consumed on the provider, right? | 14:05 |
bauzas | and it's not also the instance UUID | 14:05 |
bauzas | well, not sure I understand your question correctly, but... for one RP (the pGPU), we could have >1 mdev yes | 14:06 |
bauzas | mdev == VGPU basically | 14:07 |
jaypipes | yeah, sorry, forgot about that | 14:07 |
bauzas | np | 14:07 |
bauzas | what we *could* do is to ask operators to recreate the mdevs themselves | 14:07 |
bauzas | but... :) | 14:08 |
jaypipes | bauzas: well, in theory, I don't have any particular problem with passing the allocations information in init_host(). after all, we're doing an InstanceList.get_by_host() right after init_host() (https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1204-L1205) and that's essentially the exact same thing as getting allocations from placement | 14:08 |
jaypipes | bauzas: we could just as easily do that call to InstanceList.get_by_host() *before* calling init_host() and pass that stuff in init_host(). The allocations info from placement is a similar type of info. | 14:09 |
jaypipes | I wonder what mriedem would think of that though. | 14:09 |
bauzas | jaypipes: yeah I was thinking of that, but once we get all the instance UUIDs via self._get_all_assigned_mediated_devices() we need to call placement, right? | 14:11 |
bauzas | jaypipes: for knowing their allocs | 14:11 |
bauzas | jaypipes: so, say we have 1000 instances, we could hit placement 1000 times just for that :( | 14:11 |
bauzas | hence why I was looking at alternative API calls | 14:11 |
*** mriedem has joined #openstack-placement | 14:13 | |
jaypipes | bauzas: you're looking for basically a new placement API call. something like GET /allocations?consumer_id=in:<instance1>,<instance2>, etc... | 14:20 |
bauzas | if so, that's not good :) | 14:21 |
bauzas | jaypipes: I was looking at any *existing* placement call :) | 14:21 |
jaypipes | bauzas: the alternative to that would be to store a file on the compute service that keeps that mapping of instance UUID -> provider UUID for you. (this is actually what I said would be needed in the original spec review) | 14:21 |
bauzas | so that reshape services wouldn't be blocked because of a missing API :) | 14:21 |
bauzas | jaypipes: yeah... | 14:22 |
bauzas | I don't disagree with that | 14:22 |
bauzas | you know what ? I'm just about to write docs and say how terrible it is to reboot a host | 14:22 |
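For completeness, a rough sketch of the local-persistence alternative jaypipes mentions above: a small JSON file on the compute host mapping instance UUID to pGPU provider UUID (and mdev UUID), written whenever an mdev is assigned and read back after a reboot. The path and function names are hypothetical; nothing like this exists in the driver:

    import json
    import os

    # Hypothetical location for the instance -> pGPU/mdev mapping.
    MAPPING_PATH = '/var/lib/nova/vgpu_mdev_mappings.json'

    def save_mdev_mappings(mappings):
        # mappings: {instance_uuid: {'rp_uuid': ..., 'mdev_uuid': ...}}
        tmp = MAPPING_PATH + '.tmp'
        with open(tmp, 'w') as fh:
            json.dump(mappings, fh)
        os.rename(tmp, MAPPING_PATH)  # replace in one step

    def load_mdev_mappings():
        if not os.path.exists(MAPPING_PATH):
            return {}
        with open(MAPPING_PATH) as fh:
            return json.load(fh)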
jaypipes | bauzas: there is also GET /resource_providers/{compute_uuid}/allocations | 14:22 |
bauzas | jaypipes: that's the https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L1887 report method, right? | 14:22 |
bauzas | jaypipes: I was a bit scared by looking at the docstring :D | 14:23 |
jaypipes | yes | 14:23 |
bauzas | oh wait | 14:23 |
bauzas | no, I can use this method :) | 14:23 |
jaypipes | see here a comment from efried: https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L1970-L1971 | 14:23 |
bauzas | I confused it with https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L1907 | 14:23 |
bauzas | oh wait, actually I need the whole tree | 14:24 |
jaypipes | bauzas: all get_allocations_for_provider_tree() does is call get_allocations_for_provider() over and over again :) | 14:24 |
bauzas | yeah | 14:24 |
jaypipes | thus the comment from efried there :) | 14:24 |
bauzas | again, I was confused | 14:24 |
bauzas | heh | 14:24 |
jaypipes | no worries, this is a confusing area | 14:24 |
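For reference, the endpoint jaypipes points at can be called directly; it returns every consumer's allocations against that one provider. A minimal sketch, assuming an already-authenticated keystoneauth1 session with the placement endpoint in the service catalog:

    def allocations_for_provider(session, rp_uuid):
        # GET /resource_providers/{uuid}/allocations returns
        #   {"allocations": {consumer_uuid: {"resources": {...}}, ...},
        #    "resource_provider_generation": N}
        resp = session.get(
            '/resource_providers/%s/allocations' % rp_uuid,
            endpoint_filter={'service_type': 'placement'})
        return resp.json()['allocations']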
bauzas | and I think efried is right | 14:24 |
bauzas | the problem with persisting things on our side is that it becomes mandatory, just because libvirt lacks a feature | 14:25 |
bauzas | which is what I dislike the most | 14:25 |
bauzas | just because nvidia devs weren't able to convince the kernel folks to persist mdevs doesn't mean we should suffer for it | 14:25 |
bauzas | and do crazy things just because of this | 14:25 |
bauzas | jaypipes: hence my reluctance to persist any mdev mapping info | 14:26 |
jaypipes | bauzas: yeah, definitely it's a catch-22 situation. | 14:27 |
* bauzas googles this | 14:28 | |
* bauzas doesn't have a single shit of culture | 14:28 | |
bauzas | :( | 14:28 |
jaypipes | bauzas: oh, sorry... a better term would be "a lose-lose situation"? | 14:29 |
bauzas | ha-ah, I understand it better :) | 14:29 |
bauzas | but yeah I agree | 14:29 |
bauzas | the more I think about that, the more reluctant I am to make any change at all | 14:30 |
bauzas | I should rather file a libvirt bug asking for mdevs to be persistent | 14:30 |
bauzas | I don't know who decided mdevs should be part of sysfs | 14:31 |
bauzas | but then... | 14:31 |
bauzas | your call | 14:31 |
jaypipes | bauzas: I think we should chat with mriedem about the pros and cons of passing a built-up allocations variable to init_host() for all the instances on a host. (one of the cons being that this could be an ENORMOUS variable for nova-compute services running Ironic...) | 14:34 |
jaypipes | or frankly any nova-compute with hundreds or thousands of VMs on it. | 14:34 |
efried | jaypipes, bauzas: Do we need the allocations for everything under the compute node, or only allocations against the pGPU RPs? | 14:36 |
bauzas | efried: ideally the latter | 14:36 |
jaypipes | efried: are you thinking something like a GET /allocations?consumer_id=in:<uuids>&resource_class=VGPU ? | 14:37 |
cdent | ugh | 14:38 |
efried | No, I was thinking we already know the pGPU RP UUIDs, don't we? | 14:38 |
efried | same way we generated them? | 14:41 |
bauzas | no, because that's on init | 14:42 |
bauzas | oh wait | 14:42 |
bauzas | init is actually *after* urp | 14:42 |
bauzas | I think I see where you're coming from | 14:42 |
bauzas | we could get the RP UUIDs | 14:42 |
bauzas | by looking up the cache | 14:43 |
bauzas | in the driver | 14:43 |
bauzas | but then, we would still need to call placement to get the allocations | 14:43 |
bauzas | or or... | 14:43 |
* bauzas has his mind thinking multiple things | 14:43 | |
efried | Another option is GET /allocations?in_tree=<cn_uuid> | 14:44 |
efried | This would reduce the ironic issue down to a single node. | 14:44 |
efried | But could still be big if there's a crap ton of instances on the node. | 14:44 |
cdent | as a first pass, a lot of small GETs if you need lots of different allocations | 14:45 |
cdent | if that proves too clostly _then_ fix it | 14:45 |
cdent | but getting allocations ought to be one of the faster operations | 14:46 |
cdent | and if you really need it to be properly vast throw all the requests down an eventlet thread pool and async them | 14:46 |
cdent | s/vast/fast/ but vast works too | 14:47 |
efried | I've been trying to come up with a good definition of clostly | 14:47 |
cdent | clostly is _obviously_ a type of expense is close quarters or time constraints | 14:49 |
cdent | s/is/in/ | 14:49 |
* cdent sighs | 14:49 | |
efried | I figured it was something like that. | 14:53 |
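A minimal sketch of cdent's suggestion above: start with plain per-consumer GETs, and only if that proves too slow fan them out over an eventlet green pool. get_allocs is assumed to be whatever single-consumer lookup wrapper is already available:

    import eventlet

    def fetch_all_allocations(get_allocs, consumer_uuids, pool_size=20):
        # get_allocs(uuid) returns the allocations for one consumer.
        pool = eventlet.GreenPool(pool_size)
        results = pool.imap(get_allocs, consumer_uuids)  # order-preserving
        return dict(zip(consumer_uuids, results))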
bauzas | so, we have the providerTree even when we init_host() | 14:55 |
bauzas | so there is a way to know the pGPU RPs | 14:55 |
bauzas | but then, I'd need to get the allocations :( | 14:55 |
efried | you even have their names in the ProviderTree. | 14:55 |
bauzas | yeah but again, I need to lookup allocations | 14:56 |
efried | Right, but you can narrow it down to only the pGPUs on only one compute node. | 14:56 |
efried | which will be what, eight max? | 14:56 |
bauzas | oh I see | 14:56 |
bauzas | oh no | 14:56 |
efried | I mean, how many GPU cards are we expecting on one system? | 14:56 |
bauzas | hah, this, well, it depends but 8 looks reasonable | 14:57 |
efried | I mean, just as a heuristic | 14:57 |
bauzas | but a GPU card == N pGPUs | 14:57 |
bauzas | so theoretically, maybe 16 or 64 | 14:57 |
bauzas | but meh | 14:57 |
bauzas | reasonable enough | 14:57 |
efried | my point is, you'll be making somewhere on the order of 10-100 GET /allocations calls | 14:57 |
bauzas | efried: sure, but who would be the caller ? the libvirt driver, right? | 14:58 |
bauzas | which we've said no to N times | 14:58 |
bauzas | or, we need to get the providertree in the compute manager | 14:58 |
mriedem | someone is going to have to catch me up because i don't want to read all this scrollback | 14:58 |
efried | we have the provider tree in the compute manager - but it doesn't have allocations in it (nor should it). | 14:58 |
bauzas | mriedem: nothing really juicy atm | 14:58 |
bauzas | efried: do we ? | 14:58 |
bauzas | you'd make my day if so | 14:59 |
efried | bauzas: hum, well, we have a report client, so you could call get_provider_tree_and_ensure_root. | 14:59 |
efried | which would be the way you should get the provider tree in any case. | 14:59 |
efried | so if it's not already cached, it'll be pulled down. | 15:00 |
bauzas | hah | 15:01 |
bauzas | worth trying then | 15:01 |
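Putting the last few ideas together, a sketch of what the driver-side lookup could look like: pull the provider tree through the report client, pick out the child providers that expose VGPU inventory, and fetch allocations per provider. get_provider_tree_and_ensure_root is the real report client call named above; the ProviderTree accessors and the get_rp_allocations callable are assumptions, so treat this as a sketch rather than working driver code:

    def vgpu_allocations(reportclient, context, cn_uuid, get_rp_allocations):
        # get_rp_allocations(rp_uuid) is assumed to wrap
        # GET /resource_providers/{uuid}/allocations (see the earlier sketch).
        ptree = reportclient.get_provider_tree_and_ensure_root(context, cn_uuid)
        result = {}
        for rp_uuid in ptree.get_provider_uuids():
            data = ptree.data(rp_uuid)
            if 'VGPU' in data.inventory:  # only the pGPU child providers
                result[rp_uuid] = get_rp_allocations(rp_uuid)
        return result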
*** e0ne has quit IRC | 15:16 | |
*** e0ne has joined #openstack-placement | 15:19 | |
*** rubasov has quit IRC | 15:54 | |
*** nguyenhai_ has quit IRC | 16:14 | |
*** nguyenhai_ has joined #openstack-placement | 16:15 | |
*** e0ne has quit IRC | 16:16 | |
*** ttsiouts has quit IRC | 16:16 | |
*** ttsiouts has joined #openstack-placement | 16:21 | |
*** e0ne has joined #openstack-placement | 16:23 | |
*** Sundar has joined #openstack-placement | 16:29 | |
*** rubasov has joined #openstack-placement | 16:47 | |
*** helenafm has quit IRC | 16:53 | |
*** ttsiouts has quit IRC | 16:54 | |
*** tssurya has quit IRC | 16:57 | |
*** ttsiouts has joined #openstack-placement | 16:58 | |
*** ttsiouts has quit IRC | 16:59 | |
*** ttsiouts has joined #openstack-placement | 16:59 | |
*** ttsiouts has quit IRC | 17:04 | |
*** e0ne has quit IRC | 17:05 | |
efried | cdent: are you rebasing the placement ObjectList series? | 17:13 |
cdent | efried: yes | 17:13 |
cdent | right this minute | 17:13 |
openstackgerrit | Chris Dent proposed openstack/placement master: Factor listiness into an ObjectList base class https://review.openstack.org/637325 | 17:15 |
openstackgerrit | Chris Dent proposed openstack/placement master: Move _set_objects into ObjectList https://review.openstack.org/637328 | 17:15 |
openstackgerrit | Chris Dent proposed openstack/placement master: Move *List.__repr__ into ObjectList https://review.openstack.org/637332 | 17:15 |
openstackgerrit | Chris Dent proposed openstack/placement master: Clean up ObjectList._set_objects signature https://review.openstack.org/637335 | 17:15 |
openstackgerrit | Chris Dent proposed openstack/placement master: Use native list for lists of Usage https://review.openstack.org/639391 | 17:15 |
openstackgerrit | Chris Dent proposed openstack/placement master: Move RC_CACHE in resource_class_cache https://review.openstack.org/640114 | 17:15 |
* cdent takes a walk while that cooks | 17:17 | |
*** Sundar has quit IRC | 18:05 | |
*** e0ne has joined #openstack-placement | 18:51 | |
*** e0ne has quit IRC | 19:03 | |
mriedem | cdent: some comments on your osc-placement test cleanup patch https://review.openstack.org/#/c/639717/ | 19:36 |
cdent | thanks mriedem will respond, but brief glance there's nothing to disagree with | 19:51 |
cdent | the whole thing was rather bizarre. bunch of stuff totally broke for py3 | 19:54 |
*** ttsiouts has joined #openstack-placement | 20:09 | |
*** e0ne has joined #openstack-placement | 20:16 | |
openstackgerrit | Chris Dent proposed openstack/placement master: WIP: Move Allocation and AllocationList to own module https://review.openstack.org/640184 | 20:36 |
cdent | efried, jaypipes, edleafe ↑ is now the end of the big refactoring stack. I'm going to continue this stuff to the bitter end unless you guys want to stop me. Your feedback thus far has been great, so thanks for that. | 20:37 |
efried | I thought there was no end, bitter or otherwise | 20:37 |
efried | gdi, how do you override | ? | 20:38 |
edleafe | Agree with efried - there will never be an ending | 20:38 |
cdent | well, I don't want there to be an end, that's kind of the point: constant refactoring, endless loop | 20:39 |
cdent | efried: you mean when booleaning two objects? | 20:39 |
efried | cdent: I mean set union | 20:40 |
cdent | is it not __or__? | 20:41 |
efried | ah, yes | 20:41 |
efried | after all that, I changed my mind about suggesting it. | 20:44 |
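For the record, cdent's answer is right: the | operator is overridden with __or__ (plus __ror__ and __ior__ for the reflected and in-place forms). A tiny standalone example, not tied to any placement class:

    class ObjSet(object):
        """Toy container supporting set-style union via |."""

        def __init__(self, items=()):
            self.items = set(items)

        def __or__(self, other):
            return ObjSet(self.items | other.items)

    print((ObjSet({1, 2}) | ObjSet({2, 3})).items)  # {1, 2, 3}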
openstackgerrit | Chris Dent proposed openstack/osc-placement master: Update tox and tests to work with modern setups https://review.openstack.org/639717 | 20:46 |
openstackgerrit | Chris Dent proposed openstack/osc-placement master: Add support for 1.18 microversion https://review.openstack.org/639738 | 20:46 |
cdent | mriedem: that ↑ ought to fix your concerns. | 20:47 |
mriedem | fancy arrows | 20:47 |
cdent | I got this one too ↓ | 20:47 |
cdent | but that's as far as it goes | 20:47 |
cdent | that's enough for me today | 20:48 |
cdent | goodnight all | 20:48 |
*** cdent has quit IRC | 20:48 | |
mriedem | i'm +2 on https://review.openstack.org/#/q/topic:cd/make-tests-work+(status:open+OR+status:merged) if someone else wants to hit them, pretty trivial | 20:53 |
mriedem | 1.18 is the first placement microversion added in rocky, | 20:53 |
mriedem | so we're a bit behind on osc-placement parity with the api | 20:53 |
*** takashin has joined #openstack-placement | 20:56 | |
*** e0ne has quit IRC | 21:19 | |
*** s10 has joined #openstack-placement | 21:28 | |
*** e0ne has joined #openstack-placement | 21:42 | |
*** e0ne has quit IRC | 21:44 | |
*** e0ne has joined #openstack-placement | 21:45 | |
*** e0ne has quit IRC | 22:01 | |
*** s10 has quit IRC | 22:27 | |
openstackgerrit | Eric Fried proposed openstack/placement master: DNM: get_rc_cache https://review.openstack.org/640226 | 23:49 |