Monday, 2019-07-08

*** altlogbot_3 has quit IRC		01:28
*** altlogbot_2 has joined #openstack-placement		01:29
*** tetsuro has joined #openstack-placement		05:57
*** tetsuro has quit IRC		05:58
*** tetsuro has joined #openstack-placement		05:59
*** tetsuro has quit IRC		06:03
*** helenafm has joined #openstack-placement		07:17
*** tssurya has joined #openstack-placement		07:28
openstackgerrit	Tetsuro Nakamura proposed openstack/placement master: Support `same_subtree` queryparam https://review.opendev.org/668376	07:47
openstackgerrit	Tetsuro Nakamura proposed openstack/placement master: Doc `same_subtree` queryparam https://review.opendev.org/669616	07:47
*** tetsuro has joined #openstack-placement		07:55
*** tetsuro has quit IRC		07:57
*** ttsiouts has joined #openstack-placement		08:01
*** ttsiouts has quit IRC		08:13
*** ttsiouts has joined #openstack-placement		08:13
*** ttsiouts has quit IRC		08:18
openstackgerrit	Chris Dent proposed openstack/placement master: Update implemented spec and spec document handling https://review.opendev.org/669184	08:18
openstackgerrit	Chris Dent proposed openstack/placement master: Add whereto for testing redirect rules https://review.opendev.org/669370	08:18
openstackgerrit	Chris Dent proposed openstack/placement master: tox: Stop building api-ref docs with the main docs https://review.opendev.org/669371	08:18
*** ttsiouts has joined #openstack-placement		08:24
helenafm	:q	08:28
*** cdent has joined #openstack-placement		09:05
*** cdent has quit IRC		09:45
*** ttsiouts has quit IRC		10:32
*** ttsiouts has joined #openstack-placement		10:33
*** cdent has joined #openstack-placement		10:33
*** ttsiouts has quit IRC		10:37
cdent	gibi: have you had a chance to look at tetsuro's same_subtree work? needs someone besides me and efried_pto looking at it	10:46
gibi	cdent: not yet, I will dive into it now.	10:49
cdent	great, thanks	10:49
*** ttsiouts has joined #openstack-placement		11:01
*** cdent has quit IRC		11:12
*** cdent has joined #openstack-placement		11:18
*** sean-k-mooney has quit IRC		12:03
*** sean-k-mooney has joined #openstack-placement		12:16
*** cdent has quit IRC		12:26
*** artom has joined #openstack-placement		12:33
*** edleafe has joined #openstack-placement		12:42
openstackgerrit	Merged openstack/os-resource-classes master: Add Python 3 Train unit tests https://review.opendev.org/669479	12:57
openstackgerrit	Merged openstack/os-traits master: Add Python 3 Train unit tests https://review.opendev.org/669480	13:00
*** takashin has left #openstack-placement		13:01
*** mriedem has joined #openstack-placement		13:23
*** cdent has joined #openstack-placement		13:34
gibi	cdent: this same_subtree patch is pretty dense. I hope I finish it today but no promises.	13:40
cdent	gibi: no worries, there's not a huge rush because we're fairly ahead of schedule, but the sooner it is merged, the sooner nova can start playing with it I suppose	13:40
gibi	Is there a volunteer from nova side to play with this during Train?	13:41
gibi	I mean who will be the developer first consuming this? as we might need a review from that dev as well	13:42
cdent	gibi: I don't really know	13:44
cdent	efried_pto: is the one who has been driving that it needs to happen	13:44
gibi	cdent: OK	13:46
cdent	I hope this doesn't turn into another situation where placement is months or even years ahead of nova, but it could well do, and that's fine	13:47
*** amodi has joined #openstack-placement		13:50
*** efried_pto is now known as efried		13:54
gibi	I don't remember a nova spec that explicitly stated same_subtree as a requirement	13:54
gibi	and nova tends to be a lot slower to move forward than placement	13:55
efried	I'm sort of hoping to bully bauzas into abandoning his vgpu affinity spec in favor of working NUMA nesting in nova.	13:55
gibi	anyhow I will review the implementation, it was just a sidetrack of mine to get the "user" of the feature involved	13:55
bauzas	efried: sorry but why ?	13:55
bauzas	I already said both specs aren't competitive	13:56
efried	No, they're not mutually exclusive... except from the standpoint of development resource.	13:56
efried	IMO retrofitting filter-based affinity for placement-modeled vgpu is a backward step.	13:57
bauzas	that's your opinion	13:58
efried	That's what "IMO" means.	13:58
efried	I'm just one voice. If there's a preference from the team to move ahead with it, so be it.	13:59
gibi	there is a complexity difference between the two solutions that could mean lead time differences as well.	13:59
efried	yes, absolutely. The filter bauzas proposes could be easily contained in Train. NUMA modeling/affinity in placement may well take longer.	14:00
bauzas	efried: I'm not adding a filter	14:00
efried	it has already taken long enough.	14:00
efried	weigher?	14:00
bauzas	indeed, have you looked at my spec honestly?	14:01
bauzas	the weigher is not even needed	14:01
efried	I certainly could be misusing terminology. I'm really good at that.	14:01
bauzas	if that's really a problem for you, I can even remove it honestly	14:01
bauzas	what I really just need is https://review.opendev.org/#/c/650963/9/specs/train/approved/libvirt-vgpu-numa-affinity.rst@92	14:02
bauzas	efried: the point is, a filter can get NoValidHost	14:02
bauzas	efried: not a weigher	14:02
bauzas	efried: it just helps to make sure we spread instances between hosts	14:03
bauzas	for vGPUs	14:03
bauzas	in order to have less races	14:03
efried	bauzas: Where is the code for @92 going to live, if not in the NUMATopologyFilter or a new weigher? In the libvirt virt driver?	14:03
bauzas	efried: I said it in the spec	14:04
bauzas	https://review.opendev.org/#/c/650963/9/specs/train/approved/libvirt-vgpu-numa-affinity.rst@101	14:05
bauzas	https://review.opendev.org/#/c/650963/9/specs/train/approved/libvirt-vgpu-numa-affinity.rst@211 and https://review.opendev.org/#/c/650963/9/specs/train/approved/libvirt-vgpu-numa-affinity.rst@214	14:05
efried	bauzas: Okay, I've reread the spec.	14:14
bauzas	thanks	14:14
efried	It all makes perfect sense in a world where there's no NUMA affinity at the placement level.	14:14
efried	but once we have that, 80% of this code goes away.	14:15
efried	A pack/spread weigher for its own sake may make sense.	14:16
efried	though not related to affinity	14:16
efried	The code to pick the proper NUMA node based on which PGPUs are allocated becomes n/a.	14:17
efried	So my point is, a) this becomes tech debt almost immediately; and b) the effort spent coding & reviewing could be better spent getting us closer to placement-based NUMA modeling & affinity.	14:18
efried	IMO	14:18
cdent	IMOT	14:18
cdent	Too	14:19
cdent	otherwise we've spent a huge complexity cheque in placement for nada	14:19
bauzas	efried: cdent: there will still be nova changes for using the new placement microversions	14:21
bauzas	so, yeah, I'll work on this too	14:22
bauzas	efried: the libvirt code will still possibly be there but AFAIK and unless I misunderstood, we will only have hard affinity	14:22
bauzas	by using placement	14:23
efried	correct	14:23
efried	using group_policy=none would allow you to employ post-scheduling soft affinity	14:25
*** purplerbot has quit IRC		14:28
*** purplerbot has joined #openstack-placement		14:30
bauzas	efried: so operators wanting to only have soft affinity for vGPUs would still need the libvirt claim	14:30
bauzas	and maybe the weigher	14:30
efried	actually, I take it back	14:32
efried	you can't use the soft affinity after the fact - you can't disregard the claim you got.	14:33
efried	it's all or nothing	14:33
efried	You can use a host that hasn't modeled NUMA in nested providers, and get soft affinity; or if you're on a host that has modeled NUMA in nested providers, you can have hard affinity or no affinity.	14:34
bauzas	efried: if you don't ask placement for hard affinity, then you can still have soft affinity	14:51
bauzas	that's why I say the spec is not competitive	14:52
efried	bauzas: you can't have soft affinity if the host is modeled with nested RPs.	14:53
efried	bauzas: because we've created an allocation with resources from particular NUMA nodes.	14:53
efried	So if we didn't request affinity from placement, and you happen to get allocations from opposite NUMA nodes, you can't just ignore that on the host and pick a different distribution of NUMA nodes.	14:54
efried	so	14:54
efried	all or nothing	14:54
*** dklyle has joined #openstack-placement		15:13
mriedem	ima butt in with a question,	15:14
mriedem	on https://developer.openstack.org/api-ref/placement/?expanded=update-allocations-detail#update-allocations	15:14
mriedem	the 409 in the description is mostly about inventory conflicts,	15:14
mriedem	but isn't there also a 409 response if the consumer exists and you pass 1.28 with consumer_generation=None?	15:15
cdent	yes	15:15
cdent	Inventory and/or allocations changed while attempting to allocate	15:16
cdent	one could argue (weakly) that "or inventories are updated by another thread while attempting the operation" fits, since allocations change an inventory's capacity	15:17
cdent	and if you try to send None for consumer_generation and it doesn't work, then there have been allocations out from under you	15:17
cdent	but yes, it could be documented better	15:17
* mriedem is storyboarding		15:20
cdent	you are a star	15:20
mriedem	https://storyboard.openstack.org/#!/story/2006180	15:22
*** dklyle has quit IRC		15:28
*** dklyle has joined #openstack-placement		15:28
*** amodi has quit IRC		15:29
*** amodi has joined #openstack-placement		15:39
*** helenafm has quit IRC		15:43
*** tssurya has quit IRC		15:53
gibi	cdent: left comments in https://review.opendev.org/#/c/668376	16:03
cdent	thanks gibi	16:04
sean-k-mooney	efried: in the numa case we should catch that in the numa topology filter before we hit the compute node	16:07
sean-k-mooney	it won't be going away, at least not in the short term after we have numa support in placement	16:07
efried	sean-k-mooney: Is the NUMATopologyFilter a filter or a weigher?	16:08
sean-k-mooney	it might eventually go away, however, but ya we can't ignore the allocation from placement	16:08
sean-k-mooney	efried: it's a filter	16:08
sean-k-mooney	doing hard affinity	16:08
efried	so the filter part that rejects an allocation that doesn't provide affinity - that would be moot	16:08
efried	because why would we not have requested affinity from placement if we were going to enforce it in the filter anyway?	16:09
sean-k-mooney	if and only if we implement everything it currently does in placement	16:09
sean-k-mooney	so cpu, hugepages, and pci device numa affinity	16:09
sean-k-mooney	when we can enforce all of the above with placement it can go away but not before we have all 3	16:09
sean-k-mooney	so it will allow us to do it piecemeal in that more and more of the filtering can be left to placement and eventually everything will be enforced by placement and it can be removed once we have parity	16:11
sean-k-mooney	that is probably after U	16:12
sean-k-mooney	stephenfin: jangutter: i fixed my missing bug nit in https://review.opendev.org/#/c/666387/2 if you want to hit that one quickly. it's not urgent but let's try and land it by m2	16:24
sean-k-mooney	we might want to backport it to stein too but we can do that when needed	16:25
sean-k-mooney	oops wrong channel	16:25
cdent	cleanly sean-k-mooney needs to be kickbanned for violating the rules so egregiously	16:26
cdent	clearly!	16:26
sean-k-mooney	clearly :)	16:27
edleafe	and certainly not cleanly!	16:31
* cdent waves goodnight		16:33
*** cdent has quit IRC		16:33
efried	sean-k-mooney: I agree the filter itself needs to stick around, especially because (IMO) we should not be trying to go whole hog in placement with things like hugepages etc. However, pieces of that code will become redundant (run but never reject) due to the bits we are enabling in placement.	16:35
sean-k-mooney	yep	16:36
sean-k-mooney	although that code needs to be made placement aware	16:36
efried	though (run but never reject) may be better as (remove) depending on how reliably we're running the placement side.	16:36
sean-k-mooney	it's the same code that does the assignment on the compute node and it needs to know it can only assign the resources that correspond to the allocations/RPs selected by placement	16:37
efried	yeah, that could get a little crazy.	16:37
sean-k-mooney	efried: the filter works by invoking the assignment code that will be used on the compute node without actually claiming the resources in the RT	16:37
sean-k-mooney	so 90% of the code would still be used after placement does it	16:38
sean-k-mooney	but it needs to learn that it can't select from all resources anymore and can only look at the resources that correspond to the RP in the placement allocation	16:38
sean-k-mooney	part of the logic will be updated to look at the allocation by the standardise cpu in placement work	16:39
sean-k-mooney	when hugepages or pci devices are moved to placement the rest will have to be updated	16:40
sean-k-mooney	on the plus side it should make the filter faster	16:40
efried	which is kind of the whole point	16:41
sean-k-mooney	making it faster	16:41
efried	faster in two senses	16:41
efried	failures happen earlier with less racing; and the filter itself actually performs better.	16:42
efried	whole point of placement	16:42
sean-k-mooney	ya, although the real win is reducing reschedules; however, to do that we likely need to move the RT claim to the conductor too.	16:42
*** ttsiouts has quit IRC		16:43
*** ttsiouts has joined #openstack-placement		16:44
*** ttsiouts has quit IRC		16:49
*** mriedem has quit IRC		21:53
openstackgerrit	Merged openstack/placement master: Update implemented spec and spec document handling https://review.opendev.org/669184	22:22
openstackgerrit	Merged openstack/placement master: Add whereto for testing redirect rules https://review.opendev.org/669370	22:32
openstackgerrit	Merged openstack/placement master: tox: Stop building api-ref docs with the main docs https://review.opendev.org/669371	22:32
