*** ChanServ has quit IRC | 01:00 | |
*** ChanServ has joined #openstack-placement | 01:04 | |
*** barjavel.freenode.net sets mode: +o ChanServ | 01:04 | |
*** mriedem_afk has quit IRC | 01:05 | |
*** Nel1x has joined #openstack-placement | 01:28 | |
*** dansmith has quit IRC | 01:49 | |
*** dansmith has joined #openstack-placement | 01:51 | |
*** lei-zh has joined #openstack-placement | 02:01 | |
*** Nel1x has quit IRC | 02:28 | |
*** nicolasbock has quit IRC | 02:28 | |
*** dims has quit IRC | 02:28 | |
*** Nel1x has joined #openstack-placement | 02:30 | |
*** nicolasbock has joined #openstack-placement | 02:30 | |
*** dims has joined #openstack-placement | 02:30 | |
*** openstack has joined #openstack-placement | 13:23 | |
*** ChanServ sets mode: +o openstack | 13:23 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova-specs master: Add subtree filter for GET /resource_providers https://review.openstack.org/595236 | 13:24 |
cdent | giblet: ^^ that hacks tox.ini to make it possible to install placement from github. The problem I was experiencing before is that installs from git require pip directly, not whatever is installing nova itself (which I've noted in the updated commit message) | 13:24 |
*** openstackstatus has joined #openstack-placement | 13:25 | |
*** ChanServ sets mode: +v openstackstatus | 13:25 | |
giblet | cdent: thanks | 13:26 |
giblet | cdent: I'm pulling it down to play with it | 13:26 |
cdent | yay! | 13:27 |
cdent | I changed the reference to my branch, rather than the pull request so that edleafe's repo can fluctuate as required | 13:27 |
cdent | seems to be running okay in the gate (at least for functional). gonna run out for a bit | 13:33 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: api: Remove unnecessary default parameter https://review.openstack.org/564451 | 13:50 |
*** efried_afk is now known as efried | 14:16 | |
*** efried is now known as senhor_granhular | 14:24 | |
senhor_granhular | leakypipes, giblet: Diacritics not allowed in nicknames, so I had to go with Portuguese. | 14:25 |
leakypipes | heh :) | 14:32 |
giblet | :) | 14:33 |
senhor_granhular | leakypipes, giblet: Responded. | 14:36 |
*** senhor_granhular is now known as fried_rice | 14:37 | |
fried_rice | leakypipes: Not having caught up with emails yet, did you give further thought to the refresh-in-reshaper-flow issue we started talking about yesterday? | 14:37 |
giblet | fried_rice: thanks | 14:39 |
fried_rice | lemme know if makes no sense | 14:39 |
fried_rice | Easy +A: https://review.openstack.org/#/c/595453/ | 14:39 |
giblet | fried_rice: easily approved :) | 14:41 |
fried_rice | thanks giblet | 14:41 |
giblet | fried_rice: and your comment about non-granular group totally makes sense | 14:42 |
fried_rice | phew | 14:42 |
*** ttsiouts has quit IRC | 14:46 | |
fried_rice | giblet: Your +2 on the bottom patch is useful, because 2x+2 all the way up the series will be our signal to remove the -2 and merge. | 14:57 |
mriedem | i suppose that's my cue to review the rest of the series now | 14:59 |
giblet | fried_rice: if you agree to do the doc change in a followup then I'll plug my +2 back | 15:01 |
fried_rice | giblet: Sure thing. | 15:01 |
giblet | fried_rice: I've plugged the +2 back | 15:03 |
fried_rice | giblet: thx | 15:03 |
cdent | our testing infrastructure is working against the kind of abuse I'm trying to do at the moment. I guess that's good most of the time. | 15:05 |
*** ttsiouts has joined #openstack-placement | 15:07 | |
fried_rice | giblet, leakypipes: Why no +W on https://review.openstack.org/#/c/584598/ ? Is it because of the spurious random -1? | 15:09 |
edleafe | cdent: Which files do we need to preserve in the current nova/api/openstack directory? | 15:10 |
cdent | edleafe: none, just the placement directory and below | 15:11 |
edleafe | cdent: thanks. Wasn't sure if we needed the wsgi stuff | 15:11 |
cdent | no, placement has its own (simpler) stuff | 15:11 |
edleafe | kewl | 15:11 |
cdent | okay, I just got a devstack working off that trimmed nova repo. nova-placement-api is using the placement repo code, but presenting itself as a nova thing | 15:13 |
cdent | gonna create some servers, make sure that's all happy, and then switch over to a placement-api using the same db | 15:14 |
mriedem | fried_rice: cdent: giblet: leakypipes: correct me if i'm wrong, but there is a behavior change in the RT with https://review.openstack.org/#/c/584598/ which has gone unnoticed | 15:15 |
mriedem | except maybe by the wily ying wang | 15:15 |
cdent | mriedem, fried_rice : how errors and return values are handled in the RT is a complete black hole for me :( | 15:16 |
fried_rice | heh | 15:17 |
mriedem | cdent: you were +1 on the change so i called you out | 15:17 |
mriedem | with the others | 15:17 |
fried_rice | If mriedem's comments are accurate, I really effed up. Because I fully intended to make that particular code path backward compatible other than the log msg | 15:17 |
cdent | mriedem: haven't you figured out by now that my +1 means "I'm happy to see this fail later"? | 15:17 |
fried_rice | looking closer... | 15:17 |
mriedem | cdent: i assume that's a joke | 15:18 |
fried_rice | ah, mriedem, I think you may have missed the `or {}` in the original. | 15:18 |
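For context, the `or {}` idiom at issue is what kept the refactor backward compatible: a value that can come back as None is coerced to an empty dict before callers touch it. A minimal Python sketch with hypothetical names (the real code lives in nova's report client):

    def get_allocations(resp):
        # resp is assumed to be a requests-style response whose JSON body
        # may carry {'allocations': None}; `or {}` preserves the original
        # behavior of treating a missing or None value as an empty mapping.
        return resp.json().get('allocations') or {}

    # The caller gets {} whether the body omitted 'allocations', carried
    # None, or carried {} -- the same result in all three cases.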
mriedem | well f me sideways | 15:18 |
* fried_rice looks up straight jackets again | 15:19 | |
mriedem | +W | 15:19 |
mriedem | sorry | 15:19 |
cdent | mriedem: well, more accurately: of the things I have experience in, this does not offend my sensibilities, and I trust the other people who have already reviewed. | 15:19 |
cdent | but also: I'm okay with bugs leaking through some of the time: they are the grist in the mill that keeps people coming back | 15:19 |
mriedem | https://bugs.launchpad.net/nova/ | 15:20 |
mriedem | 1 → 75 of 804 results | 15:20 |
mriedem | ... | 15:20 |
cdent | :) | 15:20 |
mriedem | half that shit is probably bugs in havana anyway | 15:20 |
cdent | several conflicting forces at work | 15:20 |
*** nicolasbock has joined #openstack-placement | 15:21 | |
fried_rice | fwiw, I think "of the things I have experience in, this does not offend my sensibilities, and I trust the other people" is a reasonable review strategy, and is why we require multiple +2s to merge a thing. | 15:21 |
giblet | mriedem: to be honest, when I now went back to look at your comment I did not notice the or {} | 15:21 |
fried_rice | mriedems of the world notwithstanding, it is unreasonable to expect even an illustrious core to understand all the implications of every change before approving it. | 15:22 |
mriedem | fried_rice: that's why i only +1ed this https://review.openstack.org/#/c/592285/ | 15:23 |
openstackgerrit | Claudiu Belu proposed openstack/nova master: hyper-v: autospec classes before they are instantiated https://review.openstack.org/342211 | 15:23 |
fried_rice | heh, and why I avoided it entirely | 15:23 |
openstackgerrit | Claudiu Belu proposed openstack/nova master: WIP: replace spec with autospec https://review.openstack.org/557299 | 15:23 |
fried_rice | because I figure I should understand at least *some* part of it before voting. | 15:24 |
leakypipes | fried_rice: sorry, was grabbing lunch. I decided to back off and propose a patch in the future that shows what I was talking about. | 15:25 |
mriedem | like me reviewing new powervm features... | 15:25 |
leakypipes | fried_rice: on the refresh-in-reshaper thing | 15:25 |
mriedem | ssp vios wtf | 15:25 |
fried_rice | leakypipes: Roger, saw that comment, looking forward to seeing what you come up with :) | 15:25 |
edleafe | cdent: I changed the extract script to move n/a/o/placement to placement/api instead of the placement directory. What about the test directories? Should n/t/u/a/o/placement be p/t/u/api or just p/t/unit? | 15:27 |
cdent | edleafe: latter. unless you have a nova package somewhere in the "real" code, that name shouldn't show up in the tests? | 15:28 |
leakypipes | fried_rice: also, on cdent's API patch, I'm cool addressing that one race condition at a later time. | 15:28 |
leakypipes | fried_rice: it would be super rare anyway and not something I'm too concerned about | 15:29 |
edleafe | I was writing a little search/replace thing to change the pathing, and stumbled upon that difference | 15:29 |
cdent | "little search/replace thing"++ | 15:29 |
fried_rice | I guess I would expect things under p/t/u/api to be testing the API, which would really be /f/ tests not /u/ tests. So I'd be in favor of nixing /api on the /u/ side. | 15:31 |
cdent | fried_rice: the strategy I have in my head is it's all "api" unless otherwise specified | 15:33 |
cdent | as that maps to what was already in place | 15:33 |
fried_rice | sure, makes sense. | 15:33 |
cdent | also, it _might_ help us keep placement a thing with only one long running service | 15:33 |
fried_rice | I guess PlacementDirect would be non-api? | 15:33 |
*** giblet is now known as giblet_off | 15:34 | |
cdent | when I say "all api" I mean there is no api directory | 15:34 |
cdent | so everything just goes where it aligns with the existing package hierarchy | 15:34 |
cdent | so since direct.py is top level, so would its tests be | 15:34 |
cdent | edleafe: does all that correspond with your thinking? | 15:37 |
edleafe | Hmmm... then why the need for placement/api for the current n/a/o/placement stuff, instead of just moving it to placement/ ? | 15:39 |
cdent | that's what I'm saying there shouldn't be any 'api' directory anywhere | 15:40 |
cdent | the contents of n/a/o/placement is what becomes $repo/placement | 15:40 |
cdent | I wrote that on https://etherpad.openstack.org/p/placement-extraction-file-notes line 25 ish. did you see that? | 15:41 |
edleafe | I've read that a bunch of times along with other stuff. I'm getting confused; hence the clarification request | 15:42 |
cdent | I'm okay with there being a subdir there as it might be tidier (thus it being "up for debate" on the etherpad). What do people think? | 15:42 |
edleafe | In my first push, placement/ had 3 subdirs: api/, db/, and tests/ | 15:43 |
edleafe | So cut that down to just the latter two? | 15:43 |
cdent | I think so, yeah, assuming you're not counting policies, schemas, handlers (and other?) dirs in that "two"? | 15:44 |
edleafe | well yeah, after the stuff in api/ is moved down | 15:45 |
*** ttsiouts has quit IRC | 15:46 | |
cdent | I feel like maybe I'm still not understanding you but I guess it will all become clear in the next example and we can continue to iterate | 15:51 |
edleafe | No, I think we're on the same page | 15:52 |
*** tssurya has quit IRC | 15:54 | |
fried_rice | Is any of this repathing instrumental to getting things working, or could it maybe be done in a subsequent change set after the repo is seeded? | 15:54 |
fried_rice | not advocating, just asking. | 15:54 |
fried_rice | I guess you're having to determine paths for everything and change import lines anyway. | 15:55 |
mriedem | fried_rice: so in https://review.openstack.org/#/c/584599/ why did you change over the heal_allocations CLI but not the other usage in compute and conductor? | 15:57 |
mriedem | assuming b/c it was the only user of include_generation=True? | 15:57 |
fried_rice | mriedem: Because the heal allocations one was the only one using the new microversion at the time | 15:57 |
fried_rice | yes | 15:57 |
fried_rice | the other changeovers were going to be pretty complicated, IIRC, and I wanted to stay focused on what was needed for the reshaper series. | 15:58 |
fried_rice | and do the others "later" | 15:58 |
mriedem | sure | 15:58 |
mriedem | just checking | 15:58 |
fried_rice | hence the implicit TODO in the commit msg https://review.openstack.org/#/c/584599/21//COMMIT_MSG@23 | 15:58 |
fried_rice | and in the code https://review.openstack.org/#/c/584599/21/nova/scheduler/client/report.py@1539 | 15:59 |
mriedem | yes yes | 15:59 |
fried_rice | sorry, not trying to be defensive, just double checking myself | 15:59 |
mriedem | consider me bludgeoned | 15:59 |
fried_rice | dredging up memories from way back when I wrote this. | 15:59 |
mriedem | i have some comments on the heal_allocations part of it, but haven't posted yet | 15:59 |
fried_rice | ack | 15:59 |
fried_rice | knowing that you're reviewing the series, I wasn't planning to lift the bottom -2 until you're done. | 16:00 |
mriedem | yeah hopefully will be done today | 16:01 |
cdent | fried_rice: if we want to be able to do testing with nova.api.openstack.placement.handler and placement.handler potentially being in the same python space (as I'm doing right now) then we need to repath things | 16:01 |
mriedem | fried_rice: ok comments inline on that one | 16:03 |
mriedem | note the risk of putting a release name in a commit message near a release boundary :) | 16:03 |
fried_rice | did that happen? oops | 16:04 |
mriedem | so it looks like you've got -1s to address on the next patch after this, so might as well respin that commit message when you rebase and i'll fast approve | 16:05 |
fried_rice | mriedem: ack. | 16:07 |
fried_rice | though I'm not completely sure about the -1s above <== giblet_off leakypipes | 16:08 |
mriedem | did we say everything must be down during reshape runs? if so, that kind of breaks rolling upgrades right? | 16:11 |
mriedem | dansmith: ^ | 16:11 |
dansmith | I'm not sure what you mean, | 16:12 |
mriedem | https://review.openstack.org/#/c/584648/20/nova/scheduler/client/report.py | 16:12 |
dansmith | obviously placement can't be down so I don't really get it | 16:13 |
dansmith | I'm focusing on something else right now so I can't grok all that at the moment | 16:13 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: doc: Note NUMA topology requirements for numa-aware-vswitches https://review.openstack.org/596393 | 16:14 |
cdent | Is that maybe about FFU and the direct interface, which is only used some of the time, not all the time | 16:16 |
cdent | fried_rice ^ ? | 16:16 |
fried_rice | I thought placement offline was the whole reason we did PlacementDirect | 16:18 |
openstackgerrit | Merged openstack/nova stable/rocky: Correct the release notes related to nova-consoleauth https://review.openstack.org/595890 | 16:18 |
openstackgerrit | Merged openstack/nova master: tests: Move mocking to setUp https://review.openstack.org/595802 | 16:18 |
mriedem | well we have 2 upgrade scenarios, | 16:18 |
fried_rice | But I think maybe we did say we would allow reshape to happen on compute startup. | 16:18 |
mriedem | first start of compute with the new code which is an online upgrade | 16:18 |
fried_rice | yeah, that would be the second one? | 16:18 |
mriedem | compute fails to start if reshape fails | 16:18 |
mriedem | FFU is the offline one | 16:19 |
mriedem | like how we migrated ironic flavors | 16:19 |
fried_rice | okay. And in either case, is it possible for a partial-evacuate scenario to exist? | 16:19 |
mriedem | we did the ironic flavor migration online during n-cpu start and had the offline option via nova-manage | 16:19 |
mriedem | sure | 16:19 |
fried_rice | okay. Then I guess we need to handle that case. How? | 16:19 |
mriedem | i've got 1000 hosts, 3 failed and i evacuated from them. then i upgrade to stein. | 16:20 |
mriedem | i left some comments - as gibi said, i'm not entirely sure it will ruin anything, but i'm not that far ahead in the series to know | 16:20 |
fried_rice | butbut, don't the evacuated thingies get purged before we get here? | 16:20 |
mriedem | when we reshape, aren't we only reshaping things for the root provider in the tree which is going to be the current node i'm on - specifically speaking for the libvirt and xen drivers | 16:21 |
mriedem | so if there are other providers in the tree i probably don't even care about those | 16:21 |
mriedem | so it depends on how this is used i guess https://review.openstack.org/#/c/584648/20/nova/scheduler/client/report.py@2055 | 16:22 |
leakypipes | fried_rice: my -1 on that patch is because by adding the sharing pool to the ptree before passing it to update_provider_tree() you are populating a record in the ProviderTree for the shared storage pool. And I wanted to functionally test the scenario when that would *not* be populated in the ProviderTree and when an allocation involving that shared storage pool popped up, that the ProviderTree would "fill in" the missing sharing provider | 16:22 |
leakypipes | record. | 16:22 |
fried_rice | You're talking about specific known cases of reshape that will be happening in Stein. In the future, a provider tree may need to be reshaped to a different provider tree shape. This needs to be made to work for both. | 16:22 |
mriedem | the RPs per consumer could be more than just the $nodename RP | 16:22 |
mriedem | we know of 2 cases where the RPs can be other than $nodename: 1) sharing providers - which we don't support yet and (2) evacuated-from nodes | 16:23 |
mriedem | right? | 16:23 |
fried_rice | leakypipes: You mean the ssp doesn't exist before we reshape? | 16:23 |
leakypipes | fried_rice: doesn't exist in the compute node's ProviderTree cache, yes. | 16:23 |
fried_rice | right, that's what I'm doing. | 16:23 |
mriedem | maybe assert what you expect is being setup? | 16:24 |
mriedem | for clarity? | 16:24 |
alex_xu | leakypipes: fried_rice cdent, our team is working on nvdimm devices, but we are facing a fragmentation issue and looking for some help from you guys. The case: if you already have 2gb allocated in the middle of a 10gb device, you can't allocate the remaining 8gb, since the device requires contiguous space. Pretty sure we can't modify inventory after a claim, just like the fpga and gpu cases, so I'd appreciate help on some ideas | 16:24 |
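To make the fragmentation constraint concrete, a small illustrative sketch (hypothetical, not placement code) of why 2gb allocated mid-device blocks an 8gb request even though 8gb is nominally free:

    # Model the device as a list of (start_gb, size_gb) used extents.
    def largest_contiguous_free(total_gb, used):
        # A contiguous device can only satisfy a request from a single
        # gap between used extents, never from the sum of all the gaps.
        edges = sorted(used) + [(total_gb, 0)]
        best, cursor = 0, 0
        for start, size in edges:
            best = max(best, start - cursor)
            cursor = max(cursor, start + size)
        return best

    # 10gb device, 2gb used starting at offset 4: 8gb is free in total,
    # but the largest single allocation that still fits is only 4gb.
    print(largest_contiguous_free(10, [(4, 2)]))  # -> 4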
leakypipes | fried_rice: the comment on that line is this: | 16:24 |
leakypipes | # Another unrelated compute node. We don't use the report client's | 16:24 |
leakypipes | # convenience methods because we don't want this guy in the cache. | 16:24 |
fried_rice | leakypipes: The whole design of upt (and its precursors) is to *make* it appear in the ProviderTree so that upt sees it and can figure out what to do with it. | 16:24 |
leakypipes | fried_rice: and I'm saying I would prefer that you do the same for the shared storage pool. create it outside of the provider tree convenience methods to ensure the provider tree doesn't have it already when get_allocations_for_provider() ends up being called | 16:25 |
fried_rice | ohhhhhh, now I follow. I wasn't reading the context carefully enough. The code comment you highlighted is talking about a different non-sharing provider elsewhere in placement, and your review comment is asking why same wasn't done 6LOC earlier when we created the SSP. Sorry, ack, will look closer. (Gotta run now) | 16:27 |
leakypipes | right, exactly. | 16:27 |
fried_rice | But I think the ssp won't show up in the cache if I do that. | 16:27 |
leakypipes | fried_rice: that's kinda what I'm trying to tease out... | 16:27 |
leakypipes | fried_rice: to see if the assumptions made in the code are tested for edge cases like this. | 16:28 |
* alex_xu scroll the screen, sounds like something broke | 16:28 | |
fried_rice | owait, I take it back, if MISC_SHARES is set, and it's in the same agg, it *should* show up. So yeah, I can change that out. | 16:28 |
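For readers following along: the trait referenced here is MISC_SHARES_VIA_AGGREGATE, which marks a provider as sharing its inventory; per the discussion, putting it in the same aggregate as the compute node is what should make it show up in the compute's ProviderTree cache. A rough sketch of the shape being set up in the test, with made-up values:

    # Illustrative shape of a shared storage pool as a sharing provider.
    shared_storage_pool = {
        'name': 'ssp1',
        'traits': ['MISC_SHARES_VIA_AGGREGATE'],  # marks it as sharing
        'aggregates': ['agg1-uuid'],  # same aggregate as the compute node
        'inventories': {'DISK_GB': {'total': 500, 'max_unit': 500}},
    }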
cdent | alex_xu: I've read your message above, and I'm thinking. | 16:28 |
leakypipes | fried_rice: rock on. | 16:28 |
fried_rice | alex_xu: This sounds like a job for reserved=total | 16:29 |
alex_xu | cdent: thanks, sorry for injecting the message | 16:29 |
alex_xu | fried_rice: emm... do you mean that after the allocation, you then change the reserved value of the inventory? | 16:30 |
cdent | alex_xu: I would guess some kind of dynamic max_unit adjustments might be workable | 16:30 |
leakypipes | alex_xu: if I'm being blunt, this sounds like something you'll need to solve on your own outside of placement. | 16:30 |
fried_rice | heh | 16:30 |
cdent | at start max_unit is 8, after the 2 is used, it is 3 or 4 | 16:30 |
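Both suggestions map onto fields that already exist on a placement inventory record. A hedged sketch of each, written as the Python-dict body of a PUT /resource_providers/{uuid}/inventories/{resource_class} call, using made-up values and a hypothetical CUSTOM_NVDIMM_GB resource class:

    # fried_rice's reserved=total: once a claim fragments the device so
    # badly that nothing useful fits, mark all capacity as unusable.
    inventory_reserved = {
        'resource_provider_generation': 5,  # made-up generation
        'total': 10, 'reserved': 10,
        'min_unit': 1, 'max_unit': 10, 'step_size': 1,
        'allocation_ratio': 1.0,
    }

    # cdent's dynamic max_unit: keep max_unit tracking the largest
    # contiguous chunk still free (4gb in the mid-device example above).
    inventory_max_unit = {
        'resource_provider_generation': 5,  # made-up generation
        'total': 10, 'reserved': 0,
        'min_unit': 1, 'max_unit': 4, 'step_size': 1,
        'allocation_ratio': 1.0,
    }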
fried_rice | that's three wildly different answers | 16:30 |
alex_xu | cdent: yea, but we can't dynamically change inventory | 16:31 |
cdent | i'm pretty okay with client side dynamically adjusting inventory if they feel that's the right thing | 16:31 |
cdent | alex_xu: why not? | 16:31 |
alex_xu | leakypipes: yea... my initial thought was that something on the host or device driver should fix the fragmentation issue, but it just doesn't fix it | 16:31 |
alex_xu | cdent: emm... that should be the same as the gpu case; if we dynamically change the inventory after a resource claim, we will have a race problem | 16:32 |
leakypipes | alex_xu: or you could punt to Cinder, since that's basically what you're doing with nvdimm... volume management. | 16:32 |
alex_xu | leakypipes: it isn't a cinder thing, it is a memory device. | 16:33 |
leakypipes | I've heard they really like live resize functionality. | 16:33 |
alex_xu | I have one idea... | 16:33 |
leakypipes | alex_xu: it's not memory. it's memory with caveats. | 16:33 |
alex_xu | ask the operator to set up the device first: for a 10gb device, separate it into 5 fragments, each one only 2gb, with max_unit,min_unit=2gb | 16:34 |
cdent | alex_xu: yes, that would be an option | 16:36 |
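That fixed-size partitioning maps onto the same inventory fields; a sketch with illustrative values (five 2gb fragments, every claim exactly one fragment, which is also the source of the flexibility objection below):

    inventory_fixed = {
        'resource_provider_generation': 5,  # made-up generation
        'total': 10, 'reserved': 0,
        # min_unit == max_unit == step_size pins every allocation to
        # exactly 2gb, so claims always land on whole fragments.
        'min_unit': 2, 'max_unit': 2, 'step_size': 2,
        'allocation_ratio': 1.0,
    }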
alex_xu | leakypipes: there are two ways to use an nvdimm device: as storage, or as memory. We are working on the memory case, actually just passing it through to the VM, which sees it as a vNVDIMM device | 16:36 |
alex_xu | cdent: but someone pushed back on me; the reason is it isn't flexible enough in usage | 16:36 |
cdent | alex_xu: yeah, I'm not sure there's a truly flexible solution if you are unable to dynamically manage inventory | 16:37 |
*** fried_rice is now known as fried_rolls | 16:39 | |
mriedem | fried_rolls: related to leakypipes' comment on the test, isn't it implicitly hitting the evacuate scenario? | 16:40 |
alex_xu | leakypipes: really not sure there is a document that explains nvdimm easily.. but from the qemu doc, it really is a memory device https://github.com/qemu/qemu/blob/master/docs/nvdimm.txt | 16:40 |
mriedem | "Another unrelated compute node." | 16:40 |
openstackgerrit | Dan Smith proposed openstack/nova master: Batch results per cell when doing cross-cell listing https://review.openstack.org/592698 | 16:41 |
openstackgerrit | Dan Smith proposed openstack/nova master: List instances from all cells explicitly https://review.openstack.org/593717 | 16:41 |
openstackgerrit | Dan Smith proposed openstack/nova master: Make instance_list perform per-cell batching https://review.openstack.org/593131 | 16:41 |
openstackgerrit | Dan Smith proposed openstack/nova master: Record cell success/failure/timeout in CrossCellLister https://review.openstack.org/594265 | 16:41 |
openstackgerrit | Dan Smith proposed openstack/nova master: Optimize global marker re-lookup in multi_cell_list https://review.openstack.org/594577 | 16:41 |
alex_xu | cdent: can we let placement support dynamically managing inventory? | 16:41 |
alex_xu | leakypipes: fried_rolls ^ is that option, or we already totally say no | 16:41 |
cdent | alex_xu: I don't understand what you mean? | 16:41 |
cdent | you can change inventory in placement whenever you want | 16:42 |
leakypipes | alex_xu: no. | 16:42 |
cdent | if you're asking for placement to provide some form of transactional control, that's not going to happen | 16:42 |
leakypipes | what cdent said. | 16:42 |
alex_xu | cdent: leakypipes yea, 'transactional' is what i mean | 16:43 |
cdent | allocations and inventory are very intentionally not strongly connected | 16:43 |
alex_xu | cdent: emm.. what is the key reason we won't ever support it? just want to understand the thinking | 16:45 |
cdent | alex_xu: as I understand what you're asking, you want to avoid a race condition where at time X we allocate 2gb in the middle of nvdimm and at near the same time you want to change the inventory so that (in a variety of strategies) the inventory is represented in a way that allows more of it to be consumed accurately. To avoid the race there the initial allocation would somehow have to signal a lock on any interaction | 16:48 |
cdent | with the inventory until it was updated | 16:48 |
alex_xu | cdent: yes | 16:49 |
cdent | alex_xu: is that right? if so, that's more placement-side state management than we've designed into the system. adding something like that would be very complicated and contrary to some of the original design goals about allocations | 16:49 |
cdent | It is probably possible, but it would be hard, for what amounts to an edge case | 16:49 |
cdent | it would be better to either: accept the risk of the race, or figure out a way to manage the inventory in a way that works with the existing constraints | 16:50 |
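One constraint placement does already offer when "accepting the risk of the race": inventory writes carry the resource provider generation, so a concurrent change makes the write fail with a 409 conflict instead of silently clobbering state. A sketch of the client-side retry loop, using hypothetical helper methods (get_inventory/put_inventory are not real report client calls):

    def shrink_max_unit(client, rp_uuid, new_max_unit):
        # Optimistic concurrency: read the inventory (which carries the
        # provider generation), attempt the write, retry on 409 conflict.
        while True:
            inv = client.get_inventory(rp_uuid, 'CUSTOM_NVDIMM_GB')
            inv['max_unit'] = new_max_unit
            resp = client.put_inventory(rp_uuid, 'CUSTOM_NVDIMM_GB', inv)
            if resp.status_code != 409:
                return resp
            # Another writer (e.g. a fresh allocation) bumped the
            # generation first; loop and recompute against the new state.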
openstackgerrit | Dan Smith proposed openstack/nova master: Optimize global marker re-lookup in multi_cell_list https://review.openstack.org/594577 | 16:51 |
alex_xu | cdent: i got it, thanks | 16:51 |
alex_xu | cdent: leakypipes fried_rolls, so thanks, at least I got one option I won't think about anymore. probably will continue pushing the fixed-size idea | 16:53 |
cdent | alex_xu: it's at least a way to get started, and then you can iterate | 16:53 |
alex_xu | cdent: yes, agree with that, hope i can persuade people | 16:54 |
dansmith | mriedem: do you still want/need me to look at that upgrade thing from earlier or did it get worked out? | 16:57 |
mriedem | umm, might need to ask fried_rolls | 17:02 |
mriedem | i'm leaving for a bit | 17:02 |
*** mriedem is now known as mriedem_afk | 17:02 | |
* alex_xu continuously explains to people that placement won't support transactional control, from the cat and fpga to nvdimm | 17:05 |
openstackgerrit | Dan Smith proposed openstack/nova master: Batch results per cell when doing cross-cell listing https://review.openstack.org/592698 | 17:18 |
openstackgerrit | Dan Smith proposed openstack/nova master: List instances from all cells explicitly https://review.openstack.org/593717 | 17:18 |
openstackgerrit | Dan Smith proposed openstack/nova master: Make instance_list perform per-cell batching https://review.openstack.org/593131 | 17:18 |
openstackgerrit | Dan Smith proposed openstack/nova master: Record cell success/failure/timeout in CrossCellLister https://review.openstack.org/594265 | 17:18 |
openstackgerrit | Dan Smith proposed openstack/nova master: Optimize global marker re-lookup in multi_cell_list https://review.openstack.org/594577 | 17:18 |
*** fried_rolls is now known as fried_rice | 17:47 | |
fried_rice | dansmith: What upgrade thing from earlier? | 17:47 |
fried_rice | oh | 17:47 |
fried_rice | There's been a couple of reshape scenarios identified where we may have the potential to race, and we're wondering to what extent the reshaper is only being run in a steady/reduced-activity state, and how that impacts those races. | 17:48 |
fried_rice | dansmith: The one we were talking about earlier was here: https://review.openstack.org/#/c/584648/20/nova/scheduler/client/report.py | 17:49 |
fried_rice | Wondering about evacuate and whether we can have one consumer with allocations straddling multiple computes' providers. | 17:49 |
fried_rice | which it sounds like we could | 17:50 |
dansmith | instances in the middle of a resize across an upgrade boundary is not (at all) uncommon | 17:50 |
dansmith | which means you have an upgraded and unupgraded node that hold allocations for that instance which may be reverted or confirmed on either side of the upgrade by the old or new node | 17:51 |
dansmith | we have no way to prevent that scenario, and when we've talked about it, real clouds confirmed that preventing it is completely impossible | 17:51 |
fried_rice | but those scenarios don't use the "migration UUID". | 17:51 |
fried_rice | So there would in fact be the same consumer_uuid existing on providers owned by two different hosts. | 17:51 |
dansmith | um, what? | 17:51 |
dansmith | no, the migration uuid is the consumer in that case, which I assume makes your thing easier, | 17:52 |
dansmith | but you may have an older node restoring allocations for a new one | 17:52 |
fried_rice | "older" meaning "before migration_uuid was a thing"? | 17:53 |
dansmith | because you have allocations held on an old node by migration uuid, then on revert, we use those to restore them against the new node | 17:53 |
dansmith | no, old as in pre-reshape | 17:53 |
fried_rice | okay, but I think that's fine as long as the consumer UUIDs are *different* on both sides (hosts) | 17:53 |
fried_rice | I don't care which one is the real instance UUID and which is the migration UUID | 17:53 |
dansmith | it's not if the older node tries to restore a flat allocation against a nested inventory | 17:54 |
dansmith | but regardless, I'm not all caught up on the actual scenario, | 17:54 |
dansmith | I'm just saying you really can't assume that "all evacuations are quiesced" before an upgrade or whatever you said this morning | 17:54 |
dansmith | and you can't assume scheduler or placement is down (or up) either | 17:55 |
fried_rice | Cool, got that part understood. | 17:55 |
fried_rice | So what I actually need to know now is whether there's any kind of move/migration/resize/evacuation/etc. where it's possible to have the same consumer UUID on two providers owned by different hosts. | 17:55 |
fried_rice | (sharing providers don't count) | 17:55 |
dansmith | I don't think you can say that won't or can't happen | 17:56 |
fried_rice | e.g. could GET /allocations/{c} ever return | 17:56 |
fried_rice | { cn1_rp_uuid: { ... }, | 17:56 |
fried_rice | cn2_rp_uuid: { ... } | 17:56 |
fried_rice | } | 17:56 |
fried_rice | ? | 17:56 |
dansmith | it shouldn't happen right now with nova as it is, but if cyborg gets in the mix I would think you could have that fairly easily | 17:57 |
fried_rice | well | 17:57 |
fried_rice | I don't think I care about that, do I? | 17:57 |
fried_rice | and | 17:57 |
dansmith | I definitely don't know what you care about | 17:57 |
fried_rice | cyborg is going to be "owning" the device providers, but those are still going to be nested under the compute RPs. | 17:58 |
dansmith | anyway, mriedem_afk can tell you all of this that you need to know, so I needn't be involved in this I think | 17:58 |
fried_rice | okay. Thanks for the input. | 17:58 |
openstackgerrit | Merged openstack/nova master: Make CELL_TIMEOUT a constant https://review.openstack.org/594570 | 18:08 |
* cdent waves | 18:14 | |
*** cdent has quit IRC | 18:14 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Merge security groups extension response into server view builder https://review.openstack.org/585475 | 18:34 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Merge extended_status extension response into server view builder https://review.openstack.org/592092 | 18:35 |
openstackgerrit | Merged openstack/nova master: tests: Create functional libvirt test base class https://review.openstack.org/407055 | 18:41 |
*** mriedem_afk is now known as mriedem | 18:45 | |
mriedem | fried_rice: the answer to "So what I actually need to know now is whether there's any kind of move/migration/resize/evacuation/etc. where it's possible to have the same consumer UUID on two providers owned by different hosts." is definitely "yes" for an evacuated instance | 18:48 |
mriedem | but as noted in the review, if the original evacuated-from source host ever restarts we'll remove its allocations for any instances evacuated from that host | 18:48 |
mriedem | and we fixed the case that we'd orphan providers when deleting a nova-compute service in the os-services API | 18:49 |
mriedem | impossible to predict all of the weird shit that could happen, which is why we have heal_allocations i guess... | 18:49 |
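Illustratively (UUIDs and resource amounts made up), the evacuation case mriedem confirms means a GET /allocations/{consumer_uuid} for the evacuated instance can come back with the same consumer holding resources against both computes' root providers:

    # Hypothetical response body: one consumer, two hosts' providers.
    evacuated_instance_allocations = {
        'allocations': {
            'cn1-root-rp-uuid': {'resources': {'VCPU': 2, 'MEMORY_MB': 4096}},
            'cn2-root-rp-uuid': {'resources': {'VCPU': 2, 'MEMORY_MB': 4096}},
        },
    }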
fried_rice | mriedem: Thanks. I'll have to think through the implications in the reshaper context, which afaict will start with the fact that the allocations param we pass into the (second) update_provider_tree call will contain those allocations from the other host. Not sure what will actually fall out of that. May just be a case of documenting the possibility for the implementor of the upt reshape flow. | 18:56 |
fried_rice | sounds like for the evacuation scenario that documentation should simply say "leave these tf alone if you see 'em" | 18:57 |
fried_rice | which I imagine ought to be the default behavior anyway, since that flow should be focusing on moving allocations only for providers it's messing with, which should *not* include providers from other hosts. | 18:57 |
mriedem | ^ is kind of what i was getting at earlier and you said yes for now in the very specific stein case | 18:58 |
mriedem | the implementor won't actually know what the other RPs are probably w/o any kind of context, | 18:58 |
mriedem | i.e. looking up, "is this a compute node and if so, was it involved in some kind of migration?" | 18:58 |
fried_rice | right. It should employ the strategy of, "Is this allocation related to a provider I'm reshaping? No? Ignore." | 19:02 |
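That "ignore what isn't yours" rule is simple to express; a sketch with assumed shapes (allocations keyed by consumer UUID, each value carrying an 'allocations' mapping keyed by provider UUID, per the payload discussed above):

    def allocations_to_reshape(allocations, owned_rp_uuids):
        # Keep only consumers that touch a provider this driver is
        # actually reshaping; allocations pinned to other hosts'
        # providers (e.g. evacuation leftovers) pass through untouched.
        return {
            consumer_uuid: alloc
            for consumer_uuid, alloc in allocations.items()
            if set(alloc['allocations']) & set(owned_rp_uuids)
        }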
mriedem | the only thing is, | 19:09 |
mriedem | what does the virt driver have for context? it gets the nodename but it doesn't necessarily know based on RP UUID which RPs in the tree are the ones it cares about, right? | 19:10 |
mriedem | like, how does the virt driver identify the rp it actually cares about? | 19:10 |
mriedem | i know we *could* figure that out by looking up the compute node via CONF.host and nodename to get the CN UUID | 19:11 |
mriedem | but that's rather fugly | 19:11 |
mriedem | especially since the RT could just pass it down | 19:11 |
mriedem | "this is your nodename and this is your CN UUID" | 19:11 |
mriedem | if only we had a local yaml file that told nova-compute exactly what its local inventory was.... :) | 19:12 |
fried_rice | doesn't even get the nodename | 19:18 |
fried_rice | the rt already passes down my nodename | 19:18 |
fried_rice | and we already tell upt implementors (via the docstring) not to futz with providers they don't recognize as being owned by them. | 19:19 |
fried_rice | which they mostly identify by knowing which ones they think are supposed to be the ones in the tree | 19:19 |
fried_rice | but also potentially via namespacing | 19:20 |
fried_rice | which might or might not wind up causing problems here. | 19:20 |
fried_rice | leakypipes: I'm rebasing https://review.openstack.org/#/c/590041/ k? | 19:20 |
openstackgerrit | Eric Fried proposed openstack/nova master: [placement] split gigantor SQL query, add logging https://review.openstack.org/590041 | 19:23 |
leakypipes | fried_rice: sure, np | 19:38 |
leakypipes | fried_rice: did you want me to change that log message? | 19:38 |
fried_rice | leakypipes: I did, and I did. | 19:38 |
fried_rice | leakypipes: See epic cumulative comment response | 19:39 |
leakypipes | ah, never mind then :) | 19:39 |
fried_rice | melwitt: --^ | 19:39 |
leakypipes | thx fried_rice | 19:39 |
fried_rice | mriedem: leakypipes: Restacking the reshaper series. Are you done reviewing? | 19:39 |
*** mriedem has quit IRC | 19:40 | |
leakypipes | fried_rice: I was not, no. | 19:44 |
leakypipes | fried_rice: can you hold on a bit on that one? | 19:45 |
fried_rice | leakypipes: I've started working on the last one you -1'd. I can wait for you after that. | 19:45 |
leakypipes | fried_rice: k. five minutes pls | 19:48 |
fried_rice | sho | 19:48 |
*** mriedem has joined #openstack-placement | 19:48 | |
openstackgerrit | Merged openstack/nova master: Stash the cell uuid on the context when targeting https://review.openstack.org/594571 | 20:09 |
fried_rice | leakypipes: Kid run, bbiab. But finished local restack up to https://review.openstack.org/#/c/584648/ | 20:12 |
*** fried_rice is now known as efried_afk | 20:13 | |
*** efried_afk is now known as fried_rice | 20:14 | |
fried_rice | leakypipes: Cancel that, wires crossed, no kid run. | 20:17 |
leakypipes | fried_rice: :) | 20:17 |
openstackgerrit | Dan Smith proposed openstack/nova master: Batch results per cell when doing cross-cell listing https://review.openstack.org/592698 | 20:29 |
openstackgerrit | Dan Smith proposed openstack/nova master: List instances from all cells explicitly https://review.openstack.org/593717 | 20:29 |
openstackgerrit | Dan Smith proposed openstack/nova master: Make instance_list perform per-cell batching https://review.openstack.org/593131 | 20:29 |
openstackgerrit | Dan Smith proposed openstack/nova master: Record cell success/failure/timeout in CrossCellLister https://review.openstack.org/594265 | 20:30 |
openstackgerrit | Dan Smith proposed openstack/nova master: Optimize global marker re-lookup in multi_cell_list https://review.openstack.org/594577 | 20:30 |
leakypipes | fried_rice: k, done. sorry for delay. | 20:53 |
fried_rice | leakypipes: ack, thx | 20:53 |
fried_rice | leakypipes: What is necessary to get your +1 upgraded to +2 on the top patch? | 21:01 |
fried_rice | Cause I think I've got the rest of the pile done. | 21:01 |
openstackgerrit | Eric Fried proposed openstack/nova master: Report client: Real get_allocs_for_consumer https://review.openstack.org/584599 | 21:13 |
openstackgerrit | Eric Fried proposed openstack/nova master: Report client: get_allocations_for_provider_tree https://review.openstack.org/584648 | 21:13 |
openstackgerrit | Eric Fried proposed openstack/nova master: Report client: _reshape helper, placement min bump https://review.openstack.org/585034 | 21:13 |
openstackgerrit | Eric Fried proposed openstack/nova master: Report client: update_from_provider_tree w/reshape https://review.openstack.org/585049 | 21:13 |
openstackgerrit | Eric Fried proposed openstack/nova master: Compute: Handle reshaped provider trees https://review.openstack.org/576236 | 21:13 |
fried_rice | leakypipes, mriedem, giblet_off, cdent: ^^^^^ | 21:14 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: api-ref: fix volume attachment update policy note https://review.openstack.org/596489 | 21:29 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: api-ref: add a warning about calling swap volume directly https://review.openstack.org/596492 | 21:47 |
openstackgerrit | Eric Fried proposed openstack/nova master: Document no content on POST /reshaper 204 https://review.openstack.org/596494 | 21:49 |
openstackgerrit | Eric Fried proposed openstack/nova master: Compute: Handle reshaped provider trees https://review.openstack.org/576236 | 21:53 |
fried_rice | leakypipes: In case you were in the middle, forgot one thing in that top patch ^ | 21:53 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add functional test for live migrate with anti-affinity group https://review.openstack.org/588935 | 21:53 |
mriedem | fried_rice: i suppose you should re-propose that reshaper spec for stein huh | 22:04 |
mriedem | or did you already? | 22:04 |
fried_rice | mriedem: did already | 22:05 |
fried_rice | mriedem: https://review.openstack.org/#/c/592650/ | 22:05 |
mriedem | ah i see it | 22:05 |
mriedem | yeah | 22:05 |
mriedem | oh god you had to format it didn't you | 22:06 |
openstackgerrit | Eric Fried proposed openstack/nova master: Fix race condition in reshaper handler https://review.openstack.org/596497 | 22:07 |
fried_rice | mriedem: no? | 22:10 |
mriedem | it's fine, very thorough as usual | 22:10 |
mriedem | +W | 22:10 |
fried_rice | Straight copy, plus deltas as advertised. | 22:10 |
fried_rice | Thanks. | 22:10 |
fried_rice | mriedem: Oh, I had to rename those linkylinks because they have to be globally unique across the whole doc build :( :( :( | 22:11 |
fried_rice | (made me 3x sad) | 22:12 |
mriedem | oh i was wondering about dropping the () | 22:15 |
mriedem | yeah that sucks | 22:15 |
melwitt | fried_rice: cool, will check it out | 22:16 |
mriedem | i hit something in the cinder api-ref the other day b/c of v2 and v3 api sections, took me awhile to realize why it was complaining | 22:16 |
openstackgerrit | Merged openstack/nova-specs master: Repropose reshaper spec for Stein https://review.openstack.org/592650 | 22:23 |
mriedem | fried_rice: looks like maybe another missing uuids.agg1 in the test here https://review.openstack.org/#/c/585034/ | 23:03 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Deprecate Core/Ram/DiskFilter https://review.openstack.org/596502 | 23:28 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Deprecate Core/Ram/DiskFilter https://review.openstack.org/596502 | 23:32 |
openstackgerrit | melanie witt proposed openstack/nova master: Make scheduler.utils.setup_instance_group query all cells https://review.openstack.org/540258 | 23:45 |