*** tetsuro has joined #openstack-placement | 01:40 | |
openstackgerrit | Eric Fried proposed openstack/placement master: Move provider_ids_from_rp_ids to allocation_candidate https://review.opendev.org/675084 | 02:54 |
---|---|---|
*** tetsuro has quit IRC | 03:25 | |
openstackgerrit | Merged openstack/placement master: Optimize trait creation to check existence first https://review.opendev.org/673555 | 03:47 |
*** tetsuro has joined #openstack-placement | 04:07 | |
*** tetsuro has quit IRC | 06:32 | |
*** tetsuro has joined #openstack-placement | 07:14 | |
*** tetsuro has quit IRC | 07:18 | |
*** tssurya has joined #openstack-placement | 07:23 | |
*** takashin has left #openstack-placement | 08:01 | |
openstackgerrit | Chris Dent proposed openstack/placement master: Move provider_ids_from_rp_ids to allocation_candidate https://review.opendev.org/675084 | 08:14 |
*** tetsuro has joined #openstack-placement | 08:17 | |
*** helenafm has joined #openstack-placement | 08:24 | |
*** e0ne has joined #openstack-placement | 09:04 | |
*** cdent has joined #openstack-placement | 09:07 | |
openstackgerrit | Chris Dent proposed openstack/placement master: Move provider_ids_from_rp_ids to allocation_candidate and fix https://review.opendev.org/675084 | 09:29 |
cdent | tetsuro: fixed ^. thanks for catching that | 09:29 |
cdent | tetsuro: this is another of the important perf fixes: https://review.opendev.org/#/c/673991/ | 09:30 |
cdent | and this is simply more data: https://review.opendev.org/#/c/673540/ | 09:31 |
tetsuro | Thanks, will have a look on them | 09:32 |
cdent | thank you | 09:32 |
cdent | once the first two are merged I'll compare profiles from the patch that added same_subtree to now. I'm hoping it will look pretty good | 09:33 |
openstackgerrit | Chris Dent proposed openstack/placement master: Use another expanding bindparam in _get_usages_by_provider_trees https://review.opendev.org/675287 | 10:30 |
cdent | tetsuro, gibi, stephenfin : Do any of you have strong feelings about using (or not) a different JSON dumper: https://review.opendev.org/674661 ? | 10:32 |
*** tetsuro has quit IRC | 11:02 | |
cdent | efried: when you have a spare momen to drag your mind back through history can you look at then you added this: https://review.opendev.org/#/c/517757/31..33/nova/api/openstack/placement/objects/resource_provider.py,unified and recall the reasons why? It's not a not-deep copy, but it is _expensive_. If I change it to not copy, we don't get accurate results, but I'm struggling to suss out the exact mechanisms of why | 11:12 |
gibi | cdent: left comments in https://review.opendev.org/#/c/674661 I'm not against a different json lib if we make a well informed decision | 11:37 |
cdent | thanks gibi | 11:38 |
openstackgerrit | Chris Dent proposed openstack/placement master: Correct SQL docstring on _get_usages_by_provider_trees https://review.opendev.org/675302 | 11:43 |
*** tssurya has quit IRC | 11:50 | |
*** tssurya has joined #openstack-placement | 11:50 | |
*** cdent has quit IRC | 12:04 | |
*** takashin has joined #openstack-placement | 12:19 | |
openstackgerrit | Merged openstack/placement master: _get_all_by_filters_from_db do not cast to list of dict https://review.opendev.org/673991 | 13:02 |
*** cdent has joined #openstack-placement | 13:05 | |
*** mriedem has joined #openstack-placement | 13:16 | |
efried | cdent: It was so we don't reuse arrZ across separate results | 13:17 |
cdent | efried: I understand that, but where are we post-manipulating the ones that we've already used? | 13:18 |
efried | I don't have it fully in my head, but something about the way we blow out that itertools.product means we would have otherwise ended up trying to use the same arr for a different result. And the algorithm there is changing its values, so they'd be wrong for the other result. | 13:18 |
efried | (itertools.product in _merge_candidates that is) | 13:19 |
openstackgerrit | Merged openstack/placement master: Add apache benchmark (ab) to end of perfload jobs https://review.opendev.org/673540 | 13:19 |
cdent | so, at the mo, _merge_candidates and especially _consolidate... are big hitters in the proflle data and they happen to be the parts I least understand | 13:20 |
efried | I remember debugging this and putting a printf in that spot. Maybe like printing the __id__ of the arr? And proving that we're hitting the same one more than once | 13:20 |
efried | Yeah, that's not surprising. _merge_candidates is a beast. | 13:20 |
* cdent nods | 13:20 | |
cdent | it isn't, however, as much of an issue as the json dumping | 13:21 |
efried | does that copy stand out as being the most expensive part of _merge_candidates? | 13:21 |
cdent | yes | 13:22 |
cdent | it's the most expensive part of _consolidate, which is the most expensive part of _merge | 13:22 |
efried | I assume it would make no difference if instead of copy you explicitly initialized a new ARR at that point | 13:23 |
cdent | I wouldn't have thought so, unless copy is doing something really weird. It may be worth trying, but I haven't, yet | 13:24 |
efried | we're probably only noticing because here we're rebuilding *all* the ARRs at once, whereas otherwise we're building them spread out | 13:24 |
efried | like, building them the first time is probably just as expensive, but spread out, am I making sense? | 13:24 |
cdent | the profile is aggregating the total time of a method across all n-thousand times it is called during one request | 13:25 |
cdent | (at a particular location in the call tree) | 13:25 |
efried | So I'm saying, if you look at ARR's constructor, does it overall account for 2x the time of that copy? | 13:27 |
cdent | let me find some data | 13:28 |
efried | Anyway, not sure that's a super productive line of inquiry. Point is that we most likely don't need that copy a large percentage of the time. | 13:29 |
cdent | (it's hard to find the data because of the way filenames are truncated...) | 13:30 |
cdent | note: 60060.0015912.649e-070.0015912.649e-07allocation_candidate.py:217(__init__) | 13:31 |
cdent | 0.0797 for the copy | 13:32 |
cdent | 20020.023761.187e-050.10895.44e-05allocation_candidate.py:598(_consolidate_allocation_requests) | 13:32 |
efried | I think it becomes necessary when we have multiple permutations of results involving the same tree, when you've requested the same resource class from multiple groups. Let me try to concoct an example... | 13:33 |
efried | ...and also only with group_policy=none | 13:34 |
efried | so like, if you ask for VF:1,VF:1 and your host has multiple PFs | 13:34 |
*** stephenfin has quit IRC | 13:35 | |
efried | You'll get a results with | 13:35 |
efried | PF1{VF:1},PF2{VF:1} | 13:35 |
efried | PF1{VF:2} | 13:35 |
efried | PF2{VF:2} | 13:35 |
efried | To get those four results, we start with only two ARRs: PF1{VF:1} and PF2{VF:1} | 13:36 |
efried | s/four/three/ | 13:36 |
*** stephenfin has joined #openstack-placement | 13:36 | |
efried | when we construct the second result, in _consolidate if we just used the original PF1{VF:1} and incremented it, it would make the first result wrong. | 13:37 |
efried | So we have to copy the ARR | 13:37 |
efried | is this making sense? | 13:37 |
* cdent nods | 13:39 | |
efried | so what we could conceivably do, assuming we don't just find that copy() is borked, is do some calculations ahead of time to figure out whether the copy is necessary | 13:39 |
efried | Like, if group_policy=isolate it should never be necessary. | 13:40 |
efried | And if this RC only shows up in one request group, it shouldn't be necessary for this RC. | 13:40 |
efried | and then extend this conditional with a third branch that just assigns, doesn't copy, in those cases. | 13:40 |
cdent | I'm hesitant to make changes that need to be overly aware of the requesting style | 13:41 |
efried | actually, it's possible we should never be hitting _consolidate at all with group_policy=isolate | 13:41 |
cdent | imma try a thing, hold please | 13:41 |
efried | yeah, I agree with you, but that's how perf tuning goes. | 13:41 |
efried | clean, slow code => messy, performant code | 13:41 |
cdent | yes, I know, but I don't think we're there yet | 13:44 |
cdent | we currently have messy slow that is being improved to less messy and faster | 13:44 |
cdent | interesting | 13:45 |
cdent | implement our own __copy__ seems to help quite a bit: | 13:46 |
cdent | 0.0797 -> 0.0225 | 13:46 |
cdent | and what that copy is is simply create a new object with same attributes | 13:46 |
efried | that sounds like a python bug | 13:46 |
efried | fwiw, passing group_policy in and blowing out that conditional to only copy for 'none' passes func tests | 13:47 |
cdent | it's standard python: if you don't like what copy does natively, add a __copy__ | 13:47 |
cdent | I'm just surprised in this case that it makes so much difference | 13:47 |
cdent | up front python can't look at an object and know that a simple clone is sufficient because of the way attributes are managed | 13:47 |
cdent | but yeah, it does seem dumb | 13:47 |
* cdent makes yet another patch | 13:47 | |
efried | I want to say I tried this a couple different ways while I was messing with this. | 13:48 |
efried | One of them was, instead of implementing __copy__, I just initialized a new object explicitly using the fields of the old one. Basically inlining the __copy__logic. | 13:48 |
efried | but yeah, __copy__ is better | 13:49 |
efried | any interest in doing the group_policy-based thing? | 13:49 |
cdent | it's apparently the "idiomatic" form of "clone()" | 13:49 |
cdent | maybe later | 13:49 |
cdent | I want to avoid that sort of thing until we absolutely run out of others things | 13:49 |
efried | okay. FWIW it's just this http://paste.openstack.org/show/755656/ (pep8 would need fixing) | 13:50 |
cdent | and doing __copy__ more than halves the cost of _consolidate | 13:50 |
cdent | you could go ahead and put that in gerrit and we could decide by looking at the numbers | 13:51 |
cdent | i don't know if we are doing much isolate-driven testing yet, though? | 13:51 |
cdent | not in perfload | 13:51 |
efried | dunno. And that'll become N/A hopefully soon when we stop using big-hammer group_policy | 13:52 |
cdent | is that Y or Z at this point? | 13:53 |
efried | I can work up the other optimization as well, where we only do the copy for resources we requested multiples of from separate groups. | 13:53 |
* cdent scurries away | 13:53 | |
cdent | there's another copy that ought to be helped by the same thing (for AR instead of ARR) | 14:00 |
*** cdent has quit IRC | 14:31 | |
*** takashin has left #openstack-placement | 15:00 | |
openstackgerrit | Chris Dent proposed openstack/placement master: Add __copy__ method to AllocationRequest{,Resource} https://review.opendev.org/675372 | 15:02 |
*** tssurya has quit IRC | 15:43 | |
*** helenafm has quit IRC | 15:45 | |
*** e0ne has quit IRC | 16:16 | |
*** cdent has joined #openstack-placement | 16:29 | |
*** Sundar has joined #openstack-placement | 16:51 | |
*** Sundar has quit IRC | 16:54 | |
*** e0ne has joined #openstack-placement | 17:05 | |
openstackgerrit | Chris Dent proposed openstack/placement master: Add __copy__ method to AllocationRequest{,Resource} https://review.opendev.org/675372 | 17:08 |
cdent | heh, a single response to the nested-perfload GET /a_c query, when sent through json_pp is 154006 lines | 17:09 |
cdent | i now feel slightly less bad about it taking more than a second | 17:10 |
openstackgerrit | Eric Fried proposed openstack/placement master: Copy AllocationRequestResource only when necessary https://review.opendev.org/675416 | 17:12 |
efried | cdent: Wanna do your profiley thing on ^ ? | 17:13 |
efried | It's a tad hacky, but not as bad as it could be really. | 17:13 |
cdent | because _consolidate is called multiple times and your closing over rw_ctx, that method would be better defined out side of _consolidate... and have rw_ctx passed to it | 17:16 |
cdent | i think | 17:16 |
efried | I had that and moved it because it's so specific to that method | 17:17 |
efried | but don't I *want* to close over rw_ctx, which doesn't change throughout a request? | 17:17 |
efried | for that matter, make it a method on rw_ctx? | 17:17 |
efried | I mean, this works correctly test-wise, are you saying it might perform better if refactored a certain way? | 17:17 |
cdent | a method on rw_ctx would be okay too, but the issue is that python has to recompute the definition of _copy_arr_if_needed each time _consolidate_allocation_requests is called | 17:18 |
efried | okay. | 17:18 |
cdent | (otherwise it seems like a great idea and I'm happy to go profiley on it, but I'm in the midst of writing up a bunch of profiley stuff, so my profiling tooling is in use) | 17:19 |
openstackgerrit | Eric Fried proposed openstack/placement master: Copy AllocationRequestResource only when necessary https://review.opendev.org/675416 | 17:24 |
efried | cdent: moved ^ | 17:24 |
efried | actually makes a fair bit of sense in there | 17:24 |
* cdent loosk again | 17:31 | |
cdent | request wide context was a good add | 17:32 |
cdent | i suspect eventually all sort of crap will be hanging from it | 17:32 |
*** e0ne has quit IRC | 17:35 | |
cdent | efried: it definitely seems to help, at least for the case that nested-perfload is presenting | 17:39 |
efried | cool. Be interesting to see how much it helps, esp. if it still helps on top of your __copy__ fix | 17:40 |
cdent | yeah. It's hard to make solid comparisons with the changes all coming in in parallel | 17:41 |
cdent | with your change, _consolidate is no longer the big consumer under _merge, _satisfies_same_subtree is | 17:42 |
efried | cdent: I suspect that "same RC in multiple groups" will be relatively rare, until we have NUMA modeling. And most NUMA modeling will want to isolate (I think??) so it'll still help there. And even for cases where we do need the copy, it'll *only* be for the ARRs whose RCs were requested in more than one group, which will (probably, almost) never be all of them. | 17:42 |
cdent | what is make _satisfies_same_subtree hot is _get_ancestors_by_one_uuid | 17:42 |
efried | so basically, I think this optimization makes us very rarely copy | 17:42 |
* cdent nods | 17:42 | |
efried | that's not surprising. Recursive sucks. | 17:43 |
cdent | functional tests screaming along nicely: 1027 tests in 8.6472 sec | 17:48 |
cdent | https://anticdent.org/placement-performance-analysis.html | 17:50 |
cdent | by morning I'm hoping the gate will have caught up with me and I'll redo the analysis | 17:53 |
cdent | but now | 17:53 |
* cdent waves | 17:53 | |
*** cdent has quit IRC | 17:53 | |
openstackgerrit | Merged openstack/placement master: Move provider_ids_from_rp_ids to allocation_candidate and fix https://review.opendev.org/675084 | 18:59 |
openstackgerrit | Merged openstack/placement master: Use another expanding bindparam in _get_usages_by_provider_trees https://review.opendev.org/675287 | 18:59 |
*** e0ne has joined #openstack-placement | 19:16 | |
*** e0ne has quit IRC | 20:37 | |
openstackgerrit | Chris Dent proposed openstack/placement master: Add __copy__ method to AllocationRequest{,Resource} https://review.opendev.org/675372 | 21:34 |
*** takashin has joined #openstack-placement | 21:48 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!