Monday, 2019-05-06

01:20 *** dklyle has joined #openstack-placement
02:02 *** dklyle has quit IRC
02:41 *** altlogbot_0 has quit IRC
02:43 *** altlogbot_2 has joined #openstack-placement
03:31 *** altlogbot_2 has quit IRC
03:32 *** altlogbot_2 has joined #openstack-placement
07:13 *** altlogbot_2 has quit IRC
07:18 *** altlogbot_0 has joined #openstack-placement
08:37 *** tssurya has joined #openstack-placement
09:11 *** altlogbot_0 has quit IRC
09:16 *** altlogbot_0 has joined #openstack-placement
09:24 *** gibi_cape is now known as gibi
09:42 *** ttsiouts has joined #openstack-placement
09:56 *** ttsiouts_ has joined #openstack-placement
09:59 *** ttsiouts has quit IRC
10:14 *** ttsiouts_ has quit IRC
11:19 *** altlogbot_0 has quit IRC
11:21 *** altlogbot_2 has joined #openstack-placement
11:26 *** ttsiouts has joined #openstack-placement
11:43 *** ttsiouts has quit IRC
11:48 *** ttsiouts has joined #openstack-placement
11:53 *** ttsiouts_ has joined #openstack-placement
11:56 *** ttsiouts has quit IRC
12:10 *** ttsiouts_ has quit IRC
12:14 *** altlogbot_2 has quit IRC
12:21 *** altlogbot_1 has joined #openstack-placement
12:28 *** ttsiouts has joined #openstack-placement
12:36 *** ttsiouts_ has joined #openstack-placement
12:40 *** ttsiouts has quit IRC
12:46 *** ttsiouts_ has quit IRC
12:46 *** ttsiouts has joined #openstack-placement
12:53 *** ttsiouts has quit IRC
13:15 <edleafe> With travel recovery and meeting burnout, what say we skip this morning's meeting? We can continue to discuss outstanding issues in email.
13:15 *** altlogbot_1 has quit IRC
13:17 *** altlogbot_1 has joined #openstack-placement
13:19 *** mriedem has joined #openstack-placement
13:21 *** altlogbot_1 has quit IRC
13:22 *** ttsiouts has joined #openstack-placement
13:25 *** altlogbot_2 has joined #openstack-placement
13:25 *** altlogbot_2 has quit IRC
13:27 *** altlogbot_0 has joined #openstack-placement
13:41 *** efried has joined #openstack-placement
13:42 <efried> o/
14:00 *** amodi has joined #openstack-placement
14:00 *** ttsiouts has quit IRC
14:04 <gibi> edleafe: I agree
14:22 *** egonzalez has quit IRC
14:24 *** egonzalez has joined #openstack-placement
14:48 *** dklyle has joined #openstack-placement
14:49 *** dklyle has quit IRC
14:49 *** david-lyle has joined #openstack-placement
14:54 *** belmoreira has joined #openstack-placement
15:09 *** tssurya has quit IRC
15:13 *** david-lyle is now known as dklyle
15:52 *** belmoreira has quit IRC
16:24 *** cdent has joined #openstack-placement
16:34 <efried> cdent: I never rendered an opinion on doing https://review.opendev.org/#/c/657074/ for other pyXXs. I'm in favor.
16:35 <cdent> efried: noted. I'll take that action and get on it in the gaps.
16:36 <efried> yeah, I don't think it's urgent
16:36 * cdent nods
16:36 * cdent is knackered
17:58 <openstackgerrit> Chris Dent proposed openstack/placement master: WIP: Allow [A-Za-z0-9_-]{1,32} for request group suffix  https://review.opendev.org/657419
17:58 <cdent> efried, mriedem, edleafe: that ^ is a stab at seeing what questions are raised by changing the request group suffix. Please have a look and add answers or more questions.
18:10 <edleafe> cdent: ack
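For context, the suffix rule in cdent's WIP patch above can be sketched roughly like this. This is a hypothetical illustration of the proposed [A-Za-z0-9_-]{1,32} syntax, not code from the patch itself; parse_group_key and its prefix list are invented for the example.

```python
import re

# Proposed rule from the patch: a request group suffix is 1-32 characters
# drawn from [A-Za-z0-9_-], rather than the old digits-only "numbered" form.
SUFFIX_PATTERN = re.compile(r'[A-Za-z0-9_-]{1,32}')


def parse_group_key(key, prefixes=('resources', 'required', 'member_of')):
    """Split a query key like 'resources_COMPUTE' into (prefix, suffix).

    An empty suffix denotes the unsuffixed ("unnumbered") request group;
    returns None when the key is not a valid group key at all.
    """
    for prefix in prefixes:
        if key == prefix:
            return (prefix, '')
        if key.startswith(prefix) and SUFFIX_PATTERN.fullmatch(key[len(prefix):]):
            return (prefix, key[len(prefix):])
    return None
```

Under this rule, old-style keys like resources1 keep working, while keys such as required_COMPUTE become legal too.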
18:10 <openstackgerrit> Chris Dent proposed openstack/placement master: DNM: See what happens with 10000 resource providers  https://review.opendev.org/657423
18:15 <openstackgerrit> Chris Dent proposed openstack/placement master: DNM: See what happens with 10000 resource providers  https://review.opendev.org/657423
18:16 <openstackgerrit> Chris Dent proposed openstack/placement master: DNM: See what happens with 10000 resource providers  https://review.opendev.org/657423
18:25 <efried> cdent: I pity the fool who has to go change numbered/unnumbered everywhere.
18:25 <cdent> quite
18:39 <openstackgerrit> Chris Dent proposed openstack/placement master: DNM: See what happens with 10000 resource providers  https://review.opendev.org/657423
18:40 <cdent> my efforts on that ^ make me think we shouldn't be too concerned about performance in every possible situation, because the tweaks required to fully stretch both uwsgi and mysql (and any other setup) are ... extensive
18:41 <edleafe> Now *returning* 10K rps might be a different story...
18:45 <efried> edleafe, cdent: I thought that ^ was the actual problem CERN was having.
18:46 <cdent> efried, edleafe: I'm not able to parse either of your statements
18:46 <efried> cdent: Does perfload actually run a single query, either GET /a_c or GET /rps, that sends all the providers across the wire in a json payload?
18:46 <edleafe> cdent: I meant that while placement can query/work with large data sets, those receiving large data sets (CERN) might have other problems
18:47 <cdent> efried: yes, it does a GET /a_c. The previous version returned 1000
18:47 <efried> oh, okay. So maybe edleafe is just saying that the client-side processing of that many records is the actual problem CERN doesn't want to deal with.
18:47 <cdent> edleafe: yes, that's indeed the case, but it's a good data point from which to explore
18:47 <efried> but I don't have a good handle on that.
18:48 <cdent> yes, the client-side processing is the problem that CERN is having
18:48 <cdent> so we need to make it easier for them to ask for smaller result sets
18:48 <cdent> so in this case my patch is mostly an exploration, not a specific goal-oriented thing
18:49 <edleafe> CERN's problem is that the nova scheduler has to repeatedly pass 10K results through filters and weighers
18:50 <efried> Do either of you remember how we landed on CERN's anti-affinity issue?
18:50 <efried> I've got these notes:
18:50 <efried> """
18:50 <efried> Affinity/anti-affinity for CERN
18:50 <efried> Affinity is easy: in_tree
18:50 <efried> Anti-affinity, in_tree=!in:<small list>, but big lists are hard.
18:50 <efried> tssurya has patch to remove the limit
18:50 <efried> placement filter needs a spec - tssurya to own
18:50 <efried> """
18:50 <efried> but I don't remember what the spec tssurya owns is supposed to do
18:50 <efried> in_tree=!in:<small_list> ?
18:51 <efried> cdent: ^
18:52 <cdent> efried: that memory is currently swapped out but may come back later if some email is made
18:53 <efried> mriedem, dansmith: do you remember? ---^
18:53 <dansmith> efried: her current bug fix is to reset the limit on anti-affinity
18:54 <dansmith> well, either, I guess
18:54 <dansmith> efried: and I think we mostly said we'd take her patch, fix affinity because it's easy,
18:54 <dansmith> and maybe punt on anti-, but when we do, try to be reasonable with !in_tree and some not-crazy length limit
18:56 <efried> dansmith: reasonable how? How big can a server group be?
18:56 <dansmith> 100 by default
18:56 <dansmith> but larger by config
18:57 <dansmith> a hundred uuids on top of the rest of the query is too big, IMHO
18:57 <efried> Well, it's not one uuid per server; it's one uuid per host-on-which-said-servers-have-previously-landed
18:57 <dansmith> with anti-affinity, they are the same
18:57 <dansmith> so, 20 uuids for a group of 20 instances
18:58 <efried> Unless there's fewer than 20 hosts in your... what, cell? host agg?
18:58 <dansmith> eh?
18:58 <efried> i.e. what's the scope (number of hosts involved) in such a request?
18:58 <efried> Basically "all of them"?
18:58 <dansmith> yes, all of them
18:59 <efried> okay. And in a small deployment, with like 10 hosts, your 11th anti-affinity-requesting spawn would get back zero allocation_candidates based on ^ design. So you would have to account for that and do a second query without the filter.
18:59 <dansmith> unless we do something crazy, we'd build this part of the request purely from the list of hosts of other instances in your group, unrelated to any other scope-limiting factors, which we won't really be able to reduce on the nova side
18:59 <dansmith> no,
19:00 <dansmith> if you have 10 hosts and you boot an 11th in an anti-affinity group, there is no retry, you just fail
19:00 <efried> o
19:00 <efried> neat
19:00 <dansmith> I mean, that's the point
19:00 <efried> Okay, I thought it was for "spread" purposes
19:01 <efried> as opposed to "never ever run two of these guys together"
19:01 <dansmith> anti-affinity is for fault isolation only
19:01 <efried> got it
19:01 <efried> So we just enable in_tree=!<list> and either limit the size of <list> or let the implicit querystring length limit take care of it, and it's the deployment's problem to make sure they don't use enormous anti-affinity server groups?
19:02 <dansmith> I would tend to say limit the list length, maybe with a tunable
19:03 <dansmith> "the deployment" doesn't get to choose the group size, other than just the limit on member count they set
19:03 <dansmith> it's a user-visible thing
19:03 <efried> okay. These details to be worked out in the spec I suppose.
19:03 <efried> thanks dansmith
19:03 <dansmith> I dunno what the query length limit is in practice, but I guess I would expect that if it's too large, the sql query just gets nuts and starts to affect performance
19:04 <efried> oh, I guess I would have expected wsgi to bounce before we got big enough to worry the db. But either way.
19:04 <efried> expect to see your words regurgitated in next summary email :)
19:05 <dansmith> well, some db-knowing person should opine on that detail
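On the nova side, the anti-affinity querying sketched in the exchange above might look something like this. It assumes the proposed (and at this point only proposed) in_tree=!in:<uuid,...> negative-membership syntax, and MAX_EXCLUDED_HOSTS is a hypothetical stand-in for the tunable length limit dansmith mentions.

```python
from urllib.parse import urlencode

# Hypothetical cap on how many host (root provider) uuids we are willing
# to put in one query; stands in for the tunable length limit discussed.
MAX_EXCLUDED_HOSTS = 100


def anti_affinity_url(resources, used_host_uuids):
    """Build a GET /allocation_candidates query excluding every host the
    server group's members have already landed on, via the proposed
    !in: syntax. For anti-affinity that is one uuid per existing member."""
    hosts = sorted(set(used_host_uuids))
    if len(hosts) > MAX_EXCLUDED_HOSTS:
        raise ValueError('too many hosts to exclude in a single query')
    params = {'resources': resources}
    if hosts:
        params['in_tree'] = '!in:' + ','.join(hosts)
    return '/allocation_candidates?' + urlencode(params)
```

Note the failure mode matches the conversation: a group that spans more hosts than the cap simply cannot be expressed, and (per dansmith) an 11th instance on 10 hosts just fails rather than retrying.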
19:11 *** amodi has quit IRC
19:16 <edleafe> Ihave1newknee-0
19:16 <edleafe> Oh, well, now everyone can get into that VM :)
19:17 <openstackgerrit> Chris Dent proposed openstack/placement master: WIP: Allow [A-Za-z0-9_-]{1,32} for request group suffix  https://review.opendev.org/657419
19:18 <openstackgerrit> Chris Dent proposed openstack/placement master: WIP: Allow [A-Z0-9_-]{1,32} for request group suffix  https://review.opendev.org/657419
19:18 <cdent> efried, edleafe: that ^ makes some tweaks but now lunch
19:18 *** cdent has quit IRC
19:42 *** amodi has joined #openstack-placement
20:19 *** cdent has joined #openstack-placement
20:42 <openstackgerrit> Chris Dent proposed openstack/placement master: WIP: Allow [A-Z0-9_-]{1,32} for request group suffix  https://review.opendev.org/657419
20:45 <openstackgerrit> Chris Dent proposed openstack/placement master: DNM: See what happens with 10000 resource providers  https://review.opendev.org/657423
20:48 <openstackgerrit> Chris Dent proposed openstack/placement master: DNM: See what happens with 10000 resource providers  https://review.opendev.org/657423
21:06 <openstackgerrit> Chris Dent proposed openstack/placement master: Skip notification sample tests when running nova functional  https://review.opendev.org/657455
21:11 <openstackgerrit> Chris Dent proposed openstack/placement master: DNM: See what happens with 5000 resource providers  https://review.opendev.org/657423
21:13 <cdent> mriedem, efried: If we can get https://review.opendev.org/651939 merged we can save a few cycles in the gate and get a story off the radar.
21:15 <cdent> mriedem: and this one too https://review.opendev.org/#/c/656717/
21:18 <efried> cdent: +A
21:18 <efried> on the first one
21:18 <cdent> rad
21:27 <openstackgerrit> Merged openstack/osc-placement master: Use PlacementFixture in functional tests  https://review.opendev.org/651939
21:36 <mriedem> sorry, been off in watcher land all day
21:51 <mriedem> cdent: questions in https://review.opendev.org/#/c/656717/
21:51 <cdent> aye
21:54 <mriedem> the answers look like they might all be in the commit message
21:54 <mriedem> and i'm just slow
22:01 <mriedem> gotta run, will check back later - else ping me in the morrow
22:03 <openstackgerrit> Chris Dent proposed openstack/placement master: DNM: See what happens with 10000 resource providers  https://review.opendev.org/657423
22:10 <openstackgerrit> Chris Dent proposed openstack/placement master: Package db migration scripts in placement pypi dist  https://review.opendev.org/656717
22:21 <cdent> 10000 rps leads to ~46s GET /a_c: http://logs.openstack.org/23/657423/8/check/placement-perfload/07de88c/logs/placement-perf.txt
22:29 <openstackgerrit> Eric Fried proposed openstack/placement master: Add NUMANetworkFixture for gabbits  https://review.opendev.org/657463
22:31 <efried> cdent: I'm deliberately not putting any tests on this ^ because the series won't be linear
22:31 <efried> it'll be, like, nested and shit.
22:31 <efried> same reason I'm not tagging a task on it.
22:31 <cdent> aye
22:32 <openstackgerrit> Eric Fried proposed openstack/os-resource-classes master: Propose standard ACCELERATOR_FPGA resource class  https://review.opendev.org/657464
22:32 <cdent> good idea
22:32 <cdent> (the fixture)
22:38 <cdent> efried: which, if any, spec was ACCELERATOR_FPGA defined in (it's not on that topic)?
22:39 <cdent> doesn't really matter, but I wanted to confirm that there was some consensus building on that name
22:39 <efried> cdent: Apparently it's not explicitly in any spec yet as a standard resource class. All the cyborg-ish specs currently use customs. Worth commenting on the spec linked from that commit...
22:40 <cdent> "Worth commenting on the spec linked from that commit..." which one is that?
22:41 <efried> nova-cyborg-interaction
22:41 <efried> hold
22:42 <cdent> my starting point was that the blueprint doesn't link to a spec
22:42 <efried> ah crap
22:42 <efried> https://review.opendev.org/#/c/603955/
22:42 <efried> yeah, the whiteboard should be updated. I'm updating the topic now...
22:42 <cdent> ah, bad topic on that spec
22:43 <efried> not sure if the bp gets updated automatically when that happens. I want to say it didn't used to, but that started happening lately.
22:43 <efried> we'll see shortly..
22:43 <efried> I commented accordingly.
22:43 <cdent> word
22:43 <efried> I'm going to go now.
22:44 <efried> I very well may take tomorrow off, since I didn't take today off...
22:45 <cdent> aye aye
23:05 <cdent> efried: if you're not gone yet, want to lay a seed in your brain about request group orphans
23:08 <cdent> It looks like we only force required and member_of to have associated resources, not in_tree
23:08 <cdent> do we want resource-less member_of?
23:55 *** david-lyle has joined #openstack-placement
23:55 *** dklyle has quit IRC

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!