*** takashin has left #openstack-placement | 02:32 | |
*** licanwei has joined #openstack-placement | 03:21 | |
*** e0ne has joined #openstack-placement | 06:06 | |
*** e0ne has quit IRC | 06:25 | |
*** tetsuro has joined #openstack-placement | 06:25 | |
*** tetsuro has quit IRC | 07:09 | |
*** yikun has joined #openstack-placement | 07:10 | |
*** helenafm has joined #openstack-placement | 07:37 | |
*** tetsuro has joined #openstack-placement | 07:51 | |
*** tetsuro has quit IRC | 08:38 | |
*** tetsuro has joined #openstack-placement | 08:57 | |
openstackgerrit | Balazs Gibizer proposed openstack/placement master: Resource provider - request group mapping in allocation candidate https://review.opendev.org/657582 | 08:59 |
*** e0ne has joined #openstack-placement | 09:00 | |
gibi | efried: if you can answer my question in https://review.opendev.org/#/c/657419/7/api-ref/source/parameters.yaml@169 then I can +A that patch | 09:18 |
gibi | efried: cdent seems to be out | 09:18 |
openstackgerrit | Balazs Gibizer proposed openstack/os-traits master: Add REPORT_PARENT_INTERFACE_NAME_FOR_SRIOV_NIC trait https://review.opendev.org/658852 | 09:25 |
openstackgerrit | Chris Dent proposed openstack/placement master: Optionally run a wsgi profiler when asked https://review.opendev.org/643269 | 09:46 |
openstackgerrit | Chris Dent proposed openstack/placement master: DNM: See what happens with 10000 resource providers https://review.opendev.org/657423 | 10:05 |
*** cdent has joined #openstack-placement | 10:07 | |
*** tetsuro has quit IRC | 10:26 | |
*** jaypipes has joined #openstack-placement | 10:42 | |
*** jaypipes has quit IRC | 10:46 | |
*** jaypipes has joined #openstack-placement | 10:58 | |
*** ttsiouts has joined #openstack-placement | 11:33 | |
*** ttsiouts has quit IRC | 11:38 | |
cdent | edleafe: can I delegate a thing to you if you're able? | 12:29 |
cdent | which is: orchestrate when office hours should be | 12:31 |
openstackgerrit | Chris Dent proposed openstack/placement master: perfload with written allocations https://review.opendev.org/660754 | 13:05 |
*** mriedem has joined #openstack-placement | 13:15 | |
*** purplerbot has quit IRC | 13:20 | |
*** purplerbot has joined #openstack-placement | 13:20 | |
efried | gibi: Responded in https://review.opendev.org/#/c/657419/ | 13:28 |
gibi | efried: thanks | 13:52 |
gibi | approved the patch | 13:52 |
*** amodi has joined #openstack-placement | 14:01 | |
efried | Thanks gibi | 14:10 |
efried | cdent: We might want to have a hangout with placement+nova teams (or maybe put it on the agenda for next nova meeting) about the use cases for can_split. | 14:10 |
efried | Because it sounds like "land me anywhere" isn't good enough, and "preserve existing behavior" is not possible. | 14:10 |
efried | (other than by doing nothing, including NUMA modeling in any form) | 14:11 |
cdent | efried: I think we should perhaps first (or simultaneously) do a query to operators, and something during the nova meeting | 14:14 |
cdent | I think the shortest path to preserve existing behavior is "only model numa when the host is configured to do so" | 14:15 |
cdent | that does mean operators have to do something but perhaps that's not such a big deal | 14:15 |
efried | cdent: Right, but that gets us to the limitation that you can't co-locate guests with and without a NUMA topology. | 14:16 |
cdent | yes, which is why I'm saying we need to talk to operators. maybe that's nbd | 14:16 |
cdent | or | 14:16 |
cdent | maybe it is not enough of a deal to incur the cost | 14:17 |
efried | okay, can we "talk to operators" via the ML? | 14:17 |
efried | or what? | 14:17 |
cdent | do we have any other option? | 14:17 |
efried | swhat I'm asking. | 14:17 |
cdent | it's a good place to start | 14:17 |
efried | ight | 14:17 |
cdent | i guess [ops][nova][placement] tags or something like | 14:18 |
efried | want me to craft something? | 14:18 |
cdent | well. I think if I do it I will bias it too much one way. If you do it it will bias it too much another way, so it depends on which foot we want to start on (and the other will follow later with the other foot) | 14:19 |
efried | I think I can do it without bias. Starting... | 14:19 |
efried | ...but going to run some errands, so expect something in an hour or two. | 14:22 |
cdent | aye aye | 14:23 |
openstackgerrit | Chris Dent proposed openstack/placement master: Add olso.middleware.cors to conf generator https://review.opendev.org/661769 | 14:30 |
*** dklyle has joined #openstack-placement | 14:59 | |
*** e0ne has quit IRC | 15:05 | |
efried | cdent: I'm back. | 15:44 |
cdent | welcome back | 15:44 |
efried | Unsurprisingly, having a hard time phrasing questions | 15:44 |
efried | because they sound silly to me | 15:44 |
cdent | do you want to share a draft somewhere | 15:44 |
efried | yeah, lemme etherpad it... | 15:44 |
cdent | do you need a camp counsellor to tell you there are no stupid questions? | 15:45 |
*** helenafm has quit IRC | 15:51 | |
cdent | efried: Imma be outtie pretty soon, I see you got drawn away. If you send me the etherpad link I can look before you wake up. But in the meantime the important questions to me are: | 16:20 |
cdent | How important is it to be able to mix non-NUMA and NUMA workloads on the same host? | 16:21 |
cdent | Should any host that supports NUMA represent its VCPU and MEMORY as part of NUMA nodes or should some machines be "simple"? | 16:22 |
cdent | If given a choice between a faster and simpler placement service while needing to manage inventory (as in Should... above ) more closely or a potentially slower placement service and always report NUMA inventory, which would you choose? | 16:23 |
cdent | Do you distinguish between low and high performance hosts? How? | 16:24 |
sean-k-mooney | cdent: i would prefer if all hosts were modeled as numa hosts | 16:24 |
sean-k-mooney | and similarly if all guests were modeled as numa guests so we can simplify the code | 16:24 |
cdent | But mostly: I think we've laid out the questions pretty well on the spec and we need to summarize them and point interested parties to the spec | 16:25 |
sean-k-mooney | but if we need to support both then i can live with that | 16:25 |
cdent | sean-k-mooney: yes, I know, these are prompts to try to help eric create an email to find out what operators prefer | 16:25 |
sean-k-mooney | i am almost finished https://review.opendev.org/#/c/658510/4/doc/source/specs/train/approved/2005575-nested-magic.rst by the way | 16:25 |
cdent | our preferences aren't really the important part | 16:25 |
cdent | excellent, glad to hear it | 16:25 |
sean-k-mooney | well my preferences are based on simplifying the nova and placement code by only having one code path for all instances | 16:26 |
cdent | I think my comment on there about the locus of control being in the wrong place is the crux of things for me. | 16:26 |
cdent | placement doesn't know what an instance is | 16:26 |
cdent | yet the only reason can_split is coming up is for allowing a certain type of instance | 16:26 |
cdent | so to me, it seems...model breaking | 16:26 |
sean-k-mooney | hehe well if we make all instances numa instances then it goes away | 16:27 |
cdent | would that mean changing a bunch of flavors? | 16:27 |
sean-k-mooney | no | 16:28 |
cdent | I guess I'm not certain what you mean by "numa instance" | 16:28 |
sean-k-mooney | any instance that does not have a numa configuration specified via hw:numa_nodes or implicitly created by hugepages or cpu pinning is a non numa instance | 16:28 |
sean-k-mooney | we implicitly generate a request for one numa node if you set hw:cpu_policy=dedicated or hw:mem_page_size=large | 16:29 |
sean-k-mooney | the change would be if you dont set hw:numa_nodes=x we assume x is 1 | 16:29 |
sean-k-mooney | that breaks some usecases but it also simplifies the code in some respects | 16:30 |
cdent | that breaks what has been described as The Problem: making use of space cpus spread across numa nodes for instance that "don't care" | 16:30 |
cdent | s/space/spare/ | 16:30 |
sean-k-mooney | yep | 16:30 |
cdent | so presumably that's a deal breaker _if_ The Problem really is the problem. which is why I think we need to talk to operators | 16:31 |
sean-k-mooney | it does but supporting that usecase causes a lot of extra complexity | 16:31 |
cdent | because if it is not a problem, then we can either solve it your way | 16:31 |
cdent | or solve it my way: don't report numa inventory for some hosts | 16:31 |
sean-k-mooney | or solve it the way im suggesting in the spec which does neither | 16:32 |
sean-k-mooney | e.g. it supports having the spread, lives with the complexity, and adds a little more to respect the allocations from placement | 16:32 |
cdent | I'll look forward to reading that later, but to reiterate: I think we need to stop talking about what's possible, and talk more about what people want | 16:33 |
cdent | because it feels like we got down this road in the first place by saying "we have to be able to do X" without really confirming that requirement versus its costs | 16:33 |
cdent | brb | 16:33 |
* cdent is back | 16:35 | |
cdent | efried: maybe some of that chat ^ will be good fodder? | 16:37 |
sean-k-mooney | anyway to your point yes it would be good to know if operators care about the "i dont care about numa" usecase. i think this will be enterprise folk mainly. and as always you will have the telcos on the other end making life complicated. | 16:38 |
cdent | mriedem: this stack may not really be your bailiwick, but if you're inclined tetsuro has done some nice cleanups which appear to result in some good performance gains: https://review.opendev.org/658778 (and also make some of the nested magic easier to do) | 16:39 |
cdent | sean-k-mooney: I think the telco side of things is relatively sane/clear [1]: they want to report NUMA and they want to place with high levels of control. The real issue here is effective use of resources for people who are just "give me a vm". | 16:40 |
cdent | [1] I can't believe I just said that | 16:40 |
mriedem | i'm gonna be spending quite a bit of time here writing a recreate test for zigo's issue from earlier | 16:40 |
mriedem | since it goes back to ocata | 16:40 |
sean-k-mooney | cdent: hehe that telcos are sane. are you feeling ok | 16:40 |
cdent | mriedem: ossum! There's no rush on that tetsuro thing, was just thinking it might tickle your fancy | 16:41 |
sean-k-mooney | :) | 16:41 |
sean-k-mooney | but yes i know what you mean | 16:41 |
cdent | I know, right? But what I mean is: they essentially don't care about dynamic placement. They frequently want a very specific thing. The enterprise or cloudy case is what we think of as "the simple case" but mixing it into a numa setting is complicating things. Which is why I'm wondering if perhaps "don't report that numa stuff" is a reasonable out. | 16:42 |
sean-k-mooney | cdent: well the current numa in placement spec was proposing a per compute node config option that lists the resource classes to report per numa node | 16:43 |
sean-k-mooney | which defaulted to none for backwards compatibility | 16:43 |
sean-k-mooney | so if we continue to tell people dont mix numa instances with non numa instances | 16:44 |
cdent | right, so maybe that's enough | 16:44 |
sean-k-mooney | that is what we recommend today by the way the sure | 16:44 |
cdent | perhaps that's another for efried's list of questions then: are you okay with the status quo | 16:45 |
sean-k-mooney | ya it might be although we likely would have to do some prefilter stuff to transform our existing flavor into something placement would like | 16:45 |
cdent | I assume cfriesen might have thoughts | 16:45 |
cdent | but he's not in here | 16:45 |
sean-k-mooney | i know he has customers that use openstack to spin up a single giant vm that uses all the resources on the platform and he does not want to have to tell them to create multiple numa nodes in the guest | 16:46 |
sean-k-mooney | so he would prefer to not break the "i dont care" usecase for them | 16:46 |
sean-k-mooney | i hope they are a minority | 16:47 |
sean-k-mooney | the people that i believe really want to mix numa and non numa instances on the same host are the edge folks | 16:47 |
sean-k-mooney | mainly because they dont have enough hosts at an edge site to partition them statically | 16:48 |
cdent | it's the case that you can already mix, right? the issue is mixing efficiently | 16:48 |
sean-k-mooney | well yes but we are removing that limitation at least partially in train | 16:49 |
sean-k-mooney | and windriver have downstream-only code that makes it work today | 16:49 |
sean-k-mooney | they have a host agent that confines the floating instances so they dont float over the pinned ones | 16:49 |
sean-k-mooney | and it dynamically changes their confinement as pinned instances are added and removed | 16:49 |
sean-k-mooney | i think that is opensourced as part of starlingx but im not sure | 16:50 |
cdent | if that were present on a host _and_ numa vcpu was being reported _and_ can_split was being used, placement would quickly become wrong | 16:52 |
sean-k-mooney | yes if it did not respect the numa node the allocation was from | 16:53 |
sean-k-mooney | but this predates placement so it was not a usecase they considered | 16:53 |
cdent | sure, I get that, just thinking out loud | 16:54 |
cdent | but it supports my concerns about locus of control | 16:54 |
sean-k-mooney | specifically? | 16:55 |
cdent | can_split is worrisome when the operating system or other tools on the compute-node might move _which_ vcpu are being used by a workload | 16:56 |
cdent | anyway, i'll watch the spec, I have to depart for the evening | 16:56 |
sean-k-mooney | ah | 16:56 |
cdent | i linked to the log near here on the spec, so hopefully other people will hop on | 16:56 |
cdent | g'night | 16:56 |
* cdent waves | 16:56 | |
*** cdent has left #openstack-placement | 16:57 | |
sean-k-mooney | well i think if a virt driver reports resources in a numa aware way its reasonable to require them to consume them based on the allocation they were provided | 16:57 |
*** irclogbot_0 has quit IRC | 17:17 | |
*** irclogbot_0 has joined #openstack-placement | 17:19 | |
sean-k-mooney | efried: i may have -1'd https://review.opendev.org/#/c/658510/4/doc/source/specs/train/approved/2005575-nested-magic.rst but i agree with the general direction you are putting forward and most of the decisions that have been made | 17:29 |
sean-k-mooney | i still need to finish it but i also need to clear my head so ill come back to it tomorrow | 17:29 |
efried | sean-k-mooney: Duly noted. I don't see that spec getting merged before the can_split issue is resolved. | 17:29 |
efried | and so far we seem to have like four different (*very* different) views on the requirements around that. | 17:30 |
*** ttsiouts has joined #openstack-placement | 17:49 | |
*** ttsiouts has quit IRC | 17:59 | |
*** efried has quit IRC | 18:17 | |
*** efried has joined #openstack-placement | 18:18 | |
*** licanwei has quit IRC | 19:56 | |
*** mriedem has quit IRC | 20:07 | |
*** mriedem has joined #openstack-placement | 20:51 | |
*** e0ne has joined #openstack-placement | 20:58 | |
*** e0ne has quit IRC | 21:12 | |
*** mriedem has quit IRC | 22:29 | |
*** amodi has quit IRC | 22:49 |
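
For readers following the 16:28 to 16:30 exchange, the rule sean-k-mooney describes can be sketched roughly as below. This is only an illustration of what was said in the channel, not Nova code: the function name `requested_numa_nodes` and the `assume_one_by_default` flag are hypothetical, and only the extra spec keys (`hw:numa_nodes`, `hw:cpu_policy`, `hw:mem_page_size`) come from the log.

```python
# Illustrative sketch only. The helper name and the flag are hypothetical;
# the extra spec keys are the ones quoted in the log above.

def requested_numa_nodes(extra_specs, assume_one_by_default=False):
    """Return the guest NUMA node count implied by flavor extra specs.

    Returns None for what the log calls a "non numa instance".
    """
    explicit = extra_specs.get("hw:numa_nodes")
    if explicit is not None:
        # Explicit request: hw:numa_nodes=x
        return int(explicit)
    # Implicit request for one NUMA node, as described at 16:29.
    if extra_specs.get("hw:cpu_policy") == "dedicated":
        return 1
    if extra_specs.get("hw:mem_page_size") == "large":
        return 1
    # Proposed change: if hw:numa_nodes is unset, assume it is 1.
    return 1 if assume_one_by_default else None


plain_flavor = {}
pinned_flavor = {"hw:cpu_policy": "dedicated"}

print(requested_numa_nodes(plain_flavor))                              # behaviour described as current
print(requested_numa_nodes(plain_flavor, assume_one_by_default=True))  # behaviour under the proposed change
print(requested_numa_nodes(pinned_flavor))
```

With the flag off, a flavor with no NUMA-related extra specs stays a "don't care" instance that may float across host NUMA nodes. With it on, every instance is confined to one node, which is the simplification sean-k-mooney prefers and the case cdent calls The Problem.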