Tuesday, 2020-02-11

*** spatel has joined #openstack-nova00:01
*** slaweq has quit IRC00:04
*** spatel has quit IRC00:06
*** Liang__ has joined #openstack-nova00:20
*** spatel has joined #openstack-nova00:34
*** mlavalle has quit IRC00:47
*** yedongcan has joined #openstack-nova01:05
*** gentoorax has quit IRC01:06
*** zhanglong has joined #openstack-nova01:15
*** openstackgerrit has joined #openstack-nova01:15
openstackgerritMerged openstack/nova stable/rocky: Use stable constraint for Tempest pinned stable branches  https://review.opendev.org/70671601:15
*** gentoorax has joined #openstack-nova01:37
*** david-lyle is now known as dklyle02:05
openstackgerritHuachang Wang proposed openstack/nova-specs master: Use PCPU and VCPU in one instance  https://review.opendev.org/66865602:06
*** psachin has joined #openstack-nova02:06
*** nweinber has joined #openstack-nova02:07
*** nweinber has quit IRC02:19
openstackgerritGhanshyam Mann proposed openstack/nova master: Introduce scope_types in os-create-backup  https://review.opendev.org/70703802:21
openstackgerritGhanshyam Mann proposed openstack/nova master: Add new default roles in os-create-backup policies  https://review.opendev.org/70703902:25
*** nicolasbock has quit IRC02:28
openstackgerritGhanshyam Mann proposed openstack/nova master: Introduce scope_types in os-console-output  https://review.opendev.org/70704002:36
*** ileixe has joined #openstack-nova02:37
openstackgerritGhanshyam Mann proposed openstack/nova master: Add new default roles in os-console-output policies  https://review.opendev.org/70704102:41
*** gyee has quit IRC03:11
*** gentoorax has quit IRC03:14
ileixeHi Nova,03:22
ileixeDoes anyone know the current status of https://blueprints.launchpad.net/nova/+spec/ip-aware-scheduling-placement?03:22
*** spatel has quit IRC03:23
ileixeI thought that if nova is not aware of neutron segments, routed networks don't work, and the spec says it's not implemented yet.03:23
ileixeDoes that mean routed networks are not yet implemented?03:23
*** hongbin has joined #openstack-nova03:48
*** yedongcan has quit IRC03:50
*** gentoorax has joined #openstack-nova03:52
*** Sundar has quit IRC03:53
*** udesale has joined #openstack-nova04:19
*** mkrai has joined #openstack-nova04:37
*** hongbin has quit IRC04:41
*** vesper has quit IRC05:18
*** vesper11 has joined #openstack-nova05:23
*** evrardjp has quit IRC05:34
*** evrardjp has joined #openstack-nova05:34
alex_xuileixe: I think it is implemented05:50
ileixealex_xu: Thanks for the response. Do you mean nova looks up the segment then?05:51
*** igordc has joined #openstack-nova05:51
alex_xuileixe: no, the neutron side will do that, and report the resources to placement05:51
alex_xuileixe: https://blueprints.launchpad.net/neutron/+spec/routed-networks05:52
alex_xuileixe: I never tried that feature, but hope ^ can help you05:52
*** mkrai has quit IRC05:52
alex_xuileixe: also this one https://docs.openstack.org/neutron/pike/admin/config-routed-networks.html05:53
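For reference, the neutron side of routed networks shows up in placement as per-segment resource providers carrying IPV4_ADDRESS inventories, with hosts tied to segments via aggregates. A quick way to inspect this with osc-placement (the UUID is a placeholder):

    openstack resource provider list
    # segment resource providers appear alongside the compute node providers
    openstack resource provider inventory list <segment_rp_uuid>
    # shows the IPV4_ADDRESS inventory for the segment's subnets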
*** mkrai has joined #openstack-nova05:54
ileixealex_xu: Hm.. maybe I understand what neutron does for routed networks05:56
ileixeWhat I do not understand is.. how nova uses the resource providers which neutron creates05:57
alex_xuileixe: maybe I'm wrong, I saw the note from matt; it looks like the neutron side is implemented, but yes, the nova side does nothing now05:57
ileixeiirc, the matt you mentioned is the owner of the commit (https://review.opendev.org/#/c/656885/) right?05:58
alex_xuileixe: yes, he isn't working on that anymore05:59
ileixeAnd the commit does not implement what I expected for routed networks.05:59
ileixeSo.. I assume that routed networks do not work (especially the nova scheduling part)05:59
*** igordc has quit IRC06:00
alex_xuileixe: yes, I think you are right06:00
ileixealex_xu: Hm... thanks for the answer..06:00
alex_xunp06:02
*** mkrai has quit IRC06:09
*** ratailor has joined #openstack-nova06:21
*** gentoora- has joined #openstack-nova06:27
*** gentoorax has quit IRC06:27
*** gentoora- is now known as gentoorax06:27
*** ccamacho has quit IRC06:50
*** lpetrut has joined #openstack-nova07:03
*** mkrai has joined #openstack-nova07:05
*** maciejjozefczyk has joined #openstack-nova07:08
*** damien_r has joined #openstack-nova07:18
*** damien_r has quit IRC07:23
*** yedongcan has joined #openstack-nova07:36
*** udesale has quit IRC07:46
*** udesale has joined #openstack-nova07:47
*** imacdonn has quit IRC07:53
*** imacdonn has joined #openstack-nova07:53
*** mkrai has quit IRC08:02
*** mriosfer has joined #openstack-nova08:22
mriosferHi guys, after changing the vram value in the flavor and image on our openstack queens and rebuilding the instance, I saw in the virsh xml that it is correctly added to the vm config "<model type='qxl' ram='65536' vram='131072' vgamem='16384' heads='1' primary='yes'/>", but Windows with dxdiag detects 0MB vram. Is that correct? Should it be working?08:24
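(For context: guest video RAM is normally requested via the hw_video_ram image property, capped by the hw_video:ram_max_mb flavor extra spec; the values below are illustrative only:)

    # image asks for 128 MB of video RAM; the flavor must allow at least that
    openstack image set my-windows-image --property hw_video_ram=128
    openstack flavor set my-flavor --property hw_video:ram_max_mb=128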
*** tosky has joined #openstack-nova08:27
*** priteau has joined #openstack-nova08:28
*** tesseract has joined #openstack-nova08:29
openstackgerritBrin Zhang proposed openstack/nova master: Expose instance action event details out of the API  https://review.opendev.org/69443008:33
openstackgerritBrin Zhang proposed openstack/nova master: Add server actions v82 samples test  https://review.opendev.org/70625108:35
*** mkrai has joined #openstack-nova08:37
openstackgerritBrin Zhang proposed openstack/nova master: Add instance actions v82 samples test  https://review.opendev.org/70625108:38
*** jcosmao has joined #openstack-nova08:40
*** ccamacho has joined #openstack-nova08:46
openstackgerritBalazs Gibizer proposed openstack/nova master: Merge qos related renos for Ussuri  https://review.opendev.org/70676608:49
gibibauzas: I'm here if you want to chat about NUMA08:50
bauzasgibi: 10 mins please but yeah :)08:50
gibibauzas: sure08:50
*** ralonsoh has joined #openstack-nova08:52
*** rpittau|afk is now known as rpittau08:55
*** martinkennelly has joined #openstack-nova09:01
*** ccamacho has quit IRC09:01
bauzasok, processed the whole bunch of comments for the NUMA in Placement spec...09:01
* bauzas grabs some coffee and pings gibi later09:01
gibiok09:02
gibiI just realized that sean-k-mooney and efried also commented while I was away... reading them...09:02
*** elod has quit IRC09:03
*** amoralej|off is now known as amoralej09:03
*** slaweq has joined #openstack-nova09:07
*** mkrai has quit IRC09:09
*** mkrai_ has joined #openstack-nova09:10
bauzasgibi: I'm back09:10
bauzasgibi: FWIW efried mostly replied on your concerns09:11
bauzasthe spec needs another round of rewrites so I'm starting it now09:11
*** dtantsur|afk is now known as dtantsur09:12
*** elod has joined #openstack-nova09:12
openstackgerritHYSong proposed openstack/nova master: Update request_specs.availability_zone during live migration  https://review.opendev.org/70664709:12
gibibauzas: I'm still reading efried's comments, I will get back to you soon09:12
bauzascool09:13
*** ociuhandu has joined #openstack-nova09:28
*** takamatsu has joined #openstack-nova09:28
openstackgerritHYSong proposed openstack/nova master: Update request_specs.availability_zone during live migration  https://review.opendev.org/70664709:28
*** ociuhandu has quit IRC09:29
*** martinkennelly has quit IRC09:33
openstackgerritHYSong proposed openstack/nova master: Update request_specs.availability_zone during live migration  https://review.opendev.org/70664709:33
openstackgerritHYSong proposed openstack/nova master: Update request_specs.availability_zone during live migration  https://review.opendev.org/70664709:35
*** psachin has quit IRC09:42
*** derekh has joined #openstack-nova09:44
bauzasgibi: unrelated, see my last comment on https://review.opendev.org/#/c/706647/509:44
bauzasI know not a lot of folks know about AZs, so I want to make sure that all of us as cores know about the design consensus :)09:45
gibibauzas: ack, queued for double check09:45
bauzasno rush, it's more for knowledge sharing09:47
*** psachin has joined #openstack-nova09:47
*** ivve has joined #openstack-nova09:53
gibibauzas: replied in the NUMA spec. I agree with Eric's proposals in his reply. I'm still a bit undecided on the upgrade check case but that is a low prio issue09:53
bauzascool, I'm just modifying things as we speak09:53
*** psachin has quit IRC09:54
gibibauzas: you did a great job pulling the pieces together, I think I see the light at the end of this tunnel09:54
*** davidsha has joined #openstack-nova09:54
*** slaweq has quit IRC09:56
openstackgerritjichenjc proposed openstack/nova master: set default value to 0 instead of ''  https://review.opendev.org/70673010:02
*** psachin has joined #openstack-nova10:02
gibibauzas: https://review.opendev.org/#/c/706647 thanks for chiming in, I agree with you. I missed the fact that we don't allow moving instances between AZs (in recent microversions)10:04
bauzasgibi: no worries, again, my ping is just for making sure we share our knowledge10:06
*** ociuhandu has joined #openstack-nova10:06
bauzasI'm always on and off upstream, so the more people know about AZs, the better it will be10:06
*** xek has joined #openstack-nova10:07
*** ociuhandu has quit IRC10:09
*** ociuhandu has joined #openstack-nova10:09
*** mkrai__ has joined #openstack-nova10:10
*** ociuhandu has quit IRC10:10
*** ociuhandu has joined #openstack-nova10:11
*** mkrai_ has quit IRC10:13
*** ociuhandu has quit IRC10:18
*** psachin has quit IRC10:20
*** ociuhandu has joined #openstack-nova10:20
*** psachin has joined #openstack-nova10:21
*** ociuhandu has quit IRC10:21
*** martinkennelly has joined #openstack-nova10:21
*** ociuhandu has joined #openstack-nova10:26
*** ociuhandu has quit IRC10:27
*** priteau has quit IRC10:28
*** ociuhandu has joined #openstack-nova10:29
*** ociuhandu has quit IRC10:30
bauzasshit, I lack the time to fix all the comments10:42
bauzasgibi: I think we need efried and sean-k-mooney around this afternoon for discussing the upgrade pre-flight check and the Ussuri condition10:45
bauzasI'm personally in favor of keeping NUMA workloads in Ussuri as they are10:45
bauzasie. no migration asked10:46
bauzasand a pre-flight check pre-Victoria10:46
bauzasbecause if not, that's a chicken-and-egg issue10:46
gibisure lets see what they think10:50
*** priteau has joined #openstack-nova11:03
*** ociuhandu has joined #openstack-nova11:04
*** ociuhandu has quit IRC11:09
*** zhanglong has quit IRC11:12
*** ociuhandu has joined #openstack-nova11:13
*** zhanglong has joined #openstack-nova11:15
*** ociuhandu has quit IRC11:18
*** mkrai__ has quit IRC11:19
*** rpittau is now known as rpittau|bbl11:21
*** fungi has quit IRC11:24
*** fungi has joined #openstack-nova11:27
*** yedongcan has left #openstack-nova11:37
*** tbachman has quit IRC11:46
*** ociuhandu has joined #openstack-nova11:51
*** ociuhandu has quit IRC11:57
*** ociuhandu has joined #openstack-nova12:01
*** mriedem has joined #openstack-nova12:01
*** nicolasbock has joined #openstack-nova12:02
*** amoralej is now known as amoralej|lunch12:04
*** udesale_ has joined #openstack-nova12:11
*** udesale has quit IRC12:13
openstackgerritStephen Finucane proposed openstack/nova master: WIP: api: Add support for extra spec validation  https://review.opendev.org/70464312:14
*** jaosorior has joined #openstack-nova12:15
openstackgerritBalazs Gibizer proposed openstack/nova master: Support unshelve with qos ports  https://review.opendev.org/70475912:24
openstackgerritBalazs Gibizer proposed openstack/nova master: Enable unshelve with qos ports  https://review.opendev.org/70547512:25
openstackgerritBalazs Gibizer proposed openstack/nova master: Merge qos related renos for Ussuri  https://review.opendev.org/70676612:27
*** dtantsur is now known as dtantsur|brb12:27
*** ociuhandu has quit IRC12:31
*** jaosorior has quit IRC12:32
*** ociuhandu has joined #openstack-nova12:34
openstackgerritStephen Finucane proposed openstack/nova master: WIP: api: Add support for extra spec validation  https://review.opendev.org/70464312:36
fricklerhello nova, the new alembic==1.4.0 is causing this failure, please have a look https://c264ca14759376f2bea5-6cd49316b9babb1f90743cae9cd67f9e.ssl.cf2.rackcdn.com/705380/10/check/cross-nova-py36/8bcec81/testr_results.html , I'll pin to the previous version for now, see https://review.opendev.org/70538012:38
*** mdbooth_ has quit IRC12:45
*** mkrai__ has joined #openstack-nova12:46
*** damien_r has joined #openstack-nova12:55
*** mriedem has quit IRC13:00
*** adriant has quit IRC13:02
*** adriant has joined #openstack-nova13:02
*** mkrai__ has quit IRC13:08
*** decrypt has joined #openstack-nova13:09
*** jmlowe has joined #openstack-nova13:10
*** tbachman has joined #openstack-nova13:10
*** ratailor has quit IRC13:11
openstackgerritStephen Finucane proposed openstack/nova master: WIP: api: Add support for extra spec validation  https://review.opendev.org/70464313:14
*** takamatsu has quit IRC13:14
*** amoralej|lunch is now known as amoralej13:18
*** rpittau|bbl is now known as rpittau13:22
efriedbauzas, gibi: o/13:23
efriedWhat's the question?13:23
gibiefried: my remaning open question is https://review.opendev.org/#/c/552924/16/specs/ussuri/approved/numa-topology-with-rps.rst@223 about upgrade checks13:25
gibibut I have to jump on a call for the next 1 and a half hour so talk to you later13:25
*** derekh has quit IRC13:27
*** nweinber has joined #openstack-nova13:28
*** jmlowe has quit IRC13:30
*** zhanglong has quit IRC13:33
*** zhanglong has joined #openstack-nova13:34
bauzasefried: gibi: will ping you later, just updating the spec now13:36
*** takamatsu has joined #openstack-nova13:37
*** jmlowe has joined #openstack-nova13:37
alex_xubauzas: I left one also https://review.opendev.org/#/c/552924/16/specs/ussuri/approved/numa-topology-with-rps.rst@42113:47
*** dtantsur|brb is now known as dtantsur13:49
bauzasack13:50
bauzasalex_xu: good point, I was thinking of the upgrade issue when rolling the compute upgrades13:52
alex_xugood news, we can just copy the approach from the standard-cpu-resource-tracking spec13:53
alex_xuprobably need another workaround config option to disable the fallback placement query13:53
bauzasI'll say this13:53
*** nweinber has quit IRC13:58
*** tkajinam has joined #openstack-nova13:59
*** derekh has joined #openstack-nova14:01
*** nweinber has joined #openstack-nova14:07
*** jmlowe has quit IRC14:09
*** jmlowe has joined #openstack-nova14:11
*** zhanglong has quit IRC14:28
*** takamatsu has quit IRC14:29
*** tbachman has quit IRC14:33
efriedbauzas, gibi: responded.14:39
bauzasdammit, need to Ctrl-R14:39
bauzasI'm literally writing live14:39
efriedbauzas: shouldn't be anything earth-shattering in my response.14:40
bauzaseek, sean-k-mooney did too14:40
* bauzas needs to look again at all comments and process them14:40
*** mriosfer has quit IRC14:40
bauzasefried: for the placement-ish syntax, I'm all for docs14:42
bauzasand not code14:42
bauzasFWIW14:42
bauzaslike, you can do it but you can mess it up14:42
bauzasyour dog14:42
bauzasanyway, continuing to write14:43
efriedIMO the only reason we shouldn't block placement-ish syntax is because we might miss something in our translation utility and have to provide a workaround until we fix it.14:43
efriedeven there be tygers.14:44
*** Liang__ is now known as LiangFang14:44
LiangFanggibi: hi gibi, regarding https://review.opendev.org/#/c/689070/14:45
LiangFanggibi: what do you think about setting a trait on the host machine, and specifying the trait in the flavor extra spec?14:47
LiangFanggibi: so the guest can be scheduled to a host with the cache capability14:47
*** vishalmanchanda has joined #openstack-nova14:51
*** ociuhandu has quit IRC14:57
*** elod has quit IRC14:57
*** elod has joined #openstack-nova14:58
*** artom has joined #openstack-nova15:04
*** bbowen has joined #openstack-nova15:07
*** artom has quit IRC15:09
*** takamatsu has joined #openstack-nova15:11
*** artom has joined #openstack-nova15:12
*** artom has quit IRC15:12
*** artom has joined #openstack-nova15:13
openstackgerritSylvain Bauza proposed openstack/nova-specs master: Proposes NUMA topology with RPs  https://review.opendev.org/55292415:22
*** mlavalle has joined #openstack-nova15:22
bauzasefried: gibi: sean-k-mooney: alex_xu: thanks for the comments, here is another baking of NUMA topology spec https://review.opendev.org/55292415:22
gibiLiangFang: if we only care about having a cache configured on the host then it can be a capability represented by a trait. If we also need to think about the available size of the caches then it is a resource15:23
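A sketch of the two options gibi describes, with invented CUSTOM_* names: a trait only gates on the capability being present, while a custom resource class also tracks quantity:

    # capability only: tag the host RP with a trait and require it in the flavor
    # (note: 'trait set' replaces the RP's existing trait list)
    openstack resource provider trait set --trait CUSTOM_LLC_CACHE <rp_uuid>
    openstack flavor set cache-flavor --property trait:CUSTOM_LLC_CACHE=required

    # quantified: create a custom resource class, inventory it, consume it
    openstack resource class create CUSTOM_LLC_CACHE_MB
    openstack resource provider inventory set <rp_uuid> --resource CUSTOM_LLC_CACHE_MB=64
    openstack flavor set cache-flavor --property resources:CUSTOM_LLC_CACHE_MB=2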
*** spatel has joined #openstack-nova15:24
gibibauzas, efried: ack, will check back soon15:24
*** ociuhandu has joined #openstack-nova15:28
*** tkajinam has quit IRC15:29
stephenfindansmith: Any particular reason we don't use entrypoints for custom scheduler filters? Is it something we could/should do?15:39
dansmithdon't we already?15:40
dansmithbauzas:15:40
stephenfinCustom scheduler drivers, yes. Not filters for the filter_scheduler though15:40
dansmithI was sure we did15:40
stephenfinYou've to configure them via a python path in '[filter_scheduler] enabled_filters'15:40
stephenfinfwict15:40
*** mkrai__ has joined #openstack-nova15:43
*** jmlowe has quit IRC15:44
*** TxGirlGeek has joined #openstack-nova15:50
*** udesale_ has quit IRC15:52
*** udesale_ has joined #openstack-nova15:53
*** ociuhandu has quit IRC15:54
*** KeithMnemonic has quit IRC15:56
*** lpetrut has quit IRC16:01
spatelsean-k-mooney: morning16:02
spatellet me know if you're around, i want to share my load-test results.16:02
*** udesale_ has quit IRC16:06
gibibauzas: thanks for the update on the NUMA spec it looks good to me now16:06
*** udesale_ has joined #openstack-nova16:07
*** ociuhandu has joined #openstack-nova16:11
*** ociuhandu has quit IRC16:12
*** ociuhandu has joined #openstack-nova16:12
*** ivve has quit IRC16:13
*** udesale_ has quit IRC16:16
sean-k-mooneybauzas: i'm skimming through it now but ya i'm more or less happy with it. i'll probably +1 it when i finish this pass16:24
bauzasdansmith: sorry was AFK16:26
bauzasstephenfin: yeah you have to set a specific option16:27
dansmithbauzas: np. I thought our scheduler filter interface was already using entry points, but stephenfin says it's not.. I'm sure he's right I was just poking you in case you were also surprised16:27
bauzashttps://docs.openstack.org/nova/latest/configuration/config.html#filter_scheduler.available_filters16:28
bauzasstephenfin: ^16:28
stephenfinright, matches up with what I suspected so16:28
stephenfinbauzas: any reason not to deprecate that and ask people to use entrypoints instead?16:28
stephenfingiven I'll be asking them to do that for extra spec validators16:28
bauzaswell, I don't have any opinion16:29
bauzaswe had a lot of entrypoints16:29
bauzasso for sure we could just use another one16:29
sean-k-mooneyspatel: did you see an improvement?16:30
bauzasstephenfin: dansmith: that's how Nova knows about filters https://github.com/openstack/nova/blob/master/nova/loadables.py#L7816:32
*** gyee has joined #openstack-nova16:32
bauzasfor example you can ask to have a new custom filter by doing something like scheduler_available_filters = myownproject.scheduler.filters.climate_filter.ClimateFilter16:33
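A minimal custom filter along those lines (module path and condition invented; the config comment shows the modern [filter_scheduler] spelling of the options discussed above):

    # myownproject/scheduler/filters/climate_filter.py
    from nova.scheduler import filters

    class ClimateFilter(filters.BaseHostFilter):
        """Toy filter: only pass hosts that still have free RAM."""

        RUN_ON_REBUILD = False

        def host_passes(self, host_state, spec_obj):
            # spec_obj is the RequestSpec being scheduled
            return host_state.free_ram_mb > 0

    # nova.conf on the scheduler node:
    # [filter_scheduler]
    # available_filters = nova.scheduler.filters.all_filters
    # available_filters = myownproject.scheduler.filters.climate_filter.ClimateFilter
    # enabled_filters = ComputeFilter,ClimateFilter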
*** dtantsur is now known as dtantsur|afk16:34
efriedbauzas, gibi, sean-k-mooney: I don't understand the "fallback" thing.16:35
efriedhttps://review.opendev.org/#/c/552924/17/specs/ussuri/approved/numa-topology-with-rps.rst@51616:36
sean-k-mooneyefried: its mimicing PCPUS16:36
sean-k-mooneybasically when procesing a vm request with a numa toplogy if and only if the placement allocation candiates responce is empty we will fall back to the non numa aware query and let the numa toplogy filter elimidate the host if it cant fit16:37
bauzasefried: try: <call placement asking for NUMA-aware instances> except NoValidHosts: <call Placement like in Train>16:38
sean-k-mooneyyes but without using excption for control flow :P16:38
efriedSorry, let me clarify16:38
bauzasefried: sean-k-mooney: I could have provided a link to the implementation instead of the spec :)16:39
efriedI understand the what/how. I don't understand the why.16:39
bauzasefried: because,16:39
bauzassay a rolling upgrade16:39
sean-k-mooneyefried: so we don't need to have a global flag in the scheduler to turn on the translation16:39
*** tbachman has joined #openstack-nova16:39
bauzasor just a Ussuri cloud with only one node being transformed16:40
bauzasthen we could have some problems16:40
sean-k-mooneythe same reason we did it for PCPUs, to make upgrades simpler16:40
bauzasefried: have you seen the Upgrade Impact section already ?16:40
sean-k-mooneybauzas: ya the single node case is basically during a rolling upgrade16:40
bauzasI tried to explain the *why* there16:40
efriedBecause what's going to happen here is:16:41
efriedI'll upgrade to Ussuri, where my workaround option is set to not reshape;16:41
efriedand all my flavors will continue to work, only a teeny bit slower because I'll get empty allocation candidates on the first query 100% of the time.16:41
efriedAnd I won't notice, so I'll never flip the workaround off.16:41
efriedso,16:42
efriedAm I misunderstanding what release/combination will have this fallback mechanism in play?16:42
gibiefried: I understand the need to push the operators to switch. But pushing them to do the switch right at the upgrade feels too much to me. It would be like an ultimatum: "When you upgrade to Ussuri you will lose all the NUMA aware capacity of your cloud, but you can get it back iff you do the reshape on every NUMA aware compute, but you should not do that all at once as that will overload placement."16:43
dansmithgibi: you think that's okay?16:44
efriedWell, I don't buy that it would overload placement. The reshape will be one query per compute host.16:44
sean-k-mooneyefried: i would hope we remove it when we remove the config option16:44
gibidansmith: I think it is not OK to lose all the NUMA aware capacity at upgrade16:44
dansmithgibi: okay you were describing the problem, not the goal, is that right?16:45
sean-k-mooneywe could remove it when we change the default but i would hope it will not last more than 2 releases16:45
efriedThen we kind of have to position this as a "tech preview" or experimental change.16:45
gibidansmith: my goal is not to lose the capacity at upgrade. I don't like the quoted part16:45
efriedIt just seems pretty pointless to me.16:45
dansmithgibi: ++16:45
efriedbecause we're going to do a bunch of work to enable something that nobody is going to use.16:46
sean-k-mooneywell since we are not enabling it by default in ussuri it kind of will be16:46
sean-k-mooneybut in victoria i think we should enable numa reporting by default. i would be happy to do that in ussuri but that might be a bit aggressive16:46
*** slaweq has joined #openstack-nova16:47
efriedI buy that we can't enable it by default as long as we can't fit non-NUMA-aware workloads onto NUMA-modeled hosts.16:47
sean-k-mooneyefried: well actually i'm not sure i agree with that but that's a separate discussion16:48
efriedbut with this fallback mechanism as designed, we're giving operators NO reason to switch over.16:48
sean-k-mooneyefried: well with my downstream hat on i will be pushing to make numa on by default for our next lts16:48
efriedwhich means we might as well not bother with this incremental improvement. We might as well just wait until we've solved the fitting problem.16:48
sean-k-mooneyefried: we have said that for 4+ releases16:49
sean-k-mooneythis will improve scheduling time and reduce the chance of races for numa instances16:49
efriedno16:49
efriedit won't16:49
sean-k-mooneyso it is still useful16:49
efriedit will do exactly nothing16:49
efriedexcept inject an extra placement call into every scheduling request.16:49
sean-k-mooneyit will because non numa instances will ignore numa hosts16:50
sean-k-mooneyand numa instances will ignore non numa hosts16:50
efriedthere will be16:50
efriedno16:50
efriednuma16:50
efriedhosts.16:50
sean-k-mooneypeople will turn this on16:50
sean-k-mooneyi can almost guarantee that the first release we productize downstream with this feature will have it enabled by default regardless of the upstream default16:51
efriedwhat, do your downstream releases force only flavors with guest numa topos?16:51
artomsean-k-mooney, I don't see how we can do that...16:51
artomWouldn't that make all computes essentially not usable for non-NUMA instances?16:52
efried^16:52
artomIf they're too big to fit on 1 NUMA node?16:52
gibiefried: actually that's what happens in my downstream project as we use pinning and huge pages16:52
sean-k-mooneyno they don't but we do force the numa hosts to be partitioned16:52
artomSo?16:52
efriedartom: even if they're not too big. Because we're adding the HW_NUMA_ROOT trait.16:52
sean-k-mooneyso all the hosts that will run numa workloads will have the feature turned on and the non numa host will have it off16:52
artomThey'd still all suddenly be NUMA-exposing16:52
artomsean-k-mooney, uh, so that's not "on by default as soon as they upgrade" :)16:53
artomefried, oh right16:53
sean-k-mooneyit will be enabled in roles that configure the host for dpdk, hugepages or pinning16:54
sean-k-mooneyi would argue it should be set to on for all hosts and provide a way for them to opt out if they need the giant vm case16:54
sean-k-mooneynone of our telco customers need that case16:54
artomsean-k-mooney, our telco customers are not 100% of openstack users16:55
artomWould CERN want that, for example? :)16:55
sean-k-mooneyi dont think so no16:55
sean-k-mooneyi think they understand the performance cost and would use numa instances16:55
sean-k-mooneycern are not going to throw away 30% of their compute performance16:56
sean-k-mooneyplus they have ironic if they really need to allocate all the resources of a full host to an instance16:56
artomsean-k-mooney, maybe, maybe not. I don't know how they, or anyone else who isn't a RH telco customer, operate. Which is why I'm wary of making this opt-out, and would feel safer making it opt-in.16:57
artomIf our deployment tooling wants to turn it on by default in some cases, that's cool16:57
artomBut not a Nova default16:57
sean-k-mooneyi think as proposed we get the best of both worlds16:58
sean-k-mooneythe fallback mechanism will mean that it will just work if you don't set anything16:58
bauzassorry you lost my attention by not highlighting me16:58
sean-k-mooneyand we can add a nova status check to warn that you should enable this before upgrading to Victoria16:58
artomsean-k-mooney, wait, the current proposed thing is disable_placement_numa_reporting = <bool> (default True for Ussuri)16:59
sean-k-mooneyyes16:59
artomWhich is what I'm advocating for16:59
sean-k-mooneyit will be disabled in ussuri16:59
artomOK, so we agree :)16:59
sean-k-mooneyand hopefully enabled by default in Victoria16:59
dansmithand that means what for flavors that currently have numa?16:59
*** rpittau is now known as rpittau|afk17:00
sean-k-mooneydansmith: the initial placement query will be empty17:00
sean-k-mooneythen we fall back to the current query17:00
artomdansmith, IIUC same as what we did for PCPUs - try the new Placement query, if it comes back empty, try the legacy one17:00
sean-k-mooneyand leave it to the numa topology filter to do all the work17:00
*** priteau has quit IRC17:00
sean-k-mooneyso they will just work17:00
dansmithack, yep, I think that's the sane path for U17:01
sean-k-mooneystephenfin: by the way are you removing the fallback for PCPUs in U17:02
sean-k-mooneystephenfin: that was the plan but i dont think you have had time to work on it17:02
stephenfinI could but I was thinking I'd wait another cycle17:02
dansmithif it's on by default, could we do the opposite for non-numa flavors? meaning, query with numa_nodes=1 and if we get no options, then try with =2, etc?17:02
dansmithuntil we get a range capability with placement17:02
stephenfinTo let it bake in more. It's just dead code once the correct config options have been set17:03
sean-k-mooneyum, maybe, but the performance of that might not be great17:03
sean-k-mooneyor acceptable.17:03
artomdansmith, that would assign a NUMA topology even if the user didn't request it (explicitly or implicitly), no?17:03
sean-k-mooneyartom: ya but that would actually be fine17:04
sean-k-mooneythey expressed no preference17:04
artomNUMA topologies have a whole bunch of limitations :)17:04
sean-k-mooneyso we can decide what that is17:04
sean-k-mooneyartom: you removed the main one e.g. live migration17:04
artomTrue...17:04
openstackgerritDouglas Mendizábal proposed openstack/nova master: Allow TLS ciphers/protocols to be configurable for console proxies  https://review.opendev.org/67950217:04
dansmithartom: yes, it would, but I definitely think that we should be getting to the place where we don't just pretend numa doesn't exist17:05
artomI'd still feel uncomfortable springing that on people17:05
dansmithartom: so we want people to not have to worry about it in detail, but not just let them totally ignore it, IMHO17:05
sean-k-mooneyi think that is something we should consider in V when we consider changing the default17:06
dansmithyup17:06
sean-k-mooneyif there is a sensible way to express it to placement or a sensible way to make it work from our side we should explore it17:06
dansmithwhat I don't think we should do, is continue to have two distinct ways of running instances forever17:06
sean-k-mooneyi agree with that17:07
artomSame here17:07
artomThough I'm not entirely convinced giving everyone a NUMA topology is the way to do it17:07
dansmithsimilar to cellsv1, that didn't work out well, and we never closed the feature and bug gap for people using it until cellsv2 where we just make everyone use it.. it's a little more overhead, but the gap is way smaller17:07
dansmithartom: but everyone _has_ a numa topology17:08
artomdansmith, you know what I mean ;)17:08
dansmithif we need to retool the numa stuff in nova (and probably placement) then that's what we need to do17:08
*** martinkennelly has quit IRC17:08
artomI think I'd lean towards the can_split stuff in placement17:08
artomThough I admittedly have no idea how complicated that would be17:09
dansmithyep, it seems like the major barrier here is lack of expressivity with what we ask of placement when we don't care as much17:09
artomLike, if the host is NUMA, fine, expose it17:09
artomBut as you said, being able to say "I don't care about NUMA" would be the best way to solve this, I think17:10
*** dave-mccowan has joined #openstack-nova17:10
sean-k-mooneyartom: can_split is very inefficient to implement in sql17:10
artomSo, I have beef with "placement has to be SQL", but that's just me ;)17:10
sean-k-mooneyartom: if "i dont care about numa" means we are free to invent numa toploigies then sure17:10
artomsean-k-mooney, weeelll... wouldn't that lead to a bunch of packing problems?17:11
bauzasfolks, I have to disappear17:11
bauzasleave comments, disagreements, concerns on the spec17:11
sean-k-mooneyartom: oh it doesn't have to be sql. but it would be very hard to support can_split with the current implementation17:11
sean-k-mooneyartom: no17:11
artomI guess not, if you retry with enough combinations of guest NUMA nodes and CPUs per node17:11
*** martinkennelly has joined #openstack-nova17:12
artomSo if you somehow end up with NUMA0 with 1 CPU free, and NUMA1 with 3 CPUs free, and boot an instance with 4 CPUs, you would need to retry until you get to the numa0_cpus=1,numa1_cpus=3 combo17:12
sean-k-mooneyhonestly doing 4 placement queries with 1-4 numa nodes is still probably faster than the numa topology filter today17:12
sean-k-mooneybut i dont think this is productive to continue now17:13
artomTrue17:13
artom(On the second point)17:13
dansmithif, like I said, we had a range of nodes, I would think placement could just loop internally much faster than even our retry process17:13
dansmithand just dump us more options in the first go17:13
sean-k-mooneydansmith: ya it's really a problem of being able to express our actual constraints to placement so it can do an efficient thing17:14
dansmithright17:14
sean-k-mooneyat the moment we are overspecifying and underspecifying at the same time17:14
artomdansmith, I think placement would want to not know about NUMA17:14
artomAnd only think of things in terms of generic RPs17:14
sean-k-mooneysince the way we express the query does not fully match what we need17:14
dansmithartom: that's the proposal, AFAIK17:15
dansmithartom: to model this in placement as RPs with the relevant hierarchy17:15
artomdansmith, how does this fit with your "internal placement retry loop" idea though?17:15
sean-k-mooneyartom: right but we added some semantic meaning that is not helpful to some concepts in placement.17:15
dansmithartom: ?17:15
artomdansmith, well, wouldn't placement need to understand what a NUMA node is to do that?17:16
sean-k-mooneyfor example resources:vcpu=2 must allocate 2 cpus from the same RP17:16
sean-k-mooneyartom: no it just needs to know that some resources need to be in the same subtree17:16
*** ivve has joined #openstack-nova17:16
sean-k-mooneywith a parent containing an opaque trait "HW_NUMA_ROOT"17:17
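Roughly the shape such a request would take with placement's granular request groups (a sketch; amounts invented, and exact syntax/microversion details elided):

    GET /allocation_candidates
        ?resources_numa0=VCPU:4,MEMORY_MB:2048
        &resources_numa1=VCPU:4,MEMORY_MB:2048
        &required_numa0=HW_NUMA_ROOT
        &required_numa1=HW_NUMA_ROOT
        &group_policy=isolate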
artomsean-k-mooney, I guess I can see that17:17
dansmithartom: no, I don't think so, we just need to provide some way to communicate to placement that it can satisfy the required resources by allowing them to exist in multiple parts in the tree, with some minimum amount per provider, with maybe matching ratios of cpus and memory17:17
dansmithartom: I don't mean an "&numa_nodes=1-4" level of explicitness (even though I've said that just as an example), but it really just needs to be the generic form of that17:17
artomdansmith, I wonder if we could tell placement something like "aggregate_at_root=true", and then it could sum the child RPs resources temporarily when handling that query17:18
dansmithartom: we need more than that, I think17:18
sean-k-mooneyno that is not enouch17:18
dansmithartom: we need to be able to say "don't give me 7 cpus on one node with 1MB of ram, and 1 CPU on the other with 8G"17:18
sean-k-mooneythe thing that breaks with simple models is always disk space17:18
sean-k-mooneyif i asked for 10017:19
dansmithartom: something like "cpus and mem must match 60/40 across the split" or something17:19
sean-k-mooney100G you can't give me 2x50G17:19
dansmithyep17:19
sean-k-mooneyso it has to be per resource class and we need to express grouping constraints and potential sizing info like the 60/40 split17:20
*** gmann is now known as gmann_afk17:20
sean-k-mooneyartom: right now to do ^ we have to specify the topology exactly rather than saying "these are the limits, give me anything that matches"17:20
artomBut... the disk would never be on a NUMA child RP...17:20
dansmithartom: he's giving an example of a resource splitting constraint17:21
artomTBH I still don't see why we need to express splitting constraints, but I'm probably just being thick17:22
artomAnd as sean-k-mooney said, it's an academic discussion not relevant to the current spec17:22
sean-k-mooneysame thing applies to hugepages. if i ask for 512mb of hugepages and the host only has 1G hugepages allocated you can't split it17:22
sean-k-mooneydisk is just more approachable17:22
dansmithbecause if you ask for 16 CPUs and 32G of RAM, we need to be able to say "we're willing to take that split across two nodes as long as the ratio of the split for those resources is at most 60/40"17:22
artomsean-k-mooney, right, but hugepages we model as their own RP17:22
*** mkrai__ has quit IRC17:23
artomSo you'd never get 1GB hugepages anyways, you'd get, for example, 400MB from 1 RP, and 112 from another (unrealistic numbers, I know)17:23
artomdansmith, so that's my thing - why do we want to say that? What's wrong with CPUs split 1/15 and memory 30GB/2GB17:24
sean-k-mooneyit would get rejected by the step size yes17:24
*** ccamacho has joined #openstack-nova17:24
dansmithartom: because that would be pretty unhelpful?17:24
*** igordc has joined #openstack-nova17:24
artomdansmith, hey, the user said they don't care about NUMA topologies ^_^17:25
sean-k-mooneybut the point is there are limits on how things can be split that depend on the resource class and how it will be used17:25
artomBut seriously, unhelpful from a performance POV?17:25
dansmithartom: right, the user isn't opinionated, but they still want a sane instance17:25
dansmithartom: not caring about numa doesn't mean we should give them something completely pathologically stupid17:25
*** ociuhandu_ has joined #openstack-nova17:26
artomlulz - we need a hw:sanity extra specs17:26
artomAnd if they set it to ludicrous we do ^^17:26
artomdansmith, but yeah, I get your point17:26
dansmithif we did this, we'd want to be able to specify what those splits are, and if the op really doesn't care, then they can set the split policy to something very fine17:26
dansmithbut it wouldn't make much sense for them to do that17:27
artomI was mostly joking about that extra spec17:27
sean-k-mooneyyes we know17:27
sean-k-mooneyanyway from my viewpoint if you don't set hw:numa_nodes at all it gives nova the freedom to do something sane17:28
sean-k-mooneythat can be creating a numa topology if it chooses17:28
*** ociuhandu has quit IRC17:29
sean-k-mooneyfor now we let libvirt invent a single numa node with no affinity17:29
*** ociuhandu_ has quit IRC17:30
sean-k-mooneyhaving one or multiple numa nodes in the guest and mapping them to 1 or more numa nodes on the host are two different things17:30
sean-k-mooneyso we can if we want expose 1 numa node to the guest and on the host map it across them17:31
sean-k-mooneyi think we can do better than that however. anyway time to go review something else17:31
*** evrardjp has quit IRC17:34
*** evrardjp has joined #openstack-nova17:34
*** martinkennelly has quit IRC17:39
*** ccamacho has quit IRC17:42
*** spatel has quit IRC17:50
stephenfinah, crap. Today was supposed to be spec review day17:51
stephenfinGuess tomorrow is spec review day for me now \o/17:51
yoctozeptostephenfin: the most busy today - tomorrow :-)17:53
stephenfinsean-k-mooney: Tempest is failing on my extra spec validation patch because it's using generic e.g. 'key1=value1' extra specs. What's the most generic extra spec we've got?17:57
stephenfinI've been using 'hw:numa_nodes' but that's libvirt/HyperV specific17:57
*** spatel has joined #openstack-nova17:57
spatelsean-k-mooney: sorry i was in a meeting17:57
*** derekh has quit IRC17:58
spatelsean-k-mooney: as per your recommendation i have added cpu_threads=2 and cpu_sockets=2 in the flavor and ran the test, but the result was OK (compared to 16 vCPU with a single numa0)17:59
spateli still don't understand why; erlang correctly detected the CPU topology on the VM but the result was still poor with 28 vCPU18:00
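(Presumably set as the hw: flavor extra specs, something along these lines, with the flavor name invented:)

    openstack flavor set my-flavor \
        --property hw:cpu_sockets=2 \
        --property hw:cpu_threads=2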
*** igordc has quit IRC18:02
*** igordc has joined #openstack-nova18:06
*** davidsha has quit IRC18:07
openstackgerritLee Yarwood proposed openstack/nova master: images: Move qemu-img info calls into privsep  https://review.opendev.org/70689718:11
openstackgerritLee Yarwood proposed openstack/nova master: images: Allow the output format of qemu-img info to be controlled  https://review.opendev.org/70689818:11
openstackgerritLee Yarwood proposed openstack/nova master: virt: Pass request context to extend_volume  https://review.opendev.org/70689918:11
openstackgerritLee Yarwood proposed openstack/nova master: WIP libvirt: Fix attached encrypted LUKSv1 volume extension  https://review.opendev.org/70690018:11
*** xek_ has joined #openstack-nova18:12
*** jcosmao has left #openstack-nova18:13
openstackgerritStephen Finucane proposed openstack/nova master: WIP: api: Add support for extra spec validation  https://review.opendev.org/70464318:13
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove FakeScheduler  https://review.opendev.org/70722418:13
openstackgerritStephen Finucane proposed openstack/nova master: conf: Deprecate '[scheduler] driver'  https://review.opendev.org/70722518:13
openstackgerritStephen Finucane proposed openstack/nova master: doc: Improve documentation on writing custom scheduler filters  https://review.opendev.org/70722618:13
stephenfinsean-k-mooney: I went with 'hw:numa_nodes' and 'hw:cpu_policy' for want of something better18:13
*** macz has joined #openstack-nova18:15
*** xek has quit IRC18:15
*** umbSublime has quit IRC18:23
*** maciejjozefczyk has quit IRC18:23
*** tbachman has quit IRC18:23
*** tbachman has joined #openstack-nova18:24
*** mriedem has joined #openstack-nova18:30
*** ralonsoh has quit IRC18:33
*** martinkennelly has joined #openstack-nova18:37
efriedsean-k-mooney, bauzas, gibi: I think I would actually prefer the approach dansmith suggests, even if we only do it at the coarsest level, rather than effectively disable the topology modeling in ussuri.18:44
*** gmann_afk is now known as gmann18:49
efriedin other words:18:49
efried- Model with NUMA topology by default. Provide the [workaround] option to *disable* (and un-reshape) for situations where the following is just sh*tting all over itself and nothing is landing.18:49
efried- Flavors with hw:numa*-isms get translated as specced.18:49
efried- Flavors without hw:numa*-isms get looped for $n in range(0, $max_sane_number_of_numa_nodes_we_expect_a_host_to_ever_have) to behave as if they had specified hw:numa_nodes=$n, stopping as soon as we get a hit.18:49
efriedThis is without changing anything in placement.18:50
*** amoralej is now known as amoralej|off18:51
efriedFor uneven splits, we just get as close as we can, but make no attempt to be "fuzzy" (like "up to 60/40" or anything like that). So like, for $n=2 and VCPU=10, we try 10, then 5/5, then 3/3/4, then 2/3/2/3.18:52
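A toy version of that splitting loop (illustration only, not nova code; `vcpus` and `max_nodes` stand in for the real inputs, and the per-node ordering differs from efried's example though the multiset is the same):

    def split_evenly(total, n):
        """Split `total` units across n cells as evenly as possible."""
        base, rem = divmod(total, n)
        return [base + 1] * rem + [base] * (n - rem)

    # e.g. for vcpus=10: [10], [5, 5], [4, 3, 3], [3, 3, 2, 2]
    for n in range(1, max_nodes + 1):
        cpu_split = split_evenly(vcpus, n)
        # build a request as if the flavor said hw:numa_nodes=n,
        # query placement, and stop at the first non-empty response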
dansmiththere has to be some sort of policy tunable for how wild we're able to get I think18:56
efriedmax_numa_nodes_guessed ?18:56
*** psachin has quit IRC18:57
efriedAre there really hosts out there with more than, say, 8 NUMA cells?18:57
dansmithno, I mean how small of a footprint on a given node we're going to allow18:57
efriedTo be clear, I'm talking about splitting as evenly as possible, always.18:58
efriedYou're saying that sometimes that would result in unreasonably small footprint on one numa node anyway?18:58
efriedobv we stop trying to split if any of the cells are going to get 0 of anything.18:59
efriedso if you ask for 2 vcpu, we go to a max of hw:numa_nodes=218:59
dansmithsure, but I think we need to not be willing to split memory at 1MB and 8191MB18:59
dansmithor 31 cpus and 118:59
dansmithand the ratio needs to be close/equal for cpus and memory19:00
efriedRight, I'm saying that happens implicitly19:00
dansmithhow?19:00
efriedeach iteration of the loop simply behaves as if you said hw:numa_nodes=$n.19:00
efriedwhich simply splits your CPU and mem "evenly" across $n nodes.19:01
dansmiththat just seems too naive to me19:01
efriedyes19:01
efriedit is19:01
efriedbut so is "I don't care about NUMA"?19:01
efriedAnd it ought to work 99% of the time19:01
efriedand if it doesn't, switch on your [workaround] for a couple of hosts.19:01
dansmithI want the workaround to go away, remember19:02
efriedYes, the workaround goes away completely once we beef up placement to understand can_split19:02
dansmithokay your even splitting is just the first pass at sanity? then that's fine19:02
efriedoh, yeah, eventually we want the utopia where all of this happens in one call with can_split, whose ratios are tunable (via placement side conf? via nova conf fed into placement qparams?)19:04
dansmithhas to be nova-side19:04
dansmithcommunicated to placement via the query19:04
efriedthen we won't need the workaround anymore, because everything will be able to land (and if it can't, it's because it *shouldn't*).19:05
efriedIn practice, we may even find that nobody needs the workaround. But we'll see.19:05
*** maciejjozefczyk has joined #openstack-nova19:07
efriedDid we decide how we're going to deal with "control plane is updated but some computes are not"? Does the control plane wait to start using the new query style until all the computes are updated? We have that capability, right?19:09
efriedokay, I see that discussed in the spec.19:11
*** LiangFang has quit IRC19:11
sean-k-mooneyefried: i think if we go that route in ussuri we wont be able to land it in time19:18
efriedwhy not?19:19
efriedwe're not talking about trying to implement can_split in any form in ussuri19:19
efriedare you concerned that the progressive-splitting algorithm is too complicated?19:20
sean-k-mooneyjust multiple queries where we try to progressively split19:20
sean-k-mooneyefried: yes19:20
efriedmeh, I don't see how it's any worse than the proposed fallback.19:21
sean-k-mooneyefried: it will have to take into account numa, native request groups in the flavor, external requests from cyborg and neutron ports, and not suck from a performance point of view19:21
sean-k-mooneythe proposed fallback is two queries, the native numa one followed by the query we do today19:21
sean-k-mooneyand it only impacts the performance of numa instances19:22
sean-k-mooneythe other way impacts the performance of all non numa instances19:22
efriedsean-k-mooney: I don't think it has to take all that stuff into account at all.19:22
efriedsean-k-mooney: It should behave *exactly* as if you said hw:numa_nodes=$n with no other hw:numa*-isms.19:23
efried(except I think we said we would bounce if we couldn't split evenly; that restriction would have to be lifted for this case.)19:23
sean-k-mooneyso we would create a fake flavor where we override that and pass it to the current get_numa_constraints function19:24
efriedif you like19:24
efriedthat would be the spirit, anyway.19:24
sean-k-mooneyi guess that makes it simpler19:24
*** maciejjozefczyk has quit IRC19:25
sean-k-mooneyso the get_numa_constraints function will reject any invalid topology with regard to even splitting with an exception19:25
sean-k-mooneyso we would just loop and continue if an exception is raised, up to the limit19:25
efriedor we relax the constraint to split as close to evenly as possible. Or do that split first.19:26
efriedImplementation detail. Point is, it shouldn't be super hard to figure out.19:26
sean-k-mooneywe could use the asymmetric numa modeling support that is there, yes19:26
sean-k-mooneye.g. if you had a 9 core vm and we are on numa=2 do 4cpus+5cpus19:27
sean-k-mooneyinstead of going to 3 numa nodes with 3 cpus19:28
efriedright19:28
sean-k-mooneyi'm not sure which is more likely to cause fragmentation off the top of my head19:28
sean-k-mooneywe should tell people to just use powers of 219:28
efriedexample I gave above was with 10 VCPUs, we would try 10, then 5/5, then 3/3/4, then 2/3/2/3.19:28
efriedno19:28
sean-k-mooneyi was joking19:29
efriedwe should tell people to use real numa specs if they care.19:29
sean-k-mooneyit does make life easier when they do but ya i can see that working19:29
sean-k-mooneyok so if we calculate the split in the temporary flavor we pass to the numa constraints function then it would populate the instance numa topology object as if the user had set it manually in the flavor19:30
sean-k-mooneythen if we save that in the instance we could ensure we don't break live migration by changing it19:31
efriedjust so.19:31
efriedwell, I would expect we shouldn't save it in the flavor, because we want the instance to be able to morph to fit somewhere else if it needs to.19:31
sean-k-mooneyright19:31
efriedbut I don't know how that works, do you break an instance if you "change" its topo from under it?19:31
sean-k-mooneyi meant save the instance_numa_topology object19:31
sean-k-mooneynot the flavor19:32
sean-k-mooneyefried: during live migration you would19:32
sean-k-mooneycold migration might mess up some manual config but it should not break it in general19:32
efriedhm, well that's a bummer. So how do we migrate numa-agnostic instances today?19:32
sean-k-mooneyi was suggesting once we select a topology we store it in the request_spec and instance19:32
sean-k-mooneyso that it stays the same for the lifetime of the instance unless you resize19:33
sean-k-mooneyor rebuild19:33
sean-k-mooneyum, today19:33
sean-k-mooneynon numa instances are always exposed as 1 numa node19:33
sean-k-mooneyso it never changes from the guest's point of view19:33
sean-k-mooneyso we just lie to the guest19:34
efriedcouldn't we continue lying to the guest?19:34
sean-k-mooneywe can yes19:34
efrieddoes the guest do things differently if it knows CPU x is affined to memory y?19:34
sean-k-mooneyso we would do the progressive splitting to select the host resources and present it as 1 numa node to the guest19:34
sean-k-mooneybut that will have worse performance than telling it its actual topology19:35
sean-k-mooneyyes19:35
sean-k-mooneythe kernel will take that into account when allocating memory for a process19:35
sean-k-mooneytrying to use numa local memory ahead of remote numa memory19:35
spatelsean-k-mooney: hey19:35
efriedwell, I guess this is a problem we would have eventually anyway, right dansmith?19:35
sean-k-mooneyif we don't expose the topology to the vm, the vm kernel won't know how to optimise19:35
spateldid you see my last mesg?19:36
dansmithefried: what? needing everything to support guests with numa?19:36
spatelrunning the vm on a single numa0 gives very high performance compared to running on both numa nodes (with the cpu_socket=2 and cpu_threads=2 options)19:37
efrieddansmith: TL;DR: In the world where you say you don't care about NUMA, we give you an instance that's NUMA-ified, but with whatever split we were able to fit.19:37
efriedIf you migrate that instance, you have to preserve that topo on the target; you can't just munge it to a new shape.19:37
efriedWhich will then limit where you can fit it on migration.19:37
sean-k-mooneyspatel: i was away but just saw it now. fundamentally i think erlang is just not optimising correctly. i'm not really sure how to help other than suggesting running more small vms if you can scale out the application horizontally instead of vertically19:37
dansmithsean-k-mooney: presumably if the flavor on the instance is not numa-aware then we can just find a new host during cold migration and give it a new topology when it moves19:37
sean-k-mooneydansmith: yes we could19:37
dansmithefried: only on cold migration19:38
dansmithsorry19:38
dansmithefried: only on live migration19:38
dansmithefried: on cold migration it could change because you're rebooting and there shouldn't be anything in the guest that persistently cares what the topology is..19:38
efriedOkay, that would be a new limitation.19:38
spatelsean-k-mooney: totally understand but i am going to lose 16 cpus in that case. but anyway i can live with that19:38
dansmithefried: we have that limitation today and it's handled by the numa live migration stuff, AFAIK19:38
efrieddansmith: oh, my understanding was that, by lying to the guest and saying it only has one NUMA node, regardless of how many are under the covers, we can change the under-the-covers on live migration without "affecting" the guest. It would still have crappy performance on both sides, but it wouldn't notice that anything had changed.19:39
sean-k-mooneyefried: well the limitation would be: don't change the view of hardware the vm sees in live migration19:39
dansmithefried: and selecting a host in that situation should be the same as selecting a host for a new instance boot where the flavor cares deeply about the topology19:39
sean-k-mooneywhen framed that way it's what we do today19:39
dansmithefried: only insofar as it is unaware of how stupid it's being, regardless of what is underneath19:40
sean-k-mooneyif we lie to the vm so it only sees 1 numa node we can in some cases change the mapping underneath, yes19:40
dansmithefried: so yes, if all guests are numa-aware then migrating indifferent ones becomes a little more restrictive, but that's the same goal as not pretending this stuff doesn't exist at boot time, IMHO19:41
efrieddansmith: okay, I understand and agree with that; I'm just questioning whether that's going to effectively spike the chances of NVH trying to live migrate a NUMA-agnostic instance.19:41
efriedsounds like the answer is yes, and we're okay with that.19:41
dansmithefried: it may increase the difficulty of moving things, yes19:41
sean-k-mooneyefried: well in a non-full cloud it's very likely that the numa=1 case will just work19:42
sean-k-mooneyunless the vm is very large19:42
dansmithefried: I refer to the documented goal of not trying to schedule the last byte of memory19:42
dansmithyep19:42
sean-k-mooneyand in that case the numa=2 case is likely to work19:42
sean-k-mooneyso i dont think it would spike much19:42
dansmithsmall guests are less likely to care about numa, and thus more likely to fit into numa=1, and thus more likely to be easily movable19:43
dansmithlarge guests are more likely to need numa for proper performance and have the moving restriction today19:43
*** vishalmanchanda has quit IRC19:43
sean-k-mooneyefried: dansmith: i'm wondering if we could/should have a second spec or defer this to the implementation19:44
efriedThis spec needs to say whether we're going to try to do the progressive splitting thing.19:44
sean-k-mooneye.g. if we agree with the proposed placement modeling, should we decide how to do the query splitting separately19:45
efriedBut that's not the only factor in play. We also need to address the partially-upgraded cloud.19:45
sean-k-mooneyvs fallback19:45
sean-k-mooneyso with the progressive splitting i think we still need the fallback19:45
sean-k-mooneyto cover that case19:45
sean-k-mooneythe fallback is the only thing that can ever land on the non upgraded hosts19:46
efriedI'm not sure we should do the fallback thing at all.19:46
efriedNot if there's a way we can simply avoid doing the translation until all computes are upgraded.19:46
sean-k-mooneywe cant without a global config19:47
sean-k-mooneywhich is why we have the fallback for pcpus19:47
efriedI thought the control plane was able to tell which computes were at what level?19:47
dansmithefried: we have to do the fallback no?19:47
efriedBy RPC something something?19:47
dansmithefried: it can19:47
sean-k-mooneyit was a global config until we agreed to do the fallback at the end of train19:47
efrieddansmith: isn't that going to end up violating pack/spread weighing and server affinity groups, because it will always favor upgraded/reshaped computes?19:48
*** spatel has quit IRC19:48
dansmithyes?19:48
efriedI mean, I see your point, that you really can't upgrade a compute *and* reshape it unless the scheduler is going to do some translating.19:49
dansmithor, do both queries, merge the two results and let the filters/weighers decide?19:49
efriedgross. But yeah.19:49
dansmithdoesn't seem more gross to me.. two queries, yes, but everything will start with two queries effectively19:49
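i.e. instead of the sequential fallback, something like (placeholder names again):

    numa_cands = placement.get_allocation_candidates(numa_aware_request)
    legacy_cands = placement.get_allocation_candidates(train_style_request)
    # hand the union to the filters/weighers so pack/spread and affinity
    # decisions see upgraded and non-upgraded hosts together
    candidates = numa_cands + legacy_cands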
sean-k-mooneydansmith: do we merge them for PCPUs or is there a reason we dont?19:50
dansmithidk19:50
efriedIIRC we don't19:50
efriedwe do one, then if no results, we do the other19:50
sean-k-mooneyright which by the way we still do19:50
efriedas for a reason, probably just didn't think about the weighing etc.19:51
*** jmlowe has joined #openstack-nova19:51
sean-k-mooneyno i remember this coming up19:51
sean-k-mooneyi just don't recall why we chose to/not to merge them19:51
sean-k-mooneyso is the proposal always do both queries and merge them, or do the progressive splitting19:52
efriedBoth19:53
efriedbecause19:53
efriedwe want upgraded hosts to cut over to NUMA-modeled by default.19:53
sean-k-mooneyok so progressive for non numa and both queries for numa19:53
efriedI think we need two queries for both.19:54
sean-k-mooneyi think it will be more like 2 for numa and 5 for non numa19:54
efriedyeah19:54
sean-k-mooney5 assuming we set max_numa_nodes=519:55
efried419:55
sean-k-mooneysorry 419:55
efriedI asked this earlier: what's the max number of NUMA nodes we know of on any system? Is it 4?19:55
sean-k-mooneycould we bail out if the first numa query passed19:55
efriedyes19:55
sean-k-mooneyon a 32-64 core 2 socket amd epyc host with a numa node per l3 region, 16-32 numa nodes19:56
efriedye gods19:56
*** tbachman has quit IRC19:57
sean-k-mooneyif you don't expose a numa node per l3 region then 8 is more realistic19:57
efriedCan we tell from the db?19:57
efriedI guess we could tell by querying placement.19:57
sean-k-mooneyit's in the host numa topology blob in the cell db19:57
*** tbachman has joined #openstack-nova19:57
sean-k-mooneyso yes technically.19:57
efriedbut I don't know that we really want to go discovering that every time we schedule.19:58
sean-k-mooneyevery time, definitely not19:58
sean-k-mooneyscheduler config option?19:58
efriedconfigurable [scheduler]max_implicit_numa_nodes ?19:58
efriedyeah19:58
efriedbecause even if it's possible to have a 32-way split, that would almost never be a good idea probably.19:58
sean-k-mooneyset it to 0 to disable numa queries entirely, otherwise it's the limit19:58
sean-k-mooneyefried: it gives you almost a 35% performance boost19:59
efried"disable" meaning what?19:59
sean-k-mooneyefried: disabled means i have disabled numa reporting in my entire cloud so don't even try19:59
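The option being sketched here (name and semantics are only what this discussion proposes, nothing merged):

    [scheduler]
    # 0 = NUMA reporting disabled cloud-wide, never issue NUMA-aware queries;
    # N > 0 = max number of NUMA nodes to try when progressively splitting
    #         a NUMA-agnostic flavor
    max_implicit_numa_nodes = 4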
efriedNeed to grok how that query would be different from the non-upgraded query.20:00
sean-k-mooneyit would be the same20:00
sean-k-mooneyjust what we do today in train20:00
efriedOh, because it would have the HW_NUMA_ROOT trait.20:00
sean-k-mooneywell require=!HW_NUMA_ROOT would be the delta from train i guess20:01
sean-k-mooneybut if no host reports numa that's a noop20:01
efriedIn this new picture, I'm not sure we need/want to forbid that trait.20:01
sean-k-mooneywe dont20:02
sean-k-mooneyif we are allowed to invent numa toplogies20:02
sean-k-mooneywe could use it specifically to land on un-upgraded hosts20:02
sean-k-mooneybut it has no other use20:02
efriedin this case we don't *want* to target un-upgraded hosts.20:02
efriedwe want to *allow* landing there, but not *force* it ever.20:03
sean-k-mooneysure20:03
efriedhum, but what we don't want is to land across numa nodes on an upgraded host. So yeah, I think the 'fallback' query in both cases should have !HW_NUMA_ROOT.20:04
sean-k-mooneyso [scheduler]max_implicit_numa_nodes=0 means don't add HW_NUMA_ROOT or granular groups, anything above 0 is the number of numa nodes to try for progressive splitting20:04
efriedyeah. But only for NUMA-agnostic flavors. For NUMA-aware flavors, we have an explicit number of nodes we're trying for.20:04
sean-k-mooneyyes20:05
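Putting the two queries for a NUMA-agnostic flavor together (sketch; syntax simplified as before):

    # upgraded hosts: per-NUMA-node granular groups, as in the earlier example
    GET /allocation_candidates?resources_numa0=VCPU:8,MEMORY_MB:4096&required_numa0=HW_NUMA_ROOT&...
    # fallback covering non-upgraded (or workaround-enabled) Train-style hosts
    GET /allocation_candidates?resources=VCPU:8,MEMORY_MB:4096&required=!HW_NUMA_ROOT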
efriedDo you have it in you to write this up, sean-k-mooney?20:05
efriedI made a start on it, but I need to step away for... possibly the rest of the day.20:05
sean-k-mooneyill start a second etherpad.20:05
sean-k-mooneyor if you have a start put it in one and i can extend it.20:05
sean-k-mooneyefried: do we still need the per host config option in this model20:06
sean-k-mooneyi think no, if we do the progressive splitting20:06
sean-k-mooneybut we might want to for performance reasons if we just want to turn it off20:06
efriedyes we do, because sometimes the progressive splitting won't get a result, and they want to force a host to behave like a Train host.20:06
sean-k-mooneyok20:07
efriedbut that's why the workaround is *off* by default. You have to really need it to turn it on.20:07
sean-k-mooneyok that makes sense20:07
sean-k-mooneydansmith: would you be ok with a [scheduler]/max_implicit_numa_nodes config option20:07
sean-k-mooneyto control the progressive splitting20:08
efried"config-driven API behavior" warning. Not sure I see a better alternative though.20:08
sean-k-mooneyefried: well the virt driver can today dowhatever the hell it like in this case anyway so im not sure its an observable thing20:09
sean-k-mooneyat least form the api perspctive20:09
sean-k-mooneybut i get where your coming form20:09
*** jmlowe has quit IRC20:14
dansmiththat's totally not config-driven api behavior20:15
dansmithand yes, I think that's fine20:15
sean-k-mooneyok i'll try to write this up in a comment on the spec and then i'll try not to melt bauzas's brain when i try to explain this to him tomorrow in our downstream tech call20:18
sean-k-mooneyi think 90% of the spec would remain the same, we just need to update the sections that reference the fallback and upgrade impact20:18
efriedsean-k-mooney: I left a comment20:22
efriedI think I covered the high points, but I'm pretty fried (*e*fried) so I probably missed some things, if you want to fill in.20:22
efriedgtg o/20:22
*** efried is now known as efried_afk20:22
sean-k-mooneyefried_afk: ill review it after coffee20:22
efried_afkthx20:23
sean-k-mooneyefried_afk: o/20:23
*** owalsh has quit IRC20:30
*** martinkennelly has quit IRC20:41
*** owalsh has joined #openstack-nova20:44
*** spatel has joined #openstack-nova21:18
*** nweinber has quit IRC21:20
*** spatel has quit IRC21:22
*** xek_ has quit IRC21:35
*** mmethot has quit IRC21:36
*** mmethot has joined #openstack-nova21:37
*** mmethot has quit IRC21:38
*** mmethot has joined #openstack-nova21:38
*** mmethot has quit IRC21:42
*** damien_r has quit IRC21:44
gmannjohnthetubaguy: what you think of passing service as actual target in service policies? - https://review.opendev.org/#/c/676688/8/nova/api/openstack/compute/services.py21:53
*** maciejjozefczyk has joined #openstack-nova21:57
*** maciejjozefczyk has quit IRC21:59
*** rcernin has joined #openstack-nova22:07
*** umbSublime has joined #openstack-nova22:11
*** kaisers has joined #openstack-nova22:21
*** mriedem has quit IRC22:25
*** slaweq has quit IRC22:25
artomWe should probably address those errors when running func tests:22:35
artomException ignored in: <function _after_fork at 0x7f6382a2ed40>22:35
artomTraceback (most recent call last):22:35
artom  File "/usr/lib64/python3.7/threading.py", line 1373, in _after_fork22:35
artom    assert len(_active) == 122:35
openstackgerritGhanshyam Mann proposed openstack/nova master: Skip all integration jobs for policies only changes.  https://review.opendev.org/70726822:42
gmannefried_afk: stephenfins dansmith gibi melwitt ^^ this will speed up the gate for policy BP changes.22:43
gmannalex_xu: ^^22:43
openstackgerritGhanshyam Mann proposed openstack/nova master: Skip to run all integration jobs for policies-only changes.  https://review.opendev.org/70726822:44
*** slaweq has joined #openstack-nova22:50
*** iurygregory has quit IRC22:52
*** slaweq has quit IRC22:55
openstackgerritGhanshyam Mann proposed openstack/nova master: Skip to run all integration jobs for policies-only changes.  https://review.opendev.org/70726823:06
gmannmelwitt: updated ^^23:06
melwittack23:06
*** tkajinam has joined #openstack-nova23:06
*** slaweq has joined #openstack-nova23:11
*** jmlowe has joined #openstack-nova23:15
*** slaweq has quit IRC23:16
sean-k-mooneydansmith: efried_afk:  i did a thing. https://review.opendev.org/#/c/552924/17/specs/ussuri/approved/numa-topology-with-rps.rst@51623:23
sean-k-mooneydansmith: efried_afk it's a trivial poc of the progressive generation of the numa topologies for a non numa vm23:23
sean-k-mooneyjust the topology object, not the queries, but i could probably hack that up tomorrow23:24
*** mlavalle has quit IRC23:27
*** nicolasbock has quit IRC23:36
*** mmethot has joined #openstack-nova23:46
*** igordc has quit IRC23:46
*** jmlowe has quit IRC23:48
*** Liang__ has joined #openstack-nova23:49
*** rcernin has quit IRC23:51
