*** spatel has joined #openstack-nova | 00:01 | |
*** slaweq has quit IRC | 00:04 | |
*** spatel has quit IRC | 00:06 | |
*** Liang__ has joined #openstack-nova | 00:20 | |
*** spatel has joined #openstack-nova | 00:34 | |
*** mlavalle has quit IRC | 00:47 | |
*** yedongcan has joined #openstack-nova | 01:05 | |
*** gentoorax has quit IRC | 01:06 | |
*** zhanglong has joined #openstack-nova | 01:15 | |
*** openstackgerrit has joined #openstack-nova | 01:15 | |
openstackgerrit | Merged openstack/nova stable/rocky: Use stable constraint for Tempest pinned stable branches https://review.opendev.org/706716 | 01:15 |
*** gentoorax has joined #openstack-nova | 01:37 | |
*** david-lyle is now known as dklyle | 02:05 | |
openstackgerrit | Huachang Wang proposed openstack/nova-specs master: Use PCPU and VCPU in one instance https://review.opendev.org/668656 | 02:06 |
*** psachin has joined #openstack-nova | 02:06 | |
*** nweinber has joined #openstack-nova | 02:07 | |
*** nweinber has quit IRC | 02:19 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Introduce scope_types in os-create-backup https://review.opendev.org/707038 | 02:21 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add new default roles in os-create-backup policies https://review.opendev.org/707039 | 02:25 |
*** nicolasbock has quit IRC | 02:28 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Introduce scope_types in os-console-output https://review.opendev.org/707040 | 02:36 |
*** ileixe has joined #openstack-nova | 02:37 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add new default roles in os-console-output policies https://review.opendev.org/707041 | 02:41 |
*** gyee has quit IRC | 03:11 | |
*** gentoorax has quit IRC | 03:14 | |
ileixe | Hi Nova, | 03:22 |
ileixe | Does anyone know the current status of https://blueprints.launchpad.net/nova/+spec/ip-aware-scheduling-placement? | 03:22 |
*** spatel has quit IRC | 03:23 | |
ileixe | I thought that if nova is not aware of neutron segments, routed networks do not work, and the spec says it's not implemented yet. | 03:23 |
ileixe | Does that mean routed networks are not yet implemented? | 03:23 |
*** hongbin has joined #openstack-nova | 03:48 | |
*** yedongcan has quit IRC | 03:50 | |
*** gentoorax has joined #openstack-nova | 03:52 | |
*** Sundar has quit IRC | 03:53 | |
*** udesale has joined #openstack-nova | 04:19 | |
*** mkrai has joined #openstack-nova | 04:37 | |
*** hongbin has quit IRC | 04:41 | |
*** vesper has quit IRC | 05:18 | |
*** vesper11 has joined #openstack-nova | 05:23 | |
*** evrardjp has quit IRC | 05:34 | |
*** evrardjp has joined #openstack-nova | 05:34 | |
alex_xu | ileixe: I think it is implemented | 05:50 |
ileixe | alex_xu: Thanks for the response. Do you mean nova looks up segments then? | 05:51 |
*** igordc has joined #openstack-nova | 05:51 | |
alex_xu | ileixe: no, the neutron side will do that, and report the resource to the placement | 05:51 |
alex_xu | ileixe: https://blueprints.launchpad.net/neutron/+spec/routed-networks | 05:52 |
alex_xu | ileixe: I never try that feature, but hope ^ that can help you | 05:52 |
*** mkrai has quit IRC | 05:52 | |
alex_xu | ileixe: also this one https://docs.openstack.org/neutron/pike/admin/config-routed-networks.html | 05:53 |
*** mkrai has joined #openstack-nova | 05:54 | |
ileixe | alex_xu: Hm.. maybe I understand now what neutron does for routed networks | 05:56 |
ileixe | What I do not understand is how nova uses the resource providers that neutron creates | 05:57 |
alex_xu | ileixe: maybe I'm wrong, but I saw the note from matt; it looks like the neutron side is implemented, but yes, the nova side does nothing now | 05:57 |
ileixe | iirc, the matt you mentioned is the owner of the commit (https://review.opendev.org/#/c/656885/) right? | 05:58 |
alex_xu | ileixe: yes, he isn't working on that anymore | 05:59 |
ileixe | And the commit does not implement what I expected for routed networks. | 05:59 |
ileixe | So.. I assume that routed networks do not work (especially the nova scheduling part) | 05:59 |
*** igordc has quit IRC | 06:00 | |
alex_xu | ileixe: yes, I think you are right | 06:00 |
ileixe | alex_xu: Hm... thanks for the answer.. | 06:00 |
alex_xu | np | 06:02 |
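[To illustrate the design the exchange above refers to: per the neutron routed-networks spec, neutron models each segment as a placement resource provider with IPV4_ADDRESS inventory, and placement aggregates associate compute nodes with segments. A minimal, hypothetical sketch of the host-to-segment matching that nova-side support would perform; the names and structures here are illustrative, not nova/neutron code.]

```python
# Hypothetical sketch of segment-aware host selection for routed networks:
# placement aggregates tie compute hosts to neutron segments, so only
# hosts whose aggregates include the segment's aggregate are reachable.

def hosts_reachable_on_segment(host_aggregates, segment_aggregate_uuid):
    """Return hosts whose placement aggregates include the segment's aggregate.

    host_aggregates: mapping of host name -> set of aggregate UUIDs.
    """
    return sorted(
        host for host, aggs in host_aggregates.items()
        if segment_aggregate_uuid in aggs
    )

host_aggregates = {
    "compute-1": {"agg-segment-a"},
    "compute-2": {"agg-segment-b"},
    "compute-3": {"agg-segment-a", "agg-segment-b"},
}

print(hosts_reachable_on_segment(host_aggregates, "agg-segment-a"))
# ['compute-1', 'compute-3']
```

[As discussed above, the gap is that the nova scheduler does not yet apply this segment-awareness itself.]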
*** mkrai has quit IRC | 06:09 | |
*** ratailor has joined #openstack-nova | 06:21 | |
*** gentoora- has joined #openstack-nova | 06:27 | |
*** gentoorax has quit IRC | 06:27 | |
*** gentoora- is now known as gentoorax | 06:27 | |
*** ccamacho has quit IRC | 06:50 | |
*** lpetrut has joined #openstack-nova | 07:03 | |
*** mkrai has joined #openstack-nova | 07:05 | |
*** maciejjozefczyk has joined #openstack-nova | 07:08 | |
*** damien_r has joined #openstack-nova | 07:18 | |
*** damien_r has quit IRC | 07:23 | |
*** yedongcan has joined #openstack-nova | 07:36 | |
*** udesale has quit IRC | 07:46 | |
*** udesale has joined #openstack-nova | 07:47 | |
*** imacdonn has quit IRC | 07:53 | |
*** imacdonn has joined #openstack-nova | 07:53 | |
*** mkrai has quit IRC | 08:02 | |
*** mriosfer has joined #openstack-nova | 08:22 | |
mriosfer | Hi guys, after changing the vram value in the flavor and image on our OpenStack Queens cloud and rebuilding the instance, I can see in the virsh XML that it is correctly added to the VM config ("<model type='qxl' ram='65536' vram='131072' vgamem='16384' heads='1' primary='yes'/>"), but Windows dxdiag detects 0MB vram. Is that correct? Should it be working? | 08:24 |
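[For context on the vram question above: the relevant knobs are the image property hw_video_ram and the flavor extra spec hw_video:ram_max_mb (both in MB), while libvirt's vram attribute is in KiB. A simplified sketch of the validation/conversion, loosely modelled on nova's libvirt driver and not the actual driver code:]

```python
# Simplified sketch of how requested video RAM is validated and converted
# (loosely modelled on nova's libvirt driver; not the actual code).
# hw_video_ram (image property) and hw_video:ram_max_mb (flavor extra
# spec) are expressed in MB; libvirt's <video> vram attribute is in KiB.

def video_ram_kib(image_props, flavor_extra_specs):
    requested_mb = int(image_props.get("hw_video_ram", 0))
    max_mb = int(flavor_extra_specs.get("hw_video:ram_max_mb", 0))
    if requested_mb:
        if requested_mb > max_mb:
            raise ValueError("requested video RAM exceeds flavor maximum")
        return requested_mb * 1024  # MB -> KiB for the libvirt XML
    return None  # let libvirt pick its default

print(video_ram_kib({"hw_video_ram": "128"}, {"hw_video:ram_max_mb": "128"}))
# 131072
```

[Note this only controls what lands in the libvirt XML, matching the vram='131072' in the pasted config; what the Windows guest driver then reports via dxdiag is a separate guest-side question.]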
*** tosky has joined #openstack-nova | 08:27 | |
*** priteau has joined #openstack-nova | 08:28 | |
*** tesseract has joined #openstack-nova | 08:29 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Expose instance action event details out of the API https://review.opendev.org/694430 | 08:33 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add server actions v82 samples test https://review.opendev.org/706251 | 08:35 |
*** mkrai has joined #openstack-nova | 08:37 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add instance actions v82 samples test https://review.opendev.org/706251 | 08:38 |
*** jcosmao has joined #openstack-nova | 08:40 | |
*** ccamacho has joined #openstack-nova | 08:46 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Merge qos related renos for Ussuri https://review.opendev.org/706766 | 08:49 |
gibi | bauzas: I'm here if you want to chat about NUMA | 08:50 |
bauzas | gibi: 10 mins please but yeah :) | 08:50 |
gibi | bauzas: sure | 08:50 |
*** ralonsoh has joined #openstack-nova | 08:52 | |
*** rpittau|afk is now known as rpittau | 08:55 | |
*** martinkennelly has joined #openstack-nova | 09:01 | |
*** ccamacho has quit IRC | 09:01 | |
bauzas | ok, processed the whole bunch of comments for the NUMA in Placement spec... | 09:01 |
* bauzas grabs some coffee and pings gibi later | 09:01 | |
gibi | ok | 09:02 |
gibi | I just realized that sean-k-mooney and efried also commented while I was away... reading them... | 09:02 |
*** elod has quit IRC | 09:03 | |
*** amoralej|off is now known as amoralej | 09:03 | |
*** slaweq has joined #openstack-nova | 09:07 | |
*** mkrai has quit IRC | 09:09 | |
*** mkrai_ has joined #openstack-nova | 09:10 | |
bauzas | gibi: I'm back | 09:10 |
bauzas | gibi: FWIW efried mostly replied to your concerns | 09:11 |
bauzas | the spec needs another round of rewrites so I'm starting it now | 09:11 |
*** dtantsur|afk is now known as dtantsur | 09:12 | |
*** elod has joined #openstack-nova | 09:12 | |
openstackgerrit | HYSong proposed openstack/nova master: Update request_specs.availability_zone during live migration https://review.opendev.org/706647 | 09:12 |
gibi | bauzas: I'm still reading efried's comments, I will get back to you soon | 09:12 |
bauzas | cool | 09:13 |
*** ociuhandu has joined #openstack-nova | 09:28 | |
*** takamatsu has joined #openstack-nova | 09:28 | |
openstackgerrit | HYSong proposed openstack/nova master: Update request_specs.availability_zone during live migration https://review.opendev.org/706647 | 09:28 |
*** ociuhandu has quit IRC | 09:29 | |
*** martinkennelly has quit IRC | 09:33 | |
openstackgerrit | HYSong proposed openstack/nova master: Update request_specs.availability_zone during live migration https://review.opendev.org/706647 | 09:33 |
openstackgerrit | HYSong proposed openstack/nova master: Update request_specs.availability_zone during live migration https://review.opendev.org/706647 | 09:35 |
*** psachin has quit IRC | 09:42 | |
*** derekh has joined #openstack-nova | 09:44 | |
bauzas | gibi: unrelated, see my last comment on https://review.opendev.org/#/c/706647/5 | 09:44 |
bauzas | I know not a lot of folks know about AZs, so I want to make sure that all of us as cores know about the design consensus :) | 09:45 |
gibi | bauzas: ack, queued for double check | 09:45 |
bauzas | no rush, it's more for knowledge sharing | 09:47 |
*** psachin has joined #openstack-nova | 09:47 | |
*** ivve has joined #openstack-nova | 09:53 | |
gibi | bauzas: replied in the NUMA spec. I agree with Eric's proposals in his reply. I'm still a bit undecided on the upgrade check case but that is a low-prio issue | 09:53 |
bauzas | cool, I'm just modifying things as we speak | 09:53 |
*** psachin has quit IRC | 09:54 | |
gibi | bauzas: you did a great job pulling the pieces together; I think I see the light at the end of this tunnel | 09:54 |
*** davidsha has joined #openstack-nova | 09:54 | |
*** slaweq has quit IRC | 09:56 | |
openstackgerrit | jichenjc proposed openstack/nova master: set default value to 0 instead of '' https://review.opendev.org/706730 | 10:02 |
*** psachin has joined #openstack-nova | 10:02 | |
gibi | bauzas: https://review.opendev.org/#/c/706647 thanks for chiming in, I agree with you. I missed the fact that we don't allow moving instances between AZs (in recent microversions) | 10:04 |
bauzas | gibi: no worries, again, my ping is just for making sure we share our knowledge | 10:06 |
*** ociuhandu has joined #openstack-nova | 10:06 | |
bauzas | I'm always on and off upstream, so the more people know about AZs, the better it will be | 10:06 |
*** xek has joined #openstack-nova | 10:07 | |
*** ociuhandu has quit IRC | 10:09 | |
*** ociuhandu has joined #openstack-nova | 10:09 | |
*** mkrai__ has joined #openstack-nova | 10:10 | |
*** ociuhandu has quit IRC | 10:10 | |
*** ociuhandu has joined #openstack-nova | 10:11 | |
*** mkrai_ has quit IRC | 10:13 | |
*** ociuhandu has quit IRC | 10:18 | |
*** psachin has quit IRC | 10:20 | |
*** ociuhandu has joined #openstack-nova | 10:20 | |
*** psachin has joined #openstack-nova | 10:21 | |
*** ociuhandu has quit IRC | 10:21 | |
*** martinkennelly has joined #openstack-nova | 10:21 | |
*** ociuhandu has joined #openstack-nova | 10:26 | |
*** ociuhandu has quit IRC | 10:27 | |
*** priteau has quit IRC | 10:28 | |
*** ociuhandu has joined #openstack-nova | 10:29 | |
*** ociuhandu has quit IRC | 10:30 | |
bauzas | shit, I lack time for fixing all the comments | 10:42 |
bauzas | gibi: I think we need efried and sean-k-mooney around this afternoon for discussing the upgrade pre-flight check and the Ussuri condition | 10:45 |
bauzas | I'm personally in favor of keeping NUMA workloads in Ussuri as they are | 10:45 |
bauzas | ie. no migration asked | 10:46 |
bauzas | and a pre-flight check pre-Victoria | 10:46 |
bauzas | because if not, that's a chicken-and-egg issue | 10:46 |
gibi | sure lets see what they think | 10:50 |
*** priteau has joined #openstack-nova | 11:03 | |
*** ociuhandu has joined #openstack-nova | 11:04 | |
*** ociuhandu has quit IRC | 11:09 | |
*** zhanglong has quit IRC | 11:12 | |
*** ociuhandu has joined #openstack-nova | 11:13 | |
*** zhanglong has joined #openstack-nova | 11:15 | |
*** ociuhandu has quit IRC | 11:18 | |
*** mkrai__ has quit IRC | 11:19 | |
*** rpittau is now known as rpittau|bbl | 11:21 | |
*** fungi has quit IRC | 11:24 | |
*** fungi has joined #openstack-nova | 11:27 | |
*** yedongcan has left #openstack-nova | 11:37 | |
*** tbachman has quit IRC | 11:46 | |
*** ociuhandu has joined #openstack-nova | 11:51 | |
*** ociuhandu has quit IRC | 11:57 | |
*** ociuhandu has joined #openstack-nova | 12:01 | |
*** mriedem has joined #openstack-nova | 12:01 | |
*** nicolasbock has joined #openstack-nova | 12:02 | |
*** amoralej is now known as amoralej|lunch | 12:04 | |
*** udesale_ has joined #openstack-nova | 12:11 | |
*** udesale has quit IRC | 12:13 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: WIP: api: Add support for extra spec validation https://review.opendev.org/704643 | 12:14 |
*** jaosorior has joined #openstack-nova | 12:15 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Support unshelve with qos ports https://review.opendev.org/704759 | 12:24 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Enable unshelve with qos ports https://review.opendev.org/705475 | 12:25 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Merge qos related renos for Ussuri https://review.opendev.org/706766 | 12:27 |
*** dtantsur is now known as dtantsur|brb | 12:27 | |
*** ociuhandu has quit IRC | 12:31 | |
*** jaosorior has quit IRC | 12:32 | |
*** ociuhandu has joined #openstack-nova | 12:34 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: WIP: api: Add support for extra spec validation https://review.opendev.org/704643 | 12:36 |
frickler | hello nova, the new alembic==1.4.0 is causing this failure, please have a look https://c264ca14759376f2bea5-6cd49316b9babb1f90743cae9cd67f9e.ssl.cf2.rackcdn.com/705380/10/check/cross-nova-py36/8bcec81/testr_results.html , I'll pin to the previous version for now, see https://review.opendev.org/705380 | 12:38 |
*** mdbooth_ has quit IRC | 12:45 | |
*** mkrai__ has joined #openstack-nova | 12:46 | |
*** damien_r has joined #openstack-nova | 12:55 | |
*** mriedem has quit IRC | 13:00 | |
*** adriant has quit IRC | 13:02 | |
*** adriant has joined #openstack-nova | 13:02 | |
*** mkrai__ has quit IRC | 13:08 | |
*** decrypt has joined #openstack-nova | 13:09 | |
*** jmlowe has joined #openstack-nova | 13:10 | |
*** tbachman has joined #openstack-nova | 13:10 | |
*** ratailor has quit IRC | 13:11 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: WIP: api: Add support for extra spec validation https://review.opendev.org/704643 | 13:14 |
*** takamatsu has quit IRC | 13:14 | |
*** amoralej|lunch is now known as amoralej | 13:18 | |
*** rpittau|bbl is now known as rpittau | 13:22 | |
efried | bauzas, gibi: o/ | 13:23 |
efried | What's the question? | 13:23 |
gibi | efried: my remaining open question is https://review.opendev.org/#/c/552924/16/specs/ussuri/approved/numa-topology-with-rps.rst@223 about upgrade checks | 13:25 |
gibi | but I have to jump on a call for the next 1 and a half hour so talk to you later | 13:25 |
*** derekh has quit IRC | 13:27 | |
*** nweinber has joined #openstack-nova | 13:28 | |
*** jmlowe has quit IRC | 13:30 | |
*** zhanglong has quit IRC | 13:33 | |
*** zhanglong has joined #openstack-nova | 13:34 | |
bauzas | efried: gibi: will ping you later, just updating the spec now | 13:36 |
*** takamatsu has joined #openstack-nova | 13:37 | |
*** jmlowe has joined #openstack-nova | 13:37 | |
alex_xu | bauzas: I leave one also https://review.opendev.org/#/c/552924/16/specs/ussuri/approved/numa-topology-with-rps.rst@421 | 13:47 |
*** dtantsur|brb is now known as dtantsur | 13:49 | |
bauzas | ack | 13:50 |
bauzas | alex_xu: good point, I was thinking of the upgrade issue when rolling the compute upgrades | 13:52 |
alex_xu | good news, we can just copy the approach from standard-cpu-resource-tracking | 13:53 |
alex_xu | probably need another workaround config option to disable the fallback placement query | 13:53 |
bauzas | I'll say this | 13:54 |
*** nweinber has quit IRC | 13:58 | |
*** tkajinam has joined #openstack-nova | 13:59 | |
*** derekh has joined #openstack-nova | 14:01 | |
*** nweinber has joined #openstack-nova | 14:07 | |
*** jmlowe has quit IRC | 14:09 | |
*** jmlowe has joined #openstack-nova | 14:11 | |
*** zhanglong has quit IRC | 14:28 | |
*** takamatsu has quit IRC | 14:29 | |
*** tbachman has quit IRC | 14:33 | |
efried | bauzas, gibi: responded. | 14:39 |
bauzas | dammit, need to Ctrl-R | 14:39 |
bauzas | I'm literally writing live | 14:39 |
efried | bauzas: shouldn't be anything earth-shattering in my response. | 14:40 |
bauzas | eek, sean-k-mooney did too | 14:40 |
* bauzas needs to look again at all comments and process them | 14:40 | |
*** mriosfer has quit IRC | 14:40 | |
bauzas | efried: for the placement-ish syntax, I'm all for docs | 14:42 |
bauzas | and not code | 14:42 |
bauzas | FWIW | 14:42 |
bauzas | like, you can do it but you can mess it up | 14:42 |
bauzas | your dog | 14:42 |
bauzas | anyway, continuing to write | 14:43 |
efried | IMO the only reason we shouldn't block placement-ish syntax is because we might miss something in our translation utility and have to provide a workaround until we fix it. | 14:43 |
efried | even there be tygers. | 14:44 |
*** Liang__ is now known as LiangFang | 14:44 | |
LiangFang | gibi: hi gibi, regarding https://review.opendev.org/#/c/689070/ | 14:45 |
LiangFang | gibi: what do you think about setting a trait for the host machine, and specifying the trait in the flavor extra spec? | 14:47 |
LiangFang | gibi: so the guest can be scheduled to a host with cache capability | 14:47 |
*** vishalmanchanda has joined #openstack-nova | 14:51 | |
*** ociuhandu has quit IRC | 14:57 | |
*** elod has quit IRC | 14:57 | |
*** elod has joined #openstack-nova | 14:58 | |
*** artom has joined #openstack-nova | 15:04 | |
*** bbowen has joined #openstack-nova | 15:07 | |
*** artom has quit IRC | 15:09 | |
*** takamatsu has joined #openstack-nova | 15:11 | |
*** artom has joined #openstack-nova | 15:12 | |
*** artom has quit IRC | 15:12 | |
*** artom has joined #openstack-nova | 15:13 | |
openstackgerrit | Sylvain Bauza proposed openstack/nova-specs master: Proposes NUMA topology with RPs https://review.opendev.org/552924 | 15:22 |
*** mlavalle has joined #openstack-nova | 15:22 | |
bauzas | efried: gibi: sean-k-mooney: alex_xu: thanks for the comments, here is another baking of NUMA topology spec https://review.opendev.org/552924 | 15:22 |
gibi | LiangFang: if we only care about having a cache configured on the host then it can be a capability represented by a trait. If we also need to think about the available size of the caches then it is a resource | 15:23 |
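[A rough sketch of the distinction gibi draws, expressed as the parameters a placement allocation-candidates request would carry; the CUSTOM_LLC trait and CUSTOM_LLC_SIZE_MB resource class names are hypothetical.]

```python
# Capability vs consumable quantity, placement-style:
# a capability is a *required trait* on the request, while a consumable
# amount is a *resource class* with a quantity that placement tracks.

def allocation_candidates_params(cache_as_resource=False):
    params = {"resources": {"VCPU": 2, "MEMORY_MB": 2048}}
    if cache_as_resource:
        # consumable: placement tracks how much cache is left on the host
        params["resources"]["CUSTOM_LLC_SIZE_MB"] = 10
    else:
        # capability: the host either has a cache configured or it doesn't
        params["required"] = ["CUSTOM_LLC"]
    return params

print(allocation_candidates_params())
print(allocation_candidates_params(cache_as_resource=True))
```

[With the trait form, a flavor would request it via an extra spec such as trait:CUSTOM_LLC=required; with the resource form, the host must also report a CUSTOM_LLC_SIZE_MB inventory.]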
*** spatel has joined #openstack-nova | 15:24 | |
gibi | bauzas, efried: ack, will check back soon | 15:24 |
*** ociuhandu has joined #openstack-nova | 15:28 | |
*** tkajinam has quit IRC | 15:29 | |
stephenfin | dansmith: Any particular reason we don't use entrypoints for custom scheduler filters? Is it something we could/should do? | 15:39 |
dansmith | don't we already? | 15:40 |
dansmith | bauzas: | 15:40 |
stephenfin | Custom scheduler drivers, yes. Not filters for the filter_scheduler though | 15:40 |
dansmith | I was sure we did | 15:40 |
stephenfin | You've to configure them via a python path in '[filter_scheduler] enabled_filters' | 15:40 |
stephenfin | fwict | 15:40 |
*** mkrai__ has joined #openstack-nova | 15:43 | |
*** jmlowe has quit IRC | 15:44 | |
*** TxGirlGeek has joined #openstack-nova | 15:50 | |
*** udesale_ has quit IRC | 15:52 | |
*** udesale_ has joined #openstack-nova | 15:53 | |
*** ociuhandu has quit IRC | 15:54 | |
*** KeithMnemonic has quit IRC | 15:56 | |
*** lpetrut has quit IRC | 16:01 | |
spatel | sean-k-mooney: morning | 16:02 |
spatel | let me know if you around i want to share my load-test result. | 16:02 |
*** udesale_ has quit IRC | 16:06 | |
gibi | bauzas: thanks for the update on the NUMA spec it looks good to me now | 16:06 |
*** udesale_ has joined #openstack-nova | 16:07 | |
*** ociuhandu has joined #openstack-nova | 16:11 | |
*** ociuhandu has quit IRC | 16:12 | |
*** ociuhandu has joined #openstack-nova | 16:12 | |
*** ivve has quit IRC | 16:13 | |
*** udesale_ has quit IRC | 16:16 | |
sean-k-mooney | bauzas: im skimming through it now but ya im more or less happy with it. ill probably +1 it when i finish this pass | 16:24 |
bauzas | dansmith: sorry was AFK | 16:26 |
bauzas | stephenfin: yeah you have to set a specific option | 16:27 |
dansmith | bauzas: np. I thought our scheduler filter interface was already using entry points, but stephenfin says it's not.. I'm sure he's right I was just poking you in case you were also surprised | 16:27 |
bauzas | https://docs.openstack.org/nova/latest/configuration/config.html#filter_scheduler.available_filters | 16:28 |
bauzas | stephenfin: ^ | 16:28 |
stephenfin | right, matches up with what I suspected so | 16:28 |
stephenfin | bauzas: any reason not to deprecate that and ask people to use entrypoints instead? | 16:28 |
stephenfin | given I'll be asking them to do that for extra spec validators | 16:28 |
bauzas | well, I don't have any opinion | 16:29 |
bauzas | we had a lot of entrypoints | 16:29 |
bauzas | so for sure we could just use another one | 16:29 |
sean-k-mooney | spatel: did you see an improvement? | 16:30 |
bauzas | stephenfin: dansmith: that's how Nova knows about filters https://github.com/openstack/nova/blob/master/nova/loadables.py#L78 | 16:32 |
*** gyee has joined #openstack-nova | 16:32 | |
bauzas | for example you can ask to have a new custom filter by doing something like scheduler_available_filters = myownproject.scheduler.filters.climate_filter.ClimateFilter | 16:33 |
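[For reference, a custom filter configured the way bauzas describes is just a class implementing host_passes(). A self-contained sketch mimicking that interface; the base class here is a stand-in for nova's real BaseHostFilter, and the filter and fake host states are illustrative.]

```python
# Minimal sketch of a custom scheduler filter. In nova the filter would
# subclass nova.scheduler.filters.BaseHostFilter; the stand-in base class
# below just reproduces the filter_all/host_passes shape so this runs
# standalone.

class BaseHostFilter:  # stand-in for nova's base class
    def filter_all(self, host_states, spec_obj):
        return [h for h in host_states if self.host_passes(h, spec_obj)]

class EnoughFreeRamFilter(BaseHostFilter):
    """Pass only hosts with enough free RAM for the request."""
    def host_passes(self, host_state, spec_obj):
        return host_state["free_ram_mb"] >= spec_obj["memory_mb"]

hosts = [{"name": "cn1", "free_ram_mb": 512},
         {"name": "cn2", "free_ram_mb": 4096}]
survivors = EnoughFreeRamFilter().filter_all(hosts, {"memory_mb": 1024})
print([h["name"] for h in survivors])  # ['cn2']
```

[In a deployment the class would then be enabled via the [filter_scheduler] available_filters / enabled_filters options, as in bauzas's climate_filter example.]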
*** dtantsur is now known as dtantsur|afk | 16:34 | |
efried | bauzas, gibi, sean-k-mooney: I don't understand the "fallback" thing. | 16:35 |
efried | https://review.opendev.org/#/c/552924/17/specs/ussuri/approved/numa-topology-with-rps.rst@516 | 16:36 |
sean-k-mooney | efried: its mimicking PCPUs | 16:36 |
sean-k-mooney | basically when processing a vm request with a numa topology, if and only if the placement allocation candidates response is empty we will fall back to the non numa aware query and let the numa topology filter eliminate the host if it cant fit | 16:37 |
bauzas | efried: try: <call placement asking for NUMA-aware instances> except NoValidHosts: <call Placement like in Train> | 16:38 |
sean-k-mooney | yes but without using exceptions for control flow :P | 16:38 |
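[The fallback being described can be sketched as plain control flow; the placement call here is a toy stand-in, not nova code.]

```python
# Sketch of the fallback: issue the NUMA-aware allocation-candidates
# query first and, only if it returns nothing, retry with the Train-style
# (non-NUMA) query and let the NUMATopologyFilter do the elimination.

def select_candidates(get_allocation_candidates, numa_request, legacy_request):
    candidates = get_allocation_candidates(numa_request)
    if candidates:
        return candidates, "numa"
    # empty response: fall back to the pre-Ussuri style query
    return get_allocation_candidates(legacy_request), "legacy"

# toy placement: only the legacy-style request matches anything,
# as on a cloud where no compute has been reshaped yet
def fake_placement(request):
    return ["rp-1"] if request == "legacy-style" else []

print(select_candidates(fake_placement, "numa-style", "legacy-style"))
# (['rp-1'], 'legacy')
```

[This is also why efried notes below that an un-reshaped cloud pays for an extra, always-empty placement call on every scheduling request.]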
efried | Sorry, let me clarify | 16:38 |
bauzas | efried: sean-k-mooney: I could have provided a link to the implementation instead of the spec :) | 16:39 |
efried | I understand the what/how. I don't understand the why. | 16:39 |
bauzas | efried: because, | 16:39 |
bauzas | say a rolling upgrade | 16:39 |
sean-k-mooney | efried: so we dont need to have a global flag in the scheduler to turn on the translation | 16:39 |
*** tbachman has joined #openstack-nova | 16:39 | |
bauzas | or just a Ussuri cloud with only one node being transformed | 16:40 |
bauzas | then we could have some problems | 16:40 |
sean-k-mooney | the same reason we did it for PCPUs: to make upgrades simpler | 16:40 |
bauzas | efried: have you seen the Upgrade Impact section already ? | 16:40 |
sean-k-mooney | bauzas: ya the single node case is basically during a rolling upgrade | 16:40 |
bauzas | I tried to explain the *why* there | 16:40 |
efried | Because what's going to happen here is: | 16:41 |
efried | I'll upgrade to Ussuri, where my workaround option is set to not reshape; | 16:41 |
efried | and all my flavors will continue to work, only a teeny bit slower because I'll get empty allocation candidates on the first query 100% of the time. | 16:41 |
efried | And I won't notice, so I'll never flip the workaround off. | 16:41 |
efried | so, | 16:42 |
efried | Am I misunderstanding what release/combination will have this fallback mechanism in play? | 16:42 |
gibi | efried: I understand the need to push the operators to switch. But pushing them to do the switch right at the upgrade feels too much to me. It would be like an ultimatum: "When you upgrade to Ussuri you will lose all the NUMA-aware capacity of your cloud, but you can get it back iff you do the reshape on every NUMA-aware compute, but you should not do that all at once as that will overload placement." | 16:43 |
dansmith | gibi: you think that's okay? | 16:44 |
efried | Well, I don't buy that it would overload placement. The reshape will be one query per compute host. | 16:44 |
sean-k-mooney | efried: i would hope we remove it when we remove the config option | 16:44 |
gibi | dansmith: I think it is not OK to lose all the NUMA-aware capacity at upgrade | 16:44 |
dansmith | gibi: okay you were describing the problem, not the goal, is that right? | 16:45 |
sean-k-mooney | we could remove it when we change the default but i would hope it will not last more than 2 releases | 16:45 |
efried | Then we kind of have to position this as a "tech preview" or experimental change. | 16:45 |
gibi | dansmith: my goal is not to lose the capacity at upgrade. I dont like the quoted part | 16:45 |
efried | It just seems pretty pointless to me. | 16:45 |
dansmith | gibi: ++ | 16:45 |
efried | because we're going to do a bunch of work to enable something that nobody is going to use. | 16:46 |
sean-k-mooney | well since we are not enabling it by default in ussuri it kind of will be | 16:46 |
sean-k-mooney | but in victoria i think we should enable numa reporting by default. i would be happy to do that in ussuri but that might be a bit aggressive | 16:46 |
*** slaweq has joined #openstack-nova | 16:47 | |
efried | I buy that we can't enable it by default as long as we can't fit non-NUMA-aware workloads onto NUMA-modeled hosts. | 16:47 |
sean-k-mooney | efried: well actually im not sure i agree with that but thats a separate discussion | 16:48 |
efried | but with this fallback mechanism as designed, we're giving operators NO reason to switch over. | 16:48 |
sean-k-mooney | efried: well with my downstream hat on i will be pushing to make numa on the default for our next lts | 16:48 |
efried | which means we might as well not bother with this incremental improvement. We might as well just wait until we've solved the fitting problem. | 16:48 |
sean-k-mooney | efried: we have said that for 4+ releases | 16:49 |
sean-k-mooney | this will improve scheduling time and reduce the chance of races for numa instances | 16:49 |
efried | no | 16:49 |
efried | it won't | 16:49 |
sean-k-mooney | so it is still usful | 16:49 |
efried | it will do exactly nothing | 16:49 |
efried | except inject an extra placement call into every scheduling request. | 16:49 |
sean-k-mooney | it will because non numa instances will ignore numa hosts | 16:50 |
sean-k-mooney | and numa instances will ignore non numa hosts | 16:50 |
efried | there will be | 16:50 |
efried | no | 16:50 |
efried | numa | 16:50 |
efried | hosts. | 16:50 |
sean-k-mooney | people will turn this on | 16:50 |
sean-k-mooney | i can almost guarantee that the first release we productise downstream with this feature will have it enabled by default regardless of the upstream default | 16:51 |
efried | what, do your downstream releases force only flavors with guest numa topos? | 16:51 |
artom | sean-k-mooney, I don't see how we can do that... | 16:51 |
artom | Wouldn't that make all computes essentially not usable for non-NUMA instances? | 16:52 |
efried | ^ | 16:52 |
artom | If they're too big to fit on 1 NUMA node? | 16:52 |
gibi | efried: actually that's what happens in my downstream project as we use pinning and huge pages | 16:52 |
sean-k-mooney | no they dont but we do force the numa hosts to be partitioned | 16:52 |
artom | So? | 16:52 |
efried | artom: even if they're not too big. Because we're adding the HW_NUMA_ROOT trait. | 16:52 |
sean-k-mooney | so all the hosts that will run numa workloads will have the feature turned on and the non numa host will have it off | 16:52 |
artom | They'd still all suddenly be NUMA-exposing | 16:52 |
artom | sean-k-mooney, uh, so that's not "on by default as soon as they upgrade" :) | 16:53 |
artom | efried, oh right | 16:53 |
sean-k-mooney | it will be enabled in roles that configure the host for dpdk, hugepages or pinning | 16:54 |
sean-k-mooney | i would argue it should be set to on for all hosts and provide a way for them to opt out if they need the giant vm case | 16:54 |
sean-k-mooney | none of our telco customers need that case | 16:54 |
artom | sean-k-mooney, our telco customers are not 100% of openstack users | 16:55 |
artom | Would CERN want that, for example? :) | 16:55 |
sean-k-mooney | i dont think so no | 16:55 |
sean-k-mooney | i think they understand the performance cost and would use numa instances | 16:55 |
sean-k-mooney | cern are not going to throw away 30% of their compute performance | 16:56 |
sean-k-mooney | plus they have ironic if they really need to allocate all the resources of a full host to an instance | 16:56 |
artom | sean-k-mooney, maybe, maybe not. I don't know how they, or anyone else who isn't a RH telco customer, operate. Which is why I'm wary of making this opt-out, and would feel safer making it opt-in. | 16:57 |
artom | If our deployment tooling wants to turn it on by default in some cases, that's cool | 16:57 |
artom | But not a Nova default | 16:57 |
sean-k-mooney | i think as proposed we get the best of both worlds | 16:58 |
sean-k-mooney | the fallback mechanism will mean that it will just work if you dont set anything | 16:58 |
bauzas | sorry you lost my attention by not highlighting me | 16:58 |
sean-k-mooney | and we can add a nova status check to warn that you should enable this before upgrading to Victoria | 16:58 |
artom | sean-k-mooney, wait, the current proposed thing is disable_placement_numa_reporting = <bool> (default True for Ussuri) | 16:59 |
sean-k-mooney | yes | 16:59 |
artom | Which is what I'm advocating for | 16:59 |
sean-k-mooney | it will be disabled in ussuri | 16:59 |
artom | OK, so we agree :) | 16:59 |
sean-k-mooney | and hopefully enabled by default in Victoria | 16:59 |
dansmith | and that means what for flavors that currently have numa? | 16:59 |
*** rpittau is now known as rpittau|afk | 17:00 | |
sean-k-mooney | dansmith: the initial placement query will be empty | 17:00 |
sean-k-mooney | then we fall back to the current query | 17:00 |
artom | dansmith, IIUC same as what we did for PCPUs - try the new Placement query, if it comes back empty, try the legacy one | 17:00 |
sean-k-mooney | and leave it to the numa toplogy filter to do all the work | 17:00 |
*** priteau has quit IRC | 17:00 | |
sean-k-mooney | so they will just work | 17:00 |
dansmith | ack, yep, I think that's the sane path for U | 17:01 |
sean-k-mooney | stephenfin: by the way are you removing the fallback for PCPUs in U | 17:02 |
sean-k-mooney | stephenfin: that was the plan but i dont think you have had time to work on it | 17:02 |
stephenfin | I could but I was thinking I'd wait another cycle | 17:02 |
dansmith | if it's on by default, could we do the opposite for non-numa flavors? meaning, query with numa_nodes=1 and if we get no options, then try with =2, etc? | 17:02 |
dansmith | until we get a range capability with placement | 17:02 |
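[dansmith's widening idea could look roughly like this; purely hypothetical, since placement has no such range query and nova implements no such loop today.]

```python
# Sketch of the proposed widening query: for a flavor with no NUMA
# preference, ask placement for a 1-node topology first and widen to
# 2, 3, ... nodes until some allocation candidates come back.

def widening_query(get_allocation_candidates, max_nodes):
    for numa_nodes in range(1, max_nodes + 1):
        candidates = get_allocation_candidates(numa_nodes)
        if candidates:
            return numa_nodes, candidates
    return None, []

# toy placement: nothing fits on a single node, a 2-node split works
def fake_placement(numa_nodes):
    return ["rp-big"] if numa_nodes >= 2 else []

print(widening_query(fake_placement, 4))  # (2, ['rp-big'])
```

[As sean-k-mooney notes just below, the cost is up to max_nodes placement round-trips per request, which is the performance concern with this approach.]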
stephenfin | To let it bake in more. It's just dead code once the correct config options have been set | 17:03 |
sean-k-mooney | hmm maybe but the performance of that might not be great | 17:03 |
sean-k-mooney | or acceptable. | 17:03 |
artom | dansmith, that would assign a NUMA topology even if the user didn't request it (explicitly or implicitly), no? | 17:03 |
sean-k-mooney | artom: ya but that would actually be fine | 17:04 |
sean-k-mooney | they expressed no preference | 17:04 |
artom | NUMA topologies have a whole bunch of limitations :) | 17:04 |
sean-k-mooney | so we can decide what that is | 17:04 |
sean-k-mooney | artom: you removed the main one e.g. live migration | 17:04 |
artom | True... | 17:04 |
openstackgerrit | Douglas Mendizábal proposed openstack/nova master: Allow TLS ciphers/protocols to be configurable for console proxies https://review.opendev.org/679502 | 17:04 |
dansmith | artom: yes, it would, but I definitely think that we should be getting to the place where we don't just pretend numa doesn't exist | 17:05 |
artom | I'd still feel uncomfortable springing that on people | 17:05 |
dansmith | artom: so we want people to not have to worry about it in detail, but not just let them totally ignore it, IMHO | 17:05 |
sean-k-mooney | i think that is something we should consider in V when we consider changing the default | 17:06 |
dansmith | yup | 17:06 |
sean-k-mooney | if there is a sensible way to express it to placement or a sensible way to make it work from our side we should explore it | 17:06 |
dansmith | what I don't think we should do, is continue to have two distinct ways of running instances forever | 17:06 |
sean-k-mooney | i agree with that | 17:07 |
artom | Same here | 17:07 |
artom | Though I'm not entirely convinced giving everyone a NUMA topology is the way to do it | 17:07 |
dansmith | similar to cellsv1, that didn't work out well, and we never closed the feature and bug gap for people using it until cellsv2 where we just make everyone use it.. it's a little more overhead, but the gap is way smaller | 17:07 |
dansmith | artom: but everyone _has_ a numa topology | 17:08 |
artom | dansmith, you know what I mean ;) | 17:08 |
dansmith | if we need to retool the numa stuff in nova (and probably placement) then that's what we need to do | 17:08 |
*** martinkennelly has quit IRC | 17:08 | |
artom | I think I'd lean towards the can_split stuff in placement | 17:08 |
artom | Though I admittedly have no idea how complicated that would be | 17:09 |
dansmith | yep, it seems like the major barrier here is lack of expressivity with what we ask of placement when we don't care as much | 17:09 |
artom | Like, if the host is NUMA, fine, expose it | 17:09 |
artom | But as you said, being able to say "I don't care about NUMA" would be the best way to solve this, I think | 17:10 |
*** dave-mccowan has joined #openstack-nova | 17:10 | |
sean-k-mooney | artom: can_split is very inefficient to implement in sql | 17:10 |
artom | So, I have beef with "placement has to be SQL", but that's just me ;) | 17:10 |
sean-k-mooney | artom: if "i don't care about numa" means we are free to invent numa topologies then sure | 17:10 |
artom | sean-k-mooney, weeelll... wouldn't that lead to a bunch of packing problems? | 17:11 |
bauzas | folks, I have to disappear | 17:11 |
bauzas | leave comments, disagreements, concerns on the spec | 17:11 |
sean-k-mooney | artom: oh it doesn't have to be sql. but it would be very hard to support can_split with the current implementation | 17:11 |
sean-k-mooney | artom: no | 17:11 |
artom | I guess not, if you retry with enough combinations of guest NUMA nodes and CPUs per node | 17:11 |
*** martinkennelly has joined #openstack-nova | 17:12 | |
artom | So if you somehow end up with NUMA0 with 1 CPU free, and NUMA1 with 3 CPUs free, and boot an instance with 4 CPUs, you would need to retry until you get to the numa0_cpus=1,numa1_cpus=3 combo | 17:12 |
sean-k-mooney | honestly doing 4 placement queries with 1-4 numa nodes is still probably faster than the numa topology filter today | 17:12 |
sean-k-mooney | but i don't think this is productive to continue now | 17:13 |
artom | True | 17:13 |
artom | (On the second point) | 17:13 |
dansmith | if, like I said, we had a range of nodes, I would think placement could just loop internally much faster than even our retry process | 17:13 |
dansmith | and just dump us more options in the first go | 17:13 |
sean-k-mooney | dansmith: ya it's really a problem of being able to express our actual constraint to placement so it can do an efficient thing | 17:14 |
dansmith | right | 17:14 |
sean-k-mooney | at the moment we are over-specifying and under-specifying at the same time | 17:14 |
artom | dansmith, I think placement would want to not know about NUMA | 17:14 |
artom | And only think of things in terms of generic RPs | 17:14 |
sean-k-mooney | since the way we express the query does not fully match what we need | 17:14 |
artom | *to not know | 17:15 |
dansmith | artom: that's the proposal, AFAIK | 17:15 |
dansmith | artom: to model this in placement as RPs with the relevant hierarchy | 17:15 |
artom | dansmith, how does this fit with your "internal placement retry loop" idea though? | 17:15 |
sean-k-mooney | artom: right but we added some semantic meaning to some concepts in placement that is not helpful. | 17:15 |
dansmith | artom: ? | 17:15 |
artom | dansmith, well, wouldn't placement need to understand what a NUMA node is to do that? | 17:16 |
sean-k-mooney | for example resources:VCPU=2 must allocate 2 cpus from the same RP | 17:16 |
sean-k-mooney | artom: no it just needs to know that some resources need to be in the same subtree | 17:16 |
*** ivve has joined #openstack-nova | 17:16 | |
sean-k-mooney | with a parent containing an opaque trait "HW_NUMA_ROOT" | 17:17 |
artom | sean-k-mooney, I guess I can see that | 17:17 |
dansmith | artom: no, I don't think so, we just need to provide some way to communicate to placement that it can satisfy the required resources by allowing them to exist in multiple parts in the tree, with some minimum amount per provider, with maybe matching ratios of cpus and memory | 17:17 |
dansmith | artom: I don't mean an "&numa_nodes=1-4" level of explicitness (even though I've said that just as an example), but it really just needs to be the generic form of that | 17:17 |
artom | dansmith, I wonder if we could tell placement something like "aggregate_at_root=true", and then it could sum the child RPs resources temporarily when handling that query | 17:18 |
dansmith | artom: we need more than that, I think | 17:18 |
sean-k-mooney | no that is not enough | 17:18 |
dansmith | artom: we need to be able to say "don't give me 7 cpus on one node with 1MB of ram, and 1 CPU on the other with 8G" | 17:18 |
sean-k-mooney | the thing that breaks with simple models is always disk space | 17:18 |
sean-k-mooney | if i asked for 100 | 17:19 |
dansmith | artom: something like "cpus and mem must match 60/40 across the split" or something | 17:19 |
sean-k-mooney | 100G you can't give me 2x50G | 17:19 |
dansmith | yep | 17:19 |
sean-k-mooney | so it has to be per resource class and we need to express grouping constraints and potential sizing info like the 60/40 split | 17:20 |
*** gmann is now known as gmann_afk | 17:20 | |
sean-k-mooney | artom: right now to do ^ we have to specify the topology exactly rather than saying "these are the limits, give me anything that matches" | 17:20 |
artom | But... the disk would never be on a NUMA child RP... | 17:20 |
dansmith | artom: he's giving an example of a resource splitting constraint | 17:21 |
artom | TBH I still don't see why we need to express splitting constraints, but I'm probably just being thick | 17:22 |
artom | And as sean-k-mooney said, it's an academic discussion not relevant to the current spec | 17:22 |
sean-k-mooney | same thing applies to hugepages. if i ask for 512mb of hugepages and the host only has 1G hugepages allocated you can't split it | 17:22 |
sean-k-mooney | disk is just more approachable | 17:22 |
dansmith | because if you ask for 16 CPUs and 32G of RAM, we need to be able to say "we're willing to take that split across two nodes as long as the ratio of the split for those resources is at most 60/40" | 17:22 |
artom | sean-k-mooney, right, but hugepages we model as their own RP | 17:22 |
*** mkrai__ has quit IRC | 17:23 | |
artom | So you'd never get 1GB hugepages anyways, you'd get, for example, 400MB from 1 RP, and 112 from another (unrealistic numbers, I know) | 17:23 |
artom | dansmith, so that's my thing - why do we want to say that? What's wrong with CPUs split 1/15 and memory 30GB/2GB | 17:24 |
sean-k-mooney | it would get rejected by the step size yes | 17:24 |
*** ccamacho has joined #openstack-nova | 17:24 | |
dansmith | artom: because that would be pretty unhelpful? | 17:24 |
*** igordc has joined #openstack-nova | 17:24 | |
artom | dansmith, hey, the user said they don't care about NUMA topologies ^_^ | 17:25 |
sean-k-mooney | but the point is there are limits on how things can be split that depend on the resource class and how it will be used | 17:25 |
artom | But seriously, unhelpful from a performance POV? | 17:25 |
dansmith | artom: right, the user isn't opinionated, but they still want a sane instance | 17:25 |
dansmith | artom: not caring about numa doesn't mean we should give them something completely pathologically stupid | 17:25 |
*** ociuhandu_ has joined #openstack-nova | 17:26 | |
artom | lulz - we need a hw:sanity extra spec | 17:26 |
artom | And if they set it to ludicrous we do ^^ | 17:26 |
artom | dansmith, but yeah, I get your point | 17:26 |
dansmith | if we did this, we'd want to be able to specify what those splits are, and if the op really doesn't care, then they can set the split policy to something very fine | 17:26 |
dansmith | but it wouldn't make much sense for them to do that | 17:27 |
artom | I was mostly joking about that extra spec | 17:27 |
sean-k-mooney | yes we know | 17:27 |
sean-k-mooney | anyway from my viewpoint if you don't set hw:numa_nodes at all it gives nova the freedom to do something sane | 17:28 |
sean-k-mooney | that can be creating a numa topology if it chooses | 17:28 |
*** ociuhandu has quit IRC | 17:29 | |
sean-k-mooney | for now we let libvirt invent a single numa node with no affinity | 17:29 |
*** ociuhandu_ has quit IRC | 17:30 | |
sean-k-mooney | having one or multiple numa nodes in the guest and mapping them to 1 or more numa nodes on the host are two different things | 17:30 |
sean-k-mooney | so we can, if we want, expose 1 numa node to the guest and on the host map it across several | 17:31 |
sean-k-mooney | i think we can do better than that however. anyway time to go review something else | 17:31 |
*** evrardjp has quit IRC | 17:34 | |
*** evrardjp has joined #openstack-nova | 17:34 | |
*** martinkennelly has quit IRC | 17:39 | |
*** ccamacho has quit IRC | 17:42 | |
*** spatel has quit IRC | 17:50 | |
stephenfin | ah, crap. Today was supposed to be spec review day | 17:51 |
stephenfin | Guess tomorrow is spec review day for me now \o/ | 17:51 |
yoctozepto | stephenfin: the most busy today - tomorrow :-) | 17:53 |
stephenfin | sean-k-mooney: Tempest is failing on my extra spec validation patch because it's using generic e.g. 'key1=value1' extra specs. What's the most generic extra spec we've got? | 17:57 |
stephenfin | I've been using 'hw:numa_nodes' but that's libvirt/HyperV specific | 17:57 |
*** spatel has joined #openstack-nova | 17:57 | |
spatel | sean-k-mooney: sorry i was in meeting | 17:57 |
*** derekh has quit IRC | 17:58 | |
spatel | sean-k-mooney: as per your recommendation i added cpu_threads=2 and cpu_sockets=2 in the flavor and ran the test but the result was just OK (compared to 16 vCPU with a single numa0) | 17:59 |
spatel | still i don't understand why; erlang correctly detected the CPU topology on the VM but the result was still poor with 28 vCPU | 18:00 |
*** igordc has quit IRC | 18:02 | |
*** igordc has joined #openstack-nova | 18:06 | |
*** davidsha has quit IRC | 18:07 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: images: Move qemu-img info calls into privsep https://review.opendev.org/706897 | 18:11 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: images: Allow the output format of qemu-img info to be controlled https://review.opendev.org/706898 | 18:11 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: virt: Pass request context to extend_volume https://review.opendev.org/706899 | 18:11 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: WIP libvirt: Fix attached encrypted LUKSv1 volume extension https://review.opendev.org/706900 | 18:11 |
*** xek_ has joined #openstack-nova | 18:12 | |
*** jcosmao has left #openstack-nova | 18:13 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: WIP: api: Add support for extra spec validation https://review.opendev.org/704643 | 18:13 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: trivial: Remove FakeScheduler https://review.opendev.org/707224 | 18:13 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: conf: Deprecate '[scheduler] driver' https://review.opendev.org/707225 | 18:13 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: doc: Improve documentation on writing custom scheduler filters https://review.opendev.org/707226 | 18:13 |
stephenfin | sean-k-mooney: I went with 'hw:numa_nodes' and 'hw:cpu_policy' for want of something better | 18:13 |
*** macz has joined #openstack-nova | 18:15 | |
*** xek has quit IRC | 18:15 | |
*** umbSublime has quit IRC | 18:23 | |
*** maciejjozefczyk has quit IRC | 18:23 | |
*** tbachman has quit IRC | 18:23 | |
*** tbachman has joined #openstack-nova | 18:24 | |
*** mriedem has joined #openstack-nova | 18:30 | |
*** ralonsoh has quit IRC | 18:33 | |
*** martinkennelly has joined #openstack-nova | 18:37 | |
efried | sean-k-mooney, bauzas, gibi: I think I would actually prefer the approach dansmith suggests, even if we only do it at the coarsest level, rather than effectively disable the topology modeling in ussuri. | 18:44 |
*** gmann_afk is now known as gmann | 18:49 | |
efried | in other words: | 18:49 |
efried | - Model with NUMA topology by default. Provide the [workaround] option to *disable* (and un-reshape) for situations where the following is just sh*tting all over itself and nothing is landing. | 18:49 |
efried | - Flavors with hw:numa*-isms get translated as specced. | 18:49 |
efried | - Flavors without hw:numa*-isms get looped for $n in range(1, $max_sane_number_of_numa_nodes_we_expect_a_host_to_ever_have + 1) to behave as if they had specified hw:numa_nodes=$n, stopping as soon as we get a hit. | 18:49 |
efried | This is without changing anything in placement. | 18:50 |
*** amoralej is now known as amoralej|off | 18:51 | |
efried | For uneven splits, we just get as close as we can, but make no attempt to be "fuzzy" (like "up to 60/40" or anything like that). So like, for VCPU=10, we try 10, then 5/5, then 3/3/4, then 2/3/2/3. | 18:52 |
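The even-as-possible split efried describes is simple arithmetic; a minimal sketch (the ordering of cells may differ from the 3/3/4-style examples in the chat, but the multiset of sizes is the same):

```python
def even_split(total, n):
    """Split `total` units as evenly as possible across `n` NUMA cells.

    The first `total % n` cells get one extra unit, so e.g. 10 VCPUs
    over 3 cells gives [4, 3, 3] and over 4 cells gives [3, 3, 2, 2].
    """
    base, extra = divmod(total, n)
    return [base + 1 if i < extra else base for i in range(n)]
```

The same helper would apply to memory, keeping the CPU and memory ratios aligned across cells by construction.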
dansmith | there has to be some sort of policy tunable for how wild we're able to get I think | 18:56 |
efried | max_numa_nodes_guessed ? | 18:56 |
*** psachin has quit IRC | 18:57 | |
efried | Are there really hosts out there with more than, say, 8 NUMA cells? | 18:57 |
dansmith | no, I mean how small of a footprint on a given node we're going to allow | 18:57 |
efried | To be clear, I'm talking about splitting as evenly as possible, always. | 18:58 |
efried | You're saying that sometimes that would result in unreasonably small footprint on one numa node anyway? | 18:58 |
efried | obv we stop trying to split if any of the cells are going to get 0 of anything. | 18:59 |
efried | so if you ask for 2 vcpu, we go to a max of hw:numa_nodes=2 | 18:59 |
dansmith | sure, but I think we need to not be willing to split memory at 1MB and 8191MB | 18:59 |
dansmith | or 31 cpus and 1 | 18:59 |
dansmith | and the ratio needs to be close/equal for cpus and memory | 19:00 |
efried | Right, I'm saying that happens implicitly | 19:00 |
dansmith | how? | 19:00 |
efried | each iteration of the loop simply behaves as if you said hw:numa_nodes=$n. | 19:00 |
efried | which simply splits your CPU and mem "evenly" across $n nodes. | 19:01 |
dansmith | that just seems too naive to me | 19:01 |
efried | yes | 19:01 |
efried | it is | 19:01 |
efried | but so is "I don't care about NUMA"? | 19:01 |
efried | And it ought to work 99% of the time | 19:01 |
efried | and if it doesn't, switch on your [workaround] for a couple of hosts. | 19:01 |
dansmith | I want the workaround to go away, remember | 19:02 |
efried | Yes, the workaround goes away completely once we beef up placement to understand can_split | 19:02 |
dansmith | okay your even splitting is just the first pass at sanity? then that's fine | 19:02 |
efried | oh, yeah, eventually we want the utopia where all of this happens in one call with can_split, whose ratios are tunable (via placement side conf? via nova conf fed into placement qparams?) | 19:04 |
dansmith | has to be nova-side | 19:04 |
dansmith | communicated to placement via the query | 19:04 |
efried | then we won't need the workaround anymore, because everything will be able to land (and if it can't, it's because it *shouldn't*). | 19:05 |
efried | In practice, we may even find that nobody needs the workaround. But we'll see. | 19:05 |
*** maciejjozefczyk has joined #openstack-nova | 19:07 | |
efried | Did we decide how we're going to deal with "control plane is updated but some computes are not"? Does the control plane wait to start using the new query style until all the computes are updated? We have that capability, right? | 19:09 |
efried | okay, I see that discussed in the spec. | 19:11 |
*** LiangFang has quit IRC | 19:11 | |
sean-k-mooney | efried: i think if we go that route in ussuri we won't be able to land it in time | 19:18 |
efried | why not? | 19:19 |
efried | we're not talking about trying to implement can_split in any form in ussuri | 19:19 |
efried | are you concerned that the progressive-splitting algorithm is too complicated? | 19:20 |
sean-k-mooney | just multiple queries where we try to progressively split | 19:20 |
sean-k-mooney | efried: yes | 19:20 |
efried | meh, I don't see how it's any worse than the proposed fallback. | 19:21 |
sean-k-mooney | efried: it will have to take into account numa, native request groups in the flavor, external requests from cyborg and neutron ports, and not suck from a performance point of view | 19:21 |
sean-k-mooney | the proposed fallback is two queries, the native numa one followed by the query we do today | 19:21 |
sean-k-mooney | and it only impacts the performance of numa instances | 19:22 |
sean-k-mooney | the other way impacts the performance of all non-numa instances | 19:22 |
efried | sean-k-mooney: I don't think it has to take all that stuff into account at all. | 19:22 |
efried | sean-k-mooney: It should behave *exactly* as if you said hw:numa_nodes=$n with no other hw:numa*-isms. | 19:23 |
efried | (except I think we said we would bounce if we couldn't split evenly; that restriction would have to be lifted for this case.) | 19:23 |
sean-k-mooney | so we would create a fake flavor where we override that and pass it to the current get numa constraints function | 19:24 |
efried | if you like | 19:24 |
efried | that would be the spirit, anyway. | 19:24 |
sean-k-mooney | i guess that makes it simpler | 19:24 |
*** maciejjozefczyk has quit IRC | 19:25 | |
sean-k-mooney | so the get_numa_constraints function will reject any invalid topology with regard to even splitting with an exception | 19:25 |
sean-k-mooney | so we would just loop and continue if an exception is raised, up to the limit | 19:25 |
efried | or we relax the constraint to split as close to evenly as possible. Or do that split first. | 19:26 |
efried | Implementation detail. Point is, it shouldn't be super hard to figure out. | 19:26 |
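The loop sean-k-mooney and efried converge on here could look something like the following. Everything is a stand-in: `get_numa_constraints`, `InvalidNUMATopology`, and the flavor layout are illustrative names, not nova's real API.

```python
import copy

class InvalidNUMATopology(Exception):
    """Stand-in for the exception the real constraint code would raise."""

def pick_implicit_topology(flavor, get_numa_constraints, max_nodes=4):
    """Progressively try hw:numa_nodes=1..max_nodes on a copied flavor.

    `get_numa_constraints` stands in for nova's constraint function; it
    should return a topology object or raise on an impossible split.
    """
    for n in range(1, max_nodes + 1):
        # Copy the flavor so the user's real flavor is never mutated.
        fake_flavor = copy.deepcopy(flavor)
        fake_flavor["extra_specs"]["hw:numa_nodes"] = str(n)
        try:
            return get_numa_constraints(fake_flavor)
        except InvalidNUMATopology:
            continue  # this split doesn't fit; try more nodes
    return None  # nothing fit; caller falls back to the legacy path
```

Whatever topology comes back would then be stored on the instance/request_spec, as discussed below in the chat, so it stays stable across the instance's lifetime.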
sean-k-mooney | we could use the asymmetric numa modeling support that is there, yes | 19:26 |
sean-k-mooney | e.g. if you had a 9 core vm and we are on numa=2 do 4 cpus + 5 cpus | 19:27 |
sean-k-mooney | instead of going to 3 numa nodes with 3 cpus | 19:28 |
efried | right | 19:28 |
sean-k-mooney | i'm not sure which is more likely to cause fragmentation off the top of my head | 19:28 |
sean-k-mooney | we should tell people to just use powers of 2 | 19:28 |
efried | example I gave above was with 10 VCPUs, we would try 10, then 5/5, then 3/3/4, then 2/3/2/3. | 19:28 |
efried | no | 19:28 |
sean-k-mooney | i was joking | 19:29 |
efried | we should tell people to use real numa specs if they care. | 19:29 |
sean-k-mooney | it does make life easier when they do but ya i can see that working | 19:29 |
sean-k-mooney | ok if we calculate the split in the temporary flavor we pass to the numa constraints function then it would populate the instance numa topology object as if the user had set it manually in the flavor | 19:30 |
sean-k-mooney | then if we save that in the instance we could ensure we don't break live migration by changing it | 19:31 |
efried | just so. | 19:31 |
efried | well, I would expect we shouldn't save it in the flavor, because we want the instance to be able to morph to fit somewhere else if it needs to. | 19:31 |
sean-k-mooney | right | 19:31 |
efried | but I don't know how that works, do you break an instance if you "change" its topo from under it? | 19:31 |
sean-k-mooney | i meant save the instance_numa_topology object | 19:31 |
sean-k-mooney | not the flavor | 19:32 |
sean-k-mooney | efried: during live migration you would | 19:32 |
sean-k-mooney | cold migration it might mess up some manual config but it should not break it in general | 19:32 |
efried | hm, well that's a bummer. So how do we migrate numa-agnostic instances today? | 19:32 |
sean-k-mooney | i was suggesting once we select a topology we store it in the request_spec and instance | 19:32 |
sean-k-mooney | so that it stays the same for the lifetime of the instance unless you resize | 19:33 |
sean-k-mooney | or rebuild | 19:33 |
sean-k-mooney | um, today | 19:33 |
sean-k-mooney | non-numa instances are always exposed as 1 numa node | 19:33 |
sean-k-mooney | so it never changes from the guest point of view | 19:33 |
sean-k-mooney | so we just lie to the guest | 19:34 |
efried | couldn't we continue lying to the guest? | 19:34 |
sean-k-mooney | we can yes | 19:34 |
efried | does the guest do things differently if it knows CPU x is affined to memory y? | 19:34 |
sean-k-mooney | so we would do the progressive splitting to select the host resources and present it as 1 numa node to the guest | 19:34 |
sean-k-mooney | but that will have worse performance than telling it its actual topology | 19:35 |
sean-k-mooney | yes | 19:35 |
sean-k-mooney | the kernel will take that into account when allocating memory for a process | 19:35 |
sean-k-mooney | trying to use numa-local memory ahead of remote numa memory | 19:35 |
spatel | sean-k-mooney: hey | 19:35 |
efried | well, I guess this is a problem we would have eventually anyway, right dansmith? | 19:35 |
sean-k-mooney | if we don't expose the topology to the vm the vm kernel won't know how to optimise | 19:35 |
spatel | did you see my last mesg? | 19:36 |
dansmith | efried: what? needing everything to support guests with numa? | 19:36 |
spatel | running the vm on a single numa0 gives very high performance compared to running on both numa nodes (with the cpu_sockets=2 and cpu_threads=2 options) | 19:37 |
efried | dansmith: TL;DR: In the world where you say you don't care about NUMA, we give you an instance that's NUMA-ified, but with whatever split we were able to fit. | 19:37 |
efried | If you migrate that instance, you have to preserve that topo on the target; you can't just munge it to a new shape. | 19:37 |
efried | Which will then limit where you can fit it on migration. | 19:37 |
sean-k-mooney | spatel: i was away but just saw it now. fundamentally i think erlang is just not optimising correctly. i'm not really sure how to help other than suggesting you run more small vms if you can scale out the application horizontally instead of vertically | 19:37 |
dansmith | sean-k-mooney: presumably if the flavor on the instance is not numa-aware then we can just find a new host during cold migration and give it a new topology when it moves | 19:37 |
sean-k-mooney | dansmith: yes we could | 19:37 |
dansmith | efried: only on cold migration | 19:38 |
dansmith | sorry | 19:38 |
dansmith | efried: only on live migration | 19:38 |
dansmith | efried: on cold migration it could change because you're rebooting and there shouldn't be anything in the guest that persistently cares what the topology is.. | 19:38 |
efried | Okay, that would be a new limitation. | 19:38 |
spatel | sean-k-mooney: totally understand but i am going to lose 16 cpus in that case. but anyway i can live with that | 19:38 |
dansmith | efried: we have that limitation today and it's handled by the numa live migration stuff, AFAIK | 19:38 |
efried | dansmith: oh, my understanding was that, by lying to the guest and saying it only has one NUMA node, regardless of how many are under the covers, we can change the under-the-covers on live migration without "affecting" the guest. It would still have crappy performance on both sides, but it wouldn't notice that anything had changed. | 19:39 |
sean-k-mooney | efried: well the limitation would be don't change the view of hardware the vm sees in live migration | 19:39 |
dansmith | efried: and selecting a host in that situation should be the same as selecting a host for a new instance boot where the flavor cares deeply about the topology | 19:39 |
sean-k-mooney | when framed that way it's what we do today | 19:39 |
dansmith | efried: only insofar as it is unaware of how stupid it's being, regardless of what is underneath | 19:40 |
sean-k-mooney | if we lie to the vm so it only sees 1 numa node we can in some cases change the mapping underneath, yes | 19:40 |
dansmith | efried: so yes, if all guests are numa-aware then migrating indifferent ones becomes a little more restrictive, but that's the same goal as not pretending this stuff doesn't exist at boot time, IMHO | 19:41 |
efried | dansmith: okay, I understand and agree with that; I'm just questioning whether that's going to effectively spike the chances of NVH trying to live migrate a NUMA-agnostic instance. | 19:41 |
efried | sounds like the answer is yes, and we're okay with that. | 19:41 |
dansmith | efried: it may increase the difficulty of moving things, yes | 19:41 |
sean-k-mooney | efried: well in a non-full cloud it's very likely that the numa=1 case will just work | 19:42 |
sean-k-mooney | unless the vm is very large | 19:42 |
dansmith | efried: I refer to the documented goal of not trying to schedule the last byte of memory | 19:42 |
dansmith | yep | 19:42 |
sean-k-mooney | and in that case the numa=2 case is likely to work | 19:42 |
sean-k-mooney | so i don't think it would spike much | 19:42 |
dansmith | small guests are less likely to care about numa, and thus more likely to fit into numa=1, and thus more likely to be easily movable | 19:43 |
dansmith | large guests are more likely to need numa for proper performance and have the moving restriction today | 19:43 |
*** vishalmanchanda has quit IRC | 19:43 | |
sean-k-mooney | efried: dansmith: i'm wondering if we could/should have a second spec or defer this to the implementation | 19:44 |
efried | This spec needs to say whether we're going to try to do the progressive splitting thing. | 19:44 |
sean-k-mooney | e.g. if we agree with the proposed placement modeling, should we decide how to do the query splitting separately | 19:45 |
efried | But that's not the only factor in play. We also need to address the partially-upgraded cloud. | 19:45 |
sean-k-mooney | vs fallback | 19:45 |
sean-k-mooney | so with the progressive splitting i think we still need the fallback | 19:45 |
sean-k-mooney | to cover that case | 19:45 |
sean-k-mooney | the fallback is the only thing that can ever land on the non-upgraded hosts | 19:46 |
efried | I'm not sure we should do the fallback thing at all. | 19:46 |
efried | Not if there's a way we can simply avoid doing the translation until all computes are upgraded. | 19:46 |
sean-k-mooney | we can't without a global config | 19:47 |
sean-k-mooney | which is why we have the fallback for pcpus | 19:47 |
efried | I thought the control plane was able to tell which computes were at what level? | 19:47 |
dansmith | efried: we have to do the fallback no? | 19:47 |
efried | By RPC something something? | 19:47 |
dansmith | efried: it can | 19:47 |
sean-k-mooney | it was a global config until we agreed to do the fallback at the end of train | 19:47 |
efried | dansmith: isn't that going to end up violating pack/spread weighing and server affinity groups, because it will always favor upgraded/reshaped computes? | 19:48 |
*** spatel has quit IRC | 19:48 | |
dansmith | yes? | 19:48 |
efried | I mean, I see your point, that you really can't upgrade a compute *and* reshape it unless the scheduler is going to do some translating. | 19:49 |
dansmith | or, do both queries, merge the two results and let the filters/weighers decide? | 19:49 |
efried | gross. But yeah. | 19:49 |
dansmith | doesn't seem more gross to me.. two queries, yes, but everything will start with two queries effectively | 19:49 |
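The merge dansmith proposes could be as simple as concatenating both candidate lists and de-duplicating by host, leaving the filters/weighers to rank the combined set. A hedged sketch with hypothetical shapes (a `query` method, candidate dicts keyed by `'host'`), not nova's actual objects:

```python
def merged_candidates(placement, numa_request, legacy_request):
    """Run both queries and merge the results, de-duplicating by host.

    A host can appear in both result sets (e.g. during a partial
    upgrade window); the filters/weighers should only see it once.
    """
    seen = set()
    merged = []
    for request in (numa_request, legacy_request):
        for candidate in placement.query(request):
            if candidate["host"] not in seen:
                seen.add(candidate["host"])
                merged.append(candidate)
    return merged
```

Unlike the empty-then-retry fallback, this keeps pack/spread weighing and affinity groups honest because un-upgraded hosts are never excluded just for losing the first query.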
sean-k-mooney | dansmith: do we merge them for PCPUs or is there a reason we don't? | 19:50 |
dansmith | idk | 19:50 |
efried | IIRC we don't | 19:50 |
efried | we do one, then if no results, we do the other | 19:50 |
sean-k-mooney | right which by the way we still do | 19:50 |
efried | as for a reason, probably just didn't think about the weighing etc. | 19:51 |
*** jmlowe has joined #openstack-nova | 19:51 | |
sean-k-mooney | no i remember this coming up | 19:51 |
sean-k-mooney | i just don't recall why we chose to merge them or not | 19:51 |
sean-k-mooney | so is the proposal to always do both queries and merge them, or to do the progressive splitting | 19:52 |
efried | Both | 19:53 |
efried | because | 19:53 |
efried | we want upgraded hosts to cut over to NUMA-modeled by default. | 19:53 |
sean-k-mooney | ok so progressive for non-numa and both queries for numa | 19:53 |
efried | I think we need two queries for both. | 19:54 |
sean-k-mooney | i think it will be more like 2 for numa and 5 for non numa | 19:54 |
efried | yeah | 19:54 |
sean-k-mooney | 5 assuming we did max_numa_nodes=5 | 19:55 |
efried | 4 | 19:55 |
sean-k-mooney | sorry 4 | 19:55 |
efried | I asked this earlier: what's the max number of NUMA nodes we know of on any system? Is it 4? | 19:55 |
sean-k-mooney | could we bail out if the first numa query passed | 19:55 |
efried | yes | 19:55 |
sean-k-mooney | on a 32-64 core 2-socket amd epyc host with a numa node per l3 region, 16-32 numa nodes | 19:56 |
efried | ye gods | 19:56 |
*** tbachman has quit IRC | 19:57 | |
sean-k-mooney | if you don't expose a numa node per l3 region then 8 is more realistic | 19:57 |
efried | Can we tell from the db? | 19:57 |
efried | I guess we could tell by querying placement. | 19:57 |
sean-k-mooney | it's in the host numa topology blob in the cell db | 19:57 |
*** tbachman has joined #openstack-nova | 19:57 | |
sean-k-mooney | so yes technically. | 19:57 |
efried | but I don't know that we really want to go discovering that every time we schedule. | 19:58 |
sean-k-mooney | every time, definitely not | 19:58 |
sean-k-mooney | scheduler config option? | 19:58 |
efried | configurable [scheduler]max_implicit_numa_nodes ? | 19:58 |
efried | yeah | 19:58 |
efried | because even if it's possible to have a 32-way split, that would almost never be a good idea probably. | 19:58 |
sean-k-mooney | set it to 0 to disable numa queries entirely, otherwise it's the limit | 19:58 |
sean-k-mooney | efried: it gives you almost a 35% performance boost | 19:59 |
efried | "disable" meaning what? | 19:59 |
sean-k-mooney | efried: disable means "i have disabled numa reporting in my entire cloud so don't even try" | 19:59 |
efried | Need to grok how that query would be different from the non-upgraded query. | 20:00 |
sean-k-mooney | it would be the same | 20:00 |
sean-k-mooney | just what we do today in train | 20:00 |
efried | Oh, because it would have the HW_NUMA_ROOT trait. | 20:00 |
sean-k-mooney | well required=!HW_NUMA_ROOT would be the delta from train i guess | 20:01 |
sean-k-mooney | but if no host reports numa that's a noop | 20:01 |
efried | In this new picture, I'm not sure we need/want to forbid that trait. | 20:01 |
sean-k-mooney | we dont | 20:02 |
sean-k-mooney | if we are allowed to invent numa topologies | 20:02 |
sean-k-mooney | we could, specifically to land on un-upgraded hosts | 20:02 |
sean-k-mooney | but it has no other use | 20:02 |
efried | in this case we don't *want* to target un-upgraded hosts. | 20:02 |
efried | we want to *allow* landing there, but not *force* it ever. | 20:03 |
sean-k-mooney | sure | 20:03 |
efried | hum, but what we don't want is to land across numa nodes on an upgraded host. So yeah, I think the 'fallback' query in both cases should have !HW_NUMA_ROOT. | 20:04 |
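The fallback query being discussed could be sketched as below. This is an illustrative helper (not nova code) that builds a placement `GET /allocation_candidates` query string with a forbidden trait; placement supports the `!TRAIT` forbidden-trait syntax in the `required` parameter from microversion 1.22 onward. The function name and the exact resource classes chosen here are assumptions for the sketch.

```python
def fallback_query(vcpus, memory_mb, disk_gb):
    """Build an allocation_candidates query string for the 'fallback'
    request: a flat (non-granular) resource request that forbids
    HW_NUMA_ROOT, so it can only match hosts that do not report a NUMA
    topology in placement (e.g. un-upgraded hosts).

    Hypothetical helper; real callers would also percent-encode this.
    """
    resources = f"VCPU:{vcpus},MEMORY_MB:{memory_mb},DISK_GB:{disk_gb}"
    return f"resources={resources}&required=!HW_NUMA_ROOT"
```

For example, `fallback_query(4, 4096, 20)` yields a query asking for 4 VCPU, 4 GiB of RAM and 20 GiB of disk from providers without the HW_NUMA_ROOT trait.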
sean-k-mooney | so [scheduler]max_implicit_numa_nodes=0 means dont add HW_NUMA_ROOT or granular groups, anything above 0 is the number of numa nodes to try for progressive splitting | 20:04 |
efried | yeah. But only for NUMA-agnostic flavors. For NUMA-aware flavors, we have an explicit number of nodes we're trying for. | 20:04 |
sean-k-mooney | yes | 20:05 |
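The progressive splitting idea for NUMA-agnostic flavors could be sketched like this (a minimal standalone sketch, not nova code; the function names are invented for illustration): with `[scheduler]max_implicit_numa_nodes = N`, the scheduler would try a 1-node topology first, then 2 nodes, and so on up to N, splitting the flavor's resources evenly across candidate nodes, while 0 disables the NUMA-aware query entirely.

```python
def split_evenly(total, parts):
    """Split an integer amount across `parts` groups, spreading any
    remainder over the first groups."""
    base, rem = divmod(total, parts)
    return [base + (1 if i < rem else 0) for i in range(parts)]

def candidate_splits(vcpus, memory_mb, max_implicit_numa_nodes):
    """Yield candidate per-node (vcpus, memory_mb) layouts for a
    NUMA-agnostic flavor, smallest node count first.  With
    max_implicit_numa_nodes=0 nothing is yielded, i.e. no implicit
    NUMA modelling is attempted."""
    for n in range(1, max_implicit_numa_nodes + 1):
        if n > vcpus:  # a node with zero CPUs makes no sense
            break
        yield list(zip(split_evenly(vcpus, n), split_evenly(memory_mb, n)))
```

For a NUMA-aware flavor this loop would not apply: the flavor specifies the node count explicitly, so only that single layout would be requested.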
efried | Do you have it in you to write this up, sean-k-mooney? | 20:05 |
efried | I made a start on it, but I need to step away for... possibly the rest of the day. | 20:05 |
sean-k-mooney | ill start a second etherpad. | 20:05 |
sean-k-mooney | or if you have a start put it in one and i can extend it. | 20:05 |
sean-k-mooney | efried: do we still need the per host config option in this model | 20:06 |
sean-k-mooney | i think no if we do the progressive splitting | 20:06 |
sean-k-mooney | but we might want it for performance reasons if we just want to turn it off | 20:06 |
efried | yes we do, because sometimes the progressive splitting won't get a result, and they want to force a host to behave like a Train host. | 20:06 |
sean-k-mooney | ok | 20:07 |
efried | but that's why the workaround is *off* by default. You have to really need it to turn it on. | 20:07 |
sean-k-mooney | ok that makes sense | 20:07 |
sean-k-mooney | dansmith: would you be ok with a [scheduler]/max_implicit_numa_nodes config option | 20:07 |
sean-k-mooney | to control the progressive splitting | 20:08 |
efried | "config-driven API behavior" warning. Not sure I see a better alternative though. | 20:08 |
sean-k-mooney | efried: well the virt driver can today do whatever the hell it likes in this case anyway so im not sure its an observable thing | 20:09 |
sean-k-mooney | at least from the api perspective | 20:09 |
sean-k-mooney | but i get where you're coming from | 20:09 |
*** jmlowe has quit IRC | 20:14 | |
dansmith | that's totally not config-driven api behavior | 20:15 |
dansmith | and yes, I think that's fine | 20:15 |
sean-k-mooney | ok ill try to write this up in a comment to the spec and then ill try not to melt bauzas brain when i try to explain this to him tomorrow in our downstream tech call | 20:18 |
sean-k-mooney | i think 90% of the spec would remain the same, we just need to update the sections that reference the fallback and upgrade impact | 20:18 |
efried | sean-k-mooney: I left a comment | 20:22 |
efried | I think I covered the high points, but I'm pretty fried (*e*fried) so I probably missed some things, if you want to fill in. | 20:22 |
efried | gtg o/ | 20:22 |
*** efried is now known as efried_afk | 20:22 | |
sean-k-mooney | efried_afk: ill review it after coffee | 20:22 |
efried_afk | thx | 20:23 |
sean-k-mooney | efried_afk: o/ | 20:23 |
*** owalsh has quit IRC | 20:30 | |
*** martinkennelly has quit IRC | 20:41 | |
*** owalsh has joined #openstack-nova | 20:44 | |
*** spatel has joined #openstack-nova | 21:18 | |
*** nweinber has quit IRC | 21:20 | |
*** spatel has quit IRC | 21:22 | |
*** xek_ has quit IRC | 21:35 | |
*** mmethot has quit IRC | 21:36 | |
*** mmethot has joined #openstack-nova | 21:37 | |
*** mmethot has quit IRC | 21:38 | |
*** mmethot has joined #openstack-nova | 21:38 | |
*** mmethot has quit IRC | 21:42 | |
*** damien_r has quit IRC | 21:44 | |
gmann | johnthetubaguy: what you think of passing service as actual target in service policies? - https://review.opendev.org/#/c/676688/8/nova/api/openstack/compute/services.py | 21:53 |
*** maciejjozefczyk has joined #openstack-nova | 21:57 | |
*** maciejjozefczyk has quit IRC | 21:59 | |
*** rcernin has joined #openstack-nova | 22:07 | |
*** umbSublime has joined #openstack-nova | 22:11 | |
*** kaisers has joined #openstack-nova | 22:21 | |
*** mriedem has quit IRC | 22:25 | |
*** slaweq has quit IRC | 22:25 | |
artom | We should probably address those errors when running func tests: | 22:35 |
artom | Exception ignored in: <function _after_fork at 0x7f6382a2ed40> | 22:35 |
artom | Traceback (most recent call last): | 22:35 |
artom | File "/usr/lib64/python3.7/threading.py", line 1373, in _after_fork | 22:35 |
artom | assert len(_active) == 1 | 22:35 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Skip all integration jobs for policies only changes. https://review.opendev.org/707268 | 22:42 |
gmann | efried_afk: stephenfins dansmith gibi melwitt ^^ this will speed up the gate for policy BP changes. | 22:43 |
gmann | alex_xu: ^^ | 22:43 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Skip to run all integration jobs for policies-only changes. https://review.opendev.org/707268 | 22:44 |
*** slaweq has joined #openstack-nova | 22:50 | |
*** iurygregory has quit IRC | 22:52 | |
*** slaweq has quit IRC | 22:55 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Skip to run all integration jobs for policies-only changes. https://review.opendev.org/707268 | 23:06 |
gmann | melwitt: updated ^^ | 23:06 |
melwitt | ack | 23:06 |
*** tkajinam has joined #openstack-nova | 23:06 | |
*** slaweq has joined #openstack-nova | 23:11 | |
*** jmlowe has joined #openstack-nova | 23:15 | |
*** slaweq has quit IRC | 23:16 | |
sean-k-mooney | dansmith: efried_afk: i did a thing. https://review.opendev.org/#/c/552924/17/specs/ussuri/approved/numa-topology-with-rps.rst@516 | 23:23 |
sean-k-mooney | dansmith: efried_afk its a trivial poc of the progressive generation of the numa topologies for a non numa vm | 23:23 |
sean-k-mooney | just the topology object not the queries but i could probably hack that up tomorrow | 23:24 |
*** mlavalle has quit IRC | 23:27 | |
*** nicolasbock has quit IRC | 23:36 | |
*** mmethot has joined #openstack-nova | 23:46 | |
*** igordc has quit IRC | 23:46 | |
*** jmlowe has quit IRC | 23:48 | |
*** Liang__ has joined #openstack-nova | 23:49 | |
*** rcernin has quit IRC | 23:51 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!