Wednesday, 2025-05-21

opendevreviewMerged openstack/placement master: improve test logging and replace psycopg2 with psycopg2-binary  https://review.opendev.org/c/openstack/placement/+/94548701:04
*** ralonsoh_out is now known as ralonsoh05:25
opendevreviewArnaud Morin proposed openstack/nova master: Fix missing marker with instances in build_requests  https://review.opendev.org/c/openstack/nova/+/94780406:33
opendevreviewStefan K proposed openstack/nova-specs master: Add Cloud Hypervisor support spec  https://review.opendev.org/c/openstack/nova-specs/+/94554906:34
sean-k-mooneyUggla: there are 2 potential specless blueprints for us to consider https://blueprints.launchpad.net/nova/+spec/xml-image-meta which has an implementation available https://review.opendev.org/q/topic:%22bp/xml-image-meta%22 is the first09:24
sean-k-mooneyUggla: the author reached out during feature freeze and i chatted to them about it and gave some early feedback but they never procedurally asked for the blueprint to be specless and for it to be reviewed09:25
sean-k-mooneyi suggested that they add it to the next meeting or ask for the approval via the mailing list if they can't attend09:26
sean-k-mooneythe other is more of a feature request but i think it could be specless https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/NSW3OG5ME5RDPQYLHF4T4RCWPQYG57PK/09:27
sean-k-mooneyTLDR libvirt can now auto free ram allocated to a guest when the guest releases it internally. 09:27
sean-k-mooneyyou would not really want to do that for realtime vms09:27
sean-k-mooneyand you can't do it for hugepage/filebacked vms09:28
sean-k-mooneybut for standard vms we probably should just enable that always09:28
bauzassean-k-mooney: I need to go to the physio in 30 mins but if you have time, it would be nice if you could help me on https://review.opendev.org/c/openstack/nova/+/92214009:28
sean-k-mooneythis is more just an fyi that those would both be low hanging fruit features.09:29
sean-k-mooneybauzas: from the logs the last time it was pretty clear that the issue was not with the subnode using the wrong commit09:29
sean-k-mooneyi saw you said that in the team meeting yesterday but that was not the problem previously, we could see the commit that was being used in the job output on the compute and it was correct09:30
sean-k-mooneywith that said sure i can take a look09:30
Ugglasean-k-mooney, Ok I will add them to the doc and we will be able to discuss them in next meeting if it is required.09:30
sean-k-mooneybauzas: i assume you have tried this locally?09:31
sean-k-mooneybauzas: https://zuul.opendev.org/t/openstack/build/d31c2c01045a409a96d9dfd5e8aabbaa/log/job-output.txt#25980 so its definitely using the expected commit on the compute node09:37
sean-k-mooneyin the compute node devstack logs we can also see that compilation is enabled 2025-05-20 12:55:42.033 | ++ /opt/stack/nova/devstack/settings:source:2 :   NOVA_COMPILE_MDEV_SAMPLES=True09:40
sean-k-mooneymdevctl is also installed09:40
sean-k-mooneylooking at the compile itself there are some warnings but apparently they were compiled and loaded properly 09:43
sean-k-mooneyhttps://paste.opendev.org/show/bcAgW92YFhDTcFs5QLDD/09:43
sean-k-mooneythe problem to me looks more like libvirt has not detected the mdevs yet and we only see that on the compute because there is less of a delay between the devstack stages09:45
sean-k-mooneyif i had to guess moving the compilation and install earlier in devstack or restarting libvirt might fix this.09:46
bauzassean-k-mooney: sorry I was not notified by my TheLounge instance09:50
bauzasmy main wonder is why compute1 says mtty_mtty is a wrong device09:51
bauzasanyway, I need to go to the physio now :(09:51
sean-k-mooneybauzas: ill push a patch while you're away09:51
sean-k-mooneywe can see if moving the module install before installing libvirt fixes it09:51
bauzasI don't think this is the problem but let's see09:52
sean-k-mooneyi think its due to caching of the device and a race between nova-compute starting and libvirt seeing the device09:52
sean-k-mooneybauzas: well i confirmed its using the correct commit, the compile succeeded and the kernel modules were loaded09:52
sean-k-mooneyso that means the issue is either on the libvirt side or in nova09:53
opendevreviewsean mooney proposed openstack/nova master: move compile earlier  https://review.opendev.org/c/openstack/nova/+/95051610:22
sean-k-mooneyif my intuition is correct then ^ will fix it but we may need to keep the compile in install depending on if we have dependency issues10:23
*** sfinucan is now known as stephenfin11:19
opendevreviewsean mooney proposed openstack/nova master: move compile earlier  https://review.opendev.org/c/openstack/nova/+/95051612:27
gibisean-k-mooney: dansmith: bauzas: FYI the oslo.service threading backend patch has been merged https://review.opendev.org/c/openstack/oslo.service/+/945720 14:24
dansmithcool14:26
dansmithgibi: pci-in-placement (I think) question14:30
dansmith...and I'm getting close to being able to do this again, so if you don't know what I'm talking about, I'll repro to show you exactly, but:14:30
dansmithsince I don't have a lot of identical pci devices to play with, I tried configuring two different ones with the same name for my flavor (they're equivalent, just not identical)14:31
dansmithwhen I tried to boot with that flavor, I got an error about "only one pci request allowed".. I think it was pci-in-placement specific14:31
dansmithI always thought one of the benefits of symbolic naming for the pci devices was so you could use something like "nvme256G" and even if you had multiple generations of hardware, you'd get a suitable device14:32
dansmithis that a pci-in-placement restriction or am I mistaken about that normally working?14:32
gibiis it two aliases with the same name but different request?14:33
dansmithright14:33
gibihttps://bugs.launchpad.net/nova/+bug/210203814:34
gibiit definitely does not work with pci in placement. Without it it is accepted but according to Sean this never really fully worked there either14:35
bauzasgibi: dansmith: in a current meeting but looking 14:36
dansmithhmm, okay14:36
dansmithgibi: I thought you could also specify devices in alias by address, but maybe that's not right14:36
gibihttps://review.opendev.org/c/openstack/nova/+/944062 see the comments from sean-k-mooney 14:36
gibidansmith: I don't have the full picture of this without PCI in Placement 14:36
gibiwith PCI in Placement it is not trivial to support14:37
bauzasare we talking of PCI aliases ?14:37
dansmithokay I don't know if I agree with sean's comments about it never being intended to be supported14:37
sean-k-mooneydansmith: no the alias does not have an address14:37
gibias we cannot say OR between resource classes in an allocation candidate query14:37
sean-k-mooneydansmith: the devspec does but the alias does not14:37
dansmithI obviously wasn't closely involved with the early pci stuff, but I remember a conversation at a design summit about addresses specifically so you didn't have to commit all of one vendor/product14:37
sean-k-mooneydansmith: that was rejected because the alias is meant to be the same on all hosts14:38
sean-k-mooneyand the address would vary14:38
sean-k-mooneyyou dont need to commit all of one vendor id and product id14:39
sean-k-mooneyyou do the filtering in the dev spec14:39
sean-k-mooneythe alias is an abstraction like the resource class in placement14:39
bauzasI think you can even star the devices14:40
bauzasfor the alias14:40
dansmithokay digging back through old release config docs, I guess I'm thinking of the whitelist14:40
sean-k-mooneyya the whitelist is now the devspec14:41
dansmithit seems really unfortunate to not be able to construct a flavor that gets one of a set of multiple identical-enough devices14:41
bauzasdansmith: it should14:41
sean-k-mooneydansmith: you can14:41
bauzasI don't get what we can't do14:41
bauzasthere are two different things14:41
bauzassec14:41
dansmithoh, multiple aliases in the flavor with commas?14:42
sean-k-mooneyyes14:42
dansmithand that will do one-of?14:42
dansmithokay, gotcha14:42
sean-k-mooneyno14:42
bauzashttps://docs.openstack.org/nova/latest/configuration/config.html#pci.alias14:42
sean-k-mooneyin the flavor if you use commas that is asking for multiple devices 14:42
sean-k-mooneybut not one of14:42
bauzasin the alias, you can request PCI devices by vendor/product IDs14:42
dansmithokay, so how do I do one of?14:42
sean-k-mooneyyou cant14:43
sean-k-mooneynot without breaking other things14:43
sean-k-mooneywell14:43
sean-k-mooneyok so you can do one of via resource classes and pci in placement14:43
bauzasagain, I don't see the problem14:43
dansmithokay, so it _is_ a limitation that we can't do what I asserted above: a flavor that selects one of a set of identical-enough hardware14:43
sean-k-mooneydansmith: identical enough for live migration means the device must be exactly the same14:44
bauzasbut we can say 'nvme256:1' in the flavor, right?14:44
sean-k-mooneypotentially down to the firmware level14:44
dansmithsean-k-mooney: for things that care about live migration, but (a) I have two devices that are the same vendor/product but different firmwares *and* one supports crypto and the other does not,14:44
dansmithso I'm sure qemu has to do more is-this-the-same-for-real checking if we expect that to work robustly14:45
bauzasprovided that alias would match different hardware based on the same alias combination (either the vendor/product id or a star, IIRC)14:45
sean-k-mooneydansmith: right so you can request traits in the alias14:45
bauzasbut that requires PCI in placement14:45
dansmithbauzas: I only care about pci in placement :)14:45
dansmithsean-k-mooney: I don't see that in the docs, can you link me?14:45
bauzasah, then provide device_spec with traits14:46
bauzasand then request that trait thru the alias14:46
bauzashttps://docs.openstack.org/nova/latest/configuration/config.html#pci.device_spec14:46
bauzashttps://docs.openstack.org/nova/latest/configuration/config.html#pci.alias14:46
dansmith... I don't see in the docs how to request a trait with the alias14:46
sean-k-mooneydansmith: https://github.com/openstack/nova/blob/master/nova/pci/request.py#L110-L11514:46
dansmiththem's not docs :)14:46
sean-k-mooneydansmith: it may or may not be but it was added with the pci in placement feature14:47
bauzasdansmith: there is a traits dict key that you can add14:47
sean-k-mooneyso it should be in the spec14:47
bauzastraits14:47
bauzas    An optional comma separated list of Placement trait names requested to be present on the resource provider that fulfills this alias. Each trait can be a standard trait from os-traits lib or it can be an arbitrary string. If it is a non-standard trait then Nova will normalize the trait name by making it upper case, replacing any consecutive14:47
bauzascharacter outside of [A-Z0-9_] with a single ‘_’, and prefixing the name with CUSTOM_ if not yet prefixed. The maximum allowed length of a trait name is 255 character including the prefix. Every trait in traits requested in the alias ensured to be in the list of traits provided in the traits field of the [pci]device_spec when scheduling the14:47
bauzasrequest. This field can only be used only if [filter_scheduler]pci_in_placement is enabled.14:47
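The normalization rule quoted above can be illustrated with a minimal Python sketch. The helper name `normalize_trait` is hypothetical and this is not Nova's actual code; it only follows the quoted description for non-standard traits (upper-case, collapse each run of characters outside [A-Z0-9_] into one '_', prefix with CUSTOM_, cap at 255 characters). Standard os-traits names are assumed to be handled elsewhere.

```python
import re


def normalize_trait(name: str) -> str:
    """Hypothetical helper following the quoted rule for non-standard traits."""
    upper = name.upper()
    # Collapse each run of characters outside [A-Z0-9_] into a single '_'.
    cleaned = re.sub(r"[^A-Z0-9_]+", "_", upper)
    # Add the CUSTOM_ prefix only when it is not already there.
    if not cleaned.startswith("CUSTOM_"):
        cleaned = "CUSTOM_" + cleaned
    # The quoted limit of 255 characters includes the prefix.
    if len(cleaned) > 255:
        raise ValueError("trait name longer than 255 characters: %r" % cleaned)
    return cleaned
```

Under these assumptions a trait written as "nvme 256g" in the config would be requested as CUSTOM_NVME_256G.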
dansmithhttps://docs.openstack.org/nova/latest/configuration/extra-specs.html#pci_passthrough:alias14:47
sean-k-mooneythat is the wrong doc14:47
bauzasyou don't specify the trait explicitly14:47
bauzasI mean in the flavor14:47
dansmithohhh, you mean an alias in _config_ can be a trait instead of a vendor/product?14:47
bauzasthe flavor only requests an amount of that alias14:48
bauzasand that alias is defined by having a traits key in it14:48
bauzasdansmith: yes14:48
bauzasand device_spec allows you to tag with traits some PCI devices automatically14:48
sean-k-mooneydansmith: correct you can use placement resource classes + traits instead of vendor and product id14:48
bauzasjust add traits to the device_spec dict in question14:49
dansmithI see, I thought we were talking about the alias in the request14:49
sean-k-mooneyin the flavor no14:49
bauzasthat's transparent to the user or to the API admin14:49
dansmithso the doc for alias needs a trait example I guess14:49
sean-k-mooneythis is all intentionally hidden from the flavor14:49
bauzasdansmith: we have the conf docs 14:49
sean-k-mooneydansmith: ya https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#pci-tracking-in-placement is where i was expecting it14:49
dansmithsean-k-mooney: I understand, and that's how I expected it to work, I just didn't see the path to that alias requesting one of multiple things14:50
bauzasbut indeed the examples don't show it14:50
sean-k-mooneyor maybe here https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#configuring-pci-aliases-for-users14:50
dansmithwe need one here: https://docs.openstack.org/nova/latest/configuration/config.html#pci.alias14:50
dansmithbecause that says you can use traits, but doesn't give me an example to see where that is the only thing used14:50
gibiI can add that doc if that helps getting reviews on the eventlet series :)14:50
bauzasdansmith: at least that's documented below in the list of accepted keys14:50
dansmithI'll try this locally and then document if it works14:50
bauzascool14:51
dansmithbauzas: I understand (now), but obviously not obvious to me :)14:51
bauzasdansmith: we tested that with Uggla when he was working on vfio-pci14:51
gibi(I assume it works, there is functional test coverage on it)14:51
bauzasshould work14:51
bauzasUggla can give you some examples14:51
sean-k-mooneyya so in the devspec you can advertise them https://specs.openstack.org/openstack/nova-specs/specs/zed/approved/pci-device-tracking-in-placement.html#pci-device-spec-configuration and in the alias you can request them but we dont have that in the doc today14:51
dansmithgibi: sorry, I'm not farting around for no reason, I'm trying to get some other important stuff tested and this is in my way.. I promise I'm not ignoring your series for unimportant reasons14:51
sean-k-mooneyhttps://specs.openstack.org/openstack/nova-specs/specs/zed/approved/pci-device-tracking-in-placement.html#pci-alias-configuration14:51
gibiack14:52
dansmithsean-k-mooney: yeah I think I understand the glue path now14:52
bauzasgibi: I'd be more than happy to give a shot to your series14:52
bauzasprovided my meeting of meetings is done14:52
sean-k-mooneyso in https://docs.openstack.org/nova/latest/configuration/config.html#pci.alias14:53
sean-k-mooneywe do mention traits but only provide the example of a resource class14:53
dansmithsean-k-mooney: right14:53
dansmithwhich is really just vendor/model (by default) so it doesn't look helpful to someone looking for it14:53
bauzaswe need one more example, I agree14:57
bauzaswithout the product/vendor ID need14:57
bauzasbut IIRC, we still need to somehow define them, right Uggla ?14:57
UgglaGive me a sec, need to reload the context.14:59
bauzaslong story short, can we skip the vendor/product ID settings in the pci alias ? I think we had to do something like providing a star (*)15:00
dansmithbauzas: star does not help me here15:01
dansmithbut as I said, I  will test this example shortly15:01
bauzasdansmith: I guess I understood your case15:01
bauzasyou wanna request a group of arbitrary pci devices by traits15:02
bauzasthat's why I'm saying you shouldn't provide vendor and product IDs in the alias definition or you would restrict that list 15:02
gibiyeah so if you have two different devices with different product id then you can whitelist them via address (with full or partial match on the address) and connect them to the same resource class in the whitelist, you can add traits there as well. Then you can request that resource class in the alias 15:03
bauzasbut IIRC, when we wanted to do that thing, we found some limitation that we fixed somehow I can't recall (maybe a star or something else)15:03
UgglaYes I think in the alias section we can skip product / vendor and only use resource class or trait.15:03
gibithe key is that you can use the same RC in multiple device_spec lines15:03
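As a rough illustration of the setup gibi describes (several device_spec lines sharing one resource class, and an alias requesting that class), here is a small Python sketch that renders the corresponding [pci] config lines. The PCI addresses, the resource class CUSTOM_NVME, and the helper `render_pci_section` are made up for the example; only the key names (address, resource_class, traits, name, device_type) come from the discussion, and the result has not been checked against a running Nova.

```python
import json

# Hypothetical devices: different hardware, matched by address,
# both reported under the same resource class and trait.
device_specs = [
    {"address": "0000:81:00.0", "resource_class": "CUSTOM_NVME",
     "traits": "CUSTOM_NVME256G"},
    {"address": "0000:82:00.0", "resource_class": "CUSTOM_NVME",
     "traits": "CUSTOM_NVME256G"},
]

# The alias names the shared resource class (and optionally the trait),
# so neither vendor_id nor product_id is needed here.
alias = {"name": "nvme256g", "device_type": "type-PCI",
         "resource_class": "CUSTOM_NVME", "traits": "CUSTOM_NVME256G"}


def render_pci_section(device_specs, alias):
    """Render the dicts as one-line JSON values under a [pci] section."""
    lines = ["[pci]"]
    lines += ["device_spec = " + json.dumps(spec) for spec in device_specs]
    lines.append("alias = " + json.dumps(alias))
    return "\n".join(lines)


print(render_pci_section(device_specs, alias))
```

A flavor would then request the alias by name and count, for example pci_passthrough:alias=nvme256g:1, without mentioning the resource class or trait.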
UgglaYep I updated the documentation for the alias section to highlight that product_id / vendor_id are optional.15:05
dansmithI don't see that15:07
dansmith...that they are optional.. I do see the class example15:07
Uggladansmith, https://docs.openstack.org/nova/latest/configuration/config.html  --> alias --> A JSON dictionary which describe a PCI device. 15:08
dansmithgibi: traits would be better for me than RC.. if they use the same RC, then I can't do inventory planning based on RC, and also, I might want to still request via RC for specific devices, or traits for generic "just give me a 256G nvme"15:08
gibihttps://github.com/openstack/nova/blob/221a3e89e8988bc664298106ee691a4e41ca71f9/nova/tests/functional/libvirt/test_pci_in_placement.py#L1759-L177215:08
dansmithUggla: yeah I said I see the example, just not that vendor/product is optional (as it says for traits and rc)15:08
gibidansmith: if you don't define the RC then the RC will be based on the product_id so you cannot have a single alias requesting it as a single alias cannot request two different RC15:09
dansmithgibi: ... but I can request just a trait no?15:09
Uggladansmith, -->  Note that [...] indicates optional field.15:10
gibiyou can request a set of traits15:10
gibibut you need to either provide an RC or a product/vendor id in the alias15:10
gibiotherwise we cannot create a placement allocation candidate query15:10
dansmithUggla: fine, but very not as obvious as: "resource_class: The **optional** Placement resource class name"15:11
dansmithgibi: okay if that's the case then that's definitely not very well explained in the docs here15:12
dansmithlemme try with just the trait so I can see what happens15:12
gibilets improve the doc15:12
gibiif you provide just traits in the alias and no RC or vendor/product id then you will get that the alias is invalid15:13
gibi(I hope : )15:13
dansmithI do not, on service startup at least15:15
Uggladansmith, I agree it could be better.15:15
gibidansmith: it is validated when the alias is used15:18
gibiit was true before PCI in Placement as well15:19
dansmithgibi: so, I might be missing something but it looks to me like I was able to create a server and it just silently ignored my pci request15:19
dansmithhttps://paste.opendev.org/show/bkADIKJZwFGSAv3UcXCF/15:20
gibialias = {"traits": "CUSTOM_NVME256G", "device_type":"type-PCI", "name": "nvme256g"}15:21
gibithis should have been rejected at server create. if not that is a bug15:21
gibiit cannot work as we cannot generate a proper allocation candidate query as the resource class is mandatory there15:22
dansmithyou see it's running and you can confirm the embedded flavor attempted to request it right?15:22
gibiyes I see15:22
gibiso it is a bug15:22
gibidansmith: do you use the same nova-pci.conf for both compute and api?15:24
dansmithboth computes, api and scheduler yeah15:24
dansmiththat's why it's on mnt :)15:24
gibiack, I tend to make that mistake that I update the alias in the nova-compute conf and not in the nova-api conf15:25
dansmithyep, tried to set myself up to avoid that this time15:25
gibi(the whole above discussion shows that this is the first time we really started using the PCI in Placement feature so we are finding the real bugs now)15:26
dansmiththis is a just-rebuilt setup so let me test with the other aliases to make sure pci is working in general15:26
dansmithhmm, no actually15:28
dansmithI wonder if my pci filter is not getting set, hmm15:28
dansmithah, it's not15:29
dansmithso that's probably why it worked before too, hrm15:29
dansmithgibi: so, I had [scheduler] instead of [filter_scheduler]15:33
Uggladansmith, not sure but my pci_in_placement is in the filter_scheduler section.15:33
Uggla:)15:33
dansmithbut, still works.. what all reads that? more than n-sch?15:33
dansmithconductor maybe?15:34
gibiif pci_in_placement is read by the api15:36
gibis/if//15:36
gibireport_in_placement is read by the compute15:37
gibi(yes it is not optimal I know)15:37
dansmithokay api is seeing the new value, still not stopping me15:38
gibiI can try to reproduce the issue on my side...15:39
bauzaswhat's the issue ?15:39
dansmithbut, even using a good alias isn't getting me the device or a failure, so something else must be broken here15:40
Ugglabauzas, my understanding is if you only specify a trait in the alias section, pci in placement should block that as it needs RC too. And it is not the case.15:41
bauzasyeah, that could be the case15:42
bauzasdansmith doesn't see anything in placement ?15:42
bauzasabout the PCI RPs15:42
bauzas(just rebooted due to a wifi crash, <3 F42)15:42
dansmithgibi: okay I had a typo in my flavor property so it was just being ignored. but, fixing that, here's what I see as a user:15:47
dansmithhttps://paste.opendev.org/show/bI7lakvj0atefW6KPNkw/15:47
dansmiththe log does have something in it, but it's not quite obvious I suspect15:47
dansmithI suspect no operator will know what "resources={CUSTOM_PCI_NONE_NONE=1}" means15:47
gibiyepp we need to add an explicit validation15:49
gibithat either RC or product/vendor needs to be provided in the alias15:49
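The explicit validation gibi proposes could look roughly like the following Python sketch. The function `validate_alias` is hypothetical and is not the fix tracked in the bug; it only encodes the rule stated here, that an alias must carry either a resource class or a vendor/product pair, because a trait alone does not tell placement which inventory to allocate.

```python
def validate_alias(alias: dict) -> None:
    """Hypothetical check: reject an alias that names nothing allocatable."""
    has_rc = bool(alias.get("resource_class"))
    has_ids = bool(alias.get("vendor_id")) and bool(alias.get("product_id"))
    if not (has_rc or has_ids):
        raise ValueError(
            "PCI alias %r must define either resource_class or both "
            "vendor_id and product_id; traits alone cannot be allocated"
            % alias.get("name")
        )


# The trait-only alias from the discussion is rejected by this check.
trait_only = {"traits": "CUSTOM_NVME256G", "device_type": "type-PCI",
              "name": "nvme256g"}
try:
    validate_alias(trait_only)
except ValueError as exc:
    print(exc)
```

Running such a check while the config is loaded, rather than at server create, would surface the problem to the operator instead of the end user.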
dansmithyeah15:50
gibiI can file a bug and link it to the downstream PCI in Placement Jira as well so we will have time for it to fix15:51
dansmithgibi: so ... why again can we not support just the trait? You can ask for all providers with that trait, no?15:51
gibiyou cannot have an allocation candidate without telling placement what resource you are allocating15:51
dansmithbecause you're asking for inventory not a provider?15:51
dansmith(and providers have the traits)15:52
gibiI'm asking for things I can allocate, I cannot allocate a trait or a provider I can only allocate a piece of inventory15:52
dansmithright, that's what I mean15:53
gibiyepp15:53
gibiin our oversimplified case where a single PF RP only has a single RC with a single inventory, allocating the RP or allocating one piece of that inventory is the same. 15:54
* dansmith nods15:54
gibiso it is easy to make that mental jump that we want to allocate RPs 15:54
gibithis also shows the effect of the design decisions that we connect traits to RPs not to an inventory of the RP15:56
gibiwe allocate inventories but we filter on traits that are not on the inventories15:57
gibiit is a bit of a mixup15:57
gibiso we tend to namespace traits to somehow refer to inventories :)15:57
gibilike with the OTU trait, we put PCI in the name to signify it is only related to PCI inventories of the RP15:58
gibianyhow15:58
gibiI am going to file a bug15:58
gibibased on your paste15:58
Ugglagibi, yes but finally the code has blocked the invalid settings. So not so bad.16:01
gibihttps://bugs.launchpad.net/nova/+bug/211144016:02
gibiUggla: yep, but we should consider pre-validating the alias at service startup16:02
gibiand also be a bit more instructive in the lgs16:02
gibilogs16:02
dansmithgibi: yeah there have been a few things in this journey where traits applied to providers to say something about inventory has been a bit weird16:02
dansmithgibi: Uggla I also had a typo in my pci alias config (unparseable json) and it didn't get noticed until the first time I tried to boot with a pci-enabled flavor, and we returned json parse complaints to the user in the error :(16:03
dansmithwhich of course is "line 1, character X" which makes no sense since they're all one line16:04
dansmithwould be much better if we can parse and log on startup which one is the problem16:04
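A minimal sketch of the parse-and-report-on-startup idea, in Python. It assumes the alias option arrives as a list of raw JSON strings, one per config line; the function `parse_aliases` and the message wording are illustrative, not Nova's implementation. The point is that the error names which entry failed, since "line 1, character X" alone is not useful when every alias is a single line.

```python
import json


def parse_aliases(raw_aliases):
    """Parse every alias string up front and report which entries are bad."""
    parsed = []
    errors = []
    for index, raw in enumerate(raw_aliases):
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError as exc:
            # Include the entry index and the offending text so the
            # operator can find it in the config file.
            errors.append(
                "[pci]alias entry #%d is not valid JSON (%s at column %d): %r"
                % (index, exc.msg, exc.colno, raw)
            )
    if errors:
        raise ValueError("\n".join(errors))
    return parsed
```

Called once during service startup, a failure here would land in the service log rather than in an API error returned to the user.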
gibifiled the upstream bug and a matching downstream jira 16:05
opendevreviewsean mooney proposed openstack/nova master: move compile earlier  https://review.opendev.org/c/openstack/nova/+/95051616:07
sean-k-mooneydansmith: that's a typo in the docs16:11
sean-k-mooneywhen we added the docs for pci managed and live migration flags16:12
dansmithwhat's a typo in the docs?16:12
sean-k-mooneywe missed it was not valid json. i hit the same thing when testing the OTU devices feature and pinged Uggla but i dont know if we filed an upstream bug or not to fix it16:13
sean-k-mooneygibi: validating the config on start up is good we also should be caching it.16:13
sean-k-mooneygibi: https://review.opendev.org/c/openstack/nova/+/42714516:15
gibisean-k-mooney: good point, added the link to the bug16:16
sean-k-mooneystephen didn't want the additional complexity but going from 4062 function calls to 3 after caching is a massive saving16:17
sean-k-mooneyespecially since we can just slap functools.cache on it now16:18
sean-k-mooneyfrom the blueprint """During the creation of an instance, a list of request is build. One of this requested elements are the PCI devices, both from the API call and from the flavor.16:20
sean-k-mooneyIn the PCI request gathering from the flavor, the variable "pci_alias", stored in the Nova config file, is always parsed. This step is only needed once, because the information is static."""16:20
sean-k-mooneyso every time we need to translate a pci alias to a pci request object we parse all the aliases again https://github.com/openstack/nova/blob/master/nova/pci/request.py#L124-L20916:23
dansmithUggla: (cc gibi) FYI nvme emulation in qemu does work for testing this and is cleanable (via format only, not sanitize)16:24
Uggladansmith, cool !16:25
dansmiththat device reports that it supports lots of namespaces (not sure if it really does, haven't tested yet) but that flags a warning I have queued for the script locally about multi-namespace devices being cleaned with format only (so that's a good test)16:25
sean-k-mooneydansmith: the only issue with that is while qemu can emulate it libvirt currently does not support that so you have to use qemu directly unless i missed something in the xml for this ?16:25
dansmithsean-k-mooney: right, you can configure it in the xml, but only via qemu:commandline16:25
gibicool. So we can use that for local testing. 16:26
dansmithso not helpful for people testing on top of an existing nova impl, unfortunately16:26
gibinot in CI though16:26
dansmithgibi: yeah16:26
dansmithright16:26
sean-k-mooneyyou can yes16:26
dansmithgibi: it's also helpful so you can get a bunch of devices instead of just one (if you only have one)16:27
sean-k-mooneyi looked into that when we were reviewing the spec but kind of gave up when i realised the gap in libvirt.  you could just add the qemu args in virt manager to test it locally or just invoke qemu yourself16:27
dansmithhttps://termbin.com/tm0b16:28
sean-k-mooney@gibi: the crd change is in merge conflict https://github.com/openstack-k8s-operators/nova-operator/pull/94817:03
sean-k-mooneygibi: im sure its trivial but it will need to be rebased before it can merge17:03
sean-k-mooneyoh you already have17:03
sean-k-mooneygibi: in that case ill add lgtm and assuming it passes ci it can proceed, does that work for you?17:04
gibiyepp17:07
gibiworks for me17:07
gibi(strange place to discuss it but fine :)17:08
sean-k-mooneygibi: hehe wrong tab17:13
* sean-k-mooney has been trying different ways of arranging my windows to be less interrupt driven 17:14
* sean-k-mooney does not think that will work out for me because i like being interrupt driven17:15
-opendevstatus- NOTICE: Gerrit is being updated to the latest 3.10 bugfix release as part of early prep work for an eventual 3.11 upgrade. Gerrit will be offline momentarily while it restarts on the new version.17:34
opendevreviewDan Smith proposed openstack/nova master: Make example OTU cleaner support NVMe sanitize  https://review.opendev.org/c/openstack/nova/+/95059217:52
sean-k-mooneygmaan: i didnt get as far as i hoped today but the first 10 patches in stephen's series are still good to merge  IMO i.e. https://review.opendev.org/c/openstack/nova/+/936365/8 -> https://review.opendev.org/c/openstack/nova/+/937048/1118:42
sean-k-mooneyill take a look at the next couple of patches tomorrow18:43
gmaansean-k-mooney: ack, I will see if I can check those this week but next week I am planning. 18:44
Callum027Hi sean-k-mooney, I added a comment to the change but I'd thought I'd let you know here as well - sorry for the inactivity on my changes, been a bit busy with work. We've deployed the changes to our production OpenStack environments and everything seems to be working great, so I'm happy to attend the next Nova meeting so we can discuss the proposal18:44
Callum027and hopefully get it merged for Flamingo.18:44
sean-k-mooneyCallum027: ack, if you can't, just send a mail to the list asking for it to be approved18:45
sean-k-mooneyi hope its not controversial18:45
sean-k-mooneythere is at least some other interest from the telemetry folks to enable this with cloud kitty18:46
sean-k-mooneyso that's positive18:46
Callum027Yeah, I'm sure they'd be interested since this improves the scalability of Ceilometer polling18:47
Callum027I can understand why people wouldn't be happy about it since this is basically putting lipstick on a pig, but anything more radical would require more fundamental design changes and I'm not sure I'm capable of championing that at this stage :)18:47
sean-k-mooneywell with my nova hat on it's replacing Ceilometer polling the nova api with Ceilometer polling the libvirt api so if its a performance issue its not someone else's problem. on the other hand the domain xml is meant to be internal state18:49
sean-k-mooneyso if we just stop using them or libvirt changed to yaml18:49
sean-k-mooneythat's not a breaking change from a nova perspective as the xml is not public18:49
sean-k-mooneybut realistically that's not going to happen18:49
sean-k-mooneyso as long as ceilometer is readonly its probably a net win18:50
Callum027I do think it's a relatively elegant solution to the scalability issues Ceilometer faced, with the drawback that changes to the files require all instances to be shelved-and-unshelved to apply the changes (or updated in place using virsh commands)18:58
sean-k-mooneyno it just requires a hard reboot18:59
sean-k-mooneythe xml is regenerated on every hard reboot or start/power on call19:00
Callum027Oh, right, of course19:00
sean-k-mooneyshelve would work too but there are cheaper options19:00
Callum027I should update the release notes on those commits to mention that19:00
Callum027But the main thing is live migrating doesn't work since the XML gets copied over without modification19:00
sean-k-mooneyit will also get added on a live migrate i think19:01
Callum027In our testing it didn't, but we're using an older version of Nova19:01
sean-k-mooneyor at least it would if you implemented that in your code change19:01
sean-k-mooneywell no i mean nova can do that but you need to write the code19:01
Callum027Maybe part of "standardising" this would be that we write a more elegant way of updating the metadata of instances in place19:01
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/migration.py#L55-L9719:02
Callum027For our production environment we wrote a one-off Ansible playbook to do it19:02
sean-k-mooneyi mean that's fair too19:02
sean-k-mooneyalthough we dont like operators touching the xml19:02
Callum027Yeah, I definitely would have preferred to use the proper core for it19:02
sean-k-mooneyso perhaps as a follow up we could add updating of the metadata in the domain on live migrate19:03
sean-k-mooneythat's not done today in general19:03
sean-k-mooneyso its not really a bug in your patch19:03
sean-k-mooneyits just existing behavior. cold migrate would pick up the change as well19:03
Callum027Yeah, it's definitely a useful change though19:04
Callum027For now I've added the item to the agenda, I'll make sure I'm at the next meeting19:04
sean-k-mooneyack19:04
opendevreviewsean mooney proposed openstack/nova master: move compile earlier  https://review.opendev.org/c/openstack/nova/+/95051619:17
