| Nick | Message | Time |
|---|---|---|
| | *** mhen_ is now known as mhen | 02:40 |
| opendevreview | Takashi Kajinami proposed openstack/nova master: libvirt: Use firmware auto-selection by libvirt https://review.opendev.org/c/openstack/nova/+/969132 | 06:30 |
| opendevreview | Takashi Kajinami proposed openstack/nova master: libvirt: Add capability to load smm feature from existing xml https://review.opendev.org/c/openstack/nova/+/969131 | 06:33 |
| opendevreview | Takashi Kajinami proposed openstack/nova master: libvirt: Use firmware auto-selection by libvirt https://review.opendev.org/c/openstack/nova/+/969132 | 06:33 |
| gokhan | hello folks, I want to get your thoughts about this patch https://review.opendev.org/c/openstack/nova/+/973750 . It seems the commit https://github.com/openstack/nova/commit/5a55a78d510b86975f0f4f8f43ee1feef7206244#diff-47eb12598e353b9e0689707d7b477353200d0aa3ed13045ffd3d017ee7d9e753 resolves the live migration pre-migrate failure. Regarding the fixes in Bug #1899835 (Avoid volume rollback mismatches), which handle cleanup on the destination when pre_live_migration fails: I’ve observed that many live migration failures (like the one where a volume is in 'backing-up' status) could potentially be caught much earlier. Currently, if a volume is busy or in an unstable state, the failure often occurs during pre_live_migration, triggering the complex rollback logic we've been refining. Would it make sense to implement a proactive volume status check at the Nova API level (or early in the Conductor) before the RPC call to the destination node? | 07:39 |
| gibi | gokhan: I agree that if there is a way to catch "bad" volume states early in the complex live migration process then that can decrease the likelihood of a problematic rollback. | 07:47 |
| gokhan | thanks gibi. We have submitted a patch (https://review.opendev.org/c/openstack/nova/+/973750) that catches this on the API side. We would like to confirm whether this is the recommended approach. | 07:56 |
| gokhan | Since melwitt has previously worked on similar issues—specifically the 'Avoid volume rollback mismatches' patches—I would highly value her insights and comments on this matter to ensure we're aligned with the existing logic. | 08:02 |
| gibi | gokhan: yeah, I'm reading your patch now... | 08:03 |
| gokhan | thanks gibi :) | 08:04 |
| | *** LarsErik1 is now known as LarsErikP | 08:07 |
| gibi | gokhan: I think the direction is OK. I left comments about the test coverage, and pinged melwitt in the review | 08:10 |
| gokhan | Thanks for the review gibi, we will address your comments and upload a new patch set soon. | 08:15 |
| opendevreview | Seyeong Kim proposed openstack/nova master: libvirt: Support boot_index for multiple block devices https://review.opendev.org/c/openstack/nova/+/963665 | 08:17 |
| tobias-urdin | can somebody refresh my memory, when doing live migration for an instance with cpu_policy=dedicated and cpu_thread_policy=require and cpu_dedicated_set set, does scheduler take into account remapping PCPU for example on source cores 0,40 to something like 1,41 if 0,40 was not free on destination compute? | 11:04 |
| tobias-urdin | gibi: ^ iirc it should work? (caracal) | 11:05 |
| opendevreview | Lajos Katona proposed openstack/nova master: Use SDK for Neutron extensions https://review.opendev.org/c/openstack/nova/+/962270 | 11:12 |
| opendevreview | Lajos Katona proposed openstack/nova master: Use SDK for Neutron floating IPs https://review.opendev.org/c/openstack/nova/+/962604 | 11:12 |
| opendevreview | Lajos Katona proposed openstack/nova master: Use SDK for Neutron networks https://review.opendev.org/c/openstack/nova/+/928022 | 11:24 |
| opendevreview | Lajos Katona proposed openstack/nova master: Use SDK for Neutron subnets https://review.opendev.org/c/openstack/nova/+/962190 | 11:24 |
| opendevreview | Lajos Katona proposed openstack/nova master: Use SDK for Neutron extensions https://review.opendev.org/c/openstack/nova/+/962270 | 11:25 |
| opendevreview | Esra Ozkan proposed openstack/nova master: Fix Concurrent VM Live Migrate - Volume Backup Error https://review.opendev.org/c/openstack/nova/+/973750 | 11:30 |
| opendevreview | Max proposed openstack/nova master: Add regression test for bug #2137366 https://review.opendev.org/c/openstack/nova/+/974831 | 11:32 |
| opendevreview | Max proposed openstack/nova master: fix: delete attachments after rescheduling delete https://review.opendev.org/c/openstack/nova/+/974832 | 11:32 |
| opendevreview | Esra Ozkan proposed openstack/nova master: Fix Concurrent VM Live Migrate - Volume Backup Error https://review.opendev.org/c/openstack/nova/+/973750 | 11:36 |
| opendevreview | Lajos Katona proposed openstack/nova master: Use SDK for Neutron floating IPs https://review.opendev.org/c/openstack/nova/+/962604 | 11:36 |
| gibi | tobias-urdin: yeah I think it should work but obviously we need more data. | 12:04 |
| gibi | tobias-urdin: I think this is the functional test that at least partially covers it https://github.com/openstack/nova/blob/d840c63a18ad951382be821f90a6b28c627911fa/nova/tests/functional/libvirt/test_numa_live_migration.py#L197 | 12:09 |
| sean-k-mooney | tobias-urdin: it does in any release that properly supports numa live migration, which i think was added in train or wallaby | 12:09 |
| sean-k-mooney | ya it was train | 12:10 |
| sean-k-mooney | https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/numa-aware-live-migration.html | 12:10 |
| tobias-urdin | thanks, not sure why i had a vague memory of it having to be equal on both sides, probably pre-train days something then :) | 12:33 |
| sean-k-mooney | yep, pre-train the live migration would be allowed but the vm's xml would not be updated until it was hard rebooted | 12:33 |
| sean-k-mooney | assuming the cores existed; if they didn't, the live migration would fail, but only in that case | 12:34 |
| sean-k-mooney | we put a block in place and backported it pre-train, with a workaround to re-enable it, as part of supporting it properly | 12:34 |
| sean-k-mooney | basically pre-train we said only do live migration with numa if you're using hugepages without cpu pinning | 12:35 |
| opendevreview | Stephen Finucane proposed openstack/nova master: Follow-up for I85a7729ee65f8987ed0239f80d8d0082a414ab8f https://review.opendev.org/c/openstack/nova/+/974843 | 12:42 |
| opendevreview | Merged openstack/nova master: api: Simplify servers views (1/3) https://review.opendev.org/c/openstack/nova/+/956231 | 13:53 |
| opendevreview | Balazs Gibizer proposed openstack/nova master: Use an executor to delay STOPPED events https://review.opendev.org/c/openstack/nova/+/974445 | 14:00 |
| opendevreview | Balazs Gibizer proposed openstack/nova master: Run nova-compute in native threading mode https://review.opendev.org/c/openstack/nova/+/965467 | 14:00 |
| opendevreview | Balazs Gibizer proposed openstack/nova master: DNM:Test with oslo.vmware + compute eventlet removal patches https://review.opendev.org/c/openstack/nova/+/973468 | 14:00 |
| opendevreview | Balazs Gibizer proposed openstack/nova master: SubclassSignatureTestCase to use NoDBTestCase as base https://review.opendev.org/c/openstack/nova/+/974861 | 14:00 |
| gibi | sean-k-mooney: bauzas: gmaan: https://review.opendev.org/c/openstack/nova/+/974445 this is ready for serious review now. I have a couple of remaining todos but nothing is expected to change in the core of the patch. On the other hand, the core of the patch needs deep review | 14:11 |
| bauzas | ack, I definitely need a coffee for reviewing it | 14:12 |
| gibi | don't worry if it takes two :) | 14:13 |
| opendevreview | Merged openstack/nova master: api: Simplify servers views (2/3) https://review.opendev.org/c/openstack/nova/+/956232 | 14:13 |
| opendevreview | Merged openstack/nova master: api: Simplify servers views (3/3) https://review.opendev.org/c/openstack/nova/+/956233 | 14:17 |
| sean-k-mooney | gibi: ok i'll try and take a look today or tomorrow | 14:23 |
| gibi | thanks both of you :) | 14:56 |
| MaxLamprecht[m] | Hey folks, I'm currently debugging many out of sync cases between nova and cinder BDMs/attachments in our environment. Mostly these are caused by parallel operations/lock-contention or other reasons. I added a comment... (full message at <https://matrix.org/oftc/media/v1/media/download/AYgeZZCT43vzk9WSn7ySGLYl0AqCdodcgtuHH4PEsRyizpNZ9dfQRaY5IGWjRMbtweIwNYxNRndku0zt8aS2xE9CecSB29CAAG1hdHJpeC5vcmcvb0drQk95eXZDTkVTUmphRGJPSk5GYlRZ>) | 15:13 |
| sean-k-mooney | MaxLamprecht[m]: multi-line messages on matrix only render the first line on irc, with a link to the full message, so it's best to avoid multi-line messages | 15:17 |
| sean-k-mooney | MaxLamprecht[m]: it would be good to add those to https://etherpad.opendev.org/p/nova-2026.1-status#L76 | 15:19 |
| sean-k-mooney | add a new line per bug with 2 sub-bullet points for the regression and fix reviews | 15:19 |
| sean-k-mooney | MaxLamprecht[m]: other than that, the best thing you can do to get review input is to bring it up here or in the open discussion section of the weekly irc meeting | 15:20 |
| MaxLamprecht[m] | ack. Good to know :) | 15:20 |
| MaxLamprecht[m] | thx, I will add these MRs | 15:20 |
| sean-k-mooney | the fact you have created the reproducer and fix separately etc. will definitely help with the review | 15:21 |
| gmaan | gibi: ack | 15:22 |
| sean-k-mooney | MaxLamprecht[m]: 2 of the series reference the same bug so you have 3 bug fix/regression series but only 2 bugs | 15:27 |
| sean-k-mooney | oh actually no, you have 3 | 15:28 |
| sean-k-mooney | just similar numbers | 15:28 |
| MaxLamprecht[m] | haha. I was already checking if I messed something up :D | 15:29 |
| sean-k-mooney | MaxLamprecht[m]: can you make sure you assign https://bugs.launchpad.net/nova/+bug/2088066 https://bugs.launchpad.net/nova/+bug/2137366 and https://bugs.launchpad.net/nova/+bug/2139135 to yourself in launchpad | 15:29 |
| sean-k-mooney | MaxLamprecht[m]: it was the 66 that was confusing me | 15:30 |
| MaxLamprecht[m] | sean-k-mooney: done | 15:30 |
| sean-k-mooney | all 3 sound like valid bugs so i quickly triaged all 3 as medium and added the appropriate tags | 15:31 |
| MaxLamprecht[m] | perfect. I'm currently trying to resolve/find root causes for such bdm/attachment inconsistencies in our environments. Maybe I will find additional bugs while digging into this. | 15:41 |
| MaxLamprecht[m] | I have added another small db index MR to the etherpad that is mostly relevant for large scale deployments | 15:42 |
| sean-k-mooney | MaxLamprecht[m]: it's still probably a good idea to summarise this in a mailing list post or in the next irc meeting so that folks are aware of the related work you are trying to progress | 15:43 |
| sean-k-mooney | it's not strictly required but the more visibility your patches have, the more likely they will get reviewed | 15:43 |
| MaxLamprecht[m] | sean-k-mooney: ack. I will try to join the next irc meeting | 15:48 |
| rm_work[m] | hopefully quick q about nova vendordata dynamicJSON... I can't find it in the docs anywhere but it seems like I need to configure the `vendordata_dynamic_targets` option with the target named `cloud-init@<address>` or else cloud-init doesn't actually return anything for it. is that expected? and how can I use multiple targets if they need to be named that way? or did I maybe get confused from some other bug I had when I was testing this and any name will work? | 15:50 |
| sean-k-mooney | rm_work[m]: so yes and no | 15:58 |
| sean-k-mooney | rm_work[m]: we have no upstream support for using it as a data source for cloud-init | 15:59 |
| sean-k-mooney | they have some documentation that suggests you could use it that way | 15:59 |
| sean-k-mooney | but it has never been officially supported in nova | 15:59 |
| sean-k-mooney | rm_work[m]: https://github.com/canonical/cloud-init/issues/5221 | 16:00 |
| sean-k-mooney | rm_work[m]: in https://github.com/canonical/cloud-init/issues/5221#issuecomment-2086560281 i provide an example of how to hack it to work by doing | 16:01 |
| sean-k-mooney | [API] | 16:01 |
| sean-k-mooney | vendordata_dynamic_targets=['cloud-init:string@http://127.0.0.1:123'] | 16:01 |
| sean-k-mooney | there was a spec for this that was not implemented https://review.opendev.org/c/openstack/nova-specs/+/917109/6/specs/2024.2/approved/dynamicjson-vendordata-cloud-config.rst https://blueprints.launchpad.net/nova/+spec/dynamicjson-vendordata-cloud-config | 16:04 |
| sean-k-mooney | rm_work[m]: so yes, if your dynamic backend returns a single string that is the cloud-init info i think it can work, but since it's not officially supported, if it does not, it's not a nova bug | 16:05 |
| sean-k-mooney | it's a new feature as this was never intended to work with cloud-init automatically like that | 16:06 |
| rm_work[m] | I mean I don't need it to RUN as part of cloud-init, I just need cloud-init to properly fetch the contents into the vendordata2.json file | 16:36 |
| rm_work[m] | like, VM boots, user checks the json file and it contains stuff | 16:37 |
| rm_work[m] | using a colon is a neat trick though, ok will try that | 16:37 |
| sean-k-mooney | so it will be present in the config drive and via the metadata api | 16:38 |
| sean-k-mooney | you just won't be able to execute scripts in the guest vms via vendor data automatically without the hack | 16:38 |
| sean-k-mooney | so if you just want to make the file available you do not need the hack | 16:38 |
| sean-k-mooney | if you want to install packages etc. then you do, but as i noted it's not officially supported so just be aware of that | 16:39 |
| rm_work[m] | yeah it looks like I CAN use any name after all, I think I had an encoding bug originally and fixed that at the same time as trying switching to cloud-init as the key, and assumed I needed that name | 16:59 |
| rm_work[m] | nova doesn't like if the target returns anything other than a string with 200 code | 17:01 |
| rm_work[m] | I was trying to return JSON originally... seeing as how it's called DynamicJSON... but nope lol | 17:01 |
| rm_work[m] | THAT seems like a bug to me but i assume there's a reason | 17:01 |
| sean-k-mooney | nova is expecting a json response but a single string is also a valid json response | 17:10 |
| sean-k-mooney | rm_work[m]: we are doing json.loads on the response | 17:10 |
| sean-k-mooney | rm_work[m]: https://www.madebymikal.com/nova-vendordata-deployment-an-excessively-detailed-guide/ | 17:11 |
| rm_work[m] | hmm | 17:12 |
| rm_work[m] | yeah that was the guide I was originally following | 17:12 |
| sean-k-mooney | yep so the response format should be the same as the static json file | 17:13 |
| rm_work[m] | ok I'll try again again again | 17:13 |
| sean-k-mooney | the hack of using a single string with the file contents for cloud-init config is abusing the fact that a string is a valid json document | 17:13 |
| sean-k-mooney | we expect the response to be a valid json document and specifically a json object, but as long as it's valid json it can be processed by nova | 17:14 |
| rm_work[m] | ok, i'm trying again | 17:21 |
| rm_work[m] | I feel like I was dealing with like 4 different issues at the same time, and by the time I solved them all I was in this weird configuration and had made some inaccurate assumptions about what the actual issues were | 17:22 |
| rm_work[m] | yeah ok, that does seem to work T_T | 17:32 |
| rm_work[m] | so it looks like almost everything I had to adjust to get this to work was likely unnecessary | 17:32 |
| rm_work[m] | I BELIEVE there is still an issue with return codes -- nova will not actually include the body of the response if the return code is a 5XX right? | 17:33 |
| sean-k-mooney | probably | 17:33 |
| sean-k-mooney | but your server should be returning a 200 ok for the get request we do | 17:34 |
| rm_work[m] | yeah there are cases where the request result is an error, but we do want that returned to the user, so I'm just blindly doing 200 OK returns and just including an error in the body. that does make some sense at least, the CALL didn't fail, just the internal operation did. | 17:35 |
| sean-k-mooney | ya so if you want to return an error in the body it needs to be a json body, not html, or it will break nova | 17:36 |
| sean-k-mooney | nova will likely just drop it but it's expecting json | 17:36 |
| sean-k-mooney | you could just have a single string as i noted above | 17:36 |
| rm_work[m] | yeah I made a structure like '{"success": False, "error": "blah"}' | 17:36 |
| sean-k-mooney | ack | 17:37 |
| sean-k-mooney | that should work | 17:37 |
| rm_work[m] | yeah ok thanks for calling that stuff out, this simplifies things a bit | 17:39 |
| rm_work[m] | sean-k-mooney: ah ok so I see now, I'm relying on it going into /var/lib/cloud/instance/vendor-data2.txt on boot | 19:26 |
| rm_work[m] | when I test just hitting the endpoint, yes it does return properly... but if I boot a new VM, nothing shows up there with this setup | 19:27 |
| rm_work[m] | going back through to test individually to see whether the issue is the naming or the JSON-encoding | 19:27 |
| rm_work[m] | I think maybe this is why I had to name it cloud-init | 19:27 |
| opendevreview | Ghanshyam proposed openstack/nova master: PoC: Graceful shutodwn of nova services https://review.opendev.org/c/openstack/nova/+/967261 | 19:29 |
| opendevreview | Ghanshyam proposed openstack/nova master: PoC: Graceful shutodwn of nova services https://review.opendev.org/c/openstack/nova/+/967261 | 19:29 |
| rm_work[m] | so I guess this is more an issue of how cloud-init is configured to call/parse the vendordata endpoint | 19:45 |
| sean-k-mooney | rm_work[m]: to have that work you need config drive enabled | 19:48 |
| sean-k-mooney | it will be placed in the iso | 19:48 |
| sean-k-mooney | and then it should get mounted with the other info there | 19:48 |
| sean-k-mooney | but in that case it will only be updated once on first boot | 19:48 |
| sean-k-mooney | so if it failed when the config drive was first built and you fixed it afterwards, it would not be present | 19:49 |
| rm_work[m] | yeah, so what I'm seeing is that IF I name it cloud-init AND I return a single "string" from a json.dumps(), then /var/lib/cloud/instance/vendor-data2.txt does contain the actual JSON | 20:08 |
| rm_work[m] | which implies that cloud-init hits that endpoint, pulls out the cloud-init key, and json-decodes it O_o | 20:10 |
| sean-k-mooney | yes it does | 20:10 |
| sean-k-mooney | it will use it as one of the input like user-data | 20:11 |
| sean-k-mooney | so it's a way for you as a cloud admin, for example, to auto install the qemu-guest-agent or do similar customisation | 20:11 |
| sean-k-mooney | now a lot of people don't want the cloud service provider doing that but the functionality exists in cloud-init for it | 20:11 |
| sean-k-mooney | if the content of the string is a cloud-config compatible file, for example, you can use the normal cloud-init modules | 20:12 |
| sean-k-mooney | https://cloudinit.readthedocs.io/en/latest/explanation/about-cloud-config.html#example-cloud-config-file | 20:12 |
| sean-k-mooney | cloudinit.readthedocs.io/en/latest/explanation/vendordata.html | 20:13 |
| sean-k-mooney | Vendor-data follows the same rules as user-data, with the following caveats: | 20:13 |
| sean-k-mooney | Users have ultimate control over vendor-data. They can disable its execution or disable handling of specific parts of multi-part input. | 20:13 |
| sean-k-mooney | By default it only runs on first boot. | 20:13 |
| sean-k-mooney | Vendor-data can be disabled by the user. If the use of vendor-data is required for the instance to run, then vendor-data should not be used. | 20:13 |
| sean-k-mooney | User-supplied cloud-config is merged over cloud-config from vendor-data. | 20:13 |
| sean-k-mooney | rm_work[m]: while it's technically documented behavior of cloud-init, it's not something that operators and users generally expect | 20:14 |
| opendevreview | Merged openstack/nova master: [hacking]Do not mock threading.Event https://review.opendev.org/c/openstack/nova/+/971454 | 20:42 |
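
The JSON handling discussed in the DynamicJSON vendordata thread can be illustrated with a small sketch. It relies only on what the chat states, namely that nova runs `json.loads` on the target's response body. The helper names `parse_vendordata_body` and `is_valid_json` are hypothetical and are not part of nova; the sample cloud-config text is likewise an assumption for illustration.

```python
import json


def parse_vendordata_body(body):
    """Decode a response body with json.loads, as described in the chat."""
    return json.loads(body)


def is_valid_json(body):
    """Return True if the body is a valid JSON document, False otherwise."""
    try:
        json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


# A JSON object: the shape the discussion says nova expects.
object_body = json.dumps({"success": False, "error": "blah"})

# A bare JSON string is also a valid JSON document. This is what the
# single-string cloud-init workaround from the chat depends on.
string_body = json.dumps("#cloud-config\npackages:\n  - qemu-guest-agent\n")

parsed_object = parse_vendordata_body(object_body)
parsed_string = parse_vendordata_body(string_body)

# An HTML error page is not JSON, and neither is a Python dict literal
# written out by hand: JSON spells the boolean as lowercase `false`.
html_ok = is_valid_json("<html><body>error</body></html>")
python_literal_ok = is_valid_json('{"success": False, "error": "blah"}')
```

Serializing with `json.dumps` rather than writing the body by hand avoids the Python-literal pitfall, since it emits `false` instead of `False`.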
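
The proactive volume status check proposed in the live migration thread could look roughly like the sketch below. This is not the code from the patch under review and does not use nova or cinder APIs: the `Volume` class, the `VolumeNotReady` exception, the `check_volumes_ready` function, and the set of acceptable statuses are all hypothetical, chosen only to show the idea of failing early, before any call to the destination node.

```python
from dataclasses import dataclass

# Assumption: only "in-use" volumes are considered safe to migrate.
# The real set of acceptable statuses is a decision for the actual patch.
MIGRATABLE_STATUSES = frozenset({"in-use"})


@dataclass
class Volume:
    """Hypothetical stand-in for a volume record with an id and a status."""
    id: str
    status: str


class VolumeNotReady(Exception):
    """Raised when an attached volume is in a state unsuitable for migration."""


def check_volumes_ready(volumes):
    """Raise VolumeNotReady if any volume is busy, e.g. 'backing-up'."""
    busy = [v for v in volumes if v.status not in MIGRATABLE_STATUSES]
    if busy:
        details = ", ".join(f"{v.id} ({v.status})" for v in busy)
        raise VolumeNotReady(
            f"volumes not ready for live migration: {details}")


# Usage: a volume in 'backing-up' status is rejected up front, so the
# pre_live_migration rollback path is never entered for this case.
try:
    check_volumes_ready([Volume("vol-1", "in-use"),
                         Volume("vol-2", "backing-up")])
    rejected = False
except VolumeNotReady as exc:
    rejected = True
    message = str(exc)
```

A check like this is inherently racy, since a volume can change state after the check passes, so it would reduce rather than replace the need for the rollback handling discussed above.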