*** haleyb is now known as haleyb|out | 00:38 | |
opendevreview | Takashi Kajinami proposed openstack/nova master: SEV; Add TODO note about iommu setting done in recent QEMU https://review.opendev.org/c/openstack/nova/+/909635 | 03:59 |
opendevreview | Takashi Kajinami proposed openstack/nova master: SEV: Add TODO note about iommu setting done in recent QEMU https://review.opendev.org/c/openstack/nova/+/909635 | 04:01 |
adam-metal3 | Hi all! I am new to using openstack sdk and in general I am new to most openstack components. After reading the nova and openstacksdk docs I still have a question, and I hope someone could give me some direction. If I understand correctly, multiple volumes can be attached to a VM and the VM can be booted from a volume. | 07:47 |
adam-metal3 | Could there be a scenario where multiple bootable volumes are attached to a VM, and if yes, how can I tell via the openstacksdk which volume was used for booting in this scenario? | 07:47 |
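No answer followed in the channel. One hedged approach, shown as a sketch rather than a definitive recipe: the compute API exposes the server's root device name (OS-EXT-SRV-ATTR:root_device_name, subject to the cloud's policy), so matching it against the device of each volume attachment identifies the boot volume even when several attached volumes are bootable. The cloud name and server id below are placeholders.

```python
# Sketch only: identify the boot volume of a boot-from-volume server with openstacksdk.
# Assumes the cloud's policy exposes OS-EXT-SRV-ATTR:root_device_name to the caller.
import openstack

conn = openstack.connect(cloud="mycloud")          # placeholder cloud name
server = conn.compute.get_server("my-server-id")   # placeholder server id

root_dev = server.root_device_name                 # e.g. "/dev/vda"
for attachment in conn.compute.volume_attachments(server):
    if attachment.device == root_dev:
        print("booted from volume:", attachment.volume_id)
```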
LarsErikP | jamespage: Hi! sorry if I'm getting annoying :P Any news on a SRU for zed containing nova 26.2.1? 👼 | 07:52 |
opendevreview | melanie witt proposed openstack/nova master: WIP Support encrypted backing files for qcow2 https://review.opendev.org/c/openstack/nova/+/907961 | 08:35 |
opendevreview | melanie witt proposed openstack/nova master: Support cross cell resize with ephemeral encryption for qcow2 https://review.opendev.org/c/openstack/nova/+/909595 | 08:35 |
opendevreview | melanie witt proposed openstack/nova master: libvirt: Introduce support for raw with LUKS https://review.opendev.org/c/openstack/nova/+/884313 | 08:35 |
opendevreview | melanie witt proposed openstack/nova master: libvirt: Introduce support for rbd with LUKS https://review.opendev.org/c/openstack/nova/+/889912 | 08:35 |
opendevreview | Rajesh Tailor proposed openstack/nova master: Add support for showing requested az in output https://review.opendev.org/c/openstack/nova/+/904568 | 08:59 |
opendevreview | Takashi Kajinami proposed openstack/nova master: SEV: Add TODO note about iommu setting done in recent QEMU https://review.opendev.org/c/openstack/nova/+/909635 | 09:29 |
mnasiadka | Hello, I've raised https://bugs.launchpad.net/nova/+bug/2054437 - is that maybe by design or is that an actual bug? | 10:24 |
*** mklejn_ is now known as mklejn | 10:32 | |
opendevreview | Dmitry Tantsur proposed openstack/nova master: ironic: clean up references to memory_mb/cpus/local_gb https://review.opendev.org/c/openstack/nova/+/878418 | 11:18 |
* bauzas is back, was on PTO this morning | 12:30 | |
gibi | Uggla: sean-k-mooney: bauzas: discussing an edge case in the manila series | 13:36 |
gibi | when the user detaches a share from a VM via the REST API and this is the last user of the share on the compute | 13:36 |
gibi | the share is unmounted | 13:37 |
gibi | lets assume that the unmounting fails on the compute | 13:37 |
gibi | should nova delete the share mapping and leak the mount | 13:37 |
sean-k-mooney | ... the detach instance action should go to error | 13:37 |
gibi | OR | 13:37 |
sean-k-mooney | i think | 13:37 |
sean-k-mooney | and allow detach again via the api | 13:38 |
gibi | should nova put the mapping to ERROR and let the user retry | 13:38 |
sean-k-mooney | alternatively we can have a periodic | 13:38 |
sean-k-mooney | i think letting the user retry is good | 13:38 |
gibi | then we are on the same page | 13:38 |
sean-k-mooney | but nova could also retry if we wanted in a periodic | 13:38 |
gibi | periodic can be added separately yes | 13:38 |
Uggla | @sean-k-mooney, ok but it means the VM cannot be restarted until the error on the share is gone. | 13:39 |
sean-k-mooney | the periodic only comes to mind as the share being mounted may mean that the data can be accessed on the host | 13:39 |
sean-k-mooney | and if there was a vm breakout it might expose more data to that malicious guest than would otherwise be there | 13:39 |
Uggla | @sean-k-mooney, if I'm not wrong you wanted to be able to start the vm even if the share is in an error state, for debugging. Am I wrong? | 13:42 |
bauzas | sorry, I was preparing a meeting | 13:44 |
* bauzas looking above | 13:44 | |
bauzas | gibi: I would agree with sean-k-mooney, the instance action should be an error | 13:46 |
bauzas | so the user would need to detach the share again | 13:46 |
sean-k-mooney | i think we could allow starting the vm but not attach the errored share to it | 13:46 |
sean-k-mooney | or we could prevent that and require you to do the detach again before starting | 13:47 |
sean-k-mooney | i'm not sure what the preferred approach is here | 13:47 |
bauzas | sean-k-mooney: if the compute is not able to correctly unmount the share, then we should continue to have a share mapping | 13:48 |
bauzas | I think Uggla's series doesn't support already-mounted shares without having share mappings | 13:48 |
bauzas | Uggla: right? | 13:48 |
sean-k-mooney | thats fine | 13:48 |
sean-k-mooney | we can keep the vm stopped with the share attached | 13:49 |
sean-k-mooney | and the instance action in error | 13:49 |
sean-k-mooney | and then the user can either start the vm or try the detach again | 13:49 |
bauzas | yeah | 13:49 |
sean-k-mooney | depending on what they want | 13:49 |
bauzas | if they start the instance, they will have the share in it | 13:49 |
sean-k-mooney | if that is the approach we take then we don't need a periodic | 13:49 |
Uggla | I could add a force detach option that would allow the instance to start? (it would leak the share, but avoid db surgery) | 13:50 |
sean-k-mooney | yep | 13:50 |
sean-k-mooney | Uggla: please no :) we have scars from every time we add force | 13:50 |
Uggla | ok | 13:50 |
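The behaviour agreed above (keep the share mapping when the host-side unmount fails, put the detach instance action in error, and let the user retry) can be summarised as pseudocode. Every name below is hypothetical; this is a schematic of the discussion, not the nova implementation.

```python
# Schematic of the agreed detach semantics; all helpers are hypothetical.
def detach_share(instance, share_mapping):
    try:
        if not other_local_instances_use(share_mapping.share_id):
            unmount_on_host(share_mapping)               # may fail on the compute
    except Exception:
        # Keep the mapping so nothing is leaked; the user can retry the detach
        # (or start the VM again with the share still attached).
        mark_instance_action_event_failed(instance, "detach_share")
        raise
    # Only remove the DB record once the host-side unmount has succeeded.
    delete_share_mapping(share_mapping)
```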
bauzas | Uggla: I can't remember, does your series support looking at shares that are already mounted? | 13:50 |
sean-k-mooney | i think if we just start with the share mounted as if you never tried the detach then we are good | 13:50 |
sean-k-mooney | we just need to make sure that we don't update the vm unless the host unmount works | 13:51 |
Uggla | bauzas, yes you will be able to see the share in error state. | 13:51 |
sean-k-mooney | Uggla: the share does not need to be in error | 13:51 |
sean-k-mooney | the share can stay in attached/in-use | 13:51 |
sean-k-mooney | the instance action should be in error | 13:51 |
sean-k-mooney | oh actually | 13:52 |
sean-k-mooney | detach is not an instance action in the normal case | 13:52 |
sean-k-mooney | do we have an instance action event for detach | 13:52 |
sean-k-mooney | if we are going to boot the vm with the share attached then the share attachment state should not be error | 13:53 |
Uggla | sean-k-mooney, yes, there will be an instance action for detach. | 13:53 |
bauzas | Uggla: I need to look again at the spec then | 13:53 |
sean-k-mooney | ack, then yeah, only that needs to go to error | 13:53 |
sean-k-mooney | everything else can keep its previous status | 13:53 |
bauzas | because I think users shouldn't know whether the share is already mounted or not | 13:53 |
sean-k-mooney | correct | 13:53 |
sean-k-mooney | we should not leak that | 13:53 |
sean-k-mooney | we should just report we failed to detach the share from the vm | 13:54 |
bauzas | tbc, I thought the detail was: | 13:54 |
sean-k-mooney | and unmounting it is part of that internally, but not something we should tell the user via the api | 13:54 |
bauzas | as a user, I want to attach an instance to a share | 13:54 |
bauzas | it will create a share mapping | 13:54 |
bauzas | eventually, when I'm starting the instance, nova looks at the share | 13:55 |
bauzas | if the share is already mounted, cool | 13:55 |
bauzas | if the share needs to be mounted, all cool too, it does it now | 13:55 |
bauzas | right? | 13:55 |
sean-k-mooney | thats my expectation | 13:55 |
sean-k-mooney | the user contract is: if a share mapping exists between an instance and a share | 13:56 |
Uggla | bauzas, no. The user needs to ensure the share is in active status (meaning it is mounted and the VM and share are linked) | 13:56 |
sean-k-mooney | if we start the instance then nova will do whatever is required to ensure the vm starts with the share | 13:56 |
bauzas | Uggla: okay, hence my needing to look at your spec | 13:57 |
Uggla | I mean if you start the VM and the share is still in "attaching status" then it will fail. | 13:57 |
bauzas | Uggla: when is the share checked whether it's mounted ? | 13:57 |
bauzas | inactive == mounted, while active == mounted, right ? | 13:57 |
sean-k-mooney | Uggla: that sounds wrong to me | 13:57 |
bauzas | doh | 13:57 |
bauzas | inactive == unmounted | 13:57 |
bauzas | there it goes, opening https://specs.openstack.org/openstack/nova-specs/specs/2024.1/approved/libvirt-virtiofs-attach-manila-shares.html | 13:58 |
sean-k-mooney | the user should not have to poll the share mapping status to start the vm | 13:58 |
sean-k-mooney | when the vm is stopped and we attach a share | 13:58 |
sean-k-mooney | we should be able to do the start immediately | 13:58 |
gibi | but then there will be a race between the mount RPC and the start RPC | 13:59 |
bauzas | "Mount operation will be done when the share is not mounted on the compute host. If a previous share would have been mounted on the compute host for another server, then it will attempt to mount it and a warning will be logged that the share was already mounted." | 13:59 |
Uggla | bauzas, nothing = nothing, attaching = attach in progress, inactive = share mounted and ready, active = VM up and share readable | 13:59 |
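A compact restatement of the states Uggla lists above, with the transitions implied by the rest of the discussion (attach creates the mapping, start/stop move between inactive and active, detach removes it). This is an illustration; the real field values and transitions live in the nova code and spec.

```python
# Illustration of the share-mapping lifecycle described above; not nova code.
SHARE_MAPPING_STATES = {
    None: "no mapping yet ('nothing'); attach creates one in 'attaching'",
    "attaching": "attach in progress; moves to 'inactive' once the attach completes",
    "inactive": "share mounted and ready, VM not using it; start -> 'active', detach -> removed",
    "active": "VM up and using the share; stop -> 'inactive'",
}
```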
bauzas | it doesn't say when we mount | 13:59 |
sean-k-mooney | well the mount api call can only return after the mapping is created in the db | 13:59 |
sean-k-mooney | so there should not be a race | 13:59 |
sean-k-mooney | or rather there does not need to be a race | 14:00 |
gibi | the race would be about the host mount not about the DB | 14:00 |
sean-k-mooney | right but spawn should be ensuring the mount | 14:00 |
bauzas | Uggla: nothing == nothing, that's the "phi" you have in the spec ? | 14:00 |
gibi | but we have a separate RPC to ensure the mount | 14:00 |
Uggla | yep. | 14:00 |
gibi | it is not the spawn that does the host mount | 14:00 |
bauzas | gibi: when is called that RPC method ? | 14:01 |
sean-k-mooney | that sounds incorrect to me | 14:01 |
bauzas | shit, I need to go to a meeting | 14:01 |
gibi | bauzas: during the attach REST API call | 14:01 |
bauzas | ah, now I remember, that's the async RPC call | 14:01 |
gibi | if I understood Sean correctly they want to do the host mount during spawn instead | 14:02 |
sean-k-mooney | yep | 14:02 |
bauzas | I'm fine whenever | 14:02 |
sean-k-mooney | i would also expect the share to be unmounted when we stop the vm, for what it's worth | 14:02 |
bauzas | my point is that the share mount can be done whenever we want, at least *before* we start | 14:02 |
sean-k-mooney | because i would expect hard reboot to unmount it and remount it | 14:03 |
sean-k-mooney | since we normally use hard reboot to fix things like that | 14:03 |
bauzas | sean-k-mooney: you can attach the same mount multiple times | 14:03 |
sean-k-mooney | thats fine | 14:03 |
sean-k-mooney | the mount and unmount could be based on the first/last active vm | 14:03 |
sean-k-mooney | or better, each instance could have its own mount and we could avoid reusing it. that's what i originally wanted in the spec | 14:04 |
bauzas | sean-k-mooney: fwiw, in the spec it says that an instance stop does make the share unmounted | 14:04 |
sean-k-mooney | to have the same share mounted multiple times in separate locations | 14:04 |
sean-k-mooney | but i know that is not what is in the code | 14:04 |
bauzas | so it's already fine | 14:04 |
bauzas | (if the implementation is the same) | 14:04 |
sean-k-mooney | so a few weeks ago i was surprised | 14:04 |
gibi | I guess the RPC during the attach REST API call is created to prepare for the future when a share can be mounted live without a stop/start cycle. | 14:05 |
sean-k-mooney | when the share was not mounted under the instance state dir | 14:05 |
sean-k-mooney | because i was originally expecting the share not to be shared between instances on the same host | 14:05 |
bauzas | I think the word "inactive" is correct | 14:05 |
bauzas | nothing is using the share | 14:05 |
bauzas | but we still have the reservation | 14:05 |
sean-k-mooney | bauzas: the active/inactive state of a share mapping can't be mapped to whether it's mounted on the host | 14:06 |
bauzas | I hope so | 14:06 |
sean-k-mooney | since if we are sharing the mount point between vms it would depend on whether other vms on the host are using it | 14:06 |
bauzas | yeah, inactive just means that the instance doesn't use the share but it's still 'mapped' to it | 14:06 |
bauzas | whether the share is unmounted or not, that's something internal | 14:07 |
bauzas | as you say, the share could continue to be mounted, because it's used by other instances | 14:07 |
bauzas | now, back to the original question from gibi, | 14:07 |
bauzas | there are two possibilities | 14:08 |
sean-k-mooney | for me active means we have done what is required in nova and manila so that the vm can use it if you start the vm | 14:08 |
Uggla | sean-k-mooney, no, active means the VM is up and uses the share. | 14:08 |
sean-k-mooney | then it has no extra info | 14:09 |
bauzas | IMHO, a detach action that eventually goes into unmounting, and which is unable to do so, should put the instance action in error | 14:09 |
sean-k-mooney | that is what instance status=active means | 14:09 |
bauzas | or we need a periodic | 14:09 |
bauzas | but honestly, I think we should say the truth | 14:09 |
sean-k-mooney | the active status of a share mapping should not depend on the vm status (active vs powered off) | 14:10 |
bauzas | that's written in the spec | 14:10 |
sean-k-mooney | these are two different logical resources in the api | 14:10 |
gibi | bauzas: it is not about the instance action but about the state of the sharemapping. We can keep it inactive or we can put it in error. | 14:11 |
bauzas | gibi: I got that, lemme summarize the exception handling case | 14:12 |
bauzas | user stops a VM, share mapping goes inactive | 14:12 |
sean-k-mooney | looking at the Share mapping status diagram in the spec | 14:13 |
bauzas | when it stops the VM, some internal call is doing two actions : | 14:13 |
bauzas | 1/ detach the share from the VM | 14:13 |
sean-k-mooney | i can see where you were going with it but it does not follow the same conventions as cinder or neutron ports | 14:13 |
bauzas | 2/ umount the share if not used elsewhere | 14:13 |
gibi | bauzas: OK so you suggest to do the unmount during stop, not during the detach REST API call | 14:14 |
sean-k-mooney | well the spec says we do the unmount during stop | 14:14 |
bauzas | nah, sorry | 14:14 |
bauzas | I'm waaaay confused | 14:14 |
bauzas | user has to do two actions then | 14:14 |
bauzas | stop then detach | 14:14 |
bauzas | right? | 14:14 |
Uggla | yep | 14:14 |
bauzas | and detach does two things | 14:15 |
bauzas | (internally) | 14:15 |
bauzas | one is updating the guest XML | 14:15 |
gibi | nope | 14:15 |
bauzas | the other is to umount the share | 14:15 |
gibi | the vm is stopped no need to update the XML | 14:15 |
Uggla | gibi ++ | 14:15 |
bauzas | my head is full of vGPU crap for now | 14:15 |
gibi | the XML is generated during start based on the share_mappings | 14:16 |
bauzas | I have to reload that context | 14:16 |
sean-k-mooney | the vm is stopped, meaning the share is also unmounted already | 14:16 |
gibi | not | 14:16 |
gibi | today | 14:16 |
sean-k-mooney | in the spec that's what the diagram states | 14:16 |
gibi | today stop is not unmounting the share from the host | 14:16 |
gibi | OK, then that is a gap between the spec and the code | 14:16 |
sean-k-mooney | well that is not in line with the spec then | 14:16 |
sean-k-mooney | yep | 14:16 |
sean-k-mooney | what's in the spec i think aligns with what i would like to see | 14:17 |
gibi | do we want to move the host unmount to stop? | 14:17 |
sean-k-mooney | if we do that then detach can't fail in this edge case | 14:17 |
gibi | and do we want to fail the stop action if unmount fails? | 14:17 |
bauzas | do we all agree on the fact that unmount is not strictly required ? | 14:18 |
sean-k-mooney | and i believe unmounting the share (if no other vm is using it) in stop is also more secure | 14:18 |
gibi | bauzas: do you mean, is it OK to leak mounts? | 14:18 |
sean-k-mooney | bauzas: no | 14:18 |
gibi | I'm not sure what happens with a mount if the export on the manila side is removed | 14:18 |
bauzas | umount is not strictly required if we still have a share mapping, right ? | 14:18 |
gibi | bauzas: if you mean we have a record of the mount so we can do the unmount later, then yes | 14:19 |
sean-k-mooney | as far as i'm concerned the attach and detach RPCs should not try to mount/unmount the share | 14:19 |
bauzas | gibi: I'm going back to my original statement : I don't want to have share leaks | 14:19 |
bauzas | gibi: if we're not able to guarantee that some share mapping state will continue to track shares that are still left over on the compute, then we need to error the detach call | 14:20 |
sean-k-mooney | from the spec | 14:20 |
sean-k-mooney | """Share attachment/detachment can only be done if the VM state is STOPPED or ERROR. These are operations only on the database, and no RPC calls will be required to the compute API. This is an intentional design for this spec. As a result, this could lead to situation where the VM start operation fails as an underlying share attach fails.""" | 14:20 |
sean-k-mooney | that ^ is what the code should be doing | 14:20 |
gibi | bauzas: I agree | 14:20 |
bauzas | the phi-state in the diagram actually needs to be explained more | 14:21 |
gibi | OK. so If we follow that statement from the spec then both the mount and unmount RPC is unnecessary | 14:21 |
sean-k-mooney | correct | 14:21 |
sean-k-mooney | its just a cell db update | 14:21 |
bauzas | I have a vgpu meeting in 8 mins so I'll need to bail out shortly | 14:21 |
sean-k-mooney | to create a row or delete it | 14:21 |
gibi | this also means we will host mount during start and unmount during stop / delete | 14:22 |
sean-k-mooney | yep | 14:22 |
sean-k-mooney | if start fails to mount we go to error, where you can either retry or detach the share | 14:22 |
gibi | so if the mount / umount fails start or stop action can fail and put the VM state in ERROR | 14:22 |
sean-k-mooney | yep | 14:22 |
sean-k-mooney | thats what we approved | 14:22 |
sean-k-mooney | and all the later assumptions of the spec are based on that | 14:23 |
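Restating the spec-aligned flow agreed here as a schematic: attach/detach are DB-only operations, the compute mounts shares during start and unmounts them during stop, and a mount/unmount failure errors that action. All helper names are hypothetical; this is not the nova code.

```python
# Schematic of the spec-aligned flow discussed above; helper names are hypothetical.
def attach_share(instance, share):
    # REST API level: a pure DB update, no RPC to the compute host.
    create_share_mapping(instance, share, status="inactive")

def start_instance(instance):
    for mapping in share_mappings_for(instance):
        mount_on_host(mapping)            # failure -> start fails, VM goes to ERROR
    spawn_guest_with_shares(instance)

def stop_instance(instance):
    destroy_guest(instance)
    for mapping in share_mappings_for(instance):
        if not used_by_other_local_instances(mapping):
            unmount_on_host(mapping)      # failure -> stop fails, VM goes to ERROR
```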
gibi | Uggla: do you have a reasoning why we diverged from the spec in this regard? Were there any late findings during implementation? | 14:23 |
gibi | or in other words, what issues do you foresee if we do what the spec says and move the host mount / unmount? | 14:25 |
Uggla | gibi, yes but I do not remember well. I guess the reason was simpler error mgmt and simplifying later online attach. | 14:27 |
sean-k-mooney | well this discussion shows that the error management is more complex | 14:28 |
gibi | online attach is a good topic. When we have that in libvirt we will need an RPC during the attach REST API call. | 14:28 |
sean-k-mooney | yes, although i do not think that is relevant here | 14:29 |
bauzas | I think the intent was to let the user restart its instance if it was erroring due to transient issues | 14:29 |
sean-k-mooney | yes we will need an rpc but that rpc would only be successful if the share was both mounted and attached to the vm | 14:29 |
sean-k-mooney | it should likely revert the share mounting on the host if the libvirt attach failed | 14:30 |
sean-k-mooney | so you can try attach again | 14:30 |
bauzas | that said, there are billing implications on a stop action that would fail due to some internal problem | 14:30 |
sean-k-mooney | we explicitly allow detach in ERROR | 14:30 |
sean-k-mooney | to make sure that is not a problem | 14:30 |
bauzas | I don't think our public cloud providers would like their users to continue to be charged because they have some network issues | 14:30 |
gibi | I think bauzas refers to stop failing, not detach failing | 14:30 |
sean-k-mooney | if stop fails you can detach and then stop again | 14:30 |
bauzas | gibi: correct | 14:31 |
gibi | ahh | 14:31 |
gibi | you cannot detach if it is not stopped | 14:31 |
sean-k-mooney | the spec says you can do it in the error state | 14:31 |
sean-k-mooney | also, from the ordering, the vm would be stopped, then the detach would fail | 14:31 |
sean-k-mooney | *then the unmount would fail | 14:32 |
gibi | so stop fails, puts the VM in ERROR, we allow detach in ERROR, so user does the detach, detach deletes the mapping, we lost the mapping while the unmount never happened | 14:32 |
sean-k-mooney | so it should be safe to allow the detach to update the db. | 14:32 |
sean-k-mooney | yep, and that's where a periodic would come in | 14:33 |
sean-k-mooney | to allow that to be healed | 14:33 |
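A schematic of the healing periodic mentioned here, for the case gibi describes (stop fails, detach in ERROR deletes the mapping, and the host mount is left behind). Everything below is hypothetical pseudocode, not an existing nova task.

```python
# Hypothetical periodic task to clean up share mounts that lost their mapping.
import logging

LOG = logging.getLogger(__name__)

def cleanup_leaked_share_mounts(context):
    for mount in list_share_mounts_on_host():              # what is mounted locally
        if not share_mappings_referencing(context, mount.share_id):
            try:
                unmount_on_host(mount)
            except Exception:
                LOG.warning("Could not clean up leaked share mount %s; "
                            "will retry on the next run", mount.export_path)
```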
* bauzas on a meeting by now | 14:34 | |
sean-k-mooney | same | 14:34 |
gibi | then lets park this for after the meeting | 14:34 |
opendevreview | Takashi Kajinami proposed openstack/os-vif master: Drop wrong stacklevel https://review.opendev.org/c/openstack/os-vif/+/909682 | 15:05 |
tkajinam | ^^^ I noticed I've made this mistake in a few places... it'd be nice if this can be merged and included in the final release for 2024.1 ... | 15:23 |
opendevreview | Rajesh Tailor proposed openstack/nova master: Add support for showing requested az in output https://review.opendev.org/c/openstack/nova/+/904568 | 15:29 |
opendevreview | Stephen Finucane proposed openstack/nova-specs master: Add openapi spec https://review.opendev.org/c/openstack/nova-specs/+/909448 | 18:06 |
artom | So, I tried running with CPU power management defaulted to on, and got this: https://review.opendev.org/c/openstack/whitebox-tempest-plugin/+/909785/1 | 20:05 |
artom | Interesting libvirt failures launching, for example, with an emulator thread policy | 20:05 |
artom | I wonder if the job is doing something wrong, or are there bugs... | 20:06 |
sean-k-mooney | is that using isolate | 20:09 |
artom | <cputune> | 20:10 |
artom | <vcpupin vcpu='0' cpuset='6'/> | 20:10 |
artom | <vcpupin vcpu='1' cpuset='5'/> | 20:10 |
artom | <emulatorpin cpuset='4'/> | 20:10 |
artom | </cputune> | 20:10 |
artom | That looks like `share` | 20:10 |
sean-k-mooney | its only share if cpu_shared_set=4 | 20:11 |
sean-k-mooney | cpu_dedicated_set = 4-6 | 20:11 |
sean-k-mooney | that is isolate | 20:11 |
sean-k-mooney | https://763b7c31d44017973024-0677f910aec9b59957afc606ff2e533f.ssl.cf2.rackcdn.com/909785/1/check/whitebox-devstack-multinode/532b307/compute/logs/etc/nova/nova-cpu_conf.txt | 20:12 |
sean-k-mooney | so yes, there is a bug: it's not onlining the isolated cores | 20:12 |
sean-k-mooney | there will be 1 and it's included in the instance numa topology | 20:12 |
artom | That's on the compute | 20:12 |
artom | cpu_dedicated_set = 4-6 | 20:12 |
artom | cpu_shared_set = 2,3 | 20:12 |
sean-k-mooney | oh this was on the controller | 20:13 |
artom | No you got it right, it's the compute | 20:13 |
sean-k-mooney | so yeah, sounds like a trivial bug with isolate | 20:13 |
sean-k-mooney | that should be simple to reproduce and fix | 20:13 |
artom | I think there are others | 20:14 |
artom | It's failing with a similar error on NUMA live migration... | 20:15 |
sean-k-mooney | that is the only one that can use a core from the cpu dedicated set that is not for a vcpu | 20:15 |
artom | But I need to track down the exact instance that it's live migrating... | 20:15 |
sean-k-mooney | so isolate is the only way we have a non vcpu pinned to the cpu_dedicated_set | 20:16 |
sean-k-mooney | for numa live migration this would cause issues if we don't have Rene's fix for the cpu pinning | 20:16 |
artom | Which we don't | 20:17 |
sean-k-mooney | well we should merge and backport that fix | 20:17 |
sean-k-mooney | for now can you file a bug for isolate | 20:17 |
sean-k-mooney | https://review.opendev.org/c/openstack/nova/+/877773 | 20:18 |
sean-k-mooney | so ya still open | 20:18 |
artom | Yeah, I guess he's concentrating on the manila stuff | 20:18 |
sean-k-mooney | which is fair | 20:19 |
* artom tries depending on it to see if it reduces the number of failures | 20:20 | |
sean-k-mooney | but that's why that is broken | 20:20 |
sean-k-mooney | it should, assuming it's not in merge conflict | 20:20 |
artom | I'll see you in two hours ;) | 20:21 |
sean-k-mooney | i was going to say i wont be around then.... | 20:21 |
artom | Which is entirely legitimate | 20:21 |
sean-k-mooney | but when has that happened when i planned | 20:21 |
artom | That's between you and your sense of discipline ;) | 20:22 |
sean-k-mooney | and my stomach | 20:23 |
sean-k-mooney | for some reason it has notions that i should feed it at least once a day | 20:23 |
artom | Pfft, what does it think it is | 20:24 |
sean-k-mooney | so the bug is here by the way https://github.com/openstack/nova/blob/master/nova/virt/libvirt/cpu/api.py#L105 | 20:25 |
sean-k-mooney | i could have sworn instance.numa_topology.cpu_pinning also had the isolated core for emulator threads | 20:25 |
sean-k-mooney | when using that policy | 20:26 |
artom | More in power_up(), but yeah | 20:26 |
artom | Both, really | 20:26 |
sean-k-mooney | so based on the docstring https://github.com/openstack/nova/blob/master/nova/objects/instance_numa.py#L291-L296 | 20:28 |
sean-k-mooney | it should be all the cores for the instance | 20:29 |
artom | When has a docstring ever been wrong ;) | 20:29 |
sean-k-mooney | i said this should be easy but this behaviour is buried deep in the cpu pinning code | 20:39 |
sean-k-mooney | looking at it quickly https://github.com/openstack/nova/blob/3209f6551652cff7bef0b9d9719ab940dd05a0f8/nova/virt/hardware.py#L2693-L2709 | 20:39 |
sean-k-mooney | i think it should be working... | 20:40 |
artom | Yeah, same, the more I look, the more I'm lost | 20:40 |
sean-k-mooney | as far as i can tell we are correctly reserving 1 additional core in the vm's numa node 0 for the isolated thread | 20:40 |
sean-k-mooney | and then storing it in the same set/dict as all the rest | 20:41 |
sean-k-mooney | that's what if instance_cell.cpuset_reserved: | 20:41 |
sean-k-mooney | pinned_cpus |= instance_cell.cpuset_reserved | 20:41 |
sean-k-mooney | is doing | 20:41 |
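A small self-contained illustration of the merge quoted above, using the numbers from this instance (vCPUs pinned to host cores 6 and 5, emulator thread reserved on core 4):

```python
# The reserved emulator-thread core is folded into the same set as the vCPU pins,
# which works as long as the two sets are disjoint (as they should be).
cpu_pinning = {0: 6, 1: 5}     # vCPU -> host core, matching the cputune XML above
cpuset_reserved = {4}          # extra core reserved for the emulator thread

pinned_cpus = set(cpu_pinning.values())
if cpuset_reserved:
    pinned_cpus |= cpuset_reserved

print(pinned_cpus)             # {4, 5, 6}: all three host cores are claimed
```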
sean-k-mooney | anyway, first step: file a bug | 20:42 |
sean-k-mooney | second step: create a reproducer | 20:42 |
artom | So at least for that same instance | 20:43 |
artom | With the emulator thread on host CPU 4 | 20:43 |
artom | It logs this: | 20:44 |
artom | Feb 21 19:15:30.167111 [compute/logs/screen-n-cpu.txt] np0036828693 nova-compute[42254]: DEBUG nova.virt.hardware [None req-ec45061f-e9a4-4b02-9354-0cb390bd28cf tempest-EmulatorThreadTest-1184416592 tempest-EmulatorThreadTest-1184416592-project-member] Selected cores for pinning: [(0, 6), (1, 5)], in cell 0 {{(pid=42254) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:905}} | 20:44 |
sean-k-mooney | so i'm suspicious that https://github.com/openstack/nova/blob/3209f6551652cff7bef0b9d9719ab940dd05a0f8/nova/virt/hardware.py#L2689-L2694 | 20:44 |
sean-k-mooney | does not have a bug | 20:44 |
sean-k-mooney | but i can't fully say why i feel like that | 20:44 |
artom | So it doesn't log the "extra" emulator thread core | 20:45 |
sean-k-mooney | right | 20:46 |
artom | And doesn't pin it at https://github.com/openstack/nova/blob/3209f6551652cff7bef0b9d9719ab940dd05a0f8/nova/virt/hardware.py#L908 | 20:46 |
sean-k-mooney | just doing an or | 20:46 |
sean-k-mooney | pinned_cpus |= instance_cell.cpuset_reserved | 20:46 |
sean-k-mooney | would work if cpuset_reserved is disjoint from instance_cell.cpu_pinning.values() | 20:46 |
sean-k-mooney | which it should be | 20:46 |
sean-k-mooney | but... | 20:47 |
sean-k-mooney | so do you see this debug message https://github.com/openstack/nova/blob/3209f6551652cff7bef0b9d9719ab940dd05a0f8/nova/virt/hardware.py#L663-L673 | 20:48 |
sean-k-mooney | Packing an instance onto a set of sibling ... | 20:48 |
artom | Feb 21 19:15:30.165439 [compute/logs/screen-n-cpu.txt] np0036828693 nova-compute[42254]: DEBUG nova.virt.hardware [None req-ec45061f-e9a4-4b02-9354-0cb390bd28cf tempest-EmulatorThreadTest-1184416592 tempest-EmulatorThreadTest-1184416592-project-member] Packing an instance onto a set of siblings: host_cell_free_siblings: [{6}, set(), {5}, {4}, set()] instance_cell: InstanceNUMACell(cpu_pinning_raw=None,cpu_policy='dedi | 20:49 |
artom | cated',cpu_thread_policy=None,cpu_topology=<?>,cpuset=set([]),cpuset_reserved=None,id=0,memory=64,pagesize=None,pcpuset=set([0,1])) host_cell_id: 0 threads_per_core: 1 num_cpu_reserved: 1 {{(pid=42254) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:663}} | 20:49 |
sean-k-mooney | ok so yeah, it's asking for the isolated core | 20:50 |
artom | Well, we know the policy itself works | 20:50 |
artom | But is it recording the extra pinned core in a way that the CPU power management code can then power on/off | 20:50 |
sean-k-mooney | so we should run https://github.com/openstack/nova/blob/3209f6551652cff7bef0b9d9719ab940dd05a0f8/nova/virt/hardware.py#L754-L782 | 20:50 |
sean-k-mooney | it's recording it in cpuset_reserved | 20:51 |
sean-k-mooney | which is then |= with the other cores | 20:51 |
artom | The comment on that says that it's reserved for the hypervisor | 20:51 |
artom | In the object | 20:51 |
sean-k-mooney | in this case hypervisor means qemu | 20:52 |
sean-k-mooney | its reserved for the qemu emulator thread | 20:52 |
artom | Hah | 20:52 |
artom | So should the power management code be powering down/up cpuset_reserved as well? | 20:52 |
artom | Without going through the whole paperwork, we can do a quick POC and Depends-on in whitebox to see | 20:53 |
sean-k-mooney | it should already be included in pinned_cpus | 20:53 |
sean-k-mooney | but yes | 20:53 |
sean-k-mooney | it should be powering up the emulator thread core | 20:53 |
sean-k-mooney | do you see this message https://github.com/openstack/nova/blob/3209f6551652cff7bef0b9d9719ab940dd05a0f8/nova/virt/hardware.py#L776-L777 | 20:53 |
artom | It just checks cpu_pinning... | 20:54 |
sean-k-mooney | yep | 20:54 |
sean-k-mooney | Feb 21 19:15:30.166825 np0036828693 nova-compute[42254]: INFO nova.virt.hardware [None req-ec45061f-e9a4-4b02-9354-0cb390bd28cf tempest-EmulatorThreadTest-1184416592 tempest-EmulatorThreadTest-1184416592-project-member] Computed NUMA topology reserved pCPUs: usable pCPUs: [[], [], [4]], reserved pCPUs: {4} | 20:55 |
sean-k-mooney | but notice anything odd? | 20:55 |
sean-k-mooney | we have these two logs back to back | 20:55 |
sean-k-mooney | Feb 21 19:15:30.166472 np0036828693 nova-compute[42254]: INFO nova.virt.hardware [None req-ec45061f-e9a4-4b02-9354-0cb390bd28cf tempest-EmulatorThreadTest-1184416592 tempest-EmulatorThreadTest-1184416592-project-member] Computed NUMA topology CPU pinning: usable pCPUs: [[6], [5], [4]], vCPUs mapping: [(0, 6), (1, 5)] | 20:55 |
sean-k-mooney | Feb 21 19:15:30.166825 np0036828693 nova-compute[42254]: INFO nova.virt.hardware [None req-ec45061f-e9a4-4b02-9354-0cb390bd28cf tempest-EmulatorThreadTest-1184416592 tempest-EmulatorThreadTest-1184416592-project-member] Computed NUMA topology reserved pCPUs: usable pCPUs: [[], [], [4]], reserved pCPUs: {4} | 20:55 |
sean-k-mooney | it calculated both | 20:56 |
sean-k-mooney | and we should be merging them | 20:56 |
sean-k-mooney | but then it prints | 20:57 |
sean-k-mooney | Selected cores for pinning: [(0, 6), (1, 5)], in cell 0 {{(pid=42254) _pack_instance_onto_cores /opt/stack/nova/nova/virt/hardware.py:905}} | 20:57 |
sean-k-mooney | which is from here https://github.com/openstack/nova/blob/3209f6551652cff7bef0b9d9719ab940dd05a0f8/nova/virt/hardware.py#L905-L906 | 20:58 |
sean-k-mooney | artom: tldr adding cpuset_reserved to power on and power off would fix it | 20:59 |
artom | Yeah, lemme try a quick hax of that | 20:59 |
sean-k-mooney | but i think based on the code we were expecting that not to be required | 21:00 |
sean-k-mooney | so maybe we need to update the docstring or the property to include cpuset_reserved | 21:00 |
sean-k-mooney | i.e. if you made https://github.com/openstack/nova/blob/3209f6551652cff7bef0b9d9719ab940dd05a0f8/nova/objects/instance_numa.py#L291-L296 also include it | 21:02 |
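The shape of the change being suggested, shown against stand-in classes rather than the real InstanceNUMATopology/InstanceNUMACell objects (the actual fix is the POC pushed below and may differ):

```python
# Stand-in sketch only: expose the emulator-thread cores alongside the vCPU pins
# so the CPU power-management code can online/offline them too.
from dataclasses import dataclass, field

@dataclass
class FakeCell:
    cpu_pinning: dict = field(default_factory=dict)    # vCPU -> host core
    cpuset_reserved: set = field(default_factory=set)  # emulator-thread cores

@dataclass
class FakeTopology:
    cells: list = field(default_factory=list)

    @property
    def all_pinned_host_cpus(self):
        pinned = set()
        for cell in self.cells:
            pinned |= set(cell.cpu_pinning.values())
            pinned |= cell.cpuset_reserved
        return pinned

topo = FakeTopology(cells=[FakeCell(cpu_pinning={0: 6, 1: 5}, cpuset_reserved={4})])
print(topo.all_pinned_host_cpus)    # {4, 5, 6}: power management should touch all three
```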
sean-k-mooney | ok im going to finish there let me know how your hacking goes | 21:02 |
artom | It's done, just need to push and retry whitebox | 21:02 |
opendevreview | Artom Lifshitz proposed openstack/nova master: POC: online dedicated emulator threads as well https://review.opendev.org/c/openstack/nova/+/909795 | 21:03 |
artom | Hrmm, I can't Depends-on different Nova changes, can I? | 21:03 |
sean-k-mooney | you can | 21:04 |
sean-k-mooney | you can depend on 2 different patches in the same repo | 21:05 |
sean-k-mooney | you normally don't want to | 21:05 |
artom | Really? And Zuul does like a magic merge of them? | 21:05 |
sean-k-mooney | yep | 21:05 |
sean-k-mooney | so as long as they don't have merge conflicts between each other it will work | 21:05 |
artom | Dayum, I'm impressed | 21:05 |
artom | OK, https://review.opendev.org/c/openstack/nova/+/909795 can run while I pick up the kids from school | 21:06 |
sean-k-mooney | we normally don't do that because it's way more fragile, but it works in limited cases | 21:06 |
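For reference, Zuul reads Depends-On footers from the commit message and accepts several of them, so the whitebox change could depend on both nova reviews mentioned in this discussion (the subject line below is illustrative):

```
Test CPU power management with emulator thread policies

Depends-On: https://review.opendev.org/c/openstack/nova/+/909795
Depends-On: https://review.opendev.org/c/openstack/nova/+/877773
```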
opendevreview | Artom Lifshitz proposed openstack/nova master: POC: online dedicated emulator threads as well https://review.opendev.org/c/openstack/nova/+/909795 | 21:07 |
sean-k-mooney | artom: cell is spelt with a c | 21:08 |
sean-k-mooney | https://review.opendev.org/c/openstack/nova/+/909795/1/nova/objects/instance_numa.py#301 | 21:08 |
artom | Depends what you're celling | 21:09 |
sean-k-mooney | :) | 21:09 |
opendevreview | Pavlo Shchelokovskyy proposed openstack/nova master: Fix device type when booting from ISO image https://review.opendev.org/c/openstack/nova/+/909611 | 21:09 |
opendevreview | Pavlo Shchelokovskyy proposed openstack/nova master: Fix device type when booting from ISO image https://review.opendev.org/c/openstack/nova/+/909611 | 23:08 |