opendevreview | Zhang Hua proposed openstack/nova stable/2024.1: Fix deepcopy usage for BlockDeviceMapping in get_root_info https://review.opendev.org/c/openstack/nova/+/926696 | 06:52 |
opendevreview | Zhang Hua proposed openstack/nova stable/2024.1: Fix device type when booting from ISO image https://review.opendev.org/c/openstack/nova/+/945816 | 06:52 |
zigo | sean-k-mooney: Sorry I didn't write your first name correctly in my message to the list. :/ | 13:10 |
zigo | And thanks a lot for your reply. | 13:11 |
zigo | There are still things I don't understand. | 13:11 |
zigo | Manila has multiple modes: one is the "generic" one that I used to use, and the other is the one that has to be used for CephFS. | 13:11 |
zigo | When you write that I have "nothing to do", I suppose that is compared to the non-generic mode, right? | 13:12 |
sean-k-mooney | zigo: it's fine, I don't mind. technically I didn't either, it's "seán" not "sean", but typing the fada over the a often breaks things :) | 13:13 |
sean-k-mooney | zigo: when I say there is nothing to do, I mean nova should not need to care how manila is configured | 13:14 |
zigo | Yeah, but I need to somehow teach my management software to configure Manila for virtio-fs... :) | 13:14 |
sean-k-mooney | it's just using the info from the share export/attachment (whatever it's called in manila's API) | 13:14 |
sean-k-mooney | manila does not need special config | 13:15 |
sean-k-mooney | the way this works is we attach the manila share to the compute host and then use virtio-fs to pass that "local" filesystem through to the VM | 13:15 |
zigo | Yeah, got that point. | 13:16 |
zigo | Though, for example, in manila.conf's share_driver, what should I write? | 13:16 |
sean-k-mooney | for cephfs the compute node needs access to the ceph cluster for that to work. with the generic driver the compute node would need network connectivity to the VM that runs the nfs server | 13:16 |
sean-k-mooney | zigo: Uggla can correct me if I'm wrong, but it should not need any config changes compared to using the backend without this feature | 13:17 |
sean-k-mooney | zigo: as far as I'm aware, your installer should not need to do any config on the manila side, and the only config change needed in nova is the manila keystone auth parameters https://docs.openstack.org/nova/latest/configuration/config.html#manila | 13:20 |
sean-k-mooney | we have only one manila-specific option https://docs.openstack.org/nova/latest/configuration/config.html#manila.share_apply_policy_timeout | 13:21 |
* zigo is in meeting, will read soon | 13:21 |
carloss | zigo: ++ on sean-k-mooney's info. No additional configuration is needed in Manila. You would only need to have the backend properly configured as in any other deployment, create a share and provide it during the nova attachment process, then nova and manila will do the rest for you (creating access rules, locking the share, locking the access rules and so on) | 13:24 |
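For reference, the manila side of the workflow carloss describes is an ordinary share create; a minimal sketch with placeholder names, using the `openstack share` commands from the python-manilaclient OSC plugin. The nova attach step then goes through the share-attach API added in microversion 2.97, and the exact client command for it depends on your client versions, so it is not spelled out here.

```shell
# Sketch only: create a CephFS share (10 GiB, placeholder name) to hand to nova's
# share-attach API; nova and manila handle access rules and locks from there.
openstack share create CEPHFS 10 --name my-share
openstack share show my-share -f value -c id   # share UUID to pass to the attach call
```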
sean-k-mooney | if you're using the generic driver the main limitation is that the share network needs to be routable from the compute nodes | 13:26 |
sean-k-mooney | basically meaning it should be a neutron external network, unless floating IPs work with manila? | 13:26 |
Uggla | Yes, only a manila section like the one for cinder, but with the name [manila] | 13:27 |
Uggla | so nova can get the endpoint + credentials to access manila. | 13:28 |
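That section follows the same keystoneauth pattern as the other service sections in nova.conf; a minimal sketch, assuming the standard keystoneauth option names and placeholder endpoint/credential values (the config reference linked above is authoritative):

```shell
# Sketch only: give nova the manila endpoint and credentials via a [manila]
# section in nova.conf (values below are placeholders).
sudo tee -a /etc/nova/nova.conf <<'EOF'

[manila]
auth_type = password
auth_url = http://controller/identity/v3
username = nova
password = SERVICE_PASSWORD
project_name = service
user_domain_name = Default
project_domain_name = Default
EOF
```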
sean-k-mooney | Uggla: I noticed we don't seem to have a tempest job for this in nova yet | 13:28 |
sean-k-mooney | or at least I didn't see this enabled in any of our existing jobs | 13:29 |
sean-k-mooney | do you have patches for that still pending? | 13:29 |
Uggla | yep, I know and I'm guilty of it. I worked on them, but have not completed them yet. | 13:29 |
sean-k-mooney | it would be nice to test cephfs in our ceph job and the generic driver with lvm in one of the others | 13:29 |
sean-k-mooney | ack | 13:29 |
sean-k-mooney | Uggla: well, you did a lot last cycle. I was more asking to see if we could provide zigo with a reference devstack configuration for ceph and nfs via cinder lvm | 13:30 |
sean-k-mooney | so they could take a look at how it works | 13:30 |
Uggla | sean-k-mooney, but there are discussions about having this effort done by the manila team. | 13:31 |
Uggla | or part of the effort. | 13:31 |
sean-k-mooney | it could be but we should be testing it in our jobs too | 13:31 |
Uggla | Of course I can show zigo how to configure it with devstack, so he can transpose that to his configuration. | 13:32 |
sean-k-mooney | I don't think we need new jobs in nova, just our existing jobs should be tweaked | 13:32 |
sean-k-mooney | Uggla: yeah, I think that would help | 13:32 |
Uggla | btw there is a fn job with the sdk which is doing minimal testing against tempest. | 13:33 |
zigo | sean-k-mooney: If I'm going to use virtio-fs, that's precisely because I do not want to use the generic driver. | 14:21 |
zigo | What I don't get though is: if I'm using the "normal" cephfs driver, I do not want my users to be able to use a direct cephfs connection from their VMs. | 14:23 |
zigo | For us, the Ceph backplane is *not* accessible from the VMs. | 14:23 |
zigo | (on purpose, for security reasons). | 14:23 |
zigo | So is there a way to deactivate it? | 14:23 |
zigo | Uggla: ^ | 14:28 |
sean-k-mooney | zigo: you can deploy your ceph so that there is no routability from the VM subnets | 14:29 |
sean-k-mooney | but there is routability from the compute nodes | 14:29 |
zigo | That's the case already. :) | 14:29 |
zigo | But then the API will still be available; I was hoping for it to say "404: cephfs API not activated, please use virtio-fs instead" or something ... | 14:30 |
sean-k-mooney | no | 14:32 |
sean-k-mooney | because manila does not know about virtiofs | 14:32 |
sean-k-mooney | that's an internal implementation detail of nova (specifically the libvirt driver) | 14:32 |
sean-k-mooney | zigo: I believe there might be a way to limit what info is shown to nova (via the service role) and to the user (member/reader) | 14:33 |
sean-k-mooney | on the manila side | 14:33 |
opendevreview | Dan Smith proposed openstack/nova master: Invalidate PCI-in-placement cached RPs during claim https://review.opendev.org/c/openstack/nova/+/944149 | 14:42 |
opendevreview | Dan Smith proposed openstack/nova master: Support "one-time-use" PCI devices https://review.opendev.org/c/openstack/nova/+/943816 | 14:42 |
opendevreview | Dan Smith proposed openstack/nova master: Add one-time-use devices docs and reno https://review.opendev.org/c/openstack/nova/+/944262 | 14:42 |
opendevreview | Dan Smith proposed openstack/nova master: DNM: Test nova with placement alloc fix https://review.opendev.org/c/openstack/nova/+/945626 | 14:42 |
dansmith | gibi: are you intending to hold off approving/merging OTU until after discussion at the PTG? | 14:48 |
sean-k-mooney | are there open design decisions? | 14:49 |
sean-k-mooney | i.e. that need discussion at the PTG | 14:49 |
dansmith | not that I'm aware of, and the spec is merged, I just had the thought that maybe gibi was expecting to wait | 14:49 |
sean-k-mooney | or is it just normal code review that's needed | 14:50 |
dansmith | just normal review, I think | 14:50 |
sean-k-mooney | ack | 14:50 |
dansmith | it's on the PTG for discussion, but TBH i was hoping to have it merged before then | 14:50 |
dansmith | since I'll be out for part of it (and the week after) | 14:50 |
sean-k-mooney | I personally have not had time to loop back to the impl much | 14:50 |
gibi | dansmith: I think we can progress with the normal code review. Right now I don't see any open questions on the spec level | 14:50 |
sean-k-mooney | but I didn't have any huge concern with the approach that wasn't addressed in the spec | 14:50 |
dansmith | gibi: ack, I agree, I just realized I wasn't sure what others were actually thinking | 14:51 |
sean-k-mooney | dansmith: the only thing that really comes to mind is your experiment with using virt-rng to try and test this in CI | 14:51 |
dansmith | sean-k-mooney: I've already got the images modified on vexx, and was planning to try to work on that next sprint | 14:52 |
gibi | sean-k-mooney: I still have the intention to test it with IGB locally | 14:52 |
sean-k-mooney | that's not blocking, I just don't know if you want to present that approach to folks | 14:52 |
dansmith | but I also don't want to hold this up for that, since that is more just general PCI testing, and OTU in tempest will be a bit difficult | 14:52 |
sean-k-mooney | dansmith: oh cool good to know | 14:52 |
sean-k-mooney | dansmith: I was more thinking, did we want to discuss what else we could test with that | 14:52 |
dansmith | OTU in tempest is probably not doable because I need a single device, a dedicated flavor, no parallelism, etc, so I suspect whitebox is the best option | 14:53 |
sean-k-mooney | it might be possible in upstream tempest but I don't think anyone would object to whitebox | 14:53 |
sean-k-mooney | anyway, if we keep the PTG session this is what I was expecting to come up | 14:54 |
dansmith | the problem is tempest would need to know the alias in order to create the flavor in the one test where we use it, or we'd need to communicate another flavor devstack setup that is only for that one test | 14:54 |
sean-k-mooney | not really OTU itself but how we can improve passthrough testing in general | 14:54 |
dansmith | but yeah, maybe we use the PTG to discuss testing | 14:54 |
sean-k-mooney | like we never finished the mdev testing work | 14:54 |
sean-k-mooney | we have your mtty devstack support, just missing the nova bit to use it | 14:55 |
dansmith | ack, I thought bauzas had ruled that out | 14:55 |
dansmith | but it'd be nice | 14:55 |
sean-k-mooney | I think bauzas was just too busy to finish it | 14:55 |
dansmith | okay | 14:55 |
sean-k-mooney | I don't know of anything blocking completing the work | 14:55 |
bauzas | I'm pretty burned out on vgpu but I can try to update my mtty series | 14:55 |
* gibi adding an mdev/PCI testing topic to the PTG... | 14:56 |
sean-k-mooney | the other thing gibi and I were hoping to do: after a couple of the providers get to epoxy, we would like to see if we can get images with igb enabled | 14:56 |
sean-k-mooney | so we can test SR-IOV | 14:56 |
sean-k-mooney | not sure when that will happen but it's related. | 14:57 |
Uggla | bauzas, maybe you can offload some vgpu stuff to me. | 14:57 |
sean-k-mooney | dansmith: on a related note, but changing the topic slightly: https://review.opendev.org/c/openstack/placement/+/945465 is being tracked as a bugfix, but do you think it's a backportable change or master only | 14:59 |
gibi | https://etherpad.opendev.org/p/nova-2025.2-ptg#L378 | 14:59 |
dansmith | sean-k-mooney: I think it's backportable and would be a good candidate, personally, since it's very self-contained and the bug really sucks for operators, IMHO | 14:59 |
dansmith | sean-k-mooney: but, I have not been able to test that it actually fixes the nova migration problems yet since I need two computes to do it | 15:00 |
sean-k-mooney | ack, my follow-up is: if that's backportable, do we need nova work for the existing nova bug, or do you think cold migration etc. would just start working? | 15:00 |
dansmith | I believe this should make it just work on the nova side | 15:00 |
sean-k-mooney | ack, not that our downstream QE have a lot of capacity for upstream work, but we might want to talk to them about how we could test the previously broken operation in upstream tempest | 15:01 |
sean-k-mooney | well or whitebox | 15:02 |
dansmith | again, this requires putting the compute node into overcapacity state with an allocation_ratio change, so I don't really think we can do that in upstream parallel tempest | 15:02 |
dansmith | whitebox potentially | 15:02 |
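For context, one way to push a compute node over capacity for a local reproduction is to shrink its allocation ratio in placement after instances are already running; a sketch using the osc-placement plugin, with a placeholder hostname (double-check the inventory-set syntax against the plugin docs):

```shell
# Sketch only: lower the VCPU allocation ratio on a compute node's resource
# provider so existing allocations exceed the new capacity.
RP_UUID=$(openstack resource provider list --name mycompute -f value -c uuid)
openstack resource provider inventory set "$RP_UUID" \
  --resource VCPU:allocation_ratio=1.0 \
  --amend
```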
sean-k-mooney | tempest has the idea of serial tests that run one by one. that might help here, but many of these edge cases have the potential to conflict with other tests, so I guess we would need to be careful about how we would test this | 15:04 |
dansmith | yeah, true.. it just seems sort of off the table for tempest by scope too, but perhaps not | 15:05 |
sean-k-mooney | the other concern I have is whether this will break any current functional tests if we have reproducers for any of the nova bugs | 15:05 |
sean-k-mooney | I don't think we do | 15:05 |
sean-k-mooney | but if you fix it in placement it will presumably stop raising an error with the placement direct fixture | 15:06 |
dansmith | if we do, I'm not aware of them | 15:06 |
gibi | a placement-only tempest test for the overallocation fix could be written to be independent of any other test by creating a separate RP, but I'm not sure that's worth it compared to a full nova migration test with overallocation. | 15:06 |
sean-k-mooney | ok, in that case I really don't have any other questions related to this. | 15:07 |
sean-k-mooney | gibi: that could be done, but gabbi kind of does that already, right | 15:07 |
gibi | yeah | 15:07 |
dansmith | gibi: yeah, could definitely do that, but the placement functional tests feel just as good for that to me | 15:07 |
gibi | I'm a big believer in functional tests :) | 15:08 |
dansmith | certainly for this they're plenty adequate I think | 15:11 |
dansmith | sean-k-mooney: didn't you have a recipe for using the upstream ansible playbooks to deploy multinode devstack locally? | 16:19 |
ivveh | hey, I'm looking to be pointed in the correct direction. I want nova to build my custom XMLs, via hooks or some other method that is supported by nova and preferably has a framework for that. I'd like to avoid modifying the libvirt driver as much as possible | 16:54 |
sean-k-mooney | dansmith: sorry yes i have | 17:05 |
sean-k-mooney | dansmith: https://github.com/SeanMooney/ard/blob/master/ansible/deploy_multinode_devstack.yaml is the playbook that does it; it was driven by molecule with vagrant_libvirt to provide the VMs https://github.com/SeanMooney/ard/blob/master/molecule/default/molecule.yml | 17:07 |
dansmith | ah with vagrant | 17:08 |
sean-k-mooney | you didn't need vagrant | 17:08 |
sean-k-mooney | the playbooks could work with any servers once you had ssh access | 17:08 |
sean-k-mooney | so for example I provisioned some servers internally for vdpa testing and just pointed the playbooks at them with an inventory file | 17:09 |
sean-k-mooney | https://github.com/SeanMooney/ard/tree/master/examples/vdpa | 17:09 |
sean-k-mooney | I wrote 3 roles, devstack_{common,compute,controller}, that wrapped the upstream devstack roles from the gate jobs | 17:11 |
sean-k-mooney | https://github.com/SeanMooney/ard/tree/master/ansible/roles | 17:11 |
sean-k-mooney | dansmith: it's been like 3+ years since I used any of this | 17:12 |
sean-k-mooney | it probably still works but I haven't been maintaining it | 17:12 |
sean-k-mooney | dansmith: was there a specific reason you asked? | 17:13 |
dansmith | sean-k-mooney: I was struggling through re-re-remembering the multinode devstack rain dance | 17:14 |
dansmith | but I got it figured out whilst waiting | 17:14 |
sean-k-mooney | ah ok | 17:14 |
dansmith | sean-k-mooney: gibi: can confirm that just that placement change fixes the nova migrate-while-over-capacity bug | 17:14 |
sean-k-mooney | I have some downstream docs of the process with bash | 17:14 |
dansmith | I was able to repro, then literally swapped placement to that commit, restarted, retried the same migration and it went, confirmed, all good | 17:15 |
sean-k-mooney | and one of my team recently wrote an ansible playbook to deploy multinode devstack on our internal cloud; I can ping you the docs internally | 17:15 |
sean-k-mooney | dansmith: oh nice | 17:15 |
dansmith | sean-k-mooney: thanks, I need to formalize my own notes, because I have some of my own scripting and making things work with those is most ideal, | 17:16 |
dansmith | I just suck at persisting the notes after I get it workin, | 17:16 |
dansmith | but I'm going to do that now | 17:16 |
ivveh | as more context to my previous question: I want to include the iothread stanza in the XML (for now) | 17:18 |
sean-k-mooney | dansmith: the main parts of multinode are syncing the tls/ceph data, exchanging the ssh keys/known_hosts, ntp sync and updating /etc/hosts before deploying the computes, then running discover_hosts at the end | 17:19 |
dansmith | yeah, it was just the CA stuff I forgot this time | 17:20 |
dansmith | as soon as it failed, I knew the problem, but then had to remember to copy the CA dir *and* the bundle, otherwise it would nuke the latter on stacking the subnode | 17:20 |
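A rough sketch of the steps sean-k-mooney lists above, for a plain two-node devstack; hostnames, addresses and the /opt/stack paths are placeholders, and the CA/bundle locations depend on how the base node was stacked:

```shell
# On the subnode, before stacking (sketch, not a supported recipe):
echo "10.0.0.10 controller" | sudo tee -a /etc/hosts            # name resolution for the controller
rsync -a controller:/opt/stack/data/CA/ /opt/stack/data/CA/     # copy the CA dir; remember the bundle too, per the note above
./stack.sh

# On the controller afterwards, so the new compute is mapped into the cell:
./tools/discover_hosts.sh                                       # runs nova-manage cell_v2 discover_hosts
```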
sean-k-mooney | did you know that tls is apparently not on by default in devstack? I always used the CI jobs as a basis for mine | 17:21 |
sean-k-mooney | but if you use devstack by default it does not enable the tls-proxy | 17:21 |
dansmith | yeah, I have it on my base node though, so.. | 17:21 |
gibi | dansmith: nice test results | 17:21 |
dansmith | gibi: also works for the case where reserved is changed to cover the node, so that covers the OTU case too | 17:22 |
sean-k-mooney | ivveh: we do not have any hook point in nova to allow you to modify the xml. that is a very intentional decision | 17:22 |
sean-k-mooney | ivveh: if you want to modify the xml you are required to modify the virt driver, because we do not want to support external customisation of the domain through any kind of hooks | 17:23 |
sean-k-mooney | ivveh: what is it you are actually trying to do to the xml? | 17:24 |
gibi | dansmith: cool | 17:25 |
dansmith | I was hoping to have tested the migration allocation thing for real last week but ran out of time. Luckily my monday morning meeting was canceled so I could get a jump on it | 17:29 |
ivveh | sean-k-mooney: add <iothreads/> | 17:46 |
ivveh | (as I couldn't find it anywhere in the extra specs or other methods of modifying the xml) but feel free to point me in the correct direction if iothreads do exist; I need to point them out and pin to them, maybe some other stuff too | 17:48 |
sean-k-mooney | it's not currently supported. we had a blueprint to add it a year ago but the person working on it never actually pushed any code | 17:49 |
ivveh | gotcha, would it be an extra spec or some other method? | 17:49 |
sean-k-mooney | what I was suggesting was to always allocate 1 and make it float over the cpu_shared_set | 17:50 |
sean-k-mooney | we discussed whether we should support more than one, but the advice from our virt team was no | 17:50 |
sean-k-mooney | they suggested that in the future 1 per cinder volume might make sense, but the high-level design we had discussed in the past was: 1. keep it simple and start with all VMs having 1 iothread | 17:51 |
sean-k-mooney | 2. in the future add a flavor extra spec to opt into one per cinder volume if needed | 17:51 |
sean-k-mooney | 3. possibly allow this to be set on the image or config instead (TBD if/when we add the 1 per cinder volume feature) | 17:52 |
sean-k-mooney | https://review.opendev.org/c/openstack/nova/+/939254 is the most recent attempt | 17:52 |
sean-k-mooney | https://blueprints.launchpad.net/nova/+spec/iothreads-for-instances | 17:53 |
ivveh | thanks, I'll have a look | 17:53 |
sean-k-mooney | this version adds a config option to control the behavior | 17:53 |
sean-k-mooney | ah, the blueprint has the link to the PTG discussion from last cycle https://etherpad.opendev.org/p/nova-2025.1-ptg#L686 | 17:54 |
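For reference, the libvirt side of what is being proposed is small; roughly the domain elements `virsh dumpxml <instance>` would show for such a guest (a hand-written sketch with placeholder cpuset values, not output from the linked patch):

```shell
# Hand-written sketch of the relevant libvirt domain elements: one iothread,
# optionally pinned to host cpus drawn from cpu_shared_set.
cat <<'EOF'
<domain type='kvm'>
  ...
  <iothreads>1</iothreads>
  <cputune>
    <iothreadpin iothread='1' cpuset='2-3'/>
  </cputune>
  ...
</domain>
EOF
```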
ivveh | yeah, I wouldn't know how to do it at this point. I was thinking of having a pool of cores for this use and somehow allocating those cores to the VMs (no matter how many volumes they had). I also need to do the same thing for virtiofsd (a different, more complex story as I also have to figure out how to force it to use fifo or change some code) | 17:55 |
sean-k-mooney | ivveh: so this level of low-level customisation of the domain is only possible by modifying the in-tree driver | 17:56 |
sean-k-mooney | upstream we consider any out-of-band modification of the domain to make the VM unsupported | 17:56 |
ivveh | yeah, thats what i thought. i was hoping to avoid it, but.. | 17:56 |
sean-k-mooney | ivveh: we only added the initial use of virtiofs this cycle | 17:57 |
sean-k-mooney | so we have not looked at any kind of affinity for the virtiofs daemon | 17:57 |
ivveh | my use case is very specific, as I know how many and how large my VMs can be and I wanna assign them a certain amount of cores | 17:57 |
sean-k-mooney | I see. that's not really a use case that is very compatible with cloud computing :) | 17:58 |
ivveh | depends on what flavors you rent ;) | 17:58 |
ivveh | but i get your point | 17:58 |
sean-k-mooney | what I mean is we allow you to express constraints like "I want this VM to have dedicated pinned cpus" and obviously you can create flavors with X amount of cpus | 17:59 |
sean-k-mooney | but we intentionally do not allow you to do a direct mapping of logical cpus to host cpus | 17:59 |
ivveh | yeah no, I understand openstack needs to have its constraints, it's normal and acceptable | 17:59 |
sean-k-mooney | nova can do that internally but it's not exposed directly in the flavor | 17:59 |
ivveh | for shenanigans you have to go custom | 18:00 |
sean-k-mooney | we talked about having a pool for iothreads, but we didn't want to have to schedule on them | 18:00 |
ivveh | I think either way it would be great for nova to have an iothread pool that nova/cinder could take resources from (as these days the defaults aren't enough) | 18:01 |
ivveh | would be great to have, for example, flavors with an iothread extra spec, to be able to utilize those resources, or not | 18:03 |
sean-k-mooney | when we discussed it the idea was for it to just be an additional set of cores that would default to cpu_shared_set if not defined in the config | 18:03 |
sean-k-mooney | i.e. add an iothread_set to nova. those cpus could overlap with cpu_shared_set but not cpu_dedicated_set. we kind of felt that was too complex to start with | 18:04 |
sean-k-mooney | that does not mean we could not add that later | 18:04 |
ivveh | yeah, if you use cpu_shared_set and VMs are pinned then those cores would always be free for stuff like iothreads or virtiofsd | 18:06 |
sean-k-mooney | so we already support moving the emulator thread to cpu_shared_set for pinned VMs | 18:06 |
sean-k-mooney | so we were effectively going to treat the extra iothread the same | 18:07 |
ivveh | or cpu_isolation if you wanna let things do whatever, I guess | 18:07 |
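The existing mechanism sean-k-mooney refers to is the emulator-threads extra spec on a pinned flavor; a sketch with a hypothetical flavor name:

```shell
# Existing behaviour: with hw:emulator_threads_policy=share and cpu_shared_set
# configured on the host, the emulator thread floats over cpu_shared_set while
# the guest vCPUs stay pinned; the proposed iothread would be placed the same way.
openstack flavor set my-pinned-flavor \
  --property hw:cpu_policy=dedicated \
  --property hw:emulator_threads_policy=share
```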