noonedeadpunk | colby__: there was deprecation of live_migration_tunnelled in favor of native libvirtd migrations (live_migration_with_native_tls). Not sure if that is the reason of your issue, but smth to check I guess | 07:14 |
---|---|---|
noonedeadpunk | As it seems that live_migration_tunnelled were not fully dropped as of today | 07:15 |
tobias-urdin | (proposed removing that but was not ok in bobcat, maybe will be done in caracal https://review.opendev.org/c/openstack/nova/+/879021) | 07:53 |
bauzas | sean-k-mooney: dansmith: excellent news, my mtty WIP patch works with devstack and we can create instances https://paste.opendev.org/show/bBKFG3Vg7YTQBHPz11Za/ | 08:17 |
bauzas | if we pre-create mdevs of course | 08:17 |
bauzas | I just need to fix the RP name which is because of the <parent> value in the mtty :) | 08:23 |
bauzas | or we can just meh this | 08:23 |
sean-k-mooney | bauzas: cool i was conidering extending the devstack plugin to be able to optionally precrete the mdev and generate the approate config as required but i have not had time to play with it yet | 12:58 |
bauzas | sean-k-mooney: we have some internal but report that I created for RHEL asking to fix the libvirt problem | 12:59 |
bauzas | but until this is fixed (and delivered to Ubuntu 22.04 somehow), we may need to precreate them indeed | 12:59 |
bauzas | I can do the plugin thing | 12:59 |
bauzas | if you don't mind | 12:59 |
bauzas | I'm not super expert in writing CI things, and that would improve a bit my knowledge | 13:00 |
sean-k-mooney | well it would jsut be an extion to the devstack plugin so you can dev that locally | 13:00 |
sean-k-mooney | baisclly i was going to add something like this NOVA_MDEV_TYPE=${NOVA_MDEV_TYPES:-$(ls /sys/devices/virtual/*/*/mdev_supported_types/ | grep -v ":" | xargs)} | 13:03 |
sean-k-mooney | and then loop over the NOVA_MDEV_TYPE list and allcoate all mdevs for each enabeld type | 13:04 |
sean-k-mooney | so by default we would be enabling all mdev but if you wnated to just enabel 1 or a set you could define NOVA_MDEV_TYPE in yoru local.conf | 13:04 |
sean-k-mooney | that should be NOVA_MDEV_TYPES consitnetly as it woudl be a list | 13:05 |
bauzas | okay, I can try to work on it | 13:06 |
sean-k-mooney | addtionally i was going to write the mdev uuids into a file per mdev type and create a systemd service file to create them if we wanted to test reboots in a followup | 13:06 |
sean-k-mooney | so intally jsut do it at runtime and if we wnated to test reboot later add that as a follow up | 13:07 |
sean-k-mooney | we can test reboots in ci we normally dont because it slows down the job a non trivial amount | 13:08 |
bauzas | cool | 13:09 |
bauzas | dansmith: https://paste.opendev.org/show/bBKFG3Vg7YTQBHPz11Za/ (at the bottom) is what you need to do in order to configure nova to use mtty | 13:39 |
bauzas | tl;dr: instead of specifying a pci address for the type named 'mtty-1', you just write 'mtty' | 13:40 |
bauzas | but in order to pass the pci address check we have, I hardcoded that string (plus the other module names) in https://review.opendev.org/c/openstack/nova/+/898100/2/nova/virt/libvirt/driver.py#245 | 13:40 |
sean-k-mooney | ya so i woudl be happy with sbauza-dev2_computer being sbauza-dev2_mtty | 13:40 |
bauzas | and sean-k-mooney had concerns about hardcoding that list | 13:41 |
bauzas | which I can understand | 13:41 |
sean-k-mooney | well partly becasue you dont even supprot all the modules that dan is compiling in the plugin | 13:41 |
dansmith | we can make up that list from looking in sysfs right? | 13:41 |
sean-k-mooney | so at a minium i want all of them to be valid but i woudl prefer it to be a config option in the [devices] section | 13:42 |
sean-k-mooney | dansmith: yes we can do that too | 13:42 |
dansmith | I guess if the operator might want to limit some devices to not be picked up by nova, then there's a legit reason to have a conf option | 13:42 |
bauzas | sean-k-mooney: ack I can try to change https://review.opendev.org/c/openstack/nova/+/898100/2/nova/virt/libvirt/driver.py#8065 for not using the "parent" name | 13:42 |
dansmith | I'm just a bit over having to config everything by hand that we should be able to discover | 13:42 |
bauzas | me too | 13:42 |
sean-k-mooney | dansmith: so unless we enable an mdev type form the device | 13:43 |
bauzas | so basically I let you guys decide | 13:43 |
sean-k-mooney | we shoudl not be reporting the mtty devices anyway | 13:43 |
sean-k-mooney | so im not sure we need to configure things at all | 13:43 |
sean-k-mooney | basically if you dont have | 13:43 |
sean-k-mooney | [devices] | 13:44 |
sean-k-mooney | enabled_mdev_types = mtty-1 | 13:44 |
sean-k-mooney | [mdev_mtty-1] | 13:44 |
sean-k-mooney | device_addresses = mtty | 13:44 |
sean-k-mooney | then the mtty rp shoudl not be created right | 13:44 |
bauzas | so, shall we just do an exec on lsmod and see whether mtty is loaded ? | 13:44 |
dansmith | ah right right | 13:44 |
dansmith | bauzas: don't be silly | 13:44 |
sean-k-mooney | bauzas: hell no | 13:44 |
bauzas | yeah, correct | 13:44 |
bauzas | about the fact that no RPs are created until you configure nova.confg | 13:44 |
sean-k-mooney | do me the fact device_addresses is not a pci adress shoudl be all you need | 13:44 |
dansmith | sean-k-mooney: so discover the parents but still obviously let them configure them specifically | 13:45 |
sean-k-mooney | no need for hardcoded sentineal | 13:45 |
dansmith | configure the devices specifically I mean | 13:45 |
bauzas | sean-k-mooney: if we do this, we would remove the pci check and let anyone do fatfingering | 13:45 |
bauzas | I'm not sure I like that | 13:45 |
bauzas | https://review.opendev.org/c/openstack/nova/+/898100/2/nova/virt/libvirt/driver.py#7974 is where we parse config | 13:46 |
sean-k-mooney | what i was orginally thinkgig is as follows | 13:46 |
sean-k-mooney | if device_addresses does not container a pci adress | 13:46 |
dansmith | being only able to handle pci devices seems too specific to me anyway | 13:46 |
bauzas | dansmith: so you'd be in favor of lifting this all ? | 13:48 |
sean-k-mooney | check if the value of device_addresses is in /sys/devices/virtual/<device_address> | 13:48 |
bauzas | tbc, there are some assumptions where we say that we have a PCI address | 13:48 |
bauzas | sean-k-mooney: ah, good idea | 13:49 |
sean-k-mooney | if that folder does not exsit we can raise that as a config error in init_host | 13:49 |
bauzas | that's a readonly call, we don't need to privsep call it | 13:49 |
bauzas | and that's a cheap I/O call only made at startup | 13:50 |
bauzas | wfm so | 13:50 |
sean-k-mooney | ya its listraly just a file exist check | 13:50 |
dansmith | bauzas: but is that really ideal? I mean, it's a bit of a reach, but there are other busses on other systems, and I can totally imagine this sort of virt-friendly hardware someday exposing something totally different there | 13:50 |
dansmith | (that == being so tied to pci addresses specifically) | 13:50 |
sean-k-mooney | dansmith: you are correct that mdev does not require pci device parents | 13:50 |
bauzas | dansmith: I don't disagree with you long-term, but fwiw, the only vfio-mdev devices I've seen rely on being PCI addresses | 13:51 |
sean-k-mooney | that is an atritifal limiation in novas code base today to simplfy the code | 13:51 |
dansmith | yeah | 13:51 |
bauzas | so I'd please defer that PCI untanglement until some real hardware comes up which isn't PCI based | 13:51 |
bauzas | serial or whatever it would be | 13:51 |
sean-k-mooney | bauzas: tehre are once that use cpx? compute express link what ever the acfronum is | 13:51 |
sean-k-mooney | ah CXL | 13:51 |
dansmith | so the plan is to stick to pci addresses for now, but validate them by checking in sysfs and tolerate non-pci things if they're in /virtual/<addr> ? | 13:51 |
bauzas | sean-k-mooney: I haven't yet asked to add CXL into nova and I'll stick with playing the ostrich for now :) | 13:52 |
bauzas | dansmith: yeah | 13:52 |
sean-k-mooney | actully my orgianll proposal a few days ago was for device_adress to be set to the subpath under /sys/devices | 13:52 |
sean-k-mooney | so that it woudl work in any buss | 13:52 |
sean-k-mooney | bauzas: i have talked about CXL for nova several times and we have also said not yet | 13:53 |
bauzas | only for mediated devices | 13:53 |
dansmith | sean-k-mooney: ah yeah that's interesting | 13:53 |
bauzas | mdev capable devices won't show up in /sys/devices | 13:53 |
sean-k-mooney | bauzas: they will under /sys/devices/pci/ | 13:54 |
bauzas | ah yeah you're right | 13:54 |
bauzas | yay, another good idea worth implementing it someday | 13:54 |
sean-k-mooney | so bacially for now i think the minimal change would be "if not a pci adress assume its a virual device" | 13:55 |
sean-k-mooney | in the futre we can add a sysfs check if we want, but if we plan do do that in the future we shoudl consider upgraades now | 13:55 |
bauzas | I can try to write it the way it prepares ourselves to admit that pci devices are just a subset of all the devices that can create mdevds | 13:55 |
bauzas | sean-k-mooney: I think reading sys https://review.opendev.org/c/openstack/nova/+/898100/2/nova/virt/libvirt/driver.py#7975 is cheap and worthwhile | 13:56 |
sean-k-mooney | bauzas: it is an you can resue the funciton for the cpu ofline fucntions | 13:56 |
sean-k-mooney | bauzas: i also suggested this a few days ago | 13:57 |
bauzas | yup, and there are multiple things we can do this cycle about how we do the liaison | 13:57 |
bauzas | I'm still having this enhancement on me about using libvirt for creating the mdev | 13:58 |
sean-k-mooney | speaking of which i think its fair to say htat after the ptg wheree we should discuss some of this we should likely capture the plan in a spec | 13:58 |
sean-k-mooney | but for sysfs reading in genal plwad use https://github.com/openstack/nova/blob/master/nova/filesystem.py#L28 if you do implement this | 13:59 |
bauzas | sean-k-mooney: yup, I'll | 14:00 |
sean-k-mooney | besided the fact this should have a spec, the main reason i want one is to defien the scoep and what will be doen in which order and wthat will eb defered to a future release | 14:01 |
sean-k-mooney | partly so we dont forget soemthing | 14:01 |
bauzas | don't disagree | 14:02 |
dansmith | sean-k-mooney: bauzas I didn't save a link to the bug filed yesterday about the upgrade check, but was there any discussion there about revert vs. change to warning? | 14:11 |
bauzas | dansmith: I very briefly looked at your convo this morning, can you explain it again please ? | 14:11 |
bauzas | iirc, it was because of non-slurp | 14:11 |
dansmith | no | 14:11 |
dansmith | bauzas: basically sean-k-mooney strong-armed me into adding a status check for comptue-object-ids in this last cycle, which raised an error if you hadn't completed the upgrade (which you wouldn't have yet anyway) | 14:12 |
dansmith | I begged for mercy but there was no mercy given so I added the check and then an op showed up yesterday wondering why I had ruined his day | 14:12 |
dansmith | (I may be exaggerating here) | 14:13 |
dansmith | anyway, the status check really should be *next* cycle saying "you haven't finished your homework from last time" | 14:13 |
dansmith | so we have two options: 1. just revert out the check entirely from 2023.2 or 2. convert it to a warning result | 14:13 |
sean-k-mooney | fyi https://paste.opendev.org/show/bGuYFgfWt70oGOHcAuH9/ | 14:14 |
sean-k-mooney | it looks like its working for me too | 14:14 |
sean-k-mooney | but i have not tried to boot any vms with those mdevs yet | 14:14 |
sean-k-mooney | i have allcoated 4 mdevs in this case | 14:15 |
bauzas | sean-k-mooney: I went down to create and even sshing to the cirros guest, the serial was present | 14:15 |
bauzas | dansmith: recollecting | 14:15 |
sean-k-mooney | bauzas: ack ill do that quickly too just to confirm but i htink this is usabel for startting to do ci | 14:16 |
dansmith | \o/ | 14:16 |
sean-k-mooney | we have some basic mdev/gpu testsin whitebox | 14:16 |
sean-k-mooney | so we coudl try extending the job to depend on your patch and enable those | 14:17 |
dansmith | sean-k-mooney: what happens if you try to migrate an instance with a non-migratable mdev? Maybe that's something we can test in normal multinode tempest in the short term.. that we don't break | 14:17 |
sean-k-mooney | unfortunetlly the tests as written wont work | 14:17 |
sean-k-mooney | since they depend on pci | 14:17 |
dansmith | oh we have to do this in whitebox because we need to know mdev details on the host I guess? | 14:18 |
sean-k-mooney | dansmith: good question i currently only have singel node devstack deployed so cant check but sylvain should be able to just try that | 14:18 |
sean-k-mooney | dansmith: no we can do this is tempest too | 14:18 |
dansmith | okay | 14:18 |
dansmith | seems like a good negative test to add | 14:18 |
sean-k-mooney | we just would not eb able ot look at the xml but we can just check for the device in the guest via ssh | 14:18 |
dansmith | to make sure we don't sh*t the bed if the migration fail for that reason | 14:19 |
sean-k-mooney | i would expect libvirt to fail the migration wehn we call migrate_to_uri3 or whatever the libvirt function is | 14:19 |
sean-k-mooney | same as if the cpu compatiable check in libvirt fails | 14:19 |
dansmith | yep, just seems like a good check if it's easy | 14:19 |
sean-k-mooney | or if you have hostdev devices | 14:19 |
bauzas | dansmith: good question, I know that we test some move ops | 14:19 |
sean-k-mooney | so you think the best approch woudl be to start creating new geneic mdev tests in tempest right | 14:20 |
bauzas | and yeah I have a 2-node env with one mtty, can try to migrate my test instance | 14:20 |
sean-k-mooney | cool | 14:20 |
dansmith | sean-k-mooney: depends on what they look like really, but.. yeah why not? | 14:20 |
sean-k-mooney | cold migrate should work live migrate will likely fail or we/libvirt have a bug | 14:20 |
bauzas | back to dansmith's point about status checks | 14:21 |
dansmith | this is clearly a gap | 14:21 |
sean-k-mooney | dansmith: no reason not to just wanted to confirm that is your prefernce | 14:21 |
bauzas | dansmith: so, about status checks, I'd say revert it | 14:22 |
sean-k-mooney | so we will need a new compute feature config option in tempest for it and perhasp a cofnig option to identify how to detect the device in the vm | 14:22 |
dansmith | sean-k-mooney: yeah | 14:22 |
bauzas | no real gain in having a warning imho | 14:22 |
dansmith | bauzas: okay that was sean-k-mooney's opinion too | 14:22 |
dansmith | bauzas: so revert in master, backport that, and then maybe add it back later when we start depending on it? | 14:22 |
bauzas | wfm | 14:22 |
dansmith | aight | 14:23 |
sean-k-mooney | ya so i think that would make sense | 14:23 |
sean-k-mooney | fell like pushing the revert patch | 14:23 |
sean-k-mooney | and we can review it quickly since we are talking about it | 14:23 |
dansmith | I'm doing it | 14:24 |
dansmith | I need to find that bug tho | 14:24 |
opendevreview | Dan Smith proposed openstack/nova master: Revert "Add upgrade check for compute-object-ids linkage" https://review.opendev.org/c/openstack/nova/+/898741 | 14:25 |
bauzas | hmmm, looking more at the mdev parent being 'computer' in the device XML, there may be some confusions when creating inventories if we were about asking two different virtual devices, like say mtty and mdpy | 14:28 |
bauzas | that's a potential bug, so I'm pdb'ing it on my devstack | 14:29 |
bauzas | another slight difference from libvirt that may need to change on their side | 14:29 |
sean-k-mooney | well we can have mulitpel inventories of mdevs for diffent types | 14:51 |
sean-k-mooney | so it will be a slightly didfdfent toplogy i.e. 1 rp for all virtual devs rahter then one per pareent devce | 14:52 |
sean-k-mooney | but i think that is fine | 14:52 |
bauzas | no, because we don't support heterogenous VGPUs (yet) | 14:52 |
bauzas | a single device can only support one type | 14:53 |
sean-k-mooney | i didnt think that was enforecd there i tought that was done as part of the config | 14:53 |
bauzas | that's how we account those | 14:53 |
sean-k-mooney | parsing | 14:53 |
bauzas | there is a config parser that checks that yes | 14:53 |
bauzas | but our internal dicts rely on that too | 14:53 |
sean-k-mooney | ok well that another reason to use the device_adress | 14:53 |
sean-k-mooney | for virtrual stuff | 14:54 |
bauzas | yeah, I filed a bug against libvirt fwiw | 14:54 |
bauzas | the vfio guys are aware of the bug | 14:54 |
bauzas | and we'll just adapt | 14:54 |
bauzas | it occurs to me that OpenStack is probably the only high-level project that's about to integrate mtty for testing ;:) | 14:55 |
bauzas | mtty was probably just a toy project | 14:55 |
bauzas | but if that helps others's testing, that's a net win | 14:55 |
bauzas | netcalol | 15:43 |
bauzas | sean-k-mooney: fwiw, read_sys only reads files :) https://github.com/openstack/nova/blob/master/nova/filesystem.py#L28 | 16:13 |
bauzas | sean-k-mooney: and /sys/devices/virtual/mtty is a dir :) | 16:13 |
bauzas | I can provide some helper method that would wrap os.path.isdir() but would you nack if I was just calling os.path.isdir() straight in the driver ? | 16:14 |
bauzas | or .exists() | 16:14 |
colby__ | noonedeadpunk: Thanks Ill check on that. Did not realize the tunnel was dropped. Was this a change on xena to yoga? Everything worked fine on xena. | 16:21 |
sean-k-mooney | bauzas: good point on it being a dir i guess you can just add a new helper ya in the same module | 16:23 |
sean-k-mooney | or just call os.path.isdir() but it would be ok ot put in the filesystem module | 16:23 |
sean-k-mooney | an exist check is enough | 16:23 |
bauzas | sean-k-mooney: well, if the helper method only wraps isdir, that sounds to me a bit unnecessary | 16:23 |
sean-k-mooney | so i would go with exist because it might actully be a simlink | 16:24 |
colby__ | I have another question. It appears that the amount of local disk available changed in yoga. It appears to be only considering the actual local disk now (without ceph). But instances that use a ceph backed volume to boot are still counting as local disk usage. Was this intentional? Its causing no resources available issues for me. Im working on changing the disk allocation ratio to get around it. | 16:24 |
bauzas | yeah, agreed on exists(), semantically better | 16:24 |
bauzas | but if I'm just returning the boolean, meh | 16:24 |
bauzas | I can pretend to return a FileNotFound exception if exists says no | 16:25 |
bauzas | at least that would justify the helper | 16:25 |
sean-k-mooney | colby__: boot form volume instance shoudl use flavor with disk_gb = 0 | 16:25 |
sean-k-mooney | we allow boot form volue instnace with flavor that dont set it to 0 but it is only for legacy/upgrade reasons | 16:26 |
sean-k-mooney | if you mena ceph volume via image_type=rbd | 16:26 |
sean-k-mooney | then use flavor as normal | 16:26 |
sean-k-mooney | with disk_gb set to the correct size | 16:27 |
sean-k-mooney | in yoga and master for that mater we report local_gb based on the rbd pool size for every instnace as far as i remember | 16:27 |
colby__ | yea unfortunately our flavors have disk size set even though they all use ceph boot volumes. Everythign we use is in ceph (images/ephemeral/volumes). But seems there is no way around the already set flavor disk size then. From the hypervisor screen its seems to be only showing the hypervisor boot disk volume for local disk | 16:31 |
colby__ | Im in the process of upgrading yoga and trying to live migrate instances to the already upgraded machines and running into this issue | 16:31 |
colby__ | is there a way to override the disk size of a running instance flavor? | 16:32 |
sean-k-mooney | colby__: are you using images_type=rbd | 16:32 |
sean-k-mooney | or cinder volumes | 16:32 |
colby__ | cinder volumes | 16:33 |
sean-k-mooney | adn your using images_type=qcow | 16:33 |
sean-k-mooney | or raw or flat | 16:33 |
colby__ | we did not set it so its currently default | 16:34 |
sean-k-mooney | ok that woudl be defcault which for libvirt is qcow | 16:34 |
sean-k-mooney | imporanly in that config then the expect bevioru is to report local_disk_gb as the free space on the compute node partion where /var/lib/nova is | 16:35 |
sean-k-mooney | and that was the ame behvior in older release | 16:35 |
colby__ | ok so set images_type=rbd then? | 16:35 |
sean-k-mooney | no | 16:35 |
sean-k-mooney | that means use cpeh for root disk | 16:35 |
sean-k-mooney | even with out ceinder | 16:35 |
colby__ | ah sorry I misunderstood. so raw then? | 16:36 |
sean-k-mooney | no dont change it | 16:36 |
sean-k-mooney | nova is working as it should form a compute config point of view | 16:36 |
colby__ | ok | 16:36 |
sean-k-mooney | i need to check somehting but i alo need to join a 1:1 with my manager | 16:37 |
colby__ | ok no problem. Thanks for the help | 16:37 |
sean-k-mooney | did you used to set allcoation raitio by aggreates prehaps? | 16:37 |
dansmith | ugh, feels like we're back to gate suckage.. some trivial patches are in recheck hell | 16:39 |
colby__ | sean-k-mooney: we never set anything for disk allocation | 16:41 |
colby__ | sean-k-mooney: I was able to increase the disk allocation ratio but now seeing this: nova.exception.InvalidSharedStorage: <host> is not on shared storage: Shared storage live-migration requires either shared storage or boot-from-volume with no local disks. This is a boot from volume ceph instance | 17:03 |
colby__ | is this all due to the disk_gb of the flavors? | 17:04 |
colby__ | Im migrating from a xena hypervisor to a yoga and seeing this issue | 17:08 |
sean-k-mooney | colby__: so partly. but you shoudl have had the same behavior on xena and yoga i think | 17:21 |
sean-k-mooney | colby__: so you should not be passing --shared-srotage wehn live migrating | 17:22 |
colby__ | I was just using the GUI for the live migration but can switch to the cli | 17:22 |
sean-k-mooney | you should be using a newer microvstion where that defautl to auto | 17:22 |
sean-k-mooney | so for boot form volume guest you should not tick shared storage | 17:23 |
sean-k-mooney | shared storage for that option really means is /var/lib/nova on nfs | 17:23 |
sean-k-mooney | so are you using images_type=rbd | 17:23 |
colby__ | no | 17:24 |
colby__ | just set to default | 17:24 |
sean-k-mooney | i missed typed | 17:24 |
sean-k-mooney | what i ment is you only say "shared_stroage=true" if you the vm is booted on a shared file cycle without using cinder volumes | 17:25 |
sean-k-mooney | so that is not your case | 17:25 |
colby__ | also there is no check box for shared storage just "block migration" and "disk overcommit" on gui live migration | 17:25 |
sean-k-mooney | right so your using horizon yes? | 17:25 |
sean-k-mooney | it likely would be better to do live migration form the openstack client | 17:26 |
colby__ | yep will try that | 17:26 |
sean-k-mooney | so in mitaka 2.25 we added auto | 17:28 |
sean-k-mooney | https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#maximum-in-mitaka | 17:28 |
melwitt | stephenfin: I wondered if you might be able to shed some light on this bug report regarding an alembic migration https://bugs.launchpad.net/nova/+bug/2008716 they report a failure to drop columns in the build_requests table going from xena -> yoga. yet this doesn't fail in the grenade job for xena -> yoga, so I'm confused if there is really a bug? | 17:32 |
sean-k-mooney | melwitt: i think gibi ^ looked at that before | 17:40 |
colby__ | sean-k-mooney: that seemed to work. Im able to migrate from xena to yoga hypervisor. But Im not able to migrate to the original hypervisor that I fully patched (updated qemu/libvirt). On that machine Im seeing the following: qemu-kvm: load of migration failed: Invalid argument: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2023-10-18T17:33:57.637682Z qemu-kvm: Missing section footer for | 17:40 |
sean-k-mooney | i that sound liek a qemu crash | 17:41 |
melwitt | sean-k-mooney: yeah, he did and I agree it seems like a bug in the migration. but if it is, why doesn't it fail in grenade? I looked for awhile and couldn't figure it out. I thought stephenfin might have ideas since he worked on alembic stuff | 17:44 |
sean-k-mooney | i think we get diffent behavior in fressh install vs upgraded case but i dont exactly rememebr why | 17:46 |
sean-k-mooney | i assume the base scemea we use in once ase either has or does not have the column | 17:46 |
melwitt | yeah.. if a fresh install of xena doesn't have the columns and the yoga migration would fail to drop them bc they don't exist, I thought it should have happened in grenade since grenade does a fresh install of xena and then upgrades it to yoga | 17:48 |
sean-k-mooney | https://github.com/openstack/nova/blob/stable/xena/nova/db/api/migrations/versions/d67eeaabee36_initial_version.py#L175-L208 | 17:48 |
sean-k-mooney | so that is how the build request tabel is created on a clena install | 17:48 |
sean-k-mooney | https://github.com/openstack/nova/blob/stable/xena/nova/db/api/migrations/versions/d67eeaabee36_initial_version.py#L186 | 17:49 |
sean-k-mooney | so it has vm_states | 17:49 |
sean-k-mooney | so a clean install on xena woudl have it | 17:49 |
sean-k-mooney | but a cloud that was upgraded form wallaby perhaps might not | 17:49 |
sean-k-mooney | hum it should be there in wallaby https://github.com/openstack/nova/blob/stable/wallaby/nova/db/sqlalchemy/api_migrations/migrate_repo/versions/067_train.py#L178 | 17:50 |
melwitt | older versions do the initial migration too. I am just not seeing how it's possible to get into a case where you don't have the columns in xena if the yoga migration is the only thing that removes them | 17:51 |
sean-k-mooney | its also there if you use legacy migrations for green filed in xena https://github.com/openstack/nova/blob/stable/xena/nova/db/api/legacy_migrations/versions/067_train.py#L166 | 17:51 |
sean-k-mooney | melwitt: perhaps we need to look at the code before we collapsed the migrations | 17:52 |
sean-k-mooney | if the bug was intoduced anywhere it was in the db compaction series | 17:54 |
sean-k-mooney | so in the patch before the dbcompaction | 17:57 |
sean-k-mooney | https://github.com/openstack/nova/blob/2d4eff0ef789957fe5498640015a6cccff43a2c3/nova/db/sqlalchemy/api_models.py#L241 | 17:57 |
sean-k-mooney | the build request tabel did not have the offending colum in the model | 17:57 |
sean-k-mooney | ah hwere so even then it was created in migration 6 with the field | 17:59 |
sean-k-mooney | https://github.com/openstack/nova/blob/2d4eff0ef789957fe5498640015a6cccff43a2c3/nova/db/sqlalchemy/api_migrations/migrate_repo/versions/006_build_request.py#L48 | 17:59 |
sean-k-mooney | i dont see the removal migation however at this point in the repo | 18:00 |
melwitt | is that yoga? bc the removal migration was added in yoga I think | 18:01 |
sean-k-mooney | that is xena | 18:01 |
melwitt | oh ok | 18:01 |
sean-k-mooney | so in xena the table was created with the column and i dont see the removal | 18:02 |
sean-k-mooney | an pre xena the colume also existed | 18:02 |
sean-k-mooney | do you knwo which commit added the drop | 18:03 |
sean-k-mooney | https://github.com/openstack/nova/commit/9657297dd6c63e7a1e0c84c3e943b26f1795d388 | 18:04 |
sean-k-mooney | so honestly i don tsee the bug | 18:06 |
sean-k-mooney | xena had the colume and both ways of createating a clena db have the column | 18:06 |
melwitt | yeah, same | 18:32 |
opendevreview | Alexey Stupnikov proposed openstack/nova master: Preserve cached base images for failed resize ops https://review.opendev.org/c/openstack/nova/+/877410 | 18:42 |
opendevreview | melanie witt proposed openstack/nova master: docs: Fix unified limits code block text alignment https://review.opendev.org/c/openstack/nova/+/898759 | 22:02 |
opendevreview | Merged openstack/nova master: Improve logging at '_numa_cells_support_network_metadata' https://review.opendev.org/c/openstack/nova/+/860930 | 23:05 |
opendevreview | Merged openstack/nova master: Clean up service_get_all() https://review.opendev.org/c/openstack/nova/+/897674 | 23:16 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!