opendevreview | melanie witt proposed openstack/nova master: Add logic to enforce local api and db limits https://review.opendev.org/c/openstack/nova/+/712139 | 03:35 |
opendevreview | melanie witt proposed openstack/nova master: Enforce api and db limits https://review.opendev.org/c/openstack/nova/+/712142 | 03:35 |
opendevreview | melanie witt proposed openstack/nova master: Update quota_class APIs for db and api limits https://review.opendev.org/c/openstack/nova/+/712143 | 03:35 |
opendevreview | melanie witt proposed openstack/nova master: Update limit APIs https://review.opendev.org/c/openstack/nova/+/712707 | 03:35 |
opendevreview | melanie witt proposed openstack/nova master: Update quota sets APIs https://review.opendev.org/c/openstack/nova/+/712749 | 03:35 |
opendevreview | melanie witt proposed openstack/nova master: Tell oslo.limit how to count nova resources https://review.opendev.org/c/openstack/nova/+/713301 | 03:35 |
opendevreview | melanie witt proposed openstack/nova master: Enforce resource limits using oslo.limit https://review.opendev.org/c/openstack/nova/+/615180 | 03:35 |
opendevreview | melanie witt proposed openstack/nova master: Add legacy limits and usage to placement unified limits https://review.opendev.org/c/openstack/nova/+/713498 | 03:35 |
opendevreview | melanie witt proposed openstack/nova master: Update quota apis with keystone limits and usage https://review.opendev.org/c/openstack/nova/+/713499 | 03:35 |
opendevreview | melanie witt proposed openstack/nova master: Add reno for unified limits https://review.opendev.org/c/openstack/nova/+/715271 | 03:35 |
opendevreview | melanie witt proposed openstack/nova master: Enable unified limits in the nova-next job https://review.opendev.org/c/openstack/nova/+/789963 | 03:35 |
*** hemna8 is now known as hemna | 07:38 | |
opendevreview | Fabian Wiesel proposed openstack/nova master: Transport context to all threads https://review.opendev.org/c/openstack/nova/+/827467 | 10:10 |
opendevreview | Fabian Wiesel proposed openstack/nova master: VmWare: Remove unused legacy_nodename regex https://review.opendev.org/c/openstack/nova/+/806336 | 10:14 |
amoralej | hi, it seems devstack on centos9 is failing with some errors related to qemu unplugging volumes | 10:18 |
amoralej | https://review.opendev.org/c/openstack/devstack/+/827420 | 10:19 |
amoralej | Feb 02 09:04:16.541765 centos-9-stream-ovh-gra1-0028274422 nova-compute[72151]: ERROR oslo_messaging.rpc.server nova.exception.DeviceDetachFailed: Device detach failed for vdb: Run out of retry while detaching device vdb with device alias virtio-disk1 from instance ee94e833-1561-480f-b6c8-0e391833c0c7 from the live domain config. Device is still attached to the guest. | 10:19 |
amoralej | Feb 02 09:04:16.325611 centos-9-stream-ovh-gra1-0028274422 nova-compute[72151]: ERROR nova.virt.libvirt.driver [None req-7e39fb4f-50b1-4d86-b9ce-7addab205c30 tempest-ServerStableDeviceRescueTest-117889040 tempest-ServerStableDeviceRescueTest-117889040-project] Waiting for libvirt event about the detach of device vdb with device alias virtio-disk1 from instance ee94e833-1561-480f-b6c8-0e391833c0c7 is timed out. | 10:19 |
amoralej | and similar | 10:19 |
amoralej | did anyone see this issue? is it a known issue? | 10:20 |
opendevreview | Fabian Wiesel proposed openstack/nova master: Vmware: Fix indentation in conditionals https://review.opendev.org/c/openstack/nova/+/806391 | 10:37 |
MrClayPole | When I live migrate any instance that has more than 1 volume attached, the windows VMs sometimes lose access to their volumes and then hang. The logging stops on the host until after they are restarted. I'm not sure how to troubleshoot this further, or if live migration of instances with more than one disk is supported? Any help would be appreciated. | 11:44 |
MrClayPole | The VMs in question are using NFS storage via the NetApp cinder driver | 11:45 |
ralonsoh | gibi, hi, this is about https://bugs.launchpad.net/neutron/+bug/1959749 | 12:01 |
ralonsoh | please check https://review.opendev.org/c/openstack/neutron/+/468982/5/neutron/services/qos/drivers/manager.py | 12:01 |
ralonsoh | "openstack network qos rule type list" will return only those rule types accepted by all loaded mech drivers | 12:02 |
ralonsoh | You can create a rule of any type, but you won't be able to assign it | 12:02 |
gibi | ralonsoh: do you mean assigning it to a port? | 12:08 |
ralonsoh | if the port is bound | 12:08 |
ralonsoh | because that means you'll call the mech driver qos extension | 12:08 |
opendevreview | Stephen Finucane proposed openstack/nova master: docs: Follow-ups for cells v2, architecture docs https://review.opendev.org/c/openstack/nova/+/827336 | 12:09 |
gibi | ralonsoh: I have a port with a qos policy that uses the min pps rule type and I can boot a VM with that port and the port is bound successfully | 12:10 |
ralonsoh | gibi, in your deployment, what backend are you using? | 12:11 |
ralonsoh | or backends | 12:11 |
gibi | ovs and sriov | 12:12 |
opendevreview | Dmitrii Shcherbakov proposed openstack/nova master: Introduce remote_managed tag for PCI devs https://review.opendev.org/c/openstack/nova/+/824834 | 12:12 |
opendevreview | Dmitrii Shcherbakov proposed openstack/nova master: Bump os-traits to 2.7.0 https://review.opendev.org/c/openstack/nova/+/826675 | 12:12 |
opendevreview | Dmitrii Shcherbakov proposed openstack/nova master: [yoga] Add support for VNIC_TYPE_SMARTNIC https://review.opendev.org/c/openstack/nova/+/824835 | 12:12 |
opendevreview | Dmitrii Shcherbakov proposed openstack/nova master: Filter computes without remote-managed ports early https://review.opendev.org/c/openstack/nova/+/812111 | 12:12 |
gibi | ovs support the min packet rate rule | 12:12 |
ralonsoh | gibi, yes, right, as I was guessing | 12:12 |
ralonsoh | the problem is how we build the rule type set, as I commented in the bug | 12:12 |
gibi | looking... | 12:13 |
ralonsoh | we return the intersection of all mech driver supported types | 12:13 |
ralonsoh | instead of returning the union | 12:13 |
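The intersection-vs-union behaviour being discussed here can be sketched in a few lines of Python. This is illustrative only: the driver names and rule type strings are stand-ins, not Neutron internals.

```python
# Illustrative sketch only: driver names and rule type strings are
# stand-ins, not Neutron internals.

def supported_rule_types(drivers, use_union=False):
    """Combine the per-driver supported QoS rule type sets."""
    sets = [set(types) for types in drivers.values()]
    if not sets:
        return set()
    result = sets[0].copy()
    for s in sets[1:]:
        result = result | s if use_union else result & s
    return result

drivers = {
    "ovs": {"bandwidth_limit", "minimum_bandwidth", "minimum_packet_rate"},
    "sriov": {"bandwidth_limit", "minimum_bandwidth"},
}

# Current API behaviour (intersection): minimum_packet_rate vanishes
# because the sriov driver does not support it.
print(sorted(supported_rule_types(drivers)))
# Proposed behaviour (union): every type some loaded driver supports.
print(sorted(supported_rule_types(drivers, use_union=True)))
```

This is why gibi's deployment (ovs + sriov) hides the min pps type even though a port on the ovs backend binds fine with it.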
sean-k-mooney | gibi: ovn does not support qos fully yet i think | 12:13 |
gibi | sean-k-mooney: yes, I use ovs :) | 12:13 |
sean-k-mooney | so if you are using the new default it might not work unless you reverted to ml2/ovs | 12:13 |
sean-k-mooney | ok | 12:13 |
gibi | sean-k-mooney: I reverted, yes :) | 12:13 |
ralonsoh | sean-k-mooney, we do support qos in OVN | 12:14 |
ralonsoh | fully | 12:14 |
gibi | ralonsoh: ohh so because I have sriov I cannot see the min pps | 12:14 |
ralonsoh | gibi, right | 12:14 |
gibi | ralonsoh: let me check that in another devstack that only has ovs but not sriov | 12:14 |
ralonsoh | I think this should be reconsidered in the API | 12:14 |
sean-k-mooney | ralonsoh: that's new this cycle right | 12:14 |
gibi | ralonsoh: I agree | 12:14 |
ralonsoh | I'll push a patch today | 12:14 |
gibi | ralonsoh: it sounds incorrect that I can boot with a qos rule but the rule type list does not show it | 12:14 |
gibi | ralonsoh: thank you! | 12:14 |
ralonsoh | sean-k-mooney, that was supported since wallaby | 12:15 |
ralonsoh | and in D/S in OSP16 | 12:15 |
sean-k-mooney | oh ok | 12:15 |
sean-k-mooney | ralonsoh: this is not the first time this api design choice has come up | 12:16 |
ralonsoh | sean-k-mooney, yeah... I think the current implementation is wrong | 12:16 |
sean-k-mooney | well i meant in general | 12:16 |
sean-k-mooney | neutron also has the same problem with vlan transparency | 12:17 |
sean-k-mooney | to work around that vlan transparency was set to yes for sriov | 12:17 |
sean-k-mooney | even though it really does not support it properly | 12:17 |
ralonsoh | the aim of the API is to return only what is supported by all drivers | 12:18 |
sean-k-mooney | i think in general neutron needs to list the capability per ml2 driver | 12:18 |
ralonsoh | for example: https://review.opendev.org/q/3299cdffae5cd7196a1676da103da5e2e413ec21 | 12:18 |
sean-k-mooney | ralonsoh: ya i know but that has never felt useful to me | 12:18 |
ralonsoh | it was changed before and then reverted | 12:18 |
sean-k-mooney | the api should ideally list the qos policies per driver | 12:18 |
ralonsoh | sean-k-mooney, then what we can do is to create another API call | 12:18 |
ralonsoh | return all_supported_qos_types | 12:19 |
ralonsoh | or something similar | 12:19 |
sean-k-mooney | perhaps | 12:19 |
ralonsoh | I'll propose a new API | 12:20 |
sean-k-mooney | i still think having the new api return a dictionary keyed by either the driver or vnic-type, with the supported qos policies as the values, would be the way to organise that api, but just adding a property to the existing one to list all is less work | 12:22 |
ralonsoh | I think we can do this with the current API, just adding a new parameter to the CLI call | 12:23 |
sean-k-mooney | the issue i have with all_supported_qos_types is that you can't tell if a port you create will work with any specific policy | 12:23 |
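The per-driver shape sean-k-mooney describes might look roughly like this (a sketch under invented names; the real mapping would come from Neutron's loaded mech drivers, not a hard-coded dict):

```python
# Sketch of a per-driver view (all names invented): keying the
# response by driver lets a caller ask "will this policy work on
# this backend?" instead of guessing from one flattened list.

RULE_TYPES_BY_DRIVER = {
    "ovs": {"bandwidth_limit", "minimum_bandwidth", "minimum_packet_rate"},
    "sriov": {"minimum_bandwidth"},
}

def policy_supported(driver, rule_types):
    """True if the driver supports every rule type used by the policy."""
    return set(rule_types) <= RULE_TYPES_BY_DRIVER.get(driver, set())

print(policy_supported("ovs", ["minimum_packet_rate"]))
print(policy_supported("sriov", ["minimum_packet_rate"]))
```

A flat "all supported types" list cannot answer the second question; a per-driver (or per-vnic-type) mapping can.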
sean-k-mooney | ralonsoh: actually at the end of the day what we really need is scheduler support | 12:23 |
ralonsoh | yes | 12:23 |
gibi | minimum pps / bw has scheduler support :) | 12:24 |
sean-k-mooney | neutron needs to use traits to model which hosts support which policies and nova needs to schedule the port to such a host based on the request | 12:24 |
sean-k-mooney | gibi: it does but im thinking of dscp etc. | 12:24 |
gibi | yeah for dscp it is a different game | 12:24 |
sean-k-mooney | e.g. the non quantitative qos | 12:24 |
sean-k-mooney | its a fair point about min* | 12:24 |
sean-k-mooney | those are ahead of the game | 12:25 |
stephenfin | sean-k-mooney: Care to finally get these in? It's only been 16 months :) https://review.opendev.org/c/openstack/nova/+/705792/ https://review.opendev.org/c/openstack/nova/+/754448/ | 12:25 |
gibi | and supporting the non quantitative ones is problematic due to the resourceless request group problem in placement | 12:25 |
sean-k-mooney | stephenfin: im looking at https://review.opendev.org/c/openstack/nova/+/814562 now but i can look at them after | 12:26 |
stephenfin | Okay, sweet. ty :) | 12:26 |
sean-k-mooney | stephenfin: oh, neutron refactoring, ya ill review those too | 12:26 |
sean-k-mooney | ah you have a follow up for the cells doc, cool, i was going to ask if you were doing a new revision | 12:35 |
*** dasm|off is now known as dasm|rover | 12:44 | |
*** amoralej is now known as amoralej|lunch | 12:56 | |
gibi | ralonsoh: I'm not sure I understand the reason of the wontfix on https://bugs.launchpad.net/neutron/+bug/1959749 | 13:12 |
gibi | ralonsoh: your #1) point is what I would need to work. So that the rule type list returns all rule types, not just rule types that are supported by every configured driver | 13:16 |
ralonsoh | gibi, sorry, I don't know why I set this flag | 13:32 |
ralonsoh | confirmed, this is the correct value | 13:32 |
gibi | ralonsoh: that is better, thanks :) | 13:32 |
admin1 | hi guys .. is there a way to "transfer ownership" of an instance from one project to another ? | 13:48 |
*** amoralej|lunch is now known as amoralej | 14:01 | |
opendevreview | yuval proposed openstack/nova master: Lightbits LightOS driver https://review.opendev.org/c/openstack/nova/+/821606 | 14:04 |
opendevreview | Dmitrii Shcherbakov proposed openstack/nova master: Document remote-managed port usage considerations https://review.opendev.org/c/openstack/nova/+/827513 | 14:58 |
gibi | if somebody wants a change of (code) scenery then I can suggest looking at the placement code review series to support any-traits queries in microversion 1.39. The series starts here https://review.opendev.org/c/openstack/placement/+/825846/3 :) | 16:45 |
opendevreview | Merged openstack/nova master: docs: Add a new cells v2 document https://review.opendev.org/c/openstack/nova/+/814562 | 17:01 |
melwitt | gibi: I will look at some new code scenery :) | 17:15 |
*** lbragstad9 is now known as lbragstad | 17:19 | |
dmitriis | gibi, sean-k-mooney: mostly been getting unrelated gate failures so I am waiting for some rechecks to complete. I made a functional change to the patch that introduces the remote_managed tag here https://review.opendev.org/c/openstack/nova/+/824834/8/nova/pci/devspec.py#322 to include a check for the presence of a serial number when a device is | 17:27 |
dmitriis | tagged as remote_managed which is something I overlooked in the previous iteration and updated testing to reflect that. | 17:27 |
dmitriis | I started working on the docs and started a doc review but most of the docs will be in Neutron under the OVN driver guide similar to how it's done today with OVS hardware offload. | 17:28 |
gibi | melwitt: thanks! :) | 17:42 |
gibi | dmitriis: I will read back tomorrow I have to go now | 17:43 |
dmitriis | gibi: np, thanks a lot for the help so far | 17:43 |
*** amoralej is now known as amoralej|off | 18:08 | |
sean-k-mooney | dmitriis: ack | 18:13 |
sean-k-mooney | dmitriis: im looking at some downstream stuff currently but ill try to take a look probably tomorrow at this point | 18:14 |
sean-k-mooney | dmitriis: most of the docs make sense for the neutron guide but we should detail how to use the remote managed flag etc. in nova | 18:14 |
dmitriis | sean-k-mooney: ack, ta. | 18:16 |
dmitriis | sean-k-mooney: I currently describe some of it in the latest doc change and reference the option docstring but I can expand the description in the docs themselves too. | 18:16 |
opendevreview | Ilya Popov proposed openstack/nova master: Fix to implement 'pack' or 'spread' VM's NUMA cells https://review.opendev.org/c/openstack/nova/+/805649 | 18:18 |
opendevreview | melanie witt proposed openstack/nova master: Raise InstanceNotFound on fkey constraint fail saving info cache https://review.opendev.org/c/openstack/nova/+/826942 | 18:34 |
sean-k-mooney | o/ are we tracking the failure of tempest.api.compute.servers.test_device_tagging.TaggedAttachmentsTest.test_tagged_attachment | 18:51 |
sean-k-mooney | as a potential gate issue anywhere | 18:51 |
sean-k-mooney | im seeing that fail more and more on reviews over the last 2 weeks | 18:52 |
sean-k-mooney | its like it knew lyarwood was starting on kubevirt this week :) | 18:52 |
*** artom__ is now known as artom | 18:53 | |
sean-k-mooney | so this could be q35 related | 18:54 |
artom | Yeah, repeating what I said downstream... it's not even a Tempest race or whatever, it's the guest itself. Is this the q35 problem again? Surely we'd see other tests fail in that job, unless nova-next doesn't do any other device attachment tests, which would be... weird | 18:54 |
sean-k-mooney | but im not sure about that | 18:54 |
sean-k-mooney | it could be that the volume is not attched fully yet | 18:54 |
sean-k-mooney | i.e. the series from lee to wait for the vm to be pingable might help | 18:54 |
sean-k-mooney | but in this case the test is sshing into the vm | 18:54 |
sean-k-mooney | to check the tag is there right | 18:55 |
sean-k-mooney | the failure message at the top level is Details: Timeout while verifying metadata on server. | 18:55 |
artom | No, there are definitely other tests that attach stuff that pass | 18:55 |
sean-k-mooney | so the test is doing Remote command: set -eu -o pipefail; PATH=$PATH:/sbin:/usr/sbin; curl http://169.254.169.254/openstack/latest/meta_data.json | 18:56 |
artom | Mind you, they may not be SSH'ing into the guest? | 18:56 |
sean-k-mooney | by sshing into the guest | 18:56 |
sean-k-mooney | and that ssh connection is well connecting | 18:56 |
artom | So it's curl that's timing out? | 18:57 |
sean-k-mooney | im looking at https://zuul.opendev.org/t/openstack/build/d836724c364843e98bf893ac71574828 | 18:57 |
sean-k-mooney | no i think the curl command is working | 18:57 |
sean-k-mooney | its doing it in a loop | 18:57 |
sean-k-mooney | but by the time the test completes the data is not in the metadata service | 18:57 |
sean-k-mooney | but that could be related to the attach taking a long time | 18:58 |
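The check being described, curling meta_data.json in a loop until the tagged device shows up or the test times out, can be sketched roughly like this. This is a stand-in for what tempest does, not its actual code; `fetch()` here represents the curl-over-ssh call.

```python
import json
import time

# Rough stand-in for the test's polling loop: fetch meta_data.json
# until the tagged device appears, or fail with a timeout like
# "Timeout while verifying metadata on server". Not tempest code.

def wait_for_tagged_device(fetch, tag, timeout=60.0, interval=1.0,
                           clock=time.monotonic, sleep=time.sleep):
    deadline = clock() + timeout
    while clock() < deadline:
        meta = json.loads(fetch())
        devices = meta.get("devices", [])
        if any(d.get("tags") and tag in d["tags"] for d in devices):
            return meta
        sleep(interval)
    raise TimeoutError("Timeout while verifying metadata on server.")
```

If the attach itself is slow, the tag simply never lands in the metadata within the loop's deadline, which matches the top-level failure message seen in the job.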
artom | It's not 100% though, so whatever we try will have to be rechecked at least a few times | 19:00 |
sean-k-mooney | ya i dont know, it's just gotten flaky recently | 19:01 |
sean-k-mooney | no clear reason why | 19:01 |
sean-k-mooney | and as you say its not 100% so its hard to tell why | 19:02 |
artom | I wonder if we should wait for volume and interface attach before carrying on | 19:11 |
artom | Like, we're somehow "confusing" the guest by issuing two device attach commands in quick succession | 19:11 |
artom | I realize how non-engineery that sounds | 19:11 |
opendevreview | Artom Lifshitz proposed openstack/nova master: DNM: Testing change to test_tagged_attachment in tempest https://review.opendev.org/c/openstack/nova/+/827549 | 19:22 |
artom | ^^ we'll see | 19:22 |
sean-k-mooney | artom: we probably should, although i think we have an instance level lock at the compute manager so only one attachment can happen at a time | 19:51 |
sean-k-mooney | we do for 2 interfaces or volumes but not sure about one of each | 19:51 |
sean-k-mooney | so ya lets see if that helps | 19:52 |
sean-k-mooney | o/ all, chat to ye tomorrow | 19:54 |
artom | It's a shot in the dark to serve as a data point | 19:54 |
admin1 | hi .. is there a nova command to see all ongoing migrations ? | 21:19 |
mloza | hello, need help figuring out what would cause nova to delete a port on an instance | 21:23 |
mloza | i see this in the logs `Creating event network-vif-deleted` | 21:23 |
mloza | here's the full logs https://paste.openstack.org/raw/bRTqxJGwy3WxAPGVEPo9/ | 21:23 |
admin1 | openstack compute service delete 56 => Unable to delete compute service that has in-progress migrations. .. How do I check these migrations ? | 21:52 |
*** dasm|rover is now known as dasm|off | 22:01 | |
melwitt | admin1: https://docs.openstack.org/python-openstackclient/latest/cli/command-objects/server-migration.html#server-migration-list | 22:08 |
admin1 | melwitt, this cluster is still in rocky .. so that command is not there | 22:10 |
admin1 | we are in the process of upgrading it .. so migrating, delete compute, reinstall and upgrade, add it back | 22:10 |
melwitt | oh I see | 22:11 |
opendevreview | melanie witt proposed openstack/nova master: Raise InstanceNotFound on fkey constraint fail saving info cache https://review.opendev.org/c/openstack/nova/+/826942 | 22:11 |
admin1 | nova migration-list lists the migrations, but there is nothing pending or ongoing | 22:18 |
melwitt | you could try the force or abort commands on the migrations if they are leftover https://docs.openstack.org/python-novaclient/latest/cli/nova.html#nova-live-migration-force-complete | 22:23 |
melwitt | https://docs.openstack.org/python-novaclient/latest/cli/nova.html#nova-live-migration-abort | 22:23 |
admin1 | i found the instance.. the instance is in pre-migrating status .... when i enter the command nova live-migration-abort $UUID $ID, it says Instance $UUID is in an invalid state for 'abort live migration' | 22:34 |
admin1 | how do I abort a pre-migration status ? | 22:34 |
admin1 | server show status is Running .. so instance does not have a pre-migrating status | 22:35 |
melwitt | did you try the force complete too? | 22:38 |
admin1 | yeah .. it gave Instance $UUID is in an invalid state for 'force_complete' | 22:40 |
admin1 | maybe i can just delete the compute service from the db ? | 22:41 |
admin1 | and add it again .. | 22:41 |
admin1 | want to do it properly though .. | 22:41 |
admin1 | melwitt, https://gist.github.com/a1git/9a975b96cd91da4683084c7df3220530 | 22:44 |
admin1 | basically the way i have been upgrading from rocky (xenial) -> bionic is .. for example, migrate all instances from A to B and empty A, then compute service delete A, reinstall A with bionic, same hostname, install nova .. and then repeat the process again .. | 22:45 |
admin1 | but for some reasons, this one is locked/blocked .. | 22:45 |
melwitt | +1 going into the db is the last resort if the proper tools don't work | 22:46 |
admin1 | i think 'pre-migrating' is blocking the deletion .. and I have found nothing to force this to either error or completed | 22:48 |
admin1 | i did a mysqldump and i found the pre-migrating in only 1 table .. nova.migrations .. | 22:51 |
melwitt | I think you are right that it's the migration(s) blocking the service delete | 22:53 |
admin1 | update migrations set status='error' where instance_uuid=$UUID and status='pre-migrating'; | 22:54 |
admin1 | and then the delete worked fine :) .. openstack compute service delete 56 | 22:54 |
melwitt | I was just about to suggest that, change the status rather than deleting the migration record :) | 22:56 |
admin1 | how to delete a host from placement service ? | 23:03 |
melwitt | hm, the service delete should have done that | 23:04 |
melwitt | it's the host you deleted the service for right? | 23:05 |
admin1 | ResourceProviderCreationFailed: Failed to create resource provider h1 | 23:06 |
admin1 | openstack compute service list -- it gets added there .. | 23:07 |
admin1 | i tried via "openstack compute service delete $id" and in 2nd attempt nova service-delete $uuid | 23:08 |
admin1 | 4 other servers, no issues .. this one hypervisor = this and that errors :) | 23:08 |
melwitt | this is the cli for the placement service https://docs.openstack.org/osc-placement/latest/cli/index.html#resource-provider-delete | 23:08 |
melwitt | just be careful and make sure it's not associated with any instances allocations | 23:10 |
melwitt | before deleting it | 23:10 |
admin1 | there are no instances | 23:10 |
melwitt | k | 23:10 |
admin1 | all instances were migrated, this is a new install which just came up | 23:10 |
melwitt | ok cool | 23:10 |
admin1 | no such command in rocky | 23:10 |
admin1 | this upgrade is necessary to upgrade from rocky .. as stein needs 18.04 .. | 23:11 |
admin1 | if i need to delete a hypervisor, isn't it just deleting compute service ? | 23:11 |
melwitt | yeah but it does a few things: deletes the nova service record, the nova compute_nodes record, and then the placement resource provider | 23:12 |
melwitt | and the last step has (probably) failed in your case from the previous delete | 23:12 |
melwitt | so it sees a duplicate placement resource provider and refuses to create it. so you are correct you need to delete it so it can recreate it again | 23:13 |
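A rough model of the failure mode melwitt describes: the nova-side records get deleted but the placement resource provider is left behind, so re-registering the same host trips over the duplicate. All names here are invented for illustration; this is a toy model, not nova/placement code.

```python
# Toy model, not nova/placement code: compute service delete does
# several steps, and a failure on the last one strands the resource
# provider even though the nova records are gone.

def delete_compute_service(db, placement, host, placement_delete_ok=True):
    db["services"].discard(host)       # nova service record
    db["compute_nodes"].discard(host)  # nova compute_nodes record
    if placement_delete_ok:
        placement.discard(host)        # placement resource provider

def register_compute(db, placement, host):
    if host in placement:
        # mirrors "ResourceProviderCreationFailed: Failed to create
        # resource provider h1"
        raise RuntimeError("Failed to create resource provider %s" % host)
    db["services"].add(host)
    db["compute_nodes"].add(host)
    placement.add(host)
```

Deleting the orphaned provider with `openstack resource provider delete` (from the osc-placement plugin) is the manual equivalent of clearing the stale entry here, after which the fresh compute can register cleanly.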
melwitt | as for the command, you could upgrade openstackclient to newer than rocky and it will still work with rocky | 23:13 |
admin1 | sudo apt install python3-openstackclient => python3-openstackclient is already the newest version (5.6.0-0ubuntu1) ... and no such command | 23:18 |
melwitt | yeah I'd try pip installing it in a venv or something | 23:19 |
admin1 | 5.6.0 is the latest version, even via pip install | 23:23 |
admin1 | in a venv | 23:23 |
admin1 | and it does not have placement | 23:23 |
melwitt | oh ugh sorry, you need the osc-placement package | 23:23 |
melwitt | it's a openstackclient plugin | 23:23 |
admin1 | hmm.. how to install it ? | 23:24 |
melwitt | I think you can just apt install it | 23:24 |
admin1 | found it | 23:24 |
admin1 | i will try tomorrow on this .. | 23:27 |
admin1 | melwitt, many thanks .. will report back tomorrow | 23:27 |
melwitt | ok yw o/ | 23:28 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!