*** hemna9 is now known as hemna | 06:36 | |
gibi | good morning | 07:32 |
---|---|---|
* kashyap waves | 07:57 | |
bauzas | good morning | 08:10 |
bauzas | gibi: actually, yesterday I was organizing a birthday party for my daughter with 8 of her friends :) | 08:10 |
bauzas | not sure I "enjoyed" it :D | 08:11 |
Uggla | bauzas, still alive ? :) | 08:31 |
gibi | bauzas, Uggla: https://www.youtube.com/watch?v=qdrs3gr_GAs ? :) | 09:57 |
*** bhagyashris_ is now known as bhagyashris | 10:25 | |
sean-k-mooney | gibi: portal is a great game | 11:36 |
gibi | indeed | 11:36 |
sean-k-mooney | by the way i watched your summit video | 11:37 |
sean-k-mooney | live demos are fun | 11:37 |
sean-k-mooney | did you restack live on stage | 11:37 |
sean-k-mooney | you said you rebooted the vm but devstack does not survive reboots anymore | 11:37 |
sean-k-mooney | since there are ordering issues with the systemd service files | 11:37 |
gibi | I restarted the devstack VM, but did not restack it. It worked:D | 11:54 |
sean-k-mooney | maybe the issues have been fixed | 11:59 |
sean-k-mooney | the wsgi service used to race with apache i think and there was an issue with the /run dirs being created | 12:00 |
sean-k-mooney | you could get lucky | 12:00 |
sean-k-mooney | but often the service would not start cleanly | 12:00 |
gibi | I have never seen that issue locally | 12:05 |
gibi | but meh | 12:05 |
sean-k-mooney | its been a while since i tried it honestly | 12:06 |
sean-k-mooney | like 18.04 era | 12:06 |
*** dasm|off is now known as dasm | 13:03 | |
sean-k-mooney | bauzas: was i imagining it or did you have an implementation of removing the ssh key pair generation | 13:44 |
sean-k-mooney | bauzas: i was going to go review it today but i cant find it | 13:44 |
sean-k-mooney | bauzas: i was hoping we could get that closed out to remove it from the list of things we need to finish this cycle | 13:45 |
sean-k-mooney | bauzas: speaking of things i would like to close out can you review https://review.opendev.org/c/openstack/nova/+/847001 | 13:46 |
sean-k-mooney | gibi: it would be nice if you could review this too https://review.opendev.org/c/openstack/nova/+/833411 i would like to keep that backport series moving along | 13:49 |
* sean-k-mooney is actually logged into a terminal today and looking at patches | 13:49 | |
gibi | sean-k-mooney: on it | 13:50 |
sean-k-mooney | thanks | 13:51 |
sean-k-mooney | im also going to rebase https://review.opendev.org/c/openstack/nova/+/830829 shortly once i run and fix the functional tests | 13:57 |
sean-k-mooney | that will be a nice easy win; we deferred it last cycle because it was too close to FF | 13:58 |
sean-k-mooney | so i would like to try and land it by m2 | 13:58 |
sean-k-mooney | mainly to catch any ci fallout if there is any | 13:58 |
sean-k-mooney | the benefit of me running tests locally on my laptop instead of my home server is it takes a while and i do code reviews while im waiting | 14:01 |
* sean-k-mooney is still tempted to swap to my home server to run these instead however :P | 14:01 | |
mfo | dansmith, hey! sorry, i couldn't get back to you yesterday ("context dropped out"). most certainly.. the long delay/lost context is on me :/ | 14:02 |
mfo | so, briefly: in victoria/ussuri one can set `hw_firmware_type=uefi` and the VM won't boot if the UEFI ovmf firmware/loader picked is the secureboot one (*.secboot.fd), as the default `hw_machine_type=pc` doesn't support it (that needs `q35`). | 14:02 |
mfo | so we should *try* other firmware/loader files, or use .secboot.fd anyway if it's the only one that exists (not to break/change from what is done in this old/stable branch). | 14:02 |
mfo | This was a suggestion from sean-k-mooney, and also to have unit tests. | 14:02 |
mfo | Your suggestions were 1) to convert the list of firmware/loader paths to constants (so to intentionally "break" any downstream stable consumers that possibly patched that, so they'd be aware of the change and review their version), and 2) change the patch approach to not even iterate the list of firmware/loader files with .secboot.fd if we're on PC machine type. (I just combined it with, "ok, use it if it's the only one" per Sean's | 14:02 |
mfo | suggestion too). | 14:02 |
sean-k-mooney | i think i basically said sort so that its last in the list and if its the only one then use it and maybe it will work | 14:03 |
mfo | I guess that's what this is about and where we are. :) | 14:03 |
sean-k-mooney | rather than deleting it from the list | 14:03 |
mfo | sean-k-mooney, yup. this is currently how it is, but the for-loop doesn't have a `break` in case it finds one, so the last file is used. | 14:04 |
sean-k-mooney | but yes secure boot needs uefi | 14:04 |
mfo | i could just add a break in there, but this would change things / file end up chosen in the general case, i guess. | 14:05 |
sean-k-mooney | so it wont work with `hw_machine_type=pc` | 14:05 |
mfo | and for older stable branches, well, i didnt want to risk such changes. (i'm not too familiar w/ openstack yet, actually). | 14:05 |
sean-k-mooney | i think we always want to break on the first file we find that exists as long as we make the secure boot one last if you did not ask for secure boot | 14:07 |
sean-k-mooney | we are only going to boot the vm once anyway so there is no point in continuing to check | 14:07 |
sean-k-mooney | but i dont have your patch open currently so that depends on how you wrote the loop | 14:08 |
opendevreview | ribaudr proposed openstack/nova master: Allow unshelve to a specific host (Compute API part) https://review.opendev.org/c/openstack/nova/+/831507 | 14:09 |
opendevreview | ribaudr proposed openstack/nova master: Allow unshelve to a specific host (REST API part) https://review.opendev.org/c/openstack/nova/+/845897 | 14:09 |
mfo | right, i wondered why the (existing) loop didn't have a 'break' statement in the first place when it was introduced, but it was long ago, and didn't seem like a thing to change for old stable branches. also, in this version one can't ask for secure boot yet (this is victoria/ussuri, and secureboot came in wallaby), it just happened that a patch adding the OVMF paths included a .secboot.fd file, before proper SB support came in. | 14:10 |
sean-k-mooney | ack | 14:11 |
sean-k-mooney | the ovmf firmware images can be built so that secure boot is available but not required | 14:12 |
sean-k-mooney | so i think the old logic was intended to allow it to be used if it was supported | 14:12 |
sean-k-mooney | by the ovmf image by manually configuring it via the boot menu | 14:12 |
sean-k-mooney | but that was not really supported by nova | 14:13 |
mfo | sean-k-mooney, ah, i see. | 14:13 |
mfo | i think that works for the SB feature alone, but i think the SMM feature support once it's built in then it isn't opt-in, right? than that requires support in the emulator (ie, qemu's q35). | 14:14 |
mfo | s/than/then/ | 14:14 |
sean-k-mooney | i think that is correct but dont know all the details | 14:14 |
sean-k-mooney | SMM does require qemu to be configured to enable it | 14:15 |
sean-k-mooney | but i dont know if you need that enabled if its compiled into the ovmf image | 14:15 |
sean-k-mooney | its all a bit of a mess | 14:15 |
mfo | yes, there are many details in this. i added some docs/links to the patch for the research i had done, but it's indeed at the lower-level details. | 14:15 |
bauzas | sean-k-mooney! sorry, haven't yet done the implementation for the keypair deprecation | 14:17 |
sean-k-mooney | bauzas: ack no worries i was just expecting that to be relatively small so if it was off your backlog you would have more time to review | 14:19 |
mfo | sean-k-mooney, if you're ever curious, i doc'ed the research details of OVMF/SB/SMM/QEMU in bug 1960758 comment #6, the takeaway is, you can build OVMF with SMM_ENABLE, but then it's not really _secure_ Secure Boot; for that you need SMM_REQUIRE, and then platform support is not optional, it's required as well. | 14:19 |
sean-k-mooney | bauzas: i was just trying to see if we could reduce context switching by merging some small easy wins before m2 | 14:19 |
sean-k-mooney | to have less to context switch between coming up to FF | 14:20 |
sean-k-mooney | mfo: SMM is system management mode support which is actually unrelated to secure boot | 14:20 |
sean-k-mooney | SMM is used for some other security features and ring -2 hypervisor features at run time | 14:21 |
mfo | sean-k-mooney, IIUIC indeed, it's orthogonal, but it's involved in the implementation of not allowing the OS to tamper w/ the SB things in memory, otherwise the OS could bypass things. | 14:21 |
mfo | the link to pbonzini's presentation/video about it is really clarifying. | 14:22 |
sean-k-mooney | yep so in general i would expect the default uefi image to be built with SMM_ENABLE | 14:22 |
sean-k-mooney | and the secure boot one to have require | 14:22 |
bauzas | sean-k-mooney: https://review.opendev.org/c/openstack/nova/+/847001 +W with comments | 14:24 |
bauzas | sean-k-mooney: yeah but I had a lot of other stuff to do before :) | 14:24 |
bauzas | like the vGPU bugs | 14:25 |
bauzas | and then reviews and bug triage | 14:25 |
bauzas | (and now, back to be a mentor :) ) | 14:25 |
bauzas | eventually, will work on this feature next week | 14:25 |
bauzas | maybe tomorrow if I have time | 14:25 |
sean-k-mooney | no rush since i actually had time to do upstream stuff today i was just looking to see what was close to merging | 14:26 |
sean-k-mooney | and the open review priorities | 14:26 |
sean-k-mooney | stephenfin: are you around today? if so got time to look at this os-vif patch its pretty small https://review.opendev.org/c/openstack/os-vif/+/839102 | 14:28 |
stephenfin | sean-k-mooney: sure, I'll look now | 14:28 |
sean-k-mooney | ta | 14:29 |
stephenfin | done | 14:34 |
sean-k-mooney | many thanks | 14:36 |
opendevreview | Merged openstack/osc-placement master: Support microversion 1.39 https://review.opendev.org/c/openstack/osc-placement/+/828545 | 14:37 |
*** diablo_rojo__ is now known as diablo_rojo | 14:39 | |
stephenfin | sean-k-mooney: Seen this before? https://zuul.opendev.org/t/openstack/build/b5c09ce1dbdd42228f5f2928d9df6178/log/controller/logs/screen-n-cpu.txt#10060 | 14:56 |
stephenfin | nova.exception.InternalError: Unexpected vif_type=unbound | 14:56 |
stephenfin | It rings a bell, but I thought we'd fixed this years ago | 14:56 |
sean-k-mooney | we had older bugs related to unbound | 14:57 |
sean-k-mooney | that is the state when the host-id is not set on the port | 14:57 |
sean-k-mooney | i think i looked at this | 14:57 |
sean-k-mooney | No conversion for VIF type unbound yet {{(pid=97953) nova_to_osvif_vif /opt/stack/nova/nova/network/os_vif_util.py:530}} | 14:58 |
sean-k-mooney | is really just a side effect of the port not being bound properly in neutron | 14:58 |
sean-k-mooney | ah right | 15:00 |
sean-k-mooney | https://zuul.opendev.org/t/openstack/build/b5c09ce1dbdd42228f5f2928d9df6178/log/controller/logs/screen-q-svc.txt#10891 | 15:00 |
sean-k-mooney | so this is an issue with slow neutron i think | 15:00 |
opendevreview | Merged openstack/nova stable/wallaby: fake: Ensure need_legacy_block_device_info returns False https://review.opendev.org/c/openstack/nova/+/843678 | 15:00 |
opendevreview | Merged openstack/nova stable/wallaby: Add a regression test for bug 1939545 https://review.opendev.org/c/openstack/nova/+/843679 | 15:00 |
sean-k-mooney | stephenfin: basically when i was looking at this i was assuming this happened because we retried the port binding because neutron was slow to respond | 15:01 |
sean-k-mooney | and that caused a concurrent bind attempt that left it in unbound or something like that | 15:01 |
stephenfin | Hmm, that sounds reasonable. We saw it bubble up in the OSC tests because the server create failed. That sounds like a likely root cause though | 15:02 |
sean-k-mooney | if that is what is happening i would expect to see the retry logged somewhere or see two bind attempts in the neutron log | 15:02 |
sean-k-mooney | ill check if i can see that | 15:02 |
sean-k-mooney | we do see the port had just finished binding when we got the concurrent error | 15:03 |
sean-k-mooney | ] Bound port: a2fb8af2-d4df-4b29-bd3f-5591aa8819d2, host: ubuntu-focal-rax-dfw-0030231262, vif_type: ovs, vif_details: {"connectivity": "l2", "port_filter": true, "ovs_hybrid_plug": false, "datapath_type": "system", "bridge_name": "br-int"}, binding_levels: [{'bound_driver': 'openvswitch', 'bound_segment': {'id | 15:03 |
sean-k-mooney | ya it looks like there are 3 attempts to bind the port | 15:05 |
frickler | stephenfin: I mentioned that yesterday, it also seemed related to neutron retrying binds | 15:05 |
sean-k-mooney | on the neutron side, the last two of which had the concurrent bind exception | 15:05 |
sean-k-mooney | frickler: well its actually the neutronclient retrying the bind | 15:05 |
sean-k-mooney | frickler: that was entirely broken in nova until somewhat recently | 15:06 |
melwitt | bauzas: I wanted to get your thoughts on this proposed patch to change logic in the placement audit nova-manage command, since you worked on it https://review.opendev.org/c/openstack/nova/+/844418 it seems like there is a bug in the current logic but it's not clear to me what the logic should be | 15:06 |
sean-k-mooney | frickler: we fixed retries about a year or so ago | 15:06 |
bauzas | melwitt: okay, I'll look | 15:06 |
melwitt | thanks | 15:06 |
frickler | sean-k-mooney: iiuc there is an internal retry in neutron happening now | 15:06 |
sean-k-mooney | frickler: there is also likely one in the db decorator | 15:07 |
sean-k-mooney | we can see transaction error in the log | 15:07 |
frickler | is this with neutron-segment enabled? we have some issue with segment ID reuse in OSC | 15:07 |
sean-k-mooney | i think segments were enabled yes | 15:08 |
frickler | oh, that's the osc job, yes | 15:08 |
sean-k-mooney | its using vxlan however i think | 15:09 |
sean-k-mooney | rather than routed provider networks | 15:09 |
sean-k-mooney | Bound port: a2fb8af2-d4df-4b29-bd3f-5591aa8819d2, host: ubuntu-focal-rax-dfw-0030231262, vif_type: ovs, vif_details: {"connectivity": "l2", "port_filter": true, "ovs_hybrid_plug": false, "datapath_type": "system", "bridge_name": "br-int"}, binding_levels: [{'bound_driver': 'openvswitch', 'bound_segment': {'id': 'cd5c5c6b-1027-4fc7-bbc7-b8204df12e32', 'network_type': 'vxlan', | 15:10 |
sean-k-mooney | 'physical_network': None, 'segmentation_id': 1, 'network_id': 'ff960d9f-3b68-4b9b-8d69-78fe6441f27b'}}] {{(pid=90353) _bind_port_level /opt/stack/neutron/neutron/plugins/ml2/managers.py:948}} | 15:10 |
frickler | ah. maybe it is side effect of the segments test that runs in parallel. breaking other random tests | 15:10 |
sean-k-mooney | no right after ^ where the mech driver is able to bind | 15:10 |
sean-k-mooney | we get a concurrent bind exception | 15:10 |
sean-k-mooney | then neutron retries | 15:11 |
frickler | https://zuul.opendev.org/t/openstack/build/b5c09ce1dbdd42228f5f2928d9df6178/log/job-output.txt#22299 | 15:11 |
sean-k-mooney | 7.556330 ubuntu-focal-rax-dfw-0030231262 neutron-server[90353]: WARNING neutron.plugins.ml2.plugin [req-f9a5c6a8-ab26-4f1f-ab63-dd518edf32f3 req-c372ca6e-78a4-4f09-976b-c74d5f169c66 service neutron] Concurrent port binding operations failed on port a2fb8af2-d4df-4b29-bd3f-5591aa8819d2 | 15:11 |
sean-k-mooney | Jun 30 11:36:27.557557 ubuntu-focal-rax-dfw-0030231262 neutron-server[90353]: INFO neutron.plugins.ml2.plugin [req-f9a5c6a8-ab26-4f1f-ab63-dd518edf32f3 req-c372ca6e-78a4-4f09-976b-c74d5f169c66 service neutron] Attempt 2 to bind port a2fb8af2-d4df-4b29-bd3f-5591aa8819d2 | 15:11 |
frickler | look at ^^ the passing segment test right after the failure | 15:11 |
frickler | I'm pretty sure this is related | 15:11 |
sean-k-mooney | unless you are using the same port in both tests i dont see how it would be | 15:12 |
frickler | the segment test locking the DB leading to retries in other actions | 15:12 |
sean-k-mooney | oh so you think this trace https://zuul.opendev.org/t/openstack/build/b5c09ce1dbdd42228f5f2928d9df6178/log/controller/logs/screen-q-svc.txt#10722 | 15:12 |
sean-k-mooney | is caused by the segment test | 15:13 |
sean-k-mooney | ORM session: SQL execution without transaction in progress, traceback | 15:13 |
sean-k-mooney | lets check the nova logs and see if there is a retry on our side | 15:14 |
sean-k-mooney | if not then its an internal neutron issue | 15:14 |
stephenfin | sean-k-mooney: I'm not sure if that's related to this issue or not | 15:14 |
stephenfin | actually no, maybe it is. It's an update_port call that's causing the issue | 15:16 |
opendevreview | Merged openstack/nova stable/xena: reenable greendns in nova. https://review.opendev.org/c/openstack/nova/+/833411 | 15:31 |
opendevreview | Merged openstack/os-vif master: Check for hybrid plugging in OVS https://review.opendev.org/c/openstack/os-vif/+/839102 | 16:04 |
opendevreview | Merged openstack/nova master: ignore deleted server groups in validation https://review.opendev.org/c/openstack/nova/+/847001 | 16:37 |
opendevreview | Merged openstack/nova stable/wallaby: compute: Ensure updates to bdms during pre_live_migration are saved https://review.opendev.org/c/openstack/nova/+/843680 | 16:54 |
opendevreview | Merged openstack/nova stable/wallaby: fup: Make connection_info returned by CinderFixture unique per attachment https://review.opendev.org/c/openstack/nova/+/844594 | 16:54 |
opendevreview | Merged openstack/nova stable/wallaby: fup: Assert state of connection_info during LM rollback in func tests https://review.opendev.org/c/openstack/nova/+/844595 | 16:54 |
opendevreview | Jay Faulkner proposed openstack/nova stable/victoria: [ironic] Minimize window for a resource provider to be lost https://review.opendev.org/c/openstack/nova/+/800873 | 18:44 |
*** dasm is now known as dasm|off | 20:42 | |
*** diablo_rojo is now known as Guest3865 | 21:17 |
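
The loader-selection idea that mfo and sean-k-mooney discuss between 14:03 and 14:08 can be sketched roughly as follows. This is a minimal illustration and not the actual nova patch: the function name `pick_uefi_loader`, its parameters, and the example paths are hypothetical, and the `"q35" in machine_type` check is only an assumption about how the machine type might be tested. The sketch tries any `*.secboot.fd` loader last when the machine type is not q35 and stops at the first file that exists, so the secure boot loader is still used when it is the only one present.

```python
import os
from typing import Callable, Iterable, Optional


def pick_uefi_loader(
    candidates: Iterable[str],
    machine_type: str,
    exists: Callable[[str], bool] = os.path.exists,
) -> Optional[str]:
    """Return the first existing loader path, or None if none exists.

    Hypothetical helper: on non-q35 machine types the *.secboot.fd
    loaders are moved to the end of the list instead of being removed,
    so they are only picked when nothing else is available.
    """
    paths = list(candidates)
    if "q35" not in machine_type:
        # Stable sort: non-secboot loaders keep their relative order
        # and come first; secboot loaders are tried last.
        paths.sort(key=lambda p: p.endswith(".secboot.fd"))
    for path in paths:
        if exists(path):
            # Equivalent of the missing `break`: stop at the first hit.
            return path
    return None


if __name__ == "__main__":
    # Illustrative paths only; real locations vary by distribution.
    example = [
        "/usr/share/OVMF/OVMF_CODE.secboot.fd",
        "/usr/share/OVMF/OVMF_CODE.fd",
    ]
    print(pick_uefi_loader(example, "pc"))
```

Passing `exists` as a parameter keeps the sketch testable without touching the filesystem; the default falls back to `os.path.exists`.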