*** dtantsur_ is now known as dtantsur | 01:50 | |
opendevreview | Nobuhiro MIKI proposed openstack/nova master: libvirt: Support maxphysaddr. https://review.opendev.org/c/openstack/nova/+/907516 | 02:38 |
opendevreview | melanie witt proposed openstack/nova master: libvirt: Configure and teardown ephemeral encryption secrets https://review.opendev.org/c/openstack/nova/+/826754 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: imagebackend: Add support to libvirt_info for LUKS based encryption https://review.opendev.org/c/openstack/nova/+/826755 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: Add encryption support to convert_image https://review.opendev.org/c/openstack/nova/+/870934 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: Add hw_ephemeral_encryption_secret_uuid image property https://review.opendev.org/c/openstack/nova/+/870935 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: libvirt: make <encryption> a sub element of <source> https://review.opendev.org/c/openstack/nova/+/905515 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: Support create with ephemeral encryption for qcow2 https://review.opendev.org/c/openstack/nova/+/870932 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: Support (resize|cold migration) with ephemeral encryption for qcow2 https://review.opendev.org/c/openstack/nova/+/870933 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: Add encryption support to qemu-img rebase https://review.opendev.org/c/openstack/nova/+/870936 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: Support snapshot with ephemeral encryption for qcow2 https://review.opendev.org/c/openstack/nova/+/870937 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: Support rebuild and unshelve with ephemeral encryption https://review.opendev.org/c/openstack/nova/+/870939 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: Support rescue with ephemeral encryption https://review.opendev.org/c/openstack/nova/+/873675 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: WIP Support live migration with ephemeral encryption https://review.opendev.org/c/openstack/nova/+/905512 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: libvirt: Introduce support for raw with LUKS https://review.opendev.org/c/openstack/nova/+/884313 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: libvirt: Introduce support for rbd with LUKS https://review.opendev.org/c/openstack/nova/+/889912 | 06:41 |
opendevreview | melanie witt proposed openstack/nova master: DNM test ephemeral encryption + resize: qcow2, raw, rbd https://review.opendev.org/c/openstack/nova/+/862416 | 06:42 |
melwitt | sean-k-mooney: this is a checkpoint ^ there are so many changes already, I wanted to push them for now while I keep working on stuff | 06:43 |
opendevreview | Merged openstack/nova master: testing: Use inspect.isfunction() to check signatures https://review.opendev.org/c/openstack/nova/+/883217 | 06:51 |
opendevreview | Fabian Wiesel proposed openstack/nova master: vmware: Integer division Python 2 -> 3 fix https://review.opendev.org/c/openstack/nova/+/907444 | 08:25 |
gibi | elodilles: when you are up, could you take a quick look at https://review.opendev.org/q/topic:%22power-mgmt-fixups%22+branch:stable/2023.1 thanks! | 08:29 |
elodilles | gibi: ack, added to my TODO! | 09:24 |
gibi | elodilles: thanks a lot | 09:24 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Reserve mdevs to return to the source https://review.opendev.org/c/openstack/nova/+/904209 | 10:35 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Modify the mdevs in the migrate XML https://review.opendev.org/c/openstack/nova/+/904258 | 10:35 |
sean-k-mooney | gibi: elodilles: im +2w on the 2023.1 gpu backports | 12:18 |
sean-k-mooney | sorry not gpu backport power management backport | 12:22 |
sean-k-mooney | https://review.opendev.org/q/topic:%22power-mgmt-fixups%22 | 12:22 |
sean-k-mooney | melwitt: ack ill deploy that in a 2 node environment and start testing it later today | 12:23 |
gibi | sean-k-mooney: thank you | 12:26 |
elodilles | sean-k-mooney: ack, thank you o:) | 12:29 |
opendevreview | Rajesh Tailor proposed openstack/nova master: Add support for showing requested az in output https://review.opendev.org/c/openstack/nova/+/904568 | 12:32 |
opendevreview | Rajesh Tailor proposed openstack/nova master: Add support for showing requested az in output https://review.opendev.org/c/openstack/nova/+/904568 | 13:15 |
opendevreview | Takashi Kajinami proposed openstack/placement master: tox: Drop envdir https://review.opendev.org/c/openstack/placement/+/907590 | 14:41 |
opendevreview | Takashi Kajinami proposed openstack/nova master: tox: Drop envdir https://review.opendev.org/c/openstack/nova/+/907591 | 14:42 |
opendevreview | Takashi Kajinami proposed openstack/osc-placement master: tox: Drop envdir https://review.opendev.org/c/openstack/osc-placement/+/907596 | 14:49 |
*** blarnath is now known as d34dh0r53 | 14:53 | |
opendevreview | Takashi Kajinami proposed openstack/os-vif master: tox: Drop envdir https://review.opendev.org/c/openstack/os-vif/+/907604 | 14:57 |
noonedeadpunk | hey folks. I have something weird going on. In a brand new (still not production) environment I've accidentally spotted, that one of computes is not mapped to the cell | 15:29 |
noonedeadpunk | so `nova-manage cell_v2 list_hosts` and `openstack compute service list --service nova-compute` differ by 1 compute not added to cell | 15:29 |
noonedeadpunk | `nova-manage cell_v2 discover_hosts` does not discover it | 15:30 |
noonedeadpunk | no errors or weird output is seen | 15:32 |
noonedeadpunk | Or well, scheduler does have `Host mapping not found for host compute01-az2. Not tracking instance info for this host.` -> that's exactly the host that's missing from the cell | 15:32 |
noonedeadpunk | only 1 cell is present just in case | 15:34 |
noonedeadpunk | hm... seems like running nova-manage cell_v2 discover_hosts --cell_uuid <UUID> --by-service did discover the host.... | 15:41 |
noonedeadpunk | and it was `--by-service` specifically | 15:42 |
noonedeadpunk | I can recall now somebody already suggested running by-service previously :) | 15:43 |
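The check described above, comparing the hosts mapped to a cell against the registered nova-compute services, boils down to a set difference. A minimal sketch, assuming the two host lists have already been pulled out of `nova-manage cell_v2 list_hosts` and `openstack compute service list --service nova-compute`; the function name and every sample host other than compute01-az2 are hypothetical:

```python
def find_unmapped_hosts(service_hosts, mapped_hosts):
    """Return compute hosts that have a service record but no cell mapping."""
    return sorted(set(service_hosts) - set(mapped_hosts))


# Hypothetical sample data; only compute01-az2 appears in the log.
service_hosts = ["compute01-az1", "compute01-az2", "compute02-az1"]
mapped_hosts = ["compute01-az1", "compute02-az1"]

unmapped = find_unmapped_hosts(service_hosts, mapped_hosts)
print(unmapped)
```

Any host this returns is a candidate for `nova-manage cell_v2 discover_hosts --by-service`, which is what worked here.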
bauzas | noonedeadpunk: as a reminder, when adding a new compute, you need to use nova-manage for telling which cell you should use for it | 15:45 |
bauzas | hence why you can't find it | 15:45 |
noonedeadpunk | well, we have `discover_hosts_in_cells_interval` set | 15:46 |
noonedeadpunk | and I was trying to run without --by-service 10 times by now | 15:47 |
noonedeadpunk | also running nova-manage cell_v2 discover_hosts --cell_uuid UUID was just giving out "Found 0 unmapped computes in cell: UUID" | 15:47 |
noonedeadpunk | so eventually I was expecting compute to be auto-discovered... | 15:49 |
noonedeadpunk | (like all others were) | 15:49 |
bauzas | ah ok | 15:55 |
bauzas | maybe this was a bug then | 15:56 |
sean-k-mooney | noonedeadpunk: do you have the same hostname in multiple cells or perhaps is it in a different cell than the others | 16:27 |
noonedeadpunk | sean-k-mooney: no, not that I'm aware of - that was the actual output: https://paste.openstack.org/show/bOGJJWefzMBvhKkdHkiZ/ | 16:32 |
sean-k-mooney | hum ok ya thats odd | 16:34 |
sean-k-mooney | the names all look unique | 16:34 |
sean-k-mooney | and they are all in cell 1 | 16:34 |
noonedeadpunk | yeah, and there was no weird db records either | 16:34 |
sean-k-mooney | i think there is a verbose mode for discover_hosts but now that its fixed im sure it wont reproduce | 16:35 |
noonedeadpunk | I can imagine there could be some connectivity issues in the region as it's heavily under development... but I've restarted all schedulers and the compute that was not found without any result | 16:35 |
noonedeadpunk | verbose was not helpful fwiw | 16:36 |
sean-k-mooney | and it worked fine when you mapped it with --by-service | 16:36 |
noonedeadpunk | https://paste.openstack.org/show/bx7p6C37Qv8LHym7Dviq/ | 16:36 |
noonedeadpunk | yup | 16:36 |
sean-k-mooney | the urllib3 thing can be ignored by the way | 16:36 |
sean-k-mooney | we should probably remove that warning | 16:36 |
noonedeadpunk | or well. I run --by-service without verbose... So I kinda assume it's what worked, as nothing else was done in between.... | 16:36 |
sean-k-mooney | it was fixed a long time ago | 16:36 |
noonedeadpunk | like that's the output that was in between basically when I found it fixed: https://paste.openstack.org/show/bvFhJpJyhSv0NYn6WHaF/ | 16:38 |
sean-k-mooney | i guess if you see this again try logging with --by-service and verbose and let us know if there is anything interesting | 16:38 |
noonedeadpunk | yeah, sure... | 16:39 |
noonedeadpunk | but potentially by-service makes sense, as compute was in service... so I assume it got mapped from there then | 16:39 |
noonedeadpunk | but yeah... | 16:39 |
sean-k-mooney | i dont remember how that works off the top of my head so ya im not sure | 16:41 |
sean-k-mooney | ... i forgot when doing multi node devstack you need to sync the data dir to make the ssl ca available if you dont turn that off | 16:45 |
* sean-k-mooney has not done multi node devstack by hand in about 2 years. i should probably just use my ansible roles... | 16:46 | |
sean-k-mooney | basically i need to do https://opendev.org/openstack/devstack/src/branch/master/roles/sync-devstack-data/tasks/main.yaml or the compute wont be able to talk to the controller because the ssl cert will be rejected | 16:48 |
noonedeadpunk | it's osa aio, so SSLs should be taken care of nicely | 17:19 |
sean-k-mooney | im not refering to osa | 17:20 |
* noonedeadpunk can't even recall how to do devstack manually | 17:20 | |
sean-k-mooney | i have a repo where i reuse the zuul job playbooks and roles to deploy multi node devstack | 17:20 |
noonedeadpunk | yeah, ok, gotcha, I guess I just missed the context | 17:21 |
sean-k-mooney | https://github.com/SeanMooney/ard/blob/6abfcae59013165404ab38ec80fa143a1c96b86a/ansible/deploy_multinode_devstack.yaml | 17:21 |
sean-k-mooney | right now im trying to figure out how to fix the ssl issues manually but might swap to ansible | 17:22 |
sean-k-mooney | but i havent used my repo in 2 years so i dont know if it still works | 17:22 |
sean-k-mooney | i used to just turn off tls when doing multi node to not need to copy the self signed certs | 17:23 |
sean-k-mooney | thats all that is currently failing | 17:23 |
clarkb | sean-k-mooney: I think you just need to copy the ca dir. IIRC devstack doesn't centralize the ca. It makes one and then copies it so each host can use it directly | 17:37 |
sean-k-mooney | ya i did that and it didnt update the ca on the compute | 17:37 |
sean-k-mooney | the zuul jobs just copy it to the subnode before it stacks | 17:38 |
sean-k-mooney | so im not sure why its different | 17:38 |
sean-k-mooney | perhaps permissions but im using the same user on both so im not sure | 17:38 |
sean-k-mooney | lrwxrwxrwx 1 root root 49 Feb 2 17:02 devstack-int.pem -> /usr/local/share/ca-certificates/devstack-int.crt | 17:44 |
sean-k-mooney | lrwxrwxrwx 1 root root 50 Feb 2 17:02 devstack-root.pem -> /usr/local/share/ca-certificates/devstack-root.crt | 17:44 |
sean-k-mooney | those are broken symlinks so i think if i just remove them | 17:44 |
sean-k-mooney | and stack it might fix itself | 17:44 |
sean-k-mooney | ah i missed one | 17:52 |
sean-k-mooney | i also need /opt/stack/data/devstack-cert.pem | 17:52 |
sean-k-mooney | i copied /opt/stack/data/ca-bundle.pem and /opt/stack/data/CA/ | 17:53 |
sean-k-mooney | now it stacked fine | 17:53 |
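The dangling `devstack-int.pem` and `devstack-root.pem` links above are the kind of thing a short script can spot before stacking. A minimal sketch, assuming you pass in whichever directory holds the links (the log does not name it); the function name is hypothetical:

```python
import os


def find_broken_symlinks(directory):
    """Return sorted names of symlinks in `directory` whose target is missing."""
    broken = []
    for entry in os.scandir(directory):
        # is_symlink() inspects the link itself; os.path.exists() follows it,
        # so a dangling link is a symlink for which exists() is False.
        if entry.is_symlink() and not os.path.exists(entry.path):
            broken.append(entry.name)
    return sorted(broken)
```

Calling `find_broken_symlinks(path)` on the certificate directory of the subnode would list links whose `.crt` targets were never copied over.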
sean-k-mooney | melwitt: https://termbin.com/ba59 less successful this time | 18:10 |
sean-k-mooney | thats using https://review.opendev.org/c/openstack/nova/+/889912 | 18:11 |
sean-k-mooney | so it looks like there is an error here https://review.opendev.org/c/openstack/nova/+/873675/22/nova/virt/libvirt/blockinfo.py#405 | 18:14 |
sean-k-mooney | my guess is you removed a function or did not do a git add at the right time when rebasing | 18:15 |
bauzas | sean-k-mooney: noonedeadpunk: fwiw, I'm getting the same issue with my own multinode devstack I just installed | 18:34 |
sean-k-mooney | discover hosts worked fine for me | 18:34 |
bauzas | discover_hosts only works if I use --by-service | 18:34 |
sean-k-mooney | i had no issue with it using melwitt's series | 18:35 |
sean-k-mooney | i wasnt technically running master but close enough | 18:35 |
bauzas | this was bizarre, the first compute service was working but I wasn't having any host_mappings values | 18:37 |
bauzas | oh wait, I found why https://paste.opendev.org/show/br3ghOvqs7NRQAfY80HX/ | 18:41 |
sean-k-mooney | oh you hit the centos libvirt bug | 18:41 |
sean-k-mooney | ya so libvirt has been borked on centos for 2 days | 18:41 |
bauzas | so the service ref was created but not the compute node one | 18:41 |
bauzas | sean-k-mooney: which one ? | 18:42 |
sean-k-mooney | that unicode error | 18:42 |
sean-k-mooney | UnicodeDecodeError: 'utf-8' codec can't decode byte 0x91 in position 0: invalid start byte | 18:42 |
sean-k-mooney | thats from libvirt | 18:42 |
sean-k-mooney | we have a cix for the next gen installer | 18:42 |
sean-k-mooney | we started seeing that in ci 2 days ago | 18:42 |
sean-k-mooney | https://issues.redhat.com/browse/OSPCIX-176 | 18:43 |
sean-k-mooney | that should be public for everyone ? | 18:43 |
clarkb | it asks me to login | 18:44 |
sean-k-mooney | ok the ci escalation project must be private... | 18:44 |
sean-k-mooney | the ci job is running in rdo based on github patches https://review.rdoproject.org/zuul/build/5cf66b0bd0b6402faaa3306d3b193f81 | 18:45 |
bauzas | the RHEL bug is talking about the VPD issue, but I don't have this hardware | 18:47 |
sean-k-mooney | https://logserver.rdoproject.org/27/127/e7992da92f6f67327ffbf593a64b712e36b04cc6/github-check/tcib-podified-multinode-edpm-deployment-crc/5cf66b0/controller/ci-framework-data/logs/192.168.122.100/log/containers/nova/nova-compute.log | 18:47 |
sean-k-mooney | i commented that it looks like the same vpd parsing issue | 18:47 |
sean-k-mooney | but it might not be | 18:47 |
sean-k-mooney | basically the libvirt api is returning non unicode bytes | 18:48 |
sean-k-mooney | so we can't decode them into a string properly | 18:48 |
sean-k-mooney | we are calling self.get_connection().getType() | 18:48 |
sean-k-mooney | which is exploding in the libvirt python module | 18:49 |
sean-k-mooney | when it calls libvirtmod.virConnectGetType(self._o) | 18:49 |
sean-k-mooney | so thats a libvirt or libvirt python bug | 18:49 |
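The error message quoted above can be reproduced in plain Python, independent of libvirt: 0x91 is a UTF-8 continuation byte, so it is rejected as the first byte of a string. This is only an illustration of the decoding failure; it says nothing about where libvirt gets the byte from.

```python
raw = b"\x91"  # continuation byte (0b10010001); cannot start a UTF-8 character

try:
    raw.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# For inspection, errors="replace" swaps the bad byte for U+FFFD instead of raising.
lenient = raw.decode("utf-8", errors="replace")
```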
bauzas | that said I had no problem with the other compute | 18:49 |
bauzas | lemme look if I use the same version between the two computes | 18:50 |
sean-k-mooney | if youre not can you record which one works | 18:50 |
bauzas | nope, I use the same version :( | 18:51 |
bauzas | that's bizarre | 18:51 |
sean-k-mooney | bauzas: can you comment on the downstream bug tracker | 18:52 |
sean-k-mooney | lets see if we can get some of our virt team folks to have a look on monday before it starts impacting the upstream gate | 18:53 |
bauzas | the CIX one ? | 18:53 |
sean-k-mooney | ya | 18:53 |
bauzas | cool | 18:53 |
sean-k-mooney | so it sounds like it sometimes works and sometimes does not | 18:53 |
bauzas | actually, I can confirm that the other compute got the same exception, hence why I had no compute node | 18:59 |
sean-k-mooney | oh ok so both failed | 18:59 |
sean-k-mooney | melwitt: i fixed the missing function but then it failed elsewhere on the compute with AttributeError: 'Qcow2' object has no attribute 'disk_encryption' | 19:00 |
sean-k-mooney | in disk.libvirt_info() | 19:01 |
bauzas | apparently, now the call works for the first node | 19:01 |
sean-k-mooney | melwitt: ill try and take a look at your code again on monday if you can fix the rebase issues | 19:02 |
sean-k-mooney | bauzas: so its flaky? | 19:02 |
sean-k-mooney | i.e. on the same node it sometimes works and sometimes does not | 19:02 |
bauzas | actually, when directly calling libvirt with the python binding, I have the problem | 19:03 |
bauzas | (for the host that now works) | 19:03 |
bauzas | https://paste.opendev.org/show/blEFSbHapI8eCPEMRgyY/ | 19:03 |
sean-k-mooney | if you had a standalone reproducer that would probably help | 19:05 |
sean-k-mooney | i.e. just a python script that imports libvirtmod and calls virConnectGetType | 19:05 |
bauzas | that's what I did | 19:05 |
sean-k-mooney | oh you didnt use the nova code | 19:05 |
sean-k-mooney | ok well if you can reproduce just with the libvirt module in a short script | 19:06 |
sean-k-mooney | it will be a lot simpler to report this to the libvirt folks | 19:06 |
sean-k-mooney | we dont have a libvirt bug for this yet | 19:06 |
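A standalone reproducer along the lines discussed above might look like the following. This is a sketch, not the script bauzas actually ran: it goes through the public `libvirt` wrapper (whose `getType()` is the call that ends up in `libvirtmod.virConnectGetType` per the traceback discussed earlier), the `qemu:///system` URI is an assumption, and `probe_get_type` is a hypothetical helper. Several attempts are made because the failure looked intermittent.

```python
def probe_get_type(open_connection, attempts=5):
    """Call getType() on a fresh connection several times and record each outcome."""
    results = []
    for attempt in range(1, attempts + 1):
        conn = None
        try:
            conn = open_connection()
            results.append(f"attempt {attempt}: ok, type={conn.getType()!r}")
        except Exception as exc:  # includes the UnicodeDecodeError seen in the log
            results.append(f"attempt {attempt}: {type(exc).__name__}: {exc}")
        finally:
            if conn is not None:
                conn.close()
    return results


if __name__ == "__main__":
    try:
        import libvirt  # libvirt-python bindings
    except ImportError:
        print("libvirt-python is not installed")
    else:
        # Assumed URI; use whatever the compute node actually connects to.
        for line in probe_get_type(lambda: libvirt.openReadOnly("qemu:///system")):
            print(line)
```

Passing the connection factory in as an argument keeps the loop testable without a hypervisor and makes it easy to try a different URI.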
bauzas | just added a comment | 19:07 |
bauzas | lemme see if virsh gets that too | 19:08 |
sean-k-mooney | im not sure what the virsh equivalent command would be | 19:08 |
bauzas | at least domcapabilities doesn't say anything | 19:17 |
melwitt | sean-k-mooney: ugh, ok, sorry about that :( | 19:18 |
bauzas | ok I need to leave | 19:19 |
bauzas | eventually I was able to start both of the services | 19:20 |
bauzas | but we still have the bug | 19:20 |
melwitt | sean-k-mooney: every time I splice commits apart I manage to f something up 🙄 I'll get it fixed. thanks for trying it out | 19:20 |
sean-k-mooney | bauzas: i filed a bug with the libvirt folks | 19:20 |
bauzas | cool | 19:21 |
* bauzas goes off for the weekend \o | 19:21 | |
sean-k-mooney | melwitt: its fine that happens to me too | 19:24 |
artom | dansmith, I think with your stable node UUID series https://bugs.launchpad.net/nova/+bug/1817833 is fixed, no? | 19:24 |
sean-k-mooney | melwitt: i usually end up going commit by commit and running a subset of tox -e py3,functional,pep8 | 19:25 |
artom | I happened upon its functional reproducer doing the downstream backport... | 19:25 |
melwitt | sean-k-mooney: yeah. I do that too ... most of the time 😒 | 19:28 |
dansmith | artom: um, not sure | 19:29 |
dansmith | artom: they seem to be complaining that they can't delete the compute to get it to have a new identity and my thing doesn't really fix that, it just means nova-compute will fail to start | 19:30 |
sean-k-mooney | artom: dansmith my reading was maybe in some specific cases as a side effect | 19:30 |
dansmith | if you reset the uuid it will probably be right back to this problem when the name on the RP conflicts | 19:30 |
sean-k-mooney | so in the functional tests the reproducer might be impacted because the agent will fail to start or similar | 19:31 |
dansmith | right, it doesn't solve the problem (if you agree it's a problem) it just refuses to fail at the place it did before, and does so earlier (at start) | 19:31 |
sean-k-mooney | ya that was my feeling but i didnt fully parse the bug details | 19:32 |
artom | I wasn't really pushing one way or another, just wanted to check that we don't have stable bug reports laying around | 19:33 |
artom | (I mean, I'm sure we do, but if we can close _one_... ;) | 19:33 |
opendevreview | Merged openstack/nova stable/2023.1: Revert "[pwmgmt]ignore missin governor when cpu_state used" https://review.opendev.org/c/openstack/nova/+/905674 | 19:34 |
*** priteau_ is now known as priteau | 21:44 |