frickler | priteau: ^^ | 05:56 |
---|---|---|
opendevreview | Balazs Gibizer proposed openstack/nova master: Revert "Temp disable nova-manage placement heal_allocation testing" https://review.opendev.org/c/openstack/nova/+/816242 | 07:05 |
bauzas | good morning Nova | 08:20 |
bauzas | gibi: sent https://review.opendev.org/c/openstack/nova/+/815940 to the gate | 08:21 |
gibi | bauzas: good morning and thanks | 08:51 |
bauzas | np | 08:52 |
bauzas | how did things go these last 2 days? | 08:52 |
gibi | nothing significant for me but I was mostly off yesterday | 08:52 |
frickler | kashyap: in case you didn't see it yet: https://gitlab.com/libvirt/libvirt/-/issues/229 | 09:08 |
kashyap | frickler: Morning | 09:08 |
kashyap | frickler: Just back after 2 days away from email. I indeed didn't see it. Thanks for filing it! | 09:08 |
frickler | kashyap: I also tested with a custom built qemu in https://review.opendev.org/c/openstack/devstack/+/815958 , which essentially has tb-size=64M | 09:09 |
frickler | the failures seem to be unrelated | 09:09 |
kashyap | frickler: Oh, cool, so you fetched the file and tested it. (I see no failures there; Zuul gave a +1) | 09:14 |
kashyap | frickler: Did setting it to 64M bring it back to the "previous capacity"? | 09:14 |
kashyap | (I'm putting it in quotes because, I don't know how many instances you were able to launch before this QEMU change) | 09:14 |
frickler | kashyap: the failures were in some of the rechecks. the old failures weren't 100% deterministic; it depends on how tempest with -c4 runs parallel jobs that all start multiple instances | 09:20 |
kashyap | frickler: Right. Shall we let it run on multiple clouds / setups for a week or so? To rule out that it's something other than the tb-size? | 09:21 |
frickler | I based the 64M on looking at a single instance locally with a 128M flavor, qemu then uses a bit more memory than with 4.2, but not too much hopefully, like ~200M instead of 150M | 09:21 |
frickler | kashyap: I intend to do a couple more rechecks, but there seems to be some issue on the neutron side which makes things unstable | 09:22 |
frickler | I'm pretty confident by now though that tb-size is the trigger | 09:23 |
stephenfin | I need to test the 'GET /servers/{server_id}/migrations/{id}' API, meaning I need a way to slow down live migrations so I catch one in the act. Anyone have a suggestion for an easy throttle I can set to do that? | 09:37 |
stephenfin | (rather than relying on big or busy guests) | 09:38 |
gibi | stephenfin: limiting bandwidth ? | 09:39 |
stephenfin | I assume there isn't a nova or libvirt config option I can use for that though? | 09:39 |
stephenfin | This is a simple DevStack two-node deployment, so I don't have a separate management network :) | 09:39 |
gibi | there is something in libvirt | 09:39 |
gibi | as there is virsh migrate-setspeed command in virsh | 09:40 |
stephenfin | oh, I looked and didn't see anything obvious | 09:40 |
* stephenfin googles | 09:40 | |
kashyap | frickler: Do mention the Neutron issue in the change, if / when you get a minute | 09:41 |
kashyap | frickler: And probably Cc some folks from Neutron who might be able to debug | 09:41 |
stephenfin | gibi: that's exactly what I wanted. Thanks! | 09:41 |
kashyap | stephenfin: Yes, migrate-setspeed lets you throttle indeed | 09:42 |
gibi | stephenfin: cool | 09:42 |
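For the log, a minimal sketch of the throttle being discussed (assumptions: the domain name `instance-00000001` is illustrative, and `migrate-setspeed` takes a bandwidth in MiB/s):

```console
# On the source compute node, while the guest is running:
virsh migrate-setspeed instance-00000001 1    # clamp migration bandwidth to ~1 MiB/s
virsh migrate-getspeed instance-00000001      # confirm the new limit

# once done poking the API, raise the limit again so the migration can finish:
virsh migrate-setspeed instance-00000001 1000
```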
bauzas | mmmm | 09:51 |
bauzas | just saw a new "Your Turn" series in Gerrit default dashboard | 09:51 |
bauzas | what's the "attention:self" query ? | 09:51 |
bauzas | hah, nevermind, found https://gerrit-review.googlesource.com/Documentation/user-attention-set.html | 09:52 |
bauzas | interesting | 09:52 |
gibi | I'm still learning the rules described in ^^ | 09:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Reno for qos-minimum-guaranteed-packet-rate https://review.opendev.org/c/openstack/nova/+/805046 | 10:15 |
gibi | bauzas: fyi, this is the final patch (the reno) https://review.opendev.org/c/openstack/nova/+/805046 for the https://blueprints.launchpad.net/openstack/?searchtext=qos-minimum-guaranteed-packet-rate blueprint. So we can close that bp soon \o/ | 10:17 |
bauzas | gibi: wow, this was fast. | 10:17 |
gibi | bauzas: we only missed the nova-manage part of that bp in xena | 10:18 |
bauzas | yup, I know | 10:18 |
bauzas | but still :) | 10:18 |
gibi | yeah, it is always nice to close out a bp even before M1 | 10:18 |
bauzas | we discussed this at the PTG, I wasn't expecting the nova-manage patch to land that soon :) | 10:18 |
gibi | it is thanks to stephenfin and melwitt | 10:19 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Reno for qos-minimum-guaranteed-packet-rate https://review.opendev.org/c/openstack/nova/+/805046 | 10:24 |
gibi | bauzas: btw, there is a bug fix for the series (for those part we landed in xena) https://review.opendev.org/c/openstack/nova/+/811396 | 10:26 |
bauzas | +w | 10:28 |
opendevreview | Federico Ressi proposed openstack/nova master: Debug Nova APIs call failures https://review.opendev.org/c/openstack/nova/+/806683 | 10:32 |
lyarwood | frickler: just catching up after a few weeks out, excellent work with the QEMU tb-size issue! | 10:38 |
kashyap | lyarwood: Yeah, libvirt needs to wire it up now, though | 10:42 |
kashyap | I'll file a RHEL libvirt RFE - that might get on their triage queue quicker | 10:43 |
gibi | bauzas: awesome, thank you | 10:44 |
* bauzas goes off for gym duties | 10:47 | |
lyarwood | kashyap: yeah, shame we can't hack around this in the meantime somehow | 10:50 |
lyarwood | kashyap: couldn't we pass QEMU args directly through libvirt from Nova in the meantime? | 10:50 |
* lyarwood has a look | 10:51 | |
kashyap | lyarwood: Definitely, there's QEMU command-line passthrough... | 10:51 |
kashyap | For libvirt XML | 10:51 |
lyarwood | second day back and I'm already writing another hackaround | 10:51 |
kashyap | lyarwood: But wait: | 10:51 |
kashyap | Nova doesn't have the XML modelling classes for command-line passthrough (for good reasons) :-( | 10:51 |
kashyap | lyarwood: The only current hack is to upload a custom QEMU build with that built in | 10:52 |
lyarwood | ewww | 10:52 |
lyarwood | I'd rather add the logic in Nova with a workaround option tbh | 10:52 |
lyarwood | than build our own custom QEMU | 10:52 |
kashyap | lyarwood: I agree, it's nasty to do the custom builds for medium-term | 10:52 |
kashyap | The logic in Nova would require wiring in these bits, BTW: https://libvirt.org/kbase/qemu-passthrough-security.html | 10:53 |
kashyap | (Including the namespace at the top) | 10:53 |
lyarwood | Yup that's easy enough | 10:53 |
* lyarwood gives it a go now | 10:54 | |
kashyap | And still it requires more edits. I was testing last week | 10:54 |
kashyap | When using `-accel tcg,tb-size=256`, we should remove "accel=tcg" from `-machine q35,accel=tcg` | 10:54 |
kashyap | Otherwise QEMU fails to launch | 10:55 |
kashyap | (I think libvirt uses the latter syntax by default: "-machine ... accel=") | 10:55 |
kashyap | (Yep, it does. Just verified) | 10:56 |
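For reference, a sketch of the passthrough XML being discussed (illustrative only: the tb-size value is an example, and as kashyap's testing shows, this form still collides with the accel= that libvirt itself puts on -machine):

```xml
<!-- the qemu namespace declared on <domain>, per the libvirt kbase page above -->
<domain type='qemu' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  ...
  <qemu:commandline>
    <qemu:arg value='-accel'/>
    <qemu:arg value='tcg,tb-size=64'/>
  </qemu:commandline>
</domain>
```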
ebbex | Is there a option/toggle to disable sending numa_topology from nova-compute? (We have some numascale hardware that submits "Data too long for column 'numa_topology') | 10:56 |
lyarwood | kashyap: oh fun | 10:57 |
kashyap | lyarwood: Yeah. </shameless-plug> For more on the nature of the QEMU command line, see my LWN article: https://lwn.net/SubscriberLink/872321/221e8d48eb609a38/ | 10:58 |
kashyap | (Especially the "Complexity on the QEMU command line" section) | 10:59 |
gibi | lyarwood: o/ we can revert the temp disable of heal_allocation testing in nova-next; the nova-manage support landed during the night. https://review.opendev.org/c/openstack/nova/+/816242 | 11:01 |
lyarwood | awesome checking | 11:01 |
gibi | thanks | 11:01 |
lyarwood | +W'd | 11:01 |
gibi | thanks | 11:02 |
lyarwood | kashyap: would you be able to test if we could overwrite the original `-machine q35,accel=tcg` part using <qemu:commandline> via libvirt? | 11:03 |
kashyap | lyarwood: Let me try | 11:04 |
kashyap | I think <qemu:commandline> _does_ take precedence | 11:04 |
* kashyap will confirm in a few | 11:04 | |
lyarwood | would be ace as Nova could do that itself then | 11:04 |
kashyap | lyarwood: Afraid, I was wrong :-( | 11:17 |
kashyap | I tried this: | 11:17 |
* kashyap is getting a paste-bin | 11:17 | |
kashyap | lyarwood: It doesn't overwrite, that was the XML (see line-1 and lines 102-105) https://paste.centos.org/view/1fcbc6a4 | 11:19 |
kashyap | With that, when I start the guest, it gives the familiar: | 11:19 |
kashyap | $> virsh start cvm2 | 11:19 |
kashyap | error: Failed to start domain 'cvm2' | 11:19 |
kashyap | error: internal error: process exited while connecting to monitor: 2021-11-02T11:18:34.543504Z qemu-kvm: The -accel and "-machine accel=" options are incompatible | 11:19 |
lyarwood | sorry was just on a call | 11:21 |
kashyap | No rush; I don't count on instant responses :-) | 11:21 |
lyarwood | kashyap: what if you also define -machine in the XML? | 11:22 |
kashyap | Hmm, lemme try | 11:22 |
kashyap | lyarwood: Wait, you mean setting -accel and -machine in qemu:commandline explicitly? | 11:22 |
lyarwood | kashyap: yes | 11:23 |
kashyap | (If so, that should fail the same way as above, but lemme double-confirm. libvirt uses "-machine accel" under the hood, inferred from <domain type='kvm'>) | 11:24 |
kashyap | Yep, it fails the same way. | 11:24 |
kashyap | lyarwood: Oh, wait. There might be another hack, based on my chat w/ Paolo last week: | 11:25 |
kashyap | 17:55 < kashyap> bonzini: Hm, how exactly does "-machine accel=kvm -machine accel=tcg" differ from "-accel kvm -accel tcg"? | 11:25 |
kashyap | 17:55 < bonzini> "-machine accel=tcg" overwrites "-machine accel=kvm" | 11:25 |
kashyap | lyarwood: So, I can specify by "-accel tcg -accel kvm" ... and see if that works :D | 11:25 |
* lyarwood tilts head | 11:26 | |
kashyap | Gaah, no, ignore me. I misread the above complexity. | 11:26 |
* kashyap taps on the table and thinks ... wonder if DanPB knows a trick | 11:27 | |
kashyap | No, there isn't a current trick. | 11:39 |
kashyap | lyarwood: That said, based on last week's chat w/ QEMU folks, libvirt itself should switch to "-accel" as that's recommended over "-machine accel" | 11:40 |
* kashyap goes to file an upstream libvirt GitLab ticket for that | 11:40 | |
lyarwood | kashyap: argh kk, so there's no workaround until that happens | 11:41 |
kashyap | No, besides the ugly hack we both recoil at :D | 11:41 |
lyarwood | kashyap: blocking pretty much all upstream OpenStack testing using qemu until then | 11:41 |
lyarwood | yeah without the custom build | 11:41 |
lyarwood | urgh | 11:41 |
lyarwood | tbh we need to make a big deal out of this | 11:41 |
kashyap | Yeah, QEMU changed it pretty much w/o considering the management tools :-( | 11:42 |
* kashyap --> needs lunch, hangry | 11:43 | |
opendevreview | Lee Yarwood proposed openstack/nova master: nova-manage: Always get BDMs using get_by_volume_and_instance https://review.opendev.org/c/openstack/nova/+/811716 | 11:47 |
lyarwood | https://review.opendev.org/q/topic:%22bug%252F1943431%22+(status:open%20OR%20status:merged) & https://review.opendev.org/q/topic:%22bug%252F1937084%22+(status:open%20OR%20status:merged) should be ready for reviews if people have time btw, simple bugfixes | 11:47 |
EugenMayer | When trying to rebuild an instance and use --preserve-ephemeral I see `The current driver does not support preserving ephemeral partitions.` | 12:01 |
EugenMayer | Is this option only available when using storage like nfs/ceph? But how is that different from volumes then? Currently I use the compute node's local storage | 12:01 |
EugenMayer | I use LVM on my computes with ext4 - do I need zfs/btrfs for that to work? | 12:03 |
sean-k-mooney[m] | EugenMayer that is only supported on ironic | 12:04 |
sean-k-mooney[m] | it's not supported with libvirt or any other vm or container based driver | 12:04 |
sean-k-mooney[m] | rebuild is intended to remove all data from the instance by recreating the root disk and any ephemeral disks | 12:05 |
sean-k-mooney[m] | if you want to use rebuild and preserve data you should store your data in cinder volumes | 12:05 |
EugenMayer | i thought of ironic as just bare-metal provisioning (via bifrost?) and then running libvirt - but that is wrong? | 12:06 |
EugenMayer | ironic means that one does not use any hypervisor at all - the bare metal is the actual instance. That is the point, right? sean-k-mooney[m] | 12:07 |
sean-k-mooney[m] | ironic is openstack's baremetal-as-a-service project and it can be used with nova to provide instances that are physical servers instead of vms | 12:07 |
EugenMayer | Understood - thank you for clarifying | 12:08 |
sean-k-mooney[m] | bifrost is an installer for ironic written in ansible | 12:08 |
sean-k-mooney[m] | bifrost installs ironic in standalone mode so it can be used without the rest of openstack to manage your physical hardware | 12:09 |
EugenMayer | Thank you! | 12:11 |
sean-k-mooney[m] | lyarwood: given we do not allow the use of qemu arg passthrough in nova i don't see any way for us to address this in nova | 12:12 |
EugenMayer | One question about cinder - how do you deal with databases? I mean using NFS or the like and storing (running) a database on such storage would heavily impact performance - this was the main reason to use local disk (we have avoided cinder in our setup idea so far). How do you deal with that? Are you using a specific cinder backend like ceph/gluster so you | 12:12 |
EugenMayer | actually have local latency and 'sync' the data to the central storage? | 12:12 |
sean-k-mooney[m] | for database workloads i think it's more common to use a dedicated san and mount the data over iscsi; nfs really is not up to that level of iops. ceph can handle databases but generally you will need flash if you have high iops | 12:14 |
sean-k-mooney[m] | i know that many do use local for dbs | 12:15 |
sean-k-mooney[m] | e.g. the root disk or ephemeral disk, but then you just need to ensure that you do not use rebuild, and make backups at the application level | 12:15 |
EugenMayer | ok so this is a common issue | 12:16 |
EugenMayer | sean-k-mooney[m] with flash you mean SSD/NVMe drives, right? (we only have the latter) | 12:17 |
sean-k-mooney[m] | yes, if you have nvme storage and high speed 25G+ networking you can deploy high iops workloads on ceph but your network will become the bottleneck | 12:18 |
EugenMayer | well our network is about 1Gbit | 12:19 |
EugenMayer | it's provider based | 12:19 |
EugenMayer | do i understand ceph correctly here, that it is actually local access with a 'backend sync' in the background, unlike nfs, which is transparent access over the network mount with the performance pain | 12:20 |
sean-k-mooney[m] | the normal way to deploy databases with local storage is to deploy them in a 3 node ha cluster with local storage and backups to cinder volumes, with updates managed via yum/apt etc. inside the vms. | 12:21 |
sean-k-mooney[m] | no ceph is directly acessed over the network | 12:21 |
sean-k-mooney[m] | it uses the rbd protocol rather than iscsi but it's more similar to iscsi than nfs | 12:22 |
EugenMayer | this means that one rather runs central db clusters for each DB variant (5.5, 5.6, 8 or pg 9.6, 10, 10) and loses the encapsulation of every app travelling with its database, like we use right now (self-contained docker-compose / k8s stacks) | 12:23 |
sean-k-mooney[m] | if you need to use rebuild because of a higher level orchestrator then effectively the best way to do that with only local storage is to serialize the rebuild: remove 1 instance from the cluster, rebuild it, rejoin it to the cluster, then wait for it to sync with the latest state. then repeat for the rest. | 12:24 |
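Sketched out, the serialized rebuild sean-k-mooney describes looks something like this (server names and the image placeholder are hypothetical, and the cluster join/leave steps are application-specific):

```console
# one db node at a time:
# 1. remove db-node-1 from the database cluster (application-level step)
openstack server rebuild --image <new-image> db-node-1
# 2. rejoin db-node-1 to the cluster and wait until it has synced the latest state
# 3. repeat for db-node-2, db-node-3, ...
```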
sean-k-mooney[m] | well with k8s it changes slightly | 12:24 |
sean-k-mooney[m] | in that it assumes that you have shared network based storage by default | 12:25 |
EugenMayer | it can be, also k8s can use ephemeral, which we plan to | 12:25 |
sean-k-mooney[m] | so it assumes it can just terminate the db container and when it's recreated after an update it can reconnect to the same storage on any host and get the data back | 12:25 |
sean-k-mooney[m] | you plan to use the local provider to back your persistent volume claims? | 12:26 |
EugenMayer | yeah, that is the optional/usual assumed mode in k8s - it can though also use ephemeral storage which then cannot be distributed to other nodes just like that | 12:27 |
sean-k-mooney[m] | right, the local provider is not normally intended for production use | 12:27 |
EugenMayer | you plan to use the local provider to back your persistent volume claims? <- not sure I can answer this question / understand it | 12:27 |
EugenMayer | local provider being what you define as 'local disk'? | 12:27 |
sean-k-mooney[m] | it can be used of course but it puts the burden of persisting the storage on the operator, to configure the storage to be ha by some means | 12:28 |
EugenMayer | or no ha at all | 12:28 |
EugenMayer | depends on the needs, of course | 12:28 |
EugenMayer | sean-k-mooney[m] you really helped enormously. Thanks. I guess i have to rethink the ideas with all the input given. | 12:29 |
sean-k-mooney[m] | EugenMayer: em, in k8s there are 2 ways to use local storage: 1. don't use a k8s volume and just use the storage in the container fs, 2. configure a local storage provider on each host that can be used to create persistent volumes that are attached to pods via persistent volume claims | 12:29 |
EugenMayer | sean-k-mooney[m] i think we planned the second | 12:30 |
sean-k-mooney[m] | but yes, if you have no ha that works, as long as you are careful with the data replication and/or don't need the data to outlive the container | 12:30 |
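A hedged sketch of option 2 above, i.e. a statically provisioned local PersistentVolume (every name, the path and the capacity here are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-example
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/nvme0        # a disk mounted on the node
  nodeAffinity:                   # local PVs must be pinned to a node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1
```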
sean-k-mooney[m] | i have never done this mind you, but i have always wondered how mergerfs would work in production. e.g. could you use it to merge your local storage with remote such that all data is synced to the remote storage in the background. | 12:33 |
sean-k-mooney[m] | anyway im not sure how much i helped but i have to run o/ | 12:33 |
opendevreview | Merged openstack/nova master: Fix unit test for oslo.concurrency 4.5 https://review.opendev.org/c/openstack/nova/+/815940 | 12:42 |
opendevreview | Merged openstack/nova master: Query ports with admin client to get resource_request https://review.opendev.org/c/openstack/nova/+/811396 | 12:42 |
opendevreview | Merged openstack/nova master: Revert "Temp disable nova-manage placement heal_allocation testing" https://review.opendev.org/c/openstack/nova/+/816242 | 12:42 |
lyarwood | sean-k-mooney: hey, sorry, missed your reply above. yeah, we could easily add the config classes and only use them behind a workarounds config option, but as kashyap highlighted, even then it isn't going to work thanks to some other QEMU bugs | 13:24 |
EugenMayer | sean-k-mooney[m] you helped me BIG time. | 13:25 |
EugenMayer | sean-k-mooney[m] i ask myself whether working with freezer could be an option for rebuilding while sticking to local disks | 13:29 |
gibi | lyarwood: could I ask you a favor and look at the backports of https://review.opendev.org/q/topic:bug/1944759 ? it is a fairly easy patch and a clean backport | 14:31 |
lyarwood | ack, just catching up with some downstream backports first then I'll try to get to these | 14:32 |
lyarwood | FWIW I'll be afk for the upstream meeting today, have to fetch my kid from nursery | 14:32 |
gibi | thanks lyarwood | 14:35 |
bauzas | gibi: hmpff, I need to ask you two things | 14:37 |
gibi | bauzas: sure, shoot | 14:37 |
bauzas | 1/ I'll need to go off at 5.30pm in our TZ, so I can chair the nova meeting for only 30 mins | 14:38 |
bauzas | 2/ I'll be off tomorrow | 14:38 |
gibi | 1/ I will be here to chair things after you leave | 14:38 |
gibi | 2/ I will be off tomorrow too :) | 14:38 |
bauzas | ahah ok :) | 14:40 |
bauzas | thanks :) | 14:40 |
gibi | I'm not sure what was the request in 2/ :D | 14:40 |
bauzas | gibi: just to tell you I was off :) | 14:50 |
gibi | ahh OK | 14:50 |
artom | That (wait, what exactly?) reminds me - what do we need to do to get https://review.opendev.org/c/openstack/nova/+/796909 moving? | 15:05 |
artom | That's only the top change, there's 2 backport series below it, with some dependencies that are... borderline backportable? I mean to me they clearly are | 15:05 |
artom | But there's some policy controversy there :) | 15:05 |
EugenMayer | when swift / cinder are not active and I do a snapshot on a compute, where is the snapshot stored? | 15:09 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Enable min pps tempest testing in nova-next https://review.opendev.org/c/openstack/nova/+/811748 | 15:20 |
bauzas | warning European folks, our nova meeting will start in 28 mins !!! | 15:32 |
bauzas | daylight savings change | 15:32 |
bauzas | 1600 UTC is now 5pm for CET and 4pm for the UK (GMT now, since BST has ended) | 15:33 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/xena: Reproduce bug 1945310 https://review.opendev.org/c/openstack/nova/+/811405 | 15:36 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/xena: Query ports with admin client to get resource_request https://review.opendev.org/c/openstack/nova/+/811407 | 15:36 |
EugenMayer | running kvm on an (empty) compute with 128GB ram, EE nvmes (very fast, raid1 mdadm) and 4 cores on an AMD Ryzen 9 5950X 16-Core Processor, the instance takes 3 minutes to install python3-pip - and it is not the download but the processing. What could be the reason that it underperforms so massively? | 15:37 |
EugenMayer | doing the same on the compute host would take < 8s | 15:38 |
EugenMayer | wait, the compute shows as QEMU in the hypervisor overview... how did that happen? | 15:39 |
sean-k-mooney[m] | EugenMayer: kvm is an acceleration, not a hypervisor | 15:40 |
sean-k-mooney[m] | qemu is the hypervisor | 15:40 |
sean-k-mooney[m] | we show the same regardless of whether you use qemu with the tcg backend or kvm | 15:40 |
EugenMayer | i see | 15:41 |
bauzas | kashyap: lyarwood: need help with some bug to triage upstream https://bugs.launchpad.net/nova/+bug/1949224 | 15:41 |
* kashyap clicks | 15:41 | |
EugenMayer | i wondered, since the default of nova_compute_virt_type is kvm | 15:41 |
EugenMayer | sean-k-mooney[m] how to be sure KVM is used? | 15:42 |
kashyap | bauzas: Hmm, I have a vague recollection of a similar issue from the past, but I need to reload some context. Will check tomorrow | 15:42 |
bauzas | kashyap: I just asked for qemu/libvirt versions | 15:42 |
sean-k-mooney[m] | on the compute host you can do a virsh list and then a virsh dumpxml on the domain | 15:42 |
sean-k-mooney[m] | EugenMayer: ^ | 15:42 |
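i.e., something along these lines on the compute host (the domain name is whatever `virsh list` reports; `instance-00000001` is illustrative):

```console
virsh list
virsh dumpxml instance-00000001 | grep '<domain type'
# <domain type='kvm' ...> means KVM acceleration is in use;
# type='qemu' would mean plain TCG emulation
```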
bauzas | kashyap: I'm tempted to say this is unrelated to Nova but rather a libvirt/virt bug | 15:42 |
kashyap | bauzas: That's a good catch. We don't test this area all that much | 15:42 |
kashyap | Let's see if it's a bug yet :) | 15:43 |
EugenMayer | sean-k-mooney[m]: <domain type='kvm' id='1'> | 15:43 |
sean-k-mooney[m] | yep that should be using kvm then | 15:44 |
EugenMayer | so what could be the actual reason for this massive slowdown? this host easily hosted heavy load with proxmox | 15:44 |
sean-k-mooney[m] | what do you have the cpu model set to? | 15:45 |
sean-k-mooney[m] | and mode | 15:45 |
sean-k-mooney[m] | by default it's unset, which for the libvirt driver will get treated as if you set host-model | 15:45 |
EugenMayer | sean-k-mooney[m] https://gist.github.com/EugenMayer/cacd5c44ae7dafffa31c9c8025bee3aa | 15:45 |
sean-k-mooney[m] | i know that there were slowdowns on older qemu versions with amd cpus because they did not have the correct cpu models defined | 15:46 |
sean-k-mooney[m] | you should probably try setting [libvirt]/cpu_mode=host-passthrough | 15:46 |
EugenMayer | this is bullseye debian - same as for proxmox | 15:46 |
EugenMayer | i would assume that it would run the same way. Surely pve uses a different kernel, but I'm still surprised | 15:47 |
sean-k-mooney[m] | if that improves the performance on the ryzen 5950x then it likely means you just need a newer qemu to get better performance out of the box without using host-passthrough | 15:48 |
EugenMayer | understood | 15:48 |
EugenMayer | trying to up the kernel from the backports and will check what qemu version would be on ubuntu | 15:48 |
EugenMayer | thank you for the hint! | 15:48 |
EugenMayer | how would i actually change the cpu_mode with nova? | 15:48 |
sean-k-mooney[m] | in the nova.conf set cpu_mode in the libvirt section | 15:49 |
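i.e., a sketch of the suggested nova.conf change on the compute node:

```ini
[libvirt]
cpu_mode = host-passthrough
```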
opendevreview | Balazs Gibizer proposed openstack/nova stable/wallaby: Query ports with admin client to get resource_request https://review.opendev.org/c/openstack/nova/+/811416 | 15:49 |
EugenMayer | sean-k-mooney[m] do i need to redo the instances or just stop/start? | 15:50 |
bauzas | last reminder: nova weekly meeting starts in 10 mins now here in this chan :) | 15:50 |
EugenMayer | bauzas i will shut up then, promised :) | 15:51 |
bauzas | (spoiler alert, you'll see new meetbot commands :p ) | 15:51 |
sean-k-mooney[m] | just hard reboot after restarting the agent | 15:51 |
sean-k-mooney[m] | https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_mode | 15:51 |
EugenMayer | sean-k-mooney[m] again, thanks! | 15:52 |
bauzas | (actually, not 'new', since I used them in 2014, but I haven't seen them used for a while now, so I'm all about using them back :D ) | 15:52 |
sean-k-mooney[m] | EugenMayer: you likely don't need to update the kernel by the way, but it's worth a shot if that does not help | 15:52 |
EugenMayer | bauzas you jinxed it, so it might fail! :D | 15:52 |
bauzas | the more people joining us, the better it will be :) | 15:53 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/wallaby: Reproduce bug 1945310 https://review.opendev.org/c/openstack/nova/+/811414 | 15:54 |
bauzas | if I can make jokes to let people chime in, I can try | 15:54 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/wallaby: Query ports with admin client to get resource_request https://review.opendev.org/c/openstack/nova/+/811416 | 15:55 |
EugenMayer | sean-k-mooney[m] massive speedup - 17 seconds | 15:58 |
sean-k-mooney[m] | cool, so basically what's happening is qemu does not have a profile that matches your cpu. the simplest way to fix that is to update to a qemu that does. failing that, host-passthrough is a good choice if you don't need to live migrate the vms to different cpu models | 15:59 |
sean-k-mooney[m] | alternatively you can pick one that is close and add flags | 16:00 |
sean-k-mooney[m] | we can tell you how to do that after the meeting but what you have should be fine for now | 16:00 |
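For later reference, the "pick one that is close and add flags" approach can be sketched in nova.conf like this (the model and flag below are illustrative examples, not a verified match for the 5950X):

```ini
[libvirt]
cpu_mode = custom
# nearest model this qemu knows about
cpu_models = EPYC
# add back features the hardware has that the named model lacks
cpu_model_extra_flags = topoext
```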
bauzas | 3... | 16:00 |
bauzas | 2... | 16:00 |
bauzas | 1... | 16:00 |
bauzas | #startmeeting nova | 16:00 |
opendevmeet | Meeting started Tue Nov 2 16:00:51 2021 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
opendevmeet | The meeting name has been set to 'nova' | 16:00 |
bauzas | #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting | 16:01 |
bauzas | good day, everyone | 16:01 |
dansmith | o/ | 16:01 |
sean-k-mooney[m] | o/ | 16:01 |
gibi | \o | 16:01 |
bauzas | as discussed before, I will exceptionally be able to chair this meeting for only 30 mins | 16:02 |
bauzas | so I'll let gibi co-chair | 16:02 |
bauzas | #chair gibi | 16:02 |
opendevmeet | Current chairs: bauzas gibi | 16:02 |
* gibi accepts the challenge | 16:02 | |
bauzas | let's start while people join | 16:02 |
bauzas | #topic Bugs (stuck/critical) | 16:02 |
bauzas | No Critical bug | 16:02 |
bauzas | #link 20 new untriaged bugs (+0 since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New | 16:03 |
bauzas | thanks to anybody who triaged a few of them | 16:03 |
bauzas | #help any help is appreciated with bug triage and we have a how-to https://wiki.openstack.org/wiki/Nova/BugTriage#Tags | 16:04 |
bauzas | 32 open stories (+0 since the last meeting) in Storyboard for Placement #link https://storyboard.openstack.org/#!/project/openstack/placement | 16:04 |
bauzas | so, maybe we closed one or more stories in Storyboard, but I don't think so | 16:05 |
bauzas | yeah, last story was written on Oct 21st | 16:06 |
bauzas | any bug to discuss in particular ? | 16:06 |
bauzas | ok, I guess no, moving on | 16:07 |
bauzas | #topic Gate status | 16:07 |
bauzas | Nova gate bugs #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure | 16:07 |
bauzas | nothing new | 16:07 |
bauzas | Placement periodic job status #link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly | 16:07 |
bauzas | no issues so far ^ | 16:07 |
bauzas | just the usual reminder, | 16:08 |
bauzas | Please look at the gate failures, file a bug, and add an elastic-recheck signature in the opendev/elastic-recheck repo (example: #link https://review.opendev.org/#/c/759967) | 16:08 |
bauzas | that's it for gate status | 16:08 |
gibi | this is a gate fix that needs a second core https://review.opendev.org/c/openstack/nova/+/814036 | 16:08 |
bauzas | oh right | 16:08 |
bauzas | gibi: I'll look at it while we speak | 16:08 |
gibi | thanks | 16:09 |
bauzas | (already looked at it, but need one last glance) | 16:09 |
bauzas | moving on or any gate failure to mention besides the above one ? | 16:09 |
bauzas | #topic Release Planning | 16:10 |
bauzas | Yoga-1 is due Nov 18th #link https://releases.openstack.org/yoga/schedule.html#y-1 | 16:10 |
bauzas | (3 weeks from now) | 16:11 |
bauzas | err, 2 weeks | 16:11 |
bauzas | +2d | 16:11 |
bauzas | which means, typey typey your specs | 16:11 |
bauzas | https://review.opendev.org/q/project:openstack/nova-specs+is:open is not that large | 16:12 |
bauzas | which makes me say: | 16:12 |
bauzas | #startvote Spec review day proposal on Tuesday Nov 16th ? (yes, no) | 16:12 |
gibi | yes | 16:13 |
gibi | #yes | 16:13 |
gibi | (how to vote?) | 16:13 |
bauzas | dang, the meetbot doesn't tell what to say | 16:13 |
bauzas | #vote yes | 16:13 |
dansmith | you have a space in front | 16:13 |
gibi | #vote yes | 16:13 |
dansmith | of the startvote | 16:13 |
bauzas | #startvote Spec review day proposal on Tuesday Nov 16th ? (yes, no) | 16:13 |
opendevmeet | Begin voting on: Spec review day proposal on Tuesday Nov 16th ? Valid vote options are , yes, no, . | 16:13 |
opendevmeet | Vote using '#vote OPTION'. Only your last vote counts. | 16:13 |
bauzas | dansmith: huzzah | 16:13 |
gibi | #vote yes | 16:13 |
opendevmeet | gibi: yes is not a valid option. Valid options are , yes, no, . | 16:13 |
sean-k-mooney[m] | yes | 16:14 |
gibi | #vote yes, ! | 16:14 |
opendevmeet | gibi: yes, ! is not a valid option. Valid options are , yes, no, . | 16:14 |
gibi | #vote yes, | 16:14 |
opendevmeet | gibi: yes, is not a valid option. Valid options are , yes, no, . | 16:14 |
dansmith | lol | 16:14 |
bauzas | oh man | 16:14 |
bauzas | this is an absolute fail. | 16:14 |
dansmith | #vote yes, | 16:14 |
opendevmeet | dansmith: yes, is not a valid option. Valid options are , yes, no, . | 16:14 |
dansmith | #vote yes, no | 16:14 |
opendevmeet | dansmith: yes, no is not a valid option. Valid options are , yes, no, . | 16:14 |
gibi | #vote , | 16:14 |
opendevmeet | gibi: , is not a valid option. Valid options are , yes, no, . | 16:14 |
dansmith | #vote yes, no | 16:14 |
opendevmeet | dansmith: yes, no is not a valid option. Valid options are , yes, no, . | 16:14 |
sean-k-mooney[m] | lets just assume we are ok with the 16th and move on | 16:14 |
* bauzas facepalms | 16:14 | |
dansmith | #vote yes, no, | 16:14 |
opendevmeet | dansmith: yes, no, is not a valid option. Valid options are , yes, no, . | 16:14 |
yuriys | this is bot abuse! | 16:14 |
bauzas | #endvote | 16:14 |
opendevmeet | Voted on "Spec review day proposal on Tuesday Nova 16th ?" Results are | 16:14 |
dansmith | #vote meh | 16:14 |
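(Editor's note: the parsing failures above appear to come from the parentheses — meetbot treats everything after the question mark as a comma-separated option list, so "? (yes, no)" produced the garbled options shown in the bot's error messages. Assuming standard MeetBot syntax, a form that should parse cleanly is:)

```text
#startvote Spec review day on Tuesday Nov 16th? yes, no
#vote yes
#endvote
```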
gibi | that is fun | 16:14 |
bauzas | ok I guess this was epic but we should leave this bot quiet again for another 5 years | 16:15 |
bauzas | anyway, | 16:15 |
bauzas | #agreed Spec review day happening on Nov 16th | 16:15 |
bauzas | voilà | 16:15 |
bauzas | moving on | 16:15 |
bauzas | #topic Review priorities | 16:16 |
bauzas | #link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement)+label:Review-Priority%252B1 | 16:16 |
dansmith | also leading space | 16:16 |
bauzas | dansmith: good catch, the copy/paste makes me mad | 16:16 |
bauzas | #topic Review priorities | 16:16 |
bauzas | #undo | 16:16 |
opendevmeet | Removing item from minutes: #topic Review priorities | 16:16 |
bauzas | fun, the meetbot isn't telling new topics | 16:17 |
bauzas | anyway, next point | 16:17 |
dansmith | it doesn't on oftc I think | 16:17 |
bauzas | #action bauzas to propose a documentation change by this week as agreed on the PTG | 16:17 |
dansmith | but if you don't do it #properly it won't record them either | 16:17 |
bauzas | for adding a gerrit ACL to let contributors +1ing | 16:18 |
bauzas | didn't have time to formalize it yet | 16:18 |
bauzas | #topic Stable Branches | 16:18 |
bauzas | elodilles: floor is yours | 16:18 |
bauzas | I guess he's not around | 16:20 |
bauzas | so I'll paste | 16:20 |
bauzas | stein and older stable branches are blocked, needs the setuptools pinning patch to unblock: https://review.opendev.org/q/I26b2a14e0b91c0ab77299c3e4fbed5f7916fe8cf | 16:20 |
bauzas | we need a second stable core especially on https://review.opendev.org/c/openstack/nova/+/813451 | 16:20 |
bauzas | Ussuri Extended Maintenance transition is scheduled to next week (Nov 12) | 16:21 |
bauzas | the list of open and unreleased patches: https://etherpad.opendev.org/p/nova-stable-ussuri-em | 16:21 |
bauzas | I guess we need to make a few efforts before ussuri becomes EM | 16:22 |
bauzas | elodilles: again, I offer my help if you ping me | 16:22 |
bauzas | patches that need one +2 on ussuri: https://review.opendev.org/q/project:openstack/nova+branch:stable/ussuri+is:open+label:Code-Review%253E%253D%252B2 | 16:22 |
bauzas | (I'll skim this list) | 16:23 |
elodilles | oh, sorry, DST :S | 16:23 |
bauzas | last but not least: https://review.opendev.org/806629 patch (stable/train) needed 14 rechecks, I was pinged with the question whether testing should be reduced in train to avoid this amount of rechecks (mostly volume detach issue) | 16:23 |
bauzas | elodilles: hah, I warned about it in the channel :p | 16:23 |
sean-k-mooney[m] | are the detach issues due to the qemu version we have in bionic? | 16:24 |
sean-k-mooney[m] | i assume train is not on focal? | 16:24 |
elodilles | yes, train is on bionic | 16:24 |
elodilles | (just like ussuri) | 16:24 |
bauzas | hmmm, technically, Train is EM | 16:26 |
sean-k-mooney[m] | i'm somewhat tempted to say maybe move it to centos 8 or focal, but we could disable the volume tests | 16:26 |
sean-k-mooney[m] | yes it is | 16:26 |
bauzas | I'd rather prefer us fixing the gate issues rather than reducing the test coverage, but this depends on any actions we can take | 16:27 |
bauzas | so, let's be pragmatic | 16:27 |
sean-k-mooney[m] | well the first question would be: does train have gibi's event-based waiting patch or is it still using the retry loop? | 16:28 |
bauzas | gibi's patch isn't merged yet, right? | 16:28 |
bauzas | could it help ? | 16:28 |
sean-k-mooney[m] | the only options really to fix this are to change the qemu version or backport gibi's patch | 16:28 |
bauzas | (I'll have to leave gibi chair in the next 2 mins but dansmith has a point I'm interested in) | 16:29 |
dansmith | I also have to go sooner | 16:29 |
bauzas | sean-k-mooney[m]: we can try to backport gibi's patch and see whether that helps | 16:29 |
dansmith | maybe we could swap open and libvirt? | 16:29 |
bauzas | dansmith: I'll | 16:29 |
gibi | I don't think there is anything in the libvirt topic | 16:29 |
gibi | lyarwood is out now | 16:30 |
dansmith | okay | 16:30 |
bauzas | okay, elodilles I'll propose to wait for gibi's patch to land in master and then be backported | 16:30 |
gibi | bauzas: it is backported til wallaby | 16:30 |
elodilles | bauzas: ack | 16:30 |
bauzas | and punt the decision to reduce the test coverage once we get better ideas | 16:30 |
gibi | if we are talking about https://review.opendev.org/q/topic:bug/1882521 | 16:30 |
bauzas | gibi: then we need to backport it down to train | 16:31 |
gibi | I don't think it will be a piece of cake to bring that back to train | 16:31 |
bauzas | gibi: (apologies I confused with the vnic types waiting patch) | 16:31 |
bauzas | I have to leave, but can we hold this one discussion and go straight to dansmith's point | 16:31 |
bauzas | ? | 16:32 |
gibi | anyhow we can take that outside when lyarwood is back | 16:32 |
bauzas | so I and dansmith can leave | 16:32 |
gibi | lets go to that | 16:32 |
bauzas | #topic Sub/related team Highlights | 16:32 |
bauzas | nothing to tell | 16:32 |
bauzas | #topic Open discussion | 16:32 |
bauzas | Bring default overcommit ratios into sanity (dansmith / yuriys) | 16:32 |
dansmith | So, I think we all know the default 16x cpu overcommit default is insane | 16:32 |
yuriys | exciting | 16:32 |
dansmith | we've got reports that some operators are USING those defaults because they think we're recommending them | 16:32 |
sean-k-mooney[m] | yes it is | 16:32 |
yuriys | I am so used to Slack and Discord for drop a paragraph level of communication, so pardon all the incoming spam! I prewrote stuff. | 16:33 |
bauzas | hah | 16:33 |
dansmith | yuriys is here to offer guidance and help work on this, | 16:33 |
dansmith | but I think we need to move those defaults to something sane, both in code and update the docs | 16:33 |
bauzas | I guess this can be workload dependent, right? | 16:33 |
sean-k-mooney[m] | it basically should not be set over 4x | 16:33 |
yuriys | 4:1 for cpu , 1:1 for mem. | 16:33 |
sean-k-mooney[m] | yep | 16:33 |
bauzas | I think we started to document things based on workloads | 16:33 |
yuriys | yes | 16:33 |
dansmith | bauzas: it's completely workload dependent, but we should not be recommending really anything, and thus I think the default needs to be closer to 1:1 with docs saying why you may increase it (or not) | 16:33 |
bauzas | but we never achieved this | 16:33 |
yuriys | it's VERY use case specific | 16:34 |
sean-k-mooney[m] | it is | 16:34 |
bauzas | I'm referring to https://docs.openstack.org/nova/latest/user/feature-classification.html | 16:34 |
yuriys | Ideally the documentation is restructured on how to scale up these overcommits to match the desired use case, density and performance, starting at sane values/defaults. Engineers can then template out the necessary config values after they've gotten to know system capabilities. Instead of working backwards from chaos and mayhem, giving future admins the opportunity to reach the desired state by scaling up from a stable system should be the goal of arch design. | 16:34 |
dansmith | so can we agree that we'll change the defaults to something that seems reasonable, and modify the docs that just say "these are the defaults" to have big flashing warnings that defaults will never be universal in this case? | 16:35 |
sean-k-mooney[m] | the 16:1 number originally was assuming webhosting as the main usecase or similar workloads | 16:35 |
dansmith | specifically: https://docs.openstack.org/arch-design/design-compute/design-compute-overcommit.html | 16:35 |
sean-k-mooney[m] | that does not fit with how openstack is typicaly used | 16:35 |
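(Editor's note: what the ratios mean operationally — placement treats schedulable capacity as inventory multiplied by the allocation ratio. A toy illustration, not nova code; `schedulable_vcpus` is a made-up helper name and reserved capacity is assumed to be zero:)

```python
def schedulable_vcpus(physical_cores: int, cpu_allocation_ratio: float) -> int:
    """Total vCPUs the scheduler will place on a host (reserved=0 assumed)."""
    return int(physical_cores * cpu_allocation_ratio)

# A 32-core host at the old 16:1 default vs the proposed 4:1:
old = schedulable_vcpus(32, 16.0)  # 512 vCPUs - heavy contention likely
new = schedulable_vcpus(32, 4.0)   # 128 vCPUs
```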
bauzas | dansmith: this sounds a reasonable change | 16:35 |
dansmith | cool, specless bp or bug? | 16:35 |
bauzas | my only worries would go on how this is wired at the object level but we can take the opportunity to lift this | 16:36 |
sean-k-mooney[m] | 4:1 for cpu is the highest ratio i would generally consider usable in production | 16:36 |
bauzas | dansmith: there are a few upgrade concerns with placement as IIRC this is set by the model itself | 16:36 |
dansmith | I think we already moved towards something better when we moved the defaults to placement right? but we need to do something | 16:36 |
sean-k-mooney[m] | well we have the initial allocation ratios now | 16:36 |
yuriys | i said 4 to be reasonable haha, i think i've set about 3:1 with nova-scheduler and placement randomization elements. | 16:36 |
dansmith | bauzas: yeah I think placement now has explicit defaults right? | 16:36 |
bauzas | I'm pretty sure we have some default value in the placement DB that says "16" | 16:37 |
sean-k-mooney[m] | but it still default to 16 | 16:37 |
dansmith | sean-k-mooney[m]: right | 16:37 |
sean-k-mooney[m] | i think we can decrease the initial values to 4 for cpu and 1 for memory | 16:37 |
dansmith | so I think we can just move those and reno that operators who took those defaults years ago should likely change them | 16:37 |
sean-k-mooney[m] | +1 | 16:37 |
bauzas | I don't wanna go procedural | 16:37 |
bauzas | so a specless BP could work for me but, | 16:37 |
bauzas | we need renos | 16:37 |
dansmith | sean-k-mooney[m]: yeah I think 4:1 CPU and 1:1 memory is fine for a default, we might need to up them for devstack I guess but that's where insane defaults should be :) | 16:38 |
sean-k-mooney[m] | yep i was thinking the same | 16:38 |
bauzas | + we need to ensure we consider the DB impact before | 16:38 |
dansmith | cool, specless bp and renos.. sounds good | 16:38 |
bauzas | if that becomes debatable in the reviews, we could go drafting more | 16:38 |
sean-k-mooney[m] | bauzas: i dont think there will be any | 16:38 |
bauzas | but here, we're talking of changing defaults, not changing existing deployments | 16:38 |
sean-k-mooney[m] | if we are just changing the initial values it wont affect existing RPs | 16:39 |
dansmith | yeah I think it'll be straightforward, but we can always revise the plan if needed | 16:39 |
dansmith | right | 16:39 |
bauzas | OK, looks to me we have a plan | 16:39 |
dansmith | #micdrop | 16:39 |
bauzas | #agreed changing overcommit CPU ratio to <16.0 can be a specless BP | 16:39 |
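(Editor's note: the defaults under discussion map onto these nova.conf options. A sketch of what the proposed change would look like for an operator setting values explicitly; the `initial_*` options only seed a compute node's resource provider when it is first created in placement, which is why changing them does not affect existing deployments:)

```ini
[DEFAULT]
# Explicit overrides, pushed to the compute node's resource provider
# on every periodic update; when unset, the initial_* values below
# apply only at resource-provider creation time.
cpu_allocation_ratio = 4.0
ram_allocation_ratio = 1.0

# Seed values used only when the resource provider is first created
# (the defaults the meeting proposes lowering from 16.0 / 1.5).
initial_cpu_allocation_ratio = 4.0
initial_ram_allocation_ratio = 1.0
```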
bauzas | yuriys: typey typey | 16:39 |
bauzas | and ping me on IRC once you have the Launchpad BP up so I can approve it | 16:40 |
* bauzas needs to drop | 16:40 | |
gibi | OK | 16:40 |
gibi | is there anything else for today? | 16:40 |
yuriys | no idea what that means, but sounds good? | 16:40 |
yuriys | ill just coordinate through dan i suppose | 16:41 |
gibi | yuriys: you need a file a blueprint here https://blueprints.launchpad.net/nova/ | 16:41 |
gibi | so we can track the work | 16:41 |
dansmith | yuriys: with just an overview of what we said, no big deal | 16:41 |
gibi | yepp | 16:41 |
yuriys | Ah sounds good. Dang, I thought I was going to have to do, like, a whole speech and everything | 16:42 |
yuriys | just to win over votes | 16:42 |
yuriys | haha | 16:42 |
gibi | :) | 16:42 |
dansmith | yuriys: I told you it wouldn't be a big deal :) | 16:42 |
gibi | it is not a big deal if dansmith is on your side ;) | 16:42 |
yuriys | yeah, i think the expandability still needs to be part of that doc btw | 16:42 |
yuriys | for dollar reasons | 16:42 |
yuriys | but ill throw up a BP and we'll go from there | 16:43 |
gibi | cool | 16:43 |
gibi | is there anything else for today? I don't see other topics on the agenda | 16:43 |
gibi | it seems not | 16:44 |
gibi | so then I have the noble job to close the meeting:) | 16:44 |
gibi | thank you all for joining today | 16:44 |
gibi | #endmeeting | 16:44 |
opendevmeet | Meeting ended Tue Nov 2 16:44:40 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:44 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-02-16.00.html | 16:44 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-02-16.00.txt | 16:44 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-02-16.00.log.html | 16:44 |
EugenMayer | sean-k-mooney[m] what is the 'qemu' version? i guess nova-compute-qemu/stable 2:22.0.1-2 is just the nova packaging version, not the actual qemu version | 16:46 |
sean-k-mooney[m] | try qemu-system-x86_64 --version | 16:46 |
EugenMayer | sean-k-mooney[m] hmm, i use kolla, thus i guess libvirt is inside the docker container | 16:47 |
sean-k-mooney[m] | ah in that case you can docker exec into the nova_libvirt container | 16:47 |
sean-k-mooney[m] | it contains the libvirt and qemu binaries that are used | 16:48 |
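(Editor's note: a way to check the versions in a kolla deployment, assuming the container is named `nova_libvirt` as above; shown as an illustrative session, output omitted:)

```console
$ docker exec -it nova_libvirt qemu-system-x86_64 --version
$ docker exec -it nova_libvirt libvirtd --version
```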
EugenMayer | but that means that this is not controlled by me | 16:48 |
sean-k-mooney[m] | well you could rebuild the container. do you have multiple servers with different cpus? | 16:49 |
EugenMayer | sean-k-mooney[m] qemu-kvm 1:4.2-3ubuntu6.18 amd64 QEMU Full virtualization on x86 hardware | 16:49 |
EugenMayer | all main computes have the same (exact) - all have AMDs (there are smaller AMDs) | 16:49 |
sean-k-mooney[m] | if they are all exactly the same then there is no downside to using host-passthrough | 16:50 |
EugenMayer | well i will have no live migrations | 16:50 |
sean-k-mooney[m] | it will give you the best performance but the limitation it imposes is you can only live migrate to other hosts with the exact same cpu | 16:50 |
EugenMayer | i have no live migrations since no shared block storage | 16:50 |
EugenMayer | (not planing to) | 16:51 |
sean-k-mooney[m] | you do not need shared storage for live migration | 16:51 |
sean-k-mooney[m] | it just makes it faster if you do | 16:51 |
EugenMayer | i guess a non-live migration from AMD-A to AMD-B should be no issue right | 16:51 |
EugenMayer | It tells me 'live migration is not available' | 16:52 |
sean-k-mooney[m] | cold migration has no cpu requirement beyond don't change the architecture, i guess | 16:52 |
sean-k-mooney[m] | so ya you can always fall back to cold migration | 16:52 |
EugenMayer | sure, amd64 they all are, but most are the big ryzen, the others the little brothers | 16:52 |
sean-k-mooney[m] | so your other option is to find the closest cpu model that your qemu supports and add in additional cpu flags | 16:53 |
EugenMayer | 99% of live migration will happen between the main compute, all AMD Ryzen 9 5950X | 16:53 |
sean-k-mooney[m] | you could group them in an az | 16:54 |
EugenMayer | doesnt host-passthrough also harm the security / encapsulation? | 16:54 |
EugenMayer | yes, az planned for the big ones, the smaller ones are internal CI servers only (azure agents or concourse CI workers) | 16:54 |
sean-k-mooney[m] | no it just allows the vm to use all the cpu features | 16:56 |
sean-k-mooney[m] | host-model still allows the vm to know what model of cpu is used | 16:56 |
EugenMayer | hmm, all VMs are our VMs, no customers or such. so i know what runs on each of them | 16:57 |
sean-k-mooney[m] | so from a security point of view it's more or less the same | 16:57 |
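(Editor's note: the trade-off discussed here is controlled per compute node in nova.conf. A sketch, assuming the libvirt driver; the commented `custom` alternative is the mixed-hardware approach sean mentions:)

```ini
[libvirt]
# host-passthrough: best performance, exposes all host CPU features,
# but live migration only works between hosts with identical CPUs.
cpu_mode = host-passthrough

# Alternative for mixed hardware: pick the closest named model and
# add extra feature flags as needed, e.g.
# cpu_mode = custom
# cpu_models = EPYC
# cpu_model_extra_flags = <flags>
```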
EugenMayer | i guess passthrough and hosting freebsd could be an issue. What about running windows? | 16:57 |
EugenMayer | Thank you so much! | 16:59 |
EugenMayer | the only thing i dislike is that i would need to fix every compute, and i'm not sure this configuration file is controlled by kolla, but they have overrides and in the end i have chef, so this will do it in any case | 17:00 |
sean-k-mooney[m] | it is | 17:01 |
sean-k-mooney[m] | you can drop an override in /etc/kolla/config/nova.conf or in /etc/kolla/config/nova/nova-compute.conf i believe | 17:02 |
sean-k-mooney[m] | https://docs.openstack.org/kolla-ansible/latest/admin/deployment-philosophy.html#kolla-s-solution-to-customization | 17:03 |
sean-k-mooney[m] | its basically the example they use | 17:03 |
sean-k-mooney[m] | its one of the reasons i like kolla | 17:04 |
sean-k-mooney[m] | it makes this type of config simple | 17:04 |
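(Editor's note: the kolla-ansible override mechanism referenced above — drop a file under /etc/kolla/config/ on the deploy host and kolla-ansible merges it into the generated service config. A sketch; the cpu_mode value here is just the example from the discussion:)

```ini
# /etc/kolla/config/nova/nova-compute.conf
# (applies only to nova-compute; use /etc/kolla/config/nova.conf
#  to target all nova services)
[libvirt]
cpu_mode = host-passthrough
```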
yuriys | there are way too many reasons to like kolla | 17:04 |
EugenMayer | sean-k-mooney[m] thanks. Did that with global_physical_mtu already | 17:24 |
EugenMayer | yuriys there also some to not do so. As with everything | 17:24 |
EugenMayer | sean-k-mooney[m] you're right, nova.conf would work https://gist.github.com/EugenMayer/d74ffdc0b15db9c9c1af344bd27accd1 | 17:33 |
Zer0Byte | hey @lyarwood are you there? | 17:49 |
lyarwood | \o evening, yeah | 17:52 |
Zer0Byte | hey | 17:52 |
Zer0Byte | https://bugs.launchpad.net/bugs/1949224 | 17:52 |
Zer0Byte | it's regarding this bug | 17:52 |
lyarwood | Yeah apologies I just saw your update | 17:53 |
lyarwood | and I missed this QoS spec on the cinder side was per GB | 17:53 |
Zer0Byte | what i mean is: performing the resize alone should update the iops in the kvm configuration, shouldn't it? | 17:53 |
lyarwood | I guess this is valid in that case but I'm not sure how we can fix this between Nova and Cinder | 17:53 |
lyarwood | if it's a size related QoS spec then yeah I guess it should | 17:54 |
lyarwood | tbh I didn't even know these existed | 17:54 |
Zer0Byte | yeah, i'm using it on my backend storage and it works well | 17:54 |
* lyarwood reads https://docs.openstack.org/cinder/latest/admin/blockstorage-capacity-based-qos.html | 17:55 | |
Zer0Byte | the problem is with nfs, this is why i'm moving to the frontend | 17:55 |
Zer0Byte | and if i create a new instance with the same resized volume, it takes the new iops specs | 17:56 |
lyarwood | https://github.com/openstack/nova/blob/82be4652e2c840bd69ec354fd734a2d3f83f395b/nova/virt/libvirt/volume/volume.py#L63-L77 is where Nova is told about the cinder side QoS FWIW | 17:56 |
Zer0Byte | but this is only during volume attach ? | 17:57 |
lyarwood | Yeah there's a missing piece during a resize | 17:58 |
lyarwood | A workaround would be shelve and unshelve the instance | 17:58 |
Zer0Byte | let me try it | 17:58 |
lyarwood | That should regenerate the connection_info in cinder and have that passed to Nova | 17:58 |
Zer0Byte | you are right @lyarwood works if i perform shelve and unshelve | 18:04 |
Zer0Byte | question: but does shelve and unshelve change the mac address or any machine configuration? | 18:05 |
Zer0Byte | like uuid or serial | 18:05 |
lyarwood | Overall things should remain the same but I'm not entirely sure if we persist the MAC addresses, sean-k-mooney ^ any idea? | 18:06 |
Zer0Byte | yeah, it is running cloud-init again | 18:07 |
Zer0Byte | changing the ssh key | 18:07 |
lyarwood | cloud-init shouldn't regenerate ssh keys if they already exist right? | 18:08 |
Zer0Byte | mhmm, if the machine id changes | 18:09 |
Zer0Byte | it triggers cloud-init to execute again | 18:09 |
Zer0Byte | and cloud-init performs ssh-keygen | 18:09 |
sean-k-mooney[m] | lyarwood: the mac adress comes from the neutron port so it wont change | 18:10 |
sean-k-mooney[m] | the machine id, i guess you mean the one shown in dmidecode | 18:11 |
sean-k-mooney[m] | i think that depends on your config but i think it's the vm's uuid by default | 18:11 |
sean-k-mooney[m] | https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.sysinfo_serial | 18:11 |
sean-k-mooney[m] | Zer0Byte: so it would only change if you had set sysinfo_serial to 'os', which uses the host /etc/machine-id file; 'hardware', which uses the host uuid from libvirt; or 'auto', which chooses between those | 18:14 |
sean-k-mooney[m] | Zer0Byte: so with our default config of 'unique', unshelving should not change the guest serial and cloud-init should not run | 18:14 |
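(Editor's note: the option in question, with the behavior sean describes. A sketch of the relevant nova.conf setting; the comments summarize the value semantics from the linked config reference:)

```ini
[libvirt]
# Default: a stable per-instance serial (the instance UUID), so
# shelve/unshelve does not change it and cloud-init sees the same
# machine identity.
sysinfo_serial = unique
# Other values follow the *host* and can change when the instance
# lands on a different compute after unshelve:
#   os       - host /etc/machine-id
#   hardware - host UUID from libvirt
#   auto     - os if available, else hardware
```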
sean-k-mooney[m] | the mac address should not change either, unless you changed it manually in neutron while it was shelved | 18:15 |
lyarwood | anyway there's some additional tooling in Xena to refresh connection_info for shutdown instances without the need to shelve and unshelve etc | 18:17 |
lyarwood | https://docs.openstack.org/nova/latest/cli/nova-manage.html#volume-attachment-commands | 18:17 |
lyarwood | that's another way to workaround it while we try to work something out in-tree | 18:17 |
lyarwood | tbh I can't think of a way with the current cinder APIs | 18:18 |
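(Editor's note: the Xena tooling lyarwood mentions — exact syntax is in the nova-manage docs linked above. Shown as an illustrative session with placeholder UUIDs; the assumption here is that `get_connector` emits the compute host's connector JSON, which `refresh` then consumes:)

```console
$ nova-manage volume_attachment get_connector > connector.json
$ nova-manage volume_attachment refresh <instance-uuid> <volume-id> connector.json
```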
* lyarwood brb | 18:18 | |
Zer0Byte | i got to go, thanks anyway, i will check @sean-k-mooney[m]'s option | 18:19 |
sean-k-mooney[m] | no worries, i'm officially on pto until tomorrow anyway so not really here today; just saw your question while i was checking over my car insurance | 18:20 |
*** ianw_pto is now known as ianw | 19:00 | |
EugenMayer | Anybody in here uses freezer (successfully?) | 19:01 |
dansmith | bauzas: when you return: https://blueprints.launchpad.net/nova/+spec/nova-change-default-overcommit-values | 19:16 |
hyang[m] | Can anyone help take a look at this patch https://review.opendev.org/c/openstack/nova/+/811521 thanks in advance! | 20:10 |
opendevreview | Merged openstack/nova stable/stein: [stable-only] Pin virtualenv and setuptools https://review.opendev.org/c/openstack/nova/+/813451 | 22:31 |
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!