sean-k-mooney | hyang[m]: it's merged, it's not released yet | 00:59 |
opendevreview | melanie witt proposed openstack/nova master: Poison usage of eventlet spawn_n() in tests https://review.opendev.org/c/openstack/nova/+/818042 | 03:52 |
*** abhishekk is now known as akekane|home | 05:12 | |
*** akekane|home is now known as abhishekk | 05:12 | |
EugenMayer | It seems like the first boot of an instance (with cloud-init) has different results in networking than all the others after it. Is that intended? | 07:10 |
EugenMayer | am I right that using --user-data has 2 flavours: with --config-drive true it will expect meta-data-like content there, while without a config drive YAML-based cloud-init files are expected? So the same --user-data feeds 2 different subsystems on init? | 07:47 |
EugenMayer | (reading https://docs.openstack.org/nova/queens/user/config-drive.html) | 07:48 |
gibi_ | o/ morning nova | 08:03 |
*** gibi_ is now known as gibi | 08:03 | |
bauzas | good spec review day, Nova | 08:40 |
* gibi already on it | 08:45 | |
opendevreview | Merged openstack/nova-specs master: Repropose flavour and image defined ephemeral storage encryption https://review.opendev.org/c/openstack/nova-specs/+/810867 | 09:13 |
jengbers | Morning, I have been searching OpenStack history, but I haven't been able to find why it is not possible to add existing instances to server groups. Does anyone know if that is just never discussed or if there is a fundamental technical problem? | 09:33 |
jengbers | We have been adding servers to server groups by changing the database and the scheduler has always handled this well. | 09:33 |
gibi | jengbers: the problem is consistency. When you add a server to a group you can create a situation where the group membership and the actual placement of the instance contradict each other | 09:38 |
gibi | and the question is what to do then | 09:38 |
gibi | a) move the instance to restore consistency | 09:39 |
gibi | b) reject the addition of the instance to the group | 09:39 |
gibi | c) allow temporary inconsistency and let the next move operation on the instance fix the group | 09:40 |
gibi | I think we never agreed which direction to take | 09:40 |
sean-k-mooney | there was a proposal to extend it recently to allow this, but the suggestion then was to have the add do a migration, which I don't think is the right approach | 09:43 |
sean-k-mooney | it's particularly a problem for the affinity policy, since that is more likely to break than the anti-affinity policy | 09:43 |
kashyap | bauzas: I might not be able to make the meeting today; I see it's at 17u CET :-( | 09:48 |
kashyap | Morning, BTW | 09:48 |
bauzas | kashyap: ah ok, no worries then | 09:48 |
bauzas | kashyap: just add notes to your specless ask in the agenda, so we can discuss there | 09:49 |
kashyap | bauzas: Yep; doing that now | 09:50 |
kashyap | bauzas: If there are any questions, you can ask me here, I'll answer when I'm back later in the evening. I'm away from 17-18u CET | 09:50 |
jengbers | I guess option b) would be the least surprising. | 09:52 |
gibi | jengbers: with option b) the problem is that the user must first somehow move the instance to the proper place (but the user has no tool for that) and then add it to the group | 09:59 |
opendevreview | Merged openstack/nova-specs master: Store and allow libvirt instance device buses and models to be updated https://review.opendev.org/c/openstack/nova-specs/+/810235 | 10:00 |
sean-k-mooney | gibi: if you have the correct weigher enabled I believe you can ask for an instance to be created on the same host or a different host to a specific instance via a scheduler hint | 10:01 |
sean-k-mooney | it certainly requires a lot of knowledge of nova and the instance to do correctly | 10:01 |
gibi | sean-k-mooney: if you use SameHost / DifferentHost filters then you don't need server groups | 10:02 |
sean-k-mooney | you can't, as a normal user, cold migrate to the same host or similarly align them all | 10:02 |
sean-k-mooney | gibi: ya that is also kind of true | 10:02 |
gibi | they implement similar logic but in a very different way :) | 10:02 |
sean-k-mooney | yep | 10:03 |
sean-k-mooney | I do wish we had ways to make server groups more useful | 10:03 |
sean-k-mooney | but it's kind of hard to extend them for the reason above | 10:03 |
gibi | sean-k-mooney: for that we need to solve jengbers' problem and also extend the logic to support multiple groups per instance (or nested groups) | 10:03 |
gibi | both are painfully missing but hard to solve | 10:04 |
gibi | bauzas: I'm done with the spec sweep. I could not really comment on the ironic one https://review.opendev.org/c/openstack/nova-specs/+/815789 and it seems nobody commented yet | 10:05 |
gibi | the rest of the specs have feedback | 10:05 |
sean-k-mooney | ya. on second thought, we should have "server aggregates" in parallel to server groups. you know, so we can aggregate servers, give that aggregate of servers a name, and even have some metadata that can be shared, like "this server is the primary of the aggregate", and just not have it related to VM placement at all | 10:05 |
bauzas | gibi: I still have 3 specs to look at | 10:05 |
sean-k-mooney | that way we can pretend server groups dont exist :) | 10:05 |
bauzas | gibi: but OK, and thanks for the fish | 10:05 |
gibi | sean-k-mooney: :D | 10:06 |
jengbers | gibi, sean-k-mooney: If it was only possible for admins, that could work, because they can also migrate servers, but for users it seems quite hard. | 10:14 |
gibi | jengbers: yeah that could work. Feel free to propose a spec about the new API to get wider discussion around it | 10:15 |
jengbers | On the other hand, they can power off and start an instance. I guess that would mean it is started on a different hypervisor. | 10:16 |
sean-k-mooney | really there are 2 paths we could take: 1) allow a normal user to ask nova to cold/live migrate an instance to be consistent with a server group that it is currently not a member of, and then allow them to add the server to the group after, rejecting the request if the policy is violated | 10:20 |
sean-k-mooney | or 2) we can have the server group add trigger the migration as part of the request | 10:21 |
kashyap | Is this failing for anyone else too? | 10:24 |
kashyap | tempest.api.compute.admin.test_live_migration.LiveAutoBlockMigrationV225Test.test_live_migration_with_trunk [108.238299s] ... FAILED | 10:24 |
gibi | kashyap: could you link the test run? | 10:25 |
kashyap | gibi: https://zuul.opendev.org/t/openstack/build/632f8ed30e9a4a04a32648843f227ef3 | 10:25 |
jkulik | we've extended the server-groups API downstream to allow adding servers after the fact. we opted not to allow adding servers if this would go against the server group's rules | 10:25 |
gibi | looking | 10:25 |
gibi | kashyap: I think you got hit by https://bugs.launchpad.net/neutron/+bug/1940425 | 10:27 |
jkulik | it helps customers if they already spawned an instance and forgot the server-group and now want to spawn another instance in some affinity to the existing one | 10:28 |
gibi | the stack trace is the same | 10:28 |
gibi | jkulik, jengbers: so both of you would like the same behavior, you should team up proposing this upstream :) | 10:28 |
kashyap | gibi: Oh, thank you | 10:28 |
kashyap | gibi: Now what? ... Should I do a "recheck 1940425"? | 10:29 |
kashyap | Or pray to the ju-ju at the bottom of the sea? Or... | 10:29 |
gibi | kashyap: yepp, recheck bug 1940425 | 10:29 |
jkulik | https://github.com/sapcc/nova/commit/7220be3968ee1dd257c9add88228cc5bb9857795 is the main commit downstream for us | 10:29 |
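A minimal sketch of the option (b) check being discussed here, with illustrative names; this is not the actual downstream code jkulik links, just the shape of the validation:

```python
# Sketch of option (b): reject adding a server to a group when its
# current host already violates the group policy. Names are illustrative.
def can_add_to_group(instance_host, group_policy, member_hosts):
    if group_policy == 'affinity':
        # All members must share a single host, so the candidate must
        # already live on that host (or the group must still be empty).
        return all(host == instance_host for host in member_hosts)
    if group_policy == 'anti-affinity':
        # No two members may share a host.
        return instance_host not in member_hosts
    # soft-affinity / soft-anti-affinity are weigher-only policies, so
    # membership can always be granted.
    return True
```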
jkulik | gibi: yes, we talked internally already about proposing this upstream, but small team, much work :/ | 10:30 |
gibi | kashyap: I added your run to the bug; maybe that way we can get attention on the failure, as it is still happening | 10:30 |
kashyap | gibi: Thx for the quick spot | 10:30 |
gibi | jkulik: no pressure, I know that type of frustration | 10:31 |
bauzas | jkulik: we tried discussing this upstream in the past, but operators are very afraid of the race conditions it creates | 10:35 |
bauzas | jkulik: the problem is, in a distributed service model like Nova, you can't get a valid answer on whether you can do it, because when you validate you don't ask the nova-compute service | 10:36 |
jkulik | bauzas: that reminds me ... we wanted to change the DB to disallow having a server in multiple server-groups to help with races. we haven't done that, yet. thanks :D | 10:37 |
bauzas | in theory we should hold new instance creations per compute once you ask for adding a new instance to the group | 10:38 |
jkulik | our problem is a little different, still, as we use VMware and not libvirt. thus, we have a lot of hidden hypervisors as nova only sees the cluster. therefore, hard anti-affinity doesn't really matter for us that much | 10:40 |
jkulik | customers want to make sure they run on different hypervisors and thus we sync the server-groups to the VMware clusters. VMware then migrates VMs around to make sure the rules apply. | 10:40 |
jkulik | i.e. most of our customers depend on soft-anti-affinity, which is a Weigher in nova-scheduler anyways | 10:42 |
kashyap | bauzas: Alright, added it to the Open Discussion here: https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting | 11:43 |
opendevreview | Rajat Dhasmana proposed openstack/nova-specs master: Add spec for volume backed server rebuild https://review.opendev.org/c/openstack/nova-specs/+/809621 | 11:44 |
dmitriis | gibi: tyvm for the feedback | 12:10 |
gibi | dmitriis: you are welcome. it is a well written spec, thanks for putting in the effort | 12:10 |
sean-k-mooney | I don't know if I hit send on my response to the last version of it | 12:22 |
sean-k-mooney | gibi you are correct, the PCI passthrough filter will be sufficient without the prefilter | 12:23 |
sean-k-mooney | but the prefilter can help reduce the set if we only report the new trait on hosts that have off-path devices | 12:23 |
sean-k-mooney | and the capability to use them, of course | 12:23 |
sean-k-mooney | I'll re-review that spec later today | 12:24 |
gibi | sean-k-mooney: yes, exactly my argument, the prefilter is not mandatory but it is good to have | 12:26 |
*** mdbooth1 is now known as mdbooth | 12:53 | |
dmitriis | sean-k-mooney: ack, ty for confirming | 13:00 |
sean-k-mooney | I don't currently have access to hardware to test what you have done, but I may have access before the end of the cycle. if I do I might reach out to you and try to test it end to end, although I don't know if I will have time to do that or not | 13:02 |
elodilles | bauzas: I'll update the meeting wiki #stable section now if that is not interfering with you right now | 13:24 |
dmitriis | sean-k-mooney: btw, fnordahl and I have done end-to-end testing of this in a lab. Here's a PPA https://launchpad.net/~fnordahl/+archive/ubuntu/smartnic-enablement that was used in the process (the WIP reviews are in use there). It has https://listman.redhat.com/archives/libvir-list/2021-November/msg00431.html included as well - I am trying to get someone to review it sooner rather than later. | 13:26 |
dmitriis | it doesn't yet have the prefilter and compute capability parts that were recently added to the spec but I will work on updating the WIP review soon with that and on raising a relevant os-traits change | 13:28 |
dmitriis | We had a VM booted with a floating IP assigned which we then connected to via a router. The flows were offloaded into the ConnectX-6 chip present on BF2. | 13:30 |
sean-k-mooney | I think I have pinged that patch to people downstream already, but I'll let the virt team know | 13:31 |
dmitriis | ack, tyvm | 13:32 |
dmitriis | sean-k-mooney: besides testing overlays we also tried using VLAN provider networks. That worked as well but the only thing to note there is that collocating VMs with ports attached to overlay networks via PCI devices with the ones that are directly attached to VLAN networks is going to be problematic with the current whitelist based lookup | 13:33 |
dmitriis | implementation. | 13:33 |
dmitriis | entries in the whitelist get a physnet tag (either null for overlay networks or a physnet label) | 13:34 |
sean-k-mooney | correct they do | 13:34 |
dmitriis | but there is only one vendor/device id pair | 13:34 |
sean-k-mooney | and technically null was never intended to be supported | 13:34 |
sean-k-mooney | we never had a nova feature to support overlays with pci devices | 13:35 |
sean-k-mooney | they exploited a lack of null checking and it happened to work | 13:35 |
sean-k-mooney | dmitriis: anyway, back to your point: why is that problematic? | 13:36 |
dmitriis | sean-k-mooney: heh, yes, I wasn't aware of the history but the hardware offload docs explicitly mention that null needs to be used https://docs.openstack.org/neutron/latest/admin/config-ovs-offload.html#configure-nodes-vxlan-configuration | 13:36 |
sean-k-mooney | dmitriis: yes that was never intended to work | 13:36 |
sean-k-mooney | but people now have it in production | 13:36 |
sean-k-mooney | dmitriis: https://bugs.launchpad.net/nova/+bug/1915282 | 13:37 |
dmitriis | sean-k-mooney: IIRC PCI requests come with a specific physnet parameter (or null). So when PCI stats are looked at, this parameter is used for lookup | 13:37 |
dmitriis | let me find that code again | 13:37 |
sean-k-mooney | yes, we end up passing Python None in the pci request | 13:37 |
sean-k-mooney | because the physnet of a VXLAN or Geneve network is not set | 13:38 |
sean-k-mooney | that will match the null physnet specified in the whitelist | 13:38 |
dmitriis | ((Pdb)) request | 13:39 |
dmitriis | InstancePCIRequest(alias_name=<?>,count=1,is_new=<?>,numa_policy=<?>,request_id=c3a87cba-323a-4203-bca7-0916927dcd5b,requester_id='28ea5b12-729c-46b4-b441-518fe786ea10',spec=[{physical_network=None,remote_managed='True'}]) | 13:39 |
dmitriis | I had something like this ^ | 13:39 |
sean-k-mooney | yep | 13:39 |
sean-k-mooney | that should work | 13:39 |
sean-k-mooney | that is the python None | 13:39 |
sean-k-mooney | note it's not quoted | 13:39 |
sean-k-mooney | that will match physical_network=null | 13:40 |
dmitriis | ah, maybe that's an old note that I have. It has since been fixed to use a string | 13:40 |
sean-k-mooney | you have to use "'physical_network':null" not "'physical_network':'null'" in the whitelist | 13:40 |
sean-k-mooney | like this passthrough_whitelist={ "vendor_id":"15b3", "product_id":"101e", "physical_network":null } | 13:41 |
sean-k-mooney | that enables a ConnectX-6 Dx for overlay networking | 13:42 |
sean-k-mooney | dmitriis: if you want to have some VFs for vlan/flat and others for geneve tunnels, you need to use the address field to partition the VFs into groups | 13:43 |
dmitriis | sean-k-mooney: I suppose that could be one way to do it | 13:44 |
sean-k-mooney | dmitriis: this is because tunnels were never meant to be supported at all, so we never implemented a way to allow a device to be part of multiple physnets | 13:44 |
sean-k-mooney | dmitriis: if it were not for the fact that this was used in production we would have closed this as a security bug and blocked the use of null; the details are in the bug | 13:45 |
dmitriis | sean-k-mooney: yeah, makes sense. I think that documenting this and suggesting address-based partitioning as a workaround is viable for now | 13:46 |
sean-k-mooney | dmitriis: the tl;dr is we use a JSON parser to parse the whitelist, and in JSON unquoted null is mapped to the Python None object, which just happens to be what we get when we parse the physnet from networks that don't have one | 13:46 |
sean-k-mooney | which is why physical_network=None in the pci request will actually match | 13:47 |
sean-k-mooney | since that is also the Python None object, not the string 'None' | 13:48 |
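A quick, standard-library-only illustration of the parsing behaviour sean-k-mooney describes (runnable as-is):

```python
import json

# The whitelist value is run through a JSON parser, so an unquoted null
# becomes Python's None -- the same value Nova derives as the physnet of
# a tunnelled (VXLAN/Geneve) network, which is why the two match.
spec = json.loads('{"vendor_id": "15b3", "product_id": "101e", '
                  '"physical_network": null}')
print(spec['physical_network'] is None)  # True -> matches overlay requests

# A quoted "null" would be a four-character string instead, and would
# never match the None coming from the PCI request.
print(json.loads('{"physical_network": "null"}')['physical_network'] == 'null')  # True
```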
sean-k-mooney | fyi the docs for the whitelist are not great, but in case you don't know, we support both bash-style globs and Python regex expressions in the address field | 13:49 |
sean-k-mooney | and we support both in either the string or dict form | 13:50 |
sean-k-mooney | https://docs.openstack.org/nova/latest/configuration/config.html#pci.passthrough_whitelist has some examples | 13:50 |
dmitriis | sean-k-mooney: I recall some other place in Nova where I had to use a string instead (trying to find where so maybe I wrongly brought this up here). | 13:50 |
sean-k-mooney | there might be; if you find it let me know and I might know the history, or it might just be a bug | 13:51 |
dmitriis | sean-k-mooney: that's what we used in the lab | 13:52 |
dmitriis | passthrough_whitelist = [{"vendor_id": "15b3", "product_id": "101e", "physical_network": null, "remote_managed": "true"}] | 13:52 |
dmitriis | and for physnets: passthrough_whitelist = [{"vendor_id": "15b3", "product_id": "101e", "physical_network": "physnet1", "remote_managed": "true"}] | 13:53 |
sean-k-mooney | not at the same time, right? | 13:53 |
sean-k-mooney | if you add the address field you could use both, but both look valid to me: the first for geneve and the second for flat/vlan | 13:54 |
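For illustration, a hypothetical nova.conf fragment combining both entries via the address field sean-k-mooney mentions; the VF addresses are invented, and only the shape of the entries is the point:

```ini
[pci]
# Hypothetical: pin one VF to a VLAN physnet and another to overlay (null physnet) use.
passthrough_whitelist = {"address": "0000:82:00.2", "vendor_id": "15b3", "product_id": "101e", "physical_network": "physnet1", "remote_managed": "true"}
passthrough_whitelist = {"address": "0000:82:00.3", "vendor_id": "15b3", "product_id": "101e", "physical_network": null, "remote_managed": "true"}
```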
sean-k-mooney | huh interesting | 13:55 |
sean-k-mooney | the VFs for the ConnectX-6 on the BlueField-2 have the same vendor and product ID as a normal ConnectX-6 | 13:55 |
bauzas | elodilles: sure, please do it, I'll just update the wikipage after you | 13:59 |
dmitriis | sean-k-mooney: ack on the address field usage. | 14:06 |
dmitriis | sean-k-mooney: I don't have a separate ConnectX-6 at hand but BF2 has ConnectX-6 in it. Let me check the PCI ID DB - I think I've seen different ids but maybe that's for something else. | 14:07 |
elodilles | bauzas: done, thanks (I might have overused the info and link markers o:) feel free to edit :)) | 14:11 |
bauzas | elodilles: ack, thanks | 14:11 |
dmitriis | sean-k-mooney: so the PF is different but VFs look like the ones from a "regular" ConnectX-6. | 14:13 |
dmitriis | PF: | 14:13 |
dmitriis | 82:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01) | 14:13 |
dmitriis | 82:00.0 0200: 15b3:a2d6 (rev 01) | 14:13 |
dmitriis | Subsystem: 15b3:0061 | 14:13 |
dmitriis | VF: | 14:13 |
dmitriis | 82:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function (rev 01) | 14:13 |
dmitriis | 82:00.3 0200: 15b3:101e (rev 01) | 14:13 |
dmitriis | Subsystem: 15b3:0061 | 14:13 |
dmitriis | so careful "remote_managed" tagging is needed | 14:14 |
sean-k-mooney | ack good to know | 14:14 |
sean-k-mooney | dmitriis: you can use the address of the PF and the vendor/product ID of the VF to whitelist all the VFs that belong to that PF | 14:29 |
sean-k-mooney | just so you know | 14:29 |
sean-k-mooney | dmitriis: that behavior is not well known | 14:29 |
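A hypothetical entry of the kind just described, reusing the PF address from dmitriis's lspci output above (the exact key combination is illustrative):

```ini
[pci]
# 0000:82:00.0 is the BlueField-2 PF; an entry whose address resolves to a
# PF matches all the VFs under it, so only the VF product_id is listed.
passthrough_whitelist = {"address": "0000:82:00.0", "product_id": "101e", "physical_network": null, "remote_managed": "true"}
```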
dmitriis | sean-k-mooney: didn't know that (not surprisingly), thanks for the info. | 14:32 |
dmitriis | sean-k-mooney: btw, BF2 does bonding at the ARM CPU side transparently to the hypervisor | 14:33 |
dmitriis | and there's an option to hide the inactive PF for the hypervisor side: https://docs.mellanox.com/display/BlueFieldSWv24011082/BlueField%20Link%20Aggregation | 14:34 |
dmitriis | that makes it easier for OpenStack deployers/operators since only one PF needs to be taken into account | 14:35 |
dmitriis | sean-k-mooney: so this is the place where I had to use a string (instead of a bool; not None, so my earlier reference was not correct) https://review.opendev.org/c/openstack/nova/+/812111/3/nova/network/neutron.py#2295 - that's where a device spec is dynamically generated (not based on flavor or image properties). | 14:42 |
sean-k-mooney | I'm on a call but I'll look it up after, thanks | 14:45 |
dmitriis | ack | 14:47 |
Adri2000 | hi, I've got a race condition issue on ussuri and victoria when resizing an instance... specifically this is with /var/lib/nova/instances on NFS, and the following happens sometimes when resizing an instance where a cold migration is triggered: `qemu-img resize` will be run on the new compute node before the old compute node has fully released the lock on the disk file; this will put the instance in ERROR state. does that ring a bell to anyone? | 15:03 |
Adri2000 | ERROR nova.compute.manager [req-...] [instance: 6ca672fd-8746-441f-bbca-6baa3234bb5e] Setting instance vm_state to ERROR: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command. Command: qemu-img resize /var/lib/nova/instances/6ca672fd-8746-441f-bbca-6baa3234bb5e/disk Exit code: 1 Stdout: '' | 15:03 |
Adri2000 | Stderr: "qemu-img: Could not open '/var/lib/nova/instances/6ca672fd-8746-441f-bbca-6baa3234bb5e/disk': Could not open '/var/lib/nova/instances/6ca672fd-8746-441f-bbca-6baa3234bb5e/disk': Permission denied\n" | 15:03 |
sean-k-mooney | Adri2000: are you using nfsv3 | 15:04 |
Adri2000 | sean-k-mooney: `/var/lib/nova/instances type nfs4 (rw,relatime,vers=4.1...` | 15:04 |
sean-k-mooney | ok, NFSv3 has locking issues; v4.1 improves the situation, but I'd recommend v4.2+ | 15:05 |
sean-k-mooney | lyarwood: does ^ seem familiar to you | 15:05 |
sean-k-mooney | Adri2000: I believe there are some tunables in the mount options that can be used to help resolve this | 15:07 |
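An illustrative mount entry following that advice; the export path is a placeholder, and the only substantive change from the mount shown above is bumping vers to 4.2:

```
# /etc/fstab (illustrative): move the instances share from NFS 4.1 to 4.2+
nfs-server:/export/nova  /var/lib/nova/instances  nfs4  rw,relatime,vers=4.2  0  0
```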
sean-k-mooney | Adri2000: are you using raw images? | 15:11 |
bauzas | gibi: I'll have to hardstop our meeting by 5:50pm our TZ | 15:11 |
gibi | bauzas: ack | 15:12 |
bauzas | in case we have to continue discussing, could you be chairing it ? | 15:12 |
sean-k-mooney | dmitriis: oh there | 15:13 |
sean-k-mooney | str(self._is_remote_managed(vnic_type)), | 15:13 |
sean-k-mooney | dmitriis: ya that makes sense | 15:13 |
Adri2000 | sean-k-mooney: qcow3 images. one nfs option I have currently is local_lock=none, maybe I should look into this one. | 15:13 |
sean-k-mooney | dmitriis: technically the tags are defined to be of type string | 15:14 |
sean-k-mooney | so it's a dict of string to string | 15:14 |
sean-k-mooney | Adri2000: ack, the reason I asked is that apparently the locking behavior is different in qemu for raw vs qcow | 15:15 |
dmitriis | sean-k-mooney: ack | 15:15 |
sean-k-mooney | dmitriis: https://github.com/openstack/nova/blob/master/nova/pci/devspec.py#L262-L263 | 15:17 |
dmitriis | sean-k-mooney: yep, makes sense | 15:18 |
dmitriis | sean-k-mooney: btw, the ovn-vif repo is now up under ovn-org https://github.com/ovn-org/ovn-vif | 15:18 |
sean-k-mooney | yes i saw your comment | 15:19 |
sean-k-mooney | just looking at the code | 15:19 |
dmitriis | ack | 15:19 |
sean-k-mooney | am I right in assuming we do not want to allow these devices to be used for flavor-based PCI passthrough? | 15:19 |
dmitriis | sean-k-mooney: yes, they won't be of much use without being plugged appropriately. Not the VFs at least. | 15:20 |
sean-k-mooney | ya | 15:21 |
sean-k-mooney | I'm wondering if we should explicitly block that | 15:21 |
sean-k-mooney | unfortunately I don't see a trivial way to do that | 15:21 |
sean-k-mooney | although we might already do that | 15:22 |
dmitriis | sean-k-mooney: I guess we could exclude devices from search results if remote_managed is present but not requested | 15:22 |
sean-k-mooney | yep | 15:22 |
sean-k-mooney | i was just going to provide an example | 15:22 |
sean-k-mooney | we do this in other cases already | 15:23 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/pci/stats.py#L411-L433 | 15:23 |
sean-k-mooney | That filters out PFs if you did not ask for one | 15:23 |
sean-k-mooney | dmitriis: so you can copy/paste https://github.com/openstack/nova/blob/master/nova/pci/stats.py#L520-L535 | 15:24 |
sean-k-mooney | and then add a new function that will filter out remote-managed devices if not requested | 15:24 |
dmitriis | sean-k-mooney: https://review.opendev.org/c/openstack/nova/+/812111/3/nova/network/neutron.py#2295 | 15:24 |
sean-k-mooney | I did this recently when I added support for vDPA https://github.com/openstack/nova/blob/master/nova/pci/stats.py#L540 | 15:25 |
dmitriis | actually, I'm explicitly passing remote_managed=False | 15:25 |
sean-k-mooney | that won't work | 15:25 |
dmitriis | sean-k-mooney: even with this? https://review.opendev.org/c/openstack/nova/+/812111/3/nova/pci/stats.py#111 | 15:25 |
sean-k-mooney | it will break existing deployments on upgrade, as their existing devices won't have remote_managed=False | 15:25 |
sean-k-mooney | and it would only apply to PCI requests from ports | 15:26 |
sean-k-mooney | dmitriis: that would work, but we might end up doing a data migration of all existing rows | 15:26 |
sean-k-mooney | dmitriis: ok, we can review this as part of the code review rather than the spec. | 15:27 |
sean-k-mooney | I'm just finishing reading it now and I'll approve it shortly | 15:27 |
dmitriis | sean-k-mooney: ack, I am open to adding a filter as you suggested | 15:27 |
sean-k-mooney | either would work, but one involves updating every row in the pci devices table with remote_managed=false :) | 15:28 |
dmitriis | right, I would certainly like to avoid introducing a change that would break with a stale state in PCI stats | 15:28 |
sean-k-mooney | the important thing is there is not a gap in the design | 15:28 |
dmitriis | agreed | 15:28 |
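A hypothetical sketch of the filter being discussed, modelled on the existing unrequested-PF and vDPA filters in nova/pci/stats.py linked above; the function name and field handling are assumptions, not merged code:

```python
# Hypothetical sketch (not merged code): drop remote-managed device pools
# unless the request explicitly asked for a remote-managed device. This
# mirrors the _filter_pools_for_unrequested_pfs approach rather than
# tagging every existing device row with remote_managed=False.
def _filter_pools_for_unrequested_remote_managed(pools, request):
    requested = any(
        spec.get('remote_managed') == 'True' for spec in request.spec)
    if not requested:
        pools = [pool for pool in pools
                 if pool.get('remote_managed') != 'True']
    return pools
```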
*** artom__ is now known as artom | 15:34 | |
sean-k-mooney | dmitriis: ok, I captured some of my thoughts from this conversation in the spec, but +2 +W from me | 15:43 |
sean-k-mooney | dmitriis: feel free to ping me to review the implementation too. I do not have +2 rights on the code repo, but I'll try to spend some time reviewing it end to end next week | 15:45 |
dmitriis | sean-k-mooney: tyvm. I'll try to get the code updated with some of the latest changes by then. Still have to extend func tests to cover more cases but there are some already. | 15:46 |
dmitriis | sean-k-mooney: speaking of other lifecycle operations, I've spent some time looking at the recent VF hot-plug/unplug changes so I may revisit some of the unsupported operations at a later point | 15:47 |
bauzas | reminder : nova weekly meeting starts in 13 mins here in this #chan | 15:47 |
dmitriis | maybe we can actually make things like cold migration work, just need to review that further | 15:47 |
sean-k-mooney | ack | 15:47 |
sean-k-mooney | dmitriis: it might just work | 15:48 |
sean-k-mooney | there is very little on the nova side that will need to be updated | 15:48 |
sean-k-mooney | also for the live migration | 15:48 |
dmitriis | sean-k-mooney: yes, we might need to document the need for extra slots to be added via the new config | 15:48 |
dmitriis | https://review.opendev.org/c/openstack/nova/+/545034/16/nova/conf/libvirt.py | 15:49 |
sean-k-mooney | dmitriis: that is really only needed for q35 | 15:49 |
sean-k-mooney | and we already have a config option to add extra slots in that case | 15:49 |
sean-k-mooney | yep that one | 15:49 |
dmitriis | ack | 15:49 |
sean-k-mooney | the pc machine type has 24 or 32 PCI slots by default | 15:50 |
sean-k-mooney | for q35 the default behavior is to allocate all that are required for your VM, plus 1 free for hotplug | 15:50 |
sean-k-mooney | oh... | 15:51 |
sean-k-mooney | there might be a bug in sriov live migration with q35 | 15:51 |
dmitriis | From the guest OS perspective, the PCI addressing is tied to the virtual PCI topology. Hopefully it is consistent across migration so that device naming doesn't change for the guest while the MAC is reprogrammed anyway. | 15:52 |
sean-k-mooney | i did most of my testing with pc, and when I tested with q35 I don't know if I tested with more than one SR-IOV NIC | 15:52 |
sean-k-mooney | dmitriis: we don't guarantee it will be | 15:52 |
sean-k-mooney | so it might change | 15:53 |
dmitriis | sean-k-mooney: ah, good to know. Changing PCI addresses will change persistent device names tied to PCI addresses. | 15:53 |
sean-k-mooney | yes the way around that is to leverage device role tagging | 15:54 |
sean-k-mooney | but really we want qemu/kvm/nvidia to finish implementing live migration support for vDPA | 15:54 |
sean-k-mooney | so that we can just leave the vDPA device attached | 15:54 |
sean-k-mooney | dmitriis: by the way, at some point we likely need to consider how to support vDPA + BlueField-2 | 15:55 |
sean-k-mooney | we can get the simple version working first however. | 15:56 |
dmitriis | sean-k-mooney: yes, I agree. There are two cases: software and hardware vDPA. For soft vDPA there is an extra agent needed on the hypervisor host. | 15:56 |
dmitriis | so that definitely has some challenges | 15:56 |
sean-k-mooney | I'm hoping we can simply not specify a device_type and rely on remote_managed=True | 15:56 |
dmitriis | another interesting area is Scalable Functions (SFs) which rely on mdev and a vendor-specific driver | 15:56 |
sean-k-mooney | well, maybe not; we can see | 15:57 |
sean-k-mooney | dmitriis: yes i have worked with that in the past | 15:57 |
dmitriis | it kind of erases the benefits of hardware virtio tbh | 15:57 |
sean-k-mooney | it's not clear whether the mdev-based approach will go to market or not, at least from the vendor I was working with | 15:57 |
sean-k-mooney | well, the mdev implementation can be in hardware and present virtio too | 15:58 |
sean-k-mooney | it predates the vDPA bus | 15:58 |
dmitriis | ah, in that case, I take it back :^) | 15:58 |
whoami-rajat | Hi, just to be sure the nova meeting is in this channel right? | 15:59 |
dmitriis | I was also thinking of what CXL would bring and how much churn will it introduce to the existing PCI management implementation in Nova | 15:59 |
sean-k-mooney | the prototype I was working on used an FPGA to implement virtio in "hardware", but the long-term plan was to do that in an ASIC. I just don't know if they have pivoted to vDPA now or not, but it was mdev-based at the time | 15:59 |
bauzas | #startmeeting nova | 16:00 |
opendevmeet | Meeting started Tue Nov 16 16:00:10 2021 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
opendevmeet | The meeting name has been set to 'nova' | 16:00 |
gibi | o/ | 16:00 |
elodilles | o/ | 16:00 |
bauzas | #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting | 16:00 |
bauzas | good 'day, everyone ;) | 16:00 |
whoami-rajat | Hi | 16:00 |
opendevreview | Merged openstack/nova-specs master: Integration With Off-path Network Backends https://review.opendev.org/c/openstack/nova-specs/+/787458 | 16:00 |
gmann | o/ | 16:01 |
bauzas | I'll have to hardstop working in 45-ish mins, sooo | 16:01 |
bauzas | #chair gibi | 16:01 |
opendevmeet | Current chairs: bauzas gibi | 16:01 |
bauzas | sorry again | 16:01 |
gibi | so I will take the rest | 16:01 |
* bauzas is a taxi | 16:01 | |
bauzas | anyway, let's start | 16:02 |
bauzas | #topic Bugs (stuck/critical) | 16:02 |
bauzas | #info No Critical bug | 16:02 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 28 new untriaged bugs (+3 since the last meeting) | 16:02 |
bauzas | #help Nova bug triage help is appreciated https://wiki.openstack.org/wiki/Nova/BugTriage | 16:02 |
bauzas | I'm really a sad panda | 16:02 |
bauzas | in general, I'm triaging bugs on Tuesday, but I forgot about our today's spec review day :) | 16:03 |
bauzas | so I'll look at the bugs tomorrow | 16:03 |
bauzas | in case people want to help us, <3 | 16:03 |
bauzas | any bug to discuss ? | 16:03 |
bauzas | #link https://storyboard.openstack.org/#!/project/openstack/placement 33 open stories (+1 since the last meeting) in Storyboard for Placement | 16:04 |
bauzas | about this... | 16:04 |
bauzas | I tried to find which story was new :) | 16:04 |
bauzas | but the last story was already the one I knew | 16:05 |
bauzas | so, in case people know... | 16:05 |
dansmith | o/ | 16:05 |
gibi | bauzas: if at some point I have time I can try to dig, but I'm pretty full at the moment | 16:06 |
bauzas | also, Storyboard is a bit... slow, I'd say | 16:06 |
bauzas | it takes at least 5 secs every time to look at a story | 16:06 |
bauzas | I mean, for stories, maybe we should use Facebook then ? :p | 16:07 |
bauzas | (heh, :p ) | 16:07 |
* bauzas was joking in case people didn't know | 16:07 | |
bauzas | OK, this looks like a bad joke | 16:08 |
bauzas | moving on :p | 16:08 |
bauzas | #topic Gate status | 16:08 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:08 |
bauzas | nothing new | 16:08 |
bauzas | #link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly Placement periodic job status | 16:08 |
bauzas | now placement-nova-tox-functional-py38 job works again :) | 16:09 |
bauzas | thanks ! | 16:09 |
bauzas | #topic Release Planning | 16:09 |
bauzas | #info Yoga-1 is due Nov 18th #link https://releases.openstack.org/yoga/schedule.html#y-1 | 16:10 |
bauzas | which is in 2 days | 16:10 |
bauzas | nothing really to say about it | 16:10 |
bauzas | #info Spec review day is today | 16:10 |
bauzas | I think I reviewed all the specs but one (but I see this one was merged ;) ) | 16:10 |
bauzas | thanks to all who already reviewed specs | 16:11 |
gibi | yeah I think we pushed forward all the open specs | 16:11 |
whoami-rajat | Sorry if I'm interrupting, but I had one doubt regarding my spec | 16:12 |
bauzas | we merged 3 specs today | 16:12 |
bauzas | whoami-rajat: no worries, we can discuss this spec if you want during the open discussion topic | 16:12 |
whoami-rajat | ack thanks bauzas | 16:12 |
bauzas | whoami-rajat: but what is your concern ? | 16:12 |
bauzas | a tl;dr if you prefer | 16:13 |
bauzas | for other specs, I'll mark the related blueprints accepted in Launchpad by tomorrow | 16:14 |
whoami-rajat | bauzas, so I'm working on the reimage spec for volume backed instances and we decided to send connector details with the reimage API call and cinder will do the attachment update (this was during PTG), Lee pointed out that we should follow our current mechanism of nova doing attachment update like we do for other operations | 16:14 |
bauzas | ok, if this is a technical question, let's discuss this during the open discussion topic as I said | 16:15 |
whoami-rajat | sure, np | 16:15 |
bauzas | ok, next topic then | 16:15 |
bauzas | #topic Review priorities | 16:15 |
bauzas | #link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement)+label:Review-Priority%252B1 | 16:15 |
bauzas | #info https://review.opendev.org/c/openstack/nova/+/816861 bauzas proposing a documentation change for helping contributors to ask for reviews | 16:16 |
bauzas | gibi already provided some comments on it | 16:16 |
bauzas | I guess the concern is how to help contributors to ask for reviews priorities like we did with the etherpad | 16:16 |
bauzas | but if we have a consensus saying that it is not an issue, I'll stop | 16:17 |
bauzas | but my only concern is that I think asking people to come on IRC and ping folks is difficult so we could use gerrit | 16:18 |
gibi | what is more difficult? Finding the reason for a fault in nova code and fixing it, or joining IRC to ask for review help? | 16:19 |
sean-k-mooney | well, one you "might" be able to do offline/async | 16:19 |
sean-k-mooney | the other involves talking to people, albeit by text | 16:20 |
sean-k-mooney | unfortunately those are sometimes non-overlapping skill sets | 16:20 |
bauzas | gibi: I'm just thinking of on and off contributors that just provide bugfixes | 16:20 |
gibi | doing code review is talking to people via text :) | 16:20 |
bauzas | but let's continue discussing this in the proposal, I don't want to drag everyone's attention on this now | 16:21 |
sean-k-mooney | bauzas: for one-off patches I think the expectation should still be on us to watch the patches come in and help them | 16:21 |
sean-k-mooney | rather than assuming they will use any tools we provide | 16:21 |
bauzas | sean-k-mooney: yeah but then how to discover them ? | 16:21 |
bauzas | either way, let's discuss this by Gerrit :p | 16:22 |
sean-k-mooney | well, if it's a similar time zone I watch for the IRC bot commenting with the patches | 16:22 |
sean-k-mooney | if I don't recognise it or the name, I open it | 16:22 |
sean-k-mooney | and then one of us can request the review priority in gerrit or publicise the patch to others | 16:22 |
bauzas | that's one direction | 16:23 |
sean-k-mooney | if there is something in gerrit I can set, I'm happy to do that on patches when I think they are ready; otherwise I'll just ping them to ye as I do now | 16:23 |
bauzas | either way, we have a large number of items for the open discussion topic, so let's move on | 16:23 |
sean-k-mooney | ack | 16:24 |
bauzas | #topic Stable Branches | 16:24 |
bauzas | elodilles: fancy copy/pasting or do you want me to do so ? | 16:24 |
elodilles | either way is OK :) | 16:24 |
bauzas | I can do it | 16:25 |
bauzas | #info stable gates' status look OK, no blocked branch | 16:25 |
bauzas | #info final ussuri nova package release was published (21.2.4) | 16:25 |
bauzas | #info ussuri-em tagging patch is waiting for final python-novaclient release patch to merge | 16:25 |
bauzas | #link https://review.opendev.org/c/openstack/releases/+/817930 | 16:26 |
bauzas | #link https://review.opendev.org/c/openstack/releases/+/817606 | 16:26 |
bauzas | #info intermittent volume detach issue: afaik Lee has an idea and started to work on how it can be fixed: | 16:26 |
bauzas | #link https://review.opendev.org/c/openstack/tempest/+/817772/ | 16:26 |
bauzas | any question ? | 16:26 |
elodilles | thanks :) | 16:26 |
bauzas | looks like none | 16:27 |
bauzas | #topic Sub/related team Highlights | 16:27 |
gibi | the volume detach issue feels more and more like it is not related to detach | 16:27 |
bauzas | #undo | 16:27 |
opendevmeet | Removing item from minutes: #topic Sub/related team Highlights | 16:27 |
gibi | the kernel panic happens before we issue detach | 16:28 |
elodilles | gibi: true | 16:28 |
gibi | it is either related to the attach or the live migration itself | 16:28 |
gibi | I have trial patches placing sleeps in different places to see where we are too fast https://review.opendev.org/c/openstack/nova/+/817564 | 16:28 |
bauzas | which stable branches are impacted ? | 16:28 |
gibi | stable/victoria | 16:28 |
bauzas | ubuntu focal-ish I guess ? | 16:29 |
bauzas | ack thanks | 16:29 |
elodilles | (and other branches as well, but might be different root causes) | 16:29 |
gibi | I only see kernel panic in stable/victoria (a lot) and one single failure in stable/wallaby | 16:29 |
gibi | so if there are detach issues in older stable that is either not causing kernel panic, or we don't see the panic in the logs | 16:30 |
bauzas | I guess kernel versions are different between branches | 16:30 |
bauzas | right? | 16:30 |
bauzas | could we imagine somehow to verify another kernel version for stable/victoria | 16:31 |
bauzas | ? | 16:31 |
gibi | we tested with guest cirros 0.5.1 (victoria default) and 0.5.2 (master default); it is reproducible with both | 16:31 |
bauzas | ack so unrelated | 16:31 |
gibi | there is a summary here https://bugs.launchpad.net/nova/+bug/1950310/comments/8 | 16:31 |
bauzas | #link https://bugs.launchpad.net/nova/+bug/1950310/comments/8 explaining the guest kernel panic related to stable/victoria branch | 16:32 |
sean-k-mooney | ya, the few cases I looked at with you last week were all happening before detach | 16:32 |
sean-k-mooney | so it's either the attach or live migration | 16:32 |
gibi | sean-k-mooney: I have more logs in the runs of https://review.opendev.org/c/openstack/nova/+/817564 if you are interested | 16:32 |
sean-k-mooney | i looked downstream at our qemu bugs but didn't see anything relevant | 16:32 |
sean-k-mooney | gibi: sure, I'll try to take a look, probably tomorrow | 16:33 |
sean-k-mooney | but I'll open it in a tab | 16:33 |
gibi | sean-k-mooney: thanks, I will retrigger that patch for a couple times to see if the current sleep before the live migration helps | 16:33 |
bauzas | a good sleep always helps | 16:34 |
bauzas | :) | 16:34 |
elodilles | :] | 16:34 |
sean-k-mooney | when sleep does not work we can also try a trusty print statement | 16:34 |
gibi | sleep is not there as a solution but as troubleshooting, to see at which step we are too fast :D | 16:35 |
* sean-k-mooney is dismayed by how many race conditions __don't__ appear when you use print for debugging | 16:35 | |
gibi | and I do have a lot of print(server.console)-like statements in the tempest test :D | 16:35 |
sean-k-mooney | I think we can move on, but it's good you were able to confirm we were attaching before the kernel finished booting | 16:36 |
sean-k-mooney | at least in some cases | 16:36 |
sean-k-mooney | that at least lends weight to the idea that we are racing | 16:36 |
bauzas | ok, let's move on | 16:37 |
gibi | ack | 16:37 |
bauzas | again, large agenda today | 16:37 |
bauzas | #topic Sub/related team Highlights | 16:37 |
bauzas | Libvirt : lyarwood ? | 16:37 |
bauzas | I guess nothing to tell | 16:38 |
bauzas | moving on to the last topic | 16:38 |
bauzas | #topic Open discussion | 16:38 |
bauzas | whoami-rajat: please queue | 16:39 |
whoami-rajat | thanks! | 16:39 |
bauzas | (kashyapc) Blueprint for review: "Switch to 'virtio' as the default display device" -- https://blueprints.launchpad.net/nova/+spec/virtio-as-default-display-device | 16:39 |
bauzas | this is a specless bp ask | 16:39 |
bauzas | kashyap said " The full rationale is in the blueprint; in short: "cirrus" display device has many limitations and is "considered harmful"[1] by QEMU graphics maintainers since 2014." | 16:39 |
bauzas | do we need a spec for this bp or are we OK for approving it by now ? | 16:40 |
whoami-rajat | so lyarwood had a concern with my reimage spec, we agreed to pass the connector info to reimage API (cinder) and cinder will do attachment update and return the connection info with events payload | 16:40 |
gibi | I think we don't need a spec; this is pretty self-contained in the libvirt driver | 16:40 |
bauzas | kashyap was unable to attend the meeting today | 16:40 |
whoami-rajat | (in PTG) | 16:40 |
sean-k-mooney | i think we are OK with approving it; the main thing to call out is that we will be changing it for existing instances too | 16:40 |
bauzas | whoami-rajat: please hold, sorry | 16:40 |
whoami-rajat | oh ok | 16:40 |
gibi | the only open question we had with sean-k-mooney is how to change the default | 16:40 |
gibi | but kashyap tested that changing the default during hard reboot does not cause any trouble for guests | 16:41 |
gibi | as the new video device has a fallback VGA mode | 16:41 |
bauzas | gibi: I'm thinking hard of any potential upgrade implication | 16:41 |
sean-k-mooney | right, so when we discussed this before we decided to change it only for new instances to avoid upgrade issues | 16:41 |
bauzas | correct | 16:41 |
sean-k-mooney | our downstream QE tested this with Windows guests and Linux guests and both seemed to be OK with the change | 16:41 |
bauzas | I'm in favor of not touching the running instances | 16:42 |
bauzas | or asking to rebuild them | 16:42 |
gibi | we are not touching running instances, we only touch hard-rebooting instances | 16:42 |
sean-k-mooney | so kashyap has implemented this for all instances | 16:42 |
bauzas | gibi: which happens when you stop/start, right? | 16:42 |
gibi | right | 16:42 |
sean-k-mooney | bauzas: yes, as gibi says it will only take effect when the XML is next regenerated | 16:42 |
gibi | it happens while the guest is not running | 16:42 |
*** akekane_ is now known as abhishekk | 16:43 | |
gibi | it is not an unplug/plug for a running guest | 16:43 |
bauzas | do we want admins to opt-in instances ? | 16:43 |
bauzas | or do we agree it would be done automatically? | 16:43 |
sean-k-mooney | it will happen on start/stop, hard reboot, or a non-live move operation | 16:43 |
gibi | bauzas: I trust kashyap that it is safe to change this device | 16:44 |
bauzas | do we also want to have a nova-status upgrade check for yoga about this ? | 16:44 |
sean-k-mooney | no | 16:44 |
bauzas | gibi: me too | 16:44 |
sean-k-mooney | why would we need to? | 16:44 |
sean-k-mooney | we are not removing support for cirrus | 16:44 |
gibi | we don't remove cirrus | 16:44 |
sean-k-mooney | just no longer the default | 16:44 |
gibi | yepp | 16:44 |
sean-k-mooney | gibi: context is that downstream it is being removed from RHEL 9 | 16:45 |
bauzas | sean-k-mooney: sure, that just means that long-living instances could continue running cirrus | 16:45 |
sean-k-mooney | so we need to care about it for our product | 16:45 |
sean-k-mooney | actually, cirrus is not being removed in RHEL 9 | 16:45 |
sean-k-mooney | but rather in RHEL 10 | 16:45 |
sean-k-mooney | bauzas: yep, which I think is OK | 16:46 |
sean-k-mooney | we could have a nova-status check, but it would have to run on the compute nodes | 16:46 |
sean-k-mooney | which is kind of not nice | 16:46 |
sean-k-mooney | since it would have to check the XMLs | 16:46 |
bauzas | I know | 16:46 |
sean-k-mooney | so I would not add it, personally | 16:46 |
bauzas | I'm just saying that we enter a time that could last long | 16:47 |
gibi | I agree, we don't need upgrade check | 16:47 |
sean-k-mooney | shall we continue this in the patch review | 16:48 |
bauzas | but agreed on the fact that this is not a problem until cirrus support is removed, and that is not an upstream question | 16:48 |
bauzas | sean-k-mooney: you're right, nothing needing a spec | 16:48 |
bauzas | #agreed https://blueprints.launchpad.net/nova/+spec/virtio-as-default-display-device is accepted as specless BP for the Yoga release timeframe | 16:49 |
bauzas | moving on | 16:49 |
gibi | \o/ | 16:49 |
bauzas | next item | 16:49 |
bauzas | (kashyapc) Blueprint for review: "Add ability to control the memory used by fully emulated QEMU guests -- https://blueprints.launchpad.net/nova/+spec/control-qemu-tb-cache | 16:49 |
bauzas | again, a specless bp ask | 16:49 |
bauzas | he said " This blueprint allows us to configure how much memory a plain-emulated (TCG) VM, which is what OpenStack CI uses. Recently, QEMU changed the default memory used by TCG VMs to be much higher, thus reducing the no. of VMs you TCG could run per host. Note: the libvirt patch required for this will be in libvirt-v7.10.0 (December 2021)." | 16:49 |
bauzas | " See this issue for more details: https://gitlab.com/qemu-project/qemu/-/issues/693 (Qemu increased memory usage with TCG)" | 16:49 |
sean-k-mooney | I'm a little torn on this | 16:50 |
sean-k-mooney | I'm not sure I like this being a per-host config option | 16:50 |
sean-k-mooney | but it's also breaking existing deployments | 16:50 |
sean-k-mooney | so we can't really address that with flavor extra specs or image properties | 16:51 |
sean-k-mooney | since it would be a pain for operators to use | 16:51 |
gibi | but that requires rebuild of existing instances | 16:51 |
sean-k-mooney | yep | 16:51 |
sean-k-mooney | so with that in mind, the config option probably is the way to go | 16:51 |
sean-k-mooney | just need to bear in mind it might change after a hard reboot if you live migrate | 16:51 |
gibi | yeah, config as a first step; if more fine-grained control is needed later we can add an extra spec | 16:51 |
bauzas | there are libvirt dependencies | 16:52 |
sean-k-mooney | if we capture the "this should really be the same on all hosts in a region" piece in the docs, I'm OK with this | 16:52 |
sean-k-mooney | bauzas: and qemu deps | 16:52 |
bauzas | you need a recent libvirt in order to be able to use it | 16:52 |
bauzas | right | 16:52 |
sean-k-mooney | it's only supported on QEMU 5.0+ | 16:52 |
gibi | sean-k-mooney: yeah that make sense to document | 16:52 |
sean-k-mooney | so we will need a libvirt version and QEMU version check in the code | 16:53 |
sean-k-mooney | which is fine, we know how to do that | 16:53 |
bauzas | so, if this is a config option, the docs have to explain which versions you need | 16:53 |
sean-k-mooney | yep | 16:53 |
bauzas | we would otherwise expose something unusable for most people | 16:53 |
sean-k-mooney | the only tricky bit will be live migration | 16:53 |
sean-k-mooney | if the dest is not new enough but the source host is | 16:54 |
bauzas | correct, the checks ? | 16:54 |
sean-k-mooney | we will need to make sure we validate that | 16:54 |
bauzas | right | 16:54 |
bauzas | but this looks to me like an implementation detail | 16:54 |
bauzas | all of this seems not needing a spec, right? | 16:54 |
bauzas | upgrade concerns are N/A | 16:54 |
bauzas | as you explicitely need a recent qemu | 16:55 |
sean-k-mooney | em, the live migration check will be a little complex, but other than that I don't see a need for a spec | 16:55 |
sean-k-mooney | I'm a little concerned about the live migration check, which is what makes me hesitate to say no spec | 16:55 |
bauzas | we can revisit this decision if the patch goes hairy | 16:55 |
sean-k-mooney | yes | 16:55 |
sean-k-mooney | that works for me | 16:55 |
gibi | works for me too | 16:56 |
sean-k-mooney | I think we have the hypervisor version available in the conductor, so I think we can do it without an RPC/object change | 16:56 |
bauzas | #agreed https://blueprints.launchpad.net/nova/+spec/control-qemu-tb-cache can be a specless BP but we need to know more about the live migration checks before we approve | 16:56 |
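A hypothetical sketch of the version gating discussed above, as a libvirt-driver method; the constants, option name, and exception are illustrative, though has_min_version() is the helper the libvirt driver already uses for checks like this:

```python
from nova import conf
from nova import exception

CONF = conf.CONF

# Illustrative minimums per the discussion: libvirt 7.10.0, QEMU 5.0.
MIN_LIBVIRT_TB_CACHE = (7, 10, 0)
MIN_QEMU_TB_CACHE = (5, 0, 0)

def _check_tb_cache_support(self):
    # Hypothetical option name: refuse to start the compute service if
    # the option is set but the hypervisor cannot honour it.
    if CONF.libvirt.tb_cache_size and not self._host.has_min_version(
            lv_ver=MIN_LIBVIRT_TB_CACHE, hv_ver=MIN_QEMU_TB_CACHE):
        raise exception.InvalidConfiguration(
            'The [libvirt]/tb_cache_size option requires libvirt >= '
            '7.10.0 and QEMU >= 5.0.')
```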
bauzas | gibi: sean-k-mooney: does what I wrote work for you? | 16:56 |
sean-k-mooney | +1 | 16:57 |
bauzas | ok, | 16:57 |
bauzas | next topic is ganso | 16:57 |
bauzas | and eventually, whoami-rajat | 16:57 |
ganso | hi! | 16:57 |
bauzas | ganso: you have one min :) | 16:57 |
ganso | so my question is about adding hw_vif_multiqueue_enabled setting to flavors | 16:57 |
ganso | it was removed from the original spec | 16:57 |
ganso | https://review.opendev.org/c/openstack/nova-specs/+/128825/comment/7ad32947_73515762/#90 | 16:57 |
ganso | today it can be only used in image properties | 16:58 |
ganso | does it make sense at all semantically in a flavor, or is this something that only makes sense as an image property? | 16:58 |
sean-k-mooney | ya, this came up semi-recently | 16:58 |
sean-k-mooney | i think we can just add this to the flavor | 16:58 |
bauzas | the other way would be a concern to me | 16:58 |
ganso | ok. Would this require a spec? | 16:58 |
bauzas | as users could use a new property | 16:59 |
sean-k-mooney | well, image properties are for exposing things that affect the virtualised hardware | 16:59 |
bauzas | but given we already accept this for images, I don't see a problem with accepting it as a flavor extra spec | 16:59 |
sean-k-mooney | so in general you want that to be user-settable | 16:59 |
ganso | great | 17:00 |
bauzas | sean-k-mooney: right, I was just explaining that image > flavor seems not debatable while flavor > image seems to be up for discussion | 17:00 |
ganso | to me it sounds simple enough to not require a spec, do you agree? | 17:00 |
bauzas | good question | 17:00 |
bauzas | but we're overtime | 17:00 |
sean-k-mooney | https://blueprints.launchpad.net/nova/+spec/multiqueue-flavor-extra-spec | 17:00 |
sean-k-mooney | this is the implemation https://review.opendev.org/q/topic:bp/multiqueue-flavor-extra-spec | 17:01 |
bauzas | ganso: whoami-rajat: let's continue discussing your concerns after the meeting | 17:01 |
bauzas | #endmeeting | 17:01 |
opendevmeet | Meeting ended Tue Nov 16 17:01:10 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 17:01 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-16-16.00.html | 17:01 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-16-16.00.txt | 17:01 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-16-16.00.log.html | 17:01 |
whoami-rajat | ack | 17:01 |
bauzas | I need to leave | 17:01 |
sean-k-mooney | ganso: stephenfin was working on this before he moved team last cycle | 17:01 |
bauzas | ganso: about your ask, I'll defer the specless bp acceptance to next week | 17:01 |
sean-k-mooney | ganso: i think we can do it as a specless blueprint | 17:01 |
bauzas | ganso: but we can basically agree on this without waiting for it to be papered | 17:02 |
sean-k-mooney | ganso: all of the code is there; I just didn't get time to pick it back up after stephenfin moved, so if you want to pick it up please do | 17:02 |
* bauzas needs to leave | 17:02 | |
ganso | sean-k-mooney, bauzas thank you very much!! | 17:02 |
gibi | whoami-rajat: would be nice to have the reimage discussion when lyarwood is present | 17:03 |
gibi | I don't feel knowledgeable enough in cinder | 17:03 |
whoami-rajat | gibi, ok, just wanted the team's thoughts on it, can you suggest a time that would be suitable? | 17:04 |
gibi | whoami-rajat: try to ping lyarwood tomorrow | 17:05 |
whoami-rajat | ok | 17:06 |
gibi | both bauzas and I were +2 on your spec, so a quick chat with lyarwood would be enough | 17:06 |
whoami-rajat | ack, i will fix the gate failure and see what lyarwood thinks about it | 17:06 |
opendevreview | Dan Smith proposed openstack/nova master: WIP: Revert project-specific APIs for servers https://review.opendev.org/c/openstack/nova/+/816206 | 17:19 |
kashyap | bauzas: Just back: on the blueprint for changing video model to "virtio": yes, you can trust the test results posted in the change. As noted, I've got it properly integration-tested for Windows and Linux guests with Red Hat virt QE | 17:37 |
kashyap | Also, gibi --^ (Thanks for the trust :)) | 17:37 |
gibi | kashyap: :) | 17:38 |
kashyap | gibi: On context: it is not specific to downstream RHEL9 removing it (as sean-k-mooney phrased it). *Regardless* of what RHEL9 does it is not a good default. That's the bare argument. | 17:38 |
kashyap | "it" == Cirrus, I mean. | 17:38 |
kashyap | gibi: On the tb-cache thing: as a reminder, it is mostly used by CI setups that can't have KVM. All sensible production users will use KVM | 17:39 |
kashyap | Unless they have some need to run emulated-only guests -- because the performance is cripplingly slow compared to hardware-accelerated virt | 17:40 |
gibi | yeah good point | 17:40 |
gibi | it is for a specific non-production use case | 17:40 |
clarkb | kashyap: there are production use cases for emulation though. For example docker image builds for different architectures (we do a bunch of that) | 17:40 |
clarkb | That doesn't concern nova, but it should be something that qemu/libvirt consider | 17:41 |
clarkb | basically the emulation use case shouldn't simply be dismissed | 17:41 |
kashyap | clarkb: Heya. Fully agree - that's a valid use-case. :-) But I was speaking from a compute-workload point of view: 90% of them are on the KVM driver | 17:42 |
kashyap | clarkb: Enabling cross-arch builds is one of the appealing points, sure. | 17:42 |
kashyap | clarkb: Although, my use of "sensible production users" is a bit dismissive, I agree. Sorry :) | 17:43 |
sean-k-mooney | kashyap: rackspace used to run their public cloud using QEMU for x86 on POWER hardware for a long time | 17:43 |
kashyap | sean-k-mooney: Sure; but it's also far, far less secure. And upstream QEMU doesn't make any security guarantees | 17:45 |
sean-k-mooney | kashyap: yep and that is fine for many | 17:45 |
sean-k-mooney | especially if they use SELinux/containers to add an extra layer of security around the QEMU instance | 17:46 |
kashyap | sean-k-mooney: Sure; as long as they're aware of it. I just double-checked with the QEMU folks: they "explicitly *disclaim* any security for TCG" | 17:46 |
sean-k-mooney | yes I know | 17:46 |
sean-k-mooney | it's in their wiki | 17:46 |
kashyap | Public docs: https://qemu-project.gitlab.io/qemu/system/security.html | 17:47 |
sean-k-mooney | https://www.qemu.org/docs/master/system/security.html#non-virtualization-use-case | 17:47 |
kashyap | Yep. | 17:47 |
kashyap | sean-k-mooney: Note, though: SELinux/AppArmor can mitigate *some* of the risk, but as the QEMU folks say elsewhere: "depending on the config you can still have *massive* holes you can drive a truck through" (Cc: gibi, clarkb) | 17:50 |
clarkb | sure, I'm not saying it is a good idea for production cloud VM usage. But I do think there are valid use cases out there | 17:51 |
kashyap | Agreed. I was just tempering the "production cloud w/ TCG" point of view. In case any lurkers are observing this conversation, I wanted to flag the security implications here | 17:52 |
sean-k-mooney | I don't really think it's a debate; there have been several large-scale production clouds that did run with just QEMU | 17:53 |
sean-k-mooney | depending on your security model it may or may not be an issue | 17:53 |
kashyap | Also Rackspace used to offer Xen too. Not just plain QEMU. | 18:01 |
kashyap | I don't want to belabour this point. I wonder who these "largescale clouds" are. Overall, any serious user who wants to run non-toy compute workloads will not use plain emulation. | 18:02 |
kashyap | Anyhow...time to wrap up the day. | 18:03 |
*** tosky is now known as Guest6054 | 18:05 | |
*** tosky_ is now known as tosky | 18:05 | |
*** tosky_ is now known as tosky | 18:48 | |
dasp | sean-k-mooney: I opened the BP like you suggested but didn't tag it for yoga properly, so it may have been missed: https://blueprints.launchpad.net/nova/+spec/configurable-no-compression-image-types | 19:11 |
sean-k-mooney | em, we will tag it when it's reviewed, but you just need to add it to the meeting agenda by updating the wiki | 19:12 |
opendevreview | Rodrigo Barbieri proposed openstack/nova master: Add 'hw:vif_multiqueue_enabled' flavor extra spec https://review.opendev.org/c/openstack/nova/+/792356 | 19:12 |
dasp | sean-k-mooney: thanks, done | 19:17 |
*** mdbooth5 is now known as mdbooth | 19:35 | |
opendevreview | Artom Lifshitz proposed openstack/nova master: DNM: Test token expiration during live migration https://review.opendev.org/c/openstack/nova/+/817778 | 20:19 |
*** tosky is now known as Guest6070 | 22:42 | |
*** tosky_ is now known as tosky | 22:42 | |
*** tosky is now known as Guest6073 | 23:07 | |
*** tosky_ is now known as tosky | 23:07 |