sean-k-mooney[m] | I don't think we should support the DMI info at all | 01:45 |
---|---|---|
sean-k-mooney[m] | it's a libvirt internal implementation detail, so I'm really not a fan of building on that for config drive or any other metadata mechanism | 01:46 |
sean-k-mooney[m] | I certainly don't want to rush into adding support for this as a bug fix without proper discussion, as this is an undocumented internal detail | 01:47 |
sean-k-mooney[m] | if you're using virtio-scsi you can have 100s of disks if needed; the config drive does consume a PCI/PCIe slot by default, but it's not generally a problem | 01:51 |
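For context, a minimal sketch of attaching a disk on the SCSI bus (served by a virtio-scsi controller) rather than virtio-blk, which is what avoids consuming a PCI/PCIe slot per disk; the domain name and image path are placeholders, not anything from the discussion:

```sh
# Attach a disk to a running guest on the SCSI bus instead of virtio-blk.
# SCSI disks share one virtio-scsi controller, so each extra disk does not
# take its own PCI/PCIe slot. "mydomain" and the image path are examples.
virsh attach-disk mydomain /var/lib/libvirt/images/data.img sdb \
    --targetbus scsi --persistent
```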
mikal | sean-k-mooney[m]: I finally have all the USB and sound changes passing CI, so that's nice. | 06:43 |
Uggla | sean-k-mooney[m], fyi I did a first review pass on 953940: support iothread for nova vms | https://review.opendev.org/c/openstack/nova-specs/+/953940 | 09:00 |
bauzas | sean-k-mooney[m]: honestly, I'd prefer to let the current owner provide a new revision himself | 09:25 |
mikal | For what it's worth, https://review.opendev.org/q/topic:%22libvirt-vdi%22 is a series of four patches all with passing CI and a +2 on them if someone has a spare minute or two. | 09:54 |
sean-k-mooney | bauzas: well, I wanted to heavily revise their proposal; also, I have proposed this in some form twice before, so I wrote it from scratch based on my prior exploration of this in past cycles | 10:01 |
opendevreview | Alexey Stupnikov proposed openstack/nova master: Don't create instance_extra entry for deleted instance https://review.opendev.org/c/openstack/nova/+/412771 | 10:38 |
opendevreview | Rajesh Tailor proposed openstack/nova-specs master: Show finish_time field in instance action show https://review.opendev.org/c/openstack/nova-specs/+/929780 | 11:49 |
opendevreview | sean mooney proposed openstack/nova-specs master: support iothread for nova vms https://review.opendev.org/c/openstack/nova-specs/+/953940 | 12:10 |
opendevreview | Biser Milanov proposed openstack/nova master: Hardcode the use of iothreads for KVM. https://review.opendev.org/c/openstack/nova/+/918669 | 12:43 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Run nova-api and -metadata in threaded mode https://review.opendev.org/c/openstack/nova/+/951957 | 12:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Warn on long task wait time for executor https://review.opendev.org/c/openstack/nova/+/952666 | 12:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Allow to start unit test without eventlet https://review.opendev.org/c/openstack/nova/+/953436 | 12:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Run unit test with threading mode https://review.opendev.org/c/openstack/nova/+/953475 | 12:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [test]RPC using threading or eventlet selectively https://review.opendev.org/c/openstack/nova/+/953815 | 12:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Do not yield in threading mode https://review.opendev.org/c/openstack/nova/+/950994 | 12:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Run nova-api and -metadata in threaded mode https://review.opendev.org/c/openstack/nova/+/951957 | 12:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Warn on long task wait time for executor https://review.opendev.org/c/openstack/nova/+/952666 | 12:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Allow to start unit test without eventlet https://review.opendev.org/c/openstack/nova/+/953436 | 12:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Run unit test with threading mode https://review.opendev.org/c/openstack/nova/+/953475 | 12:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [test]RPC using threading or eventlet selectively https://review.opendev.org/c/openstack/nova/+/953815 | 12:59 |
sean-k-mooney | bauzas: can you provide input specifically on this comment: https://review.opendev.org/c/openstack/nova-specs/+/953940/comment/c0db3aad_9199e6e7/ Do you want me to move hw:iothreads_per_disk to the alternatives, or only update the iothreads on reboot for this cycle? | 13:04 |
sean-k-mooney | I'll simplify the spec based on that input. | 13:04 |
sean-k-mooney | long term we definitely should support hw:iothreads_per_disk, but we don't need to support it initially; my expectation is we will enhance the iothread functionality incrementally over a few releases as we need to. | 13:06 |
sean-k-mooney | the other thing I was undecided on was whether it should be hw:iothreads_per_disk or hw:iothreads_per_volume, i.e. the number of additional iothreads to add per cinder volume, or per any disk associated with the guest, i.e. swap or ephemeral | 13:08 |
sean-k-mooney | I think, for simplicity and to have a baseline this cycle, we should just move all the per-disk parts to a different spec for next cycle like you were suggesting | 13:09 |
bauzas | sean-k-mooney: yeah, I was about to say "given all the discussions we need to have for iothreads per disk, maybe we should just do that in another spec" | 13:15 |
bauzas | and my main concern with that is the fact that we need to restart an instance to use more iothreads if we modify the number when we add a new volume | 13:16 |
sean-k-mooney | so we don't actually need the restart as far as I can tell, but we can't just update the XML; we need to call a separate API to add or remove the iothread | 13:18 |
sean-k-mooney | so ok, I'll move all of that to alternatives and simplify the spec to only static iothreads set via the hw:iothreads extra spec | 13:19 |
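For illustration, a rough sketch of what the static variant being discussed could look like; the hw:iothreads extra spec is only proposed in the spec under review, so treat the property name and the resulting XML as assumptions rather than merged behaviour:

```sh
# Hypothetical flavor extra spec from the proposed spec (not yet merged):
openstack flavor set my.flavor --property hw:iothreads=2

# The libvirt driver would then emit something along these lines in the
# domain XML: a static iothread pool, optionally referenced per disk.
#   <domain>
#     <iothreads>2</iothreads>
#     ...
#     <disk type='file' device='disk'>
#       <driver name='qemu' type='qcow2' iothread='1'/>
#     </disk>
#   </domain>
```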
bauzas | oh, the libvirt API accepts a specific call to add a new iothread ? | 13:20 |
bauzas | if so, tell that please :) | 13:20 |
sean-k-mooney | so you can do it on the CLI against a live domain. I commented on that in my reply | 13:21 |
sean-k-mooney | so there is an API but I need to go find what virsh is calling | 13:21 |
sean-k-mooney | but we do need to track/assign the iothread ID to use it | 13:21 |
sean-k-mooney | so we can use it, but it's more complexity than I would like to take on this cycle | 13:22 |
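This is the virsh-level mechanism being referred to; a short sketch with example domain name and thread IDs (not anything Nova does today):

```sh
# Show the current iothreads and their CPU affinity for a running domain.
virsh iothreadinfo mydomain

# Add a new iothread (ID 3) to the live domain; the caller has to pick and
# track the ID, which is the extra complexity being discussed.
virsh iothreadadd mydomain 3 --live

# A disk is only served by that thread once its <driver> element references
# it, e.g. <driver name='qemu' type='qcow2' iothread='3'/>, typically applied
# with "virsh update-device mydomain disk.xml --live".
```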
sean-k-mooney | I'm going to simplify the CPU affinity part too. | 13:29 |
opendevreview | sean mooney proposed openstack/nova-specs master: support iothread for nova vms https://review.opendev.org/c/openstack/nova-specs/+/953940 | 13:51 |
opendevreview | Rajesh Tailor proposed openstack/nova-specs master: Show finish_time field in instance action show https://review.opendev.org/c/openstack/nova-specs/+/929780 | 14:05 |
opendevreview | sean mooney proposed openstack/nova-specs master: support iothread for nova vms https://review.opendev.org/c/openstack/nova-specs/+/953940 | 14:30 |
opendevreview | sean mooney proposed openstack/nova-specs master: support iothread for nova vms https://review.opendev.org/c/openstack/nova-specs/+/953940 | 14:34 |
sean-k-mooney | Uggla: bauzas ^ the only open question left in that version is trait or compute service bump. a trait is simpler in my view and will be a better UX, but I am not going to update the spec until folks provide feedback. I need to go test something else for a bit | 14:43 |
Uggla | sean-k-mooney, I'm more in favor of a trait, but I'm not sure I have a rationale behind that. I need to think about it. | 14:51 |
sean-k-mooney | I do: it works for mixed clouds with different virt drivers | 14:51 |
sean-k-mooney | either different versions, or vmware + libvirt, or libvirt + ironic etc. | 14:52 |
sean-k-mooney | we can do it the other way, but to me it's harder to implement correctly | 14:52 |
opendevreview | Masahito Muroi proposed openstack/nova-specs master: Add spec for virtio-blk multiqueue extra_spec https://review.opendev.org/c/openstack/nova-specs/+/951636 | 16:50 |
masahito | Thanks for the quick reviews. I narrowed down the spec scope to only supporting multiqueue, to enable incremental support and to simplify the discussion. | 16:55 |
sean-k-mooney | masahito: I'm sorry that we didn't really have time to work on this more closely earlier in the cycle | 16:58 |
sean-k-mooney | masahito: I'm not entirely sure what the right thing to do is. technically the approval deadline is end of business today | 16:59 |
sean-k-mooney | I'll try and take a look at your spec again shortly | 17:00 |
sean-k-mooney | but I'm not sure if others will have time today | 17:00 |
masahito | it's ok. I understand the iothreads part needs extra discussion now because of consistency across Nova as a whole. | 17:09 |
opendevreview | Merged openstack/nova master: Use futurist for _get_default_green_pool() https://review.opendev.org/c/openstack/nova/+/948072 | 17:11 |
dansmith | sean-k-mooney: my request is that instead of you proposing a competing spec, you propose a revision on top of masahito's for whatever you think needs to be different | 17:11 |
masahito | and if we keep discussing the iothreads part, we could merge it early in the G development cycle even though the spec slipped from the F release. | 17:11 |
sean-k-mooney | dansmith: they are now describing very different things | 17:11 |
dansmith | it will make it much easier to understand any differences and avoid the appearance that we're going to choose one or the other | 17:11 |
dansmith | sean-k-mooney: cool, then should be easy | 17:11 |
sean-k-mooney | dansmith: masahito's spec is for block multiqueue only and mine is only for iothreads | 17:12 |
sean-k-mooney | and the two compose together | 17:12 |
sean-k-mooney | dansmith: masahito's originally did both, in a more advanced way than either now proposes | 17:13 |
masahito | ^ the iothreads part means mapping disks to iothreads. | 17:13 |
dansmith | sean-k-mooney: okay I only skimmed them but I thought yours was proposing both | 17:14 |
sean-k-mooney | no, I was going to rely on the fact that multiqueue is enabled by default by qemu for now and focus only on the iothread part | 17:15 |
dansmith | multi-queue has the same concerns for blk as eth I assume, where you have to scale them appropriately without using up all the .. what is it, pci root ports or something? | 17:15 |
sean-k-mooney | no | 17:15 |
masahito | yup. sean's spec gives me a clear view of the road to the goal. | 17:15 |
sean-k-mooney | so multiqueue for blk is on by default but you get, I think, 1 queue per vcpu? masahito is that correct | 17:15 |
sean-k-mooney | dansmith: for networking the problem was that on the host side all the queues were used by ovs | 17:16 |
sean-k-mooney | dansmith: but you had to enable them manually in the guest driver | 17:16 |
sean-k-mooney | about 2-3 years ago the virtio-net driver in the guest was updated to enable all the queues if the nic has more than one | 17:16 |
sean-k-mooney | so you don't actually need to do anything to have networking work properly with a modern kernel | 17:17 |
masahito | right. 1 queue per vcpu has been the default since QEMU 5.2 | 17:18 |
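A small sketch of both sides of that; the guest-side paths are standard blk-mq sysfs, and the `queues` attribute is presumably the libvirt knob such a spec would drive (device name and values are examples):

```sh
# Host side: explicitly request 4 virtio-blk queues for a disk by setting the
# "queues" attribute on the disk's <driver> element in the domain XML:
#   <driver name='qemu' type='qcow2' queues='4'/>
# (with a new enough QEMU the default is already one queue per vCPU)

# Guest side: confirm how many hardware queues the block device ended up with.
ls /sys/block/vda/mq/
# e.g. "0  1  2  3" for four queues
```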
dansmith | okay so blk multiqueue on its own gets you more queues, but limited actual parallelism by the default of $vcpu threads to service them, | 17:19 |
dansmith | and iothreads lets you increase that past that number if you want, say 16 queues, 2 vcpus, and 16 io threads to serve those queues? | 17:19 |
sean-k-mooney | dansmith: no, by default it creates a virtual device with 1 queue per vcpu, but all the IO is handled by the single emulator thread | 17:20 |
sean-k-mooney | dansmith: so the only thing it gives you is lock-free read/write in the guest kernel | 17:20 |
sean-k-mooney | if you map one queue to each core or something like that | 17:20 |
dansmith | by default, I understand | 17:20 |
dansmith | er, well, okay, | 17:21 |
dansmith | by default one queue per vcpu, all funneling through a single thread shared with emulator right? | 17:21 |
sean-k-mooney | yep | 17:21 |
dansmith | so two devices, four vcpus means 8 queues for one thread shared with emulator to handle | 17:22 |
sean-k-mooney | yep | 17:22 |
dansmith | so even more iothreads with the default queues will improve performance | 17:22 |
sean-k-mooney | masahito can correct me if I'm wrong, but what they have observed is that if you have too many queues it does not actually help with performance | 17:22 |
dansmith | I'm not sure why multiple queues per device would really help with performance if you don't increase threads, but maybe it's because block devices are scheduled and have barriers unlike nics? | 17:23 |
sean-k-mooney | so if you have a vm with 64 vcpus, 4 queues can perform better than 64 queues in some cases | 17:23 |
masahito | right for disk performance, not network performance. | 17:23 |
sean-k-mooney | dansmith: masahito wants to decouple the queue count from the vcpu count, mainly to reduce the queue count on large vms and I guess increase it on really small ones | 17:24 |
dansmith | ah, okay I see that in the spec but I glossed over it before | 17:25 |
sean-k-mooney | but the real performance uplift is expected to come from the addition of iothreads | 17:25 |
masahito | 2 or 4 queues with 1 emulator thread performs better than 1 queue. But in the case of a large CPU flavor, 16 or more, the number of queues seems to be too much for a single thread. | 17:25 |
dansmith | yeah okay, I had a different impression from reading these two things, but I see why we'd need both now | 17:25 |
masahito | The IO performance doesn't increase even though the number of queues is increased. 1 queue and 1 emulator thread gives better performance than 16 queues and 1 emulator thread in the case of 16 cores | 17:26 |
sean-k-mooney | so multiqueue for nics results in one kernel vhost thread per queue if I recall; at least with ovs-dpdk it allows more dpdk cores to process packets, as it did a round robin of the queues across the available dpdk cores. | 17:28 |
sean-k-mooney | but for block queues we don't get that automatic implicit scaling | 17:28 |
sean-k-mooney | I think that's partly because there is no vhost offload for block devices | 17:30 |
sean-k-mooney | the main benefit to having a multiqueue disk is it allows the use of the multiqueue disk IO schedulers in the guest kernel to do IO prioritisation | 17:31 |
sean-k-mooney | so you can use mq-deadline or bfq or kyber in the guest, etc. | 17:32 |
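For reference, checking and switching the multi-queue IO scheduler inside the guest (vda is an example device):

```sh
# Inside the guest: the active scheduler is shown in brackets.
cat /sys/block/vda/queue/scheduler
# e.g. [none] mq-deadline kyber bfq

# Switch to mq-deadline (or bfq/kyber) at runtime.
echo mq-deadline | sudo tee /sys/block/vda/queue/scheduler
```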
dansmith | yeah, that's what I meant above in terms of benefit for the scheduled block requests | 17:32 |
sean-k-mooney | ack | 17:32 |
dansmith | sean-k-mooney: can we not combine these, at least at the trait layer, to allow advertising and requesting these blk queues and threads? | 17:33 |
dansmith | I know the threads are useful for more than just blk, but if we can assume that threads mean you can do the blk queue that might be nice | 17:33 |
sean-k-mooney | we can; we can also combine the specs if we really want to at this point | 17:33 |
dansmith | just hate the trait explosion we get | 17:34 |
sean-k-mooney | originally masahito's goal was much more advanced than the combination of these two specs; we can still evolve to that goal but the intent was to reduce the scope to something achievable this cycle | 17:34 |
sean-k-mooney | dansmith: if we can do both this cycle I don't see why we would need 2 traits | 17:36 |
sean-k-mooney | so if we put the multiqueue enhancement on top of the iothreads one, when we are writing the code for this I think we can reuse it to mean both | 17:37 |
dansmith | yeah | 17:37 |
sean-k-mooney | the multiqueue change will now be very small, so I'm not worried about that one slipping | 17:37 |
dansmith | I don't really see how we can get these both merged this cycle at this point | 17:37 |
dansmith | I'm out for the next ten days starting tomorrow | 17:38 |
dansmith | I dunno how others feel | 17:38 |
dansmith | gibi: just noticed you rechecking for rabbit reasons again... any chance we're mucking with things that are causing us to stall a real thread or something? | 18:12 |
dansmith | I mean, I realize I should know as well as you, but.. | 18:12 |
dansmith | are we seeing those same failures elsewhere? | 18:12 |
clarkb | dansmith: have a link to an example failure? | 18:13 |
clarkb | (I want to rule one thing out or in) | 18:13 |
dansmith | clarkb: https://review.opendev.org/c/openstack/nova/+/948079?tab=change-view-tab-header-zuul-results-summary | 18:13 |
dansmith | not sure which of those fails is the one he rechecked for | 18:14 |
dansmith | (man we have /got/ to squelch the verbose tracebacking from os-brick) | 18:15 |
clarkb | each of the three failures were multinode jobs and zuul-launcher (the new nodepool replacement) gave you mixed cloud provider nodes. It's only really supposed to do this when there is no other choice, but there was at least one bug causing it to do that too often, which should be fixed now | 18:16 |
clarkb | those jobs ran ~5 hours ago so the fix for that particular problem should've been in place. | 18:16 |
dansmith | ah | 18:16 |
clarkb | But anyway if your multinode job can't handle that (due to ip addressing or latency or whatever) that could be a reason | 18:16 |
dansmith | so, nova officially doesn't support cross-site MQ, even though lots of people do it | 18:19 |
dansmith | redhat has strict latency requirements for when we allow people to do it | 18:19 |
dansmith | getting nodes from multiple providers would seem to make that kindof difficult | 18:19 |
clarkb | I've brought up that the mixed nodesets are still happening in #opendev as we're trying to track zuul-launcher behaviors that are unexpected or cause problems there | 18:20 |
clarkb | ya its not the default | 18:20 |
dansmith | ack | 18:20 |
clarkb | and I think the vast majority of the time it isn't happening that way (opendev's own multinode jobs testing deployment of our software are also sensitive to it and I noticed immediately when it was doing it too often. Since then I think I've only seen one failure on our side, but we also run fewer jobs than nova and have fewer changes) | 18:21 |
dansmith | in general, we're a lot more tolerant of it than I would have thought, so it's probably not really a problem in a lot of cases (and maybe good to be throwing some of it in there) | 18:24 |
dansmith | I'm just saying, it's definitely not what we expect to happen | 18:24 |
dansmith | looks like this is the fail he was referring to: https://zuul.opendev.org/t/openstack/build/66557ea92bae4e809c09526eba619c07/log/compute1/logs/screen-n-cpu.txt | 18:25 |
dansmith | which is just fully unable to talk to rabbit | 18:25 |
clarkb | which may be an issue of ip routing | 18:25 |
clarkb | depending on how the ip addrs are configured. By default we'd allow the traffic to cross the internet and get from point a to point b, but you have to use the public ips on both sides, which may be floating ips etc | 18:26 |
dansmith | and you think that our jobs might be misconfigured for that? | 18:26 |
dansmith | I don't _think_ nova-next does a ton of weird config | 18:27 |
opendevreview | Masahito Muroi proposed openstack/nova-specs master: Add spec for the network id query in server list and server details https://review.opendev.org/c/openstack/nova-specs/+/938823 | 18:27 |
clarkb | yes it's possible they use private ips rather than public ips | 18:28 |
clarkb | or listen only on the local ip address so connections for the floating ip are rejected | 18:28 |
dansmith | transport_url = rabbit://stackrabbit:secretrabbit@10.209.32.130:5672/ | 18:28 |
clarkb | that'll do it | 18:28 |
dansmith | okay so I don't think that's specific to nova-next here | 18:29 |
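If a job really had to survive a mixed-provider nodeset, the subnode would need to point at an address on the controller that is reachable across providers; a hedged sketch only, since the IPs, the iniset helper and the restart step are assumptions about the devstack plumbing rather than what the job does today:

```sh
# On the compute subnode, point oslo.messaging at the controller's publicly
# routable address instead of its provider-private 10.x address.
# (iniset is the devstack helper; editing nova.conf by hand works the same.)
iniset /etc/nova/nova.conf DEFAULT transport_url \
    "rabbit://stackrabbit:secretrabbit@203.0.113.10:5672/"
sudo systemctl restart devstack@n-cpu
```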
dansmith | well, it does make some choices about neutron so maybe it's wrapped up in that | 18:32 |
sean-k-mooney | it's been a very long time, but we used tc when I was at intel to inject up to 120ms of latency into the links between hosts | 18:51 |
sean-k-mooney | for the most part that seemed to still be fine provided the clocks were in sync | 18:51 |
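That kind of latency injection is just netem; a minimal example (eth0 and 120ms are illustrative):

```sh
# Add 120ms of one-way delay on the interface carrying the inter-host traffic.
sudo tc qdisc add dev eth0 root netem delay 120ms

# Remove it again when done.
sudo tc qdisc del dev eth0 root netem
```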
dansmith | yeah it's clearly not routable because of the private ip in this case | 18:51 |
dansmith | yep | 18:51 |
sean-k-mooney | ah it's routing, ya | 18:52 |
sean-k-mooney | that would make sense. I know we try and set up a vxlan or gre mesh using linux bridge in the multinode jobs | 18:52 |
sean-k-mooney | and then use that for neutron tenant networking | 18:52 |
sean-k-mooney | but I don't know if we actually route all traffic over that | 18:52 |
sean-k-mooney | I guess we are not, or that is not working properly if we are | 18:53 |
dansmith | it's not that because this is just nova-compute unable to contact the mq on the controller node because it's not using its public ip | 18:54 |
dansmith | nothing to do with guest networking | 18:54 |
sean-k-mooney | what I meant was if we used the ip from that bridge network those might work | 18:55 |
sean-k-mooney | but ya in devstack we look up the ip of the node by default | 18:55 |
sean-k-mooney | I don't think we override that in the jobs | 18:55 |
dansmith | right | 18:55 |
sean-k-mooney | we expect the vms to be on the same neutron network in the jobs, since nodes in any given nodeset are not allowed to come from different providers | 18:56 |
sean-k-mooney | or at least they were not allowed to do that in the past, so the jobs depend on that | 18:56 |
sean-k-mooney | changing that would be a breaking change | 18:56 |
sean-k-mooney | not necessarily a bad one, but not one that our jobs can handle, so it would need to be opt in | 18:56 |