sean-k-mooney[m] | I don't think we should support the DMI info at all | 01:45 |
---|---|---|
sean-k-mooney[m] | it's a libvirt internal implementation detail, so I'm really not a fan of building on that for config drive or any other metadata mechanism | 01:46 |
sean-k-mooney[m] | I certainly don't want to rush into adding support for this as a bug fix without proper discussion, as this is an undocumented internal detail | 01:47 |
sean-k-mooney[m] | if you're using virtio-scsi you can have 100s of disks if needed; the config drive does consume a PCI/PCIe slot by default, but it's not generally a problem | 01:51 |
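For context, a minimal sketch of attaching a disk on the SCSI bus (served by a virtio-scsi controller) rather than virtio-blk, which is what avoids consuming a PCI/PCIe slot per disk; the domain name and image path are placeholders, not anything from the discussion:

```sh
# Attach a disk to a running guest on the SCSI bus instead of virtio-blk.
# SCSI disks share one virtio-scsi controller, so each extra disk does not
# take its own PCI/PCIe slot. "mydomain" and the image path are examples.
virsh attach-disk mydomain /var/lib/libvirt/images/data.img sdb \
    --targetbus scsi --persistent
```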
mikal | sean-k-mooney[m]: I finally have all the USB and sound changes passing CI, so that's nice. | 06:43 |
Uggla | sean-k-mooney[m], fyi I did a first review pass on 953940: support iothread for nova vms | https://review.opendev.org/c/openstack/nova-specs/+/953940 | 09:00 |
bauzas | sean-k-mooney[m]: honestly, I'd prefer to let the current owner provide a new revision himself | 09:25 |
mikal | For what it's worth, https://review.opendev.org/q/topic:%22libvirt-vdi%22 is a series of four patches all with passing CI and a +2 on them if someone has a spare minute or two. | 09:54 |
sean-k-mooney | bauzas: well, I wanted to heavily revise their proposal; also, I have proposed this in some form twice before, so I wrote it from scratch based on my prior exploration of this in past cycles | 10:01 |
opendevreview | Alexey Stupnikov proposed openstack/nova master: Don't create instance_extra entry for deleted instance https://review.opendev.org/c/openstack/nova/+/412771 | 10:38 |
opendevreview | Rajesh Tailor proposed openstack/nova-specs master: Show finish_time field in instance action show https://review.opendev.org/c/openstack/nova-specs/+/929780 | 11:49 |
opendevreview | sean mooney proposed openstack/nova-specs master: support iothread for nova vms https://review.opendev.org/c/openstack/nova-specs/+/953940 | 12:10 |
opendevreview | Biser Milanov proposed openstack/nova master: Hardcode the use of iothreads for KVM. https://review.opendev.org/c/openstack/nova/+/918669 | 12:43 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Run nova-api and -metadata in threaded mode https://review.opendev.org/c/openstack/nova/+/951957 | 12:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Warn on long task wait time for executor https://review.opendev.org/c/openstack/nova/+/952666 | 12:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Allow to start unit test without eventlet https://review.opendev.org/c/openstack/nova/+/953436 | 12:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Run unit test with threading mode https://review.opendev.org/c/openstack/nova/+/953475 | 12:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [test]RPC using threading or eventlet selectively https://review.opendev.org/c/openstack/nova/+/953815 | 12:54 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Do not yield in threading mode https://review.opendev.org/c/openstack/nova/+/950994 | 12:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Run nova-api and -metadata in threaded mode https://review.opendev.org/c/openstack/nova/+/951957 | 12:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Warn on long task wait time for executor https://review.opendev.org/c/openstack/nova/+/952666 | 12:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Allow to start unit test without eventlet https://review.opendev.org/c/openstack/nova/+/953436 | 12:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Run unit test with threading mode https://review.opendev.org/c/openstack/nova/+/953475 | 12:59 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [test]RPC using threading or eventlet selectively https://review.opendev.org/c/openstack/nova/+/953815 | 12:59 |
sean-k-mooney | bauzas: can you provide input specifically on this comment: https://review.opendev.org/c/openstack/nova-specs/+/953940/comment/c0db3aad_9199e6e7/ Do you want me to move hw:iothreads_per_disk to the alternatives, or only update the iothreads on reboot for this cycle? | 13:04 |
sean-k-mooney | I'll simplify the spec based on that input. | 13:04 |
sean-k-mooney | long term we definitely should support hw:iothreads_per_disk, but we don't need to support it initially; my expectation is we will enhance the iothread functionality incrementally over a few releases as we need to. | 13:06 |
sean-k-mooney | the other thing I was undecided on was whether it should be hw:iothreads_per_disk or hw:iothreads_per_volume, i.e. the number of additional iothreads to add per cinder volume, or per any disk associated with the guest, i.e. swap or ephemeral | 13:08 |
sean-k-mooney | I think, for simplicity and to have a baseline this cycle, we should just move all the per-disk parts to a different spec for next cycle like you were suggesting | 13:09 |
bauzas | sean-k-mooney: yeah, I was about to say "given all the discussions we need to have for iothreads per disk, maybe we should just do that in another spec" | 13:15 |
bauzas | and my main concern with that is the fact that we need to restart an instance to use more iothreads if we modify the number when we add a new volume | 13:16 |
sean-k-mooney | so we don't actually need the restart as far as I can tell, but we can't just update the XML; we need to call a separate API to add or remove the iothread | 13:18 |
sean-k-mooney | so ok, I'll move all of that to alternatives and simplify the spec to only static iothreads set via the hw:iothreads extra spec | 13:19 |
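For illustration, a rough sketch of what the static variant being discussed could look like; the hw:iothreads extra spec is only proposed in the spec under review, so treat the property name and the resulting XML as assumptions rather than merged behaviour:

```sh
# Hypothetical flavor extra spec from the proposed spec (not yet merged):
openstack flavor set my.flavor --property hw:iothreads=2

# The libvirt driver would then emit something along these lines in the
# domain XML: a static iothread pool, optionally referenced per disk.
#   <domain>
#     <iothreads>2</iothreads>
#     ...
#     <disk type='file' device='disk'>
#       <driver name='qemu' type='qcow2' iothread='1'/>
#     </disk>
#   </domain>
```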
bauzas | oh, the libvirt API accepts a specific call to add a new iothread ? | 13:20 |
bauzas | if so, tell that please :) | 13:20 |
sean-k-mooney | so you can do it on the CLI against a live domain. I commented on that in my reply | 13:21 |
sean-k-mooney | so there is an API but I need to go find what virsh is calling | 13:21 |
sean-k-mooney | but we do need to track/assign the iothread ID to use it | 13:21 |
sean-k-mooney | so we can use it, but it's more complexity than I would like to take on this cycle | 13:22 |
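This is the virsh-level mechanism being referred to; a short sketch with example domain name and thread IDs (not anything Nova does today):

```sh
# Show the current iothreads and their CPU affinity for a running domain.
virsh iothreadinfo mydomain

# Add a new iothread (ID 3) to the live domain; the caller has to pick and
# track the ID, which is the extra complexity being discussed.
virsh iothreadadd mydomain 3 --live

# A disk is only served by that thread once its <driver> element references
# it, e.g. <driver name='qemu' type='qcow2' iothread='3'/>, typically applied
# with "virsh update-device mydomain disk.xml --live".
```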
sean-k-mooney | I'm going to simplify the CPU affinity part too. | 13:29 |
opendevreview | sean mooney proposed openstack/nova-specs master: support iothread for nova vms https://review.opendev.org/c/openstack/nova-specs/+/953940 | 13:51 |
opendevreview | Rajesh Tailor proposed openstack/nova-specs master: Show finish_time field in instance action show https://review.opendev.org/c/openstack/nova-specs/+/929780 | 14:05 |
opendevreview | sean mooney proposed openstack/nova-specs master: support iothread for nova vms https://review.opendev.org/c/openstack/nova-specs/+/953940 | 14:30 |
opendevreview | sean mooney proposed openstack/nova-specs master: support iothread for nova vms https://review.opendev.org/c/openstack/nova-specs/+/953940 | 14:34 |
sean-k-mooney | Uggla: bauzas ^ the only open question left in that version is trait or compute service bump. a trait is simpler in my view and will be a better UX, but I am not going to update the spec until folks provide feedback. I need to go test something else for a bit | 14:43 |
Uggla | sean-k-mooney, I'm more in favor of a trait, but I'm not sure I have a rationale behind that. I need to think about it. | 14:51 |
sean-k-mooney | I do: it works for mixed clouds with different virt drivers | 14:51 |
sean-k-mooney | either different versions, or vmware + libvirt, or libvirt + ironic etc. | 14:52 |
sean-k-mooney | we can do it the other way, but to me it's harder to implement correctly | 14:52 |
opendevreview | Masahito Muroi proposed openstack/nova-specs master: Add spec for virtio-blk multiqueue extra_spec https://review.opendev.org/c/openstack/nova-specs/+/951636 | 16:50 |
masahito | Thanks for the quick reviews. I narrowed down the spec scope to only supporting multiqueue, to enable incremental support and to simplify the discussion. | 16:55 |
sean-k-mooney | masahito: I'm sorry that we didn't really have time to work on this more closely earlier in the cycle | 16:58 |
sean-k-mooney | masahito: I'm not entirely sure what the right thing to do is. technically the approval deadline is end of business today | 16:59 |
sean-k-mooney | I'll try and take a look at your spec again shortly | 17:00 |
sean-k-mooney | but I'm not sure if others will have time today | 17:00 |
masahito | it's ok. I understand the iothreads part needs extra discussion now because of consistency across Nova as a whole. | 17:09 |
opendevreview | Merged openstack/nova master: Use futurist for _get_default_green_pool() https://review.opendev.org/c/openstack/nova/+/948072 | 17:11 |
dansmith | sean-k-mooney: my request is that instead of you proposing a competing spec, you propose a revision on top of masahito's for whatever you think needs to be different | 17:11 |
masahito | and if we keep discussing the iothreads part, we could merge it early in the G development cycle even though the spec slipped from the F release. | 17:11 |
sean-k-mooney | dansmith: they are now describing very different things | 17:11 |
dansmith | it will make it much easier to understand any differences and avoid the appearance that we're going to choose one or the other | 17:11 |
dansmith | sean-k-mooney: cool, then should be easy | 17:11 |
sean-k-mooney | dansmith: masahito's spec is for block multiqueue only and mine is only for iothreads | 17:12 |
sean-k-mooney | and the two compose together | 17:12 |
sean-k-mooney | dansmith: masahito's originally did both, in a more advanced way than either now proposes | 17:13 |
masahito | ^ the iothreads part means mapping disks to iothreads. | 17:13 |
dansmith | sean-k-mooney: okay I only skimmed them but I thought yours was proposing both | 17:14 |
sean-k-mooney | no, I was going to rely on the fact that multiqueue is enabled by default by qemu for now and focus only on the iothread part | 17:15 |
dansmith | multi-queue has the same concerns for blk as eth I assume, where you have to scale them appropriately without using up all the .. what is it, pci root ports or something? | 17:15 |
sean-k-mooney | no | 17:15 |
masahito | yup. sean's spec gives me a clear view of the road to the goal. | 17:15 |
sean-k-mooney | so multiqueue for blk is on by default but you get, I think, 1 queue per vcpu? masahito is that correct | 17:15 |
sean-k-mooney | dansmith: for networking the problem was that on the host side all the queues were used by ovs | 17:16 |
sean-k-mooney | dansmith: but you had to enable them manually in the guest driver | 17:16 |
sean-k-mooney | about 2-3 years ago the virtio-net driver in the guest was updated to enable all the queues if the nic has more than one | 17:16 |
sean-k-mooney | so you don't actually need to do anything to have networking work properly with a modern kernel | 17:17 |
masahito | right. 1 queue per vcpu has been the default since QEMU 5.2 | 17:18 |
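A small sketch of both sides of that; the guest-side paths are standard blk-mq sysfs, and the `queues` attribute is presumably the libvirt knob such a spec would drive (device name and values are examples):

```sh
# Host side: explicitly request 4 virtio-blk queues for a disk by setting the
# "queues" attribute on the disk's <driver> element in the domain XML:
#   <driver name='qemu' type='qcow2' queues='4'/>
# (with a new enough QEMU the default is already one queue per vCPU)

# Guest side: confirm how many hardware queues the block device ended up with.
ls /sys/block/vda/mq/
# e.g. "0  1  2  3" for four queues
```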
dansmith | okay so blk multiqueue on its own gets you more queues, but limited actual parallelism by the default of $vcpu threads to service them, | 17:19 |
dansmith | and iothreads lets you increase that past that number if you want, say 16 queues, 2 vcpus, and 16 io threads to serve those queues? | 17:19 |
sean-k-mooney | dansmith: no, by default it creates a virtual device with 1 queue per vcpu, but all the IO is handled by the single emulator thread | 17:20 |
sean-k-mooney | dansmith: so the only thing it gives you is lock-free read/write in the guest kernel | 17:20 |
sean-k-mooney | if you map one queue to each core or something like that | 17:20 |
dansmith | by default, I understand | 17:20 |
dansmith | er, well, okay, | 17:21 |
dansmith | by default one queue per vcpu, all funneling through a single thread shared with emulator right? | 17:21 |
sean-k-mooney | yep | 17:21 |
dansmith | so two devices, four vcpus means 8 queues for one thread shared with emulator to handle | 17:22 |
sean-k-mooney | yep | 17:22 |
dansmith | so even more iothreads with the default queues will improve performance | 17:22 |
sean-k-mooney | masahito can correct me if I'm wrong, but what they have observed is that if you have too many queues it does not actually help with performance | 17:22 |
dansmith | I'm not sure why multiple queues per device would really help with performance if you don't increase threads, but maybe it's because block devices are scheduled and have barriers unlike nics? | 17:23 |
sean-k-mooney | so if you have a vm with 64 vcpus, 4 queues can perform better than 64 queues in some cases | 17:23 |
masahito | right for disk performance, not network performance. | 17:23 |
sean-k-mooney | dansmith: masahito wants to decouple the queue count from the vcpu count, mainly to reduce the queue count on large vms and I guess increase it on really small ones | 17:24 |
dansmith | ah, okay I see that in the spec but I glossed over it before | 17:25 |
sean-k-mooney | but the real performance uplift is expected to come from the addition of iothreads | 17:25 |
masahito | 2 or 4 queues with 1 emulator thread performs better than 1 queue. But in the case of a large CPU flavor, 16 or more, the number of queues seems to be too much for a single thread. | 17:25 |
dansmith | yeah okay, I had a different impression from reading these two things, but I see why we'd need both now | 17:25 |
masahito | The IO performance doesn't increase even though the number of queues is increased. 1 queue and 1 emulator thread gives better performance than 16 queues and 1 emulator thread in the case of 16 cores | 17:26 |
sean-k-mooney | so multiqueue for nics results in one kernel vhost thread per queue if I recall; at least with ovs-dpdk it allows more dpdk cores to process packets, as it did a round robin of the queues across the available dpdk cores. | 17:28 |
sean-k-mooney | but for block queues we don't get that automatic implicit scaling | 17:28 |
sean-k-mooney | I think that's partly because there is no vhost offload for block devices | 17:30 |
sean-k-mooney | the main benefit to having a multiqueue disk is it allows the use of the multiqueue disk IO schedulers in the guest kernel to do IO prioritisation | 17:31 |
sean-k-mooney | so you can use mq-deadline or bfq or kyber in the guest, etc. | 17:32 |
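For reference, checking and switching the multi-queue IO scheduler inside the guest (vda is an example device):

```sh
# Inside the guest: the active scheduler is shown in brackets.
cat /sys/block/vda/queue/scheduler
# e.g. [none] mq-deadline kyber bfq

# Switch to mq-deadline (or bfq/kyber) at runtime.
echo mq-deadline | sudo tee /sys/block/vda/queue/scheduler
```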
dansmith | yeah, that's what I meant above in terms of benefit for the scheduled block requests | 17:32 |
sean-k-mooney | ack | 17:32 |
dansmith | sean-k-mooney: can we not combine these, at least at the trait layer, to allow advertising and requesting these blk queues and threads? | 17:33 |
dansmith | I know the threads are useful for more than just blk, but if we can assume that threads mean you can do the blk queue that might be nice | 17:33 |
sean-k-mooney | we can; we can also combine the specs if we really want to at this point | 17:33 |
dansmith | just hate the trait explosion we get | 17:34 |
sean-k-mooney | originally masahito's goal was much more advanced than the combination of these two specs; we can still evolve to that goal but the intent was to reduce the scope to something achievable this cycle | 17:34 |
sean-k-mooney | dansmith: if we can do both this cycle I don't see why we would need 2 traits | 17:36 |
sean-k-mooney | so if we put the multiqueue enhancement on top of the iothreads one, when we are writing the code for this I think we can reuse it to mean both | 17:37 |
dansmith | yeah | 17:37 |
sean-k-mooney | the multiqueue change will now be very small, so I'm not worried about that one slipping | 17:37 |
dansmith | I don't really see how we can get these both merged this cycle at this point | 17:37 |
dansmith | I'm out for the next ten days starting tomorrow | 17:38 |
dansmith | I dunno how others feel | 17:38 |
dansmith | gibi: just noticed you rechecking for rabbit reasons again... any chance we're mucking with things that are causing us to stall a real thread or something? | 18:12 |
dansmith | I mean, I realize I should know as well as you, but.. | 18:12 |
dansmith | are we seeing those same failures elsewhere? | 18:12 |
clarkb | dansmith: have a link to an example failure? | 18:13 |
clarkb | (I want to rule one thing out or in) | 18:13 |
dansmith | clarkb: https://review.opendev.org/c/openstack/nova/+/948079?tab=change-view-tab-header-zuul-results-summary | 18:13 |
dansmith | not sure which of those fails is the one he rechecked for | 18:14 |
dansmith | (man we have /got/ to squelch the verbose tracebacking from os-brick) | 18:15 |
clarkb | each of the three failures were multinode jobs and zuul-launcher (the new nodepool replacement) gave you mixed cloud provider nodes. It's only really supposed to do this when there is no other choice, but there was at least one bug causing it to do that too often, which should be fixed now | 18:16 |
clarkb | those jobs ran ~5 hours ago so the fix for that particular problem should've been in place. | 18:16 |
dansmith | ah | 18:16 |
clarkb | But anyway if your multinode job can't handle that (due to ip addressing or latency or whatever) that could be a reason | 18:16 |
dansmith | so, nova officially doesn't support cross-site MQ, even though lots of people do it | 18:19 |
dansmith | redhat has strict latency requirements for when we allow people to do it | 18:19 |
dansmith | getting nodes from multiple providers would seem to make that kindof difficult | 18:19 |
clarkb | I've brought up that the mixed nodesets are still happening in #opendev as we're trying to track zuul-launcher behaviors that are unexpected or cause problems there | 18:20 |
clarkb | ya its not the default | 18:20 |
dansmith | ack | 18:20 |
clarkb | and I think the vast majority of the time it isn't happening that way (opendev's own multinode jobs testing deployment of our software are also sensitive to it and I noticed immediately when it was doing it too often. Since then I think I've only seen one failure on our side, but we also run fewer jobs than nova and have fewer changes) | 18:21 |
dansmith | in general, we're a lot more tolerant of it than I would have thought, so it's probably not really a problem in a lot of cases (and maybe good to be throwing some of it in there) | 18:24 |
dansmith | I'm just saying, it's definitely not what we expect to happen | 18:24 |
dansmith | looks like this is the fail he was referring to: https://zuul.opendev.org/t/openstack/build/66557ea92bae4e809c09526eba619c07/log/compute1/logs/screen-n-cpu.txt | 18:25 |
dansmith | which is just fully unable to talk to rabbit | 18:25 |
clarkb | which may be an issue of ip routing | 18:25 |
clarkb | depending on how the ip addrs are configured. By default we'd allow the traffic to cross the internet and get from point a to point b, but you have to use the public ips on both sides, which may be floating ips etc | 18:26 |
dansmith | and you think that our jobs might be misconfigured for that? | 18:26 |
dansmith | I don't _think_ nova-next does a ton of weird config | 18:27 |
opendevreview | Masahito Muroi proposed openstack/nova-specs master: Add spec for the network id query in server list and server details https://review.opendev.org/c/openstack/nova-specs/+/938823 | 18:27 |
clarkb | yes it's possible they use private ips rather than public ips | 18:28 |
clarkb | or listen only on the local ip address so connections for the floating ip are rejected | 18:28 |
dansmith | transport_url = rabbit://stackrabbit:secretrabbit@10.209.32.130:5672/ | 18:28 |
clarkb | that'll do it | 18:28 |
dansmith | okay so I don't think that's specific to nova-next here | 18:29 |
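If a job really had to survive a mixed-provider nodeset, the subnode would need to point at an address on the controller that is reachable across providers; a hedged sketch only, since the IPs, the iniset helper and the restart step are assumptions about the devstack plumbing rather than what the job does today:

```sh
# On the compute subnode, point oslo.messaging at the controller's publicly
# routable address instead of its provider-private 10.x address.
# (iniset is the devstack helper; editing nova.conf by hand works the same.)
iniset /etc/nova/nova.conf DEFAULT transport_url \
    "rabbit://stackrabbit:secretrabbit@203.0.113.10:5672/"
sudo systemctl restart devstack@n-cpu
```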
dansmith | well, it does make some choices about neutron so maybe it's wrapped up in that | 18:32 |
sean-k-mooney | it's been a very long time, but we used tc when I was at intel to inject up to 120ms of latency into the links between hosts | 18:51 |
sean-k-mooney | for the most part that seemed to still be fine provided the clocks were in sync | 18:51 |
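That kind of latency injection is just netem; a minimal example (eth0 and 120ms are illustrative):

```sh
# Add 120ms of one-way delay on the interface carrying the inter-host traffic.
sudo tc qdisc add dev eth0 root netem delay 120ms

# Remove it again when done.
sudo tc qdisc del dev eth0 root netem
```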
dansmith | yeah it's clearly not routable because of the private ip in this case | 18:51 |
dansmith | yep | 18:51 |
sean-k-mooney | ah it's routing, ya | 18:52 |
sean-k-mooney | that would make sense. I know we try and set up a vxlan or gre mesh using linux bridge in the multinode jobs | 18:52 |
sean-k-mooney | and then use that for neutron tenant networking | 18:52 |
sean-k-mooney | but I don't know if we actually route all traffic over that | 18:52 |
sean-k-mooney | I guess we are not, or that is not working properly if we are | 18:53 |
dansmith | it's not that because this is just nova-compute unable to contact the mq on the controller node because it's not using its public ip | 18:54 |
dansmith | nothing to do with guest networking | 18:54 |
sean-k-mooney | what I meant was if we used the ip from that bridge network those might work | 18:55 |
sean-k-mooney | but ya in devstack we look up the ip of the node by default | 18:55 |
sean-k-mooney | I don't think we override that in the jobs | 18:55 |
dansmith | right | 18:55 |
sean-k-mooney | we expect the vms to be on the same neutron network in the jobs, since nodes in any given nodeset are not allowed to come from different providers | 18:56 |
sean-k-mooney | or at least they were not allowed to do that in the past, so the jobs depend on that | 18:56 |
sean-k-mooney | changing that would be a breaking change | 18:56 |
sean-k-mooney | not necessarily a bad one, but not one that our jobs can handle, so it would need to be opt in | 18:56 |