opendevreview | Jorhson Deng proposed openstack/nova master: recheck the attachment_id after the reschedule successful https://review.opendev.org/c/openstack/nova/+/796209 | 02:41 |
opendevreview | melanie witt proposed openstack/placement master: Narrow scope of set allocations database transaction https://review.opendev.org/c/openstack/placement/+/807014 | 05:57 |
opendevreview | melanie witt proposed openstack/placement master: Add reproducer for Allocation/Inventory update race bug https://review.opendev.org/c/openstack/placement/+/807493 | 05:58 |
gibi | good morning | 06:49 |
gibi | the nova-next job has been failing constantly on master since Friday https://zuul.opendev.org/t/openstack/builds?job_name=nova-next&project=openstack%2Fnova&branch=master | 06:50 |
gibi | e.g. https://zuul.opendev.org/t/openstack/build/f5b881e3601f4160a5f82a9b4cdc10ad/log/job-output.txt#62198-62200 | 06:50 |
gibi | reported a bug https://bugs.launchpad.net/nova/+bug/1942740 | 06:58 |
frickler | gibi: looks placement-y once again. maybe https://review.opendev.org/c/openstack/releases/+/806580 | 06:58 |
gibi | frickler: yes | 06:58 |
gibi | yes placement-y | 06:58 |
gibi | I will look into it shortly | 06:58 |
gibi | could be the new osc-placement release, yes | 06:58 |
lyarwood | gibi: Morning, I'm going to be afk this morning looking after my sick kid once again, I'll try to help with the nova-next stuff this afternoon \o | 07:14 |
gibi | lyarwood: ack, hope the kid will be better soon | 07:15 |
bauzas | morning folks | 07:39 |
bauzas | gibi: ack, will look into it, I need to write the prelude section :D | 07:39 |
gibi | bauzas: morning, I'm debugging the nova-next job now, so you are free to work with the prelude :) | 07:40 |
bauzas | :D | 07:44 |
bauzas | (good opportunity for looking at what we eventually merged during this cycle fwiw) | 07:44 |
bauzas | (in case people want to work on a prelude change for Yoga :p ) | 07:45 |
gibi | frickler: indeed, the new osc-placement release broke the test; the bug was introduced in https://review.opendev.org/c/openstack/osc-placement/+/804458 | 07:57 |
gibi | frickler: I will propose a fix for osc-placement. Can we release a new osc-placement lib for Xena? | 07:58 |
gibi | elodilles: ^^ ? | 07:58 |
frickler | gibi: I looked at that patch, but assumed it would only change behavior when actively requesting that version | 07:59 |
frickler | gibi: releasing bug-fixes should always be possible, too | 07:59 |
gibi | frickler: it moved some code around outside of the version guard | 08:00 |
opendevreview | Jorhson Deng proposed openstack/nova master: recheck the attachment_id after the reschedule successful https://review.opendev.org/c/openstack/nova/+/796209 | 08:17 |
opendevreview | Pierre-Samuel Le Stang proposed openstack/nova master: Fix instance's image_ref lost on failed unshelving https://review.opendev.org/c/openstack/nova/+/807551 | 08:52 |
opendevreview | Balazs Gibizer proposed openstack/osc-placement master: Repro allocation show bug with empty allocation https://review.opendev.org/c/openstack/osc-placement/+/807553 | 09:12 |
gibi | bauzas: ^^ | 09:12 |
gibi | sorry, I was too fast, it will fail ... | 09:13 |
opendevreview | Balazs Gibizer proposed openstack/osc-placement master: Repro allocation show bug with empty allocation https://review.opendev.org/c/openstack/osc-placement/+/807553 | 09:22 |
opendevreview | Pierre-Samuel Le Stang proposed openstack/nova master: Fix instance's image_ref lost on failed unshelving https://review.opendev.org/c/openstack/nova/+/807551 | 09:27 |
opendevreview | Pierre-Samuel Le Stang proposed openstack/nova master: Fix instance's image_ref lost on failed unshelving https://review.opendev.org/c/openstack/nova/+/807555 | 09:27 |
opendevreview | Balazs Gibizer proposed openstack/osc-placement master: Repro allocation show bug with empty allocation https://review.opendev.org/c/openstack/osc-placement/+/807553 | 09:39 |
opendevreview | Pierre-Samuel Le Stang proposed openstack/nova master: Fix instance's image_ref lost on failed unshelving https://review.opendev.org/c/openstack/nova/+/807551 | 09:48 |
opendevreview | Balazs Gibizer proposed openstack/osc-placement master: Fix allocation show / unset on empty allocation https://review.opendev.org/c/openstack/osc-placement/+/807556 | 09:51 |
gibi | bauzas: please prioritize ^^ | 09:53 |
opendevreview | Balazs Gibizer proposed openstack/nova master: DNM: check nova-next with osc-placement fix https://review.opendev.org/c/openstack/nova/+/807558 | 09:57 |
*** bhagyashris_ is now known as bhagyashris | 10:00 | |
gibi | bauzas: fyi, I've created the yoga series in launchpad https://launchpad.net/nova/yoga | 10:49 |
gibi | bauzas: I will set it to active once Xena RC1 is cut | 10:50 |
sean-k-mooney | hehe i probably should do that for os-vif although i generally only do it when we need to backport something and the series does not exist | 10:59 |
sean-k-mooney | the last one i created was victoria... | 10:59 |
gibi | sean-k-mooney: I use the series and the milestones to track bps | 10:59 |
sean-k-mooney | ya for os-vif we just have bugs for RFEs so it tends to be less important | 11:00 |
sean-k-mooney | i'll quickly add wallaby and yoga | 11:00 |
gibi | ack | 11:02 |
viks__ | hi, i'm doing some disk I/O testing in my openstack setup... In my test, i see that the write speed in the VM is much lower than that of the host.. So is there any way i can increase the VM disk I/O? Is there any comparison or something which says by what percentage disk write speed degrades in a VM compared to the host? Can someone please give some direction? | 11:02 |
sean-k-mooney | ok done | 11:02 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [doc] port-resource-request-groups not landed in Xena https://review.opendev.org/c/openstack/nova/+/807564 | 11:07 |
sean-k-mooney | viks__: it should be close to identical, at least with the raw backend | 11:41 |
sean-k-mooney | for qcow it should also be close, but we expect some overhead while the file is growing. we don't really maintain the lvm image_backend actively, but it used to give slightly better write performance than qcow; i think raw was on par | 11:42 |
sean-k-mooney | you may want to explore using virtio-scsi instead of virtio-blk, which is our default | 11:42 |
sean-k-mooney | viks__: we do not support multiple io threads or multiqueue currently, so if you have multiple disks the performance will not scale per disk | 11:43 |
sean-k-mooney | but for a vm with a single root disk and preallocated storage the performance should be close (within single-digit percent) to that of the host | 11:44 |
viks__ | sean-k-mooney: does `virtio-scsi` apply to a vm with a single root disk/preallocated storage, or to attached cinder volumes? | 11:47 |
sean-k-mooney | when you enable virtio-scsi via hw_disk_bus it is used for all storage | 11:48 |
sean-k-mooney | my suggestion to try it is because it supports a slightly different feature set, like trim | 11:48 |
sean-k-mooney | so depending on whether you're using ssds, and on your workload, sometimes virtio-scsi performs better even for a vm with 1 root disk and no cinder volumes | 11:49 |
sean-k-mooney | virtio-scsi is our recommendation whenever using ceph | 11:49 |
lyarwood | that reminds me I need to read up on the virtio-blk trim support thread | 11:49 |
sean-k-mooney | but for local storage we generally suggest staying with virtio-blk unless you have measured a performance increase with virtio-scsi | 11:50 |
sean-k-mooney | virtio-scsi is generally better when you use a large number of cinder volumes as it does not consume a pci device slot per volume | 11:51 |
sean-k-mooney | so it scales better in that regard, but again all io is handled by the qemu emulator thread since we do not use io threads | 11:51 |
sean-k-mooney | so that will be the bottleneck regardless of whether you use virtio-blk or virtio-scsi | 11:51 |
viks__ | sean-k-mooney: ok thanks for the inputs.. i'll explore and test `virtio-scsi` also and see if i have some findings.. | 11:55 |
opendevreview | Merged openstack/os-vif stable/ussuri: Refactor code of linux_net to more cleaner and increase performace https://review.opendev.org/c/openstack/os-vif/+/765419 | 12:49 |
viks__ | sean-k-mooney: here is what i have tested... https://paste.openstack.org/show/808597/ These are from my default setup.. i see a huge difference... but per what you said, it should not differ too much even in a default setup.. am i doing something wrong or missing some configuration? | 12:50 |
sean-k-mooney | correct, it should not differ much. so some questions: what virt driver are you using (libvirt?) and what images_backend have you configured (qcow? raw? flat? lvm?) | 12:51 |
sean-k-mooney | have you set your disk cache mode or is it at the default? and have you enabled image preallocation or not? | 12:52 |
opendevreview | Merged openstack/os-vif stable/ussuri: Fix - os-vif fails to get the correct UpLink Representor https://review.opendev.org/c/openstack/os-vif/+/765967 | 12:52 |
sean-k-mooney | viks__: unless sysbench preallocates the file and then writes to it, the large delta there may just be from growing the root disk | 12:54 |
sean-k-mooney | viks__: so you might need to rerun that | 12:54 |
sean-k-mooney | for fio i know it preallocates, and there we are seeing about a 2x delta | 12:54 |
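A minimal sketch of the kind of preallocated, direct-IO fio run being discussed (the file name, size, and runtime here are illustrative choices, not values from the paste):

```sh
# fio preallocates the target file before the timed run, so qcow2
# file growth on the host is not part of the measurement; --direct=1
# bypasses the guest page cache via O_DIRECT.
fio --name=seqwrite \
    --filename=/root/fio-testfile \
    --size=1G \
    --rw=write \
    --bs=4k \
    --direct=1 \
    --ioengine=libaio \
    --runtime=60 --time_based
```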
bauzas | gibi: ok, so you need to create a new series in launchpad? | 12:54 |
bauzas | I wasn't aware of it | 12:54 |
bauzas | gibi: could you please add this for https://docs.openstack.org/nova/latest/contributor/ptl-guide.html ? | 12:55 |
bauzas | ah, nevermind | 12:56 |
bauzas | https://docs.openstack.org/nova/latest/contributor/ptl-guide.html#immediately-after-rc | 12:56 |
viks__ | sean-k-mooney: when i rerun sysbench i still get the below, which is way below the numbers on the host: | 12:58 |
viks__ | https://www.irccloud.com/pastebin/sf7VLtQi/ | 12:58 |
sean-k-mooney | viks__: ack, but without answers to my other questions i can't really help | 12:59 |
sean-k-mooney | viks__: the default parameters that are used will depend on your deployment tool, so that is why i am asking what your images_backend is set to and which cache modes you use, and also whether you have preallocated the image or not | 13:00 |
viks__ | sean-k-mooney: oh sorry... it's libvirt/qcow, `preallocate_images = none, disk_cachemodes = file=writeback,block=writeback` | 13:02 |
sean-k-mooney | ok, so preallocate_images=none means there will be some degradation while the qcow is expanded, but that does not account for the fio delta, since fio preallocates the files in the guest to prevent that from being an issue | 13:05 |
sean-k-mooney | writeback is generally quite good for performance. you could try setting the cache mode to none; if your host supports O_DIRECT that might improve performance | 13:06 |
sean-k-mooney | if you leave the cachemode unset we will default to none if O_DIRECT is supported, or use writeback, which was our default until recently | 13:06 |
viks__ | sean-k-mooney: ok.. what is the easy way to check for `O_DIRECT` support in the host? | 13:09 |
sean-k-mooney | well you were using --direct=1 in your fio test so it should support it | 13:10 |
sean-k-mooney | O_DIRECT is the file open flag used for direct io | 13:11 |
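One quick way to sanity-check O_DIRECT support on the host filesystem backing the instances directory (a sketch; /var/lib/nova/instances is assumed as the usual default path, adjust as needed):

```sh
# dd with oflag=direct opens the file with O_DIRECT; it fails with
# "Invalid argument" on filesystems that lack O_DIRECT support
# (e.g. tmpfs) and succeeds otherwise.
dd if=/dev/zero of=/var/lib/nova/instances/.odirect-test \
   bs=4096 count=1 oflag=direct && echo "O_DIRECT supported"
rm -f /var/lib/nova/instances/.odirect-test
```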
viks__ | ok... | 13:11 |
sean-k-mooney | first thing i would try is setting the cachemode for file to none | 13:11 |
opendevreview | Balazs Gibizer proposed openstack/osc-placement master: Repro allocation show bug with empty allocation https://review.opendev.org/c/openstack/osc-placement/+/807553 | 13:12 |
opendevreview | Balazs Gibizer proposed openstack/osc-placement master: Fix allocation show / unset with empty allocation https://review.opendev.org/c/openstack/osc-placement/+/807556 | 13:12 |
sean-k-mooney | then hard reboot the vm and see if that improves the performance | 13:12 |
viks__ | sean-k-mooney: ok thanks... will try the same | 13:12 |
sean-k-mooney | if that has no effect i would try setting preallocate_images=space and then try changing images_backend to raw. you could also in parallel create a second vm with hw_disk_bus=virtio-scsi to compare side by side with virtio-blk | 13:14 |
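Taken together, the suggested compute-node settings would look roughly like this in nova.conf (a sketch assuming the standard option groups; `images_type` is the real option name behind the `images_backend` shorthand used above, and appending like this assumes no conflicting values earlier in the file):

```sh
# Sketch of the suggested nova.conf changes; restart nova-compute
# and hard reboot the test vm afterwards for them to take effect.
cat >> /etc/nova/nova.conf <<'EOF'
[DEFAULT]
# fully preallocate instance disks so writes never wait on qcow2 growth
preallocate_images = space

[libvirt]
# bypass the host page cache for file-backed disks (needs O_DIRECT);
# only the file= mode is changed from the earlier writeback/writeback paste
disk_cachemodes = file=none,block=writeback
# optionally switch the image backend from qcow2 to raw
images_type = raw
EOF
systemctl restart nova-compute
```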
sean-k-mooney | viks__: for really write-heavy workloads lvm used to slightly outperform raw, but that required extra setup and it is less well maintained | 13:15 |
sean-k-mooney | so if the other tunings don't help in your case, that is what i would try last | 13:15 |
sean-k-mooney | viks__: outside of openstack there are some host-level sysctl tunings you can do to improve guest io performance, like changing the io scheduler or the dirty-data writeback thresholds | 13:16 |
sean-k-mooney | tuned can be used as a reference for some of the commonly tuned settings | 13:18 |
sean-k-mooney | https://github.com/redhat-performance/tuned/blob/master/profiles/throughput-performance/tuned.conf#L24-L59 | 13:18 |
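For reference, the writeback-threshold part of that profile translates to sysctls roughly like the following (values taken from the linked throughput-performance tuned.conf and may drift over time; the block device name is a placeholder):

```sh
# host-level writeback tuning along the lines of tuned's
# throughput-performance profile
sysctl -w vm.dirty_ratio=40             # % of RAM dirty before writers block
sysctl -w vm.dirty_background_ratio=10  # % dirty before background flushing
sysctl -w vm.swappiness=10              # prefer reclaiming cache over swapping
# the io scheduler is set per block device, e.g.:
echo mq-deadline > /sys/block/sda/queue/scheduler
```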
viks__ | sean-k-mooney: thanks for all the inputs.. will do some testing based on these .... thanks again... | 13:19 |
opendevreview | Balazs Gibizer proposed openstack/nova master: DNM: check nova-next with osc-placement fix https://review.opendev.org/c/openstack/nova/+/807558 | 13:22 |
sean-k-mooney | viks__: if you are testing virtio-scsi, the image options are hw_disk_bus=scsi hw_scsi_model=virtio-scsi, by the way | 13:25 |
sean-k-mooney | i previously mistyped it as hw_disk_bus=virtio-scsi | 13:25 |
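Setting those properties on a test image would look like this (the image name is a placeholder):

```sh
# vms booted from this image get a virtio-scsi controller, and all
# their disks (root disk and attached volumes) use the scsi bus
openstack image set \
    --property hw_disk_bus=scsi \
    --property hw_scsi_model=virtio-scsi \
    my-test-image
```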
viks__ | sean-k-mooney: ok... | 13:26 |
bauzas | huzzah, can see my review priorities : https://review.opendev.org/q/label:Review-Priority%253E%253D1%252Csbauza | 14:06 |
opendevreview | Merged openstack/osc-placement master: Repro allocation show bug with empty allocation https://review.opendev.org/c/openstack/osc-placement/+/807553 | 14:20 |
opendevreview | Merged openstack/osc-placement master: Fix allocation show / unset with empty allocation https://review.opendev.org/c/openstack/osc-placement/+/807556 | 14:20 |
gibi | lyarwood, bauzas: ^^ thanks for the quick approval. I will ask for a release freeze exception shortly | 14:41 |
gibi | FFE requested: http://lists.openstack.org/pipermail/openstack-discuss/2021-September/024686.html | 14:53 |
slaweq | hi stable nova cores, can somebody check https://review.opendev.org/c/openstack/nova/+/791421 | 14:55 |
slaweq | ? | 14:55 |
slaweq | thx in advance for help :) | 14:55 |
gibi | is the USA out today due to Labor Day? | 15:28 |
kashyap | gibi: Yeah | 15:31 |
kashyap | Also Canada, IIRC | 15:31 |
gibi | OK, then I don't wait for them today :) | 15:31 |
bauzas | gibi: sorry was on a meeting | 15:57 |
bauzas | gibi: yup, saw your FFE request, but as kashyap said and you guessed, a whole portion of the world located between the Pacific and Atlantic oceans and above a certain latitude is currently shut down for the day | 15:58 |
gibi | bauzas: no worries, the release is now queued | 15:59 |
bauzas | said a French guy working | 15:59 |
gibi | bauzas: I looked at melwitt's repro patch https://review.opendev.org/c/openstack/placement/+/807493 and I left some ideas but no real solutions yet | 15:59 |
bauzas | gibi: honestly we face the limits of the single-commit design we have with Placement | 16:01 |
gibi | bauzas: this is now more about writing a sane repro test that exercises transaction isolation with mysql (as sqlite does not have it) | 16:03 |
gibi | I think the fix melwitt proposed on top is sane, but the repro test is racy and somehow actually forks processes | 16:03 |
gibi | which is scary | 16:04 |
bauzas | gibi: yup, I saw the tox modification and I understood the reasoning | 16:04 |
melwitt | I will try to find a different way to do it. it's inherently problematic to try and repro this because once one path starts a transaction, trying to do something in the middle of it (to cause the race state) poses a record locking problem. at one point I was trying to fake it by just returning bogus things to make it think it hit a generation conflict (without doing any database write), which worked to make it retry but that wasn't | 16:12 |
melwitt | showing the effect of the consistent read problem inside the transaction | 16:12 |
gibi | melwitt: yes, exactly, that was the dead end I went down this afternoon :/ | 16:21 |
gibi | melwitt: I think at some point we can accept that we cannot reliably test this in the func env, simply land the fix, and monitor the tempest jobs to see whether it resolves the race or not | 16:22 |
melwitt | gibi: I went down the dead end three days in a row 😑 | 16:22 |
melwitt | gibi: yeah. I was thinking that too. I kept thinking there has to be a way but if there is I'm not clever enough to find it haha | 16:23 |
gibi | melwitt: but enjoy your day off, we can continue this tomorrow | 16:24 |
melwitt | kk, thanks | 16:24 |
*** elodilles is now known as elodilles_pto | 20:32 |