sapd1 | Hi everyone, I'm running openstack train. I have a problem when resizing an instance with PCI devices (GPU passthrough) to a flavor without PCI devices. | 01:56 |
---|---|---|
sapd1 | Does anyone know how to fix this problem? | 01:56 |
sapd1 | When I perform resize: No valid host was found. No valid host found for resize (HTTP 400) (Request-ID: req-9d6cf3f9-8f6f-482f-b0b1-95ec286a0fc3) | 01:57 |
auniyal | sapd1 - as the error says, No Valid Host. | 05:33 |
auniyal | but if you are sure this is not correct and are still looking for the reason, can you add --debug to your resize cmd to find out exactly which call failed and why? once you find that out, look for the req-ID in the compute logs, it will give you a better idea of the failure. if there are any tracebacks in the logs, you can file a bug here: | 05:33 |
auniyal | https://bugs.launchpad.net/nova | 05:33 |
sapd1 | auniyal: this bug: https://bugs.launchpad.net/nova/+bug/1941005 | 07:07 |
auniyal | it seems fixed | 07:09 |
sapd1 | auniyal: but it's not merged yet | 07:09 |
auniyal | oh yes, in train it's not merged | 07:10 |
sapd1 | auniyal: I patched it in my system. | 07:19 |
opendevreview | Konrad Gube proposed openstack/nova master: Use Cinder's os-extend_volume_completion volume action. https://review.opendev.org/c/openstack/nova/+/873560 | 07:40 |
bauzas | dansmith: gmann: it occurs to me that nova-next is almost failing on volume detach due to some kind of filesystem check (that's a guess) and kinda related to what you told this night (for me) | 08:32 |
bauzas | https://6d9b97dc35f887a99105-3214406b4544fce2f9d807df6ea4fe3f.ssl.cf5.rackcdn.com/886232/4/check/nova-next/31d3775/testr_results.html | 08:32 |
sean-k-mooney | is that using cinder lvm? | 09:02 |
sean-k-mooney | i.e. not ceph | 09:03 |
sean-k-mooney | it's not quite the same as what https://github.com/openstack/devstack/commit/58c80b2424623096e4a1f7a901f424be0ce6cb3f addressed, and actually that should help there in any case | 09:04 |
sean-k-mooney | i think https://review.opendev.org/c/openstack/tempest/+/886991 could still help | 09:05 |
sean-k-mooney | bauzas: i think its https://bugs.launchpad.net/tempest/+bug/2024859 | 09:06 |
bauzas | sean-k-mooney: -ish yeah | 09:07 |
sean-k-mooney | i just rebased melwitt patch to see if it will pass ci | 09:08 |
sean-k-mooney | we have made some other improvements so it might help but i'm not sure | 09:09 |
sean-k-mooney | we could also enable caching in qemu | 09:09 |
sean-k-mooney | so turn on the writeback cache | 09:09 |
sean-k-mooney | and see if that can hide the slow storage performance | 09:09 |
sean-k-mooney | hehe i was thinking of setting https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.disk_cachemodes to writeback | 09:13 |
sean-k-mooney | but we could also set it to unsafe ... that would definitely be faster | 09:14 |
bauzas | fwiw, I only see this pattern on nova-next | 09:15 |
bauzas | or, I'd rather say, this failure incidence is way higher on this job than any other one in the nova check and gate pipelines | 09:15 |
sean-k-mooney | [libvirt]/disk_cachemodes=file=none,block=unsafe,network=writeback | 09:16 |
sean-k-mooney | i'm going to add a patch to nova-next to configure that and let's see if it fails | 09:17 |
opendevreview | sean mooney proposed openstack/nova master: [WIP] use disk caching to hide slow cinder performance https://review.opendev.org/c/openstack/nova/+/889383 | 09:23 |
sean-k-mooney | bauzas: ^ that might work, although in production you would not use unsafe, but for ci it should be fine | 09:25 |
dvo-plv | Hello, nova folks. Maybe you will have a chance to review this commit, thank you https://review.opendev.org/c/openstack/nova/+/876075 | 09:37 |
sean-k-mooney | i started that review twice already so i probably should go finish it :) | 09:39 |
sean-k-mooney | dvo-plv: looks good to me | 09:47 |
kashyap | While I'm off the next 3 days, anyone willing to get this over the line? It's just failing on some timeouts: https://review.opendev.org/c/openstack/nova/+/887255 | 11:35 |
kashyap | (I see sean-k-mooney already did a recheck on the dep patch; thanks!) | 11:36 |
sean-k-mooney | it still needs review from others but i set RP+2 on it as well | 11:37 |
kashyap | Yeah, saw that, thank you! | 11:38 |
auniyal | sean-k-mooney, and others, how can I set debug=true for nova-scheduler? in the devstack nova.conf it's already set in [DEFAULT]; there is another group, [scheduler], and I tried there as well but it made no difference | 12:53 |
auniyal | I am trying to see https://github.com/openstack/nova/blob/815683ea86492d3ed77b04cc56f3db87e2b8c47d/nova/weights.py#L136 | 12:53 |
auniyal | in scheduler logs while launching VM | 12:53 |
auniyal | is there any other conf I should look into ? | 12:55 |
sean-k-mooney | devstack runs in debug mode by default | 13:06 |
auniyal | yes, still these logs are not showing up in "journalctl -u devstack@n-sch" | 13:07 |
sean-k-mooney | stack@upstream-devstack:~$ sudo journalctl -u devstack@n-sch | grep debug | wc | 13:09 |
sean-k-mooney | 10 172 2932 | 13:09 |
auniyal | yes, debug logs are coming in n-sch but not these logs https://github.com/openstack/nova/blob/815683ea86492d3ed77b04cc56f3db87e2b8c47d/nova/weights.py#L136 | 13:11 |
auniyal | so I think they should come while creating a VM | 13:12 |
auniyal | right ? | 13:12 |
sean-k-mooney | i'm seeing the log in the ci | 13:12 |
sean-k-mooney | https://zuul.opendev.org/t/openstack/build/fd53404ef23341828ae15f8ccf596e6c/log/controller/logs/screen-n-sch.txt#1018 | 13:13 |
auniyal | yes, here they are coming, I have a single-node devstack env locally | 13:15 |
auniyal | it should be right ? | 13:15 |
sean-k-mooney | /etc/nova/nova.conf as rendered by devstack has debug=true set in the default section | 13:23 |
sean-k-mooney | that is what the scheduler uses | 13:24 |
auniyal | ack, thanks | 13:30 |
sean-k-mooney | the backport does not seem to be working properly | 13:33 |
sean-k-mooney | wait why are you proposing https://review.opendev.org/c/openstack/nova/+/889311 directly to yoga | 13:34 |
sean-k-mooney | oh you are not | 13:34 |
sean-k-mooney | this is incorrect | 13:35 |
auniyal | ack | 14:01 |
opendevreview | alecorps proposed openstack/nova master: Workaround for issues with ephemeral disk named disk.local during resize https://review.opendev.org/c/openstack/nova/+/888220 | 14:10 |
opendevreview | Amit Uniyal proposed openstack/nova master: Added context manager for instance lock https://review.opendev.org/c/openstack/nova/+/873648 | 14:14 |
opendevreview | Amit Uniyal proposed openstack/nova master: Disconnecting volume from the compute host https://review.opendev.org/c/openstack/nova/+/877446 | 14:14 |
bauzas | elodilles: gibi: I'll need to stop leading the nova meeting after 20-25 mins | 14:37 |
bauzas | Uggla and me are going to visit artom who's in the surroundings | 14:37 |
opendevreview | Maxim Monin proposed openstack/nova master: Server Rescue leads to Server ERROR state if original image is deleted https://review.opendev.org/c/openstack/nova/+/872385 | 14:39 |
elodilles | bauzas: hmmm... i might be late like 20-25 mins, actually :S | 14:42 |
elodilles | (though i hope not...) | 14:43 |
auniyal | hey melwitt, I've heard PS[number] from you a few times, but I'm not sure what it is, like PS1 or PS10 | 14:47 |
bauzas | auniyal: PS = patchset | 14:47 |
bauzas | the revision number in gerrit | 14:47 |
auniyal | so in one change patchset 10 | 14:47 |
auniyal | ack, thanks | 14:48 |
gibi | bauzas: I probably need to skip today's meeting (or I will be on and off during it) | 14:56 |
bauzas | np | 14:57 |
bauzas | honestly, given this and that, I don't want to skip but I'll say that we'll just check a few things | 14:57 |
gibi | ack | 14:59 |
auniyal | gibi, bauzas, dansmith, sean-k-mooney, melwitt can you please review these stable branch patches - https://etherpad.opendev.org/p/release-liaison-PatchesToReview | 15:52 |
auniyal | most of them are clean cherry-pick and already have 1 +2 and good to merge | 15:52 |
sean-k-mooney | the gates on stable should be unblocked | 15:54 |
sean-k-mooney | the nova-lvm and ceph issues should be fixed on all branches | 15:54 |
bauzas | auniyal: ack, my main prio before leaving for 3 weeks is about features reviews but I'll try | 15:55 |
elodilles | yepp, they are unblocked, thanks sean-k-mooney for the patches \o/ | 15:56 |
bauzas | #startmeeting nova | 16:00 |
opendevmeet | Meeting started Tue Jul 25 16:00:23 2023 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
opendevmeet | The meeting name has been set to 'nova' | 16:00 |
bauzas | hey, let's try to have a 10-min meeting if we can | 16:00 |
dansmith | o/ | 16:00 |
bauzas | #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting | 16:00 |
gmann | o/ | 16:00 |
auniyal | o/ | 16:01 |
elodilles | o/ | 16:01 |
bauzas | ok, starting | 16:01 |
bauzas | if someone wants to continue discussing, I can pass the chair | 16:01 |
bauzas | #topic Bugs (stuck/critical) | 16:01 |
bauzas | #info No Critical bug | 16:01 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 40 new untriaged bugs (+0 since the last meeting) | 16:01 |
bauzas | Uggla made a good effort on triaging this week, thanks | 16:01 |
bauzas | he shared to me https://etherpad.opendev.org/p/nova-bug-triage-20230725 | 16:02 |
bauzas | #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster | 16:02 |
bauzas | elodilles: can you be the baton owner for this week ? | 16:02 |
elodilles | yepp | 16:02 |
bauzas | cool thanks ! | 16:02 |
bauzas | #info bug baton is being passed to elodilles | 16:02 |
bauzas | any bug to discuss ? | 16:02 |
elodilles | i'll be off on Friday, but otherwise fine! | 16:02 |
bauzas | elodilles: I'll be off next week so :) | 16:03 |
bauzas | #topic Gate status | 16:03 |
elodilles | touche ;) | 16:03 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:03 |
bauzas | #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&pipeline=periodic-weekly Nova&Placement periodic jobs status | 16:03 |
bauzas | #info nova-emulation back to work | 16:03 |
bauzas | #link https://zuul.openstack.org/build/2eb47b5dcb944b7eb03224e6ee599a57 | 16:03 |
dansmith | gmann and I have been working on a bunch of patches to address job timeout issues | 16:03 |
bauzas | now we have a failure on tempest-integrated-placement | 16:03 |
dansmith | still lots in flight, but we're hopeful that it will improve those issues at least | 16:03 |
bauzas | but we'll see next week | 16:03 |
dansmith | but there are still lots of other spurious fails | 16:04 |
bauzas | dansmith: I was about to mention the other pipelines but the periodic-weekly | 16:04 |
bauzas | #info Please look at the gate failures and file a bug report with the gate-failure tag. | 16:04 |
dansmith | really need all hands on deck to improve this situation | 16:04 |
gmann | yeah, timeouts nowadays are due to multiple reasons and we are trying to improve a few of them | 16:04 |
bauzas | dansmith: so, yeah, there are a bunch of patches in flight | 16:04 |
bauzas | dansmith: gmann: I don't know if you have seen my pings today but yeah, there is another suspect | 16:05 |
bauzas | #link https://bugs.launchpad.net/tempest/+bug/2024859 | 16:05 |
dansmith | bauzas: the ping was about the mkfs issue? | 16:05 |
dansmith | that's known and unclear how to resolve, AFAICT | 16:05 |
bauzas | yup | 16:05 |
gmann | bauzas: I saw that, I did not look into that except melwitt trying with some change | 16:05 |
gmann | yeah mkfs | 16:05 |
dansmith | but I'm highly concerned that we've got a group of workers that is suddenly very very IO constrained | 16:05 |
bauzas | sean-k-mooney has a WIP proposal | 16:06 |
dansmith | because we're seeing 100% slowdown on some nodes that seems IO-related, like that mkfs issue | 16:06 |
dansmith | melwitt had a proposal but it turned out not to help, last I checked | 16:06 |
gmann | yeah | 16:06 |
sean-k-mooney | the nova-lvm job passed with it but i don't know if it was failing consistently before | 16:06 |
bauzas | about using disk caching | 16:06 |
bauzas | but I'm torn enabling it | 16:06 |
sean-k-mooney | https://review.opendev.org/c/openstack/nova/+/889383 | 16:06 |
bauzas | anyway, I don't disagree, we need to continue digging into it | 16:07 |
sean-k-mooney | well its partly enabled already | 16:07 |
gmann | sean-k-mooney: might not be so consistent but I see this 2-3 times in a week | 16:07 |
bauzas | dansmith: gmann: of the remaining issues, is nova just the canary in the coal mine or is it responsible for some of them ? | 16:07 |
sean-k-mooney | i was suggesting setting disk_cachemodes: "file=none,block=unsafe,network=writeback" | 16:07 |
sean-k-mooney | block could also be writeback | 16:07 |
dansmith | bauzas: need more triage to know really | 16:08 |
dansmith | bauzas: I've been focusing on the timeout stuff for a week | 16:08 |
gmann | bauzas: timeout is more of test runner unbalancing on test worker and slow test etc etc | 16:08 |
bauzas | ok, I'll try to get more hands on it before I leave | 16:08 |
gmann | because timeouts hold up many of the gate fixes, so that should be fixed first :) | 16:09 |
bauzas | ack | 16:09 |
bauzas | moving on | 16:09 |
bauzas | #topic Release Planning | 16:09 |
bauzas | #link https://releases.openstack.org/bobcat/schedule.html | 16:09 |
bauzas | #link https://etherpad.opendev.org/p/nova-bobcat-blueprint-status Etherpad for tracking blueprints status | 16:09 |
bauzas | #info 5 weeks before FeatureFreeze | 16:09 |
bauzas | as I said before, my attention will go to features sets this week | 16:10 |
bauzas | in theory, today was a Feature Review Day | 16:10 |
bauzas | but I feel many people have different priorities | 16:10 |
bauzas | so I haven't really called it | 16:10 |
bauzas | #topic Review priorities | 16:11 |
bauzas | #link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement+OR+project:openstack/os-traits+OR+project:openstack/os-resource-classes+OR+project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/osc-placement)+(label:Review-Priority%252B1+OR+label:Review-Priority%252B2) | 16:11 |
bauzas | #info As a reminder, people eager to review changes can +1 to indicate their interest, +2 for asking cores to also review | 16:11 |
bauzas | #topic Stable Branches | 16:11 |
bauzas | elodilles: your time | 16:11 |
elodilles | #info nova-lvm / nova-ceph-multistore jobs are fixed on all branches (2023.1, zed, yoga) \o/ | 16:11 |
elodilles | #info stable/victoria gate fix has also landed | 16:11 |
elodilles | #info gates from 2023.1 back till train should be OK | 16:11 |
elodilles | #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci | 16:11 |
elodilles | EOM | 16:11 |
bauzas | thanks | 16:11 |
bauzas | #info train-eol patch proposed https://review.opendev.org/c/openstack/releases/+/885365 | 16:12 |
bauzas | we're missing a second release core vote | 16:12 |
bauzas | but should happen eventually | 16:12 |
elodilles | yepp | 16:12 |
bauzas | auniyal had a point, I'll rephrase it | 16:12 |
bauzas | #info Please review these backport patches of stable 2023.1, zed and yoga for next minor release | 16:12 |
bauzas | #info most of these already have one +2. | 16:12 |
bauzas | #link https://etherpad.opendev.org/p/release-liaison-PatchesToReview | 16:12 |
bauzas | and the stable branches are now back in healthy state | 16:13 |
bauzas | #topic Open discussion | 16:13 |
bauzas | I have one item | 16:13 |
bauzas | as I said previously, I'll be on PTO starting next Tues | 16:14 |
bauzas | gibi volunteered for running the Aug 15 meeting | 16:14 |
bauzas | (and Aug 22) | 16:14 |
bauzas | but we may need someone to chair the Aug 1 and Aug 8 meetings | 16:14 |
bauzas | or we skip them | 16:14 |
gibi | jepp I'm still OK with the 15 and 22 | 16:15 |
bauzas | so, anyone fancy volunteering ? if not, nevermind, I'll send an email to cancel the two weeks | 16:15 |
bauzas | I can try to handle Aug 1 meeting as I won't (yet) be on a plane, but I can't promise | 16:16 |
bauzas | ok, let's consider the meetings cancelled | 16:16 |
bauzas | #info Aug 1 and Aug 8 Nova meetings are CANCELLED | 16:17 |
bauzas | #action bauzas to notify the ML accordingly | 16:17 |
bauzas | that's it | 16:17 |
bauzas | for me | 16:17 |
bauzas | anything anyone ? | 16:17 |
bauzas | ok, thanks then | 16:18 |
bauzas | #endmeeting | 16:18 |
opendevmeet | Meeting ended Tue Jul 25 16:18:41 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:18 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2023/nova.2023-07-25-16.00.html | 16:18 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2023/nova.2023-07-25-16.00.txt | 16:18 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2023/nova.2023-07-25-16.00.log.html | 16:18 |
gibi | o/ | 16:19 |
elodilles | thanks, too o/ | 16:19 |
gmann | thanks o/ | 16:21 |
opendevreview | sean mooney proposed openstack/nova master: [WIP] use disk caching to hide slow cinder performance https://review.opendev.org/c/openstack/nova/+/889383 | 16:58 |
sean-k-mooney | if we decide this is the correct approach in general i can do ^ in devstack instead | 16:59 |
sean-k-mooney | or in the base job, but i'm adding it to all the in-tree jobs to get more input on whether it affects the ci jobs positively or negatively | 16:59 |
dansmith | sean-k-mooney: I think doing it in devstack itself is the wrong move, but maybe in some of the devstack job defs.. this is really an optimization for our own CI and not something we'd want anyone running without knowing it, even in devstack, IMHO | 17:02 |
dansmith | I'll be interested to see if that helps significantly.. I've seen that bug manifest only occasionally in my surveying recently | 17:02 |
sean-k-mooney | i was thinking of doing it like the mysql memory thing | 17:02 |
sean-k-mooney | e.g. off by default with a macro/var to turn it on in ci | 17:03 |
dansmith | yeah, it could be a flag, sure | 17:03 |
sean-k-mooney | basically i just dont want to have to copy paste that to even more jobs over time :) | 17:03 |
sean-k-mooney | im not sure if it will help or not but we will see | 17:04 |
sean-k-mooney | in theory it should, not sure | 17:04 |
opendevreview | Maxim Monin proposed openstack/nova master: Server Rescue leads to Server ERROR state if original image is deleted https://review.opendev.org/c/openstack/nova/+/872385 | 19:24 |
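
Note on the `[libvirt]/disk_cachemodes` value discussed at 09:16 and 16:07: the option takes a comma-separated list of `disk_type=cache_mode` pairs. The following is a minimal Python sketch for sanity-checking such a value before putting it in a job definition. It is not nova code; the helper names are hypothetical, and the sets of accepted disk types and cache modes are assumptions taken from the nova configuration reference, so check them against the release in use.

```python
# Hypothetical helpers, not part of nova: parse and sanity-check a
# [libvirt]/disk_cachemodes value such as the one proposed in the log.

# Assumed from the nova configuration reference; verify for your release.
VALID_DISK_TYPES = {"file", "block", "network", "mount"}
VALID_CACHE_MODES = {
    "default", "none", "writethrough", "writeback", "directsync", "unsafe",
}


def parse_disk_cachemodes(value):
    """Return a {disk_type: cache_mode} dict for a comma-separated value."""
    result = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        disk_type, sep, mode = entry.partition("=")
        if not sep:
            raise ValueError(f"expected disk_type=cache_mode, got {entry!r}")
        if disk_type not in VALID_DISK_TYPES:
            raise ValueError(f"unknown disk type {disk_type!r}")
        if mode not in VALID_CACHE_MODES:
            raise ValueError(f"unknown cache mode {mode!r}")
        result[disk_type] = mode
    return result


def risky_entries(modes):
    """List disk types set to 'unsafe' (acceptable for CI per the log,
    not for production)."""
    return sorted(t for t, m in modes.items() if m == "unsafe")


# The value suggested in the channel for the CI jobs.
ci_modes = parse_disk_cachemodes("file=none,block=unsafe,network=writeback")
print(ci_modes)
print(risky_entries(ci_modes))
```

A check like `risky_entries` reflects the point made at 09:25 and 17:02: `unsafe` is a CI-only optimization, so flagging it explicitly keeps it from leaking into a configuration that someone runs without knowing it.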