dansmith | melwitt: oh jeez I didn't realize that wasn't +V from zuul, thanks | 00:19 |
---|---|---|
melwitt | np. it's hard to see through all the fail | 00:20 |
dansmith | just too many patches really | 00:20 |
melwitt | huh.. it looks like it's already been running. it was already +W by you.. I didn't think of the +W kicking off the job again /facepalm | 00:23 |
dansmith | oh I swore it was | 00:23 |
dansmith | I see your recheck put it back in the arm queue, | 00:24 |
dansmith | which is only a few minutes old but the other one has been running in regular check for over an hour | 00:24 |
dansmith | I thought I must have been looking at the wrong one | 00:24 |
melwitt | yeah. it doesn't restart the normal jobs thankfully | 00:24 |
melwitt | no, I'm just le dumb | 00:24 |
dansmith | no, there are too many identical patches :) | 00:25 |
dansmith | crap, 2023.1 patch is going to fail | 00:33 |
dansmith | dhcp fail in the guest | 00:34 |
dansmith | le sigh | 00:34 |
melwitt | dangit | 00:34 |
dansmith | top patch is +V waiting for another trip to gateland | 00:49 |
dansmith | man cripes | 00:49 |
dansmith | my friggin timeout patch is gonna fail again | 00:49 |
melwitt | 😑 | 00:50 |
dansmith | clearly something ceph related: | 00:50 |
dansmith | [ 10.258086] Buffer I/O error on dev vda1, logical block 20, lost async page write | 00:50 |
dansmith | vda is the root disk, which is on ceph | 00:50 |
dansmith | last time around cinder just stopped creating volumes halfway through (maybe for similar reasons) | 00:51 |
dansmith | Out of memory: Killed process 55722 (ceph-osd) | 00:52 |
dansmith | that'll do it like every time | 00:52 |
dansmith | cripes | 00:52 |
melwitt | ouch | 00:53 |
dansmith | ah, | 00:54 |
dansmith | we're lowering the swap on our fatter job than the ceph job I increased in their repo | 00:54 |
opendevreview | Dan Smith proposed openstack/nova master: Bump nova-ceph-multstore timeout https://review.opendev.org/c/openstack/nova/+/882890 | 00:56 |
dansmith | melwitt: ^ | 00:56 |
melwitt | I don't understand that sentence 😆 | 00:59 |
* melwitt reads the commit message | 00:59 | |
melwitt | ok nevermind | 00:59 |
dansmith | wait, in my commit message or | 00:59 |
melwitt | yeah you explained what I didn't understand in your commit message | 01:00 |
dansmith | I bumped the ceph job to 8G while jammifying and cephadmifying it, but we inherit from that and set it down to 4G, which makes no sense because we run even more stuff than they do | 01:00 |
dansmith | ack | 01:00 |
*** dmellado9 is now known as dmellado | 05:04 | |
opendevreview | Merged openstack/nova stable/2023.1: Use force=True for os-brick disconnect during delete https://review.opendev.org/c/openstack/nova/+/882858 | 05:41 |
opendevreview | Amit Uniyal proposed openstack/nova stable/2023.1: Have host look for CPU controller of cgroupsv2 location. https://review.opendev.org/c/openstack/nova/+/882913 | 05:47 |
opendevreview | Amit Uniyal proposed openstack/nova stable/zed: Have host look for CPU controller of cgroupsv2 location. https://review.opendev.org/c/openstack/nova/+/882914 | 05:50 |
gibi | rechecked https://review.opendev.org/c/openstack/nova/+/882890 as it failed nova-next in the gate with unrelated timeouts | 06:43 |
gibi | also rechecked https://review.opendev.org/c/openstack/nova/+/882859 | 06:46 |
gibi | both failed with http read timeout on various openstack APIs (cinder, neutron, nova) | 06:47 |
gibi | I think this is tracked here as a bug https://bugs.launchpad.net/tempest/+bug/1999893 | 06:48 |
bauzas | morning | 07:05 |
bauzas | gibi: catching up the world explosion after my yesterday PTO | 07:05 |
gibi | I don't have the full context as the CVE got public why I was away | 07:29 |
gibi | I see the fixes proposed so I'm trying to land them | 07:29 |
opendevreview | Amit Uniyal proposed openstack/nova stable/yoga: Have host look for CPU controller of cgroupsv2 location. https://review.opendev.org/c/openstack/nova/+/882920 | 07:29 |
gibi | s/why/while/ | 07:32 |
opendevreview | Sylvain Bauza proposed openstack/nova stable/2023.1: Revert "Debug Nova APIs call failures" https://review.opendev.org/c/openstack/nova/+/882783 | 07:32 |
bauzas | gibi: np, just looking at gerrit reviews | 07:33 |
bauzas | gibi: do you need some explanations ? | 07:33 |
gibi | I read the CVE it is well documented so thanks I'm OK about the fixes we are pushing | 07:33 |
opendevreview | Amit Uniyal proposed openstack/nova stable/xena: Have host look for CPU controller of cgroupsv2 location. https://review.opendev.org/c/openstack/nova/+/882921 | 07:34 |
*** atmark is now known as Guest1107 | 07:51 | |
sean-k-mooney | bauzas: can you take a look at some os-vif changes for me | 09:47 |
sean-k-mooney | this one is trivial https://review.opendev.org/c/openstack/os-vif/+/882755 and this one is the one i really want to merge https://review.opendev.org/c/openstack/os-vif/+/881751 | 09:48 |
sean-k-mooney | or gibi or anyone else who is about | 09:48 |
bauzas | sean-k-mooney: fyk see my comment on https://review.opendev.org/c/openstack/os-vif/+/882755 | 10:11 |
bauzas | the TC will add back py38 in the Bobcat PTI | 10:12 |
bauzas | (but that's just a FYK, since Focal won't be accepted again) | 10:12 |
sean-k-mooney | 38 i think is still tested in os-vif | 10:25 |
sean-k-mooney | i havent got around to updating that so cool i can leave that | 10:25 |
sean-k-mooney | we just use openstack-python3-jobs to control that | 10:26 |
sean-k-mooney | i might add 3.11 like we did in nova separately but ill keep 3.8 testable | 10:27 |
sean-k-mooney | thanks for the reminder | 10:27 |
opendevreview | Amit Uniyal proposed openstack/nova stable/wallaby: Have host look for CPU controller of cgroupsv2 location. https://review.opendev.org/c/openstack/nova/+/882939 | 10:32 |
opendevreview | Merged openstack/nova master: Bump nova-ceph-multstore timeout https://review.opendev.org/c/openstack/nova/+/882890 | 10:51 |
gibi | sean-k-mooney: approved the os-vif qdisc patch | 12:01 |
sean-k-mooney | gibi: thanks for reviewing | 12:02 |
opendevreview | Merged openstack/os-vif master: remove focal based jobs https://review.opendev.org/c/openstack/os-vif/+/882755 | 12:08 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2023.1: Bump nova-ceph-multstore timeout https://review.opendev.org/c/openstack/nova/+/882784 | 12:15 |
bauzas | stable cores are needed for 2023.1 https://review.opendev.org/c/openstack/nova/+/882783 | 12:15 |
gibi | bauzas, dansmith, melwitt: The nova-ceph-multistore timeout bump is merged to master so I cherry-picked it to 2023.1 ^^ | 12:16 |
bauzas | gibi: done | 12:16 |
bauzas | gibi: sean-k-mooney: could you please look at https://review.opendev.org/c/openstack/nova/+/882783 ? | 12:16 |
sean-k-mooney | yep | 12:17 |
sean-k-mooney | elodilles: :) | 12:18 |
elodilles | done :) | 12:18 |
sean-k-mooney | bauzas: they are both now +2w'd | 12:18 |
gibi | \o/ | 12:18 |
sean-k-mooney | by the way i dont care that w is a letter and cant be past tense but +2w'd is definitely a thing :) and the 'd gets that point across | 12:20 |
sean-k-mooney | english sometimes allows you to convey info in a way that would make tech writers cry | 12:21 |
sean-k-mooney | and im ok with that, we all know i mainly communicate in seanspeak anyway :) | 12:22 |
elodilles | yepp, i also use EOL'd +2'd +W'd o:) | 12:22 |
elodilles | i don't remember where i saw that 1st time, but it can be understood, so i started to use that myself too | 12:25 |
sean-k-mooney | i dont think any one has ever objected but from a grammar rules point of view english allows you to synthesize past tense verbs this way. i just know my english teacher would have been unhappy | 12:27 |
sean-k-mooney | with making an acronym both be a verb and have tense | 12:28 |
sean-k-mooney | ' is a very powerful thing :) | 12:28 |
elodilles | :) | 12:30 |
Uggla | @bauzas, gibi, if you can have a look at those small patches https://review.opendev.org/c/openstack/tempest/+/882822/1, https://review.opendev.org/c/openstack/tempest/+/882823/2 as I said tuesday that may help to find the bug. | 12:43 |
ykarel | sean-k-mooney, bauzas can you please revisit https://review.opendev.org/c/openstack/nova/+/868419 when get a chance | 12:54 |
opendevreview | Merged openstack/nova stable/2023.1: Enable use of service user token with admin context https://review.opendev.org/c/openstack/nova/+/882859 | 12:56 |
dansmith | gibi: thanks! | 13:23 |
opendevreview | Amit Uniyal proposed openstack/nova stable/wallaby: Have host look for CPU controller of cgroupsv2 location. https://review.opendev.org/c/openstack/nova/+/882939 | 13:28 |
bauzas | dansmith: thanks btw. for having +2d on my way | 13:29 |
bauzas | upstream CVE bugfixes are already merged on master and 2023.1 \o/ | 13:30 |
dansmith | bauzas: no problem | 13:32 |
opendevreview | Merged openstack/os-vif master: set default qos policy https://review.opendev.org/c/openstack/os-vif/+/881751 | 13:42 |
gibi | Uggla: done. I don't have +2 rights in tempest but those patches look good | 13:52 |
Uggla | gibi, cool thx. | 13:53 |
dansmith | bauzas: gibi so just an update, based on my opensearch digging, I don't think we have seen any volume detach failures in the last week with the exception of cases where either ceph oomed or the guest had a kernel panic | 13:56 |
dansmith | not definitive for sure, but based on previous behavior, I think that's massively better | 13:56 |
opendevreview | Merged openstack/nova stable/2023.1: Revert "Debug Nova APIs call failures" https://review.opendev.org/c/openstack/nova/+/882783 | 14:21 |
bauzas | dansmith: bravo to you | 14:25 |
dansmith | also, the cve backport just failed on the backport validator | 14:25 |
dansmith | maybe just github not updated yet? | 14:25 |
opendevreview | Elod Illes proposed openstack/nova master: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882956 | 14:25 |
elodilles | bauzas dansmith : we need this fix as new branch naming broke our validator :S ^^^ | 14:26 |
dansmith | ah | 14:27 |
dansmith | +2 | 14:27 |
elodilles | thx | 14:27 |
dansmith | elodilles: nice job thanks | 14:27 |
elodilles | i'll propose the backports if this is about to merge | 14:28 |
gibi | dansmith: nice results! | 14:28 |
dansmith | cinder is still running with validations disabled and so they're hitting all the ones we used to but we're not, so that's also a nice A/B comparison :) | 14:29 |
opendevreview | Oleksandr Klymenko proposed openstack/nova master: Host removed from AZ when service is manually disabled https://review.opendev.org/c/openstack/nova/+/882957 | 14:48 |
opendevreview | Elod Illes proposed openstack/nova stable/2023.1: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882964 | 15:05 |
opendevreview | Elod Illes proposed openstack/nova stable/zed: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882965 | 15:06 |
opendevreview | Elod Illes proposed openstack/nova stable/yoga: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882966 | 15:07 |
opendevreview | Elod Illes proposed openstack/nova stable/xena: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882967 | 15:08 |
opendevreview | Elod Illes proposed openstack/nova stable/wallaby: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882968 | 15:09 |
opendevreview | Elod Illes proposed openstack/nova stable/victoria: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882969 | 15:11 |
opendevreview | Elod Illes proposed openstack/nova stable/ussuri: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882970 | 15:12 |
opendevreview | Elod Illes proposed openstack/nova stable/train: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882971 | 15:13 |
opendevreview | Sylvain Bauza proposed openstack/nova stable/zed: Revert "Debug Nova APIs call failures" https://review.opendev.org/c/openstack/nova/+/882786 | 15:29 |
bauzas | folks, I'm also saying it loudly, tomorrow I'll be working and looking at specs | 16:17 |
dansmith | melwitt: the backport checker fix is going to fail on functional, that db table race thing | 16:52 |
melwitt | argh | 16:54 |
dansmith | is there a bug open for that? | 16:54 |
melwitt | yes, sec | 16:55 |
dansmith | I dunno why zuul hasn't kicked it out so I can recheck it yet | 16:56 |
melwitt | I believe it's this one https://bugs.launchpad.net/nova/+bug/1946339 | 16:56 |
dansmith | hmm, similar at least | 16:57 |
melwitt | gibi has done a lot of work to improve the situation but it's a gnarly issue | 16:57 |
dansmith | okay yeah that's the same | 16:58 |
JayF | Hey; I re-proposed the Ironic sharding spec for this cycle about 2 weeks ago. It's not gotten any reviews. If anyone can take a look I'd appreciate it: https://review.opendev.org/c/openstack/nova-specs/+/881643 | 16:58 |
melwitt | there is a change I think we could do that might help but I haven't proposed it yet bc it hadn't been happening very often for a long time there | 16:58 |
dansmith | melwitt: okay gibi just commented on the bug that we've seen an uptick recently | 16:59 |
melwitt | yeah, I just saw that too | 16:59 |
dansmith | fungi: there's something that has completed all our jobs in gate that we need to recheck, but it's just sitting there in the queue and I'm not sure why | 17:12 |
dansmith | it's already identified as failed and out of the queue, but not.. uh, reporting or whatever | 17:13 |
clarkb | dansmith: because the changes ahead of it haven't finished | 17:13 |
clarkb | only the first thing in the queue can report | 17:13 |
clarkb | doing so removes it from the queue and then the next item can be processed | 17:13 |
dansmith | I thought if nothing in front could have caused the failure on that job it would come out immediately | 17:13 |
clarkb | zuul doesn't have that information so can't do that | 17:14 |
dansmith | hmm, okay | 17:14 |
clarkb | by putting things in the same queue you are asserting a failure in one may be caused by the other | 17:14 |
clarkb | and zuul is operating on that knowledge | 17:14 |
dansmith | okay | 17:14 |
dansmith | but it already shows it as failed out (meaning the fork in the line) so I thought that was it saying it knows the things behind it no longer depend | 17:15 |
clarkb | correct the things behind it no longer depend on it. But the things ahead of it may be where the actual bug is | 17:15 |
clarkb | in that case you want to evict the broken stuff ahead and restart the things behind | 17:16 |
dansmith | so the green checks behind this are based on skipping it or with it applied? | 17:16 |
clarkb | the green checks behind are based on skipping the one that has failed | 17:16 |
dansmith | I assume with it applied and they'll restart if it decides it was legit to kick it out? | 17:16 |
dansmith | hmm okay | 17:16 |
clarkb | the unknown is the not yet completed jobs ahead of it | 17:16 |
dansmith | so if it doesn't get kicked out they restart? | 17:16 |
fungi | but they'll all be tested again from scratch if something else ahead of all of those fails a job | 17:16 |
clarkb | you have 5 changes, 6th is a failure, then X behind. Zuul does not know if the failure was caused by the 5 changes at the front so it does not completely evict the 6th until it processes the 5 ahead of it | 17:17 |
fungi | until all changes ahead of the failing change merge successfully, zuul can't be sure that there's something wrong with that change | 17:17 |
dansmith | it's too bad we can't mark a job as isolated or something, because this is only running nova unit tests, but it's held up as if it has the same dependencies as something with a tempest (which is why the queue needs to be shared) | 17:18 |
dansmith | obviously not a very common case | 17:18 |
dansmith | fungi: ack, the fork in the graph makes it look to me like it's already "out" but yeah okay | 17:19 |
dansmith | "out of consideration" I should say | 17:19 |
dansmith | but yeah I guess I thought there was job affinity and not just place-in-the-queue | 17:20 |
fungi | right, if the failure were due to a change ahead of it in an oslo lib, the bug in that oslo change might fail on some other job which exposed the same bug through some other tests which aren't the nova unit test job | 17:24 |
fungi | it's all fairly abstract from zuul's perspective | 17:24 |
dansmith | yeah, probably safer that way I guess, it's just not how I thought it worked | 17:25 |
sean-k-mooney | JayF: i think we have just been a bit busy and missed it | 17:29 |
sean-k-mooney | JayF: one of the things we agreed at the ptg however was not to auto reapprove previously approved specs if there was no code proposed in the previous cycle | 17:30 |
sean-k-mooney | JayF: i know you were working on the ironic side of that last cycle | 17:31 |
JayF | sean-k-mooney: that's an interesting case; there was lots of code landed last cycle related to that spec. None in nova though (we had to get the Ironic API released, which we have) | 17:31 |
sean-k-mooney | JayF: how is that going | 17:31 |
JayF | Ironic shards API exists, was shipped in Antelope | 17:31 |
sean-k-mooney | ack | 17:31 |
JayF | openstacksdk support for it is landed, unsure if released but it can be if needed | 17:31 |
JayF | I'm working on Ironic CLI support for that, which is only really needed once the Nova stuff is released | 17:32 |
JayF | right now, if that spec doesn't hit a speed bump, we've hit every milestone on time | 17:32 |
sean-k-mooney | cool | 17:32 |
sean-k-mooney | are you planning to work on the nova part this cycle | 17:32 |
sean-k-mooney | assuming its the same as the spec from last cycle | 17:32 |
sean-k-mooney | i dont really see any issues with it | 17:33 |
sean-k-mooney | as long as there is someone to work on it we can review | 17:33 |
JayF | I believe John Garbutt is going to be doing most of the heavy lifting, with Julia and I as backup / docs writing | 17:33 |
sean-k-mooney | ok have they confirmed that since john has been out of active nova dev for a while | 17:33 |
JayF | I have confirmed that downstream | 17:34 |
JayF | He helped us with the design, and wrote the spec last cycle which was approved. | 17:34 |
JayF | Either way, regardless of which human writes the code, it's our intent to implement the spec as listed. I sure hope John does it; his familiarity will save a lot of time but even if not, this is too important to let it live/die on one person's shoulders. | 17:35 |
sean-k-mooney | ack | 17:35 |
sean-k-mooney | ill try and review it probably monday at this point but if they want to +2 it i can probably +w it assuming its basically the same as last cycle. | 17:36 |
sean-k-mooney | i was happy with the design previously | 17:36 |
sean-k-mooney | and i dont think anything has materially changed on the nova side that would affect it | 17:37 |
JayF | I appreciate it. My only urgency in getting the spec merged is I believe there's a deadline in the nova process for things we want to land this cycle, yeah? | 17:37 |
sean-k-mooney | there technically is but its milestone too | 17:37 |
sean-k-mooney | *two | 17:37 |
sean-k-mooney | so July 6th | 17:37 |
JayF | aha, I was worried it was -1 | 17:37 |
JayF | sounds good :) thanks Sean! | 17:38 |
sean-k-mooney | no we encourage people to submit the first draft before m1 | 17:38 |
sean-k-mooney | you have time | 17:38 |
JayF | I'm going to use some of that time now to land the ironic cli for shards o/ ty again | 17:38 |
sean-k-mooney | since i have it open im going to do a quick pass on it and compare to last release but then i need to swap to something else. | 17:41 |
sean-k-mooney | JayF: the ironic cli is now a osc plugin yes | 17:42 |
sean-k-mooney | or does ironic still have a standalone cli too | 17:42 |
JayF | sean-k-mooney: yes-ish. We have a plugin for OSC which can also operate independently (e.g. with just Ironic client plugin installed, you can still run `baremetal whatever`) | 17:42 |
JayF | but if the primary openstack cli client is installed, `openstack baremetal whatever` works | 17:43 |
sean-k-mooney | oh neat | 17:43 |
JayF | single codebase, same command structure, just prefix for when it's integrated vs no prefix when it's not | 17:43 |
JayF | that's also why all the Ironic docs use `baremetal X` instead of `openstack baremetal X` (the non-openstack-namespaced version works universally) | 17:44 |
sean-k-mooney | well without i assume the "prefix" is the binary name | 17:44 |
sean-k-mooney | so ironic baremetal X ? vs openstack baremetal X | 17:45 |
JayF | Gonna be honest; I've done very little work in the clients. Part of why I'm speaking inexactly is my knowledge is inexact. | 17:45 |
JayF | No, it's `openstack baremetal X` or `baremetal X` (no Ironic at any point) | 17:45 |
sean-k-mooney | no worries | 17:45 |
sean-k-mooney | ok so then the console script entry point and the binary on the path is called "baremetal" then | 17:46 |
JayF | https://github.com/openstack/python-ironicclient/blob/master/setup.cfg#L25 we have both a binary and the entrypoints setup | 17:46 |
sean-k-mooney | that would be yes https://github.com/openstack/python-ironicclient/blob/master/setup.cfg#L27 | 17:46 |
JayF | heh jinx | 17:46 |
fungi | dansmith: melwitt: (or anybody else plugged into ossa-2023-003), do you happen to know if the vulnerability affects iscsi based deployments that don't rely on multipathd? i asked just now in https://launchpad.net/bugs/2004555 because an operator reached out to me directly with the question | 17:47 |
dansmith | fungi: I just replied and pinged gorka | 17:47 |
fungi | oh, perfect. thanks! | 17:47 |
sean-k-mooney | JayF: no worries just had not seen that done before but that was what i was expecting | 17:47 |
opendevreview | Merged openstack/nova master: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882956 | 20:25 |
opendevreview | Merged openstack/nova stable/2023.1: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882964 | 20:25 |
dansmith | woot | 20:26 |
opendevreview | Merged openstack/nova stable/yoga: Remove deleted projects from flavor access list https://review.opendev.org/c/openstack/nova/+/881314 | 22:40 |
opendevreview | Merged openstack/nova stable/zed: Ironic: retry when node not available https://review.opendev.org/c/openstack/nova/+/867924 | 22:40 |
opendevreview | Merged openstack/nova stable/zed: CI: fix backport validator for new branch naming https://review.opendev.org/c/openstack/nova/+/882965 | 22:40 |
opendevreview | Merged openstack/nova master: doc: Update version info https://review.opendev.org/c/openstack/nova/+/880614 | 23:34 |
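
The gate-queue behaviour clarkb and fungi describe at 17:13 to 17:17 (only the item at the head of a shared queue can report, so a change that has already failed waits until everything ahead of it finishes) can be sketched as a toy model. This is not Zuul code: the `Item` class, the `report_ready` helper, and the change names A to D are hypothetical, and the sketch leaves out the restart of jobs behind a failing change.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Item:
    change: str
    status: str  # "running", "success", or "failed" (hypothetical labels)


def report_ready(queue):
    """Pop and return every finished item sitting at the head of the queue.

    An item that has finished (even one that failed) stays queued while
    anything ahead of it is still running, mirroring "only the first
    thing in the queue can report".
    """
    reported = []
    while queue and queue[0].status != "running":
        head = queue.popleft()
        reported.append((head.change, head.status))
    return reported


# C has already failed, but A ahead of it is still running.
queue = deque([
    Item("A", "running"),
    Item("B", "success"),
    Item("C", "failed"),
    Item("D", "success"),
])

first = report_ready(queue)    # nothing can report while A is running

queue[0].status = "success"    # A finishes
second = report_ready(queue)   # now A, B, C and D report in queue order
```

In this model the failed change C is stuck for exactly the reason dansmith ran into: its result is known, but it cannot report (and so cannot be rechecked) until A at the head of the queue completes.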