frickler | good mornings, approved openstack ansible 8 patch now, let the fireworks begin ;) | 04:48 |
---|---|---|
opendevreview | Merged openstack/project-config master: Switch OpenStack's Zuul tenant to Ansible 8 by default https://review.opendev.org/c/openstack/project-config/+/893290 | 04:56 |
frickler | nothing exploded at first sight. I wonder though whether we'll be able to detect false positives, where jobs might not be failing when they should have, due to some change in ansible behaviour | 05:26 |
opendevreview | Harry Kominos proposed openstack/diskimage-builder master: feat: Add new fail2ban elemenent https://review.opendev.org/c/openstack/diskimage-builder/+/892541 | 08:03 |
Clark[m] | frickler thank you for merging that. I'm working on BBQ things so I'll try to check in periodically throughout the day. | 12:09 |
fungi | i'll continue to check in whenever it won't get me in trouble with christine | 12:11 |
frickler | Clark[m]: fungi: checking open patches I found https://review.opendev.org/c/opendev/system-config/+/891869 , I don't think there's any special risk in merging that one? | 12:37 |
fungi | yeah, we just both +2'd a few seconds apart and neither of us noticed | 12:45 |
fungi | i've approved it now | 12:45 |
fungi | or will once my gertty decides it has to use ipv4 to get to vexxhost | 12:45 |
frickler | ah, I didn't notice that the reviews were that close to each other, else I might have just workflowed myself, thx | 12:47 |
opendevreview | Merged opendev/system-config master: Run bootstrap-bridge with empty nodeset https://review.opendev.org/c/opendev/system-config/+/891869 | 12:53 |
opendevreview | Merged zuul/zuul-jobs master: Drop Helm v2 support to fix v3 issue https://review.opendev.org/c/zuul/zuul-jobs/+/885987 | 14:07 |
gthiemonge | Hi Folks, we have a patch in gerrit that doesn't trigger the CI: https://review.opendev.org/c/openstack/octavia/+/893649 | 14:46 |
gthiemonge | we had a previous version of this patch (different change-id) that had the same issue (in the last patchsets) https://review.opendev.org/c/openstack/octavia/+/878043 | 14:47 |
gthiemonge | any idea? | 14:47 |
Clark[m] | gthiemonge: quick guess is the grenade job explicitly doesn't match the jobs.yaml file so it isn't triggered | 14:50 |
gthiemonge | Clark[m]: do you mean that this file would be in irrelevant-files? | 14:54 |
fungi | gthiemonge: or not included in the files list | 14:55 |
Clark[m] | You could edit a python file in the change to check that. As .py should be included not excluded | 14:58 |
gthiemonge | hmm trying... | 15:01 |
gthiemonge | I saw it briefly in https://zuul.openstack.org/status#893649 but then it disappeared | 15:02 |
Clark[m] | Yes, all changes show up on the status page as they are evaluated. But go away if there is no work to do. | 15:13 |
Clark[m] | Might need to grep the zuul server log to see why it is deciding to ignore the change | 15:15 |
gthiemonge | Clark[m]: for instance, patchset 10 in https://review.opendev.org/c/openstack/octavia/+/893649 has changes in 2 .py files, I enabled only a few jobs. but I don't see anything in zuul | 15:36 |
clarkb | gthiemonge: the zuul server log reports patchset 10 has an invalid configuration. And reporting this failed because "'NoneType' object has no attribute 'result'". Not reporting seems to be a zuul bug | 16:41 |
clarkb | now to figure out where the configuration error is | 16:42 |
clarkb | https://paste.opendev.org/show/bwf5l06NWrJQooB9EInr/ there is the relevant log. Unfortunately it doesn't add more context as to what the actual error is | 16:43 |
clarkb | the only current items reported by zuul are the warnings for the regex negative lookaheads. Those shouldn't be errors but maybe we are inadvertently treating them as such somewhere? | 16:45 |
clarkb | cc corvus: couple of thoughts, maybe we should always explicitly log the errors when we record "invalid config for change" in the zuul log? that way if there are reporting errors to gerrit we have some info? | 16:47 |
clarkb | also it looks like the reporting error is because we put the builduuid in the report but that doesn't exist yet. So we need some sort of check for that before rendering the more verbose reports? | 16:47 |
clarkb | is it possible that zuul is ok loading existing configs with warnings but won't accept them in new updates? That could explain the behavior I guess. | 16:54 |
clarkb | I feel like that would also be a bug if that is the cause here | 16:54 |
clarkb | I think I see the bug maybe | 17:00 |
clarkb | yup give me a few minutes to sit down and process it enough to write a fix | 17:01 |
gthiemonge | clarkb: wow, thanks | 17:06 |
gmann | yeah, even the same change ran the tests 10 days ago so something recently changed in zuul side? | 17:26 |
fungi | gmann: our zuul deployment upgrades automatically every saturday, so things are changing weekly. this looks to be related to new features that support non-fatal configuration warnings and potentially triggered by the new deprecation warning for regex lookarounds that the google-re2 implementation doesn't support | 17:29 |
clarkb | remote: https://review.opendev.org/c/zuul/zuul/+/893682 Allow new configs to be used when warnings are present | 17:30 |
clarkb | I think that is the fix | 17:30 |
gmann | ack, thanks. will wait for this. | 17:31 |
clarkb | I haven't actually run that test case locally because I don't have a zuul test suite running currently here | 17:31 |
clarkb | this reminds me I still need to debug my laptop display artifacts... | 17:31 |
carloss | clarkb: thanks for working on the fix for the issue pointed out by gthiemonge. We are hitting the same issue in Manila, and we currently have our gate blocked, since a change was merged in Nova (bumping the libvirt version) | 19:11 |
carloss | we need to do some changes to our CI jobs to unblock the gate | 19:12 |
corvus | carloss: check back in about an hour; that's about the minimum time it'll take to merge and restart | 19:18 |
carloss | tyvm corvus :D | 19:19 |
frickler | 2nd patch is failing in gate :( | 19:34 |
frickler | but just to double check my understanding, this only affects changes that modify zuul config for a project that has regex warnings? and only scheduler needs the fix or executors, too? | 19:36 |
Clark[m] | Correct and only the schedulers I think | 19:39 |
fungi | the second change is just a fix for testing, so less urgent | 19:39 |
Clark[m] | Ya the second change is something I noticed when writing the new test | 19:40 |
corvus | can we move this to #zuul:opendev.org ? | 19:41 |
frickler | you can discuss the details of the issue and fix over there, but for assessing the impact on opendev and keeping our consumers informed, I think this channel is more suited | 19:51 |
frickler | speaking of the latter, how about a status notice? | 19:51 |
Clark[m] | Something like #status notice Gerrit changes including configuration updates may fail due warnings in the configuration. Investigation for a fix is ongoing. | 19:55 |
Clark[m] | I'm not at home where I'm auth'd to send that though | 19:55 |
frickler | I can send it, I'd just amend "... due to warnings ..." | 19:56 |
frickler | also I wasn't aware that only authed users can send these. though of course that makes sense | 19:56 |
frickler | another question, if a fix turns out to need more time, can we restart the executors on an older version? or would we need to wait for a revert (of which patch(es)) and run a new image? | 19:57 |
frickler | corvus: fungi: do you agree with the above notice clark proposed? | 19:58 |
fungi | maybe "silently fail"? since we haven't observed any user-facing feedback on those, right? | 19:59 |
corvus | yes, i was referring to the details of the patch and its testing regime :) | 19:59 |
corvus | i think the status should include the word zuul | 19:59 |
frickler | "... may fail to be tested by zuul due to ..."? | 20:00 |
corvus | i would suggest: "Gerrit changes including Zuul configuration updates may silently fail. A fix is in progress." | 20:00 |
corvus | actually: "Gerrit changes that update Zuul configuration may silently fail. A fix is in progress." is better i think | 20:01 |
frickler | I think it would be better to be specific about not getting any response from zuul at all. "failing" will likely be associated with a V-1 which is not happening | 20:02 |
corvus | "Some Gerrit changes that update Zuul configuration may fail with no response from Zuul. A fix is in progress." | 20:03 |
fungi | agtm | 20:03 |
fungi | er, sgtm | 20:03 |
frickler | ack, do you want to send yourself or shall I? | 20:03 |
corvus | i can | 20:04 |
corvus | #status notice Some Gerrit changes that update Zuul configuration may fail with no response from Zuul. A fix is in progress. | 20:04 |
opendevstatus | corvus: sending notice | 20:04 |
-opendevstatus- | NOTICE: Some Gerrit changes that update Zuul configuration may fail with no response from Zuul. A fix is in progress. | 20:04 |
frickler | reading the latest in #zuul it sounds like the fix does look valid after all, so my other question may not be as relevant any more | 20:05 |
opendevstatus | corvus: finished sending notice | 20:07 |
*** | mmalchuk_ is now known as mmalchuk | 20:13 |
*** | jonher_ is now known as jonher | 20:13 |
corvus | the change merged but please hold on restarting | 20:56 |
corvus | i think we need one more fix to cover the cases we saw today; details in #zuul:opendev.org | 21:08 |
fungi | thanks! | 21:15 |
corvus | okay, more robust fix is enqueued; eta +1h | 21:33 |
corvus | i'm going to begin restarting schedulers now | 22:14 |
corvus | https://review.opendev.org/c/openstack/octavia/+/893649 looks good now -- anything else to double check? or should we send the all clear status? | 22:23 |
corvus | how about this? status notice Gerrit changes with updates to Zuul's configuration should now be handled correctly. Recheck any changes to Zuul configuration which did not report results. | 22:26 |
corvus | Clark: fungi ^? | 22:29 |
corvus | gthiemonge: carloss ^ things should be fixed if there's anything else you want to check | 22:30 |
fungi | corvus: sorry, stepped away. lgtm | 22:33 |
fungi | and thanks again! | 22:34 |
carloss | corvus: apparently all good now. thank you! | 22:35 |
corvus | #status notice Gerrit changes with updates to Zuul's configuration should now be handled correctly. Recheck any changes to Zuul configuration which did not report results. | 22:36 |
opendevstatus | corvus: sending notice | 22:36 |
-opendevstatus- | NOTICE: Gerrit changes with updates to Zuul's configuration should now be handled correctly. Recheck any changes to Zuul configuration which did not report results. | 22:36 |
fungi | yay! | 22:37 |
opendevstatus | corvus: finished sending notice | 22:39 |
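
Note on the file-matcher guess from 14:50 to 14:58: the first theory in the log was that the job did not run because the changed file was either listed in `irrelevant-files` or not covered by the job's `files` list. The sketch below is a simplified model of that decision, not Zuul's actual code. The function name `job_matches`, the use of `re.search`, and the example paths and patterns are all assumptions made for illustration; Zuul's real matcher has further details (for example around changes with no files) that are not modelled here.

```python
import re


def job_matches(changed_files, files=None, irrelevant_files=None):
    """Simplified model of a job's file matchers (hypothetical helper).

    files: the job runs only if at least one changed file matches one
           of these regexes.
    irrelevant_files: the job is skipped if every changed file matches
           one of these regexes.
    """
    if files is not None:
        patterns = [re.compile(p) for p in files]
        # No changed file matches any "files" pattern: job is not triggered.
        if not any(p.search(f) for f in changed_files for p in patterns):
            return False
    if irrelevant_files is not None:
        patterns = [re.compile(p) for p in irrelevant_files]
        # Every changed file is irrelevant: job is not triggered.
        if all(any(p.search(f) for p in patterns) for f in changed_files):
            return False
    return True


# Hypothetical paths and patterns, only to illustrate the two cases.
only_config = ["zuul.d/jobs.yaml"]
config_and_code = ["zuul.d/jobs.yaml", "octavia/example.py"]

print(job_matches(only_config, files=[r"\.py$"]))
print(job_matches(config_and_code, files=[r"\.py$"]))
print(job_matches(only_config, irrelevant_files=[r"^zuul\.d/"]))
print(job_matches(config_and_code, irrelevant_files=[r"^zuul\.d/"]))
```

This is why touching a `.py` file was suggested as a quick check: under either kind of matcher, adding a relevant file should make the job run. In this incident the check did not help, because the real cause turned out to be the configuration-warning handling discussed later in the log.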