clarkb | the two new giteas have their empty giteas deployed now. It's late enough in the day I think I'll save brain transplants and setting up replication for tomorrow | 00:18 |
---|---|---|
fungi | sounds good | 00:18 |
clarkb | One thing I'll probably do is transplant from gitea09 rather than gitea01 | 00:19 |
clarkb | just to prove out the transitive continuity of this process | 00:20 |
clarkb | also /me makes a note to look into why it's ok to land multiple project rename changes after we do the work on gerrit | 00:26 |
opendevreview | Ian Wienand proposed opendev/system-config master: dns: remove old openstack.org nameservers from iptables list https://review.opendev.org/c/opendev/system-config/+/876908 | 02:38 |
ianw | ok the copy conditions change rolled out to gerrit with https://zuul.opendev.org/t/openstack/build/5faa7ce2e20a4203a19e577bda607228/logs | 02:41 |
opendevreview | Ian Wienand proposed opendev/system-config master: remove adns1 host_vars file https://review.opendev.org/c/opendev/system-config/+/876909 | 02:44 |
opendevreview | Ian Wienand proposed opendev/system-config master: system-config-run-dns : update nodes to jammy https://review.opendev.org/c/opendev/system-config/+/876930 | 02:58 |
opendevreview | Ian Wienand proposed opendev/system-config master: Remove unused adns1/ns* host_vars files https://review.opendev.org/c/opendev/system-config/+/876909 | 04:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: system-config-run-dns : update nodes to jammy https://review.opendev.org/c/opendev/system-config/+/876930 | 04:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: dns variables : move to canonical locations https://review.opendev.org/c/opendev/system-config/+/876935 | 04:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: Refactor adns variables https://review.opendev.org/c/opendev/system-config/+/876936 | 04:12 |
opendevreview | Ian Wienand proposed opendev/system-config master: system-config-run-dns : update nodes to jammy https://review.opendev.org/c/opendev/system-config/+/876930 | 05:37 |
opendevreview | Ian Wienand proposed opendev/system-config master: Refactor adns variables https://review.opendev.org/c/opendev/system-config/+/876936 | 05:37 |
opendevreview | Ian Wienand proposed opendev/system-config master: bind9 : drop obsolete option for later versions https://review.opendev.org/c/opendev/system-config/+/876937 | 05:37 |
ianw | clarkb: ^ still a bit of a wip but will update https://etherpad.opendev.org/p/2023-opendev-dns as i go along | 05:39 |
*** jpena|off is now known as jpena | 07:50 | |
frickler | infra-root: some people are reporting changes not getting merged after zuul +2. this isn't the usual "change needs rebase" situation. looking at zuul logs, I see some gerrit connection failures, but haven't been able to correlate those with failed merges | 08:58 |
frickler | there's a lot of errors like https://paste.opendev.org/show/b0wbrpOsg6pruJGkKwSh/ but those seem likely not related | 08:59 |
frickler | example patch is https://review.opendev.org/c/openstack/heat/+/876727/ | 08:59 |
tkajinam | https://review.opendev.org/c/openstack/heat/+/876727 https://review.opendev.org/c/openstack/heat/+/876728 | 09:00 |
tkajinam | these are some examples. just for your reference :-) | 09:00 |
tkajinam | seems the change is not submitted to git repo so I guess something failed before that https://opendev.org/openstack/heat/ | 09:03 |
frickler | I'm suspecting some relation with the submit requirements changes, but no idea how to debug this further. no obvious errors in gerrit log, so likely gerrit just assumes there is some unmet condition yet | 09:03 |
frickler | 2023-03-09 08:12:30,007 INFO zuul.GerritConnection: [e: 5428cb4fff314ed686e889946db11667] Conflict submitting data to gerrit, change may already be merged | 09:03 |
yoctozepto | frickler: maybe need to establish who is affected and start from there? and also: rechecks do not help? (to eliminate temporary issues) | 09:24 |
frickler | at least not all changes are affected, e.g. https://review.opendev.org/c/openstack/kolla-ansible/+/875320 merged fine | 09:26 |
yoctozepto | kolla rlz | 09:35 |
yoctozepto | anything interesting on the merge requirements diff between heat and kolla? | 09:36 |
tkajinam | idk | 10:08 |
hemanth | yoctozepto: i have the same problem with gate +2 and not merged. I did recheck and situation is same. For reference, https://review.opendev.org/c/openstack/charm-ops-sunbeam/+/876938 | 10:22 |
yoctozepto | uh-oh, so it's persistent | 10:23 |
yoctozepto | so we have charms and heat affected at least | 10:23 |
yoctozepto | and kolla unaffected | 10:23 |
tkajinam | I have one patch I have to merge to unblock requirement bump but seems I have to wait for a while | 10:31 |
frickler | yes, I'd rather have some of my co-roots check the situation before touching anything. things like another gerrit restart might as well make the situation worse than better | 11:02 |
tkajinam | yeah | 11:09 |
tkajinam | I'll leave now and come back later in a few hours. | 11:09 |
tkajinam | to check the status | 11:10 |
tkajinam | frickler, thanks again ! | 11:10 |
frickler | actually zuul does report the issue in the buildset results https://zuul.opendev.org/t/openstack/buildsets?pipeline=gate&result=MERGE_FAILURE&skip=0 | 11:12 |
frickler | first of the current list seems to have been at 00:28 (start time + buildset duration) . nothing too obvious to me in terms of the set of affected projects | 11:14 |
frickler | in the gerrit UI, all submit requirements are shown as unsatisfied, even with all the necessary +1/+2 in place | 11:25 |
frickler | to double-check I made myself admin shortly and verified that I don't get shown the merge button, either | 11:25 |
yoctozepto | hmm, worrying | 11:27 |
yoctozepto | I quickly checked that there were seemingly no relevant project-config updates that could break it so much | 11:28 |
yoctozepto | unless they hit some gerrit bug | 11:28 |
yoctozepto | https://review.opendev.org/c/openstack/project-config/+/867931 | 11:29 |
yoctozepto | like, if the changes here caused gerrit to misbehave | 11:29 |
yoctozepto | it seems it merged at a time likely to be a cause of the disruption | 11:30 |
frickler | even abandoning and restoring a patch doesn't help | 11:31 |
fungi | i'm a little fuzzy on the submit requirements stuff, but does it indicate satisfied submit requirements on any changes that haven't merged yet? are we just talking about zuul putting verified +2 on a change but then not merging it, or not enqueuing approved changes into the gate pipeline? | 12:36 |
fungi | the copycondition change is unlikely to be the cause, the primary acl (called all-projects) which every normal project's acl inherits from is not updated from changes in git, so ianw updated it manually in preparation for the gerrit 3.7 upgrade. see the most recent entry at the top of https://wiki.openstack.org/wiki/Infrastructure_Status | 12:40 |
fungi | that happened just before 2023-03-08 22:22:18 UTC | 12:40 |
fungi | could there be some connection between changes with an "n/a" status and changes zuul isn't merging? | 12:41 |
fungi | if so, forcing an online reindex might be the next step. we didn't do that because it was thought the n/a status was purely a cosmetic thing | 12:42 |
fungi | okay, i'm caught up on the discussion in #openstack-infra now, so it seems zuul tests things in the gate pipeline and then adds +2 verified but fails to merge it | 12:43 |
fungi | apparently it's possible to reindex a single change, so i suppose i could try that on one of the changes that didn't merge | 12:46 |
fungi | i did a `gerrit index changes 876726` just now | 12:47 |
fungi | and i've rechecked 876726 to see if that's changed the situation | 12:50 |
fungi | if it has, we can just force an online reindex of all changes | 13:03 |
fungi | and then we can pretty easily query gerrit for unmerged changes with verified +2 and recheck them | 13:05 |
frickler | the issue also doesn't seem to be related to specific projects, https://review.opendev.org/c/openstack/heat/+/876950 just merged without issues | 13:11 |
frickler | but https://review.opendev.org/c/openstack/glance/+/871621/ shows green checkmarks for CR and W while being processed in gate, for 876726 this still shows unsatisfied, so I fear the reindex has not helped | 13:13 |
bbezak | I have similar case with https://review.opendev.org/c/openstack/kayobe/+/876807 running gate for 3-rd time, as I tried to fix it | 13:18 |
bbezak | this too - https://review.opendev.org/c/openstack/kayobe/+/876808/ | 13:19 |
fungi | 876726 got another verified +2 but still did not merge even after the reindex | 13:21 |
fungi | i wonder if i need to use the --force flag, maybe the index command won't reindex a change that's already been indexed otherwise? | 13:22 |
fungi | mmm, no --force is only for a full reindex of all changes | 13:25 |
bbezak | according to https://zuul.opendev.org/t/openstack/status#876807, this change finished gate jobs but didn't add +2 - not sure if that's related | 13:26 |
fungi | unrelated, the change ahead of it hasn't completed its jobs yet and might fail, so zuul doesn't technically know yet whether 876807 on top of the change ahead represents a valid state for the branch | 13:29 |
bbezak | makes sense | 13:29 |
fungi | i'm going to start a full online reindex, since it will likely be a while before clarkb is awake, and he and ianw have a much stronger grasp of the submit requirements bits in gerrit 3.6/3.7 so i wouldn't want to attempt rolling back the various edits until he's around anyway | 13:30 |
fungi | 2793 tasks in the queue now | 13:32 |
fungi | it's already finished 1k of those | 13:38 |
fungi | waiting for it to get through all of the heat changes so i can see if that's changed the submit requirements satisfaction on 876726 | 13:39 |
fungi | under 1k tasks remaining now, but hasn't gotten to heat yet | 13:49 |
bbezak | gate started once more (on its own) for 876807 and 876910, and 876808 hangs on submit requirements | 13:55 |
bbezak | (FYI) | 13:55 |
fungi | heat changes have finished reindexing but the submit requirements on 876726 still all show unsatisfied | 13:57 |
fungi | so the forced full reindex doesn't seem to have made any difference either | 13:58 |
fungi | frickler: i know you had samples which indicate the issue isn't project-specific, but have you noticed if the difference in behavior is maybe branch-specific? for example, i see 876726 is for stable/2023.1 | 14:01 |
frickler | fungi: I've seen failures for both stable/2023.1 and master, I haven't seen success for stable branches yet, but that's not necessarily statistically significant yet | 14:04 |
fungi | right, thanks | 14:04 |
fungi | i've also asked in the gerritcodereview matrix/discord discussion channel | 14:12 |
fungi | in hopes someone there has seen this before | 14:12 |
iurygregory | hey opendev, we are trying to understand why https://review.opendev.org/c/openstack/ironic-inspector/+/876968 looks like is stuck, it has verified +2 but doesn't look like is merged... the chain patch says it failed to merge (https://review.opendev.org/c/openstack/ironic-inspector/+/876969/ based on zuul comment) | 14:19 |
fungi | iurygregory: it's something to do with the switch to submit requirements in preparation for upgrading to gerrit 3.7, i've just about exhausted my troubleshooting options so am asking the broader gerrit community for suggestions while waiting for people here who have a better grasp of the feature in order to discuss how to roll back the changes | 14:21 |
fungi | it seems to be impacting random changes, not all changes for the same project/branch get blocked like that | 14:22 |
fungi | but if a change is affected, there seems to be no point in retrying it until we have this fixed | 14:22 |
iurygregory | fungi, ack, tks for the heads-up! | 14:24 |
iurygregory | yeah, I will hold this in ironic patches till the problem is solved | 14:24 |
fungi | looking in the gerrit error_log, i see changes which are successfully merging logging warnings like this: | 14:29 |
fungi | [2023-03-09T14:22:15.599Z] [HTTP POST /a/changes/openstack%2Freleases~master~I2f3f23b29a5e1b0f39c2bd61f2f5cd4f3ce4bc79/submit (zuul from [2001:4800:7819:103:be76:4eff:fe04:3df3])] WARN com.google.gerrit.server.submit.MergeOp : Change 876611: No result found for project config submit requirement 'workflow' [CONTEXT SUBMISSION_ID="876611-antelope-tp-latest-1678371734954-563cca49" | 14:29 |
fungi | project="openstack/releases" request="REST /changes/*/submit" ] | 14:29 |
fungi | it logs the same thing about code-review and verified labels too | 14:30 |
fungi | for the changes which are unable to merge due to unsatisfied submit requirements, i find no mention of them in the error_log whatsoever | 14:41 |
*** sfinucan is now known as stephenfin | 14:44 | |
fungi | the forced reindex does seem to have addressed the condition in this bug ianw reported at least: https://bugs.chromium.org/p/gerrit/issues/detail?id=16748 | 14:53 |
clarkb | my first hunch was that we weren't satisfying the requirement for some legit reason. Then perhaps a bug in the condition for the submit requirement. But both look fine (you can literally hover and view conditions on the labels in the UI and they all seem to match) | 14:54 |
fungi | and conditions shouldn't differ for changes to the same project+branch either | 14:55 |
clarkb | though they can, I don't think any of the config we pushed did that (this is what applicableIf is for; you can use that to make rules apply to specific branches) | 14:55 |
fungi | oh, like the prolog rules? | 14:56 |
clarkb | all of this exists to replace the prolog rules but be more approachable | 14:56 |
clarkb | but ya basically you can write a rule. You write conditions to satisfy the rule and conditions for when the rule applies | 14:56 |
fungi | anyway, in this case the hover pop-up for those conditions says approximately "these conditions aren't met... [conditions which the applied vote actually meets]" | 14:58 |
clarkb | yes if you expand further and look at the rule it seems to be evaluating, it definitely should be satisfied | 14:58 |
clarkb | fungi: maybe try pushing a new change to the project+branch combo and see if setting satisfying votes on it changes anything | 15:01 |
fungi | yeah, i guess the example i'm looking at is a change which was pushed before the acl changes | 15:03 |
fungi | maybe i should rebase an affected change and then set votes on it as an administrator | 15:04 |
clarkb | ya I'm wondering if this is change specific somehow | 15:04 |
clarkb | I wouldn't rebase. I would create an entirely new change | 15:04 |
clarkb | and if that changes things then maybe rebase | 15:04 |
fungi | sure | 15:04 |
clarkb | (the change metadata is collected in a per change ref so if it is change specific rebase is less likely to change anything) | 15:04 |
fungi | https://review.opendev.org/c/openstack/heat/+/876984 DNM: Testing submit requirements [NEW] | 15:06 |
clarkb | ok doesn't appear change specific | 15:07 |
fungi | i added +2 code-review, +2 verified, +1 workflow as an admin | 15:07 |
fungi | still shows those as unsatisfied | 15:07 |
fungi | abandoning and will try on bindep master | 15:08 |
clarkb | https://review.opendev.org/c/openstack/heat/+/876950 is a heat master change that was fine | 15:10 |
clarkb | its almost like this is branch specific (which you started investigating previously) | 15:10 |
opendevreview | Jeremy Stanley proposed opendev/bindep master: DNM: Testing submit requirements https://review.opendev.org/c/opendev/bindep/+/876985 | 15:10 |
fungi | master branch examples were pointed out on a project with one change affected and another not | 15:11 |
clarkb | oh interesting | 15:11 |
clarkb | ah yup https://review.opendev.org/c/openstack/heat/+/876728 | 15:11 |
clarkb | what is extremely confusing is that it accurately shows the rules I would expect. So the problem isn't that the rules aren't applying or the rules are wrong. It's that for whatever reason it has decided the rule isn't satisfied | 15:12 |
fungi | https://review.opendev.org/c/opendev/bindep/+/876985 | 15:13 |
fungi | same behavior | 15:13 |
fungi | i've abandoned it now | 15:15 |
clarkb | thinking out loud here: there were a series of changes. The first of which we made directly to all-projects and seems like stuff was happy at that point. Could a followup change have somehow broken things (seems more likely that things were just inconsistently broken at the start but exploring ideas) | 15:15 |
clarkb | you can see https://review.opendev.org/c/opendev/system-config/+/876233 gets a vote at 2300 UTC ish yesterday that satisfies the requirement | 15:17 |
fungi | the copy conditions removal was the only production acl change to merge after all-projects was updated, right? | 15:18 |
clarkb | fungi: ^ can you use your extra superpowers to remove ianw's vote and then apply a +2 code review for yourself? | 15:18 |
clarkb | fungi: with a pause in the middle so that we can confirm the requirement is unsatisfied after ianw's vote is removed | 15:18 |
clarkb | ya https://review.opendev.org/c/openstack/project-config/+/867931 appears to be as far as the stack got | 15:20 |
clarkb | and that didn't touch opendev/system-config which also seems to exhibit this | 15:20 |
fungi | it immediately shows the requirement as unsatisfied when i delete ianw's vote | 15:20 |
fungi | confirm you see the same | 15:20 |
clarkb | fungi: yup I confirm | 15:20 |
fungi | now adding my own | 15:20 |
fungi | mine shows up satisfying it now | 15:21 |
clarkb | interesting | 15:21 |
fungi | going to test again | 15:21 |
clarkb | this has me going back to the idea it is change specific :/ | 15:21 |
fungi | yeah, so tested that change with both the webui and gertty, my cr+2 shows up as satisfactory | 15:22 |
fungi | regardless of how i applied my vote | 15:23 |
clarkb | the other odd thing is it affects all the submit requirements on the affected changes. This implies it isn't rule specific | 15:24 |
clarkb | (for example some projects do weird things with code-review like infra-specs and openstack/governance, but that doesn't seem to be in play here since verified and workflow are also affected) | 15:24 |
clarkb | UNSATISFIED specifically means "The submit requirement is applicable (applicableIf evaluates to true), but the evaluation of the submittableIf and overrideIf expressions return false for the change." according to the docs | 15:25 |
clarkb | the submittableIf value is what you see in the condition output | 15:26 |
bbezak | pretty recent occurrence (if that's helpful) - https://review.opendev.org/c/openstack/ansible-collection-kolla/+/876261/ | 15:28 |
opendevreview | Mark Fedorov proposed opendev/git-review master: Add CC similarly to reviewers https://review.opendev.org/c/opendev/git-review/+/849219 | 15:28 |
clarkb | ok if this is it then it's going to be the silliest thing ever | 15:30 |
bbezak | does this issue only apply to changes pushed in a relation chain? | 15:30 |
clarkb | fungi: New Gerrit (either 3.7 or master) allows for boolean operators to be lower or upper case. The docs for 3.6 only show upper case being valid | 15:30 |
clarkb | fungi: https://paste.opendev.org/show/brAj40R1mJbQZSXAXEQ5/ shows lower case being used for 'and' and not 'AND' | 15:31 |
clarkb | fungi: in our copy condition change 'OR' is used | 15:31 |
clarkb | anyway I'm looking at this because the docs say our submittableIf condition is not evaluating truthfully | 15:31 |
clarkb | I have no idea how this worked before if this is the problem, but maybe we try changing the 'and' to 'AND' in All-Projects ? | 15:31 |
clarkb | bbezak: no | 15:32 |
fungi | seems worthwhile (and fast) to try | 15:32 |
clarkb | bbezak: fungi reproduced with brand new changes on both stable/2023.1 and master on different projects with no relations to other changes | 15:32 |
clarkb | fungi: are you in a position to make that update? I think you've already escalated privs. I haven't even loaded my ssh keys yet | 15:33 |
fungi | i already removed my account from the groups but can readd | 15:33 |
clarkb | compare https://review.opendev.org/Documentation/user-search.html#_boolean_operators to https://gerrit-review.googlesource.com/Documentation/user-search.html#_boolean_operators for the case difference I'm talking about | 15:33 |
fungi | just need to remember the fetch ref dance for all-projects config | 15:33 |
clarkb | fungi: I think if you clone all-projects the default head may be refs/meta/config | 15:34 |
clarkb | but you would checkout refs/meta/config edit to s/and/AND/ (though maybe do by hand to avoid extra edits it's like three lines? ) and then git push HEAD:refs/meta/config | 15:34 |
clarkb | I'm trying to test the query now | 15:36 |
fungi | had to fetch origin refs/meta/config and checkout FETCH_HEAD | 15:37 |
clarkb | ya so if you do https://review.opendev.org/q/label:Code-Review%253DMAX+and+-label:Code-Review%253DMIN you don't get any changes that are having the problem. Switch to https://review.opendev.org/q/label:Code-Review%253DMAX+AND+-label:Code-Review%253DMIN and then I think you get all the ones that are unhappy | 15:38 |
clarkb | so ya I think maybe this is it | 15:38 |
clarkb | how this worked even a little bit I have no idea | 15:38 |
fungi | this is what i have: https://paste.opendev.org/show/bIjjkesJNFrmWeyucGD7/ | 15:39 |
fungi | lgty? | 15:39 |
clarkb | sorry kids are heading to school had to say bye. looking now | 15:40 |
clarkb | fungi: yes three updated rules replacing 'and' with 'AND' is what I expected. | 15:41 |
clarkb | LGTM | 15:41 |
fungi | pushing | 15:41 |
fungi | the push command always confuses me when pushing detached to an arbitrary remote ref | 15:43 |
clarkb | I think in this instance `git push HEAD:refs/meta/config` would work since you're doing a normal fast forward using the locally checked out state | 15:44 |
clarkb | but you don't want to use HEAD if the local state is not what you want to push | 15:44 |
clarkb | fungi: can you let us know when it is pushed? | 15:45 |
fungi | fatal: You are not currently on a branch. To push the history leading to the current (detached HEAD) state now, use: git push HEAD:refs/meta/config HEAD:<name-of-remote-branch> | 15:46 |
fungi | git changed a bunch of stuff around how push works, which is part of what's confounding me | 15:46 |
clarkb | ya I've never seen that before. I like when git tries to be helpful but ends up making things worse for people who figured things out previously | 15:47 |
clarkb | fungi: maybe if you just checkout a branch state you'll be good? | 15:47 |
clarkb | basically stop being detached | 15:47 |
clarkb | git checkout -b update-meta-config should put you on a non detached state and then you can retry? | 15:47 |
fungi | ssh: Could not resolve hostname head: Name or service not known. fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists. | 15:48 |
clarkb | oh sorry git push origin HEAD:refs/meta/config | 15:48 |
clarkb | or whatever remote is the one that is gerrit | 15:48 |
fungi | `git push origin HEAD:refs/meta/config` gets me farther, though my fungi.admin account apparently lacks permissions to do this by default | 15:49 |
clarkb | are you in bootstrappers? You need that for the push | 15:49 |
fungi | [remote rejected] HEAD -> refs/meta/config (prohibited by Gerrit: not permitted: update) | 15:49 |
clarkb | admin gets you like 80% of what you need typically but when doing git push you also need bootstrappers | 15:49 |
fungi | yeah, that's where i was headed next | 15:50 |
fungi | 27bdea8..5c92d15 HEAD -> refs/meta/config | 15:50 |
fungi | done | 15:50 |
clarkb | I think that was it. I just refreshed a change I had +2'd and was previously unsatisfied | 15:51 |
clarkb | it is satisfied now | 15:51 |
clarkb | I'll restore your bindep change to check it | 15:51 |
frickler | https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/876941 also looks green now | 15:51 |
clarkb | yup satisfied on bindep now too | 15:51 |
clarkb | I have no idea why this was working at all | 15:51 |
frickler | but why would this have affected only a subset of all patches? | 15:51 |
fungi | wtf | 15:51 |
clarkb | yes indeed wtf. | 15:52 |
fungi | frickler: might be related to the warnings in the log for the changes that were merging? somehow gerrit didn't bother to evaluate the conditions on those? | 15:52 |
clarkb | the query links I posted above show you the difference between the two results | 15:52 |
clarkb | they are definitely different, but I'm not sure why/how it worked at all for some things. Maybe there was a transition point using the old rules | 15:53 |
clarkb | and the full reindex essentially said no to that? | 15:53 |
fungi | (the log entry example i quoted at 14:29z) | 15:53 |
clarkb | we should make a change to our system-config docs | 15:53 |
clarkb | fwiw I focused on the condition and why it may not be satisfied since that is what the docs said was going on. Got a bit lucky that I seemed to recall newer gerrit made the boolean operators more forgiving and once I confirmed that, checked if we had used the wrong version, which we had. Then I checked the queries for deltas which existed and ya | 15:56 |
fungi | so as far as getting things back on track, someone probably should query for any unmerged changes with verified +2 and recheck or reapprove them depending on the tenant | 15:56 |
frickler | hmm, this query shows some reviews with CR+1, not +2 https://review.opendev.org/q/label:Code-Review%253DMAX+AND+label:Workflow%253DMAX+AND+label:Verified%253DMAX+AND+status:open | 15:57 |
clarkb | can also just enqueue straight to zuul if we want to do that | 15:57 |
clarkb | frickler: if you open the change directly they have max votes too | 15:57 |
clarkb | not sure why the summary shows the lower value | 15:57 |
clarkb | frickler: I think the query results are accurate but the UI is buggy | 15:58 |
fungi | oh, right we could reenqueue to gate with zuul-client | 15:59 |
clarkb | fungi: I guess rechecking is fine for openstack though since they have verified +2's already | 15:59 |
clarkb | it won't clean check them which is what i was worried about | 16:00 |
frickler | hmm, https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/876676 didn't get submitted even after it received a fresh V+2 from zuul :-( | 16:00 |
fungi | i dunno, when i rechecked that affected heat stable/2023.1 change it went back to check | 16:00 |
frickler | buildset result still stays MERGE_FAILURE https://zuul.opendev.org/t/openstack/buildset/f98ec8e3999d4b03bff993942a665b46 | 16:01 |
fungi | i wonder if there's some cache involved? or whether we need to force another reindex | 16:02 |
clarkb | I would check the zuul logs for that change first as it should have more details on what exactly failed | 16:02 |
clarkb | and maybe the gerrit logs but yes I suppose that is possible if the check for mergeability by zuul is failing | 16:02 |
clarkb | though now that I've typed this I suddenly wonder if zuul needs explicit compatibility with submit requirements for mergeability checking | 16:03 |
clarkb | (again how did this work for a short time for some changes if so) | 16:03 |
clarkb | fungi: I do notice that on the listing pages some of the X of Y values are stale compared to what is shown in the change page so ya could be some index is stale? | 16:04 |
clarkb | definitely start with the service logs and work from there though | 16:04 |
fungi | i do suspect it's zuul deciding they can't merge without trying, because the gerrit error_log makes no mention of 876676 whatsoever | 16:05 |
frickler | not much more than this on zuul02 2023-03-09 15:50:09,133 INFO zuul.GerritConnection: [e: c322f3cd7eba42eb8e7b2a43f9df3e28] Conflict submitting data to gerrit, change may already be merged | 16:07 |
frickler | so this sounds like a response from gerrit to me | 16:07 |
clarkb | frickler: I think that the gerrit connection will log at a debug level the request and response it got before that? | 16:07 |
clarkb | ya it did | 16:08 |
clarkb | POST: https://review.opendev.org/a/changes/openstack%2Fneutron-dynamic-routing~stable%2F2023.1~Ia3593a6d2183c7621a5a39bf1a3023fc28e0f5ed/submit | 16:09 |
frickler | that just shows the post and no response | 16:09 |
clarkb | data: {} | 16:09 |
frickler | https://paste.opendev.org/show/bHXBvCUitV7F8o0pwfAy/ is the whole context | 16:09 |
clarkb | I think data is the post data | 16:09 |
frickler | but so zuul told gerrit to submit it, but it didn't happen | 16:10 |
clarkb | that says gerrit is returning a 409 | 16:10 |
clarkb | I would have expected gerrit to log that. But maybe because it isn't a 5XX it doesn't? | 16:11 |
frickler | where do you see that 409? | 16:13 |
fungi | i'm not finding 409 in the zuul scheduler debug log on zuul02 | 16:14 |
clarkb | frickler: its the code that zuul emits that message for. And I just confirmed in the gerrit httpd log it was a 409 | 16:14 |
fungi | not around the merge attempt for the change anyway | 16:14 |
clarkb | (note you have to grep by the change id not the number as that is what zuul posts to) | 16:14 |
clarkb | fungi: ya zuul converts the 409 to Conflict submitting ... | 16:14 |
fungi | got it | 16:14 |
clarkb | and https://gerrit-review.googlesource.com/Documentation/rest-api.html documents why that may happen | 16:15 |
fungi | 2023-03-09 15:50:08,055 DEBUG zuul.GerritConnection: POST: https://review.opendev.org/a/changes/openstack%2Fneutron-dynamic-routing~stable%2F2023.1~Ia3593a6d2183c7621a5a39bf1a3023fc28e0f5ed/revisions/13e0d8a63dbdbd9e1a863144999794d4fc9af22d/review | 16:15 |
fungi | is what it tried to post | 16:15 |
frickler | hmm, o.k., maybe try to submit manually in gerrit and see what happens then? | 16:15 |
clarkb | frickler: ya is it safe enough on that change (no other changes trying to merge on that branch that you know of ?) | 16:15 |
frickler | yes | 16:16 |
clarkb | I guess it is a release note too so should be fairly independent | 16:16 |
clarkb | fungi: ^ did you want to try that if you still have your admin account ready? | 16:16 |
frickler | at least I see the submit button in the UI, but greyed out for lack of privilege. but that's better than earlier, where it was not shown | 16:17 |
fungi | i don't leave my account escalated, but pulling up the commands for submitting via cli now | 16:17 |
clarkb | looks like some changes have merged since All-Projects was updated | 16:17 |
clarkb | its possible we've got two independent things happening here? | 16:17 |
clarkb | actually hold on one sec | 16:19 |
clarkb | frickler: your submit post came ~40 seconds before fungi reported the push completed | 16:19 |
clarkb | can we try rechecking it again to make sure we just didn't race each other | 16:19 |
clarkb | fungi: ^ fyi on the holding off | 16:19 |
fungi | k | 16:19 |
fungi | i was still putting the submit command together anyway | 16:20 |
clarkb | frickler: I'll let you do that since removing and applying the workflow vote should be quicker than a recheck | 16:20 |
fungi | i have the manual submit via gerrit ssh api ready to go if not | 16:22 |
frickler | done | 16:24 |
clarkb | oh its behind another change | 16:24 |
clarkb | but ya I suspect this may have just been a race between the rules getting updated and zuul trying to submit | 16:24 |
clarkb | particularly since I see other changes merged since the rule update | 16:24 |
clarkb | 876795 should merge soon (it has a slow py37 node though) | 16:25 |
clarkb | nevermind that slow job timed out | 16:27 |
clarkb | there are two openstack releases changes that we can monitor that should merge in a few minutes | 16:30 |
opendevreview | Clark Boylan proposed opendev/system-config master: Fix boolean operator in submittableIf rules https://review.opendev.org/c/opendev/system-config/+/876995 | 16:34 |
clarkb | there is the docs update. I'm going to go check the project-config side of things more thoroughly now | 16:34 |
clarkb | I fetched the end of ianw's stack and grepped for ' or ' and ' and ' in submittableIf and copyCondition lines and found no results. ' AND ' has no results either but ' OR ' does | 16:37 |
clarkb | I think All-Projects was the only place this happened | 16:37 |
clarkb | those two releases changes merged: https://review.opendev.org/q/project:openstack/releases | 16:38 |
clarkb | I really need to resume normal morning bootstrapping. Do we want to manually enqueue things or do a notice indicating people should recheck? I'm reasonably confident things should work now | 16:39 |
fungi | i should be able to put together a query of affected changes and reenqueue them, i don't think it will take too long | 16:40 |
opendevreview | Merged opendev/system-config master: Fix boolean operator in submittableIf rules https://review.opendev.org/c/opendev/system-config/+/876995 | 16:45 |
fungi | that ^ merged fine | 16:47 |
fungi | gerrit query doesn't return data about the patch set number | 16:53 |
fungi | that's going to make building the enqueue list harder | 16:53 |
Clark[m] | Rechecks should be fine then | 16:54 |
Clark[m] | We don't need to overthink it | 16:54 |
fungi | seems there are 31 possibly affected changes | 16:55 |
fungi | `gerrit query 'is:open -is:wip label:Verified+2 after:2023-03-08' --no-limit` | 16:55 |
*** jpena is now known as jpena|off | 17:00 | |
fungi | https://paste.opendev.org/show/blApwBI85UvH5y5sm1gq/ | 17:06 |
fungi | going to run that now | 17:06 |
fungi | looks like all the affected changes i could find were in the openstack zuul tenant, so made the string manipulation simple | 17:07 |
fungi | and the script is done | 17:07 |
fungi | i see the changes in the gate pipeline now | 17:08 |
clarkb | thanks! | 17:09 |
fungi | and now i'm going to see about getting a shower and whatever else i missed by looking at the computer first thing when i woke up | 17:10 |
yoctozepto | what was the issue? | 17:10 |
yoctozepto | (with that not merging) | 17:10 |
yoctozepto | (just curious) | 17:10 |
fungi | yoctozepto: apparently when we switched to submit requirements late yesterday, there was a subtle typo in the config where boolean operators should have been written in all capital letters | 17:11 |
fungi | so the end result was that it parsed the word "and" as a required commit message substring and logically or'ed it with the conditions on either side of it | 17:11 |
clarkb | yoctozepto: https://review.opendev.org/c/opendev/system-config/+/876995 | 17:11 |
yoctozepto | ooh, interesting | 17:11 |
clarkb | fungi: logically AND'd not OR'd | 17:12 |
fungi | oh, could we have left the and out completely in that case? | 17:12 |
clarkb | so basically you had to have a code review +2 and no code review -2. But you also needed to have 'and' in the commit message | 17:12 |
clarkb | fungi: yes AND is the default. I think being explicit here is a good thing though so that we avoid any changes to those defaults | 17:12 |
yoctozepto | oh my | 17:12 |
yoctozepto | but how did the docs change fix the issue? :D | 17:13 |
clarkb | some changes did have 'and' in the commit message which is why they worked leading to all the confusion | 17:13 |
yoctozepto | I am getting the issue now, just not the fix | 17:13 |
clarkb | yoctozepto: it didn't. All-Projects is not automatically managed. fungi manually applied the documented change in the docs to All-Projects. But the docs capture the situation | 17:13 |
yoctozepto | yeah, that "and" is really nasty | 17:13 |
fungi | the all-projects config is managed by manually pushing commits into gerrit, not through automation, that document is just a reflection of what we push | 17:13 |
yoctozepto | oooh, I see | 17:14 |
yoctozepto | thanks | 17:14 |
fungi | anyway, we were doing this in preparation for upgrading to gerrit 3.7 which doesn't support the old way of specifying requirements, and 3.6 supports using the new syntax. 3.7 apparently is more lax about accepting lower-case boolean operators, but 3.6 unfortunately is not | 17:14 |
yoctozepto | what is all-projects btw? some hidden repo? | 17:15 |
clarkb | it is one of two central config repos for gerrit | 17:15 |
fungi | yoctozepto: yes, it's the repository in gerrit which holds the master acl that all other projects inherit from | 17:15 |
clarkb | and yes, I don't think it is very visible anymore | 17:15 |
yoctozepto | I see, pretty mysterious | 17:15 |
yoctozepto | thanks for your explanations | 17:16 |
fungi | my pleasure | 17:16 |
clarkb | I should find a shower too. I did manage to eat and drink something though so that's a win | 17:17 |
clarkb | how does this look #status notice Yesterday's change to Gerrit configs to use submit-requirements had a boolean logic bug. This has not been corrected and any changes that did not merge as a result can be rechecked. We have rechecked the changes we identified as being affected. | 17:19 |
clarkb | *this has now been corrected | 17:19 |
clarkb | Now we need to turn 'and' into a secret handshake somehow | 17:21 |
fungi | s/rechecked/reenqueued/ technically? | 17:21 |
clarkb | fungi: ++ | 17:21 |
clarkb | how does this look #status notice Yesterday's change to Gerrit configs to use submit-requirements had a boolean logic bug. This has now been corrected and any changes that did not merge as a result can be rechecked. We have reenqueued the changes we identified as being affected. | 17:21 |
clarkb | sorry that's an edited version with those two edits | 17:22 |
fungi | don't want people confused by a lack of recheck comments on those changes | 17:22 |
fungi | lgtm! | 17:22 |
clarkb | #status notice Yesterday's change to Gerrit configs to use submit-requirements had a boolean logic bug. This has now been corrected and any changes that did not merge as a result can be rechecked. We have reenqueued the changes we identified as being affected. | 17:22 |
opendevstatus | clarkb: sending notice | 17:22 |
-opendevstatus- NOTICE: Yesterday's change to Gerrit configs to use submit-requirements had a boolean logic bug. This has now been corrected and any changes that did not merge as a result can be rechecked. We have reenqueued the changes we identified as being affected. | 17:22 | |
clarkb | https://github.blog/2023-03-09-raising-the-bar-for-software-security-github-2fa-begins-march-13/ fyi this is likely to affect openstack's gerrit replication but also many of us have github accounts | 17:23 |
opendevstatus | clarkb: finished sending notice | 17:25 |
clarkb | https://review.opendev.org/c/openstack/neutron-dynamic-routing/+/876941 did end up merging | 17:28 |
clarkb | as did https://review.opendev.org/c/openstack/heat/+/876726/ | 17:28 |
clarkb | https://www.lifeprint.com/asl101/pages-signs/a/and.htm somehow I have ended up here and find it really interesting that the word "and" isn't super common in ASL | 17:36 |
yoctozepto | clarkb: as it can also be omitted in gerrit and multiple other similar places - makes sense to me | 17:39 |
yoctozepto | seemingly humans default to "and" | 17:40 |
clarkb | ya I guess when you list things it is purely extra information that can be omitted | 17:40 |
fungi | that is indeed truly interesting | 18:22 |
fungi | apparently it's fine as a conjunction though | 18:23 |
clarkb | I've managed to get myself distracted with passport things this morning after the submit-requirements stuff. I need to reset my day and will probably pop out for a long lunch but when I get back will start looking at gitea db stuff | 19:04 |
fungi | yeah, i'm now quite behind on other things i meant to be working on, but that's life in opendev | 19:05 |
opendevreview | Merged opendev/system-config master: Update gerrit image builds for 3.6.4 and 3.7.1 tags https://review.opendev.org/c/opendev/system-config/+/876233 | 19:56 |
ianw | omg so the only thing that merged yesterday had "and" in the commit msg?! jeez sorry about that | 20:24 |
ianw | none of the follow-ons have binary conditions (https://review.opendev.org/q/topic:gerrit-s-r-3.7) | 20:29 |
ianw | i made a similar mistake with is:True as well, being a bit too pythonic -- that has to be is:true | 20:30 |
Clark[m] | Yup we made a secret merge handshake :) | 20:30 |
yoctozepto | haha | 20:37 |
yoctozepto | btw, any plans to merge the NebulOuS patches? | 20:37 |
Clark[m] | I've been meaning to take a look then ran into gitea fun yesterday and Gerrit today. I'll try to look again today after doing the gitea stuff | 20:38 |
Clark[m] | But no need to wait for me either if others get to it first | 20:39 |
yoctozepto | ack | 20:40 |
fungi | in rax-ord news, things seem to have improved but there were still some ugly periods of what are probably more launch timeouts even with the 15-minute wait | 20:41 |
fungi | seeing a correlation with ~0 in use/high building with increased incidence of error node launches and higher time to ready on the graphs | 20:43 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add gitea13 and 14 to Gerrit replication https://review.opendev.org/c/opendev/system-config/+/877046 | 22:21 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add gitea13 and gitea14 to the haproxy load balancer https://review.opendev.org/c/opendev/system-config/+/877047 | 22:24 |
clarkb | infra-root ^ that first change should be good to go (but please do take a minute to double check the hosts look good to you) | 22:24 |
fungi | what do people use for listing listening sockets these days? seems netstat doesn't come installed any more | 22:24 |
clarkb | I'll WIP the second to prevent it from merging before we complete replication | 22:24 |
clarkb | fungi: `ss` is the new thing | 22:24 |
fungi | i see. unfortunate that it seems to assume people use a mile-wide terminal | 22:25 |
clarkb | also I transplanted the db from gitea09 to 13 and 14 to prove out the transitivity. Seems to have worked from what I can see | 22:27 |
fungi | yeah, seems to be working based on my tests | 22:28 |
opendevreview | Merged openstack/project-config master: gerrit/acl : submit-requirements for deprecated NoOp function https://review.opendev.org/c/openstack/project-config/+/875804 | 22:28 |
clarkb | ianw: I meant to ask if the earlier manage-projects run for the system-config inventory update did update the acls or if it waited for a subsequent run | 22:28 |
ianw | clarkb: it did seem to have to wait for the subsequent run | 22:29 |
ianw | https://zuul.opendev.org/t/openstack/build/5faa7ce2e20a4203a19e577bda607228/logs was where it eventually applied | 22:29 |
ianw | i've approved the next one now and will watch that too | 22:30 |
clarkb | cool fwiw I did double check that the project-config updates didn't have the same boolean fun | 22:30 |
ianw | oh, it just merged, 875804 | 22:30 |
clarkb | I could only find evidence of the problem in the all-projects side of things | 22:30 |
ianw | yeah, luckily no and's in that stack, i double checked too :) | 22:31 |
ianw | there's no queue for ^^^ to apply | 22:31 |
ianw | well just the last hourly job to finish | 22:32 |
clarkb | and the ors are all OR so we won't search foo AND or AND bar | 22:32 |
ianw | i did write up a little follow-on https://fosstodon.org/@ianw/109995327604555581 | 22:33 |
opendevreview | Ian Wienand proposed opendev/system-config master: Refactor adns variables https://review.opendev.org/c/opendev/system-config/+/876936 | 22:36 |
clarkb | I've approved the replication change. Hoping that if I can trigger that today then sometime tomorrow we can put them behind haproxy | 22:40 |
opendevreview | Merged opendev/system-config master: Add gitea13 and 14 to Gerrit replication https://review.opendev.org/c/opendev/system-config/+/877046 | 22:55 |
ianw | something like https://review.opendev.org/c/opendev/infra-specs/+/550550 seems right to me | 22:59 |
clarkb | I +2'd the nebulous changes but didn't approve them yet. First I wanted to double check if fungi wanted to review them (also we want the first to land and apply before landing the second). And I've got gitea13 and gitea14 bootstrapping right now and while I don't think adding a new project at this stage will create problems since those hosts are fully managed with the other giteas | 23:04 |
clarkb | they just don't have data I wanted to see if anyone else thought that might be a problem | 23:04 |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: A new diskimage-builder command for yaml image builds https://review.opendev.org/c/openstack/diskimage-builder/+/876245 | 23:07 |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: Switch run_functests.sh from disk-image-create to diskimage-builder https://review.opendev.org/c/openstack/diskimage-builder/+/876479 | 23:07 |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: Document diskimage-builder command https://review.opendev.org/c/openstack/diskimage-builder/+/876633 | 23:07 |
ianw | clarkb: i feel like it will be fine, as you say the creation should happen | 23:23 |
clarkb | I'm replicating bindep to 13 and 14 now (it's been my first replication project canary) | 23:33 |
clarkb | That appears successful. I've asked the plugin to replicate everything to those hosts now | 23:36 |
fungi | i can take a look shortly | 23:42 |