opendevreview | Merged opendev/system-config master: rax: remove identity_api_version 2 pin https://review.opendev.org/c/opendev/system-config/+/865351 | 00:01 |
---|---|---|
ianw | ... so yeah, rerunning seems to "update" the same 24 changesets in neutron | 00:03 |
clarkb | fwiw I think if we are failing to update them it isn't a critical failure. Those changes will just need someone to manually add those votes if they are still active? | 00:04 |
clarkb | it might be worth an email to the gerrit mailing list asking them about this behavior and they can hopefully point us to methods for checking if it applied properly? | 00:05 |
ianw | yep, just seeing if i can glean anything from backend logs | 00:06 |
fungi | any obvious commonalities between a random sample of those 24 changes? | 01:07 |
fungi | corvus: i get a crash when trying to open basically any change with that gertty patch... sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: comment_1.unresolved | 01:45 |
fungi | am i missing a db migration or just need to start with a clean slate? | 01:45 |
fungi | rolling back to tip of master works again | 01:47 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Improve launch-node deps and fix script bugs https://review.opendev.org/c/opendev/system-config/+/865320 | 01:54 |
fungi | clarkb: ^ addressed your comment | 01:54 |
ianw | Change 84223 may not exceed 1000 updates. It may still be abandoned, submitted and you can add/remove reviewers to/from the attention-set. To continue working on this change, recreate it with a new Change-Id, then abandon this one. | 02:36 |
ianw | so that's one problem | 02:36 |
fungi | oh, right, that | 02:37 |
* fungi shudders | 02:38 | |
ianw | before that is "Error in a slice of project openstack/neutron, will retry and skip corrupt meta-refs [CONTEXT request="SSH" ]" | 02:39 |
ianw | it almost feels like this change has caused an entire "slice" to be skipped, maybe? | 02:40 |
ianw | there's two other errors | 02:40 |
ianw | cannot check change kind of new patch set c095827ea8f82a286c84da1ad3d1aab38a2e1328 in openstack/neutron | 02:40 |
ianw | cannot check change kind of new patch set f24a40fa5e727f176b63233c6b75476213ffc506 in openstack/neutron | 02:40 |
ianw | https://groups.google.com/g/repo-discuss/c/S8shRzWnFrQ/m/0ADVRI7ICAAJ <-- email about issues | 03:17 |
*** pojadhav|out is now known as pojadhav|ruck | 04:23 | |
*** yadnesh|away is now known as yadnesh | 04:31 | |
*** frenzy_friday|rover is now known as frenzy_friday|rover|doc | 06:56 | |
frickler | infra-root: reviews on https://review.opendev.org/c/zuul/zuul-jobs/+/866055 would be nice | 07:09 |
*** yadnesh is now known as yadnesh|afk | 07:50 | |
*** frenzy_friday|rover|doc is now known as frenzy_friday|rover | 08:17 | |
*** jpena|off is now known as jpena | 08:25 | |
*** yadnesh|afk is now known as yadnesh | 08:41 | |
*** dviroel_ is now known as dviroel | 10:47 | |
*** rlandy|out is now known as rlandy | 11:14 | |
*** ysandeep is now known as ysandeep|PTO | 12:38 | |
*** dasm|off is now known as dasm | 12:43 | |
opendevreview | Merged zuul/zuul-jobs master: Use recent node version for markdownlint job https://review.opendev.org/c/zuul/zuul-jobs/+/866055 | 12:56 |
*** frenzy_friday|rover is now known as frenzy_friday|rover|lunch | 12:58 | |
*** yadnesh is now known as yadnesh|afk | 13:02 | |
*** yadnesh|afk is now known as yadnesh | 13:32 | |
*** pojadhav|ruck is now known as pojadhav|dr_appt | 13:41 | |
*** frenzy_friday|rover|lunch is now known as frenzy_friday|rover | 13:50 | |
opendevreview | James E. Blair proposed ttygroup/gertty master: WIP: support inline comment threads https://review.opendev.org/c/ttygroup/gertty/+/860588 | 14:52 |
opendevreview | Cedric Jeanneret proposed opendev/system-config master: Correct how ansible-galaxy is proxified https://review.opendev.org/c/opendev/system-config/+/866175 | 14:53 |
Tengu | fungi: -^^ fyi | 14:53 |
corvus | fungi: i forgot the git add. should be gtg now. but you may want to back up your db in case you need to roll back. | 14:54 |
Tengu | locally tested [without TLS], it seems to work as expected, I could at least fetch some collection, and follow things in my local httpd logs. | 14:54 |
*** yadnesh is now known as yadnesh|away | 14:59 | |
*** dviroel is now known as dviroel|lunch | 15:17 | |
fungi | thanks corvus! | 15:22 |
fungi | Tengu: that's excellent news | 15:23 |
Tengu | fungi: yeah - bit of a headache to get it right, especially with ansible-galaxy being dumb as a donkey, but at least: it's apparently working good. | 15:23 |
Tengu | fungi: not sure if adding some more tests may help? | 15:24 |
dmsimard | Hello o/ There is an account for RDO's third party CI which is under dmsimard+rdothirdparty@redhat.com that has been lost to time since I haven't been at Red Hat for a while. Would it be possible to change the email for that account to softwarefactory-operations-team@redhat.com ? | 15:28 |
Tengu | fungi: hmmm yeah. maybe adding some (better) tests may help. Now I have a good overview of the queries made by ansible-galaxy, and should be able to mimic them using curl/wget/others. | 15:28 |
opendevreview | Cedric Jeanneret proposed opendev/system-config master: Correct how ansible-galaxy is proxified https://review.opendev.org/c/opendev/system-config/+/866175 | 15:42 |
Tengu | let's see. | 15:42 |
fungi | dmsimard: might be better to ask in #rdo since opendev doesn't really get involved in that sort of thing | 15:51 |
frickler | fungi: I read that as request about an opendev gerrit account | 15:55 |
dmsimard | yeah it's a gerrit account, they're the ones asking me about it since the password has been lost but they can't reset it | 15:56 |
dmsimard | or so is my understanding | 15:56 |
frickler | but why not simply create a new account? | 15:57 |
dmsimard | I don't have a strong opinion, I suppose it's easier to keep the same account if possible ¯\_(ツ)_/¯ | 16:00 |
fungi | dmsimard: frickler: oh! thanks, i definitely misread the comment | 16:02 |
fungi | i think we can probably update that via the gerrit rest api, i'll have to take a look after the tc meeting | 16:03 |
dmsimard | fungi: that would be great and much appreciated, it's not an emergency, thanks :) | 16:04 |
Clark[m] | We cannot change the openid | 16:07 |
Clark[m] | We can only delete openids currently not change them unless we take an outage | 16:08 |
Clark[m] | So depending on what lost password means this may not be possible | 16:08 |
fungi | oh, by "that" i meant changing the preferred e-mail address. but yes we can't really recover the account and hand it to someone else | 16:10 |
Clark[m] | Yes we need more specifics as to what the problem is cc dmsimard | 16:11 |
fungi | dmsimard: though if you still have login information for the launchpad account used, you could in theory give that to whoever maintains the rdo ci system and they could use it to log into gerrit and update the accounts both there and in launchpad | 16:12 |
dmsimard | yeah I see what you mean and that makes sense, I'll let them know and report back | 16:12 |
dmsimard | fungi: *nod*, if I still had access to it that's what I would have done | 16:13 |
dmsimard | I handed over credentials but the email was never updated :( | 16:14 |
fungi | oh, well maybe they still have it in that case | 16:14 |
opendevreview | Cedric Jeanneret proposed opendev/system-config master: Correct how ansible-galaxy is proxified https://review.opendev.org/c/opendev/system-config/+/866175 | 16:14 |
fungi | if they have a working ssh key, they may also be able to use the ssh api to update things via set-account | 16:16 |
fungi | dmsimard: Clark[m]: ^ | 16:16 |
*** dviroel|lunch is now known as dviroel | 16:19 | |
*** pojadhav|dr_appt is now known as pojadhav|out | 16:27 | |
*** marios is now known as marios|out | 16:27 | |
clarkb | fwiw I would avoid using a personal account for third party ci. In this case my hunch is that the openid login is associated with that personal account and in that case the account should be shutdown and a new one tied to the org should be used | 16:32 |
clarkb | jrosser: hey, for your bfv instance rescue setup is the backing volume system ceph? If so what were the special image metadata settings and values that you set? Is that set on the instance you are rescuing's image too or just the image used to rescue? | 16:32 |
jrosser | clarkb: yes it was ceph - i think we have the values set on all the images so you can then use anything to rescue anything else | 16:33 |
jrosser | clarkb: then the general purpose images we upload for users all have these properties https://paste.opendev.org/show/bxxFEEUWeUrkIVlBSGrw/ | 16:35 |
clarkb | thanks. I'm wondering if the rescued instance also needs those settings for things to work correctly | 16:36 |
opendevreview | Cedric Jeanneret proposed opendev/system-config master: Correct how ansible-galaxy is proxified https://review.opendev.org/c/opendev/system-config/+/866175 | 16:53 |
*** jpena is now known as jpena|off | 17:05 | |
clarkb | no new cert errors today as expected | 17:37 |
opendevreview | Merged opendev/system-config master: Improve launch-node deps and fix script bugs https://review.opendev.org/c/opendev/system-config/+/865320 | 17:51 |
fungi | excellent | 18:11 |
fungi | unfortunately the mm3 deployment still hasn't completed since it was skipped due to acme.sh errors at original deployment | 18:12 |
fungi | what's the best way to get it to finish, merge a minimal edit to something or manually run ansible or...? | 18:13 |
clarkb | fungi: you should be able to manually rerun ansible against the lists3 playbook | 18:14 |
clarkb | oh that doesn't do le though | 18:14 |
clarkb | so you'd need to manually rerun the le playbook too (ianw did that recently for other hosts and has commands in scrollback somewhere) | 18:15 |
clarkb | it's possible that step completed overnight due to the daily runs though | 18:15 |
clarkb | and you just need to run the lists3 playbook now | 18:15 |
clarkb | I guess the lists3 playbook isn't in the daily list (we should fix that if not and if it is then it may have deployed more than you expect) | 18:15 |
fungi | oh, actually it should have run in periodic but failed: https://zuul.opendev.org/t/openstack/build/6c9789bb200e4cca9255ebe8c12e4a97 | 18:17 |
fungi | AnsibleUndefinedVariable: 'mailman3_db_password' is undefined | 18:18 |
fungi | whoops. missed a step ;) | 18:18 |
fungi | would it make more sense to add things for this under host_vars or group_vars? | 18:19 |
fungi | i guess for the test strings we put them under host_vars | 18:20 |
frickler | is that a local db? would we have a different db when running a second mm3 host? | 18:20 |
clarkb | frickler: it is a local db. We shouldn't need a second mm3 host | 18:21 |
fungi | that's the question. not that we plan to run more than one mm3 host unless we're moving from one to another | 18:21 |
frickler | we might if we migrate to ubuntu 26.04 in the future | 18:21 |
fungi | so if there are ever two they'll only coexist for a short time | 18:21 |
fungi | the question is whether those need to share passwords or not | 18:21 |
clarkb | in this case I think group vars is probably fine? | 18:21 |
clarkb | unless you are worried our firewalls will break and allow external access from a secondary host accidentally | 18:22 |
frickler | I would prefer host_vars anyway, but not too strongly | 18:23 |
clarkb | ya host vars is probably most correct. Just more effort for the future | 18:23 |
clarkb | I'm happy either way | 18:23 |
fungi | i guess https://opendev.org/opendev/system-config/src/branch/master/playbooks/zuul/files/host_vars/lists99.opendev.org.yaml#L1-L9 are the things i need to generate | 18:23 |
fungi | well, generate and/or set | 18:24 |
fungi | we don't seem to set any of those for production in system-config | 18:24 |
fungi | (not even username/email) | 18:24 |
clarkb | we might be able to set a couple of them but ya I tried to make it clear which things are mm3 specific with that prefix | 18:24 |
fungi | added | 18:27 |
frickler | what host is 64.94.110.11? whois says internap, but no rdns https://opendev.org/opendev/system-config/src/branch/master/playbooks/zuul/files/host_vars/lists99.opendev.org.yaml#L67 | 18:29 |
clarkb | I don't know, that config came out of the original server's exim config | 18:29 |
fungi | frickler: oh, i think that got cargo-culted from before old lists.o.o was in puppet | 18:29 |
fungi | could have sworn i cleaned that up already | 18:30 |
fungi | looks like we have it cargo-culted into several other configs too if you git grep for it | 18:30 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Clean up an old raw IP address from our MTAs https://review.opendev.org/c/opendev/system-config/+/866203 | 18:34 |
fungi | frickler: clarkb: ^ | 18:34 |
fungi | i think i asked corvus and mordred about that a while back too, and neither of them remembered why it was there either | 18:34 |
clarkb | fungi: I guess it is already out of the prod mm3 config | 18:36 |
fungi | yeah, i must have cleaned it up in our new config and that's what i was remembering | 18:36 |
fungi | won't be in git history because it was removed before we approved the change | 18:37 |
fungi | infra-root: last call for edits on the mm3 migration announcement at https://etherpad.opendev.org/p/mm3migration | 18:42 |
fungi | i added a sentence to the final paragraph to clarify about future maintenance for migrating the other sites | 18:42 |
opendevreview | Merged openstack/project-config master: Add an Ubuntu FIPS testing token https://review.opendev.org/c/openstack/project-config/+/861457 | 19:22 |
ianw | LGTM! | 19:40 |
ianw | clarkb: yeah, and doesn't explain the 24 changes that keep updating | 19:43 |
ianw | although it's probably worth bumping the comment limit, running against neutron and seeing if it goes away | 19:43 |
ianw | that way we can file a bug that "comments with more than limit cause entire slice to re-run endlessly" (presuming it does) | 19:44 |
clarkb | ianw: ya but that requires an outage | 19:44 |
clarkb | it does look like luca agrees they shouldn't error in this case. Did you want to followup with them? | 19:45 |
ianw | yeah, i'll ask if it would cause the recurring updates | 19:45 |
clarkb | ianw: it does look like 84223 is a merged change too | 19:46 |
clarkb | but reading luca's response we might also want to check if any of the changes are active and if not we can probably get away with this as is | 19:47 |
clarkb | if there are active changes that are sad we can ask reviewers to manually apply their votes on the latest patchsets as a workaround | 19:47 |
ianw | the 24 changes that may or may not be fixed are | 19:56 |
ianw | https://paste.opendev.org/show/bynUtJEe4BYfzGVGM4I7/ | 19:56 |
clarkb | ianw: I think the way to check is likely going to involve inspecting the notedb content for the changes | 19:57 |
clarkb | as luca mentioned a new note comment is added to capture the votes on the patchset so I think we need to look for those after identifying which votes we are trying to carry forward | 19:57 |
clarkb | ianw: re that paste it looks a lot like the other successful run that you shared (for dib iirc). How do we know if others haven't failed? | 19:58 |
clarkb | I guess the hint is that only neutron is rerunning each time? | 19:58 |
opendevreview | Merged openstack/project-config master: Deprecate OpenStack-Ansible rsyslog roles https://review.opendev.org/c/openstack/project-config/+/863079 | 20:01 |
ianw | clarkb: yeah, when rerunning over all projects, it's only this batch that appears again | 20:03 |
ianw | i just went through them all, they are all either abandoned or closed | 20:04 |
ianw | but interestingly, i just noticed 84223 is in there *twice* | 20:04 |
clarkb | that might explain the double error we get about the missing sha | 20:05 |
clarkb | I think there was some wonder over whether or not multiple changes had shared an object | 20:05 |
ianw | f24a40fa5e727f176b63233c6b75476213ffc506 doesn't appear to be a ps in 84223 | 20:09 |
clarkb | it might be an object and not a commit? | 20:11 |
ianw | it does say "Cannot check change kind of new patch set f24a40fa5e727f176b63233c6b75476213ffc506" but that also may mean nothing | 20:12 |
opendevreview | Merged openstack/project-config master: Add repository for Skyline installation by OpenStack-Ansible https://review.opendev.org/c/openstack/project-config/+/863165 | 20:13 |
ianw | it seems to send me to https://review.opendev.org/c/openstack/neutron/+/22128 before it barfs | 20:13 |
fungi | yeah, i had noted that earlier in the other channel where the error came up | 20:14 |
ianw | c095827ea8f82a286c84da1ad3d1aab38a2e1328 sends me to 22128 as well | 20:14 |
fungi | i wonder if they're all broken revisions for the same change | 20:15 |
opendevreview | Merged openstack/project-config master: Add the cinder-infinidat charm to Openstack charms https://review.opendev.org/c/openstack/project-config/+/863954 | 20:15 |
ianw | 22128 *doesn't* appear in the list, updated or not | 20:16 |
opendevreview | Merged openstack/project-config master: Add the infinidat-tools charm to Openstack charms https://review.opendev.org/c/openstack/project-config/+/863955 | 20:16 |
opendevreview | Merged openstack/project-config master: Add manila-infinidat charm to OpenStack charms https://review.opendev.org/c/openstack/project-config/+/863957 | 20:16 |
fungi | ianw: according to git notes, f9aba49e05af6f69a6fca8b61c2e7a14d9b78e11 is the commit which merged for that change | 20:21 |
fungi | https://opendev.org/openstack/neutron/commit/f9aba49e05af6f69a6fca8b61c2e7a14d9b78e11 | 20:22 |
fungi | 10 years ago | 20:22 |
fungi | or nearly (early 2013) | 20:23 |
ianw | memories ... like the corners of my mind ... :) | 20:24 |
ianw | i think we will never find f24a40fa5e727f176b63233c6b75476213ffc506 again | 20:25 |
ianw | is it just a co-incidence that 84223 has been hashed into this "slice" of changes to update? why is 22128 not appearing in the list? does any of this matter? | 20:26 |
clarkb | hrm maybe 22128 is in the slice and it is failing while gerrit tries to process it | 20:33 |
clarkb | and that causes things to fail enough that it tries to retry every time. fwiw if every change in that list is closed (apparently so) then i think the impact is basically nil according to what luca has said | 20:34 |
opendevreview | Merged openstack/project-config master: And ansible-role-proxysql repo to zuul jobs https://review.opendev.org/c/openstack/project-config/+/817272 | 20:35 |
opendevreview | Merged openstack/project-config master: Telemetry: Switch back to launchpad from storyboard https://review.opendev.org/c/openstack/project-config/+/805492 | 20:35 |
opendevreview | Merged openstack/project-config master: Add feature branch notifications to openstack-sdks https://review.opendev.org/c/openstack/project-config/+/799323 | 20:35 |
ianw | yeah, the only testing option i see is quickly shutting down gerrit, bumping the max comments/thread and trying to copy-approvals on neutron and see what happens | 21:03 |
ianw | if we think that's worthwhile i can try this afternoon when it's quiet | 21:03 |
*** dviroel is now known as dviroel|afk | 21:04 | |
ianw | otherwise i can just file an issue about it, but since all the changes being reported are closed, we can assume we're ok | 21:04 |
*** Guest305 is now known as atmark | 21:07 | |
clarkb | infra-root I'm testing nodepool with new openstacksdk (actually running the test setup with known good openstacksdk first to make sure I don't confuse problems in test setup with sdk update issues) and I'm noticing that we may need to do a node cleanup and the inmotion cloud has leaked stuff due to placement I think (though there are a number of building servers too..) | 21:20 |
clarkb | all that to say we should dig into that. I'll try to take a better accounting once I'm done with nodepool testing | 21:21 |
clarkb | ok testing against iweb mtl01 with openstacksdk 0.103.0 produces: keystoneauth1.exceptions.catalog.EndpointNotFound: public endpoint for compute service in mtl01 region not found | 21:25 |
clarkb | this error does not occur when running against rax | 21:25 |
clarkb | ok the iweb issue is an issue with the vendor data in clouds.yaml | 21:28 |
clarkb | I'll try to manually write that content out and not rely on the profile | 21:28 |
corvus | clarkb: which are the old/new version numbers in question, just for those of us watching along at home? | 21:28 |
clarkb | corvus: old is openstacksdk==0.61.0 new is openstacksdk==0.103.0 | 21:30 |
clarkb | rax worked | 21:30 |
clarkb | for booting and deleting a node | 21:30 |
clarkb | I think I've got iweb working via clouds.yaml update. I'll write a change to update our iweb configs to work around this. Then assuming we're ok with updating nodepool to try 0.103 all over the place that might be worthwhile? | 21:33 |
clarkb | I feel like doing exhaustive testing of all the clouds and image uploads etc might be more effort than it is worth? We can update. Then check if we boot nodes and upload images? I'm open to more testing though just the list is long if we try to do it all | 21:34 |
corvus | clarkb: i think you've checked the clouds most likely to be affected, so at least we know if we update the launchers we won't faceplant. if there are uploads that fail, we'll continue using old images, so risk isn't crazy there. starting to try it in prod sgtm. | 21:36 |
clarkb | good point re image upload failures being low impact | 21:36 |
opendevreview | Clark Boylan proposed opendev/system-config master: Update iweb clouds.yaml for old and new openstacksdk https://review.opendev.org/c/opendev/system-config/+/866220 | 21:38 |
clarkb | corvus: ^ thats the fix on our side | 21:39 |
clarkb | infra-root in rax-dfw we appear to have leaked a number of old instances (I found one from 2020 I manually deleted just to check if it would delete) | 21:56 |
clarkb | I'm a bit confused as to why nodepool leaked instance cleanup wouldn't take care of this | 21:56 |
clarkb | they don't show up in the nodepool listing (for the ones I've spot checked) so they should be cleaned up by leak cleanup? | 21:58 |
clarkb | dfw, iad, ord all seem to have the same situation | 21:59 |
corvus | clarkb: maybe they were missing nodepool metadata? | 21:59 |
clarkb | oh right we check that to be sure we only leak cleanup instances created by nodepool. I can check that | 21:59 |
clarkb | cleaning this up should be good for capacity in nodepool. Then I/we also need to look at cleaning up inmotion's sadness | 22:00 |
clarkb | all this made more important by the impending iweb removal | 22:00 |
clarkb | corvus: that was a good hunch. Currently active nodes have properties set with the nodepool info but spot checking these other nodes they do not | 22:02 |
clarkb | and that would prevent leak cleanup from cleaning them up. In that case I guess it must've been an issue in the cloud or a temporary bug in nodepool? and we should go ahead and manually clean up | 22:02 |
corvus | clarkb: yep one of those sounds most likely and i think manual cleanup is called for | 22:08 |
clarkb | thanks for confirming. I'll start work on that shortly | 22:09 |
ianw | clarkb: instances as in running vm's right? I have https://review.opendev.org/c/opendev/system-config/+/562510 that cleaned up a ton of leaked images and blobs quite a while ago, but iirc we identified and fixed that issue | 22:17 |
*** dasm is now known as dasm|off | 22:23 | |
clarkb | yes these are running VMs | 22:23 |
clarkb | I'm starting in ord, taking the list of likely leaked nodes then will check them against what nodepool thinks its got. Any that nodepool doesn't know about will be removed | 22:30 |
clarkb | that process is running now for rax-ord | 22:33 |
clarkb | a few of these have servers with the same name too and will need to be removed via uuid | 22:35 |
clarkb | ianw: can I delete ianw-f34-test in rax-ord too? | 22:36 |
ianw | umm, yes :) | 22:39 |
ianw | even a script sending an email alerting us of very very old instances in the CI clouds would probably be helpful, it's a bit embarrassing i left that behind | 22:39 |
clarkb | ok done. There are three relatively recent nodes (~1 month old) that appear stuck in building that do have properties and updated at fields that indicate nodepool is trying to delete them. Other than those rax-ord looks clean. Now rax-iad | 22:42 |
ianw | clarkb: while the script is in a loop -- i think my proposal to wrap up copy-approvals is to bump the comment limit when gerrit is quiet and re-run against neutron, and see if the 24 repeated updates goes away. if it does, that was the cause. either way i file an issue | 22:45 |
ianw | if it doesn't then it seems like the missing object on 84223 is the cause. nothing we can realistically do about that, but considering all of the 24 changes are abandoned or closed, we'll just leave it alone | 22:45 |
ianw | assuming we don't have to worry about it, even if it didn't apply correctly | 22:46 |
clarkb | sounds good. According to luca we don't have to worry about it because all it does is add a new comment, it doesn't change the actual schema | 22:48 |
ianw | yep, i think for purposes of our upgrade, the todo is done. we'll just try to report upstream so it's easier for the next person trying :) | 22:49 |
clarkb | in iad we've got more nodes that nodepool is trying to delete and some nodes that nodepool doesn't know to delete | 22:51 |
clarkb | I'm going to try manually deleting all of them to see if that makes a difference | 22:51 |
*** rlandy is now known as rlandy|out | 23:01 | |
clarkb | ianw: and ianw-xenial in dfw is good to go? | 23:03 |
ianw | clarkb: basically anything with my name on it is gtg :) | 23:10 |
ianw | i'm not actively using any | 23:10 |
clarkb | ack | 23:12 |
clarkb | ianw: hrm it says that instance is locked. I don't know what that means but it won't let me delete it | 23:16 |
clarkb | spot checking iweb and ovh they look fine. Whatever caused this to happen also seems to coincide with colliding node names (those I had to delete by uuid) so I suspect it was something on our side | 23:19 |
ianw | i can't imagine i meant to lock it | 23:20 |
clarkb | there is a server unlock command | 23:21 |
clarkb | I've put digging into the inmotion stuff on my todo list for tomorrow. It will require me to page in a fair bit of stuff so I don't want to tackle that now | 23:22 |
clarkb | I see nodes properly held through zuul for corvus (zuul-tox-py38, frickler devstack-plugin-ceph, and ade_lee cinder-tempest-plugin-lvm-lio-barbican-fips) | 23:23 |
ianw | because i'm poking at the error_logs, i also notice "hook[patchset-created] exited with error status: 2" seems to be happening constantly | 23:24 |
clarkb | ianw: I want to say that is a known issue and due to us not having access to the db anymore | 23:24 |
clarkb | for the welcome message maybe when you push for first patch? | 23:24 |
clarkb | not running things that don't work is a good idea though | 23:24 |
ianw | i feel like that should be commented out -> https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/gerrit/files/hooks/patchset-created | 23:24 |
clarkb | oh I guess we addressed that then. Something else | 23:25 |
ianw | i wonder if some of the python updates affected it | 23:26 |
clarkb | #status log Cleaned up leaked nodepool instances in rax. Nodepool couldn't clean them up automatically due to missing metadata. | 23:26 |
opendevstatus | clarkb: finished logging | 23:26 |
clarkb | ianw: I thought gerrit recorded the stdout/stderr along with those errors and put them in the error_log | 23:26 |
ianw | seems like we need to up the logging to debug level | 23:27 |
ianw | this might be something useful to do when i restart to debug the max comment limit thing | 23:27 |
clarkb | oof that is probably going to be very chatty | 23:27 |
clarkb | but if it is temporary then ya that might not be too big of a deal | 23:27 |
clarkb | configuring gerrit logging is its own set of expertise though :( | 23:27 |
ianw | https://gerrit-review.googlesource.com/Documentation/cmd-logging-set-level.html | 23:28 |
ianw | i wonder if this can work | 23:28 |
ianw | com.googlesource.gerrit.plugins.hooks.HookTask is the log id, i think | 23:29 |
ianw | com.googlesource.gerrit.plugins.hooks.HookTask: INFO | 23:29 |
ianw | ok, i set that to DEBUG ... TIL ... let's see if anything comes up in logs | 23:31 |
clarkb | ya that is new to me as well. | 23:31 |
ianw | output: update-blueprint: error: unrecognized arguments: --change-owner-username ... blah | 23:32 |
ianw | it seems to be an argument per https://opendev.org/opendev/jeepyb/src/branch/master/jeepyb/cmd/update_bug.py#L306 | 23:32 |
clarkb | https://opendev.org/opendev/jeepyb/commit/6eca4077d02a9c2198bb405669d3738bea21861e is it possible that we aren't running an up to date jeepyb there? | 23:34 |
ianw | i'm thinking so | 23:34 |
clarkb | oh wait | 23:34 |
clarkb | its this script https://opendev.org/opendev/jeepyb/src/branch/master/jeepyb/cmd/update_blueprint.py | 23:35 |
clarkb | and that one doesn't have the argument | 23:35 |
clarkb | I think we half updated? | 23:35 |
clarkb | ya I think that must be it. | 23:36 |
clarkb | side note jeepyb changes trigger gerrit image rebuilds. We may need to double check that integration is still good (though I try to get it up to date after we upgrade gerrit) | 23:36 |
ianw | the container has jeepyb-0.0.1.dev485.dist-info/ | 23:36 |
clarkb | ianw: ya I think the issue is in update-blueprint which didn't get updated | 23:37 |
clarkb | update-blueprint was disabled previously because it needed the db then melwitt updated it to use the rest api but it must've never worked because we didn't have the correct arglist in the command? | 23:37 |
ianw | ahhh, so basically port https://opendev.org/opendev/jeepyb/commit/6eca4077d02a9c2198bb405669d3738bea21861e | 23:38 |
melwitt | clarkb: I noticed that it didn't work but didn't get time to figure out what's wrong with it | 23:38 |
ianw | i guess we found it :) | 23:39 |
melwitt | if so, that's great news :) | 23:39 |
opendevreview | Ian Wienand proposed opendev/jeepyb master: update_blueprint: handle recent gerrit arguments https://review.opendev.org/c/opendev/jeepyb/+/866237 | 23:43 |
clarkb | looks like jeepyb is configured to build a gerrit 3.5 image | 23:45 |
clarkb | which is in sync with what we are running so that bit should be good | 23:45 |
clarkb | ianw: small thing on that change | 23:46 |
opendevreview | Ian Wienand proposed opendev/jeepyb master: update_blueprint: handle recent gerrit arguments https://review.opendev.org/c/opendev/jeepyb/+/866237 | 23:49 |
ianw | thanks, i knew i couldn't get it without at least one typo! | 23:49 |
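
The `update-blueprint: error: unrecognized arguments: --change-owner-username` message and the hook's `exited with error status: 2` discussed above are consistent with how Python's `argparse` behaves when a caller passes an option the parser never declared: it prints the error and exits with status 2. The following is a minimal sketch of that failure mode and of the "declare the missing option" style of fix. The option set, the `build_parser` and `exit_status` helpers, and the sample argument list are all hypothetical illustrations, not the actual jeepyb code or the exact arguments Gerrit passes.

```python
import argparse


def build_parser(accept_owner_username):
    """Build a toy hook parser; the option set is illustrative, not jeepyb's."""
    parser = argparse.ArgumentParser(prog="update-blueprint")
    parser.add_argument("hook")
    parser.add_argument("--change", default=None)
    parser.add_argument("--change-owner", default=None)
    if accept_owner_username:
        # Declaring the option is what stops argparse from rejecting it.
        parser.add_argument("--change-owner-username", default=None)
    return parser


def exit_status(parser, argv):
    """Return 0 if argv parses, otherwise the status argparse would exit with."""
    try:
        parser.parse_args(argv)
    except SystemExit as exc:
        return int(exc.code)
    return 0


# Hypothetical argument list of the kind a hook runner might pass.
ARGV = [
    "patchset-created",
    "--change", "I0123",
    "--change-owner", "Some One",
    "--change-owner-username", "someone",
]

old_status = exit_status(build_parser(accept_owner_username=False), ARGV)
new_status = exit_status(build_parser(accept_owner_username=True), ARGV)
print(old_status, new_status)
```

An alternative design would be `parse_known_args()`, which tolerates unknown options instead of failing; that keeps a hook working when the caller grows new arguments, at the cost of silently ignoring typos.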
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!
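
The leaked-instance cleanup described in the log (servers that nodepool does not track, some without nodepool metadata, some sharing a name) and the suggestion of a script that flags very old instances can be sketched as plain data filtering. This is a rough illustration only: the server dictionaries, the `nodepool_` metadata prefix, and the function names are assumptions made for the example, and no cloud API is called. A real report would first need to fetch the server list and nodepool's own node list.

```python
from datetime import datetime, timedelta, timezone


def classify_untracked(servers, tracked_ids, marker_prefix="nodepool_"):
    """Split servers unknown to nodepool into (auto_cleanable, needs_manual).

    A server carrying metadata keys starting with marker_prefix (an assumed
    convention) is something automated leak cleanup could recognise; one
    without such metadata needs a human decision. UUIDs are returned because
    names may collide.
    """
    auto, manual = [], []
    for server in servers:
        if server["id"] in tracked_ids:
            continue
        has_marker = any(key.startswith(marker_prefix) for key in server["metadata"])
        (auto if has_marker else manual).append(server["id"])
    return auto, manual


def older_than(servers, now, max_age):
    """Return UUIDs of servers created more than max_age before now."""
    return [s["id"] for s in servers if now - s["created"] > max_age]


UTC = timezone.utc
# Made-up sample data; two servers deliberately share a name.
SERVERS = [
    {"id": "uuid-1", "name": "node-0001", "metadata": {},
     "created": datetime(2020, 3, 1, tzinfo=UTC)},
    {"id": "uuid-2", "name": "node-0001", "metadata": {"nodepool_node_id": "0001"},
     "created": datetime(2022, 11, 20, tzinfo=UTC)},
    {"id": "uuid-3", "name": "someone-test", "metadata": {},
     "created": datetime(2021, 6, 1, tzinfo=UTC)},
    {"id": "uuid-4", "name": "node-0002", "metadata": {"nodepool_node_id": "0002"},
     "created": datetime(2022, 11, 29, tzinfo=UTC)},
]
TRACKED = {"uuid-4"}
NOW = datetime(2022, 11, 30, tzinfo=UTC)

auto, manual = classify_untracked(SERVERS, TRACKED)
stale = older_than(SERVERS, NOW, timedelta(days=90))
print("auto:", auto, "manual:", manual, "stale:", stale)
```

The `manual` list deliberately mixes true leaks with hand-launched test servers, since neither carries the marker metadata; the sketch only produces a report and leaves the delete decision to an operator.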