opendevreview | Ian Wienand proposed opendev/system-config master: backups: add review02.opendev.org https://review.opendev.org/c/opendev/system-config/+/797564 | 00:01 |
---|---|---|
opendevreview | Ian Wienand proposed opendev/zone-opendev.org master: Update review.opendev.org to review02.opendev.org https://review.opendev.org/c/opendev/zone-opendev.org/+/798244 | 00:04 |
ianw | ugh that made me realise we want to make sure we move review02 out of review-staging group appropriately too | 00:32 |
opendevreview | Ian Wienand proposed opendev/system-config master: review02: move out of staging group https://review.opendev.org/c/opendev/system-config/+/797563 | 01:33 |
opendevreview | Ian Wienand proposed opendev/system-config master: backups: add review02.opendev.org https://review.opendev.org/c/opendev/system-config/+/797564 | 01:33 |
Clark[m] | ianw: are there bits in the review and Gerrit groups that we need on the server like ssh keys or LP credentials that we might need early in the migration? | 03:08 |
Clark[m] | Your catch of the group move made me think of that. If so we might want to add them to the staging group now and get them landed and in place early | 03:09 |
ianw | Clark[m]: i think that is all pretty much there, as i basically copied the review01 hosts .yaml file on bridge to review02 | 03:09 |
Clark[m] | Ok. Thought I would double check | 03:10 |
ianw | ++ :) i'm feeling better about the plan sorting a few of these bits out | 03:13 |
Clark[m] | Worst case we'll do a manual sync and then fix Ansible after :) | 03:15 |
*** ykarel|away is now known as ykarel | 04:32 | |
opendevreview | Ian Wienand proposed opendev/system-config master: Remove paste01.openstack.org https://review.opendev.org/c/opendev/system-config/+/800879 | 04:43 |
opendevreview | Ian Wienand proposed opendev/system-config master: Remove paste01.openstack.org https://review.opendev.org/c/opendev/system-config/+/800879 | 04:44 |
opendevreview | Ian Wienand proposed opendev/system-config master: Add paste01.opendev.org to backup https://review.opendev.org/c/opendev/system-config/+/800880 | 05:03 |
opendevreview | Merged opendev/system-config master: borg-backup: exclude /var/lib/snapd https://review.opendev.org/c/opendev/system-config/+/797562 | 05:09 |
opendevreview | Merged openstack/project-config master: Remove publish-to-pypi from retired neutron-lbaas repo https://review.opendev.org/c/openstack/project-config/+/800853 | 05:34 |
*** amoralej|off is now known as amoralej | 06:10 | |
opendevreview | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/800884 | 06:12 |
opendevreview | chzhang8 proposed openstack/project-config master: register and bring back tricircle under x namespaces https://review.opendev.org/c/openstack/project-config/+/800885 | 07:00 |
opendevreview | Merged opendev/system-config master: Add paste01.opendev.org to backup https://review.opendev.org/c/opendev/system-config/+/800880 | 07:09 |
opendevreview | xinliang proposed opendev/system-config master: Enable openEuler mirroring https://review.opendev.org/c/opendev/system-config/+/784874 | 07:12 |
*** rpittau|afk is now known as rpittau | 07:24 | |
*** ykarel is now known as ykarel|lunch | 09:17 | |
*** ykarel|lunch is now known as ykarel | 11:00 | |
*** dviroel|out is now known as dviroel | 11:02 | |
opendevreview | Ananya Banerjee proposed opendev/elastic-recheck master: Run elastic-recheck container https://review.opendev.org/c/opendev/elastic-recheck/+/729623 | 11:12 |
opendevreview | Ananya Banerjee proposed opendev/elastic-recheck master: Run elastic-recheck container https://review.opendev.org/c/opendev/elastic-recheck/+/729623 | 11:41 |
*** amoralej is now known as amoralej|lunch | 11:50 | |
*** amoralej|lunch is now known as amoralej|away | 11:50 | |
*** iurygregory_ is now known as iurygregory | 12:07 | |
opendevreview | Thierry Carrez proposed opendev/yaml2ical master: Update hacking version https://review.opendev.org/c/opendev/yaml2ical/+/800944 | 12:51 |
opendevreview | Ananya Banerjee proposed opendev/elastic-recheck master: Run elastic-recheck container https://review.opendev.org/c/opendev/elastic-recheck/+/729623 | 13:05 |
opendevreview | Thierry Carrez proposed opendev/yaml2ical master: Report which week a meeting occurs. https://review.opendev.org/c/opendev/yaml2ical/+/799691 | 13:09 |
*** chkumar|rover is now known as chandankumar | 14:05 | |
mnaser | infra-root: how long does a gerrit restart usually take? | 14:37 |
mnaser | i'm asking because since we're moving to a new dc, we might need to potentially organize a very short reboot of the instance. i'm wondering if it is beneficial for us to 'rush' through the current gerrit instance before migration so we avoid it before the actual day of the migration | 14:37 |
clarkb | mnaser: a normal restart of the gerrit service to update the docker image takes only a few minutes. Maybe 2-3 | 14:53 |
mnaser | clarkb: ok so it's not some wild thing that will cause a ton of interruption because it has to index something | 14:53 |
clarkb | mnaser: no, the reindexing is necessary during certain upgrade scenarios (we are also doing a reindex during our migration on sunday/monday to avoid needing to sync the indexes between servers) | 14:54 |
clarkb | if we need to coordinate a reboot in a week or two I don't expect that to be problematic. But it would also be fine to do it in the next day or two. | 14:55 |
melwitt | clarkb: eventually I got my logstash working with filebeat pushing the log lines from a sample job-output.txt file and so far I'm not able to get a _grokparsefailure tag happening from it like I see in our kibana. I came across this while googling https://discuss.elastic.co/t/logstash--grokparsefailure-issue/49756 where someone observed _grokparsefailure only when log events were overloading the system, | 15:41 |
melwitt | whereas in the debugger there was never a problem. it made me wonder, are our logstash batching settings in a public repo anywhere? just curious about how they are set | 15:41 |
clarkb | melwitt: I don't know that we set any config like that. We just run the service from the package install then feed it the configs in that logstash configs repo | 15:48 |
clarkb | also it is an older version of logstash (one of the many issues with the aging ELK infrastructure) | 15:48 |
melwitt | clarkb: ah ok. yeah apparently you can configure logstash to filter events in batches to alleviate some congestion issues if you have congestion issues | 15:49 |
melwitt | do we know what version it is? I could try installing that | 15:49 |
clarkb | melwitt: https://opendev.org/opendev/puppet-logstash/src/branch/master/manifests/init.pp#L23-L28 should be what we grab | 15:50 |
melwitt | ty | 15:51 |
clarkb | you're welcome | 15:55 |
opendevreview | Merged openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/800884 | 15:58 |
*** rpittau is now known as rpittau|afk | 16:17 | |
*** marios is now known as marios|out | 16:25 | |
opendevreview | Rich Bowen proposed opendev/yaml2ical master: Report which week a meeting occurs. https://review.opendev.org/c/opendev/yaml2ical/+/799691 | 16:26 |
opendevreview | Rich Bowen proposed opendev/yaml2ical master: Report which week a meeting occurs. https://review.opendev.org/c/opendev/yaml2ical/+/799691 | 16:28 |
*** ykarel is now known as ykarel|away | 16:32 | |
*** sshnaidm is now known as sshnaidm|afk | 16:35 | |
opendevreview | Rich Bowen proposed opendev/yaml2ical master: Report which week a meeting occurs. https://review.opendev.org/c/opendev/yaml2ical/+/799691 | 16:42 |
* melwitt is now running with logstash 2.4.1 😅 | 16:57 | |
melwitt | what I've gleaned so far is that logstash is by far the biggest offender for resource consumption (both memory and cpu) but elasticsearch and kibana are still considered best in class | 17:29 |
clarkb | elasticsearch needs a lot of memory too since it loads indexes into memory | 17:30 |
clarkb | the problem with modern kibana is there wasn't a way to use it with a RO elasticsearch iirc | 17:31 |
clarkb | there are definitely some rough edges when it comes to the use case we've got :/ | 17:31 |
melwitt | and apparently elasticsearch has something called ingest pipelines that can be used to do filtering (instead of logstash) and that it's possible/favored to run elasticsearch and kibana without logstash | 17:31 |
melwitt | I'll read up on RO elasticsearch + kibana | 17:33 |
clarkb | melwitt: I think the amazon fork may have some of the functionality needed to control access in a way that kibana might be able to use it | 17:33 |
melwitt | ok I'll look for that. I'm not familiar with the RO mode | 17:35 |
melwitt | I mean, I'm not familiar with any of it really but this is the first I have heard of the RO mode heh | 17:35 |
clarkb | melwitt: well the RO mode is something we did because we want this data to be accessible. That is why you have to talk to it via that proxy | 17:37 |
clarkb | the proxy has a list of allowed api queries and they don't allow writes | 17:37 |
melwitt | oh, I see | 17:37 |
clarkb | the problem becomes that kibana wants to use elasticsearch to operate and to do that it needs to write to elasticsearch. But if we do that then no one can access elasticsearch | 17:37 |
clarkb | (because it isn't safe to have a giant elasticsearch cluster on the internet in that way) | 17:37 |
melwitt | got it | 17:38 |
clarkb | basically the main reason we got in this lack of upgrades problem was kibana. Old kibana didn't work with new elasticsearch but new kibana didn't work with RO elasticsearch | 17:38 |
clarkb | and we let that rot on the vine for too long :( | 17:38 |
melwitt | I see. ok. so far I don't have a sense of what kibana alternatives there are, I will look deeper at that | 17:39 |
melwitt | so far, alternatives in general look complicated in comparison to ELK so I'm concerned about ease of use and maintenance too | 17:41 |
melwitt | it looks like dropping the L would be pretty simple but beyond that, I haven't seen anything clear | 17:42 |
clarkb | it's possible we punt on kibana and tell people to use the api. Not a great answer but might help simplify things | 17:48 |
clarkb | in trying to test my fix for the openid deletions via email deletion I've discovered that gerrit will not let you delete your preferred email. So now I need to figure out adding a second email address to my account on the test node without doing verification. I think I can do that as admin but this got more complicated | 17:49 |
melwitt | yeah.. it's been so nice using kibana | 17:58 |
clarkb | Error 409 (Conflict): Cannot remove e-mail 'testemail' which is directly associated with OPENID_SSO authentication <- success! I think | 17:58 |
clarkb | infra-root ^ It lets you click the delete button in the settings page and removes it from the ui but when you click on save settings you get this error | 17:59 |
clarkb | I've got a meeting with our second broken gerrit account user to help them test the fix against review02 shortly, then lunch, but once that is done I'll try to push my patch up to gerrit | 18:05 |
clarkb | I'd like to avoid forking gerrit again with our patch so will do my very best to make the change mergeable upstream | 18:05 |
clarkb | ianw: I had to modify the canonical web url on review02 to do testing of the gerrit account fixup. Details are in the email. I figured I'd leave that in place for a bit just in case we need to do any more similar testing, but when your day starts we can revert that to the way we want it | 18:59 |
clarkb | and now lunch | 19:00 |
opendevreview | Vishal Manchanda proposed openstack/project-config master: Retire django-openstack-auth https://review.opendev.org/c/openstack/project-config/+/800459 | 19:37 |
clarkb | infra-root https://gerrit-review.googlesource.com/c/gerrit/+/312302 exists now | 19:45 |
clarkb | if y'all can take a look at that I would appreciate it | 19:46 |
clarkb | corvus: do you know how I view the results of their CI jobs? I have a failing code style check and I can't find a way to see why it failed | 19:53 |
clarkb | oh wait there is a little icon next to the rerun button | 19:53 |
*** dviroel is now known as dviroel|brb | 19:53 | |
corvus | yeah that :) | 19:54 |
clarkb | unfortunately that still doesn't tell me what went wrong just that google-java-format-1.7 produced output | 19:54 |
* clarkb looks in the local repo to see if that is runnable from there | 19:55 | |
corvus | i'm stumped too | 19:55 |
* fungi places ci errors next to a rerun button, thinking surely that won't encourage users to just rerun things | 19:56 | |
clarkb | run `./tools/setup_gjf.sh` to download a local copy and set up a wrapper script. | 19:56 |
clarkb | I'll wait for the other checks to run too in case I have to fix other things | 19:56 |
clarkb | I have fumbled my way through that (I didn't run their script but fetched the jar myself and ran it. Couldn't figure out how to run it against the whole repo at once so did files individually). I had an extra unnecessary import | 20:03 |
clarkb | I'll wait for the other builds before pushing my fix | 20:03 |
clarkb | that's neat. Pushing my second patchset gets hit with an error | 20:19 |
fungi | "neat" | 20:20 |
clarkb | I'll give it a few minutes then try again. If it fails again I guess I ask about it on slack | 20:20 |
clarkb | remote: INTERNAL Internal error encountered | 20:21 |
clarkb | I can't add comments to my change now either it seems and someone on slack says I should file an issue because it is probably a problem with the servers | 20:26 |
*** dviroel|brb is now known as dviroel | 20:27 | |
clarkb | https://bugs.chromium.org/p/gerrit/issues/detail?id=14792 has been filed in response to this problem | 20:37 |
clarkb | good news is that my change seems to do what I want on the test server. I can delete email addresses not associated with my openid and cannot delete ones associated with my openid. And when I show up as a new user it is able to create a new account for me | 21:01 |
*** dviroel is now known as dviroel|out | 21:24 | |
ianw | clarkb: thanks for working through that. we can leave as is and add note in the checklist to update it if you like | 21:48 |
clarkb | ianw: at this point I suspect that we can revert it back to normal since no need for additional testing has come up | 21:49 |
ianw | ++ one less thing seems good | 21:49 |
clarkb | I can do that in a few | 21:49 |
ianw | i don't know about the plugin that seems to be highlighting references in https://gerrit-review.googlesource.com/c/gerrit/+/312302/1/java/com/google/gerrit/server/auth/openid/OpenIdRealm.java | 21:51 |
ianw | it seems to make my mouse move slower when i'm in the green area, it's very weird | 21:51 |
clarkb | ianw: it could also be an update to the core software on latest gerrit | 21:51 |
ianw | i can also pin a cpu by wiggling my mouse in there | 21:52 |
ianw | i guess that's related | 21:52 |
clarkb | ianw: if you want to test it as well on the test node you need to manually add an email address to your account with the admin account because I don't think we can do verification of the email addresses there and you can't delete your preferred email (which your only email from openid will be) | 21:53 |
clarkb | in my testing I logged in to create a new account. Added an email via the api as admin. Then switched my preferred email to that email. Tried to delete the other email which came from openid (that failed which we wanted). Then swapped preferred emails back around again and successfully deleted the email that I had added which isn't part of my openid | 21:54 |
clarkb | and now I can't push my new ps to fix the CI error | 21:55 |
ianw | it looks like it fairly logically fits in with what all the other authentication schemes are doing, which is good and suggests it's right to me | 21:56 |
clarkb | ianw: I've reverted the manual edit of review02's gerrit.config and restarted the services with docker-compose | 21:58 |
clarkb | you might want to take a look at the config just to make sure I didn't do anything silly but it was just a simple line removal and uncomment the old line | 21:58 |
ianw | thanks | 21:59 |
ianw | https://etherpad.opendev.org/p/gerrit-upgrade-2021 has been updated | 21:59 |
ianw | the only further thought i had was that as written maybe it shuts down zuul unnecessarily. | 22:00 |
clarkb | ya I think I left a comment about that saying zuul should reconnect when the dns update happens | 22:00 |
ianw | yeah just that the prior steps had it shut down | 22:00 |
clarkb | I don't think that it hurts to shut it down. I also plan to work with corvus to restart zuul tomorrow to ensure the changes going into it don't impact the gerrit move | 22:01 |
clarkb | I think there is one change and another that will probably get squashed into it that we want to have BMW rereview and if they are happy we land those and restart | 22:01 |
ianw | yeah we can be conservative this time and next time i imagine things will look very different | 22:03 |
clarkb | ianw: oh! another thing I noticed earlier was that I think your common topic for the review changes is no longer on all of the changes. Can you check that then I can pull the topic up and do another pass of reviews? | 22:04 |
ianw | ahh, git review must have helpfully reset that for me | 22:05 |
clarkb | ok I was able to post my review comments on that upstream gerrit change and then push the new patchset | 22:09 |
clarkb | the 3 remaining changes lgtm. Not sure if you want to WIP the two that are not WIP | 22:16 |
clarkb | my upstream change passes codestyle checks now. It passed the build test previously | 22:18 |
clarkb | considering the only change I did was to remove an unused import I won't rebuild my held test node. I think what we have there is sufficiently like what I've pushed upstream. If I get reviews asking me to make major changes I will update the test machine | 22:19 |
ianw | sigh, i seem to have lost ipv6 connectivity | 22:23 |
ianw | if this persists, it's not something i'm going to enjoy trying to talk to support bots about | 22:24 |
clarkb | my ISP was hoping to roll out native ipv6 in ~February and it still hasn't happened yet. Supposedly they keep finding problems with their deployment in the test env | 22:24 |
ianw | letsencrypt is failing, having a look | 22:28 |
ianw | fatal: [cacti02.openstack.org]: FAILED! => {} | 22:28 |
ianw | that is really weird as i wouldn't have thought cacti was involved with letsencrypt | 22:32 |
clarkb | ianw: we do certcheck from cacti02 iirc. It was maybe writing the list of hosts to certcheck back to that server? | 22:32 |
clarkb | but ya it doesn't LE directly, just does the checking I think | 22:32 |
ianw | ohhh, yeah, that's it | 22:32 |
ianw | 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'letsencrypt_certcheck_domains' | 22:33 |
clarkb | if we don't update any certs do we maybe fail to write the object properly? (just a thought haven't really dug in) | 22:34 |
ianw | mirror02.iad3.inmotion.opendev.org : ok=0 changed=0 unreachable=1 | 22:35 |
ianw | i think that might be the root cause | 22:35 |
ianw | indeed that host seems to not want to talk | 22:36 |
clarkb | the identity and image apis seem to talk but not the compute for that cloud | 22:37 |
clarkb | that's the openstack as a service cloud that we can poke at directly. But maybe our time is better spent this week putting that host in the emergency list then we can look at it next week? | 22:38 |
clarkb | (I'm happy if someone else wants to look at it too but I think we can punt on the problem for now) | 22:38 |
ianw | we're not running ci resources there? | 22:40 |
clarkb | ianw: we would if the nova API stopped sending 500 errors :) | 22:41 |
clarkb | I think we must not be running anything there as a result of ^ | 22:41 |
ianw | is there like a 5-second guide of where to look? | 22:43 |
clarkb | ianw: yes, in the normal location should be login details for that cloud and for the cloud as a service management system | 22:44 |
clarkb | in horizon I think you should get IP addresses for the hosts involved as well and you can ssh into those. Though now I can't remember if we properly set up ssh keys for everyone on those. fungi and I should have keys on them though and can add others if that didn't happen | 22:44 |
clarkb | looks like you have the file open so I'll wait my turn :) | 22:45 |
opendevreview | Ian Wienand proposed openstack/project-config master: nodepool: set inmotion cloud to zero https://review.opendev.org/c/openstack/project-config/+/801009 | 23:08 |
opendevreview | Ian Wienand proposed openstack/project-config master: Revert "nodepool: set inmotion cloud to zero" https://review.opendev.org/c/openstack/project-config/+/801010 | 23:08 |
opendevreview | Ghanshyam proposed openstack/project-config master: Properly retire neutron-lbaas https://review.opendev.org/c/openstack/project-config/+/800147 | 23:10 |
ianw | ok, back to what i was actually looking at which was making sure the new paste server backs itself up :) | 23:15 |
ianw | clarkb: not particularly urgent but https://review.opendev.org/c/opendev/system-config/+/800879 cleans up the old bits | 23:16 |
clarkb | ianw: one really small thing inline but worth fixing imo | 23:20 |
clarkb | also that has merge conflicts with one of the review changes says gerrit | 23:20 |
clarkb | (that's fine, just need to remember to rebase and push if we land paste cleanup first) | 23:20 |
opendevreview | Ian Wienand proposed opendev/system-config master: Remove paste01.openstack.org https://review.opendev.org/c/opendev/system-config/+/800879 | 23:25 |
opendevreview | Merged openstack/project-config master: nodepool: set inmotion cloud to zero https://review.opendev.org/c/openstack/project-config/+/801009 | 23:28 |
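
For readers following the 17:37 discussion of the read-only Elasticsearch proxy ("the proxy has a list of allowed api queries and they don't allow writes"), the idea might be sketched roughly as below. This is only an illustration under stated assumptions: the function name `is_allowed` and the contents of the allowlist are hypothetical and are not taken from the actual OpenDev proxy configuration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: Elasticsearch endpoints that only read data.
# The real proxy's list is not shown in the log; this is an assumption.
READ_ONLY_ENDPOINTS = {"_search", "_count", "_mapping"}

# Searches are often sent as POST with a JSON body, so POST is accepted
# for read-only endpoints; PUT and DELETE are never accepted.
READ_METHODS = {"GET", "POST"}


def is_allowed(method: str, url: str) -> bool:
    """Return True only for requests that cannot modify the cluster."""
    if method.upper() not in READ_METHODS:
        return False
    # Drop the query string and look at the final path segment.
    path = urlsplit(url).path.rstrip("/")
    endpoint = path.rsplit("/", 1)[-1]
    return endpoint in READ_ONLY_ENDPOINTS


if __name__ == "__main__":
    for method, url in [
        ("GET", "/logstash-2021.07.15/_search"),
        ("POST", "/logstash-2021.07.15/_search?size=10"),
        ("PUT", "/logstash-2021.07.15/_doc/1"),
        ("POST", "/_bulk"),
    ]:
        print(method, url, is_allowed(method, url))
```

A check like this also shows why a newer Kibana is awkward behind such a proxy: as noted in the log, Kibana needs to write to Elasticsearch to operate, and those write requests would be rejected here.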