*** Tengu_ has joined #opendev | 01:16 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] reprepro https://review.opendev.org/757660 | 01:16 |
*** Tengu has quit IRC | 01:19 | |
*** Tengu has joined #opendev | 01:21 | |
*** Tengu_ has quit IRC | 01:22 | |
ianw | AnsibleUndefinedVariable: the inline if-expression on line 41 evaluated to false and no else section was defined. | 02:08 |
ianw | {% for item in borg_backup_dirs + borg_backup_dirs_extra -%} | 02:08 |
ianw | {{ item }} {{ '\\' if not loop.last }} | 02:08 |
ianw | that is line 41 | 02:08 |
ianw | it doesn't fail in gate, but is somehow failing on bridge? | 02:08 |
ianw | - debug: | 02:35 |
ianw | msg: '{% for item in [1,2,3] %} {{ item }} {{ "," if not loop.last }} {% endfor %}' | 02:35 |
ianw | fails on bridge, but i can't make it fail anywhere else ... | 02:35 |
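(A minimal version-portable sketch of that template, assuming an empty string is the intended output when the inline `if` is false; spelling out the `else` branch keeps Jinja2 2.10 happy:)

```jinja
{% for item in borg_backup_dirs + borg_backup_dirs_extra -%}
{{ item }} {{ '\\' if not loop.last else '' }}
{% endfor %}
```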
clarkb | ansible version maybe? | 02:40 |
ianw | running that out of a 2.9.8 virtualenv *on* bridge also works | 02:41 |
ianw | bridge has Jinja2 2.10 | 02:42 |
ianw | the venv has Jinja2 2.11.2 | 02:43 |
ianw | and if i downgrade the jinja2 in the virtualenv, it fails | 02:43 |
ianw | so, why isn't the bridge jinja updating i guess is the question | 02:44 |
*** dviroel has quit IRC | 02:44 | |
fungi | pip doesn't normally upgrade dependencies if they already satisfy the minimum version | 02:45 |
ianw | yeah, i guess so. i'm surprised ansible doesn't pin itself to a jinja version | 02:45 |
fungi | there is an alternative upgrade strategy you can specify, but it will wreak havoc if you're installing some python libs from distro packages | 02:46 |
fungi | which is bound to happen if you install some python applications from distro packages | 02:47 |
ianw | i guess the solution here is "run ansible out of a virtualenv" | 02:47 |
ianw | or a containered ansible like openstackclient i guess | 02:48 |
fungi | or run ansible from a distro package | 02:48 |
fungi | but yeah | 02:48 |
ianw | i think for now i might just manually upgrade jinja2 on bridge | 02:49 |
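(A rough sketch of the pip behaviour under discussion; the package pins are illustrative:)

```shell
# default strategy (only-if-needed): a Jinja2 2.10 that already satisfies the
# requirement is left alone
pip install --upgrade ansible

# eager strategy: also upgrades dependencies -- risky when some libs come from
# distro packages
pip install --upgrade --upgrade-strategy eager ansible

# the short-term fix mentioned here: bump the one dependency directly
pip install --upgrade 'Jinja2>=2.11'
```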
*** ysandeep|away is now known as ysandeep | 03:13 | |
ianw | ansible is avoiding pinning, i guess to avoid having annoying version dependencies for stable distros | 03:27 |
ianw | but, that also means you get annoying incompatible behaviour depending on your environment. so you can't have it both ways | 03:27 |
*** ysandeep is now known as ysandeep|afk | 03:47 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] install ansible in a virtualenv on bridge https://review.opendev.org/757670 | 04:07 |
ianw | https://api.us-east.open-edge.io:8080/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_2f2/722148/9/check/dib-nodepool-functional-openstack-ubuntu-focal-containerfile-src/2f25c7d/zuul-manifest.json | 04:14 |
ianw | is it just me, or is firefox blocking this as a "deceptive site"? | 04:14 |
ianw | https://transparencyreport.google.com/safe-browsing/search?url=https:%2F%2Fapi.us-east.open-edge.io:8080%2Fswift%2Fv1%2FAUTH_e02c11e4e2c24efc98022353c88ab506%2Fzuul_opendev_logs_2f2%2F722148%2F9%2Fcheck%2Fdib-nodepool-functional-openstack-ubuntu-focal-containerfile-src%2F2f25c7d%2Fzuul-manifest.json | 04:15 |
ianw | does seem to suggest it is not just me | 04:15 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] reprepro https://review.opendev.org/757660 | 04:23 |
*** ykarel|away has joined #opendev | 04:34 | |
*** ykarel|away is now known as ykarel | 04:44 | |
fungi | donnyd: "open-edge.io has been reported as a deceptive site. You can report a detection problem or ignore the risk and go to this unsafe site." | 04:46 |
fungi | "The site https://open-edge.io/ contains harmful content, including pages that: Try to trick visitors into sharing personal info or downloading software" | 04:46 |
fungi | i reported a detection problem, no idea what it takes to get delisted though | 04:50 |
*** ykarel_ has joined #opendev | 05:03 | |
*** ykarel has quit IRC | 05:04 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] reprepro https://review.opendev.org/757660 | 05:07 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] install ansible in a virtualenv on bridge https://review.opendev.org/757670 | 05:21 |
*** tkajinam has quit IRC | 05:21 | |
*** tkajinam has joined #opendev | 05:22 | |
*** ykarel_ is now known as ykarel | 05:22 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] install ansible in a virtualenv on bridge https://review.opendev.org/757670 | 05:39 |
*** ysandeep|afk is now known as ysandeep | 05:43 | |
openstackgerrit | likui proposed openstack/diskimage-builder master: replace imp module https://review.opendev.org/751236 | 06:04 |
cgoncalves | api.us-east.open-edge.io is being flagged as phishing by Google Chrome: https://snipboard.io/Tyl5kR.jpg | 06:07 |
*** elod_pto is now known as elod | 06:38 | |
*** eolivare has joined #opendev | 06:40 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] reprepro https://review.opendev.org/757660 | 06:41 |
ykarel | i see similar on firefox ^ | 06:48 |
ykarel | Deceptive site ahead | 06:48 |
ykarel | Firefox blocked this page because it may trick you into doing something dangerous like installing software or revealing personal information like passwords or credit cards. | 06:48 |
ykarel | for https://api.us-east.open-edge.io:8080/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_c18/757605/1/check/tripleo-ci-centos-8-scenario003-standalone/c1864e3/ | 06:48 |
ykarel | same on chrome | 06:49 |
ianw | yeah, i'm not sure what to do but report it as a false positive | 06:49 |
ianw | there's no indication *why* it thinks this | 06:50 |
ianw | donnyd: ^ perhaps a recycled IP somehow? | 06:50 |
jrosser | https://developers.google.com/web/fundamentals/security/hacked/request_review | 06:52 |
*** hashar has joined #opendev | 06:52 | |
jrosser | the security console there will say why it's done this | 06:52 |
*** slaweq has joined #opendev | 06:57 | |
ianw | yeah it wasn't quite clear if that's all the same thing, but i agree the best chance of finding something is probably on the webmaster console | 07:02 |
*** ralonsoh has joined #opendev | 07:02 | |
*** andrewbonney has joined #opendev | 07:08 | |
jrosser | utilit | 07:08 |
*** tosky has joined #opendev | 07:26 | |
*** rpittau|afk is now known as rpittau | 07:27 | |
*** DSpider has joined #opendev | 07:32 | |
cgoncalves | seeing this in multiple job builds: "Immediate connect fail for 2604:e100:3:0:f816:3eff:fe6b:ad62: Network is unreachable" | 07:46 |
cgoncalves | https://api.us-east.open-edge.io:8080/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_5f5/755777/10/check/octavia-v2-act-stdby-dsvm-scenario-stable-train/5f506d8/controller/logs/dib-build/amphora-x64-haproxy.qcow2_log.txt | 07:47 |
ianw | that one was in ovh gra1 | 08:03 |
ianw | which i don't believe has ipv6 | 08:04 |
ianw | more likely an issue with 38.108.68.124 ? (what is that?) | 08:04 |
ianw | huh opendev.org :) | 08:04 |
ianw | ok, it's a curl to get requirements | 08:05 |
ianw | works for me, but possible one backend is having issues and i'm not hashed to it | 08:06 |
*** fressi has joined #opendev | 08:07 | |
AJaeger | why is it using curl for requirements? It should use the local download instead... | 08:14 |
*** qchris_ has quit IRC | 08:16 | |
*** qchris has joined #opendev | 08:17 | |
*** priteau has joined #opendev | 08:33 | |
frickler | I can confirm that opendev.org isn't reachable via IPv6 from my site either. I guess mnaser or some other vexxhost support will have to take a look | 08:44 |
frickler | can't seem to log into gitea-lb01.opendev.org via v4 either, maybe more is broken. but for v6 I'm immediately getting destination unreachable from my local ISP | 08:46 |
*** kopecmartin has joined #opendev | 08:54 | |
kopecmartin | hi, i created a new repo under x org by https://review.opendev.org/#/c/753773/ and i'd need to edit the group access now (maybe i should have done it within the review directly) .. i'd like to be added to ansible-role-refstack-client-core group and i'd like to include refstack-core group within the ansible-role-refstack-client-core one as well | 08:58 |
kopecmartin | can anyone help please? | 08:58 |
*** piequi has joined #opendev | 09:09 | |
*** piequi has left #opendev | 09:10 | |
*** piequi has quit IRC | 09:10 | |
frickler | kopecmartin: sure, added you to ansible-role-refstack-client-core and -release, you should be able to make all other changes yourself. not sure about the ansible-role-refstack-client-ci group, though, likely we would make -core the owner of that one? | 09:11 |
*** _marc-antoine_ has joined #opendev | 09:12 | |
kopecmartin | frickler: thank you .. sure, let's make -core the owner of -ci group | 09:12 |
*** mkalcok has joined #opendev | 09:12 | |
*** chrome0 has joined #opendev | 09:13 | |
frickler | kopecmartin: done | 09:15 |
kopecmartin | frickler: thank you | 09:15 |
rpittau | good morning everyone, any issue going on right now on opendev ? | 09:16 |
rpittau | I'm seeing this when trying to access: Failed to connect to opendev.org port 443: Connection timed out | 09:17 |
*** hashar has quit IRC | 09:17 | |
rpittau | got some reports from other people as well | 09:17 |
frickler | rpittau: yes, seems there are issues with IPv6 connectivity, forcing v4 might be a workaround for now. we likely need help from vexxhost to resolve | 09:18 |
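(Forcing IPv4 on the client side is a one-flag change in most tools; a quick sketch:)

```shell
curl -4 -v https://opendev.org/
git clone -4 https://opendev.org/opendev/system-config
```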
rpittau | ok, thanks frickler | 09:18 |
chrome0 | I seem to have issues with ipv4 as well? https://paste.ubuntu.com/p/yhmWzmKDPj/ | 09:20 |
*** kleini has joined #opendev | 09:20 | |
frickler | hmm, yes, maybe the issue isn't v6 after all. via a different provider I could reach gitea-lb01.opendev.org (via v6) and everything looks fine there afaict | 09:30 |
openstackgerrit | Merged openstack/diskimage-builder master: Ensure yum-utils is installed in epel element https://review.opendev.org/756010 | 09:56 |
donnyd | ianw probably certs | 10:01 |
donnyd | ianw: checking now | 10:02 |
chrome0 | fwiw opendev.org ipv4 recovered here "Connection to 38.108.68.124 80 port [tcp/http] succeeded!" | 10:09 |
*** priteau has quit IRC | 10:30 | |
frickler | infra-root: lots of errored node launch attempts on ovh starting at 0600 according to grafana, nl04 logs look inconclusive to me | 10:36 |
donnyd | ykarel: try again with https://api.us-east.open-edge.io:8080/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_c18/757605/1/check/tripleo-ci-centos-8-scenario003-standalone/c1864e3/ | 10:38 |
donnyd | I think the certs were up for renewal today | 10:39 |
ykarel | donnyd, same result | 10:39 |
ykarel | Deceptive site ahead | 10:39 |
donnyd | that's interesting | 10:40 |
donnyd | I just renewed the cert, so it's surely not that | 10:40 |
donnyd | it's possible that your browser is complaining because the service is on 8080 and it's tls | 10:42 |
ykarel | don't know what's it, happening for me on both chrome/firefox | 10:44 |
ykarel | donnyd, for you ^ url working fine? | 10:45 |
donnyd | yea it works fine here | 10:45 |
ykarel | donnyd, and before renewing certs you got same error? | 10:45 |
donnyd | i got no errors before either | 10:46 |
donnyd | but that doesn't mean anything | 10:46 |
donnyd | what does your terminal say | 10:46 |
donnyd | curl https://api.us-east.open-edge.io:8080 | 10:46 |
ykarel | <?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult> | 10:46 |
donnyd | try this link - https://api.us-east.open-edge.io:8443/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_c18/757605/1/check/tripleo-ci-centos-8-scenario003-standalone/c1864e3/ | 10:49 |
ykarel | ^ returns same Deceptive site ahead | 10:49 |
donnyd | https://transparencyreport.google.com/safe-browsing/search?url=https:%2F%2Fapi.us-east.open-edge.io | 10:52 |
donnyd | it would appear that google doesn't like me | 10:52 |
ykarel | donnyd, with ^ too same | 10:53 |
ykarel | donnyd, i just checked from https://www.proxysite.com/ and from there it's working by selecting EU or US server | 10:53 |
ykarel | so something to do with the request source, i am from India | 10:54 |
donnyd | yea I am thinking there is probably some hoops they want me to jump through | 10:54 |
donnyd | ok, I think I have it figured out | 11:01 |
donnyd | We are working on it | 11:01 |
donnyd | ykarel: no, it's something else. We should have this fixed up soon. There is no reason for concern | 11:03 |
donnyd | I appreciate you bringing it up though | 11:04 |
ykarel | donnyd, Thanks, let us know once fixed | 11:06 |
openstackgerrit | Sagi Shnaidman proposed zuul/zuul-jobs master: Install openswitch and firewall if need a bridge only https://review.opendev.org/757831 | 11:26 |
*** dviroel has joined #opendev | 11:26 | |
*** lpetrut has joined #opendev | 11:37 | |
_marc-antoine_ | opendev.org is working again, well done guys ! | 11:38 |
*** marios has joined #opendev | 11:38 | |
*** eolivare has quit IRC | 11:41 | |
*** eolivare has joined #opendev | 11:42 | |
lourot | review.opendev.org is down now though | 11:53 |
icey | and now review.opendev.org is down? | 11:53 |
icey | (dang lourot beat me to it) | 11:53 |
marios | yah same lourot icey | 11:53 |
marios | icey: lourot: but i can work from the git cli and git review, looks like just the gerrit web ui is down | 11:54 |
sshnaidm | down | 11:54 |
sshnaidm | Proxy Error | 11:54 |
iurygregory | same for me | 11:54 |
*** ysandeep is now known as ysandeep|brb | 11:56 | |
ykarel | same for me too | 12:00 |
redrobot | 🔥🔥🐶🔥🔥 | 12:03 |
*** ysandeep|brb is now known as ysandeep | 12:08 | |
fungi | donnyd: it's not just the swift service, it looks like firefox has decided the entire open-edge.io domain is suspect... https://open-edge.io/ gets me the same warning | 12:09 |
fungi | lourot: icey: marios: sshnaidm: iurygregory: ykarel: i'm looking into it now, guessing the gerrit service has stopped abruptly | 12:10 |
fungi | mm, no, it's running... | 12:11 |
marios | thank you fungi | 12:11 |
sshnaidm | fungi, same about https://open-edge.io/ , it's in the blacklist of Google Safe Browsing. I reported it there as a good site, just need more people to do it I believe | 12:11 |
bolg | i am also getting: Proxy Error for review.opendev.org | 12:12 |
fungi | sshnaidm: yep, i did too before i went to bed | 12:12 |
sshnaidm | fungi, maybe worth sending to the discuss list so more people can report it as a good site | 12:13 |
fungi | bolg: yes, it seems the apache service on the server isn't getting a timely response from the java service to which it's proxying connections, i'm trying to establish why | 12:13 |
bolg | fungi: thanks! (y) | 12:13 |
fungi | lots of errors in its log like: | 12:14 |
fungi | [2020-10-13 12:13:47,006] [HTTP-66-selector-ServerConnectorManager@54759210/0] WARN org.eclipse.jetty.util.thread.QueuedThreadPool : HTTP{STARTED,20<=20<=100,i=18,q=200} rejected org.eclipse.jetty.io.AbstractConnection$2@55a5c5ba | 12:14 |
fungi | [2020-10-13 12:13:47,006] [HTTP-66-selector-ServerConnectorManager@54759210/0] WARN org.eclipse.jetty.io.SelectorManager : Could not process key for channel java.nio.channels.SocketChannel[connected local=/127.0.0.1:8081 remote=/127.0.0.1:47906] | 12:14 |
fungi | looks like maybe thread contention for socket handling? | 12:15 |
fungi | disk utilization on root seems to have started climbing rapidly at ~11:10 utc, but that's likely just from logs filling up with error messages | 12:16 |
fungi | no other obvious signs anything is out of the ordinary resource wise (spike in established tcp connections, but that's a likely symptom of it not responding for a bit) | 12:17 |
fungi | there is a java process which seems fairly busy given nobody can reach the server | 12:17 |
fungi | i'm going to try restarting the container | 12:18 |
fungi | and done, though it will likely take a couple minutes to start up fully | 12:19 |
Tengu | thanks fungi | 12:20 |
frickler | there's this just before the socket errors start: [2020-10-13 11:47:17,193] [HTTP-2666233] WARN org.eclipse.jetty.util.thread.QueuedThreadPool : Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$3@140898fb in HTTP{STARTED,20<=20<=100,i=16,q=0} | 12:20 |
Tengu | (and now everyone is refreshing, making the whole service re-crash ;)) - it seems to be back! | 12:21 |
fungi | #status log restarted gerrit container on review.opendev.org after it stopped responding to apache | 12:21 |
openstackstatus | fungi: finished logging | 12:21 |
fungi | 11:47:17 is definitely also just prior to anyone asking questions about it in here, so certainly seems suspicious | 12:22 |
fungi | i'll start looking into the routing or outage issues for our gitea service next | 12:23 |
*** sboyron has joined #opendev | 12:24 | |
*** _marc-antoine_ has quit IRC | 12:25 | |
marios | thanks fungi looks like it's back? | 12:31 |
lourot | also back for me, thanks! | 12:33 |
fungi | marios: gerrit? yes, i restarted the service, looks like a thread got itself wedged somehow | 12:33 |
fungi | for the gitea connectivity issues, i'll have to dig deeper, but those are almost certainly unrelated | 12:34 |
marios | thanks | 12:39 |
ykarel | Thanks fungi | 12:44 |
fungi | also juggling virtual booth duty at ansiblefest right now, so had to switch focus to make sure i was all logged in and stuff, but looking into gitea now | 12:53 |
donnyd | fungi: thanks for the heads up, we are working it | 12:59 |
*** openstackgerrit has quit IRC | 13:17 | |
fungi | #status log restarted gerritbot on eavesdrop.o.o and germqtt on firehose.o.o following gerrit outage | 13:19 |
openstackstatus | fungi: finished logging | 13:19 |
fungi | i'm able to directly clone repositories from all 8 gitea backends, so it doesn't seem any of them is down hard | 13:29 |
*** moguimar has joined #opendev | 13:39 | |
bolg | Thanks fungi | 13:42 |
fungi | cacti graphs for the gitea backends don't indicate any obvious problems when folks were reporting connection timeouts, so i'm leaning increasingly toward assuming it was a temporary internet connectivity issue somewhere | 13:44 |
*** fressi has quit IRC | 13:57 | |
clarkb | fungi: catching up quickly before dialing into the board meeting and figuring out ansiblefest, it sounds like gerrit's java process was sad and restarted and separately there was a gitea issue? sounds like ipv6 connectivity issues to gitea | 13:59 |
clarkb | anything else I should try and get up to speed with? | 13:59 |
*** mlavalle has joined #opendev | 13:59 | |
clarkb | re openedge phishing, we do host pre-built static websites there. I wonder if google's indexer has found that and decided it's bad | 13:59 |
clarkb | for example zuul-ci.org contents will be available there in test builds but the domain will be openedge | 14:00 |
fungi | nothing else i can think of which is on fire, no | 14:00 |
fungi | also i think i've gotten the last of the mirror volumes caught back up, but still need to double-check their timestamps | 14:00 |
*** roman_g has joined #opendev | 14:02 | |
clarkb | my replication from review-test to test gitea is still running with ~1100 repos to go | 14:03 |
fungi | looks like mirror.ubuntu-ports needs help, but the rest are fine now. i'll see what it's problem is in a moment | 14:05 |
clarkb | thanks! | 14:05 |
clarkb | fungi: do you think we need to do more followup with gerrit, gitea, or openedge? | 14:05 |
fungi | clarkb: not at the moment probably | 14:05 |
fungi | unless someone wants to dig into the gerrit error frickler found | 14:06 |
clarkb | "unexpected thread death" ? | 14:07 |
clarkb | my initial hunch is we're going to discover its a jetty bug and yet another reason to upgrade :) | 14:07 |
fungi | that's my gut feel, yeah | 14:11 |
*** redrobot has quit IRC | 14:22 | |
*** ysandeep is now known as ysandeep|away | 14:25 | |
clarkb | is it possible the db outage didn't happen until just now? that may also explain it | 14:27 |
clarkb | the friday window was super quiet, at least during the period of it I managed to stay awake | 14:28 |
fungi | yeah, if the outage happened today then they didn't tell us | 14:28 |
clarkb | I wonder if we can check the db uptime somehow? mysql may expose that? | 14:28 |
clarkb | just to rule that out | 14:29 |
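(Assuming the backing database is MySQL-compatible and credentials are at hand, a minimal uptime check would be:)

```shell
# seconds since the database server last started
mysql -e "SHOW GLOBAL STATUS LIKE 'Uptime';"
# or the one-line summary, which includes uptime
mysqladmin status
```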
*** nuclearg1 has joined #opendev | 14:35 | |
fungi | mirror.ubuntu-ports had a stale vldb lock, so i've taken the flock for its update cronjob on mirror-update.openstack.org after the last run, unlocked the volume manually and started a vos release -localauth in a root screen session on afs01.dfw | 14:41 |
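(Roughly the manual recovery being described, as a sketch; run as root on the fileserver so -localauth works:)

```shell
vos examine mirror.ubuntu-ports -localauth   # inspect the volume and its VLDB lock state
vos unlock mirror.ubuntu-ports -localauth    # clear the stale lock
vos release mirror.ubuntu-ports -localauth   # push the RW volume out to the RO replicas
```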
*** lpetrut has quit IRC | 14:51 | |
*** fressi has joined #opendev | 14:55 | |
*** fressi has quit IRC | 15:04 | |
*** smcginnis has quit IRC | 15:17 | |
*** ykarel is now known as ykarel|away | 15:18 | |
*** smcginnis has joined #opendev | 15:30 | |
*** mkalcok has quit IRC | 15:45 | |
*** ykarel|away has quit IRC | 15:50 | |
fungi | looking at http://grafana.openstack.org/d/ACtl1JSmz/afs?orgId=1 mirror.ubuntu seems to be slightly over quota and mirror.ubuntu-ports is very close | 15:59 |
fungi | as is mirror.yum-puppetlabs | 15:59 |
clarkb | you should be able to safely bump up ubuntu quota | 16:00 |
clarkb | we've cleaned up a lot of the old suse and fedora stuff recently so should have plenty of room | 16:00 |
fungi | mirror.ubuntu may not be over quota yet, but it's at least nearly there | 16:01 |
clarkb | we may also be able to trim the ubuntu-ports mirror if people aren't testing on older images there? | 16:04 |
clarkb | that seems very likely given how we've relied on newer kernels than distros have provided by default in the past | 16:04 |
*** rpittau is now known as rpittau|afk | 16:21 | |
*** SotK has quit IRC | 16:27 | |
*** SotK has joined #opendev | 16:29 | |
*** rpittau|afk has quit IRC | 16:31 | |
*** marios is now known as marios|out | 16:32 | |
*** ShadowJonathan has quit IRC | 16:33 | |
*** rpittau|afk has joined #opendev | 16:34 | |
*** ShadowJonathan has joined #opendev | 16:35 | |
*** priteau has joined #opendev | 16:39 | |
*** marios|out has quit IRC | 16:43 | |
*** hamalq has joined #opendev | 16:44 | |
clarkb | down to 1100 replication tasks remaining | 16:53 |
*** ykarel|away has joined #opendev | 17:01 | |
mwhahaha | hey is there any plan to upgrade gerrit at least to a newer 2.x version? | 17:10 |
clarkb | mwhahaha: http://lists.opendev.org/pipermail/service-discuss/2020-October/000103.html | 17:11 |
clarkb | we haven't scheduled an outage window yet, but have what appears to be a working upgrade process all the way to 3.2 | 17:11 |
mwhahaha | nice | 17:12 |
fungi | hope to talk about the schedule in the opendev meeting later today | 17:12 |
mwhahaha | i ask cause Emilien was hitting problems with F33's ssh crypto policies and the current gerrit version | 17:12 |
clarkb | still doing testing. Working on replication to gitea as we speak. Need to test project creation and renaming after that. We also know that jeepyb launchpad integration will break because the database is going away | 17:12 |
fungi | mwhahaha: yep, i gave him a hopefully less intrusive workaround in #openstack-infra a few minutes ago | 17:12 |
clarkb | with the release this week, summit next, ptg after, then elections after that I doubt it happens before the middle of november | 17:12 |
fungi | elections? | 17:13 |
mwhahaha | ah nice | 17:13 |
clarkb | mwhahaha: also https is an option if people would prefer not to modify ssh configs (though they can be modified on a per host basis so doesn't seem like a major deal) | 17:13 |
mwhahaha | you can target just review. in the ssh config anyway | 17:13 |
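(A sketch of that per-host override for an OpenSSH client, assuming the F33 crypto policy is what rejects the older algorithms; which options are actually needed depends on the client version:)

```
# ~/.ssh/config
Host review.opendev.org
    Port 29418
    HostKeyAlgorithms +ssh-rsa
    PubkeyAcceptedKeyTypes +ssh-rsa
```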
clarkb | mwhahaha: I've also been encouraging people that are interested to check out https://review-test.opendev.org which is an upgraded snapshot of production from october 1 | 17:14 |
mwhahaha | i was asking because i was thinking about the ed keys which were added in 2.14 i think | 17:14 |
*** ykarel|away has quit IRC | 17:14 | |
clarkb | mwhahaha: in this case its the host key that it is complaining about not the user auth key | 17:14 |
mwhahaha | yes i'm aware | 17:14 |
clarkb | and I think ed keys don't work in gerrit at all? might be ecdsa | 17:14 |
clarkb | (it generates them for you when you upgrade then complains about them when you start it) | 17:15 |
mwhahaha | Support for elliptic curve/ed25519 SSH keys | 17:15 |
mwhahaha | is that just server side? | 17:15 |
mwhahaha | https://bugs.chromium.org/p/gerrit/issues/detail?id=4507 | 17:15 |
clarkb | mwhahaha: I think they added support for them for user auth but not host keys | 17:15 |
mwhahaha | right | 17:15 |
mwhahaha | i was talking user auth | 17:15 |
clarkb | just calling that out because switching the user auth key type doesn't necessarily fix emilienM's problem. I tested review-test and its host keys should be fine but not sure what version that changed in | 17:16 |
mwhahaha | yea i know it's two different things | 17:16 |
mwhahaha | thanks for clarifying tho | 17:16 |
clarkb | I think my favorite new feature in gerrit is the ability to set a user status. review-test thinks I'm partying | 17:18 |
mwhahaha | ha | 17:18 |
fungi | just need a tiki icon to go with it | 17:18 |
clarkb | there are actually a lot of properly useful features that I'll be happy to see like single page diffs for all files and better dashboarding | 17:19 |
clarkb | but we know it isn't perfect, we'll likely try and fix integrations and other things people notice after we upgrade just so that we actually get the upgrade done | 17:19 |
clarkb | one ui nit I've noticed recently is it puts the change approval, rebase, edit, abandon buttons below the user area and if you click on the user area it does a fly out thing that will cover those other commands | 17:20 |
clarkb | have to carefully click to avoid abandoning | 17:21 |
*** Guest75569 has joined #opendev | 17:21 | |
* fungi abandons his changes and goes partying | 17:23 | |
*** Guest75569 is now known as redrobot | 17:23 | |
clarkb | fungi: re election I expect people will be very distracted on top of normal ptg hangover state | 17:25 |
clarkb | I mean maybe that is a good time for us to upgrade gerrit if we can get over our own hangovers :) | 17:26 |
fungi | ahh, not openstack elections | 17:26 |
fungi | well, you're probably right that even people outside the usa will be distracted by whatever results from the usa elections | 17:27 |
*** nuclearg1 has quit IRC | 17:28 | |
*** andrewbonney has quit IRC | 17:28 | |
*** eolivare has quit IRC | 17:30 | |
clarkb | the major struggle in testing this is we have what I'm beginning to think of as data rot. We've accumulated so much data in gerrit over 8ish years that testing is slow. I did a lot of testing locally to check out simple things but eventually its just easier with better coverage confidence to test with the prod snapshot | 17:32 |
clarkb | I've noticed that nova has grown beyond 1GB recently too | 17:33 |
clarkb | its just over 1GB now | 17:33 |
fungi | that's after aggressive gc right? | 17:35 |
clarkb | yup | 17:35 |
clarkb | I used nova as a git clone test from gerrit | 17:35 |
clarkb | and it came out to like 1.01GB | 17:35 |
clarkb | s/gerrit/review-test/ | 17:35 |
fungi | time to declare nova feature complete | 17:35 |
clarkb | thats without the extra refs too | 17:36 |
clarkb | thats what you get if you clone it normally to dev on it | 17:36 |
*** roman_g has quit IRC | 17:52 | |
*** priteau has quit IRC | 17:53 | |
*** priteau has joined #opendev | 17:53 | |
*** ralonsoh has quit IRC | 18:00 | |
*** priteau has quit IRC | 18:01 | |
*** sshnaidm is now known as sshnaidm|afk | 18:24 | |
*** sboyron has quit IRC | 18:44 | |
fungi | ubuntu-ports volume was released successfully and then i ran another mirror update manually just to make sure it's working correctly | 18:49 |
fungi | i'll take a look at quotas here momentarily | 18:49 |
fungi | yeah, fs listquota says mirror.ubuntu is 99% and mirror.ubuntu-ports is 93% used | 18:52 |
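(The check/bump pair looks roughly like this; the mount path is illustrative and quotas are in KB:)

```shell
fs listquota /afs/.openstack.org/mirror/ubuntu
fs setquota /afs/.openstack.org/mirror/ubuntu -max 650000000
```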
*** hashar has joined #opendev | 18:57 | |
*** fressi has joined #opendev | 18:58 | |
*** fressi has quit IRC | 18:59 | |
fungi | #status log increased mirror.ubuntu afs quota from 550000000 to 650000000 (99%->84% used) | 19:00 |
openstackstatus | fungi: finished logging | 19:00 |
fungi | #status log increased mirror.ubuntu-ports afs quota from 500000000 to 550000000 (93%->84% used) | 19:01 |
openstackstatus | fungi: finished logging | 19:01 |
*** diablo_rojo has joined #opendev | 19:12 | |
*** openstackgerrit has joined #opendev | 19:17 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] reprepro https://review.opendev.org/757660 | 19:17 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] reprepro https://review.opendev.org/757660 | 19:30 |
ianw | corvus: not mentioned but there's a little stack @ https://review.opendev.org/#/c/756605/ about capturing container logs which you might like to consider | 20:01 |
ianw | not mentioned in meeting sorry | 20:01 |
corvus | ianw: ack, thanks for the heads up will look in a sec | 20:01 |
clarkb | infra-root I mentioned it in the meeting too but the stack at https://review.opendev.org/#/c/757162/ is good flavor for the things that will change config wise on the gerrit server as we go through the upgrade | 20:03 |
clarkb | https://etherpad.opendev.org/p/gerrit-2.16-upgrade is a rough rough draft of the mechanical process to implement the upgrade | 20:04 |
clarkb | I intend on rewriting that to reduce the questions and stick to a concrete plan soon. I think we've largely sorted out that process at this point and now its just verifying the results | 20:04 |
clarkb | one really neat thing is that if you gc --aggressive before reindexing the reindexing cost drops like a stone | 20:05 |
clarkb | that tip from luca is likely to be the major difference between a 2.16 only upgrade and the 3.2 upgrade | 20:05 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] reprepro https://review.opendev.org/757660 | 20:06 |
clarkb | the notedb migration does a built in reindex and it doesn't gc first which is part of why that step is so slow | 20:06 |
clarkb | but that is one of three reindexes so the other two are much quicker | 20:06 |
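(A sketch of that ordering with hypothetical paths; pack every bare repo first, then run the offline reindex:)

```shell
# gc each repository so the reindex walks packed objects instead of loose ones
find /home/gerrit2/review_site/git -type d -name '*.git' -prune -print |
  while read -r repo; do git -C "$repo" gc --aggressive --prune=now; done

# offline reindex against the site directory
java -jar gerrit.war reindex -d /home/gerrit2/review_site
```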
*** hashar has quit IRC | 20:30 | |
*** Dmitrii-Sh has quit IRC | 20:40 | |
*** Dmitrii-Sh has joined #opendev | 20:41 | |
*** iurygregory has quit IRC | 20:50 | |
*** iurygregory has joined #opendev | 20:50 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] reprepro https://review.opendev.org/757660 | 20:52 |
clarkb | ok ansible fest things are done for the day I'm going to get a bike ride in before it starts raining again | 20:55 |
clarkb | replication is still going | 20:55 |
corvus | ianw: +1 on that logging change but i noticed an anomaly; left a comment | 21:02 |
ianw | corvus: yeah, so in post we try to "docker logs > " all containers to capture their logs, which goes into the docker directory, but with the change updated containers are logging to /var/log/containers | 21:12 |
ianw | we can remove the "docker logs" dump when all containers are logging via syslog | 21:13 |
ianw | corvus/clarkb: the other one under that if you have a sec is https://review.opendev.org/#/c/756628/3 to remove the custom rsyslogd bit we install, which i don't think we need any more | 21:14 |
corvus | ianw: but docker logs is the one that's working? | 21:27 |
corvus | and whatever writes to 'containers' (i assume that's podman?) is failing? | 21:27 |
ianw | corvus: yes, in post we do a speculative dump of docker and podman container logs (both, i think) with ignore on, so for jobs with no containers involved its also failing | 21:29 |
corvus | ianw: we'll want to keep the 'docker logs' dump though, since that's still working. unless you plan to collect /var/logs/containers? | 21:31 |
corvus | ianw: i got that backwards didn't i? | 21:32 |
corvus | ianw: it's 'docker' that's failing and containers that's working | 21:32 |
ianw | corvus: yep; when containers are directing their logs to syslog (which we collect and save in /var/log/containers), "docker logs" on them just shows the "the logs aren't here" message | 21:33 |
corvus | ianw: because you did add an entry to collect /var/log/containers | 21:33 |
ianw | right, basically i'd like to convert everything so container logs are in "/var/log/containers/docker-<sensible-name>" | 21:34 |
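(A sketch of the per-service compose stanza that does this; the service name and tag are illustrative, and rsyslog then files the tagged messages under /var/log/containers/:)

```yaml
services:
  ethercalc:
    logging:
      driver: syslog
      options:
        tag: docker-ethercalc
```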
corvus | ianw: okay, sorry i messed that up. i agree that change looks good and removing the docker collection is fine. one more question though: that means that "docker logs foo" isn't going to work for us on the real hosts either. are we okay with that? (i'm assuming we are -- we'll just 'less /var/log/containers' and probably enjoy that more anyway). | 21:34 |
ianw | right, personally i find "docker logs" quite frustrating compared to just a regular file on disk | 21:35 |
corvus | wfm. | 21:35 |
corvus | ianw: +2; will let you +w or circulate more as appropriate | 21:35 |
ianw | i'll update the other compose files as well | 21:36 |
fungi | some of the same complaints i have with journald | 21:36 |
ianw | in a follow-on | 21:36 |
fungi | good ol' log files in /var are hard to beat | 21:36 |
ianw | the only problem with syslog is the ridiculous low-precision, english-based timestamp format | 21:37 |
fungi | yes, i concur | 21:39 |
fungi | iso-8601 with subseconds would be far better | 21:40 |
fungi | interestingly, rfc 5424 (and 3339 before it) mandates iso-8601 for syslog protocol | 21:47 |
fungi | 5424 also allows for microsecond resolution in the timestamp | 21:48 |
openstackgerrit | Merged opendev/system-config master: Remove Ubuntu Xenial ARM64 base testing https://review.opendev.org/756627 | 21:48 |
fungi | you can instruct rsyslog to write out timestamps in any format you like | 21:49 |
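(For instance, switching file output to the built-in high-precision RFC 3339 template; a one-line sketch for rsyslog.conf:)

```
# the default is RSYSLOG_TraditionalFileFormat (low-precision, english month names)
$ActionFileDefaultTemplate RSYSLOG_FileFormat
```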
ianw | yes, you may notice the esxi syslog puts out logs in this format | 21:55 |
ianw | i wonder if the latest releases are still using vmsyslogd | 21:57 |
ianw | looks like it. i originally wrote all that in python to replace the busybox syslog collection it had before, with the plan to write it "properly" in C++ or whatever once all the esxcli integration was fleshed out etc. | 22:00 |
ianw | fungi: if you have any thoughts on how else to get back to the default file in https://review.opendev.org/#/c/756628/ i'm open to suggestions too :) | 22:03 |
fungi | `apt install --reinstall rsyslog` might do it, `apt purge rsyslog && apt install rsyslog` almost certainly would | 22:06 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] reprepro https://review.opendev.org/757660 | 22:08 |
ianw | yeah that takes the purge/install approach | 22:12 |
ianw | right, i'm going to manually pip upgrade jinja on bridge now and try the borg playbook | 22:12 |
*** paramite has quit IRC | 22:23 | |
ianw | well ethercalc02 has applied borg ok, i'm running the backup script manually now and it seems to be backing up | 22:31 |
*** diablo_rojo has quit IRC | 22:34 | |
*** qchris has quit IRC | 22:41 | |
*** qchris has joined #opendev | 22:54 | |
fungi | yay! | 22:59 |
clarkb | woot | 23:00 |
* clarkb is now back from the bike ride | 23:00 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: borg-backups: add some extra excludes https://review.opendev.org/757965 | 23:02 |
clarkb | it occurred to me on my bike ride I hadn't tested stream events yet. I have now done that, seems to work fine | 23:02 |
fungi | excellent | 23:03 |
*** mlavalle has quit IRC | 23:03 | |
clarkb | ianw: corvus: the problem I've found with docker logs foo is that it prints from the beginning of time which can lead to very long wait times | 23:03 |
clarkb | even if you provide a --since flag it has to scroll through without printing first | 23:04 |
ianw | yeah, same problem journalctl seems to have as well | 23:04 |
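(The flags in question, for reference; the container name is illustrative. Per the complaint above, --since still scans from the start before printing, but --tail at least bounds the output:)

```shell
docker logs --since 1h --tail 100 ethercalc
```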
clarkb | ianw: for https://review.opendev.org/#/c/756628/3/playbooks/roles/base/server/tasks/Debian.yaml purging won't affect running services right? it should keep running the existing service process then restart it when the reinstall happens? | 23:05 |
* clarkb is trying to be slushy :) | 23:05 | |
clarkb | fungi: ^ as someone more in tune with the openstack release process maybe you want to take a look at that one and decide if it is safe to go in now? | 23:05 |
ianw | clarkb: yeah, i think there will be a small period of cutover only. i don't mind waiting a bit on that | 23:05 |
clarkb | I've +2'd it and if fungi is comfortable with it he can approve now or we can approve tomorrow post release | 23:06 |
ianw | we can revert after one run of base | 23:06 |
clarkb | I'm going to take a second look at the replication plugin this time by rtfs'ing to see if we can perhaps make replication a bit more friendly post notedb migration | 23:07 |
clarkb | I get the sense that most people use replication to cluster or have a hot standby server and not to take load off the main server | 23:07 |
ianw | clarkb: if you'd like to play, backup02.ca-ymq-1.vexxhost.opendev.org with borg-ethercalc02 user should be able to be poked at | 23:08 |
clarkb | (and in those cases you want to replicate everything) | 23:08 |
*** slaweq has quit IRC | 23:08 | |
ianw | /opt/borg/bin/borg is the binary | 23:08 |
clarkb | ianw: we're doing that over ssh but without encryption at rest right? so no passphrase to sort out? | 23:09 |
ianw | clarkb: right, no encryption on disk | 23:10 |
clarkb | ianw: I see 3 backups for a total of 1.82 GB | 23:13 |
clarkb | now to try a fuse mount and see that the redis dump is there | 23:14 |
ianw | I got "Warning: The repository at location /opt/backups-202010/borg-ethercalc02/backup was previously located at /opt/backups/borg-ethercalc02/backup" when i looked | 23:15 |
clarkb | /opt/borg/bin/borg info '::ethercalc02-2020-10-13T22:49:26' is my command no warning | 23:15 |
clarkb | and borg list shows you what the valid ^ things are | 23:16 |
clarkb | before I fuse mount do we have an excluded backup mount point? | 23:16 |
clarkb | that may be good so we avoid a feedback loop | 23:16 |
clarkb | `/opt/borg/bin/borg mount '::ethercalc02-2020-10-13T22:49:26' /mnt` is how I was going to fuse mount but holding off while I check our exclusions | 23:16 |
ianw | i'm pretty sure it won't cross file-systems | 23:17 |
clarkb | looks like we don't have a good exclude for that should we call it /root/borg_mnt ? | 23:17 |
clarkb | oh | 23:17 |
ianw | the other thing is that it's an include-process, so as long as you don't mount under one of the included dirs too | 23:17 |
ianw | clarkb: https://review.opendev.org/#/c/757965/1/playbooks/roles/borg-backup/defaults/main.yaml (if you want to review that too :) | 23:18 |
clarkb | oh right we're backing up /etc /home /var and /root | 23:18 |
clarkb | so /mnt is safe /me mounts and looks then | 23:19 |
ianw | i think it cached my response to the "repository moved" question | 23:19 |
clarkb | borg mount not available: loading FUSE support failed [ImportError: No module named 'llfuse'] | 23:19 |
ianw | weird that it doesn't know about symlinks | 23:19 |
ianw | hrm, we may not have built the pip install with fuse support? | 23:19 |
clarkb | ya it needs the llfuse python package | 23:20 |
clarkb | I personally think that is worthwhile as the fuse support is one of my favorite borg functionalities | 23:20 |
clarkb | really simplifies verification and all that of backups | 23:20 |
ianw | do we need it on the hosts or just the server? | 23:20 |
clarkb | I believe the hosts | 23:20 |
clarkb | since its running the fuse on the client side | 23:21 |
ianw | oh right, to be able to mount on each server | 23:21 |
clarkb | pip install borgbackup[fuse] is what googling tells me | 23:21 |
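(Putting the pieces together, a sketch assuming the same /opt/borg virtualenv and the archive name listed earlier; the distro fuse/libfuse-dev packages need to be present first:)

```shell
/opt/borg/bin/pip install 'borgbackup[fuse]'
mkdir -p /tmp/borg-mnt
/opt/borg/bin/borg mount '::ethercalc02-2020-10-13T22:49:26' /tmp/borg-mnt
ls /tmp/borg-mnt            # browse the backup like a normal filesystem
/opt/borg/bin/borg umount /tmp/borg-mnt
```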
clarkb | from what I can see so far it seems happy though | 23:23 |
clarkb | ianw: what command produces that warning? | 23:23 |
clarkb | and where are you running it from? | 23:23 |
ianw | clarkb: i was running that on the backup server, as the borg user | 23:23 |
*** zigo has quit IRC | 23:23 | |
ianw | but i think i'm seeing that i should reframe things to think about inspecting it on the remote server | 23:24 |
clarkb | ianw: fwiw I get that too because I do offsite backups to my brother's house over a shared ISP connection (so its really fast) but his dynamic addressing changes and it warns me | 23:24 |
clarkb | it seems to be harmless and is more of a "hey user you may not have expected this so we're warning you" rather than "functionality is degraded" | 23:24 |
ianw | ok, as long as it's not corrupting anything :) | 23:24 |
clarkb | hrm ya I've always interacted with it in the context of the client side | 23:24 |
*** qchris has quit IRC | 23:25 | |
clarkb | because my remote is a slow raspi | 23:25 |
clarkb | with encryption you really really really want to do that on the faster client side :) | 23:25 |
ianw | ok, i can look at the fuse bits. i think we leave it for a few days to test the nightly runs and then we can look at rolling it out | 23:27 |
clarkb | ++ | 23:27 |
clarkb | if the package deps for fuse are bad somehow we can also just document that you need to install it in a different venv or something | 23:27 |
clarkb | but ya beats grabbing a complete tarball and then extracting specific bits | 23:28 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] reprepro https://review.opendev.org/757660 | 23:30 |
ianw | looks like "libfuse-dev fuse pkg-config" which i guess isn't too bad | 23:31 |
*** qchris has joined #opendev | 23:34 | |
*** tosky has quit IRC | 23:40 | |
fungi | sorry, sucked into election activity but can maybe look later | 23:44 |