*** DSpider has quit IRC | 00:37 | |
fungi | yes, i agree the fact that we're logging both vhosts to the same file makes investigating this slightly more confusing | 00:47 |
fungi | ultimately, i would expect requests considered for caching to either be logged as a cache hit or a cache miss | 00:47 |
fungi | requests logged as neither are, i think, not being considered for caching at all | 00:47 |
fungi | possibly skipped by the cache mod, possibly not routed to it, i'm not sure which | 00:48 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: zuul-web: move LogFormat combined-cache into config https://review.opendev.org/751623 | 01:15 |
*** user_19173783170 has joined #opendev | 01:16 | |
ianw | fungi: ^ i agree it's not being considered, with the mod_cache status just "-" | 01:21 |
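For context, a minimal sketch of the combined-cache LogFormat being discussed (the exact format string in system-config may differ): mod_cache writes its decision to the cache-status environment variable, so requests it never considered leave the variable unset and log as "-".

```
# mod_cache records its decision in the cache-status env var; requests that
# were never routed to the cache module leave it unset and log as "-"
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{cache-status}e\" \"%{Referer}i\" \"%{User-Agent}i\"" combined-cache
CustomLog ${APACHE_LOG_DIR}/zuul-access.log combined-cache
```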
user_19173783170 | when i register my openstack foundation account, it always prompts "Please confirm that you are not a robot", why can't i receive the captcha? | 03:14 |
ianw | 2001:4800:7819:103:be76:4eff:fe04:5870 - - [2020-09-14 03:19:27.549] "GET /api/tenant/pyca/status HTTP/1.1" 200 1140 cache hit "-" "curl/7.47.0" | 03:20 |
ianw | so so much is wrong | 03:20 |
ianw | user_19173783170: what's the page url? | 03:20 |
user_19173783170 | it's this:"https://openstackid.org/auth/register?client_id=7tdfQq8hu5SbLqGRXtQk0lwfD4mHBnTt.openstack.client&redirect_uri=https%3A%2F%2Fwww.openstack.org%2FSecurity%2Flogin%3FBackURL%3Djoin%252Fregister%252F%253Fmembership-type%253Dfoundation%26BackURL%3Dhttps%253A%252F%252Fwww.openstack.org%252F" | 03:21 |
ianw | user_19173783170: ok, so you don't see the "i'm not a robot" check box down at the bottom? i do | 03:23 |
ianw | or are you saying you select that and it doesn't believe you? | 03:23 |
user_19173783170 | i dont see it | 03:23 |
user_19173783170 | is the reason my ip is in china? | 03:24 |
ianw | user_19173783170: do you have any sort of ad-blockers or similar installed? | 03:24 |
ianw | oh ... china ... well maybe? it's a standard reCAPTCHA box i see | 03:24 |
ianw | Google reCAPTCHA works in China, as long as you reference reCAPTCHA library by https://www.recaptcha.net instead of https://www.google.com. See developer doc section “Can I use reCAPTCHA globally” | 03:25 |
user_19173783170 | dont have ad-blockers | 03:25 |
ianw | <script src='https://www.google.com/recaptcha/api.js?render=onload'></script> | 03:26 |
ianw | so it looks like that recaptcha should probably not be referencing google.com, for global support | 03:26 |
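The implied fix, per the reCAPTCHA developer FAQ ianw quotes above, is a one-line change on the site's side; a sketch:

```html
<!-- same API, served from a host reachable where google.com is blocked -->
<script src='https://www.recaptcha.net/recaptcha/api.js?render=onload'></script>
```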
ianw | user_19173783170: it looks like the website will have to fix this ... can you use a vpn :/ | 03:28 |
clarkb | fwiw I expect jimmy can help tomorrow. Maybe file a bug? | 03:28 |
ianw | sorry not sure what else to suggest. we definitely have users from China, but I'm not sure if they worked around this or it's something new | 03:28 |
clarkb | https://bugs.launchpad.net/openstack-org | 03:29 |
clarkb | is the bug tracker and I can ping jimmy et al in the morning | 03:30 |
ianw | clarkb / user_19173783170 : i can quickly file the bug | 03:30 |
user_19173783170 | i use this for the first time | 03:31 |
ianw | https://bugs.launchpad.net/openstack-org/+bug/1895496 | 03:32 |
openstack | Launchpad bug 1895496 in openstack-org "User from China reporting reCAPTCHA does not work" [Undecided,New] | 03:32 |
ianw | user_19173783170: ^ i'm afraid we might have to wait for a resolution on that (which i imagine will happen US daytime tomorrow) to get this going for you | 03:33 |
user_19173783170 | no problem, thanks for your help | 03:33 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: zuul-web: rework caching https://review.opendev.org/751645 | 04:00 |
ianw | fungi / clarkb: ^ i've been poking at the 000-default.conf to come up with that. with it, i'm seeing everything get its cache-event flag in the logs filled out. i think it's on the way, at least | 04:11 |
*** lpetrut has joined #opendev | 05:57 | |
*** ysandeep|away is now known as ysandeep | 06:10 | |
*** cgoncalves has joined #opendev | 06:13 | |
*** qchris has quit IRC | 06:21 | |
openstackgerrit | Carlos Goncalves proposed openstack/project-config master: Update branch checkout for octavia-lib DIB element https://review.opendev.org/745877 | 06:33 |
*** qchris has joined #opendev | 06:33 | |
*** hashar has joined #opendev | 06:40 | |
*** andrewbonney has joined #opendev | 07:42 | |
*** ysandeep is now known as ysandeep|lunch | 07:46 | |
openstackgerrit | Pierre-Louis Bonicoli proposed zuul/zuul-jobs master: default test_command: don't use a shell builtin https://review.opendev.org/751659 | 07:53 |
*** moppy has quit IRC | 08:01 | |
*** moppy has joined #opendev | 08:01 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Create elastic-recheck container image https://review.opendev.org/750958 | 08:14 |
*** DSpider has joined #opendev | 08:24 | |
*** tosky has joined #opendev | 08:27 | |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed zuul/zuul-jobs master: Add support to use stow for ensure-python https://review.opendev.org/751611 | 08:46 |
openstackgerrit | wu.shiming proposed openstack/diskimage-builder master: Remove install unnecessary packages https://review.opendev.org/751665 | 08:46 |
openstackgerrit | Pierre-Louis Bonicoli proposed zuul/zuul-jobs master: default test_command: don't use a shell builtin https://review.opendev.org/751659 | 09:06 |
*** ysandeep|lunch is now known as ysandeep | 09:10 | |
*** sshnaidm|pto is now known as sshnaidm | 09:10 | |
*** dtantsur|afk is now known as dtantsur | 10:27 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Create elastic-recheck container image https://review.opendev.org/750958 | 10:32 |
*** ykarel has joined #opendev | 10:52 | |
ykarel | Is there some issue with http://codesearch.openstack.org/ | 10:52 |
ykarel | it returns 500 | 10:52 |
cgoncalves | I can confirm 500s | 11:01 |
user_19173783170 | i have solved the problem of not receiving the CAPTCHA from a chinese ip | 11:14 |
user_19173783170 | the solution is installing a plugin named "Ghelper" in Chrome | 11:16 |
user_19173783170 | i also want to ask how to link my openstack foundation account to my ubuntuone account | 11:33 |
*** lpetrut has quit IRC | 12:00 | |
*** lpetrut has joined #opendev | 12:01 | |
*** Goneri has joined #opendev | 12:11 | |
*** priteau has joined #opendev | 12:14 | |
*** slaweq_ has joined #opendev | 12:37 | |
ttx | user_19173783170: ah, good to know. I'll pass the info along | 12:41 |
ttx | user_19173783170: For your account issue, you should send an email to support@openstack.org so that they can help you | 12:42 |
*** mnaser has quit IRC | 13:32 | |
*** mnaser has joined #opendev | 13:32 | |
*** mnaser has quit IRC | 13:32 | |
*** mnaser has joined #opendev | 13:32 | |
*** tkajinam has quit IRC | 13:37 | |
*** slaweq_ has quit IRC | 13:38 | |
fungi | ykarel: yeah, emilienm reported it in #openstack-infra too. taking a look now | 13:59 |
fungi | #status log restarted houndd on codesearch.o.o following a json encoding panic at 10:03:40z http://paste.openstack.org/show/797837/ | 14:01 |
openstackstatus | fungi: finished logging | 14:01 |
fungi | ykarel: ^ it should be on its way back up now | 14:02 |
fungi | cgoncalves: ^ | 14:02 |
cgoncalves | fungi, thanks! waiting for reindexing to finish :) | 14:02 |
fungi | yeah, it takes a few minutes for that to complete unfortunately | 14:04 |
*** auristor has quit IRC | 14:05 | |
*** sshnaidm is now known as sshnaidm|afk | 14:07 | |
ykarel | fungi, Thanks | 14:08 |
fungi | user_19173783170: if you use the same e-mail addresses for both your openstack foundation and ubuntuone accounts, then we'll be able to correlate them | 14:12 |
*** hashar has quit IRC | 14:14 | |
*** auristor has joined #opendev | 14:18 | |
dmsimard | btw seeing all fedora-31 based jobs fail in RETRY_LIMIT due to unbound being in an "unknown state", i.e: https://zuul.opendev.org/t/openstack/build/71a1da78b3f24d2e9883db36a5cf156c/console#0/2/9/fedora-31 | 14:19 |
dmsimard | won't have time to troubleshoot for a while longer but wanted to point out in case others have a similar issue | 14:20 |
*** ykarel_ has joined #opendev | 14:25 | |
*** ykarel has quit IRC | 14:28 | |
*** ykarel__ has joined #opendev | 14:32 | |
*** ykarel_ has quit IRC | 14:35 | |
*** ykarel__ is now known as ykarel | 14:35 | |
openstackgerrit | Carlos Goncalves proposed openstack/project-config master: Add 'check arm64' trigger to check-arm64 pipeline https://review.opendev.org/751829 | 14:36 |
*** slaweq_ has joined #opendev | 14:40 | |
*** icey has quit IRC | 14:48 | |
*** icey has joined #opendev | 14:49 | |
*** slaweq_ has quit IRC | 14:55 | |
*** ykarel is now known as ykarel|away | 15:00 | |
*** lpetrut has quit IRC | 15:06 | |
*** Topner has joined #opendev | 15:11 | |
*** Topner has quit IRC | 15:11 | |
*** ykarel|away has quit IRC | 15:20 | |
*** Topner has joined #opendev | 15:23 | |
*** lpetrut has joined #opendev | 15:28 | |
*** priteau has quit IRC | 16:04 | |
openstackgerrit | Merged opendev/system-config master: zuul-web: move LogFormat combined-cache into config https://review.opendev.org/751623 | 16:05 |
*** ysandeep is now known as ysandeep|away | 16:31 | |
*** ykarel|away has joined #opendev | 16:56 | |
*** mlavalle has joined #opendev | 16:57 | |
*** lpetrut has quit IRC | 16:57 | |
*** Gyuseok_Jung has quit IRC | 17:00 | |
*** ykarel|away has quit IRC | 17:05 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Create elastic-recheck container image https://review.opendev.org/750958 | 17:15 |
clarkb | I'm looking at the fedora issue dmsimard reported | 17:20 |
fungi | thanks, i hadn't gotten time to dig into that yet | 17:20 |
clarkb | https://bugzilla.redhat.com/show_bug.cgi?id=1853736 seems related | 17:20 |
openstack | bugzilla.redhat.com bug 1853736 in systemd "systemctl show service fails with "Failed to parse bus message: Invalid argument"" [Unspecified,Closed: errata] - Assigned to systemd-maint | 17:20 |
clarkb | it seems that they fixed it in fedora 32 but not in 31 | 17:22 |
fungi | fedora 31 is *so* last year (literally!) | 17:23 |
clarkb | ya I'm not sure how to handle this | 17:23 |
clarkb | we could shell out on f31 but I worry the next use of the service module will just fail | 17:23 |
clarkb | though in this case maybe having a working base job is enough for us, and then jobs can correct their own use of the service module | 17:24 |
dmsimard | o/ thanks for looking into this -- it started happening fairly recently, worked until now | 17:31 |
*** andrewbonney has quit IRC | 17:31 | |
dmsimard | I can confirm that bumping to f32 "fixes" it | 17:31 |
clarkb | I'm just going to brute force the service restarts with commands | 17:32 |
clarkb | working on that change now | 17:32 |
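The brute-force approach amounts to a task like this (a sketch; the service name and condition mirror the unbound/f31 case discussed, and the task wording is illustrative):

```yaml
# bypass the broken service module on Fedora 31 by invoking systemctl directly
- name: Restart unbound (workaround for the f31 systemd D-Bus parse bug)
  command: systemctl restart unbound
  become: true
  when:
    - ansible_distribution == 'Fedora'
    - ansible_distribution_major_version == '31'
```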
*** dtantsur is now known as dtantsur|afk | 17:35 | |
clarkb | how do you use different handlers based on some criteria? | 17:40 |
clarkb | do we need to have different notifying tasks for those criteria? | 17:40 |
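One answer, sketched (not necessarily what the change proposed below does): handlers are tasks, so each can carry its own when: condition, and several handlers can share a listen: topic so a single notify fans out to whichever variant applies.

```yaml
handlers:
  # a task notifies "restart unbound"; only the handler whose condition
  # matches actually fires on a given host
  - name: restart unbound via service module
    service:
      name: unbound
      state: restarted
    when: not (ansible_distribution == 'Fedora' and ansible_distribution_major_version == '31')
    listen: restart unbound

  - name: restart unbound via systemctl
    command: systemctl restart unbound
    when: ansible_distribution == 'Fedora' and ansible_distribution_major_version == '31'
    listen: restart unbound
```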
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Test handling unbound restart on fedora 31 https://review.opendev.org/751872 | 17:45 |
clarkb | I guess we can test and see if ^ works | 17:45 |
clarkb | another option may be to just deprecate f31 quickly? I dunno where f33 is at. Something to talk to ianw about I guess | 17:50 |
clarkb | infra-root https://review.opendev.org/751645 and https://review.opendev.org/#/c/751426/ are two different followups to the zuul web performance issues from last week. One addresses caching and the other adds more zuul-webs | 17:51 |
dmsimard | f33 is in beta right now iirc | 17:52 |
fungi | and includes zuul! | 17:53 |
* fungi is so proud | 17:53 | |
clarkb | dmsimard: ya I mean if the fix works I think we should land it. Mostly concerned that ansible makes this difficult | 17:53 |
clarkb | it's intentionally hidden in a test role to start, too, since I'm not super confident in it | 17:53 |
corvus | clarkb: i vote try cache first then scale; i reviewed accordingly | 17:56 |
corvus | clarkb: that okay, or do you want to get them going in parallel? | 17:56 |
clarkb | corvus: I think doing them one after another to better see impact is a good idea | 17:56 |
corvus | kk | 17:57 |
openstackgerrit | Merged opendev/system-config master: zuul-web: rework caching https://review.opendev.org/751645 | 18:32 |
openstackgerrit | Merged openstack/project-config master: Revert "Pin setuptools<50 in our image venvs" https://review.opendev.org/749777 | 18:37 |
fungi | what's the next step toward deleting nb04? just taking it out of system-config? are there any remaining blockers to that? has that change been proposed already? | 18:40 |
clarkb | fungi: we need to ensure none of its images are still alive in clouds | 18:40 |
fungi | ahh, right, particularly bfv clouds like vexxhost i guess | 18:40 |
fungi | i'll take a look shortly | 18:41 |
clarkb | there are two remaining opensuse-tumbleweed-0000240092 and opensuse-tumbleweed-0000240093 | 18:41 |
clarkb | those are the only two tumbleeed images we have | 18:41 |
fungi | stale ready nodes maybe | 18:42 |
clarkb | we probably aren't building new tumbleweed images otherwise nb01 and nb02 would have at least one | 18:42 |
fungi | #status log provider maintenance 2020-09-30 01:00-05:00 utc involving ~5-minute outages for databases used by cacti, refstack, translate, translate-dev, wiki, wiki-dev | 18:44 |
openstackstatus | fungi: finished logging | 18:44 |
clarkb | ya our tumbleweed image builds are failing | 18:46 |
fungi | ugh | 18:46 |
clarkb | conflict between grep and busybox-grep | 18:46 |
clarkb | I think we can add busybox-grep to the deinstalls list to fix it | 18:46 |
fungi | #status log deleted old 2017-01-04 snapshot of wiki.openstack.org/main01 in rax-dfw | 18:48 |
openstackstatus | fungi: finished logging | 18:48 |
fungi | #status log cinder volume for wiki.o.o has been replaced and cleaned up | 18:50 |
openstackstatus | fungi: finished logging | 18:50 |
fungi | so that only leaves the nb04 cinder volume which would be impacted by next month's maintenance | 18:51 |
fungi | and rackspace seems to have cleaned up all our old error_deleting volumes too | 18:51 |
fungi | once nb04 is fully gone, i'll update the open ticket for the cinder maintenance and let them know we've replaced/deleted all the volumes they mentioned | 18:54 |
openstackgerrit | Clark Boylan proposed openstack/diskimage-builder master: Install grep before busybox on suse distros https://review.opendev.org/751880 | 18:58 |
clarkb | fungi: ^ I think that is the fix, we already do similar for xz in dib | 18:58 |
clarkb | (why xz doesn't just supersede busybox-xz and grep supersede busybox-grep I don't know) | 18:59 |
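Conceptually the fix is small (a sketch; the real change is in dib's suse support, and the file and mechanism here are assumptions): get the full packages in before anything pulls in their busybox- counterparts, the same trick already used for xz.

```yaml
# hypothetical package-installs.yaml fragment for the suse element: listing
# the full packages up front keeps zypper from choosing the conflicting
# busybox- flavours later in the build
grep:
xz:
```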
fungi | interesting. yeah, debian doesn't even allow that. packages have to declare replaces or breaks if they have conflicting files, otherwise they don't make it into the distro | 19:01 |
clarkb | zypper gives you the option of breaking rsync by keeping busybox-grep, replacing busybox-grep with grep or doing nothing | 19:02 |
clarkb | so it has some of the info but doesn't just default to doing the sane thing | 19:02 |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Test handling unbound restart on fedora 31 https://review.opendev.org/751872 | 19:17 |
clarkb | linter didn't like that I used systemctl in command instead of the service module | 19:18 |
* fungi sighs | 19:18 | |
clarkb | it would be fine if the service module worked :) | 19:19 |
donnyd | So the transition to ceph and nvme for object storage at OE is complete and I think we would probably be ok to put it back in the rotation for logs | 19:34 |
donnyd | Not sure what needs to be tested before putting it back into prod | 19:36 |
clarkb | donnyd: nice. We'll want to update the secret at https://opendev.org/opendev/base-jobs/src/branch/master/zuul.d/secrets.yaml#L188-L229 as well as update the list of clouds used at https://opendev.org/opendev/base-jobs/src/branch/master/playbooks/base-test/post-logs.yaml#L23-L28 | 19:38 |
clarkb | then when we're happy with base-test behavior we can make the same playbook change to base/post-logs.yaml | 19:38 |
clarkb | donnyd: should we go ahead and add that when we get time or do you have more to do? | 19:38 |
clarkb | (figure infra-root should do it since we have to encrypt the secret) | 19:38 |
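Roughly what the playbook side of that looks like (a sketch; the actual variable and cloud names in base-jobs may differ): the post-logs playbook picks one configured swift target per build, so re-adding a provider means re-encrypting its credentials into the secret and appending it to the choice list.

```yaml
# hypothetical fragment of a post-logs playbook: choose a log target,
# seeded by the build UUID so retries of the same build pick the same shard
- name: Pick a cloud to upload logs to
  set_fact:
    _log_cloud: "{{ ['rax_dfw', 'ovh_bhs', 'openedge_useast'] | random(seed=zuul.build) }}"
```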
donnyd | I think we are good to go... but I will find out for sure when the workload comes | 19:39 |
donnyd | I have tested it and it seems to work, as much as one person can test | 19:39 |
*** Topner has quit IRC | 19:39 | |
fungi | clarkb: i expect we can just revert the removal unless we have reason to believe the credentials changed? | 19:42 |
fungi | well, revert but apply it to base-test i mean | 19:42 |
fungi | but not need to reencrypt | 19:42 |
clarkb | fungi: well the cloud name changed at least. Did we also change credentials or did they stay the same? | 19:42 |
fungi | oh, if it was pre-oe then yeah the creds are likely entirely different | 19:43 |
donnyd | yea, the creds will likely need to be redone | 19:43 |
clarkb | yes the current secret is labeled cloud_fn_one | 19:44 |
fungi | got it | 19:44 |
clarkb | I can work on a change in a bit | 19:44 |
openstackgerrit | Pierre Riteau proposed ttygroup/boartty master: Update author and home page to match gertty https://review.opendev.org/751886 | 19:46 |
openstackgerrit | Pierre Riteau proposed ttygroup/gertty master: Update author email address https://review.opendev.org/751887 | 19:47 |
*** slaweq_ has joined #opendev | 19:53 | |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Use OpenEdge swift to host job logs https://review.opendev.org/751889 | 20:01 |
clarkb | donnyd: infra-root ^ fyi | 20:01 |
donnyd | LGTM | 20:05 |
openstackgerrit | Merged opendev/storyboard master: Optimise the Story browsing query https://review.opendev.org/742046 | 20:16 |
donnyd | If i wanted to help write some poorly written ansible to help the infra team's app deployments, where could I start? Is the deployment code in each app, or in a central repo? | 20:37 |
clarkb | donnyd: most of our config management is in https://opendev.org/opendev/system-config | 20:38 |
clarkb | donnyd: that contains our inventory and groups definitions as well as most of the playbooks and roles we use | 20:38 |
corvus | donnyd: we aim for well tested so poorly written won't bother us :) | 20:39 |
clarkb | donnyd: we're using more and more docker containers as well (driven by ansible and docker-compose) and the Dockerfiles for those tend to be in the application repos (like zuul or nodepool) unless we need to do a forked docker image for some reason | 20:39 |
donnyd | corvus: we will see about that.. I write some pretty bad stuff | 20:39 |
clarkb | fungi: good idea on testing the OE swift thing | 20:40 |
clarkb | let me do a new ps | 20:40 |
donnyd | So the dockerfile and compose will be in the app repo and system-config is the tooling to deploy it | 20:40 |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Use OpenEdge swift to host job logs https://review.opendev.org/751889 | 20:40 |
fungi | donnyd: one which is halfway there is our storyboard deployment... we're publishing docker images to dockerhub for the various storyboard services but not using them yet, we're still deploying storyboard with the storyboard-puppet module at the moment | 20:41 |
clarkb | donnyd: the dockerfile will be in the app repo but then the docker-compose and ansible to deploy it are in system-config | 20:41 |
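That pattern, sketched (directory and task names are illustrative, not the actual gitea role):

```yaml
# system-config side: template a compose file onto the host and start it;
# the image itself is built from the Dockerfile in the app repo
- name: Create compose directory
  file:
    path: /etc/myservice-compose
    state: directory

- name: Install docker-compose file
  template:
    src: docker-compose.yaml.j2
    dest: /etc/myservice-compose/docker-compose.yaml

- name: Run docker-compose up
  shell:
    cmd: docker-compose up -d
    chdir: /etc/myservice-compose
```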
donnyd | so where do the containers get deployed? not that it matters.. just curious | 20:41 |
fungi | donnyd: the service roles | 20:42 |
clarkb | system-config/playbooks/roles/gitea may be a good example | 20:42 |
clarkb | though we have the dockerfile for gitea in system-config/docker/gitea because we've forked it to add our own main page and branding stuff | 20:42 |
fungi | er, service playbooks | 20:42 |
fungi | which then use service-specific roles | 20:43 |
donnyd | I was just looking at that one clarkb | 20:43 |
donnyd | I was reading the thread on storyboard and it made me think that maybe I could actually make a useful contribution | 20:45 |
donnyd | the etherpad one also looks like a decent example | 20:47 |
diablo_rojo_phon | You definitely could and we'd love to have whatever help you'd like to offer :) | 20:47 |
clarkb | donnyd: ya etherpad and gitea should be pretty similar to how we'd do storyboard except we'd put the Dockerfile in storyboard itself I bet | 20:47 |
fungi | storyboard or anything else you want to help with, help is most welcome | 20:49 |
fungi | it's also worth noting that switching from puppet to ansible (+docker where relevant) is a blocker for us updating our deployment platforms too. the version of puppet we're stuck on works on xenial but not bionic, so to upgrade past xenial we need to replace the old puppet orchestration and config management | 20:51 |
donnyd | how many things still need to be migrated? lots?? | 20:51 |
clarkb | it's a fair bit, though I think many of them should be of the more direct variety now | 20:53 |
clarkb | the more difficult ones like gerrit and zuul have been done (though next up is working out the gerrit upgrade, which I'm slowly making progress on) | 20:53 |
fungi | yeah, the stuff which remains doesn't really have interdependencies | 20:55 |
fungi | so a lot more manageable as a task on its own | 20:55 |
fungi | i think ianw has graphite in progress already, but i'm not aware of any others which are in progress | 21:08 |
openstackgerrit | Merged opendev/base-jobs master: Use OpenEdge swift to host job logs https://review.opendev.org/751889 | 21:14 |
*** diablo_rojo has joined #opendev | 21:15 | |
clarkb | I've rechecked https://review.opendev.org/#/c/680178/5 which should test ^ | 21:16 |
clarkb | the commit message is no longer accurate but it's still using base-test | 21:18 |
clarkb | donnyd: https://api.us-east.open-edge.io:8080/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_225/680178/5/check/tox-py27/22553d1/ it seems to work | 21:25 |
clarkb | donnyd: it's a little weird to see https on port 8080 but nothing actually wrong with that. Do you want to check anything before I propose a change to add it into the production rotation? | 21:25 |
donnyd | Yea there are containers populating in the project | 21:25 |
donnyd | yea, it is probably a bit strange | 21:26 |
donnyd | LOL | 21:26 |
donnyd | I usually proxy to the 13XXX range | 21:26 |
donnyd | but eh... it works | 21:26 |
donnyd | I think we are good to hook | 21:26 |
clarkb | ya looks functional to me /me makes another change | 21:26 |
donnyd | I am hopeful that the nvme object storage will work well this time around | 21:27 |
donnyd | we will see when it's time for logs to expire | 21:27 |
donnyd | https://usercontent.irccloud-cdn.com/file/ke89YaM4/image.png | 21:28 |
donnyd | LGTM | 21:28 |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Use openedge swift for logs on all jobs https://review.opendev.org/751905 | 21:28 |
donnyd | so long as everyone can reach them, we should be good to hook | 21:28 |
clarkb | I'm ipv4 only at home. Maybe fungi wants to hit it from ipv6 before +2'ing ^ | 21:29 |
clarkb | or if there is no ipv6 then that's fine too :) I didn't check dns | 21:29 |
donnyd | hrm, there should be | 21:29 |
donnyd | I do have a record | 21:30 |
clarkb | ya there is a AAAA record so having someone like fungi confirm ipv6 access works would be good. Otherwise I think we can land it | 21:31 |
donnyd | https://usercontent.irccloud-cdn.com/file/Rbe9GTpf/image.png | 21:34 |
donnyd | looks like it's open outside of my network as best I can test.. that whole being local thing has bitten me before though... so probably best to wait for fungi | 21:35 |
*** slaweq_ has quit IRC | 21:38 | |
*** slaweq_ has joined #opendev | 21:40 | |
fungi | yup, sorry, food distractions here. what url am i testing ipv6 connectivity to? | 21:50 |
clarkb | fungi: https://api.us-east.open-edge.io:8080/swift/v1/AUTH_e02c11e4e2c24efc98022353c88ab506/zuul_opendev_logs_225/680178/5/check/tox-py27/22553d1/ | 21:51 |
clarkb | fungi: that was generated by rechecking https://review.opendev.org/#/c/680178/5 which used the base-test update to have OE swift hosted logs | 21:51 |
clarkb | if that looks good to you (it does to me via ipv4) then https://review.opendev.org/751905 should be safe to land | 21:52 |
fungi | yeah, i have no trouble accessing that over ipv6 | 21:52 |
fungi | approving | 21:52 |
*** slaweq_ has quit IRC | 21:53 | |
*** slaweq has joined #opendev | 21:54 | |
clarkb | unrelated: I'm about to send out the meeting agenda, Get your items in now :) | 21:54 |
openstackgerrit | Merged opendev/base-jobs master: Use openedge swift for logs on all jobs https://review.opendev.org/751905 | 21:58 |
*** slaweq has quit IRC | 22:05 | |
*** slaweq has joined #opendev | 22:09 | |
*** slaweq has quit IRC | 22:21 | |
ianw | are we good with the zuul-web proxy bits? | 22:27 |
clarkb | ianw: ya I think its working fine | 22:27 |
clarkb | or at least it hasn't regressed. I haven't tried to characterize the cache hit rate or anything like that | 22:27 |
ianw | [2020-09-14 22:29:15.103] "GET /api/tenant/openstack/status/change/706153,10 HTTP/1.1" 200 2951 "cache miss: attempting entity save" "https://review.opendev.org/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36" | 22:29 |
ianw | hrm | 22:30 |
ianw | [2020-09-14 22:29:14.308] "GET /api/status HTTP/1.1" 200 94041 "-" "https://zuul.openstack.org/status" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0" | 22:30 |
ianw | i feel like the first we didn't expect cached, and the second we did | 22:31 |
clarkb | I think we expected both? it will cache the urls we specify and any below | 22:31 |
clarkb | so maybe having the changes below that is unexpected but not necessarily wrong aiui | 22:31 |
clarkb | and ya I would've expected the second to be cached or missed | 22:31 |
ianw | i think i've copied it wrong in the openstack.org config | 22:32 |
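For reference, the sort of stanza involved (a sketch; the actual vhost config lives in system-config and the fix below is the real correction): zuul.openstack.org is a whitelabeled tenant, so its status endpoint is /api/status rather than /api/tenant/openstack/status, and the cache match has to cover that path too.

```
# tenant-scoped API on zuul.opendev.org:
CacheEnable disk "/api/tenant"
# the whitelabeled zuul.openstack.org serves the same data here:
CacheEnable disk "/api/status"
```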
*** tosky has quit IRC | 22:33 | |
clarkb | ianw: two other things came up that may interest you. The first is that systemd is broken for ansible on f31. https://review.opendev.org/#/c/751872/ attempts to work around that and has links to bugs. The other is that our tumbleweed image was built on nb04 and we haven't had a successful nb01 or 02 build, which is preventing us from deleting nb04. https://review.opendev.org/#/c/751880/ will fix that I think | 22:34 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: zuul-web: fix zuul.openstack.org location match https://review.opendev.org/751917 | 22:35 |
ianw | yeah i saw that on f31. given f33 comes in october, maybe we just get rid of it; we have f32 now | 22:35 |
ianw | although i think there was something to do with swap images i saw come by for that ... | 22:36 |
clarkb | the swap change in ozj merged iirc | 22:36 |
clarkb | basically use dd instead of fallocate because ext4 in new kernels breaks swapon on fallocate | 22:36 |
ianw | yeah, that one | 22:37 |
clarkb | we changed it universally since the issue is expected to hit everywhere soon enough | 22:37 |
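The configure-swap change amounts to something like this (a sketch; the size, path, and task split are illustrative):

```yaml
# fallocate-backed files on newer ext4 kernels fail swapon, so write real
# blocks with dd instead
- name: Create swapfile with dd
  command: dd if=/dev/zero of=/swapfile bs=1M count=1024
  args:
    creates: /swapfile
  become: true

- name: Restrict swapfile permissions
  file:
    path: /swapfile
    mode: "0600"
  become: true

- name: Format and enable swap
  shell: mkswap /swapfile && swapon /swapfile
  become: true
```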
ianw | i'll go through the dib queue and i guess we want a release | 22:41 |
clarkb | ya, it's also possible there is a better way to handle that in the zypper context | 22:43 |
clarkb | but we already do that workaround for xz and busybox-xz so went with it | 22:43 |
ianw | fungi / clarkb: maybe one more eye on https://review.opendev.org/#/c/747810/ for copying keys to apt dir would be nice | 22:43 |
clarkb | I can review that one in a few. Getting the infra meeting agenda out now | 22:44 |
clarkb | ianw: prometheanfire fungi I thought that doesn't work for xenial and I forget the corresponding debian release | 22:47 |
clarkb | I guess the issue is that it never worked in the first place? so if we fix it we can just fix it for newer distros? | 22:47 |
fungi | that worked for using binary pgp keyring files, just not ascii-armored keys | 22:49 |
clarkb | gotcha | 22:49 |
clarkb | so this will work for older releases too with the proper input data | 22:49 |
ianw | yeah, that's what i thought. we could possibly expand the release note to be clearer on that i guess if you want | 22:50 |
clarkb | I think chances that anyone was using this are slim, since prometheanfire found it didn't work at all due to gpg being missing | 22:50 |
clarkb | so should be an improvement going forward. Probably fine as is | 22:51 |
*** tkajinam has joined #opendev | 22:52 | |
clarkb | ianw: hrm I've just noticed that https://zuul.opendev.org/t/openstack/build/80798144a8b749cd846356c561d4641f failed on the suse fix | 22:52 |
clarkb | I wonder if we should figure that out too | 22:53 |
ianw | https://zuul.opendev.org/t/openstack/build/80798144a8b749cd846356c561d4641f/log/nodepool/builds/test-image-0000000001.log#3219 | 22:53 |
ianw | e2fsprogs-1.45.6-1.19.x86_64 requires info, but this requirement cannot be provided | 22:53 |
clarkb | it's the same basic problem with busybox-gzip I think | 22:54 |
clarkb | I wonder if we can do an install without busybox | 22:54 |
clarkb | since it seems to be problematic here | 22:55 |
prometheanfire | ohhai | 22:55 |
clarkb | busybox, busybox-gzip, and busybox-static are the 3 busybox things we install in that log | 22:55 |
clarkb | maybe if we do gzip instead of busybox-gzip (like with xz and grep) that will be sufficient | 22:56 |
johnsom | FYI we seem to be seeing that strange CDN/cache issue again. The recently released oslo.log 4.4.0 is returning not found on some jobs. | 22:56 |
prometheanfire | ah, cool, +2+W | 22:56 |
clarkb | patterns-openSUSE-base is likely what pulls in the busybox stuff | 22:56 |
ianw | johnsom: hrm, do you have a link? is ymq involved again? | 23:00 |
johnsom | ianw https://57cbdeeafe6bab618f2f-00780db440ef90d2fe18db9118d58aa1.ssl.cf1.rackcdn.com/751918/1/check/openstack-tox-pep8/f82c8b5/job-output.txt | 23:01 |
openstackgerrit | Clark Boylan proposed openstack/diskimage-builder master: Install gzip instead of busybox-gzip on suse https://review.opendev.org/751919 | 23:01 |
johnsom | Some jobs pass, some don't | 23:01 |
clarkb | ianw: ^ that should test it at least | 23:01 |
clarkb | ianw: looks like ovh gra1 | 23:02 |
ianw | johnsom: ovh-gra1 | 23:02 |
johnsom | yep | 23:02 |
johnsom | ovh-bhs1 are passing and finding it fine | 23:03 |
ianw | https://pypi.org/simple/oslo-log/ will be the thing to look at | 23:06 |
clarkb | johnsom: they are on different continents :) | 23:07 |
ianw | < x-served-by: cache-bwi5141-BWI, cache-cdg20739-CDG | 23:07 |
ianw | < x-cache: HIT, HIT | 23:07 |
ianw | < x-cache-hits: 2, 1 | 23:07 |
ianw | and that seems to show it | 23:07 |
clarkb | http://mirror.gra1.ovh.opendev.org/pypi/simple/oslo-log/ it is there now on the gra1 proxy | 23:07 |
johnsom | I guess a number of the oslo libs are triggering it. I have only seen oslo.log but others are reporting other modules | 23:08 |
clarkb | unfortunately we end up serving whatever pypi gives us and that has a short TTL | 23:08 |
clarkb | often by the time we notice things have rolled over and are happy | 23:08 |
johnsom | inap-mtl01 is also good | 23:09 |
clarkb | https://github.com/pypa/warehouse/issues seems to be where they want pypi.org feedback and issues | 23:10 |
clarkb | I wonder how terrible it would be to file an issue with a captured index file | 23:11 |
ianw | if it had x- headers that would probably be good | 23:12 |
clarkb | ya, now the trouble is catching one :/ | 23:12 |
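One way to catch a bad response with the fastly headers intact (a sketch):

```
# save the index body and print the response's x- headers from the same request
curl -sv https://pypi.org/simple/oslo-log/ -o /tmp/oslo-log-index.html 2>&1 | grep -i '^< x-'
```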
ianw | i got pretty far with it last time, but fastly had status issue up for slow purging or something, so it was put down to that | 23:12 |
clarkb | I think the key thing from the job logs is that it sees 4.3.0 as a valid version which implies this isn't a python version thing since 4.4.0 and 4.3.0 both have the same python version requirements in the source index html. It also implies we got an index.html and not an empty response | 23:14 |
ianw | http://kafka.dcpython.org/day/pypa-dev/2020-08-24#23.31.28.ianw | 23:14 |
clarkb | the source html also has a serial on it | 23:14 |
ianw | http://kafka.dcpython.org/day/pypa-dev/2020-08-25#00.11.33.PSFSlack | 23:15 |
clarkb | ianw: perhaps https://status.fastly.com/incidents/x57ghk0zvq58 was the incident this time around | 23:16 |
clarkb | or maybe they never properly purged back then and we hit the bad servers | 23:16 |
clarkb | fwiw I think it is likely that fastly is at fault | 23:16 |
*** mlavalle has quit IRC | 23:20 | |
ianw | fungi: you could double check the /api/status cache match for zuul.openstack.org in https://review.opendev.org/#/c/751917/ and i can monitor it when it deploys | 23:20 |
clarkb | ianw: your IRC logs are from about when oslo.log 4.4.0 was released too | 23:20 |
clarkb | ianw: I suppose it could just be fallout from that original incident and fastly/pypi never did a proper rsync | 23:20 |
ianw | ... ohhh, i had just assumed oslo.log 4.4.0 had released like an hour ago :) | 23:21 |
ianw | that perhaps makes it more interesting to pypa/fastly ... | 23:21 |
fungi | ianw: lgtm, i guess we got rid of all the conditional matches on whether different cache modules were loaded? | 23:22 |
clarkb | fungi: ya that was the prior change https://review.opendev.org/751645 | 23:22 |
fungi | thanks, today has been a bit hectic | 23:23 |
fungi | i saw the title of that change but hadn't taken time to look through it | 23:23 |
ianw | yeah, it took me quite a while to realise that the cache_mem module was no more ... | 23:23 |
fungi | aha! what we were seeing makes MUCH more sense now | 23:23 |
fungi | thanks for figuring that out | 23:24 |
ianw | i think the <ifdef> stuff is a bit of an anti-pattern; it's better for apache to just stop if you don't have the modules you want | 23:24 |
fungi | it made some sense back when this was in presumed portable puppet modules, but no longer | 23:25 |
ianw | i did also read that mod_rewrite with [p] is not considered as good as proxypass | 23:27 |
fungi | did we change how puppet apply's stdout gets logged? is it no longer going to syslog? | 23:28 |
clarkb | ianw: ya my related change converts it all to proxypass | 23:28 |
fungi | i'm trying and failing to work out why we've stopped deploying new storyboard commits on storyboard.o.o | 23:28 |
clarkb | ianw: but I think I may rewrite that one to skip the extra zuul-web servers if we don't need them, keeping just the caching and the zuul-web bugfix | 23:28 |
fungi | we landed a new storyboard commit at 20:16 utc and it seems to have gotten checked out on the server at /opt/storyboard but not pip installed, according to `pbr freeze` there we're several commits behind | 23:29 |
fungi | but i can't figure out where puppet's attempt to log that would be. we used to get puppet-user entries logged in /var/log/syslog | 23:30 |
clarkb | fungi: I think it ends up in the ansible logs now | 23:31 |
clarkb | on bridge | 23:31 |
clarkb | I don't remember why it changed though | 23:31 |
fungi | i didn't think the puppet output ended up there, i guess that's the behavior change | 23:31 |
clarkb | I think it was so that we could have the logs show up in zuul | 23:31 |
fungi | i grepped /var/log/ansible/remote_puppet_else.yaml.log for "pip" but didn't find anything | 23:31 |
fungi | that log is huge | 23:31 |
clarkb | we switched from dumping into syslog to stdout | 23:32 |
clarkb | and ansible grabs the stdout | 23:32 |
fungi | i guess i need to figure out what the name of the task would have been to run puppet on there | 23:32 |
ianw | there was some wip to split out the puppet jobs from one big puppet_else into more separate things i think? | 23:32 |
clarkb | fungi: it may be in an older log file too depending on when the triggering change landed | 23:32 |
clarkb | ianw: that hasn't been done for storyboard yet I don't think | 23:33 |
fungi | oh! we rotate these very aggressively | 23:33 |
fungi | yep | 23:33 |
clarkb | fwiw I could go back to using syslog since we aren't really using zuul for those logs | 23:36 |
fungi | there's a massive gap between the logfiles too | 23:37 |
fungi | the newest rotated logfile ends at 14:16:14 but the first activity in the current log is 22:49:11 | 23:38 |
fungi | where did the other 8.5 hours go? | 23:38 |
fungi | unfortunately our princess was in that castle, mario | 23:39 |
clarkb | fungi: its rotated by the job when they complete | 23:39 |
clarkb | I wonder if we're timing out and breaking that setup | 23:39 |
fungi | or maybe this was not the periodic job | 23:39 |
fungi | i'll start from the zuul end and work backwards | 23:39 |
clarkb | job runs and logs to service-foo.yaml.log. At the end of the job we copy that to service-foo.yaml.log.timestamp. Next job runs and does it again | 23:40 |
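That rotation, sketched as the kind of task involved (the filenames and timestamp fact are guesses at the convention, not the actual system-config code):

```yaml
# at the end of each run, archive the just-written log under a timestamped name
- name: Archive this run's playbook log
  copy:
    remote_src: true
    src: /var/log/ansible/remote_puppet_else.yaml.log
    dest: "/var/log/ansible/remote_puppet_else.yaml.log.{{ ansible_date_time.iso8601_basic_short }}"
```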
fungi | in this case there's no separate service log because it's handled by the puppet catch-all | 23:41 |
clarkb | ya | 23:42 |
clarkb | but it's the same system (it's per playbook) | 23:42 |
fungi | yeah, the hourly runs for puppet are timing out | 23:43 |
fungi | i'll have to look at this with fresh eyes tomorrow, i'm quickly getting fuzzy here | 23:44 |
ianw | 104.130.246.111 | 23:46 |
ianw | elasticsearch06.openstack.org. | 23:46 |
ianw | i think that's the puppet holder-upper-er | 23:46 |
ianw | it's dead, jim | 23:47 |
clarkb | I wonder if we should proactively reboot those | 23:48 |
clarkb | reboot first one, wait for status to settle, reboot second, wait for settling, etc | 23:48 |
ianw | have we done them all yet? something has clearly happened to them :/ | 23:48 |
ianw | #status log rebooted elasticsearch06.openstack.org, which was hung | 23:48 |
openstackstatus | ianw: finished logging | 23:48 |
clarkb | I've done I think two of them | 23:49 |
*** DSpider has quit IRC | 23:52 | |
ianw | i've killed all the stuck processes | 23:52 |
clarkb | we might also consider splitting them out of puppet else | 23:52 |
clarkb | then if they fail the impact is lessened | 23:52 |
*** Goneri has quit IRC | 23:53 | |
ianw | yeah, i think we should probably continue that work to split up all of puppet-else | 23:56 |
clarkb | the mechanics of it are pretty straightforward iirc. We create a new .pp file for that service/hosts. We then add a job to run the puppet for that manifest and basically run it when else runs | 23:57 |
clarkb | there are a couple examples we can look at to compare | 23:57 |
clarkb | this would make a good meeting agenda item. I'll ninja add it tomorrow | 23:57 |
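Sketched with hypothetical names (the real examples are in system-config): the split gives each service its own playbook over just its hosts, so a hung node only stalls that service's run instead of all of puppet-else.

```yaml
# hypothetical service_storyboard.yaml: runs puppet for one manifest only,
# instead of the catch-all remote_puppet_else.yaml
- hosts: storyboard
  roles:
    - role: puppet   # assumes the existing puppet-apply role name
```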
openstackgerrit | Merged opendev/system-config master: zuul-web: fix zuul.openstack.org location match https://review.opendev.org/751917 | 23:59 |