tonyb | corvus: I noticed that https://zuul.opendev.org/t/openstack/status isn't working as expected. Nothing I type in the search/filter box matches items I can see in the queue. | 01:28 |
---|---|---|
fungi | tonyb: tried to force-refresh your browser? | 02:08 |
fungi | though i suppose that was mentioned in relation to stale css | 02:09 |
tonyb | fungi: Yup. I'll try firefox for a fresh look | 02:10 |
tonyb | Same thing with FF, I notice that | 05:03 |
tonyb | I suspect it's a 'react' issue; not sure how to debug it | 05:03 |
*** ralonsoh_ is now known as ralonsoh | 08:08 | |
bbezak | I can confirm the same issue on safari/chrome/firefox | 11:26 |
opendevreview | Merged openstack/project-config master: Implement kolla/kayobe-unmaintained-core groups https://review.opendev.org/c/openstack/project-config/+/908321 | 14:34 |
corvus | tonyb: remote: https://review.opendev.org/c/zuul/zuul/+/908797 Fix filtering on status page [NEW] <-- should be fixed there | 14:58 |
* clarkb has tea and will dig into reviews for those fixups momentarily | 16:13 | |
*** dhill is now known as Guest2523 | 16:34 | |
clarkb | jgit development moved from Gerrit's Gerrit to GerritForge. I've asked on discord whether the copy that gerrit hosts is an up-to-date mirror, as we fetch jgit from there for our builds | 16:41 |
clarkb | I think it may be because gerrit uses it as a submodule and having gerrit host submodules simplifies some things. But I definitely don't want us to fetch the wrong jgit content | 16:42 |
dpanech | Hi this review is stuck, Zuul jobs aren't being triggered, could someone have a look? | 16:42 |
dpanech | https://review.opendev.org/c/starlingx/ha/+/906003 | 16:42 |
corvus | looking | 16:43 |
corvus | dpanech: it appears to be based on 906002 which is abandoned | 16:45 |
clarkb | due to the Depends-On in the commit message | 16:46 |
dpanech | Oh duh. Thanks | 16:46 |
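For context, Zuul discovers cross-change dependencies from Depends-On footers in commit messages, and a change whose dependency has been abandoned will never be enqueued. A minimal sketch of such a footer follows; the subject line, the project path in the URL, and the Change-Id are placeholders for illustration, only the Depends-On footer syntax itself is Zuul's:

    Fix service restart handling

    Longer description of the change goes here.

    Depends-On: https://review.opendev.org/c/some/project/+/906002
    Change-Id: I0123456789abcdef0123456789abcdef01234567

Removing or updating that footer and pushing a new patchset is the usual way to let Zuul enqueue the change again in a situation like the one above.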
jrosser | i'm having trouble with the zuul status screen search box, is that still working? | 16:47 |
jrosser | i.e to narrow down to one project or review number | 16:48 |
corvus | jrosser: fix is in progress | 16:48 |
jrosser | ahha ok :) | 16:48 |
clarkb | fungi: doesn't look like you updated https://review.opendev.org/c/opendev/system-config/+/908328 with your test results. Can you review that change? and maybe we should land it? | 19:24 |
clarkb | or we can wait until after the preptg if we're worried about impacting that | 19:24 |
fungi | oh, indeed thanks, i forgot to do so | 19:25 |
clarkb | thank you! | 19:27 |
fungi | infra-root: for those of you using keycloak to authenticate for zuul's admin webui, were you relying on any social auth logins or just username/password? i'm to the point with keycloak03 where we can start adding accounts but i can add specific social auth identity providers first if needed | 19:39 |
clarkb | I don't think any social providers were added yet but I may be wrong | 19:40 |
tonyb | fungi: I was using username/password. I don't recall any social buttons | 20:33 |
corvus | i think i experimentally added something for openstackid, just for exploration, but i haven't used it for realz. | 20:33 |
fungi | tonyb: corvus: thanks for confirming. infra-root: in that case it's open season for adding our accounts to the zuul realm on keycloak03 i suppose. in the short term, until we merge 908357 to change the cname, you'll have to locally override address resolution for keycloak to that of keycloak03 | 20:40 |
clarkb | because keycloak does strict hostname validation on request headers, ya? | 20:41 |
fungi | yes, and there's a redirect too | 20:41 |
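The local override fungi mentions can be as simple as a hosts-file entry pointing the keycloak service name at keycloak03 until the CNAME change (908357) lands. A rough sketch, assuming the hostnames keycloak.opendev.org and keycloak03.opendev.org and using a documentation placeholder address:

    # look up keycloak03's current address first
    host keycloak03.opendev.org

    # pin the service name to that address locally until the CNAME moves
    # (203.0.113.10 is a placeholder, not the real IP)
    echo "203.0.113.10 keycloak.opendev.org" | sudo tee -a /etc/hosts

The entry should be removed once the CNAME is updated; keycloak's strict hostname checks and redirect are exactly why the temporary pin is needed.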
fungi | i should probably log into the current production server and see if it was set up for things like e-mail notifications so i can recreate those settings | 20:47 |
corvus | fungi: i'll probably just wait until the cname changes, thanks! and i don't think i set up anything like that. was basically just the tutorial procedure. | 20:47 |
clarkb | possibly related: firefox really doesn't like the national forest service website because it apparently tries to serve random things as http? | 20:48 |
fungi | got it. we'll almost certainly want to set more of these options before we add it to other services, but for now it's something we can get by with for a handful of service admins | 20:48 |
clarkb | all this extra browser validation to protect users is annoying when it impacts perfectly innocent things like looking up fire closure maps | 20:49 |
corvus | clarkb: it's for your safety! | 20:57 |
fungi | now that 908797 has merged and promoted, should we do a restart of zuul-web containers, or a larger zuul-wide restart? | 21:13 |
fungi | (well, image pull and container restart i mean) | 21:14 |
clarkb | fungi: I think we only need the webs to restart in this case | 21:14 |
corvus | i will restart the web containers | 21:15 |
fungi | thanks corvus! | 21:15 |
fungi | i know sometimes we've also restarted zuul-scheduler containers for good measure, when it was generally only a web-affecting change, so i wasn't sure | 21:16 |
corvus | since this is strictly js, it falls well below my paranoia threshold for that | 21:17 |
fungi | makes sense | 21:17 |
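For reference, the pull-and-restart corvus performs amounts to something like the following on the zuul-web host; the compose directory and service name here are assumptions, not necessarily OpenDev's actual layout:

    cd /etc/zuul-web                 # assumed docker-compose directory
    docker-compose pull web          # fetch the newly promoted zuul-web image
    docker-compose up -d web         # recreate only the zuul-web container

Because the fix is purely javascript served by zuul-web, none of the scheduler, merger, or executor containers need to be touched.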
clarkb | please add content to the infra meeting agenda if you've got it and update the preptg etherpad with your interests and topics too | 21:18 |
corvus | #status log restarted zuul-web to pick up webui fixes | 21:28 |
opendevstatus | corvus: finished logging | 21:28 |
fungi | confirmed the filtering seems to be working for me now | 21:29 |
corvus | those should be in effect now | 21:29 |
fungi | tonyb: bbezak: jrosser: ^ please check again | 21:29 |
jrosser | looks like it works for me searching by project or review number now, thanks for the fix | 21:33 |
clarkb | luca reports that the jgit in gerrit.googlesource should be an up-to-date mirror. It's just not where dev happens anymore. This means we don't need to update our image builds | 21:37 |
clarkb | I feel like the meeting agenda is so empty possibly because we're keeping the bigger agenda stuff for the preptg | 22:19 |
fungi | i saw a deploy failure for infra-prod-service-tracing on friday. looks like it happens occasionally due to reaching a 2-minute timeout waiting for port 16686 to be listening on the loopback address | 22:37 |
fungi | should we increase that? or is it possibly a sign of a deeper problem with jaeger | 22:37 |
clarkb | I suspect that when we update the image the container startup may not always be fast and we should increase the timeout | 22:38 |
fungi | that was my first instinct, but wanted to double-check | 22:38 |
corvus | could be, but i'd be surprised if jaeger takes a long time to start; i feel like that warrants some log reading | 22:38 |
corvus | (maybe it's related to our local data storage size though?) | 22:39 |
fungi | `grep 'Timeout when waiting' /var/log/ansible/service-tracing.yaml.log.*` turns up "recent" occurrences at 2024-01-18T04:44:01 and 2024-02-09T21:05:26 | 22:39 |
fungi | looks like we don't set up any additional logging for that container | 22:40 |
clarkb | ya I suspect it only happens when we update the image version. The latest version is from today (so will run in a few hours) and the one before that is from 6 days ago. The weird thing about those timestamps is that they don't seem to align with a periodic job run. Maybe it was tripped by the keycloak inventory updates | 22:41 |
corvus | indeed looks like last startup took 131 seconds | 22:41 |
corvus | from: "level":"info","ts":1707512780.2357833,"caller":"flags/service.go:119","msg":"Mounting metrics handler on admin server","route":"/metrics" | 22:42 |
corvus | to: "level":"info","ts":1707512911.3811984,"caller":"app/server.go:284","msg":"Starting HTTP server","port":16686,"addr":":16686" | 22:42 |
fungi | aha, yep, i found the same looking at docker-compose logs | 22:43 |
corvus | and it does seem like most of the slow bits are badger-related, like compacting write-ahead logs, etc. | 22:43 |
corvus | so yeah, i guess that's our non-optimized local storage | 22:44 |
corvus | probably bumping timeout is fine for our current level of concern with this service then :) | 22:44 |
fungi | should i up it to 180? i increased a similar timeout on the keycloak deploy to 300 just to be safe, because it does a container rebuild each time it's started | 22:44 |
corvus | i'd say bump to 300? | 22:45 |
clarkb | ++ | 22:46 |
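The timeout being bumped is the deploy-time wait for Jaeger's query port. A shell-level equivalent of that check with the new 300 second ceiling would look roughly like this (the exact mechanism the Ansible role uses is an assumption here):

    # wait up to 300s for the Jaeger query API on the loopback,
    # mirroring the deploy check that previously gave up after 120s
    timeout 300 bash -c 'until nc -z 127.0.0.1 16686; do sleep 5; done' \
      && echo "jaeger is listening on 16686" \
      || echo "jaeger did not come up within 300s"

The actual change proposed below (908867) simply raises the timeout value in system-config.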
opendevreview | Jeremy Stanley proposed opendev/system-config master: Increase Jaeger start timeout to 300 https://review.opendev.org/c/opendev/system-config/+/908867 | 22:47 |
opendevreview | Merged opendev/system-config master: Increase Jaeger start timeout to 300 https://review.opendev.org/c/opendev/system-config/+/908867 | 23:44 |
tonyb | fungi, corvus: filtering looks good to me. Thanks! | 23:57 |