*** panda has quit IRC | 00:01 | |
*** mattw4 has quit IRC | 00:03 | |
*** panda has joined #openstack-infra | 00:04 | |
*** rlandy|ruck is now known as rlandy|ruck|bbl | 00:08 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Run ansible jobs on bridge.yaml changes https://review.opendev.org/663804 | 00:08 |
openstackgerrit | Merged opendev/system-config master: Evaluate files vhosts after we determine ssl file paths https://review.opendev.org/663796 | 00:09 |
*** dpawlik has joined #openstack-infra | 00:10 | |
*** slaweq has joined #openstack-infra | 00:11 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: mirror: rename 80/443 log files https://review.opendev.org/662357 | 00:14 |
*** dpawlik has quit IRC | 00:14 | |
*** mriedem has joined #openstack-infra | 00:17 | |
*** gyee has quit IRC | 00:23 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Switch mirror Apache logs to ISO8601 https://review.opendev.org/663806 | 00:23 |
*** slaweq has quit IRC | 00:24 | |
ianw | infra-root: ^ if we can look at those, i can bring up say rax ord & iad opendev.org mirrors which we can switch in for more testing coverage | 00:24 |
*** eernst has joined #openstack-infra | 00:25 | |
*** eernst has quit IRC | 00:27 | |
*** diablo_rojo has quit IRC | 00:27 | |
*** markvoelker has joined #openstack-infra | 00:29 | |
*** eernst has joined #openstack-infra | 00:29 | |
*** mriedem has quit IRC | 00:30 | |
*** eernst has quit IRC | 00:33 | |
*** eernst has joined #openstack-infra | 00:33 | |
*** rkukura has quit IRC | 00:36 | |
*** igordc has joined #openstack-infra | 00:38 | |
*** eernst has quit IRC | 00:38 | |
*** aakarsh has quit IRC | 00:55 | |
clarkb | ianw +2'd | 00:57 |
*** markvoelker has quit IRC | 01:02 | |
*** aakarsh has joined #openstack-infra | 01:04 | |
*** slaweq has joined #openstack-infra | 01:06 | |
*** michael-beaver has quit IRC | 01:09 | |
*** jamesmcarthur has joined #openstack-infra | 01:14 | |
*** jcoufal has joined #openstack-infra | 01:16 | |
*** jamesmcarthur has quit IRC | 01:18 | |
*** slaweq has quit IRC | 01:19 | |
*** aakarsh has quit IRC | 01:25 | |
*** auristor has quit IRC | 01:41 | |
*** auristor has joined #openstack-infra | 01:43 | |
*** rosmaita has left #openstack-infra | 01:52 | |
*** apetrich has quit IRC | 01:57 | |
*** markvoelker has joined #openstack-infra | 01:59 | |
*** gregoryo has joined #openstack-infra | 02:00 | |
*** rkukura has joined #openstack-infra | 02:05 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: model: add annotateLogger procedure https://review.opendev.org/663819 | 02:07 |
*** tinwood has quit IRC | 02:10 | |
*** dpawlik has joined #openstack-infra | 02:11 | |
*** slaweq has joined #openstack-infra | 02:11 | |
*** tinwood has joined #openstack-infra | 02:11 | |
*** rlandy|ruck|bbl is now known as rlandy|ruck | 02:12 | |
*** rlandy|ruck has quit IRC | 02:15 | |
*** dpawlik has quit IRC | 02:15 | |
*** jcoufal has quit IRC | 02:16 | |
*** slaweq has quit IRC | 02:24 | |
*** rh-jelabarre has quit IRC | 02:24 | |
*** dpawlik has joined #openstack-infra | 02:27 | |
*** sean-k-mooney has quit IRC | 02:28 | |
*** jamesmcarthur has joined #openstack-infra | 02:29 | |
*** dpawlik has quit IRC | 02:32 | |
*** markvoelker has quit IRC | 02:32 | |
*** threestrands has joined #openstack-infra | 02:54 | |
*** jamesmcarthur has quit IRC | 02:58 | |
*** sean-k-mooney has joined #openstack-infra | 03:00 | |
*** whoami-rajat has joined #openstack-infra | 03:10 | |
*** slaweq has joined #openstack-infra | 03:11 | |
*** sean-k-mooney has quit IRC | 03:14 | |
*** sean-k-mooney has joined #openstack-infra | 03:16 | |
*** slaweq has quit IRC | 03:25 | |
*** igordc has quit IRC | 03:25 | |
*** aakarsh has joined #openstack-infra | 03:29 | |
*** ykarel|away has joined #openstack-infra | 04:03 | |
*** ykarel|away is now known as ykarel | 04:03 | |
*** yamamoto has quit IRC | 04:09 | |
*** threestrands has quit IRC | 04:12 | |
*** slaweq has joined #openstack-infra | 04:16 | |
*** sean-k-mooney has quit IRC | 04:19 | |
*** sean-k-mooney has joined #openstack-infra | 04:21 | |
*** slaweq has quit IRC | 04:24 | |
*** kjackal has joined #openstack-infra | 04:27 | |
*** markvoelker has joined #openstack-infra | 04:30 | |
*** udesale has joined #openstack-infra | 04:31 | |
*** hwoarang has quit IRC | 04:39 | |
*** hwoarang has joined #openstack-infra | 04:40 | |
*** dpawlik has joined #openstack-infra | 04:43 | |
*** yamamoto has joined #openstack-infra | 04:43 | |
*** dpawlik has quit IRC | 04:48 | |
*** yamamoto has quit IRC | 04:54 | |
*** raukadah is now known as chandankumar | 04:55 | |
*** markvoelker has quit IRC | 05:03 | |
*** yamamoto has joined #openstack-infra | 05:13 | |
*** slaweq has joined #openstack-infra | 05:13 | |
*** slaweq has quit IRC | 05:24 | |
*** dpawlik has joined #openstack-infra | 05:30 | |
*** slaweq has joined #openstack-infra | 05:32 | |
*** kaisers has quit IRC | 05:37 | |
*** kaisers has joined #openstack-infra | 05:40 | |
*** slaweq has quit IRC | 05:44 | |
*** dchen has quit IRC | 05:45 | |
*** markvoelker has joined #openstack-infra | 06:00 | |
*** dchen has joined #openstack-infra | 06:03 | |
*** jbadiapa has quit IRC | 06:13 | |
*** kjackal has quit IRC | 06:15 | |
*** kjackal has joined #openstack-infra | 06:15 | |
*** gregoryo has quit IRC | 06:16 | |
*** lpetrut has joined #openstack-infra | 06:19 | |
*** evgenyl has quit IRC | 06:20 | |
*** piotrowskim has joined #openstack-infra | 06:21 | |
*** hwoarang has quit IRC | 06:23 | |
*** hwoarang has joined #openstack-infra | 06:24 | |
*** evgenyl has joined #openstack-infra | 06:24 | |
*** sparkycollier has quit IRC | 06:25 | |
*** sparkycollier has joined #openstack-infra | 06:26 | |
*** markvoelker has quit IRC | 06:32 | |
*** apetrich has joined #openstack-infra | 06:34 | |
*** dchen has quit IRC | 06:34 | |
*** pcaruana has joined #openstack-infra | 06:35 | |
evrardjp | clarkb corvus pabelanger so there were 2 ways to secure against the leakage of information about the zuul executor: http://logs.openstack.org/43/663743/1/promote/openstack-helm-images-promote-elasticsearch-s3/30b200e/ara-report/result/9544aee4-1f89-4dcf-9696-88d5f45b2253/ ... Because I tried this locally, https://opendev.org/zuul/zuul/src/branch/master/zuul/executor/server.py#L409-L410 and it worked to still gather facts. I | 06:49 |
evrardjp | suppose we mask the setup module. | 06:49 |
*** jtomasek has joined #openstack-infra | 06:50 | |
*** hwoarang has quit IRC | 06:55 | |
*** yolanda__ is now known as yolanda | 06:56 | |
openstackgerrit | Jean-Philippe Evrard proposed zuul/zuul-jobs master: Revert "Explicitly store date facts for promote" https://review.opendev.org/663848 | 06:56 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: model: add annotateLogger procedure https://review.opendev.org/663819 | 06:58 |
*** hwoarang has joined #openstack-infra | 06:58 | |
openstackgerrit | Jean-Philippe Evrard proposed zuul/zuul-jobs master: Revert "Explicitly store date facts for promote" https://review.opendev.org/663848 | 06:58 |
*** slaweq has joined #openstack-infra | 06:59 | |
*** ginopc has joined #openstack-infra | 07:08 | |
*** tesseract has joined #openstack-infra | 07:09 | |
*** kopecmartin|off is now known as kopecmartin | 07:12 | |
openstackgerrit | Ian Wienand proposed opendev/zone-opendev.org master: Add RAX IAD/ORD opendev.org mirrors https://review.opendev.org/663849 | 07:13 |
*** imacdonn has quit IRC | 07:16 | |
*** dtantsur|afk is now known as dtantsur | 07:20 | |
openstackgerrit | Jean-Philippe Evrard proposed zuul/zuul-jobs master: Revert "Explicitly store date facts for promote" https://review.opendev.org/663848 | 07:21 |
*** markvoelker has joined #openstack-infra | 07:30 | |
*** ykarel has quit IRC | 07:30 | |
openstackgerrit | Ian Wienand proposed opendev/zone-opendev.org master: Add RAX IAD/ORD opendev.org mirrors https://review.opendev.org/663849 | 07:31 |
*** ykarel has joined #openstack-infra | 07:32 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Add RAX IAD/ORD opendev.org mirrors https://review.opendev.org/663852 | 07:38 |
*** pgaxatte has joined #openstack-infra | 07:39 | |
openstackgerrit | Ian Wienand proposed openstack/project-config master: Switch RAX IAD/ORD mirrors to new opendev.org mirrors https://review.opendev.org/663854 | 07:40 |
*** aedc has quit IRC | 07:40 | |
*** ccamacho has joined #openstack-infra | 07:41 | |
evrardjp | hello folks -- I have broken promote pipeline, and this is the fix: https://review.opendev.org/#/c/663848/ | 07:48 |
*** hwoarang has quit IRC | 07:49 | |
*** hwoarang has joined #openstack-infra | 07:50 | |
*** jaosorior has joined #openstack-infra | 07:51 | |
*** xek has joined #openstack-infra | 07:51 | |
*** Emine has joined #openstack-infra | 07:52 | |
*** jpena|off is now known as jpena | 07:53 | |
*** rcernin has quit IRC | 07:55 | |
openstackgerrit | Merged zuul/zuul-jobs master: Revert "Explicitly store date facts for promote" https://review.opendev.org/663848 | 07:58 |
*** ralonsoh has joined #openstack-infra | 07:59 | |
*** markvoelker has quit IRC | 08:03 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Add RAX IAD/ORD opendev.org mirrors https://review.opendev.org/663852 | 08:05 |
ianw | clarkb: https://review.opendev.org/#/q/topic:rax-mirrors+(status:open+OR+status:merged) is a little stack to bring online two more rax mirrors ... i think it would be good to run them for a bit and see if we get any weirdness like before | 08:06 |
ianw | the hosts are up, have the right volumes attached and cache drives mounted | 08:07 |
*** roman_g has joined #openstack-infra | 08:09 | |
*** pkopec has joined #openstack-infra | 08:10 | |
*** gfidente has joined #openstack-infra | 08:16 | |
*** lucasagomes has joined #openstack-infra | 08:17 | |
*** aedc has joined #openstack-infra | 08:22 | |
*** e0ne has joined #openstack-infra | 08:23 | |
*** trident has quit IRC | 08:35 | |
*** derekh has joined #openstack-infra | 08:37 | |
*** trident has joined #openstack-infra | 08:37 | |
*** imacdonn has joined #openstack-infra | 08:37 | |
mnasiadka | kolla jobs have some problems using ovh-gra1 nodepool provider - all our build jobs fail on connection to percona.com:443 - is there a way to temporarily switch all jobs to rax for example? | 08:45 |
openstackgerrit | Matthieu Huin proposed zuul/zuul-jobs master: install-nodejs: add support for RPM-based OSes https://review.opendev.org/631049 | 08:49 |
dpawlik | I guess percona blacklists us, not we percona :P | 08:50 |
*** emine__ has joined #openstack-infra | 08:51 | |
*** Emine has quit IRC | 08:52 | |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Add Bitbucket Server source functionality https://review.opendev.org/657837 | 08:54 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Create a basic Bitbucket build status reporter https://review.opendev.org/658335 | 08:54 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Create a basic Bitbucket event source https://review.opendev.org/658835 | 08:54 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Upgrade formatting of the patch series. https://review.opendev.org/660683 | 08:54 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 08:54 |
*** markvoelker has joined #openstack-infra | 09:00 | |
mnasiadka | dpawlik: either way it doesn’t work :) | 09:01 |
*** ykarel is now known as ykarel|lunch | 09:02 | |
*** priteau has joined #openstack-infra | 09:25 | |
openstackgerrit | Merged openstack/hacking master: Dropping the py35 testing https://review.opendev.org/654287 | 09:25 |
*** zbr has joined #openstack-infra | 09:29 | |
*** ykarel|lunch is now known as ykarel | 09:30 | |
*** jaosorior has quit IRC | 09:30 | |
*** markvoelker has quit IRC | 09:32 | |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 09:35 |
zbr | can we bring a bit more love to gertty? I use it and I know others doing the same, so maybe it would be a good idea to have more cores? Asking because a linting CR should not wait for 11 months.... | 09:36 |
*** tkajinam has quit IRC | 09:37 | |
*** lpetrut has quit IRC | 09:38 | |
*** jaosorior has joined #openstack-infra | 09:47 | |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 09:52 |
*** hrw has joined #openstack-infra | 09:54 | |
hrw | hello | 09:54 |
*** tdasilva_ has joined #openstack-infra | 09:57 | |
*** tdasilva_ is now known as tdasilva | 09:57 | |
*** yamamoto has quit IRC | 10:02 | |
*** lpetrut has joined #openstack-infra | 10:04 | |
*** lpetrut has quit IRC | 10:04 | |
*** lpetrut has joined #openstack-infra | 10:05 | |
*** pkopec is now known as pkopec|brb | 10:06 | |
*** hwoarang has quit IRC | 10:08 | |
*** hwoarang has joined #openstack-infra | 10:10 | |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 10:13 |
*** jaosorior has quit IRC | 10:13 | |
*** piotrowskim has quit IRC | 10:20 | |
*** ociuhandu has joined #openstack-infra | 10:25 | |
*** yamamoto has joined #openstack-infra | 10:29 | |
*** ociuhandu has quit IRC | 10:30 | |
*** ociuhandu has joined #openstack-infra | 10:31 | |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 10:35 |
*** yamamoto has quit IRC | 10:35 | |
*** pkopec|brb has quit IRC | 10:36 | |
*** hrw has left #openstack-infra | 10:38 | |
*** pkopec has joined #openstack-infra | 10:39 | |
*** zbr is now known as zbr|rover | 10:44 | |
*** Lucas_Gray has joined #openstack-infra | 10:45 | |
*** Lucas_Gray has quit IRC | 10:53 | |
*** udesale has quit IRC | 10:53 | |
*** udesale has joined #openstack-infra | 10:53 | |
*** janki has joined #openstack-infra | 10:57 | |
lpetrut | hi, is there anyone around that has the required rights to perform openstack github repo transfers? | 11:05 |
*** jaosorior has joined #openstack-infra | 11:07 | |
openstackgerrit | Aurelio Jargas proposed zuul/zuul master: Break long repo names to make them fit https://review.opendev.org/663899 | 11:12 |
*** Lucas_Gray has joined #openstack-infra | 11:24 | |
*** xek has quit IRC | 11:29 | |
*** markvoelker has joined #openstack-infra | 11:30 | |
*** Lucas_Gray has quit IRC | 11:32 | |
*** bobh has joined #openstack-infra | 11:33 | |
*** rosmaita has joined #openstack-infra | 11:35 | |
*** Lucas_Gray has joined #openstack-infra | 11:35 | |
evrardjp | mnasiadka: what's the issue? dns resolution? http problems? | 11:36 |
mnasiadka | evrardjp: connection timed out, seems like there’s a firewall somewhere that is blocking it - maybe some blacklist result | 11:37 |
evrardjp | not that I can change things, but switching to "where it works" sounds less good than fixing the problem in the first place | 11:38 |
mnasiadka | evrardjp: works from multiple other places | 11:38 |
*** jpena is now known as jpena|lunch | 11:38 | |
evrardjp | are you fetching a package there or? | 11:38 |
evrardjp | could you make use of infra's reverse proxy cache features instead? | 11:39 |
mnasiadka | evrardjp: Yes, a gpg key for yum repo and packages later | 11:39 |
evrardjp | (if it's always flaky I mean) | 11:39 |
evrardjp | gpg key can be vendored in I guess | 11:39 |
mnasiadka | evrardjp: if you can point me to details I can try it out :) | 11:39 |
evrardjp | well i guess the first step is to establish the problem, and if there is maybe someone working on a fix :D | 11:40 |
evrardjp | I suppose there are ppl here that can help you better than me, I just wanted to raise the fact there are many ways to skin that cat :) | 11:41 |
evrardjp | mnasiadka: the caching mirrors are in system-config (playbooks/roles/mirror/templates/mirror.vhost.j2) | 11:44 |
evrardjp | reverse proxy cache I mean | 11:44 |
evrardjp | one is percona | 11:44 |
evrardjp | it might help you there. | 11:44 |
*** EmilienM is now known as EvilienM | 11:45 | |
*** yamamoto has joined #openstack-infra | 11:46 | |
*** xek has joined #openstack-infra | 11:53 | |
*** yamamoto has quit IRC | 11:54 | |
*** ykarel is now known as ykarel|mtg | 11:54 | |
*** yamamoto has joined #openstack-infra | 11:56 | |
mordred | lpetrut: yup, which one do you need? | 11:56 |
lpetrut | the cloudbase-init one | 11:56 |
lpetrut | we removed the one from /cloudbase | 11:57 |
mordred | cool - lemme go try | 11:57 |
lpetrut | great, thanks! | 11:57 |
mordred | lpetrut: so - you either need to now add me as an admin to the cloudbase org - or I need to transfer it to someone who has admin who can then do the final transfer | 12:00 |
mordred | lpetrut: do you have repo creation rights there? | 12:01 |
mordred | (this is a quirk with how github transfers work) | 12:01 |
lpetrut | got it, I think you may transfer it to ociuhandu and we'll take it from there | 12:03 |
*** markvoelker has quit IRC | 12:03 | |
mordred | lpetrut: ociuhandu should now have a transfer request to approve | 12:04 |
lpetrut | done, thanks a lot for taking care of this | 12:04 |
mordred | my pleasure! | 12:05 |
openstackgerrit | Dmitry Tantsur proposed openstack/diskimage-builder master: ironic-agent: install mdadm on the ramdisk https://review.opendev.org/663916 | 12:06 |
*** udesale has quit IRC | 12:10 | |
*** bobh has quit IRC | 12:10 | |
*** udesale has joined #openstack-infra | 12:11 | |
*** rh-jelabarre has joined #openstack-infra | 12:17 | |
*** yamamoto has quit IRC | 12:18 | |
*** yamamoto has joined #openstack-infra | 12:19 | |
*** jcoufal has joined #openstack-infra | 12:20 | |
*** happyhemant has joined #openstack-infra | 12:20 | |
*** tdasilva has quit IRC | 12:20 | |
*** ianychoi_ has joined #openstack-infra | 12:23 | |
*** slaweq_ has joined #openstack-infra | 12:24 | |
*** ianychoi has quit IRC | 12:27 | |
*** slaweq has quit IRC | 12:27 | |
*** udesale has quit IRC | 12:27 | |
*** paladox has quit IRC | 12:27 | |
*** udesale has joined #openstack-infra | 12:27 | |
*** flaper87 has quit IRC | 12:27 | |
*** lpetrut has quit IRC | 12:27 | |
*** flaper87 has joined #openstack-infra | 12:29 | |
*** dansmith has quit IRC | 12:29 | |
*** lsell has quit IRC | 12:29 | |
*** lsell has joined #openstack-infra | 12:30 | |
*** rfarr_ has joined #openstack-infra | 12:31 | |
*** dansmith has joined #openstack-infra | 12:32 | |
*** rlandy has joined #openstack-infra | 12:32 | |
*** rlandy is now known as rlandy|ruck | 12:33 | |
*** rfarr__ has joined #openstack-infra | 12:33 | |
*** kjackal has quit IRC | 12:34 | |
*** rfarr_ has quit IRC | 12:36 | |
pabelanger | evrardjp: left comment on 662828 | 12:38 |
*** rtjure has joined #openstack-infra | 12:40 | |
*** aaronsheffield has joined #openstack-infra | 12:41 | |
*** tdasilva has joined #openstack-infra | 12:42 | |
openstackgerrit | Merged opendev/zone-opendev.org master: Add RAX IAD/ORD opendev.org mirrors https://review.opendev.org/663849 | 12:44 |
*** Lucas_Gray has quit IRC | 12:45 | |
fungi | zbr|rover: gertty is not an infra project. i recommend you take it up with the gertty maintainer directly. that said, lack of whitespace checking the source code hasn't been creating any bugs in it that i've noticed | 12:47 |
evrardjp | pabelanger: and answered... :) | 12:47 |
*** jpena|lunch is now known as jpena | 12:47 | |
evrardjp | thanks btw :) | 12:47 |
fungi | mnasiadka: evrardjp: i have a feeling proxying requests through our mirror server in the same region will also break similarly if it's percona blocking connections from ovh as a whole | 12:47 |
fungi | better to find out why percona has decided to block connectivity from ovh, if that's what's really going on | 12:48 |
evrardjp | fungi: that might be true, but I don't really know what the problem is, so I can't judge, this is why I said that mnasiadka needs ppl with more knowledge of what's going on there than I have | 12:48 |
pabelanger | evrardjp: okay, I'll have to look at docker jobs more | 12:49 |
*** markvoelker has joined #openstack-infra | 12:49 | |
evrardjp | pabelanger: thank you, kind sir! | 12:49 |
*** ginux has joined #openstack-infra | 12:50 | |
*** ginopc has quit IRC | 12:50 | |
*** ginux is now known as ginopc | 12:50 | |
*** Lucas_Gray has joined #openstack-infra | 12:52 | |
*** pkopec has quit IRC | 12:55 | |
fungi | evrardjp: yeah, we can perform some manual tests from ovh, and also talk to folks at ovh and at percona and try to get it straightened out. i doubt we're the only users in ovh having this problem if it really is breaking the way it sounds like | 12:55 |
*** kjackal has joined #openstack-infra | 12:57 | |
*** pkopec has joined #openstack-infra | 13:03 | |
*** janki has quit IRC | 13:04 | |
*** janki has joined #openstack-infra | 13:04 | |
*** lseki has joined #openstack-infra | 13:04 | |
*** mriedem has joined #openstack-infra | 13:10 | |
fungi | infra-root: per conversation with cdent in #openstack-dev, it looks like mailman hasn't processed any new messages to openstack-discuss for some hours. i'm digging into logs now to see if i can find a cause | 13:12 |
*** jcoufal has quit IRC | 13:13 | |
*** yamamoto has quit IRC | 13:15 | |
*** rkukura has quit IRC | 13:18 | |
clarkb | fungi: could be a stuck log again if the server was rebooted | 13:19 |
clarkb | s/log/lock/ | 13:19 |
openstackgerrit | Merged opendev/system-config master: mirror: rename 80/443 log files https://review.opendev.org/662357 | 13:21 |
fungi | didn't seem to be rebooted | 13:21 |
fungi | there was a recent-ish lock expiration logged in /srv/mailman/openstack/logs/locks | 13:22 |
*** priteau has quit IRC | 13:23 | |
clarkb | the lock file has a pid iirc and if that pid does not exist the file can be removed | 13:26 |
*** ykarel|mtg is now known as ykarel | 13:27 | |
clarkb | That was the big reason for stopping mail in the past iirc | 13:27 |
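The stale-lock check clarkb describes could be sketched in Python as below. This is a minimal sketch, assuming the PID has already been read out of the lock file; the helper name `pid_is_alive` is hypothetical and is not part of mailman.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    try:
        # Signal 0 performs only the existence and permission checks;
        # nothing is actually delivered to the process.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process, so a lock naming it is stale
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

A lock whose PID is not alive is only a candidate for removal, and the check is meaningful only when run on the host that took the lock.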
*** dpawlik has quit IRC | 13:27 | |
fungi | actually, that lock expiration i saw in the logs was from ~30 hours ago | 13:31 |
fungi | uptime for the server is 55 days | 13:31 |
*** rtjure has quit IRC | 13:32 | |
*** jamesmcarthur has joined #openstack-infra | 13:32 | |
fungi | and yeah, no locks for openstack-discuss in /srv/mailman/openstack/locks/ | 13:34 |
fungi | 5347 .pck files in /srv/mailman/openstack/qfiles/in/ currently and rising | 13:35 |
*** priteau has joined #openstack-infra | 13:37 | |
clarkb | are the mailman processes running? | 13:38 |
clarkb | remember different set per vhost | 13:38 |
*** rtjure has joined #openstack-infra | 13:42 | |
*** tdasilva has quit IRC | 13:45 | |
*** dpawlik has joined #openstack-infra | 13:46 | |
*** anteaya has quit IRC | 13:47 | |
*** yamamoto has joined #openstack-infra | 13:48 | |
*** yamamoto has quit IRC | 13:49 | |
*** yamamoto has joined #openstack-infra | 13:50 | |
*** tdasilva has joined #openstack-infra | 13:51 | |
*** tdasilva has quit IRC | 13:54 | |
*** tdasilva has joined #openstack-infra | 13:55 | |
fungi | yeah, i'm trying to figure out how to match them up, but maybe i'll just count | 13:55 |
*** ricolin has joined #openstack-infra | 13:55 | |
fungi | 5 processes matching '^list.*IncomingRunner' | 13:56 |
fungi | which is the number of sites we have on that server | 13:57 |
fungi | start time for all of them is april 12 when the server was last rebooted | 13:57 |
*** slaweq_ is now known as slaweq | 14:00 | |
*** tdasilva has quit IRC | 14:01 | |
*** jcoufal has joined #openstack-infra | 14:01 | |
*** tdasilva has joined #openstack-infra | 14:02 | |
*** tdasilva has quit IRC | 14:04 | |
*** tdasilva has joined #openstack-infra | 14:06 | |
*** ykarel is now known as ykarel|away | 14:06 | |
*** chandankumar is now known as raukadah | 14:06 | |
fungi | matching the pidfile up to the parent process, the incoming queue runner for the openstack site is 4677 and `strace -p 4677` indicates it's quite busy | 14:08 |
*** tdasilva has quit IRC | 14:08 | |
*** janki has quit IRC | 14:08 | |
*** tdasilva has joined #openstack-infra | 14:10 | |
clarkb | is it spinning on discarding a bunch of spam? | 14:10 |
fungi | hard to tell since mostly reads of a bunch of 8-bit data | 14:11 |
*** liuyulong has joined #openstack-infra | 14:11 | |
fungi | i do see a lot of snippets of the bounce message we reject posts with | 14:12 |
clarkb | iirc you can view the input pickle files to see if it is spam | 14:14 |
*** tdasilva has quit IRC | 14:14 | |
*** tdasilva has joined #openstack-infra | 14:15 | |
fungi | yeah, was just doing that. though it doesn't really tell me much because most of the posts to openstack-discuss are spam (and we have rules to just reject non-subscriber messages from the worst offender domains like qq.com and 163.com | 14:16 |
fungi | ) | 14:16 |
fungi | though looking at the queue dir, it's working its way through them so that's probably what's happened | 14:18 |
*** tdasilva has quit IRC | 14:18 | |
fungi | but not processing them as fast as they're arriving, since the number of files in that directory continues to increase | 14:19 |
ttx | It's been slow recently (~5-10min processing), but haven't seen 20-30 minutes yet | 14:19 |
fungi | the last message delivered to openstack-discuss was from 08:48:07z | 14:19 |
fungi | though the queue has caught up to messages from 10:38z now | 14:20 |
fungi | so i think nothing's broken, it's just really slowed by trying to handle all the spam | 14:20 |
fungi | i'll see if there are options to just drop the messages which match the rejection pattern rather than bouncing them | 14:21 |
*** iurygregory has quit IRC | 14:21 | |
openstackgerrit | Scott Little proposed openstack/project-config master: Add ansible-playbook repo to starlingx https://review.opendev.org/663954 | 14:23 |
*** roman_g has quit IRC | 14:23 | |
*** igordc has joined #openstack-infra | 14:23 | |
fungi | hrm, no, the patterns i was thinking of are already set to discard not reject | 14:23 |
fungi | (presumably so as to avoid creating backscatter) | 14:24 |
*** cdent has joined #openstack-infra | 14:24 | |
fungi | hrm, i bet the ones i see it rejecting in the strace are probably for a different ml | 14:25 |
fungi | openstack-discuss messages only account for ~50% of the backlog | 14:26 |
*** iurygregory has joined #openstack-infra | 14:26 | |
aakarsh | Hi, there seems to be an issue: the post job isn't getting triggered for one of the projects (browbeat). The commit to add the post job landed recently https://review.opendev.org/#/c/663451/4. And a commit was just merged https://review.opendev.org/#/c/626908/ but I can't see it in the queue https://zuul.openstack.org/status | 14:26 |
aakarsh | can anyone please help me find out what's wrong ^ | 14:26 |
*** dpawlik has quit IRC | 14:27 | |
*** Goneri has quit IRC | 14:29 | |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 14:29 |
fungi | this is the current breakdown: http://paste.openstack.org/show/752632/ | 14:30 |
fungi | aakarsh: take a look at https://zuul.openstack.org/builds?project=x%2Fbrowbeat | 14:32 |
*** Goneri has joined #openstack-infra | 14:33 | |
aakarsh | aah thanks fungi exactly what I was looking for | 14:33 |
fungi | aakarsh: http://logs.openstack.org/5c/5caaa650ff78aba018c8eadd3cbb32e5e9c8a519/post/browbeat-upload-git-mirror/4335148/ara-report/ | 14:33 |
fungi | rht-perf-ci@github.com: Permission denied (publickey). | 14:33 |
fungi | aakarsh: you may want to double-check that you generated the secret for that correctly (as in using the x/browbeat project's public key to encode it) | 14:34 |
corvus | #status log removed files02 from emergency file | 14:38 |
openstackstatus | corvus: finished logging | 14:38 |
*** eharney has joined #openstack-infra | 14:40 | |
cdent | fungi: any theories on the ml situation? | 14:41 |
fungi | cdent: yeah, seems it's working just backlogged by nearly 4 hours at the moment | 14:41 |
fungi | handling a larger than usual amount of spam from nonsubscribers | 14:41 |
cdent | blargh. thanks for looking into it. I don't mind a bit of a mail gap, but wanted to be sure nothing was wrong | 14:42 |
fungi | or at least i think it's a larger than usual volume... it always gets a lot of spam so it's possible something else has slowed its ability to process the incoming queue | 14:43 |
fungi | it's currently processing messages which arrived at 10:58z | 14:43 |
clarkb | any idea which way it is trending? | 14:44 |
clarkb | (is queue growing or shrinking) | 14:44 |
fungi | the backlog is currently growing, not shrinking, so i suspect we need to do something | 14:45 |
fungi | i'm still noodling on what | 14:45 |
fungi | sudo grep -r --files-with-match 'From [0-9]\+@qq.com' /srv/mailman/openstack/qfiles/in/|wc -l | 14:46 |
fungi | 5855 | 14:46 |
fungi | that's 99% of the messages in the backlog | 14:46 |
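The same count could also be taken from Python, for example to reuse it in a monitoring script. This is a rough sketch rather than anything run in the channel: the function name `count_matching_files` is hypothetical, and it assumes, as the grep above does, that the sender address appears as plain bytes inside the queue files.

```python
import re
from pathlib import Path

# Mirrors the grep pattern above (all-numeric local part at qq.com),
# with the dot escaped so it matches only a literal ".".
NUMERIC_QQ = re.compile(rb"From [0-9]+@qq\.com")


def count_matching_files(queue_dir, pattern=NUMERIC_QQ):
    """Count regular files under queue_dir whose raw bytes match pattern."""
    count = 0
    for path in Path(queue_dir).rglob("*"):
        # Read as bytes: the queue entries are binary, not text.
        if path.is_file() and pattern.search(path.read_bytes()):
            count += 1
    return count
```

Called as `count_matching_files("/srv/mailman/openstack/qfiles/in/")`, it would need the same read access that the `sudo grep` had.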
fungi | corvus: is it safe to delete files out of there? | 14:47 |
corvus | fungi: should be | 14:47 |
corvus | (except the one it's processing) | 14:47 |
fungi | anybody have an opinion on me deleting the messages from all-numeric qq.com addresses out of the incoming queue? | 14:48 |
clarkb | are they all all-numeric? | 14:48 |
clarkb | (I seem to recall that being a thing) | 14:48 |
clarkb | in other words that may not exclude valid emails | 14:48 |
fungi | no, [0-9]\+@qq.com addresses can be valid addresses, but the few legitimate messages i've from qq.com users aren't from the numeric addresses | 14:49 |
fungi | er, messages i've seen from | 14:50 |
clarkb | gotcha | 14:50 |
*** ykarel|away has quit IRC | 14:50 | |
aakarsh | hey fungi can you expand a bit on using x/browbeat project's public key? how do i get it. sorry I'm a bit confused. when i'd generated the key initially i used the ssh key that was added to the user (rht-perf-ci ) and then generated it as --tenant openstack https://zuul.openstack.org openstack/browbeat . which probably was the issue, so I re-generated with --tenant openstack https://zuul.openstack.org | 14:50 |
aakarsh | x/browbeat | 14:50 |
mordred | fungi: I think it seems like a good potential tradeoff - should we add a more permanent filter to block all-numeric qq addresses earlier? | 14:51 |
*** paladox_ has joined #openstack-infra | 14:51 | |
fungi | clarkb: right now we already tell mailman to silently discard messages from [0-9]\+@qq.com if they're not subscribed, though other lists may not and so are probably trying to send moderation notifications or rejection notices to them | 14:51 |
clarkb | aakarsh: did you add the key to the github account? | 14:51 |
aakarsh | yes I did clarkb | 14:52 |
fungi | er, i should say we already tell mailman to silently discard messages from [0-9]\+@qq.com to openstack-discuss if they're not subscribed | 14:52 |
clarkb | aakarsh: I would double check that you can ssh to github with that key locally outside of zuul | 14:52 |
clarkb | aakarsh: if that works maybe the encryption failed or you encrypted the wrong value? It should be the private key that gets encrypted in zuul | 14:52 |
clarkb | fungi: gotcha, then ya that seems mostly safe? | 14:52 |
aakarsh | ack checking clarkb, i did encrypt the private key. | 14:53 |
*** paladox_ is now known as paladox | 14:53 | |
*** paladox is now known as paladox__ | 14:53 | |
*** paladox__ is now known as paladox | 14:53 | |
corvus | fungi, clarkb: we could enable spf checking in exim. qq.com has a -all record. | 14:54 |
clarkb | corvus: that would ensure the origin of the smtp connection is a valid qq.com server? that seems reasonable | 14:54 |
corvus | yep. the mail we are getting currently is not -- it's from botnets | 14:55 |
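For readers unfamiliar with the "-all" remark: the final `all` mechanism of an SPF record carries a qualifier, and `-` means mail from any host not listed earlier in the record should fail the check. A small Python sketch of reading that qualifier from a record string follows. It does no DNS lookup and is not the exim configuration; the function name `spf_all_qualifier` is hypothetical, and the example record in the note after the code is made up rather than qq.com's published record.

```python
def spf_all_qualifier(record):
    """Return the qualifier on the 'all' mechanism of an SPF record, or None."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return None  # not an SPF version 1 record
    for term in terms[1:]:
        lowered = term.lower()
        if lowered == "all":
            return "+"  # a mechanism without a qualifier defaults to pass
        if len(lowered) == 4 and lowered[0] in "+-~?" and lowered[1:] == "all":
            return lowered[0]
    return None  # the record has no 'all' mechanism
```

With a made-up record such as `"v=spf1 include:spf.example.net -all"` the function returns `"-"`, the hard-fail case that lets a receiving server refuse senders outside the listed hosts.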
fungi | yeah, that would help | 14:55 |
fungi | would allow dropping the discard pattern from openstack-discuss too | 14:55 |
fungi | also would reduce the amount of spam i sift through in the moderation queue every day | 14:55 |
mordred | yeah - and dropping at the exim layer would reduce the load on mailman | 14:56 |
fungi | since i manually inspect the messages (or at least the subjects) from non-numeric @qq.com non-subscribers | 14:56 |
fungi | very much so, yes | 14:56 |
fungi | well, dropping or rejecting at rcpt time, the latter is preferable | 14:56 |
clarkb | unrelated but for booting a gitea06 replacement. Do we have a preference for that being called "gitea06" and do replacement with same name or should I make a gitea09 and delete gitea06 and retire that name for now? | 14:58 |
corvus | i would keep 06 | 14:59 |
fungi | yeah, that seems fine | 14:59 |
fungi | just delete the old 06 first | 14:59 |
fungi | the reason to do 09 is if we needed to keep 06 around temporarily | 15:00 |
clarkb | well we don't need to delete the old one first either | 15:01 |
clarkb | our inventory should handle duplicates iirc | 15:01 |
clarkb | nope I'm wrong | 15:01 |
clarkb | that changed when we switched to the static inventory | 15:01 |
clarkb | I guess I remove the old one from inventory then can delete it after. I'll probably do that | 15:02 |
*** pkopec has quit IRC | 15:02 | |
openstackgerrit | Marcin Juszkiewicz proposed opendev/system-config master: epel: mirror also aarch64 https://review.opendev.org/663973 | 15:04 |
fungi | #status log deleted ~6k messages matching 'From [0-9]\+@qq.com' in /srv/mailman/openstack/qfiles/in/ on lists.o.o | 15:04 |
openstackstatus | fungi: finished logging | 15:04 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Enable SPF checking on all incoming mail https://review.opendev.org/663974 | 15:04 |
corvus | fungi, clarkb, mordred: ^ that's copypasta from the exim manual, i think that would do it. | 15:04 |
fungi | 6 new messages have been delivered to openstack-discuss now that mm has caught up on the processing backlog for openstack lists | 15:06 |
fungi | cdent: yours was among them | 15:06 |
clarkb | corvus: sounds great to me. | 15:06 |
cdent | cool | 15:06 |
fungi | corvus: thanks, reviewing now as the backlog is already growing again | 15:06 |
corvus | puppet just ran on files02 and apache is happy | 15:07 |
fungi | we're already up to almost 100 new messages waiting in the incoming queue | 15:08 |
fungi | i'm going to step away for just a moment but will brb | 15:08 |
*** cdent has quit IRC | 15:08 | |
corvus | docs.opendev.org, tarballs.opendev.org, zuul-ci.org all seem to work | 15:08 |
clarkb | Taking another look at vexxhost flavors for new gitea06. I think we need 4vcpu and 6GB of RAM. The current flavor seems to be the only one that minimally captures that need. (we can do 2vcpu + 8GB RAM or 8vcpu 8GB of RAM) | 15:08 |
clarkb | I guess i can ask mnaser if using 16GB of ram with 4vcpu is preferable to 8vcpu + 8GB ram | 15:09 |
clarkb | mnaser: ^ do you have an opinion on that? | 15:09 |
clarkb | corvus: yay | 15:10 |
mnaser | I'm equally okay with both. I would imagine that 4/16 will give a better experience for the users because more file cache | 15:10 |
mnaser | Whatever works best for you! | 15:11 |
clarkb | mnaser: thanks | 15:11 |
*** aedc has quit IRC | 15:12 | |
mnaser | No problem. Thanks for asking! | 15:12 |
clarkb | /var/backups/gitea-mariadb/gitea-mariadb.sql.gz exists on gitea01 so that backup cron worked \o/ | 15:12 |
clarkb | The last bit we'll have to sort out is how to recover gitea06's db from say gitea01 | 15:13 |
clarkb | but that can happen after we have the server up and running I think | 15:13 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 15:16 |
*** e0ne has quit IRC | 15:22 | |
clarkb | oh right I was gonna look for leaked images in other clouds. I should do that before gitea06 so I don't forget | 15:23 |
*** bnemec is now known as beekneemech | 15:29 | |
clarkb | most of the images we leak appear to be "queued" which I think means we fail to upload to them | 15:30 |
*** smarcet has joined #openstack-infra | 15:32 | |
clarkb | I'm going to clean out 2k ish leaked images in inap now | 15:35 |
clarkb | the vast majority are in a 'queued' state | 15:36 |
clarkb | mordred: Shrews ^ I expect that this is due to a flaw in the error catching code in shade/sdk | 15:37 |
*** aakarsh has quit IRC | 15:37 | |
mordred | clarkb: I don't think users have the ability to delete queued images | 15:38 |
*** cmurphy is now known as cmorpheus | 15:38 | |
clarkb | mordred: they seem to be deleting just fine now | 15:38 |
mordred | clarkb: oh - neat | 15:38 |
clarkb | (at least openstackclient isn't returning errors, I'll do a new listing after my for loop completes) | 15:38 |
mordred | clarkb: still - it's a weird state for nodepool to be able to know what to do about - if an upload sticks in queued, when would we clean it out? or maybe we're not recording it properly yet or something? | 15:39 |
clarkb | mordred: nodepool doesn't know about the images. I think as far as it is concerned the upload failed and it "cleaned it up" | 15:39 |
mordred | I suppose at some point in life we move on to upload another image - so maybe that's a good time to clean out old queued images? | 15:39 |
mordred | AH | 15:39 |
mordred | with you now | 15:40 |
clarkb | my process is: do a cloud image list, remove all images that show up in nodepool image-list, then run a for loop over the remaining uuids deleting them all | 15:40 |
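The cleanup process clarkb describes amounts to a set difference: everything the cloud lists, minus everything nodepool still tracks. A minimal sketch under that reading follows. The function name `find_leaked_images` and the sample IDs are hypothetical, and gathering the two listings and issuing the deletes (done with openstackclient in the log) is deliberately left out.

```python
def find_leaked_images(cloud_image_ids, nodepool_image_ids):
    """Return cloud image IDs that nodepool has no record of, sorted.

    Both arguments are plain iterables of image UUID strings; how they are
    collected from the cloud and from nodepool is outside this sketch.
    """
    return sorted(set(cloud_image_ids) - set(nodepool_image_ids))


# Hypothetical sample data standing in for the two listings.
cloud = ["aaa", "bbb", "ccc", "ddd"]
nodepool = ["bbb", "ddd"]

leaked = find_leaked_images(cloud, nodepool)
for image_id in leaked:
    # In the real process each of these would be passed to an image delete.
    print("would delete", image_id)
```

As noted later in the log, this is only safe to run when no uploads are in progress, since an image that is still uploading may not yet appear in the nodepool listing.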
*** pgaxatte has quit IRC | 15:40 | |
*** roman_g has joined #openstack-infra | 15:42 | |
*** smarcet has quit IRC | 15:44 | |
*** lucasagomes has quit IRC | 15:47 | |
*** smarcet has joined #openstack-infra | 15:47 | |
*** markvoelker has quit IRC | 15:49 | |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 15:50 |
clarkb | corvus: http://logs.openstack.org/74/663974/1/check/system-config-run-base/39de554/job-output.txt.gz#_2019-06-07_15_33_47_106670 yay for testing | 15:50 |
clarkb | corvus: I guess something isn't quite right in the spf change | 15:50 |
*** gyee has joined #openstack-infra | 15:51 | |
corvus | exim4-daemon-light could be the culprit | 15:53 |
*** iurygregory has quit IRC | 15:55 | |
openstackgerrit | James E. Blair proposed opendev/system-config master: Enable SPF checking on all incoming mail https://review.opendev.org/663974 | 15:56 |
corvus | clarkb: let's see if that's any different | 15:56 |
*** ramishra has quit IRC | 15:56 | |
*** michael-beaver has joined #openstack-infra | 15:57 | |
clarkb | is that like heavy water? more deuterium? | 15:57 |
corvus | clarkb: yes, you don't want to drink exim4-daemon-heavy | 15:57 |
corvus | the light package has zero calories. | 15:58 |
*** dtantsur is now known as dtantsur|afk | 15:58 | |
*** aedc has joined #openstack-infra | 15:58 | |
openstackgerrit | Ben Nemec proposed openstack/pbr master: Make WSGI tests listen on localhost https://review.opendev.org/663758 | 16:00 |
openstackgerrit | Ben Nemec proposed openstack/pbr master: Use sphinxcontrib-apidoc for api docs https://review.opendev.org/663985 | 16:00 |
clarkb | deleting images via api is not very quick | 16:00 |
*** ginopc has quit IRC | 16:02 | |
fungi | debian has finally done away with the heavy/light split for exim packages | 16:04 |
*** Lucas_Gray has quit IRC | 16:04 | |
openstackgerrit | Scott Little proposed openstack/project-config master: Add ansible-playbook repo to starlingx https://review.opendev.org/663954 | 16:04 |
*** stephenfin is now known as finucannot | 16:05 | |
*** jtomasek has quit IRC | 16:05 | |
*** aedc has quit IRC | 16:05 | |
openstackgerrit | Ben Nemec proposed openstack/pbr master: Switch to release.o.o for constraints https://review.opendev.org/663988 | 16:05 |
*** aakarsh has joined #openstack-infra | 16:06 | |
fungi | oh, i thought they had, but apparently not. i guess it's still under discussion | 16:07 |
*** mattw4 has joined #openstack-infra | 16:07 | |
*** xek has quit IRC | 16:11 | |
fungi | speaking of backlogs, looks like zuul's been pegged on available quota since ~08:00z and the waiting node request count is still rising | 16:11 |
clarkb | there are a lot of tripleo and nova changes in the queue | 16:11 |
clarkb | tripleo gate is 45 changes deep | 16:12 |
clarkb | and second change caused a reset | 16:12 |
clarkb | er 45 is gate total not just tripleo | 16:12 |
*** panda is now known as panda|off | 16:13 | |
clarkb | in any case it is busy | 16:13 |
fungi | what has the utilization breakdown been looking like? done an analysis run for may yet? | 16:14 |
clarkb | I haven't | 16:14 |
clarkb | running one now | 16:16 |
clarkb | also I think we merged the statsd tracking change but unsure if scheduler has been restarted since | 16:16 |
clarkb | so we can put up a grafana dashboard soon probably | 16:16 |
*** smarcet has quit IRC | 16:16 | |
openstackgerrit | Ben Nemec proposed openstack/pbr master: Make WSGI tests listen on localhost https://review.opendev.org/663758 | 16:17 |
openstackgerrit | Ben Nemec proposed openstack/pbr master: Switch to release.o.o for constraints https://review.opendev.org/663988 | 16:17 |
clarkb | http://paste.openstack.org/show/752634/ neutron now takes top honors | 16:17 |
clarkb | tempest-slow-py3 is our biggest resource hog now | 16:18 |
clarkb | about twice as much as tempest-full-py3 (I find that interesting) | 16:18 |
logan- | scheduler restart soon might be a good idea also because memory usage is starting to creep up again: http://cacti.openstack.org/cacti/graph_image.php?action=view&local_graph_id=64792&rra_id=2 | 16:18 |
clarkb | logan-: good call | 16:19 |
clarkb | corvus: ^ is that something you have debugging stuff in place for? | 16:19 |
*** emine__ has quit IRC | 16:19 | |
corvus | clarkb: nope, i'm not prepared to debug it at all | 16:20 |
logan- | seems like its on a similar trajectory as the issue 2 weeks ago where ZK resets started happening http://cacti.openstack.org/cacti/graph_image.php?action=view&local_graph_id=64792&rra_id=3 | 16:20 |
clarkb | k, should we make any changes before restarting to make debugging easier next time? I'm thinking of the repl maybe? | 16:21 |
clarkb | http://logs.openstack.org/74/663974/2/check/system-config-run-base/f411e24/job-output.txt.gz#_2019-06-07_16_13_06_270278 exim still failing | 16:21 |
corvus | clarkb: i'll rebase the repl change | 16:21 |
aakarsh | hi fungi clarkb i was able to clone and update https://github.com/cloud-bulldozer/browbeat/tree/test with the ssh key. not sure what i'm doing wrong. I took the private key from the same host I was able to pull and update and encrypted by running with options --infile ~/.ssh/id_rsa --tenant openstack https://zuul.openstack.org x/browbeat | 16:23 |
*** yamamoto has quit IRC | 16:23 | |
clarkb | does white space around = matter in exim config? | 16:23 |
clarkb | seems like no based on other content in the file | 16:24 |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP add repl https://review.opendev.org/579962 | 16:24 |
corvus | clarkb: we can do a cherry-pick and manual install of that | 16:24 |
clarkb | corvus: k | 16:24 |
*** priteau has quit IRC | 16:25 | |
aakarsh | the .zuul.yaml file is here https://opendev.org/x/browbeat/src/branch/master/.zuul.yaml | 16:25 |
clarkb | looking at my local github ssh remotes the username is actually git | 16:27 |
clarkb | is it possible that the username value there isn't needed? | 16:27 |
fungi | aakarsh: and the rht-perf-ci user has push permissions for the corresponding branch of that repository? | 16:27 |
clarkb | I have to pop out in a minute but let me check ara's setup really quick | 16:27 |
fungi | clarkb: aakarsh: oh, yep that is likely the cause | 16:28 |
aakarsh | fungi, yes rht-perf-ci user has push permission. | 16:28 |
aakarsh | ooh so user should be git | 16:28 |
clarkb | https://opendev.org/recordsansible/ara-infra/src/branch/master/.zuul.yaml#L16-L62 I think that is the issue | 16:28 |
clarkb | user should be git | 16:28 |
corvus | aakarsh: what instructions were you following? | 16:28 |
*** mriedem has quit IRC | 16:28 | |
aakarsh | ah makes sense i looked at airshipit/armada they had user as git https://github.com/airshipit/armada/blob/master/.zuul.yaml#L224 | 16:28 |
aakarsh | http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005007.html | 16:28 |
corvus | so that we can make sure they are updated | 16:28 |
aakarsh | is what I was following ^ corvus | 16:28 |
clarkb | ok I'm popping out for a bit. My image deletes will keep running | 16:29 |
*** mriedem has joined #openstack-infra | 16:29 | |
corvus | aakarsh: thanks | 16:29 |
clarkb | technically the docs are correct because the remote user is git | 16:29 |
aakarsh | i'll try again with git and get back to you. | 16:29 |
corvus | that email is technically correct, but misleading for github users | 16:29 |
clarkb | its just not obvious that it is git | 16:29 |
clarkb | ya | 16:29 |
fungi | aakarsh: https://opendev.org/recordsansible/ara-web/src/branch/master/.zuul.yaml#L36 | 16:29 |
fungi | so, yes, should be git@ | 16:29 |
clarkb | I guess when I get back we can plan to do a zuul scheduler restart and I'll keep deleting images in clouds | 16:30 |
*** smarcet has joined #openstack-infra | 16:30 | |
aakarsh | yep thanks corvus fungi http://paste.openstack.org/show/752637/ i should've directly tried to ssh. sorry for the chaos. | 16:33 |
fungi | no worries, glad you got it sorted | 16:34 |
openstackgerrit | Jeremy Stanley proposed zuul/zuul-jobs master: [DNM] Test unittests and multinode with base-test https://review.opendev.org/663995 | 16:39 |
*** tesseract has quit IRC | 16:40 | |
*** owalsh has quit IRC | 16:40 | |
openstackgerrit | Jeremy Stanley proposed openstack/openstack-zuul-jobs master: [DNM] Test some jobs on top of base-test https://review.opendev.org/663996 | 16:41 |
*** jaosorior has quit IRC | 16:42 | |
corvus | okay, that SPF option is available starting in 4.91; we're at 4.86 on lists.o.o, so we'll need to do it the old debian way | 16:43 |
openstackgerrit | Michael Johnson proposed openstack/diskimage-builder master: Remove the rhel 8 check for xfs https://review.opendev.org/663998 | 16:45 |
*** markvoelker has joined #openstack-infra | 16:50 | |
*** efried is now known as fried_rolls | 16:51 | |
*** weifan has joined #openstack-infra | 16:51 | |
*** owalsh has joined #openstack-infra | 16:52 | |
fungi | oh, ouch, even the exim4 package on bionic is slightly too old to get that | 16:53 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Enable SPF checking on all incoming mail https://review.opendev.org/663974 | 16:53 |
corvus | perl to the rescue | 16:54 |
*** yamamoto has joined #openstack-infra | 16:54 | |
fungi | i'm guessing the volume of dns queries will be mitigated by our use of unbound? | 16:55 |
fungi | since most of the queries will be cache hits | 16:55 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Proposed spec: tenant-scoped admin web API https://review.opendev.org/562321 | 16:56 |
*** jpena is now known as jpena|off | 16:56 | |
fungi | in the interim, i'm clearing out more messages from [0-9]\+@qq.com since we've reached a 30-minute backlog | 16:58 |
fungi | that removed >99% of the ~850 messages in the inbound queue | 16:59 |
*** bobh has joined #openstack-infra | 16:59 | |
*** yamamoto has quit IRC | 16:59 | |
zbr|rover | please let me know if the proposed openstack-tox-mol template looks ok now: https://review.opendev.org/#/c/663599/ --- I added links to two jobs that depends on it and it seems to work correctly. | 16:59 |
*** kopecmartin is now known as kopecmartin|off | 17:03 | |
*** derekh has quit IRC | 17:04 | |
*** weifan has quit IRC | 17:04 | |
*** weifan has joined #openstack-infra | 17:05 | |
*** weifan has quit IRC | 17:05 | |
*** weifan has joined #openstack-infra | 17:05 | |
*** weifan has quit IRC | 17:06 | |
*** liuyulong has quit IRC | 17:06 | |
*** weifan has joined #openstack-infra | 17:06 | |
*** weifan has quit IRC | 17:07 | |
*** weifan has joined #openstack-infra | 17:07 | |
*** weifan has quit IRC | 17:08 | |
*** weifan has joined #openstack-infra | 17:08 | |
*** weifan has quit IRC | 17:08 | |
*** ralonsoh has quit IRC | 17:09 | |
*** bobh has quit IRC | 17:10 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add opendev tarball jobs https://review.opendev.org/664006 | 17:12 |
*** diablo_rojo has joined #openstack-infra | 17:13 | |
clarkb | ok mostly back now | 17:13 |
clarkb | fungi: ya we should cache all those lookups locally | 17:14 |
*** factor has joined #openstack-infra | 17:14 | |
clarkb | zbr|rover: I was actually thinking about that a bit and wondered if a molecule job would make sense in zuul-jobs | 17:14 |
*** ociuhandu_ has joined #openstack-infra | 17:15 | |
zbr|rover | clarkb: if you want, I can add it there. my only concern was related to success/failure-urls. | 17:15 |
clarkb | zbr|rover: well I was thinking that because zuul uses ansible roles, better testing support for ansible roles in zuul itself would be good | 17:15 |
clarkb | for the reports url I think it is fine since it is relative path and specific to the tool | 17:16 |
fungi | heh, i had asked the same question when i looked at it | 17:16 |
zbr|rover | clarkb: i would love to make a poc that applies to zuul roles too. getting almost instant feedback is very useful when making changes. | 17:16 |
*** ociuhandu has quit IRC | 17:18 | |
*** ociuhandu_ has quit IRC | 17:20 | |
corvus | fungi, clarkb: 663974 passes, however, we don't have a lists.openstack.org host in our testing (because it's still "puppet"), so that change isn't really tested | 17:24 |
*** ricolin has quit IRC | 17:24 | |
corvus | maybe i should add a new job in that change | 17:25 |
*** markvoelker has quit IRC | 17:25 | |
fungi | we could test it turned on globally, but not merge it that way, as a compromise | 17:25 |
corvus | yeah, that would work too | 17:26 |
corvus | lemme see how big a deal a new job is first | 17:26 |
corvus | cause i think we'd want that eventually anyway | 17:26 |
fungi | or throw a child change on which enables it | 17:26 |
fungi | but yeah, an actual job would be awesome | 17:26 |
clarkb | as a double check /etc/resolv.conf points to localhost and localhost:53 is unbound | 17:29 |
clarkb | so we should be doing what we can to minimize those dns lookups | 17:29 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Enable SPF checking on lists https://review.opendev.org/663974 | 17:29 |
fungi | node request backlog looks like it may have peaked for today, so hopefully zuul will begin gaining ground on its backlog now | 17:30 |
corvus | fungi, clarkb: does that job look good? i didn't add any testinfra stuff, but that should exercise the whole deployment on the list server | 17:30 |
clarkb | looking (and at least applying the ansible is a start) | 17:31 |
*** weifan has joined #openstack-infra | 17:31 | |
corvus | oh i think i can add a testinfra thing | 17:31 |
clarkb | corvus: you should be able to check that port 25 is listening | 17:32 |
clarkb | and exim is running? | 17:32 |
fungi | yeah, even just making sure exim doesn't crash/fail to restart would be helpful | 17:32 |
corvus | oh, actually, we'll run the existing testinfra tests | 17:32 |
corvus | test_base.py runs on the 'all' group | 17:32 |
corvus | so that's going to run "exim -bt root", which would crash if the config is invalid | 17:32 |
corvus | i believe that's what caught the earlier error | 17:33 |
*** smarcet has quit IRC | 17:33 | |
fungi | perfect | 17:33 |
corvus | oh, actually, it was ansible that caught it before that | 17:33 |
corvus | anyway, i think we're good :) | 17:33 |
clarkb | k | 17:33 |
clarkb | do we need to add that package to a package install list somewhere? | 17:34 |
corvus | (in the future, maybe we can have testinfra send mail to a mailing list, but that's out of scope today i think) | 17:34 |
clarkb | spf-tools-perl specifically | 17:34 |
corvus | did i....forget to git add? | 17:34 |
corvus | yep | 17:34 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Enable SPF checking on lists https://review.opendev.org/663974 | 17:34 |
clarkb | inap image clean up finally finished. I'm going to do the next cloud region | 17:35 |
*** weifan has quit IRC | 17:35 | |
*** aedc has joined #openstack-infra | 17:36 | |
openstackgerrit | Stephen Finucane proposed openstack/pbr master: Stop using pbr sphinx integration https://review.opendev.org/655565 | 17:38 |
*** smarcet has joined #openstack-infra | 17:39 | |
*** aedc has quit IRC | 17:41 | |
*** roman_g has quit IRC | 17:45 | |
clarkb | corvus: most recent ps lgtm | 17:50 |
clarkb | we'll need to rebase ianws changes assuming that gets in. I can do that so he has up to date ci results | 17:51 |
*** weifan has joined #openstack-infra | 17:51 | |
fungi | what incorporates the roles/exim/tasks/Debian.yaml role? | 17:54 |
fungi | er, tasklist, whatever it's called | 17:54 |
*** jamesmcarthur has quit IRC | 17:54 | |
fungi | something preexisting in the exim role/ | 17:55 |
fungi | ? | 17:55 |
clarkb | fungi: yes there is a default.yaml and a RedHat.yaml that get loaded | 17:55 |
clarkb | and now we'll have a Debian.yaml | 17:55 |
clarkb | (I checked that when doing the review) | 17:55 |
fungi | ahh, so it's some sort of magic fact matching which notices it's there and adds it in? | 17:56 |
fungi | based on operating system family or something like that? | 17:57 |
clarkb | yes its a specific include tasks task | 17:57 |
clarkb | let me get a link | 17:57 |
clarkb | https://opendev.org/opendev/system-config/src/branch/master/roles/exim/tasks/main.yaml#L9-L15 | 17:57 |
fungi | aha, so it will discover it if a match exists. awesome | 17:59 |
zbr|rover | clarkb: fungi: have a look at this https://review.opendev.org/#/c/661994/8/tasks/main.yml -- about layered loading of distro specific configs. | 18:02 |
*** finucannot is now known as stephenfin | 18:02 | |
*** gfidente has quit IRC | 18:02 | |
zbr|rover | i found it suggested on an ansible bug, used it already in a couple of places. | 18:02 |
clarkb | ya we do similar though simpler because fewer cases to worry about | 18:03 |
zbr|rover | sure, just wanted to share the idea, no need to go so deep this the loop | 18:03 |
clarkb | corvus: you updated .zuul.yaml so gitea ran and it failed http://logs.openstack.org/74/663974/5/check/system-config-run-gitea/05bbf86/job-output.txt.gz | 18:05 |
clarkb | maybe if the lists test passes we remove the test, merge it, then add the test back? | 18:05 |
fungi | wfm | 18:06 |
aakarsh | corvus, fungi just wanted to let you guys know that updating user to git fixed the problem, and the remote repository is now in sync :) thanks again | 18:06 |
fungi | aakarsh: great! glad it worked out | 18:06 |
fungi | clarkb: so 3000/tcp on the gitea server ends up rejecting connections? | 18:10 |
*** smarcet has quit IRC | 18:12 | |
clarkb | fungi: seems like it | 18:13 |
fungi | is that new? do you know? i guess we're maybe testing newer gitea and there are regressions or something? | 18:14 |
clarkb | I think we should only be testing what is in corvus' fork so if that updated then that is possible. Otherwise should be the same version | 18:14 |
clarkb | no idea if that is new | 18:14 |
fungi | ahh, i thought we had switched off the fork a couple weeks back | 18:15 |
clarkb | I thought there was one remaining change? Maybe I'm mistaken in which case ya could be a gitea update breaking us | 18:15 |
*** smarcet has joined #openstack-infra | 18:16 | |
fungi | well, anyway, so when you say "remove the test" and "add the test back" you're referring to the gitea test i guess? | 18:16 |
fungi | otherwise we can't land the exim test until we work out why the gitea test is broken (short of bypassing zuul) | 18:17 |
clarkb | no remove the lists.o.o test so that we don't have changes to .zuul.yaml | 18:17 |
clarkb | the gitea job won't run if we remove the .zuul.yaml change | 18:17 |
*** aedc has joined #openstack-infra | 18:18 | |
*** ccamacho has quit IRC | 18:18 | |
fungi | sure, but then we don't have a working exim test merged | 18:19 |
fungi | either way we seem to not have a working gitea test merged at the moment | 18:19 |
corvus | maybe we should drop the .zuul.yaml file matchers | 18:19 |
fungi | that could be another option, though would allow us to merge broken changes to those jobs, i suppose | 18:20 |
*** bobh has joined #openstack-infra | 18:20 | |
corvus | yeah, we'll have to be careful. neither thing (run no jobs on .zuul.yaml changes / run all jobs on .zuul.yaml changes) is what we want. someday i'll have time to add 'run this job if its config changes' support to zuul, which is what we really want) | 18:21 |
clarkb | if anyone is wondering: "Image transition from deleted to deleted is not allowed" | 18:21 |
fungi | digging into the gitea node log from that failure now to see if i can spot the problem | 18:21 |
fungi | heh, so you can't delete a deleted image, huh? | 18:21 |
clarkb | nope | 18:21 |
*** markvoelker has joined #openstack-infra | 18:22 | |
*** aedc has quit IRC | 18:23 | |
corvus | fungi: based on http://logs.openstack.org/74/663974/5/check/system-config-run-gitea/05bbf86/gitea01.opendev.org/docker/giteadocker_gitea-web_1.txt it looks like it could not connect to the db | 18:23 |
fungi | ahh | 18:23 |
fungi | 2019-06-07 17:57:37 0 [Note] mysqld: ready for connections. | 18:24 |
fungi | so not for lack of a running db server at least | 18:24 |
corvus | yeah, i'm puzzled. | 18:24 |
corvus | fun fact: we're in limestone here | 18:24 |
corvus | and attempting to connect to [::1]:3306 | 18:25 |
clarkb | maybe mariadb isn't listening on v6? | 18:25 |
corvus | in production we have: tcp6 0 0 [::]:mysql [::]:* LISTEN | 18:27 |
corvus | while we're pondering, i'm going to recheck that and see if we get a different answer | 18:31 |
clarkb | k | 18:31 |
fungi | good idea | 18:31 |
fungi | could be this hasn't bitrotted, merely only worked on certain providers | 18:31 |
corvus | we've got the gitea image and mariadb images pinned | 18:32 |
*** jcoufal has quit IRC | 18:32 | |
corvus | though, the mariadb image is not pinned very specifically -- it's 10.4, which was updated 3 days ago | 18:33 |
*** bobh has quit IRC | 18:34 | |
corvus | we can drop back to 10.4.4 or 10.4.3 if we want to backtrack a bit to 23 days or 3 months ago, respectively | 18:34 |
clarkb | we have a debian-jessie image in ovh bhs1. Not for much longer | 18:34 |
clarkb | corvus: possible that the binding behavior changed between those releases and the one 3 days ago I suppose | 18:34 |
clarkb | corvus: might be worth pinning more specifically if recheck reproduces | 18:34 |
corvus | ya | 18:35 |
openstackgerrit | James E. Blair proposed opendev/system-config master: DNM: try mariadb 10.4.4 https://review.opendev.org/664027 | 18:36 |
openstackgerrit | James E. Blair proposed opendev/system-config master: DNM: try mariadb 10.4.3 https://review.opendev.org/664028 | 18:36 |
corvus | or, you know, maybe just throw some computers at the problem and see what they come up with | 18:36 |
corvus | maybe we'll have 3 new data points after lunch :) | 18:37 |
openstackgerrit | Gaëtan Trellu proposed openstack/project-config master: Add api-ref for Qinling https://review.opendev.org/664030 | 18:37 |
*** diablo_rojo has quit IRC | 18:43 | |
*** smarcet has quit IRC | 18:44 | |
*** rascasoft has quit IRC | 18:52 | |
*** markvoelker has quit IRC | 18:55 | |
*** rascasoft has joined #openstack-infra | 18:55 | |
*** e0ne has joined #openstack-infra | 18:59 | |
openstackgerrit | James E. Blair proposed opendev/base-jobs master: Fix js content tarball job name https://review.opendev.org/664032 | 19:00 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Add opendev tarball jobs https://review.opendev.org/664006 | 19:02 |
corvus | should i change opendev-promote-javascript-content to opendev-promote-javascript-content-tarball for consistency? | 19:03 |
clarkb | might help reduce further confusion, but I don't think it is necessary | 19:05 |
fungi | no opinion | 19:07 |
fungi | if we're not going to promote other javascript content besides tarballs, then i don't find it confusing | 19:08 |
clarkb | the gitea job succeeded on recheck | 19:11 |
clarkb | on the exim change | 19:11 |
fungi | harumph | 19:12 |
clarkb | ovh, vexxhost, inap leaked images are cleared out. Limestone is in progress. I'll finish up with linaro-london and rax | 19:12 |
clarkb | but I need to pop out now for lunch activities | 19:12 |
openstackgerrit | Merged opendev/base-jobs master: Fix js content tarball job name https://review.opendev.org/664032 | 19:12 |
clarkb | the deletes take time anyway | 19:12 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Store autohold requests in zookeeper https://review.opendev.org/661114 | 19:15 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add autohold-info CLI command https://review.opendev.org/662487 | 19:15 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Record held node IDs with autohold request https://review.opendev.org/662498 | 19:15 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: WIP: Auto-delete expired autohold requests https://review.opendev.org/663762 | 19:15 |
*** fried_rolls is now known as efried | 19:15 | |
openstackgerrit | Merged zuul/zuul master: Break long repo names to make them fit https://review.opendev.org/663899 | 19:16 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add caching of autohold requests https://review.opendev.org/663412 | 19:18 |
mordred | corvus: back from what turned out to be an extra long sandwich - anything you want me to look at or poke at? | 19:19 |
*** e0ne has quit IRC | 19:24 | |
corvus | mordred: wow, you had one of these? https://www.subway.com/~/media/Base_English/Images/SubwayCatering/giant_sub_notext.jpg | 19:28 |
*** emine__ has joined #openstack-infra | 19:28 | |
mordred | corvus: I think mine was longer than that | 19:28 |
corvus | mordred: i think we're just waiting right now, with like 3 irons in the fire | 19:29 |
mordred | corvus: I think it would be nice if english differentiated between length and duration ... | 19:29 |
mordred | corvus: awesome | 19:29 |
corvus | fungi, clarkb, mordred: ah, since the gitea job succeeded on recheck, we should probably entertain the idea that there's a weird docker v6 thing happening with that job in limestone. | 19:30 |
corvus | i don't really want to open that can of worms today, but we probably should poke at that soon. | 19:31 |
corvus | i'll just go ahead and abandon my mariadb changes | 19:31 |
mordred | corvus: I can't even | 19:31 |
mordred | corvus: the whole docker + v6 story is just a gift that keeps on giving | 19:32 |
*** weifan has quit IRC | 19:32 | |
mordred | assuming that's actually the issue of course | 19:32 |
corvus | it seems very likely that *something* about docker is the cause | 19:32 |
*** bhavikdbavishi has joined #openstack-infra | 19:33 | |
*** smarcet has joined #openstack-infra | 19:33 | |
*** e0ne has joined #openstack-infra | 19:36 | |
fungi | one possibility is it's a race... mariadb logged reaching a running state at 17:57:37 but the 17:57:38 db ping is the point at which gitea-web seems to have given up trying to connect to it | 19:41 |
*** udesale has quit IRC | 19:41 | |
fungi | or did it maybe successfully connect at that point and not log anything afterward? | 19:41 |
*** bobh has joined #openstack-infra | 19:43 | |
corvus | fungi: well, that's 5m after it started; maybe that's our timeout | 19:44 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: WIP: Auto-delete expired autohold requests https://review.opendev.org/663762 | 19:46 |
fungi | ahh, so could be we just got unlucky with a slow db container setup/start and lost the race | 19:46 |
fungi | assuming a 5m timeout | 19:47 |
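If the failure really was a lost race against a fixed deadline, the shape of the problem is a readiness poll that gives up after a timeout. The sketch below illustrates that pattern only; it is not gitea's code, and `wait_until_ready`, its parameters, and the injected `clock`/`sleep` hooks are hypothetical names chosen so the logic can be exercised without real waiting.

```python
import time


def wait_until_ready(probe, timeout, interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call probe() until it returns True or timeout seconds have passed.

    Returns True on success and False if the deadline passes first. A service
    that becomes ready just after the final probe is reported as a failure,
    which is the kind of race speculated about in the log.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Trivial usage: a probe that is ready immediately returns without sleeping.
print(wait_until_ready(lambda: True, timeout=300))
```

With a 300 second timeout, as guessed above, a database that reports ready at roughly the same moment as the last probe can land on either side of the deadline, which would explain a pass on recheck.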
fungi | also, the openstack ml backlog surpassed an hour so i've removed another ~1600 [0-9]\+@qq.com messages from the inbound processing queue | 19:48 |
*** markvoelker has joined #openstack-infra | 19:52 | |
*** guimaluf has joined #openstack-infra | 19:52 | |
clarkb | now down to just rax leaked images. Note I'm not sure how to check for leakages in swift there | 19:54 |
clarkb | so that will have to be a different pass | 19:54 |
*** emine has joined #openstack-infra | 19:56 | |
clarkb | and actually rax will have to wait because there are images uploading and I don't want to delete one unexpectedly | 19:57 |
clarkb | Now to context switch to zuul restarts | 19:57 |
*** smarcet has left #openstack-infra | 19:58 | |
clarkb | the exim fix is almost done in the gate so won't restart until that is in | 19:58 |
*** emine__ has quit IRC | 19:59 | |
clarkb | My plan is to manually install zuul so that we can get the repl change in, then I'll use the playbook to restart the whole thing | 19:59 |
clarkb | corvus: are the python unittests expected to fail on the repl change? and if so is it safe to install it? | 20:01 |
clarkb | maybe it is the thread list checker in the unittests that is failing | 20:02 |
corvus | clarkb: that's what i'm thinking -- that it doesn't shut down cleanly. should be ok. | 20:02 |
clarkb | address already in use errors so must be related to binding the socket | 20:02 |
mordred | clarkb: for swift, once we're not uploading actively, we can delete any and all objects in the images container | 20:04 |
*** emine has quit IRC | 20:04 | |
mordred | once the images are imported the swift objects are no longer needed | 20:04 |
clarkb | mordred: oh right glance copies it internally | 20:04 |
mordred | yah- so the glance process you've been doing should just work on rax as well, but I agree, an additional check for leaked objects is a good idea - when it's not uploading :) | 20:04 |
*** emine has joined #openstack-infra | 20:05 | |
*** smarcet has joined #openstack-infra | 20:05 | |
clarkb | ya I'll hold off until it quiets down | 20:05 |
*** jamesmcarthur has joined #openstack-infra | 20:07 | |
*** mriedem has quit IRC | 20:08 | |
clarkb | zuul==3.8.1.dev154 # git sha ce30029 is now installed on zuul01. That is the wip repl change. The previous commit merged is e0c975a98086882127f2c1b2c30a28876d84ebb7 | 20:09 |
clarkb | and now I wait for exim fix | 20:09 |
*** emine__ has joined #openstack-infra | 20:11 | |
*** mriedem has joined #openstack-infra | 20:11 | |
fungi | #status log filed a removal request from the spamhaus pbl for the ip address of the new ask.openstack.org server | 20:11 |
*** weifan has joined #openstack-infra | 20:12 | |
openstackstatus | fungi: finished logging | 20:12 |
*** jamesmcarthur has quit IRC | 20:12 | |
*** raissa has joined #openstack-infra | 20:13 | |
*** emine has quit IRC | 20:13 | |
*** emine has joined #openstack-infra | 20:15 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Auto-delete expired autohold requests https://review.opendev.org/663762 | 20:16 |
*** emine__ has quit IRC | 20:16 | |
*** weifan has quit IRC | 20:16 | |
*** raissa has quit IRC | 20:18 | |
fungi | infra-root: a reminder, when we build/replace servers in rax which send e-mail to a variety of users we have to request exclusions from the spamhaus pbl for them | 20:18 |
*** raissa has joined #openstack-infra | 20:18 | |
*** bhavikdbavishi has quit IRC | 20:19 | |
clarkb | fungi: might be worth adding that to the meeting agenda as a reminder (I think people read the agenda and meeting notes even if they don't attend) | 20:19 |
*** markvoelker has quit IRC | 20:20 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Add autohold-info CLI command https://review.opendev.org/662487 | 20:22 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Record held node IDs with autohold request https://review.opendev.org/662498 | 20:22 |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: Auto-delete expired autohold requests https://review.opendev.org/663762 | 20:22 |
fungi | clarkb: god idea, thanks! | 20:22 |
fungi | good idea too | 20:23 |
*** raissa has quit IRC | 20:23 | |
fungi | added | 20:24 |
*** igordc has quit IRC | 20:28 | |
*** raissa has joined #openstack-infra | 20:32 | |
*** raissa has quit IRC | 20:34 | |
*** e0ne has quit IRC | 20:37 | |
*** diablo_rojo has joined #openstack-infra | 20:41 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul master: WIP: Mark nodes as USED when deleting autohold https://review.opendev.org/664060 | 20:41 |
clarkb | last job for the exim fix is finally running | 20:44 |
clarkb | zuul says 10 minutes to merging then we can restart zuul | 20:44 |
*** weifan has joined #openstack-infra | 20:45 | |
clarkb | how does `ansible-playbook -f 10 /opt/system-config/playbooks/zuul_restart.yaml` look? | 20:51 |
clarkb | I guess I should check the mergers and executors for their installed version | 20:52 |
clarkb | zm01 and ze01 are on 00d0abbc709148527a3b57cf8733541f4ba817d8 | 20:53 |
clarkb | which lgtm | 20:53 |
openstackgerrit | Merged opendev/system-config master: Enable SPF checking on lists https://review.opendev.org/663974 | 20:53 |
*** emine has quit IRC | 20:53 | |
clarkb | corvus: fungi mordred ^ ok that merged. Anything else I should do before zuul restart? | 20:54 |
clarkb | I'll save queues and run the playbook if not | 20:54 |
*** aakarsh has quit IRC | 20:55 | |
corvus | puppet was running when that merged, so we're expecting that to take effect at 21:24 i guess | 20:55 |
corvus | clarkb: i think we're gtg if you've done the manual install on the sched | 20:56 |
clarkb | pbr freeze still shows zuul==3.8.1.dev154 # git sha ce30029 on the scheduler | 20:56 |
clarkb | I'll save queues now | 20:56 |
clarkb | queues saved. Running playbook next | 20:56 |
corvus | clarkb: remember you will need to rm the web pid file manually | 20:57 |
clarkb | we had started to swap quite a bit so this was timely | 20:57 |
clarkb | corvus: yup | 20:57 |
clarkb | thats curious zuul scheduler is still running but playbook thinks it isn't | 20:58 |
fungi | clarkb: nothing i'm aware of which is urgent | 20:58 |
clarkb | I won't clear the web pid file until scheduler has fully stopped | 20:58 |
fungi | and i'll keep an eye on lists.o.o in the next half-hour to see if things break when the spf filter is applied | 20:58 |
clarkb | corvus: is it possible that is a rogue pair of processes? | 20:59 |
fungi | if this works and allows us to go back to accepting ${listname}-owner@ messages, i'll be ecstatic | 20:59 |
clarkb | oh they are gone now | 20:59 |
clarkb | proceeding with web pid removal | 20:59 |
corvus | (could be internal pid removal happened before kernel finished swapping in the procs) | 20:59 |
clarkb | scheduler and mergers have been restarted. Playbook is waiting on executors to stop so it can start them again | 21:00 |
*** bobh has quit IRC | 21:02 | |
clarkb | about half the executors have been restarted at this point | 21:03 |
clarkb | still waiting on configs to load before I can requeue | 21:03 |
clarkb | er I guess they've only stopped. It waits for the full set to stop before restarting the executors | 21:05 |
clarkb | scheduler is up. Loading queues | 21:06 |
*** mriedem has quit IRC | 21:06 | |
clarkb | waiting on ze10 and ze07 | 21:08 |
clarkb | playbook is done. Restart is complete other than reloading check queue | 21:10 |
clarkb | pabelanger: you should be able to try ansible 2.8 now | 21:10 |
corvus | repl looks good | 21:10 |
clarkb | memory usage and swap activity back to normal | 21:15 |
clarkb | logan-: thank you for pointing that out | 21:15 |
*** markvoelker has joined #openstack-infra | 21:16 | |
clarkb | check is loaded now | 21:18 |
clarkb | I think that concludes the restart | 21:18 |
*** smarcet has quit IRC | 21:19 | |
clarkb | #status log Performed a full zuul service restart. This reset memory usage (we were swapping), installed the debugging repl, and gives us access to ansible 2.8. Scheduler is running ce30029 on top of e0c975a and mergers + executors are running 00d0abb | 21:19 |
openstackstatus | clarkb: finished logging | 21:19 |
logan- | np clarkb | 21:20 |
*** markvoelker has quit IRC | 21:21 | |
corvus | 2019-06-07 21:29:17 H=(tjtianhe.com.cn) [175.174.81.197] F=<635538059@qq.com> rejected RCPT <OpenStack-operators@lists.openstack.org>: SPF check failed. | 21:29 |
corvus | that's promising | 21:29 |
fungi | yup, just saw it showed yo | 21:29 |
fungi | er, showed up | 21:30 |
corvus | fungi: you see we have a lot of stuff queued for you, yeah? | 21:30 |
fungi | we're at a ~40-minute backlog in the openstack ml queue again | 21:30 |
fungi | yep, shall i clean out the likely spam there one last time and then keep an eye on it? | 21:31 |
corvus | fungi: (it's not important; just logspam) | 21:31 |
*** e0ne has joined #openstack-infra | 21:31 | |
fungi | or i can let it go on its own... it's burning down the backlog fairly quickly now that nothing's being heaped on | 21:32 |
corvus | fungi: to be clear, i'm talking about exim deferred deliveries for fungi@yuggoth | 21:32 |
fungi | ahh, yeah i hadn't spotted those but will take a look now | 21:33 |
fungi | i thought i had whitelisted the listserv so it wouldn't greylist those | 21:33 |
clarkb | we should probably #status log the spf change so that if people have trouble with email they have that breadcrumb? | 21:33 |
fungi | clarkb: good call | 21:34 |
corvus | does anyone have a message they need to send to a list? | 21:34 |
corvus | it'd be nice to see a legit message go through :) | 21:34 |
clarkb | I don't but won't be offended if we throw a test email at -infra | 21:34 |
*** whoami-rajat has quit IRC | 21:34 | |
*** kjackal has quit IRC | 21:34 | |
fungi | anyway we seem to be blowing through the backlog at a rate of ~1 message every 2 seconds, so will be caught up in another 25 minutes if all goes well | 21:35 |
clarkb | how about #status log Exim on lists.openstack.org/lists.opendev.org/lists.starlingx.io/lists.airshipit.org is now enforcing spf failures (not soft failures). This means if you send email from a host that isn't allowed to by the spf record that email will be rejected. | 21:35 |
fungi | i can e-mail the infra ml to mention the spf change | 21:35 |
*** rfarr__ has quit IRC | 21:35 | |
fungi | kill two birds with one stone | 21:35 |
corvus | clarkb, fungi: ++ | 21:35 |
clarkb | #status log Exim on lists.openstack.org/lists.opendev.org/lists.starlingx.io/lists.airshipit.org is now enforcing spf failures (not soft failures). This means if you send email from a host that isn't allowed to by the spf record that email will be rejected. | 21:36 |
openstackstatus | clarkb: finished logging | 21:36 |
*** rlandy|ruck has quit IRC | 21:37 | |
corvus | fungi: it might be worth triggering the openstack release test to make sure those emails are ok | 21:38 |
fungi | agreed. we may need a different address for them | 21:38 |
corvus | we can also whitelist that address to bypass spf checking, if it's a problem | 21:38 |
fungi | though openstack.org just publishes a ?all policy so hopefully not? | 21:38 |
corvus | just, you know, don't tell anyone :) | 21:38 |
*** smarcet has joined #openstack-infra | 21:39 | |
fungi | to confirm, we're blocking failures for -all policies but not ~all or ?all right? | 21:39 |
clarkb | fungi: correct softfails won't block and the others are soft right? | 21:39 |
fungi | yes, should be | 21:40 |
fungi | https://tools.ietf.org/html/rfc7208 | 21:40 |
*** e0ne has quit IRC | 21:40 | |
clarkb | my spf record is ?all if you want me to send email and test | 21:40 |
fungi | the domain i'm sending from has no spf record at all | 21:41 |
corvus | we should really make that test list :) | 21:43 |
corvus | cause maybe emails from both of you would be a good idea | 21:43 |
clarkb | I can respond to fungi's email | 21:44 |
fungi | okay, bombs away | 21:44 |
corvus | fungi's mail was delivered to mm | 21:45 |
fungi | yep, /srv/mailman/openstack/qfiles/in/1559943888.530389+ca0ad2bcfe55cb0d3bd4269962b3ff33ac3d927d.pck | 21:46 |
fungi | ~610 messages ahead of it in the backlog | 21:46 |
clarkb | I guess I can't reply to it until mm processes it. Should I wait or just send an email? | 21:46 |
fungi | if we want i can clean up the input queue (hopefully one last time) | 21:47 |
fungi | and then it should go straight out | 21:47 |
fungi | or we can wait another ~20 minutes | 21:47 |
fungi | based on the current queue size and processing rate | 21:47 |
clarkb | fungi: I'd be ok with that | 21:50 |
*** pcaruana has quit IRC | 21:51 | |
fungi | cleaned up ~570 suspected spams from [0-9]\+@qq.com in the input queue | 21:51 |
fungi | i hope that's the last of them | 21:51 |
fungi | good news is the queue size is no longer growing | 21:51 |
clarkb | response sent | 21:52 |
fungi | i received both | 21:57 |
corvus | \o/ | 21:57 |
fungi | i've intentionally kept myself as the openstack-discuss owner since the migration so that i could filter spam for the -owner address locally. going to see if that dries up now, and if so we can talk about dropping the various blackhole aliases for other lists | 21:58 |
clarkb | I think image uploads to rax may be failing. Possibly as a result of sdk updates and my restart of the builders? | 21:59 |
clarkb | I don't have time to dig into that today, but can start looking monday if nothing uploads between now and then | 21:59 |
fungi | in other news, i've been checking afs02.dfw and we're down to only 3 remaining stale volume replicas | 22:00 |
corvus | fungi: which? | 22:01 |
*** slaweq has quit IRC | 22:01 | |
fungi | mirror.fedora, mirror.ubuntu and mirror.ubuntu-ports all of which i suspect are substantial in size | 22:01 |
fungi | so not entirely surprising | 22:02 |
fungi | the count has been steadily falling all day | 22:02 |
fungi | hopefully those will finish up at least by the end of the weekend | 22:02 |
corvus | i doubt they will | 22:02 |
corvus | Could not lock the VLDB entry for the volume 536871006. | 22:02 |
corvus | they probably exceeded the timeout on mirror-update | 22:03 |
fungi | oh :/ | 22:03 |
fungi | do we need to manually acquire the lock and then start them again in a screen session? | 22:03 |
fungi | (after deleting the old lock?) | 22:03 |
corvus | yes, but that's an afs volume lock, so we have to verify that it's really okay to unlock and then override it | 22:04 |
corvus | yep -- the afs creds timed out during the release: rxk: authentication expired | 22:06 |
corvus | that happened sometime between 2019-06-06T22:45:49,877443816+00:00 and 2019-06-07T18:44:01,757147237+00:00 | 22:06 |
corvus | fungi: i think we should grab the lock in a screen session to prevent further releases from mirror-update01, remove the afs volume lock, then perform a vos release in screet from afsdb01 so it runs with localauth | 22:07 |
*** slaweq has joined #openstack-infra | 22:08 | |
corvus | we should make sure there are no currently running transactions before removing the volume lock | 22:08 |
fungi | took me a moment to figure out why we should vos release in secret ;) | 22:08 |
fungi | and yeah, i'll take a look | 22:08 |
corvus | i think that's "vos status" and it doesn't show anything | 22:10 |
fungi | oh, cool, i was checking the process list on mirror-update | 22:10 |
corvus | even if the 'vos release' command terminates, the underlying transaction (which is from the fileserver on afs01.dfw to afs02.dfw) could still be going | 22:11 |
fungi | ahh, okay | 22:11 |
corvus | basically, 'vos release' locks the vldb, tells 01 to replicate to 02, waits for it to finish, then unlocks the vldb | 22:11 |
fungi | judging from the crontab the script locks for these three are /var/run/fedora-mirror.lock /var/run/reprepro/ubuntu.lock /var/run/reprepro/ubuntu-ports.lock | 22:11 |
corvus | so vos release was effectively killed (due to cred timeout) sometime during that process, but it could have been anytime after the lock and before the unlock. | 22:12 |
corvus | fungi: yeah, i think we're ready to grab those files. want to do it? | 22:12 |
fungi | doing it now | 22:12 |
fungi | okay, i have three screen windows in the root session each with a bash flock'd on one of those lockfiles | 22:13 |
corvus | i've verified there are still no transactions running | 22:13 |
corvus | i'll unlock the vldb now | 22:13 |
fungi | ahh, thanks | 22:14 |
fungi | i've done it before, but would have to look up the commands i used | 22:14 |
corvus | i'm running "vos unlock mirror.fedora" | 22:14 |
corvus | and similar | 22:14 |
fungi | got it | 22:14 |
*** slaweq has quit IRC | 22:15 | |
corvus | okay, we should be able to start releases on afsdb01 now | 22:15 |
corvus | do we want to do them sequentially or in parallel? | 22:15 |
fungi | i guess they'll fight for bandwidth, right? | 22:16 |
fungi | will they be more efficient sequenced? or just reduce the chances of all of them running afoul of a random network problem | 22:16 |
corvus | i... think so? i've forgotten all the kernel udp trivia i need to know to answer that for certain | 22:16 |
fungi | yeah, i can do them one by one. i'll queue them all up in one command | 22:16 |
fungi | er, one command line | 22:17 |
corvus | fungi: i think "vos release mirror.fedora -localauth" is what wants to run on afsdb01.openstack.org | 22:17 |
corvus | fungi: you want to take care of that too? | 22:18 |
fungi | ahh, yeah, i suppose i don't need to rerun the mirror scripts themselves | 22:18 |
corvus | nope, the rw volume is in good shape; rsyncs finished and all. | 22:19 |
fungi | root screen session started on afsdb01 | 22:19 |
corvus | fungi: maybe && instead of ; ? | 22:19 |
corvus | (just in case something goes wrong but takes 12 hours to do it) | 22:19 |
fungi | good idea | 22:20 |
corvus | fungi: lgtm | 22:20 |
fungi | and there it goes | 22:20 |
corvus | "vos examine mirror.fedora" shows it locked for release | 22:20 |
fungi | i'll check in on it off and on while i'm awake at least | 22:21 |
fungi | once these complete successfully i'll exit the bash processes holding the mirror update locks for these | 22:21 |
corvus | infra-root: ^ long running vos release processes in screen on afsdb01 if you want to check on progress over the weekend | 22:21 |
clarkb | rgr | 22:22 |
corvus | #status log fedora, ubuntu, ubuntu-ports mirrors are currently resyncing to afs02.dfw and won't update again until that is finished | 22:23 |
openstackstatus | corvus: finished logging | 22:23 |
openstackgerrit | Merged opendev/storyboard-webclient master: Add a subcontroller for Team projects https://review.opendev.org/641963 | 22:25 |
openstackgerrit | Merged opendev/storyboard-webclient master: Add UI for making security teams related to projects https://review.opendev.org/641964 | 22:25 |
*** weifan has quit IRC | 22:28 | |
*** weifan has joined #openstack-infra | 22:29 | |
*** hamzy_ has joined #openstack-infra | 22:30 | |
*** hamzy has quit IRC | 22:32 | |
*** weifan has quit IRC | 22:33 | |
*** weifan has joined #openstack-infra | 22:36 | |
*** EvilienM is now known as EmilienM | 22:42 | |
*** weifan has quit IRC | 22:45 | |
*** weifan has joined #openstack-infra | 22:46 | |
*** weifan has quit IRC | 22:50 | |
*** weifan has joined #openstack-infra | 22:52 | |
openstackgerrit | Merged openstack/pbr master: Make WSGI tests listen on localhost https://review.opendev.org/663758 | 22:53 |
*** smarcet has quit IRC | 23:01 | |
*** diablo_rojo has quit IRC | 23:01 | |
*** hwoarang has quit IRC | 23:03 | |
*** hwoarang has joined #openstack-infra | 23:06 | |
openstackgerrit | Merged openstack/pbr master: Switch to release.o.o for constraints https://review.opendev.org/663988 | 23:06 |
openstackgerrit | Merged opendev/storyboard-webclient master: Add support for Story permission endpoints https://review.opendev.org/642070 | 23:10 |
openstackgerrit | Merged opendev/storyboard-webclient master: Allow marking stories as security-related https://review.opendev.org/642071 | 23:10 |
*** aaronsheffield has quit IRC | 23:10 | |
*** slaweq has joined #openstack-infra | 23:11 | |
*** gyee has quit IRC | 23:12 | |
*** slaweq has quit IRC | 23:15 | |
*** markvoelker has joined #openstack-infra | 23:17 | |
*** michael-beaver has quit IRC | 23:24 | |
*** tjgresha has joined #openstack-infra | 23:29 | |
*** diablo_rojo has joined #openstack-infra | 23:37 | |
*** markvoelker has quit IRC | 23:38 | |
pabelanger | clarkb: ack, thanks | 23:53 |
*** weifan has quit IRC | 23:57 |
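
The SPF discussion above (reject on -all, but let ~all and ?all through) can be illustrated with a small sketch. This is not the Exim configuration that was deployed; it is a hypothetical Python helper that only looks at the qualifier on the `all` mechanism and assumes no earlier mechanism in the record matched the sending host. The function names `default_result` and `rejected` are made up for illustration, and `redirect=` modifiers are ignored.

```python
# Qualifier-to-result mapping from RFC 7208.
QUALIFIER_RESULT = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def default_result(spf_record):
    """Return the SPF result for a sender that matches nothing before `all`.

    Assumption: every mechanism ahead of `all` failed to match, so only the
    qualifier on `all` decides the outcome.
    """
    for term in spf_record.split()[1:]:  # skip the leading "v=spf1"
        qualifier, mechanism = "+", term  # a missing qualifier means "+"
        if term[0] in QUALIFIER_RESULT:
            qualifier, mechanism = term[0], term[1:]
        if mechanism == "all":
            return QUALIFIER_RESULT[qualifier]
    # RFC 7208: no mechanism matched and there is no `all` term.
    return "neutral"


def rejected(spf_record):
    """Mirror the policy described in the log: only a hard fail is rejected."""
    if spf_record is None:  # domain publishes no SPF record at all
        return False
    return default_result(spf_record) == "fail"


if __name__ == "__main__":
    for record in ("v=spf1 mx -all", "v=spf1 mx ~all", "v=spf1 ?all", None):
        print(record, "->", "reject" if rejected(record) else "accept")
```

Under these assumptions a domain with no record, or one ending in ~all or ?all, is still accepted, which matches the two test senders mentioned in the log.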
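
For the sequential releases queued on afsdb01, the log notes that joining the commands with && rather than ; stops the chain as soon as one release fails. A minimal sketch of building that command line follows. It only assembles a string and runs nothing; the helper name `build_release_chain` is hypothetical, while the volume names and the -localauth flag are taken from the conversation.

```python
def build_release_chain(volumes):
    """Join one `vos release` per volume with && so a failure halts the rest."""
    return " && ".join(f"vos release {volume} -localauth" for volume in volumes)


if __name__ == "__main__":
    # The three stale replicas named in the log.
    print(build_release_chain(["mirror.fedora", "mirror.ubuntu", "mirror.ubuntu-ports"]))
```

With ; in place of &&, the second and third releases would start even if the first one failed hours in, which is the case corvus was guarding against.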