*** armax has quit IRC | 00:08 | |
*** lathiat has quit IRC | 00:09 | |
*** lathiat has joined #openstack-infra | 00:15 | |
*** ricolin has joined #openstack-infra | 00:59 | |
*** Haunted330 has joined #openstack-infra | 01:04 | |
*** wolverineav has quit IRC | 01:11 | |
*** Haunted330 has quit IRC | 01:14 | |
*** Haunted330 has joined #openstack-infra | 01:14 | |
*** Haunted330 has quit IRC | 01:30 | |
*** wolverineav has joined #openstack-infra | 01:35 | |
*** jamesmcarthur has joined #openstack-infra | 01:41 | |
openstackgerrit | Ghanshyam Mann proposed openstack-infra/irc-meetings master: Modify the QA office hour time https://review.openstack.org/642308 | 01:42 |
---|---|---|
*** jamesmcarthur has quit IRC | 01:48 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift Pod provider https://review.openstack.org/590335 | 01:52 |
*** hongbin has joined #openstack-infra | 02:02 | |
*** wolverineav has quit IRC | 02:27 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul master: web: add tenant and project scoped, JWT-protected actions https://review.openstack.org/576907 | 02:32 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: [dmn] https://review.openstack.org/642314 | 02:54 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: [dmn] stashing some scripts to make git:// -> https:// changes https://review.openstack.org/642314 | 02:54 |
*** whoami-rajat has joined #openstack-infra | 03:07 | |
*** ykarel has joined #openstack-infra | 03:22 | |
*** wolverineav has joined #openstack-infra | 03:36 | |
*** jamesmcarthur has joined #openstack-infra | 03:39 | |
*** wolverineav has quit IRC | 03:40 | |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: [dmn] stashing some scripts to make git:// -> https:// changes https://review.openstack.org/642314 | 03:43 |
*** wolverineav has joined #openstack-infra | 03:51 | |
*** hongbin has quit IRC | 03:58 | |
*** udesale has joined #openstack-infra | 04:03 | |
*** wolverineav has quit IRC | 04:07 | |
openstackgerrit | Riju Khatri proposed openstack-infra/storyboard-webclient master: Add nofollow attribute to hyperlinks https://review.openstack.org/642327 | 04:19 |
*** wolverineav has joined #openstack-infra | 04:24 | |
*** jamesmcarthur has quit IRC | 04:32 | |
*** yamamoto has joined #openstack-infra | 04:32 | |
*** stakeda has joined #openstack-infra | 04:37 | |
*** ramishra has joined #openstack-infra | 04:45 | |
*** janki has joined #openstack-infra | 04:46 | |
openstackgerrit | Trinh Nguyen proposed openstack-infra/system-config master: Add meetbot to openstack-fenix channel https://review.openstack.org/642340 | 05:18 |
*** jaosorior has joined #openstack-infra | 05:52 | |
openstackgerrit | Ian Wienand proposed openstack-infra/zuul-sphinx master: Add type to role variables https://review.openstack.org/641168 | 05:53 |
*** apetrich has joined #openstack-infra | 06:11 | |
*** ianychoi has quit IRC | 06:32 | |
*** ianychoi has joined #openstack-infra | 06:32 | |
*** ianychoi has quit IRC | 06:35 | |
*** ianychoi has joined #openstack-infra | 06:36 | |
*** jtomasek has joined #openstack-infra | 06:50 | |
*** jbadiapa has joined #openstack-infra | 06:51 | |
*** e0ne has joined #openstack-infra | 06:51 | |
*** pcaruana has joined #openstack-infra | 07:00 | |
*** e0ne has quit IRC | 07:02 | |
*** rcernin has quit IRC | 07:03 | |
openstackgerrit | Riju Khatri proposed openstack-infra/storyboard-webclient master: Show all stories created https://review.openstack.org/642370 | 07:04 |
*** kopecmartin has joined #openstack-infra | 07:06 | |
*** slaweq has joined #openstack-infra | 07:34 | |
openstackgerrit | Ankita Bansal proposed openstack-infra/storyboard-webclient master: allow subscriptions to projects when items in project groups list are expanded Story: 2000545 Task: 2911 https://review.openstack.org/642371 | 07:35 |
*** mpjetta has quit IRC | 07:36 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Add python-path option to node https://review.openstack.org/637338 | 07:37 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an OpenShift Pod provider https://review.openstack.org/590335 | 07:37 |
*** mpjetta has joined #openstack-infra | 07:40 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement a Runc driver https://review.openstack.org/535556 | 07:41 |
*** rascasoft has joined #openstack-infra | 07:44 | |
*** pgaxatte has joined #openstack-infra | 08:01 | |
*** ginopc has joined #openstack-infra | 08:02 | |
*** rpittau|afk is now known as rpittau | 08:09 | |
*** xek has joined #openstack-infra | 08:14 | |
*** tkajinam has quit IRC | 08:17 | |
*** gouthamr has quit IRC | 08:18 | |
*** stevebaker has quit IRC | 08:19 | |
*** dmellado has quit IRC | 08:20 | |
*** helenafm has joined #openstack-infra | 08:22 | |
*** e0ne has joined #openstack-infra | 08:27 | |
*** hwoarang has quit IRC | 08:34 | |
*** hwoarang has joined #openstack-infra | 08:36 | |
*** dtantsur|afk is now known as dtantsur | 08:38 | |
*** tosky has joined #openstack-infra | 08:41 | |
*** roman_g has joined #openstack-infra | 08:47 | |
*** iurygregory has joined #openstack-infra | 08:50 | |
*** zbr has joined #openstack-infra | 08:52 | |
openstackgerrit | Merged openstack-infra/irc-meetings master: Modify the QA office hour time https://review.openstack.org/642308 | 08:55 |
*** jpena|off is now known as jpena | 08:56 | |
*** wolverineav has quit IRC | 08:56 | |
*** jpich has joined #openstack-infra | 08:57 | |
*** jpich has quit IRC | 08:57 | |
*** adrianreza has quit IRC | 09:00 | |
*** adrianreza has joined #openstack-infra | 09:00 | |
*** needssleep has quit IRC | 09:01 | |
*** jpich has joined #openstack-infra | 09:02 | |
*** ykarel has quit IRC | 09:10 | |
*** hwoarang has quit IRC | 09:11 | |
*** ykarel has joined #openstack-infra | 09:11 | |
*** hwoarang has joined #openstack-infra | 09:12 | |
openstackgerrit | Merged openstack/diskimage-builder master: [lvm] Add Ubuntu bionic as supported distro https://review.openstack.org/640850 | 09:19 |
*** e0ne has quit IRC | 09:22 | |
*** owalsh_ is now known as owalsh | 09:22 | |
*** noama has joined #openstack-infra | 09:28 | |
*** janki has quit IRC | 09:30 | |
*** jchhatbar has joined #openstack-infra | 09:30 | |
*** ykarel is now known as ykarel|lunch | 09:37 | |
*** e0ne has joined #openstack-infra | 09:42 | |
*** derekh has joined #openstack-infra | 09:43 | |
*** panda is now known as panda|rover | 09:50 | |
*** wolverineav has joined #openstack-infra | 09:57 | |
*** wolverineav has quit IRC | 10:01 | |
*** priteau has joined #openstack-infra | 10:03 | |
*** ykarel|lunch is now known as ykarel | 10:07 | |
dulek | I'm trying to setup CoreDNS for K8s clusters deployed by DevStack in kuryr-kubernetes gates. | 10:13 |
dulek | Is there any chance to learn upstream DNS server address in DevStack? | 10:13 |
openstackgerrit | Adam Coldrick proposed openstack-infra/storyboard-webclient master: Show tags with stories in project view. https://review.openstack.org/642230 | 10:13 |
dulek | Obviously 127.0.0.1 from /etc/resolv.conf will not work for me. :D | 10:13 |
*** yamamoto has quit IRC | 10:17 | |
*** stakeda has quit IRC | 10:26 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: web: add tenant and project scoped, JWT-protected actions https://review.openstack.org/576907 | 10:26 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Allow operator to generate auth tokens through the CLI https://review.openstack.org/636197 | 10:26 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Zuul CLI: allow access via REST https://review.openstack.org/636315 | 10:26 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration https://review.openstack.org/639855 | 10:26 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Web: plug the authorization engine https://review.openstack.org/640884 | 10:27 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: Zuul Web: add /api/user/actions endpoint https://review.openstack.org/641099 | 10:27 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: authentication config: add optional token_expiry https://review.openstack.org/642408 | 10:27 |
*** luizbag has joined #openstack-infra | 10:32 | |
*** udesale has quit IRC | 10:34 | |
*** gfidente has joined #openstack-infra | 10:37 | |
*** electrofelix has joined #openstack-infra | 10:40 | |
*** e0ne has quit IRC | 10:43 | |
*** yamamoto has joined #openstack-infra | 10:46 | |
*** e0ne has joined #openstack-infra | 10:47 | |
*** jbadiapa has quit IRC | 10:52 | |
*** yamamoto has quit IRC | 10:52 | |
*** jbadiapa has joined #openstack-infra | 10:52 | |
*** jbadiapa has quit IRC | 10:54 | |
*** jbadiapa has joined #openstack-infra | 10:54 | |
*** yamamoto has joined #openstack-infra | 10:56 | |
*** ricolin has quit IRC | 10:59 | |
*** gouthamr has joined #openstack-infra | 10:59 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: web: add tenant and project scoped, JWT-protected actions https://review.openstack.org/576907 | 11:03 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Allow operator to generate auth tokens through the CLI https://review.openstack.org/636197 | 11:03 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Zuul CLI: allow access via REST https://review.openstack.org/636315 | 11:04 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration https://review.openstack.org/639855 | 11:04 |
*** dmellado_ has joined #openstack-infra | 11:04 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: [WIP] Web: plug the authorization engine https://review.openstack.org/640884 | 11:04 |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: Zuul Web: add /api/user/actions endpoint https://review.openstack.org/641099 | 11:04 |
*** dmellado_ is now known as dmellado | 11:05 | |
*** dave-mccowan has joined #openstack-infra | 11:19 | |
*** ykarel_ has joined #openstack-infra | 11:22 | |
*** stevebaker has joined #openstack-infra | 11:22 | |
*** ykarel has quit IRC | 11:24 | |
*** hwoarang has quit IRC | 11:27 | |
*** hwoarang has joined #openstack-infra | 11:35 | |
*** e0ne has quit IRC | 11:37 | |
openstackgerrit | Jakub Bielecki proposed openstack-infra/zuul-preview master: add basic description into README.rst https://review.openstack.org/642428 | 11:41 |
*** dave-mccowan has quit IRC | 11:42 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: authentication config: add optional token_expiry https://review.openstack.org/642408 | 11:43 |
*** edmondsw has joined #openstack-infra | 11:47 | |
*** ykarel_ is now known as ykarel | 11:49 | |
*** rlandy has joined #openstack-infra | 11:57 | |
*** jchhatbar has quit IRC | 11:58 | |
*** rh-jelabarre has joined #openstack-infra | 12:02 | |
*** aojea has joined #openstack-infra | 12:08 | |
*** markvoelker has quit IRC | 12:14 | |
*** panda|rover is now known as panda|rover|lunc | 12:21 | |
*** e0ne has joined #openstack-infra | 12:30 | |
*** ginopc has quit IRC | 12:35 | |
*** janki has joined #openstack-infra | 12:35 | |
*** trown|back11mar is now known as trown | 12:46 | |
*** jamesmcarthur has joined #openstack-infra | 12:48 | |
*** jpena is now known as jpena|lunch | 12:57 | |
*** hwoarang has quit IRC | 13:00 | |
*** kgiusti has joined #openstack-infra | 13:01 | |
*** pgaxatte has quit IRC | 13:01 | |
*** pgaxatte has joined #openstack-infra | 13:03 | |
*** hwoarang has joined #openstack-infra | 13:04 | |
dmsimard | Anyone know if spam on freenode is still fairly problematic? The nickserv auth is still a problem for many IRC clients that attempt to join channels before authenticating. | 13:08 |
smcginnis | dmsimard: I just saw something recently (beginning of last week?) that there was yet another wave of spam bots hitting freenode server. | 13:11 |
*** jcoufal has joined #openstack-infra | 13:12 | |
smcginnis | I think the auth requirement is here to stay, unfortunately. Just too easy to get hit if it's left open. | 13:12 |
dmsimard | smcginnis: yeah, I was suspecting as much but I was holding onto the hope that it wasn't so :( | 13:14 |
tosky | smcginnis: just a note with my KDE hat on - we reopened various channels because we are promoting the usage of the matrix bridge | 13:14 |
tosky | so far nothing happened (after a week) | 13:14 |
tosky | I've seen spam on few OFTC channels, though | 13:14 |
smcginnis | tosky: Oh? How's the matrix bridge working out? | 13:16 |
tosky | smcginnis: there are some delays from time to time, but it's improving; they gave us a server: https://dot.kde.org/2019/02/20/kde-adding-matrix-its-im-framework | 13:17 |
smcginnis | I did use the matrix mobile client for a while to try it out and for the most part I liked it. I did notice some delays (this was at least 6 months ago), but it worked. | 13:18 |
smcginnis | Cool to see a larger effort to use that. | 13:18 |
tosky | I think someone here was looking into matrix - it would be interesting to have an opendev matrix server federated with the rest of the system | 13:18 |
tosky | they are also rewriting riot (the client) and everyone seems very excited about this, but I didn't try the snapshots yet | 13:18 |
*** panda|rover|lunc is now known as panda|rover | 13:23 | |
*** yamamoto has quit IRC | 13:24 | |
*** ianychoi has quit IRC | 13:28 | |
*** ianychoi has joined #openstack-infra | 13:29 | |
*** jamesmcarthur has quit IRC | 13:30 | |
*** jamesmcarthur has joined #openstack-infra | 13:31 | |
*** wolverineav has joined #openstack-infra | 13:33 | |
*** jamesmcarthur has quit IRC | 13:36 | |
*** wolverineav has quit IRC | 13:37 | |
*** mriedem has joined #openstack-infra | 13:39 | |
*** yamamoto has joined #openstack-infra | 13:43 | |
*** beekneemech is now known as bnemec | 13:43 | |
*** jamesmcarthur has joined #openstack-infra | 13:45 | |
*** jamesmcarthur_ has joined #openstack-infra | 13:49 | |
*** jrist has quit IRC | 13:49 | |
pabelanger | dmsimard: yah, proxy also had issue over the weekend | 13:50 |
*** otherwiseguy has quit IRC | 13:51 | |
*** sthussey has joined #openstack-infra | 13:52 | |
*** jrist has joined #openstack-infra | 13:53 | |
*** jamesmcarthur has quit IRC | 13:53 | |
*** otherwiseguy has joined #openstack-infra | 13:53 | |
smcginnis | review.o.o down? | 13:55 |
rpittau | it seems so | 13:55 |
rpittau | :/ | 13:55 |
smcginnis | But it's not even a deadline week. :) | 13:56 |
openstackgerrit | Merged openstack-infra/elastic-recheck master: Add query for nova functional test bug 1819374 https://review.openstack.org/642292 | 13:56 |
openstack | bug 1819374 in OpenStack Compute (nova) "test_interface_detach_with_port_with_bandwidth_request intermittently fails" [Medium,In progress] https://launchpad.net/bugs/1819374 - Assigned to Balazs Gibizer (balazs-gibizer) | 13:56 |
slaweq | working again :) | 13:56 |
slaweq | at least for me | 13:56 |
smcginnis | Oh, yep! | 13:56 |
*** efried1 has joined #openstack-infra | 13:56 | |
*** agopi has joined #openstack-infra | 13:56 | |
*** efried has quit IRC | 13:57 | |
*** efried1 is now known as efried | 13:57 | |
*** jpena|lunch is now known as jpena | 13:59 | |
mordred | smcginnis, tosky: I was looking at matrix a bit and was pretty pleased with it, and also with the IRC bridge. I think it's a worthwhile thing to talk about again in denver | 14:03 |
smcginnis | mordred: I think performance/lag was the big issue brought up in the past, but hosting our own matrix servers was an option to maybe address that. | 14:04 |
smcginnis | I think it would definitely be worth talking about more in Denver. | 14:04 |
mordred | yeah. there also seems to be a wechat bridge, although I haven't yet investigated how good it is | 14:05 |
fungi | dmsimard: we did have one wander into this channel over the weekend even with the nick registration requirement in place | 14:06 |
dmsimard | mordred: fwiw I wrote a rudimentary bridge for ara that (probably) supports a bunch of protocols | 14:07 |
dmsimard | two errbot instances, one on protocol A, another on protocol B -- and they relay messages to each other | 14:07 |
dmsimard | so any backend supported by errbot is supported, in theory | 14:07 |
mordred | nod. | 14:07 |
fungi | tosky: the last time we tried removing registration as a requirement from our channels (a few months ago) we made it a week or two before the spammers found us and started ramping back up across various channels again, so finally had to put it back in place | 14:07 |
dmsimard | fungi: yeah I think I saw that | 14:08 |
*** ginopc has joined #openstack-infra | 14:11 | |
*** e0ne has quit IRC | 14:14 | |
*** dave-mccowan has joined #openstack-infra | 14:19 | |
*** armax has joined #openstack-infra | 14:21 | |
*** e0ne has joined #openstack-infra | 14:23 | |
dmellado | dmsimard: o/ | 14:30 |
dmellado | who's spamming you? xD | 14:30 |
dmsimard | :( | 14:32 |
dmellado | tosky: matrix bridge? | 14:33 |
*** e0ne has quit IRC | 14:36 | |
*** e0ne has joined #openstack-infra | 14:38 | |
*** priteau has quit IRC | 14:44 | |
openstackgerrit | Ankita Bansal proposed openstack-infra/storyboard-webclient master: project_group view: add number of active stories beside repo list https://review.openstack.org/642211 | 14:46 |
*** FlorianFa has joined #openstack-infra | 14:50 | |
*** FlorianFa has quit IRC | 14:52 | |
*** FlorianFa has joined #openstack-infra | 14:52 | |
*** udesale has joined #openstack-infra | 14:56 | |
*** priteau has joined #openstack-infra | 14:59 | |
*** dpawlik has quit IRC | 15:02 | |
*** hwoarang has quit IRC | 15:05 | |
*** hwoarang has joined #openstack-infra | 15:07 | |
*** ginux has joined #openstack-infra | 15:15 | |
*** ginopc has quit IRC | 15:15 | |
*** ginux is now known as ginopc | 15:15 | |
*** yamamoto has quit IRC | 15:21 | |
iurygregory | Morning everyone, quick question does anyone have an idea why zuul is saying that the job is not defined in https://review.openstack.org/#/c/642474 ironicclient-functional is defined in https://review.openstack.org/#/c/642474/4/zuul.d/ironicclient-jobs.yaml | 15:26 |
*** roman_g has quit IRC | 15:26 | |
*** ykarel is now known as ykarel|afk | 15:28 | |
*** yamamoto has joined #openstack-infra | 15:29 | |
AJaeger | iurygregory: commented - the error message might be wrong, best doublecheck that everything is valid yaml | 15:29 |
iurygregory | AJaeger, do you know any tool that i could use to check? =) | 15:30 |
iurygregory | oh ty for the comment =) | 15:30 |
AJaeger | not sure whether that is really the problem - worth a try... | 15:32 |
*** janki has quit IRC | 15:33 | |
openstackgerrit | Riju Khatri proposed openstack-infra/storyboard-webclient master: Show all stories created https://review.openstack.org/642370 | 15:36 |
*** jamesmcarthur_ has quit IRC | 15:38 | |
*** jamesmcarthur has joined #openstack-infra | 15:38 | |
*** agopi is now known as agopi|brb | 15:39 | |
*** yamamoto has quit IRC | 15:44 | |
*** yamamoto has joined #openstack-infra | 15:45 | |
openstackgerrit | Helena proposed openstack-infra/project-config master: Adding zuul jobs for rsd-virt-for-nova repo https://review.openstack.org/642500 | 15:45 |
mordred | AJaeger: feel like a +3 on https://review.openstack.org/#/c/632532/ ? | 15:49 |
*** weshay68802228 is now known as weshay | 15:53 | |
*** jamesmcarthur has quit IRC | 15:57 | |
*** jamesmcarthur has joined #openstack-infra | 15:57 | |
*** yamamoto has quit IRC | 15:59 | |
*** jamesmcarthur has quit IRC | 16:00 | |
AJaeger | mordred: that's system-config and I don't have the power to +3... | 16:00 |
*** yamamoto has joined #openstack-infra | 16:00 | |
mordred | AJaeger: oh - so it is! | 16:00 |
mordred | AJaeger: I forget sometimes that you are not infinitely powerful | 16:01 |
*** jamesmcarthur has joined #openstack-infra | 16:01 | |
*** ramishra has quit IRC | 16:02 | |
dulek | Hi! | 16:02 |
dulek | I'm trying to setup CoreDNS for K8s clusters deployed by DevStack in kuryr-kubernetes gates. | 16:02 |
dulek | Is there any chance to learn upstream DNS server address in DevStack? | 16:02 |
dulek | At the moment I'm looking at https://git.openstack.org/cgit/openstack-infra/system-config/tree/playbooks/group_vars/dns.yaml, but should I use dns_master or one of dns_notify? | 16:03 |
fungi | dulek: i saw your question earlier in scrollback, probably better to start with the problem than the solution you've jumped to | 16:03 |
dulek | fungi: Sure thing! I'm setting up coredns pod to serve as DNS for pods. | 16:03 |
tosky | dmellado: https://matrix.org/blog/2015/06/22/the-matrix-org-irc-bridge-now-bridges-all-of-freenode/ | 16:03 |
dulek | fungi: Thing is, coredns needs to forward "outside-of-the-cluster" DNS queries to a DNS server that can resolve them. | 16:04 |
dulek | fungi: Thing is - in gate's DevStack VM I only have 127.0.0.1 put into /etc/resolv.conf. | 16:04 |
dulek | fungi: So that won't work from the coredns pod. | 16:05 |
AJaeger | mordred: ;) | 16:05 |
dulek | fungi: So my two ideas at the moment are either to figure out the "real" DNS server and set coredns to forward there. | 16:05 |
dulek | fungi: Or run coredns pod on hostNetworking, bind it to 127.0.0.<whatever> and forward to 127.0.0.1. | 16:06 |
fungi | dulek: challenges there are likely going to be related to the variety of network topologies we have in different service providers. in some cases we need to prefer resolving via ipv6 because all ipv4 egress is through a single nat which can get easily overwhelmed with too many simultaneous requests, while in other providers we have no global ipv6 routes at all and must use ipv4 for dns resolution | 16:06 |
dulek | The latter is not really appealing because that coredns is supposed to sit behind K8s Service and running it with hostNetworking would make stuff harder. | 16:07 |
*** ykarel_ has joined #openstack-infra | 16:07 | |
clarkb | dulek: fungi the unbound config should have already sorted out the ipv4 vs ipv6 for you | 16:07 |
clarkb | it's a bit implementation specific but reading that config file might be simplest | 16:07 |
mordred | yeah. if you read the unbound config you could get the info you need | 16:07 |
dulek | fungi: Is that why that unbound instance is deployed on gate VMs? | 16:08 |
fungi | right, my point was it's likely going to be better if coredns can forward its queries to the local unbound service | 16:08 |
dulek | fungi: That would definitely be best. | 16:08 |
clarkb | fungi: yup I agree, unfortunately lxc is the only container runtime that seems to also agree | 16:08 |
fungi | dulek: yes, that and to reduce the volume and latency for repeated queries | 16:08 |
clarkb | lxc runs a dnsmasq with an interface in both network namespaces to bridge between containers and host resolver | 16:09 |
clarkb | docker punts to google if you have localhost set as resolver | 16:09 |
mordred | cause that's a good default behavior | 16:09 |
fungi | i don't suppose docker has improved that at all | 16:09 |
*** ykarel|afk has quit IRC | 16:09 | |
clarkb | I think k8s expects you to configure dns within k8s (aka read the unbound config and set it as appropriate) | 16:10 |
mordred | yeah | 16:10 |
dulek | clarkb: Ha, nice to know. In our case we're running K8s with Neutron ports serving the pods. | 16:10 |
mordred | and that's what dulek is trying to set up and test | 16:10 |
mordred | like - the k8s version of what we're doing with unbound | 16:10 |
clarkb | mordred: yup | 16:10 |
fungi | so the job would need to be able to parse the unbound config and generate a similar kubernetes configuration i guess? | 16:11 |
mordred | so - perhaps one option is "read the unbound config" - but maybe writing a similar role that provides the appropriate input variables for an in-k8s coredns setup wouldn't be a bad idea *waves hands* | 16:11 |
fungi | or do we expose the values we're setting in the unbound configuration as ansible variables? | 16:11 |
clarkb | infra-root I've just put afs01.ord and afsdb01.o.o in the puppet emergency file. The first because it is the server I want to upgrade today and the second so I can disable docs afs publishing during this process | 16:12 |
dulek | Can I even see the unbound config? I mean - how complicated it is. | 16:12 |
dulek | In the end I was only looking for a single address to forward to. :P | 16:12 |
clarkb | on mirror-update I intend to hold all the cron locks rather than disable crons there | 16:12 |
clarkb | the last piece I need to consider is the wheel publishing jobs | 16:13 |
clarkb | thoughts on whether we want to pull those out of zuul or just let them potentially fail for a bit? | 16:13 |
clarkb | dulek: it's pretty simple | 16:13 |
clarkb | dulek: it's a plain text file with lines that say server: $ipaddress iirc | 16:13 |
* clarkb finds details | 16:14 | |
mordred | dulek: the configure-unbound role in openstack-zuul-jobs is where the magic i | 16:14 |
mordred | is | 16:14 |
openstackgerrit | Adam Coldrick proposed openstack-infra/storyboard-webclient master: WIP: Automatically add security teams to security stories https://review.openstack.org/642071 | 16:15 |
clarkb | dulek: /etc/unbound/forwarding.conf is the file and ' forward-addr: $NODEPOOL_STATIC_NAMESERVER_V6' is what the lines look like | 16:15 |
mordred | clarkb: we could maybe zuul_return unbound_primary_nameserver and unbound_secondary_nameserver ... | 16:15 |
clarkb | dulek: there may be more than one entry in that | 16:15 |
* dulek thanks anyone who decided to put codesearch.openstack.org up. | 16:15 | |
clarkb | mordred: I don't really want to make that a real interface. The way things should work is you use host dns or if you think you are smarter than us figure it out | 16:16 |
mordred | clarkb: I hear that - but every user doing containerized things is going to hit this issue | 16:16 |
fungi | dulek: that was mostly done by taron, an outreachy intern who worked with us a few years back | 16:16 |
clarkb | mordred: unless they use lxc like osa | 16:16 |
mordred | clarkb: and as much as I think the container ecosystem has done this incorrectly, me thinking that isn't going to make things work | 16:16 |
openstackgerrit | Matt Riedemann proposed openstack-infra/elastic-recheck master: Remove query for bug 1806126 https://review.openstack.org/642508 | 16:16 |
openstack | bug 1806126 in OpenStack Compute (nova) "LibvirtRbdEvacuateTest and LibvirtFlatEvacuateTest tests race fail" [High,Fix released] https://launchpad.net/bugs/1806126 - Assigned to Matt Riedemann (mriedem) | 16:16 |
clarkb | mordred: ya I guess the real issue here is knowing v4 vs v6 addr | 16:17 |
mordred | so I don't think it's unreasonable for us to provide some sort of interface that people doing container-based things can use to get dns going properly, right? | 16:17 |
mordred | clarkb: yeah | 16:17 |
*** yamamoto has quit IRC | 16:17 | |
mordred | clarkb: of course, that's probably complicated by whether or not the container subsystem has been configured to understand ipv6 :) | 16:17 |
clarkb | mordred: yes | 16:18 |
dulek | clarkb: Okay, so I assume first "forward-addr:" for zone "." is the one I'm looking for? | 16:18 |
clarkb | which is mind boggling that these so called cloud native tools can't ipv6 | 16:18 |
clarkb | dulek: ya first or second shouldn't really matter | 16:18 |
mordred | clarkb: IKR? | 16:18 |
clarkb | dulek: I think we prefer cloudflare to google so first will always be cloudflare then second will be google | 16:19 |
clarkb | but both are expected to work (and unbound round robins queries between them) | 16:19 |
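A minimal sketch of the "read the unbound config" approach discussed above, assuming the file is `/etc/unbound/forwarding.conf` and carries `forward-addr:` lines as clarkb describes. The function names and the sample file contents are illustrative assumptions, not taken from the configure-unbound role; the real file is generated per cloud region and may list IPv4 or IPv6 forwarders.

```python
import ipaddress

# Illustrative sample only; the real file is generated per cloud region.
SAMPLE_FORWARDING_CONF = """\
forward-zone:
  name: "."
  forward-addr: 2606:4700:4700::1111
  forward-addr: 2001:4860:4860::8888
"""


def parse_forward_addrs(text):
    """Return every forward-addr value in the config text, in file order."""
    addrs = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore trailing comments
        if line.startswith("forward-addr:"):
            addrs.append(line[len("forward-addr:"):].strip())
    return addrs


def first_of_family(addrs, version):
    """Return the first address with the given IP version (4 or 6), or None."""
    for addr in addrs:
        host = addr.split("@", 1)[0]  # unbound accepts an optional @port suffix
        if ipaddress.ip_address(host).version == version:
            return addr
    return None


if __name__ == "__main__":
    forwarders = parse_forward_addrs(SAMPLE_FORWARDING_CONF)
    print(forwarders)
    print(first_of_family(forwarders, 4))
```

On a node whose forwarders are all IPv6, `first_of_family(forwarders, 4)` returns `None`, which is the case a pod network without IPv6 support would have to handle separately.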
openstackgerrit | Thierry Carrez proposed openstack/ptgbot master: The PTG is no longer OpenStack-specific https://review.openstack.org/642509 | 16:19 |
openstackgerrit | Thierry Carrez proposed openstack/ptgbot master: Update links to point to gitea https://review.openstack.org/642510 | 16:19 |
dulek | clarkb: Okay, but that'll sometimes be IPv6, right? I doubt Kuryr's DevStack plugin is able to do IPv6, but well. | 16:19 |
dulek | clarkb: Oh, I've actually tried forwarding to 8.8.8.8 and it didn't work on one RAX instance. | 16:20 |
clarkb | dulek: yes because we have ipv6 only clouds | 16:20 |
clarkb | and on those clouds we prefer ipv6 dns because ipv4 has to go through a shared nat | 16:20 |
clarkb | which mostly works but will sometimes fail | 16:20 |
dulek | :) | 16:20 |
openstackgerrit | Helena proposed openstack-infra/project-config master: Adding zuul jobs for rsd-virt-for-nova repo https://review.openstack.org/642500 | 16:20 |
dulek | Okay, thanks folks, I guess I have enough info to try this. :) | 16:22 |
*** pcaruana has quit IRC | 16:23 | |
clarkb | mordred: thinking about the ipv6 in containers problem we probably want any zuul_return or similar to list dns_primary dns_secondary as what we think you need to use based on ip version of local cloud region then also set a v4 and v6 pair of variables so that if you have cloud native tooling you can ipv4 only | 16:23 |
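One way the variable layout clarkb proposes could look, as a rough sketch only. `dns_primary` and `dns_secondary` are the names from the message above; the `_v4` and `_v6` suffixed names, the function name, and the `prefer_ipv6` flag are hypothetical, and nothing here reflects an existing zuul_return interface.

```python
import ipaddress


def dns_variables(forwarders, prefer_ipv6):
    """Split forwarders by IP version and pick a preferred primary/secondary pair."""
    v4 = [a for a in forwarders if ipaddress.ip_address(a).version == 4]
    v6 = [a for a in forwarders if ipaddress.ip_address(a).version == 6]

    # Use IPv6 when the region prefers it and has some; otherwise fall back
    # to IPv4, and to IPv6 again if no IPv4 forwarder exists at all.
    preferred = v6 if (prefer_ipv6 and v6) else (v4 or v6)

    def pick(seq, index):
        return seq[index] if len(seq) > index else None

    return {
        "dns_primary": pick(preferred, 0),
        "dns_secondary": pick(preferred, 1),
        # Hypothetical per-family names for IPv4-only container tooling.
        "dns_primary_v4": pick(v4, 0),
        "dns_secondary_v4": pick(v4, 1),
        "dns_primary_v6": pick(v6, 0),
        "dns_secondary_v6": pick(v6, 1),
    }


if __name__ == "__main__":
    # Documentation-range addresses, not real resolvers.
    sample = ["2001:db8::1", "2001:db8::2", "192.0.2.1", "192.0.2.2"]
    for name, value in dns_variables(sample, prefer_ipv6=True).items():
        print(name, value)
```

A job whose container runtime only speaks IPv4 would read the `_v4` pair and ignore `dns_primary`, accepting the shared-NAT caveat mentioned earlier in the discussion.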
*** pcaruana has joined #openstack-infra | 16:23 | |
*** ykarel_ is now known as ykarel | 16:23 | |
mordred | clarkb: ++ | 16:23 |
clarkb | infra-root re disabling wheel build jobs afs01.ord doesn't host any of the wheel volumes so I think I am going to not disable the jobs for this first host upgrade | 16:25 |
*** yamamoto has joined #openstack-infra | 16:25 | |
mordred | dulek: fwiw - this could be a good opportunity to get kuryr's devstack plugin to be able to do ipv6 :) | 16:26 |
fungi | especially if this resulted in a job which exercised it with ipv6 | 16:26 |
dulek | mordred: Ha, sure thing, this will need to happen, but I'll look at that in 2 months. Until then I'm stuck in Spain, with ADSL internet without IPv6 enabled. :D | 16:28 |
dulek | And yes, I like to develop on local VM. ;) | 16:28 |
mordred | dulek: yay local development! | 16:29 |
*** ykarel is now known as ykarel|away | 16:29 | |
*** pgaxatte has quit IRC | 16:30 | |
*** mattw4 has joined #openstack-infra | 16:30 | |
*** trown is now known as trown|lunch | 16:30 | |
clarkb | re ipv4 through single nat address. I think we should do everything we can to avoid using that interface but we know there are things that leak through it and if we've done what we can to reduce throughput as much as possible that should make ipv4 as reliable as possible | 16:31 |
clarkb | that is a really long way of saying "it should be ok to use ipv4 where you don't have another option but where there is the option we should prefer ipv6" | 16:31 |
clarkb | infra-root I also added step of moving /etc/openafs aside before doing the upgrade because the post reboot upgrade reenables openafs-fileserver and I don't want it attempting to join our cell without its vicepa volume mounted and happy | 16:32 |
clarkb | basically I'm trying to decouple the process of upgrading from the process of starting openafs back up again and making sure it works so that we can check the upgrade is happy before checking openafs is happy | 16:33 |
clarkb | I expect that I will get started on that later this morning after I've caught up on email and such | 16:34 |
clarkb | https://etherpad.openstack.org/p/afs-fileserver-trusty-to-xenial is my documented process if anyone wants to look that over before I jump into the deep end of the pool :) | 16:35 |
*** e0ne has quit IRC | 16:41 | |
ildikov | Hi, I have a quick question looking for guidance. The StarlingX team would like to setup a dashboard for the test team like this one: http://reportportal.io | 16:44 |
dulek | Hey, one more question - if I read https://nlnetlabs.nl/documentation/unbound/unbound.conf/ correctly, by default unbound will only listen to queries from localhost, right? | 16:44 |
clarkb | dulek: yes, our unbound only listens on localhost to avoid having an open resolver on the internet | 16:44 |
ildikov | I wonder if there's any way to do it within our infra? | 16:44 |
dulek | So this will mean that it won't work for me to query it from inside my pods… | 16:45 |
*** helenafm has quit IRC | 16:45 | |
clarkb | dulek: we also have iptables firewall rules but container infrastructure tends to say no to those and then open things up in ways we don't want | 16:45 |
clarkb | dulek: unless you can use lxc as a runtime (I don't know if k8s supports that) | 16:45 |
*** nsmeds has left #openstack-infra | 16:45 | |
clarkb | ildikov: elastic-recheck is our tooling most similar to that | 16:45 |
mordred | ildikov: so - we probably want to dig in a little bit in to what such a thing is trying to accomplish and how it fits in to a zuul world | 16:46 |
mordred | also what clarkb said :) | 16:46 |
clarkb | ildikov: and tristanC has a spec up to do the ml part | 16:46 |
dulek | clarkb: Nah, no way, we support Docker and cri-o as container engines. | 16:46 |
*** gyee has joined #openstack-infra | 16:46 | |
clarkb | ildikov: http://status.openstack.org/elastic-recheck/ we write elasticsearch queries to fingerprint bugs then twice an hour we scan our elasticsearch database for occurrences of those fingerprints and publish the reports on that page | 16:47 |
iurygregory | Hey everyone, quick question to install everything under Python3 i just need to set USE_PYTHON3=True on local.conf for devstack? in my patch trying to enable i think it's installing python2.7 http://logs.openstack.org/74/642474/5/check/ironicclient-functional/b3d9e17/job-output.txt.gz#_2019-03-11_15_53_48_812130 even if in the job config is set https://review.openstack.org/#/c/642474/5/zuul.d/ironicclient-jobs.yaml | 16:47 |
clarkb | looks like indexing is behind again. I'm going to guess things have OOM'd | 16:47 |
openstackgerrit | Merged openstack-infra/elastic-recheck master: Remove query for bug 1806126 https://review.openstack.org/642508 | 16:47 |
openstack | bug 1806126 in OpenStack Compute (nova) "LibvirtRbdEvacuateTest and LibvirtFlatEvacuateTest tests race fail" [High,Fix released] https://launchpad.net/bugs/1806126 - Assigned to Matt Riedemann (mriedem) | 16:47 |
*** agopi|brb is now known as agopi | 16:47 | |
clarkb | iurygregory: probably a better question for the qa channel. My understanding is that for libs (like ironicclient) devstack will install the lib under python3 and python2 | 16:48 |
ildikov | clarkb: mordred: thanks! I think one of the challenges is that the team is having builds and sanity, robustness, etc testing which may not be integrated with Zuul at this point, but I'll point them to elastic and see if we can converge somehow | 16:48 |
clarkb | iurygregory: however, fungi has noticed recently that this may not work 100% as expected | 16:48 |
iurygregory | clarkb, ty o/ | 16:48 |
iurygregory | ops =X | 16:49 |
mordred | ildikov: yeah - first step is to get any testing that is not integrated with zuul integrated with zuul :) | 16:49 |
clarkb | iurygregory: one possibility is that the python3 install that happens after python2 install isn't overwriting /usr/local/bin entries but I don't think anyone has run it locally and checked | 16:49 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Prevent local code execution via the raw module https://review.openstack.org/642518 | 16:49 |
rpittau | clarkb, can be this the culprit? https://github.com/openstack-dev/devstack/blob/master/inc/python#L432 | 16:49 |
iurygregory | clarkb, gotcha, i will try to debug some more and see with qa too =) | 16:50 |
mordred | ildikov: but - looking at that tool - we have all of the functionality already integrated between elastic-recheck and http://status.openstack.org/openstack-health/#/ | 16:50 |
clarkb | ildikov: to use that as a release health indicator I like to ensure the classification percentage is high (meaning we've identified why most jobs fail say 90%). Then you can look at the occurrence of specific bugs to determine how healthy the project is overall | 16:50 |
mordred | ++ | 16:50 |
clarkb | rpittau: ya you can see it installs it twice. First under python2 then under python3, but in some cases we've noticed that the /usr/local/bin/ entrypoints seem to want to use python2 not 3 | 16:51 |
clarkb | rpittau: it is possible that there is a bug in pbr writing out the entrypoint script content's shebang line or that pip isn't overwriting the entry | 16:52 |
*** udesale has quit IRC | 16:52 | |
*** electrofelix has quit IRC | 16:52 | |
clarkb | I reviewed the pbr code and fungi tested it and it seemed to do the right thing. My current guess is pip doesn't overwrite for some reason. Possibly because we are editable? | 16:52 |
ildikov | mordred: clarkb: got it, thank you :) | 16:53 |
mordred | ildikov: happy to have further chats about it as needed of course! | 16:54 |
*** priteau has quit IRC | 16:54 | |
openstackgerrit | Merged openstack-infra/system-config master: Split python-base into its own Dockerfile https://review.openstack.org/632532 | 16:55 |
*** e0ne has joined #openstack-infra | 16:57 | |
*** noama has quit IRC | 16:58 | |
ildikov | mordred: sounds good, thank you | 17:00 |
fungi | clarkb: the other possibility (in the case of the job i was looking at over the weekend there were tracebacks indicating privsep-helper was sometimes called under 2.7 and sometimes under 3.x) is that we're invoking it differently in different places | 17:01 |
fungi | one thing i want to try is a dnm change to rip out the conditional block at http://git.openstack.org/cgit/openstack-dev/devstack/tree/inc/python#n441 and then do some depends-on to see what happens with those current failure cases | 17:04 |
fungi | though odds are someone else will get to that before i have time for it | 17:04 |
pabelanger | clarkb: mordred: fungi: Not sure if you saw this on Friday, but https://review.openstack.org/642100/ starts the process of creating a zuul-specific tenant for zuul | 17:05 |
clarkb | re e-r being behind it appears the e-s cluster is red due to a lost shard. This seems to have made logstash unhappy? | 17:06 |
fungi | great! | 17:06 |
clarkb | I think maybe if logstash is trying to write to that broken shard it blocks | 17:06 |
*** ginopc has quit IRC | 17:07 | |
clarkb | the shard belongs to the 8ths index so if I restart things we should roll forward so I'm going to restart some things and see if we can get it past that | 17:07 |
clarkb | actually | 17:07 |
clarkb | we may just want to delete that index entirely to avoid trouble | 17:07 |
clarkb | I'm going to go ahead and do that | 17:08 |
*** rpittau is now known as rpittau|afk | 17:09 | |
*** wolverineav has joined #openstack-infra | 17:09 | |
*** kopecmartin is now known as kopecmartin|off | 17:10 | |
clarkb | that seems to have gotten things moving. We'll have a hole but not sure we could avoid that given the broken index | 17:11 |
*** priteau has joined #openstack-infra | 17:12 | |
*** e0ne has quit IRC | 17:13 | |
*** wolverineav has quit IRC | 17:14 | |
*** tosky__ has joined #openstack-infra | 17:14 | |
*** tosky has quit IRC | 17:14 | |
*** rfolco has joined #openstack-infra | 17:14 | |
*** e0ne has joined #openstack-infra | 17:14 | |
*** rfolco|ruck has quit IRC | 17:16 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/nodepool master: DNM: Pin to Kubernetes 9 beta until it releases https://review.openstack.org/642524 | 17:18 |
*** tosky__ is now known as tosky | 17:19 | |
*** dtantsur is now known as dtantsur|afk | 17:19 | |
*** jpich has quit IRC | 17:29 | |
*** ykarel|away has quit IRC | 17:29 | |
*** e0ne has quit IRC | 17:31 | |
*** mriedem is now known as mriedem_afk | 17:31 | |
*** iurygregory has quit IRC | 17:35 | |
*** jamesmcarthur has quit IRC | 17:40 | |
*** jamesmcarthur has joined #openstack-infra | 17:41 | |
openstackgerrit | Merged openstack-infra/zuul master: Prevent local code execution via the raw module https://review.openstack.org/642518 | 17:45 |
*** jamesmcarthur has quit IRC | 17:45 | |
clarkb | alright meeting agenda is sent. I'm going to find food then when I'm back start on afs01.ord upgrade | 17:47 |
*** trown|lunch is now known as trown | 17:49 | |
fungi | btw, topic:xenial-upgrades has a couple patches to move us forward on the wiki-dev upgrade | 17:51 |
clarkb | fungi: great I'll take a look before diving into afs upgrades then | 17:51 |
openstackgerrit | Merged openstack-infra/system-config master: Use opendev logos https://review.openstack.org/642179 | 17:55 |
*** derekh has quit IRC | 18:00 | |
openstackgerrit | Merged openstack-infra/openstack-zuul-jobs master: Add Python3 project templates for Train release https://review.openstack.org/641878 | 18:06 |
*** wolverineav has joined #openstack-infra | 18:10 | |
*** dpawlik has joined #openstack-infra | 18:11 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul-preview master: WIP: Begin refactoring code for unit testing https://review.openstack.org/642245 | 18:13 |
*** jpena is now known as jpena|off | 18:13 | |
*** wolverineav has quit IRC | 18:15 | |
*** wolverineav has joined #openstack-infra | 18:15 | |
*** wolverineav has quit IRC | 18:15 | |
*** wolverineav has joined #openstack-infra | 18:15 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul-preview master: WIP: Begin refactoring code for unit testing https://review.openstack.org/642245 | 18:18 |
clarkb | fungi: +2 on both. You might also want a change to bump the testing of those nodes to xenial in manifests/site.pp | 18:18 |
* clarkb attempts to grab locks on mirror-update.o.o | 18:19 | |
*** jamesmcarthur has joined #openstack-infra | 18:20 | |
*** panda|rover is now known as panda|rover|off | 18:20 | |
*** e0ne has joined #openstack-infra | 18:23 | |
*** e0ne has quit IRC | 18:25 | |
*** jcoufal has quit IRC | 18:25 | |
*** jamesmcarthur has quit IRC | 18:26 | |
clarkb | infra-root I've grabbed all locks on mirror-update and disabled the docs release cron on afsdb01 | 18:27 |
clarkb | I am going to proceed with upgrading afs01.ord.o.o in place now | 18:27 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/zuul-jobs master: [DNM] exercise base-test as parent in unittests https://review.openstack.org/642536 | 18:27 |
clarkb | ok prep work is done and I'm ready to run do-release-upgrade now. I thought about doing a snapshot but I've got the snapshot I already made which should be recent enough | 18:33 |
clarkb | here goes | 18:33 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/zuul-jobs master: [DNM] exercise base-test as parent in unittests https://review.openstack.org/642536 | 18:36 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/zuul-jobs master: [DNM] exercise base-test as parent in unittests https://review.openstack.org/642536 | 18:39 |
*** diablo_rojo has joined #openstack-infra | 18:40 | |
*** eernst has joined #openstack-infra | 18:45 | |
*** jamesmcarthur has joined #openstack-infra | 18:47 | |
*** eernst has quit IRC | 18:50 | |
clarkb | waiting on dkms to build kernel modules | 18:53 |
clarkb | otherwise upgrade has gone exactly as I've documented it so far | 18:53 |
*** jcoufal has joined #openstack-infra | 18:54 | |
*** priteau has quit IRC | 18:55 | |
*** e0ne has joined #openstack-infra | 18:57 | |
*** eernst has joined #openstack-infra | 18:58 | |
*** aojea has quit IRC | 19:00 | |
*** kgiusti has left #openstack-infra | 19:02 | |
*** eernst has quit IRC | 19:02 | |
*** mriedem_afk is now known as mriedem | 19:04 | |
*** eernst has joined #openstack-infra | 19:04 | |
*** jcoufal has quit IRC | 19:06 | |
clarkb | I'm doing what should be the last reboot now | 19:08 |
clarkb | ssh isn't coming back quickly making me wonder if it is fscking the vicepa volume | 19:08 |
*** eernst has quit IRC | 19:09 | |
* clarkb waits patiently (fwiw I have rebooted it post upgrade already so I'm fairly certain the new kernel and all that work and I've mounted the vicepa volume manually before rebooting so expect that to work too) | 19:09 | |
clarkb | however it does not ping and I'm not sure if I should expect it to ping during boot if fscking | 19:09 |
clarkb | console shows the ubuntu 16.04 boot splash iirc that overlays any fsck output | 19:12 |
clarkb | and the little dots are changing color so it is doing something | 19:13 |
clarkb | infra-root ^ let me know if you think I should do something other than be patient under the assumption fsck is in control | 19:13 |
mordred | clarkb: I think patience is the best bet | 19:14 |
*** yamamoto has quit IRC | 19:14 | |
fungi | it's almost certainly the dkms build(s) | 19:17 |
fungi | how many kernels were installed? | 19:17 |
fungi | oh, wait, that was before you rebooted though | 19:17 |
clarkb | yes the upgrade fully completed and we did a reboot then | 19:17 |
fungi | yeah, the boot process should eventually time out if there's a problem waiting for a block device to become available or something | 19:18 |
clarkb | that came back, I fixed puppet apt, reinstalled apt, and then set up openafs configs and remounted vicepa. Then rebooted again (this is where I'm stuck) | 19:18 |
clarkb | internet suggests an f2 might escape the splash screen | 19:18 |
clarkb | hrm f2 shows me what looks like output from the previous shutdown which isn't what I really want | 19:19 |
fungi | is it not done halting yet maybe? | 19:20 |
fungi | or... i know they're doing a mass migration in their dfw region... did you maybe translarently trigger a cold instance migration by rebooting? | 19:20 |
fungi | er, transparently | 19:21 |
clarkb | maybe? | 19:21 |
corvus | sometimes 'esc' get you out of the splash screen | 19:21 |
fungi | probably need to review open tickets in our tenant there to find out if they're trying to get us to reboot some so they'll migrate | 19:21 |
clarkb | corvus: ya esc toggles back and forth. I'm beginning to wonder if fungi's theory is the one | 19:21 |
clarkb | it does appear we are actually stuck on the shutdown side of the reboot and that may be because we haven't booted our new state on new hypervisor yet | 19:22 |
clarkb | would also explain why ping doesn't work because the network stack is off | 19:22 |
clarkb | fungi: I don't see any open tickets though | 19:23 |
fungi | hrm... | 19:23 |
*** xek has quit IRC | 19:23 | |
clarkb | this server is in ord though | 19:23 |
clarkb | doesn't look like the ticket system is region specific | 19:23 |
*** xek has joined #openstack-infra | 19:23 | |
fungi | oh, nevermind. i hadn't heard anything about ord migrations anyway, just dfw | 19:23 |
clarkb | part of me wants to reboot it from the openstack side of things | 19:24 |
fungi | there are a couple tickets for that tenant titled "[ACTION REQUIRED] DFW Datacenter Migration" it seems | 19:24 |
clarkb | as far as the openstack api is concerned the server is ACTIVE | 19:24 |
fungi | and also one "Cloud Server Incident Notification" ticket open | 19:25 |
clarkb | ya I think an api reboot is the next thing | 19:25 |
clarkb | infra-root ^ any objections or alternative suggestions? | 19:25 |
corvus | clarkb: afsd wasn't running, correct? | 19:26 |
clarkb | corvus: bosserver/openafs-fileserver was not running | 19:26 |
fungi | the cloud server incident was for the old pre-upgrade graphite server so i closed it | 19:26 |
clarkb | (aiui the openafs-fileserver service runs bosserver which runs the other services as child processes) | 19:27 |
*** eernst has joined #openstack-infra | 19:27 | |
fungi | the datacenter migration tickets were for ask-staging and afs02.dfw | 19:27 |
fungi | ask-staging was migrated for us already so i closed that ticket | 19:27 |
corvus | then that seems pretty safe to me. even if the ext4 partition is still mounted, ext4 should be able to handle that. i'd worry a bit more (like, would we have to do a volume recovery) if afsd were running. | 19:27 |
clarkb | corvus: got it, ya I'm fairly certain I managed to stop those services | 19:28 |
fungi | afs02.dfw will be rebooted for us at or after "March 09 2019 01:29 UTC" | 19:28 |
*** eernst has quit IRC | 19:28 | |
clarkb | proceeding with the reboot now | 19:28 |
*** eernst has joined #openstack-infra | 19:28 | |
fungi | uptime on afs02.dfw is showing 6 days, so i don't think they've rebooted it (yet) | 19:28 |
clarkb | ok server is back up. vicepa is mounted, bosserver is running and afsdb01 bos status says running normally | 19:30 |
clarkb | corvus: thoughts on other stuff to check before I turn on puppet and release some lock files | 19:31 |
clarkb | vos listvldb doesn't show anything sad about the ord volumes that I can see either | 19:32 |
clarkb | I am going to reenable puppet which will ensure that half of things is happy too. Then once we are overall happy we can do afs02.dfw | 19:33 |
clarkb | actually you know I've seen similar reboot/shutdown behavior locally with libvirt where it gets out of sync with the virtual acpi | 19:34 |
clarkb | I wonder if new kernel doesn't play nice with xen/rax | 19:34 |
clarkb | and maybe we need to install some package we'd normally get via rax images | 19:34 |
clarkb | we can do a dpkg listing and compare to rax xenial image | 19:34 |
clarkb | I have reenabled puppet on afs01.ord | 19:35 |
*** eernst has joined #openstack-infra | 19:36 | |
fungi | perhaps the rackspace/xen agent we've got installed is incompatible with newer kernels and needs a different version obtained from somewhere? seems unlikely, but possible i suppose | 19:36 |
*** gfidente is now known as gfidente|afk | 19:36 | |
*** bringha has joined #openstack-infra | 19:37 | |
clarkb | I see that docs has a backup version which I think means it is ready for a vos release? so once we reenable the cron for docs release we should see that everything is working overall | 19:37 |
clarkb | I'll do that once puppet is happy | 19:37 |
clarkb | fungi: ya this reboot was the first from new kernel to off. The previous reboot was old kernel to off to new kernel on | 19:38 |
fungi | our instructions for updating the base job say to start by making sure base-test is identical to base... how are people generally confirming that? | 19:40 |
*** eernst has quit IRC | 19:40 | |
*** e0ne has quit IRC | 19:40 | |
fungi | right now i've got a slew of post_failure job results when i test reparenting to base-test | 19:40 |
*** e0ne has joined #openstack-infra | 19:40 | |
fungi | and i realize i skipped that step | 19:40 |
*** eernst has joined #openstack-infra | 19:42 | |
fungi | `diff -ru playbooks/base{,-test}` does indeed have some bits i should check out | 19:42 |
clarkb | fungi: I'm guessing manual diffs :/ fwiw that job would've been moved into opendev/base-jobs and it is possible that the move wasn't 100% correct for base-test | 19:42 |
*** xek has quit IRC | 19:43 | |
*** eernst has quit IRC | 19:43 | |
*** eernst has joined #openstack-infra | 19:43 | |
*** xek has joined #openstack-infra | 19:43 | |
*** eernst has quit IRC | 19:43 | |
fungi | it does look from git history as if the base-test playbooks were different when they were copied to the new repo | 19:43 |
*** eernst has joined #openstack-infra | 19:44 | |
fungi | this is the current diff between the playbooks: http://paste.openstack.org/show/747569/ | 19:45 |
fungi | someone was testing some change to the upload-logs role | 19:46 |
clarkb | fungi: I would probably start by reverting that? | 19:47 |
corvus | yeah, just copy base to base-test | 19:47 |
fungi | yeah, technically a `git revert ...` isn't going to work since this was carted in from a different repo | 19:48 |
corvus | i don't think anyone's testing anything right now. normally i'd check git log, but that's too much trouble with the repo move. :) | 19:48 |
openstackgerrit | Jeremy Stanley proposed opendev/base-jobs master: Reset base-test playbooks to match base https://review.openstack.org/642550 | 19:50 |
fungi | clarkb: corvus: ^ | 19:50 |
fungi | thanks! (to AJaeger too) | 19:54 |
fungi | once it merges i'll hopefully be able to finish setting up the necessary chain to demonstrate unit tests running on the correct distros for different branches of a random openstack project | 19:54 |
*** jcoufal has joined #openstack-infra | 20:03 | |
*** yamamoto has joined #openstack-infra | 20:06 | |
openstackgerrit | Merged opendev/base-jobs master: Reset base-test playbooks to match base https://review.openstack.org/642550 | 20:06 |
*** bringha has quit IRC | 20:10 | |
*** e0ne has quit IRC | 20:11 | |
*** e0ne has joined #openstack-infra | 20:12 | |
*** yamamoto has quit IRC | 20:12 | |
*** e0ne has quit IRC | 20:15 | |
clarkb | puppet ran with no change on afs01.ord | 20:17 |
clarkb | I'm going to enable the docs publishing cron on afsdb01 now | 20:18 |
fungi | that sounds awesome. do we want more reboot tests? | 20:19 |
ianw | infra-root / corvus : i'm at the point of being ready to send some git:// to https:// changes, can you review this commit message : | 20:25 |
ianw | https://git.openstack.org/cgit/openstack-infra/system-config/tree/tools/mass-git-change/replace.sh?h=refs/changes/14/642314/3#n20 | 20:25 |
clarkb | fungi: ya we can disable the bosserver via afsdb01 after docs release situation is happy and then do another reboot | 20:26 |
clarkb | fwiw we've released docs but I still see RWrite: 536870991 ROnly: 536870992 Backup: 536870993 | 20:26 |
clarkb | I had assumed backup being higher than the other two meant we need to release to catch up but maybe that isn't what it means | 20:27 |
clarkb | corvus: ^ do you know what that means off the top of your head? | 20:27 |
corvus | clarkb: those are just volume ids, they don't change | 20:28 |
clarkb | corvus: got it, any good way of checking vos release being successful? | 20:28 |
clarkb | I guess I can track a docs change and see it on the web browser side of things | 20:28 |
clarkb | ianw: lgtm | 20:28 |
corvus | clarkb: yes, 'vos examine' should tell you | 20:29 |
corvus | clarkb: i'm pretty sure if one of the sites is behind, it says so | 20:29 |
clarkb | corvus: thanks | 20:29 |
corvus | clarkb: you can 'vos examine docs.readonly' to see the "last update" time | 20:30 |
corvus | that'll be the last time it was released | 20:30 |
clarkb | last update was 5 minutes ago | 20:32 |
corvus | (you can also 'vos examine 536870992' since that's the volume id for the read only volume) | 20:32 |
clarkb | http://paste.openstack.org/show/747574/ so that lgtm | 20:32 |
corvus | clarkb: agree | 20:32 |
clarkb | ok I'll shutdown that server via the bos command on afsdb01 then reboot | 20:32 |
clarkb | to see if we get a clean reboot this time. If we don't then I'll dig into dpkg diffs and see if there are any apparent rax image deltas that we might need to address | 20:33 |
mordred | ianw: lgtm | 20:34 |
ianw | clarkb: fyi afsmon takes the creation date -> http://git.openstack.org/cgit/openstack-infra/afsmon/tree/afsmon/__init__.py#n86 , that's what builds the dashboard's "last release" time | 20:35 |
clarkb | heh it might be doing a fsck this time | 20:36 |
clarkb | I see /dev/xvda1: clean: etc | 20:36 |
clarkb | but I can ssh in so maybe it only fscked xvda1 | 20:36 |
clarkb | bos status says it is back and running | 20:37 |
clarkb | I think I am going to call afs01.ord good now | 20:37 |
clarkb | I'm adding afs02.dfw.o.o to the emergency file now | 20:38 |
fungi | be aware that's the one with the open ticket about the pending (though in theory scheduled for a couple days ago) reboot migration | 20:38 |
clarkb | fungi: ya I've actually got reboot in the list of steps before we do the release upgrade | 20:39 |
clarkb | hopefully that catches any pending migrations | 20:39 |
fungi | which i suspect they either didn't perform or ended up not needing to perform given the listed uptime on the server | 20:39 |
openstackgerrit | Clark Boylan proposed openstack-infra/project-config master: Disable wheel mirror updates for afs server upgrades https://review.openstack.org/642562 | 20:42 |
openstackgerrit | Clark Boylan proposed openstack-infra/project-config master: Revert "Disable wheel mirror updates for afs server upgrades" https://review.openstack.org/642563 | 20:42 |
clarkb | infra-root can I get reviews on the first change there to avoid attempting to publish wheel mirror updates while I do afs02.dfw? | 20:42 |
clarkb | I will WIP the second change | 20:43 |
clarkb | to recap the upgrade on afs01.ord the only unexpected thing was the broken reboot after booting new kernel | 20:45 |
clarkb | subsequent reboots work | 20:45 |
clarkb | overall relatively straightforward. Also the lack of puppet updates on the upgraded server implies I picked do-release-upgrade question answers properly :) | 20:46 |
clarkb | (if I had overwritten a file managed by puppet it would update the file contents post upgrade) | 20:46 |
clarkb | I've still got mirror-update locks held and have redisabled the docs publishing cron. Once the change above merges I'll be ready to do afs02.dfw | 20:47 |
fungi | anybody working on zuul executor restarts yet? looks like we've got the raw fix installed but the executors are still running on code from friday | 20:50 |
clarkb | I am not. I've got afs things paged in and trying to run through that as much as possible | 20:50 |
corvus | fungi: nope. you want to do it, or shall i? | 20:52 |
*** jamesmcarthur has quit IRC | 20:56 | |
*** e0ne has joined #openstack-infra | 20:56 | |
clarkb | http://git.openstack.org/cgit/openstack/openstack-ansible/tree/scripts/fastest-infra-wheel-mirror.py | 20:58 |
clarkb | discovered ^ doing a code search on the wheel mirror jobs | 20:58 |
fungi | corvus: i can start in a few minutes. do we have a specific playbook/procedure for that? | 20:59 |
*** luizbag has quit IRC | 21:01 | |
clarkb | fungi: we have a playbook to restart all the zuul services, you should be able to trim it down to just the executors | 21:02 |
openstackgerrit | Merged openstack-infra/project-config master: Disable wheel mirror updates for afs server upgrades https://review.openstack.org/642562 | 21:02 |
clarkb | fungi: possibly by --limit ze*.openstack.org ? | 21:02 |
fungi | yeah, last likely candidate i found in my shell history on bridge.o.o was `sudo ansible ze*.openstack.org -m shell -a 'systemctl restart zuul-executor'` | 21:03 |
clarkb | I'm grabbing something to drink then starting with afs02. That merge was the last thing I needed in the disable things that might do stuff I don't want list | 21:04 |
fungi | i should have started drinking already! ;) | 21:04 |
*** mattw4 has quit IRC | 21:04 | |
*** e0ne has quit IRC | 21:04 | |
*** mattw4 has joined #openstack-infra | 21:05 | |
corvus | fungi: yeah, that systemctl command should be fine | 21:05 |
*** trown is now known as trown|outtypewww | 21:05 | |
fungi | running that in that case | 21:06 |
clarkb | heh not that kind of drinking. | 21:06 |
clarkb | though I grabbed a bottle of whiskey aged on oregon white oak because it's somewhat novel and actually not bad either | 21:06 |
fungi | now to check that the pidfile gets refreshed on them all | 21:07 |
*** e0ne has joined #openstack-infra | 21:12 | |
fungi | no pidfiles on a lot of the executors now, checking one for sanity | 21:15 |
fungi | Active: active (running) since Mon 2019-03-11 21:06:29 UTC; 9min ago | 21:16 |
fungi | last entry in its debug log though is: 2019-03-11 21:10:43,743 DEBUG zuul.log_streamer: LogStreamer stopped | 21:17 |
corvus | they take a while to stop | 21:17 |
corvus | fungi: maybe a separate stop and start are needed, depending on what systemd does with restart | 21:18 |
*** xek has quit IRC | 21:18 | |
fungi | will give that a shot | 21:18 |
corvus | fungi: they take ~10m to stop usually | 21:18 |
ianw | clarkb: wow, that script is ... something | 21:19 |
*** mattw4 has quit IRC | 21:21 | |
*** mattw4 has joined #openstack-infra | 21:23 | |
fungi | okay, all 12 executors now have a pidfile with a 21:20 timestamp | 21:23 |
corvus | http://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1 looks reasonable | 21:24 |
clarkb | ok afs02.dfw did something a little different than afs01.ord. It made me configure openafs client because I moved the config aside | 21:24 |
clarkb | I just went with the defaults since I will move the config back when bringing bosserver back online | 21:24 |
clarkb | (it asked for cell name and cache size) | 21:25 |
*** whoami-rajat has quit IRC | 21:25 | |
zbr | ianw: can you please comment on https://review.openstack.org/#/c/639951/ ? | 21:26 |
*** mattw4 has quit IRC | 21:28 | |
*** mattw4 has joined #openstack-infra | 21:29 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Add no_log entries to skopeo copy commands https://review.openstack.org/642574 | 21:29 |
fungi | yeah, seems the executors are running normally once more (or at least coming back to normalcy) | 21:31 |
fungi | #status log restarted zuul executors for security fix 5ae25f0 | 21:32 |
openstackstatus | fungi: finished logging | 21:32 |
clarkb | fungi: now is the part where I am waiting on dkms | 21:32 |
*** mattw4 has quit IRC | 21:32 | |
fungi | for the record, `systemctl restart zuul-executor` seems to only have stopped them, not started them | 21:33 |
*** mattw4 has joined #openstack-infra | 21:33 | |
fungi | yet registered them as running | 21:33 |
corvus | mordred, tristanC, tobiash: when i visit https://opendev.org/ i see a gitea logo in the top left. but when i shift-reload, it turns to the opendev logo. but when i click 'home' it turns back to the gitea logo. i'm pretty sure this is related to service workers. any idea how to fix that (other than asking users to tell their browser to delete service workers?) | 21:33 |
fungi | the separate stop and start corvus suggested did the trick though | 21:33 |
*** jcoufal has quit IRC | 21:36 | |
*** dave-mccowan has quit IRC | 21:36 | |
clarkb | corvus: mozilla bugzilla implies that for firefox you have to clear those cache items through the "storage inspector" | 21:37 |
clarkb | I don't know what storage inspector is | 21:37 |
corvus | yeah, anything that requires a user to do something is a no go :) | 21:38 |
*** e0ne has quit IRC | 21:38 | |
corvus | is the local storage cleared when the service worker is updated with a new version? does it ever clear the local storage for any (other) reason? | 21:40 |
*** pcaruana has quit IRC | 21:45 | |
ianw | zbr: hrrm, my immediate feeling is that i don't think the package manager is a great thing to switch on. i understand it happens to be a proxy for the release of the platform | 21:46 |
ianw | but across dib, devstack, etc we just do YUM= and select either yum or dnf ... for all intents they're compatible | 21:47 |
zbr | ianw: i think that the ultimate question is which "atom" name to use to identify the python3-first platforms. | 21:47 |
zbr | using "distro-ver" does not scale at all, bindep.txt files would become a total mess soon if we start adding versions. | 21:48 |
zbr | i have no strong feelings for the name, just want to find one that we can all agree, so we can start using it. | 21:49 |
clarkb | fwiw I don't know that it will be that bad | 21:49 |
clarkb | once you cross the threshold you can clean up the legacy stuff | 21:49 |
zbr | clarkb: based on my prev experience some distros are really hard to kill, ;) | 21:50 |
zbr | there is always someone still wanting them listed, aka super-extra-extended-support kind of... | 21:50 |
clarkb | infra-root afs02.dfw.openstack.org is upgraded and up and running with a happy status report from bos status. Now a question of process. Would you like me to reenable all of the publishing of volumes then let afs burn in until tomorrow morning when I upgrade afs01.dfw.o.o. This should give us a high degree of confidence everything works as expected. Or do you think I should just go ahead and upgrade | 21:53 |
clarkb | afs01.dfw.o.o now with the locks and jobs disabled? | 21:53 |
*** dpawlik has quit IRC | 21:53 | |
clarkb | afs01.dfw is our read-write volume server for all volumes fwiw | 21:53 |
corvus | clarkb: i'm not sure burn in will increase confidence by much, so if you want to plow through, that sounds reasonable. | 21:54 |
corvus | usually if it breaks, it breaks immediately and hard :) | 21:54 |
clarkb | cool I'll proceed then | 21:54 |
clarkb | rather get this done than wait longer | 21:54 |
fungi | wfm | 21:54 |
clarkb | I've put afs01.dfw in the puppet emergency file. Before I start on that server I'll disable afs02.dfw again and reboot it again to make sure that subsequent reboots work as expected (they did on afs01.ord) | 21:55 |
*** jtomasek has quit IRC | 21:56 | |
clarkb | zbr: ya I guess I'm thinking centos 7 is the only one that is python2 | 21:57 |
clarkb | zbr: so you'll have to set centos7 packages and not centos7 packages which isn't so bad | 21:57 |
clarkb | and it is happy after that last reboot | 21:57 |
zbr | clarkb: unless you also count rhel-7... and we already have two. others may start to pop up... i don't know. | 21:58 |
ianw | zbr: can we explore the problem with some full examples @ https://etherpad.openstack.org/p/bindep-python3 | 22:05 |
*** rfolco is now known as rfolco|ruck|off | 22:05 | |
ianw | zbr: but then the issue is adding in rhel8, right? | 22:07 |
zbr | ianw: yeah, i am trying to find a way to avoid all extra maintenance work. | 22:08 |
ianw | zbr: so when it comes down to it, your major concern is that [platform:fedora platform:centos-8 platform:rhel-8] is not forward compatible? | 22:13 |
zbr | ianw: yeah. alternative would be using centos-7 and rhel-7 as conditions, with negative matches. | 22:15 |
clarkb | building dkms things on afs01.dfw as part of the upgrade now | 22:15 |
fungi | how does platform:redhat differ from platform:rpm in the absence of non-redhat-derived rpm-based distros? | 22:15 |
fungi | like, until a project decides to add, say, suse support you can always just operate off the assumption that platform:rpm means "all the rpm-based platforms i expect to support right now" | 22:16 |
zbr | ianw: btw, what are the returned values for rhel now? (i do not have one to ssh to it right now) | 22:16 |
ianw | zbr: i don't know, i'm just guessing | 22:16 |
clarkb | there are rhel examples in the bindep test suite | 22:16 |
clarkb | for 7 not 8 | 22:16 |
clarkb | workstation and server | 22:16 |
*** eernst has quit IRC | 22:17 | |
ianw | yeah, it is "rhel" | 22:17 |
zbr | but we can safely use "rhel". do we also have a rhel-7 atom? | 22:17 |
clarkb | yes you should have both | 22:18 |
ianw | i think i'm feeling the issue, that we need to "explode" platform:rpm !platform:fedora to explicit matches | 22:19 |
ianw | what if we supported a "+"? | 22:20 |
fungi | not 100% sure what the thrust of this exploration exercise is, but keep in mind bindep was designed with the expectation that you use the broadest possible platform profiles until someone tells you they're insufficient for a given platform, because in many cases the package name is identical across all distros, next most common is that rpm-based distros tend to use one naming convention while | 22:21 |
zbr | i updated the etherpad with my practical example for molecule. it should work but not tested yet. | 22:21 |
fungi | dpkg-based distros use a different convention but within their general family they still all use the same name for that package... least common is having to handle package name differences between distros in the same family or between releases of the same distro (i.e. packages getting renamed/replaced) | 22:21 |
zbr | fungi: yeah, in fact most tools share the same names, like "git", "gcc". | 22:23 |
clarkb | ianw: that becomes tricky with suse going from 42 to 15, but that might help overall | 22:23 |
fungi | zbr: leaving me to wonder why you have, for example, python-pip [platform:redhat platform:dpkg] | 22:23 |
fungi | why is that not just python-pip with no platform profiles? | 22:23 |
ianw | fungi: well i think on that etherpad page, when we add in rhel-8, how do we keep things relatively sane with different python package names? that's a constrained problem | 22:24 |
fungi | oic, packages getting renamed to python3-pip? | 22:24 |
zbr | fungi: there is no such thing as python-pip on brew (macos). | 22:24 |
ianw | yeah, then you want python3 on fedora,centos-8,rhel-8 but not <= centos-7 | 22:25 |
zbr | fungi: it is better to miss one distro than adding a false dependency. | 22:25 |
fungi | zbr: and that list you have there is expected to work on brew? | 22:25 |
fungi | "better to miss one distro than adding a false dependency" is sort of the opposite of how bindep is designed to be used | 22:25 |
*** tosky has quit IRC | 22:26 | |
zbr | fungi: to answer your question, brew is not listed because i didn't have time. | 22:26 |
fungi | as i said, the expectation is you use the broadest possible platform profiles (or none at all) until someone proposes making them more granular to support an additional platform with the list of packages you have | 22:26 |
ianw | clarkb: sorry, is docs.openstack.org giving forbidden atm? | 22:26 |
fungi | so, yeah, if you're working on making the list there also support brew then it makes sense, that wasn't clear to me from context | 22:27 |
fungi | ianw: confirmed | 22:27 |
fungi | clarkb: hiccough with the afs upgrade? | 22:27 |
clarkb | not that I've seen on my side yet | 22:28 |
clarkb | it is doing dkms things | 22:28 |
johnsom | Ummm, https://docs.openstack.org/octavia/latest/ is giving Forbidden -You don't have permission to access /octavia/latest/ on this server. | 22:28 |
fungi | i'm getting 403 forbidden at https://docs.openstack.org/ currently | 22:28 |
clarkb | I wonder if we are serving from the rw volume instead of the ro | 22:28 |
johnsom | Ah, ianw got here first... | 22:28 |
fungi | DocumentRoot /afs/openstack.org/docs | 22:29 |
clarkb | hrm | 22:29 |
fungi | according to the vhost | 22:29 |
clarkb | does that imply afs01.ord isn't actually working? | 22:29 |
clarkb | since it hosts the other RO volume for docs | 22:29 |
fungi | ls: cannot access '/afs/openstack.org/docs': Connection timed out | 22:29 |
clarkb | fwiw zuul-ci.org does load, is served by the same apache but afs02.dfw serves as backup RO volume | 22:30 |
clarkb | I'm most of the way through the upgrade on afs01.dfw | 22:30 |
clarkb | I think I should continue? that will get docs back when afs01.dfw is back | 22:30 |
fungi | yeah, i think we have some documentation on how to swap those | 22:30 |
openstackgerrit | Merged openstack-infra/zuul master: Increase timeout of test_plugins https://review.openstack.org/641803 | 22:30 |
openstackgerrit | Merged openstack-infra/zuul master: Fix test race in test_container_jobs https://review.openstack.org/641791 | 22:30 |
clarkb | fungi: fwiw that path is the RO path aiui | 22:31 |
clarkb | fungi: which implies afs01.ord isn't serving the data I think | 22:31 |
clarkb | bos status says afs01.ord is running normally | 22:31 |
fungi | hrm, yep | 22:33 |
clarkb | iptables shows the expected ports are open | 22:33 |
fungi | ahh, found, the docs we have are for vos move actions | 22:33 |
clarkb | fungi: I'm not sure a vos move will help? | 22:33 |
clarkb | I wonder if the client on files has stale info so it's looking for the data only on the shut down server? | 22:33 |
clarkb | ok afs01.dfw is ready for rebooting to come up on the new kernel. | 22:34 |
clarkb | I'm going to tell it to do that now then we can get things set up to serve the content for docs too | 22:34 |
fungi | this is what files02 sees: http://paste.openstack.org/show/747582/ | 22:35 |
fungi | `vos listvol -server afs01.ord.openstack.org` also shows docs.readonly as On-line | 22:37 |
ianw | fungi: i wonder if it's because it's lost contact with ord previously, then it's back, then dfw has gone | 22:39 |
ianw | i.e. deep openafs bug | 22:39 |
fungi | like if we'd waited longer in between it would have been seamless? | 22:39 |
ianw | aka, turn it on and off again, and maybe it works | 22:39 |
clarkb | afs01.dfw should be back soon | 22:40 |
ianw | fungi: like something in the state machine of what to access has gone wrong; cause yeah, vos looks like it's ok, but ls /afs/openstack.org is not | 22:40 |
clarkb | afs01.dfw is back | 22:41 |
clarkb | and so is docs | 22:41 |
clarkb | I would like to reboot afs01.dfw one more time to ensure it works without a hard reboot | 22:41 |
fungi | confirmed, it started working as soon as afs01.dfw returned | 22:41 |
clarkb | should I go ahead and do that knowing that we may take another docs.o.o outage? | 22:42 |
fungi | sure, unless we want to use that to diagnose what's going on with files02 | 22:42 |
clarkb | fungi: ianw maybe you can restart afsclient services on files02 after I stop the bosserver on afs01.dfw? | 22:42 |
fungi | worth a shot | 22:42 |
clarkb | fungi: I'm guessing that ianw's hunch is not far off | 22:42 |
ianw | we can try ... although it might be kernel-level ish | 22:42 |
clarkb | basically the client side has gotten confused about where it should get the data from | 22:42 |
clarkb | ianw: fungi ok let me know when I should stop afs01.dfw for its last reboot | 22:43 |
fungi | go for it. i've got the commands to stop/start the openafs-client service queued up | 22:43 |
ianw | i think now, and yeah let's try openafs-client stop/start on files02 and see if it works | 22:44 |
ianw | fungi: ++ | 22:44 |
clarkb | ok it is stopped | 22:44 |
fungi | things are still working | 22:44 |
clarkb | why don't you check things work on files before I reboot afs01.dfw since that will race | 22:44 |
fungi | maybe cached? | 22:44 |
clarkb | oh ya could it be that we hit the cache timeout? | 22:44 |
fungi | i'm browsing around the site successfully still | 22:45 |
clarkb | should I hold off on the reboot? the service will auto start on boot so want to wait until you think you are done debugging | 22:45 |
clarkb | also possible that dfw coming back kicked the state machine in the client | 22:45 |
clarkb | so it knows to fail over to ord | 22:45 |
*** mriedem has quit IRC | 22:45 | |
corvus | i just did a local ls on my workstation | 22:45 |
corvus | it initially took a while to realize that the server was down, but then apparently switched to a replica successfully | 22:46 |
fungi | yeah, it seems like it's now no longer breaking even with afs01.dfw back offline again | 22:46 |
clarkb | shall I reboot then? | 22:46 |
fungi | go for it | 22:46 |
corvus | i'm even able to load tc governance documents via docs.o.o, which are almost certainly not in any cache. ;) | 22:47 |
fungi | i'm continuing to browse around to random pages on the docs site | 22:47 |
clarkb | and reboot succeeded as expected | 22:47 |
clarkb | I'm going to put afs01.dfw.o.o and afsdb01.o.o back into puppet | 22:48 |
corvus | sorry i missed the opportunity to help debug earlier | 22:48 |
*** hwoarang has quit IRC | 22:48 | |
clarkb | then release my mirror-update locks and we can merge https://review.openstack.org/#/c/642563/ if anyone else wants to be second +2 on that | 22:48 |
corvus | maybe shout "infra dash root" if you need more eyes next time :) | 22:48 |
fungi | it was definitely a strange situation | 22:49 |
*** hwoarang has joined #openstack-infra | 22:49 | |
fungi | johnsom: stuff should be back to normal as of ~22:40z | 22:50 |
fungi | so roughly 10 minutes ago | 22:50 |
mordred | corvus: I agree with your symptom from earlier of only sometimes getting the opendev logo on opendev.org | 22:50 |
*** threestrands has joined #openstack-infra | 22:50 | |
johnsom | Yeah, works for me now | 22:51 |
mordred | corvus: I have no idea if it's service worker related - or whatnot - it's definitely 'interesting' | 22:51 |
clarkb | I have released all of the locks on mirror-update after vos release for docs et al ran on afsdb01 | 22:52 |
clarkb | last remaining step is to merge https://review.openstack.org/#/c/642563/1 if infra-root can review that | 22:52 |
fungi | i think it got double-approved | 22:53 |
corvus | mordred: i deleted my localhost:3000 service worker and it fixed my local gitea | 22:54 |
corvus | i don't consider that to be a viable production fix, so i have not yet done that for opendev | 22:54 |
corvus | but i do think that points very strongly in the direction of service workers | 22:54 |
mordred | yeah | 22:54 |
fungi | is gitea storing different things in service workers than they're designed to be used for? | 22:55 |
fungi | seems really strange for that to affect logs/branding | 22:56 |
fungi | er, logos/branding | 22:56 |
mordred | could have something to do with how those things are being bundled? | 22:56 |
*** tkajinam has joined #openstack-infra | 22:56 | |
clarkb | looks like we do all our mirror update crons on even numbered hours so 0000UTC is when they will next run | 22:58 |
clarkb | I'll try to keep an eye on that in an hour | 22:58 |
clarkb | but as far as I can tell the afs fileservers are upgraded so I've moved them to the done section \o/ | 22:59 |
clarkb | if anyone wants to sort out the afs db server process I'm happy to help. I tried to dig into the process for that and couldn't really come up with anything good that didn't involve a proper outage | 22:59 |
clarkb | I'll probably start collecting info on an etherpad for that tomorrow so we can get started on something | 23:00 |
*** TheJulia has joined #openstack-infra | 23:00 | |
clarkb | #status log Upgraded afs01.dfw, afs02.dfw, and afs01.ord to Xenial from Trusty | 23:01 |
openstackstatus | clarkb: finished logging | 23:01 |
*** eernst has joined #openstack-infra | 23:05 | |
*** dustinc has joined #openstack-infra | 23:07 | |
*** eernst has quit IRC | 23:10 | |
*** rascasoft has quit IRC | 23:12 | |
openstackgerrit | Merged openstack-infra/project-config master: Revert "Disable wheel mirror updates for afs server upgrades" https://review.openstack.org/642563 | 23:14 |
*** mattw4 has quit IRC | 23:16 | |
clarkb | corvus: one thing I notice on afs01.dfw that isn't the case on afs02.dfw or afs01.ord is that /etc/openafs/ThisCell is a symlink. We seem to try to set it with puppet and get Mar 11 23:16:39 afs01 puppet-user[3501]: (/Stage[main]/Openafs::Client/File[/etc/openafs/ThisCell]) Ensure set to :present but file type is link so no content will be synced | 23:27 |
clarkb | corvus: to make that message go away can I safely copy the target of that symlink over the symlink? | 23:28 |
clarkb | this isn't a regression due to the upgrade so not urgent | 23:28 |
corvus | clarkb: i can't imagine that would be a problem, and it's not ringing a bell, so if we did that manually at some point (as opposed to some package install script somewhere) i can't recall. | 23:29 |
ianw | clarkb: it being a symlink sounds maybe like something debconf would do in the interactive install case? | 23:31 |
clarkb | ianw fungi fyi I abandoned https://review.openstack.org/#/c/641880/ | 23:32 |
clarkb | ianw: maybe? it's a symlink to the server/ThisCell file | 23:32 |
clarkb | the content between those files is the same on the other servers just not a symlink | 23:32 |
ianw | zbr: tried to summarise what i understand to be the issue and suggested maybe abstraction of the name would be clearer. very interested what fungi thinks | 23:33 |
fungi | ianw: where was this? on the etherpad? | 23:34 |
ianw | fungi: https://review.openstack.org/#/c/639951/ | 23:35 |
fungi | ahh, thanks | 23:36 |
*** threestrands_ has joined #openstack-infra | 23:42 | |
*** threestrands has quit IRC | 23:45 | |
corvus | fungi: i responded on https://review.openstack.org/642574 and +W. is that okay? | 23:45 |
fungi | yep, fine by m,e | 23:46 |
fungi | er, by me | 23:46 |
corvus | my plan was to re-key the intermediate registry today after that merged | 23:47 |
corvus | my plan is now to re-key the intermediate registry tomorrow | 23:47 |
corvus | then we can consider that to be in production | 23:47 |
fungi | sounds great | 23:47 |
*** rascasoft has joined #openstack-infra | 23:52 | |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add no_log entries to skopeo copy commands https://review.openstack.org/642574 | 23:57 |