*** mattw4 has quit IRC | 00:03 | |
*** mriedem has quit IRC | 00:09 | |
pabelanger | ianw: yah, maybe https://review.opendev.org/656905/ / https://review.opendev.org/657126 / https://review.opendev.org/656838 for the next release, should be good for me | 00:28 |
---|---|---|
pabelanger | then I can start work on fedora-30 and see what we are doing to grub | 00:29 |
*** ijw has quit IRC | 00:30 | |
*** ijw has joined #openstack-infra | 00:30 | |
*** yamamoto has joined #openstack-infra | 00:34 | |
*** ijw has quit IRC | 00:36 | |
*** larainema has quit IRC | 00:42 | |
*** oyrogerg has joined #openstack-infra | 00:43 | |
*** Adri2000 has quit IRC | 00:46 | |
*** Lucas_Gray has quit IRC | 00:46 | |
*** larainema has joined #openstack-infra | 00:49 | |
*** Adri2000 has joined #openstack-infra | 00:51 | |
*** whoami-rajat has joined #openstack-infra | 01:16 | |
*** diablo_rojo has joined #openstack-infra | 01:17 | |
*** jamesmcarthur has joined #openstack-infra | 01:25 | |
*** yamamoto has quit IRC | 01:33 | |
*** logan- has quit IRC | 01:33 | |
*** logan- has joined #openstack-infra | 01:37 | |
*** jamesmcarthur has quit IRC | 01:43 | |
*** Swami has quit IRC | 01:44 | |
*** jamesmcarthur has joined #openstack-infra | 02:14 | |
*** hongbin has joined #openstack-infra | 02:19 | |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: Add python-path option to node https://review.opendev.org/637338 | 02:22 |
*** rh-jlabarre has quit IRC | 02:23 | |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: Implement a Runc driver https://review.opendev.org/535556 | 02:24 |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: Implement an OpenShift Pod provider https://review.opendev.org/590335 | 02:25 |
*** jamesmcarthur has quit IRC | 02:30 | |
*** udesale has joined #openstack-infra | 02:42 | |
*** jamesmcarthur has joined #openstack-infra | 02:43 | |
*** rcernin has quit IRC | 02:43 | |
*** samueldmq has quit IRC | 02:46 | |
logan- | http://logs.openstack.org/15/657415/2/gate/openstack-ansible-deploy-aio_metal-debian-stable/eb22e7e/job-output.txt.gz#_2019-05-07_02_52_49_354574 odd that this url 404'd | 02:55 |
*** jamesmcarthur has quit IRC | 03:04 | |
*** bhavikdbavishi has joined #openstack-infra | 03:05 | |
*** jamesmcarthur has joined #openstack-infra | 03:06 | |
*** ramishra has joined #openstack-infra | 03:10 | |
*** yamamoto has joined #openstack-infra | 03:20 | |
*** jamesmcarthur has quit IRC | 03:26 | |
*** rcernin has joined #openstack-infra | 03:29 | |
*** eernst has joined #openstack-infra | 03:30 | |
*** ykarel|away has joined #openstack-infra | 03:30 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Get executor job params https://review.opendev.org/607078 | 03:41 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Separate out executor server from runner https://review.opendev.org/607079 | 03:41 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Move repository preparation into common class https://review.opendev.org/648642 | 03:41 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Separate out executor concerns from AnsibleJob https://review.opendev.org/648643 | 03:41 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: runner: implement prep-workspace https://review.opendev.org/607082 | 03:41 |
*** ekultails has quit IRC | 03:41 | |
*** raukadah is now known as chandankumar | 03:52 | |
*** jamesmcarthur has joined #openstack-infra | 03:56 | |
*** ykarel|away has quit IRC | 03:59 | |
*** e0ne has joined #openstack-infra | 04:06 | |
*** tkajinam has quit IRC | 04:07 | |
*** tkajinam has joined #openstack-infra | 04:08 | |
*** tkajinam has quit IRC | 04:08 | |
*** tkajinam has joined #openstack-infra | 04:08 | |
*** eernst has quit IRC | 04:10 | |
*** e0ne has quit IRC | 04:12 | |
*** diablo_rojo has quit IRC | 04:16 | |
*** ykarel|away has joined #openstack-infra | 04:18 | |
*** jamesmcarthur has quit IRC | 04:22 | |
*** ricolin has joined #openstack-infra | 04:26 | |
*** ykarel|away is now known as ykarel | 04:27 | |
*** hongbin has quit IRC | 04:33 | |
*** janki has joined #openstack-infra | 04:37 | |
openstackgerrit | Merged openstack/project-config master: openafs-client : update kdc servers https://review.opendev.org/657504 | 05:14 |
*** e0ne has joined #openstack-infra | 05:22 | |
*** ricolin has quit IRC | 05:24 | |
openstackgerrit | Merged zuul/zuul master: Tiny cleanup in change panel js https://review.opendev.org/655589 | 05:28 |
*** ijw has joined #openstack-infra | 05:29 | |
*** ijw_ has joined #openstack-infra | 05:31 | |
*** ijw_ has quit IRC | 05:33 | |
*** ijw has quit IRC | 05:34 | |
*** ijw has joined #openstack-infra | 05:35 | |
*** Weifan has joined #openstack-infra | 05:47 | |
*** Weifan has quit IRC | 05:47 | |
openstackgerrit | Merged openstack/diskimage-builder master: Only enable dbus-daemon for fedora-29 and below https://review.opendev.org/657126 | 05:47 |
*** Weifan has joined #openstack-infra | 05:48 | |
*** Weifan has quit IRC | 05:52 | |
*** ijw_ has joined #openstack-infra | 05:53 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: [wip] test pre-intsall of systemd https://review.opendev.org/657530 | 05:55 |
*** e0ne has quit IRC | 05:56 | |
*** ijw has quit IRC | 05:56 | |
*** rkukura has joined #openstack-infra | 05:57 | |
*** kopecmartin|off is now known as kopecmartin | 06:01 | |
*** lpetrut has joined #openstack-infra | 06:07 | |
*** pcaruana has joined #openstack-infra | 06:21 | |
*** pgaxatte has joined #openstack-infra | 06:24 | |
*** gthiemonge has joined #openstack-infra | 06:31 | |
*** jbadiapa has joined #openstack-infra | 06:33 | |
*** shardy has joined #openstack-infra | 06:35 | |
*** slaweq has joined #openstack-infra | 06:36 | |
*** e0ne has joined #openstack-infra | 06:46 | |
*** rpittau|afk is now known as rpittau | 06:47 | |
*** jtomasek has joined #openstack-infra | 06:50 | |
*** dciabrin has joined #openstack-infra | 06:52 | |
*** e0ne has quit IRC | 06:53 | |
*** armax has quit IRC | 07:03 | |
*** ginopc has joined #openstack-infra | 07:04 | |
*** yboaron_ has joined #openstack-infra | 07:05 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Separate out executor concerns from AnsibleJob https://review.opendev.org/648643 | 07:09 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: runner: implement prep-workspace https://review.opendev.org/607082 | 07:09 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: runner: add configuration schema https://review.opendev.org/640672 | 07:09 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: runner: add prepare-workspace command line interface https://review.opendev.org/644770 | 07:09 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: runner: add execute sub-command https://review.opendev.org/630944 | 07:09 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: runner: add job parameters listing https://review.opendev.org/644795 | 07:09 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: add depends-on support to the freeze job API https://review.opendev.org/639022 | 07:09 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: runner: add support for depends-on https://review.opendev.org/632064 | 07:09 |
*** tosky has joined #openstack-infra | 07:16 | |
*** ccamacho has joined #openstack-infra | 07:23 | |
*** tesseract has joined #openstack-infra | 07:26 | |
*** rcernin has quit IRC | 07:27 | |
*** jpich has joined #openstack-infra | 07:29 | |
*** hwoarang has quit IRC | 07:32 | |
*** hwoarang has joined #openstack-infra | 07:34 | |
*** oyrogerg has quit IRC | 07:39 | |
*** ramishra_ has joined #openstack-infra | 07:39 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: A reporter for Elasticsearch with the capability to index build and buildset results in an index. https://review.opendev.org/644927 | 07:41 |
*** ramishra has quit IRC | 07:42 | |
*** amoralej|off is now known as amoralej | 07:46 | |
*** kjackal has joined #openstack-infra | 07:50 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: A reporter for Elasticsearch with the capability to index build and buildset results in an index. https://review.opendev.org/644927 | 07:56 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Separate out executor concerns from AnsibleJob https://review.opendev.org/648643 | 07:56 |
*** jpena|off is now known as jpena | 08:00 | |
*** trident has quit IRC | 08:01 | |
*** trident has joined #openstack-infra | 08:02 | |
*** ccamacho has quit IRC | 08:03 | |
*** quiquell has quit IRC | 08:04 | |
*** lucasagomes has joined #openstack-infra | 08:04 | |
*** quiquell has joined #openstack-infra | 08:05 | |
*** rossella_s has joined #openstack-infra | 08:06 | |
*** ctr has joined #openstack-infra | 08:10 | |
*** ykarel is now known as ykarel|lunch | 08:17 | |
*** ctr has left #openstack-infra | 08:23 | |
*** tkajinam has quit IRC | 08:35 | |
*** pkopec has joined #openstack-infra | 08:38 | |
*** priteau has joined #openstack-infra | 08:44 | |
*** cmoura has quit IRC | 08:45 | |
*** ralonsoh has joined #openstack-infra | 08:48 | |
*** ykarel|lunch is now known as ykarel | 08:52 | |
ianw | pabelanger: released with those f29 fixes. i think it makes some sense that f29 introduced enablement there, and possibly systemd isn't set up at the point we install dbus. https://review.opendev.org/#/c/657530/ shows that systemd brings in dbus anyway, so i don't know how to break that circular dependency | 09:01 |
openstackgerrit | Merged openstack/diskimage-builder master: Document the various global filesystem options https://review.opendev.org/656255 | 09:03 |
*** ykarel_ has joined #openstack-infra | 09:17 | |
*** ykarel has quit IRC | 09:18 | |
pabelanger | ianw: thanks, I'll test out today | 09:21 |
*** electrofelix has joined #openstack-infra | 09:21 | |
openstackgerrit | Wenqing Gu proposed opendev/glean master: Sync when writing the file https://review.opendev.org/652238 | 09:25 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Split the base playbook into services https://review.opendev.org/656871 | 09:41 |
mordred | pabelanger: I feel as if we're both up a smidge early | 09:41 |
pabelanger | mordred: IKR | 09:46 |
*** jaosorior has joined #openstack-infra | 09:48 | |
*** ykarel__ has joined #openstack-infra | 09:56 | |
*** ykarel__ is now known as ykarel | 09:56 | |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Split the base playbook into services https://review.opendev.org/656871 | 09:56 |
*** ykarel_ has quit IRC | 09:57 | |
*** zbr has joined #openstack-infra | 10:04 | |
*** kjackal has quit IRC | 10:06 | |
*** yumapath has joined #openstack-infra | 10:07 | |
yumapath | [15:31] <yumapath> hi folks, I am running zuul v3 and it was a working system; a couple of weeks back it stopped working, and from the logs we figured out that review.openstack.org has moved to review.opendev.org, so we made the necessary changes to the system [15:32] <yumapath> after making the changes we are seeing the following errors [15:32] <yumapath> openstack/networking-cisco - undefined (undefined) Zuul encountered an error while | 10:07 |
yumapath | dev/devstack. The error was: [Errno -2] Name or service not known [15:32] <yumapath> and zuul scheduler is not starting [15:32] <yumapath> May 07 09:03:26 cinder-zuulv3 zuul-scheduler[10921]: Traceback (most recent call last): May 07 09:03:26 cinder-zuulv3 zuul-scheduler[10921]: File "/usr/local/lib/python3.6/dist-packages/zuul/driver/gerrit/gerrit May 07 09:03:26 cinder-zuulv3 zuul-scheduler[10921]: key_filename | 10:07 |
pabelanger | yumapath: there was a large migration of git repos to opendev a few weeks ago, so you are likely seeing breakage of where the repos now live. | 10:08 |
pabelanger | so, you might want to update to https://opendev.org/x/networking-cisco | 10:08 |
pabelanger | this is also because redirects don't work via git:// | 10:09 |
pabelanger | which zuul used when I last looked | 10:09 |
mordred | pabelanger, yumapath: almost - zuul should be using ssh to connect to gerrit, but gerrit doesn't do redirects | 10:10 |
mordred | also - there is no more git:// | 10:10 |
pabelanger | ah | 10:10 |
yumapath | @pabelanger and @mordred i have updated the url to opendev.org | 10:12 |
yumapath | the x factor for cinder for example is still openstack/cinder | 10:12 |
yumapath | even for that it is failing | 10:12 |
yumapath | zuul scheduler is failing to start | 10:13 |
yumapath | with the following error | 10:13 |
pabelanger | yumapath: can you pastebin the error that you get from scheduler-debug.log | 10:13 |
yumapath | ok sure | 10:13 |
openstackgerrit | Merged openstack/diskimage-builder master: Allow specification of filesystem journal size https://review.opendev.org/633368 | 10:14 |
openstackgerrit | Merged openstack/diskimage-builder master: Support defining the free space in the image https://review.opendev.org/655127 | 10:14 |
yumapath | http://paste.openstack.org/show/750855/ | 10:15 |
yumapath | @pabelanger http://paste.openstack.org/show/750855/ | 10:15 |
yumapath | that is the error | 10:15 |
*** ijw_ has quit IRC | 10:17 | |
yumapath | @mordred what does it mean to say there is no more git:// | 10:17 |
yumapath | what is the equivalent i should be using instead of git:// as base protocol | 10:17 |
yumapath | https?? | 10:17 |
mordred | yumapath: there is no git:// protocol support in opendev anymore ... https:// should be used instead | 10:17 |
yumapath | everywhere | 10:17 |
yumapath | ok | 10:17 |
*** ijw has joined #openstack-infra | 10:17 | |
yumapath | i just checked | 10:18 |
yumapath | am already using https | 10:18 |
yumapath | everywhere | 10:18 |
mordred | good! | 10:18 |
*** pcaruana has quit IRC | 10:19 | |
yumapath | and not using git | 10:19 |
mordred | that error from your paste looks like it can't connect to gerrit - is it possible there is a typo when you switched to review.opendev.org in your config files? | 10:19 |
*** kjackal has joined #openstack-infra | 10:19 | |
pabelanger | yah | 10:19 |
yumapath | [gerrit] port = 29418 server = https://review.opendev.org sshkey = /var/lib/zuul/.ssh/id_rsa | 10:20 |
yumapath | this is what i have in my zuul.conf | 10:20 |
yumapath | am able to manually connect to review.opendev.org | 10:20 |
pabelanger | yumapath: I think you need to drop https | 10:20 |
pabelanger | server = review.opendev.org | 10:21 |
yumapath | oh | 10:21 |
pabelanger | then setup password | 10:21 |
mordred | yeah | 10:21 |
pabelanger | that will then use https to connect | 10:21 |
yumapath | what password should i setup | 10:21 |
pabelanger | you get that from gerrit | 10:21 |
mordred | pabelanger: it'll use ssh ... but you're right | 10:21 |
mordred | just drop the https:// ... | 10:21 |
pabelanger | https://review.opendev.org/#/settings/http-password | 10:21 |
yumapath | ok | 10:21 |
mordred | you've got an ssh key and port defined correctly there | 10:22 |
*** bhavikdbavishi has quit IRC | 10:22 | |
yumapath | ok | 10:22 |
yumapath | and where do i put this password | 10:22 |
mordred | you don't need one | 10:22 |
*** ijw has quit IRC | 10:22 | |
mordred | you've got an ssh key | 10:22 |
yumapath | ok sure | 10:22 |
yumapath | i'll just drop the https:// | 10:22 |
yumapath | and restart services and check | 10:22 |
mordred | yeah - give that a try | 10:22 |
mordred | ++ | 10:22 |
pabelanger | Yah, password in connection just means zuul will use https, so it can post comments on reviews | 10:23 |
yumapath | ok | 10:23 |
yumapath | i get the point | 10:23 |
yumapath | and i will be using ssh | 10:23 |
yumapath | as i have given gerrit keys | 10:24 |
yumapath | 2019-05-07 10:30:35,941 INFO zuul.ConfigLoader: Loading configuration from /etc/zuul/main.yaml 2019-05-07 10:30:47,152 ERROR zuul.GerritConnection: Cannot get references from openstack/networking-cisco 2019-05-07 10:30:48,161 ERROR zuul.GerritConnection: Cannot get references from openstack-dev/devstack 2019-05-07 10:30:49,186 ERROR zuul.GerritConnection: Cannot get references from openstack-infra/devstack-gate 2019-05-07 | 10:26 |
yumapath | that error went away | 10:26 |
yumapath | now seeing this error | 10:26 |
yumapath | does that mean these repos have moved to some other location? | 10:26 |
mordred | yes - openstack-dev/devstack is now openstack/devstack - openstack-infra/devstack-gate is now openstack/devstack-gate | 10:29 |
mordred | you can check the new locations by going to, for instance, https://opendev.org/openstack-infra/devstack-gate - and seeing where it redirects you to | 10:29 |
yumapath | thanks much mordred and pabelanger | 10:33 |
yumapath | i will take it forward | 10:33 |
pabelanger | ++ | 10:34 |
mordred | cool - good luck! | 10:34 |
*** udesale has quit IRC | 10:42 | |
*** udesale has joined #openstack-infra | 10:42 | |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Split the base playbook into services https://review.opendev.org/656871 | 10:47 |
openstackgerrit | Wenqing Gu proposed opendev/glean master: Sync when writing the file https://review.opendev.org/652238 | 10:50 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: web: honor job dependencies in trigger modal https://review.opendev.org/657567 | 10:51 |
*** pcaruana has joined #openstack-infra | 10:55 | |
*** panda is now known as panda|lunch | 10:56 | |
*** ijw has joined #openstack-infra | 10:57 | |
*** jpena is now known as jpena|lunch | 11:01 | |
*** amoralej is now known as amoralej|lunch | 11:02 | |
*** ijw has quit IRC | 11:05 | |
*** priteau has quit IRC | 11:11 | |
*** ykarel is now known as ykarel|afk | 11:13 | |
*** rpittau has quit IRC | 11:14 | |
*** rpittau has joined #openstack-infra | 11:17 | |
*** yamamoto has quit IRC | 11:18 | |
*** Lucas_Gray has joined #openstack-infra | 11:22 | |
pabelanger | ianw: okay, fedora-29 build properly with 2.22.0 release | 11:29 |
pabelanger | booting now | 11:29 |
pabelanger | ianw: and glean python3 works | 11:35 |
pabelanger | ianw: thanks for release ! | 11:36 |
pabelanger | now on to fedora-30 | 11:36 |
ianw | pabelanger: nice! i'll check in the morning :) | 11:37 |
ianw | should have some time to work on it too | 11:38 |
pabelanger | ++ | 11:39 |
*** yumapath has quit IRC | 11:46 | |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Split the base playbook into services https://review.opendev.org/656871 | 11:47 |
*** ykarel|afk is now known as ykarel | 11:48 | |
*** yamamoto has joined #openstack-infra | 11:49 | |
*** lyarwood has quit IRC | 11:51 | |
*** yamamoto has quit IRC | 11:53 | |
*** yamamoto has joined #openstack-infra | 11:53 | |
*** yamamoto has quit IRC | 11:54 | |
*** yamamoto has joined #openstack-infra | 11:54 | |
*** yamamoto has quit IRC | 11:54 | |
*** bhavikdbavishi has joined #openstack-infra | 11:55 | |
*** yamamoto has joined #openstack-infra | 11:56 | |
*** jpena|lunch is now known as jpena | 12:00 | |
mordred | corvus: ^^ I think that's close - I think we're still going to hit a timeout on gitea though - since it's creating all of the repos in project-config | 12:02 |
*** jamesmcarthur has joined #openstack-infra | 12:02 | |
mordred | (and we know that's less than optimal efficiency-wise) | 12:02 |
*** mriedem has joined #openstack-infra | 12:05 | |
openstackgerrit | Paul Belanger proposed openstack/diskimage-builder master: Use fedora-release-common for fedora 30+ https://review.opendev.org/656905 | 12:08 |
*** larainema has quit IRC | 12:12 | |
openstackgerrit | Merged opendev/irc-meetings master: Removing unused meeting time for the Public Cloud WG. https://review.opendev.org/656993 | 12:13 |
*** amoralej|lunch is now known as amoralej | 12:18 | |
*** nicolasbock has joined #openstack-infra | 12:20 | |
*** rh-jlabarre has joined #openstack-infra | 12:21 | |
*** rlandy has joined #openstack-infra | 12:25 | |
*** jamesmcarthur has quit IRC | 12:28 | |
*** rfolco|ruck is now known as rfolco|dentist | 12:29 | |
*** rh-jlabarre has quit IRC | 12:30 | |
*** rh-jlabarre has joined #openstack-infra | 12:30 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Separate out executor concerns from AnsibleJob https://review.opendev.org/648643 | 12:31 |
*** jcoufal has joined #openstack-infra | 12:32 | |
mordred | ianw: http://logs.openstack.org/71/656871/15/check/system-config-run-letsencrypt/cee06b7/job-output.txt.gz#_2019-05-07_11_59_54_555538 | 12:39 |
mordred | ianw: running letsencrypt playbooks in system-config test job seems to point to there being something missing from the ansible | 12:39 |
*** aaronsheffield has joined #openstack-infra | 12:45 | |
*** quiquell has quit IRC | 12:54 | |
*** yamamoto has quit IRC | 12:57 | |
*** yamamoto has joined #openstack-infra | 12:58 | |
*** jcoufal has quit IRC | 13:03 | |
*** yamamoto has quit IRC | 13:03 | |
openstackgerrit | Thomas Bechtold proposed opendev/irc-meetings master: rpm-packaging: Move meeting time https://review.opendev.org/657596 | 13:07 |
*** lseki has joined #openstack-infra | 13:09 | |
*** ijw has joined #openstack-infra | 13:18 | |
*** rfarr has joined #openstack-infra | 13:19 | |
*** Goneri has joined #openstack-infra | 13:22 | |
*** bhavikdbavishi has quit IRC | 13:22 | |
*** ekultails has joined #openstack-infra | 13:22 | |
*** ijw has quit IRC | 13:23 | |
*** bobh has joined #openstack-infra | 13:25 | |
*** bobh has quit IRC | 13:25 | |
*** panda|lunch is now known as panda | 13:25 | |
clarkb | mordred: LE in testing talks to LE's test servers, then we pretend and use a local self-signed cert | 13:30 |
clarkb | mordred: there is a flag to do that and I think we have to supply the self signed cert too | 13:31 |
*** yamamoto has joined #openstack-infra | 13:32 | |
*** rkukura has quit IRC | 13:35 | |
*** e0ne has joined #openstack-infra | 13:36 | |
*** yamamoto has quit IRC | 13:37 | |
*** e0ne has quit IRC | 13:39 | |
*** lpetrut has quit IRC | 13:40 | |
*** jaosorior has quit IRC | 13:49 | |
*** quiquell has joined #openstack-infra | 13:50 | |
*** quiquell is now known as quiquell|bbl | 13:50 | |
*** ijw has joined #openstack-infra | 13:53 | |
*** yboaron_ has quit IRC | 13:58 | |
*** rosmaita has joined #openstack-infra | 13:58 | |
mordred | clarkb: in this case we're missing either rndc.conf or rndc.key | 13:58 |
mordred | clarkb: and I don't see anywhere in our ansible that installs one of those | 13:59 |
corvus | mordred: \o/ i left some questions on the patch -- the first one (about puppet on trusty) i'm really confused about and might be more of an irc question. | 13:59 |
mordred | clarkb: also - unrelated, but related to the run-base job ... I was right about timeouts - so next step is that I'm going to rewrite the gitea repo creation stuff as a single python module | 14:00 |
mordred | corvus: let me go read them | 14:00 |
corvus | mordred: re gitea: cool -- i agree that's the best end state. i was wondering if we could defer that by omitting the puppet playbook from the gitea job temporarily, but maybe it's better to just go ahead and write the module; we need it anyway. | 14:01 |
corvus | mordred: i'm kind of thrilled that the test is failing because it's so realistic :) | 14:01 |
*** ccamacho has joined #openstack-infra | 14:06 | |
mordred | corvus: responded on the change - although, I like the idea of deferring the rewrite and skipping the playbook for now | 14:06 |
mordred | just because this change is already pretty big | 14:07 |
mordred | and yes - it is thrilling that it's so real :) | 14:07 |
mordred | I like the concept of running this with our actual data - since it provides motivation to make the system work more better | 14:08 |
corvus | mordred: thx. your hunch is enough to satisfy my curiosity about the puppet stuff | 14:08 |
mordred | \o/ | 14:08 |
corvus | i'm stumped by the rndc thing | 14:09 |
mordred | corvus: I had to add a package install already for the dns stack | 14:09 |
mordred | corvus: which makes me think we're just missing something in ansible that got put in place some other way or something | 14:10 |
mordred | but also - yes - stumped | 14:10 |
clarkb | mordred: ah that is going to be for the acme zone hosting in bind | 14:11 |
corvus | oh i don't see the master-nameserver role running | 14:12 |
corvus | mordred: i think i got it; i'll leave comments | 14:13 |
*** lucasagomes is now known as lucas-brb | 14:14 | |
*** jistr is now known as jistr|call | 14:14 | |
mordred | corvus: yes! I think I also have it based on you saying that | 14:14 |
mordred | but look forward to your comments | 14:14 |
*** janki has quit IRC | 14:14 | |
corvus | mordred: done | 14:15 |
mordred | corvus: do we also need to put adns-letsencrypt.opendev.org into the adns group in gate-groups? | 14:15 |
corvus | hrm, i'm assuming the host matching was working from before | 14:16 |
openstackgerrit | Merged openstack/diskimage-builder master: Use fedora-release-common for fedora 30+ https://review.opendev.org/656905 | 14:16 |
corvus | mordred: adns: adns*.open*.org | 14:16 |
corvus | so i think it matches the prod group | 14:16 |
mordred | ah - ok, cool | 14:17 |
corvus | (not entirely sure why that wasn't just called "adns01.opendev.org" but we can try that later i guess) | 14:17 |
mordred | yeah | 14:17 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Split the base playbook into services https://review.opendev.org/656871 | 14:17 |
mordred | corvus: in a perfect world, that's going to be green | 14:18 |
*** ekultails has quit IRC | 14:18 | |
corvus | mordred: fingers crossed! | 14:18 |
*** pcaruana has quit IRC | 14:19 | |
*** rfolco|dentist is now known as rfolco|ruck | 14:20 | |
*** jistr|call is now known as jistr | 14:23 | |
*** tdasilva has joined #openstack-infra | 14:26 | |
*** apetrich has quit IRC | 14:29 | |
*** ekultails has joined #openstack-infra | 14:37 | |
*** armax has joined #openstack-infra | 14:37 | |
*** pcaruana has joined #openstack-infra | 14:38 | |
*** apetrich has joined #openstack-infra | 14:38 | |
*** efried is now known as efried_pto | 14:39 | |
*** ykarel is now known as ykarel|away | 14:39 | |
mordred | corvus: blast. gitea failed. investigating | 14:39 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Split the base playbook into services https://review.opendev.org/656871 | 14:41 |
mordred | that one was easy | 14:41 |
corvus | mordred: exciting! | 14:42 |
mordred | corvus: does this: http://logs.openstack.org/71/656871/17/check/system-config-run-gitea/aa90dac/job-output.txt.gz#_2019-05-07_14_51_29_402430 speak to you about what might be wrong there? | 15:01 |
mordred | corvus: (before I dig in further) | 15:02 |
*** bhavikdbavishi has joined #openstack-infra | 15:02 | |
*** tosky has quit IRC | 15:06 | |
*** kjackal has quit IRC | 15:06 | |
corvus | mordred: not immediately, no | 15:08 |
*** rfolco|ruck is now known as rfolco|lunch | 15:08 | |
mordred | corvus: kk. I'll figure it out | 15:10 |
*** chandankumar is now known as raukadah | 15:10 | |
*** imacdonn has quit IRC | 15:11 | |
*** imacdonn has joined #openstack-infra | 15:11 | |
*** pcaruana has quit IRC | 15:16 | |
*** ccamacho has quit IRC | 15:16 | |
rajinir | mmedvede: The ciwatch is not showing any results http://ciwatch.mmedvede.net/project?project=cinder&time=7+days , it used to work until last week. Any idea? | 15:18 |
*** ykarel_ has joined #openstack-infra | 15:18 | |
*** hamzy has quit IRC | 15:20 | |
*** ykarel|away has quit IRC | 15:21 | |
*** rfolco|lunch is now known as rfolco|ruck|free | 15:22 | |
*** rfolco|ruck|free is now known as rfolco|ruck | 15:22 | |
corvus | mordred: i think i grok -- we're not running the gitea role | 15:23 |
corvus | i think that's because of the interaction with gerrit via remote_puppet_git | 15:24 |
*** jamesmcarthur has joined #openstack-infra | 15:24 | |
corvus | mordred: so maybe we add the remote_puppet_git playbook to that job? | 15:24 |
*** ykarel_ has quit IRC | 15:25 | |
corvus | mordred: oh, derp, that's the thing you just removed | 15:26 |
corvus | so we need that in order for gitea to be installed, but it also runs project creation which takes too long | 15:27 |
corvus | so we either need to write that python module, or else make a temporary "install gitea" playbook | 15:28 |
corvus | or, really, i guess we could just stick the install gitea step into the test playbook | 15:28 |
*** ijw has quit IRC | 15:28 | |
*** rkukura has joined #openstack-infra | 15:35 | |
*** michael-beaver has joined #openstack-infra | 15:36 | |
*** rkukura has quit IRC | 15:36 | |
aspiers | hogepodge, AJaeger, ttx: I only just discovered https://opendev.org/osf/four-opens/ via google. Presumably the intention is to publish it online at some point? | 15:39 |
*** lucas-brb is now known as lucasagomes | 15:39 | |
ttx | yeah, it's been a bit slow recently as we switched to summit prep | 15:40 |
dirk | who here can help me understand why the tarballs.openstack.org service tarballs are no longer getting updated? | 15:40 |
dirk | did we discontinue this service? or is it merely broken? | 15:41 |
*** pgaxatte has quit IRC | 15:42 | |
clarkb | dirk: zuul creates all of those tarballs so likely it is due to failing zuul jobs | 15:43 |
dirk | right, can you point me where those jobs are? | 15:43 |
clarkb | dirk: on the zuul status site there is a builds tab which we can use to look up those builds and work from there | 15:43 |
clarkb | I think if you search tarball you may get them | 15:44 |
dirk | no hit on "tarball" | 15:44 |
dirk | ah, on jobs tab | 15:45 |
clarkb | possibly they aren't running if they're not showing up in the builds tab | 15:45 |
*** bobh has joined #openstack-infra | 15:45 | |
dirk | yep, no runs | 15:45 |
dirk | ok, why did we lose those jobs? | 15:46 |
dirk | http://zuul.openstack.org/builds?job_name=publish-openstack-python-branch-tarball&project=openstack%2Fcinder&branch=stable%2Frocky | 15:46 |
dirk | last run jan 22 | 15:46 |
dirk | but there were merges to rocky afterwards | 15:46 |
*** hamzy has joined #openstack-infra | 15:48 | |
*** Lucas_Gray has quit IRC | 15:49 | |
*** bhavikdbavishi has quit IRC | 15:50 | |
*** rpittau is now known as rpittau|afk | 15:51 | |
clarkb | I wonder if it got caught in the split between xenial for rocky and older and bionic for train and newer | 15:51 |
clarkb | maybe that resulted in no valid job for older branches? do stein tarballs and/or master get published? | 15:52 |
dirk | master works | 15:54 |
dirk | stein as well | 15:54 |
dirk | so only older than stein | 15:54 |
clarkb | I suspect the node split then. | 15:55 |
*** gyee has joined #openstack-infra | 15:55 | |
fungi | likely they inherit from a job with a branch matcher, and tags don't match any branch | 15:56 |
clarkb | these are the per commit post jobs though | 15:56 |
clarkb | they should work with branch matchers | 15:56 |
fungi | oh, the branch tip tarballs? | 15:57 |
fungi | january 22 seems too early to have been the xenial->bionic transition | 15:57 |
clarkb | ya | 15:57 |
mordred | corvus: oh derp. and thanks - that makes total sense | 15:58 |
clarkb | there may not have been rocky commits after jan 22 and before the switch? | 15:58 |
fungi | possible | 15:58 |
fungi | hrm, nova's rocky tarball was last updated may 5 | 15:59 |
fungi | so it's not all projects impacted by this | 15:59 |
mordred | corvus: I'm going to do the "put it in the test playbook" for now | 15:59 |
*** e0ne has joined #openstack-infra | 16:02 | |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Split the base playbook into services https://review.opendev.org/656871 | 16:04 |
fungi | dirk: clarkb: http://zuul.opendev.org/t/openstack/builds?project=openstack%2Fcinder&branch=stable%2Frocky&pipeline=post | 16:04 |
mordred | corvus: this is a fun refactoring isn't it? | 16:04 |
fungi | there have been *no* post pipeline jobs for cinder since january 22 | 16:04 |
fungi | i wonder if that coincides with when we rolled in the zuul fix to disallow running jobs with secrets from other non-config projects? | 16:05 |
clarkb | oh! is this the thing where they need to stop running the loci job? | 16:05 |
fungi | is this the loci image build? | 16:05 |
clarkb | yup | 16:05 |
fungi | yeah | 16:05 |
clarkb | I think this is that | 16:05 |
fungi | specifically, maybe they missed backporting that removal to stable/rocky | 16:05 |
openstackgerrit | Adam Spiers proposed opendev/storyboard master: Clarify the rationale for StoryBoard's unique design https://review.opendev.org/657633 | 16:11 |
*** jamesmcarthur has quit IRC | 16:15 | |
*** manjeets__ is now known as manjeets | 16:21 | |
fungi | clarkb: dirk: confirmed, looks like https://review.opendev.org/636979 fixed it in master (before stable/stein was created) but nobody backported it to earlier cinder branches | 16:22 |
smcginnis | On it | 16:22 |
fungi | so publish-loci-cinder is included in stable/ocata through stable/rocky still | 16:23 |
fungi | dirk: did you happen to spot a similar problem with any other projects? | 16:24 |
dirk | fungi: so there was a security fix that silently made the job not run anymore instead of running with a big fat error? | 16:24 |
smcginnis | I had seen several other projects that had this issue with loci jobs, but the ones I knew about were able to fix it. At least in master. | 16:24 |
fungi | dirk: more specifically, there was a security fix which prevents that configuration from being valid for the post pipeline, but there's nowhere convenient to report that in gerrit since this is something which runs after changes merge | 16:24 |
*** whoami-rajat has quit IRC | 16:25 | |
fungi | basically zuul is unable to determine which jobs to run in the post pipeline for those branches because they include a request to run a job which is not available to that project | 16:27 |
*** nicolasbock has quit IRC | 16:29 | |
*** nicolasbock has joined #openstack-infra | 16:29 | |
dirk | hmm | 16:30 |
dirk | I guess a FAILURE with "no actual run happened due to error XXX" entry in the job listing search above would be great | 16:30 |
dirk | I can see more publish-loci* references. so all of them need to be removed? | 16:30 |
*** ginopc has quit IRC | 16:32 | |
dirk | smcginnis: thanks! | 16:34 |
dirk | smcginnis: can you ping and bribe cores for a few more +2s? | 16:34 |
smcginnis | :) | 16:35 |
smcginnis | I'll see if I can round some up. | 16:35 |
*** jtomasek has quit IRC | 16:40 | |
*** jpich has quit IRC | 16:41 | |
fungi | dirk: zuul does log the configuration failure to its service log... i'm not sure how we'd go about logging a failure to decide what jobs to run in the dashboard... maybe it would be relevant to the buildsets table? this is what the exception it's raising looks like http://paste.openstack.org/show/750882/ | 16:41 |
dirk | fungi: yep, that exception is a billion times more useful to know than getting no information at all about the builds not running ;) | 16:42 |
dirk | imho this would be just a FAILURE build actually with this error. but I can understand that it's difficult to build a link to a log that way.. | 16:42 |
*** whoami-rajat has joined #openstack-infra | 16:43 | |
fungi | i agree, the challenge is in deciding where is a useful place to expose it... the failure condition isn't specific to the builds table | 16:43 |
fungi | there is no "build" because it couldn't decide which builds to create | 16:43 |
*** lucasagomes has quit IRC | 16:44 | |
dirk | ah, ok. I begin to understand | 16:44 |
*** lucasagomes has joined #openstack-infra | 16:45 | |
dirk | fungi: can you find similar exceptions elsewhere? my attempts at grepping only turn up cinder | 16:45 |
*** e0ne has quit IRC | 16:45 | |
fungi | dirk: aha! http://zuul.opendev.org/t/openstack/buildsets?pipeline=post&project=openstack%2Fcinder&branch=stable%2Frocky | 16:46 |
fungi | it does indicate a config error there | 16:46 |
dirk | cool! | 16:47 |
fungi | the exceptions are in the zuul scheduler's service log, not in any job logs, which is why i used paste.o.o to show what it contains | 16:47 |
dirk | http://zuul.opendev.org/t/openstack/buildsets?pipeline=post&result=CONFIG_ERROR | 16:47 |
dirk | looks like bgpvpn is the next one on the list | 16:47 |
fungi | if you drop the pipeline filter you can see some similar config errors in periodic jobs too | 16:48 |
dirk | unfortunately there is no detail on the error | 16:53 |
dirk | fungi: can you look up why networking-bgpvpn fails? there is no loci job defined there.. | 16:53 |
fungi | dirk: all those seem to be for stable/ocata | 16:55 |
fungi | most recent is from a month ago so may be past the end of our scheduler log retention, but i'll see what i can find | 16:55 |
openstackgerrit | Brian Haley proposed openstack/project-config master: Fix stats for neutron grafana page https://review.opendev.org/657646 | 16:56 |
fungi | yeah, it failed when https://review.opendev.org/646566 merged on 2019-04-10 so we may have something | 16:57 |
*** kjackal has joined #openstack-infra | 16:59 | |
fungi | dirk: Exception: Unable to modify final job <Job publish-openstack-sphinx-docs branches: None source: opendev/base-jobs/zuul.yaml@master#112> attribute required_projects={'git.openstack.org/openstack/horizon': <zuul.model.JobProject object at 0x7f0aa0460a58>, 'git.openstack.org/openstack/networking-bagpipe': <zuul.model.JobProject object at 0x7f0aa0460da0>, | 17:00 |
fungi | 'git.openstack.org/openstack/networking-odl': <zuul.model.JobProject object at 0x7f0aa04606a0>} with variant <Job publish-openstack-sphinx-docs branches: None source: openstack/networking-bgpvpn/.zuul.yaml@stable/ocata#47> | 17:00 |
fungi | so likely openstack/networking-bgpvpn is trying to define a variant of publish-openstack-sphinx-docs on its stable/ocata branch | 17:01 |
*** armax has quit IRC | 17:02 | |
dirk | right, looks like that | 17:03 |
*** Weifan has joined #openstack-infra | 17:10 | |
*** jpena is now known as jpena|off | 17:10 | |
anteaya | `/msg clarkb I missed the meeting before summit, I'm reading the notes now | 17:11 |
anteaya | opps | 17:11 |
anteaya | that space | 17:11 |
anteaya | I was half and half on posting in channel, guess I'll keep going | 17:11 |
*** quiquell|bbl is now known as quiquell | 17:11 | |
anteaya | so clarkb the question is about survey.o.o cert | 17:11 |
anteaya | you mentioned riding off into the sunset | 17:12 |
anteaya | do you mean that you will renew the cert for two years and then no longer? | 17:12 |
*** udesale has quit IRC | 17:13 | |
fungi | that would be: get a survey.openstack.org cert, but also plan to make that a redirect to survey.opendev.org and only have a valid cert for the former for a couple years for backward-compatibility | 17:13 |
fungi | survey.opendev.org can get letsencrypt certs with little effort | 17:13 |
anteaya | ah, thank you | 17:14 |
*** bobh has quit IRC | 17:14 | |
*** armax has joined #openstack-infra | 17:18 | |
fungi | basically the openstack.org domain is hard for us to deal with since we don't have the ability currently to host it in the same infrastructure as the opendev.org or zuul-ci.org domains | 17:18 |
*** kjackal has quit IRC | 17:19 | |
anteaya | that makes sense | 17:21 |
anteaya | glad the new domains offer more expidited solutions | 17:21 |
anteaya | whoops, expedited | 17:22 |
openstackgerrit | Matt Riedemann proposed openstack/os-loganalyze master: Add watcher to filter list https://review.opendev.org/657652 | 17:26 |
*** raissa has joined #openstack-infra | 17:31 | |
openstackgerrit | Logan V proposed openstack/diskimage-builder master: Use megabyte granularity for image extra space https://review.opendev.org/657654 | 17:34 |
*** Weifan has quit IRC | 17:35 | |
*** Weifan has joined #openstack-infra | 17:36 | |
openstackgerrit | Merged zuul/zuul master: Use user.html_url for github reporter messages https://review.opendev.org/655188 | 17:38 |
*** Weifan has quit IRC | 17:40 | |
*** bobh has joined #openstack-infra | 17:42 | |
openstackgerrit | Merged zuul/zuul master: Add release-zuul-python to post pipeline https://review.opendev.org/655474 | 17:43 |
*** kopecmartin is now known as kopecmartin|off | 17:46 | |
*** electrofelix has quit IRC | 17:47 | |
*** lucasagomes has quit IRC | 17:50 | |
*** armax has quit IRC | 17:51 | |
openstackgerrit | David Shrewsbury proposed zuul/zuul-preview master: Partial code refactor for unit testing https://review.opendev.org/643666 | 17:53 |
openstackgerrit | David Shrewsbury proposed zuul/zuul-preview master: Add unit testing framework and sample test. https://review.opendev.org/644247 | 17:53 |
openstackgerrit | David Shrewsbury proposed zuul/zuul-preview master: Finish refactor https://review.opendev.org/644609 | 17:53 |
*** ykarel has joined #openstack-infra | 17:54 | |
smcginnis | clarkb: Just a pointer, this was related to that direct message I sent late last week: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005950.html | 17:55 |
clarkb | smcginnis: ah ya I don't think the gpus we have access to support virtualization, they are directly passed through | 17:56 |
clarkb | and they are served by VMs not baremetal. But mnaser can respond and clarify all of that | 17:56 |
clarkb | mnaser: ^ if you get a chance can you take a look at that email thread and respond with details on the vexxhost gpu flavor you've exposed to nodepool? | 17:56 |
*** kjackal has joined #openstack-infra | 17:56 | |
smcginnis | OK, cool. We would most likely have the same constraints trying to do something in OpenLab, so it probably makes sense to figure out if there is anything we can do here since it's for Nova. | 17:57 |
openstackgerrit | Michael Johnson proposed openstack/project-config master: Create 'Backport-Candidate' for Octavia repos https://review.opendev.org/657657 | 17:57 |
*** ykarel is now known as ykarel|away | 17:57 | |
clarkb | smcginnis: fwiw I wish these requests came with actual needs rather than solutions (e.g. "test nova vgpu support", not "we need cards with vgpu") | 17:57 |
clarkb | it helps a lot to understand the use cases as there may be more than one solution available | 17:58 |
smcginnis | True | 17:58 |
*** ralonsoh has quit IRC | 18:00 | |
clarkb | We are about an hour from the infra meeting. I'm still trying to catch up on life after being away for far too long, but I'll be there and will do my best to run a meeting :) | 18:02 |
*** Weifan has joined #openstack-infra | 18:04 | |
*** kjackal has quit IRC | 18:05 | |
clarkb | Shrews: memory on the nodepool nodes looks good. I think we need longer term data to declare things fixed, but at least we haven't regressed or obviously not fixed it :) | 18:05 |
Shrews | clarkb: yeah, i don't think we've run nearly long enough to make an evaluation, except for maybe "things haven't yet gone boom" | 18:06 |
Shrews | maybe after a week | 18:06 |
*** mattw4 has joined #openstack-infra | 18:06 | |
*** quiquell is now known as quiquell|off | 18:08 | |
*** ijw has joined #openstack-infra | 18:08 | |
Shrews | clarkb: how do we get rid of the nb04 entry in cacti? I can't find it anywhere in system-config | 18:09 |
Shrews | maybe it requires a restart.... | 18:09 |
clarkb | Shrews: I believe we have to manually edit the database. We have a script that enrolls new hosts but nothing that checks the diff to remove old ones | 18:09 |
Shrews | oh | 18:10 |
clarkb | we need to remove the git0* servers too (as well as delete them entirely) | 18:10 |
*** armax has joined #openstack-infra | 18:12 | |
fungi | yeah, we haven't auto-removed old hosts for a couple of reasons: | 18:13 |
fungi | it's sometimes useful to look at the resource utilization of an old host we no longer have | 18:14 |
*** ijw has quit IRC | 18:14 | |
fungi | it's helpful in the case of server replacements to avoid blowing away history we intend to resume updating with a new server | 18:14 |
*** ijw has joined #openstack-infra | 18:14 | |
fungi | (though we don't do much of the latter any longer, so those cases are even moreso covered by the former point) | 18:15 |
*** ijw has quit IRC | 18:15 | |
*** ijw has joined #openstack-infra | 18:16 | |
openstackgerrit | Michael Johnson proposed openstack/project-config master: Create 'Backport-Candidate' for Octavia repos https://review.opendev.org/657657 | 18:16 |
*** ijw has quit IRC | 18:17 | |
*** ijw has joined #openstack-infra | 18:17 | |
*** armax has quit IRC | 18:17 | |
*** ijw has quit IRC | 18:19 | |
*** ijw has joined #openstack-infra | 18:20 | |
openstackgerrit | Paul Belanger proposed zuul/nodepool master: Add release-zuul-python to post pipeline https://review.opendev.org/657658 | 18:21 |
*** hwoarang has quit IRC | 18:24 | |
*** hwoarang has joined #openstack-infra | 18:26 | |
*** Goneri has quit IRC | 18:33 | |
*** Weifan has quit IRC | 18:44 | |
*** ykarel|away has quit IRC | 18:45 | |
*** zbr is now known as zbr|pto | 18:45 | |
*** hwoarang has quit IRC | 18:47 | |
*** hwoarang has joined #openstack-infra | 18:49 | |
*** bobh has quit IRC | 18:55 | |
*** Weifan has joined #openstack-infra | 19:05 | |
*** Weifan has quit IRC | 19:10 | |
*** Weifan has joined #openstack-infra | 19:10 | |
*** tesseract has quit IRC | 19:10 | |
*** rkukura has joined #openstack-infra | 19:11 | |
*** raissa has quit IRC | 19:12 | |
*** yamamoto has joined #openstack-infra | 19:13 | |
*** Weifan has quit IRC | 19:18 | |
*** Goneri has joined #openstack-infra | 19:23 | |
tdasilva | mordred, corvus: what's the url to be used with encrypt_secret.py? | 19:27 |
*** jamesmcarthur has joined #openstack-infra | 19:27 | |
corvus | tdasilva: https://zuul.openstack.org/ | 19:28 |
corvus | tdasilva: or https://zuul.opendev.org/ --tenant=openstack | 19:28 |
fungi | tdasilva: we've got an openstack-specific example at https://docs.openstack.org/infra/manual/zuulv3.html#secret-variables | 19:30 |
*** Goneri has quit IRC | 19:30 | |
tdasilva | fungi: thanks!! that was very helpful... I had project just as `swift` as opposed to `openstack/swift` | 19:31 |
tdasilva | one thing I noticed about this is that a single project could only then have one secret associated with it, right? I can't associate a secret with a given third-party tool | 19:32 |
fungi | projects can define as many different named secrets as they like... am i misunderstanding the question? | 19:33 |
*** jamesmcarthur has quit IRC | 19:34 | |
mordred | tdasilva: yes, that is correct | 19:34 |
*** jamesmcarthur has joined #openstack-infra | 19:34 | |
*** kjackal has joined #openstack-infra | 19:34 | |
mordred | we've had some discussions about the broader use-case of "a collection of projects that would like to define a secret once and use it across their repos (and only their repos)" - I don't believe we have gotten all the way to solid answer/design yet though | 19:35 |
openstackgerrit | Merged openstack/diskimage-builder master: openssh-server: harden sshd config https://review.opendev.org/653890 | 19:35 |
*** jamesmcarthur has quit IRC | 19:35 | |
fungi | mordred: that sounds like the reverse of the question asked | 19:36 |
*** jamesmcarthur has joined #openstack-infra | 19:36 | |
tdasilva | right, so from the example fungi provided: `tools/encrypt_secret.py --infile file_with_secret \ | 19:36 |
tdasilva | --tenant openstack https://zuul.openstack.org openstack/kolla` | 19:36 |
fungi | a single project can have many secrets associated with it, but a secret can't be used by jobs for other projects than the project in which it's defined (unless it's in a global config project) | 19:36 |
tdasilva | for example: that secret is tied to kolla, not to kolla's docker hub account | 19:36 |
mordred | tdasilva: yes, that is correct | 19:37 |
mordred | fungi: I agree with you | 19:37 |
tdasilva | oh, so there's no id to the secret | 19:37 |
fungi | i'm not sure what that means | 19:37 |
mordred | well - there's a name for the secret, but zuul prevents it from being used outside of the project in which it was defined | 19:38 |
fungi | if what you're asking is how to make the kolla dockerhub credentials usable by jobs in multiple repositories, you need to encrypt it as a secret for each repository individually. but the openstack/kolla project can have jobs which make use of secrets for dockerhub and npm, for example | 19:38 |
mordred | this is so that devs from nova can't define a job that uses swift's dockerhub secret to upload stuff | 19:39 |
tdasilva | fungi: I understand now, it's just that zuul itself doesn't keep a mapping of which secret is for which service. It just has N secrets stored for swift, it's up to me to know which secret is for which third-party tool | 19:40 |
fungi | yeah, usually you name each secret with something memorable | 19:40 |
fungi | like npm-credentials or dockerhub-key or something | 19:41 |
tdasilva | got it! thanks | 19:42 |
*** jamesmcarthur has quit IRC | 19:42 | |
*** Weifan has joined #openstack-infra | 19:42 | |
fungi | tdasilva: an example in the wild: https://opendev.org/recordsansible/ara-web/src/branch/master/.zuul.yaml#L40 | 19:42 |
fungi | that one has a lengthy name, but you get the idea | 19:43 |
*** Goneri has joined #openstack-infra | 19:43 | |
*** Weifan has quit IRC | 19:45 | |
tdasilva | fungi: yep, understood now. Here's our patch: https://review.opendev.org/#/c/657046/5/.zuul.yaml | 19:46 |
*** yamamoto has quit IRC | 19:47 | |
ianw | infra-root: could i get one more eye on https://review.opendev.org/#/c/651053/ which removes the old linaro-cn1 cloud | 19:53 |
clarkb | ianw: done! | 19:53 |
clarkb | fungi: not right now because reasons. But I'll try to sync up with you on the groups removals before thursday so that we can make sure we've done all the necessary data backups first | 19:54 |
fungi | sounds good, though you should see them in the list of trove and nova snapshots in rackspace dfw | 19:54 |
clarkb | k | 19:54 |
fungi | i did them before the summit | 19:55 |
fungi | maybe a week before? | 19:55 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Remove graphite.openstack.org https://review.opendev.org/651104 | 19:55 |
fungi | and now to find a very late lunch which by this point is more like early dinner | 19:57 |
mordred | clarkb: https://review.opendev.org/#/c/657275/ | 19:58 |
mordred | fungi: ^^ (you too, mr TC face) | 19:58 |
mordred | clarkb: oh - should I also pull the osf/ repos out in that patch? | 19:58 |
clarkb | osf repos would be openstackid and friends? | 20:00 |
mordred | clarkb: yeah - and osf/groups and osf/groups-static-pages | 20:02 |
mordred | clarkb: I can do that as a follow up if you wanna look at them | 20:04 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Avoid testinfra 3.0.0 https://review.opendev.org/657308 | 20:04 |
mordred | clarkb: oh - and jjb | 20:04 |
ianw | ^ testinfra 3.0.1 released but probably worth keeping a note for posterity | 20:05 |
ianw | pabelanger: want to have a look at https://review.opendev.org/#/c/656912/ to add an environment variable to not run syncs under "timeout" before we start a manual sync for f30 mirror? | 20:07 |
clarkb | mordred: ya maybe in a followup so we don't mix together different sets | 20:07 |
mordred | clarkb: kk | 20:08 |
pabelanger | ianw: looking | 20:08 |
mordred | clarkb: what are we thinking about the repos that are in openstack-infra in governance, but which we did not decide to put in opendev/ and are still in openstack/ ? | 20:09 |
mordred | clarkb: just leave them alone for now? | 20:09 |
pabelanger | ianw: +2 | 20:10 |
*** smarcet has joined #openstack-infra | 20:10 | |
clarkb | mordred: ya I think those get uncoupled if/when opendev becomes more independent | 20:11 |
mordred | clarkb: cool. that makes sense | 20:11 |
mordred | clarkb: https://review.opendev.org/657676 Remove osf and jjb projects from Infrastructure | 20:12 |
clarkb | mordred: ianw cool I've gotten through the changes above really quick | 20:20 |
clarkb | and now I have to pop out for a bit again. | 20:20 |
ianw | mordred: not sure if you saw my mention of https://review.opendev.org/#/c/656908/ to fix up tox siblings with projects that don't have metadata in setup.cfg ... testinfra is one such project :) | 20:23 |
openstackgerrit | Merged opendev/system-config master: Remove linaro-cn1 region https://review.opendev.org/651053 | 20:23 |
mordred | ianw: yes! I just forgot to leave a vote :) | 20:23 |
ianw | mordred: thanks :) testinfra upstream were receptive to us running CI against them (added with https://review.opendev.org/#/c/657461/) ... | 20:26 |
ianw | i'll keep an eye as to how this plays out with splitting up base playbooks | 20:26 |
ianw | it might be good to just get that going for now anyway, to help avoid breakages to the status quo | 20:27 |
*** hamzy has quit IRC | 20:27 | |
*** diablo_rojo has joined #openstack-infra | 20:32 | |
openstackgerrit | Merged opendev/system-config master: Avoid testinfra 3.0.0 https://review.opendev.org/657308 | 20:33 |
*** slaweq has quit IRC | 20:36 | |
*** slaweq has joined #openstack-infra | 20:42 | |
openstackgerrit | James E. Blair proposed opendev/system-config master: Install os_client_config on bridge https://review.opendev.org/657685 | 20:46 |
corvus | mordred, clarkb: ^ i think that's the issue that's keeping k8s-on-openstack from running on bridge | 20:46 |
corvus | mind if i run that manually? | 20:47 |
mordred | corvus: wfm | 20:47 |
*** slaweq has quit IRC | 20:55 | |
*** kjackal has quit IRC | 21:04 | |
*** slaweq has joined #openstack-infra | 21:13 | |
*** diablo_rojo has quit IRC | 21:15 | |
clarkb | gitea seems unhappy | 21:17 |
clarkb | I cant hit the backends directly either | 21:18 |
* clarkb finds ssh keys | 21:19 | |
*** mattmceuen has joined #openstack-infra | 21:19 | |
clarkb | syslog complains about kubelet | 21:21 |
corvus | i can hit the backends directly | 21:21 |
clarkb | did the streams get crossed? | 21:21 |
corvus | hrm | 21:21 |
corvus | maybe unblocking the k8s cluster had a side effect | 21:21 |
clarkb | corvus: there is nothing listening on port 3000 on gitea01.opendev.org | 21:22 |
corvus | oh sorry, i can ssh in | 21:22 |
fungi | same for gitea02 | 21:22 |
clarkb | it appears to be repeatedly trying to run kubelet | 21:23 |
fungi | just ssh and smtp listening globally | 21:23 |
clarkb | there are no docker containers under docker ps -a | 21:23 |
fungi | maybe garbage collection gone awry? | 21:23 |
clarkb | I think kubernetes deployment gone awry | 21:24 |
fungi | no, that would be the images | 21:24 |
fungi | so yeah | 21:24 |
clarkb | https://review.opendev.org/#/c/656880/ is still unmerged | 21:24 |
corvus | yes but i applied it manually | 21:24 |
corvus | oh sorry that was the docker one | 21:24 |
corvus | the k8s_on_openstack playbook was running | 21:25 |
corvus | i killed it | 21:25 |
fungi | 657685 is what got applied manually, right | 21:25 |
*** slaweq has quit IRC | 21:25 | |
corvus | in the hopes that later hosts may be unaffected | 21:25 |
*** nicolasbock has quit IRC | 21:25 | |
*** nicolasbock has joined #openstack-infra | 21:25 | |
corvus | for some reason gitea06 is still working | 21:26 |
corvus | clarkb: do you want to remove all others from the lb? | 21:26 |
clarkb | gitea06 is the broken node not in the lb | 21:26 |
corvus | oh | 21:26 |
corvus | hrm | 21:26 |
fungi | right, we meant to remember to look into replacing 06 this week | 21:26 |
clarkb | in theory if we docker compose up on gitea01 it should bring all the containers back up again | 21:28 |
clarkb | and the data is bind mounted off the host | 21:28 |
corvus | i will do that | 21:28 |
clarkb | we may also need to stop whatever k8s things are trying to run there as they are what I suspect nuked docker | 21:29 |
corvus | ERROR: for giteadocker_mariadb_1 Cannot start service mariadb: invalid header field value "oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:359: container init caused \\\"rootfs_linux.go:53: mounting \\\\\\\"cgroup\\\\\\\" to rootfs \\\\\\\"/var/lib/docker/aufs/mnt/e61d27b8f390a48c66121e6b3010bc7521e3621597cfde55046899317264a2c4\\\\\\\" at | 21:29 |
corvus | \\\\\\\"/sys/fs/cgroup\\\\\\\" caused \\\\\\\"no subsystem for mount\\\\\\\"\\\"\"\n" | 21:29 |
corvus | i will try removing recently added packages | 21:31 |
corvus | and reinstalling docker | 21:31 |
*** sshnaidm has joined #openstack-infra | 21:31 | |
clarkb | docker --version reports docker 1.12.6 which is old(er) | 21:31 |
clarkb | ya ++ I think we want to reinstall from the upstream repo | 21:31 |
corvus | can someone work on stopping run_all.sh | 21:32 |
corvus | so we can safely remove the k8s playbook from it? | 21:32 |
fungi | on it | 21:32 |
*** sshnaidm is now known as sshnaidm|pto | 21:32 | |
clarkb | looks like we'll need to stop the running plays and then remove it from cron | 21:32 |
corvus | i'm starting to suspect this ran on every host | 21:33 |
fungi | #status log commented out run_all.sh in crontab of bridge.o.o | 21:34 |
clarkb | fungi: did you also stop the running process(es)? | 21:34 |
fungi | i've also manually killed the running script | 21:34 |
fungi | yes | 21:34 |
fungi | there are also some ansible-playbook processes owned by root on bridge.o.o | 21:35 |
fungi | should those be killed or is that one of us? | 21:35 |
clarkb | it isn't me | 21:35 |
*** Weifan has joined #openstack-infra | 21:36 | |
mordred | it isn't me | 21:36 |
fungi | i killed the parent, was over 10 minutes old | 21:36 |
fungi | so almost certainly not one of us | 21:36 |
*** mriedem has quit IRC | 21:36 | |
clarkb | is it ansible-playbook -v /opt/k8s-on-openstack/site.yaml that we suspect caused the problem | 21:37 |
mordred | probably | 21:37 |
mordred | yup. there's a hosts: all | 21:38 |
corvus | May 07 21:38:41 gitea01 systemd[1]: docker.socket: Failed with result 'service-start-limit-hit'. | 21:39 |
corvus | anyone know how to deal with that? | 21:39 |
mordred | no - but I'll start googling | 21:39 |
clarkb | that means we failed to start the unit multiple times so systemd gave up | 21:39 |
clarkb | probably need to look at journalctl output | 21:39 |
corvus | feel free to do anything on gitea01 to get docker started | 21:39 |
corvus | i believe i have uninstalled/reinstalled things appropriately there | 21:40 |
clarkb | May 07 21:37:58 gitea01 dockerd[1567]: Error starting daemon: error initializing graphdriver: /var/lib/docker contains several valid graphdrivers: aufs, overlay2; Please cleanup or explicitly choose storage driver (-s <DRIVER>) | 21:40 |
corvus | oh actually | 21:40 |
corvus | do we run from distro or upstream docker? | 21:40 |
clarkb | upstream | 21:40 |
*** rh-jlabarre has quit IRC | 21:40 | |
corvus | then let me redo that | 21:40 |
mordred | corvus: while you're installing/uninstalling | 21:41 |
mordred | uninstall kubelet kubeadm kubectl kubernetes-cni | 21:42 |
*** Goneri has quit IRC | 21:42 | |
corvus | mordred: i already did | 21:43 |
mordred | kk | 21:43 |
corvus | running docker-compose up | 21:43 |
mordred | gitea06 has overlay2 in /var/lib/docker - so I'm guessing it's overlay2 we want and not aufs | 21:43 |
clarkb | that sounds right iirc aufs is deprecated in favor of $newer things | 21:44 |
corvus | seems happier now | 21:45 |
clarkb | corvus: was it just the reinstall that you did to make it happier? | 21:45 |
fungi | i only know of aufs as the (ancient) apple unix file server | 21:45 |
corvus | yeah | 21:45 |
fungi | ahh, advanced unification filesystem | 21:45 |
*** Weifan has quit IRC | 21:46 | |
*** mattw4 has quit IRC | 21:46 | |
mordred | should we split up the remaining hosts? | 21:46 |
corvus | 1 sec | 21:46 |
mordred | kk | 21:46 |
corvus | does https://gitea01.opendev.org:3000/ look good? | 21:46 |
clarkb | I'm checking it now | 21:46 |
clarkb | I can browse zuul/zuul | 21:47 |
clarkb | it has the 3.8.1 tag from today | 21:47 |
corvus | this is my repair procedure: http://paste.openstack.org/show/750896/ | 21:47 |
corvus | does that look good? | 21:47 |
corvus | there may be some amount of "systemctl something docker" between lines 5 and 7 | 21:48 |
*** Goneri has joined #openstack-infra | 21:48 | |
corvus | that's unclear due to the hiccup | 21:48 |
clarkb | corvus: I think we may want to keep socat (or at least it doesn't hurt to have) | 21:48 |
clarkb | otherwise ya I think that looks about right | 21:48 |
corvus | well, it was installed for this | 21:48 |
clarkb | ah ok. | 21:48 |
corvus | this is the list of installed or upgraded packages: http://paste.openstack.org/show/750897/ | 21:49 |
mordred | yes - that looks good to me | 21:49 |
clarkb | (we'll also need to do the same on gitea-lb01.opendev.org I suspect and then similar to all the other hosts but without the docker compose up) | 21:49 |
corvus | ok, i'll take 2-3, clarkb you do 4-5, mordred you do 7-8 | 21:50 |
*** cfriesen has joined #openstack-infra | 21:50 | |
mordred | corvus: on it | 21:50 |
clarkb | k | 21:50 |
cfriesen | is opendev.org down? | 21:50 |
clarkb | cfriesen: yes | 21:50 |
*** mattw4 has joined #openstack-infra | 21:50 | |
cfriesen | kay. I'm not nuts. :) | 21:50 |
corvus | that did not work for me | 21:51 |
corvus | systemd failed to start it | 21:51 |
mordred | yeah. same here | 21:52 |
clarkb | third | 21:52 |
mordred | wow. -- The start-up result is RESULT. | 21:52 |
mordred | root@gitea07:~# docker --version | 21:53 |
mordred | Docker version 18.09.6, build 481bc77 | 21:53 |
mordred | that's what we're expecting, yes? | 21:53 |
corvus | hrm | 21:53 |
corvus | oh i see it | 21:53 |
corvus | sorry | 21:53 |
mordred | thank god | 21:53 |
clarkb | I get the same error I pasted above | 21:53 |
corvus | http://paste.openstack.org/show/750898/ | 21:54 |
corvus | i botched the rm's at the top | 21:54 |
clarkb | k rerunning with new paste commands | 21:54 |
corvus | apt-get --purge remove docker-ce docker-ce-cli docker-engine containerd.io ebtables ethtool socat | 21:54 |
corvus | and that's the remove command if you already ran it before | 21:55 |
corvus | and i still get the same error | 21:55 |
mordred | me too | 21:56 |
corvus | oh you know what | 21:56 |
corvus | when i installed the wrong version of docker and removed it, it removed /var/lib/docker | 21:56 |
mordred | and that didn't delete the volumes? | 21:56 |
clarkb | I don't think we use actual volumes | 21:57 |
*** ijw has quit IRC | 21:57 | |
clarkb | we use bind mounts | 21:57 |
corvus | i believe that's the case yeah | 21:57 |
corvus | to be conservative, we could try just moving it out of the way | 21:57 |
*** ijw has joined #openstack-infra | 21:57 | |
mordred | ok - in fact | 21:57 |
mordred | just move aufs out of the way | 21:57 |
mordred | I just tried that and it worked | 21:58 |
mordred | commands coming | 21:58 |
clarkb | fwiw there are no aufs layers in my aufs dir | 21:58 |
corvus | i think we should move the whole dir out of the way | 21:58 |
clarkb | (further evidence it is not the thing we want) | 21:58 |
mordred | http://paste.openstack.org/show/750899/ | 21:58 |
mordred | this is what I ran | 21:58 |
mordred | and it worked for me | 21:58 |
mordred | but I can also move the whole dir | 21:58 |
*** Goneri has quit IRC | 21:58 | |
*** mcornea has joined #openstack-infra | 21:58 | |
mordred | (you're gonna need to umount the aufs regardless I believe) | 21:59 |
corvus | let's do that: umount /var/lib/docker/aufs; mv /var/lib/docker /var/lib/docker-old | 21:59 |
mordred | ++ | 21:59 |
clarkb | k trying that now | 22:00 |
corvus | then run: apt-get install docker-ce | 22:00 |
corvus | to make dpkg happy | 22:00 |
clarkb | doing the apt-get remove, then umount + mv, then reinstall | 22:00 |
corvus | or that, yeah | 22:00 |
mordred | https://gitea07.opendev.org:3000/explore/repos | 22:01 |
mordred | I'm going to move to 8 | 22:01 |
corvus | it will take a moment or two for gitea to start, btw. | 22:01 |
mordred | yeah | 22:01 |
corvus | 7 lgtm | 22:01 |
*** cfriesen has left #openstack-infra | 22:02 | |
*** rfarr_ has joined #openstack-infra | 22:03 | |
clarkb | 05 is on its way up (waiting on images to pull) | 22:03 |
mordred | which is the correct docker version? | 22:03 |
clarkb | Docker version 18.09.6, build 481bc77 is what I have | 22:03 |
mordred | ok. cool. | 22:03 |
mordred | 8 is pulling and starting | 22:04 |
corvus | 02 and 03 are up running total: [123...7.] | 22:04 |
clarkb | I'll start on 04 in a moment when 05 lgtm | 22:04 |
mordred | k. 8 should be up | 22:04 |
clarkb | corvus: you want to do gitea-lb01 ? | 22:04 |
corvus | clarkb: yes | 22:05 |
mordred | I can't hit 8 | 22:05 |
mordred | https://gitea08.opendev.org:3000/ | 22:05 |
mordred | am I just typing wrong? | 22:05 |
clarkb | mordred: it takes a while for it to come up | 22:05 |
mordred | kk | 22:05 |
corvus | mordred: give it time; check the logs | 22:05 |
*** rfarr has quit IRC | 22:05 | |
corvus | 2019/05/07 22:01:36 [I] Git Version: 2.20.1 | 22:05 |
corvus | 2019/05/07 22:03:25 [I] Run Mode: Production | 22:05 |
corvus | it can sit between those 2 lines for a while | 22:06 |
mordred | ok. yes - I'm between them | 22:06 |
mordred | btw - jhesketh wins the prize for best new tidbit | 22:06 |
mordred | docker-compose logs --tail=100 -f | 22:06 |
mordred | thanks jhesketh ^^ ! | 22:06 |
clarkb | 04 is pulling images now and should be on its way up soon | 22:07 |
mordred | 8 is up | 22:07 |
*** smarcet has quit IRC | 22:07 | |
mordred | probably need to do insecure docker registry | 22:07 |
mordred | since it dockers | 22:07 |
mordred | and zuul-preview | 22:08 |
corvus | haproxy is up and so is https://opendev.org/ | 22:08 |
mordred | corvus: I'll do zuul-preview | 22:08 |
corvus | i think 4 is the only one outstanding | 22:08 |
clarkb | mordred: ya then we'll need to cleanup all the things | 22:09 |
corvus | i'll look at insecure-registry | 22:09 |
corvus | ftr, this is the final effective procedure: http://paste.openstack.org/show/750900/ | 22:10 |
mordred | btw - I can't ssh to zp01.opendev.org by hostname | 22:10 |
clarkb | non docker hosts will only need the first 4 lines right? | 22:11 |
corvus | mordred: i think i messed up the dns for that | 22:11 |
* mattmceuen thanks guys! | 22:11 | |
*** slaweq has joined #openstack-infra | 22:11 | |
mordred | zuul-preview should be done | 22:12 |
corvus | mattmceuen: sorry for the outage | 22:12 |
corvus | registry is done | 22:13 |
clarkb | should we maybe use ansible for the cleanup of non docker hosts ? | 22:13 |
corvus | probably -- but first -- are we concerned at all about the other packages that got upgraded? | 22:14 |
corvus | libc... glib... | 22:14 |
corvus | systemd | 22:14 |
clarkb | corvus: I believe those would get updated by unattended upgrades | 22:14 |
mordred | where did they get pulled in from? | 22:14 |
*** eernst has joined #openstack-infra | 22:14 | |
mordred | did we get a systemd from the k8s repo? or just from ubuntu | 22:14 |
mordred | ? | 22:14 |
*** eernst has quit IRC | 22:15 | |
corvus | oh.. hrm. maybe it was just a regular upgrade from ubuntu that happened to get pulled in... | 22:15 |
corvus | let's see if we can find out | 22:15 |
mordred | yeah. dpkg shows an ubuntu version | 22:15 |
clarkb | I'm looking at ubuntu package search to see when those updated | 22:15 |
mordred | ii systemd 237-3ubuntu10.15 amd64 system and service manager | 22:15 |
mordred | ii libc6:amd64 2.27-3ubuntu1 | 22:15 |
corvus | okay, that's probably sane then. so yeah, i think we can just do lines 1-4 | 22:16 |
mordred | ++ | 22:16 |
*** eernst has joined #openstack-infra | 22:16 | |
corvus | mordred: want to construct an ansible command to run that on all hosts except the ones we just did? | 22:16 |
mordred | probably just do hosts: all;!zp01;! ... | 22:16 |
mordred | yeah - lemme get a host string real quick | 22:16 |
clarkb | mordred: what host is that from? ii systemd 237-3ubuntu10.21 amd64 system and service manager is what I get from gitea04 which is bionic and matches ubuntu's package list for that | 22:16 |
corvus | k, i'll prepare to double check it | 22:17 |
*** rfarr__ has joined #openstack-infra | 22:17 | |
mordred | clarkb: it was zp01 | 22:17 |
clarkb | but I concur what I am looking at on gitea04 for systemd seems fine compared to ubuntu package listings | 22:17 |
mordred | corvus, clarkb: - hosts: 'all:!zp01*:!gitea*:insecure*' | 22:19 |
mordred | how does that look? | 22:19 |
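A rough sketch of how a colon-separated host pattern like the one above selects hosts. This is a simplified model, not Ansible's actual implementation: it handles only plain terms (added to the selection) and `!` terms (excluded), and ignores `&`. The function name `select_hosts` and the inventory hostnames are hypothetical, chosen for illustration. One thing the model makes visible is that a term without a leading `!` adds hosts instead of excluding them.

```python
from fnmatch import fnmatch


def select_hosts(pattern, inventory):
    """Simplified model of a ':'-separated host pattern.

    Terms starting with '!' exclude matching hosts; every other term adds
    matching hosts. The '&' intersection operator is not handled here.
    """
    terms = [t for t in pattern.split(":") if t]
    include = [t for t in terms if not t.startswith("!")]
    exclude = [t[1:] for t in terms if t.startswith("!")]

    def matches(host, globs):
        # 'all' is treated as matching every host; other terms are globs.
        return any(g == "all" or fnmatch(host, g) for g in globs)

    return [h for h in inventory
            if matches(h, include) and not matches(h, exclude)]


# Hypothetical inventory, for illustration only.
inventory = [
    "zp01.opendev.org",
    "gitea04.opendev.org",
    "insecure-ci-registry01.opendev.org",
    "review-dev01.openstack.org",
    "elasticsearch02.openstack.org",
]

# Without '!' the last term is a union, so the registry host stays selected.
print(select_hosts("all:!zp01*:!gitea*:insecure*", inventory))
# With '!' it is excluded.
print(select_hosts("all:!zp01*:!gitea*:!insecure*", inventory))
```

Checking a pattern this way (or with `ansible --list-hosts`) before running a destructive playbook is cheap compared with recovering a host that was cleaned up by mistake.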
*** sshnaidm|pto has quit IRC | 22:19 | |
clarkb | mordred: we should cross check against the emergency group too maybe? | 22:19 |
*** rfarr_ has quit IRC | 22:19 | |
clarkb | though I guess the earlier all that installed all the k8s won't have respected that either | 22:19 |
corvus | nothing's disabled right now | 22:20 |
mordred | yeah to both | 22:20 |
corvus | mordred: lgtm | 22:20 |
mordred | ethtool socat | 22:20 |
clarkb | ya lgtm too | 22:20 |
mordred | are those dangerous to blank-slate remote? | 22:21 |
mordred | remove? | 22:21 |
clarkb | mordred: we need socat on the zuul executors | 22:21 |
mordred | anyway - on bridge - /root/fix-issues.yaml | 22:21 |
clarkb | I'm somewhat inclined to just leave socat in place (it is a useful tool and shouldn't hurt if we have it on things) | 22:21 |
mordred | maybe let's leave socat out of this list - and do a different one if we want | 22:21 |
corvus | yeah, leaving socat wfm | 22:21 |
clarkb | I don't expect ethtool is used by any of our software | 22:21 |
*** rfarr__ has quit IRC | 22:21 | |
mordred | ebtables? | 22:21 |
corvus | also should be ok to remove i think | 22:22 |
clarkb | mordred: we don't use ebtables as far as I know | 22:22 |
mordred | kk - check /root/fix-issues.yaml | 22:22 |
clarkb | maybe remove that one on a test node first though to make sure it doesn't interact with iptables? | 22:22 |
corvus | well it didn't earlier | 22:23 |
corvus | on the gitea hosts | 22:23 |
mordred | good point | 22:23 |
mordred | http://paste.openstack.org/show/750901/ <-- in case anyone is following along without root to bridge | 22:23 |
clarkb | confirmed that iptables -L -n shows our expected ruleset on gitea04 | 22:23 |
*** slaweq has quit IRC | 22:24 | |
corvus | those last 4 were via "apt-get autoremove", so at least on the gitea hosts, they were dangling packages | 22:24 |
clarkb | mordred: that playbook lgtm. | 22:24 |
corvus | playbook lgtm | 22:24 |
mordred | corvus: should I maybe put in an apt-get autoremove -y ? | 22:25 |
corvus | mordred: meh, i don't think so | 22:25 |
clarkb | mordred: that will trigger kernel removals which is slow | 22:25 |
mordred | good point | 22:25 |
clarkb | I think for now let's not do that so that we converge on happy state quicker | 22:25 |
mordred | I'm going to start a screen session | 22:25 |
corvus | i made that command by doing a remove on all the packages except the last 4, then an autoremove | 22:25 |
corvus | then squashed the 4 things the autoremove did into the command i gave you | 22:25 |
mordred | oh - or, I guess I reconnected to a screen session | 22:25 |
*** mcornea has left #openstack-infra | 22:26 | |
mordred | what -f value do we think? 20? 40 since it's a simple playbook? | 22:26 |
corvus | we've used 50 for simple things on lots of hosts | 22:26 |
mordred | k. maybe let's do it with --limit=review-dev01.openstack.org just to double-check ourselves? | 22:27 |
corvus | k | 22:27 |
clarkb | wfm | 22:27 |
clarkb | looks like it didn't get as far as review-dev01? | 22:27 |
corvus | hrm, is there a way to tell apt-get remove to not fail on removing uninstalled packages? | 22:28 |
clarkb | fungi: ^ you know all the apt magic | 22:28 |
fungi | not that i'm aware of but checking | 22:28 |
fungi | you could mark them as automatically installed and then call apt autoremove | 22:29 |
mordred | corvus: maybe we just failed_when: false on the task? | 22:30 |
corvus | mordred: but i don't think it removes anything | 22:30 |
mordred | oh - right | 22:30 |
corvus | i guess we could string a bunch of commands together with | | 22:30 |
corvus | er, with || | 22:30 |
clarkb | and end with || true | 22:31 |
corvus | it's a very docker solution to a very docker problem | 22:31 |
mordred | hahaha | 22:31 |
mordred | wait - my brain isn't processing the right incantation here | 22:32 |
mordred | if kubelet isn't installed, then things didn't get that far, right? | 22:33 |
fungi | could filter the list of installed packages by the list of undesired packages and then call apt remove on that, i suppose | 22:33 |
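The filtering idea fungi describes could look roughly like the sketch below: parse a `dpkg -l` style listing, keep only the names that are actually installed, and intersect that with the unwanted list so the removal command never names a package the host lacks. The function names are hypothetical, and the sample listing is made up for illustration (the kubelet version in particular is a placeholder).

```python
def installed_packages(dpkg_listing):
    """Return the set of package names marked 'ii' in `dpkg -l` style output."""
    names = set()
    for line in dpkg_listing.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ii":
            # Drop an architecture qualifier such as ':amd64'.
            names.add(fields[1].split(":", 1)[0])
    return names


def packages_to_remove(dpkg_listing, unwanted):
    """Keep only the unwanted packages that are installed, in the given order."""
    installed = installed_packages(dpkg_listing)
    return [pkg for pkg in unwanted if pkg in installed]


# Made-up sample listing; versions are placeholders.
sample = """\
ii  systemd      237-3ubuntu10.15  amd64  system and service manager
ii  libc6:amd64  2.27-3ubuntu1     amd64  GNU C Library
ii  kubelet      0.0.0-example     amd64  node agent
"""

print(packages_to_remove(sample, ["kubelet", "kubeadm", "containerd.io"]))
```

An empty result would mean there is nothing to remove on that host, so the removal step can be skipped instead of failing.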
corvus | mordred: oooh... i see... that's what clarkb meant earlier | 22:34 |
corvus | mordred: i guess so? | 22:34 |
clarkb | mordred: ya my guess is that we'll either have all packages present or none | 22:34 |
mordred | how about just type kubelet && apt-get remove | 22:34 |
corvus | wfm | 22:34 |
clarkb | ya that seems to work on elasticsearch02 | 22:34 |
clarkb | you can use that as a test node as it seems to have been kubed | 22:34 |
mordred | that did not make review-dev happy | 22:35 |
corvus | oh | 22:35 |
corvus | i think the issue is *docker* | 22:35 |
mordred | oh. because different package names | 22:35 |
corvus | i don't think it was installed on hosts that didn't have it | 22:35 |
mordred | yeah | 22:35 |
corvus | docker-engine is the only "docker" package on review-dev | 22:36 |
* clarkb is not following, but expects yall understand | 22:36 | |
corvus | so i think you can drop those first 2 | 22:36 |
clarkb | ah | 22:36 |
*** mattmceuen has left #openstack-infra | 22:36 | |
corvus | clarkb: "docker-ce" was us all along, not the borg kube | 22:36 |
mordred | type kubelet && apt-get --purge remove -y kubernetes-cni cri-tools kubectl kubelet kubeadm | 22:36 |
mordred | so that? | 22:36 |
corvus | no | 22:36 |
corvus | mordred: what did you remove? | 22:37 |
corvus | mordred: just remove docker-ce and docker-ce-cli from the list | 22:37 |
mordred | oh - so you're saying just take docker-ce and docker-ce-cli out | 22:37 |
mordred | gotit | 22:37 |
mordred | type kubelet && apt-get --purge remove -y docker-engine kubernetes-cni cri-tools kubectl kubelet kubeadm containerd.io ebtables ethtool | 22:37 |
mordred | that | 22:37 |
corvus | yes, but 1 sec | 22:37 |
mordred | kk | 22:37 |
corvus | we may want to add aufs-tools and cgroupfs-mount to the list | 22:38 |
corvus | i see those as having been installed on review-dev01 | 22:38 |
mordred | k | 22:39 |
clarkb | I bet those we actually want on the docker hosts so weren't in the autoremove list | 22:39 |
mordred | let's try review-dev again | 22:39 |
corvus | ya | 22:39 |
corvus | i don't see an error there | 22:39 |
mordred | does apt-get remove return 100 on removing packages? | 22:40 |
clarkb | decimal 100 on error says the man page | 22:40 |
mordred | E: Unable to locate package containerd.io", "E: Couldn't find any package by glob 'containerd.io' | 22:41 |
mordred | so I think that goes along with docker-ce | 22:41 |
corvus | containerd.io was not installed | 22:41 |
corvus | agreed | 22:41 |
clarkb | makes sense, containerd is relatively new and the docker that was installed is a bit old | 22:42 |
clarkb | likely predates containerd | 22:42 |
mordred | E: Held packages were changed and -y was used without --allow-change-held-packages."] | 22:42 |
mordred | k8s-on-openstack puts in some holds (not sure why this didn't matter on gitea hosts though) | 22:42 |
corvus | did you use -y on the gitea hosts? | 22:43 |
mordred | oh! wait | 22:43 |
clarkb | I didn't use -y | 22:43 |
mordred | ubuntu-server is getting pulled in | 22:43 |
mordred | The following packages will be REMOVED:\n aufs-tools* cgroupfs-mount* cri-tools* docker-engine* ebtables* ethtool*\n kubeadm* kubectl* kubelet* kubernetes-cni* ubuntu-server*\nThe following held packages will be changed:\n docker-engine kubeadm kubectl kubelet kubernetes-cni | 22:43 |
*** panda has quit IRC | 22:44 | |
corvus | how is that happening? | 22:44 |
mordred | ethtool | 22:44 |
clarkb | ethtool is ya that | 22:45 |
corvus | ok let's keep ethtool i guess :) | 22:45 |
clarkb | wfm | 22:45 |
mordred | y'all ok with adding --allow-change-held-packages ? | 22:45 |
corvus | i reckon | 22:45 |
mordred | review-dev seems happy | 22:46 |
corvus | cool | 22:46 |
mordred | shall I try elasticsearch02 for confirmation? | 22:46 |
corvus | ++ | 22:46 |
*** panda has joined #openstack-infra | 22:46 | |
clarkb | I have a before dpkg -l as well | 22:46 |
mordred | k that worked | 22:46 |
fungi | it's getting less and less clear to me what happened and what's being cleaned up... did we install docker packages from the wrong place and they dragged in a bunch of dependencies we're now in need of uninstalling? | 22:47 |
corvus | i'm ready when clarkb is | 22:47 |
mordred | http://paste.openstack.org/show/750903/ | 22:47 |
clarkb | http://paste.openstack.org/show/750902/ is the diff from es02 dpkg -l | 22:47 |
corvus | fungi: we installed *k8s* everywhere | 22:47 |
clarkb | corvus: mordred ^ maybe double check that delta | 22:47 |
fungi | ahh | 22:47 |
clarkb | corvus: mordred but ya it lgtm I think we are about as ready as we'll be to proceed | 22:47 |
mordred | yeah. that list looks good | 22:47 |
clarkb | fungi: the playbook to install k8s used host: all | 22:48 |
clarkb | corvus: mordred oh wait | 22:48 |
mordred | yeah? | 22:48 |
clarkb | corvus: mordred the actual k8s clusters are in our inventory too | 22:48 |
clarkb | we'll likely break them if we run with the current !s | 22:48 |
fungi | we only need to uninstall the explicitly installed packages though, right? the dependencies would all have been marked as automatically installed and so autoremove should be able to remove them once the others are removed | 22:48 |
mordred | which actual k8s clusters? | 22:48 |
clarkb | mordred: the gitea one and the nodepool one | 22:48 |
mordred | the nodepool one is a magnum one, so that should be fine | 22:49 |
mordred | right? | 22:49 |
corvus | the nodepool one shouldn't be in the inventory, but the gitea one is | 22:49 |
clarkb | fungi: yes, except that will take forever to run because kernels will be removed too | 22:49 |
corvus | (though, killing it wouldn't be the end of the world) | 22:49 |
clarkb | fungi: but we could do that if you think it is prudent to be cautious | 22:49 |
fungi | got it | 22:49 |
fungi | nah, this is fine | 22:49 |
fungi | i forget about all the cruft kernels hanging around | 22:49 |
mordred | opendev-k8s* would be that cluster | 22:50 |
corvus | mordred: agreed | 22:50 |
mordred | ok. playbook updated to skip that cluster | 22:50 |
corvus | mordred: current rev lgtm | 22:50 |
mordred | one last chance to abort ... | 22:50 |
clarkb | I can't think of anything else at this point | 22:51 |
corvus | i say go | 22:51 |
*** hwoarang has quit IRC | 22:51 | |
clarkb | the ones that are failing are mirror nodes? I wonder if we just have to clean those out of inventory | 22:51 |
mordred | sorry - the exclusion line was wrong - left off the ! | 22:52 |
clarkb | mordred: did we accidentally remove docker again in places we should not have? | 22:52 |
mordred | should we put an || echo "no kubelet found" at the end? | 22:53 |
mordred | clarkb: no - I don't think so - we probably could have kept it running | 22:53 |
corvus | docker is still running on registry | 22:53 |
clarkb | and opendev.org is still up | 22:54 |
clarkb | (I guess it is docker-ce that would break our docker nodes but we remove docker-something) | 22:54 |
clarkb | so likely still good | 22:54 |
*** pkopec has quit IRC | 22:54 | |
mordred | if type kubelet ; then apt-get --purge remove -y --allow-change-held-packages docker-engine kubernetes-cni cri-tools kubectl kubelet kubeadm ebtables aufs-tools cgroupfs-mount ; fi | 22:54 |
mordred | how's that as a way to not get as much red? | 22:55 |
corvus | looks good in principle assuming the bash is right | 22:55 |
clarkb | maybe check it on a node first just to confirm the bash but ya lgtm | 22:55 |
mordred | no errors on elasticsearch02 at least | 22:56 |
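As a sketch of why the `if ... ; then ... ; fi` form produces less red than `type kubelet && apt-get ...`: an `if` whose condition is false exits with status 0, while a failed `&&` chain exits non-zero and marks the task as failed. The helper below only assembles the guarded command string from a package list; the name `build_cleanup_command` is hypothetical, and the flags are the ones quoted in the channel.

```python
import shlex


def build_cleanup_command(packages, guard="kubelet"):
    """Build a shell one-liner that purges `packages` only if `guard` exists.

    When the guard command is absent the `if` body is skipped and the shell
    exits 0, so hosts that never got the packages are not reported as failed.
    """
    pkgs = " ".join(shlex.quote(p) for p in packages)
    return (
        f"if type {shlex.quote(guard)} ; then "
        f"apt-get --purge remove -y --allow-change-held-packages {pkgs} ; fi"
    )


print(build_cleanup_command([
    "docker-engine", "kubernetes-cni", "cri-tools", "kubectl", "kubelet",
    "kubeadm", "ebtables", "aufs-tools", "cgroupfs-mount",
]))
```

Generating the string from a list keeps the package set in one reviewable place, which matters here because the list changed several times during the cleanup.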
*** Weifan has joined #openstack-infra | 22:56 | |
mordred | well - shall we go again? | 22:57 |
*** jamesmcarthur has joined #openstack-infra | 22:57 | |
corvus | ++ | 22:57 |
clarkb | ya gonna have to get through them at some point | 22:57 |
*** threestrands has joined #openstack-infra | 23:00 | |
mordred | corvus: so - while we're waiting on that - there's a thing about k8s-on-openstack that should be noted ... which is that it's designed for site.yml to be run when the cwd is the k8s-on-openstack repo ... doing so causes ansible-playbook to pick up the ansible.cfg from that directory when it runs, and that sets inventory = /dev/null | 23:00 |
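A simplified sketch of the config lookup mordred is describing. Ansible uses the first configuration file it finds, checking the `ANSIBLE_CONFIG` environment variable, then `ansible.cfg` in the current directory, then `~/.ansible.cfg`, then `/etc/ansible/ansible.cfg`. The function name, the directory paths, and the injected `exists` callback below are hypothetical, and details such as the world-writable-directory check are left out.

```python
import os


def resolve_ansible_cfg(cwd, env, exists=os.path.exists, home="~"):
    """Return the first config file in Ansible's documented search order."""
    candidates = []
    if env.get("ANSIBLE_CONFIG"):
        candidates.append(env["ANSIBLE_CONFIG"])
    candidates.append(os.path.join(cwd, "ansible.cfg"))
    candidates.append(os.path.join(os.path.expanduser(home), ".ansible.cfg"))
    candidates.append("/etc/ansible/ansible.cfg")
    for path in candidates:
        if exists(path):
            return path
    return None


# Hypothetical layout: the repo ships its own ansible.cfg.
files = {"/opt/k8s-on-openstack/ansible.cfg", "/etc/ansible/ansible.cfg"}

# Run from inside the repo: its ansible.cfg (and its inventory setting) wins.
print(resolve_ansible_cfg("/opt/k8s-on-openstack", {}, files.__contains__, "/root"))
# Run from anywhere else: the system-wide config and inventory apply instead.
print(resolve_ansible_cfg("/root", {}, files.__contains__, "/root"))
```

This is why the working directory matters for that playbook: the same `ansible-playbook` invocation targets a different inventory depending on where it is started.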
*** hwoarang has joined #openstack-infra | 23:00 | |
*** tkajinam has joined #openstack-infra | 23:00 | |
mordred | well, it finished before I finished typing that | 23:00 |
clarkb | mordred: it's done fwiw | 23:00 |
*** threestrands has quit IRC | 23:00 | |
clarkb | maybe skim over the list and double check there aren't any that failed that we expect to succeed? | 23:01 |
*** threestrands has joined #openstack-infra | 23:01 | |
clarkb | the unreachables I saw are expected to be unreachable (and at least one has a change up to remove it from inventory, the graphite.o.o host) | 23:01 |
mordred | yeah - the list looks fine to me | 23:01 |
corvus | do we have a change up to either remove the playbook or fix the invocation? | 23:02 |
clarkb | corvus: I don't think we have that yet | 23:02 |
mordred | let's just remove it | 23:03 |
*** bobh has joined #openstack-infra | 23:03 | |
mordred | we should rework it when we come back to it anyway | 23:04 |
corvus | i'll propose changes | 23:04 |
mordred | kk | 23:04 |
clarkb | and maybe double check the other k8s playbooks are group restricted? | 23:04 |
mordred | there are no other k8s playbooks - it's just that one | 23:04 |
clarkb | timeout -k 2m 120m ansible-playbook -f 50 ${ANSIBLE_PLAYBOOKS}/bootstrap-k8s-nodes.yaml | 23:04 |
clarkb | is the other one I see | 23:04 |
mordred | oh - hrm | 23:05 |
mordred | ah - yeah - that one is group restricted - but we can also ditch it | 23:05 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Invoke run_k8s_ansible from its directory https://review.opendev.org/657703 | 23:06 |
*** rcernin has joined #openstack-infra | 23:06 | |
openstackgerrit | James E. Blair proposed opendev/system-config master: Stop running gitea k8s cluster playbooks https://review.opendev.org/657704 | 23:06 |
corvus | clarkb, mordred: proposed as 2 changes so that we don't forget about the fix | 23:06 |
clarkb | maybe flip the order? | 23:06 |
clarkb | (so that we merge the removal first?) | 23:06 |
corvus | i did that order so that a revert of "stop running" gets us the right playbooks | 23:07 |
corvus | er, right scripts i guess | 23:07 |
mordred | ++ | 23:07 |
mordred | I think it's fine if we just land both before we turn cron back on | 23:07 |
clarkb | I have approved both | 23:07 |
*** dciabrin has quit IRC | 23:08 | |
*** dciabrin has joined #openstack-infra | 23:08 | |
*** threestrands has quit IRC | 23:08 | |
*** slaweq has joined #openstack-infra | 23:11 | |
*** smarcet has joined #openstack-infra | 23:12 | |
clarkb | I'm going to double check that gitea06 is not in the lb pool | 23:14 |
clarkb | it is not, so the change to make that config update took hold | 23:14 |
clarkb | (this is a good thing) | 23:15 |
mordred | yay | 23:15 |
* mordred has a scheduled thing in 15 minutes - are we back together enough that folks are comfortable with me stepping away? | 23:15 | |
clarkb | yes I think we can leave the cron disabled for a bit and deal with that tomorrow maybe? | 23:16 |
clarkb | (other than that I think we should be recovered) | 23:16 |
mordred | ++ to leaving the cron disabled until people have enough brainpellets to reenable it | 23:16 |
*** yamamoto has joined #openstack-infra | 23:17 | |
*** sarob has joined #openstack-infra | 23:18 | |
clarkb | how about #status log Ansible cron disabled on bridge until we remove the k8s management from run_all.sh. This is necessary to keep k8s installations from breaking standalone docker usage. | 23:21 |
clarkb | and maybe #status notice If your jobs failed due to connectivity issues to opendev.org they can be rechecked now. Services have been restored at that domain | 23:21 |
*** bobh has quit IRC | 23:21 | |
mordred | clarkb: ++ | 23:21 |
clarkb | #status log Ansible cron disabled on bridge until we remove the k8s management from run_all.sh. This is necessary to keep k8s installations from breaking standalone docker usage. | 23:22 |
clarkb | statusbot is out at lunch? | 23:22 |
clarkb | 2019-05-03 22:44:05,311 DEBUG irc.client: _dispatcher: join is the last thing logged by it. I will restart it | 23:22 |
*** openstackstatus has joined #openstack-infra | 23:24 | |
*** ChanServ sets mode: +v openstackstatus | 23:24 | |
fungi | and it's back | 23:24 |
clarkb | #status log Ansible cron disabled on bridge until we remove the k8s management from run_all.sh. This is necessary to keep k8s installations from breaking standalone docker usage. | 23:24 |
openstackstatus | clarkb: finished logging | 23:24 |
*** slaweq has quit IRC | 23:24 | |
clarkb | #status notice If your jobs failed due to connectivity issues to opendev.org they can be rechecked now. Services have been restored at that domain. | 23:24 |
openstackstatus | clarkb: sending notice | 23:24 |
*** rlandy is now known as rlandy|brb | 23:25 | |
-openstackstatus- NOTICE: If your jobs failed due to connectivity issues to opendev.org they can be rechecked now. Services have been restored at that domain. | 23:26 | |
*** jamesmcarthur has quit IRC | 23:27 | |
*** Weifan has quit IRC | 23:28 | |
openstackstatus | clarkb: finished sending notice | 23:28 |
clarkb | thank you statusbot | 23:28 |
*** eernst has quit IRC | 23:34 | |
*** Goneri has joined #openstack-infra | 23:34 | |
*** lseki has quit IRC | 23:40 | |
*** guimaluf has joined #openstack-infra | 23:43 | |
*** Weifan has joined #openstack-infra | 23:44 | |
openstackgerrit | Merged opendev/system-config master: Add mirroring for Stein packages https://review.opendev.org/655686 | 23:48 |
clarkb | I had to recheck the first of our fix changes | 23:48 |
clarkb | puppet apply for xenial timed out | 23:49 |
clarkb | I've seen that happen if ruby gems are slow to download things | 23:49 |
clarkb | and with that I'm gonna figure out dinner | 23:49 |
*** sarob has quit IRC | 23:51 | |
*** whoami-rajat has quit IRC | 23:55 | |
*** mattw4 has quit IRC | 23:56 | |
ianw | corvus/clarkb: thanks, i responded to the few comments in the LE restart stuff with https://review.opendev.org/#/c/652801/ | 23:59 |