*** threestrands has joined #openstack-infra | 00:12 | |
*** lbragstad_ has joined #openstack-infra | 00:38 | |
*** jamesmcarthur has quit IRC | 00:39 | |
*** lbragstad_ has quit IRC | 00:44 | |
*** dayou has quit IRC | 00:46 | |
*** dayou has joined #openstack-infra | 00:48 | |
fungi | gmann: as far as gerrit is concerned, x/foo and openstack/foo are repository names, so a repository can be renamed from x/foo to openstack/foo using that process | 00:56 |
---|---|---|
fungi | that way the original review history is retained in the renamed project | 00:57 |
fungi | and redirects are created in gitea and so on | 00:57 |
*** lbragstad_ has joined #openstack-infra | 01:09 | |
*** matt_kosut has joined #openstack-infra | 01:25 | |
*** matt_kosut has quit IRC | 01:30 | |
*** imacdonn has quit IRC | 01:53 | |
*** lbragstad_ has quit IRC | 01:57 | |
*** dannins has joined #openstack-infra | 02:26 | |
*** dave-mccowan has joined #openstack-infra | 02:26 | |
*** dave-mccowan has quit IRC | 02:32 | |
*** lbragstad_ has joined #openstack-infra | 02:57 | |
*** jamesmcarthur has joined #openstack-infra | 03:11 | |
*** igordc has joined #openstack-infra | 03:19 | |
*** lbragstad_ has quit IRC | 03:20 | |
*** ramishra has joined #openstack-infra | 03:27 | |
*** matt_kosut has joined #openstack-infra | 03:27 | |
*** jamesmcarthur has quit IRC | 03:31 | |
*** matt_kosut has quit IRC | 03:31 | |
*** jamesmcarthur has joined #openstack-infra | 03:35 | |
*** jamesmcarthur has quit IRC | 03:40 | |
*** dave-mccowan has joined #openstack-infra | 03:42 | |
*** ricolin has quit IRC | 03:43 | |
*** armax has quit IRC | 03:49 | |
*** ykarel|away is now known as ykarel | 04:24 | |
*** ricolin has joined #openstack-infra | 04:27 | |
*** dchen has quit IRC | 04:27 | |
*** dave-mccowan has quit IRC | 04:30 | |
*** dchen has joined #openstack-infra | 04:46 | |
*** matt_kosut has joined #openstack-infra | 05:27 | |
*** matt_kosut has quit IRC | 05:32 | |
*** evrardjp has quit IRC | 05:35 | |
*** evrardjp has joined #openstack-infra | 05:35 | |
*** matt_kosut has joined #openstack-infra | 06:26 | |
*** igordc has quit IRC | 06:31 | |
*** threestrands has quit IRC | 06:42 | |
*** admcleod has quit IRC | 06:47 | |
*** ricolin has quit IRC | 06:52 | |
*** rcernin has quit IRC | 07:06 | |
*** lmiccini has joined #openstack-infra | 07:09 | |
*** AJaeger has quit IRC | 07:12 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Scheduler test app manager https://review.opendev.org/708812 | 07:12 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Use scheduler manager consistently in tests https://review.opendev.org/709542 | 07:12 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Refactor executor_client in tests https://review.opendev.org/709672 | 07:12 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Refactor merge_client in tests https://review.opendev.org/709676 | 07:12 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Refactor nodepool in tests https://review.opendev.org/709703 | 07:12 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Refactor zookeeper in tests https://review.opendev.org/709709 | 07:12 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Consolidate scheduler pause/exit as hibernation https://review.opendev.org/709723 | 07:12 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Refactor `self.event_queues` in tests https://review.opendev.org/709990 | 07:12 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Scheduler's pause/resume functionality https://review.opendev.org/709735 | 07:12 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: WIP: Store unparsed branch config in Zookeeper https://review.opendev.org/705716 | 07:12 |
*** AJaeger has joined #openstack-infra | 07:16 | |
*** dpawlik has joined #openstack-infra | 07:23 | |
*** matt_kosut has quit IRC | 07:38 | |
*** ykarel is now known as ykarel|lunch | 07:39 | |
*** pgaxatte has joined #openstack-infra | 07:40 | |
*** rpittau|afk is now known as rpttau | 07:42 | |
*** rpttau is now known as rpittau | 07:42 | |
*** hashar has joined #openstack-infra | 07:46 | |
*** tetsuro has joined #openstack-infra | 08:00 | |
*** slaweq has joined #openstack-infra | 08:01 | |
*** matt_kosut has joined #openstack-infra | 08:04 | |
*** tkajinam has quit IRC | 08:07 | |
*** tesseract has joined #openstack-infra | 08:12 | |
*** tosky has joined #openstack-infra | 08:15 | |
*** admcleod has joined #openstack-infra | 08:18 | |
*** iurygregory has joined #openstack-infra | 08:20 | |
*** jcapitao has joined #openstack-infra | 08:26 | |
*** amoralej|off is now known as amoralej | 08:26 | |
*** tetsuro has quit IRC | 08:29 | |
*** tetsuro has joined #openstack-infra | 08:31 | |
*** jpena|off is now known as jpena | 08:31 | |
*** ricolin_ has joined #openstack-infra | 08:33 | |
*** dtantsur|afk is now known as dtantsur | 08:35 | |
openstackgerrit | YumengBao proposed openstack/project-config master: Add rss link for cyborg-specs https://review.opendev.org/711875 | 08:37 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Use scheduler manager consistently in tests https://review.opendev.org/709542 | 08:43 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Refactor executor_client in tests https://review.opendev.org/709672 | 08:43 |
*** ralonsoh has joined #openstack-infra | 08:52 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Refactor merge_client in tests https://review.opendev.org/709676 | 08:54 |
openstackgerrit | Liang Fang proposed openstack/project-config master: New repo: devstack-plugin-open-cas https://review.opendev.org/711878 | 08:55 |
*** rpittau is now known as rpittau|bbl | 08:56 | |
*** ykarel|lunch is now known as ykarel | 08:59 | |
*** ricolin_ has quit IRC | 09:00 | |
*** ociuhandu has joined #openstack-infra | 09:01 | |
*** ricolin_ has joined #openstack-infra | 09:02 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Refactor nodepool in tests https://review.opendev.org/709703 | 09:08 |
*** yolanda has quit IRC | 09:11 | |
*** yolanda has joined #openstack-infra | 09:11 | |
*** pkopec has joined #openstack-infra | 09:23 | |
*** ricolin_ has quit IRC | 09:23 | |
*** derekh has joined #openstack-infra | 09:37 | |
*** ijw has joined #openstack-infra | 09:42 | |
*** apetrich has joined #openstack-infra | 09:46 | |
*** ijw has quit IRC | 09:46 | |
*** roman_g has joined #openstack-infra | 09:46 | |
*** ociuhandu has quit IRC | 09:48 | |
*** gfidente has joined #openstack-infra | 09:51 | |
*** happyhemant has joined #openstack-infra | 09:53 | |
*** owalsh^ is now known as owalsh | 10:00 | |
*** gshippey has joined #openstack-infra | 10:01 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Refactor zookeeper in tests https://review.opendev.org/709709 | 10:03 |
*** xek_ has joined #openstack-infra | 10:05 | |
*** auristor has quit IRC | 10:08 | |
*** zbr|pto is now known as zbr | 10:20 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Consolidate scheduler pause/exit as hibernation https://review.opendev.org/709723 | 10:22 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Refactor `self.event_queues` in tests https://review.opendev.org/709990 | 10:26 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Scheduler's pause/resume functionality https://review.opendev.org/709735 | 10:26 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: WIP: Store unparsed branch config in Zookeeper https://review.opendev.org/705716 | 10:26 |
*** sshnaidm|afk is now known as sshnaidm | 10:27 | |
*** auristor has joined #openstack-infra | 10:36 | |
*** yboaron has joined #openstack-infra | 10:51 | |
*** dchen has quit IRC | 10:58 | |
openstackgerrit | Donny Davis proposed openstack/project-config master: Setting OpenEdge Provider to 10 test nodes https://review.opendev.org/711903 | 11:11 |
donnyd | looks to me like we are able to produce successful builds http://logstash.openstack.org/#/dashboard/file/logstash.json?query=node_provider:%5C%22openedge-us-east%5C%22%20AND%20message:%5C%22Upload%20logs%20to%20swift%5C%22%20&from=12h | 11:12 |
donnyd | I would like to turn Openedge up to 10 test nodes and monitor | 11:13 |
*** ricolin_ has joined #openstack-infra | 11:13 | |
openstackgerrit | Liang Fang proposed openstack/project-config master: New repo: devstack-plugin-open-cas https://review.opendev.org/711878 | 11:17 |
*** ociuhandu has joined #openstack-infra | 11:17 | |
*** ociuhandu has quit IRC | 11:19 | |
*** ociuhandu has joined #openstack-infra | 11:20 | |
*** tetsuro has quit IRC | 11:23 | |
*** ykarel is now known as ykarel|afk | 11:26 | |
frickler | donnyd: approved, ping me if something goes wrong | 11:27 |
donnyd | thank you frickler | 11:28 |
donnyd | I will monitor closely to ensure nothing blows up | 11:28 |
*** yamamoto has quit IRC | 11:29 | |
*** AJaeger has quit IRC | 11:33 | |
*** sshnaidm has quit IRC | 11:36 | |
donnyd | also I got the numa settings from sean-k-mooney so everything is back to how it was | 11:36 |
openstackgerrit | Merged openstack/project-config master: Setting OpenEdge Provider to 10 test nodes https://review.opendev.org/711903 | 11:37 |
*** nicolasbock has joined #openstack-infra | 11:37 | |
*** sshnaidm has joined #openstack-infra | 11:40 | |
*** ociuhandu_ has joined #openstack-infra | 11:41 | |
*** rosmaita has joined #openstack-infra | 11:43 | |
*** lbragstad_ has joined #openstack-infra | 11:43 | |
*** ociuhandu has quit IRC | 11:44 | |
*** AJaeger has joined #openstack-infra | 11:47 | |
*** jpena is now known as jpena|lunch | 11:48 | |
*** jcapitao is now known as jcapitao_lunch | 11:51 | |
*** lbragstad_ has quit IRC | 11:51 | |
*** sshnaidm has quit IRC | 11:53 | |
*** sshnaidm has joined #openstack-infra | 11:54 | |
*** weshay is now known as weshay|ruck | 11:54 | |
*** rlandy has joined #openstack-infra | 11:58 | |
*** yamamoto has joined #openstack-infra | 12:01 | |
*** yamamoto has quit IRC | 12:06 | |
*** yamamoto has joined #openstack-infra | 12:07 | |
*** jamesmcarthur has joined #openstack-infra | 12:10 | |
*** jamesmcarthur has quit IRC | 12:14 | |
*** ykarel|afk is now known as ykarel | 12:17 | |
*** tetsuro has joined #openstack-infra | 12:18 | |
*** jamesmcarthur has joined #openstack-infra | 12:20 | |
openstackgerrit | Mohammed Naser proposed opendev/lodgeit master: Upload container images https://review.opendev.org/711854 | 12:23 |
*** yamamoto has quit IRC | 12:29 | |
*** jpena|lunch is now known as jpena | 12:31 | |
openstackgerrit | Cédric Jeanneret (Tengu) proposed openstack/project-config master: Add new Validation Framework projects https://review.opendev.org/711910 | 12:32 |
*** jamesmcarthur has quit IRC | 12:36 | |
*** takamatsu has quit IRC | 12:37 | |
*** Goneri has joined #openstack-infra | 12:40 | |
*** rh-jelabarre has joined #openstack-infra | 12:41 | |
*** rpittau|bbl is now known as rpittau | 12:44 | |
*** jamesmcarthur has joined #openstack-infra | 12:47 | |
*** AJaeger has quit IRC | 12:50 | |
openstackgerrit | Donny Davis proposed openstack/project-config master: Bumping OpenEdge test node commit to 20 https://review.opendev.org/711914 | 12:53 |
donnyd | Testing at 10 nodes looks good to me - no failures from a launch / connection perspective | 12:54 |
*** lbragstad has joined #openstack-infra | 12:54 | |
donnyd | I would like to bump up the test node commit to 20 test nodes and then leave it there for a while | 12:54 |
*** yamamoto has joined #openstack-infra | 12:54 | |
*** jamesmcarthur has quit IRC | 12:56 | |
donnyd | There have been zero launch errors since I fixed the networking issue yesterday | 12:56 |
*** jamesmcarthur has joined #openstack-infra | 12:57 | |
*** rh-jelabarre has quit IRC | 12:58 | |
*** ricolin_ has quit IRC | 13:00 | |
*** jamesmcarthur has quit IRC | 13:02 | |
*** zxiiro has joined #openstack-infra | 13:04 | |
*** lbragstad has quit IRC | 13:06 | |
*** sshnaidm has quit IRC | 13:08 | |
*** sshnaidm has joined #openstack-infra | 13:09 | |
*** tetsuro has quit IRC | 13:09 | |
*** sshnaidm has quit IRC | 13:10 | |
*** ricolin_ has joined #openstack-infra | 13:10 | |
openstackgerrit | Cédric Jeanneret (Tengu) proposed openstack/project-config master: Add new Validation Framework projects https://review.opendev.org/711910 | 13:11 |
*** jamesmcarthur has joined #openstack-infra | 13:13 | |
*** jcapitao_lunch is now known as jcapitao | 13:14 | |
*** sshnaidm has joined #openstack-infra | 13:18 | |
*** sshnaidm has quit IRC | 13:18 | |
*** ociuhandu_ has quit IRC | 13:19 | |
*** sshnaidm has joined #openstack-infra | 13:19 | |
*** ociuhandu has joined #openstack-infra | 13:19 | |
*** bdodd has joined #openstack-infra | 13:21 | |
*** hashar has quit IRC | 13:29 | |
*** cdearborn has joined #openstack-infra | 13:29 | |
*** ociuhandu has quit IRC | 13:31 | |
*** ociuhandu has joined #openstack-infra | 13:31 | |
*** jamesmcarthur has quit IRC | 13:32 | |
*** jamesmcarthur has joined #openstack-infra | 13:32 | |
*** ricolin_ has quit IRC | 13:33 | |
*** ykarel is now known as ykarel|afk | 13:33 | |
*** amoralej is now known as amoralej|lunch | 13:34 | |
*** rh-jelabarre has joined #openstack-infra | 13:37 | |
*** apetrich has quit IRC | 13:37 | |
openstackgerrit | Merged openstack/project-config master: Bumping OpenEdge test node commit to 20 https://review.opendev.org/711914 | 13:37 |
*** yamamoto has quit IRC | 13:37 | |
*** jamesmcarthur has quit IRC | 13:38 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Make revoke-sudo more general. https://review.opendev.org/706262 | 13:43 |
*** yamamoto has joined #openstack-infra | 13:44 | |
openstackgerrit | Benjamin Schanzel proposed zuul/zuul-jobs master: Kubernetes Node Support for Mirroring Git Repos https://review.opendev.org/711920 | 13:44 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Adds variable to toggle whether to revoke sudo https://review.opendev.org/706248 | 13:45 |
*** AJaeger has joined #openstack-infra | 13:46 | |
*** eharney has joined #openstack-infra | 13:47 | |
*** AJaeger has quit IRC | 13:47 | |
*** AJaeger has joined #openstack-infra | 13:49 | |
Tengu | hello there! quick question: I apparently would need a new wiki namespace on wiki.openstack.org for a new "project" (linked to https://review.opendev.org/#/c/711910/) - I'm currently editing the governance thing as Andreas notified, and am struggling a bit with the "url" part.. | 13:50 |
Tengu | unless.... hm. I might just drop it under "TripleO" for now | 13:51 |
Tengu | weshay|ruck: any thoughts? -^^ or do you want to discuss it during the mtg tomorrow? | 13:52 |
AJaeger | Tengu: you set it up with notifications etc. as part of tripleo. If that is not correct, you can set it up outside of openstack namespace | 13:52 |
AJaeger | Tengu: but if you use openstack/ as prefix, I need a governance change. | 13:53 |
Tengu | AJaeger: errr.... my brain just froze with your first sentence | 13:53 |
*** ociuhandu has quit IRC | 13:53 | |
AJaeger | Tengu: what did I do wrong? | 13:53 |
Tengu | AJaeger: and I agree with the governance change - no problem with that. Just a bit lost as to "what's the best thing to do" | 13:53 |
AJaeger | Tengu: discuss with weshay|ruck and tripleo team first ;) | 13:54 |
Tengu | AJaeger: yes, that's the main idea - I have a point during the meeting tomorrow :) | 13:54 |
*** ociuhandu has joined #openstack-infra | 13:54 | |
Tengu | I just pushed the change request today in order to ensure everything is ready :). | 13:54 |
AJaeger | Then WIP it for now ;) | 13:55 |
AJaeger | this all can wait from my side... | 13:55 |
*** dave-mccowan has joined #openstack-infra | 13:55 | |
Tengu | np - I should have -w it before, sorry | 13:55 |
*** lbragstad has joined #openstack-infra | 13:57 | |
*** ociuhandu has quit IRC | 13:58 | |
*** ScottMC has joined #openstack-infra | 14:01 | |
*** jamesmcarthur has joined #openstack-infra | 14:03 | |
*** adriancz has joined #openstack-infra | 14:03 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Control log archive and user preservation with vars https://review.opendev.org/701381 | 14:04 |
*** ykarel|afk is now known as ykarel | 14:07 | |
openstackgerrit | Albin Vass proposed zuul/zuul master: Fix minor spelling error https://review.opendev.org/711926 | 14:08 |
*** jamesmcarthur has quit IRC | 14:09 | |
*** sshnaidm has quit IRC | 14:15 | |
*** amoralej|lunch is now known as amoralej | 14:15 | |
*** yamamoto has quit IRC | 14:16 | |
*** armax has joined #openstack-infra | 14:20 | |
*** artom has joined #openstack-infra | 14:21 | |
*** Lucas_Gray has joined #openstack-infra | 14:22 | |
*** sshnaidm has joined #openstack-infra | 14:28 | |
mnaser | may i have eyes on https://review.opendev.org/#/c/711854/ please | 14:34 |
*** rh-jelabarre has quit IRC | 14:34 | |
*** rh-jelabarre has joined #openstack-infra | 14:34 | |
*** sshnaidm has quit IRC | 14:35 | |
*** jamesmcarthur has joined #openstack-infra | 14:39 | |
*** jamesmcarthur has quit IRC | 14:44 | |
*** lpetrut has joined #openstack-infra | 14:49 | |
*** sshnaidm has joined #openstack-infra | 14:50 | |
*** sshnaidm_ has joined #openstack-infra | 14:52 | |
*** sshnaidm has quit IRC | 14:54 | |
*** sshnaidm_ is now known as sshnaidm | 14:56 | |
*** beekneemech is now known as bnemec | 15:04 | |
*** ykarel is now known as ykarel|away | 15:05 | |
*** jamesmcarthur has joined #openstack-infra | 15:18 | |
*** jamesmcarthur has quit IRC | 15:19 | |
*** jamesmcarthur_ has joined #openstack-infra | 15:19 | |
*** rh-jelabarre has quit IRC | 15:29 | |
*** rh-jelabarre has joined #openstack-infra | 15:33 | |
*** mattw4 has joined #openstack-infra | 15:34 | |
openstackgerrit | Merged opendev/lodgeit master: Upload container images https://review.opendev.org/711854 | 15:40 |
mordred | mnaser, noonedeadpunk: woot! | 15:44 |
*** apetrich has joined #openstack-infra | 15:45 | |
*** jamesmcarthur_ has quit IRC | 15:47 | |
clarkb | does anyone have python3.5 handy? I think we can remove our workaround for importlib-resources to fix virtualenv and tox now that importlib-resources 1.3.x has been released | 15:50 |
clarkb | I've tested python2.7 (and it does work) | 15:50 |
clarkb | I'm also going to recheck my zuul-jobs DNM change that uses base-test as a parent job | 15:50 |
clarkb | that should run without the workaround | 15:50 |
clarkb | https://review.opendev.org/#/c/680178/4 is that change | 15:51 |
clarkb | donnyd: we can safely delete this grafana dashboard right? http://grafana.openstack.org/d/3Bwpi5SZk/nodepool-fortnebula?orgId=1 | 15:53 |
clarkb | (that doesn't delete the data from graphite, just the easy access dashboard) | 15:54 |
mnaser | mordred: yay thanks! | 15:54 |
mordred | clarkb: I can have one convenient pretty quickly | 15:55 |
*** Lucas_Gray has quit IRC | 15:56 | |
clarkb | mordred: ya I've started a xenial docker container and am just making sure I've got it close enough to test node python before testing | 15:56 |
mordred | ah - test node python I definitely won't have | 15:56 |
clarkb | I remembered that it's actually python3.5 on xenial that matters, not just any python3.5, because python3.5 and python3.6 apparently didn't fork properly | 15:56 |
mordred | yeah- I've got pyenv python3.5 - so that's not xenial python3.5 at all | 15:57 |
*** ociuhandu has joined #openstack-infra | 15:57 | |
*** jamesmcarthur has joined #openstack-infra | 15:58 | |
*** Lucas_Gray has joined #openstack-infra | 15:59 | |
fungi | same, my py35 is via make altinstall from a cpython checkout of the latest 3.5.x tag | 15:59 |
fungi | built against current state of libraries on debian/unstable | 16:00 |
fungi | so bears little resemblance to whatever xenial is shipping packaged | 16:00 |
clarkb | ok on a xenial container I've used python3 -m venv to create a virtualenv, then in that virtualenv I've installed virtualenv to latest using -U. This gets me importlib-resources 1.3.1. That pip install -U virtualenv reports distlib failed to install but returns 0 anyway and running venv/bin/virtualenv doesthisevenwork succeeds | 16:00 |
mordred | fungi: yup | 16:00 |
clarkb | I think that means we can safely clean up our workaround if we confirm our images have all built within the last couple days | 16:01 |
mordred | clarkb: woot | 16:01 |
fungi | sgtm | 16:01 |
donnyd | Yea we can purge the old FN stuff | 16:01 |
*** diablo_rojo has joined #openstack-infra | 16:02 | |
*** lpetrut has quit IRC | 16:05 | |
*** Lucas_Gray has quit IRC | 16:06 | |
*** Lucas_Gray has joined #openstack-infra | 16:08 | |
clarkb | infra-root looking at image builds really quickly we have lots of old images sticking around. My hunch is that those are related to leaked volumes in vexxhost | 16:09 |
clarkb | I'm in a meeting now, but will look closer afterwards. My hunch is that we'll need to clean all that up and ensure things are all new enough before we remove the workaround | 16:09 |
*** matt_kosut has quit IRC | 16:14 | |
*** lmiccini has quit IRC | 16:25 | |
*** pgaxatte has quit IRC | 16:28 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Control log archive and user preservation with vars https://review.opendev.org/701381 | 16:29 |
*** yboaron has quit IRC | 16:30 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Make revoke-sudo more general. https://review.opendev.org/706262 | 16:31 |
*** jamesmcarthur has quit IRC | 16:35 | |
*** jamesmcarthur has joined #openstack-infra | 16:37 | |
*** matt_kosut has joined #openstack-infra | 16:44 | |
*** matt_kosut has quit IRC | 16:48 | |
clarkb | roman_g: to followup on the airship testing, are those nodes being allocated reliably now? | 16:48 |
*** imacdonn has joined #openstack-infra | 16:49 | |
*** nicolasbock has quit IRC | 16:53 | |
*** apetrich has quit IRC | 16:54 | |
*** ociuhandu has quit IRC | 16:54 | |
*** ociuhandu has joined #openstack-infra | 16:55 | |
*** nicolasbock has joined #openstack-infra | 16:55 | |
clarkb | fungi: donnyd http://paste.openstack.org/show/790485/ I think that represents the bulk of our leaked images because those aren't in a deleting state (if they were in a deleting state we'd remove them from local disk, which is full on nb02) | 16:56 |
clarkb | I think that will require zk surgery, which I can look into in a bit | 16:56 |
clarkb | (vexxhost does have at least one leaked image but I think we deleted the local copy due to the deleting state change there) | 16:56 |
clarkb | but I think none of those fn resources exist anymore so we can delete them from the zk db directly | 16:57 |
*** ijw has joined #openstack-infra | 16:57 | |
clarkb | then that will result in necessary state updates to image state (and deletions) | 16:57 |
fungi | ahh, yeah, i think something's no longer deleting images when we set the images list to an empty list | 16:58 |
fungi | because over the weekend when we added the openedge environment with an empty images list, nodepool happily uploaded all our images to it anyway | 16:58 |
clarkb | neat | 16:58 |
fungi | Shrews: ^ any suggestions for troubleshooting that? | 16:59 |
clarkb | ok back in a few to poke at zk unless someone else would like to | 16:59 |
*** ijw has quit IRC | 16:59 | |
*** ijw has joined #openstack-infra | 17:00 | |
*** chandankumar is now known as raukadah | 17:00 | |
*** jamesmcarthur has quit IRC | 17:00 | |
*** ociuhandu has quit IRC | 17:01 | |
*** ociuhandu has joined #openstack-infra | 17:03 | |
*** ijw_ has joined #openstack-infra | 17:05 | |
*** AJaeger has quit IRC | 17:05 | |
*** AJaeger has joined #openstack-infra | 17:05 | |
Shrews | reading sb | 17:05 |
donnyd | I could probably bring the FN endpoint back online, but it would take a day or so | 17:06 |
Shrews | fungi: can i get more background here? | 17:07 |
*** ijw has quit IRC | 17:08 | |
Shrews | did we not properly decommission FN or something? | 17:08 |
donnyd | Shrews: FN was left up for a week with the image list empty | 17:08 |
Shrews | donnyd: that's a confusing statement to me because http://paste.openstack.org/show/790485/ shows the image list not empty... so.... ? I'm clearly missing context here | 17:09 |
donnyd | We moved fort nebula to open edge | 17:09 |
Shrews | And during this move, was FN decommissioned as outlined in https://zuul-ci.org/docs/nodepool/operation.html#removing-a-provider ? | 17:10 |
Shrews | Or has something else led up to this? | 17:10 |
fungi | Shrews: we set diskimages: [] for fortnebula in https://review.opendev.org/709257 | 17:11 |
*** jamesmcarthur has joined #openstack-infra | 17:11 | |
fungi | apparently images did not get cleaned up after that merged | 17:11 |
Shrews | fungi: was it disabled in the launcher first? we have bright red boxes on that op page warning us to do that :) | 17:12 |
Shrews | if nodes remain that use those images, they won't get cleaned up | 17:12 |
fungi | we set max-servers: 0 | 17:12 |
fungi | but maybe there were leaked nodes? | 17:12 |
clarkb | there are only two nodes remaining per my paste | 17:12 |
clarkb | shouldn't prevent all deletes like that | 17:13 |
fungi | Shrews: but then we merged https://review.opendev.org/711760 to replace fortnebula with openedge | 17:14 |
fungi | and nodepool immediately uploaded images to it even though diskimages: [] was in there for the new environment | 17:14 |
Shrews | clarkb: correct, shouldn't prevent all deletes. i'm just trying to understand the sequence of things rn | 17:14 |
Shrews | did we let all FN nodes get used up after setting max-servers to 0? | 17:15 |
clarkb | the issue is that change only updated the launcher | 17:15 |
clarkb | not nodepool.yaml for the builders | 17:15 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: Tests bindep role on all-platforms https://review.opendev.org/708704 | 17:15 |
clarkb | not sure if openedge bringup had similar problem | 17:16 |
fungi | oh... | 17:16 |
fungi | yep, thanks Shrews | 17:16 |
clarkb | but I think we just need to rm zk nodes | 17:16 |
fungi | that's what did it | 17:16 |
fungi | we set the launcher to diskimages: [] but the builders still had the old diskimages list configured | 17:17 |
* fungi sighs | 17:17 | |
fungi | and then we renamed the environment for it, explaining why it immediately uploaded | 17:17 |
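The mismatch fungi describes (launcher config with `diskimages: []` for the provider, builder config still carrying the old list) can be checked mechanically. The following is only a minimal sketch: the dicts loosely mirror the `providers` and `diskimages` keys of a nodepool config, while the function name `diskimage_mismatches` and the sample data are hypothetical and not part of nodepool.

```python
def diskimage_mismatches(launcher_cfg, builder_cfg):
    """Return {provider: (launcher_images, builder_images)} where the lists differ.

    Both arguments are plain dicts shaped like a parsed nodepool config:
    {"providers": [{"name": ..., "diskimages": [{"name": ...}, ...]}, ...]}
    """
    def images_by_provider(cfg):
        return {
            provider["name"]: sorted(
                image["name"] for image in provider.get("diskimages", [])
            )
            for provider in cfg.get("providers", [])
        }

    launcher = images_by_provider(launcher_cfg)
    builder = images_by_provider(builder_cfg)

    mismatches = {}
    for name in sorted(set(launcher) | set(builder)):
        # None means the provider is absent from that config entirely.
        launcher_images = launcher.get(name)
        builder_images = builder.get(name)
        if launcher_images != builder_images:
            mismatches[name] = (launcher_images, builder_images)
    return mismatches


# Hypothetical sample data reproducing the situation from the discussion:
# the launcher was emptied, the builder was not.
launcher_cfg = {
    "providers": [{"name": "fortnebula-regionone", "diskimages": []}],
}
builder_cfg = {
    "providers": [
        {"name": "fortnebula-regionone", "diskimages": [{"name": "centos-7"}]},
    ],
}

for provider, (on_launcher, on_builder) in diskimage_mismatches(
    launcher_cfg, builder_cfg
).items():
    print(provider, "launcher:", on_launcher, "builder:", on_builder)
```

A check like this would only flag the disagreement; it says nothing about which side is correct.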
Shrews | We need to build a way into nodepool CLI to help cleanup here. We shouldn't ever require manual ZK cleanup (but looks like we may need to do that here). | 17:18 |
Shrews | clarkb: i think just deleting the zk nodes won't trigger disk cleanup though. that may also need to be done manually | 17:20 |
Shrews | lemme look at the code for a sec.... | 17:20 |
clarkb | iirc it scans for images that either are deleting or have no record then deletes them | 17:21 |
clarkb | removing the zk entries should cause local disk cleanups | 17:21 |
*** ccamacho has quit IRC | 17:22 | |
fungi | i believe it did last time i deleted znodes for old images | 17:23 |
Shrews | do we still have FN defined on the builders then? | 17:23 |
Shrews | i think if we only delete the *upload* records (not the build records), then it should clean up disk | 17:24 |
clarkb | Shrews: correct | 17:24 |
*** rpittau is now known as rpittau|afk | 17:27 | |
Shrews | clarkb: i have to leave in 10 min to meet with my tax guy. can you handle the deletes? should be paths like /nodepool/images/centos-7/builds/0000121707/providers/fortnebula-regionone/images/0000000001 | 17:28 |
*** dtantsur is now known as dtantsur|afk | 17:30 | |
Shrews | i'll see about some sort of "force-delete-upload-records" option for the CLI | 17:31 |
*** apetrich has joined #openstack-infra | 17:31 | |
clarkb | ya I can do the deletes | 17:33 |
*** jpena is now known as jpena|off | 17:33 | |
clarkb | basically delete the things in my paste | 17:33 |
clarkb | that should then clean up builds | 17:33 |
*** ociuhandu_ has joined #openstack-infra | 17:35 | |
*** evrardjp has quit IRC | 17:35 | |
*** evrardjp has joined #openstack-infra | 17:35 | |
*** ociuhandu has quit IRC | 17:38 | |
*** ociuhandu_ has quit IRC | 17:39 | |
*** apetrich has quit IRC | 17:41 | |
*** jcapitao is now known as jcapitao_off | 17:41 | |
clarkb | nodes have been removed. now doing the image uploads | 17:42 |
fungi | i can also do the znode cleanup (i feel responsible for helping make that mess in the first place), but won't be caught up on other stuff to where i can start in on it for a couple more hours | 17:43 |
clarkb | no worries, it's straightforward once we've agreed that is the course of action | 17:46 |
clarkb | after these are done I'll check on nb01 and nb02 to see that they've freed the appropriate disk space and are able to build images again | 17:47 |
clarkb | then we wait for image updates and can clean up the base job | 17:47 |
clarkb | fwiw rmr fortnebula-regionone where it shows up under the /nodepool/images tree as well as rmr for /nodepool/nodes/$nodeid seems to be the ticket | 17:47 |
clarkb | can also get $node to see more about it to confirm that you want to remove it | 17:48 |
fungi | rmr is the recursive remove? | 17:48 |
clarkb | yes | 17:49 |
fungi | i think i missed that and manually recursed last time i did it | 17:49 |
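To illustrate which znodes the `rmr` step targets, here is a minimal sketch that picks the provider-level upload roots out of a list of znode paths. It assumes the path layout Shrews quotes above (`/nodepool/images/<image>/builds/<build>/providers/<provider>/images/<upload>`); the function name `provider_upload_roots` and the second provider name are hypothetical, and nothing here talks to ZooKeeper.

```python
def provider_upload_roots(znode_paths, provider):
    """Return the sorted, de-duplicated provider-level paths to remove.

    Build records (paths that stop at .../builds/<build>) are never returned,
    so only upload records for the given provider are selected.
    """
    roots = set()
    for path in znode_paths:
        parts = path.strip("/").split("/")
        # Expected: nodepool/images/<image>/builds/<build>/providers/<provider>[/...]
        if (
            len(parts) >= 7
            and parts[0] == "nodepool"
            and parts[1] == "images"
            and parts[3] == "builds"
            and parts[5] == "providers"
            and parts[6] == provider
        ):
            roots.add("/" + "/".join(parts[:7]))
    return sorted(roots)


paths = [
    # Path quoted in the discussion above.
    "/nodepool/images/centos-7/builds/0000121707/providers/fortnebula-regionone/images/0000000001",
    # Hypothetical upload record for another provider; must be left alone.
    "/nodepool/images/centos-7/builds/0000121707/providers/some-other-provider/images/0000000001",
    # Build record; must be left alone.
    "/nodepool/images/centos-7/builds/0000121707",
]

for root in provider_upload_roots(paths, "fortnebula-regionone"):
    print("rmr", root)
```

Printing the candidate paths first, rather than deleting directly, leaves room to inspect each one before removal, in the same spirit as running `get` on a node before removing it.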
donnyd | anything I can do to be helpful? | 17:52 |
clarkb | donnyd: nope I think this was all on our end | 17:52 |
clarkb | we now have 280GB free on nb02 | 17:52 |
fungi | that was fast! | 17:52 |
clarkb | 312GB on nb01 | 17:52 |
clarkb | I think it may be worthwhile stopping them, rebooting, and cleaning out the dib tmp dir stuff since that tends to bloat | 17:53 |
clarkb | I'll go ahead and do that now | 17:53 |
clarkb | oh ya we've got tons of dib processes on nb02 at least | 17:53 |
clarkb | nb02 is rebooting now. It will come up with nodepool-builder disabled, I'll clear out the tmp dib dirs, then reenable nodepool-builder and reboot again | 17:55 |
clarkb | then repeat on nb01 | 17:55 |
clarkb | these reboots are always so slow | 17:57 |
* fungi remembers rebooting physical servers which took several hours to complete their power-on selftests | 17:59 | |
fungi | this doesn't seem slow at all | 17:59 |
*** ccamacho has joined #openstack-infra | 18:00 | |
*** derekh has quit IRC | 18:00 | |
*** jamesmcarthur has quit IRC | 18:00 | |
clarkb | at about 5 minutes now. I wonder if it is fscking (fwiw this seems to always happen on these servers; sometimes I wonder if it is the stop side that is slow: since ssh is immediately killed it appears to be fast, but it could be trying to gracefully stop stuff after that, which takes time) | 18:01 |
*** jcapitao_off has quit IRC | 18:07 | |
*** Lucas_Gray has quit IRC | 18:08 | |
*** jamesmcarthur has joined #openstack-infra | 18:08 | |
*** andrewbonney has quit IRC | 18:10 | |
mordred | fungi: I remember the good old days of being terrified to reboot a server because there was a chance it would fail POST or just simply not boot back up properly for some reason | 18:14 |
clarkb | still not responding and there is nothing on the console. I think that is fsck behavior with ubuntu? | 18:15 |
mordred | which might lead to a couple of days of manual reconstruction | 18:15 |
clarkb | oh it just started doing a boot splash | 18:15 |
mordred | clarkb: yeah - sometimes it just goes to the bad place | 18:15 |
mordred | \o/ | 18:15 |
fungi | clarkb: you can `sudo touch /fastboot` before rebooting to skip forced timed fsck of filesystems | 18:15 |
mordred | clarkb: you know - there is a grub/kernel option that disables the quiet boot thing | 18:15 |
clarkb | well the fsck is probably a good idea. disabling quiet boot seems like a good idea too | 18:16 |
fungi | mordred: keyboard error, press f1 to continue | 18:16 |
clarkb | mostly it's the lack of info that is annoying, more so than the server doing what it needs to check its disks are sane | 18:16 |
clarkb | anyway I will wait patiently since it seems to be doing something (likely fsck) | 18:16 |
mordred | yeah. these are servers - please print lots of lines of text to console | 18:16 |
*** jamesmcarthur has quit IRC | 18:16 | |
fungi | i'm being told i need to take advantage of the unusual warm snap here to go for a brief walk. bbiab | 18:17 |
*** jamesmcarthur has joined #openstack-infra | 18:17 | |
clarkb | fungi: I'll be taking advantage of the sun here in about an hour :) | 18:17 |
clarkb | highly recommend | 18:17 |
fungi | good call | 18:17 |
clarkb | I went on a bike ride on saturday and regretted it when the skies decided hail was appropriate | 18:18 |
clarkb | was small hail but it got so cold out of nowhere | 18:18 |
clarkb | anyone know how to give ctrl + alt + f1 in the rax console? Is that what the check marks are for ctrl and alt to have it capture those key presses? | 18:21 |
clarkb | hrm I think I figured it out. those check boxes seem to be actual inputs so checking them then f1 gives you the keypresses you want | 18:22 |
clarkb | it appears it's stuck on unmounting dib stuff | 18:22 |
clarkb | which isn't surprising. I think it actually needs a forceful reboot | 18:22 |
clarkb | any objections to trying that? | 18:23 |
clarkb | basically the last thing in the log is reached target shutdown with a bunch of unmount failures above it | 18:23 |
*** sean-k-mooney has joined #openstack-infra | 18:24 | |
clarkb | and now I have ssh access | 18:25 |
mordred | woot | 18:25 |
mordred | clarkb: is the dib unmounting thing something we should try to dig in to? | 18:26 |
mordred | I seem to remember that coming up before | 18:26 |
clarkb | mordred: I believe it is a side effect of running out of disk on the server | 18:26 |
clarkb | mordred: basically when that happens dib starts to fail hard because so much of what it does relies on successful writes | 18:26 |
clarkb | I'm not sure it's worth digging into beyond making dib run out of disk less (which we've been pushing on by cleaning up old images and having it remove the disk files once all upload states are deleting and not actually deleted) | 18:27 |
clarkb | the problem this time was we removed a cloud without properly cleaning it up so we basically had a second set of images for all images hanging around | 18:27 |
clarkb | however, it's possible something else is causing the leaking | 18:27 |
mordred | clarkb: oh right | 18:29 |
*** amoralej is now known as amoralej|off | 18:31 | |
clarkb | that said, cleaning up dib_tmp is freeing a lot of disk space so those may be leaking which then puts pressure on things even if clouds are all happy | 18:31 |
clarkb | (I had thought that the cleanups were failing due to running out of space, but this rm is deleting way more data than I would expect if that were the case) | 18:33 |
mordred | clarkb: :( | 18:34 |
clarkb | there is now 467GB free on nb02's volume and i'm still waiting for rm to finish | 18:38 |
clarkb | I would've expected an image or two's worth of cleanup if it was running out of disk that caused things to spiral out of control | 18:38 |
clarkb | not ~10 images worth | 18:38 |
clarkb | er I guess it's "just" 200GB | 18:38 |
clarkb | which is less than 10 images | 18:38 |
clarkb | but still | 18:38 |
AJaeger | clarkb, mordred: do you want to +1 the OpenDev governance change? (the revert-revert) - https://review.opendev.org/#/c/710020/ | 18:39 |
AJaeger | might be good to give some additional +1 to avoid further questions and delays | 18:40 |
AJaeger | infra-root ^ | 18:40 |
clarkb | ya I'll take a look | 18:41 |
*** ralonsoh has quit IRC | 18:42 | |
*** jamesmcarthur has quit IRC | 18:44 | |
*** ociuhandu has joined #openstack-infra | 18:44 | |
mordred | infra-root: running out to store - back in a few | 18:49 |
*** jamesmcarthur has joined #openstack-infra | 18:56 | |
*** eck` has joined #openstack-infra | 18:58 | |
clarkb | ok nb02 is all done now and running the builder again | 19:02 |
clarkb | going to look at nb01 next | 19:02 |
clarkb | nb01 is in a much happier state. It has leaked ~4 builds, looks like; not nearly as many as nb02 | 19:05 |
clarkb | it could be that there is an underlying bug which is made worse by the disk filling | 19:05 |
clarkb | ok and nb01 is done now | 19:07 |
clarkb | infra-root related to images and keeping on top of them, should we be deleting fedora-29 since the upstream packages have been retired? | 19:07 |
clarkb | maybe that is a question for ianw | 19:07 |
clarkb | I'm making a list of images leaked in vexxhost now | 19:08 |
*** ociuhandu has quit IRC | 19:10 | |
AJaeger | clarkb: fedora-latest nodeset still points to fedora-29 | 19:10 |
*** ociuhandu has joined #openstack-infra | 19:11 | |
*** eharney has quit IRC | 19:11 | |
openstackgerrit | Andreas Jaeger proposed opendev/base-jobs master: Switch nodeset fedora-latest to fedora 30 https://review.opendev.org/711969 | 19:13 |
AJaeger | clarkb, ianw, ^ | 19:13 |
clarkb | AJaeger: hrm, ok it was my understanding that jobs were failing because package mirrors are not working as they retired it upstream of us (then mirrors picked up on that), but maybe it is still working in some capacity? | 19:13 |
AJaeger | clarkb: no idea | 19:14 |
AJaeger | clarkb: just noting that we have jobs configured... | 19:15 |
clarkb | ++ | 19:15 |
clarkb | http://paste.openstack.org/show/790494/ is a survey of remaining leaks and images that are not building currently | 19:15 |
openstackgerrit | Andreas Jaeger proposed opendev/glean master: Switch to Fedora 30 jobs https://review.opendev.org/711970 | 19:16 |
*** ociuhandu has quit IRC | 19:16 | |
*** gfidente is now known as gfidente|afk | 19:16 | |
AJaeger | clarkb: once those two are merged and one for x/tobiko, we can merge diskimage-builder and retire fedora-29 | 19:17 |
clarkb | the buster images are broken due to a change I made a while back that got back burnered. I'll look into fixing that first | 19:18 |
fungi | walk concluded. back and catching up while lunching on leftovers from the weekend | 19:19 |
fungi | weather was nice | 19:19 |
fungi | no hail ;) | 19:19 |
openstackgerrit | Andreas Jaeger proposed openstack/diskimage-builder master: Remove Fedora 29 job https://review.opendev.org/711972 | 19:21 |
AJaeger | ianw, clarkb: pushed all changes for fedora 29 (left project-config out), topic is fedora-29 | 19:21 |
openstackgerrit | Clark Boylan proposed openstack/project-config master: Fix debian-buster partition config https://review.opendev.org/711973 | 19:24 |
clarkb | infra-root ^ that should fix the debian buster image builds I think | 19:24 |
*** gyee has joined #openstack-infra | 19:24 | |
clarkb | I need to sort out my time in the sun as well as lunch now | 19:24 |
clarkb | will be back to keep digging into nodepool image statuses | 19:25 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Tests bindep role on all-platforms https://review.opendev.org/708704 | 19:26 |
fungi | thanks, reviewing | 19:27 |
*** tesseract has quit IRC | 19:33 | |
*** dave-mccowan has quit IRC | 19:38 | |
openstackgerrit | Merged openstack/project-config master: Fix debian-buster partition config https://review.opendev.org/711973 | 19:40 |
*** lbragstad_ has joined #openstack-infra | 19:54 | |
*** lbragstad has quit IRC | 19:57 | |
clarkb | "ERROR: Cannot uninstall 'six'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall." is the opensuse-15 error | 19:59 |
fungi | yay | 20:00 |
fungi | also you should be out enjoying the afternoon | 20:00 |
clarkb | I should but I decided I wasn't quite ready yet :) | 20:01 |
clarkb | I think our plan to cleanup python on test images will fix opensuse-15 | 20:01 |
clarkb | I'm not really seeing a good shorter term answer | 20:01 |
fungi | yeah, any idea what's dragging in six in the first place? | 20:02 |
clarkb | fungi: it's the python2-pip install done before switching over to source in pip-and-virtualenv | 20:03 |
clarkb | tumbleweed has the same error | 20:03 |
fungi | ahh, okay | 20:03 |
clarkb | that means that newer packaging (that will trickle down into leap) isn't fixed either | 20:03 |
clarkb | AJaeger: ^ suse may want to consider fixing that too | 20:03 |
clarkb | AJaeger: basically setuptools should be used instead of distutils so that pip is happier | 20:03 |
AJaeger | dirk, cmurphy, evrardjp, can either of you followup, please? ^ | 20:05 |
* AJaeger is too far away from python packaging... | 20:05 | |
clarkb | python2-six is the package looks like | 20:06 |
*** jamesmcarthur has quit IRC | 20:06 | |
clarkb | but any others in the same boat should eventually be updated | 20:06 |
dirk | clarkb: I will take a look | 20:08 |
AJaeger | thanks, dirk ! | 20:10 |
*** dave-mccowan has joined #openstack-infra | 20:12 | |
dirk | this issue was re-added in | 20:13 |
dirk | Sat Aug 18 09:08:38 UTC 2018 - Matěj Cepl <mcepl@suse.com> | 20:13 |
dirk | - Break the cycilical dependency on python-setuptools. | 20:13 |
dirk | so setuptools needs six to build, and six setuptools | 20:13 |
*** jamesmcarthur has joined #openstack-infra | 20:14 | |
mordred | sigh | 20:14 |
fungi | i guess suse can't have manual uploads of non-redistributed binary packages to break dependency cycles like in debian | 20:15 |
mordred | clarkb: I agree - the new python plan should fix this issue | 20:16 |
*** jamesmcarthur has quit IRC | 20:17 | |
fungi | yep, not preinstalling python2-pip should do the trick, as long as it's the only thing dragging in six as a dependency | 20:17 |
mordred | clarkb: the only thing I could think of to work around the issue on opensuse for now is to go the other way in the short term - uninstall python2-pip and python2-six, then install pip with get-pip | 20:17 |
mordred | it's not ideal as it's a little more spaghetti code in the element with a special case for suse - but the end result should be mostly ok until we can get it sorted for real | 20:17 |
*** jamesmcarthur has joined #openstack-infra | 20:18 | |
fungi | and, again, assuming that's the only package depending on it in those images | 20:18 |
mordred | yah | 20:19 |
dirk | fungi: well, we can with tricks. but cycles are still frowned upon as we automatically rebuild full cycles | 20:19 |
clarkb | ya though if we get the images built dealing with this on the distro is a job/software problem | 20:19 |
*** cdearborn has quit IRC | 20:19 | |
fungi | dirk: yep, obviously getting rid of cyclic dependencies is the preferred solution. just isn't always possible | 20:19 |
clarkb | fwiw I've confirmed that the image delete failures in vexxhost are due to leaked volumes for boot from volume nodes | 20:20 |
clarkb | I'm now cleaning up some volumes | 20:20 |
*** AJaeger has quit IRC | 20:21 | |
clarkb | I have deleted 3 clearly leaked volumes and 3 volumes stuck in a creating state since new years eve. This cleaned up 3 out of 5 images | 20:26 |
clarkb | trying to see where the other two are stuck now | 20:26 |
clarkb | There are no held nodes in sjc1 currently so not that | 20:27 |
*** ccamacho has quit IRC | 20:29 | |
clarkb | `openstack --os-cloud openstackjenkins-vexxhost --os-region sjc1 server remove volume 485a607e-29dc-4b1e-b2db-a65027757202 89b630af-581e-488f-8fb0-72751bf74652` would make one of the leaked volumes deletable if that command would work | 20:30 |
clarkb | mordred: ^ did you ever sort out if we could do that more forceful detachment once the server no longer exists | 20:30 |
clarkb | that command results in an error because the server is gone | 20:31 |
fungi | so i think the solution was that there is a direct cinder api call we can make to "detach" the volume from its perspective even though nova no longer knows about the server instance to which it's supposedly still attached | 20:33 |
fungi | and i want to say there was work to add that as a fallback in openstacksdk but now i don't remember | 20:34 |
*** xek_ has quit IRC | 20:36 | |
jrosser | please could i get a hold on openstack-ansible-deploy-aio_metal-debian-buster on review 711821 | 20:36 |
fungi | jrosser: what are you troubleshooting in that job? just so i can be more specific in the autohold comment | 20:37 |
jrosser | "symlink to zuul provided repos" | 20:37 |
mordred | we had a script to do it | 20:37 |
mordred | I thought I pushed that up somewhere no? | 20:38 |
clarkb | infra-root mnaser http://paste.openstack.org/show/790497/ is what I've found | 20:38 |
clarkb | mordred: oh maybe | 20:38 |
clarkb | (sorry was still in investigative mode) | 20:39 |
mordred | clarkb: oh - that's the other half of the story | 20:39 |
clarkb | I think the other three volumes need cloud intervention because they are attached to nodes from new years eve that refuse to delete | 20:39 |
fungi | jrosser: added. for future reference we also need to know the repository name, though i inferred it from the change details | 20:39 |
mordred | clarkb: in tools/clean-leaked-volumes.py | 20:39 |
jrosser | fungi: ooh ok, i'll bear that in mind for next time - thanks for adding it | 20:40 |
fungi | yw! | 20:40 |
mordred | in system-config - the c.block_storage.delete call is what you're looking for | 20:40 |
mordred | clarkb: you need to delete the attachment the volume has | 20:40 |
clarkb | mordred: thanks. Though I'm actually about to pop out for a bike ride in the sun now. I can figure out running that if others are busy | 20:40 |
clarkb | mordred: ya then I can delete the volume | 20:40 |
mordred | clarkb: yah | 20:40 |
*** sshnaidm is now known as sshnaidm|afk | 20:40 | |
clarkb | the other three in that paste are gonna be stuck I think because the instances still exist | 20:40 |
mordred | clarkb: do we need to delete the instances too? | 20:41 |
clarkb | mordred: I've tried, they won't delete :) | 20:41 |
mordred | we can still delete the volumes :) | 20:41 |
clarkb | mordred: we can detach even if the instance is running? | 20:41 |
clarkb | that seems dangerous but I guess it's fine | 20:41 |
mordred | with that second command - it's basically "sudo hey cinder detach this" | 20:41 |
fungi | sudo cinder make me a sandwich | 20:41 |
clarkb | ok not safe in the general case but since those server instances are test nodes that nodepool doesn't know about that means we don't care about them and can do our best to cleanup | 20:42 |
clarkb | we should still eventually clean up those instances though (and that requires cloud intervention I think) | 20:42 |
mordred | unless we can find more information about why they won't delete - yeah, I think we need mnaser to delete them | 20:43 |
clarkb | mordred: they are timestamped around the same time I found volumes stuck in a creating state (those did delete) | 20:43 |
clarkb | mordred: my guess is ceph or something in the cloud was unhappy new years eve | 20:43 |
mordred | joy | 20:43 |
clarkb | and these instances were reported back to nodepool as failures, but actually "completed" enough to get uuids and sit around and fail to delete | 20:44 |
ianw | hey sorry yesterday was a holiday here, back today | 20:45 |
ianw | infra-root: speaking of fedora, there's a stack in https://review.opendev.org/#/q/status:open+topic:nodepool-legacy that deploys a builder from containers; reviews welcome. i want to move the fedora builds to that initially | 20:48 |
ianw | (mordred has already looked at a few, thanks) | 20:49 |
clarkb | ianw: I noticed an nb01.opendev.org which I assume is where ^ will be deployed | 20:49 |
clarkb | ? | 20:49 |
ianw | clarkb: yep, i brought that up quite a while ago when i was still thinking of a more traditional deployment | 20:49 |
clarkb | cool, I'll try to review after my bike ride. I think the more immediate fires are all out at this point | 20:49 |
ianw | it hasn't been ansiblised yet | 20:50 |
clarkb | and if someone else wants to try and clean up the volumes/instances in http://paste.openstack.org/show/790497/ feel free | 20:50 |
clarkb | and really popping out now before I lose all this nice sunlight | 20:51 |
dirk | mordred: clarkb: so should I push for a fix on suse side or not? | 20:52 |
clarkb | dirk: I think we'll fix/workaround this issue on our side eventually, but the suse packaging should not use distutils anywhere for this reason | 20:53 |
clarkb | it should probably be fixed in the distro too if possible | 20:53 |
*** trident has quit IRC | 20:57 | |
dirk | clarkb: submitted to tumbleweed via bsc#1166139 | 20:57 |
dirk | for leap we'd have to do a SLE update (it is inherited from there) | 20:57 |
*** jamesmcarthur has quit IRC | 20:58 | |
*** trident has joined #openstack-infra | 20:58 | |
*** jamesmcarthur has joined #openstack-infra | 20:58 | |
*** zxiiro has quit IRC | 21:01 | |
*** trident has quit IRC | 21:04 | |
*** trident has joined #openstack-infra | 21:05 | |
*** imacdonn has quit IRC | 21:17 | |
jrosser | fungi: i think my hold is ready now, this is my key http://jrosser.woaf.net/openstack.pub, also.... is this expected? http://jrosser.woaf.net/openstack.pub | 21:20 |
jrosser | oops this http://paste.openstack.org/show/790498/ | 21:20 |
*** rh-jelabarre has quit IRC | 21:21 | |
fungi | jrosser: ansible is problematic in that if there's no room for it to write a temporary script it will pretend the host is unreachable | 21:22 |
fungi | so we have a fallback to do a raw exec over ssh in order to check whether the rootfs is full | 21:22 |
fungi | that looks to me like the zuul executor couldn't reach the node at all at that point, or at least that its sshd wasn't responding | 21:23 |
fungi | time to find out if i can | 21:23 |
fungi | jrosser: ssh root@192.237.253.115 | 21:28 |
fungi | jrosser: also someone held a centos-8 node for you two weeks ago, didn't mention in the comment what you were debugging. can we release that one? | 21:29 |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: Add mypy to linter test https://review.opendev.org/711750 | 21:29 |
jrosser | fungi: yes that one can be released | 21:30 |
fungi | thanks, deleting it now | 21:30 |
*** sean-k-mooney has quit IRC | 21:32 | |
mnaser | clarkb: appreciate your thoughts on .. https://review.opendev.org/#/c/711861/1 | 21:34 |
*** eharney has joined #openstack-infra | 21:35 | |
mnaser | (i figured that there wouldn't be interest in maintaining the helm charts in infra and that way we can iterate/move forwards faster rather than blocking on folks who have to review things they dont need) | 21:35 |
fungi | if we were going to use those to deploy paste.opendev.org then it might make sense as part of the opendev repository namespace, but i don't know to what extent we're relying on helm currently nor what future plans we might have to do so | 21:36 |
*** rcernin has joined #openstack-infra | 21:36 | |
mordred | I don't believe we have any plans to use helm - so I think mnaser just maintaining them wherever makes sense is a fine idea | 21:37 |
fungi | yeah, seems fine to me too | 21:37 |
mnaser | yeah my train of thought is.. will this be used by infra = | 21:38 |
mnaser | => opendev/ else vexxhost/ | 21:38 |
fungi | my thoughts exactly | 21:38 |
mordred | ++ | 21:38 |
mnaser | btw -- https://review.opendev.org/#/c/710020/ should be ready to land once we hit the "deadline" of being open for long enough :) | 21:39 |
mnaser | (re opendev) | 21:39 |
fungi | we've been open since the very beginning! ;) | 21:41 |
mnaser | fungi: oh i meant that the change is open for long enough to merge :p | 21:41 |
fungi | my monday evening jokes fall a little flat | 21:42 |
mnaser | it's okay, it's been a long day | 21:42 |
donnyd | Has anyone noticed any issues with Open Edge today? | 21:42 |
donnyd | it seems that, aside from the image thing with nodepool earlier, all is well | 21:43 |
fungi | donnyd: i have not heard/seen any, no | 21:43 |
donnyd | I have only seen one error to launch in the last 12 hours | 21:44 |
fungi | sounds ultra-stable to me then | 21:44 |
donnyd | yea that is not too bad | 21:45 |
donnyd | if there have been no complaints then I am pretty happy with its first day back into operation | 21:45 |
*** jamesmcarthur has quit IRC | 21:47 | |
*** nicolasbock has quit IRC | 21:55 | |
*** slaweq has quit IRC | 21:57 | |
*** dpawlik has quit IRC | 22:00 | |
*** yboaron has joined #openstack-infra | 22:01 | |
*** ociuhandu has joined #openstack-infra | 22:03 | |
*** ociuhandu has quit IRC | 22:07 | |
*** slaweq has joined #openstack-infra | 22:09 | |
*** zigo has quit IRC | 22:13 | |
*** slaweq has quit IRC | 22:14 | |
*** bdodd has quit IRC | 22:18 | |
*** zigo has joined #openstack-infra | 22:19 | |
jrosser | fungi: we can release the hold on openstack-ansible / openstack-ansible-deploy-aio_metal-debian-buster / 711821 | 22:19 |
jrosser | i've figured out what's going on - the extra disks on a rax node get mounted over the top of a symlink i set up too early in the job | 22:20 |
*** bdodd has joined #openstack-infra | 22:21 | |
fungi | d'oh, yep that'd do it | 22:21 |
fungi | deleted! | 22:22 |
fungi | in rackspace the rootfs is small so some jobs mount the ephemeral disk they provide at /opt | 22:22 |
jrosser | total fluke that the buster job that failed last time also landed on a rax node for the held job :) | 22:23 |
fungi | or somewhere similar | 22:23 |
fungi | well, if it hadn't failed you could have just kept rechecking. autoholds don't trigger on successful builds | 22:23 |
clarkb | ya I don't think we'd have much input for helm charts | 22:24 |
clarkb | maybe one day but our current setup is somewhere between that and puppet using ansible to drive docker compose | 22:25 |
*** diablo_rojo has quit IRC | 22:26 | |
clarkb | mnaser: fungi I've approved the change | 22:29 |
clarkb | mnaser: did you see my notes about servers and volumes that have leaked in vexxhost? I'm about to try manually removing attachments and deleting volumes but the servers will remain leaked I think | 22:29 |
ianw | clarkb: i was just having a look at those volumes you mentioned to try and help out -- they seem to think they're attached? | 22:32 |
*** rkukura has joined #openstack-infra | 22:32 | |
clarkb | ianw: yup. One of them is "attached" to a server that does not exist. We can super safely remove that attachment then delete that volume. For the other three I think those servers still exist but refuse to delete for whatever reason. We can remove the attachment and delete the volumes but the servers will probably be even more unhappy after that | 22:33 |
clarkb | however those servers were created on new years eve and are nodepool nodes so I don't think we care too much | 22:33 |
*** rkukura has quit IRC | 22:34 | |
clarkb | I'm hacking up system-config/clean-leaked-bfv.py now to detach those 4 | 22:34 |
mordred | clarkb: let me know if that doesn't work | 22:35 |
mordred | (also I really do need to put that into sdk as some actual api calls) | 22:36 |
clarkb | mordred: mostly just trying to understand what an attachment id is | 22:36 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Implement zookeeper-auth https://review.opendev.org/619156 | 22:36 |
clarkb | but otherwise I think I should be able to get it running soon enough | 22:36 |
clarkb | (the script won't work as is because it only removes volumes that don't have servers that exist in server list and we have 3 of those) | 22:36 |
*** slaweq has joined #openstack-infra | 22:37 | |
openstackgerrit | Tristan Cacqueray proposed zuul/nodepool master: Implement zookeeper-auth https://review.opendev.org/619155 | 22:38 |
*** rkukura has joined #openstack-infra | 22:39 | |
*** bradm has joined #openstack-infra | 22:40 | |
clarkb | mordred: ianw http://paste.openstack.org/show/790499/ how does that look? | 22:40 |
openstackgerrit | Merged openstack/project-config master: Add vexxhost/lodgeit-helm https://review.opendev.org/711861 | 22:40 |
clarkb | I'm going to try that now | 22:41 |
ianw | clarkb: lgtm if it works and the zombie servers don't hang on to the reference :) | 22:42 |
*** slaweq has quit IRC | 22:42 | |
clarkb | ianw: I think we are about to find out if they do or not :) | 22:42 |
mordred | clarkb: wait | 22:42 |
mordred | oh - nevermind. yes | 22:42 |
mordred | that looks good | 22:42 |
mordred | (I had a quick panic because my brain skipped over volumes_to_detach) | 22:43 |
clarkb | all 4 detachments returned 200 and volume list shows them detached | 22:44 |
clarkb | I'm going to try and delete them now | 22:44 |
clarkb | the three that were attached showed deleting and now show available | 22:45 |
clarkb | so ya I think those three volumes and their servers will need cloud admin intervention | 22:46 |
*** yboaron has quit IRC | 22:46 | |
clarkb | mnaser: noonedeadpunk the volumes and servers on lines 7-9 of http://paste.openstack.org/show/790497/ are sad and can/should be deleted | 22:46 |
clarkb | debian buster appears to have built successfully too | 22:47 |
clarkb | it's a (northern hemisphere) spring cleaning day! | 22:48 |
mordred | woot! | 22:48 |
*** slaweq has joined #openstack-infra | 22:54 | |
*** tkajinam has joined #openstack-infra | 22:55 | |
*** slaweq has quit IRC | 22:59 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: Implement zookeeper-auth https://review.opendev.org/619156 | 23:00 |
*** mattw4 has quit IRC | 23:00 | |
*** gshippey has quit IRC | 23:01 | |
*** rlandy is now known as rlandy|bbl | 23:05 | |
*** diablo_rojo has joined #openstack-infra | 23:12 | |
*** pkopec has quit IRC | 23:15 | |
*** lbragstad_ has quit IRC | 23:17 | |
*** jamesmcarthur has joined #openstack-infra | 23:22 | |
*** jamesmcarthur has quit IRC | 23:23 | |
*** gyee has quit IRC | 23:24 | |
*** jamesmcarthur has joined #openstack-infra | 23:24 | |
*** jamesmcarthur has quit IRC | 23:25 | |
*** jamesmcarthur has joined #openstack-infra | 23:25 | |
*** dchen has joined #openstack-infra | 23:25 | |
clarkb | ianw I reviewed the nb01.opendev.org stack. I left comments on many of them about some tweaks we can do that might be more maintainable long term. Let me know what you think (the -1 was left because it's an actual error aiui but the others are +2 because we can refine as we go) | 23:26 |
ianw | clarkb: thanks, yeah just thinking about the nodepool.yaml copy. i think with remote_src: yes it would work | 23:27 |
ianw | i.e. not copy itself constantly if it didn't change by testing checksum | 23:27 |
clarkb | ianw: ++ (on a followon change I note where we can simplify the container mount config if we do that copy) | 23:27 |
ianw | oh, thanks, a missing git add on the last bit | 23:27 |
ianw | i'm still a bit unsure on the overall lifecycle management of this container ... but think that will come as we test it | 23:28 |
ianw | i.e. upgrading on releases but not killing things in ways that leave crap around locally and on remote clouds | 23:28 |
*** tosky has quit IRC | 23:30 | |
clarkb | ianw: having the state in zk means we do a really good job of keeping leaks out of clouds now | 23:31 |
clarkb | ianw: the exceptions there are when clouds themselves fail to allow us to clean up | 23:32 |
clarkb | the local stuff I think will be a learning experience. Maybe we can have a container init script that cleans out /opt/dib_tmp? | 23:32 |
*** larainema has quit IRC | 23:33 | |
*** apetrich has joined #openstack-infra | 23:33 | |
clarkb | I guess we'll update the image whenever changes land to nodepool | 23:39 |
*** jamesmcarthur has quit IRC | 23:40 | |
*** jamesmcarthur has joined #openstack-infra | 23:40 | |
ianw | clarkb: yeah, i'm actually thinking the tmps should be volumes too | 23:41 |
ianw | the nesting will be interesting too, depending on how we keep proceeding with image generation | 23:42 |
clarkb | ianw: you mean proper docker volumes rather than bind mounts? | 23:43 |
ianw | yeah, something like that; attached just for the container lifespan | 23:43 |
clarkb | huh I wonder how those will work with mounts leaking | 23:44 |
clarkb | we can find out :) | 23:44 |
fungi | well, and image leaks when i forget to clear out the diskimages list on the builder config rather than just the launcher config before removing a provider | 23:45 |
*** jamesmcarthur has quit IRC | 23:46 | |
ianw | going to merge those two base ones to minimise rebase/merge changes | 23:49 |
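
Editorial sketch (not part of the log): around 22:33 to 22:36 clarkb notes that the existing cleanup script only picks volumes whose attached server no longer appears in the server list, so the three volumes attached to undeletable "zombie" servers had to be named by hand. The selection rule alone might look roughly like the following. This is a minimal sketch under stated assumptions: the `Volume` class, `find_leaked_volumes`, and the `force_ids` parameter are hypothetical names invented for illustration, it makes no cloud API calls, and it is not the script from system-config.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical, simplified stand-in for a block-storage volume record."""
    id: str
    attached_server_ids: tuple = ()


def find_leaked_volumes(volumes, existing_server_ids, force_ids=()):
    """Return volumes that look leaked.

    A volume is selected when it has at least one attachment and none of
    the servers it claims to be attached to still exist. Volume ids listed
    in force_ids (an assumption of this sketch) are selected regardless,
    mirroring the manual list used for servers that exist but will not
    delete.
    """
    existing = set(existing_server_ids)
    forced = set(force_ids)
    leaked = []
    for vol in volumes:
        if vol.id in forced:
            leaked.append(vol)
        elif vol.attached_server_ids and all(
            server_id not in existing for server_id in vol.attached_server_ids
        ):
            leaked.append(vol)
    return leaked


if __name__ == "__main__":
    # Made-up ids for illustration only.
    vols = [
        Volume("vol-gone", ("server-deleted",)),
        Volume("vol-zombie", ("server-zombie",)),
        Volume("vol-free"),
    ]
    servers = ["server-zombie"]
    print([v.id for v in find_leaked_volumes(vols, servers)])
    print([v.id for v in find_leaked_volumes(vols, servers, force_ids=["vol-zombie"])])
```

Keeping the selection separate from the detach and delete calls means the candidate list can be reviewed before anything destructive runs, which matches how the list was pasted for review in the channel before being applied.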