openstackgerrit | Ian Wienand proposed openstack/project-config master: Switch nodepool builders to opendev.org mirror https://review.opendev.org/690757 | 00:03 |
---|---|---|
*** mlavalle has quit IRC | 00:04 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Remove mirror02.dfw.rax.openstack.org https://review.opendev.org/727894 | 00:04 |
openstackgerrit | Merged openstack/project-config master: Switch nodepool builders to opendev.org mirror https://review.opendev.org/690757 | 00:20 |
ianw | mirror01.ord.rax.openstack.org has nothing but bot hits, and the opendev.org mirror is switched in, ergo we can remove that. i'll shut it down first now | 00:26 |
*** ysandeep|away is now known as ysandeep | 00:28 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Remove mirror01.ord.rax.openstack.org https://review.opendev.org/727897 | 00:30 |
ianw | that leaves the IAD mirror. the opendev.org mirror there was being used as the KAFS testing host. i think we should rebuild that as a focal host and switch it in, then we can remove the openstack.org one | 00:30 |
ianw | i've deleted that server and will rebuild it now | 00:37 |
ianw | we have plenty of notes on the kafs stuff, and gate testing changes (that need remerge due to the significant recent changes, but it's all there) | 00:37 |
openstackgerrit | Ian Wienand proposed opendev/zone-opendev.org master: Update mirror.iad.rax.opendev.org address https://review.opendev.org/727899 | 00:49 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Replace mirror.iad.rax.opendev.org host https://review.opendev.org/727901 | 01:14 |
openstackgerrit | Merged opendev/zone-opendev.org master: Update mirror.iad.rax.opendev.org address https://review.opendev.org/727899 | 01:17 |
ianw | the builders no longer reference it so i've shut down mirror02.dfw.rax.openstack.org; we can remove it after system-config merge | 01:41 |
ianw | clarkb: what is the deal with citycloud kna1/lon1/sto2 ? "# Disabled until 2018-07-01 at request of citycloud." | 01:44 |
clarkb | ianw: we turned it off due to capacity issues. at the time they said they hoped to turn it back on a few months later | 01:48 |
clarkb | at this point we may just want to retire those providers and can add them back again if they are able | 01:48 |
openstackgerrit | Merged opendev/system-config master: Replace mirror.iad.rax.opendev.org host https://review.opendev.org/727901 | 01:52 |
openstackgerrit | Ian Wienand proposed openstack/project-config master: Remove citycloud kna1/lon1/sto2 clouds https://review.opendev.org/727902 | 01:56 |
ianw | 02:10:29 up 854 days ... not a bad effort | 02:10 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Remove citycloud https://review.opendev.org/727905 | 02:12 |
ianw | boo ... infra-prod-base failed | 02:19 |
ianw | Run specified playbook on bridge.o.o and redirect output ... had a non-zero exit, but it's not really clear why | 02:22 |
ianw | there are no failed tasks | 02:22 |
ianw | but there are unreachable hosts | 02:25 |
clarkb | ianw you found the logs on bridge right? | 02:26 |
clarkb | we don't publish in zuul for most things yet as they need review of output | 02:26 |
ianw | yep, there were no failed tasks in it | 02:26 |
ianw | ansible gives !0 exit with unreachable hosts ... according to https://github.com/ansible/ansible/issues/19720 | 02:27 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: production-playbook : ignore unreachable hosts https://review.opendev.org/727907 | 02:37 |
ianw | i think this was my own fault for downing the mirror server before adding to emergency | 02:41 |
ianw | i'm not sure on the best way to retrigger the run | 02:41 |
clarkb | I think infra-prod-base runs hourly | 02:42 |
clarkb | you can also touch the lock file in zuuls homedir (I forget the path) then run the playbook from zuuls homedir directly | 02:42 |
clarkb | the lockfile got added to documentation | 02:42 |
clarkb | jobs will just queue up behind it and when you remove it the jobs run aiui | 02:43 |
clarkb | waiting for hourly run may not be too bad; should happen in about 20 minutes | 02:43 |
ianw | hrm, in this case i'd like it to happen in order because it's for the new iad mirror server, so i want base, letsencrypt, mirror etc | 02:44 |
clarkb | ah | 02:44 |
clarkb | I think you can reenqueue the change then | 02:44 |
clarkb | since that will get all the jobs for you | 02:44 |
ianw | yeah, let's see if i can remember the magic spell :) | 02:47 |
ianw | i tried "zuul enqueue-ref --tenant openstack --trigger gerrit --pipeline deploy --project opendev/system-config --ref master --newrev a751cab84d1c767c1a9503925ea5da2cc6cdcc98" | 02:50 |
ianw | which is ... kind of right ... every promote-image job has triggered and failed | 02:51 |
ianw | but it has also enqueued the base job, so ... | 02:52 |
clarkb | I think deploy is change not ref based | 02:54 |
clarkb | I always have to look it up in the inventory of the jobs that ran before | 02:55 |
clarkb | you get clues based on the zuul vars that line up to the command flags | 02:55 |
clarkb | ianw ya I would've expected a lot fewer jobs which is probably related to not having files in the event to filter on | 02:58 |
clarkb | checking logs and docker hub I think the image promotes all failed because there was no previous manifest pointing out images to promote so that's all fine | 03:02 |
clarkb | if you are happy with what base is doing then I think this is working well enough | 03:04 |
ianw | yeah, i don't think it managed to destroy anything | 03:04 |
ianw | hopefully base passes now then LE will run | 03:04 |
clarkb | mordred: corvus: tomorrow it might be good to clarify if that enqueue command was the expected method to be used. I think maybe if enqueue and not enqueue-ref was used file filters may have worked? but I'm not positive that is more correct | 03:09 |
clarkb | ianw: base was successful and looking at the log it doesn't seem like anything went weird. I do notice we have an apt-get autoremove -y and an ssh key update that both show up as changed. But those were both changed in the last run as well | 03:16 |
clarkb | So I think this is just going to look a lot like our old run all script and apply all the things | 03:16 |
ianw | yeah, it looks like mirror01.iad.rax is getting its certs now | 03:16 |
clarkb | it's not targeted like we've gotten used to so we should keep an eye out for odd failures, but otherwise seems to be happy | 03:17 |
ianw | i think because it's TOT (nothing else has merged) it's unlikely to try reverting anything | 03:17 |
clarkb | ya | 03:18 |
clarkb | more thinking it will be a canary for things that may be unhappy with their ansible | 03:18 |
clarkb | since we're potentially running those things less often now? | 03:19 |
clarkb | that info would be good to collect if so | 03:19 |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Patch CoreDNS corefile https://review.opendev.org/727868 | 03:23 |
clarkb | it continues to be happy. I'll resume my normal evening now | 03:33 |
ianw | clarkb: thanks! | 03:41 |
ianw | yay, https://mirror.iad.rax.opendev.org/ is up, is a focal node, and things look good | 03:56 |
*** ykarel|away is now known as ykarel | 03:58 | |
openstackgerrit | Ian Wienand proposed openstack/project-config master: Switch RAX IAD mirror to opendev.org version https://review.opendev.org/727917 | 03:58 |
ianw | clarkb: well i had a go at uploading the focal raw image from /opt/images to sjc1 vexxhost and didn't get anything bootable and also i can't delete the image now either ... so ... yeah | 05:42 |
ianw | anyway, if yourself, or anyone else feels like pulling at bits of https://etherpad.opendev.org/p/openstack.org-mirror-be-gone we're quite close now | 05:42 |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Add container and pod log in the test for ensure-kubernetes role https://review.opendev.org/727929 | 05:43 |
*** DSpider has joined #opendev | 05:44 | |
*** dpawlik has joined #opendev | 06:06 | |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Patch CoreDNS corefile https://review.opendev.org/727868 | 06:55 |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Add container and pod log in the test for ensure-kubernetes role https://review.opendev.org/727929 | 06:55 |
*** tosky has joined #opendev | 07:31 | |
*** dtantsur|afk is now known as dtantsur | 07:32 | |
*** rpittau|afk is now known as rpittau | 07:33 | |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Remove citycloud kna1/lon1/sto2 clouds https://review.opendev.org/727902 | 07:35 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Remove Citycloud from grafana https://review.opendev.org/727956 | 07:35 |
AJaeger | ianw: fixed your change and added one on top ^ | 07:35 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Stop translation stable branches on projects without Dashboard https://review.opendev.org/723217 | 07:47 |
*** moppy has quit IRC | 08:01 | |
*** moppy has joined #opendev | 08:01 | |
openstackgerrit | Jens Harbott (frickler) proposed opendev/system-config master: Fix access to clouds on bridge https://review.opendev.org/615197 | 08:07 |
*** ykarel is now known as ykarel|lunch | 08:11 | |
*** jaicaa has quit IRC | 08:14 | |
*** ysandeep is now known as ysandeep|lunch | 08:14 | |
*** jaicaa has joined #opendev | 08:17 | |
*** dpawlik has quit IRC | 08:40 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: yamlint: EOF newlines and comments indent https://review.opendev.org/725516 | 08:42 |
*** dpawlik has joined #opendev | 08:45 | |
*** ysandeep|lunch is now known as ysandeep | 09:03 | |
*** sshnaidm|afk is now known as sshnaidm | 09:11 | |
openstackgerrit | Merged zuul/zuul-jobs master: yamlint: EOF newlines and comments indent https://review.opendev.org/725516 | 09:27 |
*** ykarel|lunch is now known as ykarel | 09:31 | |
*** ykarel is now known as ykarel|mtg | 10:02 | |
*** rpittau is now known as rpittau|bbl | 10:18 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args https://review.opendev.org/728023 | 10:50 |
*** ykarel|mtg is now known as ykarel | 11:18 | |
*** Eighth_Doctor is now known as Conan_Kudo | 11:19 | |
*** Conan_Kudo is now known as Eighth_Doctor | 11:20 | |
*** lpetrut has joined #opendev | 11:25 | |
AJaeger | mordred: want to approve pbrx retirement change https://review.opendev.org/726462 and abandon all open reviews, please? | 11:52 |
*** tkajinam has quit IRC | 12:01 | |
openstackgerrit | Merged zuul/zuul-jobs master: Combine javascript deployment and deployment-tarball jobs https://review.opendev.org/727370 | 12:02 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: fetch-sphinx-tarball: introduce zuul_use_fetch_output https://review.opendev.org/681870 | 12:03 |
*** rpittau|bbl is now known as rpittau | 12:15 | |
*** lpetrut has quit IRC | 12:36 | |
*** lpetrut has joined #opendev | 12:48 | |
*** ykarel is now known as ykarel|afk | 12:54 | |
AJaeger | config-core, pbrx is ready to be removed from Zuul, please review https://review.opendev.org/#/c/726463/ | 13:39 |
*** lpetrut has quit IRC | 13:46 | |
*** ykarel|afk is now known as ykarel | 14:07 | |
openstackgerrit | Monty Taylor proposed opendev/base-jobs master: Add jobs for publishing javascript content https://review.opendev.org/728097 | 14:14 |
mordred | AJaeger: ^^ that should work I think | 14:14 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args https://review.opendev.org/728023 | 14:15 |
mordred | AJaeger: hrm - although we might actually want to untar in place and then do an rsync --delete - otherwise we're going to grow cruft in the target location | 14:19 |
AJaeger | mordred: will review later | 14:20 |
mordred | AJaeger: thanks | 14:20 |
openstackgerrit | Monty Taylor proposed opendev/base-jobs master: Add jobs for publishing javascript content https://review.opendev.org/728097 | 14:25 |
openstackgerrit | Monty Taylor proposed opendev/base-jobs master: Add jobs for publishing javascript content https://review.opendev.org/728097 | 14:26 |
clarkb | mordred: can you check the question about reruning deploy jobs from around 0240utc? | 14:26 |
AJaeger | mordred: LGTM, fix the linter job please | 14:34 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args https://review.opendev.org/728023 | 14:37 |
openstackgerrit | Monty Taylor proposed opendev/base-jobs master: Add jobs for publishing javascript content https://review.opendev.org/728097 | 14:37 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args https://review.opendev.org/728023 | 14:38 |
*** mlavalle has joined #opendev | 14:40 | |
clarkb | infra-root I'm trying to help a project with possibly force merging a change over in #openstack-infra. I've tried to pull up Project Bootstrappers via web ui to add myself and it doesn't seem to be in the list. If I ls-groups via the ssh api it is there. Any idea of what is going on? | 14:51 |
clarkb | nevermind I'm a derp | 14:52 |
clarkb | that's what I get for early morning gerrit stuff (I was looking in the project list not the group list in the web ui) | 14:52 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args https://review.opendev.org/728023 | 14:54 |
openstackgerrit | Merged zuul/zuul-jobs master: tox siblings installed packages: Add PEP 440 direct reference format https://review.opendev.org/727475 | 15:00 |
*** diablo_rojo has joined #opendev | 15:00 | |
*** ykarel is now known as ykarel|away | 15:12 | |
fungi | clarkb: i generally just use the ssh api and `ssh -p 29418 fungi@review.opendev.org 'gerrit set-members "Project Bootstrappers" --add fungi'` and then similarly with --remove when i'm done | 15:14 |
clarkb | I normally use the webui but will keep ^ in mind as that is nice and repeatable | 15:22 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: fetch-tox-output: use tox_extra_args https://review.opendev.org/728023 | 15:23 |
fungi | yeah, my workflow is to --add myself, ctrl-r my gertty view of a change and then review to add verified and ctrl-u to submit, or git push --force gerrit whatever stuff i've prepared, and then --remove myself again | 15:27 |
fungi | saves me fumbling with the webui for any of it | 15:27 |
fungi | took me a while to get to the point where i consistently remember the gerrit command for that though ;) | 15:28 |
fungi | btw, i restarted the vos release of mirror.ubuntu from yesterday, first attempt eventually failed with "Failed to end transaction on rw volume: Possible communication failure; The volume 536870949 could not be released to the following 1 sites: afs02.dfw.openstack.org /vicepa" | 15:30 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: use tox_extra_args, add tox_config_file https://review.opendev.org/728023 | 15:30 |
fungi | this time around it noted "Deleting extant RO_DONTUSE site on afs02.dfw.openstack.org... done; Creating new volume 536870950 on replication site afs02.dfw.openstack.org: done" | 15:31 |
fungi | so maybe it'll be better | 15:31 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: use tox_extra_args, add tox_config_file https://review.opendev.org/728023 | 15:31 |
fungi | if it fails the same way again, i'll see if i need to vos endtrans something, or delete the ro volume on afs02.dfw and recreate | 15:31 |
*** ysandeep is now known as ysandeep|sleep | 15:31 | |
clarkb | fungi: I almost read that deleting extant... thing to mean its removed the ro volume and is recreating it | 15:32 |
clarkb | here is hoping that makes it happy | 15:32 |
clarkb | emergency.yaml looks a lot better now. Thanks everyone for giving that a quick cleanup | 15:35 |
clarkb | infra-root can I get reviews on https://review.opendev.org/#/c/727865/ to add a new nova key entry in our clouds? Then we'll follow up with https://review.opendev.org/#/c/727867/ to complete work I told shrews and dmsimard to not worry about (since it needs this rotation step) | 15:36 |
AJaeger | infra-root, logstash queue is up to 50k and not catching up | 15:37 |
clarkb | AJaeger: thanks. Chances are we've got another extra large log file clogging the pipes | 15:37 |
clarkb | AJaeger: I'll try to take a look soon | 15:37 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: use tox_extra_args, add tox_config_file https://review.opendev.org/728023 | 15:39 |
corvus | clarkb: drat, i was hoping i'd have the signet hc now and a new ssh key. :/ | 15:42 |
clarkb | corvus: mine arrive tomorrow according to ups | 15:43 |
corvus | clarkb: exciting! | 15:43 |
clarkb | AJaeger: screen-monasca-api.txt Mon Apr 27 15:54:23 2020 314.2M <- that will do it | 15:43 |
AJaeger | 314 M? Argh ;( | 15:43 |
clarkb | corvus: ya its been a struggle for them with covid and manufacturing and all that. I'm really hopeful the utility will outweigh all of that and maybe even buy more for people I know | 15:44 |
clarkb | AJaeger: also note the size of the job-output on this one https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c91/724633/2/check/tox-py35/c911410/ | 15:44 |
clarkb | that was https://review.opendev.org/#/c/724633/ which avass has abandoned | 15:45 |
clarkb | so maybe that is a one off | 15:45 |
AJaeger | 715 M ;( | 15:46 |
clarkb | AJaeger: https://29f6adc0364e579577f3-f81ad92f06c83d9d7da63d2d01eeb258.ssl.cf5.rackcdn.com/727363/2/check/openstack-tox-docs/69825e7/ that is from ironic but wonder if that is widespread too? | 15:47 |
clarkb | AJaeger: perhaps related to the work happening around theming and stuff right now? | 15:47 |
clarkb | AJaeger: I think we start there, see if we can get monasca to log less verbosely if not exclude it in our log processing. And check openstack tox docs jobs as well | 15:48 |
clarkb | I'm going to download the tox-docs log from above to see if I can make sense of what is happening | 15:48 |
AJaeger | Let me check tox-docs as well... | 15:48 |
clarkb | AJaeger: looks like it may be 120MB of latex warnings from pdf generation? | 15:51 |
clarkb | I'm guessing that is something we can pretty easily clean up thankfully | 15:51 |
clarkb | basically each line of docs has multiple lines of warnings/errors | 15:51 |
AJaeger | let me silence "automated-steps" at least | 15:53 |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Don't index monasca-api logs in elasticsearch https://review.opendev.org/728121 | 15:54 |
clarkb | AJaeger: ^ monasca-persister is already in the exclude list so I think we can just add monasca-api to start | 15:55 |
AJaeger | clarkb: +2 | 15:55 |
clarkb | AJaeger: I think if we can get that fix landed and a fix for pdf/latex things in I'll restart the workers and we can see if we catch more of these :) | 15:55 |
clarkb | restarting now would likely just result in quick crashes particularly if pdf generation is that verbose | 15:56 |
AJaeger | not sure how to silence the latex thing ;( | 16:01 |
*** rpittau is now known as rpittau|afk | 16:01 | |
openstackgerrit | Merged zuul/zuul-jobs master: Policy rule for ownership between remote and executor https://review.opendev.org/724855 | 16:04 |
clarkb | AJaeger: we run `sphinx-build -b latex` which runs pdflatex | 16:04 |
clarkb | so far haven't found anything useful on the sphinx side. now going to see the pdflatex side | 16:05 |
clarkb | might also be able to change the latex_engine in sphinx but that may produce slightly different results in the output? | 16:06 |
AJaeger | we use xelatex in sphinx - doubt that changing that helps | 16:08 |
clarkb | AJaeger: ironic's sphinx conf.py doesn't set latex_engine so I assumed they were using the default. But the logs show xetex being used? | 16:09 |
AJaeger | openstackdocstheme sets xelatex | 16:10 |
AJaeger | (with exception of 2.1.0 which was just fixed in 2.1.1) | 16:10 |
fungi | could tox be configured to redirect (or filter) the output from that latex command in cases where it's just too verbose? | 16:11 |
clarkb | fungi: ya, thats the big hammer options available to us I think | 16:11 |
fungi | |grep -v some patterns | 16:11 |
clarkb | we could also have projects fix their latex (though I expect that to be fiddly and not easy) | 16:11 |
AJaeger | overfull hboxes ;( | 16:11 |
fungi | ugly, and needs /usr/bin/grep added to the out-of-venv command whitelist in the tox config to silence another warning | 16:12 |
clarkb | AJaeger: underfull too | 16:12 |
AJaeger | yep | 16:12 |
clarkb | fungi: AJaeger: possible bad idea: redirect all the output to a file then log that file separate from job-output. Then we'll collect it as a log but not index it | 16:12 |
clarkb | however I don't really feel like this info is useful right now because i really doubt anyone will figure out the hboxes | 16:13 |
fungi | not a terrible idea | 16:13 |
fungi | also if it's that repetitive, it probably compresses well | 16:13 |
clarkb | AJaeger: fungi: I think that is the best idea I've got for now. We have the jobs section for pdf generation do redirection into a file, emit a note to job-output that this step tends to be incredibly verbose and logs can be found at $location if necessary | 16:16 |
AJaeger | roles/build-pdf-docs in openstack-zuul-jobs handles the building, we could add something there | 16:19 |
*** sshnaidm is now known as sshnaidm|afk | 16:21 | |
AJaeger | or in roles/tox in zuul-jobs | 16:26 |
AJaeger | no good idea right now ;( | 16:26 |
clarkb | AJaeger: maybe we can do the tox role as is but modify it to install deps only. Then do a followup task that executes tox in shell and redirects things? | 16:26 |
clarkb | that way we get all the dependency stuff managed by the tox role as well as a virtualenv ready to go? | 16:27 |
*** dpawlik has quit IRC | 16:27 | |
clarkb | ya we can set tox_extra_args | 16:27 |
clarkb | let me try write that change up really quickly | 16:27 |
AJaeger | clarkb: I can't wrap my head around that right now. Thanks for writing that up, will try reviewing later | 16:28 |
AJaeger | fungi, want to approve the monasca logstash block change https://review.opendev.org/728121, please? | 16:30 |
*** dtantsur is now known as dtantsur|afk | 16:44 | |
*** mlavalle has quit IRC | 16:47 | |
openstackgerrit | Merged opendev/base-jobs master: Don't index monasca-api logs in elasticsearch https://review.opendev.org/728121 | 16:59 |
AJaeger | ianw has a change up to switch RAX to opendev mirror, please review https://review.opendev.org/727917 | 17:06 |
AJaeger | and we're ready to retire pbrx: https://review.opendev.org/726463 | 17:07 |
AJaeger | config-core, please review these two ^ | 17:07 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: Simplify zuul-output usage and remove merge-output-to-logs https://review.opendev.org/728151 | 17:11 |
openstackgerrit | James E. Blair proposed zuul/zuul-jobs master: Move artifactory test job https://review.opendev.org/728152 | 17:13 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add a tox module https://review.opendev.org/728154 | 17:14 |
*** panda is now known as panda|out | 17:15 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add a tox module https://review.opendev.org/728154 | 17:16 |
openstackgerrit | Merged zuul/zuul-jobs master: fetch-sphinx-tarball: introduce zuul_use_fetch_output https://review.opendev.org/681870 | 17:19 |
openstackgerrit | Merged zuul/zuul-jobs master: Add upload-artifactory role https://review.opendev.org/725678 | 17:26 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Add a tox module https://review.opendev.org/728154 | 17:28 |
*** avass has joined #opendev | 17:39 | |
openstackgerrit | Merged zuul/zuul-jobs master: Move artifactory test job https://review.opendev.org/728152 | 18:05 |
*** hashar has joined #opendev | 18:11 | |
*** diablo_rojo has quit IRC | 18:14 | |
openstackgerrit | Andreas Jaeger proposed opendev/system-config master: Stop cloning a bunch of puppet modules we don't use https://review.opendev.org/720892 | 18:23 |
AJaeger | mordred: I rebased your change. infra-root, the change above is the next step to retire a couple of puppet modules ^ | 18:24 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: tox: update lint regex to not require column https://review.opendev.org/725030 | 18:31 |
AJaeger | mordred, infra-root, any idea why 720892 fails? Fixes welcome ;) | 18:50 |
*** dpawlik has joined #opendev | 18:51 | |
clarkb | AJaeger: mordred https://zuul.opendev.org/t/openstack/build/2bcf368be87f46b493fc25804b70ecba/log/applytest/puppetapplytest23.final.out.FAILED Could not find declared class ::openstackci::nodepool_builder at /home/zuul/applytest/puppetapplytest23.final:10:3 on node ubuntu-xenial-inap-mtl01-0016575779.openstack.org | 18:55 |
clarkb | did we remove openstackci? | 18:55 |
clarkb | I'm guessing yes and nb03 is still using it? | 18:55 |
AJaeger | clarkb: mordred planned to remove it | 18:56 |
AJaeger | clarkb: see https://review.opendev.org/#/c/720901/ | 18:57 |
AJaeger | if we still need it, we can add it back in... | 18:57 |
clarkb | AJaeger: ya I think nb03 is still using it | 18:58 |
clarkb | its close to not using it anymore but is still using it aiui | 18:58 |
AJaeger | mordred: could you rework, please? ^ | 18:58 |
*** hashar has quit IRC | 19:22 | |
*** hashar has joined #opendev | 19:23 | |
*** dpawlik has quit IRC | 19:46 | |
clarkb | ok responded to the jitsi meet thread with things I've learned | 20:18 |
fungi | thanks! i wanted to but am pretty sure i just don't have time to get to it | 20:19 |
clarkb | fungi: https://review.opendev.org/#/c/727865/ is another change on my todo list for landing if you can review it | 20:26 |
clarkb | fungi: step 1 in rotating nodepool node ssh keys | 20:26 |
fungi | that brings us to 7 ssh keys | 20:28 |
clarkb | ya :( | 20:28 |
fungi | 5 normally residing in north america, one in australia and one in europe | 20:28 |
clarkb | I'm pulling up ianw's mirror backlog | 20:30 |
clarkb | infra-root https://review.opendev.org/#/c/727917/1 will put our first focal mirror node in production. It has the +2's necessary but not sure if there is any other vetting we want to do before doing that? I've clicked around the http served indexes and that seems happy | 20:32 |
clarkb | mordred: (but infra-root in general) https://review.opendev.org/#/c/727907/1 is an interesting one that deals with ansible failures to connect to servers. I've left some thoughts on potential issues with the change but am curious to see if others think that is an issue or not | 20:40 |
*** hashar has quit IRC | 20:49 | |
corvus | clarkb: 917 seems like one to wait until release is all clear; is it? | 20:54 |
clarkb | corvus: yes, and I believe the release is all clear except for the trailing projects which do their stuff in ~2 weeks? | 20:55 |
clarkb | smcginnis: ^ ya'll are basically done for a bit right? | 20:55 |
fungi | there's no longer a deadline for the cycle-trailing projects | 20:58 |
fungi | they get done whenever they get done, and we no longer hold up infrastructure changes for them | 20:58 |
smcginnis | clarkb: Yep | 20:58 |
clarkb | the fixes for logstash sadness have landed. I'm going to restart logstash worker services to resurrect them | 21:02 |
fungi | mirror.ubuntu volume re-re-release is still going | 21:10 |
clarkb | fungi: I want to say it was about a 12 hour release when I added focal | 21:12 |
fungi | well, it's hopefully not replacing all the files for focal | 21:13 |
clarkb | fungi: from the message you posted earlier it wouldn't surprise me if it is replacing all the files | 21:14 |
clarkb | it said it removed the old volume and is recreating it | 21:14 |
fungi | on afs02.dfw anyway | 21:14 |
fungi | but yeah, quite possible | 21:14 |
clarkb | right the ro copy | 21:16 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Add iptables_extra_allowed_groups https://review.opendev.org/726475 | 21:19 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Run Zuul, Nodepool, and Zookeeper as the "container" user https://review.opendev.org/726958 | 21:28 |
clarkb | fungi: any reason to not approve https://review.opendev.org/#/c/727865/ ? it should only add a new key (we won't use it yet) impact should be low | 21:36 |
*** diablo_rojo has joined #opendev | 21:43 | |
clarkb | ianw for when your day starts I had a couple comments on some of your mirror and ansible changes | 21:48 |
ianw | ok, cool, will look! | 21:58 |
ianw | i got stuck booting something successfully to replace the two vexxhost ones ... perhaps we should start with https://review.opendev.org/#/c/726886/ | 22:00 |
ianw | i got that .raw image booting in vexxhost, but it didn't come up on the network. wasn't sure if it was me or glean | 22:00 |
clarkb | huh I wonder if our test nodes have similar issues on vexxhost with focal | 22:03 |
ianw | i also then couldn't delete that image | 22:05 |
ianw | yeah i still can't "The image cannot be deleted because it is in use through the backend store outside of Glance" | 22:06 |
clarkb | ianw: are the servers you booted on it refusing to delete? | 22:10 |
clarkb | I think we've always seen glance agree to delete those images once the instances were removed | 22:10 |
ianw | yeah, the dud servers that booted were removed without issues | 22:11 |
ianw | maybe i should try booting it again while i'm here | 22:12 |
ianw | here's a full boot log http://paste.openstack.org/show/793619/ | 22:16 |
clarkb | ianw: oh did you double check if the volume leaked? | 22:16 |
clarkb | because that has happened. Server deletes but volume for it does not. Image refuses to delete as a result | 22:16 |
ianw | clarkb: it probably is that | 22:18 |
ianw | http://paste.openstack.org/show/793620/ | 22:18 |
ianw | i saw those blank ones there yesterday and didn't want to delete them | 22:18 |
clarkb | ianw: if you show them there should be details on what hosts they belonged to in a json blob | 22:18 |
clarkb | should give you the nova server name | 22:18 |
clarkb | then from that you can decide if it is safe to delete the volume | 22:19 |
clarkb | logstash queue backlog has fallen by ~10% since I restarted services | 22:19 |
clarkb | down from 43k to 39k | 22:19 |
ianw | good point, they have "volume_image_metadata" | 22:19 |
ianw | in that log we see | 22:21 |
ianw | [  OK  ] Reached target Network is Online. | 22:21 |
ianw | then at the very end | 22:22 |
ianw | [  OK  ] Finished Glean for interface ens3. | 22:22 |
ianw | [  OK  ] Reached target Network (Pre). | 22:22 |
clarkb | that looks like glean is starting late | 22:23 |
ianw | why pre comes after network is anyone's guess ... | 22:23 |
clarkb | network is online is supposed to mean basic networking has happened (so you can bind to addrs iirc) | 22:23 |
clarkb | https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ ya | 22:24 |
clarkb | but I guess that is platform dependent so ubuntu may be setting that early | 22:24 |
ianw | i converted the upstream images to raw, and uploaded that, and that doesn't even seem to boot, or at least nothing comes up on the console | 22:37 |
clarkb | I wonder if mnaser or Open10K8S have looked at focal yet | 22:38 |
mnaser | clarkb: like in terms of | 22:42 |
mnaser | Running it on our cloud? | 22:42 |
clarkb | mnaser: ya | 22:42 |
clarkb | mnaser: per ianw above the upstream ubuntu images don't seem to boot after being converted to raw and uploaded | 22:43 |
clarkb | and our built images using glean aren't any better | 22:43 |
mnaser | I don’t believe we have yet. Maybe noonedeadpunk can dig a bit.. | 22:43 |
ianw | ohhhh wait wait i might be trying to boot an *arm64* image | 22:44 |
clarkb | ianw: oh ha | 22:44 |
ianw | yeah | 22:44 |
ianw | it's almost like i need fonts that distinguish between "rm" "md" in big red letters or something! :) | 22:45 |
ianw | heh, well there we go, that works | 22:49 |
ianw | i think that probably works out to about an hour per letter i wasted fiddling with things :/ | 22:51 |
ianw | and we have learnt, i think, that something is up with glean ... maybe? i'm not 100% sure on those images mordred made | 22:52 |
clarkb | ianw: any idea if ipv6 is coming up properly? the kernel handles that with RAs in vexxhost I think | 22:54 |
clarkb | ianw: that might allow us to get in and debug ipv4 from there | 22:54 |
clarkb | but ya debugging these things tends to be difficult if we can't get a shell on the host | 22:55 |
ianw | i'll get these hosts up and maybe try things out in the ci tenant | 22:56 |
ianw | jenkins/zuul tenant i mean | 22:56 |
*** tkajinam has joined #opendev | 22:58 | |
clarkb | ianw: did you see my comment about returning success when a host is unreachable? I think we don't want to universally apply that rule and may not want to apply it at all due to the relationships between playbooks | 23:03 |
clarkb | that said I've thought about it more and in the run all days we would continue if any one playbook failed too | 23:04 |
clarkb | we were careful to keep related things in individual playbooks to avoid misapplying changes on unreachable failures (or any failure really) | 23:04 |
*** tosky has quit IRC | 23:05 | |
ianw | yeah, it's a fair point ... it's just we don't have much of a "this is failing" alert system, other than you watching it directly | 23:05 |
ianw | perhaps getting someone in the loop, rather than ignoring it, is better | 23:06 |
clarkb | ianw: the proposed change would make it harder to notice that too right? because we'd basically paper over the situation? | 23:06 |
ianw | true :) | 23:07 |
openstackgerrit | Ian Wienand proposed opendev/zone-opendev.org master: Add vexxhost opendev.org mirrors https://review.opendev.org/728308 | 23:23 |
clarkb | ianw: in ^ the first two A/AAAA records have an hour ttl and the next set don't. Is that intentional? | 23:26 |
ianw | hrm, it seems it depends if you copy/paste from the launch output or other bits in the file | 23:28 |
ianw | actually i wonder if emacs adds that for me? | 23:30 |
clarkb | I think we've done hour long ttls on mirrors to avoid extra lookups in jobs | 23:30 |
clarkb | since dns in some jobs has been a flaky thing | 23:30 |
openstackgerrit | Ian Wienand proposed opendev/zone-opendev.org master: Add vexxhost opendev.org mirrors https://review.opendev.org/728308 | 23:31 |
ianw | the output from launch.py doesn't have the ttl and i don't remember ever manually setting it | 23:31 |
clarkb | ya I think its mostly a mirror thing | 23:32 |
clarkb | since we've seen jobs do dns lookups and fail to pull them down due to dns through nat or whatever | 23:33 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors https://review.opendev.org/728309 | 23:39 |
ianw | ^ both hosts are up, with a separate volume attached for openafs/apache cache | 23:40 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors https://review.opendev.org/728309 | 23:40 |
clarkb | ianw: the depends on is wrong | 23:42 |
clarkb | it is pointing to itself, that is why zuul complains | 23:42 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors https://review.opendev.org/728309 | 23:42 |
clarkb | ianw: two more things noted on the change | 23:45 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors https://review.opendev.org/728309 | 23:49 |
openstackgerrit | Ian Wienand proposed openstack/project-config master: Switch vexxhost mirrors to opendev.org https://review.opendev.org/728310 | 23:50 |
clarkb | ianw: close :) one small thing on 309 still | 23:51 |
ianw | yes :) | 23:52 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Add vexxhost opendev.org mirrors https://review.opendev.org/728309 | 23:52 |
ianw | i feel we should be autogenerating this list ... another time though :) | 23:52 |
*** DSpider has quit IRC | 23:54 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Remove vexxhost openstack.org mirrors https://review.opendev.org/728311 | 23:54 |
ianw | i'm going to merge the removal of the dfw.rax.openstack ... that server is shutdown and in emergency | 23:58 |
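
The TTL mismatch clarkb spotted at 23:26 (some A/AAAA records carrying an hour-long TTL and others not) is the kind of thing a small check could flag before review. The following is only a minimal sketch, not anything from the opendev tooling: the function name `records_missing_ttl`, the sample host names, and the documentation-range addresses are all hypothetical, and the parser assumes a simplified zone-file layout where every record line starts with its owner name and any TTL appears before the class.

```python
# Hypothetical helper: list A/AAAA records in a zone-file fragment that
# have no explicit TTL. Assumes each record line is laid out as
#   name [ttl] [IN] type rdata
# which is a simplification of the full zone-file grammar.

RECORD_TYPES = {"A", "AAAA"}


def records_missing_ttl(zone_text):
    """Return (name, type) pairs for A/AAAA records without a TTL field."""
    missing = []
    for raw in zone_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("$"):  # skip blanks and directives
            continue
        fields = line.split()
        name, rest = fields[0], fields[1:]
        has_ttl = bool(rest) and rest[0].isdigit()
        if has_ttl:
            rest = rest[1:]
        if rest and rest[0].upper() == "IN":
            rest = rest[1:]
        if rest and rest[0].upper() in RECORD_TYPES and not has_ttl:
            missing.append((name, rest[0].upper()))
    return missing


# Made-up sample data using documentation address ranges.
SAMPLE = """\
mirror01.example  3600  IN  A     203.0.113.10
mirror01.example  3600  IN  AAAA  2001:db8::10
mirror02.example        IN  A     203.0.113.11
mirror02.example        IN  AAAA  2001:db8::11
"""

if __name__ == "__main__":
    for name, rtype in records_missing_ttl(SAMPLE):
        print(f"{name} {rtype}: no explicit TTL")
```

Run against the sample, this reports the two `mirror02.example` records. A real check would need to handle TTLs with unit suffixes, `$TTL` defaults, and records that inherit their owner name from the previous line, none of which this sketch attempts.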