opendevreview | Ian Wienand proposed opendev/system-config master: launch: permit ssh root login on base image https://review.opendev.org/c/opendev/system-config/+/869615 | 00:27 |
---|---|---|
ianw | Unable to enable service systemd-timesyncd: Failed to enable unit: Unit file /etc/systemd/system/systemd-timesyncd.service is masked. | 00:32 |
ianw | maybe this image is too opinionated ... geez this feels like the old days of snapshot images | 00:34 |
Clark[m] | Oof | 00:37 |
*** rlandy is now known as rlandy|out | 00:38 | |
opendevreview | Merged openstack/diskimage-builder master: Clean up tox.ini for tox v4 https://review.opendev.org/c/openstack/diskimage-builder/+/869579 | 00:47 |
*** JasonF is now known as JayF | 02:17 | |
ianw | ok, i went back to making our own image. converting it to raw and copying some properties seems to work it out | 02:55 |
ianw | the host keys still don't display correctly in the launch node output | 02:55 |
opendevreview | Ian Wienand proposed opendev/system-config master: Add nb04.opendev.org https://review.opendev.org/c/opendev/system-config/+/869622 | 04:08 |
*** yadnesh|away is now known as yadnesh | 04:11 | |
opendevreview | Ian Wienand proposed opendev/system-config master: doc/nodepool: update vhd-util docs https://review.opendev.org/c/opendev/system-config/+/869623 | 04:13 |
opendevreview | Ian Wienand proposed opendev/zone-opendev.org master: Add nb04.opendev.org https://review.opendev.org/c/opendev/zone-opendev.org/+/869626 | 04:23 |
*** ysandeep is now known as ysandeep|ruck | 04:39 | |
*** ysandeep|ruck is now known as ysandeep|ruck|brb | 05:41 | |
*** pojadhav- is now known as pojadhav | 05:51 | |
ianw | ok, nb04 should be ready. it's up, and got an 800gb volume attached, which is the same as in linaro | 05:53 |
*** marios is now known as marios|rover | 06:04 | |
*** ysandeep|ruck|brb is now known as ysandeep|ruck | 06:04 | |
*** ysandeep|ruck is now known as ysandeep|ruck|brb | 06:49 | |
*** ysandeep|ruck|brb is now known as ysandeep|ruck | 06:59 | |
*** jpena|off is now known as jpena | 08:20 | |
opendevreview | Merged opendev/base-jobs master: Fix tox.ini for tox v4 https://review.opendev.org/c/opendev/base-jobs/+/869582 | 09:39 |
*** ysandeep|ruck is now known as ysandeep|afk | 10:11 | |
*** rlandy|out is now known as rlandy | 11:13 | |
*** ysandeep|afk is now known as ysandeep|ruck | 11:31 | |
*** bhagyashris is now known as bhagyashris|brb | 13:36 | |
opendevreview | Luke Odom proposed openstack/diskimage-builder master: Map curl to curl-minimal for rocky 9 https://review.opendev.org/c/openstack/diskimage-builder/+/869424 | 13:59 |
opendevreview | Luke Odom proposed openstack/diskimage-builder master: Add swap support https://review.opendev.org/c/openstack/diskimage-builder/+/869270 | 13:59 |
*** dasm|off is now known as dasm | 14:00 | |
dpawlik | Clark[m], fungi: hey, I would like to make a version e.g. 1.0.0 or 0.1.0 of logscraper. I checked a few docs related to OpenStack releases, but cannot find proper information about versioning such projects. Should I create a PS similar to https://review.opendev.org/c/openstack/releases/+/607010 or just make a tag, push it to openstack/ci-log-processing | 14:02 |
dpawlik | project and then add the project there https://github.com/openstack/releases/blob/master/tools/build_tag_history.sh#L21-L57 or do it earlier? | 14:02 |
fungi | dpawlik: sig repositories and independent deliverables don't use the coordinated openstack release process, their maintainers just push signed git tags directly to gerrit | 14:04 |
dpawlik | fungi: ack. Should I also add an entry to build_tag_history.sh or is that not needed? | 14:07 |
fungi | not needed | 14:08 |
fungi | we do need to check the acl to make sure you have signed tag push access to the repo, but i'm in a meeting at the moment. can check after | 14:09 |
dpawlik | fungi: ok. No rush | 14:12 |
fungi | dpawlik: is it the openstack/ci-log-processing repository? | 14:17 |
dpawlik | yup | 14:17 |
fungi | dpawlik: propose a change to https://opendev.org/openstack/project-config/src/branch/master/gerrit/acls/openstack/ci-log-processing.config which adds a section like the one described at https://docs.opendev.org/opendev/infra-manual/latest/creators.html#creation-of-tags and use ci-log-processing-release as the group name | 14:20 |
fungi | i'll approve it and add you to the group once it deploys | 14:21 |
fungi | dpawlik: also you'll probably want to at least look over https://docs.opendev.org/opendev/infra-manual/latest/creators.html#prepare-an-initial-release and https://docs.opendev.org/opendev/infra-manual/latest/drivers.html#tagging-a-release | 14:22 |
opendevreview | daniel.pawlik proposed openstack/project-config master: Add createSignedTag permissions for openstack/ci-log-processing project https://review.opendev.org/c/openstack/project-config/+/869727 | 14:29 |
dpawlik | ack fungi. Thanks for help | 14:29 |
opendevreview | Merged openstack/project-config master: Add createSignedTag permissions for openstack/ci-log-processing project https://review.opendev.org/c/openstack/project-config/+/869727 | 14:52 |
fungi | dpawlik: i've added your gerrit account as the initial member of ci-log-processing-release | 15:13 |
dpawlik | cool, thank you fungi | 15:19 |
dpawlik | we will merge a few changes, apply them in production, and if all is fine for a week or two I will do a release | 15:20 |
fungi | sounds good | 15:20 |
dpawlik | I see that the cluster is not so full now so maybe after upgrading rdo/sf infra I will send an email to the opendev mailing list and schedule an OpenSearch upgrade to a newer version | 15:22 |
dpawlik | they have an automated process on AWS, but the cluster will not be reachable during that time | 15:22 |
*** ysandeep|ruck is now known as ysandeep|out | 15:24 | |
clarkb | fungi: frickler have time for https://review.opendev.org/c/opendev/base-jobs/+/869580 ? | 15:57 |
frickler | clarkb: ack, I had that open earlier today but then got distracted | 16:06 |
clarkb | thank you! | 16:06 |
opendevreview | Merged openstack/project-config master: Add the "api-ref-jobs" template to CloudKitty https://review.opendev.org/c/openstack/project-config/+/867651 | 16:16 |
opendevreview | Merged opendev/base-jobs master: Add opendev nox docs promotion https://review.opendev.org/c/opendev/base-jobs/+/869580 | 16:18 |
*** marios|rover is now known as marios|out | 16:46 | |
opendevreview | Clark Boylan proposed opendev/bindep master: Use nox https://review.opendev.org/c/opendev/bindep/+/868004 | 17:05 |
*** jpena is now known as jpena|off | 17:24 | |
fungi | clarkb: was https://goharbor.io/docs/2.7.0/administration/configure-proxy-cache/ one of the options we looked at for container image caching? | 17:48 |
Clark[m] | I don't think so. We primarily looked at using the docker registry as it documents the use case but then has no method of pruning. | 18:01 |
Clark[m] | Lack of pruning appears to be a fatal flaw in basically every container image registry | 18:01 |
fungi | storage is free and infinitely available, right? | 18:02 |
Clark[m] | That one says "log in to the web UI to start garbage collection"... And garbage collection isn't quite right either for a cache | 18:02 |
fungi | yeah | 18:04 |
fungi | maybe the web ui is backed by a rest api at least | 18:04 |
Clark[m] | It also has quotas but they are project level not registry level. Really as far as I can tell no one has really implemented a lru cache for docker images on finite disk space | 18:05 |
fungi | apparently it's the registry that the sovereign cloud stack distribution ships/uses, just couldn't remember discussing that one previously | 18:05 |
Clark[m] | Anymore it seems like caches like that are not considered worth supporting. Even pypi is going to kill our ability to cache properly | 18:07 |
fungi | we just need them to drop a cdn endpoint in each of our donor cloud regions ;) | 18:08 |
clarkb | fungi the parent change for the bindep nox change needs review too https://review.opendev.org/c/opendev/bindep/+/868003/ that change fixes an issue with updated deps on rolling release distros | 18:47 |
fungi | oh, yep | 18:47 |
fungi | that one lgtm too | 18:48 |
opendevreview | Gustavo Sanchez proposed openstack/project-config master: Add the woodpecker charm to Openstack charms https://review.opendev.org/c/openstack/project-config/+/869751 | 18:58 |
fungi | #status log Restarted services on lists.openstack.org since some mailman processes were terminated earlier today by out-of-memory events | 19:24 |
opendevstatus | fungi: finished logging | 19:24 |
*** mtomaska__ is now known as mtomaska | 19:27 | |
*** rlandy is now known as rlandy|brb | 19:37 | |
clarkb | ianw: did catch a small issue on https://review.opendev.org/c/openstack/project-config/+/868443/2 | 19:47 |
ianw | thanks, yep that's a typo. will fix | 19:48 |
*** rlandy|brb is now known as rlandy | 20:01 | |
opendevreview | Merged opendev/zone-opendev.org master: Add nb04.opendev.org https://review.opendev.org/c/opendev/zone-opendev.org/+/869626 | 20:09 |
opendevreview | Joshua Watt proposed zuul/zuul-jobs master: use-buildset-registry: Respect docker_mirror https://review.opendev.org/c/zuul/zuul-jobs/+/869760 | 20:15 |
opendevreview | Joshua Watt proposed zuul/zuul-jobs master: use-docker-mirror: Also run task if docker_mirror is defined https://review.opendev.org/c/zuul/zuul-jobs/+/869761 | 20:16 |
opendevreview | Merged opendev/system-config master: Drop openEuler 20.03 LTS SP2 repo mirror https://review.opendev.org/c/opendev/system-config/+/848796 | 20:34 |
opendevreview | Joshua Watt proposed zuul/zuul-jobs master: use-buildset-registry: Prepend buildset registry to mirrors https://review.opendev.org/c/zuul/zuul-jobs/+/869760 | 20:36 |
opendevreview | Clark Boylan proposed opendev/system-config master: Remove Gerrit 3.5 images https://review.opendev.org/c/opendev/system-config/+/869763 | 21:02 |
opendevreview | Clark Boylan proposed opendev/system-config master: Convert Gerrit images to python3.10 https://review.opendev.org/c/opendev/system-config/+/869764 | 21:02 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add Gerrit 3.7 images https://review.opendev.org/c/opendev/system-config/+/869765 | 21:02 |
opendevreview | Clark Boylan proposed openstack/project-config master: Update jeepyb gerrit image build deps https://review.opendev.org/c/openstack/project-config/+/869766 | 21:03 |
clarkb | I split things up a bit as I'm unsure how painful getting 3.7 going will be but the other bits can happen before that pretty easily | 21:03 |
ianw | i guess we should add a nb04 nodepool config before adding the host to the inventory, not sure if the deployment will work without a config file | 21:08 |
opendevreview | Ian Wienand proposed openstack/project-config master: Add nb04 config https://review.opendev.org/c/openstack/project-config/+/869769 | 21:26 |
ianw | did something change in openstack-zuul-jobs-linters? | 21:53 |
opendevreview | Ian Wienand proposed openstack/project-config master: Add nb04 config https://review.opendev.org/c/openstack/project-config/+/869769 | 21:57 |
opendevreview | Ian Wienand proposed openstack/project-config master: openstack-afs.yaml : correct indentation https://review.opendev.org/c/openstack/project-config/+/869772 | 21:57 |
clarkb | ianw: nothing has changed with that job that i know of. However tox releases happen regularly now and often break things | 21:58 |
ianw | i think it appears to have found an indentation that doesn't seem standard in project-config. but wondering why it just found it | 21:59 |
ianw | last time it passed it ran with ansible-lint 6.4.0 | 22:04 |
ianw | yamllint 1.29.0 released about 4 hours ago ... so that would be it | 22:05 |
ianw | opendev-buildset-registry -- post_failure | 22:08 |
ianw | so one problem fixed, now another :) | 22:08 |
ianw | https://zuul.opendev.org/t/openstack/build/c3af3434793649e69027628277f9b5fc doesn't help | 22:09 |
ianw | https://zuul.opendev.org/t/openstack/builds?job_name=opendev-buildset-registry&project=openstack/project-config ... ran ok half hour ago ... | 22:09 |
clarkb | I just rechecked a change that didn't collect logs due to the host becoming unreachable after a failure | 22:10 |
clarkb | I'm going to try and watch the console live just in case | 22:10 |
clarkb | maybe something more systemic? | 22:10 |
clarkb | oh wait the unreachable is for the cleanup phase trying to run df | 22:11 |
clarkb | it says there were no upload failures | 22:11 |
clarkb | er when I grep the uuid on the executor that ran the job | 22:12 |
clarkb | successful jobs don't have logs either | 22:12 |
clarkb | corvus: ^ | 22:12 |
clarkb | oh it's an ovh outage | 22:12 |
fungi | yay | 22:12 |
clarkb | I figured that out by trying to click on a site preview link and got a 503 from ovh | 22:12 |
clarkb | I'm guessing if I pull up the console log for my browser when trying to fetch the manifest file I'll get similar | 22:13 |
clarkb | the browser says cors doesn't allow this | 22:13 |
fungi | makes sense | 22:14 |
clarkb | except we explicitly set cors options on the swift uploads to allow this... | 22:14 |
clarkb | so ya I'm not sure what the next step for debugging is | 22:15 |
clarkb | unless maybe openstacksdk just updated on the executors somehow (we last updated them friday/saturdayish and I think logs were working yesterday just fine) | 22:15 |
timburke | if swift's sending back 503s, i wouldn't expect any CORS info (whether stored on containers or objects) to make it back to the proxy, much less out to the client | 22:16 |
fungi | ugh, the exim queue on lists.openstack.org is currently around 220k messages (i think, i had to check on the filesystem because mailq just hangs) | 22:17 |
fungi | and it's growing, not shrinking. that's not good | 22:17 |
clarkb | timburke: oh i see the 503 leads to no cors headers so the issue is still likely whatever causes the 503. that makes sense | 22:18 |
clarkb | in that case I think we can observe for a bit since in theory these objects are getting uploaded successfully, it is just the retrieval that fails | 22:18 |
ianw | clarkb: did the change you mentioned above fail with POST_FAILURE? | 22:19 |
clarkb | ianw: no it failed properly. It's possible yours is different | 22:19 |
clarkb | fungi: is that for mail coming in or out? | 22:20 |
clarkb | (I'm assuming out) | 22:20 |
fungi | out | 22:20 |
ianw | what's the best way to turn a job id/event id -> the executor where it ran? | 22:20 |
clarkb | ianw: grep the build uuid in /var/log/zuul/executor-debug.log across all twelve executors | 22:20 |
clarkb | ianw: if you get the logs back it tells you what executor it is, unfortunately in this case we don't have the logs so have to brute force it | 22:20 |
ianw | clarkb: ok, cool, so "what i was doing" is still the best way :) | 22:21 |
clarkb | ERROR: Could not find a version that satisfies the requirement PyYAML>=3.1.0 (from versions: none) <- the legit error in my build | 22:22 |
ianw | action: zuul_swift_upload | 22:24 |
ianw | failed: true | 22:24 |
ianw | but it's no_log so ... all i know is it didn't work | 22:24 |
clarkb | ianw: it should log where it was trying to upload to | 22:24 |
clarkb | possible the errors are not just in retrieving objects but also uploading them? was it ovh (gra is where i saw the issues specifically) | 22:24 |
ianw | "_swift_provider_name": "ovh_gra" | 22:25 |
clarkb | ok we might need to pull that region out of the list then now that we have evidence of it breaking uploads | 22:26 |
ianw | it's hard to pinpoint but https://zuul.opendev.org/t/openstack/builds?result=POST_FAILURE&skip=0 shows an uptick ~20:00UTC? | 22:29 |
ianw | (this is probably accessible via statsd) | 22:29 |
ianw | if https://public-cloud.status-ovhcloud.com/ is the right thing, afaics it's not showing any failures | 22:31 |
clarkb | spot checking I haven't found any evidence of bhs1 errors yet. I'll push a change just to disable gra | 22:32 |
opendevreview | Clark Boylan proposed opendev/base-jobs master: Disable OVH GRA1 log uploads https://review.opendev.org/c/opendev/base-jobs/+/869775 | 22:32 |
clarkb | infra-root ^ | 22:33 |
ianw | if it fails i guess force merge? | 22:34 |
clarkb | ya | 22:35 |
corvus | hopefully we don't have a linter that disallows "#-" :) | 22:35 |
clarkb | it did fail on a post failure. I've rechecked it and if that doesn't work I'll see about force merging it | 22:38 |
clarkb | hows this look #status notice One of our CI job log storage providers appears to be having trouble with log uploads and retrievals. We are in the process of removing that provider from the pool. | 22:40 |
fungi | lgtm, thanks! | 22:41 |
clarkb | #status notice One of our CI job log storage providers appears to be having trouble with log uploads and retrievals. We are in the process of removing that provider from the pool. | 22:43 |
opendevstatus | clarkb: sending notice | 22:43 |
-opendevstatus- NOTICE: One of our CI job log storage providers appears to be having trouble with log uploads and retrievals. We are in the process of removing that provider from the pool. | 22:43 | |
clarkb | ok I understand the pyyaml thing. The issue is pyyaml makes python3.9 and python3.10 specific wheels. It doesn't do python3 abi. Bullseye python is 3.9 which is why that all worked before but now trying to change the base image to 3.10 isn't working because the java side image that is 3.10 doesn't see the wheel as valid? | 22:43 |
clarkb | ya that is exactly it. We use the openjdk 11 upstream image | 22:45 |
clarkb | ok no python3.10 for now | 22:45 |
opendevstatus | clarkb: finished sending notice | 22:46 |
clarkb | one of the jobs for the base-jobs gating is trying to upload to bhs1 and taking longer than I expect | 22:46 |
clarkb | so I might remove bhs1 too then force merge | 22:46 |
clarkb | yup it just failed against bhs1. New ps | 22:46 |
JayF | https://review.opendev.org/c/openstack/ironic-python-agent/+/867915 any suggestions for why this might not be cooperating and merging? The patch it depends-on has landed... I did W-1/W+1 to try and knock it loose, unsuccessfully | 22:47 |
opendevreview | Clark Boylan proposed opendev/base-jobs master: Disable OVH BHS1 and GRA1 log uploads https://review.opendev.org/c/opendev/base-jobs/+/869775 | 22:47 |
JayF | would strongly prefer not having to modify the patch but I can do that if it's the only option :/ | 22:47 |
opendevreview | Merged opendev/base-jobs master: Disable OVH BHS1 and GRA1 log uploads https://review.opendev.org/c/opendev/base-jobs/+/869775 | 22:50 |
clarkb | ok force merge successful | 22:50 |
clarkb | JayF: give us a few to dig ourselves out of the swift provider problem and I can probably take a look | 22:51 |
JayF | On an unrelated item; if someone is available I need https://review.opendev.org/admin/groups/835647ed8ebcc92ae0bdcfcb1b25adba02d972b1 to be seeded with me so I can populate the group membership | 22:51 |
JayF | clarkb: no rush at all on either of these | 22:51 |
ianw | clarkb: we probably need to debug it in case it's an expired token or something that won't resolve itself? | 22:51 |
clarkb | ianw: ya maybe. It's never been that in the past but it is theoretically possible | 22:52 |
ianw | JayF: i think https://review.opendev.org/c/openstack/ironic-python-agent/+/868065/1 needs to merge? | 22:53 |
fungi | JayF: i think the change you want to merge has a parent which isn't approved yet? https://review.opendev.org/c/openstack/ironic-python-agent/+/868065 | 22:53 |
JayF | this is a triple jinx, because I think I figured that out right as you were telling me | 22:53 |
clarkb | JayF: I went ahead and added you but note that TheJulia or dtantsur could have added you too | 22:53 |
JayF | ack | 22:54 |
JayF | clarkb: oh, I didn't realise they were already members or I wouldn't have bugged you, that's my bad | 22:54 |
fungi | yeah, also iurygregory and rpittau are in it | 22:55 |
clarkb | fungi: I don't see them | 22:55 |
clarkb | oh they are there now wtf | 22:55 |
clarkb | maybe jayf just added them | 22:55 |
fungi | yeah, possible i pulled it up after he started editing | 22:56 |
JayF | I just added them, absolutely | 22:56 |
JayF | we had an agreement at the meeting about who goes into this group :D | 22:56 |
clarkb | phew | 22:56 |
clarkb | ianw: so the thing that makes me think we aren't at fault is the 503s trying to get files back that were supposedly uploaded successfully | 22:57 |
clarkb | ianw: if it was purely upload failures then I would be more worried about something on our end being at fault. But we have at least one job where it uploaded just fine according to apis then refused to return the results | 22:57 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add Gerrit 3.7 images https://review.opendev.org/c/opendev/system-config/+/869765 | 22:58 |
ianw | true ... there's probably not a quick way to test | 22:59 |
clarkb | we can modify the base-test job to only talk to ovh | 22:59 |
ianw | oh, of course, yeah so we should do that to confirm operation before re-enabling | 23:03 |
clarkb | also amorin may have thoughts when the EU day starts again | 23:07 |
*** rlandy is now known as rlandy|out | 23:25 | |
opendevreview | Ian Wienand proposed opendev/system-config master: [wip] add variable to block UA's for mailman https://review.opendev.org/c/opendev/system-config/+/869779 | 23:38 |
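
Note on the 22:20 exchange: when a build's logs are missing, the suggestion was to grep the build uuid in `/var/log/zuul/executor-debug.log` on every executor. A minimal sketch of that brute-force search follows, assuming the logs are readable as local files (for example, copies gathered from each executor). The function name `find_build_in_logs` and the way the files are collected are hypothetical and not part of any Zuul tooling.

```python
from pathlib import Path
from typing import Iterable


def find_build_in_logs(log_paths: Iterable[Path], build_uuid: str) -> dict[str, int]:
    """Return {log path: number of lines mentioning build_uuid} for logs with a match."""
    hits: dict[str, int] = {}
    for path in log_paths:
        count = 0
        # errors="replace" keeps the scan going if a log contains undecodable bytes.
        with open(path, "r", errors="replace") as handle:
            for line in handle:
                if build_uuid in line:
                    count += 1
        if count:
            hits[str(path)] = count
    return hits
```

Run on a single executor, this could be called with `[Path("/var/log/zuul/executor-debug.log")]` and a uuid such as the one in the build URL pasted at 22:09. A non-empty result means the build ran there.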
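
Note on the 22:43 PyYAML diagnosis: a wheel built for one CPython minor version carries that version in its filename tags, so an interpreter of a different minor version does not accept it. The sketch below illustrates the idea with a deliberately simplified check. It ignores platform tags and `abi3` wheels, it is not how pip decides (pip relies on the tag logic in the `packaging` library), and the function names and example filenames are hypothetical.

```python
def wheel_tags(filename: str) -> tuple[list[str], list[str], list[str]]:
    """Split a wheel filename into its python, abi and platform tag lists."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag.split("."), abi_tag.split("."), platform_tag.split(".")


def wheel_fits_interpreter(filename: str, interpreter: str) -> bool:
    """Rough check whether an interpreter tag such as 'cp310' accepts the wheel.

    Simplified on purpose: only exact CPython tags and pure-Python
    'py3'-style wheels are handled; platform and abi3 rules are ignored.
    """
    python_tags, abi_tags, _ = wheel_tags(filename)
    major = interpreter[2]  # 'cp310' -> '3'
    for tag in python_tags:
        if tag == interpreter:
            return True
        if tag == f"py{major}" and "none" in abi_tags:
            return True
    return False


# Hypothetical filenames for illustration only.
print(wheel_fits_interpreter("PyYAML-6.0-cp39-cp39-linux_x86_64.whl", "cp39"))
print(wheel_fits_interpreter("PyYAML-6.0-cp39-cp39-linux_x86_64.whl", "cp310"))
```

With the hypothetical filenames above, the first call reports a match and the second does not, which is the kind of mismatch that surfaces as "Could not find a version that satisfies the requirement" when only version-specific wheels are on offer.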