ianw | https://zuul.opendev.org/t/openstack/builds?job_name=release-wheel-cache does not appear to be running still | 01:04 |
---|---|---|
*** ykarel_ is now known as ykarel | 04:48 | |
*** ysandeep|away is now known as ysandeep | 05:10 | |
*** jpena|off is now known as jpena | 06:58 | |
*** amoralej|off is now known as amoralej | 07:07 | |
*** rpittau|afk is now known as rpittau | 07:30 | |
*** ysandeep is now known as ysandeep|lunch | 07:55 | |
*** ykarel is now known as ykarel|lunch | 08:04 | |
*** ykarel|lunch is now known as ykarel | 09:17 | |
*** ysandeep|lunch is now known as ysandeep | 09:19 | |
*** bhagyashris_ is now known as bhagyashris|ruck | 09:50 | |
opendevreview | Dmitry Tantsur proposed ttygroup/gertty master: Suggest a 'cherry-picked from' line when cherry picking https://review.opendev.org/c/ttygroup/gertty/+/799641 | 11:06 |
opendevreview | Dmitry Tantsur proposed ttygroup/gertty master: examples: match 'commit <hash>' https://review.opendev.org/c/ttygroup/gertty/+/799642 | 11:13 |
*** jpena is now known as jpena|lunch | 11:31 | |
*** ysandeep is now known as ysandeep|brb | 12:12 | |
*** amoralej is now known as amoralej|lunch | 12:20 | |
*** ysandeep|brb is now known as ysandeep | 12:28 | |
*** jpena|lunch is now known as jpena | 12:37 | |
*** amoralej|lunch is now known as amoralej | 13:15 | |
*** osmanlicilegi is now known as Guest4 | 13:34 | |
*** prometheanfire is now known as Guest3 | 13:35 | |
opendevreview | Danni Shi proposed openstack/diskimage-builder master: Add a keylime-agent element and a tpm-emulator element https://review.opendev.org/c/openstack/diskimage-builder/+/789601 | 13:44 |
*** ykarel is now known as ykarel|away | 14:38 | |
*** ysandeep is now known as ysandeep|away | 14:41 | |
clarkb | ianw: I wonder if that hit the fallout of jinja changes in zuul? | 14:56 |
clarkb | I can probably take a look after my morning meeting and sending out the infra meeting agenda | 14:56 |
clarkb | fungi: I'm also hoping to run my gerrit account retirement script over the list I generated on Friday. Any chance you might be able to take a look at the double check list today? | 14:56 |
fungi | clarkb: i looked at it some over the weekend, spot-checking the danger list, and everything i queried looked reasonable to me. i'll take another look today between meetings though | 14:57 |
clarkb | excellent, in that case I'll plan to start processing that list after meetings and such | 14:58 |
clarkb | fungi: in https://review.opendev.org/c/openstack/project-config/+/799123 why do we remove support to check track-upstream just because we drop the feature from one project? Do we clean that up so we can remove those tools/ scripts that get cleaned up? | 15:13 |
clarkb | I'm ok with that if that is the case, just want to make sure I understand the additional cleanup of feature support for jeepyb in there | 15:13 |
fungi | clarkb: well, at least as a start, it lets us catch if someone tries to add track-upstream on another project | 15:22 |
fungi | the expectation is that we don't want to continue to support that | 15:23 |
*** gthiemon1e is now known as gthiemonge | 15:23 | |
fungi | the accompanying system-config change rips out the cronjob for it, which has broken on the new gerrit deployment | 15:23 |
clarkb | ah I hadn't realized it wasn't working ++ in that case | 15:25 |
fungi | yeah, it's been spamming us ~hourly since the new server was built | 15:25 |
fungi | and as we don't need it any longer, it seemed like my time was better invested simplifying it away rather than fixing it | 15:25 |
clarkb | sounds good. I'll finish up my review on those changes as soon as I get the infra meeting agenda out | 15:27 |
fungi | i haven't done anything to remove the feature from jeepyb yet, just our use of it | 15:28 |
fungi | but that could be a next step (there's a fair bit we can probably clean up in jeepyb at this point) | 15:28 |
clarkb | we might leave it in jeepyb in case other users are using it (though I don't really know of any other users of jeepyb at this point) | 15:29 |
fungi | right, that was sort of why i hadn't approached that end of it yet | 15:30 |
clarkb | I've got the meeting agenda updated. Any last minute items to add before I send it out? | 15:30 |
fungi | i can't think of any | 15:35 |
clarkb | sent | 15:37 |
clarkb | fungi: I left a comment on 799123. Maybe you can check it to see if you want to address that in a followup and approve 799123 as is? | 15:40 |
* clarkb finds breakfast | 15:44 | |
fungi | sure | 15:45 |
fungi | thanks! | 15:46 |
*** marios is now known as marios|out | 16:04 | |
*** jpena is now known as jpena|off | 16:14 | |
opendevreview | Rich Bowen proposed opendev/yaml2ical master: Report which week a meeting occurs. https://review.opendev.org/c/opendev/yaml2ical/+/799691 | 16:29 |
clarkb | fungi: I've got my input list for retire-user.sh ready to go as well as an updated heredoc git commit message in that script. Should I start processing the list or do you want to double check more accounts first? | 16:38 |
*** rpittau is now known as rpittau|afk | 16:39 | |
fungi | clarkb: i checked a few more, seems like it should be safe enough. we should expect at least a few people to reactivate eventually and run into problems, but can't make an omelette otherwise | 16:39 |
clarkb | ya I think if we wait a couple of weeks to give them a chance to complain we'll be fine. | 16:40 |
clarkb | alright I'll start running that here | 16:40 |
clarkb | just have to figure out tee syntax with for loops again | 16:40 |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Drop use of track-upstream https://review.opendev.org/c/openstack/project-config/+/799123 | 16:45 |
*** amoralej is now known as amoralej|off | 16:58 | |
frickler | clarkb: ianw: I'm not sure I'll make it to today's meeting, but I want to mention that I try to at least passively watch most of the meetings, even if I don't talk much. so I'd not be super happy with moving it even later, but given what ianw does, I'd also agree that giving him easier access would be more important | 17:10 |
clarkb | frickler: that is good to know. I thought I'd raise the question at least. | 17:10 |
frickler | maybe another option would be to move to your evening, which would be early morning for me? 5 UTC would be feasible for me, maybe 4 UTC too, at least during daylight saving here | 17:12 |
frickler | though that might be too late for fungi | 17:13 |
fungi | i'm flexible... don't have kids or other obligations really | 17:13 |
fungi | 5utc would be 1am local right now, so not ideal, but i'd make it work | 17:14 |
frickler | on a slightly different topic I'll be mostly offline (otherwise known as PTO) for 3 weeks starting this friday. maybe you can experiment with the timing during that interval | 17:15 |
fungi | yeah, i'll be gone for the next two meetings after today, on the road all day both tuesdays (i might come home monday instead in which case i'll just miss next week's) | 17:20 |
*** mgoddard- is now known as mgoddard | 17:26 | |
clarkb | frickler: enjoy your time off and that is a neat idea re experimenting with times | 17:31 |
clarkb | ok the account retirements are done. I'll get the log file stashed in the usual location momentarily. Then I'll rerun the audit script to pick up these changes | 17:59 |
fungi | awesome, thanks again! | 17:59 |
clarkb | the log is on review now and I'm running the audit now | 18:04 |
clarkb | I'm going to take a break then prep for our meeting while that is running | 18:04 |
AJaeger | Hi, just noticed that a promote job failed, seems a fallout from the recent security Zuul change. | 18:21 |
AJaeger | Could somebody look at https://zuul.opendev.org/t/openstack/build/1c148f751a594ceab627020b6f11dd36 and propose a fix, please? | 18:21 |
clarkb | AJaeger: thanks for the heads up, I'm sure one of us can. ianw also noticed some periodic jobs are not running so we need to dig into those too | 18:23 |
AJaeger | thanks, clarkb ! | 18:23 |
clarkb | fungi: ^ AJaeger's example looks like the one you were working on before with targets being undefined | 18:32 |
fungi | The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'targets' The error appears to be in '/var/lib/zuul/builds/1c148f751a594ceab627020b6f11dd36/trusted/project_0/opendev.org/opendev/base-jobs/playbooks/docs/promote.yaml': line 47, column | 18:32 |
clarkb | but this is for openstack manuals | 18:32 |
fungi | yeah, was just taking a peek there | 18:32 |
fungi | promote-openstack-manuals seems to descend from opendev-promote-docs-base which i think we adjusted | 18:32 |
fungi | so need to figure out what the mismatch is there | 18:33 |
clarkb | fungi: I have approved the track upstream removal change | 18:33 |
clarkb | the gerrit user audit has completed and the results look how I expect them to. I'll push that yaml file up to the normal spot | 18:35 |
fungi | thanks x2! | 18:35 |
opendevreview | Merged openstack/project-config master: Drop use of track-upstream https://review.opendev.org/c/openstack/project-config/+/799123 | 18:41 |
fungi | once the system-config change lands, optional follow-ups are retiring the opendev/gerrit repo and ripping track-upstream support out of jeepyb | 18:42 |
clarkb | looks like the linaro cloud ssl cert is still expired. I'll email kevinz | 18:48 |
fungi | yeah, seems like he didn't see the ping in here last week | 18:48 |
clarkb | email sent | 18:53 |
fungi | okay, looks like the problem with the openstack manuals promotion is afsdocs_secret-openstack-manuals needs to be reworked to look similar to afsdocs_secret-tox-docs | 18:59 |
fungi | i'll give it a shot | 18:59 |
opendevreview | Jeremy Stanley proposed openstack/project-config master: afsdocs_secret-openstack-manuals: Zuul 4.6.0 fix https://review.opendev.org/c/openstack/project-config/+/799710 | 19:03 |
fungi | that ^ should hopefully solve the issue AJaeger pointed out earlier | 19:03 |
ianw | just fyi i'm out my thu/fri this week | 19:56 |
clarkb | enjoy! | 19:57 |
ianw | clarkb: re the periodic jobs, i'm sure it is some sort of job config issue but it's not obvious to me where the error is | 19:59 |
ianw | i would have thought there'd still be failed jobs | 20:00 |
clarkb | ianw: ya I looked at the zuul status bell and I didn't see any errors that looked suspect there | 20:00 |
clarkb | maybe we need to grep scheduler logs for the job name and take it from there? | 20:00 |
ianw | clarkb: yeah, that's pretty much what i did :) http://lists.zuul-ci.org/pipermail/zuul-discuss/2021-July/001660.html | 20:01 |
clarkb | aha there is a thread on zuul-discuss | 20:01 |
ianw | http://people.redhat.com/~iwienand/zuul-periodic-27-06-2021/963101cc7c01460abfd34664e5610e18.txt is basically the last time it ran | 20:01 |
ianw | http://people.redhat.com/~iwienand/zuul-periodic-27-06-2021/afdb7039494649c09e8bb2b64158a385.txt i don't know. that log is > 100mb as it seems to be in a bit of a loop. but also, i think during that period requirements had config errors | 20:03 |
ianw | http://people.redhat.com/~iwienand/zuul-periodic-27-06-2021/5fe4dd08d1a44700bf92fd475565f687.txt is after zuul got restarted | 20:03 |
clarkb | Change <Branch 0x7fa023409a60 openstack/requirements refs/heads/master updated None..None> is already in pipeline, ignoring | 20:05 |
clarkb | aha that is it | 20:05 |
clarkb | ianw: look at the status page we have a 90 hour old periodic entry in the queue | 20:06 |
clarkb | ianw: I think we dequeue that then let it schedule directly as usual? I think enqueue of periodic doesn't work the way we want and that is causing the problem | 20:06 |
ianw | ah! | 20:07 |
ianw | publish-wheel-cache-debian-bullseye-arm64 queued | 20:07 |
ianw | and then everything else with error | 20:08 |
ianw | there is an 86-hour old one too | 20:08 |
clarkb | ya and the arm64 queue is related to the linaro cert I think? I sent email about that | 20:08 |
clarkb | I need to eat some lunch but I can help clean that up after | 20:09 |
ianw | hrm, we should have osuosl nodes for that | 20:09 |
ianw | hrm, the one that has all errors is openstack/requirements 0000000 | 20:16 |
ianw | i wonder if that got re-enqueued after the zuul restart | 20:16 |
ianw | there is a kolla change in similar state | 20:16 |
ianw | "sudo docker-compose exec scheduler zuul dequeue --tenant openstack --pipeline periodic --project openstack/requirements --ref 0000000000000000000000000000000000000000" failed, but the entry went away from the status page | 20:22 |
ianw | "sudo docker-compose exec scheduler zuul dequeue --tenant openstack --pipeline periodic --project openstack/requirements --ref refs/heads/master" worked but i had to run it twice (i reloaded status page between) | 20:23 |
clarkb | ianw: yes it was in the queue when things got stopped and then reenqueued after and I think the reenqueue doesn't work for periodic jobs | 20:37 |
clarkb | fungi: in https://review.opendev.org/c/openstack/project-config/+/799710/1/zuul.d/secrets.yaml is "branch" as a key there going to do the right thing? I guess I don't understand how the generic "branch" branch name fits into publishing for manuals | 20:40 |
clarkb | ianw: re arm64 and linaro outage does osuosl provide the larger arm64 flavor type or just linaro? I thought that may be the problem we are seeing | 20:41 |
fungi | clarkb: i'm not positive, that change just makes it consistent with the other secrets being passed to the same parent. the playbook is what actually accesses that array | 20:48 |
ianw | the requirements job shouldn't require larger images, although the kolla one maybe | 20:50 |
clarkb | fungi: ya I worry that the mapping from special var in the past doesn't map onto what is in there now. I think you may need to write down all the branches? Worth cross checking anyway | 20:50 |
ianw | ok, right now i just ran | 21:10 |
ianw | sudo docker-compose exec scheduler zuul dequeue --tenant openstack --pipeline periodic --project openstack/kolla --ref refs/heads/master | 21:10 |
ianw | which seemed to remove the stuck kolla periodic buildset @ 0000000000000000000000000000000000000000 | 21:11 |
ianw | but the one for refs/heads/master is still there | 21:11 |
fungi | yeah, the 0x0 items are from a reenqueue | 21:11 |
ianw | i am going to run it again | 21:12 |
ianw | ok, so the *second* run got the one that was in the queue with refs/heads/master | 21:13 |
fungi | worth working out whether we're actually able to pass the right parameters through the rpc interface to correctly reenqueue a timer-triggered item | 21:13 |
ianw | i have great deja vu of never being able to figure that out | 21:13 |
ianw | http://lists.zuul-ci.org/pipermail/zuul-discuss/2019-May/000909.html google tells me | 21:14 |
ianw | i'm not sure why the first dequeue with --ref refs/heads/master removes the buildset against 000....000 | 21:15 |
ianw | anyway, there is a wallaby one for kolla too | 21:15 |
ianw | ok, that is gone now too | 21:15 |
ianw | i will look at the arm64 nodes before 06:00 UTC and see if we can't sort this out | 21:16 |
clarkb | fungi: your fix for the promotion job looks correct after reading the playbook. However, https://opendev.org/opendev/base-jobs/src/branch/master/zuul.d/jobs.yaml#L277-L290 should also get updated | 21:19 |
clarkb | fungi: I've approved the secrets fix and we can followup on ^ | 21:19 |
clarkb | docs_tag_path only shows up in the docstrings of those jobs | 21:20 |
ianw | 799126 should also finally unbreak the dib gate and also our centos-8-stream image building | 21:20 |
clarkb | I think it has been replaced with target.tag. I'll work on an update to base-jobs | 21:20 |
opendevreview | Merged openstack/project-config master: afsdocs_secret-openstack-manuals: Zuul 4.6.0 fix https://review.opendev.org/c/openstack/project-config/+/799710 | 21:27 |
fungi | clarkb: thanks, i'll try to write that one you spotted up now | 21:28 |
fungi | aha, the description for opendev-publish-tox-docs-base | 21:30 |
opendevreview | Clark Boylan proposed opendev/base-jobs master: Fix docstrings to match job updates https://review.opendev.org/c/opendev/base-jobs/+/799720 | 21:34 |
clarkb | fungi: ^ something like that maybe | 21:34 |
fungi | ooh, i was struggling to find words. i'll review that and criticize your choices instead! ;) | 21:34 |
fungi | i've asked a question on it, because my fixes are very close to being a cargo-cult of prior fixes | 21:37 |
fungi | so my understanding is cloudy | 21:37 |
fungi | it seems like the playbooks aren't treating those as jobvars at all, since they're accessed via the secret values | 21:39 |
clarkb | fungi: responded | 21:39 |
clarkb | jobvar is just an rst rendering thing | 21:40 |
opendevreview | Clark Boylan proposed opendev/base-jobs master: Fix docstrings to match job updates https://review.opendev.org/c/opendev/base-jobs/+/799720 | 21:42 |
fungi | ahh | 21:50 |
clarkb | ianw: left some comments on the paste change. I think there are a few testing things to clean up and I had a question about some of that too | 21:52 |
clarkb | corvus: catching up on the matrix spec it isn't fully clear to me if we need to manage the synapse server to run a k8s bridge or our own irc bridge. Do those run in the server or as separate software, i.e. as bridges? | 22:10 |
mordred | clarkb: the IRC bridges already exist - so we don't need to run anything there | 22:14 |
mordred | if we wanted to supply a bridge to, for instance, k8s slack, we can have EMS run that for us (it's $20/month per bridge) | 22:15 |
clarkb | I see | 22:17 |
mordred | (they run as separate processes, but aiui you also need to configure the synapse software to interact with them) | 22:18 |
corvus | yep; and there's also some cooperation needed on the slack side -- for each slack instance (so it doesn't scale as well as, say, hooking up to an entire irc network) | 22:18 |
corvus | if anyone here is on the gerrit slack, you're welcome to use my matrix bridge, btw. just ping me. | 22:19 |
* mordred uses the gerrit slack via corvus' matrix bridge | 22:19 | |
ianw | clarkb: on the buildset-registry job for the paste service in gate; i think i left that out deliberately because at gate time it should only be pulling the lodgeit image from upstream | 22:22 |
ianw | i think that's right; if it depends-on a lodgeit change, that change would have to be merged (and pushed to dockerhub) before it got to gate? | 22:22 |
clarkb | ianw: I think if you use the buildset registry then that wouldn't need to be the case, but I'm not completely sure | 22:22 |
ianw | i guess i'm saying we shouldn't be using the intermediate/buildset registry in gate here because the change should be published | 22:24 |
ianw | i think that's different for images that are part of system-config | 22:25 |
corvus | whatever you do, i wouldn't make check differ from gate | 22:25 |
clarkb | ianw: I think if you have a depends-on then zuul should be able to find the image in the intermediate registry and pull it to the buildset registry, then when the parent merges dockerhub will be updated, but that could happen concurrently with gating the paste job | 22:27 |
corvus | depends-on also means that it won't merge until the change ahead is merged. if we're willing to accept the small race condition between gate and promote, then it should be reasonable to use the buildset registry in both check and gate. | 22:27 |
*** Guest3 is now known as prometheanfire | 22:28 | |
opendevreview | Ian Wienand proposed opendev/system-config master: Add paste service https://review.opendev.org/c/opendev/system-config/+/798400 | 22:28 |
opendevreview | Ian Wienand proposed opendev/system-config master: lodgeit: use mariadb connector https://review.opendev.org/c/opendev/system-config/+/799004 | 22:28 |
corvus | however, deployment jobs that use a mutex should be strictly sequenced, so even that race shouldn't be a problem | 22:28 |
clarkb | yup we promote before running infra-prod deployment jobs iirc | 22:30 |
clarkb | which results in a strict ordering | 22:30 |
ianw | well in this case it's the lodgeit job that would need to promote | 22:30 |
clarkb | ah right | 22:31 |
clarkb | even then I think the risk is fairly low. But point taken | 22:31 |
corvus | true, that could race a deployment job. i think the chances are small though | 22:31 |
clarkb | On the zuul and nodepool side we run their deployment jobs hourly to reduce the pain of that aiui | 22:32 |
ianw | i also think there is a strong possibility the lodgeit container will never again be updated :) | 22:32 |
corvus | ianw: i think your idea of basically having the system-config gate fail if the lodgeit promote failed has merit -- though there is a race condition there too. if you do decide to make gate!=check, please leave a note as to why since usually we just assume that's a bug. | 22:32 |
corvus | but all things being equal, i like the current ps where check==gate | 22:33 |
ianw | i'm fine with that | 22:33 |
opendevreview | Merged openstack/diskimage-builder master: Mount /sys RO https://review.opendev.org/c/openstack/diskimage-builder/+/799126 | 22:44 |
clarkb | side note: the zookeeper metrics look good according to grafana | 22:54 |
ianw | corvus: i know we're a few days out, but do you foresee any issues if i restart zuul to incorporate https://review.opendev.org/c/opendev/system-config/+/798243 on the 11th (UTC ~11pm, i.e. at about this time)? | 22:54 |
clarkb | watches strongly correlate to ephemeral nodes | 22:54 |
corvus | ianw: lgtm | 22:56 |
*** ysandeep|away is now known as ysandeep | 23:38 |
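
At 16:40 clarkb mentions having to work out tee syntax with for loops again before running the retirement script over the input list. A minimal sketch of that pattern follows. The `retire_user` function, the account ids, and the temporary file names are hypothetical stand-ins; the real `retire-user.sh` and its input list are not shown in the log.

```shell
#!/bin/sh
# Hypothetical stand-in for the real retirement script, which is not shown in the log.
retire_user() {
    echo "retiring account $1"
}

# Hypothetical input list with one account id per line.
input_list=$(mktemp)
log_file=$(mktemp)
printf '%s\n' 1001 1002 1003 > "$input_list"

# Redirect and pipe the loop as a whole (after `done`), so the output of every
# iteration is shown on the terminal and also written to the log file.
for account in $(cat "$input_list"); do
    retire_user "$account"
done 2>&1 | tee "$log_file"
```

Attaching `2>&1 | tee` to `done` rather than to the command inside the loop means tee is started once and the log file is not truncated on each iteration.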
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!
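
The dequeue commands ianw quotes between 20:22 and 21:10 differ only in the project name, so they can be wrapped in a small helper. The sketch below is an assumption-laden illustration: the `dequeue_periodic` function and the `DRY_RUN` variable are hypothetical, and only the command line itself is taken from the log. By default it only prints the command, since running it requires access to the scheduler host.

```shell
#!/bin/sh
# Hypothetical switch: 1 prints the command, anything else runs it.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper wrapping the dequeue command quoted in the log.
dequeue_periodic() {
    project=$1
    ref=$2
    # Rebuild the positional parameters as the full command line.
    set -- sudo docker-compose exec scheduler zuul dequeue \
        --tenant openstack --pipeline periodic \
        --project "$project" --ref "$ref"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# The two projects with stuck periodic items mentioned in the log.
for project in openstack/requirements openstack/kolla; do
    dequeue_periodic "$project" refs/heads/master
done
```

In the log a single dequeue against `refs/heads/master` was not enough; ianw had to run it a second time, reloading the status page in between, so a real run would likely need the same check.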