Thursday, 2021-06-10

opendevreviewMerged opendev/system-config master: Create ircbot container
opendevreviewMerged opendev/system-config master: limnoria/meetbot setup on
ianwit looks like base is failing to deploy with issues with apt-get autoremove02:34
ianwdpkg: error processing package shim-signed (--configure):02:35
ianwSetting up grub-efi-amd64-signed (1.167~18.04.3+2.04-1ubuntu44.1) ...02:39
ianwInstalling for x86_64-efi platform.02:39
ianwgrub-install: error: cannot find EFI directory.02:39
ianwwhich probably isn't surprising given we don't boot efi02:40
prometheanfiresqlalchemy bump just merged, so if gates fail, may be that (tested via cross tests, but...)02:51
ianwi feel like it must have something to do with that03:00
ianwi have filed
opendevmeetLaunchpad bug 1931518 in grub2-signed (Ubuntu) "grub-efi-amd64-signed 1.167~18.04.1+2.04-1ubuntu44 has broken non-EFI install" [Undecided,New]04:01
ianwour images have an unmounted /boot/efi directory which might be confusing things04:07
fricklerso the gate still looks pretty borked. iiuc we have a new zuul release but nobody got round to deploy it? should I give it a try? I'm also wondering whether in the current situation one would actually want to restore queues or better give it a fresh start04:52
opendevreviewIan Wienand proposed opendev/system-config master: limnoria: fix some minor typos
*** bhagyashris_ is now known as bhagyashris06:08
ianwfrickler: yeah, i'm afraid i just didn't keep up enough on that one.  i do believe that corvus' fix merged06:12
ianwi might be worried about introducing other changes though without someone with more overview06:13
fricklerianw: yeah, the fix was merged and 4.5.0, but I didn't see any indication as to why it wasn't deployed, so I'm wary about that, too06:14
frickler... and 4.5.0 released ...06:14
ianwi'm afraid i'm about out of time.  i haven't managed to get meetbot switched over to but i've identified and fixed up a few small issues to get there tomorrow hopefully06:18
ianwfrickler: i don't know if i'm of any help.  if you want to restart, i can be around in an hour or two06:21
fricklerianw: no, I don't think I want to touch it really, will wait for corvus' feedback, thx06:24
fricklerI wouldn't have time to do extended debugging in case it wouldn't just work after a restart, too. and at least check jobs still mostly seem to be working fine06:26
opendevreviewMerged opendev/system-config master: limnoria: fix some minor typos
opendevreviewAnanya proposed opendev/elastic-recheck master: Configure ER bot
yoctozeptoinfra-root: ethercalc seemingly down11:40
opendevreviewLajos Katona proposed openstack/project-config master: Rename x/tap-as-a-service to openstack/tap-as-a-service
corvusfrickler, ianw: i think we just ran out of time yesterday; i can deploy it today, though i'm also surprised if the situation got worse; it should have been a one-time event unless the cloud killed another executor vm13:13
fungicorvus: i don't know that it's gotten worse, the report in #openstack-infra was about changes stuck for more than a day, so probably leftovers from the same incident13:38
fungi"elapsed time 28 hours" as of 05:25 utc so yeah, that was the same incident13:39
opendevreviewBenjamin Schanzel proposed zuul/zuul-jobs master: Add a meta log upload role with a failover mechanism
corvusok, thought i saw someone mention something about the gate pipeline13:40
opendevreviewBenjamin Schanzel proposed zuul/zuul-jobs master: Add a meta log upload role with a failover mechanism
fungiyeah, not sure what frickler meant by "the gate still looks pretty borked" in that case, there wasn't much explanation of which issue was suspected13:45
fungimy internet service is still down (yay!) so looking at stuff over 2g/3g cellular is a bit of a pain13:47
fungithere's a hung openstack/neutron item in the openstack tenant's gate pipeline for ~36 hours, so again same incident, but it doesn't seem to be blocking anything else from merging13:48
opendevreviewBenjamin Schanzel proposed zuul/zuul-jobs master: Add a meta log upload role with a failover mechanism
opendevreviewBenjamin Schanzel proposed zuul/zuul-jobs master: Add a meta log upload role with a failover mechanism
fricklercorvus: fungi: sorry for not being more specific. there was a large backlog in tripleo gate, though that looks better now. and then still didn't get into gate, although clarkb had a comment about that yesterday, but I can't see any trace of that in gerrit14:26
opendevreviewAlex Schultz proposed opendev/system-config master: Add cdn0{1,2}
*** timburke has joined #opendev14:47
opendevreviewHervé Beraud proposed openstack/project-config master: Adding missing zuul for etcd3gw
clarkbfrickler: went into the gate when I approved it15:26
clarkbfrickler: it was at the end of the queue, maybe when it was properly evaluated it got evicted? I wonder if we have no jobs that apply when .rst files are updated?15:27
clarkbif we deploy 4.5.0 then we'll automatically deal with those changes that are stuck from the original event15:28
clarkbI have reapproved 795383 to see what it does15:28
clarkb795383 shows up in the pipeline now, but without any jobs, let's see what it does15:29
fungiwell, deploying 4.5.0 means restarting the scheduler, which means reenqueuing the changes, which means sure they're dealt with at that point15:30
clarkbfungi: yup exactly15:30
fungithe bug which led to them getting stuck would require another executor crashing/disconnecting/rebooting15:30
fungito have a recurrence15:30
fungii've seen no evidence of new incidents there15:30
clarkblooking at 795383 it shows no jobs. I strongly suspect that we've removed any job which will run against only an rst change15:31
fungialso, cable provider helpfully scheduled someone to come take a look at my internet outage... sunday afternoon15:31
fungiguess i'm going to be at half-capacity for a few more days15:32
clarkbthe alternative for those stuck changes is to abandon and restore them or we can dequeue enqueue?15:34
fricklerclarkb: ah, we had that set of patches to move d-g jobs from project-config into the repo, maybe it missed something. so that would indeed be a completely unrelated issue then15:34
clarkbinfra-root should I go get my keys loaded and dequeue enqueue those changes?15:34
fungiclarkb: if you want to, feel free, they're not blocking anything though and their owners can also push new patchsets or abandon/restore to achieve the same effect if it's urgent15:43
kukuAny solutions to update bios firmware with ironic ?15:45
clarkbkuku: you're probably best off asking the ironic team. We run the developer tools that help make ironic but don't have much direct experience with the tool15:46
clarkbfungi: it seems like a honey trap at this point causing people confusion. I'll go ahead and sort them out15:46
fungikuku: you can usually find them in #openstack-ironic15:48
clarkbdequeue then enqueue seems to have worked for the oldest change in check. I'll do the others now15:51
corvussorry, just had some surprise stuff come up; i should be able to restart later, but if you want to pick those off now that's fine15:56
clarkbcorvus: no worries, I've taken care of those in check. There are a couple in periodic and one in check-arm64 that are a bit older than the incident so I think they may be stuck for other reasons15:56
fungimy guess there would either be no available nodes, or we've had more nodeset locks fail to get released by the launcher16:03
clarkbI'm looking into the d-g thing more closely now since I'm fairly sure the issue is lack of jobs to run on .rst file updates based on the state of the gate for it right now16:04
clarkb ya that seems to be common. Now to see if I can find one that doesn't exclude .rst that can be added16:08
clarkbI would expect devstack-gate-hooks to run since it doesn't seem to filter on files and its parent doesn't either16:12
clarkboh it is only in check. Cool we can fix this16:13
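The suspicion above is that every gate job filters out .rst-only changes, leaving 795383 with no jobs to run. A minimal sketch of that matching logic, assuming Zuul-style irrelevant-files semantics (a job is skipped only when every changed file matches one of its patterns). The function name, file names, and patterns are hypothetical, and re.search is a simplification of however the scheduler actually applies its regexes:

```python
import re


def job_runs(changed_files, irrelevant_patterns):
    """Return True if at least one changed file is relevant to the job.

    Hypothetical helper: a job is skipped only when *every* changed
    file matches at least one irrelevant-files pattern.
    """
    regexes = [re.compile(p) for p in irrelevant_patterns]
    for path in changed_files:
        # A single file that matches no pattern is enough to run the job.
        if not any(r.search(path) for r in regexes):
            return True
    return False


# Illustrative pattern only; real job definitions may differ.
docs_only = [r"^.*\.rst$"]

print(job_runs(["README.rst"], docs_only))
print(job_runs(["README.rst", "devstack-vm-gate.sh"], docs_only))
print(job_runs(["README.rst"], []))
```

Under this model, a job with no file filter always runs. That is why a job like devstack-gate-hooks would be expected to pick up the change if it were attached to the gate pipeline as well as check.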
melwittcan anyone tell me what's the location/url of the gerrit rest api for
clarkbmelwitt: and for authenticated requests16:41
clarkbmelwitt: note they prefix the json responses with invalid json that you have to chomp off16:43
clarkb)]}'\n is what they prefix with so 5 bytes you can just remove16:43
melwittclarkb: oh, ok thanks!16:44
clarkband for /a/ you have to use that url prefix when you are doing actions that require auth. If you supply auth to the non /a/ url you get 403s back and it is very confusing16:46
clarkbit's also fine to use /a/ when not needing auth but you have to supply the auth if you use /a/ and can't query those urls anonymously16:47
clarkbfungi: you had asked about changes related to recently, if your Internets are working well enough could you review that one too?16:54
fungimelwitt: also be aware you need to use http basic auth with this gerrit version (older versions used digest auth instead)16:56
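The REST API notes above can be sketched roughly as follows: strip the 5-byte `)]}'\n` prefix before parsing the JSON, and use the /a/ path prefix for authenticated requests. The helper names and the example host are hypothetical (the actual URL was not captured in this log), and the sketch does no network I/O:

```python
import json

# The four characters )]}' plus a newline: 5 bytes in total.
GERRIT_JSON_PREFIX = ")]}'\n"


def parse_gerrit_json(body):
    """Drop the leading non-JSON prefix, if present, then parse the rest."""
    if body.startswith(GERRIT_JSON_PREFIX):
        body = body[len(GERRIT_JSON_PREFIX):]
    return json.loads(body)


def rest_url(base, endpoint, authenticated=False):
    """Build a REST URL, adding /a/ only for authenticated requests."""
    prefix = "/a" if authenticated else ""
    return base.rstrip("/") + prefix + "/" + endpoint.lstrip("/")


# Hypothetical host, used for illustration only.
base = "https://gerrit.example.org"
print(rest_url(base, "changes/"))
print(rest_url(base, "changes/", authenticated=True))
print(parse_gerrit_json(")]}'\n{\"status\": \"NEW\"}"))
```

With an HTTP client, requests to the /a/ form would carry HTTP basic auth credentials, per the note above. Sending credentials to the non-/a/ form is what produces the confusing 403s.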
*** amoralej is now known as amoralej|off16:56
fungiand if you want to look at an example piece of software which interfaces with the gerrit rest api, ttygroup/gertty is excellent... it does just about anything you can imagine via the api16:57
melwittok thanks16:57
*** amoralej|off has quit IRC16:59
melwittoh good, I will need that :)17:01
fungiclarkb: yep, reviewing now. i basically ended up wiring the entire house into a cellular modem as a workaround, which is not great but i can manage17:07
clarkbwow, when I have those issues I just pick a device to tether with and don't bother with doing the whole home. But it happens less for me so you probably want to have a more robust fallback17:10
fungii switch between computers a lot, so being tied to one machine isn't as efficient for me17:10
fungialso tethering one computer via wifi means it's cut off from the other systems here17:11
mnaserif any other infra-root is around to +2 so that ianw can +A in their AM and help add those mirrors?17:22
*** david-lyle is now known as dklyle17:26
corvussanity check: it looks like we don't release jeepyb? is that correct?17:28
clarkbcorvus: looking17:28
corvusi don't see it on pypi, and i see zero tags17:28
clarkbcorvus: yup agreed it seems we don't release jeepyb, we just deploy it. But jeepyb consumes gerritlib from releases so depending on where things change we do sometimes make releases of gerritlib17:29
corvusk, thx17:29
opendevreviewMerged opendev/system-config master: Remove system-config-legacy-logstash-filters job
opendevreviewMerged zuul/zuul-jobs master: Bump default Helm version to 2.17.0
corvusmanage-projects is a really good candidate for a container image deliverable (like we did for grafyaml)18:05
clarkbcorvus: we bundle it onto the gerrit container now for some reason I don't recall18:06
clarkbit's possible that was for simplicity and it isn't necessary18:07
corvusyeah, i'm assuming we're doing something semi-intrusive (like using the gerrit server keys).  but i have a non-opendev use for just manage-projects, and that's well suited for running completely external18:07
clarkboh I remember!18:07
clarkbit is because jeepyb also contains the gerrit hooks18:07
clarkband gerrit expects those in its path18:07
clarkbFor manage-projects we could split that out I think18:07
clarkbthen also continue to have jeepyb in the gerrit image for hooks18:08
corvusah yep the hooks; that'll do it18:12
fungiyeah, would make sense to install twice (once in gerrit container, once in manage-projects container)18:12
corvusi mean, even we're trying to get rid of the hooks18:13
fungiif someone reimplements those remaining hook scripts as zuul jobs, we can completely drop it from the gerrit container18:13
fungiyeah, precisely18:13
fungiin addition to my current tin cans and string internet backup, the gutter installers finally arrived this afternoon, so i get to listen to hammering, drilling, and sawing sheet metal just on the other side of the wall18:39
fungiat least i don't have any conference calls this afternoon18:40
clarkbwe can probably arrange one if you'd like to share the fun :)18:40
fungioh, you still need to test your new dac, right?18:40
fungi"can you hear the reciprocating saw now? how about now?"18:41
clarkbI tested it with a local call. I'm satisfied it is working18:41
clarkbalso I'm on the couch today because parents are in my office18:41
clarkbso conference call today may not use it18:42
clarkbianw: I have reviewed the statusbot change. Couple of notes there as well as I think there may be missing files (need to git add and repush?)18:59
clarkbianw: also a comment on but I +2'd it anyway as it should work as is19:13
ianwclarkb: thanks!21:43
ianwclarkb: my thought with the volume is that it is one thing that has very little context outside the container; logs and images we want visibility into, but the only reason for this is to workaround overlayfs issues21:47
SpamapSWell I think I'm here now?21:53
ianwSpamapS: it's too early for philosophy here :)21:54
SpamapSassuming time is linear, you're probably right.21:55
SpamapSWeirdly though, I can't seem to rejoin #ceph or #debian-mysql now that I'm on Matrix. :-P21:56
opendevreviewDanni Shi proposed openstack/diskimage-builder master: Add a keylime-agent element and a tpm-emulator
opendevreviewIan Wienand proposed opendev/system-config master: Run statusbot from
corvusspamaps: looks like #debian-mysql needs nick registration; you can give your registered nick password to the bridge, and it will authenticate for you.  start a PM with and say "!help"22:19
ianwi'm running syncs of meeting logs and channel logs now from -> opendev.org22:19
corvusspamaps: "!storepass" is the command to save the password22:20
opendevreviewMerged opendev/system-config master: Add Fedora 34 mirrors

