| opendevreview | daniel.pawlik proposed openstack/ci-log-processing master: Fix Docker image upload in post pipeline https://review.opendev.org/c/openstack/ci-log-processing/+/993879 | 07:40 |
|---|---|---|
| opendevreview | Merged openstack/ci-log-processing master: Fix Docker image upload in post pipeline https://review.opendev.org/c/openstack/ci-log-processing/+/993879 | 08:12 |
| dpawlik | hi fungi clarkb Could you check why the ci-log-processing-upload-image job does not work? https://zuul.opendev.org/t/openstack/builds?job_name=ci-log-processing-upload-image&skip=0 | 08:44 |
| dpawlik | I guess it might be something related to the secret, but if you could take a look at what is on the executor side, that would be great | 08:45 |
| dpawlik | I would like to update the logscraper/logsender today; next week I would like to perform the OpenSearch upgrade to 3.5 as the AWS console suggested | 08:45 |
| dpawlik | I built that on the host. So now it would be good just to update it on Docker Hub :) | 08:53 |
| mnasiadka | dpawlik: there was an earlier discussion about users of the OpenSearch platform - do you know any? | 08:56 |
| dpawlik | Probably melwitt is still using it, also other team related to network, maybe slaweq | 09:05 |
| dpawlik | maybe ykarel_ or sean-k-mooney | 09:09 |
| dpawlik | it's just guessing | 09:09 |
| sean-k-mooney | mnasiadka: i use it often to debug ci failures and understand if it's a one-off or a persistent failure | 09:18 |
| mnasiadka | Ok, so that’s purely RedHat - I was just interested :) | 09:18 |
| sean-k-mooney | often in this case is perhaps once a month | 09:18 |
| mnasiadka | Do you have any tools for that? | 09:18 |
| sean-k-mooney | so | 09:19 |
| sean-k-mooney | https://opensearch.logs.openstack.org/_dashboards/app/data-explorer/discover/#?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-30d,to:now))&_a=(discover:(columns:!(hosts_region,project),interval:d,sort:!()),metadata:(indexPattern:'94869730-aea8-11ec-9e6a-83741af3fdcd',view:discover))&_q=(filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:' | 09:19 |
| sean-k-mooney | 94869730-aea8-11ec-9e6a-83741af3fdcd',key:project,negate:!f,params:(query:openstack%2Fnova),type:phrase),query:(match_phrase:(project:openstack%2Fnova)))),query:(language:kuery,query:'message:%22Cursor%20needed%20to%20be%20reset%22')) | 09:19 |
| sean-k-mooney | well https://opensearch.logs.openstack.org/_dashboards/app/data-explorer/discover/ | 09:19 |
| sean-k-mooney | is the upstream opensearch | 09:19 |
| sean-k-mooney | that is open to everyone | 09:19 |
| sean-k-mooney | it's not redhat specific | 09:19 |
| sean-k-mooney | we used to have elasticsearch and logstash before it | 09:20 |
| mnasiadka | Oh boy, I love that query language | 09:20 |
| sean-k-mooney | the login, if you don't know it, is the super secure openstack openstack | 09:21 |
| sean-k-mooney | mnasiadka: so ya you can use it to look at basically the last 10 days of ci logs | 09:22 |
| sean-k-mooney | so when i see contributors rechecking because of X more than once or twice in a few days | 09:22 |
| sean-k-mooney | i will often see if i know why, and if i want a sense for how bad the bleeding is or when it may have started i'll check opensearch to determine that | 09:23 |
| mnasiadka | Great to see it’s used, but the entry bar is not that low ;-) | 09:24 |
| sean-k-mooney | ya it's not very visible that it exists | 09:25 |
| sean-k-mooney | you kind of have to have been around long enough for someone else to tell you and it does not come up often | 09:25 |
| sean-k-mooney | usually i will convert a dashboard to a tinyurl https://tinyurl.com/2k63h77v | 09:26 |
| sean-k-mooney | and share that in irc when i'm debugging something | 09:26 |
| sean-k-mooney | that issue will be fixed by https://review.opendev.org/c/openstack/nova/+/900783 | 09:28 |
| sean-k-mooney | that's just an example of me looking at a passing job, seeing tracebacks and going "that should not happen" | 09:29 |
| sean-k-mooney | then using opensearch to see how often | 09:29 |
| sean-k-mooney | in that particular case it came up while i was looking at a failure, but that traceback was not relevant, which made it harder to figure out what the actual issue was | 09:30 |
| mnasiadka | dpawlik: secret decryption failed | 09:32 |
| mnasiadka | dpawlik: https://paste.opendev.org/show/b2lAzJ3k3wlia6QJjbg5/ | 09:33 |
| sean-k-mooney | frickler: ya i almost always share the urls via a shortener but it would be nice if it was built in. | 09:33 |
| sean-k-mooney | frickler: clarkb if it didn't exist it would not fundamentally break my workflow, but i would just stop trying to use data to determine if an issue is happening often or not and just go by how often i see it | 09:34 |
| sean-k-mooney | i might be special in that if zuul is red on a change i review, even if it's a non-voting job, i almost always click in and at least check why | 09:35 |
| sean-k-mooney | as in even if the patch is not +2w'd i'll quickly check to see if i can bring that context back into the review when relevant, even if it's just to tell the contributor (ignore this job failure, it was because of X known issue) | 09:37 |
| dpawlik | thanks mnasiadka. Thanks for confirmation | 09:46 |
| mnasiadka | np | 09:48 |
| opendevreview | daniel.pawlik proposed openstack/ci-log-processing master: Update registry credentials https://review.opendev.org/c/openstack/ci-log-processing/+/993888 | 09:56 |
| dpawlik | let's see if my comment "secret encrypted with opendev/system-config repo" is still accurate | 09:57 |
| opendevreview | Merged openstack/ci-log-processing master: Update registry credentials https://review.opendev.org/c/openstack/ci-log-processing/+/993888 | 10:34 |
| dpawlik | mnasiadka it still fails. | 10:48 |
| dpawlik | Could you check if the issue is with decryption? If yes, I will now use openstack/ci-log-processing and remove the comment | 10:49 |
| opendevreview | daniel.pawlik proposed openstack/ci-log-processing master: Re-encrypt registry password with openstack/ci-log-processing https://review.opendev.org/c/openstack/ci-log-processing/+/993896 | 10:52 |
| opendevreview | Merged openstack/ci-log-processing master: Re-encrypt registry password with openstack/ci-log-processing https://review.opendev.org/c/openstack/ci-log-processing/+/993896 | 11:20 |
| opendevreview | Guillaume Boutry proposed openstack/project-config master: sunbeam: retire all single charm repositories https://review.opendev.org/c/openstack/project-config/+/903666 | 11:37 |
| mnasiadka | dpawlik: I see some recent runs of that job passed? | 11:51 |
| dpawlik | indeed! Thanks Michal | 12:07 |
| opendevreview | Seongsoo Cho proposed openstack/openstack-zuul-jobs master: Add Weblate client Ansible role https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/991432 | 13:09 |
| fungi | dpawlik: it looks to me like the cause is that you missed https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/thread/WBBLBI6ZS6FA6Q5ZMH4C2MWPL3WG3H24/ | 13:30 |
| fungi | so it's been broken since january | 13:30 |
| fungi | i guess nobody used it in the past 5 months (or they got stale/misleading results to their queries) | 13:30 |
| fungi | according to git history, today was the first time that secret was updated in years | 13:31 |
| frickler | I guess this is a good moment to remind people not only to re-encrypt the old credentials, but disable those and create and upload new ones | 13:35 |
| sean-k-mooney | fungi: wait are you saying that when i have been using opensearch for the last 6 months the data was all old? | 13:41 |
| sean-k-mooney | because i was seeing results from more recent than that, i.e. Jun 16, 2026 | 13:42 |
| fungi | sean-k-mooney: maybe it didn't rely on that job working then | 13:42 |
| sean-k-mooney | ya i'm not sure, checking now would not really show either way if it's now working | 13:43 |
| sean-k-mooney | but i usually query for the last 30 days | 13:43 |
| sean-k-mooney | and typically get back 10 days worth of results | 13:43 |
| sean-k-mooney | i think that would have returned nothing if that was entirely broken | 13:43 |
| fungi | yeah, so it was the ci-log-processing-upload-image job's dockerhub credential, i guess what broke was the ability to push new versions of the docker image, for which there had been none needed anyway | 13:44 |
| sean-k-mooney | ah that makes sense | 13:44 |
| fungi | which means the service was probably fine, just couldn't be updated | 13:44 |
| sean-k-mooney | so it was getting the data fine but we could not update it | 13:45 |
| fungi | i initially thought it was something like a write credential for pushing log data into opensearch, but i guess that's handled elsewhere | 13:45 |
| sean-k-mooney | we had a leak or zuul secret cve back in january right? | 13:45 |
| sean-k-mooney | so ye rotated the base zuul salt | 13:46 |
| sean-k-mooney | and invalidated all the secrets if i recall properly | 13:46 |
| sean-k-mooney | so https://review.opendev.org/c/openstack/ci-log-processing/+/993896 was just updating them | 13:47 |
| fungi | yes, security vulnerability that could have exposed the per-project encryption keys for secrets, which we fixed but had no way to be sure they hadn't ended up in someone's hands first | 13:47 |
| sean-k-mooney | hence frickler's reminder to not just upload the same one with the new public key | 13:47 |
| sean-k-mooney | but to also rotate it on the backend | 13:48 |
| fungi | which is why in the ml post i linked above, we encouraged everyone to take the opportunity to also reset any credentials or keys that had been encrypted as secrets | 13:48 |
| fungi | as an added precaution | 13:49 |
| sean-k-mooney | fungi: on an unrelated note i discovered my zuul ci didn't have file system corruption; the pod was using the emptyDir k8s volume type, meaning the git caches were dropped on pod recreates | 13:50 |
| sean-k-mooney | fungi: so my issue was with trying to clone nova or cinder at 34Kbps | 13:51 |
| fungi | oof, that sounds... painful | 13:51 |
| sean-k-mooney | im currently debating fixing that | 13:52 |
| sean-k-mooney | or just reinstalling those servers with openstack | 13:52 |
| sean-k-mooney | proxmox is not bad but there is no zuul/nodepool support, and if i ever want to do anything that can't run in a container, like devstack, my current setup can't accommodate that | 13:53 |
| fungi | writing a proxmox driver for zuul-launcher probably wouldn't be all that hard | 13:54 |
| fungi | though i don't know enough about the platform to say that for sure | 13:55 |
| opendevreview | Guillaume Boutry proposed openstack/project-config master: sunbeam: retire all single charm repositories https://review.opendev.org/c/openstack/project-config/+/903666 | 14:19 |
| dpawlik | fungi hey, yes, I missed that mail. I have had no time to make any changes in ci-log-processing, especially since I recently switched teams | 15:19 |
| fungi | yeah, seems like it didn't have any direct impact until now, when you were updating the image | 15:19 |
| melwitt | re: the backscroll, yes I am an OpenSearch user :) | 15:42 |
| mnasiadka | sean-k-mooney: I had the same problem with zuul-operator, ended up having a local fork with a lot of improvements, because the upstream project… well… doesn’t have traction :) | 15:48 |
| mnasiadka | But yes, emptydir doesn’t help | 15:48 |
| sean-k-mooney | mnasiadka: well in this case the main issue for my install is networking | 15:51 |
| sean-k-mooney | i need to debug why its so slow | 15:51 |
| sean-k-mooney | but yes i'm also running a fork with 2 minor tweaks | 15:52 |
| sean-k-mooney | i needed to change the ssh key permissions | 15:52 |
| sean-k-mooney | and something else minor | 15:52 |
| mnasiadka | Cloning nova with the default zuul git config nearly always times out, I had to tune the timeouts - because the defaults are sort of aggressive, but I can understand why | 15:53 |
| sean-k-mooney | https://github.com/SeanMooney/zuul-operator/commit/d2dc4be75a8402c8159764dc24423487add72fe4 and https://github.com/SeanMooney/zuul-operator/commit/89ce1ae82de739c25155c3262578b841efd975eb | 15:53 |
| mnasiadka | And restarting zuul components with emptydir and losing cache is a bit meh | 15:53 |
| sean-k-mooney | well yes nova is indeed a big boy | 15:53 |
| clarkb | mnasiadka: opendev tunes those defaults too. I don't think that is a bug but it is something you need to be aware of if dealing with larger repos | 15:53 |
| sean-k-mooney | although normally when my networking is not borked this vm has 4-5G down and about a tenth of that up | 15:54 |
| mnasiadka | clarkb: let’s say running zuul is a journey, and I’ve been on one for the last 6 months I guess ;) | 15:54 |
| sean-k-mooney | it's actually not so hard to do, it's just a vim-like learning curve the first time | 15:55 |
| sean-k-mooney | also that's kind of unfair, the docker compose is really easy | 15:55 |
| sean-k-mooney | trying to run it on k8s is less so | 15:56 |
| clarkb | there are definitely upsides and downsides to each approach. I do think that more people should be thinking about using simpler tools | 15:57 |
| clarkb | k8s is great at a certain scale. But I'm not sure that most people using it for every last thing actually benefit much | 15:58 |
| mnasiadka | I started with docker compose, but I got tempted by an existing Kubernetes cluster and zuul-operator, which in the beginning worked great, but then I wanted to customize more and more - and now I have like 10 patches on top of upstream | 16:01 |
| mnasiadka | And seriously with a running service it’s hard now to go to some other solution, because the migration will take time | 16:02 |
| clarkb | yup I'm speaking broadly. I don't actually have anything against k8s or people using k8s. I think that it became a default and often times it seems like overkill | 16:03 |
| mnasiadka | Actually Zuul running in k8s is kind of neat, because I can scale any component and k8s will scale the workers accordingly | 16:04 |
| mnasiadka | But getting there was surely some amount of work I didn’t anticipate :) | 16:05 |
| mnasiadka | (As in how hard can that be kind of attitude) | 16:05 |
| sean-k-mooney | clarkb: i used k8s for my current ci because i wanted to learn how flux worked for ci/cd | 16:14 |
| sean-k-mooney | i have been very tempted over the year to add zuul to kolla-ansible | 16:14 |
| sean-k-mooney | because that is so much simpler and easier to debug | 16:14 |
| clarkb | one big upside to k8s (particularly if someone else manages the cluster) is that you can stop thinking so hard about the system layer (OS upgrades etc) | 16:15 |
| clarkb | I think the scaling argument is less useful if you're running a static fleed of k8s nodes to make that happen (though I know you can auto scale that layer too I don't think that is as common) | 16:15 |
| clarkb | s/fleed/fleet/ | 16:15 |
| sean-k-mooney | just so long as you're not on the hook to now maintain that k8s instance as well :) | 16:15 |
| clarkb | yup exactly and for many that is the case because you can pay amazon to do it pretty cheaply | 16:15 |
| sean-k-mooney | so it's definitely cheaper on power/hardware costs for me not to, but if i think about my personal time i likely should just pay someone else to | 16:16 |
| sean-k-mooney | but on the other hand it's good to have a playground to learn these things in lower risk envs | 16:17 |
| clarkb | As for downsides there are a lot of weird behaviors that are not always apparent upfront that you learn later that make things awkward from a process perspective. Like how to resize a volume that is part of a volume claim template. Or dealing with the chowning on volume attachment. Or the default common ingress implementation suddenly not being maintained on short notice | 16:19 |
| clarkb | its all solvable. Its just a different set of problems to "I need to upgrade this operating system" | 16:20 |
| *** | haleyb is now known as haleyb\|out | 22:49 |