dviroel|rover | master promoted, reverting | 00:08 |
dviroel|rover | current-tripleo/2022-08-26 00:02 | 00:08 |
rlandy | nice | 01:05 |
rlandy | train is promoting | 01:05 |
*** rlandy is now known as rlandy|out | 01:33 | |
*** pojadhav|out is now known as pojadhav | 01:35 | |
*** dviroel|rover is now known as dviroel|out | 01:37 | |
*** ysandeep|out is now known as ysandeep | 02:21 | |
ysandeep | good morning oooci o/ | 02:25 |
ysandeep | rlandy|out, regarding sc01 slowness card.. its not only sc01 - all the rdo jobs are slow | 03:28 |
ysandeep | https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-standalone-wallaby/bbc1dd7/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz | 03:28 |
ysandeep | took ~30 mins: 2022-08-25 18:51:02.273791 | fa163e1b-009f-a236-db4b-000000000889 | TIMING | tripleo_firewall : Manage firewall rules | standalone | 0:29:08.429153 | 1684.59s | 03:28 |
ysandeep | Simply running: iptables -t filter -L INPUT takes ~40 seconds | 03:28 |
ysandeep | iptables -t filter -L INPUT -n returns quite quickly | 03:29 |
ysandeep | I am debugging with takashi | 03:29 |
ysandeep | https://www.fir3net.com/UNIX/Linux/iptables-l-output-displays-slowly.html | 03:29 |
ysandeep | https://serverfault.com/questions/791911/centos-extremely-slow-dns-lookup suggests we remove 127.0.0.1 from resolv.conf entry and that resolved the issue | 03:55 |
ysandeep | looks like there was a historical reason they added 127.0.0.1 : https://opendev.org/openstack/tripleo-ci/commit/5d612318c95b9b3ff78e66e91d5c225274cb8b09 | 04:06 |
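A minimal sketch of the diagnosis and workaround discussed above: `iptables -L` reverse-resolves every address and hangs when resolv.conf points at a resolver that is not answering (127.0.0.1 with no local unbound), while `-n` skips DNS entirely. The sample file below stands in for /etc/resolv.conf so the sketch is safe to run, and 10.0.0.2 is a placeholder resolver; this is not the actual CI change.

```shell
# -L reverse-resolves each rule's addresses; with a dead 127.0.0.1 resolver
# every lookup waits for a timeout, hence the ~40s. -n prints numeric output:
#   iptables -t filter -L INPUT      # slow
#   iptables -t filter -L INPUT -n   # fast
# Workaround: drop the stale loopback nameserver entry.
cat > /tmp/resolv.conf.sample <<'EOF'
nameserver 127.0.0.1
nameserver 10.0.0.2
EOF
sed -i '/^nameserver 127\.0\.0\.1$/d' /tmp/resolv.conf.sample
cat /tmp/resolv.conf.sample
```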
*** soniya29|ruck is now known as soniya29 | 04:56 | |
ysandeep | finally, we know what the issue is: the unbound service is not present in RDO images | 06:04 |
ysandeep | that's the difference with Upstream.. causing slowness in dns resolution | 06:05 |
ysandeep | let me check with infra and save ~40 min on each of our job run | 06:05 |
ysandeep | fyi.. in case someone wants to know the details https://bugs.launchpad.net/tripleo/+bug/1983718/comments/10 | 06:06 |
ysandeep | updated: https://bugs.launchpad.net/tripleo/+bug/1983718 and https://trello.com/c/R6SuOv6E/2661-cixlp1983718tripleociproa-periodic-master-scen1-standalone-fails-timeout-manage-firewall-rules | 06:25 |
*** pojadhav is now known as pojadhav|ruck | 06:47 | |
*** jm1|ruck is now known as jm1|rover | 06:48 | |
jm1 | happy friday #oooq ysandeep pojadhav|ruck :) | 06:48 |
pojadhav|ruck | jm1, happy friday you too :) | 06:48 |
ysandeep | jm1, good morning o/ | 06:49 |
frenzyfriday | hey, does anyone know this error: SELinux relabeling of /etc/pki is not allowed - I am getting this while bringing the telegraf container up in downstream. I have already set selinux on the host to permissive | 07:33 |
frenzyfriday | I have tried adding privileged: true to the docker compose | 07:36 |
jm1 | frenzyfriday: so the line this is coming from is https://github.com/rdo-infra/ci-config/blob/63b70523433c31df47eb5cddef19a7a3e9c96a2d/ci-scripts/infra-setup/roles/rrcockpit/files/docker-compose.yml#L87 | 07:44 |
jm1 | frenzyfriday: question is, why do we have this volume in the first place? | 07:44 |
frenzyfriday | yep, it says we are not supposed to mount /etc and some other dirs | 07:45 |
jm1 | frenzyfriday: i guess the only reason is that we want to pass the rh internal ca cert to the container? | 07:45 |
frenzyfriday | We need this to get the certs to connect to downstream | 07:45 |
frenzyfriday | I was wondering, if mounting does not work, probably we have to copy the certs into the container itself in the Dockerfile | 07:45 |
frenzyfriday | But how did it work for the c7 internal cockpit vm? | 07:46 |
frenzyfriday | But in this case ^ whenever the certs change , we will have to rebuild the container | 07:46 |
jm1 | frenzyfriday: c7 had docker which behaves differently from podman in c8/9 in some parts | 07:46 |
chandankumar | frenzyfriday: regarding selinux relabelling issue https://github.com/containers/podman/issues/2379#issuecomment-466770671 might help | 07:48 |
jm1 | frenzyfriday: why not download the ca cert in infra-setup playbook (only for incockpit) to both /etc/pki and another place and then mount it into the container? | 07:48 |
frenzyfriday | yeah, that might work! /m tries | 07:49 |
jm1 | chandankumar, frenzyfriday: from security side this is not a good idea https://bugzilla.redhat.com/show_bug.cgi?id=1594485#c4 | 07:49 |
jm1 | chandankumar, frenzyfriday: we are running containers as root hence we could face exactly the issue described in bz | 07:50 |
*** jpena|off is now known as jpena | 07:52 | |
jm1 | frenzyfriday: we have to add the "ca download step" (to /etc/pki on incockpit) as a task to the ansible playbook/role anyway. so copying it to a safe location as well shouldnt be too hard | 07:53 |
ysandeep | frenzyfriday, jm1 try removing :z from https://github.com/rdo-infra/ci-config/blame/63b70523433c31df47eb5cddef19a7a3e9c96a2d/ci-scripts/infra-setup/roles/rrcockpit/files/docker-compose.yml#L87 | 07:54 |
ysandeep | i think I added that because it was needed for C7 | 07:54 |
jm1 | ysandeep: and it probably still is required because we access https pages which are signed with rh internal ca | 07:55 |
ysandeep | keep the mount just remove :z , I mean revert this patch: https://github.com/rdo-infra/ci-config/commit/ef60d16d9a1ff86ec2087e82f1d26a1b80af77c8 | 07:56 |
ysandeep | not 100% sure that will fix the issue but worth a try. | 07:57 |
jm1 | ysandeep, frenzyfriday: very certainly the rh ca is still required because ruck rover script needs it to access internal pages | 07:57 |
jm1 | ysandeep, frenzyfriday: good point about removing :z, worth a try | 07:59 |
jm1 | ysandeep,frenzyfriday: although it would be nice to use selinux in enforcing mode instead of permissive mode | 08:00 |
ysandeep | +1, remove :z and turn selinux in enforcing mode | 08:02 |
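The suggestion above amounts to keeping the bind mount but dropping the relabel flag. Roughly, in the compose file (only the /etc/pki path comes from the linked line; the service name and the `:ro` flag are illustrative assumptions):

```yaml
services:
  rrcockpit:            # illustrative service name
    volumes:
      # was "- /etc/pki:/etc/pki:z" -- :z asks the runtime to relabel the
      # host directory, which SELinux refuses for /etc/pki
      - /etc/pki:/etc/pki:ro
```

Dropping `:z` keeps the host's SELinux labels intact, which is why it also works with enforcing mode.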
*** ysandeep is now known as ysandeep|away | 08:18 | |
frenzyfriday | containers are up \o/ | 08:29 |
frenzyfriday | ysandeep|away, jm1, chandankumar thank you guys! | 08:29 |
* frenzyfriday now tweaking nginx | 08:29 | |
jm1 | frenzyfriday: awesome! | 08:29 |
chandankumar | yw :-) | 08:32 |
chandankumar | jm1: thank you for the bz link. :-) | 08:33 |
*** pojadhav is now known as pojadhav|ruck | 08:57 | |
*** pojadhav is now known as pojadhav|ruck | 09:03 | |
jm1 | pojadhav|ruck, rlandy|out: rr notes are up to date now. they are now also more or less in sync with cix cards, ordering of known bugs matches "Prodchain Blocked" lane in cix trello board. | 09:16 |
pojadhav|ruck | jm1, I have started looking at upstream components | 09:17 |
jm1 | pojadhav|ruck, rlandy|out: does not look too bad: c8 train compute is 6 days out, c9 wallaby security is 5 days out, the rest is less than 4 days out | 09:17 |
jm1 | * less or equal | 09:18 |
soniya29 | pojadhav|ruck, jm1, let me know when can we have sync-up? | 09:19 |
jm1 | soniya29, pojadhav|ruck: do we want to wait for rlandy|out? i guess she will be here soon | 09:20 |
pojadhav|ruck | yeah.. we can wait 1 more hr | 09:21 |
pojadhav|ruck | then lets sync | 09:21 |
soniya29 | pojadhav|ruck, jm1, okay | 09:21 |
*** pojadhav|ruck is now known as pojadhav|lunch | 09:22 | |
jm1 | pojadhav: i will start rerunning jobs for upstream | 09:22 |
* pojadhav|lunch will be back in half hr | 09:22 | |
pojadhav|lunch | jm1, sure | 09:22 |
pojadhav|lunch | jm1, based on cix card update I am rerunning validation master job here: https://review.rdoproject.org/r/c/testproject/+/40897 | 09:23 |
jm1 | pojadhav|lunch: ack | 09:24 |
frenzyfriday | incockpit: http://10.0.109.28/ finally | 09:41 |
frenzyfriday | The scripts need some change. I will create a patch | 09:42 |
jm1 | frenzyfriday: cool! where do you want to put internal tenant_vars? | 09:56 |
frenzyfriday | there is a base repo downstream, lemme check | 10:05 |
*** pojadhav- is now known as pojadhav|ruck | 10:23 | |
jm1 | pojadhav|ruck, rlandy|out: went through all failing rdo jobs and annotated failing component jobs in rr notes | 10:29 |
* jm1 lunch | 10:30 | |
*** rlandy|out is now known as rlandy | 10:30 | |
rlandy | jm1: thanks | 10:30 |
rlandy | let's sync when dviroel|out is in | 10:30 |
rlandy | pojadhav|ruck: ^^ | 10:30 |
rlandy | is there a new rr hackmd | 10:30 |
rlandy | https://hackmd.io/94uNoMlnQgegrgy1iXV1kQ - ok great | 10:30 |
rlandy | jm1: frenzyfriday: pojadhav|ruck: cockpit upstream is broken | 10:33 |
rlandy | out of date | 10:33 |
rlandy | http://dashboard-ci.tripleo.org/d/HkOLImOMk/upstream-and-rdo-promotions?orgId=1 | 10:33 |
rlandy | promotions are more recent than that | 10:33 |
rlandy | pojadhav|ruck: hello - you around? | 10:34 |
rlandy | chandankumar: hi | 10:34 |
chandankumar | rlandy: hello | 10:34 |
rlandy | arrived safe? | 10:34 |
chandankumar | yes | 10:34 |
chandankumar | thank you :-) | 10:34 |
pojadhav|ruck | rlandy, yes around | 10:35 |
rlandy | pojadhav|ruck: how are you and jm1 dividing the rr work? | 10:36 |
rlandy | are you working downstream? | 10:36 |
pojadhav|ruck | yes I started looking at d/stream, but I didn't find any major blocker, only rerunning jobs which are having single failures. now started looking at upstream components. | 10:37 |
rlandy | pojadhav|ruck: we will need to meet about the TC stuff when dasm gets in | 10:37 |
pojadhav|ruck | rlandy, yes | 10:37 |
rlandy | pojadhav|ruck: pls put notes and last promote dates on the rr notes like jm1 has done | 10:37 |
rlandy | downstream is in good shape | 10:37 |
pojadhav|ruck | rlandy, yep I will add | 10:37 |
rlandy | frenzyfriday: hi | 10:40 |
rlandy | pls ping when around | 10:40 |
rlandy | looks like upstream cockpit is not getting latest data | 10:40 |
frenzyfriday | rlandy, hi | 10:41 |
frenzyfriday | upstream also? :'( | 10:41 |
rlandy | frenzyfriday: hi - can you meet for a few? | 10:41 |
* frenzyfriday checks | 10:41 | |
frenzyfriday | yep sure | 10:41 |
rlandy | frenzyfriday: https://meet.google.com/tug-xmbn-rag?pli=1&authuser=0 | 10:41 |
rlandy | bhagyashris: hi - will need to meet with you next | 10:41 |
rlandy | pls ping when you are around | 10:42 |
rlandy | jm1: pls nicj | 10:47 |
rlandy | nick | 10:47 |
rlandy | jm1: pojadhav|ruck: promotions are in better shape than that - the cockpit is not getting latest data - spoke with frenzyfriday - pls check dlrn for your data | 10:47 |
pojadhav|ruck | ack | 10:48 |
*** ysandeep|away is now known as ysandeep | 10:53 | |
rlandy | bhagyashris: thanks for https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-config/+/426246 - merging | 10:56 |
rlandy | we will need a zuul restart to get the line to show | 10:56 |
rlandy | bhagyashris: can you add 17.1 on rhel8 to the promoter? | 11:00 |
chandankumar | ysandeep+++++++++++++++++++++ | 11:04 |
ysandeep | chandankumar++ thanks for suggestions on c9 dib image :D | 11:07 |
ysandeep | reviewbot: please add in review list: https://review.opendev.org/c/openstack/tripleo-ci/+/854751 | 11:07 |
reviewbot | I could not add the review to Review List | 11:07 |
rlandy | ysandeep: chandankumar: ok - will merge | 11:07 |
bhagyashris | rlandy, hey i am around... | 11:14 |
rlandy | bhagyashris: hi ... review time now | 11:15 |
rlandy | pls join that - will chat with you afterwards | 11:15 |
bhagyashris | sure | 11:15 |
bhagyashris | joining... | 11:15 |
bhagyashris | folks review time... | 11:17 |
bhagyashris | rlandy, https://review.opendev.org/c/openstack/tripleo-heat-templates/+/852446/ | 11:21 |
*** ysandeep is now known as ysandeep|afk | 11:23 | |
bhagyashris | rlandy, https://hackmd.io/kqJ-XcUeQ_24bjQ7pjTVJQ | 11:23 |
*** dviroel|out is now known as dviroel | 11:23 | |
rlandy | akahat: hey - https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-9-multinode-mixed-os&pipeline=openstack-periodic-integration-stable1-cs8&skip=0 | 11:27 |
rlandy | failing | 11:27 |
rlandy | can you look into that | 11:27 |
rlandy | ysandeep|afk: dviroel: you should be unblocked now on resources | 11:30 |
rlandy | dviroel: good morning | 11:30 |
rlandy | jm1: pojadhav|ruck: soniya: dviroel:pls ping when you are all around so we can rr sync | 11:30 |
ysandeep|afk | rlandy++ wohoo awesome \o/ | 11:31 |
ysandeep|afk | thanks! | 11:31 |
dviroel | rlandy: nice o/ | 11:32 |
rlandy | dviroel: fyi - cockpit is not updating | 11:32 |
rlandy | so last nights promos are not showing | 11:32 |
frenzyfriday | there is a problem with the podman networking in the incockpit. the containers are up but they cannot communicate with each other | 11:32 |
rlandy | which is why jm1 thinks things are worse than they are | 11:33 |
rlandy | dviroel: did we promote w c9 security component? | 11:33 |
dviroel | yeah, i see a comment about that above | 11:33 |
dviroel | yes, it is promoted | 11:33 |
rlandy | dviroel++ | 11:33 |
rlandy | so what's out is w c8 integ and w c9 | 11:33 |
rlandy | ok - we can deal with those when jm1 and pojadhav|ruck sync happens | 11:33 |
dviroel | https://trunk.rdoproject.org/centos9-wallaby/component/security/ | 11:33 |
dviroel | rlandy: is this dns fix for vexx affecting everything | 11:34 |
dviroel | ? | 11:34 |
pojadhav|ruck | rlandy, i am available for the sync, will wait for others | 11:35 |
ysandeep|afk | frenzyfriday, podman networking -- hmm are we starting containers as root? | 11:36 |
*** ysandeep|afk is now known as ysandeep | 11:36 | |
frenzyfriday | ysandeep, yep | 11:36 |
ysandeep | I think I know what's the issue and possible workaround, fetching.. | 11:37 |
frenzyfriday | ysandeep, even if I do docker exec it keeps throwing me out | 11:37 |
jm1 | rlandy: nick what? | 11:38 |
jm1|rover | rlandy: i am here | 11:38 |
jm1 | rlandy: me too | 11:38 |
rlandy | ok- think we are all here | 11:39 |
ysandeep | frenzyfriday, read this: https://bugzilla.redhat.com/show_bug.cgi?id=2091840 | 11:39 |
rlandy | dviroel: jm1|rover: pojadhav|ruck: soniya: https://meet.google.com/ocx-ttaj-zbe?pli=1&authuser=0 | 11:39 |
ysandeep | frenzyfriday, run this as root - ip link | grep mtu | 11:40 |
ysandeep | and see if podman bridge have higher mtu than eth0? | 11:40 |
ysandeep | frenzyfriday, let me know if you want to discuss over a gmeet? | 11:42 |
frenzyfriday | ysandeep, yep, cni-podman1: 1500, eth0 1450 | 11:43 |
frenzyfriday | ysandeep, yep sure: https://meet.google.com/sug-xzxz-xvn | 11:44 |
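The check ysandeep suggests, sketched against the values just reported; the heredoc stands in for live `ip link` output so the sketch runs anywhere:

```shell
# Extract interface name and MTU from `ip link`-style output. A bridge MTU
# larger than the uplink's (cni-podman1 1500 > eth0 1450) means container
# packets can exceed what the tenant network carries and get dropped, which
# matches the "containers can't talk to each other" symptom.
mtus="$(awk '/mtu/ {for (i=1;i<=NF;i++) if ($i=="mtu") print $2, $(i+1)}' <<'EOF'
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc fq_codel state UP
3: cni-podman1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
EOF
)"
echo "$mtus"
# Fix per the BZ above: pin the bridge "mtu" in the CNI network config so it
# matches eth0, then recreate the podman network.
```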
akahat | rlandy, looking.. | 11:56 |
chandankumar | I saw some chatter around downstream promoter , Does it sorted out? | 11:58 |
chandankumar | *it got | 11:58 |
*** ysandeep is now known as ysandeep|afk | 12:05 | |
frenzyfriday | chandankumar, nope. Containers are up - but now they cannot communicate between each other | 12:06 |
rlandy | jm1: we missed our 1-1 | 12:15 |
jm1 | rlandy: want to do it in 5mins? | 12:15 |
rlandy | scheduling that for next week when you are off rr | 12:15 |
rlandy | can't have other meetings today | 12:15 |
jm1 | rlandy: ok sure | 12:17 |
jm1 | dviroel, soniya: thanks for your rr notes, that really helped us getting started :) | 12:19 |
jm1 | frenzyfriday: need help with upstream cockpit? | 12:20 |
dviroel | jm1: pojadhav|ruck: testing cloudops fix here https://review.rdoproject.org/zuul/status#44652 | 12:21 |
pojadhav|ruck | dviroel, ack | 12:21 |
jm1 | pojadhav|ruck: will sync our rr notes against cix board again and then update promotion dates in rr notes | 12:24 |
pojadhav|ruck | jm1, sure | 12:24 |
pojadhav|ruck | d/stream one already update based on dlrn data | 12:24 |
pojadhav|ruck | updated* | 12:24 |
pojadhav|ruck | promotion dates | 12:24 |
jm1 | pojadhav|ruck: ack, thanks! | 12:25 |
frenzyfriday | jm1, hey, do you know if there is an ansible role for podman-compose like https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/infra-setup/roles/rrcockpit/tasks/start_services.yml#L16 ? | 12:27 |
frenzyfriday | I can see containers.podman.podman_container but it does not accept compose files | 12:27 |
jm1 | frenzyfriday: i dont think so, this does not show anything https://github.com/containers/ansible-podman-collections/tree/master/plugins/modules | 12:28 |
jm1 | frenzyfriday: podman has kube files. but iirc one can use docker compose with podman. are you running in issues with that? | 12:29 |
jm1 | frenzyfriday: we can debug together if you want | 12:29 |
frenzyfriday | yes, if we have podman installed and try to set up the containers with Dockerfile + docker-compose, then the containers cannot communicate with each other. But if I use podman-compose up it works fine. Now our playbook uses the docker-compose module and passes the compose file to it. We need to do it through podman-compose | 12:30 |
frenzyfriday | jm1, you are RRing this week. I'll try to set up the incockpit manually for now (the compose part) and we can check after your RR | 12:31 |
jm1 | frenzyfriday: whatever you prefer ^^ | 12:31 |
frenzyfriday | I am also off next week | 12:31 |
jm1 | frenzyfriday: clean solution would be switching to kube files on podman | 12:32 |
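Since no containers.podman module accepts a compose file, one stopgap (short of switching to kube files) is shelling out to podman-compose from the playbook. A hypothetical task sketch; the task name and chdir path are illustrative, not the actual start_services.yml change:

```yaml
# Stopgap replacing the docker_compose module call: invoke podman-compose
# directly, since the collection has no compose-file module.
- name: Bring up rrcockpit services with podman-compose
  ansible.builtin.command:
    cmd: podman-compose -f docker-compose.yml up -d
    chdir: /opt/rrcockpit
  become: true
  changed_when: true
```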
jm1 | frenzyfriday: what about upstream cockpit? | 12:32 |
jm1 | frenzyfriday: do you need help there? is it working again? | 12:32 |
frenzyfriday | jm1, no, did not get to the upstream yet | 12:33 |
jm1 | frenzyfriday: ack | 12:33 |
frenzyfriday | jm1, ysandeep|afk for the upstream cockpit I see an error in the telegraf container logs: https://paste.opendev.org/show/bHgnf4CZDrzgkJart8bD/ | 12:36 |
frenzyfriday | maybe thats why we arent getting new content? | 12:36 |
frenzyfriday | Error in plugin: metric parse error: expected tag at 1:127: "zuul-queue-status,url=https://softwarefactory-project.io/zuul/api/tenant/rdoproject.org/status,pipeline=openstack-check,queue=,job=tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-wallaby,review=852845,patch_set=1 result=\"None\",enqueue_time=1661511140489,enqueued_time=103.6614,result_code=-1" | 12:37 |
frenzyfriday | 2022-08-26T12:36:05Z D! [outputs.influxdb] Wrote batch of 1 metrics in 4.045052ms | 12:37 |
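The parse error pasted above points at an empty tag value: influx line protocol rejects `queue=,` (a tag key with no value). A producer-side sketch of a fix -- strip empty tags before writing the metric; the sed expression and the shortened sample line are illustrative, not the actual ruck_rover.py change:

```shell
# "expected tag at 1:127" is the parser stopping at the "queue=," tag.
line='zuul-queue-status,url=https://example/status,pipeline=openstack-check,queue=,job=myjob result="None"'
# Drop any ",key=," pair (tag with empty value) from the tag set.
fixed="$(printf '%s\n' "$line" | sed -E 's/,[a-z_]+=,/,/g')"
echo "$fixed"
```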
ysandeep|afk | dasm|off, ^^ | 12:37 |
ysandeep|afk | frenzyfriday, are you seeing that error continuously or was it only once? | 12:40 |
frenzyfriday | ysandeep|afk, repeating | 12:44 |
frenzyfriday | ysandeep|afk, https://paste.opendev.org/show/bfphcqOi78l7eOWqC2fK/ | 12:44 |
* pojadhav|ruck brb | 12:44 | |
ysandeep|afk | frenzyfriday, probably related to some recent changes, I will let dasm|off take a look otherwise I can take a look on my Monday morning. | 12:47 |
frenzyfriday | ysandeep|afk, cool, thanks! | 12:47 |
frenzyfriday | rlandy, ^ related to upstream cockpit | 12:47 |
rlandy | frenzyfriday: probably last few merged changes on rr tool | 12:48 |
rlandy | soniya also noticed some data being off lately on rr tool | 12:49 |
rlandy | dasm|off: ^^ when you are in, pls look at these errors | 12:49 |
rlandy | pojadhav|ruck: hi - can you start looking at the node provision failure on check? | 12:51 |
pojadhav|ruck | rlandy, sure | 12:53 |
jm1 | frenzyfriday, ysandeep|afk, dasm|off: docker containers such as telegraf_py3 are not automatically updated when ruck_rover.py script is changed. for example, on upstream cockpit the telegraf container is 3 month old, but the latest change to ruck_rover.py is 4 days old | 12:53 |
jm1 | rlandy: ^ | 12:53 |
ysandeep|afk | soniya, could you please elaborate what issues you are seeing on rr tool? | 12:55 |
jm1 | frenzyfriday, ysandeep|afk, dasm|off, rlandy: docker-compose up tries to be smart and recreates containers when docker-compose.yml has been altered, but docker-compose will NOT watch for updates of Dockerfile(s) or files added in those Dockerfile(s) | 12:56 |
ysandeep|afk | jm1, :( we need a way to rebuild container for changes in files. | 12:57 |
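Until the infra code handles this, the manual escape hatch is forcing an image rebuild, which docker-compose never does on its own when only a Dockerfile or a file it COPYs (like ruck_rover.py) changes. A command sketch; the service name telegraf_py3 comes from the chat, the flags are standard docker-compose, and this is not run here:

```shell
# Recreate-on-yml-change does NOT imply rebuild-on-Dockerfile-change.
# Force the rebuild explicitly:
docker-compose build --no-cache telegraf_py3
docker-compose up -d telegraf_py3
# or in one step:
docker-compose up -d --build telegraf_py3
```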
rlandy | yeah - we need to rethink this | 12:57 |
rlandy | also - how many changes are going in rr tool | 12:57 |
rlandy | if they are all needed | 12:57 |
jm1 | ysandeep|afk, rlandy: yeah. how about starting to sync our ansible code in infra-setup with our existing infra? ;) | 12:57 |
soniya | ysandeep|afk, rlandy, sorry i missed your pings | 12:58 |
soniya | ysandeep|afk, the status shown with the rr tool was not so correct, like for a pipeline the rr tool shows status=RED but the pipeline didn't have so many blockers as such | 12:59 |
ysandeep|afk | soniya, by chance do you still have the output/if you are still seeing somewhere? | 13:00 |
soniya | ysandeep|afk, nope, i noticed this while filling the program call doc | 13:02 |
ysandeep|afk | ack, if even one job is failing, whether it's in criteria or not, then red is correct | 13:02 |
jm1 | soniya: what do you mean with "pipeline did not have so many blockers"? are you checking the pipeline with cockpit? | 13:02 |
ysandeep|afk | so that ruck/rover will check the failing job | 13:02 |
* ysandeep|afk goes in a mtg | 13:02 | |
soniya | jm1: we are discussing about rr tool | 13:03 |
rlandy | pojadhav|ruck: 1-1?? | 13:05 |
jm1 | soniya: yeah and in that context i am wondering what you took as a comparison for rr tool to find out that rr tool is broken. i am asking that because maybe we have issues somewhere else as well | 13:05 |
soniya | ysandeep|afk, jm1, so IMHO i think the status shown in rr tool should also consider the latest promotion happened and not just failing criteria jobs, because in our case, we got promotion 2 days back and status shown was 'RED'. just to mention we didn't skip any of jobs as far as i remember | 13:09 |
soniya | <ysandeep|afk> so that ruck/rover will check the failing job - for this we already have rr tool showing the testproject ready for them, that should be enough, right? | 13:10 |
frenzyfriday | hey folks, the CRE team wanted to know if we have rocky linux running any of the jobs of tripleo ci? | 13:11 |
frenzyfriday | afuscoar, ^ | 13:11 |
*** rcastillo|rover is now known as rcastillo | 13:15 | |
rcastillo | o/ | 13:19 |
rcastillo | don't know why my bouncer wants me to be rover :( | 13:19 |
soniya | <ysandeep|afk> soniya, by chance do you still have the output/if you are still seeing somewhere? - may be new ruck/rovers can confirm this | 13:25 |
soniya | if time permits then | 13:25 |
bhagyashris | rlandy, add jobs in criteria https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/426268 and added 17.1 on rhel8 promotion https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44663 | 13:29 |
ysandeep|afk | rlandy, mtg time | 13:31 |
rlandy | bhagyashris: thanks - will look after meeting | 13:31 |
jm1 | soniya: maybe rr tool is right and our cockpit is wrong? we having issues with cockpit atm | 13:42 |
soniya | jm1, ack | 13:51 |
jm1 | frenzyfriday: time for a chat? | 13:52 |
frenzyfriday | jm1, yep sure | 13:53 |
pojadhav|ruck | rlandy, dasm|off still not in yet :( | 14:00 |
rlandy | pojadhav|ruck: let's wait for him | 14:02 |
rlandy | pojadhav|ruck: ping me when you are eod | 14:02 |
pojadhav|ruck | rlandy, i was about to eod by 1:30 pm UTC, now it's 2 pm UTC :D | 14:04 |
*** dasm|off is now known as dasm | 14:04 | |
dasm | o/ | 14:04 |
pojadhav|ruck | omg | 14:04 |
pojadhav|ruck | dasm here :D | 14:04 |
pojadhav|ruck | I am waiting for you only dasm | 14:05 |
dasm | pojadhav|ruck: where? i joined a call, but no one is here | 14:05 |
pojadhav|ruck | dasm joining | 14:05 |
dasm | ack | 14:05 |
*** njohnston_ is now known as njohnston | 14:05 | |
pojadhav|ruck | rlandy, TC stuff ? | 14:06 |
dasm | 12:53 <jm1> frenzyfriday, ysandeep|afk, dasm|off: docker containers such as telegraf_py3 are not automatically updated when ruck_rover.py script is changed. | 14:08 |
dasm | jm1: hmm. i was told it's autobuilt. i don't remember who said that. | 14:09 |
dasm | that's a bad thing. | 14:09 |
rlandy | joining | 14:10 |
dasm | 12:37:30 frenzyfriday | 2022-08-26T12:36:05Z D! [outputs.influxdb] Wrote batch of 1 metrics in 4.045052ms | 14:19 |
dasm | i'm gonna check that, but if we're not deploying new code, we might not have newer changes | 14:19 |
dasm | ysandeep|afk: ^ | 14:19 |
jm1 | dasm: i am on upstream cockpit right now, trying to rebuild telegraf | 14:20 |
dasm | jm1: ack | 14:21 |
ysandeep|afk | dasm, true | 14:21 |
*** ysandeep|afk is now known as ysandeep|out | 14:22 | |
frenzyfriday | folk, lemme know if http://tripleo-cockpit.lab4.eng.bos.redhat.com/?orgId=1 is working for you | 14:24 |
frenzyfriday | I have changed the centos9 image to 8 for telegraf. thanks jm1 for the suggestion! The script will need a lot of work which will take time | 14:25 |
dviroel | frenzyfriday: works for me, i see 17.1 rhel 9 there, no data yet but, at least, the board is there now | 14:26 |
afuscoar | frenzyfriday: in my case the downstream dashboard I've created is there http://tripleo-cockpit.lab4.eng.bos.redhat.com/d/tbUsg0Z4k/downstream-data?orgId=1 | 14:28 |
afuscoar | Just no information to populate. I guess bc telegraf problem | 14:28 |
frenzyfriday | afuscoar, hm.. lemme check what happened to the data | 14:29 |
*** jpena is now known as jpena|off | 14:29 | |
afuscoar | Idk if the script is located in the right path | 14:31 |
chandankumar | see ya people | 14:36 |
chandankumar | happy weekend :-) | 14:37 |
rlandy | frenzyfriday: http://tripleo-cockpit.lab4.eng.bos.redhat.com/d/bSwsg0WVz/rhel9-rhos17-1-full-component-pipeline - yep | 14:37 |
rlandy | no data yet on the downstream side | 14:37 |
afuscoar | Oh, it's also happening in other dashboards | 14:37 |
afuscoar | happy weekend chandankumar | 14:37 |
rlandy | frenzyfriday++++++++++++++ | 14:38 |
rlandy | jm1++++++++++++++++++ | 14:38 |
rlandy | thank you both | 14:38 |
rlandy | dasm: when you are done with UA stuff, can you look into the upstream cockpit error with frenzyfriday? | 14:39 |
rlandy | hasn't collected new data | 14:39 |
dasm | rlandy: it takes few minutes to gather data, but i'll check that one | 14:39 |
dviroel | jm1: I added you to the cloudops space, you might receive tons of notifications, but you can disable them | 14:40 |
dviroel | jm1: we are discussing cloudops issue there since yesterday | 14:40 |
jm1 | dviroel: ack ok thanks! | 14:41 |
dasm | frenzyfriday: i logged into lab4 instance and manually ran ruck_rover.py --influx. I see results. They should be landing in the cockpit soon | 14:43 |
jm1 | dasm: frenzyfriday has fixed downstream (manually) | 14:43 |
frenzyfriday | dasm, ack, yeah the logs say it is still pulling data | 14:43 |
* frenzyfriday lunch | 14:43 | |
jm1 | rlandy, pojadhav|ruck: upstream cockpit is currently refreshing but might still be broken. you can use downstream cockpit for the moment, frenzyfriday has fixed it | 14:44 |
rlandy | jm1: frenzyfriday: thank you | 14:44 |
rlandy | jm1: putting in patch to chase wallaby c9 | 14:45 |
jm1 | rlandy: thank you! put my latest update to rr notes. | 14:45 |
jm1 | rlandy, dasm, ysandeep|out: upstream cockpit has several issues which we fixed manually for now, e.g. mtu was wrong which prevented container rebuilds for months. (this actually should have caused issues since the beginning of that vm on vexxhost). anyway, we will fix infra code next week. i am eod for today | 14:48 |
rlandy | jm1: ok - have wallaby c8 and c9 in rerun | 14:48 |
dasm | jm1: o/ have a good weekend mate | 14:48 |
rlandy | jm1: asked dasm to add a new eoic | 14:48 |
rlandy | epic | 14:48 |
rlandy | to track needed cockpit changes | 14:48 |
dasm | it's gonna be EPIC! | 14:48 |
rlandy | to take up next sprint | 14:48 |
rlandy | or this one - if anyone has time | 14:49 |
dasm | EPIC sprint for EPIC epic with EPIC task :) | 14:49 |
rlandy | jm1: dasm: pls log these tasks there | 14:49 |
dasm | (sorry, i'm in a weekend mode atm) | 14:49 |
rlandy | so we capture what needs to be done | 14:49 |
rlandy | jm1: have a good weekend | 14:49 |
jm1 | rlandy, dasm: ack. dasm please add me to 'Watchers' on jira cards (in case you are creating new ones for infra) | 14:51 |
dasm | jm1: ack. cc pojadhav|ruck | 14:51 |
* jm1 have a nice weekend #oooq | 14:53 | |
pojadhav|ruck | rlandy, leaving for the day | 14:54 |
rlandy | frenzyfriday: huge thank you for dealing with all this | 14:54 |
rlandy | frenzyfriday+++++++++++++++++++++++++++ | 14:54 |
pojadhav|ruck | did most of the JIRA stuff with dasm | 14:54 |
rlandy | pojadhav|ruck: dasm: thank you both!!! | 14:55 |
rlandy | Team is rocking out!!!! | 14:55 |
* pojadhav|ruck out | 14:55 | |
*** pojadhav|ruck is now known as pojadhav|out | 14:55 | |
pojadhav|out | see you all on monday !! | 14:55 |
pojadhav|out | bbye | 14:55 |
rlandy | have a great weekend | 14:56 |
*** dviroel is now known as dviroel|lunch | 15:00 | |
Tengu | chandankumar: heya! I think there's an issue with the way molecule are launched in tripleo-ansible - it seems to limit to the "default" scenario... how can I make sure it's running *all* ? | 15:02 |
rlandy | Tengu: you missed chandankumar - is it urgent? | 15:13 |
Tengu | rlandy: not really - just making tripleo-ansible molecule tests less reliable. | 15:13 |
Tengu | I'll check next week. | 15:13 |
Tengu | but basically, this explains why molecule tests are all green for my tripleo_httpd_vhost role, while it's NOT working :). | 15:14 |
Tengu | "woops". | 15:14 |
rlandy | yep - too good to be true type thing | 15:15 |
Tengu | exactly. | 15:15 |
Tengu | I was testing my role "in real situation" and it crashes with a template thingy that should have crashed zuul. | 15:16 |
Tengu | it didn't. | 15:16 |
Tengu | so I hunted down the molecule report and, wow, only one test was launched :). the "default" scenario. «woopsss» | 15:16 |
rlandy | :) | 15:16 |
rlandy | ok - so tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001 works | 15:17 |
rlandy | but c9 doesn't provision nodes .... fhmmm | 15:17 |
rlandy | last success 08/24 | 15:18 |
rlandy | charming | 15:18 |
Tengu | rlandy: I may have a working change for the tripleo-ansible molecule thingy. | 15:43 |
Tengu | testing on my env. | 15:43 |
Tengu | it's "meh", but it allows to run all the scenarios. | 15:44 |
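For the record, molecule only runs the `default` scenario unless told otherwise; the standard CLI way to cover everything (flags are molecule's own, the scenario name is illustrative):

```shell
molecule test --all           # run every scenario, not just "default"
molecule test -s my_scenario  # or target one scenario by name (illustrative)
```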
*** frenzyfriday is now known as frenzyfriday|pto | 15:48 | |
*** dviroel|lunch is now known as dviroel | 16:10 | |
rlandy | dasm: hi | 16:22 |
rlandy | when was the last working centos image you could use for bmc? | 16:22 |
rlandy | date of that image? | 16:22 |
dasm | rlandy: 20220606.0 next one which fails is 20220621.1 | 16:23 |
rlandy | ok it's not that | 16:24 |
dasm | > CentOS-Stream-GenericCloud-9-20220606.0. Following releases, none of them starts. | 16:24 |
rlandy | yeah - I think we need to promote master | 16:27 |
rlandy | current-tripleo hash is from 08/22 | 16:28 |
rlandy | we need 08/25 or later | 16:28 |
dviroel | rlandy: on 08/25 we already have cloudops bug? | 16:41 |
dviroel | yes right, it was yesterday | 16:41 |
dviroel | :P | 16:41 |
*** rcastillo|rover is now known as rcastillo | 16:55 | |
dviroel | rcastillo: do you miss rover times? | 16:55 |
dviroel | :) | 16:55 |
rcastillo | every day I miss it | 16:57 |
rlandy | rcastillo: we can always put you back on :) | 17:16 |
rcastillo | it'd be unfair not to let others in on the fun :) | 17:16 |
rlandy | you're so considerate :) | 17:16 |
dasm | rcastillo: i believe others don't mind to allow you to have all fun ;) | 17:20 |
dasm | *all the fun | 17:20 |
rcastillo | you're just saying that to be nice | 17:23 |
dviroel | rlandy: that tempest test failure on wallaby c8 seems to be a real issue? | 18:12 |
dviroel | not first time? | 18:12 |
rlandy | dviroel: got one test failing on latest run | 18:12 |
rlandy | not repeated | 18:12 |
rlandy | running again | 18:12 |
rlandy | to compare | 18:12 |
rlandy | different hash | 18:13 |
dviroel | i think I saw that failing yesterday, not sure which hash | 18:13 |
rlandy | dviroel: wrt node prov failure - I really think we need to promote a more recent version of master | 18:13 |
rlandy | dviroel: that was a different hash - and a different test | 18:13 |
dviroel | ok | 18:13 |
rlandy | following both hashes | 18:13 |
rlandy | dviroel: for node prov - you promoted hash from 08/22 I think | 18:14 |
dviroel | rlandy: yeah, but we are blocked due to scenario001 | 18:14 |
rlandy | we need 08/25 | 18:14 |
rlandy | correct | 18:14 |
rlandy | so once that is cleared | 18:14 |
dviroel | cloudops is out already | 18:14 |
dviroel | their fix didn't work | 18:15 |
dviroel | they use Spaces to chat, I can add you if you want | 18:15 |
dviroel | if you want notifications on your gmail | 18:15 |
dviroel | :) | 18:16 |
rlandy | now looking at wallaby c9 | 18:36 |
rlandy | ugh - I give up - skipping fs001 to promote wallaby c8 | 19:58 |
rlandy | dviroel: ^^ https://logserver.rdoproject.org/54/36254/156/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby/e4928dd/logs/undercloud/var/log/tempest/stestr_results.html.gz - best result I can get | 19:59 |
rlandy | skipping and promoting this hash | 19:59 |
rlandy | ugh - tempest you win, I give up | 19:59 |
dviroel | rlandy: you have my vote on this | 20:03 |
rlandy | dviroel: will need - patch in progress -one sec | 20:04 |
rlandy | dviroel: ok - https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44664 - here | 20:06 |
rlandy | dviroel: thank you | 20:08 |
rlandy | openstack-periodic-integration-main - quite good | 20:10 |
rlandy | waiting on sc01 | 20:11 |
rlandy | dviroel: pls check vexx flavors are there | 20:11 |
rlandy | ticket says done | 20:11 |
dviroel | they missed one, but it is ok for now | 20:17 |
dviroel | will unblock our work | 20:17 |
dviroel | the missing flavor may or may not be needed | 20:17 |
dviroel | the one with extra memory | 20:17 |
dviroel | rlandy: added a new section on podified doc, about cluster provision | 20:18 |
rlandy | checking which one was missed | 20:18 |
* dviroel brb - walk | 20:26 | |
rlandy | dviroel: found which one is missing | 20:28 |
rlandy | responded on ticket | 20:28 |
rlandy | dviroel: dasm: rcastillo: one of you guys, pls check me on this before I merge: https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44663 | 20:31 |
rlandy | from bhagyashris | 20:31 |
rcastillo | looking | 20:31 |
rlandy | t | 20:32 |
rlandy | ty | 20:32 |
rcastillo | looks good | 20:35 |
rcastillo | do depends-on work on rdo -> downstream gerrit? | 20:35 |
dasm | rlandy: lgtm | 20:39 |
dasm | but, there is one minor thing. i'm double checking | 20:40 |
dasm | ok, it's good. | 20:42 |
rlandy | thanks | 20:52 |
* dviroel back | 21:00 | |
* dasm => offline | 21:06 | |
dasm | see you Monday! | 21:06 |
*** dasm is now known as dasm|off | 21:07 | |
rlandy | bye dasm | 21:11 |
rlandy | dviroel++ thanks for onboarding help | 21:25 |
rlandy | still trying to promo c8 | 21:42 |
rlandy | patch keeps failing on a diff test | 21:42 |
dviroel | docker rate limiting | 21:43 |
rlandy | mumble | 21:43 |
rlandy | I pay $7 a month for us to have a paid account | 21:43 |
dviroel | haha | 21:43 |
dviroel | we just need to add container login on these jobs then | 21:44 |
dviroel | will solve that issue | 21:44 |
dviroel | I still need to move rdo to container-login role | 21:44 |
dviroel | i have a note about that, but it is good to create tasks | 21:45 |
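The fix dviroel mentions is logging in to the registry so the jobs stop hitting Docker Hub's anonymous pull limit. As a side illustration of why anonymous pulls flake, here is a hedged sketch of retrying a pull with exponential backoff; `pull` and `RateLimited` are hypothetical stand-ins, not part of any real client, and the real remedy discussed above is the container-login role, not retries.

```python
import time


class RateLimited(Exception):
    """Stand-in for a registry 429 'too many requests' error."""


def pull_with_backoff(pull, image, attempts=4, base_delay=1.0):
    """Call pull(image), backing off exponentially on rate-limit errors."""
    for attempt in range(attempts):
        try:
            return pull(image)
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error
            time.sleep(base_delay * 2 ** attempt)


# Example: a pull that is rate-limited twice, then succeeds.
calls = []

def flaky_pull(image):
    calls.append(image)
    if len(calls) < 3:
        raise RateLimited()
    return "pulled:" + image

print(pull_with_backoff(flaky_pull, "quay.io/example/image", base_delay=0))
```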
rlandy | otherwise will try on sunday | 21:45 |
rlandy | when less people are active | 21:45 |
dviroel | are we missing wallaby c8 and c9? | 21:46 |
dviroel | how is wallaby c9? | 21:46 |
rlandy | don't ask | 21:51 |
rlandy | kvm job now passed | 21:51 |
rlandy | as did tempest | 21:51 |
rlandy | checking ovb | 21:51 |
rlandy | new hash just started | 21:52 |
rlandy | old one was out 64, 39, 1 | 21:52 |
rlandy | 35 | 21:52 |
rlandy | https://review.rdoproject.org/r/c/testproject/+/43883 | 21:52 |
rlandy | was my rerun | 21:52 |
rlandy | 2022-08-26 15:31:24.107091 | primary | TASK [print content of 'resolv.conf' after modifications] ********************** | 21:53 |
rlandy | 2022-08-26 15:31:24.107327 | primary | Friday 26 August 2022 15:31:24 -0400 (0:00:01.672) 0:15:50.294 ********* | 21:53 |
rlandy | 2022-08-26 15:31:24.133797 | primary | ok: [undercloud] => { | 21:53 |
rlandy | 2022-08-26 15:31:24.133846 | primary | "msg": "Content of resolv.conf: # Generated by NetworkManager\nsearch openstacklocal novalocal\nsearch ooo.test\nnameserver 10.0.0.250\n# NOTE: the libc resolver may not support more than 3 nameservers.\n# The nameservers listed below may not be recognized." | 21:53 |
rlandy | ha | 21:53 |
rlandy | that may be the change made yesterday | 21:54 |
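Earlier in the day the slow-DNS bug was traced to a loopback `nameserver` entry in resolv.conf with no local resolver behind it. A minimal sketch of the kind of check involved, parsing a resolv.conf blob like the one the Ansible task prints above (`loopback_nameservers` is a hypothetical helper, not part of the CI code):

```python
def loopback_nameservers(resolv_conf: str) -> list:
    """Return nameserver entries in a resolv.conf blob that point at localhost."""
    found = []
    for line in resolv_conf.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "nameserver" and parts[1].startswith("127."):
            found.append(parts[1])
    return found


# The content printed by the job above: only the external nameserver remains.
sample = (
    "# Generated by NetworkManager\n"
    "search openstacklocal novalocal\n"
    "search ooo.test\n"
    "nameserver 10.0.0.250\n"
)
print(loopback_nameservers(sample))  # -> []
```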
rlandy | diff overcloud deploy errors in multiple runs | 21:57 |
rlandy | on 064 | 21:57 |
rlandy | we're getting nowhere now | 22:16 |
dviroel | hum | 22:17 |
dviroel | yeah, 064 and 039 were like that in master too | 22:18 |
dviroel | did you try another cloud? ibm, internal? | 22:18 |
* dviroel it is happening, I see 3 control plane and 3 worker vms | 22:19 | |
dviroel | 7 instances if you add bootstrap | 22:19 |
dviroel | rlandy: i think that these jobs are testing you | 22:23 |
dviroel | rlandy: failing again | 22:23 |
* dviroel brb | 22:31 | |
*** dviroel is now known as dviroel|afk | 22:31 | |
*** dviroel|afk is now known as dviroel | 22:58 | |
* dviroel almost there with ocp cluster | 23:00 | |
dviroel | have a great weekend team? | 23:00 |
dviroel | s/?/! | 23:01 |
rlandy | bye all | 23:01 |
dviroel | o/ | 23:01 |
rlandy | have a great weekend | 23:01 |
*** dviroel is now known as dviroel|out | 23:01 |