Friday, 2022-10-28

*** rlandy\|bbl is now known as rlandy\|out		00:09
opendevreview	Ian Wienand proposed opendev/system-config master: [wip] testing with bridge99.opendev.org https://review.opendev.org/c/opendev/system-config/+/862845	01:16
opendevreview	Ian Wienand proposed opendev/base-jobs master: infra-prod: Move project-config reset into base-jobs https://review.opendev.org/c/opendev/base-jobs/+/862853	03:56
opendevreview	Ian Wienand proposed opendev/system-config master: Remove old bridge testing https://review.opendev.org/c/opendev/system-config/+/862766	03:59
opendevreview	Ian Wienand proposed opendev/system-config master: Refernce bastion through prod_bastion group https://review.opendev.org/c/opendev/system-config/+/862845	03:59
opendevreview	Ian Wienand proposed opendev/system-config master: Revert "Update to tip of master in periodic jobs" https://review.opendev.org/c/opendev/system-config/+/862854	03:59
opendevreview	Ian Wienand proposed opendev/base-jobs master: infra-prod: Move project-config reset into base-jobs https://review.opendev.org/c/opendev/base-jobs/+/862853	07:00
*** jpena\|off is now known as jpena		07:22
*** benj_0 is now known as benj_		08:07
*** ShadowJonathan_ is now known as ShadowJonathan		08:07
*** andrewbonney_ is now known as andrewbonney		08:07
*** walshh__ is now known as walshh_		08:07
*** open10k8s_ is now known as open10k8s		08:07
*** aprice_ is now known as aprice		08:07
*** erbarr_ is now known as erbarr		08:07
*** TheJulia_ is now known as TheJulia		08:07
*** gouthamr_ is now known as gouthamr		08:07
*** eball_ is now known as eball		08:07
*** snbuback_ is now known as snbuback		08:07
*** PrinzElvis_ is now known as PrinzElvis		08:07
*** chateaulav_ is now known as chateaulav		08:07
*** odyssey4me_ is now known as odyssey4me		08:13
frickler	infra-root: https://review.opendev.org/c/opendev/system-config/+/862759/1/playbooks/roles/nodepool-base/tasks/main.yaml is broken, the attr must be named "host" instead of "addr"	09:58
frickler	noticed this via mail from failing nodepool job	09:58
opendevreview	Dr. Jens Harbott proposed opendev/system-config master: Fix generated zookeeper config for nodepool https://review.opendev.org/c/opendev/system-config/+/862878	10:01
frickler	unrelated: wheels were last successfully built 8 days ago	10:03
*** rlandy_ is now known as rlandy		10:35
*** arxcruz is now known as arxcruz\|ruck		10:36
frickler	looks like afs rpm issue again https://zuul.opendev.org/t/openstack/build/80ad124ac3cf4933a7b9a381ad2f0b9c	10:36
frickler	regarding nodepool, someone might want to check why the failures can be seen in the job log, but the job still passed https://bb6d07e84661d82562d5-daf98166b205b84408724e1df10e75fa.ssl.cf5.rackcdn.com/862759/1/gate/system-config-run-nodepool/4cb5b5b/nl01.opendev.org/docker/nodepool-docker_nodepool-launcher_1.txt	11:23
*** dviroel is now known as dviroel\|rover		11:44
opendevreview	Merged opendev/system-config master: Fix generated zookeeper config for nodepool https://review.opendev.org/c/opendev/system-config/+/862878	12:11
frickler	actually, with that patch in, there is still something wrong in the nodepool.yaml generated in gate: the host addresses are empty https://c60c9b47a943d67d7acd-72f68c73c06acdc7229714e8d93d40d1.ssl.cf1.rackcdn.com/862878/1/gate/system-config-run-nodepool/73b5a84/nl01.opendev.org/docker/nodepool-docker_nodepool-launcher_1.txt	12:21
frickler	will have to watch what gets generated by the next periodic job on the live systems	12:22
fungi	we can revert https://review.opendev.org/862759 and just go back to relying on the old module for now, worst case	12:29
frickler	nodepool servers seem to be happy again since 1h, so the issue seems to happen only in CI	13:40
frickler	revert would have been difficult unless we also revert to using old bridge	13:41
fungi	not really. new bridge was already working without 862759, that was merely a performance improvement	13:54
fungi	the actual problem we hit on the new bridge was related to some unused zk keys which were embedded as raw binary, and were causing encoding problems for ansible	13:55
fungi	that was fixed by a separate change just prior to 862759	13:55
frickler	ah, good, do you remember where that fix was? can't find anything matching in system-config	14:10
frickler	also this is failing for a couple of days https://zuul.opendev.org/t/openstack/builds?job_name=infra-prod-base&project=opendev/system-config	14:10
frickler	"usermod: user zuul is currently used by process 1370895" on bridge01 itself	14:12
fungi	frickler: possible it was an edit to our private group vars, checking...	14:14
fungi	frickler: yep, that's where it was	14:15
fungi	"Remove unsued keytab entries" (most recent commit in /etc/ansible/hosts on the new bridge)	14:16
frickler	fungi: ah, right, thx	14:28
fungi	anyway, those were what was gumming up the works for newer python, apparently	14:29
*** dasm\|off is now known as dasm		14:58
clarkb	re the usermod error that seems similar to the issue I addressed with removing the ubuntu user as part of launch node	15:10
clarkb	I addressed that by forcing the removal regardless as a subsequent step does a reboot. I doubt we're trying to remove the zuul user there but maybe any modification is tripping over similar?	15:10
opendevreview	Andy Ladjadj proposed zuul/zuul-jobs master: fix(packer): prevent task failure when packer_variables is not defined https://review.opendev.org/c/zuul/zuul-jobs/+/836744	15:21
opendevreview	Andy Ladjadj proposed zuul/zuul-jobs master: [upload-logs-base] add ipa extension to known mime types https://review.opendev.org/c/zuul/zuul-jobs/+/834045	15:21
opendevreview	Andy Ladjadj proposed zuul/zuul-jobs master: [upload-logs-base] add android mime-type https://review.opendev.org/c/zuul/zuul-jobs/+/834046	15:22
clarkb	fungi: frickler: re nodepool config we should double check that the old code produced different results too. Thinking out loud here I wonder if our inventory in the test jobs has enough of the bits we use in production to produce a working config	15:27
opendevreview	Andy Ladjadj proposed zuul/zuul-jobs master: [ensure-python] install python version only if not present https://review.opendev.org/c/zuul/zuul-jobs/+/770656	15:36
*** dviroel\|rover is now known as dviroel\|rover\|lunch		15:37
clarkb	infra-root after breakfast I'll get around to deleting gitea-lb01 and jvb02 from their respective clouds	15:42
clarkb	I'll start by shutting down services on the hosts and letting them sit for about an hour just to make sure there isn't any unexpected fallout then turn them off	15:42
clarkb	s/then turn them off/then delete them/	15:43
*** jpena is now known as jpena\|off		15:48
fungi	sounds great, thanks!	15:50
clarkb	services are now off	15:51
clarkb	expect deletions to occur around 1700 UTC	15:51
*** dviroel\|rover\|lunch is now known as dviroel\|rover		16:31
clarkb	infra-root the rax dns backup is failing on bridge01, but it is/was also failing on bridge.o.o. Not a regression	16:37
clarkb	infra-root I've just realized that both bridges will attempt to run the zuul restart playbook later today	16:43
fungi	d'oh!	16:43
clarkb	Since bridge.o.o shouldn't be configured automatically anymore, I'm going to manually comment out the crontab entry for the playbook on that server, allowing bridge01 to be the lone zuul restart commander	16:43
fungi	probably time that we shut down the old bridge (just not delete it)	16:44
fungi	but yeah that's a good intermediate step	16:44
clarkb	crontab is edited	16:44
clarkb	ya we should probably wait for ianw's monday before doing that?	16:44
fungi	agreed	16:44
clarkb	just in case he feels there are still things that need cross checking	16:44
fungi	the new bridge is working great though	16:44
fungi	and now that we've sussed out how to make latest osc work with rackspace volume management, i don't really have anything i need to preserve from the old bridge	16:45
clarkb	ya I've got a bunch of stuff in my homedir but I'm fairly certain none of it is particularly important	16:47
clarkb	running `openstack` for the first time on bridge01 reminds me this is a docker command	16:51
clarkb	it started doing things I didn't expect at first so I ^C'd	16:51
clarkb	personally I'm not sure how I feel about consuming osc that way	16:52
clarkb	it's definitely a surprise to have stuff download in response to a server list	16:52
clarkb	and now I'm wondering why python-openstackclient and openstackclient are both on pypi	16:54
clarkb	python-openstackclient appears to be the up to date one	16:55
clarkb	openstackclient says it is a meta package that installs the same major version of python-openstackclient. It doesn't have new releases so I guess that stopped getting updated	16:56
clarkb	anyway I'm setting up another venv because in the past that has become necessary. I'm just jumping the gun on that.	16:59
clarkb	and I can list nodes and volumes in vexxhost which is what I need to double check my work deleting gitea-lb01	17:00
clarkb	infra-root it's been an hour since I shut down services on jvb02 and gitea-lb01, any objections to deleting them now?	17:00
clarkb	ok gitea-lb01 is deleted, but that didn't auto delete its bfv volume. Going to delete that too	17:04
clarkb	the volume doesn't seem to remember the last thing it was attached to which makes deletion difficult if you don't do a volume list first (I did this out of fear this may happen)	17:06
clarkb	I can't server list against rax	17:08
clarkb	Version 2 is not supported, use supported version 3 instead.	17:09
clarkb	Invalid client version '2.0'. Major part should be '3'	17:09
clarkb	fungi: do you have this working on bridge01? your comments about the volume stuff make me think that this may be the case	17:09
clarkb	#status log Deleted gitea-lb01 (e65dc9f4-b1d4-4e18-bf26-13af30dc3dd6) and its BFV volume (41553c15-6b12-4137-a318-7caf6a9eb44c) as this server has been replaced with gitea-lb02.	17:10
opendevstatus	clarkb: finished logging	17:10
clarkb	ok the issue is the cinder api. Apparently we don't support API v2 in the latest version	17:12
clarkb	ok downgrading to osc<5.0.0 fixes it (5.0 might work too? I'm not sure). It reused the python-cinderclient wheel I already had cached which implies the issue is in osc not cinderclient	17:15
clarkb	#status log Deleted jvb02.opendev.org (a93ef02b-4e8b-4ace-a2b4-cb7742cdb3e3) as we don't need this extra jitsi meet jvb to meet ptg demands	17:18
opendevstatus	clarkb: finished logging	17:18
clarkb	gtema: ^ fyi re client issues	17:18
opendevreview	Clark Boylan proposed opendev/zone-opendev.org master: Remove gitea-lb01 and jvb02 from DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/862941	17:20
clarkb	The good news is that the new bridge can do this stuff with some minor tweaks	17:23
fungi	clarkb: latest cli/sdk worked for me by pinning python-cinderclient<8	17:37
fungi	it was cinderclient 8.0.0 which dropped volume v2 api support	17:38
clarkb	fungi: it works fine with old osc and cinderclient 9.1.0	17:41
fungi	the ~fungi/foo venv on bridge01 is able to volume list, and was built via just `pip install openstackclient 'python-cinderclient<8'`	17:41
clarkb	it could be that both things are breaking it in different ways but if you change one or the other then it works	17:41
clarkb	oh openstackclient should install the same version that python-openstackclient<5.0.0 installs	17:42
clarkb	I don't think I'm going to debug this further, I just want to call it out as something that downgrading osc alone seems to have fixed, so it isn't solely a cinderclient issue	17:42
fungi	that working venv has openstackclient==4.0.0 and python-openstackclient==6.0.0	17:42
fungi	also openstacksdk==0.102.0	17:43
clarkb	mine has python-openstackclient==4.0.2 and python-cinderclient==9.1.0 and openstacksdk==0.102.0	17:43
clarkb	I guess the promise that openstackclient always installs the same major version of python-openstackclient as its major version is wrong	17:43
fungi	so it's either old osc with new cinderclient, or new osc with old cinderclient?	17:43
clarkb	ya I think so	17:44
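
The combinations reported above can be summarised in a small check. This is only a sketch of what was observed in this log, not an official compatibility matrix: the function names are hypothetical, and the cut-offs (python-openstackclient 5 and python-cinderclient 8) are assumptions taken from the discussion, where osc 5.0 itself was explicitly left untested.

```python
# Sketch only: encodes the combinations reported in this log.
# Assumed cut-offs: python-openstackclient < 5 or python-cinderclient < 8.
# osc 5.0 was not tested in the discussion, so the osc boundary is a guess.


def major(version: str) -> int:
    """Return the leading integer of a dotted version string."""
    return int(version.split(".")[0])


def volume_list_reported_ok(osc: str, cinderclient: str) -> bool:
    """True if at least one package is on the older side of its cut-off."""
    return major(osc) < 5 or major(cinderclient) < 8


if __name__ == "__main__":
    # Versions quoted in the log; the last one is the failing "latest" pair.
    for osc, cinder in [("4.0.2", "9.1.0"), ("6.0.0", "9.1.0")]:
        print(osc, cinder, volume_list_reported_ok(osc, cinder))
```

Only the major versions are compared, which matches how the pins were expressed in the channel (`python-cinderclient<8`, `osc<5.0.0`).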
fungi	clarkb: to provide a different clouds.yaml file path/name with osc you need to export an envvar, right? i remember there was some way to do it but can't find a command-line flag at least	18:00
fungi	i guess i should be looking for the old oscc docs	18:21
clarkb	yes I think it is an env var. Something like OS_CONFIG_FILE	18:41
clarkb	I forget the actual name though	18:41
clarkb	fungi: OS_CLIENT_CONFIG_FILE openstacksdk defines it not osc	18:48
fungi	aha, thanks!	18:49
fungi	i had taken to grepping through the source because i didn't spot it in the docs	18:49
clarkb	ya it might be worth having osc's docs link to sdk's docs on the subject	18:50
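
A minimal sketch of using that variable from Python, assuming the `openstack` CLI is on the path. The helper name, the file path, and the cloud name are hypothetical; only `OS_CLIENT_CONFIG_FILE` comes from the discussion above.

```python
import os


def env_with_clouds_file(path, base_env=None):
    """Return a copy of the environment with OS_CLIENT_CONFIG_FILE set.

    The variable is read by openstacksdk (and so by osc) to locate an
    alternative clouds.yaml. The input mapping is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OS_CLIENT_CONFIG_FILE"] = path
    return env


if __name__ == "__main__":
    # Hypothetical path, shown only to illustrate the call.
    env = env_with_clouds_file("/tmp/alt-clouds.yaml")
    print(env["OS_CLIENT_CONFIG_FILE"])
```

The returned mapping can be passed as `env=` to `subprocess.run(["openstack", "--os-cloud", "example", "server", "list"], env=env)`, which keeps the override scoped to that one command instead of exporting it in the shell.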
*** dviroel\|rover is now known as dviroel\|rover\|afk		20:22
*** dasm is now known as dasm\|off		20:23
clarkb	part of me wants to start upgrading gitea backends to jammy now, but I think waiting for the openssl vuln to be fixed is a good idea	21:02
*** dviroel\|rover\|afk is now known as dviroel\|rover		21:34
*** dviroel\|rover is now known as dviroel\|out		21:36
fungi	oh, is jammy already openssl 3.x?	21:45
clarkb	I think so	21:47
fungi	looks like it is, yeah	21:52
fungi	3.0.2-0ubuntu1.6	21:52
