clarkb | https://docs.python.org/3/library/ssl.html#id9 is the related documentation | 00:03 |
clarkb | I bet that this is related to TLS 1.3 somehow | 00:03 |
*** sboyron has quit IRC | 00:05 | |
fungi | clarkb: my change and the dnm change stacked on it indicate that your revision works with 3.6-3.8 on bionic and 3.9 on focal, so i expect it's safe | 00:18 |
fungi | is there more we want to test? | 00:18 |
*** cloudnull has quit IRC | 00:27 | |
*** cloudnull has joined #opendev | 00:27 | |
TheJulia | Anything special about limehost? | 00:29 |
TheJulia | I ask because it seems to be the cloud our multinode job in ironic loves to fail on | 00:30 |
fungi | limestone? it's ipv6-only with ipv4 access to the internet via many-to-one nat | 00:33 |
TheJulia | yeah | 00:33 |
fungi | what kind of failures? talking to things on the internet? ipv4-only things? | 00:33 |
TheJulia | it *looks* like the vxlan tunnel is just not passing traffic | 00:33 |
TheJulia | between the two nodes | 00:33 |
fungi | oh, neat. i think we do our multinode setup specifically p2p so that vxlan won't try to use multicast... are you using ours or did you roll your own? | 00:34 |
fungi | is it failing to pass traffic over vxlan between nodes all the time there, or only sometimes? | 00:35 |
fungi | possible the lan there has gotten partitioned or something, i suppose | 00:36 |
openstackgerrit | Merged opendev/system-config master: refstack: use CNAME for production server https://review.opendev.org/c/opendev/system-config/+/780125 | 00:38 |
fungi | TheJulia: another possibility is that ipv4 connectivity for some nodes is breaking partway into the build? we'd still be able to reach them via ipv6 so zuul wouldn't realize anything had gone wrong network-wise | 00:40 |
fungi | might make sense to look at syslog on one of the failure examples, see if dhcpd logs any lease updates, arp overwrites, et cetera | 00:41 |
TheJulia | https://0b3775447bad164395a7-ce9ebe3ea1326bbb58a211f00836955d.ssl.cf2.rackcdn.com/778145/2/gate/ironic-tempest-ipa-wholedisk-direct-tinyipa-multinode/4c83c2d | 00:41 |
TheJulia | When we power up VMs attached to brbm, basically the packets never get through it appears | 00:41 |
TheJulia | so they never boot | 00:41 |
TheJulia | at least off of compute1 | 00:41 |
fungi | basically we're communicating with those nodes exclusively over ipv6, while vxlan is communicating between the nodes over ipv4, so if the latter is dying that could explain it | 00:42 |
fungi | we do at least test initially that each node can reach something on the internet over ipv4, but it could be breaking after that i suppose | 00:43 |
fungi | syslog shows iptables blocking a bunch of multicast traffic | 00:44 |
fungi | is that typical? | 00:44 |
fungi | vxlan will try to tunnel layer-2 broadcast traffic over multicast ip | 00:45 |
fungi | possible that's just benign noise | 00:49 |
*** tosky has quit IRC | 00:51 | |
TheJulia | I think it is noise | 00:51 |
TheJulia | cross node traffic seems to work just fine otherwise | 00:55 |
TheJulia | I'm not an ovs expert but it almost looks like ovs kind of works, datapath gets established, and then ovs seems to become unhappy and boom | 00:59 |
* TheJulia wonders about MTUs | 00:59 | |
guillaumec | clarkb, indeed, "context.options |= ssl.OP_NO_TLSv1_3" solves the zuul ssl test issue | 01:01 |
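(For reference, the workaround under discussion looks roughly like the sketch below; the server-side context setup and the certificate paths are illustrative, not gear's actual code.)

```python
import ssl

# Start from the negotiate-everything protocol, then mask out TLS 1.3
# so the handshake falls back to 1.2 and the test hang goes away.
context = ssl.SSLContext(ssl.PROTOCOL_TLS)
context.options |= ssl.OP_NO_TLSv1_3
# Placeholder paths; a real server would load its actual cert/key here.
context.load_cert_chain(certfile="server.crt", keyfile="server.key")
```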
fungi | TheJulia: ooh, good line of inquiry. that could vary by provider too | 01:01 |
fungi | TheJulia: https://zuul.opendev.org/t/openstack/build/4c83c2d9c1774ce09f0d447bbdbed4d1/log/zuul-info/zuul-info.compute1.txt#26 | 01:02 |
fungi | 1500 | 01:02 |
fungi | i think devstack tries to set the virtual interfaces lower to accommodate that | 01:02 |
TheJulia | 1500 feels like classic physical interface. v6 has pmtu discovery, I wonder if we're in some weird cross-hypervisor packet dropping | 01:03 |
* TheJulia prepares to mark the job non-voting :( | 01:03 | |
fungi | yeah, also any particular snapshot of the pmtu for those peers won't necessarily be consistent | 01:04 |
fungi | logan-: ^ if you're around, maybe you could have some theories since you know what the underlying network looks like | 01:05 |
TheJulia | For the VMs themselves, we're dropping the MTU to 1330. Neutron runs at 1430 (there is a reason for the 100 bytes, I just don't remember it without tasty beverages.) | 01:07 |
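(For the overhead math being alluded to: a point-to-point vxlan tunnel like the one the multinode jobs build adds roughly 50 bytes of encapsulation, so each layer steps its MTU down from the 1500-byte physical interface. A rough sketch of such a tunnel; the interface name, VNI and peer addresses below are made up for illustration.)

```sh
# Outer interface is 1500; the vxlan headers (outer IP + UDP + vxlan)
# eat roughly 50 bytes, so the tunnel itself should run at ~1450.
ip link add vxlan-peer type vxlan id 88 \
    remote 10.4.70.11 local 10.4.70.10 dstport 4789
ip link set vxlan-peer mtu 1450 up
# Networks stacked on top step down again: neutron at 1430 and the
# nested VMs at 1330, leaving headroom for their own encapsulation.
```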
fungi | yeah, that memory is best not relived without some chemical safety net | 01:08 |
TheJulia | Yeah, I'd think the only way to really figure this out is to be able to catch it in the act with a pcap or something | 01:09 |
TheJulia | but that would be huge | 01:09 |
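(A capture limited to the vxlan UDP port and rotated through fixed-size files would keep the size manageable; a sketch, with the interface name assumed and the default vxlan port.)

```sh
# Capture only vxlan traffic (default UDP port 4789), rotating through
# five 100 MB files so the capture never grows without bound.
sudo tcpdump -ni ens3 udp port 4789 -C 100 -W 5 -w /tmp/vxlan.pcap
```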
fungi | is this the most common failure for that job? if so an autohold could at least keep the nodes around after the job fails | 01:10 |
fungi | doesn't mean whatever was breaking them would still be broken by the time we logged in, but worth a shot | 01:10 |
TheJulia | fungi: I looked at ?3? randomly on that job and it was all the same | 01:13 |
TheJulia | all on limestone | 01:13 |
TheJulia | I dunno, I'm kind of okay with just deferring it at the moment, too much work to do. | 01:13 |
TheJulia | that is unless magical ideas appear | 01:14 |
openstackgerrit | Jeremy Stanley proposed opendev/gear master: DNM: see if intermediate Python versions work too https://review.opendev.org/c/opendev/gear/+/780131 | 01:18 |
fungi | TheJulia: once you (or anyone really) is ready to dig into it, we can set up an autohold for that job and wait for it to catch a failure | 01:23 |
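(The hold itself is placed with zuul's autohold command on the scheduler; a sketch of the invocation, with the project path and reason as placeholders.)

```sh
# Keep the nodes from the next failing run of this job for debugging.
zuul autohold --tenant openstack \
    --project openstack/ironic \
    --job ironic-tempest-ipa-wholedisk-direct-tinyipa-multinode \
    --reason "TheJulia debugging limestone vxlan failures" \
    --count 1
```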
*** mlavalle has quit IRC | 01:24 | |
TheJulia | fungi: much appreciated | 01:34 |
*** mgagne has joined #opendev | 02:29 | |
ianw | kopecmartin / clarkb : i gave the containers a cycle after the config change applied and i can see results on https://refstack.openstack.org now. so i think it's working and won't roll back | 02:38 |
johnsom | I'm trying to push that tag for wsme, but ssh with gerrit is rejecting me. Even if I try to checkout a patch using ssh I get permission denied. Any tips/ideas? | 02:41 |
johnsom | The key in gerrit (web) is correct | 02:41 |
*** artom has quit IRC | 02:46 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: refstack: cleanup old puppet https://review.opendev.org/c/opendev/system-config/+/780138 | 02:49 |
johnsom | Ok, it's something broken on this fedora workstation. Everything works fine from other VMs. | 02:54 |
*** whoami-rajat_ has joined #opendev | 02:55 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: certcheck: cleanup letsencrypt domains https://review.opendev.org/c/opendev/system-config/+/780140 | 03:01 |
ianw | johnsom: fedora 33? | 03:01 |
johnsom | yeah | 03:02 |
ianw | yep, that's a known issue | 03:02 |
johnsom | lol | 03:02 |
fungi | openssl security defaults | 03:02 |
johnsom | Can I get an hour refund? | 03:02 |
ianw | https://issues.apache.org/jira/browse/SSHD-1118 if you'd like to read too much inconclusive detail on it :) | 03:02 |
johnsom | ha, thanks, I will take a look | 03:02 |
ianw | speaking of, RAX got the wrong end of the stick with my report that fedora 33 doesn't work with their console host | 03:03 |
ianw | i think they thought it meant fedora 33 hosts don't show a console, not that you can't connect to their console host via fedora 33 with default configuration | 03:04 |
ianw | that's even more screwed up and i'm owed a bigger refund than johnsom there :) | 03:05 |
johnsom | Yep, that was the exact problem. Thanks ianw | 03:08 |
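(Assuming this is the RSA/SHA-1 signature negotiation problem described in SSHD-1118, a common client-side workaround is to re-allow ssh-rsa for the gerrit host only, e.g. in ~/.ssh/config.)

```
# Re-enable ssh-rsa signatures just for gerrit, since mina-sshd
# (before the fix requested in SSHD-1141) cannot complete rsa-sha2
# authentication under Fedora 33's tightened crypto policy.
Host review.opendev.org
    PubkeyAcceptedKeyTypes +ssh-rsa
```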
*** whoami-rajat_ is now known as whoami-rajat | 03:16 | |
ianw | i filed https://issues.apache.org/jira/browse/SSHD-1141 as requested in sshd-1118 | 03:32 |
ianw | i think i distilled it correctly, fungi ^ could maybe check :) | 03:32 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: kerberos-kdc: role to manage Kerberos KDC servers https://review.opendev.org/c/opendev/system-config/+/778840 | 04:06 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: kerberos: switch servers to Ansible control https://review.opendev.org/c/opendev/system-config/+/779890 | 04:06 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: kerberos-kdc: add database backups https://review.opendev.org/c/opendev/system-config/+/779891 | 04:06 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: refstack: add backup https://review.opendev.org/c/opendev/system-config/+/775061 | 04:18 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: borg-backup hosts: use exact names https://review.opendev.org/c/opendev/system-config/+/780144 | 04:28 |
*** ysandeep|holiday is now known as ysandeep | 04:33 | |
*** ykarel has joined #opendev | 04:50 | |
ykarel | ianw, hi, u around? | 05:11 |
ykarel | we are facing mirror issues for centos 8-stream, and on checking I see the centos mirror that the infra mirrors follow has not synced for 12 hours | 05:12 |
ykarel | http://mirror.dal10.us.leaseweb.net/centos/8-stream/AppStream/x86_64/os/repodata/ one | 05:13 |
ykarel | in https://mirror-status.centos.org/ i see some mirror which are good | 05:14 |
openstackgerrit | Merged opendev/system-config master: refstack: add backup https://review.opendev.org/c/opendev/system-config/+/775061 | 05:15 |
ykarel | the mirror you added in https://review.opendev.org/c/opendev/system-config/+/684437 is good currently; it was later changed to the current one ^ as that one was not up to date at the time and was not listed in mirror-status.centos.org | 05:17 |
ykarel | in https://review.opendev.org/c/opendev/system-config/+/716602 | 05:17 |
*** stevebaker has quit IRC | 05:18 | |
*** stevebaker has joined #opendev | 05:23 | |
ykarel | ok http://mirror.dal10.us.leaseweb.net/centos/8-stream/AppStream/x86_64/os/repodata/ is updated now, so next rsync should fix infra mirrors | 05:25 |
*** whoami-rajat has quit IRC | 05:28 | |
ykarel | the last run missed that, and the next run is in approx 1.25 hours | 05:37 |
ykarel | if it can be manually triggered before that it would be good, otherwise we'll have to wait | 05:37 |
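(The infra mirrors refresh from that upstream with a periodic rsync; a manual run would look roughly like the sketch below, though the exact rsync module and AFS destination path are guesses rather than the real mirror-update script.)

```sh
# Roughly what the periodic centos 8-stream mirror update does;
# source module and destination volume are illustrative only.
rsync -rltDvz --delete \
    rsync://mirror.dal10.us.leaseweb.net/centos/8-stream/ \
    /afs/.openstack.org/mirror/centos/8-stream/
```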
*** ykarel_ has joined #opendev | 06:08 | |
*** ykarel has quit IRC | 06:08 | |
*** ralonsoh has joined #opendev | 06:18 | |
*** marios has joined #opendev | 06:20 | |
*** ykarel_ has quit IRC | 06:31 | |
*** ykarel has joined #opendev | 06:32 | |
*** whoami-rajat_ has joined #opendev | 06:56 | |
ykarel | mirrors got updated now | 07:01 |
*** slaweq has joined #opendev | 07:11 | |
*** eolivare has joined #opendev | 07:25 | |
ianw | ykarel: sorry, missed this, things in sync now? | 07:34 |
ykarel | ianw, yes now it's synched | 07:34 |
ykarel | ianw, now seeing issue with epel repos not synched | 07:47 |
*** hashar has joined #opendev | 07:48 | |
ykarel | http://pubmirror1.math.uh.edu/fedora-buffet/epel/8/Everything/x86_64/repodata/?C=M;O=D vs mirror.ord.rax.opendev.org/epel/8/Everything/x86_64/repodata/?C=M;O=D | 07:50 |
ykarel | and other epel mirror https://dl.fedoraproject.org/pub/epel/8/Everything/x86_64/repodata/?C=M;O=D | 07:51 |
*** sboyron has joined #opendev | 08:05 | |
*** andrewbonney has joined #opendev | 08:33 | |
*** amoralej has joined #opendev | 08:44 | |
kopecmartin | ianw: clarkb the return_to address is fixed, thanks for that, but I still can't sign in. I suspect it might be something with the realm; in openstackid.org I can see that I'm signing in from the "Site" realm instead of 'refstack.openstack.org' | 08:50 |
*** tosky has joined #opendev | 08:51 | |
*** jpena|off is now known as jpena | 08:58 | |
ttx | kopecmartin: yes I confirm I see the same. It's weird as the URL has openid.realm=https%3A%2F%2Frefstack.openstack.org | 09:08 |
kopecmartin | ttx: hmm, is there something else which has to be set on the server side in order to have the correct realm? | 09:10 |
ttx | kopecmartin: I have no idea.. I'll ask openstackID folks to have a look. Was anything changed in the parameters, or was just everything copied over from the old one? | 09:10 |
ttx | Or could it be some DNS propagation issue ? Like the IP we have for refstack.o.o is not the same as the one the openstackid server sees? | 09:12 |
kopecmartin | ttx: there were lots of changes in the server config (like puppet -> containers move, py2->py3, OS version ...) but no significant changes in the configs | 09:14 |
kopecmartin | well , a little workaround with redirection https://review.opendev.org/c/opendev/system-config/+/776292/18/playbooks/roles/refstack/templates/refstack.vhost.j2 | 09:15 |
kopecmartin | maybe that ^^? | 09:15 |
kopecmartin | might be, unfortunately the dns config is outside my scope | 09:17 |
*** stevebaker has quit IRC | 09:17 | |
ttx | hmm, probably not. Here the issue is that clicking "Log In" should give us the login form, not the openstackid.org front page | 09:17 |
ttx | I'll ask the ID provider guys, they should be able to tell us what's missing. i'll let you know here if they reply anything useful, And thanks again for working on this! | 09:18 |
kopecmartin | ttx: sure, thanks .. I'm gonna quickly check the refstack project to see how the signin url is formed - i remember there were some changes too | 09:18 |
*** ysandeep is now known as ysandeep|lunch | 09:41 | |
openstackgerrit | Aurelien Lourot proposed openstack/project-config master: Add Magnum charm to OpenStack charms https://review.opendev.org/c/openstack/project-config/+/780211 | 09:50 |
*** dtantsur|afk is now known as dtantsur | 09:51 | |
openstackgerrit | Aurelien Lourot proposed openstack/project-config master: Add Magnum charm to OpenStack charms https://review.opendev.org/c/openstack/project-config/+/780211 | 09:54 |
*** smcginnis has joined #opendev | 10:38 | |
*** bodgix has quit IRC | 10:59 | |
*** bodgix_ has joined #opendev | 10:59 | |
*** slaweq has quit IRC | 11:00 | |
*** slaweq has joined #opendev | 11:02 | |
*** brinzhang0 has quit IRC | 11:11 | |
openstackgerrit | Rotan proposed openstack/diskimage-builder master: replace the link which is in the 06-hpdsa file https://review.opendev.org/c/openstack/diskimage-builder/+/730286 | 11:20 |
*** ysandeep|lunch is now known as ysandeep | 11:36 | |
openstackgerrit | Merged zuul/zuul-jobs master: bindep.txt: skip python-devel for el8 platform https://review.opendev.org/c/zuul/zuul-jobs/+/780050 | 11:47 |
*** hashar is now known as hasharLunch | 12:10 | |
*** smcginnis has quit IRC | 12:28 | |
*** jpena is now known as jpena|lunch | 12:32 | |
*** artom has joined #opendev | 12:44 | |
*** tkajinam has quit IRC | 12:54 | |
*** hasharLunch is now known as hashar | 13:00 | |
*** smcginnis has joined #opendev | 13:05 | |
*** ykarel has quit IRC | 13:08 | |
*** ykarel has joined #opendev | 13:09 | |
*** amoralej is now known as amoralej|lunch | 13:23 | |
*** jpena|lunch is now known as jpena | 13:49 | |
*** smcginnis has quit IRC | 13:52 | |
*** mlavalle has joined #opendev | 13:59 | |
dtantsur | hi folks! is it only me, or is there some issue with published logs? https://zuul.opendev.org/t/openstack/build/6f9c830b828e4ff382ed05bfdc608a80/log/job-output.txt | 14:03 |
fungi | ianw: that mina-ssh feature request looks good to me, also they've already replied suggesting you could implement it for them ;) | 14:04 |
fungi | dtantsur: "This logfile could not be found" usually means either we failed trying to upload it, or it disappeared off the swift server after upload. i'll take a look in the executor debug logs in a bit to rule out the former (that usually ends in a post_failure result though) | 14:06 |
dtantsur | thanks! note that it's a very recent run, so it shouldn't have timed out. | 14:06 |
fungi | i'll need to look at it after i run some errands this morning, but will dig in as soon as i'm back | 14:10 |
*** amoralej|lunch is now known as amoralej | 14:12 | |
openstackgerrit | Rich Bowen proposed opendev/yaml2ical master: Adds second- and fourth- week recurring meetings https://review.opendev.org/c/opendev/yaml2ical/+/780266 | 14:14 |
*** hashar is now known as hasharAway | 14:20 | |
*** mfixtex has joined #opendev | 14:24 | |
*** smcginnis has joined #opendev | 14:28 | |
TheJulia | out of curiosity, is the new gerrit webui making huge calls for lists of everything as it could relate to the user interaction? | 14:32 |
*** lpetrut has joined #opendev | 14:35 | |
*** mfixtex has quit IRC | 14:37 | |
*** whoami-rajat_ is now known as whoami-rajat | 14:40 | |
fungi | kopecmartin: clarkb: ttx: apparently the problem is the auth url should be https://openstackid.org/accounts/openid2 not just the base site url | 14:47 |
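(In refstack.conf terms, that means pointing the endpoint option at the openid2 handler rather than the bare site URL; a sketch, with the section name assumed.)

```ini
# Section name is an assumption; the option itself is what matters.
[osid]
openstack_openid_endpoint = https://openstackid.org/accounts/openid2
```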
fungi | TheJulia: not entirely sure, it's implemented with polymer... but the reason we suspect it's slow is for backend reasons (the relational database has been replaced with objects in git repositories) | 14:48 |
fungi | and it seems like memory pressure might be making filesystem caches inefficient | 14:48 |
*** hasharAway is now known as hashar | 14:49 | |
TheJulia | OH! | 14:52 |
TheJulia | Yeah, that explains a lot | 14:52 |
TheJulia | since it *looks* like the client asks for things like all my changes, all my blah, at least what I can grok on the screen, and if that doesn't load quite fast enough then the page load breaks it seems | 14:54 |
TheJulia | This is why database indexes are a thing too | 14:54 |
TheJulia | "Hi, give me the index" vs "hi, pls tablescan this for me" | 14:54 |
fungi | yeah, and gerrit maintains very large in-memory and on-disk caches of stuff, but the indexing even in caches becomes quite important | 14:57 |
*** Green_Bird has joined #opendev | 14:59 | |
*** ysandeep is now known as ysandeep|dinner | 15:00 | |
*** eolivare has quit IRC | 15:01 | |
*** Green_Bird has quit IRC | 15:01 | |
*** eolivare has joined #opendev | 15:02 | |
*** Green_Bird has joined #opendev | 15:02 | |
fungi | dtantsur: it looks like uploads for that build worked fine, but that one file is not available in swift for some reason (other logs uploaded for that same build can be accessed no problem). have you seen more examples of this? maybe we can find a commonality | 15:03 |
*** Green_Bird has quit IRC | 15:03 | |
fungi | specifically, https://14f46b65f6b8edf7deec-a7117e65d5d46fb2ebde9a8b3aa13b86.ssl.cf2.rackcdn.com/780251/1/check/releases-tox-list-changes/6f9c830/job-output.txt reports a "Content Encoding Error" from the rackspace swift cdn | 15:03 |
*** artom has quit IRC | 15:04 | |
*** Green_Bird has joined #opendev | 15:04 | |
dtantsur | I haven't seen other cases, no | 15:05 |
fungi | so this may be something broken in rackspace's cdn layer, or data corruption at rest (though swift i think prevents that, i don't know how "swift" rackspace's deployment is), or it could be we did something weird when uploading the file (but which did not produce any error) | 15:05 |
*** Green_Bird has quit IRC | 15:05 | |
*** Green_Bird has joined #opendev | 15:06 | |
openstackgerrit | Martin Kopec proposed opendev/system-config master: refstack: Fix openid endpoint https://review.opendev.org/c/opendev/system-config/+/780272 | 15:09 |
kopecmartin | fungi: clarkb ianw ^^ | 15:09 |
kopecmartin | fungi: thanks .. i didn't notice it was overridden in the config, I checked just the default value in refstack ..ah | 15:10 |
fungi | kopecmartin: awesome, reviewing now | 15:11 |
fungi | kopecmartin: i've approved, once it deploys please double-check whether things are working as desired | 15:12 |
*** artom has joined #opendev | 15:16 | |
*** lpetrut has quit IRC | 15:16 | |
fungi | i need to pop out to run some errands (i'm a bit behind) but should be back in an hour | 15:19 |
kopecmartin | fungi: thank you, sure | 15:20 |
openstackgerrit | Merged opendev/system-config master: refstack: Fix openid endpoint https://review.opendev.org/c/opendev/system-config/+/780272 | 15:42 |
clarkb | fungi: re the gear change safety I think guillaumec is saying that those changes will break zuul testing on focal with python 3.8 | 16:03 |
clarkb | fungi: and it appears related to the enablement of tls 1.3 via PROTOCOL_TLS | 16:03 |
clarkb | we could maybe update the bottom change to disable 1.3 for now? | 16:03 |
clarkb | ( I worry that is the sort of change that becomes permanent) | 16:04 |
*** hashar is now known as hasharAway | 16:06 | |
clarkb | guillaumec: maybe we can try to do a minimal reproducer forcing tls 1.3 between client and server and then asking both of them to stop? | 16:08 |
clarkb | guillaumec: since the test is timing out I suspect that it may just be a teardown/cleanup problem? | 16:08 |
*** hasharAway is now known as hashar | 16:11 | |
*** dhellmann has quit IRC | 16:12 | |
clarkb | fungi: TheJulia: ovs did not support vxlan over ipv6 until relatively recently (and even that may be spec defying?). One option may be to update the multi node bridge stuff to run it over ipv6 if present as that will get us on the preferred IP stack for providers like limestone | 16:13 |
*** dhellmann has joined #opendev | 16:13 | |
clarkb | though using codesearch I'm not sure that the multi node bridge stuff is involved? seems like this may all happen in devstack | 16:14 |
TheJulia | yeah, there is some magic there someplace in the entire multinode setup | 16:14 |
TheJulia | I have to hunt it down every single time I need to look at it :\ | 16:14 |
TheJulia | That might be an option. Interestingly enough, cross-node v4 seems to be fine in general, but we may just not be seeing everything that could be happening from the job logs, which makes it seem like everything is fine | 16:16 |
*** ysandeep|dinner is now known as ysandeep | 16:18 | |
clarkb | ya and vxlan is udp and could be more sensitive to those problems? | 16:18 |
*** klonn has joined #opendev | 16:19 | |
clarkb | another issue it could be is conflicting ip addrs | 16:20 |
clarkb | we saw that way back when osic was around because they assigned test node ips out of 10/8 and occasionally the overlays would overlap ip ranges and routing would break | 16:20 |
clarkb | ok it is using the multinode network setup via zuul. It does so with a patch interface between brbm and br-infra called phy-brbm-infra | 16:22 |
clarkb | and phy-infra-brbm | 16:22 |
clarkb | they are opposite ends of the same virtual cable | 16:22 |
clarkb | does not appear to be an ip conflict. br-infra uses 172.24.4.0/24 and the limestone nodes are 10.4.70.0/24 | 16:25 |
clarkb | that probably rules out the easy things, holding a couple of nodes and inspecting the result is likely the easiest way to debug | 16:29 |
clarkb | ianw: the mina sshd feature request lgtm. Also chris might be my hero | 16:38 |
kopecmartin | clarkb: fungi will this https://review.opendev.org/c/opendev/system-config/+/780272 be applied on the server automatically or is there a manual action required? | 16:39 |
fungi | kopecmartin: looks like we don't have a separate deploy job for it yet, so it should get applied in our hourly deployment i think? i'll check in a sec | 16:40 |
fungi | or it could be the deploy jobs haven't finished yet | 16:40 |
kopecmartin | great, thanks .. just wanted to be sure | 16:41 |
clarkb | there should be an infra-prod job, we may have to intervene and restart the service to pick up the config change though | 16:41 |
fungi | kopecmartin: oh, it just hasn't run yet, see the deploy pipeline at https://zuul.opendev.org/t/openstack/status | 16:41 |
clarkb | I notice ianw did that earlier for the fqdn switch | 16:41 |
fungi | there is an infra-prod-service-refstack build for it in waiting state, but it's a ways down the list | 16:42 |
fungi | and yeah, maybe the playbook needs a restart handler for config changes | 16:42 |
*** amoralej is now known as amoralej|off | 16:42 | |
*** marios is now known as marios|out | 16:45 | |
ttx | one of the jobs seems to have failed | 16:47 |
ttx | infra-prod-base on the deploy of 780272 | 16:47 |
fungi | yeah, the infra-prod-base job probably had trouble deploying to a down server somewhere, i'm about to go hunting in the logs on the bastion, it shouldn't affect refstack deployment unless it was the refstack server which was the problem | 16:48 |
ttx | ack | 16:48 |
*** marios|out has quit IRC | 16:48 | |
fungi | that's the job which does things like add our sysadmin accounts, set up mta configs, et cetera | 16:48 |
fungi | but it's running against every machine in our inventory, so if one is down/hung somewhere, that'll report a build failure | 16:49 |
fungi | d'oh! | 16:50 |
fungi | refstack.openstack.org : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0 | 16:50 |
fungi | so bridge can't reach refstack.openstack.org | 16:50 |
fungi | aha, expected | 16:51 |
fungi | refstack01.openstack.org : ok=60 changed=2 unreachable=0 failed=0 skipped=7 rescued=0 ignored=0 | 16:51 |
fungi | refstack01.openstack.org is working, but the refstack.openstack.org server in our inventory (i'm guessing the old one) is unreachable, probably offline in preparation for being deprovisioned but we haven't deleted it from the inventory yet | 16:51 |
clarkb | yup, ianw says the old server would be shutdown but not removed for now | 16:52 |
fungi | ttx: so that build failure is expected in this case | 16:52 |
fungi | i suppose we could have added that server to our disable list to avoid the deploy build trying to reach it and reporting failure | 16:55 |
fungi | something we could consider for future deprovisioning work | 16:55 |
TheJulia | clarkb: oh yeah, definitely way more sensitive | 16:56 |
clarkb | fungi: ya and maybe we should go ahead and add it now to prevent confusion until it is removed? | 16:57 |
TheJulia | fungi: ^^^ that is why we lowered the mtu a long time ago.... I remembered :( | 16:57 |
*** hashar has quit IRC | 16:59 | |
*** eolivare has quit IRC | 17:06 | |
*** jpena is now known as jpena|brb | 17:12 | |
fungi | TheJulia: i'm sorry, we should have waited to trigger those memories until beer time | 17:25 |
fungi | clarkb: good call, added it just now | 17:25 |
ttx | fungi: ok let me know when I should be testing again :) | 17:28 |
fungi | will do, looks like there are still three deploy jobs ahead of it | 17:36 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Enable srvr, stat and dump commands in the zk cluster https://review.opendev.org/c/opendev/system-config/+/780303 | 17:36 |
fungi | the semaphore those jobs use tends to slow this down quite a bit | 17:36 |
clarkb | corvus: ^ enabling those commands | 17:37 |
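(srvr, stat and dump are ZooKeeper "four letter word" admin commands; since ZooKeeper 3.5 they must be whitelisted in zoo.cfg before the server will answer them. A sketch of what the change enables and how the commands are typically used.)

```sh
# zoo.cfg needs something like: 4lw.commands.whitelist=srvr,stat,dump
# after which they can be queried over the client port:
echo srvr | nc localhost 2181   # server role, latency, znode counts
echo stat | nc localhost 2181   # srvr info plus connected clients
echo dump | nc localhost 2181   # outstanding sessions and ephemerals
```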
*** jpena|brb is now known as jpena | 17:54 | |
fungi | ttx: kopecmartin: refstack deployment finished at 17:49:57 utc, i'll check whether the service got restarted | 17:55 |
*** ralonsoh has quit IRC | 17:55 | |
fungi | looks like the container was last upped at 02:34 utc, according to ps | 17:56 |
fungi | also /var/refstack/refstack.conf was last modified on february 10, not sure if that's old. checking the bindmounts now | 17:57 |
clarkb | fungi: I need breakfast now that the nodepool launcher debugging is done, but I can help with refstack once I've eaten something | 17:58 |
fungi | aha, yeah that's cruft, it looks at /var/lib/refstack/etc/refstack.conf now and that was modified 17:49 | 17:58 |
fungi | openstack_openid_endpoint = https://openstackid.org/accounts/openid2 | 17:59 |
fungi | ttx: kopecmartin: so the config looks correct. will the service need a restart to see the updated refstack.conf file or does it reload it autonomously? sounds like ianw did an explicit restart to pick up an earlier config change | 17:59 |
kopecmartin | fungi: a restart will be needed | 18:01 |
kopecmartin | so that the config gets copied to the container and is applied | 18:02 |
*** ykarel has quit IRC | 18:04 | |
fungi | kopecmartin: okay, doing that now | 18:05 |
fungi | we should consider adding a handler to do that on config updates if that's safe, or abstract the configuration loading into something which can be triggered by a signal (or watch for file updates directly) | 18:05 |
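(The first option could look something like the sketch below in the refstack role; the task names, file paths and compose directory are assumptions, not the actual system-config playbook.)

```yaml
# Hypothetical sketch: restart the container only when the rendered
# config actually changes.
- name: Write refstack config
  template:
    src: refstack.conf.j2
    dest: /var/lib/refstack/etc/refstack.conf
  notify: Restart refstack

# handlers:
- name: Restart refstack
  shell: docker-compose down && docker-compose up -d
  args:
    chdir: /etc/refstack-docker
```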
fungi | it's on its way back up now | 18:06 |
*** artom has quit IRC | 18:06 | |
fungi | ttx: kopecmartin: i guess go ahead and test it now | 18:06 |
kopecmartin | fungi: \o/ it works!! thank you!! | 18:07 |
*** artom has joined #opendev | 18:08 | |
fungi | kopecmartin: no thanks needed, i just pushed a few buttons... but glad it's sorted now | 18:08 |
*** dtantsur is now known as dtantsur|afk | 18:10 | |
*** hamalq has joined #opendev | 18:14 | |
fungi | #status log Restarted the containers on refstack01 to pick up configuration change from https://review.opendev.org/780272 | 18:15 |
openstackstatus | fungi: finished logging | 18:15 |
*** smcginnis has quit IRC | 18:25 | |
clarkb | corvus: still planning to do a zuul restart today? queues aren't tiny but also not huge. Node demand is very low. | 18:26 |
clarkb | also openstack release team said they could extend to monday if necessary (they seemed ok with the friday plan) | 18:27 |
*** smcginnis has joined #opendev | 18:30 | |
corvus | clarkb: extend what? | 18:31 |
clarkb | corvus: feature freeze | 18:33 |
clarkb | (I kinda got the impression a few things were going to slip even if we did nothing so they were already considering it) | 18:33 |
corvus | clarkb: yeah, i agree nodes look good now, but maybe after lunch? | 18:34 |
clarkb | corvus: wfm. Though I'll be trying to enjoy this good weather on the bike this afternoon, but will be around before and after that | 18:35 |
fungi | i'll be around | 18:35 |
*** klonn has quit IRC | 18:38 | |
clarkb | fungi: now that we've had a day to think about it, any reasons to not move forward with simply retiring those accounts with the no external id for preferred email address problem? We theorize these are the result of fallout from other sql db based account mangling as we don't expect this is doable as a normal user. Also none of the accounts have been used in a year according to the audit script | 18:43 |
*** jpena is now known as jpena|off | 18:44 | |
fungi | no, i still think it seems like it should be entirely safe to retire those | 18:49 |
clarkb | ok I'll proceed with that now then | 18:49 |
clarkb | I went over the data again a bit more today and you can see for some of the accounts they clearly transitioned from one account to another (just builds more confidence this is the right move) | 18:50 |
*** LowKey has joined #opendev | 18:52 | |
fungi | yep, i expect they're all like that, it's just harder to connect the dots for a few since it happened years before | 18:52 |
*** andrewbonney has quit IRC | 18:54 | |
clarkb | alright that is done and logs have been uploaded to review | 19:00 |
clarkb | I'm going to do a consistency check next | 19:00 |
clarkb | #status log Corrected all Gerrit preferred email lacks external id account consistency problems. | 19:14 |
openstackstatus | clarkb: finished logging | 19:14 |
clarkb | still have quite a number of external id conflicts but this is progress | 19:14 |
clarkb | the consistency results are in my homedir on review | 19:14 |
* fungi will take a look shortly | 19:16 | |
*** hashar has joined #opendev | 20:07 | |
*** klonn has joined #opendev | 20:08 | |
corvus | clarkb, fungi: i'm going to start that restart now | 20:55 |
corvus | clarkb: did your nodepool change land? should we restart nodepool too? | 20:56 |
clarkb | corvus: it did land and I worked through them yesterday already | 20:56 |
corvus | clarkb: ok, so we'll just leave nodepool alone? | 20:56 |
clarkb | ya should be fine to leave nodepool alone | 20:56 |
fungi | cool, i'm here. need help? | 20:57 |
corvus | fungi: i don't think so; i'm just going to save queues then run the zuul_restart playbook | 20:57 |
fungi | i'm around to dig in if it goes pear shaped | 20:58 |
corvus | stopping now | 21:00 |
corvus | things are starting | 21:02 |
corvus | cat jobs are catting | 21:04 |
fungi | so catty | 21:06 |
fungi | our zuul is practically jellicle | 21:07 |
corvus | re-enqueing | 21:07 |
corvus | 2021-03-12 21:08:02,726 DEBUG zuul.RPCListener: Formatting tenant openstack status took 0.005 seconds for 93502 bytes | 21:08 |
corvus | that's a new log line btw | 21:08 |
*** sboyron has quit IRC | 21:08 | |
fungi | nice! i like the (albeit miniscule) measurement there | 21:08 |
corvus | see where that is when all the changes are re-enqueued :) | 21:08 |
fungi | i suppose it gets bigger when there's queue data | 21:08 |
fungi | heh, right that | 21:09 |
*** artom has quit IRC | 21:09 | |
clarkb | and we cache that for ~1second still right? | 21:09 |
corvus | yep | 21:09 |
fungi | last i looked at the apache config | 21:09 |
corvus | we cache internally too | 21:10 |
corvus | apache protects zuul-web, and zuul-web protects zuul-scheduler | 21:10 |
fungi | oh, right, the cache duration is expressed in the headers | 21:11 |
fungi | not hard-coded in the apache vhost config | 21:11 |
corvus | we're at about .03s for 500k so far | 21:11 |
corvus | (still enqueueing) | 21:11 |
*** whoami-rajat has quit IRC | 21:13 | |
corvus | #status log restarted all of zuul at commit 13923aa7372fa3d181bbb1708263fb7d0ae1b449 | 21:19 |
openstackstatus | corvus: finished logging | 21:19 |
corvus | re-enqueue is done. | 21:19 |
corvus | 2021-03-12 21:19:30,233 DEBUG zuul.RPCListener: Formatting tenant openstack status took 0.059 seconds for 877325 bytes | 21:20 |
corvus | that's looking typical | 21:20 |
corvus | sometimes it's higher, but it's not in the main thread, so can suffer from contention | 21:20 |
corvus | 0.1 looks to be the max | 21:20 |
clarkb | still well below the cache time which is why I was curious | 21:21 |
fungi | still fairly small, good sign | 21:21 |
fungi | we had almost no node request backlog prior to the restart, and the reenqueue really only shot it up to 500 briefly | 21:23 |
fungi | it's already burning down quickly | 21:23 |
fungi | we weren't even using max quota at the time of the restart, so seems like it was good timing | 21:24 |
clarkb | ya I expected even with feature freeze that friday would be much calmer | 21:24 |
fungi | everyone's already drinking | 21:24 |
fungi | why am i not drinking yet? | 21:24 |
clarkb | I'm not drinking because it is almost time to get some exercise | 21:25 |
fungi | time to exercise my liver | 21:25 |
corvus | then drinking | 21:25 |
clarkb | corvus: it is almost as warm here as there. I'm really excited | 21:25 |
fungi | it's 22.5c here | 21:26 |
fungi | crazy given this is technically still winter for more than a week | 21:26 |
clarkb | will get to 16 here in about an hour. I'm timing my outside time around that temp peak :) | 21:26 |
fungi | breezy but sunny. we should have this temperature all the time | 21:27 |
clarkb | if I go out in half an hour then my 1-1.5 hours outside should involve max warmth | 21:27 |
fungi | i should walk to the beach, but it's almost dinner | 21:27 |
*** hashar has quit IRC | 21:37 | |
*** smcginnis has quit IRC | 21:38 | |
*** smcginnis has joined #opendev | 21:44 | |
clarkb | looks like grafana says backlog is back to basically nil | 21:49 |
fungi | yes, we're back down under quota again | 21:50 |
fungi | i think that means the weekend is here | 21:50 |
clarkb | fungi: ruamel will serialize human readable yaml right? | 21:50 |
clarkb | I think my next step on the gerrit account work is to have the audit script spit out serialized data so I can write queries against it more easily | 21:50 |
fungi | clarkb: i don't know how to interpret some of those words, but it preserves ordering and comments | 21:50 |
fungi | it also comes at the cost of a spaghetti pile of ruamel libraries as dependencies | 21:51 |
clarkb | fungi: heh maybe "more human readable than pyyaml" is more accurate | 21:51 |
clarkb | I guess I can try pyyaml first | 21:51 |
clarkb | in particular what I want to start looking at is whether there are any more accounts that have broken openids regardless of previous activity, and I realized for that I should just try to serialize as much info as possible then write separate queries against it | 21:51 |
clarkb | also do you think we can land the tooling as proposed? | 21:52 |
fungi | workaround is to actually make comments in yaml (like have a "description" field, et cetera) | 21:52 |
clarkb | its been used a fair bit now and would make it easier for me when I switch between system-config branches to not have to always checkout that one branch to have the tools present | 21:52 |
fungi | er, yeah i'm not entirely understanding the "human readable" bit then | 21:53 |
fungi | if it's not about comments, then... | 21:53 |
*** smcginnis has quit IRC | 21:53 | |
fungi | you can make pyyaml emit more human-friendly yaml formats, you just need to configure it | 21:53 |
clarkb | fungi: maybe the pain has been in configuring it then | 21:54 |
fungi | https://mudpy.org/gitweb?p=mudpy.git;a=blob;f=mudpy/data.py;h=b73959a1b63d857657dbdd4f5afce32c3746e593;hb=HEAD#l161 | 21:55 |
fungi | i've overridden the dumper there specifically to force it to indent lists, but you can probably ignore that | 21:55 |
fungi | the end result though is to make pyyaml write files that yamllint can stomach | 21:56 |
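(The relevant trick, as a minimal sketch: subclass the dumper so block sequences are indented under their parent key, and disable flow style; that is usually enough to make pyyaml output that yamllint accepts.)

```python
import yaml


class IndentedDumper(yaml.SafeDumper):
    """Indent block sequences under their parent key, the way yamllint
    expects, instead of pyyaml's flush-left default."""

    def increase_indent(self, flow=False, indentless=False):
        return super().increase_indent(flow, False)


# Illustrative data only, e.g. serialized gerrit account audit results.
accounts = {"accounts": [{"id": 1234, "emails": ["user@example.org"]}]}
print(yaml.dump(accounts, Dumper=IndentedDumper,
                default_flow_style=False, sort_keys=False))
```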
fungi | i fond it mildly incoherent that yamllint (written in python) objects to the default output of the most commonly-used python yaml implementation, but i've come to terms with that | 21:58 |
fungi | s/fond/find/ | 21:58 |
clarkb | is pyyaml optimized for wire transfers by default? I seem to recall there may be reasons like that | 21:58 |
fungi | yeah, could be | 21:59 |
fungi | anyway, feel free to steal that, it's all isc licensed. maybe you want the indented lists too, the _IBSEmitter class is not that complicated to add | 22:00 |
clarkb | thanks | 22:00 |
*** klonn has quit IRC | 22:05 | |
fungi | i keep meaning to push a pr to pyyaml to make that configurable, but... enotime | 22:32 |
*** gothicserpent has quit IRC | 23:14 | |
*** gothicserpent has joined #opendev | 23:20 |