*** rlandy|ruck is now known as rlandy|out | 00:57 | |
corvus | we are at > T+2h since the rolling restart, and everything seems nominal | 01:10 |
fungi | yeah, things are looking clean to me | 01:18 |
wxy-xiyuan | hi ianw, the openEuler label is there https://zuul.opendev.org/t/openstack/labels but there are no ready nodes. I assume there is some build/launch error in nodepool? Could you please take a look, or how can I debug it? Thanks. | 02:00 |
ianw | wxy-xiyuan: ahh, sorry i meant to check back on that : you can see the build @ https://nb03.opendev.org/openEuler-20.03-LTS-SP2-arm64-0000000023.log | 02:01 |
ianw | it's failing in our project-config elements | 02:01 |
ianw | i have to admit that totally slipped my mind | 02:02 |
wxy-xiyuan | Nice, this log is what I need. Checking. Big thanks | 02:02 |
ianw | wxy-xiyuan: the elements in https://opendev.org/openstack/project-config/src/branch/master/nodepool/elements will need updating | 02:05 |
wxy-xiyuan | xinliang https://nb03.opendev.org/openEuler-20.03-LTS-SP2-arm64-0000000023.log | 02:05 |
wxy-xiyuan | dib-run-parts Running /tmp/in_target.d/install.d/20-iptables | 02:05 |
wxy-xiyuan | echo 'Unsupported operating system openeuler' | 02:06 |
wxy-xiyuan | ianw ++ | 02:06 |
wxy-xiyuan | xinliang: https://opendev.org/openstack/project-config/src/branch/master/nodepool/elements/nodepool-base/install.d/20-iptables as ianw said, maybe not only iptables, but also other base elements need updating | 02:07 |
xinliang | wxy-xiyuan: thanks, looking at it | 02:07 |
xinliang | wxy-xiyuan: these elements haven't been tested before | 02:09 |
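For context, the failure above is in project-config's nodepool-base element, whose install.d/20-iptables script switches on $DISTRO_NAME and falls through to the "Unsupported operating system" error for openeuler. A rough sketch of the kind of branch a change like 821794 would need to add, assuming openEuler is dnf-based like recent Fedora/CentOS (package names and the exact structure are assumptions, not the actual patch):

```bash
#!/bin/bash
# Hypothetical excerpt of nodepool-base/install.d/20-iptables showing how an
# openeuler branch could be added; package and command names are assumptions.
set -o errexit -o nounset -o pipefail

case "$DISTRO_NAME" in
    ubuntu|debian)
        apt-get -y install iptables
        ;;
    fedora|centos|openeuler)
        # Assumes openEuler ships iptables-services via dnf like recent
        # Fedora/CentOS; verify the package name for the target release.
        dnf -y install iptables-services
        ;;
    *)
        echo "Unsupported operating system $DISTRO_NAME"
        exit 1
        ;;
esac
```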
fungi | system-config-run-mirror-update seems like it may have started consistently timing out in the run phase | 02:31 |
opendevreview | wangxiyuan proposed openstack/project-config master: Add openEuler disto support for elements https://review.opendev.org/c/openstack/project-config/+/821794 | 02:33 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: [dnm] add vm element to 9-stream image test to test bootloader https://review.opendev.org/c/openstack/diskimage-builder/+/821772 | 02:50 |
ianw | fungi: hrm -- i think that does mount afs as part of it. that's the only part that i think would cause problems | 02:53 |
ianw | 2021-12-15 02:02:59.852337 | bridge.openstack.org | 2021-12-15 01:50:38,158 INFO ansible: changed: [mirror-update01.opendev.org] => { | 03:02 |
ianw | 2021-12-15 02:02:59.867179 | bridge.openstack.org | 2021-12-15 01:58:35,723 INFO ansible: changed: [mirror-update01.opendev.org] => { | 03:03 |
ianw | it took about 8 minutes to build openafs | 03:03 |
ianw | that ... seems about normal, i guess | 03:03 |
ianw | https://03eb8bb46d4e2a6a232a-dc3e65ccae23bb6c49297bc4ac109b91.ssl.cf5.rackcdn.com/820899/6/check/system-config-run-mirror-update/3e8178e/job-output.txt | 03:03 |
ianw | 2021-12-15 02:03:00.611982 | bridge.openstack.org | 2021-12-15 02:02:43,597 INFO ansible: mirror-update01.opendev.org : ok=233 changed=121 unreachable=0 failed=0 skipped=28 rescued=0 ignored=12 | 03:05 |
ianw | the whole thing finished 4-ish minutes after that, so nothing blowing out there | 03:05 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: [dnm] add vm element to 9-stream image test to test bootloader https://review.opendev.org/c/openstack/diskimage-builder/+/821772 | 03:16 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: [dnm] add vm element to 9-stream image test to test bootloader https://review.opendev.org/c/openstack/diskimage-builder/+/821772 | 03:35 |
opendevreview | chzhang8 proposed openstack/project-config master: register and bring back tricircle under x namespaces https://review.opendev.org/c/openstack/project-config/+/800442 | 03:42 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: [dnm] add vm element to 9-stream image test to test bootloader https://review.opendev.org/c/openstack/diskimage-builder/+/821772 | 03:51 |
fungi | second recheck seems to have made it back into the gate too, so i guess it's not 100% failing | 04:28 |
*** ysandeep|out is now known as ysandeep | 04:50 | |
*** pojadhav- is now known as pojadhav | 05:12 | |
ykarel | rax-iad and rax-dfw still affected with pypi issues | 05:13 |
ykarel | ianw, you around? | 05:13 |
ykarel | or anyone else from infra who can help in clearing this | 05:14 |
ykarel | one option i see is to try a revert of https://review.opendev.org/c/openstack/project-config/+/760495 | 05:16 |
ykarel | as the public mirrors of these look good, it's the internal ones that are impacted | 05:16 |
ykarel | or we can try -XPURGE against those internal mirrors to see if that resolves it, but that needs to be done from somewhere they are reachable | 05:17 |
ianw | ykarel: I am for just a bit | 06:02 |
ykarel | ianw, ack np i am trying to clear with hack https://review.opendev.org/c/openstack/neutron/+/821798 | 06:03 |
ykarel | clear rax-iad | 06:03 |
ykarel | now waiting for node on rax-dfw | 06:03 |
ianw | i'm not sure what that internal mirror would have to do with it? that is just using the rax local network to access the mirror, but it's the same node | 06:04 |
ianw | i.e. mirror-int.iad.rax.opendev.org == mirror.iad.rax.opendev.org, just one is the internal interface | 06:05 |
ianw | as mentioned with pypi, we are only a proxy... | 06:05 |
frickler | also, although the nodes are called "mirror", they are actually just caching proxies | 06:06 |
ykarel | ianw, but i can see failures with mirror-int.iad.rax.opendev.org, while with mirror.iad.rax.opendev.org the installation goes fine | 06:06 |
ianw | hrm, that is *probably* a red-herring issue -- they both go to exactly the same apache process | 06:06 |
frickler | ykarel: do you have links to the failures? | 06:06 |
ianw | (i mean, i will never say never, but I'd highly doubt that is actually the issue) | 06:07 |
ykarel | frickler, https://86804121d4d0f7ba6424-61662cfb64be48a1e2663c2773bf553c.ssl.cf2.rackcdn.com/821414/2/check/neutron-tempest-plugin-api/c4e6f3a/job-output.txt | 06:08 |
ykarel | there were many failures http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22The%20user%20requested%20(constraint)%5C%22 | 06:09 |
ykarel | yesterday other providers were also impacted but fungi cleared those by running -XPURGE against those | 06:10 |
ianw | 2021-12-15 03:53:43.468529 | controller | ERROR: Cannot install neutron==19.1.0.dev278 because these package versions have conflicting dependencies. | 06:10 |
ykarel | but today only seeing failures in rax-iad and rax-dfw | 06:10 |
ykarel | and the common factor among those was that they use mirror-int | 06:10 |
ianw | 2021-12-15 03:53:43.468804 | controller | neutron 19.1.0.dev278 depends on pecan>=1.3.2 | 06:10 |
ykarel | so likely yesterday those were not cleared | 06:11 |
ianw | 2021-12-15 03:53:43.468828 | controller | The user requested (constraint) pecan===1.4.1 | 06:11 |
ykarel | ianw, yes rax-iad affected with ^ and rax-dfw with pyjwt | 06:11 |
frickler | likely yet another CDN hiccup for pypi. give me a bit to set up some local testing | 06:11 |
ykarel | frickler, please try with rax-dfw | 06:12 |
ykarel | rax-iad seems to be fixed after i ran -XPURGE | 06:12 |
ianw | so the real problem is that pip gets an error fetching pecan, but reports it as that conflict error? | 06:13 |
ykarel | ianw, yes right | 06:13 |
ykarel | 1.4.1 not available | 06:13 |
ykarel | older version can be installed | 06:13 |
frickler | the error that pip usually doesn't log is that it didn't find any version at all in the index | 06:17 |
ykarel | frickler, i found a running job on rax-dfw, is it possible to log into it ? | 06:22 |
ykarel | ip 104.130.132.107 | 06:22 |
frickler | ykarel: both iad and dfw seem to be working for me. either my testing is wrong or the XPURGE is indeed global as I assumed | 06:22 |
ykarel | frickler, you used mirror-int? | 06:23 |
frickler | ykarel: no, I agree with ianw that there is no plausible explanation for it to behave differently | 06:24 |
ykarel | frickler, may be can check on 104.130.132.107 node to confirm? | 06:24 |
ykarel | in a venv with latest pip just need to run MIRROR=mirror-int.dfw.rax.opendev.org | 06:24 |
ykarel | pip install --index-url="https://${MIRROR}/pypi/simple" --extra-index-url="https://$MIRROR/wheel/ubuntu-20.04-x86_64" --trusted-host=$MIRROR -c https://raw.githubusercontent.com/openstack/requirements/master/upper-constraints.txt pyjwt | 06:25 |
ykarel | if ^ fails try with public one | 06:25 |
frickler | working just fine for me | 06:27 |
ykarel | frickler, ohkk then likely it got fixed, you ran on 104.130.132.107 only, right? | 06:28 |
frickler | yep | 06:28 |
ykarel | ack Thanks for checking, will keep an eye if it happens again | 06:29 |
frickler | oh, wait, I tested without u-c. it only finds pyjwt==2.2.0, 2.3.0 is missing | 06:31 |
ykarel | ahh then it's affected, can try the same command as above | 06:31 |
ykarel | first with mirror-int and then with public one | 06:31 |
frickler | ykarel: o.k., indeed it works without the -int, this is very weird, need to look into what is happening on the proxy | 06:38 |
ykarel | frickler, okk good | 06:39 |
ykarel | for now maybe we can just run PURGE against mirror-int.dfw.rax.opendev.org to clear CI | 06:39 |
frickler | ykarel: it doesn't work like that, the purge is against pypi, not the proxy. | 06:42 |
ykarel | frickler, but if you fire the request at the proxy, doesn't it go to pypi? | 06:42 |
frickler | but maybe the proxy has different local caching for -int and external URLs | 06:42 |
ykarel | i ran it for rax-iad and it seemed to work | 06:42 |
ykarel | i ran purge there and after some time module installed fine | 06:43 |
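For reference, what a manual purge looks like: PyPI's Fastly CDN honours PURGE requests on its URLs, so clearing one stale package index is roughly the following (pyjwt as the example). This clears the CDN copy only; the Apache cache on our mirrors is separate, which is where htcacheclean comes in below.

```bash
# Ask the Fastly CDN in front of pypi.org to drop its cached copy of one
# stale package index page (here: pyjwt). Running it against the mirror's
# /pypi/simple/ URL instead appears to relay to the same origin.
curl -X PURGE https://pypi.org/simple/pyjwt/
```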
frickler | o.k., indeed the proxy caches by URL, so mirror and mirror-int are different | 06:53 |
frickler | running "htcacheclean -A -v -p /var/cache/apache2/proxy/ 'https://mirror-int.dfw.rax.opendev.org:443/pypi/simple/pyjwt/?'" has resolved the issue for that pkg | 06:53 |
ykarel | frickler, Thanks | 06:54 |
frickler | infra-root: for reference, this is what I did in detail, still need to look into decoding the timestamps https://paste.opendev.org/show/811679/ | 06:57 |
frickler | ykarel: thanks for being so persistent, I'll see if we can better tune the proxy cache | 06:58 |
ykarel | frickler, Thanks, btw you are in what timezone? | 07:02 |
frickler | ykarel: nominally UTC+1 currently, but I don't always stick to that ;) | 07:04 |
ykarel | yes seems so as it's too early for you now :) | 07:05 |
elodilles | fungi corvus : ack, thanks! | 07:09 |
dulek | Hey folks! I see another set of dependency issues in the OpenStack jobs, this time Keystone installation fails on pyjwt. | 07:16 |
dulek | Does it make sense to recheck these jobs now? | 07:16 |
frickler | dulek: if the errors were happening on rax-dfw or rax-int, yes. otherwise please link to a failure | 07:28 |
frickler | infra-root: it seems that sometimes we have "stuck" cache entries with an expiry of 24h instead of the expected 5m, see the first timestamps in my above paste for the mirror-int entry | 07:29 |
dulek | Here's one on rax-iad: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_0a9/821442/7/check/openstack-tox-pep8/0a92d7c/job-output.txt | 07:35 |
dulek | And that's it, rest failed on rax-dfw. | 07:37 |
*** ysandeep is now known as ysandeep|brb | 08:12 | |
*** ysandeep|brb is now known as ysandeep | 08:22 | |
*** sshnaidm is now known as sshnaidm|afk | 09:33 | |
*** ysandeep is now known as ysandeep|afk | 09:57 | |
*** redrobot6 is now known as redrobot | 10:34 | |
*** ysandeep|afk is now known as ysandeep | 11:08 | |
*** rlandy is now known as rlandy|ruck | 11:13 | |
*** sshnaidm|afk is now known as sshnaidm | 11:26 | |
anbanerj|ruck | Hi, | 11:39 |
anbanerj|ruck | We have a gate blocker. Patches 816991,16 and 821778, which fix bugs, need to go first to unblock the rest. Can someone please get these two patches to the top of the queue? | 11:39 |
anbanerj|ruck | https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/821699/ | 11:39 |
anbanerj|ruck | https://review.opendev.org/c/openstack/tripleo-ci/+/821778/ | 11:39 |
anbanerj|ruck | fungi, clarkb ^ when you get some time | 11:40 |
anbanerj|ruck | thanks | 11:40 |
*** pojadhav is now known as pojadhav|afk | 11:46 | |
anbanerj|ruck | Also 821538 (https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/821538) pls. thanks | 11:52 |
*** tkajinam is now known as Guest8517 | 11:58 | |
anbanerj|ruck | Hey clarkb, fungi: Sorry, pls ignore the previous patches. The correct patches that have to go first to unblock the gate are 821538, 821778, 821699, in that order. Could you pls put these at the top of the tripleo gate queue? Thanks! | 12:07 |
*** pojadhav|afk is now known as pojadhav | 12:41 | |
*** kopecmartin_ is now known as kopecmartin | 13:41 | |
*** lbragstad8 is now known as lbragstad | 14:09 | |
fungi | frickler: i suspect some fastly cdn endpoints occasionally serving pypi's fallback mirror could explain that, if their fallback sets different cache parameters on responses | 14:12 |
frickler | fungi: indeed, I made some further tests and the cache timeout (default seems to be mostly 10m) is being sent from pypi and not specified on our side | 14:14 |
frickler | fungi: meanwhile the broken index responses also seem to be sent with a broken or very long timeout. we should consider setting a maximum timeout of maybe 1h or less to reduce the impact of those | 14:15 |
frickler | https://httpd.apache.org/docs/2.4/mod/mod_cache.html#cachemaxexpire would be the option to set | 14:17 |
fungi | so basically when a fastly endpoint decides it can't reach the real pypi backend and serves something from pypi's backup mirror instead, it's also including a much longer cache timeout which results in us serving that stale data from our proxies for even longer | 14:17 |
fungi | anbanerj|ruck: i've put 821538,2 821778,1 821699,2 (in that order) as the first three items in the tripleo shared gate queue now | 14:25 |
anbanerj|ruck | fungi, thank you! | 14:25 |
fungi | no problem | 14:26 |
*** pojadhav is now known as pojadhav|afk | 14:32 | |
frickler | fungi: that's just my current assumption based on what I saw and the data from the cache I posted. might need further watching. | 14:43 |
frickler | fungi: otoh enforcing a limit of 1h when the default we see is 10m might be agreeable already | 14:43 |
frickler | fungi: it's also not clear whether that timeout is set by fastly or comes from the backends | 14:44 |
fungi | yeah, i'd be up for making it 10min even | 14:44 |
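If we go that route, it would be a small addition to the Apache cache configuration on the mirror proxy vhosts; a minimal sketch with a 10 minute cap (illustrative only, not our actual vhost layout):

```apache
# Cap how long any cached response may be reused, even if the upstream
# (pypi/fastly) supplied a much longer expiry. 600 seconds = 10 minutes.
CacheMaxExpire 600
# Optionally also pin the default used when upstream sends no expiry info.
CacheDefaultExpire 600
```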
frickler | another thing I noticed: we run a htcacheclean daemon for /var/cache/apache2/mod_cache_disk but that dir is empty, the proxy cache is in /var/cache/apache2/proxy , which we don't seem to actively clean at all | 14:49 |
*** ysandeep is now known as ysandeep|dinner | 14:55 | |
*** simondodsley_ is now known as simondodsley | 14:58 | |
*** mnaser_ is now known as mnaser | 14:58 | |
*** ildikov_ is now known as ildikov | 15:00 | |
*** johnsom_ is now known as johnsom | 15:00 | |
*** clayg_ is now known as clayg | 15:00 | |
*** bbezak_ is now known as bbezak | 15:00 | |
*** erbarr_ is now known as erbarr | 15:01 | |
*** parallax_ is now known as parallax` | 15:01 | |
*** walshh__ is now known as walshh_ | 15:01 | |
*** davidlenwell_ is now known as davidlenwell | 15:01 | |
*** JpMaxMan_ is now known as JpMaxMan | 15:02 | |
*** parallax` is now known as parallax | 15:02 | |
*** parallax is now known as Guest8535 | 15:03 | |
Clark[m] | One of those cache paths and its htcacheclean is the default you get with Ubuntu packaging. The other is our own path, necessary due to the cinder volumes in use on some hosts. Both should get cache cleaning via cron jobs. | 15:09 |
Clark[m] | Note a 10m expiry won't be very effective due to how often htcacheclean runs for keeping disk use down. But I think apache will refresh data it sees as stale, it just won't delete it as quickly as we might expect | 15:10 |
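For reference, the cron-style (non-daemon) cleaning Clark describes is just a periodic htcacheclean pass over the path we actually use; a hypothetical /etc/cron.d entry (the size limit and schedule here are made up for illustration):

```
# Hypothetical cron.d entry: prune the proxy cache hourly, keeping it under
# a size limit; -n runs nicely, -t removes empty directories afterwards.
17 * * * * root htcacheclean -n -t -p /var/cache/apache2/proxy -l 20480M
```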
*** clarkb is now known as Guest8536 | 15:11 | |
frickler | Clark[m]: the 10m would be mostly to reduce the impact of caching broken indices, not to reduce disk usage | 15:13 |
frickler | might be interesting to check whether we could actually make that specific only for indices and have longer timeouts for wheels/tars | 15:14 |
fungi | yeah, we want to turn over indices fairly quickly, but the packages don't ever change so can be cached for as long as we have space | 15:22 |
fungi | however, the packages are technically proxied from a different site name entirely, so we can probably leverage that difference? | 15:22 |
*** ysandeep|dinner is now known as ysandeep | 15:25 | |
*** parallax_ is now known as parallax | 15:30 | |
frickler | Context: server config, virtual host, directory, .htaccess | 15:31 |
frickler | doesn't seem to work by location :( | 15:31 |
frickler | the last thing to note, which both ianw and i were wrong about: the cache works per target URL, not per source, so for rax the cache for the -int hostnames is indeed distinct from the one seen via the public hostname | 15:33 |
*** dviroel|rover is now known as dviroel|rover|lunch | 15:47 | |
elodilles | fungi corvus : i'm about to run the branch delete script now. i'll let you know when i reach the part where multiple branches will be deleted in a short time | 15:49 |
fungi | elodilles: sounds great, i'm around and can keep an eye on things as well | 15:49 |
elodilles | ack, let's see | 15:50 |
*** rlandy|ruck is now known as rlandy|ruck|brb | 15:50 | |
*** rlandy|ruck|brb is now known as rlandy|ruck | 16:12 | |
*** Guest8536 is now known as clarkb | 16:18 | |
opendevreview | Merged opendev/system-config master: Copy Exim logs in system-config-run jobs https://review.opendev.org/c/opendev/system-config/+/820899 | 16:32 |
*** dviroel|rover|lunch is now known as dviroel|rover | 16:32 | |
*** marios is now known as marios|out | 16:33 | |
fungi | that took far more rechecks than i would have expected | 16:37 |
*** ysandeep is now known as ysandeep|out | 16:54 | |
clarkb | it just occurred to me that we should update the limnoria bot when meetings aren't happening | 17:00 |
clarkb | for that reason my next bullseye update will be the matrix eavesdrop bot instead of limnoria | 17:00 |
clarkb | corvus: ^ fyi I'm approving that update. I don't expect trouble since that bot doesn't rely on debian user space for much | 17:00 |
clarkb | for limnoria if we land that today we'll want to do it after the swift meeting at 2100 UTC | 17:02 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add firewall behavior assertions to test_bridge https://review.opendev.org/c/opendev/system-config/+/821780 | 17:07 |
clarkb | I think ^ might actually end up doing what we want, but I'm doing another forced failure to be sure | 17:11 |
*** sshnaidm is now known as sshnaidm|afk | 17:14 | |
fungi | looks like the other test-related change in topic:mailman-lists will merge shortly, and then i'll start approving the ones which might (though highly unlikely) have production impact | 17:19 |
clarkb | cool. I'll be around if I can help | 17:20 |
corvus | i'm not quite sure why i'm not seeing a login button on zuul... i'll try to look into that today | 17:22 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Restart mailman services when testing https://review.opendev.org/c/opendev/system-config/+/821144 | 17:23 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Use newlist's automate option https://review.opendev.org/c/opendev/system-config/+/820397 | 17:23 |
elodilles | fungi: 10 branches have been deleted in the last ~3 minutes | 17:28 |
elodilles | currently i see 10 management events | 17:29 |
clarkb | now to see how long it takes them to exit the queue | 17:29 |
fungi | yeah, in theory it should only spend time on the first and last ones, right? | 17:30 |
corvus | seeing 10 events in the queue is expected; they should all be processed together (or at least 1 and then 9 more together) | 17:30 |
corvus | 2021-12-15 17:28:42,098 INFO zuul.Scheduler: Tenant reconfiguration beginning for openstack due to projects {('opendev.org/openstack/openstack-ansible-os_barbican', 'stable/ocata'), ('opendev.org/openstack/openstack-ansible-os_ceilometer', 'stable/ocata'), ('opendev.org/openstack/openstack-ansible-openstack_openrc', 'stable/ocata'), ('opendev.org/openstack/openstack-ansible-os_gnocchi', 'stable/ocata'), | 17:31 |
corvus | ('opendev.org/openstack/openstack-ansible-os_horizon', 'stable/ocata'), ('opendev.org/openstack/openstack-ansible-os_designate', 'stable/ocata'), ('opendev.org/openstack/openstack-ansible-os_aodh', 'stable/ocata'), ('opendev.org/openstack/openstack-ansible-os_cinder', 'stable/ocata'), ('opendev.org/openstack/openstack-ansible-os_heat', 'stable/ocata'), ('opendev.org/openstack/openstack-ansible-os_glance', 'stable/ocata')} | 17:31 |
corvus | that is very promising though :) | 17:31 |
elodilles | :] | 17:31 |
fungi | so either it will drop from 10 to 9 to 0, or from 10 to 1 to 0 | 17:31 |
corvus | or maybe 10 to 0 if we're lucky | 17:31 |
fungi | presumably the latter | 17:31 |
fungi | ahh, okay | 17:32 |
elodilles | 0 \o/ | 17:32 |
fungi | much better! | 17:33 |
fungi | thanks again corvus!!! | 17:33 |
elodilles | yepp, thanks for the fix! | 17:33 |
corvus | no prob, thanks for finding the bug :) | 17:33 |
elodilles | i'll queue up some more branches to delete now | 17:33 |
fungi | also i'm enjoying the cute icons for the pipeline manager types now | 17:33 |
corvus | that's mhu's handiwork | 17:34 |
fungi | it's lovely | 17:34 |
elodilles | another 16 branches deleted, and now: Queue lengths: 4 events, 14 management events. | 17:40 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Restart mailman services when testing https://review.opendev.org/c/opendev/system-config/+/821144 | 17:43 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Use newlist's automate option https://review.opendev.org/c/opendev/system-config/+/820397 | 17:43 |
opendevreview | Merged opendev/system-config master: Update matrix-eavesdrop image to bullseye https://review.opendev.org/c/opendev/system-config/+/821332 | 17:46 |
opendevreview | Merged opendev/system-config master: Collect mailman logs in deployment testing https://review.opendev.org/c/opendev/system-config/+/821112 | 17:46 |
elodilles | Queue lengths: 6 events, 7 management events. | 17:46 |
elodilles | (there were 2 management events for a while, but now it's 0 \o/) | 17:59 |
elodilles | (and that was all, every EOL'd branch has been deleted now, i think) | 18:00 |
fungi | that's awesome, thanks for working through that with us elodilles! | 18:03 |
elodilles | :] | 18:05 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add firewall behavior assertions to testinfra testing https://review.opendev.org/c/opendev/system-config/+/821780 | 18:06 |
clarkb | fungi: ^ I think that is mergeable now and addresses the confusion around editing those extra files with extra commas | 18:07 |
clarkb | I've approved the zp01 dns record update change | 18:17 |
fungi | thanks, i agree putting the new file in the list of those triggering the jobs we're interested in is a better way to go about it | 18:17 |
opendevreview | Merged opendev/zone-opendev.org master: Try to make zuul-preview records more clear https://review.opendev.org/c/opendev/zone-opendev.org/+/821743 | 18:24 |
opendevreview | Merged opendev/system-config master: Make sure /usr/bin/python is present for mailman https://review.opendev.org/c/opendev/system-config/+/821095 | 18:27 |
opendevreview | Merged opendev/system-config master: Add "mailman" meta-list to lists.katacontainers.io https://review.opendev.org/c/opendev/system-config/+/821775 | 18:31 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add firewall behavior assertions to testinfra testing https://review.opendev.org/c/opendev/system-config/+/821780 | 18:31 |
clarkb | fungi: ^ the point you made about the inventory changing is a good one. That attempts to avoid problems | 18:31 |
fungi | clarkb: inspection of the logs indicates 821144 is working now as written. if you're okay with it let's get it and the one after it merged and then we're caught up to working for the current mailman deployment and i can start on migrating foundation mailing lists to the new site | 18:32 |
clarkb | fungi: +2'd if you want to approve | 18:33 |
fungi | thanks! will do | 18:34 |
opendevreview | Merged opendev/system-config master: Restart mailman services when testing https://review.opendev.org/c/opendev/system-config/+/821144 | 19:05 |
opendevreview | Merged opendev/system-config master: Use newlist's automate option https://review.opendev.org/c/opendev/system-config/+/820397 | 19:09 |
clarkb | The meeting about openstack health is happening tomorrow at 16:30 utc. I'll try to attend that | 19:10 |
clarkb | Should be able to as that is just late enough to not interfere with my morning tasks | 19:10 |
fungi | qa team meeting? is it irc? | 19:12 |
gmann | http://lists.openstack.org/pipermail/openstack-discuss/2021-December/026250.html | 19:13 |
gmann | fungi: clarkb ^^ | 19:13 |
clarkb | it is on google meet | 19:13 |
fungi | ahh, okay | 19:13 |
gmann | we discussed it in the QA IRC meeting and set up the scheduled time with the tripleo team on google meet | 19:13 |
fungi | it overlaps with the cd foundation interop sig call i usually join | 19:14 |
fungi | but if clarkb's on then i'm redundant anyway | 19:14 |
opendevreview | Ghanshyam Mann proposed openstack/project-config master: Add openstack-venus irc channel in access an gerrit bot https://review.opendev.org/c/openstack/project-config/+/821875 | 20:08 |
ianw | frickler: thanks for looking in on the cache stuff, sorry i had to disappear. the -int thing is interesting -- told you i never say never :) | 20:35 |
ianw | is the end result that we should be talking to pypi about their backend having different timeouts to fastly? | 20:37 |
ianw | maybe it is something they do, so that if the cdn is down, they don't get hit so hard on the backend server? | 20:37 |
ianw | but if the backend is serving bad data (again) ... | 20:37 |
Clark[m] | We brought the general issue up to them when it was last hitting us frequently. The plan at the time was to try and make the backup far more up to date, aiui. But the data here seems to indicate that hasn't happened yet | 20:39 |
Clark[m] | Constraints is the main reason it affects openstack. Most other installs will just use an old package version. But that represents security risks. Bringing up that angle might be productive | 20:40 |
ianw | ok, the last time i remember we found the backend was actually out of disk and was seriously out of date. but that was a while ago | 20:43 |
ianw | https://github.com/pypa/warehouse/issues/8568 16-sep-2020 to be exact :) | 20:44 |
Clark[m] | Ya I think they realized they had to address it in general, but it seems not to have happened given the occurrence rate for us. But really openstack using constraints is what exposes this, so they may not be aware it has gotten bad again | 20:44 |
Clark[m] | And maybe arguing this exposes their users to pulling old insecure packages is a good angle for why this isn't just an "openstack is weird" issue | 20:45 |
clarkb | I think what I'm getting at is it would be better for them to fail than to fallback | 20:47 |
fungi | well, the last time we brought it up with them the biggest issue was that the fallback lacked python_requires metadata in the indices, and the plan was to get that added (which i think they did?) | 20:49 |
clarkb | fungi: right, there were a number of issues. That was one of them; the other was that it hadn't updated in months | 20:49 |
fungi | or maybe that was the time before last | 20:49 |
clarkb | In this case it seems the backup doesn't update for weeks at least based on some of the errors we have observed | 20:49 |
clarkb | but regardless it seems that returning a 400/500 error to the client when the cdn can't find the data is a better response, due to the security issues this potentially presents | 20:50 |
fungi | the debate raging over pep 665 "lockfiles" might present an opportunity to point out that this will become increasingly painful for users | 20:50 |
clarkb | OpenStack has essentially opted into those errors via constraints, but I'm saying everyone should see them instead | 20:50 |
clarkb | If pypi cannot serve a correct and up to date index that may include important security updates then they should fail and pass that to the user | 20:52 |
clarkb | that is my tldr | 20:52 |
clarkb | Unrelated: the swift meeting is about to start at 21:00 UTC. I need to pick up kids from school at about 22:15 UTC and won't be back until close to 23:00 UTC. If I approve the limnoria bullseye update at ~22:00 UTC and can't debug until ~23:00 UTC is that a problem for anyone? | 20:53 |
fungi | not a problem for me. but also i plan to be around and am happy to help debug | 20:53 |
clarkb | ok I'll plan to approve it at 22:00 UTC or when I notice the swift meeting is over | 20:54 |
fungi | looks like in order to redirect foo@lists.bar to foo@lists.baz we may need to add a new kind of aliases router which can match on whole addresses with the domain part rather than just the local part | 21:00 |
fungi | luckily i just realized i already have one written for my personal mailserver | 21:09 |
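For the curious, the shape of such a router: a redirect router limited to the old list domain that looks up the whole address (local part plus domain) in an alias file. A sketch under assumed file paths and example domains, not the eventual system-config change:

```
# Hypothetical exim router matching whole addresses so foo@lists.old.example
# can be redirected to foo@lists.new.example.
whole_address_aliases:
  driver = redirect
  domains = lists.old.example
  data = ${lookup{$local_part@$domain}lsearch{/etc/exim4/whole_address_aliases}}
  # The alias file then contains lines keyed on the full address, e.g.:
  #   foo@lists.old.example: foo@lists.new.example
```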
*** dviroel|rover is now known as dviroel|out | 21:11 | |
clarkb | ianw: https://review.opendev.org/c/opendev/system-config/+/821780 is what I ended up with for testing firewall rules externally. I don't think it is perfect (there is a todo in there) but it seems to function and do the checking we want to have | 21:18 |
opendevreview | Ghanshyam Mann proposed openstack/project-config master: Add openstack-venus irc channel in access an gerrit bot https://review.opendev.org/c/openstack/project-config/+/821875 | 21:35 |
opendevreview | Ghanshyam Mann proposed opendev/system-config master: Add openstack-venus channel in statusbot https://review.opendev.org/c/opendev/system-config/+/821882 | 21:35 |
ianw | clarkb: I think you can just match testinfra_hosts on the zk hosts, and then anything you run is running on bridge | 21:42 |
ianw | similar to how the screenshots work; selenium is running on bridge -- we just use things in "host.X()" context to run on the remote host? | 21:42 |
ianw | does that make sense? | 21:42 |
clarkb | hrm, isn't the host that gets passed in one of the remote testinfra_hosts entries, so it would be zk instead? | 21:43 |
clarkb | I guess I could implement my own checker for tcp connectivity is what you are saying and not use the host argument? | 21:44 |
clarkb | then the actual test case is running from bridge so it would always be external connectivity. Just need to implement our own checks | 21:45 |
clarkb | the swift meeting has ended. I'm approving the limnoria bullseye change now | 21:45 |
clarkb | ianw: ya so I think what would work is to set testinfra_hosts to zk or whatever and move the test case into test_zookeeper.py. Then ignore the host var that is passed to the testcase except for getting its IP address. Then implement our own checker? | 21:49 |
timburke | thanks for waiting for us :-) | 21:51 |
opendevreview | Ghanshyam Mann proposed openstack/project-config master: Mark openstack-placement IRC channel as retired https://review.opendev.org/c/openstack/project-config/+/821889 | 21:53 |
ianw | clarkb: yep -- basically if you have a test_zookeeper.py as "usual", if you run, say, requests.* in there it's running on bridge. it's only if you use something like "host.cmd()" that it's actually running on the remote server | 22:03 |
clarkb | right. I'll take a look at doing that refactor later today | 22:04 |
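A rough sketch of that shape (host names, the port, and the expectation that the port is filtered from bridge are all assumptions here; 821780 is the authoritative version):

```python
# Sketch of a check that runs on bridge while testinfra_hosts selects the
# zookeeper node: the test body executes locally, so a plain socket
# connection exercises the firewall from outside the target host.
import socket

import pytest

testinfra_hosts = ['zk01.opendev.org']  # assumed inventory name


def test_zk_client_port_filtered_from_bridge(host):
    # Use the remote host object only to learn which address to probe; the
    # connection attempt below originates from the node running the tests.
    addr = socket.gethostbyname(host.backend.get_hostname())
    # Assumes the intended policy is that 2181 is not reachable from bridge.
    with pytest.raises(OSError):
        socket.create_connection((addr, 2181), timeout=5)
```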
opendevreview | Ghanshyam Mann proposed opendev/system-config master: Fix command for setting the entry message for IRC channel https://review.opendev.org/c/opendev/system-config/+/821913 | 22:07 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Add a domain aliases mechanism to lists.o.o https://review.opendev.org/c/opendev/system-config/+/821914 | 22:10 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Create an OpenInfra Foundation staff ML https://review.opendev.org/c/opendev/system-config/+/821915 | 22:10 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Forward messages for OpenInfra Foundation staff ML https://review.opendev.org/c/opendev/system-config/+/821916 | 22:10 |
opendevreview | Merged opendev/system-config master: Update limboria ircbot to bullseye https://review.opendev.org/c/opendev/system-config/+/821330 | 22:27 |
opendevreview | Merged opendev/system-config master: Fix command for setting the entry message for IRC channel https://review.opendev.org/c/opendev/system-config/+/821913 | 22:27 |
clarkb | Need to generate some text here to check if limnoria is working | 22:54 |
clarkb | that last message shows up in the text log. | 22:55 |
clarkb | I'll try a test meeting momentarily | 22:55 |
clarkb | do we have a test meeting entry? | 22:55 |
clarkb | https://meetings.opendev.org/meetings/test/ looks like yes | 22:56 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Add a domain aliases mechanism to lists.o.o https://review.opendev.org/c/opendev/system-config/+/821914 | 22:56 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Create an OpenInfra Foundation staff ML https://review.opendev.org/c/opendev/system-config/+/821915 | 22:56 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Forward messages for OpenInfra Foundation staff ML https://review.opendev.org/c/opendev/system-config/+/821916 | 22:56 |
clarkb | https://meetings.opendev.org/meetings/test/2021/test.2021-12-15-22.56.txt I think the new image is happy. Note I didn't land the install-from-upstream update yet as that one seems more scary. I was going to check if they have a stable branch or release tags or something as an alternative | 22:57 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: Switch docs theme to RTD https://review.opendev.org/c/zuul/zuul-jobs/+/821918 | 22:59 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: Switch docs theme to RTD https://review.opendev.org/c/zuul/zuul-jobs/+/821918 | 23:08 |
corvus | zuul changes merged... i could do a rolling restart, but i have to head out in <2 hours, so i'll plan on doing it tomorrow (unless other folks plan on being around today and would rather i do it now) | 23:12 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add firewall behavior assertions to testinfra testing https://review.opendev.org/c/opendev/system-config/+/821780 | 23:17 |
clarkb | corvus: I've got family stuff this evening so I'd prefer tomorrow, but no objections if fungi and/or ianw would like to do it | 23:17 |
clarkb | ianw: ^ I think that is what you were suggesting for the connectivity testing | 23:17 |
ianw | should we merge the log format update too before the restart? https://review.opendev.org/c/opendev/system-config/+/821508 | 23:18 |
ianw | is it scheduler or complete restart? | 23:18 |
clarkb | ianw: I've approved the log formatter change. I believe this restart is a rolling restart of schedulers (and maybe web?) | 23:19 |
ianw | clarkb: yep, that's almost exactly what i was thinking :) | 23:20 |
fungi | i can be around for a restart if that's preferable | 23:21 |
ianw | i'll be happy to do it, just wait for the log format changes to apply? | 23:22 |
*** rlandy|ruck is now known as rlandy|ruck|bbl | 23:23 | |
corvus | it would be a rolling scheduler+web restart | 23:26 |
ianw | ++ | 23:27 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: centos: work around 9-stream BLS issues https://review.opendev.org/c/openstack/diskimage-builder/+/821772 | 23:27 |
corvus | okay, i'll go ahead and do the restart under the assumption that fungi/ianw will be around to check on it later. | 23:27 |
corvus | i'm still around for the next hour or so | 23:28 |
corvus | pulling images now | 23:28 |
corvus | killing 02 | 23:31 |
fungi | yeah, i'm still around and will keep an eye on things | 23:38 |
corvus | okay 02 is fully up; going to restart 01 now | 23:42 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: Switch docs theme to RTD https://review.opendev.org/c/zuul/zuul-jobs/+/821918 | 23:59 |
corvus | it's up; i don't see any concerning errors in the scheduler logs | 23:59 |
corvus | will restart web now | 23:59 |