Friday, 2020-10-23

cgoncalvesianw, kevinz: hey. any news on Linaro being down?06:13
ysandep|ruck#opendev hey i noticed some jobs are with mirror issues:-07:44
ysandep|ruck2020-10-23 07:36:26.814749 | primary | Errors during downloading metadata for repository 'AppStream':07:44
ysandep|ruck2020-10-23 07:36:26.814847 | primary |   - Status code: 403 for (IP: 2607:ff68:100:54:f816:3eff:feb5:4635)07:44
ysandep|ruck2020-10-23 07:36:26.814939 | primary | Error: Failed to download metadata for repo 'AppStream': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried07:44
ysandep|ruckfungi, clarkb ^^ fyi ..07:48
openstackgerritlikui proposed openstack/diskimage-builder master: update tox
yoctozeptomorning infra09:55
yoctozeptois there a chance to get valid https on ?09:56
yoctozeptofrickler: yeah, it's 2020 so it's obligatory TLS nowadays11:45
ysandeep|ruck#opendev hello guys o/ ,  Intermittently some jobs are failing with retry_limit after mirror issues.11:45
ysandeep|ruckIs this the right channel to this issue?11:45
ysandeep|ruck2020-10-23 07:36:26.814749 | primary | Errors during downloading metadata for repository 'AppStream':11:45
ysandeep|ruck2020-10-23 07:36:26.814847 | primary |   - Status code: 403 for (IP: 2607:ff68:100:54:f816:3eff:feb5:4635)11:46
ysandeep|ruck2020-10-23 07:36:26.814939 | primary | Error: Failed to download metadata for repo 'AppStream': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried11:46
ysandeep|ruckIs this the right channel to report* this issue?11:46
yoctozeptoysandeep|ruck: the problem is known, yes11:46
ysandeep|ruckyoctozepto, good to know it's already a known issue, many thanks! do we have any bug or patch for this issue?11:47
yoctozeptoysandeep|ruck: waiting for infra-root for that11:48
ysandeep|ruckyoctozepto, okay, thanks!11:48
yoctozeptoysandeep|ruck: the mirror will be fixed or disabled11:48
yoctozeptoyoctozepto: you are welcome11:49
AJaegerysandeep|ruck, yoctozepto: as discussed in #openstack-infra, the mirror has been rebooted. If you encounter problems with new jobs (starting now or later), please report again.11:55
ysandeep|ruckAJaeger, awesome thanks!11:56
AJaegerysandeep|ruck: fungi did all the work ;)11:59
fungiyep, sorry, was just waking up and checked #openstack-infra before here12:04
fungithe console showed hung cpu tasks, i recorded a copy to ~fungi/limestone-mirror.console on bridge.o.o12:05
fungithere you go12:08
fungilogan-: ^ anything unusual going on in the limestone cloud around 09:00 utc yesterday?12:09
fungiaha, more like 08:0012:11
fungiwe have them in syslog in this case, it seems to have been able to log to disk just couldn't communicate over the network12:12
fungithe first stall was logged at Oct 22 08:01:2812:13
openstackgerritzbr proposed zuul/zuul-jobs master: Allow test-setup to perform a connection reset
*** ricolin has joined #opendev14:14
openstackgerritzbr proposed zuul/zuul-jobs master: Add test_setup_reset_connection setting
openstackgerritzbr proposed zuul/zuul-jobs master: Add test_setup_reset_connection setting
openstackgerritzbr proposed zuul/zuul-jobs master: Add test_setup_reset_connection setting
openstackgerritJeremy Stanley proposed openstack/project-config master: Use StoryBoard for sandbox repos
kevinzclarkb: np15:04
openstackgerritzbr proposed zuul/zuul-jobs master: Add test_setup_reset_connection setting
clarkbdmsimard: we've had persistent issues with very large log files clogging up the works, which causes gearman to back up and eventually fall over15:54
clarkbmelwitt is looking at having the tools automatically discard large log files so that it will be more reliable15:54
dmsimardworks for me, wanted to point it out in case it wasn't a known issue15:55
clarkbthanks for checking15:56
melwittclarkb: I'm working on streaming the data instead of loading entire large file into memory as a first step. and then see how it does. if you think that won't be enough, I can stack another change on top to not send too large files for indexing (I don't yet know which part of this code initiates indexing tho)16:05
clarkbmelwitt: oh no I think that's great too16:06
clarkbbasically do our best to index the large logs and if it still fails figure it out from there16:07
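The streaming approach melwitt describes above can be sketched roughly as follows. This is a minimal illustration, not the actual puppet-log_processor change: the function name `iter_log_lines`, the `chunk_size` parameter, and the use of `io.BytesIO` as a stand-in for a remote log file are all assumptions made for the example. It reads a fixed-size chunk at a time and yields complete lines, so memory use is bounded by the chunk size plus the longest single line rather than by the size of the whole file.

```python
import io
from typing import BinaryIO, Iterator


def iter_log_lines(stream: BinaryIO, chunk_size: int = 65536) -> Iterator[str]:
    """Yield decoded lines from a binary stream, one chunk at a time.

    Hypothetical helper: only `chunk_size` bytes plus any partial line
    are held in memory at once, instead of the entire file.
    """
    pending = b""  # bytes of a line that has not been terminated yet
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last newline is complete; keep the rest.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line.decode("utf-8", errors="replace")
    if pending:
        # Final line without a trailing newline.
        yield pending.decode("utf-8", errors="replace")


# Stand-in for a large remote log file (assumed data, for illustration).
fake_log = io.BytesIO(b"line one\nline two\nline three")
for line in iter_log_lines(fake_log, chunk_size=8):
    print(line)
```

A follow-up along the lines melwitt mentions (not sending overly large files for indexing) could sit in front of this, for example by checking a size limit before opening the stream.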
clarkbfungi: I'm trying to remember what the rough plan for jvbs was. I think we said we'd add one more for a total of two jvb hosts?16:11
clarkbwhich would be 60% of previous ptg capacity (and it seemed we had plenty last time?)16:11
fungiright, and we already run a jvb on the primary too yeah?16:12
fungiso we'll end up with three jvb processes in total16:12
clarkbyup so last time we had primary + 4 jvbs which is a total of 5 and this time adding one more would be total of 316:12
fungithat matches my recollection of the discussion from the meeting a couple weeks ago16:12
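The capacity figure quoted above works out as below. This is only a sketch of the arithmetic, and it assumes that capacity scales linearly with the number of JVB processes and that every process handles a similar load; neither assumption is stated in the discussion.

```python
# Previous PTG: one JVB on the primary plus four dedicated JVB hosts.
previous_processes = 1 + 4

# This time: one JVB on the primary plus two dedicated JVB hosts
# (the existing one and the planned additional one).
planned_processes = 1 + 2

# Assumption: capacity is proportional to the number of JVB processes.
ratio = planned_processes / previous_processes
print(f"{planned_processes} of {previous_processes} JVB processes = {ratio:.0%}")
```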
clarkbshould I start booting a jvb02 then? or does someone else want to give it a go? (I'm happy to do it but offering in case others are interested in seeing how meetpad is put together)16:13
openstackgerritzbr proposed zuul/zuul-jobs master: Add test_setup_reset_connection setting
kevinzclark, fungi: Linaro US nodepool recovered16:33
kevinzplease help to check if it works16:33
fungithanks kevinz!!!16:33 is up and running an image build16:33
kevinzOK, np16:34
kevinzSorry for inconvenience16:34
clarkbthe other thing to check for is if jobs can run on that cloud again16:34
fungiseems i can browse
clarkboh ya the mirror that too :)16:35
fungiyep, sorry, as if one security problem wasn't enough, the rest of my afternoon is being spent trying to catch up on openstack vmt stuff i put off all week16:58
clarkbno worries, I had intended on doing it, was just checking no one else wanted to give it a go first16:59
clarkbthe scale up should be basically automagic at this point though since xmpp is used to coordinate16:59
openstackgerritmelanie witt proposed opendev/puppet-log_processor master: Stream log files instead of loading full files into memory
openstackgerritClark Boylan proposed opendev/system-config master: Add jvb02 prior to the PTG
clarkbI'm sorting out sshfp records then will push up the dns update too17:35
melwittclarkb: ^ first intelligible pass at streaming log files. note that I did not fully test it in that I hacked main to create a LogRetriever and call _open_log_file_url and loop to _retrieve_log_line manually. I'm not sure how completely zuul will test it. if you think there's a better test I should do, lmk17:39
clarkbmelwitt: I'm not sure zuul will test it, but what we can do is take one of the worker nodes out of config management and test it that way. It's not like things are functional in that system right now anyway17:41
clarkb(we do have some logstash worker tests but not sure we test the file retrieval there)17:41
melwittcool, I wondered if we could do that. and I hope I don't have some dumb bug in the _handle_event method -_-17:42
melwittif I do, I apologize in advance17:42
fungiwell, there are 20 workers, so if one is buggy it's not the end of the world17:42
fungi20 worker servers i mean... with... 4? worker processes each17:42
clarkbreviews welcome17:59
melwittmy change failed the legacy linter but I'm not sure whether it's related. it failed during some package installs and I don't see the words "puppet-log_processor" in it
clarkbmelwitt: it's a puppet issue. I'll put a fix under your change18:10
melwittoh ok, thanks18:11
openstackgerritClark Boylan proposed opendev/puppet-log_processor master: Stream log files instead of loading full files into memory
openstackgerritClark Boylan proposed opendev/puppet-log_processor master: Fix puppet linter complaints about :: prefixes
clarkbmelwitt: if you're curious old puppet required the :: prefix to root the namespace scope. Newer puppet made it optional then at some point they decided that even though the old version is still correct it shouldn't be used :/18:14
* clarkb finds lunch18:15
melwittclarkb: a-ha, now the log file makes sense to me. thanks!18:17
fungipip 20.2.4 was just released, 20.3 will turn on the dep solver probably next week18:37
fungithey're saying wednesday or thursday is likely for that18:37
funginow the beaker-rspec job doesn't like 75949218:42
fungiGem::RemoteFetcher::UnknownHostError: timed out18:50
fungilooks like a connectivity problem, or maybe is having trouble18:51
fungii'll recheck it18:51
clarkbfungi: thanks19:00
openstackgerritMerged opendev/puppet-log_processor master: Fix puppet linter complaints about :: prefixes
clarkbfungi: I'm thinking we may not get a second reviewer for the jvb stuff today. Should we go ahead and approve it now or ask ianw to review during australia monday morning?19:24
fungiclarkb: i figured i'd approve those in a little while if another infra-root doesn't happen along19:24
fungiworking on some stir fry now but will circle back around to those once i'm done19:28
clarkbnow I want stir fry19:33
fungiit's deep-fry/stir-fry... i'm practicing my skills at doing mongolian beef19:35
clarkbI had some leftover curry for lunch which was good19:37
mordrednow I want stir-fry too20:07
corvusfungi: hope you can keep up with all the orders!20:11
fungii actually can't20:21
fungidredging all the wafer-thin slices of steak is something i need to get faster at20:21
openstackgerritMerged opendev/ master: Add jvb02
clarkbfungi looks like failed on tox linters due to a timeout21:32
clarkbshould we reenqueue to the gate or just recheck?21:33
clarkb(looking at the timed out job it seems to have just been slow)21:33
* clarkb enqueues because running out of day21:33
fungiyeah, sorry, please do21:35
fungigot sidetracked prepping pizza dough for tomorrow21:35
prometheanfireare we still being scraped?21:54
* prometheanfire can't connect to gerrit via gertty21:54
clarkbdid you set a new api token?21:55
prometheanfireoh, that needed? ok21:56
prometheanfireI need a new http-password set?21:57
clarkbyes, we unset all of them21:57
prometheanfireah, kk done21:57
fungiyeah, gerrit 2.13 still keeps them in plaintext in the db22:04
fungiwhat we're upgrading to will put kdf (bcrypt) hashes in there instead22:04
fungidownside is that gerrit will no longer be able to show you the key once set, other than ephemerally when it's generated22:05
fungiso if you forget it you can't look it up, you'll just have to regenerate it22:05
fungiultimately much safer though22:06
prometheanfireya, I prefer that anyway :P22:06
fungiso do we22:06
clarkbI think that tox linters job may fail again22:08
clarkbit takes 10 minutes just to run tox without any tests? that installs deps?22:08
clarkbwhy is that so slow22:08
fungithe times i watched, it looked like it was taking forever on ansible-lint22:10
clarkbya it's slow there too22:10
clarkbfungi: do you think we should just increase the timeout? I'm not sure how much slowness debugging I want to do right now22:17
clarkb unfortunately doesn't have timing info for the pip install steps22:20
fungimmm, maybe22:25
openstackgerritClark Boylan proposed opendev/system-config master: Add jvb02 prior to the PTG
clarkbthat bumps the tox-linters timeout22:28
fungimaybe sometime soon we can figure out how to speed that up22:29
clarkbsupposedly not using find makes it go faster because then you don't have python startup overhead over and over22:30
clarkbbut in the past it hasn't found all the files when we did that so may need double checking22:30
fungii have a feeling this is part of the problem:
fungiyeah, what you just said22:31
fungiwe basically incur the startup overhead 300x22:31
fungiand that multiplier is only going to continue to increase as we port more systems from puppet to ansible22:32
clarkbya we tried switching and it didn't work at all iirc22:34
clarkbso we reverted22:34
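The startup-overhead point above can be illustrated with a small sketch that compares one tool invocation per file against a few batched invocations. It only builds the argument lists and does not run anything. The helper names, the file paths, and the batch size are hypothetical; `ansible-lint` and the figure of roughly 300 files are taken from the discussion. As noted above, an earlier attempt at switching did not pick up all the files, so this shows only the cost model, not a verified fix.

```python
from typing import List, Sequence


def per_file_commands(tool: str, files: Sequence[str]) -> List[List[str]]:
    """One invocation per file: interpreter startup is paid len(files) times."""
    return [[tool, path] for path in files]


def batched_commands(
    tool: str, files: Sequence[str], batch_size: int = 100
) -> List[List[str]]:
    """Pass many files per invocation: startup is paid once per batch."""
    return [
        [tool, *files[start:start + batch_size]]
        for start in range(0, len(files), batch_size)
    ]


# Hypothetical file list standing in for the ~300 files mentioned above.
files = [f"playbooks/example-{n}.yaml" for n in range(300)]

print(len(per_file_commands("ansible-lint", files)), "invocations, one per file")
print(len(batched_commands("ansible-lint", files)), "invocations when batched")
```

Each inner list could be handed to `subprocess.run`, so the number of lists is the number of times the Python interpreter has to start.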
openstackgerritMerged opendev/system-config master: Add jvb02 prior to the PTG
fungiclarkb: ^23:37
clarkboh it hasn't run yet because it updated inventory so runs all the jobs23:48
clarkbI'll check on it a bit later then23:48
*** owalsh_ has quit IRC23:48
