Wednesday, 2019-09-11

donnydclarkb: I think I can work something like that up00:01
donnydI was also thinking maybe io load00:01
*** jamesmcarthur has joined #openstack-infra00:02
*** jbadiapa has quit IRC00:04
*** jamesmcarthur has quit IRC00:06
*** jamesmcarthur has joined #openstack-infra00:10
*** goldyfruit___ has joined #openstack-infra00:11
donnydIs there a way we could publish the nodepool logs to the log server00:15
donnydI would like to run down the source of some jobs not being able to ssh inbound00:15
*** mtreinish has joined #openstack-infra00:18
*** jamesmcarthur has quit IRC00:27
*** armax has quit IRC00:32
*** jamesmcarthur has joined #openstack-infra00:39
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: Add remove-zuul-sshkey  https://review.opendev.org/68071200:41
*** eharney has quit IRC00:42
*** gyee has quit IRC00:42
*** jamesmcarthur has quit IRC00:44
ianwspeaking of mtu's, i feel like it's likely afs01 -> afs02 could use jumbo packets?00:44
ianwwe could enable that with "-jumbo" to the volserver process i think00:45
ianwanyway ... let's sort out one problem at a time :)  some stats on zero delta release for fedora -> https://lists.openafs.org/pipermail/openafs-info/2019-September/042864.html00:46
*** armax has joined #openstack-infra00:56
*** dychen has joined #openstack-infra01:01
*** dchen has quit IRC01:02
*** jamesmcarthur has joined #openstack-infra01:02
*** dingyichen has joined #openstack-infra01:03
*** dychen has quit IRC01:06
*** mriedem has joined #openstack-infra01:06
*** dingyichen has quit IRC01:11
*** dchen has joined #openstack-infra01:13
*** dychen has joined #openstack-infra01:20
*** dchen has quit IRC01:22
*** tkajinam has quit IRC01:25
*** tkajinam has joined #openstack-infra01:26
*** dingyichen has joined #openstack-infra01:39
*** HenryG has joined #openstack-infra01:40
*** dychen has quit IRC01:41
*** nicolasbock has quit IRC01:44
*** nicolasbock has joined #openstack-infra01:45
*** rlandy has quit IRC01:50
ianwauristor / others: for our own info, fedora mirror seemed to go wrong @01:53
ianwCould not end transaction on a ro volume: rxk: authentication expired01:53
ianw Could not update VLDB entry for volume 53687100601:53
ianwin http://files.openstack.org/mirror/logs/rsync-mirrors/fedora.log.7.gz01:53
ianwunfortunately, that's the last log, it was ~ 2019-08-31 15:56:5901:53
openstackgerritAkihiro Motoki proposed openstack/project-config master: Update horizon grafana dashboard  https://review.opendev.org/68136101:54
ianwafaict, every update since then hit a locked volume01:54
ianwuntil our disk issues and restarts etc for the last few days01:55
ianwthis probably explains how r/w delta has got up to 50gb ... it's many days of mirroring01:55
auristorianw: more bad news - openafs can't do jumbograms with rxkad/crypt connections :(01:58
*** nicolasbock has quit IRC01:58
*** jamesmcarthur has quit IRC01:59
*** dingyichen has quit IRC02:04
*** jamesmcarthur has joined #openstack-infra02:04
*** dchen has joined #openstack-infra02:04
*** slaweq has joined #openstack-infra02:11
*** FlorianFa has quit IRC02:15
*** slaweq has quit IRC02:15
*** auristor has quit IRC02:20
*** njohnston|lunch has quit IRC02:22
*** auristor has joined #openstack-infra02:27
*** roman_g has quit IRC02:34
*** FlorianFa has joined #openstack-infra02:35
*** mriedem has quit IRC02:48
*** yamamoto has joined #openstack-infra02:56
*** jamesmcarthur has quit IRC03:00
*** jamesmcarthur has joined #openstack-infra03:06
*** ykarel|away has joined #openstack-infra03:08
*** ccamacho has quit IRC03:11
*** jamesmcarthur has quit IRC03:13
openstackgerritIan Wienand proposed opendev/system-config master: fedora mirror update : add sleep  https://review.opendev.org/68136703:16
*** jamesmcarthur has joined #openstack-infra03:25
*** rh-jelabarre has quit IRC03:31
*** rfolco has quit IRC03:32
*** PrinzElvis has quit IRC03:39
*** larainema has joined #openstack-infra03:39
*** abelur has quit IRC03:40
*** srwilkers has quit IRC03:41
*** rpioso has quit IRC03:42
*** knikolla has quit IRC03:42
*** davecore has quit IRC03:42
*** csatari has quit IRC03:42
*** setuid has quit IRC03:42
*** rpioso has joined #openstack-infra03:44
*** srwilkers has joined #openstack-infra03:44
*** knikolla has joined #openstack-infra03:45
*** setuid has joined #openstack-infra03:45
*** csatari has joined #openstack-infra03:45
*** davecore has joined #openstack-infra03:45
*** PrinzElvis has joined #openstack-infra03:45
*** abelur has joined #openstack-infra03:45
*** jamesmcarthur has quit IRC03:57
*** udesale has joined #openstack-infra04:00
*** ykarel|away has quit IRC04:08
*** ianychoi_ has joined #openstack-infra04:25
*** ykarel|away has joined #openstack-infra04:25
*** ykarel|away is now known as ykarel04:25
*** ianychoi has quit IRC04:27
*** soniya29 has joined #openstack-infra04:30
*** exsdev has quit IRC04:42
*** pcaruana has joined #openstack-infra04:42
*** exsdev has joined #openstack-infra04:46
*** kjackal has joined #openstack-infra04:57
*** slaweq has joined #openstack-infra05:11
*** pcaruana has quit IRC05:12
*** jtomasek has joined #openstack-infra05:13
*** slaweq has quit IRC05:16
*** rcernin has quit IRC05:22
*** redrobot has quit IRC05:25
*** rcernin has joined #openstack-infra05:38
*** kopecmartin|off is now known as kopecmartin05:46
*** rcernin has quit IRC05:51
cmurphyAJaeger: can you propose the correction to https://review.opendev.org/681161 and i'll approve?05:57
*** rpittau|afk is now known as rpittau06:01
AJaegercmurphy: good morning - will do ;)06:02
AJaegercmurphy: https://review.opendev.org/681380 - found one more problem06:05
*** yamamoto has quit IRC06:06
cmurphythanks AJaeger06:07
* cmurphy -> bed06:07
*** rcernin has joined #openstack-infra06:09
*** slaweq has joined #openstack-infra06:11
*** slaweq has quit IRC06:15
*** pcaruana has joined #openstack-infra06:21
*** slaweq has joined #openstack-infra06:21
AJaegergood night, cmurphy !06:25
*** pgaxatte has joined #openstack-infra06:26
*** AJaeger has quit IRC06:28
*** AJaeger has joined #openstack-infra06:35
*** ricolin has joined #openstack-infra06:38
*** ricolin has quit IRC06:39
*** ociuhandu has joined #openstack-infra06:42
*** ociuhandu has quit IRC06:46
*** lpetrut has joined #openstack-infra06:52
*** ykarel is now known as ykarel|lunch06:57
*** ociuhandu has joined #openstack-infra07:04
*** ociuhandu has quit IRC07:04
*** ociuhandu has joined #openstack-infra07:07
*** ociuhandu has quit IRC07:08
*** trident has quit IRC07:08
*** tesseract has joined #openstack-infra07:11
*** pkopec has joined #openstack-infra07:15
*** roman_g has joined #openstack-infra07:16
*** gfidente has joined #openstack-infra07:17
*** trident has joined #openstack-infra07:17
ianw#status log all volumes released and mirror-update.opendev.org returned to operation.  for info on the debugging done with fedora volume; see https://lists.openafs.org/pipermail/openafs-info/2019-September/042865.html07:18
openstackstatusianw: finished logging07:18
*** ralonsoh has joined #openstack-infra07:19
*** trident has quit IRC07:22
*** rcosnita has joined #openstack-infra07:24
*** ociuhandu has joined #openstack-infra07:24
ianwi've also disabled fileaudit logging on servers; it doesn't appear related to rsync updates07:26
rcosnitahello. Has anyone got some time to review a very small pull request: https://review.opendev.org/#/c/678786? -> It is a minor improvement in the openstack ironic helm chart.07:27
*** trident has joined #openstack-infra07:31
*** jpena|off is now known as jpena07:37
*** xenos76 has joined #openstack-infra07:38
*** ccamacho has joined #openstack-infra07:38
*** rcernin has quit IRC07:38
AJaegerrcosnita: this is the channel for running the OpenStack infrastructure. For helm charts, check the proper channel. Full list of channels is at https://wiki.openstack.org/wiki/IRC07:41
rcosnitasorry, I've just realised that. I added my question on openstack-helm channel. Sorry for the noise.07:41
*** ykarel|lunch is now known as ykarel07:45
*** e0ne has joined #openstack-infra07:56
*** jbadiapa has joined #openstack-infra08:01
*** sshnaidm|afk is now known as sshnaidm|ruck08:09
*** ykarel is now known as ykarel|meeting08:12
*** e0ne has quit IRC08:12
*** dchen has quit IRC08:12
*** ociuhandu has quit IRC08:12
*** ykarel_ has joined #openstack-infra08:14
*** ociuhandu has joined #openstack-infra08:15
*** ykarel|meeting has quit IRC08:16
*** ykarel_ is now known as ykarel|meeting08:18
*** derekh has joined #openstack-infra08:20
*** tkajinam has quit IRC08:27
*** e0ne has joined #openstack-infra08:40
zbrianw: morning!08:48
*** priteau has joined #openstack-infra08:53
openstackgerritSorin Sbarnea proposed openstack/openstack-zuul-jobs master: openstack-tox-molecule: replace success-url and failure-url  https://review.opendev.org/68125109:02
*** lpetrut has quit IRC09:04
*** ianychoi_ has quit IRC09:09
*** rcernin has joined #openstack-infra09:10
*** gfidente has quit IRC09:10
*** rcosnita has quit IRC09:11
*** soniya29 has quit IRC09:18
*** dtantsur|afk is now known as dtantsur09:30
*** ykarel_ has joined #openstack-infra09:32
*** ykarel|meeting has quit IRC09:34
*** ricolin_phone has joined #openstack-infra09:35
*** ricolin_phone has quit IRC09:38
*** ykarel_ is now known as ykarel09:39
*** gfidente has joined #openstack-infra09:40
*** rcernin has quit IRC09:41
*** ociuhandu has quit IRC09:43
*** prometheanfire has quit IRC09:43
*** prometheanfire has joined #openstack-infra09:44
*** kjackal has quit IRC09:49
*** ociuhandu has joined #openstack-infra09:49
*** kjackal has joined #openstack-infra09:51
*** pgaxatte has quit IRC09:52
*** ociuhandu has quit IRC09:54
*** pkopec has quit IRC10:24
*** pkopec has joined #openstack-infra10:26
*** ianychoi has joined #openstack-infra10:30
*** ianychoi has quit IRC10:45
*** ianychoi has joined #openstack-infra10:45
sshnaidm|ruckhi, how can I build new containers with zuul? The current containers on docker.io/zuul have zuul version 3.5.0 which is quite old11:00
sshnaidm|ruckmordred, clarkb ^^11:00
AJaegerwhere are you looking? https://hub.docker.com/r/zuul/zuul/tags has the latest build, doesn't it?11:01
*** mahajan-abhishek has joined #openstack-infra11:02
AJaegersshnaidm|ruck: ^11:03
*** shachar has quit IRC11:08
*** snapiri has joined #openstack-infra11:08
*** rh-jelabarre has joined #openstack-infra11:16
sshnaidm|ruckAJaeger, weird, this build has 3.5.011:20
*** sshnaidm|ruck is now known as sshnaidm|bbl11:20
*** ykarel is now known as ykarel|afk11:20
AJaegersshnaidm|bbl: but is it recent? So, is the version number wrong or the content also outdated? Best discuss on #zuul in either case...11:21
*** udesale has quit IRC11:31
zbrouch, zuul/zuul does not even have tags on it; maybe there is another one that has versioned tags? clearly it would be very useful to have versioned tags and also a moving one like "latest" that points to the latest release.11:32
*** nicolasbock has joined #openstack-infra11:39
*** mattymo has joined #openstack-infra11:40
*** jpena is now known as jpena|lunch11:40
mattymoAnyone around familiar with zuul/nodepool? I know I can add secret env vars to zuul jobs, but for nodepool, you can't embed secrets in nodepool.yaml. I have to host the nodepool conf on a public git repo and I don't want to post secrets required for building RHEL disk images11:41
mattymoAny suggestions?11:41
zbrianw: can you please abandon https://review.opendev.org/#/c/335520/ ? thanks11:57
pabelangermattymo: one option is to use ansible to encrypt your secrets and template the config11:57
pabelangermattymo: another is use secure.conf: https://zuul-ci.org/docs/nodepool/installation.html?#configuration11:58
*** spsurya has joined #openstack-infra11:58
pabelangermattymo: you may also want to join #zuul for nodepool / zuul specific questions too11:58
*** rfolco has joined #openstack-infra12:06
mattymopabelanger: I looked at the source code for secure.conf. It only supports zookeeper settings. I thought about hacking it, but I don't want to fork if I don't have to. I thought of another dirty hack of adding the secrets to the shell env inside the nodepool container12:07
pabelangermattymo: what is it you are looking to do?12:10
*** pgaxatte has joined #openstack-infra12:10
pabelangeryou're likely best off encrypting them using vault from ansible12:10
pabelangeror some cfgmgmt tool in this case12:11
mattymopabelanger: You're right. I'm going to try that route first12:11
*** auristor has quit IRC12:11
pabelangermattymo: also look to join #zuul too, there are more humans doing container things, which might know a better way to use secrets from those tools12:12
*** lpetrut has joined #openstack-infra12:13
*** ociuhandu has joined #openstack-infra12:13
*** ociuhandu has quit IRC12:14
*** goldyfruit___ has quit IRC12:15
*** derekh has quit IRC12:16
*** ociuhandu has joined #openstack-infra12:16
*** ociuhandu has quit IRC12:16
*** auristor has joined #openstack-infra12:17
*** rlandy has joined #openstack-infra12:23
*** jamesmcarthur has joined #openstack-infra12:25
*** auristor has quit IRC12:28
*** happyhemant has joined #openstack-infra12:30
*** jamesmcarthur has quit IRC12:30
*** auristor has joined #openstack-infra12:31
*** jpena|lunch is now known as jpena12:32
*** auristor has quit IRC12:37
*** ociuhandu has joined #openstack-infra12:38
*** ykarel|afk is now known as ykarel12:38
*** noama has joined #openstack-infra12:41
*** auristor has joined #openstack-infra12:44
*** e0ne has quit IRC12:46
*** mriedem has joined #openstack-infra12:49
*** eharney has joined #openstack-infra12:51
*** e0ne has joined #openstack-infra12:52
*** derekh has joined #openstack-infra13:01
*** Goneri has joined #openstack-infra13:08
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: WIP: Allow ensure-tox to upgrade tox version  https://review.opendev.org/67646413:11
*** ociuhandu has quit IRC13:12
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: Allow ensure-tox to upgrade tox version  https://review.opendev.org/67646413:13
*** ociuhandu has joined #openstack-infra13:13
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: add-build-sshkey: add centos/rhel-8 support  https://review.opendev.org/67409213:13
*** mahajan-abhishek has left #openstack-infra13:14
*** kjackal has quit IRC13:15
*** kjackal has joined #openstack-infra13:18
*** ociuhandu has quit IRC13:18
*** jaosorior has quit IRC13:26
*** ociuhandu has joined #openstack-infra13:26
*** goldyfruit___ has joined #openstack-infra13:29
*** ociuhandu has quit IRC13:31
*** eharney has quit IRC13:32
*** eharney has joined #openstack-infra13:32
*** sthussey has joined #openstack-infra13:40
*** panda is now known as panda|ruck13:40
*** ociuhandu has joined #openstack-infra13:41
*** aaronsheffield has joined #openstack-infra13:42
*** sshnaidm|bbl is now known as sshnaidm|ruck13:44
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: Allow ensure-tox to upgrade tox version  https://review.opendev.org/67646413:44
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: add-build-sshkey: add centos/rhel-8 support  https://review.opendev.org/67409213:44
*** panda|ruck is now known as panda|rover13:45
zbrare success-url and failure-url going away, or is that only a temporary issue? asking so I know whether to adopt the tox-docs approach in other places or not.13:48
*** redrobot has joined #openstack-infra13:50
AJaegerzbr: my expectation: success-url will work while a buildset is running and work for non-swift logs. But discuss with corvus13:50
*** rpittau is now known as rpittau|afk13:54
zbrcorvus: pabelanger: please help me merge https://review.opendev.org/#/c/681251/ to fix broken urls on tox-molecule jobs.13:55
*** nicolasbock has quit IRC14:10
*** nicolasbock has joined #openstack-infra14:11
*** nicolasbock has quit IRC14:11
*** nicolasbock has joined #openstack-infra14:13
*** diablo_rojo has joined #openstack-infra14:15
*** nicolasbock has quit IRC14:15
*** ricolin has joined #openstack-infra14:18
openstackgerritAndreas Jaeger proposed openstack/project-config master: Rename x/ansible-role-cloud-launcher -> opendev/  https://review.opendev.org/66253014:22
*** electrofelix has joined #openstack-infra14:26
*** mattymo has quit IRC14:32
*** priteau has quit IRC14:34
*** ykarel is now known as ykarel|away14:37
*** nhicher has quit IRC14:39
openstackgerritGraham Hayes proposed openstack/project-config master: Allow all TC members +W access  https://review.opendev.org/67821414:39
*** sshnaidm|ruck is now known as sshnaidm|rover14:40
mriedemclarkb: fyi if you see ssh failures in tempest runs, i've been noticing this in the guest console log output:14:41
mriedemhttp://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22sh%3A%20write%20error%3A%20No%20space%20left%20on%20device%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d14:41
mriedemi can't say for sure that's the reason for the ssh failure though because the log goes on to show cloud-init got the network info14:41
*** ykarel|away has quit IRC14:46
*** panda|rover is now known as panda|ruck14:52
*** jbadiapa has quit IRC14:54
*** pgaxatte has quit IRC14:54
*** nhicher has joined #openstack-infra14:56
*** prometheanfire has quit IRC14:57
*** ykarel has joined #openstack-infra15:03
*** jamesmcarthur has joined #openstack-infra15:08
*** chandankumar has quit IRC15:11
*** chandankumar has joined #openstack-infra15:12
*** gyee has joined #openstack-infra15:23
*** bdodd has joined #openstack-infra15:23
*** igordc has joined #openstack-infra15:31
*** mattw4 has joined #openstack-infra15:34
clarkbpabelanger: mattymo ya an important distinction here is that zuul is managing secrets for the jobs it runs not for zuul's own configs15:37
*** eharney has quit IRC15:37
clarkbin the nodepool case if you need to write secrets to the config I think configuration management tooling is what you want to rely on15:37
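A minimal sketch of the "configuration management writes the secret" idea discussed above, assuming the public repo holds only a template and the secret is supplied at deploy time. The template layout, the image name, and the `EXAMPLE_SECRET` placeholder are hypothetical illustrations, not nodepool or ansible features; a real deployment would more likely use ansible-vault and an ansible template task.

```python
from string import Template

# Hypothetical template that could live in a public git repo.
# The keys and the EXAMPLE_SECRET placeholder are illustrative only.
NODEPOOL_TEMPLATE = Template("""\
diskimages:
  - name: example-image
    env-vars:
      EXAMPLE_SECRET: ${EXAMPLE_SECRET}
""")


def render_config(template, secrets):
    """Fill the template from a mapping of secrets.

    substitute() raises KeyError when a placeholder has no value,
    so a missing secret fails loudly instead of writing a broken config.
    """
    return template.substitute(secrets)


if __name__ == "__main__":
    # In practice the mapping would come from a private store
    # (for example a decrypted vault file), never from the public repo.
    print(render_config(NODEPOOL_TEMPLATE, {"EXAMPLE_SECRET": "not-a-real-secret"}))
```

The rendered file is written only on the nodepool host, so the public repository never contains the secret value.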
pabelanger++15:38
clarkbmriedem: I think that can be a few potential issues: 1) growroot or equivalent isn't working to expand the image size on boot 2) the image flavor disk size is too small for the workload being tested on the image 3) we are running out of disk space in /var/lib/libvirt/images (or whatever that path was) like ironic jobs are doing15:39
clarkbmriedem: considering that we are able to have ansible ssh in and collect logs I think 3) is unlikely as that is usually more catastrophic but could still be possible15:40
clarkbmriedem: checking 2) should be easy. checking 1) will depend on the image and whether or not we collect console logs15:40
clarkbcorvus: fungi can I get a review on https://review.opendev.org/#/c/681354/ so that I can test only collecting last effort debug logs on job failures?15:41
*** e0ne has quit IRC15:42
*** e0ne has joined #openstack-infra15:42
*** jaosorior has joined #openstack-infra15:44
clarkbmriedem: 'GROWROOT: NOCHANGE: partition 1 is size 2078687. it cannot be grown'15:44
clarkbmriedem: I think that means the image is already as big as it can be based on flavor and image size?15:44
clarkbis that 2MB?15:45
clarkbso ya maybe we bump the flavor size up to 50MB or something15:45
clarkboh it's volume backed so that must be 2GB not 2MB?15:49
*** eharney has joined #openstack-infra15:50
*** ykarel is now known as ykarel|away15:51
*** Garyx has quit IRC15:53
*** Garyx has joined #openstack-infra15:54
clarkbok we create a volume with the cirros image with a size of '1'. I think that means the volume itself is 1GB large15:54
mriedemthe default volume size in tempest runs is 1gb15:56
clarkbmriedem: ya so now I wonder where the size in the growroot error comes from15:57
clarkbit should be in the range of 107374182415:58
clarkbor 1073741 if the units are kb15:58
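One way to sanity-check the number in the GROWROOT message is to try a different unit. This is a minimal sketch, assuming the reported partition size is counted in 512-byte sectors; the log does not state the unit, so that is an assumption rather than a confirmed fact.

```python
SECTOR_BYTES = 512   # assumption: the size is reported in 512-byte sectors
GIB = 1024 ** 3


def sectors_to_bytes(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a sector count to bytes."""
    return sectors * sector_bytes


partition_sectors = 2078687           # value from the GROWROOT console message
partition_bytes = sectors_to_bytes(partition_sectors)
volume_bytes = 1 * GIB                # a volume created with size '1', read as 1 GiB

if __name__ == "__main__":
    print("partition bytes:", partition_bytes)
    print("fraction of 1 GiB:", partition_bytes / volume_bytes)
    print("bytes left over:", volume_bytes - partition_bytes)
```

If the sector assumption holds, the partition already fills almost all of a 1 GiB volume, which would be consistent with growroot reporting that it cannot be grown.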
*** diablo_rojo has quit IRC15:59
*** diablo_rojo has joined #openstack-infra15:59
clarkbmriedem: I'm going to guess that cirros issue is locally reproducible unless it is a hypervisor running out of disk16:00
clarkbhttps://9fb8b67e1f094c1ca6a3-8e4483e68b70671d63c5bd54e8baf97f.ssl.cf5.rackcdn.com/628076/4/check/cinder-tempest-dsvm-lvm-lio-barbican/125dba2/logs/df.txt.gz at least at the end of the job we have plenty of disk16:00
openstackgerritMerged opendev/base-jobs master: Update cleanup tasks to only happen on failure  https://review.opendev.org/68135416:01
clarkbfungi: thanks re ^16:01
mugsieanyone able to give an ACL change a +W? - https://review.opendev.org/#/c/678214/16:02
mriedemhttp://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22GROWROOT%3A%20NOCHANGE%3A%20partition%201%20is%20size%5C%22%20AND%20message%3A%5C%22it%20cannot%20be%20grown%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d16:03
*** mattw4 has quit IRC16:04
*** mattw4 has joined #openstack-infra16:04
openstackgerritMerged zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts  https://review.opendev.org/68118216:05
clarkbmriedem: it is also entirely possible those errors happen on successful jobs too we just don't collect console logs in that case16:06
clarkbthat said I think removing errors even if benign is worthwhile as it removes distractions when debugging16:06
mriedemclarkb: true16:06
mriedemi wonder if it would be useful to have tempest ssh and df and dump that out when we get an ssh failure?16:07
mriedemsince the console output shows the disks but not their usage16:07
AJaegermugsie: why did you rebase that change? That is completely unneeded unless there are merge conflicts - and only wastes test resources that are rare at the moment with all the feature freeze rush...16:08
clarkbmriedem: if we have an ssh failure we may not be able to ssh in :)16:08
AJaeger(a single one does not hurt - just wanted to point it out before it happens large scale ;)16:08
clarkbmriedem: but ya I think figuring that out would be good16:08
mriedemclarkb: heh right16:08
mriedemregardless i guess i should open a new bug for this and track it in e-r16:09
mugsieAJaeger: habit - I went to double check it would still rebase cleanly, and just g-r'd it16:09
clarkbmugsie: note that git review will bail out if a rebase is necessary16:09
clarkbthough if you aren't making changes then it will also error because no changes16:10
mugsieclarkb: I know, I just have a habit of git fetch, git rebase, git review for older patches16:10
*** dtantsur is now known as dtantsur|afk16:10
clarkbwe try really hard to tell you when a rebase is necessary so that you only do it then :)16:10
*** mattw4 has quit IRC16:10
clarkbgerrit, zuul and git review will all tell you16:10
mugsieyeah, unfortunately most of my work these days is outside of zuul, gerrit, and other good CI/CD toolchains16:11
fungialso git-restack is good for avoiding unnecessary rebasing onto different parents when updating a change series16:11
fungi(related)16:11
mriedemclarkb: could this be closer to the reason for the ssh fail?16:13
mriedemip-route6:unreachable default dev lo  metric -1  error -10116:13
clarkbmriedem: is https://openstack.fortnebula.com:13808/v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/zuul_opendev_logs_c4c/671072/18/gate/nova-tox-functional/c4ca604/job-output.txt the functional test problem you were saying has a fix? it just reset the gate16:13
mriedemthough i guess in this test it's trying to ssh into an ipv4 address16:13
clarkbmriedem: ya I think that is only a problem if using ipv6 which that test may not be16:13
mriedemclarkb: i noticed TestInstanceNotificationSampleWithMultipleCompute.test_multiple_compute_actions on another change this morning, it's not related to the one i pointed out yesterday (which is merged now),16:14
mriedemso i suspect something new has merged which is tickling a new failure,16:14
mriedemi'll e-r that one after i get this growroot one16:14
openstackgerritMatt Riedemann proposed opendev/elastic-recheck master: Add query for growroot fail bug 1843610  https://review.opendev.org/68152716:16
openstackbug 1843610 in OpenStack-Gate "Tempest ssh to guest intermittently fails, "GROWROOT: NOCHANGE: partition 1 is size 2078687. it cannot be grown" seen in guest console log" [Undecided,New] https://launchpad.net/bugs/184361016:16
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: WIP: Allow ensure-tox to upgrade tox version  https://review.opendev.org/67646416:16
openstackgerritMerged openstack/project-config master: Allow all TC members +W access  https://review.opendev.org/67821416:17
*** ricolin has quit IRC16:22
openstackgerritMatt Riedemann proposed opendev/elastic-recheck master: Add query for test_multiple_compute_actions race fail bug 1843615  https://review.opendev.org/68153016:27
mriedemclarkb: ^16:27
openstackbug 1843615 in OpenStack Compute (nova) "TestInstanceNotificationSampleWithMultipleCompute.test_multiple_compute_actions intermittently failing since Sept 10, 2019" [High,Confirmed] https://launchpad.net/bugs/184361516:27
*** ociuhandu has quit IRC16:30
*** chandankumar is now known as raukadah16:31
openstackgerritSorin Sbarnea proposed zuul/zuul master: For pre blocks to wrap text  https://review.opendev.org/68153216:32
*** ralonsoh has quit IRC16:37
*** lpetrut has quit IRC16:40
*** hwoarang has quit IRC16:42
*** jpena is now known as jpena|off16:44
*** hwoarang has joined #openstack-infra16:45
*** jaosorior has quit IRC16:46
*** ociuhandu has joined #openstack-infra16:46
openstackgerritMerged openstack/openstack-zuul-jobs master: openstack-tox-molecule: replace success-url and failure-url  https://review.opendev.org/68125116:47
clarkbcorvus: we don't seem to set zuul_success on the cleanup playbook http://paste.openstack.org/show/775150/16:49
*** ykarel|away has quit IRC16:52
*** e0ne has quit IRC16:54
corvusclarkb: good thing we tested :/16:54
clarkbI see why now, working on a zuul change16:54
*** tesseract has quit IRC16:57
*** mriedem is now known as gibi_zeppelin16:58
*** gibi_zeppelin is now known as gibi_submarine16:58
*** gibi_submarine is now known as mriedem16:58
*** derekh has quit IRC17:00
*** eernst has joined #openstack-infra17:01
openstackgerritMerged opendev/elastic-recheck master: Add query for growroot fail bug 1843610  https://review.opendev.org/68152717:02
openstackbug 1843610 in OpenStack-Gate "Tempest ssh to guest intermittently fails, "GROWROOT: NOCHANGE: partition 1 is size 2078687. it cannot be grown" seen in guest console log" [Undecided,New] https://launchpad.net/bugs/184361017:02
clarkbrunning test now to confirm a fix17:05
openstackgerritJames E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin  https://review.opendev.org/68077817:06
openstackgerritPaul Belanger proposed zuul/nodepool master: Disable interface_ip check, when host-key-checking is disable  https://review.opendev.org/68154417:06
*** ociuhandu has quit IRC17:08
*** ykarel|away has joined #openstack-infra17:08
*** gfidente has quit IRC17:13
openstackgerritMerged opendev/elastic-recheck master: Add query for test_multiple_compute_actions race fail bug 1843615  https://review.opendev.org/68153017:15
openstackbug 1843615 in OpenStack Compute (nova) "TestInstanceNotificationSampleWithMultipleCompute.test_multiple_compute_actions intermittently failing since Sept 10, 2019" [High,In progress] https://launchpad.net/bugs/1843615 - Assigned to Matt Riedemann (mriedem)17:15
openstackgerritSorin Sbarnea proposed zuul/zuul master: For pre blocks to wrap text  https://review.opendev.org/68153217:19
openstackgerritClark Boylan proposed zuul/zuul master: Pass zuul_success to cleanup playbooks  https://review.opendev.org/68155217:21
openstackgerritSorin Sbarnea proposed zuul/zuul master: For pre blocks to wrap text  https://review.opendev.org/68153217:21
openstackgerritTristan Cacqueray proposed zuul/zuul master: synchronize: add support for kubectl connection  https://review.opendev.org/68155317:22
openstackgerritMerged openstack/project-config master: New project request: airship/images  https://review.opendev.org/67794417:26
*** ociuhandu has joined #openstack-infra17:27
*** sshnaidm|rover is now known as sshnaidm|off17:28
*** ociuhandu has quit IRC17:31
openstackgerritPaul Belanger proposed zuul/nodepool master: Disable interface_ip check, when host-key-checking is disable  https://review.opendev.org/68154417:32
*** spsurya has quit IRC17:32
zbrclarkb: if you can wf https://review.opendev.org/#/c/680962/ it would be great.17:37
*** kopecmartin is now known as kopecmartin|off17:38
openstackgerritMerged zuul/zuul master: Fix timestamp race occurring on fast systems  https://review.opendev.org/68093717:44
clarkbzbr: why do we not need to mock platform.system to match 'Darwin' in the darwin case?17:46
clarkbthat is what we check against in depends.py17:46
zbrclarkb: see https://opendev.org/opendev/bindep/src/branch/master/bindep/depends.py#L307-L31417:48
zbrmainly we have a very different approach.17:48
zbrclarkb: but probably we could mock more and avoid those conditionals17:48
clarkbzbr: ya that code only runs if platform.system == Darwin but we never set that I don't think17:48
zbri cannot say I liked it in particular.17:49
clarkbwhat sets platform.system to Darwin is my question I guess (the change sets it to Linux if testing linux but I don't see that for darwin)17:49
zbrclarkb: if you want, make a comment and I will try to rework the code tomorrow.17:49
clarkbI don't know that we need to rework the code. I'm just trying to understand it17:50
clarkbthe test passes which means something must set platform.system to darwin right?17:50
zbrclarkb: because platform.system was not always mocked, when you would run unittest tests on Darwin, they would fail17:51
clarkbright I get that part. but for the test to work something must set platform.system to Darwin. What does that?17:51
zbrsorry i need to go, family duty17:52
clarkboh I see it now17:52
clarkbwe have two mocks happening and they are nested17:52
clarkbmy brain read it as a single one17:52
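The nested-mock behaviour described above can be illustrated with `unittest.mock`. This is a sketch only: `detect_platform` is a hypothetical stand-in for the platform check in bindep's depends.py, and the nesting shown here is an assumption about the general pattern, not a copy of the actual test.

```python
import platform
from unittest import mock


def detect_platform():
    """Hypothetical stand-in for code that branches on platform.system()."""
    if platform.system() == "Darwin":
        return "darwin"
    return platform.system().lower()


def run_nested_case():
    """Patch platform.system twice; the innermost patch wins while active."""
    with mock.patch("platform.system", return_value="Linux"):
        outer = detect_platform()
        with mock.patch("platform.system", return_value="Darwin"):
            inner = detect_platform()
        # Leaving the inner block restores the outer mock, not the real function.
        restored = detect_platform()
    return outer, inner, restored


if __name__ == "__main__":
    print(run_nested_case())
```

Read quickly, the two `with` blocks look like one patch, which is how an inner "Darwin" override can be missed when only the outer "Linux" one is noticed.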
*** e0ne has joined #openstack-infra17:55
*** electrofelix has quit IRC17:56
*** e0ne has quit IRC17:59
*** ramishra has quit IRC18:00
*** ykarel|away has quit IRC18:00
*** ykarel|away has joined #openstack-infra18:00
*** jamesmcarthur has quit IRC18:06
*** jamesmcarthur has joined #openstack-infra18:07
*** jamesmcarthur has quit IRC18:11
*** Garyx has quit IRC18:12
openstackgerritDonny Davis proposed openstack/project-config master: FN needs to give a little more juice  https://review.opendev.org/68156918:16
*** Garyx has joined #openstack-infra18:16
*** e0ne has joined #openstack-infra18:17
openstackgerritJames E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin  https://review.opendev.org/68077818:18
*** panda|ruck is now known as panda|ruck|off18:21
openstackgerritMerged opendev/bindep master: Fix test execution failure on Darwin  https://review.opendev.org/68096218:21
donnydI am gonna watch the utilization, if all is going well I am gonna push FN to its limits so hopefully people have to wait less on jobs18:24
*** jbadiapa has joined #openstack-infra18:27
*** ykarel|away has quit IRC18:28
openstackgerritMerged openstack/project-config master: FN needs to give a little more juice  https://review.opendev.org/68156918:34
openstackgerritJeremy Stanley proposed opendev/system-config master: Add several missing ssldomains to certcheck config  https://review.opendev.org/68157018:36
fungiclarkb: ^18:36
fungii spotted this because namecheap sent me a warning about the docs.starlingx.io cert coming up for renewal and on a whim i double-checked it18:37
fungiexpires in a few weeks, which got me wondering why i hadn't seen any warnings from certcheck about it... and that was why18:39
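The kind of check that would have flagged this can be sketched as follows. It assumes the certificate's `notAfter` string is already in hand, in the format Python's `ssl` module reports; the date and the 30-day threshold below are hypothetical values for illustration, not taken from the certcheck config.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now):
    """Whole days between `now` and a certificate's notAfter timestamp."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def needs_warning(not_after, now, threshold_days=30):
    """True when the certificate expires within the (hypothetical) threshold."""
    return days_until_expiry(not_after, now) <= threshold_days


if __name__ == "__main__":
    # Hypothetical expiry date, for illustration only.
    example_not_after = "Oct  2 12:00:00 2019 GMT"
    today = datetime(2019, 9, 11, 12, 0, 0, tzinfo=timezone.utc)
    print(days_until_expiry(example_not_after, today), needs_warning(example_not_after, today))
```

A check like this only helps for domains that are actually listed, which is the gap the change above closes.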
clarkbfungi: thanks18:48
*** pkopec has quit IRC18:51
*** eharney has quit IRC18:55
*** kjackal has quit IRC18:57
AJaegerdonnyd: thanks!19:01
donnyd:)19:01
donnydhopefully it has a bit more to give..19:02
AJaegerclarkb, fungi, want to +2A 681569 to give us some more nodes?19:02
AJaegeroh, fungi did already.. Thanks19:02
AJaegerand it's merged - mea culpa19:02
donnydIt hasn't actually started scheduling on it yet though19:02
AJaegerdonnyd: it first needs deployment run to change our config, that can take an hour19:03
donnydah i see19:04
AJaegerbut looking at http://grafana.openstack.org/d/T6vSHcSik/zuul-status?orgId=1 we have 20 nodes more already19:05
AJaegeralso visible here: http://grafana.openstack.org/d/3Bwpi5SZk/nodepool-fortnebula?orgId=119:05
AJaegerquota problem?19:05
AJaegerdonnyd: we don't reach more than 73 nodes19:06
donnydnah19:07
donnydquota is set at 10019:07
paladoxcorvus another thing to look out for is https://bugs.chromium.org/p/gerrit/issues/detail?id=7645 (i'm not sure if it affects 2.13) but does 2.14+19:07
fungiAJaeger: donnyd: i haven't looked at nodepool, but if there are quotas around processors, ram or disk those could also cause it not to boot additional nodes19:08
donnydi think that grafana may lag a little behind reality19:08
fungialso that, for sure19:08
paladoxwe are being heavily affected by that issue at wikimedia :( (had gerrit go down three times because of it)19:08
*** eharney has joined #openstack-infra19:08
donnydOh i disabled those in the beginning19:08
paladox*today19:08
donnydoh well i also guess i should put all my hypervisors back in service19:11
donnydoh well i also guess i should put all my hypervisors back in service19:12
openstackgerritAndreas Jaeger proposed openstack/project-config master: Fix secret for promote-tox-docs-special-base  https://review.opendev.org/68157919:12
AJaegerclarkb, fungi, sorry, another wrong name, this breaks infra-manual ^19:12
donnydshould be able to get about 16 more jobs out of it19:12
donnydnope... grafana is accurate19:14
donnydhttp://grafana.openstack.org/d/3Bwpi5SZk/nodepool-fortnebula?orgId=1&from=1568227429047&to=156822948912819:18
donnydbeen almost 30 minutes.. so i guess we must be waiting on something else maybe19:18
clarkbdonnyd: its set to max-servers: 90 in the config now19:22
clarkboh I see the max servers value went up but the actual used values did not. ya quota is likely to blame19:24
openstackgerritMerged opendev/system-config master: Add several missing ssldomains to certcheck config  https://review.opendev.org/68157019:25
donnydugg... i suck... must have turned it down for something or other19:26
donnydok it's set back to 120 where it should be19:26
*** bdodd has quit IRC19:27
*** pgaxatte has joined #openstack-infra19:33
*** pgaxatte has quit IRC19:36
openstackgerritAndreas Jaeger proposed openstack/project-config master: Add base promote job for moving static.o.o  https://review.opendev.org/68158219:44
openstackgerritAndreas Jaeger proposed openstack/project-config master: Switch governance sites to AFS publishing  https://review.opendev.org/68158319:44
openstackgerritAndreas Jaeger proposed openstack/project-config master: Move security site to AFS publishing  https://review.opendev.org/68158419:44
*** rh-jelabarre has quit IRC19:44
AJaegerianw: that should get us started with static publishing. The first job can merge directly, the others for each site. Is that what you have in mind?19:45
clarkbdonnyd: http://grafana.openstack.org/d/3Bwpi5SZk/nodepool-fortnebula?orgId=1 shows the expected jump in usage now19:45
donnydyea now its rolling along19:45
*** eharney has quit IRC19:45
clarkbcorvus: can you review https://review.opendev.org/#/c/681552/ if that looks sane I may hold off on implementing the cleanup playbook until after that gets in and we restart executors19:46
donnydI am going to add the last hypervisor back in, the other testing can wait19:46
clarkbif not then maybe we'll just run cleanup on every change again?19:46
donnydthen juice it up a little more19:46
*** rh-jelabarre has joined #openstack-infra19:48
*** eernst has quit IRC19:49
clarkbalso worth noting that while adding resources definitely helps with the backlog, much of that aid is negated if we keep having a flaky gate19:50
clarkbif anyone is able to help with that I can give direction but I can't really devote time to debugging why cinder tests fail or nova etc19:51
*** eernst has joined #openstack-infra19:52
*** panda|ruck|off has quit IRC19:55
*** panda has joined #openstack-infra19:57
AJaegerconfig-core, please review https://review.opendev.org/681579 to fix publishing for infra-manual19:58
donnydclarkb: yea that makes sense19:59
donnydIf you look at my grafana I don't push FN too hard because i don't want to see failing jobs from timeouts and what not20:00
*** eernst has quit IRC20:05
*** jamesmcarthur has joined #openstack-infra20:06
*** e0ne has quit IRC20:06
clarkbdonnyd: ya not saying you should spin it up more. Mostly want the projects consuming the resources to understand the best thing they can do to make the situation better is improve reliability of the jobs20:13
openstackgerritJames E. Blair proposed zuul/zuul master: Add enqueue reporter action  https://review.opendev.org/68113220:15
openstackgerritJames E. Blair proposed zuul/zuul master: Add no-jobs reporter action  https://review.opendev.org/68127820:15
openstackgerritJames E. Blair proposed zuul/zuul master: Add report time to item model  https://review.opendev.org/68132320:15
openstackgerritJames E. Blair proposed zuul/zuul master: Add Item.formatStatusUrl  https://review.opendev.org/68132420:15
openstackgerritJames E. Blair proposed zuul/zuul master: Add support for the Gerrit checks plugin  https://review.opendev.org/68077820:15
*** e0ne has joined #openstack-infra20:15
*** e0ne has quit IRC20:16
*** e0ne has joined #openstack-infra20:16
*** eernst has joined #openstack-infra20:17
corvusclarkb: looks legit20:17
clarkbcorvus: http://paste.openstack.org/show/775158/ is that a known swift problem?20:18
*** eernst has quit IRC20:18
clarkbalso thank you for the review20:18
clarkbcorvus: thats the create a container step right?20:19
corvusclarkb: i'm not sure i can tell what request failed from that20:20
corvusis there a traceback or anything?20:20
clarkbcorvus: no traceback, was just going off of the url20:20
clarkbfwiw that container exists and has 617 objects in it according to osc20:21
corvusso probably not the create step20:21
corvusclarkb: could be https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/upload-logs-swift/library/zuul_swift_upload.py#L511 ?20:23
corvusthat lacks a retry20:24
openstackgerritMerged openstack/project-config master: Fix secret for promote-tox-docs-special-base  https://review.opendev.org/68157920:26
*** markvoelker has quit IRC20:27
clarkbcorvus: maybe? but it looks like if that fails it is expecting to create the container?20:28
corvusclarkb: well, it got a 401 not a 404, so that may be an unhandled exception20:28
corvus(i'm not sure, but seems plausible)20:29
*** e0ne has quit IRC20:29
*** e0ne_ has joined #openstack-infra20:29
clarkbah20:29
clarkband 401 is what swift uses to say this isn't consistent yet right20:29
clarkbtimburke: ^ is my memory off there?20:29
timburke401 means i don't know who you are. typically, invalid token20:30
pabelangerclarkb: make sure there isn't another swift container owned by somebody else?20:30
timburkenot sure how you're uploading, but could also be an expired or otherwise invalid tempurl20:31
corvuspabelanger: this is rax swift (so more swiftlike than ceph)20:31
clarkbpabelanger: thats not a problem with swift aiui20:31
pabelangerkk20:31
corvusthis is not using tempurl20:32
clarkbI'm going to create a test object in that container20:33
donnydclarkb: is it just a coincidence that 16 jobs all finished at the same time???20:35
donnydi should have looked a little closer  3220:35
corvusdonnyd: they may have been aborted20:35
clarkbcorvus: I am able to create an object in that container from bridge20:35
donnydok... just want to make sure FN working correctly20:35
*** markvoelker has joined #openstack-infra20:36
*** kjackal has joined #openstack-infra20:37
openstackgerritPaul Belanger proposed zuul/zuul-jobs master: Also include nodepool inventory variables  https://review.opendev.org/68160120:39
*** factor has quit IRC20:40
*** markvoelker has quit IRC20:41
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: fetch-javascript-tarball: introduce zuul_do_synchronize  https://review.opendev.org/68160320:41
*** factor has joined #openstack-infra20:42
*** rh-jelabarre has quit IRC20:43
*** xenos76 has quit IRC20:47
*** ociuhandu has joined #openstack-infra20:50
*** exsdev has quit IRC20:51
*** rh-jelabarre has joined #openstack-infra20:52
*** exsdev has joined #openstack-infra20:53
*** ociuhandu has quit IRC20:57
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: fetch-javascript-tarball: introduce zuul_do_synchronize  https://review.opendev.org/68160321:02
*** slaweq has quit IRC21:10
*** slaweq has joined #openstack-infra21:11
*** rcernin has joined #openstack-infra21:15
*** slaweq has quit IRC21:18
*** jamesmcarthur has quit IRC21:20
mriedemclarkb: fyi in case you need to promote anything again https://review.opendev.org/#/c/681540 fixes one of the known gate race reset bugs21:24
mriedemthe one you pointed out to me this morning21:24
mriedemi just saw something else fail on it21:24
*** prometheanfire has joined #openstack-infra21:26
*** noama has quit IRC21:26
*** slaweq has joined #openstack-infra21:27
openstackgerritDonny Davis proposed openstack/project-config master: FN can bump up another 20  https://review.opendev.org/68161521:28
donnydthings don't look too bad and i just brought the other hypervisor online. It seems to be running the instances ok, so up 20 more21:29
*** slaweq has quit IRC21:32
*** goldyfruit_ has joined #openstack-infra21:32
guilhermespinfra-root: Is it possible to get a hold on openstack-ansible-deploy-aio_metal-debian-stable  over project openstack/openstack-ansible (review #670051 ).... btw, keys https://github.com/guilhermesteinmuller.keys21:33
guilhermespthanks in advance!21:33
clarkbmriedem: I have enqueued it21:33
clarkbguilhermesp: yes I can request the hold now21:34
guilhermespthanks clarkb ! let me know when I can recheck it21:34
*** goldyfruit___ has quit IRC21:34
clarkbguilhermesp: what can I put in the notes as far as the purpose (helps us to figure out when we can delete the node)21:34
guilhermespoh ok21:36
clarkbcorvus: re the swift thing I'm guessing we don't print full traceback to avoid leaking data?21:36
clarkbcorvus: I'm thinking that might be the next step in debugging is getting that info21:36
clarkbguilhermesp: just a short thing like "debugging osa bug foo" or whatever21:36
guilhermespinvestigate galera_new_cluster command that is trying to connect to database with different user than localhost/127.0.0.121:37
clarkbthanks21:37
guilhermespbtw, I can tell you too when I'm done with the node as usual21:37
clarkbit is in place and yes you should still tell us :) we add the notes so that if we miss you or whatever we have enough info to make a judgement call21:38
openstackgerritClark Boylan proposed opendev/base-jobs master: Add cleanup playbook to all base jobs  https://review.opendev.org/68132221:39
*** e0ne_ has quit IRC21:41
*** kjackal has quit IRC21:45
openstackgerritMerged openstack/project-config master: FN can bump up another 20  https://review.opendev.org/68161521:51
*** mriedem is now known as mriedem_afk21:53
ianwAJaeger: thanks will look today21:55
*** markvoelker has joined #openstack-infra21:59
*** panda has quit IRC22:00
*** panda has joined #openstack-infra22:03
guilhermespcool clarkb22:06
guilhermespso could I recheck it now?22:06
clarkbya should be ready for you to trigger the job22:07
guilhermespthanks clarkb ! rechecked22:07
clarkbI'm out running errands for the next little bit but can check in after to see if we have caught one yet22:07
*** slaweq has joined #openstack-infra22:11
*** whoami-rajat has quit IRC22:11
*** slaweq has quit IRC22:16
rm_workfungi: do you know if Matthias is on IRC? I don't know his nick22:20
rm_workI didn't mean to start a whole debate about this thing, I just want my patch merged, lol22:21
fungirm_work: yeah, i didn't mean to imply there was no value in patching it, but it's up to the horizon maintainers to decide, not the vmt22:23
*** rlandy is now known as rlandy|bbl22:24
fungirm_work: and yeah, i don't know for sure whether he hangs out in irc much. i want to say i've seen him on occasionally as "mrunge" but i don't see him in #openstack-horizon right now nor in recent channel history there22:25
corvusclarkb: i think we should be able to print the full traceback; we probably just need to work out the best way to do that in an ansible module.22:27
rm_workfungi: yeah I just shouldn't have checked the security box when i submitted the bug22:27
rm_worki didn't even submit it until after he told me i needed a LP# on my patch22:27
rm_work>_<22:27
rm_workand I already had a +2 on the patch once...22:27
donnydfungi: I have 10 more instances I can give over to the CI, but I want to watch it for a little bit22:32
fungidonnyd: sounds great! also, we should figure out how to get you listed at https://www.openstack.org/foundation/companies/#infra-donors now that you're providing in excess of 100 test nodes22:34
*** threestrands has joined #openstack-infra22:37
*** jcoufal has joined #openstack-infra22:38
cmurphyproposed another timeout increase for keystone https://review.opendev.org/681621 we have a better fix ready to go but it conflicts with everything trying to make it through the gate right now so would rather wait on it22:40
donnydfungi: That would be pretty cool22:46
fungidonnyd: "figure out" in this sense entails figuring out how you'd want to be displayed (name/logo), the actual logistics are mainly that we ask the osf website admins to stick it on there and they do. not terribly complicated22:48
*** mtreinish has quit IRC22:54
*** dchen has joined #openstack-infra22:55
*** goldyfruit_ has quit IRC22:57
*** tkajinam has joined #openstack-infra23:04
*** slaweq has joined #openstack-infra23:11
*** slaweq has quit IRC23:15
*** igordc has quit IRC23:18
*** jamesmcarthur has joined #openstack-infra23:32
*** igordc has joined #openstack-infra23:33
openstackgerritJames E. Blair proposed zuul/zuul master: Add support for the Gerrit checks plugin  https://review.opendev.org/68077823:34
*** exsdev has quit IRC23:40
*** igordc has quit IRC23:40
*** exsdev has joined #openstack-infra23:42
*** igordc has joined #openstack-infra23:42
*** sthussey has quit IRC23:42
*** jamesmcarthur has quit IRC23:46
*** jamesmcarthur has joined #openstack-infra23:46
*** mriedem_afk has quit IRC23:49
*** jamesmcarthur has quit IRC23:51
