Wednesday, 2018-03-21

*** wolverineav has quit IRC00:03
*** wolverineav has joined #openstack-infra00:04
*** pahuang has quit IRC00:05
*** xinliang has quit IRC00:07
*** odyssey4me has quit IRC00:10
*** odyssey4me has joined #openstack-infra00:10
*** germs has joined #openstack-infra00:11
*** germs has quit IRC00:11
*** germs has joined #openstack-infra00:11
*** germs has quit IRC00:15
*** pahuang has joined #openstack-infra00:17
*** wolverineav has quit IRC00:18
*** xinliang has joined #openstack-infra00:19
*** wolverineav has joined #openstack-infra00:21
corvusclarkb: it'll take a lot of test infrastructure to handle that case.  is it worth it?00:26
clarkbcorvus: maybe not? it just seems like the main reason for having that ordering system? A substitute for unittest setup may be to just add a couple plugins to a devstack only job that have a dep order and that will test that it works at an integration level00:28
*** wolverineav has quit IRC00:29
*** wolverineav has joined #openstack-infra00:30
*** r-daneel has quit IRC00:31
corvusclarkb: i can give it a shot.  i'll just note we're establishing a very high bar of additional testing for a self-testing change.00:37
clarkbcorvus: ya thats what I mean about just having a devstack up job that includes some plugins that use the feature00:38
clarkbnone would use it now but we could add that once they do00:39
clarkbrather than write an explicit test for it00:39
corvusthe change has been sitting for 4 months and i've forgotten who needed the feature.  :(00:39
clarkbbut then it would be self testing for consumers of the feature00:39
corvusi know that someone couldn't use the new devstack job without this.  i can't remember who.00:39
*** wolverineav has quit IRC00:40
corvusanyway, i'll try adding a test tomorrow00:41
clarkbI want to say one group was magnum/zun and depending on the generic docker plugin for devstack00:41
*** yamamoto has joined #openstack-infra00:43
*** yamamoto has quit IRC00:48
*** claudiub has quit IRC00:50
*** hongbin has joined #openstack-infra00:51
*** felipemonteiro_ has joined #openstack-infra00:52
*** wolverineav has joined #openstack-infra00:53
*** felipemonteiro__ has joined #openstack-infra00:54
*** wolverin_ has joined #openstack-infra00:55
*** felipemonteiro_ has quit IRC00:58
*** wolverineav has quit IRC00:59
*** diablo_rojo has quit IRC01:04
*** harlowja has quit IRC01:07
*** gcb has joined #openstack-infra01:08
*** andreww has quit IRC01:09
*** wolverin_ has quit IRC01:13
*** wolverineav has joined #openstack-infra01:13
*** wolverineav has quit IRC01:17
*** felipemonteiro_ has joined #openstack-infra01:32
*** felipemonteiro__ has quit IRC01:32
*** pahuang has quit IRC01:33
*** pahuang has joined #openstack-infra01:34
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [DNM] testing 554684  https://review.openstack.org/55468501:42
*** yamamoto has joined #openstack-infra01:45
*** pahuang_ has joined #openstack-infra01:45
*** pahuang has quit IRC01:45
*** dingyichen has joined #openstack-infra01:47
*** yamamoto has quit IRC01:51
*** cshastri has joined #openstack-infra02:00
*** agopi has joined #openstack-infra02:03
*** mriedem has quit IRC02:06
*** pahuang_ has quit IRC02:13
*** jamesmcarthur has joined #openstack-infra02:18
*** myoung|afk is now known as myoung02:23
*** myoung is now known as myoung|afk02:27
*** pahuang_ has joined #openstack-infra02:30
ianwOSError: [Errno 28] No space left on device: '/home/zuul/.ansible/tmp/ansible-local-12482nyahLb' ... i feel like i'm seeing this a lot02:33
*** yamamoto has joined #openstack-infra02:37
*** psachin has joined #openstack-infra02:38
ianwit's not on the executor is it? ...02:38
fungicruft on executors?02:39
clarkbI think it might be02:39
fungiwe can run out of space if we overrun them with load or leak cruft02:39
*** zhurong has joined #openstack-infra02:40
ianwhmm, two failures are02:41
ianwhttp://logs.openstack.org/05/554705/2/check/tripleo-ci-centos-7-containers-multinode/57bb822/02:41
ianwhttp://logs.openstack.org/05/554705/2/check/tripleo-ci-centos-7-undercloud-containers/dfe438c02:41
ianwone was ze02, the other ze06 i think ... neither seems that full02:42
fungithen probably the nodes02:42
fungiran in the same provider?02:43
openstackgerritYumengBao proposed openstack-infra/project-config master: Set up cyborg-specs repository  https://review.openstack.org/55397602:43
clarkbcould be the remotes then? it is possible the jobs do run out if 80gb isnt enough02:43
*** andreww has joined #openstack-infra02:44
ianwohhh, i wonder if https://review.openstack.org/#/c/553784/ is related.  i just noticed that looking at my particular failure02:46
*** salv-orl_ has joined #openstack-infra02:48
ianwno, i can't see that the test has even run long enough that it could fill up the disk02:48
*** andreas_s has joined #openstack-infra02:49
*** salv-orlando has quit IRC02:51
ianwhere's more out of disk errors ... http://logs.openstack.org/85/554685/3/check/dib-dsvm-functests-python2-centos-7-image/22ffbf6/job-output.txt.gz#_2018-03-21_01_55_38_50364902:53
*** rosmaita has quit IRC02:53
*** andreas_s has quit IRC02:53
*** yamamoto has quit IRC02:55
*** felipemonteiro_ has quit IRC02:55
*** wolverineav has joined #openstack-infra02:56
ianwhttp://logs.openstack.org/85/554685/3/check/dib-dsvm-functests-python2-centos-7-image/22ffbf6/job-output.json.gz ... no configure-swap role?02:57
*** pahuang_ has quit IRC02:57
clarkbhappening in different clouds too02:58
clarkbsize_available: 3730972672 according to ansible02:59
clarkbhttp://logs.openstack.org/05/554705/2/check/tripleo-ci-centos-7-containers-multinode/57bb822/zuul-info/host-info.primary.yaml02:59
clarkbthats less than 4GB03:00
clarkbianw: is it all centos 7? maybe something on the image isn't growfsing properly?03:00
ianwyeah ... i logged into a few and they seemed ok, but maybe a bad image is rolling out03:00
clarkband only 12GB or so total available it thinks03:01
clarkbso ya I think it must be a growfs problem on boot03:01
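The size clarkb quotes is a raw byte count from the Ansible host facts. A minimal sketch of the unit conversion, assuming binary units (GiB); the helper name `bytes_to_gib` is made up for illustration and is not part of any tool mentioned here:

```python
def bytes_to_gib(n_bytes: int) -> float:
    """Convert a raw byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


# Value quoted above from host-info.primary.yaml.
size_available = 3730972672

available_gib = bytes_to_gib(size_available)
print(f"{available_gib:.2f} GiB available")

# An 80 GB disk that was never grown on boot would look like this:
# the root filesystem stays at the size baked into the image.
print("under 4 GiB free:", available_gib < 4)
```

A check like this only shows the symptom; the growroot log lines that follow point at the cause.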
ianwMar 21 02:49:00 centos-7-rax-dfw-0003093035 growroot[732]: + growpart /dev/xvda 103:01
ianwMar 21 02:49:00 centos-7-rax-dfw-0003093035 growroot[732]: WARN: sector size not found in sfdisk output, assuming 51203:01
ianwMar 21 02:49:00 centos-7-rax-dfw-0003093035 growroot[732]: FAILED: failed to get start and end for /dev/xvda1 in /dev/xvda03:02
ianwsigh ...03:02
clarkbwe can rollback centos7 and pause image builds. At that point concern is it affecting the other images too03:03
clarkbianw: oh did we switch to gpt ? I wonder if growpart doesn't understand gpt03:03
ianwwe shouldn't have ... but new dib release is suspect i guess03:05
ianwPartition Table: msdos03:06
clarkbhttps://bugs.launchpad.net/ubuntu/+source/cloud-initramfs-tools/+bug/1087526 even if it was gpt looks like support was added ~5 years ago03:06
openstackLaunchpad bug 1087526 in cloud-utils (Ubuntu) "need support for gpt partition tables" [Medium,Fix released]03:06
clarkbso would be surprised if that broke it03:06
pabelangerHmm, seeing errors on fedora-2703:06
pabelangerhttp://logs.openstack.org/95/554695/31/check/ansible-role-statsd-fedora-27/e3ae030/ara/result/5a8e4e24-6cfa-42c6-960b-6ed013f00910/03:06
pabelangerany change with repos recently?03:07
ianw-bash-4.2# sfdisk --unit=S --dump /dev/vda03:07
ianwsfdisk: detected Disk Manager - unable to handle that03:07
clarkbpabelanger: that host has a proper 80GB at least03:07
ianwhttps://github.com/mmalecki/util-linux/blob/master/fdisks/sfdisk.c#L1524 wtf03:10
ianw  /dev/vda1   *        2048    26664575    13331264   53  OnTrack DM6 Aux303:10
*** pahuang_ has joined #openstack-infra03:10
openstackgerritmegan guiney proposed openstack-infra/project-config master: initial config for getting-started-with-openstack project  https://review.openstack.org/55476803:11
ianwthat's 0x5103:11
clarkbline 1521 is excellent03:11
clarkblets just pointer math at a magical offset03:11
clarkbianw: so it is a DM6 like it is mad about?03:11
ianwhttp://paste.openstack.org/show/707073/ ... parted seems to be ok, but fdisk is reporting this weird type03:12
ianwbut we didn't change the mbr portions of the code in dib03:13
clarkbits almost like they see two different partition tables03:15
*** rlandy|bbl is now known as rlandy03:15
*** sree has joined #openstack-infra03:16
*** wolverineav has quit IRC03:16
ianwhttps://review.openstack.org/#/c/533490/21/diskimage_builder/block_device/level1/partition.py03:16
*** sree has quit IRC03:16
*** dave-mccowan has quit IRC03:16
ianwline 59, what's 83 in hex03:16
*** wolverineav has joined #openstack-infra03:17
*** sree has joined #openstack-infra03:17
ianw0x53 ... which is this "OnTrack DM6 Aux3" type03:17
clarkb5303:17
clarkbso its an encoding error in the int?03:18
openstackgerritHongbin Lu proposed openstack-infra/irc-meetings master: Add Shengqin Feng as a chair of Zun meetings  https://review.openstack.org/55476903:18
*** sree_ has joined #openstack-infra03:18
clarkboh it shows up in the diff too so ya that seems like the likely problem03:18
*** sree_ is now known as Guest2197703:19
ianwyep ... and we would have gotten away with it too except for this meddling check for ancient weird disk managers in sfdisk03:19
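The mix-up diagnosed above is a decimal literal used where a hex one was meant. A small sketch of the arithmetic, assuming the intended value is the standard MBR type byte 0x83 (Linux); `parse_mbr_type` is a hypothetical helper for illustration, not the code in diskimage-builder or the fix proposed for it:

```python
# MBR partition type byte for a Linux filesystem.
LINUX_PARTITION_TYPE = 0x83

# The same digits written as a decimal literal give a different byte.
written_by_mistake = 83

print(hex(written_by_mistake))                     # the byte that lands in the table
print(written_by_mistake == LINUX_PARTITION_TYPE)  # the two values differ


def parse_mbr_type(text: str) -> int:
    """Parse a partition type given as hex text, e.g. '0x83' or '83'.

    Hypothetical helper: always interpreting the text as base 16 avoids
    the decimal/hex ambiguity discussed above.
    """
    value = int(text, 16)
    if not 0 <= value <= 0xFF:
        raise ValueError(f"partition type out of range: {text!r}")
    return value


print(parse_mbr_type("0x83") == LINUX_PARTITION_TYPE)
```

Decimal 83 is 0x53, which is the "OnTrack DM6 Aux3" type that fdisk reports earlier in the log.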
*** hongbin has quit IRC03:19
*** Guest21977 has quit IRC03:20
clarkbI'll happily review a fix for that before bed03:21
*** wolverineav has quit IRC03:21
*** sree has quit IRC03:21
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Fix default partition type  https://review.openstack.org/55477103:22
*** sree has joined #openstack-infra03:22
ianwclarkb: ^ i think at this stage, a dib point release with this is probably the way forward03:23
clarkbI guess integration testing didn't catch it because the image was big enough on disk to function for ssh03:23
*** dhajare has joined #openstack-infra03:23
ianwyeah, i'll add testing growroot to the todo list.  have to grep the logs on the booted system or something03:24
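One way the "grep the logs on the booted system" idea could look, as a rough sketch only. It assumes the boot log is available as plain text lines in the shape pasted earlier in this log; the function name and the matching rule are assumptions, not an existing dib or nodepool test:

```python
def growroot_failures(log_lines):
    """Return the growroot log lines that report a failure.

    Assumption: failures are tagged 'FAILED:' by growpart, as in the
    journal lines pasted above.
    """
    return [
        line
        for line in log_lines
        if "growroot[" in line and "FAILED:" in line
    ]


# Sample lines taken from the paste above (IRC timestamps removed).
sample = [
    "Mar 21 02:49:00 centos-7-rax-dfw-0003093035 growroot[732]: + growpart /dev/xvda 1",
    "Mar 21 02:49:00 centos-7-rax-dfw-0003093035 growroot[732]: WARN: sector size not found in sfdisk output, assuming 512",
    "Mar 21 02:49:00 centos-7-rax-dfw-0003093035 growroot[732]: FAILED: failed to get start and end for /dev/xvda1 in /dev/xvda",
]

failures = growroot_failures(sample)
for line in failures:
    print(line)
```

A functional test could fail the job whenever this list is non-empty, which would have caught an image that boots and accepts ssh but never grows its root partition.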
*** andreww has quit IRC03:24
clarkbI've +2'd the change and I agree a bugfix release seems in order03:24
ianwthx!  it's always something :/03:25
ianwohhhhh, but the dib gate is broken at the moment for another triple-o issue03:27
ianwand is also likely to hit this problem as it merges03:28
clarkbwe can pause image building across the board then delete the new images03:28
openstackgerritshangxdy proposed openstack-infra/project-config master: Fix ZUUL_USER_SSH_PUBLIC_KEY to support ssh key content  https://review.openstack.org/46791903:28
clarkbthen unpause image building once a dib release is out03:28
ianwyeah, doing that now03:28
clarkbassuming our old image of the pair is still working (I think it is)03:28
*** ramishra has joined #openstack-infra03:30
*** dhajare has quit IRC03:31
clarkbalso current nodepool no longer runs ready scripts so we may want to modify our tests to ssh explicitly if they don't already03:31
openstackgerritIan Wienand proposed openstack-infra/project-config master: Pause builds for dib 2.12.0  https://review.openstack.org/55478503:33
*** iyamahat_ has quit IRC03:33
*** dchen has joined #openstack-infra03:35
clarkband I guess nb03 is fine because it uses gpt03:35
*** pahuang_ has quit IRC03:35
clarkbI've approved ^03:35
ianwthanks, once in i'll delete all the new images.  that should get us to stability, tripleo can fix the dib job, and we can do a point release03:36
*** ykarel has joined #openstack-infra03:36
clarkbprobably worth an email to the dev list to let people know why their jobs had a sad and also to avoid 2.12.0 and wait for 2.12.103:41
clarkbI think its under control now though so I'm going to find a late dinner and bed03:41
ianwclarkb: thanks as always, ttyl03:41
*** jamesmcarthur has quit IRC03:44
*** zhurong has quit IRC03:44
*** yamamoto has joined #openstack-infra03:45
openstackgerritMerged openstack-infra/project-config master: Pause builds for dib 2.12.0  https://review.openstack.org/55478503:46
*** pahuang_ has joined #openstack-infra03:52
*** links has joined #openstack-infra04:00
*** links has quit IRC04:00
*** yamamoto has quit IRC04:07
*** yamamoto has joined #openstack-infra04:08
*** udesale has joined #openstack-infra04:09
ianwalright, puppet's rolled that config out hopefully04:14
ianwdeleting less used like fedora first to make sure we're ok04:19
*** eernst has joined #openstack-infra04:22
*** andreww has joined #openstack-infra04:27
*** yamamoto has quit IRC04:33
*** pgadiya has joined #openstack-infra04:35
*** harlowja has joined #openstack-infra04:39
*** rlandy has quit IRC04:39
*** eernst has quit IRC04:42
ianw#status log all today's builds deleted, and all image builds on hold until dib 2.12.1 release.  dib fix is https://review.openstack.org/554771 ; however requires a tripleo fix in https://review.openstack.org/554705 to first unblock dib gate04:44
openstackstatusianw: finished logging04:44
*** dhajare has joined #openstack-infra04:50
*** sree has quit IRC04:54
*** sree has joined #openstack-infra04:55
*** harlowja has quit IRC04:56
*** sree has quit IRC04:59
*** pahuang_ has quit IRC05:02
*** sree has joined #openstack-infra05:05
*** dchen has quit IRC05:05
*** dchen has joined #openstack-infra05:06
*** lpetrut has joined #openstack-infra05:06
*** dchen has quit IRC05:08
*** sree has quit IRC05:09
*** imacdonn has quit IRC05:14
*** imacdonn has joined #openstack-infra05:14
*** pahuang_ has joined #openstack-infra05:16
*** sree has joined #openstack-infra05:27
*** sree has quit IRC05:31
*** claudiub has joined #openstack-infra05:43
*** dsariel has joined #openstack-infra05:45
*** masuberu has quit IRC06:04
*** zhurong has joined #openstack-infra06:05
*** sree_ has joined #openstack-infra06:07
*** sree_ is now known as Guest8329406:08
*** Guest83294 has quit IRC06:12
*** jcoufal has joined #openstack-infra06:12
*** sree_ has joined #openstack-infra06:12
*** sree_ is now known as Guest5607606:13
*** germs has joined #openstack-infra06:13
*** germs has quit IRC06:13
*** germs has joined #openstack-infra06:13
*** ihrachys has quit IRC06:15
*** germs has quit IRC06:18
*** lpetrut has quit IRC06:19
*** jcoufal_ has joined #openstack-infra06:19
*** e0ne has joined #openstack-infra06:20
*** udesale has quit IRC06:21
*** jcoufal has quit IRC06:21
*** udesale has joined #openstack-infra06:21
*** jcoufal has joined #openstack-infra06:27
*** Guest56076 has quit IRC06:27
tobiashAJaeger, mordred: I'm +2 on https://review.openstack.org/554297 but didn't hit +3 in case someone else wants/should look on this06:28
*** yamamoto has joined #openstack-infra06:28
*** armaan has joined #openstack-infra06:28
*** jcoufal_ has quit IRC06:30
*** masber has joined #openstack-infra06:31
*** dbecker has quit IRC06:31
*** vaidy has quit IRC06:34
*** isviridov_away has quit IRC06:34
*** gus has quit IRC06:34
*** lpetrut has joined #openstack-infra06:35
*** StevenK has quit IRC06:35
*** sdake has quit IRC06:35
*** gus has joined #openstack-infra06:36
*** jbadiapa has joined #openstack-infra06:36
*** StevenK has joined #openstack-infra06:36
*** sdake has joined #openstack-infra06:37
*** sdake has quit IRC06:37
*** sdake has joined #openstack-infra06:37
*** isviridov_away has joined #openstack-infra06:40
*** e0ne has quit IRC06:40
*** jamesmcarthur has joined #openstack-infra06:44
*** pcichy has joined #openstack-infra06:44
*** dbecker has joined #openstack-infra06:46
*** vaidy has joined #openstack-infra06:46
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Pass NODEPOOL_ZK_HOST variable for py35 test  https://review.openstack.org/55481006:46
*** jamesmcarthur has quit IRC06:48
*** agopi has quit IRC06:49
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Allow external zookeeper in tox py35 runs  https://review.openstack.org/55481006:52
*** gongysh has joined #openstack-infra06:55
*** lpetrut has quit IRC06:55
*** logan- has quit IRC06:58
*** logan- has joined #openstack-infra06:58
AJaegertobiash: thanks. I'm fine with it, just doing one sanity check...07:04
*** kiennt26 has joined #openstack-infra07:04
*** jaosorior has quit IRC07:05
*** masber has quit IRC07:05
*** masber has joined #openstack-infra07:06
*** salv-orl_ has quit IRC07:12
*** sree_ has joined #openstack-infra07:13
*** sree_ is now known as Guest7002707:14
openstackgerritMerged openstack-infra/zuul master: Switch to stestr  https://review.openstack.org/53688207:14
*** alexchadin has joined #openstack-infra07:14
*** salv-orlando has joined #openstack-infra07:16
*** Guest70027 has quit IRC07:18
*** rcernin has quit IRC07:21
*** andreas_s has joined #openstack-infra07:26
*** yamamoto has quit IRC07:32
*** hashar has joined #openstack-infra07:33
*** diablo_rojo has joined #openstack-infra07:34
*** kjackal has joined #openstack-infra07:40
*** jaosorior has joined #openstack-infra07:44
*** ralonsoh has joined #openstack-infra07:46
*** priteau has joined #openstack-infra07:52
*** yamahata has joined #openstack-infra07:55
*** yamamoto has joined #openstack-infra08:00
*** yamamoto has quit IRC08:03
*** HeOS has joined #openstack-infra08:07
*** priteau has quit IRC08:07
*** danpawlik has joined #openstack-infra08:10
*** florianf has joined #openstack-infra08:13
*** yamamoto has joined #openstack-infra08:13
AJaegergaryk, boden, is https://review.openstack.org/#/c/554292/ and https://review.openstack.org/#/c/554245 working fine? It looks fine to me - and if you agree, I'll merge https://review.openstack.org/554297 and you can merge 55429208:14
*** yamamoto has quit IRC08:16
*** yamamoto has joined #openstack-infra08:16
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Fix zuul-web port in zuul-from-scratch doc  https://review.openstack.org/55482908:20
*** dingyichen has quit IRC08:24
*** krenczewski has quit IRC08:28
*** tesseract has joined #openstack-infra08:31
*** krenczewski has joined #openstack-infra08:35
*** amoralej|off is now known as amoralej08:36
*** masber has quit IRC08:40
*** jpena|off is now known as jpena08:43
*** tosky has joined #openstack-infra08:43
*** lpetrut has joined #openstack-infra08:45
*** lpetrut has quit IRC08:48
*** lpetrut_ has joined #openstack-infra08:48
*** claudiub has quit IRC08:50
*** tesseract has quit IRC08:51
*** tesseract has joined #openstack-infra08:52
*** tesseract has quit IRC08:54
*** tesseract has joined #openstack-infra08:57
*** lucas-afk is now known as lucasagomes08:59
*** jpich has joined #openstack-infra09:02
*** arxcruz|off is now known as arxcruz09:04
*** zhurong has quit IRC09:04
*** zhurong has joined #openstack-infra09:14
*** electrofelix has joined #openstack-infra09:14
*** duobei has joined #openstack-infra09:17
*** duobei has left #openstack-infra09:17
*** masber has joined #openstack-infra09:20
openstackgerriteldad marciano proposed openstack-infra/grafyaml master: Add datasource to template schema.  https://review.openstack.org/54836509:20
*** jesusaur has quit IRC09:21
*** jesusaur has joined #openstack-infra09:24
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add trigger driver  https://review.openstack.org/55483909:25
*** gongysh has quit IRC09:30
*** efoley has joined #openstack-infra09:33
*** derekh has joined #openstack-infra09:34
*** yamamoto has quit IRC09:42
*** yamamoto has joined #openstack-infra09:43
*** pgaxatte has joined #openstack-infra09:45
pgaxattehello09:45
*** markmcd has left #openstack-infra09:46
pgaxattecoreycb: I noticed a problem with mistral's pike release on ubuntu cloud archive09:47
*** yamamoto has quit IRC09:48
*** yamamoto has joined #openstack-infra09:48
*** yamamoto has quit IRC09:48
openstackgerriteldad marciano proposed openstack-infra/grafyaml master: Add datasource to template schema.  https://review.openstack.org/54836509:49
*** panda|off is now known as panda09:52
*** pbourke has joined #openstack-infra09:53
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add trigger driver  https://review.openstack.org/55483909:54
*** gfidente has joined #openstack-infra09:54
*** gfidente has joined #openstack-infra09:54
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: web: add reenqueue button  https://review.openstack.org/55485609:54
*** claudiub has joined #openstack-infra09:55
openstackgerritMerged openstack-infra/irc-meetings master: Add Shengqin Feng as a chair of Zun meetings  https://review.openstack.org/55476909:56
openstackgerritMerged openstack-infra/irc-meetings master: Remove WOS Mentoring Meeting  https://review.openstack.org/55472309:56
*** dizquierdo has joined #openstack-infra09:57
*** armaan has quit IRC10:09
*** armaan has joined #openstack-infra10:10
dmelladoHi, I've started seeing a few POST_FAILURES again10:10
dmelladois there something off in the infra?10:10
dmelladoAJaeger: rcarrillocruz ?10:10
*** priteau has joined #openstack-infra10:11
*** dhajare has quit IRC10:16
stephenfinAJaeger: Could you point me to the job definition that decides if we run the legacy docs build (setup.py build_sphinx) or not? I can't find it10:17
*** dhajare has joined #openstack-infra10:17
stephenfinIt seems a few of the merged 'topic:updated-pti' patches have inadvertently broken local docs builds and they weren't picked up in the gate because of that job's magic10:17
stephenfinsmcginnis: ^10:18
openstackgerriteldad marciano proposed openstack-infra/grafyaml master: Add datasource to template schema.  https://review.openstack.org/54836510:23
AJaegerstephenfin: it's in zuul-jobs, let me give you a link...10:25
AJaegerstephenfin: http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/sphinx/tasks/main.yaml10:26
AJaegerdmellado: for CentOS images? There was an email to the dev mailing list10:26
dmelladoAJaeger: this I've seen with Ubuntu image10:27
*** alexchadin has quit IRC10:27
dmelladoi.e. https://review.openstack.org/#/c/548309/10:27
*** alexchadin has joined #openstack-infra10:28
*** alexchadin has quit IRC10:30
AJaegerdmellado: there was a timeout at http://logs.openstack.org/09/548309/8/check/kuryr-kubernetes-tempest-lbaasv2/1b908b5/job-output.txt.gz#_2018-03-21_09_37_04_786322 - not sure why.10:33
dmelladoAJaeger: I'll recheck and see10:34
*** alexchadin has joined #openstack-infra10:35
*** boden has joined #openstack-infra10:39
*** dizquierdo has quit IRC10:39
*** alexchadin has quit IRC10:46
*** yamamoto has joined #openstack-infra10:48
stephenfinAJaeger: Ta. Email sent10:49
*** zoli is now known as zoli|lunch10:49
*** dtantsur|afk is now known as dtantsur10:49
*** yamahata has quit IRC10:52
bodenAJaeger hi, FYI left you a response here https://review.openstack.org/#/c/554245/2  seems we still may have some issues10:53
kashyapHey folks, just a quick thank you note:  paste.openstack.org is super fast and reliable.10:53
kashyapNice work there!10:54
*** yamamoto has quit IRC10:54
*** dizquierdo has joined #openstack-infra10:55
*** zhurong has quit IRC10:55
AJaegerboden: indeed - one step forward, one more problem found ;)10:56
*** yamamoto has joined #openstack-infra10:56
AJaegerboden: best discuss with mordred how to handle the vmware-api tests. He should be around soon10:56
bodenAJaeger: ack, I just need a pointer in the right direction… then I can go off and break more stuff :)10:56
AJaegerhope you and mordred find a solution.10:58
*** e0ne has joined #openstack-infra10:59
*** yamamoto has quit IRC11:01
*** yamamoto has joined #openstack-infra11:02
*** udesale_ has joined #openstack-infra11:04
*** numans is now known as numans_afk11:05
*** yamamoto has quit IRC11:06
*** udesale has quit IRC11:07
*** sshnaidm|sick is now known as sshnaidm11:08
*** dhajare has quit IRC11:08
*** gyankum has joined #openstack-infra11:10
*** udesale_ has quit IRC11:13
openstackgerriteldad marciano proposed openstack-infra/grafyaml master: Add datasource to template schema.  https://review.openstack.org/54836511:14
*** yamamoto has joined #openstack-infra11:16
*** yamamoto has quit IRC11:16
*** alexchadin has joined #openstack-infra11:16
*** katkapilatova has joined #openstack-infra11:21
*** pcichy has quit IRC11:27
*** adarazs is now known as adarazs_lunch11:27
*** dhajare has joined #openstack-infra11:27
*** cshastri has quit IRC11:27
*** snapiri has quit IRC11:28
*** snapiri has joined #openstack-infra11:28
*** numans_afk is now known as numans11:29
coreycbpgaxatte: hi, what's happening? we should move to #ubuntu-server for package issues.11:32
*** claudiub has quit IRC11:35
*** claudiub has joined #openstack-infra11:36
*** ldnunes has joined #openstack-infra11:36
*** jpena is now known as jpena|off11:39
*** jpena|off is now known as jpena11:40
ssbarneaany gerritbot expert around? I observed that it fails to spot stalled connections and do a reconnect.11:40
*** e0ne has quit IRC11:42
ssbarneahttps://storyboard.openstack.org/#!/story/200171411:46
*** yamamoto has joined #openstack-infra11:48
*** yamamoto has quit IRC11:52
*** dhajare has quit IRC11:54
*** dhajare has joined #openstack-infra11:54
*** dsariel has quit IRC11:58
*** jlabarre has quit IRC11:58
*** tpsilva has joined #openstack-infra11:59
*** e0ne has joined #openstack-infra12:00
*** adarazs_lunch is now known as adarazs12:01
*** e0ne has quit IRC12:02
*** yamamoto has joined #openstack-infra12:03
*** odyssey4me has quit IRC12:03
*** odyssey4me has joined #openstack-infra12:03
*** rfolco has joined #openstack-infra12:05
*** dprince has joined #openstack-infra12:06
*** rosmaita has joined #openstack-infra12:08
*** yamamoto has quit IRC12:08
*** zoli|lunch is now known as zoli12:11
*** dsariel has joined #openstack-infra12:13
*** e0ne has joined #openstack-infra12:15
*** lucasagomes is now known as lucas-hungry12:16
*** yamamoto has joined #openstack-infra12:18
*** jpena is now known as jpena|lunch12:20
*** dsariel has quit IRC12:21
*** yamamoto has quit IRC12:22
*** efried has quit IRC12:23
*** sambetts|afk is now known as sambetts12:24
*** efried has joined #openstack-infra12:24
openstackgerritJoshua Hesketh proposed openstack-infra/zuul master: WIP Retry merge jobs  https://review.openstack.org/55489012:27
*** dprince has quit IRC12:29
*** panda is now known as panda|lunch12:30
*** yamamoto has joined #openstack-infra12:33
*** trown|outtypewww is now known as trown|ruck12:34
*** rlandy has joined #openstack-infra12:35
*** yamamoto has quit IRC12:38
*** edmondsw has joined #openstack-infra12:40
*** VW has joined #openstack-infra12:41
openstackgerritDavid Moreau Simard proposed openstack-infra/system-config master: WIP: Rewrite launch-node.py in Ansible playbooks/roles  https://review.openstack.org/55489412:42
*** gcb has quit IRC12:42
dmsimardinfra-root: ^ I couldn't sleep last night so I did something12:42
*** dhajare has quit IRC12:42
*** pgadiya has quit IRC12:44
*** panda|lunch is now known as panda12:45
*** yamamoto has joined #openstack-infra12:48
*** jamesmcarthur has joined #openstack-infra12:50
*** florianf_ has joined #openstack-infra12:51
*** yamamoto has quit IRC12:53
*** rosmaita has quit IRC12:53
*** jamesmcarthur has quit IRC12:53
*** florianf has quit IRC12:53
*** dizquierdo has quit IRC12:59
*** felipemonteiro_ has joined #openstack-infra13:01
*** eharney has joined #openstack-infra13:01
*** felipemonteiro__ has joined #openstack-infra13:02
*** adarazs is now known as adarazs_afk13:02
*** kgiusti has joined #openstack-infra13:03
*** yamamoto has joined #openstack-infra13:03
*** germs has joined #openstack-infra13:04
*** germs has quit IRC13:04
*** germs has joined #openstack-infra13:04
*** dprince has joined #openstack-infra13:05
*** felipemonteiro_ has quit IRC13:06
fricklerinfra-root: google found out for me that we also publish the sources for zuul docs, doesn't look like it should be that way: https://docs.openstack.org/infra/zuul/_sources/admin/client.rst.txt13:06
dmsimardfrickler: errr that's inside _sources13:07
dmsimardhow did it even find that link ?13:07
dmsimarddo we need to add a robots.txt in there ?13:07
*** yamamoto has quit IRC13:08
fricklerdmsimard: not sure, but I also don't think the sources should even exist at that place on docs.o.o ?13:08
fricklerdmsimard: other question: I tried to autohold a node for debugging, how can I find out if there is indeed a held node for that?13:09
dmsimardfrickler: what I've been doing is doing a nodepool list --detail |grep hold (or held?) on one of the nodepool launchers13:09
dmsimardthere might be a better way but I feel there's a gap between zuul and nodepool for that13:10
AJaegerfrickler: sphinx publishes them...13:11
AJaegerfrickler: that's a sphinx variable to set13:11
*** udesale has joined #openstack-infra13:12
fricklerAJaeger: so is that published intentionally? it is the 4th hit when searching "openstack zuul autohold" btw13:14
AJaegerfrickler: html_copy_source and html_show_sourcelink handle this...13:14
fricklerdmsimard: that command seems to work, thx. waiting for the proper node to show up now13:15
fricklermordred: fyi, there is a held node attributed to you 19 days old. please check if you still need that one13:16
*** myoung|afk is now known as myoung13:18
*** yamamoto has joined #openstack-infra13:18
*** snapiri has quit IRC13:19
*** mriedem has joined #openstack-infra13:20
*** yamamoto has quit IRC13:22
*** alexchad_ has joined #openstack-infra13:24
*** jpena|lunch is now known as jpena13:25
*** alexchadin has quit IRC13:25
*** amoralej is now known as amoralej|lunch13:27
kashyapAny way to reduce the time for `git-review`?  This is just abysmal :-(13:28
kashyap$> time git review13:28
kashyapremote: Processing changes: updated: 1, refs: 1, done13:28
kashyap[...]13:28
kashyapremote:   https://review.openstack.org/534384 libvirt: Allow to specify granular CPU feature flags13:28
kashyap[...]13:28
kashyapreal    7m1.107s13:28
kashyapuser    0m1.102s13:28
kashyapsys     0m0.816s13:28
*** lucas-hungry is now known as lucasagomes13:29
*** eharney has quit IRC13:30
fricklerkashyap: please use paste.openstack.org for multiline pastes. also gerrit seems to be a bit slow for me at times, too, that might be related13:33
*** yamamoto has joined #openstack-infra13:33
kashyapfrickler: Yeah, I normally do use paste.o.o extensively.  Posted here as it was under 8 lines, saving people to open yet another URL13:33
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add trigger driver  https://review.openstack.org/55483913:33
kashyapThat's I even trimmed the output before I pasted here.13:34
kashyaps/That's/That's why/13:34
*** jlabarre has joined #openstack-infra13:34
*** adarazs_afk is now known as adarazs13:37
*** yamamoto has quit IRC13:38
fricklerkashyap: o.k., I agree that it is a borderline case ;) regarding the timing, do you see that every time or was it a one off? how long does a "git review -d" take for you in comparison?13:38
kashyapfrickler: (No worries; I myself correct people on other channels to paste.)  Yes, I saw that last night too :-(  About 6 minutes13:39
kashyapMade me put a fork in my eye & turn it until everything came out13:39
* kashyap tries `git review -d`13:41
fricklerkashyap: I agree that this is unreasonably long. could you also check the timings for "for i in 4 6;do time curl -I -$i https://review.openstack.org; done" please?13:42
kashyapWill try; first letting the `git review -d` run finish13:42
kashyapfrickler: BTW, my network speed is ... let's say "less than stellar"13:43
kashyapDownload: 3.72 Mbits/s13:43
kashyapUpload: 3.85 Mbits/s13:43
*** dklyle has joined #openstack-infra13:44
*** david-lyle has quit IRC13:44
fricklerkashyap: that still sounds pretty reasonable IMO. from your timing I feared you would come up with some kbit/s only13:45
kashyapfrickler: `git review -d 534384` is still running13:45
* kashyap goes to spend some time on other mailing lists based projects where I use a `git-send-email` + 'mutt' to apply 100s of patches instantaneously13:46
kashyap(To regain some sanity)13:46
fricklerkashyap: good or rather not good. but at least confirms that the other direction is affected, too13:46
kashyapfrickler: http://paste.openstack.org/show/707550/13:47
kashyap(The `curl` thing you asked for.)13:47
AJaegerfrickler: for the doc issue from earlier, see also http://sphinx.readthedocs.io/en/master/config.html#confval-html_copy_source - we need the sources for searching in the docs and proper display.13:47
AJaegerfrickler: we could add to the global robots.txt a line to disable _sources13:48
*** yamamoto has joined #openstack-infra13:48
AJaegerfrickler: want to send a patch for http://git.openstack.org/cgit/openstack/openstack-manuals/tree/www/static/robots.txt ?13:48
*** psachin has quit IRC13:48
fricklerAJaeger: ah, didn't know that the sources are used for that. I'll take a look at that, thx13:49
fricklerkashyap: o.k., that looks pretty normal to me, so I'm out of clue for now, sorry. maybe some other infra-root has more ideas13:50
*** ihrachys has joined #openstack-infra13:51
kashyapfrickler: No problem.  Meanwhile, your `git-review -d` is just finished:13:51
kashyap$> time git review -d 53438413:51
kashyap[...]13:51
kashyapreal    6m30.754s13:51
kashyapuser    0m0.729s13:51
kashyapsys     0m0.528s13:51
dmsimardkashyap: what repository ?13:53
*** yamamoto has quit IRC13:53
kashyapNova13:53
fricklerkashyap: just to confirm, this is a new situation for you in the last couple of days? or has it always been that bad for you?13:53
kashyapfrickler: It's new in the past few days.  In the past it was under 10 seconds13:53
dmsimardkashyap: is that a fresh repository clone ? or something you've been carrying for a while ?13:54
kashyapdmsimard: The latter13:54
kashyapFor a while.  It has multiple remotes13:54
*** alexchad_ is now known as alexchadin13:54
kashyap$> du -sh .git13:54
kashyap325M    .git13:54
dmsimardkashyap: can you try and reproduce from a fresh git.o.o clone ? It would probably be a good hint to point us in the right direction.13:54
kashyapdmsimard: Will this test be okay:13:54
kashyap(1) Do a fresh clone13:55
dmsimardNova is definitely one of those bigger repositories too13:55
kashyap(2) Time `git review -d`13:55
kashyap?13:55
dmsimardkashyap: sure -- and if you could compare with the pure git commands it would be helpful too. Like: git fetch https://git.openstack.org/openstack/nova refs/changes/84/534384/7 && git checkout FETCH_HEAD13:56
dmsimard(please use paste.o.o :D)13:56
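The ref in dmsimard's command follows Gerrit's change-ref layout: the last two digits of the change number, then the change number, then the patchset. Below is a minimal Python sketch of building that ref and timing a command, so that `git review -d` and the pure git fetch can be compared. The function names are hypothetical, and patchset 7 is simply the one quoted in the message above.

```python
import subprocess
import time
from typing import List, Tuple


def change_ref(change: int, patchset: int) -> str:
    """Build a Gerrit change ref such as refs/changes/84/534384/7."""
    # Gerrit shards change refs by the last two digits of the change
    # number, zero-padded to two characters.
    shard = f"{change % 100:02d}"
    return f"refs/changes/{shard}/{change}/{patchset}"


def fetch_command(repo_url: str, change: int, patchset: int) -> List[str]:
    """Return the pure-git fetch command for one patchset of a change."""
    return ["git", "fetch", repo_url, change_ref(change, patchset)]


def timed_run(cmd: List[str]) -> Tuple[int, float]:
    """Run cmd and return (exit code, wall-clock seconds)."""
    start = time.perf_counter()
    result = subprocess.run(cmd)
    return result.returncode, time.perf_counter() - start
```

Run inside a nova checkout, `timed_run(fetch_command("https://git.openstack.org/openstack/nova", 534384, 7))` and `timed_run(["git", "review", "-d", "534384"])` would give the two wall-clock figures to put in one paste.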
AJaegerkashyap: "real 0m38.751" for me13:57
kashyapdmsimard: As I said earlier, if it's under 6 lines, I normally just paste here; it's less work for everyone.  And ~6 lines is completely fine on IRC, IMHO13:57
kashyapBut yeah, anything above, I do use pastebin (extensively)13:57
*** zhipeng has joined #openstack-infra13:57
dmsimardkashyap: yeah but these are going to be a couple times ~6 lines, let's capture everything in the same paste :p13:58
kashyapdmsimard: Of course, I'll use paste for such entries.  Really, I myself correct people on other channels I help out on.  So no worries.13:59
*** bobh has joined #openstack-infra14:03
*** yamamoto has joined #openstack-infra14:03
kashyapdmsimard: I'm in a bit of a rush; I'll get it sometime tonight14:03
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Upgrade from angularjs (v1) to angular (v5)  https://review.openstack.org/55198914:04
dmsimardkashyap: okay -- my general train of thought was to try and isolate if the issue was coming from gerrit, from git-review, or from something else (git pack/garbage collection/etc?)14:04
dmsimardkashyap: git review is more or less a wrapper so just the test between "git-review -d" and pure git checkout is a good test14:05
*** dizquierdo has joined #openstack-infra14:06
kashyapYeah14:06
kashyapNoted; that's a good tip to remember14:06
*** yamamoto has quit IRC14:07
*** hongbin has joined #openstack-infra14:08
fricklerAJaeger: do you know whether /_sources/ in robots.txt would also match subdirectories? or does it need a full path like /infra/zuul/_sources/ ? if we need to do the latter for all the projects we publish, I guess we would need to write a tool to autogenerate it14:08
mordredfrickler: I do not still need the held node - lemme go delete it14:09
*** dsariel has joined #openstack-infra14:11
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Allow external zookeeper in tox py35 runs  https://review.openstack.org/55481014:13
*** yamamoto has joined #openstack-infra14:18
*** myoung is now known as myoung|rover14:19
*** esberglu has joined #openstack-infra14:19
*** myoung|rover is now known as myoung|rover|mtg14:20
*** yamamoto has quit IRC14:23
*** cshastri has joined #openstack-infra14:24
*** derekh has quit IRC14:25
*** r-daneel has joined #openstack-infra14:25
*** ykarel is now known as ykarel|away14:25
*** derekh has joined #openstack-infra14:27
AJaegerfrickler: it starts from root, so /_sources/ will not help, you need /infra/zuul/_sources/14:28
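Since Disallow rules are matched as path prefixes from the site root, the autogeneration tool frickler mentions would have to emit one rule per published doc tree. A minimal Python sketch under that assumption follows. The function names and the list of doc roots are hypothetical; only `/infra/zuul/_sources/` comes from the discussion.

```python
from typing import Iterable, List


def sources_disallow_rules(doc_roots: Iterable[str]) -> List[str]:
    """Return one Disallow rule per doc root, pointing at its _sources/ dir."""
    rules = []
    # De-duplicate after normalising slashes so "/infra/zuul/" and
    # "infra/zuul" produce a single rule; sort for stable output.
    for root in sorted({r.strip("/") for r in doc_roots}):
        prefix = f"/{root}" if root else ""
        rules.append(f"Disallow: {prefix}/_sources/")
    return rules


def render_robots_txt(doc_roots: Iterable[str]) -> str:
    """Render a robots.txt body that hides _sources/ from all crawlers."""
    lines = ["User-agent: *"] + sources_disallow_rules(doc_roots)
    return "\n".join(lines) + "\n"
```

For example, `render_robots_txt(["infra/zuul", "infra/nodepool"])` yields a `User-agent: *` line followed by one Disallow line for each of the two trees (the second root is made up for illustration).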
AJaegermordred: did you see boden's comment earlier on how to setup vmware-api so that it now checks out from git?14:29
*** ykarel|away has quit IRC14:30
*** gouthamr has joined #openstack-infra14:32
*** yamamoto has joined #openstack-infra14:33
mordredAJaeger: I did not - looking14:37
pabelangermorning14:37
pabelangerI'm going to try launching review-dev01.o.o again14:37
*** yamamoto has quit IRC14:38
*** hashar is now known as hasharAway14:39
*** amoralej|lunch is now known as amoralej14:40
smcginnispabelanger: 2.14 testing?14:40
AJaegermordred: so, https://review.openstack.org/#/c/554292/ is fine on the OpenStack CI side - but now vmware-api installs from pypi instead of from git. They need some guidance/tools - and developers as well - on how to test locally with those packages from git.14:41
pabelangersmcginnis: first upgrade to xenial, but yah eventually gerrit testing14:41
AJaegermordred: want to +A https://review.openstack.org/554297 ? I think it's good to go and has 2 +2s14:41
smcginnispabelanger: Nice!14:42
mordredAJaeger: gotcha. so - local testing of siblings things is definitely high on the todo list - there isn't a GREAT story for it this instant14:42
dansmithI'm not the only one seeing post_failures again recently, right?14:43
mordredAJaeger, boden: I've got a patch up to add a helper to pbr - although it might be better to add such a helper as a separate repo14:43
dansmithlooks like unreachable workers: http://logs.openstack.org/90/547990/6/check/legacy-tempest-dsvm-multinode-live-migration/28c2873/job-output.txt.gz#_2018-03-21_14_38_29_83770214:44
dansmithI saw at least one yesterday too14:44
mordredAJaeger: ok. pulling the trigger- hold on to your hats :)14:44
AJaeger;)14:44
*** yamamoto has joined #openstack-infra14:46
*** yamamoto has quit IRC14:46
bodenAJaeger mordred I’m still a little confused on how things work… how can we still install our dependent projects like neutron, sfc, etc. from git when running tox locally (outside of the gate)?14:46
*** felipemonteiro__ has quit IRC14:47
fricklerdansmith: saw a few of those, but not enough yet to establish a pattern. if you have multiple occurrences, you may want to check zuul-info/inventory.yaml whether they all fail on a particular provider14:47
*** dave-mccowan has joined #openstack-infra14:47
*** felipemonteiro__ has joined #openstack-infra14:47
dansmithfrickler: okay the one from just now is rax, FYI14:47
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Revert "Switch to stestr"  https://review.openstack.org/55494314:50
*** alexchadin has quit IRC14:50
*** germs has quit IRC14:51
*** gyankum has quit IRC14:51
*** felipemonteiro_ has joined #openstack-infra14:52
*** germs has joined #openstack-infra14:52
AJaegerboden, mordred , did you see http://lists.openstack.org/pipermail/openstack-dev/2018-March/128328.html ?14:54
*** ykarel|away has joined #openstack-infra14:55
*** ykarel|away is now known as ykarel14:55
*** felipemonteiro__ has quit IRC14:55
mordredAJaeger: I didn't  - but yeah, that's basically a good summary of the current state - we need it in places that aren't neutron/horizon related too14:56
bodenAJaeger I missed that detail.. I’ll have to spend some time munking with it to see if I can get it to work14:57
mordredAJaeger, boden: we also have a similar thing in python-openstackclient and made tox envs like this: http://git.openstack.org/cgit/openstack/python-openstackclient/tree/tox.ini#n5814:57
mordredthat's not ideal either though - which is why I started in on pbr siblings14:58
pabelangerhmm, fedora-27 nodes still having dnf issues14:58
bodenmordred AJaeger ok thanks.. I’ll parse that info… is this doc’d anywhere?14:58
pabelangergoing to hold one and see why that is14:58
pabelangerdoh14:59
pabelangermordred: dmsimard: could I get a +3 on: https://review.openstack.org/55462414:59
pabelangerneeded so we can 2 step review-dev01.o.o online14:59
*** agopi has joined #openstack-infra15:00
*** yamahata has joined #openstack-infra15:00
mordredpabelanger: wfm15:00
*** iyamahat has joined #openstack-infra15:00
openstackgerritMerged openstack-infra/zuul-jobs master: Uninstall and reinstall siblings one at a time  https://review.openstack.org/55429715:03
*** felipemonteiro__ has joined #openstack-infra15:07
clarkbkashyap: considering it was happening to github as well the other day I am guessing it is a problem local to you. You may want to tcpdump a fetch to see what is going on (and possibly fetch via http so that you can read a bit more of what is going on than if it were https)15:07
*** cshastri has quit IRC15:08
kashyapclarkb: Yeah, will investigate.  Under some duress to finish something else, before I run to my Dutch class in a few minutes15:08
kashyapFirst thing in the morning.15:08
*** felipemonteiro_ has quit IRC15:11
*** kien-ha has joined #openstack-infra15:12
*** electrofelix has quit IRC15:12
AJaegerboden: saw your comment - can you do a change on top of mine first and iterate on that? Once that works, we can discuss merging them in one - or approve both.15:16
AJaegerI'd like to keep a baseline ;)15:16
*** eernst has joined #openstack-infra15:22
*** jamesdenton has quit IRC15:24
*** zhipeng has quit IRC15:26
openstackgerritMerged openstack-infra/zuul master: Revert "Switch to stestr"  https://review.openstack.org/55494315:27
*** VW_ has joined #openstack-infra15:28
bodenAJaeger ok15:29
openstackgerritMerged openstack-infra/system-config master: Add gerrit_configure flag to review-dev01.o.o  https://review.openstack.org/55462415:30
*** VW has quit IRC15:30
*** agopi is now known as agopi|lunch15:31
fricklerinfra-root: I'll be afk now, but in case someone has time to continue investigating post failures, it does look like we have some significant increase in the last 48h or so: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22POST-RUN%20END%20RESULT_TIMED_OUT%5C%2215:31
clarkbfrickler: thanks, I'll probably dig into that once properly awake15:31
clarkbI was looking at a post failure for the tripleo change that dib needs to get the 2.12.1 release out and it appears to be different than the post failure linked by dansmith above15:32
*** VW_ has quit IRC15:32
*** VW has joined #openstack-infra15:32
*** yamamoto has joined #openstack-infra15:33
fricklerclarkb: yes, I saw two patterns, one the ssh hostkey changed during post and the other a timeout without further logs during fetch-devstack-log-dir15:33
clarkbI wonder if we had a zookeeper disconnection and nodepool recycled nodes (and their IPs)15:34
clarkbpabelanger: ^ what is the easiest way to check for that?15:34
corvusclarkb: nodepool doesn't recycle nodes15:35
clarkbcorvus: right it would be the cloud recycle IPs, nodepool would delete and make new ones15:35
clarkbzuul scheduler memory use is still looking good so I don't think we had a swapping situation result in zk disconnects15:35
fricklerclarkb: oh, now that you mention that, I was seeing this error when looking at held nodes earlier on nl01: "WARNING kazoo.client: Connection dropped: socket connection error: Permission denied"15:36
fricklerclarkb: I ignored it because the command seemed to succeed anyway, but it might be related15:36
corvusclarkb: ah yes.  it's possible the scheduler won't cancel a job if it loses the node.15:36
corvuswhich would cause weird behavior like this15:36
pabelangerclarkb: I usually grep debug log on scheduler looking for kazoo.client logging15:36
clarkbI think that would explain why some of the jobs basically timeout and others hit ssh key errors15:36
clarkbthe timeouts happen when cloud doesn't recycle the IP and the key errors when it does15:37
* clarkb checks the zk server first15:37
corvuskazoo doesn't appear in the last 3 log files15:38
clarkbplenty of disk and free memory on nodepool.o.o. The zookeeper log itself doesn't complain about anything since last october15:38
clarkband process has been running since january 2815:38
clarkbI think zk itself is fine15:38
*** dizquierdo has quit IRC15:39
*** florianf_ has quit IRC15:39
pabelangerclarkb: I've used http://status.openstack.org/elastic-recheck/#1721093 too to help track them. Last time we lost zookeeper connection, there was a huge spike since all nodes were being deleted15:40
corvusyeah, if this is a trend vs an event, it's less likely to be a zk issue15:40
corvuslogstash says they're all ovh-bhs115:41
clarkbcorvus: dansmith's was rax ord15:41
corvuslogstash says they're nearly all ovh-bhs115:41
clarkbhttp://logs.openstack.org/90/547990/6/check/legacy-tempest-dsvm-multinode-live-migration/28c2873/job-output.txt.gz#_2018-03-21_14_38_29_837702 is the log for that one15:41
clarkb(possible that dansmiths is separate issue)15:42
*** florianf has joined #openstack-infra15:42
*** kien-ha has quit IRC15:42
corvuslike maybe 5 out of 100 are not bhs115:42
corvusin fact, exactly 5 out of the 100 i'm looking at are not bhs115:43
clarkbreading nl01 logs for the IPs involved in dansmith's failure I don't think nodepool deleted the node early or booted the reused IP nodes quickly enough to cause a conflict15:46
*** kiennt26_ has joined #openstack-infra15:46
clarkbthe timestamps all line up: run dansmith's job to completion, a minute or two later ask the cloud to delete nodes, then half an hour later boot new instances with those same IPs15:47
clarkbso I think that does rule out the zk theory15:47
*** eharney has joined #openstack-infra15:48
*** VW has quit IRC15:50
*** VW has joined #openstack-infra15:50
*** eernst has quit IRC15:52
*** eernst_ has joined #openstack-infra15:53
clarkbin the dansmitch case I wonder if something in the devstack process is restarting networking and sshd is finding a new host key?15:54
clarkbfail at typing15:54
clarkbI'm going to look at a bhs1 case now15:54
*** eernst_ has quit IRC15:54
clarkbara renders these funny, I guess because zuul is preempting it15:56
*** eernst has joined #openstack-infra15:56
clarkbah yup the helpful ara float over tooltip thing says that it was interrupted15:57
*** eernst has quit IRC15:57
*** eernst has joined #openstack-infra15:57
*** rosmaita has joined #openstack-infra15:58
*** iyamahat has quit IRC15:58
*** yamahata has quit IRC16:00
*** efried is now known as efried_rollin16:01
clarkbcorvus: 2018-03-21 15:15:52,503 WARNING nodepool.CleanupWorker: Deleting leaked instance ubuntu-xenial-ovh-bhs1-0003103290 (e30d69e9-ec6b-4f51-8a18-3fee2e13b2c2) in ovh-bhs1 (unknown node id 0003103290)16:01
clarkbcorvus: nodepool seems to think the node leaked. It is still leaking after the job failed though16:02
openstackgerritSaju M proposed openstack/python-jenkins master: pypy is not checked at gate  https://review.openstack.org/55497116:03
clarkboh wait it appears to do the normal deletion first then the cleanup thread runs on its every minute run or whatever and catches it because it is still around16:03
corvusso maybe just a slow delete16:03
clarkbya I don't think that is odd anymore. Just a race for how quickly cloud can delete an instance16:04
*** eernst has quit IRC16:06
*** jlabarre has quit IRC16:07
*** kien-ha has joined #openstack-infra16:07
*** katkapilatova has left #openstack-infra16:07
*** kien-ha has quit IRC16:10
*** danpawlik has quit IRC16:11
jlvillalgerritbot review request: https://review.openstack.org/#/c/545469/  Some cleanup/refactoring and adding unit tests. Has one +2 Thanks!16:13
*** eernst has joined #openstack-infra16:16
*** andreas_s has quit IRC16:16
clarkbshort of networking trouble between the executor(s) and bhs1 I'm stumped. Running a ping with high packet count between ze07 (where one job timed out) to the bhs1 mirror lost no packets16:16
*** yolanda_ has joined #openstack-infra16:19
*** eernst has quit IRC16:19
*** yolanda has quit IRC16:19
*** derekh has quit IRC16:21
*** derekh has joined #openstack-infra16:22
*** wolverineav has joined #openstack-infra16:22
*** dsariel has quit IRC16:23
*** andreas_s has joined #openstack-infra16:26
*** masber has quit IRC16:27
*** trown|ruck has quit IRC16:28
dirkWasn't there some way of doing a mixin? E.g I want to inherit Openstack-tox but change the nodeset.. e.g. build on bionic instead of the default  xenial - for background of the question see https://review.openstack.org/#/c/554824/16:29
clarkbdirk: the zuul docs call it a variant.16:30
*** andreas_s has quit IRC16:30
corvusdirk: wow, that syntax would be surprisingly easy to implement at this point.  but no, it's not supported.16:31
corvusdirk: maybe just parent to cross-test and then change the nodeset?16:31
corvus(i hesitate to say this, but, to directly answer the question, if you just duplicated that job with a different parent line on each (as clarkb says -- variants), you'd get the mixin behavior.  i say i hesitate because i don't know if that behavior is going to be too confusing)16:36
dirkcorvus: yeah, there are ways to avoid multiple parents.. is that the best way?16:36
*** dizquierdo has joined #openstack-infra16:37
*** kiennt26_ has quit IRC16:37
*** danpawlik has joined #openstack-infra16:37
*** armaan has quit IRC16:37
*** ramishra has quit IRC16:37
dirkHmm, yeah. Does bionic support multiple python 3.x versions? E.g. is it likely that we end up with a single distro that can do all tox jobs at some point?16:38
dirkI'll try duplicate of the job for now16:38
*** gyee has joined #openstack-infra16:38
corvusdirk: i honestly don't know which way is best.  i've spent enough time with the algorithm that it seems reasonable to me; i'd like to know if others think mixins are reasonable, or too confusing/unmaintainable.16:38
corvusdirk: perhaps if folks do like the idea of mixins, we should legitimize it by adopting your 'list' syntax.16:39
dirkcorvus: the other possibility is a yaml variable reference16:39
dirkE.g. &openstack_py36_nodeset16:40
dirkThat is centrally defined to expand to whatever base distro is the choice for that tox flavor16:40
corvusdirk: yaml references will only work within the same file, so you could do that in that file, but not in project-config and use it in requirements16:40
corvusdirk: but we could define nodesets by purpose... ie, an actual nodeset called 'openstack-py36-nodeset'.16:41
persiaIsn't https://review.openstack.org/#/c/550235/ the current way to do that sort of thing?16:41
*** agopi|lunch has quit IRC16:41
persiaSo one would just define a new base nodeset (with properties) if one wanted to have specific environments?16:42
*** agopi|lunch has joined #openstack-infra16:42
*** danpawlik has quit IRC16:42
fungidirk: depends on what you mean by "support" but like xenial and trusty before it, the main archive for bionic only comes with a single python 3 interpreter (3.6.x). there are plenty of ways to install other versions of python on any of the platforms we offer depending on how complicated you want to make your jobs16:43
corvuspersia: well, we have a bunch of nodesets defined here: http://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/zuul.d/nodesets.yaml16:43
corvuspersia: so right now, we're generally saying things like "openstack-tox-py36 requires bionic".  i think dirk wants to say something like "requirements-py36 requires whatever openstack is using for py36"16:44
corvusso, another layer of indirection16:44
corvusone way of doing that is a 'mixin' of the openstack py36 job.  another is indirection in the nodeset reference.16:44
persiaRight.  Based on what I've been learning to try to understand our nodepool config, I would expect to use the "label" construct for that indirection.16:45
corvuspersia: i think we want to keep the labels descriptive of what the nodes actually are (eg 'bionic'), and only add the indirection for their use to either jobs or nodesets in zuul16:45
persiaThen I think I'll let this conversation conclude without adding more, and will want to have a different conversation about what to use when "bionic" isn't enough information to explain what a node *is*.16:46
persia(but I'm not prepared for the latter conversation yet)16:46
persiaBroad gloss being that "bionic" describes a set of versions of software, installed in a way, but may not specify behaviour, specific packages installed, or even the ABI of the platform, if we support multiple architectures.16:47
*** VW has quit IRC16:47
*** myoung|rover|mtg is now known as myoung16:48
fungiin this case, as a policy we've (openstack infra) decided to only support one image for any given distro+release16:48
*** VW has joined #openstack-infra16:48
corvuspersia: indeed; i believe the thought is that we can expand that as needed (eg, bionic-x86_64, bionic-arm [which is an awesome label name], etc)16:48
fungibut yes, we're i suppose extending that to distro+release+arch16:48
dirkcorvus: nodeset name by purpose sounds good to me as well16:49
* fungi wants a bionic arm now16:49
persiacorvus: Yes.  I'm not currently prepared, but there was talk about "bionic" being some-random-arch-of-bionic in the future.16:49
persiafungi: You can run jobs against ubuntu-xenial-arm now, if you like.  They work.  Bionic is mostly waiting for bionic to finish (there are some wrinkles) and/or for us to have more capacity.16:49
fungipersia: i know, i was joking about wishing i had an appendage made of a mix of meat and artificial technology16:50
*** eernst has joined #openstack-infra16:50
corvuspersia: perhaps we should anticipate this and go ahead and use ubuntu-bionic-x86 instead of ubuntu-bionic.16:51
corvusso, going forward, start encoding arch into labels16:51
* persia wishes there were better ways to achieve the physical equivalent of "straight man" humor on IRC16:51
fungipersia: i think i just did? ;)16:51
*** e0ne has quit IRC16:51
corvusbasically, it's just that until recently, distro+release was sufficiently descriptive; now we have a second axis16:51
persiacorvus: There are migration issues.  I hope to have enough time to think through enough to propose a early draft spec in the next couple weeks.  I would strongly suggest keeping just ubuntu-bionic as the node label for now.16:52
fungiyeah, i am wholeheartedly in favor of going ahead and extending our labels to include architecture for any new labels we add, and planning to go back and correct the old ones as well16:52
persiaBut, anyway, I am really interested in the best solution for purpose-based nodes (dirk's thing).  We should discuss that now, as I think we have the right input information for it :)16:52
fungifor bionic there shouldn't be any migration concerns there16:53
fungias we have said it's not officially supported and don't use it for voting jobs16:53
*** myoung is now known as myoung|food16:53
persiaWe already do, but only in a couple places, so it isn't actually that important for now.  There is migration work to do, but it is small for bionic.  Ideally, we'll sort the arch thing before we suggest projects test on bionic.16:53
fungias long as we decide to have ubuntu-bionic-x86_64 and ubuntu-bionic-aarch64 or whatever before we officially support its use then i don't see a problem16:54
fungi"migration" may need to be done, but should be entirely non-impacting as far as ongoing software development is concerned16:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Rename javascript package to @zuul-ci/dashboard  https://review.openstack.org/55199916:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Stop falling back to job name for missing url  https://review.openstack.org/55405616:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Use requests instead of urllib.request in tests  https://review.openstack.org/55405716:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: web: add /{tenant}/jobs/{job_name} route  https://review.openstack.org/55097816:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: web: add /{tenant}/projects routes  https://review.openstack.org/55097916:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: web: add /{tenant}/pipelines route  https://review.openstack.org/54152116:55
fungisince those jobs should remain non-voting for another month-ish16:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: dashboard: add /{tenant}/job.html page to display job details  https://review.openstack.org/53554516:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: dashboard: add /{tenant}/projects.html web page  https://review.openstack.org/53787016:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Fix indentation and renable the eslint rule  https://review.openstack.org/54567116:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Shift html templates into components  https://review.openstack.org/55132716:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Upgrade to webpack 4  https://review.openstack.org/55198716:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Upgrade from angularjs (v1) to angular (v5)  https://review.openstack.org/55198916:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Remove dashboard workaround for missing log_url  https://review.openstack.org/55406616:55
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Use glyphicons for status balls  https://review.openstack.org/55199216:55
corvuspersia: so if we wanted to do that, then i think we should define purpose nodesets.  a "python" nodeset we use for pep8 jobs, and a "python36" nodeset we use for openstack-tox-py36 and requirements-tox-py3616:55
persiacorvus: That was the thought I had, as it makes job definitions look like the project-config change I linked above.16:56
persiaAnd that means users can target specific nodesets, which can be composed of whatever infra thinks is a good idea at the time.16:56
openstackgerritsebastian marcet proposed openstack-infra/openstackid-resources master: Added endpoint to delete RSVP Question Value  https://review.openstack.org/55498916:57
corvuspersia: i'm having trouble connecting https://review.openstack.org/550235 to that suggestion (because it seems to be an example of a job specifying a complete anonymous nodeset, rather than using one which is purpose named)16:57
*** camunoz has joined #openstack-infra16:57
*** hasharAway is now known as hashar16:57
corvuspersia: but yes, otherwise i agree we could accomplish the thing you just said16:58
persiacorvus: Ah.  Apologies.  That nodeset is one that is capable of building wheels that match the cpython running for a bionic mirror.  That it happens to be the same nodeset that would be used by people testing with ubuntu-bionic is somewhat of a coincidence.16:58
persiaThe point being to just override the nodeset for a job with purpose-defined nodesets, rather than introducing a new feature to use different nodesets.16:59
openstackgerritMerged openstack-infra/openstackid-resources master: Added endpoint to delete RSVP Question Value  https://review.openstack.org/55498916:59
fungipurpose-named nodesets may also serve us well if what we want to be able to do is unconditionally swap out the backing node platform for one without needing to survive through another piecemeal migration like trusty->xenial ended up being16:59
*** eernst has quit IRC16:59
*** eernst has joined #openstack-infra17:00
persiaAnd such a migration may be more complicated now, as a greater portion of the job definitions now live entirely in project repos.17:00
fungiwe declare a flag day (like we did with precise->trusty) and if your jobs don't work on the new platform then you get to shift your development focus to fixing that before you can land other changes17:00
*** eernst has quit IRC17:01
*** eernst has joined #openstack-infra17:01
*** bradjones has quit IRC17:03
*** eernst has quit IRC17:03
*** udesale has quit IRC17:03
*** eernst has joined #openstack-infra17:03
*** danpawlik has joined #openstack-infra17:07
corvusyeah, i anticipated that just changing the nodeset for 'openstack-tox-py3X' would generally be sufficient, but that's only true for that job and its descendants.  so if you're doing something like the requirements cross-tests, you'd also need to update those if we didn't do one of the things we're talking about here.17:09
*** trown has joined #openstack-infra17:09
*** panda is now known as panda|off17:10
*** trown is now known as trown|lunch17:11
*** felipemonteiro__ has quit IRC17:11
*** danpawlik has quit IRC17:11
*** felipemonteiro__ has joined #openstack-infra17:11
*** armaan has joined #openstack-infra17:11
persia'openstack-tox-py3X' makes me wonder if it is possible to define a job in such a way that it always runs on *both* 'openstack-tox-py27' and 'openstack-tox-py36', or whether that ends up always needing to be two jobs.17:15
fungiwe mostly accomplish that by making two jobs and then grouping them in a project-template17:17
fungiso the project can just add that template instead of needing to add the jobs individually17:17
*** zoli is now known as zoli|gone17:18
persiaMakes sense.17:18
*** zoli|gone is now known as zoli17:18
fungia multinode job which just runs distinct tasks on each node with no communication between the nodes is almost (always?) better implemented as separate jobs since zuul can schedule them independently17:18
fungier, (almost?) always17:19
*** danpawlik has joined #openstack-infra17:20
clarkbok back to debugging bhs1 failures17:20
clarkbcorvus: pabelanger dmsimard do you know if the ssh connection manager thing logs its state anywhere on the executors? I'm wondering if that might give me a clue17:21
dmsimardclarkb: missing context.. what ssh connection manager thing ? Ansible ? Paramiko ?17:22
clarkbdmsimard: the ssh -o controlmaster process ansible uses to ssh to remote hosts on our executors17:22
dmsimardclarkb: you get the ansible literal ssh commands with "ansible -vvv" I believe17:22
clarkbreading the ssh man page it should go to stderr17:23
clarkbbut I'm not sure ansible is capturing that stderr anywhere17:23
dmsimardclarkb: do you see what you need in http://paste.openstack.org/show/707773/ ?17:24
dmsimardoh wait, that's not SSH, that's the special local connection thing, hang on17:24
*** danpawlik has quit IRC17:25
clarkbdmsimard: well in this case I'm hoping ansible/zuul are already logging it somewhere I'm not seeing so that I can review logs for jobs that have failed17:26
*** ykarel is now known as ykarel|afk17:26
*** pbourke has quit IRC17:27
dmsimardclarkb: with SSH: http://paste.openstack.org/raw/707777/17:28
dmsimardclarkb: zuul executors don't run ansible with -vvv unless they have verbosity activated17:28
dmsimardthey run with one -v by default iirc17:28
clarkbgotcha17:28
openstackgerritPaul Belanger proposed openstack-infra/zuul master: Enable autohold for RETRY_LIMIT  https://review.openstack.org/55499517:28
*** agopi|lunch is now known as agopi|17:28
clarkbcorvus: thoughts on turning that on to help debug the bhs1 network problems?17:28
*** agopi| is now known as agopi17:28
dmsimardpabelanger: added a comment on that patch17:29
*** florianf has quit IRC17:30
*** lucasagomes is now known as lucas-afk17:31
*** lpetrut_ has quit IRC17:32
*** jpich has quit IRC17:33
*** NobodyCam has quit IRC17:37
*** r-daneel has quit IRC17:37
*** dhajare has joined #openstack-infra17:37
*** r-daneel has joined #openstack-infra17:37
*** icey has quit IRC17:37
*** gus has quit IRC17:38
*** kuromagi has quit IRC17:38
*** kuromagi has joined #openstack-infra17:38
*** v1k0d3n has quit IRC17:38
*** gmann_ has quit IRC17:38
*** NobodyCam has joined #openstack-infra17:39
*** andreaf has quit IRC17:39
*** gus has joined #openstack-infra17:39
*** andreaf_ has joined #openstack-infra17:39
*** gmann_ has joined #openstack-infra17:40
*** icey has joined #openstack-infra17:40
*** felipemonteiro_ has joined #openstack-infra17:40
*** v1k0d3n has joined #openstack-infra17:40
*** efoley has quit IRC17:40
*** danpawlik has joined #openstack-infra17:40
clarkbas another datapoint my irc host is actually in bhs1 too17:40
clarkband I've not noticed any networking trouble17:40
openstackgerritMerged openstack-infra/zuul master: Add zuul-stream remote tests  https://review.openstack.org/55471417:41
*** andreaf_ is now known as andreaf17:41
*** felipemonteiro__ has quit IRC17:41
*** myoung|food is now known as myoung17:45
*** danpawlik has quit IRC17:45
dmsimardclarkb: I'm not up to date on BHS1.. can you summarize what's going on ?17:49
clarkbdmsimard: http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22POST-RUN%20END%20RESULT_TIMED_OUT%5C%22 shows a rather large number of post run timeouts in bhs1. They all seem to be due to ssh timing out at the end of the job, which times out the job; then the rest of the tasks end up working after that17:50
clarkbdmsimard: http://logs.openstack.org/03/529703/1/gate/nova-tox-functional/bd1f381/job-output.txt.gz#_2018-03-21_14_44_13_399218 is a specific example17:50
clarkbhttp://logs.openstack.org/97/554697/1/gate/openstack-tox-py35/7a5562b/job-output.txt#_2018-03-21_15_51_30_579759 is another17:51
clarkbinterestingly they seem to maybe all be failing getting result data. Perhaps there is some correlation there17:52
dmsimardWhere does "Copy files from /home/zuul/workspace/ on node" come from ? codesearch is turning up empty17:52
clarkbwhere do you see that?17:52
dmsimardah, it's not failing at the same place for every job17:52
clarkbcorrect, these are distinct jobs with different post playbooks17:53
dmsimardI found it here http://logs.openstack.org/32/554832/2/check/networking-odl-rally-carbon/668995e/job-output.txt#_2018-03-21_17_07_30_45083517:53
clarkbbut they do seem to be doing similar tasks17:53
clarkbbasically copying the data from test node to the executor17:53
*** pickle has quit IRC17:53
*** pickle has joined #openstack-infra17:53
clarkbya doing a synchronize pull to executor work root17:55
*** haleyb has quit IRC17:56
clarkbthinking about it does that rsync go through the controlmaster ssh connection?17:57
*** e0ne has joined #openstack-infra17:58
dmsimardbtw unrelated but I'm seeing repeated occurrences of puppet complaining: http://paste.openstack.org/show/707808/17:58
clarkbif not it could explain why the rest of the job is happy if it continues on over the controlmaster ssh connection17:58
dmsimardmordred: ^ in case you know what this is17:58
clarkbwhile the synchronizes (rsync) in particular are unhappy17:58
dmsimardclarkb: rsync over ssh basically opens an ephemeral rsyncd server on the other side before pushing the data.. right ?17:59
corvusclarkb: i know of no reason it wouldn't use the same controlmaster17:59
*** derekh has quit IRC18:00
*** sambetts is now known as sambetts|afk18:00
clarkbreading the docs and the code it appears to not use the same control master by default18:01
clarkbhttp://docs.ansible.com/ansible/latest/synchronize_module.html its an explicit flag you have to set: use_ssh_args18:01
dmsimardcorvus: the synchronize module is ... very confusing to say the least.18:01
clarkbI think that explains at least the mode of failure and why the rest of the job is generally happy18:01
clarkbit doesn't explain why rsync/synchronize are failing18:01
dmsimardclarkb: I've also come across a suggestion that we set "ansible_ssh_common_args" in the inventory instead of under the [ssh] block18:02
*** felipemonteiro__ has joined #openstack-infra18:02
dmsimardI'm not exactly sure why18:03
clarkbon ze07 the zuul user has >17k files open which could potentially cause problems, though it's unlikely that would be so cloud region specific18:04
clarkbmy normal login on that host has a 4k file limit18:05
*** pblaho has quit IRC18:05
*** VW_ has joined #openstack-infra18:05
corvusi switched ze01 to verbose18:05
*** VW_ has quit IRC18:05
*** VW_ has joined #openstack-infra18:06
corvusi want to see some of those command lines18:06
*** felipemonteiro_ has quit IRC18:06
dmsimardThere's several upstream issues around ssh args and the synchronize module... tl;dr it seems complicated to make it use what we want it to use (in this case -o ControlMaster ?) i.e, https://github.com/ansible/ansible/issues/1676718:08
*** VW has quit IRC18:08
corvusssh -o ControlMaster=auto -o ControlPersist=60s -o UserKnownHostsFile=/var/lib/zuul/builds/77b410b77f104d388a42304b7b0d9470/work/.ssh/known_hosts -o Port=22 -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=zuul -o ConnectTimeout=30 -o18:09
corvusControlPath=/var/lib/zuul/builds/77b410b77f104d388a42304b7b0d9470/.ansible/cp/....18:09
corvusthat's a typical ansible ssh invocation for reference18:09
corvus(not an rsync one)18:10
clarkbConnectTimeout is probably one that would speed up this failure mode if it were set18:10
clarkb(on the rsync)18:10
corvushrm, verbose doesn't appear to be outputting the rsync/ssh command.  i just see the module args.18:12
corvusi don't suppose it makes it into ara or zuul_json?18:12
*** danpawlik has joined #openstack-infra18:12
*** VW_ has quit IRC18:13
*** VW has joined #openstack-infra18:13
*** jpena is now known as jpena|off18:14
clarkb2018-03-21 18:14:06,585 WARNING kazoo.client: Connection dropped: socket connection error: Permission denied getting that trying to do a `sudo -H -u nodepool nodepool list` on nl01. I guess that is where frickler saw it and not in the logs (I do get the listing output though)18:14
clarkbok theorizing time18:15
*** dtantsur is now known as dtantsur|afk18:15
clarkbour bhs1 nodes have ipv6 addrs according to neutron18:15
clarkbBut they don't work because nothing on the node knows about them or how to configure them18:16
clarkband occasionally rsync is going to use the ipv6 address instead of ipv418:16
corvusoccasionally?18:16
clarkbcorvus: well I don't think all the jobs in bhs1 are failing18:17
clarkbcorvus: so I'm guessing there is some non determinism there? maybe order of ips returned by shade/nodepool? I dunno18:17
*** danpawlik has quit IRC18:17
corvusclarkb: i believe we only give ansible one ip address, so if v6 isn't showing up in the inventory file, it shouldn't be involved18:18
clarkbhttp://logs.openstack.org/03/529703/1/gate/nova-tox-functional/bd1f381/zuul-info/inventory.yaml ok and it is ipv4 there18:18
clarkbnodepool does list the public ipv6 addr under its listing though18:18
corvusyeah, but ansible_host is the important bit here18:18
mordredcorvus, clarkb: reading scrollback18:19
*** trown|lunch is now known as trown18:19
*** lpetrut has joined #openstack-infra18:20
clarkbis it possible that the ssh key manipulation that runs during the job would be confused by the nodepool data ?18:20
mordredok. it doesn't look like there's an immediate shade bug at least ...18:21
corvus/usr/bin/rsync --delay-updates -F --compress --archive --rsh=/usr/bin/ssh -S none -o Port=22 -o StrictHostKeyChecking=no --rsync-path=sudo rsync --safe-links --out-format=<<CHANGED>>%i %n%L zuul@158.69.64.111:/opt/stack/data/ca-bundle.pem /var/lib/zuul/builds/b425a5fddcf24242a85f5291aeb2b7a3/work/ca-bundle.pem18:21
*** EmilienM is now known as mimi18:21
*** mimi is now known as EmilienM18:21
corvusthat looks like a typical rsync command according to zuul_json18:21
corvusswitching ze01 to unverbose18:21
*** gfidente is now known as gfidente|afk18:22
mordredTIL synchronize uses ssh completely differently18:22
corvusyeah, i very much stand corrected on that18:22
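The difference between the two command lines pasted above can be checked mechanically. The following is a minimal sketch, not anything Zuul or Ansible ships: `summarize_ssh_command` is a hypothetical helper that only parses `-S` and `-o KEY=VALUE` arguments and ignores any ssh_config files. The first sample string is the `--rsh` value from the rsync invocation above; the second is an abbreviated form of the Ansible invocation with a made-up ControlPath, since the original path was elided.

```python
import shlex


def summarize_ssh_command(command):
    """Report connection sharing and ConnectTimeout for an ssh command line.

    Simplification: only -S and -o KEY=VALUE are inspected; ssh_config
    files and every other flag are ignored.
    """
    tokens = shlex.split(command)
    control_socket = None
    options = {}
    i = 1  # skip the ssh binary itself
    while i < len(tokens):
        if tokens[i] == "-S" and i + 1 < len(tokens):
            control_socket = tokens[i + 1]
            i += 2
        elif tokens[i] == "-o" and i + 1 < len(tokens):
            key, _, value = tokens[i + 1].partition("=")
            options[key.lower()] = value  # ssh option names are case-insensitive
            i += 2
        else:
            i += 1
    if control_socket is None:
        control_socket = options.get("controlpath")
    return {
        # "-S none" explicitly disables connection sharing
        "shares_connection": control_socket not in (None, "none"),
        "connect_timeout": options.get("connecttimeout"),
    }


# --rsh value taken from the rsync command in the log
rsync_rsh = "/usr/bin/ssh -S none -o Port=22 -o StrictHostKeyChecking=no"
# abbreviated Ansible invocation; the ControlPath below is hypothetical
ansible_ssh = (
    "ssh -o ControlMaster=auto -o ControlPersist=60s "
    "-o ConnectTimeout=30 -o ControlPath=/tmp/cp/example"
)
```

Run against those two strings, the rsync variant reports no shared connection and no ConnectTimeout, which is consistent with the observation that synchronize opens a fresh connection and can hang until the job timeout.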
dmsimardAccording to https://github.com/ansible/ansible/issues/16767#issuecomment-233898082 -- it seems a workaround is to tell the synchronize module to *really* use the SSH configuration we're running Ansible with18:23
dmsimardWhich is sort of unfortunate18:23
corvusdmsimard: well even that bug suggests that use_ssh_args would work for us18:24
dmsimardIt doesn't really explain why things are suddenly failing and (mostly) in bhs1 though18:24
*** harlowja has joined #openstack-infra18:24
corvusindeed -- we've found a difference, but not an explanation18:25
clarkbpoking around the only places that seem to use ipv6 are multinode roles that setup host keys, /etc/hosts, and firewall rules18:25
*** dhajare has quit IRC18:25
clarkbit is possible the /etc/hosts stuff would break on that but that would break in the job itself not post run18:26
clarkbalso many of these jobs are single node18:26
dmsimardDoes anyone know if we use custom ssh args in a synchronize module somewhere ? Looking at the upload-logs we don't do anything special.18:26
dmsimardI vaguely remember just falling back to a rsync command task due to this kind of nonsense before..18:26
corvusdmsimard: i think zuul v2.5 used rsync directly mostly due to trying to achieve compat with jenkins.  i think things got simpler with v3 and we can just use sync.18:28
dmsimardAh, found it. It was for another issue related to delegation of the synchronize task https://github.com/rdo-infra/ci-config/blob/master/jenkins/jobs/scripts/destroy-vm.sh#L77-L8818:28
corvus(btw, there's a suggestion in that bug about defaulting use_ssh_args to true.  and another about making it a config file option.  either would be nice)18:28
fricklerclarkb: ack on the kazoo.client warning, that's exactly what I saw, too. sorry for not having been more explicit18:29
dmsimardcorvus: It sounds like setting use_ssh_args to true would be a good call but it needs to be done on a task basis unless we ship a custom synchronize module (like we do for other modules)18:30
clarkb158.69.77.125 is a node that may have this happening to it18:30
clarkbthere is no active zuul user session but we have a console log daemon thing floating around18:30
clarkbI can ssh into it just fine as root18:30
corvuswhat about from the executor?18:30
*** haleyb has joined #openstack-infra18:30
corvus(which executor is it)18:31
clarkbI don't know haven't gotten that far :)18:31
clarkbI went from nodepool up not zuul down18:31
clarkbcourse I only have about 4 more minutes before it gets auto deleted :/18:31
corvusze01 build 83f63965727e4159951d4f9b9d15de2418:32
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Upgrade from angularjs (v1) to angular (v5)  https://review.openstack.org/55198918:32
dansmithclarkb: here's another POST_FAILURE that actually ran to completion but failed after for some different reason: http://logs.openstack.org/02/545002/14/check/nova-multiattach/280795c/job-output.txt.gz18:32
corvusze01 seems to be able to connect to that host over ssh18:33
corvusso if it's a network issue, it's a very transient one18:33
*** dizquierdo has quit IRC18:33
clarkbhrm that also isn't ending with a synchronize18:33
clarkbbut on the host I don't see zuul /me looks harder18:34
clarkbnetstat doesn't see an ssh either18:35
clarkbthe ip for that build uuid doesn't seem to match; that may explain it18:35
clarkboh maybe multinode18:35
clarkbso the task that is running is running on the other host? could mean this isn't exhibiting this problem in that case18:36
dmsimardin /tmp/tmp_hosts you have 158.69.77.125 and 158.69.77.13618:36
* dmsimard looks on 158.69.77.13618:36
clarkbI'm probably wrong about that host then18:36
clarkbif its multinode then it is likely just busy on the other node18:36
dmsimardit's odd that less is not installed on those machines18:38
clarkbdmsimard: they are based on the minimal elements from dib which is very minimal18:39
dmsimardyeah, it breaks journalctl and man pages (amongst probably other things)18:39
clarkblooking at the total number of jobs running on bhs1 this does seem to be fairly intermittent18:39
dmsimardeven vi/vim isn't installed *gasp*18:39
clarkbI expect that if controlmaster fails during pre we just get a new node and try again and never notice. If it manages to connect things work because controlmaster18:40
clarkbthen if you are lucky in post a new connection for rsync will fail18:40
clarkbcorvus: is the controlmaster process shared across ansible processes?18:40
dmsimardI'm not sure what "auto" does18:40
*** armaan has quit IRC18:42
corvusclarkb: i think http://git.openstack.org/cgit/openstack-infra/zuul/commit/?id=a86aaf1158b2153e5aed5ae1fd550962330d01dc  explains18:43
dmsimardhmm, we're not setting a controlpath ? Should we be doing that ?18:43
*** rosmaita has quit IRC18:43
corvusdmsimard: we do set a controlpath18:43
*** yamamoto has quit IRC18:43
dmsimardcorvus: oh, I missed it in your paste, you're right18:43
openstackgerritPaul Belanger proposed openstack-infra/zuul master: Enable autohold for RETRY_LIMIT  https://review.openstack.org/55499518:43
clarkbya so that fits into my current thinking18:44
clarkbit is intermittent enough that most jobs start fine and get through pre with a working connection then won't fail until the end when the rsync fails18:44
*** andreas_s has joined #openstack-infra18:45
clarkbcorvus: we weren't getting any additional rsync logging from rsync itself, were we, when you turned on verbosity?18:45
corvusclarkb: that seemed to be the case18:46
corvusi only observed the module invocation dictionary as additional invocation18:46
clarkbI wonder if we could look for failed pres18:46
clarkband catch ssh logging for that18:46
dmsimardcorvus: I vaguely remember an issue where the controlpath path was too long... the one you pasted above seemed long enough, can you paste what the full control path actually looks like ?18:46
clarkb(this assumes that is happening at all which I don't have real evidence of yet)18:47
openstackgerritPavlo Shchelokovskyy proposed openstack/os-testr master: Use subunit and stestr API more  https://review.openstack.org/50975218:47
corvusdmsimard: it was only a little bit longer: /var/lib/zuul/builds/77b410b77f104d388a42304b7b0d9470/.ansible/cp/658a09534618:47
dmsimardcorvus: okay, so it's not that then. cool.18:47
*** Swami has joined #openstack-infra18:48
dmsimardclarkb: ssh logging where ? you mean on nodepool nodes ? or on the executor ?18:49
*** ralonsoh has quit IRC18:49
clarkbdmsimard: the executor18:49
clarkbdmsimard: to see what the failure condition is18:49
*** andreas_s has quit IRC18:50
dmsimardok, fwiw the output of journalctl -u ssh on 158.69.77.136 (paste on fedoraproject due to paste.o.o truncating) https://paste.fedoraproject.org/paste/ppkb1pxXlhTiqs0bi6RobQ/raw18:51
*** lpetrut has quit IRC18:51
*** danpawlik has joined #openstack-infra18:52
clarkbdmsimard: ya I'm no longer convinced that pair of nodes was having problems18:52
clarkbI missed the fact that multinode could mean similar18:52
clarkb*similar no zuul connection behavior18:52
dmsimardI don't remember seeing this kind of odd message before "Mar 21 17:37:36 ze03 sshd[3995]: Received disconnect from x.x.x.x port 42100:11: Normal Shutdown, Thank you for playing [preauth]"18:53
bkerohttp://lists.mindrot.org/pipermail/openssh-unix-dev/2014-January/031953.html18:54
dmsimardSome software have funny messages :)18:56
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Fix zuul-web port in zuul-from-scratch doc  https://review.openstack.org/55482918:56
*** danpawlik has quit IRC18:56
*** jlabarre has joined #openstack-infra18:57
dmsimardclarkb: have we isolated whether or not this is only occurring on synchronize tasks ? It's worth trying use_ssh_args if so -- it still won't explain the sudden ovh issues but if it works it's a worthwhile data point18:57
dmsimard(It seems to be only synchronize tasks)18:57
clarkbdmsimard: I've not examined each task no. But the ones I have looked at are synchronizes18:57
dmsimardI'll look at a couple18:58
corvusi also wonder, based on clarkb's multinode nodeset from earlier whether multinode jobs might be more likely to hit the controlpersist timeout and end up opening new connections on tasks mid-run18:59
corvusif we see multinode jobs hitting errors in bhs1 on non-synchronize tasks, that may be happening.  if we aren't, then i wonder why it isn't happening.19:00
*** lpetrut has joined #openstack-infra19:00
dmsimard9 out of 9 are synchronize tasks19:01
dmsimardoh, hey.. I know, we store these in graphite now, let's see when they started happening19:01
* dmsimard looks19:01
dmsimardmy graphite-fu is rusty but the data points are in: stats_counts.zuul.executor.ze*_openstack_org.phase.*.RESULT_TIMED_OUT (or stats.zuul.executor.ze*_openstack_org.phase.*.RESULT_TIMED_OUT .. I'm not sure what's the difference between the two)19:05
corvuslogstash says the rate has increased starting around 36 hours ago19:07
corvusmaybe only 30 hours ago.  hard to say.19:07
clarkbat a rate of 3-4 an hour?19:09
clarkbat least for the last 6 hours19:09
*** jcoufal has quit IRC19:12
openstackgerritPaul Belanger proposed openstack-infra/zuul master: Enable autohold for RETRY_LIMIT / POST_FAILURE  https://review.openstack.org/55499519:13
dmsimardThat seems to strangely correlate with an undergoing network maintenance in the BHS datacenter: http://status.ovh.com/?do=details&id=1532819:13
dmsimardWhich started yesterday19:13
*** felipemonteiro__ has quit IRC19:13
*** felipemonteiro__ has joined #openstack-infra19:13
dmsimard(correlation != causation but just mentioning)19:14
openstackgerritPaul Belanger proposed openstack-infra/zuul master: Enable autohold for RETRY_LIMIT / POST_FAILURE  https://review.openstack.org/55499519:16
clarkbdmsimard: ya the more I dig into this the more I think it is likely a provider side issue. I can't find anywhere we'd use ipv6 that would affect this and break. Our images work most of the time and the jobs aren't consistent enough to point to a specific job19:17
clarkbthe one thing consistent on our end appears to be synchronize but I think that is more innocent bystander not using the controlmaster than cause19:17
corvusyeah, i think our two next steps are: 1) give the ovh folks a heads up that we're seeing more connection timeouts than before.  2) start adopting use_ssh_args=true in our log copying tasks.19:19
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Return CORS headers on all requests  https://review.openstack.org/55502719:20
*** myoung is now known as myoung|biab19:22
*** ykarel|afk is now known as ykarel|away19:22
*** dprince has quit IRC19:23
*** jaosorior has quit IRC19:23
*** savihou has joined #openstack-infra19:23
*** dprince has joined #openstack-infra19:23
*** savihou has quit IRC19:24
*** savihou has joined #openstack-infra19:25
*** eharney has quit IRC19:26
*** sree has joined #openstack-infra19:26
*** savihou has quit IRC19:28
*** savihou has joined #openstack-infra19:28
*** savihou has quit IRC19:28
*** eharney has joined #openstack-infra19:28
*** ykarel|away has quit IRC19:28
*** danpawlik has joined #openstack-infra19:28
prometheanfirecan I get a review (one more) on glean, https://review.openstack.org/54860419:30
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Allow external zookeeper in tox py35 runs  https://review.openstack.org/55481019:30
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Change test prints to log.info  https://review.openstack.org/55405819:30
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Fix logging in tests to be quiet when expected  https://review.openstack.org/55405419:30
openstackgerritMonty Taylor proposed openstack-infra/zuul master: Add license and downgrade exception to alembic template  https://review.openstack.org/55405519:30
*** sree has quit IRC19:31
*** danpawlik has quit IRC19:33
*** salv-orlando has quit IRC19:34
*** salv-orlando has joined #openstack-infra19:35
*** efried_rollin is now known as efried19:37
*** tesseract has quit IRC19:38
*** salv-orlando has quit IRC19:38
*** pickle is now known as dhill_19:39
fungiclarkb: not sure if you saw, but 552667 has a non-foundation-staff +2 now too19:40
fungiprobably best if the ptl still approves that one, i suppose19:40
clarkbI'm running a while ssh clarkbsirchost 'echo foo' ; do sleep 5 ; done to see if I can catch ovh failing to my irc box19:40
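The shell loop above keeps probing until the first ssh failure. A rough Python equivalent, as a sketch only: `ssh_echo` and `probe_until_failure` are hypothetical names, and the `ConnectTimeout=30` option is an added assumption (the original loop did not set it) so that a hung connection counts as a failure instead of blocking.

```python
import subprocess
import time


def ssh_echo(host):
    """Return True if `ssh HOST 'echo foo'` exits 0."""
    result = subprocess.run(
        # ConnectTimeout is an assumption, not part of the original loop
        ["ssh", "-o", "ConnectTimeout=30", host, "echo foo"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def probe_until_failure(check, interval=5, max_attempts=None, sleep=time.sleep):
    """Call check() repeatedly, sleeping `interval` seconds after each success.

    Returns the number of successful probes before the first failure, or
    max_attempts if that many probes succeed without any failure.
    """
    successes = 0
    while max_attempts is None or successes < max_attempts:
        if not check():
            return successes
        successes += 1
        sleep(interval)
    return successes
```

Something like `probe_until_failure(lambda: ssh_echo("clarkbsirchost"))` would mirror the one-liner; a single client host only exercises one network path, so a clean run does not rule out trouble elsewhere in the region.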
openstackgerritMerged openstack-infra/zuul master: Fix zuul-web port in zuul-from-scratch doc  https://review.openstack.org/55482919:40
clarkbfungi: ok will look19:40
fungiclarkb: i'd give it even chances that the failures only impact certain instances at random (possibly those scheduled to certain hosts or something) rather than everything in their network19:42
clarkbfungi: ya likely19:43
clarkbalso is _ valid in unix username?19:43
corvusclarkb, fungi: not sure if we want to just ping infra-root or something to let folks know about https://review.openstack.org/55266719:43
*** dprince has quit IRC19:43
corvusoh i just did19:43
fungiclarkb: that's a very good question, but hopefully one diablo_rojo has an answer to19:44
*** jaosorior has joined #openstack-infra19:44
clarkbcorvus: ya I'm reviewing it now. as soon as I'm happy with the _ I will +2.19:44
clarkb(and approve once infra-root is done with it?)19:44
*** yamamoto has joined #openstack-infra19:44
pabelanger+219:44
*** salv-orlando has joined #openstack-infra19:44
clarkbits valid as a filepath so should be fine for the homedir19:45
corvusit apparently matches the regex in debian's adduser19:46
clarkbya internet seems to think anything that is a valid C identifier is fine19:47
clarkbso I think this should be fine19:48
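The "valid C identifier" rule of thumb mentioned above is easy to encode. This is only a sketch of that rule, under the assumption that it is a reasonable proxy; the real acceptance rules depend on the distribution's user-management tools, and `is_c_identifier` is a hypothetical helper, not part of any of them.

```python
import re

# A C identifier: a letter or underscore, then letters, digits or underscores.
C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_c_identifier(name):
    """Return True if `name` is shaped like a C identifier."""
    return C_IDENTIFIER.fullmatch(name) is not None
```

Under this check a name such as `some_user` passes, while names starting with a digit or containing a dash do not.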
*** yamamoto has quit IRC19:49
clarkbI've +2'd it will give it until after lunch for other roots to chime in and approve if there is no opposition19:50
*** rfolco is now known as rfolco|ruck19:51
openstackgerriteldad marciano proposed openstack-infra/grafyaml master: Add datasource to template schema.  https://review.openstack.org/54836519:54
*** VW has quit IRC19:54
openstackgerritDoug Hellmann proposed openstack-infra/openstack-zuul-jobs master: add openstack-tox-lower-constraints  https://review.openstack.org/55503419:54
*** VW has joined #openstack-infra19:54
*** eharney has quit IRC19:57
*** ekhugen has quit IRC19:59
*** danpawlik has joined #openstack-infra20:01
openstackgerritmegan guiney proposed openstack-infra/project-config master: initial config for getting-started-with-openstack project  https://review.openstack.org/55476820:02
*** ekhugen has joined #openstack-infra20:03
*** danpawlik has quit IRC20:06
pabelangernice, review-dev01.o.o is online20:12
pabelangerI'll now work on moving the volume from review-dev.o.o to review-dev01.o.o20:13
*** eharney has joined #openstack-infra20:13
ianwclarkb: https://review.openstack.org/#/c/554705/ ... hmmm tripleo is not looking happy20:14
ianwat this point i could force-merge a change to remove the test from dib, and then we could progress with fixing all that20:14
*** Krenair has quit IRC20:14
clarkbianw: ya I was going to ask if we thought that was prudent. Considering there are other users of dib probably20:15
clarkbianw: you may not need a force merge since it should be self testing config update?20:15
pabelangerianw: clarkb: +120:15
*** priteau has quit IRC20:15
ianwoh, yeah, doh, it will drop the test20:16
pabelangerokay, powering off review-dev.o.o20:17
*** Krenair has joined #openstack-infra20:18
clarkbmy ssh test to my bhs1 node never failed. I have stopped it. Likely only affecting certain subnets or l2 addresses over specific lacp links etc20:18
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Remove tripleo jobs  https://review.openstack.org/55503720:19
*** Krenair has quit IRC20:22
*** gouthamr has quit IRC20:26
*** camunoz has quit IRC20:29
prometheanfirepabelanger: mnaser https://review.openstack.org/548604 please?20:29
*** Krenair has joined #openstack-infra20:30
*** armaan has joined #openstack-infra20:30
*** salv-orlando has quit IRC20:32
*** jaosorior_ has joined #openstack-infra20:33
*** danpawlik has joined #openstack-infra20:33
clarkbanyone else want to ack https://review.openstack.org/#/c/555037/ ?20:34
*** kgiusti has left #openstack-infra20:36
dhellmanndo any of you have tools you use for making automated edits to yaml files? I have something that preserves the order, but not whitespace or comments.20:36
logan-ruamel.yaml allows you preserve and manipulate comments20:37
*** jaosorior has quit IRC20:37
dhellmannthanks, logan-, I'll take a look at that20:38
fungiyeah, as much as i'm not a fan of ruamel.yaml due to its dependency tie-ins to the whole suite of ruamel libs, it's the only library i'm aware of which preserves yaml whitespace, comments and ordering20:38
*** danpawlik has quit IRC20:38
dhellmannthis is for a one-off thing to add the lower-constraints job to a bunch of in-repo configs so I think I can accept the dependencies20:38
fungii wouldn't personally choose to use ruamel.yaml in general-purpose software i intend to distribute, it's handy for hacky utility uses20:39
fungiso yeah, seems suited to your use case here20:39
dhellmannyeah20:39
pabelangerclarkb: fungi: how does https://etherpad.openstack.org/p/jgLaT4MRuC look so far with review-dev01.o.o20:41
openstackgerritClark Boylan proposed openstack-infra/system-config master: Properly deprecate stackforge  https://review.openstack.org/55431220:41
clarkbfungi: ^ thank you for the review, but I've realized that I forgot to update index things so got that done20:41
corvusdhellmann: we haven't finished making zuul safe for lots of simultaneous zuul.yaml changes yet, so when you do that, be careful.  usually i put a 20 minute delay between each patchset upload.20:41
*** Krenair has quit IRC20:41
pabelangerclarkb: fungi: right now volumes have been moved to new server, and think I'm ready to enable puppet again to finish gerrit installation20:41
corvusdhellmann: (each such change uses too much memory, so we run out if there are lots.  a fix is in progress, but probably won't be complete for a few weeks yet)20:42
dhellmanncorvus : yeah, fungi and I talked about doing them in small batches20:42
corvusthat works too20:42
*** Krenair has joined #openstack-infra20:42
clarkbpabelanger: make sure you chown the contents of the volume if necessary20:42
dhellmannwe said ~10 at a time20:42
clarkbpabelanger: the uids don't necessarily line up20:42
*** amoralej is now known as amoralej|off20:43
dhellmannI can go smaller if I need to20:43
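The pacing discussed above (batches of about 10, with a pause so Zuul is not handling many config changes at once) could be scripted roughly as follows. This is a sketch under assumptions: `upload_in_batches` and its `upload` callback are hypothetical, and applying the 20-minute delay between batches (rather than between every single upload) is a combination of the two suggestions, not something either speaker specified.

```python
import time


def upload_in_batches(changes, upload, batch_size=10,
                      delay_seconds=20 * 60, sleep=time.sleep):
    """Upload changes in fixed-size batches, pausing between batches.

    `upload` is called once per change. Returns the number of batches.
    """
    batches = [changes[i:i + batch_size]
               for i in range(0, len(changes), batch_size)]
    for index, batch in enumerate(batches):
        for change in batch:
            upload(change)
        # No need to wait after the final batch.
        if index < len(batches) - 1:
            sleep(delay_seconds)
    return len(batches)
```

Lowering `batch_size` or raising `delay_seconds` is the obvious knob if memory pressure on the scheduler shows up.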
pabelangerclarkb: yah, it looks correct now. But good idea to call it out20:43
*** camunoz has joined #openstack-infra20:43
*** ethfci has joined #openstack-infra20:45
dhellmannwell, ruamel.yaml supports comments but doesn't maintain whitespace20:45
*** yamamoto has joined #openstack-infra20:45
fungiclarkb: oh, hah, i missed that you had renamed the file in that change20:45
dhellmannawk it is, I guess20:45
pabelangerokay, rebooting review-dev01.o.o to confirm /etc/fstab20:46
fungidhellmann: or round-trip through diff/patch using options to ignore whitespace changes20:46
openstackgerritMerged openstack-infra/zuul master: Allow external zookeeper in tox py35 runs  https://review.openstack.org/55481020:46
dhellmannfungi : I'm not sure how that would work, can you elaborate?20:46
tonybdoes someone have time to EOL OpenStackAnsible as described in http://lists.openstack.org/pipermail/openstack-dev/2018-March/128330.html (or add me to bootstrappers so I can do it)20:47
*** myoung|biab is now known as myoung20:47
corvusdhellmann: gimme a sec, i'll get you some code20:47
dhellmanncorvus : thanks20:47
fungidhellmann: make the edit, generate a diff using the option for ignoring whitespace changes, reset, apply the diff. also kinda hacky but may allow you to not alter whitespace that way20:47
slaweqhi guys, do You know how we can remove old "feature/xxx" branches from neutron repo?20:48
dhellmannfungi : fun, I've never done that. I'll give it a try.20:48
openstackgerritPaul Belanger proposed openstack-infra/system-config master: Finish gerrit install for review-dev01.o.o  https://review.openstack.org/55504820:48
fungidhellmann: diff -w to "ignore all white space" (may need -B for "ignore changes where lines are all blank" too, i can't remember if that counts as part of -w)20:49
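The idea behind the `diff -w` round trip can also be approximated in pure Python. The sketch below swaps in `difflib` for diff/patch: lines whose only difference is whitespace keep their original form, and genuinely changed or added lines come from the edited copy. `keep_original_whitespace` is a hypothetical helper; note that ignoring all whitespace is risky for YAML, where indentation carries meaning, so the result should still be reviewed.

```python
import difflib


def keep_original_whitespace(original_lines, edited_lines):
    """Merge an edit while discarding whitespace-only changes.

    Lines are compared with all whitespace removed (similar in spirit to
    `diff -w`). Matching lines are taken from the original; everything
    else is taken from the edited version.
    """
    def squash(line):
        return "".join(line.split())

    matcher = difflib.SequenceMatcher(
        None,
        [squash(line) for line in original_lines],
        [squash(line) for line in edited_lines],
        autojunk=False,
    )
    merged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(original_lines[i1:i2])
        else:
            # replace/insert take the edited lines; delete contributes nothing
            merged.extend(edited_lines[j1:j2])
    return merged
```

Feeding it the file as read before the automated edit and the file as written by the YAML library should give back the original formatting plus only the intended change.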
pabelangerclarkb: fungi: if you are good with etherpad, I think we can land ^ and kick review-dev01.o.o20:51
*** yamamoto has quit IRC20:51
clarkbpabelanger: +220:52
*** Krenair has quit IRC20:54
fungipabelanger: yeah, that looks entirely sane20:55
*** camunoz has quit IRC20:55
clarkbdmsimard: any luck with that limestone mirror today?20:55
openstackgerritJames E. Blair proposed openstack-infra/project-config master: Add batch project update script  https://review.openstack.org/55505320:56
*** rosmaita has joined #openstack-infra20:56
corvusdhellmann, fungi: sorry that's not more polished, but it should hopefully get you going ^20:57
fungicorvus: oh, that's actually really slick20:58
fungiyour definition of polished is a lot stricter than mine20:58
pabelangergreat20:58
corvusit totally has the wrong number of newlines between methods.  :)20:59
fungithe hobgoblins will be displeaseed20:59
fungidispleased too20:59
clarkbnibalizer pointed out this new project called black, its basically gofmt for python and the color they are painting the shed is black21:00
clarkbapparently the focus is on maintaining minimal diffs and readability for code review which seems like a good goal21:00
*** esberglu has quit IRC21:01
fungiso sorta like autopep8?21:02
clarkbkinda, they break a few pep8 rules by default21:02
*** trown is now known as trown|outtypewww21:02
fungii'm all for breaking pep8 rules21:02
fungithey should file it as pep88821:02
clarkbthe biggest drawback I think is that it requires python3.6 which isn't quite in all the places yet21:03
clarkband you have to want the code style it produces21:03
fungisure, but that could be said for pep8/autopep8 as well21:03
clarkbpep8 is a lot more flexible. I'm not sure how aggressive autopep8 is21:03
pabelangerprometheanfire: did we ever start on nodepool dsvm testing for gentoo?21:05
fungiit's configurable to not apply certain rules at least21:05
clarkbianw: did you see tripleo has asked for an email about removing those tests from tripleo21:05
clarkber removing those tripleo tests from dib21:06
prometheanfirepabelanger: no, we were going to switch to a systemd image21:06
prometheanfirepabelanger: which is waiting on glean to support gentoo systemd21:06
clarkbalso any other infra root want to review that really quickly so that we can get a dib release out and unpause our image builds?21:06
prometheanfirethat review has been up there for a while...21:06
dhellmanncorvus : I also ran into issues with ruamel.yaml changing large multi-line strings into quoted strings; does that formatter handle that case?21:06
*** danpawlik has joined #openstack-infra21:07
*** Krenair has joined #openstack-infra21:07
ianwclarkb: ... ok21:08
clarkbianw: I figure its worth a note to them. I don't think we need their approval to remove the cogating21:08
clarkb(dib becoming an infra project and moving out of tripleo gives us that freedom)21:09
pabelangerprometheanfire: k, it would be great to start work on bringing that online. I can maybe see what would be needed, but we should just be able to add image and job into nodepool. We then would depends-on to glean for any needed changes21:09
pabelangerfungi: are you okay to proceed on https://review.openstack.org/555048/ ?21:09
prometheanfirepabelanger: you don't have workflow on glean?21:09
* prometheanfire wonders who does so he can go bother them21:10
corvusdhellmann: i'm not certain, but i think the deltas from what we typically have in zuul.yaml files are minimal, so i wouldn't expect it to change that.21:10
pabelangerprometheanfire: I do, but have no way to know if that is actually the fix21:10
prometheanfireI'm building a systemd image right now with it (redefined the git source for glean)21:10
*** gfidente|afk has quit IRC21:10
fungipabelanger: yep! approved21:10
pabelangerfungi: danke21:10
prometheanfireI'll test boot it to be sure once it's built and let you know (if that works)21:10
*** eharney has quit IRC21:10
dhellmanncorvus : ok, thanks, I'll give it a try21:10
clarkbprometheanfire: pabelanger keep in mind the mbr partition table is currently broken in dib (we are working to fix it) it will likely boot but growroot will have a sad21:11
*** bnemec is now known as sin-master21:12
*** sin-master is now known as bnemec21:12
*** danpawlik has quit IRC21:12
pabelangerack21:13
clarkboh mwhahaha acked the dib test change anyways so I think we are doubly good21:13
prometheanfireclarkb: oh, guess my images won't work then :| (is building off master+patches)21:13
clarkbprometheanfire: well it may work for a boot test21:13
clarkbprometheanfire: but just not have much disk to use after that :)21:13
clarkbmostly something to be aware of during your testing21:13
prometheanfireI am including the growroot element21:13
*** VW has quit IRC21:13
pabelangerprometheanfire: left +2 with comments, a few more eyes might be safer :)21:13
prometheanfirethat's fine, I'll reboot and see21:13
mwhahahayea go ahead and fix it, i'm working on the tripleo ci stuff21:13
*** esberglu has joined #openstack-infra21:14
pabelangerclarkb: https://review.openstack.org/548604 glean change we are talking about21:14
prometheanfiremwhahaha: D&D tonight :P21:14
clarkbpabelanger: also do you know if we are actually ssh'ing into nodes during the nodepool tests? nodepool doesn't do that anymore itself right? so we'd have to explicitly do it21:14
openstackgerritMerged openstack-infra/zuul master: Enable autohold for RETRY_LIMIT / POST_FAILURE  https://review.openstack.org/55499521:15
pabelangerclarkb: yah, we SSH21:15
clarkbor wait it's just the ready script? basic connectivity is still checked iirc21:15
pabelangerclarkb: http://git.openstack.org/cgit/openstack-infra/nodepool/tree/tools/check_devstack_plugin.sh#n2721:15
pabelangercould be improved for more coverage, like growroot if we wanted21:16
clarkbpabelanger: ya21:17
pabelangermight be good to validate HDD size we expect21:17
clarkbianw: I went ahead and approved the dib change to remove the jobs21:17
clarkbdon't want to wait any longer on that one21:17
ianwclarkb: thanks21:18
openstackgerritJames E. Blair proposed openstack-infra/jeepyb master: Support cgit alias sites and short names  https://review.openstack.org/55506321:24
pabelangermnaser: We seem to be in good shape with vexxhost, how does it look on your side?21:24
*** boden has quit IRC21:24
pabelangermnaser: do we want to bump max-servers?21:24
dhellmanncorvus : that seems to work great, thanks!21:26
*** priteau has joined #openstack-infra21:27
corvusdhellmann: \o/21:30
corvusclarkb, fungi, mordred: can you see https://review.openstack.org/555063 and my comment when you have a moment.21:30
pabelangerclarkb: fungi: I'21:31
pabelangererr21:32
pabelangerclarkb: fungi: I'm going to kick review-dev01.o.o now21:32
*** felipemonteiro_ has joined #openstack-infra21:32
*** salv-orlando has joined #openstack-infra21:32
*** felipemonteiro__ has quit IRC21:35
*** salv-orlando has quit IRC21:38
*** Krenair has quit IRC21:43
*** danpawlik has joined #openstack-infra21:45
*** agopi is now known as agopi|dinner21:46
*** yamamoto has joined #openstack-infra21:47
openstackgerritsebastian marcet proposed openstack-infra/openstackid-resources master: Added get ticket types endpoints  https://review.openstack.org/55507121:48
*** danpawlik has quit IRC21:50
openstackgerritMerged openstack-infra/openstackid-resources master: Added get ticket types endpoints  https://review.openstack.org/55507121:51
*** danpawlik has joined #openstack-infra21:53
*** yamamoto has quit IRC21:53
*** Krenair_ has joined #openstack-infra21:53
*** agopi|dinner has quit IRC21:53
*** priteau has quit IRC21:54
*** pcaruana has quit IRC21:55
*** rfolco|ruck is now known as rfolco|off21:57
*** danpawlik has quit IRC21:57
*** Krenair_ has quit IRC21:57
*** salv-orlando has joined #openstack-infra21:59
*** Krenair_ has joined #openstack-infra22:00
clarkbcorvus: thinking about that I think I'm ok with only hosting zuul (and any potential other repos) via http(s) if we think that will make a less confusing user experience22:00
clarkbcorvus: anymore git:// isn't really necessary with smart http being pretty ubiquitous22:01
clarkbI think it would be good to continue supporting git:// for openstack/ as those repos have had it set up that way for a long time22:01
* clarkb will go transcribe that on the change22:03
*** eernst has quit IRC22:05
pabelangerthat seems reasonable, I've been using https a lot more over git://22:07
clarkbI think centos6 was really the last place where it would make a real difference in our world22:08
clarkbbecause the git there was too old to support smart http22:09
clarkbdib functests are not fast22:09
*** eernst has joined #openstack-infra22:10
*** eernst has quit IRC22:10
openstackgerritMerged openstack-infra/system-config master: Finish gerrit install for review-dev01.o.o  https://review.openstack.org/55504822:11
clarkbarg dib change was actually hit by the bhs1 thing22:12
ianwclarkb: what's the bhs1 thing?22:15
clarkbianw: flaky ansible synchronize in post-run playbooks. Apparently synchronize doesn't set up rsync to use the controlmaster persistent connection thing we use for all the other ansible ssh connectivity so it appears to be more susceptible to this22:16
clarkbianw: my hunch is that in pre if connectivity fails from the get go we just delete the node and retry again until ssh works and then it works through pre and run because of the controlmaster process but then rsync is more susceptible to it22:17
clarkbcorvus' proposed plan was to update our use of synchronize to use controlmaster and send ovh an email about it22:17
clarkbianw: dmsimard also pointed out that ovh is in the process of upgrading the operating system on some of their networking gear in bhs1 which may be related22:17
clarkbianw: http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22POST-RUN%20END%20RESULT_TIMED_OUT%5C%22 logstash query for it22:18
clarkbrate appears to be 3-4 times per hour22:19
pabelangerokay, now kicking review-dev01.o.o since patch landed22:19
pabelangerand puppet ran okay22:20
pabelangerlet me see if gerrit starts22:20
*** lpetrut has quit IRC22:22
*** danpawlik has joined #openstack-infra22:23
*** yamahata has joined #openstack-infra22:24
*** e0ne has quit IRC22:24
*** rcernin has joined #openstack-infra22:25
pabelangerdoh, security email is me from review-dev0122:26
pabelangerokay, gerrit looks to be running but an issue with apache config22:28
*** danpawlik has quit IRC22:28
*** threestrands has joined #openstack-infra22:29
*** felipemonteiro_ has quit IRC22:29
*** felipemonteiro_ has joined #openstack-infra22:29
*** threestrands has quit IRC22:30
pabelangerwoot22:30
pabelangerhttps://review-dev01.openstack.org22:30
pabelangerclarkb: fungi: ^22:30
*** threestrands has joined #openstack-infra22:30
*** bobh has quit IRC22:30
pabelangerI had to modify apache2 manually, but will propose a fix in system-config22:30
*** Krenair_ has quit IRC22:31
clarkbpabelanger: login doesn't work because it wants to redirect to review-dev.o.o22:31
clarkbonce dns is updated it should work22:32
pabelangeryah, there are some issues around numeric hostnames22:32
pabelangerlet me add dns and revert apache change and see if it works22:32
ianwclarkb: thanks ... that sounds ... too much for me to deal with right now :)22:33
*** Krenair has joined #openstack-infra22:34
*** d0ugal has quit IRC22:34
ianwi'm just doing some manual boots to verify the dib fix too22:35
*** d0ugal has joined #openstack-infra22:37
*** bmace has joined #openstack-infra22:40
ianwclarkb/pabelanger: speaking of gerrit, any particular thoughts on https://review.openstack.org/#/c/552288/ which fixes some of our custom sql so it works with h2, which is used during the git-review unit testing?22:41
pabelangerclarkb: once DNS updates, https://review-dev.openstack.org/107974 ready for review :D22:42
ianwi tested that via an online sql fiddle thing, so it's 100% to be absolutely fine :)22:43
clarkbianw: it would be good to have mordred debug/review that one too22:43
clarkbianw: mordred wrote the mysql ism updates to make our upgrade work in the first place22:43
clarkbianw: and we can toss the resulting war onto review-dev once pabelanger gets it working22:43
pabelangerclarkb: so, I think tomorrow we can apply patches we used for review-dev01.o.o, merge, then launch the replacement review01.o.o server to obtain IP address. Then send out the email to ML and prepare to migrate next week?22:44
clarkbpabelanger: ya if review-dev ends up happy with the dns update I think that would be the next step. Preparing to migrate next week may be hard for some because it is apparently easter22:44
*** hashar has quit IRC22:45
clarkbalso given the tc discussions related to connectivity issues we may want to consider more notice for the ip addr update22:45
*** armaan has quit IRC22:45
pabelangersure, getting the replacement server online is first step, deciding when to move volumes can then be made. I'd say, 60min window (longer for buffer if we want) is all we'd need.  review-dev01 went very well22:46
clarkbya should go quick since we aren't transforming any data22:46
clarkbjust moving it22:46
*** andreas_s has joined #openstack-infra22:47
pabelangeryah, as long as we detach clean, should be fine22:47
pabelangerokay, going to get some food then poke around on new server22:48
pabelanger#status log review-dev01.o.o now online (ubuntu-xenial) and review-dev.o.o DNS redirected22:49
openstackstatuspabelanger: finished logging22:49
*** iyamahat has joined #openstack-infra22:49
*** yamamoto has joined #openstack-infra22:49
*** hongbin has quit IRC22:50
clarkbpabelanger: remember to check the ip against email blacklists22:51
*** andreas_s has quit IRC22:51
*** esberglu has quit IRC22:51
*** yamamoto has quit IRC22:54
*** masber has joined #openstack-infra23:04
*** danpawlik has joined #openstack-infra23:04
pabelangerclarkb: right, where do I look for that?23:04
njohnston_Quick question - how would I go about getting added to the -core group for a project that I created, but somehow the -core group was left without any members in it?23:04
*** ldnunes has quit IRC23:05
clarkbpabelanger: https://www.spamhaus.org/lookup/ you can check the IP (do v4 and v6) and request removals if necessary23:05
clarkbnjohnston_: the initial group member is an explicit manual add23:05
clarkbnjohnston_: can you point me to the change that created the group? and I can add the appropriate initial member based on that info23:05
njohnston_clarkb: Thanks! https://review.openstack.org/#/c/546260/323:06
*** Goneri has quit IRC23:06
njohnston_sorry, not sure why I have an out of date changeset bookmarked, should just be https://review.openstack.org/#/c/546260/23:06
clarkbya I found the current one :)23:07
*** edmondsw has quit IRC23:07
*** jtomasek has quit IRC23:08
clarkbnjohnston_: I have added you to the group. The group is self owned so now that it has an initial member, that individual (you) can add whoever you like (and they too can add whoever they like)23:08
*** danpawlik has quit IRC23:08
njohnston_Thanks very much clarkb!23:08
*** tpsilva has quit IRC23:09
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Fix default partition type  https://review.openstack.org/55477123:11
*** salv-orlando has quit IRC23:26
*** salv-orlando has joined #openstack-infra23:26
*** Krenair has quit IRC23:28
*** tosky has quit IRC23:28
*** r-daneel has quit IRC23:30
*** salv-orlando has quit IRC23:30
*** Krenair has joined #openstack-infra23:38
*** danpawlik has joined #openstack-infra23:39
*** Adri2000 has quit IRC23:40
*** Adri2000 has joined #openstack-infra23:41
*** felipemonteiro_ has quit IRC23:41
*** Krenair has quit IRC23:43
*** danpawlik has quit IRC23:44
*** gyee has quit IRC23:45
*** claudiub has quit IRC23:51
*** yamamoto has joined #openstack-infra23:51
*** Krenair has joined #openstack-infra23:52
pabelangerclarkb: fungi: so far I don't see anything wrong on review-dev01.o.o. I haven't looked at storyboard-dev integration but can in the morning. anything else I should be looking at? Anything zuul related we should test?23:53
clarkbpabelanger: considering the biggest change is java 8 probably just normal functionality. Pushing code, reviewing changes, etc23:54
pabelangerYah, I'll do more of that testing tomorrow morning for sure23:55
*** yamamoto has quit IRC23:55
*** Krenair has quit IRC23:58
bmacehey folks.  i read through all the instructions and read through all the current values in project-config/gerrit/projects.yaml.  it isn't clear if an upstream / imported code repository retains its branches / tags.  can anyone tell me if it does or if it essentially just pulls master and the rest is lost?23:58
clarkbbmace: it should pull in all branches and tags as is23:59
bmaceclarkb: thanks very much :)23:59
clarkb(it explicitly tries to do this at least and I don't recall anyone ever saying it failed at it)23:59
