*** ade_lee_ has joined #tripleo | 00:05 | |
*** lbragstad has quit IRC | 00:19 | |
*** lbragstad has joined #tripleo | 00:19 | |
*** ade_lee_ has quit IRC | 00:20 | |
*** ade_lee_ has joined #tripleo | 00:20 | |
openstackgerrit | Alan Bishop proposed openstack/tripleo-heat-templates master: Fix selinux context for glance-api https://review.opendev.org/682768 | 00:32 |
---|---|---|
*** ade_lee_ has quit IRC | 00:38 | |
*** ade_lee_ has joined #tripleo | 00:38 | |
EmilienM | abishop: ^ commented | 00:50 |
abishop | EmilienM: ah, yes, thx for pointing it out | 00:52 |
EmilienM | abishop: pep8 did it for us :D | 00:52 |
abishop | good dog, pep8, good dog | 00:52 |
EmilienM | :) | 00:54 |
EmilienM | woof ! | 00:54 |
openstackgerrit | wes hayutin proposed openstack/tripleo-common master: Revert "Add reauthentication to sessions" https://review.opendev.org/682769 | 01:00 |
openstackgerrit | Alan Bishop proposed openstack/tripleo-heat-templates master: Fix selinux context for glance-api https://review.opendev.org/682768 | 01:05 |
EmilienM | weshay|ruck: remember the last time we had ara for overcloud? it caused timeouts because ansible was so slow | 01:10 |
EmilienM | I wonder how we can do profiling without burning ansible while it runs | 01:10 |
*** slaweq has joined #tripleo | 01:11 | |
*** slaweq has quit IRC | 01:16 | |
*** mschuppert has quit IRC | 01:22 | |
weshay|ruck | EmilienM, hey.. I thought I was going to have to replace a garbage disposal tonight.. fixed it w/ a broom :) | 01:25 |
EmilienM | lol | 01:26 |
weshay|ruck | EmilienM, pretty sure we had it in the overcloud prior to the log server change | 01:26 |
EmilienM | yeah? | 01:26 |
EmilienM | like when? | 01:26 |
weshay|ruck | EmilienM, so.. fwiw, I'd like to have it working but turned off if needed | 01:27 |
weshay|ruck | so we can turn it back on in a check job to help profile when we need it | 01:27 |
EmilienM | right | 01:28 |
dmsimard | EmilienM, weshay|ruck: the performance issues might have been due to the html generation and swift upload ? I would expect both to take a long time. | 01:28 |
EmilienM | or run it in periodic | 01:28 |
EmilienM | no IIRC it was during the ansible run | 01:28 |
EmilienM | but it might be a long time ago | 01:28 |
weshay|ruck | ya.. periodic is a fine time | 01:28 |
EmilienM | it's been a long time since I caught up on ara/overcloud | 01:28 |
openstackgerrit | wes hayutin proposed openstack/tripleo-common master: Revert "Add reauthentication to sessions" https://review.opendev.org/682769 | 01:30 |
dmsimard | I'll try to see if I can run some benchmarks | 01:30 |
dmsimard | there is definitely an overhead but it should hopefully be low enough to be worth it :) | 01:31 |
EmilienM | dmsimard: panda has a patch https://review.opendev.org/#/c/682679/ | 01:31 |
EmilienM | but I cleared the check/gate due to our issues | 01:31 |
dmsimard | yup, I helped panda today with it | 01:32 |
dmsimard | I'm not a fan of that solution though | 01:33 |
EmilienM | weshay|ruck, cloudnull : so https://review.opendev.org/#/c/682731/ passed but tripleo-ci-centos-7-containers-multinode SUCCESS in 3h 00m 46s | 01:33 |
EmilienM | this isn't good | 01:34 |
EmilienM | dmsimard: if you could work with him on a good solution, it would help us to do profiling | 01:34 |
weshay|ruck | ya.. I saw.. I also reverted his add auth patch | 01:40 |
weshay|ruck | I'll run these a couple times | 01:40 |
EmilienM | weshay|ruck, cloudnull : the revert of session.close didn't help alone https://review.opendev.org/#/c/682717/ | 01:40 |
EmilienM | tripleo-ci-centos-7-containers-multinode SUCCESS in 3h 19m 29s | 01:40 |
cloudnull | maybe the session revert + the workers ? | 01:41 |
weshay|ruck | bbiab | 01:42 |
EmilienM | maybe | 01:43 |
EmilienM | I can try to squash them all | 01:43 |
cloudnull | in the next review remove the pool options too | 01:45 |
cloudnull | i mentioned that in a comment | 01:45 |
cloudnull | let's just roll with retries enabled but with the pool setting set to the defaults | 01:45 |
EmilienM | k | 01:45 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-common master: image_uploader: relax HTTPAdapter pool_connections and pool_maxsize https://review.opendev.org/682731 | 01:47 |
EmilienM | https://review.opendev.org/#/c/674097/ isn't cleanly revertable | 01:47 |
EmilienM | I guess weshay|ruck had to rebase it manually | 01:47 |
cloudnull | I can add that revert to your other one | 01:50 |
EmilienM | I'm trying now | 01:51 |
EmilienM | meh | 01:53 |
EmilienM | ok so https://review.opendev.org/#/c/682769/ is a clean revert | 01:56 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-common master: Revert "Close the http sessions of registry on image prepare" https://review.opendev.org/682717 | 01:57 |
EmilienM | cloudnull: I rebased ^ on top of your 2 patches that were reverted by weshay|ruck | 01:57 |
EmilienM | my "relax" patch isn't relevant anymore because the reverts | 01:58 |
cloudnull | I'll try and keep an eye on the two reviews just to see if they go. | 02:02 |
cloudnull | otherwise im going to go "relax" myself and try and attack this fresh tomorrow | 02:02 |
EmilienM | same, good night | 02:07 |
EmilienM | weshay|ruck: please leave the reverts in the order they are now, and see if it helped to stack them | 02:08 |
EmilienM | i really wish we could avoid any revert here as they all solve problems | 02:08 |
EmilienM | we'll see tomorrow | 02:08 |
*** apetrich has quit IRC | 02:10 | |
*** jaganathan has quit IRC | 02:19 | |
EmilienM | undercloud podman[565692]: Sorry, user neutron is not allowed to execute '/usr/sbin/ss -ntuap' as neutron on undercloud.localdomain. | 02:43 |
EmilienM | another one I saw in logs recently | 02:43 |
EmilienM | (master) | 02:43 |
*** slaweq has joined #tripleo | 03:11 | |
*** slaweq has quit IRC | 03:16 | |
*** weshay|ruck has quit IRC | 03:24 | |
*** dsneddon has joined #tripleo | 03:32 | |
*** dsneddon has quit IRC | 03:37 | |
*** rh-jelabarre has quit IRC | 03:47 | |
*** rh-jelabarre has joined #tripleo | 03:47 | |
*** rh-jelabarre has quit IRC | 03:47 | |
*** ricolin has joined #tripleo | 03:54 | |
*** ramishra has joined #tripleo | 04:01 | |
*** soniya29 has joined #tripleo | 04:12 | |
*** ykarel has joined #tripleo | 04:24 | |
*** janki has joined #tripleo | 04:34 | |
*** udesale has joined #tripleo | 04:45 | |
*** pcaruana has joined #tripleo | 04:46 | |
*** marios has joined #tripleo | 04:54 | |
*** jaosorior has quit IRC | 04:57 | |
*** jaosorior has joined #tripleo | 04:57 | |
*** slaweq has joined #tripleo | 05:11 | |
*** jtomasek has joined #tripleo | 05:14 | |
*** slaweq has quit IRC | 05:16 | |
*** lmiccini has joined #tripleo | 05:23 | |
*** dsneddon has joined #tripleo | 05:32 | |
*** dsneddon has quit IRC | 05:37 | |
*** ramishra has quit IRC | 05:39 | |
*** kopecmartin|off is now known as kopecmartin | 05:54 | |
*** yprokule has joined #tripleo | 05:56 | |
*** jfrancoa has joined #tripleo | 06:01 | |
*** ratailor has joined #tripleo | 06:02 | |
*** hjensas|afk has quit IRC | 06:02 | |
*** jfrancoa has quit IRC | 06:05 | |
bandini | EmilienM: what mwhahaha says is right, we need centos8 for that | 06:08 |
*** slaweq has joined #tripleo | 06:11 | |
*** slaweq has quit IRC | 06:16 | |
*** jfrancoa has joined #tripleo | 06:19 | |
*** holser has joined #tripleo | 06:20 | |
*** xek_ has joined #tripleo | 06:22 | |
*** ratailor has quit IRC | 06:25 | |
*** ratailor has joined #tripleo | 06:27 | |
*** mschuppert has joined #tripleo | 06:28 | |
*** xek_ has quit IRC | 06:30 | |
*** bogdando has joined #tripleo | 06:38 | |
openstackgerrit | Juan Badia Payno proposed openstack/tripleo-ansible master: Let ReaR uses different backup strategy https://review.opendev.org/682583 | 06:40 |
openstackgerrit | Juan Badia Payno proposed openstack/tripleo-ansible master: Let ReaR uses different backup strategy https://review.opendev.org/682583 | 06:43 |
*** slaweq has joined #tripleo | 06:55 | |
*** rcernin has quit IRC | 06:57 | |
*** apetrich has joined #tripleo | 06:59 | |
*** hberaud|gone is now known as hberaud | 07:04 | |
*** hjensas|afk has joined #tripleo | 07:17 | |
*** tosky has joined #tripleo | 07:18 | |
*** amoralej|off is now known as amoralej | 07:19 | |
*** pbandark has joined #tripleo | 07:22 | |
*** pbandark has quit IRC | 07:22 | |
*** sshnaidm|pto is now known as sshnaidm|rover | 07:24 | |
*** beagles has quit IRC | 07:31 | |
*** mmedvede has quit IRC | 07:31 | |
*** b3nt_pin has joined #tripleo | 07:32 | |
*** rpittau|afk is now known as rpittau | 07:32 | |
*** arxcruz has quit IRC | 07:32 | |
*** cylopez has joined #tripleo | 07:32 | |
*** mmedvede has joined #tripleo | 07:34 | |
*** arxcruz has joined #tripleo | 07:36 | |
openstackgerrit | Arx Cruz proposed openstack/tripleo-quickstart master: Adapt os_tempest in FS001 https://review.opendev.org/673400 | 07:36 |
*** jpich has joined #tripleo | 07:40 | |
*** jpena|off is now known as jpena | 07:41 | |
*** ykarel is now known as ykarel|lunch | 07:44 | |
*** gfidente has joined #tripleo | 07:50 | |
*** gfidente has quit IRC | 07:51 | |
*** radeks has joined #tripleo | 07:55 | |
*** cylopez has quit IRC | 08:01 | |
*** tkajinam has quit IRC | 08:04 | |
*** gfidente has joined #tripleo | 08:15 | |
*** cylopez has joined #tripleo | 08:16 | |
*** d0ugal has quit IRC | 08:23 | |
*** d0ugal has joined #tripleo | 08:23 | |
*** avivgta has joined #tripleo | 08:30 | |
*** alexmcleod has joined #tripleo | 08:36 | |
*** dciabrin_ has quit IRC | 08:38 | |
*** derekh has joined #tripleo | 08:40 | |
*** iurygregory has joined #tripleo | 08:48 | |
*** ratailor has quit IRC | 08:51 | |
*** ratailor has joined #tripleo | 08:52 | |
*** florianf has joined #tripleo | 08:55 | |
*** pcaruana has quit IRC | 08:57 | |
*** ramishra has joined #tripleo | 08:59 | |
*** pcaruana has joined #tripleo | 09:01 | |
*** dsneddon has joined #tripleo | 09:15 | |
*** dsneddon has quit IRC | 09:21 | |
*** ramishra_ has joined #tripleo | 09:25 | |
*** ramishra has quit IRC | 09:26 | |
*** ramishra has joined #tripleo | 09:26 | |
*** ramishra_ has quit IRC | 09:30 | |
*** panda|ruck|off is now known as panda|ruck | 09:37 | |
*** avivgta has quit IRC | 09:40 | |
*** tesseract has joined #tripleo | 09:44 | |
openstackgerrit | Dougal Matthews proposed openstack/tripleo-ansible master: Ignore the Sphinx documentation build https://review.opendev.org/682848 | 09:48 |
*** dtantsur|afk is now known as dtantsur | 09:49 | |
openstackgerrit | Karthik S proposed openstack/tripleo-heat-templates master: WIP: Remove the allocate_vfs during update/upgrades for SR-IOV deployment https://review.opendev.org/682851 | 09:53 |
*** d0ugal has quit IRC | 09:54 | |
openstackgerrit | Karthik S proposed openstack/tripleo-heat-templates master: WIP: Remove the allocate_vfs during update/upgrades for SR-IOV deployment https://review.opendev.org/682851 | 09:54 |
openstackgerrit | Tom Barron proposed openstack/tripleo-heat-templates master: Run Manila in scenario004 without pacemaker https://review.opendev.org/682853 | 10:05 |
*** openstackgerrit has quit IRC | 10:06 | |
*** d0ugal has joined #tripleo | 10:10 | |
*** zbr has quit IRC | 10:11 | |
*** dciabrin has joined #tripleo | 10:12 | |
*** zbr has joined #tripleo | 10:12 | |
*** dciabrin has quit IRC | 10:17 | |
*** ramishra has quit IRC | 10:20 | |
*** florianf has quit IRC | 10:25 | |
*** ykarel|lunch is now known as ykarel | 10:30 | |
*** florianf has joined #tripleo | 10:33 | |
*** dciabrin has joined #tripleo | 10:37 | |
*** hamdyk has joined #tripleo | 10:39 | |
gfidente | ykarel have a link to the nova submission you mentioned in https://review.opendev.org/#/c/682768/2/deployment/glance/glance-api-container-puppet.yaml ? | 10:42 |
ykarel | gfidente, yes https://review.opendev.org/#/c/669317/ | 10:44 |
gfidente | ykarel so I think we're struggling on one important question | 10:47 |
gfidente | why is :z needed at all? | 10:47 |
gfidente | we didn't have it in queens or rocky | 10:47 |
*** openstackgerrit has joined #tripleo | 10:47 | |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-upgrade master: Fix image validation when AP services get several images. https://review.opendev.org/682854 | 10:47 |
gfidente | so I get the change for nova is preserving :z where possible, but why was :z added remains an open question | 10:47 |
zbr | sshnaidm|rover: cloudnull: please have a look at https://github.com/docker/compose/pull/6900/files and let me know if it looks ok or needs changes. | 10:48 |
ykarel | gfidente, afaik it's used to change context of directories bind mounted within containers | 10:48 |
gfidente | ykarel yeah it is relabeling the directory on the host | 10:48 |
ykarel | gfidente, yes | 10:48 |
gfidente | with a selinux context which allows container to do whatever with the files in it | 10:48 |
gfidente | question is why that was needed | 10:49 |
gfidente | and if it is what we want | 10:49 |
ykarel | yes, this is all I know, and since not all filesystems support relabelling it's failing | 10:49 |
ykarel | like NFS doesn't support it | 10:49 |
ykarel | gfidente, Tengu might have more idea on why it's needed | 10:49 |
*** udesale has quit IRC | 11:01 | |
*** udesale has joined #tripleo | 11:02 | |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/tripleo-common master: Implement Ansible fact cache for Mistral executor https://review.opendev.org/682855 | 11:07 |
sshnaidm|rover | zbr, is it docker-compose? | 11:11 |
*** mcornea has joined #tripleo | 11:12 | |
*** lucasagomes has joined #tripleo | 11:13 | |
zbr | yeah | 11:13 |
*** yprokule has quit IRC | 11:13 | |
zbr | brb | 11:13 |
*** yprokule has joined #tripleo | 11:13 | |
*** dsneddon has joined #tripleo | 11:16 | |
*** pcaruana has quit IRC | 11:19 | |
*** dsneddon has quit IRC | 11:20 | |
openstackgerrit | Arx Cruz proposed openstack/tripleo-quickstart master: Adapt os_tempest in FS001 https://review.opendev.org/673400 | 11:21 |
*** psachin has joined #tripleo | 11:26 | |
sshnaidm|rover | zbr, where do you use it? | 11:27 |
*** pcaruana has joined #tripleo | 11:28 | |
zbr | sshnaidm|rover: i was trying to do something for molecule and faced the bug i raised. | 11:28 |
*** iurygregory_ has joined #tripleo | 11:29 | |
zbr | this may explain why it was avoided in the past | 11:29 |
sshnaidm|rover | zbr, something to molecule with docker-compose? | 11:29 |
*** iurygregory has quit IRC | 11:29 | |
*** ratailor has quit IRC | 11:30 | |
zbr | sshnaidm|rover: i wanted to change how we start the containers | 11:30 |
zbr | sshnaidm|rover: think about what marios was saying regarding running a test registry, this is the classic use-case for it. | 11:31 |
*** morazi has joined #tripleo | 11:33 | |
*** dsneddon has joined #tripleo | 11:33 | |
*** raildo has joined #tripleo | 11:34 | |
*** jpena is now known as jpena|lunch | 11:35 | |
*** dsneddon has quit IRC | 11:37 | |
*** udesale has quit IRC | 11:44 | |
*** brault has joined #tripleo | 11:49 | |
*** brault has quit IRC | 11:53 | |
EmilienM | any progress on the timeouts? sshnaidm|rover have you seen our discussions last night? | 11:59 |
sshnaidm|rover | EmilienM, yeah | 11:59 |
EmilienM | I have meeting now but | 12:00 |
EmilienM | https://review.opendev.org/#/c/682717/ doesn't show good results | 12:00 |
EmilienM | tripleo-ci-centos-7-containers-multinode SUCCESS in 2h 52m 20s | 12:00 |
EmilienM | still too long with the reverts | 12:00 |
*** hberaud is now known as hberaud|lunch | 12:03 | |
*** amoralej is now known as amoralej|lunch | 12:05 | |
*** pbandark has joined #tripleo | 12:05 | |
*** brault has joined #tripleo | 12:06 | |
*** janki has quit IRC | 12:07 | |
*** rh-jelabarre has joined #tripleo | 12:10 | |
*** weshay has joined #tripleo | 12:11 | |
*** leanderthal has joined #tripleo | 12:13 | |
*** paramite|clone has quit IRC | 12:14 | |
*** paramite has joined #tripleo | 12:14 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo master: WIP OVN separate VIP https://review.opendev.org/672673 | 12:15 |
openstackgerrit | Michele Baldessari proposed openstack/tripleo-heat-templates master: WIP OVN DBS separate vip https://review.opendev.org/669847 | 12:15 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-common master: image_uploader: relax HTTPAdapter pool_connections and pool_maxsize https://review.opendev.org/682731 | 12:16 |
EmilienM | marios: thanks for the reviews | 12:16 |
cloudnull | mornings | 12:16 |
EmilienM | cloudnull, weshay: ^ so the patch on top of all reverts, helped to make tripleo image prepare much faster | 12:17 |
EmilienM | but I'm still unsure why we reach 3 hours in runtime | 12:17 |
EmilienM | in https://review.opendev.org/#/c/682717/ | 12:17 |
* weshay looks | 12:17 | |
EmilienM | 2h52, but still | 12:17 |
EmilienM | 2019-09-18 03:33:09 | TASK [tripleo-container-image-prepare : Run tripleo-container-image-prepare logged to: /var/log/tripleo-container-image-prepare.log] *** | 12:17 |
EmilienM | 2019-09-18 03:43:17 | changed: [undercloud] | 12:17 |
EmilienM | https://f9d9188beb36665070d5-34d5a3d9e2a9ca67eb62c8365f7602e7.ssl.cf1.rackcdn.com/682717/3/check/tripleo-ci-centos-7-containers-multinode/2319472/logs/undercloud/home/zuul/undercloud_install.log.txt.gz | 12:18 |
EmilienM | that's logs with all the reverts ^ | 12:18 |
EmilienM | 10min is really good | 12:18 |
EmilienM | but now there is something else that is being slow /me digs | 12:18 |
weshay | 10 min to get all the containers is really good | 12:18 |
weshay | agree | 12:18 |
weshay | we could run it once more.. it could have been the provider etc | 12:18 |
EmilienM | yeah | 12:18 |
EmilienM | although I don't like to revert, again the patches solve actual problems | 12:19 |
EmilienM | my "relax" patch, didn't help much: | 12:19 |
EmilienM | https://storage.bhs1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_5d7/682731/2/check/tripleo-ci-centos-7-containers-multinode/5d773cc/logs/undercloud/home/zuul/undercloud_install.log.txt.gz | 12:19 |
EmilienM | 2019-09-17 23:07:59 | TASK [tripleo-container-image-prepare : Run tripleo-container-image-prepare logged to: /var/log/tripleo-container-image-prepare.log] *** | 12:19 |
EmilienM | 2019-09-17 23:27:24 | changed: [undercloud] | 12:19 |
EmilienM | 20 min | 12:19 |
EmilienM | but I think we still need it | 12:20 |
cloudnull | it looks like across the revert patches, and the relax review we're still averaging 3 hours | 12:21 |
EmilienM | right | 12:21 |
cloudnull | ~3 hours | 12:21 |
EmilienM | there is something else | 12:22 |
EmilienM | we should bring tripleo-container-image-prepare to 10 min | 12:22 |
EmilienM | I checked and on August 19th, in multiple jobs it took 9 or 10 min max | 12:22 |
EmilienM | so 20 isn't acceptable | 12:22 |
cloudnull | is that in OVH? | 12:23 |
EmilienM | in which case? | 12:23 |
cloudnull | or are we seeing 20+ minutes across regions ? | 12:23 |
EmilienM | I haven't watched regions :/ | 12:23 |
cloudnull | OVH is consistently slower in terms of network performance | 12:23 |
EmilienM | i'll create a spreadsheet to capture what I see | 12:23 |
*** derekh has quit IRC | 12:24 | |
*** rlandy has joined #tripleo | 12:25 | |
cloudnull | OVH is where we were seeing docker auth issues and network instability. | 12:25 |
EmilienM | ok good to know | 12:25 |
EmilienM | cloudnull: https://docs.google.com/spreadsheets/d/1KSMWAgfPLOY52_m_lVFEv5wq2qUNXHLm8TL05gRR7IE/edit#gid=0 | 12:27 |
*** ade_lee_ has quit IRC | 12:30 | |
weshay | cloudnull, EmilienM per cloud http://dashboard-ci.tripleo.org/d/si1tipHZk/jobs-exploration?orgId=1&fullscreen&panelId=2 | 12:31 |
*** jpena|lunch is now known as jpena | 12:31 | |
EmilienM | weshay: thanks, I'm looking at the tripleo image prepare thing | 12:31 |
EmilienM | rackspace is clearly the slowest | 12:32 |
*** brault has quit IRC | 12:34 | |
EmilienM | weshay: we don't have ara for undercloud right? | 12:36 |
weshay | EmilienM, they killed all the ara reports | 12:37 |
weshay | EmilienM, panda|ruck is looking into it this morning | 12:38 |
EmilienM | damn | 12:38 |
EmilienM | weshay: I'm doing manual profiling on undercloud deployment | 12:42 |
weshay | EmilienM, not sure if this would help.. https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_03b/682769/2/check/tripleo-ci-centos-7-containers-multinode/03be755/logs/undercloud/var/log/extra/dstat.html.gz | 12:44 |
EmilienM | weshay: yeah thanks, it can help too | 12:44 |
sshnaidm|rover | we don't have ara for undercloud | 12:44 |
sshnaidm|rover | we do still have ara reports in ovb jobs - it can be used for investigations | 12:45 |
sshnaidm|rover | but still not for undercloud.. | 12:45 |
weshay | EmilienM, same job in rdo https://review.rdoproject.org/zuul/builds?job_name=tripleo-ci-centos-7-multinode-1ctlr-featureset010 | 12:47 |
EmilienM | weshay: yeah but not reaching 3h, or? | 12:47 |
weshay | sshnaidm|rover, hrm.. rdo is also not rendering | 12:48 |
weshay | EmilienM, ya.. not even close | 12:48 |
openstackgerrit | Alan Bishop proposed openstack/tripleo-heat-templates master: Fix selinux context for glance-api https://review.opendev.org/682768 | 12:48 |
weshay | fyi.. updated http://dashboard-ci.tripleo.org/d/si1tipHZk/jobs-exploration?orgId=1&fullscreen&panelId=7 | 12:49 |
openstackgerrit | Jose Luis Franco proposed openstack/tripleo-upgrade master: Add new workarounds mechanism to apply workarounds via Ansible. https://review.opendev.org/673767 | 12:49 |
openstackgerrit | Jose Luis Franco proposed openstack/tripleo-upgrade master: Enable system upgrade and upgrade run per host. https://review.opendev.org/674050 | 12:49 |
sshnaidm|rover | weshay, it does, but very slow | 12:49 |
*** brault has joined #tripleo | 12:52 | |
EmilienM | weshay: I'm working here: https://docs.google.com/spreadsheets/d/1KSMWAgfPLOY52_m_lVFEv5wq2qUNXHLm8TL05gRR7IE/edit#gid=486496162 | 12:52 |
*** janki has joined #tripleo | 12:52 | |
*** iurygregory has joined #tripleo | 12:53 | |
*** janki has quit IRC | 12:53 | |
*** iurygregory_ has quit IRC | 12:55 | |
*** soniya29 has quit IRC | 12:56 | |
EmilienM | weshay: I now suspect something on the overcloud, see my numbers | 12:58 |
openstackgerrit | Tom Barron proposed openstack/tripleo-ci master: Use podman for scenario004 https://review.opendev.org/682890 | 12:58 |
*** brault has quit IRC | 12:58 | |
EmilienM | weshay: undercloud is 7min longer and overcloud 8min, | 12:58 |
EmilienM | it explains the 2h 29m 14s in Aug 21 and 2h 52m 20s now with the revert | 12:59 |
*** jcoufal has joined #tripleo | 13:01 | |
openstackgerrit | Jose Luis Franco proposed openstack/tripleo-upgrade master: Ensure pacemaker bootstrap controller is upgraded first. https://review.opendev.org/680595 | 13:01 |
*** ramishra has joined #tripleo | 13:01 | |
*** derekh has joined #tripleo | 13:04 | |
*** amoralej|lunch is now known as amoralej | 13:08 | |
*** ricolin_ has joined #tripleo | 13:08 | |
*** Goneri has joined #tripleo | 13:09 | |
jfrancoa | marios: hey! I think we can drop the bomb https://review.opendev.org/#/c/681669/ ;-) | 13:11 |
*** ricolin has quit IRC | 13:11 | |
*** ricolin_ is now known as ricolin | 13:12 | |
*** holser has quit IRC | 13:14 | |
weshay | EmilienM, the manual ara | 13:18 |
*** mcornea has quit IRC | 13:18 | |
weshay | panda|ruck, how is ara.. any patches yet? | 13:18 |
*** mcornea has joined #tripleo | 13:18 | |
weshay | panda|ruck, sshnaidm|rover fyi https://docs.google.com/spreadsheets/d/1KSMWAgfPLOY52_m_lVFEv5wq2qUNXHLm8TL05gRR7IE/edit?pli=1#gid=486496162 | 13:18 |
*** ade_lee_ has joined #tripleo | 13:19 | |
dmsimard | weshay, EmilienM: back to back meetings this morning but I'll have time this PM to see how I can help | 13:20 |
dmsimard | had an idea last night but it was getting late .. :) | 13:20 |
*** ade_lee__ has joined #tripleo | 13:23 | |
*** Vorrtex has joined #tripleo | 13:24 | |
*** cylopez has quit IRC | 13:25 | |
*** _dpaterson has joined #tripleo | 13:26 | |
*** ade_lee_ has quit IRC | 13:26 | |
marios | jfrancoa: sorry in call.. bombs away! | 13:29 |
gfidente | weshay I was chatting with tbarron about https://review.opendev.org/#/c/682853/ and https://review.opendev.org/#/c/678224/ | 13:30 |
weshay | k | 13:30 |
gfidente | I think the first makes sense until we get pcmk running on centos8, but then we'll try to migrate back all on pcmk when it gets back working on centos8 ? | 13:30 |
jfrancoa | marios: thank you comrade! | 13:31 |
weshay | gfidente, that sounds fine to me... but def.. a topic for #tripleo mtg on tues.. I can raise it there if you like | 13:32 |
gfidente | weshay++ yes please | 13:32 |
gfidente | I was assuming we go back to pcmk when it's ready | 13:32 |
gfidente | need bandini to help with that too I think | 13:32 |
gfidente | in fact we can even set bandini as owner from the start | 13:33 |
*** psachin has quit IRC | 13:33 | |
gfidente | cause he is not responding to ping | 13:33 |
gfidente | weshay done https://etherpad.openstack.org/p/tripleo-meeting-items | 13:35 |
*** aakarsh has joined #tripleo | 13:35 | |
EmilienM | weshay: i feel like the step 5 takes longer than before as well | 13:36 |
*** hjensas|afk has quit IRC | 13:36 | |
*** sri_ has joined #tripleo | 13:36 | |
EmilienM | weshay: checkout my spreadsheet | 13:36 |
weshay | ya.. I get a little .. I am | 13:37 |
bandini | gfidente: ? | 13:37 |
gfidente | bandini you have a topic for next week tripleo meeting https://etherpad.openstack.org/p/tripleo-meeting-items | 13:37 |
weshay | I get a little nervous about just comparing a couple jobs.. discussing putting up an ara server now and getting it populated w/ the sqlite dbs in the jobs | 13:37 |
EmilienM | step5 is usually nova compute related | 13:40 |
EmilienM | owalsh: from the top of your head, anything recently added at step5 which could make the deployment slower? | 13:40 |
mwhahaha | what about that wait task | 13:40 |
EmilienM | maybe the inflight validations? | 13:40 |
owalsh | EmilienM: well, they were too trigger happen so they will wait a little bit now. How much slower are we talking about? | 13:41 |
owalsh | s/happen/happy | 13:42 |
*** hberaud|lunch is now known as hberaud | 13:42 | |
EmilienM | owalsh: 7 min slower at step 5 | 13:42 |
EmilienM | on both undercloud & overcloud | 13:42 |
EmilienM | that's a pattern | 13:42 |
weshay | sshnaidm|rover, panda|ruck can we spin up an ara server today.. and play around w/ it? | 13:42 |
sshnaidm|rover | weshay, yeah, will look into it | 13:42 |
dmsimard | I can help | 13:42 |
sshnaidm|rover | dmsimard, great | 13:42 |
sshnaidm|rover | dmsimard, I plan to send data from job to server and get back url to playbooks page, is it possible? | 13:43 |
*** aakarsh has quit IRC | 13:43 | |
sshnaidm|rover | dmsimard, so I can publish this url later in zuul artifacts | 13:44 |
openstackgerrit | Slawek Kaplonski proposed openstack/tripleo-quickstart-extras master: Add playbook and role to run tobiko https://review.opendev.org/655423 | 13:44 |
dmsimard | sshnaidm|rover: yes, it wouldn't be much different than what we had on logs.openstack.org | 13:44 |
EmilienM | mwhahaha: yeah that's it | 13:44 |
EmilienM | https://f9d9188beb36665070d5-34d5a3d9e2a9ca67eb62c8365f7602e7.ssl.cf1.rackcdn.com/682717/3/check/tripleo-ci-centos-7-containers-multinode/2319472/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz | 13:45 |
EmilienM | TASK [Get nova-conductor healthcheck status] | 13:45 |
EmilienM | it takes 1min to check nova-conductor (alone) | 13:45 |
EmilienM | and there are other nova services to check | 13:45 |
owalsh | EmilienM: it's waiting for the healthcheck to run, scheduled to run after 120s | 13:46 |
mwhahaha | let's revert the validations | 13:46 |
mwhahaha | i think we need to rethink those | 13:46 |
owalsh | mwhahaha: just turn them ogg? | 13:46 |
owalsh | off? | 13:46 |
mwhahaha | no because it turns all of them off | 13:46 |
mwhahaha | that's not what we want | 13:46 |
EmilienM | we can't turn it off | 13:46 |
mwhahaha | we can't be waiting 2 mins | 13:46 |
weshay | thought validations were going to be contained to 1 job | 13:46 |
EmilienM | unless we set ContainerHealthcheckDisabled to True | 13:46 |
EmilienM | but then we disable a bunch of other things | 13:46 |
weshay | that's what we spoke about at ptg / den | 13:47 |
mwhahaha | inflight validations shouldn't extend the deployment | 13:47 |
mwhahaha | they should be simple checks | 13:47 |
mwhahaha | i don't think using the health check systemd stuff is correct | 13:47 |
EmilienM | we could create a EnableInflightValidations parameter | 13:48 |
EmilienM | and turn off by default | 13:48 |
*** brault has joined #tripleo | 13:48 | |
EmilienM | mwhahaha: you ok with it? | 13:48 |
mwhahaha | fine | 13:48 |
EmilienM | i'll send a patch if yes | 13:48 |
EmilienM | ok | 13:48 |
mwhahaha | features need to be off by default anyway | 13:48 |
*** brault has quit IRC | 13:48 | |
mwhahaha | until we are sure they don't mess things up | 13:49 |
weshay | +1111 | 13:49 |
EmilienM | ok so that will be 7 min x 2 saved | 13:49 |
EmilienM | which will probably help | 13:49 |
weshay | totally | 13:49 |
* EmilienM prepares a patch | 13:49 | |
weshay | isn't that Tengu and gchamoul ? | 13:49 |
gchamoul | weshay: inflight is more Tengu | 13:50 |
weshay | Tengu, yo yo | 13:50 |
gchamoul | weshay: in PTO till Monday! | 13:50 |
owalsh | EmilienM: could also try reducing the retry interval | 13:51 |
weshay | bah.. probably hiking mountains and making friends w/ mountain goats and sheep | 13:51 |
EmilienM | it'll stress podman exec too much IMHO | 13:51 |
EmilienM | even if we reduce to 60s, or 30s, I really don't want our deployment to wait 30s and do nothing | 13:51 |
EmilienM | no? | 13:51 |
gchamoul | EmilienM: we already have a parameter for that in tripleoclient | 13:51 |
owalsh | yea, problem is that we didn't catch the failing healthchecks in CI for a long time, now we do | 13:51 |
EmilienM | gchamoul: what's the param | 13:52 |
EmilienM | ah disable-validations? | 13:52 |
mwhahaha | failing health checks can be a single check at the end of the deploy | 13:52 |
gchamoul | EmilienM: https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_deploy.py#L734-L742 | 13:52 |
mwhahaha | i'm not sure that's the correct usage of inflight validations | 13:52 |
owalsh | EmilienM: ^^^ didn't you say we want to fail early? | 13:53 |
mwhahaha | inflight validations are more to make sure the services are ok not that healthchecks are bad | 13:53 |
mwhahaha | so we can have a single check at the end of a step that makes sure all checks are ok | 13:53 |
mwhahaha | we actually had that proposed | 13:53 |
gchamoul | EmilienM: --no-inflight-validations, to false by default | 13:53 |
mwhahaha | i think we landed it and had issues | 13:53 |
mwhahaha | back in like queens | 13:53 |
EmilienM | gchamoul: right but we still need a way to disable it in the THT | 13:53 |
mwhahaha | https://review.opendev.org/#/c/569153/ | 13:54 |
EmilienM | the easiest way without a param, would be to exclude ansible tags for these tasks | 13:54 |
EmilienM | it would avoid adding a param everywhere | 13:54 |
EmilienM | https://opendev.org/openstack/python-tripleoclient/src/branch/master/tripleoclient/workflows/deployment.py#L342-L345 | 13:54 |
EmilienM | is supposed to do that no? | 13:54 |
*** artom has quit IRC | 13:54 | |
mwhahaha | inflight validations should be off by default | 13:55 |
EmilienM | gchamoul: it's enabled by default here: https://opendev.org/openstack/python-tripleoclient/src/branch/master/tripleoclient/workflows/deployment.py#L338 | 13:55 |
mwhahaha | not on by default | 13:55 |
EmilienM | let's fix it in tripleoclient, not THT | 13:56 |
gchamoul | EmilienM: so these patches don't work? https://review.opendev.org/#/q/Ic3af7eb49ee6db5bc0ab10302c3f2a2c616db7b6 | 13:56 |
EmilienM | gchamoul: the param isn't set by default | 13:57 |
gchamoul | ok | 13:57 |
EmilienM | let me see again | 13:57 |
owalsh | EmilienM: re the retry interval I meant this one - https://review.opendev.org/#/c/681442/8/deployment/nova/nova-api-container-puppet.yaml@446. It's not hitting podman exec, just querying systemd | 13:58 |
EmilienM | clearly --disable-validations didn't disable the in flight validations | 13:58 |
EmilienM | owalsh: yes we should reduce that for sure | 13:58 |
EmilienM | owalsh: I'll let gchamoul fix it and also in his patch in tripleo-validation for the new tasks he's doing | 13:58 |
owalsh | EmilienM: ack | 13:59 |
EmilienM | gchamoul: the code you just pasted, changed since then | 13:59 |
cloudnull | so the Tripleo-Transformation meeting is supposed to take place in a min, but I think we should push it to not disrupt this conversation ? | 14:00 |
mwhahaha | also that's only overcloud | 14:00 |
mwhahaha | we need them off for the undercloud too | 14:00 |
EmilienM | right | 14:00 |
EmilienM | I think the logic in the client is wrong i'm digging | 14:00 |
EmilienM | cloudnull: you can go ahead, don't want to interrupt your meeting | 14:00 |
cloudnull | #startmeeting tripleo | 14:02 |
openstack | Meeting started Wed Sep 18 14:02:32 2019 UTC and is due to finish in 60 minutes. The chair is cloudnull. Information about MeetBot at http://wiki.debian.org/MeetBot. | 14:02 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 14:02 |
*** openstack changes topic to " (Meeting topic: tripleo)" | 14:02 | |
openstack | The meeting name has been set to 'tripleo' | 14:02 |
EmilienM | gchamoul: https://github.com/openstack/python-tripleoclient/commit/bf48dbc84405208dd86ae3dd4879fc7735b99838 | 14:02 |
EmilienM | that broke the logic | 14:02 |
cloudnull | #topic rollcall | 14:02 |
*** openstack changes topic to "rollcall (Meeting topic: tripleo)" | 14:02 | |
cloudnull | o/ | 14:02 |
fultonj | o/ | 14:02 |
gchamoul | EmilienM: yes I was not in right branch sorry | 14:03 |
gchamoul | o/ | 14:03 |
cloudnull | welcome to the Tripleo-Transformation squad meeting | 14:03 |
owalsh | o/ | 14:03 |
*** ykarel is now known as ykarel|afk | 14:04 | |
cloudnull | lets give a few for folks to trickle in. | 14:04 |
*** cylopez has joined #tripleo | 14:05 | |
*** xek_ has joined #tripleo | 14:06 | |
cloudnull | #topic recap | 14:07 |
*** openstack changes topic to "recap (Meeting topic: tripleo)" | 14:07 | |
cloudnull | Etherpad | 14:07 |
cloudnull | #link https://etherpad.openstack.org/p/tripleo-ansible-agenda | 14:07 |
cloudnull | Work-board | 14:07 |
cloudnull | #link https://storyboard.openstack.org/#!/board/174 | 14:07 |
cloudnull | Project | 14:07 |
cloudnull | #link https://storyboard.openstack.org/#!/project/openstack/tripleo-ansible | 14:07 |
cloudnull | Open Reviews in need | 14:07 |
cloudnull | #link https://review.opendev.org/#/q/project:%255Eopenstack/tripleo-ansible+status:open+label:verified%253D%252B1%252Cuser%253Dzuul | 14:07 |
cloudnull | THT2TA work in need of reviews | 14:07 |
cloudnull | https://review.opendev.org/#/q/topic:THT2TA+(status:open) | 14:07 |
weshay | 0/ | 14:07 |
cloudnull | just to highlight some recent things going on: | 14:07 |
cloudnull | the ansible-sig had its first meeting last week | 14:07 |
cloudnull | http://eavesdrop.openstack.org/meetings/ansible_sig/2019/ansible_sig.2019-09-13-14.03.log.html | 14:07 |
cloudnull | there were some great conversations and introductions that took place. | 14:08 |
cloudnull | It's still SUPER early days on that effort, but hopefully it can help the general deployment communities converge on some things | 14:08 |
cloudnull | http://eavesdrop.openstack.org/meetings/ansible_sig/2019/ansible_sig.2019-09-13-14.03.html | 14:09 |
cloudnull | ^ if folks are interested | 14:09 |
cloudnull | a lot of the THT2TA work is currently on pause as we stabilize Train. | 14:10 |
*** dtantsur is now known as dtantsur|afk | 14:10 | |
cloudnull | #topic open-discussion | 14:10 |
*** openstack changes topic to "open-discussion (Meeting topic: tripleo)" | 14:10 | |
openstackgerrit | Emilien Macchi proposed openstack/python-tripleoclient master: Disable inflight validations by default https://review.opendev.org/682905 | 14:10 |
cloudnull | anyone have something they want to raise ? | 14:10 |
EmilienM | gchamoul, weshay, mwhahaha ^ | 14:10 |
openstackgerrit | Emilien Macchi proposed openstack/python-tripleoclient master: Disable inflight validations by default https://review.opendev.org/682905 | 14:10 |
cloudnull | owalsh have you had a moment to take a look at that connection plugin again ? | 14:11 |
owalsh | cloudnull: not yet, I'll take a look after this | 14:11 |
cloudnull | mnaser are we unblocked on : | 14:13 |
cloudnull | #link https://review.opendev.org/#/c/676421 | 14:13 |
*** aakarsh has joined #tripleo | 14:15 | |
cloudnull | If there's nothing else that folks want to talk about, then I think we can call it. | 14:18 |
cloudnull | Thanks everyone! | 14:18 |
cloudnull | #endmeeting | 14:18 |
*** openstack changes topic to "CI Status: RED ( no rechecks, wait for https://review.opendev.org/#/c/682729/ ) | community irc meeting Tues@1400 UTC - tripleo-ci-community meeting Tues@1330 UTC | https://docs.openstack.org/tripleo-docs/latest/" | 14:18 | |
openstack | Meeting ended Wed Sep 18 14:18:26 2019 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 14:18 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/tripleo/2019/tripleo.2019-09-18-14.02.html | 14:18 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/tripleo/2019/tripleo.2019-09-18-14.02.txt | 14:18 |
openstack | Log: http://eavesdrop.openstack.org/meetings/tripleo/2019/tripleo.2019-09-18-14.02.log.html | 14:18 |
openstackgerrit | Emilien Macchi proposed openstack/python-tripleoclient master: Disable inflight validations by default https://review.opendev.org/682905 | 14:19 |
*** Vorrtex has quit IRC | 14:20 | |
*** openstackgerrit has quit IRC | 14:21 | |
cloudnull | EmilienM weshay looking at the task breakdown for a recent multinode job (3+hours) and a multinode job from before aug 20, it looks like we lost ~30 min in the overcloud deploy | 14:26 |
EmilienM | right that's what we said | 14:26 |
*** openstackgerrit has joined #tripleo | 14:26 | |
openstackgerrit | Jesse Pretorius (odyssey4me) proposed openstack/tripleo-common master: Implement Ansible fact cache for Mistral executor https://review.opendev.org/682855 | 14:26 |
EmilienM | 10 min saved with the reverts | 14:26 |
EmilienM | and 15 min saved with the inflight validations | 14:27 |
*** sri_ has quit IRC | 14:27 | |
EmilienM | ~25min saved | 14:27 |
cloudnull | i may be wrong but it seems like validations were running before and taking the same amount of time ? | 14:27 |
owalsh | cloudnull: they weren't waiting for the healthcheck to have run | 14:28 |
EmilienM | no they weren't | 14:28 |
weshay | panda|ruck, sshnaidm|rover fyi ^ | 14:28 |
owalsh | cloudnull: so they always succeeded | 14:28 |
weshay | panda|ruck, sshnaidm|rover we'll need to schedule a handoff for the new ruck/rovers | 14:28 |
cloudnull | I have PLAY [Validate the deployment] taking 23 min on a recent run and 29 on an aug 19 run | 14:29 |
cloudnull | on the sheets recent-success-play-breakdown and b4aug20-success-play-breakdown I broke down all the timings | 14:29 |
owalsh | cloudnull: they are run in the deploy_steps playbook | 14:30 |
odyssey4me | EmilienM cloudnull Although my motivation was not related to the current job timeouts, https://review.opendev.org/682855 may help reduce CI time a fair bit. | 14:32 |
EmilienM | odyssey4me: do we need to bind mount var/tmp/ansible_fact_cache in the mistral executor container? | 14:33 |
odyssey4me | EmilienM: it doesn't have to be persistent, in fact if the underlying OS changes it's better if it's flushed and rebuilt | 14:34 |
odyssey4me | It's mainly useful when doing multiple playbook runs after each other. | 14:34 |
odyssey4me | I dunno if that gives enough context to answer your question. I don't really know if it's best to bind mount or not. I'm still learning how everything fits together. | 14:35 |
odyssey4me | Maybe cloudnull could comment, given that he's got more exposure and understands the implications better. | 14:36 |
* cloudnull knows nothing | 14:36 | |
odyssey4me | lol | 14:36 |
owalsh | don't think that applies to deploy_steps_playbook.yaml - facts are gathered once at the start | 14:36 |
*** brault has joined #tripleo | 14:36 | |
cloudnull | EmilienM owalsh do we run through the same container prepare tools on both the under/overcloud? | 14:36 |
*** hamdyk has quit IRC | 14:36 | |
EmilienM | same tools yes | 14:37 |
odyssey4me | It will most certainly help with updates/upgrades/ffwd-upgrades. There are a lot of playbooks with fact gathering in there. | 14:37 |
odyssey4me | And that's the situation which gave rise to the patch in the first place. | 14:37 |
cloudnull | odyssey4me that makes sense. as for the cache path, if it doesn't exist ansible will try and create it. | 14:38 |
odyssey4me | cloudnull: ++ | 14:38 |
cloudnull | it looks like "gather facts needed by role" is run 5x during overcloud_deploy. | 14:40 |
weshay | mwhahaha, EmilienM looks like 7.7 is upstream now https://c540761633a53fe4ef5d-0ee3eeb68aa256c74f1e35f60c262d61.ssl.cf1.rackcdn.com/673400/33/check/tripleo-ci-centos-7-containers-multinode/5196268/logs/undercloud/etc/centos-release.txt.gz | 14:40 |
mwhahaha | oh noes | 14:40 |
cloudnull | 11x in the multinode job output | 14:40 |
fultonj | weshay: is it ok to +w reviews now? | 14:41 |
weshay | not yet | 14:41 |
weshay | sorry | 14:41 |
EmilienM | lol | 14:41 |
* EmilienM opens the window | 14:41 | |
* fultonj removes +w from a review | 14:42 | |
owalsh | cloudnull: this is the fact gathering I was referring to - once at the start of the playbook IIUC https://github.com/openstack/tripleo-heat-templates/blob/master/common/deploy-steps.j2#L845 | 14:43 |
owalsh | cloudnull: not sure if the "gather facts needed by role" does anything, should be skipped when ansible_python is defined | 14:46 |
cloudnull | owalsh yea, looking at https://075427cdf1b184b4819e-6c26b67c11fc06a040fd3a92715b9b9e.ssl.cf1.rackcdn.com/682410/2/check/tripleo-ci-centos-7-containers-multinode/b03cf28/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz it looks like "gather facts used by role" was skipped. however, "Gathering Facts" looks like it's run 6x between the general job | 14:50 |
cloudnull | output and the overcloud deploy | 14:50 |
EmilienM | yeah I noticed the ansible fact thing | 14:50 |
cloudnull | each one is about 1 minute, so not a ton of time | 14:51 |
cloudnull | but could be time saved if we can use cached facts? | 14:51 |
owalsh | cloudnull: also run it twice on the undercloud deploy | 14:52 |
*** gkadam has joined #tripleo | 14:54 | |
*** ykarel|afk is now known as ykarel | 14:54 | |
*** gkadam_ has joined #tripleo | 14:55 | |
*** gkadam has quit IRC | 14:56 | |
odyssey4me | cloudnull owalsh Well, in CI perhaps it's a short period - in a production env we were working with the other day it took ~4 mins per host, per fork batch... and the env had a lot of hosts, so it took quite a while | 14:56 |
cloudnull | ++ | 14:57 |
*** jaosorior has quit IRC | 14:57 | |
cloudnull | if it works and saves us 5 min in CI that would be a big help | 14:57 |
* cloudnull IMO | 14:57 | |
owalsh | cloudnull: sure it's minutes? I'm seeing 10s | 14:58 |
odyssey4me | the choice is either to not gather facts and use a setup task which scopes the facts gathered - which is hard to ensure we consider in reviews and is hard to maintain... or to just cache facts, which is easier to maintain | 14:58 |
cloudnull | I think across all the times we run gather facts its ~5 minutes | 14:59 |
* cloudnull just browsing logs | 14:59 | |
odyssey4me | it's plausible that we can exclude the facter/ohai subset of facts, which will save even more time... but I've not managed to validate that's OK to do just yet | 14:59 |
*** fmount has joined #tripleo | 15:00 | |
owalsh | and on real deployment it's even faster... <1s for undercloud, ~4s for the 4 overcloud nodes | 15:01 |
*** artom has joined #tripleo | 15:03 | |
paramite | shouldn't map_merge return single map and not single map wrapped in array? | 15:04 |
*** gkadam_ has quit IRC | 15:04 | |
owalsh | cloudnull, odyssey4me: if we were to cache facts between runs it could break things, e.g here https://review.opendev.org/#/c/681042/12/deployment/nova/nova-compute-common-container-puppet.yaml@86 | 15:05 |
*** iurygregory has quit IRC | 15:05 | |
*** arxcruz is now known as arxcruz|ruck | 15:08 | |
bogdando | EmilienM: another boost for CI may be bind-mounting yum cache for containers updater | 15:08 |
bogdando | or may be not if those get updated from the locally built gate repo anyway | 15:09 |
*** ade_lee_ has joined #tripleo | 15:10 | |
bogdando | EmilienM: I mean https://opendev.org/openstack/tripleo-common/src/branch/master/scripts/container-update.py#L113 | 15:11 |
owalsh | cloudnull, odyssey4me: although cacheable: false should avoid this I assume | 15:11 |
bogdando | something like https://review.opendev.org/#/c/575742/8/scripts/container-update.py@145 ... | 15:12 |
*** xek__ has joined #tripleo | 15:12 | |
bogdando | but we prolly not using container-update.py anymore but its corresponding ansible role? | 15:13 |
*** ade_lee__ has quit IRC | 15:13 | |
*** xek_ has quit IRC | 15:14 | |
odyssey4me | owalsh: Yeah, the good news is that set_fact defaults to having cacheable=no. | 15:16 |
owalsh | odyssey4me: ah, cool | 15:17 |
lbragstad | EmilienM o/ do you happen to have an update on the status of CI? https://review.opendev.org/#/c/680345 | 15:32 |
EmilienM | lbragstad: read topic | 15:32 |
EmilienM | CI Status: RED | 15:32 |
EmilienM | :D | 15:32 |
lbragstad | bah - thanks :) | 15:32 |
EmilienM | np | 15:32 |
weshay | heh | 15:32 |
weshay | bogdando, that is an interesting idea | 15:33 |
*** iurygregory has joined #tripleo | 15:34 | |
*** yprokule has quit IRC | 15:35 | |
*** zbr has quit IRC | 15:36 | |
*** zbr has joined #tripleo | 15:36 | |
*** brault has quit IRC | 15:38 | |
*** sshnaidm|rover is now known as sshnaidm | 15:39 | |
*** dtantsur|afk is now known as dtantsur | 15:39 | |
*** panda|ruck is now known as panda | 15:39 | |
*** cfontain has joined #tripleo | 15:40 | |
*** morazi has quit IRC | 15:44 | |
*** cfontain has quit IRC | 15:44 | |
*** cfontain has joined #tripleo | 15:45 | |
weshay | EmilienM, fyi.. if you provide the url.. you can still get ara in ovb jobs https://logs.rdoproject.org/58/682458/2/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/23be3f1/logs/ara_oooq_overcloud/ara-report/ | 15:45 |
EmilienM | weshay: in bj with alex now | 15:46 |
weshay | np | 15:46 |
weshay | just fyi | 15:46 |
EmilienM | weshay: my patch will fix overcloud | 15:48 |
EmilienM | weshay: but we are working on the undercloud | 15:48 |
weshay | EmilienM, woot | 15:48 |
EmilienM | of course not same code ... :/ | 15:48 |
EmilienM | weshay: well, just for the validation thing | 15:48 |
*** ykarel is now known as ykarel|away | 15:50 | |
*** ramishra has quit IRC | 15:55 | |
weshay | any hoo.. the good news of the day... centos 7.7 did not impact tripleo / ci | 15:58 |
bogdando | how come | 16:00 |
bogdando | is it like for the 1st time in history? :) | 16:00 |
*** marios has quit IRC | 16:01 | |
weshay | bogdando, second :) | 16:01 |
weshay | I think we cleared 7.6 as well w/o issues | 16:02 |
weshay | bogdando, cr repo in ci :) | 16:02 |
bogdando | that's a goot trend | 16:02 |
bogdando | good* | 16:02 |
weshay | zbr, can you please put up a test change for rhel 8 scen 4 on https://review.opendev.org/#/c/682853/ | 16:02 |
weshay | zbr, point me at the change when you have it | 16:03 |
*** rpittau is now known as rpittau|afk | 16:03 | |
weshay | zbr, ? | 16:04 |
*** ykarel|away has quit IRC | 16:05 | |
zbr | sure | 16:05 |
weshay | zbr, thanks.. also please renick | 16:06 |
*** zbr is now known as zbr|ruck | 16:07 | |
*** hjensas|afk has joined #tripleo | 16:08 | |
*** cylopez has quit IRC | 16:08 | |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-ci master: DNM: Test tripleo-ci-rhel-8-scenario004-standalone-rdo https://review.opendev.org/682942 | 16:10 |
sshnaidm | weshay, seems like containers-multinode started to take more time from 25-26 Aug | 16:10 |
sshnaidm | weshay, as it seems from grafana.. | 16:11 |
weshay | sshnaidm, what are you looking at to determine that? | 16:11 |
*** xek__ has quit IRC | 16:11 | |
sshnaidm | weshay, try this: http://dashboard-ci.tripleo.org/d/si1tipHZk/jobs-exploration?orgId=1&from=now-90d&to=now&var-influxdb_filter=job_name%7C%3D%7Ctripleo-ci-centos-7-containers-multinode&var-influxdb_filter=branch%7C%3D%7Cmaster&var-influxdb_filter=result%7C%3D%7CSUCCESS | 16:12 |
*** gfidente has quit IRC | 16:12 | |
weshay | ah nice | 16:13 |
sshnaidm | weshay, I just went over pages in table of jobs | 16:13 |
sshnaidm | weshay, from page 8 it takes more time | 16:13 |
weshay | aye | 16:14 |
weshay | sshnaidm, thank you.. also gives me some ideas | 16:15 |
weshay | sshnaidm++ | 16:15 |
*** cfontain has quit IRC | 16:16 | |
*** radeks has quit IRC | 16:16 | |
sshnaidm | also it seems like long time for tempest - 1133 secs, but maybe this is ok.. chandankumar ? | 16:16 |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-heat-templates master: Run Manila in scenario004 without pacemaker https://review.opendev.org/682853 | 16:20 |
openstackgerrit | Emilien Macchi proposed openstack/python-tripleoclient master: Introduce --inflight-validations for standalone / undercloud https://review.opendev.org/682943 | 16:21 |
EmilienM | weshay: ^ patch for undercloud/standalone | 16:21 |
* weshay looks | 16:22 | |
chandankumar | sshnaidm: which fs? | 16:22 |
sshnaidm | chandankumar, 010 | 16:22 |
sshnaidm | chandankumar, containers multinode | 16:22 |
chandankumar | sshnaidm: that is running os_tempest | 16:23 |
sshnaidm | chandankumar, yep | 16:23 |
chandankumar | sshnaidm: one scenario runs there, so it's ok i think | 16:23 |
openstackgerrit | Emilien Macchi proposed openstack/python-tripleoclient master: Introduce --inflight-validations for standalone / undercloud https://review.opendev.org/682943 | 16:24 |
EmilienM | i rebased on top of the other to see results all together | 16:24 |
*** ayoung has joined #tripleo | 16:24 | |
*** dsneddon has joined #tripleo | 16:25 | |
*** dsneddon has quit IRC | 16:29 | |
*** kopecmartin is now known as kopecmartin|off | 16:29 | |
*** jpich has quit IRC | 16:31 | |
sshnaidm | weshay, have 2 mins? | 16:33 |
weshay | sshnaidm, yes | 16:33 |
sshnaidm | weshay, your bluej | 16:33 |
EmilienM | cloudnull, weshay, mwhahaha : so the plan is to land https://review.opendev.org/#/c/682905/ and https://review.opendev.org/#/c/682943/ first | 16:34 |
EmilienM | then https://review.opendev.org/#/c/682729/ and https://review.opendev.org/#/c/682769/ | 16:34 |
EmilienM | merging tripleoclient first to save 7 + 7 minutes | 16:34 |
EmilienM | and give a chance to the tripleo common reverts to pass the gate | 16:34 |
EmilienM | how does it sound? | 16:34 |
cloudnull | +1 | 16:35 |
EmilienM | cloudnull: so iiuc we'll likely have 401 again right | 16:35 |
*** bogdando has quit IRC | 16:36 | |
*** amoralej is now known as amoralej|off | 16:36 | |
*** dtantsur is now known as dtantsur|afk | 16:38 | |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/tripleo-common master: Revert the re-authentication logic https://review.opendev.org/682945 | 16:38 |
cloudnull | EmilienM https://review.opendev.org/#/c/682945/ | 16:39 |
cloudnull | that reverts both re-auth changes in one go | 16:39 |
cloudnull | yes, we're likely to run into the same 401 issue in OVH; however, I think it's worth trying to see if these couple of reviews stabilize the gate, and circle back on the re-auth logic if needed | 16:40 |
*** derekh has quit IRC | 16:41 | |
EmilienM | cloudnull++ | 16:42 |
EmilienM | brb lunch | 16:42 |
*** ksambor_ has joined #tripleo | 16:43 | |
*** ksambor has quit IRC | 16:45 | |
*** ksambor_ is now known as ksambor | 16:45 | |
*** tosky has quit IRC | 16:49 | |
*** dsneddon has joined #tripleo | 16:53 | |
*** tesseract has quit IRC | 16:53 | |
*** brault has joined #tripleo | 17:00 | |
*** hberaud is now known as hberaud|gone | 17:06 | |
mwhahaha | EmilienM: i'd rather not land the reverts unless we have to | 17:12 |
mwhahaha | can we just see if the adjusting of the pools address it? | 17:12 |
cloudnull | EmilienM did that here https://review.opendev.org/#/c/682731/ | 17:12 |
cloudnull | its running now, so hopefully more soon | 17:13 |
cloudnull | mwhahaha I was thinking about your 401 journey and was thinking that if we do land the 401 revert, maybe we can circle back and re-implement some of the things we learned along the way, like encoding='utf-8', adding connection retries instead of function retries, etc. | 17:15 |
cloudnull | that said, the revert may not actually be necessary | 17:15 |
*** pbandark has quit IRC | 17:18 | |
*** jpena is now known as jpena|off | 17:19 | |
*** ricolin has quit IRC | 17:27 | |
EmilienM | mwhahaha, cloudnull : 2019-09-17 23:07:59 | TASK [tripleo-container-image-prepare : Run tripleo-container-image-prepare logged to: /var/log/tripleo-container-image-prepare.log] *** | 17:31 |
EmilienM | 2019-09-17 23:27:24 | changed: [undercloud] | 17:31 |
EmilienM | https://storage.bhs1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_5d7/682731/2/check/tripleo-ci-centos-7-containers-multinode/5d773cc/logs/undercloud/home/zuul/undercloud_install.log.txt.gz | 17:31 |
EmilienM | the relax patch didn't help | 17:31 |
EmilienM | 20 min :/ | 17:31 |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/tripleo-common master: image_uploader: relax HTTPAdapter pool_connections and pool_maxsize https://review.opendev.org/682731 | 17:32 |
cloudnull | EmilienM updated your relax patch, maybe it will help, maybe not? | 17:33 |
EmilienM | looking | 17:33 |
cloudnull | fewer retries, with no backoff | 17:34 |
EmilienM | interesting | 17:34 |
EmilienM | we'll see | 17:34 |
*** iurygregory has quit IRC | 17:35 | |
dmsimard | sshnaidm: back from meetings, lunch, etc etc | 17:35 |
sshnaidm | dmsimard, ack, so let's start from install, how to install ara server? | 17:36 |
*** zbr|ruck is now known as zbr | 17:36 | |
* cloudnull going to grab a bite to eat, brb | 17:37 | |
dmsimard | sshnaidm: I guess we want something like logs.o.o where we set up apache with the ara sqlite middleware | 17:38 |
dmsimard | and then we need a mechanism to upload databases | 17:38 |
sshnaidm | dmsimard, ok | 17:40 |
dmsimard | from zuul's perspective, we already have the ara-report role but it's turned off right now -- it would save the database in an ara-report directory at the top level of the logs | 17:41 |
dmsimard | we could look at how to customize the role to allow uploading the ara-report directory to an arbitrary location | 17:41 |
dmsimard | so we'd need a keypair with the private key as a zuul secret to upload them | 17:42 |
sshnaidm | dmsimard, we can do it in our collect-logs role, I'll handle this | 17:43 |
*** jtomasek has quit IRC | 17:43 | |
sshnaidm | dmsimard, and I think we can start w/o secrets, will add them later | 17:43 |
dmsimard | on logs.o.o, ara is set up through puppet and isn't very consumable outside of that specific use case | 17:43 |
dmsimard | I thought about logs.rdo which already has the middleware setup but the poor server is dying | 17:44 |
sshnaidm | dmsimard, on logs.o.o it's off and infra didn't provide solution yet, on rdo it works when it's not overloaded much | 17:44 |
sshnaidm | dmsimard, when we run nested ansible in jobs we have 2-3 ara.sqlite files that we need to send to custom ara-server | 17:45 |
dmsimard | oh wow logs.rdo is at 73% used, it's been a long time since I've seen it under 90% | 17:45 |
sshnaidm | dmsimard, should we use something like that? https://ara.readthedocs.io/en/latest/ansible-role-ara-web.html | 17:47 |
dmsimard | sshnaidm: those are for >1.0 which is py3 (please centos8) | 17:48 |
dmsimard | for <1.0 there is https://github.com/recordsansible/ansible-role-ara which is largely unmaintained right now | 17:49 |
sshnaidm | dmsimard, let's use py3 | 17:50 |
dmsimard | sshnaidm: are you using ansible with py3 though ? | 17:50 |
sshnaidm | dmsimard, sure, why not | 17:50 |
dmsimard | if you're already using ansible with py3, this makes everything much easier :) | 17:51 |
sshnaidm | dmsimard, for >1.0 - what does that mean? We need to collect data using ara>1.0? | 17:51 |
dmsimard | sshnaidm: yes -- two very different code bases and database models | 17:52 |
dmsimard | the collection and the server needs to match | 17:52 |
EmilienM | cloudnull: it looks much cleaner indeed (I read requests doc) | 17:52 |
dmsimard | sshnaidm: can you point me to the bits where you install ansible and set it up to use ara ? | 17:53 |
sshnaidm | dmsimard, just in venv, https://github.com/openstack/tripleo-quickstart/blob/master/requirements.txt | 17:55 |
*** matbu has quit IRC | 17:56 | |
dmsimard | sshnaidm: ok, let me try something | 17:56 |
sshnaidm | dmsimard, this is how we use it: https://github.com/openstack/ansible-role-collect-logs/blob/master/tasks/publish.yml#L18-L59 | 17:56 |
sshnaidm | dmsimard, bbl | 17:56 |
*** sshnaidm is now known as sshnaidm|bbl | 17:57 | |
*** mmethot_ has joined #tripleo | 17:57 | |
*** mmethot_ has quit IRC | 17:58 | |
*** mmethot_ has joined #tripleo | 17:59 | |
*** mmethot has quit IRC | 18:00 | |
*** brault has quit IRC | 18:03 | |
*** mcornea has quit IRC | 18:06 | |
*** mmethot_ has quit IRC | 18:06 | |
*** mcornea has joined #tripleo | 18:06 | |
*** mmethot has joined #tripleo | 18:08 | |
openstackgerrit | David Moreau Simard proposed openstack/tripleo-quickstart master: DNM: Test ara master branch and send results to a remote server https://review.opendev.org/682956 | 18:09 |
*** mmethot has quit IRC | 18:11 | |
openstackgerrit | David Moreau Simard proposed openstack/tripleo-quickstart master: DNM: Test ara master branch and send results to a remote server https://review.opendev.org/682956 | 18:14 |
openstackgerrit | David Moreau Simard proposed openstack/ansible-role-collect-logs master: DNM: Adjust paths and commands for ara collection for >1.0 https://review.opendev.org/682958 | 18:15 |
*** mmethot has joined #tripleo | 18:16 | |
dmsimard | sshnaidm|bbl: ^ the first patch should send data to https://api.trunk.demo.recordsansible.org/ which is just an instance for test purposes | 18:17 |
dmsimard | if it works well we can deploy a static server with the new roles and figure out how to upload the databases | 18:18 |
*** lmiccini has quit IRC | 18:23 | |
cloudnull | back | 18:32 |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/tripleo-common master: image_uploader: relax HTTPAdapter pool_connections and pool_maxsize https://review.opendev.org/682731 | 18:34 |
cloudnull | ^ fixed pep8 issue | 18:34 |
*** openstackgerrit has quit IRC | 18:37 | |
*** ade_lee__ has joined #tripleo | 18:43 | |
*** ade_lee_ has quit IRC | 18:46 | |
weshay | jfrancoa, fyi.. the gate is closed, please stop workflowing for now :) | 18:55 |
*** jcoufal has quit IRC | 18:56 | |
*** lucasagomes has quit IRC | 18:58 | |
jfrancoa | jfrancoa: ack, thanks for reminder | 19:01 |
dpeacock | Re-checks still off limits? (Just checking topic is valid). | 19:03 |
slagle | we need a topic for the topic showing the status of the topic | 19:07 |
*** cylopez has joined #tripleo | 19:07 | |
dpeacock | heh | 19:07 |
dpeacock | sorry | 19:08 |
*** mcornea has quit IRC | 19:10 | |
*** openstackgerrit has joined #tripleo | 19:16 | |
openstackgerrit | Michele Baldessari proposed openstack/tripleo-ansible master: WIP OVN VIP https://review.opendev.org/682968 | 19:16 |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo master: WIP OVN separate VIP https://review.opendev.org/672673 | 19:16 |
*** Vorrtex has joined #tripleo | 19:17 | |
*** mcornea has joined #tripleo | 19:17 | |
*** mcornea has quit IRC | 19:22 | |
*** holser has joined #tripleo | 19:26 | |
*** AJaeger has joined #tripleo | 19:28 | |
openstackgerrit | Andreas Jaeger proposed openstack/tripleo-ansible master: Remove duplicate publish docs jobs https://review.opendev.org/682971 | 19:28 |
AJaeger | tripleo team, my last change was incomplete, sorry ^ | 19:29 |
*** holser has quit IRC | 19:29 | |
*** holser has joined #tripleo | 19:30 | |
mwhahaha | k | 19:31 |
weshay | mwhahaha, ContainerCli: docker in master ci/environments should all be podman ya? | 19:31 |
mwhahaha | no | 19:31 |
mwhahaha | it's docker only if not HA | 19:31 |
mwhahaha | er | 19:31 |
mwhahaha | it's docker if pacemaker is enabled | 19:32 |
mwhahaha | it's podman only if pacemaker is not enabled | 19:32 |
mwhahaha | until we get centos8 | 19:32 |
*** holser has quit IRC | 19:32 | |
*** mcornea has joined #tripleo | 19:32 | |
weshay | k | 19:32 |
weshay | hrm.. scen04 is pacemaker | 19:33 |
weshay | and docker as you said | 19:33 |
weshay | ah.. but for rhel8 job I need to override to podman | 19:34 |
AJaeger | thanks, mwhahaha | 19:34 |
*** cylopez has quit IRC | 19:35 | |
*** AJaeger has left #tripleo | 19:35 | |
*** holser has joined #tripleo | 19:36 | |
*** tosky has joined #tripleo | 19:43 | |
*** jbadiapa has quit IRC | 19:44 | |
openstackgerrit | Andreas Jaeger proposed openstack/tripleo-validations stable/stein: Switch to promote docs job https://review.opendev.org/682982 | 19:45 |
*** AJaeger has joined #tripleo | 19:46 | |
AJaeger | mwhahaha: https://review.opendev.org/682982 is a backport that would be nice as well sometime | 19:47 |
*** artom has quit IRC | 19:47 | |
mwhahaha | yup i'll merge once that one passes ci since it's a bit more complex than the last :) | 19:48 |
mwhahaha | actually since it's a clean cherry pick i'm sure it's fine | 19:49 |
*** holser__ has joined #tripleo | 19:50 | |
AJaeger | if it's not fine, you have another problem ;) But let's wait and see :) | 19:51 |
AJaeger | thanks | 19:51 |
*** sshnaidm|bbl is now known as sshnaidm | 19:52 | |
*** Goneri has quit IRC | 19:52 | |
mwhahaha | ¯\_(ツ)_/¯ stranger things have happened | 19:53 |
*** holser has quit IRC | 19:53 | |
sshnaidm | mwhahaha, EmilienM do you have any idea why sc012 job fails with podman, but passes with docker? docker: https://review.opendev.org/#/c/682410/ podman: https://review.opendev.org/#/c/682341/ | 19:55 |
sshnaidm | seems like it runs different sets of containers.. | 19:55 |
mwhahaha | what's 012 | 19:55 |
EmilienM | ironic? | 19:56 |
EmilienM | don't rem | 19:56 |
mwhahaha | 2019-09-18 09:03:50 | "stderr: Error: error checking path \"/run/libvirt\": stat /run/libvirt: no such file or directory", | 19:57 |
mwhahaha | also | 19:57 |
mwhahaha | 2019-09-18 08:51:10 | "2019-09-18 08:51:08,413 DEBUG: 84179 -- Error: refusing to remove \"container-puppet-mysql_init_tasks\" as it exists in libpod as container 264c2b261ea882c642f1c36b0049c72b9de3bb8ff6c54055fd286e44608a0b17: container already exists", | 19:57 |
*** jbadiapa has joined #tripleo | 19:57 | |
mwhahaha | those logs make me sad in general | 19:58 |
*** AJaeger has left #tripleo | 19:58 | |
sshnaidm | mwhahaha, it's same in docker job | 19:58 |
sshnaidm | but this task passes there | 19:58 |
mwhahaha | docker is ok with mounting nonexistent files | 19:59 |
mwhahaha | podman is not | 19:59 |
EmilienM | it reminds me of something :P | 19:59 |
sshnaidm | omg | 19:59 |
mwhahaha | didn't i add logic to handle that | 19:59 |
mwhahaha | or did that not land | 19:59 |
* mwhahaha checks | 19:59 | |
mwhahaha | https://review.opendev.org/#/c/682141/ | 19:59 |
mwhahaha | oh look | 19:59 |
mwhahaha | it's still pending | 19:59 |
*** cylopez has joined #tripleo | 20:00 | |
*** holser__ has quit IRC | 20:04 | |
sshnaidm | I'll rebase on it | 20:06 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-ci master: Use podman for scenario012 https://review.opendev.org/682341 | 20:07 |
*** jfrancoa has quit IRC | 20:08 | |
*** pcaruana has quit IRC | 20:10 | |
*** alexmcleod has quit IRC | 20:13 | |
*** cfontain has joined #tripleo | 20:16 | |
openstackgerrit | Merged openstack/tripleo-ansible master: Remove duplicate publish docs jobs https://review.opendev.org/682971 | 20:17 |
*** cfontain has quit IRC | 20:17 | |
*** cfontain has joined #tripleo | 20:17 | |
EmilienM | https://logs.rdoproject.org/05/682905/3/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/8cd15e1/logs/undercloud/var/lib/mistral/overcloud/ansible-playbook-command.sh.txt.gz | 20:17 |
EmilienM | ansible-playbook-2 -v /var/lib/mistral/overcloud/deploy_steps_playbook.yaml --become --inventory-file /var/lib/mistral/overcloud/tripleo-ansible-inventory.yaml --skip-tags opendev-validation "$@" | 20:17 |
EmilienM | https://review.opendev.org/#/c/682905/ seems to do the job | 20:18 |
EmilienM | mwhahaha: what happens with ansible facts? | 20:19 |
EmilienM | it's crazy | 20:19 |
EmilienM | in the logs | 20:19 |
mwhahaha | what do you mean | 20:19 |
EmilienM | logs sound fucked up to me | 20:19 |
mwhahaha | which ones | 20:19 |
EmilienM | https://logs.rdoproject.org/05/682905/3/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/8cd15e1/logs/undercloud/var/lib/mistral/overcloud/ansible.log.txt.gz | 20:19 |
EmilienM | LINE 10 | 20:19 |
EmilienM | why do we spit out ansible_facts | 20:19 |
mwhahaha | do we run in debug? | 20:20 |
mwhahaha | not sure | 20:20 |
EmilienM | it's annoying | 20:20 |
mwhahaha | yes | 20:20 |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/tripleo-heat-templates master: [WIP] Convert the container-puppet to ansible https://review.opendev.org/661883 | 20:21 |
*** ayoung has quit IRC | 20:21 | |
mwhahaha | it's in deploy-steps.j2 | 20:21 |
mwhahaha | maybe we need to silence it? | 20:21 |
*** ayoung has joined #tripleo | 20:22 | |
mwhahaha | let me propose adding no log to those | 20:23 |
mwhahaha | EmilienM: i think it's the load_vars | 20:25 |
mwhahaha | it dumps them | 20:25 |
mwhahaha | or include_vars rather | 20:25 |
openstackgerrit | Alex Schultz proposed openstack/tripleo-heat-templates master: Add no_log to include_vars https://review.opendev.org/682988 | 20:27 |
EmilienM | weshay: can you review https://review.opendev.org/#/c/682943/ and +2 if good plz | 20:28 |
openstackgerrit | Tom Barron proposed openstack/tripleo-heat-templates master: Run Manila in scenario004 without pacemaker https://review.opendev.org/682853 | 20:28 |
openstackgerrit | Tom Barron proposed openstack/tripleo-heat-templates master: Remove setting ContainerCli to docker for scenario004 https://review.opendev.org/682989 | 20:28 |
EmilienM | mwhahaha: thanks for the patch yeah I think that'll help | 20:29 |
*** cylopez has quit IRC | 20:33 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-heat-templates master: move multinode scen006 to experimental https://review.opendev.org/682991 | 20:36 |
*** cfontain has quit IRC | 20:37 | |
*** cfontain has joined #tripleo | 20:39 | |
*** cfontain has quit IRC | 20:41 | |
openstackgerrit | wes hayutin proposed openstack/python-tripleoclient stable/stein: scen008 never passes in stein https://review.opendev.org/682993 | 20:43 |
EmilienM | btw, enable_validations is true in undercloud job | 20:45 |
*** Goneri has joined #tripleo | 20:45 | |
EmilienM | but i guess it's fine (not timeouting) | 20:45 |
*** cfontain has joined #tripleo | 20:47 | |
openstackgerrit | Tom Barron proposed openstack/tripleo-ci master: Use podman for scenario004 https://review.opendev.org/682890 | 20:49 |
dmsimard | mwhahaha, EmilienM: facts are shown in console because of "-vv" of the ansible-playbook command | 20:52 |
mwhahaha | they didn't used to be | 20:52 |
mwhahaha | and i don't think we've turned up the logging | 20:53 |
* mwhahaha shrugs | 20:53 | |
mwhahaha | we don't want them at all tbh | 20:53 |
weshay | tbarron, hey.. have a sec to sync? | 20:56 |
*** aakarsh has quit IRC | 21:02 | |
tbarron | weshay: sure, i may not understand the objective here well, coming from outside :) | 21:06 |
weshay | tbarron, you are doing the right things.. I just need the rhel8 version of scen004 to use podman | 21:06 |
weshay | you don't need to change tht because everything is centos7 otherwise | 21:06 |
weshay | we'll change those settings once centos8 is in the mix | 21:07 |
tbarron | weshay: so I should abandon the tht changes that switch scenario004 to podman then, right? | 21:07 |
weshay | yup | 21:07 |
weshay | will be nice once upstream and downstream are the same again | 21:08 |
tbarron | weshay: what about the tripleo-ci change: https://review.opendev.org/682890 | 21:08 |
weshay | yup.. abandon, did it for u | 21:09 |
tbarron | weshay: cool, that leaves 682853 | 21:10 |
*** cfontain_ has joined #tripleo | 21:10 | |
weshay | tbarron, yup.. thanks again for jumping on it | 21:10 |
*** cfontain has quit IRC | 21:10 | |
tbarron | that is passing scenario004 but failed scenario004 in RDO -- ansible tried to update /etc/docker/daemon.json | 21:12 |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/tripleo-heat-templates master: [WIP] Convert the container-puppet to ansible https://review.opendev.org/661883 | 21:12 |
tbarron | weshay: which wouldn't be there for podman, right ^^^ ? | 21:12 |
tbarron | weshay: so I'm trying to figure out whether I need to do anything else on https://review.opendev.org/#/c/682853/ or whether it's good to go once the gate opens up again | 21:13 |
weshay | mwhahaha, EmilienM fyi https://review.opendev.org/#/c/682945 | 21:13 |
weshay | cloudnull's patch is ready | 21:13 |
EmilienM | i want to see https://review.opendev.org/#/c/682731/ before | 21:13 |
*** mcornea has quit IRC | 21:14 | |
weshay | ah agree k | 21:14 |
*** cfontain_ has quit IRC | 21:14 | |
*** raildo has quit IRC | 21:14 | |
*** paramite has quit IRC | 21:16 | |
*** gbarros has joined #tripleo | 21:17 | |
*** brault has joined #tripleo | 21:19 | |
EmilienM | weshay, mwhahaha : damn | 21:19 |
EmilienM | we run validations as well on tripleo-ci-centos-7-containers-multinode | 21:19 |
EmilienM | (for undercloud) | 21:19 |
EmilienM | we need to disable them | 21:20 |
mwhahaha | Of course we do | 21:20 |
weshay | clearly Tengu is going to do some pushups | 21:20 |
EmilienM | on fs010 | 21:20 |
weshay | not only your future PTL, but now your fitness coach | 21:20 |
EmilienM | going the extra mile | 21:20 |
EmilienM | that's why we elected you ! | 21:21 |
weshay | elected.. lolz | 21:21 |
weshay | I still have bruises from the cuffs | 21:21 |
weshay | :) | 21:21 |
EmilienM | I'm disabling validations on fs010 | 21:22 |
weshay | +1 | 21:22 |
*** Goneri has quit IRC | 21:22 | |
EmilienM | we might want to push that patch directly in tree by infra | 21:22 |
EmilienM | I know they don't like it | 21:22 |
EmilienM | but the patch won't land otherwise (timeouting again) | 21:22 |
weshay | which? 682905,3 | 21:23 |
EmilienM | no | 21:23 |
EmilienM | my future patch to fs010 | 21:23 |
EmilienM | I mean | 21:23 |
EmilienM | yeah I wish the tripleoclient would land too | 21:23 |
EmilienM | weshay: can you ask infra please? | 21:23 |
*** brault has quit IRC | 21:23 | |
EmilienM | https://review.opendev.org/#/c/682943/ | 21:23 |
EmilienM | https://review.opendev.org/#/c/682905 | 21:23 |
EmilienM | and my incoming oooq patch | 21:23 |
stevebaker | morning | 21:24 |
cloudnull | time to +w that one | 21:24 |
EmilienM | i feel like it's morning too | 21:24 |
EmilienM | stevebaker: hey | 21:24 |
weshay | meh.. if you would have asked 2hrs ago.. but the jobs above it are about to land I think | 21:24 |
stevebaker | weshay: hey do you know how to trigger the experimental pipeline in RDO CI? | 21:24 |
* stevebaker wants to run featureset039 | 21:25 | |
weshay | hrm.. I think it's rdo-experimental /me looks | 21:25 |
weshay | stevebaker, oh fun.. I have the patch for you | 21:25 |
weshay | stevebaker, master fs039 is all kinds of fubar'd | 21:25 |
weshay | stevebaker, https://review.rdoproject.org/r/#/c/21672/ | 21:25 |
weshay | stevebaker, you working on master? | 21:26 |
weshay | stein is working alright | 21:26 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart master: fs010: disable validations on the undercloud https://review.opendev.org/683001 | 21:26 |
EmilienM | weshay, stevebaker : https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/zuul.d/pipelines.yaml#L191 | 21:26 |
*** Goneri has joined #tripleo | 21:26 | |
EmilienM | check experimental | 21:26 |
EmilienM | weshay: is https://review.opendev.org/683001 ok? | 21:27 |
stevebaker | weshay: yes, master. I managed to get enough of a fs039 reproducer to build a dev environment | 21:27 |
weshay | that's amazing.. because it has like 5 patches atm | 21:27 |
weshay | and we're still not at the bottom of it | 21:27 |
stevebaker | EmilienM: hmm I tried that but no extra jobs ran https://review.opendev.org/#/c/680571/ | 21:28 |
weshay | it's not an experimental job | 21:28 |
EmilienM | stevebaker: have you checked the project's layout? maybe there is no exp job | 21:28 |
weshay | stevebaker, just use test project | 21:28 |
weshay | like my patch | 21:28 |
stevebaker | ok | 21:28 |
weshay | stevebaker, https://review.rdoproject.org/r/#/c/21672/5/.zuul.yaml | 21:28 |
weshay | stevebaker, how did you get stuck w/ fs039? | 21:29 |
weshay | you poor bastard | 21:29 |
weshay | and ZUUL GOES DOWN | 21:29 |
weshay | https://images.app.goo.gl/UZwhhMnuNq8QpVfi9 | 21:30 |
stevebaker | weshay: It looks like in my reproducer env the tempest run failed with messages like 2019-09-16 03:35:39 | ++ openstack network list -c ID -f value | 21:31 |
stevebaker | 2019-09-16 03:35:41 | HttpException: 503: Server Error for url: https://overcloud.ooo.test:13696/v2.0/networks, No server is available to handle this request.: 503 Service Unavailable | 21:31 |
weshay | surprised you got that far tbh | 21:31 |
*** Vorrtex has quit IRC | 21:31 | |
weshay | w/ fs039 | 21:32 |
weshay | master | 21:32 |
stevebaker | weshay: for a nova-less undercloud I need to call novajoin manually to build the config_drive that is passed to metalsmith provision | 21:32 |
stevebaker | weshay: also novajoin needs to listen to ironic notifications for scale-down unregister nodes | 21:33 |
weshay | stevebaker's in the advanced class.. /me pretending to know this and asking how I can help | 21:33 |
stevebaker | thoughts and prayers | 21:34 |
weshay | you'll always have that | 21:34 |
stevebaker | weshay: let's forget about fs039 for now, I also want 035 and 053 to run on this patch https://review.opendev.org/#/c/680571/ | 21:34 |
*** gbarros has quit IRC | 21:34 | |
stevebaker | 001 runs fine with it | 21:35 |
stevebaker | testproject I guess | 21:38 |
*** panda has quit IRC | 21:41 | |
*** panda has joined #tripleo | 21:42 | |
*** _dpaterson has quit IRC | 21:44 | |
*** nhicher has quit IRC | 21:46 | |
*** nhicher has joined #tripleo | 21:47 | |
openstackgerrit | Merged openstack/tripleo-quickstart master: fs010: disable validations on the undercloud https://review.opendev.org/683001 | 21:54 |
EmilienM | weshay: ^ | 21:54 |
openstackgerrit | Merged openstack/python-tripleoclient master: Disable inflight validations by default https://review.opendev.org/682905 | 21:54 |
openstackgerrit | Merged openstack/python-tripleoclient master: Introduce --inflight-validations for standalone / undercloud https://review.opendev.org/682943 | 21:55 |
weshay | ah look at that | 21:55 |
EmilienM | ok now we can talk | 21:55 |
EmilienM | weshay: blocking https://review.opendev.org/#/c/682945/ for now | 21:56 |
EmilienM | and wait for https://review.opendev.org/#/c/682731/ | 21:56 |
EmilienM | before bed I'll check status | 21:56 |
EmilienM | but we probably bought a solid 15 min with that | 21:57 |
EmilienM | for image_uploader there is a 10 min to find | 21:57 |
EmilienM | bbl | 21:57 |
weshay | EmilienM, thanks! | 21:57 |
*** rlandy is now known as rlandy|bbl | 22:04 | |
*** dsneddon has quit IRC | 22:13 | |
*** rfolco has quit IRC | 22:43 | |
*** florianf has quit IRC | 22:46 | |
*** dsneddon has joined #tripleo | 22:49 | |
*** tkajinam has joined #tripleo | 23:02 | |
*** dsneddon has quit IRC | 23:15 | |
*** rcernin has joined #tripleo | 23:16 | |
openstackgerrit | Takashi Kajinami proposed openstack/puppet-tripleo master: Disable keystone token_flush by default https://review.opendev.org/682512 | 23:32 |
*** tosky has quit IRC | 23:40 |