*** saneax is now known as saneax-_-|AFK | 00:00 | |
*** agopi has joined #tripleo | 00:03 | |
*** ooolpbot has joined #tripleo | 00:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 00:10 |
---|---|---|
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771435 | 00:10 |
*** ooolpbot has quit IRC | 00:10 | |
openstack | Launchpad bug 1771435 in tripleo "scenario001/002 failing on autoscaling with urllib3.exceptions.SSLError: [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:579)" [Critical,In progress] - Assigned to Alex Schultz (alex-schultz) | 00:10 |
*** myoung|ruck|afk is now known as myoung|ruck | 00:14 | |
openstackgerrit | Ronelle Landy proposed openstack-infra/tripleo-ci master: DNM - Testing integration of releases script in toci https://review.openstack.org/568717 | 00:15 |
myoung|ruck | weshay, sshnaidm|rover, mwhahaha: fyi queens promoted (http://dashboards.rdoproject.org/queens, https://trunk.rdoproject.org/centos7-queens/a7/fb/a7fbaca2a97adda856c3ba5d5166fb1665f02bc0_85b157a9) | 00:16 |
*** rlandy is now known as rlandy|bbl | 00:16 | |
*** mburned is now known as mburned_out | 00:17 | |
*** agopi has quit IRC | 00:19 | |
*** linhnm has joined #tripleo | 00:33 | |
*** dtrainor has quit IRC | 00:52 | |
*** psahoo has joined #tripleo | 00:57 | |
*** toure is now known as toure|gone | 01:01 | |
*** ooolpbot has joined #tripleo | 01:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 01:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771435 | 01:10 |
openstack | Launchpad bug 1771435 in tripleo "scenario001/002 failing on autoscaling with urllib3.exceptions.SSLError: [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:579)" [Critical,In progress] - Assigned to Alex Schultz (alex-schultz) | 01:10 |
*** ooolpbot has quit IRC | 01:10 | |
*** tiswanso has joined #tripleo | 01:16 | |
*** tiswanso_ has quit IRC | 01:19 | |
openstackgerrit | Merged openstack/tripleo-validations stable/ocata: Validate that there should not be XFS volumes with ftype=0 https://review.openstack.org/564738 | 01:30 |
*** ssbarnea_ has quit IRC | 01:34 | |
*** gyankum has joined #tripleo | 01:55 | |
*** atoth has quit IRC | 01:59 | |
*** lblanchard has joined #tripleo | 02:00 | |
*** bkopilov has quit IRC | 02:00 | |
openstackgerrit | Merged openstack/tripleo-common master: Install Octavia amphora image if Red Hat https://review.openstack.org/566913 | 02:01 |
openstackgerrit | Merged openstack/tripleo-common master: add tripleo update job as voting https://review.openstack.org/563526 | 02:01 |
*** linhnm has quit IRC | 02:03 | |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Deploy Designate in scenario003 https://review.openstack.org/555007 | 02:06 |
openstackgerrit | Merged openstack/tripleo-common master: Adds Create Container Workflow https://review.openstack.org/563619 | 02:06 |
openstackgerrit | Merged openstack/tripleo-upgrade master: Use tripleo_role_name instead of role_name https://review.openstack.org/568443 | 02:06 |
openstackgerrit | Merged openstack/tripleo-common master: Include 'tripleo_role_name' in the inventory https://review.openstack.org/568340 | 02:06 |
*** ooolpbot has joined #tripleo | 02:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 02:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771435 | 02:10 |
*** ooolpbot has quit IRC | 02:10 | |
openstack | Launchpad bug 1771435 in tripleo "scenario001/002 failing on autoscaling with urllib3.exceptions.SSLError: [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:579)" [Critical,In progress] - Assigned to Alex Schultz (alex-schultz) | 02:10 |
*** rlandy|bbl has quit IRC | 02:16 | |
openstackgerrit | Merged openstack/tripleo-heat-templates master: deploy-steps: switch to tripleo_role_name https://review.openstack.org/568343 | 02:20 |
openstackgerrit | Merged openstack/python-tripleoclient master: Improve heat launcher user retrieval https://review.openstack.org/560904 | 02:20 |
openstackgerrit | Merged openstack-infra/tripleo-ci master: add tripleo update job as voting https://review.openstack.org/565523 | 02:20 |
*** lblanchard has quit IRC | 02:23 | |
openstackgerrit | Merged openstack/python-tripleoclient master: Error if deployment fails https://review.openstack.org/567384 | 02:32 |
openstackgerrit | Marius Cornea proposed openstack/tripleo-upgrade master: Load roles list from yaml instead of awk parsing https://review.openstack.org/568696 | 02:33 |
*** dmacpher has joined #tripleo | 02:35 | |
*** liverpooler has joined #tripleo | 02:36 | |
*** gkadam has joined #tripleo | 02:36 | |
*** dbecker has quit IRC | 02:46 | |
*** liverpooler has quit IRC | 02:49 | |
*** agopi has joined #tripleo | 02:50 | |
*** psachin has joined #tripleo | 02:50 | |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Revert "Change default endpoint map entries to use TLS" https://review.openstack.org/568699 | 02:54 |
*** agopi has quit IRC | 02:54 | |
*** ramishra has joined #tripleo | 02:57 | |
*** agopi has joined #tripleo | 02:58 | |
openstackgerrit | Merged openstack/tripleo-heat-templates master: update tht jobs to include network/endpoints https://review.openstack.org/568700 | 03:00 |
*** dbecker has joined #tripleo | 03:01 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-upgrade master: DNM, test https://review.openstack.org/568732 | 03:04 |
*** gkadam has quit IRC | 03:07 | |
*** agopi has quit IRC | 03:10 | |
*** ooolpbot has joined #tripleo | 03:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 03:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 03:10 |
*** ooolpbot has quit IRC | 03:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 03:10 |
*** agopi has joined #tripleo | 03:10 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-upgrade master: add container minimal check and gate https://review.openstack.org/568733 | 03:20 |
*** gyankum has quit IRC | 03:20 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-upgrade master: DNM, test https://review.openstack.org/568732 | 03:21 |
*** bkopilov has joined #tripleo | 03:23 | |
*** alee_afk is now known as alee | 03:30 | |
*** rajinir has quit IRC | 03:39 | |
*** links has joined #tripleo | 03:41 | |
EmilienM | stevebaker: https://review.openstack.org/#/c/568716/ if you can look, thx | 03:42 |
stevebaker | EmilienM: looking | 03:44 |
*** fragatina has quit IRC | 03:46 | |
*** janki has joined #tripleo | 03:49 | |
openstackgerrit | Emilien Macchi proposed openstack/ansible-role-container-registry master: Handle Docker rpm updates https://review.openstack.org/568714 | 03:53 |
openstackgerrit | Steve Baker proposed openstack/tripleo-quickstart-extras master: Populate /etc/yum/vars/contentdir https://review.openstack.org/568701 | 03:53 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: docker: cleanup update tasks https://review.openstack.org/568715 | 03:54 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Deploy Docker via Ansible and not Puppet https://review.openstack.org/561377 | 03:58 |
EmilienM | jaosorior: sorry we had to revert one of your patches | 03:58 |
EmilienM | jaosorior: see https://review.openstack.org/568699 | 03:58 |
Tengu | hello there | 03:59 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-common master: (cleanup) remove usage of role_name https://review.openstack.org/568347 | 04:02 |
jaosorior | EmilienM: fuck | 04:02 |
EmilienM | jaosorior: the week isn't bright CI side | 04:03 |
jaosorior | EmilienM: where is the scenario001 and 002's definition? | 04:03 |
jaosorior | they probably are missing the CA installation... | 04:03 |
*** sanjay__u has joined #tripleo | 04:03 | |
*** tzumainn has quit IRC | 04:03 | |
EmilienM | jaosorior: in THT/ci/environments? | 04:03 |
EmilienM | jaosorior: can you please review this quick one? https://review.openstack.org/#/c/568716/ | 04:03 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates master: Revert "Revert "Change default endpoint map entries to use TLS"" https://review.openstack.org/568736 | 04:04 |
jaosorior | EmilienM: gonna try to revert the revert. That will run the relevant scenarios, right? | 04:05 |
EmilienM | jaosorior: thanks to https://review.openstack.org/#/c/568700/, yes | 04:06 |
jaosorior | good | 04:06 |
jaosorior | uhm... wait | 04:07 |
jaosorior | unknown protocol | 04:07 |
jaosorior | that's not an SSL verification error | 04:07 |
jaosorior | that's ceilometer using the wrong port | 04:08 |
*** ooolpbot has joined #tripleo | 04:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 04:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 04:10 |
*** ooolpbot has quit IRC | 04:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 04:10 |
*** gyankum has joined #tripleo | 04:10 | |
*** fragatina has joined #tripleo | 04:11 | |
EmilienM | jaosorior: oops | 04:14 |
weshay | EmilienM, behave | 04:15 |
openstackgerrit | Ade Lee proposed openstack/tripleo-upgrade master: Add config_change role https://review.openstack.org/567300 | 04:16 |
Tengu | hello weshay :) | 04:17 |
weshay | Tengu, hey brotha.. sorry I missed you today in the community mtg | 04:18 |
Tengu | weshay: no problem, we'll catch up eventually ;). Now that I'm part of the Beast. | 04:18 |
jaosorior | EmilienM: if you have any chance to help debug this http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/tempest.html.gz I would really appreciate it. I had seen it before (even without TLS) but can't figure out what the issue is. | 04:21 |
Tengu | jaosorior: oh. I had that one yesterday | 04:21 |
Tengu | on my fluentd config patch | 04:21 |
jaosorior | seems to call heat a bunch of times for some reason | 04:22 |
Tengu | jaosorior: (that is, the first: Details: {u'message': u'Unable to complete operation on subnet 64fc7190-ca23-4b5a-b838-731fd609fdec: One or more ports have an IP allocation from this subnet.', u'type': u'SubnetInUse', u'detail': u''} | 04:22 |
jaosorior | then it fails and tries to call the cleanup | 04:22 |
jaosorior | which ends up using the wrong port | 04:22 |
jaosorior | Tengu: right, that's the error in the cleanup | 04:23 |
jaosorior | but the issue starts in telemetry_tempest_plugin.scenario.test_telemetry_integration.TestTelemetryIntegration | 04:23 |
jaosorior | EmilienM: anybody from the ceilo team I can ask about this? | 04:24 |
weshay | jaosorior, pm | 04:25 |
EmilienM | jaosorior: weird | 04:26 |
EmilienM | I haven't seen it today | 04:26 |
openstackgerrit | Steve Baker proposed openstack/tripleo-quickstart-extras master: WIP Use "openstack tripleo container image prepare" https://review.openstack.org/568403 | 04:26 |
jaosorior | EmilienM: right I'm talking more than a month ago, exact same thing | 04:27 |
jaosorior | anyway, Unknown protocol error means that something is trying to use https on an http port | 04:28 |
jaosorior | just not sure what, cause the tempest logs are not very explicit about it | 04:28 |
jaosorior | but the test calling heat a bunch of times, then failing, then calling the teardown and failing at that, is what I mean | 04:29 |
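
A minimal sketch of the failure mode described above: a TLS client whose peer answers in plain HTTP. It uses an in-memory handshake so no real port is needed. The helper name `handshake_against` is hypothetical and not part of tempest or the CI job. The exact error text depends on the OpenSSL build: the build in the bug report says UNKNOWN_PROTOCOL, and other builds may word it differently.

```python
import ssl


def handshake_against(reply: bytes):
    """Feed `reply` to a TLS client handshake; return the SSLError it raises, if any."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE  # certificate checks are irrelevant here

    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = context.wrap_bio(incoming, outgoing,
                           server_hostname="overcloud.localdomain")

    try:
        tls.do_handshake()  # writes the ClientHello into `outgoing`
    except ssl.SSLWantReadError:
        pass  # expected: the client is now waiting for the server's answer

    incoming.write(reply)  # the "server" answers with plain HTTP instead of TLS
    try:
        tls.do_handshake()
    except (ssl.SSLWantReadError, ssl.SSLWantWriteError):
        return None  # handshake only needs more data, no protocol error yet
    except ssl.SSLError as exc:
        return exc
    return None


if __name__ == "__main__":
    print(handshake_against(b"HTTP/1.1 400 Bad request\r\n\r\n"))
```

The point is only that the error is a protocol-level one raised before any certificate is looked at, which matches the remark that this is not an SSL verification error.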
Tengu | i.e. race condition at some point? | 04:29 |
weshay | EmilienM, fyi https://review.openstack.org/#/c/568733/ | 04:29 |
jaosorior | Tengu: not sure. | 04:35 |
jaosorior | Tengu: apparently this trace is all I get from tempest... | 04:36 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-common master: (cleanup) remove usage of role_name https://review.openstack.org/568347 | 04:37 |
jaosorior | Tengu: if you have time to look into this as well, I sure would appreciate the help | 04:38 |
*** moguimar has quit IRC | 04:38 | |
*** pblaho has quit IRC | 04:38 | |
*** dtantsur|afk has quit IRC | 04:38 | |
Tengu | jaosorior: I might give it a try :) | 04:38 |
*** moguimar has joined #tripleo | 04:39 | |
*** anilvenkata has joined #tripleo | 04:40 | |
*** jaosorior has quit IRC | 04:40 | |
Tengu | jaosorior: hmmm there's this log file, but I don't see error in it for now: http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/undercloud/home/zuul/tempest/tempest.log.txt.gz | 04:40 |
Tengu | duh.. | 04:40 |
*** jaosorior has joined #tripleo | 04:47 | |
jaosorior | Tengu: do you have a node where you can deploy? | 04:48 |
Tengu | jaosorior: not yet, I should get a tiny monster tomorrow (ordered yesterday in an online store) | 04:48 |
Tengu | jaosorior: and as I lack the RAM in my laptop, can't really afford to deploy anything on it. | 04:49 |
Tengu | jaosorior: you can follow the subnet life in this log file: http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/undercloud/home/zuul/tempest/tempest.log.txt.gz | 04:49 |
*** ykarel|away has joined #tripleo | 04:50 | |
Tengu | although I don't really see anything weird :/. | 04:50 |
Tengu | it ends up with the failure | 04:50 |
jaosorior | right | 04:50 |
jaosorior | I just don't know what it tried to do before the failure | 04:50 |
Tengu | ah, the log would show it. | 04:50 |
Tengu | jaosorior: the error starts here: http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/undercloud/home/zuul/tempest/tempest.log.txt.gz#_2018-05-15_18_19_03_471 | 04:51 |
Tengu | you might want to check a bit above that line? | 04:51 |
*** saneax-_-|AFK is now known as saneax | 04:51 | |
*** pblaho has joined #tripleo | 04:52 | |
Tengu | apparently.... hmm. there are a lot of redirections just before, if we take the XML file present in the parent dir. | 04:52 |
jaosorior | right | 04:53 |
jaosorior | the redirection from heat | 04:53 |
Tengu | ah, port 13004 is heat? | 04:53 |
jaosorior | yeah | 04:53 |
* Tengu takes note | 04:53 | |
jaosorior | I don't really understand the redirections though | 04:53 |
jaosorior | "Redirecting https://overcloud.localdomain:13004/v1/99fdd28b260b460c9040a043b5d763fe/stacks/integration_test -> https://overcloud.localdomain:13004/v1/99fdd28b260b460c9040a043b5d763fe/stacks/integration_test/156f4da9-ee97-42ab-bb76-8c66ccc6f774" | 04:54 |
jaosorior | uhu | 04:54 |
jaosorior | brb | 04:54 |
*** jaosorior has quit IRC | 04:54 | |
*** paramite_ has joined #tripleo | 04:58 | |
*** cshastri has joined #tripleo | 04:59 | |
*** jaosorior has joined #tripleo | 05:02 | |
Tengu | jaosorior: digging a bit further in the script launching tempest, it has a concurrency of 1, so it should NOT create some race condition. | 05:04 |
jaosorior | well that's good to know | 05:04 |
jaosorior | but... where are the redirections coming from? | 05:04 |
Tengu | http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/undercloud/home/zuul/tempest_output.log.txt.gz#_2018-05-15_18_18_41 | 05:04 |
Tengu | I think the best shot would be to check the telemetry_tempest_plugin.scenario.test_telemetry_integration.TestTelemetryIntegration.test_autoscaling content. | 05:05 |
*** aufi has joined #tripleo | 05:05 | |
Tengu | so that we might understand WHAT it does. this should be deployed on any undercloud I guess. | 05:05 |
Tengu | as part of the tempest bundle | 05:05 |
Tengu | meaning we should even get a hand on it via github | 05:05 |
jaosorior | Tengu: https://github.com/openstack/telemetry-tempest-plugin | 05:06 |
*** yprokule has joined #tripleo | 05:06 | |
Tengu | thank you :) | 05:06 |
jaosorior | Tengu: https://github.com/openstack/telemetry-tempest-plugin/blob/master/telemetry_tempest_plugin/integration/gabbi/gabbits-live/autoscaling.yaml | 05:06 |
jaosorior | Tengu: oho https://github.com/openstack/telemetry-tempest-plugin/blob/master/telemetry_tempest_plugin/integration/gabbi/gabbits-live/autoscaling.yaml#L29 | 05:07 |
jaosorior | that's an explicit loop | 05:07 |
jaosorior | that's why we see all those redirects | 05:07 |
jaosorior | funky | 05:07 |
Tengu | hmm | 05:08 |
Tengu | so it loops 300 times on the same URL | 05:08 |
Tengu | with 1 second delay | 05:08 |
jaosorior | well | 05:08 |
Tengu | ..... that's a DoS | 05:08 |
jaosorior | I guess it polls until it gets a status 200 | 05:09 |
Tengu | unless it "breaks" the loop when it gets a 200 | 05:09 |
Tengu | :D | 05:09 |
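
A minimal sketch of the polling behaviour discussed above, assuming the `poll` entry in the YAML acts as a retry-until-pass loop with a count of 300 and a delay of 1 second. The function `poll` and the fake status sequence are hypothetical illustrations, not gabbi's actual implementation.

```python
import time


def poll(check, count, delay, sleep=time.sleep):
    """Run check() up to `count` times and stop at the first attempt that passes.

    Returns the number of attempts used. If every attempt fails, the last
    AssertionError is re-raised.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    last_error = None
    for attempt in range(1, count + 1):
        try:
            check()
            return attempt
        except AssertionError as exc:
            last_error = exc
            if attempt < count:
                sleep(delay)  # wait before asking again
    raise last_error


if __name__ == "__main__":
    # Pretend the stack URL answers 404 twice and then 200.
    statuses = iter([404, 404, 200])

    def stack_is_ready():
        assert next(statuses) == 200

    attempts = poll(stack_is_ready, count=300, delay=1, sleep=lambda s: None)
    print("passed after", attempts, "attempts")
```

Under that assumption the loop is a bounded wait rather than 300 unconditional requests: it ends as soon as one attempt passes.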
Tengu | anyway. maybe heat engine is just falling apart with that loop | 05:09 |
Tengu | jaosorior: you said it was random right? | 05:10 |
*** ooolpbot has joined #tripleo | 05:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 05:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 05:10 |
*** ooolpbot has quit IRC | 05:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 05:10 |
jaosorior | Tengu: I think what happens is that it fails creating the stack | 05:10 |
Tengu | hmm. do we actually get the 300 iterations? I don't think so. urllib3 fails before the end | 05:10 |
*** udesale has joined #tripleo | 05:11 | |
Tengu | due to some connection issue. funky part, it's not a "connection refused", meaning heat might still be running properly | 05:11 |
*** dxiri has quit IRC | 05:11 | |
jaosorior | I don't see any issues in the heat logs | 05:12 |
jaosorior | actually, someone does a stack update | 05:12 |
jaosorior | and it even gets to update complete | 05:12 |
jaosorior | Tengu: http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/subnode-2/var/log/containers/heat/heat-engine.log.txt.gz#_2018-05-15_18_18_39_991 | 05:12 |
jaosorior | right | 05:12 |
Tengu | darn. of course, "containers" -.- | 05:12 |
jaosorior | Tengu: the stack update is here https://github.com/openstack/telemetry-tempest-plugin/blob/master/telemetry_tempest_plugin/integration/gabbi/gabbits-live/autoscaling.yaml#L102 | 05:12 |
Tengu | forgot that directory while searching. | 05:12 |
Tengu | L108 in fact | 05:13 |
Tengu | but yep. | 05:13 |
Tengu | jaosorior: http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/subnode-2/var/log/containers/heat/heat-engine.log.txt.gz#_2018-05-15_18_17_29_184 we do have the CREATE COMPLETE | 05:15 |
Tengu | and, according to the log timestamp, it's BEFORE the crash | 05:15 |
jaosorior | So, it's not a heat issue | 05:15 |
Tengu | and there's a second stack created successfully, still before the crash: http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/subnode-2/var/log/containers/heat/heat-engine.log.txt.gz#_2018-05-15_18_17_43_133 | 05:16 |
*** elgxl has joined #tripleo | 05:17 | |
jaosorior | Tengu: I think the whole thing went well and it failed on the cleanup | 05:20 |
Tengu | hmm. indeed. | 05:21 |
Tengu | meaning the cleanup maybe isn't full | 05:21 |
Tengu | i.e. it tries to drop the subnet before the instances/ports are really down | 05:21 |
Tengu | so this is another .yaml file right? | 05:21 |
*** elgxl has quit IRC | 05:21 | |
jaosorior | trying to find it | 05:22 |
Tengu | maybe https://github.com/openstack/telemetry-tempest-plugin/blob/b30a19214d0036141de75047b444d48ae0d0b656/telemetry_tempest_plugin/integration/gabbi/gabbits-live/aodh-gnocchi-threshold-alarm.yaml ? | 05:22 |
Tengu | the only one matching "teardown" in that repo | 05:22 |
*** ratailor has joined #tripleo | 05:23 | |
Tengu | funky. | 05:23 |
Tengu | this is in the test just BEFORE the one that fails | 05:23 |
*** saneax is now known as saneax-_-|AFK | 05:23 | |
Tengu | jaosorior: https://github.com/openstack/telemetry-tempest-plugin/blob/master/telemetry_tempest_plugin/integration/gabbi/gabbits-live/autoscaling.yaml#L147 well.. | 05:24 |
*** remus has joined #tripleo | 05:24 | |
jaosorior | Tengu: sure | 05:25 |
jaosorior | Tengu: it deletes the stack; but we see that it deleted some image | 05:25 |
jaosorior | the images are not handled as part of the stack | 05:25 |
Tengu | there's also: https://github.com/openstack/telemetry-tempest-plugin/blob/master/telemetry_tempest_plugin/integration/gabbi/gabbits-live/autoscaling.yaml#L129 just before the stack deletion | 05:25 |
*** agurenko has joined #tripleo | 05:26 | |
jaosorior | Tengu: shit | 05:28 |
jaosorior | Tengu: so... | 05:28 |
jaosorior | Tengu: this is where the integration tests get called https://github.com/openstack/telemetry-tempest-plugin/blob/master/telemetry_tempest_plugin/scenario/test_telemetry_integration.py | 05:29 |
Tengu | yup | 05:29 |
jaosorior | here, for instance is how they assign the endpoints https://github.com/openstack/telemetry-tempest-plugin/blob/master/telemetry_tempest_plugin/scenario/test_telemetry_integration.py#L84 | 05:29 |
jaosorior | most of it is just getting the endpoints | 05:29 |
Tengu | yup. | 05:29 |
jaosorior | except for the glance image | 05:29 |
jaosorior | which calls self.glance_image_create() | 05:30 |
jaosorior | that function is part of the parent class | 05:30 |
jaosorior | manager.ScenarioTest | 05:30 |
jaosorior | which is part of the base tempest definitions | 05:30 |
Tengu | https://github.com/openstack/tempest/blob/526468df52e4dcb8193259ebd55f100dddb97fd2/tempest/scenario/manager.py#L429 | 05:31 |
jaosorior | right | 05:31 |
jaosorior | that calls _image_create | 05:31 |
Tengu | L400 | 05:31 |
jaosorior | which does this https://github.com/openstack/tempest/blob/526468df52e4dcb8193259ebd55f100dddb97fd2/tempest/scenario/manager.py#L420 | 05:31 |
jaosorior | so, what we're seeing is the cleanup running all the "addCleanup" callbacks | 05:32 |
jaosorior | which, IIRC, are called at random | 05:32 |
jaosorior | or maybe they aren't, but still, we would need to check each addCleanup that is used | 05:32 |
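
On the ordering question: in the Python standard library, `unittest.TestCase.addCleanup` callbacks run in reverse order of registration (last in, first out), not at random. A minimal sketch with hypothetical resource names follows. It shows only plain `unittest` behaviour and says nothing about any extra class-level cleanup tempest may layer on top.

```python
import unittest

calls = []  # records the order in which cleanups actually run


class CleanupOrderDemo(unittest.TestCase):
    def test_resources(self):
        # Register cleanups in creation order, as a scenario test would.
        self.addCleanup(calls.append, "delete image")
        self.addCleanup(calls.append, "delete subnet")
        self.addCleanup(calls.append, "delete stack")


def run_demo():
    """Run the demo test case and return the observed cleanup order."""
    calls.clear()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CleanupOrderDemo)
    suite.run(unittest.TestResult())
    return list(calls)


if __name__ == "__main__":
    print(run_demo())
```

So the resource registered last is cleaned up first, which is why each `addCleanup` call site still needs checking when a teardown fails with a dependency error such as SubnetInUse.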
jaosorior | Tengu: oho | 05:36 |
jaosorior | wait up | 05:36 |
jaosorior | I skipped something | 05:36 |
Tengu | hmm? | 05:36 |
* Tengu digging in the code | 05:36 | |
jaosorior | Tengu: So, check the SSL exception log | 05:37 |
*** jtomasek has joined #tripleo | 05:37 | |
jaosorior | Tengu: I had missed this: | 05:37 |
jaosorior | AssertionError: From test "check event" : | 05:37 |
Tengu | so we know what test actually fails with that | 05:38 |
jaosorior | Tengu: it's this https://github.com/openstack/telemetry-tempest-plugin/blob/master/telemetry_tempest_plugin/integration/gabbi/gabbits-live/autoscaling.yaml#L66 | 05:38 |
Tengu | panko? | 05:39 |
Tengu | what service is that? | 05:39 |
jaosorior | Tengu: https://docs.openstack.org/panko/latest/ | 05:39 |
Tengu | ah, yep | 05:39 |
Tengu | storage for ceilometer | 05:39 |
Tengu | is panko listening with TLS? | 05:39 |
jaosorior | Tengu: its public endpoint is | 05:40 |
jaosorior | but hey! at least we know what service failed | 05:40 |
Tengu | http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/subnode-2/var/log/containers/httpd/panko-api/panko_wsgi_error.log.txt.gz | 05:41 |
jaosorior | uuuh... | 05:41 |
jaosorior | funky | 05:41 |
Tengu | although.. | 05:41 |
Tengu | the access log actually does show working connections later. | 05:41 |
*** dparkes has quit IRC | 05:41 | |
jaosorior | Tengu: nothing in the panko logs either http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/subnode-2/var/log/containers/panko/app.log.txt.gz | 05:42 |
Tengu | jaosorior: was about to say that :] | 05:42 |
Tengu | just the warning, but not an ssl-related. | 05:42 |
Tengu | BUT | 05:42 |
Tengu | ssl is managed in haproxy right? | 05:43 |
jaosorior | yes | 05:43 |
Tengu | what does IT say. | 05:43 |
jaosorior | was about to check the config | 05:43 |
*** marios has joined #tripleo | 05:43 | |
Tengu | hmmm where is it now with the containers.... | 05:43 |
*** quiquell|off is now known as quiquell | 05:43 | |
jaosorior | Tengu: it still outputs its logs to journal | 05:44 |
Tengu | hmmm yep, but its config file itself? | 05:44 |
Tengu | oh. we're missing /var/lib/kolla directory ? | 05:45 |
jaosorior | Tengu: http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/subnode-2/var/log/config-data/haproxy/etc/haproxy/ | 05:45 |
Tengu | ah | 05:46 |
Tengu | jaosorior: wondering.... there's no support for the redirect in the yaml part regarding panko | 05:47 |
jaosorior | Tengu: ?? | 05:47 |
Tengu | in the autoscaling.yaml that is | 05:48 |
jaosorior | Tengu: what do you mean? | 05:48 |
Tengu | missing "redirects: true" - but not sure it causes any issue for that one | 05:48 |
openstackgerrit | Rajesh Tailor proposed openstack/tripleo-heat-templates master: Allow configuration of NFS backend for Nova https://review.openstack.org/564179 | 05:50 |
*** masco has joined #tripleo | 05:50 | |
*** dciabrin has quit IRC | 05:50 | |
jaosorior | Tengu: so there's two options I can see right now | 05:51 |
jaosorior | one is that the redirect is messing things up | 05:51 |
jaosorior | the second is that from the beginning we are getting the URL wrong | 05:51 |
jaosorior | But I think the issue is in the call to $ENVIRON['PANKO_SERVICE_URL']/v2/events | 05:52 |
jaosorior | I'm trying to bring up an environment to reproduce it | 05:52 |
Tengu | jaosorior: May 15 18:18:39 centos-7-rax-dfw-0004034638 haproxy[53297]: 192.168.24.2:54498 [15/May/2018:18:18:39.577] panko panko/<NOSRV> -1/-1/-1/-1/9 400 187 - - PR-- 126/0/0/0/3 0/0 "<BADREQ>" | 05:53 |
Tengu | gotcha | 05:53 |
Tengu | thank you, lovely grep :) | 05:53 |
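
A minimal sketch for reading the haproxy line quoted above, assuming it follows haproxy's standard HTTP log format. The regular expression, the `parse_haproxy_line` helper and the abridged flag tables are illustrative, not an official parser. In that format `<BADREQ>` stands in for a request haproxy could not parse, which would be consistent with TLS bytes arriving on a plain-HTTP frontend.

```python
import re

# Fields of haproxy's HTTP log format, starting at the client address.
HAPROXY_HTTP_LOG = re.compile(
    r'(?P<client>\S+) \[(?P<accepted>[^\]]+)\] (?P<frontend>\S+) '
    r'(?P<backend>[^/\s]+)/(?P<server>\S+) (?P<timers>\S+) '
    r'(?P<status>\d+) (?P<bytes>\d+) \S+ \S+ (?P<termination>\S{4}) '
    r'(?P<conns>\S+) (?P<queues>\S+) "(?P<request>[^"]*)"'
)

# Abridged meanings of the first two termination-state characters.
ABORTED_BY = {"C": "client", "S": "server", "P": "proxy", "-": "normal end"}
WAITING_FOR = {"R": "a complete, valid request from the client",
               "C": "the connection to the server",
               "H": "response headers from the server",
               "-": "nothing (normal end)"}


def parse_haproxy_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = HAPROXY_HTTP_LOG.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    state = record["termination"]
    record["aborted_by"] = ABORTED_BY.get(state[0], "unknown")
    record["waiting_for"] = WAITING_FOR.get(state[1], "unknown")
    return record


LINE = ('May 15 18:18:39 centos-7-rax-dfw-0004034638 haproxy[53297]: '
        '192.168.24.2:54498 [15/May/2018:18:18:39.577] panko panko/<NOSRV> '
        '-1/-1/-1/-1/9 400 187 - - PR-- 126/0/0/0/3 0/0 "<BADREQ>"')

if __name__ == "__main__":
    for key, value in parse_haproxy_line(LINE).items():
        print(key, "=", value)
```

Read this way, the entry says the proxy itself ended the session while still waiting for a valid request, and no backend server was ever selected (`<NOSRV>`).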
*** fragatina has quit IRC | 05:55 | |
*** fragatina has joined #tripleo | 05:55 | |
*** dparkes has joined #tripleo | 05:56 | |
jaosorior | Tengu: don't really get that log | 05:56 |
Tengu | jaosorior: in journal | 05:57 |
jaosorior | right | 05:57 |
jaosorior | but, don't really understand what happened there | 05:57 |
jaosorior | something did a bad request to the panko endpoint | 05:57 |
jaosorior | but... why doesn't it show the endpoint that was used? | 05:57 |
Tengu | jaosorior: http://paste.openstack.org/show/721060/ | 05:58 |
jaosorior | oho | 05:59 |
* Tengu loves grep | 05:59 | |
Tengu | I could isolate haproxy process logs as well if you want them :) | 05:59 |
jaosorior | that would be nice :D | 05:59 |
jaosorior | but yeah, still not entirely sure what the issue is :/ | 06:00 |
Tengu | wc -l process.log | 06:00 |
Tengu | 4526 process.log | 06:00 |
Tengu | or maybe not that nice XD | 06:00 |
*** dparkes has quit IRC | 06:00 | |
Tengu | jaosorior: `grep 'haproxy\[' journal.log.gz > process.log` | 06:01 |
Tengu | wget apparently gets an uncompressed file. | 06:01 |
Tengu | probably an inflate from the web server. | 06:01 |
jaosorior | chandankumar: are you around? | 06:03 |
*** psachin has quit IRC | 06:03 | |
*** jfrancoa has joined #tripleo | 06:03 | |
*** pliu has quit IRC | 06:04 | |
*** lucas-afk has quit IRC | 06:05 | |
*** remus has quit IRC | 06:06 | |
*** pliu has joined #tripleo | 06:06 | |
jaosorior | Tengu: I'm trying to figure out how tempest got the panko url | 06:06 |
jaosorior | Tengu: this is how the urls are assigned https://github.com/openstack/telemetry-tempest-plugin/blob/master/telemetry_tempest_plugin/scenario/test_telemetry_integration.py#L84 | 06:07 |
*** lucasagomes has joined #tripleo | 06:07 | |
jaosorior | it eventually calls this function https://github.com/openstack/telemetry-tempest-plugin/blob/master/telemetry_tempest_plugin/scenario/test_telemetry_integration.py#L45 | 06:07 |
*** yprokule_ has joined #tripleo | 06:07 | |
jaosorior | this is the tempest config that was used http://logs.openstack.org/45/560445/29/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/6c9f8c0/logs/undercloud/home/zuul/tempest/etc/tempest.conf.txt.gz | 06:07 |
*** ooolpbot has joined #tripleo | 06:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 06:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 06:10 |
*** ooolpbot has quit IRC | 06:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 06:10 |
Tengu | jaosorior: apparently the URLs are fed directly by the API | 06:10 |
*** yprokule has quit IRC | 06:10 | |
*** yprokule_ is now known as yprokule | 06:10 | |
*** pblaho has quit IRC | 06:11 | |
*** moguimar has quit IRC | 06:11 | |
*** psachin has joined #tripleo | 06:12 | |
jaosorior | Tengu: yeah, they're gotten from the keystone catalog apparently | 06:13 |
*** moguimar has joined #tripleo | 06:13 | |
jaosorior | Tengu: still trying to bring up an environment | 06:14 |
jaosorior | I'll let you know when I do so we can test this out | 06:14 |
Tengu | jaosorior: tomorrow I should get all the ordered things, and I'll take a moment in order to build the computer, install it with some OS, and deploy a VM-based lab | 06:14 |
jaosorior | Tengu: I have CentOS in my machine | 06:14 |
jaosorior | that's what I've been using to deploy | 06:15 |
jaosorior | it's been working quite alright | 06:15 |
Tengu | guess it doesn't really matter as the lab itself will be VM-based :). | 06:15 |
*** hjensas has quit IRC | 06:15 | |
Tengu | but I think I'll build something very similar to what I did in my previous job | 06:15 |
jaosorior | Tengu: I think quickstart makes some assumptions as to what the host OS is | 06:15 |
Tengu | erf.. is RDO third party CI down? | 06:16 |
jaosorior | so, I do suggest either RHEL or CentOS | 06:16 |
jaosorior | no idea | 06:16 |
Tengu | yup, will probably go centos. | 06:16 |
Tengu | anyway, that's for tomorrow. | 06:16 |
openstackgerrit | Merged openstack/python-tripleoclient master: (cleanup) remove usage of vars when calling ansible https://review.openstack.org/568346 | 06:17 |
Tengu | https://logs.rdoproject.org/28/566228/9/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z14cea5ca93a54c4b823f0d4d36faf804/undercloud/home/jenkins/overcloud_prep_images.log.txt.gz grmbl. | 06:18 |
*** cylopez has joined #tripleo | 06:25 | |
*** pblaho has joined #tripleo | 06:25 | |
*** skramaja has joined #tripleo | 06:26 | |
*** hewbrocca_afk has quit IRC | 06:26 | |
*** markmc has quit IRC | 06:26 | |
*** udesale has quit IRC | 06:27 | |
*** dparkes has joined #tripleo | 06:29 | |
*** udesale has joined #tripleo | 06:29 | |
*** holser__ has joined #tripleo | 06:33 | |
*** elgxl has joined #tripleo | 06:34 | |
*** chandankumar has quit IRC | 06:38 | |
*** apetrich has quit IRC | 06:38 | |
*** chandankumar has joined #tripleo | 06:39 | |
*** lvdombrkr has joined #tripleo | 06:39 | |
chandankumar | jaosorior: yes sensei, how can I help? | 06:40 |
jaosorior | chandankumar: I had some tempest questions but ended up sorting them out | 06:43 |
jaosorior | thanks! | 06:43 |
chandankumar | jaosorior: cool! | 06:43 |
Tengu | jaosorior: you have caught the issue? :) | 06:44 |
chandankumar | jaosorior: I need some help on https://review.openstack.org/#/q/topic:tempest_log+(status:open+OR+status:merged) | 06:44 |
openstackgerrit | Merged openstack/python-tripleoclient master: undercloud upgrade: include UndercloudUpgrade service https://review.openstack.org/568716 | 06:47 |
*** saneax-_-|AFK is now known as saneax | 06:47 | |
chandankumar | jaosorior: what was the issue by the way? | 06:47 |
*** quiquell is now known as quiquell|afk | 06:49 | |
sshnaidm|rover | chandankumar, arxcruz, do you know about telemetry tests failing? http://logs.openstack.org/21/528621/16/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/de36d70/logs/undercloud/home/zuul/tempest_output.log.txt.gz | 06:50 |
*** apetrich has joined #tripleo | 06:50 | |
Tengu | chandankumar: ah, well, that is the issue jaosorior was tracking down :) -^ | 06:53 |
chandankumar | Tengu: is it the same telemetry issue? | 06:54 |
Tengu | 2s | 06:54 |
sshnaidm|rover | chandankumar, arxcruz fyi: https://bugs.launchpad.net/tripleo/+bug/1771508 | 06:54 |
openstack | Launchpad bug 1771508 in tripleo "Telemetry tests fail in scenario-001 and 002 jobs" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 06:54 |
Tengu | chandankumar: yes, exactly if I check this: http://logs.openstack.org/21/528621/16/check/tripleo-ci-centos-7-scenario002-multinode-oooq-container/de36d70/logs/undercloud/home/zuul/tempest/tempest.html.gz | 06:54 |
*** marrusl has quit IRC | 06:54 | |
chandankumar | sshnaidm|rover: ^^ jaosorior is hunting down | 06:54 |
sshnaidm|rover | chandankumar, it's blocking gates currently, so I'd propose to exclude it from tempest until it's solved | 06:55 |
*** hjensas has joined #tripleo | 06:55 | |
chandankumar | sshnaidm|rover: ack! | 06:55 |
*** slaweq has quit IRC | 06:56 | |
sshnaidm|rover | chandankumar, arxcruz can you please add it to exclude list for now? | 06:56 |
jaosorior | Tengu: haven't caught the issue... still deploying the environment | 06:57 |
jaosorior | just waiting for ansible to run | 06:58 |
jaosorior | like watching paint dry | 06:58 |
*** masco has quit IRC | 06:58 | |
*** udesale has quit IRC | 06:58 | |
*** slaweq has joined #tripleo | 06:58 | |
Tengu | XD | 06:58 |
Tengu | jaosorior: btw, you use quickstart for such deploy? | 06:59 |
*** olap has joined #tripleo | 07:07 | |
*** udesale has joined #tripleo | 07:07 | |
*** marrusl has joined #tripleo | 07:08 | |
*** ccamacho has joined #tripleo | 07:08 | |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart-extras master: Add test_autoscaling tests to skip list https://review.openstack.org/568766 | 07:09 |
chandankumar | sshnaidm|rover: ^^ | 07:09 |
openstackgerrit | Martin André proposed openstack/tripleo-quickstart-extras master: Remove duplicated undercloud_enable_tempest key https://review.openstack.org/568767 | 07:10 |
*** ooolpbot has joined #tripleo | 07:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 07:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 07:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771508 | 07:10 |
*** ooolpbot has quit IRC | 07:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 07:10 |
openstack | Launchpad bug 1771508 in tripleo "Telemetry tests fail in scenario-001 and 002 jobs" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 07:10 |
Tengu | oh. the issue I got (introspection fails) :). | 07:10 |
sshnaidm|rover | chandankumar, thanks! | 07:10 |
*** rcernin has quit IRC | 07:10 | |
Tengu | so won't obsess on the rechecks for now. | 07:10 |
*** masco has joined #tripleo | 07:11 | |
chandankumar | sshnaidm|rover: do we gate gnocchi github pr with tripleo? | 07:12 |
sshnaidm|rover | chandankumar, I don't think so | 07:12 |
chandankumar | sshnaidm|rover: I think we need one | 07:12 |
jaosorior | Tengu: yep, quickstart | 07:12 |
sshnaidm|rover | chandankumar, why github pr? gnocchi is in gerrit | 07:12 |
chandankumar | sshnaidm|rover: it is on github, it moved outside openstack a long time ago | 07:13 |
jaosorior | sshnaidm|rover: not anymore AFAIK | 07:13 |
chandankumar | sshnaidm|rover: https://github.com/gnocchixyz/gnocchi | 07:13 |
sshnaidm|rover | I see, didn't know about that.. | 07:14 |
sshnaidm|rover | and why to do it? to prevent gating and errors detection? :) | 07:14 |
chandankumar | sshnaidm|rover: yup, because telemetry-tempest-plugin gets out of sync | 07:15 |
chandankumar | sshnaidm|rover: below is the list of jobs running https://github.com/gnocchixyz/gnocchi/pull/878 | 07:16 |
sshnaidm|rover | it's only linters jobs, don't check functionality | 07:17 |
*** agopi has quit IRC | 07:17 | |
*** tesseract has joined #tripleo | 07:17 | |
sshnaidm|rover | well, I don't remember we ever gated gnocchi with tripleo | 07:17 |
chandankumar | sshnaidm|rover: adding a card in trello | 07:17 |
bandini | jaosorior: apologies, was away yesterday. I see https://review.openstack.org/#/c/554926/ has merged \o/ | 07:19 |
*** udesale has quit IRC | 07:19 | |
*** ykarel|away is now known as ykarel | 07:19 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo master: WIP move unfencing to meta_params https://review.openstack.org/568769 | 07:23 |
*** udesale has joined #tripleo | 07:24 | |
*** zoli is now known as zoli|sickday | 07:26 | |
*** zoli|sickday is now known as zoli | 07:27 | |
*** quiquell|afk is now known as quiquell | 07:33 | |
arxcruz | chandankumar: sshnaidm|rover hi, sorry, was on the train, anything i can do ? | 07:33 |
*** florianf has joined #tripleo | 07:34 | |
*** ffiore has joined #tripleo | 07:38 | |
*** tosky has joined #tripleo | 07:38 | |
*** elgxl has quit IRC | 07:39 | |
sshnaidm|rover | arxcruz, thanks, I think we are good atm | 07:43 |
*** agopi has joined #tripleo | 07:43 | |
sshnaidm|rover | arxcruz, but need to investigate why tempest test fail later | 07:43 |
arxcruz | :) | 07:44 |
*** moshele has joined #tripleo | 07:45 | |
arxcruz | sshnaidm|rover: k | 07:46 |
chandankumar | arxcruz: https://review.openstack.org/#/c/568766/ | 07:47 |
arxcruz | chandankumar: maybe we should contact pradk | 07:49 |
arxcruz | sshnaidm|rover: ^ | 07:49 |
chandankumar | arxcruz: yup, better assign to them | 07:49 |
*** jpena|off is now known as jpena | 07:50 | |
*** dmacpher has quit IRC | 07:50 | |
*** dmacpher has joined #tripleo | 07:51 | |
*** kopecmartin has joined #tripleo | 07:53 | |
*** amoralej|off is now known as amoralej | 07:53 | |
*** psahoo has quit IRC | 07:55 | |
*** ykarel is now known as ykarel|lunch | 07:56 | |
openstackgerrit | Martin Kopec proposed openstack/tripleo-quickstart-extras master: Remove hardcoded cinder v1 option for tempestconf https://review.openstack.org/568776 | 07:58 |
*** shardy has joined #tripleo | 08:00 | |
*** mdnadeem has joined #tripleo | 08:04 | |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Add python script to dynamically compose releases https://review.openstack.org/567521 | 08:05 |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo master: Move unfencing to meta_params https://review.openstack.org/568769 | 08:07 |
sshnaidm|rover | arxcruz, yeah, definitely | 08:09 |
*** ooolpbot has joined #tripleo | 08:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 08:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 08:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771508 | 08:10 |
*** ooolpbot has quit IRC | 08:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 08:10 |
openstack | Launchpad bug 1771508 in tripleo "Telemetry tests fail in scenario-001 and 002 jobs" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 08:10 |
*** psahoo has joined #tripleo | 08:11 | |
*** akrivoka has joined #tripleo | 08:14 | |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: [WIP] Implement --output-file to write the bash script https://review.openstack.org/568285 | 08:15 |
*** markmc has joined #tripleo | 08:21 | |
*** hewbrocca_afk has joined #tripleo | 08:22 | |
openstackgerrit | Martin André proposed openstack/tripleo-common master: Revert "Revert "Pass connection info via ansible config file"" https://review.openstack.org/568781 | 08:23 |
openstackgerrit | Martin André proposed openstack/tripleo-common master: Revert "Revert "Pass connection info via ansible config file"" https://review.openstack.org/568781 | 08:25 |
openstackgerrit | Doug Szumski proposed openstack/diskimage-builder master: Remove duplicate GRUB command line entry https://review.openstack.org/568600 | 08:31 |
openstackgerrit | Arx Cruz proposed openstack/tripleo-quickstart-extras master: Fix generation of docs https://review.openstack.org/568783 | 08:33 |
*** ssbarnea_ has joined #tripleo | 08:35 | |
*** dbecker has quit IRC | 08:37 | |
*** ykarel|lunch is now known as ykarel | 08:37 | |
*** derekh has joined #tripleo | 08:41 | |
*** agopi has quit IRC | 08:44 | |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates master: FFU Add cinder-backup missing fast_forward_upgrade_tasks https://review.openstack.org/568520 | 08:45 |
*** wolverineav has joined #tripleo | 08:46 | |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Run undeploy_plan workflow to delete deployment https://review.openstack.org/566366 | 08:51 |
*** gkadam has joined #tripleo | 08:53 | |
*** salmankhan has joined #tripleo | 08:54 | |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart master: Resolve deprecation syntax warning https://review.openstack.org/568789 | 08:59 |
*** radeks_ has joined #tripleo | 09:00 | |
*** gfidente has joined #tripleo | 09:06 | |
*** gfidente has quit IRC | 09:06 | |
*** gfidente has joined #tripleo | 09:06 | |
*** links has quit IRC | 09:07 | |
*** jistr has quit IRC | 09:09 | |
*** ooolpbot has joined #tripleo | 09:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 09:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 09:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771508 | 09:10 |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 09:10 |
*** ooolpbot has quit IRC | 09:10 | |
openstack | Launchpad bug 1771508 in tripleo "Telemetry tests fail in scenario-001 and 002 jobs" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 09:10 |
*** jistr has joined #tripleo | 09:12 | |
openstackgerrit | Jose Luis Franco proposed openstack/tripleo-upgrade master: Separate scripts creation tasks for under/overcloud. https://review.openstack.org/566291 | 09:16 |
openstackgerrit | Jose Luis Franco proposed openstack/tripleo-upgrade master: Move SSL undercloud validation out of UC script create tasks. https://review.openstack.org/567190 | 09:16 |
*** panda|bbl is now known as panda | 09:18 | |
*** links has joined #tripleo | 09:23 | |
*** bogdando has joined #tripleo | 09:23 | |
*** pmannidi has quit IRC | 09:27 | |
openstackgerrit | Bogdan Dobrelya proposed openstack/python-tripleoclient master: Persist generated undercloud parameters t-h-t https://review.openstack.org/565764 | 09:36 |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-quickstart-extras master: Fix path and wire-in UC deploy role data file https://review.openstack.org/563988 | 09:36 |
openstackgerrit | Carlos Goncalves proposed openstack/tripleo-common stable/queens: Install Octavia amphora image if Red Hat https://review.openstack.org/568801 | 09:40 |
*** dtantsur has joined #tripleo | 09:43 | |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Add python script to dynamically compose releases https://review.openstack.org/567521 | 09:47 |
jaosorior | bogdando: we have logrotate configured in the overcloud nodes, right? | 09:48 |
derekh | sshnaidm|rover: I'm looking into https://bugs.launchpad.net/tripleo/+bug/1770972 again | 09:50 |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 09:50 |
sshnaidm|rover | derekh, thanks, seems like the problem is back | 09:50 |
derekh | sshnaidm|rover: is the image download location logged for this job somewhere? https://logs.rdoproject.org/65/566565/7/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z14cd35a437b546d0a25ba0721c29b5d0/undercloud/home/jenkins/overcloud_prep_images.log.txt.gz#_2018-05-15_18_57_21 | 09:50 |
sshnaidm|rover | derekh, looking.. | 09:51 |
sshnaidm|rover | derekh, only those tasks: https://logs.rdoproject.org/65/566565/7/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z14cd35a437b546d0a25ba0721c29b5d0/console.txt.gz#_2018-05-15_17_21_34_645 | 09:52 |
sshnaidm|rover | derekh, but I see that md5 checking is being skipped, very odd.. checking now | 09:53 |
sshnaidm|rover | https://logs.rdoproject.org/65/566565/7/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z14cd35a437b546d0a25ba0721c29b5d0/console.txt.gz#_2018-05-15_17_21_37_476 | 09:53 |
sshnaidm|rover | derekh, damn.. I suspect I know what happens.. | 09:55 |
derekh | Also I noticed that the 2 jobs lined to as examples in the bug appear to be using a ramdisk with a different sizes to each other | 09:55 |
derekh | 2018-05-15 17:55:58 | | e5b21dfb-c043-4c26-8a0f-e1396c4d9ac6 | bm-deploy-ramdisk | ari | 391551106 | active | | 09:55 |
derekh | 2018-05-15 17:11:29 | | 8fe45a7e-e3b9-447f-8dbb-08762b79b9a7 | bm-deploy-ramdisk | ari | 393831283 | active | | 09:55 |
derekh | sshnaidm|rover: ya? | 09:56 |
derekh | *linked | 09:56 |
sshnaidm|rover | derekh, we check md5 here: https://github.com/openstack/tripleo-quickstart/blob/dc188905a1f5498a1562a4dda9963f79d25fe1ac/roles/fetch-images/tasks/fetch.yml#L136 | 09:56 |
sshnaidm|rover | derekh, but we use ansible cache for variables.. I'm not sure - with set_fact do we override cached variables or not? | 09:57 |
sshnaidm|rover | derekh, because if not - we check md5 for the previously downloaded image - overcloud.qcow2 | 09:57 |
derekh | sshnaidm|rover: I got no idea | 09:57 |
sshnaidm|rover | derekh, we run this role twice - for overcloud image and ipa, overcloud is first | 09:57 |
sshnaidm|rover | derekh, ok, I'll submit a patch | 09:58 |
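A rough sketch of the failure mode suspected in the exchange above, written in Python rather than Ansible. Everything in it is hypothetical (the `check_images` helper, the `cache` dict, the sample payloads), and it makes no claim about how `set_fact` or Ansible's fact cache really behaves. It only shows why reusing values computed for the first image would make the check on the second image meaningless.

```python
import hashlib


def md5_hex(payload: bytes) -> str:
    """Return the hex MD5 digest of an in-memory payload."""
    return hashlib.md5(payload).hexdigest()


def check_images(downloads, reuse_cached_md5):
    """Check each (name, payload, expected_md5) tuple in order.

    `cache` stands in for a variable store shared between two runs of the
    same role. With reuse_cached_md5=True the values computed for the first
    image are never overwritten (the suspected bug).
    """
    cache = {}
    verdicts = {}
    for name, payload, expected_md5 in downloads:
        if not (reuse_cached_md5 and "actual" in cache):
            cache["actual"] = md5_hex(payload)
            cache["expected"] = expected_md5
        verdicts[name] = cache["actual"] == cache["expected"]
    return verdicts


good_overcloud = b"overcloud image bytes"
good_ipa = b"ipa image bytes"
corrupt_ipa = b"ipa image byt"  # simulate a truncated download

downloads = [
    ("overcloud", good_overcloud, md5_hex(good_overcloud)),
    ("ipa", corrupt_ipa, md5_hex(good_ipa)),
]

stale = check_images(downloads, reuse_cached_md5=True)
fresh = check_images(downloads, reuse_cached_md5=False)
print("stale cache:", stale)
print("fresh values:", fresh)
```

With the stale cache the truncated second image is reported as valid, because the comparison still uses the first image's values; recomputing per image flags it.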
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart master: Don't cache calculated md5 variables https://review.openstack.org/568805 | 10:03 |
sshnaidm|rover | derekh, let's see ^^ | 10:03 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart master: Don't cache calculated md5 variables https://review.openstack.org/568805 | 10:04 |
sshnaidm|rover | derekh, btw, we changed image yesterday again in promotion | 10:05 |
sshnaidm|rover | derekh, if this image is corrupted, maybe we have problems with uploading the image.. | 10:05 |
*** bkopilov has quit IRC | 10:05 | |
derekh | sshnaidm|rover: I'm testing the wrong image so, this didn't change yesterday https://images.rdoproject.org/master/rdo_trunk/current-tripleo/ | 10:05 |
derekh | sshnaidm|rover: where should I be looking? | 10:06 |
sshnaidm|rover | derekh, image that is used in jobs now is https://images.rdoproject.org/master/rdo_trunk/current-tripleo/ | 10:06 |
derekh | sshnaidm|rover: that's what I was looking at, but the timestamp is from Monday | 10:07 |
sshnaidm|rover | derekh, ok, so is that image ok? | 10:08 |
derekh | sshnaidm|rover: it worked for me, | 10:08 |
sshnaidm|rover | ok.. so at least upload worked well | 10:08 |
derekh | 100% of the 1 time I tried it | 10:08 |
openstackgerrit | melissaml proposed openstack/tripleo-docs master: Trivial: Update pypi url to new url https://review.openstack.org/563241 | 10:10 |
*** ooolpbot has joined #tripleo | 10:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 10:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 10:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771508 | 10:10 |
*** ooolpbot has quit IRC | 10:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 10:10 |
openstack | Launchpad bug 1771508 in tripleo "Telemetry tests fail in scenario-001 and 002 jobs" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 10:10 |
*** salmankhan has quit IRC | 10:14 | |
sshnaidm|rover | panda, can you please +2 on https://review.openstack.org/#/c/568766/ ? | 10:16 |
*** wolverineav has quit IRC | 10:17 | |
*** salmankhan has joined #tripleo | 10:18 | |
*** wolverineav has joined #tripleo | 10:18 | |
*** agurenko_ has joined #tripleo | 10:18 | |
*** agurenko has quit IRC | 10:19 | |
sshnaidm|rover | chandankumar, what is your username in LP? | 10:19 |
*** wolverineav has quit IRC | 10:23 | |
*** jaganathan_ has quit IRC | 10:26 | |
*** agurenko_ is now known as agurenko | 10:28 | |
*** olap has quit IRC | 10:33 | |
*** olap has joined #tripleo | 10:34 | |
*** olap has quit IRC | 10:37 | |
*** olap has joined #tripleo | 10:41 | |
*** agurenko is now known as agurenko_ | 10:42 | |
*** olap has quit IRC | 10:44 | |
*** olap has joined #tripleo | 10:44 | |
*** links has quit IRC | 10:44 | |
openstackgerrit | Nir Magnezi proposed openstack/tripleo-heat-templates master: Make lb-mgmt-subnet a class B subnet https://review.openstack.org/568138 | 10:47 |
*** dciabrin has joined #tripleo | 10:47 | |
*** masco has quit IRC | 10:47 | |
*** olap has quit IRC | 10:49 | |
*** olap has joined #tripleo | 10:50 | |
openstackgerrit | Nir Magnezi proposed openstack/tripleo-common master: Make lb-mgmt-subnet a class B subnet https://review.openstack.org/568089 | 10:51 |
*** dciabrin has quit IRC | 10:52 | |
*** dciabrin has joined #tripleo | 10:53 | |
*** dciabrin has quit IRC | 10:54 | |
*** olap has quit IRC | 10:54 | |
*** mburned_out is now known as mburned | 10:55 | |
*** leanderthal has joined #tripleo | 10:56 | |
*** agurenko has joined #tripleo | 10:56 | |
*** dciabrin has joined #tripleo | 10:57 | |
*** agurenko_ has quit IRC | 10:57 | |
*** dciabrin has quit IRC | 10:57 | |
*** links has joined #tripleo | 10:57 | |
chandankumar | sshnaidm|rover: chkumar246 | 10:58 |
*** dciabrin has joined #tripleo | 10:59 | |
*** dciabrin has quit IRC | 10:59 | |
*** dciabrin has joined #tripleo | 10:59 | |
*** masco has joined #tripleo | 11:00 | |
sshnaidm|rover | chandankumar, ok, I assigned https://bugs.launchpad.net/tripleo/+bug/1771508 to pradk for now, but he isn't online, please feel free to ping him about that | 11:02 |
openstack | Launchpad bug 1771508 in tripleo "Telemetry tests fail in scenario-001 and 002 jobs" [Critical,Triaged] - Assigned to Pradeep Kilambi (pkilambi) | 11:02 |
chandankumar | sshnaidm|rover: ack | 11:02 |
*** janki has quit IRC | 11:03 | |
*** lucasagomes is now known as lucas-hungry | 11:04 | |
*** dciabrin_ has joined #tripleo | 11:04 | |
jaosorior | chandankumar, sshnaidm|rover: I'm also taking a look at it | 11:04 |
sshnaidm|rover | jaosorior, great, thanks! | 11:04 |
jaosorior | sshnaidm|rover: I'll poke pradk when he's online about this | 11:05 |
jaosorior | so we can both try to sovle it | 11:05 |
jaosorior | *solve | 11:05 |
chandankumar | jaosorior: sshnaidm|rover I have pinged pradk on #rhos-mm internally | 11:05 |
*** dciabrin has quit IRC | 11:07 | |
*** ooolpbot has joined #tripleo | 11:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 11:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 11:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771508 | 11:10 |
*** ooolpbot has quit IRC | 11:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 11:10 |
openstack | Launchpad bug 1771508 in tripleo "Telemetry tests fail in scenario-001 and 002 jobs" [Critical,Triaged] - Assigned to Pradeep Kilambi (pkilambi) | 11:10 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Add method for getting dlrn_hash from release and hash_name https://review.openstack.org/567320 | 11:13 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Add releases script pytest tests to tox.ini https://review.openstack.org/567649 | 11:13 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: [WIP] Add CLI argument parser and YAML file parser https://review.openstack.org/567936 | 11:13 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: [WIP] Implement --output-file to write the bash script https://review.openstack.org/568285 | 11:13 |
openstackgerrit | Dmitry Tantsur proposed openstack/tripleo-heat-templates master: Remove support for classic drivers https://review.openstack.org/567827 | 11:13 |
openstackgerrit | Dmitry Tantsur proposed openstack/instack-undercloud master: Remove support for classic drivers https://review.openstack.org/567886 | 11:13 |
openstackgerrit | Dmitry Tantsur proposed openstack/tripleo-common master: Fix handling hardware types and drivers when generating fencing parameters https://review.openstack.org/567896 | 11:14 |
*** ssbarnea_ has quit IRC | 11:14 | |
openstackgerrit | Dmitry Tantsur proposed openstack/tripleo-common master: Convert classic drivers to hardware types on enrollment https://review.openstack.org/567900 | 11:14 |
*** dciabrin_ has quit IRC | 11:18 | |
openstackgerrit | Arx Cruz proposed openstack/tripleo-quickstart-extras master: Fix generation of docs https://review.openstack.org/568783 | 11:23 |
*** jaosorior has quit IRC | 11:23 | |
*** dciabrin_ has joined #tripleo | 11:27 | |
florianf | jtomasek: so I ran into an interesting problem with the deployment status tracking | 11:28 |
*** olap has joined #tripleo | 11:29 | |
*** abishop has joined #tripleo | 11:31 | |
*** janki has joined #tripleo | 11:31 | |
florianf | jtomasek: I can't pinpoint exactly what happened, because I was switching ui patches between deployment attempts. But in the end what happened is: I started a deployment through the UI. then deleted it with `openstack stack delete overcloud` while it was deploying. this resulted in an error message shown in the UI ("updating a stack when it's deleting isn't supported") | 11:32 |
*** janki has quit IRC | 11:32 | |
*** quiquell is now known as quique|lunch | 11:32 | |
florianf | jtomasek: this error message is the last message stored in swift | 11:32 |
*** janki has joined #tripleo | 11:32 | |
florianf | jtomasek: But since then the stack has been deleted, but the error message still shows up. | 11:32 |
florianf | jtomasek: which makes sense, because I guess if you delete a stack through `openstack stack delete` it will not leave a message in swift and thus the UI will not recognize the status change. | 11:34 |
*** abishop has quit IRC | 11:34 | |
*** rh-jelabarre has joined #tripleo | 11:34 | |
*** panda is now known as panda|lunch | 11:36 | |
openstackgerrit | Yurii Prokulevych proposed openstack/tripleo-upgrade master: Run Ceph upgrade before converge. https://review.openstack.org/566282 | 11:36 |
*** dciabrin__ has joined #tripleo | 11:38 | |
*** dciabrin_ has quit IRC | 11:41 | |
*** jaosorior has joined #tripleo | 11:42 | |
*** gkadam has quit IRC | 11:43 | |
*** gkadam has joined #tripleo | 11:43 | |
*** dciabrin__ has quit IRC | 11:43 | |
*** raildo has joined #tripleo | 11:44 | |
*** rmascena has joined #tripleo | 11:44 | |
akrivoka | when deploying oooq on rdo cloud using devmode.sh, what is the proper place to make customizations for the deployment? I've tried changing the values in ~/.quickstart/config/environments/rdocloud.yml but they are getting overwritten with defaults when I run the devmode.sh | 11:45 |
openstackgerrit | Marios Andreou proposed openstack/python-tripleoclient master: Add .deployment.v1.deploy_on_servers to ffwd-upgrade prepare https://review.openstack.org/566336 | 11:45 |
akrivoka | e.g. I want to change undercloud_flavor and latest_guest_image | 11:46 |
*** jpena is now known as jpena|lunch | 11:47 | |
*** ssbarnea_ has joined #tripleo | 11:47 | |
*** jcoufal has joined #tripleo | 11:50 | |
*** atoth has joined #tripleo | 11:51 | |
jtomasek | florianf: yeah, the new undeploy workflow should be used to delete the deployment, so the status in swift gets updated too. When the user just deletes the stack, the deployment ends up in an incorrect state. This is roughly equivalent to the user deleting the swift object, for example; it is like deleting something from the database. But your issue would be resolved with these patches: https://review.openstack.org/#/c/564315/3 and https://review.openstack.org/#/c/566699/4 | 11:51 |
jtomasek | florianf: https://review.openstack.org/#/c/566699/4 | 11:51 |
jtomasek | florianf: one retrieves the deployment status from swift and other recovers deployment status based on stack status | 11:52 |
*** abishop has joined #tripleo | 11:52 | |
jtomasek | I raised a point of combining those together | 11:53 |
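One way the "combine the two" idea could look, as a minimal Python sketch. This is an assumption for illustration, not what the linked patches do: the status labels and the `effective_status` helper are hypothetical, and only the Heat-style stack status strings (for example `CREATE_IN_PROGRESS`) follow an existing naming scheme.

```python
from typing import Optional

# Hypothetical deployment status labels, for illustration only.
UNDEPLOYED = "UNDEPLOYED"
UNDEPLOYING = "UNDEPLOYING"
DEPLOYING = "DEPLOYING"
DEPLOY_SUCCESS = "DEPLOY_SUCCESS"
DEPLOY_FAILED = "DEPLOY_FAILED"
UNKNOWN = "UNKNOWN"


def status_from_stack(stack_status: Optional[str]) -> str:
    """Derive a deployment status from a Heat-style stack status string.

    None means the stack does not exist (never created, or deleted).
    """
    if stack_status is None or stack_status == "DELETE_COMPLETE":
        return UNDEPLOYED
    if stack_status == "DELETE_IN_PROGRESS":
        return UNDEPLOYING
    if stack_status.endswith("_IN_PROGRESS"):
        return DEPLOYING
    if stack_status.endswith("_FAILED"):
        return DEPLOY_FAILED
    if stack_status.endswith("_COMPLETE"):
        return DEPLOY_SUCCESS
    return UNKNOWN


def effective_status(stored: Optional[str], stack_status: Optional[str]) -> str:
    """Prefer the stored status, unless the stack says it is gone or going.

    `stored` stands in for the last status message kept in swift.
    """
    derived = status_from_stack(stack_status)
    if stored is None or derived in (UNDEPLOYED, UNDEPLOYING):
        return derived
    return stored


# The case from the conversation: a stale error is stored, the stack is gone.
print(effective_status(DEPLOY_FAILED, None))
```

The design choice here is that the stack's absence always wins over a stored message, which covers the case where the stack was deleted outside the workflow.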
openstackgerrit | Bogdan Dobrelya proposed openstack/python-tripleoclient master: Fix hiera data override file writing https://review.openstack.org/568818 | 11:53 |
bogdando | jaosorior: yes | 11:54 |
*** abishop has quit IRC | 11:55 | |
florianf | jtomasek: ok cool, so this is not an unknown issue. :-) | 11:55 |
chandankumar | sshnaidm|rover: jaosorior as per sileht it appears that haproxy is not running ssl while the panko endpoint is https | 11:58 |
sileht | or the reverse :) | 11:58 |
*** trown|outtypewww is now known as trown | 11:59 | |
jaosorior | chandankumar: isn't it? | 11:59 |
sileht | I just checked the haproxy config and it looks good | 11:59 |
sileht | I'm looking for the keystone catalog content to see what the endpoint is | 11:59 |
sileht | jaosorior, ^ | 11:59 |
*** wolverineav has joined #tripleo | 12:00 | |
bogdando | mwhahaha: hi, do you remember that bug for broken exit code of undercloud install? | 12:00 |
jaosorior | #startmeeting TripleO Security Squad | 12:00 |
openstack | Meeting started Wed May 16 12:00:47 2018 UTC and is due to finish in 60 minutes. The chair is jaosorior. Information about MeetBot at http://wiki.debian.org/MeetBot. | 12:00 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 12:00 |
*** openstack changes topic to " (Meeting topic: TripleO Security Squad)" | 12:00 | |
openstack | The meeting name has been set to 'tripleo_security_squad' | 12:00 |
jaosorior | I'll wait a little bit for more folks to log in | 12:00 |
lhinds | hey oz | 12:00 |
jaosorior | hey lhinds! how's it going? | 12:01 |
lhinds | good thanks | 12:01 |
jaosorior | #topic Public TLS work update | 12:06 |
*** openstack changes topic to "Public TLS work update (Meeting topic: TripleO Security Squad)" | 12:06 | |
jaosorior | right! so | 12:07 |
*** amoralej is now known as amoralej|lunch | 12:07 | |
jaosorior | public TLS by default merged | 12:07 |
jaosorior | ....and it was reverted :D | 12:07 |
jaosorior | It was reverted here https://review.openstack.org/#/c/568699/ | 12:08 |
jaosorior | because of this bug https://bugs.launchpad.net/tripleo/+bug/1771435 | 12:08 |
openstack | Launchpad bug 1771435 in tripleo "scenario001/002 failing on autoscaling with urllib3.exceptions.SSLError: [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:579)" [Critical,Fix released] - Assigned to Alex Schultz (alex-schultz) | 12:08 |
jaosorior | it seems that tempest (the telemetry plugin) is poking panko | 12:09 |
*** lucas-hungry is now known as lucasagomes | 12:09 | |
jaosorior | and it gets a TLS endpoint with a non-TLS port (for some strange reason) | 12:09 |
jaosorior | I'm still not sure why that happens | 12:09 |
jaosorior | but I'm looking into it | 12:09 |
jaosorior | seems sileht is also looking into it | 12:10 |
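A small local reproduction of the general symptom described above (a TLS client talking to a port that answers in plain text), as a hedged Python sketch. It does not touch panko, haproxy, or tempest; the helper names are made up for this example, and the exact error text depends on the OpenSSL version in use.

```python
import socket
import ssl
import threading


def start_plaintext_server():
    """Start a one-shot plain TCP server on localhost and return its port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve():
        conn, _ = listener.accept()
        with conn:
            conn.recv(4096)  # the client's TLS ClientHello arrives here
            conn.sendall(b"HTTP/1.1 400 Bad Request\r\n\r\n")  # plain text reply
        listener.close()

    threading.Thread(target=serve, daemon=True).start()
    return port


def tls_handshake_error(port):
    """Attempt a TLS handshake against the port; return the SSLError or None."""
    context = ssl.create_default_context()
    context.check_hostname = False  # local demo only, no certificate involved
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=5) as raw:
            with context.wrap_socket(raw):
                return None
    except ssl.SSLError as exc:
        return exc


port = start_plaintext_server()
error = tls_handshake_error(port)
print(type(error).__name__, error)
```

The handshake fails because the client receives plain text where it expects a TLS record, which is the same class of failure as an https endpoint URL pointing at a non-TLS port.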
*** ooolpbot has joined #tripleo | 12:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 12:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 12:10 |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 12:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771508 | 12:10 |
*** ooolpbot has quit IRC | 12:10 | |
openstack | Launchpad bug 1771508 in tripleo "Telemetry tests fail in scenario-001 and 002 jobs" [Critical,Triaged] - Assigned to Pradeep Kilambi (pkilambi) | 12:10 |
jaosorior | if someone wants to help with that | 12:10 |
jaosorior | I can provide details about how to reproduce it | 12:10 |
jaosorior | so let me know | 12:10 |
jaosorior | help is very much appreciated | 12:11 |
*** shardy_ has joined #tripleo | 12:11 | |
jaosorior | once that merges, then just docs are missing and we'll have public TLS by default :D | 12:11 |
Tengu | can't help more than what I did for now :/ | 12:11 |
Tengu | learning curve is nice :3 | 12:11 |
sshnaidm|rover | derekh, weshay I suspect there is a different problem with images | 12:12 |
jaosorior | Tengu: you're getting your system tomorrow, right? | 12:12 |
Tengu | the builder? yep. | 12:12 |
sshnaidm|rover | derekh, we update our images in the job: https://logs.rdoproject.org/15/568715/2/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z5df1951657694a9ebaad63e71362a76a/console.txt.gz#_2018-05-16_04_15_05_346 | 12:12 |
sshnaidm|rover | derekh, it's done so: https://github.com/openstack/tripleo-quickstart-extras/blob/69ad943adda9000f79277f0230a5751869de9cb3/roles/modify-image/tasks/manual.yml#L33-L70 | 12:13 |
*** shardy has quit IRC | 12:13 | |
jaosorior | Tengu: let me know and I can help you reproduce the issue | 12:13 |
sshnaidm|rover | derekh, weshay but what we have when running update: https://logs.rdoproject.org/15/568715/2/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Z5df1951657694a9ebaad63e71362a76a/undercloud/home/jenkins/repo_setup.sh.1526444104.log.txt.gz | 12:13 |
sshnaidm|rover | it may be a reason for failures.. | 12:14 |
jaosorior | any other questions/feedback on the public TLS stuff? | 12:14 |
Tengu | jaosorior: ok :). | 12:14 |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: DNM: build image in every OVB job https://review.openstack.org/568258 | 12:14 |
weshay | oof | 12:15 |
*** ansmith has quit IRC | 12:16 | |
Tengu | hello weshay :) | 12:16 |
*** tiswanso has quit IRC | 12:16 | |
jaosorior | #topic Secret management | 12:16 |
*** openstack changes topic to "Secret management (Meeting topic: TripleO Security Squad)" | 12:16 | |
*** dprince has joined #tripleo | 12:16 | |
jaosorior | So, I sent out a mail about enabling swift volume encryption by default http://lists.openstack.org/pipermail/openstack-dev/2018-May/130529.html | 12:17 |
jaosorior | mwhahaha: are you around? I saw you reviewed the patch and had some concerns | 12:17 |
mwhahaha | Sorta | 12:18 |
jaosorior | mwhahaha: swift isn't really poked that much anymore | 12:19 |
weshay | matbu, chem https://review.openstack.org/#/c/568680/ | 12:19 |
mwhahaha | So the perf thing probably ok | 12:19 |
jaosorior | mwhahaha: just to store the plan and get the plan out | 12:19 |
jaosorior | update the plan from the UI | 12:19 |
jaosorior | that's about it AFAIK | 12:19 |
jaosorior | ooh and get artifacts from the overcloud | 12:19 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-upgrade master: add container minimal check and gate https://review.openstack.org/568733 | 12:19 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart-extras master: WIP: Reproduce CI multinode job with libvirt https://review.openstack.org/543429 | 12:20 |
mwhahaha | But more services is kinda a problem, also how secure is a generic barbican | 12:20 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-upgrade master: add container minimal check and gate https://review.openstack.org/568733 | 12:20 |
*** salmankhan has quit IRC | 12:20 | |
mwhahaha | Like would luks be better | 12:21 |
*** tcw has quit IRC | 12:21 | |
jaosorior | mwhahaha: it isn't great, but from there we can move forward to using the pkcs11 plugin for the more security-conscious | 12:21 |
Tengu | mwhahaha: you'd still get the key somewhere, or have to manually enter the encryption password after each reboot | 12:21 |
mwhahaha | Luks solves the data at rest problem better imho | 12:22 |
mwhahaha | And the undercloud is less of a problem for automatic reboots | 12:22 |
*** salmankhan has joined #tripleo | 12:23 | |
mwhahaha | Since we don't assume 100% uptime | 12:23 |
*** fultonj has joined #tripleo | 12:23 | |
*** fultonj has joined #tripleo | 12:23 | |
mwhahaha | Having dealt with hsm's before I'd rather we recommend luks for the undercloud | 12:24 |
mwhahaha | That's my take on it | 12:24 |
jaosorior | mwhahaha: some people require hardware security | 12:24 |
jaosorior | some folks even want to tie luks to an hsm | 12:24 |
mwhahaha | Then those people enable it | 12:24 |
mwhahaha | But not by default | 12:24 |
EmilienM | bogdando: thx for https://review.openstack.org/#/c/568818/ | 12:25 |
mwhahaha | I don't see upside to it being on by default | 12:25 |
jaosorior | alright, those are valid points; I'll leave the commit up there for a bit and see what other folks think; more feedback is always good :) | 12:25 |
mwhahaha | False sense of security is bad :D | 12:26 |
jaosorior | agreed | 12:27 |
Tengu | small question: is there a way to trigger an rdo third party CI without triggering zuul? | 12:27 |
*** pkovar has joined #tripleo | 12:27 | |
beagles | are the current containerized undercloud install docs in https://docs.openstack.org/tripleo-docs/latest/install/installation/installing.html correct? | 12:27 |
mwhahaha | Tengu: check-rdo | 12:27 |
Tengu | mwhahaha: thank you! | 12:28 |
jaosorior | #topic Kerberos auth for keystone | 12:28 |
*** openstack changes topic to "Kerberos auth for keystone (Meeting topic: TripleO Security Squad)" | 12:28 | |
jaosorior | Alright, something else I wanted to bring up was a (relatively) low hanging fruit | 12:28 |
*** abishop has joined #tripleo | 12:28 | |
jaosorior | keystone supports kerberos for authentication, and I don't think it would be too hard to do (you can do a TLS everywhere deployment if you need kerberos around) | 12:29 |
beagles | I'm getting what appears to be issues in configuring nova_placement, heat_api, ironic_api, mysql, ironic, mistral, zaqar, nova, keystone...well basically everything I think | 12:29 |
jaosorior | some folks have expressed interest about it, so I thought it would be a good thing to have | 12:29 |
*** rlandy has joined #tripleo | 12:29 | |
sshnaidm|rover | weshay, well, seems like we can't update images at all, jobs pass only when we build them.. | 12:29 |
jaosorior | so, if someone wants to pick up that work, I can provide details on how to do it | 12:29 |
jaosorior | so, let me know :D | 12:29 |
Tengu | jaosorior: is there some open issue for that? | 12:30 |
jaosorior | Tengu: there isn't; didn't think about tracking it with launchpad given it's not a bug but a feature request :D | 12:30 |
Tengu | there are RFEs on launchpad :). | 12:31 |
jaosorior | OK, I can write one then | 12:31 |
*** abishop has quit IRC | 12:31 | |
jaosorior | #action jaosorior to write an RFE bug about Kerberos authentication | 12:31 |
Tengu | that would be best in order to follow | 12:32 |
*** abishop has joined #tripleo | 12:32 | |
jaosorior | I'll provide all the details needed to get that working on that bug | 12:32 |
jaosorior | #topic Any other business | 12:33 |
*** openstack changes topic to "Any other business (Meeting topic: TripleO Security Squad)" | 12:33 | |
weshay | sshnaidm|rover, ok.. I like the patch | 12:33 |
jaosorior | Anything else folks want to bring up to the meeting? | 12:33 |
weshay | thanks sshnaidm|rover | 12:33 |
lhinds | jaosorior: yup | 12:33 |
lhinds | #topic limiting heat-admin | 12:33 |
lhinds | so I have my new machine now and have been thinking of taking the following approach to get a list of every sudo call. | 12:34 |
jaosorior | #topic limiting heat-admin | 12:34 |
*** openstack changes topic to "limiting heat-admin (Meeting topic: TripleO Security Squad)" | 12:34 | |
*** panda|lunch is now known as panda | 12:34 | |
lhinds | in audit you can track all sudo calls: | 12:34 |
lhinds | https://github.com/openstack/tripleo-heat-templates/blob/master/environments/auditd.yaml#L109 | 12:34 |
lhinds | /var/log/audit/* | 12:35 |
*** liverpooler has joined #tripleo | 12:35 | |
lhinds | The puppet service can be used to set this up in the overcloud with an environment file, but seeking advice on how I could do this for the undercloud | 12:35 |
jaosorior | lhinds: well, we're moving towards having a containerized undercloud, which would be deployed with t-h-t as well | 12:36 |
lhinds | I guess I could use guestfs into the image and set it up there. I could also add a grub2.conf option to enable it early in the boot phase. | 12:36 |
jaosorior | lhinds: so you could enable the same functionality for the undercloud that way | 12:36 |
lhinds | jaosorior: ack, see what you mean. So would I be able to pull in an environment file to configure audit within the undercloud | 12:37 |
lhinds | container or vm | 12:37 |
jaosorior | right | 12:37 |
lhinds | it won't be a feature, just a debug method to help me see sudo calls | 12:37 |
jaosorior | understood | 12:38 |
jaosorior | that's a good start for that | 12:38 |
lhinds | I guess I can ping you with this outside the meeting if you can help me jaosorior | 12:38 |
jaosorior | lhinds: sure! | 12:39 |
lhinds | just need to grok the best way to do it, and then I will be on my way to getting it scoped out and a patch submitted | 12:39 |
lhinds | lets do that (will send a DM to you) | 12:39 |
jaosorior | awesome | 12:40 |
lhinds | I can then see a complete list of every user who calls sudo (so validations, nova, keystone etc) | 12:40 |
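The idea of listing who calls sudo from the audit log can be sketched roughly as below. This is a minimal sketch, not lhinds' actual approach: it assumes sudo invocations appear as `USER_CMD` records with a `uid=` field in files under /var/log/audit/, and the helper name `count_sudo_calls` is hypothetical.

```python
import re
from collections import Counter

# Record type at the start of an audit line, e.g. "type=USER_CMD ...".
_TYPE_RE = re.compile(r"^type=(\S+)")
# The uid= field; \b keeps this from matching inside "auid=".
_UID_RE = re.compile(r"\buid=(\d+)")


def count_sudo_calls(lines):
    """Count USER_CMD audit records per uid (hypothetical helper).

    Assumption: sudo command executions are logged as USER_CMD records.
    Other record types are ignored.
    """
    counts = Counter()
    for line in lines:
        type_match = _TYPE_RE.match(line)
        if not type_match or type_match.group(1) != "USER_CMD":
            continue
        uid_match = _UID_RE.search(line)
        if uid_match:
            counts[int(uid_match.group(1))] += 1
    return counts
```

Feeding it the lines of an audit log file would give a per-uid tally, which could then be mapped to names such as heat-admin or the validations user.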
jaosorior | sounds like a plan to get this started | 12:40 |
lhinds | cool. that's it for me. | 12:40 |
jaosorior | the main concern I guess is heat-admin and validations | 12:40 |
jaosorior | openstack services have their own sudoer rules, which look alright, as far as I've seen | 12:40 |
lhinds | yup, validations is the big one..so i also need to think about making sure validations makes lots of noise and gets used a lot | 12:41 |
lhinds | jaosorior: there is also rootwrap which nicely limits things | 12:41 |
jaosorior | #topic Any other business | 12:43 |
*** openstack changes topic to "Any other business (Meeting topic: TripleO Security Squad)" | 12:43 | |
*** jpena|lunch is now known as jpena | 12:43 | |
jaosorior | Anything else folks want to bring up? | 12:43 |
*** pchavva has joined #tripleo | 12:44 | |
*** masco has quit IRC | 12:44 | |
jaosorior | Alright, thanks for joining folks! | 12:45 |
jaosorior | #endmeeting | 12:45 |
*** openstack changes topic to "Welcome to Rocky. CI status: YELLOW (OVB failing) | http://tripleo.org/ | https://docs.openstack.org/tripleo-docs/latest" | 12:45 | |
openstack | Meeting ended Wed May 16 12:45:08 2018 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 12:45 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/tripleo_security_squad/2018/tripleo_security_squad.2018-05-16-12.00.html | 12:45 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/tripleo_security_squad/2018/tripleo_security_squad.2018-05-16-12.00.txt | 12:45 |
openstack | Log: http://eavesdrop.openstack.org/meetings/tripleo_security_squad/2018/tripleo_security_squad.2018-05-16-12.00.log.html | 12:45 |
*** tiswanso has joined #tripleo | 12:45 | |
*** rbowen has joined #tripleo | 12:45 | |
*** tcw has joined #tripleo | 12:45 | |
lhinds | thanks jaosorior | 12:45 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart-extras master: DNM: disable update of IPA image in jobs https://review.openstack.org/568833 | 12:45 |
*** tiswanso has quit IRC | 12:45 | |
*** tiswanso has joined #tripleo | 12:46 | |
*** Goneri has joined #tripleo | 12:51 | |
openstackgerrit | Bogdan Dobrelya proposed openstack/python-tripleoclient master: Log errors with raised exceptions https://review.openstack.org/568837 | 12:54 |
derekh | sshnaidm|rover: nice one, that DNM patch should confirm it | 12:56 |
sshnaidm|rover | derekh, I think we hit this: https://serverfault.com/questions/911781/yum-rpm-failed-to-initialize-nss-library-in-chroot | 12:57 |
*** moguimar has quit IRC | 12:57 | |
*** masco has joined #tripleo | 12:57 | |
openstackgerrit | Merged openstack/tripleo-quickstart-extras master: Add test_autoscaling tests to skip list https://review.openstack.org/568766 | 12:58 |
openstackgerrit | Merged openstack/tripleo-ui master: Exclude nodes deployed with another deployment plan https://review.openstack.org/563646 | 12:58 |
openstackgerrit | Merged openstack/tripleo-ui master: Add sanitizeMessage function https://review.openstack.org/564502 | 12:58 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart-extras master: Mount /dev for chrooted environment https://review.openstack.org/568838 | 12:59 |
sshnaidm|rover | derekh, but I'm not sure if i do it right ^^ | 12:59 |
jaosorior | sileht: have you found out anything else about the bug? | 12:59 |
sileht | jaosorior, not really, do we drop the keystone catalog somewhere ? | 12:59 |
jaosorior | sileht: I tried reproducing it, but my deployment is failing cause it can't scale the stack for some reason | 13:00 |
jaosorior | sileht: not AFAIK | 13:00 |
sshnaidm|rover | derekh, it's the same as here: https://github.com/openstack-infra/tripleo-ci/blob/3a8922988fb915ba8830fc00abf28beb10c45ecb/scripts/common_functions.sh#L119 | 13:00 |
slagle | jtomasek: hi. i was thinking that before someone would use https://review.openstack.org/#/c/564315/, they would've upgraded their overcloud, so it doesn't need to support both old and new | 13:00 |
sileht | jaosorior, I only found a partial output that doesn't really help | 13:00 |
slagle | jtomasek: as the workflow you're adding would do the conversion | 13:00 |
*** rmascena has quit IRC | 13:01 | |
*** toure|gone is now known as toure | 13:01 | |
sshnaidm|rover | derekh, please advise if you know how to apply this solution for our case.. | 13:01 |
*** ratailor has quit IRC | 13:01 | |
derekh | sshnaidm|rover: trying to find out | 13:02 |
sileht | jaosorior, can you list the keystone catalog on your installation ? | 13:05 |
dtantsur | jaosorior: hey! are you The One to ask about SSL in the undercloud? :) | 13:05 |
jaosorior | dtantsur: yes | 13:06 |
derekh | sshnaidm|rover: looks ok to me http://paste.openstack.org/show/721090/ | 13:07 |
jaosorior | sileht: no; I just nuked it to deploy a bigger deployment to try to deal with the scaling issues | 13:07 |
derekh | sshnaidm|rover: but you probably also need to unmount it | 13:07 |
jaosorior | sorry :( | 13:07 |
dtantsur | jaosorior: I have an issue here, which is probably no one's fault, just imperfection of the world. on an undercloud with SSL nothing using requests (e.g. openstack clients) works in a venv | 13:08 |
dtantsur | do you think we could provide some CA environment variables in stackrc to help requests find the SSL bundle? | 13:08 |
*** ukalifon has joined #tripleo | 13:09 | |
jaosorior | dtantsur: yes we could | 13:10 |
sshnaidm|rover | derekh, seems like I need to do it after image archived and before removing mountdir? https://github.com/openstack/tripleo-quickstart-extras/blob/4f205f4cea20a3c690eda9ff7856b197b42e4f9a/roles/modify-image/tasks/manual.yml#L70 | 13:11 |
dtantsur | jaosorior: I can try giving it a shot, but I really don't know where to start | 13:11 |
sshnaidm|rover | derekh, I mean umount | 13:11 |
*** ooolpbot has joined #tripleo | 13:11 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 13:11 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 13:11 |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 13:11 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771508 | 13:11 |
*** ooolpbot has quit IRC | 13:11 | |
openstack | Launchpad bug 1771508 in tripleo "Telemetry tests fail in scenario-001 and 002 jobs" [Critical,Triaged] - Assigned to Pradeep Kilambi (pkilambi) | 13:11 |
jaosorior | dtantsur: How was it deployed? with a user-provided cert? or with the autogenerated one? | 13:11 |
dtantsur | jaosorior: autogenerated one, I guess. at least I did not provide anything :) | 13:12 |
*** derekh has quit IRC | 13:12 | |
*** ansmith has joined #tripleo | 13:12 | |
*** derekh has joined #tripleo | 13:12 | |
jaosorior | dtantsur: it depends on the CA used, the default local CA (which is what it's using I think), uses '/etc/pki/ca-trust/source/anchors/cm-local-ca.pem' | 13:13 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart-extras master: Mount /dev for chrooted environment https://review.openstack.org/568838 | 13:13 |
trozet | mwhahaha: i've gotten past the pacemaker issue and made it to step 4 (yay), but now it gets to step 4 in ansible and just hangs forever | 13:13 |
*** dhill_ has joined #tripleo | 13:13 | |
sshnaidm|rover | derekh, ^^ | 13:13 |
trozet | mwhahaha: do you know of any bugs filed around that? | 13:13 |
jaosorior | dtantsur: if it would be using FreeIPA, it would be /etc/ipa/ca.crt | 13:13 |
*** mcornea has joined #tripleo | 13:13 | |
derekh | sshnaidm|rover: before the "find . -print" id say | 13:13 |
jaosorior | and if it would be user-provided... then who knows :D | 13:13 |
dtantsur | well, I have /etc/pki/ca-trust/source/anchors/cm-local-ca.pem | 13:13 |
dtantsur | oh, this starts sounding even more complex.. | 13:13 |
openstackgerrit | Gabriele Cerami proposed openstack-infra/tripleo-ci master: Add ability to use a different release per playbook https://review.openstack.org/566565 | 13:14 |
jaosorior | dtantsur: PKI is pain | 13:14 |
jaosorior | dtantsur: but yeah, the most common case would be /etc/pki/ca-trust/source/anchors/cm-local-ca.pem | 13:14 |
Tengu | :] | 13:14 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart-extras master: Mount /dev for chrooted environment https://review.openstack.org/568838 | 13:15 |
sshnaidm|rover | derekh, like that? ^^ | 13:15 |
derekh | sshnaidm|rover: for "Close initramfs " id move it up 2 lines, we were not zipping up the dev directory before so no need to start | 13:15 |
derekh | sshnaidm|rover: yup | 13:15 |
*** lblanchard has joined #tripleo | 13:15 | |
sshnaidm|rover | great, let's see | 13:15 |
* sshnaidm|rover is crossing finger on both hands and legs | 13:15 | |
*** tzumainn has joined #tripleo | 13:17 | |
dtantsur | jaosorior: okay, setting --os-cacert to this path fixes the problem for me | 13:18 |
dtantsur | now I need to understand where to wire it in... | 13:18 |
jaosorior | dtantsur: well, there is the OS_CACERT environment variable. That could go in the stackrc | 13:19 |
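The CA lookup discussed above can be sketched as follows. This is a minimal sketch under stated assumptions: it takes the default local CA path jaosorior mentions as the fallback and reads the `OS_CACERT` and `REQUESTS_CA_BUNDLE` environment variables; the helper name `resolve_ca_bundle` is hypothetical and not part of any OpenStack client.

```python
import os

# Default local CA path mentioned in the discussion; other deployments
# (FreeIPA, user-provided certificates) would use a different file.
DEFAULT_LOCAL_CA = "/etc/pki/ca-trust/source/anchors/cm-local-ca.pem"


def resolve_ca_bundle(environ=None, default=DEFAULT_LOCAL_CA):
    """Pick a CA bundle path for a client running in a venv.

    Hypothetical helper: prefers OS_CACERT, then REQUESTS_CA_BUNDLE,
    then falls back to the default local CA path.
    """
    env = os.environ if environ is None else environ
    for name in ("OS_CACERT", "REQUESTS_CA_BUNDLE"):
        value = env.get(name)
        if value:
            return value
    return default
```

A script using requests directly could pass the result as `verify=resolve_ca_bundle()`; exporting `OS_CACERT` from stackrc would let the same lookup work without code changes.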
*** lblanchard has quit IRC | 13:20 | |
dtantsur | now I need to figure out where stackrc is generated for containerized undercloud. anyone remembers from the top of the head? | 13:20 |
*** lblanchard has joined #tripleo | 13:20 | |
jaosorior | I don't :( | 13:21 |
mwhahaha | trozet: no i'm not aware of that | 13:22 |
dtantsur | jaosorior: filed https://bugs.launchpad.net/tripleo/+bug/1771565 | 13:23 |
openstack | Launchpad bug 1771565 in tripleo "SSL custom certificates do not work with anything using request in a venv" [Medium,Triaged] - Assigned to Dmitry Tantsur (divius) | 13:23 |
*** quique|lunch is now known as quiquell | 13:23 | |
jaosorior | dtantsur: where is requests coming from in that case? pip? | 13:25 |
dtantsur | bogdando: hey! in the new shiny containerized world where is stackrc generated? | 13:26 |
*** amoralej|lunch is now known as amoralej | 13:26 | |
*** hjensas has quit IRC | 13:26 | |
dtantsur | jaosorior: anything using python-requests in a venv | 13:26 |
dtantsur | okay, I guess it's here: https://github.com/openstack/tripleo-heat-templates/blob/fea5bfbcc81f322812720f29459d5ae648a4647b/extraconfig/post_deploy/undercloud_post.sh#L14 | 13:27 |
bogdando | dtantsur: /root/stackrc | 13:27 |
bogdando | or /home/stack/stackrc | 13:27 |
dtantsur | yeah, I was wondering about the code generating it. I think I've found it | 13:28 |
dtantsur | jaosorior: okay, I have to update https://github.com/openstack/tripleo-heat-templates/blob/fea5bfbcc81f322812720f29459d5ae648a4647b/extraconfig/post_deploy/undercloud_post.sh#L14 apparently. Is there a variable in THT that points at the CA file? | 13:28 |
sshnaidm|rover | mwhahaha, I opened 2 bugs today about fs010 failing in pike and queens, which block them, should I add "alert" there? https://bugs.launchpad.net/tripleo/+bug/1771551 and https://bugs.launchpad.net/tripleo/+bug/1771549 | 13:29 |
openstack | Launchpad bug 1771551 in tripleo "Containers multinode jobs fails on stable pike because of pacemaker" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 13:29 |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 13:29 |
jaosorior | dtantsur: there isn't (but there should) it's in puppet. | 13:31 |
* dtantsur takes a big sip of vodka | 13:31 | |
openstackgerrit | Ronelle Landy proposed openstack-infra/tripleo-ci master: Add dry run option to toci_quickstart https://review.openstack.org/567060 | 13:31 |
slagle | are we still YELLOW? ovb appears to be "passing" when the mood strikes | 13:32 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-quickstart master: Deploy TLS overcloud in fs051 (sc000 upgrade job) https://review.openstack.org/568843 | 13:33 |
alee | jaosorior, mcornea so it looks like the right tags are being set - but the password job still is skipping the password related tasks .. http://logs.openstack.org/97/567897/4/experimental/tripleo-ci-centos-7-scenario000-multinode-oooq-container-password-changes/9118617/ | 13:33 |
jaosorior | dtantsur: there is a variable (I had forgotten) | 13:33 |
jaosorior | dtantsur: it's InternalTLSCAFile | 13:33 |
alee | looks like we're still missing something to get them to be invoked | 13:33 |
dtantsur | \o/ | 13:34 |
jaosorior | dtantsur: check environments/public-tls-undercloud.yaml | 13:34 |
dtantsur | thanks jaosorior | 13:34 |
* dtantsur puts vodka away | 13:34 | |
jistr | jaosorior: fyi our current blocking issue on upgrade job is related to TLS https://bugs.launchpad.net/tripleo/+bug/1771567 | 13:34 |
openstack | Launchpad bug 1771567 in tripleo "sc000 upgrade failing on haproxy TLS config during initial deployment" [High,Triaged] | 13:34 |
alee | mcornea, made the changes you suggested in tripleo-upgrade | 13:35 |
openstackgerrit | James Slagle proposed openstack/python-tripleoclient master: overcloud plan deployment failures https://review.openstack.org/568673 | 13:35 |
mcornea | alee: hey, I noticed but I didn't get to review yet | 13:35 |
jaosorior | jistr: I see; so as part of that upgrade we should pass the no-tls environment | 13:35 |
jistr | jaosorior: it fails on deploy, before trying any upgrade, i suspect it's because we use Rocky UC to deploy Queens OC, and the client+common from Rocky doesn't work well with Queens templates with something TLS related. I'm trying to take the easy way out by enabling TLS in the upgrade job on deploy too https://review.openstack.org/#/c/568843/ | 13:35 |
alee | mcornea, cool - its just that despite the changes, it seems like the tasks are still not being executed | 13:37 |
jaosorior | jistr: yeah, that would be the easiest option | 13:37 |
jaosorior | jistr: but yeah, for non-tls deployments, we should use the no-tls environment when upgrading | 13:37 |
alee | jaosorior, trown -- any ideas? | 13:37 |
jaosorior | (still needs to write the docs) | 13:37 |
*** cshastri has quit IRC | 13:38 | |
shardy_ | jtomasek: Hi, just been thinking about how we might adapt capabilities-map.yaml to also work with non-openstack plans, have you already looked into that? | 13:38 |
*** shardy_ is now known as shardy | 13:38 | |
jaosorior | trown, sshnaidm|rover: Do all the tasks in quickstart need to have tags? Would that be what alee is missing? | 13:38 |
openstackgerrit | James Slagle proposed openstack/python-tripleoclient master: overcloud plan deployment failures https://review.openstack.org/568673 | 13:38 |
sshnaidm|rover | jaosorior, if you don't use "-t all", then tasks should have tags to be executed | 13:39 |
jistr | jaosorior: yea makes sense re passing the env file. We don't get far enough to attempt an upgrade there though :/ We might need to tweak the Rocky client/common to be able to deploy Queens at all. | 13:39 |
shardy | jtomasek: There seem to be two missing parts, a way to restrict a plan to only specific environment_groups, and a way to do globbing in environment_groups so we can e.g have an OpenShift group that points to environments/openshift or whatever | 13:39 |
sshnaidm|rover | jaosorior, usually tags are in high levels for files/roles, not for each task | 13:39 |
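sshnaidm|rover's description of tag selection can be modeled roughly as below. This is a simplified sketch, not Ansible's implementation: the `Task` class and `select_tasks` function are hypothetical, and special cases such as the `never` tag are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    tags: set = field(default_factory=set)


def select_tasks(tasks, requested, inherited=frozenset()):
    """Return names of tasks that would run for the requested tags.

    Simplified model: a task runs if "all" is requested, if it carries
    the "always" tag, or if its own tags plus the tags inherited from
    the enclosing role/playbook intersect the requested tags.
    """
    requested = set(requested)
    selected = []
    for task in tasks:
        effective = set(task.tags) | set(inherited)
        if "all" in requested or "always" in effective or effective & requested:
            selected.append(task.name)
    return selected
```

In this model, requesting a tag that no task or enclosing playbook carries selects nothing, which would look like every task being skipped.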
alee | sshnaidm|rover, so - it looks like all the tasks do have tags .. | 13:39 |
alee | so they are listed but are still being skipped | 13:40 |
jistr | jaosorior: i'm not sure if we support fresh deployment of Q OC with R UC, but we should support at least basic management of Q OC from R UC, i'm not sure if that broke too or if it's just fresh deploy which gets in trouble | 13:40 |
sshnaidm|rover | alee, I need to know what's the problem to answer :) | 13:40 |
alee | sshnaidm|rover, thanks -- let me give some context :) | 13:40 |
jistr | jaosorior: anyway, trying to postpone this by going all-TLS and we'll see how far that gets us | 13:41 |
alee | sshnaidm|rover, I'm trying to add a job to check password changes | 13:41 |
jaosorior | jistr: want to have a talk tomorrow to see how we can address this? | 13:41 |
alee | sshnaidm|rover, https://review.openstack.org/#/c/567897/ | 13:41 |
mcornea | alee: so checking the output I see that you now passed overcloud-config-change tag to the multinode-overcloud-upgrade.yml playbook but that playbook doesn't contain the tag as it does for upgrade https://github.com/openstack/tripleo-quickstart-extras/blob/master/playbooks/multinode-overcloud-upgrade.yml#L38 | 13:41 |
jistr | jaosorior: yea sure | 13:41 |
jaosorior | jistr: lets do that, I'm sure we can come up with some solution | 13:41 |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo master: Neutron wrappers: lookup for THT parameter https://review.openstack.org/566737 | 13:41 |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo master: Neutron wrappers: lookup for THT parameter https://review.openstack.org/566737 | 13:42 |
alee | mcornea, ah - looking .. | 13:42 |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo master: Neutron wrappers: lookup for THT parameter https://review.openstack.org/566737 | 13:42 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Deploy Docker via Ansible and not Puppet https://review.openstack.org/561377 | 13:42 |
alee | mcornea, ok - yeah - I was looking at tripleo-update as a guide -- looks like there is a separate file for that .. | 13:44 |
alee | playbooks/multinode-overcloud-update.yml | 13:44 |
alee | guess I need a similar one for config_change | 13:45 |
openstackgerrit | Jiri Stransky proposed openstack-infra/tripleo-ci master: Add keystone only upgrade jobs https://review.openstack.org/558425 | 13:45 |
mwhahaha | sshnaidm|rover: yea | 13:46 |
openstackgerrit | Ronelle Landy proposed openstack-infra/tripleo-ci master: Add dry run option to toci_quickstart https://review.openstack.org/567060 | 13:46 |
EmilienM | bnemec: not sure you saw my message from yesterday but now we have designate in the CI job, we need to test it via Tempest, you'll need to patch quickstart to run at least one test | 13:46 |
mcornea | alee: yeah, I think so, I'm not familiar with the upstream ci but from what I can tell if you get the role with the overcloud-config-change in such playbook then it should trigger the tripleo-upgrade(assuming the proper vars in tripleo-upgrade are set) | 13:46 |
*** bkopilov has joined #tripleo | 13:47 | |
alee | mcornea, yeah I see where the playbooks are defined in tocigate_test-oooq.sh | 13:47 |
alee | cool - let me give it a try | 13:48 |
*** jfrancoa has quit IRC | 13:48 | |
openstackgerrit | Merged openstack/tripleo-quickstart-extras master: Updated undercloud tempest skip list for Pike https://review.openstack.org/566686 | 13:49 |
openstackgerrit | Merged openstack/tripleo-upgrade master: Run Ceph upgrade before converge. https://review.openstack.org/566282 | 13:49 |
beagles | mwhahaha: regarding https://review.openstack.org/#/c/567641/ ... | 13:50 |
*** jfrancoa has joined #tripleo | 13:50 | |
*** anilvenkata has quit IRC | 13:51 | |
mwhahaha | beagles: yea i went looking, it doesn't seem that we created a quota puppet provider. That being said that seems to be something we might want to do as different type of task | 13:51 |
mwhahaha | beagles: unfortunately it's an awkward configuration because it spans all the services so it's not something we can just assign with like keystone though i think it belongs there more than in an octavia deployment task | 13:52 |
beagles | mwhahaha: yeah | 13:53 |
openstackgerrit | Yurii Prokulevych proposed openstack/tripleo-upgrade stable/queens: Run Ceph upgrade before converge. https://review.openstack.org/568852 | 13:53 |
openstackgerrit | Dmitry Tantsur proposed openstack/tripleo-heat-templates master: undercloud: set OS_CACERT when TLS is used https://review.openstack.org/568853 | 13:54 |
dtantsur | jaosorior: something like ^^^ ? | 13:54 |
*** udesale_ has joined #tripleo | 13:55 | |
jaosorior | dtantsur: nice! | 13:55 |
mwhahaha | beagles: i assume the quota thing is kinda blocking for octavia. That being said, shouldn't it be in its own tenant outside of the ctrlplane stuff? | 13:55 |
dtantsur | jaosorior: do we run any CI jobs with TLS enabled? | 13:55 |
*** moguimar has joined #tripleo | 13:55 | |
jaosorior | dtantsur: we do | 13:55 |
beagles | mwhahaha: yes and yes | 13:55 |
beagles | mwhahaha: we started adding support for a separate tenant for octavia but the timing/priorities didn't work out | 13:56 |
mwhahaha | :/ | 13:56 |
beagles | too many fires - not enough firepeople | 13:57 |
* beagles thinks about that for a second | 13:57 | |
mwhahaha | https://media.giphy.com/media/yr7n0u3qzO9nG/giphy.gif | 13:57 |
*** udesale has quit IRC | 13:58 | |
beagles | :) | 14:01 |
*** kambiz has quit IRC | 14:04 | |
openstackgerrit | Daniel Alvarez proposed openstack/tripleo-heat-templates master: Fix missing parameters in OVN DVR environment files https://review.openstack.org/568856 | 14:04 |
Tengu | see you tomorrow! | 14:05 |
*** kambiz has joined #tripleo | 14:05 | |
*** moshele has quit IRC | 14:07 | |
*** rajinir has joined #tripleo | 14:07 | |
*** gyankum has quit IRC | 14:09 | |
*** kambiz has quit IRC | 14:09 | |
jaosorior | Tengu: have a good one | 14:09 |
*** ooolpbot has joined #tripleo | 14:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 14:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 14:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771549 | 14:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771551 | 14:10 |
*** ooolpbot has quit IRC | 14:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 14:10 |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 14:10 |
openstack | Launchpad bug 1771551 in tripleo "Containers multinode jobs fails on stable pike because of pacemaker" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 14:10 |
openstackgerrit | Daniel Alvarez proposed openstack/tripleo-heat-templates stable/queens: Fix missing parameters in OVN DVR environment files https://review.openstack.org/568858 | 14:11 |
*** ykarel is now known as ykarel|away | 14:12 | |
*** tiswanso_ has joined #tripleo | 14:13 | |
*** kambiz has joined #tripleo | 14:13 | |
openstackgerrit | Brent Eagles proposed openstack/tripleo-heat-templates master: Add acl to paths that are shared among related neutron processes https://review.openstack.org/567655 | 14:13 |
*** lblanchard1 has joined #tripleo | 14:14 | |
EmilienM | myoung|ruck, weshay , mwhahaha : you probably know but introspection looks broken https://logs.rdoproject.org/20/567320/7/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Zea21e3ed07234ca09366f6b619c041c1/undercloud/home/jenkins/overcloud_prep_images.log.txt.gz#_2018-05-16_13_30_27 | 14:15 |
*** dtrainor has joined #tripleo | 14:15 | |
EmilienM | https://bugs.launchpad.net/tripleo/+bug/1770972 I guess | 14:15 |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 14:15 |
EmilienM | and https://review.openstack.org/#/c/568838/ should help, ok | 14:16 |
derekh | EmilienM: sshnaidm|rover has a potential patch being tested at the moment | 14:16 |
*** ykarel|away has quit IRC | 14:16 | |
derekh | yup | 14:16 |
*** tiswanso has quit IRC | 14:16 | |
*** lblanchard has quit IRC | 14:16 | |
weshay | EmilienM, ya.. derekh and sshnaidm|rover have been on that | 14:16 |
EmilienM | cool | 14:17 |
EmilienM | mwhahaha: the gate is low, I wonder if we can land https://review.openstack.org/#/c/568347/ and https://review.openstack.org/#/c/568680/ *now* so this thing is done | 14:18 |
EmilienM | mwhahaha: FWIW, it passed FS035 | 14:18 |
EmilienM | but I'm ok to wait. I just think it's better to have it asap | 14:18 |
mwhahaha | yea that's fine | 14:18 |
EmilienM | mcornea: will need your review on https://review.openstack.org/#/c/568680/ please | 14:18 |
mcornea | EmilienM: looks good, can I set workflow? I see the other one has it | 14:21 |
mwhahaha | mcornea: sure | 14:21 |
*** tiswanso_ has quit IRC | 14:22 | |
*** tiswanso has joined #tripleo | 14:23 | |
EmilienM | mcornea: thx | 14:25 |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo master: Move unfencing to meta_params https://review.openstack.org/568769 | 14:26 |
openstackgerrit | Ronelle Landy proposed openstack-infra/tripleo-ci master: Add dry run option to toci_quickstart https://review.openstack.org/567060 | 14:26 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: Revert "Add test_autoscaling tests to skip list" https://review.openstack.org/568860 | 14:26 |
*** lvdombrkr89 has joined #tripleo | 14:29 | |
honza | jtomasek: how is the network configuration work going? do you have any wip code we could look at? | 14:30 |
jtomasek | shardy, slagle: let me read back, I'll respond shortly | 14:31 |
jtomasek | honza: I am going to send out patches I currently have, so we align them with your networks listing, then we can identify and split work, sounds ok? | 14:31 |
beagles | mwhahaha: is non-containerized undercloud install supposed to work or is it in "don't" territory now? | 14:31 |
mwhahaha | beagles: still supposed to work | 14:31 |
beagles | mwhahaha: okay | 14:32 |
*** lvdombrkr has quit IRC | 14:32 | |
mwhahaha | beagles: we still test it in ci as well. we haven't completely cut over | 14:32 |
beagles | mwhahaha: gotcha | 14:32 |
honza | jtomasek: sounds good --- i just want to get started on it sooner rather than later because things always take longer than we think :) | 14:32 |
myoung|ruck | EmilienM: aye...will talk about it now in CIX...but introspection is borked atm | 14:33 |
jtomasek | honza: yeah | 14:33 |
shardy | jtomasek: ack thanks, not urgent but I wanted to sync up re the capabilities map for openshift | 14:33 |
jtomasek | shardy: so currently we have openshift group in t-h-t capabilities map. I think that capabilities-map should define capabilities of the deployment plan, so if the plan is designed for openshift deployment, its capabilities-map should include only environments related to openshift deployment | 14:36 |
jtomasek | shardy: problem to solve would be where to draw the line and split t-h-t | 14:36 |
*** dxiri has joined #tripleo | 14:36 | |
shardy | jtomasek: yeah, I wasn't sure if capabilities map is enough, or if we need some additional filtering? | 14:36 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: [WIP] Add CLI argument parser and YAML file parser https://review.openstack.org/567936 | 14:37 |
shardy | jtomasek: For now I'm assuming all the environments will stay in t-h-t, but that we need to ensure only configuration related to openshift is displayed in an openshift plan | 14:38 |
jtomasek | shardy: I think capabilities-map is enough. We should probably consider removing 'Other' group which is added in tripleo-common action and it lists all environments not included in capabilities-map. I think we should make things more strict -> what is not in capabilities-map is not usable (by UI) | 14:38 |
shardy | then when we have that we can add the various configurations folks may want to use | 14:38 |
shardy | jtomasek: Ok, is the "Other" group still useful for openstack deployments? | 14:39 |
shardy | maybe we should make it configurable via a flag somewhere? | 14:39 |
shardy | e.g in the plan_environment perhaps? | 14:39 |
jtomasek | shardy: it is basically a fallback for environments which have not been added to capa-map but should | 14:39 |
jtomasek | shardy: we could add key to plan-environment.yaml...yeah | 14:39 |
jtomasek | to specify capabilities-map file name | 14:39 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: [WIP] Implement --output-file to write the bash script https://review.openstack.org/568285 | 14:40 |
shardy | jtomasek: Ah yeah that would be nice, so we can maintain separate files | 14:40 |
jtomasek | so if we decide to keep everything in single t-h-t repository, we would include multiple capabilities-maps | 14:40 |
shardy | and eventually even separate repos etc | 14:40 |
EmilienM | bogdando: your patch to fix upgrade seems to work for me :) | 14:40 |
shardy | jtomasek: yeah I was thinking keep the single repo but use a file structure so that they can easily be split in future if needed | 14:41 |
jtomasek | shardy: sounds good to me | 14:41 |
openstackgerrit | Ade Lee proposed openstack/tripleo-quickstart-extras master: Add playbook for overcloud-config-change https://review.openstack.org/568865 | 14:42 |
*** ykarel|away has joined #tripleo | 14:43 | |
*** holser__ has quit IRC | 14:46 | |
*** gbarros has joined #tripleo | 14:47 | |
*** holser__ has joined #tripleo | 14:47 | |
bogdando | EmilienM: \o/ | 14:48 |
openstackgerrit | Lukas Bezdicka proposed openstack/tripleo-common stable/queens: Persist package update ansible logs https://review.openstack.org/565853 | 14:48 |
openstackgerrit | Ronelle Landy proposed openstack-infra/tripleo-ci master: Add dry run option to toci_quickstart https://review.openstack.org/567060 | 14:49 |
*** ykarel|away is now known as ykarel | 14:49 | |
jtomasek | slagle: ok, I don't have a strong opinion, that's why I didn't -1, only benefit which comes to my mind is that in case user accidentally removes the deployment_status object, it would get automagically recovered, next time it tries to fetch the status. On the other hand it would make the workflow more complicated and user should not remove that object, as removing it basically equals to something like manually removing something from application | 14:51 |
jtomasek | database:) | 14:51 |
openstackgerrit | Ade Lee proposed openstack-infra/tripleo-ci master: Add password change job https://review.openstack.org/567897 | 14:52 |
*** gvrangan has joined #tripleo | 14:54 | |
openstackgerrit | Bogdan Dobrelya proposed openstack/python-tripleoclient master: Persist generated undercloud parameters t-h-t https://review.openstack.org/565764 | 14:54 |
d0ugal | rbrady,apetrich,thrash,toure,jtomasek: Workflow Squad meeting in 4 mins. https://etherpad.openstack.org/p/tripleo-workflows-squad-status | 14:56 |
openstackgerrit | Alex Schultz proposed openstack/tripleo-common stable/pike: Add yum update to base https://review.openstack.org/568292 | 14:57 |
openstackgerrit | Alex Schultz proposed openstack/tripleo-common stable/pike: Run yum clean to reduce size of docker image layer https://review.openstack.org/568510 | 14:57 |
*** cylopez has quit IRC | 14:58 | |
*** links has quit IRC | 14:58 | |
EmilienM | jaosorior: I'm upgrading undercloud from queens to rocky (containerized) and it fails on starting certmonger | 15:01 |
derekh | sshnaidm|rover: I'm starting to think maybe the ramdisk size has reached some limit, fixing the yum clean might help it but maybe only because it reduces the size of the ramdisk a bit, I'll let you know if I find anything out | 15:02 |
sshnaidm|rover | derekh, ack | 15:02 |
EmilienM | jaosorior: http://ix.io/1axU | 15:05 |
*** jfrancoa has quit IRC | 15:05 | |
EmilienM | jaosorior: wrong link nevermind | 15:05 |
*** tiswanso_ has joined #tripleo | 15:06 | |
EmilienM | jaosorior: http://paste.openstack.org/show/721106/ | 15:07 |
mwhahaha | EmilienM: reboot | 15:07 |
mwhahaha | EmilienM: that's a 7.4 upgraded to 7.5 w/o a reboot | 15:07 |
EmilienM | watttt | 15:07 |
mwhahaha | also, read your email | 15:08 |
EmilienM | pff emails | 15:08 |
* mwhahaha has explained this about 4 times already | 15:08 | |
EmilienM | I needed a 5th :P | 15:08 |
*** jfrancoa has joined #tripleo | 15:08 | |
mwhahaha | EmilienM: Ok. REBOOT | 15:08 |
mwhahaha | done | 15:08 |
openstackgerrit | Ronelle Landy proposed openstack-infra/tripleo-ci master: Add dry run option to toci_quickstart https://review.openstack.org/567060 | 15:08 |
*** tiswanso has quit IRC | 15:09 | |
*** ooolpbot has joined #tripleo | 15:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771549 | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771551 | 15:10 |
*** ooolpbot has quit IRC | 15:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 15:10 |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 15:10 |
openstack | Launchpad bug 1771551 in tripleo "Containers multinode jobs fails on stable pike because of pacemaker" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 15:10 |
sileht | jaosorior, I got it: https://review.openstack.org/#/c/568699/1/network/endpoints/endpoint_map.yaml@a102 | 15:11 |
sileht | jaosorior, the port change is missing | 15:11 |
sileht | it should be 13977 | 15:11 |
mwhahaha | EmilienM: though it does show a deficiency somewhere (probably quickstart?) where we don't yum update & reboot on the undercloud host before installing | 15:12 |
EmilienM | mwhahaha: I didn't use quickstart | 15:12 |
slagle | jtomasek: ok. and as for plan_management.py vs deployment.py, i wasn't sure either, so i just picked plan_management | 15:12 |
mwhahaha | EmilienM: ok well you should yum update & reboot before installing :D | 15:12 |
EmilienM | mwhahaha: no | 15:13 |
mwhahaha | EmilienM: but i think others are hitting this in quickstart | 15:13 |
EmilienM | mwhahaha: yum update is run by a THT service | 15:13 |
mwhahaha | EmilienM: is this in the upgrade? | 15:13 |
EmilienM | after services are properly stopped | 15:13 |
EmilienM | mwhahaha: yes | 15:13 |
mwhahaha | so that's the difference, that's the upgrade | 15:13 |
mwhahaha | so for 7.4 to 7.5 a reboot has to happen | 15:13 |
mwhahaha | thanks rhel | 15:13 |
EmilienM | wouat | 15:14 |
mwhahaha | there is a bz open for this | 15:14 |
mwhahaha | the problem is certmonger and dbus | 15:14 |
*** gvrangan has quit IRC | 15:14 | |
mwhahaha | https://bugzilla.redhat.com/show_bug.cgi?id=1569122 | 15:14 |
openstack | bugzilla.redhat.com bug 1569122 in instack-undercloud "Undercloud installation fails with "Execution of '/bin/getcert list' returned 1: Error org.freedesktop.DBus.Error.TimedOut"" [High,New] - Assigned to jslagle | 15:14 |
slagle | wat | 15:16 |
*** radeks_ has quit IRC | 15:16 | |
jtomasek | slagle: I'll leave that decision on d0ugal or anyone else from workflows squad:) | 15:16 |
EmilienM | so we could exclude dbus upgrade | 15:16 |
EmilienM | because really I don't see how we can reboot in the middle of the containerized undercloud upgrade unless we re-do (again) the whole workflow | 15:17 |
d0ugal | florianf: Hey, can you take a look at these reviews? https://review.openstack.org/#/c/562296/ and https://review.openstack.org/#/c/562358/ (you already got the third one in the series) | 15:17 |
d0ugal | jtomasek: What decision was that? | 15:18 |
florianf | d0ugal: yup | 15:18 |
jtomasek | d0ugal: https://review.openstack.org/#/c/564315/3 | 15:19 |
jtomasek | d0ugal: whether get_plan_deployment_status workflow should live in plan_management.yaml or deployment.yaml workbooks | 15:20 |
EmilienM | mwhahaha: otherwise we should have to change the doc, like: "yum update python-tripleoclient dbus; reboot" and run "openstack undercloud upgrade" when rebooted... | 15:20 |
mwhahaha | EmilienM: it's likely this is only a 7.4 -> 7.5 issue | 15:20 |
mwhahaha | but i'm not sure, it's a rhel upgrade problem | 15:21 |
d0ugal | jtomasek: commenting. | 15:21 |
EmilienM | mwhahaha: so in my case, dbus wasn't upgraded (I already deployed queens on 7.5) | 15:22 |
EmilienM | mwhahaha: and certmonger failed to start | 15:22 |
*** masco has quit IRC | 15:22 | |
EmilienM | I'm trying again now | 15:22 |
*** radeks_ has joined #tripleo | 15:25 | |
*** dparkes has quit IRC | 15:26 | |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Add ability to control Glance's enabled_import_methods https://review.openstack.org/567667 | 15:26 |
openstackgerrit | Merged openstack/tripleo-ui master: Add deployment status tracking infrastructure https://review.openstack.org/559021 | 15:26 |
openstackgerrit | Merged openstack/tripleo-ui master: Enable config-download deployment tracking https://review.openstack.org/559022 | 15:26 |
openstackgerrit | Merged openstack/tripleo-upgrade stable/queens: Run Ceph upgrade before converge. https://review.openstack.org/568852 | 15:26 |
openstackgerrit | Merged openstack/tripleo-upgrade master: Cleanup on oc_roles var https://review.openstack.org/568680 | 15:26 |
*** aufi has quit IRC | 15:27 | |
openstackgerrit | Marios Andreou proposed openstack/python-tripleoclient stable/queens: Add .deployment.v1.deploy_on_servers to ffwd-upgrade prepare https://review.openstack.org/568604 | 15:28 |
EmilienM | mwhahaha and others: https://review.openstack.org/568347 hasn't landed yet (tripleo-common) while https://review.openstack.org/568680 just landed | 15:28 |
*** skramaja has quit IRC | 15:28 | |
EmilienM | it means we'll have some errors in check for patches that are sent now until https://review.openstack.org/568347 lands and is built | 15:28 |
mwhahaha | EmilienM: ... depends-on would have been nice i guess | 15:29 |
EmilienM | mwhahaha: it had depends on | 15:29 |
mwhahaha | EmilienM: also why is tripleo-upgrade not in the tripleo queue | 15:29 |
mwhahaha | oh so it's backwards | 15:29 |
EmilienM | mwhahaha: wes has a patch | 15:29 |
mwhahaha | so we'll just have errors and there's not much we can do about it | 15:29 |
EmilienM | it'll merge today | 15:29 |
EmilienM | right and it's only this time | 15:29 |
EmilienM | tripleo-upgrade is having real jobs and will be in gate with others | 15:29 |
EmilienM | see https://review.openstack.org/#/c/568733/ | 15:30 |
*** cshastri has joined #tripleo | 15:30 | |
mwhahaha | EmilienM: mcornea had a good point on that, can we update it to only run on specific patches? | 15:30 |
EmilienM | weshay: yeah | 15:30 |
EmilienM | ^ can we update it please? i just had the same thought | 15:31 |
EmilienM | also we don't need tripleo-ci-centos-7-undercloud-containers and tripleo-ci-centos-7-containers-multinode | 15:31 |
*** tiswanso_ has quit IRC | 15:32 | |
*** quiquell is now known as quiquell|off | 15:32 | |
weshay | EmilienM, ya.. I agree we may not need multinode-containers kicking from this repo | 15:32 |
*** tiswanso has joined #tripleo | 15:32 | |
EmilienM | weshay: neither tripleo-ci-centos-7-undercloud-containers | 15:32 |
weshay | right right | 15:33 |
EmilienM | save trees! | 15:33 |
weshay | however I also thought we have this min criteria for all projects | 15:33 |
weshay | so I went conservative there | 15:33 |
weshay | but if you guys agree.. I'll update to be just update/upgrade related | 15:33 |
weshay | I'll make the update | 15:33 |
*** masco has joined #tripleo | 15:34 | |
*** olap has quit IRC | 15:34 | |
*** ramishra has quit IRC | 15:34 | |
openstackgerrit | Ronelle Landy proposed openstack-infra/tripleo-ci master: Add releases script pytest tests to tox.ini https://review.openstack.org/567649 | 15:35 |
rasca | mwhahaha, EmilienM, I've opened the original bug because I was hitting it on the ospphase0 OSP11 CI job, but *only* there, what do you mean with all versions? I'm testing the same exact approach on all the OSP releases and getting this just in 11 | 15:36 |
jtomasek | d0ugal: https://bugs.launchpad.net/tripleo/+bug/1771610 | 15:36 |
openstack | Launchpad bug 1771610 in tripleo "deployment_status.yaml swift object does not exist when plan is created" [High,Triaged] | 15:36 |
*** eck` is now known as eck`gone | 15:36 | |
mwhahaha | rasca: it's not specific to osp11 | 15:37 |
mwhahaha | rasca: there's something outside of the OSP bits that's causing it and it only started showing up with 7.5 | 15:37 |
rasca | mwhahaha, ok, but since I'm using the same exact procedure on the same exact env to test OSP10-11-12, why I'm hitting this just and systematically on 11? | 15:38 |
mwhahaha | rasca: that's a question for folks who know about certmonger :D | 15:38 |
rasca | mwhahaha, I'll add these info on the bug | 15:39 |
*** gyankum has joined #tripleo | 15:39 | |
mwhahaha | the underlying certmonger not starting is not an osp specific thing. i don't know how your env deploys 10/11/12, so maybe 10 is getting 7.4 and 11 gets 7.5? | 15:39 |
rasca | mwhahaha, nope, these envs are the *same*, I mean, with no difference | 15:41 |
rasca | mwhahaha, same way of provisioning the undercloud | 15:41 |
openstackgerrit | wes hayutin proposed openstack/tripleo-upgrade master: add container minimal check and gate https://review.openstack.org/568733 | 15:41 |
* mwhahaha shrugs | 15:42 | |
mwhahaha | i'd have to look at it further. there's a lot of things that happen under the covers with various tooling that hide stuff | 15:42 |
mwhahaha | so i tend not to believe anything :D | 15:42 |
openstackgerrit | wes hayutin proposed openstack/tripleo-upgrade master: DNM, test https://review.openstack.org/568732 | 15:43 |
weshay | EmilienM, k.. updated 568732,3 | 15:45 |
weshay | and only the update jobs is kicking.. | 15:45 |
*** agurenko has quit IRC | 15:45 | |
EmilienM | cool | 15:46 |
*** etingof is now known as etingof|brb | 15:46 | |
*** paramite_ has quit IRC | 15:46 | |
trozet | hey guys when this is running: 2018-05-16 10:24:22,500 p=17826 u=mistral | TASK [Start containers for step 4] ********************************************* | 15:49 |
trozet | 2018-05-16 10:25:08,655 p=17826 u=mistral | ok: [overcloud-controller-0] => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false} | 15:49 |
trozet | how do i go see what is actually being done? | 15:49 |
bandini | rasca: the bottom line is "if you update dbus, you need to reboot the box" | 15:50 |
mwhahaha | trozet: pretty sure it still shows up in the journal | 15:50 |
trozet | mwhahaha: what is the no_log true thing, how do i enable that? | 15:50 |
*** leanderthal has quit IRC | 15:51 | |
mwhahaha | trozet: it's on the ansible play itself, and i don't think you want to because it may break other things. (excessive logging breaks mistral) | 15:51 |
trozet | mwhahaha: yeah with the long blob into sql problem :) | 15:51 |
*** yprokule has quit IRC | 15:51 | |
d0ugal | Do we have a Mistral bug for this btw? | 15:51 |
mwhahaha | trozet: slagle was working on capturing the output somewhere | 15:51 |
trozet | mwhahaha: so where do you mean it shows up in journal? im tailing /var/log/messages on the overcloud-controller-0 node | 15:52 |
mwhahaha | d0ugal: i remember rbrady__ talking about it yesterday so maybe? | 15:52 |
rasca | bandini, mwhahaha, so the only conclusion is that just for osp-11 after configuring repo with rhos-release we get a dbus upgrade | 15:52 |
sileht | mwhahaha, about https://bugs.launchpad.net/tripleo/+bug/1771435, the root cause is here: https://review.openstack.org/#/c/568699/1/network/endpoints/endpoint_map.yaml@a0102 in the offending patch the port is 8977 instead of 13977 | 15:52 |
openstack | Launchpad bug 1771435 in tripleo "scenario001/002 failing on autoscaling with urllib3.exceptions.SSLError: [SSL: UNKNOWN_PROTOCOL] unknown protocol (_ssl.c:579)" [Critical,Fix released] - Assigned to Alex Schultz (alex-schultz) | 15:52 |
*** masco has quit IRC | 15:52 | |
mwhahaha | trozet: i thought it was there, but if we're not capturing it yet | 15:53 |
openstackgerrit | Mehdi Abaakouk (sileht) proposed openstack/tripleo-heat-templates master: Revert "Revert "Change default endpoint map entries to use TLS"" https://review.openstack.org/568887 | 15:53 |
trozet | d0ugal: are you referring to the issue I was just talking about with sql/mistral? | 15:53 |
rasca | bandini, mwhahaha, this can explain why I'm hitting this particular behavior, and I can also verify it, let me check | 15:53 |
*** janki has quit IRC | 15:53 | |
mwhahaha | sileht: ok so bad port, well i know jaosorior reverted the revert so you may want to check that patch | 15:53 |
sileht | oh I was about to do the same | 15:53 |
*** bogdando has quit IRC | 15:53 | |
mwhahaha | sileht: https://review.openstack.org/#/c/568736/ | 15:53 |
bandini | rasca: yes (see my comment#3 the update is right there) | 15:54 |
mwhahaha | sileht: doesn't look like he updated it though, feel free to comment/patch it. I'm sure he'll be ok with it | 15:54 |
d0ugal | trozet: yup! | 15:54 |
*** ykarel is now known as ykarel|away | 15:54 | |
trozet | mwhahaha: the only thing i see in journald is May 16 10:51:37 overcloud-controller-0.opnfvlf.org Keepalived_vrrp[18618]: /usr/bin/systemctl status haproxy.service exited with status 1 | 15:54 |
trozet | May 16 10:51:37 overcloud-controller-0.opnfvlf.org dockerd-current[3438]: /usr/bin/systemctl status haproxy.service exited with status 1 | 15:54 |
trozet | d0ugal: well one instance of that was fixed by not storing the ansible output into sql...let me find the patch | 15:54 |
mwhahaha | trozet: so we build the ansible bits in THT, so you could try switching the no_log to false in THT and rerunning the deploy | 15:54 |
openstackgerrit | Mehdi Abaakouk (sileht) proposed openstack/tripleo-heat-templates master: Revert "Revert "Change default endpoint map entries to use TLS"" https://review.openstack.org/568736 | 15:54 |
mwhahaha | it'll be in deploy-steps.j2 or something | 15:55 |
d0ugal | trozet: Right, I think I spotted that. Probably worth fixing (or at least tracking) in Mistral too. | 15:55 |
mwhahaha | just grep for the task name in tht to find it | 15:55 |
trozet | d0ugal: https://review.openstack.org/#/c/565900/ | 15:55 |
*** tcw has quit IRC | 15:55 | |
d0ugal | trozet: Thanks. I'll open a Mistral specific bug. | 15:55 |
sileht | mwhahaha, jaosorior that's done | 15:55 |
shardy | trozet: you can also try running the deploy steps manually via ansible playbook, and if needed change the tasks or add debug etc | 15:56 |
trozet | d0ugal: I also just bumped the blob size for sql, so that it can accept larger pieces of data | 15:56 |
shardy | trozet: https://hardysteven.blogspot.co.uk/2018/02/debugging-tripleo-revisited-heat.html shows how to do that via config download | 15:56 |
*** fragatina has quit IRC | 15:57 | |
trozet | shardy: thanks i havent seen this link before | 15:58 |
trozet | shardy, mwhahaha: my problem is this thing just seems to hang at step 4, it doesnt fail | 15:58 |
trozet | mwhahaha: so i guess i can take your suggestion and try to enable the log to see what is happening | 15:58 |
mwhahaha | trozet: or kill it and run it by hand | 15:58 |
trozet | mwhahaha: but just to be clear..this isnt a good solution for debugging this for a user right | 15:59 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart-extras master: DNM: test ansible 2.5.1 https://review.openstack.org/563471 | 15:59 |
thrash | d0ugal: suggest perhaps trimming the output so that it contains the last X chars | 15:59 |
trozet | mwhahaha: the expectation is this should not hang, and if it fails, produce an obvious error | 15:59 |
mwhahaha | trozet: we're not the regular users, but yea | 15:59 |
openstackgerrit | RedHat RDO CI proposed openstack/tripleo-quickstart-extras master: GATE CHECK for quickstart-extras https://review.openstack.org/560445 | 16:00 |
*** rbowen_ has joined #tripleo | 16:00 | |
*** udesale_ has quit IRC | 16:02 | |
slagle | mwhahaha: trozet : switching no_log:true on that task might not help you fwiw | 16:03 |
slagle | ansible does not stream output back to the console | 16:03 |
slagle | it collects it all, then shows it | 16:03 |
mwhahaha | yea that's what i figured | 16:03 |
mwhahaha | rerunning it by hand will be better for debugging | 16:04 |
trozet | mwhahaha, slagle: yeah so if it hangs on a task i get nothing :/ | 16:04 |
slagle | in the case where the task fails, we do in fact show the error, as there are specific follow up tasks to show the output if they previously failed | 16:04 |
mwhahaha | hanging tasks is not a new thing | 16:04 |
slagle | trozet: that's a function of what the task is doing | 16:04 |
*** marios has quit IRC | 16:04 | |
mwhahaha | anything that can hang needs an external timeout | 16:04 |
mwhahaha | puppet had that, not sure if you have that with ansible bits now | 16:04 |
slagle | eventually it ought to time out, and maybe you'll get some output | 16:05 |
mwhahaha | so if you find where it's hanging, we need to ensure it has a timeout mechanism somewhere | 16:05 |
trozet | well in my output I have | 16:05 |
trozet | 2018-05-16 10:24:22,500 p=17826 u=mistral | TASK [Start containers for step 4] ********************************************* | 16:05 |
trozet | 2018-05-16 10:25:08,655 p=17826 u=mistral | ok: [overcloud-controller-0] => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false} | 16:05 |
*** itlinux has joined #tripleo | 16:05 | |
trozet | so im guessing its hanging on the compute node...ihavent looked at that let me go check | 16:05 |
*** rmascena has joined #tripleo | 16:05 | |
*** rmascena is now known as raildo_ | 16:05 | |
*** masco has joined #tripleo | 16:06 | |
*** saneax is now known as saneax-_-|AFK | 16:06 | |
*** rbowen has quit IRC | 16:06 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: Mount /dev for chrooted environment https://review.openstack.org/568838 | 16:07 |
trozet | mwhahaha, slagle: ah ha! looks like it is ceph-osd-run.sh running over and over | 16:07 |
mwhahaha | short list for hanging process offenders: pacemaker, ceph | 16:07 |
itlinux | hello all. good morning first of all. | 16:07 |
trozet | mwhahaha: https://paste.fedoraproject.org/paste/ey0dtVluXxsHE8Ien47EQA/raw | 16:07 |
itlinux | I have AD enabled and at times I get this error An unexpected error prevented the server from fulfilling your request. (HTTP 500) | 16:08 |
itlinux | this happens only on one domain since I have two and the second is fine | 16:08 |
*** raildo has quit IRC | 16:08 | |
mwhahaha | trozet: yea that's a question for gfidente or fultonj | 16:08 |
* mwhahaha doesn't touch ceph unless he has to | 16:08 | |
trozet | mwhahaha: haha damnt | 16:09 |
Tengu | wise thought :] | 16:09 |
*** ooolpbot has joined #tripleo | 16:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 16:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 16:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771549 | 16:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771551 | 16:10 |
*** ooolpbot has quit IRC | 16:10 | |
trozet | mwhahaha: do you know if tripleo CI covers ceph? | 16:10 |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 16:10 |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 16:10 |
openstack | Launchpad bug 1771551 in tripleo "Containers multinode jobs fails on stable pike because of pacemaker" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 16:10 |
sshnaidm|rover | weshay, I was watching https://review.openstack.org/#/c/568838/ running :( | 16:10 |
mwhahaha | trozet: yes | 16:10 |
mwhahaha | trozet: scenario001/004 | 16:10 |
trozet | mwhahaha: ok maybe I can compare that | 16:10 |
trozet | gfidente: are you around to help me debug this issue? | 16:10 |
*** eck`gone is now known as eck` | 16:10 | |
sshnaidm|rover | weshay, please don't rebase it, let's see if introspection passes | 16:10 |
*** ramishra has joined #tripleo | 16:11 | |
*** fragatina has joined #tripleo | 16:12 | |
weshay | sshnaidm|rover, ah sorry | 16:20 |
weshay | I saw it hit that undercloud issue | 16:20 |
weshay | http://logs.openstack.org/38/568838/3/check/tripleo-ci-centos-7-undercloud-containers/7cd8799/logs/undercloud/home/zuul/undercloud_install.log.txt.gz#_2018-05-16_13_36_24 | 16:20 |
*** lvdombrkr89 has quit IRC | 16:22 | |
EmilienM | beagles: https://review.openstack.org/#/c/561377/ is passing now | 16:22 |
EmilienM | with https://review.openstack.org/#/c/566737/ | 16:22 |
beagles | \o/ | 16:22 |
EmilienM | mwhahaha, beagles : please review these 2 patches ^ thanks | 16:23 |
sshnaidm|rover | weshay, i care only about rdo ci ovb jobs in that patch | 16:23 |
mwhahaha | EmilienM: https://i.imgur.com/DKUR9Tk.png | 16:23 |
*** paramite_ has joined #tripleo | 16:23 | |
weshay | sshnaidm|rover, roger that | 16:24 |
*** dprince has quit IRC | 16:24 | |
beagles | lol | 16:24 |
sshnaidm|rover | mwhahaha, weshay where do we install pacemaker from? for pike for example.. | 16:25 |
sshnaidm|rover | last pike promotion was tonight, but it didn't help | 16:25 |
openstackgerrit | John Trowbridge proposed openstack-infra/tripleo-ci master: Add CLI argument parser and YAML file parser https://review.openstack.org/567936 | 16:26 |
*** psahoo has quit IRC | 16:26 | |
*** jfrancoa has quit IRC | 16:26 | |
*** florianf has quit IRC | 16:26 | |
weshay | sshnaidm|rover, https://buildlogs.centos.org/centos/7/cloud/x86_64/openstack-pike/ | 16:27 |
sshnaidm|rover | weshay, and which version do we need? | 16:28 |
mwhahaha | sshnaidm|rover: it's in the image build | 16:28 |
*** jfrancoa has joined #tripleo | 16:28 | |
weshay | sshnaidm|rover, so iirc the build had 1.1.18 | 16:28 |
weshay | this has 1.1.16 | 16:28 |
sshnaidm|rover | weshay, so buildlogs repo needs to be updated? | 16:29 |
mwhahaha | sshnaidm|rover: we haven't landed the yum update in the base container yet | 16:29 |
mwhahaha | it's still in gate | 16:29 |
sshnaidm|rover | mwhahaha, it doesn't matter for pacemaker afaik | 16:29 |
mwhahaha | yes it should | 16:29 |
mwhahaha | oh wait yea | 16:29 |
mwhahaha | that should be installed in that container | 16:29 |
mwhahaha | so that's the build from that container | 16:29 |
openstackgerrit | John Trowbridge proposed openstack-infra/tripleo-ci master: Add CLI argument parser and YAML file parser https://review.openstack.org/567936 | 16:30 |
*** mdnadeem_ has joined #tripleo | 16:30 | |
mwhahaha | unless we are installing pacemaker in the base container | 16:30 |
*** moshele has joined #tripleo | 16:30 | |
*** mdnadeem has quit IRC | 16:31 | |
*** holser__ has quit IRC | 16:31 | |
weshay | sshnaidm|rover, so in the container build.. if the centos base repo is used.. we should get 1.1.18 | 16:31 |
weshay | so apparently it's not? | 16:32 |
sshnaidm|rover | weshay, yep, it's not.. | 16:32 |
sshnaidm|rover | weshay, but it worked for queens | 16:32 |
weshay | hrm.. can we look at that? | 16:32 |
*** fragatina has quit IRC | 16:32 | |
weshay | https://logs.rdoproject.org/openstack-periodic-24hr/periodic-tripleo-centos-7-pike-containers-build/946e18f | 16:33 |
chandankumar | sshnaidm|rover: is the telemetry issue got fixed/ | 16:34 |
chandankumar | ? | 16:34 |
sshnaidm|rover | chandankumar, I think so, weshay has a reverting patch for tempest | 16:34 |
weshay | well.. mwhahaha does, and it's my understanding we'd rather see it fail in the gate and fix any issue immediately if there is one | 16:36 |
weshay | than to take it offline | 16:36 |
*** kopecmartin has quit IRC | 16:36 | |
*** itlinux has quit IRC | 16:36 | |
mwhahaha | it wasn't a telemetry fix anyway | 16:36 |
mwhahaha | we don't disable that test | 16:36 |
weshay | for future reference sshnaidm|rover chandankumar ^ | 16:37 |
*** ccamacho has quit IRC | 16:37 | |
chandankumar | sshnaidm|rover: did you get a chance to look at tempest container log patch | 16:37 |
mwhahaha | that's like disabling ping test when ping test is the only thing running | 16:37 |
*** panda is now known as panda|off | 16:37 | |
sshnaidm|rover | ok, I will remember never to put this specific test in the ignore list | 16:38 |
chandankumar | what about adding a tag so that people cannot put it in skip list | 16:38 |
chandankumar | ? | 16:38 |
weshay | sshnaidm|rover, chandankumar I'll add some doc text to the skip files.. to indicate what is and what is not appropriate at this moment | 16:38 |
sshnaidm|rover | never | 16:38 |
openstackgerrit | John Trowbridge proposed openstack-infra/tripleo-ci master: Add CLI argument parser and YAML file parser https://review.openstack.org/567936 | 16:39 |
openstackgerrit | Dmitry Tantsur proposed openstack/tripleo-heat-templates master: undercloud: set OS_CACERT when TLS is used https://review.openstack.org/568853 | 16:40 |
* chandankumar headed home | 16:40 | |
weshay | sshnaidm|rover, I wonder if kolla overwrites the avail yum repos | 16:40 |
weshay | 2018-05-16 04:29:05.193 | TASK [rdo-kolla-build : Fetch repo file] *************************************** | 16:40 |
weshay | https://logs.rdoproject.org/openstack-periodic-24hr/periodic-tripleo-centos-7-pike-containers-build/946e18f/console.txt.gz | 16:41 |
*** ffiore has quit IRC | 16:41 | |
*** tcw has joined #tripleo | 16:42 | |
*** gvrangan has joined #tripleo | 16:42 | |
*** dprince has joined #tripleo | 16:45 | |
*** gkadam has quit IRC | 16:46 | |
*** eck` is now known as eck`gone | 16:46 | |
*** paramite_ has quit IRC | 16:46 | |
sshnaidm|rover | weshay, if it does, so only for pike | 16:47 |
weshay | sshnaidm|rover, ya.. so.. fak.. I think I have to run this locally | 16:48 |
*** rbowen_ is now known as rbowen | 16:49 | |
*** rbowen has quit IRC | 16:49 | |
*** rbowen has joined #tripleo | 16:49 | |
*** shardy has quit IRC | 16:50 | |
*** eck`gone is now known as eck` | 16:51 | |
*** rbowen has quit IRC | 16:51 | |
*** rbowen has joined #tripleo | 16:51 | |
sshnaidm|rover | weshay, well, we have 1.1.18 http://logs.openstack.org/85/564285/9/check/tripleo-ci-centos-7-containers-multinode/1bbbd26/logs/subnode-2/var/log/extra/rpm-list.txt.gz | 16:52 |
weshay | sshnaidm|rover, ya.. so I suspect kolla for some reason... I'll setup a reproducer | 16:53 |
*** links has joined #tripleo | 16:53 | |
sshnaidm|rover | weshay, so on image it's 1.1.18, how can I check what was in container? | 16:53 |
weshay | sshnaidm|rover, ya.. /me looks | 16:53 |
weshay | I think we *may* have it.. not 100% sure | 16:54 |
*** ramishra has quit IRC | 16:54 | |
weshay | re: a log of the version | 16:54 |
*** salmankhan has quit IRC | 16:54 | |
*** radeks_ has quit IRC | 16:54 | |
openstackgerrit | James Slagle proposed openstack/tripleo-heat-templates master: Revert "Switch public endpoints to use FQDNs by default" https://review.openstack.org/568899 | 16:54 |
weshay | sshnaidm|rover, this looks right to me https://logs.rdoproject.org/openstack-periodic-24hr/periodic-tripleo-centos-7-pike-containers-build/946e18f/kolla/logs/rabbitmq.log | 16:55 |
weshay | sshnaidm|rover, where did you see 1.1.16? | 16:55 |
sshnaidm|rover | weshay, I didn't say I saw 16 | 16:56 |
weshay | sshnaidm|rover, k k | 16:56 |
weshay | sshnaidm|rover, so what led you down this path? | 16:56 |
weshay | what failure | 16:56 |
sshnaidm|rover | weshay, weshay but in repo link you pasted above it's only 16 | 16:56 |
weshay | right | 16:57 |
weshay | deps should be updated.. or not duplicated | 16:57 |
weshay | wth | 16:57 |
weshay | sshnaidm|rover, is pike now working in upstream? | 16:57 |
* weshay checks | 16:57 | |
sshnaidm|rover | weshay, the failure seems exactly the same we had in queens and master and which was solved by pacemaker | 16:57 |
sshnaidm|rover | weshay, no | 16:57 |
openstackgerrit | James Slagle proposed openstack/tripleo-heat-templates master: Revert "Switch public endpoints to use FQDNs by default" https://review.openstack.org/568899 | 16:57 |
*** fragatina has joined #tripleo | 16:58 | |
weshay | sshnaidm|rover, so you've confirmed after the promotion, an upstream job with latest promoted bits fails the same way | 16:58 |
*** fragatina has quit IRC | 16:58 | |
*** raildo_ is now known as raildo | 16:58 | |
*** dtantsur is now known as dtantsur|afk | 16:58 | |
weshay | sshnaidm|rover, seems that way :) https://review.openstack.org/#/c/564285/ | 16:59 |
sshnaidm|rover | weshay, see https://bugs.launchpad.net/tripleo/+bug/1771551 | 16:59 |
openstack | Launchpad bug 1771551 in tripleo "Containers multinode jobs fails on stable pike because of pacemaker" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 16:59 |
*** mdnadeem_ has quit IRC | 16:59 | |
sshnaidm|rover | weshay, Michele checked in container and it's 16 there | 17:00 |
*** ykarel|away has quit IRC | 17:00 | |
*** fragatina has joined #tripleo | 17:01 | |
*** derekh has quit IRC | 17:01 | |
*** fragatina has quit IRC | 17:02 | |
*** fragatina has joined #tripleo | 17:02 | |
weshay | sshnaidm|rover, afaict.. at least in the latest hubbot job | 17:03 |
weshay | 2018-05-16 09:45:33 | - imagename: docker.io/tripleopike/centos-binary-rabbitmq:d52ad67500aacdb4c2a1321363bfe87de4e6b518_88c9954e | 17:03 |
weshay | which is n-1 on promotions | 17:03 |
weshay | https://hub.docker.com/r/tripleopike/centos-binary-rabbitmq/tags/ | 17:03 |
weshay | should be pulling 1ba7734082acaef6e95d489e4c32cea52aa92c4c_de76e108 | 17:03 |
* weshay looks at your bug | 17:03 | |
*** shreshtha has quit IRC | 17:03 | |
weshay | sshnaidm|rover, your bug is also referencing a job that uses the old n-1 promoted container | 17:04 |
weshay | http://logs.openstack.org/98/564698/2/check/tripleo-ci-centos-7-containers-multinode/22c050e/logs/undercloud/home/zuul/overcloud_prep_containers.log.txt.gz | 17:04 |
weshay | sshnaidm|rover, c what I mean | 17:06 |
weshay | sshnaidm|rover, can you put up a dnm patch on tht pike | 17:06 |
weshay | and see if it persists? | 17:06 |
weshay | or myoung|ruck | 17:06 |
myoung|ruck | sure, just caught up...had made a sammich | 17:07 |
sshnaidm|rover | weshay, hmm.. then how does gate check run on old hash?? | 17:08 |
myoung|ruck | https://buildlogs.centos.org/centos/7/cloud/x86_64/openstack-pike/ has pacemaker 1.1.16 | 17:08 |
weshay | myoung|ruck, yes we know, however centos base has 1.1.18 | 17:09 |
*** gvrangan has quit IRC | 17:09 | |
weshay | sshnaidm|rover, are you sure it kicked after the promotion? | 17:09 |
myoung|ruck | yes...so shouldn't we have the deps repo updated to match upstream? | 17:09 |
myoung|ruck | base | 17:09 |
weshay | imho the duplicate should be removed | 17:09 |
myoung|ruck | kolla is prob finding it from https://trunk.rdoproject.org/centos7-pike/delorean-deps.repo | 17:10 |
sshnaidm|rover | weshay, last run 14:16 today | 17:10 |
*** ooolpbot has joined #tripleo | 17:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771549 | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771551 | 17:10 |
*** ooolpbot has quit IRC | 17:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 17:10 |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 17:10 |
openstack | Launchpad bug 1771551 in tripleo "Containers multinode jobs fails on stable pike because of pacemaker" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 17:10 |
weshay | myoung|ruck, read through the comments again | 17:10 |
weshay | in irc | 17:10 |
*** jcoufal_ has joined #tripleo | 17:10 | |
sshnaidm|rover | weshay, ok, I have pike patches to merge, let's check on them: https://review.openstack.org/#/c/568151/ https://review.openstack.org/#/c/568292/ | 17:11 |
*** gvrangan has joined #tripleo | 17:11 | |
*** jpena is now known as jpena|off | 17:11 | |
*** beagles is now known as beagles|afk | 17:11 | |
EmilienM | sshnaidm|rover, dmsimard|off: what's the progress in ara integration? | 17:11 |
*** trown is now known as trown|lunch | 17:11 | |
EmilienM | I thought it would be easy and we just need to configure a callback plugin | 17:11 |
sshnaidm|rover | EmilienM, sorry, no progress yet :( | 17:12 |
*** itlinux has joined #tripleo | 17:12 | |
sshnaidm|rover | EmilienM, when installing ara seems like it breaks ansible run, no idea why | 17:12 |
weshay | sshnaidm|rover, seems to be working in pike now http://zuul.openstack.org/stream.html?uuid=b4799642a64e49fa9e339d335a9f4f72&logfile=console.log | 17:12 |
openstackgerrit | Alex Schultz proposed openstack/python-tripleoclient master: Make standalone role name configurable https://review.openstack.org/568378 | 17:13 |
sshnaidm|rover | EmilienM, but I'll get back to this asap conditions allow.. | 17:13 |
weshay | sshnaidm|rover, we can probably close the bug | 17:13 |
*** jcoufal has quit IRC | 17:13 | |
sshnaidm|rover | weshay, let's wait for results | 17:13 |
sshnaidm|rover | weshay, but if it's so, 1 problem less *phew | 17:14 |
openstackgerrit | Alex Schultz proposed openstack/python-tripleoclient master: Update HostnameMap generation https://review.openstack.org/567951 | 17:14 |
*** psachin has quit IRC | 17:14 | |
weshay | sshnaidm|rover, :) | 17:14 |
weshay | I got 99 problems, but pike aint one? | 17:14 |
sshnaidm|rover | weshay, so only queens overcloud timeout remains | 17:14 |
EmilienM | sshnaidm|rover: we can probably ask dmsimard|off to give a hand | 17:14 |
weshay | sshnaidm|rover, across the board? | 17:14 |
sshnaidm|rover | EmilienM, I need to improve my debug skills, need to see why ansible fails to start - where can I see that? in mistral logs..? | 17:15 |
trozet | mwhahaha: I figured out the problem: https://github.com/ceph/ceph-ansible/issues/2598 | 17:15 |
sshnaidm|rover | weshay, from urgent ones | 17:15 |
EmilienM | sshnaidm|rover: right, in /var/lib/mistral you should see ansible logs | 17:15 |
sshnaidm|rover | weshay, because it blocks queens | 17:15 |
mwhahaha | trozet: HEH | 17:16 |
sshnaidm|rover | EmilienM, I don't have this folder when it fails | 17:16 |
*** links has quit IRC | 17:16 | |
EmilienM | sshnaidm|rover: on the overcloud | 17:16 |
EmilienM | err | 17:16 |
trozet | mwhahaha: so in Apex we create a loop device, but i was creating an ext4 partition on it...so thats why no osd | 17:16 |
EmilienM | on the undercloud sorry | 17:16 |
EmilienM | sshnaidm|rover: it depends which ansible run you're talking about | 17:16 |
trozet | mwhahaha: so testing it now without creating a partition, and will also try to submit a fix for this in ceph-ansible to have better checking | 17:16 |
mwhahaha | trozet: you and your fake devices | 17:17 |
sshnaidm|rover | EmilienM, well, last time I tried - overcloud failed and this folder didn't exist, so I made a conclusion that ansible just didn't start at all.. | 17:17 |
trozet | mwhahaha: haha i used to just use a directory in the old puppet-ceph, but in ceph-ansible thats not allowed, so now I create the persistent loop device | 17:17 |
*** gbarros has quit IRC | 17:17 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: WIP: use ara for ansible deploy https://review.openstack.org/565079 | 17:19 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-common master: WIP: ara with ansible deploy https://review.openstack.org/565077 | 17:19 |
sshnaidm|rover | EmilienM, I'll check it again.. | 17:19 |
weshay | sshnaidm|rover, just to make sure I understand what you are saying | 17:19 |
weshay | tripleo-ci-centos-7-containers-multinode FAILURE in 2h 58m 24s | 17:19 |
weshay | in https://review.openstack.org/#/c/567224/ | 17:19 |
weshay | that job is the issue? | 17:20 |
weshay | for queens | 17:20 |
sshnaidm|rover | weshay, yes, https://bugs.launchpad.net/tripleo/+bug/1771549 | 17:20 |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 17:20 |
weshay | sshnaidm|rover, ok.. by urgent.. you meant alert | 17:21 |
weshay | k | 17:21 |
weshay | thanks | 17:21 |
*** etingof|brb is now known as etingof | 17:21 | |
weshay | sshnaidm|rover, ah fak | 17:22 |
openstackgerrit | Merged openstack/tripleo-common master: (cleanup) remove usage of role_name https://review.openstack.org/568347 | 17:22 |
weshay | that is timing out since 2018-05-15 02:39 | 17:22 |
weshay | myoung|ruck, ^ | 17:22 |
weshay | maybe before that | 17:23 |
weshay | so queens is blocked | 17:23 |
sshnaidm|rover | myoung|ruck, weshay, btw, master containers build fails | 17:23 |
weshay | fak | 17:23 |
openstackgerrit | Michael Chapman proposed openstack/tripleo-heat-templates master: Add OPNFV scenario environment https://review.openstack.org/486905 | 17:23 |
sshnaidm|rover | from yesterday | 17:23 |
mwhahaha | so i noticed the kolla build job has been failing, have we looked into that yet? | 17:24 |
sshnaidm|rover | myoung|ruck, do we have a bug about master containers build fails? | 17:24 |
alee | mcornea, a little progress it seems -- any idea why this is happening though? http://logs.openstack.org/97/567897/5/experimental/tripleo-ci-centos-7-scenario000-multinode-oooq-container-password-changes/4463bc5/logs/quickstart_install.txt.gz | 17:24 |
michchap | hey guys, I'm trying to add a container job for ODL, but it doesn't seem to run the container image prepare script so the DockerOpendaylightApiImage var never gets set in heat, is there something I'm likely missing? | 17:24 |
alee | mcornea, Fatal: [undercloud]: FAILED! => {"changed": false, "failed": true, "msg": "Source /home/zuul/overcloud_deploy.sh not found"} | 17:24 |
mcornea | alee: yes! something familiar :) give me a sec | 17:25 |
myoung|ruck | will check...and open if not | 17:25 |
*** amoralej is now known as amoralej|off | 17:25 | |
*** tesseract has quit IRC | 17:27 | |
*** beagles|afk is now known as beagesl | 17:27 | |
*** beagesl is now known as beagles | 17:27 | |
mcornea | alee: add this at the top of your play: https://github.com/openstack/tripleo-upgrade/blob/master/tasks/fast-forward-upgrade/create-prepare-scripts.yaml#L1-L4 | 17:27 |
weshay | sshnaidm|rover, hrm... https://review.rdoproject.org/grafana/dashboard/db/tripleo-ci?orgId=1&var-pipeline=All&var-branch=queens&var-cloud=All&var-type=All&var-jobtype=All | 17:27 |
myoung|ruck | weshay, sshnaidm|rover, #908 is running now, checking logs for preview | 17:28 |
alee | mcornea, cool | 17:28 |
myoung|ruck | previous | 17:28 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: docker: cleanup update tasks https://review.openstack.org/568715 | 17:28 |
sshnaidm|rover | weshay, yeah? | 17:29 |
sshnaidm|rover | myoung|ruck, weshay, seems like something with registry, 500, 504 errors | 17:29 |
* myoung|ruck is looking here: https://review.rdoproject.org/jenkins/job/periodic-tripleo-centos-7-master-containers-build/907/consoleFull atm | 17:29 | |
sshnaidm|rover | myoung|ruck, weshay gotta run, I'll look tomorrow.. | 17:29 |
myoung|ruck | ^^ last fail | 17:29 |
*** sshnaidm|rover is now known as sshnaidm|off | 17:29 | |
jaosorior | sileht: hey! thanks for looking into it! | 17:31 |
myoung|ruck | weshay, sshnaidm|off, details will land there --> https://bugs.launchpad.net/tripleo/+bug/1771634 | 17:31 |
openstack | Launchpad bug 1771634 in tripleo "periodic: master container build is failing" [Critical,Triaged] | 17:31 |
*** pkovar has quit IRC | 17:32 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo master: Move unfencing to meta_params https://review.openstack.org/568769 | 17:32 |
openstackgerrit | Matthew Thode proposed openstack/diskimage-builder master: uncap networkx https://review.openstack.org/568910 | 17:33 |
*** gbarros has joined #tripleo | 17:34 | |
*** hjensas has joined #tripleo | 17:34 | |
openstackgerrit | Michael Chapman proposed openstack/tripleo-quickstart master: Updates OpenDaylight feature set 31 https://review.openstack.org/500872 | 17:38 |
*** cshastri has quit IRC | 17:38 | |
*** jfrancoa has quit IRC | 17:41 | |
*** rh-jelabarre has quit IRC | 17:41 | |
*** jfrancoa has joined #tripleo | 17:42 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo master: WIP Use the non-fqdn name when creating stonith levels https://review.openstack.org/568913 | 17:43 |
openstackgerrit | Merged openstack/tripleo-heat-templates master: FFU Set NetworkDeploymentActions CREATE,UPDATE for ffwd-upgrade prepare https://review.openstack.org/567270 | 17:44 |
weshay | what is octavia-housekeeping? | 17:44 |
mwhahaha | weshay: it's an octavia service | 17:45 |
mwhahaha | weshay: why | 17:45 |
* mwhahaha throws things at beagles | 17:45 | |
weshay | it's failing to build | 17:45 |
mwhahaha | link to logs? | 17:45 |
beagles | what? | 17:45 |
weshay | https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-centos-7-master-containers-build/6e68284/console.txt.gz | 17:45 |
weshay | not easy to parse | 17:45 |
mwhahaha | RFE less crappy logging :D | 17:46 |
weshay | ya.. maybe it's more than just that one.. but I was not familiar w/ it | 17:46 |
mwhahaha | usually when it fails to build it's a lack of package | 17:46 |
beagles | wow that is hard to sort out | 17:46 |
openstackgerrit | Ade Lee proposed openstack/tripleo-upgrade master: Add config_change role https://review.openstack.org/567300 | 17:47 |
mwhahaha | weshay: do we not capture the kolla logs separately? those are easier to parse | 17:47 |
weshay | looks like infra https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-centos-7-master-containers-build/6e68284/kolla/logs/octavia-housekeeping.log | 17:47 |
mwhahaha | ERROR:kolla.common.utils:neutron-l3-agent Failed with status: error | 17:47 |
weshay | ERROR:kolla.common.utils.octavia-housekeeping:received unexpected HTTP status: 500 Internal Server Error | 17:47 |
*** gvrangan has quit IRC | 17:47 | |
mwhahaha | INFO:kolla.common.utils.neutron-l3-agent:Trying to push the image | 17:48 |
mwhahaha | ERROR:kolla.common.utils.neutron-l3-agent:received unexpected HTTP status: 504 Gateway Time-out | 17:48 |
weshay | ya | 17:48 |
mwhahaha | looks like issues pushing local containers | 17:48 |
mwhahaha | not code/packaging related | 17:48 |
weshay | ya | 17:48 |
weshay | sorry | 17:48 |
weshay | mwhahaha, you shouldn't throw things at people man | 17:48 |
weshay | you can poke someone's eye out | 17:48 |
* mwhahaha gently tosses things at weshay | 17:49 | |
weshay | that's nice | 17:49 |
* beagles wears protective gear | 17:49 | |
mwhahaha | and by things i mean a cactus | 17:49 |
beagles | lol | 17:49 |
weshay | mwhahaha, you promised | 17:49 |
mwhahaha | lunch next week? | 17:49 |
mwhahaha | i'll bring a cactus | 17:49 |
weshay | while all these suckers are at summit | 17:49 |
mwhahaha | i'm sure the mrs would love a trip to ikea | 17:49 |
weshay | sure | 17:49 |
weshay | hell ya | 17:50 |
weshay | approved.. swedish meatballs and cactus.. hrm | 17:50 |
mwhahaha | :D | 17:50 |
weshay | rasca, ^ | 17:50 |
*** gfidente is now known as gfidente|afk | 17:53 | |
openstackgerrit | Alex Schultz proposed openstack/tripleo-heat-templates master: Add basics for standalone node https://review.openstack.org/566419 | 17:53 |
openstackgerrit | Merged openstack/tripleo-quickstart-extras master: Run containers update only for required packages https://review.openstack.org/567550 | 17:55 |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Revert "temp workaround to bring ci gates back online" https://review.openstack.org/568341 | 17:55 |
sileht | jaosorior, you're welcome | 17:56 |
myoung|ruck | mwhahaha, weshay, https://bugs.launchpad.net/tripleo/+bug/1771634 is up to date with current status, I see you already found the 504/500 so I'll not respam here. #908 is building now...that's the next data point | 17:58 |
openstack | Launchpad bug 1771634 in tripleo "periodic: master container build is failing" [Critical,Triaged] - Assigned to Matt Young (halcyondude) | 17:58 |
weshay | myoung|ruck, it's been running red for how many jobs? | 17:58 |
myoung|ruck | mwhahaha: weshay: indeed the logs are kind of hard to read...I've been pulling them local, swapping \n for newlines. the individual kolla container build logs are easier to digest. links in LP | 17:59 |
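Editor's note: the "swapping \n for newlines" step mentioned above can be sketched in a few lines of Python. This is only an illustration, assuming the downloaded console log contains the two-character sequence backslash + n where real line breaks are wanted; the function name `unescape_newlines` is hypothetical and not part of any TripleO tooling.

```python
def unescape_newlines(text):
    """Turn literal backslash-n (and backslash-r backslash-n) sequences into real newlines.

    Assumption: the log embeds escaped line breaks as plain text, as happens
    when command output is dumped inside a JSON-style result.
    """
    # Handle the Windows-style pair first so it does not leave a stray "\r" behind.
    return text.replace("\\r\\n", "\n").replace("\\n", "\n")


if __name__ == "__main__":
    sample = "TASK [build] ok\\nERROR: push failed\\nretrying"
    print(unescape_newlines(sample))
```

Running it on a saved console log and piping the result through a pager gives one log record per line, which is usually enough to spot the failing container.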
myoung|ruck | weshay: I'm going thru that now, https://bugs.launchpad.net/tripleo/+bug/1771634/comments/1 | 17:59 |
openstack | Launchpad bug 1771634 in tripleo "periodic: master container build is failing" [Critical,Triaged] - Assigned to Matt Young (halcyondude) | 17:59 |
weshay | k | 17:59 |
myoung|ruck | weshay: 901 was last success | 18:00 |
myoung|ruck | yesterday | 18:00 |
slagle | I think we killed CI again | 18:04 |
mwhahaha | noooooo | 18:04 |
slagle | EmilienM: since we merged https://review.openstack.org/#/c/568343/, everything is broken until we get an updated Mistral container containing the new tripleo-common | 18:05 |
slagle | since the inventory script will run from within the mistral container | 18:05 |
mwhahaha | i thought we reverted the update thing | 18:05 |
mwhahaha | so we should be getting updates | 18:05 |
mwhahaha | now | 18:05 |
slagle | how often does that happen? | 18:05 |
mwhahaha | at run time | 18:05 |
mwhahaha | it was off but just landed | 18:05 |
* mwhahaha goes and finds the review | 18:06 | |
mwhahaha | https://review.openstack.org/#/c/568341/ | 18:06 |
slagle | ok | 18:06 |
mwhahaha | that just landed | 18:06 |
mwhahaha | with everything else | 18:06 |
slagle | thanks | 18:06 |
mwhahaha | we knew there would be some out of sync stuff but any jobs going forward as of 11mins ago should be ok | 18:06 |
mwhahaha | :D | 18:06 |
mwhahaha | juggling flaming chainsaws | 18:06 |
slagle | i should have just rechecked instead of trying to investigate :-P | 18:07 |
bandini | lol | 18:07 |
weshay | mwhahaha, that is a great description | 18:07 |
mwhahaha | https://www.youtube.com/watch?v=G8OTcY0iegI | 18:07 |
* weshay hopes for carnage | 18:08 | |
*** moshele has quit IRC | 18:08 | |
mwhahaha | too much talking, not enough juggling | 18:08 |
mwhahaha | 3:30 | 18:08 |
*** olap has joined #tripleo | 18:09 | |
weshay | just one? | 18:09 |
weshay | ah | 18:09 |
weshay | I think EmilienM went to school w/ that guy | 18:09 |
*** ooolpbot has joined #tripleo | 18:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771549 | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771634 | 18:10 |
*** ooolpbot has quit IRC | 18:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 18:10 |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 18:10 |
openstack | Launchpad bug 1771634 in tripleo "periodic: master container build is failing" [Critical,Triaged] - Assigned to Matt Young (halcyondude) | 18:10 |
weshay | this guy has it going on https://www.youtube.com/watch?v=OoJW-_OeFtw | 18:10 |
mwhahaha | tiny chainsaws | 18:10 |
mwhahaha | acceptable use of flask | 18:11 |
weshay | myoung|ruck, running a recreate of queens multinode-containers, will send a tmate when it's ready | 18:11 |
*** jfrancoa has quit IRC | 18:12 | |
*** akrivoka has quit IRC | 18:12 | |
*** rwsu has quit IRC | 18:13 | |
myoung|ruck | weshay: ack cool. just went thru 905, 906 (previous 2 fails) - so far it's random failures, all when attempting to push. 905 failed a new/different way than 906/907. https://bugs.launchpad.net/tripleo/+bug/1771634/comments/3 | 18:14 |
openstack | Launchpad bug 1771634 in tripleo "periodic: master container build is failing" [Critical,Triaged] - Assigned to Matt Young (halcyondude) | 18:14 |
myoung|ruck | weshay, mwhahaha, dmsimard|off: do we know what kind of HW we're running RDO container registry on? | 18:16 |
mwhahaha | no idea | 18:17 |
dmsimard|off | myoung|ruck: it's a virtual machine on RDO Cloud backed by a Ceph volume | 18:17 |
myoung|ruck | this smells like we're swamping the recv'r | 18:17 |
dmsimard|off | myoung|ruck: not sure what you want to know | 18:17 |
mwhahaha | myoung|ruck: is it the rdo container registry or the local docker instance in the build | 18:17 |
dmsimard|off | The ceph storage on RDO cloud is notoriously slow | 18:17 |
myoung|ruck | dmsimard|off: it's looking like we're getting 500/504 and/or timeouts when pushing containers from the promotion jobs | 18:17 |
dmsimard|off | has anything changed recently ? | 18:18 |
myoung|ruck | mwhahaha: still unravelling the kolla build layers...i had assumed we're pushing the new image to rdo registry, now i'm self-doubting. rather than debug in quiet land trying to run fast/transparent ;) | 18:20 |
mwhahaha | myoung|ruck: yea that's fine, i think it gets pushed to the local docker instance first, i would also need to look at that | 18:21 |
* myoung|ruck gets into console.registry and watches 908 | 18:21 | |
dmsimard|off | myoung|ruck: Let me try something to see if it helps. | 18:22 |
myoung|ruck | dmsimard|off: mwhahaha might have a very good point...i was assuming that "INFO:kolla.common.utils.octavia-housekeeping:Trying to push the image" meant --> RDO | 18:23 |
mwhahaha | May 15 23:12:31 upstream-centos-7-rdo-cloud-tripleo-174252 dockerd-current[1584]: time="2018-05-15T23:12:31.837209370Z" level=warning msg="failed to upload schema2 manifest: received unexpected HTTP status: 504 Gateway Time-out - falling back to schema1" | 18:23 |
mwhahaha | whatever that means | 18:23 |
dmsimard|off | do we know if that's on the local or remote registry ? | 18:24 |
dmsimard|off | like is it the container build job ? | 18:24 |
*** gyankum has quit IRC | 18:24 | |
mwhahaha | yea it is the build job but i don't know when containers are pushed | 18:24 |
mwhahaha | like are they all built locally then pushed at the end | 18:24 |
mwhahaha | or are they pushed as they are built | 18:24 |
mwhahaha | https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-centos-7-master-containers-build/551e606/undercloud/var/log/journal.txt.gz#_May_15_23_25_25 | 18:24 |
mwhahaha | though a 504 would be odd to hit locally | 18:25 |
mwhahaha | sounds more like a remote error | 18:25 |
myoung|ruck | it appears that kolla spins up 16 threads (per conf file in our case) | 18:26 |
myoung|ruck | it builds, and pushes as they are built | 18:27 |
myoung|ruck | e.g. https://console.registry.rdoproject.org/registry#/images/tripleomaster/centos-binary-aodh-base:tripleo-ci-testing landed 17 mins ago from currenly running buile 908 | 18:27 |
myoung|ruck | build | 18:27 |
myoung|ruck | ^^ https://review.rdoproject.org/jenkins/job/periodic-tripleo-centos-7-master-containers-build/908/console | 18:28 |
myoung|ruck | have gone thru a few previous failures, so far it's random which container gets whacked | 18:29 |
myoung|ruck | (aside) I am persistently annoyed at how ansible hides output until everything is done. combined with not being on the actual node it makes diagnosis == log diving... 17:42:51 TASK [rdo-kolla-build : Build and push images] --> {spinny-wheel-thingy} :) | 18:30 |
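Editor's note: the "build, then push as each image is built, across 16 threads" behaviour described above can be pictured with a small Python sketch. It is not kolla's implementation; `build_and_push`, `run`, and the string stand-ins for build and push are hypothetical, and only the thread count of 16 comes from the discussion.

```python
from concurrent.futures import ThreadPoolExecutor


def build_and_push(image, pushed):
    """Pretend to build one image and push it straight away."""
    built = f"{image}:built"  # stand-in for a real image build
    pushed.append(built)      # stand-in for a real registry push
    return built


def run(images, threads=16):
    """Build and push every image using a fixed-size worker pool."""
    pushed = []
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Each worker pushes as soon as its own build finishes, so pushes
        # hit the registry while other builds are still in progress.
        results = list(pool.map(lambda name: build_and_push(name, pushed), images))
    return results, pushed


if __name__ == "__main__":
    results, pushed = run(["aodh-base", "rabbitmq", "neutron-l3-agent"])
    print(results)
```

With this shape, up to 16 pushes can reach the registry at once, which would be consistent with a different container failing on each run.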
* myoung|ruck switches the "whine selector" to the "SHADDUP" position | 18:31 | |
myoung|ruck | dmsimard|off: is it possible to get a dump from server side of registry, do we have the server logs anywhere findable/parsable? | 18:32 |
*** rbowen_ has joined #tripleo | 18:33 | |
*** rbowen has quit IRC | 18:33 | |
*** rbowen_ has quit IRC | 18:33 | |
*** rbowen has joined #tripleo | 18:33 | |
*** radeks has joined #tripleo | 18:34 | |
eric-young | I've got a fairly simple patch if someone wants to review it... https://review.openstack.org/#/c/563914/ | 18:35 |
weshay | mwhahaha, EmilienM chem, matbu updated the tripleo-upgrade gate https://review.openstack.org/#/c/568733/ | 18:35 |
openstackgerrit | Ronelle Landy proposed openstack-infra/tripleo-ci master: DNM Testing releases with zuul change https://review.openstack.org/568928 | 18:35 |
*** salmankhan has joined #tripleo | 18:36 | |
openstackgerrit | Ronelle Landy proposed openstack-infra/tripleo-ci master: DNM Testing releases with zuul change https://review.openstack.org/568928 | 18:38 |
*** salmankhan has quit IRC | 18:41 | |
weshay | dmsimard|off, myoung|ruck heh.. of course it works this time | 18:42 |
weshay | :) | 18:42 |
myoung|ruck | mwhahaha, dmsimard|off: so bug has details for all the fails since 901, mixture of 504, 500, and read timeouts | 18:42 |
*** atoth has quit IRC | 18:42 | |
myoung|ruck | weshay: fakme. i mean...that's great! 908 ======== BUILD CONTAINERS IMAGES COMPLETED | 18:43 |
myoung|ruck | dmsimard|off: did you tweak anything? (e.g. [14:22:20] <dmsimard|off> myoung|ruck: Let me try something to see if it helps. ) | 18:43 |
* myoung|ruck attempts to actually eat that (now 2 hour old) sandwich lolz | 18:43 | |
dmsimard|off | myoung|ruck: I have not yet | 18:43 |
*** trown|lunch is now known as trown | 18:44 | |
myoung|ruck | ok so we got lucky then...would expect we'll keep hitting this...just by the #'s | 18:44 |
dmsimard|off | myoung|ruck: what I'm doing right now is properly deleting the old namespaces (like master, because we're using tripleomaster now) | 18:44 |
dmsimard|off | like I was explaining to weshay recently | 18:44 |
dmsimard|off | some of the kolla images have upwards of 30 layers and when the docker client does a push, it needs to query the registry to know if it needs to push each layer or not -- the registry does a lookup in the filesystem and returns a 404 or a 200 depending if the layer is there or not so the client knows whether to push or not | 18:45 |
dmsimard|off | when you're pushing 125+ images with so many layers, it's very expensive on a storage volume that is already slow to begin with | 18:45 |
dmsimard|off | so the general idea is to keep as few images/tags on the registry as we can, hence the regular pruning | 18:46 |
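Editor's note: to put rough numbers on the explanation above, here is a small Python sketch of the per-layer existence check. It models the registry as a plain set of layer digests instead of talking to a real one; the names `plan_push` and `total_lookups` are hypothetical, and the 125 images and 30 layers are the approximate figures quoted in the discussion.

```python
def plan_push(image_layers, registry_blobs):
    """Return (lookups, uploads) for pushing one image.

    registry_blobs is a set of digests already stored on the registry;
    membership stands in for the 200 (present) or 404 (absent) answer.
    """
    lookups = 0
    uploads = []
    for digest in image_layers:
        lookups += 1                      # one existence check per layer
        if digest not in registry_blobs:  # the registry would answer 404
            uploads.append(digest)
            registry_blobs.add(digest)    # the layer is stored after upload
    return lookups, uploads


def total_lookups(image_count, layers_per_image):
    """Worst-case number of existence checks for a full build."""
    return image_count * layers_per_image


if __name__ == "__main__":
    print(plan_push(["a", "b", "c"], {"a"}))
    print(total_lookups(125, 30))
```

Every one of those checks becomes a filesystem lookup on the registry's storage volume, so the cost grows with images times layers even when nothing needs uploading.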
myoung|ruck | dmsimard|off: makes total sense | 18:49 |
myoung|ruck | dmsimard|off: "docker image prune" ? | 18:51 |
*** etingof is now known as etingof|afk | 18:51 | |
openstackgerrit | Ronelle Landy proposed openstack/python-tripleoclient master: DNM Testing releases script with zuul change https://review.openstack.org/568931 | 18:53 |
dmsimard|off | myoung|ruck: no | 18:53 |
openstackgerrit | Michele Baldessari proposed openstack/puppet-pacemaker master: WIP do not create stonith constraint location when 1-node cluster https://review.openstack.org/568932 | 18:54 |
myoung|ruck | mwhahaha, weshay, dmsimard|off confirmed we're seeing this in queens as well: https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-centos-7-queens-containers-build/0b2dc67/kolla/logs/neutron-l3-agent.log | 18:55 |
myoung|ruck | same mo | 18:55 |
weshay | ya.. I suspect it has nothing to do w/ the release | 18:56 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Add missing UndercloudUpgrade to environment https://review.openstack.org/568933 | 18:58 |
myoung|ruck | weshay: aye | 18:59 |
* myoung|ruck will brb | 18:59 | |
*** eck` is now known as eck`gone | 18:59 | |
*** eck`gone is now known as eck` | 19:00 | |
*** cameron_chuang has joined #tripleo | 19:00 | |
trozet | mwhahaha: so the ceph issue i fixed...now still hit the infinite loop in step 4...the nova secret container seems to open a virsh interactive shell, thats why it hangs | 19:00 |
*** moshele has joined #tripleo | 19:00 | |
trozet | mwhahaha, EmilienM: https://paste.fedoraproject.org/paste/cGmL09MYyBA4T0fRDAMhWQ/raw | 19:01 |
*** olap has quit IRC | 19:02 | |
*** fragatin_ has joined #tripleo | 19:05 | |
*** abishop_ has joined #tripleo | 19:05 | |
*** abishop has quit IRC | 19:06 | |
*** fragatin_ has quit IRC | 19:07 | |
*** fragatin_ has joined #tripleo | 19:07 | |
*** fragatina has quit IRC | 19:07 | |
*** fragatin_ has quit IRC | 19:09 | |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Set deployment_status from config_download_deploy https://review.openstack.org/566953 | 19:10 |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Add workflow for plan deployment status https://review.openstack.org/564315 | 19:10 |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Ansible json error callback plugin https://review.openstack.org/566938 | 19:10 |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Workflow and action for deployment failures https://review.openstack.org/567318 | 19:10 |
*** ooolpbot has joined #tripleo | 19:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 19:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 19:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771549 | 19:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771634 | 19:10 |
*** ooolpbot has quit IRC | 19:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 19:10 |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 19:10 |
openstack | Launchpad bug 1771634 in tripleo "periodic: container build jobs are failing when pushing to rdo registry (500, 504, read timeout)" [Critical,Triaged] - Assigned to Matt Young (halcyondude) | 19:10 |
*** fragatina has joined #tripleo | 19:13 | |
*** mcornea has quit IRC | 19:13 | |
*** gbarros has quit IRC | 19:14 | |
*** mcornea has joined #tripleo | 19:14 | |
openstackgerrit | Alex Schultz proposed openstack/tripleo-heat-templates master: Add basics for standalone node https://review.openstack.org/566419 | 19:16 |
*** tosky has quit IRC | 19:20 | |
dmsimard|off | myoung|ruck: the tags for master/queens/pike have been deleted, pruning the images now. | 19:21 |
* myoung|ruck is taking bets on reclaimed space size and peers at mwhahaha and weshay | 19:21 | |
mwhahaha | 5 megs? | 19:22 |
dmsimard|off | myoung|ruck: we rotate about 500GB worth of images every two weeks or so | 19:22 |
mwhahaha | 640k is all we should ever need | 19:22 |
myoung|ruck | 640! my god man. 64k baby. 64k | 19:22 |
cameron_chuang | Hi all, when I deploy pike TripleO it fails with the message "Error: Evaluation Error: Error while evaluating a Function Call, Could not find data item oslo_messaging_rpc_password in any Hiera data file and no default supplied at /etc/puppet/modules/tripleo/manifests/profile/base/ceilometer.pp:78:30 on node overcloud-novacompute-0.localdomain". Where do I need to fix the config? | 19:23 |
myoung|ruck | mwhahaha: https://st.depositphotos.com/1021561/3891/i/950/depositphotos_38919889-stock-photo-three-keys-keyboard-binary-layout.jpg | 19:23 |
dmsimard|off | myoung|ruck: ~4TB of traffic in about a month http://paste.openstack.org/show/721125/ | 19:24 |
*** tosky has joined #tripleo | 19:24 | |
dmsimard|off | myoung|ruck: but really what's hurting us is the slow I/O for which I have no solution | 19:24 |
dmsimard|off | https://review.rdoproject.org/grafana/?orgId=1&var-datasource=default&var-server=registry.rdoproject.org.rdocloud&var-inter=$__auto_interval&from=now%2FM&to=now | 19:24 |
* myoung|ruck puts serious hat on for a sec | 19:24 | |
mwhahaha | where does one acquire a serious hat | 19:24 |
mwhahaha | did serious cat sell it to you | 19:25 |
myoung|ruck | dmsimard|off: are there tweakable params to increase timeouts for lengthy i/o operations on the docker registry side? so even with slow i/o we don't time out and 500/504/faceplant but just take longer to complete the push? | 19:25 |
mwhahaha | i'm assuming the registry software is not that tunable | 19:26 |
dmsimard|off | myoung|ruck: we're talking about timeouts but we're not even sure what's the real issue | 19:26 |
dmsimard|off | myoung|ruck: let's do housekeeping first -- delete the cruft and etc, then see if things improve | 19:26 |
myoung|ruck | this seems to happen (at least in the builds analyzed so far) to 1-2 containers per job. Or I'm curious if it's worse...but since a bunch of container layers were pushed by a previous job (with the same hash/tag being used...as was the case on the string of master jobs) there's a "try try again" going on...where we don't push layers already pushed (previously) | 19:26 |
dmsimard|off | we also need to update openshift at some point | 19:26 |
myoung|ruck | dmsimard|off: ack, I don't have cycles today to dive in, but is the docker registry log anywhere we can get at it? | 19:31 |
weshay | dang | 19:31 |
weshay | DO NOT PISS OFF THE EVILIEN | 19:31 |
dmsimard|off | myoung|ruck: folks from the infrastructure core team can access it on a need basis | 19:32 |
myoung|ruck | ... ok. curious what --max-concurrent-uploads is set to and if it would help | 19:32 |
trozet | myoung|ruck: hey, i think we have another problem with host/container package mismatch | 19:34 |
myoung|ruck | and what's in the registry logs behind the 500's | 19:34 |
trozet | myoung|ruck: i suspect there is also an issue with libvirt packages, for the containerized libvirt | 19:34 |
gouthamr | hi mwhahaha, i reported a bug on tripleo that i plan on fixing, post filing, i see a message saying that i need to be added to the LP group to manipulate the LP fields.. | 19:35 |
mwhahaha | gouthamr: i can also triage it. You should be able to assign it to yourself at least | 19:35 |
myoung|ruck | trozet: could you please update https://bugs.launchpad.net/tripleo/+bug/1771602 with the details of what you're seeing, or if it's already in LP link it to that RFE tracker? | 19:35 |
openstack | Launchpad bug 1771602 in tripleo "RFE: detect and warn when package versions in bare metal vs. container don't match" [Medium,Triaged] | 19:35 |
mwhahaha | gouthamr: if you aren't planning on doing a bunch of tripleo bug work, i don't think it's necessary to get yourself added to LP groups. Also we're working on moving off of LP | 19:36 |
gouthamr | mwhahaha: thanks, here it is: https://bugs.launchpad.net/tripleo/+bug/1771656 added my conclusions on the report and assigned it to myself | 19:36 |
openstack | Launchpad bug 1771656 in tripleo "[manila] Dell/EMC backends require value for share_backend_name " [Undecided,New] - Assigned to Goutham Pacha Ravi (gouthamr) | 19:36 |
trozet | myoung|ruck: yeah. I need to confirm my suspicion first, but will do | 19:36 |
mwhahaha | gouthamr: done | 19:36 |
myoung|ruck | trozet: thanks! | 19:37 |
trozet | myoung|ruck: if i kill the nova libvirt container, start libvirtd on the host, and do the command it's all good | 19:39 |
gouthamr | mwhahaha: nice, thank you.. i am a newbie here, and will be working with the storage projects (mainly manila) with abishop_ | 19:39 |
trozet | myoung|ruck: i think this is a blocker...i'm not sure how easy it is going to be to get these versions to match.. | 19:40 |
trozet | myoung|ruck: pacemaker you can workaround..but libvirt is tied to qemu/kernel | 19:40 |
openstackgerrit | Goutham Pacha Ravi proposed openstack/puppet-tripleo master: Remove share_backend_name from Dell-EMC manila backends https://review.openstack.org/568945 | 19:44 |
myoung|ruck | trozet: is it an actual inside/outside container libvirt package version mismatch? | 19:46 |
trozet | myoung|ruck: yeah | 19:47 |
* myoung|ruck nods | 19:47 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: execute the build / install of zuul changes in undercloud-upgrade https://review.openstack.org/568946 | 19:47 |
trozet | myoung|ruck: well the weird part is in the container if i do rpm -qa | grep libvirt there's nothing listed.. | 19:47 |
trozet | myoung|ruck: but if i do virsh --version it shows me the version is 3.2 | 19:47 |
trozet | myoung|ruck: and version on host is 3.9 | 19:47 |
*** wolverineav has quit IRC | 19:52 | |
*** slaweq has quit IRC | 19:52 | |
openstackgerrit | Ronelle Landy proposed openstack/python-tripleoclient master: DNM Testing releases script with zuul change https://review.openstack.org/568931 | 19:52 |
*** slaweq has joined #tripleo | 19:52 | |
*** holser__ has joined #tripleo | 19:53 | |
*** dparkes has joined #tripleo | 19:54 | |
*** jcoufal has joined #tripleo | 19:56 | |
*** slaweq_ has joined #tripleo | 19:57 | |
*** slaweq has quit IRC | 19:57 | |
openstackgerrit | James Slagle proposed openstack/python-tripleoclient master: Actually print the error during deployment fail https://review.openstack.org/568707 | 19:59 |
openstackgerrit | James Slagle proposed openstack/python-tripleoclient master: overcloud plan deployment status https://review.openstack.org/564341 | 19:59 |
openstackgerrit | James Slagle proposed openstack/python-tripleoclient master: overcloud plan deployment failures https://review.openstack.org/568673 | 19:59 |
*** jcoufal_ has quit IRC | 19:59 | |
openstackgerrit | Dan Sneddon proposed openstack/tripleo-heat-templates master: Add ability to pre-assign IPs by role on ctlplane https://review.openstack.org/568505 | 20:00 |
*** zshi has quit IRC | 20:09 | |
*** ooolpbot has joined #tripleo | 20:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 20:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 20:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771549 | 20:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771634 | 20:10 |
*** ooolpbot has quit IRC | 20:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 20:10 |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 20:10 |
openstack | Launchpad bug 1771634 in tripleo "periodic: container build jobs are failing when pushing to rdo registry (500, 504, read timeout)" [Critical,Triaged] - Assigned to Matt Young (halcyondude) | 20:10 |
*** pchavva has quit IRC | 20:10 | |
myoung|ruck | trozet: thanks for the update/details of libvirt pain in https://bugs.launchpad.net/tripleo/+bug/1771602 | 20:11 |
openstack | Launchpad bug 1771602 in tripleo "RFE: detect and warn when package versions in bare metal vs. container don't match" [Medium,Triaged] | 20:11 |
*** liverpooler has quit IRC | 20:11 | |
*** dougbtv_ has joined #tripleo | 20:11 | |
openstackgerrit | John Trowbridge proposed openstack-infra/tripleo-ci master: Add CLI argument parser and YAML file parser https://review.openstack.org/567936 | 20:15 |
*** moshele has quit IRC | 20:17 | |
*** dougbtv_ has quit IRC | 20:17 | |
openstackgerrit | Nir Magnezi proposed openstack/tripleo-common master: Make lb-mgmt-subnet a class B subnet https://review.openstack.org/568089 | 20:19 |
openstackgerrit | Alex Schultz proposed openstack/tripleo-heat-templates master: Add basics for standalone node https://review.openstack.org/566419 | 20:19 |
openstackgerrit | Nir Magnezi proposed openstack/tripleo-heat-templates master: Make lb-mgmt-subnet a class B subnet https://review.openstack.org/568138 | 20:21 |
*** asbishop has joined #tripleo | 20:24 | |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Ignore errors when checking result of previous deployments https://review.openstack.org/568955 | 20:24 |
mwhahaha | weshay, myoung|ruck: is scenario000 broken | 20:25 |
*** gbarros has joined #tripleo | 20:26 | |
* myoung|ruck looks with his third eye...and the other 2 as well | 20:26 | |
*** abishop_ has quit IRC | 20:27 | |
mwhahaha | EmilienM: http://logs.openstack.org/51/567951/3/check/tripleo-ci-centos-7-undercloud-containers/c88f50b/logs/undercloud/home/zuul/undercloud_install.log.txt.gz#_2018-05-16_18_00_35 that's a special error | 20:27 |
EmilienM | looking | 20:28 |
*** dprince has quit IRC | 20:28 | |
mwhahaha | I assume it's https://review.openstack.org/#/c/567951/3/tripleoclient/v1/tripleo_deploy.py@264 | 20:30 |
*** tiswanso has quit IRC | 20:31 | |
*** raildo has quit IRC | 20:31 | |
EmilienM | mwhahaha: something with _set_roles_file, one sec | 20:32 |
mwhahaha | maybe a bad rebase from the tht_render thing | 20:33 |
weshay | mwhahaha, last two failed, however today it's at 95% | 20:33 |
myoung|ruck | mwhahaha: time flew, i have a hard stop -2 min ago, but back in 80 min. a quick look at http://cistatus.tripleo.org/#tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates shows a few check jobs have hit an issue...before i dash (in -3min now) are you looking at a specific patch? | 20:33 |
mwhahaha | myoung|ruck: just noticed two check failures on patches that didn't touch overcloud | 20:33 |
mwhahaha | it can wait until later | 20:33 |
weshay | "msg": "Overcloud minor update execution step failed..."} | 20:33 |
myoung|ruck | mwhahaha: k will look when back | 20:34 |
mwhahaha | yea when i poked at it the ansible said no hosts | 20:34 |
weshay | I swear I'll never have an update/upgrade job voting | 20:34 |
EmilienM | mwhahaha: I wonder if we need to return roles_data = None if there is no roles_file | 20:35 |
EmilienM | well you return "return self.roles_data" anyway so it's not it | 20:35 |
weshay | mwhahaha, it properly fails on 568736,2 | 20:36 |
weshay | the other failure is on rlandy's patch but it shouldn't fail there | 20:37 |
weshay | you are looking at https://review.openstack.org/568946 | 20:37 |
mwhahaha | https://review.openstack.org/#/c/568378/ | 20:37 |
mwhahaha | https://review.openstack.org/#/c/567951/ | 20:38 |
mwhahaha | those two | 20:38 |
mwhahaha | neither touches overcloud bits | 20:38 |
weshay | ya.. I suspect https://review.openstack.org/#/c/568680/ | 20:40 |
* weshay runs | 20:40 | |
EmilienM | so | 20:41 |
*** radeks has quit IRC | 20:41 | |
EmilienM | https://review.openstack.org/#/c/568347/ needs to be in the mistral container | 20:41 |
EmilienM | which will happen when we have a promotion | 20:41 |
mwhahaha | no should be fine now because we're updating containers | 20:42 |
mwhahaha | we reverted those bits | 20:42 |
mwhahaha | or are we not checking that in the 000 job | 20:42 |
EmilienM | we don't update undercloud containers when undercloud is containerized | 20:43 |
EmilienM | which is the case of fs001 | 20:43 |
mwhahaha | is fs001 used by scenario000? | 20:43 |
EmilienM | no no | 20:43 |
* mwhahaha left his magic decoder ring at home | 20:43 | |
EmilienM | :D | 20:43 |
*** moshele has joined #tripleo | 20:44 | |
*** ansmith has quit IRC | 20:44 | |
*** gbarros has quit IRC | 20:45 | |
openstackgerrit | Marius Cornea proposed openstack/tripleo-upgrade master: DNM: Stop openstack services before undercloud upgrade https://review.openstack.org/568667 | 20:45 |
*** moshele has quit IRC | 20:46 | |
*** rbowen has quit IRC | 20:47 | |
stevebaker | morning | 20:48 |
EmilienM | yo | 20:48 |
*** holser__ has quit IRC | 20:51 | |
*** wolverineav has joined #tripleo | 20:54 | |
*** salmankhan has joined #tripleo | 20:56 | |
*** gfidente|afk has quit IRC | 20:57 | |
*** slaweq_ has quit IRC | 20:59 | |
*** slaweq has joined #tripleo | 20:59 | |
*** lblanchard1 has quit IRC | 21:00 | |
*** salmankhan has quit IRC | 21:01 | |
openstackgerrit | Alex Schultz proposed openstack/python-tripleoclient master: Update HostnameMap generation https://review.openstack.org/567951 | 21:01 |
weshay | mwhahaha, so.. what to do w/ scen000 | 21:02 |
mwhahaha | figure out how it broke | 21:02 |
weshay | k | 21:02 |
weshay | the only repo that was not gated was tripleo-upgrade | 21:02 |
weshay | but obviously we still could have missed something | 21:02 |
mwhahaha | weshay: oh and switch it to non-voting | 21:04 |
*** trown is now known as trown|outtypewww | 21:05 | |
*** mcornea has quit IRC | 21:05 | |
slagle | hmm, guess my revert https://review.openstack.org/#/c/559926 won't pass without some other change also reverted | 21:05 |
* mwhahaha digs up the non-voting patch for scenario000 | 21:06 | |
*** jcoufal_ has joined #tripleo | 21:07 | |
*** dxiri_ has joined #tripleo | 21:07 | |
openstackgerrit | Alex Schultz proposed openstack-infra/tripleo-ci master: Revert "Revert "Disable voting on tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates"" https://review.openstack.org/568962 | 21:08 |
*** dxiri has quit IRC | 21:09 | |
*** ooolpbot has joined #tripleo | 21:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 21:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 21:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771549 | 21:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771634 | 21:10 |
*** ooolpbot has quit IRC | 21:10 | |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 21:10 |
*** jcoufal has quit IRC | 21:10 | |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 21:10 |
openstack | Launchpad bug 1771634 in tripleo "periodic: container build jobs are failing when pushing to rdo registry (500, 504, read timeout)" [Critical,Triaged] - Assigned to Matt Young (halcyondude) | 21:10 |
weshay | mwhahaha, lovely | 21:12 |
*** tiswanso has joined #tripleo | 21:12 | |
weshay | mwhahaha, while ur in the revert kinda mood | 21:13 |
weshay | anything come to mind re: multinode-containers for queens.. timing out... looks like the deployment never starts | 21:13 |
*** itlinux has quit IRC | 21:13 | |
weshay | I have a local recreate.. networking looks ok | 21:13 |
*** Goneri has quit IRC | 21:14 | |
*** olap has joined #tripleo | 21:16 | |
*** tiswanso has quit IRC | 21:16 | |
*** olap has quit IRC | 21:20 | |
*** asbishop has quit IRC | 21:21 | |
mwhahaha | meh we moved it | 21:28 |
* mwhahaha goes and digs it back up | 21:28 | |
openstackgerrit | Alex Schultz proposed openstack-infra/tripleo-ci master: Revert "Revert "Disable voting on tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates"" https://review.openstack.org/568962 | 21:29 |
mwhahaha | there we go | 21:30 |
* mwhahaha wanders off for a bit | 21:30 | |
*** slaweq has quit IRC | 21:33 | |
*** slaweq has joined #tripleo | 21:34 | |
openstackgerrit | Alex Schultz proposed openstack-infra/tripleo-ci master: Revert "Revert "Disable voting on tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates"" https://review.openstack.org/568962 | 21:34 |
mwhahaha | I created a bug https://bugs.launchpad.net/tripleo/+bug/1771686 | 21:34 |
openstack | Launchpad bug 1771686 in tripleo "tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates failing on update because of no hosts" [Critical,Triaged] | 21:34 |
openstackgerrit | Ade Lee proposed openstack/tripleo-upgrade master: Add config_change role https://review.openstack.org/567300 | 21:36 |
*** agopi has joined #tripleo | 21:38 | |
*** ansmith has joined #tripleo | 21:38 | |
*** d0ugal_ has joined #tripleo | 21:40 | |
*** d0ugal has quit IRC | 21:41 | |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Revert "TLS by default for the overcloud" https://review.openstack.org/568964 | 21:45 |
openstackgerrit | James Slagle proposed openstack/tripleo-heat-templates master: Revert "Switch public endpoints to use FQDNs by default" https://review.openstack.org/568899 | 21:45 |
*** itlinux has joined #tripleo | 21:51 | |
*** leitan has joined #tripleo | 21:55 | |
*** leitan has quit IRC | 22:01 | |
*** leitan has joined #tripleo | 22:01 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Support networkx 2.0 https://review.openstack.org/506524 | 22:03 |
*** wolverineav has quit IRC | 22:05 | |
*** ooolpbot has joined #tripleo | 22:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 22:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 22:10 |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 22:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771549 | 22:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771634 | 22:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771686 | 22:10 |
*** ooolpbot has quit IRC | 22:10 | |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 22:10 |
openstack | Launchpad bug 1771634 in tripleo "periodic: container build jobs are failing when pushing to rdo registry (500, 504, read timeout)" [Critical,Triaged] - Assigned to Matt Young (halcyondude) | 22:10 |
openstack | Launchpad bug 1771686 in tripleo "tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates failing on update because of no hosts" [Critical,Triaged] | 22:10 |
*** ssbarnea_ has quit IRC | 22:14 | |
*** pliu has quit IRC | 22:14 | |
*** pabelanger has quit IRC | 22:15 | |
*** aputtur has quit IRC | 22:15 | |
*** haleyb has quit IRC | 22:15 | |
*** hewbrocca_afk has quit IRC | 22:15 | |
*** mburned has quit IRC | 22:16 | |
*** rasca has quit IRC | 22:16 | |
*** rnoriega has quit IRC | 22:16 | |
*** markmc has quit IRC | 22:16 | |
*** jjoyce has quit IRC | 22:16 | |
*** akrzos has quit IRC | 22:16 | |
*** lhinds has quit IRC | 22:16 | |
*** faceman has quit IRC | 22:16 | |
*** jschlueter has quit IRC | 22:17 | |
*** myoung|ruck has quit IRC | 22:17 | |
*** weshay has quit IRC | 22:17 | |
*** slaweq has quit IRC | 22:21 | |
*** aputtur has joined #tripleo | 22:22 | |
*** hewbrocca_afk has joined #tripleo | 22:22 | |
*** mburned has joined #tripleo | 22:22 | |
*** weshay has joined #tripleo | 22:22 | |
*** dxiri_ has quit IRC | 22:25 | |
*** rcernin has joined #tripleo | 22:25 | |
*** dxiri has joined #tripleo | 22:25 | |
*** weshay has quit IRC | 22:26 | |
*** hewbrocca_afk has quit IRC | 22:26 | |
*** mburned has quit IRC | 22:27 | |
*** aputtur has quit IRC | 22:27 | |
*** rlandy is now known as rlandy|bbl | 22:28 | |
*** rajinir has quit IRC | 22:28 | |
*** andreaf has quit IRC | 22:29 | |
*** andreaf has joined #tripleo | 22:29 | |
*** itlinux has quit IRC | 22:33 | |
*** jschlueter has joined #tripleo | 22:34 | |
*** weshay has joined #tripleo | 22:34 | |
*** pabelanger has joined #tripleo | 22:34 | |
*** mburned has joined #tripleo | 22:35 | |
*** faceman has joined #tripleo | 22:35 | |
*** myoung has joined #tripleo | 22:35 | |
*** akrzos has joined #tripleo | 22:35 | |
*** pliu has joined #tripleo | 22:35 | |
*** lhinds has joined #tripleo | 22:35 | |
*** rasca has joined #tripleo | 22:35 | |
*** aputtur has joined #tripleo | 22:36 | |
*** jjoyce has joined #tripleo | 22:36 | |
*** agopi has quit IRC | 22:37 | |
*** rnoriega has joined #tripleo | 22:37 | |
*** haleyb has joined #tripleo | 22:37 | |
*** agopi has joined #tripleo | 22:38 | |
*** hewbrocca_afk has joined #tripleo | 22:39 | |
*** d0ugal__ has joined #tripleo | 22:40 | |
*** markmc has joined #tripleo | 22:41 | |
*** d0ugal_ has quit IRC | 22:41 | |
*** dougbtv_ has joined #tripleo | 22:47 | |
openstackgerrit | Merged openstack/tripleo-validations master: Add validation for checking roles against flavors https://review.openstack.org/562296 | 22:52 |
openstackgerrit | Merged openstack/python-tripleoclient master: Fix hiera data override file writing https://review.openstack.org/568818 | 22:59 |
openstackgerrit | Merged openstack/tripleo-validations stable/pike: Remove unused tox_install.sh https://review.openstack.org/567272 | 22:59 |
openstackgerrit | Merged openstack/tripleo-validations stable/pike: Validate that there should not be XFS volumes with ftype=0 https://review.openstack.org/564698 | 22:59 |
openstackgerrit | Merged openstack/tripleo-quickstart-extras master: Populate /etc/yum/vars/contentdir https://review.openstack.org/568701 | 22:59 |
*** dougbtv_ has quit IRC | 23:00 | |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Add python script to dynamically compose releases https://review.openstack.org/567521 | 23:05 |
*** ooolpbot has joined #tripleo | 23:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 23:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1770972 | 23:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771549 | 23:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771634 | 23:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771686 | 23:10 |
openstack | Launchpad bug 1770972 in tripleo "CI: Images introspection fails in OVB jobs" [Critical,Triaged] - Assigned to Derek Higgins (derekh) | 23:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1771692 | 23:10 |
*** ooolpbot has quit IRC | 23:10 | |
openstack | Launchpad bug 1771549 in tripleo "Containers multinode jobs fails on stable queens with overcloud deploy timeout" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 23:10 |
openstack | Launchpad bug 1771634 in tripleo "periodic: container build jobs are failing when pushing to rdo registry (500, 504, read timeout)" [Critical,Triaged] - Assigned to Matt Young (halcyondude) | 23:10 |
openstack | Launchpad bug 1771686 in tripleo "tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates failing on update because of no hosts" [Critical,Triaged] | 23:10 |
openstack | Launchpad bug 1771692 in tripleo "hubbot check jobs are timing out on OC deploy" [Critical,Triaged] | 23:10 |
*** slaweq has joined #tripleo | 23:10 | |
*** pmannidi has joined #tripleo | 23:14 | |
*** slaweq has quit IRC | 23:15 | |
*** lhinds has quit IRC | 23:16 | |
*** pliu has quit IRC | 23:16 | |
*** rnoriega has quit IRC | 23:16 | |
*** mburned has quit IRC | 23:16 | |
*** pabelanger has quit IRC | 23:17 | |
*** markmc has quit IRC | 23:17 | |
*** weshay has quit IRC | 23:17 | |
*** hewbrocca_afk has quit IRC | 23:17 | |
*** aputtur has quit IRC | 23:17 | |
*** rasca has quit IRC | 23:17 | |
*** akrzos has quit IRC | 23:17 | |
*** myoung has quit IRC | 23:17 | |
*** jschlueter has quit IRC | 23:18 | |
*** haleyb has quit IRC | 23:18 | |
*** faceman has quit IRC | 23:18 | |
*** jjoyce has quit IRC | 23:18 | |
*** leitan has quit IRC | 23:19 | |
*** nyechiel_ has quit IRC | 23:20 | |
*** aputtur has joined #tripleo | 23:20 | |
*** pabelanger has joined #tripleo | 23:20 | |
*** rnoriega has joined #tripleo | 23:21 | |
*** weshay has joined #tripleo | 23:21 | |
*** rasca has joined #tripleo | 23:21 | |
*** haleyb has joined #tripleo | 23:21 | |
*** akrzos has joined #tripleo | 23:21 | |
*** pliu has joined #tripleo | 23:21 | |
*** myoung has joined #tripleo | 23:23 | |
*** mburned has joined #tripleo | 23:23 | |
*** faceman has joined #tripleo | 23:23 | |
*** jschlueter has joined #tripleo | 23:23 | |
*** lhinds has joined #tripleo | 23:23 | |
*** markmc has joined #tripleo | 23:24 | |
*** jjoyce has joined #tripleo | 23:24 | |
*** hewbrocca_afk has joined #tripleo | 23:24 | |
*** gvrangan has joined #tripleo | 23:34 | |
*** tosky has quit IRC | 23:34 | |
*** jcoufal has joined #tripleo | 23:42 | |
*** moshele has joined #tripleo | 23:44 | |
*** jcoufal_ has quit IRC | 23:46 | |
*** jcoufal has quit IRC | 23:47 | |
mwhahaha | weshay: looks like tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates is ok, there was just a gap of broken stuff | 23:48 |
* mwhahaha abandons the revert | 23:48 | |
mwhahaha | from ~16:02 to 19:30 it was failing | 23:48 |
mwhahaha | it's been green since about 19:16 or so | 23:48 |
*** olap has joined #tripleo | 23:51 | |
*** dxiri has quit IRC | 23:52 | |
*** dxiri has joined #tripleo | 23:53 | |
*** olap has quit IRC | 23:56 | |
*** myoung is now known as myoung|ruck | 23:56 | |
*** dxiri has quit IRC | 23:57 |
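
Editor's sketch, not part of the channel log: the 19:35 to 19:47 exchange asks for a way to detect and warn when package versions on the bare-metal host and inside a container differ (bug 1771602). The Python below is one minimal way such a comparison could look. The function names `parse_version` and `find_mismatches` and the dictionary inputs are hypothetical, and the sketch assumes the version strings have already been collected by some other means; only the libvirt versions 3.9 (host) and 3.2 (container) come from the log.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '3.9' into a tuple of ints: (3, 9)."""
    return tuple(int(part) for part in text.strip().split("."))


def find_mismatches(host_versions, container_versions):
    """Return (name, host_version, container_version) for every package that is
    present on both sides but reports a different version.

    Both arguments are hypothetical dicts mapping package name -> version string.
    """
    mismatches = []
    for name in sorted(set(host_versions) & set(container_versions)):
        host_v = host_versions[name]
        container_v = container_versions[name]
        if parse_version(host_v) != parse_version(container_v):
            mismatches.append((name, host_v, container_v))
    return mismatches


if __name__ == "__main__":
    # Versions reported in the log: 3.9 on the host, 3.2 in the container.
    host = {"libvirt": "3.9"}
    container = {"libvirt": "3.2"}
    for name, host_v, container_v in find_mismatches(host, container):
        print(f"WARNING: {name} is {host_v} on the host but {container_v} in the container")
```

The sketch only handles purely numeric dotted versions and treats `3.9` and `3.9.0` as different; a real check would need to handle RPM epoch and release fields as well.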