*** agopi has joined #oooq | 00:06 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 00:19 |
---|---|---|
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 02:19 |
*** ykarel_ has joined #oooq | 03:24 | |
ykarel_ | quiquell|off, panda: when you are back, please check https://review.rdoproject.org/r/#/c/13429/; the promoter has not been promoting queens for a long time | 03:55 |
ykarel_ | it promoted only once but queens passed multiple runs | 03:55 |
*** ykarel_ is now known as ykarel | 03:55 | |
*** skramaja has joined #oooq | 04:16 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 04:19 |
*** pgadiya has joined #oooq | 04:23 | |
*** pgadiya has quit IRC | 04:23 | |
*** pgadiya has joined #oooq | 04:26 | |
*** pgadiya has quit IRC | 04:26 | |
*** jtomasek has joined #oooq | 05:12 | |
*** ratailor has joined #oooq | 05:35 | |
*** jfrancoa has joined #oooq | 05:46 | |
*** marios has joined #oooq | 05:52 | |
*** quiquell|off is now known as quiquell|ruck | 06:08 | |
quiquell|ruck | Good morning ykarel | 06:08 |
ykarel | quiquell|ruck, Good Morning | 06:08 |
ykarel | quiquell|ruck, please check the promotion issue i mentioned above | 06:10 |
*** kopecmartin has joined #oooq | 06:10 | |
quiquell|ruck | ykarel: Checking | 06:10 |
ykarel | Ok | 06:11 |
ykarel | currently queens is conflicting with ocata so promotion for queens is not starting | 06:11 |
quiquell|ruck | Promoter global lock ? | 06:12 |
ykarel | yes | 06:12 |
ykarel | if it's possible to stop ocata promotion somehow, it would be good to stop it temporarily | 06:12 |
quiquell|ruck | Let me do some forensics first , to check why ocata is not finishing | 06:13 |
*** links has joined #oooq | 06:13 | |
ykarel | quiquell|ruck, it's finishing, but each ocata run starts before and stops after the start of queens | 06:13 |
ykarel | http://38.145.34.55/queens.log and http://38.145.34.55/ocata.log | 06:14 |
quiquell|ruck | And queens is not regaining the lock afterwards? | 06:14 |
ykarel | no, the same situation keeps coming up again and again | 06:14 |
quiquell|ruck | That's weird | 06:14 |
quiquell|ruck | Feels like it's not a semaphore | 06:14 |
ykarel | Last 5 queens attempt:- 05:24:01, 05:34:01, 05:44:01, 05:54:01, 06:04:01 | 06:16 |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 06:19 |
ykarel | Last 5 queens attempt:- 05:24:01, 05:34:01, 05:44:01, 05:54:01, 06:04:01 | 06:20 |
ykarel | Last 5 ocata start, stop:- 05:24:01 - 05:24:14, 05:33:01 - 05:34:13, 05:43:01 - 05:44:13, 05:53:01 - 05:54:14, 06:03:01 - 06:04:13, 06:13:02 - 06:14:14 | 06:20 |
ykarel | quiquell|ruck, ^^ | 06:20 |
ykarel | so ocata is winning every time, | 06:21 |
ykarel | to acquire lock | 06:21 |
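A minimal sketch of the starvation pattern described above, using the timestamps ykarel pasted. It assumes the simplest possible lock model (a queens attempt gives up if it starts while an ocata run is in progress); the real promoter lock may behave differently, and the function names are hypothetical.

```python
from datetime import time

# Timestamps copied from the log above; the lock model is an assumption.
QUEENS_ATTEMPTS = ["05:24:01", "05:34:01", "05:44:01", "05:54:01", "06:04:01"]
OCATA_RUNS = [
    ("05:24:01", "05:24:14"),
    ("05:33:01", "05:34:13"),
    ("05:43:01", "05:44:13"),
    ("05:53:01", "05:54:14"),
    ("06:03:01", "06:04:13"),
    ("06:13:02", "06:14:14"),
]


def parse(hms):
    """Turn an 'HH:MM:SS' string into a datetime.time."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return time(hours, minutes, seconds)


def blocked_attempts(attempts, holder_runs):
    """Return the attempts that start while another run holds the lock."""
    windows = [(parse(start), parse(stop)) for start, stop in holder_runs]
    return [
        attempt
        for attempt in attempts
        if any(start <= parse(attempt) <= stop for start, stop in windows)
    ]


blocked = blocked_attempts(QUEENS_ATTEMPTS, OCATA_RUNS)
print(f"{len(blocked)} of {len(QUEENS_ATTEMPTS)} queens attempts started during an ocata run")
```

With both jobs on a ten-minute schedule and ocata starting one minute earlier, every queens attempt lands inside an ocata window, which matches what the two logs show.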
quiquell|ruck | I will check the code, I think queens is timing out on the lock | 06:23 |
quiquell|ruck | And ocata enters and regains it again | 06:23 |
quiquell|ruck | I am going to activate queens by hand | 06:23 |
quiquell|ruck | To have a promotion at least | 06:23 |
quiquell|ruck | Maste is screw with issues | 06:24 |
quiquell|ruck | master | 06:24 |
ykarel | quiquell|ruck, Ok | 06:24 |
*** jaganathan has joined #oooq | 06:24 | |
quiquell|ruck | I have to leave now going back in a few | 06:24 |
ykarel | yes, i pushed a patch for master, but it needs +1W | 06:24 |
*** jaganathan has quit IRC | 06:25 | |
quiquell|ruck | quiquell|ruck: Let's find someone to +1w that | 06:25 |
*** quiquell|ruck is now known as quiquell|ruck|af | 06:25 | |
*** quiquell|ruck|af is now known as quique|ruck|afk | 06:25 | |
ykarel | Okk | 06:25 |
*** jaganathan has joined #oooq | 06:25 | |
*** apetrich has joined #oooq | 06:46 | |
*** apetrich has quit IRC | 06:46 | |
*** jbadiapa has joined #oooq | 06:51 | |
*** ccamacho has joined #oooq | 06:56 | |
*** quique|ruck|afk is now known as quiquell|ruck | 07:01 | |
*** apetrich has joined #oooq | 07:02 | |
*** tesseract has joined #oooq | 07:04 | |
quiquell|ruck | ykarel: I have a WIP for running the promoter sequentially | 07:10 |
quiquell|ruck | https://review.rdoproject.org/r/#/c/13437/ | 07:10 |
quiquell|ruck | It's just a hack, not a big refactoring | 07:10 |
quiquell|ruck | But doing so we need to remove it from crontab | 07:10 |
ykarel | quiquell|ruck, hmm adjustment needs to be done if we change the way it used to work | 07:11 |
ykarel | quiquell|ruck, have you started queens promotion, or stopped ocata one? | 07:12 |
quiquell|ruck | Not yet | 07:12 |
ykarel | Okk | 07:13 |
quiquell|ruck | ykarel: Going to do a script to run them in parallel | 07:22 |
ykarel | quiquell|ruck, won't running in parallel cause the issue we saw earlier (layer image issue)? | 07:23 |
quiquell|ruck | s/parallel/sequential/ | 07:23 |
quiquell|ruck | Coffee is still reaching my blood | 07:23 |
ykarel | Ok:) | 07:24 |
*** bogdando has joined #oooq | 07:24 | |
*** tosky has joined #oooq | 07:28 | |
quiquell|ruck | ykarel: Running in sequence now | 07:33 |
quiquell|ruck | I will give the script a pin | 07:34 |
ykarel | Ok. let's see how it goes | 07:34 |
quiquell|ruck | Running now for queens | 07:37 |
ykarel | Nice, promotion started | 07:38 |
quiquell|ruck | Humm maybe the order is not quite correct | 07:38 |
ykarel | 2018-04-23 07:37:35,485 17070 INFO promoter Promoting the container images for dlrn hash 5466f249bd36900a1dac573cdc83e7a11493aea2 on queens to current-tripleo | 07:38 |
ykarel | i noticed master --> pike --> ocata --> queens | 07:38 |
quiquell|ruck | yep | 07:39 |
quiquell|ruck | That's wrong | 07:39 |
quiquell|ruck | master --> queens --> pike --> ocata would be the correct order | 07:39 |
quiquell|ruck | Let queens finish | 07:39 |
quiquell|ruck | And I'll change the script | 07:39 |
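The sequential run in the corrected order can be sketched as follows. This is only an illustration, not the script quiquell is running: `run_sequentially` and `fake_promote` are hypothetical names, and the choice to continue past a failing release is an assumption.

```python
# Order taken from the discussion above: master first, then newest stable to oldest.
RELEASE_ORDER = ["master", "queens", "pike", "ocata"]


def run_sequentially(releases, promote):
    """Call promote(release) one release at a time, in the given order.

    A failure in one release is recorded and does not stop the others.
    """
    results = {}
    for release in releases:
        try:
            promote(release)
            results[release] = "ok"
        except Exception as exc:  # keep going so one broken release does not block the rest
            results[release] = f"failed: {exc}"
    return results


def fake_promote(release):
    # Hypothetical stand-in for the real promoter invocation.
    if release == "master":
        raise RuntimeError("criteria jobs not passing")


results = run_sequentially(RELEASE_ORDER, fake_promote)
for release, status in results.items():
    print(release, status)
```

Because only one release runs at a time, no two promotions ever compete for the global lock, which is why this has to replace the per-release crontab entries rather than run alongside them.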
ykarel | Okk, queens needs multiple promotions | 07:39 |
quiquell|ruck | Master is fucked up so | 07:39 |
quiquell|ruck | Also the other ones | 07:40 |
quiquell|ruck | :-) | 07:40 |
quiquell|ruck | So it will do only queens | 07:40 |
ykarel | yes, we have +1W now on master fix | 07:40 |
quiquell|ruck | So let's not fix stuff now to have a lot of queens promotions | 07:40 |
quiquell|ruck | :-) | 07:40 |
ykarel | hoping to get that merged before next promotion run | 07:40 |
*** amoralej|off is now known as amoralej | 07:42 | |
quiquell|ruck | ykarel: Do we want to close https://bugs.launchpad.net/tripleo/+bug/1765008 | 07:43 |
openstack | Launchpad bug 1765008 in tripleo "Tempest API tests failing for stable/queens branch" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami) | 07:43 |
quiquell|ruck | Or keep it open to progress with segments ? | 07:43 |
ykarel | quiquell|ruck, i think promotion-blocker tags can be removed, and close once https://review.openstack.org/#/c/562932/ is merged | 07:45 |
quiquell|ruck | This also depends on removing from the skip list the test | 07:46 |
quiquell|ruck | It's no longer a blocker, the test is in the skip list now | 07:46 |
quiquell|ruck | We can work on it without hurry | 07:46 |
quiquell|ruck | Just missing the +2v at https://review.openstack.org/#/c/563443/ | 07:48 |
quiquell|ruck | OK | 07:48 |
quiquell|ruck | Cool :-) | 07:48 |
ykarel | hmm | 07:49 |
*** ykarel is now known as ykarel|lunch | 07:56 | |
*** agopi has quit IRC | 08:05 | |
amoralej | hi | 08:05 |
amoralej | we are finally moving dependencies for master to new repo based on rocky tag | 08:06 |
amoralej | i don't expect any impact | 08:06 |
amoralej | but let me know if you notice anything abnormal | 08:06 |
amoralej | ykarel|lunch, quiquell|ruck ^ | 08:06 |
quiquell|ruck | amoralej: Noted | 08:06 |
quiquell|ruck | amoralej: newbie question, what are the impacts of it ? | 08:08 |
amoralej | quiquell|ruck, currently we are using the deps from queens | 08:09 |
amoralej | we need to move to rocky, as there are packages that are only required in rocky | 08:09 |
amoralej | so we are moving that | 08:09 |
amoralej | in this particular case we are splitting the dependencies in two repos | 08:10 |
amoralej | it should be transparent for you | 08:10 |
amoralej | but, if you notice something starts failing because of lack of packages or something | 08:10 |
amoralej | let me know, in case i mess it up | 08:10 |
*** jaosorior has joined #oooq | 08:15 | |
quiquell|ruck | Ok will keep you in my mind | 08:15 |
*** lucas-afk is now known as lucasagomes | 08:15 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 08:19 |
quiquell|ruck | ykarel|lunch: ERROR: Error installing fluent-plugin-kubernetes_metadata_filter: | 08:27 |
quiquell|ruck | INFO:kolla.image.build.fluentd: serverengine requires Ruby version >= 2.1.0. | 08:27 |
quiquell|ruck | INFO:kolla.image.build.fluentd: | 08:27 |
quiquell|ruck | INFO:kolla.image.build.fluentd:invalid options: -SHN | 08:27 |
quiquell|ruck | INFO:kolla.image.build.fluentd:(invalid options are ignored) | 08:27 |
quiquell|ruck | INFO:kolla.image.build.fluentd: | 08:27 |
quiquell|ruck | https://logs.rdoproject.org/openstack-periodic-24hr/periodic-tripleo-centos-7-pike-containers-build/b7b3eff/kolla/logs/fluentd.log | 08:27 |
*** ykarel|lunch is now known as ykarel | 08:31 | |
*** jaosorior has quit IRC | 08:32 | |
ykarel | quiquell|ruck, since when has the job been failing? | 08:33 |
ykarel | amoralej, ack | 08:34 |
quiquell|ruck | 20th of April | 08:34 |
quiquell|ruck | Looking into the -SHN option | 08:34 |
quiquell|ruck | rdoc has the -SHN option | 08:35 |
ykarel | also both master and queens containers build timed out: Build timed out (after 240 minutes). Marking the build as failed. | 08:38 |
ykarel | looks abnormal | 08:38 |
ykarel | INFO:kolla.common.utils.zaqar:Trying to push the image | 08:39 |
ykarel | ERROR:kolla.common.utils.zaqar:received unexpected HTTP status: 500 Internal Server Error | 08:39 |
quiquell|ruck | ykarel: Damn | 08:39 |
quiquell|ruck | Humm about the pike issue | 08:40 |
quiquell|ruck | Looks like active support is not being installed | 08:40 |
quiquell|ruck | active_support | 08:40 |
quiquell|ruck | unable to convert "\x84" from ASCII-8BIT to UTF-8 for lib/active_support/values/unicode_tables.dat | 08:40 |
quiquell|ruck | And maybe that's why SHN option is not in place | 08:40 |
ykarel | quiquell|ruck, /me no idea about this | 08:40 |
quiquell|ruck | Ok | 08:40 |
*** sshnaidm|off is now known as sshnaidm | 08:41 | |
quiquell|ruck | Bug opened | 08:43 |
quiquell|ruck | https://bugs.launchpad.net/tripleo/+bug/1766195 | 08:43 |
openstack | Launchpad bug 1766195 in tripleo "Error installing fluent-plugin-kubernetes_metadata_filter at pike" [Critical,New] - Assigned to Gabriele Cerami (gcerami) | 08:43 |
panda | quiquell|ruck: morning | 09:00 |
quiquell|ruck | panda: Morning | 09:00 |
quiquell|ruck | panda: bj to do a sync ? | 09:01 |
quiquell|ruck | ykarel: the kolla timeout feels transitory, older jobs don't have it | 09:01 |
quiquell|ruck | The consistent one is the fluentd failure | 09:01 |
quiquell|ruck | Damn My bj is close wait | 09:02 |
quiquell|ruck | Ok now is on | 09:03 |
ykarel | quiquell|ruck, Ok good to focus on fluentd, kolla one can be checked in next run | 09:07 |
quiquell|ruck | Looking at the timeouts now | 09:07 |
arxcruz | myoung: chandankumar kopecmartin weshay hey guys, i have to leave in 2 hours for an appointment. I should be back in time for the scrum, but if I'm not, I've already updated my cards :) | 09:07 |
*** jaosorior has joined #oooq | 09:07 | |
kopecmartin | arxcruz, ack | 09:08 |
*** moguimar has quit IRC | 09:08 | |
quiquell|ruck | ykarel: I see a lot of failures at postin rpm packaging | 09:09 |
quiquell|ruck | at kolla | 09:09 |
*** brault has joined #oooq | 09:10 | |
*** moguimar has joined #oooq | 09:10 | |
chandankumar | arxcruz: ack | 09:11 |
ykarel | quiquell|ruck, postin? | 09:12 |
quiquell|ruck | post installation step of the spec file | 09:13 |
ykarel | okk | 09:13 |
quiquell|ruck | D-Bus connection: Operation not permitted | 09:13 |
quiquell|ruck | Shit | 09:13 |
ykarel | ahh, that can be a real issue | 09:13 |
ykarel | not transient | 09:13 |
quiquell|ruck | That is master promotion timeout | 09:14 |
panda | quiquell|ruck: ok | 09:14 |
quiquell|ruck | panda, ykarel: The master/queens timeout is a 500 Internal Server Error trying to push nova-compute | 09:33 |
quiquell|ruck | It's the same one that you showed for pike, ykarel | 09:33 |
*** zoli is now known as zoli|lunch | 09:33 | |
quiquell|ruck | panda: master/queens/pikes promotion blocker https://bugs.launchpad.net/tripleo/+bug/1766202 | 09:41 |
openstack | Launchpad bug 1766202 in tripleo "Pushing nova-compute kolla image gets HTTP status: 500 Internal Server Error" [Critical,New] - Assigned to Gabriele Cerami (gcerami) | 09:41 |
quiquell|ruck | panda: You're missing the |rover in the nick | 09:42 |
ykarel | quiquell|ruck, it's not just nova-compute, it's seen in many container images | 09:45 |
quiquell|ruck | ykarel: Ok, renaming it | 09:45 |
ykarel | quiquell|ruck, where did you see: D-Bus connection: Operation not permitted? | 09:46 |
quiquell|ruck | ykarel: At masters, but maybe it has been there always | 09:47 |
quiquell|ruck | In the containers build | 09:47 |
quiquell|ruck | Have put the info here | 09:47 |
quiquell|ruck | https://review.rdoproject.org/etherpad/p/ruckrover-sprint12 | 09:47 |
ykarel | Okk, then Internal Server can be a transient one | 09:48 |
quiquell|ruck | Could be... | 09:48 |
quiquell|ruck | Is there something similar at check or gate ? | 09:48 |
ykarel | fs007 --> scenario007 | 09:48 |
quiquell|ruck | They will be run more often | 09:49 |
quiquell|ruck | Ups ok | 09:49 |
quiquell|ruck | :-/ | 09:49 |
quiquell|ruck | Thanks man | 09:49 |
ykarel | and Internal server Error is master and queens, not pike | 09:49 |
quiquell|ruck | It's also pike | 09:50 |
ykarel | Okk, i haven't seen that | 09:50 |
quiquell|ruck | Pike has two issues now | 09:50 |
quiquell|ruck | Humm this is pushing to docker.io not rdo ? | 09:52 |
quiquell|ruck | Damn now I don't find the 500 internal error log at master/queens | 09:59 |
quiquell|ruck | !gatestatus | 10:00 |
openstack | quiquell|ruck: Error: "gatestatus" is not a valid command. | 10:00 |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 10:00 |
*** adarazs is now known as adarazs_lunch | 10:09 | |
*** sshnaidm has quit IRC | 10:10 | |
*** dtantsur|pto is now known as dtantsur | 10:16 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 10:19 |
*** panda is now known as panda|rover | 10:27 | |
*** zoli|lunch is now known as zoli | 10:54 | |
*** sshnaidm has joined #oooq | 11:03 | |
quiquell|ruck | ykarel: | 11:04 |
quiquell|ruck | Revert "Fix neutron-plugin-ml2.yaml puppet base ref | 11:04 |
quiquell|ruck | merged | 11:04 |
ykarel | yup let's wait for it to be packaged | 11:04 |
ykarel | quiquell|ruck, you can see it here once it's packaged:- https://trunk.rdoproject.org/api-centos-master-uc/api/report.html?package=openstack-tripleo-heat-templates | 11:06 |
*** lucasagomes is now known as lucas-hungry | 11:06 | |
ykarel | commit: 297aac5c255e0c9c55740c7a611f5279dc3a1735 | 11:06 |
quiquell|ruck | Then I can move it to released ? | 11:06 |
quiquell|ruck | at lp ? | 11:06 |
ykarel | i think released is autodone with the new releases | 11:07 |
quiquell|ruck | btw queens just promoted | 11:07 |
quiquell|ruck | OOk | 11:07 |
ykarel | nice current-tripleo-rdo | 11:07 |
*** moguimar has quit IRC | 11:13 | |
*** dpeacock has quit IRC | 11:14 | |
ykarel | quiquell|ruck, package is built and in consistent repo, so next periodic run will have the fix | 11:20 |
quiquell|ruck | Ok, let's wait for it | 11:20 |
chandankumar | arxcruz: if you are around, do we need more work on this one https://trello.com/c/wwpJfjRA/695-tempest-run-should-take-less-than-5-min-in-tripleo-undercloud-jobs ? | 11:29 |
ykarel | quiquell|ruck, added alert tag on https://bugs.launchpad.net/tripleo/+bug/1766195 | 11:32 |
openstack | Launchpad bug 1766195 in tripleo "Error installing fluent-plugin-kubernetes_metadata_filter at pike" [Critical,New] - Assigned to Gabriele Cerami (gcerami) | 11:32 |
quiquell|ruck | ykarel: I have to read about the openstack bot | 11:33 |
quiquell|ruck | Thanks btw | 11:33 |
ykarel | np | 11:33 |
*** amoralej is now known as amoralej|lunch | 11:36 | |
*** panda|rover is now known as panda|rover|lunc | 11:41 | |
*** adarazs_lunch is now known as adarazs | 11:47 | |
*** quiquell|ruck is now known as quique|ruck|food | 11:47 | |
*** rfolco|off is now known as rfolco | 11:55 | |
*** atoth has joined #oooq | 11:56 | |
*** lucas-hungry is now known as lucasagomes | 12:06 | |
quique|ruck|food | sshnaidm: featureset030 and featureset035 are missing from sova | 12:09 |
quique|ruck|food | Would it be possible for sova to read them from promoter master.ini ? | 12:09 |
*** trown|outtypewww is now known as trown | 12:12 | |
ykarel | quique|ruck|food, updated the bug:- https://bugs.launchpad.net/tripleo/+bug/1766195 | 12:14 |
openstack | Launchpad bug 1766195 in tripleo "Error installing fluent-plugin-kubernetes_metadata_filter at pike" [Critical,Confirmed] - Assigned to Gabriele Cerami (gcerami) | 12:14 |
sshnaidm | quique|ruck|food, both are in config of sova | 12:15 |
sshnaidm | quique|ruck|food, and both are displayed in promotion page | 12:16 |
sshnaidm | quique|ruck|food, what problem are you hitting exactly? | 12:16 |
quique|ruck|food | sshnaidm: The problem is that I'm totally blind... they are already there :-) sorry | 12:19 |
sshnaidm | quique|ruck|food, np :) | 12:19 |
*** rlandy has joined #oooq | 12:19 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 12:19 |
quique|ruck|food | ykarel: master container build running now, let's see where we are with the kolla image pushing | 12:22 |
ykarel | quique|ruck|food, ack | 12:22 |
ykarel | hoping it to not reproduce and have a promotion :) | 12:22 |
*** amoralej|lunch is now known as amoralej | 12:25 | |
*** tcw has joined #oooq | 12:39 | |
*** tcw has quit IRC | 12:39 | |
*** tcw has joined #oooq | 12:40 | |
*** faceman- is now known as faceman | 12:44 | |
sshnaidm | quique|ruck|food, ykarel do you know about such problem with gperftools-libs? https://logs.rdoproject.org/26/563526/1/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/Zd49e57a183c44c3c871ca322d7f5e63c/undercloud/home/jenkins/install_packages.sh.log.txt.gz#_2018-04-23_11_43_29 | 12:54 |
quique|ruck|food | sshnaidm: They are changing master's dependencies from queens to rocky | 12:56 |
*** adarazs is now known as adarazs_brb | 12:56 | |
quique|ruck|food | Try a recheck | 12:56 |
quique|ruck|food | amoralej: Is it related ? | 12:56 |
amoralej | quique|ruck|food, yes, it's related | 12:57 |
amoralej | to my change | 12:57 |
*** quique|ruck|food is now known as quiquell|ruck | 12:57 | |
amoralej | it will be fixed when mirror is in sync | 12:57 |
amoralej | quiquell|ruck, you are seeing many jobs failing? | 12:57 |
amoralej | i could apply a fix temporarily until synchronization is done | 12:58 |
quiquell|ruck | Not much | 12:58 |
quiquell|ruck | I think it's starting to fail in the gates | 12:58 |
amoralej | if it's critical | 12:58 |
quiquell|ruck | periodic ones take so long | 12:58 |
amoralej | i think it should be ok now, in fact | 12:58 |
amoralej | for jobs starting now | 12:58 |
quiquell|ruck | Applying the fix will take a lot of time | 12:58 |
quiquell|ruck | Let's wait a little, and send an advisory to one of the mailing lists | 12:59 |
sshnaidm | quiquell|ruck, amoralej ack, thanks, will recheck | 13:00 |
amoralej | sshnaidm, quiquell|ruck i'll push a fix in the repos and keep it for one or two days | 13:01 |
amoralej | mmmm | 13:02 |
amoralej | although i may break other things... | 13:02 |
amoralej | quiquell|ruck, sshnaidm please let me know if you still see jobs failing with that error | 13:02 |
quiquell|ruck | amoralej: ack | 13:03 |
*** Goneri has joined #oooq | 13:04 | |
*** ratailor has quit IRC | 13:21 | |
quiquell|ruck | amoralej: Now it's a promotion blocker | 13:21 |
quiquell|ruck | https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-centos-7-master-containers-build/eadb914/undercloud/home/jenkins/install_packages.sh.log.txt.gz | 13:21 |
ykarel | :( | 13:21 |
amoralej | yeah, i saw it | 13:22 |
amoralej | i'm testing again if that's fixed now | 13:22 |
quiquell|ruck | amoralej: Is it a transitory problem ? | 13:22 |
amoralej | quiquell|ruck, it's something that should be already fixed | 13:23 |
amoralej | but i'm testing it | 13:23 |
amoralej | testing it in https://review.rdoproject.org/jenkins/job/rdoinfo-tripleo-master-testing-centos-7-multinode-1ctlr-featureset016-nv/96/console | 13:24 |
ykarel | hmm the failures from master job is 1 hour back, so should be good now | 13:25 |
quiquell|ruck | bad luck | 13:25 |
quiquell|ruck | Let's just wait for another periodic run | 13:25 |
quiquell|ruck | amoralej: Let also check the amoralej running job | 13:26 |
*** moguimar has joined #oooq | 13:26 | |
weshay | thanks for checking that out quiquell|ruck :) | 13:26 |
quiquell|ruck | Hello there | 13:27 |
quiquell|ruck | recursive checking | 13:27 |
weshay | amoralej, are the changes to master/rocky impacting the gate while the sync is happening? | 13:29 |
weshay | e.g. http://logs.openstack.org/63/507963/54/gate/tripleo-ci-centos-7-undercloud-oooq/54add67/logs/undercloud/home/zuul/repo_setup.log.txt.gz | 13:29 |
amoralej | weshay, yes it has impacted | 13:30 |
amoralej | but impact should be low | 13:30 |
weshay | amoralej, do we have any sort of tech debt card / bug to capture that | 13:30 |
amoralej | let me see that job | 13:30 |
*** adarazs_brb is now known as adarazs | 13:30 | |
weshay | amoralej, FYI.. anytime the gate goes down due to ci/infra we should make sure we're working to eliminate the root cause if / when possible | 13:31 |
amoralej | weshay, yes, that job failure is related to new rocky | 13:31 |
amoralej | weshay, ok, should i file a card somewhere? | 13:32 |
amoralej | i think i have an idea to fix that | 13:32 |
amoralej | in future | 13:32 |
amoralej | don't cache repomd.xml files | 13:32 |
amoralej | i need to use expiration headers for it | 13:32 |
weshay | amoralej, that sounds good | 13:32 |
weshay | ya | 13:32 |
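One way to picture amoralej's idea of not caching repomd.xml is a per-path choice of Cache-Control header on the repo server or proxy. This is a rough sketch under assumptions: the function name, the one-day max-age, and the example paths are all hypothetical, and the actual mirror setup is not described in the log.

```python
import posixpath


def cache_headers(path, long_max_age=86400):
    """Pick a Cache-Control header for a file served from a package repo.

    Assumption: repomd.xml is rewritten in place whenever the repo is
    regenerated, so it should always be revalidated; other files can be
    cached for longer.
    """
    if posixpath.basename(path) == "repomd.xml":
        return {"Cache-Control": "no-cache, max-age=0"}
    return {"Cache-Control": f"public, max-age={long_max_age}"}


# Hypothetical example paths, for illustration only.
print(cache_headers("/deps/latest/repodata/repomd.xml"))
print(cache_headers("/deps/latest/x86_64/gperftools-libs-2.4.91-1.el7.x86_64.rpm"))
```

A stale cached repomd.xml would explain the symptom reported a little later in the log, where the file looks fine in a browser but yum still sees the old metadata.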
amoralej | weshay, do you have the full list of mirrors upstream? | 13:33 |
amoralej | i could force refresh | 13:33 |
weshay | amoralej, I don't but I bet david or paul has it | 13:33 |
amoralej | mmmm, it's still failing | 13:34 |
amoralej | in https://review.rdoproject.org/jenkins/job/rdoinfo-tripleo-master-testing-centos-7-multinode-1ctlr-featureset016-nv/96/console | 13:34 |
amoralej | mmmm | 13:34 |
amoralej | i don't know why | 13:34 |
ykarel | amoralej, this time also gperftools? | 13:35 |
amoralej | i'm waiting for the logs | 13:36 |
amoralej | but i guess so | 13:36 |
amoralej | let's see if i can hold the node | 13:36 |
arxcruz | chandankumar: checking | 13:36 |
ykarel | amoralej, logs available now: https://logs.rdoproject.org/42/12942/6/check/rdoinfo-tripleo-master-testing-centos-7-multinode-1ctlr-featureset016-nv/Za85b635804154c8582c917a5bd949b60/undercloud/home/jenkins/install_packages.sh.log.txt.gz | 13:36 |
amoralej | yeah | 13:37 |
amoralej | let me try to reproduce it | 13:37 |
weshay | sshnaidm, any thoughts on https://review.openstack.org/#/c/562347/ | 13:39 |
sshnaidm | weshay, seems fine to me | 13:41 |
weshay | https://review.openstack.org/#/c/563244/ | 13:46 |
arxcruz | omg, look how beautiful all these SUCCESS jobs are... https://review.openstack.org/#/c/561920/ | 13:48 |
chandankumar | arxcruz: yay! | 13:50 |
*** panda|rover|lunc is now known as panda|ruck | 13:51 | |
quiquell|ruck | arxcruz: Better if you start like "You will not believe what this review does... " | 13:51 |
*** quiquell|ruck is now known as quiquell|rover | 13:51 | |
arxcruz | quiquell|rover: hahaha | 13:52 |
amoralej | quiquell|rover, ykarel it should be fine now | 13:52 |
amoralej | rechecking my job | 13:52 |
ykarel | amoralej, hmm locally it's working now | 13:52 |
ykarel | before 1 minute not working | 13:52 |
amoralej | i've updated the repos manually | 13:53 |
amoralej | with the needed packages | 13:53 |
amoralej | i couldn't force mirror sync | 13:53 |
amoralej | and in my browser repomd.xml is fine | 13:53 |
amoralej | but not in yum | 13:53 |
amoralej | anyway, it's good to add those packages | 13:53 |
amoralej | to the repo | 13:53 |
amoralej | i'll send a rdoinfo review for it | 13:53 |
ykarel | amoralej, Ok, so only gperftools is missing, or there are others as well | 13:54 |
amoralej | other two which are needed by gperftools | 13:54 |
ykarel | okk | 13:54 |
amoralej | gv and Xaw3d | 13:54 |
ykarel | ack | 13:54 |
amoralej | in fact, we may have others | 13:54 |
amoralej | that's why i want to rerun a job | 13:55 |
ykarel | ahh, Ok go on | 13:55 |
*** trown is now known as trown|brb | 13:56 | |
ykarel | any idea why queens phase1 is not promoting for:- 5466f249bd36900a1dac573cdc83e7a11493aea2_0c8f7f95 | 13:57 |
ykarel | weshay, quiquell|rover panda|ruck ^^ | 13:58 |
* quiquell|rover checking | 13:58 | |
myoung | good morning, TripleO CI standup/scrum start shortly! https://bluejeans.com/7050859455, https://etherpad.openstack.org/p/tripleo-ci-squad-meeting | 13:59 |
ykarel | weshay, any idea why CIX card not created for https://bugs.launchpad.net/tripleo/+bug/1766195? | 14:00 |
openstack | Launchpad bug 1766195 in tripleo "Error installing fluent-plugin-kubernetes_metadata_filter at pike" [Critical,Triaged] - Assigned to Gabriele Cerami (gcerami) | 14:00 |
weshay | ykarel, it will, we wait for 5hrs to give us a chance to fix it before we escalate it | 14:01 |
*** trown|brb is now known as trown | 14:01 | |
ykarel | weshay, ack | 14:01 |
ykarel | i saw sometime it's created soon, so just asked | 14:01 |
quiquell|rover | ykarel: Promoted to tripleo-current | 14:03 |
quiquell|rover | https://dashboards.rdoproject.org/queens | 14:03 |
quiquell|rover | To current-tripleo-rdo will take time... | 14:03 |
ykarel | quiquell|rover, i can see all required jobs for phase 1 passed:- https://trunk.rdoproject.org/api-centos-queens/api/civotes_detail.html?commit_hash=5466f249bd36900a1dac573cdc83e7a11493aea2&distro_hash=0c8f7f95a9dace8504905c03862036fe270f1545 | 14:06 |
ykarel | for what promotion is waiting? | 14:07 |
ykarel | http://38.145.34.55/queens.log | 14:07 |
weshay | trown, fyi https://review.openstack.org/#/q/topic:gate_update+(status:open+OR+status:merged) | 14:07 |
weshay | sshnaidm, trown I made the tht zuul change non-voting but I can't think of a good reason not to have it voting.. | 14:07 |
weshay | just treading lightly I guess | 14:08 |
sshnaidm | weshay, acc. to policy it should run some period as non-voting to prove a stability | 14:09 |
weshay | sshnaidm, ya.. I think we have done that already though | 14:11 |
weshay | sshnaidm, so is that true for the same job on the diff repo? | 14:11 |
weshay | that's where I'm not clear.. | 14:12 |
weshay | the job is def.. stable | 14:12 |
*** matbu has quit IRC | 14:12 | |
sshnaidm | weshay, I think it's the overall tripleo policy: experimental -> non-voting -> voting | 14:12 |
weshay | sshnaidm, yes.. I get that .. but if you have job_A voting on repo_1 do you then have to start over for repo_2? | 14:13 |
weshay | and set job_A back to experimental | 14:13 |
sshnaidm | weshay, I don't think so.. | 14:13 |
weshay | huh.. starting to see some success on queens w/ the update job | 14:14 |
weshay | previously only saw it working on master | 14:14 |
*** moguimar has quit IRC | 14:14 | |
*** matbu has joined #oooq | 14:14 | |
ykarel | quiquell|rover, containers-build for queens is running for around 2 hours, so looks like https://bugs.launchpad.net/tripleo/+bug/1766202 is not a transient issue and needs to be checked | 14:15 |
openstack | Launchpad bug 1766202 in tripleo "Pushing kolla image gets HTTP status: 500 Internal Server Error" [Critical,New] - Assigned to Gabriele Cerami (gcerami) | 14:15 |
ykarel | It would be good to hold the node: upstream-centos-7-rdo-cloud-tripleo-143846 to check what's going on | 14:16 |
ykarel | amoralej, ^^ | 14:16 |
*** skramaja has quit IRC | 14:16 | |
amoralej | that's pushing to RDO's registry? | 14:17 |
quiquell|rover | ykarel: ack | 14:17 |
ykarel | amoralej, yes | 14:18 |
amoralej | let me see if i see some error in logs | 14:18 |
quiquell|rover | It's running in RDO infra so yes | 14:18 |
ykarel | amoralej, ack | 14:19 |
quiquell|rover | ykarel: Going to ask the RDO guys | 14:19 |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 14:19 |
*** links has quit IRC | 14:30 | |
amoralej | quiquell|rover, ykarel my test job has just passed the step where it was failing before, at least | 14:30 |
ykarel | amoralej, nice, let's see full run to confirm we don't have anything else missing | 14:31 |
amoralej | yeah | 14:31 |
amoralej | i think i know why i missed these | 14:31 |
amoralej | it's kind of corner case | 14:31 |
amoralej | because it's a requirement for ceph, and i didn't take it into account in my script as ceph shouldn't depend on things in rdo repo | 14:32 |
ykarel | Okk | 14:32 |
quiquell|rover | We have to open a bug to track this | 14:33 |
quiquell|rover | A failing one from 3 hours ago | 14:33 |
quiquell|rover | https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/11739/ | 14:33 |
amoralej | quiquell|rover, let me know if you open a bug for that and i'll update it | 14:38 |
quiquell|rover | amoralej: ack | 14:38 |
quiquell|rover | amoralej: Do you have around the review with the changes to get deps from rocky ? | 14:39 |
amoralej | quiquell|rover, not yet, but i first did it manually to workaround | 14:40 |
quiquell|rover | Ok | 14:40 |
quiquell|rover | amoralej, ykarel: https://bugs.launchpad.net/tripleo/+bug/1766271 | 14:42 |
openstack | Launchpad bug 1766271 in tripleo "Installing packages: gperftools-libs-2.4.91-1.el7.x86_64.rpm not found" [Critical,Confirmed] - Assigned to Alfredo Moralejo (amoralej) | 14:42 |
panda|ruck | rlandy: so this is maybe what I don't get: with the RH nameserver it doesn't work. If you have both the RH nameserver and another it does ? | 14:47 |
ykarel | 5466f249bd36900a1dac573cdc83e7a11493aea2_0c8f7f95 queens phase1 promotion still not started(all criteria jobs passed), can this be a bug in promotion script? | 14:47 |
ykarel | will look tomorrow if it's not promoted by then | 14:48 |
ykarel | quiquell|rover, ^^ | 14:48 |
quiquell|rover | ykarel: Will check | 14:48 |
*** agopi has joined #oooq | 14:48 | |
rlandy | panda|ruck: if you look at what you get on the vm nodes | 14:48 |
rlandy | using dhcp | 14:49 |
rlandy | when you set up your minidell attached to rh network | 14:49 |
rlandy | if you add an additional dns server | 14:49 |
rlandy | yes, it works | 14:49 |
rlandy | hence I sued append | 14:49 |
rlandy | used | 14:49 |
rlandy | but supersede will also work | 14:50 |
*** ykarel is now known as ykarel|away | 14:50 | |
ykarel|away | quiquell|rover, Thanks | 14:50 |
quiquell|rover | ykarel|away: Thanks to you man | 14:50 |
panda|ruck | rlandy: probably because the first nameserver does not respond in this case. If it starts responding and saying it doesn't exist, the second will not be queried | 14:51 |
panda|ruck | If I understood what trown was saying | 14:51 |
rlandy | panda|ruck: that was my understanding | 14:53 |
rlandy | also | 14:53 |
panda|ruck | rlandy: so no appending seems a safer solution | 14:54 |
rlandy | I wasn't sure if all users would hit the dns resolution problem - as such forcing a supersede seemed too much | 14:54 |
panda|ruck | mmhh | 14:55 |
trown | ya | 14:55 |
rlandy | but since we are narrowing the scope to just our use case | 14:55 |
rlandy | I am ok with the supersede | 14:55 |
*** quiquell|rover is now known as quiquell|off | 14:56 | |
* rlandy updates review | 14:56 | |
panda|ruck | rlandy: thanks. | 14:58 |
*** skramaja has joined #oooq | 15:04 | |
*** skramaja has quit IRC | 15:07 | |
adarazs | myoung, panda|ruck, weshay: which stable check jobs do you want to monitor with hubbot? stable/{ocata,pike,queens}? | 15:10 |
panda|ruck | adarazs: can I get stable/austin ? | 15:27 |
trown | nope | 15:29 |
adarazs | panda|ruck: we need to reach 88mph at least and start THT early :) | 15:29 |
adarazs | so that I can submit a change on it :) | 15:29 |
trown | cant even get stable/grizzly | 15:29 |
trown | or folsom | 15:30 |
trown | essex... I think that is as far back as I can go without looking at google | 15:30 |
panda|ruck | to be honest I'm not sure austin was even released to the public | 15:30 |
*** chandankumar has quit IRC | 15:35 | |
*** sshnaidm is now known as sshnaidm|afk | 15:41 | |
*** chandankumar has joined #oooq | 15:42 | |
*** sshnaidm|afk has quit IRC | 15:46 | |
*** sshnaidm has joined #oooq | 15:48 | |
*** sshnaidm has quit IRC | 15:49 | |
*** ykarel|away has quit IRC | 15:49 | |
*** moguimar has joined #oooq | 15:50 | |
*** panda|ruck is now known as panda|bbl | 15:51 | |
*** bogdando has quit IRC | 16:01 | |
trown | rlandy: I dont think we need your timeout patch if we remove cloud-init | 16:04 |
trown | rlandy: I think that is actually what is causing vms to be so slow | 16:04 |
trown | to boot | 16:04 |
*** ccamacho has quit IRC | 16:04 | |
rlandy | trown; that's fine, I can abandon the patch if the time to boot improves | 16:05 |
trown | ya was more just giving a heads up, because for iterative testing it is annoying that vms are taking more than 3 minutes to boot :P | 16:06 |
trown | so if you have that issue in your testing too, try removing cloud-init | 16:07 |
*** marios has quit IRC | 16:11 | |
*** ccamacho has joined #oooq | 16:11 | |
*** zoli is now known as zoli|gone | 16:14 | |
*** zoli|gone is now known as zoli | 16:14 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 16:20 |
myoung | kopecmartin, arxcruz, chandankumar, weshay, tempest squad scrum in 3 min, https://etherpad.openstack.org/p/tripleo-tempest-squad-meeting, https://bluejeans.com/7050859455 | 16:27 |
*** lucasagomes is now known as lucas-afk | 16:29 | |
arxcruz | rlandy: weshay trown can one of you guys +w https://review.openstack.org/#/c/562155/ ? | 16:32 |
*** trown is now known as trown|lunch | 17:00 | |
*** sshnaidm has joined #oooq | 17:00 | |
*** moguimar has quit IRC | 17:02 | |
*** sshnaidm is now known as sshnaidm|off | 17:02 | |
*** jaosorior has quit IRC | 17:05 | |
*** kopecmartin has quit IRC | 17:12 | |
*** tesseract has quit IRC | 17:23 | |
*** dtantsur is now known as dtantsur|afk | 17:33 | |
*** agopi is now known as agopi|lunch | 17:42 | |
*** jaosorior has joined #oooq | 17:49 | |
*** trown|lunch is now known as trown | 18:05 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 18:20 |
*** jaosorior has quit IRC | 18:55 | |
*** Goneri has quit IRC | 18:56 | |
*** amoralej is now known as amoralej|off | 19:06 | |
*** agopi|lunch is now known as agopi | 19:07 | |
*** atoth has quit IRC | 19:20 | |
*** jfrancoa has quit IRC | 19:29 | |
*** jfrancoa has joined #oooq | 19:42 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-master, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 a | 20:20 |
rlandy | trown: wrt removing cloud-init, do you have something for that change yet? if not will put up a patch with another virt-customize line in https://github.com/openstack/tripleo-quickstart/blob/master/roles/libvirt/setup/overcloud/tasks/fake_nodepool.yml | 20:27 |
rlandy | the libvirt reproducer is working so far for me with the extra wait time | 20:28 |
*** dougbtv_ has joined #oooq | 20:37 | |
trown | rlandy: ya i dont have a patch for it... I just did it in one of my snapshots | 20:39 |
*** dougbtv has quit IRC | 20:40 | |
trown | rlandy: but that is the right spot i think | 20:40 |
rlandy | trown: cool - will put something up for review | 20:40 |
rlandy | reproducer is going fine on fs010 | 20:40 |
rlandy | will move on to other sets afterwards | 20:40 |
rlandy | the reproducer patch is updated | 20:41 |
rlandy | trown: I can test the snapshot stuff tomorrow | 20:41 |
trown | rlandy: sure.. I have still been ironing out some kinks in the way I checked for stopped vms today | 20:42 |
trown | rlandy: but it should be ready for testing tomorrow | 20:42 |
*** jfrancoa has quit IRC | 20:42 | |
rlandy | I didn't sign up for it yet as I thought someone else might volunteer but if nobody signs by then, I'll be QE | 20:42 |
*** jtomasek has quit IRC | 20:43 | |
trown | I was able to figure out the fs037 issue is something to do with iptables... havent checked exactly what yet though | 20:44 |
rlandy | oh - that's good | 20:44 |
trown | ya I think we will likely have that featureset working in reproducer by the end of the sprint | 20:45 |
trown | which is the one that gave me so much trouble last sprint | 20:45 |
rlandy | well then we accomplished something | 20:55 |
*** Goneri has joined #oooq | 21:00 | |
*** dougbtv__ has joined #oooq | 21:01 | |
*** trown is now known as trown|outtypewww | 21:02 | |
*** dougbtv_ has quit IRC | 21:03 | |
rlandy | trown|outtypewww: ha - disabling cloud-init does fix the long reboot | 21:21 |
*** Goneri has quit IRC | 22:07 | |
*** jrist has quit IRC | 22:08 | |
*** jrist has joined #oooq | 22:12 | |
*** jrist has quit IRC | 22:12 | |
*** jrist has joined #oooq | 22:12 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master, tripleo-quickstart-extras-gate-newton-delorean-full-minimal, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-master, tripleo-ci-centos-7-scenario007-multinode-oooq-container | check logs @ https://review.openstack.org/472607 a | 22:20 |
*** tosky has quit IRC | 23:54 |
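
Note on the 14:47-14:56 DNS discussion: the append-versus-supersede reasoning can be illustrated with a small simulation. This is only a sketch of the usual stub-resolver behaviour panda|ruck describes (fall through to the next nameserver on a timeout, but treat a "no such name" answer as final). The server functions, the host name, and the address are hypothetical stand-ins, not anything taken from the channel's environment.

```python
from typing import Callable, Optional, Sequence


class Timeout(Exception):
    """Raised by a simulated nameserver that never answers."""


# A simulated nameserver: returns an address, or None for "no such name".
Server = Callable[[str], Optional[str]]


def resolve(name: str, servers: Sequence[Server]) -> Optional[str]:
    """Query servers in order; only a timeout moves on to the next one."""
    for server in servers:
        try:
            # Any answer, including "no such name" (None), is final.
            return server(name)
        except Timeout:
            continue
    return None


def silent_server(name: str) -> Optional[str]:
    # Hypothetical first nameserver that does not respond at all.
    raise Timeout()


def negative_server(name: str) -> Optional[str]:
    # Hypothetical first nameserver that responds with "no such name".
    return None


def working_server(name: str) -> Optional[str]:
    # Hypothetical extra nameserver that knows the record.
    return {"mirror.example.org": "192.0.2.10"}.get(name)


# Appended server is reached because the first one times out.
print(resolve("mirror.example.org", [silent_server, working_server]))
# Appended server is never asked once the first one answers negatively.
print(resolve("mirror.example.org", [negative_server, working_server]))
# Superseding the list avoids depending on the first server's behaviour.
print(resolve("mirror.example.org", [working_server]))
```

Under these assumptions, appending only helps while the first nameserver stays silent, which matches the reason given in the log for preferring supersede in this narrow use case.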