*** d0ugal has joined #oooq | 00:01 | |
hubbot | FAILING CHECK JOBS on stable/ocata: gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-ocata @ https://review.openstack.org/564291 | 00:29 |
---|---|---|
*** weshay has quit IRC | 01:30 | |
*** weshay has joined #oooq | 01:36 | |
*** hamzy has quit IRC | 01:56 | |
*** hamzy has joined #oooq | 01:56 | |
*** atoth has quit IRC | 01:57 | |
hubbot | All check jobs are working fine on stable/queens, stable/ocata, stable/ocata, master. | 02:29 |
*** EmilienM has quit IRC | 03:22 | |
*** EmilienM has joined #oooq | 03:22 | |
*** EmilienM has joined #oooq | 03:22 | |
*** jaganathan has joined #oooq | 03:33 | |
*** jaganathan has quit IRC | 03:34 | |
*** udesale has joined #oooq | 03:54 | |
*** jaganathan has joined #oooq | 04:04 | |
*** tcw has quit IRC | 04:15 | |
*** tcw has joined #oooq | 04:17 | |
hubbot | All check jobs are working fine on stable/ocata, stable/ocata, master, stable/queens. | 04:29 |
*** holser__ has joined #oooq | 04:42 | |
*** holser__ has quit IRC | 04:52 | |
*** pgadiya has joined #oooq | 05:07 | |
*** pgadiya has quit IRC | 05:07 | |
*** hamzy has quit IRC | 05:12 | |
*** quiquell|off is now known as quiquell | 05:33 | |
*** udesale_ has joined #oooq | 05:35 | |
*** marios has joined #oooq | 05:36 | |
*** marios has quit IRC | 05:36 | |
*** udesale has quit IRC | 05:37 | |
*** marios has joined #oooq | 05:38 | |
*** links has joined #oooq | 05:38 | |
*** marios has quit IRC | 05:49 | |
*** marios has joined #oooq | 05:49 | |
*** udesale__ has joined #oooq | 06:16 | |
*** kopecmartin has joined #oooq | 06:16 | |
*** udesale_ has quit IRC | 06:18 | |
*** pgadiya has joined #oooq | 06:25 | |
*** pgadiya has quit IRC | 06:25 | |
hubbot | All check jobs are working fine on stable/ocata, stable/ocata, master, stable/queens. | 06:29 |
*** holser__ has joined #oooq | 06:37 | |
*** ccamacho has joined #oooq | 06:40 | |
*** ssbarnea has quit IRC | 06:43 | |
*** ssbarnea has joined #oooq | 06:44 | |
*** pgadiya has joined #oooq | 06:58 | |
*** pgadiya has quit IRC | 06:58 | |
*** bogdando has joined #oooq | 06:59 | |
*** zoli|gone is now known as zoli | 07:01 | |
*** zoli is now known as zoli|wfh | 07:01 | |
*** zoli|wfh is now known as zoli | 07:01 | |
*** saneax has joined #oooq | 07:05 | |
*** jbadiapa has joined #oooq | 07:06 | |
*** sshnaidm has joined #oooq | 07:10 | |
*** ratailor has joined #oooq | 07:14 | |
*** jtomasek has joined #oooq | 07:15 | |
*** tesseract has joined #oooq | 07:16 | |
quiquell | sshnaidm: Welcome back ! | 07:17 |
*** quiquell is now known as quiquell|afk | 07:19 | |
*** yolanda_ has joined #oooq | 07:20 | |
*** yolanda has quit IRC | 07:23 | |
*** holser__ has quit IRC | 07:24 | |
*** saneax has quit IRC | 07:25 | |
*** saneax has joined #oooq | 07:26 | |
*** tosky has joined #oooq | 07:29 | |
*** amoralej|off is now known as amoralej | 07:31 | |
*** holser__ has joined #oooq | 07:36 | |
sshnaidm | quiquell|afk, I'm mostly off this week :) in Brno now, but will poke here | 07:38 |
*** sshnaidm is now known as sshnaidm|brq | 07:38 | |
quiquell|afk | sshnaidm|brq: Ahh ok | 07:43 |
*** quiquell|afk is now known as quiquell | 07:43 | |
quiquell | sshnaidm|brq: back to lonely mornings :-( | 07:44 |
sshnaidm|brq | quiquell, why? where is everybody? :) | 07:44 |
*** gkadam has joined #oooq | 07:44 | |
quiquell | sshnaidm|brq: Good one :-) | 07:45 |
*** tesseract-RH has joined #oooq | 07:45 | |
sshnaidm|brq | quiquell, thanks for working on grafana, looks amazing | 07:46 |
quiquell | sshnaidm|brq: I have focused on alarms | 07:46 |
quiquell | sshnaidm|brq: RDO guys have discovered our toy; for alarms they use sensu | 07:47 |
*** tesseract has quit IRC | 07:48 | |
quiquell | sshnaidm|brq: you can ask ruck-rover-alert for the alarms, in the IRC channel | 07:48 |
*** tesseract-RH has quit IRC | 07:48 | |
*** tesseract has joined #oooq | 07:49 | |
sshnaidm|brq | quiquell, yeah, I thought to use their sensu.. they have it in #rdo-dev | 07:50 |
sshnaidm|brq | need to check it with them | 07:50 |
quiquell | sshnaidm|brq: Yep, don't like grafana alerts too much, you have to hardcode too much | 07:50 |
quiquell | Let's just use it with thresholds and put the alerts in sensu | 07:51 |
sshnaidm|brq | agree | 07:54 |
quiquell | sshnaidm|brq: #rdo-dev is the one at freenode, I don't see many people there | 07:55 |
sshnaidm|brq | quiquell, it's only for alerts | 07:57 |
sshnaidm|brq | quiquell, like tripleo-ci | 07:57 |
quiquell | sshnaidm|brq: Does it make sense to install sensu to play around with in our ruck-rover sandbox? | 07:57 |
sshnaidm|brq | quiquell, seems like that | 07:58 |
quiquell | would like to do the grafana without the constraints of adding alerts | 07:58 |
sshnaidm|brq | quiquell, worth checking the sensu configs on rdo | 07:58 |
sshnaidm|brq | maybe it's simple enough just to use it.. | 07:58 |
quiquell | sshnaidm|brq: will take a look | 07:59 |
*** holser__ has quit IRC | 08:04 | |
*** holser__ has joined #oooq | 08:04 | |
*** jaosorior has joined #oooq | 08:05 | |
*** gkadam has quit IRC | 08:05 | |
*** gkadam has joined #oooq | 08:10 | |
*** jfrancoa has joined #oooq | 08:28 | |
hubbot | All check jobs are working fine on stable/ocata, stable/ocata, master, stable/queens. | 08:29 |
quiquell | sshnaidm|brq: Interesting https://rdo.fsn.ponee.io/thread.html/71eba78db3681759d19cc0a7b561726b6cc86632ade5315d484faf6b@%3Cdev.lists.rdoproject.org%3E | 08:37 |
quiquell | arxcruz|ruck: Do you have the access info of myoung promoter ? | 09:07 |
arxcruz|ruck | quiquell: yes i do, i think i sent to you by mail no ? | 09:23 |
quiquell | arxcruz|ruck: Can't access, my pub key is not on the server; will wait for myoung | 09:25 |
arxcruz|ruck | quiquell: gimme your key | 09:25 |
quiquell | ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9dblk/9GGZQmklr0TPcJtgG8c5ikgG3nXj/iAahtIVHjT0jailjvtdspidJnySb5jbJOK6O0654hLaIIqxTxiBu4PwdwrSXbLzk00yZCQNk2+F4aGz3IybMX2DZsPf0ByQ7LC3EcV9q1lLLNVXnzyZMez2+pNuGFNvbvaOpX+5Tgl8lDgcdu05VK8ooWhiFjwkJ3D1+zlszDmJBmwgElHh81SqMtF2SpRB5L4sMvliIjOP59Ie/i21QmBrLzCW1p4I8xPQc5cgDU6Rdn0D8DbbhzoCpRBSw7NQh/9YKxffwmwIlJ5oF7OqSRk2ja9Ktwexnlhq9F//84iFfBGpB7b ellorent@redhat.com | 09:26 |
*** jfrancoa has quit IRC | 09:50 | |
quiquell | arxcruz|ruck: Can you join #tripleo-ci ? Want to show you something | 09:57 |
*** sai- has joined #oooq | 10:20 | |
*** sai_ has quit IRC | 10:20 | |
*** udesale__ has quit IRC | 10:23 | |
saneax | folks, on an rdo cloud ovb deploy I'm facing the certmonger service start error - https://bugs.launchpad.net/tripleo/+bug/1770944 | 10:25 |
openstack | Launchpad bug 1770944 in tripleo "CI: centos.ci: certmonger service fails while installing undercloud" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 10:25 |
*** hamzy has joined #oooq | 10:27 | |
hubbot | All check jobs are working fine on stable/queens, stable/ocata, stable/ocata, master. | 10:29 |
*** zoli is now known as zoli|lunch | 10:36 | |
*** tesseract has quit IRC | 10:41 | |
*** tesseract has joined #oooq | 10:44 | |
*** holser__ has quit IRC | 10:59 | |
sshnaidm|brq | saneax, I think it should be resolved with newer certmonger package maybe? | 11:02 |
saneax | sshnaidm|brq, I am using certmonger-0.78.4-3.el7_5.1.x86_64 | 11:04 |
sshnaidm|brq | saneax, try asking in #tripleo , it doesn't seem like an oooq problem | 11:07 |
sshnaidm|brq | saneax, maybe jaosorior knows more | 11:07 |
saneax | sure, thanks sshnaidm|brq | 11:07 |
*** quiquell is now known as quiquell|lunch | 11:25 | |
*** amoralej is now known as amoralej|lunch | 11:31 | |
*** jbadiapa has quit IRC | 11:39 | |
*** atoth has joined #oooq | 11:44 | |
*** zoli|lunch is now known as zoli|wfh | 11:49 | |
jaosorior | sshnaidm|brq, saneax it's a certmonger issue. arxcruz|ruck tried to submit a patch for it in puppet-certmonger. But it hasn't merged yet | 11:51 |
jaosorior | here it is https://github.com/saltedsignal/puppet-certmonger/pull/20 | 11:51 |
jaosorior | I tried pinging the maintainer but there's no answer yet | 11:51 |
*** rfolco has joined #oooq | 11:55 | |
*** udesale has joined #oooq | 11:57 | |
saneax | jaosorior, is it possible to install undercloud on rhel 7.5 with this issue ? | 11:59 |
saneax | is there a hack? | 12:00 |
arxcruz|ruck | saneax: the problem is: | 12:00 |
arxcruz|ruck | before undercloud install, there's an undercloud update/upgrade package | 12:00 |
arxcruz|ruck | that brings the latest dbus | 12:01 |
arxcruz|ruck | that, according to the developers, needs a reboot to work properly | 12:01 |
arxcruz|ruck | I noticed that restarting the dbus service fixes the problem, but they said that's how it's gonna be | 12:01 |
arxcruz|ruck | not a bug, a feature lol | 12:01 |
arxcruz|ruck | so, after the rpm upgrade, the dbus service is in a bad state | 12:01 |
arxcruz|ruck | so certmonger that depends on dbus, fails to start | 12:01 |
arxcruz|ruck | not only that, any service depending on dbus fails to start | 12:02 |
saneax | arxcruz|ruck, thanks for the info | 12:03 |
*** jbadiapa has joined #oooq | 12:07 | |
saneax | arxcruz|ruck, can you point me to the specific dbus issue please? | 12:08 |
*** trown|outtypewww is now known as trown | 12:09 | |
weshay | arxcruz|ruck, how are you sir? How is the ruck/rovering going.. need anything? | 12:12 |
weshay | panda, top of the morning to ya irishman | 12:14 |
*** quiquell|lunch is now known as quiquell | 12:14 | |
weshay | hey quiquell :) | 12:17 |
quiquell | weshay: welcome back ! | 12:17 |
weshay | thank you | 12:17 |
weshay | rfolco, you have a few minutes today to sync w/ me? | 12:17 |
rfolco | weshay, sure. welcome back :) | 12:18 |
rfolco | weshay, just tell me what time works best | 12:18 |
*** amoralej|lunch is now known as amoralej | 12:19 | |
arxcruz|ruck | weshay: hey boss, i'm good and you? hope you enjoy vacation | 12:21 |
weshay | rfolco, k.. thanks man. I sent an invite | 12:21 |
arxcruz|ruck | weshay: everything is green :) | 12:21 |
weshay | arxcruz|ruck, ya.. all is good here | 12:21 |
weshay | arxcruz|ruck, I saw :) very nice | 12:21 |
arxcruz|ruck | all phases, all branches | 12:21 |
weshay | arxcruz|ruck, obviously something must be broken then :P | 12:21 |
chkumar246 | arxcruz|ruck: kopecmartin any one wants to be QE for this one https://trello.com/c/pLrKDqWt/789-make-python-tempestconf-backward-compatible? | 12:22 |
kopecmartin | chkumar246, sure, you can add me | 12:22 |
arxcruz|ruck | saneax: the dbus bug is https://bugzilla.redhat.com/show_bug.cgi?id=1569122 | 12:23 |
openstack | bugzilla.redhat.com bug 1569122 in instack-undercloud "Undercloud installation fails with "Execution of '/bin/getcert list' returned 1: Error org.freedesktop.DBus.Error.TimedOut"" [High,New] - Assigned to jslagle | 12:23 |
chkumar246 | arxcruz|ruck: https://review.rdoproject.org/r/#/c/14023/ -> once merged let me know if it breaks any job | 12:23 |
chkumar246 | in tripleo ci | 12:23 |
arxcruz|ruck | weshay: please, don't say that | 12:23 |
saneax | thanks arxcruz|ruck | 12:23 |
tosky | chkumar246: are those the only requirements for 2.0.0? No more refactoring? | 12:23 |
tosky | just to set expectation (iirc we discussed it last week) | 12:23 |
*** apetrich has quit IRC | 12:23 | |
arxcruz|ruck | chkumar246: can you test it before ? | 12:24 |
*** apetrich has joined #oooq | 12:24 | |
arxcruz|ruck | weshay: welcome back boss, when you have time, let me know, i would like to talk with you :) | 12:24 |
chkumar246 | tosky: no we have to complete refactoring as discussed. | 12:24 |
chkumar246 | tosky: rdo trunk should align with tempestconf master, if it breaks, we can fix it | 12:25 |
tosky | chkumar246: oki, I asked because the card says "Once the card is done, let's create a new release of python-tempestconf" | 12:25 |
weshay | arxcruz|ruck, ok.. how about after the ci-escalation mtg | 12:26 |
arxcruz|ruck | sure | 12:27 |
*** holser__ has joined #oooq | 12:27 | |
chkumar246 | arxcruz|ruck: few jobs will pass for sure, but not sure which will break | 12:28 |
hubbot | All check jobs are working fine on stable/queens, stable/ocata, stable/ocata, master. | 12:29 |
arxcruz|ruck | saneax: you are working on the upgrades job, right? | 12:31 |
arxcruz|ruck | anywhere between the install and upgrade, if you restart dbus, your problems are solved :) | 12:31 |
saneax | arxcruz|ruck, not quite | 12:32 |
saneax | this was completely new deploy with master tag | 12:33 |
saneax | but i will try with restart of dbus | 12:33 |
*** rlandy has joined #oooq | 12:34 | |
chkumar246 | arxcruz|ruck: keep an eye on rdo tempestconf patches, let me know if you do not get the changes | 12:35 |
*** rlandy is now known as rlandy|rover | 12:35 | |
saneax | yes restart of dbus fixed the certmonger issue arxcruz|ruck | 12:36 |
saneax | undercloud deploy is going ahead | 12:36 |
saneax | thanks for your help | 12:37 |
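
The workaround confirmed in the exchange above (restart dbus after the package update, then bring certmonger back) can be sketched roughly as follows. This is only an illustration under assumptions: the systemd unit names `dbus` and `certmonger` and the helper names `recovery_commands` and `run_recovery` are not from the chat, and `getcert list` is borrowed from the Bugzilla title quoted earlier. Per the chat, the developers consider a reboot the proper fix, so treat this as a stopgap.

```python
import subprocess

# Assumed systemd unit names; adjust them to match the host.
RECOVERY_STEPS = [
    ["systemctl", "restart", "dbus"],        # dbus is left in a bad state after the RPM update
    ["systemctl", "restart", "certmonger"],  # certmonger depends on dbus
    ["getcert", "list"],                     # sanity check that certmonger answers again
]


def recovery_commands(use_sudo=True):
    """Return the workaround commands in order, optionally prefixed with sudo."""
    prefix = ["sudo"] if use_sudo else []
    return [prefix + step for step in RECOVERY_STEPS]


def run_recovery(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in recovery_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default so nothing is restarted by accident.
    run_recovery(dry_run=True)
```

Calling `run_recovery(dry_run=False)` on the undercloud would actually restart the services; the default only prints the commands.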
rlandy|rover | myoung: I'd like to turn the old promoter on and turn your server off | 12:40 |
chkumar246 | arxcruz|ruck: kopecmartin once we unpin master, feel free to get your oooq-extras patches ready for merging | 12:41 |
chkumar246 | master of tempestconf | 12:42 |
arxcruz|ruck | chkumar246: i'm not comfortable with that unpin before a DNM patch testing it... | 12:47 |
chkumar246 | arxcruz|ruck: do we have tripleo experimental job running against rdoinfo? As we cannot test rdoinfo changes against upstream ci. | 12:48 |
arxcruz|ruck | chkumar246: we can get a dummy patch on python-tempestconf and run the job setting from git, or depends on ? | 12:49 |
rlandy|rover | !gatestatus | 12:49 |
openstack | rlandy|rover: Error: "gatestatus" is not a valid command. | 12:49 |
chkumar246 | arxcruz|ruck: it does not work for Depends on: <rdo patch> in upstream patch. | 12:49 |
arxcruz|ruck | chkumar246: yeah, but we get latest tempestconf, that works with depends-on | 12:50 |
chkumar246 | arxcruz|ruck: got that | 12:54 |
quiquell | trown, panda: Do we need this https://github.com/openstack-infra/tripleo-ci/blob/master/toci_quickstart.sh#L98 ? | 12:58 |
weshay | rlandy|rover, howdy | 12:59 |
rlandy|rover | weshay: welcome back!! | 12:59 |
* rlandy|rover thought weshay was out until wednesday?? | 12:59 | |
rlandy|rover | arxcruz|ruck; so you want to switch over roles or stay as we are? | 13:00 |
rlandy|rover | arxcruz|ruck: just logged this ... https://bugs.launchpad.net/tripleo/+bug/1774990 - looking into it | 13:01 |
openstack | Launchpad bug 1774990 in tripleo "[queens promotion] RDO phase 2 baremetal env E jobs are failing to deploy the overcloud" [High,Triaged] - Assigned to Ronelle Landy (rlandy) | 13:01 |
trown | quiquell: we shouldnt if the undercloud upgrade job is using the script | 13:01 |
weshay | rlandy|rover, can you join my blue for a quick sync ruck/rover | 13:02 |
rlandy|rover | yep | 13:03 |
weshay | thanks | 13:03 |
quiquell | trown: Going to clean this up, then. | 13:04 |
quiquell | to have less sh... to debug | 13:04 |
quiquell | Have a zuul question, does the zuul.d jobs of a Depends-On get executed ? | 13:20 |
quiquell | myoung: Are you there ? | 13:38 |
myoung | quiquell: yup | 13:39 |
myoung | rlandy|rover: ack | 13:39 |
myoung | rlandy|rover: ready to flip the switch? | 13:39 |
quiquell | myoung: I have added telegraf to sol to monitor the promoter | 13:39 |
myoung | %gatestatus | 13:39 |
hubbot | All check jobs are working fine on stable/queens, stable/ocata, stable/ocata, master. | 13:39 |
myoung | quiquell: ahh, cool. we are about to turn it off and turn back on the other one it appears | 13:39 |
quiquell | myoung: Will add telegraf to the other when you have it. | 13:40 |
myoung | weshay: welcome back | 13:43 |
arxcruz|ruck | myoung: Mr. Young | 13:47 |
sshnaidm|brq | myoung, rlandy|rover don't we have downstream gates running? | 13:47 |
arxcruz|ruck | you can call me Mr. Old | 13:47 |
rlandy|rover | sshnaidm|brq: rhos-12 gates should be running again | 13:48 |
myoung | sshnaidm|brq: as of last week we had the rhos-12 gates running | 13:48 |
rlandy|rover | tq was running | 13:48 |
myoung | rhos-13 gates are defined but have a few issues | 13:48 |
rlandy|rover | we just enabled tqe today | 13:48 |
rlandy|rover | myoung: yep - we are ready to flip the switch - arxcruz|ruck will work with you on it | 13:49 |
sshnaidm|brq | is there a patch where I can see the rhos gates run? | 13:51 |
sshnaidm|brq | ... and pass | 13:52 |
weshay | myoung, thanks man | 13:52 |
myoung | arxcruz|ruck: could you please update https://bugs.launchpad.net/tripleo/+bug/1770860 with details heading into returning to the tripleo-infra instance? have we root caused what's going on there? | 13:53 |
openstack | Launchpad bug 1770860 in tripleo "tracker-bug: network lag in tripleo-infra tenant prevents container promotions" [Critical,Triaged] | 13:53 |
quiquell | myoung, arxcruz|ruck: ping when promoter-server is up and running | 13:53 |
arxcruz|ruck | myoung: no root cause :( | 13:54 |
myoung | arxcruz|ruck: is it still experiencing 30 seconds to log in, 3 years to wget a small text file? | 13:54 |
myoung | :) | 13:54 |
myoung | arxcruz|ruck: let's chat after scrums... | 13:55 |
arxcruz|ruck | weshay: ^ | 13:55 |
arxcruz|ruck | ok | 13:55 |
myoung | o/ all - tripleo-ci standup in 5 | 13:55 |
quiquell | Have to restart the laptop | 13:58 |
*** links has quit IRC | 14:10 | |
hubbot | All check jobs are working fine on stable/queens, stable/ocata, stable/ocata, master. | 14:29 |
*** hamzy has quit IRC | 14:36 | |
*** saneax has quit IRC | 14:37 | |
EmilienM | rlandy|rover: hi it's me again :D | 14:40 |
rlandy|rover | EmilienM: hey there - what's up? | 14:40 |
EmilienM | CI folks: please look https://review.openstack.org/#/c/571529/ (sent on ML yesterday) - I thought someone from CI would review it but it was missed probably, just let me know if any problem | 14:40 |
EmilienM | rlandy|rover: I need a reproducer again | 14:41 |
EmilienM | on the same patch | 14:41 |
myoung | quiquell: this is what was talking about https://review.rdoproject.org/r/#/c/14027/ | 14:41 |
EmilienM | rlandy|rover: we made progress but now have another failure | 14:41 |
EmilienM | let me link the script | 14:41 |
EmilienM | https://logs.rdoproject.org/16/566916/11/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-master/Zb3297f9fb4a44a10b34f7afa1b9e860d/reproducer-quickstart.sh | 14:42 |
EmilienM | if you can :-) | 14:42 |
rlandy|rover | EmilienM: ok - I'll set it and ping you with the undercloud ip - we should also look through the errors you get on running a reproducer at some point :) | 14:43 |
bogdando | o/ do you know something of 'Failure prepping block device., Code: 500' %% | 14:43 |
bogdando | ^^ | 14:43 |
bogdando | I have this when reproing that patch | 14:43 |
EmilienM | rlandy|rover: can you show me how you reproduce? can you share your openrc file (without password) | 14:43 |
EmilienM | so I can look what's different | 14:43 |
rlandy|rover | yep | 14:45 |
EmilienM | rlandy|rover: do we have the same kind of tenant? | 14:46 |
quiquell | myoung: Nice promoter stuff, sure we can put some of this in the ruck/rover cockpit | 14:46 |
EmilienM | rlandy|rover: maybe I have less quotas :P | 14:46 |
*** quiquell is now known as quiquell|off | 14:46 | |
rlandy|rover | EmilienM: http://pastebin.test.redhat.com/599506 | 14:46 |
rlandy|rover | last time you reported an issue with the key being missing | 14:47 |
rlandy|rover | EmilienM: we all have the same tenants - except weshay | 14:47 |
rlandy|rover | he can access the CI quotas | 14:47 |
rlandy|rover | but I can't | 14:47 |
EmilienM | ok let me try to reproduce | 14:48 |
EmilienM | weshay has more quotas? ppfffttt | 14:48 |
rlandy|rover | EmilienM; setting it up on my tenant as well | 14:49 |
EmilienM | he's always on PTO, that's unfair | 14:49 |
rlandy|rover | he's back - careful | 14:49 |
EmilienM | oh hey wes how are you | 14:50 |
EmilienM | bogdando, rlandy|rover: my stack is deployed | 14:55 |
EmilienM | and quickstart is running | 14:55 |
EmilienM | :-O :-O :-O | 14:56 |
EmilienM | I just used rlandy|rover's openrc | 14:56 |
rlandy|rover | step 1 - you'll need the zuul change included | 14:56 |
EmilienM | how, I ran with -a | 14:56 |
EmilienM | I shouldn't do that probably | 14:56 |
bogdando | woot woot | 14:56 |
rlandy|rover | you can edit tripleo-quickstart in /opt/stack | 14:57 |
EmilienM | ok | 14:57 |
rlandy|rover | I have one running as well - w/o -a | 14:57 |
rlandy|rover | I included the change to start | 14:57 |
rlandy|rover | will see if that works | 14:57 |
EmilienM | ok I just updated /opt/stack/tripleo-quickstart/config/general_config/featureset035.yml | 14:58 |
EmilienM | rlandy|rover: thanks again! I guess I can continue alone from here | 14:58 |
EmilienM | and stop using your time/resources :-) | 14:58 |
rlandy|rover | EmilienM: cool - in the mean time, I added your key to zuul@38.145.33.10 - I'll watch to see if the change is included there | 14:59 |
EmilienM | ok | 14:59 |
weshay | rfolco, ping | 15:01 |
weshay | rfolco, let's chat | 15:01 |
*** tesseract-RH has joined #oooq | 15:01 | |
*** tesseract has quit IRC | 15:02 | |
myoung | chkumar246, kopecmartin, arxcruz|ruck, weshay: tempest squad standup/scrum/sync in 26 min, please update cards if not already done | 15:04 |
rfolco | weshay, give me 5 min, lunch | 15:06 |
*** sshnaidm|brq has quit IRC | 15:10 | |
*** jbadiapa has quit IRC | 15:15 | |
*** hamzy has joined #oooq | 15:21 | |
myoung | arxcruz|ruck, rlandy|rover: are either of you free at 1pm EDT (5pm UTC) for promoter massaging? Do you want/need to sync on this? You both have access to cipromo@sol.redacted.com, HTH if you need/want it. | 15:22 |
arxcruz|ruck | myoung: i do have access, but i don't know exactly what i need to do | 15:23 |
myoung | arxcruz|ruck: ack, we have tempest scrum in 6 mins, then bug triage. after *that* I can assist. between now and then, if you want to log into the promoter in tripleo-infra, and just do a few tests of pulling containers from docker.io and the RDO registry, curling a few files, determine if it still takes 30-60 sec to connect via ssh, etc...that would be a good starting point | 15:24 |
myoung | see if we have basic networking or if we're still in a state of "you can't get there from here" | 15:25 |
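
One of the basic checks suggested above, how long it takes to reach the promoter over ssh, could be timed with a sketch like this. It measures only the TCP connect to port 22, not the full login, so it is a rough proxy at best. The function names and the 30 second cutoff (the lower end of the "30-60 sec" mentioned in the chat) are assumptions for illustration.

```python
import socket
import time

# Lower bound of the "30-60 sec" login delay mentioned in the chat (assumed cutoff).
SLOW_LOGIN_SECONDS = 30.0


def tcp_connect_seconds(host, port=22, timeout=90.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return None


def classify(seconds, slow_after=SLOW_LOGIN_SECONDS):
    """Turn a measurement into 'unreachable', 'slow' or 'ok'."""
    if seconds is None:
        return "unreachable"
    return "slow" if seconds >= slow_after else "ok"
```

For example, `classify(tcp_connect_seconds("promoter.example.invalid"))` would report the state of a (hypothetical) promoter host; a slow login with a fast TCP connect would point at DNS or authentication rather than basic networking.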
rlandy|rover | myoung: on meeting | 15:26 |
myoung | arxcruz|ruck, rlandy|rover: if I had to shoot from the hip and propose something, I think it would be a good idea to spin up a new VM on the tenant, give it more than 2 cores (like we have now), use overlay2 FS driver (https://docs.docker.com/storage/storagedriver/overlayfs-driver), and enable verbose logging in the dockerd configuration so we can understand what's going on there. | 15:27 |
myoung | arxcruz|ruck, rlandy|rover, alternate proposal is to turn it on and see what happens lol. but I don't think we've addressed any of the root issue(s) so same inputs, likely same outputs? | 15:28 |
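
The dockerd part of the first proposal (overlay2 storage driver plus verbose logging) might look like the settings generated below. This is a sketch, not the promoter's actual configuration: `storage-driver` and `debug` are standard Docker daemon configuration keys, the usual location is `/etc/docker/daemon.json`, and the helper name `dockerd_config` is hypothetical.

```python
import json


def dockerd_config(storage_driver="overlay2", debug=True):
    """Build the settings that would be written to /etc/docker/daemon.json."""
    return {
        "storage-driver": storage_driver,  # overlay2, as proposed in the chat
        "debug": debug,                    # enables verbose dockerd logging
    }


if __name__ == "__main__":
    # Print the JSON instead of writing it, so the sketch has no side effects.
    print(json.dumps(dockerd_config(), indent=2, sort_keys=True))
```

The daemon has to be restarted for the file to take effect, and switching the storage driver on an existing host makes previously pulled images invisible to the new driver, which fits the suggestion to start from a fresh VM.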
rlandy|rover | myoung: can chat later - talking with rasca now | 15:28 |
myoung | rlandy|rover: ack, i don't have cycles till after 1pm EDT anyway | 15:28 |
myoung | :) | 15:28 |
rlandy|rover | 1 pm is fine | 15:28 |
arxcruz|ruck | myoung: okay, let's spin a new vm | 15:28 |
arxcruz|ruck | rlandy|rover: myoung here it's almost 6pm | 15:29 |
myoung | chkumar246, kopecmartin, arxcruz|ruck, weshay: tempest squad scrum starts shortly, https://etherpad.openstack.org/p/tripleo-tempest-squad-meeting, https://bluejeans.com/7050859455 | 15:29 |
*** dtrainor has joined #oooq | 15:31 | |
*** ratailor has quit IRC | 15:32 | |
myoung | weshay / arxcruz|ruck, coming or should we start now? | 15:32 |
arxcruz|ruck | coming | 15:33 |
*** rfolco_ has joined #oooq | 15:36 | |
*** marios has quit IRC | 15:37 | |
*** marios_ has joined #oooq | 15:37 | |
*** rfolco has quit IRC | 15:37 | |
*** marios_ is now known as marios | 15:37 | |
*** bogdando has quit IRC | 15:38 | |
*** bogdando has joined #oooq | 15:40 | |
*** bogdando has quit IRC | 15:46 | |
EmilienM | rlandy|rover: for the record, my OS_IDENTITY_VERSION was set on '3' instead of '2', I suspect it was the reason why my stack was failing. | 16:05 |
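
A quick pre-flight check for the mismatch described above could be sketched like this. The expected value `"2"` is taken from the message (EmilienM only suspects it was the cause), and the function name `identity_version_problem` is hypothetical.

```python
import os


def identity_version_problem(env, expected="2"):
    """Return a message if OS_IDENTITY_VERSION differs from expected, else None."""
    actual = env.get("OS_IDENTITY_VERSION")
    if actual is None:
        return "OS_IDENTITY_VERSION is not set"
    if actual != expected:
        return f"OS_IDENTITY_VERSION is {actual!r}, expected {expected!r}"
    return None


if __name__ == "__main__":
    # Check the current shell environment, e.g. after sourcing an openrc file.
    print(identity_version_problem(os.environ) or "OS_IDENTITY_VERSION looks fine")
```

Running it after sourcing the openrc file, and before launching the reproducer script, would surface the difference early.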
rlandy|rover | rasca++ | 16:06 |
hubbot | rlandy|rover: rasca's karma is now 1 | 16:06 |
rlandy|rover | nice work on that backport | 16:07 |
rlandy|rover | EmilienM: happy you found the issue | 16:07 |
rasca | rlandy|rover, thanks for your help there! | 16:07 |
rlandy|rover | need to submit the review to include a tqe/tq change | 16:07 |
*** gkadam has quit IRC | 16:08 | |
*** gkadam has joined #oooq | 16:08 | |
chkumar246 | myoung: kopecmartin I have closed and updated few bugs related to tempest from that query | 16:14 |
chkumar246 | i think we need a sprint to get it cleared | 16:15 |
*** sshnaidm|brq has joined #oooq | 16:16 | |
*** sanjay__u has quit IRC | 16:19 | |
myoung | chkumar246: ack | 16:23 |
*** chkumar246 is now known as chandankumar | 16:24 | |
*** trown is now known as trown|lunch | 16:27 | |
hubbot | All check jobs are working fine on stable/ocata, stable/ocata, master, stable/queens. | 16:29 |
*** udesale has quit IRC | 16:30 | |
*** tesseract-RH has quit IRC | 16:34 | |
*** holser__ has quit IRC | 16:38 | |
*** tosky has quit IRC | 16:47 | |
*** kopecmartin has quit IRC | 17:00 | |
* EmilienM loves the green on https://dashboards.rdoproject.org/rdo-dev | 17:19 | |
* EmilienM sends kudos to people here | 17:19 | |
*** zoli|wfh is now known as zoli|gone | 17:32 | |
*** zoli|gone is now known as zoli | 17:32 | |
*** jaganathan has quit IRC | 17:39 | |
*** amoralej is now known as amoralej|off | 17:45 | |
*** trown|lunch is now known as trown | 17:54 | |
EmilienM | rlandy|rover: FYI I don't need your env, I've reproduced the env myself and, good news, fixed the bug I had (FYI, it was https://review.openstack.org/#/c/572151/) | 18:06 |
rlandy|rover | EmilienM: good to know - thanks | 18:19 |
rfolco_ | myoung, ping | 18:25 |
hubbot | All check jobs are working fine on stable/ocata, stable/ocata, master, stable/queens. | 18:29 |
myoung | rfolco_: what's up | 18:34 |
rfolco_ | myoung, DoD not clear | 18:34 |
rfolco_ | https://github.com/openstack-infra/tripleo-ci/blob/master/zuul.d/multinode-jobs.yaml#L83 | 18:34 |
rfolco_ | upgrade job has fs051, ok, goal achieved | 18:34 |
rfolco_ | but it runs on experimental pipeline only | 18:34 |
rfolco_ | myoung, should we move this job to check/gate, or even 3rd party pipelines ? | 18:35 |
myoung | rfolco_: sorry was multitasking. which card? | 18:39 |
* myoung reloads state/context | 18:39 | |
rfolco_ | https://trello.com/c/Ji8RaoHy/776-ci-job-create-keystone-only-full-upgrade-undercloud-overcloud-new-job?menu=filter&filter=label:Sprint%2014%20CI | 18:39 |
rfolco_ | myoung, ^ | 18:39 |
myoung | rfolco_: it was my understanding that for this the DoD was that it was running as nonvoting, triggering on changes to tq/tqe/tu. if it's running now in experimental only, then we're not done... | 18:41 |
myoung | rfolco_: i've also been in BJ for the past nearly 5 hours straight so I could have wires crossed...need to drop to get some lunch | 18:43 |
rfolco_ | myoung, :) | 18:43 |
rfolco_ | thanks | 18:43 |
*** myoung is now known as myoung|lunch | 18:43 | |
*** dtrainor has quit IRC | 18:47 | |
*** dtrainor has joined #oooq | 18:48 | |
*** quiquell|off has quit IRC | 19:03 | |
rlandy|rover | arxcruz|ruck: looking at https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/tqe-gate-rhos-12-ci-rhos-ovb-minimal-pacemaker-public-bond/ - looks like the role failure was just one job - the current running jobs are past that | 19:12 |
arxcruz|ruck | rlandy|rover: okay, cool :) one problem less :) | 19:12 |
rlandy|rover | arxcruz|ruck: yeah - we aren't short of problems so one less is better | 19:13 |
rlandy|rover | rasca: hey - looking at the failures here ... https://review.openstack.org/#/c/572155/ | 19:16 |
rlandy|rover | you still on line? | 19:16 |
rlandy|rover | otherwise I will update | 19:16 |
*** jaosorior has quit IRC | 19:19 | |
rlandy|rover | how do I get access to https://registry.rdoproject.org? | 19:34 |
rlandy|rover | arxcruz|ruck: ^^ do you have access? | 19:34 |
arxcruz|ruck | rlandy|rover: i don't | 19:34 |
rlandy|rover | need to see progress on registry | 19:35 |
arxcruz|ruck | chandankumar: do you know ? | 19:35 |
rlandy|rover | I don't see any updates on docker yet | 19:35 |
rlandy|rover | I think apevec probably can | 19:35 |
rlandy|rover | current-tripleo | 19:36 |
rlandy|rover | 374 MB | 19:36 |
rlandy|rover | 14 hours ago on docker | 19:36 |
rlandy|rover | process is still running | 19:36 |
arxcruz|ruck | rlandy|rover: using the password from the script doesn't work ? | 19:37 |
rlandy|rover | no - I think you have to auth against your login | 19:37 |
rlandy|rover | there is no opportunity to enter a password | 19:37 |
*** holser__ has joined #oooq | 19:41 | |
*** myoung|lunch is now known as myoung | 19:57 | |
myoung | rlandy|rover: queens promotion is done, and I've killed the promoter on sol | 19:58 |
rlandy|rover | myoung: thank you | 19:58 |
rlandy|rover | waiting on the master one | 19:58 |
myoung | access to the rdo registry is here: https://console.registry.rdoproject.org/ | 19:59 |
rlandy|rover | myoung: I get auth failed | 19:59 |
myoung | rlandy|rover, arxcruz|ruck, and afaik is controlled / acl'd by membership to https://github.com/orgs/rdo-infra/people | 19:59 |
myoung | I think we need to have tripleo-ci members added to that group, or we need the auth on the RDO side to be able to include us via some other mechanism | 20:00 |
rlandy|rover | myoung: thanks - will ask apevec when he is on line | 20:00 |
rlandy|rover | 2018-06-04 17:58:08,765 16603 INFO promoter Promoting the container images for dlrn hash 3a65b17da7b98c83dbfda432af88bb56d3501de9 on master to current-tripleo | 20:00 |
myoung | rlandy|rover: what can be done in the meantime is CLI access via either docker commands directly, or the openshift command line, using the creds on the promoter in the secrets file | 20:00 |
rlandy|rover | is all I follow atm | 20:01 |
myoung | to track status, on the promoter instance you can run "sudo docker images | grep 3a65b17da7b98c83dbfda432af88bb56d3501de9" to track its progress downloading / uploading images to/from rdo and docker.io | 20:01 |
myoung | i think it would also be helpful to enable verbose(er) dockerd logging as well | 20:02 |
* myoung will capture additional ideas in LP rfe's | 20:03 | |
rlandy|rover | nothing there | 20:19 |
rlandy|rover | "sudo docker images | grep 3a65b17da7b98c83dbfda432af88bb56d3501de9" | 20:19 |
rlandy|rover | latest is from 7 days ago | 20:20 |
rlandy|rover | process is still running | 20:20 |
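
The progress check used above (`docker images` filtered by the DLRN hash) amounts to the small filter below. This sketch works on captured text so it can run anywhere; the sample listing is made up for illustration, and only the hash itself comes from the chat.

```python
DLRN_HASH = "3a65b17da7b98c83dbfda432af88bb56d3501de9"  # hash quoted in the chat

# Made-up sample of `docker images` output, for illustration only.
SAMPLE = """REPOSITORY                        TAG                                        IMAGE ID       CREATED       SIZE
example.invalid/master/nova-api   3a65b17da7b98c83dbfda432af88bb56d3501de9   aaaaaaaaaaaa   2 hours ago   1GB
example.invalid/master/nova-api   0000000000000000000000000000000000000000   bbbbbbbbbbbb   7 days ago    1GB
"""


def images_with_hash(docker_images_output, dlrn_hash):
    """Return the lines of `docker images` output that mention the given hash."""
    return [line for line in docker_images_output.splitlines() if dlrn_hash in line]


if __name__ == "__main__":
    matches = images_with_hash(SAMPLE, DLRN_HASH)
    print(f"{len(matches)} image(s) tagged with {DLRN_HASH}")
```

An empty result, as reported here, means no image carrying that hash has reached the local docker storage yet.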
*** atoth has quit IRC | 20:20 | |
rlandy|rover | afaict the process is still running but I don't see any pushes going on | 20:23 |
rlandy|rover | no updates since 2018-06-04 17:58:08,765 16603 INFO promoter Promoting the container images for dlrn hash 3a65b17da7b98c83dbfda432af88bb56d3501de9 on master to current-tripleo | 20:23 |
rlandy|rover | weshay: ^^ | 20:29 |
rlandy|rover | do you see something going on that I don't? | 20:29 |
weshay | rlandy|rover, it may be downloading from the rdo registry | 20:29 |
* weshay looks | 20:29 | |
hubbot | All check jobs are working fine on stable/queens, stable/ocata, stable/ocata, master. | 20:30 |
weshay | hubbot++ | 20:30 |
hubbot | weshay: hubbot's karma is now 1 | 20:30 |
weshay | rlandy|rover, I suspect it's here https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/container-push/container-push.yml#L78 | 20:32 |
weshay | a lot of that can be removed w/ the oc client | 20:32 |
rlandy|rover | that would be nice | 20:32 |
weshay | rlandy|rover, also | 20:33 |
weshay | dam | 20:33 |
rlandy|rover | I'll request access to https://registry.rdoproject.org:8443 tomorrow | 20:33 |
weshay | /var/lib/docker is at 100% | 20:33 |
rlandy|rover | I am auth denied | 20:33 |
rlandy|rover | clean that | 20:33 |
weshay | ? | 20:34 |
weshay | sudo su -? | 20:34 |
rlandy|rover | ? | 20:35 |
rlandy|rover | can we get rid of manual_promotion? | 20:37 |
weshay | we can just stop it | 20:38 |
weshay | rlandy|rover, are you on it? | 20:38 |
weshay | trown, myoung have you guys ever had to clean up the local containers on the promoter? | 20:44 |
myoung | weshay: I have not, are we getting close to full on the disk? | 20:45 |
weshay | myoung, 80gb is full | 20:46 |
myoung | weshay: afaik the promoter should be cleaning up after itself, unless it's failed or killed | 20:46 |
* myoung logs in and looks | 20:46 | |
myoung | we might need to run a prune | 20:46 |
weshay | myoung, I did.. nothing came up | 20:46 |
myoung | omg it's so nice to log in in ~2 sec :) | 20:47 |
trown | I also thought it cleaned up after itself | 20:47 |
myoung | oof...we might need to flip back to other promoter and clean this up...it appears we're super low on space and even docker calls are just hanging... | 20:52 |
* myoung looks deeper | 20:52 | |
myoung | /dev/vdb1 80G 80G 173M 100% /var/lib/docker | 20:52 |
myoung | ahh that's why...on tmux already had commands running :) | 20:53 |
* myoung is experimenting locally with things here https://lebkowski.name/docker-volumes/ | 20:57 | |
myoung | weshay, trown, testing out cleanup options on the other promoter (now not running anything) | 21:00 |
myoung | ^^ big hammer running now... "docker rmi $(docker images -a -q)" | 21:01 |
myoung | (on sol) | 21:01 |
myoung | weshay: interesting...looks like since we have containers from multiple repos (rdo, docker) that have shared layers, we might be running into this (silently) when attempting to remove... | 21:04 |
myoung | Error response from daemon: conflict: unable to delete d5381dcd3b00 (must be forced) - image is referenced in multiple repositories | 21:04 |
myoung | "docker rmi --force $(docker images -a -q)" just ripped thru very quickly and obliterated 16G of old images | 21:05 |
*** trown is now known as trown|outtypewww | 21:06 | |
myoung | weshay: ^^ | 21:06 |
*** jfrancoa has joined #oooq | 21:07 | |
*** holser___ has joined #oooq | 21:07 | |
* myoung watches space free up | 21:08 | |
*** jfrancoa has quit IRC | 21:09 | |
*** holser__ has quit IRC | 21:10 | |
myoung | looking better, up to 7g freed up | 21:12 |
myoung | weshay, rlandy|rover, trown|outtypewww, so I guess https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/container-push/container-push.yml#L157 is not doing what we thought w.r.t. actually freeing up the space for all these layers | 21:15 |
myoung | got it...this is why | 21:16 |
myoung | When absent an image will be removed. Use the force option to un-tag and remove all images matching the provided name. | 21:16 |
myoung | http://docs.ansible.com/ansible/latest/modules/docker_image_module.html :: state flag | 21:16 |
* myoung makes a patch | 21:16 | |
*** holser___ has quit IRC | 21:17 | |
myoung | weshay, rlandy|rover, trown|outtypewww: https://review.rdoproject.org/r/14048 promoter: Use force parameter with removing images | 21:21 |
*** holser__ has joined #oooq | 21:30 | |
*** myoung is now known as myoung|off | 21:31 | |
*** holser___ has joined #oooq | 21:38 | |
*** holser__ has quit IRC | 21:41 | |
*** holser___ has quit IRC | 21:53 | |
rlandy|rover | weshay: going to kick promoter.sh | 21:57 |
rlandy|rover | woohoo - some action on runk.registry.rdoproject.org/tripleomaster/centos-binary-cinder-volume 3a65b17da7b98c83dbfda432af88bb56d3501de9_dba04735 | 22:04 |
rlandy|rover | 18% /var/lib/docker | 22:12 |
rlandy|rover | fills up quickly | 22:12 |
*** tcw has quit IRC | 22:14 | |
*** tcw has joined #oooq | 22:15 | |
hubbot | All check jobs are working fine on stable/queens, stable/ocata, stable/ocata, master. | 22:30 |
rlandy|rover | 3a65b17da7b98c83dbfda432af88bb56d3501de9_dba04735 | 22:35 |
rlandy|rover | 278 MB | 22:35 |
rlandy|rover | 7 minutes ago | 22:35 |
rlandy|rover | updated | 22:35 |
*** sshnaidm|brq has quit IRC | 22:55 | |
rlandy|rover | weshay: still around? | 23:00 |
rlandy|rover | weshay: https://review.rdoproject.org/r/#/c/14048/ - looks like a good shot to me - thoughts? | 23:10 |
weshay | rlandy|rover, /me looks | 23:14 |
rlandy|rover | script is moving along nicely now | 23:14 |
weshay | rlandy|rover, ah good | 23:14 |
weshay | rlandy|rover, that is worth a shot :) | 23:15 |
rlandy|rover | weshay: merge it? | 23:15 |
weshay | aye | 23:15 |
rlandy|rover | weshay: ok - I think I will need to clean up after this run | 23:15 |
rlandy|rover | but will watch it | 23:16 |
rlandy|rover | already tagging on docker.io | 23:16 |
rlandy|rover | weshay: looks pretty clean - should we re-enable the cron? | 23:54 |
rlandy|rover | queens running now | 23:55 |
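
Editor's sketch, not part of the channel log: the stall above turned out to be /var/lib/docker sitting at 100%, which only surfaced once someone looked at df by hand. A minimal shell sketch of an automated check follows. The function names usage_percent and needs_cleanup and the 90% limit are assumptions made for illustration; nothing here is taken from the promoter scripts.

```shell
#!/bin/sh
# Hypothetical helper: pull the Use% figure out of one line of `df` output.
# Assumed line layout, as pasted in the log at 20:52:
#   /dev/vdb1 80G 80G 173M 100% /var/lib/docker
usage_percent() {
    # Field 5 is Use%; strip the trailing '%' so the value compares as a number.
    printf '%s\n' "$1" | awk '{ sub(/%$/, "", $5); print $5 }'
}

# Hypothetical threshold check: succeeds (exit 0) when usage >= the given limit.
needs_cleanup() {
    [ "$(usage_percent "$1")" -ge "$2" ]
}

# Example against the live mount point (commented out; needs the mount to exist):
# line=$(df -h /var/lib/docker | tail -n 1)
# needs_cleanup "$line" 90 && echo "/var/lib/docker needs cleanup"
```

Run from cron ahead of the promoter, a check like this could flag a full disk before docker calls start hanging; whether that fits the existing setup is left open.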
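
Editor's sketch, not part of the channel log: the "big hammer" at 21:05 passes every ID from "docker images -a -q" to "docker rmi --force". That listing prints one row per repository/tag, so an image referenced from several repositories shows up more than once. The sketch below de-duplicates the IDs and prints the resulting command instead of running it, so it can be reviewed first. The function names unique_image_ids and build_rmi_command are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read image IDs on stdin (one per line) and drop
# blank lines and repeated IDs, keeping the first occurrence of each.
unique_image_ids() {
    awk 'NF && !seen[$1]++ { print $1 }'
}

# Dry run: print the forced-removal command rather than executing it.
build_rmi_command() {
    ids=$(unique_image_ids | tr '\n' ' ')
    ids=${ids% }    # drop the single trailing space left by tr
    if [ -z "$ids" ]; then
        echo "nothing to remove"
    else
        echo "docker rmi --force $ids"
    fi
}

# Usage (commented out; requires a running docker daemon):
# docker images -a -q | build_rmi_command
```

Printing the command first is a deliberate choice here: --force untags and removes images even when they are referenced in multiple repositories, so a review step seems prudent on a shared promoter host.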