*** ysandeep has joined #oooq | 01:08 | |
*** ykarel has joined #oooq | 03:49 | |
*** epoojad1 has joined #oooq | 04:39 | |
*** udesale has joined #oooq | 04:46 | |
*** ykarel has quit IRC | 05:13 | |
*** epoojad1 has quit IRC | 05:15 | |
*** ykarel has joined #oooq | 05:28 | |
*** ratailor has joined #oooq | 05:39 | |
*** epoojad1 has joined #oooq | 05:42 | |
*** bhagyashris has joined #oooq | 05:46 | |
*** bhagyashris_ has joined #oooq | 05:49 | |
*** bhagyashris has quit IRC | 05:52 | |
*** surpatil has joined #oooq | 06:18 | |
*** marios has joined #oooq | 06:23 | |
*** bhagyashris_ is now known as bhagyashris | 06:59 | |
*** saneax has joined #oooq | 07:02 | |
*** SurajPatil has joined #oooq | 07:12 | |
*** marios has quit IRC | 07:13 | |
*** surpatil has quit IRC | 07:15 | |
*** marios has joined #oooq | 07:16 | |
*** surpatil has joined #oooq | 07:33 | |
*** SurajPatil has quit IRC | 07:36 | |
*** SurajPatil has joined #oooq | 07:42 | |
*** surpatil has quit IRC | 07:44 | |
*** ykarel is now known as ykarel|lunch | 07:48 | |
*** surpatil has joined #oooq | 07:50 | |
*** SurajPatil has quit IRC | 07:52 | |
*** saneax has quit IRC | 08:22 | |
*** ratailor_ has joined #oooq | 08:45 | |
*** ratailor has quit IRC | 08:48 | |
*** ykarel|lunch is now known as ykarel | 08:55 | |
*** SurajPatil has joined #oooq | 09:09 | |
*** surpatil has quit IRC | 09:11 | |
*** surpatil has joined #oooq | 09:28 | |
*** SurajPatil has quit IRC | 09:31 | |
*** ysandeep has quit IRC | 09:32 | |
zbr|ruck | morning | 09:44 |
marios | o/ | 09:44 |
marios | zbr|ruck: hi i filed that fyi (didn't know you were ruck) https://bugs.launchpad.net/tripleo/+bug/1857884 & bump timeout at https://review.opendev.org/#/c/700764/ | 09:44 |
openstack | Launchpad bug 1857884 in tripleo "centos-7-master-containers-build-push time out pushing containers" [Critical,Triaged] | 09:44 |
marios | zbr|ruck: blocks the master promotion looks like it has timed out for over a week | 09:44 |
zbr|ruck | marios: my guess is that removing timeout will decrease its value! | 09:46 |
zbr|ruck | it was there to increase it, not to decrease it. | 09:46 |
*** saneax has joined #oooq | 09:46 | |
marios | zbr|ruck: no it defaults to 3 hours from base | 09:46 |
marios | zbr|ruck: but please sanity check if you think that is wrong | 09:48 |
zbr|ruck | marios: why would we want to put 2h in that case? | 09:48 |
zbr|ruck | to fail faster? the reality is that if we don't build containers in 2h, we will not be able to do other stuff. and the total job duration on zuul is enforced. | 09:48 |
marios | zbr|ruck: i can't remember why but i guess at the time it was taken from the same value used downstream https://github.com/openstack/tripleo-ci/commit/4ffc30df03aca056f798d47f4bdfde43f331301a | 09:48 |
marios | zbr|ruck: but it isn't failing to build | 09:48 |
marios | zbr|ruck: it is timing out during the push (see bug) | 09:48 |
zbr|ruck | well, no push means failure to deliver | 09:49 |
zbr|ruck | marios: now the funny part, I can only add +1 to it. | 09:49 |
zbr|ruck | i have no extra permissions. | 09:50 |
marios | zbr|ruck: ok so since master is REALLY red now i suggest we increase that timeout even if we end up reverting it because we find some better way | 09:50 |
marios | zbr|ruck: ack thanks for checking the review | 09:50 |
zbr|ruck | marios: sure, I am ok to merge it, but I cannot do it. | 09:50 |
marios | zbr|ruck: ack np we can wait a bit more | 09:50 |
zbr|ruck | marios: at some point I lost hope of getting core on tripleo. | 09:51 |
*** sanjayu_ has joined #oooq | 09:52 | |
marios | zbr|ruck: i added it to https://etherpad.openstack.org/p/ruckroversprint19 (are you using a different etherpad?) | 09:53 |
marios | zbr|ruck: do you have rover or are you lone ranger? | 09:53 |
ykarel | marios, /me commented on the review, as it looks like an actual issue to me | 09:54 |
ykarel | and increasing the timeout that much doesn't look good | 09:55 |
ykarel | as the other branches are working/pushing much better | 09:55 |
*** saneax has quit IRC | 09:55 | |
marios | ykarel: ack thanks... indeed train/stein are fine at 2 hours... and it looks like it started ~ 23rd... but still even if temporary we should bump it so we can promote master | 09:58 |
ykarel | marios, /me added what i suspect and testing it to confirm | 10:01 |
ykarel | and just for promotion we can use testproject | 10:01 |
ykarel | also there was a good hash from 26th which passed all jobs accept fs020 | 10:01 |
ykarel | s/accept/except | 10:01 |
marios | ykarel: ack but not that simple we need the whole pipeline (i have one there if needed https://review.rdoproject.org/r/#/c/24299/ ) | 10:02 |
ykarel | so fs020 is also a promotion blocker | 10:02 |
ykarel | yes agree with that ^^, my point was mainly that there is a way to promote | 10:03 |
ykarel | so if you can debug the fs020 issue it will be good, meanwhile i'm checking the container build | 10:03 |
marios | zbr|ruck: ^^ fyi fs20 master | 10:03 |
marios | ykarel: i didn't see fs20 since if container build fails nothing else runs | 10:04 |
marios | ykarel: must have failed on earlier run? | 10:04 |
ykarel | marios, u can check build history | 10:04 |
ykarel | yes it's timing out from last couple of days | 10:04 |
marios | ykarel: ack thanks | 10:04 |
zbr|ruck | marios: i am alone | 10:04 |
marios | zbr|ruck: k will try help you a bit... looking at stein now (train looks good promoted today already) | 10:05 |
zbr|ruck | it would be cool to have a soft-timeout option, failing the job when it is reached but not killing it. this would allow us to see if a small bump is needed, or we are beyond help. | 10:07 |
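The soft-timeout idea above can be sketched in a few lines of Python. This is only an illustration of the concept, not an existing Zuul option: `run_with_soft_timeout` and `SoftTimeoutResult` are hypothetical names, and the task is a plain callable standing in for a job.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class SoftTimeoutResult:
    elapsed: float     # wall-clock seconds the task actually took
    soft_limit: float  # the soft limit it was measured against
    exceeded: bool     # True means "report failure", yet the task still finished

    @property
    def overrun(self) -> float:
        # How far past the soft limit the task went (0.0 if it stayed under).
        return max(0.0, self.elapsed - self.soft_limit)


def run_with_soft_timeout(task: Callable[[], None], soft_limit: float) -> SoftTimeoutResult:
    """Run task to completion and only record whether it passed soft_limit."""
    start = time.monotonic()
    task()  # never interrupted, unlike a hard timeout
    elapsed = time.monotonic() - start
    return SoftTimeoutResult(elapsed, soft_limit, elapsed > soft_limit)
```

The `overrun` value is what would tell a ruck whether a small bump is enough or the job is far beyond its budget.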
*** ykarel is now known as ykarel|afk | 10:19 | |
*** ykarel|afk has quit IRC | 10:25 | |
*** bogdando has joined #oooq | 10:42 | |
*** bogdando has quit IRC | 10:42 | |
*** ykarel|afk has joined #oooq | 11:09 | |
*** ykarel|afk is now known as ykarel | 11:10 | |
ykarel | marios, my test came out good https://review.rdoproject.org/r/#/c/24321/ | 11:10 |
ykarel | so next master run in 1 hour should not hit timeout in container build | 11:11 |
marios | ykarel: ack thanks but what did you change? | 11:11 |
marios | ykarel: i don't see depends-on at that test | 11:12 |
ykarel | marios, as said earlier the issue is on the infra side | 11:12 |
ykarel | that happened due to ppc jobs tags and component promotion pipeline in master | 11:12 |
marios | ykarel: but its pretty consistent. it has been timing out for almost a week now | 11:12 |
marios | https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-centos-7-master-containers-build-push | 11:13 |
ykarel | marios, so there were too many tags, i cleaned up tags older than 4 days for master | 11:13 |
ykarel | marios, if u notice timings ^^, u will find timings are increasing | 11:13 |
ykarel | from day by day | 11:13 |
marios | ykarel: ah ok, was there a change we can point to (for tags cleanup) or this is manual thing | 11:13 |
ykarel | marios, it happens automatically daily, but with so many tags being pushed due to ppc and component promotion, this cleanup was insufficient | 11:14 |
ykarel | marios, /me will post the findings on bug itself in some time | 11:15 |
ykarel | after the next periodic run | 11:15 |
marios | ykarel: ack ok | 11:15 |
ykarel | there are multiple things to be taken care of though, like dlrn api for component pipeline is different | 11:15 |
ykarel | will post everything that i found on bug, and then we will see how to take care | 11:16 |
ykarel | both temporary and permanent measure | 11:16 |
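The age-based cleanup described above (dropping tags older than 4 days) can be sketched as follows. This is a minimal illustration, not the actual registry cleanup job: `tags_to_delete` is a hypothetical helper, and the mapping of tag name to push time is assumed to have been fetched from the registry beforehand.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def tags_to_delete(tags: Dict[str, datetime], now: datetime, max_age_days: int = 4) -> List[str]:
    """Return the names of tags pushed strictly before the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    # A tag pushed exactly at the cutoff is kept; only strictly older ones go.
    return sorted(name for name, pushed in tags.items() if pushed < cutoff)


# Hypothetical tag names and push times, for illustration only.
now = datetime(2019, 12, 31, 11, 0, tzinfo=timezone.utc)
example_tags = {
    "hash-a": now - timedelta(days=6),
    "hash-b": now - timedelta(days=4, hours=1),
    "hash-c": now - timedelta(days=1),
}
stale = tags_to_delete(example_tags, now)
```

Passing `now` in explicitly keeps the selection reproducible, which helps when checking why a given run did or did not remove enough tags.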
*** epoojad1 is now known as epoojad1|afk | 11:35 | |
*** epoojad1|afk has quit IRC | 11:40 | |
*** bhagyashris has quit IRC | 12:35 | |
*** ratailor_ has quit IRC | 12:38 | |
*** udesale has quit IRC | 12:44 | |
*** sanjayu_ has quit IRC | 13:17 | |
ykarel | kopecmartin, can u check https://bugs.launchpad.net/tripleo/+bug/1857365 | 13:19 |
openstack | Launchpad bug 1857365 in tripleo "Queens, Rocky fs020 tempest test are failing ServersOnMultiNodesTest and TestSecurityGroupsBasicOps" [Critical,Triaged] | 13:19 |
ykarel | possibly related to tempestconf bump in queens/rocky | 13:19 |
ykarel | specifically related to the commit that added min_compute_nodes param | 13:19 |
*** rfolco has joined #oooq | 14:02 | |
*** saneax has joined #oooq | 14:12 | |
*** surpatil has quit IRC | 14:18 | |
*** epoojad1 has joined #oooq | 14:33 | |
ykarel | marios, added comment re. container build issue https://bugs.launchpad.net/tripleo/+bug/1857884/comments/2 | 14:59 |
openstack | Launchpad bug 1857884 in tripleo "centos-7-master-containers-build-push time out pushing containers" [Critical,Triaged] | 14:59 |
ykarel | let me know in case something is not clear | 14:59 |
marios | ykarel: thanks checking zbr|ruck fyi that master build tags issue see ^^ | 15:02 |
zbr|ruck | dogfooding docker registry should come with some warnings about choking on some bones | 15:04 |
*** ykarel is now known as ykarel|away | 15:21 | |
*** epoojad1 has quit IRC | 15:35 | |
weshay | marios++ ykarel++ | 15:49 |
marios | o/ weshay | 15:54 |
weshay | hey brotha | 15:55 |
marios | weshay: np noticed this morning that master was dying for a week now and that push job was timing out for days | 15:55 |
weshay | aye.. looks like a registry cleanup fixed it? | 15:55 |
marios | weshay: initially proposed a timeout bump but ykarel|away helped with the reg cleanup | 15:55 |
marios | weshay: well he proved it in test project but we didn't see a periodic yet | 15:55 |
weshay | also have other rdo issues.. jobs hitting retry_limit | 15:56 |
marios | weshay: not sure if there is something else wrong there though... | 15:56 |
marios | right | 15:56 |
marios | weshay: i mean i thought it would have run already | 15:56 |
marios | but | 15:56 |
marios | https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-master | 15:56 |
marios | only once today :/ | 15:56 |
marios | so something else is up | 15:56 |
weshay | not much we can do if infra isn't working quite right atm | 15:56 |
weshay | I'll write something up later today.. I have a quick doctors app this morning.. will be back in about an hour | 15:57 |
weshay | marios, thanks for the tip re: the visa | 15:57 |
marios | weshay: ack i am going to go in a bit ... back on friday... maybe on thursday not sure yet :D depends how crazy the kids make me | 15:58 |
weshay | aye :) | 15:58 |
weshay | thanks man | 15:58 |
marios | weshay: np doing my job :) but I'll take the thanks | 15:58 |
weshay | train is promoting often.. which is awesome.. to be expected that master blows up | 15:58 |
marios | weshay: yes, i started typing that and stopped to not jinx it | 15:59 |
marios | thanks weshay ! | 15:59 |
marios | ;) | 15:59 |
weshay | heh | 16:02 |
rfolco | zbr|ruck, quick question... assert: that: is not working, do you know what is wrong ? I tried without {{ }} and with it. http://logs.rdoproject.org/99/24199/5/check/periodic-tripleo-centos-7-master-component-compute-promote-to-current-tripleo/d2ef778/job-output.txt | 16:13 |
zbr|ruck | rfolco: i did not use assert myself, i used fail with when. | 16:13 |
rfolco | zbr|ruck, good idea | 16:14 |
zbr|ruck | i can only test in an isolated playbook to see how it works. | 16:14 |
rfolco | zbr|ruck, my local test works though, with and without {{ }} | 16:14 |
zbr|ruck | version of ansible different? | 16:14 |
rfolco | maybe | 16:14 |
zbr|ruck | rfolco: documentation is clear | 16:15 |
zbr|ruck | try to use only a list | 16:15 |
zbr|ruck | maybe it is a bug when it evaluates from a non-list | 16:15 |
zbr|ruck | i personally do not fancy seeing conditions inside strings, so i would rather prefer writing a list with one element | 16:16 |
rfolco | zbr|ruck, problem is that I only know if it works if I merge the change, coz local works | 16:16 |
rfolco | zbr|ruck, any issues with that: - reported_jobs.stdout | from_json | json_query(query) != [] | 16:17 |
zbr|ruck | without testing I would not be able to say, add a debug before | 16:17 |
rfolco | (it wasn't in a list btw) | 16:17 |
*** ykarel|away has quit IRC | 16:18 | |
zbr|ruck | rfolco: that condition is prone to false-positives | 16:19 |
rfolco | zbr|ruck, how? | 16:20 |
zbr|ruck | if jinja evaluates as "null" condition would be true. i suspect assert would evaluate to true too often? | 16:20 |
*** saneax has quit IRC | 16:20 | |
*** saneax has joined #oooq | 16:20 | |
zbr|ruck | but the best way to find out is to add a debug line before that and log the full evaluation, so we know what happens. | 16:20 |
rfolco | null is covered in a previous fail when check | 16:20 |
zbr|ruck | rfolco: add a debug line and lets merge it | 16:21 |
rfolco | ok | 16:21 |
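The false-positive concern raised above can be illustrated in plain Python. This is only a sketch of the comparison semantics, not the playbook itself: the function names are hypothetical, the JMESPath query is left out because it is not shown in the chat, and how Ansible and Jinja treat each input may differ from Python.

```python
import json


def parse_stdout(stdout):
    """Parse command output as JSON, returning None for empty or invalid input."""
    try:
        return json.loads(stdout)
    except (TypeError, json.JSONDecodeError):
        return None


def naive_check(query_result):
    # Same shape as `... != []`: anything that is not an empty list passes,
    # including None.
    return query_result != []


def strict_check(query_result):
    # Passes only for a real, non-empty list.
    return isinstance(query_result, list) and len(query_result) > 0


# Three inputs: empty output, an empty JSON list, and one made-up job entry.
cases = [("empty", ""), ("empty list", "[]"), ("one job", '[{"job_id": "example-job"}]')]
for label, stdout in cases:
    result = parse_stdout(stdout)
    print(label, naive_check(result), strict_check(result))
```

With the naive comparison a missing result counts as a pass, which is why a debug task that prints the full evaluated value before the assert is a reasonable next step.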
*** ykarel|away has joined #oooq | 16:28 | |
marios | happy new year folks see you later this week | 16:39 |
marios | o/ rfolco missed you today :D | 16:39 |
rfolco | o/ | 16:40 |
rfolco | ? | 16:40 |
marios | rfolco: just mean i thought you were out and it was just me and zbr|ruck | 16:40 |
rfolco | marios, atypical day | 16:41 |
marios | rfolco: yeah ... grafana has 97% pass rate for upstream jobs... so yeah ;) | 16:41 |
marios | very quiet | 16:41 |
marios | anyway have a good rest of week if you're working | 16:42 |
marios | happy new year either way ! | 16:42 |
marios | \o/ | 16:42 |
* marios out | 16:42 | |
rfolco | happy new year marios | 16:42 |
rfolco | zbr|ruck, https://review.rdoproject.org/r/24325 | 16:50 |
rfolco | please ? | 16:50 |
*** marios has quit IRC | 16:51 | |
zbr|ruck | commented | 16:54 |
zbr|ruck | printing only stdout may prove not-enough | 16:54 |
zbr|ruck | in fact stdout is likely already saved in zuul json file. | 16:54 |
rfolco | how about now | 17:12 |
rfolco | https://review.rdoproject.org/r/#/c/24325/ | 17:12 |
rfolco | zbr|ruck, ^ | 17:12 |
zbr|ruck | rfolco: ouch, that looks really ugly. i would probably want to dump criteria value to a file and test outside. | 17:14 |
rfolco | zbr|ruck, dump to a file? | 17:16 |
rfolco | this? json_query(query) | 17:16 |
zbr|ruck | rfolco: can we test this before merging? i cannot do a proper review just by looking at that code. | 17:17 |
rfolco | zbr|ruck, I can paste the local test if you like | 17:17 |
zbr|ruck | AFAIK you do not need credentials to read from DLRN, which means that this playbook could be encapsulated in a test, right? | 17:19 |
rfolco | zbr|ruck, my local test emulates the dlrnapi call w/ a sample json output... 3 test cases there: empty, [], and valid json response. | 17:26 |
rfolco | zbr|ruck, see http://paste.openstack.org/show/787971/ | 17:26 |
*** ykarel|away has quit IRC | 17:54 | |
*** sanjayu_ has joined #oooq | 18:10 | |
*** saneax has quit IRC | 18:13 | |
*** rfolco has quit IRC | 19:54 | |
*** d0ugal has joined #oooq | 20:32 | |
*** sanjayu_ has quit IRC | 20:32 | |
*** tosky has joined #oooq | 20:33 | |
*** d0ugal has quit IRC | 21:49 | |
*** tosky has quit IRC | 23:17 |