Friday, 2022-09-02

*** rlandy is now known as rlandy|out02:47
*** rcastillo|rover_ is now known as rcastillo03:56
*** chandankumar is now known as chkumar|ruck05:02
dpawlikdasm|off: hey, I'm starting to like influx, when I see such error :)05:06
*** ysandeep|out is now known as ysandeep05:22
ysandeephappy friday tripleo-ci o/05:34
ykarelError: invalid policy in \"/etc/containers/policy.json\": Unknown key \"keyPaths\"05:35
ykarelok it's known05:35
ykarelhttps://bugs.launchpad.net/tripleo/+bug/198850005:36
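[Editor's sketch, not part of the log] A minimal Python example of how one might check whether a containers policy file uses the `keyPaths` key named in the error above. Only the key name and the `/etc/containers/policy.json` path come from the log; the helper name `find_key` and the sample policy are hypothetical and exist purely for illustration.

```python
import json


def find_key(node, key, path=""):
    """Return the JSON paths at which `key` occurs anywhere inside `node`."""
    hits = []
    if isinstance(node, dict):
        for name, value in node.items():
            child = f"{path}/{name}"
            if name == key:
                hits.append(child)
            hits.extend(find_key(value, key, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_key(value, key, f"{path}/{index}"))
    return hits


# Hypothetical sample policy, NOT the file shipped by any distribution.
SAMPLE_POLICY = json.loads("""
{
  "default": [{"type": "insecureAcceptAnything"}],
  "transports": {
    "docker": {
      "registry.example.com": [
        {"type": "signedBy", "keyType": "GPGKeys",
         "keyPaths": ["/etc/pki/example-key"]}
      ]
    }
  }
}
""")

print(find_key(SAMPLE_POLICY, "keyPaths"))
```

To inspect a real node, one could load `/etc/containers/policy.json` with `json.load` and pass the result to `find_key` in the same way.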
sshnaidmykarel, huh, just was talking about it yesterday :)06:59
ykarelsshnaidm, so you noticed it faster than upstream CI :)06:59
sshnaidmykarel, yeah, because it broke before on RDO: https://review.rdoproject.org/zuul/builds?pipeline=github-check&skip=007:01
sshnaidmnot sure how that happened07:01
sshnaidmprobably image building delay07:01
ykarelmm but images i don't think have containers rpms installed07:02
ykarelokk it's because these jobs using centos mirrors07:03
ykarelhttps://logserver.rdoproject.org/68/468/36e6c58626941415ee7e715cdd836b99ac16fbfe/github-check/tripleo-ci-centos-9-standalone/2bf30d2/logs/undercloud/etc/yum.repos.d/quickstart-centos-appstreams.repo.txt.gz07:03
ykareland opendev mirrors refreshed last night, so they hit the issue late07:03
sshnaidmykarel, that explains07:07
sshnaidmbut why do these jobs not use rdo mirrors..? (afs?)07:07
dpawlikyou can check the date when it was rebuilt: https://nb01.opendev.org/images/07:08
dpawlikfor c9 seems that it was done today07:08
dpawlikthe rdo mirror is "binding" to opendev mirror. If issue is on opendev we also have 07:09
ykarelsshnaidm, using mirrors relies on /etc/ci directory created by mirror-info-fork role in rdo07:11
ykareland mirror-info role in upstream07:11
sshnaidmah, probably it has only mirror-info07:12
sshnaidmbecause it's same job exactly as in upstream07:12
*** arxcruz is now known as arxcruz|rover07:12
ykareli don't see that role is executed in those jobs07:12
arxcruz|roverchkumar|ruck hello 07:12
arxcruz|rovergood morning 07:12
chkumar|ruckarxcruz|rover: hello good morning07:12
arxcruz|roverchkumar|ruck want to sync? or wait for ronelle?07:13
*** jpena|off is now known as jpena07:36
*** ysandeep is now known as ysandeep|afk09:49
*** rlandy|out is now known as rlandy10:37
rlandyarxcruz|rover: chkumar|ruck: hi - anything we need to sync about?10:38
chkumar|ruckrlandy: o/ good morning10:39
chkumar|ruckrlandy: waiting on this https://review.opendev.org/c/openstack/tripleo-quickstart/+/855587 to clear our gate and master line10:39
arxcruz|roverrlandy not from my side, rerunning wallaby c8 and train 10:39
rlandyarxcruz|rover: yesterday - only fs001 was out on train10:39
arxcruz|roverrlandy yup10:39
arxcruz|roverrerunning it 10:39
rlandyif we had two diff set of tempest failures we can skip promote there10:40
rlandywe should do that before your EoD10:40
rlandyarxcruz|rover: got one other CIX for you ...10:40
chkumar|ruckrlandy: https://bugs.launchpad.net/tripleo/+bug/198851410:41
chkumar|ruckgate blocker10:41
chkumar|ruckhttps://bugs.launchpad.net/tripleo/+bug/198850010:41
rlandyarxcruz|rover: https://trello.com/c/3p8i2YdZ/2639-cixlp1982874tripleociproa-testcreateobjectwithtransferencoding-is-failing-on-tripleo-jobs10:41
rlandychkumar|ruck: do we have a workaround?10:42
chkumar|ruckrlandy: https://review.opendev.org/c/openstack/tripleo-quickstart/+/855587 will clear everything10:42
chkumar|ruckit is as usual centos stream mess10:42
chkumar|ruckthey updated the config but forgot to put the file10:43
chkumar|ruckand shipped the code10:43
chkumar|ruckit broke our stuff10:43
rlandychkumar|ruck: ok - thanks - voted there - pls merge when ready10:43
chkumar|ruckwaiting for zuul +110:43
rlandyarxcruz|rover: hi - pls see christian's comments on that card 10:43
rlandyarxcruz|rover: pls review and let's decide how to go on this one10:45
rlandyysandeep|afk: chkumar|ruck: forwarding you eoghan's response10:46
rlandywe are a no on vienna10:46
rlandybhagyashris: hi - you around?10:48
rlandylet's touch base on 17.1 on 810:48
bhagyashrisrlandy, yes 10:48
bhagyashrisstandalone is passing now10:48
bhagyashrisrunning other jobs10:48
rlandybhagyashris: k ... https://meet.google.com/zgp-qyas-dxv?pli=1&authuser=010:48
arxcruz|roverrlandy sorry, i was lunching, i reply there, i'll check with gman what is the best action, maybe we should not update the urllib3 that might break other things 10:50
arxcruz|roverso better patch tempest, i'll create the patch 10:50
chkumar|ruckrlandy: thank you :-)10:52
rlandyarxcruz|rover++ thanks - let's close that out10:56
rlandychkumar|ruck: arxcruz|rover: downstream promos needs some love - as well as the tripleo component on rhos-1710:57
rlandysince ovb is functional again10:57
rlandychkumar|ruck: arxcruz|rover: the rest of the downstream components are cleaned up now10:57
rlandywhich is good10:57
rlandychkumar|ruck: arxcruz|rover: I'm kicking a rerun on 16.2 failed jobs now10:58
chkumar|ruckok10:59
arxcruz|roverok10:59
rlandywe should stagger the 17 and 17.1 reruns for ovb10:59
rlandysince bhagyashris is also running ovb jobs now for 17.1 on 811:00
arxcruz|roverperiodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train passes 11:03
rlandyarxcruz|rover: woohoo11:03
rlandyso you'll be clean on upstream promos11:03
*** ysandeep|afk is now known as ysandeep11:08
dviroelo/11:22
*** carloss is now known as carloss|afk11:28
rlandyysandeep; chkumar|ruck: I have a few minutes now if you want to discuss earlier?11:40
ysandeeprlandy, sure11:40
ysandeepchkumar|ruck, you around?11:41
rlandyforwarded one last email11:43
ysandeepthanks, received :) lets meet when chkumar|ruck is back11:46
rlandyyep11:46
chkumar|ruckysandeep: back11:49
ysandeepchkumar|ruck, rlandy meet.google.com/pbx-jpyt-uht11:50
arxcruz|roverbrb in 45 min11:54
chkumar|ruckrlandy: https://zuul.opendev.org/t/openstack/status#855587,12:21
chkumar|ruckrlandy: tripleo-ci-centos-9-undercloud-containers seems to be stuck https://zuul.opendev.org/t/openstack/stream/52ffc5d44ebd456e94bbbab7b740a1c0?logfile=console.log12:21
chkumar|ruck2022-09-02 10:19:19.112661 | TASK [Update all installed packages after new repos are setup]12:21
chkumar|ruckrlandy: we need to kill that job12:22
chkumar|ruckand or recheck it12:22
chkumar|ruckrlandy: will I update the review?12:22
chkumar|ruckor ask infra to do a force merge12:22
chkumar|ruckon update it is going to take 3+ hr to complete12:23
rlandyok12:25
*** dasm|off is now known as dasm13:00
dasmdpawlik | dasm|off: hey, I'm starting to like influx, when I see such error :) 13:00
dasmdpawlik: that was unexpected :) but entertaining13:01
dasmjm1[m]: monday doesn't work for me. Labor Day|Day off. I'm online now if you have some time to chat.13:01
*** carloss|afk is now known as carloss13:03
dasmjm1 ^13:03
dasmjm1 i might look through current tasks to group them, and what can and cannot be done in 2 weeks. After that, we can discuss what's gonna be the top priority.13:06
chkumar|ruckysandeep: can you add https://review.opendev.org/c/openstack/tripleo-quickstart/+/855587 +w and +2 so that infra can move it at the top of the queue13:11
ysandeepchkumar|ruck, looking13:11
ysandeepchkumar|ruck, done13:13
jm1dasm: in meetings, will ping you later13:29
chkumar|ruckrlandy: arxcruz|rover please keep an eye on this patch https://review.opendev.org/c/openstack/tripleo-quickstart/+/855587 to clear gate14:05
chkumar|ruckonce it merges please reply to the gate blocker email14:05
arxcruz|roveryes sir 14:05
chkumar|rucksee ya!14:05
chkumar|ruckhave a nice weekend !14:06
rlandychkumar|ruck: thanks14:07
rlandyhave a great weekend14:07
*** jpena is now known as jpena|off14:07
rlandyarxcruz|rover: pls ping when you are EoD14:10
rlandywill take over from there 14:10
arxcruz|roverok14:10
rlandyif patch has not merged14:10
rlandyyet14:10
rlandysorry - rotating meetings 14:10
rlandyuntil 5:30 utc14:10
*** ysandeep is now known as ysandeep|out14:34
jm1dasm: brief sync?14:56
dasmjm1: yup15:04
dasmjm1: https://meet.google.com/epy-ofdn-uin15:04
*** dviroel is now known as dviroel|lunch15:13
jm1dasm: really appreciate that you will focus on our infra :)15:36
* jm1 out for today, have a nice (long) weekend 🥂15:38
rlandyarxcruz|rover: hey - need help with anything?16:04
arxcruz|roverrlandy no, i'm just waiting for chandan's patch to get merged 16:08
rlandyk16:08
arxcruz|roverrlandy patch merged, i'm sending an email in response 16:16
arxcruz|roverdone 16:17
rlandyarxcruz|rover: perfect - thanks16:27
rlandydownstream reruns are in progress16:27
rlandyarxcruz|rover: so we need to rerun master/wallaby jobs?16:28
arxcruz|roverrlandy yes, i'll put it to run16:29
rlandyarxcruz|rover: can you look at what's going wrong with https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-integration-stable1-cs8&skip=016:29
rlandycontainer builds16:29
rlandytimed out last two days16:29
arxcruz|roverok16:29
*** dviroel|lunch is now known as dviroel16:41
arxcruz|roverrlandy i'll continue to check on monday regarding the push jobs17:01
arxcruz|roveri think it might be a problem with disk space17:01
arxcruz|roversince i'm seeing failing because it can't open the log file17:01
arxcruz|roverhttps://logserver.rdoproject.org/openstack-periodic-integration-stable1-cs8/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-build-containers-centos-8-quay-push-wallaby/21798d5/logs/build.log17:01
rlandyarxcruz|rover: thanks - I'm out then17:01
rlandybut we need to get this sorted17:01
rlandythank you17:01
arxcruz|rover2022-09-02 14:12:03 | 2022-09-02 14:12:03.485 39806 ERROR tripleo_common.image.builder.buildah.BuildahBuilder FileNotFoundError: [Errno 2] No such file or directory: '/home/zuul/container-builds/76cdb7b4-6477-4343-86c2-e420ccc6a236/base/os/nova-base/nova-compute/nova-compute-build.log'17:01
rlandycan you add a note to rr hackmd so chkumar|ruck knows you're looking at it?17:02
arxcruz|roverbut rlandy it seems chandan saw the same on c9 and has a patch for it https://review.opendev.org/c/openstack/tripleo-quickstart/+/855552 17:06
arxcruz|roverneed to add on his patch c8 as well 17:06
arxcruz|roverat least the failure seems similar 17:06
rlandyk17:08
rlandydasm; hey17:08
rlandyputting in a sync meeting for infra17:08
rlandy6pm utc ok?17:09
rlandylunch brb17:13
dasmrlandy: 6pm works for me17:47
rlandyk17:50
rlandydasm: https://meet.google.com/utv-xfkz-hot?pli=1&authuser=018:01
rlandy16.2 promoted18:43
dviroelyey19:00
dviroelrlandy: so, wrt fips image on rdo19:01
dviroelrlandy: patch merged, nodepool built the image and uploaded the image19:01
rlandysounds good19:01
dviroelI created a patch with nodesets19:01
rlandydviroel: great - need it merged?19:02
dviroelhttps://review.rdoproject.org/r/c/rdo-jobs/+/4473419:02
dviroelno19:02
dviroelit should work with depends-on right?19:02
rlandycorrect19:02
rlandyyou can test with depends19:02
dviroelhttps://review.rdoproject.org/r/c/testproject/+/44652/5/.zuul.yaml19:02
dviroelNODE_FAILURE19:02
dviroelam I missing a step?19:02
dviroelin this process19:03
rlandyone sec19:03
dviroelwill double check if image was uploaded19:04
rlandysorry - back19:11
* rlandy looks19:11
rlandyhmmm ... no logs19:13
rlandydviroel: can I rekick19:14
dviroelyes19:17
rlandyjob is still queued19:18
dviroelack19:21
rlandydviroel: is that what happened last time?19:23
rlandyor is zuul thinking about it more now?19:23
dviroelrlandy: yes, same thing19:23
dviroelthen I stop watching, and it failed :)19:24
rlandyyou'd have to ping rhos-ops19:24
rlandymaybe they can see the error19:24
rlandyI don't see it on our warnings19:24
dviroelack, will do19:24
rlandylooks like it's trying to provision the node and failing19:24
rlandybut I can't see why19:24
rlandywe don't have that access19:25
rlandydviroel: looks like node hang to me19:27
rlandybut I can't see why19:28
dviroelhow is still available now?19:31
dviroelwho*19:31
* dviroel needs coffee19:31
dasmrlandy: dviroel seems like yesterday's rr patch to mitigate issues with querying zuul is not working. dpawlik sent me some message last night. 19:31
dasmi just checked my internal irc19:31
dasmi should've done that earlier19:32
dviroeldasm: the workaround worked? but doesn't solve the issue?19:33
dasmdviroel: yes19:33
dviroelack19:33
dviroelneeds further investigation, maybe we need to debug on a dev environment19:33
dasmthe issue: we're hitting zuul very hard, every 30mins, in parallel19:34
dasmdo we have a zuul dev env?19:34
dviroelcockpit dev env19:34
dviroelbut you can create a zuul dev env too19:34
dasmwe're killing zuul, so only rhos-ops can tell us if that's affecting them19:34
dasmwe might need to do so19:34
dviroelthe problem will be to populate it19:34
dviroelor not, you can create script to trigger a bunch of noop jobs19:35
dviroeldasm: you can use zuul quickstart tutorial19:35
dasmactually i don't need a zuul19:35
dasmi can even have something else, just being queried.19:36
rlandydviroel: hmm ... now we are at 21 mins19:36
dviroel+119:36
dasmlike simple http service should be enough19:36
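[Editor's sketch, not part of the log] A rough idea of the "simple http service" mentioned above: a local stub that only counts the GET requests it receives, so a polling script could be pointed at it instead of a real Zuul. Everything here (the `CountingHandler` and `start_stub` names, the empty JSON body) is a hypothetical stand-in and does not reproduce any Zuul endpoint.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Answer every GET with an empty JSON object and count the hits."""

    hits = 0
    lock = threading.Lock()

    def do_GET(self):
        # Count first, so the total is current once the client has its reply.
        with CountingHandler.lock:
            CountingHandler.hits += 1
        body = b"{}"
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the stub quiet; the counter is the only output we care about.
        pass


def start_stub(host="127.0.0.1", port=0):
    """Start the stub in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After `server = start_stub()`, the chosen port is `server.server_address[1]`, and `CountingHandler.hits` shows how many queries arrived during a test run.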
dviroelrlandy: who is around in rhos-ops?19:36
rlandydasm: can we stop our collection19:36
rlandydviroel: nhicher19:36
dviroelack19:37
rlandytristan19:37
rlandyboth in canada19:37
dasmrlandy: yes, we can remove it from being collected. it will render cockpit unusable19:37
dasmrlandy: although, there is one more thing we might do. we might introduce random delay in the ruck_rover script itself, to avoid querying zuul19:38
rlandydasm:" we're hitting zuul very hard, every 30mins, in parallel"19:39
rlandycan we split that ^^?19:39
dasmrlandy: technically yes. by separating array of commands into different tasks.19:40
rlandyarxcruz|rover: ok - all downstream lines promoted19:40
rlandydasm: how hard is that? workable?19:42
dasmrlandy: lemme try doing that19:43
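[Editor's sketch, not part of the log] One way to picture "separating array of commands into different tasks": instead of firing every command at the same instant each 30 minutes, give each command its own start offset inside the interval. The `stagger_offsets` helper and the command labels are hypothetical; the real command list is not shown in the log.

```python
def stagger_offsets(commands, interval_s):
    """Spread commands evenly over one interval.

    Returns a dict mapping each command to its start offset in seconds.
    """
    if not commands:
        return {}
    step = interval_s / len(commands)
    return {cmd: round(index * step) for index, cmd in enumerate(commands)}


# Hypothetical placeholder labels standing in for the real collection commands.
offsets = stagger_offsets(["command-a", "command-b", "command-c"], 30 * 60)
for cmd, offset in offsets.items():
    print(f"{cmd}: start {offset}s into each 30-minute interval")
```

With three commands and a 1800-second interval, the starts land 600 seconds apart, so the target sees one query at a time rather than three in parallel.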
dviroeldasm: dev cockpit seems to be working19:44
dviroelare you using it to test?19:44
dasmdviroel: i got a message: "seems that patch is not working as expected ;/" nothing less, nothing more.19:45
dasmit's a separate thing from cockpit19:45
dasmtelegraf agent has something like19:46
dasm>collection_offset: Collection offset is used to shift the collection by the given interval. This can be used to avoid many plugins querying constraint devices at the same time by manually scheduling them in time.19:46
dasmbut i'm not sure yet if that applies to input.exec array19:46
rlandyhmm - not very expressive19:48
dasmnot really19:48
dviroelbtw, how many instances are running at the same time?19:48
dasmdviroel: what do you mean?19:48
dviroeldownstream, upstream19:49
dasmi believe just these two19:49
dviroeland the dev one that I just saw19:49
dasmif there is one, yes. then it's 319:49
dviroelyeah, make zuul life worse 19:50
dasmthat actually might be it20:00
dasm> collection_jitter: Overrides the collection_jitter setting of the agent for the plugin. Collection jitter is used to jitter the collection by a random interval.20:00
dviroelok, beer time. Have a great weekend team o/20:55
dasmdviroel: o/ have a good one20:55
*** dviroel is now known as dviroel|out20:56
dasmrlandy: https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44736 cc dviroel|out 21:03
dasmthat's the ultimate solution for telegraf. cc dpawlik 21:03
dasmin the future, I'm gonna rework it, to make it single rr command, to avoid bunch of duplication.21:03
dasmBut for now, that's something which is gonna introduce required jitter in the command execution.21:04
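[Editor's sketch, not part of the log] The idea of jitter discussed above, shown as a small Python wrapper: wait a random amount of time before running a task, so several collectors started together do not query the same service at the same moment. This illustrates the concept only; it is not the content of the linked review, and `jittered_delay` and `run_with_jitter` are hypothetical names.

```python
import random
import time


def jittered_delay(max_jitter_s, rng=random):
    """Pick a random delay between 0 and max_jitter_s seconds."""
    return rng.uniform(0, max_jitter_s)


def run_with_jitter(task, max_jitter_s, sleep=time.sleep, rng=random):
    """Sleep for a random delay, then run `task` and return its result.

    `sleep` and `rng` are injectable so the wrapper can be exercised
    without actually waiting.
    """
    sleep(jittered_delay(max_jitter_s, rng))
    return task()
```

For example, `run_with_jitter(collect, 120)` would start a hypothetical `collect` function somewhere within the next two minutes rather than immediately.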
* rlandy looks21:16
rlandydasm: going to vote there21:17
rlandybut will wait on dpawlik before merge21:17
rlandydasm; maybe ping him and chkumar|ruck to merge on monday since US is out21:17
dasmindeed21:17
rlandylooks like a good start21:18
dasmdpawlik: chkumar|ruck can you check that one above? to me it's logical. the same way like 2 prior changes ;)21:18
rlandymaybe we rethink in the longer term21:18
dasmdefinitely21:18
dasm"one small step for TripleO CI team, one giant leap for rhos ops (zuul) team"21:19
dasm;)21:19
* rlandy out21:22
* dasm signing off for today21:32
*** dasm is now known as dasm|off21:32
