Monday, 2021-06-28

*** marios is now known as marios|ruck05:16
jpodivinhuh, wrong chan05:25
marios|ruckneeds a workflow at Revert "Set T->U undercloud upgrade to non-voting"06:23
marios|ruckplease thank you 06:23
*** amoralej|off is now known as amoralej06:33
ysandeepmarios|ruck, ack checking06:54
*** jpena|off is now known as jpena06:57
marios|ruckthank you ysandeep 06:58
ysandeepall thanks to you for working with Alex on fixing that :D 07:00
anbanerj|rovermarios|ruck, Hey, good morning07:36
marios|ruckhello anbanerj|rover 07:37
*** ykarel|away is now known as ykarel07:46
arxcruzchandankumar: hey man, remind me, what was the naming we decided on the skiplist? rdo and osp right ?09:29
chandankumararxcruz: yes, rdo for upstream and osp for downstream09:30
anbanerj|roverhey marios|ruck, going afk for 20-30 mins09:30
* anbanerj|rover afk09:30
marios|ruckthanks anbanerj|rover 09:33
*** ykarel is now known as ykarel|lunch09:38
*** bhagyashris_ is now known as bhagyashris09:40
* anbanerj|rover back10:01
anbanerj|rovermarios|ruck, I am checking victoria and ussuri promotions 10:12
marios|ruckanbanerj|rover: thanks10:17
*** ykarel|lunch is now known as ykarel10:43
*** jpena is now known as jpena|lunch11:32
anbanerj|rovermarios|ruck, for ussuri looks good - it failed the sc010 jobs (octavia timeout) and only periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-ussuri11:43
*** amoralej is now known as amoralej|lunch11:47
rlandyysandeep: hi ...
marios|ruckanbanerj|rover: k thanks - i would say post a testproject for it but it won't help us until we fix (e.g. )11:51
rlandy^^ still seeing that in fs001 but only fs00111:51
marios|ruckanbanerj|rover: otherwise we can't override the dlrn_hash used/reported11:51
rlandyand fs035 pass11:51
anbanerj|rovermarios|ruck, ok, I'll keep the testP ready11:51
marios|ruckanbanerj|rover: actually you can check if tripleo-ci-testing has updated yet11:52
marios|ruckanbanerj|rover: i am looking at train going to try testproject there cos it didn't update yet i.e. 8d27e439b3c20b65f1ad51f1d9ab01c8 still the same as the one used in the failed job11:53
marios|ruckanbanerj|rover: you can check for ussuri ^^ 11:53
anbanerj|rovermarios|ruck, no it has already updated. There are more failing jobs in the latest hash. Lemme testP those instead11:54
marios|ruckanbanerj|rover: k 11:56
ysandeeprlandy, checking11:57
marios|ruckanbanerj|rover: sshnaidm: ysandeep: can you please add this to your reviews
marios|ruckanbanerj|rover: thanks just saw you voted 11:58
anbanerj|rovermarios|ruck, yep done11:58
rlandyysandeep: no worries - I am commenting on the card11:59
rlandywill bring up findings at CIX11:59
ysandeeprlandy, ack11:59
rlandyI wanted to know if it was just a fs001 thing but the bug was logged against BM - so no11:59
ysandeeprlandy: I am kind of confused with card because that bug was for - Couldn't resolve host name for rhos-release-latest , the log you shared is failing on provisioning.. 12:08
ysandeepmarios|ruck, ack 12:09
rlandyysandeep: yeah - idk - but we still see that error12:11
ysandeeprlandy: sry, which error? 12:11
rlandyysandeep: the critical error is actually below: dnf.exceptions.RepoError: Unknown repo: 'delorean-*-deps'12:12
rlandy2021-06-27T17:55:47-0400 CRITICAL Error: Unknown repo: 'delorean-*-deps'12:12
rlandyRuntimeError: Curl error (6): Couldn't resolve host name for [Could not resolve host:]12:12
ysandeeprlandy, now i remember.. weshay|ruck got side tracked initially- actually that's unrelated error to issue... issue was with rhosp-release not with rhos-release12:14
ysandeep2021-04-27 10:36:44 | 2021-04-27 10:36:44.549530 | 000c523b-c9aa-2e5d-582e-000000000227 |      FATAL | Deploy release version package | overcloud-controller-2 | error={"changed": false, "failures": ["No package rhosp-release available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []12:14
rlandyysandeep: I think we should close that card and log something new12:14
ysandeepfix is for rhosp-release
rlandyyep - that is what the bug says12:15
rlandyjoining CIX to discuss12:15
ysandeeprlandy: true, if fs035 passes, then original issue is already fixed12:15
*** jpena|lunch is now known as jpena12:29
*** marios|ruck is now known as marios|ruck|call12:30
weshay|ruckmarios|ruck|call, probably need to add more branches
weshay|ruckmarios|ruck|call, let's see where else this is hitting12:55
pojadhavfolks, pls review :
bhagyashrischandankumar, akahat pojadhav marios|ruck|call rlandy zbr soniya ysandeep scrum time ...13:00
bhagyashrisweshay|ruck, ^13:00
marios|ruck|callweshay|ruck: yes it will need branches because see links in 13:04
marios|ruck|callweshay|ruck: ussuri victoria wallaby @ ^^ 13:05
*** amoralej|lunch is now known as amoralej13:08
marios|ruck|callweshay|ruck: fixing now13:09
weshay|ruckarxcruz, link please13:19
weshay|ruckarxcruz, don't see 13:20
akahatchandankumar, have you pulled patch on the promoter server?13:20
chandankumarakahat: nope13:21
chandankumarakahat: you forgot to add the release hack13:21
chandankumarakahat: can you add that one13:21
weshay|ruckarxcruz, ?13:21
akahatchandankumar, ack13:22
ysandeepweshay|ruck, rlandy when you have time - I would like to merge the revert till dual voting is fixed: 13:23
weshay|ruckarxcruz, 13:34
ysandeepweshay|ruck, I see an email from you on Friday, Do you want to chat now?13:37
weshay|ruckysandeep, sure13:38
* anbanerj|rover lunch14:10
weshay|ruckmarios|ruck|call, you still on a call?14:11
weshay|ruckmarios|ruck|call, so I think we need two bugs14:11
weshay|rucklooks at
weshay|ruckmarios|ruck|call, you are linking to periodic-tripleo-ci-centos-8-scenario002-standalone-victoria, which means the jobs value must be changed14:12
marios|ruck|callweshay|ruck: no :)14:15
*** marios|ruck|call is now known as marios|ruck14:15
marios|ruckweshay|ruck:  my point is the luks one is the consistent one14:15
*** ysandeep is now known as ysandeep|afk14:15
marios|ruckweshay|ruck: the others are only seen in some jobs, like that cryptsetup one 17:11 < weshay|ruck> looks at 14:16
weshay|ruckmarios|ruck, ya.. so let's skip luks.. across branches and jobs14:17
marios|ruckweshay|ruck: k let me update to do jobs []14:17
weshay|ruckthe other tempest failures... in some of those jobs.. would need a new bug14:18
weshay|ruckmarios|ruck, [] is correct14:18
marios|ruckweshay|ruck: yeah but only if we start seeing those consistently 14:18
marios|ruckweshay|ruck: i mean for 'new bug' 14:18
weshay|ruckright.. oh14:18
weshay|ruckand let me get health link14:18
marios|ruckweshay|ruck: but will make the jobs [] for now sec 14:18
weshay|ruckfor upstream at least14:19
weshay|ruck\s/current_test/test_ur_interested in14:19
weshay|ruckfor example14:20
weshay|ruckno fails.. only 5 runs.. so 14:20
marios|ruckweshay|ruck: heh i was just trying to copy paste in the url :)14:20
weshay|ruckwish we had this in rdo.. 14:20
weshay|ruckso helps a little14:20
marios|ruckweshay|ruck:  The last data in the subunit2sql database, from "2021-06-23T16:14:32.000Z", is >1 day old. There might be an issue with result collection. 14:20
marios|ruckweshay|ruck: are we running that? 14:20
marios|ruckweshay|ruck: yes i think we are this is the openstack health that sorin et al were working on 14:21
weshay|ruckthat's upstream infra14:21
weshay|ruckwe're not yet running a local version of this part14:22
marios|ruckweshay|ruck: but this is what they are planning to decommission?14:22
marios|ruckweshay|ruck: or was that logstash 14:22
weshay|ruckya.. so we may not even run those other two tests upstream14:23
weshay|ruckand could just be periodic integration/component14:23
*** ysandeep|afk is now known as ysandeep14:44
marios|ruckweshay|ruck: o/14:54
marios|ruckwill reach out to them if they don't respond by tomorrow to my ping at
weshay|ruckk.. marios|ruck can we prepare a skip today and wf-114:55
weshay|rucktomorrow we will be 8 days out14:55
marios|ruckweshay|ruck: we have one 14:55
weshay|ruckoh.. perfect14:55
marios|ruckweshay|ruck: see
weshay|ruckk.. /me looks14:55
marios|ruckweshay|ruck: well we don't want to merge it/there are more tests potentially14:55
marios|ruckweshay|ruck: see comment #5 from today for example 14:56
weshay|ruckzbr, anbanerj|rover you folks have a sec?14:57
weshay|ruckactually give me 30 min14:58
anbanerj|roverweshay|ruck, yep sure14:58
*** dviroel is now known as dviroel|lunch15:07
rlandyysandeep: merging
ysandeeprlandy, thanks!15:11
weshay|ruckrlandy, did sunil make onto irc yet?15:25
weshay|ruckanbanerj|rover, ok.. want to sync up?15:25
rlandyweshay|ruck: doesn't look like it15:26
anbanerj|roverweshay|ruck, yep15:26
rlandyweshay|ruck: will check in with him in a bit15:27
weshay|ruckzbr, can we merge ?15:33
weshay|ruckanbanerj|rover, 15:39
weshay|rucknd versioned identity endpoints when attempting to authenticate. Please check that your auth_url is correct. Unable to establish connection to https://[2001:db8:fd00:1000::5]:13000: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13000):15:39
rlandychandankumar: akahat: looking at
rlandyimage promotion patch15:52
rlandychandankumar: akahat: ready to review/try out?15:52
*** ykarel is now known as ykarel|away15:52
rlandysee test failures15:53
*** amoralej is now known as amoralej|off15:53
akahatrlandy, yes. we can try it out15:54
rlandyakahat: k - I don't have a promotable hash atm15:55
rlandybut chandankumar has a test hash 15:55
rlandyotherwise can ping when we have a workable hash15:55
akahatrlandy, okay. will wait for the candidate hash.15:56
*** ysandeep is now known as ysandeep|away15:56
rlandychandankumar: ^^ let us know if you agree or if you want to try promote to a fake named hash15:57
rlandyanbanerj|rover: hey - just FYI - looking at open ER patches 15:59
marios|ruckanbanerj|rover: weshay|ruck: o/ me off in a couple mins15:59
*** jpena is now known as jpena|off16:00
*** dviroel|lunch is now known as dviroel16:00
weshay|ruckmarios|ruck, k.. need me to follow anything?16:02
marios|ruckweshay|ruck: fyi that ran green so train should promote if you wana check that 16:05
marios|ruckweshay|ruck: info in the gchat 16:05
marios|ruckweshay|ruck:  dlrn_hash_tag: 8d27e439b3c20b65f1ad51f1d9ab01c816:05
weshay|ruckah nice16:07
weshay|ruckmarios|ruck, it's promoting now :)
marios|ruckweshay|ruck: nice ... that's my queue 16:08
marios|ruckhave a good one o/16:08
*** marios|ruck is now known as marios|out16:09
chandankumarrlandy: in the new image promotion patch, we have added more logging not much improvement there16:28
rlandychandankumar: k - so I'll follow what you and akahat want to do here16:31
rlandywe can carry on with manual promotions - we are not blocked 16:31
rlandyit's just as you want me to test along with you16:31
chandankumarcurrently cherry-picked that patch16:32
rlandybhagyashris: ysandeep|away: weshay|ruck: added new work item for sprint 47 on hackmd - may be more DF focused - can discuss tomorrow16:37
rlandychandankumar: k- let's try another promotion tomorrow16:38
weshay|ruckrlandy, metalsmith and cephadm work should be done, but would be smart to review it16:39
rlandybe done (by DF?)16:39
weshay|ruckold items for us... new for qe unfortunately16:39
weshay|ruckrlandy, no as in the work is done.. it's in ci16:40
rlandyoh  should be already done16:40
rlandygot it16:40
weshay|ruckrlandy, let's add a work item to review and socialize both items.. because it's not very socialized16:42
rlandyand make sure we have all the 17 settings right16:43
weshay|ruckrlandy, you've seen both things :)16:45
weshay|ruckbut probably didn't connect the dots16:45
weshay|ruckand no working overcloud deployments in .... 4 months? maybe w/ 1716:48
weshay|ruckmakes a person kind of forget16:49
rlandyovercloud deployments are working now - so we can just kill the item16:49
rlandyno need to socialize it16:49
weshay|ruckrlandy, they are working?16:51
weshay|ruckin 17?16:51
chandankumarrlandy: Ok, I tested the image server code
chandankumarcommented on the patch what needs to be changed,16:52
rlandyweshay|ruck: ^^16:52
chandankumarnow it is working fine16:52
chandankumarrlandy: you can try  now with a working hash16:53
rlandychandankumar: k - don't have a working hash as of now16:53
rlandymaybe tomorrow16:53
chandankumarI have tested it against wes-current-tripleo16:53
rlandyworking is an interesting question ... not as of now ...
weshay|ruckrlandy, I wouldn't say it's "fine"
weshay|rucklol.. ya the config is there16:54
chandankumarrlandy: please comment on this patch to improve logging of qcow promotion16:54
chandankumarwhen free,thanks!16:55
rlandyweshay|ruck: yeah - well that is what I commented in the meeting this morning16:55
rlandyfs020 ok16:55
rlandyso fs001 may need some work16:56
rlandywhatever - can discuss tomorrow16:56
anbanerj|roverweshay|ruck, testP for victoria fs039 had hit "Failed to attach network adapter device" bug17:12
anbanerj|roverfs035 simply timed out without reason, rerunning it17:12
anbanerj|roverFor ussuri I could not run it against the old hash even with depends on
eagleswhat's the correct way to enable a tempest plugin (designate) to run only on scenario 3 for tripleo master?17:20
weshay|ruckrlandy, re: eagles question..  we have not yet consolidated the tempest config yet right?17:39
rlandyweshay|ruck: card is still in progress17:41
weshay|ruckeagles, atm.. scenario003's tempest config is here:
weshay|ruckeagles, is it running in another job you want to turn off?17:41
eaglesno. I was trying to enable it for scenario 3
eaglesi created this patch to trigger s003 and it didn't seem to run any designate tests. It occurred to me that the  plugin might've been blacklisted - or I might simply have the syntax wrong for enabling it17:43
weshay|ruckeagles, that's centos-7 is that what you want?17:44
weshay|ruckdoubt it17:44
weshay|ruckeagles, we're killing stein17:44
eaglesprobably why lol thanks17:47
weshay|ruckeagles, see pm.. maybe ur taking care of the concern17:48
weshay|ruckrlandy, so.. metalsmith *is* working in 17 :)
weshay|rucknice job getting the overcloud deployed17:53
weshay|ruck35 :)17:53
rlandyweshay|ruck: right - fs001 is another issue17:53
rlandyfs035  and fs02017:53
rlandyare doing fine17:53
weshay|ruckaye.. nice nice17:53
eaglesthanks weshay|ruck, rlandy 17:53
weshay|ruckok.. so metal smith we're good17:53
rlandyweshay|ruck: the discussion started around ceph17:53
rlandyand where we are deploying that17:54
weshay|ruckrlandy, scenario001/004 should have cephadm17:54
rlandysince we are ceph517:54
rlandyand qe is ceph417:54
rlandythat's kind of how we got into this17:54
rlandyalso 16.2 vs 17 testing17:54
rlandyand what's covered in what17:54
weshay|ruckrlandy, meh17:54
weshay|ruckrlandy, we could consider adding ceph to an ovb job.. but let's not worry about it now17:55
rlandyweshay|ruck: so really we are good from our side17:55
rlandyweshay|ruck: ^^ ack17:55
weshay|ruckrlandy, ya.. this all went down like 6 months ago :)17:55
rlandyweshay|ruck: except that 17 never worked17:55
rlandyuntil nowish17:55
rlandyso upstream is fine17:55
rlandybut we just got 17 into the mix17:56
rlandyit's not really our discussion17:56
rlandyit's QE's17:56
rlandyexcept where we have to show why our tests are passing and QEs are failing on the same dlrn hash17:56
rlandywe're testing different things17:57
rlandythat's all17:57
rlandyweshay|ruck: either way, I removed the item from the sprint discussion17:58
rlandythe setting should be right now for 17 - it's just fs001 that needs to be fixed/debugged17:58
rlandychandankumar: ok -this upcoming 16.2 should be a promotable hash17:59
weshay|ruckk.. /me adding to that now17:59
eaglesanother question - is there a way to easily query for the periodic jobs that are timing out? tempest failures are sort of a different beast I think18:01
rlandyyep - getting link18:03
rlandyso if the jobs are not zuul timing out - only tempest timing out ... is probably best18:06
rlandyor better18:07
eaglesrlandy ah cool thanks18:08
rlandygranted there could be other failures in there18:08
rlandyeagles: ^^ approximation :)18:08
eaglesrlandy: yeah, I'm also looking for just timed out deployments but I can see how that works18:08
eaglesoh.. hmm doesn't actually say timed out if it was this kind of failure lol. hunting we will go18:16
weshay|ruckdepends if it's a zuul timeout or tripleo timeout.. both have timers18:17
rlandyasync task did not complete within the requested time - 5700s18:31
rlandy^^ message to search for18:31
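[editor's note] The search rlandy describes can also be run against a locally downloaded job log. The following is a minimal sketch, assuming the log is available as plain text lines; the function name `find_async_timeouts` is hypothetical and not part of any tripleo tooling. Only the message text comes from the chat.

```python
import re
from typing import Iterable, Iterator, Tuple

# Message quoted in the chat; the trailing number is the timeout in seconds.
ASYNC_TIMEOUT = re.compile(
    r"async task did not complete within the requested time - (\d+)s"
)


def find_async_timeouts(lines: Iterable[str]) -> Iterator[Tuple[int, int]]:
    """Yield (line_number, timeout_seconds) for every line with the message.

    Hypothetical helper for illustration only.
    """
    for number, line in enumerate(lines, start=1):
        match = ASYNC_TIMEOUT.search(line)
        if match:
            yield number, int(match.group(1))
```

Passing an open file object works as well, since iterating a text file yields its lines.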
eaglesrlandy: ack thanks!18:32
rlandyweshay|ruck: - still the correct health location to check?18:47
anbanerj|roverrlandy,  yes ^ is correct. But I found a lot of strings are not getting a hit in the logstash18:53
rlandyanbanerj|rover: have time to meet up?19:03
anbanerj|roverrlandy, sure, but even I am trying to find out why there are no hits19:04
weshay|ruckanbanerj|rover, take a look there19:12
weshay|ruckrlandy, yes19:12
weshay|ruckanbanerj|rover, rlandy message:"async task did not"19:22
weshay|ruckeagles, fyi!(),refreshInterval:(pause:!t,value:0),time:(from:now-3w,to:now))&_a=(columns:!(build_name),filters:!(),index:logstash,interval:auto,query:(language:kuery,query:'message:%22async%20task%20did%20not%22'),sort:!())19:29
weshay|ruckso.. we could open a new bug.. probably and reference the old one19:29
weshay|ruckand we can start digging19:29
anbanerj|roverrlandy, weshay|ruck
weshay|ruckrlandy, anbanerj|rover*scenario010.*%2F&from=now-30d&to=now19:34
weshay|ruckI don't see that async error upstream19:37
weshay|ruckanbanerj|rover, rlandy message:"async task did not complete within the requested time" tags: console19:43
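[editor's note] The `message:"..."` phrase queries above appear percent-encoded in the link weshay|ruck shared (`message:%22async%20task%20did%20not%22`). A minimal sketch of producing that encoding with the Python standard library follows; the helper name `encode_kuery` is hypothetical, and the base URL of the logstash instance is deliberately left out because it is not in the log.

```python
from urllib.parse import quote


def encode_kuery(field: str, phrase: str) -> str:
    """Percent-encode a field:"phrase" query for use inside a URL.

    Hypothetical helper. The colon is kept literal to match the
    encoding seen in the shared link; quotes and spaces are escaped.
    """
    query = f'{field}:"{phrase}"'
    return quote(query, safe=":")


print(encode_kuery("message", "async task did not"))
```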
rlandyanbanerj|rover: I can't edit
rlandysent request to be able to edit20:22
anbanerj|roverrlandy, I think bhagyashris or zbr has edit rights20:28
rlandyanbanerj|rover: ok - will make a list elsewhere until I have edit rights20:55
eaglesrlandy: weshay|ruck: there is still the problem that designate_tempest_plugin is still in the tempest_excludelist.tx which seems to supersede the includelist. What's the proper way to remove the designate_tempest_plugin from the exclude list?21:57
* weshay|ruck looks.. but got log?21:59
rlandyare we talking about ^^22:05
weshay|ruckeagles, where do you see it in excludes22:05
rlandyjob log would help22:06
rlandybut I see it in skip22:06
weshay|ruckoh.. we can nuke it from the skip22:06
weshay|ruckrlandy,         lp: ''22:06
weshay|ruckya.. rlandy++22:07
weshay|ruckthat's probably what he meant22:07
rlandyI know what he means22:07
rlandyexclude list shows in the job tempest log22:07
rlandygetting :)22:07
weshay|ruckeagles, which branch?22:08
rlandynow of course I can't find a good example22:10
rlandybut basically yeah, I assume if it's skipped, it would be in excludes22:10
rlandyweshay|ruck: I'll put in a review to nuke that? not sure why it was there to begin with22:12
rlandylet arxcruz review22:12
rlandymark it w-1 for the moment22:12
rlandylet eagles test with it22:12
rlandysec ...22:12
weshay|ruckit's very old22:12
eaglessorry stepped afk for a sec - 1s22:14
eaglesthe patch is
rlandyeagles: sec - submitting patch you can try depends-on22:16
weshay|ruckkid duty 0/22:17
rlandyeagles: pls try a depends-on:
rlandymarked it w-1 for the moment - as I want the tempest gurus to approve this22:21
rlandybut it should spring you free22:21
eaglesrlandy: cool! thanks 10^622:21
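
[editor's note] The behaviour eagles reports, an exclude list that seems to supersede the include list, can be pictured with the sketch below. It only illustrates that precedence; it is not the tempest or tripleo implementation, and `select_tests` and the test ids in the usage lines are hypothetical.

```python
import re
from typing import Iterable, List


def select_tests(
    tests: Iterable[str], include: List[str], exclude: List[str]
) -> List[str]:
    """Keep tests matching an include pattern unless an exclude pattern hits.

    Illustration only: an exclude match always wins, so a plugin listed
    in both lists never runs.
    """
    include_res = [re.compile(pattern) for pattern in include]
    exclude_res = [re.compile(pattern) for pattern in exclude]
    selected = []
    for name in tests:
        if include_res and not any(p.search(name) for p in include_res):
            continue  # not requested by the include list
        if any(p.search(name) for p in exclude_res):
            continue  # exclude list takes precedence over include
        selected.append(name)
    return selected


# Hypothetical test ids, for illustration only.
tests = ["designate_tempest_plugin.tests.example", "tempest.api.example"]
print(select_tests(tests, ["designate_tempest_plugin"], ["designate_tempest_plugin"]))
print(select_tests(tests, ["designate_tempest_plugin"], []))
```

Under this model, dropping the plugin from the exclude (skip) list, as rlandy's w-1 patch proposes, is what lets the included tests through.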
