hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-fedora-28-standalone, tripleo-ci-centos-7-scenario010-multinode-oooq-container @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp- (2 more messages) | 00:12 |
---|---|---|
*** vinaykns has quit IRC | 01:39 | |
*** rascasoft has joined #oooq | 01:46 | |
*** gouthamr has quit IRC | 01:48 | |
*** hubbot1 has quit IRC | 01:49 | |
*** dmellado has quit IRC | 01:50 | |
*** rascasoft has quit IRC | 01:50 | |
*** gouthamr has joined #oooq | 02:50 | |
*** dmellado has joined #oooq | 02:55 | |
*** hubbot1 has joined #oooq | 02:58 | |
*** apetrich has quit IRC | 03:15 | |
*** dsneddon has quit IRC | 03:19 | |
*** dsneddon has joined #oooq | 03:35 | |
*** skramaja has joined #oooq | 03:36 | |
*** dsneddon has quit IRC | 03:40 | |
*** saneax has joined #oooq | 04:05 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-fedora-28-standalone, (4 more messages) | 04:13 |
*** dsneddon has joined #oooq | 04:15 | |
*** skramaja_ has joined #oooq | 04:17 | |
*** udesale has joined #oooq | 04:18 | |
*** skramaja has quit IRC | 04:18 | |
*** dsneddon has quit IRC | 04:20 | |
*** ykarel|away has joined #oooq | 04:31 | |
*** ykarel|away is now known as ykarel | 04:31 | |
*** raukadah is now known as chandankumar | 04:34 | |
*** dsneddon has joined #oooq | 04:51 | |
*** dsneddon has quit IRC | 04:57 | |
*** udesale has quit IRC | 05:22 | |
*** ratailor has joined #oooq | 05:23 | |
*** dsneddon has joined #oooq | 05:30 | |
*** skramaja_ is now known as skramaja | 05:36 | |
*** dsneddon has quit IRC | 05:40 | |
*** dsneddon has joined #oooq | 06:10 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-fedora-28-standalone, (4 more messages) | 06:13 |
*** jtomasek has joined #oooq | 06:16 | |
*** ccamacho has quit IRC | 06:21 | |
quiquell|off | https://review.openstack.org/#/c/640749 | 06:54 |
quiquell|off | chandankumar: commented | 06:55 |
*** jfrancoa has joined #oooq | 06:58 | |
quiquell|off | marios: all the reviews with prio are -1? | 07:04 |
*** quiquell|off is now known as quiquell | 07:08 | |
marios | quiquell: they are? | 07:11 |
marios | quiquell: which ones you mean? | 07:11 |
marios | quiquell: ah maybe you mean /#/c/19065/ /r/#/c/19026/ cos they were squished into https://review.rdoproject.org/r/#/c/18975/ | 07:13 |
marios | quiquell: i'll update the relevant taiga cards | 07:13 |
marios | quiquell: or you mean something else | 07:13 |
quiquell | marios: na was just that | 07:18 |
quiquell | going to review taiga first | 07:18 |
quiquell | marios: so next one would be the switch https://review.rdoproject.org/r/#/c/19066/ | 07:20 |
quiquell | marios: is that it ? | 07:20 |
marios | quiquell: well yeah but woah there we need to check everything ok. containers pushed, with the ci-squad-testing tag and then that the built containers are OK | 07:21 |
quiquell | marios: humm there is no run of new job https://softwarefactory-project.io/zuul/t/rdoproject.org/builds?job_name=periodic-tripleo-fedora-28-master-containers-build-push | 07:22 |
quiquell | It has been merged like 16 hours ago | 07:22 |
marios | quiquell: thanks will check in bit... hope it isn't a problem/nit there we'll see | 07:22 |
marios | quiquell: but yeah we have couple days to test things and hopefully switch this week would be great but lets see | 07:22 |
quiquell | marios: cool cool, not bad | 07:23 |
quiquell | :-) | 07:23 |
quiquell | marios: we will need a red ribbon and a pair of scissors | 07:23 |
marios | :D | 07:23 |
quiquell | marios: let me debug why is not running | 07:23 |
marios | quiquell: k thanks | 07:24 |
marios | going to some reviews first | 07:24 |
quiquell | yep | 07:24 |
quiquell | marios: there are node failures at the first job periodic-tripleo-centos-7-master-promote-consistent-to-tripleo-ci-testing | 07:27 |
quiquell | marios: going to ask around | 07:27 |
marios | quiquell: there was rdo outage last night | 07:28 |
marios | quiquell: i saw on mailing list maybe related | 07:28 |
quiquell | marios: yep have to be it | 07:29 |
marios | chandankumar: sorry i didn't get to https://review.openstack.org/#/c/604311/ on time it has votes already (i promised yesterday i'd check) same quiquell for https://review.rdoproject.org/r/19064 was gonna look again | 07:30 |
chandankumar | marios: quiquell \o/ | 07:31 |
chandankumar | updating the patches :-) | 07:31 |
*** kopecmartin|off is now known as kopecmartin | 07:34 | |
quiquell | marios: we should see a new run in an hour or so I think | 07:35 |
*** quiquell is now known as quiquell|brb | 07:35 | |
marios | thanks quiquell|brb lets see | 07:42 |
*** apetrich has joined #oooq | 07:43 | |
sshnaidm | marios, quiquell|brb take a look at something trivial please: https://review.openstack.org/#/c/640722 | 07:47 |
marios | sshnaidm: no | 07:49 |
*** udesale has joined #oooq | 07:51 | |
quiquell|brb | sshnaidm: +w | 07:58 |
*** udesale has quit IRC | 08:01 | |
*** quiquell|brb is now known as quiquell | 08:07 | |
*** rascasoft has joined #oooq | 08:09 | |
*** saneax has quit IRC | 08:11 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-fedora-28-standalone, (4 more messages) | 08:13 |
*** jtomasek has quit IRC | 08:20 | |
*** yolanda has joined #oooq | 08:25 | |
*** ccamacho has joined #oooq | 08:29 | |
*** ykarel is now known as ykarel|lunch | 08:41 | |
*** jpena|off is now known as jpena | 08:52 | |
*** amoralej|off is now known as amoralej | 08:53 | |
marios | quiquell: https://review.rdoproject.org/zuul/build/5c121da642e34b78b5ea86a91a86c715 | 09:12 |
marios | quiquell: ah was yesterday before push | 09:12 |
marios | sorry | 09:12 |
*** panda|ruck|off is now known as panda|ruck|flu | 09:13 | |
panda|ruck|flu | yay | 09:13 |
*** bogdando has joined #oooq | 09:15 | |
marios | panda|ruck|flu: you're ruck - not allowed to get sick clearly stated in ci bylaw article 15 subsection 2b | 09:15 |
*** saneax has joined #oooq | 09:16 | |
panda|ruck|flu | marios: yes, but comma 3 article 7 of the good on duty engineering operating manual in mojo says "You can have a flu as long as your work is not going to be affected by it" | 09:17 |
marios | panda|ruck|flu: oh right. ok then carry on | 09:17 |
panda|ruck|flu | "or if you die, you have at least created all the bugs and pushed all the patches" | 09:18 |
arxcruz | quiquell: panda|ruck|flu marios https://review.openstack.org/#/c/634380/ | 09:18 |
arxcruz | chandankumar: ^ | 09:18 |
panda|ruck|flu | who's chandankumar ? | 09:18 |
panda|ruck|flu | I know only raukadah ! | 09:19 |
quiquell | arxcruz: commented will workflow after answer | 09:21 |
*** dtantsur|afk is now known as dtantsur | 09:22 | |
marios | arxcruz: ack left an inline question earlier | 09:22 |
arxcruz | marios: quiquell answering | 09:22 |
quiquell | marios, sshnaidm, panda|ruck|flu: For the BM trigger https://review.rdoproject.org/r/19067 | 09:23 |
quiquell | is a PoC | 09:23 |
quiquell | It will just populate a little ci-script but only internal software factory is using it | 09:24 |
arxcruz | quiquell: marios done go there and +2+w | 09:24 |
arxcruz | :D | 09:24 |
quiquell | arxcruz: it does not load the variables ? really ? | 09:25 |
quiquell | that's weird | 09:25 |
arxcruz | quiquell: you can check the previous patches, i did a lot of attempts | 09:25 |
quiquell | well include_role is weird too | 09:25 |
quiquell | arxcruz: yep include_role is hard to make it work well | 09:25 |
quiquell | arxcruz: ack na let's merge this | 09:25 |
quiquell | arxcruz: +w | 09:26 |
arxcruz | quiquell: the vars_files was failing as well, it seems it doesn't allow use variables on vars_files path | 09:26 |
arxcruz | playbook_dir was getting an error as not defined | 09:26 |
chandankumar | kopecmartin: Hello | 09:26 |
kopecmartin | chandankumar, hi | 09:27 |
marios | arxcruz: no | 09:27 |
marios | never | 09:27 |
marios | gonna | 09:27 |
marios | do | 09:27 |
marios | it | 09:27 |
marios | E V E R | 09:27 |
arxcruz | malaka | 09:27 |
marios | lol | 09:27 |
quiquell | haha | 09:27 |
arxcruz | marios: https://www.youtube.com/watch?v=e1gfVtYQfaw | 09:28 |
marios | arxcruz: sshnaidm guys why do you hate taiga | 09:28 |
marios | like why you don't put it on the commit message | 09:28 |
marios | and why you ignore me | 09:28 |
marios | when i ask for it | 09:28 |
* marios cries | 09:28 | |
arxcruz | marios: I could write a book why i hate taiga | 09:28 |
sshnaidm | sshnaidm, how did you guess?? | 09:28 |
chandankumar | kopecmartin: https://review.openstack.org/#/c/640859/ regarding os_tempest doc patch on tripleo docs, is it about creating a new job based on current tripleo os_tempest job or how one can run os_tempest on tripleo standalone finishes? | 09:29 |
sshnaidm | marios, ^ | 09:29 |
quiquell | arxcruz: Put a user story first about documentation | 09:29 |
arxcruz | LOL | 09:29 |
marios | sshnaidm: | 09:29 |
kopecmartin | chandankumar, the first part of the question, i have no idea how the other thing works .. feel free to add it | 09:30 |
marios | arxcruz: yeah we should totally have a storyless task about that | 09:30 |
marios | arxcruz: but what, do you think trello is better? | 09:30 |
marios | or we dont need no stinkin scrum tools | 09:30 |
quiquell | let's move to storyboard XD | 09:30 |
sshnaidm | trello sucks, taiga sucks no less | 09:30 |
chandankumar | kopecmartin: ok, then I think commit message needs to be fixed | 09:30 |
panda|ruck|flu | this is mutiny | 09:30 |
marios | arxcruz: why don't you say so in retro if you have better way | 09:30 |
marios | someone lock up panda|ruck|flu | 09:30 |
chandankumar | sshnaidm: better design a tool | 09:31 |
panda|ruck|flu | heads will roll | 09:31 |
chandankumar | panda|ruck|flu: :-) | 09:31 |
panda|ruck|flu | upgrades will roll too | 09:31 |
panda|ruck|flu | don't worry guys, we're all going to move to jira | 09:32 |
panda|ruck|flu | very soon | 09:32 |
sshnaidm | panda|ruck|flu, how is that? | 09:32 |
sshnaidm | panda|ruck|flu, does ibm use jira only? | 09:32 |
panda|ruck|flu | sshnaidm: before I answer, what's your take on jira ? | 09:33 |
sshnaidm | jira is cool | 09:33 |
quiquell | maybe atlassian has bought taiga | 09:33 |
arxcruz | sshnaidm: what's the collect log role we are currently using on standalone ? I need to add some os_tempest related stuff there | 09:33 |
sshnaidm | jira is flexible enough to make it not suck | 09:33 |
arxcruz | sshnaidm: i got this but it's not working https://review.openstack.org/#/c/640358/ | 09:33 |
sshnaidm | arxcruz, the usual one I think | 09:33 |
quiquell | arxcruz: role is the same, but maybe the parameters are different | 09:34 |
chandankumar | arxcruz: can we add the same on os_tempest side | 09:34 |
chandankumar | arxcruz: like post log collection in /var/log/tempest dir? | 09:34 |
sshnaidm | arxcruz, why not just "/home/*/tempest/"? Is it something big there? | 09:34 |
quiquell | arxcruz: http://git.openstack.org/cgit/openstack-infra/tripleo-ci/tree/toci-quickstart/config/collect-logs.yml#n9 | 09:35 |
arxcruz | sshnaidm: not really | 09:35 |
arxcruz | quiquell: aha! it's the one on tripleo-ci | 09:35 |
quiquell | arxcruz: yep I think no one use default from role | 09:35 |
panda|ruck|flu | sshnaidm: there's been talk on switching to jira, but apparently everybody is tired of changing tools, so it has been mostly voted down. | 09:35 |
arxcruz | quiquell: thanks! | 09:35 |
sshnaidm | arxcruz, yeah, quiquell is right | 09:35 |
sshnaidm | arxcruz, but collect all tempest/, no need to be so specific | 09:36 |
arxcruz | sshnaidm: ok | 09:36 |
sshnaidm | panda|ruck|flu, where was this talk? | 09:36 |
*** arxcruz has left #oooq | 09:36 | |
*** derekh has joined #oooq | 09:36 | |
quiquell | panda|ruck|flu: Have been working with Jira at my previous job and it's not that bad | 09:36 |
quiquell | panda|ruck|flu: Pretty customizable and it has API | 09:36 |
sshnaidm | panda|ruck|flu, so it's not jira that was downvoted, but just an option to move to another tool | 09:37 |
quiquell | panda|ruck|flu: Wouldn't be difficult to integrate with zuul for example | 09:37 |
*** arxcruz has joined #oooq | 09:37 | |
sshnaidm | yeah, totally, jira is much better than trello/taiga | 09:37 |
arxcruz | weird, i left the channel | 09:37 |
*** ChanServ sets mode: +o arxcruz | 09:38 | |
arxcruz | sshnaidm: at least jira is faster than taiga | 09:38 |
sshnaidm | arxcruz, yeah, it's annoying to wait your us to load.. | 09:38 |
*** ykarel|lunch is now known as ykarel | 09:39 | |
quiquell | panda|ruck|flu: Can we merge this https://review.rdoproject.org/r/#/c/19067/ ? | 09:39 |
quiquell | panda|ruck|flu: want to put in place a PoC on internal sf and it depends on that | 09:39 |
arxcruz | sshnaidm: wondering if the 'premium' taiga would be faster | 09:39 |
arxcruz | or if we install taiga in our infra | 09:39 |
sshnaidm | arxcruz, you can self-host it on your server | 09:40 |
quiquell | panda|ruck|flu: is just new files | 09:40 |
arxcruz | sshnaidm: yup, wondering if that would be faster | 09:40 |
sshnaidm | arxcruz, and then to maintain another service.. | 09:40 |
sshnaidm | arxcruz, well, I think it will be faster, we don't make a lot of load there | 09:40 |
sshnaidm | I'm trying not to :D | 09:41 |
arxcruz | lol | 09:41 |
quiquell | sshnaidm: what an excuse :-P | 09:41 |
sshnaidm | quiquell, yeah, worried about load there | 09:41 |
panda|ruck|flu | quiquell: sshnaidm maybe you should put what kind of customization you would like to do and can't do in taiga in a retrospective card. | 09:41 |
quiquell | very infra baby cry | 09:41 |
marios | so for sure we need to talk about this in retro | 09:42 |
quiquell | panda|ruck|flu: nah I am all good, just making some noise here :-) | 09:42 |
marios | ok lets unlock panda|ruck|flu | 09:42 |
marios | sorry about that. why can't we all be friends | 09:42 |
sshnaidm | panda|ruck|flu, like "speed"? | 09:42 |
marios | sshnaidm: for me trello is MUUUUCH slower than taiga | 09:43 |
sshnaidm | marios, well, I didn't wait for a minute to open a task there | 09:43 |
panda|ruck|flu | I agree on speed, but it's not a customization. what else ? fill the retro! | 09:44 |
marios | sshnaidm: well that's bad if its laggy for you | 09:44 |
sshnaidm | marios, as I see in meetings - not only for me | 09:44 |
panda|ruck|flu | filling the retro was brought to you by your local friendly UA | 09:44 |
panda|ruck|flu | *jinjles* | 09:44 |
marios | sshnaidm: i mean yeah we should see what we can do about that | 09:44 |
quiquell | panda|ruck|flu: well one good thing from Jira is that, for example, for QE you can specify a real list of people that verify names and the like | 09:44 |
marios | sshnaidm: well it sometimes lags here too granted, but not for a minute... and certainly was not better but much worse for me with trello | 09:44 |
sshnaidm | jira has REAL task dependencies, why other tools forget about it.. | 09:45 |
panda|ruck|flu | quiquell: ah yeah, that was one of my first objection to the use of taiga. | 09:45 |
chandankumar | panda|ruck|flu: I used jira in my previous team and I hate that | 09:45 |
panda|ruck|flu | lol | 09:45 |
panda|ruck|flu | we're all gonna die. | 09:45 |
panda|ruck|flu | using different tools. | 09:45 |
marios | sshnaidm: anyway lets agree to talk about it in retro and if folks really feel so strongly lets try something else | 09:45 |
marios | personally i like taiga but as discussed with arxcruz i'm interested in understanding why others don't | 09:46 |
marios | sorry if i missed it but i don't hear a lot of people saying why they hate taiga in previous retros | 09:46 |
marios | panda|ruck|flu: you're not allowed to please until at least after you finish ruck | 09:46 |
panda|ruck|flu | I'm putting a card in retro | 09:47 |
panda|ruck|flu | "Why do I hate taiga" | 09:47 |
marios | panda|ruck|flu: thanks | 09:47 |
sshnaidm | but afaik we moved to taiga not only because trello sucks | 09:47 |
quiquell | panda|ruck|flu: Do you know if this NODE_FAILURE is going to be gone next run ? https://softwarefactory-project.io/zuul/t/rdoproject.org/builds?job_name=periodic-tripleo-centos-7-master-promote-consistent-to-tripleo-ci-testing | 09:47 |
sshnaidm | quiquell, 6 hours ago rdo cloud is back to life afaik | 09:48 |
sshnaidm | quiquell, there was an outage today again.. | 09:48 |
quiquell | sshnaidm: but it's really up now ? | 09:49 |
quiquell | Going to ask if they can enqueue the jobs there | 09:49 |
sshnaidm | quiquell, good question | 09:49 |
panda|ruck|flu | quiquell: I don't think so, expect bumpiness for a while | 09:49 |
quiquell | panda|ruck|flu: ack, damn | 09:50 |
quiquell | panda|ruck|flu, sshnaidm: I am going to add this script rdo-infra/ci-config https://review.rdoproject.org/r/#/c/19067/ | 09:51 |
quiquell | for the bm-trigger PoC | 09:51 |
panda|ruck|flu | quiquell: I don't see a test that tries to trigger the job twice on the same condition | 09:53 |
panda|ruck|flu | s/condition/hash | 09:53 |
quiquell | panda|ruck|flu: it's binary trigger or don't trigger | 09:54 |
quiquell | panda|ruck|flu: you mean checking "inprogress" ? | 09:54 |
quiquell | Don't know if we are using inprogress at DLRN reporting | 09:54 |
panda|ruck|flu | quiquell: hash was promoted to tripleo-ci-testing -> your script checks the hash and triggers the job -> the job fails -> the script reruns, but the hash is the same as before -> ?? | 09:58 |
*** holser_ has joined #oooq | 10:01 | |
quiquell | panda|ruck|flu: this is the same we have now | 10:04 |
quiquell | panda|ruck|flu: we rerun on same hashes | 10:05 |
quiquell | panda|ruck|flu: I have seen duplicated job status at DLRN on same hashes | 10:05 |
quiquell | panda|ruck|flu: but we can prevent that here if we want | 10:06 |
quiquell | panda|ruck|flu: this is what you mean? | 10:06 |
panda|ruck|flu | quiquell: yes, but you're right, I remember we made that choice, so the job could run again on the same hashes if it failed for some error on the job itself, or on infra | 10:07 |
panda|ruck|flu | quiquell: maybe we could think about avoiding to rerun if the hash already promoted to current-tripleo | 10:08 |
panda|ruck|flu | quiquell: but that's for another time | 10:08 |
quiquell | panda|ruck|flu: more important than that, if we have two runs one good one bad what do we do? | 10:09 |
panda|ruck|flu | quiquell: we quit. unreal scenario, everything implodes. | 10:09 |
quiquell | Well for containers build maybe that's weird | 10:09 |
quiquell | Ack | 10:09 |
quiquell | panda|ruck|flu: let's just merge this to put a PoC in place we can elaborate more later on | 10:10 |
panda|ruck|flu | quiquell: anyway, if the hash reaches promotion level, there's nothing we can do, the hash promoted. There's no easy way to unpromote, and it would be weird. | 10:10 |
panda|ruck|flu | and now that I said that, that's the next thing that happens | 10:10 |
panda|ruck|flu | a double fail that makes a bad hash pass, and we need to unpromote | 10:11 |
quiquell | panda|ruck|flu: yep but preventing run at baremetal can be important | 10:11 |
panda|ruck|flu | I called it. | 10:11 |
quiquell | panda|ruck|flu: since hardware is more limited | 10:11 |
panda|ruck|flu | quiquell: as a first step, let's just mimic what the rest is doing | 10:11 |
quiquell | panda|ruck|flu: ack agree, so any unit test you want to add ? | 10:12 |
panda|ruck|flu | quiquell: fine point, but for now this is only for master, so the baremetal will need to run a job every 6 hours or so | 10:12 |
quiquell | panda|ruck|flu: we don't want check jobs no? | 10:12 |
panda|ruck|flu | when we branch, we'll probably have to rethink the strategy to save resources | 10:12 |
panda|ruck|flu | quiquell: I don't think we have the capacity | 10:13 |
quiquell | Ack | 10:13 |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-fedora-28-standalone, (5 more messages) | 10:13 |
quiquell | panda|ruck|flu: let's merge the script will put a periodic at internal and pus manual reporting to DLRN | 10:13 |
quiquell | And see how it goes | 10:13 |
panda|ruck|flu | quiquell: it's a bit difficult to understand what scenarios you're testing | 10:13 |
sshnaidm | panda|ruck|flu, did you see this error in jobs? http://logs.openstack.org/22/640722/2/gate/tripleo-ci-centos-7-containers-multinode/028b578/logs/undercloud/home/zuul/undercloud_install.log.txt.gz#_2019-03-05_08_37_24 | 10:14 |
quiquell | panda|ruck|flu: names issue ? | 10:14 |
panda|ruck|flu | quiquell: name and don't know what the Promotion object represent | 10:15 |
quiquell | panda|ruck|flu: Promotion object is dlrnapi_client stuff | 10:16 |
panda|ruck|flu | sshnaidm: not in periodic no | 10:16 |
quiquell | panda|ruck|flu: will explain at code | 10:16 |
panda|ruck|flu | sshnaidm: is this causing troubles ? seems like it's not fatal to not have kubernetes interface to remove. | 10:17 |
quiquell | panda|ruck|flu: I have just remembered | 10:19 |
quiquell | panda|ruck|flu: we do the containers-build as a post | 10:19 |
*** fmount has joined #oooq | 10:19 | |
quiquell | panda|ruck|flu: so don't know if we are able to report if it fails | 10:19 |
panda|ruck|flu | oh dear | 10:19 |
quiquell | panda|ruck|flu: normal dlrn reporting at post playbook will not work :-/ | 10:20 |
quiquell | panda|ruck|flu: I mean this https://review.rdoproject.org/r/#/c/19030/ | 10:20 |
panda|ruck|flu | quiquell: why not ? | 10:20 |
sshnaidm | panda|ruck|flu, well, it failed undercloud install afaiu | 10:20 |
quiquell | why do we do at post ? | 10:20 |
quiquell | panda|ruck|flu: we use zuul.success to report | 10:20 |
quiquell | panda|ruck|flu: I think this is only run not post | 10:20 |
sshnaidm | panda|ruck|flu, oops, no, red herring | 10:21 |
panda|ruck|flu | sshnaidm: yep failed in tripleo-image-prepare | 10:23 |
panda|ruck|flu | sshnaidm: http://logs.openstack.org/22/640722/2/gate/tripleo-ci-centos-7-containers-multinode/028b578/logs/undercloud/var/log/tripleo-container-image-prepare.log.txt.gz | 10:23 |
panda|ruck|flu | and as usual no error in that file, even if the task fails | 10:24 |
quiquell | panda|ruck|flu: only run phase https://zuul-ci.org/docs/zuul/user/jobs.html?highlight=zuul_success#var-zuul_success | 10:24 |
quiquell | panda|ruck|flu: also why do we do the containers-build at post ? | 10:24 |
quiquell | panda|ruck|flu: feels like the main stuff of the job | 10:24 |
quiquell | ahh we don't care | 10:26 |
quiquell | job will not be there I think | 10:26 |
quiquell | humm | 10:26 |
quiquell | no it will | 10:26 |
panda|ruck|flu | quiquell: I remember the decision, but don't remember why we run on post, I think it's because secrets can be used only in post ? | 10:27 |
quiquell | If one post fails the next continues :-/ | 10:27 |
quiquell | panda|ruck|flu: secrets can be used everywhere, but the playbook have to be in the same project | 10:27 |
quiquell | panda|ruck|flu: I think we are inheriting run <- that could be it | 10:27 |
quiquell | panda|ruck|flu: we don't want to override that | 10:28 |
panda|ruck|flu | well with the new build containers job we don't need to inherit from a quickstart job | 10:29 |
quiquell | panda|ruck|flu: good point I am going to use those jobs | 10:30 |
quiquell | panda|ruck|flu: yep confirmed that the new job only does collect logs stuff at post :-) | 10:37 |
panda|ruck|flu | quiquell: one less problem | 10:41 |
panda|ruck|flu | quiquell: 99 to go | 10:41 |
quiquell | hehe | 10:42 |
quiquell | panda|ruck|flu: this will be it ? https://review.rdoproject.org/r/#/c/19030/ | 10:46 |
panda|ruck|flu | quiquell: probably not | 10:50 |
quiquell | humm zuul error though | 10:50 |
panda|ruck|flu | quiquell: trivial | 10:50 |
quiquell | fixed | 10:51 |
quiquell | panda|ruck|flu: maybe better at base job ? | 10:51 |
quiquell | At parent I mean | 10:51 |
quiquell | panda|ruck|flu: btw this is the job that uses the script https://code.engineering.redhat.com/gerrit/#/c/164227/ | 10:52 |
quiquell | panda|ruck|flu: I am testing dependencies at check pipeline | 10:52 |
panda|ruck|flu | marios: Y U NOT IN #RDO ?? | 11:17 |
panda|ruck|flu | marios: heads up, there's something in the containers build job configuration that is creating a conflict in RDO/upstream and preventing the periodic pipeline from starting | 11:18 |
panda|ruck|flu | marios: something that was merged around 11 GMT yesterday, periodic pipeline stopped triggering after | 11:18 |
marios | panda|ruck|flu: sheise. ok lets revert that | 11:18 |
quiquell | panda|ruck|flu: was not that NODE_FAILURE issue at RDO in general ? | 11:19 |
marios | panda|ruck|flu: might be the override of the f28 job, vs adding a new push like we did for centos but weshay said it doesn't matter nuke it | 11:19 |
marios | but anyway revert first | 11:19 |
panda|ruck|flu | quiquell: no the job doesn't even trigger, zuul refuses | 11:19 |
marios | panda|ruck|flu: sec | 11:19 |
marios | https://review.rdoproject.org/r/#/c/19101/ quiquell panda|ruck|flu merge it! | 11:20 |
marios | :D | 11:20 |
chandankumar | sshnaidm: where this file is used in tqe https://github.com/openstack/tripleo-quickstart/blob/master/ansible-role-requirements.yml ? | 11:20 |
marios | and will repropose it with duplicate f28 stuff i.e. with -push | 11:20 |
marios | quiquell: panda|ruck|flu ? k ^ | 11:20 |
sshnaidm | chandankumar, I think it's obsolete | 11:21 |
quiquell | marios, panda|ruck|flu: merging | 11:21 |
chandankumar | sshnaidm: ok! | 11:21 |
sshnaidm | chandankumar, I saw distgit patch was merged | 11:21 |
sshnaidm | chandankumar, should we wait for promotion to get it upstream or not? | 11:21 |
marios | quiquell: thanks | 11:22 |
panda|ruck|flu | quiquell: ack commented | 11:23 |
quiquell | +w | 11:23 |
*** holser_ is now known as holser|lunch | 11:28 | |
quiquell | panda|ruck|flu, marios: So what was the issue ? | 11:36 |
marios | quiquell: i am not sure but the first thing i can think of is the new fedora job was overwriting the current one, rather than adding fedora28-container-push as we did for centos | 11:36 |
marios | quiquell: so i will do that when i repost in a bit (doing something else right now). anyway lets see once it merges if that unblocks promotions for confirmation | 11:37 |
marios | panda|ruck|flu: ^ | 11:37 |
marios | cool quiquell panda|ruck|flu https://review.rdoproject.org/r/#/c/19101/ merged thanks | 11:38 |
quiquell | marios: what do you mean overwrite ? | 11:39 |
marios | quiquell: removed the existing one https://review.rdoproject.org/r/#/c/18975/2/zuul.d/jobs.yaml and switched in the dependencies https://review.rdoproject.org/r/#/c/18975/2/zuul.d/tripleo.yaml | 11:40 |
marios | quiquell: rather than just merging a f28 job and seeing it run | 11:41 |
marios | in parallel to the existing one | 11:41 |
marios | quiquell: make sense ? | 11:41 |
panda|ruck|flu | marios: opened https://bugs.launchpad.net/tripleo/+bug/1818646 to track it | 11:42 |
openstack | Launchpad bug 1818646 in tripleo "misconfiguration of the new periodic build containers job preventing periodic pipeline to trigger" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami) | 11:42 |
marios | panda|ruck|flu: ack | 11:42 |
panda|ruck|flu | marios: I'm sorry to to have got this first, it's been a busy monday, and usually when you apply something to config, you should check that everything is working well | 11:42 |
marios | panda|ruck|flu: sure but how | 11:43 |
marios | panda|ruck|flu: we can't see things there until they merge | 11:43 |
*** rascasoft has quit IRC | 11:43 | |
panda|ruck|flu | marios: after merge, ensure that the job is passing your point | 11:43 |
marios | panda|ruck|flu: k this is what we doing today? we were waiting for it to run so we could see | 11:43 |
panda|ruck|flu | marios: don't merge and forget :) | 11:43 |
chandankumar | sshnaidm: I think we can merge it, it is not going to break stuff, | 11:43 |
marios | panda|ruck|flu: i did not | 11:43 |
marios | panda|ruck|flu: merge and forget | 11:43 |
marios | panda|ruck|flu: i object to that | 11:43 |
chandankumar | instead of waiting for promotion | 11:43 |
sshnaidm | chandankumar, to merge what? | 11:44 |
panda|ruck|flu | marios: objection overruled! | 11:44 |
marios | panda|ruck|flu: it merged last night. it's now slightly past noon here. so... | 11:44 |
marios | panda|ruck|flu: and its periodic, and there was rdo outage. | 11:44 |
panda|ruck|flu | it merged last night ? | 11:44 |
chandankumar | sshnaidm: about the nova join patches | 11:44 |
panda|ruck|flu | mmhh, maybe it's not it then ... | 11:44 |
sshnaidm | chandankumar, ok, but let's review them again | 11:44 |
panda|ruck|flu | Zuul CI <zuul@review.rdoproject.org>: Change has been successfully merged by | 11:44 |
panda|ruck|flu | Zuul CI (2019-03-04 14:18:39+0000) < Reply > | 11:45 |
marios | panda|ruck|flu: https://review.rdoproject.org/r/#/c/18975/ | 11:45 |
quiquell | panda|ruck|flu, marios: we don't have a log of the misconfiguration from #rdo ? | 11:45 |
panda|ruck|flu | quiquell: I have a cryptic log pasted from jpena | 11:45 |
panda|ruck|flu | quiquell: marios https://softwarefactory-project.io/paste/show/1460/ | 11:45 |
chandankumar | sshnaidm: https://review.openstack.org/#/c/640089/ & https://review.openstack.org/#/c/639651/ | 11:45 |
chandankumar | sshnaidm: this one needs +w https://review.openstack.org/#/c/640749/ os_tempest related | 11:46 |
panda|ruck|flu | marios: anyway yes, rdocloud outage threw me off too, and no blame in general. let's fix this. | 11:46 |
quiquell | marios, panda|ruck|flu: We have to test the job at a test review | 11:47 |
marios | panda|ruck|flu: err ok except the bit where you accused me of merge and forget. like 13:43 <+panda|ruck|flu> marios: don't merge and forget :) | 11:48 |
marios | panda|ruck|flu: sure apology accepted :) | 11:48 |
panda|ruck|flu | oh well I didn't see the conversation this morning before getting online | 11:54 |
panda|ruck|flu | sorry marios ^ | 11:55 |
marios | panda|ruck|flu: ack | 11:55 |
marios | panda|ruck|flu: as i said i don't yet object to that job being suspected as the root, and we posted the revert no problem. i take offence to the accusation that i was being irresponsible and not a good team player, creating more work for ruck/rover, delaying everyone etc | 11:58 |
panda|ruck|flu | marios: with all the things that have been happening you can be hardly accused of delaying the promotions | 11:59 |
marios | panda|ruck|flu: 13:43 <+panda|ruck|flu> marios: after merge, ensure that the job is passing your point 13:43 <+panda|ruck|flu> marios: don't merge and forget :) | 12:00 |
marios | 13:42 <+panda|ruck|flu> marios: I'm sorry to to have got this first, it's been a busy monday, and usually when you apply something to config, you should check that everything is working well | 12:00 |
*** amoralej is now known as amoralej|lunch | 12:00 | |
*** udesale has joined #oooq | 12:00 | |
panda|ruck|flu | marios: yes, because I didn't see your conversation from before getting online, and I wasn't poked on the missing pipeline triggered. So I assumed erroneously you were not checking the results and I'm sorry for that. | 12:03 |
marios | panda|ruck|flu: ack | 12:03 |
*** jtomasek has joined #oooq | 12:05 | |
panda|ruck|flu | marios: Don't know if it makes you feel any better at this point, but I was never implying or thinking you would have done it on purpose. | 12:06 |
marios | panda|ruck|flu: ack ok man | 12:06 |
marios | panda|ruck|flu: for me i don't feel bad/better/anything now its done - np. i was responding to a public accusation of being irresponsible. you apologised cos it was misplaced as you missed some stuff earlier. done | 12:07 |
quiquell | kids to your rooms ! | 12:08 |
marios | but muuuuuuuum | 12:08 |
quiquell | no dinner to vgames | 12:08 |
panda|ruck|flu | he started | 12:09 |
*** ratailor has quit IRC | 12:13 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-scenario010 (4 more messages) | 12:13 |
panda|ruck|flu | marios: still can't find anything wrong with your patch ... | 12:15 |
marios | panda|ruck|flu: ack np if its OK we can readd it | 12:16 |
panda|ruck|flu | marios: and we don't have any more information from zuul | 12:16 |
marios | panda|ruck|flu: otherwise i'll repropose no problem | 12:16 |
marios | that really was not my problem at all, if it helps to eliminate that cool | 12:16 |
marios | we just repropose | 12:16 |
*** ccamacho has quit IRC | 12:17 | |
panda|ruck|flu | marios: let me dig a bit more, and if this is the patch that causes trouble and the periodic don't trigger you may need to split it into smaller patches and proceed slowly | 12:18 |
marios | panda|ruck|flu: k | 12:18 |
quiquell | marios: can we exercise the f28 job after merging just the definition ? | 12:20 |
quiquell | marios: we can try that first | 12:20 |
marios | quiquell: we'll add it in the layout too just without dependencies | 12:20 |
marios | quiquell: so it will run | 12:20 |
panda|ruck|flu | marios: it has to be that, by exclusion, only three patches merged yesterday, and the other two just added stuff on the task itself , not the pipeline/job configuration | 12:21 |
quiquell | marios: is config project | 12:21 |
marios | panda|ruck|flu: ack | 12:21 |
marios | quiquell: yeah i mean its what we did for centos job same | 12:21 |
marios | quiquell: like right now it is in the layout but not wired up into the dependencies etc | 12:21 |
marios | quiquell: but is running | 12:21 |
marios | quiquell: https://github.com/rdo-infra/review.rdoproject.org-config/blob/1b37844d0d0cdbeffa832109c1b1654ea6894094/zuul.d/tripleo.yaml#L154 | 12:22 |
marios | quiquell: so should be ok? ^ | 12:22 |
*** apetrich has quit IRC | 12:30 | |
*** jpena is now known as jpena|lunch | 12:31 | |
*** apetrich has joined #oooq | 12:31 | |
panda|ruck|flu | marios: quiquell the only thing I can see in the patch is an indentation one char off the others, I doubt that can cause all this but with zuul I learned to never say never | 12:33 |
panda|ruck|flu | but it was off even before. | 12:36 |
quiquell | panda|ruck|flu: we have also some lintings at check | 12:36 |
*** apetrich has quit IRC | 12:38 | |
*** skramaja has quit IRC | 12:42 | |
*** skramaja_ has joined #oooq | 12:42 | |
*** tosky has joined #oooq | 12:48 | |
*** holser|lunch is now known as holser_ | 12:54 | |
*** apetrich has joined #oooq | 12:56 | |
quiquell | marios: were it's running the job ? | 13:01 |
*** panda|ruck|flu is now known as panda|ruck|lunch | 13:03 | |
panda|ruck|lunch | found nothing still, and I have a headache. | 13:03 |
marios | quiquell: https://github.com/rdo-infra/review.rdoproject.org-config/blob/1b37844d0d0cdbeffa832109c1b1654ea6894094/zuul.d/tripleo.yaml#L154 | 13:04 |
marios | quiquell: this one? | 13:04 |
marios | http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-centos-7-master-containers-build-push/5c121da/logs/build.log.txt.gz | 13:04 |
quiquell | marios: I mean the f28 | 13:04 |
marios | quiquell: oh well we were adding the new one with the patch that was reverted | 13:04 |
marios | so nowhere... | 13:04 |
marios | quiquell: it didn't run cos we merged it and reverted | 13:04 |
quiquell | marios: We can still add just the definition | 13:04 |
quiquell | marios: and run it with test project | 13:04 |
marios | quiquell: ok but it reverted already | 13:05 |
quiquell | marios: you have already splitted | 13:07 |
quiquell | marios: we can try to merge this https://review.rdoproject.org/r/#/c/19026/ | 13:07 |
quiquell | marios: and do a run at testproject | 13:07 |
quiquell | marios: is not going to affect periodics | 13:07 |
quiquell | we... it will is changing the parent job :-/ | 13:07 |
weshay | morning | 13:08 |
marios | quiquell: well thats one option. like to merge the three patches independently. but still. i have to rework it to not overwrite or touch the existing f28 at all | 13:09 |
weshay | oh lordy.. panda|ruck|lunch hey | 13:09 |
*** weshay is now known as weshay|rover | 13:09 | |
quiquell | marios: you mean the existing centos one | 13:09 |
quiquell | marios: maybe refactoring without my stupid idea of share the parent | 13:10 |
quiquell | marios: so we go step by step | 13:10 |
marios | quiquell: no there is existing f28 one | 13:10 |
marios | quiquell: lets talk on community call. i'm gonna post what am working on then rework those patches and repropose it again for the f28 periodic | 13:11 |
quiquell | marios: ack, yep better there | 13:11 |
quiquell | weshay|rover: good morning, question, for bm make any sense to have a fedora-28 nodeset ? | 13:12 |
weshay|rover | quiquell no | 13:12 |
weshay|rover | lolz | 13:12 |
weshay|rover | :) | 13:12 |
quiquell | ack | 13:12 |
weshay|rover | quiquell if we're internal we have access to RHEL | 13:12 |
quiquell | weshay|rover: then maybe a fake one so we don't have the tenant error | 13:13 |
weshay|rover | quiquell it makes sense to have a RHEL node set on the internal parent | 13:13 |
quiquell | weshay|rover: https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/config-errors | 13:13 |
weshay|rover | quiquell fix that one, then stop working on config errors this sprint.. or at least deprioritize it | 13:14 |
weshay|rover | quiquell I need folks to help push the f28 pipeline along if you have cycles | 13:14 |
quiquell | well I was having issues with the bm, didn't know if they were the issue | 13:14 |
quiquell | ack | 13:14 |
*** quiquell is now known as quiquell|lunch | 13:16 | |
weshay|rover | quiquell|lunch the best of my knowledge comes from rlandy and she mentioned we can execute w/ config errors | 13:16 |
marios | weshay|rover: working on that | 13:17 |
marios | weshay|rover: we discussed last night | 13:17 |
marios | weshay|rover: https://tree.taiga.io/project/tripleo-ci-board/task/827?kanban-status=1447274 | 13:18 |
marios | weshay|rover: me panda|ruck|lunch rlandy had a call and we agreed i'd dig & post somethign today and we'd then discuss and see if more /someone else will do etc | 13:18 |
marios | weshay|rover: we merged this https://review.rdoproject.org/r/#/c/18975 but today reverted as suspected of breaking the periodics https://review.rdoproject.org/r/#/c/19101 & https://bugs.launchpad.net/tripleo/+bug/1818646 | 13:19 |
openstack | Launchpad bug 1818646 in tripleo "misconfiguration of the new periodic build containers job preventing periodic pipeline to trigger" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami) | 13:19 |
marios | weshay|rover: talk more on community | 13:19 |
weshay|rover | k | 13:22 |
weshay|rover | thanks for the update | 13:22 |
weshay|rover | quiquell|lunch is it possible to test and trigger changes to config from the reproducer? ^ | 13:23 |
quiquell|lunch | weshay|rover: you have to reconstruct them | 13:26 |
quiquell|lunch | weshay|rover: or do similar to OVB we just inject them there and reconstruct secrets | 13:26 |
quiquell|lunch | weshay|rover: we cannot "import" directly | 13:26 |
weshay|rover | ok.. maybe we chat about that in the community call | 13:26 |
quiquell|lunch | ack is doable | 13:26 |
*** amoralej|lunch is now known as amoralej | 13:28 | |
weshay|rover | sshnaidm hey.. 2 min | 13:30 |
sshnaidm | weshay|rover, ack | 13:30 |
*** dtantsur is now known as dtantsur|brb | 13:31 | |
*** rlandy has joined #oooq | 13:36 | |
rlandy | quiquell|lunch: hi - I merged the last of my review last night - so the repo is all your to merge your changes | 13:38 |
quiquell|lunch | rlandy: puff I am trying too but I get NODE_FAILURE running the job at gates | 13:38 |
quiquell|lunch | rlandy: at check it works | 13:38 |
rlandy | quiquell|lunch: yeah - that happens with upstream node - I think it comes from rdocloud | 13:39 |
rlandy | quiquell|lunch: btw - doing demo at community meeting | 13:39 |
quiquell|lunch | rlandy: have not been able to merge this https://code.engineering.redhat.com/gerrit/#/c/164288/ | 13:39 |
rlandy | quiquell|lunch: wrote some doc so I will follow that | 13:39 |
*** quiquell|lunch is now known as quiquell | 13:39 | |
*** jpena|lunch is now known as jpena | 13:40 | |
rlandy | quiquell: you can remove the job run from check if we are desperate | 13:41 |
quiquell | rlandy: maybe it worth it to discover why we have a NODE_FAILURE there | 13:41 |
quiquell | rlandy: In case is going to happen at periodics too is good to know | 13:41 |
rlandy | quiquell: same reason rdocloud can't give you a node I think | 13:43 |
rlandy | will confirm that | 13:43 |
quiquell | ack if this is the case we remove from check/gate to be able to merge this | 13:44 |
quiquell | I will also merge the DLRN part | 13:44 |
rlandy | quiquell: asking on prodchain-infra-dfg | 13:44 |
quiquell | rlandy: ack thanks | 13:45 |
* chandankumar is away for 1 & half hour, need to meet my little nephew! | 13:45 | |
rlandy | weshay|rover: https://code.engineering.redhat.com/gerrit/#/c/164398/ - to clean up old jobs using hardware from jenkins | 13:47 |
rlandy | quiquell: do you have 5 mins to chat to go through the doc? | 13:47 |
quiquell | rlandy: sure | 13:48 |
rlandy | joining my bj | 13:48 |
quiquell | going to your blue | 13:48 |
*** rascasoft has joined #oooq | 13:50 | |
*** panda|ruck|lunch is now known as panda|ruck|flu | 13:50 | |
*** ratailor has joined #oooq | 13:56 | |
rlandy | quiquell: question - re: https://code.engineering.redhat.com/gerrit/#/c/164288/12/zuul.d/jobs.yaml - why the base job in jobs rather than config? | 14:07 |
quiquell | rlandy: no need for config since it does not have secrets and it's easier to test | 14:09 |
quiquell | rlandy: test directly at check | 14:09 |
panda|ruck|flu | marios: quiquell periodic triggered | 14:13 |
quiquell | rlandy: /o\ | 14:13 |
rlandy | quiquell: one other question, did you run the test-connection-to-hardware-envD test after changing the nodeset? | 14:13 |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-scenario010 (4 more messages) | 14:13 |
quiquell | panda|ruck|flu: ^ | 14:13 |
quiquell | rlandy: yep, changed nodeset, runc was giving me NODE_FAILURE too | 14:14 |
quiquell | rlandy: so I get all paranoic | 14:14 |
rlandy | quiquell: no worries, I'l change it back later | 14:14 |
marios | panda|ruck|flu: cool thanks | 14:14 |
rlandy | not serious | 14:14 |
*** ykarel is now known as ykarel|away | 14:14 | |
quiquell | rlandy: for DLRN have reconstructed the review, is easier | 14:17 |
quiquell | https://code.engineering.redhat.com/gerrit/#/c/164496/ | 14:17 |
* rlandy updates | 14:18 | |
*** ykarel|away has quit IRC | 14:18 | |
*** ykarel|away has joined #oooq | 14:19 | |
quiquell | rlandy: humm want to run the DLRN stuff again before demo | 14:20 |
*** vinaykns has joined #oooq | 14:20 | |
rlandy | quiquell: no rush | 14:20 |
rlandy | just going through the explanation | 14:21 |
quiquell | I will merge then | 14:21 |
rlandy | quiquell: where is the companion config review? | 14:21 |
rlandy | oh I see it | 14:22 |
rlandy | all set | 14:22 |
quiquell | rlandy: https://code.engineering.redhat.com/gerrit/#/c/164358/ | 14:22 |
rlandy | quiquell: yep - saw it | 14:22 |
quiquell | panda|ruck|flu, marios: periodics are working allright so it was something with the config ? | 14:23 |
marios | quiquell: that's what i understood from panda|ruck|flu ping yes | 14:24 |
weshay|rover | can folks start joining the community bluejeans | 14:29 |
weshay|rover | panda|ruck|flu if you are sick man.. go rest | 14:30 |
weshay|rover | panda|ruck|flu anything other than rdo-cloud you need to hand off? | 14:30 |
weshay|rover | panda|ruck|flu any reason why admins can't reboot all the compute nodes atm | 14:31 |
weshay|rover | marios rlandy sshnaidm zbr mtg time | 14:33 |
marios | ack was waiting for jaosorior to call it :D | 14:34 |
*** skramaja_ is now known as skramaja | 14:35 | |
*** skramaja has quit IRC | 14:38 | |
*** dtantsur|brb is now known as dtantsur | 14:40 | |
panda|ruck|flu | quiquell: marios yes, so that patch is doing something wrong. I looked at the patch top to bottom and reverse at least 10 times, still can't find anything wrong with it. | 14:41 |
marios | panda|ruck|flu: quiquell ack will propose (have in progress but meetings) the re-merge and we can discuss there. will make it a bit safer by not changing the current job. wondering if the nodeset removed in the base is the problem lets discuss after this topic | 14:43 |
sshnaidm | ykarel|away, mistral should run on ovb afaik (if I understood right the question) | 14:49 |
ykarel|away | sshnaidm, yes mistral is running, but not ceph in ovb | 14:49 |
sshnaidm | ykarel|away, oh, right, no ceph is there | 14:50 |
sshnaidm | chandankumar, commented on https://review.openstack.org/#/c/640089 | 14:52 |
*** ratailor has quit IRC | 15:00 | |
arxcruz | chandankumar: https://tree.taiga.io/project/tripleo-ci-board/task/780?kanban-status=1447276 please confirm | 15:01 |
*** ccamacho has joined #oooq | 15:03 | |
chandankumar | sshnaidm: ack checking! | 15:05 |
chandankumar | sshnaidm: https://review.openstack.org/#/c/640749/ os_tempest related | 15:07 |
quiquell | panda|ruck|flu, marios, weshay|rover: I have reproduce a freezing issue | 15:31 |
quiquell | But maybe I have something wrong there | 15:31 |
panda|ruck|flu | quiquell: how do we look at it ? | 15:31 |
quiquell | panda|ruck|flu: come to madrid my door is open | 15:32 |
panda|ruck|flu | quiquell: gimme a sec | 15:32 |
quiquell | ok, preparing some coffee | 15:32 |
quiquell | But I think error is different | 15:33 |
panda|ruck|flu | quiquell: yay paste it so I can create the 8th promotion blocker :) | 15:33 |
quiquell | Exception: Job periodic-tripleo-fedora-28-master-containers-build-push depends on periodic-tripleo-centos-7-master-promote-consistent-to-tripleo-ci-testing which was not run. | 15:33 |
panda|ruck|flu | quiquell: do you have any explanation why that was not run ? | 15:34 |
panda|ruck|flu | quiquell: do you see "configuration is not valid" anywhere in zuul logs ? | 15:34 |
marios | weshay|rover: exactly one hour. rfolco|pto would be so proud :D | 15:34 |
weshay|rover | lolz | 15:35 |
quiquell | panda|ruck|flu: nope, going to just run the tagging job at periodic and see if it runs | 15:35 |
rlandy | marios: I think I am done with my tasks - what can I help with? | 15:35 |
marios | rlandy: can you add any thoughts on the review for now https://review.rdoproject.org/r/#/c/19108/ (I'll be updating that in a bit with what we just discussed, like removing the conditional ) otherwise will ping you in a bit for reviews | 15:37 |
rlandy | looking | 15:37 |
panda|ruck|flu | marios: one thing rlandy can do since she's on a different TZ, is continue to push your patches forward when you're not here. | 15:37 |
panda|ruck|flu | so she can be the time extension to have these tasks worked 16 hours a day | 15:38 |
panda|ruck|flu | (give or take some hours) | 15:38 |
marios | panda|ruck|flu: sure if theres stuff to udpate | 15:40 |
weshay|rover | ykarel|away just a quick skeleton epic, let's fill in gaps in our time https://tree.taiga.io/project/tripleo-ci-board/epic/828 | 15:40 |
ykarel|away | weshay|rover, okk will look tomorrrow | 15:41 |
panda|ruck|flu | zuul Y U NO GIB MORE INFORMATION ON FAILURE ? | 15:45 |
*** chandankumar is now known as raukadah | 15:47 | |
quiquell | panda|ruck|flu, marios: I have the jobs enqueued | 15:49 |
marios | quiquell: which jobs | 15:50 |
*** udesale has quit IRC | 15:50 | |
rlandy | marios: left comments on https://review.rdoproject.org/r/#/c/19108/ | 15:51 |
quiquell | marios: just the fedora container-push | 15:51 |
quiquell | marios: the dependency was just at the new fedora28 job, isn't it ? | 15:51 |
quiquell | going to add the centos without dependency | 15:51 |
marios | quiquell: you mean https://review.rdoproject.org/r/#/c/18975/ the one we reverted? | 15:52 |
quiquell | yep | 15:52 |
marios | quiquell: nice how are you running it though did you skip the secrets? | 15:52 |
marios | created new ones? | 15:52 |
quiquell | yep centos7 was just running and fedora28 has the dependency | 15:52 |
quiquell | marios: recreated secrets | 15:52 |
quiquell | but I think it does not matter because is kicking it | 15:52 |
marios | quiquell: which dependency though i don't follow your comment earlier | 15:52 |
panda|ruck|flu | also this fails way before using any secrets, so even with fake secret it should be good | 15:52 |
quiquell | so no config problem but let me add centos7 | 15:53 |
marios | quiquell: i'm also going to post a patch for the reproposal of that https://review.rdoproject.org/r/#/c/18975/ working on now | 15:53 |
quiquell | marios: I mean that periodic-tripleo-centos-7-master-containers-build-push didn't have any dependency | 15:54 |
quiquell | marios: is that right ? | 15:54 |
marios | quiquell: right | 15:54 |
quiquell | ack | 15:54 |
marios | quiquell: but that's what i'm doing | 15:54 |
marios | quiquell: i mean repropose,don't change the existing one, and make the f28 just be additive new job in layout without depends | 15:54 |
quiquell | marios: humm stuff reverted is kicking it alright in the reproducer, maybe I have do something wrong :-( | 15:55 |
quiquell | marios: it has a node_failure at centos7 though but I think it's RDO issue | 15:56 |
quiquell | Yap they get enqueued, there are problems at RDO but not related to configuration | 15:59 |
quiquell | are they sure it was not the RDO issue ? maybe the job freeze was a red herring | 15:59 |
quiquell | I am parenting the same way | 16:00 |
panda|ruck|flu | quiquell: this tried to trigger at least 4 times in the last 24h. Rdocloud was working yesterday, with some problem on the nested VM nodes. But the log shows clearly that zuul is saying that the configuration is invalid | 16:02 |
quiquell | panda|ruck|flu: then I am missing something at my config... :-/ | 16:02 |
*** raukadah is now known as chandankumar | 16:04 | |
quiquell | humm did have the nodeset at tripleo-ci-base-singlenode-rdo-containers-build-push | 16:06 |
quiquell | Let's check removing it | 16:06 |
quiquell | Nah working too | 16:08 |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-scenario010 (4 more messages) | 16:13 |
marios | quiquell: panda|ruck|flu rlandy posted https://review.rdoproject.org/r/19114 i kept changing my mind quiquell about duplicating or keeping a base with https://review.rdoproject.org/r/#/c/19114/1/zuul.d/jobs.yaml kept the base. main difference is the f28 is just additive now https://review.rdoproject.org/r/#/c/19114/1/zuul.d/tripleo.yaml so didn't touch existing f28 job, and i keep nodeset on the base i.e. did not do that https://review.rdoproject | 16:16 |
quiquell | marios: have test the stuff before revert and is queued so no config problem :-/ | 16:20 |
zbr | weshay|rover: when you have two minutes maybe you can give me a hint on https://tree.taiga.io/project/tripleo-ci-board/us/805 | 16:20 |
marios | quiquell: k well not sure how we proceed with that though. like we try again and see if it breaks and re-revert. or we try the softer version https://review.rdoproject.org/r/#/c/19114/ still gives us what we want like base etc and the new job , just no dependencies | 16:20 |
panda|ruck|flu | rlandy: weshay|rover do you know how to reproduce phase1 jobs ? | 16:20 |
zbr | i am not sure how to validate that tripleo-repos configured the right repos | 16:21 |
rlandy | panda|ruck|flu: by running the equivalent quickstart command | 16:21 |
panda|ruck|flu | rlandy: and it will pick what hash ? | 16:22 |
quiquell | marios: there is one thing you can do | 16:22 |
quiquell | marios: after you merge stuff from config if there is alright | 16:22 |
quiquell | marios: you can navigate jobs | 16:22 |
panda|ruck|flu | rlandy: mmhh, will probably work on current-tripleo master | 16:22 |
quiquell | marios: if you can navigate jobs they will have no config problems at periodic | 16:23 |
quiquell | marios: so we can merge your last review and check that everything is ok here https://softwarefactory-project.io/zuul/t/rdoproject.org/jobs | 16:23 |
rlandy | panda: you can set hash | 16:23 |
quiquell | marios: following parents and all | 16:23 |
rlandy | --dlrn_hash_tag | 16:23 |
rlandy | panda|ruck|flu: ^^ | 16:23 |
rlandy | also this stuff has been merged for a while so a regular quickstart run should pick it up | 16:24 |
weshay|rover | sshnaidm I'm still not seeing http://cistatus.tripleo.org/ Overcloud image create failed in the sova "Reasons" | 16:24 |
* weshay|rover needs that | 16:24 | |
marios | quiquell: ok that sounds good maybe you can show me tomorrow. so now() we try to merge https://review.rdoproject.org/r/#/c/19114/ or at least folks review it please panda|ruck|flu quiquell rlandy weshay|rover when you have time, maybe tomorrow :) but this is the reproposal of the revert. softer approach without ripping out existing f28 | 16:24 |
quiquell | rlandy, panda|ruck|flu, marios: I don't see any issue at https://softwarefactory-project.io/zuul/t/rdoproject.org/jobs | 16:26 |
quiquell | dman | 16:26 |
quiquell | here https://review.rdoproject.org/r/#/c/19114/ | 16:26 |
quiquell | marios: +2 | 16:26 |
rlandy | let's try this again | 16:26 |
rlandy | marios: w'ing | 16:27 |
rlandy | panda|ruck|flu: ^^ fyi | 16:27 |
marios | rlandy: thanks maybe give few mins see if folks have more comments? but yeah lets do it | 16:27 |
quiquell | yep let's check if jobs appear correctly and their parents | 16:27 |
marios | it should be ok since this time we aren't changing the existing dependencies | 16:27 |
rlandy | marios: I am waiting for panda|ruck|flu to object | 16:27 |
marios | just adding https://review.rdoproject.org/r/#/c/19114/1/zuul.d/tripleo.yaml leaf new job | 16:27 |
marios | lets see | 16:27 |
sshnaidm | weshay|rover, do you have example of job with this error? | 16:27 |
rlandy | it is his problem if we mess this up again | 16:28 |
weshay|rover | sshnaidm search for 639026 in sova | 16:28 |
weshay|rover | sshnaidm https://bugs.launchpad.net/tripleo/+bug/1818305 | 16:29 |
openstack | Launchpad bug 1818305 in tripleo "overcloud-full image fails to build calling mkfs -t xfs, exec sudo failed" [Critical,Triaged] | 16:29 |
sshnaidm | weshay|rover, so everything is there | 16:30 |
sshnaidm | weshay|rover, search for 639026 in sova ;) | 16:30 |
sshnaidm | weshay|rover, sova is full with errors "Overcloud image create failed." | 16:31 |
weshay|rover | really? | 16:32 |
weshay|rover | so for https://logs.rdoproject.org/26/639026/1/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/ab2f2ca/logs/undercloud/home/zuul/overcloud_image_build.log.txt.gz | 16:32 |
weshay|rover | | 2019-03-01 19:39 || 66 min || Reason was NOT FOUND. Please investigate. || 639026,1 || Logs || master || - || check | | 16:32 |
weshay|rover | for example | 16:32 |
weshay|rover | what am I not seeing? | 16:32 |
weshay|rover | I'm on http://cistatus.tripleo.org/ | 16:32 |
weshay|rover | I don't see that at all | 16:33 |
* weshay|rover blind? | 16:33 | |
sshnaidm | weshay|rover, refresh? | 16:34 |
weshay|rover | heh.. will try.. but just loaded this | 16:35 |
quiquell | sshnaidm: git-review at centos is from EPEL https://review.rdoproject.org/r/#/c/19047/ | 16:35 |
quiquell | sshnaidm: changed to install with pip | 16:35 |
weshay|rover | jebus | 16:36 |
weshay|rover | sorry sshnaidm | 16:36 |
sshnaidm | weshay|rover, https://i.imgur.com/g8Z7L2a.png | 16:36 |
weshay|rover | aye | 16:37 |
weshay|rover | I see it now, sorry to bother you | 16:37 |
*** quiquell is now known as quiquell|off | 16:37 | |
rlandy | panda|ruck|flu: pls speak now or forever ... objections re merging marios new patch https://review.rdoproject.org/r/#/c/19114/ | 16:39 |
panda|ruck|flu | marios: what's your next step if https://review.rdoproject.org/r/19114 fails ? | 16:39 |
panda|ruck|flu | I'll speak forever | 16:39 |
rlandy | revert? | 16:40 |
panda|ruck|flu | rlandy: after the revert | 16:40 |
rlandy | this is pretty minimal | 16:41 |
*** kopecmartin is now known as kopecmartin|off | 16:41 | |
panda|ruck|flu | the only thing I can object here is to try to inject the smallest increment possible on the original patch, but having no idea what could satisfy zuul there's no way to say if this is adding too much or not | 16:41 |
marios | panda|ruck|flu: yeah i think if it breaks periodics again we revert. do you have better suggestion | 16:42 |
marios | panda|ruck|flu: this is smaller than the original | 16:43 |
marios | panda|ruck|flu: not sure what else to do open to suggestions | 16:43 |
rlandy | when are periodics set to trigger again? | 16:43 |
panda|ruck|flu | rlandy: marios as far as we know what caused the problem was periodic-tripleo-centos-7-master-containers-build-push in that patch, not the fedora-28, so the reparenting may be the problem here. | 16:45 |
*** quiquell|off is now known as quiquell | 16:45 | |
panda|ruck|flu | marios: rlandy but I really don't know what to suggest here, let's merge this and see what happens | 16:45 |
quiquell | panda|ruck|flu: if parenting is the problem job will not appear at jobs page I think | 16:45 |
marios | panda|ruck|flu: can you point re what caused the problem was periodic-tripleo-centos-7-master-containers-build-push in that patch, not the fedora-28, | 16:45 |
*** jpena is now known as jpena|brb | 16:46 | |
marios | panda|ruck|flu: where do you mean | 16:46 |
panda|ruck|flu | rlandy: next trigger is at 22GMT | 16:46 |
* rlandy checks current gmt | 16:46 | |
panda|ruck|flu | marios: https://bugs.launchpad.net/tripleo/+bug/1818646 first line of zuul log extract, zuul reads the job and bails out | 16:47 |
openstack | Launchpad bug 1818646 in tripleo "misconfiguration of the new periodic build containers job preventing periodic pipeline to trigger" [Critical,Fix released] - Assigned to Gabriele Cerami (gcerami) | 16:47 |
panda|ruck|flu | marios: I have no experience in zuul log, but that's whay I'm able to gete from it | 16:47 |
quiquell | panda|ruck|flu: if the job has something wrong it will not appear here https://softwarefactory-project.io/zuul/t/rdoproject.org/jobs | 16:47 |
quiquell | panda|ruck|flu: after the merge | 16:47 |
panda|ruck|flu | what* | 16:47 |
quiquell | the job or any of the parents | 16:48 |
marios | panda|ruck|flu: k maybe the fact i removed the nodeset from base in first patch but no longer doing it might help on that but we just don't know for sure :/ | 16:48 |
panda|ruck|flu | quiquell: ah, cool | 16:48 |
quiquell | panda|ruck|flu: let's just merge and see if it appears there correctly, it takes a little though | 16:49 |
marios | panda|ruck|flu: rlandy so we merge and see sounds like no one has better idea :D | 16:49 |
panda|ruck|flu | rlandy: quiquell marios: at this point it's trial and error | 16:49 |
quiquell | marios: maybe we can ask RDO guys to look at scheduler logs while we merge | 16:49 |
panda|ruck|flu | weshay|rover: we may have promoted a hash that cannot pass phase 1 | 16:49 |
panda|ruck|flu | weshay|rover: and not sure how to proceed. | 16:50 |
rlandy | marios: panda|ruck|flu: if we have no better idea or way to check, merge it is | 16:50 |
weshay|rover | panda|ruck|flu ok.. that happens from time to time | 16:50 |
weshay|rover | panda|ruck|flu it's directly related to introspection right? | 16:50 |
marios | quiquell: ack talk tomorrow /me getting ready to bail. i thought you left already missed us? ;D | 16:50 |
panda|ruck|flu | weshay|rover: the hash did not contain some fix that were needed to do introspection on VMs | 16:50 |
rlandy | marios: panda|ruck|flu: here we go | 16:50 |
marios | thanks rlandy panda|ruck|flu quiquell | 16:50 |
quiquell | marios: nah family is not here | 16:50 |
weshay|rover | panda|ruck|flu this is why we have phase 1 :) | 16:50 |
quiquell | marios: I want to set up a little before my PTO | 16:51 |
panda|ruck|flu | marios: if this fails, next step is to recheck the reparenting | 16:51 |
quiquell | rlandy, panda|ruck|flu, marios: merging | 16:51 |
quiquell | ack ? | 16:51 |
quiquell | let's monitor jobs | 16:51 |
rlandy | quiquell: done | 16:51 |
panda|ruck|flu | weshay|rover: but at this point we'll wait for the next hash, or try to introduce some fix in this hash ? | 16:51 |
weshay|rover | panda|ruck|flu it waits until we promote tripleo again | 16:51 |
quiquell | rlandy: let's monitor https://softwarefactory-project.io/zuul/t/rdoproject.org/jobs | 16:52 |
rlandy | to see if the job shows | 16:52 |
quiquell | periodic-tripleo-fedora-28-master-containers-build-push has to appear there | 16:52 |
panda|ruck|flu | quiquell: I don't see periodic-tripleo-master-containers-build-push-base in https://softwarefactory-project.io/zuul/t/rdoproject.org/jobs | 16:52 |
quiquell | and its parent too | 16:52 |
rlandy | only when it runs | 16:52 |
quiquell | panda|ruck|flu: is not merged yet | 16:52 |
quiquell | It takes a little | 16:53 |
quiquell | rlandy, panda|ruck|flu: https://softwarefactory-project.io/zuul/t/rdoproject.org/job/periodic-tripleo-fedora-28-master-containers-build-push | 16:53 |
panda|ruck|flu | rlandy: IIUC the job will appear no there immediately, it's just the list of job taht are configured for this instance of zuul | 16:53 |
quiquell | and hierarchy is all good too | 16:53 |
quiquell | let's check centos | 16:53 |
panda|ruck|flu | weshay|rover: ok then | 16:54 |
weshay|rover | panda|ruck|flu ur killing it man.. nice job | 16:54 |
quiquell | panda|ruck|flu: yep could be the pipeline :-/ | 16:54 |
*** ccamacho has quit IRC | 16:55 | |
weshay|rover | panda|ruck|flu are there any issues w/ overcloud image builds last time the periodic launched? | 16:57 |
weshay|rover | days ago | 16:57 |
* weshay|rover looks | 16:57 | |
rlandy | ok - jobs are there | 16:58 |
quiquell | rlandy: yep the parents can be followed until base-minimal | 16:59 |
quiquell | We can just cross fingers | 16:59 |
quiquell | I will drop now | 16:59 |
rlandy | k - will keep watch | 16:59 |
rlandy | marios: ^^ | 16:59 |
*** quiquell is now known as quiquell|off | 17:00 | |
zbr | sshnaidm: can you help me with something related to nodesets? | 17:00 |
marios | thanks rlandy quiquell|off | 17:00 |
* marios dropping too ttyl folks | 17:00 | |
panda|ruck|flu | weshay|rover: the job in pipeline kept failing one step before in the recent runs | 17:02 |
panda|ruck|flu | weshay|rover: so we have blockers that we can't verify and fixes that we don't know if they worked. | 17:02 |
panda|ruck|flu | weshay|rover: because we didn't see the job arrive at those points | 17:03 |
weshay|rover | re: introspection on vm's | 17:03 |
weshay|rover | ? | 17:03 |
panda|ruck|flu | weshay|rover: no re: weshay|rover | panda|ruck|flu are there any issues w/ overcloud image builds last time the periodic launched? | 17:03 |
weshay|rover | aye aye | 17:03 |
weshay|rover | k | 17:03 |
weshay|rover | it looks like on 3/1 the image builds worked | 17:04 |
weshay|rover | our jobs have been down since friday | 17:04 |
-openstackstatus- NOTICE: Gerrit is being restarted for a configuration change, it will be briefly offline. | 17:09 | |
weshay|rover | panda|ruck|flu can I help w/ something? | 17:10 |
zbr | weshay|rover: if you have cpu cycles, i do have two questions. | 17:11 |
sshnaidm | zbr, sure | 17:11 |
zbr | sshnaidm: see The nodeset "ubuntu-bionic" was not found. error on https://review.openstack.org/#/c/636923/ | 17:11 |
zbr | sshnaidm: mainly I add the job upstream, which works as expected, but this breaks RDO: zuul fails to find this nodeset. | 17:12 |
*** jpena|brb is now known as jpena | 17:12 | |
zbr | i don't really understand why this happens as this job was supposed to run only upstream. | 17:12 |
*** chandankumar is now known as raukadah | 17:12 | |
sshnaidm | zbr, will look, gerrit is down | 17:12 |
zbr | sshnaidm: yeah, perfect timing. | 17:13 |
zbr | i wonder if we have any nodesets that are defined on both zuul servers. | 17:13 |
zbr | my grep attempt tells me that we do not have any; a few of the rdo ones are prefixed with "upstream-", but I found none that shares the same name. I wonder if this is by design. | 17:14 |
*** ykarel|away has quit IRC | 17:21 | |
*** panda|ruck|flu is now known as panda|ruck|off | 17:27 | |
*** bogdando has quit IRC | 17:27 | |
sshnaidm | zbr, hmm.. not sure why it happens. jpena, do you know maybe why we have error about node in rdo ci? https://review.openstack.org/#/c/636923/ | 17:29 |
panda|ruck|off | I'm gonna take a knife and spread myself on the bed. | 17:30 |
jpena | sshnaidm: RDO's Zuul complaint is 'The nodeset "ubuntu-bionic" was not found.'. Most likely that is the node type defined for tripleo-ci-molecule-jobs, and it is not defined in review.rdo's config | 17:32 |
sshnaidm | jpena, and should it be there? | 17:32 |
sshnaidm | jpena, I just don't understand why we need to have the same nodes | 17:32 |
weshay|rover | rlandy do we have any ai's for the ci-rhos -> upshift migration? | 17:33 |
jpena | sshnaidm: we are using the same definitions in both openstack and RDO's zuul. Either we have the same nodesets, or we override them in the review.rdo configuration | 17:33 |
rlandy | weshay|rover: none | 17:34 |
zbr | jpena: that change is adding a molecule job upstream which uses ubuntu-bionic node to run, and upstream zuul has no problem with it. Still, RDO one chokes with this error, even if it is not supposed to run that job. Today is the first time I have seen this error and that job is not really new. I wonder what is causing it and how to avoid it. | 17:34 |
rlandy | weshay|rover: we have nothing running there atm | 17:35 |
weshay|rover | k.. thanks | 17:35 |
rlandy | unless you want to put rhos back there | 17:36 |
jpena | zbr: if RDO's Zuul is having trouble, it's because we're telling it to load the tripleo-ci zuul configuration and use it | 17:36 |
rlandy | sshnaidm: is it possible yet to run ovb on upshift? | 17:37 |
jpena | zbr: https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/zuul/rdo.yaml#L927-L932. So RDO's zuul is loading the jobs defined there. And the job is requesting a non-existing nodeset (in RDO's Zuul configuration) | 17:39 |
zbr | jpena: we are trying to reuse job definitions. but it seems that this reuse chokes when it encounters an undefined nodeset, even if that job is not configured to run on rdo. | 17:39 |
weshay|rover | rlandy the only thing I would be concerned about is losing resources ( as backup ) | 17:40 |
zbr | i am a bit surprised we didn't discover this when we did the build-containers change. | 17:40 |
rlandy | weshay|rover: really - have not looked at this tenant in a while - will see what is there and what can access | 17:42 |
weshay|rover | k k | 17:43 |
rlandy | weshay|rover: ok to merge https://code.engineering.redhat.com/gerrit/#/c/164398/? | 17:43 |
jpena | zbr: I remember seeing dummy node definitions in the past, to fix some of these issues | 17:43 |
zbr | jpena: good to know. 2nd question would be why we do not share the same node names between upstream/rdo. i have seen usage of an "upstream-" prefix before their names, but this does not make them re-usable. | 17:45 |
zbr | probably better to ask on #rdo | 17:45 |
jpena | zbr: we only have a subset of nodes (no ubuntu, suse or similar). Even for centos ones, we need to have specific configurations for rdo | 17:46 |
jpena | that makes it hard to reuse | 17:46 |
zbr | jpena: practical use-case: i only need a node where I can install and run docker. I do not care about os much. is there a cross-zuul way of doing that? | 17:47 |
jpena | zbr: not that I'm aware of | 17:48 |
zbr | the only reason why this job has ubuntu-bionic there is because it does install docker to run some tasks. i could easily switch to something else. | 17:48 |
*** agopi has joined #oooq | 17:50 | |
zbr | jpena: where should I define the fake/placeholder nodes? (sorry for so many questions) | 17:51 |
jpena | zbr: https://github.com/rdo-infra/rdo-jobs/blob/master/zuul.d/nodesets.yaml | 17:52 |
jpena | I think some nodesets used in openstack.org's Zuul are redefined there | 17:52 |
jpena | openstack-single-node or openstack-two-node | 17:53 |
zbr | jpena: thanks, I will add there with comment. | 17:54 |
sshnaidm | rlandy, should be ok now, but didn't try yet | 17:54 |
zbr | jpena: afaik I do not expect a nodeset from different zuul servers to be identical, just "likely" compatible. | 17:54 |
rlandy | sshnaidm: k - I'll try it out | 17:55 |
sshnaidm | zbr, you can use existing fedora or centos nodes | 17:56 |
sshnaidm | zbr, unless you need ubuntu in particular | 17:56 |
zbr | sshnaidm: probably a centos-7 node only, as I need to run docker. | 17:56 |
zbr | sshnaidm: so the answer is not very simple :p | 17:59 |
weshay|rover | zbr can you join my blue when you have a few | 18:00 |
zbr | weshay|rover: i can now. | 18:01 |
weshay|rover | k on | 18:01 |
*** agopi has quit IRC | 18:04 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-fedora-28-standalone, (5 more messages) | 18:13 |
*** chem has quit IRC | 18:16 | |
*** chem has joined #oooq | 18:17 | |
*** jfrancoa has quit IRC | 18:24 | |
*** amoralej is now known as amoralej|off | 18:28 | |
*** saneax has quit IRC | 18:31 | |
*** jpena is now known as jpena|off | 18:33 | |
weshay|rover | rlandy so .. /me looking at https://softwarefactory-project.io/zuul/t/rdoproject.org/build/f83753df32dd48af988b5b338d56e528 | 18:41 |
weshay|rover | that looks good to me | 18:41 |
weshay|rover | now we're just waiting to see if periodic's trigger? | 18:42 |
rlandy | weshay|rover: yep one passing build | 18:43 |
* rlandy checks centos | 18:43 | |
weshay|rover | it also passed | 18:43 |
rlandy | https://softwarefactory-project.io/zuul/t/rdoproject.org/build/91e17bb82d574380aa813c24a58c0272 | 18:44 |
rlandy | I don't see any periodics though | 18:44 |
weshay|rover | rlandy https://softwarefactory-project.io/zuul/t/rdoproject.org/builds?job_name=periodic-tripleo-centos-7-master-containers-build-push | 18:45 |
weshay|rover | ? | 18:45 |
rlandy | weshay|rover: also - I am following up with mohan re: upshift. we didn't get assigned to batch one or batch two | 18:45 |
weshay|rover | k thank you! | 18:46 |
weshay|rover | rlandy are we using this https://review.rdoproject.org/r/#/c/19000/ | 18:47 |
weshay|rover | to test the push of fedora-28 to the rdo registry? | 18:47 |
rlandy | not that I can tell | 18:48 |
rlandy | not merged | 18:48 |
weshay|rover | it's just a test review | 18:48 |
weshay|rover | ya.. just trying to figure out what was left for https://tree.taiga.io/project/tripleo-ci-board/task/773?kanban-status=1447276 | 18:49 |
weshay|rover | but I *think* it's ready for qe | 18:50 |
rlandy | the two push jobs are out of sync though | 18:50 |
weshay|rover | meaning the centos-7 job config latest run != that fedora-28 one? | 18:50 |
weshay|rover | I think we're here now https://tree.taiga.io/project/tripleo-ci-board/task/827?kanban-status=1447275 | 18:54 |
weshay|rover | but we could also start poking at https://tree.taiga.io/project/tripleo-ci-board/task/823?kanban-status=1447274 | 18:55 |
weshay|rover | now that we have https://review.rdoproject.org/zuul/build/f83753df32dd48af988b5b338d56e528 | 18:56 |
weshay|rover | rlandy so.. may I chat w/ you for a minute? | 18:58 |
rlandy | weshay|rover: sure | 18:58 |
rlandy | weshay|rover: https://review.rdoproject.org/r/#/c/19066/1/zuul.d/jobs.yaml | 19:14 |
*** chem has quit IRC | 19:16 | |
*** chem has joined #oooq | 19:17 | |
rlandy | weshay|rover: https://review.rdoproject.org/r/#/c/19108/ | 19:22 |
*** dtantsur is now known as dtantsur|afk | 19:31 | |
*** holser_ has quit IRC | 19:42 | |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-fedora-28-standalone, (5 more messages) | 20:13 |
*** holser_ has joined #oooq | 20:46 | |
weshay|rover | rlandy 0027c049002ece9f84f187c7fa6bcb4ceed642e3_fcc621ac | 21:12 |
*** dsneddon has quit IRC | 21:18 | |
*** holser_ has quit IRC | 21:23 | |
panda|ruck|off | latest periodic still failing with https://bugs.launchpad.net/tripleo/+bug/1818538 and neutron-lib update didn't fix it. | 21:32 |
openstack | Launchpad bug 1818538 in tripleo "All OVB jobs in master periodic pipeline fail to install the undercloud while trying to create the ctlplane network with 503 service unavailable on neutron" [Critical,Fix committed] - Assigned to Gabriele Cerami (gcerami) | 21:32 |
weshay|rover | panda|ruck|off ok | 21:32 |
weshay|rover | panda|ruck|off rlandy and I are pushing the fedora28 work | 21:33 |
weshay|rover | I can update what we've done in an email | 21:33 |
weshay|rover | or live | 21:33 |
weshay|rover | I'm on with her now | 21:33 |
panda|ruck|off | I'm seeing the list of reviews flooding my inbox. | 21:34 |
weshay|rover | lolz | 21:34 |
weshay|rover | panda|ruck|off we found some indentation errors in the periodic job that could have caused https://bugs.launchpad.net/tripleo/+bug/1818646 | 21:40 |
openstack | Launchpad bug 1818646 in tripleo "misconfiguration of the new periodic build containers job preventing periodic pipeline to trigger" [Critical,Fix released] - Assigned to Gabriele Cerami (gcerami) | 21:40 |
weshay|rover | panda|ruck|off it seems that fedora dlrn is way behind centos | 21:41 |
weshay|rover | we may want to switch the order of the check.. find the latest in fedora and then see if it's in centos | 21:41 |
weshay|rover | but that is just today's data | 21:41 |
panda|ruck|off | weshay|rover: the indentation was there also before | 21:47 |
panda|ruck|off | weshay|rover: but I would not be surprised if that was the problem | 21:47 |
weshay|rover | hopefully.. | 21:48 |
weshay|rover | panda|ruck|off rlandy and I are watching https://review.rdoproject.org/r/#/c/19000/ in rdo zuul | 21:48 |
weshay|rover | panda|ruck|off and hopefully openstack-periodic will trigger and run next go | 21:49 |
rlandy | containers-build-push just started | 21:49 |
weshay|rover | if that's the case.. we're in pretty good shape.. | 21:49 |
weshay|rover | if openstack-periodic does NOT trigger properly I'll have to try and debug w/ quiquell|off | 21:49 |
panda|ruck|off | weshay|rover: building hash for fedora takes always a bit more time than centos. So if you invert the order, you'll probably promote the consistent immediately before the last that was created in centos. | 21:50 |
panda|ruck|off | it's probably just like missing the last 10 minutes updates though | 21:51 |
weshay|rover | panda|ruck|off let's talk about that tomorrow man.. it's not an immediate problem | 21:51 |
panda|ruck|off | ok | 21:51 |
weshay|rover | panda|ruck|off it's more than that.. today at least | 21:51 |
panda|ruck|off | 'night | 21:51 |
weshay|rover | panda|ruck|off good night man.. feel better | 21:51 |
rlandy | weshay|rover: should we put in the review to reverse the order? | 22:00 |
weshay|rover | ? | 22:02 |
weshay|rover | oh | 22:02 |
weshay|rover | let's wait | 22:02 |
rlandy | k | 22:02 |
weshay|rover | so we can chat w/ panda|ruck|off tomorrow | 22:02 |
rlandy | weshay|rover and rlandy probably did enough damage for one day | 22:03 |
weshay|rover | :) | 22:06 |
weshay|rover | rlandy email summary sent | 22:06 |
weshay|rover | please correct me if you see something | 22:06 |
* rlandy looks | 22:07 | |
rlandy | weshay|rover: looks right | 22:10 |
weshay|rover | rlandy board looks much better :) | 22:12 |
hubbot1 | FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-fedora-28-standalone, (5 more messages) | 22:13 |
rlandy | as long as that is accurate :) | 22:15 |
weshay|rover | panda|ruck|off I'll take the program call tomorrow | 22:17 |
*** chem has quit IRC | 22:20 | |
*** chem has joined #oooq | 22:21 | |
rlandy | weshay|rover: periodic started | 22:25 |
rlandy | weshay|rover: should we try to stop the current job? | 22:26 |
rlandy | not that I know how | 22:26 |
rlandy | actually can resubmit with no tests | 22:26 |
rlandy | weshay|rover: ^^? | 22:31 |
rlandy | too late | 22:33 |
rlandy | already ran | 22:34 |
rlandy | weshay|rover: periodic triggered and stopped | 22:38 |
rlandy | error again? | 22:38 |
rlandy | we still have a problem | 22:43 |
weshay|rover | rlandy ? | 22:46 |
* weshay|rover looks | 22:46 | |
rlandy | weshay|rover: periodic started | 22:47 |
rlandy | and just stopped | 22:47 |
weshay|rover | hrm.. let'er go | 22:47 |
rlandy | looks like it never triggered but it did - it somehow was killed | 22:47 |
weshay|rover | which was killed | 22:48 |
weshay|rover | rlandy? | 22:49 |
rlandy | periodic pipeline | 22:50 |
* rlandy gets | 22:50 | |
weshay|rover | rlandy afaict. it appears to be working | 22:50 |
rlandy | periodic-tripleo-centos-7-master-promote-consistent-to-tripleo-ci-testing openstack-infra/tripleo-ci master openstack-periodic master 79 2019-03-05T14:10:16 SUCCESS | 22:50 |
rlandy | last run is 14:10:16 - I saw it start | 22:51 |
weshay|rover | it's running now | 22:51 |
*** dsneddon has joined #oooq | 22:51 | |
weshay|rover | the container build jobs in openstack-periodic are running | 22:51 |
weshay|rover | so far this looks fine | 22:51 |
weshay|rover | just don't recheck 19000 | 22:52 |
rlandy | weshay|rover: ok - breathing again | 22:52 |
rlandy | maybe view pane just shifted - I thought it disappeared | 22:53 |
rlandy | weshay|rover: sorry for alarm | 22:53 |
rlandy | anyways 19000 works | 22:53 |
* rlandy goes - will be back later if revert needed | 22:53 | |
*** rlandy is now known as rlandy|bbl | 22:54 | |
weshay|rover | rlandy|bbl ur fine | 22:54 |
weshay|rover | https://logs.rdoproject.org/00/19000/5/check/periodic-tripleo-ci-fedora-28-standalone-master/f64fa51/job-output.txt.gz#_2019-03-05_22_47_27_563021 | 22:54 |
weshay|rover | is interesting | 22:54 |
weshay|rover | curl --silent http://mirror.regionone.rdo-cloud-tripleo.rdoproject.org:8080/rdo/fedora28/71/d9/71d9a66abe93049a91d433f805c045abe135303a_e6dde0f8/delorean.repo -S | 22:55 |
weshay|rover | distro must resolve to fedora28 and not fedora | 22:55 |
*** dsneddon has quit IRC | 22:56 | |
*** dsneddon has joined #oooq | 23:10 | |
*** dsneddon has quit IRC | 23:15 | |
*** vinaykns has quit IRC | 23:22 | |
*** dsneddon has joined #oooq | 23:25 | |
*** derekh has quit IRC | 23:36 |
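
Note on the 22:55 messages: the curl URL that weshay|rover pasted suggests how the delorean.repo path is laid out on the mirror: a distro directory (fedora28, not fedora), two short directories taken from the start of the commit hash, then a directory named from the commit hash and a short distro hash. The sketch below rebuilds that one URL in Python. The function names and the idea of joining a distribution name with its major version are assumptions made for illustration; they are not taken from the tripleo-ci code, and only the fedora28 case is confirmed by the log.

```python
def distro_dir(name: str, major_version: str) -> str:
    """Hypothetical helper: join distribution name and major version.

    The log only confirms that the directory must be "fedora28" rather
    than "fedora"; other distributions may use a different scheme.
    """
    return f"{name}{major_version}"


def delorean_repo_url(mirror: str, distro: str, commit_hash: str, distro_hash: str) -> str:
    """Build a delorean.repo URL shaped like the one pasted in the log.

    Layout inferred from that single URL (an assumption, not a spec):
    <mirror>/rdo/<distro>/<hash[0:2]>/<hash[2:4]>/<hash>_<distro_hash[0:8]>/delorean.repo
    """
    short_distro_hash = distro_hash[:8]
    return (
        f"{mirror}/rdo/{distro}/"
        f"{commit_hash[:2]}/{commit_hash[2:4]}/"
        f"{commit_hash}_{short_distro_hash}/delorean.repo"
    )


url = delorean_repo_url(
    mirror="http://mirror.regionone.rdo-cloud-tripleo.rdoproject.org:8080",
    distro=distro_dir("fedora", "28"),
    commit_hash="71d9a66abe93049a91d433f805c045abe135303a",
    distro_hash="e6dde0f8",
)
print(url)
```

If the distro value resolved to plain "fedora", the same function would produce a path under /rdo/fedora/, which is the mismatch the 22:55 remark points at.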
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!
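
Note on the 21:41 to 21:50 exchange: weshay|rover suggests reversing the check, so that the latest hash is looked up on the distro that lags behind (fedora, per the log) and then confirmed on the other one (centos). A minimal Python sketch of that idea follows. It assumes each distro exposes its hashes as a list ordered oldest to newest; the function name and the sample hashes are made up for illustration, and the real promotion code may work quite differently.

```python
from typing import Optional, Sequence


def newest_common_hash(lagging: Sequence[str], leading: Sequence[str]) -> Optional[str]:
    """Return the newest hash of the lagging distro that the leading distro also has.

    Both sequences are assumed to be ordered oldest to newest.
    Returns None when the two distros share no hash at all.
    """
    leading_hashes = set(leading)
    # Walk the lagging distro from newest to oldest and stop at the first match.
    for candidate in reversed(lagging):
        if candidate in leading_hashes:
            return candidate
    return None


# Made-up sample data: fedora is two hashes behind centos.
fedora_hashes = ["a1", "b2", "c3"]
centos_hashes = ["a1", "b2", "c3", "d4", "e5"]
print(newest_common_hash(fedora_hashes, centos_hashes))
```

As panda|ruck|off points out in the log, starting from the slower distro means the hash that gets picked can be older than the newest one available on centos, so the choice trades freshness for a hash that both distros actually have.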