14:00:32 <EmilienM> #startmeeting tripleo
14:00:33 <openstack> Meeting started Tue Jul 11 14:00:32 2017 UTC and is due to finish in 60 minutes.  The chair is EmilienM. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:37 <openstack> The meeting name has been set to 'tripleo'
14:00:42 <matbu> o/
14:01:02 <sshnaidm> \o
14:01:08 <jfrancoa> o/
14:01:10 <rasca> o/
14:01:11 <EmilienM> #topic agenda
14:01:12 <adarazs> o/
14:01:12 <EmilienM> * review past action items
14:01:14 <EmilienM> * one off agenda items
14:01:14 <ccamacho> o\
14:01:16 <EmilienM> * bugs
14:01:18 <EmilienM> * Projects releases or stable backports
14:01:20 <EmilienM> * CI
14:01:22 <EmilienM> * Specs
14:01:24 <EmilienM> * open discussion
14:01:26 <EmilienM> Anyone can use the #link, #action and #info commands, not just the moderator!
14:01:28 <EmilienM> Hi everyone! who is around today?
14:01:30 * EmilienM slow today
14:01:36 <jtomasek> o/
14:01:40 <marios> \o
14:01:44 <shardy> o/
14:01:45 <jpich> o/
14:01:47 <cdearborn> o/
14:01:56 <mwhahaha> hi2u
14:02:20 <EmilienM> #topic review past action items
14:02:29 <EmilienM> team to review https://review.openstack.org/#/c/478516/
14:02:39 <EmilienM> panda isn't here but I see the patch needs some love
14:02:45 <trown> o/
14:02:49 <EmilienM> I'll check with him
14:02:54 <beagles> o/
14:02:58 <EmilienM> #topic one off agenda items
14:03:02 <EmilienM> #link https://etherpad.openstack.org/p/tripleo-meeting-items
14:03:13 <EmilienM> gfidente: go ahead
14:03:17 <weshay> o/
14:03:19 <gfidente> o/
14:03:30 <gfidente> so I think we got up for review the remaining submissions to test ceph-ansible in ci
14:03:46 <gfidente> my question is whether it should replace the old environment file so that jobs previously using ceph now use ceph-ansible
14:03:57 <gfidente> or if we should have separate jobs
14:04:04 <EmilienM> how do you manage upgrades?
14:04:04 <gfidente> /me would like the first option more
14:04:23 <gfidente> EmilienM that is a ceph-ansible playbook which we trigger from a heat parameter
14:04:27 <florianf> o/
14:04:28 <EmilienM> so your option would force everyone to deploy the new thing?
14:04:56 <gfidente> my option would keep the existing puppet-ceph services but default the existing environment files to use ceph-ansible yes
14:05:46 <trown> will we have tripleo jobs running on ceph-ansible changes?
14:06:03 <gfidente> trown good question, we don't have that yet; this was discussed but it's totally WIP
14:06:13 <gfidente> so the answer is no
14:06:29 <gfidente> right now we own ceph-ansible builds in cbs, but we don't have ci for ceph-ansible against tripleo
14:06:35 <trown> that seems prone to ci outages
14:06:35 <EmilienM> unless we have CI coverage for ceph-ansible, I don't see any point to switch to it
14:06:58 <gfidente> though that is our only option to deploy ceph in containers today
14:07:21 <trown> ya, maybe we need to first add the patches that enable turning it on, but keeping default to puppet-ceph
14:07:24 <EmilienM> if we had ceph-ansible in our promotion pipeline (packaging in RDO, etc), then fine
14:07:38 <trown> then get tripleo CI running on ceph-ansible
14:08:00 <EmilienM> yes, I expected to see a CI job, as that was the plan, but it never came up
14:08:00 <trown> if we get enabling patches in, we could always run container job that way
14:08:02 <gfidente> EmilienM so no right now it's not subject to promotion, we manually tag the package for testing in the cbs repos and later release
14:08:20 <EmilienM> maybe you can switch multinode-scenario001-container to use ceph-ansible
14:08:58 <EmilienM> gfidente: jobs are voting. Everything manual will be error prone and break our CI. Multinode container job breaks almost every day nowadays
14:08:58 <gfidente> EmilienM okay I'll look into that
14:09:26 <gfidente> EmilienM yeah we have to put up automation in ceph-ansible CI
14:09:34 <EmilienM> everything that isn't tested and isn't in promotion pipeline shouldn't be in gate.
14:09:56 <weshay> +1
14:10:05 <EmilienM> gfidente: so make it work in multinode-scenario001-container and add ceph-ansible to rdo packaging where we can control its version from rdoinfo.
14:10:26 <EmilienM> once we have that, it's a bit more solid and we can maybe consider it into the gate again
14:10:39 <gfidente> ack, thanks
14:10:59 <EmilienM> anything else for this week before we move to the regular agenda?
14:11:13 <gfidente> (not sure if we can make it into rdo packaging actually, but that's a different conversation I think)
14:11:28 <EmilienM> gfidente: something where our delivery team can control the version
14:11:43 <EmilienM> gfidente: folks who manage ceph-ansible are in europe - if something breaks who will be around?
14:12:08 <EmilienM> we want to be able to control tags / bumps from a single place, like we do for almost everything else
14:12:41 <EmilienM> to me it makes sense to have ceph-ansible part of rdo (with a distgit and a version control)
14:12:44 <gfidente> EmilienM yeah I think I get the point
14:12:52 <EmilienM> but yeah we can discuss it offline
14:13:07 <EmilienM> gfidente: maybe on #rdo after this meeting
14:13:13 <EmilienM> with apevec & the team
14:13:15 <gfidente> EmilienM sure if you can, thanks
14:13:26 <EmilienM> cool, let's do that
14:13:44 <EmilienM> #action gfidente & EmilienM to figure out whether or not we package ceph-ansible in RDO
14:14:04 <EmilienM> #action gfidente to add ceph-ansible to multinode-scenario001-container job
14:14:11 <EmilienM> #topic bugs
14:14:16 <EmilienM> #link https://launchpad.net/tripleo/+milestone/pike-3
14:14:26 <EmilienM> is there any critical bug we need to discuss this week?
14:14:37 <EmilienM> anything outstanding?
14:15:14 <EmilienM> I see https://bugs.launchpad.net/tripleo/+bug/1703599
14:15:15 <openstack> Launchpad bug 1703599 in tripleo "Containers multinode job deploying an empty overcloud" [Critical,In progress] - Assigned to Jiří Stránský (jistr)
14:15:27 <EmilienM> jistr|call: is it the only patch that needs to be merged? https://review.openstack.org/#/c/482545
14:15:59 <jistr|call> EmilienM: for deploy job, yes. We have another one for upgrade job.
14:16:02 <EmilienM> jistr|call: what change caused this bug? I'm curious - we gate all tripleo projects on containers
14:16:33 <jistr|call> EmilienM: it went to quickstart-extras where we don't run the multinode job. The OOOQ job passed on it, somewhat strangely.
14:16:52 <EmilienM> we need to fix that
14:17:09 <EmilienM> sshnaidm: can you check if we run enough jobs in oooq-extras?
14:17:15 <weshay> the multinode puppet jobs run on extras
14:17:21 <sshnaidm> EmilienM, I'll add multinode containers to extras
14:17:32 <EmilienM> sshnaidm: thank you sir
14:17:36 <EmilienM> weshay: not containers
14:17:37 <jistr|call> weshay: sorry i meant we don't run multinode-containers there
14:17:41 <weshay> ah.. I see but not containers
14:17:42 <weshay> ya
14:18:05 <mwhahaha> we should have all oooq jobs on oooq-extras
14:18:06 <EmilienM> sshnaidm: maybe one is enough for now, not all scenarios but just gate-tripleo-ci-centos-7-containers-multinode
14:18:17 <sshnaidm> EmilienM, yeah, right
14:18:24 <EmilienM> or all scenarios - really I wouldn't mind
14:18:32 <EmilienM> mwhahaha: yes I agree
14:18:43 <EmilienM> the maximum we can add, please add
14:18:47 <mwhahaha> at least all oooq jobs that are voting
14:18:50 <EmilienM> so we reduce the breakages
14:18:59 <sshnaidm> yeah, all gates
14:19:11 <EmilienM> sshnaidm: can you take this one?
14:19:24 <sshnaidm> EmilienM, sure
14:19:31 <EmilienM> #action sshnaidm to add missing oooq jobs in oooq-extras which are already gating somewhere in tripleo
14:19:38 <EmilienM> sshnaidm: thx
14:19:44 <EmilienM> any other bug to discuss this week?
14:20:10 <EmilienM> #topic projects releases or stable backports
14:20:15 <EmilienM> #link https://releases.openstack.org/pike/schedule.html
14:20:41 <EmilienM> we're 2 weeks from pike-3 milestone
14:20:53 <EmilienM> next week is Final release for non-client libraries
14:21:10 <EmilienM> I think tripleo-common and maybe some others
14:22:09 <EmilienM> we'll discuss more about pike-3 next week
14:22:18 <EmilienM> we still have a bunch of blueprints in progress
14:22:39 <EmilienM> if you could help by updating your blueprints / bug in Launchpad, it would be awesome
14:22:58 <EmilienM> because when I do it myself it's in ninja mode and not so good ;-)
14:23:16 <EmilienM> any question about release management before we go ahead?
14:23:16 <jpich> We usually treat "tripleo-common" as "other" (internal component), not a library, as it tends to need updates right to the end
14:23:32 <EmilienM> jpich: right, I just checked in openstack/releases - it's other
14:23:48 <EmilienM> so we shouldn't have anything to release by next week
14:23:57 <shardy> Yeah agreed it's probably too early to freeze t-c
14:24:05 <EmilienM> jpich: thanks for the correction
14:24:17 <jpich> Ok, cool!
14:24:29 <EmilienM> #topic CI
14:24:50 <EmilienM> adarazs, panda, sshnaidm, weshay, trown : any outstanding updates on CI this week?
14:24:52 <shardy> weshay: Hey any update on the conversion of the 3nodes job to quickstart?
14:25:12 <shardy> I'd like to get coverage of multinode (more than one node) w/containers+HA
14:25:23 <sshnaidm> EmilienM, the former ovb-updates jobs have transitioned to oooq and run successfully
14:25:33 <shardy> but I kinda stopped looking into it as I was under the impression we were on the verge of quickstarting the existing 3nodes job
14:25:52 <shardy> if that's not imminent we should probably go ahead and add the coverage with the old scripts as a stopgap
14:26:01 <shardy> but I'm a little wary of doing the work twice..
14:26:19 <EmilienM> sshnaidm: cool, we could update tripleo.org/cistatus.html maybe
14:26:33 <sshnaidm> EmilienM, I have a patch, waiting for fixing gates
14:26:53 <adarazs> EmilienM: I'm working on adding a static "readme" footer for our logs on logs.openstack.org, hopefully this can go through: https://review.openstack.org/482210
14:26:56 <EmilienM> I think trown was about to look at the 3-nodes thing but I might be wrong
14:26:57 <weshay> shardy, our backlog and status is here https://trello.com/b/U1ITy0cu/tripleo-ci-squad
14:27:07 <EmilienM> adarazs: really cool
14:27:08 <weshay> shardy, no progress afaik
14:27:14 <trown> EmilienM: not on my radar no
14:27:21 <EmilienM> ah ok
14:27:33 <EmilienM> adarazs: ping #openstack-infra if you need review - none of us is core on it
14:27:43 <shardy> weshay: Ok, should we have a discussion re priorities and the 3nodes stuff perhaps?
14:27:46 <EmilienM> adarazs: but I'll review it anyway
14:27:56 <weshay> shardy, yes.. we can do that
14:28:06 <shardy> the issue is we need coverage of HA w/containers, then rolling updates w HA/containers
14:28:08 <adarazs> EmilienM: ack, I added pabelanger and ajaeger, but I might ping them directly too.
14:28:21 <weshay> shardy, adarazs can you add that to the schedule for thrs tripleo-ci-squad mtg and invite shardy
14:28:28 <shardy> the latter is net-new coverage, and I'm not super motivated to do it twice, or ask someone to do the work only for it to be immediately redone
14:28:36 <shardy> as was the case e.g with upgrades in the past
14:28:39 <weshay> shardy, understood
14:28:53 <shardy> weshay: thanks!
14:29:26 <pabelanger> feel free to ping me, I am back from PTO today
14:30:10 <adarazs> shardy, weshay: I'll add an agenda item on https://etherpad.openstack.org/p/tripleo-ci-squad-meeting for it.
14:30:22 <shardy> adarazs: ack sounds good, thanks!
14:30:25 <EmilienM> cool, thanks
14:30:41 <shardy> for now perhaps I'll post a WIP patch hijacking the existing 3nodes job to show what we'd like to test
14:31:02 <EmilienM> shardy: good idea
14:31:13 <EmilienM> shardy: you might have some changes to do in tripleo-ci repo maybe
14:31:26 <shardy> EmilienM: yeah, I'll take a look
14:31:30 <EmilienM> for the 3-nodes environment - but not sure if really needed, just fyi
14:31:55 <EmilienM> anything else CI related for this week?
14:31:58 <shardy> yeah it uses a custom role which isn't strictly required but should work for HA+Containers I think
14:32:26 <EmilienM> good news
14:32:27 <matbu> adarazs: weshay tripleo-ci mtg is on thursday right ?
14:32:38 <matbu> i'll join as well with shardy
14:32:40 <adarazs> matbu: yeah, 14:30 UTC
14:32:52 <adarazs> I'll add you to the pinglist :)
14:32:53 <matbu> adarazs: ack
14:32:58 <matbu> adarazs: thx
14:33:04 <EmilienM> adarazs: I'll be able to join this week as well (I wasn't last week)
14:33:23 <EmilienM> ok moving on
14:33:25 <EmilienM> #topic specs
14:33:32 <EmilienM> #link https://review.openstack.org/#/q/project:openstack/tripleo-specs+status:open
14:33:43 <adarazs> EmilienM: cool. I was on PTO last week too, so I don't know what exactly happened apart from what I see on the pad.
14:34:12 <EmilienM> adarazs: I might have missed it but I haven't seen an email summary
14:34:39 <adarazs> well, I couldn't really make a summary because I wasn't there :)
14:34:43 <EmilienM> who led the meeting?
14:34:47 <adarazs> apart from pasting the agenda.
14:35:14 <EmilienM> weshay, trown, sshnaidm, panda ? ^
14:35:43 <EmilienM> ok moving on
14:36:00 <EmilienM> anything to discuss about specs this week?
14:36:24 <EmilienM> #topic open discussion
14:36:46 <EmilienM> * Reminder about PTG: please contribute to the schedule: https://etherpad.openstack.org/p/tripleo-ptg-queens
14:37:08 <EmilienM> we'll have 2 1/2 days to work together, again it's a good place to make progress together
14:37:22 <EmilienM> feel free to propose new topics
14:37:24 <mwhahaha> scenario upgrade jobs in ocata are timing out, is there anything we can do to improve these as it's blocking backports
14:38:07 <EmilienM> https://bugs.launchpad.net/tripleo/+bug/1702955
14:38:08 <openstack> Launchpad bug 1702955 in tripleo "tripleo upgrade jobs timeout when running in RAX cloud" [Critical,Triaged]
14:38:13 <EmilienM> that's what we found out last week
14:38:23 <mwhahaha> scenario001 seems to be consistently failing
14:38:33 <mwhahaha> rax/ovh doesn't seem to be provider specific
14:38:50 <EmilienM> so from what I understand, it sounds like it's timing out when it's not running on OSIC cloud (which is by far the fastest cloud that we have)
14:39:05 <EmilienM> so yeah there is something to do here
14:39:31 <EmilienM> mwhahaha: we should spend some time comparing old jobs (when they were far from timing out) with current ones, and compare the overcloud deploy steps
14:39:35 <EmilienM> and see what takes more time
14:39:49 <EmilienM> figuring out if it's because of several services or because of one specific service, etc
14:40:13 <EmilienM> I can have a look this week and start some investigation, if someone is willing to help a little also
14:41:11 <EmilienM> anything else for open discussion?
14:41:13 <mwhahaha> i'll see if i can take a look as well
14:41:29 <EmilienM> mwhahaha: I would rather poke someone from upgrade team if you don't mind
14:41:41 <EmilienM> mwhahaha: I think you have better things to do :-)
14:42:30 <EmilienM> marios: ^ anyone in your team willing to help?
14:42:56 <EmilienM> anyway, closing the meeting, we can continue offline
14:42:58 <EmilienM> thanks everyone
14:43:00 <EmilienM> #endmeeting