13:59:58 #startmeeting tripleo
13:59:58 Meeting started Tue Sep 20 13:59:58 2016 UTC and is due to finish in 60 minutes. The chair is EmilienM. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:00 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:03 The meeting name has been set to 'tripleo'
14:00:17 #topic rollcall
14:00:18 o/
14:00:18 \o/
14:00:20 o/
14:00:21 \o
14:00:21 o/
14:00:22 hi2u
14:00:23 o/
14:00:23 hello
14:00:24 hey
14:00:25 o/
14:00:27 hello tripleoers
14:00:30 o/
14:00:30 hey hey!
14:00:32 hi
14:00:40 o/
14:00:42 o/
14:00:45 o/
14:00:48 o/
14:00:50 o/
14:01:06 o/
14:01:14 hi
14:01:22 o/
14:01:32 o/
14:01:37 hola! (getting used to Barcelona)
14:01:47 #topic agenda
14:01:49 * one off agenda items
14:01:51 hi
14:01:51 * bugs
14:01:52 o/
14:01:55 * Projects releases or stable backports
14:01:55 * CI
14:01:57 * Specs
14:01:59 * open discussion
14:02:05 #topic one off agenda items
14:02:08 #link https://etherpad.openstack.org/p/tripleo-meeting-items
14:02:17 so we have more items in the list? ^
14:02:37 o/
14:02:39 I would like to talk a bit about Summit
14:02:42 #link https://etherpad.openstack.org/p/ocata-tripleo
14:02:45 o/
14:03:32 bnemec: you have some topics in "Other" maybe you want to create a session for them?
14:04:10 otherwise we have 7 sessions (with one that is really small and probably shared with PuppetOpenStack group, about puppet validation)
14:04:22 EmilienM: o/ i updated a bit the upgrades section - added some notes (didn't like the word 'stalled' ) - am happy to lead but i know others were involved in all the early work. i think shardy may have been interested esp for the 'invoke the playbooks via heat'
14:04:22 so 6 sessions using tripleo slots
14:04:44 marios: good, I'll let you lead this session please
14:04:55 EmilienM: sure
14:05:00 I think some of the OPNFV concerns would be solved by having externally consumable CI produced by TripleO, so maybe they could be part of the CI session
14:05:09 trown: ++
14:05:23 hi
14:05:27 some of them are more about pluggability of overcloud deploy itself, which would go elsewhere
14:05:35 for all sessions, I would like people to prepare the topics: blueprints, specs when needed, an etherpad, a list of questions to solve, etc
14:05:40 o/
14:05:51 we run out of time like usual, so let's be efficient
14:06:04 One thing I'd like to discuss is the future of instack-virt-setup vs tripleo-quickstart vs ...
14:06:20 thanks to the mail from mwhahaha we have a fresh perspective on that
14:06:22 shardy: in a summit session?
14:06:28 perhaps we can combine it with the growing the team one
14:06:29 shardy: I see it fits well in 4. CI - current status and roadmap, doesn't it?
14:06:36 as it's partly an onboarding new user issue
14:06:38 EmilienM: when is the deadline for session proposals?
14:06:42 shardy: ya I think that would go good in new user experience
14:06:48 shardy: indeed, it also fits here
14:07:03 trown: Yes, since it seems like we've failed to fully switch over to it, summit seems like a good time to get that discussion restarted?
14:07:07 jcoufal: there is no official deadline, but asap is best so we can organize things
14:07:17 shardy: sure
14:07:30 sure, I just wanted to know if we set some date and I missed it :)
14:07:32 thanks
14:07:32 I will be there :)
14:08:04 shardy, jcoufal: I created a new session "8. Features that we want in Ocata"
14:08:12 EmilienM: ++
14:08:15 let me know if it's incorrect
14:08:48 EmilienM: maybe we could also have a working session on general ocata planning
14:09:02 EmilienM: ack, sounds good, but if we're going to discuss ocata roadmap planning, it should probably be one of the final sessions
14:09:04 slagle: yes, I wrote it in 8.
14:09:05 EmilienM: where we file and prioritize all blueprints (even if they are just placeholders)
14:09:17 ok, i'll check it
14:09:22 Ocata planning, better name indeed :)
14:09:32 yes, at the very end
14:09:36 cool
14:09:37 we could then sync all the priority items into launchpad (even if there are specs pending, we can mark them as discussion)
14:10:10 jcoufal: don't miss this one ;-)
14:10:14 yea sounds good, and take into consideration all the other things we want to do (ci, onboarding improvements, docs)
14:10:21 ++
14:10:36 so that we're confident we'll be able to commit to do what we say we will do
14:10:39 one thing I think we (I) did wrong this cycle is we had a lot of very large blueprints
14:10:53 yes
14:10:54 it'd be good to capture the large themes in launchpad at or very soon after summit
14:10:59 wznoinsk, there is a post #link http://docs.openstack.org/developer/devstack/guides/devstack-with-nested-kvm.html but we did not try it
14:11:08 then folks can go away and decompose them into smaller tasks we can track more easily
14:11:13 slagle, shardy: non features can also be tracked in launchpad
14:11:22 eg: "create a CI job to achieve XX"
14:11:49 well if CI is meant to be used by others, "create a CI job" is a feature :)
14:11:55 right
14:12:09 they could, but on that note...tripleo-ci is actually an infra project
14:12:12 in projects.yaml
14:12:18 so officially...it uses storyboard
14:12:28 ok, we'll use storyboard to track the work
14:12:30 I think the mistral API tracking ended up being a good example of how we should do it (one umbrella BP with lots of dependent ones)
14:12:48 EmilienM: there was some discussion last week about moving it over to the tripleo project
14:12:48 vs custom roles and composable services which ended up giant mega-blueprints ;)
14:12:55 so either way, i guess
14:12:58 we need to decide
14:13:34 I don't really care what tool we use, but fwiw I'm -1 on using more than 1
14:13:34 slagle: moving what? sorry
14:13:42 storyboard?
14:13:45 EmilienM: moving tripleo-ci from infra to tripleo in projects.yaml
14:13:55 I'm ok with either tool, just pick one
14:14:04 it does not make much sense under infra really
14:14:26 slagle: ok, let's catch-up later in CI topic maybe
14:14:31 do we have anything else about Summit?
14:14:50 I think we could slide on letting tripleo-ci live in infra but use a separate tool to manage tracking
14:14:59 so there is no deadline about submitting sessions officially, but do it asap, so we can work on the planning next week
14:15:47 also we need volunteers to drive sessions ;-)
14:16:15 ok, let's move to next topic
14:16:36 #topic Projects releases or stable backports
14:16:43 so we released tripleo RC1 yesterday
14:16:46 EmilienM: I think you missed a one off item
14:16:50 d0ugal: damn
14:17:12 it was not there a few min ago sorry
14:17:20 sorry, I added that one slightly on the late side
14:17:22 no worries, I didn't add it :)
14:17:22 #undo
14:17:22 Removing item from minutes:
14:17:31 bandini: go ahead
14:17:47 I can probably answer it however, at least to an extent.
14:18:12 d0ugal: shoot
14:18:20 yeah I am a bit stuck on upgrades due to https://bugs.launchpad.net/tripleo/+bug/1622683 and https://bugs.launchpad.net/tripleo/+bug/1620696
14:18:22 Launchpad bug 1622683 in tripleo "Updating plans breaks deployment" [Critical,In progress] - Assigned to Dougal Matthews (d0ugal)
14:18:22 This patch just landed in the last hour and should hopefully resolve the problems: https://review.openstack.org/371027
14:18:23 Launchpad bug 1620696 in tripleo "M/N upgrades UPDATE_FAILED .enabled_services.list_join: Incorrect arguments to "list_join" should be: "list_join" : [ " ", [ "str1", "str2"]]" [Critical,In progress] - Assigned to mbu (mat-bultel)
14:18:32 but we don't really have coverage of this in CI which is a problem.
14:19:06 I haven't seen much progress on https://review.openstack.org/#/c/370069/
14:19:08 matbu: ^
14:19:10 d0ugal: ok is that the only patch I need to apply?
14:19:16 So, for people that don't know, this is related to an issue with updating the plan in mistral. You deploy, change your templates somehow and then deploy again with the CLI. There have been some issues. We think we have gotten the last one.
14:19:47 bandini: Probably not, at this point I am not totally sure. it depends what you are applying to I think :/
14:20:07 bandini: you also need https://review.openstack.org/#/c/373220/ if you use relative paths
14:20:08 very nice d0ugal
14:20:11 matbu: any chance to see why CI is failing on https://review.openstack.org/#/c/370069/ ?
14:20:20 that throws a very similar error, but the fix is different
14:20:29 oh yeah, forgot about that one
14:20:33 d0ugal: I see, mind if I ping you later with the exact upgrade steps I am taking and then you can tip me in the right direction?
14:20:46 bandini: sure, ping away :)
14:20:52 EmilienM: i'll check, the review fixes the regression
14:21:00 matbu: the review doesn't pass CI
14:21:12 shardy: yeah I saw that one, am using only absolute paths, so it should not matter right?
14:21:15 d0ugal: thanks ;)
14:21:19 matbu: the bug looks quite urgent, please have a look when you can
14:21:28 bandini: Yeah, you only need it for relative paths
14:21:31 ack i have tested the change, i will check why the CI failed
14:21:40 jrist: but you still don't want to do this in the UI yet :)
14:22:01 d0ugal, bandini: we good with off items?
14:22:10 EmilienM: ack, thanks
14:22:13 EmilienM: yup
14:22:18 #topic Projects releases or stable backports
14:22:21 so, like I said :-)
14:22:25 we released RC1
14:22:37 and I'd like to thank people involved in the CI debugging on Friday
14:22:51 thx for your time, and all efforts to fix bugs so quickly
14:22:58 +100
14:23:02 that said, we now have RC2 to finish
14:23:18 for those who don't know what to do, we have plenty of bugs https://launchpad.net/tripleo/+milestone/newton-rc2
14:23:22 most of them are tracked and WIP
14:23:36 EmilienM: are we going to branch now? so for rc2 things need to go to master && stable/newton? or not yet?
14:23:50 we need to cut RC2 by end of next week
14:23:53 EmilienM: i mean things that didn't already cut stable/newton like tht
14:23:55 We decided to wait and branch for RC2
14:24:06 marios: ^^
14:24:11 and we'll branch stable/newton when RC2 is out
14:24:12 to reduce the backport pain
14:24:16 shardy: EmilienM ack thanks
14:24:20 EXCEPT: tripleoclient
14:24:24 d0ugal: haha
14:24:26 d0ugal: noted
14:24:28 I'll repeat AGAIN sorry :)
14:24:35 please backport tripleoclient to stable/newton
14:24:57 Thanks to jpich for doing lots of backports today
14:25:02 for bug fixes
14:25:08 jpich++ excellent, thanks
14:25:12 :)
14:25:19 do we have any questions about release management?
14:25:33 we have 10 days to close as many RC2 bugs as we can
14:25:42 marios: where are we with manila?
14:26:53 EmilienM: so we landed the main netapp stuff
14:27:03 EmilienM: but there is still cephfs backend to land
14:27:15 marios: can you tag them tripleo/rc2 ?
14:27:22 in gerrit
14:27:33 EmilienM: erno is leading that stuff (i am assisting for the cephfs ... ) tbarron is testing it
14:27:42 EmilienM: yeah is tagged
14:27:47 https://review.openstack.org/#/q/topic:tripleo/rc2
14:28:13 I forgot to say, if people want to track RC2 work, please use this gerrit topic: https://review.openstack.org/#/q/topic:tripleo/rc2
14:28:25 (i'll add my patches in the topic to simplify reviews)
14:28:34 marios: thanks
14:28:42 any question / feedback about release management?
14:29:12 #topic bugs
14:29:30 do we have any outstanding bug this week that is not tracked in RC2 or WIP?
14:29:57 EmilienM: I had some feedback re the usability of custom roles from mcornea
14:29:58 I see one, that is not assigned: https://bugs.launchpad.net/bugs/1624727
14:29:59 Launchpad bug 1624727 in tripleo "Could not fetch contents for file:///home/stack/tripleo-heat-templates/puppet/post.yaml" [Critical,Triaged]
14:30:10 shardy: how is it?
14:30:13 I'm planning to see how hard it is to fix, and raise a bug with details
14:30:21 so we can decide if we land it for RC2 or defer
14:30:38 I didn't submit a bug yet, just saw it - heat fails in HA jobs because of lack of memory, FYI
14:30:50 EmilienM: basically you have to copy some templates and make manual adjustments to resource_registry mappings to build custom roles
14:30:58 EmilienM: I think we'll need to figure out how to deal with heat-validate failing in current tripleo-heat-templates master so GUI is not able to retrieve parameters. It is probably due to the fact that some of the parameters don't have default value specified after recent changes
14:31:01 EmilienM: 1624727 might be a duplicate of 1622683
14:31:09 it may be quite easy to improve that, but fully fixing it will probably have to wait until ocata
14:31:17 bandini: please report it in launchpad, by marking it as dup
14:31:19 shardy: ^^^
14:31:39 EmilienM: I am not 100% sure yet, hopefully I will be able to confirm it today
14:31:45 shardy: is it something we want to backport?
14:31:45 bandini: i'm hitting this issue too
14:31:47 bandini: ack
14:32:07 matbu: yeah there are a few people hitting it apparently
14:32:09 EmilienM: probably not, which is why I'm mentioning it now
14:32:11 bandini: good to see that you've created a bz :)
14:32:23 matbu: I am a diligent boy ;)
14:32:32 heh
14:32:52 shardy: ok, it might be something we want to document
14:33:20 jtomasek: ack, I thought we agreed the workaround was to pass some dummy default values for validation?
14:33:21 shardy: do you have a launchpad handy?
14:33:25 bandini: which issue... i just hit an issue for 9..10 upgrade i didn't see before ( Engine went down during resource CREATE )
14:33:28 EmilienM: Not yet, I'll raise one today
14:33:47 EmilienM: basically I started documenting it and realized there were some usability issues
14:34:15 shardy: could be, we'll have to figure out what parameters those are and how to provide dummy defaults. I'll catch you later about it
14:34:26 marios: I can't even get to step1 of the upgrade (major-upgrade-pacemaker-init step) atm
14:34:34 bandini: nm... i think sshnaidm issue *might* be related (memory issues? is it heat engine goes down ?)
14:34:36 jtomasek: ack - fyi ramishra was planning to look into the heat fix, but obviously that's likely to be ocata now
14:34:50 shardy: cool!
14:35:02 marios: I am sort of collecting the issues I stumble upon here https://etherpad.openstack.org/p/tripleo-mitaka-newton-upgrades
14:35:12 jtomasek: tbh for newton, I think we'll have to just hard-code a dummy environment
14:35:14 shardy: ok thanks
14:35:24 :(
14:35:27 do we have other outstanding bugs this week?
14:35:28 it's not great, but it should get the initial UI deployment flow working
14:35:36 shardy: ack. I've added it on the list for summit GUI session
14:36:18 bandini: ack thanks looking... matbu is working upstream upgrades so he may be best person to start with those
14:36:19 please target new bugs to https://launchpad.net/tripleo/+milestone/newton-rc2 if you think they can and need to be fixed before end of next week
14:36:26 otherwise, please defer it to ocata-1
14:36:28 jrist: I don't see what else we can do - the heat fix isn't ready, perhaps we can remove the workaround if that proves to be something we can backport
14:36:46 other ideas welcome ;)
14:37:18 believe me, you're more educated on this than I
14:37:25 so if you think that's most of what we can do
14:37:29 then that's probably it
14:39:06 anything else about bugs?
14:40:10 ok next topic
14:40:12 #topic CI
14:40:47 slagle: so I missed that but can you remind us the storyboard thing about tripleo-ci?
14:41:44 EmilienM, it would be difficult to separate tripleo issues from tripleo-ci, so that one of them should be tracked in storyboard, some of them in LP
14:41:48 regarding the CI, i'm still fighting with the upgrade (full) jobs
14:42:00 EmilienM: since tripleo-ci is part of the infra project, the tracker is storyboard
14:42:22 sshnaidm: help me to debug, we decided to try to switch this job to ovb .. if no objection, which seems to be more stable than multinode
14:42:23 EmilienM: last week, some of the infra folks were suggesting to move tripleo-ci under the tripleo project
14:42:31 which sounds reasonable to me
14:42:35 EmilienM: in governance tripleo-ci is listed as part of the infrastructure group http://git.openstack.org/cgit/openstack/governance/tree/reference/projects.yaml#n2226
14:42:40 if we did that, the tracker would be launchpad
14:43:03 matbu, sure
14:43:11 matbu: no objection from me. Maybe ask derekh and bnemec
14:43:31 slagle, derekh: it sounds good to me too
14:43:37 +1
14:43:38 we're very familiar with launchpad now
14:43:40 +1
14:43:44 and everyone seems to use it
14:43:50 so let's keep things simple
14:43:55 It's weird for random infra people to have +2 on one of our repos.
14:43:56 EmilienM quick silly question, https://review.openstack.org/#/c/373131/ it's for adding correctly delorean-deps depending on the version to be deployed, then why does CI not fail, but local deployments have issues? (OC images are cached from oooq?)
14:44:00 Yeah, launchpad is far from perfect, but it works well enough, and everyone is familiar with it
14:44:10 good
14:44:20 EmilienM: ok, i'll propose a patch to governance
14:44:22 #action move tripleo-ci under tripleo and use launchpad
14:44:25 slagle: ++
14:45:25 RE. moving the upgrades job to OVB, we might want to sort out our job limits first
14:45:43 we're currently limited to 40 simultaneous jobs (don't know why)
14:45:54 adding another ovb job will make queues longer
14:46:01 derekh: right
14:46:11 maybe keep the upgrade job in experimental pipeline
14:46:18 so matbu and sshnaidm can run it on demand until it works
14:46:19 EmilienM: ack
14:46:35 and then when stabilized, run it in periodic pipeline to save resources
14:46:41 (it's a proposal)
14:46:51 derekh: the limit is there because instances spawned by nodepool weren't getting floating IPs when we had the limit higher
14:47:17 i haven't been able to figure out why that is the case
14:47:22 Weird, I thought we dropped it to 50.
14:47:45 EmilienM: sounds like a good start IMO, would be good to also keep the option to run it on a patch on-demand via the experimental pipeline
14:47:45 I think infra deployed some fixes in nodepool that might be helping with the fip problem.
14:47:59 EmilienM: i mean after we have it periodic
14:48:07 jistr: yep, excellent idea
14:48:09 And since rh1 isn't taking 12 minutes to deploy stacks anymore we could probably try upping the limit again.
14:48:18 sshnaidm, matbu: you ok? ^
14:48:25 bnemec: slagle ahh, I hadn't realised we dropped it at all in the config, I just thought it wasn't getting up to 75 and we didn't know why
14:49:59 anyways doesn't matter, keeping it experimental should be fine
14:50:04 ok
14:50:10 ack
14:50:24 #action move upgrade jobs to ovb and use experimental pipeline until it's stabilized then periodic + experimental
14:50:31 matbu: you take action on it? ^
14:50:53 EmilienM: yep
14:50:56 good
14:50:58 let's move in
14:51:02 move on even
14:51:04 any thoughts on using persistent test envs? to try and ease the load on the rh1 controller?
14:51:21 just throwing it out there, don't need to discuss it now
14:51:26 derekh: yes, why not?
14:51:27 derekh, just to have a few ready on demand?
14:51:29 haa regarding CI, i have a question for you guys
14:51:36 derekh: I like the idea. How hard would it be to do?
14:51:54 sshnaidm: we would reuse them several times
14:51:58 is it possible to chain jobs for example : job overcloud ha deploy, then chain an upgrade job with the previous overcloud env ?
14:52:04 bnemec: it shouldn't be crazy hard
14:52:16 derekh: this would more closely match what we did in the old CI setup (non OVB)
14:52:23 derekh: I like it in that regard
14:52:30 bnemec: we change the workers to not destroy at the end
14:52:33 tried and tested
14:52:40 it would also make our jobs about 10 minutes faster
14:52:49 +1
14:53:37 #undo
14:53:38 Removing item from minutes:
14:53:49 #action matbu to move upgrade jobs to ovb and use experimental pipeline until it's stabilized then periodic + experimental
14:53:49 That will mean we have to run in a patched Nova environment.
14:54:00 bnemec: we already do
14:54:19 we have 6 min left folks
14:54:21 Yeah, I'm thinking for when we move to rdo cloud.
14:54:33 matbu, not sure, but maybe post pipeline of zuul
14:54:35 It also makes the test envs less dynamic.
14:54:48 matbu, let's ask pabelanger when he's here
14:54:52 bnemec: yup, I've asked rdo people about that (patching rdo cloud), let's see what they say
14:54:54 sshnaidm: no, tripleo cloud won't support it in zuul post pipeline
14:55:32 k
14:55:36 bnemec: losing the dynamicability is a bummer alright
14:55:38 EmilienM, matbu then we can invent something in ovb
14:55:54 sshnaidm: probably
14:55:57 * bnemec likes dynamicability :-)
14:56:02 derekh: can we follow up on mailing list maybe?
14:56:07 we have 5 min left
14:56:12 and I would like to open the discussion
14:56:13 EmilienM: ack
14:56:16 Why would we do that? What do we need to chain jobs like that for?
14:56:29 Isn't "deploy HA, then upgrade" just the upgrades job?
14:56:51 bnemec: cause a full upgrade could take a few hours
14:56:52 derekh: very interesting topic but time is running out
14:57:03 bnemec: then it could save us from timeout
14:57:06 bnemec: no that was just a stack update after the deploy there i think
14:57:08 derekh: please keep us posted on ML
14:57:14 bnemec: so a full upgrade is a number of steps
14:57:21 Yeah the original upgrade job was really a stack-update test
14:57:22 EmilienM: will do
14:57:22 bnemec, and more complicated than it
14:57:23 marios: matbu is working on a full upgrade
14:57:37 we renamed the upgrade job
14:57:39 that was still useful coverage tho
14:57:39 EmilienM: +1 i was answering bnemec about the upgrade job
14:57:40 If it's too slow to run in the timeout period, then we're not enabling it on every patch.
14:57:41 to be "update" and run stack update
14:57:44 and now we have 2 jobs:
14:57:48 update: stack update
14:57:51 It's bad enough we have to wait 2 hours for CI results as it is.
14:57:58 and upgrade: actual upgrade (experimental, matbu is doing it)
14:58:02 Adding another 2+ hours on top of that is no good.
14:58:18 ok, sorry for that but no open discussion this week :(
14:58:24 no more time for it
14:58:27 bnemec: yep, but that's the cost of CI upgrade
14:58:29 #topic open discussion
14:58:29 Open discussion in #tripleo
14:58:34 :-)
14:58:41 jaajaj
14:58:51 if you have anything to ask or feedback to give, please do it now, or on #tripleo after the meeting, door is always open :)
14:58:59 Thanks EmilienM !
14:59:32 thanks everyone, and keep rocking
14:59:33 thanks EmilienM
14:59:39 #endmeeting