14:00:40 #startmeeting tripleo
14:00:41 Meeting started Tue Jul 5 14:00:40 2016 UTC and is due to finish in 60 minutes. The chair is shardy. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:42 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:45 The meeting name has been set to 'tripleo'
14:00:47 #topic rollcall
14:00:51 o/
14:00:53 o/
14:00:53 Hi all, who's around?
14:00:56 hi
14:00:57 hello
14:00:58 hello
14:00:59 o/
14:01:01 Hello guys!!!
14:01:18 o/
14:01:44 #link https://wiki.openstack.org/wiki/Meetings/TripleO
14:01:51 o/
14:02:20 o/
14:02:24 #topic agenda
14:02:25 * one off agenda items
14:02:25 * bugs
14:02:25 * Projects releases or stable backports
14:02:27 * CI
14:02:30 * Specs
14:02:32 * open discussion
14:02:42 Does anyone have anything to add re one-off items?
14:02:50 I see one re the UI network diagram, anything else?
14:04:02 o/ (late rollcall)
14:04:10 #topic one-off agenda items
14:04:17 #link https://blueprints.launchpad.net/tripleo-ui/+spec/network-diagram
14:04:20 honza: Hi!
14:04:38 So apologies, I saw the openstack-dev thread on this but have not yet replied
14:05:03 the feature looks good, I'm just wondering what additional interfaces it may need
14:05:06 o/ /me late
14:05:12 o/
14:05:20 e.g. can you use something like a heat stack-preview to get the graph of all network resources?
14:05:42 IMHO you should definitely avoid parsing the templates directly, as that will probably be fragile and duplicate stuff inside heat
14:05:59 Any other comments or questions on the BP?
14:07:15 A spec would probably be a good idea, but that probably shouldn't prevent you starting the implementation
14:07:29 the main risk I see in the BP description is "parse them from YAML"
14:07:37 as mentioned, I don't think we should do that
14:08:51 will it be encompassing the provider networks as well, the ones set by os-net-config?
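For context on the stack-preview suggestion above: the idea is that the UI would ask Heat for the already-resolved resource tree instead of re-parsing the template YAML itself. A minimal sketch of walking such a tree to pick out network resources follows; the nested-dict shape and resource names here are illustrative only, not the exact python-heatclient preview response format.

```python
# Sketch: collect network-related resources from a Heat stack-preview-style
# resource tree, rather than parsing the templates' YAML directly.
# The dict below is an invented, simplified stand-in for a preview result.

def network_resources(resource):
    """Recursively collect names of resources whose type is Neutron-based."""
    found = []
    if resource.get("resource_type", "").startswith("OS::Neutron::"):
        found.append(resource["resource_name"])
    for child in resource.get("resources", []):
        found.extend(network_resources(child))
    return found

preview = {  # illustrative preview output for an overcloud-like stack
    "resource_name": "overcloud",
    "resource_type": "OS::Heat::Stack",
    "resources": [
        {"resource_name": "InternalApiNetwork",
         "resource_type": "OS::Neutron::Net", "resources": []},
        {"resource_name": "InternalApiSubnet",
         "resource_type": "OS::Neutron::Subnet", "resources": []},
        {"resource_name": "Controller",
         "resource_type": "OS::Nova::Server", "resources": []},
    ],
}

print(network_resources(preview))  # -> ['InternalApiNetwork', 'InternalApiSubnet']
```

In a real client the tree would come from Heat's preview call rather than an inline dict, which is what keeps the YAML-parsing logic inside Heat.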
14:09:45 saneax: good question - seems like we'd want it to support visualizing and validating both the physical and overlay network configuration
14:10:22 Since honza doesn't appear to be around, let's take this to the existing openstack-dev thread
14:10:43 #topic bugs
14:10:44 shardy: sounds good
14:11:00 #link https://bugs.launchpad.net/tripleo/
14:11:09 shardy: sorry, late
14:11:45 honza: np, let's catch up in #tripleo after and/or on the ML
14:11:56 shardy: ok, thanks
14:11:58 overall the BP looks good, but there are some details we should work out re parsing the templates
14:12:29 Does anyone have any specific bugs to discuss this week?
14:12:29 shardy: dprince: yaml parsing is to be done via heat to avoid duplication of effort (sorry that wasn't clear)
14:12:58 things have been going better with CI this week, but there are still some issues with the upgrades job, both on master and stable branches
14:13:11 some of that seems to be memory and/or walltime related
14:13:29 shardy, filed a bug for upgrades: https://bugs.launchpad.net/tripleo/+bug/1599048
14:13:29 Launchpad bug 1599048 in tripleo "CI upgrades job fail because of memory shortage" [High,New]
14:13:49 shardy, but will look at this after the CI move
14:14:46 (next week)
14:14:56 shardy sshnaidm so the upgrades job now doesn't exist, until we get rh1 back doing something
14:15:12 derekh: hehe, that's one way to fix the problem ;)
14:15:35 shardy: yup, unfortunately we could only run a subset of jobs to keep us ticking over
14:15:56 derekh: cool, what's the timeline for bringing the rh1 capacity back online?
14:16:01 derekh: could we fit in a periodic upgrade job to at least monitor it?
14:16:13 derekh: just an idea...
14:16:21 I've added the upgrades jobs as an experimental job on the tripleo-ci repo, but that won't work unless we add multiple NICs to the OVB envs
14:16:42 so that means we've got no coverage of net-iso now too?
14:17:02 sounds like it is time to land a bunch of code :) !
14:17:07 lol
14:17:10 #topic CI
14:17:12 dprince: yup, I think I did add it as a periodic job
14:17:22 since we're talking CI, let's just do the CI topic now :)
14:17:28 dprince: but we'll need multiple NICs for it to pass
14:17:40 derekh: okay, so perhaps just a report, or we monitor that manually each day...
14:17:46 * derekh had something prepared
14:17:48 All our jobs are now running on rh1; this is a subset of jobs, but hopefully we'll only be in this situation for a short period
14:17:48 Most of the details on the move are here: http://lists.openstack.org/pipermail/openstack-dev/2016-July/098616.html
14:17:48 If the jobs on rh2 go well, I think the main question we need to answer is: should we redeploy rh1 as it was, or instead switch to OVB?
14:17:48 Some of the pros/cons I've put in that email, so maybe we should discuss it there
14:17:50 On rh2 we have two problems currently
14:17:52 1. liberty and mitaka jobs were failing because the sudoers file on the image used contained quotes, and instack-undercloud didn't allow for that
14:17:55 this patch fixes the problem: https://review.openstack.org/#/c/336470/
14:17:57 but on liberty there seems to be a separate problem: python is failing with a seg fault
14:17:59 http://logs.openstack.org/70/336470/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/816d049/console.html#_2016-07-04_16_14_34_270127
14:18:02 not sure why yet
14:18:04 2. The failure rate of the HA job is a little higher than it was on rh2, this is what I'm trying to reproduce at the moment
14:18:12 the first line is wrong, all jobs are now on rh2...
14:19:30 derekh: thanks for the update - so can we make the call re OVB when we've run the new rh2 CI for a little longer?
14:19:45 the last line is wrong also, s/rh2/rh1/
14:19:45 derekh: I will catch up on the list too, but unless rh1 comes up without much work I'm leaning towards a rebuild
14:19:53 e.g. when will rh1 be either rebuilt or brought back online?
14:20:08 we are long overdue... I hate having to context switch back to Nova BM and tripleo-image-elements to maintain that cloud
14:20:16 shardy: yup, I think that would be best, but I'm leaning in the direction of OVB as well
14:20:27 Yeah, provided the OVB env is working smoothly it seems like it'd be good to standardize on that
14:20:31 shardy: we should have the HW available to us again on Thursday
14:20:44 derekh: Ok, cool, not a huge outage then
14:20:53 shardy: after that it depends on what problems we hit, there could be none or many
14:21:03 shardy: available doesn't mean anything is working :)
14:21:10 fingers crossed though
14:21:16 dprince: Yeah, but at that point we can make the call re OVB or not ;)
14:21:17 yup, what dprince said
14:21:31 until the hardware is moved, we can keep deliberating and looking at how things are with rh2 :)
14:21:32 do we need to refrain from merging some kinds of patches (e.g. net-iso)?
14:22:04 EmilienM: I think we'll have to rely on local testing for things that look risky from a CI coverage perspective
14:22:08 EmilienM: without local testing I'd say let's be careful
14:22:25 +1
14:22:27 good to know
14:22:28 I don't think we need to make a hard rule about not merging things though
14:23:21 Ok, anything else we need to discuss re CI?
14:23:49 onboarding for new workforce?
14:24:26 panda: Ah yeah, I never got around to the ML thread re mentoring, I'll do it today
14:24:34 sorry, busy week
14:24:48 I will need some directions on where to start and what the initial expectations are
14:24:49 thanks for the reminder
14:24:52 shardy: ok, thanks.
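Returning to the sudoers issue derekh raised under the CI topic: the liberty/mitaka failures came from a quoted sudoers entry on the image that a stricter validation in the undercloud install rejected. The toy check below illustrates that failure mode only; the patterns and example lines are invented for illustration and are not instack-undercloud's actual validation code.

```python
import re

# Toy validator in the spirit of the check that tripped up liberty/mitaka:
# a strict pattern that rejects quotes around the NOPASSWD tag, versus a
# relaxed pattern that accepts both quoted and unquoted forms.
STRICT = re.compile(r'^\S+ ALL ?= ?\(ALL\) NOPASSWD: ?ALL$')
LENIENT = re.compile(r'^\S+ ALL ?= ?\(ALL\) "?NOPASSWD: ?ALL"?$')

quoted = 'jenkins ALL = (ALL) "NOPASSWD: ALL"'  # hypothetical image content
plain = 'jenkins ALL = (ALL) NOPASSWD: ALL'

print(bool(STRICT.match(quoted)))   # False - strict check rejects the quoted line
print(bool(LENIENT.match(quoted)))  # True - relaxed check accepts both forms
print(bool(LENIENT.match(plain)))   # True
```

The actual fix referenced in the meeting (https://review.openstack.org/#/c/336470/) made the undercloud tolerate the quoted form rather than changing the image.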
14:25:15 I'll also create an account on rh2 with some quota for people looking to reproduce CI, and put together some instructions
14:25:29 derekh: +1 that sounds good
14:25:29 https://goo.gl/VpIZPJ <- CI bugs
14:25:36 derekh, +1
14:25:42 #topic Projects releases or stable backports
14:25:54 So, it's only a week until newton-2
14:26:02 #link https://launchpad.net/tripleo/+milestone/newton-2
14:26:12 #link http://releases.openstack.org/newton/schedule.html
14:26:28 shardy: sorry, still on CI
14:26:31 I'm starting to get worried, because although we're making good progress, we've not landed any blueprints yet
14:26:32 shardy: we need to land https://review.openstack.org/#/c/336130/
14:26:34 same as n-1
14:26:50 shardy: mgould has some patches waiting on liberty and he's waiting for https://review.openstack.org/#/c/336130/
14:27:07 I think we're already pushing hard on composable-services, but can folks please help by prioritizing reviews of things on that n-2 launchpad list (bugs and BPs)
14:27:33 shardy: for https://blueprints.launchpad.net/tripleo/+spec/overcloud-upgrades-workflow-mitaka-to-newton - I've been pulled into some mitaka-related upgrades work atm... so that may slip. biggest issue there is https://bugs.launchpad.net/tripleo/+bug/1596950
14:27:33 Launchpad bug 1596950 in tripleo "undercloud upgrade from stable/mitaka to latest hangs - possibly for swift services yet tbd" [Medium,Triaged] - Assigned to Marios Andreou (marios-b)
14:27:40 I'm going to start deferring things this week, but it'd be super-great if I don't have to defer multiple features..
14:28:05 marios: ack, please retarget to n-3 then, and link the related bugs on the whiteboard
14:28:30 if it's blocked by that bug, you can set the status to blocked
14:28:43 shardy: ack
14:29:08 EmilienM: thanks, I saw that yesterday, good to see the lint issues are now fixed
14:29:35 shardy: still need +1 on liberty, mitaka is fixed
14:30:22 I think we should merge it, the upgrades job isn't failing because of the gemfile change
14:30:48 right
14:31:02 done ;)
14:31:06 shardy: the upgrade job is broken on liberty since we switched to ipv6
14:31:23 Yeah, it'd be good to work out which patches we need to fix that
14:31:34 thanks for investigating and figuring out the root cause
14:31:40 right, I spent time on it but I could not find time to work on it more
14:31:56 np, let's revisit after n-2 has passed
14:32:18 Does anyone else have updates or questions around releases or backports?
14:33:04 #topic Specs
14:33:35 #link https://review.openstack.org/#/q/project:openstack/tripleo-specs+status:open
14:33:38 shardy: for the release, I can take care of it next Monday if you want
14:34:20 We need further review of https://review.openstack.org/#/c/313872/ and https://review.openstack.org/#/c/313871/
14:34:31 beagles: ^^ any chance you can revisit those?
14:34:42 it'd be good to land them soon, as we're getting late in the cycle
14:34:53 dsneddon has an interesting one re lldp too
14:35:04 and we need to figure out the plan re composable upgrades
14:35:15 shadower, we are working on SRIOV and DPDK
14:35:22 EmilienM: Sure, if you're happy to do that that would be great, thanks! :)
14:35:31 shardy, we are working on SRIOV and DPDK
14:35:31 will do
14:36:09 karthiks: ack - I've approved the blueprints, so it'd be good to get those specs landed too
14:36:19 shardy, we have almost completed the SRIOV changes.. final puppet changes in progress..
14:36:46 for the dpdk, we are working on the os-net-config and the puppet changes for dpdk..
14:37:02 shardy: ack on the composable upgrades, though as we discussed previously, an upgrade newton to ocata thing. by "work out" you mean still tbd/discuss etc, right, or did I miss a thread again
14:37:21 skramaja: good to know, I see some patches referenced from https://blueprints.launchpad.net/tripleo/+spec/tripleo-sriov
14:37:24 thanks for the update
14:37:34 shardy, is there anything we have to do on the specs to push the specs?
14:37:42 shardy: also happy to work/help on a spec once we're there... though jistr has one up already that we should update once we decide
14:37:46 marios: Yeah, I was just mentioning it as it's an outstanding spec
14:38:03 marios: we've got a more urgent "ensure minor updates work with composable services" problem for Newton
14:38:32 but the major upgrade of composable services can potentially slip to ocata, but I'd like to get a clear idea of how we'll do it asap
14:38:45 shardy: ack thanks
14:38:53 been planning some prototyping to help with that, but composable roles/services have taken priority
14:39:10 shardy: any links to your prototypes?
14:39:16 * dprince is interested in prototyping this too
14:39:32 dprince: No, I've not done them yet, just looked at a few possible approaches
14:39:41 hoping to get to that after we're past n-2 next week
14:39:47 shardy: cool. thanks
14:40:05 I have one remark about composable services
14:40:08 dprince: cool, we can sync up and perhaps work on it together
14:40:41 I would like to slow down the effort on new composable services (mistral, zaqar, etc) a bit, and get help for me (and others) to finish core services (neutron, ceilometer, etc)
14:41:14 we still have some bits to move
14:41:25 and if we want to make it for newton, we need more hands
14:41:38 mistral/zaqar/etc can wait IMHO
14:41:42 EmilienM: for new services though there shouldn't be any conflicts right?
14:41:49 I'm not talking about conflicts
14:41:52 EmilienM: besides the CI resource drain, what are your concerns?
14:42:20 ok, let me put it this way: I need help to finish the ceilometer / nova / neutron bits
14:42:24 EmilienM: :( I'd really rather not delay manila further, though I can appreciate your concern for the time left to get all things done (core)
14:42:51 EmilienM: from my POV one of the biggest benefits of composability is that we can add new services without concern for impacting the core
14:42:52 I'm dealing with multiple blockers at the same time, I would need some help
14:43:07 EmilienM: ack, I agree with the priorities FWIW
14:43:11 dprince: *again* it's not a problem of impacting the core
14:43:18 it's a problem of "enough people moving core bits"
14:43:19 we should prioritize those services we know folks want to scale independently of the controller
14:43:27 EmilienM: but I would like to not discourage others who are chipping away at new services
14:43:40 e.g. those which are still part of the monolithic controller definition
14:43:43 I'm not discouraging people, I'm asking for help here
14:43:50 EmilienM: gotcha
14:43:53 cool
14:44:43 some people are waiting for this thing to deploy AIO etc
14:44:56 Yeah, there's a balance here, stuff like manila has been around for ages so we shouldn't block that, but absolutely we need more help with reviews/patches for the core services
14:45:05 it's currently not possible because we need to finish nova / neutron (very close but still not finished)
14:45:27 fwiw, I have deployed AIO, w/o pacemaker
14:45:35 EmilienM: perhaps we should break down the remaining services and raise bugs for those not yet decomposed
14:45:43 right, but with pacemaker we still have issues (I'm on it)
14:45:45 EmilienM: I'm testing your other 2 patches now
14:46:04 then we could land the composable services BP, which is really about the new architecture, and have a clearer view wrt progress on remaining/new services
14:46:26 with the benefit of hindsight, a single BP and an etherpad hasn't been the best way to track this (huge) chunk of work
14:46:45 did it seem huge in foresight? :)
14:46:57 right, it was a lot of work
14:47:23 jrist: I think we knew it was a lot of work, but tracking it could have been done in a more granular way
14:47:46 guys, quick question: is the etherpad fully updated? https://etherpad.openstack.org/p/tripleo-composable-services
14:47:55 ccamacho: I'm trying to update it every day
14:48:14 shardy: we could have just forked t-h-t and then we'd be done by now already :)
14:48:22 ccamacho: I check all patches every day and update the status
14:48:37 EmilienM: me too, but when I hit things not updated, ack
14:48:49 dprince: maybe, but this approach has allowed us to prove each step via CI, which we would have lost with the forking approach I guess
14:49:16 progress lately has been very good, since CI has been more stable :)
14:49:48 #topic open discussion
14:50:03 Anyone have any topics to discuss in our final 10 mins?
14:50:17 slagle: the AIO work sounds interesting, care to share any details on that?
14:52:09 Evidently not ;)
14:52:21 Ok, well if there's nothing else let's finish early, thanks everyone!
14:52:33 shardy: thanks
14:52:34 #endmeeting