14:00:07 #startmeeting tripleo
14:00:08 Meeting started Tue Aug 30 14:00:07 2016 UTC and is due to finish in 60 minutes. The chair is shardy. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:11 The meeting name has been set to 'tripleo'
14:00:16 #topic rollcall
14:00:23 o/
14:00:24 Hi all, who's around?
14:00:24 o/
14:00:26 o/
14:00:29 hi
14:00:29 o/
14:00:35 o/
14:00:37 o/
14:00:51 o/
14:00:55 o/
14:00:56 hey
14:01:02 \o
14:01:10 \o/
14:01:39 hi
14:01:43 o/
14:01:52 o/
14:02:05 Ok, well hello all, let's get started :)
14:02:08 o/
14:02:12 #topic agenda
14:02:16 #link https://wiki.openstack.org/wiki/Meetings/TripleO
14:02:25 #link https://etherpad.openstack.org/p/tripleo-meeting-items
14:02:26 o/
14:02:36 please add any one-off items to the etherpad, we currently have two
14:02:53 o/
14:02:54 Also, I wanted to say thanks to EmilienM for running these meetings the last two weeks while I was on PTO
14:03:04 o/
14:03:05 +1 thanks EmilienM
14:03:08 o/
14:03:16 o/
14:03:36 cool
14:04:28 #topic one-off items
14:04:37 o/
14:04:41 #info What features to add to Tripleo-CI tests for newton (panda)
14:04:52 panda: care to provide some more details on this one?
14:04:53 I'm actually already working on that
14:05:05 but sure, details first :)
14:05:24 Ok, so it's about defining the test matrix, sounds good
14:05:33 https://review.openstack.org/#/c/362504/
14:05:54 it's experimental now but really close to being ready
14:06:06 I copy/pasted the matrix from Puppet CI for more consistency
14:06:08 well, I don't have much more detail to add; I know we are not testing everything, and I'd like to know what to add for newton, composable roles for example
14:06:22 o/
14:06:33 panda: Ok, well it sounds like we'll have to continue to expand the test matrix via scenarios
14:06:44 and that will eventually include tests related to composable roles
14:06:48 panda: can you tell if my patch is related?
14:07:13 right now we're testing some aspects of composable services just by enabling/disabling various services, full composability isn't quite landed yet
14:07:25 shardy: example of a run: https://review.openstack.org/#/c/362506/
14:07:38 what is the status of testing M->N upgrades in CI?
14:07:43 if you try to change the cinder composable service, it runs the scenario002 job
14:07:48 shardy: stalled
14:07:49 matbu, ^
14:07:50 that and IPv6 are my main concerns prior to the newton release
14:07:57 shardy: the undercloud was working but now fails
14:07:58 EmilienM: yes, partially, jobs that use those scenarios need to be added of course
14:08:00 upgrades and ipv6?
14:08:01 o/
14:08:04 shardy: the overcloud is not testing it yet
14:08:13 panda: they are already added dude
14:08:29 +1 on upgrades/ipv6 to start with
14:08:35 EmilienM: ack, Ok perhaps we can discuss that more in the CI topic then
14:08:38 k.. cool
14:08:45 panda: I added the jobs here https://review.openstack.org/#/c/361433/
14:09:05 * beagles barges in late
14:09:12 anything else on this topic, or shall we continue?
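For context on the scenario jobs discussed above: they are intended to run only when a change touches the files relevant to that scenario. A rough sketch of what such a files filter could look like in a zuul v2 layout; the job name and regexes here are illustrative assumptions, not the actual definitions from https://review.openstack.org/#/c/361433/.

```yaml
# Illustrative only: a zuul v2 layout entry restricting a scenario job to
# changes that touch the related service templates. The job name and file
# patterns are assumptions, not copied from the real project-config patch.
jobs:
  - name: gate-tripleo-ci-centos-7-scenario002-multinode
    voting: false
    files:
      - '^puppet/services/cinder-.*$'
      - '^environments/.*scenario002.*$'
```

The intent is that a patch touching only, say, the cinder service template triggers just the scenario that exercises cinder, keeping the number of jobs run per change manageable.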
14:09:33 I guess it's good, we can continue later in the CI topic
14:09:42 #info newton-3 release status and FFEs (shardy)
14:09:52 #link https://bugs.launchpad.net/tripleo/+milestone/newton-3
14:10:09 So I added this as a one-off topic, so we can all sync up on the status
14:10:09 weshay: I'll give detailed status on upgrade/update jobs later
14:10:37 we're planning to tag newton-3 tomorrow, so we need to land as many of the remaining "Needs Code Review" blueprints as possible today
14:10:57 please ensure your blueprint status is correctly set, and help with reviews to push the last few things in before EOD
14:11:04 shardy: are we going to tag even if CI is not at latest trunk?
14:11:16 shardy: hey, i see the manila-generic stuff isn't in the newton-3 list; i should file a bug for it i guess, but thanks to ben who posted to the list last week (I was away). just making sure you're aware/agree.
14:11:24 shardy: sorry, manila-netapp
14:11:35 EmilienM: unless we're confident we'll get a promote so we can release on Thursday, I don't think we have much choice?
14:11:46 right
14:11:46 EmilienM, ok.. from what I've heard from matbu, we have the CI in place, we're just waiting on some patches to land to unblock the overcloud upgrade
14:12:22 marios: Yeah, please raise a bug or bp, as we lost track of it because there's nothing in LP
14:12:44 shardy: ack, will do. ftr: tht https://review.openstack.org/#/c/354019/ and puppet-tripleo @ https://review.openstack.org/#/c/354014
14:12:56 So, I created https://launchpad.net/tripleo/+milestone/newton-rc1
14:13:22 my plan is to defer all FFE BPs, and any remaining bugs targeted to newton-3, when we tag the newton-3 release
14:13:52 shardy: should we try to land the non-FFE stuff today if possible then?
14:13:52 we then have a couple more weeks where we can land the remaining things, do testing/bugfixing and get ready to declare newton final
14:14:35 dprince: Yeah, basically it'd be good to minimize the number of FFEs we're tracking by landing stuff today, but if that's not possible, we'll just have to add things to the FFE list and move the target to rc1
14:15:03 hopefully then we can get more folks focused on the few remaining features, then switch into bugfix mode for the final push ahead of the release
14:15:05 shardy: to reiterate and emphasize for myself and others - if you have patches in-flight that you want to be considered, you need to have a bug or blueprint associated with the patches, and they need to be appropriately targeted and, if appropriate, have an FFE, correct?
14:15:06 sound reasonable?
14:15:28 shardy: it sounds safe
14:15:35 beagles: Yes, otherwise there's a high risk the patches will get missed
14:15:56 shardy, ack thanks
14:16:36 beagles: closer to the release (particularly after we branch a release candidate) nothing will get onto stable/newton without a bug reference
14:17:08 beagles: for now, it's about knowing when we can sanely branch, and launchpad is what I'm tracking to evaluate that :)
14:17:15 shardy, makes sense, thanks :)
14:17:23 Ok, any more questions re the release before we move on?
14:17:37 shardy, in the newton-rc1, sriov/dpdk are missing?
14:17:54 saneax: they are still targeted at newton-3 because I'm hoping they will land today
14:18:04 OK shardy thanks
14:18:07 if that doesn't happen, I'll move them to rc1 tomorrow
14:18:13 shardy: my topic is sort of related, so it can segue
14:18:20 jrist: Ok, go for it! :)
14:18:25 ok so two things
14:18:40 tripleo-ui has not been integrated upstream with tripleo
14:18:50 so I've been sort of struggling along trying to track both
14:18:59 I would like to propose getting tripleo-ui into tripleo proper
14:19:04 launchpad, bugs, everything
14:19:19 So we currently have a separate UI LP project here: https://launchpad.net/tripleo-ui
14:19:24 was wondering how to do that since it's a pretty big project
14:19:45 I'm Ok with keeping that if you'll find it useful, or tracking everything in lp/tripleo
14:20:00 but I'd like us to start tagging tripleo-ui releases coordinated with all the other tripleo things
14:20:04 mostly I want to make sure that we're getting stuff in for tripleo and the workflows that we need for the UI
14:20:11 e.g. starting from tomorrow for newton-3, if you're ready for that?
14:20:12 shardy: that was the segue
14:20:14 exactly
14:20:24 we're extremely close. that's what we're targeting too
14:20:32 jtomasek, florianf ^
14:20:34 right?
14:20:34 :)
14:20:44 jrist: I'd say let's coordinate manually until the end of newton, then for ocata we'll get the launchpad tracking better aligned
14:21:00 shardy: ok thanks. we can talk more about the details of that in #tripleo
14:21:10 as for the release, I'll coordinate with EmilienM to do a patch for releases
14:21:15 I don't really care if it's in tripleo or tripleo-ui on LP, but we'll need to have per-milestone deliverables and track everything aligned with the rest of tripleo
14:21:27 that's the idea. so we can make sure we're tracking within milestones
14:21:36 and coordinate with stuff in tripleo for the UI
14:21:41 jrist: yup, either EmilienM or I will ping you tomorrow to agree on the SHA to release from
14:21:44 thanks shardy, that's it from me
14:21:46 then we'll tag it with everything else
14:21:52 yep
14:21:56 woohoo, first integrated tripleo-ui! :)
14:22:02 shardy, jrist +1
14:22:46 Ok, thanks, let's move on
14:22:50 #info TripleO CI for proprietary vendors (qasims)
14:22:53 qasims: Hi!
14:22:57 Hey
14:23:00 qasims: hey
14:23:23 I wanted to discuss possible ideas for this
14:23:28 so this is a follow-up to the ML thread EmilienM posted, and it'd be great to agree on the best integration point for third-party CI, such that the various backends can be better tested
14:23:45 third party or not
14:23:56 some vendors could run their plugins in our multinode jobs
14:24:09 as they don't require specific hardware or proprietary software
14:24:21 EmilienM: Yeah, agreed, that may work in some cases
14:24:29 my scenario work fits into this
14:24:36 we could have more scenarios for each vendor
14:24:44 and call the jobs when required thanks to zuul
14:24:58 qasims: So I guess the question is how we handle non-free components, as I don't think we can use infra resources to deploy/test proprietary pieces
14:25:09 the problem with plumgrid is that it's proprietary
14:25:15 in those cases, I guess that third-party jobs would be a better fit?
14:25:22 exactly
14:25:24 exactly
14:25:25 similar to what is done for some other projects, e.g. Ironic?
14:25:41 how will a third party use tripleo-ci though?
14:25:54 weshay: you've been looking into setting up third-party CI jobs against tripleo, right?
14:26:18 shardy, aye
14:26:38 we have a third-party CI trigger with "rdo-ci-check"
14:26:42 weshay: would it be possible to document how that's all wired up, perhaps in a patch to tripleo-docs?
14:26:54 it's in place from ci.centos, it's not ready from internal ci systems. We need a file server for logs that is public. That is in progress
14:26:57 that may prove a useful starting point for qasims and other vendors?
14:27:13 shardy: I'm working on a doc about it.
14:27:14 yes
14:27:19 shardy, the docs for 3rd party are also in progress, under review atm.. adarazs ^
14:27:30 trown: I guess that's the question, so getting the high-level integration points defined is probably the first step
14:27:35 shardy, qasims: https://review.openstack.org/360007 WIP change
14:27:50 that includes docs
14:27:59 so we want to deploy plumgrid in rdo ci?
14:28:16 maybe internal
14:28:32 EmilienM: No, I think vendors would probably have to set up their own jobs?
14:28:45 Or perhaps capacity could be made available on an ovb cloud somewhere?
14:28:46 in their infra?
14:29:23 qasims: what are your thoughts on running the tests, what would be the best fit from your perspective?
14:29:43 honestly i would have been happy if someone from midokura or contrail could also reply to this thread, they are open source and it would be easier for a first iteration
14:29:53 the case of non-free software is more complex
14:30:12 for now it seems that we will have to set this up on our own infra
14:30:29 EmilienM: sure, but we need to address both requirements, so it's good to start the conversation :)
14:31:34 shardy, maybe we can help by adding the docs and infra required for running third-party CI, and how to customize the deployment
14:31:35 adarazs: please do not use rdo-ci-check, 3rd party CI should only respond to recheck
14:31:48 adarazs: http://docs.openstack.org/infra/system-config/third_party.html
14:32:05 pabelanger: we're not triggering without this word for now, for capacity reasons.
14:32:22 qasims: Ok, well it sounds like some further discussion on how that will work is needed, but I think the main message is let's keep discussing it, and try to document things as we go to simplify subsequent integration efforts
14:32:38 sounds good.
14:33:02 pabelanger: so this is just a magic word that you can use to trigger the third-party ci for a patch you want to check, so we don't do 'recheck' either
14:33:02 Ok, thanks qasims!
14:33:15 #topic bugs
14:33:18 qasims: thx for replying to the proposal
14:33:26 I will shoot proposals to the ML if any come up
14:33:34 adarazs: patch in review.openstack.org?
14:33:41 #link https://bugs.launchpad.net/tripleo
14:34:00 pabelanger: yes. -- it's kind of in stealth mode for now. :)
14:34:02 So, my main request is: please ensure all release-impacting bugs are targeted at rc1 from now on
14:34:18 we'll need to track release blockers carefully between now and the final release
14:34:29 anyone have any other bug related things to discuss?
14:34:36 adarazs: yes, please do not do that. We have policies in place for 3rd-party CI. Please read: http://docs.openstack.org/infra/system-config/third_party.html and follow that.
14:35:29 adarazs: we can discuss it more in #tripleo after the meeting
14:35:46 shardy: nothing on my side
14:35:50 pabelanger: okay.
14:36:28 https://bugs.launchpad.net/tripleo/+bug/1618418
14:36:28 Launchpad bug 1618418 in tripleo "CI: periodic jobs fail on undercloud install because "AttributeError: 'module' object has no attribute 'RegionInvalidationStrategy'"" [High,Triaged]
14:36:37 it's fixed in RDO
14:36:38 has anyone seen this problem in their jobs?
14:36:44 the package is in buildlogs now
14:36:44 We do have a lot of New untriaged bugs, so any help triaging those over the next few days would be appreciated
14:37:01 sshnaidm: yes, fixed last night, it was a python dogpile problem
14:37:11 EmilienM, great, thanks
14:37:13 shardy: will do later today
14:37:23 sshnaidm: fyi it also blocked puppet ci promotion
14:37:34 #topic Projects releases or stable backports
14:37:36 sshnaidm: testing now again with the new package in puppet ci and will let you know
14:37:45 So, not much to add on this, we already discussed newton-3
14:37:54 EmilienM proposed a mitaka release, so thanks for that
14:38:01 do we need a liberty release too?
14:38:05 shardy: cool, i'll try to get it merged asap
14:38:21 shardy: i haven't seen many backports
14:38:26 i don't think we need one
14:38:31 Ok, let's just go with mitaka for now then
14:38:42 Anything else re releases/backports?
14:38:50 yeah, we're already super busy with newton and mitaka
14:39:33 #topic CI
14:39:45 So, earlier we mentioned upgrade coverage, and IPv6
14:39:47 so I have some updates
14:40:00 regarding coverage of services, we are working on this with scenarios
14:40:23 i have some blockers with gnocchi but pradk will help later today
14:40:44 otherwise the jobs are in place. once they are working i'll move them from the experimental queue to check
14:40:48 as non-voting of course
14:41:00 https://review.openstack.org/#/c/362504/
14:41:08 example with https://review.openstack.org/#/c/362506/
14:41:12 Please don't move them to check right now.
14:41:18 One thing related to this: ramishra implemented a new heat feature which will help when manipulating the *Services lists
14:41:20 bnemec: why?
14:41:21 https://review.openstack.org/#/q/status:merged+project:openstack/heat+branch:master+topic:bp/environment-merging
14:41:30 Oh wait, these don't run in our cloud though, do they?
14:41:35 bnemec: they're non-voting, and they run only if you try to modify their env
14:41:38 bnemec: no
14:41:42 bnemec: they run in osinfra
14:41:46 bnemec: multinode ftw
14:41:51 we may want to move towards that to make it easier to compose the different scenarios without hard-coding e.g. ControllerServices
14:41:54 nm, carry on then. :-)
14:41:58 bnemec: k
14:42:09 another thing: upgrades
14:42:11 Yeah, to be fair multinode is working out really well
14:42:26 i've heard RDO CI had successful results testing upgrades
14:42:30 Does anyone have any insight into why our ovb jobs have gotten so slow?
14:42:36 it would be great to have help to make it work in upstream CI first
14:42:45 I think it's a combination of things, but we're seeing a lot of jobs hit the infra timeout
14:42:47 as a reminder for everyone here: RDO CI is not upstream CI
14:42:51 shardy: Yes. We think it's a hardware problem.
14:42:57 which is kinda sad given all the hardware upgrades
14:43:15 The new memory was installed incorrectly and seems to be hurting performance on those boxes.
14:43:20 bnemec: Ok, great - any details or timeline for a fix?
14:43:31 anyway, regarding upgrades/updates, last time i checked last week, jobs were not passing... multiple problems
14:43:58 shardy: We have a ticket open with dcops to get one of the nodes fixed to verify that is the problem. Not sure what the turnaround time on that is.
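For context on the Heat environment-merging feature mentioned above (bp/environment-merging): it lets a later environment file append to a list parameter such as ControllerServices instead of overwriting the whole list. A minimal sketch, assuming the parameter_merge_strategies section described in that blueprint; the service names shown are only illustrative.

```yaml
# Hypothetical scenario environment (illustrative names): with a 'merge'
# strategy, the service listed here would be appended to whatever
# ControllerServices value earlier environments already set, rather than
# replacing it. This assumes the parameter_merge_strategies support added
# by the Heat environment-merging blueprint.
parameter_defaults:
  ControllerServices:
    - OS::TripleO::Services::CinderVolume
parameter_merge_strategies:
  ControllerServices: merge
```

That would let each CI scenario ship a small environment that only adds the services it exercises, instead of hard-coding the full ControllerServices list per scenario.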
14:44:09 shardy: EmilienM i'll take care of these jobs (upgrade M to N)
14:44:10 if anyone volunteers to help on upgrade/update jobs UPSTREAM please let me know
14:44:17 matbu: thank you
14:44:19 EmilienM: ^
14:44:21 bnemec: ack, ok, well good to hear we're en route to a fix
14:44:36 I got good results, i think we are close
14:44:37 shardy: Assuming that works, we'll need to discuss when we can afford to pull a bunch of compute nodes out and get them fixed.
14:44:44 matbu: upstream?
14:45:12 EmilienM: the blocker reviews have been merged (the latest yesterday)
14:45:16 matbu: to clarify, is this an RDO upgrade job, where the plan was to make it test upstream code via third-party CI?
14:45:19 please stop saying you guys have results when it's not in upstream CI. Both systems are different
14:45:45 EmilienM: they are different, but third-party feedback would still be valuable provided it's testing upstream code
14:45:52 shardy: i mean upstream (tripleo-ci)
14:45:56 i didn't say it's not valuable
14:46:06 it's just not voting and not run every time.
14:46:20 EmilienM, matbu is working in both ci systems atm
14:46:28 shardy: EmilienM in the lifecycle team, I have been working with rdo on major upgrades
14:46:42 matbu: Ok, thanks, let's sync up after the meeting, could you perhaps start an etherpad with a list of patches so we can see what needs help w/reviews etc?
14:46:46 cool, I would like the jobs out of the experimental queue asap
14:47:10 even the update job was broken, a problem in Heat. I haven't checked if it passes now
14:47:31 shardy: EmilienM ack
14:47:33 So, gfidente mentioned we removed ipv6 from CI when we switched to ovb
14:47:50 does anyone have any details on what work is needed to add that back in?
14:47:53 shardy: yep
14:48:09 We need to re-enable the updates job. That's where we were testing ipv6.
14:48:20 shardy: should be similar to IPv4, just different heat environments
14:48:21 bnemec: it's broken now afaict
14:48:22 panda is familiar w/ ipv6, is that something he can drive?
14:48:22 Also, we'll need an ipv6 version of the OVB net-iso templates.
14:48:42 bnemec: the update job is failing because Heat could not update the stack
14:48:51 afaik zaneb fixed it
14:48:59 also I think the update job is using multinode, where we don't have os-net-config
14:49:07 but we haven't promoted the tripleo repo for 6 days
14:49:11 Yeah, that update bug is fixed, but we need to promote current-tripleo to get it
14:49:13 so we can't as easily bring network isolation to the updates job
14:49:14 gfidente: Two different updates jobs.
14:49:35 We still have the old updates job defined, we just aren't running it in the check queue.
14:49:45 right
14:49:52 then +1 on bringing the old updates job back to life
14:50:09 do we have the resources to run it in check again?
14:50:10 I think we need ipv6 coverage before we declare newton final, as we've touched a bunch of stuff which might have broken it
14:50:37 Ok then, sounds like we have a plan :)
14:50:56 Anything else related to CI before we move on?
14:50:56 so who is going after ipv6 in ci?
14:50:58 hopefully it works in RDO CI
14:51:06 As you know we have a problem with periodic job duration; here we have a discussion about what to include there, please comment: https://review.openstack.org/#/c/361429/ Also I proposed a patch to split the features: https://review.openstack.org/362904
14:51:08 helping :)
14:51:16 EmilienM: Back under the bridge with you. :-P
14:51:44 Maybe we could just switch the ha job to ipv6?
14:51:51 For now at least.
14:51:56 bnemec: it sounds like an excellent proposal
14:52:08 It's already using net-iso, so it would just be a question of switching to ipv6.
14:52:34 weshay: I think we need a volunteer; the plan is to re-enable the old update job w/ipv6, and if panda is available to help, that would be good
14:52:42 bnemec: then we won't cover IPv4?
14:53:12 shardy: yep, I'll do it
14:53:43 i think it's a one-line patch
14:54:35 dprince: we have the two ovb jobs, so I guess the idea is to run one ipv6 and one ipv4?
14:54:48 I would prefer not to lose coverage of HA w/ipv4 though
14:54:55 Yeah, but nonha doesn't use net-iso, so it's a different kind of ipv4.
14:55:06 exactly, that is my only point
14:55:12 as there's a bunch of stuff which might break with multiple nodes (all the list manipulation in allNodesConfig is a good example)
14:55:14 I'm just looking at alternatives in case we can't get the updates job working quickly.
14:55:37 maybe we can have a periodic job for ha-ipv6?
14:55:41 Also, adding another job to the check queue is going to suck, given our existing performance issues.
14:55:51 Yeah, a periodic job would certainly be better than nothing
14:56:01 i can hack on project-config to add it
14:56:05 at least then we can work to make it pass before we release
14:56:10 if everyone agrees
14:56:15 Alright, 4 mins left
14:56:18 Or experimental, so we can trigger it on patches we think might affect ipv6?
14:56:22 bnemec: right
14:56:30 I'll skip Specs, but just a reminder that folks can start raising ocata specs now
14:56:37 would be good to have some for discussion at summit
14:56:39 #action EmilienM to create exp job for ipv6-ha
14:56:44 #topic open discussion
14:56:51 yeah, we don't have resources for another "gating" check job yet... pending performance improvements
14:56:57 dprince: right
14:57:09 an experimental job will be ok
14:58:52 Ok, does anyone have anything else to raise, or shall we wrap things up?
14:59:17 I'll take that as a no, thanks all! :)
14:59:23 Thanks folks
14:59:25 o/ cheers
14:59:25 #endmeeting