14:00:13 #startmeeting tripleo
14:00:14 Meeting started Tue Nov 1 14:00:13 2016 UTC and is due to finish in 60 minutes. The chair is EmilienM. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:15 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:18 The meeting name has been set to 'tripleo'
14:00:26 #topic agenda
14:00:28 * one off agenda items
14:00:30 * bugs
14:00:32 * Projects releases or stable backports
14:00:34 * CI
14:00:36 * Specs
14:00:38 * open discussion
14:00:40 hello folks!
14:00:44 \m/
14:00:45 hola
14:00:46 heya
14:00:48 hello! o/
14:00:50 o/
14:00:51 o/
14:00:52 o/
14:00:53 o/
14:00:59 o/
14:01:00 o/
14:01:08 * marios always confused with the time change... wasn't sure if it was now or in 1 hour
14:01:08 o/
14:01:12 o/ (kinda)
14:01:15 o/
14:01:26 I hope you had safe travels back home!
14:01:40 here
14:02:25 #topic off agenda items
14:02:28 #link https://etherpad.openstack.org/p/tripleo-meeting-items
14:02:32 sshnaidm: hi!
14:02:37 EmilienM, hi!
14:02:47 hi
14:03:05 EmilienM, I have 2 questions in the off-agenda items
14:03:10 go ahead
14:03:53 I started a POC patch for running quickstart in the TripleO CI ovb environment and it seemed to be working
14:04:16 I'd like to do an experimental job with quickstart running there to prove its correctness and stability
14:04:31 (and detect possible problems)
14:04:57 I'd like to ask if there are any objections to having such a job in the experimental pipeline
14:05:34 * adarazs looks forward to it.
14:05:34 interesting, I'm wondering what path we are taking: 1) using tripleo-ci to run quickstart, or 2) implementing new quickstart jobs in third-party CI that deploy ovb
14:05:47 o/
14:05:56 2) wouldn't use tripleo-ci
14:06:49 I'd prefer 1)
14:07:01 1) sounds complex to me
14:07:16 as we're going to increase the complexity of the tripleo-ci logic
14:07:40 see https://review.openstack.org/#/c/381094/
14:07:55 if we want to have tripleo-ci be driven by quickstart in some way soon (which I hope we agreed that we do) then it would be a good first step.
14:08:20 EmilienM, 1) would also help make the CI externally consumable, btw
14:08:58 sshnaidm: can you link to the patch you are referring to?
14:09:05 EmilienM, anyway, it's an experimental job; everybody will look at it, and if we hate it, it's always possible to kill it
14:09:09 slagle: https://review.openstack.org/#/c/381094/
14:09:10 o/
14:09:20 oh right :)
14:09:27 2) looks closer to what we're going to do in the end, I wouldn't add an intermediate step.
14:09:58 1) would be a transition towards 2) in the end, iiuc
14:10:12 but anyway, it's an implementation-specific question that we can discuss outside this meeting.
14:10:36 sshnaidm: +1 for the experimental job
14:10:51 ok, thanks!
14:10:51 sshnaidm: and wait 1 or 2 week(s) to make sure it's "stable"
14:11:14 sure
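(For reference: experimental-pipeline jobs in this era were defined in openstack-infra/project-config and triggered on demand by leaving a "check experimental" comment on a review, so an unstable quickstart job would cost nothing in the normal check queue. A minimal sketch of what such an entry might look like, assuming the zuul v2-style layout; the project entry and job name here are hypothetical:)

```yaml
# zuul/layout.yaml in openstack-infra/project-config (zuul v2-era syntax).
# The quickstart job name is hypothetical; OVB jobs ran through TripleO's
# own cloud, so the exact pipeline name may differ.
- name: openstack-infra/tripleo-ci
  experimental:
    - gate-tripleo-ci-centos-7-ovb-quickstart
```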
14:11:48 ok, then the second question is about moving tripleo-ci from "openstack-infra" to "openstack"
14:11:51 sshnaidm: in the meantime, could we make progress on using oooq to deploy multinode jobs too?
14:12:19 EmilienM, yeah, I planned to do it after ovb is running
14:12:23 excellent
14:12:31 bkero, ^
14:12:41 o/
14:12:58 slagle: did you have thoughts about moving ooo-ci to openstack? I remember you were involved in this
14:13:58 sshnaidm: +1 on my side
14:14:43 sshnaidm: anything else?
14:14:45 ok, then maybe let's discuss it later in #tripleo
14:14:51 EmilienM, no, that's it, thanks
14:14:57 dtantsur: your turn, Sir
14:15:18 yep, quick question: are there plans now to deprecate instack-virt-setup?
14:15:32 it's no longer used in CI, and there seem to be a lot of efforts around quickstart
14:15:47 finally, we're moving away from pxe_ssh drivers, so instack-virt-setup requires reworking
14:15:53 dtantsur: tripleo-quickstart was always supposed to deprecate it, we've just not quite got to that yet in docs/CI
14:16:06 EmilienM: i think it has more to do with moving to 3rd party
14:16:12 dtantsur: have you looked at what rework will be required for quickstart?
14:16:24 EmilienM: once that is complete, it can probably move
14:16:33 slagle: right, ++
14:16:37 shardy, not yet; but at least quickstart is covered by some CI, unlike tripleo-incubator
14:17:00 dtantsur: ack, yeah and we ideally don't want to do this work twice
14:17:01 EmilienM: but our multinode jobs won't be 3rd party, so maybe we'd split the repo, i don't entirely know if there is precedent for that
14:17:35 dtantsur: what's the timeline for ironic moving away from pxe_ssh?
14:17:47 slagle: mhh interesting, indeed; maybe wait a little more until we figure out the implementation
14:18:02 shardy, no strict timeline, but it's already deprecated. we have full rights to remove it in Ocata after O-1
14:18:12 slagle: fwiw, puppet openstack CI runs some scripts in the openstack namespace...
14:18:13 shardy, also we stop covering it with our CI pretty soon
14:18:56 dtantsur: Ok, so it'd be very good if we could switch docs/CI to oooq before it's removed then ;)
14:19:06 true :)
14:19:12 o-1 seems like an ambitious target for that
14:19:25 o-2 maybe?
14:20:01 o-1 is in 2 weeks
14:20:30 yeah, I think it'd be better if we could aim for o-2 and ask if Ironic would be willing to delay removing it until then
14:20:37 so our docs etc don't break
14:20:44 ++
14:20:58 late o/
14:21:25 dtantsur: would that be a reasonable compromise from the Ironic perspective, or are there pressures to remove it really soon?
14:22:19 shardy, we can delay the removal, there is not too much pressure
14:22:33 shardy, but the driver is deprecated, will have no CI soon, and will only receive critical bug fixes
14:22:50 dtantsur: Ok sounds good
14:23:11 can anyone take an action to follow up on the oooq side to determine the changes needed?
14:23:47 I can provide any help, but I can't do it right this second
14:23:57 I know that packaging virtualbmc in RDO is step 0
14:24:26 dtantsur: perhaps you could raise a bug against tripleo-quickstart with details of the work you're aware of, then we can figure out who has bandwidth to do it?
14:24:41 shardy, I think I have a tripleo blueprint..
14:24:48 I can raise a bug as well for sure
14:25:04 #link https://blueprints.launchpad.net/tripleo/+spec/switch-to-virtualbmc
14:25:06 dtantsur: your bp is enough imho
14:25:42 dtantsur: maybe you can sync with weshay to see who from the oooq experts could help there
14:25:47 ack, yeah I didn't realize there was already a BP, thanks
14:25:51 weshay: should we add this to the transition etherpad?
14:25:52 * weshay looks
14:25:56 aye
14:26:17 dtantsur: thanks for bringing up this topic!
14:26:20 oh this is cool
14:26:29 dtantsur++
14:26:41 no problem :)
14:26:41 do we have anything else before we start the regular agenda?
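(As background for the switch-to-virtualbmc blueprint linked above: virtualbmc exposes a libvirt domain as an IPMI endpoint, so Ironic can manage virtual nodes with its regular IPMI drivers instead of the deprecated pxe_ssh. A rough sketch of the idea; the domain name, port, and credentials are illustrative:)

```sh
# Register a virtual BMC for the libvirt domain "node0" and start it
vbmc add node0 --port 6230 --username admin --password password
vbmc start node0

# Sanity-check that the emulated BMC answers IPMI, as Ironic would use it
ipmitool -I lanplus -H 127.0.0.1 -p 6230 -U admin -P password power status
```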
14:27:03 panda|pto|PH, I think we can go after that outside of the transition
14:27:31 #topic bugs
14:27:34 weshay: ok
14:28:05 we are still fixing some bugs that we want backported to stable/newton; I consider them the highest priority at this time
14:28:07 https://bugs.launchpad.net/tripleo/+bugs?field.tag=newton-backport-potential
14:28:24 https://bugs.launchpad.net/tripleo/+bug/1638003 is still an issue for upgrades and updates atm (overcloudrc change)... added a comment at https://review.openstack.org/#/c/391201/
14:28:25 Launchpad bug 1638003 in tripleo "Stack update causes overcloudrc OS_PASSWORD update and fails overcloud keystone authentication" [High,In progress] - Assigned to Ryan Brady (rbrady)
14:28:54 I suspect there are more, because I noticed not all team members were doing self-triage :(
14:29:37 marios: isn't it addressed by https://review.openstack.org/#/c/391201/ ?
14:29:47 EmilienM: not in my testing (sasha too)
14:30:02 EmilienM: so bringing it up for visibility.. i needinfo'd rbrady on the BZ too
14:30:08 oh, it's in the bug description
14:30:17 EmilienM: (the BZ is linked from the launchpad bug)
14:30:40 EmilienM: it is in that list you gave anyway, i mean the backport bugs
14:30:43 his patch seems to be failing CI consistently
14:31:03 EmilienM: another one for upgrades is here https://bugs.launchpad.net/tripleo/+bug/1634897 - still some discussion with pradk on the fix, but I'm concerned given the timeframe
14:31:05 Launchpad bug 1634897 in tripleo "Upgrade Mitaka to Newton failed with gnocchi using swift backend: ClientException: Authorization Failure. Authorization Failed: Service Unavailable (HTTP 503)" [Critical,In progress] - Assigned to Pradeep Kilambi (pkilambi)
14:31:15 EmilienM: (also in the list as newton backport)
14:31:32 don't forget to add milestones
14:32:08 ok so https://review.openstack.org/#/c/388649/ will be backported when merged
14:32:27 marios: thanks!
14:33:21 do we have any other outstanding bugs?
14:34:09 #topic Projects releases or stable backports
14:34:18 so, 2 updates:
14:34:26 ocata-1 will be released in 2 weeks!
14:34:53 also, this week I plan to release a new version of newton that will contain all our recent backports
14:35:09 I might wait for critical bugs (eg: upgrades, etc) to be fixed first
14:35:22 any thoughts on release management?
14:36:09 should we perhaps aim to release stable branches around the same time as the ocata milestones?
14:36:21 shardy: fyi, I'll be on PTO during the ocata-1 release. I'll be online from time to time, but I'm not sure I can send the release patch request
14:36:28 IIRC it was mentioned last week and would be a good way to ensure regular releases are made from the stable branches
14:36:40 shardy: sounds good to me
14:37:07 EmilienM: I will be around & can help with the ocata release
14:37:11 thx
14:37:17 ok so
14:37:21 EmilienM: let me know before you go on PTO so we can sync up on the latest status etc
14:37:48 #action EmilienM / shardy to release newton-2 and ocata-1 during the Nov 14th week
14:37:55 shardy: sure thing.
14:38:32 marios: please let us know about upgrade things, and make sure they are all backported
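(For anyone picking up a bug from the newton-backport-potential list above: the standard OpenStack stable backport flow is a cherry-pick with -x, to record the original commit, proposed against the stable branch via git-review. A sketch, with a placeholder commit sha and branch name for the local branch:)

```sh
# In the project repo, branch off stable/newton
git fetch origin
git checkout -b newton-backport origin/stable/newton

# Cherry-pick the fix that merged on master, recording its origin (-x)
git cherry-pick -x <master-commit-sha>

# Propose the backport to Gerrit against stable/newton
git review stable/newton
```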
14:38:52 #topic CI
14:39:01 EmilienM: ack, we are fighting every day :) looking forward to the rest of November
14:39:03 this morning, I found out that the ovb/ha job is really unstable
14:39:27 I haven't filed a bug yet, because i'm still waiting for a CI job now
14:39:50 http://logs.openstack.org/64/391064/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/2d15c11/console.html#_2016-10-31_10_19_54_304885
14:40:02 that's the error, and it started on October 27th, during a promotion job
14:40:08 I suspect a regression in OpenStack
14:40:22 but this one has been hard to debug, I haven't seen useful logs in nova/cinder yet
14:40:39 if you still see this error today or later, please let me know
14:41:17 http://tripleo.org/cistatus.html - ovb/ha is red in most of the recent runs :(
14:41:23 I see a lot of errors today like "ClientException: resources.volume1: Gateway Time-out (HTTP 504)" during pingtest stack creation
14:41:40 it could be related
14:41:48 I'll look
14:41:50 the error I saw was boot from volume not working
14:42:02 sshnaidm: let's sync after the meeting
14:42:16 I have another piece of info
14:42:16 ok
14:42:36 slagle: pabelanger just told me we could submit a patch to nodepool to have a new multinode job with 3 nodes
14:42:37 sshnaidm, EmilienM I have the feeling that it's a selinux issue, the lack of relevant logs is really confusing
14:42:56 jaosorior: I haven't looked at the audit logs
14:43:01 so maybe
14:43:14 slagle: do you want to take the action and submit it this week?
14:44:04 EmilienM: that's something we've discussed, i know we can do it
14:44:16 cool :)
14:44:22 EmilienM: i don't honestly anticipate having time to work on it this week
14:44:46 slagle: ok, no problem
14:44:54 I'll see if I can submit it
14:45:18 #action EmilienM or slagle to submit a patch to nodepool to add 3-node jobs
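(For context on that action item: in the nodepool configuration of this era, multinode labels were built from a primary node plus subnodes, so a 3-node job would need a label with two subnodes. A rough sketch, assuming the subnode-based syntax then in use; the label, image, and provider names are illustrative:)

```yaml
# nodepool.yaml in openstack-infra/project-config (illustrative names).
# A "3-node" label is one primary node plus 2 subnodes.
labels:
  - name: centos-7-3-node
    image: centos-7
    subnodes: 2
    min-ready: 1
    providers:
      - name: some-cloud-provider
```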
14:45:31 anything else about CI this week?
14:45:48 i'd like to get this landed: https://review.openstack.org/381286
14:45:56 so we can test undercloud upgrades
14:46:12 and move forward with the composable undercloud work
14:46:31 reviews appreciated :)
14:46:43 slagle: the job is gate-tripleo-ci-centos-7-undercloud-upgrades-nv, right?
14:46:51 yes
14:47:02 cool!
14:47:15 now that the ovb mitaka/liberty jobs are fixed, ovb should pass on that too
14:47:27 #action team to review undercloud/upgrade patch https://review.openstack.org/381286
14:47:36 slagle: nice, will have a look tomorrow
14:47:53 thanks :)
14:48:39 ok, last topic
14:48:41 #topic specs
14:48:48 #link https://review.openstack.org/#/q/project:openstack/tripleo-specs+status:open
14:49:09 i put out a first pass from the composable service upgrades discussion, thanks shardy for having a look already https://review.openstack.org/#/c/392116/
14:49:13 if you have time, please review our specs that we discussed at Summit
14:49:39 the tripleo/squads policy is ready for review and comments were all addressed https://review.openstack.org/#/c/385201/
14:49:49 marios: I was reviewing yours before the meeting. I'll finish today
14:49:59 EmilienM: ack, thanks, appreciated
14:50:41 does it sound reasonable to merge Ocata specs before ocata-1?
14:50:48 it gives us 2 weeks
14:51:00 I would hate merging specs after doing the actual feature ;-)
14:51:50 EmilienM: we can try, but historically TripleO spec reviews have been pretty slow
14:51:56 EmilienM: well, that would be the *right* way to do it, but ultimately we may not make it
14:51:58 perhaps this is an opportunity to fix that
14:52:02 right, and we might want to improve ourselves on this
14:52:15 EmilienM: so +1, nice thing to aim for
14:52:18 marios: well, maybe we should be a bit more serious about our specs
14:52:19 spec reviews are always slow, not only in tripleo :)
14:52:28 but it's not a reason not to improve, I agree
14:52:31 #action team to review Ocata specs for ocata-1 https://review.openstack.org/#/q/project:openstack/tripleo-specs+status:open
14:52:40 I'll send a note to the ML about it
14:52:55 any questions / remarks about specs?
14:53:31 #topic open discussion
14:53:45 any question or feedback is welcome now
14:54:15 I wanted to add that the ovb/ha job has at least one big problem: since redis is unable to start, gnocchi-metricd is eating a lot of CPU, so things may be unstable also for that reason.
14:54:16 just FYI I'll be on vacation from Friday night, for 2 weeks. I'll have email & IRC sometimes.
14:54:43 panda|pto|PH: we saw that before
14:54:49 panda|pto|PH: there is a launchpad bug about it iirc
14:54:58 panda|pto|PH: talk to pradk and carlos
14:55:16 panda|pto|PH: why is redis unable to start?
14:55:21 EmilienM: yes, but it's a different problem, we need a selinux command to let redis open its unix socket
14:55:54 EmilienM: I'm currently discussing this with rhallisay in #rdo
14:56:15 cool
14:56:26 panda|pto|PH: please file a launchpad bug if needed, so we can track the work being done here
14:56:41 EmilienM: already there
14:56:42 do we have anything else for this week?
14:57:14 EmilienM: https://bugs.launchpad.net/tripleo/+bug/1637961
14:57:15 Launchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [High,Confirmed] - Assigned to Gabriele Cerami (gcerami)
14:57:45 panda|pto|PH: thx
14:57:54 well, sounds like we're done
14:58:09 have a great week everyone
14:58:11 #endmeeting
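(A post-meeting note on the redis/SELinux issue raised under open discussion, bug 1637961: the generic workflow for confirming and temporarily unblocking an SELinux denial like the unix-socket one mentioned is to read the AVC records from the audit log and build a local policy module from them. A sketch; the module name is illustrative, and a proper fix belongs in the distro policy rather than a local module:)

```sh
# Look for AVC denials against redis in the audit log
ausearch -m avc -ts recent | grep redis

# Generate and load a local policy module allowing the denied accesses
# (module name "redislocal" is illustrative)
ausearch -m avc -ts recent | grep redis | audit2allow -M redislocal
semodule -i redislocal.pp
```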