18:00:57 <krtaylor> #startmeeting third-party
18:00:58 <openstack> Meeting started Mon Sep 15 18:00:57 2014 UTC and is due to finish in 60 minutes. The chair is krtaylor. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:59 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:01:01 <openstack> The meeting name has been set to 'third_party'
18:01:11 <krtaylor> who's here this week?
18:01:13 <anteaya> o/
18:01:25 <ignacio-scopetta> o/
18:01:33 <mestery> o/
18:01:46 <bmwiedemann> o/
18:01:49 <omrim> o/
18:01:52 <krtaylor> hello
18:02:02 <ociuhandu> hello
18:02:17 <krtaylor> Welcome everyone, let's get started!
18:02:26 <krtaylor> #topic Welcome & Reminder of OpenStack Mission
18:02:35 <krtaylor> #info The OpenStack Open Source Cloud Mission: to produce the ubiquitous Open Source Cloud Computing platform that will meet the needs of public and private clouds regardless of size, by being simple to implement and massively scalable.
18:03:05 <krtaylor> #topic Review of previous week's open action items
18:03:32 <krtaylor> ok, so we didn't have any active actions from last weeks meeting
18:04:04 <krtaylor> but I will bring up the maillists again
18:04:11 <anteaya> great
18:04:36 <krtaylor> that was an announce, but please subscribe to announce at a minimum, if you have not already
18:04:40 <krtaylor> #link http://lists.openstack.org/cgi-bin/mailman/listinfo
18:05:01 <krtaylor> heh, overload of announce
18:05:13 <anteaya> :D
18:05:14 <krtaylor> that was an announcement, please join announce
18:05:36 <krtaylor> see the lists at the bottom of that link, as well as many other good ones
18:06:03 <krtaylor> alright, on to Announcements
18:06:15 <krtaylor> #topic Announcements
18:06:31 <krtaylor> There were none listed in the agenda
18:06:35 <krtaylor> any to mention?
18:06:41 <anteaya> actually if you refresh, there is
18:07:00 <krtaylor> yes, refresh bit me again
18:07:17 <krtaylor> anteaya, you have the floor
18:07:20 <anteaya> thanks
18:07:36 <anteaya> #info third party items etherpad for cross project discussion at summit is now up (anteaya)
18:07:46 <anteaya> #link https://etherpad.openstack.org/p/kilo-third-party-items
18:07:53 <anteaya> so here is our etherpad
18:07:59 <krtaylor> great!
18:08:19 <anteaya> for us to identify and then prioritize items for discussion at the summit
18:08:33 <anteaya> if you were following one of the threads
18:08:50 <anteaya> ttx said he recommends we prepare for a cross project session
18:08:57 <krtaylor> excellent
18:08:59 <anteaya> so we have this etherpad to do that
18:09:08 <anteaya> so first we need items identified
18:09:19 <krtaylor> I have a few, I'll add
18:09:21 <anteaya> at the top of the etherpad I have a format sample
18:09:29 <anteaya> include your name and your irc nick
18:09:50 <anteaya> topics without a name and irc nick will not get as much attention as those with that information
18:09:59 <anteaya> be sure to include yours and tell others
18:10:17 <anteaya> as a group over the next few weeks we can identify items and priortize them
18:10:26 <anteaya> not all items will have time to be discussed
18:10:34 <krtaylor> I also did this for Atlanta
18:10:40 <daya_k> anteaya, is it ok to add items even if i am not planning on attending the summit?
18:10:40 <krtaylor> do we have a slot?
18:10:52 <anteaya> so we have to work together so those items that affect the most folks are idenfied and priortized
18:10:59 <anteaya> daya_k: that will be tough
18:11:06 <daya_k> ok
18:11:17 <anteaya> daya_k: someone will have to be attending summit to share why this is important
18:11:31 <anteaya> daya_k: at this point write them down, others may agree they are important
18:11:33 <daya_k> ok, will find out if someone can follow up and add
18:11:39 <daya_k> ok
18:11:43 <anteaya> and they may be able to share that perspective at summit
18:11:45 <krtaylor> does this get prioritized in with Infra design slots? t he time will be limited for sure
18:11:49 <anteaya> krtaylor: we have no slot yet
18:12:00 <anteaya> krtaylor: but we have been guided to prepare for a slot
18:12:08 <krtaylor> ok, understood
18:12:13 <krtaylor> thats great
18:12:20 <anteaya> krtaylor: which is as close to saying we are going to get a slot as anybody will get at this point
18:12:38 <anteaya> krtaylor: so we have the most we can possibly have and have to prepare accordingly
18:12:42 <anteaya> any other questions?
18:13:03 <krtaylor> Thanks for that announcement anteaya
18:13:17 <anteaya> okay this will become a regular agenda item until summit
18:13:22 <anteaya> so participate regularly
18:13:24 <anteaya> thanks
18:13:26 <anteaya> I'm done
18:13:54 <krtaylor> agreed, I'll leave in on next weeks agenda too
18:14:11 <krtaylor> onward then
18:14:18 <krtaylor> #topic OpenStack Program items
18:14:52 <krtaylor> another reminder, we have 2 open third-party patches:
18:15:14 <krtaylor> #link https://review.openstack.org/#/q/status:open+project:openstack-infra/config+branch:master+topic:third-party,n,z
18:15:40 <krtaylor> If everyone would join in on the review of these
18:16:05 <anteaya> and thanks to those who have already as well
18:16:14 <krtaylor> yes
18:16:32 <krtaylor> and that is a good transition to the next item on the agenda, which is actually one of those patchsets
18:17:07 <krtaylor> I'd like to bring up recheck in this meeting, so we will have discussed and agreed on a plan, then bring that to infra
18:17:18 <krtaylor> put on the agenda for this weeks meeting
18:18:04 <krtaylor> There are basically 2 sides 1) "recheck" comment restarts all tests, all systems, 1st and 3rd
18:18:25 <krtaylor> 2) and what I proposed
18:18:53 <krtaylor> which is based on the feedback from sdague's namespace idea
18:19:32 <krtaylor> that is, recheck is the default behavior, but third-party systems could implement a "recheck <system-name>"
18:19:46 <ociuhandu> #link https://github.com/openstack-infra/config/blob/master/modules/openstack_project/files/zuul/layout.yaml#L19
18:20:04 <daya_k> so, does this overrule the system-name:recheck proposal ?
18:20:09 <ociuhandu> krtaylor: according to the link above, recheck followed by anything will trigger any CI
18:20:36 <krtaylor> this would be a proposal for third-party systems, that is what is done upstream
18:20:52 <krtaylor> and would be optional for third-party, see the patch
18:21:14 <ociuhandu> so we can not use “recheck hyper-v” or “recheck-hyper-v” for any third-party unless that line is being updated
18:21:41 <ociuhandu> i used hyper-v as an example, since it’s our case
18:21:45 <krtaylor> I see it as an optional mechanism for systems that want to be able to support individual rechecks
18:22:05 <krtaylor> ociuhandu, I don't follow
18:22:21 <krtaylor> ociuhandu, that is the upstream yaml
18:22:37 <krtaylor> ociuhandu, your system can change that
18:23:02 <krtaylor> for upstream, everything after recheck is ignored
18:23:03 <ociuhandu> krtaylor: yes, that line means that if I put in our CI “recheck XX” or “recheck-XX” as option for triggering our CI’s recheck, it will also trigger Jenkins
18:23:17 <ociuhandu> which is not what we want, I think
18:23:20 <krtaylor> ociuhandu, is that a problem?
18:23:44 <krtaylor> the general consensus in infra team is that it is not
18:23:58 <ociuhandu> krtaylor: as long as we say that is an “individual recheck” it’s not what we want, no?
18:24:11 <ociuhandu> as individual means not retriggering Jenkins as well
18:24:26 <ociuhandu> and also from time to time jenkins fails as well
18:24:34 <krtaylor> ok, I am open to rewording
18:24:47 <krtaylor> but it would trigger an individual system to recheck
18:25:08 <ociuhandu> so we’ll end up with developers annoyed that at first jenkins passed and later on, at some third-party recheck they got that CI to pass and jenkins to fail
18:25:30 <ociuhandu> we’ve seen a few situations like this, especially when all systems are under load
18:25:43 <krtaylor> the previous proposal brought this up, and transient failures have to be taken into account
18:25:57 <krtaylor> thats what elastic recheck is for
18:26:03 <krtaylor> well, helps with
18:26:23 <ociuhandu> so if we also take that into account, I’d go, for now, with something similar to wha daya_k mentioned
18:27:20 <krtaylor> the concern was that rechecks would be done individually until all of them posted success
18:27:35 <krtaylor> daya_k mentioned?
18:27:50 <krtaylor> oh, I think you are referring to the namespace idea
18:27:50 <ociuhandu> yes, but having the developer wait another round of jenkins check just because of a transient error that he did not hit initially would not make them happy, i think
18:27:51 <ociuhandu> daya_k: so, does this overrule the system-name:recheck proposal ?
18:27:52 <daya_k> systemname:recheck
18:28:06 <krtaylor> that was from sdague 's proposal
18:28:27 <ociuhandu> daya_k: just copy-pasted your mention, was not suposed to be a question :)
18:29:07 <daya_k> sure, nw
18:29:19 <krtaylor> ociuhandu, but, it would help make the tests better overall if it was identifying a transient failure, it is necessary evil for making everything better
18:29:36 <krtaylor> developer pain vs overall good
18:30:07 <anteaya> keep in mind developers have very little tolerance for pain from third party
18:30:52 <krtaylor> anteaya, exactly but I think the issue mentioned here is the patchset contributor waiting for jenkins
18:31:04 <ociuhandu> anteaya: thank you :)
18:31:06 <krtaylor> ociuhandu, is that the concern?
18:31:33 <ociuhandu> krtaylor: the issue: submit patchset, jenkins happy, one third party fails
18:31:49 <dougwig__> To me, this isn't a question of if there will be a unique trigger. simply whether or not it's syntax is standardized. if you know enough to setup one of these systems, it'll have a unique trigger.
18:31:53 <ociuhandu> krtaylor: issue recheck CI -> means CI and Jenkins run again
18:31:56 <krtaylor> waiting on jenkins cannot and shouldn't be avoided, is what I am asserting
18:32:11 <ociuhandu> if “lucky” this time Ci says OK and Jenkins fails
18:32:26 <ociuhandu> for sure developer will get mad here
18:32:37 <ociuhandu> as he already had the OK from jenkins on first run
18:32:42 <krtaylor> but if jenkins fails, then there is another problem that existed before this patchset
18:32:51 <anteaya> the other option is that operators should be evaluating their test results
18:32:53 <ociuhandu> and lost that while trying to do a third-party recheck
18:33:17 <anteaya> if there was a failure due to a failing system, the operator can comment on the patch as a reviewer
18:33:45 <krtaylor> dougwig__, I think the syntax has solved itself, most CI systems already support some form of "recheck <system-name>"
18:33:46 <ociuhandu> krtaylor: we’ve seen so many fails in jenkins due to transient errors that had nothign to do with the patches that I would not say it’s easy to ignore this situation
18:33:57 <anteaya> so the developers can take the reviewers comments into account in their susequent decisions
18:34:22 <krtaylor> anteaya, agreed, a good approach
18:34:56 <krtaylor> ociuhandu, and the developer should be willing to see that the transient problem gets identified
18:35:43 <krtaylor> ociuhandu, take the view of the project as a whole, not an individual developer
18:36:18 <krtaylor> ociuhandu, identifying a transient problem is good!
18:36:25 <ociuhandu> krtaylor: if the developer would not have to wait for a few hours on one patchset run for results, yes, i totally agree
18:37:05 <ociuhandu> krtaylor: sorry for the underline, don’t know how that managed to get in
18:37:12 <krtaylor> ociuhandu, this process is amazingly fast compared to the many other opensource projects I have been involved with
18:37:49 <daya_k> what happens if 2 CI systems fail, then the dev has to issue 2 recheck:system name comments, and trigger jenkins twice unnecessarily
18:38:00 <ociuhandu> krtaylor: just to clear this up, I’m not one of the developers, so I am not talking from my developer’s point of view, just based on feedback i got from multiple developers
18:38:08 <krtaylor> ociuhandu, maybe, if this became too painful, then "recheck <bug>" could be put back in place
18:38:28 <krtaylor> it was removed a while back
18:38:45 <ociuhandu> krtaylor: yes, we are aware of that :)
18:39:01 <krtaylor> actually, not sure how transient failures are identified then, do you now?
18:39:19 <krtaylor> I guess it relies on elastic recheck
18:39:24 <daya_k> also, if the 3rd party ci system was down, dev may not have to make any changes, but just trigger the 3rd party ci itself to get its vote, so, overall, i think this mechanism should only trigger 3rd party ci
18:39:47 <ociuhandu> krtaylor: yes, they use elastic recheck to automatically parse the results for identifying that
18:40:21 <krtaylor> ociuhandu, thats what I figured, good to know
18:41:03 <ociuhandu> krtaylor: one more point here: when the jenkins queue is hours long, would it benefit anyone to add more unnecessary workload for these rechecks?
18:41:05 <krtaylor> daya_k, all I have control over in this proposal is third-party, infra already spoke on jenkins rechecks
18:41:17 <krtaylor> that is really an issure to take to infra team
18:41:38 <anteaya> krtaylor: elastic recheck yes
18:41:57 <krtaylor> ociuhandu, I completely agree, but it was not a concern in all the comments
18:42:00 <anteaya> the bug <bug-number> was removed since the bug number wasn't tracked that way anymore
18:42:29 <ociuhandu> krtaylor: let’s try to find then an alternate, so we can still offer individual third-party checks without triggering jenkins
18:42:29 <krtaylor> and "recheck <jenkins>" didn't pass either
18:42:47 <ociuhandu> our current approach was to use “check” instead of “recheck”
18:42:55 <krtaylor> I personally like "recheck <jenkins>"
18:43:00 <ociuhandu> i.e. “check <ci-name>”
18:43:11 <krtaylor> oops, I mean "recheck jenkins"
18:43:15 <anteaya> is there anything stopping an operator from manually triggering their system to run and then them manually reporting success on a patch as a developer?
18:43:48 <krtaylor> anteaya, no, but it would be outside of gerrit
18:44:13 <daya_k> how would they use zuul to do the merge /
18:44:17 <ociuhandu> anteaya: and that would not update the CI recorded result
18:44:19 * krtaylor ponders that for a moment
18:44:34 <anteaya> the test results wouldn't report back but the operator could report in gerrit on the patch as a reviewer
18:44:48 <anteaya> re-ran the system the tests pass, sorry for the trouble
18:45:11 <anteaya> or re-ran the system, same failure, your patch might be triggering something let me take a deeeper look
18:45:29 <anteaya> daya_k: I don't understand your question
18:45:43 <anteaya> merging is independent of third party ci results
18:46:12 <daya_k> that would need a cherry pick of the patch and manual merge, so if zuul is using your system to run other patches, you would need to figure out how to share the system with zuul
18:46:37 <anteaya> that was my question
18:46:52 <anteaya> is it possible for an operator to manually trigger a set of tests on a patch
18:47:27 <krtaylor> daya_k, I think anteaya is saying everything, including the comment, would be manual
18:47:34 <anteaya> yes
18:47:36 <asselin> hi, part of the issue is now 3rd party ci stats are shown in a awesome table. manual rechecks would not affect that table.
18:48:13 <anteaya> correct, they would show up as an awesome comment from an attentive developer
18:48:24 <krtaylor> hehheh, here here!
18:48:44 <daya_k> correct, thats what i was referring to, that you would need to fight for the same resources as zuul, and if its running continuosly, i dont know how you would be able to use the same system to merge the patch and report. you would need to stop zuul right?
18:48:55 <ociuhandu> anteaya: following on your suggestion, I think it would be difficult to keep track on who’s commenting as reviewer instead of which CI
18:49:21 <anteaya> what happens if you include that in the comment
18:49:57 <ociuhandu> I mean how will the core reviewers check that this is the person is the one that should be? and moreover, if there’s a team, the whole list of members that could?
18:49:58 <anteaya> vote and then say Hi operator of MY CI system here, my latest tests are showing up as failing and I reran the tests manually and these are my results.
18:50:56 <anteaya> if the person doesn't identify which ci system they are operating then yes, this would be a problem
18:51:09 <ociuhandu> anteaya: isn’t it still easier to ensure the trigger for a single third-party CI than all these updates?
18:51:17 <anteaya> but I do hope that the operator that is attentive enough to care enough to do this would add that info in their comment
18:51:30 <anteaya> ociuhandu: it doesn't seem to be
18:51:41 <anteaya> we have spent this entire release discussing this issue
18:51:54 <anteaya> plus the majority of time of this meeting
18:52:00 <krtaylor> ok, I hate to shut this down, but in the interest of time, lets get through the rest of the agenda and we can resume this discussion in open topics section
18:52:03 <anteaya> I dont feel consensous
18:52:26 <krtaylor> no, not close, it needs to be broken down
18:52:36 <krtaylor> and points agreed to or not
18:52:48 <krtaylor> we need to table this for a moment
18:52:58 <krtaylor> quickly
18:53:01 <krtaylor> #topic Deadlines & Deprecations
18:53:18 <krtaylor> there were none on the agenda and I refreshed :)
18:53:22 <krtaylor> any?
18:53:23 <anteaya> :D
18:53:37 <krtaylor> ok, next
18:53:40 <krtaylor> #topic Highlighting a Program or Gerrit Account
18:53:53 <krtaylor> also none on the agenda, any?
18:54:15 <krtaylor> ok then
18:54:17 <daya_k> anteaya: should i bring our ibm sdn-ve system on the infra list? its disabled, i had sent a note indicating i have changed the logging syntax
18:54:19 <krtaylor> #topic Open Discussion
18:54:31 <daya_k> sorry, hit send too soon
18:54:39 <krtaylor> hehheh
18:54:46 <anteaya> daya_k: bring it to the -announce list, if that was where the disabled announcement happened
18:55:01 <anteaya> daya_k: have you already and I haven't replied?
18:55:02 <daya_k> anteaya: i did, didnt get a response
18:55:12 <anteaya> daya_k: thanks for the reminder I will look again
18:55:16 <daya_k> ok, thanks.
18:55:40 <krtaylor> so, I'd encourage everyone to comment on the recheck proposal for third-party, I think we are in agreement on that
18:55:55 <krtaylor> the problem seems to be on waiting on jenkins
18:56:06 <krtaylor> for the recheck that happens everytime
18:57:07 <krtaylor> and, I'll try to summarize this and bring it to the Infra IRC meeting for this week
18:57:29 <krtaylor> anteaya, your recap would be appreciated there too
18:58:03 <krtaylor> any other topics quickly, 2+ minutes
18:58:12 <anteaya> sure
18:58:27 <anteaya> I plan on attending tomorrow's infra meeting
18:58:48 <krtaylor> ociuhandu, asselin, daya_k - thanks for the great discussion on recheck! (and anyone else I missed)
18:58:57 <krtaylor> anteaya, I'll add it to the agenda
18:58:59 <daya_k> thanks krtaylor
18:59:01 <anteaya> kk
18:59:22 <krtaylor> ok, well if there is nothing else, I'll close for this week
18:59:38 <krtaylor> thanks everyone, another extremely useful meeting
18:59:43 <anteaya> thanks for a great meeting krtaylor
19:00:04 <krtaylor> #endmeeting
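
Editor's sketch of the recheck trigger behavior debated in the meeting. This is a minimal illustration, not the actual Zuul configuration: the regular expressions, the function name `systems_triggered`, and the system names `hyper-v` and `sdn-ve` used as CI identifiers are all assumptions made for this example. It assumes, as stated in the log, that upstream Jenkins ignores everything after "recheck", that a bare "recheck" restarts every system, and that a third-party system may optionally respond to "<keyword> <system-name>".

```python
import re

# Hypothetical patterns, NOT copied from the real upstream layout.yaml.
# Upstream: any comment line starting with "recheck" re-runs Jenkins;
# whatever follows the word is ignored.
UPSTREAM = re.compile(r"^\s*recheck\b", re.IGNORECASE | re.MULTILINE)
# A bare "recheck" line restarts every system, first- and third-party.
BARE = re.compile(r"^\s*recheck\s*$", re.IGNORECASE | re.MULTILINE)


def systems_triggered(comment, third_party, keyword="recheck"):
    """Return the sorted names of the systems a review comment would re-run.

    third_party: names of third-party CI systems (hypothetical here).
    keyword: the verb a third-party system listens for before its name.
    """
    hits = set()
    if UPSTREAM.search(comment):
        hits.add("jenkins")
    for name in third_party:
        named = re.compile(
            r"^\s*%s\s+%s\s*$" % (re.escape(keyword), re.escape(name)),
            re.IGNORECASE | re.MULTILINE,
        )
        if BARE.search(comment) or named.search(comment):
            hits.add(name)
    return sorted(hits)


cis = ["hyper-v", "sdn-ve"]
print(systems_triggered("recheck", cis))
print(systems_triggered("recheck hyper-v", cis))
print(systems_triggered("check hyper-v", cis, keyword="check"))
```

Under these assumptions, "recheck hyper-v" re-runs both Jenkins and the named CI, which is the side effect ociuhandu objected to, while the "check <ci-name>" form he described re-runs only the named CI. The case daya_k raised also follows: with two failing third-party systems, two "recheck <system-name>" comments would each re-run Jenkins.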