19:00:32 <clarkb> #startmeeting infra
19:00:32 <openstack> Meeting started Tue Nov 28 19:00:32 2017 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:33 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:35 <openstack> The meeting name has been set to 'infra'
19:00:44 <clarkb> hello, who is here for the infra meeting?
19:00:50 <frickler> o/
19:00:54 <tobiash> o/
19:01:01 <ianw> o/
19:01:25 <clarkb> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:02:21 <clarkb> There are a couple items on the agenda. Jeblair will be joining us late so I think we will do zuulv3 after the general topics list
19:02:31 <clarkb> #topic Announcements
19:03:03 <clarkb> its been a quiet week with the US thanksgiving holiday, I'm not aware of anything that needs announcing
19:03:07 <clarkb> is there something I've missed?
19:03:28 <fungi> i guess the venue for the ptg hasn't been officially announced yet
19:03:51 <fungi> so maybe flag that to announce in next week's meeting if it has been by then
19:04:05 * clarkb scribbles a note
19:04:13 <diablo_rojo_phon> It should be announced by this afternoon.
19:04:35 <clarkb> in that case look to read the openstack-dev mailing list later today for an announcement on the ptg location :)
19:04:43 <AJaeger> o/
19:04:55 <pabelanger> o/
19:04:58 <clarkb> #topic Actions from last meeting
19:05:08 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2017/infra.2017-11-21-19.01.txt Minutes from last meeting
19:05:22 <clarkb> Fungi's action to write the secrets backups doc is complete \o/
19:05:43 <fungi> oh, that was complete last week
19:05:51 <fungi> i think i linked it during the meeting then
19:06:06 <clarkb> ya it ended up in the actions list but now is done so we can keep it out
19:06:09 <fungi> maybe it hadn't merged yet at that point
19:06:53 <clarkb> I'm not aware of any specs that need review or otherwise need to be brought up and will skip zuulv3 for now so that jeblair can join us which means straight to general topics
19:07:00 <clarkb> #topic General topics
19:07:11 <clarkb> #topic Zanata 4 upgrade
19:07:22 <clarkb> #link https://review.openstack.org/#/c/506795/ initial change needed for zanata upgrade work
19:07:52 <clarkb> we are seeing some of the initial changes come in that will allow us to upgrade from zanata 3 to zanata 4. I think the i18n team would like to see this get done before the string freeze which is late january.
19:08:18 <clarkb> it would be great if we can help make that possible (code reviews, actually merging code/running upgrades)
19:08:35 <jeblair> o/
19:08:58 <clarkb> I did the last one so am fairly familiar with the service, is anyone else interested in learning about the java/wildfly/zanata things?
If so let me know and we can work to help get this going with the i18n team
19:09:12 <ianw> i have a passing familiarity from the last upgrade, and i think we're in similar tz's, so put me down to help
19:09:27 <clarkb> ianw: awesome thanks
19:09:45 <clarkb> I expect it will be mostly straightforward after talking to aeng, no need for a java update or distro upgrade
19:10:18 <fungi> unlike, say, the next gerrit upgrade
19:10:26 <ianw> i'll start by reviewing ^^ :)
19:10:43 <pabelanger> fungi: +1
19:11:35 <clarkb> #topic Priority Efforts
19:11:44 <clarkb> jeblair: is here, on to zuul
19:11:50 <clarkb> #topic Zuul v3
19:12:17 <clarkb> I wanted to do a quick recap of the zuul cloner venv removal because there were some hiccups with it and want to make sure we don't forget to do the last cleanups
19:12:32 <clarkb> pabelanger: ^ I think the fixes for the pyyaml install have gone in job side, have we removed it from the image again?
19:12:45 <pabelanger> clarkb: we have!
19:13:11 <pabelanger> I think we are ready to actually move to https://review.openstack.org/513506/ now, which removes zuul-cloner from base jobs
19:13:55 <clarkb> #link https://review.openstack.org/513506/ remove z-c shim from base job is ready for review now
19:14:30 <pabelanger> for 513506, we'll need to be ready to fix jobs that are broken by it, and re-parent them to legacy-base
19:14:42 <clarkb> fungi: I think we basically agreed to remove jenkins from the ci group in gerrit last week as well. Did that happen yet?
19:14:51 <pabelanger> as we expect native zuulv3 jobs 1) not to use zuul-cloner, 2) parent to base
19:14:52 <fungi> it has not, but i can do it now
19:15:00 <pabelanger> fungi: +1
19:15:08 <fungi> 10 seconds ;)
19:15:12 <clarkb> works for me and is an easy revert if we need to
19:16:00 <fungi> okay, that was more than 10 seconds, but done now
19:16:11 <clarkb> frickler: tristanC have a third party CI of zuul-jobs agenda item
19:16:16 <fungi> i had to remember the group name because i forgot i'd linked it from the agenda
19:16:25 <clarkb> I don't think tristanC is here, but frickler is. Want to fill us in?
19:16:41 <frickler> I'm not directly related to the CI
19:16:47 <jeblair> somehow i missed that email
19:16:50 <fungi> #info The retired "Jenkins" account has been removed from the Continuous Integration Tools group in Gerrit now
19:16:56 <jeblair> but i've read it now. and i support the concept in general
19:17:04 <frickler> but I wanted to make sure that tristan gets more feedback on his mail
19:17:13 <jeblair> though i was about to send back a reply suggesting that we just use 'recheck' for the recheck command
19:17:25 <fungi> #action fungi send an announcement about the removal of Jenkins from Continuous Integration Tools
19:17:32 <fungi> i'll do that after the meeting
19:17:35 <clarkb> fungi: thanks
19:17:50 <clarkb> #link http://lists.openstack.org/pipermail/openstack-infra/2017-November/005688.html for details on third party CI for zuul-jobs
19:18:01 <jeblair> i've long advocated that third-party cis should not have their own recheck command language
19:18:48 <jeblair> i don't think it should be a problem for this repo
19:19:26 <clarkb> agreed.
I think the other considering to make is whether or not we want it to +1, +/-1, -1 or just +0 with logs
19:19:31 <clarkb> *other consideration
19:20:01 <jeblair> i'm happy to try out voting if other folks are
19:20:51 <clarkb> I think voting helps get more eyeballs on the problem and not allowing voting makes it easier to ignore the failures. So I am happy to try voting as well
19:21:13 <tobiash> ++voting
19:21:20 <dmsimard> hi
19:21:29 <dmsimard> sorry, forgot to fix calendar event timezone..
19:21:38 <fungi> i have no objection to voting third-party ci systems as long as their feedback is reliable
19:21:40 <pabelanger> yah, I think we can give voting a try
19:22:24 <dmsimard> I don't have backlog to get context, but yes, we (RDO/Software Factory) would like to be third party CI against zuul-jobs. I don't know what shape this will take yet.
19:22:59 <tobiash> the same applies for me
19:23:05 <jeblair> dmsimard: current context is software-factory third-party ci: http://lists.openstack.org/pipermail/openstack-infra/2017-November/005688.html
19:23:10 <clarkb> dmsimard: basically jeblair has asked that we not have any special recheck syntax, just support 'recheck' like upstream zuul. And we seem to be comfortable to try it out in a voting manner (so we'll need to update gerrit ACLs)
19:23:13 <jeblair> which i somehow missed over thanksgiving
19:23:43 <dmsimard> Ultimately, one of the objectives is to leverage TripleO(-ci) jobs, roles, and playbooks from within review.rdoproject.org which is analogous to review.openstack.org, so it will be very important for us to be able to re-use zuul-jobs (and potentially other things, but that's another topic) outside of OpenStack
19:24:39 <dmsimard> I'd start with non-voting first to get some confidence that we're doing the right thing
19:24:58 <jeblair> wfm
19:25:09 <jeblair> that's the usual approach i believe
19:25:11 <fungi> just to pile on, i agree wrt standardizing on "recheck" across ci systems.
rechecking individual ci systems is moderately dangerous for the same reasons we've resisted requests to recheck individual jobs
19:25:11 <clarkb> dmsimard: thats fine, just wanted to bring up possible voting early as a lot of projects don't allow it at all and wasn't sure where we stood on that
19:25:29 <clarkb> so as to avoid surprises later if there were major disagreements :)
19:25:53 <dmsimard> clarkb: well, it's not clear to me yet what these third party jobs will look like yet
19:26:12 <dmsimard> clarkb: i.e, will it be running base-integration/multinode-integration but just from another zuul for example ?
19:26:41 <fungi> also, to be clear on the third-party ci situation, we've also said in the past that we'll disable accounts for any ci systems which start reporting on infra team repo changes without prior discussion
19:26:42 <dmsimard> I feel like there's stuff we'll realize once we get started
19:26:49 <tobiash> dmsimard: it could be running your most important jobs
19:27:24 <mmedvede> I would object to not allowing third-party CIs have their own recheck syntax, sometimes we want to only nudge our third-party CI when there is an obvious problem with it without wasting upstream CI's resources
19:27:33 <jeblair> my thought is not to be too prescriptive about what they're doing. we should have ongoing conversations with third-party ci operators to make sure we're making the most of things, but in general, third-party operators are probably best placed to decide what's important to them and what unique things they can bring to the table.
19:27:39 * rcarrillocruz waves
19:27:46 <fungi> mmedvede: "zuul enqueue" via the rpc in that case
19:28:05 <dmsimard> tobiash: it depends, what's the purpose or the use case ?
I don't think running a tripleo-based job against zuul-jobs is necessarily worthwhile -- we're interested in testing the roles individually
19:28:13 <mmedvede> fungi: I thought we are not encouraged to comment on the same patch twice without an explicit recheck
19:28:35 <clarkb> dmsimard: but ya sounds like you can go ahead and start trying things out in a non voting capacity, may even want to start in a non reporting manner first. See how that goes then tweak from there
19:28:40 <fungi> mmedvede: that might be a policy some team has put into place, but it's not our policy afaik
19:28:49 <jeblair> mmedvede: there is nothing about that in https://docs.openstack.org/infra/system-config/third_party.html#requirements
19:28:55 <tobiash> dmsimard: yes, that probably depends on what's important to you as zuul-jobs user
19:29:01 <dmsimard> mmedvede, fungi, jeblair: unless mistaken, the recheck keyword is controlled by the third party CI so there's nothing preventing operators from responding to "recheck" and "check myzuulname"
19:29:04 <jeblair> mmedvede: there is a note in there saying that "recheck" should retrigger all systems.
19:29:18 <fungi> mmedvede: we're specifically talking about third-party ci systems which want to vote on changes to the openstack-infra/zuul-jobs repo (and potentially other deliverables of the infra team in the future)
19:29:29 <jeblair> dmsimard: nothing except their willingness to abide by the guidelines we've established
19:29:46 <pabelanger> wait, I thought we said recheck foo was good a while back. I feel like we go back and forth on this every few months
19:29:47 <jeblair> fungi: indeed, let's not get too far derailed on this :)
19:29:56 <dmsimard> jeblair: the important part is that they answer to "recheck", right ? if they answer to "check foo" is that an issue ?
19:29:57 <jeblair> pabelanger: i have never said that.
19:30:13 <pabelanger> jeblair: other infra-root have, IIRC
19:30:36 <jeblair> dmsimard: we don't have an established policy on that. i would like to, in the context of zuul-jobs only, establish a policy that we don't do that and all systems honor recheck.
19:30:38 <jeblair> only.
19:31:12 <dmsimard> jeblair: that's fine, on our end that means setting up a separate pipeline (because we already have a pipeline meant for third party) but that's not expensive
19:31:50 <jeblair> dmsimard: (or, if it happens to honor something else, just don't mention it)
19:31:56 <dmsimard> sure
19:32:01 <fungi> dmsimard: "our" in this context being rdo ci?
19:32:22 <dmsimard> fungi: yeah, RDO's zuul answers to things like "check rdo experimental" (so we don't trigger "check experimental")
19:32:32 <dmsimard> and possibly other things
19:33:06 <mmedvede> jeblair dmsimard : agree recheck should absolutely retrigger all the CI systems. But this does not exclude ability for third-party CIs to also be triggered separately. I would like there to be an official blessed syntax for this. Right now each CI comes up with their own
19:33:22 <fungi> i'm curious why someone would want upstream experimental pipeline results but not rdo's experimental pipeline results. still, that's not crucial to this topic
19:33:34 <clarkb> ya, I think we may be starting to get into another topic entirely
19:33:41 <clarkb> we can come back to that if there are no other zuulv3 items or finish them
19:34:05 <clarkb> I think we may want to talk about the merging of branches though that wasn't on the agenda. Any other zuulv3 items?
19:34:25 <dmsimard> I have something
19:34:29 <fungi> mmedvede: part of the resistance, historically, is that we feel leaving comments in the code review system is a bad api anyway, and would like to eventually have some other interface for such tasks
19:35:00 <fungi> and not tie ourselves to a standard involving arbitrary code review comment strings
19:35:51 <clarkb> dmsimard: what was your zuulv3 item?
19:36:06 <dmsimard> I'd love at least a first round of reviews on the 'sqlite over http' ara middleware series to 1) always have ara reports regardless of failure/success 2) reduce even further the impact of storage/inode on the log server
19:36:17 <dmsimard> The reviews are tagged here: https://review.openstack.org/#/q/topic:ara-sqlite-middleware
19:36:48 <dmsimard> And you can see a practical implementation here --
19:37:00 <clarkb> dmsimard: does that also depend on getting the ara install updated independent of the zuul install on the zuul executors?
19:37:04 <dmsimard> Without the middleware: https://logs.rdoproject.org/33/10433/1/check/rdo-registry-integration/Ze74352b77e17444cace463fc9c994213/ara-database/
19:37:06 <dmsimard> With the middleware: http://logs-dev.rdoproject.org/33/10433/1/check/rdo-registry-integration/Ze74352b77e17444cace463fc9c994213/ara-database/
19:37:33 <pabelanger> SSL cert is bad ^
19:37:42 <dmsimard> pabelanger: yeah, logs-dev :(
19:37:45 <jeblair> (i replied to the ml thread with a summary of our discussion on the third-party ci issue)
19:37:55 <dmsimard> pabelanger: I spun it up without getting proper certs (yet)
19:38:26 <dmsimard> clarkb: it doesn't depend on the version of ara on the executors, no
19:38:48 <clarkb> dmsimard: cool, so we can work this independently.
I'll make a note to review it
19:39:04 <dmsimard> clarkb: it doesn't even depend on the version of ara on the logserver (where it would sit like htmlify/os-log_analyzer), it's just a wsgi script that happens to be bundled in ara at the latest version but otherwise can be carried in tree
19:39:25 <pabelanger> Didn't we have a set of patches to install it onto our own dev server?
19:39:32 <pabelanger> logs-dev.o.o for example
19:39:38 <dmsimard> that's the topic I linked earlier, yes: https://review.openstack.org/#/q/topic:ara-sqlite-middleware
19:39:46 <pabelanger> okay cool
19:39:49 <clarkb> #link https://review.openstack.org/#/q/topic:ara-sqlite-middleware changes to run ara out of sqlite db using middleware. Will cut down on inode use on the logs server hopefully allowing us to add ara to successful jobs again
19:40:00 <pabelanger> will look over here today
19:40:07 <dmsimard> I have a todo to resolve a conflict between htmlify and ara rewrite rules but it's otherwise at least ready for reviewing
19:40:39 <dmsimard> I -W one of the patches but it's still worth reviewing :)(
19:41:23 <dmsimard> I'll probably go ahead and rebase the stack since it's been a while
19:41:26 <dmsimard> that was it for my topic :)
19:41:45 <clarkb> #link http://lists.openstack.org/pipermail/openstack-infra/2017-November/005695.html ml thread on merging feature branches back into master on nodepool and zuul repos and shifting dev work to those branches
19:41:56 <clarkb> If you haven't seen it yet and are interested in zuul ^ is probably worth a read
19:42:01 <clarkb> jeblair: anything you want to add to ^ here?
19:42:34 <jeblair> ya
19:43:01 <jeblair> i'd love for someone from the third-party-ci community to jump in on the puppet-openstackci work
19:43:36 <jeblair> that is something that should be straightforward to accomplish and doesn't require any zuulv3 knowledge -- the opposite in fact -- it's work to keep zuulv2 working with puppet-openstackci
19:43:36 <dmsimard> there's an irc channel where they hang out, worth a try to get their attention
19:43:58 <jeblair> true, though there's a problem if they aren't paying attention here.
19:43:59 <clarkb> mmedvede may also know individuals that might be interested?
19:44:00 <dmsimard> they might not be subscribed to the MLs
19:44:32 <AJaeger> we also have project-config-example repo - what should we do with that one? It uses Zuul v2/Jenkins right now
19:45:17 <jeblair> again, if they aren't subscribed to openstack-infra it's a problem. i will send an announcement to third-party-announce to draw attention to my post, however.
19:45:24 <mmedvede> clarkb: it has been relatively quiet, lennyb fyi ^^
19:45:41 <clarkb> jeblair: thanks
19:47:45 <fungi> AJaeger: good question... i wonder whether it needs branching or can have v2 and v3 content side-by-side
19:47:49 <clarkb> ok any other zuulv3 items before we move on to open discussion?
19:48:01 <fungi> that's in my opinion part of the puppet-openstackci work to determine
19:48:17 <AJaeger> fungi: and somebody would need to update it. Question is whether anybody is using it at all...
19:48:36 <jeblair> AJaeger, fungi: can likely support side-by-side as we did during our transition.
19:48:48 <fungi> convenient
19:48:52 <jeblair> though should probably just switch to v3 soon.
19:48:52 <mmedvede> clarkb: I'll take a look at puppet-openstackci for zuulv3 branch merge workarounds
19:49:02 <clarkb> mmedvede: thanks
19:49:10 <clarkb> jeblair: ^ sounds like you may have a volunteer?
19:49:14 <jeblair> \o/
19:49:37 <fungi> mmedvede: feel free to delegate/distribute the load to any other interested 3rd-party ci ops who show interest too
19:49:58 <fungi> though hopefully the work involved is relatively minimal
19:50:06 <mmedvede> this is my hope :)
19:50:12 <jeblair> ++
19:50:20 <fungi> but getting some of them to help test it out may make sense
19:50:34 <clarkb> ya I think having third party ci involved just for ^ is worthwhile
19:50:42 <clarkb> even if they aren't able to actively review the changes or write them
19:50:55 <pabelanger> most of the work is going to be moving our zuulv3 stuff from system-config back into puppet-openstackci
19:51:38 <mordred> pabelanger: ++
19:53:16 <clarkb> #topic open discussion
19:54:03 <clarkb> As a general heads up with the firefighting largely behind us I'd like to start organizing the infra TODO list. Basically something that shows new and old infra folk what work is happening and where they can help out if they have spare cycles
19:54:20 <clarkb> You'll probably see me ask for eyeballs on a storyboard board in the near future
19:54:52 * mordred is back to not being able to login to storyboard, fwiw
19:54:57 <clarkb> fun
19:54:59 <pabelanger> I still need to send out ML post about xenial upgrades, I'll try to get that out later today
19:55:00 <AJaeger> the Zuul v3 migration issue etherpad has still some items, could we all review it over the next days, please?
19:55:08 <clarkb> AJaeger: ++
19:56:37 <mordred> ++
19:58:08 <dmsimard> on an openfloor note, unbound reviews are up to try and see if this helps with our ongoing DNS resolution failures: https://review.openstack.org/#/q/topic:unbound-ttl
19:58:30 <dmsimard> should be good to go in, they're set to not change anything and effectively be no-op so that we can try it selectively in some jobs.
19:58:48 <clarkb> dmsimard: is that something we might want to try in a limited fashion on the jobs affected by the problem?
19:59:25 <dmsimard> clarkb: that's exactly the purpose, yes, we're actually not changing the defaults from unbound, but jobs can specify vars for cache-min-ttl and it'll be configured accordingly
19:59:35 <clarkb> gotcha sounds good
20:00:09 <clarkb> alright that is all the time we have, find us in #openstack-infra or on the infra mailing list. Thanks everyone
20:00:12 <clarkb> #endmeeting