19:01:29 <clarkb> #startmeeting infra
19:01:30 <openstack> Meeting started Tue Aug 29 19:01:29 2017 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:31 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:34 <openstack> The meeting name has been set to 'infra'
19:01:43 <clarkb> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:01:57 <clarkb> #topic Announcements
19:02:06 <clarkb> #info Pike release tomorrow, 2017-08-30
19:02:21 <bkero> o/
19:02:34 <clarkb> Early in the morning PDT the release will be happening.
19:02:35 <bkero> fungi: welcome to the other side :)
19:02:39 <fungi> release weeks are always fun
19:02:51 <jeblair> good to know!
19:02:55 <clarkb> 1100UTC is when the fun begins
19:02:58 <fungi> i've volunteered to be around starting at 1100z in case anything comes up
19:03:18 <clarkb> #info PTG planning
19:03:27 <clarkb> #link https://etherpad.openstack.org/p/infra-ptg-queens
19:03:40 <dmsimard> \o
19:03:47 <clarkb> I've gone ahead and edited that etherpad to be a bit more agenda-like (though still not quite an agenda)
19:03:57 <clarkb> Please add or edit items if necessary
19:04:28 <mordred> o/
19:04:31 <clarkb> #link https://ttx.re/queens-ptg.html Thierry has also written down what you can expect and gives a good overview of the planning and organization of the PTG at a high level
19:04:46 <clarkb> #info Queens Cycle signing key ready for attestation
19:04:54 <clarkb> #link https://sks-keyservers.net/pks/lookup?op=vindex&search=0x4c8b8b5a694f612544b3b4bac52f01a3fbdb9949&fingerprint=on Queens Cycle signing key
19:05:01 <clarkb> #link http://docs.openstack.org/infra/system-config/signing.html#attestation attestation process
19:05:11 <clarkb> fungi would like us to go and sign the queens cycle release key
19:05:30 <clarkb> infra-root please do that when you have a spare moment
19:05:31 <fungi> i expect to swap that one into production at or shortly following the ptg
19:05:45 <fungi> once the cycle-trailing projects are tagged
19:06:10 <fungi> i meant to have it ready earlier, so apologies for the short notice
19:06:32 <jeblair> this might be a good time to update the docs to incorporate the new key into zuulv3
19:06:32 <clarkb> that is all I had for announcements, does anyone have any to add?
19:06:36 <fungi> i also added a suggestion to the ptg planning etherpad that we could do some key signing there if anyone wants to do it face-to-face
19:06:40 <mordred> jeblair: ++
19:06:47 <jeblair> fungi: ++keysigning
19:07:32 <fungi> i'll give the release team a heads up at their meeting on friday that the new key is available too
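(A hypothetical outline of the attestation steps, for convenience only; the canonical process is the system-config signing doc linked above. The key ID is taken from the sks-keyservers link, while the gpg invocation details and keyserver configuration are assumptions.)

```python
import subprocess

# Queens cycle signing key fingerprint, from the sks-keyservers link above.
KEY_ID = "0x4c8b8b5a694f612544b3b4bac52f01a3fbdb9949"

# Rough outline only: fetch the key, check its fingerprint against the one
# published above, sign it with your own key, and push the signature back.
# Assumes gpg is installed and a keyserver is already configured.
for args in (["--recv-keys", KEY_ID],
             ["--fingerprint", KEY_ID],
             ["--sign-key", KEY_ID],
             ["--send-keys", KEY_ID]):
    subprocess.run(["gpg"] + args, check=True)
```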
19:08:12 <clarkb> #topic Actions from last meeting
19:08:20 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2017/infra.2017-08-22-19.03.txt Minutes from last meeting
19:08:44 <clarkb> fungi: have you managed to get switchport counts for infracloud together? I didn't dig into that csv you posted when debugging chocolate
19:09:19 <fungi> nope, that's the old table i fished out of an earlier audit which doesn't have switching/connectivity details
19:09:26 <pabelanger> o/
19:09:33 <fungi> though the dcm there doesn't seem to be getting back to me on it either
19:09:53 <fungi> at the moment, he's also probably keeping an eye on rising floodwaters :/
19:10:15 <clarkb> ya. We can keep it on the list of todos so we don't forget but I imagine flooding has made this a very low priority for them
19:10:21 <clarkb> #action fungi get switchport counts for infra-cloud
19:10:46 <clarkb> I did send out a reminder to the dev list about the gerrit upgrade. I will try to send out another during the PTG (or maybe even remind people at lunch)
19:11:21 <clarkb> ianw: I don't think the mirror update server has been updated yet right? probably worth waiting until after the release tomorrow since we are so close to that
19:11:39 <ianw> no, i chatted to tonyb about it ... but yeah, low priority given release
19:12:03 <clarkb> #action ianw upgrade mirror-update server and bandersnatch
19:12:12 <clarkb> #topic Specs approval
19:12:37 <clarkb> as of today I now have the right gerrit permissions to approve these.
19:13:00 <fungi> as of today i no longer have the responsibility to approve these ;)
19:13:07 <clarkb> I don't see any new specs (not surprising given run up to release and ptg and zuulv3)
19:13:10 <clarkb> #link https://review.openstack.org/#/c/462207/
19:13:17 <mordred> congratulations and condolences to you both :)
19:13:54 <Zara> +1
19:14:05 <fungi> clarkb: i guess that one's ready for rollcall?
19:14:26 <fungi> it's just a catch-up to what was implemented, right?
19:14:27 <clarkb> fungi: ya, though it already has a few of those votes and isn't very controversial. Do you all want until thursday to look it over or should I go ahead and merge it?
19:14:29 <clarkb> fungi: yup
19:15:40 <fungi> no opinion, though it already has more rollcall votes than a lot of entire specs i ended up approving
19:16:13 * fungi just added one more
19:16:25 <clarkb> I don't mind waiting until Thursday, it is what we've done in the past and gives others a chance to review it if they're relying on that
19:16:38 <clarkb> so why don't we open this up and I'll check in on it on Thursday
19:17:06 <clarkb> are there any other specs that should be called out?
19:18:53 <clarkb> #topic Priority Efforts
19:19:18 <clarkb> #info Gerrit ContactStore Removal
19:19:21 <clarkb> #link https://review.openstack.org/#/c/492287/
19:19:46 <clarkb> this change is the cleanup in infra-specs for the contactstore removal. It depends on a couple changes that need review. Would be nice to get those in so we can take this off of the list
19:20:17 <clarkb> gerrit topic:contact-store-removal
19:20:31 <clarkb> #topic Zuulv3
19:20:50 <clarkb> jeblair: anything you want to add here? I think we will be using this time slot next week for a quick zuul meeting due to the Monday holiday?
19:21:25 <jeblair> yes, next week is our last meeting before the ptg, and monday is a holiday so no zuul meeting, so i'd like to check in on zuulv3 jobs then
19:21:31 <fungi> lots and lots of job changes flying for zuulv3 right now
19:21:50 <fungi> is there a recommended priority order to review them in? or just reviewreviewreview?
19:21:52 <jeblair> i think it makes sense for that effort to start merging with the wider infra team around then anyway :)
19:22:08 <jeblair> fungi: basically the second :)
19:22:13 <fungi> k, will do
19:22:30 <jeblair> clarkb: i do have an item for today though:
19:22:36 <jeblair> i'd like to chat about the git repo cache
19:22:47 <mordred> yah
19:23:13 <jeblair> in working on the devstack-legacy job (basically devstack-gate run as close as possible to the way we run it now) i found that the minimum projects list is still relatively long
19:23:24 <jeblair> clarkb has done much work in reducing it (thanks!)
19:23:40 <jeblair> but it's still not minimal.  the git repos add up to about 2.3G
19:24:17 <jeblair> now in zuulv3, one of the things we wanted to do was avoid the need for a git cache on the images, so, to date, we push all the git repos from the executor to the nodes
19:24:28 <jeblair> in infra cloud, that 2.3GB transfer takes about 1 hour
19:24:41 <jeblair> (and of course, if more nodes are doing it, it will take even longer)
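(Back-of-the-envelope math on that figure, not numbers from the meeting: 2.3 GB in roughly an hour works out to only about 5 Mbit/s of effective throughput, and that cost repeats for every node pulling the repos.)

```python
# Effective throughput for a 2.3 GB repo push taking ~1 hour, as mentioned above.
size_gb = 2.3
seconds = 3600
mbps = size_gb * 8 * 1000 / seconds  # GB -> Mbit, per second
print(f"~{mbps:.1f} Mbit/s per node")  # ~5.1 Mbit/s
```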
19:24:57 <jeblair> like almost everything else in zuulv3, of course we don't *have* to do it that way
19:25:06 <jeblair> we *can* use the git cache which is still there on the images
19:25:12 <jeblair> and i'm working on patches to do that
19:25:25 <jeblair> but the discussion i think we should have as a group is:
19:25:37 * mordred is in favor of adding support to the git repo copying for using a git cache if one exists
19:26:03 <fungi> do we have any real-world tests for pushing zuul refs in v3 for larger repos like nova?
19:26:21 <fungi> i imagine it's at least as straining as a git clone nova on a node would be
19:26:47 <jeblair> * what are our feelings around the on-image git repo caches short and long term?  we had wanted to minimize image size; do we want to keep the caches around just for the transition?  until something happens to ameliorate the infra-cloud problem?  permanently?
19:27:30 <jeblair> fungi: i don't think we've pushed nova until this test (but nova is included in this test)
19:27:30 <ianw> yeah, at 150+mb that's bigger than some image downloads, and they've been terribly unreliable, and we don't have local git mirrors?
19:27:31 <fungi> we've mostly been coping with the larger image sizes, though i get it's not ideal
19:27:35 <clarkb> right now the resource cost in image size seems to be less important than aggregate cloud bw, in part because we can copy a good chunk of the data that needs to move once a day rather than for every job
19:28:54 <fungi> my preference is to get some comparisons between with and without leveraging the local cache for a devstack run, but likely use the on-disk git caches at least through a transitional period until we can get some better real-world data on the impact of not having them
19:28:59 <pabelanger> Our image builds have been pretty stable for the last while, and with zuulv3 it makes it much easier to ensure DIB uploads.  Right now, it takes about ~40mins to build an image, then another 1hr to upload them all with rackspace. I'd be okay with continuing to use /opt/git cache if available
19:29:21 <jeblair> (incidentally pabelanger brought up a suggestion earlier: locate executors in clouds.  it's a good idea, but i don't see us being able to implement it until something like the zuul v4 timeframe.)
19:29:40 <mordred> jeblair: yah - I think that's likely a great step, but I agree, it's not for now
19:29:46 <clarkb> jeblair: because we'd need to restrict them to only talking to local instances right?
19:29:52 <pabelanger> Ya, I like the idea of regional zuul-executors too, but v4 wfm
19:30:09 <jeblair> clarkb: also it's awkward with gearman
19:30:15 <fungi> yep, that gets us a nice middle-ground between on-disk git caches and updating from in-provider caches
19:30:18 <mordred> so I think using the git caches until we can show a migration plan off of them that doesn't add an hour of build time seems sane to me
19:30:32 <clarkb> mordred: ++
19:30:36 <fungi> that might be enough of an improvement to make not having the on-disk caches bearable
19:30:52 <pabelanger> mordred: ++
19:31:08 <mordred> also - since the git repo copying is done in a role in the base job, once we're past PTG and ready to try things, we can always push up test jobs that don't use the cache for comparison
19:31:23 <mordred> as we think we might have the situation licked
19:31:23 <jeblair> mordred, fungi: that's fair.  a followup question -- shall we use this for all jobs?  (i don't see why not)
19:31:29 <fungi> i do worry what it looks like to be doing a nova repo push across the atlantic
19:31:30 <mordred> jeblair: yah, I think so
19:31:42 <clarkb> I think that cinder multiattached volumes may actually have gotten in this release too, so in the future if all our clouds get to pike release and we haven't solved it otherwise we could try the local cinder volume cache idea as well
19:31:52 <fungi> i believe using it for all jobs makes sense from a consistency standpoint
19:32:02 <mordred> jeblair: basically making the current git repo rsync step smarter would get us a bunch of traction
19:32:04 <clarkb> jeblair: yes I think so, it will probably only cause confusion if we let users pick or choose
19:32:17 <fungi> making sure zuul v3 works without on-disk caches is still important of course, but we should likely continue to use them consistently for now
19:32:32 <jeblair> fungi: when you said 'in-provider cache' did you mean image, or do you mean setting something up on a new host, like the mirror, and cloning from there?
19:32:44 <jeblair> or put another way -- is anyone suggesting that idea ^?
19:33:14 <mordred> jeblair: it's not a terrible idea - but I think it's also a post-ptg kind of idea (and has a similar logic to optional-on-disk cache)
19:33:30 <fungi> i meant having the executors co-located in the same provider as the nodes they're driving would be sort of like having nodes do git clones from in-provider git mirror endpoints
19:33:31 <mordred> "please rsync these prepared repos to this node, but first please get base versions of those repos from XXX"
19:33:43 <clarkb> mordred: ya in large part because everyone wants to cache all the things that way and it turns out that scaling for caching the internet isn't straightforward
19:33:44 <jeblair> mordred: ya
19:33:48 <fungi> from a performance perspective i meant
19:33:49 <mordred> fungi: ++
19:34:08 <jeblair> so that may be something to consider if, after the ptg, we still want to further reduce image size but otherwise still have this problem
19:34:09 <fungi> i probably didn't actually say what i meant though ;)
19:34:12 <mordred> fungi: I also like the per-provider executors concept if/when we can make a good plan for it
19:34:17 <mordred> jeblair: ++
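(A minimal sketch of the "seed from the on-image cache first, then sync the prepared repos" idea mordred describes above, assuming the cache lives under /opt/git and the prepared state comes from the executor; this is not the actual Zuul role, and the paths, URL, and refspec are illustrative.)

```python
import os
import subprocess

CACHE_ROOT = "/opt/git"        # on-image git cache (assumed location)
WORKSPACE = "/home/zuul/src"   # destination for prepared repos (assumed)

def prepare_repo(project, executor_url):
    """Seed from the local cache when present, then sync the prepared state;
    otherwise fall back to transferring the whole repo."""
    dest = os.path.join(WORKSPACE, project)
    cached = os.path.join(CACHE_ROOT, project)
    if os.path.isdir(cached):
        # Local clone from the cache keeps most objects off the network.
        subprocess.run(["git", "clone", cached, dest], check=True)
        # Only the delta (the prepared refs) then comes from the executor.
        subprocess.run(["git", "-C", dest, "fetch", executor_url,
                        "+refs/heads/*:refs/remotes/origin/*"], check=True)
    else:
        # No cache available: transfer everything (placeholder for the
        # existing push/rsync of the full prepared repo).
        subprocess.run(["git", "clone", executor_url, dest], check=True)
```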
19:34:33 <jeblair> (also, i think we're about to be able to drop the deb- repos from the cache, so that should, like, halve its size)
19:34:41 <mordred> yah
19:34:47 <pabelanger> yup, retiring today actually
19:34:50 <clarkb> we already don't cache them
19:34:53 <jeblair> drat
19:34:55 <fungi> right, wendar has been working on that today
19:34:57 <mordred> fungi: speaking of - we should land that jeepyb change to stop showing retired repos ...
19:34:59 <jeblair> easy win was too easy
19:35:23 <mordred> jeblair: :(
19:35:35 <jeblair> okay so i'd propose an #agreed use the git repo cache for all v3 jobs until at least the transition is complete, then revisit
19:35:40 <fungi> mordred: seems like a good idea to me (and also skip them in codesearch). were there changes up for review?
19:35:45 <clarkb> we probably can optimize that list a bit more though, like double check we don't cache stackforge/ etc
19:35:47 <mordred> https://review.openstack.org/#/c/478992/ https://review.openstack.org/#/c/478939 and https://review.openstack.org/#/c/467235 if anybody feels like reviewing jeepyb changes
19:36:06 <mordred> fungi: I didn't think about codesearch - that's another good idea to add there
19:36:28 <clarkb> #agreed use the git repo cache for all v3 jobs until at least the transition is complete, then revisit
19:36:34 <jeblair> clarkb: eot from me
19:36:48 <mordred> clarkb: we should grab the projects.yaml list and filter anything marked retired using the same logic as those patches ^^
19:36:49 <clarkb> (happy to #undo that if someone disagrees but I heard general agreement )
19:36:53 <mordred> in the image building
19:36:55 <clarkb> mordred: ++
19:37:11 <fungi> i agree there's agreement
19:37:31 <clarkb> alright moving on
19:37:33 <clarkb> #topic General topics
19:37:48 <clarkb> #info Infracloud SSL certs were swapped out for snakeoil certs
19:38:29 <fungi> and there was much rejoicing?
19:38:35 <clarkb> last week infracloud vanilla's ssl cert expired. So we did a quick switch over to its snakeoil cert because it already has the correct CN in it. Then yesterday did the same in chocolate as its cert expires in a month
19:39:04 <clarkb> in the process updated how puppet master trusts those certs with clouds.yaml. We now directly trust the public cert rather than using update-ca-certificates (which wasn't puppeted on puppetmaster)
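(A sketch of what that clouds.yaml trust looks like, assuming the per-cloud cacert option; the cloud name, auth_url, and certificate path are illustrative, and PyYAML is used only to render the structure.)

```python
import yaml  # PyYAML, just to render the structure

# Pointing the per-cloud "cacert" at the self-signed (snakeoil) certificate
# means the puppetmaster trusts that cert directly, with no
# update-ca-certificates step needed.
clouds = {
    "clouds": {
        "infracloud-vanilla": {  # cloud name assumed
            "auth": {"auth_url": "https://keystone.example.org:5000/v3"},  # placeholder
            "cacert": "/etc/openstack/infracloud-vanilla.pem",  # assumed path
        }
    }
}
print(yaml.safe_dump(clouds, default_flow_style=False))
```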
19:39:06 <fungi> what's the expiration period on the snakeoils?
19:39:13 <clarkb> fungi: 10 years, so ~9 more
19:39:28 <fungi> long enough, thanks ;)
19:39:47 <clarkb> this should be safe because if we swap out certs we don't need to revoke anything as we will just explicitly trust the new cert and stop trusting the old
19:39:54 <clarkb> I guess that is like a local revocation
19:40:58 <clarkb> I should also write up how certs are managed. This isn't currently documented and created some initial confusion while we figured it out
19:41:18 <clarkb> #action clarkb update infracloud docs to include ssl setup info
19:41:33 <clarkb> #info PTG Team Dinner
19:41:42 <clarkb> #link https://etherpad.openstack.org/p/infra-ptg-team-dinner
19:42:16 <clarkb> If you are interested in joining us for dinner in Denver please add yourself to that etherpad with availability info
19:43:00 <clarkb> it is looking like Tuesday will have to be the night based on current availability
19:43:22 <clarkb> I think there is a happy hour thing at the PTG that evening we can use to meet up and head out to somewhere like lowry beer garden
19:43:50 <pabelanger> +1
19:43:50 <clarkb> I'll likely solidify our plans for that early next week (so please do fill out the etherpad if you haven't yet)
19:44:33 <clarkb> #topic Open discussion
19:44:55 <clarkb> That was basically all I had. Excited to see everyone in a couple weeks.
19:45:43 <jeblair> ++
19:46:01 <Zara> :D
19:46:07 <fungi> looking forward to it. it will be nice to not be the cat herder this time
19:46:50 <fungi> hoping i can get deeper into stuff than i did last ptg
19:47:37 <clarkb> Next monday is a holiday in various parts of the world including the US and Canada
19:48:18 <fungi> i will likely still be mostly around
19:48:24 <fungi> in case anything comes up
19:48:46 <fungi> i don't have any travel plans for the weekend/holiday
19:49:34 <clarkb> Alright, doesn't sound like we've got much for open discussion. You can find us in #openstack-infra if something comes up or on the mailing list. Thank you everyone
19:49:44 <clarkb> #endmeeting