19:01:29 <clarkb> #startmeeting infra
19:01:30 <openstack> Meeting started Tue Aug 29 19:01:29 2017 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:31 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:34 <openstack> The meeting name has been set to 'infra'
19:01:43 <clarkb> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:01:57 <clarkb> #topic Announcements
19:02:06 <clarkb> #info Pike release tomorrow, 2017-08-30
19:02:21 <bkero> o/
19:02:34 <clarkb> Early in the morning PDT the release will be happening.
19:02:35 <bkero> fungi: welcome to the other side :)
19:02:39 <fungi> release weeks are always fun
19:02:51 <jeblair> good to know!
19:02:55 <clarkb> 1100UTC is when the fun begins
19:02:58 <fungi> i've volunteered to be around starting at 1100z in case anything comes up
19:03:18 <clarkb> #info PTG planning
19:03:27 <clarkb> #link https://etherpad.openstack.org/p/infra-ptg-queens
19:03:40 <dmsimard> \o
19:03:47 <clarkb> I've gone ahead and edited that etherpad to be a bit more agenda like (though still not quite an agenda)
19:03:57 <clarkb> Please add or edit items if necessary
19:04:28 <mordred> o/
19:04:31 <clarkb> #link https://ttx.re/queens-ptg.html Thierry has also written down what you can expect and gives a good overview of the planning and organization of the PTG at a high level
19:04:46 <clarkb> #info Queens Cycle signing key ready for attestation
19:04:54 <clarkb> #link https://sks-keyservers.net/pks/lookup?op=vindex&search=0x4c8b8b5a694f612544b3b4bac52f01a3fbdb9949&fingerprint=on Queens Cycle signing key
19:05:01 <clarkb> #link http://docs.openstack.org/infra/system-config/signing.html#attestation attestation process
19:05:11 <clarkb> fungi would like us to go and sign the queens cycle release key
19:05:30 <clarkb> infra-root please do that when you have a spare moment
19:05:31 <fungi> i expect to swap that one into production at or shortly following the ptg
19:05:45 <fungi> once the cycle-trailing projects are tagged
19:06:10 <fungi> i meant to have it ready earlier, so apologies for the short notice
19:06:32 <jeblair> this might be a good time to update the docs to incorporate the new key into zuulv3
19:06:32 <clarkb> that is all I had for announcements, does anyone have any to add?
19:06:36 <fungi> i also added a suggestion to the ptg planning etherpad that we could do some key signing there if anyone wants to so it face-to-face
19:06:40 <mordred> jeblair: ++
19:06:47 <jeblair> fungi: ++keysigning
19:06:49 <fungi> s/so/do/
19:07:32 <fungi> i'll give the release team a heads up at their meeting on friday that the new keys is available too
19:07:40 <fungi> er, new key
19:08:12 <clarkb> #topic Actions from last meeting
19:08:20 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2017/infra.2017-08-22-19.03.txt Minutes from last meeting
19:08:44 <clarkb> fungi: have you managed to get switchport counts for infracloud together? I didn't dig into that csv you posted when debugging chocolate
19:09:19 <fungi> nope, that's the old table i fished out of an earlier audit which doesn't have switching/connectivity details
19:09:26 <pabelanger> o/
19:09:33 <fungi> though the dcm there doesn't seem to be getting back to me on it either
19:09:53 <fungi> at the moment, he's also probably keeping an eye on rising floodwaters :/
19:10:15 <clarkb> ya. We can keep it on the list of todos so we don't forget but I imagine flooding has made this a very low priority for them
19:10:21 <clarkb> #action fungi get switchport counts for infra-cloud
19:10:46 <clarkb> I did send out a reminder to the dev list about the gerrit upgrade. I will try to send out another during the PTG (or maybe even remind people at lunch)
19:11:21 <clarkb> ianw: I don't think the mirror update server has been updated yet right? probably worth waiting for after the release tomorrow since we are so close to that
19:11:39 <ianw> no, i chatted to tonyb about it ... but yeah, low priority given release
19:12:03 <clarkb> #action ianw upgrade mirror-update server and bandersnatch
19:12:12 <clarkb> #topic Specs approval
19:12:37 <clarkb> as of today I now have the right gerrit permissions to approve these.
19:13:00 <fungi> as of today i no longer have the responsibility to approve these ;)
19:13:07 <clarkb> I don't see any new specs (not surprising given run up to release and ptg and zuulv3)
19:13:10 <clarkb> #link https://review.openstack.org/#/c/462207/
19:13:17 <mordred> congratulations and condolences to you both :)
19:13:54 <Zara> +1
19:14:05 <fungi> clarkb: i guess that one's ready for rollcall?
19:14:26 <fungi> it's just a catch-up to what was implemented, right?
19:14:27 <clarkb> fungi: ya, though it already has a few of those votes and isn't very controversial. Do you all want until thursday to look it over or should I go ahead and merge it?
19:14:29 <clarkb> fungi: yup
19:15:40 <fungi> no opinion, though it already has more rollcall votes than a lot of entire specs i ended up approving
19:16:13 * fungi just added one more
19:16:25 <clarkb> I don't mind waiting until Thursday, it is what we've done in the past and gives others a chance to review it if relying on that
19:16:38 <clarkb> so why don't we open this up and I'll check in on it on Thursday
19:17:06 <clarkb> are there any other specs that should be called out?
19:18:53 <clarkb> #topic Priority Efforts
19:19:18 <clarkb> #info Gerrit ContactStore Removal
19:19:21 <clarkb> #link https://review.openstack.org/#/c/492287/
19:19:46 <clarkb> this change is the cleanup in infra-specs for the contactstore removal. It depends on a couple changes that need review. Would be nice to get those in so we can take this off of the list
19:20:17 <clarkb> gerrit topic:contact-store-removal
19:20:31 <clarkb> #topic Zuulv3
19:20:50 <clarkb> jeblair: anything you want to add here? I think we will be using this time slot next week for quick zuul meeting due to monday holiday?
19:21:25 <jeblair> yes, next week is our last meeting before the ptg, and monday is a holiday so no zuul meeting, so i'd like to check in on zuulv3 jobs then
19:21:31 <fungi> lots and lots of job changes flying for zuulv3 right now
19:21:50 <fungi> is there a recommended priority order to review them in? or just reviewreviewreview?
19:21:52 <jeblair> i think it makes sense for that effort to start merging with the wider infra team around then anyway :)
19:22:08 <jeblair> fungi: basically the second :)
19:22:13 <fungi> k, will do
19:22:30 <jeblair> clarkb: i do have an item for today though:
19:22:36 <jeblair> i'd like to chat about the git repo cache
19:22:47 <mordred> yah
19:23:13 <jeblair> in working on the devstack-legacy job (basically devstack-gate run as close as possible to the way we run it now) i found that the minimum projects list is still relatively long
19:23:24 <jeblair> clarkb has done much work in reducing it (thanks!)
19:23:40 <jeblair> but it's still not minimal. the git repos add up to about 2.3G
19:24:17 <jeblair> now in zuulv3, one of the things we wanted to do was avoid the need for a git cache on the images, so, to date, we push all the git repos from the executor to the nodes
19:24:28 <jeblair> in infra cloud, that 2.3GB transfer takes about 1 hour
19:24:41 <jeblair> (and of course, if more nodes are doing it, it will take even longer)
19:24:57 <jeblair> like almost everything else in zuulv3, of course we don't *have* to do it that way
19:25:06 <jeblair> we *can* use the git cache which is still there on the images
19:25:12 <jeblair> and i'm working on patches to do that
19:25:25 <jeblair> but the discussion i think we should have as a group is:
19:25:37 * mordred is in favor of adding support to the git repo copying for using a git cache if one exists
19:26:03 <fungi> do we have any real-world tests for pushing zuul refs in v3 for larger repos like nova?
19:26:21 <fungi> i imagine it's at least as straining as a git clone nova on a node would be
19:26:47 <jeblair> * what are our feelings around the on-image git repo caches short and long term? we had wanted to minimize image size; do we want to keep the caches around just for the transition? until something happens to ameliorate the infra-cloud problem? permanently?
19:27:30 <jeblair> fungi: i don't think we've pushed nova until this test (but nova is included in this test)
19:27:30 <ianw> yeah, at 150+mb that's bigger than some image downloads, and they've been terribly unreliable, and we don't have local git mirrors?
19:27:31 <fungi> we've mostly been coping with the larger image sizes, though i get it's not ideal
19:27:35 <clarkb> right now the resource in image size seems to be less important than in aggregate cloud bw in part because we can copy a good chunk of the data that needs to move once a day rather than for every job
19:28:54 <fungi> my preference is to get some comparisons between with and without leveraging the local cache for a devstack run, but likely use the on-disk git caches at least through a transitional period until we can get some better real-world data on the impact of not having them
19:28:59 <pabelanger> Our image builds have been pretty stable for the last while, and with zuulv3 it makes it much easier to ensure DIB uploads. Right now, it takes about ~40mins to build an image, then another 1hr to upload them all with rackspace. I'd be okay with continuing to use /opt/git cache if available
19:29:21 <jeblair> (incidentally pabelanger brought up a suggestion earlier: locate executors in clouds. it's a good idea, but i don't see us being able to implement it until something like the zuul v4 timeframe.)
19:29:40 <mordred> jeblair: yah - I think that's likely a great step, but I agree, it's not for now
19:29:46 <clarkb> jeblair: because we'd need to restrict them to only talking to local instances right?
19:29:52 <pabelanger> Ya, I like the idea of regional zuul-executors too, but v4 wfm
19:30:09 <jeblair> clarkb: also it's awkward with gearman
19:30:15 <fungi> yep, that gets us a nice middle-ground between on-disk git caches and updating from in-provider caches
19:30:18 <mordred> so I think using the git caches until we can show a migration plan off of them that doesn't add an hour of build time seems sane to me
19:30:32 <clarkb> mordred: ++
19:30:36 <fungi> that might be enough of an improvement to make not having the on-disk caches bearable
19:30:52 <pabelanger> mordred: ++
19:31:08 <mordred> also - since the git repo copying is done in a role in the base job, once we're past PTG and ready to try things, we can always push up test jobs that don't use the cache for comparison
19:31:23 <mordred> as we think we might have the situation licked
19:31:23 <jeblair> mordred, fungi: that's fair. a followup question -- shall we use this for all jobs? (i don't see why not)
19:31:29 <fungi> i do worry what it looks like to be doing a nova repo push across the atlantic
19:31:30 <mordred> jeblair: yah, I think so
19:31:42 <clarkb> I think that cinder multiattached volumes may actually have gotten in this release too, so in the future if all our clouds get to pike release and we haven't solved it otherwise we could try the local cinder volume cache idea as well
19:31:52 <fungi> i believe using it for all jobs makes sense from a consistency standpoint
19:32:02 <mordred> jeblair: basically making the current git repo rsync step smarter would get us a bunch of traction
19:32:04 <clarkb> jeblair: yes I think so, it will probably only cause confusion if we let users pick or choose
19:32:17 <fungi> making sure zuul v3 works without on-disk caches is still important of course, but we should likely continue to use them consistently for now
19:32:32 <jeblair> fungi: when you said 'in-provider cache' did you mean image, or do you mean setting something up on a new host, like the mirror, and cloning from there?
19:32:44 <jeblair> or put another way -- is anyone suggesting that idea ^?
19:33:14 <mordred> jeblair: it's not a terrible idea - but I think it's also a post-ptg kind of idea (and has a similar logic to optional-on-disk cache)
19:33:30 <fungi> i meant having the executors co-located in the same provider as the nodes they're driving would be sort of like having nodes do git clones from in-provider git mirror endpoints
19:33:31 <mordred> "please rsync these prepared repos to this node, but first please get base versions of those repos from XXX"
19:33:43 <clarkb> mordred: ya in large part because everyone wants to cache all the things that way and it turns out that scaling for caching the internet isn't straightforward
19:33:44 <jeblair> mordred: ya
19:33:48 <fungi> from a performance perspective i meant
19:33:49 <mordred> fungi: ++
19:34:08 <jeblair> so that may be something to consider if, after the ptg, we still want to further reduce image size but otherwise still have this problem
19:34:09 <fungi> i probably didn't actually say what i meant though ;)
19:34:12 <mordred> fungi: I also like the per-provider executors concept if/when we can make a good plan for it
19:34:17 <mordred> jeblair: ++
19:34:33 <jeblair> (also, i think we're about to be able to drop the deb- repos from the cache, so that should, like, halve its size)
19:34:41 <mordred> yah
19:34:47 <pabelanger> yup, retiring today actually
19:34:50 <clarkb> we already don't cache them
19:34:53 <jeblair> drat
19:34:55 <fungi> right, wendar has been working on that today
19:34:57 <mordred> fungi: speaking of - we should land that jeepyb change to stop showing retired repos ...
19:34:59 <jeblair> easy win was too easy
19:35:23 <mordred> jeblair: :(
19:35:35 <jeblair> okay so i'd propose an #agreed use the git repo cache for all v3 jobs until at least the transition is complete, then revisit
19:35:40 <fungi> mordred: seems like a good idea to me (and also skip them in codesearch). were there changes up for review?
19:35:45 <clarkb> we probably can optimize that list a bit more though, like double check we don't cache stackforge/ etc
19:35:47 <mordred> https://review.openstack.org/#/c/478992/ https://review.openstack.org/#/c/478939 and https://review.openstack.org/#/c/467235 if anybody feels like reviewing jeepyb changes
19:36:06 <mordred> fungi: I didn't think about codesearch - that's another good idea to add there
19:36:28 <clarkb> #agreed use the git repo cache for all v3 jobs until at least the transition is complete, then revisit
19:36:34 <jeblair> clarkb: eot from me
19:36:48 <mordred> clarkb: we should grab the projects.yaml list and filter anything marked retired using the same logic as those patches ^^
19:36:49 <clarkb> (happy to #undo that if someone disagrees but I heard general agreement )
19:36:53 <mordred> in the image building
19:36:55 <clarkb> mordred: ++
19:37:11 <fungi> i agree there's agreement
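For readers following along, a rough Ansible sketch of the approach agreed above: have the base job's repo-copying role seed each repo from the on-image /opt/git cache when it is present, so the executor only has to push the prepared refs rather than the full ~2.3G of history. This is illustrative only and not the actual role; the variable name project_name and the destination path under ~/src/ are made-up placeholders.

    # Sketch only; not the real base-job role. project_name and the
    # destination directory are hypothetical placeholders.
    - name: Check whether the image carries a cached copy of the repo
      stat:
        path: "/opt/git/{{ project_name }}"
      register: git_cache

    - name: Seed the workspace from the on-image cache when present
      command: git clone /opt/git/{{ project_name }} {{ ansible_user_dir }}/src/{{ project_name }}
      args:
        creates: "{{ ansible_user_dir }}/src/{{ project_name }}"
      when: git_cache.stat.exists

    # The executor then pushes only the prepared refs on top of the
    # seeded clone instead of transferring full history to the node.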
19:37:31 <clarkb> alright moving on
19:37:33 <clarkb> #topic General topics
19:37:48 <clarkb> #info Infracloud SSL certs were swapped out for snakeoil certs
19:38:29 <fungi> and there was much rejoicing?
19:38:35 <clarkb> last week infracloud vanilla's ssl cert expired. So we did a quick switch over to its snakeoil cert because it already has the correct CN in it. Then yesterday did the same in chocolate as its cert expires in a month
19:39:04 <clarkb> in the process updated how puppet master trusts those certs with clouds.yaml. We now directly trust the public cert rather than using update-ca-certificates (which wasn't puppeted on puppetmaster)
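For context, the clouds.yaml knob being referred to here is the per-cloud cacert option, which points os-client-config/openstacksdk at a specific certificate to trust for that cloud. A minimal sketch of the idea follows; the cloud name, auth_url and file path are placeholders, not the actual puppetmaster configuration.

    # Illustrative only; names and paths are placeholders.
    clouds:
      infracloud-vanilla:
        auth:
          auth_url: https://<controller-hostname>:5000/v3
          # credentials elided
        # Trust the controller's self-signed (snakeoil) certificate
        # directly rather than installing it into the system CA store
        # with update-ca-certificates.
        cacert: /path/to/infracloud-vanilla-cert.pem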
19:39:06 <fungi> what's the expiration period on the snakeoils?
19:39:13 <clarkb> fungi: 10 years, so ~9 more
19:39:28 <fungi> long enough, thanks ;)
19:39:47 <clarkb> this should be safe because if we swap out certs we don't need to revoke anything as we will just explicitly trust the new cert and stop trusting the old
19:39:54 <clarkb> I guess that is like a local revocation
19:40:58 <clarkb> I should also write up how certs are managed. This isn't currently documented and created some initial confusion while we figured it out
19:41:18 <clarkb> #action clarkb update infracloud docs to include ssl setup info
19:41:33 <clarkb> #info PTG Team Dinner
19:41:42 <clarkb> #link https://etherpad.openstack.org/p/infra-ptg-team-dinner
19:42:16 <clarkb> If you are interested in joining us for a dinner in denver please add yourself to that etherpad with availability info
19:43:00 <clarkb> it is looking like Tuesday will have to be the night based on current availability
19:43:22 <clarkb> I think there is a happy hour thing at the PTG that evening we can use to meet up and head out to somewhere like lowry beer garden
19:43:50 <pabelanger> +1
19:43:50 <clarkb> I'll likely solidy our plans for that early next week (so please do fill out the etherpad if you haven't yet)
19:43:58 <clarkb> *solidify
19:44:33 <clarkb> #topic Open discussion
19:44:55 <clarkb> That was basically all I had. Excited to see everyone in a couple weeks.
19:45:43 <jeblair> ++
19:46:01 <Zara> :D
19:46:07 <fungi> looking forward to it. it will be nice to not be the cat herder this time
19:46:50 <fungi> hoping i can get deeper into stuff than i did last ptg
19:47:37 <clarkb> Next monday is a holiday in various parts of the world including the US and Canada
19:48:18 <fungi> i will likely still be mostly around
19:48:24 <fungi> in case anything comes up
19:48:46 <fungi> i don't have any travel plans for the weekend/holiday
19:49:34 <clarkb> Alright, doesn't sound like we've got much for open discussion. You can find us in #openstack-infra if something comes up or on the mailing list. Thank you everyone
19:49:44 <clarkb> #endmeeting