19:01:09 <clarkb> #startmeeting infra
19:01:10 <openstack> Meeting started Tue Jun 19 19:01:09 2018 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:11 <ianw> o/
19:01:12 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:14 <openstack> The meeting name has been set to 'infra'
19:01:23 <clarkb> I know a few of us are traveling or otherwise unable to attend
19:01:33 <clarkb> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:01:48 <clarkb> #topic Announcements
19:02:29 <clarkb> I'm going to be out middayish tomorrow. My brother bought a new drift boat and he wants to take it out with the kids.
19:02:58 <anteaya> nice
19:03:30 <clarkb> Other than that, just a friendly reminder to take the openstack user survey if you are an operator or user of openstack
19:03:43 <clarkb> any other announcements?
19:04:30 <clarkb> #topic Actions from last meeting
19:04:46 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2018/infra.2018-06-12-19.01.txt Minutes from last meeting
19:04:58 <clarkb> Again no formal actions, however the informal one from mordred has been started \o/
19:05:03 <clarkb> which leads us into the next topic
19:05:13 <clarkb> #topic Specs approval
19:05:26 <clarkb> #link https://review.openstack.org/#/c/565550/ config mgmt and containers
19:05:27 <patchbot> patch 565550 - openstack-infra/infra-specs - WIP Update config management for the Infra Control...
19:05:39 <clarkb> This spec is still WIP so not quite up for approval
19:05:54 <mordred> yah - I need to put in some thoughts about hiera and inventory
19:06:00 <mordred> and it probably needs more cleaning
19:06:10 <clarkb> mordred: are you ready for us to start providing feedback?
19:06:30 <mordred> nah - lemme take one more pass at it
19:06:35 <mordred> I'll take the WIP off
19:07:07 <clarkb> ok, I'll keep an eye on it for the WIP to be removed, then ping people to take a look
19:07:47 <clarkb> #topic Priority Efforts
19:08:34 <clarkb> On the config management side of things, cmurphy isn't able to attend the meeting today but the topic:puppet-4 changes could use some review
19:08:53 <clarkb> I did a pass last week and will try to do another pass this week. Help is appreciated :)
19:09:02 <clarkb> ianw: ty for the help there btw
19:09:04 <mordred> I +A'd one a few minutes ago
19:09:11 <mordred> but yeah - there are patches
19:10:11 <clarkb> Other than that, the combined spec is in progress and puppet 4 is making slow but steady progress. Anything else we want to talk about related to config management future and updates?
19:11:07 <fungi> related stuff later in the mm3 topic
19:11:16 <clarkb> I mentioned it last week but I think this will make a great PTG topic. I'll work to get a PTG planning etherpad up this week. I hear rumors we should have a schedule out soon too
19:11:27 <clarkb> Look for that etherpad in an infra list thread
19:11:59 <clarkb> Ok, storyboard
19:12:24 <clarkb> fungi: ^ anything new we should be aware of with storyboard? How is the outreachy internship bootstrapping going?
19:14:08 <clarkb> fungi: still with us?
19:14:26 <fungi> i saw some more discussion in #storyboard with fatema (our outreachy intern) on getting a working dev deployment going and coming to an understanding with how we interact with gerrit
19:14:35 <fungi> yep, sorry, too many discussions at once
19:14:53 <fungi> the api sig also moved from lp to sb last week
19:15:08 <fungi> (and has a related request on the agenda for a repo rename too)
19:15:49 <fungi> there's been some passing discussion in #openstack-tc about whether sb should be on the help wanted list, and/or whether having it not overshadowed by openstack would increase visibility
19:16:35 <fungi> i think that's it for this week
19:16:48 <clarkb> cool, thanks for the update
19:17:23 <clarkb> #topic General Topics
19:17:43 <clarkb> ianw and fungi have done work to add the new platform9 managed cloud running on packethost
19:17:52 <fungi> mostly ianw
19:17:54 <fungi> almost entirely
19:18:03 <clarkb> it is running jobs with its max-servers value set to 95 now
19:18:21 <mordred> \o/
19:18:39 <clarkb> seems to be doing ok, though I've noticed there are some boot errors (could just be image conversion timeouts) and we seem to have peaked at ~45 instances which may indicate our quota is too high (or there just wasn't enough demand to go higher)
19:18:44 <anteaya> thank you ianw and fungi
19:19:24 <fungi> they're also talking about providing arm64 instances soon too
19:19:27 <ianw> hmm, there should be logs about quota
19:19:36 <clarkb> Keep an eye out for new/weird/unexpected behaviors from jobs running on these instances
19:19:42 <fungi> it'll be nice to have some cross-provider redundancy for arm64-based jobs
19:19:55 <clarkb> ianw: nodepool logs probably have insights too, I just haven't had a chance to look at them yet juggling meetings and meeting prep
19:20:17 <ianw> i believe we're soon getting access to a london based cloud for arm64 too, there is a ticket open. will add when we get credentials
19:21:00 <clarkb> in any case it appears to be going well, so thank you to our new infra donors and to those of us that helped set it up
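[For context on the quota question above, a minimal sketch of one way to compare what the cloud reports against what is actually booted. This assumes openstacksdk is installed and a clouds.yaml entry named "packethost" exists; the cloud name and limit key names are illustrative and may differ by SDK version, and nodepool keeps its own quota accounting and logs, so this is only a quick manual cross-check.]

    # Hypothetical check: compare the compute limits the cloud reports with
    # the number of servers currently booted in the project.
    import openstack

    conn = openstack.connect(cloud='packethost')   # cloud name is illustrative

    limits = conn.get_compute_limits()             # absolute limits for the project
    servers = conn.list_servers()

    # Key names come from the SDK's normalized limits; verify against your version.
    print("max instances allowed:", limits['max_total_instances'])
    print("instances booted now: ", len(servers))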
19:22:16 <clarkb> #link https://etherpad.openstack.org/p/infra-services-naming winterstack naming
19:22:33 <clarkb> I wanted to remind everyone of ^ the feedback has been helpful I think
19:24:11 <clarkb> I'll resync with jbryce on further refining what we have there
19:24:16 <mordred> ++
19:24:30 <fungi> yes, has been a remarkably insightful process, more than i anticipated
19:24:54 <clarkb> but you still have time to get more ideas onto paper :)
19:25:51 <clarkb> Next item is we need to do an SSL certificate refresh. June 30 is when most of our certs expire. I expect I will be doing the actual purchasing and flipping of certs, but had a couple questions I thought I would ask the group.
19:26:16 <clarkb> The first is do we want 1 year or longer certs? and second should I refresh certs that won't be expiring soon to get them on the same schedule?
19:26:47 <fungi> i did 1 year last time because i thought we might be switching to startcom/startssl... now glad we didn't ;)
19:27:03 <mordred> I could see refreshing so that they're all on the same schedule
19:27:12 <fungi> i also lumped in any which were expiring in the next few to six months usually to try and get them eventually in sync
19:27:42 <anteaya> I vote 1 year for certs
19:27:58 <anteaya> things change so quickly I go with the shortest version that makes sense
19:28:11 <clarkb> ya, possible that winterstack naming will play into that too
19:28:22 <anteaya> as for cert cadence I leave that up to the person refreshing
19:28:24 <clarkb> if we don't whitelabel some services, having paid less for a shorter cert may be good
19:28:25 <anteaya> which isn't me
19:28:25 <fungi> i've also been trying out letsencrypt but so far am not sure what our configuration management solution would look like for that given there's a chicken-and-egg sort of process with adding the certs (either need a webserver up on that site already, or need to be able to temporarily add/remove records from dns)
19:29:03 <clarkb> fungi: one approach I've seen is to front everything with a proxy that redirects the letsencrypt renewals to a dedicated renewal host
19:29:12 <clarkb> I don't think that will work too well for us though
19:29:24 <fungi> ick. that's turning mediocre security into terrible security
19:29:57 <ianw> ... hence fungi asking about rax dns api keys?
19:30:06 <fungi> nope
19:30:23 <fungi> i just wanted to set dns records for the lists-dev01.o.o server i'm using for the mm3 poc
19:30:39 <fungi> nothing nearly so exciting
19:30:42 <ianw> oh :) i thought you might be POCing a renewal bot :)
19:31:35 <clarkb> ok, I'll plan to buy 1 year renewals and look to renew anything that expires within the next 6 months or so
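[As a rough aid for the "renew anything expiring within the next 6 months" plan above, a minimal sketch of checking expiry dates over TLS from any host with outbound access on 443. The host names are examples only, not the team's actual inventory, and this only inspects whatever cert each site currently serves.]

    # Sketch: print the notAfter date of the certificate each host presents.
    import ssl
    import socket
    from datetime import datetime

    HOSTS = ['review.openstack.org', 'etherpad.openstack.org']  # illustrative list

    context = ssl.create_default_context()
    for host in HOSTS:
        with socket.create_connection((host, 443), timeout=10) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
        # 'notAfter' is a string like 'Jun 30 12:00:00 2018 GMT'
        expires = datetime.utcfromtimestamp(ssl.cert_time_to_seconds(cert['notAfter']))
        print(host, 'expires', expires.date())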
19:31:56 <clarkb> and now we can talk about fungi's mailman 3 work
19:32:22 <clarkb> mostly I was interested in hearing about the trouble you've had bringing up an instance running it, to see if the crowd had any suggestions for working around them
19:32:27 <fungi> oh, heh
19:32:34 <fungi> yeah, it's turned into a major rabbit hole
19:33:05 <fungi> to preface, the tc and first contact sig and others have been expressing interest in finding ways to make interacting with the community easier while still using mailing lists
19:33:55 <fungi> mailman 3 (a complete rewrite of mailman 2 which we use today) has some particular features which make it possible to interact with mailing lists in nonconventional ways, such as posting through a webui
19:34:27 <fungi> unfortunately, mm3 is a big, complex beast (much more than mm2 is/was) and it's only recently started getting packaged in linux distros
19:34:59 <fungi> so putting together a proof of concept deployment for people to just play around with has been far more challenging than i anticipated
19:35:19 <fungi> #link https://etherpad.openstack.org/p/mm3poc Mailman 3 Proof of Concept Deployment Notes
19:35:30 <fungi> so far i have some deployment challenge notes accumulated there
19:35:58 <fungi> it exposed, in particular, that we're looking at some not so fun scenarios when it comes to trying to deploy, well, anything on ubuntu bionic (18.04 lts)
19:36:47 <fungi> i've just now (mostly during the meeting) determined that the "xenial pc1" suite of packages puppetlabs provides seems to work on bionic, and gets us puppet 4
19:37:00 <fungi> so might represent a possible stepping stone
19:37:24 <clarkb> we've seen lack of bionic packages in other places too, like docker
19:37:26 <fungi> it at least runs, though as cmurphy's outstanding changes attest, we've still got some configuration updates to get through
19:37:41 <clarkb> I dunno if ubuntu is just less important to these groups or if there is something that makes packaging for it challenging
19:38:03 <fungi> i think deploying in not-containers is increasingly less important to lots of development communities
19:38:34 <clarkb> could be, though containers tend to still need packages too
19:39:08 <fungi> especially for web-facing services where they don't feel like packaging all of the numerous (particularly javascript) dependencies
19:39:22 <clarkb> if puppetlabs PC1 puppet4 works on bionic do you expect you'll push forward with that?
19:39:47 <fungi> so instead of packaging you have some hackish scripted combination of pip install and npm install to get everything into your container image, and then rework the world whenever your scripted assumptions break
19:40:34 <fungi> well, i mostly just needed some means of exposing a manually-installed mailman3 deployment to people to test out and see what/how we might use it
19:41:06 <fungi> but yes, if we're interested in deploying/upgrading servers to bionic, so far the xenial pc1 suite at puppetlabs looks like our best option for having something workable soonish
19:41:19 <clarkb> good to know, thanks
19:41:50 <fungi> even though it says xenial on the tin, it seems to have its dependencies satisfied okay on bionic
19:42:13 <fungi> (using the xenial-packaged puppet 3 on the other hand broke horribly with bionic's ruby packages)
19:42:16 <ianw> ++ great to have some initial feedback on bionic hosts
19:43:27 <fungi> anyway, other than those initial insights and the fact that there'll be a mailman3 server (in the next day or so at this rate) people can play around with if they're interested in helping, nothing else from me on this topic
19:43:59 <clarkb> Thanks again. Next thing I wanted to bring up was mnaser and I are meeting with some kata folks in 45 minutes to talk about kata on zuul
19:44:11 <clarkb> There is a zoom conference call if others are interested in participating
19:45:18 <anteaya> where will the meeting id be posted?
19:45:30 <clarkb> It was posted to the kata irc channel, I can share it too
19:45:34 * clarkb finds it
19:45:40 <anteaya> thanks
19:45:49 <clarkb> https://zoom.us/j/398799714
19:46:04 <anteaya> thank you
19:46:11 <anteaya> what is the name of the kata irc channel
19:46:18 <clarkb> #kata-dev on freenode
19:46:18 <anteaya> I've learned it isn't #kata
19:46:21 <anteaya> thanks
19:46:29 <clarkb> that is all I had
19:46:33 <clarkb> #topic Open Discussion
19:47:09 <clarkb> Anything else?
19:47:19 <clarkb> Happy to end early so I can grab lunch before the kata meeting
19:47:22 <clarkb> :)
19:47:50 <ianw> not sure if people saw my note about the opensuse mirror
19:48:01 <ianw> #link http://lists.openstack.org/pipermail/openstack-infra/2018-June/005972.html
19:48:25 <fungi> ahh, right. we could probably find ways to manually shard that across a couple of volumes if we absolutely must
19:48:28 <clarkb> ianw: what did you think of cmurphy's suggestion of removing something like half the packages?
19:48:48 <ianw> fungi: it's not volumes, but directories
19:49:11 <ianw> clarkb: i'm happy to review any changes :)
19:49:28 <ianw> it will require pruning the list before the rsync somehow
19:49:42 <clarkb> oh right, it's rsync
19:49:50 <clarkb> that does make it more complicated
19:49:56 <ianw> maybe something like get a list of files, sed/awk/munge it, then feed that back
19:50:16 <fungi> ianw: oh, right, moving some to another volume doesn't solve it i guess
19:50:55 <fungi> and yes, some regular expression to be able to exclude files by name pattern would probably be needed to make the filtering with rsync possible
19:51:18 <ianw> if the duplicates can be pattern matched out, that would be great
19:51:21 <fungi> or if suse has a reprepro-like solution...
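[A rough sketch of the "get a list of files, munge it, feed that back" idea discussed above, assuming a listing captured earlier with rsync --list-only. The input/output file names and the exclude pattern are hypothetical, not a vetted filter for the opensuse mirror; the point is only the shape of the prune-then-sync step.]

    # Sketch: turn a saved upstream listing into an exclude file that a later
    # "rsync --exclude-from=exclude.txt ..." run could consume.
    import re

    # e.g. produced earlier with: rsync --list-only <source>/ > upstream-listing.txt
    SKIP = re.compile(r'-debuginfo-|-debugsource-|\.src\.rpm$')  # hypothetical filter

    with open('upstream-listing.txt') as listing, open('exclude.txt', 'w') as out:
        for line in listing:
            path = line.split()[-1]   # last column of rsync --list-only output
            if SKIP.search(path):
                out.write(path + '\n')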
19:51:45 <clarkb> another option may be to proxy cache tumbleweed instead
19:51:54 <clarkb> we do have a really good cache hit rate on port 8080 of our mirror nodes
19:51:59 <clarkb> it is around 95%
19:52:04 <ianw> yes, proxy caching may be the way forward
19:52:23 <ianw> it's another (open)AFS issue to be aware of when setting things up
19:53:00 <ianw> unfortunately, it seems most people find out about these limits like us, when everything stops working :)
19:53:11 <clarkb> I'd be comfortable with that change personally
19:54:08 <fungi> proxy caching's down side is that it doesn't help us if the site backing it goes down
19:54:29 <fungi> so i wouldn't want to rely on it for some of our more heavily used data sets like pypi or ubuntu/centos
19:54:32 <clarkb> correct, but it is also tumbleweed, I think we expect a reasonable amount of volatility there in return for testing all the new stuff
19:54:45 <clarkb> fungi: ++
19:55:08 <fungi> that said, "your mirrors can't be copied to afs" is likely something suse would consider worth solving for?
19:55:50 <clarkb> maybe the process forward here is to communicate the problem to suse more directly, ask if they are interested in addressing it. Then if that doesn't happen switch to caching instead
19:55:50 <fungi> i wouldn't be surprised if a lot of (particularly university) networks keep distro package mirrors on afs
19:56:09 <ianw> i'm guessing debian "solved" this 25+ years ago with directory names in the pool because probably filesystems of the day didn't like huge numbers of files in a single directory too
19:56:10 <clarkb> (like a bug in their bugzilla)
19:56:23 <fungi> ianw: right
19:57:53 <ianw> clarkb: i'm happy to send a bug. changing things to proxy mirrors, or rsync magic, etc, i'm happy to review and work with anyone who wants to drive it. i'll send a follow-up mail with some of our discussion from here
19:58:09 <clarkb> ianw: sounds good, thanks
19:59:00 <clarkb> and we are basically at time. Find us in #openstack-infra if anything else comes up
19:59:03 <clarkb> thank you everyone
19:59:05 <clarkb> #endmeeting