19:01:37 #startmeeting infra
19:01:38 Meeting started Tue Dec 11 19:01:37 2018 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:39 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:41 The meeting name has been set to 'infra'
19:01:51 Hi! o/
19:02:17 #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:02:33 my phone reminded me to send the agenda out 24 hours in advance
19:02:42 thanks again for curating the agenda and announcing it in advance!
19:02:42 hopefully that will become useful as I do it regularly
19:02:46 yay phones
19:03:17 #topic Announcements
19:04:07 Looking at a calendar Christmas and New Years fall on Tuesdays. I expect that means for many of us that the 18th will be our last meeting of the year and January 8th will be the first of 2019
19:04:31 I've gone ahead and written down that I don't expect we'll have meetings on December 25 or January 1.
19:05:01 those seem pretty likely
19:05:20 i could be around for at least one of those, but best if i'm not
19:05:29 I'll be traveling on the 20th to 28th myself so don't expect me around much for those two tuesdays :)
19:06:17 tl;dr is everyone should enjoy their time off if they get it
19:06:23 #topic Actions from last meeting
19:06:30 #link http://eavesdrop.openstack.org/meetings/infra/2018/infra.2018-12-04-19.01.txt minutes from last meeting
19:07:06 #link https://review.openstack.org/622624 and parent/child changes provide initial opendev.org website content
19:07:22 I think I've got that to a point where we can hopefully publish something
19:07:43 html and css are not something you would seek me out for so feedback on that or followups to make it better much appreciated
19:08:19 #link https://review.openstack.org/#/q/topic:inner-ara-results ianw and dmsimard still looking for review on nested ara for system-config ci jobs
19:08:39 #link https://review.openstack.org/#/q/status:open+topic:fedora29 ianw looking for fedora29 (and associated networkmanager support in glean) change review
19:08:51 these two items were sort of an implied please review these changes action for the group
19:08:52 please ... some interesting things are getting stuck behind it now
19:09:35 I've gone through the stacks and they seem ready. Mostly just need sanity check from another infra-root (but non infra-root reviews/input also appreciated)
19:09:59 #topic Specs
19:10:13 #link https://review.openstack.org/623033 OpenDev Repo Hosting Rework Spec
19:10:20 #link https://review.openstack.org/607377 Storyboard Attachments
19:10:28 #link https://review.openstack.org/581214 Anomaly Detection in CI Logs
19:10:59 these are three infra specs that I expect will end up driving a bunch of work in the new year.
As things wind down this year please take a look at them and provide input too
19:11:10 I don't think any are ready for approval just yet, but eyeballs appreciated
19:11:37 corvus: any idea if you expect the opendev spec to be ready for approval soon?
19:12:09 the storyboard attachments spec is close.
19:12:36 clarkb: yes...
19:12:43 * diablo_rojo_phon sneaks in at mention of storyboard from KubeCon
19:13:04 diablo_rojo_phon should mention storyboard at KubeCon
19:13:16 clarkb: mordred said to me "i bet we can run gitea with cephfs in kubernetes" and i've gone down a rabbit hole with that
19:13:25 diablo_rojo_phon: mostly just mentioning that the attachments spec appears near ready. Next week will be the last infra meeting so maybe we put it up for approval at that meeting then approve first week of 2019?
19:13:59 clarkb: i'm trying to see if i can drop in a fully-formed suggestion on how we could run gitea in the next revision. if i fail, i'll just leave it hand-wavey on how we'll implement it.
19:14:07 corvus: ok
19:14:09 it's certainly runnable in the way we currently run things, so it's not a big deal.
19:14:12 clarkb: I'll try to get jhesketh's comments addressed before next meeting so we can get it merged hopefully before then
19:14:42 #topic Priority Efforts
19:14:47 corvus: is gitea/other critical for moving forward on gerrit? what stops us from sticking with cgit until we get that sorted?
19:14:54 #undo
19:14:55 Removing item from minutes: #topic Priority Efforts
19:14:59 sorry!
19:15:05 fungi: I think it's mostly people being allergic to the cgit web ui
19:15:10 we can also probably just cover it in the review
19:15:14 fungi: i'd like to know the whole canonical url story before we do that move
19:15:24 and ya gitea (or other tools) fit into ^
19:15:38 good point, though we _can_ make cgit coexist with clone urls from smart backend too
19:15:40 fungi: like, "okay, everyone change to git.opendev.org/openstack/nova".
"okay, now everyone change again to opendev.org/openstack/nova"
19:15:43 that would be bad :)
19:16:23 #topic Priority Efforts
19:16:29 #topic Storyboard
19:16:30 it's possible that actual implementation is still separable, but i'd like to know the end state to be certain about that.
19:17:02 (sorry for moving things along but I think we have a fairly full agenda and those details will likely come out of running it in test and spec review/updates, I do like the idea of knowing more about what it looks like though)
19:17:25 diablo_rojo_phon: is there a database migration etherpad for storyboard yet? I think I was supposed to dig one of the examples for gerrit?
19:17:28 eumel8 is getting started on webclient translations
19:17:34 #link https://review.openstack.org/623508
19:17:56 diablo_rojo_phon: if you want to create even a stub etherpad url then fungi or myself can start adding info there that will probably help
19:17:58 i posted a puppet change to get started on the db stuff last week, yeah. needs another reviewer
19:18:20 i doubt we need a more detailed migration plan, after i dug into it
19:18:40 #link https://review.openstack.org/623290 Run a local MySQL service on StoryBoard servers
19:19:16 fungi: meaning you think we can forgo the etherpad? are you planning on driving that migration then? (I mostly saw it as a way of communicating to $root what needed to be done)
19:19:40 honestly we just need to try it on the -dev server first, but outage while the db is dumped and loaded (it'll be under a minute) and then changing the database hostname in hiera
19:19:51 gotcha
19:19:52 and then there's a followup change to fix backups afterward
19:20:20 it dawned on me that this is already how we do local and ci testing of storyboard
19:20:34 the puppet module defaults to this particular deployment model already
19:20:50 sounds like we just need an infra root volunteer then?
19:20:52 we've been overriding it in system-config so that we could do trove
19:21:04 (test on -dev, then schedule short outage on production and do the switch there)
19:21:23 i volunteer to do the rest of it unless someone else is interested, and then happy to help guide as needed
19:21:36 great and thank you
19:21:57 one thing that may end up being related is storyboard servers are running trusty. Perhaps this is an opportunity to upgrade to xenial (if we have to take an outage anyway)
19:22:05 a great idea
19:22:28 i think we already test against xenial too (for what testing sb does have anyway), but will double-check that
19:22:47 #action fungi migrate to on host db server on storyboard instances unless someone else is interested
19:23:12 confirmed, we puppet test its deployment on xenial already
19:23:34 #topic Update Config Management
19:24:02 We seem to have slowed down a bit around this with the other fires going on (slow gates, zuul updates to try and be more fair around that etc)
19:24:18 mordred: Any progress on a script to generate a static config dynamically?
19:24:45 i was talking to smarcet in #-infra earlier today about php versions and we concluded that this is likely a good time to upgrade the openstackid servers to xenial
19:25:13 corvus did end up deploying a new zuul executor with the static inventory in place. One thing that we learned from that is we don't get the post puppet reboot automatically anymore
19:25:27 which for servers that use HWE kernels and/or AFS this may be required
19:25:29 clarkb: i saw a review for that
19:25:37 (though he needs the php version from bionic and is likely going to end up using a semi-official backport ppa of that on xenial for now)
19:25:50 corvus: other than that smallish hiccup it seems that working with static inventory isn't a big pain?
19:25:59 clarkb: no, pretty simple
19:26:08 i don't think the lack of a tool is a big problem right now;
19:26:18 clarkb: https://review.openstack.org/#/c/622964/ -- i had on my todo to run it before voting actually
19:26:24 i think a tiny change to launch-node to output the inventory snippet would be nice.
19:26:51 or that
19:26:51 #link https://review.openstack.org/#/c/622964/ dynamic generation of static inventory
19:27:05 yeah, we were chatting about that. soonish we're probably going to want to output a dns zone update snippet too so having that fits
19:27:08 i think both would be nice.
19:27:17 corvus: ++
19:27:25 output the snippet for a quick delta; script for full reconciliation.
19:27:57 our ansible and puppetry do seem to be a fair bit more stable now too
19:28:30 anybody know what our round-trip time is on a full run these days? still in the 30-45min range?
19:29:06 http://grafana.openstack.org/d/qzQ_v2oiz/bridge-runtime?orgId=1 says just under 25 minutes? that seems low
19:29:18 oh except now that we don't have to do all the jinja2 expansion that may be accurate
19:29:37 oh! i missed we were statsd'ing that
19:29:43 fungi: ianw did the work on that
19:29:56 that's right, now i remember. thanks again ianw!~
19:30:01 most awesome
19:30:13 any progress on docker things?
19:30:15 ianw: ^
19:30:25 in a word ... no
19:30:35 mordred updated the patch
19:30:39 no progress is good progress? ;)
19:30:47 corvus: ah ok, I'll have to rereview it then
19:31:17 i think the update addressed the review comments about testing iptables
19:31:38 it's now failing tests: https://review.openstack.org/605585
19:31:38 seems to fail testing but we can dig into that
19:32:03 #link https://review.openstack.org/#/c/605585 docker usage prep change. Fails tests currently but we should fix those tests and get this in when ready
19:32:31 Anything else related to config management updates before we move into the general topics?
19:32:45 fungi: you did end up helping cmurphy with the kata lists puppet update?
19:32:55 if that happened we can likely continue rolling up the puppet4 parser list
19:33:14 yes, it merged, nothing changed, we're good there
19:33:36 cool I'll have to take a look and see if I can help get any more of the future parser changes in
19:33:42 more evidence of cmurphy's awesomeness
19:34:00 #topic General Topics
19:34:23 Starting with OpenDev just a reminder to check out corvus' spec around git hosting for opendev and my website content change that were noted earlier in the meeting
19:34:45 Also considering this has become a standing topic and is a big effort I'd like to make this a priority effort (I think this was mentioned before)
19:35:16 I'll push up changes for that later today, but if you have objections to doing that feel free to let me/us know here or PM/email me
19:35:33 corvus: I figure we can use your spec for that data recording on the specs repo side?
19:35:56 clarkb: i'm having trouble parsing that
19:35:57 (since it gives a concrete set of work to prioritize)
19:36:10 clarkb: use my spec for indicating opendev is priority effort? yes
19:36:23 corvus: we list priority efforts as being attached to specs. Rather than write a nebulous opendev spec I like the idea of attaching priority to concrete work items (as in your spec)
19:36:31 great
19:37:12 i rather think opendev is what we're doing. specs related to that are priority efforts on their own
19:37:27 and corvus's gerrit/git spec already proposes itself as such
19:37:35 yup
19:37:54 o/
19:38:24 Next on the list of general topics was I wanted to remind people that upgrading trusty servers is still valuable.
I know we've got a few irons in the fire for this between upgrading to xenial on puppet, running ansible only servers as with dns servers, and switching some services to docker
19:38:30 this could use a recheck when an infra-root has time to babysit https://review.openstack.org/615656
19:38:36 cmurphy: noted thanks
19:39:00 #link https://etherpad.openstack.org/p/201808-infra-server-upgrades-and-cleanup has a list of servers that need attention if you can grab one or two please update that list
19:39:10 clarkb: thanks. i think as mentioned earlier in the meeting the sb and openstackid servers are good next candidates
19:39:11 I'm hoping to get to pbx this week
19:39:15 fungi: yup
19:39:29 so maybe those drop off the list rsn
19:39:58 We have until roughly april to get off trusty. Xenial will eol in 2021 so even Xenial gives us a bit of breathing room
19:40:06 (then bionic will have 10 years of support)
19:41:10 The last item on the general topics list has to do with how we use github. Specifically admin account setup
19:41:28 our documents have long said we should use a secondary admin only github account, but I don't believe we ever enforced that
19:41:56 Additionally gentoo found (the hard way) that two factor auth with github reduces its ability to be a shiny target for not so nice people
19:42:05 #link https://review.openstack.org/#/c/620702/
19:42:13 #link https://review.openstack.org/#/c/620703/
19:42:53 I proposed two changes to kick off some discussion on this. Basically would we be opposed to requiring github 2fa on those accounts and should we enforce that we set up a second admin account or relax that rule given practice?
19:43:03 two-factor auth saved gentoo from github compromise? or they merely speculate that _if_ they'd required it they wouldn't have had their accounts hacked?
19:43:15 fungi: the speculation is that 2fa would have prevented the compromise
19:43:22 and they now require it for all of their github accounts
19:43:35 i assumed so, your earlier point merely seemed to imply the other
19:44:27 I've run with a hardware token second factor auth on github for about a year now and haven't found it to be particularly troublesome. It helps that much of the day to day interaction with github is via ssh (which doesn't 2fa)
19:44:51 they support totp which you can run generators for on android or via linux command line as well
19:45:13 related to this, while the hope under new opendevery is that we can stop caring about needing github credentials for openstack and similar orgs, we do still apparently need them to manage the zuul app/widget/thingy
19:45:28 personally i barely use it and would be happy with a shared account
19:45:41 ya we might manage fewer repo mirrors, but would still have things like the zuul app install in github
19:45:41 but, if the key for the 2fa is with the password, it's not much of a 2fa i guess
19:46:28 ianw: that's a good point. I think ssh is limited in the scope of what it can do, but still dangerous :)
19:47:22 ssh is arguably 2 factor. it's at least 1.5. depends on how you're characterizing threat vectors (turns out "factors" aren't really one-size-fits-all)
19:47:35 i should say, ssh with a password-protected key
19:47:51 (which should go without saying, but...)
19:48:11 anyway, that's beside the point :)
19:48:29 if we consider ssh 2fa as commutative to github 2fa then i'd probably vote for the shared account with a setup on bridge to get the totp token
19:48:47 ianw: oh that's another possibility too. Which is that we use a shared account
19:49:27 if someone compromises bridge we've got bigger problems on our hands than github creds anyway
19:49:28 possibilities: Keep using personal account, use new second personal admin only accounts, use shared admin account.
* apply 2fa
19:49:51 if we want 2fa on the github accounts, i'll definitely need a second github account.
19:49:58 it's just github does a pretty bad job at separating things, i think. which means if you have openstack in your account it pollutes all your notifications etc if you use it for "real" work
19:50:14 ianw: yup, so a second account may be desirable anyway
19:50:45 i can take an action item to make a shared account, and see if there's a practical way to have the token working on bridge. if it all looks good, we can then give it permission
19:51:17 ianw: do you have a preference for shared account over second personal account? I guess organizationally it's less for each individual to manage
19:51:35 i'm happy to drop all privs for my personal gh account i use to submit the occasional pr or file an issue on some random project, and use a shared account on bridge.o.o (perhaps with a 2fa tool installed there)
19:51:52 #action ianw investigate practicality of shared github account (with possible 2fa)
19:52:08 i'm thinking if i'm setting up a second personal account, it gets confusing as to which one's me, and then i also have to manage two github secrets etc
19:52:17 ianw: ya
19:52:22 also, worth noting, we've been generally sloppy about remembering to add infra-root folk to the orgs we have in gh, or remove the ones who retire
19:52:34 considering how much i use it (never) shared account seems better imo
19:52:39 fungi: that's a good point too. I like the idea of shared creds the more we talk about it
19:52:54 this seems like a great next step, thank you ianw for volunteering to check it out
19:53:34 and with that I'll open the floor
19:53:42 #topic Open Discussion
19:55:34 corvus: I saw these at powells the other day https://www.powells.com/book/-659549220242
19:56:11 if we're left with book shopping things then I should probably end the meeting
19:56:26 ha nice!
19:56:36 Thank you everyone. Reminder we'll meet next week then take a two week break for holidays.
19:56:56 Find us in the infra channel or on the infra mailing list if you have other questions, concerns, thoughts, issues, etc
19:56:59 #endmeeting
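
Editor's note on the 2fa discussion (19:44:51 and 19:48:29): the log mentions running a totp generator from the linux command line so a shared account's token could live on bridge. The sketch below is only an illustration of how such a generator works in general, following the published TOTP and HOTP algorithms (RFC 6238 and RFC 4226). The function names are hypothetical, and nothing in the meeting says this is how the action item would be implemented or where the secret would be stored.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over the 8-byte big-endian counter."""
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32: str, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP with the counter derived from Unix time."""
    # Authenticator secrets are usually shown as base32, often without padding.
    padded = secret_b32 + "=" * (-len(secret_b32) % 8)
    key = base64.b32decode(padded, casefold=True)
    timestamp = time.time() if now is None else now
    return hotp(key, int(timestamp) // step, digits)
```

Calling totp() with the base32 secret shown during 2fa enrollment returns the current six-digit code. As noted at 19:45:41, keeping that secret next to the account password weakens the second factor, so where it is stored matters more than the generator itself.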