19:01:13 #startmeeting infra
19:01:14 Meeting started Tue Oct 1 19:01:13 2019 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:15 ehlo
19:01:16 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:18 The meeting name has been set to 'infra'
19:01:26 #link http://lists.openstack.org/pipermail/openstack-infra/2019-September/006488.html Our Agenda
19:01:38 #topic Announcements
19:02:25 Nothing major to announce. I'll be out thursday, but otherwise I'm back to business as usual after traveling last week
19:02:49 mnaser launched a zuul service on vexxhost at ansiblefest last week. That might be worth announcing :)
19:03:27 #topic Actions from last meeting
19:03:38 #link http://eavesdrop.openstack.org/meetings/infra/2019/infra.2019-09-17-19.01.txt minutes from last meeting
19:04:01 We don't have any actions there. It's been a while since the last meeting, and half of us have been traveling and focused on other things for a bit too
19:04:25 I do see that ianw seems to have written the spec for replacing the static server
19:04:38 which wasn't called out as a specific action, but it's worth noting and we'll get to it a bit later in the agenda
19:05:08 #topic Priority Efforts
19:05:19 #topic OpenDev
19:06:07 I poked around gitea a bit more after some guidance from upstream and found that all git process forks in gitea should be timed out already
19:06:35 that, coupled with the lack of gitea issues we've seen recently, makes me wonder if perhaps the intermittent disk problems we were having may have been an underlying cause of the slowness?
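[editor's note] A hedged sketch of the kind of response-time spot check discussed here. The `is_slow` helper and the 5-second threshold are illustrative assumptions, not an agreed monitoring setup; a real timing would come from something like the `wget`/`curl` benchmarks quoted later in the meeting.

```shell
# Illustrative only: classify a response time (in seconds) as slow.
# A value would typically come from something like:
#   curl -so /dev/null -w '%{time_total}' https://opendev.org/openstack/nova
# The 5-second threshold is an assumed value, not a project standard.
is_slow() {
    awk -v t="$1" 'BEGIN { exit !(t > 5.0) }'
}

if is_slow 16.175; then
    echo "16.175s: slow"
else
    echo "16.175s: ok"
fi
```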
19:07:10 I need to review logs again, but it is possible that things are happy as-is and we need to monitor for those disk issues instead
19:07:34 I also found a bug for the "it's slow to browse large repos" issue
19:07:59 upstream seems well aware of it, and there are a couple of avenues being investigated for improving this
19:08:09 I think this topic will likely make good conversation in shanghai
19:08:19 devstack has a periodic job that walks repos looking for plugins which occasionally gets a 500 ... it was then reporting spurious changes to documentation (https://review.opendev.org/#/c/684434/). might be one to monitor
19:09:08 #link https://github.com/go-gitea/gitea/issues/491
19:09:12 is the upstream bug for slowness
19:09:39 `time wget -qO/dev/null https://opendev.org/openstack/nova` reports real 0m16.175s for me, as a benchmark
19:09:57 apropos of nothing in particular
19:10:14 ianw: good to know. I think 500s are what we want (at least rather than spending hours trying to respond to a request, causing requests to back up until we OOM)
19:10:43 that bug in particular has some good insight into why it is slow
19:10:59 it basically scans the entire history for everything
19:11:04 wow, opened almost 3 years ago too
19:11:26 fungi: yup, and still active as of a couple of weeks ago
19:12:39 the last opendev-related item I wanted to bring up is that I haven't yet drafted that email re governance changes
19:12:52 that and ptg stuff are high on my list this week now that things should be going back to normal for me
19:12:59 any other opendev-related topics?
19:13:31 "The previous gitea web ui would take > 3 hours to load" ... that's some dedicated testing :)
19:13:55 was there any pending followup on https://github.com/go-gitea/gitea/issues/8146 then?
19:14:50 fungi: I asked them for help further debugging the slowness, but considering we've been stable (from what I've seen) it's possible that the issue has largely gone away.
But I need to double check logs for OOMs as well as really slow requests
19:15:16 ahh, yeah, okay
19:16:14 #topic Update Config Management
19:16:45 mordred: I know you've been traveling like some of us, but are we in a position now to try deploying review-dev with our docker image?
19:17:07 I expect that we are really close to that, given the changes that have been reviewed and merged for the gerrit docker image building
19:18:29 I'm guessing mordred is either eating dinner or jetlagged or both :)
19:18:52 Anyone have topics to point out here? ianw, perhaps your logrotate changes?
19:19:23 well, the fedora mirror still takes hours and hours to release every time; haven't debugged that further at this point
19:19:58 however, mirror issues turned out to be a red herring for the centos-8 changes i was debugging; it was actually a problem with yum/dnf/???
19:20:17 (https://review.opendev.org/684461)
19:21:32 (tangentially related -- my current focus on zuul-registry is largely driven by a desire to fix reliability issues we've seen with container image builds)
19:22:08 (ie, the more successful we are with things like building/deploying gerrit with docker, the more we'll need this)
19:22:29 ++
19:23:40 #topic Storyboard
19:24:00 so i've got some new stuff here
19:24:07 go for it
19:24:16 there's a semi-recent feature which the zuul community has been taking advantage of so far
19:25:03 you can flag a team in sb as a "security team", and then any projects you associate with that team will automatically grant it access to any stories which get marked as security-related
19:25:25 the openstack vmt is starting to make use of it as well, but this raises a self-service challenge
19:25:38 right now, teams can only be created and members added/removed by an sb admin
19:26:14 and shoving an entire rbac system into sb would be a ton of new work, so...
19:26:32 #link https://review.opendev.org/685778 Record vulnerability management teams used in SB
19:27:09 after some deliberation, the idea of automating group membership and keeping it in yaml is very attractive for a number of reasons
19:27:40 so that change is my proposed schema, before i dive headlong into trying to implement the validation and automation for it
19:28:14 that change lgtm
19:28:30 agreed
19:28:49 i left it open to the potential of using it to define non-security teams as well
19:28:56 fungi: that change could use a README ;)
19:29:10 AJaeger: a great idea, thanks
19:29:26 like stuff one in the top of the storyboard tree?
19:29:46 that would be fine
19:29:58 we could probably stand to stick some readmes in other places in our config repos too
19:30:30 i considered putting this in opendev/project-config, but for now it's probably better off in the same repo as gerrit/projects.yaml, to avoid people having to make correlated changes in too many repos
19:30:35 fungi: and/or update the top-level README
19:30:43 ahh, yeah, i can totally do that
19:31:36 i figure this would move and/or get split out at the same time the gerrit configuration does
19:33:23 Sounds like that is it for storyboard?
19:34:07 #topic General Topics
19:34:18 #link https://etherpad.openstack.org/p/201808-infra-server-upgrades-and-cleanup
19:34:24 fungi: any progress with the wiki openid stuff?
19:34:51 next phase on the wiki replacement is, i think, to refresh the current production database and images directory, then work on matching the configuration. i haven't done those things yet, though
19:35:23 The other item in this is the static.o.o replacement/update/transition
19:35:26 ianw wrote a spec
19:35:36 #link https://review.opendev.org/683852 Review spec
19:35:58 AJaeger: thank you for taking a first-pass review
19:36:03 after I made the first changes, ianw disagreed and so wrote something up so that we're all on the same page.
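[editor's note] A minimal sketch of what validating a yaml-defined team list might look like. The `teams`/`name`/`members` keys here are hypothetical illustrations; the actual schema is whatever review 685778 proposes.

```python
# Illustrative sketch only: the real schema is defined in review 685778;
# the "teams"/"name"/"members" keys used here are hypothetical.
def validate_teams(data):
    """Return a list of validation errors for a parsed teams file."""
    errors = []
    teams = data.get("teams")
    if not isinstance(teams, list):
        return ["top-level 'teams' must be a list"]
    for i, team in enumerate(teams):
        if not isinstance(team.get("name"), str) or not team["name"]:
            errors.append(f"team {i}: 'name' must be a non-empty string")
        members = team.get("members")
        if not isinstance(members, list) or not members:
            errors.append(f"team {i}: 'members' must be a non-empty list")
    return errors

# Example: one well-formed team, one missing its member list.
sample = {"teams": [
    {"name": "zuul-security", "members": ["alice", "bob"]},
    {"name": "example-vmt"},
]}
print(validate_teams(sample))  # → ["team 1: 'members' must be a non-empty list"]
```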
19:36:14 corvus: ^ that may interest you, as I think you suggested it in a meeting a couple of weeks ago
19:36:16 the spec looks good to me, but has two open questions
19:36:38 so, my ask: review the spec and let's discuss the open questions
19:36:50 yeah, we had discussed parts of this across a few meetings, so this pulls it into one place
19:36:50 ++ I think it would also be good if we can get agreement on this pre summit/ptg
19:36:56 then we can bring it up with teams there as necessary
19:38:14 looks good -- my interest is mostly in getting this in front of the tc so they know about the needs
19:39:07 we should be able to merge the spec reasonably quickly (next week?), then we can point the TC to the published version? or do we think we need their input in review too?
19:39:49 in terms of it being mechanical (moving things to volumes, etc) i don't think so
19:39:50 the earlier the better, i'd say
19:39:52 do we want to figure out the two open questions ourselves? or do we need input on those as well?
19:40:01 i expect as long as the end result is sites published in the form they are now, all the backend and automation is mostly implementation detail they won't care about
19:40:15 ... but if there are policy questions around how much of this is affected by opendev.org in the future, etc, that may be different
19:40:44 but yeah, it's still good to get them to say that one way or the other
19:41:14 I can bring it up in office hours later today (local time)
19:41:35 to be clear, i want to make sure that the openstack project is aware of the operational cost of this and that it's an opportunity to contribute to the commons.
19:42:10 corvus: yup, that was why I figured the published version might be easiest, but early feedback is always nice
19:42:15 (most of these sites were set up so that we could retire the wiki; instead we're now maintaining the wiki and these sites)
19:44:04 That takes us to the last agenda item.
PTG Planning
19:44:09 the ssl question is a very interesting one
19:44:17 are there particular tasks in the spec which would be good to call out as in need of additional assistance, or is it more a general point that we'd welcome assistance in maintaining all of it?
19:44:38 i still have no interest in dealing with a manual dns system
19:45:05 corvus: I believe ianw automated the dns changes required for rax, so it isn't manual
19:45:09 but it does require using their one-off api
19:45:10 clarkb: not the point
19:45:16 that's the point
19:45:50 unless we can come up with some way to manage openstack.org dns collaboratively with the foundation through code review, i want to stay away from it
19:46:12 yeah, we've got a dns management solution built on revision control, ci and all free software. the project wanting to use the same domain as the foundation is the challenge presently
19:46:36 corvus: in this case at least most of the changes are to a different domain (or can be)
19:46:47 you do need the glue-type records in the parent domain that point to the acme hosting domain
19:46:52 (not a perfect solution, for sure)
19:47:17 i just don't think that opendev should be managing openstack.org stuff if we can't do it using the system we've constructed
19:48:16 so maybe what we can offer is some more neutrally branded hosting?
19:48:27 along the lines of tarballs.opendev.org and docs.opendev.org
19:48:30 fungi: re tasks which people can pick up, I believe the updates to the publishing jobs in particular should be doable by anyone who has edited zuul jobs before, so that could be a good place to start
19:48:42 clarkb: yeah, that makes sense
19:49:27 and maybe this is a bigger question than i originally thought :/
19:49:40 i think it's a great question to pose to them
19:50:41 clarkb: I pushed the first three changes for jobs; that will help others to do the rest...
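[editor's note] On the "glue-type records" point above: the delegation being described usually looks something like the zone fragment below. The record names are illustrative assumptions, not actual opendev or openstack.org zone contents.

```
; Illustrative zone fragment (hypothetical names, not real zone data).
; In the manually managed parent zone, a one-time CNAME hands the
; ACME DNS-01 challenge name off to a zone managed via code review:
_acme-challenge.static.openstack.org. IN CNAME acme.opendev.org.
; Certificate renewals then only ever touch records under the
; delegated zone, so the parent zone never needs routine edits.
```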
19:50:48 but yes, suggesting that all openstack documentation and similar online content be redirected to and served from a different domain is a significant change, so i would expect a lot of concern, at least initially
19:50:51 AJaeger: thanks!
19:51:01 clarkb: https://review.opendev.org/#/q/status:open+project:openstack/project-config+branch:master+topic:static-services
19:51:47 corvus: maybe we can write up those concerns as comments on the change, and that can be a piece the TC offers feedback on?
19:52:20 clarkb: (on publishing jobs) yep, although if we do the per-project volumes we do need to create them manually, as there's no facility to do that
19:52:52 I guess that is really what i was calling out as "Perhaps there are plans afoot for OpenDev-ing some of these bits?
19:52:53 Is there some sort of succession planning?"
19:53:17 i personally think redirecting governance.openstack.org to docs.opendev.org/openstack/governance would be fine, and i even like the idea of maybe finally being able to serve all those disparate openstack static content sites from a single logical tree
19:54:19 it's possible we may also want to do this in stages too
19:54:39 convert to the afs-hosted thing for what we currently have, using the existing certs
19:54:48 ahh, files02 ... i'm not sure i considered this host
19:54:53 and decouple opendevifying this from the switch to afs
19:55:16 is that deployed via puppet?
19:55:22 yes
19:55:27 there is a files.pp iirc
19:56:19 oh wait, files02.openstack.org == files.opendev.org == files.openstack.org
19:56:47 it's unclear to me what in particular about the static files webserver would benefit from containerization, but configuring it with ansible instead of puppet still might
19:56:57 i mean docs.opendev.org ... there's only one "files" server though, right?
19:57:07 ianw: ya, there is only one files server
19:57:13 02 was a replacement for 01, not running side by side
19:57:20 and no openstack vs opendev split, as far as I know
19:57:23 we can load-balance it if we want
19:57:38 quick time check: we only have a few minutes left
19:57:51 ok, right, got it :) so yeah, the idea of starting an ansible-only host to migrate these things to separately is still in play
19:57:59 I wanted to mention that there is a PTG schedule proposed starting at this thread #link http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009733.html Proposed schedule thread
19:58:18 #link https://etherpad.openstack.org/p/OpenDev-Shanghai-PTG-2019
19:58:25 I'm going to start adding content to that this week, I hope
19:58:40 also as a heads up, I don't actually have a visa yet and am told I won't know more about that until mid-October :/
19:59:22 ;(
19:59:42 And we are just about at time now.
19:59:57 Thanks everyone. We can continue the afsification discussion on that review or in #openstack-infra
20:00:08 #endmeeting