Tuesday, 2020-04-28

07:59 *** diablo_rojo_phon has joined #opendev-meeting
09:00 -openstackstatus- NOTICE: Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved.
09:00 *** ChanServ changes topic to "Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved."
09:11 -openstackstatus- NOTICE: Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved.
09:11 *** ChanServ changes topic to "Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved."
12:23 *** ChanServ changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev"
12:23 -openstackstatus- NOTICE: Zuul has been restarted, all events are lost, recheck or re-approve any changes submitted since 9:50 UTC.
16:41 *** diablo_rojo has joined #opendev-meeting
18:59 <clarkb> anyone else here for the meeting? we will get started shortly
19:00 <ianw> o/
19:01 <clarkb> #startmeeting infra
19:01 <openstack> Meeting started Tue Apr 28 19:01:11 2020 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01 *** openstack changes topic to " (Meeting topic: infra)"
19:01 <openstack> The meeting name has been set to 'infra'
19:01 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2020-April/000011.html Our Agenda
19:01 <clarkb> #topic Announcements
19:01 *** openstack changes topic to "Announcements (Meeting topic: infra)"
19:02 <clarkb> For the OpenDev Service Coordinator I said we would wait for volunteers until the end of the month. That gives you a few more days if interested :)
19:02 <mordred> o/
19:02 <fungi> how many people have volunteered so far?
19:02 <clarkb> fungi: I think only me with my informal "I'm willing to do it" portion of the message
19:03 <clarkb> I figured I would send a separate email thursday if no one else did first :)
19:03 <fungi> thanks!
19:03 <diablo_rojo_phon> o/
19:04 <clarkb> #topic Actions from last meeting
19:04 *** openstack changes topic to "Actions from last meeting (Meeting topic: infra)"
19:04 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-21-19.01.txt minutes from last meeting
19:04 <clarkb> There were no actions recorded last meeting. Why don't we dive right into all the fun ansible docker puppet things
19:04 <clarkb> #topic Priority Efforts
19:04 *** openstack changes topic to "Priority Efforts (Meeting topic: infra)"
19:04 <fungi> why don't we?
19:04 <clarkb> #topic Update Config Management
19:04 *** openstack changes topic to "Update Config Management (Meeting topic: infra)"
19:04 <fungi> oh, i see, that was a rhetorical question! ;)
19:04 <clarkb> fungi: :)
19:05 <clarkb> mordred: first up on this agenda we've got dockerization of gerrit. Are we happy with how that has gone and can we remove that agenda item now?
19:05 <clarkb> mordred: I think all of the outstanding issues I knew about were addressed
19:05 <fungi> we do seem to be caught back up since friday's unanticipated excitements
19:06 <fungi> and now we're much further along with things too
19:06 <corvus> so i guess that's done and we could start thinking about the upgrade to 2.16?
19:06 <mordred> WELLL
19:06 <mordred> there's still a cleanup task
19:06 <mordred> which is gerritbot
19:06 <mordred> I don't want to forget about it
19:07 <fungi> oh, right, tied in with the eavesdrop dockering
19:07 <mordred> but we've got eavesdrop split into a playbook and are containering accessbot on eavesdrop now
19:07 <mordred> as well as a gerritbot container patch up
19:07 <mordred> oh - we landed that
19:07 <mordred> cool - I wanna get that bit finished
19:08 <mordred> but then, yes, I agree with corvus - next step is working on the 2.16 upgrade
19:08 <clarkb> sounds like we are very close
19:08 <mordred> yeah. I think we can remove the agenda item - gerritbot is more just normal work
19:09 <fungi> though the gerrit upgrade becomes an agenda item
19:09 <clarkb> Next up is the Zuul driven playbooks. Last Friday we dove head first into running all of zuul from ansible and most of zuul with containers
19:10 <mordred> yeah - zuul-executor is still installed via pip - because we haven't figured out docker+bubblewrap+afs yet
19:10 <clarkb> #link https://review.opendev.org/#/c/724115/ Fix for zuul-scheduler ansible role
19:10 <mordred> everything else is in containers - which means everything else is now running python 3.7
19:10 <clarkb> I found a recent issue with that which could use some eyeballs (looks like testing failed)
19:10 <fungi> we sort of didn't anticipate that merging the change would go through and change uids/gids on servers, which turned it into a bit of a fire drill, but the outcome is marvellous
19:11 <clarkb> related to this is work to do the same with nodepool services
19:11 <mordred> clarkb: oh - that's the reason I didn't have the value set in the host_vars
19:11 <clarkb> mordred: does ansible load from the production value and override testing?
19:13 <clarkb> for nodepool services I think we are sorting out multiarch container builds as we have x86 and arm64 nodepool builders
19:13 <mordred> clarkb: I think so
19:13 <mordred> we're VERY close to having multiarch working
19:14 <mordred> https://review.opendev.org/#/c/722339/
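Multi-arch image builds of the kind discussed here are typically done with docker buildx plus QEMU emulation, so one x86 builder can emit both amd64 and arm64 images under a single manifest. The following is only a generic sketch under that assumption; the image and registry names are placeholders, not the actual jobs from the review linked above.

```shell
# Register binfmt handlers so arm64 binaries can run under emulation
# on an x86 host (required for cross-building without native arm64)
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Create and select a builder instance capable of multi-platform output
docker buildx create --name multiarch --use

# Build one manifest list covering both architectures and push it;
# registry/image names here are illustrative only
docker buildx build \
    --platform linux/amd64,linux/arm64 \
    --tag registry.example.org/opendev/nodepool-builder:latest \
    --push .
```

Pushing a manifest list means consumers on either architecture pull the same tag and docker resolves the right image, which is what lets x86 and arm64 nodepool builders share one image name.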
19:14 <clarkb> then once that is done and deployed I'd like to land a rebase of https://review.opendev.org/#/c/722394/ to make it easier to work with all of our new jobs in system-config
19:14 <corvus> yes please :)
19:14 <mordred> ++
19:15 <mordred> https://review.opendev.org/#/c/724079/ <-- we need this to work with the multi-arch job on an ipv6-only cloud
19:16 <ianw> ^ that has an unknown configuration error?
19:18 <mordred> ianw: AROO
19:19 <corvus> maybe that's the long-standing bug we've been trying to track down
19:19 <corvus> i can take a look after mtg
19:20 <clarkb> The other thing I wanted to call out is that we are learning quite a bit about using Zuul for CD
19:21 <ianw> apropos the container services, i don't see any reason not to replace nb01 & nb02 with container based versions now?
19:21 <clarkb> for example I think we've decided that periodic jobs should pull latest git repo state rather than rely on what zuul provided
19:21 <fungi> i like to think we're improving zuul's viability for cd
19:21 <clarkb> ianw: ++
19:21 <clarkb> if you notice irregularities in playbook application to production please call it out. Because as fungi points out I think we are improving things by learning here :)
19:21 <ianw> i can do that and move builds there
19:21 <ianw> (new builders, i mean)
19:22 <fungi> at the very least we're becoming an early case study on complex cd with zuul
19:22 <fungi> (spoiler, it's working remarkably well!)
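The "periodic jobs should pull latest git repo state" decision above can be illustrated with a small self-contained sketch (the repos and paths here are made up for demonstration, not the actual opendev playbooks): a checkout prepared when a job was enqueued can lag behind what has merged since, so a periodic deploy refreshes it before acting.

```shell
# Illustration of why a periodic deploy job refreshes its repo itself:
# the checkout prepared at enqueue time can be stale by run time.
workdir=$(mktemp -d)

# "origin": the canonical repo, which keeps moving
git init -q "$workdir/origin"
cd "$workdir/origin"
git config user.email ci@example.com
git config user.name ci
echo v1 > state
git add state
git commit -qm v1

# "deploy": a checkout made earlier, like one enqueued hours ago
git clone -q "$workdir/origin" "$workdir/deploy"

# meanwhile the canonical branch moves on
echo v2 > state
git commit -qam v2

# the periodic job refreshes to current state before deploying
cd "$workdir/deploy"
git fetch -q origin
git reset -q --hard origin/HEAD
cat state    # prints v2: the current state, not the stale v1
```

Resetting against origin/HEAD (rather than a hard-coded branch name) follows whatever the remote's default branch is, which keeps the sketch independent of local git defaults.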
19:23 <clarkb> anything else on the subject of config management and continuous deployment?
19:23 <mordred> oh - I'm testing focal for zuul-executors
19:23 <mordred> https://review.opendev.org/#/c/723528/
19:23 <fungi> it's due to be officially released next week, right?
19:24 <mordred> once that's working - I wanna start replacing ze*.openstack.org with ze*.opendev.org on focal
19:24 <mordred> it's already released
19:24 <fungi> oh!
19:24 <fungi> i'm clearly behind on the news
19:24 <fungi> time has become more of an illusion than usual
19:25 <ianw> hrm, so for new builders, such as nb0X ... focal too?
19:25 <fungi> might as well
19:25 <ianw> s/builders/servers  ?
19:25 <clarkb> my only concern at this point with it is that major services like mysql crash on it
19:26 <ianw> that will add some testing updates into the loop, but that's ok
19:26 <mordred> clarkb: "awesome"
19:26 <clarkb> we should carefully test things are working as we put focal into production
19:26 <mordred> ++
19:26 <fungi> in theory anything we can deploy on bionic we ought to be able to deploy on focal, bugs like what clarkb mentions aside
19:26 <fungi> stuff that's still relying on puppet is obviously frozen in the xenial past
19:27 <ianw> i doubt system-config testing will work on it ATM for testing infrastructure reasons ... i'm working on it
19:27 <clarkb> less and less stuff is relying on puppet though
19:27 <mordred> ianw: we could also go ahead and just do bionic for builders and get that done
19:27 <ianw> particularly the ensure-tox role and it installing as a user
19:27 <mordred> ianw: since those are in containers - I mostly wanted to roll executors out on focal so that we could run all of zuul on the same version of python
19:28 <ianw> ... yes pip-and-virtualenv is involved somehow
19:28 <mordred> ianw: the focal test nodes are actually going ok for the run-service-zuul test job fwiw
19:29 <ianw> mordred: hrm, perhaps it's mostly if we were to have a focal bridge.o.o in the mix in the test suite, where testinfra is run
19:30 <mordred> yeah - for things where we're running our ansible- the lack of pre-installed stuff is good, since we install everything from scratch anyway
19:31 <clarkb> Sounds like that may be it for this topic
19:32 <clarkb> #topic OpenDev
19:32 *** openstack changes topic to "OpenDev (Meeting topic: infra)"
19:32 <clarkb> Another friendly reminder to volunteer for service coordinator if interested
19:32 <clarkb> On a services front we upgraded gitea thursdayish then had reports of failed git clones over the weekend
19:32 <clarkb> "thankfully" it seems that is a network problem and unrelated to our upgrade
19:33 <mordred> \o.
19:33 <clarkb> citycloud kna1 (and kna3?) was losing packets sent to vexxhost sjc1
19:33 <mordred> I mean
19:33 <mordred> \o/
19:33 <clarkb> fungi was able to track that down using our mirror in kna1 to reproduce the user reports
19:33 <clarkb> and from there we did some traceroutes and passed that along to the cloud providers
19:33 <clarkb> something to be aware of if we have more reports of this. Double checking the origin is worthwhile
19:34 <clarkb> Also lists.* has been OOMing daily between 1000 and 1200 UTC
19:34 <fungi> yeah, that one's not been so easy to correlate
19:34 <clarkb> fungi has been running dstat data collection to help debug that and I think the data shows it isn't bup or mailman. During the period of sadness we get many listinfo processes
19:34 <fungi> i think you're on to something with the semrush bot in the logs
19:35 <clarkb> those listinfo processes are started by apache to render webpage stuff for mailman and correlating to logs we have a semrush bot hitting us during every OOM I've checked
19:35 <clarkb> I've manually dropped in a robots.txt file to tell semrushbot to go away
19:35 <fungi> though ultimately, this probably means we should eventually upgrade the lists server to something with a bit more oomph
19:35 <clarkb> I've also noticed a "The Knowledge AI" bot but it doesn't seem to show up when things are sad
19:35 <fungi> or tune apache to not oom the server
19:35 <clarkb> fungi: and maybe even both things :)
19:36 <clarkb> but ya if the robots.txt "fixes" things I think we can encode that in puppet and then look at tuning apache to reduce the number of connections?
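The robots.txt dropped in place isn't shown in the log; a minimal version that singles out SemrushBot while leaving other crawlers alone would look roughly like this (the filename is standard, the exact contents on lists.openstack.org are an assumption):

```shell
# Minimal robots.txt of the kind described above: disallow everything
# for SemrushBot only; other user agents are unaffected.
cat > robots.txt <<'EOF'
User-agent: SemrushBot
Disallow: /
EOF
```

On the apache side, "tuning apache to reduce the number of connections" would likely mean lowering a directive such as MaxRequestWorkers so a crawler burst cannot spawn enough listinfo CGI processes to exhaust memory; that specific directive is a guess at the eventual fix, not something stated in the log.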
19:37 <clarkb> #topic General Topics
19:37 <fungi> i think so, yes
19:37 *** openstack changes topic to "General Topics (Meeting topic: infra)"
19:38 <clarkb> A virtual PTG is planned for the beginning of june
19:38 <clarkb> I've requested these time blocks for us: Monday 1300-1500 UTC, Monday 2300-0100 UTC, Wednesday 0400-0600 UTC
19:38 <clarkb> fungi: have you been tracking what registration and other general "getting involved" requires?
19:39 <fungi> not really
19:39 <fungi> i mean, i understand registration is free
19:40 <clarkb> k I'll try to get more details on that so that anyone interested in participating can do so
19:40 <clarkb> (I expect it will be relatively easy compared to typical PTGs)
19:40 <fungi> but folks are encouraged to register so that 1. organizers can have a better idea of what capacity to plan for, and 2. to help the osf meet legal requirements for things like code of conduct agreement
19:41 <fungi> there's also discussion underway to do requirements gathering for tools
19:41 <fungi> i can find an ml archive link
19:42 <clarkb> thanks!
19:43 <corvus> should we do anything with meetpad?
19:43 <fungi> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014367.html PTG Signup Reminder & PTG Registration
19:44 <fungi> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014481.html Virtual PTG Tooling Requirements & Suggestions
19:44 <clarkb> I think we should keep pushing on meetpad. Last I checked the deployment issues were addressed. Should we plan another round of test calls?
19:44 <corvus> is it generally working now?  does it need more testing?  are there bugs we should look at?  does it meet any of the requirements the foundation has set?
19:45 <fungi> there's an etherpad in that second ml archive link where a list of requirements is being gathered
19:45 <clarkb> I don't know if it's working -> more testing probably a good idea. One outstanding issue is the http to https redirect is still missing iirc
19:45 <corvus> i noticed there was a bunch of 'required' stuff on https://etherpad.opendev.org/p/virt-ptg-requirements
19:45 <corvus> clarkb: ah, yeah forgot about that.  i can add that real quick.
19:46 <fungi> probably one of the harder things to meet there is "Legal/Compliance Approval (e.g. OSF Privacy Policy, OSF Code of Conduct, GDPR)"
19:46 <corvus> i don't even know what that means
19:46 <fungi> but i don't really know, maybe that's easy
19:47 <clarkb> corvus: it might be a good idea to get meetpad to a place we are generally happy with it, then we can put it in front of the osf and ask them for more details on those less explicit requirements?
19:47 <fungi> i think it's supposed to mean that the osf privacy policy is linked from the service in an easy to find way, that the service complies with the gdpr (that's vague too), and that something enforces that all people connecting agree to follow the code of conduct
19:48 <fungi> but yes, getting clarification on those points would be good
19:49 <fungi> also we've (several of us) done our best to remind osf folks that people will use whatever tools they want at the end of the day, so it may be a matter of legal risks for the osf endorsing specific tools vs just accepting their use
19:49 <clarkb> I do think on paper jitsi meets the more concrete requirements with maybe exception of the room size (depends on whether or not we can bump that up?)
19:49 <corvus> is there a limit on room size?
19:49 <clarkb> corvus: did you say it is a limit of 35? I thought someone said that
19:49 <corvus> i thought i read about hundreds of people in a jitsi room
19:49 <clarkb> but there was some workaround for that
19:50 <fungi> i think that one was more of "the service should still be usable for conversation when 50 people are in the same room"
19:50 <clarkb> https://community.jitsi.org/t/maximum-number-of-participants-on-a-meeting-on-meet-jit-si-server/22273
19:50 <clarkb> maybe the 35 number came from something like that
19:50 <fungi> (noting that 10 people talking at once is tough to manage, much less 50, and that has little to do with the tools)
19:50 <clarkb> also those numbers may be specific to the meet.jit.si deployment (and we can tune ours separately?)
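On tuning a self-hosted deployment: jitsi-meet's config.js exposes a few knobs relevant to how a large room holds up. The values below are illustrative guesses, not the meetpad.opendev.org configuration.

```javascript
// Fragment of a jitsi-meet config.js (illustrative values only):
var config = {
    // forward video for only the N most recently active speakers,
    // instead of every participant's stream to every client
    channelLastN: 5,
    // join participants muted once a room passes these sizes,
    // which cuts the initial noise in large rooms
    startAudioMuted: 10,
    startVideoMuted: 10,
};
```

Limits observed on meet.jit.si reflect that deployment's settings and hardware, so a self-hosted instance tuned this way may behave quite differently at the same participant count.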
19:51 <fungi> also osf has concerns that whatever platforms are endorsed have strong controls allowing rooms to be moderated, people to be muted by moderators, and abusive attendees to be removed reliably
19:51 <corvus> yeah, sounds like there may be issues with larger numbers of folks
19:51 <clarkb> in any case I think step 0 is getting it to work in our simpler case with the etherpad integration
19:52 <clarkb> I think we haven't quite picked that up again since all the etherpad and jitsi updates so worth retesting and seeing where it is at now
19:52 <corvus> k, i'll do the http redirect and ping some folks maybe tomorrow for testing
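The missing http-to-https redirect corvus mentions is conventionally a catch-all port-80 virtual host. Whether meetpad fronts with Apache or nginx isn't stated in the log, so this Apache-style fragment is only a sketch of the idea, not the real meetpad configuration:

```apache
# Hypothetical sketch, not the actual meetpad vhost: send every
# plain-HTTP request to the TLS site.
<VirtualHost *:80>
    ServerName meetpad.opendev.org
    Redirect permanent / https://meetpad.opendev.org/
</VirtualHost>
```

A permanent (301) redirect lets browsers cache the answer, so clients who typed the plain-http URL once land on https on subsequent visits.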
19:52 <clarkb> thanks!
19:52 <clarkb> fungi any wiki updates?
19:53 <fungi> none, the most i find time for is patrolling page edits from new users
19:53 <fungi> (and banning and mass deleting the spam)
19:53 <clarkb> #topic Open Discussion
19:53 *** openstack changes topic to "Open Discussion (Meeting topic: infra)"
19:53 <clarkb> That takes us to the end of our agenda
19:54 <clarkb> As a quick note my ISP has been sold off and acquired by a new company. That transition takes effect May 1st (Friday). I don't expect outages but seems like chances for them are higher under those circumstances
19:54 <corvus> clarkb: i hear the internet is gonna be big
19:55 <clarkb> corvus: do you think we could sell books over the internet?
19:55 <ianw> if i could get an eye on
19:56 <ianw> #link https://review.opendev.org/#/c/723309/
19:56 <ianw> that is part of the pip-and-virtualenv work to add an ensure-virtualenv role for things that actually require virtualenv
19:57 <ianw> dib is one such thing, this gets the arm64 builds back testing; we have dropped pip-and-virtualenv from them
19:57 <clarkb> ianw: I'll take a look after lunch if it is still up there then
19:58 <ianw> #link http://eavesdrop.openstack.org/irclogs/%23opendev/%23opendev.2020-04-28.log.html#t2020-04-28T04:47:00
19:58 <ianw> is my near term plan for that work
19:58 <fungi> i have a very rough start on the central auth spec drafted locally, though it's still a patchwork of prose i've scraped from various people's e-mail messages over several years so i'm trying to find time to wrangle that into something sensible and current
19:59 <fungi> and i've been fiddling with a new tool to gather engagement metrics from gerrit (and soon mailman pipermail archives, meetbot channel logs, et cetera)
19:59 <fungi> trying to decide whether i should push that up to system-config or make a new repo for it
20:00 <clarkb> fungi: new repo might be worthwhile. Thinking out loud here: the zuul work in system-config is really orienting it towards deployment of tools but not in defining the tools themselves as much?
20:00 <fungi> yeah
20:00 <fungi> i concur
20:00 <fungi> i should make it an installable python project
20:01 <clarkb> and we are at time
20:01 <clarkb> thank you everyone!
20:01 <clarkb> #endmeeting
20:01 <fungi> thanks clarkb!
20:01 *** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev"
20:01 <openstack> Meeting ended Tue Apr 28 20:01:17 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
20:01 <openstack> Minutes:        http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-28-19.01.html
20:01 <openstack> Minutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-28-19.01.txt
20:01 <openstack> Log:            http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-28-19.01.log.html
20:39 *** diablo_rojo has quit IRC
20:42 *** diablo_rojo has joined #opendev-meeting

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!