Tuesday, 2020-05-19

*** diablo_rojo has joined #opendev-meeting18:52
clarkbWe will start the opendev team meeting shortly18:59
fungiyay!19:00
ianwo/19:00
corvusbowl o' lunch acquired19:00
zbro/19:01
mordredo/19:01
clarkbI have hastily eaten my sandwich19:01
fungibowling for lunches19:01
clarkb#startmeeting infra19:01
openstackMeeting started Tue May 19 19:01:10 2020 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
*** openstack changes topic to " (Meeting topic: infra)"19:01
openstackThe meeting name has been set to 'infra'19:01
clarkb#link http://lists.opendev.org/pipermail/service-discuss/2020-May/000025.html Our Agenda19:01
clarkb#topic Announcements19:01
*** openstack changes topic to "Announcements (Meeting topic: infra)"19:01
clarkbThe OpenStack release went really smoothly19:01
clarkbthank you to everyone for ensuring that services were running and happy for that19:01
clarkbThis next weekend is also a holiday weekend. OpenStack Foundation staff have decided that we're gonna tack on an extra day on the other side of it making Friday a day off too19:02
fungiand that was with ovh entirely offline for us too ;)19:02
clarkbas a heads up that means I'll be largely away from my computer Friday and Monday19:03
fungii'll likely be around if something comes up, because i'm terrible at vacationing19:03
fungibut i may defer non-emergency items19:04
clarkb#topic Actions from last meeting19:04
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)"19:04
clarkb#link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-12-19.01.txt minutes from last meeting19:04
clarkbThere are no actions to call out19:04
clarkb#topic Priority Efforts19:04
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)"19:04
clarkb#topic Update Config Management19:04
*** openstack changes topic to "Update Config Management (Meeting topic: infra)"19:04
clarkbany movement on the gerritbot installation ? I haven't seen any19:05
fungithere's been a ton of progress on our ci mirrors19:05
clarkbyes, ianw has redeployed them all to be ansible managed under the opendev.org domain19:05
fungiat least the ones which weren't yet19:05
fungii did similarly for the ovh mirrors since they needed rebuilding anyway19:06
clarkbcorvus: mordred: I think the other big item on this topic is the nb03 arm64 docker image (which still needs debugging on the image build side) and the work to improve testing of Zuul deployments pre merge with zuul19:07
clarkbis there anything else to call out on that? maybe changes that are ready for review?19:07
fungiso far everything seems to be working after the mass mirror replacements, so that went extremely well19:07
corvusi think the work on getting zuul+nodepool working in tests is nearly there -- at least, i think the 2 problems we previously identified have solutions which are complete and ready for review except:19:07
* diablo_rojo sneaks in late again19:08
fungidiablo_rojo: there might still be a few cookies left, but the coffee's gone already19:08
corvusthey keep hitting random internet hangups, so they're both still 'red'.  but i think their constituent jobs have individually demonstrated functionality19:08
corvus2 main changes:19:08
mordredI'm planning on digging back in to figuring out what's up with the multi-arch stuff :(19:08
corvus#link better iptables support in testing https://review.opendev.org/72647519:08
diablo_rojofungi, thats alright, I prefer tea as my hot caffeine source anyway.19:09
corvus#link run zuul as zuuld user https://review.opendev.org/72695819:09
clarkbcorvus: gerrit reports they conflict with each other. Do they need to be stacked?19:09
corvusthat last one is worth a little extra discussion -- we went back and forth on irc on how to handle that, and we decided to use a 'container' user, but once i was done with that, i really didn't like it, so i'm proposing we zig and use a 'zuuld' user instead.19:10
mordred++19:10
corvusthe bulk of the change is normalizing all the variables around that, so, really, it's not a big deal to change (now or later)19:10
corvusthe main thing we were worried about is whether it would screw up the executors, but i walked through the process, and i think the executor is going to happily ssh to the 'zuul' user on the worker nodes even though it's running as zuuld19:11
clarkbcool and that will avoid zuul on the test nodes conflicting with zuul the service user19:11
corvusclarkb: unsure about conflicts; i'll check.  i might rebase both of them on the 'vendor ppa keys' change i'm about to write19:11
corvusclarkb: yep19:12
corvusi'm unsure if the ansible will rename the zuul->zuuld user on the real executors or not19:12
corvusbut if it errors, we can manually do that pretty easily19:12
clarkbthe uids stay the same right? so it's just a minor /etc/passwd edit?19:12
corvusthe uid will be the same19:12
corvusyep19:12
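The manual fallback discussed above (renaming the login while the uid stays the same) can be sketched as follows. This is only an illustration on passwd-format text; the function name `rename_passwd_user` is hypothetical, and on a real host a tool such as `usermod -l` would more likely be used, with matching entries in other account files needing attention as well.

```python
def rename_passwd_user(passwd_text, old_name, new_name):
    """Rename a login in passwd-format text, leaving uid/gid and other fields untouched.

    Hypothetical helper for illustration only; it works on a string,
    not on the live /etc/passwd file.
    """
    out = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # Field 0 is the login name; the uid lives in field 2 and is not modified.
        if fields[0] == old_name:
            fields[0] = new_name
        out.append(":".join(fields))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "root:x:0:0:root:/root:/bin/bash\nzuul:x:1000:1000::/home/zuul:/bin/bash\n"
    print(rename_passwd_user(sample, "zuul", "zuuld"), end="")
```

Only the first field changes, which is why file ownership on disk (tracked by uid) would be unaffected by such a rename.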
clarkbalright anything else on the topic of config management updates?19:13
clarkb#topic OpenDev19:14
*** openstack changes topic to "OpenDev (Meeting topic: infra)"19:14
clarkb#link https://etherpad.opendev.org/p/XRyf4UliAKI9nRGstsP4 Email draft for building advisory board.19:14
clarkbI wrote this draft email yesterday and am hoping I can get some reviews of it before sending it out19:15
clarkbbasic idea is to send email to service-discuss while BCC'ing people who I think may be interested. Then have those that are interested respond to the list and volunteer19:15
clarkbif that plan sounds good and people are ok with the email content I'll go ahead and send that out and start trying to drum up interest there19:16
corvusclarkb: generally looks good.  i noted 2 things that could use fixing19:17
clarkbcorvus: thanks I'll try to incorporate feedback after the meeting19:17
clarkbProbably don't need to discuss this further here, just wanted to make sure people were aware of it19:17
clarkbAny other OpenDev topics to bring up before we move on?19:17
clarkb#topic General Topics19:19
*** openstack changes topic to "General Topics (Meeting topic: infra)"19:19
clarkb#topic pip-and-virtualenv next steps19:19
*** openstack changes topic to "pip-and-virtualenv next steps (Meeting topic: infra)"19:19
clarkbianw you added this item want to walk us through it?19:19
ianwyes so the status of this is currently that centos-8, all suse images, all arm64 images have dropped pip-and-virtualenv19:20
ianwthat of course leaves the "hard" ones :)19:20
ianwwe have "-plain" versions of our other platform nodes, and these are passing all tests for zuul-jobs19:21
ianwi'm not sure what to do but notify people we'll be changing this, and do it?19:21
ianwi'm open to suggestions19:21
mordredyeah - I think that's likely the best bet19:21
corvusfolks can test by adding a "nodeset" line to a project-pipeline job variant, yeah?19:22
ianwoh and the plain nodes pass devstack19:22
clarkbianw: ya I think now is likely a good time for openstack. Zuul is trying to get a release done, but is also capable of debugging quickly. Airship is probably my biggest concern as they are trying to do their big 2.0 release sometime soonish19:22
mordredit's beginning of an openstack cycle - and there should be a clear and easy answer to "my job isn't working any more" right19:22
mordred?19:22
clarkbcorvus: yes I think so19:22
zbralternative suffix -vanilla :P19:22
clarkbzbr: we'll be dropping the suffixes entirely I think19:22
clarkbzbr: it's just there temporarily for testing19:22
ianwyes i'd like to drop the suffix19:22
fungiat a minimum we should make sure the ubuntu-focal images are "plain"19:22
clarkbmaybe the thing to do is notify people of the change with direction on testing using a variant and the -plain images19:23
corvusmaybe we should send out an email saying "we will change by date $x; you can test now by following procedure $y" ?19:23
ianwoh, yeah the focal images dropped it too19:23
corvusclarkb: yeah that :)19:23
clarkbcorvus: ++19:23
fungiopenstack projects are supposed to be switching to focal and replacing legacy imported zuul v2 jobs this cycle anyway19:23
fungithe bigger impact for openstack will likely be stable branches19:23
clarkbdoing it next week may work, week after is PTG though which we'd probably want to avoid unnecessary disruption during19:23
mordredfungi: yeah - where I imagine there will be a long tail of weird stuff19:23
fungithey may have to backport some job fixes to cope with missing pip/virtualenv19:23
clarkbfungi: ya, though I think the idea is that we've handled it for them?19:24
ianwyeah, at worst you might have to put in an ensure-virtualenv role19:24
corvusset the date at next friday?  (or thursday?)19:24
clarkbwe can also do a soft update where we switch default nodesets to -plain while keeping the other images around. Then in a few more weeks remove the -plain images19:25
ianwok, i will come up with a draft email, noting the things around plain images etc and circulate it19:26
clarkbsounds good19:26
mordredclarkb: I like that idea19:26
fungibut if ianw's hunch is right, then the job changes needed to add the role are equally trivial as the job changes needed to temporarily use the other image name19:26
clarkbya I think we've done a fair bit of prep work to reduce impact and make fixing it easy. Rolling forward is likely as easy as being extra cautious19:27
fungiso keeping the nonplain images around after the default switch may just be creating more work for us and an attractive nuisance for projects who don't know the fix is as simple as the workaround19:27
clarkbthe key thing is to give people the info they need to know what to do if it breaks them19:27
ianw++ i will definitely explain what was happening and what should happen now19:28
mordred++19:28
ianwif anyone reads it, is of course another matter :)19:28
fungithey'll read it after they pop into irc asking why their jobs broke and we link them to the ml archive19:28
fungibetter late than never19:29
clarkbanything else on this topic?19:29
ianwno thanks19:29
clarkb#topic DNS cleanup19:29
*** openstack changes topic to "DNS cleanup (Meeting topic: infra)"19:29
clarkb#link https://review.opendev.org/728739 : Add tool to export Rackspace DNS domains to bind format19:29
ianwahh, also me19:30
clarkbgo for it19:30
ianwso when i was clearing out old mirrors, i very nearly deleted the openstack.org DOMAIN ... the buttons start to look very similar19:30
ianwafter i regained composure, i wondered if we had any sort of recovery for that19:30
ianwbut also, there's a lot of stuff in there19:30
mordredI like the idea of exporting to a bind file19:31
ianwfirstly, is there any issue with me pasting it to an etherpad for shared eyes to do an audit of what we can remove?19:31
clarkbianw: I don't have any concern doing that for the records we manage in that zone. I'm not sure if that is true of the records the foundation manages19:31
mordredianw: you poked at the api - how hard do you think it would be to make a tool that would diff against a bind file and what's in the api? (other than just another export and a diff)19:31
clarkbfungi: ^ you may have a sense for that?19:31
corvusistr there were some maybe semi-private foundation stuff19:31
fungilast time i looked, there was no export option in the webui and their nameservers reject axfr queries19:32
ianw... ok, so maybe i won't post it19:32
fungiclarkb: we can ask the osf admins if there's anything sensitive about dns entries19:32
clarkbianw: we could use a file on bridge instead of etherpad19:32
mordredcould we verify with foundation if there actually are semi-private things? it would be nice if we could export it and put a copy into git for reference19:33
clarkband ya fungi and I can ask jimmy and friends if there are reasons to not do that19:33
ianwfungi: there isn't an export option in the UI, but the change linked above uses the API for a bind export19:33
ianwso yeah, i'm thinking at a minimum we should dump it, and other domains, periodically to bridge or somewhere in case someone does someday click the delete domain button19:33
fungiianw: oh, does that support pagination? last time i looked the zone was too big and just got truncated19:33
mordredianw: ++19:33
ianwfungi: it looks complete to me ... let me put it on bridge if you want to look quickly19:34
fungisure, happy to skim19:34
clarkbya I like the idea of a periodic dump. I actually got introduced to Anne via a phone call while walking around seattle one afternoon due to a dns mistake that happened :/19:34
ianw~ianw/openstack.org-bind.txt19:34
ianwok, i could write it as an ansible role to dump the domain, although it would probably have to be skipped for testing19:35
ianwor, just a cron job19:35
fungiideally we won't be adding any new names to that zone though, right? at most replacing a/aaaa with cname and adding le cnames19:35
mordredfungi: yeah - but we remove names from it19:36
mordredas we retire things19:36
clarkbya deleting will be common19:36
fungii suppose the periodic dump is to remind us what's still there19:36
ianwfungi: yep ... and there's also a ton of other domains under management ... i think it would be good to back them up too, just as a service19:36
fungioh, got it, disaster recovery19:37
ianwand, if we want to migrate things, a bind file is probably a good place to start :)19:37
fungiin the past we've shied away from adding custom automation which speaks proprietary rackspace protocols... how is this different?19:37
mordredianw: feature request ... better sorting19:37
mordredfungi: we use the rax dns api for reverse dns19:38
clarkbalso this is less automation and more record keeping (eg it can run in the background and if it breaks it won't affect anything)19:38
corvusi like it since it's a bridge to help us move off of the platform19:38
fungimordred: we do, but we do that manually19:38
clarkboh and ya if you want to move off having bind zone files is an excellent place to start19:39
ianwfungi: i guess it's not ... but we are just making a copy of what's there.  i feel like we're cutting off our nose to spite our face if i had deleted the domain and we didn't back it up for ideological reasons :)19:39
mordredfungi: yeah - but what clarkb and corvus said19:39
fungii find the "we have periodic exports of your zones in a format you could feed into a better nameserver" argument compelling, yes19:39
mordred++19:39
clarkbfwiw I've asked the question about publishing the zone file more publicly19:40
clarkbwill let everyone know what i hear back19:40
mordredcool19:40
ianwclarkb: thanks, we can do a shared cleanup if that's ok then, and i'll work on some sort of backup solution as well19:40
mordredbut also - I think we should spend a few minutes to sort/format that file - it'll make diffing easier19:40
mordredlike "dump file before change, dump file after change, diff to verify"19:40
clarkbmordred: ya that sounds like a good idea19:41
ianwmordred: alphabetical sort of hostnames?19:41
mordredI could see just straight alph sorting being the most sensible - I don't think it needs to be grouped by type19:41
mordredianw: ++19:41
ianwi can make the export tool do that no probs19:41
mordredthat should be the easiest for our human brains to audit too19:41
mordred(in fact, the rax ui grouping things by type drives me crazy)19:42
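The "dump before, dump after, diff to verify" workflow with alphabetically sorted hostnames could be sketched roughly like this. It is a minimal illustration, not the export tool from the linked change: the function names are hypothetical, and it assumes one record per line with an explicit owner name in the first column (multi-line records such as a parenthesized SOA are not handled).

```python
import difflib


def sort_zone_records(zone_text):
    """Return record lines sorted alphabetically by owner name.

    Assumes one record per line with the owner name first. Blank lines,
    ';' comments, and '$' directives are dropped, and whitespace is
    normalized so formatting changes alone do not show up in a diff.
    """
    records = []
    for line in zone_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(";") or stripped.startswith("$"):
            continue
        records.append(" ".join(stripped.split()))
    # Sort by lowercased owner name, then by the full line for stable output.
    return sorted(records, key=lambda r: (r.split()[0].lower(), r))


def diff_zones(before_text, after_text):
    """Return unified-diff lines between two zone dumps after sorting both."""
    before = sort_zone_records(before_text)
    after = sort_zone_records(after_text)
    return list(
        difflib.unified_diff(before, after, "before", "after", lineterm="", n=0)
    )


if __name__ == "__main__":
    before = "b.example.org. 300 IN A 192.0.2.2\na.example.org. 300 IN A 192.0.2.1\n"
    after = "a.example.org. 300 IN A 192.0.2.1\n"
    for line in diff_zones(before, after):
        print(line)
```

Sorting both dumps the same way before diffing means a record that merely moved in the export does not appear as a change; only added or removed records do.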
ianwok, i'll also list out the other domains under management19:42
fungiianw: your zone export looks comprehensive to me, just skimming. i want to say the limit was something like 500 records so maybe we've sufficiently shrunk our usage that their api is no longer broken for us19:42
fungi(it's 432 now, looks like)19:43
corvusthat sounds like a legit number :)19:43
ianwoh good, maybe i got lucky :)19:43
clarkbas a timecheck we have ~17 minutes and a few more items to get through. Sounds like we've got next steps for this item sorted out. I'm going to move things along19:43
ianwyep, thanks for feedback!19:43
clarkb#topic HTTPS on in region mirrors19:43
*** openstack changes topic to "HTTPS on in region mirrors (Meeting topic: infra)"19:43
clarkb#link https://review.opendev.org/#/c/728986/ Enable more ssl vhosts on mirrors19:43
clarkbI wrote a change to enable more ssl on our mirrors. Then ianw mentioned testing and I ended up down that rabbit hole today19:44
clarkbit's a good thing too because the change as is wouldn't have worked19:44
clarkbI think the latest patchset is good to go now though19:44
clarkbThe idea there is we can start using ssl anywhere that the clients can parse it19:44
clarkbIn particular things like pypi and docker and apt repos should all benefit19:45
ianwthat is basically everything ! apt on xenial?19:45
clarkbianw: ya and maybe older debian if we still have that around19:45
clarkbif we get that in I'll confirm all the mirrors are happy and then look at updating our mirror configuration in jobs19:46
clarkbThis has the potential to be disruptive but I think it should be transparent to users19:46
ianwohh s/region/openafs/ ... doh19:46
clarkbmostly just a request for review on that and a heads up that there will be changes in the space19:47
clarkb#topic Scaling Jitsi Meet for Meetpad19:47
*** openstack changes topic to "Scaling Jitsi Meet for Meetpad (Meeting topic: infra)"19:47
clarkbLast Friday diablo_rojo hosted a conference call on meetpad19:47
clarkbwe had something like ~22 concurrent users at peak and it all seemed to work well19:48
clarkbwhat we did notice is that the jitsi video bridge (jvb) was using significant cpu resources though.19:48
diablo_rojoYeah it was good. I had no lag or freezing.19:48
corvusi thought there would be an open bar19:48
diablo_rojocorvus, there was at my house19:48
clarkbI dug into jitsi docs over the weekend and found that jvb is the main component necessary to scale as it does all the video processing19:48
clarkbthankfully we can run multiple jvbs and they load balance reasonably well (each conference is run on a single jvb though)19:49
clarkb#link https://review.opendev.org/#/c/729008/ Starting to poke at running more jvbs19:49
corvusclarkb: did you read frickler's link about that?19:49
fungiand it can be network distributed from the other services too i guess?19:49
clarkbcorvus: I couldn't find it19:49
clarkbfungi: yes19:49
clarkbbasically where that took me was the 729008 change above19:49
fungiso run multiple virtual machines with a jvb each, and then another virtual machine with the remaining services19:50
fungigot it19:50
clarkbwhich I think is really close at this point and is going to be blocked on dns in our fake setup as well as firewall hole punching19:50
corvusclarkb: maybe rebase on my firewall patch?19:50
clarkbcorvus: ya I think the firewall patch solves the firewall problem. Did you also end up doing the /etc/hosts or similar stuff?19:51
clarkbmy thinking on this is if we can get 729008 in or something like it then we can just spin up a few extra jvb's next week, have them run during the ptg, then shut them down afterwards19:51
corvusclarkb: no, firewall config is ip addresses from inventory19:52
mordredclarkb: I have a patch for hostnames19:52
corvusclarkb: at least, that's the backend implementation.  the frontend is 'just add the group'19:52
corvusmordred: you what?19:52
mordredhttps://review.opendev.org/#/c/726910/19:52
mordredif we want to do that19:53
corvusi thought we decided not to?19:53
clarkbI'll continue and we can sort out those details after the meeting. Have a few more items to bring up before our hour is up19:54
mordredI don't remember us deciding that - but if we did, cool - I can abandon that patch19:54
clarkb#topic Project Renames19:54
*** openstack changes topic to "Project Renames (Meeting topic: infra)"19:54
clarkbWe have a single project queued up for renaming in gerrit and gitea19:54
clarkbwe've been asked about this several times and the response previously was we didn't want to take an outage just prior to the openstack release19:55
fungiit's been waiting since just after our last rename maintenance19:55
clarkbthat release is now done so we can schedule a new rename.19:55
clarkbfungi: right, one of the problems here was they didn't get on the queue last time and showed up like the day after19:55
clarkbthis time around I'll try to "advertise" the scheduled rename date to get things on the list as early as possible19:55
clarkbMy current thinking is that with all the prep to the ptg and the ptg itself as well as holidays the best time for this may be the week after the ptg19:56
clarkbJune 8-12 ish time frame19:56
clarkbalso post PTG tends to be a quiet time so may be good for users too19:56
clarkbany preferences or other ideas?19:57
clarkbdoesn't sound like it. Let's pencil in June 12 and start getting potential renames to think with that deadline in mind?19:58
clarkb(sorry if I'm moving too fast I keep looking at the clock and realize we are just about at time for the day)19:58
fungiwfm, thanks19:58
clarkb#topic Virtual PTG Attendance19:58
*** openstack changes topic to "Virtual PTG Attendance (Meeting topic: infra)"19:58
clarkb#link https://virtualptgjune2020.eventbrite.com Register if you plan to attend. This helps with planning details.19:58
clarkb#link https://etherpad.opendev.org/p/opendev-virtual-ptg-june-2020 PTG Ideas19:58
clarkbA friendly reminder to register for the PTG if you plan to attend19:59
clarkbas well as a link to our planning document with connection and time details19:59
clarkbThis will be all new and different. Will be interesting to see how it goes19:59
clarkbAnd that basically takes us to the end of the hour19:59
fungithanks clarkb!19:59
clarkbThank you everyone for your time. Feel free to continue discussions in #opendev19:59
clarkb#endmeeting20:00
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev"20:00
openstackMeeting ended Tue May 19 20:00:02 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)20:00
openstackMinutes:        http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-19-19.01.html20:00
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-19-19.01.txt20:00
openstackLog:            http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-05-19-19.01.log.html20:00
