Tuesday, 2020-11-03

*** hamalq has quit IRC01:41
*** hashar has joined #opendev-meeting08:03
*** sboyron has joined #opendev-meeting08:11
*** sboyron has quit IRC09:12
*** sboyron has joined #opendev-meeting09:12
*** mordred has quit IRC15:00
*** hashar is now known as hasharOut15:22
*** mordred has joined #opendev-meeting15:41
*** mordred has quit IRC15:47
*** mordred has joined #opendev-meeting15:56
*** hamalq has joined #opendev-meeting17:01
*** hamalq has quit IRC18:27
*** hamalq has joined #opendev-meeting18:27
*** ianw_pto is now known as ianw18:59
clarkbanyone else here for the meeting? we'll get started in a couple minutes18:59
ianwo/19:00
clarkb#startmeeting infra19:01
openstackMeeting started Tue Nov  3 19:01:06 2020 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
*** openstack changes topic to " (Meeting topic: infra)"19:01
fricklero/19:01
openstackThe meeting name has been set to 'infra'19:01
clarkblink http://lists.opendev.org/pipermail/service-discuss/2020-November/000123.html Our Agenda19:01
clarkb#link http://lists.opendev.org/pipermail/service-discuss/2020-November/000123.html Our Agenda19:01
clarkb#topic Announcements19:01
*** openstack changes topic to "Announcements (Meeting topic: infra)"19:01
clarkbWallaby cycle signing key has been activated https://review.opendev.org/76036419:01
clarkbPlease sign if you haven't yet https://docs.opendev.org/opendev/system-config/latest/signing.html19:01
clarkbthis is fungi's semi-annual reminder that we should verify and sign the contents of that key19:02
clarkbfungi: ^ anything else to add on that topic?19:02
funginot really, it's in place now19:02
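As a rough, hedged illustration of the attestation step described in the signing documentation linked above (the fingerprint below is a placeholder, not the real Wallaby key; take the actual value and exact procedure from those docs):

    # Fetch the cycle signing key, check its fingerprint against the documented
    # value, then attest with your own key and publish the signature.
    gpg --recv-keys 0xPLACEHOLDER_FINGERPRINT
    gpg --fingerprint 0xPLACEHOLDER_FINGERPRINT   # compare with the docs before signing
    gpg --sign-key 0xPLACEHOLDER_FINGERPRINT
    gpg --send-keys 0xPLACEHOLDER_FINGERPRINT     # push your signature back to the keyservers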
clarkbThe other announcement I had was that much of the world has just ended, or will soon end/start, summer time19:03
fungieventually i'd like to look into some opendev-specific signing keys, but haven't had time to plan how we'll handle the e-mail address yet19:03
clarkbdouble check your meetings against your local timezone as things may be offset by an hour from where they were the last ~6 months19:03
clarkb#topic Actions from last meeting19:04
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)"19:04
clarkb#link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-10-13-19.01.txt minutes from last meeting19:05
clarkbI don't see any recorded actions, but it has been a while. Was there anything from previous meetings we should call out quickly?19:05
funginothing comes to mind19:06
clarkb#topic Priority Efforts19:06
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)"19:06
clarkb#topic Update Config Management19:06
*** openstack changes topic to "Update Config Management (Meeting topic: infra)"19:06
clarkbOne thing to call out here is that docker's new rate limiting has gone into effect (or should've)19:07
clarkbI've yet to see catastrophic results from that for our jobs (and zuul's)19:07
clarkbbut we should keep an eye on it.19:07
clarkbIf things do get really sad I've pushed up changes that will stop us funneling traffic through our caching proxies which will diversify the source addresses and should reduce the impact of the rate limiting19:07
clarkbfrickler also reached out to them about their open source project support and they will give us rate-limit-free images but we have to agree to a bunch of terms which we may not be super thrilled about19:08
clarkbin particular one that worries me is that we can't use third party container tools? something like that19:08
clarkbfungi: do you think we should reach out to jbryce about those terms and see what he thinks about them and go from there?19:09
clarkb(I mean other opinions are good too but jbryce tends to have a good grasp on those types of use agreements)19:09
fungiwell, it was more like we can't imply that unofficial tools are supported for retrieving and running those images, it seemed like19:09
ianwi.e. podman, etc is what that means?19:10
clarkbright, but we actively use skopeo for our image jobs ...19:10
fricklermaybe we should reply to docker and ask what they really mean with all that19:10
fungialso it's specifically about the images we publish, not about how many images we can retrieve which are published by others19:10
clarkbianw: that was how I read it and ya clarification on that point may be worthwhile too19:10
fungifor those who weren't forwarded a copy, here's the specific requirement: "...the Publisher agrees to...Document that Docker Engine or Docker Desktop are required to run their whitelisted images"19:12
clarkbthe good news so far is that our volume seems to be low enough that we haven't hit immediate problems. And fungi and I can see if jbryce has any specific concerns about their agreement (we can have our concerns too)?19:13
fricklerya I wasn't sure whether the mail should be considered confidential, but I think I could paste it into an etherpad to let us agree on a reply?19:13
fungithe other requirements were mainly about participating in press releases and marketing materials for docker inc19:13
fungiwhich while maybe distasteful are probably not as hard to agree to do if we decide this is important19:13
clarkbya and may not even be the worst thing if we end up talking about how we build images and do the speculative builds and all that19:14
fricklerit might also be interesting to find out whether base images like python+ubuntu might already be under the free program19:14
clarkbfrickler: that's a good point too because if our base images aren't then we are only solving half the problem19:15
clarkbI wonder if there is a way to check19:15
fricklerwhich might imply that we don't have a lot of issues anyway, yes19:15
fungitry to retrieve it 101 times in an afternoon? ;)19:15
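Less brute-force: Docker documented a way to inspect the anonymous pull limit by requesting a token for the ratelimitpreview/test repository and reading the rate-limit headers. A sketch (header names and endpoints as documented at the time; they may have changed since):

    TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" \
      | python3 -c 'import json,sys; print(json.load(sys.stdin)["token"])')
    curl -sI -H "Authorization: Bearer $TOKEN" \
      https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest \
      | grep -i ratelimit   # ratelimit-limit / ratelimit-remaining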
fricklerwe could ask about that in our reply, too. do we have a (nearly) complete list of namespaces we use images from?19:16
clarkbfrickler: you can probably do a search on codesearch for dockerfile and FROM lines to get a representative sample?19:16
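A minimal sketch of that kind of survey, assuming local checkouts of the repositories under one directory (a codesearch query for "^FROM" restricted to Dockerfiles would give a similar sample):

    # Tally the Docker Hub namespaces referenced by FROM lines in Dockerfiles.
    find . -name 'Dockerfile*' -print0 \
      | xargs -0 grep -hiE '^FROM ' \
      | awk '{print $2}' | sed 's/@.*//;s/:.*//' \
      | awk -F/ 'NF>1 {print $1"/"$2} NF==1 {print "library/"$1}' \
      | sort | uniq -c | sort -rn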
frickleralso, do we have a list of "opendev" namespaces? I know about zuul only19:16
clarkbwe have opendevorg and zuul19:16
fungiopendevorg19:16
clarkbwell zuul has zuul and opendev has opendevorg19:17
corvusi think "we" "have" openstack too19:17
fricklerdo we talk for both or would we let zuul do a different contact19:17
clarkbfrickler: for now it is probably best to stick to opendevorg and figure out what the rules are then we can look at expanding from there?19:18
corvusclarkb: ++19:18
clarkbzuul may not be comfortable with all the same rules we may be comfortable with (or vice versa). Starting small seems like a good thing19:18
fungikolla also publishes images to their own namespace i think, loci may as well?19:18
fungibut yeah, i would start with one19:19
clarkbalright anything else on this topic or should we move on?19:19
fungipossible we could publish our images to more than one registry and then consume from one which isn't dockerhub, though that may encounter similar rate limits19:21
fricklerhttps://etherpad.opendev.org/p/EAfLWowNY8N96APS1XXM19:21
clarkbyes I seem to recall tripleo ruled out quay as a quick fix because they have rate limits too19:21
clarkbI think figuring out how we can run a caching proxy of some sort would still be great (possibly a version of zuul-registry)19:21
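For reference, a generic sketch of the pull-through-cache idea being discussed; hostnames are illustrative, OpenDev's existing proxies are Apache caches rather than this, and a zuul-registry based approach would look different:

    # On consumers, /etc/docker/daemon.json sends pulls through a local mirror
    # (the mirror needs valid TLS; that setup is not shown here):
    #   { "registry-mirrors": ["https://dockerhub-cache.example.opendev.org"] }
    #
    # On the cache host, the stock registry image can run as a pull-through proxy:
    docker run -d -p 443:5000 \
      -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
      registry:2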
fricklersadly that doesn't include the embedded links, I'll add those later, likely tomorrow19:21
clarkbfrickler: thanks19:21
clarkb#topic OpenDev19:22
*** openstack changes topic to "OpenDev (Meeting topic: infra)"19:22
clarkbThe work to upgrade Gerrit continues. I announced to service-announce@lists.opendev.org that this is going to happen November 20-2219:23
clarkbfungi and I will be driving that but others are more than welcome to help out too :)19:23
clarkbon the prep and testing side of things we need to spin review-test back up on 2.13 with an up-to-date prod state and re-upgrade it19:23
fungiyep, the more the merrier19:23
clarkbwe're also investigating mnaser's idea for using a surrogate gerrit on a performant vexxhost flavor19:24
clarkbbut I think we'll test that from a 2.13 review-test clone19:24
clarkbfungi: do you think that is something we can start in the next day or two?19:24
fungiyeah, i was hoping to have time for it today, but i'm still catching my breath and catching up after the past few weeks19:24
clarkbcool I'm hoping for time tomorrow at this point myself19:25
clarkbianw: any new news on the jeepyb side of things where the db access will go away?19:25
clarkbhttps://review.opendev.org/758595 is an unrelated bug in jeepyb that I caught during upgrade testing if people have time for that one19:27
ianwclarkb: sorry no didn't get to that yet, although we poked at some api bits19:27
clarkbno worries, I think we're all getting back into the swing of things after an eventful couple of weeks19:28
ianwit seems what we need is in later gerrits (ability to look up IDs and emails)19:28
ianwbut not in current gerrit, which makes it a bit annoying that i guess we can't pre-deploy things19:28
clarkboh right the api exposes that. I think the thing we need to check next on that is what perms are required to do that and we can look at that once review-test is upgraded again19:28
clarkbwe can definitely use review-test to dig into that more hopefully soon19:29
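For reference, a hedged sketch of the kind of lookup being discussed, using the accounts query endpoint in Gerrit's REST API on newer releases; whether this particular query covers what jeepyb needs, and which capabilities the calling account must have (the open question above), still has to be confirmed on review-test. Host and credentials are placeholders:

    # Look up an account by email, asking for details and all registered emails.
    # Responses from /a/ are prefixed with )]}' which callers must strip.
    curl -s -u jeepyb:SECRET \
      'https://review-test.example.org/a/accounts/?q=email:someone@example.org&o=DETAILS&o=ALL_EMAILS'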
clarkbanything else on the gerrit upgrade? or other opendev related topics?19:30
ianwwe can probably discuss outside of the meeting, but i did just see that we got an email from the person presumably crawling gerrit and causing a few slowdowns recently19:31
fungiyeah, i figured we'll leave review-test up again after the upgrade test for developing things like that against more easily19:31
fungiianw: yeah, they replied to the ml too, and i've responded to them on-list19:32
fungii dropped you from the explicit cc since you read the ml19:32
ianwoh, ok haven't got that far yet :)19:32
clarkb#topic General Topics19:33
*** openstack changes topic to "General Topics (Meeting topic: infra)"19:33
clarkbQuick note that I intend to put together a PTG followup email in the near future too. Just many things to catch up on, and that has been lagging19:33
clarkb#topic Meetpad Access issues from China19:35
*** openstack changes topic to "Meetpad Access issues from China (Meeting topic: infra)"19:35
clarkbfrickler: you added this one so feel free to jump in19:35
clarkbIt is my understanding that it appears either corporate networks or the great firewall are blocking access to meetpad19:35
clarkbthis caused neutron (and possibly others) to fallback to zoom19:35
frickleryeah, so I just saw that one person had difficulty joining the neutron meetpad19:36
fricklerand I was thinking that it would be good if we could solve that issue19:36
fungiany idea which it was? businesses/isps blocking web-rtc at their network borders, or the national firewall?19:36
fricklerbut it would likely need cooperation with someone on the "inside"19:36
corvuscan we characterize that issue?  (yes what fungi said)19:36
fricklerhe said that he could only listen to audio19:37
corvus(was it even webrtc being blocked or...)19:37
corvusis there more than one report?19:37
fungiyes, i would say first we should see if we can find someone who is on a "normal" (not corporate) network in mainland china who is able to access meetpad successfully (if possible), and then try to figure out what's different for people who can't19:37
fungii should say not corporate and not vpn19:38
fungithere are also people outside china who can't seem to get meetpad to work for various reasons, so i would hate to imply that it's a "china problem"19:38
clarkbmaybe we can see if horace has time to do a test call with us?19:39
clarkbthen work from there?19:39
fricklerftr there were also people having issues with zoom19:40
clarkbI'll try to reach out to horace later today local time (horace's morning) and see if that is something we can test out19:41
fricklerand even some for whom meetpad seemed to work better than zoom, so not a general issue in one direction19:41
fricklersee the feedback etherpad https://etherpad.opendev.org/p/October2020-PTG-Feedback19:41
fungiyes, meetpad works marginally better for me than zoom's webclient (i'm not brave enough nor foolhardy enough to try zoom's binary clients)19:42
clarkbanything else on this subject? sounds like we need to gather more data19:43
frickleranother, likely unrelated issue, was that meetpad was dropping the etherpad window at times when someone with video enabled was talking19:43
fungii also had a number of meetpad sessions where the embedded etherpad stayed up the whole time, so i still am not quite sure what sometimes causes it to keep getting replaced by camera streams19:43
fungithough yeah, maybe it's that those were sessions where nobody turned their cameras on19:44
fungii didn't consider that possibility19:44
clarkbmay be worth filing an upstream bug on that one19:44
corvuswe're also behind on the js client; upstream hasn't merged my pr19:45
clarkbI did briefly look at the js when it was happening for me and I couldn't figure it out19:45
clarkbcorvus: ah, maybe we should rebase and deploy a new image and see if it persists?19:45
corvusmaybe it's worth a rebase/update before the next event19:45
clarkb++19:45
fungisometime in the next few weeks might be good for that matter19:46
corvussomething happening in a few weeks?19:46
clarkbfungi is gonna use meetpad for socially distant thanksgiving?19:46
funginah, just figure that gives us lots of time to work out any new issues19:46
corvusah yep.  well before the next event would be best i agree19:47
clarkbok we've got ~13 minutes left and a couple topics I wanted to bring up. We can swing back around to this if we have time19:47
clarkb#topic Bup and Borg Backups19:47
*** openstack changes topic to "Bup and Borg Backups (Meeting topic: infra)"19:47
clarkbianw: I think you've made progress on this but wanted to check in on it to be sure19:47
ianwthere's https://review.opendev.org/#/c/760497/ to bring in the second borg backup server19:48
ianwthat should be ready to go, the server is up with storage attached19:48
ianwso basically i'd like to get ethercalc backed up to both borg servers, then stage in more servers until the point where all are borg-ing, then we can stop bup19:49
clarkbany changes yet to add the fuse support deps?19:49
ianwtodo is the fuse bits19:49
clarkbk19:49
ianwthat's all :)19:49
clarkbthank you for pushing on that19:49
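For context on the remaining fuse work: borg needs FUSE support (the llfuse bindings) only for mounting archives to browse or restore interactively. A sketch, with Debian/Ubuntu package names assumed rather than taken from the pending change:

    apt-get install -y borgbackup fuse python3-llfuse   # package names assumed
    borg mount backup-host:/path/to/repo::archive-name /mnt/restore
    ls /mnt/restore     # browse the archive like a filesystem
    borg umount /mnt/restore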
clarkb#topic Long term plans for running openstackid.org19:49
*** openstack changes topic to "Long term plans for running openstackid.org (Meeting topic: infra)"19:49
clarkbAmong the recent fires was that openstackid melted down during the virtual summit19:49
clarkbit turned out there were caching problems which caused basically all the requests to retry auth, and that caused openstackid to break19:50
clarkbwe were asked to scale up openstackid's deployment, which fungi and I did. What we discovered doing that is that if we had to rebuild or redeploy the service we wouldn't be able to do so successfully without intervention from the foundation sysadmins, due to firewalls19:50
clarkbI'd like to work with them to sort out what the best options are for hosting the service and it is feeling like we may not be it. But I want to see if others have strong feelings19:51
clarkbthey did mention they have docker image stuff now so we could convert them to our ansible + docker compose stuff if we wanted to keep running it19:51
fungifor background, we stood up the openstackid.org deployment initially because there was a desire from the oif (then osf) for us to switch to using it, and we said that for such a change to even be on the table we'd need it to be run within our infrastructure and processes. in the years since, it's become clear that if we do integrate it in some way it will be as an identity option for our users so not something we need to retain control over19:54
fungicurrently i think translate.openstack.org, refstack.openstack.org and survey.openstack.org are the only services we operate which rely on it for authentication19:55
fungiof those, two can probably go away (translate is running abandonware, and survey is barely used), the other could perhaps also be handed off to the oif19:56
clarkbya no decisions made yet, just wanted to call that out as a thing that is going on19:56
clarkbwe are just about at time now so I'll open it up to any other items really quick19:56
clarkb#topic Open Discussion19:56
*** openstack changes topic to "Open Discussion (Meeting topic: infra)"19:56
fungiwe had a spate of ethercalc crashes over the weekend. i narrowed it down to a corrupt/broken spreadsheet19:57
fungii'll not link it here, but in short any client pulling up that spreadsheet will cause the service to crash19:58
corvuscan/should we delete it?19:58
corvus(the ethercalc which must not be named)19:58
fungiand the webclient helpfully keeps retrying to access it for as long as you have the tab/window open, so it re-crashes the service again as soon as you start it back up19:58
clarkbif we are able to delete it that seems like a reasonable thing to do19:59
fungiyeah, i looked into how to do deletes, there's a rest api and the documentation for it mentions a method to delete a "room"19:59
clarkbit's a redis data store so not sure what that looks like if there isn't an api for it19:59
fungii'm still not quite sure how you auth to the api, suspect it might work like etherpad's19:59
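A heavily hedged sketch of the two avenues mentioned; the exact REST path, its authentication, and the redis key layout are all unverified assumptions here:

    # Option 1: the delete method the ethercalc REST docs describe (path/auth unverified)
    curl -X DELETE https://ethercalc.example.org/_/ROOMNAME

    # Option 2: find the room's keys in redis and remove them directly
    redis-cli --scan --pattern '*ROOMNAME*'                       # inspect first
    redis-cli --scan --pattern '*ROOMNAME*' | xargs redis-cli del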
clarkband now we are at time20:00
clarkbfungi: yup they are pretty similar that way iirc20:00
clarkbthank you everyone!20:00
clarkb#endmeeting20:00
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev"20:00
openstackMeeting ended Tue Nov  3 20:00:23 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)20:00
openstackMinutes:        http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-11-03-19.01.html20:00
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-11-03-19.01.txt20:00
openstackLog:            http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-11-03-19.01.log.html20:00
fungithanks clarkb!20:00
*** hasharOut is now known as hashar21:12
*** ChanServ has quit IRC21:26
*** ChanServ has joined #opendev-meeting21:32
*** tepper.freenode.net sets mode: +o ChanServ21:32
*** hashar has quit IRC22:01
*** sboyron has quit IRC22:34

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!