| -openstackstatus- NOTICE: review.opendev.org is being restarted for scheduled maintenance; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html | 16:04 | |
| fungi | okay, we can start prepping for the etherpad maintenance in here i suppose | 16:53 |
|---|---|---|
| corvus | status notice etherpad.openstack will be offline for about 30 minutes while it is migrated to a new server with a new hostname; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html | 16:54 |
| corvus | how's that look? | 16:54 |
| corvus | also, do we want to startmeeting? | 16:55 |
| corvus | maybe startmeeting opendev-maintenance ? | 16:55 |
| corvus | infra-root: i summon you :) | 16:56 |
| fungi | *poof* | 16:56 |
| * fungi appears in a puff of smoke | 16:56 | |
| clarkb | corvus: ++ on the meeting we can try that out for records | 16:56 |
| clarkb | and the status message lgtm | 16:56 |
| fungi | that lgtm | 16:57 |
| fungi | using meetbot for this one would work, but not for anything where #status alert as they will fight for control of the channel topic | 16:57 |
| corvus | or should we call it 'opendev-maint' because typing is hard? | 16:58 |
| fungi | i'm fine with the abbrev, sure | 16:59 |
| * mordred waves | 16:59 | |
| mordred | corvus: yes | 16:59 |
| corvus | "opendev-maint" going once | 16:59 |
| corvus | ...going twice... | 16:59 |
| corvus | ...sold | 17:00 |
| corvus | #startmeeting opendev-maint | 17:00 |
| openstack | Meeting started Fri Apr 10 17:00:05 2020 UTC and is due to finish in 60 minutes. The chair is corvus. Information about MeetBot at http://wiki.debian.org/MeetBot. | 17:00 |
| openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 17:00 |
| *** openstack changes topic to " (Meeting topic: opendev-maint)" | 17:00 | |
| openstack | The meeting name has been set to 'opendev_maint' | 17:00 |
| corvus | ha, apparently it's opendev_maint :) | 17:00 |
| corvus | #status notice etherpad.openstack.org will be offline for about 30 minutes while it is migrated to a new server with a new hostname; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html | 17:00 |
| openstackstatus | corvus: sending notice | 17:00 |
| * mordred is in a screen on etherpad01.opendev.org | 17:01 | |
| -openstackstatus- NOTICE: etherpad.openstack.org will be offline for about 30 minutes while it is migrated to a new server with a new hostname; see http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html | 17:01 | |
| corvus | joined | 17:01 |
| fungi | joined as well | 17:01 |
| mordred | k. I'm ready to rock and roll there - somebody else want to stop existing etherpad? | 17:01 |
| mordred | ( | 17:02 |
| * clarkb is joining | 17:02 | |
| corvus | i'll stop existing etherpad | 17:02 |
| mordred | I'm going to warn everybody - it's like watching paint dry in the screen once this is running | 17:02 |
| fungi | oh, for the db dump/source pipeline? | 17:02 |
| clarkb | I've joined | 17:02 |
| mordred | yup | 17:02 |
| corvus | old etherpad is stopped | 17:03 |
| mordred | ok. I'm going to run the command | 17:03 |
| mordred | it is running | 17:03 |
| corvus | neat, old etherpad is running a puppetlabs mcollectived server | 17:04 |
| corvus | whatever that is | 17:04 |
| openstackstatus | corvus: finished sending notice | 17:04 |
| mordred | WOW | 17:04 |
| corvus | mordred: is etherpad running on the new server? | 17:04 |
| mordred | corvus: it should not be | 17:04 |
| mordred | I only started the mariadb service | 17:04 |
| corvus | cool, i confirm that's the case :) | 17:04 |
| clarkb | mcollective was puppets message bus for doing orchestration like tasks | 17:05 |
| corvus | should we start the dns change now? | 17:05 |
| corvus | i believe we should change etherpad.openstack.org cname to point to etherpad.opendev.org ? | 17:05 |
| mordred | yeah - I think that's a good idea | 17:05 |
| corvus | i'll get started on that while clarkb and fungi confirm :) | 17:06 |
| clarkb | ++ | 17:06 |
| fungi | yes definitely | 17:07 |
| fungi | to give the change time to propagate | 17:07 |
| fungi | presumably the plan is to delete the existing a/aaaa rrs for etherpad.openstack.org and replace it with a cname to etherpad.opendev.org | 17:08 |
| corvus | etherpad.openstack.org is currently a cname for etherpad01 | 17:09 |
| corvus | etherpad.openstack.org is currently a cname for etherpad01.openstack.org | 17:09 |
| corvus | i was going to change it to be a cname for etherpad.opendev.org | 17:09 |
| corvus | so the result will be etherpad.openstack.org -> etherpad.opendev.org -> etherpad01.opendev.org | 17:09 |
| clarkb | corvus: ++ | 17:09 |
| mordred | corvus: I think that's correct | 17:09 |
| fungi | ahh, right, so just update the cname, even easier | 17:10 |
| corvus | there's just one problem; i don't see etherpad.openstack.org in the list of records in the rax web ui | 17:10 |
| corvus | it was there when i changed the ttl a few days ago | 17:10 |
| fungi | scroll all the way to the end and then keyword search? | 17:10 |
| corvus | is there some kind of limit? | 17:10 |
| mordred | the rax records are paged and sorted by type | 17:10 |
| corvus | fungi: that is my usual procedure which i have done | 17:10 |
| fungi | it only pages in some at a time and you have to scroll | 17:10 |
| mordred | weird | 17:10 |
| fungi | ahh, i can try | 17:10 |
| clarkb | the length of the db backup is making me think about this. Whats the disk situation like on the new server? it has a 50GB volume and is currently using ~3GB of that for the prod db? | 17:11 |
| mordred | also - https://review.opendev.org/#/c/718764 can be landed now | 17:11 |
| corvus | wait i found it | 17:11 |
| fungi | standing down! | 17:11 |
| clarkb | also ^F doesn't work properly | 17:11 |
| corvus | ctrl-f was not bringing it up | 17:11 |
| mordred | corvus: once it's loaded it's about 30G of data | 17:11 |
| mordred | gah | 17:11 |
| mordred | clarkb: ^^ | 17:11 |
| clarkb | mordred: is 50GB big enough? | 17:11 |
| corvus | but scrolling to it, it shows up (and it's highlighted) | 17:11 |
| mordred | that's what the volume was on the old one | 17:11 |
| clarkb | mordred: ah ok | 17:11 |
| clarkb | and we can always attach another volume and grow the lv | 17:12 |
| clarkb | now that I've said ^ and checked lvs I'm far less worried :) | 17:12 |
| mordred | ++ | 17:12 |
| * fungi checks paint, still sticky | 17:12 | |
| mordred | that said - I was totally a shemp when I attached that volume so the lv has a stupid name | 17:12 |
| corvus | #info updated etherpad.openstack.org. CNAME from etherpad01.openstack.org. to etherpad01.opendev.org. | 17:13 |
| corvus | i left the ttl at 300 | 17:13 |
| mordred | cool | 17:13 |
| corvus | do we have an ssl cert for etherpad.openstack.org on etherpad01.opendev.org? | 17:14 |
| fungi | yeah, i already tested that bit | 17:14 |
| corvus | cool, i thought so, just running through things again :) | 17:14 |
| mordred | if you want to watch the db size grow: | 17:14 |
| mordred | ls -ltrah /var/etherpad/db/etherpad@002dlite/store.ibd | 17:14 |
| mordred | on etherpad01.opendev.org | 17:14 |
| clarkb | ya and the LE verification failed the first time around because dns wasn't set up properly to verify that the first time | 17:15 |
| fungi | X509v3 Subject Alternative Name: DNS:etherpad.opendev.org, DNS:etherpad.openstack.org, DNS:etherpad01.opendev.org | 17:15 |
| fungi | according to openssl | 17:15 |
| mordred | woot | 17:15 |
| corvus | etherpad.openstack.org. 300 IN CNAME etherpad.opendev.org. | 17:16 |
| corvus | etherpad.opendev.org. 299 IN CNAME etherpad01.opendev.org. | 17:16 |
| corvus | etherpad01.opendev.org. 218 IN A 104.130.124.120 | 17:16 |
| corvus | that's what i get from dig now | 17:16 |
| clarkb | corvus: looks perfect | 17:17 |
| corvus | and cool, the http redirect is working | 17:17 |
| corvus | (because apache is up; it's just the eplite service that's down) | 17:17 |
| mordred | while we're waiting - it occurred to me recently - is having apache on the host rather than in a docker container and in the compose file the right choice? would it make more sense to run it as an apache container as well? | 17:19 |
| clarkb | mordred: ya I was thinking about that back when I thought refstack might grow some momentum again. I think if we want to go away from using host networking having a host run webproxy is nice though it could be the one host network container too | 17:20 |
| fungi | right, i tested the redirect yesterday as well, albeit with the etherpad service down and apache serving an error for it | 17:20 |
| fungi | so looks like what i got from my local /etc/hosts edit | 17:21 |
| mordred | clarkb: yeah - I was thinking about it from a "what would be different about these container services if we decided to roll out k8s" | 17:21 |
| clarkb | mordred: if we rolled out k8s we'd probably use the nginx ingress controller for a good chunk of that ? | 17:21 |
| corvus | i'm ambivalent about whether we run apache in a container or not; if we did, we could still use host networking | 17:22 |
| clarkb | though services like etherpad need rewriting which I don't know that can do | 17:22 |
| corvus | clarkb: we would use *some kind* of ingress controller, not necessarily the nginx one, depending on what our load balancer situation was like | 17:22 |
| clarkb | fair | 17:22 |
| corvus | and many of them can rewrite | 17:22 |
| mordred | clarkb: yeah - I think we can still run apache behind the ingress controller in those cases - so that we don't have to rewrite all of our rewrites | 17:22 |
| mordred | but also - cloud load balancers are a thing too | 17:22 |
| mordred | when we did the gitea setup, we used a cloud load balancer that attached to exposed service of each pod running | 17:23 |
| clarkb | and that cloud load balancer was running haproxy not nginx :) | 17:23 |
| mordred | that said - in our current clouds we can do the same thing only with nginx ingress if we use VRRP to manage which thing owns the VIP | 17:23 |
| mordred | if we don't want to rely on a cloud load balancer | 17:24 |
| mordred | I know that it's possible to create VRRP-enabled ports in neutron in vexxhost | 17:24 |
| clarkb | mordred: ya the basic requirement is being able to control a shared l2 network between the instances with the 3 IPs on that network | 17:25 |
| clarkb | though maybe you don't even need the third ip on that network if you can vrrp separately? its been a while since I had to do vrrp | 17:26 |
| corvus | here's an ingress controller config for gke with a path mapping (to /, but the syntax is there to imagine other roots); so it's doing layer 7 load balancing -- https://gerrit.googlesource.com/zuul/ops/+/refs/heads/master/k8s/zuul.yaml#315 | 17:26 |
| fungi | clarkb: yeah, technically you can have vrrp/hsrp/carp use only two addresses (though a third makes it somewhat easier) | 17:27 |
| mordred | corvus: so that ingress setup seems like it's mapping a single external ip to the resources? | 17:29 |
| clarkb | mordred: I think its a name not an ip | 17:30 |
| clarkb | (so they could do magic with dns potentially) | 17:30 |
| mordred | kubernetes.io/ingress.global-static-ip-name: "zuul-static-ip" | 17:31 |
| mordred | is what I was keying off of | 17:31 |
| corvus | mordred: yes, it's a single pre-allocated static ip | 17:31 |
| corvus | (i previously ran "gcloud get me a static ip named zuul-static-ip") | 17:32 |
| clarkb | ah | 17:32 |
| clarkb | so its referencing cloud resources outside of k8s | 17:32 |
| mordred | nod. so pattern-wise (ignoring mechanics for a sec) - that would potentially map to the sorts of things we'd want to do | 17:32 |
| corvus | yep | 17:32 |
| mordred | so figuring out the equiv pattern for us inside of a k8s in openstack would be a key piece if we wanted to explore using k8s for services instead of compose | 17:33 |
| clarkb | we are at 13GB used | 17:37 |
| clarkb | and now 15GB this paint is sticky | 17:41 |
| mordred | yeah | 17:44 |
| fungi | "wet data, do not touch" | 17:44 |
| mordred | seems to be running slower today | 17:44 |
| fungi | it is a holiday | 17:48 |
| corvus | we're expecting it to be how big? | 17:48 |
| fungi | ~30gb clarkb said? | 17:48 |
| corvus | 30g right? | 17:48 |
| clarkb | ya thats what mordred said above | 17:49 |
| fungi | oh, got it | 17:49 |
| corvus | so we're 36 minutes away from completion | 17:49 |
| corvus | status notice The etherpad migration is still in progress; revised estimated time of completion 18:30 UTC | 17:50 |
| corvus | should we send that? | 17:50 |
| fungi | yeah, warranted | 17:51 |
| clarkb | ++ | 17:51 |
| corvus | #status notice The etherpad migration is still in progress; revised estimated time of completion 18:30 UTC | 17:51 |
| openstackstatus | corvus: sending notice | 17:51 |
| corvus | i'm going to afk for about 30m | 17:51 |
| -openstackstatus- NOTICE: The etherpad migration is still in progress; revised estimated time of completion 18:30 UTC | 17:51 | |
| fungi | once maintenance is concluded, it may be time to prepare for my annual viewing of "the life of brian" | 17:52 |
| clarkb | I'll be making a tunafish sandwich for lunch when this is done | 17:53 |
| mordred | fungi, clarkb : while you're waiting: https://review.opendev.org/#/c/718764/ | 17:53 |
| mordred | and actually - I think we can not land that yet | 17:54 |
| openstackstatus | corvus: finished sending notice | 17:54 |
| mordred | and land it once we take etherpad out of the emergency file to ... no, that's too laggy. nevermind me | 17:55 |
| clarkb | https://review.opendev.org/#/c/719051/ another good one to review though it had a post failure | 17:55 |
| mordred | I think we can land it whenever | 17:55 |
| mordred | clarkb: and this one remote: https://review.opendev.org/719053 Set env vars pointing to correct file locations | 17:57 |
| mordred | and remote: https://review.opendev.org/719052 Fix issues from rolling out containers | 18:01 |
| mordred | infra-root db migration done | 18:02 |
| mordred | I might have been wrong about db size | 18:02 |
| fungi | or there were a lot of zeroes at the end | 18:02 |
| clarkb | or newer mysql is more compact | 18:02 |
| mordred | I think actually 32G of free space on device is what I was looking at :) | 18:02 |
| fungi | so ready to start up the container? | 18:02 |
| mordred | yeah - I think so | 18:03 |
| mordred | any last concerns? | 18:03 |
| fungi | none for me | 18:04 |
| clarkb | none from me | 18:04 |
| mordred | k. here we go | 18:04 |
| mordred | k. I reloaded an openstack etherpad, it redirected to opendev and all is good | 18:04 |
| fungi | i reconnected to a pad i already had open and got sent to the right (new) place | 18:05 |
| mordred | we might want to keep our eyes on this as it gets usage - might need to tune the my.cnf settings | 18:05 |
| fungi | didn't even reload, just clicked the reconnect button from when it got disconnected during the shutdown | 18:05 |
| fungi | we did at least incorporate the apache tuning we had on the old deployment, right? | 18:05 |
| mordred | yeah | 18:06 |
| mordred | innodb_buffer_pool_size= 256M is the one I think might be applicable | 18:06 |
| fungi | tested out a few more pads, not seeing any problem yet | 18:06 |
| clarkb | mordred: thinking it may need to be bigger? | 18:06 |
| mordred | but honestly, 256M of hot data isn't bad | 18:06 |
| clarkb | and ya I think individual etherpads tend to be pretty small. Its the history data that grows (I wonder if we can tune it to prefer the newer pad data) | 18:07 |
| mordred | it'll do that naturally - the buffer pool will only contain the most recently touched pages | 18:07 |
| mordred | so I think it should be fine | 18:08 |
| mordred | in other news, my new dowel-style rolling pin has arrived | 18:09 |
| fungi | have fun! i still just use a boring old marble cylinder roller | 18:11 |
| fungi | but i like the extra weight | 18:11 |
| mordred | are you saying I'm fat? | 18:12 |
| fungi | heh | 18:13 |
| clarkb | that post failure was due to an rsync failure fwiw | 18:13 |
| clarkb | mordreds approval seems to have rechecked it | 18:13 |
| clarkb | do we need to send an all clear now? and maybe end the meeting? | 18:14 |
| clarkb | not sure what other work there is to do other than following up on gerrit jeepyb things | 18:14 |
| mordred | I think we should end the meeting - don't know if we need an all clear | 18:18 |
| mordred | I think this one is good | 18:18 |
| mordred | we might need to restart etherpad to pick up the settings.json update - but that should be a thing that can just be done - in the margin of error of an internet facing service connectivity | 18:19 |
| mordred | oh - we need to take etherpad01.opendev.org out of emergency - shall I do that? | 18:19 |
| clarkb | ++ | 18:19 |
| clarkb | and then sometime next week clean up the old server and db? probably after we have backups running for the new server? | 18:19 |
| mordred | no - we need to land ... | 18:20 |
| mordred | https://review.opendev.org/#/c/719036/ | 18:20 |
| mordred | and then ... one sec | 18:20 |
| fungi | mordred: are we missing an equivalent of https://opendev.org/opendev/system-config/src/branch/master/modules/openstack_project/templates/gerrit_patchset-created.erb ? | 18:20 |
| clarkb | mordred: comment on https://review.opendev.org/#/c/719036/1 | 18:21 |
| fungi | nevermind, found it at https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/gerrit/templates/patchset-created.j2 | 18:22 |
| mordred | clarkb: updated - and pushed up 2 additional | 18:23 |
| mordred | fungi: oh - I missed that one in the patch didn't I? | 18:24 |
| fungi | mordred: yeah, i commented | 18:24 |
| fungi | since it's a template it's not in the same directory | 18:24 |
| mordred | ++ | 18:24 |
| fungi | though maybe make it not a template? | 18:24 |
| corvus | o/ | 18:24 |
| fungi | it's only templated so we can toggle the welcome message feature on the existence or absence of a welcome_message_gerrit_ssh_private_key value | 18:25 |
| clarkb | mordred: and do we expect that to noop for review01.openstack.org? I guess since its already configured? | 18:25 |
| mordred | fungi: yeah - which doesn't exist on review-dev I think | 18:25 |
| fungi | which i expect was more transitional or for the benefit of people who might reuse our hook scripts | 18:25 |
| mordred | clarkb: I think the backup group is intended to be a normal group for servers we backup? | 18:25 |
| fungi | anyway, yeah, drop the conditional, move to files, add envvar exports | 18:26 |
| clarkb | mordred: aha got it | 18:26 |
| mordred | the backup-server is the only one we only run some times | 18:26 |
| clarkb | also I accidentally added a +W on that group change. I've removed that | 18:26 |
| mordred | (see the two followup patches) | 18:26 |
| mordred | fungi: no - I think review-dev doesn't have that key | 18:26 |
| mordred | fungi: we'd need to add one for it - and a welcome message user | 18:27 |
| mordred | that said ... | 18:27 |
| mordred | fungi: I updated it - I think you'll like it now | 18:30 |
| mordred | corvus: does the stack at https://review.opendev.org/#/c/719077/ look right to you? | 18:32 |
| corvus | mordred: yeah -- though what was the conclusion about puppet managing backups on review? | 18:33 |
| corvus | (have we confirmed that's gone?) | 18:33 |
| mordred | those would be cron jobs right? | 18:33 |
| clarkb | mordred: yes cron jobs | 18:33 |
| clarkb | and since puppet isn't running its not managing it | 18:33 |
| clarkb | would mostly just be ensuring ansible applies the same or similar cron jobs and bup config | 18:34 |
| mordred | yeah. let me remove the bup cronjob | 18:34 |
| mordred | there's also 2 other cronjobs we have for root we need to add to ansible | 18:34 |
| mordred | but I'll leave them for now | 18:34 |
| mordred | until we have the patch to replace them | 18:34 |
| mordred | k. bup cronjob on review01.opendev.org has been removed - we should expect ansible to add one now | 18:35 |
| mordred | lemme make a patch to add the others | 18:35 |
| clarkb | service-backup should apply it | 18:35 |
| clarkb | when you add the server to the backup group | 18:36 |
| clarkb | (I don't know what triggers that playbook though) | 18:36 |
| mordred | clarkb: well - we have a patch to trigger all playbooks on inventory changes | 18:40 |
| mordred | that hasn't landed | 18:40 |
| mordred | https://review.opendev.org/719088 <-- gerrit cron jobs | 18:40 |
| mordred | clarkb: I take it back - inventory changes trigger everything now: https://review.opendev.org/71908 | 18:41 |
| mordred | clarkb: so adding and removing the things to groups should cause the backup playbook to run | 18:41 |
| clarkb | k | 18:41 |
| clarkb | mordred: that link is missing a digit | 18:41 |
| mordred | clarkb: https://review.opendev.org/#/c/717114/ is what I meant | 18:42 |
| clarkb | specifically line 1716 of that change covers this case | 18:43 |
| mordred | yeah | 18:43 |
| mordred | hah | 18:44 |
| corvus | looks like it's time to end the meeting | 18:47 |
| corvus | #endmeeting | 18:47 |
| *** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 18:47 | |
| openstack | Meeting ended Fri Apr 10 18:47:50 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 18:47 |
| openstack | Minutes: http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-04-10-17.00.html | 18:47 |
| openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-04-10-17.00.txt | 18:47 |
| openstack | Log: http://eavesdrop.openstack.org/meetings/opendev_maint/2020/opendev_maint.2020-04-10-17.00.log.html | 18:47 |
| *** diablo_rojo has joined #opendev-meeting | 18:55 | |
| -openstackstatus- NOTICE: Due to a database migration error, etherpad.opendev.org is offline until further notice. | 20:07 | |
| *** diablo_rojo_phon has joined #opendev-meeting | 20:53 | |
| *** diablo_rojo has quit IRC | 21:54 | |
| -openstackstatus- NOTICE: Maintenance on etherpad.opendev.org is complete and the service is available again | 22:23 | |
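
The DNS check at 17:16 in the log (the `dig` answer showing etherpad.openstack.org resolving through etherpad.opendev.org to etherpad01.opendev.org) can be sketched as a small offline script. This is only an illustration, not something the participants ran: the helper names `parse_answer` and `follow_chain` are hypothetical, and the answer lines are copied from the log with the usual whitespace-separated `dig` column layout assumed.

```python
def parse_answer(lines):
    """Parse dig-style answer lines into {owner name: (record type, target)}."""
    records = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip anything that is not "name ttl class type target"
        name, _ttl, _cls, rtype, target = fields
        records[name] = (rtype, target)
    return records


def follow_chain(records, name, max_hops=8):
    """Follow CNAME records from `name` until a non-CNAME record or a dead end."""
    chain = [name]
    for _ in range(max_hops):  # hop limit guards against CNAME loops
        entry = records.get(name)
        if entry is None:
            break
        rtype, target = entry
        chain.append(target)
        if rtype != "CNAME":
            break
        name = target
    return chain


# Answer lines as pasted in the log at 17:16 (column spacing assumed).
ANSWER = [
    "etherpad.openstack.org. 300 IN CNAME etherpad.opendev.org.",
    "etherpad.opendev.org. 299 IN CNAME etherpad01.opendev.org.",
    "etherpad01.opendev.org. 218 IN A 104.130.124.120",
]

chain = follow_chain(parse_answer(ANSWER), "etherpad.openstack.org.")
print(" -> ".join(chain))
```

Working from captured answer lines rather than live queries keeps the sketch reproducible; checking against real DNS would need a resolver call in place of `ANSWER`.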