*** sboyron has quit IRC | 00:08 | |
*** tosky has quit IRC | 00:41 | |
*** DSpider has quit IRC | 01:59 | |
ianw | #status log collecting netconsole logs from regionone.linaro-us mirror on temp host @ 104.239.145.100 | 03:12 |
---|---|---|
openstackstatus | ianw: finished logging | 03:12 |
ianw | infra-root: ^ should be able to login there. hopefully see if we can capture something if this shuts itself off | 03:13 |
*** mtreinish has quit IRC | 03:15 | |
*** slaweq has quit IRC | 03:40 | |
*** slaweq has joined #opendev | 03:42 | |
*** amotoki has quit IRC | 03:56 | |
*** amotoki has joined #opendev | 03:57 | |
*** ykarel has joined #opendev | 04:46 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Add all backup hosts to borg backups https://review.opendev.org/761855 | 05:09 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Remove entry-point for element-info https://review.opendev.org/761857 | 05:44 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Remove dib-block-device console entrypoint https://review.opendev.org/761858 | 05:44 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Add all backup hosts to borg backups https://review.opendev.org/761855 | 06:23 |
*** ykarel_ has joined #opendev | 06:27 | |
*** ykarel has quit IRC | 06:30 | |
*** marios has joined #opendev | 06:30 | |
*** ysandeep|away is now known as ysandeep | 06:39 | |
*** ysandeep is now known as ysandeep|ruck | 06:39 | |
*** sboyron has joined #opendev | 06:41 | |
*** whoami-rajat__ has joined #opendev | 06:50 | |
*** lpetrut has joined #opendev | 07:11 | |
*** hashar has joined #opendev | 07:50 | |
*** iurygregory has quit IRC | 07:58 | |
*** fressi has joined #opendev | 08:01 | |
*** andrewbonney has joined #opendev | 08:10 | |
openstackgerrit | Merged openstack/project-config master: Run publish-openstack-artifacts on ubuntu-focal https://review.opendev.org/761776 | 08:10 |
*** ysandeep|ruck is now known as ysandeep|lunch | 08:21 | |
*** rpittau|afk is now known as rpittau | 08:30 | |
*** iurygregory has joined #opendev | 08:30 | |
*** ykarel_ is now known as ykarel | 08:37 | |
*** ysandeep|lunch is now known as ysandeep|ruck | 09:15 | |
*** sshnaidm_ is now known as sshnaidm|rover | 09:24 | |
*** ralonsoh has joined #opendev | 09:33 | |
*** iurygregory has quit IRC | 09:35 | |
*** iurygregory has joined #opendev | 09:36 | |
*** fressi has quit IRC | 09:40 | |
*** fressi has joined #opendev | 09:47 | |
openstackgerrit | Merged zuul/zuul-jobs master: Decrease MTU to account for IPv6 header https://review.opendev.org/761800 | 09:48 |
*** fressi has quit IRC | 09:53 | |
*** fressi has joined #opendev | 09:57 | |
openstackgerrit | zbr proposed opendev/elastic-recheck master: Make file writing atomic https://review.opendev.org/761764 | 10:02 |
*** hashar has quit IRC | 10:31 | |
*** iurygregory_ has joined #opendev | 10:32 | |
*** iurygregory has quit IRC | 10:35 | |
openstackgerrit | Thierry Carrez proposed opendev/irc-meetings master: Update Large Scale SIG meeting time https://review.opendev.org/761884 | 10:48 |
openstackgerrit | Merged opendev/elastic-recheck master: Make file writing atomic https://review.opendev.org/761764 | 10:55 |
*** tosky has joined #opendev | 11:09 | |
*** mtreinish has joined #opendev | 11:14 | |
*** DSpider has joined #opendev | 11:35 | |
*** dtantsur|afk is now known as dtantsur | 12:32 | |
*** iurygregory_ is now known as iurygregory | 13:01 | |
*** hashar has joined #opendev | 13:07 | |
*** priteau has joined #opendev | 13:17 | |
*** d34dh0r53 has quit IRC | 13:46 | |
*** d34dh0r53 has joined #opendev | 13:52 | |
*** ykarel has quit IRC | 14:08 | |
*** TheJulia has joined #opendev | 14:09 | |
*** cloudnull has quit IRC | 14:11 | |
*** ykarel has joined #opendev | 14:18 | |
*** cloudnull has joined #opendev | 14:18 | |
openstackgerrit | zbr proposed zuul/zuul-jobs master: More E208 (22) https://review.opendev.org/761294 | 14:40 |
openstackgerrit | zbr proposed zuul/zuul-jobs master: More E208 (final) https://review.opendev.org/761297 | 14:40 |
openstackgerrit | zbr proposed openstack/project-config master: Add pytest-dev/pytest-infra project https://review.opendev.org/761939 | 14:45 |
openstackgerrit | zbr proposed opendev/system-config master: Account for testinfra project rename https://review.opendev.org/761946 | 14:59 |
*** lpetrut has quit IRC | 15:02 | |
*** ysandeep|ruck is now known as ysandeep|away | 15:25 | |
*** hashar has quit IRC | 15:28 | |
mordred | TIL: in bash v4 - shopt -s globstar | 15:38 |
mordred | ls **/*.conf | 15:38 |
mordred | totally a thing | 15:38 |
mordred | and it DTRT with paths containing spaces | 15:38 |
*** iurygregory has quit IRC | 15:43 | |
openstackgerrit | Merged zuul/zuul-jobs master: More E208 (22) https://review.opendev.org/761294 | 15:51 |
fungi | marvellous! | 15:52 |
fungi | recursive globs here i come | 15:52 |
*** rpittau is now known as rpittau|bbl | 15:52 | |
mordred | right? it's not every day I hear about a thing in bash and immediately get excited about cleaner shell scripts | 15:53 |
*** hashar has joined #opendev | 15:54 | |
*** roman_g has joined #opendev | 15:54 | |
clarkb | fwiw gerrit 3.3 won't release until the monday prior to our planned upgrade. For this reason I think we should stick with 3.2 as the upgrade target | 15:55 |
*** cloudnull has quit IRC | 15:55 | |
clarkb | getting to 3.3 shouldn't take long once we are at 3.2 if the 3.0 -> 3.1 -> 3.2 upgrades are any indication so I'm not too worried about it | 15:55 |
mordred | doing a followup 3.3 ... yeah | 15:55 |
mordred | should be easy | 15:55 |
openstackgerrit | zbr proposed openstack/project-config master: Add pytest-dev/pytest-testinfra https://review.opendev.org/761960 | 15:57 |
*** cloudnull has joined #opendev | 16:02 | |
*** iurygregory has joined #opendev | 16:03 | |
*** mlavalle has joined #opendev | 16:06 | |
*** jgwentworth is now known as melwitt | 16:12 | |
openstackgerrit | zbr proposed openstack/project-config master: Rename testinfra project https://review.opendev.org/761939 | 16:26 |
roman_g | Good morning, team. We are observing an increased number of NODE_FAILURE errors on attempts to launch jobs utilizing 16GB and 32GB VMs. Could you please have a look at the Zuul logs and confirm that it's just a capacity issue and the VMs couldn't be scheduled due to lack of resources? Thank you. | 16:28 |
roman_g | https://zuul.opendev.org/t/openstack/builds?job_name=airship-airshipctl-gate-test&project=airship/airshipctl | 16:28 |
roman_g | https://zuul.opendev.org/t/openstack/builds?job_name=airship-airshipctl-32GB-gate-test&project=airship/airshipctl | 16:28 |
roman_g | We are primarily concerned about the 16GB jobs, as the 32GB ones are only temporary (not merged and probably would not be merged). | 16:29 |
clarkb | roman_g: we had to turn off the openedge cloud due to a dying router and I expect most of your successful launches were against that cloud | 16:29 |
clarkb | roman_g: now you're only launching against the citycloud provider and that one has always had a higher fail rate | 16:29 |
roman_g | clarkb, But openedge was down for quite a while, no? | 16:30 |
clarkb | it was down for a while then turned back on for a while and now off again | 16:30 |
donnyd | Yea my edge router is dying | 16:31 |
roman_g | clarkb all right. Thank you. | 16:31 |
clarkb | I'm 99% certain this is going to be citynetwork issues we've seen before but I've not checked the logs | 16:31 |
roman_g | donnyd thank you for your services! | 16:31 |
*** diablo_rojo_phon has joined #opendev | 16:32 | |
donnyd | If i can find a new edge machine, OE will get turned back on, but short of one falling from the sky... | 16:32 |
roman_g | donnyd I wish I could help. | 16:36 |
*** slaweq has quit IRC | 16:37 | |
*** slaweq has joined #opendev | 16:40 | |
clarkb | melwitt: https://34e4edce884aa432edb3-3ebc6fb2608bb766b1a0361978d33f6a.ssl.cf5.rackcdn.com/752006/8/check/ironic-tempest-partition-uefi-redfish-vmedia/dc896cf/controller/logs/screen-etcd.txt should be the first file indexed by your updated log processor script | 16:43 |
clarkb | mordred: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_cd4/761037/1/check/ironic-tempest-ipa-partition-uefi-pxe-grub2/cd44441/controller/logs/screen-neutron-api.txt is second | 16:44 |
clarkb | er melwitt ^ | 16:44 |
clarkb | melwitt: I think the next step for us is to check that those contents ended up in elasticsearch as expected and if so we can fix the python requests package install on your change then land the change and get it out everywhere | 16:44 |
clarkb | I can help with the package install side since that is puppetry and I'm not sure how interested in figuring that out you are | 16:44 |
melwitt | clarkb: thanks! makes sense. +1 to your review comment to hold off on updates until we see how the test run does. I'm up for trying if you point me to an example or something, if that would be helpful for me to try it | 16:46 |
clarkb | melwitt: https://f582e4e618ed6a7ba944-b1e28608f26c1f15a9189c50caa4f021.ssl.cf2.rackcdn.com/760803/1/gate/ipa-tempest-bios-ipmi-iscsi-tinyipa-src/a6fd710/controller/logs/screen-keystone.txt is another | 16:46 |
clarkb | melwitt: oh I've just yolo'd it on production since we have 80 of these processes running | 16:46 |
*** marios is now known as marios|out | 16:47 | |
clarkb | melwitt: what you can do though is check that those three log files show up in logstash for you | 16:47 |
clarkb | for those specific job runs | 16:47 |
melwitt | :) ok, can do. thanks | 16:47 |
clarkb | melwitt: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_010/749318/3/gate/nova-next/01061cb/controller/logs/screen-c-vol.txt and another. That is probably a big enough sample now where if those are happy we can roll ahead | 16:49 |
melwitt | acl | 16:50 |
melwitt | *ack | 16:50 |
*** marios|out has quit IRC | 16:50 | |
*** hashar has quit IRC | 17:00 | |
*** ykarel has quit IRC | 17:04 | |
clarkb | melwitt: build_short_uuid:"a6fd710" AND filename:"controller/logs/screen-keystone.txt" I think that shows the keystone example | 17:08 |
clarkb | don't forget to expand the default time range, but can you confirm that you think the data is getting indexed? | 17:08 |
melwitt | yeah, I've been doing similar. I find: build_change:752006 AND filename:"controller/logs/screen-etcd.txt" AND build_short_uuid:dc896cf has index 40 lines and the file has 40 lines, so that's a good sign | 17:09 |
melwitt | *indexed | 17:09 |
clarkb | excellent | 17:10 |
*** hamalq has joined #opendev | 17:11 | |
*** hashar has joined #opendev | 17:12 | |
*** hashar has quit IRC | 17:12 | |
*** hamalq has quit IRC | 17:14 | |
*** hamalq has joined #opendev | 17:15 | |
melwitt | for the second sample, I'm not seeing all of the beginning lines in logstash when I sort by ascending timestamp filename:"controller/logs/screen-neutron-api.txt" AND build_short_uuid:cd44441 | 17:15 |
clarkb | melwitt: it should drop all the DEBUG lines, could that be it? | 17:16 |
clarkb | or maybe if they don't have a timestamp prefix they are getting associated to a prior log line/file | 17:16 |
clarkb | (fwiw I don't think issues like that will be due to your change, if we get any lines I expect we're good as far as not regressing via your change) | 17:16 |
melwitt | I see. yeah not yet sure what I'm seeing | 17:17 |
clarkb | I'll work on updating the puppet and removing the StringIO line in a bit if you'd prefer to not learn puppet :) | 17:18 |
melwitt | sure, if you don't mind :) I'm not opposed to learning puppet, I didn't want to create more work for you, so if it would help for me to do it, I'm willing to try | 17:21 |
melwitt | do you know if lines like this "Nov 09 16:25:14.440258 ubuntu-bionic-limestone-regionone-0021671450 systemd[1]: Started Devstack devstack@neutron-api.service." get categorized as DEBUG? | 17:22 |
*** rpittau|bbl is now known as rpittau | 17:22 | |
melwitt | I see that file missing in logstash but like you said, maybe unlikely to be a regression | 17:23 |
melwitt | s/file/line/ | 17:23 |
clarkb | melwitt: I'm not sure what it does if there is no level. but I wouldn't be surprised if that is the underlying issue | 17:23 |
openstackgerrit | Clark Boylan proposed opendev/puppet-log_processor master: Stream log files instead of loading full files into memory https://review.opendev.org/759492 | 17:24 |
clarkb | melwitt: ^ I think that will do it | 17:24 |
melwitt | that same line shows up in the first example file, but not in the second one | 17:24 |
melwitt | a-ha cool thanks! | 17:24 |
clarkb | they use different logstash rules I think | 17:25 |
clarkb | since etcd isn't an openstack service | 17:25 |
clarkb | I think we expect log level in openstack service logs | 17:25 |
clarkb | but we're probably more relaxed with more generic log files | 17:25 |
clarkb | fungi: if you have time to rereview https://review.opendev.org/759492 again that would be great. It's being tested on logstash-worker01 and other than needing to install python-requests it seems happy | 17:26 |
clarkb | melwitt: and thank you for having patience with me while I found time to finish this up | 17:26 |
melwitt | np at all, thank you for testing it out. I definitely wanted to do everything we could to help ensure it wouldn't break anything, so I appreciate it | 17:29 |
*** ralonsoh has quit IRC | 17:34 | |
melwitt | I found that line btw in the neutron file in logstash with filename:"controller/logs/screen-neutron-api.txt" AND build_short_uuid:cd44441 AND message:"Started Devstack" | 17:34 |
melwitt | so there must be something funny with the timestamp sorting on the kibana dashboard I guess | 17:35 |
fungi | clarkb: melwitt: 759492 lgtm | 17:36 |
melwitt | yeah I can see the "@timestamp" field isn't in the same order as the lines appear in the file, example: 2020-11-09T08:43:24.534-08:00 Nov 09 16:25:14.440258 ubuntu-bionic-limes | 17:41 |
melwitt | is the "Started Devstack" line | 17:42 |
*** hjensas__ has joined #opendev | 17:43 | |
clarkb | if we can't parse the timestamp in the input we use the current parse time timestamp | 17:43 |
clarkb | which may explain it | 17:43 |
melwitt | whereas this line "2020-11-09T08:25:16.610-08:00[-] Logging enabled!" appears later in the file but got "@timestamp" earlier in time | 17:43 |
*** hjensas_ has quit IRC | 17:47 | |
openstackgerrit | Clark Boylan proposed opendev/puppet-log_processor master: Stream log files instead of loading full files into memory https://review.opendev.org/759492 | 17:49 |
clarkb | fungi: ^ I had to move the requests install from worker.pp to init.pp because worker.pp is a define and called multiple times for each worker daemon install and we end up with duplicate package conflicts | 17:49 |
clarkb | the init.pp is a class and run once so should be fine there | 17:49 |
fungi | yep, good idea | 17:50 |
fungi | cleaner anyway | 17:50 |
fungi | conflict avoidance with ifdef would have been slightly messier i guess | 17:51 |
clarkb | ya | 17:51 |
*** andrewbonney has quit IRC | 18:09 | |
*** roman_g has quit IRC | 18:20 | |
*** dtantsur is now known as dtantsur|afk | 18:24 | |
*** roman_g has joined #opendev | 18:28 | |
openstackgerrit | Merged openstack/project-config master: Add pytest-dev/pytest-testinfra https://review.opendev.org/761960 | 18:33 |
fungi | ttx: i've requested some clarification on 761751 just to make sure expectations are correctly set | 18:35 |
clarkb | I've removed logstash-worker01 from the emergency hosts file since melwitt's change looks like it will merge soon | 18:35 |
fungi | awesome | 18:35 |
clarkb | once it has merged the daemons won't auto restart so I'll likely go through and reboot them all as it's a good opportunity for that | 18:36 |
fungi | that ought to be a huge stability improvement | 18:36 |
*** mlavalle has quit IRC | 18:43 | |
*** mlavalle has joined #opendev | 18:44 | |
*** rpittau is now known as rpittau|afk | 18:45 | |
openstackgerrit | Merged opendev/puppet-log_processor master: Stream log files instead of loading full files into memory https://review.opendev.org/759492 | 18:50 |
clarkb | I need to figure out lunch but then I'll check ^ has applied then restart as necessary | 18:57 |
*** whoami-rajat__ has quit IRC | 19:00 | |
*** roman_g has quit IRC | 19:09 | |
*** larainema has quit IRC | 19:12 | |
*** hashar has joined #opendev | 19:41 | |
*** tosky has quit IRC | 20:17 | |
yoctozepto | dear infra, may I get chanserv auth to be able to op myself on #openstack-masakari? | 20:20 |
fungi | yoctozepto: sure, just a sec, my wifi is acting a little flaky out here on the deck | 20:26 |
yoctozepto | fungi: sure, take your time, it's not a priority :-) | 20:27 |
*** tosky has joined #opendev | 20:28 | |
clarkb | I'm restarting logstash workers to pick up melwitt's change (puppet has applied but they don't restart services) | 20:28 |
fungi | #status log added founder access to #openstack-masakari for ptl yoctozepto | 20:29 |
openstackstatus | fungi: finished logging | 20:29 |
fungi | yoctozepto: looks like samP was the only person outside opendev irc volunteers with access to control that channel (seems to have been the one to initially register it) | 20:30 |
yoctozepto | fungi: I see, thanks | 20:31 |
ianw | fungi: i think the new mirror-update reprepro is working ok, if you have time to double check the removal @ https://review.opendev.org/#/c/759976/ i can cleanup the old stuff | 20:35 |
ianw | clarkb: and https://review.opendev.org/#/c/761855/ adds all hosts into borg-backup; i think we're ready to go with that, ethercalc has been working well | 20:36 |
clarkb | ianw: ah cool I'll take a look | 20:37 |
clarkb | melwitt: all 80 logstash workers are now running with your update. Thanks! let us know if you notice anything off with that | 20:38 |
*** whoami-rajat__ has joined #opendev | 20:38 | |
melwitt | clarkb: sweet, will do, I'll keep an eye on it | 20:38 |
clarkb | s/workers/worker daemons/ | 20:39 |
openstackgerrit | Merged opendev/system-config master: Remove mirror-update server and related puppet https://review.opendev.org/759976 | 21:07 |
*** mgoddard has quit IRC | 21:25 | |
clarkb | infra-root I've updated the meeting agenda for tomorrow on the wiki. Is there anything else that we should add before sending it? | 21:28 |
ianw | lgtm | 21:29 |
fungi | i can't think of anything | 21:31 |
fungi | (at all, not really agenda-related, i'm basically just braindead at this point) | 21:32 |
openstackgerrit | Merged openstack/project-config master: Rename testinfra project https://review.opendev.org/761939 | 21:38 |
clarkb | ianw: looking at the borg change, it occurs to me that we probably want to exclude afs too | 21:43 |
clarkb | ianw: or wait do we do that already because / isn't in the list? | 21:44 |
clarkb | if that is the case do we need to worry about other things like /opt being missed? | 21:44 |
clarkb | maybe we should backup / then be more aggressive on the excludes? | 21:44 |
clarkb | I think that is how bup did it | 21:44 |
*** hashar has quit IRC | 21:46 | |
*** sboyron has quit IRC | 21:46 | |
ianw | clarkb: yeah; i moved it to more of an include process. do we keep data on opt? | 21:52 |
clarkb | off the top of my head we don't, but we do use opt for random things so we may need to audit that on the backed up hosts? | 21:53 |
fungi | my recollection is we put important data in /home or /srv or /var/lib in some cases, but stuff in /opt is generally ephemeral? | 21:54 |
ianw | ask lists storyboard translate it seems no | 21:59 |
ianw | gitea review-dev review also all code | 22:00 |
fungi | we could stand to settle on a more formal policy for where we keep important stateful data, but it seems like we've been doing a pretty good job at being consistent so far | 22:01 |
ianw | etherpad has an /opt/db but i think that might be a manual thing? | 22:01 |
ianw | yeah, it looks like it was a dump made by mordred in april | 22:02 |
ianw | /dev/xvda1 39G 39G 0 100% / | 22:02 |
ianw | however, that is not good | 22:02 |
fungi | oof | 22:02 |
fungi | looks like it ran out of space while rotating db backups? | 22:04 |
clarkb | there were old stale backups on there that I was supposed to clear out then the world melted down | 22:04 |
clarkb | /var/backups/etherpad-mariadb/etherpad-mariadb.sql.gz.2.gz can be deleted since it is the old double gzipped file | 22:05 |
clarkb | I'm rm'ing it now | 22:05 |
clarkb | that gets us to 2.7GB free | 22:06 |
clarkb | which is about how large a db backup is, maybe we should reduce retention to less than 7 days locally | 22:06 |
clarkb | and rely on offsite backups more | 22:06 |
ianw | 8.6G.bup | 22:06 |
ianw | that seems big | 22:06 |
clarkb | review and zuul also had large .bups | 22:07 |
clarkb | I think we decided that you can safely rm it and bup rebuilds? | 22:07 |
ianw | i guess it's still going because the db volume has enough space | 22:08 |
clarkb | yup | 22:09 |
*** stevebaker has joined #opendev | 22:12 | |
ianw | seems like it's still not going to have enough space to dump | 22:12 |
clarkb | ianw: I think the two things we can do are to clean out the .bup dir, then manually rerun bup to rebuild it and reduce logrotate retention from 7 days to say 2 for the local db backups | 22:16 |
clarkb | we've also got a etherpad/etherpad image we can delete | 22:16 |
clarkb | that image isn't very large though so that's a small win | 22:17 |
ianw | ok, i'll move .bup to /opt and re-run to make sure it works | 22:17 |
clarkb | thanks | 22:18 |
ianw | ok, it's re-buping | 22:20 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: etherpad: reduce backup rotations https://review.opendev.org/762008 | 22:24 |
*** slaweq has quit IRC | 22:39 | |
ianw | hrm, bup failed -> http://paste.openstack.org/show/tZVvieAOdZPTXW0KYVMg/ | 22:42 |
ianw | with an out of disk error | 22:42 |
ianw | i'm going to move a bunch of the dump backups to /opt | 22:46 |
openstackgerrit | Merged openstack/diskimage-builder master: yum-minimal: Add centos-stream-repos package for centos-8-stream https://review.opendev.org/760951 | 22:46 |
clarkb | :/ | 22:49 |
clarkb | I've run `sudo docker image rm etherpad/etherpad:1.8.0` for completeness on etherpad | 22:58 |
clarkb | ianw: ^ fyi | 22:59 |
*** fressi has quit IRC | 22:59 | |
ianw | ok, i'm trying this bup backup again with more free space on / | 23:00 |
ianw | 39G 29G 11G | 23:01 |
ianw | i'll see if it goes down | 23:01 |
clarkb | it was 39G 25G 15G a couple minutes before your paste | 23:01 |
ianw | it's now at 9gb | 23:02 |
ianw | so ... yeah :/ | 23:02 |
clarkb | iirc the .bup index is tracking all the files | 23:02 |
clarkb | that is why we excluded the job runtime files on zuul because there are many of them | 23:02 |
clarkb | I wonder if we've got another place with many files we might want to exclude on etherpad | 23:02 |
clarkb | perhaps the mariadb content? | 23:02 |
ianw | down to 6.5gb now | 23:02 |
ianw | i wonder if we should just get this on borg and not spend too much time debugging this | 23:03 |
clarkb | ya that may be the best path forward at this point | 23:03 |
ianw | the only really important thing here is the .sql dumps | 23:03 |
ianw | yeah, it's down to 2g free space, it's going to run out | 23:04 |
*** iurygregory has quit IRC | 23:04 | |
ianw | i'll just roll-out 761855 and watch it | 23:04 |
clarkb | now it says 25GB free ? | 23:04 |
fungi | i'm in favor of pushing ahead with etherpad's assimilation | 23:05 |
clarkb | ianw: did you stop the bup run? or did it resolve that on its own? | 23:05 |
fungi | it'll happen eventually anyway (resistance is futile) | 23:05 |
clarkb | maybe it was using /tmp for spooling? | 23:05 |
ianw | yeah, it just died | 23:05 |
clarkb | fun | 23:05 |
ianw | ok, will let the borg rollout apply then run that manually so we know we've got some good offsite backups of it | 23:07 |
clarkb | ianw: I wonder if we put /var/etherpad/db/* in the bup excludes if it would be happier | 23:08 |
clarkb | possible we want to add ^ to be borg excludes? | 23:08 |
clarkb | but ya I think we can focus on borg now | 23:08 |
ianw | yes, we probably do want that actually, as we just want to store the daily dumps i guess | 23:08 |
ianw | no point storing an in-flight db | 23:08 |
clarkb | yup and the in-flight db probably isn't something we can recover from | 23:09 |
clarkb | oh also for review we mounted the old snapshot | 23:09 |
clarkb | I wonder if we back that up :/ | 23:09 |
*** lourot has quit IRC | 23:09 | |
clarkb | fungi: ^ should we go ahead and unmount and detach that volume now? | 23:10 |
* fungi checks | 23:10 | |
fungi | i've umounted /mnt/2020-20-01_snapshot just now | 23:10 |
clarkb | /mnt/2020-20-01_snapshot is where we mounted that and /mnt/* is in our excludes | 23:10 |
fungi | yeah, i was about to surmise we might already exclude /mnt for precisely this reason | 23:11 |
clarkb | fungi: thanks, we should probably proceed with detaching the volume too? what is the process for that again? we have to unload it from lvm somehow? | 23:11 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: etherpad: ignore live db for borg backups https://review.opendev.org/762012 | 23:11 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: etherpad: ignore live db for borg backups https://review.opendev.org/762012 | 23:11 |
clarkb | there are two "main"s in vgs | 23:12 |
clarkb | which seems like something we don't want to get wrong :/ | 23:12 |
fungi | i just ran `sudo lvchange -an main/test-gerrit` | 23:12 |
fungi | lvs no longer shows the active flag for it | 23:12 |
fungi | unfortunately it's in the same vg as the production volume | 23:13 |
ianw | oh, there was also that extra volume i added | 23:13 |
fungi | hopefully cinder/nova will let us detach it | 23:13 |
ianw | that was not lvm'd, just for scratch space. once you've cleaned that up i'll remove that one too | 23:13 |
clarkb | fungi: I don't think it is in the same vg? they are listed separately | 23:14 |
*** iurygregory has joined #opendev | 23:16 | |
clarkb | fungi: `sudo lvs -o lv_name,lv_uuid,vguuid` should show the differentiation I think | 23:16 |
clarkb | Lu4enc-MUnC-ToPA-z7zc-ZrMJ-GP6g-QAIhpz is the vg to disable based on that | 23:16 |
fungi | oh neat | 23:16 |
fungi | yep | 23:16 |
fungi | i just looked at the names and assumed it merges them | 23:17 |
fungi | Volume group "Lu4enc-MUnC-ToPA-z7zc-ZrMJ-GP6g-QAIhpz" not found | 23:17 |
fungi | mmm... | 23:17 |
fungi | doesn't look from the manpage like vgchange has a way to take a vg uuid | 23:18 |
clarkb | fungi: internet says we can rename the vg with uuid | 23:18 |
clarkb | then use vgchange on the new name | 23:18 |
fungi | oh, yeah that should work | 23:18 |
clarkb | that is nice because we can confirm we got the right one with vgs prior to stopping too | 23:19 |
fungi | sudo vgrename Lu4enc-MUnC-ToPA-z7zc-ZrMJ-GP6g-QAIhpz notmain | 23:19 |
fungi | Volume group "main" successfully renamed to "notmain" | 23:19 |
fungi | that works | 23:19 |
fungi | 0 logical volume(s) in volume group "notmain" now active | 23:19 |
fungi | okay, should really be ready for detachment now i guess | 23:20 |
clarkb | fungi: does vgs not show that it is inactive? | 23:21 |
fungi | there's no active flag for volume groups, just volumes | 23:21 |
clarkb | fungi: are you working on detaching it next or should I do that? | 23:22 |
clarkb | (and are we ready to try that?) | 23:23 |
fungi | `vgchange -a n` deactivates all volumes in a vg, rather than `lvchange -a n foo/bar` which deactivates volume bar in vg foo | 23:23 |
fungi | so the `vgchange -a n notmain` was just a no-op anyway | 23:23 |
fungi | and yeah, we can work on detaching now, i'll give it a go | 23:24 |
clarkb | fungi: I see, but we ran the vgchange -a n notmain? | 23:24 |
fungi | yeah, sorry, `vgchange -a n foo` deactivates all volumes in vg "foo" | 23:25 |
fungi | i had already deactivated the only lv in that vg, so deactivating the whole vg was a no-op | 23:25 |
fungi | only logical volumes are active or inactive, not volume groups | 23:26 |
clarkb | got it | 23:26 |
fungi | also i checked and the pv for the notmain vg is /dev/xvdd1 | 23:26 |
clarkb | ya and that seems to line up with what cinder is saying for what I believe is the volume | 23:27 |
clarkb | xcdc is the prod one which lines up with vgs | 23:28 |
clarkb | *xvdc | 23:28 |
fungi | argh, can't use osc for looking at the volume list in rax | 23:28 |
clarkb | ohya use my osc | 23:28 |
fungi | i found my outdated venv and it's working | 23:29 |
clarkb | 4cool | 23:29 |
clarkb | heh don't know where the 4 came from | 23:29 |
fungi | 2cool | 23:29 |
*** tosky has quit IRC | 23:33 | |
fungi | okay, it detached and now shows as available in volume list | 23:34 |
fungi | there's another similar available volume i suspect is the one ianw mentioned creating | 23:34 |
clarkb | no that other one is from review-test's /home/gerrit2 that we replaced with the old snapshot | 23:35 |
ianw | fungi: umm, the volume i created for the backup restore shouldn't be available, that's mounted at /backup | 23:35 |
clarkb | ya it's the one we moved aside for review-test in order to start from the old state and build forward again | 23:35 |
fungi | ahh, okay, i missed we had anything mounted there | 23:35 |
clarkb | we swapped it out with the snapshot | 23:35 |
clarkb | I figure we'll clean up all the review-test stuff post upgrade. maybe after testing a 3.3 upgrade | 23:36 |
fungi | oh, i see, /backup is a raw blockdev not a pv | 23:36 |
clarkb | ya | 23:36 |
fungi | shall i unmount it too? | 23:36 |
clarkb | no I think ianw is still curating content on it | 23:36 |
fungi | oh okay | 23:36 |
clarkb | vgs looks good to me now | 23:37 |
clarkb | thanks for helping with that. My lvm foo is lacking | 23:37 |
ianw | yeah, let's get what we need off it onto the main disk then we can remove | 23:37 |
openstackgerrit | Merged opendev/system-config master: Add all backup hosts to borg backups https://review.opendev.org/761855 | 23:39 |
fungi | okay, i've deleted volume 765225a9-ab7c-40a3-a0ef-01f33bf498a8 now, the one we created from the snapshot | 23:40 |
*** lourot has joined #opendev | 23:43 |
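
The log processor change discussed between 17:16 and 17:24 streams log files instead of loading them whole, and clarkb notes that DEBUG lines should be dropped before indexing. The following is only a minimal sketch of that idea, not the actual puppet-log_processor code: the function name `stream_non_debug_lines`, the `" DEBUG "` matching rule, and the sample lines are all assumptions made for illustration. Any iterable of lines works as input, so an open file object or a streamed HTTP response body could be passed in the same way.

```python
import io
from typing import Iterable, Iterator


def stream_non_debug_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield log lines one at a time, skipping DEBUG-level lines.

    Because this is a generator over an iterable, only one line is
    held in memory at any moment, however large the log file is.
    """
    for line in lines:
        line = line.rstrip("\n")
        # Hypothetical level check: assumes the level appears as a
        # space-delimited token somewhere in the line.
        if " DEBUG " in line:
            continue
        yield line


# Made-up sample data standing in for a large remote log file.
sample = io.StringIO(
    "2020-11-09 16:25:14.440 1 INFO example.service [-] starting\n"
    "2020-11-09 16:25:15.000 1 DEBUG example.service [-] noisy detail\n"
    "2020-11-09 16:25:16.610 1 WARNING example.service [-] something odd\n"
)
kept = list(stream_non_debug_lines(sample))
print(kept)
```

A generator keeps the filtering step independent of where the lines come from, which is the property that matters when the input is too large to buffer.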
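
At 17:43 clarkb explains that when a line's timestamp cannot be parsed, the time of parsing is used instead, which would account for the out-of-order `@timestamp` values melwitt saw. The sketch below illustrates that fallback under stated assumptions: the expected format string, the 23-character prefix, and the name `event_timestamp` are hypothetical and are not taken from the real logstash rules.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumed timestamp layout for service logs; the real rules may differ.
EXPECTED_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
PREFIX_LENGTH = 23  # length of e.g. "2020-11-09 16:25:16.610"


def event_timestamp(line: str, now: Optional[datetime] = None) -> datetime:
    """Return the timestamp parsed from the line, or the parse time."""
    if now is None:
        now = datetime.now(timezone.utc)
    try:
        parsed = datetime.strptime(line[:PREFIX_LENGTH], EXPECTED_FORMAT)
    except ValueError:
        # Unparseable prefix (for example a syslog-style "Nov 09 ..."):
        # fall back to the moment the line is being processed.
        return now
    return parsed.replace(tzinfo=timezone.utc)


# Pretend the file is processed about 18 minutes after it was written.
parse_time = datetime(2020, 11, 9, 16, 43, 24, tzinfo=timezone.utc)
first_in_file = event_timestamp(
    "Nov 09 16:25:14.440258 host systemd[1]: Started example.service.",
    now=parse_time,
)
later_in_file = event_timestamp(
    "2020-11-09 16:25:16.610 INFO example [-] Logging enabled!",
    now=parse_time,
)
print(first_in_file, later_in_file)
```

Here the line that appears first in the file ends up with the later timestamp, so sorting by timestamp no longer matches file order.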