Tuesday, 2021-01-19

*** DSpider has quit IRC00:02
*** tosky has quit IRC00:02
fungis almost caught up on node requests00:04
fungidown to about 500 pending00:05
ianwok afs02.dfw upgraded and built, has the key material, going to reboot it00:13
ianwok, it's up, seems to be running ok00:16
ianwfungi: want to try the mirror.deb-nautilus?00:16
fungiyeah, will start it momentarily00:17
fungiand it's off to the races00:17
fungiStarting ForwardMulti from 536871072 to 536871072 on afs02.dfw.openstack.org (full release).00:17
fungihopefully will only take a few minutes to complete00:18
fungiand done00:19
funginot even a few minutes00:19
fungiReleased volume mirror.deb-nautilus successfully00:19
fungican do octopus too if you want, it's a few times larger00:19
ianwnice, yeah i think so.  i think i might work on the db servers and leave afs01 for last00:19
ianwafter all that, i think try an in-place distro upgrade on afs01.ord to see how that goes.  then we can roll that through, theoretically with zero downtime00:21
fungiReleased volume mirror.deb-octopus successfully00:24
openstackgerritMerged opendev/system-config master: Fix review01's fqdn in infratesting  https://review.opendev.org/c/opendev/system-config/+/77124200:30
fungiianw: shall i go ahead with some of the larger mirror volumes or is that going to conflict with upgrade work?00:31
fungiopensuse and centos are the next smallest at around 200GiB00:32
ianwfungi: the only question i guess is if we want to try stopping the queue to update afs01.  i note that thanks to clarkb's rebase i've also approved those gerrit changes00:33
ianwthe one i'd be most worried about having in sync mirrors is tarballs, did that get done?00:34
fungiyeah, tarballs is up to date00:41
fungimy main concern is when services stop on afs01.dfw, it's the only place serving current mirror volumes, so those will all go offline globally and break almost every running job00:41
fungiwhich given we're running at full tilt, will probably be a lot of collateral damage00:42
ianwagree, my main concern on the other side is that we don't really understand what running 1.6 means at this point.  i did think we were thinking it would result in randomish failures00:44
ianwrunning a 1.6 that has been restarted post timestamp rollover -- which this now has00:44
auristor1.6 won't result in random failures until a restart00:44
auristorand then it either fails and you restart it again or it doesn't00:45
ianwit has been restarted, due to a failure of the cloud provider00:45
auristorthen either it failed or it didn't00:45
ianwok, well i guess it didn't! :)00:45
ianwi still wonder if a gate reset to get us completely out of worrying about it all might be a trade-off that is worth it00:46
fungithanks for the clarification auristor! i guess we rolled lucky dice this time00:46
auristorby the 31st the risk of failure after restart will be reduced to where it was on Sunday00:47
fungiand increasingly less as the timestamp has more and more non-zero bits i guess00:47
auristorthe best practice before maintenance is to vacate the fileserver, wait two hours for the client volume location cache to expire, and then perform the maintenance.00:48
fungiyeah, at the moment we have a number of very large volumes which, due to unrelated failures in the cloud service provider we're using, are only current on one server and so we're weighing the cost/benefit of getting those replicas back to normal first before upgrading, so that we don't cause an outage for the volumes in question00:50
fungibut the replication time for them is going to be on the order of a week to complete00:50
auristorI would get those volumes replicated to protect against data loss if the OS upgrade goes south00:50
fungiwell, for now we're upgrading the openafs packages, os upgrades will come later once that's finished00:51
fungii think we're less worried about waiting for the os upgrades00:51
ianwyeah, os upgrades we can round-robin once this is all over00:51
fungiwe need to get operating systems upgraded before ~april00:52
auristorsince this is all cloud, I would stand up a third fileserver using the new OS version, then when that is operational, move the RWs to that server and add RO sites to that server, then remsite the volumes from afs01.dfw, wait two hours, turn off the afs01.dfw fileserver, reinstall the OS and deploy a new fileserver, then addsite to that server, release to that server, then decommission afs02.dfw00:53
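
The replacement sequence auristor describes can be sketched as an ordered dry-run plan. This is a minimal illustration under stated assumptions: the function name, the third server name `afs03.dfw.openstack.org`, the `vicepa` partition default, and the extra `vos release` step are hypothetical and not taken from this log. Nothing is executed; the steps are only returned as strings.

```python
from typing import List


def replacement_plan(volumes: List[str], old: str, new: str,
                     partition: str = "vicepa",
                     cache_wait_hours: int = 2) -> List[str]:
    """Return the ordered steps for draining `old` onto `new` (dry run only)."""
    steps: List[str] = []
    for vol in volumes:
        # Move the read-write volume to the new fileserver.
        steps.append(
            f"vos move -id {vol} -fromserver {old} -frompartition {partition} "
            f"-toserver {new} -topartition {partition}"
        )
        # Add a read-only site on the new fileserver.
        steps.append(f"vos addsite -server {new} -partition {partition} -id {vol}")
        # Assumption: release so the new read-only site is populated
        # before the old site is removed.
        steps.append(f"vos release -id {vol}")
        # Remove the read-only site from the old fileserver.
        steps.append(f"vos remsite -server {old} -partition {partition} -id {vol}")
    # Clients cache volume locations, so wait before stopping the old server.
    steps.append(f"wait {cache_wait_hours}h for client volume location caches to expire")
    steps.append(f"shut down the fileserver on {old} and rebuild it")
    return steps


if __name__ == "__main__":
    for step in replacement_plan(["mirror.ubuntu"],
                                 "afs01.dfw.openstack.org",
                                 "afs03.dfw.openstack.org"):
        print(step)
```

Returning the plan as data rather than running it keeps the sketch safe to read and review; a real run would need the cell's admin credentials and a check after every step.
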
fungiat the moment the focus is on upgrading from openafs 1.6 to 1.8 w/ the timestamp bug fixes00:53
auristorat the moment there exists a single point of failure that IMO should be the priority00:54
fungiyeah, that's not a terrible idea for the os upgrades, we're mostly concerned there about the pain of changing ip addresses when replacing servers (it's like you say, basically adding and removing servers from the cluster rather than straight up replacing a server with a new build)00:54
auristorIP addresses can change willy nilly for fileservers.   it's the location servers that cannot change00:55
fungiand yes, i think the current spof is the greater risk, especially what with a service provider whose iron (both for storage and virtual machines) likes to act up frequently00:55
auristorA client is told where volumes reside with a list of fileserver UUIDs not IP addresses.   The clients query the location service to resolve UUID to IP address.   The fileserver registers its current IP addresses for its UUID each time it restarts.00:56
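
The UUID indirection auristor describes can be illustrated with a toy model. This is a sketch only: the class, its method names, and the example addresses are hypothetical, and it does not implement the real location service protocol. It shows why a fileserver's IP address can change without clients needing new volume locations.

```python
from typing import Dict, List


class LocationService:
    """Toy stand-in for the volume location service (not the real protocol)."""

    def __init__(self) -> None:
        self._addresses: Dict[str, List[str]] = {}  # fileserver UUID -> current IPs
        self._sites: Dict[str, List[str]] = {}      # volume name -> fileserver UUIDs

    def register(self, uuid: str, addresses: List[str]) -> None:
        # A fileserver re-registers its current addresses on every restart,
        # replacing whatever was recorded before.
        self._addresses[uuid] = list(addresses)

    def add_site(self, volume: str, uuid: str) -> None:
        # Volume locations are recorded by fileserver UUID, never by IP.
        self._sites.setdefault(volume, []).append(uuid)

    def locate(self, volume: str) -> List[str]:
        # Resolve volume -> UUIDs -> whatever IPs are registered right now.
        ips: List[str] = []
        for uuid in self._sites.get(volume, []):
            ips.extend(self._addresses.get(uuid, []))
        return ips


if __name__ == "__main__":
    vl = LocationService()
    vl.register("uuid-a", ["203.0.113.10"])
    vl.add_site("mirror.ubuntu", "uuid-a")
    print(vl.locate("mirror.ubuntu"))
    vl.register("uuid-a", ["203.0.113.99"])  # server came back with a new IP
    print(vl.locate("mirror.ubuntu"))
```
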
fungiso i'm inclined to not restart services on afs01.dfw if we can avoid it while trying to get the remaining volumes replicated again00:56
ianwfungi: my issue there is that if it shuts down again, we again roll dice restarting it.  if we took a short outage to update it, we can do the replications with more confidence IMO00:57
auristorand if it fails to restart (by which the failure is an inability to register with the location service), you restart it again until it works00:58
fungiianw: can you estimate the service downtime you saw for the other fileservers? maybe if it's just a minute or so then the number of job failures won't be too great00:58
ianwfungi: it's really just the time to reboot the server; on that order of a minute or so00:58
fungiyeah, the risk with restarts, as we've seen, is that the provider might have a problem and none of us is awake/paying attention for some hours00:59
ianwalso, just from a code-motion POV, that allows us to cull the puppet bits, simplify the groups as i work on the db servers00:59
fungiso it could come up with the service failing to start and then sit like that for hours until someone discovers everything's broken00:59
ianwwell yeah, i wouldn't like fricker to have to parse all this almost cold turkey :)01:00
fungiwe're down to around a 250 node request backlog in zuul, maybe once all builds have their node assignments we give it a few minutes and then upgrade? hopefully that minimizes the impact on serving mirror volumes01:01
fungirunning jobs tend to only hit package mirrors in the first few minutes anyway01:01
ianwfungi: if that's the general process we're agreeing on, i should install the PPA, build the modules and get the system-config change ready01:02
ianwthe only other thing we might like to queue up is if we stop afs01 and then quickly recreate the LVM snapshot.  that should ensure at the worst case we can go back to that.  but we've now done two other servers that are working fine01:03
fungiyeah, not a bad idea01:04
fungibasically i just want to avoid undue disruption to the running builds, while also getting sync back on track for the read-only replicas on afs02.dfw, but not at the expense of creating or blocking more work for any of us if we can help it01:05
openstackgerritIan Wienand proposed opendev/system-config master: Move afs01.dfw into afs-1.8 group  https://review.opendev.org/c/opendev/system-config/+/77129301:05
fungisince the only thing we're more strapped for than build resources right now is people01:06
openstackgerritMerged opendev/system-config master: run-selenium: run selenium on a node  https://review.opendev.org/c/opendev/system-config/+/76707801:07
ianwfungi: yeah, i think if we get things ready, we've got only a small window of reboot time.  i then feel better about letting the replications run as they need to with the fileservers at least consistent01:09
ianwi need to just feed people here, bib01:09
openstackgerritMerged opendev/system-config master: gerrit: Initalize in testing  https://review.opendev.org/c/opendev/system-config/+/76522401:11
openstackgerritMerged opendev/system-config master: gerrit: move plugins to common code  https://review.opendev.org/c/opendev/system-config/+/76726901:12
openstackgerritMerged opendev/system-config master: openafs-server: ensure vos_release keys installed on new servers  https://review.opendev.org/c/opendev/system-config/+/77128401:12
fungioh, i've also been holding a lock for updates to the mirror.deb-docker volume, which is tiny so we could also use that as initial testing after the afs01.dfw restart, before trying to resume updates for larger mirror volumes01:21
fungii guess we're not tracking that one in grafana either01:21
openstackgerritMerged opendev/system-config master: bazelisk-build: specify targets as list  https://review.opendev.org/c/opendev/system-config/+/76727201:23
openstackgerritMerged opendev/system-config master: Move afs02.dfw.openstack.org to afs-1.8 group  https://review.opendev.org/c/opendev/system-config/+/77128501:23
ianwinfra-prod-service-afs (3. attempt) wasn't happy, let me debug01:32
ianw warning: failed to remove playbooks/filter_plugins/__pycache__/getaddrinfo.cpython-36.pyc: Permission denied01:32
ianwsigh, i guess that ansible running out of system-config puts in .pyc files01:33
ianwhang on, if we git ignore these, then "git clean" will ignore them ...?01:34
ianwhrm, already ignore *.pyc01:35
ianwno that's silly, git clean is explicitly removing things we ignore as they're untracked01:37
openstackgerritxinliang proposed openstack/diskimage-builder master: Add rhel support for iscsi-boot  https://review.opendev.org/c/openstack/diskimage-builder/+/77070201:39
fungiyeah, untracked files should be okay as they wouldn't prevent pulling01:45
fungior being pushed into by zuul01:45
ianwthis is not really what i wanted to be debugging :/01:47
fungiwe could consider reverting that in zuul-jobs as it's had unanticipated side effects01:48
ianwi do agree it's right; but maybe it needs to become: yes01:49
fungiat first blush, that seems like a suitable compromise01:51
ianwalthough this got reverted https://opendev.org/zuul/zuul-jobs/commit/1e92a67db6f5fa3f3284d5b1928f104c428187f3 ; it does show that there's some history of become: yes in there01:52
fungii suppose we only want to become for a git clean?01:58
ianwyeah i'm trying splitting it01:58
*** ysandeep|away is now known as ysandeep02:07
openstackgerritIan Wienand proposed zuul/zuul-jobs master: mirror-workspace-git-repos: run clean as root  https://review.opendev.org/c/zuul/zuul-jobs/+/77129702:17
ianwfungi: it looks like node requests have really dropped.  i feel like we should put afs01.dfw in emergency and do the upgrade soonish, and get to releasing volumes with everything consistent02:21
fungiyeah, i'm good with that. i'll add it to the emergency disable list now02:22
ianwi'm going to install the ppa and build the modules.  we can look at the snapshot too02:22
fungiand added02:22
fungii presume we can blow away the previous lvm snapshot02:24
ianwright, so lvremove it, shutdown afs server and recreate it02:26
fungido you have the lvcreate command you used last time handy? if not i can whip something up02:28
ianwoh, unmount it first02:28
fungii'll do that now02:29
ianwDo you really want to remove and DISCARD active logical volume vicepa_snap? [y/n]:02:29
ianwagree yes?02:30
fungioh, no need to umount, it's not mounted. probably because of the reboot02:31
fungiand yeah, you can agree with that02:32
fungior you could lvchange -an that volume first02:32
ianwlvcreate -l100%FREE -s -n vicepa_snap /dev/main/vicepa02:32
ianwis the command to make it02:32
fungiwhich avoids the prompt02:32
ianwmodules are built02:32
fungifor future reference02:32
ianwok, i'm going to stop afs, create snapshot and reboot as fast as possible02:33
fungii guess we want to stop afs services and umount /vicepa before creating the snapshot?02:33
ianwok, yeah unmount it too02:33
fungithat helps ensure it's quiescent02:33
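
The snapshot refresh discussed above can be sketched as an ordered command list. The `lvcreate` invocation and the `lvchange -an` hint are quoted from this log; the function name, the `/dev/main/vicepa_snap` device path, and the `openafs-fileserver` service name are assumptions for illustration. The commands are only built and printed, not executed.

```python
from typing import List


def snapshot_refresh_commands(vg: str = "main", lv: str = "vicepa",
                              snap: str = "vicepa_snap",
                              mountpoint: str = "/vicepa",
                              service: str = "openafs-fileserver") -> List[List[str]]:
    """Build (but do not run) the commands to replace an LVM snapshot."""
    snap_dev = f"/dev/{vg}/{snap}"  # assumed device path for the old snapshot
    return [
        # Per the log, deactivating first avoids the interactive
        # remove-and-discard prompt (not verified here).
        ["lvchange", "-an", snap_dev],
        ["lvremove", snap_dev],
        # Stop the fileserver and unmount so the origin volume is quiescent.
        # The service name is an assumption.
        ["systemctl", "stop", service],
        ["umount", mountpoint],
        # The lvcreate command quoted in the log.
        ["lvcreate", "-l100%FREE", "-s", "-n", snap, f"/dev/{vg}/{lv}"],
    ]


if __name__ == "__main__":
    for argv in snapshot_refresh_commands():
        print(" ".join(argv))
```

In the log the server is rebooted right after the snapshot is created, which brings the filesystem and services back up.
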
ianwdone, rebooting02:36
fungiin retrospect we probably should have also touched /fastboot02:40
*** hemanth_n has joined #opendev02:40
ianwit should all be back, poking02:42
ianwfungi: i think we're good.  do you want to try a smaller release?02:43
fungiyup, on it02:43
fungiReleased volume mirror.deb-docker successfully02:46
fungiseveral times because that one's a loop02:46
fungii can also do mirror.apt-puppetlabs quickly, another we're not tracking in the dashboard02:47
ianwyeah, i need a script to keep things in order02:47
fungiReleased volume mirror.apt-puppetlabs successfully02:49
fungilooks like things are still working02:50
ianwnice.  well at this point our deployment jobs are borked02:50
ianwso even if we wanted to merge the sequential release change i have, it wouldn't get onto the servers ATM02:51
ianwso i think fungi, if you want to just manually sequentially release volumes we can hopefully get back into sync02:51
fungii can still access volumes which are only current on afs01.dfw so i think we can move forward02:52
fungiand yeah, i'll queue some up to run sequentially in screen now02:53
ianwat this point afs01|02.dfw are in emergency.  i think that's the best way to leave it.  when deployment happens we can double check again afs02.ord is happy and then remove them from emergency02:56
ianwi'll cleanup system-config and write the afsdb ansible bits now02:56
fungireprepro-mirror-update /etc/reprepro/ubuntu mirror.ubuntu 2>&1 | tee /var/log/reprepro/ubuntu.log ; reprepro-mirror-update /etc/reprepro/ubuntu-ports mirror.ubuntu-ports 2>&1 | tee /var/log/reprepro/ubuntu-ports.log ; reprepro-mirror-update /etc/reprepro/ubuntu-cloud-archive mirror.ubuntu-cloud 2>&1 | tee /var/log/reprepro/ubuntu-cloud-archive.log ; centos-mirror-update mirror.centos 2>&1 | tee03:04
fungi/var/log/rsync-mirrors/centos.log ; opensuse-mirror-update mirror.opensuse 2>&1 | tee /var/log/rsync-mirrors/opensuse.log ; fedora-mirror-update mirror.fedora 2>&1 | tee /var/log/rsync-mirrors/fedora.log ; reprepro-mirror-update /etc/reprepro/debian mirror.debian 2>&1 | tee /var/log/reprepro/debian.log ; reprepro-mirror-update /etc/reprepro/debian-security mirror.debian-security 2>&1 | tee03:04
fungithat's what i'm running03:05
fungiapologies for the length03:05
fungiokay, starting that now03:07
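
The chained `cmd 2>&1 | tee log ; next` pattern fungi pasted runs each update in order, logs combined output, and carries on even if one fails. A minimal sketch of the same behaviour follows; the function name and the placeholder log paths are hypothetical, and this is not the script actually run on the server.

```python
import subprocess
from typing import List, Tuple


def run_sequentially(jobs: List[Tuple[List[str], str]]) -> List[int]:
    """Run each command in order, tee its output to a log, keep going on failure."""
    exit_codes: List[int] = []
    for argv, log_path in jobs:
        # Like plain `tee`, truncate the log rather than append.
        with open(log_path, "w") as log:
            proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                    stderr=subprocess.STDOUT, text=True)
            for line in proc.stdout:
                print(line, end="")  # echo to the terminal
                log.write(line)      # and copy to the log file
            exit_codes.append(proc.wait())
    return exit_codes


if __name__ == "__main__":
    # Placeholder commands; the real ones are the mirror update scripts above.
    codes = run_sequentially([
        (["echo", "first"], "/tmp/first.log"),
        (["echo", "second"], "/tmp/second.log"),
    ])
    print(codes)
```

Collecting the exit codes makes it possible to see afterwards which volume updates need a retry, something the `;`-separated shell chain does not report.
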
*** jhesketh_ has joined #opendev03:40
*** jhesketh has quit IRC03:41
*** jhesketh_ is now known as jhesketh03:43
fungilooks like i forgot the epel mirror in that, but otherwise should be all of them03:45
openstackgerritIan Wienand proposed opendev/system-config master: Remove afs-1.8 group  https://review.opendev.org/c/opendev/system-config/+/77129303:52
*** ysandeep is now known as ysandeep|pto03:55
openstackgerritIan Wienand proposed zuul/zuul-jobs master: mirror-workspace-git-repos: run clean as root  https://review.opendev.org/c/zuul/zuul-jobs/+/77129704:01
*** lbragstad has quit IRC04:16
*** ykarel has joined #opendev04:18
*** hemanth_n has quit IRC04:40
*** hemanth_n has joined #opendev04:44
openstackgerritIan Wienand proposed opendev/system-config master: Manage afsdb servers with Ansible  https://review.opendev.org/c/opendev/system-config/+/77134004:53
openstackgerritIan Wienand proposed opendev/system-config master: Manage afsdb servers with Ansible  https://review.opendev.org/c/opendev/system-config/+/77134005:14
openstackgerritIan Wienand proposed opendev/system-config master: Remove AFS puppet  https://review.opendev.org/c/opendev/system-config/+/77134205:14
openstackgerritIan Wienand proposed zuul/zuul-jobs master: mirror-workspace-git-repos: run clean as root  https://review.opendev.org/c/zuul/zuul-jobs/+/77129705:21
openstackgerritMerged opendev/system-config master: gerrit: get files from bazel build dir  https://review.opendev.org/c/opendev/system-config/+/76743305:29
openstackgerritMerged opendev/system-config master: gerrit: Install zuul-summary-results plugin  https://review.opendev.org/c/opendev/system-config/+/76707905:30
*** guillaumec has quit IRC05:40
ianwsince it's quiet, i might deploy ^ to get the plugin05:43
*** guillaumec has joined #opendev05:44
ianw#status log restarted gerrit to get zuul-summary-results; see also http://lists.openstack.org/pipermail/openstack-discuss/2021-January/019885.html\06:00
openstackstatusianw: finished logging06:00
ykarelianw, you mean summary visible around Files tab? i don't see it06:04
ykareli see two tabs: Files and Findings06:05
ykarelzuul summary will be around those only, not sure /me looking correctly06:05
ianwykarel: you might need a shift-reload06:05
ykarelianw, i did that but still don't see06:06
ykarellet me try on different browser06:06
ykarelokk i see new tab in chrome06:07
ykarelmay be some issue on my firefox06:07
ianwi use firefox and am seeing it there, and also just checked on chrome06:07
ianwit's very basic so wouldn't imagine browser would matter much06:08
ykarelok, i see SyntaxError: "invalid regexp group" likely that causing it06:14
*** ykarel_ has joined #opendev06:16
ykarel_ok, i see SyntaxError: "invalid regexp group" likely that causing it06:17
*** ykarel has quit IRC06:19
*** hemanth_n has quit IRC06:23
*** marios has joined #opendev06:23
*** hemanth_n has joined #opendev06:23
*** ykarel_ is now known as ykarel06:29
ianwhrm, what version of firefox?06:33
*** sboyron has joined #opendev06:33
*** brinzhang has quit IRC06:50
*** brinzhang has joined #opendev06:51
*** fressi has joined #opendev07:24
*** DSpider has joined #opendev07:24
*** ralonsoh has joined #opendev07:27
*** fressi has quit IRC07:31
*** DSpider has quit IRC07:31
*** brinzhang has quit IRC07:32
*** hemanth_n has quit IRC07:32
*** DSpider has joined #opendev07:32
*** hemanth_n has joined #opendev07:33
*** fressi has joined #opendev07:38
*** lpetrut has joined #opendev07:39
*** eolivare has joined #opendev07:47
*** slaweq has joined #opendev07:48
*** fressi has quit IRC07:55
*** fressi has joined #opendev07:55
*** fressi has quit IRC07:56
*** fressi has joined #opendev08:05
*** fressi has quit IRC08:06
*** rpittau|afk is now known as rpittau08:07
*** brinzhang has joined #opendev08:07
*** andrewbonney has joined #opendev08:13
*** hashar has joined #opendev08:22
*** fressi has joined #opendev08:26
*** tosky has joined #opendev08:39
*** jpena|off is now known as jpena08:58
*** sboyron has quit IRC09:06
*** sboyron_ has joined #opendev09:06
openstackgerritLee Yarwood proposed opendev/elastic-recheck master: Add query for bug 1912310  https://review.opendev.org/c/opendev/elastic-recheck/+/77138809:13
openstackbug 1912310 in OpenStack Compute (nova) "libvirt.libvirtError: unable to connect to server at " [Undecided,New] https://launchpad.net/bugs/191231009:13
ianw03:27:38 -> 08:59:54 for a full ubuntu release09:23
*** brinzhang has quit IRC09:29
*** sboyron has joined #opendev09:43
*** sboyron_ has quit IRC09:43
openstackgerritHervé Beraud proposed openstack/project-config master: Adding irc notification for missing oslo projects  https://review.opendev.org/c/openstack/project-config/+/77139209:49
guillaumecianw, ykarel https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/78#javascript "Named capture groups"09:50
*** hashar is now known as hasharOut09:59
*** sboyron_ has joined #opendev10:05
*** sboyron has quit IRC10:08
openstackgerritMartin Kopec proposed opendev/system-config master: WIP Deploy refstack with ansible docker  https://review.opendev.org/c/opendev/system-config/+/70525810:17
*** brinzhang has joined #opendev10:28
ykarelguillaumec, Thanks, so will need to update firefox10:59
ykarelafter updating firefox no longer see that syntax error and Zuul Summary is visible11:04
*** ysandeep|pto is now known as ysandeep11:14
openstackgerritMartin Kopec proposed opendev/system-config master: WIP Deploy refstack with ansible docker  https://review.opendev.org/c/opendev/system-config/+/70525811:18
*** marios has quit IRC11:57
*** dpawlik has quit IRC11:57
*** logan- has quit IRC11:57
*** avass has quit IRC11:57
*** paladox has quit IRC11:57
*** paladox has joined #opendev11:57
*** avass has joined #opendev11:58
*** lpetrut_ has joined #opendev11:58
*** marios has joined #opendev11:58
*** logan- has joined #opendev12:00
*** lpetrut has quit IRC12:01
openstackgerritGorka Eguileor proposed zuul/zuul-jobs master: Fix CentOS wheel mirror URL  https://review.opendev.org/c/zuul/zuul-jobs/+/77142812:02
*** eolivare_ has joined #opendev12:14
*** dpawlik has joined #opendev12:15
*** eolivare has quit IRC12:16
*** artom has quit IRC12:28
*** artom has joined #opendev12:28
*** jpena is now known as jpena|lunch12:34
*** hasharOut is now known as hashar12:45
*** ttx has quit IRC12:51
*** hemanth_n has quit IRC12:52
*** eolivare_ has quit IRC12:53
*** fressi has quit IRC12:56
*** ykarel has quit IRC13:01
*** ttx has joined #opendev13:04
openstackgerritLuigi Toscano proposed openstack/project-config master: cursive: prepare to move the jobs in-tree  https://review.opendev.org/c/openstack/project-config/+/77144313:12
openstackgerritGorka Eguileor proposed zuul/zuul-jobs master: Fix CentOS wheel mirror URL  https://review.opendev.org/c/zuul/zuul-jobs/+/77142813:13
*** ykarel has joined #opendev13:24
openstackgerritGuillaume Chauvel proposed opendev/system-config master: Increase comment log text width to avoid line wrap  https://review.opendev.org/c/opendev/system-config/+/77144513:25
*** eolivare_ has joined #opendev13:26
*** jpena|lunch is now known as jpena13:34
*** lbragstad has joined #opendev13:37
*** _mlavalle_1 has quit IRC13:58
*** mlavalle has joined #opendev13:59
*** fressi has joined #opendev14:04
*** artom has quit IRC14:10
openstackgerritGuillaume Chauvel proposed opendev/system-config master: Increase comment log text width to avoid line wrap  https://review.opendev.org/c/opendev/system-config/+/77144514:25
*** slittle1 has quit IRC14:32
*** artom has joined #opendev14:32
*** artom has quit IRC14:32
*** slittle1 has joined #opendev14:33
*** artom has joined #opendev14:33
slittle1Morning all14:35
slittle1could we get eyes on https://review.opendev.org/c/openstack/project-config/+/771235 ?14:36
slittle1If acceptable, please set up Cole Walker and myself as cores.  Thanks, Scott Little14:36
*** rosmaita has left #opendev14:48
*** fressi has left #opendev14:51
*** chandankumar is now known as raukadah15:08
fungislittle1: oh, yep, i saw that go by yesterday but was firefighting, i should have a moment to look now. thanks for the reminder!15:09
openstackgerritMerged openstack/project-config master: Add PTP Notification app to StarlingX  https://review.opendev.org/c/openstack/project-config/+/77123515:22
*** hashar is now known as hasharKids15:28
*** sshnaidm|ruck is now known as sshnaidm|afk15:37
*** ysandeep is now known as ysandeep|dinner15:37
openstackgerritMerged opendev/gerritbot master: Skip notifications about WIP changes  https://review.opendev.org/c/opendev/gerritbot/+/76513015:46
*** ysandeep|dinner is now known as ysandeep15:47
*** brinzhang has quit IRC15:48
*** brinzhang has joined #opendev15:51
*** ykarel is now known as ykarel|away16:00
*** diablo_rojo has joined #opendev16:01
clarkbdocker closed my ipv6 bug says it is too complicated and risky16:03
clarkbI guess that makes it official16:03
fungithat is, docker thinks ipv6 raw addresses aren't worth supporting?16:03
*** sshnaidm|afk is now known as sshnaidm|ruck16:03
slittle1I think zuul is unhappy with the merge of   https://review.opendev.org/c/openstack/project-config/+/771235,  check out https://zuul.opendev.org/t/openstack/build/ea9b8fbb985b456fa0721f71de6f4f4416:04
clarkbfungi: I think you were looking at something like ^ yesterday?16:05
clarkb(I haven't quite caught up on that yet)16:05
fungiclarkb: ahh, yes, there was a change which merged to zuul-jobs yesterday to force a git clean, but when we run python scripts (either directly or ansible modules) from the zuul-deployed git repo as root it breaks zuul's ability to clean the repo even though the files are untracked16:07
clarkband I guess it is zuul trying to remove a file owned by root?16:07
openstackgerritGuillaume Chauvel proposed opendev/system-config master: Increase comment log text width to avoid line wrap  https://review.opendev.org/c/opendev/system-config/+/77144516:07
fungiianw has proposed a compromise to run the git clean as root using become, but obviously that needs help on nodes where zuul lacks root privs16:07
fungiyeah, zuul's git clean is trying to remove bytecode files created by the python interpreter which were created as root when run16:08
clarkbfungi: have a link to the original change? I'm curious to see what precipitated that16:08
clarkb(I can dig it up if not)16:08
fungiclarkb: https://review.opendev.org/77122016:09
fungimaybe the git clean could be best effort and the task would just ignore nonzero exit from that16:10
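
The "best effort" idea can be sketched as a wrapper that runs the clean and treats a nonzero exit as a warning rather than an error. This is an illustration, not the zuul-jobs role: the function name is hypothetical, and the exact `git clean` flags are an assumption since the log does not show them.

```python
import subprocess
from typing import Sequence


def best_effort_clean(repo_dir: str,
                      argv: Sequence[str] = ("git", "clean", "-x", "-f", "-d")) -> bool:
    """Run a clean command in repo_dir; report failure instead of raising."""
    result = subprocess.run(list(argv), cwd=repo_dir,
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            text=True)
    if result.returncode != 0:
        # e.g. root-owned .pyc files the unprivileged user cannot delete
        print(f"warning: clean exited {result.returncode} in {repo_dir}; continuing")
        print(result.stdout, end="")
        return False
    return True
```

The trade-off is that leftover files stay in the workspace silently apart from the warning, so this only suits files that cannot affect the checkout, such as ignored bytecode.
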
clarkbah we were always cleaning but now a failed clean is an error. Got it (and I guess that makes sense overall)16:10
clarkbya maybe we need to bring that up with tobiash?16:10
fungii think discussion is happening in ianw's change, finding...16:10
fungiclarkb: https://review.opendev.org/77129716:11
clarkbhrm that will fail in a lot of cases I bet (like when using containers without root?)16:12
fungii've made that suggestion in a review comment now16:14
*** ykarel|away has quit IRC16:17
*** CWalker has joined #opendev16:29
*** lbragstad_ has joined #opendev16:46
openstackgerritJeremy Stanley proposed zuul/zuul-jobs master: mirror-workspace-git-repos: ignore errors from clean  https://review.opendev.org/c/zuul/zuul-jobs/+/77129716:47
*** lpetrut_ has quit IRC16:48
*** slaweq has quit IRC16:48
*** DSpider has quit IRC16:48
*** DSpider has joined #opendev16:48
*** slaweq has joined #opendev16:49
*** lbragstad has quit IRC16:49
*** lbragstad_ is now known as lbragstad16:58
fungiwith python 3.8 and later there's an envvar you can use to control the bytecode path16:59
fungimaybe we could set PYTHONPYCACHEPREFIX in /etc/environment on bridge.o.o? but not until we're using 3.8 of course17:01
clarkbit is 3.6 currently right?17:02
fungiyeah, 3.6.9 on bionic17:03
fungithere is a python3.8 package in bionic-updates we could try17:04
fungibut it won't be the default python317:04
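
The effect of `PYTHONPYCACHEPREFIX` (Python 3.8 and later) can be shown with a small sketch: a module is imported in a child interpreter with the variable set, so bytecode lands under the prefix instead of in a `__pycache__` directory beside the source. The function name and the paths in the usage example are hypothetical; this is not how bridge is configured.

```python
import os
import subprocess
import sys


def import_with_cache_prefix(source_dir: str, module: str, prefix: str) -> None:
    """Import `module` from source_dir in a child Python with a bytecode prefix."""
    env = dict(os.environ)
    env["PYTHONPYCACHEPREFIX"] = prefix      # honoured by Python >= 3.8 only
    env["PYTHONPATH"] = source_dir           # make the module importable
    env.pop("PYTHONDONTWRITEBYTECODE", None)  # ensure bytecode is written at all
    subprocess.run([sys.executable, "-c", f"import {module}"],
                   env=env, check=True)


if __name__ == "__main__":
    import tempfile

    src = tempfile.mkdtemp()
    cache = tempfile.mkdtemp()
    with open(os.path.join(src, "example_mod.py"), "w") as f:
        f.write("VALUE = 1\n")
    import_with_cache_prefix(src, "example_mod", cache)
    # The source tree stays free of __pycache__ directories.
    print(os.path.exists(os.path.join(src, "__pycache__")))
```

With the bytecode redirected, a root-run script would no longer leave root-owned files inside the git checkout that an unprivileged clean has to remove.
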
*** brinzhang_ has joined #opendev17:09
*** mlavalle has quit IRC17:11
*** sshnaidm|ruck is now known as sshnaidm|afk17:11
*** brinzhang has quit IRC17:12
*** mlavalle has joined #opendev17:12
*** zbr3 has joined #opendev17:18
*** zbr has quit IRC17:20
*** zbr3 is now known as zbr17:20
openstackgerritJames E. Blair proposed opendev/base-jobs master: Force ownership of zuul repo dirs in opendev CD jobs  https://review.opendev.org/c/opendev/base-jobs/+/77149617:24
*** marios is now known as marios|out17:30
openstackgerritJames E. Blair proposed opendev/base-jobs master: Force ownership of zuul repo dirs in opendev CD jobs  https://review.opendev.org/c/opendev/base-jobs/+/77149617:33
openstackgerritJames E. Blair proposed opendev/base-jobs master: Force ownership of zuul repo dirs in opendev CD jobs  https://review.opendev.org/c/opendev/base-jobs/+/77149617:36
*** hamalq has joined #opendev17:40
clarkbcorvus: fungi I've +2'd it but not approved it because I don't think I'm in a good spot for monitoring it17:43
clarkbI've got to prep for the meeting momentarily and have a couple of other things to finish up17:43
corvusclarkb: any reason not to +w and just check back in a couple hours?17:44
corvusit should either break like now or work like before?17:44
clarkbwell its modifying bridge as well?17:45
clarkbmaybe that's just me being overly paranoid17:45
mordredclarkb, corvus : base-jobs patch lgtm17:45
*** d34dh0r53 has quit IRC17:47
fungii'm happy to approve it and keep an eye on things17:47
fungithe sooner we see if we're unblocked the sooner we can get other stuff deployed again17:47
clarkbfungi: thanks17:47
*** d34dh0r53 has joined #opendev17:50
openstackgerritMerged opendev/base-jobs master: Force ownership of zuul repo dirs in opendev CD jobs  https://review.opendev.org/c/opendev/base-jobs/+/77149617:55
fungiokay, now that's merged, fingers crossed. i'll try to reenqueue 771235 into deploy... that should be safe right?17:56
clarkbfungi: as long as no other changes have merged after it (you'll want to use the most recent one I think)17:56
clarkbon the system-config side I believe we always checkout master but for project-config I think it is tied to the commit17:56
fungiyeah, that's the most recent project-config change to merge17:57
fungisudo zuul enqueue --tenant openstack --pipeline deploy --project openstack/project-config --change 771235,217:59
fungitrying that17:59
fungiit's waiting for the semaphore18:00
*** eolivare_ has quit IRC18:02
*** jpena is now known as jpena|off18:18
*** dtantsur is now known as dtantsur|afk18:23
*** artom has quit IRC18:30
openstackgerritJeremy Stanley proposed opendev/base-jobs master: Use root to correct ownership of repos  https://review.opendev.org/c/opendev/base-jobs/+/77150318:33
fungicorvus: clarkb: mordred: ^ the failure looks like https://zuul.opendev.org/t/openstack/build/5e954c5436c241979305560cc8826f9d/console#1/2/2/bridge.openstack.org18:33
corvusfungi: so sorry!  i had that in there but must have accidentally deleted it in a revision :(18:34
fungioh no worries18:34
fungii should have spotted it in review18:34
fungiwill try again once that's in18:34
corvusin my defense, i didn't get much sleep last night due to unseasonal howling winds18:34
*** hasharKids has quit IRC18:34
corvusit's basically fire season again here (hot winds) right in the middle of what's supposed to be rainy season18:35
fungii can completely sympathize wrt howling winds. here we can feel the house swaying with every gust18:35
mordredcorvus: are you having fire-rain yet?18:35
corvusmordred: i think that's yosemite in early february.  and the fire-tornadoes are still farther north.18:35
fungibetween there being no appreciable wind break when you're right on the water, and the house resting on 4m tall pilings18:35
mordredcorvus: I mean - that's in a normal year. who knows - maybe you'll get lucky and get fire-tornadoes this year?18:36
corvusfungi: i can imagine.  i felt the house move just a little with some of the winds last night.  i'm used to it only moving when there's a quake on the fault up the street.18:36
corvusmordred: if 2021 wants to one-up 2020 (which it's starting to look like) i wouldn't put it past it.18:37
fungii just imagine the house is trying to rock me to sleep18:37
mordredcorvus: if the west coast is starting to suffer from damaging high winds, I worry that we're going to start having earthquakes here18:37
fungi2021, the year of the fire hurricanes18:37
corvusfungi: oh that's so much better than rocking awake18:37
mnaserclarkb: for the upcoming infra meeting, i have another meeting from 2 to 2:30, would it be possible to have my discussion point earlier in the start of the meeting (i don't think it will take much time)18:38
clarkbthe only time I realized I was in an earthquake it was because my bed felt like it had transported to the ocean for about 20 seconds at 2am18:38
clarkbmnaser: would it be easier at the end from 2:30- 3?18:39
fungimaybe he meant he had a meeting from 2:30-318:39
mnaserclarkb: sigh i goofed, i meant my meeting is 2:30 to 3, so if we could discuss it 2:00 to 2:30 (et)18:39
mnaseryes, sorry, slow day :)18:39
fungiwelcome to the club ;)18:39
clarkbmnaser: yes we can make that work18:39
*** rpittau is now known as rpittau|afk18:41
openstackgerritMerged opendev/base-jobs master: Use root to correct ownership of repos  https://review.opendev.org/c/opendev/base-jobs/+/77150318:45
fungiokay, trying again18:45
fungiit's been running a while, so not failing at the start like before at least. good sign18:52
*** artom has joined #opendev18:53
*** marios|out has quit IRC18:53
avassfungi: \o/18:56
CWalkerThank you for fixing that! :)18:57
fungiCWalker: i've added you (author of 771235) as the initial member for the starlingx-ptp-notification-armada-app-core group in gerrit now18:59
fungislittle1: ^18:59
fungiCWalker: thanks go to corvus for getting the deployment tooling back on track19:00
CWalkerExcellent, much appreciated19:00
openstackgerritJeremy Stanley proposed opendev/gerritbot master: Revert "Skip notifications about WIP changes"  https://review.opendev.org/c/opendev/gerritbot/+/77150619:27
openstackgerritJeremy Stanley proposed opendev/gerritbot master: Revert "Revert "Skip notifications about WIP changes""  https://review.opendev.org/c/opendev/gerritbot/+/77150719:27
fricklerclarkb: which docker ipv6 bug was that? /me is always looking for more reasons not to like docker ;)19:35
clarkbfrickler: you cannot pull or push images using an ipv6 address literal as the source/target of the operation19:37
clarkbfrickler: you have to use a dns name or proxy ipv4 to ipv6 (zuul does the proxy step using socat)19:37
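A minimal sketch of the proxy workaround clarkb describes, assuming socat is installed. It only builds the socat command line that listens on a local IPv4 port and forwards to an IPv6 literal. The helper name, the port, and the 2001:db8::1 documentation address are illustrative assumptions, not zuul's actual configuration.

```python
import shlex


def socat_proxy_command(ipv6_address, port, listen_address="127.0.0.1"):
    """Build a socat argv that forwards a local IPv4 port to an IPv6 host.

    Hypothetical helper: zuul's real invocation may differ.
    """
    listener = f"TCP4-LISTEN:{port},bind={listen_address},fork,reuseaddr"
    # socat expects IPv6 literals in square brackets.
    target = f"TCP6:[{ipv6_address}]:{port}"
    return ["socat", listener, target]


command = socat_proxy_command("2001:db8::1", 5000)
print(shlex.join(command))
```

With such a proxy running, the client would address the registry as 127.0.0.1:5000 (or a DNS name) rather than the IPv6 literal.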
corvusfungi, zbr: i expressed my thoughts in full-sentence form on https://review.opendev.org/77150719:38
zbrimho what we lack more is people doing reviews, not notifications about what everyone is working on.19:41
*** andrewbonney has quit IRC19:42
zbrthat was why i wanted to keep the notifications lower, less noise. but i will update it.19:42
*** slaweq has quit IRC19:43
openstackgerritMerged opendev/gerritbot master: Revert "Skip notifications about WIP changes"  https://review.opendev.org/c/opendev/gerritbot/+/77150619:50
fricklerclarkb: ah, found it, thx. that issue was indeed as old as I seemed to remember it to be19:51
*** Jeffrey4l has quit IRC20:04
clarkbianw: https://opendev.org/opendev/system-config/src/branch/master/roles/openafs-client/tasks/openafs-client/Debian.yaml#L11-L12 is the condition in openafs-client I was talking about20:06
clarkbianw: my read of that is xenial hosts using that role will install the xenial 1.6 packages?20:08
fungiso on the ppa addition on our executors, even ze12 has several copies which all add the same ppa in /etc/apt/sources.list.d: openstack-ci-core-ubuntu-openafs-amd64-hwe-xenial.list and ppa_openstack_ci_core_openafs_xenial.list both from 2020-05-07, and openafs.list from 2020-05-2620:08
fungi2020-05-07 looks like it might be the date the server was built20:08
fungino, my bad, the server seems to have been built 2019-05-08 (a year earlier)20:09
clarkbI'm due for lunch and a bike ride, back in a bit to dig into that xenial stuff more if necessary20:09
fungiand yeah, the two files with the same 2020-05-07 date actually have the same precise timestamp (down to a nanosecond)20:10
ianwclarkb: umm, yeah i think that whole when: statement should probably just go, i think it's all old20:10
fungiso i guess that's what ianw meant by deploying two copies at one point20:10
clarkbianw: that was my read on it too, but good to have someone else confirm. I can push that up a bit later today20:10
fungibut anyway, i expect we ansibilified the executors after last may, so they're getting packages from that ppa because puppet configured it20:11
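The duplicate PPA entries fungi found by hand could be spotted with a short script. This is a sketch under the assumption that the sources use the classic one-line `.list` format; the function name is hypothetical and nothing here is part of system-config.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_sources(sources_dir):
    """Map each active apt source line to the files that repeat it."""
    seen = defaultdict(list)
    for list_file in sorted(Path(sources_dir).glob("*.list")):
        for line in list_file.read_text().splitlines():
            entry = " ".join(line.split())  # normalise whitespace
            if entry and not entry.startswith("#"):
                seen[entry].append(list_file.name)
    # Only entries that appear in more than one place are interesting.
    return {entry: files for entry, files in seen.items() if len(files) > 1}
```

Calling `find_duplicate_sources("/etc/apt/sources.list.d")` on an executor would return the repeated PPA line together with the file names that carry it.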
ianwfungi: hrm, the hwe xenial thing ... didn't we have memory issues on the executor and switched to the HWE kernels at some point in the past?20:11
clarkbianw: ya20:11
clarkbit had to do with swappiness iirc20:11
clarkbor how the kernel calculated when to swap20:11
*** openstackgerrit has quit IRC20:12
ianwthat's right, something like that20:12
mordredclarkb: my IRC client tells me there was a nick highlight in #opendev-meeting during the opendev meeting - but does not have the skill to show me _where_20:12
clarkband it was doing it super aggressively because it didn't properly calculate its free memory20:12
clarkbmordred: 19:48:04*           fungi | in the case of storyboard, we already have docker images publishing on each new change merged (thanks mordred!)20:12
mordredyay I was helpful20:12
*** Jeffrey4l has joined #opendev20:13
fungiindeed, that's going to go a long way to us being able to upgrade the sb server20:13
*** openstackgerrit has joined #opendev20:13
openstackgerritMerged opendev/bindep master: ArchLinux: ignore unrelated warnings from pacman  https://review.opendev.org/c/opendev/bindep/+/77110820:13
*** zbr5 has joined #opendev20:14
*** zbr has quit IRC20:16
*** zbr5 is now known as zbr20:16
ianwso are CD jobs working again?  i saw something go by ...20:21
clarkbyes I think so20:22
clarkbianw: we did a chown in our base job for infra-prod jobs20:22
clarkband then fungi reenqueued a thing that ran successfully20:22
ianwahh, right ok https://review.opendev.org/c/opendev/base-jobs/+/771503/1/playbooks/infra-prod/pre.yaml20:22
ianwcool, ok; i'll check on and verify all the afs changes20:23
fungiany jobs inheriting from our deployment base job should in theory be mitigated now20:24
fungiroot does a recursive chown of the entire project tree on the bastion to zuul:zuul and then the zuul user is able to git clean within those repos after that20:25
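The two-step mitigation fungi describes can be sketched as a list of commands. This is an illustration only: the helper name, the example paths, and the `git clean` flags are assumptions (the log says only "git clean"), and the real change lives in the base-jobs pre.yaml playbook linked above.

```python
def ownership_fix_commands(project_root, repos, owner="zuul:zuul"):
    """Return argv lists for the chown-then-clean sequence.

    The first command is meant to run as root; the rest as the zuul user.
    """
    commands = [["chown", "-R", owner, project_root]]
    for repo in repos:
        # -f -d -x: force, include untracked directories and ignored files
        # (assumed flags, not taken from the log).
        commands.append(["git", "-C", repo, "clean", "-f", "-d", "-x"])
    return commands


# Hypothetical paths, for illustration only.
for argv in ownership_fix_commands("/srv/src", ["/srv/src/example-repo"]):
    print(" ".join(argv))
```

Running the chown first is what lets the unprivileged user's `git clean` succeed on files that root had previously created.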
*** artom has quit IRC20:31
fungiinfra-root: i meant to mention in the meeting, but i've proposed a brush-up for the opendev.org main page here: https://review.opendev.org/76982620:31
*** stevebaker has quit IRC20:35
*** artom has joined #opendev20:42
*** Jeffrey4l has quit IRC20:50
*** Jeffrey4l has joined #opendev20:51
*** stevebaker has joined #opendev21:03
*** artom has quit IRC21:17
*** artom has joined #opendev21:17
*** sboyron_ has quit IRC21:26
*** priteau has quit IRC21:35
openstackgerritIan Wienand proposed opendev/system-config master: openafs-client: cleanup PPA install  https://review.opendev.org/c/opendev/system-config/+/77152121:58
ianwinfra-root (but corvus maybe in particular) : is there any reason keeping the executors on Xenial right now, other than just time to replace the hosts?21:59
ianw"time" I mean "finding developer time"22:00
corvusianw: the only relevant consideration i can think of at this point is afs22:04
corvusother than that, i believe they should be completely agnostic about their underlying os22:04
ianwcool, that was roughly my feeling as well.  it should hopefully be fairly mechanical to swap in new hosts to get rid of xenial22:08
ianwinfra-root: i should have checked before the meeting, but the vexxhost backup storage situation is now critical22:10
ianw /dev/mapper/main--202010-backups--202010 1007G 1007G     0 100% /opt/backups-20201022:10
ianwhrm, the problem with re-purposing the bup backup storage quota is that wiki isn't backing up to borg.22:13
ianwi think that has to now become my top priority issue22:14
ianwwe are still running to the rax server, which has 3tb.  but it's not the redundancy we desire22:14
clarkbianw: and thats doing append only backups with borg that is filling up?22:22
clarkbdo we need to set a more aggressive pruning scheduling with borg?22:23
clarkbI think I've got mine set up to do like a full week then monthly then annually so it should be tunable22:23
ianwclarkb: hrm, whatever is the default i guess -- do we need to run something asynchronously?  and we could probably do with more eyes on what we really need backed up22:24
clarkbianw: I think borg doesn't prune by default22:24
clarkbI'm digging up my script to see what my local set up does22:24
ianwthen i'm sure that's part of the problem :)22:25
clarkbianw: I run borg prune --verbose --list --prefix '{hostname}-' --show-rc --keep-daily    7 --keep-weekly   4 --keep-monthly  622:26
clarkband that is at the end of my backup script so it does the backups then prunes in one go22:26
clarkband if I do a borg list I see that it matches that set of keep directives22:27
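To make the effect of those keep directives concrete, here is a simplified model of the retention selection, assuming the documented behaviour that each rule keeps the newest archive of each period and does not count archives already kept by an earlier rule. It is not borg's own code, ignores time zones and the other rule types, and all names are hypothetical.

```python
from datetime import datetime, timedelta

# How each rule groups archives into periods (ISO year/week for weekly).
PERIOD_KEYS = {
    "daily": lambda ts: (ts.year, ts.month, ts.day),
    "weekly": lambda ts: tuple(ts.isocalendar())[:2],
    "monthly": lambda ts: (ts.year, ts.month),
}


def select_kept(timestamps, keep_daily=7, keep_weekly=4, keep_monthly=6):
    """Return {timestamp: rule} for the archives a prune run would keep."""
    kept = {}
    newest_first = sorted(timestamps, reverse=True)
    rules = (("daily", keep_daily), ("weekly", keep_weekly), ("monthly", keep_monthly))
    for rule, limit in rules:
        count = 0
        last_period = None
        for ts in newest_first:
            if count >= limit:
                break
            period = PERIOD_KEYS[rule](ts)
            if period == last_period:
                continue  # only the newest archive of a period is a candidate
            last_period = period
            if ts not in kept:  # archives kept by an earlier rule do not count
                kept[ts] = rule
                count += 1
    return kept


# Sixty nightly archives ending on the day of this log.
archives = [datetime(2021, 1, 19, 2, 0) - timedelta(days=i) for i in range(60)]
kept = select_kept(archives)
for ts in sorted(kept, reverse=True):
    print(ts.date(), kept[ts])
```

Everything not in the returned mapping would be pruned, which is why adding a prune step bounds repository growth while a backup-only script does not.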
*** iurygregory has quit IRC22:28
ianwclarkb: hrm, perhaps as easy as adding that to https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/borg-backup/templates/borg-backup.j2 ?22:28
clarkbya I think so. Maybe we run it by hand against ethercalc or something like that and double check it does what we expect?22:30
clarkbalso maybe keep-monthly 12 ?22:30
clarkbinfra-root ^ may have other thoughts on retention though22:30
ianwok, i've just got to get wiki onto borg so we can have the discussion about retiring all this bup storage (can't retire it while it's still in use :)22:31
ianwi'm thinking i'll manual run a custom playbook because it needs to make and distribute keys22:32
clarkbianw: thanks for https://review.opendev.org/c/opendev/system-config/+/771521 fungi, that is a quick and easy review if you have a moment22:32
ianwbah, sorry meant22:34
ianwhttps://zuul.opendev.org/t/openstack/build/cbad8b367ae544baa2bbb165ae4a93a0/console#1/2/15/base shows installing the ppa on xenial22:35
*** iurygregory has joined #opendev22:37
clarkbhttps://borgbackup.readthedocs.io/en/stable/usage/prune.html shows an interesting --save-space option too22:41
clarkbbut more importantly there is a dry run option22:41
clarkbianw: ^ I think we should be good to try running it with the dry run option and see what it tells us then if it looks good run it for real on say ethercalc then apply it to the script?22:42
clarkbthen maybe look into --save-space?22:42
fungiwe're nearly caught up on mirror full releases, the debian volume has been running for a couple of hours and once it finishes we need to do debian-security and then epel which i missed queuing up earlier, and then a second pass to bring them all up to date which will hopefully not take long at all22:45
fungimight be finished before i fall asleep tonight22:45
ianwfungi: so is the wiki host not in the inventory?22:46
openstackgerritMerged openstack/project-config master: cursive: prepare to move the jobs in-tree  https://review.opendev.org/c/openstack/project-config/+/77144322:50
openstackgerritMerged openstack/project-config master: Adding irc notification for missing oslo projects  https://review.opendev.org/c/openstack/project-config/+/77139222:50
openstackgerritMerged openstack/project-config master: Combine acl file for all interop source code repo  https://review.opendev.org/c/openstack/project-config/+/77106622:50
openstackgerritMerged openstack/project-config master: Move snaps ACL to x  https://review.opendev.org/c/openstack/project-config/+/77053822:50
fungiianw: it's not handled with configuration management, and is still running ubuntu 14.0422:52
ianwahh, i didn't realise it wasn't managed at all22:52
fungithere's a mostly-complete attempt to get it into puppet where i wasn't able to work out all the extension updates22:52
fungii have a mostly-working wiki-dev server up but was running into some missing theming and openid wasn't working22:53
fungithat one's deployed with puppet fully, just still needs work22:53
fungithe "production" wiki server is called wiki-upgrade-test.openstack.org (long story involving having to tear down the actual production server after the upgrade to trusty disabled iptables and the elasticsearch on the server got immediately hacked)22:54
ianwok, it just means my ansible idea won't work to get it onto borg.  i'll have to do it fully manually.  that's ok ... i'll just need a cup of tea first :)22:56
*** openstackgerrit has quit IRC22:59
*** brinzhang0 has joined #opendev23:17
*** brinzhang_ has quit IRC23:20
ianwhrm, borg says that 1.1.15 drops python 3.4 support, but 1.1.14 doesn't seem to be working23:25
ianw  File "/opt/borg/lib/python3.4/site-packages/pkg_resources/__init__.py", line 82, in <module>23:26
ianw    raise RuntimeError("Python 3.5 or later is required")23:26
ianwinteresting ...23:26
clarkbhuh maybe they did it a bit early?23:27
clarkbfwiw it sounds like 1.1.11 does python223:27
clarkbPossible that python2 + python3.5 was a thing rather than python 3.4?23:27
ianwit seems like borg is not so much at fault here as ... cue your surprise ... setuptools23:27
* fungi tries to stuff his surprise back in the box it accidentally escaped from23:28
clarkboh I see what that is saying its setuptools/pkg_resources complaining23:28
fungimight be able to preinstall older setuptools?23:29
ianwi need to find the last setuptools that supported 3.423:31
clarkbianw: the pypi metadata should help if you view source on the setuptools index page23:32
fungiwell, maybe23:33
fungithe trove classifiers might, but the python_requires metadata didn't really come into vogue until somewhat recently23:33
fungiso some versions before the addition of python_requires=3.5 might have also required 3.523:34
fungiso pip install setuptools<44 i guess23:39
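Typed literally into a shell, the `<44` in that command would be parsed as input redirection from a file named 44, so the requirement has to be quoted. A small sketch of producing a safely quoted command line follows; the version bound itself is simply fungi's guess from the log.

```python
import shlex

requirement = "setuptools<44"
# shlex.join quotes any argument containing shell metacharacters.
command = shlex.join(["pip", "install", requirement])
print(command)  # pip install 'setuptools<44'
```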
ianwok, there is a borg in /opt/borg/ ... we shall never speak of it again23:41
*** brinzhang_ has joined #opendev23:52
fungiheh, thanks!23:52
fungilooks like we're on to the mirror.debian-security volume release now23:53
*** brinzhang0 has quit IRC23:55
