Wednesday, 2020-09-09

clarkbgdisk is manually installed into the image00:00
fungiclarkb: the no-wheel behavior is consistent with upstream venv/stdlib. i get it even though i compile my own from cpython git source00:00
clarkbubuntu-focal-arm64-0000007568.log is building now and before the sgdisk failure00:00
clarkbfungi: ya its consistent on suse too00:00
clarkbfungi: I'm going to reread logs and see if it is actually failing or just complaining00:01
clarkbah yup it later logs `running for X`00:01
clarkbthat seems like undesirable behavior but explains why this isn't fatal00:01
clarkbsorry for the noise00:02
clarkbianw: good call 2020-09-09 00:05:57.481 | DEBUG diskimage_builder.block_device.utils [-] exec_sudo: mkfs: failed to execute mkfs.vfat: No such file or directory exec_sudo /usr/local/lib/python3.7/site-packages/diskimage_builder/block_device/
ianwoh yeah, probably need vfat tools because efi partition00:06
clarkb <- seems to be the package00:07
ianwyeah that would be it00:07
clarkbI'll manually install that one too00:07
*** DSpider has quit IRC00:07
clarkbianw: want to update the change or should I do that?00:07
ianwclarkb: umm, i can, see if that uefi patch ontop works too00:08
clarkbya wasn't sure what was going on there so figured I'd let you push the new ps00:09
clarkb(that way I don't get it wrong)00:09
clarkbdebian-buster-arm64-0000089197.log is building now should give us an idea if there are more missing bits00:09
clarkbbut its going to be dinner soon so I need to step away.00:09
ianwclarkb: thanks; tailing that log now00:10
ianwat least until my internet cuts out :)00:10
clarkbits weird we're all being told to stay home not because of the pandemic today but because the area's emergency services have been on a put fires out marathon00:12
clarkbthankfully no additional power blips since the first one yesterday (which lasted just long enough to remind me one of my UPS batteries is bad)00:12
ianwahh fires, yes i remember them :)00:14
ianwi guess our turn is coming in a few months00:14
ianwlooks like it's moving on to image generation stage, so that's good00:25
ianw(partitions made and formatted)00:25
clarkbianw: looks like adding dosfstools was enough to get a successful job on the new builder00:57
clarkb| debian-buster-arm64-0000089197  | debian-buster-arm64  |   | qcow2         | ready    | 00:00:18:01  |00:58
ianwyep, that's good! :)00:59
clarkbI think that means the le thing is the only outstanding issue and I'm happy to wait for that to happen periodically01:00
* clarkb goes back to enjoying the evening before the smoke returns01:00
ianwclarkb: have fun, i'll keep an eye on LE rollout01:13
*** diablo_rojo has quit IRC02:50
openstackgerritMerged opendev/system-config master: run-base-post: fix ARA artifact link
*** johnavp1989 has left #opendev03:57
openstackgerritOleksandr Kozachenko proposed openstack/project-config master: Add monasca projects in vexxhost tenant
*** ysandeep|away is now known as ysandeep05:42
*** zer0c00l has joined #opendev06:07
zer0c00lIs there a way i can subscribe to events?06:08
zer0c00l mosquitto_sub -h --topic 'gerrit'06:08
zer0c00lConnection error: Connection Refused: not authorised.06:08
zer0c00lIs there a way to setup username and password so i can 'subscribe' to mqtt events?06:08
*** qchris has quit IRC06:20
*** qchris has joined #opendev06:34
openstackgerritMerged openstack/project-config master: Add monasca projects in vexxhost tenant
*** andrewbonney has joined #opendev07:06
*** fressi has joined #opendev07:17
*** hashar has joined #opendev07:17
*** ysandeep is now known as ysandeep|lunch07:40
*** tosky has joined #opendev07:57
openstackgerritFabien Boucher proposed opendev/gear master: use python3 as context for build-python-release
*** moppy has quit IRC08:01
*** moppy has joined #opendev08:01
*** DSpider has joined #opendev08:02
*** ysandeep|lunch is now known as ysandeep08:55
*** dtantsur|afk is now known as dtantsur09:35
*** hashar has quit IRC09:57
openstackgerritSorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Use pytest for queries
openstackgerritSorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Use pytest for queriesSwitches queries testing to use of pytest which provides the following:- test generator for each query (parametrize)- ability to test a single query test- generate html report with test results, making easier to  investigate failu
*** fressi has quit IRC10:13
openstackgerritSorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Use pytest for queries
*** priteau has joined #opendev10:42
openstackgerritSorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Use pytest for queries
openstackgerritOleksandr Kozachenko proposed openstack/project-config master: Add monasca-tempest-plugin in vexxhost tenant
Open10K8SHi team. Please check this PS . forgot to add tempest plugin.11:39
Open10K8Sthank you11:39
openstackgerritTristan Cacqueray proposed openstack/project-config master: pynotedb: end project gating
openstackgerritTristan Cacqueray proposed openstack/project-config master: pynotedb: remove project from infrastructure systems
*** jhesketh has quit IRC12:17
*** jhesketh has joined #opendev12:17
*** ykarel has joined #opendev12:25
ykarelHi when do infra mirrors get updated?12:25
ykarelexample looking at
ykarel^ have missing nfv directory like
ykarelnfv repo was added couple of hours back12:27
ykarelit was added approx 4 hours ago12:28
openstackgerritLee Yarwood proposed openstack/project-config master: WIP Add Fedora 32 builds
openstackgerritSean McGinnis proposed openstack/project-config master: Set neutron-lib stable ACLs
fungizer0c00l: i agree, something seems broken with connecting to the mqtt broker there. i'm trying to look into it but also have some meetings this morning, so may not get it fixed immediately12:46
fungiykarel: i think we update every 2 hours, but we're pulling from a secondary mirror (the official primary mirror only allows public mirror operators to rsync from it) so when we see new files depends on when the mirror we're copying from gets them12:48
openstackgerritAndreas Jaeger proposed openstack/project-config master: Fix networking-l2gw missed change
AJaegerconfig-core, please review to fix zuul config ^12:48
AJaegertristanC: this should fix your change ^12:49
*** fressi has joined #opendev12:50
ykarelfungi, how to check from mirror you copy is updated or not?12:50
ykareli picked the above url from opendev/system-config12:50
ykarelokk now i see repo is updated12:51
ykarelmirror.mt101 atleast12:51
fungiykarel: if one of ours has it, all of them should. they're just all frontends to a shared network filesystem12:52
ykarelfungi, okk Thanks mirrors are updated now12:54
ykarelfungi, btw where to check when mirrors were last updated?12:54
*** ykarel_ has joined #opendev12:56
AJaegergmann, infra-root, we currently have 104 errors for openstack tenant - many errors about openstack/networking-l2gw, see
*** fressi has joined #opendev12:58
*** ykarel has quit IRC12:58
gmannAJaeger: yeah, its from networking-midonet which fix is up but their gate is already broken and no maintainer -
gmannAJaeger: just posted on current ML thread-
mnaserinfra-root: does someone need to create `/afs/` before works?13:06
mnaserthat seems like the case to be honest, as it says src file not found..13:06
*** ykarel_ has quit IRC13:09
*** ykarel has joined #opendev13:12
fungiykarel: we add a timestamp at the root of each tree we mirror, so in that case tells you when the last time was we ran rsync13:14
fungithat file should also be the same across all our mirror servers, because it too is just in the shared network filesystem13:15
ykarelfungi, Thanks for the info13:15
*** jhesketh has quit IRC13:31
*** fressi has quit IRC13:43
openstackgerritMatthew Treinish proposed opendev/puppet-mosquitto master: Update set_anonymous flag to be explicitly true
openstackgerritMatthew Treinish proposed opendev/puppet-mosquitto master: Update set allow_anonymous flag to be explicitly true
AJaegergmann: are there specific changes we should force-merge for midonet? I would ask clarkb or fungi if they are willing to do that if you have a list.13:57
AJaegergmann: thanks for checking13:57
fungiyeah, i've been following the ml thread, happy to help there if it will get them back on track (like if there are two fixes in different repos so you can't squash them)13:58
gmannAJaeger: they are fixing taas error, let's wait for those if it make it green13:59
gmannchecking here if it work then we can squash them
AJaegergmann: sure, let's wait - and if you need help, feel free to ask here.14:14
openstackgerritMerged opendev/system-config master: Add zuul-jobs-failures list
corvuszbr: ^14:35
gmannAJaeger: sure, thanks14:56
fungiinfra-root: i've now seen a page allocation failure when attaching a new volume to (the current production system), and would like to reboot it so that the volume attachment is seen by the guest os, any objections?14:58
corvusfungi: no objections14:58
fungiin xenwatch again, so like i saw with whichever other server that was (/me checks status log...)14:58
fungiaha, mirror01.dfw.rax was the other one where i saw this occur14:59
fungi#status log cinder volumes for all six elasticsearch servers have been replaced and cleaned up14:59
openstackstatusfungi: finished logging14:59
fungi#status log cinder volume for graphite02 (not yet in production) has been replaced and cleaned up15:00
openstackstatusfungi: finished logging15:00
zbrcorvus: thanks, this means next step is
openstackgerritMerged opendev/puppet-mosquitto master: Update set allow_anonymous flag to be explicitly true
*** bhagyashris is now known as bhagyashris|ruck15:11
*** bhagyashris|ruck is now known as bhagyashris|rove15:11
*** lpetrut has joined #opendev15:12
*** mlavalle has joined #opendev15:17
*** jhesketh has joined #opendev15:23
clarkbslow start here today. I'll be checking on nb03's web server then looking at updating vexxhost mirror netplan configs15:24
corvusthe weather here is apocalyptic15:28
corvusit's super dark outside and very orange15:28
clarkbwe've avoided that here but about an hour south its really bad like that15:29
clarkbzer0c00l: fungi re firehose for gerrit events it may be better to use the ssh event stream? I think firehose is one of the services we've talked about turning off due to lack of use15:34
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: synchronize-repos: Remove unecessary git path modifications
*** ysandeep is now known as ysandeep|away15:51
clarkbnb03's webserver looks good now. I did have to manually restart the netfilter-persistent unit. I think that was missed because the original handler handling for apache2 restart failed previously? Thats my best guess15:53
*** ykarel is now known as ykarel|away15:57
clarkbthe rules files were in place on the host but the iptables rules in the kernel had not updated15:57
clarkbanyway was an easy fix and now the webserver is accessible and https is working15:57
*** lpetrut has quit IRC15:59
gmannAJaeger: fungi : networking-midonet still fail on taas fix. i think we can force merge this which can solve config error and other fixes can continue in their own pace-
clarkbgmann: that doesn't even pass pep8?16:00
clarkbshould we force merge that?16:00
AJaegergmann: please double check - just left a comment16:02
gmannclarkb: all those failure are existing failure and retiring networking-l2gw introduces config error which 738046 fix16:02
gmannAJaeger: fixed16:03
clarkbmnaser: before I go rebooting the vexxhost mirror with a new netplan config, is interactive console access available in the vexxhost cloud? Wondering if I should bother setting a root password or not16:16
fungiyou should be able to log into horizon there with our tenant creds16:24
fungisince there were no objections, i'm rebooting the prod graphite instance now16:25
clarkbfungi: vexxhost login says enter an email address. Any idea if there is a direct horizon link that will accept username?16:28
fungi#status log rebooted graphite01 to resolve a page allocation failure during volume attach16:29
openstackstatusfungi: finished logging16:29
clarkbI guess that may be in the catalog /me checks16:29
fungii want to say i just went to the base url of the api hostname last time i did it16:29
*** ykarel|away has quit IRC16:31
fungiclarkb: oh, try
clarkbaha thanks (it wasn't in the catalog fwiw)16:32
fungii went poking around in my personal account there and that's where the "cloud console" tab directs you16:32
clarkbok I can get to what is claimed to be the console but there is no interaction it seems16:36
clarkbjust a blinking cursor location16:36
clarkbbut typing doesn't do anything and there is no text16:36
clarkbI wonder if the image itself isn't configured properly to have libvirt hook up to a tty16:36
clarkbI'm thinking setting a local passwd won't help much given ^16:37
clarkbother options we've got include using a cron job to flip the config back16:40
clarkb(or an at job)16:40
mnaserclarkb: the novnc console should work16:41
mnasermaybe hit enter once16:41
clarkbmnaser: ya I've typed a bit of stuff to try and waken it16:41
mnaserand you will probably need to reset root password to make life easy16:41
clarkbmnaser: I wonder if it is a mismatch between image and libvirt expectations16:41
mnaserthat's very likely, the default images work and i think they point to ttyS0..16:41
clarkbthis is a default image but it is also focal16:41
clarkbI'm actually thinking now I should boot a test instance and test it there (with updated ip and gateways)16:43
clarkbthat will be the least disruptive thing so doing that really quick16:43
mnaserclarkb: i THINK netplan has a dry run option16:45
clarkbah ya netplan try16:47
clarkbstill I'll do a test instance as that is proper end to end testing16:47
clarkbbut good to know that is a sanity check16:47
fungiclarkb: oh, yep, i should have remembered, there's also a nova api command (accessible via osc too) which will return the url to the instance's novnc session16:51
fungiit's `openstack console url show`16:52
fungi(followed by the instance id)16:52
*** dtantsur is now known as dtantsur|afk16:55
clarkbok on my test instance it seems to work and the new network setup looks similar to the old except we lose the dynamic labels and the expiration times on routes16:57
clarkbI'll approve now and when it goes quiet I can apply the configs to the mirror and reboot it16:58
clarkbthe only things that changed between the two host configs was the mac addr and the ipv6 addr (they share gateways too)16:59
clarkbI'm able to ping6 and from the test host after rebooting too17:00
fungisounds like it worked then17:01
clarkbI think that all of our arm64 vm images may be built by now too17:05
clarkbI'm guessing the old server is sad for some reason as a result17:05
fungiit will cease to be sad when it's dead17:06
fungier, i mean, deleted17:06
clarkbya as a sanity check nb03 seems to be running the updated image from the gdisk and dosfstools commit so we aren't relying on a manually patched image there. Also plenty of free disk space but we've got lvm now too so can change that later if we need to17:07
clarkbmaybe ianw will be able to quickly confirm that there is no reason to keep later today and we can clean it up17:08
clarkbonce that is done we're ready for zuul zk tls in our zuul and nodepool installation17:08
clarkb*required zk tls17:08
clarkbthinking about the wheel "issue" in image builds I wonder if we should install wheel into those venvs to avoid the "ERROR" messages as they catch the eye17:11
clarkbalso maybe we should see if pip can make that a warning not an error17:11
*** andrewbonney has quit IRC17:14
funginot just to avoid the errors, but also to improve efficiency. pip will be able to cache them17:21
clarkbchange to disable vexxhost temporarily should land shortly. I'm going to pop out for a bit while the existing jobs there finish up. Back in a bit17:21
fungionce the volume replacement i've got in progress now finishes, we'll be down to just 5 remaining out of the 29 rackspace notified us about in dfw17:22
fungitopical, three are for nodepool builders (nb01, 02 and 04)17:23
fungithe other two are wiki-dev, and wiki (the latter i know will be more of a challenge)17:23
openstackgerritMerged openstack/project-config master: Disable vexxhost for mirror work
clarkbgrafana shows nodes are draining in vexxhost now17:46
openstackgerritClark Boylan proposed openstack/project-config master: Revert "Disable vexxhost for mirror work"
*** hashar has joined #opendev18:13
*** hashar has quit IRC18:31
clarkbwaiting patiently, now down to 5 test nodes in use18:40
AJaegerconfig-core, please review - this blocks some other changes to merge like
corvusAJaeger: 750645+318:51
AJaegerthanks, corvus18:51
zer0c00lclarkb, fungi re. Thanks. I thought i was doing something wrong. Does the gerritbot (ircbot) use firehose mqtt or ssh?18:51
corvuszer0c00l: ssh18:51
clarkbzer0c00l: ssh18:51
fungias does zuul18:54
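As an illustration of consuming the ssh event stream mentioned above: `gerrit stream-events` emits one JSON object per line, each carrying a `type` field. The sketch below only covers the parsing side and assumes the lines have already been read from the ssh channel; `iter_events` is a hypothetical helper name, not something from gerritlib or gerritbot.

```python
import json


def iter_events(lines, wanted_types=None):
    """Yield decoded Gerrit events from an iterable of text lines.

    `wanted_types` is an optional set of event type names to keep,
    for example {"change-merged"}. Blank or malformed lines are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except ValueError:
            # Not valid JSON, ignore it rather than abort the stream
            continue
        if not isinstance(event, dict):
            continue
        if wanted_types is None or event.get("type") in wanted_types:
            yield event
```

In practice the `lines` iterable would be the stdout of the long-running ssh command, so the generator blocks until the next event arrives.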
*** priteau has quit IRC18:57
*** hashar has joined #opendev18:58
zer0c00lHere i thought mqtt would be more reliable. I already use gerritlib(ssh), i have unexplained 'hangs'. It just stops receiving events.18:58
fungizer0c00l: have you tried enabling ssh keepalive?19:00
fungiit may be extended silent periods are causing a firewall or nat somewhere to discard what it sees as a dead tcp state19:00
openstackgerritMerged openstack/project-config master: Fix networking-l2gw missed change
AJaegerclarkb: could you review to start retiring pynotedb, please?19:02
openstackgerritAndreas Jaeger proposed openstack/project-config master: pynotedb: remove project from infrastructure systems
openstackgerritAndreas Jaeger proposed openstack/project-config master: pynotedb: remove project from infrastructure systems
zer0c00lfungi: I am basically running a forked version of
zer0c00lWonder if i can enable keepalive there.19:06
fungiit's using gerritlib which is using paramiko, so if paramiko supports ssh keepalive, i imagine we'd be fine merging patches to support it in gerritlib and gerritbot19:07
* fungi checks19:07
openstackgerritMerged openstack/project-config master: Add monasca-tempest-plugin in vexxhost tenant
zer0c00lfungi: thanks, let me see if i can modify gerritlib to use keepalives19:12
fungiyeah, we're instantiating a paramiko.SSHClient() so i'm looking to see if there's a way to set it through there19:13
fungisimilar to how we set_missing_host_key_policy()19:13
fungizer0c00l: looks like we ought to be able to access it through
fungiso probably something like calling client.get_transport().set_keepalive(60)19:16
fungiyou might want to locally patch your gerritlib first and see if it solves your issue... if it does, we likely want to have gerritlib.gerrit.GerritConnection() grow a new attribute tied to it defaulting to 0, and then make setting that configurable in gerritbot19:18
clarkbwhat is the issue?19:19
fungigerritbot ssh sessions to gerrit hanging19:19
fungifrom initial description, sounds like it could be something as simple as a firewall killing idle tcp connections19:20
fungi(not the gerritbot instance we're running, one zer0c00l has)19:21
fungiwhich is why i suggest first just one-liner patching gerritlib on his end to turn on keepalive in paramiko and see if it helps at all19:22
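A minimal sketch of the one-liner fungi suggests, wrapped in a helper so it can be tried in isolation. It assumes `client` is an already connected `paramiko.SSHClient`; the name `enable_keepalive` and the 60 second default are illustrative only and are not part of gerritlib.

```python
def enable_keepalive(client, interval=60):
    """Turn on SSH keepalive packets for the transport behind `client`.

    `client` is expected to behave like a connected paramiko.SSHClient.
    Returns True if keepalive was configured, False if there is no
    transport yet (the client has not connected).
    """
    transport = client.get_transport()
    if transport is None:
        return False
    # interval is in seconds; 0 would disable keepalives again
    transport.set_keepalive(interval)
    return True
```

Calling this right after `client.connect(...)` should be enough to test whether idle-connection drops are the cause of the hangs.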
clarkbgot it19:22
clarkbvexxhost shows in use 019:22
clarkbgoing to update mirror configs and reboot it now19:22
clarkbssh is responding but telling me non root users have to wait longer19:26
clarkbI suppose thats a good sign19:26
clarkband I'm in19:27
clarkbroutes and addresses look good and ping6 works to and review.opendev.org19:27
clarkb is also accessible19:28
clarkbany objections to approving now to reenable the vexxhost usage?19:28
clarkb#status log Configured to configure its ipv6 networking statically with netplan rather than listen to router advertisements.19:29
openstackstatusclarkb: finished logging19:29
*** Goneri has joined #opendev19:31
clarkbI've deleted my test node since the mirror itself is happy19:33
fungino objection19:34
openstackgerritMerged openstack/project-config master: Retire the devstack-plugin-zmq project
*** tosky has quit IRC19:37
openstackgerritMerged openstack/project-config master: pynotedb: end project gating
AJaegerinfra-root, is the next change to retire pynotedb, please review19:49
AJaegeronce that is merged, finishes the retirement.19:49
openstackgerritMerged openstack/project-config master: Revert "Disable vexxhost for mirror work"
*** openstackgerrit has quit IRC20:17
clarkbvexxhost is in use again20:33
*** hashar has quit IRC21:06
clarkbcorvus: for restarting the scheduler and web to pick up the hold change is that something we want to get done today?21:13
clarkbinfra-root I've removed my WIP from as I think we can probably start cleaning up now21:14
clarkber is the one to cleanup21:14
corvusclarkb: yeah, i can do that in a few mins21:15
clarkbk I'm around and can help too21:16
corvushuh, i just noticed there's a bunch of whitespace at the bottom of the status page; i wonder if that's a pf4 thing21:16
clarkbhrm I don't see that21:17
clarkb is what i'm looking at21:17
corvusme too.  the scroll bar is about 30% down from the top when i've scrolled to the end of the actual content21:18
corvusit's the error drawer21:18
corvuswith 111 errors21:19
corvusi guess that's fine then21:19
corvusnow looks like a good time to restart21:19
clarkbthe errors are the x/networking-l2gw stuff iirc and AJaeger and gmann have been working through those21:20
gmannclarkb: we need to force merge this to get rid of config error -
gmannall failure in this patch are existing failure in that repo21:21
clarkbcorvus: should I go ahead and force merge ^ now or wait for after the restart?21:22
gmannI am not sure when those failure will be fixed as networking-midonet is in no maintainer situation21:22
corvusclarkb: maybe wait till restart21:23
clarkbcorvus: k will wait21:23
corvuswhich is imminent21:23
clarkbgmann: ya so I guess we're fixing zuul but then the repo itself will need someone to step in and address the issues21:23
clarkbcorvus: looks like the most recent run of service-zuul failed21:24
corvusclarkb: the image looked up to date so i proceeded21:25
clarkbze09 is unreachable is the issue there21:25
clarkbso ya should be fine to proceed21:25
corvusthis restart will probably take 20+ minutes due to schema update21:25
clarkbI'm able to ssh into ze09 so not sure what is going on there21:25
corvusweird, ze09 seems to be working.  agreed21:26
corvusrunning jobs and everything21:26
corvusmaybe a fluke?21:26
clarkbya I guess check it in an hour and see if it persists (we run service-zuul hourly)21:26
corvusin other news, the light outside is brighter and now looks like dawn21:30
corvusturns out we really needed a nuclear winter to cool off after the heat wave21:31
clarkbwe seem to be just north of the worst of it here. The photos coming out of places to the south of us are crazy though21:33
corvuswe learned this morning that google removed the manual white balance feature from the android camera app.  fortunately we had a better camera handy.21:35
clarkb was Salem yesterday21:35
clarkbthats about 45-60 minutes south of here depending on traffic21:36
smcginnisI better check in on my friends in Eugene.21:37
clarkba bunch of towns up hill from salem burned down yesterday too21:37
clarkbits weird because we're all being told to stay home and off roads again but not due to the pandemic21:38
fungithat is, like, alien abduction supernatural horror movie red sky21:40
johnsomThe EPA AQI tops out at 500 which is "respirator recommended". We are currently somewhere between 710 and 55121:40
JayFWe're checking a /lot/ of apocalypse boxes for 2020.21:41
johnsomWe have had the total recall experience since Monday night.21:41
clarkbjohnsom: I think if you get north of about wilsonville it clears up a bunch21:44
clarkbstill not great but much better than that21:44
corvuswow that's red.  we're more orange.  this is pretty representative of what we saw this morning:
clarkband the wind changes in a couple days and it will all be different again too21:44
johnsomYeah, I have checked the road cameras and air quality sites.21:44
fungii'll just stay in my bunker. thanks21:45
johnsomHmm, just got a "Proxy Error" from "Reason: Error reading from remote server" Apache/2.4.18 (Ubuntu) Server at Port 44321:46
corvusi'm restarting it21:46
clarkband it needs to do a db migration so a bit slower than normal21:46
corvus#status log restarted zuul scheduler and web at commit 6a5b97572f73a70f72a0795d5e517cff96740887 to pick up held db attribute21:47
openstackstatuscorvus: finished logging21:47
corvusre-enqueue is in progress21:47
clarkbcorvus: now I see the extra scroll space21:48
clarkbcorvus: I think its rendering to the bottom pre restart then remembering that length but when there are fewer changes (because reenqueue is serial) it renders the shorter list in the bigger space21:49
clarkb(I didn't open the error drawer but maybe it does the same thing)21:49
clarkbcorvus: good to force merge that change now?21:51
corvusclarkb: i think it's only the error drawer.  the drawer is always there but normally invisible.  the length of my page is exactly the length required to display it.21:52
corvusclarkb: and yes, gtg21:52
corvusreenqueue is finished21:52
fungithat explains why i saw something similar on a build result page21:52
clarkb is merged21:53
corvusyeah, i'd expect it on any sufficiently short page in the openstack tenant21:53
clarkbgmann: ^ fyi21:53
clarkbcorvus: ah got it21:53
fungiso the amount of blank space presumably could be observed to differ between tenants based on how many errors a given tenant has21:53
corvusat least for the next few minutes :)21:53
corvusfungi: yep21:53
gmannclarkb: thanks21:54
clarkbgmann: are there stable branch fixes for networking-midonet too?22:00
gmannclarkb: not yet. i can backport it but not sure if their stable branch are active or not22:02
clarkbgmann: well the errors are there as well.22:02
gmannclarkb: ok. so i  need to backport this and then you can force merge?22:04
gmannit seems we need to do until ocata22:04
clarkbyes I can help with that22:05
gmannok. doing22:05
gmanngive me 5 min, in between of fixing openstacksdk for Focal migration22:05
*** openstackgerrit has joined #opendev22:09
openstackgerritClark Boylan proposed opendev/system-config master: Remove nodepool builder puppetry and
clarkbnoticed a relatively small groups thing I missed but figured I'd update the change to be more complete22:10
clarkbianw: ^ if you have time to look at the new server and check if it is happy then review ^ that would be great (from what I can see the new server is good)22:10
clarkbbut then if ^ land I think we can just delete the server and its volume?22:10
ianwheh, i thought i just +2'd that :)22:10
clarkboh maybe you did, sorry22:10
ianwyeah, lgtm, when i left yesterday everything was built on the new server22:11
ianwso if it hasn't exploded overnight, seems good :)22:11
clarkbits not super clear to me on the best way to clean up the images that has built22:11
clarkbprobably just manually rm those when we are done?22:11
ianwhrm, could we put everything on pause in the config there and then delete them using the cli?22:13
clarkbya that may work22:14
clarkbwe can't remove the images from the image list as that will delete what the new server is building22:14
clarkbbut pausing then asking nodepool to delete them may be easiest22:14
* clarkb writes the pause change22:14
openstackgerritClark Boylan proposed openstack/project-config master: Pause image builds on
clarkbianw: ^ something like that22:16
ianwclarkb: i *think* it's still unpuppeted?  or did we fix that?22:17
ianw# ianw 2020-05-20 hand edits applied to dib to build focal on xenial22:18
ianwno, we didn't ... so i guess just apply by hand and we can get rid of it asap?22:18
clarkbwfm I'll make that change manually22:18
clarkbwait it looks like they are all paused already22:19
clarkbshould I go ahead and ask nodepool to delete the dib images on
ianwohh, istr fungi doing something with it22:20
ianwi think so, let's get rid of it rather than archaeological dig what's going on :)22:20
clarkbalright I'll do dib-image-delete on all the arm64 images that don't show as their builder22:21
clarkbthat should cause nodepool to go and try and clean things up automatically22:21
clarkbwhen that is done we can land then delete the server and volume22:22
clarkbdeletes have been requested. Some have deleted others haven't I wonder if there will be issues :/22:26
clarkbianw: the images which are having a hard time deleting are quite old. I wonder if they are just super stale?22:29
ianwit's not trying on linaro-london is it?22:29
clarkboh ha I think that is it22:30
ianwall those 118 day old ones are there22:30
clarkbthe london cloud is gone right? so we should edit the zk db to clear those upload records instead?22:30
ianwyeah it is, i think it went away ungracefully from what i remember.  as in wasn't working22:31
ianwdo have instructions for hand-edit zk?  i've not done but would be ahppy to learn22:32
ianwhappy even22:32
clarkbianw: I think there are simple instructions somewhere let me see22:32
clarkbianw: that talks about the client.22:33
clarkbianw: but then its a simple fs like navigation system. `help` for commands `ls` to list things `cd` to change your context `get` to get a nodes contents iirc22:34
clarkbI think what we want to do is delete the upload records for those images then the rest will be done automatically22:34
clarkbI can double check things if you like22:34
ianwok, let me see if i can get something talking22:35
openstackgerritClark Boylan proposed opendev/system-config master: Block port 2181 on zookeeper hosts
ianwi wonder if it's harder with containers and ssl etc22:35
clarkbianw: ssl does make it harder ^ is sort of related22:35
clarkbianw: we keep listening on port 2181 too so if you run this on the zk server itself you can talk insecurely locally22:35
clarkbI think that is going to be our keep things simple mode of operation going forward, listen on 2181 and 2281 but only expose 2281 outside the host22:36
ianwyeah "connect localhost" seems to have worked22:38
clarkbI think you want to rm paths like /nodepool/images/centos-8-arm64/builds/0000000043/providers/linaro-london/images/000000000122:38
clarkband if you do those leaf nodes nodepool should cleanup the rest of it for us22:39
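The manual cleanup described above can be sketched in Python. This is only an illustration: it assumes the znode layout shown in the example path (`/nodepool/images/<image>/builds/<build>/providers/<provider>/images/<upload>`) and a client object offering `exists`, `get_children` and `delete`, as a kazoo `KazooClient` does. The helper names `find_provider_uploads` and `delete_uploads` are hypothetical.

```python
def find_provider_uploads(zk, provider, root="/nodepool/images"):
    """Return the upload-record paths for `provider` under `root`."""
    uploads = []
    for image in sorted(zk.get_children(root)):
        builds_path = f"{root}/{image}/builds"
        if not zk.exists(builds_path):
            continue
        for build in sorted(zk.get_children(builds_path)):
            images_path = f"{builds_path}/{build}/providers/{provider}/images"
            if not zk.exists(images_path):
                # This build was never uploaded to the provider
                continue
            for upload in sorted(zk.get_children(images_path)):
                uploads.append(f"{images_path}/{upload}")
    return uploads


def delete_uploads(zk, paths, dry_run=True):
    """Delete the given leaf znodes unless dry_run is set.

    Returns the list of paths that were (or would be) deleted.
    """
    paths = list(paths)
    if not dry_run:
        for path in paths:
            zk.delete(path)
    return paths
```

With kazoo this might be driven by `zk = KazooClient(hosts="localhost:2181")` followed by `zk.start()`, run on the ZooKeeper host itself as discussed above. Running with `dry_run=True` first gives a list to compare against what the interactive shell shows before anything is removed.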
ianwis the list22:41
ianwi'll try /nodepool/images/centos-8-arm64/builds/0000000045/providers/linaro-london/images/000000000122:42
clarkbthose paths look like what we want to delete to me22:42
ianwok, the upper paths seem all gone, so i'd say it's cleaned up properly22:44
ianwi'll do the rest22:44
*** auristor has quit IRC22:45
ianwclarkb: ok, no more linaro london22:45
clarkbgmann: thanks I'll work on that shortly22:46
ianwwhat's the deal with the 168 day old gentoo images?22:46
clarkbianw: no idea22:47
ianw2020-05-07 15:24:52.286 | tar (child): xz: Cannot exec: No such file or directory22:48
*** auristor has joined #opendev22:49
clarkbgmann: done22:55
gmannclarkb: thank.22:55
gmannclarkb: i will backport networking-odl  too and ask lajos to merge once he is online tomorrow -
gmannafter that networking-l2gw error should disappear completely22:57
fungisorry, post dinner catching up... what was i doing something with? paused image builds on old nb03?22:58
clarkbfungi: ya, we decided we're just going to delete it anyway so figuring that out doesn't matter much :)22:58
fungigood, because i don't know that my memory can take the strain at this point22:59
fungigood riddance23:00
fungi#status log cinder volume for graphite01 (current production server) has been replaced and cleaned up23:01
openstackstatusfungi: finished logging23:01
fungii'm working on nb01/02/04 now but should be transparent23:02
clarkbmy next builder question is should we rm nb04?23:02
clarkbI'm not sure if we need 3 x86 builders (it doesn't hurt but is it necessary?)23:02
fungihow's our disk space? that's the primary indicator, right?23:04
clarkbthey are each using about half of their disk23:07
clarkbso cutting out 04 would push 01 and 02 to 3/4 of their disk or so23:07
*** Goneri has quit IRC23:21
*** cloudnull4 has joined #opendev23:26
*** cloudnull has quit IRC23:26
*** cloudnull4 is now known as cloudnull23:26
fungiseems okay23:39
