clarkb | gdisk is manually installed into the image | 00:00 |
fungi | clarkb: the no-wheel behavior is consistent with upstream venv/stdlib. i get it even though i compile my own from cpython git source | 00:00 |
clarkb | ubuntu-focal-arm64-0000007568.log is building now and is currently before the point of the earlier sgdisk failure | 00:00 |
clarkb | fungi: ya its consistent on suse too | 00:00 |
clarkb | fungi: I'm going to reread logs and see if it is actually failing or just complaining | 00:01 |
clarkb | ah yup it later logs `running setup.py for X` | 00:01 |
clarkb | that seems like undesirable behavior but explains why this isn't fatal | 00:01 |
clarkb | sorry for the noise | 00:02 |
clarkb | ianw: good call 2020-09-09 00:05:57.481 | DEBUG diskimage_builder.block_device.utils [-] exec_sudo: mkfs: failed to execute mkfs.vfat: No such file or directory exec_sudo /usr/local/lib/python3.7/site-packages/diskimage_builder/block_device/utils.py:135 | 00:06 |
ianw | oh yeah, probably need vfat tools because efi partition | 00:06 |
clarkb | https://packages.debian.org/buster/dosfstools <- seems to be the package | 00:07 |
ianw | yeah that would be it | 00:07 |
clarkb | I'll manually install that one too | 00:07 |
*** DSpider has quit IRC | 00:07 | |
clarkb | ianw: want to update the change or should I do that? | 00:07 |
ianw | clarkb: umm, i can, see if that uefi patch ontop works too | 00:08 |
clarkb | ya wasn't sure what was going on there so figured I'd let you push the new ps | 00:09 |
clarkb | (that way I don't get it wrong) | 00:09 |
clarkb | debian-buster-arm64-0000089197.log is building now, which should give us an idea if there are more missing bits | 00:09 |
clarkb | but its going to be dinner soon so I need to step away. | 00:09 |
ianw | clarkb: thanks; tailing that log now | 00:10 |
ianw | at least until my internet cuts out :) | 00:10 |
clarkb | its weird we're all being told to stay home not because of the pandemic today but because the area's emergency services have been on a put fires out marathon | 00:12 |
clarkb | thankfully no additional power blips since the first one yesterday (which lasted just long enough to remind me one of my UPS batteries is bad) | 00:12 |
ianw | ahh fires, yes i remember them :) | 00:14 |
ianw | i guess our turn is coming in a few months | 00:14 |
ianw | looks like it's moving on to image generation stage, so that's good | 00:25 |
ianw | (partitions made and formatted) | 00:25 |
clarkb | ianw: looks like adding dosfstools was enough to get a successful job on the new builder | 00:57 |
clarkb | s/job/build/ | 00:57 |
clarkb | | debian-buster-arm64-0000089197 | debian-buster-arm64 | nb03.opendev.org | qcow2 | ready | 00:00:18:01 | | 00:58 |
ianw | yep, that's good! :) | 00:59 |
clarkb | I think that means the LE thing is the only outstanding issue and I'm happy to wait for that to happen periodically | 01:00 |
* clarkb goes back to enjoying the evening before the smoke returns | 01:00 | |
ianw | clarkb: have fun, i'll keep an eye on LE rollout | 01:13 |
*** diablo_rojo has quit IRC | 02:50 | |
openstackgerrit | Merged opendev/system-config master: run-base-post: fix ARA artifact link https://review.opendev.org/747101 | 03:09 |
*** johnavp1989 has left #opendev | 03:57 | |
openstackgerrit | Oleksandr Kozachenko proposed openstack/project-config master: Add monasca projects in vexxhost tenant https://review.opendev.org/750561 | 05:09 |
*** ysandeep|away is now known as ysandeep | 05:42 | |
*** zer0c00l has joined #opendev | 06:07 | |
zer0c00l | Is there a way i can subscribe to firehose.opendev.org events? | 06:08 |
zer0c00l | mosquitto_sub -h firehose.openstack.org --topic 'gerrit' | 06:08 |
zer0c00l | Connection error: Connection Refused: not authorised. | 06:08 |
zer0c00l | Is there a way to setup username and password so i can 'subscribe' to mqtt events? | 06:08 |
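For reference, a minimal Python subscriber along the lines of what zer0c00l is attempting, using the paho-mqtt client. The host, port 1883, and the gerrit/# topic filter are assumptions taken from the mosquitto_sub invocation above, and it only works once the broker accepts anonymous connections again (the allow_anonymous fix that merges later in this log):

    import paho.mqtt.client as mqtt

    def on_message(client, userdata, msg):
        # Each firehose event arrives as a JSON payload on a gerrit/... topic.
        print(msg.topic, msg.payload.decode())

    client = mqtt.Client()          # paho-mqtt 1.x style constructor
    client.on_message = on_message
    client.connect("firehose.openstack.org", 1883)
    client.subscribe("gerrit/#")    # '#' matches all gerrit subtopics
    client.loop_forever()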
*** qchris has quit IRC | 06:20 | |
*** qchris has joined #opendev | 06:34 | |
openstackgerrit | Merged openstack/project-config master: Add monasca projects in vexxhost tenant https://review.opendev.org/750561 | 06:56 |
*** andrewbonney has joined #opendev | 07:06 | |
*** fressi has joined #opendev | 07:17 | |
*** hashar has joined #opendev | 07:17 | |
*** ysandeep is now known as ysandeep|lunch | 07:40 | |
*** tosky has joined #opendev | 07:57 | |
openstackgerrit | Fabien Boucher proposed opendev/gear master: use python3 as context for build-python-release https://review.opendev.org/742165 | 07:57 |
*** moppy has quit IRC | 08:01 | |
*** moppy has joined #opendev | 08:01 | |
*** DSpider has joined #opendev | 08:02 | |
*** ysandeep|lunch is now known as ysandeep | 08:55 | |
*** dtantsur|afk is now known as dtantsur | 09:35 | |
*** hashar has quit IRC | 09:57 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Use pytest for queries https://review.opendev.org/750445 | 10:00 |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Use pytest for queries. Switches queries testing to pytest, which provides: a test generator for each query (parametrize), the ability to run a single query test, and an html report with test results, making it easier to investigate failures https://review.opendev.org/750616 | 10:11 |
*** fressi has quit IRC | 10:13 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Use pytest for queries https://review.opendev.org/750445 | 10:13 |
*** priteau has joined #opendev | 10:42 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Use pytest for queries https://review.opendev.org/750445 | 10:50 |
openstackgerrit | Oleksandr Kozachenko proposed openstack/project-config master: Add monasca-tempest-plugin in vexxhost tenant https://review.opendev.org/750627 | 11:37 |
Open10K8S | Hi team. Please check this PS https://review.opendev.org/750627 . forgot to add tempest plugin. | 11:39 |
Open10K8S | thank you | 11:39 |
openstackgerrit | Tristan Cacqueray proposed openstack/project-config master: pynotedb: end project gating https://review.opendev.org/750634 | 12:17 |
openstackgerrit | Tristan Cacqueray proposed openstack/project-config master: pynotedb: remove project from infrastructure systems https://review.opendev.org/750635 | 12:17 |
*** jhesketh has quit IRC | 12:17 | |
*** jhesketh has joined #opendev | 12:17 | |
*** ykarel has joined #opendev | 12:25 | |
ykarel | Hi, when do the infra mirrors get updated? | 12:25 |
ykarel | example looking at http://mirror.mtl01.inap.opendev.org/centos/8/ | 12:26 |
ykarel | ^ is missing the nfv directory that exists at http://mirror.dal10.us.leaseweb.net/centos/8/nfv/ | 12:27 |
ykarel | nfv repo was added couple of hours back | 12:27 |
ykarel | it was added approx 4 hours ago | 12:28 |
openstackgerrit | Lee Yarwood proposed openstack/project-config master: WIP Add Fedora 32 builds https://review.opendev.org/750642 | 12:36 |
openstackgerrit | Sean McGinnis proposed openstack/project-config master: Set neutron-lib stable ACLs https://review.opendev.org/750643 | 12:39 |
fungi | zer0c00l: i agree, something seems broken with connecting to the mqtt broker there. i'm trying to look into it but also have some meetings this morning, so may not get it fixed immediately | 12:46 |
fungi | ykarel: i think we update every 2 hours, but we're pulling from a secondary mirror (the official primary mirror only allows public mirror operators to rsync from it) so when we see new files depends on when the mirror we're copying from gets them | 12:48 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Fix networking-l2gw missed change https://review.opendev.org/750645 | 12:48 |
AJaeger | config-core, please review to fix zuul config ^ | 12:48 |
AJaeger | tristanC: this should fix your change ^ | 12:49 |
*** fressi has joined #opendev | 12:50 | |
ykarel | fungi, how can I check whether the mirror you copy from is updated or not? | 12:50 |
ykarel | i picked the above url from opendev/system-config | 12:50 |
ykarel | okk now i see repo is updated | 12:51 |
ykarel | mirror.mtl01 at least | 12:51 |
ykarel | http://mirror.mtl01.inap.opendev.org/centos/8/nfv/ | 12:51 |
fungi | ykarel: if one of ours has it, all of them should. they're just all frontends to a shared network filesystem | 12:52 |
ykarel | fungi, okk Thanks mirrors are updated now | 12:54 |
ykarel | fungi, btw where to check when mirrors were last updated? | 12:54 |
*** ykarel_ has joined #opendev | 12:56 | |
AJaeger | gmann, infra-root, we currently have 104 errors for openstack tenant - many errors about openstack/networking-l2gw, see https://zuul.opendev.org/t/openstack/config-errors | 12:56 |
*** fressi has joined #opendev | 12:58 | |
*** ykarel has quit IRC | 12:58 | |
gmann | AJaeger: yeah, it's from networking-midonet, for which the fix is up, but their gate is already broken and there is no maintainer - https://review.opendev.org/#/c/738046/ | 13:02 |
gmann | AJaeger: just posted on current ML thread- http://lists.openstack.org/pipermail/openstack-discuss/2020-September/017116.html | 13:04 |
mnaser | infra-root: does someone need to create `/afs/openstack.org/mirror/ceph-deb-octopus` before https://review.opendev.org/#/c/750519/ works? | 13:06 |
mnaser | that seems like the case to be honest, as it says src file not found.. | 13:06 |
*** ykarel_ has quit IRC | 13:09 | |
*** ykarel has joined #opendev | 13:12 | |
fungi | ykarel: we add a timestamp at the root of each tree we mirror, so in that case http://mirror.mtl01.inap.opendev.org/centos/timestamp.txt tells you when the last time was we ran rsync | 13:14 |
fungi | that file should also be the same across all our mirror servers, because it too is just in the shared network filesystem | 13:15 |
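A small scripted version of the check fungi describes, assuming the URL from the discussion above (any mirror frontend returns the same file since they all share the network filesystem):

    from urllib.request import urlopen

    # Timestamp written at the root of the mirrored tree by the rsync job.
    url = "http://mirror.mtl01.inap.opendev.org/centos/timestamp.txt"
    print(urlopen(url).read().decode().strip())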
ykarel | fungi, Thanks for the info | 13:15 |
fungi | yw | 13:19 |
*** jhesketh has quit IRC | 13:31 | |
*** fressi has quit IRC | 13:43 | |
openstackgerrit | Matthew Treinish proposed opendev/puppet-mosquitto master: Update set_anonymous flag to be explicitly true https://review.opendev.org/750659 | 13:49 |
openstackgerrit | Matthew Treinish proposed opendev/puppet-mosquitto master: Update set allow_anonymous flag to be explicitly true https://review.opendev.org/750659 | 13:53 |
AJaeger | gmann: are there specific changes we should force-merge for midonet? I would ask clarkb or fungi if they are willing to do that if you have a list. | 13:57 |
AJaeger | gmann: thanks for checking | 13:57 |
fungi | yeah, i've been following the ml thread, happy to help there if it will get them back on track (like if there are two fixes in different repos so you can't squash them) | 13:58 |
gmann | AJaeger: they are fixing the taas error, let's wait for those and see if it makes it green | 13:59 |
gmann | checking here; if it works then we can squash them https://review.opendev.org/#/c/750633/2 | 14:03 |
AJaeger | gmann: sure, let's wait - and if you need help, feel free to ask here. | 14:14 |
openstackgerrit | Merged opendev/system-config master: Add zuul-jobs-failures list https://review.opendev.org/748688 | 14:34 |
corvus | zbr: ^ | 14:35 |
gmann | AJaeger: sure, thanks | 14:56 |
fungi | infra-root: i've now seen a page allocation failure when attaching a new volume to graphite01.opendev.org (the current production system), and would like to reboot it so that the volume attachment is seen by the guest os, any objections? | 14:58 |
corvus | fungi: no objections | 14:58 |
fungi | in xenwatch again, so like i saw with whichever other server that was (/me checks status log...) | 14:58 |
fungi | aha, mirror01.dfw.rax was the other one where i saw this occur | 14:59 |
fungi | #status log cinder volumes for all six elasticsearch servers have been replaced and cleaned up | 14:59 |
openstackstatus | fungi: finished logging | 14:59 |
fungi | #status log cinder volume for graphite02 (not yet in production) has been replaced and cleaned up | 15:00 |
openstackstatus | fungi: finished logging | 15:00 |
zbr | corvus: thanks, this means next step is https://review.opendev.org/#/c/748706/ | 15:06 |
openstackgerrit | Merged opendev/puppet-mosquitto master: Update set allow_anonymous flag to be explicitly true https://review.opendev.org/750659 | 15:10 |
*** bhagyashris is now known as bhagyashris|ruck | 15:11 | |
*** bhagyashris|ruck is now known as bhagyashris|rove | 15:11 | |
*** lpetrut has joined #opendev | 15:12 | |
*** mlavalle has joined #opendev | 15:17 | |
*** jhesketh has joined #opendev | 15:23 | |
clarkb | slow start here today. I'll be checking on nb03's web server then looking at updating vexxhost mirror netplan configs | 15:24 |
corvus | the weather here is apocalyptic | 15:28 |
corvus | it's super dark outside and very orange | 15:28 |
clarkb | we've avoided that here but about an hour south its really bad like that | 15:29 |
clarkb | zer0c00l: fungi re firehose for gerrit events it may be better to use the ssh event stream? I think firehose is one of the services we've talked about turning off due to lack of use | 15:34 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: synchronize-repos: Remove unecessary git path modifications https://review.opendev.org/747640 | 15:44 |
*** ysandeep is now known as ysandeep|away | 15:51 | |
clarkb | nb03's webserver looks good now. I did have to manually restart the netfilter-persistent unit. I think that was missed because the original handler for the apache2 restart failed previously? That's my best guess | 15:53 |
*** ykarel is now known as ykarel|away | 15:57 | |
clarkb | the rules files were in place on the host but the iptables rules in the kernel had not updated | 15:57 |
clarkb | anyway was an easy fix and now the webserver is accessible and https is working | 15:57 |
*** lpetrut has quit IRC | 15:59 | |
gmann | AJaeger: fungi : networking-midonet still fails on the taas fix. i think we can force merge this, which will solve the config error, and the other fixes can continue at their own pace - https://review.opendev.org/#/c/738046/5 | 16:00 |
clarkb | gmann: that doesn't even pass pep8? | 16:00 |
clarkb | should we force merge that? | 16:00 |
AJaeger | gmann: please double check https://review.opendev.org/#/c/738046/5 - just left a comment | 16:02 |
gmann | clarkb: all those failures are pre-existing failures, and retiring networking-l2gw introduces the config errors which 738046 fixes | 16:02 |
gmann | ok | 16:02 |
gmann | AJaeger: fixed | 16:03 |
clarkb | mnaser: before I go rebooting the vexxhost mirror with a new netplan config, is interactive console access available in the vexxhost cloud? Wondering if I should bother setting a root password or not | 16:16 |
fungi | you should be able to log into horizon there with our tenant creds | 16:24 |
fungi | since there were no objections, i'm rebooting the prod graphite instance now | 16:25 |
clarkb | fungi: vexxhost login says enter an email address. Any idea if there is a direct horizon link that will accept username? | 16:28 |
fungi | #status log rebooted graphite01 to resolve a page allocation failure during volume attach | 16:29 |
openstackstatus | fungi: finished logging | 16:29 |
clarkb | I guess that may be in the catalog /me checks | 16:29 |
fungi | i want to say i just went to the base url of the api hostname last time i did it | 16:29 |
*** ykarel|away has quit IRC | 16:31 | |
fungi | clarkb: oh, try https://dashboard.vexxhost.net/ | 16:31 |
clarkb | aha thanks (it wasn't in the catalog fwiw) | 16:32 |
fungi | i went poking around in my personal account there and that's where the "cloud console" tab directs you | 16:32 |
clarkb | ok I can get to what is claimed to be the console but there is no interaction it seems | 16:36 |
clarkb | just a blinking cursor location | 16:36 |
clarkb | but typing doesn't do anything and there is no text | 16:36 |
clarkb | I wonder if the image itself isn't configured properly to have libvirt hook up to a tty | 16:36 |
clarkb | I'm thinking setting a local passwd won't help much given ^ | 16:37 |
clarkb | other options we've got include using a cron job to flip the config back | 16:40 |
clarkb | (or an at job) | 16:40 |
mnaser | clarkb: the novnc console should work | 16:41 |
mnaser | maybe hit enter once | 16:41 |
clarkb | mnaser: ya I've typed a bit of stuff to try and wake it | 16:41 |
mnaser | and you will probably need to reset root password to make life easy | 16:41 |
clarkb | mnaser: I wonder if it is a mismatch between image and libvirt expectations | 16:41 |
mnaser | that's very likely, the default images work and i think they point to ttyS0.. | 16:41 |
clarkb | this is a default image but it is also focal | 16:41 |
clarkb | I'm actually thinking now I should boot a test instance and test it there (with updated ip and gateways) | 16:43 |
clarkb | that will be the least disruptive thing so doing that really quick | 16:43 |
mnaser | clarkb: i THINK netplan has a dry run option | 16:45 |
clarkb | ah ya netplan try | 16:47 |
clarkb | still I'll do a test instance as that is proper end to end testing | 16:47 |
clarkb | but good to know that is a sanity check | 16:47 |
fungi | clarkb: oh, yep, i should have remembered, there's also a nova api command (accessible via osc too) which will return the url to the instance's novnc session | 16:51 |
fungi | it's `openstack console url show` | 16:52 |
fungi | (followed by the instance id) | 16:52 |
*** dtantsur is now known as dtantsur|afk | 16:55 | |
clarkb | ok on my test instance it seems to work and the new network setup looks similar to the old except we lose the dynamic labels and the expiration times on routes | 16:57 |
clarkb | I'll approve https://review.opendev.org/#/c/750484/ now and when it goes quiet I can apply the configs to the mirror and reboot it | 16:58 |
clarkb | the only things that changed between the two host configs were the mac addr and the ipv6 addr (they share gateways too) | 16:59 |
clarkb | I'm able to ping6 google.com and review.opendev.org from the test host after rebooting too | 17:00 |
fungi | sounds like it worked then | 17:01 |
clarkb | I think that all of our arm64 vm images may be built by nb03.opendev.org now too | 17:05 |
clarkb | I'm guessing the old server is sad for some reason as a result | 17:05 |
fungi | it will cease to be sad when it's dead | 17:06 |
fungi | er, i mean, deleted | 17:06 |
clarkb | ya as a sanity check nb03 seems to be running the updated image from the gdisk and dosfstools commit so we aren't relying on a manually patched image there. Also plenty of free disk space but we've got lvm now too so can change that later if we need to | 17:07 |
clarkb | maybe ianw will be able to quickly confirm that there is no reason to keep nb03.openstack.org later today and we can clean it up | 17:08 |
clarkb | once that is done we're ready for zuul zk tls in our zuul and nodepool installation | 17:08 |
clarkb | *required zk tls | 17:08 |
clarkb | thinking about the wheel "issue" in image builds I wonder if we should install wheel into those venvs to avoid the "ERROR" messages as they catch the eye | 17:11 |
clarkb | also maybe we should see if pip can make that a warning not an error | 17:11 |
*** andrewbonney has quit IRC | 17:14 | |
fungi | not just to avoid the errors, but also to improve efficiency. pip will be able to cache them | 17:21 |
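A rough sketch of that idea: create the venv and put wheel into it before installing anything else, so pip can build and cache wheels instead of falling back to running setup.py install for sdist-only packages. The venv path here is just an example:

    import subprocess
    import venv

    path = "/tmp/example-venv"
    venv.EnvBuilder(with_pip=True).create(path)
    # With wheel available, later sdist installs build a cacheable wheel
    # rather than logging the wheel-related ERROR lines noted above.
    subprocess.check_call([f"{path}/bin/pip", "install", "wheel"])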
clarkb | change to disable vexxhost temporarily should land shortly. I'm going to pop out for a bit while the existing jobs there finish up. Back in a bit | 17:21 |
fungi | once the volume replacement i've got in progress now finishes, we'll be down to just 5 remaining out of the 29 rackspace notified us about in dfw | 17:22 |
fungi | topical, three are for nodepool builders (nb01, 02 and 04) | 17:23 |
fungi | the other two are wiki-dev, and wiki (the latter i know will be more of a challenge) | 17:23 |
openstackgerrit | Merged openstack/project-config master: Disable vexxhost for mirror work https://review.opendev.org/750484 | 17:25 |
clarkb | grafana shows nodes are draining in vexxhost now | 17:46 |
openstackgerrit | Clark Boylan proposed openstack/project-config master: Revert "Disable vexxhost for mirror work" https://review.opendev.org/750765 | 17:52 |
*** hashar has joined #opendev | 18:13 | |
*** hashar has quit IRC | 18:31 | |
clarkb | waiting patiently, now down to 5 test nodes in use | 18:40 |
AJaeger | config-core, please review https://review.opendev.org/750645 - this blocks some other changes to merge like https://review.opendev.org/750634 | 18:49 |
corvus | AJaeger: 750645+3 | 18:51 |
AJaeger | thanks, corvus | 18:51 |
zer0c00l | clarkb, fungi re. Thanks. I thought i was doing something wrong. Does the gerritbot (ircbot) use firehose mqtt or ssh? | 18:51 |
corvus | zer0c00l: ssh | 18:51 |
clarkb | zer0c00l: ssh | 18:51 |
fungi | as does zuul | 18:54 |
*** priteau has quit IRC | 18:57 | |
zer0c00l | Thanks. | 18:57 |
*** hashar has joined #opendev | 18:58 | |
zer0c00l | Here i thought mqtt would be more reliable. I already use gerritlib(ssh), i have unexplained 'hangs'. It just stops receiving events. | 18:58 |
fungi | zer0c00l: have you tried enabling ssh keepalive? | 19:00 |
fungi | it may be extended silent periods are causing a firewall or nat somewhere to discard what it sees as a dead tcp state | 19:00 |
openstackgerrit | Merged openstack/project-config master: Fix networking-l2gw missed change https://review.opendev.org/750645 | 19:00 |
AJaeger | clarkb: could you review https://review.opendev.org/#/c/750634 to start retiring pynotedb, please? | 19:02 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: pynotedb: remove project from infrastructure systems https://review.opendev.org/750635 | 19:03 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: pynotedb: remove project from infrastructure systems https://review.opendev.org/750635 | 19:04 |
zer0c00l | fungi: I am basically running a forked version of https://opendev.org/opendev/gerritbot | 19:06 |
zer0c00l | Wonder if i can enable keepalive there. | 19:06 |
fungi | it's using gerritlib which is using paramiko, so if paramiko supports ssh keepalive, i imagine we'd be fine merging patches to support it in gerritlib and gerritbot | 19:07 |
* fungi checks | 19:07 | |
fungi | http://docs.paramiko.org/en/stable/api/transport.html#paramiko.transport.Transport.set_keepalive | 19:09 |
openstackgerrit | Merged openstack/project-config master: Add monasca-tempest-plugin in vexxhost tenant https://review.opendev.org/750627 | 19:10 |
zer0c00l | fungi: thanks, let me see if i can modify gerritlib to use keepalives | 19:12 |
fungi | yeah, we're instantiating a paramiko.SSHClient() so i'm looking to see if there's a way to set it through there | 19:13 |
fungi | similar to how we set_missing_host_key_policy() | 19:13 |
fungi | zer0c00l: looks like we ought to be able to access it through http://docs.paramiko.org/en/stable/api/client.html#paramiko.client.SSHClient.get_transport | 19:15 |
fungi | so probably something like calling client.get_transport().set_keepalive(60) | 19:16 |
fungi | you might want to locally patch your gerritlib first and see if it solves your issue... if it does, we likely want to have gerritlib.gerrit.GerritConnection() grow a new attribute tied to it defaulting to 0, and then make setting that configurable in gerritbot | 19:18 |
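A minimal sketch of the approach fungi outlines, assuming paramiko's standard API and Gerrit's usual SSH port 29418; the hostname and username are placeholders, and in gerritlib this would live wherever the SSHClient connection is set up:

    import paramiko

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect("review.opendev.org", port=29418, username="example-user")
    # Send an SSH-level keepalive every 60 seconds so long idle stretches on
    # the event stream are not reaped by a firewall or NAT along the path.
    client.get_transport().set_keepalive(60)

    stdin, stdout, stderr = client.exec_command("gerrit stream-events")
    for line in stdout:
        print(line, end="")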
clarkb | what is the issue? | 19:19 |
fungi | gerritbot ssh sessions to gerrit hanging | 19:19 |
fungi | from initial description, sounds like it could be something as simple as a firewall killing idle tcp connections | 19:20 |
fungi | (not the gerritbot instance we're running, one zer0c00l has) | 19:21 |
fungi | which is why i suggest first just one-liner patching gerritlib on his end to turn on keepalive in paramiko and see if it helps at all | 19:22 |
clarkb | got it | 19:22 |
clarkb | vexxhost shows in use 0 | 19:22 |
clarkb | going to update mirror configs and reboot it now | 19:22 |
clarkb | ssh is responding but telling me non root users have to wait longer | 19:26 |
clarkb | I suppose thats a good sign | 19:26 |
fungi | indeed | 19:26 |
clarkb | and I'm in | 19:27 |
clarkb | routes and addresses look good and ping6 works to google.com and review.opendev.org | 19:27 |
clarkb | https://mirror.ca-ymq-1.vexxhost.opendev.org/ is also accessible | 19:28 |
clarkb | any objections to approving https://review.opendev.org/#/c/750765/ now to reenable the vexxhost usage? | 19:28 |
clarkb | #status log Configured mirror01.ca-ymq-1.vexxhost.opendev.org to configure its ipv6 networking statically with netplan rather than listen to router advertisements. | 19:29 |
openstackstatus | clarkb: finished logging | 19:29 |
*** Goneri has joined #opendev | 19:31 | |
clarkb | I've deleted my test node since the mirror itself is happy | 19:33 |
fungi | no objection | 19:34 |
openstackgerrit | Merged openstack/project-config master: Retire the devstack-plugin-zmq project https://review.opendev.org/748724 | 19:36 |
*** tosky has quit IRC | 19:37 | |
openstackgerrit | Merged openstack/project-config master: pynotedb: end project gating https://review.opendev.org/750634 | 19:43 |
AJaeger | infra-root, https://review.opendev.org/#/c/597402 is the next change to retire pynotedb, please review | 19:49 |
AJaeger | once that is merged, https://review.opendev.org/#/c/750635/ finishes the retirement. | 19:49 |
openstackgerrit | Merged openstack/project-config master: Revert "Disable vexxhost for mirror work" https://review.opendev.org/750765 | 19:51 |
*** openstackgerrit has quit IRC | 20:17 | |
clarkb | vexxhost is in use again | 20:33 |
*** hashar has quit IRC | 21:06 | |
clarkb | corvus: for restarting the scheduler and web to pick up the hold change, is that something we want to get done today? | 21:13 |
clarkb | infra-root I've removed my WIP from https://review.opendev.org/#/c/749853/2 as I think we can probably start cleaning up nb03.opendev.org now | 21:14 |
clarkb | er nb03.openstack.org is the one to cleanup | 21:14 |
corvus | clarkb: yeah, i can do that in a few mins | 21:15 |
clarkb | k I'm around and can help too | 21:16 |
corvus | huh, i just noticed there's a bunch of whitespace at the bottom of the status page; i wonder if that's a pf4 thing | 21:16 |
clarkb | hrm I don't see that | 21:17 |
clarkb | https://zuul.opendev.org/t/openstack/status is what i'm looking at | 21:17 |
corvus | me too. the scroll bar is about 30% down from the top when i've scrolled to the end of the actual content | 21:18 |
corvus | it's the error drawer | 21:18 |
corvus | with 111 errors | 21:19 |
corvus | i guess that's fine then | 21:19 |
corvus | now looks like a good time to restart | 21:19 |
clarkb | the errors are the x/networking-l2gw stuff iirc and AJaeger and gmann have been working through those | 21:20 |
gmann | clarkb: we need to force merge this to get rid of config error - https://review.opendev.org/#/c/738046/ | 21:21 |
gmann | all failure in this patch are existing failure in that repo | 21:21 |
clarkb | corvus: should I go ahead and force merge ^ now or wait for after the restart? | 21:22 |
gmann | I am not sure when those failure will be fixed as networking-midonet is in no maintainer situation | 21:22 |
gmann | http://lists.openstack.org/pipermail/openstack-discuss/2020-September/017116.html | 21:23 |
corvus | clarkb: maybe wait till restart | 21:23 |
clarkb | corvus: k will wait | 21:23 |
corvus | which is imminent | 21:23 |
clarkb | gmann: ya so I guess we're fixing zuul but then the repo itself will need someone to step in and address the issues | 21:23 |
gmann | yeah. | 21:24 |
clarkb | corvus: looks like the most recent run of service-zuul failed | 21:24 |
corvus | clarkb: the image looked up to date so i proceeded | 21:25 |
clarkb | ze09 is unreachable is the issue there | 21:25 |
clarkb | so ya should be fine to proceed | 21:25 |
corvus | this restart will probably take 20+ minutes due to schema update | 21:25 |
clarkb | I'm able to ssh into ze09 so not sure what is going on there | 21:25 |
corvus | weird, ze09 seems to be working. agreed | 21:26 |
corvus | running jobs and everything | 21:26 |
corvus | maybe a fluke? | 21:26 |
clarkb | ya I guess check it in an hour and see if it persists (we run service-zuul hourly) | 21:26 |
corvus | in other news, the light outside is brighter and now looks like dawn | 21:30 |
corvus | turns out we really needed a nuclear winter to cool off after the heat wave | 21:31 |
clarkb | we seem to be just north of the worst of it here. The photos coming out of places to the south of us are crazy though | 21:33 |
corvus | we learned this morning that google removed the manual white balance feature from the android camera app. fortunately we had a better camera handy. | 21:35 |
clarkb | https://i.redd.it/c4d2tvxsryl51.jpg was Salem yesterday | 21:35 |
smcginnis | Dang! | 21:36 |
clarkb | thats about 45-60 minutes south of here depending on traffic | 21:36 |
smcginnis | I better check in on my friends in Eugene. | 21:37 |
clarkb | a bunch of towns up hill from salem burned down yesterday too | 21:37 |
clarkb | its weird because we're all being told to stay home and off roads again but not due to the pandemic | 21:38 |
fungi | that is, like, alien abduction supernatural horror movie red sky | 21:40 |
johnsom | The EPA AQI tops out at 500 which is "respirator recommended". We are currently somewhere between 710 and 551 | 21:40 |
JayF | We're checking a /lot/ of apocalypse boxes for 2020. | 21:41 |
johnsom | We have had the total recall experience since Monday night. | 21:41 |
clarkb | johnsom: I think if you get north of about wilsonville it clears up a bunch | 21:44 |
clarkb | still not great but much better than that | 21:44 |
corvus | wow that's red. we're more orange. this is pretty representative of what we saw this morning: https://s.hdnux.com/photos/01/14/01/75/19930957/5/gallery_xlarge.jpg | 21:44 |
clarkb | and the wind changes in a couple days and it will all be different again too | 21:44 |
johnsom | Yeah, I have checked the road cameras and air quality sites. | 21:44 |
fungi | i'll just stay in my bunker. thanks | 21:45 |
johnsom | Hmm, just got a "Proxy Error" from https://zuul.openstack.org/status "Reason: Error reading from remote server" Apache/2.4.18 (Ubuntu) Server at zuul.openstack.org Port 443 | 21:46 |
corvus | i'm restarting it | 21:46 |
clarkb | and it needs to do a db migration so a bit slower than normal | 21:46 |
corvus | #status log restarted zuul scheduler and web at commit 6a5b97572f73a70f72a0795d5e517cff96740887 to pick up held db attribute | 21:47 |
openstackstatus | corvus: finished logging | 21:47 |
corvus | re-enqueue is in progress | 21:47 |
clarkb | corvus: now I see the extra scroll space | 21:48 |
clarkb | corvus: I think its rendering to the bottom pre restart then remembering that length but when there are fewer changes (because reenqueue is serial) it renders the shorter list in the bigger space | 21:49 |
clarkb | (I didn't open the error drawer but maybe it does the same thing) | 21:49 |
clarkb | corvus: good to force merge that change now? | 21:51 |
corvus | clarkb: i think it's only the error drawer. the drawer is always there but normally invisible. the length of my page is exactly the length required to display it. | 21:52 |
corvus | clarkb: and yes, gtg | 21:52 |
corvus | reenqueue is finished | 21:52 |
fungi | that explains why i saw something similar on a build result page | 21:52 |
clarkb | https://review.opendev.org/#/c/738046/ is merged | 21:53 |
corvus | yeah, i'd expect it on any sufficiently short page in the openstack tenant | 21:53 |
clarkb | gmann: ^ fyi | 21:53 |
clarkb | corvus: ah got it | 21:53 |
fungi | so the amount of blank space presumably could be observed to differ between tenants based on how many errors a given tenant has | 21:53 |
corvus | at least for the next few minutes :) | 21:53 |
corvus | fungi: yep | 21:53 |
gmann | clarkb: thanks | 21:54 |
clarkb | gmann: are there stable branch fixes for networking-midonet too? | 22:00 |
gmann | clarkb: not yet. i can backport it but not sure if their stable branches are active or not | 22:02 |
clarkb | gmann: well the errors are there as well. | 22:02 |
gmann | clarkb: ok. so i need to backport this and then you can force merge? | 22:04 |
gmann | it seems we need to do that back until ocata | 22:04 |
clarkb | yes I can help with that | 22:05 |
gmann | ok. doing | 22:05 |
gmann | give me 5 min, in between of fixing openstacksdk for Focal migration | 22:05 |
*** openstackgerrit has joined #opendev | 22:09 | |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Remove nodepool builder puppetry and nb03.openstack.org https://review.opendev.org/749853 | 22:09 |
clarkb | noticed a relatively small groups thing I missed but figured I'd update the change to be more complete | 22:10 |
clarkb | ianw: ^ if you have time to look at the new server and check if it is happy then review ^ that would be great (from what I can see the new server is good) | 22:10 |
clarkb | but then if ^ land I think we can just delete the server and its volume? | 22:10 |
ianw | heh, i thought i just +2'd that :) | 22:10 |
clarkb | oh maybe you did, sorry | 22:10 |
ianw | yeah, lgtm, when i left yesterday everything was built on the new server | 22:11 |
ianw | so if it hasn't exploded overnight, seems good :) | 22:11 |
clarkb | its not super clear to me what the best way is to clean up the images that nb03.openstack.org has built | 22:11 |
clarkb | probably just manually rm those when we are done? | 22:11 |
ianw | hrm, could we put everything on pause in the config there and then delete them using the cli? | 22:13 |
clarkb | ya that may work | 22:14 |
clarkb | we can't remove the images from the image list as that will delete what the new server is building | 22:14 |
clarkb | but pausing then asking nodepool to delete them may be easiest | 22:14 |
* clarkb writes the pause change | 22:14 | |
openstackgerrit | Clark Boylan proposed openstack/project-config master: Pause image builds on nb03.openstack.org https://review.opendev.org/750822 | 22:16 |
clarkb | ianw: ^ something like that | 22:16 |
ianw | clarkb: i *think* it's still unpuppeted? or did we fix that? | 22:17 |
ianw | nb03.openstack.org # ianw 2020-05-20 hand edits applied to dib to build focal on xenial | 22:18 |
ianw | no, we didn't ... so i guess just apply by hand and we can get rid of it asap? | 22:18 |
clarkb | wfm I'll make that change manually | 22:18 |
clarkb | wait it looks like they are all paused already | 22:19 |
clarkb | should I go ahead and ask nodepool to delete the dib images on nb03.openstack.org? | 22:19 |
ianw | ohh, istr fungi doing something with it | 22:20 |
ianw | i think so, let's get rid of it rather than do an archaeological dig into what's going on :) | 22:20 |
clarkb | alright I'll do dib-image-delete on all the arm64 images that don't show nb03.opendev.org as their builder | 22:21 |
clarkb | that should cause nodepool to go and try and clean things up automatically | 22:21 |
clarkb | when that is done we can land https://review.opendev.org/749853 then delete the server and volume | 22:22 |
clarkb | deletes have been requested. Some have deleted, others haven't. I wonder if there will be issues :/ | 22:26 |
clarkb | ianw: the images which are having a hard time deleting are quite old. I wonder if they are just super stale? | 22:29 |
ianw | it's not trying on linaro-london is it? | 22:29 |
clarkb | oh ha I think that is it | 22:30 |
ianw | all those 118 day old ones are there | 22:30 |
clarkb | the london cloud is gone right? so we should edit the zk db to clear those upload records instead? | 22:30 |
ianw | yeah it is, i think it went away ungracefully from what i remember. as in wasn't working | 22:31 |
ianw | do have instructions for hand-edit zk? i've not done but would be ahppy to learn | 22:32 |
ianw | happy even | 22:32 |
clarkb | ianw: I think there are simple instructions somewhere let me see | 22:32 |
clarkb | ianw: https://docs.opendev.org/opendev/system-config/latest/nodepool.html#zookeeper that talks about the client. | 22:33 |
clarkb | ianw: but then it's a simple fs-like navigation system. `help` for commands, `ls` to list things, `cd` to change your context, `get` to get a node's contents iirc | 22:34 |
clarkb | I think what we want to do is delete the upload records for those images, then the rest will be done automatically | 22:34 |
clarkb | I can double check things if you like | 22:34 |
ianw | ok, let me see if i can get something talking | 22:35 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Block port 2181 on zookeeper hosts https://review.opendev.org/750833 | 22:35 |
ianw | i wonder if it's harder with containers and ssl etc | 22:35 |
clarkb | ianw: ssl does make it harder ^ is sort of related | 22:35 |
clarkb | ianw: we keep listening on port 2181 too so if you run this on the zk server itself you can talk insecurely locally | 22:35 |
clarkb | I think that is going to be our keep things simple mode of operation going forward, listen on 2181 and 2281 but only expose 2281 outside the host | 22:36 |
ianw | yeah "connect localhost" seems to have worked | 22:38 |
clarkb | I think you want to rm paths like /nodepool/images/centos-8-arm64/builds/0000000043/providers/linaro-london/images/0000000001 | 22:38 |
clarkb | and if you do those leaf nodes nodepool should cleanup the rest of it for us | 22:39 |
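As a scripted alternative to the interactive zk shell clarkb describes, the same leaf-znode deletion could be done with kazoo (the client library nodepool itself uses). The path below is one of the stale linaro-london upload records from this discussion, and localhost:2181 is the plaintext listener mentioned above:

    from kazoo.client import KazooClient

    zk = KazooClient(hosts="localhost:2181")
    zk.start()

    # Deleting just the provider upload leaf lets nodepool's own cleanup
    # worker remove the rest of the stale build/provider records.
    path = ("/nodepool/images/centos-8-arm64/builds/0000000043"
            "/providers/linaro-london/images/0000000001")
    if zk.exists(path):
        zk.delete(path)

    zk.stop()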
ianw | http://paste.openstack.org/show/797675/ | 22:41 |
ianw | is the list | 22:41 |
ianw | i'll try /nodepool/images/centos-8-arm64/builds/0000000045/providers/linaro-london/images/0000000001 | 22:42 |
clarkb | those paths look like what we want to delete to me | 22:42 |
ianw | ok, the upper paths seem all gone, so i'd say it's cleaned up properly | 22:44 |
ianw | i'll do the rest | 22:44 |
*** auristor has quit IRC | 22:45 | |
gmann | clarkb: https://review.opendev.org/#/q/I9231b2b362a1f2316307908b7e9ad57a709700f6 | 22:45 |
ianw | clarkb: ok, no more linaro london | 22:45 |
clarkb | gmann: thanks I'll work on that shortly | 22:46 |
ianw | what's the deal with the 168 day old gentoo images? | 22:46 |
clarkb | ianw: no idea | 22:47 |
ianw | 2020-05-07 15:24:52.286 | tar (child): xz: Cannot exec: No such file or directory | 22:48 |
*** auristor has joined #opendev | 22:49 | |
clarkb | gmann: done | 22:55 |
gmann | clarkb: thank. | 22:55 |
gmann | clarkb: i will backport networking-odl too and ask lajos to merge once he is online tomorrow - https://review.opendev.org/#/c/738074/ | 22:56 |
gmann | after that networking-l2gw error should disappear completely | 22:57 |
fungi | sorry, post dinner catching up... what was i doing something with? paused image builds on old nb03? | 22:58 |
clarkb | fungi: ya, we decided we're just going to delete it anyway so figuring that out doesn't matter much :) | 22:58 |
fungi | good, because i don't know that my memory can take the strain at this point | 22:59 |
fungi | good riddance | 23:00 |
fungi | #status log cinder volume for graphite01 (current production server) has been replaced and cleaned up | 23:01 |
openstackstatus | fungi: finished logging | 23:01 |
fungi | i'm working on nb01/02/04 now but should be transparent | 23:02 |
clarkb | my next builder question is should we rm nb04? | 23:02 |
clarkb | I'm not sure if we need 3 x86 builders (it doesn't hurt but is it necessary?) | 23:02 |
fungi | how's our disk space? that's the primary indicator, right? | 23:04 |
clarkb | they are each using about half of their disk | 23:07 |
clarkb | so cutting out 04 would push 01 and 02 to 3/4 of their disk or so | 23:07 |
*** Goneri has quit IRC | 23:21 | |
*** cloudnull4 has joined #opendev | 23:26 | |
*** cloudnull has quit IRC | 23:26 | |
*** cloudnull4 is now known as cloudnull | 23:26 | |
fungi | seems okay | 23:39 |