opendevreview | Merged opendev/system-config master: zuul-*: use multiline formatter https://review.opendev.org/c/opendev/system-config/+/821508 | 00:06 |
---|---|---|
corvus | web is back | 00:16 |
corvus | oh sort of a shame that hadn't merged... could always roll again | 00:16 |
fungi | yeah, ianw was asking if we should wait for that | 00:17 |
fungi | i think he was offering to do the restart once it landed | 00:17 |
ianw | i can if we like, or just wait till next time | 00:20 |
ianw | hopefully i won't have immediate need to debug any multi-line errors :) | 00:20 |
corvus | oh sorry i thought that was a conversation with clarkb on a different subject | 00:22 |
corvus | i understand how to demultiplex the multiple conversations now, but i did not at the time. | 00:22 |
fungi | ianw: i think it's more about test-driving what should be included in zuul 4.11.0, if we don't test drive zuul with that change in place, then we may not want to include it in that release (or maybe just roll the dice, it's tested anyway) | 00:23 |
ianw | fungi: oh, well in this case the multiline logger has been in for a long time, we just have been overriding it with our log config | 00:24 |
fungi | aha, nevermind then | 00:24 |
fungi | i agree it's not as urgent to add. now i see that was a system-config change anyway, not a zuul change | 00:24 |
fungi | it'll be in 4.11.0 regardless | 00:25 |
fungi | we just don't know if we might discover bugs with it | 00:26 |
ianw | clarkb: it looks like you got errno 111 (connection refused) instead of 113 (no route to host)? | 00:26 |
ianw | that ... seems right | 00:27 |
fungi | well, no route to host is what we expect from the firewall's default reject rule | 00:27 |
fungi | unless we have a tcp-specific rule setting --reject-with tcp-reset | 00:28 |
ianw | interesting ... certainly proven it's a good thing to be running tests against | 00:29 |
clarkb | hrm ya I was getting 113 against prod | 00:31 |
clarkb | fungi: ianw: any idea why that would've happened in testing? | 00:31 |
fungi | hitting an interface we're allowing traffic to but the service isn't listening on? | 00:32 |
clarkb | oh I see it ya | 00:33 |
clarkb | it's talking to 127.0.0.1 because I asked the zk host for its address | 00:33 |
clarkb | that's an interesting behavior | 00:33 |
ianw | 127.0.1.1 | 00:33 |
clarkb | I need to get the zk address some other way to get the external addresses | 00:33 |
clarkb | or maybe 127.0.1.1 and the external are both included. I'll strip out 127\..* to check that | 00:33 |
clarkb | yup that is it | 00:34 |
fungi | make sure you exclude ::1 as well | 00:34 |
clarkb | ++ | 00:35 |
ianw | you could also probably just connect to "zk0X.opendev.org" in the connect() directly? that will look it up on bridge? | 00:35 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add firewall behavior assertions to testinfra testing https://review.opendev.org/c/opendev/system-config/+/821780 | 00:36 |
clarkb | ianw: ya, but my concern with that is I might get the actual prod host | 00:37 |
clarkb | I figured looking up the name on the system configured with that hostname was going to be most reliable even if we didn't do /etc/hosts for all hosts in the multinode job | 00:37 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: WIP: Switch docs theme to RTD https://review.opendev.org/c/zuul/zuul-jobs/+/821918 | 00:37 |
ianw | either or; you'd also have AF_INET and AF_INET6 to contend with | 00:38 |
ianw | although i do think in other tests we've made the assumption that the host resolution is working | 00:38 |
ianw | it is another good argument for making the testing hosts like zk99 as well | 00:38 |
clarkb | ya looking at it we aren't actually getting the AF_INET6 ip for some reason | 00:42 |
clarkb | I wonder if we don't set up the ipv6 addr in /etc/hosts | 00:42 |
ianw | perhaps not if we couldn't do it consistently? | 00:45 |
clarkb | ya that may be why | 00:46 |
fungi | ahh, no global v6 in some test providers, yep | 00:47 |
clarkb | we also do the overlay networks over ipv4 only because linux didn't support vxlan over ipv6 for the longest time | 00:47 |
opendevreview | wangxiyuan proposed openstack/project-config master: Add openEuler disto support for elements https://review.opendev.org/c/openstack/project-config/+/821794 | 01:37 |
fungi | #status log Our jitsi-meet services including meetpad.opendev.org are shut down temporarily again, out of an abundance of caution awaiting newer images | 01:38 |
opendevstatus | fungi: finished logging | 01:38 |
join_subline | curious, is there a ballpark stats of the number of users using the meetpad per hour,day,week,etc . i've used meetpad, meet.jit.si, and 8x8.vc . have gotten the audio to crash on meet.jit.si, and experienced UI lag / freezing with 8x8.vc . but overall, very useful webservice | 02:03 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: centos: work around 9-stream BLS issues https://review.opendev.org/c/openstack/diskimage-builder/+/821772 | 02:04 |
opendevreview | Merged openstack/project-config master: Add openEuler disto support for elements https://review.opendev.org/c/openstack/project-config/+/821794 | 02:21 |
*** rlandy|ruck|bbl is now known as rlandy|ruck | 02:30 | |
*** rlandy|ruck is now known as rlandy|out | 02:38 | |
clarkb | join_subline: I think it tends to be more in bursts rather than sustained. Like when we have an event that uses it | 03:01 |
join_subline | ah, ic. well, i think i must be close to being in the top tier of etherpad.opendev.org powerusers. i write so much into this (mostly just copying links), but thnxs for giving me a place to put all my online stuff. that reminds me, i should run my download script for the /p/<pad>/export/txt , so i have a copy of everything i put up. 🤞 | 03:10 |
*** pojadhav|afk is now known as pojadhav | 04:30 | |
*** ysandeep|out is now known as ysandeep | 06:22 | |
opendevreview | Merged openstack/diskimage-builder master: Install only python3 pip in debian bullseye https://review.opendev.org/c/openstack/diskimage-builder/+/820563 | 06:36 |
opendevreview | Vishal Manchanda proposed openstack/project-config master: Add "Review-Priority" label to horizon project https://review.opendev.org/c/openstack/project-config/+/821934 | 06:36 |
*** ysandeep is now known as ysandeep|lunch | 07:38 | |
*** jpena|off is now known as jpena | 08:01 | |
*** TheMaster is now known as Unit193 | 08:10 | |
*** ysandeep|lunch is now known as ysandeep | 08:17 | |
opendevreview | Merged openstack/project-config master: Add openstack-venus irc channel in access an gerrit bot https://review.opendev.org/c/openstack/project-config/+/821875 | 08:59 |
wxy-xiyuan_ | ianw: the image build works now https://nb02.opendev.org/openEuler-20.03-LTS-SP2-0000000055.log Is there any place I can get the nodepool-launcher log? Thanks. | 09:36 |
opendevreview | Merged opendev/system-config master: Add openstack-venus channel in statusbot https://review.opendev.org/c/opendev/system-config/+/821882 | 09:54 |
frickler | wxy-xiyuan_: I don't think there is. I'll try to take a look myself | 09:54 |
frickler | wxy-xiyuan_: infra-root: there are errors uploading the openEuler image which I don't understand yet | 10:02 |
*** ysandeep is now known as ysandeep|afk | 10:32 | |
wxy-xiyuan_ | frickler: Thanks for help | 10:36 |
frickler | we seem to have a bug when the image name contains a "." | 10:44 |
frickler | FileNotFoundError: [Errno 2] No such file or directory: '/opt/nodepool_dib/openEuler-20.vhd' | 10:44 |
frickler | -rw-r--r-- 1 nodepool nodepool 21615583744 Dec 16 05:27 /opt/nodepool_dib/openEuler-20.03-LTS-SP2-0000000055.vhd | 10:45 |
frickler | infra-root: ^^ I can't debug further now, please have a look | 10:48 |
*** sshnaidm|afk is now known as sshnaidm | 10:54 | |
*** ysandeep|afk is now known as ysandeep | 11:16 | |
*** rlandy|out is now known as rlandy|ruck | 11:17 | |
*** dviroel|out is now known as dviroel|rover | 11:26 | |
*** pojadhav is now known as pojadhav|brb | 12:30 | |
*** pojadhav|brb is now known as pojadhav | 12:52 | |
fungi | seems likely to be a greedy filename chop in the builder | 14:25 |
fungi | i'll try to look shortly | 14:25 |
fungi | not finding any obvious places yet where we make assumptions about filenames containing only one '.' | 14:34 |
fungi | but there's a lot of places in nodepool.builder where the filename gets touched or passed through | 14:35 |
fungi | also possible this is happening inside the openstack sdk | 14:43 |
fungi | it may be easier to insert more debugging statements into nodepool in order to track down at what point the image name gets corrupted | 14:44 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: Switch docs theme to RTD https://review.opendev.org/c/zuul/zuul-jobs/+/821918 | 14:48 |
frickler | fungi: maybe rather amend the image name to -20-03- for now? | 14:55 |
frickler | the other thing that worries me is that nodepool seems to be looping trying to redo the upload every 10s or so | 14:56 |
clarkb | I suspect the with_suffix in DibImageFile.to_path() is to blame | 15:01 |
clarkb | yes just reproduced | 15:02 |
clarkb | https://paste.opendev.org/show/bbXSvqbxVTJ7D6WBHo5N/ | 15:02 |
clarkb | I think that behavior is correct from pathlib. Everything after the last . in the /opt/nodepool_dib/openEuler-20.03-LTS-SP2-0000000055 name is treated as an extension and replaced when we set the .vhd extension on it | 15:04 |
clarkb | Probably best to remove the .'s from the image name for now. Unless someone has a good idea for fixing that | 15:04 |
fungi | i'll have to look closer after meetings are over | 15:04 |
opendevreview | Merged zuul/zuul-jobs master: Switch docs theme to RTD https://review.opendev.org/c/zuul/zuul-jobs/+/821918 | 15:26 |
*** dviroel|rover is now known as dviroel|rover|lunch | 15:51 | |
*** ysandeep is now known as ysandeep|out | 16:34 | |
*** dviroel|rover|lunch is now known as dviroel|rover | 16:42 | |
*** marios is now known as marios|out | 16:45 | |
*** jpena is now known as jpena|off | 16:58 | |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Add a domain aliases mechanism to lists.o.o https://review.opendev.org/c/opendev/system-config/+/821914 | 16:59 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Create an OpenInfra Foundation staff ML https://review.opendev.org/c/opendev/system-config/+/821915 | 16:59 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Forward messages for OpenInfra Foundation staff ML https://review.opendev.org/c/opendev/system-config/+/821916 | 16:59 |
clarkb | those changes lgtm though exim is always magic to me :) | 17:28 |
fungi | yeah, turing-complete configuration languages always seem like a good idea at the beginning, but sometimes you end up spilling your program's guts all over the user as time goes on | 17:31 |
fungi | anyway, the reason that got complicated is i realized that while our mailrouting through the mta is virtual domain aware such that we can have the same list name on multiple domains, the rudimentary /etc/aliases format (in order to be backward-compatible with sendmail) isn't | 17:34 |
clarkb | ah | 17:34 |
fungi | anyway, assuming the testinfra test i added in 821916 passes, i'll stack a dnm break on top and set an autohold so i can manually test message delivery through the mailrouted added by 821914 | 17:39 |
fungi | https://zuul.opendev.org/t/openstack/build/4f6f3417f10c4f65b01f0890cac0ab96/log/lists.openstack.org/aliases.domain.txt definitely has the content i want exim to act on, and the added test also confirms it. actually exercising that forward, on the other hand, is a little more tricky so i'll use an autohold | 17:51 |
clarkb | sounds good | 17:51 |
fungi | in theory we could have testinfra send something to the old address over localhost and then look for signs in the log that it tried to use the new address instead, but probably not worth it for now | 17:53 |
fungi | also test nodes may not actually accept deliveries for the production hostnames in their current state | 17:53 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: DNM: break mailman testing for an autohold https://review.opendev.org/c/opendev/system-config/+/822007 | 18:03 |
kopecmartin | Clark[m]: hi, just out of curiosity, how many resources were needed to make openstack-health run? i guess storage was the biggest one | 18:10 |
clarkb | kopecmartin: the biggest issue is people. We need people to update the configuration management of the system, fix it when it breaks (like it is currently broken), upgrade the services and so on | 18:11 |
fungi | kopecmartin: yeah, probably storage, we've got a 0.5tb mysql database (trove instance) | 18:11 |
fungi | from a non-wetware perspective | 18:11 |
clarkb | the hosting is an aspect of it, but like the ELK stuff all of the work there needs maintenance to bring it up to speed on current practices and supported operating systems | 18:11 |
clarkb | Specifically for openstack-health we need to update the deployments of the subunit2sql workers, the health api server, and status.o.o. The operating systems need to be upgraded and the configuration management needs to be converted from puppet to ansible + docker | 18:12 |
clarkb | Then the health software also needs maintenance/updating but gmann would have more info on that | 18:13 |
kopecmartin | yeah , people are the crucial point of this, i know .. all said on the call still stands, i was just wondering about the resources | 18:14 |
clarkb | kopecmartin: for the resources it's a couple of subunit2sql workers I think they are 4vcpu 4GB memory. The api server and the web server which is also reasonably small (8vcpu + 8GB memory?) and then a large database server | 18:15 |
clarkb | *for the hardware resource | 18:15 |
kopecmartin | thanks | 18:16 |
gmann | yeah, we do not have anyone currently to maintain the software (repo) itself. even 1 person is enough to keep it in working condition as not much new things to implement but just bug fixes if it is broken | 18:20 |
gmann | developer with JS skill | 18:20 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Add a domain aliases mechanism to lists.o.o https://review.opendev.org/c/opendev/system-config/+/821914 | 19:22 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Create an OpenInfra Foundation staff ML https://review.opendev.org/c/opendev/system-config/+/821915 | 19:22 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Forward messages for OpenInfra Foundation staff ML https://review.opendev.org/c/opendev/system-config/+/821916 | 19:22 |
fungi | clarkb: ^ through manual testing of the held node i found a problem with my domain aliases mailrouter, and also some options for simplifying it | 19:22 |
clarkb | cool I'll rereview shortly | 19:25 |
fungi | i tested that with both staff@lists.openstack.org and staff@lists.openinfra.dev mailing lists created, messages to the former were delivered to the latter mailing list as intended, while messages to the latter also went directly to the list as usual. then i swapped the alias around and tested both of them again, just to make sure the behavior was reversed and we weren't simply delivering to | 19:25 |
fungi | whatever the first list it found happened to be | 19:26 |
fungi | testing was done using the mail utility locally to inject messages through the loopback interface so that it wouldn't hit the egress reject we added | 19:26 |
fungi | and then i examined mailman's smtp log for each of the lists involved | 19:27 |
fungi | so thorough enough to exercise the exim configuration anyway | 19:28 |
clarkb | seems to do what it says on the tin +2 | 19:33 |
fungi | thanks | 19:37 |
fungi | if one more infra-root has a chance to at least look at 821914 i'll feel pretty good about the direction there, the one after it (821914) is just a simple ml addition | 19:41 |
fungi | er, 821915 is the one after it i meant | 19:42 |
fungi | once both of those deploy, assuming no new errors arise with the list creation, i'll work on the manual steps for copying the list config and subscribers in preparation for landing the forwarding change (821916) | 19:43 |
opendevreview | Ian Wienand proposed openstack/project-config master: nodepool: Remove . from openEuler name https://review.opendev.org/c/openstack/project-config/+/822046 | 20:40 |
clarkb | oh right I meant to check if that had been done yet | 20:41 |
clarkb | we might need to manually clean up the old files but that shouldn't be too much trouble. I've approved ^ | 20:42 |
fungi | thanks ianw, i too hadn't gotten to doing that yet | 20:47 |
fungi | should we switch it to all lower-case as well? all our other labels are | 20:49 |
fungi | now would be the time to decide, lest we have to clean up twice | 20:49 |
ianw | fungi: no probs -- since i'm not really here today and it is going in i think it's ok | 20:51 |
ianw | probably good idea to have some mixed case and periods (or full-stops as we more civilised people call them :) to shake this out | 20:52 |
ianw | i've put really fixing it in nodepool on the todo | 20:52 |
fungi | awesome, thanks again | 20:53 |
fungi | and if you're not really here today, you should get on with not being here! ;) | 20:53 |
fungi | i'll go ahead and approve the domain aliases addition for lists.o.o as well as the new staff ml creation for lists.openinfra.dev so i can try to work on the list move some this evening | 20:55 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: centos: work around 9-stream BLS issues https://review.opendev.org/c/openstack/diskimage-builder/+/821772 | 20:56 |
opendevreview | Merged openstack/project-config master: nodepool: Remove . from openEuler name https://review.opendev.org/c/openstack/project-config/+/822046 | 20:58 |
fungi | i also removed the list servers from the emergency disable list a little while ago, now that we believe our orchestrated list creation to be working | 21:00 |
fungi | and i need to start putting dinner together while i wait for those to merge and deploy | 21:01 |
fungi | clarkb: the . removal deployed, we're clear to delete the images on disk i guess? | 21:09 |
Clark[m] | Maybe check that it isn't trying to upload the image still first? | 21:12 |
Clark[m] | Just finishing up lunch then I can take a look too if that helps | 21:12 |
fungi | i'll try to look in a moment | 21:12 |
*** dviroel|rover is now known as dviroel|out | 21:21 | |
fungi | once i finish eating | 21:25 |
fungi | okay, back and working on image cleanup | 21:50 |
fungi | openEuler-20.03-LTS-SP2-0000000055 is ready and openEuler-20-03-LTS-SP2-arm64-0000000001 is building, so it should be safe to dib-image-delete openEuler-20.03-LTS-SP2-0000000055 | 21:54 |
Clark[m] | ++ | 21:54 |
Clark[m] | Except you might have to delete the records from zk too? Not sure how it handles that | 21:55 |
fungi | image-list reports several failed state openEuler-20.03-LTS-SP2 images in ovh-bhs1 | 21:55 |
Clark[m] | nodepool dib-image-list should tell you if there is still a build record for the old one | 21:55 |
fungi | oh, bad news... it seems to want to build both | 21:56 |
fungi | after the dib-image-delete there's a openEuler-20.03-LTS-SP2-0000000056 in building state | 21:56 |
fungi | aha, the configuration on nb02 isn't updated | 21:57 |
fungi | -rw-r--r-- 1 root root 14872 Dec 14 06:43 /etc/nodepool/nodepool.yaml | 21:58 |
fungi | shouldn't the deploy job have taken care of that? | 21:58 |
fungi | same for nb01 | 21:58 |
fungi | only the arm64 builder config (nb03) was updated | 21:59 |
fungi | maybe the fix was incomplete | 21:59 |
fungi | yep, that's it | 22:00 |
opendevreview | Jeremy Stanley proposed openstack/project-config master: nodepool: Remove yet still more . from openEuler https://review.opendev.org/c/openstack/project-config/+/822052 | 22:03 |
fungi | clarkb: ianw: ^ | 22:03 |
clarkb | approved | 22:10 |
fungi | much obliged | 22:10 |
fungi | i'll see what remains to be cleaned up once that deploys | 22:10 |
opendevreview | Merged openstack/project-config master: nodepool: Remove yet still more . from openEuler https://review.opendev.org/c/openstack/project-config/+/822052 | 22:21 |
opendevreview | Merged opendev/system-config master: Add a domain aliases mechanism to lists.o.o https://review.opendev.org/c/opendev/system-config/+/821914 | 23:14 |
opendevreview | Merged opendev/system-config master: Create an OpenInfra Foundation staff ML https://review.opendev.org/c/opendev/system-config/+/821915 | 23:18 |
rlandy|ruck | hello ... maybe I'm late to notice this - but we have a lot of tox failures going on across various products: | 23:59 |
rlandy|ruck | https://4effb742a88be8659f07-40bd60678638a1db566d5d37b438f20d.ssl.cf5.rackcdn.com/822049/1/gate/openstack-tox-py36/8e12896/job-output.txt | 23:59 |
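
The `with_suffix` behavior clarkb describes at 15:04 can be reproduced with a minimal sketch. This is not the nodepool code: `append_extension` is a hypothetical helper introduced here for illustration, and `PurePosixPath` is used only so the output is the same on any platform. The image path is the one quoted in the log.

```python
from pathlib import PurePosixPath


def append_extension(base: str, extension: str) -> PurePosixPath:
    """Hypothetical helper: append an extension without replacing anything.

    Unlike with_suffix(), this keeps any dots already present in the name.
    """
    return PurePosixPath(f"{base}.{extension}")


base = "/opt/nodepool_dib/openEuler-20.03-LTS-SP2-0000000055"

# with_suffix() treats everything from the last "." in the final path
# component as the existing suffix and replaces it.
truncated = PurePosixPath(base).with_suffix(".vhd")

# Appending instead of replacing keeps the full image name.
appended = append_extension(base, "vhd")

print(truncated)
print(appended)
```

The first path matches the missing file from the `FileNotFoundError` at 10:44, and the second matches the file that actually exists on disk. Whether appending is the right fix for nodepool itself is an open question in the log; the change that merged simply removed the `.` from the image name.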
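
The loopback filtering discussed around 00:33 (stripping `127\..*` and, per fungi, `::1`) could be sketched with the standard `ipaddress` module rather than a regex. This is an assumption about one way to do it, not the code from change 821780; `external_addresses` is a hypothetical name, and the non-loopback addresses below are documentation-range placeholders, not real hosts.

```python
import ipaddress


def external_addresses(addresses):
    """Hypothetical helper: drop loopback entries from a list of IP strings.

    is_loopback is true for all of 127.0.0.0/8 (so 127.0.1.1 as well as
    127.0.0.1) and for the IPv6 loopback ::1.
    """
    return [a for a in addresses if not ipaddress.ip_address(a).is_loopback]


# Placeholder lookup results: two loopback entries plus two
# documentation-range addresses standing in for external ones.
candidates = ["127.0.1.1", "::1", "203.0.113.10", "2001:db8::10"]

filtered = external_addresses(candidates)
print(filtered)
```

Using `is_loopback` covers both address families in one check, which sidesteps the AF_INET versus AF_INET6 distinction ianw raises at 00:38.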