clarkb | I'll pick this up in the morning, dinner should be here soon. Also an early start tomorrow for me due to the selected ptg time slot. If that time slot is terrible for you do not feel bad about skipping it :) | 00:07 |
---|---|---|
openstackgerrit | Merged opendev/zone-opendev.org master: Add inmotion mirror to DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/787459 | 00:09 |
ianw | fungi / zigo: bullseye build completed, images are uploading | 00:23 |
ianw | so we should make that fix permanent | 00:24 |
fungi | ianw: thanks for trying it out! i was going to, but still trying to dig myself out | 00:24 |
*** Tengu_ has joined #opendev | 00:26 | |
openstackgerrit | Merged opendev/system-config master: Handle focal's insistence we don't use root in launch-node.py https://review.opendev.org/c/opendev/system-config/+/787461 | 00:27 |
*** Tengu_ has quit IRC | 00:28 | |
*** Tengu has quit IRC | 00:29 | |
*** Tengu has joined #opendev | 00:31 | |
*** mlavalle has quit IRC | 00:39 | |
kevinz | ianw: morning! | 01:39 |
kevinz | could I reboot the nb03 node to see if something will change? | 01:39 |
ianw | kevinz: we can .. although OSL have just added an ipv6 address and AAAA records for their API node | 01:43 |
ianw | #status log deleted leaked zk node under /nodepool/images/fedora-32/builds to avoid many warnings in builder logs | 01:44 |
openstackstatus | ianw: finished logging | 01:44 |
ianw | i think i have a bit of a handle on it now | 01:44 |
openstackgerrit | Merged openstack/project-config master: Change cyborg project track to launchpad https://review.opendev.org/c/openstack/project-config/+/787306 | 01:48 |
kevinz | ianw: OK, that's great :-) Good to hear that | 01:49 |
openstackgerrit | Merged opendev/system-config master: nodepool-base: prefer ZK IPv6 addresses https://review.opendev.org/c/opendev/system-config/+/787313 | 02:32 |
ianw | ok, nb03 has switched over to ipv6 for ZK now | 03:40 |
ianw | 2021-04-22 02:46:04,895 INFO kazoo.client: Zookeeper connection established, state: CONNECTED | 03:41 |
ianw | 2021-04-22 03:35:43,281 WARNING kazoo.client: Connection dropped: outstanding heartbeat ping not received | 03:41 |
ianw | 2021-04-22 03:35:43,286 WARNING kazoo.client: Transition to CONNECTING | 03:41 |
ianw | 2021-04-22 03:35:43,286 INFO kazoo.client: Zookeeper connection lost | 03:41 |
ianw | 2021-04-22 03:35:43,296 INFO kazoo.client: Connecting to 2001:4800:7815:102:be76:4eff:fe02:f134(2001:4800:7815:102:be76:4eff:fe02:f134):2281, use_ssl: True | 03:41 |
ianw | 2021-04-22 03:35:44,443 INFO nodepool.builder.UploadWorker.0: ZooKeeper suspended. Waiting | 03:41 |
ianw | 2021-04-22 03:35:50,391 WARNING kazoo.client: Session has expired | 03:41 |
ianw | 2021-04-22 03:35:50,391 INFO kazoo.client: Zookeeper session closed, state: EXPIRED_SESSION | 03:41 |
ianw | 2021-04-22 03:35:50,392 INFO kazoo.client: Connecting to 2001:4800:7821:105:be76:4eff:fe04:d599(2001:4800:7821:105:be76:4eff:fe04:d599):2281, use_ssl: True | 03:41 |
ianw | 2021-04-22 03:35:50,419 INFO kazoo.client: Zookeeper connection established, state: CONNECTED | 03:41 |
ianw | so it seemed to drop out (i've not seen that "heartbeat ping not received" before) and then reconnect. i'll keep an eye and see | 03:41 |
*** ysandeep|away is now known as ysandeep | 04:14 | |
frickler | clarkb: fyi the sdk issue you saw is fixed by https://review.opendev.org/c/openstack/openstacksdk/+/786148 , I've asked for a new release to be made | 04:28 |
frickler | and it is a bit sad that sdk doesn't do regression testing against older clouds, though I'm not sure how old a cloud would have to be for that | 04:30 |
*** ykarel has joined #opendev | 04:45 | |
*** slittle1 has quit IRC | 04:46 | |
*** slittle1 has joined #opendev | 04:50 | |
*** zbr9 has joined #opendev | 05:04 | |
*** zbr has quit IRC | 05:06 | |
*** zbr9 is now known as zbr | 05:06 | |
*** DSpider has joined #opendev | 05:15 | |
*** ralonsoh has joined #opendev | 05:20 | |
*** DSpider has quit IRC | 05:21 | |
*** DSpider has joined #opendev | 05:21 | |
*** DSpider has quit IRC | 05:24 | |
*** ykarel_ has joined #opendev | 06:08 | |
*** ykarel has quit IRC | 06:11 | |
*** ykarel__ has joined #opendev | 06:11 | |
*** slaweq has joined #opendev | 06:12 | |
*** ykarel_ has quit IRC | 06:14 | |
*** ykarel__ is now known as ykarel | 06:17 | |
*** ykarel_ has joined #opendev | 06:32 | |
*** ykarel has quit IRC | 06:35 | |
*** lpetrut has joined #opendev | 06:36 | |
*** eolivare has joined #opendev | 06:41 | |
*** sboyron has joined #opendev | 06:42 | |
*** whoami-rajat_ has joined #opendev | 06:50 | |
*** jpena|off is now known as jpena | 06:50 | |
*** ysandeep is now known as ysandeep|lunch | 06:58 | |
zigo | ianw: \o/ Thanks ! | 06:59 |
*** fressi has joined #opendev | 07:04 | |
*** zimmerry_ has quit IRC | 07:07 | |
*** zimmerry has joined #opendev | 07:09 | |
*** amoralej|off is now known as amoralej | 07:13 | |
*** andrewbonney has joined #opendev | 07:17 | |
*** rpittau|afk is now known as rpittau | 07:34 | |
*** tosky has joined #opendev | 07:49 | |
*** zimmerry has quit IRC | 07:49 | |
*** zimmerry has joined #opendev | 07:52 | |
*** xinliang has joined #opendev | 07:55 | |
*** StevenK has joined #opendev | 08:05 | |
*** ysandeep|lunch is now known as ysandeep | 08:50 | |
*** dtantsur|afk is now known as dtantsur | 09:02 | |
*** ykarel_ has quit IRC | 09:18 | |
*** xinliang has quit IRC | 09:47 | |
openstackgerrit | sean mooney proposed openstack/project-config master: Add review priority label to nova deliverables https://review.opendev.org/c/openstack/project-config/+/787523 | 10:19 |
openstackgerrit | sean mooney proposed openstack/project-config master: Copy placement acls to all placement repos https://review.opendev.org/c/openstack/project-config/+/787524 | 10:19 |
*** ykarel has joined #opendev | 11:29 | |
*** jpena is now known as jpena|lunch | 11:34 | |
*** ysandeep is now known as ysandeep|afk | 11:39 | |
*** ykarel has quit IRC | 11:55 | |
*** ykarel has joined #opendev | 11:57 | |
*** ykarel has quit IRC | 12:00 | |
*** amoralej is now known as amoralej|lunch | 12:00 | |
*** ykarel has joined #opendev | 12:00 | |
*** brinzhang has quit IRC | 12:04 | |
*** brinzhang has joined #opendev | 12:04 | |
*** jaicaa has quit IRC | 12:08 | |
*** jaicaa has joined #opendev | 12:11 | |
fungi | frickler: apparently not old, we're seeing it at least with clouds running victoria, and checking when the neutron side feature was added it was apparently in wallaby? so even testing the most recent stable branch would have caught this particular case | 12:12 |
frickler | fungi: in this particular case, yes, but I was thinking about regressions against older clouds which are still heavily being used in production like queens, which we couldn't easily set up in gate even if we wanted to | 12:19 |
fungi | ahh, yeah that's certainly harder | 12:20 |
fungi | there's probably a happy medium between old enough and so old we can't easily do it, also those jobs could be periodic to reduce the job count on new changes. catching them before a release is still better than after releasing | 12:21 |
*** brinzhang has quit IRC | 12:24 | |
*** brinzhang has joined #opendev | 12:24 | |
*** lpetrut_ has joined #opendev | 12:27 | |
*** lpetrut has quit IRC | 12:29 | |
*** zoharm has joined #opendev | 12:33 | |
*** jpena|lunch is now known as jpena | 12:35 | |
*** hamalq has joined #opendev | 12:39 | |
*** ysandeep|afk is now known as ysandeep | 12:39 | |
*** xinliang has joined #opendev | 13:01 | |
*** jaicaa has quit IRC | 13:04 | |
*** zimmerry has quit IRC | 13:04 | |
*** artom has quit IRC | 13:04 | |
*** erbarr has quit IRC | 13:04 | |
*** jaicaa has joined #opendev | 13:05 | |
*** zimmerry has joined #opendev | 13:05 | |
*** artom has joined #opendev | 13:05 | |
*** erbarr has joined #opendev | 13:05 | |
xinliang | hello, when will the opendev ptg begin? https://etherpad.opendev.org/p/apr2021-ptg-opendev | 13:05 |
openstackgerrit | Merged opendev/system-config master: Add iad3.inmotion mirror node https://review.opendev.org/c/opendev/system-config/+/787456 | 13:06 |
xinliang | we are to attend this ptg. am I mistaking the time? no one in the meeting room now? | 13:07 |
xinliang | It seems the PTG is an hour later. I was mistaking the time. Sorry. | 13:11 |
fungi | xinliang: that's correct, 14:00 utc | 13:13 |
fungi | ~45 minutes from now | 13:13 |
xinliang | ok, got it. thanks fungi :-) | 13:13 |
*** owner__ has joined #opendev | 13:14 | |
*** owner__ has quit IRC | 13:14 | |
*** brinzhang_ has joined #opendev | 13:15 | |
*** amoralej|lunch is now known as amoralej | 13:15 | |
*** ocsabat has joined #opendev | 13:15 | |
*** mfixtex has joined #opendev | 13:25 | |
*** lpetrut_ has quit IRC | 13:45 | |
*** brinzhang_ has quit IRC | 13:50 | |
*** priteau has quit IRC | 13:53 | |
clarkb | the new inmotion cloud deployment failed because the LE playbook failed somewhere. I don't think it failed on the new mirror as I can see in the acme.sh log there that it appears to have created files | 13:54 |
fungi | limestone's not offline this time | 13:54 |
clarkb | I need to shift into ptg prep, but wanted to provide that update | 13:55 |
clarkb | and by prep I mean join the meetpad :) | 13:55 |
fungi | yeah, i'm trying to find the error in the log | 13:56 |
fungi | the last thing in the log was TASK [letsencrypt-create-certs : Run acme.sh driver for mirror01-iad3-inmotion-main certificate issue] | 13:57 |
fungi | 13:36:46 | 13:57 |
clarkb | after the next hourly cloud launcher run runs I'll start a manual rerun using ianw's fixed venv for the sdk issue | 13:58 |
clarkb | fungi: that's about when acme.sh says it has written a key and intermediate cert and all that | 13:59 |
fungi | yeah, but there's no summary output from ansible in the log, like it just abruptly stopped | 14:00 |
fungi | aha! ERROR! The requested handler 'letsencrypt updated mirror01-iad3-inmotion-main' was not found in either the main handlers list nor in the listening handlers list | 14:00 |
clarkb | zuul doesn't say the job timed out | 14:00 |
fungi | clarkb: ^ fix is likely straightforward? | 14:00 |
clarkb | aha, sorry yup | 14:01 |
clarkb | I'll get that up while we wait for people to join the ptg | 14:01 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Add missing inmotion LE apache restart handler https://review.opendev.org/c/opendev/system-config/+/787571 | 14:03 |
*** ocsabat has quit IRC | 14:25 | |
*** xinliang has quit IRC | 14:38 | |
zbr | problem with mirror-int.dfw.rax.opendev.org? https://zuul.opendev.org/t/openstack/build/35039ad11c3d4cecad2a6219203d2866 | 14:51 |
clarkb | zbr: the first thing to check is if the non -int name works (it's the same server but different IPs to use internal rax networking if possible) | 14:52 |
fungi | note that's just the rfc-1918 backend network hostname corresponding to mirror.dfw.rax.opendev.org, reachable only in that provider region | 14:52 |
clarkb | https://mirror.dfw.rax.opendev.org/wheel/ubuntu-20.04-x86_64/ yup that is the public path | 14:52 |
clarkb | oh except I got the path wrong | 14:53 |
clarkb | https://mirror.dfw.rax.opendev.org/pypi/simple/tox-bindep/ is what it failed to read | 14:53 |
clarkb | (which is failing to load for me looks like) | 14:53 |
fungi | that'll be a proxy to pypi.org | 14:54 |
clarkb | the first path I gave hits our afs backed wheel caches. The second path is a proxy to pypi | 14:54 |
zbr | for me too | 14:54 |
clarkb | it seems that the server is generally healthy enough to serve afs, but is having trouble proxying to pypi | 14:54 |
clarkb | I cannot wget https://pypi.org from the mirror | 14:55 |
clarkb | seems to be an ipv6 issue. It does work if I wget -4 https://pypi.org | 14:55 |
fungi | maybe connectivity problem between rackspace and fastly | 14:55 |
fungi | ahh, yeah | 14:55 |
fungi | traceroute6 to pypi.org dies with timeouts in rackspace's network or at their edge | 14:56 |
fungi | roughly 4-5 hops from the mirror server | 14:57 |
clarkb | ipv6 seems to generally work though, google is accessible anyway | 14:57 |
clarkb | as is review02 | 14:57 |
fungi | yeah, i'm thinking that last hop i see is their border gateway, and the failure is somewhere on an asymmetric return path | 14:57 |
fungi | so possibly v6 routing blackhole in a backbone provider | 14:58 |
clarkb | I think mnaser was discussing ipv6 route issues today too, I wonder if it is more widespread than vexxhost | 14:58 |
mnaser | clarkb: that's in another dc though :p | 14:59 |
mnaser | but, very possible.. | 14:59 |
clarkb | fungi: we can stick an /etc/hosts rule in for pypi.org to address this maybe? | 15:00 |
clarkb | (then just remember to remove it when ipv6 is happy again) | 15:00 |
fungi | i'm testing other mirrors | 15:00 |
*** amoralej is now known as amoralej|afk | 15:01 | |
fungi | yeah, rax-iad and rax-ord are able to reach pypi just fine | 15:01 |
clarkb | I have deleted the accidentally created opendev network and subnet in osuosl | 15:03 |
zbr | i do remember having ipv6 problems with pypi at least twice during the last year, same behavior: ipv4 was fine but ipv6 was not, and it was their CDN both times (i have full ipv6 at home) | 15:03 |
clarkb | the cloud launcher ran fine when using an older sdk too so I think that is all sorted at this point | 15:03 |
zbr | i think i even raised a bug which was closed as "works now, not our problem" | 15:03 |
fungi | well, to be fair, pypi's fastly account is donated to them, they have no real pull with fastly to fix problems | 15:04 |
clarkb | and even then it may not be fastly's fault either | 15:04 |
fungi | and not having a cdn for pypi was even worse | 15:04 |
clarkb | could be somewhere between | 15:04 |
fungi | yep | 15:05 |
clarkb | anyway `dig A pypi.org` returns four addresses. We could stick them in /etc/hosts? | 15:05 |
zbr | can we force it to use ipv4? | 15:05 |
* dtantsur assumes frequent "No matching distribution found" are being discussed already | 15:05 | |
fungi | dtantsur: with rax-dfw? | 15:05 |
dtantsur | dunno, can check | 15:05 |
zbr | dtantsur: likely the same issue. | 15:06 |
clarkb | zbr: yes, the /etc/hosts idea is one method to do that | 15:06 |
dtantsur | fungi: yes, two jobs from the same patch, both on rax-dfw | 15:07 |
fungi | i'm checking now to see if any of the v6 fastly endpoints returned to dfw are reachable | 15:07 |
fungi | but i agree hard-coding v4 endpoints temporarily in /etc/hosts may get things working again for now | 15:07 |
clarkb | the TTLs on those records are super long too so we may need to flush caches (somehow?) and/or restart apache? | 15:09 |
zbr | hopefully they do not change them very often | 15:10 |
fungi | i've set pypi.org to 151.101.0.223 in /etc/hosts but i probably also need to hardcode an address for the file hosting they have since it uses a separate domain name? | 15:11 |
fungi | pypifiles.pythonhosted.org i think? | 15:11 |
* fungi checks | 15:11 | |
clarkb | files.pythonhosted.org according to our proxy config | 15:11 |
clarkb | fungi: no ipv6 issues to that name though | 15:12 |
fungi | ahh, yep thanks just found it | 15:13 |
zbr | dig files.pythonhosted.org AAAA does return ipv6 entries to me. | 15:13 |
clarkb | zbr: it does, but they function | 15:13 |
fungi | i agree, that's reachable from the proxy | 15:13 |
fungi | so it was really just the simple api breaking, and not the file hosting | 15:13 |
clarkb | wget https://pypi.org works on the mirror now but apache is still failing. I suspect we need to restart apache to make it look up the name again | 15:13 |
fungi | #status log Temporarily hard-coded the pypi.org hostname to a Fastly IPv4 address in /etc/hosts on mirror01.dfw.rax.opendev.org until v6 routing between them returns to working order | 15:14 |
openstackstatus | fungi: finished logging | 15:14 |
clarkb | fungi: do you want to restart apache too? | 15:15 |
fungi | is it still cached? | 15:15 |
clarkb | https://mirror.dfw.rax.opendev.org/pypi/simple/tox-bindep/ continues to not work so I suspect so | 15:15 |
clarkb | the ttls on those names are huge so wouldn't surprise me if apache is holding on to them | 15:15 |
fungi | yeah, i'll restart apache2 | 15:15 |
fungi | sudo systemctl restart apache2 | 15:16 |
clarkb | ++ | 15:16 |
fungi | ought to do it | 15:16 |
fungi | if i type that in the correct terminal ;) | 15:16 |
fungi | restart finished | 15:16 |
clarkb | that url for tox-bindep loads for me now | 15:16 |
fungi | loading for me now | 15:16 |
clarkb | dtantsur: zbr ^ fyi should hopefully be happier now | 15:16 |
dtantsur | \o/ | 15:17 |
fungi | hopefully they work out whatever the routing issue is soon and then we can undo that | 15:17 |
fungi | i'll try to test it periodically | 15:17 |
*** ocsabat has joined #opendev | 15:19 | |
clarkb | fungi: I'd like to run `sudo ansible-playbook --limit mirror01.iad3.inmotion.opendev.org -v /home/zuul/src/opendev.org/opendev/system-config/playbooks/service-mirror.yaml` to get the inmotion mirror up to speed. | 15:22 |
clarkb | do you have any concerns with doing that? | 15:22 |
fungi | none whatsoever | 15:23 |
clarkb | cool proceeding with that now | 15:23 |
fungi | this is to work around reenqueuing in deploy? | 15:23 |
clarkb | yup | 15:23 |
clarkb | reenqueuing will take forever since it is an inventory update | 15:23 |
clarkb | and that runs all the jobs | 15:23 |
*** avass has quit IRC | 15:26 | |
*** avass has joined #opendev | 15:27 | |
*** ameono has joined #opendev | 15:30 | |
clarkb | https://mirror.iad3.inmotion.opendev.org/ woot that's up | 15:32 |
*** ocsabat78 has joined #opendev | 15:33 | |
clarkb | I've pulled my WIP vote off of https://review.opendev.org/c/openstack/project-config/+/787428 I think adding the cloud to nodepool is the next step | 15:35 |
*** ykarel is now known as ykarel|away | 15:36 | |
fungi | i've gone ahead and approved it, reviewed it yesterday | 15:37 |
*** ameono has quit IRC | 15:38 | |
*** ocsabat has quit IRC | 15:39 | |
*** ykarel|away has quit IRC | 15:41 | |
*** ocsabat78 has quit IRC | 15:41 | |
openstackgerrit | Merged openstack/project-config master: Add InMotion cloud to nodepool https://review.opendev.org/c/openstack/project-config/+/787428 | 15:45 |
*** brinzhang_ has joined #opendev | 15:55 | |
*** brinzhang has quit IRC | 15:58 | |
*** jpena is now known as jpena|away | 16:00 | |
*** hamalq has quit IRC | 16:06 | |
*** slaweq_ has joined #opendev | 16:20 | |
*** eolivare has quit IRC | 16:20 | |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Sepcify inmotion cloud config in clouds.yaml https://review.opendev.org/c/opendev/system-config/+/787595 | 16:21 |
*** amoralej|afk is now known as amoralej | 16:26 | |
*** sshnaidm is now known as sshnaidm|afk | 16:28 | |
*** ysandeep is now known as ysandeep|away | 16:30 | |
openstackgerrit | Merged opendev/system-config master: Add missing inmotion LE apache restart handler https://review.opendev.org/c/opendev/system-config/+/787571 | 16:37 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Use External inmotion cloud network for zuul nodes https://review.opendev.org/c/opendev/system-config/+/787595 | 16:54 |
*** andrewbonney has quit IRC | 17:04 | |
*** ralonsoh has quit IRC | 17:08 | |
*** ocsabat has joined #opendev | 17:13 | |
*** ocsabat has quit IRC | 17:19 | |
*** zoharm has quit IRC | 17:21 | |
*** dtantsur is now known as dtantsur|afk | 17:22 | |
*** mfixtex has quit IRC | 17:23 | |
*** rpittau is now known as rpittau|afk | 17:24 | |
*** redrobot0 has joined #opendev | 17:39 | |
*** dhellmann_ has joined #opendev | 17:42 | |
*** calcmandan_ has joined #opendev | 17:42 | |
*** elod_ has joined #opendev | 17:43 | |
*** smcginni1 has joined #opendev | 17:44 | |
*** elod has quit IRC | 17:48 | |
*** calcmandan has quit IRC | 17:48 | |
*** smcginnis has quit IRC | 17:48 | |
*** redrobot has quit IRC | 17:48 | |
*** kopecmartin has quit IRC | 17:48 | |
*** fbo has quit IRC | 17:49 | |
*** dhellmann has quit IRC | 17:49 | |
*** redrobot0 is now known as redrobot | 17:49 | |
*** dhellmann_ is now known as dhellmann | 17:49 | |
*** fbo has joined #opendev | 17:55 | |
*** jpena|away is now known as jpena|off | 18:07 | |
clarkb | fungi: the inmotion nodes have all gone away now so I'll set the config to what is proposed above ? | 18:08 |
fungi | sounds reasonable, yep | 18:08 |
fungi | i think one of the check jobs for that change may be stuck | 18:08 |
clarkb | ok done and set max-servers to 6 which should be viable until we get other networking stuff cleaned up | 18:10 |
clarkb | that seems to be working, nodes have launched and are in use | 18:11 |
clarkb | once the above change lands we can clean some stuff up and then set max-servers back to 8 if the expected ip frees happen | 18:12 |
fungi | yeah, i'm basically waiting to see it merge and then i'll undo the disable list entry for nl02 | 18:12 |
*** auristor has quit IRC | 18:13 | |
*** sboyron has quit IRC | 18:15 | |
*** auristor has joined #opendev | 18:25 | |
*** lbragstad_ has joined #opendev | 18:29 | |
*** lbragstad has quit IRC | 18:32 | |
*** amoralej is now known as amoralej|off | 18:33 | |
openstackgerrit | Merged opendev/system-config master: Use External inmotion cloud network for zuul nodes https://review.opendev.org/c/opendev/system-config/+/787595 | 18:39 |
fungi | clarkb: i've undone the disable list entry for nl02 now that's merged | 18:40 |
clarkb | cool. I guess we can probably clean up the router and network now? | 18:43 |
*** elod_ is now known as elod | 18:44 | |
clarkb | I'll do that now | 18:45 |
clarkb | that cleaned up one of the used IPs | 18:48 |
clarkb | but we still have two dhcp interfaces using ips | 18:48 |
clarkb | I think that means we're still one short of eight | 18:49 |
clarkb | also it looks like neutron may just do redundant dhcp servers? | 18:50 |
clarkb | everything about these two ports is the same except for the host and their ids | 18:50 |
clarkb | (they are on the same network and the same subnet etc) | 18:50 |
clarkb | we could free up the mirror's router's IP by switching it over to the External network too (build a mirror02 maybe?) | 18:51 |
clarkb | ok yup I found the dhcp agents listing. We can delete one if we are happy with a single agent | 18:52 |
fungi | that seems fine | 18:53 |
fungi | worst case we stop getting usable nodes there if one of them dies | 18:53 |
clarkb | yup | 18:53 |
openstackgerrit | Clark Boylan proposed opendev/zone-opendev.org master: Add mirror02 to inmotion https://review.opendev.org/c/opendev/zone-opendev.org/+/787628 | 19:14 |
*** hrw has quit IRC | 19:15 | |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Add mirror02 to inmotion https://review.opendev.org/c/opendev/system-config/+/787629 | 19:16 |
openstackgerrit | Clark Boylan proposed opendev/zone-opendev.org master: Flip mirror.iad3.inmotion to mirror02.iad3.inmotion https://review.opendev.org/c/opendev/zone-opendev.org/+/787630 | 19:18 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Cleanup mirror01.iad3.inmotion https://review.opendev.org/c/opendev/system-config/+/787631 | 19:20 |
clarkb | fungi: that should do it | 19:20 |
clarkb | I'm trying to figure out how to set the quota to 6 instances so that we don't try to use 8 until all that work above is done and the extra unnecessary network stuff is cleaned up | 19:22 |
*** mfixtex has joined #opendev | 19:23 | |
clarkb | and that is done | 19:23 |
clarkb | fungi: https://review.opendev.org/c/opendev/system-config/+/778116 that one would be a good one to review too, then we can land the jeepyb updates behind it | 19:26 |
*** slaweq_ has quit IRC | 19:30 | |
*** fressi has quit IRC | 19:32 | |
*** whoami-rajat_ is now known as whoami-rajat | 19:33 | |
fungi | yep, just approved it | 19:35 |
openstackgerrit | Merged opendev/zone-opendev.org master: Add mirror02 to inmotion https://review.opendev.org/c/opendev/zone-opendev.org/+/787628 | 19:37 |
fungi | once dns is resolving i'll approve the inventory one | 19:37 |
clarkb | thanks! | 19:38 |
clarkb | I actually think we can maybe approve it now because zuul serializes everything | 19:38 |
clarkb | and if it fails that is fine because 01 is still the one in charge | 19:38 |
clarkb | but then we do need to wait before changes 3 and 4 | 19:38 |
*** lamt has quit IRC | 19:53 | |
*** lbragstad_ is now known as lbragstad | 19:56 | |
*** lamt has joined #opendev | 19:58 | |
*** iurygregory has quit IRC | 19:59 | |
*** hamalq has joined #opendev | 20:10 | |
*** d34dh0r53 has quit IRC | 20:14 | |
*** mfixtex has quit IRC | 20:23 | |
*** d34dh0r53 has joined #opendev | 20:27 | |
openstackgerrit | Merged opendev/system-config master: Handle zuul-summary-results as .jar / per-project config https://review.opendev.org/c/opendev/system-config/+/778116 | 20:27 |
*** slaweq has quit IRC | 20:44 | |
*** snapdeal has joined #opendev | 20:45 | |
clarkb | fungi: ^ do your jeepyb changes need reviews still? | 21:19 |
fungi | maybe, i'll take another look | 21:25 |
fungi | dns update still hasn't deployed btw | 21:25 |
clarkb | ya the changes that touch host vars or inventory take forever | 21:26 |
clarkb | we probably need to reevaluate that and if it is working properly | 21:26 |
fungi | i went ahead and approved them now | 21:26 |
clarkb | I think the idea was we can split the matches further somehow, but I need to dig into what the idea was behind that | 21:26 |
ianw | hrm, OSUOSL rolled out AAAA records for their API yesterday, but nb03 doesn't seem to be using them | 21:26 |
ianw | 21:25:20.460128 IP 192.168.1.72.36182 > 140.211.9.114.9292: Flags [.], seq 289439405:289445045, ack 0, win 505, length 5640 | 21:26 |
clarkb | ianw: maybe we're caching the ipv4 for a long time? we had that problem with one of the mirrors today when it stopped hitting pypi.org via ipv6 | 21:27 |
clarkb | had to restart apache to pick up the new record due to long ttls | 21:27 |
*** mlavalle has joined #opendev | 21:27 | |
ianw | yeah i just restarted the container to test and it still seems to be talking ipv4 from what i can tell | 21:28 |
ianw | # ping arm-openstack.osuosl.org | 21:29 |
ianw | PING arm-controller1.osuosl.org (140.211.9.114): 56 data bytes | 21:29 |
ianw | in container | 21:29 |
ianw | # ping arm-openstack.osuosl.org | 21:29 |
ianw | PING arm-openstack.osuosl.org(arm-controller1.osuosl.org (2605:bc80:3010:104::8cd3:972)) 56 data bytes | 21:29 |
ianw | out container | 21:29 |
fungi | ahh, so inside the container you're looking it up via unbound, outside via /etc/hosts? | 21:30 |
fungi | er, no i guess it's all via unbound? | 21:30 |
clarkb | ya we should be using host networking and that would send everything to unbound | 21:31 |
ianw | print(socket.getaddrinfo('arm-openstack.osuosl.org', 443)) returns ipv6 | 21:34 |
clarkb | do we have a prefer ipv4 setting in clouds.yaml? | 21:35 |
clarkb | maybe that is just a nodepool thing for booted instances | 21:35 |
ianw | there is a force_ipv4 in clouds.yaml that we have for some clouds, but not this one | 21:36 |
*** snapdeal has quit IRC | 21:38 | |
clarkb | ya I think that affects the instance creation not api connections | 21:38 |
*** irclogbot_1 has quit IRC | 21:50 | |
*** whoami-rajat has quit IRC | 21:50 | |
ianw | $ telnet -6 arm-controller1.osuosl.org 5000 | 21:51 |
ianw | Trying 2605:bc80:3010:104::8cd3:972... | 21:51 |
ianw | telnet: connect to address 2605:bc80:3010:104::8cd3:972: Connection refused | 21:51 |
ianw | i think there's the problem | 21:51 |
ianw | i guess the firewall hasn't caught up | 21:52 |
*** irclogbot_3 has joined #opendev | 21:53 | |
openstackgerrit | Merged opendev/jeepyb master: Update Gerrit hook command-line parameters https://review.opendev.org/c/opendev/jeepyb/+/786095 | 21:55 |
openstackgerrit | Merged opendev/jeepyb master: Correct set_in_progress parameters https://review.opendev.org/c/opendev/jeepyb/+/786120 | 21:58 |
zigo | ianw: I still get "The nodeset "debian-bullseye" was not found." at https://review.opendev.org/c/openstack/puppet-openstack-integration/+/786772 | 22:14 |
zigo | ianw: Is there still a problem? | 22:14 |
fungi | zigo: how recently? i see a ready debian-bullseye node in inap-mtl01 which has been there waiting for something to do for almost 2 hours | 22:25 |
*** Ramereth has joined #opendev | 22:25 | |
zigo | fungi: You can see the time in the review, no? Less than an hour ago... | 22:26 |
zigo | I can do a recheck ... | 22:26 |
fungi | oh, sorry, i missed you linked the change | 22:26 |
zigo | fungi: Still failing ... | 22:27 |
zigo | Maybe my patch is wrong? :) | 22:28 |
*** kopecmartin has joined #opendev | 22:28 | |
ianw | hrm, some sort of config typo is not out of the question | 22:28 |
zigo | I'm not very experienced configuring Zuul jobs in yaml ... :P | 22:29 |
clarkb | heads up I think the jeepyb changes above promoted images before the system-config change. That means when we promoted the system-config change we set to latest a slightly older image | 22:31 |
fungi | or we have a debian-bullseye label but not a nodeset | 22:31 |
clarkb | I suspect we'll need to land another change to either jeepyb or system-config to sync things up | 22:31 |
zigo | Is the "nodeset:" thingy external to the puppet-openstack-integration repo? | 22:32 |
clarkb | (but maybe zuul will do the right thing in that case and not promote the older image when it runs later?) | 22:32 |
ianw | oh, we haven't put it in https://opendev.org/opendev/base-jobs/src/branch/master/zuul.d/nodesets.yaml | 22:32 |
clarkb | ah so you might be able to make your own nodeset | 22:32 |
fungi | yeah, that's it | 22:33 |
zigo | :) | 22:33 |
zigo | ianw: Shall I do the patch? | 22:33 |
openstackgerrit | Ian Wienand proposed opendev/base-jobs master: Add debian bullseye https://review.opendev.org/c/opendev/base-jobs/+/787647 | 22:34 |
ianw | zigo: was just typing it out ^ :) | 22:34 |
zigo | Gosh, he's fast ... :) | 22:34 |
fungi | should hopefully merge fairly quickly | 22:36 |
fungi | i've already approved it | 22:36 |
ianw | for the record OSU have identified some issues with their ipv6 setup and having apache listen correctly, so have reverted the AAAA records for now. we can be test subjects when they're ready | 22:36 |
clarkb | fungi: you're fast too I don't even get a chance to review before it is approved :) | 22:36 |
clarkb | ianw: sounds good! | 22:36 |
clarkb | (we are godo tes subjects) | 22:36 |
clarkb | and i can't type anymore | 22:36 |
fungi | it was a very straightforward change | 22:37 |
ianw | fungi/clarkb: not sure if you saw but I *think* i found a root cause for the blank node issue -> https://review.opendev.org/c/zuul/nodepool/+/787475 | 22:39 |
clarkb | ianw: I hadn't, that seems worthy of a review | 22:41 |
ianw | i would recommend coffee, or perhaps something a little stronger, before trying to consider node lock path layouts and recursive delete interactions :) | 22:42 |
fungi | but hey, it has tests! | 22:43 |
clarkb | ianw: my favorite thing (and I'm serious) is when the commit message is like 3x as long as the change | 22:44 |
zigo | Status of stuff in opendev/base-jobs cannot be seen in zuul.o.o ? | 22:47 |
clarkb | zigo: https://zuul.opendev.org then select the opendev tenant | 22:48 |
zigo | Oh, I see... | 22:49 |
clarkb | ianw: the comment I just left may also require extra coffee | 22:50 |
* zigo sends coffee to everyone | 22:50 | |
fungi | zigo: there's also a tenant selector at the top of the status page if you want to change the tenant view | 22:51 |
corvus | any reason we don't have stage-output in base job? | 22:58 |
clarkb | may just be an incomplete migration to the new logging aggregation plan? | 22:59 |
fungi | yeah, currently that role is used in the run-base-post playbook | 23:01 |
fungi | in system-config | 23:01 |
fungi | for the zuul playbook | 23:01 |
fungi | other than that, seems to just be used by devstack and grenade jobs? | 23:01 |
corvus | yeah. seems like this should be standard: stage-output (lets you use vars to copy files to ~/zuul-output on remote); then fetch-output (copies ~/zuul-output on remote to logs/$hostname in local workspace); then merge-output-to-logs (copies artifacts to logs on workspace) | 23:03 |
corvus | we should try running it twice in system-config, and if it works, move it to base-jobs and drop our own | 23:04 |
fungi | neat, codesearch for it also turns up a comment which looks like it should have been a bug report: "This role is needed because stage-output does not support duplicated file names, even if they come from different directories." https://opendev.org/openstack/grenade/src/branch/master/roles/prepare-grenade-logs/README.rst | 23:04 |
corvus | no idea what that comment means | 23:05 |
fungi | yeah, i think (based on looking at the main.yaml) it's that if you try to use stage-output and have subdirectories containing files with the same name then "something" breaks, so they rename some of their filenames in the tree to make sure they're unique | 23:07 |
fungi | i would be surprised if that were really the case though | 23:08 |
openstackgerrit | Merged opendev/base-jobs master: Add debian bullseye https://review.opendev.org/c/opendev/base-jobs/+/787647 | 23:08 |
openstackgerrit | James E. Blair proposed opendev/system-config master: DNM: Run stage-output twice https://review.opendev.org/c/opendev/system-config/+/787650 | 23:09 |
fungi | looks like https://review.opendev.org/548936 added the workaround but the commit message doesn't add any additional explanation for it | 23:11 |
zigo | fungi: ianw: YEAH ! \o/ Working ... :P | 23:12 |
zigo | See #786772 status in zuul ... :P | 23:12 |
zigo | Thanks guys. | 23:13 |
clarkb | ianw: https://b94c7b46e389cd502b3d-8deb753aa4e0bfff4caa3bbfcf763c6f.ssl.cf1.rackcdn.com/787629/1/gate/system-config-run-review-3.2/96d9bbe/job-output.txt that seemed to fail in selenium | 23:16 |
clarkb | it says "javascript error" | 23:17 |
clarkb | did the image update that just happened somehow break that? | 23:17 |
zigo | Weirdo logs though, looks like the apt sources.list is wrong or something. | 23:17 |
*** tosky has quit IRC | 23:17 | |
zigo | 2021-04-22 23:16:18.289009 | debian-bullseye | Err:9 https://mirror.mtl01.inap.opendev.org/debian n/a-backports/main Sources | 23:18 |
zigo | 2021-04-22 23:16:18.289186 | debian-bullseye | 404 Not Found [IP: 198.72.125.6 443] | 23:18 |
zigo | Why "n/a" ? | 23:18 |
clarkb | zigo: can you link to the job logs? | 23:18 |
clarkb | my guess is ansible is saying "n/a" for some fact | 23:19 |
zigo | https://zuul.openstack.org/stream/08b38f67485b49828bb6e72c08bed039?logfile=console.log | 23:19 |
clarkb | oh the job isn't done yet, hard to say then as we can't see the facts yet | 23:19 |
zigo | Has Ansible been configured to use py3 only? (as py2 is removed from Bullseye) | 23:21 |
fungi | i think we default to py3 | 23:21 |
fungi | anyway, once that build reports, yeah, we should have a better idea of what's wrong there (what's ending up with "n/a" instead of "bullseye") | 23:21 |
clarkb | my hunch is that because it's a new unreleased distro release ansible isn't properly setting some fact and it defaults to n/a | 23:22 |
clarkb | ianw: "TypeError: document.querySelector(...).shado...tacktrace" I'm reading that to say that maybe the selenium test is broken because the elements in the page are different now? | 23:22 |
clarkb | the approved change to the plugin did pass testing though so this is really weird to me | 23:23 |
ianw | clarkb: urgh, that may be the case ... | 23:24 |
zigo | lsb_release -c -s <--- This really says bullseye ... so it's ansible being silly ! :) Maybe it only checks for /etc/debian_version ? | 23:24 |
ianw | wouldn't think 3.2 would fail that | 23:25 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: DNM noop change to trigger system-config-run-review jobs https://review.opendev.org/c/opendev/system-config/+/787652 | 23:26 |
clarkb | ianw: ^ I'll put holds on the two jobs for that change, but not sure I'll be able to debug further today | 23:26 |
ianw | clarkb: 'status': 500 makes me maybe feel like it was gerrit returning the error | 23:27 |
openstackgerrit | Merged opendev/jeepyb master: Bump gerritlib requirement to 0.10.0 https://review.opendev.org/c/opendev/jeepyb/+/765357 | 23:28 |
clarkb | ianw: I think that is coming from the webdriver | 23:28 |
clarkb | but ya it could be passing it through | 23:28 |
clarkb | the holds are in place | 23:29 |
clarkb | do we want to put review in emergency to prevent ansible from removing the currently running image? (I suppose that may have already happened?) | 23:30 |
clarkb | it's really weird though that the change to update the image passed and presumably ran the same selenium tests | 23:30 |
ianw | i would have expected to see something in the apache logs | 23:32 |
ianw | although maybe we talk to gerrit directly | 23:32 |
clarkb | another option is we can tag the existing image with a "please preserve me" tag | 23:33 |
ianw | http://localhost:8081 yeah | 23:33 |
clarkb | ianw: do we collect the gerrit error log? | 23:33 |
ianw | yep, replication errors in there which i think are expected, can't see anything else | 23:33 |
ianw | i am starting to think it is a javascript error | 23:34 |
clarkb | ya I see the file, and ya replication errors make sense | 23:35 |
ianw | that query works against review at least | 23:39 |
ianw | review.opendev.org | 23:39 |
clarkb | ianw: looking at how we pull gerrit images it doesn't seem like we do that automatically? | 23:39 |
clarkb | ianw: do you know if that is the case? | 23:39 |
clarkb | (if so then we're probably ok with what is running in production until we can run this down) | 23:39 |
ianw | yeah, we don't auto restart gerrit | 23:39 |
clarkb | ya but do we do a pull and/or prune? | 23:40 |
clarkb | which could potentially cleanup the image we're currently running? | 23:40 |
clarkb | (I'm not seeing it if we do. My read on this is we should be ok) | 23:40 |
openstackgerrit | Merged opendev/jeepyb master: Set default branch in .gitreview files when creating project https://review.opendev.org/c/opendev/jeepyb/+/758595 | 23:41 |
ianw | the held node should reveal all | 23:41 |
clarkb | ianw: ++, is that something you'll be able to look at? | 23:41 |
ianw | yep, i'll keep an eye on it | 23:42 |
clarkb | thanks! | 23:42 |
clarkb | I've got dinner to sort out momentarily and I had to re-add my ssh keys to the agent to do the hold so my computer is telling me I have computered long enough today | 23:42 |
fungi | i've got an appointment first thing in the morning, so will likely drop off once the opendev ptg block is over in 15 minutes | 23:44 |
*** paladox has joined #opendev | 23:53 | |
*** mlavalle has quit IRC | 23:59 |