Thursday, 2021-04-22

clarkbI'll pick this up in the morning, dinner should be here soon. Also an early start tomorrow for me due to the selected ptg time slot. If that time slot is terrible for you do not feel bad about skipping it :)00:07
openstackgerritMerged opendev/zone-opendev.org master: Add inmotion mirror to DNS  https://review.opendev.org/c/opendev/zone-opendev.org/+/78745900:09
ianwfungi / zigo: bullseye build completed, images are uploading00:23
ianwso we should make that fix permanent00:24
fungiianw: thanks for trying it out! i was going to, but still trying to dig myself out00:24
*** Tengu_ has joined #opendev00:26
openstackgerritMerged opendev/system-config master: Handle focal's insistence we don't use root in launch-node.py  https://review.opendev.org/c/opendev/system-config/+/78746100:27
*** Tengu_ has quit IRC00:28
*** Tengu has quit IRC00:29
*** Tengu has joined #opendev00:31
*** mlavalle has quit IRC00:39
kevinzianw: morning!01:39
kevinzcould I reboot the nb03 node to see if something will change?01:39
ianwkevinz: we can .. although OSL have just added an ipv6 address and AAAA records for their API node01:43
ianw#status log deleted leaked zk node under /nodepool/images/fedora-32/builds to avoid many warnings in builder logs01:44
openstackstatusianw: finished logging01:44
ianwi think i have a bit of a handle on it now01:44
openstackgerritMerged openstack/project-config master: Change cyborg project track to launchpad  https://review.opendev.org/c/openstack/project-config/+/78730601:48
kevinzianw: OK, that's great :-) Good to hear that01:49
openstackgerritMerged opendev/system-config master: nodepool-base: prefer ZK IPv6 addresses  https://review.opendev.org/c/opendev/system-config/+/78731302:32
ianwok, nb03 has switched over to ipv6 for ZK now03:40
ianw2021-04-22 02:46:04,895 INFO kazoo.client: Zookeeper connection established, state: CONNECTED03:41
ianw2021-04-22 03:35:43,281 WARNING kazoo.client: Connection dropped: outstanding heartbeat ping not received03:41
ianw2021-04-22 03:35:43,286 WARNING kazoo.client: Transition to CONNECTING03:41
ianw2021-04-22 03:35:43,286 INFO kazoo.client: Zookeeper connection lost03:41
ianw2021-04-22 03:35:43,296 INFO kazoo.client: Connecting to 2001:4800:7815:102:be76:4eff:fe02:f134(2001:4800:7815:102:be76:4eff:fe02:f134):2281, use_ssl: True03:41
ianw2021-04-22 03:35:44,443 INFO nodepool.builder.UploadWorker.0: ZooKeeper suspended. Waiting03:41
ianw2021-04-22 03:35:50,391 WARNING kazoo.client: Session has expired03:41
ianw2021-04-22 03:35:50,391 INFO kazoo.client: Zookeeper session closed, state: EXPIRED_SESSION03:41
ianw2021-04-22 03:35:50,392 INFO kazoo.client: Connecting to 2001:4800:7821:105:be76:4eff:fe04:d599(2001:4800:7821:105:be76:4eff:fe04:d599):2281, use_ssl: True03:41
ianw2021-04-22 03:35:50,419 INFO kazoo.client: Zookeeper connection established, state: CONNECTED03:41
ianwso it seemed to drop out (i've not seen that "heartbeat ping not received" before) and then reconnect.  i'll keep an eye and see03:41
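A minimal sketch of how the length of that dropout could be read off the builder log, assuming the lines keep the "YYYY-MM-DD HH:MM:SS,mmm" prefix shown above. The helper names (parse_timestamp, outage_seconds) are hypothetical and not part of nodepool or kazoo.

```python
from datetime import datetime

# Sample lines copied from the builder log quoted above.
LOG_LINES = [
    "2021-04-22 02:46:04,895 INFO kazoo.client: Zookeeper connection established, state: CONNECTED",
    "2021-04-22 03:35:43,281 WARNING kazoo.client: Connection dropped: outstanding heartbeat ping not received",
    "2021-04-22 03:35:50,391 WARNING kazoo.client: Session has expired",
    "2021-04-22 03:35:50,419 INFO kazoo.client: Zookeeper connection established, state: CONNECTED",
]


def parse_timestamp(line):
    """Parse the leading 23-character timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def outage_seconds(lines):
    """Seconds from the first 'Connection dropped' to the next reconnect.

    Returns None if no complete drop/reconnect pair is found.
    """
    dropped = None
    for line in lines:
        if dropped is None and "Connection dropped" in line:
            dropped = parse_timestamp(line)
        elif dropped is not None and "connection established" in line:
            return (parse_timestamp(line) - dropped).total_seconds()
    return None


print(outage_seconds(LOG_LINES))
```

For the lines quoted here this works out to roughly seven seconds between the missed heartbeat and the new session.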
*** ysandeep|away is now known as ysandeep04:14
fricklerclarkb: fyi the sdk issue you saw is fixed by https://review.opendev.org/c/openstack/openstacksdk/+/786148 , I've asked for a new release to be made04:28
fricklerand it is a bit sad that sdk doesn't do regression testing against older clouds, though I'm not sure how old a cloud would have to be for that04:30
*** ykarel has joined #opendev04:45
*** slittle1 has quit IRC04:46
*** slittle1 has joined #opendev04:50
*** zbr9 has joined #opendev05:04
*** zbr has quit IRC05:06
*** zbr9 is now known as zbr05:06
*** DSpider has joined #opendev05:15
*** ralonsoh has joined #opendev05:20
*** DSpider has quit IRC05:21
*** DSpider has joined #opendev05:21
*** DSpider has quit IRC05:24
*** ykarel_ has joined #opendev06:08
*** ykarel has quit IRC06:11
*** ykarel__ has joined #opendev06:11
*** slaweq has joined #opendev06:12
*** ykarel_ has quit IRC06:14
*** ykarel__ is now known as ykarel06:17
*** ykarel_ has joined #opendev06:32
*** ykarel has quit IRC06:35
*** lpetrut has joined #opendev06:36
*** eolivare has joined #opendev06:41
*** sboyron has joined #opendev06:42
*** whoami-rajat_ has joined #opendev06:50
*** jpena|off is now known as jpena06:50
*** ysandeep is now known as ysandeep|lunch06:58
zigoianw: \o/ Thanks !06:59
*** fressi has joined #opendev07:04
*** zimmerry_ has quit IRC07:07
*** zimmerry has joined #opendev07:09
*** amoralej|off is now known as amoralej07:13
*** andrewbonney has joined #opendev07:17
*** rpittau|afk is now known as rpittau07:34
*** tosky has joined #opendev07:49
*** zimmerry has quit IRC07:49
*** zimmerry has joined #opendev07:52
*** xinliang has joined #opendev07:55
*** StevenK has joined #opendev08:05
*** ysandeep|lunch is now known as ysandeep08:50
*** dtantsur|afk is now known as dtantsur09:02
*** ykarel_ has quit IRC09:18
*** xinliang has quit IRC09:47
openstackgerritsean mooney proposed openstack/project-config master: Add review priority label to nova deliverables  https://review.opendev.org/c/openstack/project-config/+/78752310:19
openstackgerritsean mooney proposed openstack/project-config master: Copy placement acls to all placement repos  https://review.opendev.org/c/openstack/project-config/+/78752410:19
*** ykarel has joined #opendev11:29
*** jpena is now known as jpena|lunch11:34
*** ysandeep is now known as ysandeep|afk11:39
*** ykarel has quit IRC11:55
*** ykarel has joined #opendev11:57
*** ykarel has quit IRC12:00
*** amoralej is now known as amoralej|lunch12:00
*** ykarel has joined #opendev12:00
*** brinzhang has quit IRC12:04
*** brinzhang has joined #opendev12:04
*** jaicaa has quit IRC12:08
*** jaicaa has joined #opendev12:11
fungifrickler: apparently not old, we're seeing it at least with clouds running victoria, and checking when the neutron side feature was added it was apparently in wallaby? so even testing the most recent stable branch would have caught this particular case12:12
fricklerfungi: in this particular case, yes, but I was thinking about regressions against older clouds which are still heavily being used in production like queens, which we couldn't easily set up in gate even if we wanted to12:19
fungiahh, yeah that's certainly harder12:20
fungithere's probably a happy medium between old enough and so old we can't easily do it, also those jobs could be periodic to reduce the job count on new changes. catching them before a release is still better than after releasing12:21
*** brinzhang has quit IRC12:24
*** brinzhang has joined #opendev12:24
*** lpetrut_ has joined #opendev12:27
*** lpetrut has quit IRC12:29
*** zoharm has joined #opendev12:33
*** jpena|lunch is now known as jpena12:35
*** hamalq has joined #opendev12:39
*** ysandeep|afk is now known as ysandeep12:39
*** xinliang has joined #opendev13:01
*** jaicaa has quit IRC13:04
*** zimmerry has quit IRC13:04
*** artom has quit IRC13:04
*** erbarr has quit IRC13:04
*** jaicaa has joined #opendev13:05
*** zimmerry has joined #opendev13:05
*** artom has joined #opendev13:05
*** erbarr has joined #opendev13:05
xinlianghello, when will the opendev ptg begin? https://etherpad.opendev.org/p/apr2021-ptg-opendev13:05
openstackgerritMerged opendev/system-config master: Add iad3.inmotion mirror node  https://review.opendev.org/c/opendev/system-config/+/78745613:06
xinliangwe are here to attend this ptg. am I mistaking the time? no one is in the meeting room now?13:07
xinliangIt seems the PTG starts an hour later. I was mistaken about the time. Sorry.13:11
fungixinliang: that's correct, 14:00 utc13:13
fungi~45 minutes from now13:13
xinliangok, got it. thanks fungi :-)13:13
*** owner__ has joined #opendev13:14
*** owner__ has quit IRC13:14
*** brinzhang_ has joined #opendev13:15
*** amoralej|lunch is now known as amoralej13:15
*** ocsabat has joined #opendev13:15
*** mfixtex has joined #opendev13:25
*** lpetrut_ has quit IRC13:45
*** brinzhang_ has quit IRC13:50
*** priteau has quit IRC13:53
clarkbthe new inmotion cloud deployment failed because the LE playbook failed somewhere. I don't think it failed on the new mirror as I can see in the acme.sh log there that it appears to have created files13:54
fungilimestone's not offline this time13:54
clarkbI need to shift into ptg prep, but wanted to provide that update13:55
clarkband by prep I mean join the meetpad :)13:55
fungiyeah, i'm trying to find the error in the log13:56
fungithe last thing in the log was TASK [letsencrypt-create-certs : Run acme.sh driver for mirror01-iad3-inmotion-main certificate issue]13:57
fungi13:36:4613:57
clarkbafter the next hourly cloud launcher run runs I'll start a manual rerun using ianw's fixed venv for the sdk issue13:58
clarkbfungi: thats about when acme.sh says it has written a key and intermediate cert and all that13:59
fungiyeah, but there's no summary output from ansible in the log, like it just abruptly stopped14:00
fungiaha! ERROR! The requested handler 'letsencrypt updated mirror01-iad3-inmotion-main' was not found in either the main handlers list nor in the listening handlers list14:00
clarkbzuul doesn't say the job timed out14:00
fungiclarkb: ^ fix is likely straightforward?14:00
clarkbaha, sorry yup14:01
clarkbI'll get that up while we wait for people to join the ptg14:01
openstackgerritClark Boylan proposed opendev/system-config master: Add missing inmotion LE apache restart handler  https://review.opendev.org/c/opendev/system-config/+/78757114:03
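A rough sketch of the kind of check that would catch this before a deploy, assuming handlers follow the "letsencrypt updated <certificate name>" naming seen in the error above. The function name and the second certificate name in the usage note are hypothetical; this is not how system-config validates handlers.

```python
def missing_handlers(cert_names, defined_handlers, prefix="letsencrypt updated "):
    """Return the handler names that certificates would notify but nobody defines.

    cert_names: certificate names, e.g. "mirror01-iad3-inmotion-main".
    defined_handlers: handler names (or listen topics) known to the playbook.
    """
    defined = set(defined_handlers)
    return [prefix + name for name in cert_names if prefix + name not in defined]
```

Called with the new mirror's certificate name and a handler list that predates it, the function returns the same handler name that ansible complained about.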
*** ocsabat has quit IRC14:25
*** xinliang has quit IRC14:38
zbrproblem with mirror-int.dfw.rax.opendev.org? https://zuul.opendev.org/t/openstack/build/35039ad11c3d4cecad2a6219203d286614:51
clarkbzbr: the first thing to check is if the non -int name works (its the same server but different IPs to use internal rax networking if possible)14:52
funginote that's just the rfc-1918 backend network hostname corresponding to mirror.dfw.rax.opendev.org, reachable only in that provider region14:52
clarkbhttps://mirror.dfw.rax.opendev.org/wheel/ubuntu-20.04-x86_64/ yup that is the public path14:52
clarkboh except I got the path wrong14:53
clarkbhttps://mirror.dfw.rax.opendev.org/pypi/simple/tox-bindep/ is what it failed to read14:53
clarkb(which is failing to load for me looks like)14:53
fungithat'll be a proxy to pypi.org14:54
clarkbthe first path I gave hits our afs backed wheel caches. The second path is a proxy to pypi14:54
zbrfor me too14:54
clarkbit seems that the server is generally healthy enough to serve afs, but is having trouble proxying to pypi14:54
clarkbI cannot wget https://pypi.org from the mirror14:55
clarkbseems to be an ipv6 issue. It does work if I wget -4 https://pypi.org14:55
fungimaybe connectivity problem between rackspace and fastly14:55
fungiahh, yeah14:55
fungitraceroute6 to pypi.org dies with timeouts in rackspace's network or at their edge14:56
fungiroughly 4-5 hops from the mirror server14:57
clarkbipv6 seems to generally work though, google is accessible anyway14:57
clarkbas is review0214:57
fungiyeah, i'm thinking that last hop i see is their border gateway, and the failure is somewhere on an asymmetric return path14:57
fungiso possibly v6 routing blackhole in a backbone provider14:58
clarkbI think mnaser was discussing ipv6 route issues today too, I wonder if it is more widespread than vexxhost14:58
mnaserclarkb: that's in another dc though :p14:59
mnaserbut, very possible..14:59
clarkbfungi: we can stick an /etc/hosts rule in for pypi.org to address this maybe?15:00
clarkb(then just remember to remove it when ipv6 is happy again)15:00
fungii'm testing other mirrors15:00
*** amoralej is now known as amoralej|afk15:01
fungiyeah, rax-iad and rax-ord are able to reach pypi just fine15:01
clarkbI have deleted the accidentally created opendev network and subnet in osuosl15:03
zbri do remember having ipv6 problems with pypi at least twice during the last year, same behavior: ipv4 was fine but ipv6 was not, and it was their CDN both times (i have full ipv6 at home)15:03
clarkbthe cloud launcher ran fine when using an older sdk too so I think that is all sorted at this point15:03
zbri think i even raised a bug which was closed as "works now, not our problem"15:03
fungiwell, to be fair, pypi's fastly account is donated to them, they have no real pull with fastly to fix problems15:04
clarkband even then it may not be fastly's fault either15:04
fungiand not having a cdn for pypi was even worse15:04
clarkbcould be somewhere between15:04
fungiyep15:05
clarkbanyway `dig A pypi.org` returns four addresses. We could stick them in /etc/hosts?15:05
zbrcan we force it to use ipv4?15:05
* dtantsur assumes frequent "No matching distribution found" are being discussed already15:05
fungidtantsur: with rax-dfw?15:05
dtantsurdunno, can check15:05
zbrdtantsur: likely the same issue.15:06
clarkbzbr: yes, the /etc/hosts idea is one method to do that15:06
dtantsurfungi: yes, two jobs from the same patch, both on rax-dfw15:07
fungii'm checking now to see if any of the v6 fastly endpoints returned to dfw are reachable15:07
fungibut i agree hard-coding v4 endpoints temporarily in /etc/hosts may get things back working for now15:07
clarkbthe TTLs on those records are super long too so we may need to flush caches (somehow?) and/or restart apache?15:09
zbrhopefully they do not change them very often15:10
fungii've set pypi.org to 151.101.0.223 in /etc/hosts but i probably also need to hardcode an address for the file hosting they have since it uses a separate domain name?15:11
fungipypifiles.pythonhosted.org i think?15:11
* fungi checks15:11
clarkbfiles.pythonhosted.org according to our proxy config15:11
clarkbfungi: no ipv6 issues to that name though15:12
fungiahh, yep thanks just found it15:13
zbrdig files.pythonhosted.org AAAA does return ipv6 entries to me.15:13
clarkbzbr: it does, but they function15:13
fungii agree, that's reachable from the proxy15:13
fungiso it was really just the simple api breaking, and not the file hosting15:13
clarkbwget https://pypi.org works on the mirror now but apache is still failing. I suspect we need to restart apache to make it lookup the name again15:13
fungi#status log Temporarily hard-coded the pypi.org hostname to a Fastly IPv4 address in /etc/hosts on mirror01.dfw.rax.opendev.org until v6 routing between them returns to working order15:14
openstackstatusfungi: finished logging15:14
clarkbfungi: do you want to restart apache too?15:15
fungiis it still cached?15:15
clarkbhttps://mirror.dfw.rax.opendev.org/pypi/simple/tox-bindep/ continues to not work so I suspect so15:15
clarkbthe ttls on those names are huge so wouldn't surprise me if apache is holding on to them15:15
fungiyeah, i'll restart apache215:15
fungisudo systemctl restart apache215:16
clarkb++15:16
fungiought to do it15:16
fungiif i type that in the correct terminal ;)15:16
fungirestart finished15:16
clarkbthat url for tox-bindep loads for me now15:16
fungiloading for me now15:16
clarkbdtantsur: zbr ^ fyi should hopefully be happier now15:16
dtantsur\o/15:17
fungihopefully they work out whatever the routing issue is soon and then we can undo that15:17
fungii'll try to test it periodically15:17
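The temporary workaround above can be sketched as two small text helpers, one to build the /etc/hosts pin and one to take it out again once v6 routing recovers. This is an illustration only: the function names are hypothetical, and the actual change on the mirror was made by hand.

```python
import ipaddress


def hosts_entry(hostname, address):
    """Return an /etc/hosts line pinning hostname to a single IPv4 address."""
    ip = ipaddress.ip_address(address)
    if ip.version != 4:
        raise ValueError("expected an IPv4 address, got %s" % address)
    return "%s %s" % (ip, hostname)


def remove_pin(hosts_text, hostname):
    """Return hosts_text without any line that maps an address to hostname."""
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments, then look at the name fields only.
        fields = line.split("#", 1)[0].split()
        if hostname in fields[1:]:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


print(hosts_entry("pypi.org", "151.101.0.223"))
```

Because the proxy caches lookups, a pin like this only takes effect after the consuming service re-resolves the name, which is why apache2 had to be restarted here.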
*** ocsabat has joined #opendev15:19
clarkbfungi: I'd like to run `sudo ansible-playbook --limit mirror01.iad3.inmotion.opendev.org -v /home/zuul/src/opendev.org/opendev/system-config/playbooks/service-mirror.yaml` to get the inmotion mirror up to speed.15:22
clarkbdo you have any concerns with doing that?15:22
funginone whatsoever15:23
clarkbcool proceeding with that now15:23
fungithis is to work around reenqueuing in deploy?15:23
clarkbyup15:23
clarkbreenqueuing will take forever since it is an inventory update15:23
clarkband that runs all the jobs15:23
*** avass has quit IRC15:26
*** avass has joined #opendev15:27
*** ameono has joined #opendev15:30
clarkbhttps://mirror.iad3.inmotion.opendev.org/ woot thats up15:32
*** ocsabat78 has joined #opendev15:33
clarkbI've pulled my WIP vote off of https://review.opendev.org/c/openstack/project-config/+/787428 I think adding the cloud to nodepool is the next step15:35
*** ykarel is now known as ykarel|away15:36
fungii've gone ahead and approved it, reviewed it yesterday15:37
*** ameono has quit IRC15:38
*** ocsabat has quit IRC15:39
*** ykarel|away has quit IRC15:41
*** ocsabat78 has quit IRC15:41
openstackgerritMerged openstack/project-config master: Add InMotion cloud to nodepool  https://review.opendev.org/c/openstack/project-config/+/78742815:45
*** brinzhang_ has joined #opendev15:55
*** brinzhang has quit IRC15:58
*** jpena is now known as jpena|away16:00
*** hamalq has quit IRC16:06
*** slaweq_ has joined #opendev16:20
*** eolivare has quit IRC16:20
openstackgerritClark Boylan proposed opendev/system-config master: Sepcify inmotion cloud config in clouds.yaml  https://review.opendev.org/c/opendev/system-config/+/78759516:21
*** amoralej|afk is now known as amoralej16:26
*** sshnaidm is now known as sshnaidm|afk16:28
*** ysandeep is now known as ysandeep|away16:30
openstackgerritMerged opendev/system-config master: Add missing inmotion LE apache restart handler  https://review.opendev.org/c/opendev/system-config/+/78757116:37
openstackgerritClark Boylan proposed opendev/system-config master: Use External inmotion cloud network for zuul nodes  https://review.opendev.org/c/opendev/system-config/+/78759516:54
*** andrewbonney has quit IRC17:04
*** ralonsoh has quit IRC17:08
*** ocsabat has joined #opendev17:13
*** ocsabat has quit IRC17:19
*** zoharm has quit IRC17:21
*** dtantsur is now known as dtantsur|afk17:22
*** mfixtex has quit IRC17:23
*** rpittau is now known as rpittau|afk17:24
*** redrobot0 has joined #opendev17:39
*** dhellmann_ has joined #opendev17:42
*** calcmandan_ has joined #opendev17:42
*** elod_ has joined #opendev17:43
*** smcginni1 has joined #opendev17:44
*** elod has quit IRC17:48
*** calcmandan has quit IRC17:48
*** smcginnis has quit IRC17:48
*** redrobot has quit IRC17:48
*** kopecmartin has quit IRC17:48
*** fbo has quit IRC17:49
*** dhellmann has quit IRC17:49
*** redrobot0 is now known as redrobot17:49
*** dhellmann_ is now known as dhellmann17:49
*** fbo has joined #opendev17:55
*** jpena|away is now known as jpena|off18:07
clarkbfungi: the inmotion nodes have all gone away now so I'll set the config to what is proposed above ?18:08
fungisounds reasonable, yep18:08
fungii think one of the check jobs for that change may be stuck18:08
clarkbok done and set max-servers to 6 which should be viable until we get other networking stuff cleaned up18:10
clarkbthat seems to be working, nodes have launched and are in use18:11
clarkbonce the above change lands we can clean some stuff up and then set max-servers back to 8 if the expected ip frees happen18:12
fungiyeah, i'm basically waiting to see it merge and then i'll undo the disable list entry for nl0218:12
*** auristor has quit IRC18:13
*** sboyron has quit IRC18:15
*** auristor has joined #opendev18:25
*** lbragstad_ has joined #opendev18:29
*** lbragstad has quit IRC18:32
*** amoralej is now known as amoralej|off18:33
openstackgerritMerged opendev/system-config master: Use External inmotion cloud network for zuul nodes  https://review.opendev.org/c/opendev/system-config/+/78759518:39
fungiclarkb: i've undone the disable list entry for nl02 now that's merged18:40
clarkbcool. I guess we can probably clean up the router and network now?18:43
*** elod_ is now known as elod18:44
clarkbI'll do that now18:45
clarkbthat cleaned up one of the used IPs18:48
clarkbbut we still have two dhcp interfaces using ips18:48
clarkbI think that means we're still one short of eight18:49
clarkbalso it looks like neutron may just do redundant dhcp servers?18:50
clarkbeverything about these two ports is the same except for the host and their ids18:50
clarkb(they are on the same network and the same subnet etc)18:50
clarkbwe could free up the mirror's router's IP by switching it over to the External network too (build a mirror02 maybe?)18:51
clarkbok yup I found the dhcp agents listing. We can delete one if we are happy with a single agent18:52
fungithat seems fine18:53
fungiworst case we stop getting usable nodes there if one of them dies18:53
clarkbyup18:53
openstackgerritClark Boylan proposed opendev/zone-opendev.org master: Add mirror02 to inmotion  https://review.opendev.org/c/opendev/zone-opendev.org/+/78762819:14
*** hrw has quit IRC19:15
openstackgerritClark Boylan proposed opendev/system-config master: Add mirror02 to inmotion  https://review.opendev.org/c/opendev/system-config/+/78762919:16
openstackgerritClark Boylan proposed opendev/zone-opendev.org master: Flip mirror.iad3.inmotion to mirror02.iad3.inmotion  https://review.opendev.org/c/opendev/zone-opendev.org/+/78763019:18
openstackgerritClark Boylan proposed opendev/system-config master: Cleanup mirror01.iad3.inmotion  https://review.opendev.org/c/opendev/system-config/+/78763119:20
clarkbfungi: that should do it19:20
clarkbI'm trying to figure out how to set the quota to 6 instances so that we don't try to use 8 until all that work above is done and the extra unnecessary network stuff is cleaned up19:22
*** mfixtex has joined #opendev19:23
clarkband that is done19:23
clarkbfungi: https://review.opendev.org/c/opendev/system-config/+/778116 that one would be a good one to review too, then we can land the jeepyb updates behind it19:26
*** slaweq_ has quit IRC19:30
*** fressi has quit IRC19:32
*** whoami-rajat_ is now known as whoami-rajat19:33
fungiyep, just approved it19:35
openstackgerritMerged opendev/zone-opendev.org master: Add mirror02 to inmotion  https://review.opendev.org/c/opendev/zone-opendev.org/+/78762819:37
fungionce dns is resolving i'll approve the inventory one19:37
clarkbthanks!19:38
clarkbI actually think we can maybe approve it now because zuul is serializing everything19:38
clarkband if it fails that is fine because 01 is still the one in charge19:38
clarkbbut then we do need to wait before changes 3 and 419:38
*** lamt has quit IRC19:53
*** lbragstad_ is now known as lbragstad19:56
*** lamt has joined #opendev19:58
*** iurygregory has quit IRC19:59
*** hamalq has joined #opendev20:10
*** d34dh0r53 has quit IRC20:14
*** mfixtex has quit IRC20:23
*** d34dh0r53 has joined #opendev20:27
openstackgerritMerged opendev/system-config master: Handle zuul-summary-results as .jar / per-project config  https://review.opendev.org/c/opendev/system-config/+/77811620:27
*** slaweq has quit IRC20:44
*** snapdeal has joined #opendev20:45
clarkbfungi: ^ do your jeepyb changes need reviews still?21:19
fungimaybe, i'll take another look21:25
fungidns update still hasn't deployed btw21:25
clarkbya the changes that touch host vars or inventory take forever21:26
clarkbwe probably need to reevaluate that and if it is working properly21:26
fungii went ahead and approved them now21:26
clarkbI think the idea was we can split the matches further somehow, but I need to dig into what the idea was behind that21:26
ianwhrm, OSUOSL rolled out AAAA records for their API yesterday, but i'm not seeing nb03 seem to use it21:26
ianw21:25:20.460128 IP 192.168.1.72.36182 > 140.211.9.114.9292: Flags [.], seq 289439405:289445045, ack 0, win 505, length 564021:26
clarkbianw: maybe we're caching the ipv4 for a long time? we had that problem with one of the mirrors today when it stopped hitting pypi.org via ipv621:27
clarkbhad to restart apache to pick up the new record due to long ttls21:27
*** mlavalle has joined #opendev21:27
ianwyeah i just restarted the container to test and it still seems to be talking ipv4 from what i can tell21:28
ianw# ping arm-openstack.osuosl.org21:29
ianwPING arm-controller1.osuosl.org (140.211.9.114): 56 data bytes21:29
ianwin container21:29
ianw# ping arm-openstack.osuosl.org21:29
ianwPING arm-openstack.osuosl.org(arm-controller1.osuosl.org (2605:bc80:3010:104::8cd3:972)) 56 data bytes21:29
ianwout container21:29
fungiahh, so inside the container you're looking it up via unbound, outside via /etc/hosts?21:30
fungier, no i guess it's all via unbound?21:30
clarkbya we should be using host networking and that would send everything to unbound21:31
ianwprint(socket.getaddrinfo('arm-openstack.osuosl.org', 443)) returns ipv621:34
clarkbdo we have a prefer ipv4 setting in clouds.yaml?21:35
clarkbmaybe that is just a nodepool thing for booted instances21:35
ianwthere is a force_ipv4 in clouds.yaml that we have for some clouds, but not this one21:36
*** snapdeal has quit IRC21:38
clarkbya I think that affects the instance creation not api connections21:38
*** irclogbot_1 has quit IRC21:50
*** whoami-rajat has quit IRC21:50
ianw$ telnet -6 arm-controller1.osuosl.org 500021:51
ianwTrying 2605:bc80:3010:104::8cd3:972...21:51
ianwtelnet: connect to address 2605:bc80:3010:104::8cd3:972: Connection refused21:51
ianwi think there's the problem21:51
ianwi guess the firewall hasn't caught up21:52
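The manual checks above (getaddrinfo returning an IPv6 address, then telnet -6 being refused) can be combined into one probe per address family. A minimal sketch using only the standard socket module; the probe helper itself is hypothetical and is not something nodepool or openstacksdk provides.

```python
import socket


def probe(host, port, family, timeout=5.0):
    """Try a TCP connect to every address of host in one address family.

    Returns a list of (address, outcome) pairs, where outcome is "ok"
    or the text of the error that stopped the connection.
    """
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return [(None, "resolution failed: %s" % exc)]
    results = []
    for fam, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            outcome = "ok"
        except OSError as exc:
            outcome = str(exc)
        results.append((sockaddr[0], outcome))
    return results


# Example call (needs network access, so it is left commented out):
# probe("arm-controller1.osuosl.org", 5000, socket.AF_INET6)
```

Running it once with socket.AF_INET and once with socket.AF_INET6 separates "the name has no AAAA record" from "the address resolves but the port is closed", which is the distinction that mattered here.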
*** irclogbot_3 has joined #opendev21:53
openstackgerritMerged opendev/jeepyb master: Update Gerrit hook command-line parameters  https://review.opendev.org/c/opendev/jeepyb/+/78609521:55
openstackgerritMerged opendev/jeepyb master: Correct set_in_progress parameters  https://review.opendev.org/c/opendev/jeepyb/+/78612021:58
zigoianw: I still get "The nodeset "debian-bullseye" was not found." at https://review.opendev.org/c/openstack/puppet-openstack-integration/+/78677222:14
zigoianw: Is there still a problem?22:14
fungizigo: how recently? i see a ready debian-bullseye node in inap-mtl01 which has been there waiting for something to do for almost 2 hours22:25
*** Ramereth has joined #opendev22:25
zigofungi: You can see the time in the review, no? Less than an hour ago...22:26
zigoI can do a recheck ...22:26
fungioh, sorry, i missed you linked the change22:26
zigofungi: Still failing ...22:27
zigoMaybe my patch is wrong? :)22:28
*** kopecmartin has joined #opendev22:28
ianwhrm, some sort of config typo is not out of the question22:28
zigoI'm not very experienced at configuring Zuul jobs in yaml ... :P22:29
clarkbheads up I think the jeepyb changes above promoted images before the system-config change. That means when we promoted the system-config change we set to latest a slightly older image22:31
fungior we have a debian-bullseye label but not a nodeset22:31
clarkbI suspect we'll need to land another change to either jeepyb or system-config to sync things up22:31
zigoIs the "nodeset:" thingy external to the puppet-openstack-integration repo?22:32
clarkb(but maybe zuul will do the right thing in that case and not promote the older image when it runs later?)22:32
ianwoh, we haven't put it in https://opendev.org/opendev/base-jobs/src/branch/master/zuul.d/nodesets.yaml22:32
clarkbah so you might be able to make your own nodeset22:32
fungiyeah, that's it22:33
zigo:)22:33
zigoianw: Shall I do the patch?22:33
openstackgerritIan Wienand proposed opendev/base-jobs master: Add debian bullseye  https://review.opendev.org/c/opendev/base-jobs/+/78764722:34
ianwzigo: was just typing it out ^ :)22:34
zigoGosh, he's fast ... :)22:34
fungishould hopefully merge fairly quickly22:36
fungii've already approved it22:36
ianwfor the record OSU have identified some issues with their ipv6 setup and having apache listen correctly, so have reverted the AAAA records for now.  we can be test subjects when they're ready22:36
clarkbfungi: you're fast too I don't even get a chance to review before it is approved :)22:36
clarkbianw: sounds good!22:36
clarkb(we are godo tes subjects)22:36
clarkband i can't type anymore22:36
fungiit was a very straightforward change22:37
ianwfungi/clarkb: not sure if you saw but I *think* i found a root cause for the blank node issue ->  https://review.opendev.org/c/zuul/nodepool/+/78747522:39
clarkbianw: I hadn't, that seems worthy of a review22:41
ianwi would recommend coffee, or perhaps something a little stronger, before trying to consider node lock path layouts and recursive delete interactions :)22:42
fungibut hey, it has tests!22:43
clarkbianw: my favorite thing (and I'm serious) is when the commit message is like 3x as long as the change22:44
zigoStatus of stuff in opendev/base-jobs cannot be seen in zuul.o.o ?22:47
clarkbzigo: https://zuul.opendev.org then select the opendev tenant22:48
zigoOh, I see...22:49
clarkbianw: the comment I just left may also require extra coffee22:50
* zigo sends coffee to everyone22:50
fungizigo: there's also a tenant selector at the top of the status page if you want to change the tenant view22:51
corvusany reason we don't have stage-output in base job?22:58
clarkbmay just be an incomplete migration to the new logging aggregation plan?22:59
fungiyeah, currently that role is used in the run-base-post playbook23:01
fungiin system-config23:01
fungifor the zuul playbook23:01
fungiother than that, seems to just be used by devstack and grenade jobs?23:01
corvusyeah.  seems like this should be standard:  stage-output (lets you use vars to copy files to ~/zuul-output on remote); then fetch-output (copies ~/zuul-output on remote to logs/$hostname in local workspace); then merge-output-to-logs (copies artifacts to logs on workspace)23:03
corvuswe should try running it twice in system-config, and if it works, move it to base-jobs and drop our own23:04
fungineat, codesearch for it also turns up a comment which looks like it should have been a bug report: "This role is needed because stage-output does not support duplicated file names, even if they come from different directories." https://opendev.org/openstack/grenade/src/branch/master/roles/prepare-grenade-logs/README.rst23:04
corvusno idea what that comment means23:05
fungiyeah, i think (based on looking at the main.yaml) it's that if you try to use stage-output and have subdirectories containing files with the same name then "something" breaks, so they rename some of their filenames in the tree to make sure they're unique23:07
fungii would be surprised if that were really the case though23:08
openstackgerritMerged opendev/base-jobs master: Add debian bullseye  https://review.opendev.org/c/opendev/base-jobs/+/78764723:08
openstackgerritJames E. Blair proposed opendev/system-config master: DNM: Run stage-output twice  https://review.opendev.org/c/opendev/system-config/+/78765023:09
fungilooks like https://review.opendev.org/548936 added the workaround but the commit message doesn't add any additional explanation for it23:11
zigofungi: ianw: YEAH ! \o/ Working ... :P23:12
zigoSee #786772 status in zuul ... :P23:12
zigoThanks guys.23:13
clarkbianw: https://b94c7b46e389cd502b3d-8deb753aa4e0bfff4caa3bbfcf763c6f.ssl.cf1.rackcdn.com/787629/1/gate/system-config-run-review-3.2/96d9bbe/job-output.txt that seemed to fail in selenium23:16
clarkbit says "javascript error"23:17
clarkbdid the image update that just happened somehow break that?23:17
zigoWeirdo logs though, looks like the apt sources.list is wrong or something.23:17
*** tosky has quit IRC23:17
zigo2021-04-22 23:16:18.289009 | debian-bullseye | Err:9 https://mirror.mtl01.inap.opendev.org/debian n/a-backports/main Sources23:18
zigo2021-04-22 23:16:18.289186 | debian-bullseye |   404  Not Found [IP: 198.72.125.6 443]23:18
zigoWhy "n/a" ?23:18
clarkbzigo: can you link to the job logs?23:18
clarkbmy guess is ansible is saying "n/a" for some fact23:19
zigohttps://zuul.openstack.org/stream/08b38f67485b49828bb6e72c08bed039?logfile=console.log23:19
clarkboh the job isn't done yet, hard to say then as we can't see the facts yet23:19
zigoHas Ansible been configured to use py3 only? (as py2 is removed from Bullseye)23:21
fungii think we default to py323:21
fungianyway, once that build reports, yeah, we should have a better idea of what's wrong there (what's ending up with "n/a" instead of "bullseye")23:21
clarkbmy hunch is that because its a new unreleased distro release ansible isn't properly setting some fact and it defaults to n/a23:22
clarkbianw:  "TypeError: document.querySelector(...).shado...tacktrace" I'm reading that to say that maybe the selenium test is broken because the elements in the page are different now?23:22
clarkbthe approved change to the plugin did pass testing though so this is really weird to me23:23
ianwclarkb: urgh, that may be the case ...23:24
zigolsb_release -c -s <--- This really says bullseye ... so it's ansible being silly ! :) Maybe it only checks for /etc/debian_version ?23:24
ianwwouldn't think 3.2 would fail that23:25
openstackgerritClark Boylan proposed opendev/system-config master: DNM noop change to trigger system-config-run-review jobs  https://review.opendev.org/c/opendev/system-config/+/78765223:26
clarkbianw: ^ I'll put holds on the two jobs for that change, but not sure I'll be able to debug further today23:26
ianwclarkb: 'status': 500 makes me maybe feel like it was gerrit returning the error23:27
openstackgerritMerged opendev/jeepyb master: Bump gerritlib requirement to 0.10.0  https://review.opendev.org/c/opendev/jeepyb/+/76535723:28
clarkbianw: I think that is coming from the webdriver23:28
clarkbbut ya it could be passing it through23:28
clarkbthe holds are in place23:29
clarkbdo we want to put review in emergency to prevent ansible from removing the currently running image? (I suppose that may have already happened?)23:30
clarkbits really weird though that the change to update the image passed and presumably ran the same selenium tests23:30
ianwi would have expected to see something in the apache logs23:32
ianwalthough maybe we talk to gerrit directly23:32
clarkbanother options is we can tag the existing image with a "please preserve me" tag23:33
ianwhttp://localhost:8081 yeah23:33
clarkbianw: do we collect the gerrit error log?23:33
ianwyep, replication errors in there which i think are expected, can't see anything else23:33
ianwi am starting to think it is a javascript error23:34
clarkbya I see the file, and ya replication errors make sense23:35
ianwthat query works against review at least23:39
ianwreview.opendev.org23:39
clarkbianw: looking at how we pull gerrit images it doesn't seem like we do that automatically?23:39
clarkbianw: do you know if that is the case?23:39
clarkb(if so then we're probably ok with what is running in production until we can run this down)23:39
ianwyeah, we don't auto restart gerrit23:39
clarkbya but do we do a pull and/or prune?23:40
clarkbwhich could potentially cleanup the image we're currently running?23:40
clarkb(I'm not seeing it if we do. My read on this is we should be ok)23:40
openstackgerritMerged opendev/jeepyb master: Set default branch in .gitreview files when creating project  https://review.opendev.org/c/opendev/jeepyb/+/75859523:41
ianwthe held node should reveal all23:41
clarkbianw: ++, is that something you'll be able to look at?23:41
ianwyep, i'll keep an eye on it23:42
clarkbthanks!23:42
clarkbI've got dinner to sort out momentarily and I had to readd my ssh keys to the agent to do the hold so my computer is telling me I have computered long enough today23:42
fungii've got an appointment first thing in the morning, so will likely drop off once the opendev ptg block is over in 15 minutes23:44
*** paladox has joined #opendev23:53
*** mlavalle has quit IRC23:59
