fungi | ianw: so the problem boils down to wireless bridging, and isn't really ipv6-specific | 00:10 |
---|---|---|
fungi | 802.11 isn't ethernet, is the crux of the challenge | 00:11 |
fungi | with ethernet, you can have a virtual ethernet bridge on your laptop, and connect the virtual machine's interface to it with a distinct mac | 00:12 |
ianw | right, that's what i have done when i'm at my office desk in the dock, where i'm wired into the network | 00:13 |
fungi | yeah, the ethernet part is the straightforward part | 00:13 |
fungi | i've not tried bridging a virtual interface to a wlan, i'll need to look around and find out what's out there for that | 00:14 |
fungi | i know it exists, because folks set up linux machines as wireless access points all the time | 00:14 |
fungi | so they need to be able to bridge ethernet frames/addresses to wifi and back | 00:15 |
ianw | well yeah, i can't add my wifi card to a bridge on my laptop | 00:16 |
fungi | ianw: arch wiki says to use hostapd to map it: https://wiki.archlinux.org/index.php/Network_bridge#Wireless_interface_on_a_bridge | 00:16 |
ianw | hrm, this must be something like virtualbox does under the hood | 00:17 |
fungi | also links to the debian wiki which suggests alternative of translating via ebtables: https://wiki.debian.org/BridgeNetworkConnections#Bridging_with_a_wireless_NIC | 00:17 |
ianw | that allows me to select a network of "bridged adapter" and select the wifi interface | 00:18 |
ianw | which is great, but it doesn't work | 00:18 |
fungi | ianw: an alternative would be to set up a router on your laptop with a separate routed subnet... how much v6 space do you get? can you request larger or multiple allocations via dhcp6-pd or similar? | 00:19 |
fungi | the idea would be to configure downstream prefix delegation to your laptop, and then have it run a local dhcpd on the virtual bridge and something like radvd to announce that network, and then route between the vm and your laptop's gateway | 00:22 |
ianw | i have a /56 ... but a) my preference would be to not hard-code it; e.g. to handle the case of when i occasionally tether to 4g and am not on the home network, and b) how does that subnet know to go via the ethernet or wifi | 00:22 |
fungi | yeah, i'm suggesting you don't hard-code it, but... it's multiple daemons to have to manage | 00:22 |
fungi | if you also run dhcp6 on your router it can serve an additional /64 to your laptop which your laptop then in turn announces on the virtual bridge | 00:23 |
fungi | but i do think the layer 2 solution is likely to be less work than the layer 3 solution | 00:24 |
ianw | yeah, i'm running a unifi security gateway. it really only provides for slaac on a network; but you can define multiple networks, each with a prefix from the /56 | 00:24 |
fungi | it may not be flexible enough in that case. it's one of the reasons my home router is a sbc with straight up openbsd installed | 00:25 |
fungi | makes it possible to play around with much more complex network configurations | 00:25 |
ianw | (which is exactly what I *don't* want in the house :) | 00:26 |
fungi | so anyway, one of the layer 2 solutions involving either hostapd or ebtables on your laptop is probably what you need | 00:27 |
ianw | i think that must be similar to what virtualbox is doing, which unfortunately seems to have issues https://www.virtualbox.org/ticket/5503 | 00:28 |
ianw | "The benefit of using an IP / Layer 3 solution is that it will work when the outward-facing interface is a wireless ethernet client, without using WDS and without resorting to NAT. Simple Layer 2 bridging does not work in this case due to the vagaries of wireless AP client behaviour; WDS may help but AP support can be patchy. " | 00:31 |
donnyd | ianw: I would think there is a path where your laptop becomes a router instead of a NAT device. If you advertise a subnet from your laptop - you should be able to route v6 networks.. | 00:42 |
donnyd | unless what you are looking for is something like what libvirt does by default - which is just "nat all outbound traffic from this bridge to whatever" | 00:43 |
ianw | donnyd: bridge, NAT, carrier pigeon would be fine, i'd take anything at this point :) | 00:45 |
fungi | donnyd: yep, we covered routing above... it should work but if he wants to avoid hard-coding prefixes it likely involves running a dhcp6d somewhere to subdelegate the additional routed prefix | 00:54 |
fungi | and then route announcements from the laptop to the local lan for routing and onto the bridge for slaac and default route | 00:55 |
ianw | i'm not sure how that knows to send packets out the wired or wireless interface though | 00:56 |
fungi | here i'm assuming you're only ever connecting one at a time | 00:56 |
fungi | having more than one interface routing to the internet at the same time gets... weird | 00:57 |
ianw | well, yes, only one at a time i guess. networkmanager is generally handling this | 00:57 |
fungi | (for v6 or v4) | 00:57 |
fungi | yeah, so your laptop would route those packets "normally" based on whichever interface knows how to get packets where they're destined | 00:58 |
ianw | do people just not use vm's on wifi? | 01:02 |
fungi | that's my suspicion | 01:03 |
fungi | i don't anyway | 01:03 |
ianw | i'm willing to hardcode prefixes because i guess i'm not going anywhere for 6 more weeks :/ i would like the ability to switch to 4g and have it "just" work ... | 01:03 |
fungi | i also mostly launch virtual machines out on the internet rather than locally | 01:04 |
ianw | i've always had a "work" vm, that keeps work keys etc. separate and never logs into any personal accounts etc. | 01:04 |
ianw | it's also nice when the laptop breaks that it's just a cp of the image to get going again | 01:05 |
fungi | ahh, yeah, i just use separate physical machines, but i also don't ever take machines with work keys on them out of the house, so permanently wiring them to the lan is the lazy solution in my case | 01:07 |
ianw | it looks like in NM i can create an ethernet interface with an option "shared to other computers" for ipv6 | 01:08 |
donnyd | You can also use BGP on your laptop (a little heavy), but it will surely work | 01:19 |
fungi | yeah, but radvd or similar should do the trick in this case | 01:20 |
donnyd | surely | 01:20 |
fungi | ibgp just for v6 would definitely be overkill | 01:20 |
donnyd | yes, but fancy and fun | 01:20 |
fungi | heck, even simple rip becomes a useful igp for ipv6 | 01:21 |
donnyd | I think you *can* also NAT ipv6 just like you can ipv4 as well.. I just don't know how to do it. | 01:22 |
donnyd | so you could mock the rules up in ipv6 that are there by default for v4 | 01:22 |
fungi | yeah, i expect iptables could do it | 01:23 |
fungi | nat just makes me queasy | 01:23 |
donnyd | yea, its not very ipv6y | 01:23 |
donnyd | routing is the preferred method I would imagine - but for this use case, I do believe you can just masquerade a v6 subnet just like a v4 one and get the same thing | 01:24 |
donnyd | ianw: are you trying to connect inbound to these VM's or are you just trying to get them a v6 address | 01:25 |
donnyd | I would imagine the end state would require knowing exactly what you want on the instances | 01:25 |
ianw | donnyd: i don't care about inbound. i just want to ssh to nodes via ipv6 basically :) | 01:25 |
fungi | yeah, you could just use linklocal on the bridge and nat it then | 01:26 |
donnyd | from the laptop while you are on it, or from other non local machines | 01:26 |
donnyd | non-localhost machine I should say | 01:27 |
fungi | oh, wait, not linklocal. i don't think you can convince the laptop to route that, even via nat | 01:27 |
fungi | i think there is a block set aside as non-global v6 for nat though | 01:27 |
donnyd | yea, you would have to setup dhcpv6 and give them a real address so it can be NAT properly I would think | 01:28 |
* fungi checks iana | 01:28 | |
donnyd | I have never actually tried any of it though - I use the bgp functions in openstack for my v6 needs | 01:28 |
ianw | fungi: yeah, i looked into that. i don't think there is an equivalent | 01:29 |
donnyd | https://en.wikipedia.org/wiki/Unique_local_address | 01:30 |
donnyd | As a result, the IETF reserved the address block fc00::/7 in October 2005 for use in private IPv6 networks and defined the associated term unique local addresses. | 01:31 |
fungi | fc00::/7 | 01:31 |
fungi | yep, that one | 01:31 |
donnyd | that one I think should work | 01:31 |
fungi | https://tools.ietf.org/html/rfc4193 | 01:31 |
donnyd | I am real curious to see how this works out - I may even have to give this a whirl myself | 01:32 |
fungi | is what i was looking for | 01:32 |
fungi | interestingly, rfc 4193 doesn't mention nat | 01:33 |
fungi | it seems to be concerned more with vpn use cases | 01:33 |
fungi | but i don't immediately see why it wouldn't work to do layer 4 pat/overload nat for it to a single global address | 01:34 |
ianw | Since 6.5.0 it is possible to enable NAT with IPv6 networking. As noted above, IPv6 has historically done plain forwarding and thus to avoid breaking historical compatibility, IPv6 NAT must be explicitly requested. | 01:34 |
ianw | ... maybe i've missed something ... | 01:34 |
ianw | https://libvirt.org/formatnetwork.html#elementsConnect | 01:36 |
ianw | v6.5.0 (2020-07-03) | 01:38 |
ianw | ok, maybe i can be forgiven for missing it was added in the last month ... still .. this is promising maybe | 01:39 |
fungi | heh | 01:39 |
fungi | timely! | 01:39 |
ianw | https://www.redhat.com/archives/libvir-list/2020-June/msg00334.html | 01:40 |
ianw | ok dokie, i've got the libvirt preview copr installed with 6.6.0 so ... quitting irc now to see if i can get this to work ... | 01:55 |
donnyd | fungi: does this work for you `curl -k https://[2001:470:8:2e9::1]:5000` | 02:01 |
donnyd | ianw: reminded me to tighten up my firewall rules for v6 - I could have pretty much been an interwebs router the way I had it - LOL | 02:02 |
*** icey has quit IRC | 02:05 | |
*** icey has joined #opendev | 02:08 | |
donnyd | I can also confirm that ipv6 MASQUERADE does work - at least with my setup | 02:14 |
donnyd | ip6tables -t nat -I POSTROUTING -s fc00::/64 -o $YOUR_INTERFACE -j MASQUERADE | 02:14 |
ianw | donnyd: it works for me!!!!!!!!!!!!!!!!!! | 02:38 |
ianw | i have ipv6 over my wifi connection!!!!!! | 02:38 |
ianw | inet6 fc00:dead:beef:55:2450:cdd1:93cc:da6e/64 scope global dynamic noprefixroute | 02:39 |
ianw | valid_lft 3476sec preferred_lft 3476sec | 02:39 |
donnyd | woot woot!!! | 02:40 |
donnyd | now when you switch between does it still work? | 02:40 |
donnyd | and did you set it up, or did libvirt? | 02:40 |
ianw | yes, it seems to work! i am not on wired connection | 02:43 |
ianw | i mean i am *now* | 02:43 |
ianw | i can still hit 2e9::1 | 02:43 |
ianw | i do not really understand how it is working, and there's something about having to set accept_ra on the wifi card that i might have to loop via libvirt mailing list for | 02:44 |
ianw | it also seems that things aren't connecting via ipv6 by default | 02:46 |
ianw | i wonder if that's because it's a fc00 address | 02:46 |
*** zbr9 has joined #opendev | 03:03 | |
*** auristor has quit IRC | 03:04 | |
*** zbr has quit IRC | 03:04 | |
*** dpawlik2 has quit IRC | 03:04 | |
*** zbr9 is now known as zbr | 03:04 | |
donnyd | that is possible | 03:11 |
donnyd | what is in your dnsmasq config | 03:11 |
donnyd | here is what I ended up using | 03:11 |
donnyd | `enable-ra | 03:13 |
donnyd | dhcp-option=option6:dns-server,[fc00::4] | 03:13 |
donnyd | dhcp-range=::100,::1ff,constructor:ens1f4.3,ra-names,slaac,infinite | 03:13 |
donnyd | ra-param=ens1f4.3,mtu:1450,high,60,1200 | 03:13 |
donnyd | ` | 03:13 |
*** auristor has joined #opendev | 03:14 | |
donnyd | and here are the results https://usercontent.irccloud-cdn.com/file/46tvwUCG/image.png | 03:15 |
donnyd | ianw: I have been wanting to get my v6 stuff cleaned up, so thank you for the reminder :) | 03:17 |
ianw | hrm, i guess libvirt must be running it in the background, let me poke | 03:18 |
ianw | dhcp-range=fc00:dead:beef:55::,ra-only | 03:19 |
ianw | that's basically it | 03:19 |
ianw | i guess something in the 2001:'s is better | 03:43 |
ianw | 2001:10::/28 deprecated (formerly ORCHID) | 03:44 |
ianw | maybe that's a good place that will never conflict | 03:44 |
donnyd | well that was to show my public v6 address is from my edge router | 03:45 |
donnyd | and what is on my machine is being NAT outbound | 03:46 |
donnyd | I am happy you got it working | 03:46 |
donnyd | and I am also happy you reminded me to clean up my ipv6 mess in my edge machine | 03:46 |
donnyd | LOL | 03:46 |
ianw | donnyd: me too :) dropped a message with some of the bumps and see what comes of it : https://www.redhat.com/archives/libvirt-users/2020-August/msg00042.html | 03:47 |
*** raukadah is now known as chkumar|rover | 04:43 | |
*** ysandeep|away is now known as ysandeep | 05:04 | |
*** danpawlik has joined #opendev | 06:37 | |
*** auristor has quit IRC | 06:47 | |
*** hashar has joined #opendev | 06:49 | |
*** danpawlik has quit IRC | 06:54 | |
*** danpawlik has joined #opendev | 06:55 | |
*** hashar has quit IRC | 07:05 | |
*** tosky has joined #opendev | 07:41 | |
*** hashar has joined #opendev | 07:51 | |
*** moppy has quit IRC | 08:01 | |
*** moppy has joined #opendev | 08:01 | |
*** DSpider has joined #opendev | 08:09 | |
*** ysandeep is now known as ysandeep|lunch | 08:25 | |
*** priteau has joined #opendev | 08:26 | |
*** priteau has quit IRC | 08:42 | |
*** priteau has joined #opendev | 08:58 | |
yoctozepto | morning channel | 09:04 |
yoctozepto | did something zuul/nodepool-related change today (2020-08-11) between 0 and 6 (UTC)? asking because kolla started getting permission errors in jobs; could be new images but it must be something common to all distros | 09:06 |
ianw | yoctozepto: i'm not aware of anything deliberately changing ... not that that is much help :) | 09:10 |
ttx | lourot: all set at https://github.com/openstack/charm-keystone-kerberos -- now it will sync at the next push. | 09:10 |
ttx | I'll have a look and see if it would be complicated to trigger that sync after the repo is first created | 09:11 |
yoctozepto | ianw: thanks, no problem, glad to know it was not on purpose :-) | 09:12 |
yoctozepto | FWIW, ansible seems to have released during that time frame https://github.com/ansible/ansible/releases | 09:19 |
lourot | ttx, thanks a lot! | 09:22 |
*** ysandeep|lunch is now known as ysandeep | 09:32 | |
*** tkajinam has quit IRC | 09:36 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: add-build-sshkey: call cmd with command https://review.opendev.org/745646 | 09:41 |
openstackgerrit | Vanou Ishii proposed opendev/puppet-openstackci master: Fix Misuse of Markdown Syntax in reStructuredText https://review.opendev.org/745647 | 09:50 |
openstackgerrit | Vanou Ishii proposed opendev/puppet-openstackci master: Fix Misuse of Markdown Syntax in reStructuredText https://review.opendev.org/745647 | 09:57 |
yoctozepto | ianw: it's to be fixed on our side but notice there is a breaking change in stable ansible backports: https://github.com/ansible/ansible/pull/70221 | 10:04 |
yoctozepto | default mode has changed | 10:04 |
yoctozepto | "for security reasons" | 10:04 |
*** priteau has quit IRC | 10:11 | |
*** dirk has joined #opendev | 10:43 | |
*** hashar has quit IRC | 10:53 | |
frickler | yoctozepto: poor kitty ;) | 11:01 |
yoctozepto | frickler: :-) | 11:06 |
*** hashar has joined #opendev | 11:12 | |
*** hashar has quit IRC | 11:12 | |
*** auristor has joined #opendev | 11:58 | |
*** iurygregory has quit IRC | 12:25 | |
*** iurygregory has joined #opendev | 12:26 | |
*** ryohayakawa has quit IRC | 12:27 | |
*** hashar has joined #opendev | 12:36 | |
fungi | i mentioned it over in #zuul too | 12:45 |
fungi | if this turns out not to be a temporary regression, the service-discuss and zuul-discuss mailing lists might also warrant a heads up | 12:46 |
openstackgerrit | Riccardo Pittau proposed openstack/diskimage-builder master: Do not install python2 packages in ubuntu focal https://review.opendev.org/745665 | 12:52 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Two small gerrit base image cleanups https://review.opendev.org/745595 | 13:00 |
openstackgerrit | Carlos Goncalves proposed openstack/diskimage-builder master: source-repositories: git is a build-only dependency https://review.opendev.org/745678 | 13:53 |
*** mlavalle has joined #opendev | 13:57 | |
*** smcginnis has quit IRC | 14:01 | |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Two small gerrit base image cleanups https://review.opendev.org/745595 | 14:02 |
openstackgerrit | Thierry Carrez proposed openstack/project-config master: Allow TC to review governance-sigs changes https://review.opendev.org/745679 | 14:03 |
*** smcginnis has joined #opendev | 14:05 | |
*** chkumar|rover is now known as raukadah | 14:42 | |
openstackgerrit | Merged openstack/project-config master: Allow TC to review governance-sigs changes https://review.opendev.org/745679 | 14:55 |
openstackgerrit | Julia Kreger proposed openstack/diskimage-builder master: Handle NetworkManager for dhcp-all-interfaces https://review.opendev.org/745698 | 15:12 |
*** mlavalle has quit IRC | 15:15 | |
*** mlavalle has joined #opendev | 15:17 | |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Two small gerrit base image cleanups https://review.opendev.org/745595 | 15:23 |
clarkb | I think that change has fixed the codemirror-editor issue (likely simply due to setting the tag checkout) | 15:26 |
clarkb | still working on the javamelody plugin issue but I think that latest patchset is close | 15:27 |
clarkb | opendev has finished. I'm going to take a potentially long break now and see if I can get a bike ride in too. I'll be back for the meeting | 15:40 |
*** ysandeep is now known as ysandeep|away | 15:56 | |
*** tosky has quit IRC | 16:59 | |
*** hashar has quit IRC | 17:10 | |
clarkb | https://review.opendev.org/745595 seems to fix the java melody plugin issues on 2.16. I need to run through all images again and test them but that change is looking good | 17:59 |
clarkb | infra-root ^ if you'd prefer I can split it up into a few changes too | 17:59 |
clarkb | but I think I need to prep for the meeting, then after the meeting I want to try and deploy the gerritbot container on eavesdrop | 18:00 |
fungi | clarkb: seems to break the 3.0 image build, just pulling up logs now | 18:13 |
fungi | ERROR: Skipping 'plugins/javamelody:javamelody-deps_deploy.jar': no such target '//plugins/javamelody:javamelody-deps_deploy.jar': target 'javamelody-deps_deploy.jar' not declared in package 'plugins/javamelody' (did you mean 'javamelody_tests_deploy.jar'?) defined by /home/zuul/src/gerrit.googlesource.com/gerrit/plugins/javamelody/BUILD | 18:14 |
clarkb | https://gerrit.googlesource.com/plugins/javamelody/+/refs/heads/stable-3.0/src/main/resources/Documentation/build.md seems to show that the extra copy isn't needed in 3.0 | 18:15 |
clarkb | but also the target is gone too | 18:16 |
fungi | ahh, so we can version-bound that step? | 18:16 |
clarkb | ya I'll get a new ps up shortly | 18:16 |
fungi | no rush, just making sure it wasn't something more complicated | 18:16 |
openstackgerrit | Merged zuul/zuul-jobs master: Allow ara-report to run on any node https://review.opendev.org/742971 | 18:18 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Two small gerrit base image cleanups https://review.opendev.org/745595 | 18:24 |
clarkb | I think ^ should do it. | 18:24 |
*** mlavalle has quit IRC | 18:54 | |
*** mlavalle has joined #opendev | 18:54 | |
openstackgerrit | Merged opendev/gerritbot master: Add option to disable daemonization https://review.opendev.org/745240 | 19:27 |
*** hashar has joined #opendev | 19:31 | |
openstackgerrit | Merged zuul/zuul-jobs master: add-build-sshkey: call cmd with command https://review.opendev.org/745646 | 19:34 |
clarkb | fungi: apparently my glob hack with Dockerfile does not work | 19:56 |
clarkb | the internets lied to me ! :) | 19:56 |
clarkb | I guess we may need to split the dockerfile into 2 | 19:56 |
clarkb | corvus: ^ do you have any better ideas for that? | 19:56 |
fungi | it's less makefile-like than i would have hoped | 20:02 |
corvus | clarkb: i'm guessing it's failing on no files; so it would work for >=1 file | 20:03 |
corvus | clarkb: could put a readme in there :) | 20:03 |
corvus | clarkb: if you can put that last, you could make a multi-stage build and set the target to that on the 2.x builds | 20:05 |
clarkb | corvus: oh thats a neat hack | 20:05 |
clarkb | ya I think it can go last | 20:05 |
clarkb | fwiw I've written the split too and its not too bad, though having one would be nice I think | 20:06 |
corvus | clarkb: also, fyi, i think we could use the bazel-build role from zuul jobs, but not if we modify it for javamelody | 20:06 |
corvus | clarkb: but maybe we just keep this one around until 3.x, then drop it and use zuul-jobs | 20:06 |
clarkb | ya 3.x seems to be a lot cleaner build wise | 20:07 |
*** hashar has quit IRC | 20:07 | |
clarkb | hrm not sure how the target stuff would work, it will bother me that 3.x is uploaded from a different baser target than 2.x | 20:09 |
*** hashar has joined #opendev | 20:09 | |
clarkb | meh maybe thats ok | 20:09 |
openstackgerrit | Logan V proposed openstack/project-config master: Disable limestone provider https://review.opendev.org/745733 | 20:09 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Gerrit image cleanups/fixes https://review.opendev.org/745595 | 20:17 |
clarkb | ok I think ^ should do it | 20:17 |
fungi | logan-: i've gone ahead and approved that ^ but be aware that nodepool is going to keep uploading images to glance. if that's not desirable we can also adjust that | 20:17 |
fungi | also our ansible is going to continue trying to manage configuration for the mirror instance hosted there | 20:18 |
fungi | which we can also disable if needed | 20:18 |
clarkb | I'm approving the gerritbot on eavesdrop change now | 20:18 |
clarkb | I'll disable gerritbot on review.o.o once it looks like deploy will get close after merging | 20:18 |
fungi | logan-: similarly we have automation checking/updating ssh keys and security groups | 20:18 |
logan- | fungi: Thanks, it _should_ be OK, we'll have to reboot the mirror and probably cold-migrate it to a new host once we start upgrading, but it would only be a few minutes of outage for that. API access might be spotty for image uploads and other operations. No issue from my end leaving that turned on.. my only concern is if there's potential for API inaccessibility to cause grief in opendev's ansible / nodepool instances. | 20:20 |
clarkb | logan-: no it should be fine, we'll error and then try again until we stop erroring | 20:22 |
fungi | logan-: yeah, that's all fine. i was more thinking like if the entire api is going to be offline for weeks | 20:22 |
logan- | Ok sounds good. I'm not expecting anything like that but if something comes up where we're looking at extended downtime I'll give y'all a heads up! | 20:24 |
fungi | thanks again, we really appreciate all the help! | 20:24 |
clarkb | how does this look #status notice Gerritbot will be offline for a short period while we redeploy it on a new server | 20:24 |
clarkb | I'll send that when I stop the review.o.o service? | 20:24 |
fungi | sure, can't hurt | 20:26 |
fungi | or maybe "the openstackgerrit irc bot (gerritbot) will be..." | 20:27 |
fungi | just in hopes folks don't misread and think it's gerrit going down | 20:27 |
clarkb | ++ | 20:28 |
clarkb | new rev: #status notice The openstackgerrit IRC bot (gerritbot) will be offline for a short period while we redeploy it on a new server | 20:28 |
fungi | lgtm | 20:29 |
clarkb | zuul says we are about half an hour from the change merging then we need infra-prod-service-eavesdrop to run | 20:29 |
openstackgerrit | Merged openstack/project-config master: Disable limestone provider https://review.opendev.org/745733 | 20:32 |
openstackgerrit | Merged opendev/system-config master: Add ansible role to manage gerritbot https://review.opendev.org/744795 | 20:50 |
clarkb | #status notice The openstackgerrit IRC bot (gerritbot) will be offline for a short period while we redeploy it on a new server | 20:50 |
openstackstatus | clarkb: sending notice | 20:50 |
-openstackstatus- NOTICE: The openstackgerrit IRC bot (gerritbot) will be offline for a short period while we redeploy it on a new server | 20:51 | |
fungi | whee! | 20:51 |
*** openstackgerrit has quit IRC | 20:52 | |
* fungi watches as his weechat channel activity bar lights up from all the statusbot notices | 20:52 | |
clarkb | infra-prod-service-eavesdrop is queued | 20:52 |
clarkb | now we wait | 20:52 |
openstackstatus | clarkb: finished sending notice | 20:54 |
clarkb | the process seems to be running. We got a warning about review's host key but that doesn't seem to be fatal as we are getting gerrit events | 20:58 |
clarkb | but so far nothing that would produce an irc message? | 20:58 |
clarkb | anyone have a change they want to update? | 20:58 |
* fungi checks | 21:00 | |
*** hashar has quit IRC | 21:00 | |
clarkb | hrm according to the logs it should've sent stuff to #openstack-neutron but not seeing that | 21:01 |
clarkb | we are currently logging at INFO level and not DEBUG because DEBUG was super verbose but we can switch to DEBUG and restart again and see if that gives any more info (not seeing any reason for lack of logging to #openstack-neutron) | 21:02 |
clarkb | also we're double logging that should be an easy fix | 21:03 |
fungi | clarkb: i see one in #openstack-tripleo! | 21:04 |
fungi | er, #tripleo | 21:05 |
clarkb | ok I see that one logged too | 21:05 |
clarkb | am I perhaps netsplit from the bot? | 21:05 |
fungi | http://eavesdrop.openstack.org/irclogs/%23tripleo/%23tripleo.2020-08-11.log.html#t2020-08-11T20:59:20-2 | 21:05 |
fungi | meetbot logged it at least | 21:05 |
clarkb | ya so its somewhat working | 21:06 |
fungi | also the join message said "openstackgerrit (~openstack@eavesdrop01.openstack.org)" so coming from the correct server | 21:08 |
fungi | not accidentally restarted on review01 | 21:08 |
smcginnis | Is this work relevant to the #openstack-stable changes I'm trying to do in https://review.opendev.org/#/c/744947/ | 21:09 |
clarkb | fungi: http://eavesdrop.openstack.org/irclogs/%23airshipit/%23airshipit.2020-08-11.log.html#t2020-08-11T20:59:18-2 that one worked too. That was the first one then tripleo was second. I'm wondering if it hit a problem after those and then broke | 21:10 |
clarkb | but skimming the logs I don't see that happening | 21:10 |
clarkb | smcginnis: yes, this should get us back to auto loading new gerritbot configs | 21:10 |
smcginnis | Cool | 21:10 |
clarkb | fungi: ya the first 3 messages sent all show up on eavesdrop meetbot channel logs | 21:11 |
clarkb | and I see the airshipit events in my client so wasn't split at that point (so probably not split at all) | 21:11 |
clarkb | there was a newer airshipit event a few minutes ago that I don't see so not channel specific | 21:12 |
clarkb | ah here we go server not connected errors now | 21:13 |
clarkb | fungi: I'm going to switch us to debug logging and restart | 21:14 |
fungi | fun | 21:18 |
clarkb | tripleo, ansible and stable should've all just gotten events | 21:20 |
fungi | i just pushed a pbr change too, which would have notified #openstack-infra and #openstack-oslo (at least) | 21:21 |
clarkb | I see the three I called out | 21:22 |
clarkb | now looking at logs for why your pbr change may have failed | 21:22 |
fungi | remote: https://review.opendev.org/744720 Add Release Notes to documentation | 21:24 |
clarkb | ya it saw the event from gerrit | 21:25 |
clarkb | and says it sent things to oslo and friends | 21:25 |
clarkb | netstat shows we're connected to card.freenode.net | 21:25 |
clarkb | its like the tcp connection is dying after it works for a bit and it doesn't seem to know why | 21:26 |
clarkb | I'm hoping that with debug logging we'll get more info once it notices its current connection is dead | 21:27 |
clarkb | would a failed ping pong cause the server to kill our tcp connection? I wonder if it is protocol level like that | 21:29 |
clarkb | reading the code we seem fairly defensive to this sort of thing, its just not throwing exceptions ? | 21:34 |
*** DSpider has quit IRC | 21:34 | |
clarkb | this feels like a: we updated python irc lib and python2 to python3 sort of problem | 21:38 |
clarkb | like its going to be a byte vs string or synchronization type problem | 21:38 |
clarkb | ok it finally errored and there is no additional debugging output | 21:40 |
clarkb | https://github.com/jaraco/irc/issues/171 ya that looks suspicious | 21:43 |
clarkb | we only handle output ourselves though but maybe that breaks the sending of PONG? | 21:44 |
clarkb | it looks like it should handle that via reactor.process_forever within the irc client for the bot | 21:56 |
clarkb | however I've set logging to debug level and I don't see the debug message for process_forever so I'm not convinced yet it is working properly | 21:56 |
clarkb | I also don't see the I've connected debug message so that may just be a logging config problem | 21:56 |
fungi | sorry, got told to put the computer away during dinner, back now ;) | 21:57 |
fungi | so the problem could be that gerritbot is effectively blocking the (only) thread? | 21:59 |
fungi | and so the pong handler never gets a chance to wake up? | 21:59 |
clarkb | that is what it looks like | 21:59 |
clarkb | since we work until we'd hit a ping timeout | 21:59 |
clarkb | though the logging doesn't yet point to that directly | 21:59 |
fungi | debug level logging doesn't log the ctcp ping/pong exchange though, i guess | 22:00 |
fungi | which either means it's not happening or it's just not logged | 22:00 |
clarkb | ya and there are log.debug calls in the irc bot code | 22:02 |
fungi | but yeah, after around 5 minutes we're seeing "-- openstackgerrit (~openstack@eavesdrop01.openstack.org) has quit (Ping timeout: 256 seconds)" | 22:02 |
clarkb | setting root logger level to NOTSET means all messages will be logged /me tries this | 22:05 |
clarkb | ok got it to log irc client things by dropping our logging config | 22:11 |
clarkb | now we should see what is going on | 22:11 |
clarkb | AttributeError: module 'irc.client' has no attribute 'VERSION_STRING' | 22:12 |
clarkb | https://github.com/jaraco/irc/blob/master/irc/bot.py#L310-L315 is hitting that | 22:13 |
clarkb | https://github.com/jaraco/irc/blob/313620f625c1e8562447c2cbf0cc021c73470dc3/CHANGES.rst#v1900 there is testing https://github.com/jaraco/irc/blob/dd68925ba224d4b9c623ca6dc7b9033cd9d5cc2d/irc/tests/test_client.py#L10 and yet :/ | 22:14 |
fungi | does it want to call this instead? https://github.com/jaraco/irc/blob/master/irc/bot.py#L310-L315 | 22:15 |
fungi | er, sorry, https://github.com/jaraco/irc/blob/master/irc/client.py#L645-L647 | 22:15 |
clarkb | no I think we need to do the importlib lookup as described in that change | 22:16 |
clarkb | but ideally in the irc bot? | 22:16 |
fungi | https://github.com/jaraco/irc/commit/313620f | 22:18 |
fungi | ahh, you already found that | 22:18 |
fungi | i guess this calls for a pr | 22:18 |
clarkb | remote: https://review.opendev.org/745755 Pin to irc==18.0.0 | 22:19 |
clarkb | fungi: I think we can start at ^ | 22:19 |
clarkb | then ya do a PR then unpin | 22:19 |
clarkb | I'm running out of energy after the early start today so https://review.opendev.org/745755 seems like a reasonable place to start. I'll stop gerritbot on eavesdrop now and start it on review.o.o as we seem to grok the problem now | 22:20 |
fungi | whee! https://github.com/jaraco/irc/issues/174 | 22:20 |
clarkb | I did docker-compose down on eavesdrop which should prevent that container from starting after a reboot and systemctl started gerritbot on review.o.o | 22:22 |
clarkb | we should be functional again | 22:22 |
clarkb | and https://review.opendev.org/745755 should allow us to flip them back over a second time :) | 22:22 |
*** qchris has quit IRC | 22:22 | |
* clarkb takes a break | 22:24 | |
fungi | i single-core approved it a few minutes ago since we at least know that's a problem (though whether it's also hiding other problems, no clue of course) | 22:24 |
*** mlavalle has quit IRC | 22:30 | |
*** qchris has joined #opendev | 22:35 | |
*** openstackgerrit has joined #opendev | 22:48 | |
openstackgerrit | Merged opendev/gerritbot master: Pin to irc==18.0.0 https://review.opendev.org/745755 | 22:48 |
*** tkajinam has joined #opendev | 22:54 | |
fungi | clarkb: i've got a patch passing their unit tests and style checks, can push a pr unless you've already got one in progress | 22:56 |
clarkb | fungi: I do not. Go for it | 22:57 |
fungi | cool, will have it up in a minute or two and mark it as closing #174 | 22:58 |
clarkb | fungi: should I give swapping the gerritbot hosts another try now that the irc pin has merged? | 23:03 |
fungi | if you're up for it, sure | 23:03 |
clarkb | ok doing that now | 23:03 |
*** openstackgerrit has quit IRC | 23:04 | |
clarkb | it is running on eavesdrop again | 23:04 |
*** logan- has quit IRC | 23:05 | |
clarkb | and in ten minutes or so we push a change and check that it works? | 23:06 |
clarkb | it just posted to #tripleo successfully | 23:07 |
clarkb | fungi: smcginnis I think there is one more piece that we need for picking up config changes and that is running service-eavesdrop on changes to the gerritbot config | 23:10 |
clarkb | I won't get to that today and am out tomorrow but can pick that up as well as deduping gerritbot logs thursday if no one beats me to it | 23:11 |
clarkb | 3 minutes ago we sent events to #tripleo | 23:19 |
clarkb | which is well past the previous "timeout" | 23:19 |
clarkb | I think we are likely good | 23:19 |
clarkb | #status log Moved gerritbot from review.openstack.org to eavesdrop.openstack.org. Cleanup on old server needs to be done and we need to have project-config run infra-prod-service-eavesdrop when the gerritbot config updates. | 23:20 |
openstackstatus | clarkb: finished logging | 23:20 |
fungi | and after the usual 20 minutes reminding myself how to use teh githubz, https://github.com/jaraco/irc/pull/175 | 23:20 |
clarkb | fungi: it runs automated tests | 23:21 |
clarkb | something tells me they will fail :) | 23:21 |
clarkb | yup first one just failed | 23:22 |
fungi | ahh, yeah, tox was working for me on my default python 3.8 | 23:22 |
fungi | older pythons don't have importlib.metadata | 23:22 |
clarkb | fungi: well I also expect it will fail otherwise why would they release with broken tests | 23:23 |
clarkb | (its probably normal for them to be broken) | 23:23 |
fungi | also looking more closely at 313620f i suspect the real reason for that commit was to remove calls to importlib | 23:24 |
fungi | so all i'm doing is reintroducing them | 23:25 |
clarkb | :) | 23:25 |
fungi | i think the one place we're hitting this is probably the only one that matters, that's where it responds to `ctcp version` from the server | 23:32 |
fungi | the other uses are a welcome banner (cosmetic) and unit testing (unnecessary now) | 23:32 |
clarkb | it continues to send events. Really think we are set now | 23:33 |
ianw | STARTTLS failed! SSL connect attempt failed error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed at /usr/libexec/git-core/git-send-email line 1548. | 23:56 |
ianw | and people say dealing with gerrit is too hard ... | 23:56 |
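
Aside on the question raised at 02:46 (whether things avoid ipv6 by default because the VM has an fc00 address): one plausible explanation, which is an assumption and not something confirmed in this log, is destination address selection. The sketch below does a longest-prefix lookup against the default policy table from RFC 6724 section 2.1. The names `POLICY_TABLE` and `policy_for` are made up for illustration, and real resolvers (glibc's `getaddrinfo`, for example) may ship a somewhat different table.

```python
import ipaddress

# Default policy table from RFC 6724 section 2.1: (prefix, precedence, label).
POLICY_TABLE = [
    ("::1/128", 50, 0),
    ("::/0", 40, 1),
    ("::ffff:0:0/96", 35, 4),   # IPv4-mapped addresses, i.e. plain IPv4
    ("2002::/16", 30, 2),
    ("2001::/32", 5, 5),
    ("fc00::/7", 3, 13),        # unique local addresses (RFC 4193)
    ("::/96", 1, 3),
    ("fec0::/10", 1, 11),
    ("3ffe::/16", 1, 12),
]


def policy_for(address):
    """Return (precedence, label) for the longest prefix matching address."""
    addr = ipaddress.IPv6Address(address)
    best = None  # (prefix length, precedence, label) of the best match so far
    for prefix, precedence, label in POLICY_TABLE:
        net = ipaddress.IPv6Network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, precedence, label)
    # ::/0 matches every address, so best is never None here.
    return best[1], best[2]


if __name__ == "__main__":
    samples = {
        "VM source (from the log)": "fc00:dead:beef:55:2450:cdd1:93cc:da6e",
        "v6 destination (from the log)": "2001:470:8:2e9::1",
        "IPv4 as IPv4-mapped (documentation address)": "::ffff:192.0.2.1",
    }
    for name, address in samples.items():
        precedence, label = policy_for(address)
        print(f"{name}: precedence={precedence} label={label}")
```

Under that table the fc00 source address carries label 13 while a global destination carries label 1, whereas an IPv4 source and destination both carry label 4. Rule 5 of RFC 6724 destination selection prefers matching labels, so a dual-stack client following these defaults would tend to pick IPv4 first. Whether that is what happened on ianw's VM is not established here.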
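
Aside on the 23:22 remark that older pythons don't have `importlib.metadata`: the module arrived in the standard library with Python 3.8. Below is a minimal sketch, not the patch from the pull request mentioned in the log, of looking up an installed distribution's version with a fallback for older interpreters. The function name `package_version`, the `"unknown"` default, and the use of the third-party `importlib_metadata` backport are assumptions made for illustration.

```python
def package_version(dist_name, default="unknown"):
    """Return the installed version of dist_name, or default if unavailable."""
    try:
        # Standard library since Python 3.8.
        from importlib import metadata
    except ImportError:
        try:
            # Third-party backport; assumed to be installed on older Pythons.
            import importlib_metadata as metadata
        except ImportError:
            return default
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return default


if __name__ == "__main__":
    # "irc" is the distribution discussed in the log; it may not be installed here.
    print("irc version:", package_version("irc"))
```

Returning a default instead of raising keeps a cosmetic lookup, such as the version string in a CTCP VERSION reply, from taking down the event loop when metadata is missing.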