*** cpuga has joined #openstack-lbaas | 00:01 | |
*** armax has quit IRC | 00:02 | |
*** links has joined #openstack-lbaas | 00:06 | |
*** sshank has quit IRC | 00:07 | |
*** cpuga has quit IRC | 00:15 | |
*** cpuga has joined #openstack-lbaas | 00:16 | |
*** cpuga has quit IRC | 00:21 | |
*** cpuga_ has joined #openstack-lbaas | 00:23 | |
*** cpuga_ has quit IRC | 00:23 | |
*** JudeC has quit IRC | 00:31 | |
johnsom | Ok, OSC tomorrow | 00:35 |
---|---|---|
*** cpuga has joined #openstack-lbaas | 00:40 | |
rm_work | :) | 00:47 |
*** cpuga_ has joined #openstack-lbaas | 00:56 | |
*** cpuga has quit IRC | 00:57 | |
*** JudeC has joined #openstack-lbaas | 01:01 | |
*** cpuga_ has quit IRC | 01:03 | |
*** cpuga_ has joined #openstack-lbaas | 01:06 | |
*** cpuga_ has quit IRC | 01:08 | |
*** links has quit IRC | 01:14 | |
*** cpuga has joined #openstack-lbaas | 01:14 | |
rm_work | johnsom: wait what actually marks me as an octavia *admin* in keystone? | 01:21 |
*** kbyrne has quit IRC | 01:39 | |
rm_work | weird, i can't get a GET on /loadbalancers to list other projects, but i can delete, so i must be admin | 01:41 |
*** kbyrne has joined #openstack-lbaas | 01:42 | |
*** fnaval has joined #openstack-lbaas | 01:50 | |
*** cpuga has quit IRC | 01:57 | |
*** cpuga has joined #openstack-lbaas | 02:02 | |
*** cpuga_ has joined #openstack-lbaas | 02:04 | |
*** cpuga has quit IRC | 02:04 | |
*** cpuga has joined #openstack-lbaas | 02:06 | |
*** cpuga_ has quit IRC | 02:06 | |
*** cpuga has quit IRC | 02:09 | |
*** cpuga has joined #openstack-lbaas | 02:09 | |
*** armax has joined #openstack-lbaas | 02:23 | |
*** leitan has joined #openstack-lbaas | 02:28 | |
*** cpuga has quit IRC | 02:31 | |
rm_work | xgerman: when are you back from vacation? :P | 02:43 |
*** sanfern has joined #openstack-lbaas | 02:47 | |
*** links has joined #openstack-lbaas | 02:53 | |
*** yamamoto_ has joined #openstack-lbaas | 02:58 | |
*** yamamoto_ has quit IRC | 02:58 | |
*** yamamoto_ has joined #openstack-lbaas | 02:58 | |
*** cpuga has joined #openstack-lbaas | 03:07 | |
*** JudeC has quit IRC | 03:10 | |
*** leitan has quit IRC | 03:22 | |
*** leitan has joined #openstack-lbaas | 03:23 | |
*** leitan has quit IRC | 03:23 | |
*** gans has joined #openstack-lbaas | 03:24 | |
*** leitan has joined #openstack-lbaas | 03:24 | |
*** leitan has quit IRC | 03:28 | |
*** csomerville has joined #openstack-lbaas | 03:31 | |
*** csomerville has quit IRC | 03:35 | |
*** cody-somerville has joined #openstack-lbaas | 03:36 | |
*** cody-somerville has quit IRC | 03:36 | |
*** cody-somerville has joined #openstack-lbaas | 03:36 | |
*** cpuga_ has joined #openstack-lbaas | 03:37 | |
*** cpuga has quit IRC | 03:40 | |
*** cpuga_ has quit IRC | 03:41 | |
*** gcheresh has joined #openstack-lbaas | 03:45 | |
*** cody-somerville has quit IRC | 03:46 | |
*** gcheresh has quit IRC | 03:51 | |
*** JudeC has joined #openstack-lbaas | 04:03 | |
*** aojea has joined #openstack-lbaas | 04:14 | |
*** aojea has quit IRC | 04:19 | |
*** JudeC has quit IRC | 04:33 | |
*** cody-somerville has joined #openstack-lbaas | 04:33 | |
*** cody-somerville has quit IRC | 04:33 | |
*** cody-somerville has joined #openstack-lbaas | 04:33 | |
*** pcaruana has joined #openstack-lbaas | 04:43 | |
*** yamamoto_ has quit IRC | 04:48 | |
*** csomerville has joined #openstack-lbaas | 05:01 | |
*** armax has quit IRC | 05:02 | |
*** cody-somerville has quit IRC | 05:04 | |
*** cody-somerville has joined #openstack-lbaas | 05:08 | |
*** cody-somerville has quit IRC | 05:08 | |
*** cody-somerville has joined #openstack-lbaas | 05:08 | |
*** gcheresh has joined #openstack-lbaas | 05:10 | |
*** csomerville has quit IRC | 05:10 | |
*** belharar has joined #openstack-lbaas | 05:17 | |
*** JudeC has joined #openstack-lbaas | 05:19 | |
*** belharar has quit IRC | 05:30 | |
*** yamamoto_ has joined #openstack-lbaas | 05:35 | |
*** pcaruana has quit IRC | 05:49 | |
*** rcernin has joined #openstack-lbaas | 05:58 | |
*** belharar has joined #openstack-lbaas | 06:01 | |
*** pcaruana has joined #openstack-lbaas | 06:02 | |
*** Dinesh_Bhor has quit IRC | 06:23 | |
*** belharar has quit IRC | 06:39 | |
*** tesseract has joined #openstack-lbaas | 06:51 | |
xgerman | will be back Monday | 07:02 |
*** aojea has joined #openstack-lbaas | 07:22 | |
*** aojea has quit IRC | 07:23 | |
*** aojea has joined #openstack-lbaas | 07:23 | |
*** belharar has joined #openstack-lbaas | 07:37 | |
rm_work | xgerman: augh not until next monday? T_T | 07:46 |
rm_work | k | 07:46 |
rm_work | well our gates keep breaking so | 07:47 |
rm_work | maybe by then we can merge stuff anyway... | 07:47 |
rm_work | ah you did +A a bunch :P | 07:47 |
nmagnezi | rm_work, who needs gates anyway.. :P | 07:51 |
rm_work | anywho | 07:51 |
rm_work | https://github.com/pypa/setuptools/pull/1043 | 07:51 |
rm_work | as soon as setuptools fixes their shit, we're good to go again I guess with rechecks :P | 07:51 |
*** dayou has quit IRC | 07:52 | |
gans | so that's why my diskimage-create has been failing since this morning !!! huh | 08:06 |
*** sanfern has quit IRC | 08:15 | |
*** sanfern has joined #openstack-lbaas | 08:15 | |
*** sanfern has quit IRC | 08:21 | |
*** sanfern has joined #openstack-lbaas | 08:23 | |
*** dayou has joined #openstack-lbaas | 08:28 | |
xgerman | rm_work have a train ride later today... so maybe will have a look... | 08:42 |
nmagnezi | xgerman, you are very productive when you ride trains :-) | 08:43 |
nmagnezi | rm_work, still around? I have a question | 08:44 |
*** JudeC has quit IRC | 09:07 | |
*** openstackstatus has quit IRC | 09:36 | |
*** openstackstatus has joined #openstack-lbaas | 09:38 | |
*** ChanServ sets mode: +v openstackstatus | 09:38 | |
-openstackstatus- NOTICE: There is a known issue with setuptools 36.0.0 and errors about the "six" package. For current details see https://github.com/pypa/setuptools/issues/1042 and monitor #openstack-infra | 09:44 | |
*** yamamoto_ has quit IRC | 10:47 | |
*** sanfern has quit IRC | 10:52 | |
*** gans has quit IRC | 11:06 | |
*** yamamoto has joined #openstack-lbaas | 11:29 | |
*** yamamoto_ has joined #openstack-lbaas | 11:29 | |
*** yamamoto has quit IRC | 11:33 | |
*** yamamoto_ has quit IRC | 11:35 | |
*** links has quit IRC | 11:37 | |
*** atoth has joined #openstack-lbaas | 11:49 | |
openstackgerrit | Nir Magnezi proposed openstack/neutron-lbaas master: Fix file mode https://review.openstack.org/420551 | 11:54 |
*** yamamoto has joined #openstack-lbaas | 12:09 | |
openstackgerrit | Bernard Cafarelli proposed openstack/octavia master: Install amphora agent from distribution package on RHEL https://review.openstack.org/469850 | 12:15 |
*** leitan has joined #openstack-lbaas | 12:28 | |
*** sanfern has joined #openstack-lbaas | 12:37 | |
*** sanfern has quit IRC | 12:47 | |
*** sanfern has joined #openstack-lbaas | 12:47 | |
*** kobis has joined #openstack-lbaas | 12:58 | |
*** yamamoto has quit IRC | 13:02 | |
*** gans has joined #openstack-lbaas | 13:15 | |
*** yamamoto has joined #openstack-lbaas | 13:21 | |
*** gans has quit IRC | 13:27 | |
*** gans has joined #openstack-lbaas | 13:36 | |
*** fnaval_ has joined #openstack-lbaas | 13:40 | |
*** fnaval has quit IRC | 13:41 | |
*** yamamoto has quit IRC | 13:43 | |
*** gans has quit IRC | 13:44 | |
*** yamamoto has joined #openstack-lbaas | 13:45 | |
*** yamamoto has quit IRC | 13:45 | |
*** cpuga has joined #openstack-lbaas | 13:48 | |
*** cpuga_ has joined #openstack-lbaas | 13:49 | |
*** cpuga has quit IRC | 13:53 | |
*** gcheresh has quit IRC | 13:56 | |
*** KeithMnemonic2 has joined #openstack-lbaas | 14:01 | |
*** KeithMnemonic1 has quit IRC | 14:04 | |
*** yamamoto has joined #openstack-lbaas | 14:07 | |
*** cody-somerville has quit IRC | 14:12 | |
*** armax has joined #openstack-lbaas | 14:24 | |
-openstackstatus- NOTICE: python-setuptools 36.0.1 has been released and now making its way into jobs. Feel free to 'recheck' your failures. If you have any problems, please join #openstack-infra | 14:33 | |
*** openstackgerrit has quit IRC | 14:34 | |
*** cpuga_ has quit IRC | 14:59 | |
*** cpuga has joined #openstack-lbaas | 15:00 | |
*** cpuga_ has joined #openstack-lbaas | 15:04 | |
*** cpuga has quit IRC | 15:08 | |
*** gcheresh has joined #openstack-lbaas | 15:14 | |
*** armax has quit IRC | 15:20 | |
*** gcheresh has quit IRC | 15:45 | |
*** leitan has quit IRC | 15:45 | |
*** yamamoto has quit IRC | 15:51 | |
*** yamamoto has joined #openstack-lbaas | 15:52 | |
*** reedip_ has joined #openstack-lbaas | 15:53 | |
*** gcheresh has joined #openstack-lbaas | 15:54 | |
*** yamamoto has quit IRC | 15:58 | |
*** catintheroof has joined #openstack-lbaas | 16:02 | |
*** aojea has quit IRC | 16:03 | |
*** rcernin has quit IRC | 16:04 | |
*** aojea has joined #openstack-lbaas | 16:04 | |
*** gans has joined #openstack-lbaas | 16:04 | |
*** openstackgerrit has joined #openstack-lbaas | 16:05 | |
openstackgerrit | Jason Niesz proposed openstack/octavia master: blueprint: l3-active-active https://review.openstack.org/453005 | 16:05 |
*** aojea has quit IRC | 16:08 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add v2 pool API section https://review.openstack.org/458272 | 16:12 |
*** gcheresh has quit IRC | 16:12 | |
*** cody-somerville has joined #openstack-lbaas | 16:25 | |
*** cody-somerville has quit IRC | 16:25 | |
*** cody-somerville has joined #openstack-lbaas | 16:25 | |
*** catintheroof has quit IRC | 16:27 | |
*** cpuga_ has quit IRC | 16:28 | |
*** cpuga has joined #openstack-lbaas | 16:28 | |
*** armax has joined #openstack-lbaas | 16:29 | |
*** gans has quit IRC | 16:42 | |
*** kobis has quit IRC | 16:44 | |
*** belharar has quit IRC | 16:45 | |
*** tesseract has quit IRC | 16:54 | |
*** yamamoto has joined #openstack-lbaas | 16:55 | |
rm_work | nmagnezi: sorry, slept | 16:58 |
rm_work | nmagnezi: here now :P | 16:58 |
*** foutatoro has joined #openstack-lbaas | 16:58 | |
nmagnezi | rm_work, :-) | 16:58 |
nmagnezi | rm_work, just wanted to ask about the amphora agent log path | 16:58 |
rm_work | johnsom: thanks for all the rechecks :) | 16:58 |
nmagnezi | rm_work, i see it's hardcoded | 16:59 |
nmagnezi | rm_work, is that intentional? | 16:59 |
johnsom | Yeah, working through stuff. I hope some stuff will start merging soon | 16:59 |
nmagnezi | rm_work, https://github.com/openstack/octavia/blob/eb644135afcc676c4d8626bceb6eddf6cea9b3eb/octavia/cmd/agent.py#L79 | 16:59 |
nmagnezi | rm_work, usually openstack services get --log-file <path> | 16:59 |
johnsom | nmagnezi I did that | 16:59 |
nmagnezi | johnsom, hi :) | 17:00 |
johnsom | I think | 17:00 |
rm_work | i thought we did support --log-file | 17:00 |
nmagnezi | johnsom, i actually pinged Adam when it was night time for you | 17:00 |
rm_work | oh | 17:00 |
johnsom | No, not that one. | 17:00 |
rm_work | for the controlplane | 17:00 |
*** JudeC has joined #openstack-lbaas | 17:00 | |
rm_work | this is the amp agent | 17:00 |
rm_work | yeah | 17:00 |
nmagnezi | rm_work, i haven't tried this in the agent | 17:00 |
rm_work | we could config-ize it | 17:00 |
johnsom | I mean, everything is "controlled" in the amp, so, I guess do you have a use case? | 17:01 |
johnsom | Also note, with the recent changes, that log isn't complete. It's still split between that log and the syslog | 17:01 |
johnsom | it's an issue IMO | 17:01 |
rm_work | yeah we need to fix the logging story | 17:01 |
rm_work | that was another thing i was going to bring up if i had more time at the meeting | 17:02 |
rm_work | that's high on my priority list, to get the possibility to inject a custom syslogd config so the amps could for instance send off to an ELK ingress point | 17:02 |
*** yamamoto has quit IRC | 17:04 | |
*** gcheresh has joined #openstack-lbaas | 17:04 | |
*** gcheresh has quit IRC | 17:04 | |
johnsom | Yeah, agreed. Something that is filebeat compatible I think would be best | 17:05 |
nmagnezi | johnsom, just asking because in the rpm I made for that agent the logging path is /var/log/octavia/amphora-agent.log (as opposed to /var/log/amphora-agent.log) so I wanted to know if that could be converted to a param. but i guess i can also modify the rpm | 17:06 |
nmagnezi | that's the usecase :) | 17:06 |
*** cody-somerville has quit IRC | 17:06 | |
rm_work | yeah just modify the rpm imo :P | 17:06 |
johnsom | Well, or just standardize it. I'm not opposed to an octavia sub directory | 17:06 |
rm_work | yeah me either | 17:07 |
*** cody-somerville has joined #openstack-lbaas | 17:07 | |
*** cody-somerville has quit IRC | 17:07 | |
*** cody-somerville has joined #openstack-lbaas | 17:07 | |
johnsom | It probably makes a lot of sense actually | 17:07 |
nmagnezi | https://github.com/rdo-packages/octavia-distgit/blob/rpm-master/octavia-amphora-agent.service#L8 | 17:07 |
rm_work | omg johnsom can you make cody-somerville fix his client lol | 17:07 |
johnsom | rm_work I guess you are showing joins? I turned it off a long time ago. There was someone else just spamming the channel too | 17:08 |
rm_work | heh yeah | 17:08 |
rm_work | i waver back and forth about turning that off | 17:08 |
johnsom | Yeah, it was interesting, but too much noise over signal | 17:08 |
johnsom | I just gave up | 17:08 |
johnsom | Plus with irccloud and such, I don't think that status is very accurate anyway. I have no idea what it does for me. Probably always "here" | 17:09 |
rm_work | yeah | 17:11 |
*** sshank has joined #openstack-lbaas | 17:16 | |
rm_work | johnsom / JudeC can the docs changes be done in JudeC's final followup? | 17:18 |
rm_work | it's just that it's a REALLY long chain, lol | 17:18 |
rm_work | i mean i guess if he doesn't care | 17:18 |
JudeC | I dont care :P | 17:18 |
johnsom | It is nice to have them along the way for review/updates/comments | 17:20 |
johnsom | I am reviewing the load balancer commands now. Just thought I would post that initial observation | 17:22 |
*** cpuga_ has joined #openstack-lbaas | 17:26 | |
*** cpuga has quit IRC | 17:28 | |
*** SumitNaiksatam has joined #openstack-lbaas | 17:29 | |
openstackgerrit | Merged openstack/python-octaviaclient master: Optimize the link address https://review.openstack.org/455288 | 17:41 |
*** tonygunk has quit IRC | 17:42 | |
*** gcheresh has joined #openstack-lbaas | 17:44 | |
openstackgerrit | Merged openstack/octavia master: Pool name/desc needs to be "" when empty, not null https://review.openstack.org/467780 | 17:45 |
*** leitan has joined #openstack-lbaas | 17:56 | |
openstackgerrit | Merged openstack/octavia master: Fix pool response to fill healthmonitor_id properly https://review.openstack.org/467407 | 17:57 |
*** aojea has joined #openstack-lbaas | 18:07 | |
*** aojea has quit IRC | 18:12 | |
*** reedip_ has quit IRC | 18:13 | |
*** rcernin has joined #openstack-lbaas | 18:14 | |
openstackgerrit | Merged openstack/octavia master: Remove lb_network_name from config (it was bogus) https://review.openstack.org/465183 | 18:14 |
*** kobis has joined #openstack-lbaas | 18:15 | |
sanfern | what is the significance of subnet in command lbaas-member-create ? can I have members and vip from different subnets ? | 18:19 |
rm_work | yes | 18:20 |
rm_work | we will plug the member subnet into the amphora | 18:20 |
rm_work | so it is reachable | 18:20 |
sanfern | oh ok, got it thanks rm_work | 18:20 |
openstackgerrit | Merged openstack/octavia master: Optional L7Policies param was marked as required https://review.openstack.org/467798 | 18:29 |
openstackgerrit | Adam Harwell proposed openstack/octavia master: VRRP amphora_driver functions weren't handled by noop driver https://review.openstack.org/465185 | 18:30 |
johnsom | rm_work FYI, https://bugs.launchpad.net/octavia/+bug/1665069 | 18:34 |
openstack | Launchpad bug 1665069 in octavia "Support of haproxy log aggregation capabilites" [Wishlist,Triaged] - Assigned to Ganpat Agarwal (gans-developer) | 18:34 |
johnsom | We should coordinate on that and agree on an approach | 18:34 |
rm_work | yes | 18:35 |
rm_work | IMO we allow passing in a customizable syslogd template | 18:35 |
rm_work | and just let operators ship logs to wherever | 18:35 |
johnsom | This is the connection tracking log, not the admin log, but we should be consistent to some degree | 18:35 |
sanfern | we tried testing with splunk; we need to provide IP:port, an FQDN will not work | 18:37 |
rm_work | yes we disable DNS on amps | 18:38 |
johnsom | Yeah, and HAproxy doesn't allow DNS names. So, both reasons that won't work | 18:39 |
sanfern | oh ok, so that's the reason behind it | 18:39 |
rm_work | johnsom: i mean yeah, logs is logs. same method IMO | 18:39 |
rm_work | can provide a syslogd config to handle both types | 18:39 |
johnsom | I think it is good to be able to separate them with either an ID or something, but similar method is good. | 18:39 |
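
The IP:port constraint discussed above (DNS is disabled on the amps, and HAProxy does not accept DNS names for this) could be checked up front before a log target is written into any template. A minimal sketch, assuming a hypothetical helper named `parse_log_target`; it is not part of Octavia, and the addresses shown are examples only:

```python
import ipaddress
from typing import Tuple


def parse_log_target(target: str) -> Tuple[str, int]:
    """Split an 'IP:port' log target and reject anything that needs DNS.

    Hypothetical helper for illustration; not an Octavia function.
    """
    host, sep, port_text = target.rpartition(":")
    if not sep:
        raise ValueError("expected IP:port, got %r" % target)
    # Allow the bracketed IPv6 form, e.g. [2001:db8::1]:514
    host = host.strip("[]")
    # ip_address() raises ValueError for hostnames such as an FQDN
    address = ipaddress.ip_address(host)
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %d" % port)
    return str(address), port
```

A target such as `10.0.0.5:514` passes, while `splunk.example.com:514` raises `ValueError`, which matches the behaviour sanfern reported.
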
*** sanfern has quit IRC | 18:41 | |
*** sshank has quit IRC | 18:41 | |
*** sanfern has joined #openstack-lbaas | 18:42 | |
*** foutatoro has quit IRC | 18:55 | |
*** aojea has joined #openstack-lbaas | 19:06 | |
rm_work | johnsom: i mean, they ARE separate, right? | 19:08 |
rm_work | you'd provide a syslogd config for agent logs, and a syslogd config for haproxy logs | 19:08 |
rm_work | no? | 19:08 |
rm_work | i'm confused about what you mean when you keep saying "separate them by ID" | 19:09 |
johnsom | There are three configs really. Agent logs, haproxy admin logs (startup/stop/errors), and haproxy connection logs (user data) | 19:09 |
johnsom | You probably don't want the HAproxy admin stuff going into the user logs | 19:10 |
rm_work | right, aren't they different files? | 19:10 |
*** aojea has quit IRC | 19:10 | |
johnsom | Depends on how you configure HAproxy | 19:10 |
rm_work | so wouldn't the syslogd config say "admin logs go to <X> and connection logs go to <Y>" ? | 19:10 |
rm_work | well, WE configure it :P | 19:10 |
rm_work | and I thought we had them as separate files, but maybe not yet? | 19:11 |
johnsom | Right, just saying we need to pay attention to that config. Right now I think it all goes into one | 19:11 |
johnsom | The files are a mess | 19:11 |
johnsom | even our agent logs are split between two files now with gunicorn | 19:11 |
rm_work | that's just a quick update to our haproxy.conf template right? | 19:12 |
rm_work | hmm really? will have to look at that | 19:12 |
rm_work | i'm touching all the agent stuff right now anyway | 19:12 |
rm_work | tho, brb lunch | 19:12 |
johnsom | Ok. Yeah, some stuff goes out to the agent.log, some stuff is still going into the console->syslog | 19:12 |
johnsom | I should eat too | 19:13 |
leitan | Hi guys | 19:14 |
leitan | im doing HA tests with octavia | 19:14 |
leitan | shutdown the master node, traffic went to the backup flawlessly | 19:14 |
leitan | but, started the master again | 19:14 |
leitan | failed to start VRRP | 19:14 |
leitan | inside the amphorae | 19:15 |
leitan | rm_work xgerman johnsom , if you can take a look, will be great http://paste.openstack.org/show/611238/ | 19:23 |
openstackgerrit | Merged openstack/octavia master: Don't leave LBs in PENDING_DELETE after refusing to cascade https://review.openstack.org/465813 | 19:28 |
leitan | so basically if i shut off the backup that assumed the mastership, i lose my balancers | 19:30 |
leitan | weird | 19:30 |
leitan | let me know, ill debug this from my side | 19:30 |
*** kobis has quit IRC | 19:31 | |
*** sanfern has quit IRC | 19:31 | |
*** leitan has quit IRC | 19:41 | |
*** leitan has joined #openstack-lbaas | 19:45 | |
*** sshank has joined #openstack-lbaas | 19:48 | |
leitan | back, so basically until i restart the backup node, which was the last one with ownership of the ip address, i dont recover the traffic | 19:52 |
*** armax has quit IRC | 20:00 | |
*** cody-somerville has quit IRC | 20:10 | |
johnsom | leitan Are you only looking at the keepalived logs or also testing the traffic? I have seen the logging not really reflect the accurate view. Also, by shut off, how are you doing that? | 20:15 |
johnsom | Are you just rebooting the master? Do you have TLS offload configured? | 20:17 |
*** leitan has quit IRC | 20:30 | |
*** leitan has joined #openstack-lbaas | 20:32 | |
leitan | sorry got disconnected | 20:32 |
leitan | johnsom: testing both, traffic and seeing the logs | 20:32 |
leitan | thats the behaviour | 20:33 |
leitan | i dont have TLS offload | 20:33 |
johnsom | Hmmm, I did some testing with act/stndby recently. The failover was slower than before, but still completed. So curious what is up. | 20:33 |
leitan | i rebooted the master first, the traffic migrated fine to the backup, when the master came back, keepalived didnt start, and if i rebooted the slave, i lost all LB functionality | 20:33 |
leitan | johnsom: ill test now with a LB located on a vxlan to see if it has something to do with the flat network i have attached the public LB to | 20:34 |
leitan | im curious about why all the GARP logs | 20:34 |
johnsom | Hmm, I wonder why keepalived didn't come up... Do you see anything in the journalctl for keepalived about why it didn't start? | 20:35 |
leitan | johnsom: lemme check | 20:35 |
johnsom | The GARP thing is a workaround I did for some strange neutron behavior we saw. We had to force more periodic GARP to make sure the IP migration advertisement is picked up by neutron/OVS | 20:35 |
johnsom | It is a "beat the virtual network over the head with the fact that the IP is on a new instance now" | 20:36 |
johnsom | The GARPs are how the instance tells the network that this instance now owns the IP | 20:37 |
leitan | johnsom: yes, i thought that maybe too many GARP logs meant the network was unreachable in some way | 20:39 |
leitan | Jun 01 20:37:26 amphora-02288985-f910-4c56-9253-9bcd0d0a8bc4 systemd[1]: octavia-keepalived.service: Control process exited, code=exited status=1 | 20:40 |
johnsom | leitan Can you do "systemctl status octavia-keepalived | less" and paste the output? | 20:40 |
leitan | yes | 20:41 |
johnsom | I hope you have time, I'm really interested in what you are seeing. I'm also setting up an act/stdby on my side now. | 20:41 |
leitan | johnsom: im interested that youre interested :P | 20:42 |
rm_work | i am ALSO setting that up shortly | 20:43 |
rm_work | like next two weeks should be switching from SINGLE to active/passive | 20:43 |
leitan | johnsom: http://paste.openstack.org/show/611246/ | 20:43 |
leitan | seems kinda like a race condition with the kernel namespace | 20:43 |
leitan | because im listing it, and its there | 20:44 |
johnsom | Oh!, that IS a problem | 20:44 |
rm_work | ah it's starting too early | 20:44 |
rm_work | it seems | 20:44 |
leitan | johnsom rm_work indeed --> http://paste.openstack.org/show/611247/ | 20:44 |
rm_work | yeah it just hasn't been set up fully yet | 20:44 |
rm_work | need to move keepalived later in the boot process | 20:45 |
johnsom | Hmmm, give me a minute to look at something. | 20:46 |
johnsom | Yeah, ok, it's a bug | 20:47 |
leitan | johnsom: i remember facing similar conditions with l3-agent and rpc_response once | 20:47 |
leitan | johnsom: want me to file a bug report ? | 20:48 |
johnsom | Sure, I will probably push a fix today, so having a bug would be good. | 20:48 |
leitan | right away Sir | 20:48 |
johnsom | Rebooting amps is a bit questionable now that the certs are on the RAM drive, but I want to keep that as functional as possible | 20:48 |
johnsom | That was my mistake in the transition to systemd. | 20:49 |
*** cody-somerville has joined #openstack-lbaas | 20:51 | |
rm_work | ah this is a reboot thing? | 20:52 |
johnsom | Yes | 20:52 |
rm_work | didn't notice that's what he was doing | 20:52 |
rm_work | T_T | 20:52 |
rm_work | normally we don't reboot Amps | 20:52 |
leitan | im rebooting amps since i need to do a full HA test before getting this into prod, this can represent a total failure if both amps fail at different times | 20:54 |
leitan | johnsom: https://bugs.launchpad.net/octavia/+bug/1695087 | 20:54 |
openstack | Launchpad bug 1695087 in octavia "Race condition causes keepalived to fail, namepsace not fully configured" [Undecided,New] | 20:54 |
johnsom | Thanks | 20:54 |
leitan | since the octavia services have access to the amphorae, is it possible to declare an amphora as DEGRADED when vrrp or haproxy didnt start ? | 20:55 |
johnsom | Well, if haproxy doesn't start the amp will be considered failed by the health manager and it will be replaced | 20:56 |
*** armax has joined #openstack-lbaas | 20:56 | |
leitan | johnsom: but not for keepalived | 20:57 |
johnsom | That said, I'm not sure that keepalived failing to start would trigger the amp to be considered failed. That might be a gap. | 20:57 |
leitan | johnsom: sure | 20:57 |
johnsom | If it failed on deploy it would get caught, but I think on reboot we have a gap | 20:57 |
johnsom | Probably worth another bug | 20:58 |
leitan | yeah, about to ask that | 20:58 |
leitan | ill file it | 20:58 |
leitan | johnsom: https://bugs.launchpad.net/octavia/+bug/1695090 done | 21:03 |
openstack | Launchpad bug 1695090 in octavia "Keepalived not considered to declare unhealthy an amphorae" [Undecided,New] | 21:03 |
johnsom | Thanks. We might have that, but I can't remember how. With the bug we can track it down. | 21:07 |
rm_work | xgerman: while you are on the train, you should review https://review.openstack.org/458272 and the rest of the chain :) | 21:09 |
*** fnaval_ has quit IRC | 21:12 | |
*** fnaval has joined #openstack-lbaas | 21:14 | |
*** SumitNaiksatam has quit IRC | 21:15 | |
*** tonygunk has joined #openstack-lbaas | 21:16 | |
*** gcheresh has quit IRC | 21:20 | |
*** leitan has quit IRC | 21:21 | |
johnsom | Ugh, this is another one of those where that dumb decision to have multiple haproxy processes is really hurting.... | 21:33 |
*** cpuga_ has quit IRC | 21:36 | |
*** cody-somerville has quit IRC | 21:49 | |
*** cody-somerville has joined #openstack-lbaas | 21:55 | |
*** cody-somerville has quit IRC | 21:55 | |
*** cody-somerville has joined #openstack-lbaas | 21:55 | |
*** ipsecguy_ has joined #openstack-lbaas | 21:56 | |
rm_work | johnsom: can we just ... undo that :P | 21:59 |
rm_work | "just" | 21:59 |
*** ipsecguy has quit IRC | 21:59 | |
*** rcernin has quit IRC | 22:10 | |
johnsom | It's going to be a bunch of work. I don't want to prioritize that over our current efforts | 22:23 |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Fix keepalived systemd race with haproxy namespace https://review.openstack.org/470051 | 22:45 |
*** leitan has joined #openstack-lbaas | 22:49 | |
johnsom | That seems to work | 22:50 |
leitan | johnsom: when you guys have the keepalived patch i can test it right away on my env, let me know | 22:51 |
johnsom | leitan ^^^^ | 22:51 |
johnsom | https://review.openstack.org/470051 | 22:52 |
leitan | got disconnected | 22:52 |
leitan | ill check it | 22:52 |
leitan | johnsom: currently active LBs are going to refresh the template if i reboot them ? | 22:53 |
johnsom | No, that is an agent update. To easily test, go into /usr/lib/systemd/system and edit each haproxy-* file | 22:54 |
johnsom | Add the "Before=octavia-keepalived.service" line, then run "systemctl daemon-reload" to load the new files. Then you can do the reboot test | 22:56 |
leitan | johnsom: roger | 22:56 |
leitan | johnsom: working like a charm | 23:03 |
johnsom | Cool, thanks! | 23:03 |
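
The manual workaround above (add `Before=octavia-keepalived.service` to each haproxy-* unit file, then run `systemctl daemon-reload`) could be scripted roughly as follows. This is a sketch only: the helper name `add_before_dependency` is hypothetical, it assumes the directive belongs in the `[Unit]` section (where systemd ordering directives live), and it works on the unit text rather than touching files so it can be tried safely.

```python
def add_before_dependency(unit_text: str,
                          before_unit: str = "octavia-keepalived.service") -> str:
    """Return unit_text with a 'Before=<before_unit>' line under [Unit].

    Hypothetical helper for illustration; the proposed fix changes the
    agent's unit template instead of patching files on a running amphora.
    """
    directive = "Before=%s" % before_unit
    lines = unit_text.splitlines()
    # Idempotent: leave the text alone if the directive is already present
    if any(line.strip() == directive for line in lines):
        return unit_text
    result = []
    inserted = False
    for line in lines:
        result.append(line)
        if not inserted and line.strip() == "[Unit]":
            result.append(directive)
            inserted = True
    if not inserted:
        raise ValueError("no [Unit] section found")
    return "\n".join(result) + "\n"
```

After writing the updated text back to each haproxy-* file under /usr/lib/systemd/system, `systemctl daemon-reload` is still needed before the reboot test, as johnsom notes.
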
leitan | johnsom: to you, ill keep testing and let you guys know if i find something else | 23:05 |
leitan | let me know if i can help testing something out | 23:05 |
johnsom | Ok, I'm taking a look at the keepalived != failed issue | 23:05 |
leitan | great, i can test the patch whenever you want | 23:06 |
johnsom | Well, I need to come up with the right solution for that first.... Grin | 23:06 |
leitan | johnsom: is there any constraint against doing it just like the haproxy pid monitoring ? | 23:11 |
johnsom | Yeah, with haproxy we are actually interrogating haproxy over a unix socket. We don't have that opportunity with keepalived. Plus we need to make sure this amp is actually in a act/stdby pair. | 23:13 |
*** cpuga has joined #openstack-lbaas | 23:13 | |
johnsom | It's a bit more tricky | 23:13 |
johnsom | Not to mention the three init systems.... | 23:15 |
leitan | johnsom: right, so youre configuring keepalived as a standalone master for "SINGLE" LBs too then ? | 23:16 |
johnsom | No, we don't, it doesn't get setup at all on the standalone | 23:17 |
johnsom | So, we can't just check if it is running as the standalone topology does not have it actually running | 23:17 |
leitan | understood, thinking here too | 23:18 |
johnsom | I'm leaning towards checking the process is running via the pid file, but still thinking about it a bit. keepalived actually uses three processes, so is that good enough.... etc. | 23:20 |
*** aojea has joined #openstack-lbaas | 23:21 | |
*** aojea has quit IRC | 23:26 | |
leitan | johnsom: maybe this sounds silly, but what about using a tracking script that leaves something like a state file on some path? if the file is at least there, it means that its an active_standby lb, and then the health manager can check if its actually running. maybe the file can be the result of the "pidof keepalived" command, and the health manager can check against those running pids | 23:32 |
leitan | a keepalived tracking script | 23:32 |
johnsom | Well, the tracking script will only run if the process starts. | 23:33 |
leitan | if keepalived actually ran in the past, the file is gonna be there | 23:33 |
johnsom | Right. | 23:33 |
johnsom | I can check for our config file to see if it should be running or not. No config should mean standalone | 23:34 |
leitan | so if the file exists, but after a reboot keepalived doesnt come up, you check the last pids in the file, the check returns 1, and you mark it as unhealthy | 23:34 |
leitan | johnsom: yes, that too, if its empty | 23:34 |
leitan | but with the other method, you can actually check the running pids, if it dies in the middle | 23:34 |
leitan | its going to be marked unhealthy on the next iteration | 23:35 |
johnsom | Yeah, I'm just trying to decide if checking that the process is running is good enough (as there are three). Just doing a little research to see how far it is worth taking | 23:35 |
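
The check being weighed here (config file present means keepalived should be running, then verify via the pid file) might look roughly like this. It is a sketch under stated assumptions: the function name, the returned status strings, and any paths passed in are hypothetical, and it only verifies the single pid recorded in the file, so it does not cover all three keepalived processes johnsom mentions.

```python
import os


def keepalived_status(config_path: str, pid_path: str) -> str:
    """Classify keepalived as 'not_expected', 'running', or 'failed'.

    Hypothetical helper for illustration; not the health manager's logic.
    """
    # No keepalived config: treat the amphora as standalone (SINGLE topology)
    if not os.path.exists(config_path):
        return "not_expected"
    try:
        with open(pid_path) as pid_file:
            pid = int(pid_file.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or garbage pid file
        return "failed"
    if pid <= 0:
        return "failed"
    try:
        # Signal 0 performs the existence check without sending a signal
        os.kill(pid, 0)
    except ProcessLookupError:
        return "failed"
    except PermissionError:
        # The process exists but belongs to another user
        return "running"
    return "running"
```

A stale pid file whose pid has since been reused by another process would still report `running`, which is one reason a plain pid check may not be good enough on its own.
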
leitan | johnsom: ill be thinking another methods too, i saw a keepalived path to work as cisco like --vrrp-status, having that will ease the things | 23:37 |
*** sshank has quit IRC | 23:38 | |
leitan | johnsom: what about starting keepalived with the "-x" flag enabling snmp support and querying the keepalived MIB, like neutron tried to do here: https://bugs.launchpad.net/neutron/+bug/1460116 | 23:45 |
openstack | Launchpad bug 1460116 in neutron "keepalived should have snmp support enabled" [Wishlist,Expired] | 23:45 |
johnsom | I am considering that | 23:45 |
johnsom | Also looking at the dbus interface | 23:45 |
leitan | johnsom: with the snmp support you can also get the states of the VRID, maybe in the future that will also be helpful for the who-is-the-master detection we were talking about the other day | 23:47 |
johnsom | Yeah, that is kind of why I'm looking at options beyond checking if the process is there or not | 23:47 |
leitan | johnsom: yes, maybe in the future the vrrptable counters can help detecting if for some reason vrrp adv pkts are dropped on the hosts, to mark an amphorae as unhealthy on the server | 23:50 |
johnsom | Sadly the snmp support requires an snmp master agent, which we don't currently install. It's a bit heavy weight for what I am hoping | 23:54 |
leitan | thats a downside, but the RHEL image wasnt too heavy to pass through the gate ? snmp cant hurt anyone :P | 23:56 |
johnsom | Oh yes it can... I have a long history with SNMP and the code bases... | 23:56 |
leitan | just kidding, /ME thinking a more lightweight elegant solution | 23:57 |