*** longkb has joined #openstack-lbaas | 00:23 | |
rm_work | johnsom: ever used multitail? omg | 00:42 |
---|---|---|
bzhao__ | johnsom: ping | 00:44 |
bzhao__ | Hi micheal, thanks for review udp patches, but I think it's better to access with you. So I just want to discuss how to fit the requirement of both. And willing to get your kind opinion. | 00:44 |
bzhao__ | template part | 00:44 |
bzhao__ | https://review.openstack.org/#/c/525420/15/octavia/common/jinja/lvs/jinja_cfg.py | 00:44 |
bzhao__ | For my thought, another jinja_cfg for keepalivedlvs may be good for split them, as the target file and the function processer are different. Another benefit is the split process. | 00:44 |
bzhao__ | agent part | 00:44 |
bzhao__ | https://review.openstack.org/#/c/529651/17/octavia/amphorae/backends/agent/api_server/keepalivedlvs.py | 00:44 |
bzhao__ | For the member's 'no check' status during get_udp_listener_status, your suggestion is to check the lvs running status for setting it, right? | 00:44 |
bzhao__ | service part | 00:45 |
bzhao__ | https://review.openstack.org/#/c/539391/8/octavia/api/v2/types/health_monitor.py | 00:45 |
bzhao__ | For removing the default value of health_monitor requests, sorry for confused. As if users don't specify the fields "http_method", "url_path", "expected_codes" in UDP request, the generated Post obj will still contains these fields, it is contradictory in the udp spec which we not allowed this kind request. | 00:45 |
bzhao__ | Sorry for fresh the screen. | 00:46 |
johnsom | bzhao__: give me 5 minutes and I can chat | 00:47 |
bzhao__ | johnsom: Thank you. :) | 00:47 |
johnsom | bzhao__ Hi. Ok, let's chat. So, I read through your responses today, but didn't get to respond yet. But this is even better | 00:52 |
johnsom | On the jinja template. I think I just got myself confused. I agree the keepalived template should have it's own jinja_cfg. | 00:52 |
johnsom | However, I do think we need to clean it up and only have values that matter for the keepalived template. This was part of what led to my confusion, so I'm sure it will be an issue for others. | 00:53 |
johnsom | Maybe I am mistaken, but I don't think you need all of those values. I did not get a chance to match them against the macros. | 00:55 |
johnsom | bzhao__ Did I just mis-interpret? Are all of those values used in the keepalived jinja template? | 00:57 |
johnsom | Like line 163, 'peer_port': listener.peer_port, I don't see that variable in the jinja macros | 00:58 |
*** threestrands has quit IRC | 01:00 | |
bzhao__ | johnsom: Thanks, you are right. I follow u now. It's sure that the part not used in lvs jinjia. I will remove them . | 01:03 |
johnsom | Ok, cool, yeah, just a little cleanup there should be good. | 01:03 |
bzhao__ | johnsom: ha | 01:04 |
johnsom | Ok, second topic | 01:04 |
johnsom | Yes, with the TCP protocol we provide monitoring information back to the user via the "operating" status. I wanted to research today if there was a way we can provide the same actual health information with the UDP listener. | 01:05 |
johnsom | I was thinking there might be a /sys or /proc filesystem place we can look at that. | 01:05 |
johnsom | I'm pretty sure lvsadm can provide those status, so there is some way. Hopefully something nicer than running lvsadm | 01:06 |
bzhao__ | johnsom: Thanks for suggest. I will take a deep look into how to check the status also. And discuss with u tomorrow. | 01:07 |
bzhao__ | johnsom: Yeah, I also think if we run the lvsadm cmd directly is not suitable for this patrt | 01:07 |
johnsom | sorry, I meant ipvsadm, but yeah, command line is not the best. I will also research this more. | 01:10 |
bzhao__ | johnsom: Oh, sorry for the typo. | 01:10 |
johnsom | No, I think that was my typo... | 01:11 |
johnsom | http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.monitoring_lvs.html | 01:12 |
johnsom | That looks like an interesting place to start | 01:12 |
bzhao__ | Thanks very much, micheal. The pain point is the 3rd part. How to fit the API level? I also don't want to change our existing API default value. But seems not other good way to match the invalid udp requests. | 01:14 |
johnsom | Ok, give me a minute to re-read this and think about it. | 01:14 |
johnsom | So my expectation for those three fields is we handle them the same as TCP. Let me look at the code. | 01:16 |
*** longkb has quit IRC | 01:17 | |
*** longkb has joined #openstack-lbaas | 01:17 | |
*** yamamoto has joined #openstack-lbaas | 01:18 | |
johnsom | bzhao__ Yeah, this is an anomaly in our code. Even a TCP health monitor gets those defaults even though they are not used. I will create a patch to fix this. Please leave the defaults as they are and my patch will decide how to fix this in the API controller code. | 01:22 |
johnsom | https://www.irccloud.com/pastebin/v2l3IOzy/ | 01:22 |
bzhao__ | Yeah, I wrote it like now for raising something. :). I'm sure that it maybe bad. | 01:22 |
bzhao__ | Thanks. | 01:23 |
johnsom | Yeah, leave that one for me, I will fix it. | 01:23 |
bzhao__ | Thank you. Micheal. I will depend on your provider support patch for re-construct the udp part also. :) | 01:24 |
johnsom | bzhao__ Any other questions for today? | 01:24 |
johnsom | Oh, yeah, it's a bit of a big change, but very necessary. | 01:24 |
bzhao__ | no more. Thank you for clear and huge help to me. | 01:25 |
*** yamamoto has quit IRC | 01:25 | |
johnsom | BTW, I am putting down UDP support as a Rocky feature on my OpenStack Summit slides. It's a good feature. | 01:25 |
bzhao__ | Yeah, that's very necessary. | 01:25 |
johnsom | I look forward to trying it out once the file path issue is resolved. | 01:25 |
bzhao__ | I mean the provider support | 01:25 |
bzhao__ | haha. | 01:26 |
bzhao__ | I will fix asap. | 01:26 |
johnsom | Right now I have to jump between tasks, so I will do a review as far as I can, then switch. Great, thank you. | 01:26 |
bzhao__ | OK, Thanks for your time. Take a good rest after work. :) | 01:27 |
johnsom | Thanks! Time to make dinner | 01:27 |
*** yamamoto has joined #openstack-lbaas | 01:41 | |
*** yamamoto has quit IRC | 01:46 | |
*** yamamoto has joined #openstack-lbaas | 02:02 | |
*** yamamoto has quit IRC | 02:06 | |
*** ivve has joined #openstack-lbaas | 02:09 | |
*** xuhaiwei has joined #openstack-lbaas | 02:18 | |
*** yamamoto has joined #openstack-lbaas | 02:23 | |
*** yamamoto has quit IRC | 02:29 | |
*** yamamoto has joined #openstack-lbaas | 02:45 | |
*** rcernin has quit IRC | 02:46 | |
*** yamamoto has quit IRC | 02:50 | |
*** rcernin has joined #openstack-lbaas | 02:50 | |
*** yamamoto has joined #openstack-lbaas | 03:06 | |
*** yamamoto has quit IRC | 03:10 | |
*** yamamoto has joined #openstack-lbaas | 03:27 | |
*** yamamoto has quit IRC | 03:33 | |
*** links has joined #openstack-lbaas | 03:37 | |
*** yamamoto has joined #openstack-lbaas | 03:50 | |
*** yamamoto has quit IRC | 03:54 | |
*** gans has joined #openstack-lbaas | 04:10 | |
*** longkb has quit IRC | 04:12 | |
*** yamamoto has joined #openstack-lbaas | 04:12 | |
*** gans has quit IRC | 04:12 | |
*** longkb has joined #openstack-lbaas | 04:12 | |
*** yamamoto has quit IRC | 04:17 | |
*** annp has quit IRC | 04:18 | |
*** annp has joined #openstack-lbaas | 04:18 | |
*** pcaruana has joined #openstack-lbaas | 04:31 | |
*** yamamoto has joined #openstack-lbaas | 04:33 | |
*** yamamoto has quit IRC | 04:33 | |
*** yamamoto has joined #openstack-lbaas | 04:35 | |
*** linhnm has joined #openstack-lbaas | 04:38 | |
*** dayou has quit IRC | 04:53 | |
*** linhnm has quit IRC | 05:04 | |
*** AlexStaf has quit IRC | 05:09 | |
*** linhnm has joined #openstack-lbaas | 05:26 | |
*** dayou has joined #openstack-lbaas | 05:26 | |
*** yboaron has joined #openstack-lbaas | 05:39 | |
*** linhnm has quit IRC | 05:56 | |
*** kobis has joined #openstack-lbaas | 06:17 | |
*** linhnm has joined #openstack-lbaas | 06:22 | |
*** kobis has quit IRC | 06:52 | |
*** rcernin has quit IRC | 07:13 | |
*** tesseract has joined #openstack-lbaas | 07:13 | |
*** yamamoto_ has joined #openstack-lbaas | 07:15 | |
*** pcaruana has quit IRC | 07:17 | |
*** pcaruana has joined #openstack-lbaas | 07:17 | |
*** pcaruana has quit IRC | 07:18 | |
*** yamamoto has quit IRC | 07:18 | |
*** pcaruana has joined #openstack-lbaas | 07:23 | |
*** annp has quit IRC | 07:28 | |
*** linhnm has quit IRC | 07:36 | |
*** AlexeyAbashkin has joined #openstack-lbaas | 07:36 | |
*** kobis has joined #openstack-lbaas | 07:42 | |
*** salmankhan has joined #openstack-lbaas | 08:23 | |
*** salmankhan has quit IRC | 08:59 | |
*** salmankhan has joined #openstack-lbaas | 09:06 | |
*** linhnm has joined #openstack-lbaas | 09:11 | |
*** devfaz has quit IRC | 09:33 | |
*** devfaz has joined #openstack-lbaas | 09:34 | |
*** annp has joined #openstack-lbaas | 09:41 | |
*** linhnm has quit IRC | 09:42 | |
*** kobis has quit IRC | 09:45 | |
openstackgerrit | Merged openstack/octavia master: Devstack plugin: Check for IPv6 support https://review.openstack.org/567578 | 09:49 |
*** kobis has joined #openstack-lbaas | 10:17 | |
openstackgerrit | Vadim Ponomarev proposed openstack/octavia master: Add a soft check for the allowed_address_pairs extension. https://review.openstack.org/568546 | 10:22 |
*** tesseract is now known as info | 10:41 | |
*** info is now known as tesseract | 10:41 | |
*** xuhaiwei has quit IRC | 10:48 | |
*** atoth has joined #openstack-lbaas | 11:20 | |
*** longkb has quit IRC | 11:29 | |
*** samccann has joined #openstack-lbaas | 11:56 | |
*** kobis has quit IRC | 12:08 | |
*** kobis has joined #openstack-lbaas | 12:13 | |
*** numans has quit IRC | 12:31 | |
*** numans has joined #openstack-lbaas | 12:35 | |
openstackgerrit | Vadim Ponomarev proposed openstack/octavia master: Add a soft check for the allowed_address_pairs extension. https://review.openstack.org/568546 | 13:02 |
*** rpittau has quit IRC | 13:02 | |
*** kobis has quit IRC | 13:57 | |
mnaser | hey everyone | 14:46 |
xgerman_ | o/ | 14:46 |
mnaser | so at the summit, there might be a keynote demo, and it might have octavia with k8s :) | 14:46 |
mnaser | and i'm just running it through to show off how loadbalancer resources in k8s create actual load balancers in openstack | 14:46 |
mnaser | everything works, but in the dashboard, the operating status of the load balancer is 'Offline' | 14:47 |
mnaser | is there a reason behind this? i'd rather people not see that as it might give "oh this isn't real" type of feeling | 14:47 |
xgerman_ | mmh, my k8s LB are ACTIVE/online | 14:49 |
xgerman_ | k8s has a tendency to move members around so I cycle through DEGRADE/ERROR soem times | 14:50 |
mnaser | https://usercontent.irccloud-cdn.com/file/7KXNAyrh/Screen%20Shot%202018-05-15%20at%2010.50.19%20AM.png | 14:50 |
mnaser | is there something that maybe we have misconfigured that might be leaving them in offline operating status? | 14:51 |
johnsom | mnaser: I have an answer, can chat in 10 mind | 14:53 |
johnsom | Mins | 14:53 |
mnaser | johnsom: np I have a call in 10 but will follow up | 14:53 |
*** kobis has joined #openstack-lbaas | 15:19 | |
*** links has quit IRC | 15:21 | |
*** sapd1 has joined #openstack-lbaas | 15:45 | |
*** kobis has quit IRC | 16:00 | |
*** salmankhan has quit IRC | 16:01 | |
*** kobis has joined #openstack-lbaas | 16:05 | |
*** bcafarel|pto is now known as bcafarel | 16:06 | |
*** kobis has quit IRC | 16:07 | |
*** tesseract has quit IRC | 16:21 | |
*** pcaruana has quit IRC | 16:33 | |
mnaser | johnsom: Small friendly ping :) | 16:39 |
johnsom | mnaser Hi\ | 16:40 |
johnsom | Ok, so a couple of thoughts. Is that just an LB or is that a full load balancer, like listeners pools etc.? | 16:40 |
xgerman_ | it’s probably the one generated by k8s | 16:41 |
xgerman_ | so listener pools, etc. | 16:41 |
johnsom | Also, does it have a health monitor? | 16:41 |
xgerman_ | they get them automaticslly | 16:41 |
xgerman_ | I think | 16:41 |
xgerman_ | https://www.irccloud.com/pastebin/fUi8Rzn3/ | 16:45 |
xgerman_ | ^^ this is how the k8s ones look on my ened | 16:45 |
johnsom | Hmm, then my guess is it just hasn't refreshed the screen yet, which is this patch: https://review.openstack.org/#/c/561458/ | 16:46 |
xgerman_ | yeah, never use the GUI | 16:46 |
johnsom | Which I have not tried yet, but could fast track testing it. | 16:47 |
johnsom | Without that patch you have to refresh the web page to get the status updates. Horizon had some bugs with dynamic updating fields that Jacky had to work around | 16:47 |
*** sshank has joined #openstack-lbaas | 17:11 | |
*** AlexeyAbashkin has quit IRC | 17:25 | |
*** Swami has joined #openstack-lbaas | 17:32 | |
*** links has joined #openstack-lbaas | 18:01 | |
*** salmankhan has joined #openstack-lbaas | 18:02 | |
*** sapd1 has quit IRC | 18:22 | |
mnaser | johnsom, xgerman_: sorry, been afk, i think i tried refreshing a few times and it was still offline | 18:33 |
mnaser | let me see if its still the casee | 18:33 |
xgerman_ | mnaser: make sure to compare with the CLI - I hardly ever use the GUI… | 18:33 |
mnaser | it's still offline, okay ill compare now | 18:33 |
johnsom | mnaser I have a dashboard setup now, when the LB is online if I refresh it updates correctly. With Jacky's patch it auto updates, but has a 60 second delay. When I modify his patch down to 5 or 10 seconds it works pretty well. | 18:34 |
johnsom | mnaser Also, which day's keynote? | 18:35 |
mnaser | johnsom: #1 but nothing publicly announced yet :) | 18:35 |
johnsom | Ok, NP. Just wanted to be there. | 18:35 |
johnsom | I can show you what value to change if you want to use Jacky's patch, otherwise it will be an overnight cycle to change the patch and get it reviewed. It of course would only be on master branch | 18:37 |
mnaser | interesitng | 18:38 |
mnaser | it shows offline even in cli | 18:38 |
johnsom | Yeah, I don't think there is a dashboard bug, I think it's actually offline | 18:39 |
xgerman_ | mmh, that’s not good — can you go through pools, etc. | 18:39 |
*** yamamoto_ has quit IRC | 18:46 | |
*** yamamoto has joined #openstack-lbaas | 19:06 | |
*** yamamoto has quit IRC | 19:12 | |
*** KeithMnemonic has joined #openstack-lbaas | 19:14 | |
*** KeithMnemonic1 has joined #openstack-lbaas | 19:14 | |
*** KeithMnemonic1 has quit IRC | 19:15 | |
mnaser | xgerman_: i can see everything | 19:15 |
* mnaser shrugs | 19:15 | |
xgerman_ | yeah, if it works… | 19:16 |
*** yamamoto has joined #openstack-lbaas | 19:17 | |
*** sshank has quit IRC | 19:17 | |
*** links has quit IRC | 19:20 | |
johnsom | mnaser Did you figure out what is up with your LB? | 19:21 |
mnaser | johnsom, xgerman_ just got off a call.. hav eanother one in 10 but | 19:21 |
mnaser | http://162.253.55.204/ | 19:21 |
mnaser | its def working. | 19:21 |
mnaser | maybe something is wrong with my health manager deployment? | 19:21 |
xgerman_ | it should not if it’s offline | 19:21 |
johnsom | Yeah, maybe. Maybe it's not getting it's health heartbeats? | 19:22 |
mnaser | let me look at healthmanager logs | 19:22 |
*** yamamoto has quit IRC | 19:22 | |
johnsom | It doesn't report the packets received unless it's in debug mode | 19:23 |
mnaser | hmm, nothing in logs | 19:23 |
mnaser | yeah let me restart in debug i guess | 19:23 |
mnaser | ok how does the amphorae know where the hms are | 19:23 |
mnaser | controller_ip_port_list | 19:24 |
johnsom | [health_manager] | 19:24 |
johnsom | controller_ip_port_list = 192.168.0.8:5555 | 19:24 |
mnaser | ok, the ip there isn't pingable, that explains it | 19:24 |
mnaser | would restarting octavia be enough to notify about the ip changing | 19:25 |
mnaser | or is that thing configured once when the amphora gets init'd | 19:25 |
johnsom | Yeah, it's at boot time. We don't have the update feature in yet | 19:25 |
johnsom | With HM in debug you should see something like: May 15 12:25:49 devstackpy27-2 octavia-health-manager[83352]: DEBUG octavia.amphorae.drivers.health.heartbeat_udp [-] Received packet from ('192.168.0.6', 16496) {{(pid=83362) dorecv /opt/stack/octavia/octavia/amphorae/drivers/health/heartbeat_udp.py:186}} | 19:26 |
mnaser | yeah i just tried pinging the ip and nothing came back so | 19:26 |
johnsom | Ping might just be security groups depending on how you setup your lb-mgmt net | 19:26 |
mnaser | oh yeah you're right | 19:27 |
mnaser | how often does heartbeat send msgs | 19:27 |
mnaser | aka if i tcpdump -eni any 'port 5555' ... how long should i wait | 19:27 |
johnsom | Also, depending on how you deployed, the devstack systemctl service for the HM doesn't shut down all of the processes, so watch for that, you might have to kill them by hand | 19:27 |
johnsom | default is 10 seconds | 19:28 |
*** yamamoto has joined #openstack-lbaas | 19:39 | |
*** salmankhan has quit IRC | 19:39 | |
*** yamamoto has quit IRC | 19:44 | |
mnaser | ok 10s will verify | 19:49 |
mnaser | are hm pings sent to all controllers | 19:57 |
mnaser | interesting, tcpdump shows stuff coming in | 19:58 |
*** yamamoto has joined #openstack-lbaas | 19:59 | |
johnsom | It rotates through the list, it is an HA strategy | 20:02 |
rm_work | mnaser: also remember it's UDP | 20:03 |
rm_work | so make sure UDP is allowed not just TCP | 20:03 |
*** yamamoto has quit IRC | 20:04 | |
*** yboaron has quit IRC | 20:13 | |
*** aojea has joined #openstack-lbaas | 20:18 | |
*** yamamoto has joined #openstack-lbaas | 20:22 | |
*** yamamoto has quit IRC | 20:26 | |
*** AlexStaf has joined #openstack-lbaas | 20:37 | |
*** kobis has joined #openstack-lbaas | 20:38 | |
mnaser | rm_work, johnsom: i see the udp traffic coming in the interface | 20:41 |
xgerman_ | sweet | 20:41 |
mnaser | no iptables rules | 20:41 |
mnaser | but nothing appearing in the log | 20:41 |
mnaser | 2018-05-15 19:59:42.219 34685 INFO octavia.amphorae.drivers.health.heartbeat_udp [-] attempting to listen on 0.0.0.0 port 5555 | 20:41 |
mnaser | wonder if listening to 0.0.0.0 is messing it up? | 20:41 |
johnsom | mnaser try this, just because I wonder if something isn't strange with your startup script. Do a systemctl stop octavia-health (or whatever it is called there), ps -ef | grep health | 20:42 |
johnsom | See if there are still processes, if so kill them (normal kill please, no -9) | 20:42 |
mnaser | johnsom: deployed via OSA, i used to see that bug in centos/rdo but not seeing it here | 20:42 |
johnsom | Then start up health again with systemctl | 20:42 |
mnaser | i restarted to switch to debug and i see it being the only new process since the restar | 20:42 |
mnaser | udp UNCONN 0 0 *:5555 *:* users:(("octavia-health-",pid=34685,fd=5)) | 20:43 |
mnaser | seems to be listening properly too | 20:43 |
*** yamamoto has joined #openstack-lbaas | 20:43 | |
johnsom | Ok, just checking. I have seen those other sub processes hang around and eat packets | 20:43 |
johnsom | stack@devstackpy27-2:~/horizon$ sudo netstat -apn | grep 5555 | 20:45 |
johnsom | udp 0 0 192.168.0.8:5555 0.0.0.0:* 83362/python | 20:45 |
mnaser | johnsom: ss -anlp | grep 5555 ? | 20:45 |
johnsom | And you see the UDP packets inside the container? hmmm | 20:45 |
* mnaser doesn't have netstat | 20:45 | |
mnaser | johnsom: yes! :\ | 20:46 |
johnsom | Yeah, I'm old school and haven't switched yet | 20:46 |
mnaser | centos dropped ifconfig and netstat | 20:46 |
mnaser | so it forced me to | 20:46 |
mnaser | :P | 20:46 |
mnaser | (i just wanna compare ss outputs) | 20:46 |
johnsom | udp UNCONN 0 0 192.168.0.8:5555 *:* users:(("octavia-health-",pid=83370,fd=4),("octavia-health-",pid=83368,fd=4),("octavia-health-",pid=83367,fd=4),("octavia-health-",pid=83366,fd=4),("octavia-health-",pid=83362,fd=4)) | 20:46 |
mnaser | ok so you're listening on the specific ip | 20:46 |
mnaser | i wonder if that has to do with it | 20:47 |
xgerman_ | OSA listens on all ips… | 20:47 |
*** yamamoto has quit IRC | 20:47 | |
xgerman_ | 0.0.0.0 - that shouldn’t affect anything | 20:47 |
johnsom | What version do you have? I see all of the processes I expect | 20:47 |
mnaser | %prog 2.0.2.dev3/openstack/venvs/octavia-17.0.3/bin/octavia-health-manager --version => | 20:47 |
mnaser | err | 20:47 |
mnaser | /openstack/venvs/octavia-17.0.3/bin/octavia-health-manager --version => %prog 2.0.2.dev3 | 20:48 |
mnaser | i see 2 hm processes, yet you have 5 | 20:48 |
mnaser | and i have only 1 listening vs your 5 on the port | 20:48 |
johnsom | queens | 20:48 |
mnaser | yup | 20:48 |
johnsom | Ok, so you probably have a pre-threads to process version, which is ok for small deployments | 20:49 |
johnsom | We switch from threadpool to procpool because we were hitting huge latency with the threads | 20:50 |
mnaser | strace the root proc shows => wait4(34685, | 20:50 |
mnaser | strace that process shows => recvfrom(5, | 20:50 |
mnaser | and then lsof shows fd 5 is | 20:51 |
mnaser | octavia-h 34685 octavia 5u IPv4 1812475951 0t0 UDP *:personal-agent | 20:51 |
*** kobis has quit IRC | 20:51 | |
* mnaser flips table | 20:51 | |
johnsom | Yeah, one is sleeping, this is the DB health check, the other is the receiver | 20:51 |
mnaser | oh man | 20:52 |
mnaser | i wonder if this rp_filter | 20:52 |
mnaser | packet comes in via br-mgmt from that vm | 20:52 |
mnaser | but the reverse path is not from the same interface, it has to do back from the default interface | 20:53 |
mnaser | because the default route is the "host" for the container | 20:53 |
johnsom | getting martians? | 20:54 |
mnaser | gah | 20:54 |
mnaser | that was it | 20:54 |
mnaser | sysctl -w net.ipv4.conf.all.rp_filter=0 fixed it. | 20:54 |
johnsom | hmmm | 20:55 |
mnaser | i'll have to invest sometime at some point at redoing our entire lbaas networking setup | 20:56 |
mnaser | the whole migration has kinda left some cruft for us | 20:56 |
mnaser | (migration to OSA that is) | 20:56 |
*** AlexStaf has quit IRC | 20:56 | |
johnsom | Ah, ok. I was scratching my head as I didn't think we had asymmetric in OSA | 20:57 |
mnaser | no its because we don't have br-lbaas or what have you | 20:57 |
mnaser | there, it's online now | 20:58 |
johnsom | Well, glad it's fixed for the demo | 20:58 |
mnaser | ++ | 20:58 |
mnaser | i mean it was working anyways but "online" looks way better than "offline" :p | 20:59 |
johnsom | Right | 20:59 |
johnsom | No F5's up our sleaves | 20:59 |
mnaser | may the demo gods be in our favour | 20:59 |
mnaser | i'll setup a second backup env | 20:59 |
johnsom | +1 | 21:00 |
rm_work | ummm | 21:02 |
rm_work | mnaser: soooo that means you NEVER actually had health-management working | 21:02 |
mnaser | shhhh | 21:02 |
rm_work | so just FYI | 21:02 |
mnaser | no one knows | 21:03 |
mnaser | :p | 21:03 |
rm_work | be on the lookout for anything weird | 21:03 |
rm_work | that is a huge piece | 21:03 |
mnaser | yeah you know it's interesting | 21:03 |
mnaser | i remember once deleting an amphora or turning it off | 21:03 |
rm_work | going from disabled to enabled HMs means that you may encounter new ... fun things | 21:03 |
mnaser | and no failover happened | 21:03 |
rm_work | yeah lol | 21:03 |
mnaser | dont think i got time to end up digging into it but i'll keep an eye out lol | 21:03 |
rm_work | so pay attention to make sure you don't get failovers when you shouldn't ;) | 21:03 |
rm_work | or that health messages aren't being delayed | 21:04 |
*** yamamoto has joined #openstack-lbaas | 21:04 | |
mnaser | heh will do | 21:05 |
mnaser | alright, gotta go run pick up someone from the airport | 21:05 |
mnaser | rm_work, johnsom: thanks as usual and sorry for taking up cycles on (once again) our misconfigs ;( | 21:05 |
rm_work | no worries | 21:05 |
johnsom | No worries! | 21:05 |
*** yamamoto has quit IRC | 21:08 | |
rm_work | johnsom: if you respond and/or fix the rest of that chain based on my comments, I can +2 all the way up prolly | 21:20 |
johnsom | Ok, cool. In plan for today. I was just hitting a few dashboard reviews while I had it setup | 21:20 |
rm_work | k | 21:20 |
johnsom | Looking at backup members now | 21:20 |
openstackgerrit | Adam Harwell proposed openstack/octavia-tempest-plugin master: Create api+scenario tests for healthmonitors https://review.openstack.org/567688 | 21:22 |
rm_work | whoops, pep8 | 21:23 |
*** aojea has quit IRC | 21:24 | |
*** yamamoto has joined #openstack-lbaas | 21:24 | |
*** yamamoto has quit IRC | 21:29 | |
*** yamamoto has joined #openstack-lbaas | 21:45 | |
*** yamamoto has quit IRC | 21:50 | |
openstackgerrit | Michael Johnson proposed openstack/octavia-dashboard master: Replace noop tests with registration test https://review.openstack.org/550721 | 21:51 |
*** sshank has joined #openstack-lbaas | 21:53 | |
*** rcernin has joined #openstack-lbaas | 22:00 | |
openstackgerrit | Michael Johnson proposed openstack/octavia-dashboard master: Fix sphinx-docs job for sphinx >1.7 https://review.openstack.org/568708 | 22:03 |
johnsom | ^^^ Same docs gate issue as octavia had. Cores, please give it a glance... | 22:04 |
*** yamamoto has joined #openstack-lbaas | 22:06 | |
openstackgerrit | Michael Johnson proposed openstack/octavia-dashboard master: Replace noop tests with registration test https://review.openstack.org/550721 | 22:09 |
*** yamamoto has quit IRC | 22:10 | |
*** yamamoto has joined #openstack-lbaas | 22:27 | |
openstackgerrit | Adam Harwell proposed openstack/octavia master: Let healthmanager process shutdown cleanly (again) https://review.openstack.org/568711 | 22:27 |
rm_work | johnsom: got it ;) | 22:27 |
johnsom | Awesome! | 22:27 |
rm_work | easy easy | 22:27 |
rm_work | (harlowja helped) | 22:28 |
johnsom | Ha, tell him Hi and thanks | 22:28 |
rm_work | so yeah the issue was that it'll shutdown/restart fine UNTIL it has received a health message, at which point it has run a process via the executor | 22:29 |
rm_work | once the executor has spun up, it needs to be allowed to shut down properly | 22:29 |
rm_work | we were terminating the parent before it had a chance | 22:29 |
rm_work | so it orphaned the children | 22:29 |
rm_work | (which wasn't a problem with threads) | 22:29 |
rm_work | so this will need a backport | 22:30 |
*** yamamoto has quit IRC | 22:31 | |
johnsom | ummm, so, it's just hanging now | 22:31 |
rm_work | in my devstack it does not... | 22:32 |
rm_work | and i don't have anything special.... | 22:32 |
johnsom | I just shutdown my HM, killed the processes, installed your patch, started then stopped | 22:33 |
johnsom | May 15 15:30:09 devstackpy27-2 systemd[1]: Stopping Devstack devstack@o-hm.service... | 22:33 |
johnsom | May 15 15:30:09 devstackpy27-2 octavia-health-manager[71502]: INFO octavia.cmd.health_manager [-] Health Manager exiting due to signal | 22:33 |
rm_work | hmm | 22:33 |
johnsom | Just sitting there | 22:33 |
rm_work | you don't get the rest? | 22:33 |
rm_work | sec | 22:33 |
rm_work | start it, and do a ps | 22:33 |
rm_work | and tell me how many procs you see | 22:34 |
johnsom | Right now I have: | 22:34 |
johnsom | stack 71502 1 0 15:29 ? 00:00:01 /usr/bin/python /usr/local/bin/octavia-health-manager --config-file /etc/octavia/octavia.conf | 22:34 |
johnsom | stack 71512 71502 0 15:29 ? 00:00:00 /usr/bin/python /usr/local/bin/octavia-health-manager --config-file /etc/octavia/octavia.conf | 22:34 |
johnsom | Let me kill this out and start again | 22:34 |
rm_work | http://paste.openstack.org/show/721048/ | 22:34 |
rm_work | it should go to 7 or so :3 | 22:35 |
rm_work | like, MINIMUM is 3 | 22:35 |
rm_work | if you don't get 3, something is f'd | 22:35 |
rm_work | then it should be 3+num_cores | 22:35 |
rm_work | once a message or two have come in | 22:35 |
johnsom | Ok, fresh start: stack 71924 1 15 15:35 ? 00:00:01 /usr/bin/python /usr/local/bin/octavia-health-manager --config-file /etc/octavia/octavia.conf | 22:36 |
johnsom | stack 71936 71924 0 15:35 ? 00:00:00 /usr/bin/python /usr/local/bin/octavia-health-manager --config-file /etc/octavia/octavia.conf | 22:36 |
johnsom | stack 71937 71924 0 15:35 ? 00:00:00 /usr/bin/python /usr/local/bin/octavia-health-manager --config-file /etc/octavia/octavia.conf | 22:36 |
rm_work | http://paste.openstack.org/show/721049/ | 22:36 |
rm_work | can you pastebin, with more context | 22:36 |
rm_work | like what i did | 22:36 |
johnsom | https://www.irccloud.com/pastebin/kjSFrsgu/ | 22:36 |
rm_work | 3 means it hasn't gotten any packets yet | 22:37 |
rm_work | actually i need to test with that, one sec | 22:37 |
johnsom | Yeah, there are no amps yet | 22:37 |
rm_work | k one moment, deleting my LB | 22:37 |
rm_work | hmmmmmmmmmmmmmmmmmmm | 22:38 |
rm_work | yeah i think it's hanging on the recv maybe | 22:38 |
johnsom | Yeah, it just hangs until systemd gives up and kills it | 22:38 |
rm_work | T_T | 22:38 |
*** AlexStaf has joined #openstack-lbaas | 22:45 | |
rm_work | johnsom: got it, one sec, testing | 22:45 |
*** yamamoto has joined #openstack-lbaas | 22:47 | |
*** JudeC has quit IRC | 22:51 | |
rm_work | johnsom: try that | 22:51 |
rm_work | err | 22:51 |
openstackgerrit | Adam Harwell proposed openstack/octavia master: Let healthmanager process shutdown cleanly (again) https://review.openstack.org/568711 | 22:51 |
johnsom | lol | 22:51 |
rm_work | that ^^ | 22:51 |
*** JudeC has joined #openstack-lbaas | 22:52 | |
*** yamamoto has quit IRC | 22:52 | |
johnsom | Bummer, you are right, cert parser is handing me back a decrypted private key at that point. This will be a bit of work to fix | 22:54 |
rm_work | we need to dump that entire file | 22:55 |
rm_work | honestly | 22:55 |
johnsom | Looks good so far | 22:58 |
rm_work | johnsom: so... if you want to sit down at the summit and just DELETE cert_parser.py | 23:02 |
rm_work | and the whole folder its in, i think.... | 23:02 |
rm_work | and then make everything work again... | 23:02 |
rm_work | i will do that | 23:02 |
rm_work | with you | 23:02 |
johnsom | Only if we get these driver patches landed, then sure, if it need to go we can do that | 23:02 |
rm_work | k | 23:03 |
johnsom | Maybe wednesday | 23:03 |
*** yamamoto has joined #openstack-lbaas | 23:08 | |
*** yamamoto has quit IRC | 23:12 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Implement provider drivers - Listener https://review.openstack.org/566379 | 23:19 |
johnsom | So, yeah, when we get to cleanup, we need to decide if we want to use 50x or not | 23:21 |
rm_work | I do not like it | 23:21 |
johnsom | I kind of agree I don't really want to return that like ever | 23:21 |
rm_work | but I don't know what we SHOULD send instead :3 | 23:21 |
johnsom | But, it's technically accurate if the driver blows chunks | 23:21 |
johnsom | https://review.openstack.org/#/c/563795/12/octavia/common/exceptions.py | 23:22 |
johnsom | Looking at this list... | 23:23 |
johnsom | We could change all but driverError to 400 as it could be considered bad input. | 23:24 |
johnsom | I mean provider not found would be a bit unfair as it would be listed in the providers list the user can see | 23:24 |
*** sshank has quit IRC | 23:24 | |
rm_work | 412? | 23:25 |
rm_work | or 501 | 23:25 |
johnsom | Well, it's 501 now | 23:26 |
rm_work | oh err | 23:26 |
rm_work | i thought i was seeing 500 | 23:26 |
rm_work | with a traceback | 23:26 |
rm_work | i am pretty sure i was... | 23:26 |
rm_work | maybe i didn't look in the cleanup? | 23:26 |
rm_work | did you fix/change it? | 23:26 |
johnsom | no, it's all at that link I pasted | 23:26 |
rm_work | hmmmm k | 23:27 |
rm_work | err but, hold on | 23:27 |
rm_work | which review did i comment about that on | 23:28 |
johnsom | lol we could 502 them "Bad Gateway". I think that would lead to proxy confusion though | 23:28 |
johnsom | https://review.openstack.org/#/c/566698/3/octavia/tests/functional/api/v2/test_pool.py | 23:29 |
johnsom | That is the one I was looking at | 23:29 |
johnsom | It's a DriverError test | 23:29 |
rm_work | ok yeah so ... | 23:30 |
rm_work | do you see that looking for 500? | 23:30 |
rm_work | that seems to say 500 not 501 | 23:30 |
*** yamamoto has joined #openstack-lbaas | 23:30 | |
johnsom | Right, it's the driver error, the others are 501 | 23:32 |
rm_work | yeah, ok | 23:32 |
johnsom | Also I disabled the stack trace for 5xx | 23:32 |
rm_work | so yes, 501 | 23:32 |
rm_work | I would think | 23:32 |
rm_work | where do you use ProviderNotFound | 23:34 |
johnsom | That is when it is enabled in the config but stevedore can't load it | 23:34 |
rm_work | ah also, is it a setting whether to do traces or not? | 23:34 |
johnsom | drivererror is like when the driver raises some random exception, not really not implemented, but broken | 23:34 |
rm_work | right | 23:35 |
rm_work | so you don't think 501 for that? | 23:35 |
johnsom | https://review.openstack.org/#/c/563795/12/octavia/api/config.py | 23:35 |
rm_work | it's like, we know kinda what's broken, we didn't just randomly explode, we should kinda bubblewrap the drivers for user returns | 23:35 |
johnsom | 501 is not implemented, not your dumb provider just went out to lunch | 23:35 |
*** yamamoto has quit IRC | 23:35 | |
rm_work | lol | 23:36 |
rm_work | yeah but that's something an operator should know/care about | 23:36 |
rm_work | not something the user should see IMO | 23:36 |
johnsom | Yeah, the whole point of this class I linked is to bubble wrap the drivers | 23:36 |
rm_work | as far as the user is concerned, the two things are the same | 23:36 |
rm_work | yeah, so i am not sure the other one should be a 4xx either | 23:36 |
rm_work | that should probably also be a 5xx | 23:36 |
rm_work | because again, it's a deployer issue | 23:37 |
johnsom | I am super temped to make the ProviderNotImplementedError and ProviderUnsupportedOptionError 400 | 23:37 |
rm_work | err | 23:37 |
rm_work | wait | 23:37 |
rm_work | sorry i was looking at something backwards | 23:37 |
johnsom | No ProviderNotEnabled is clearly the user put in the wrong provider name | 23:37 |
rm_work | yes | 23:38 |
rm_work | yeah ok nm i figured it out | 23:38 |
rm_work | but yeah .... i still feel like the rest are operator concerns | 23:38 |
johnsom | That is a provider that is not even enabled in the config, so not returned via the provider list api | 23:38 |
rm_work | and the user should see something that's less "SOMETHING UNEXPECTED HAPPENED!!!" and more "well, we tried to deal with the driver, but they were shit, so we can't do it" | 23:38 |
rm_work | 500 is very "EXPLODE!" | 23:39 |
johnsom | I need to run an errand and beet 5pm traffic. I will be back in a little while | 23:39 |
rm_work | 501 seems more graceful | 23:39 |
rm_work | maybe just me | 23:39 |
rm_work | kk | 23:39 |
johnsom | Yeah, that is why I created these special exceptions. CLEARLY call out the driver | 23:39 |
johnsom | Always | 23:39 |
johnsom | It's my subtle "your driver is 'poor'" | 23:40 |
rm_work | yeah but half the time people are just going to see "500" and think "stupid service exploded" | 23:40 |
rm_work | at least 501 usually causes reading to happen | 23:40 |
johnsom | Alright, back in a few. Noodle, let me know what you think | 23:40 |
johnsom | We could 503 them too | 23:40 |
rm_work | maybe this is territory for "ask at the summit" | 23:41 |
* rm_work waves | 23:41 | |
*** Swami has quit IRC | 23:51 | |
*** yamamoto has joined #openstack-lbaas | 23:52 | |
*** yamamoto has quit IRC | 23:56 | |
*** samccann has quit IRC | 23:58 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!