mephmanx | All, I am having an issue installing cloudfoundry in my openstack env. I get errors related to Octavia/lbaas. I put the question out to a few places and got the following response: | 11:35 |
---|---|---|
mephmanx | I don’t think the Octavia endpoints are the same as the old Neutron lbaas ones were. :path => "/v2.0/lbaas/pools" ''OpenStack API NotFound Expected(200) <=> Actual(404 Not Found) The API docs for Octavia seem to indicate that the path would be /v2/lbaas (rather than /v2.0) I’ve no idea how to fix the issue, but I think that explains why the error is happening. | 11:35 |
mephmanx | Is there a way I could quickly identify if this was the case? If it turns out to be, I think I have 3 options: 1) find and fix the code in whatever is using that url to communicate with openstack 2) fix openstack neutron itself to use the correct url 3) put a service in the middle (like HAProxy or something) that could do url rewriting. | 11:37 |
mephmanx | Could I get some advice? | 11:37 |
gthiemonge | mephmanx: /v2.0/lbaas/pools should work, this is an alias to /v2/lbaas/pools | 11:43 |
mephmanx | hmm... on my cloudfoundry install, I am receiving this error | 11:44 |
mephmanx | 01:29:55.852077+0000', 'admin', 'update', 'deployment', 'cf', 'CPI error ''Bosh::Clouds::CloudError'' with message ''OpenStack API NotFound Expected(200) <=> Actual(404 Not Found) excon.error.response :body => "{\"NeutronError\": {\"type\": \"HTTPNotFound\", \"message\": \"The resource could not be found.\", \"detail\": \"\"}}" :cookies => [ ] :headers => { "content-length" => "1 | 11:44 |
mephmanx | "content-type" => "application/json" "date" => "Thu, 02 Sep 2021 01:29:32 GMT" "strict-transport-security" => "max-age=31536000;" "x-openstack-request-id" => "req-fbfa7aa1-1f13-46c6-8385-5f0fde847b08" } :host => "openstack-external.lyonsgroup.family" :local_address => "10.0.1.6" :local_port => 45952 :path => "/v2.0/lbaas/pools" :port | 11:44 |
mephmanx | if I do a curl against https://openstack-external.lyonsgroup.family:9696/v2.0/lbaas on any of the servers in question, the endpoint works so I don't believe it's the network. | 11:45 |
mephmanx | I made everything unrestricted so that all the bosh VMs had access so I don't believe it's security... | 11:46 |
mephmanx | I could forward along anything that could help if you have a moment.... At least if you could point me to where to look I'd be grateful. | 11:47 |
gthiemonge | I believe the url is correct, but perhaps the endpoint is not the right one. 9696 is the neutron port, so you're probably using the neutron endpoint here | 11:48 |
mephmanx | yeah, the error looks like it tries to do something with the loadbalancers against the neutron port: | 11:49 |
mephmanx | :host => "openstack-external.lyonsgroup.family" :local_address => "10.0.1.6" :local_port => 45952 :path => "/v2.0/lbaas/pools" :port => 9696 :reason_phrase => "Not Found" :remote_ip => "174.54.141.197" :status => 404 :status_line => "HTTP/1.1 404 Not Found\r\n" | 11:49 |
mephmanx | The 9696 neutron service is active and open...you could hit it as well from your machine using the link I posted. | 11:51 |
gthiemonge | mephmanx: I'm not familiar with cloudfoundry, but to me, it looks like it doesn't support octavia | 11:52 |
mephmanx | I heard from them that their reference design was octavia.... | 11:54 |
mephmanx | I asked on their slack channel about other issues I have had getting to this point. Mostly either things I did wrong or not very good documentation... | 11:55 |
gthiemonge | mephmanx: there was a neutron-lbaas proxy plugin that is now unsupported, it forwards request from neutron-lbaas to octavia, maybe their reference design uses it | 12:05 |
mephmanx | is that gone now? How could I install that if it is? I have a vanilla Wallaby cloud deployed using kolla that I am working with. | 12:07 |
gthiemonge | it was deprecated in Queens | 12:09 |
mephmanx | is it still useable? Could I deploy it in my stack? | 12:12 |
gthiemonge | it was removed in Stein, I don't believe it would work in a Wallaby env | 12:22 |
mephmanx | So the only way to use cloudfoundry then would be to fix the cloudfoundry code or maybe deploy as kubernetes? | 12:25 |
mephmanx | Anyone else also have this feeling? cloudfoundry does not support octavia natively, it only supports up to Queens (due to lbaas-proxy)? | 12:34 |
opendevreview | Gregory Thiemonge proposed openstack/octavia stable/wallaby: Add generic network interface management in the amphora https://review.opendev.org/c/openstack/octavia/+/807310 | 12:57 |
mephmanx | I think I see the issue in fog-openstack, if it is as easy as just using the /v2 endpoint instead of /v2.0, as the /v2.0 goes to the neutron-lbaas proxy (which doesn't exist anymore). I can point that change out (or make it) if someone else could test it out or even approve the change. It looks like the fog library is nearly abandoned as it doesn't appear to have had a commit in over a year. | 13:07 |
mephmanx | I would like to be able to get this working in a week or so....not wait months or longer... | 13:08 |
mephmanx_ | sorry, I dropped connection. Anyone have any other thoughts on fog-openstack, cloudfoundry support for openstack version after queens, neutron lbaas / octavia, etc? | 13:39 |
gthiemonge | nope, sorry, perhaps the people who are based in the US will join us soon and will have more insights on this | 13:41 |
johnsom | mephmanx_: you need to fix the endpoint cloud foundry is using for the load balancer service (i.e. not the old port way) or map the /lbaas path to point to the octavia api endpoints | 13:41 |
johnsom | People have done the mapping/proxy using apache in the past | 13:43 |
mephmanx_ | ok, so the requests are basically the same, it's just the uri that is different? | 13:45 |
mephmanx_ | I can manage that...do you have any links that discuss that? I have worked with apache before. I have PFSense / HAProxy that could help with it... | 13:46 |
mephmanx | sorry, connection dropped again. Was there confirmation that the traffic is the same, it's just the uri that is wrong, if I were to put a proxy in front of neutron? | 13:55 |
mephmanx | Also, could you repost any links that describe this solution? If some were posted, I didn't get them. | 13:55 |
johnsom | mephmanx: https://wiki.openstack.org/wiki/Neutron/LBaaS/Deprecation | 14:00 |
mephmanx | I saw that page, thanks. Are there any blogs or walkthroughs on how others set up a proxy for the lbaas issue? | 14:02 |
johnsom | Back at the deprecation time there was a test job using the apache method. If you dig back in the neutron lbaas test jobs you may be able to find it. | 14:02 |
johnsom | I am on mobile so can’t dig for it right now | 14:02 |
johnsom | The API is compatible, it is just how the api is reached that changed | 14:03 |
mephmanx | ok, so just to confirm how I see it; if I can proxy requests made to /v2/lbaas/* to /v2.0/lbaas/*, that should do it? | 14:14 |
johnsom | No, v2 and v2.0 are aliased; what matters is the endpoint where octavia is listening. Could be a different port number or IP depending on how your cloud is configured. Check “openstack endpoint list” | 14:29 |
mephmanx | here is what I have: What would I be rewriting or proxying? | 14:30 |
mephmanx | +----------------------------------+---------+--------------+-----------------+---------+-----------+-------------------------------------------------------------------------+ | ID | Region | Service Name | Service Type | Enabled | Interface | URL | +----------------------------------+---------+--------------+---------- | 14:31 |
mephmanx | hmm... let me pastebin that. | 14:31 |
mephmanx | https://pastebin.com/QVrLWPks | 14:31 |
johnsom | So, this is your target endpoint in that cloud: https://openstack-external.lyonsgroup.family:9876 | 14:36 |
mephmanx | here is the error I see during cloudfoundry install. Looks like it is trying to use 9696. https://pastebin.com/zPuRetJz | 14:37 |
mephmanx | wait, I think I see what you are saying. I need to send the lbaas requests that are going to 9696 to 9876, right? | 14:37 |
johnsom | Right, it is using the neutron port 9696 instead of the octavia port 9876 | 14:37 |
mephmanx | Ah, ok. I got it then. Thank you! Let me see what I can put together. | 14:37 |
mephmanx | I was able to get the redirect working but I am now seeing 504's back from octavia on POSTs to /v2.0/lbaas/pools/<poolid>/members | 15:30 |
johnsom | mephmanx Hmmm, you might need to increase the connection/data timeouts in your proxy. | 15:48 |
mephmanx | ok. Could there be any other resources that would need this sort of rewrite? I ran a bunch of prep scripts via terraform to prep the env for cloudfoundry...the scripts created a bunch of stuff but one thing they created was the loadbalancer. Would I need to delete and recreate or possibly recreate the entire env due to this? Could something be in a bad state from not having this access and now the db is messed up? | 15:51 |
johnsom | No, the database will be consistent, but the resource may not be fully set up as terraform expects. Considering it was adding members, I would just check that all of the service instances you would expect are set up correctly on the load balancer pool. | 15:53 |
johnsom | I'm now in the office, I can see if I can dig and find the old setup that was used for testing if you would like. | 15:53 |
johnsom | It might have the timeouts that were used. | 15:53 |
mephmanx | if you could, I would greatly appreciate it. It looks like I was using the haproxy default of 30 seconds... | 15:54 |
johnsom | I would hope that terraform would have logged any resources it wanted to set up, but was unable to complete. But I have also not used cloudfoundry | 15:54 |
johnsom | Yeah, give me a few minutes. I need to jump in the way-way-back machine. grin | 15:54 |
mephmanx | I see log entries like this as well: Pool cannot be created or modified because the Load Balancer is in an immutable state | 15:56 |
mephmanx | that looks like it happens if the LB is in ERROR or one of the pending states, but mine is in ACTIVE and it looks like the script even added a member... | 15:57 |
johnsom | That is normal. If an update is in-progress on the resource, to keep consistency, we send back an HTTP status code that says it should retry the request. | 15:57 |
mephmanx | ok, thanks | 15:57 |
mephmanx | I set it to 600 seconds but still got errors.... https://pastebin.com/7p40kfYf | 15:58 |
johnsom | Hmm, adding members should only take a few seconds really. It depends on the compute environment, is it nested virtualization (running VMs in VMs) or not. | 15:59 |
mephmanx | it is nested.. | 16:00 |
johnsom | Ok, then it's important to have KVM enabled as otherwise nova can take up to 16 minutes to fully boot a VM. | 16:00 |
mephmanx | the 504s look like they are also happening on /v2.0/lbaas/pools<pool id> | 16:00 |
mephmanx | KVM is enabled | 16:00 |
johnsom | Ok, good! Yeah, 504 is purely a timeout at the proxy layer, it's not coming from Octavia API. | 16:01 |
mephmanx | it doesn't look like it waited the 600 seconds...the 504 came back pretty quick still... I'm wondering if it's another timeout or something else missing... | 16:01 |
johnsom | What are you using for the proxy? Apache, haproxy? | 16:01 |
mephmanx | haproxy. I added a rule that says "if the string lbaas is in the path, send to octavia instead of neutron" | 16:02 |
mephmanx | It actually looks like all the octavia apis are sending 504...must be the setup then... | 16:02 |
johnsom | Right, perfect. So there are 2-3 timeouts that probably need to be set in haproxy for this path. | 16:03 |
mephmanx | no, wait...some of the gets are returning... | 16:03 |
mephmanx | I updated the timeouts on the neutron frontend...maybe octavia as well? I'll check backend timeouts... | 16:03 |
mephmanx | found a few more timeouts to mess with... | 16:05 |
johnsom | timeout client <> | 16:07 |
johnsom | timeout server <> | 16:07 |
johnsom | timeout tunnel <> | 16:07 |
johnsom | I would just set those all to like 5 minutes | 16:07 |
mephmanx | I updated them all and it's still processing...good sign... | 16:08 |
johnsom | +1 | 16:08 |
mephmanx | still going so it might be working! I'm worried if the calls are staying open and will hit the 600 second mark... | 16:14 |
mephmanx | I do see new vms in horizon though... | 16:14 |
johnsom | It really should not, the Octavia API calls return pretty quickly, the longest time spent is waiting for nova/neutron to plug network ports, etc. | 16:15 |
mephmanx | johnsom I think it worked! Or, at least the deployment got further. This is the second issue you helped me through...I should send you doughnuts or something! | 16:37 |
mephmanx | deployment success!!! Thank you! | 16:57 |
mephmanx | https://pastebin.com/8cNDiY9i | 16:58 |
johnsom | Cool, glad I could help. | 17:01 |
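
The fix worked out in the conversation above comes down to one routing rule: load balancer requests that arrive on the Neutron port (9696) have to reach the Octavia port (9876), with the path left unchanged, since /v2 and /v2.0 are aliases. The following is a minimal Python sketch of that rule for illustration only. The function name `route_lbaas` and the constant names are hypothetical; the ports and host come from this log, and the actual fix in the conversation was an HAProxy rule, not this code.

```python
from urllib.parse import urlsplit, urlunsplit

# Port numbers taken from the log; the constant names are illustrative.
NEUTRON_PORT = 9696
OCTAVIA_PORT = 9876


def route_lbaas(url: str) -> str:
    """Return the URL a request should be forwarded to.

    Requests for /v2/lbaas/... or /v2.0/lbaas/... that target the Neutron
    port are pointed at the Octavia port instead. The path itself is not
    rewritten. Everything else is returned unchanged.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    is_lbaas = (
        len(segments) >= 2
        and segments[0] in ("v2", "v2.0")
        and segments[1] == "lbaas"
    )
    if parts.port == NEUTRON_PORT and is_lbaas:
        netloc = f"{parts.hostname}:{OCTAVIA_PORT}"
        return urlunsplit(
            (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
        )
    return url


if __name__ == "__main__":
    print(route_lbaas(
        "https://openstack-external.lyonsgroup.family:9696/v2.0/lbaas/pools"
    ))
```

Matching on the first two path segments, rather than on the substring "lbaas" anywhere in the path, is a deliberate choice in this sketch so that unrelated Neutron resources are never redirected by accident.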