*** fnaval has quit IRC | 00:10 | |
*** PagliaccisCloud has quit IRC | 00:21 | |
rm_work | cgoncalves: what happens when you then REMOVE that listener? does it remove the existing rule for that port and break the peering? :P | 00:46 |
---|---|---|
rm_work | was traveling today so I've been out, but what's happening | 00:48 |
rm_work | xgerman: did you make it to kubecon? | 00:49 |
*** Swami has quit IRC | 00:49 | |
johnsom | Also, I can reproduce this barbican ACL issue (queens). so fun times tracking that down | 00:50 |
rm_work | hmmmmmmmm | 00:54 |
johnsom | Yeah, I grant octavia the ACL right, but octavia gets RBAC error. Only if the user creating the secret is in a different project. | 00:56 |
rm_work | hmmmmmmm | 00:56 |
rm_work | let me spin up a queens env, do you have the steps written out somewhere? (story?) | 00:57 |
johnsom | No, but that could happen super quick | 00:57 |
johnsom | Hmm, nevermind, I think I screwed up reproducing it | 01:02 |
rm_work | k | 01:03 |
*** yamamoto has quit IRC | 01:03 | |
johnsom | I need to switch the octavia account and try this too. | 01:03 |
rm_work | ok well, devstack is spinning | 01:07 |
lxkong | johnsom, rm_work, could you please take a look at this https://storyboard.openstack.org/#!/story/2004602? Is it a bug or not? | 01:09 |
rm_work | hmmm | 01:09 |
rm_work | i mean, that SOUNDS like a thing that could be a bug | 01:10 |
rm_work | I guess qos-policy needs to be a capability check? | 01:10 |
rm_work | similar to some of the other stuff we check for in the network driver | 01:10 |
rm_work | and then we can set a flag for whether to try or not | 01:10 |
rm_work | the question would be, what to do in the case it's unsupported -- do we let users set them and just ignore it? or do we do something like what happens if an operator disables TLS-Term in config? | 01:11 |
rm_work | right now it's technically doing what I'd expect -- trying to do the requested provisioning, failing, and rolling back -> ERROR | 01:12 |
rm_work | so that's "correct" kinda | 01:12 |
johnsom | Yeah, create should roll back the API request when the VIP port create fails. | 01:14 |
johnsom | Probably doesn't give the best error though. | 01:14 |
johnsom | Update probably needs a check | 01:14 |
rm_work | the user doesn't really have ANY insight into what happened | 01:15 |
rm_work | which ... didn't we discuss fixing that, at the PTG? | 01:15 |
rm_work | it came up again yesterday internally | 01:15 |
rm_work | "how does the user get any visibility to what broke" -> "they don't" | 01:16 |
johnsom | Sure they do, the API will return a 400 and a fault string | 01:16 |
rm_work | not on an update | 01:16 |
rm_work | i mean when something like THAT breaks | 01:16 |
johnsom | Right, update should probably check the neutron capability | 01:16 |
rm_work | and create won't either right now | 01:16 |
rm_work | because it's on the async side | 01:16 |
rm_work | isn't it? | 01:16 |
johnsom | No it is not | 01:17 |
rm_work | oh, maybe it's on the API side if it's the VIP port | 01:17 |
johnsom | Yeah, we had to create that up front to give them the IP back in the response | 01:17 |
rm_work | i always forget we moved that to be sync | 01:17 |
rm_work | we're so nice to our users, lol | 01:17 |
lxkong | johnsom: but use just specify a None for that param, even if neutron doesn't support it, shouldn't that be ignored? | 01:17 |
rm_work | nova don't care | 01:17 |
lxkong | s/use/user | 01:17 |
rm_work | "you'll get an IP when you get one" | 01:17 |
rm_work | lol | 01:17 |
rm_work | hmmm yeah, i kinda skimmed over the None part | 01:18 |
rm_work | that is prolly because of how the WSME stuff loads? None should technically be "no change" so it shouldn't even go through any logic for it | 01:18 |
rm_work | but ... | 01:18 |
johnsom | I'm confused on the None thing? | 01:19 |
rm_work | either the WSME stuff is handling None wrong, or we're doing something we shouldn't on updates (which I don't think is true, because I ran on Liberty Neutron and updates were fine) | 01:19 |
johnsom | Yeah, in the API None/blank is "UnSetType" or something similar in the WSME objects the API gets | 01:19 |
lxkong | rm_work: 'None' means remove any policy on the port, right? | 01:19 |
rm_work | only if there WAS one | 01:20 |
johnsom | Oh, missed the story link | 01:20 |
johnsom | looking | 01:20 |
rm_work | i mean, yes, it is different from Unset | 01:20 |
rm_work | but we SHOULD throw it away | 01:20 |
rm_work | if it's a noop | 01:20 |
rm_work | ... oh, unless we're being lazy/dumb, which is highly probable | 01:20 |
rm_work | let me glance at that path | 01:20 |
lxkong | what if neutron supports it? we should not treat it as unsettype | 01:20 |
johnsom | We will pass "null" in as None through WSME. Is the client converting that "None" to null? | 01:21 |
johnsom | Yeah, the unset type gets dropped in our model handling. Unsets don't go into the update list | 01:21 |
lxkong | `Set QoS policy ID for VIP port. Unset with 'None'.` | 01:21 |
lxkong | None is meaningful value | 01:22 |
johnsom | This is in the client? | 01:22 |
lxkong | yeah | 01:22 |
rm_work | we do the validate for it only if there's a change.... looking at what we do after that | 01:22 |
johnsom | That is a fail | 01:22 |
lxkong | from CLI helper | 01:22 |
johnsom | the CLI has an "unset" command for that | 01:22 |
johnsom | Probably didn't get implemented though | 01:22 |
rm_work | yeah but it MUST be passing through something | 01:23 |
rm_work | because it's UUIDType in WSME | 01:23 |
johnsom | Probably null | 01:23 |
rm_work | which means for it to accept the request and get to provisioning, it has to either be a UUID or a null | 01:23 |
johnsom | lxkong For your original question, yes, that is very much a bug | 01:23 |
rm_work | right, so a null *would be correct* for "unsetting" | 01:23 |
rm_work | right? | 01:23 |
johnsom | Yeah | 01:23 |
rm_work | so i think the issue is past this | 01:24 |
rm_work | i think because it's not "wsme-unset" | 01:24 |
rm_work | we DO pass it through in the model that goes to the handler | 01:24 |
rm_work | at which point it'd try to do an update without actually checking to see if that'd be a noop | 01:24 |
rm_work | which would call neutron with a json-body that's invalid for that version | 01:25 |
johnsom | Right, null would get through and become None in the code, post validation | 01:25 |
rm_work | so even though there wasn't a vip-qos | 01:25 |
rm_work | it'll try to remove it in neutron | 01:25 |
rm_work | which will fail | 01:25 |
rm_work | (because that version of neutron doesn't support it) | 01:26 |
rm_work | yep | 01:27 |
rm_work | just confirmed that for myself | 01:28 |
lxkong | johnsom, rm_work, could you please help to fix so I can help backport to Queens and Rocky, we also need to do a quick release. | 01:28 |
rm_work | https://github.com/openstack/octavia/blob/master/octavia/controller/worker/tasks/network_tasks.py#L542-L548 | 01:28 |
rm_work | so it'll try because it IS in the update-dict | 01:28 |
rm_work | well the quick thing to do here would be to just short-circuit | 01:28 |
rm_work | if it's a real noop | 01:29 |
rm_work | which I believe we can check here | 01:29 |
rm_work | but the real solution I think is to actually detect this feature availability, and avoid the operation in general if it is unavailable | 01:29 |
rm_work | oh, maybe not tho? because NOW we store the data to the DB in the API layer, which means by the time we're here we can't tell if it's a noop or not | 01:31 |
rm_work | umm, could short-circuit it in the API layer | 01:31 |
rm_work | but that's kinda hacky | 01:31 |
rm_work | lxkong: how urgent is this really? can you communicate to users just not to try to set that? | 01:31 |
rm_work | why would they even try, if they were not able to create a LB with a qos-policy to begin with? | 01:32 |
rm_work | I feel like this isn't super critical, so we should be able to take the time to fix it correctly | 01:32 |
lxkong | rm_work: that's not so urgent, because i can just restore the error status back to active, and it's not harmful for the users. | 01:32 |
rm_work | yeah, ummm, find out what users are doing that, and tell them to *stop it* | 01:33 |
rm_work | and ask why they thought that made any sense to begin with <_< | 01:33 |
lxkong | that happened because some users are trying our LBaaS and specifying some params that they thought were safe to set | 01:33 |
lxkong | i've already told them, and we do need a solution | 01:34 |
lxkong | rm_work: thanks for all your analysis | 01:34 |
johnsom | Yeah, we should never 500 out | 01:34 |
rm_work | I posted my comments on the story | 01:36 |
rm_work | johnsom: that wouldn't be a 500 | 01:36 |
rm_work | it'd be a 202 | 01:36 |
rm_work | err, 200 | 01:36 |
rm_work | and then it'd fail async | 01:36 |
rm_work | well, yeah on a create it might 500? not sure exactly | 01:36 |
johnsom | Yeah, I realized that after I typed it | 01:36 |
rm_work | but this story is specifically about update | 01:36 |
johnsom | Too many things going on | 01:36 |
rm_work | yeah lol | 01:37 |
rm_work | took me a while to even get my brain to focus on what the issue was, but yeah, got it | 01:37 |
johnsom | I'm adding a task for the "unset" issue in the client. | 01:38 |
johnsom | We need to fix a bunch of those | 01:38 |
johnsom | I don't think any of our update commands have "unset" | 01:39 |
rm_work | yeah, well, that's true | 01:39 |
rm_work | but not REALLY the issue | 01:39 |
rm_work | I suppose we should do it anyway tho | 01:39 |
johnsom | Right, but ... | 01:39 |
rm_work | is there a way to mark a bug-story as "triaged" or "confirmed" or something in launchpad? :/ | 01:39 |
johnsom | In launchpad yes, in our friend storyboard, no | 01:40 |
rm_work | errr | 01:40 |
rm_work | oops lol yeah | 01:40 |
johnsom | Other than adding tags to stories, but that is pointless because you can't search for *not* having a tag in SB either | 01:40 |
rm_work | lol | 01:40 |
rm_work | storyboard != bug tracker, I guess | 01:41 |
johnsom | Ok, off to dinner. I have the release patch ready, just waiting for the last patch to finish merging. Will push that later tonight | 01:41 |
rm_work | o/ | 01:41 |
rm_work | ping me for a +2 if you need one | 01:42 |
johnsom | Evidently we are "using it wrong" | 01:42 |
johnsom | sigh | 01:42 |
rm_work | lol | 01:42 |
johnsom | No, our stuff is all +2/+w, this will be a release team +2. So, no worries | 01:42 |
rm_work | kk | 01:42 |
*** phuoc_ has joined #openstack-lbaas | 01:50 | |
*** phuoc has quit IRC | 01:53 | |
openstackgerrit | Merged openstack/octavia stable/pike: Stop Logging Amphora Cert https://review.openstack.org/625066 | 02:15 |
*** sapd1_ has joined #openstack-lbaas | 02:27 | |
*** sapd1 has quit IRC | 02:29 | |
*** yamamoto has joined #openstack-lbaas | 03:02 | |
openstackgerrit | Merged openstack/octavia stable/queens: Bring up secondary IPs on member networks https://review.openstack.org/624804 | 03:05 |
*** hongbin has joined #openstack-lbaas | 03:14 | |
*** hongbin has quit IRC | 03:15 | |
*** hongbin has joined #openstack-lbaas | 03:16 | |
johnsom | stable branch release patch: https://review.openstack.org/625144 | 03:28 |
*** yamamoto has quit IRC | 03:39 | |
rm_work | cool | 03:47 |
rm_work | wait we're still cutting pike releases? lol | 03:47 |
rm_work | we haven't been backporting anything past queens tho | 03:47 |
rm_work | oh, just that one | 03:47 |
*** ramishra has joined #openstack-lbaas | 03:49 | |
*** hongbin has quit IRC | 04:34 | |
*** PagliaccisCloud has joined #openstack-lbaas | 04:44 | |
*** yamamoto has joined #openstack-lbaas | 04:55 | |
*** yamamoto has quit IRC | 05:40 | |
*** yamamoto has joined #openstack-lbaas | 06:20 | |
*** JudeCross has joined #openstack-lbaas | 06:25 | |
*** ccamposr has joined #openstack-lbaas | 06:34 | |
*** JudeCross has quit IRC | 06:49 | |
*** rcernin has quit IRC | 07:03 | |
*** PagliaccisCloud has quit IRC | 07:12 | |
*** pcaruana has joined #openstack-lbaas | 07:12 | |
*** JudeCross has joined #openstack-lbaas | 07:15 | |
*** yamamoto has quit IRC | 07:48 | |
*** yamamoto has joined #openstack-lbaas | 07:48 | |
*** yamamoto has quit IRC | 07:57 | |
*** rpittau has joined #openstack-lbaas | 08:06 | |
*** reedipb has joined #openstack-lbaas | 08:33 | |
reedipb | johnsom : there? | 08:33 |
*** yamamoto has joined #openstack-lbaas | 08:35 | |
*** velizarx has joined #openstack-lbaas | 08:35 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/octavia-dashboard master: Imported Translations from Zanata https://review.openstack.org/625183 | 08:55 |
*** JudeCross has quit IRC | 09:01 | |
*** Emine has joined #openstack-lbaas | 09:05 | |
*** reedipb has quit IRC | 09:15 | |
rm_work | what channel is the storyboard team in again? >_> | 09:28 |
rm_work | need to ask them about bug flags | 09:28 |
rm_work | for priority we could use tags and add like "high" or "low" or whatever | 09:29 |
rm_work | but ... | 09:29 |
rm_work | ah, maybe if you could search for "lack of a flag" that'd do it | 09:29 |
*** yamamoto has quit IRC | 09:37 | |
cgoncalves | rm_work, fields such as priority were requested from the Storyboard team a while ago IIRC. the workaround is indeed using tags, which I dislike | 09:47 |
*** salmankhan has joined #openstack-lbaas | 10:34 | |
*** PagliaccisCloud has joined #openstack-lbaas | 10:59 | |
*** rpittau is now known as rpittau|lunch | 11:10 | |
*** yamamoto has joined #openstack-lbaas | 11:15 | |
*** salmankhan has quit IRC | 12:06 | |
*** salmankhan has joined #openstack-lbaas | 12:09 | |
*** rpittau|lunch is now known as rpittau | 12:13 | |
*** pcaruana has quit IRC | 12:21 | |
*** pcaruana has joined #openstack-lbaas | 12:22 | |
*** pcaruana is now known as pcaruana|intw| | 12:25 | |
*** yamamoto has quit IRC | 12:28 | |
*** yamamoto has joined #openstack-lbaas | 12:30 | |
*** yamamoto has quit IRC | 12:30 | |
*** yamamoto has joined #openstack-lbaas | 13:05 | |
*** yamamoto has quit IRC | 13:18 | |
*** yamamoto has joined #openstack-lbaas | 13:18 | |
*** velizarx has quit IRC | 13:20 | |
*** velizarx has joined #openstack-lbaas | 14:00 | |
*** pcaruana|intw| has quit IRC | 14:05 | |
*** Emine has quit IRC | 14:16 | |
*** pcaruana has joined #openstack-lbaas | 14:31 | |
*** ccamposr has quit IRC | 14:49 | |
*** ivve has joined #openstack-lbaas | 15:05 | |
*** yangjianfeng has joined #openstack-lbaas | 15:10 | |
*** yangjianfeng has quit IRC | 15:26 | |
*** ivve has quit IRC | 15:34 | |
*** velizarx has quit IRC | 15:47 | |
*** ccamposr has joined #openstack-lbaas | 15:50 | |
*** PagliaccisCloud has quit IRC | 16:15 | |
*** pcaruana has quit IRC | 16:20 | |
*** ivve has joined #openstack-lbaas | 16:24 | |
johnsom | FYI, I have also posted patches to bump OSA to get the taskflow logging patch | 16:53 |
*** salmankhan has quit IRC | 16:56 | |
*** salmankhan has joined #openstack-lbaas | 16:57 | |
*** ccamposr has quit IRC | 17:15 | |
*** PagliaccisCloud has joined #openstack-lbaas | 17:15 | |
johnsom | reedipb Looking for me? | 17:17 |
*** rpittau has quit IRC | 17:22 | |
*** Emine has joined #openstack-lbaas | 17:36 | |
*** PagliaccisCloud has quit IRC | 17:43 | |
*** Swami has joined #openstack-lbaas | 18:26 | |
cgoncalves | I clarified Reedip's questions off channel. ovn devstack plugin is doing weird things for enabling its provider driver in octavia, they have been advised on how best to do it | 18:54 |
johnsom | Ok, cool. Thanks! | 18:54 |
cgoncalves | and second, he was asking whether it is possible, and how, to enable multiple provider drivers | 18:54 |
johnsom | Yes, we support that | 18:55 |
cgoncalves | right | 18:55 |
cgoncalves | he was setting enabled_provider_drivers option in python dict format :) | 18:55 |
cgoncalves | should be "enabled_provider_drivers: amphora:'some description', ovn:'another description'" | 18:56 |
cgoncalves | "Documentation for Octavia's OVN Driver" -- https://review.openstack.org/#/c/624937/ | 18:57 |
*** salmankhan has quit IRC | 18:57 | |
*** abaindur has joined #openstack-lbaas | 19:06 | |
*** Emine has quit IRC | 19:26 | |
*** abaindur has quit IRC | 19:54 | |
*** abaindur has joined #openstack-lbaas | 20:14 | |
*** abaindur has quit IRC | 20:15 | |
*** abaindur has joined #openstack-lbaas | 20:15 | |
openstackgerrit | German Eichberger proposed openstack/octavia master: Amphora logging https://review.openstack.org/624835 | 20:54 |
xgerman | (hopefully good to go) | 20:54 |
xgerman | next week haproxy log format to include project-id | 20:55 |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Updates Octavia to support octavia-lib https://review.openstack.org/613709 | 21:34 |
*** badloop has joined #openstack-lbaas | 21:50 | |
badloop | we are having issues with ha-proxy working intermittently for our loadbalancers (communication just stops and starts randomly with no clear reason why). where would be the best place to look for troubleshooting that | 21:51 |
badloop | it is pretty much a vanilla install of ocata | 21:51 |
*** colby_ has joined #openstack-lbaas | 22:09 | |
colby_ | Hey Guys, | 22:09 |
colby_ | I just upgraded to queens and we had octavia working under pike. Suddenly the worker and health monitor are unable to connect. We get the following: SSLError: ("bad handshake: Error([('SSL routines', 'ssl3_read_bytes', 'tlsv1 alert unknown ca')],)",) | 22:10 |
rm_work | badloop: my INITIAL guess, with no real evidence or view into your particular situation, would be that healthchecks are periodically failing for some reason and taking nodes offline | 22:10 |
colby_ | did the configs change for the certs that Im not aware of? | 22:11 |
rm_work | and by healthchecks, i mean the haproxy ones, not amphora health | 22:11 |
rm_work | colby_: i don't think so? are you sure the files didn't get swapped out or something by the upgrade process by accident? | 22:11 |
colby_ | no our SSL certs are stored outside the octavia config directory | 22:12 |
johnsom | How did you upgrade? Did the process recreate the certificates on the controllers? | 22:13 |
colby_ | no I created all the certs manually for pike. Just pointing to them in the config | 22:14 |
johnsom | I don’t think anything changed with the amp cert system. | 22:18 |
rm_work | johnsom: err, is redhat really supposed to write the vip-interface-file twice? https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/agent/api_server/osutils.py#L393-L409 | 22:19 |
johnsom | Are you using SELinux or AppArmor that is denying access to the certs? | 22:20 |
rm_work | (in the case of keepalived) | 22:20 |
rm_work | and why does keepalived on Ubuntu not handle the vip in the same way as it does in redhat? | 22:20 |
rm_work | the ubuntu version of that function is missing all of the stuff past the first write (where it says "Keepalived will handle this" | 22:21 |
rm_work | ) | 22:21 |
colby_ | johnsom: no both are not enabled. | 22:21 |
johnsom | Nir would be best to answer that, but I know RH requires more files than Ubuntu | 22:22 |
rm_work | oh, or is it that keepalived handles the same things on both, but Ubuntu's networking handles the VIP better in the non-keepalived case (and redhat does not) | 22:22 |
rm_work | yeah ok... nmagnezi maybe if you're around | 22:22 |
rm_work | ah it's a different path too, ok, yeah, an extra alias file | 22:22 |
johnsom | Ubuntu we write an int file in single mode, and not in act/stdby | 22:23 |
*** ivve has quit IRC | 22:29 | |
johnsom | colby_: that is in the worker log right? | 22:31 |
colby_ | both the worker and health manager | 22:31 |
johnsom | Yeah, somehow the certs are not aligned. I would double check the config via the startup debug logs and then compare the certs to what you get from openssl on the amp agent port. | 22:34 |
rm_work | for ipv6, the "netmask" *is* the prefixlength? | 23:14 |
rm_work | (of the network) | 23:14 |
rm_work | yeah ok read some docs, i get it | 23:18 |
rm_work | but does ipv6 have a concept of host_routes? | 23:18 |
rm_work | johnsom: ^^ | 23:18 |
rm_work | i ... feel like it wouldn't need them? | 23:19 |
johnsom | Yes, but the term “host route” is a neutron-ism | 23:19 |
johnsom | The whole netmask vs prefix thing had a problem. Someone posted a patch for that. We were not using it consistently. | 23:20 |
rm_work | right i am fixing that patch now | 23:22 |
johnsom | Awesome | 23:22 |
rm_work | almost done | 23:22 |
rm_work | patiently waiting on local tests... >_> | 23:23 |
*** PagliaccisCloud has joined #openstack-lbaas | 23:29 | |
*** yamamoto has quit IRC | 23:39 | |
*** abaindur has quit IRC | 23:53 |
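
Note on the 01:18 to 01:31 discussion of the VIP qos-policy update: the distinction rm_work and johnsom draw between a field that was never sent ("unset") and an explicit null, plus the "short-circuit if it's a real noop" idea, can be sketched roughly as below. This is not Octavia's code; `UNSET`, `qos_update_needed`, and `build_vip_update` are hypothetical names used only for illustration, and the sketch assumes the comparison runs before the new value is written to the DB (rm_work points out that by the time the worker runs, the DB has already been updated, so the check would have to live in the API layer).

```python
from typing import Optional

# Hypothetical sentinel standing in for the "unset" marker the API layer
# uses for fields that were not present in the request at all.
UNSET = object()


def qos_update_needed(requested, current: Optional[str]) -> bool:
    """Return True only when the VIP QoS policy would actually change."""
    if requested is UNSET:
        # Field omitted from the request: nothing was asked for.
        return False
    # An explicit null (None) or a UUID only matters if it differs from
    # the value currently stored for the load balancer.
    return requested != current


def build_vip_update(requested, current: Optional[str]) -> dict:
    """Build the dict of changes to hand to the network driver."""
    update = {}
    if qos_update_needed(requested, current):
        update["qos_policy_id"] = requested
    return update
```

With a check like this, sending null for a load balancer that never had a QoS policy yields an empty update, so nothing would be sent to a neutron that lacks the QoS extension. It does not replace the capability check discussed in the log, which is still needed when a user asks for a real policy.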
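
Note on the 23:14 to 23:20 netmask vs prefix exchange: for both address families the netmask and the prefix length carry the same information, and IPv6 tooling conventionally works with the prefix length. A small sketch using only the Python standard library `ipaddress` module shows the relationship; the helper name `describe` and the example CIDRs are made up for illustration and are not taken from the patch being discussed.

```python
import ipaddress


def describe(cidr: str):
    """Return (prefix length, netmask string) for an IPv4 or IPv6 CIDR."""
    net = ipaddress.ip_network(cidr)
    return net.prefixlen, str(net.netmask)


# IPv4: a /24 corresponds to the familiar dotted-quad mask.
print(describe("10.0.0.0/24"))
# IPv6: the mask exists too, but the prefix length is what gets configured.
print(describe("2001:db8::/64"))
```

Code that passes one value where the other is expected (for example a dotted mask into a field that wants an integer prefix) tends to break only on IPv6, which may be the kind of inconsistency johnsom mentions.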