*** celebdor has quit IRC | 00:15 | |
*** rh-jelabarre has quit IRC | 01:29 | |
*** gkadam has joined #openstack-kuryr | 03:18 | |
*** gkadam_ has joined #openstack-kuryr | 03:28 | |
*** gkadam has quit IRC | 03:31 | |
*** kiseok7 has quit IRC | 04:16 | |
*** premsankar has quit IRC | 05:01 | |
*** janki has joined #openstack-kuryr | 06:05 | |
openstackgerrit | Ashish Billore proposed openstack/kuryr-kubernetes master: Update HA doc with corections and minor fixes https://review.openstack.org/635446 | 06:25 |
---|---|---|
openstackgerrit | Ashish Billore proposed openstack/kuryr-kubernetes master: Update and add the docstrings to the functions https://review.openstack.org/627511 | 06:32 |
*** ccamposr has joined #openstack-kuryr | 07:06 | |
*** pcaruana has joined #openstack-kuryr | 07:23 | |
*** maysams has joined #openstack-kuryr | 07:29 | |
*** takamatsu has joined #openstack-kuryr | 08:00 | |
*** celebdor has joined #openstack-kuryr | 08:18 | |
openstackgerrit | Merged openstack/kuryr-kubernetes master: Update HA doc with corections and minor fixes https://review.openstack.org/635446 | 09:37 |
*** EricAdamsZNC has quit IRC | 09:56 | |
*** EricAdamsZNC has joined #openstack-kuryr | 10:00 | |
*** aperevalov has joined #openstack-kuryr | 10:18 | |
*** takamatsu_ has joined #openstack-kuryr | 11:20 | |
*** takamatsu has quit IRC | 11:20 | |
*** ccamposr has quit IRC | 11:20 | |
*** mrostecki has quit IRC | 13:06 | |
*** mrostecki has joined #openstack-kuryr | 13:06 | |
*** rh-jelabarre has joined #openstack-kuryr | 13:34 | |
maysams | ltomasbo, dulek: Could you take a look again whenever possible? https://review.openstack.org/#/c/635039/ | 13:43 |
ltomasbo | maysams, sure! | 13:43 |
maysams | ltomasbo, thanks | 13:44 |
dulek | Looking! | 13:44 |
ltomasbo | maysams, regarding this https://review.openstack.org/#/c/634674/6/kuryr_kubernetes/controller/drivers/vif_pool.py | 13:45 |
ltomasbo | maysams, yuou mean doing list.extend() for each port_list in the pool, right? | 13:46 |
ltomasbo | maysams, I'm actually not sure what will be faster... checking the ports or copying it over to a new array | 13:47 |
ltomasbo | dulek, any preferences? | 13:47 |
maysams | pool_members.extend(port_list) | 13:48 |
dulek | ltomasbo: Do we even have performance issue here? Let's worry once we have it. ;) | 13:48 |
ltomasbo | dulek, and what option do you prefer? | 13:49 |
ltomasbo | the extend or the list comprehension? | 13:49 |
ltomasbo | (I don't have strong opinion... and not even sure performance wise what is better | 13:49 |
dulek | ltomasbo: Extension will be faster here, IIUC. | 13:50 |
ltomasbo | cool, I'll update it then | 13:50 |
dulek | ltomasbo: List comprehension will be a generator in Python 3, won't it? | 13:50 |
* dulek needs to check. :D | 13:50 | |
dulek | ltomasbo: Nooope, it won't be. | 13:51 |
ltomasbo | :) | 13:52 |
dulek | ltomasbo: Okay, so lists in Python are arrays. | 13:55 |
dulek | ltomasbo: That means this shouldn't really matter too much, in any case we'll iterate. | 13:55 |
dulek | Clarification - lists in CPython are arrays. ;) | 13:55 |
dulek | maysams: Okay, so you mean that on_present with updated annotation never happens? | 14:01 |
maysams | dulek, yes | 14:03 |
dulek | maysams: This would be really weird… | 14:03 |
maysams | dulek, I tested removing the k8s API requests that I added and just raise a resourceNotReady | 14:04 |
maysams | the endpoint received on on_present always had the same resourceVersion | 14:04 |
maysams | so it would keep raising the exception until the controller dies | 14:05 |
maysams | dulek, and by getting from the API this is avoided | 14:05 |
maysams | dulek, and yes.. I do agree it's really weird | 14:06 |
ltomasbo | maysams, but what if you simply return? instead of the resourceNotReady? then once the other handlers annotates the endpoints spec, a new on-present will be triggered, right? | 14:13 |
ltomasbo | maysams, I guess the difficult part will be when to wait for the other handler to update the state | 14:14 |
maysams | will try that ltomasbo | 14:20 |
maysams | ty! | 14:22 |
aperevalov | dulek, hello. I would like to ask you about issues mentioned in https://docs.openstack.org/kuryr-kubernetes/latest/devref/high_availability.html, with orphaned openstack ports. | 14:33 |
aperevalov | Looks like it's not yet fixed. Does anybody start this task, is there any bug for it on launchpad? | 14:35 |
dulek | aperevalov: yboaron looked at creating some kind of garbage colector a while ago, but never progressed with that. And I don't think he'll be able to do so now due to some priorities changes. | 14:36 |
dulek | aperevalov: Why asking? Are you hitting those issues or want to work on them? | 14:38 |
aperevalov | Yes, looks like it rare case, but it's not only HA issue, it could happen during development and testing. Do you know details, is it some kind of watcher inside controller/cni daemon or separate process? | 14:38 |
dulek | aperevalov: I would put it as periodic_task inside kuryr-controller. | 14:40 |
aperevalov | I'm asking because of two reason 1) our team want to know more about it 2) personally I faced with issue of orphaned port during development (it's not for production). I clear ports just with bash script. | 14:40 |
aperevalov | dulek, thank you very much for info ) | 14:41 |
dulek | aperevalov: No problem. If you want to work on this, I'm happy to discuss implementation details once there's some design. :) | 14:42 |
*** janki has quit IRC | 14:50 | |
aperevalov | Also, some periodic_task which updates information about openstack network/subnet would be interesting. Currently we keep physnet->subnet_id mapping and use it in controller driver as well as physnet->PF mapping which used in binding driver. So if user introduce new sriov subnet we need to add it into kuryr.conf and restart cni/controller, but it would be flexible to obtain such information in kuryr-kubernetes dynamically. | 15:05 |
aperevalov | Straightforward approach, when we do request in drivers is not perfect due to performance penalty. So it's better to keep such information updated. | 15:06 |
openstackgerrit | Maysa de Macedo Souza proposed openstack/kuryr-kubernetes master: Fix SG rules on targetPort update https://review.openstack.org/635039 | 15:24 |
maysams | ltomasbo, dulek: now it works without the API requests ^ | 15:28 |
dulek | maysams: Magic? | 15:29 |
dulek | maysams: So now it's not raising an exception and just ignores events that aren't matching? | 15:31 |
maysams | dulek: haha just added the targetPort when checking if should ignore the event | 15:31 |
maysams | https://review.openstack.org/#/c/635039/8/kuryr_kubernetes/controller/handlers/lbaas.py@320 | 15:31 |
maysams | dulek: yes.. and when the lbaas spec handler finish to process the lbaas handler will be called again | 15:31 |
*** pcaruana has quit IRC | 15:31 | |
dulek | maysams: I need to put on my anti-radiation suit and jump into watcher.py code to understand why there's a difference when we raise an exception… | 15:32 |
dulek | maysams: As I'm pretty sure that code in watcher.py isn't good for my health. | 15:33 |
dulek | Mostly mental. | 15:33 |
maysams | XD | 15:33 |
ltomasbo | xD | 15:36 |
ltomasbo | maysams, so, using the target port to decide on changes helped? | 15:36 |
ltomasbo | maysams, and even made the patch smaller! \o/ | 15:37 |
ltomasbo | maysams, going to give it a try! | 15:37 |
maysams | ltomasbo: yes, it helped :) | 15:38 |
*** mrostecki has quit IRC | 15:56 | |
openstackgerrit | Luis Tomas Bolivar proposed openstack/kuryr-kubernetes master: Pools support with Network Policies https://review.openstack.org/634674 | 16:18 |
*** aperevalov has quit IRC | 16:45 | |
dulek | ltomasbo: That's second board: https://jira.coreos.com/secure/RapidBoard.jspa?rapidView=217&view=planning.nodetail | 16:49 |
dulek | Whooops, wrong channel. ;) | 16:49 |
dulek | maysams: Want to know what's the difference between raising ResourceNotReady and returning? | 17:02 |
dulek | maysams, ltomasbo: https://github.com/openstack/kuryr-kubernetes/blob/60213747a3eadf93437d96caf59530e28b847bb7/kuryr_kubernetes/controller/handlers/pipeline.py#L55-L58 | 17:02 |
dulek | If maysams was raising ResourceNotReady, the event was retried with exponential backoff. | 17:03 |
dulek | And this is blocking for a single path, IIUC. | 17:04 |
dulek | So until that event was processed no new event could be handled. | 17:04 |
dulek | And as maysams made it always raise ResourceNotReady, we pretty quickly made kuryr-controller to set unhealthy in probes. | 17:04 |
dulek | At least I can +2 the patch as I understand what happened. :D | 17:05 |
maysams | and so that event would keep retrying with the same resource version | 17:05 |
dulek | maysams: With exactly the same event. | 17:05 |
maysams | dulek, yup | 17:05 |
dulek | maysams: ResourceNotReady is supposed to be used when we wait for something else to get updated. | 17:05 |
dulek | maysams: Oh, damn, you also add something to the o.vo… We also need to consider upgrades here. | 17:06 |
dulek | maysams: Basically we can expect that after Kuryr gets upgraded there will be older annotations and Kuryr should still work. | 17:06 |
maysams | dulek: hmmm.. I see | 17:07 |
dulek | maysams: So line 320 of lbaas.py is potentially dangerous, right? | 17:08 |
dulek | Because targetPort may not be there. | 17:09 |
maysams | dulek, yes | 17:10 |
maysams | dulek: I need to rework this | 17:11 |
dulek | maysams: A lot? :P | 17:13 |
dulek | maysams: It's totally fair that in case targetPort is missing, we fetch it from K8s API. | 17:16 |
dulek | I mean it's missing in annotation. | 17:16 |
dulek | maysams: Wait, we should even have it, as it should be in the event we're processing, right? | 17:17 |
dulek | And you can easily assume that there were no NP before the upgrade. | 17:17 |
dulek | So it's not that bad, we just need a fall back. | 17:17 |
maysams | dulek: you mean that we should give a default value if the lbaas_spec port does not contains the targetPort? | 17:24 |
dulek | maysams: Where is targetPort taken from initially? | 17:25 |
dulek | maysams: Service definition, right? | 17:25 |
maysams | from the lbaas_spec annotation | 17:26 |
maysams | in the service, right | 17:26 |
dulek | maysams: Yeah, but initially it's copied from Service. | 17:28 |
dulek | maysams: So if it's missing from annotated LBaaSSpec we can fetch the Service and fill it. | 17:29 |
dulek | maysams: We can just check the version of the object we load. | 17:29 |
dulek | maysams: Am I right that we can get this from Service in K8s API? | 17:29 |
dulek | I mean Service spec, not any annotations. | 17:30 |
maysams | dulek, we can, and also we get it from the endpoints | 17:31 |
maysams | https://review.openstack.org/#/c/635039/8/kuryr_kubernetes/controller/handlers/lbaas.py@317 | 17:31 |
maysams | as we already have it there | 17:32 |
*** gkadam_ has quit IRC | 17:32 | |
dulek | maysams: Oh, okay. So whatever works better for you. | 17:33 |
dulek | maysams: Because alternative could be to increment o.vo version of LBaaSServiceSpec as well. | 17:33 |
dulek | And then when loading the annotation obj_make_compatible() can take care of fetching additional data. | 17:34 |
dulek | Nah, the latter is a bit too complicated, o.vo layer shouldn't have access to K8s API. | 17:34 |
dulek | maysams: Let me just check how to test for value in o.vo being set… | 17:35 |
maysams | dulek, okay | 17:36 |
dulek | maysams: Something like this should work: `port.obj_attr_is_set('targetPort')`. | 17:37 |
dulek | maysams: Another thing to take care of is to be able to drop that compatibility code one day. | 17:38 |
dulek | maysams: Imagine that Kuryr is in version X, gets updated to X+1 which requires targetPort, but does that fallback, then gets updated to X+2 that drops the fallback. | 17:39 |
maysams | yes | 17:39 |
dulek | We need to make sure that before upgrading to X+2 we have all the annotations converted to have targetPort. | 17:39 |
dulek | maysams: https://github.com/openstack/kuryr-kubernetes/blob/aa5ec451f79fc94d99f1c07e553240753bc4df74/kuryr_kubernetes/cmd/status.py#L150-L194 | 17:40 |
dulek | maysams: We have something like this for that purpose. But let's implement conversion tool in a follow up patch, I think I can do that. :) | 17:41 |
maysams | very well, thanks a lot for pointing this out dulek | 17:41 |
dulek | maysams: No problem. I know this sucks, but being able to live upgrade is always hard. | 17:42 |
maysams | dulek, indeed | 17:42 |
dulek | maysams: And at least we only care about K8s API and not RPC, SQL (MariaDB vs MySQL vs PostgreSQL quirks) and REST API. xD | 17:43 |
maysams | dulek, thanks God | 17:44 |
maysams | dulek, did you work on some project that required all that? | 17:44 |
dulek | maysams: Yeah, OpenStack Cinder. :D | 17:45 |
maysams | dulek, ohh ok ok.. I imagined you had lots of fun :P | 17:45 |
maysams | s/imagined/imagine | 17:45 |
dulek | maysams: Yup, needed to write this doc: https://docs.openstack.org/cinder/latest/contributor/rolling.upgrades.html | 17:46 |
maysams | dulek, nice | 17:47 |
dulek | maysams: I think I'll start the weekend now, if we continue talking about rolling upgrades I might have nightmares later on. ;) | 17:49 |
dulek | Have a great weekend! | 17:50 |
maysams | dulek: hahah | 17:50 |
*** gcheresh_ has joined #openstack-kuryr | 17:50 | |
maysams | dulek, a nice weekend to you too! | 17:50 |
*** premsankar has joined #openstack-kuryr | 18:07 | |
*** maysams has quit IRC | 18:07 | |
*** gcheresh_ has quit IRC | 18:08 | |
*** rh-jelabarre has quit IRC | 23:39 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!