14:00:50 <apuimedo> #startmeeting kuryr
14:00:51 <openstack> Meeting started Mon Jan 16 14:00:50 2017 UTC and is due to finish in 60 minutes. The chair is apuimedo. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:52 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:54 <openstack> The meeting name has been set to 'kuryr'
14:01:10 <apuimedo> Hello and welcome everybody to another Kuryr weekly IRC meeting
14:01:21 <garyloug> o/
14:01:22 <ivc_> o/
14:01:27 <yedongcan> o/
14:01:28 <apuimedo> Today vikasc and irenab have excused themselves and won't be able to join
14:01:30 <mchiappero> o/
14:02:04 <apuimedo> #topic kuryr-lib
14:03:40 <alraddarla> o/
14:03:41 <apuimedo> There's basically no news on kuryr-lib :-)
14:03:44 <apuimedo> No news, good news
14:03:47 <apuimedo> as they say
14:03:54 <apuimedo> only this patch
14:03:58 <apuimedo> #link https://review.openstack.org/#/c/418792/
14:04:19 <apuimedo> which pisses me off a bit that it came so late
14:04:45 <apuimedo> but... What can we do, we merge it now and we'll consider whether we do a point release to get the requirement upped
14:05:04 <ltomasbo> o/
14:05:46 <apuimedo> Anybody's got anything on kuryr-lib?
14:06:42 <apuimedo> very well!
14:06:44 <apuimedo> Moving on
14:06:48 <apuimedo> #topic kuryr-libnetwokr
14:06:51 <apuimedo> darn!!!
14:06:54 <apuimedo> typo
14:06:56 <mchiappero> eheh
14:06:58 <apuimedo> #topic kuryr-libnetwork
14:07:15 <apuimedo> ltomasbo: can you update us on the race?
14:07:25 <ltomasbo> Hi
14:07:40 <ltomasbo> the problem is that, after calling to attach the (sub)port to the trunk
14:07:49 <ltomasbo> we (kuryr-libnetwork) call update_port
14:07:59 <ltomasbo> and it seems attach_subport also calls update_port internally
14:08:05 <ltomasbo> therefore, there is a race there
14:08:12 <ltomasbo> and sometimes the device_owner is not properly set
14:08:29 <ltomasbo> which makes kuryr-libnetwork unable to remove the port after removing the container
14:08:53 <ltomasbo> since there is a filter by device_owner before removing the port
14:09:10 <apuimedo> #link https://review.openstack.org/#/c/419028/
14:09:41 <ltomasbo> I've been discussing with armax the possibility of reverting the patch that sets device_owner for trunk ports
14:10:04 <ltomasbo> another easy fix could be to remove the device_owner filter we do on ipam_release before calling remove_port
14:10:22 <apuimedo> #info there is a race in the container-in-VM flow due to the subport addition's usage of device_owner
14:11:06 <mchiappero> I'm sorry, but I'm not sure I've understood the root cause of the race
14:11:15 <apuimedo> ltomasbo: it is kind of a nice service we provide our users to mark which ports we manage for them automatically
14:11:26 <ltomasbo> apuimedo, yes, I agree
14:11:28 <apuimedo> so we should try to preserve it as much as possible
14:11:46 <apuimedo> mchiappero: basically we update the device_owner field to kuryr:container
14:11:50 <ltomasbo> the problem is that the patch I proposed to revert is setting a different device_owner on the subports
14:11:57 <apuimedo> and trunk_subport_add does the same behind the scenes
14:12:07 <ltomasbo> therefore we call attach_subport and then update_port
14:12:18 <ltomasbo> but attach_subport internally calls update_port too
14:12:24 <apuimedo> so IIRC, if the trunk_subport_add op finds it changed before it finishes, it goes boom. Is that right ltomasbo?
14:12:27 <ltomasbo> and both update_ports set a different device_owner
14:12:47 <apuimedo> or is it only that it breaks our removal?
14:13:12 <ltomasbo> we need to call trunk_subport_add first and then update_port
14:13:26 <ltomasbo> since trunk_subport_add will fail if device_id is already set on the port
14:13:55 <ltomasbo> but trunk_subport_add also calls update_port to set the device_owner to trunk:subport
14:14:14 <ltomasbo> while we call update_port from kuryr-libnetwork to set device_owner to kuryr:container
14:14:15 <ivc_> ltomasbo wait a sec, if trunk_subport_add fails, does that mean that it does not support device_id? or is it a bug in neutron?
14:14:42 <ltomasbo> with the current implementation, there is no problem with device_id, as we call trunk_subport_add first
14:14:52 <ltomasbo> the problem is that, as device_owner is not set to kuryr:container
14:15:00 <ltomasbo> the port will not be deleted after the container is removed
14:15:30 <ltomasbo> and the fact that trunk_subport_add fails if device_id is already set is not a bug
14:15:32 <ivc_> what is the device_owner then? if not 'kuryr:container'?
14:15:33 <ltomasbo> it is the way it should work
14:15:55 <ltomasbo> if the port is already in use, it should not be made part of a trunk
14:16:01 <apuimedo> ivc_: no, it is trunk:subport
14:16:08 <ltomasbo> device_owner is set to trunk:subport
14:16:08 <apuimedo> sometimes, depending on the race
14:16:42 <ivc_> apuimedo thats what i see as the problem. neutron trunk code sets it to 'trunk:subport' and we reset it to 'kuryr:container', right?
14:16:55 <ivc_> i think we are breaking some neutron contract here
14:16:56 <apuimedo> ivc_: I think so
14:17:03 <apuimedo> no, no contract
14:17:05 <ltomasbo> yes
14:17:17 <mchiappero> yes, no? :D
14:17:22 <apuimedo> they only set it to trunk:subport for no specific reason
14:17:25 <ltomasbo> they just tagged subports as trunk:subport for simplicity
14:17:33 <ltomasbo> there is no real need for that in neutron
14:17:36 <ivc_> neutron might expect it to be 'trunk:subport'
14:17:48 <ivc_> maybe it's not used now, but we should not rely on it
14:17:59 <apuimedo> ivc_: mchiappero: "Altough this is not currently required by any of the business logic, it is handy to have this set to help users quickly identify ports used in trunks"
14:18:11 <apuimedo> This is the justification for setting trunk:subport in the original patch
14:18:16 <apuimedo> nothing uses this fact
14:18:19 <ivc_> 'currently'
14:18:36 <apuimedo> ivc_: this 'currently' has not changed
14:18:42 <apuimedo> and there's no good reason it should
14:18:56 <ivc_> at some point neutron could add business logic that would rely on it
14:18:57 <mchiappero> I agree with using kuryr as device owner, but still, I don't fully understand whether it's a timing issue or what
14:18:57 <ltomasbo> we can modify (I'm waiting for armax's answer) the way TRUNK_SUBPORT_OWNER sets the device_owner
14:19:04 <apuimedo> ivc_: I agree with you on that
14:19:08 <apuimedo> I find it misguided
14:19:15 <ltomasbo> and make it possible to not set it to anything (for the kuryr case)
14:19:59 <apuimedo> ivc_: I honestly just don't see the point of setting it to trunk:subport
14:20:09 <ltomasbo> neither do I
14:20:14 <apuimedo> when it is something that can be checked in the API, that it belongs to a trukn
14:20:16 <apuimedo> *trunk
14:20:21 <ltomasbo> I think it is just to easily find the subports
14:20:24 <ivc_> apuimedo i understand that, but thats the current api
14:20:26 <apuimedo> but now the damage is done
14:20:29 <apuimedo> exactly
14:21:08 <ivc_> imo best course of action is to update the neutron trunk api in a way that would allow us to legitimately set device_owner
14:21:16 <ltomasbo> yes, it seems reverting that could affect existing deployments
14:21:23 <ivc_> excluding the potential conflict between trunk code and kuryr
14:21:26 <apuimedo> it is quite annoying, but we'll probably have to consider whether we do not mark subports, or use tags, or something else
14:21:38 <apuimedo> ivc_: that was my first thought
14:21:46 <apuimedo> to extend the api of trunk_subport_add
14:21:51 <apuimedo> so that you can pass it a device_owner
14:22:10 <apuimedo> (which by the way saves us one neutron call :P )
14:22:15 <ivc_> yup
14:22:29 <apuimedo> ltomasbo: did you propose that to armax?
14:22:30 <ltomasbo> an easy modification could be to just change the config option, and allow not setting any device_owner
14:22:35 <apuimedo> Is there a mailing list thread for that?
14:22:38 <ltomasbo> without needing to modify the API
14:22:43 <ltomasbo> which is always trickier
14:23:02 <apuimedo> ivc_: the problem with extending the API is that we're probably too close to freeze or already frozen
14:23:03 <ltomasbo> it is being discussed on the revert patch:
14:23:09 <ltomasbo> https://review.openstack.org/#/c/419028
14:23:12 <apuimedo> ltomasbo: gotcha
14:23:49 <ivc_> -2 ...
14:23:55 <ltomasbo> :D
14:24:09 <ltomasbo> I know! But still discussing with armax
14:24:20 <apuimedo> ltomasbo got the hammer of justice!
14:24:28 <ltomasbo> and I agree the revert may not be doable, but we are using it to drive the discussion
14:24:37 <ltomasbo> and then (I suppose) another solution will be proposed
14:24:55 <ltomasbo> I see the other option as not setting TRUNK_SUBPORT_OWNER
14:25:02 <apuimedo> ltomasbo: from the commit message discussion I see an interesting option
14:25:16 <ltomasbo> and making the code not set the device_owner on trunk_add_subport
14:25:20 <apuimedo> we could probably argue for leaving the device_owner unchanged if it is not None
14:25:22 <ltomasbo> it would be just a couple of lines
14:25:25 <ivc_> so how tricky would it be to keep the 'trunk:subport' owner? do we have some sort of workaround?
14:25:30 <apuimedo> and move the update before the trunk_port_add
14:25:37 <apuimedo> *trunk_subport_add
14:25:44 <mchiappero> sorry but I guess there is no guarantee on the serialization of the operations in neutron
14:25:53 <mchiappero> but shouldn't it be that way for the same port?
14:26:09 <mchiappero> shouldn't actions on the same port be serialized?
14:26:20 <mchiappero> wouldn't this solve the issue?
14:26:30 <apuimedo> mchiappero: not sure
14:26:37 <ivc_> mchiappero afaik they are. as soon as you get confirmation for your request, it is committed
14:27:08 <ltomasbo> yes, but calls are async, so they can be executed in different orders
14:27:09 <mchiappero> so in this case our update_port could get confirmed first, right?
14:27:15 <apuimedo> ivc_: re keeping trunk:subport: that would break our contract of marking our resources
14:27:19 <ivc_> the problem is not the race as i understand it, but just the conflict between kuryr's port update and the trunk logic
14:27:32 <ltomasbo> yes and no
14:27:42 <apuimedo> ivc_: For me the race is just a symptom
14:27:57 <ltomasbo> if it ensures that our call is made after trunk_add_subport has fully finished
14:28:01 <mchiappero> apuimedo: right
14:28:05 <ltomasbo> then from kuryr's point of view, that will work
14:28:14 <ltomasbo> but I agree with apuimedo
14:28:17 <hongbin> o/
14:28:21 <ltomasbo> the problem is the use of device_owner
14:28:29 <apuimedo> exactly
14:28:33 <ivc_> yup
14:28:46 <mchiappero> I would expect neutron to perform some ordering or take some port-specific lock
14:28:48 <apuimedo> I would really like to have the extra parameter in trunk_subport_add
14:29:06 <ltomasbo> but that will not solve the problem
14:29:15 <ltomasbo> if we set device_owner to whatever we want
14:29:19 <ltomasbo> we can still do that
14:29:23 <ltomasbo> but the problem will be the same
14:29:36 <ltomasbo> not a unified view of what device_owner should be about
14:29:57 <apuimedo> ltomasbo: recognizing that it can be set to something else kind of forces Neutron to acknowledge that this field is informative for them and not for logic
14:30:06 <ivc_> ltomasbo, toni's point is if we get 'device_owner' as part of the 'trunk_subport_add' api, it would make the 'kuryr:container' device owner legit
14:30:18 <ltomasbo> got it
14:30:22 <apuimedo> ivc_: or a "use at your own risk"
14:30:34 <apuimedo> depends on the wording in the method doc
14:30:39 <apuimedo> that gets merged
14:31:09 <ltomasbo> of course, for kuryr deployments we can state that TRUNK_SUBPORT_OWNER should be set to kuryr:container
14:31:19 <mchiappero> I still don't fully understand: does setting the owner after a fully finished and successful trunk_subport_add work?
14:31:21 <ltomasbo> and that will solve the problem, at the expense of flexibility
14:31:23 <apuimedo> ivc_: I didn't want to say it. But the right fix, things being what they are now, would be to have multiple owners
14:31:39 <apuimedo> but that is even more API breaking :/
14:31:52 <apuimedo> mchiappero: it does
14:31:59 <mchiappero> ok, so the problem is neutron
14:32:05 <apuimedo> but it makes us tread in unsafe waters
14:32:19 <mchiappero> that's something for them to fix
14:32:31 <apuimedo> in that if neutron subport-related code started relying on this (like for upgrades)
14:32:39 <apuimedo> it could render our subports useless
14:33:12 <apuimedo> the fix I'd like to avoid, but that would work right away
14:33:18 <ivc_> apuimedo i think multiple owners would only add confusion. the 'owner' should be unique, but the device_owner field should not be used for storing 'informative' date as is the case with 'trunk:subport'
14:33:30 <ivc_> data*
14:33:32 <mchiappero> ivc_: agree
14:33:32 <apuimedo> is to use a new tag instead of the device_owner
14:34:21 <ivc_> imo kuryr is the real owner in this case
14:34:32 <apuimedo> ivc_: I agree with you, if Neutron wanted this informative marker so much, to avoid checking in the DB whether it was a subport, it could have added a field for that
14:34:39 <apuimedo> ivc_: no question about that
14:34:45 <apuimedo> but we need a pragmatic solution
14:35:06 <apuimedo> let's wait to hear what ltomasbo gets from armax on proposing a new parameter for the trunk operations
14:35:47 <ltomasbo> would you agree with/like the other solution? just stopping trunk_add_subport from writing device_owner?
14:35:51 <ivc_> the problem is HCF
14:35:56 <apuimedo> #action ltomasbo to continue discussion with armax, proposing that trunk_subport_add optionally receive an owner name via the API
14:36:13 <apuimedo> ltomasbo: disabling it how?
14:36:16 <apuimedo> ivc_: HCF?
14:36:23 <ivc_> hard code freeze
14:36:41 <ltomasbo> as they just set whatever value is in TRUNK_SUBPORT_OWNER in the config.py file
14:36:45 <ltomasbo> just setting that to None
14:37:02 <ltomasbo> and if it is set to None, then just don't call update_port
14:37:15 <ltomasbo> and if it is set to whatever it is, keep working as it is
14:37:22 <ltomasbo> so that it does not break current deployments
14:37:32 <apuimedo> ltomasbo: that would apply to all the ports, not just those used by kuryr
14:37:44 <ltomasbo> only to subports
14:38:00 <apuimedo> right, but a user may use subports for other purposes
14:38:20 <ivc_> apuimedo ltomasbo if we look at it from a different perspective, do we need 'device_owner' for cleanup only?
14:38:38 <ltomasbo> we don't even really need it
14:38:45 <ltomasbo> it is just a filter to speed up the search
14:38:55 <apuimedo> ivc_: we use it for cleanup. But its main purpose is to notify users that the port is automatically handled by kuryr
14:39:10 <apuimedo> it was sort of... Since it is already there, let's use it
14:39:47 <ltomasbo> to me, it makes sense, as it is the kuryr service creating/managing them
14:39:49 <mchiappero> I used it a lot while working on ipvlan
14:39:50 <ivc_> as much as i dislike special-casing, maybe then we could have a special case for 'trunk:subport' that would fetch the ports for kuryr-managed nodes somehow
14:39:58 <mchiappero> i often had leftovers
14:40:10 <ltomasbo> while tagging that it is a subport should not go in device_owner, but in device_type or something like that if they need it
14:40:13 <ivc_> ofc until we get a proper api update on the neutron side
14:41:36 <apuimedo> ivc_: let's wait a week to see what the Neutron people say
14:41:48 <apuimedo> and then we can decide on contingency measures
14:41:50 <ivc_> sure
14:42:32 <apuimedo> anything else about kuryr-libnetwork?
14:43:34 <apuimedo> very well
14:43:35 <apuimedo> moving on
14:43:42 <apuimedo> #topic fuxi
14:43:50 <apuimedo> #chair hongbin
14:43:51 <openstack> Current chairs: apuimedo hongbin
14:43:53 <hongbin> hi
14:44:06 <hongbin> in the last week, there were several proposed fixes
14:44:17 <hongbin> #link https://review.openstack.org/#/c/419767/
14:44:19 <apuimedo> hongbin: today I was asked about fuxi on magnum. Do we have some docs on that? Or does it only target bare metal?
14:44:39 <hongbin> apuimedo: i am happy to explore fuxi on magnum
14:44:56 <hongbin> apuimedo: it is definitely one of the targets
14:45:03 <hongbin> apuimedo: there are several things that need to be done
14:45:13 <hongbin> apuimedo: 1. containerized fuxi
14:45:19 <hongbin> apuimedo: 2. trust support
14:45:34 <hongbin> apuimedo: then, we are ready to propose it to magnum
14:45:57 <apuimedo> cool
14:46:04 <apuimedo> sorry for the interruption
14:46:09 <hongbin> apuimedo: np
14:46:09 <apuimedo> :-)
14:46:24 <hongbin> yes, to continue,
14:46:36 <hongbin> i was trying to move fuxi to py35
14:46:43 <hongbin> #link https://review.openstack.org/#/c/419683/
14:47:03 <hongbin> the last one, i have a pov for making multi-tenancy support
14:47:12 <hongbin> #link https://review.openstack.org/#/c/420386/
14:47:25 <hongbin> all of those are under review, feedback is appreciated
14:47:31 <hongbin> apuimedo: that is all from my side
14:47:51 <apuimedo> hongbin: I suppose swift's failure with py3 is reported as a bug, right?
14:47:55 <apuimedo> thanks hongbin
14:48:02 <apuimedo> #topic kuryr-kubernetes
14:48:09 <hongbin> apuimedo: yes, we worked around swift
14:49:04 <apuimedo> #info ivc_ is a new core for Kuryr-kubernetes! Congratulations!
14:49:19 <ivc_> thanks! :)
14:49:20 <apuimedo> ivc_: and now that you are congratulated, pls review https://review.openstack.org/#/c/419933/ for merge :P
14:50:00 <apuimedo> #info vikasc reported that he is finishing addressing ltomasbo and ivc_'s comments on https://review.openstack.org/#/c/410578/
14:50:29 <apuimedo> hongbin: we could appreciate help in drafting a plan to integrate kuryr-kubernetes with Magnum once that patch is merged
14:50:48 <apuimedo> a TODO list or something like that
14:50:57 <hongbin> apuimedo: i can try
14:51:04 <apuimedo> s/could/would/
14:51:05 <ltomasbo> nice, I'm trying to follow the instructions to set it up, but so far no luck with it
14:51:05 <apuimedo> :-)
14:51:07 <apuimedo> thanks hongbin
14:51:19 <apuimedo> ltomasbo: you mean vikasc's patch?
14:51:23 <ltomasbo> yep
14:51:31 <ltomasbo> using the devstack templates
14:51:33 <apuimedo> ltomasbo: ping him in the morning then :P
14:51:40 <ltomasbo> I'll do
14:51:43 <apuimedo> cool
14:51:51 <apuimedo> ivc_: any news on the services front?
14:52:07 <ivc_> nope
14:52:31 <apuimedo> very well
14:52:36 <ltomasbo> servivcs == lbaas/octavia?
14:52:43 <ltomasbo> s/servivcs/services
14:52:44 <apuimedo> ltomasbo: neutron-lbaasv2
14:52:55 <apuimedo> we should add octavia after that
14:53:05 <ltomasbo> is there a patch on that already? I would like to take a look
14:53:06 <ivc_> ltomasbo i think toni is referring to the split of https://review.openstack.org/#/c/376045/ :)
14:53:10 <apuimedo> it should be a matter of changing the driver
14:53:28 <ltomasbo> great!
14:53:30 <apuimedo> ltomasbo: there is a patch, the one ivc_ links to. However, it needs a bit of splitting and UT
14:53:54 <ltomasbo> should that include the floating ip support too (in a follow-up patch)?
14:54:12 <apuimedo> let me check
14:54:27 <ltomasbo> (maybe it already does...)
14:54:32 <ivc_> ltomasbo it does not
14:54:33 <ltomasbo> just asking...
14:54:39 <ltomasbo> ok
14:55:01 <apuimedo> I was checking if you could define an externalIP for a pod
14:55:04 <apuimedo> apparently you can't
14:55:12 <apuimedo> so yeah, it should be a follow-up patch
14:55:53 <apuimedo> anybody's got anything about kuryr-kubernetes?
14:56:20 <ivc_> apuimedo ltomasbo that's a rather trivial change technically, but i'm not yet certain if floating ip is the right fit for external IP
14:57:11 <apuimedo> ivc_: why?
14:57:38 <ltomasbo> I think it is something you can add to the VIP in lbaas, so that the loadbalancer can be reached from outside
14:58:09 <ivc_> ltomasbo apuimedo because there's also the 'loadbalancer' type service
14:58:58 <ivc_> my understanding was that external ip (https://kubernetes.io/docs/user-guide/services/#external-ips) from the k8s point of view is an IP configured on the node's interface
14:59:04 <sigmavirus> apuimedo: y'all wrapping up?
14:59:11 <apuimedo> sigmavirus: we are
14:59:15 <apuimedo> sorry about that
14:59:21 <sigmavirus> No problem :)
14:59:27 <apuimedo> let's move to the channel
14:59:32 <apuimedo> thank you all for joining
14:59:36 <apuimedo> #endmeeting
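
For reference, below is a minimal sketch of the ordering ltomasbo describes in the kuryr-libnetwork discussion above, assuming python-neutronclient's trunk_add_subports, update_port, list_ports and delete_port calls. The helper function names, the 'kuryr:container' owner string and the cleanup filter are illustrative of the flow being discussed, not the actual kuryr-libnetwork code.

    # A minimal sketch of the race discussed in the meeting, assuming a
    # python-neutronclient Client instance is passed in as 'neutron'
    # (client construction and auth are omitted here).
    from neutronclient.v2_0 import client as neutron_client  # noqa: F401

    def attach_container_subport(neutron, trunk_id, port_id, vlan_id):
        # Step 1: add the subport to the VM's trunk. Neutron's trunk code also
        # updates the port internally, setting device_owner='trunk:subport'.
        neutron.trunk_add_subports(trunk_id, {
            'sub_ports': [{'port_id': port_id,
                           'segmentation_type': 'vlan',
                           'segmentation_id': vlan_id}]})
        # Step 2: mark the port as kuryr-managed. If Neutron's internal update
        # from step 1 lands after this one, the port keeps
        # device_owner='trunk:subport' instead of 'kuryr:container'.
        neutron.update_port(port_id,
                            {'port': {'device_owner': 'kuryr:container'}})

    def release_container_port(neutron, port_id):
        # Cleanup filters on the kuryr owner, so a port that lost the race in
        # attach_container_subport() is never matched here and leaks.
        ports = neutron.list_ports(id=port_id,
                                   device_owner='kuryr:container')['ports']
        for port in ports:
            neutron.delete_port(port['id'])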