18:04:21 #startmeeting networking_policy
18:04:21 Meeting started Thu Dec 10 18:04:21 2015 UTC and is due to finish in 60 minutes. The chair is SumitNaiksatam. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:04:22 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:04:24 The meeting name has been set to 'networking_policy'
18:04:41 #info agenda https://wiki.openstack.org/wiki/Meetings/GroupBasedPolicy#Dec_10th_2015
18:04:48 #topic Bugs
18:05:13 #link https://bugs.launchpad.net/group-based-policy/+bug/1523733
18:05:13 Launchpad bug 1523733 in Group Based Policy "GBP PT delete allows to delete PT , even though the PT is bound to VM" [Undecided,New]
18:05:34 there was some discussion on this, i am not sure if we reached a conclusion
18:06:07 this bug also states that - "In other way also, if the port associated with PT gets deleted, it triggers PT deletion also."
18:06:31 has anyone else observed this?
18:07:26 SumitNaiksatam: I don't see any way to prevent a port from being deleted, other than making it owned by some other project.
18:08:09 rkukura: that part i agree with
18:08:35 rkukura: but the bug report is also saying that on deleting the port, the PT which was associated with it got deleted
18:08:41 hi
18:08:47 that seems weird
18:09:09 Do we have a foreign key to the port with on_delete=cascade?
18:09:19 SumitNaiksatam: I think this behavior probably comes from the apic gbp driver?
18:09:36 SumitNaiksatam: I don't recall seeing this with the RMD earlier
18:09:41 rkukura: mageshgv needs to check
18:09:49 mageshgv: yeah
18:10:44 rkukura: https://github.com/openstack/group-based-policy/blob/master/gbpservice/neutron/db/grouppolicy/group_policy_mapping_db.py#L34-L36
18:11:44 ivar-lazzaro: did you get a chance to triage this: #link https://bugs.launchpad.net/group-based-policy/+bug/1521545
18:11:44 Launchpad bug 1521545 in Group Based Policy "If service chain delete fails then the PTG can never be cleaned up" [Undecided,New] - Assigned to Ivar Lazzaro (mmaleckk)
18:11:48 "set null" makes sense, but it seems like it's cascading if the PT is disappearing
18:11:59 * tbachman pops in
18:12:27 rkukura: exactly, so that PT delete is not expected
18:12:36 SumitNaiksatam: not yet, was trying to investigate the db caching path
18:12:44 ivar-lazzaro: okay thanks
18:13:12 rkukura: is set_null also the value in the migration?
18:13:17 I suggest checking the logs when this happens to see whether a policy_target_delete is actually getting processed, or if it's just getting deleted within the DB (i.e. cascading)
18:13:43 rkukura: right
18:13:58 i checked this with the RMD, and it does not happen
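For context on the exchange above: the question rkukura raises is whether the PT-to-port foreign key cascades the delete or merely nulls out the reference. The following is a minimal SQLAlchemy sketch of the two behaviors being discussed; the table and column names are illustrative assumptions, not the actual GBP schema (which lives in the group_policy_mapping_db.py file linked above and appears to use "SET NULL").

    # Illustrative sketch only -- table and column names are assumptions,
    # not the real GBP models.
    import sqlalchemy as sa

    metadata = sa.MetaData()

    # Minimal stand-in for Neutron's ports table, just to keep the sketch
    # self-contained; the real table is owned by Neutron.
    ports = sa.Table(
        'ports', metadata,
        sa.Column('id', sa.String(36), primary_key=True))

    # With ondelete='SET NULL', deleting a port only clears port_id on the
    # mapping row; the PT record survives and no delete_policy_target()
    # call is ever made.  With ondelete='CASCADE' instead, the database
    # would drop the PT row itself, silently, without the plugin's delete
    # path running -- which is the distinction rkukura suggests checking
    # the logs for.
    pt_mapping_sketch = sa.Table(
        'pt_mapping_sketch', metadata,
        sa.Column('id', sa.String(36), primary_key=True),
        sa.Column('port_id', sa.String(36),
                  sa.ForeignKey('ports.id', ondelete='SET NULL'),
                  nullable=True))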
18:14:45 #link https://bugs.launchpad.net/group-based-policy/+bug/1521854
18:14:45 Launchpad bug 1521854 in Group Based Policy "VM going into error state with Virtual Interface creation failed error" [Undecided,New]
18:15:12 i am not sure that the above happens with the stock OVS
18:15:41 mageshgv: any info on how often you are seeing the above?
18:16:31 SumitNaiksatam: This was happening almost every time we launched a VM with 10 interfaces
18:17:11 mageshgv: okay, and less often with <10 interfaces? or does it not happen at all at <10?
18:17:11 SumitNaiksatam: Launching two VMs with 10 interfaces each on the same compute host concurrently will definitely cause this
18:17:55 SumitNaiksatam: It happens with one port too, but much less frequently
18:18:07 mageshgv: ok
18:19:00 any other high priority bugs we need to discuss today?
18:19:34 mageshgv: any errors in the neutron log?
18:20:02 mageshgv: or is it a nova-compute only problem?
18:20:56 ivar-lazzaro: I do not remember exactly, I noticed it a while back and we did not conclude at that time. I will have to check with Vikash for details
18:20:58 ivar-lazzaro: it's most likely an issue with contention between libvirt and opflex/ovs on some system resources
18:21:40 this is most likely not neutron related, and there are no error logs on the neutron side (when i last investigated it)
18:22:18 mageshgv: i also believe that with the stock OVS and stock neutron ovs-agent you never see this issue, right?
18:22:53 SumitNaiksatam: We did not see it earlier, but then again we probably did not stress test as much
18:23:09 mageshgv: okay
18:23:34 circling back on #link https://bugs.launchpad.net/group-based-policy/+bug/1521545
18:23:34 Launchpad bug 1521545 in Group Based Policy "If service chain delete fails then the PTG can never be cleaned up" [Undecided,New] - Assigned to Ivar Lazzaro (mmaleckk)
18:23:46 mageshgv: how often are you seeing this now?
18:24:02 this and the deadlock issue
18:24:47 SumitNaiksatam: We added a workaround for the deadlock issue in our orchestration code, to wait for the port detach to be completed
18:25:02 mageshgv: ok great, that seems like the right approach
18:25:27 But the delete issue still occurs if something goes wrong on delete
18:25:27 moving on
18:25:32 mageshgv: "PTG deletion", do you mean deleting the Provider of a chain?
18:25:38 mageshgv: ok
18:25:41 ivar-lazzaro: yes
18:25:51 also, what goes wrong during the delete?
18:25:55 which part of it?
18:26:16 Server goes down? Node driver fails? PTs hang around?
18:27:05 ivar-lazzaro: Most likely PTs or some resources hang around; did not look into the logs, this issue was seen today too
18:27:42 ivar-lazzaro: Because of the cleanup issues, we are not raising any exceptions from the node driver for delete path errors now
18:27:51 mageshgv: it will be helpful to have access to a system where we can reliably reproduce this
18:28:03 if that's possible at all
18:28:11 SumitNaiksatam: even the initial failure log would be useful
18:28:26 SumitNaiksatam, ivar-lazzaro: okay, will at least get the logs
18:28:40 ivar-lazzaro: yeah
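The deadlock workaround mageshgv describes above (having the orchestration code wait for the port detach to complete before proceeding with the delete) could look roughly like the sketch below. This is not the actual orchestration code: it assumes python-neutronclient is available and that a finished detach shows up as an empty device_id on the port, both of which are assumptions here.

    # Sketch only, under the assumptions stated above; not the actual
    # orchestration workaround.
    import time

    from neutronclient.v2_0 import client as neutron_client


    def wait_for_port_detach(neutron, port_id, timeout=60, interval=2):
        """Poll a Neutron port until Nova has released it, or time out."""
        deadline = time.time() + timeout
        while time.time() < deadline:
            port = neutron.show_port(port_id)['port']
            # Assumption: Nova clears device_id once the detach completes.
            if not port.get('device_id'):
                return port
            time.sleep(interval)
        raise RuntimeError('port %s still attached after %s seconds'
                           % (port_id, timeout))


    # Hypothetical usage (credentials and endpoint are placeholders):
    # neutron = neutron_client.Client(username='admin', password='secret',
    #                                 tenant_name='admin',
    #                                 auth_url='http://controller:5000/v2.0')
    # wait_for_port_detach(neutron, port_id)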
18:28:57 #topic Integration gate
18:29:18 stable/juno was EOL'd earlier this week
18:29:34 but we are still maintaining our stable/juno branch
18:29:59 SumitNaiksatam: Does that apply to all 4 projects?
18:30:01 and the gate jobs for that branch broke since the stable/juno branch no longer exists for the other projects
18:30:08 rkukura: yes, for now
18:30:50 i was able to fix the pep8, py27 and doc jobs relatively easily by pointing our dependencies on the other projects to the juno-eol tag
18:31:18 however, fixing the integration job on our juno branch is proving to be more complicated
18:32:07 infra does some preconfiguration for the job and i suspect it's failing, so at this point i am not even sure that we can revive this job for this branch, but i haven't given up yet
18:32:34 meanwhile we will continue to process backports based on the pep8 and UTs
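For illustration, pointing a dependency at the juno-eol tag as described above might look like the following requirements-style pin. The exact file (tox.ini deps vs. test-requirements) and the set of projects pinned in the GBP juno jobs may differ, so treat this as a hypothetical example.

    # Hypothetical pin of a sibling project to its juno-eol tag:
    -e git+https://git.openstack.org/openstack/neutron@juno-eol#egg=neutron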
18:33:17 #topic Removal of unmaintained drivers
18:33:55 there are a few policy drivers and one service chain driver which i believe have not been maintained for a while now
18:34:02 so my proposal is to remove these
18:34:58 more specifically, the nuage policy driver is not maintained for kilo
18:35:41 also, the One Convergence policy and service chain drivers are not being actively developed
18:35:57 any objections to removing these from the code base?
18:36:45 mageshgv, hemanthravi, songole: any objections?
18:36:48 SumitNaiksatam, can i get back to you on the One Convergence drivers by next week's irc?
18:37:12 hemanthravi: sure, i was hoping to do it sooner as part of the liberty release
18:37:20 hemanthravi: but we can touch base offline on this
18:37:43 ok, when do you need to do this by? should be ok to remove, but want to make sure
18:38:05 hemanthravi: end of this week?
18:38:25 ok, will let you know
18:38:30 hemanthravi: thanks
18:38:35 moving on to discussion about pending specs
18:38:45 #topic Design specs
18:39:04 https://review.openstack.org/#/c/239743/
18:39:04 #link https://review.openstack.org/239743 "Network Service Framework for GBP"
18:39:24 would like to get review comments on this
18:39:29 hemanthravi: i put some comments on it just before the meeting
18:39:40 anyone else had a chance to review?
18:40:00 ig0r_, songole: had some comments
18:40:01 It's on my to-review list, but I've been prioritizing the short-term patches and haven't gotten to it
18:40:07 i believe igordcard_ and songole had some comments as well
18:40:13 rkukura: yes, sure
18:40:30 hemanthravi: thanks for addressing those
18:40:33 igordcard_ had comments, will address the remaining ones, thanks
18:40:33 ivar: could you also review it?
18:41:02 sure, I was already looking at it
18:41:15 SumitNaiksatam: yes, made a quick review, I am getting back into GBP context
18:41:26 igordcard_: great, thanks
18:41:36 hemanthravi: are you also pursuing an implementation patch in parallel?
18:41:41 ivar-lazzaro: thanks
18:41:59 SumitNaiksatam: should we remove the old SC plugin as well?
18:42:03 SumitNaiksatam: MSC?
18:42:05 yes, will be working on the impl in parallel
18:42:26 ivar-lazzaro: yes we can, i believe i had marked it for deprecation
18:42:44 ivar-lazzaro: we need to mark it for deprecation for one cycle and remove it in the next cycle
18:42:52 so whenever that happens to be
18:42:56 SumitNaiksatam: that would significantly drop the number of running UTs I think
18:43:22 SumitNaiksatam: I see
18:43:32 ivar-lazzaro: yes, my concern is with the UTs in the context of the vendor drivers as well
18:43:58 ivar-lazzaro: it unnecessarily takes a long time to execute those UTs and delays the job
18:44:30 hemanthravi: any specific topics you wanted to bring up regarding the NSP spec?
18:45:04 SumitNaiksatam, would like review comments, nothing specific at this time
18:45:23 SumitNaiksatam, might require some enhancements to serviceprofile too
18:45:24 hemanthravi: okay
18:45:41 hemanthravi: okay, did not see that in the spec
18:46:09 #link https://review.openstack.org/242781 "Cluster ID for HA Spec"
18:46:16 i believe ivar-lazzaro has responded to the earlier comments
18:46:21 SumitNaiksatam, wasn't sure if i should include it here, will add it
18:46:43 hemanthravi: thanks
18:46:58 rkukura: hemanthravi: if you can take a look at the above spec as well
18:47:04 Yes, I can re-review
18:47:07 so that we can close on it
18:47:18 will review it
18:47:40 #topic QoS update
18:47:50 igordcard_: anything to discuss?
18:48:37 SumitNaiksatam: I'm trying to figure out the "best" way to do this
18:48:46 igordcard_: sure
18:48:58 I will be around on IRC and will ask in case of doubts or questions, etc.
18:49:03 igordcard_: perhaps a very high level spec with your current thinking might help
18:49:23 igordcard_: that way others will be able to participate more constructively
18:49:34 SumitNaiksatam: yeah
18:49:40 igordcard_: thanks
18:49:41 #topic Packaging update
18:50:06 nothing new on the RDO side
18:50:11 rkukura: before you ask, we might have another stable release, so it is probably still not a good time to use the packages
18:50:21 ok
18:50:25 time frame?
18:50:35 rkukura: i am guessing a week
18:50:50 we added new tags yesterday, but there are still a few more pending things
18:51:19 where do we stand on supporting liberty and mitaka?
18:51:54 rkukura: i did not make much progress since last week on the liberty sync
18:52:01 got distracted with a few other things
18:52:14 so essentially we are still where we were
18:52:15 SumitNaiksatam: understood - let me know if I can help
18:52:45 rkukura: yes, sure, if i can break down the problem, it will help to work on this in parallel
18:52:53 and get it done asap
18:53:13 SumitNaiksatam: OK, let's discuss when you have time to get back to it
18:53:16 as soon as we cut the liberty release we should be good to transition to mitaka
18:53:24 rkukura: yes, hopefully today
18:53:36 #topic Open Discussion
18:54:22 in terms of logistics, we will not have a meeting on Dec 24th or on Dec 31st
18:55:08 so next week might be our last meeting for this year; we can have a longer meeting if required by moving to the -gbp channel after the scheduled meeting time
18:55:43 so please think about and bring all your burning issues next week so that we can address them before we take a break
18:56:00 break from the meetings, that is, not from work :-)
18:56:21 anything else we need to discuss today?
18:57:07 ok, thanks everyone for joining today
18:57:12 bye!
18:57:17 bye
18:57:22 bye
18:57:25 bye
18:57:32 #endmeeting