#openstack-meeting-alt log

16:01:11 <Sukhdev> #startmeeting networking_ml2
16:01:11 <manishg> hi
16:01:11 <openstack> Meeting started Wed Sep 24 16:01:11 2014 UTC and is due to finish in 60 minutes.  The chair is Sukhdev. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:12 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:15 <openstack> The meeting name has been set to 'networking_ml2'
16:01:30 <Sukhdev> #topic: Agenda
16:01:47 <Sukhdev> #link: https://wiki.openstack.org/wiki/Meetings/ML2#Agenda
16:02:05 <Sukhdev> #topic: Announcements:
16:02:24 <Sukhdev> Juno RC1 is very close
16:02:43 <Sukhdev> Does anyone know the exact date? I missed the core meeting yesterday
16:02:50 <shivharis> https://wiki.openstack.org/wiki/Juno_Release_Schedule
16:03:12 <shivharis> dont know the exact date
16:03:43 <Sukhdev> I think it is sometime this week
16:03:49 <banix> was said around today  :) no exact day
16:04:31 <amotoki> hi
16:04:40 <shivharis> will high priority bugs still get in after the date?
16:04:46 <Sukhdev> Kilo Summit sessions are growing in numbers
16:04:54 <shivharis> bugs -> bug fixes
16:05:03 <banix> shivharis: i think so
16:05:24 <Sukhdev> shivharis: If any critical bug is left, they may cut RC2
16:05:31 <amotoki> Yes, we are now focusing on high priority bugs.
16:06:02 <Sukhdev> We have one to discuss - lets cover that in the Bugs section
16:06:31 <Sukhdev> Please look at the Kilo summit sessions list - it is growing https://etherpad.openstack.org/p/kilo-neutron-summit-topics
16:07:02 <banix> nice!
16:07:11 <Sukhdev> If you have a passion for something or would like to propose a session, this etherpad is a good place to add
16:07:19 <amotoki> about RC1, the exact date is not decided but several projects plan to release at the end of the week, so it is a good milestone.
16:07:21 <shivharis> long!
16:07:48 <Sukhdev> amotoki: thanks for clarifying
16:08:17 <Sukhdev> I have one bug to push - hopefully will make it :-)
16:08:25 <Sukhdev> Any other announcements?
16:08:52 <Sukhdev> #topic: Actions from last week
16:09:04 <shivharis> make hotel reservations soon, they are going fast
16:09:12 <Sukhdev> banix, mlavalle: update?
16:09:24 <mlavalle> Sukhdev: we had a nice chat yesterday
16:09:31 <banix> mlavalle: you want to discuss
16:10:08 <mlavalle> Sukhdev: we agreed to propose to the team to implement a special mechanism driver that will fail in predictable ways, for example every four operations
16:10:42 <mlavalle> this will enable a tempest script to expect the bulk operations to fail and verify they are rolled back
16:10:57 <banix> i was thinking the exact behavior of this driver could also be set by config parameters; just a thought; more flexible
16:11:11 <mlavalle> banix: great idea
16:11:25 <Sukhdev> mlavalle: you would need a predictable trigger to achieve this
16:11:37 <rkukura> mlavalle, banix: Could also use special values set on some attribute (i.e. name) to trigger behaviour.
16:11:53 <shivharis> i like the idea +1
16:12:11 <shivharis> as time goes by it can be used for other tests as well
16:12:25 <mlavalle> I wrote a brief description in the agenda items
16:12:28 <banix> rkukura: yeah we thought about your suggestion as well
16:12:30 <amotoki> my question is how to specify such driver.
16:13:20 <amotoki> ah.. rkukura already mentions the related topic.
16:13:38 <Sukhdev> amotoki: I am thinking a mocked driver with some config knobs
16:13:39 <rkukura> amotoki: The driver could live in tempest - just needs its entry point in the right namespace in setup.py.
16:13:48 <banix> rkukura: thought the other option may be a bit more straightforward less involved
16:14:02 <banix> yeah we have such a driver there already: bulkless
16:14:03 <amotoki> rkukura: yes. we can specify multiple drivers.
16:14:33 <amotoki> bulkfail?
16:14:42 <rkukura> mlavalle: Is this special driver part of tempest or for unit/functional tests?
16:14:48 <shivharis> rkukura: it could live in ml2/plugin/drivers (as a test driver?)
16:15:03 <banix> amotoki: neutron/tests/unit/ml2/drivers/mechanism_bulkless.py
16:15:10 <rkukura> If its only for tempest, why not put it in the tempest repo?
16:15:11 <amotoki> banix: i know.
16:15:22 <mlavalle> rkukura: I am thinking this driver is part of ml2 and used by tempest scripts
16:15:42 <rkukura> I’m a bit uncomfortable making such a driver part of ML2’s “stable API”, with tempest being unversioned.
16:15:52 <amotoki> but using tempest to this purpose is a bit big thing  as  my first impression
16:16:11 <rkukura> But I’m also uncomfortable making the driver API a “stable API” for tempest :(
16:16:16 <banix> amotoki: you mean too much; not necessary?
16:17:14 <amotoki> banix: tempest test API and scenarios tests for some set of OpenStack, so I wonder whether a special driver fits or not.
16:17:34 <amotoki> tempest test*s* *.....
16:17:35 <rkukura> amotoki: Maybe this is best done as a unit test or functional test.
16:17:50 <banix> amotoki: rkukura i see the concern
16:17:53 <amotoki> rkukura: totally agree.
16:18:19 <banix> we have some forms of these tests already in the unit tests
16:18:56 <banix> may be functional tests is the correct place for something like this
16:19:02 <amotoki> banix: yes, so I think it fits unit test or functional test rather than tempest tests.
16:19:36 <Sukhdev> rkukura: I am not too sure how will you test this behavior with unit tests - but, functional tests may be
16:20:21 <banix> Sukhdev: we do test with failing mechanism drivers in unit tests
16:20:29 <Sukhdev> mlavalle: Are functional tests not part of tempest tests?
16:20:30 <rkukura> Sukhdev: We have plenty of unit tests that configure drivers and driver them through the neutron API. Maybe these should be functional tests instead, but not sure.
16:20:43 <banix> but the proposed solution may be a step further wrt testing functionality
16:20:47 <rkukura> s/and driver them/and drive them/
16:21:03 <mlavalle> Sukhdev: they are not
16:21:20 <Sukhdev> mlavalle: thanks
16:21:29 <amotoki> I think it fits functional tests though i have no good idea on how to implement so far.
16:21:42 <banix> may be that is the next action item to see how we can improve neutron functional tests in this regard?
16:22:05 <mlavalle> banix: +1
16:22:14 <rkukura> It seems that if we want to do this with tempest, we are best off putting this driver in neutron and making sure that its behaviour and the mechanism for triggering it are things we can maintain backwards compatibility with.
16:22:35 <mlavalle> rkukura: correct
16:23:28 <amotoki> in my view, tempest tests should not expect some plugins/drivers.
16:23:30 <shivharis> mlavalle: in functional testing is it possible to restart neutron-server?
16:23:33 <Sukhdev> banix mlavalle: Will you continue to drive this?
16:23:44 <banix> Sukhdev: yes
16:24:08 <mlavalle> shivharis: I don't really have experience with neutron functional testing. for me, that is something to explore
16:24:29 <Sukhdev> #action: banix to work with mlavalle and the team to figure out best testing strategy for bulk operations
16:24:30 <mlavalle> Sukhdev: yes, as i said last week, I am joining the ml2 team for Kilo
16:24:44 <banix> i think that is a good thing for us to explore
16:25:01 <Sukhdev> Anything else on this subject?
16:25:09 <mlavalle> Sukhdev: if you all take me, of course :-)
16:25:29 <Sukhdev> mlavalle: You are most welcome
16:25:34 <rkukura> mlavalle: You are more than welcome to get as involved as you want!
16:25:41 <mlavalle> :-)
16:25:52 <shivharis> mlavalle: why wait for kilo
16:25:56 <Sukhdev> mlavalle: we humbly accept you as a fellow ML2'er :-)
16:26:20 <mlavalle> shivharis: it means now... it's just that it takes some ramp up time
16:26:28 <Sukhdev> Anything else, folks?
16:26:35 <banix> now we have to tell him about the secret rules ;)
16:26:43 <mlavalle> i'm done for today
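
The test driver idea discussed above (fail every four operations, with the behaviour set by config, or triggered by a special value in an attribute such as name) could look roughly like the sketch below. It is illustrative only: PeriodicFailDriver, MechanismDriverError, fail_every and fail_on_name are hypothetical names, the class does not import neutron or subclass the real ML2 driver base class, and it takes a plain dict where a real driver would receive a context object.

```python
class MechanismDriverError(Exception):
    """Hypothetical stand-in for the error a real driver would raise."""


class PeriodicFailDriver:
    """Illustrative test driver that fails in predictable ways."""

    def __init__(self, fail_every=4, fail_on_name=None):
        # Both knobs are assumptions standing in for config parameters.
        self.fail_every = fail_every
        self.fail_on_name = fail_on_name
        self.calls = 0

    def create_port_postcommit(self, port):
        # A real driver receives a context; a plain dict is used here.
        self.calls += 1
        # Trigger 1: a special value set on the name attribute.
        if self.fail_on_name is not None and port.get("name") == self.fail_on_name:
            raise MechanismDriverError("triggered by name %r" % port["name"])
        # Trigger 2: every fail_every-th call fails.
        if self.fail_every and self.calls % self.fail_every == 0:
            raise MechanismDriverError("triggered on call %d" % self.calls)
```

With fail_every=4, a test can issue a bulk create of four ports and expect the fourth postcommit call to raise, then verify that the first three were rolled back.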
16:26:50 <Sukhdev> #topic: Bugs
16:26:58 <shivharis> hi
16:26:59 <Sukhdev> shivharis: Floor is yours
16:27:32 <shivharis> we have discussed banix's bug quite well...
16:27:48 <shivharis> moving on:
16:27:52 <shivharis> https://bugs.launchpad.net/neutron/+bug/1179223
16:27:54 <uvirtbot> Launchpad bug 1179223 in neutron "Retired GRE and VXLAN tunnels persists in neutron db" [High,In progress]
16:28:00 <shivharis> romilg: ?
16:28:40 <romilg_> I posted the patch set
16:28:52 <shivharis> is this based on a new design?
16:28:57 <romilg_> need reviewers
16:29:25 <romilg_> yeah
16:29:30 <shivharis> kyle thinks that this should not be postponed to kilo
16:29:46 <romilg_> I have added a table tunnel_mappings to accommodate this
16:29:51 <shivharis> we need help reviewing this  (high priority)
16:30:00 <romilg_> let's have a look at https://review.openstack.org/#/c/121000/
16:30:05 <romilg_> and share comments
16:30:16 <Sukhdev> rkukura amotoki: can you spare some cycles for this?
16:30:17 <romilg_> thanks :)
16:30:26 <rkukura> Sukhdev: Yes
16:30:34 <amotoki> sure
16:30:39 <shivharis> rkukura, amotoki: thanks
16:30:43 <shivharis> next:
16:30:50 <shivharis> https://bugs.launchpad.net/neutron/+bug/1367391
16:30:51 <manishg> I'll take a look at this as well.
16:30:52 <uvirtbot> Launchpad bug 1367391 in neutron "ML2 DVR port binding implementation unnecessarily duplicates schema and logic" [High,Confirmed]
16:31:08 <shivharis> rkukura: this one is yours
16:31:35 <rkukura> shivharis: mestery decided to postpone that one to juno
16:31:47 <shivharis> kilo?
16:31:47 <rkukura> to kilo I mean
16:31:52 <shivharis> ok
16:32:18 <rkukura> there are other DVR issues that are more critical I believe
16:32:51 <shivharis> ok, we will move this to kilo
16:32:59 <shivharis> other than the 3 we discussed, please look at all the ml2 bugs and see if these need to be prioritized
16:33:28 <shivharis> if things dont change a whole lot we look quite alright for RC1
16:33:54 <amotoki> Targeted to Kilo just means the second priority and it is a balance to the gate queue... we need to focus on bugs targeted to RC1 now.
16:34:32 <shivharis> amotoki: agreed
16:34:59 <Sukhdev> amotoki: Makes sense - so, we have one from banix and others mentioned by rkukura for DVR
16:34:59 <shivharis> we have only 3 that need work for RC1
16:35:26 <amotoki> this kind of consensus is important :-)
16:35:33 <shivharis> if there are any that should be re-prioritized please speak up now
16:35:47 <banix> if there is time may i ask a question regarding my bug
16:35:54 <banix> that is my beloved bug :)
16:35:55 <shivharis> Any questions on bugs?
16:36:04 <rkukura> I think one DVR issue we should be tracking is https://review.openstack.org/#/c/123403/
16:36:05 <shivharis> banix: yes
16:36:16 <banix> Thanks for all the reviews; Will address all today; mainly minor; have one question to discuss here
16:37:02 <banix> and that is how we deal with failure in rollback; in the patch, we try to set the status of the resource to ERROR. amotaki in reviews mention about applying the same for non bulks
16:37:10 <Sukhdev> shivharis: can you add that to the list?
16:37:14 <rkukura> This I believe is changing ML2 so that delete_port_postcommit() gets called multiple times, once for each host on which the DVR port is bound.
16:37:14 <shivharis> rkukura: i dont have a background on this will take this up next time
16:37:29 <shivharis> Sukhdev: will do
16:37:37 <banix> that is not done for non bulks right now and am wondering if we could leave that for a separate patch?
16:38:10 <rkukura> It already calls delete_port_precommit() for each host, so this makes these match, but I believe the delete calls should match the create calls.
16:39:04 <rkukura> I’m expecting this fix will need to go into juno-rc1 for DVR, and that in kilo we will need to revisit how deletes work.
16:39:58 <banix> sorry misspelled amotoki above
16:40:07 <amotoki> banix: np
16:40:45 <amotoki> if there is a bug critical for RC1, please raise it or let kyle or one of the core team know it.
16:40:55 <banix> i am just thinking we do not change much for non bulk operations in this patch; considering where we are in the cycle
16:41:14 <Sukhdev> banix: so for my clarification - if one of the four (in a bulk operation) fails, roll back only the failed one or all four?
16:41:27 <banix> Sukhdev: all four
16:41:34 <rkukura> banix: on the bulk ops fix, I feel we should avoid unnecessary changes, and leave cleanup/refactoring/etc for kilo
16:41:35 <banix> the bulk ops are atomic by definition
16:41:50 <shivharis> i will talk to Kyle regarding the bugs owned by banix and romilg (any other?)
16:42:00 <shivharis> amotoki: ^^^^
16:42:03 <amotoki> banix: I am neutral we need to fix it in a same patch.
16:42:37 <banix> rkukura: are you ok with how rollback failures for bulk are done in the patch or want that done in a later patch as well?
16:42:59 <shivharis> Sukhdev: add action for me
16:43:17 <banix> rkukura: that is setting the status to error for resources that didnt get deleted properly in rollback
16:43:24 <banix> amotoki: ok thanks
16:43:53 <rkukura> banix: what does it mean for a resource to not get deleted properly?
16:43:56 <Sukhdev> #action: shivharis to follow up with mestery regarding ML2 RC1 bugs
16:43:58 <ChuckC> banix: setting error status should be consistent
16:44:09 <banix> rkukura: exception is thrown
16:44:15 <rkukura> Do you mean that some MD failed?
16:44:15 <ChuckC> banix: bulk or non-bulk
16:44:31 <banix> right now we do not worry about the delete failing in non bulks
16:44:55 <ChuckC> banix: IMO
16:45:00 <rkukura> So for non-bulk, we go ahead and delete the resource?
16:45:02 <shivharis> i think consistency is important, and one gets a way to deal with these later on, based on status
16:45:04 <banix> ChuckC: i see your point
16:45:16 <banix> rkukura: yes
16:45:27 <rkukura> I’d say this late in the game, its best to make the bulk behaviour match the current non-bulk behaviour
16:45:38 <ChuckC> rkukura: +1
16:45:55 <Sukhdev> rkukura: +1
16:45:57 <banix> amotoki: what do you think about what rkukura just stated above
16:46:17 <amotoki> banix: which one?
16:46:41 <banix> amotoki: that we do not worry about setting the status of resources to ERROR if delete fails in this patch
16:47:26 <shivharis> i like amotoki's idea, but can it wait for kilo?
16:47:27 <banix> asking amotoki explicitly because he suggested the further step of setting status to ERROR
16:47:31 <amotoki> hmm... setting status in bulk can be in, but....
16:47:43 <banix> others welcome to chime in of course
16:47:46 <amotoki> for a single operation it should be deferred.
16:48:21 <amotoki> ideally both should match, but we don't need to fix it at the same time.
16:48:45 <rkukura> So if a non-bulk create fails, no resource has been created, but with bulk it may have been created but be left in a state where it cannot be used or deleted?
16:49:11 <banix> rkukura: even in non bulk the failure may come in post commit
16:49:19 <ChuckC> amotoki: do you support a separate Juno patch to make it consistent?
16:49:31 <amotoki> setting status can be deferred too of course.
16:49:36 <rkukura> banix: But in that case, doesn’t the resource get deleted before the create op returns?
16:49:37 <Sukhdev> banix: agreed
16:49:41 <shivharis> imo, i still think keep consistency bulk or no bulk
16:49:42 <amotoki> ChuckC: I don't think it is a requirement.
16:50:06 <banix> rkukura: unless the db delete for some reason fails
16:50:15 <amotoki> rkukura: resources can be visible in the API after precommit finishes so it can potentially be deleted before the rollback in bulk create.
16:50:27 <rkukura> amotoki: you have a point
16:50:29 <banix> i think the issue with bulk is even if a delete fails we have to do the other deletes
16:50:43 <banix> that we will do
16:50:54 <rkukura> banix: +1
16:51:00 <banix> i am thinking dealing with a failed delete may be left for another patch
16:51:11 <banix> that is setting the status to ERROR
16:51:28 <banix> that way we have consistency between bulk and non bulk as well
16:51:37 <amotoki> My comment on delete failure is just a possibility. +1 for banix now.
16:51:57 <shivharis> banix: please summarize so we can move on
16:52:22 <banix> amotoki: thanks; just considering how late in the cycle we are; that would be reasonable
16:52:26 <banix> yes to summarize:
16:53:18 <banix> for bulk operations we make sure if a delete in rollback fails we continue with the other resources as needed; in a future patch we make sure if delete fails (for bulk and non bulk) we set the status of the resource to ERROR
16:53:29 <shivharis> banix: thanks
16:53:42 <shivharis> we have beaten all bugs to death, back to you Sukhdev
16:53:47 <Sukhdev> #agreed: for bulk operations we make sure if a delete in rollback fails we continue with the other resources as needed; in a future patch we make sure if delete fails (for bulk and non bulk) we set the status of the resource to ERROR
16:54:00 <Sukhdev> shivharis: Thanks
16:54:29 <Sukhdev> Anything else on the bugs, folks?
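
The agreed bulk behaviour (all-or-nothing create, with rollback that keeps going when an individual delete fails) can be sketched as follows. This is a minimal illustration, not the code in the patch under review: bulk_create, BulkCreateFailed and the create/delete callables are hypothetical names, and setting the status to ERROR is deliberately left out because it was deferred to a future patch.

```python
class BulkCreateFailed(Exception):
    """Hypothetical error carrying the resources that rollback could not delete."""

    def __init__(self, cause, undeleted):
        super().__init__(str(cause))
        self.cause = cause
        self.undeleted = undeleted


def bulk_create(items, create, delete):
    """Create every item or none; roll back what was created on failure."""
    created = []
    try:
        for item in items:
            created.append(create(item))
    except Exception as exc:
        undeleted = []
        # Roll back in reverse order. A failed delete must not stop
        # the remaining deletes, so each one is guarded separately.
        for resource in reversed(created):
            try:
                delete(resource)
            except Exception:
                # A future patch would set this resource's status to ERROR.
                undeleted.append(resource)
        raise BulkCreateFailed(exc, undeleted) from exc
    return created
```

The undeleted list is where a later change could hook in the ERROR status for both bulk and non-bulk operations.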
16:54:44 <Sukhdev> #topic: Deferred specs
16:55:17 <Sukhdev> I just put the link in the agenda as a place holder so that folks can see it - feel free to update it https://wiki.openstack.org/wiki/Tracking_ML2_Subgroup_Reviews#Under_Review
16:55:41 <Sukhdev> #topic: Open Discussion
16:55:48 <ChuckC> banix: please include me in discussions re testing mech driver failures
16:55:50 <Sukhdev> anything?
16:56:03 <banix> ChuckC: sure will do
16:56:09 <ChuckC> banix: thanks
16:56:21 <Sukhdev> Anybody want to discuss anything - we have 4 mins ?
16:56:51 <Sukhdev> waiting…..waiting….
16:57:14 <Sukhdev> Looks like we are done, folks….great discussion and conclusions!!!
16:57:17 <Sukhdev> thanks for attending
16:57:21 <amotoki> (announce) horizon Juno supports provider networks.
16:57:22 <Sukhdev> #endmeeting