09:00:19 <gsagie> #startmeeting dragonflow
09:00:20 <openstack> Meeting started Mon Feb 29 09:00:19 2016 UTC and is due to finish in 60 minutes. The chair is gsagie. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:23 <openstack> The meeting name has been set to 'dragonflow'
09:00:27 <gsagie> Hello to all the dragons :) and the flows..
09:00:30 <gampel> Hi
09:00:36 <gsagie> let this show begin!
09:00:45 <gsagie> who is here for dragonflow meeting?
09:00:45 <matrohon> hi
09:00:45 <Shlomo_N> hi
09:00:48 <nick-ma> hi
09:00:51 <matrohon> hi
09:00:58 <gsagie> matrohon: welcome :) first time i see you
09:01:12 <matrohon> gsagie, thanks :)
09:01:20 <gampel> welcome
09:01:28 <yuli_s_> Hello
09:01:44 <matrohon> I hope there will be room for open discussion today!
09:02:02 <gsagie> #info gampel, matrohon, nick-ma, oanson, Shlomo_N, yuli_s, gsagie, DuanKebo in meeting
09:02:20 <DuanKebo> Hi
09:02:24 <gsagie> matrohon: sure, we have a tight schedule but we are here for the open discussion so we will make some time
09:02:40 <gsagie> #topic design summit
09:02:44 <nick-ma> we also can move to dragonflow room for discussion.
09:03:01 <gsagie> so, Dragonflow was approved as big-tent project
09:03:05 <matrohon> gsagie, nick-ma fine!
09:03:24 <Shlomo_N> Congrats...!
09:03:26 <gsagie> and this means we need to request for rooms for design summit sessions, we have few topics and gampel
09:03:33 <gsagie> need to send an email to request how many rooms
09:03:37 <gampel> we need to decide how many work session we want in the summit
09:03:37 <gsagie> how many sessions
09:04:07 <gsagie> gampel: i think we also want 1-2 fishbowl to hopefully have a broader community discussion regarding general roadmap and users requests/questions
09:04:28 <gampel> I think that we need at least 4 1 hour sessions
09:04:34 <gsagie> so i was thinking something in the area of 1 fishbowl session and 5 1 hour sessions
09:04:35 <gsagie> yeah
09:04:50 <gampel> yes sound good
09:05:02 <gsagie> gampel: ok, lets also start etherpad to prioritize the points we want to discuss
09:05:05 <gsagie> so anyone can add
09:05:14 <gsagie> and publish to mailing list maybe
09:05:28 <gampel> road map for N is one
09:05:42 <nick-ma> we can list our session schedule on etherpad for priority.
09:05:44 <nick-ma> yes
09:05:46 <gsagie> #action gampel return design session numbers we need (1 fishbowl, 5x1 hour)
09:05:59 <gsagie> #action gampel start etherpad for design summit topics
09:06:12 <gsagie> yeah lets continue on that etherpad, we still have time
09:06:13 <gampel> I will create the etherpad and send it to the mailing list
09:06:18 <gsagie> ok great!
09:06:20 <gsagie> thanks
09:06:22 <nick-ma> thanks
09:06:25 <gsagie> #topic testing
09:06:41 <gsagie> Ok, Shlomo_N, yuli_s, please update us on this front
09:06:49 <Shlomo_N> ok
09:06:53 <gsagie> We are starting to create scale and API tests
09:07:18 <Shlomo_N> First, the data plane performance testing spec was merged into the DragonFlow git repository, you can find it here:
09:07:22 <Shlomo_N> https://github.com/openstack/dragonflow/blob/master/doc/source/specs/performance_testing.rst
09:07:37 <gsagie> #link https://github.com/openstack/dragonflow/blob/master/doc/source/specs/performance_testing.rst
09:07:46 <Shlomo_N> 10x gsagie
09:07:50 <Shlomo_N> I have finished the data plane performance testing for DVR and
09:07:55 <Shlomo_N> DragonFlow in multi-node environment
09:08:03 <gsagie> Shlomo: good job, lets wait with publishing the results
09:08:12 <gsagie> until we have full picture of Dragonflow with security groups
09:08:22 <Shlomo_N> sure
09:08:23 <gsagie> but after that we can start pushing things to openstack-performance-docs
09:08:27 <gsagie> ok thanks
09:08:36 <gsagie> We need to see how we can automate this
09:08:43 <gsagie> so we can start applying this per patch
09:09:03 <gsagie> yuli_s also published API/control plane testing spec
09:09:03 <Shlomo_N> I am starting to work on the automation part
09:09:16 <gsagie> #link https://review.openstack.org/#/c/282873/
09:09:32 <gsagie> so please everyone review and share comments/ideas
09:09:50 <gsagie> we also would like to work on this with the community yuli_s and define shared standards
09:09:53 <DuanKebo> Can we also test the perf of neutron reference design.
09:10:02 <gsagie> DuanKebo: thats what we do
09:10:09 <DuanKebo> and compare dragonflow's to it .
09:10:12 <gsagie> DuanKebo: Shlomo_N did a comparison and will send results
09:10:19 <yuli_s_> the doc covers more single box tests
09:10:20 <DuanKebo> Great!
09:10:20 <gsagie> we need to go over them and verify
09:10:21 <gampel> Good job Yuli , are you going to do it with rally ?
09:10:29 <yuli_s_> so, it will be extended as we go
09:10:46 <gsagie> yeah as me and yuli_s talked about the goal is to reach rally
09:10:50 <yuli_s_> and work with bigger test lab
09:11:07 <gsagie> for automation, but get initial numbers first as well so we will have an idea what to expect
09:11:48 <gampel> It will be important to simulate DB clients as well to test
09:12:08 <gampel> DC scale 1000+ compute node
09:12:14 <gsagie> gampel: yes good point, after API we will want to test the DB backend as well
09:12:36 <gsagie> i think its mentioned in the document
09:13:10 <gsagie> yuli_s: i say lets first get initial numbers and then start investigating Rally for this, i can help you with that
09:13:23 <gsagie> hi vikram! :)
09:13:26 <yuli_s_> gsagie: sure !
09:13:28 <gsagie> ok lets move to the next topic
09:13:28 <vikram_> hi gal ;)
09:13:33 <gsagie> #topic DB consistency
09:13:40 <gsagie> nick-ma: :)
09:13:42 <gampel> welcome @ vikram_
09:13:49 <nick-ma> the current code is in the review.
09:13:50 <gsagie> hi raofei welcome
09:13:54 <raofei> hi
09:13:55 <vikram_> gampel; thanks
09:13:57 <gsagie> #info raofei, vikram are in meeting as well
09:14:18 <gsagie> nick-ma: good job, i did see we had some exceptions in neutron server, will re check
09:14:19 <nick-ma> i'm also working on the testing. there are some errors on updating subnet in fullstack.
09:14:36 <gsagie> yeah i saw some tests are sometimes failing
09:14:39 <nick-ma> i updated the code.
09:14:47 <nick-ma> yes. but for manual testing and rally, it works.
09:14:54 <nick-ma> so i need to figure out why.
09:15:02 <yuli_s_> the DB consistency architecture is great, I think we might have some rejections from neutron team that we want to add a new table
09:15:02 <gsagie> #link DB consistency review https://review.openstack.org/#/c/282290/
09:15:17 <yuli_s_> to sync data in neutron db
09:15:31 <gsagie> nick-ma: okie great
09:15:35 <gsagie> yuli_s: thats not a problem
09:15:42 <gampel> it is an additional table only accessed from our plugin i do not see a problem
09:15:58 <yuli_s_> gsagie: i hope too
09:16:12 <gsagie> nick-ma: okie let us know if you need any help with investigating these tests, i saw some fails that relate to floating ip as well
09:16:13 <nick-ma> yes. there's no problem for adding a new table for a plugin.
09:16:27 <gsagie> anything else for DB consistency?
09:17:16 <gsagie> #topic pub-sub
09:17:20 <gsagie> okie..
09:17:21 <nick-ma> there's some discussion on getting rid of neutron db and just use nosql for persistent storage. i'll write my thoughts on it. we can discuss it later.
09:17:53 <gsagie> nick-ma: okie, we had same thoughts but this require much more work and needs to be synced with the Neutron community
09:18:08 <gsagie> so its definitely not for this release
09:18:13 <gampel> yes this is a big change and the biggest problem there is that the DB is only eventually consistent
09:18:22 <yuli_s_> for me it is not clear why we need to use ZeroMQ together with redis
09:18:28 <nick-ma> yes, you are right.
09:18:32 <gsagie> gampel: the pub-sub stage is yours, please publish us some status :)
09:18:34 <DuanKebo> redis pub/sub driver is under test now.
09:18:45 <gampel> Yes pub-sub status
09:18:52 <DuanKebo> @yuli_s Yes
09:19:00 <gampel> we have the ZMQ merged
09:19:15 <DuanKebo> We are trying not to use ZeroMQ in redis driver
09:19:35 <gampel> and @omer is working on separating the publisher into a different service
09:19:39 <gsagie> DuanKebo: gampel found some interesting posts describing that ZeroMQ performance is much better than Redis
09:19:44 <gsagie> we do need to verify all of it
09:19:55 <gampel> Currently we only support Neutron server Publishers
09:20:05 <gsagie> so it might prove more efficient to use redis as DB backend and ZeroMQ as the pub/sub
09:20:07 <yuli_s_> gsagie: in the control performance doc we are covering this
09:20:08 <DuanKebo> Yes, we will investigate it.
09:20:17 <gampel> we do not support for this release controller to controller
09:20:41 <gampel> the chassis are handled from the neutron server side
09:21:05 <gampel> ZMQ and Redis do not work together they are two drivers
09:21:18 <gsagie> DuanKebo, gampel: any open points regarding Redis pub/sub?
09:21:19 <gampel> we can use any DB with any Pub sub driver
09:21:31 <gsagie> any problems we need to discuss?
09:21:31 <oanson> In case we'll want DFcontroller-DFcontroller, we can use the database + a table monitor on the controller.
09:21:45 <DuanKebo> Currently, we are using redis for db and pub/sub
09:21:57 <gampel> From the discussion today we understand that redis does not need to bind a local socket for the publisher
09:22:01 <DuanKebo> But open to other opinions
09:22:11 <gampel> and can run publisher per Neutron server process
09:22:16 <DuanKebo> open to other options.
09:22:48 <gsagie> okie, then the implementation doesnt need to use the pub/sub service
09:22:57 <DuanKebo> Yes, it support multiple neutron servers on one host.
09:23:04 <oanson> DuanKebo, by separating the pub/sub and DB drivers, we allow flexibility. In configuration, we can select to do both with redis, or DB with redis and pub/sub with ZMQ.
09:23:05 <gampel> what I suggest is that we test performance of the different Pub/sub and then we could compare them all
09:23:36 <gsagie> DuanKebo, gampel: okie then, lets continue with Redis implementation and compare results to Redis + ZeroMQ or only Redis
09:23:36 <yuli_s_> yes, we can start the actual control plane test from here
09:23:51 <gsagie> DuanKebo: is there any time frame for Redis to be completed?
09:23:55 <gampel> For M Cycle i propose to focus only on Neutron server publishers
09:23:56 <gsagie> are we close?
09:24:20 <DuanKebo> We are doing the test.
09:24:36 <DuanKebo> I think it can be finished this week
09:24:43 <gsagie> the test or the code?
09:24:52 <DuanKebo> the test
09:25:12 <nick-ma> we need to build gate check for each db backend.
09:25:23 <gampel> yes this is a good idea
09:25:33 <gsagie> DuanKebo: okie, so please update us with the results
09:25:39 <gsagie> when you have them
09:25:41 <DuanKebo> OK, np
09:25:50 <gampel> I will add it to the list and register a bug
09:26:04 <oanson> Gate-tests for each possible configuration is very expensive performance-wise, and may take a lot of time. I think this should be considered.
09:26:05 <gsagie> But if the effort of completing Redis is minimal, i think we should do it with the pub/sub as well
09:26:22 <yuli_s_> DuanKebo: check this out https://github.com/openstack/performance-docs/blob/master/doc/source/test_plans/mq/plan.rst
09:26:42 <gsagie> lets talk about gate tests after..
09:27:14 <nick-ma> yes, very expensive. :-)
09:27:45 <gsagie> okie
09:27:59 <gsagie> #action DuanKebo publish results of pub/sub testing
09:28:03 <gampel> I think in this test they are not using pub/sub sockets
09:28:19 <gsagie> #action DuanKebo try to assess the time it takes to finish Redis for both pub/sub and DB
09:28:52 <gsagie> gampel: we need to test this end to end, if the effort of adding this now is small i think we need to test this end to end
09:29:21 <gampel> @DuanKebo the tests are mainly for the MSQ part and not for the Pub/Sub
09:29:30 <gsagie> There are some advantage to selective proactive distribution with redis
09:29:34 <gsagie> we need to explore
09:29:52 <gsagie> so lets see if we can complete the work and then do the testing
09:30:11 <gsagie> DuanKebo: what do you think?
09:30:15 <gampel> Yes i think that the controller path testing will be the key for the selection
09:30:50 <yuli_s_> DuanKebo I will be happy to help with the test
09:31:08 <gsagie> okie, lets continue talking about this after
09:31:15 <gsagie> #topic security groups and port security
09:31:22 <gsagie> dingbo here?
09:31:28 <gsagie> raofei: can you update on this?
09:31:33 <gampel> we will add the reliability later but I agree we should start testing as soon as possible
09:31:49 <raofei> security group is done by Yuanwei now
09:32:08 <raofei> The code is almost completed, now it's testing
09:32:45 <gsagie> okie, you know when he will upload it upstream?
09:32:51 <gsagie> so we can all review
09:33:05 <gampel> maybe ask him to upload with WIP
09:33:23 <gampel> so we could speed the review cycle
09:33:26 <raofei> Yes, I think so. @duankebo, when can yuanwei commit the latest code?
09:34:06 <gsagie> i think he is disconnected
09:34:16 <DuanKebo> online again
09:34:19 <gsagie> ahh ok
09:34:23 <gsagie> welcome back :)
09:34:38 <DuanKebo> I will confirm it with yuanwei
09:34:51 <DuanKebo> I think it may be this week
09:34:59 <gsagie> DuanKebo: please do because we are approaching end of release
09:35:06 <gsagie> doing the testing without security groups is pointless
09:35:35 <gsagie> #action DuanKebo check security groups status and upload for review to upstream
09:35:40 <gsagie> #topic distributed DNAT
09:35:47 <DuanKebo> yes. as i know most of the code has been finished.
09:35:48 <gsagie> raofei: ...:)
09:36:09 <raofei> it's coding phase.
09:36:22 <gsagie> only the application is missing right?
09:36:35 <raofei> I think the coding and testing will be completed this week.
09:36:36 <raofei> yes.
09:36:46 <gsagie> raofei: ok good job
09:36:50 <gampel> great :)
09:36:56 <raofei> of course, the plugin also need to do some change.
09:37:00 <gsagie> #action raofei upload distributed dnat upstream for review
09:37:02 <DuanKebo> @raofei Hujie is doing the integration
09:37:23 <gsagie> #topic selective proactive
09:37:44 <gsagie> DuanKebo: i saw the code, i need to see about my comments from last patch but over all it looks good
09:38:10 <gsagie> need to make sure the OVSDB monitor patch updates the correct queue and it seems we have this covered and we only then need the support of the DB/pub sub
09:38:11 <DuanKebo> Yes i have seen the comments.
09:38:21 <DuanKebo> and will upload a patch today.
09:38:32 <DuanKebo> it is under testing also.
09:38:33 <gsagie> oanson: you have a bug on you to move the local cache to be tenant aware, will you find time to work on this?
09:38:48 <oanson> Sure
09:38:49 <gsagie> DuanKebo: ok great job, looks good to me
09:39:04 <gsagie> #action oanson continue gsagie patch for Tenant aware cache
09:39:16 <gsagie> we can then create better searches per tenant in cache
09:39:26 <gsagie> DuanKebo: any open issue regarding that?
09:39:40 <gsagie> #link tenant cache https://review.openstack.org/#/c/277176/
09:39:41 <DuanKebo> @gsagie I need nb-api to support query by topic
09:39:49 <gampel> You mean adding additional indexes ?
09:40:15 <gsagie> gampel: no, we talked about that to make the cache structure by tenants
09:40:28 <DuanKebo> Some apis miss this param. But we can discuss this after the IRC
09:40:31 <gsagie> so you have a list/dict of tenants and so on
09:40:46 <gampel> I see we just need to make sure we do not slow the other queries
09:40:49 <gsagie> DuanKebo: ok, look at this patch: https://review.openstack.org/#/c/284178/
09:40:56 <gsagie> it has everything you need i think
09:40:59 <gsagie> we need to merge it soon
09:41:11 <DuanKebo> OK
09:41:15 <gsagie> gampel: maybe you can review and merge
09:41:26 <gampel> OK
09:41:33 <gampel> i will
09:41:37 <gsagie> gampel: its not going to slow things only make them faster
09:41:54 <gsagie> both for L3 apps and for DB/controller
09:42:04 <gsagie> especially when using L3 reactive
09:42:27 <gsagie> i do have some work to also fix some things in the L3 proactive app
09:42:46 <gsagie> #action gsagie decrease flow number in L3 proactive app, one flow per router interface
09:43:02 <gsagie> for controller reliability, we need to review this code
09:43:09 <gsagie> #topic controller reliability
09:43:17 <gsagie> DuanKebo: is there any open issue for that?
09:43:25 <DuanKebo> no
09:43:58 <DuanKebo> This work is delayed.
09:44:19 <gsagie> okie great, gampel, Li-Ma please review this patch when you have time (and everyone else of course)
09:44:34 <DuanKebo> Heshan(The guy in charge of this) has been assigned another work
09:44:40 <gsagie> DuanKebo: ok :(
09:44:44 <gampel> I will review it today
09:45:11 <gampel> so we need some one to pick this work
09:45:18 <gsagie> DuanKebo: its probably not the most urgent job, but do you want someone else continue this work?
09:45:18 <nick-ma> okie
09:45:20 <gsagie> i can work on it
09:45:29 <DuanKebo> No, Heshan will come back soon
09:45:29 <gsagie> or you have anyone else?
09:45:39 <gsagie> ok
09:45:41 <DuanKebo> He can continue this work.
09:46:00 <gsagie> #info Heshan come back :) we need you!
09:46:11 <gsagie> #topic open discussion
09:46:18 <gsagie> matrohon: stage is yours :)
09:46:28 <matrohon> gsagie, thanks
09:46:29 <gsagie> then we can talk about CI jobs
09:47:07 <matrohon> as nick-ma said earlier, we are considering getting rid of neutron db, and rely on a nosql db instead
09:47:40 <nick-ma> haven't decided yet. there are lots of tradeoffs and discussion on it.
09:47:41 <matrohon> this kind of work take place in the following context :
09:47:51 <matrohon> #link https://www.openstack.org/summit/austin-2016/vote-for-speakers/presentation/7342
09:47:52 <gsagie> matrohon: you work with nick-ma?
09:48:03 <gampel> Did you evaluate how much work it is ?
09:48:14 <matrohon> gsagie, no, I'm working on distributed cloud
09:48:51 <matrohon> some experiments have already been done by replacing the neutron db backend with redis
09:49:02 <gsagie> matrohon: such a model can really help us, so its an interesting thing that we actually also considered
09:49:04 <nick-ma> i will go to that topic and we can discuss it during the summit.
09:49:20 <matrohon> nick-ma, +1
09:49:36 <gampel> we consider it as well main issues are the eventual consistency of the D DB and the work load
09:49:41 <gsagie> we would all like to participate, i think gampel and I would also love to be there
09:49:49 <matrohon> there are chances that a dedicated WG gets created in Openstack
09:50:32 <nick-ma> neutron xxx-list may mislead end-users when db backend is not ACID, due to eventual consistency.
09:50:33 <matrohon> gampel, yep, I'm not a nosql expert, but I understand that it's a challenging topic
09:50:45 <nick-ma> yes.
09:51:03 <gampel> and the query speed
09:51:06 <gampel> for read
09:51:15 <gsagie> matrohon: its nice to know about this effort, how can we help?
09:51:26 <matrohon> it seems your group already gave it a lot of thoughts, it would be awesome to share them in the WG
09:51:37 <nick-ma> for that dedicated WG? I'm interested in it.
09:51:44 <gsagie> matrohon: ok would love that, how do we do it?
09:51:47 <gampel> yes this is a great effort that we would like to join
09:52:07 <matrohon> gsagie, we'll probably set up a dedicated meeting during the next summit, I'll let you know
09:52:30 <gsagie> matrohon: ok that will be great, thanks for sharing and hope to meet you in the summit
09:52:42 <nick-ma> awesome.
09:53:21 <gsagie> we might also use one Dragonflow design session to discuss about it, as we explored this area quite a lot
09:53:27 <gampel> Yes let us know and if there is active talk about it can you please send us a link
09:53:38 <matrohon> do you already have some materials about changes to be made in upstream project to achieve the distributed goal
09:53:40 <matrohon> ?
09:53:54 <gsagie> #link https://www.openstack.org/summit/austin-2016/vote-for-speakers/presentation/7342
09:53:57 <gampel> yes this is a good idea DB consistency and DB alternative
09:54:03 <gsagie> matrohon: nothing written
09:54:24 <gsagie> matrohon: but we know others that might be interested with this as well
09:54:30 <nick-ma> you can review the spec of db consistency. we discussed a lot during review.
09:54:41 <nick-ma> matrohon: .
09:54:43 <matrohon> gsagie, great!
09:54:57 <matrohon> nick-ma, I'll do thanks
09:55:07 <gsagie> okie, so lets continue talk about this and feel free to drop to our channel #openstack-dragonflow if you have any more questions
09:55:23 <matrohon> gsagie, ok
09:55:30 <gsagie> and if you guys start meeting before the summit, would love to join
09:55:46 <gsagie> for the CI, i think we need to decide on our default setup first
09:56:15 <gsagie> as i mentioned to nick-ma in one of the reviews, i think he will also want to run CI for zookeeper
09:56:20 <gampel> For the CI I feel it is very important to test the main used DB drivers
09:56:54 <gsagie> gampel: i agree, so maybe after Redis is implemented we can add this as well
09:56:59 <gampel> in the CI so it seem that zookeeper, Redis, etcd
09:57:05 <gsagie> yep
09:57:19 <gampel> do we have a limit on the number of CI ?
09:57:22 <gampel> jobs
09:57:31 <gsagie> at least for the fullstack tests, dont know if we need to do it for tempest
09:57:47 <gsagie> gampel: i am not aware of such restriction, but i dont think it will be a problem to add 2 more jobs
09:57:52 <nick-ma> if we set up all of them, it will take long time on verification.
09:57:53 <gampel> yes i agree for the fullstack
09:58:07 <gsagie> nick-ma: i think its done in parallel so not sure about that
09:58:10 <oanson> Do we want such a job only for zookeeper, redis, etcd?
09:58:15 <nick-ma> ok
09:58:18 <oanson> Or any other db driver we add?
09:58:30 <nick-ma> fullstack only? or adding tempest api testing?
09:58:30 <gampel> i think it is enough for now
09:58:34 <gsagie> oanson: lets decide that as we go, right now i dont see a reason to add anything else
09:58:49 <gampel> RamCloud is not used yet
09:58:51 <gsagie> nick-ma: what do you suggest?
09:58:51 <oanson> All right
09:59:17 <gsagie> i think fullstack is enough as it grows it verify the important things
09:59:29 <gsagie> tempest is more about the controller logic and less about the DB itself
09:59:37 <nick-ma> priority? before or after summit?
09:59:39 <gsagie> maybe add some more DB specific tests to the fullstack
09:59:46 <gampel> yes I agree only fullstack and maybe rally later
10:00:00 <gsagie> nick-ma: zookeeper depends on you :)
10:00:05 <gsagie> i can add the one for Redis
10:00:07 <yuli_s_> gsagie: meta service without q-dhcp
10:00:10 <yuli_s_> :)
10:00:43 <gsagie> yuli_s: yeah, we need to work on this as well :)
10:00:45 <nick-ma> ok
10:00:49 <gampel> this is a feature we do not support yet
10:00:49 <gsagie> lets take this offline as our time is done
10:00:54 <gsagie> thanks everyone for attending!
10:00:59 <gampel> thanks
10:01:02 <gsagie> and see you next week
10:01:07 <gsagie> #endmeeting