15:00:47 <ad_rien_> #startmeeting massively-distributed-clouds
15:00:48 <openstack> Meeting started Wed Dec 6 15:00:47 2017 UTC and is due to finish in 60 minutes. The chair is ad_rien_. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:49 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:51 <openstack> The meeting name has been set to 'massively_distributed_clouds'
15:01:04 <ad_rien_> #chair ad_rien_
15:01:04 <openstack> Current chairs: ad_rien_
15:01:10 <ad_rien_> #topic roll call
15:01:17 <oanson> o/
15:01:24 <rcherrueau> o/
15:01:29 <avankemp> o/
15:01:30 <msimonin> hi all o/
15:01:42 <kgiusti> yo!
15:01:45 <ansmith> o/
15:02:05 <lihi> o/
15:02:05 <ad_rien_> #info agenda
15:02:05 <ad_rien_> #link https://etherpad.openstack.org/p/massivfbely_distributed_ircmeetings_2017 1547
15:02:25 <ad_rien_> May I ask you to add your name on the etherpad, please,
15:02:46 <ad_rien_> line 1549 (in particular the Dragonflow folks :-O)
15:02:53 <ad_rien_> thanks
15:03:01 <oanson> The link is: https://etherpad.openstack.org/p/massively_distributed_ircmeetings_2017
15:03:19 <ad_rien_> yes, we are at line 1556
15:03:48 <ad_rien_> so before diving into details, today we have the usual ongoing action reviews and
15:04:00 <ad_rien_> I hope a discussion around Dragonflow design choices.
15:04:36 <ad_rien_> oanson: we were discussing a possible presentation. I hope we can discuss it
15:04:55 <oanson> ad_rien_, yes. I've added a link to the etherpad
15:04:55 <ad_rien_> Last but not least, is parus there?
15:05:29 <ad_rien_> oanson: thanks
15:05:38 <ad_rien_> ok so let's start
15:05:49 <ad_rien_> #topic Announcements
15:05:59 <ad_rien_> so only a little information to share from my side.
15:06:29 <ad_rien_> the OpenStack Foundation is trying to support edge actions
15:06:49 <ad_rien_> they have created a new ML and they are trying to launch new actions
15:07:12 <ad_rien_> I had a phone call with Jonathan last week to clarify the overlap between these actions and the FEMDC SiG
15:07:22 <ad_rien_> The goal is to try to move forward faster.
15:07:37 <ad_rien_> So the first etherpad link is the general one (i.e. opened by the foundation)
15:07:45 <ad_rien_> The second one is an overview of our activities
15:07:55 <ad_rien_> To be honest, right now there is only a little interaction.
15:08:18 <ad_rien_> I don't know what we can expect, but at least it is great that the foundation is trying to push everything forward
15:08:48 <ad_rien_> Maybe the second piece of information I can share is related to the possibility of having a dedicated day during the next PTG in Dublin
15:09:15 <ad_rien_> I added that point in the open discussion section (if you are interested in taking part in such sessions, please add your name)
15:09:31 <ad_rien_> Depending on the number of people, I will confirm our interest to the foundation.
15:09:34 <ad_rien_> That's all from my side
15:09:43 <ad_rien_> so any other news to share?
15:09:59 <ad_rien_> any folks from FBK?
15:10:12 <ad_rien_> ok
15:10:19 <ad_rien_> So if not let's move to the next topic
15:10:31 <dancn> ad_rien_: no, for sure we will not be in Dublin
15:10:36 <ad_rien_> sorry
15:10:43 <ad_rien_> dancn: I just saw you now
15:10:54 <dancn> but we keep an eye!
15:10:55 <ad_rien_> dancn: ack
15:10:57 <dancn> no problem!
15:11:01 <ad_rien_> ok so let's move forward
15:11:08 <ad_rien_> #topic ongoing actions - Use-cases
15:11:15 <dancn> we disappeared for a while, now back!
15:11:33 <msimonin> welcome back dancn :)
15:11:35 <ad_rien_> Paul-Andre is not present (I didn't see him), but maybe dancn you can give a short update regarding the presentation you did at
15:11:42 <ad_rien_> the Fog Congress
15:11:43 <ad_rien_> ?
15:12:03 <dancn> Yes, sure, I just added a few notes to the etherpad
15:12:33 <ad_rien_> thanks
15:12:36 <ad_rien_> any other information.
15:12:39 <dancn> we will do an extended presentation at CloudCom; we can redo the presentation for you people at FEMDC
15:12:47 <ad_rien_> Did you get valuable feedback from the Fog Congress?
15:13:04 <ad_rien_> it may make sense, thanks
15:13:21 <dancn> basically it is a demo that shows some problems in edge applications with constraints, and some "workarounds"
15:13:43 <ad_rien_> still including OpenStack and Kubernetes?
15:14:20 <dancn> about 20 people showed up, interested in seeing something more than a research prototype
15:14:37 <dancn> yes, both k8s and OS are involved
15:14:43 <ad_rien_> ok
15:15:15 <ad_rien_> and did you see, during the Fog Congress event, other prototypes that can supervise/operate edge infrastructures?
15:15:26 <ad_rien_> or was/is your proposal the most mature?
15:15:39 <dancn> I was not there
15:15:42 <ad_rien_> ok
15:15:55 <ad_rien_> it will be great to have information about that
15:15:56 <ad_rien_> ok
15:16:04 <ad_rien_> any other points/aspects to share?
15:16:06 <dancn> we have not had an internal sync meeting yet, but as soon as we do I will share
15:16:32 <ad_rien_> ok
15:16:41 <ad_rien_> questions/remarks?
15:16:53 <ad_rien_> if not, I propose to continue
15:17:10 <ad_rien_> ok so next point
15:17:21 <ad_rien_> #topic ongoing-action-openstack++
15:17:44 <ad_rien_> no news from our side, we need time to make progress on that part. I hope that a subgroup will take on this action soon
15:17:56 <ad_rien_> #topic ongoing-action-P2P-openstack
15:18:15 <ad_rien_> regarding that point, we had a meeting last week with Ericsson folks. The meeting was fruitful from my viewpoint.
15:18:25 <ad_rien_> We presented the P2P architecture using CockroachDB
15:18:36 <ad_rien_> and identified some pros and cons. We will try to summarize that in a white paper
15:18:51 <oanson> Could you please refresh what's P2P?
15:19:18 <msimonin> I think what Adrien means is collaborative OpenStack
15:19:22 <ad_rien_> Meanwhile, Ericsson is progressing on an architecture that can enable several operators to share their resources
15:19:25 <ad_rien_> msimonin: thanks ;)
15:19:33 <rcherrueau> A BitTorrent-like OpenStack
15:19:45 <ad_rien_> P2P = Peer to Peer. sorry oanson
15:19:54 <rcherrueau> Many OpenStack instances that collaborate
15:20:01 <oanson> Ah. I see. Thanks!
15:20:16 <ad_rien_> in an edge infrastructure you will have several geo-distributed DCs; our goal is to be able to make several OpenStack instances collaborate
15:20:22 <ad_rien_> ok that's all
15:20:30 <ad_rien_> we have another meeting scheduled for next Friday
15:20:35 <ad_rien_> I hope we will progress
15:20:40 <ad_rien_> s/hope/I'm sure
15:20:42 <ad_rien_> ;)
15:20:51 <ad_rien_> any question?
15:21:03 <ad_rien_> #topic ongoing-action-AMQP
15:21:21 <ad_rien_> so ansmith kgiusti msimonin avankemp
15:21:26 <ad_rien_> the floorsis yours
15:21:37 <kgiusti> I defer to msimonin and avankemp as they are doing the heavy lifting...
15:21:39 <ad_rien_> (sorry for the typo, please)
15:22:10 <kgiusti> ...but we have been engaging in a weekly meeting re: the messaging test plan
15:22:35 <kgiusti> details in the etherpad, line 1583
15:22:47 <avankemp> the review has been merged (https://review.openstack.org/#/c/491818/)
15:23:16 <msimonin> So we are now running initial experiments regarding this test plan
15:23:29 <msimonin> on Grid'5000
15:23:50 <msimonin> We
15:23:57 <msimonin> We now have some results to analyse
15:24:39 <ad_rien_> looks great
15:24:40 <ad_rien_> ;)
15:24:41 <msimonin> As kgiusti said, we are iterating with kgiusti and ansmith on the code / experiment design
15:25:24 <msimonin> the initial tests involve about a thousand RPC clients and RPC servers
15:25:38 <msimonin> we target 10^4 in the mid term
15:26:29 <msimonin> I'm not sure if I need to dive more into details
15:26:46 <ad_rien_> that's ok from my side
15:27:00 <ad_rien_> any other comments from the Red Hat folks?
15:27:07 <ad_rien_> if not we can move to the next point.
15:27:36 <ad_rien_> #topic ongoing-action-cockroach
15:27:41 <kgiusti> we're good thanks
15:27:56 <ad_rien_> the floor is yours rcherrueau
15:28:00 <ad_rien_> ;-)
15:28:12 <rcherrueau> lemme keep it short
15:28:51 <rcherrueau> ad_rien_ talked about a collaborative OpenStack using CockroachDB
15:29:35 <rcherrueau> but you need to tweak the code to get the desired behaviour
15:30:04 <rcherrueau> Especially, there are Nova queries that don't work, as is, on CockroachDB
15:30:51 <rcherrueau> So I wrote a tool that extracts all SQL queries performed during unit/functional tests
15:32:13 <rcherrueau> The tool leverages the subunit streams produced by ostestr, so it should work for Keystone, Glance, ... every unit/functional test suite that uses ostestr.
15:32:52 <rcherrueau> the result of the extraction is available online
15:33:08 <rcherrueau> #link http://enos.irisa.fr/nova-sql/pike-nova-tests-unit.sql.json
15:33:26 <rcherrueau> #link http://enos.irisa.fr/nova-sql/pike-nova-tests-functional.sql.json
15:33:35 <rcherrueau> be careful, the files are huge
15:33:42 <ad_rien_> be careful these are huge files
15:33:46 <ad_rien_> (too slow)
15:33:56 <dancn> Length: 233394340 (223M) [application/json]
15:33:59 <dancn> :-)
15:35:07 <rcherrueau> The next step is running an analysis of the SQL queries to easily identify where we have to change the Nova code for the CockroachDB investigation
15:35:39 <rcherrueau> actually I already have an analysis that shows all correlated subqueries
15:35:46 <ad_rien_> Maybe you can also add that, by conducting this study, a few of our students discovered that there are a lot of SELECT 1 requests. While these requests are not so expensive in the context of a LAN deployment, they may become critical in a WAN context.
15:35:52 <rcherrueau> Lemme polish the code and then I will share it with you
15:36:02 <rcherrueau> that's all for me
15:36:07 <ad_rien_> thanks
15:36:09 <ad_rien_> questions?
15:36:11 <ad_rien_> comments?
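For readers who want to picture what these messaging experiments exercise, below is a minimal oslo.messaging RPC echo pair, a sketch only: the transport URL, topic name, payload size and function names are placeholders, not the parameters of the actual test plan merged in the review above.

```python
# Minimal oslo.messaging RPC echo pair, illustrating what a
# "thousands of RPC clients and servers" experiment stresses.
# Transport URL, topic and payload size are placeholders.
from oslo_config import cfg
import oslo_messaging

TRANSPORT_URL = "rabbit://guest:guest@localhost:5672/"  # or "amqp://..." for an AMQP 1.0 setup


class Echo(object):
    """RPC endpoint: returns whatever payload it receives."""

    def echo(self, ctxt, payload):
        return payload


def run_server():
    transport = oslo_messaging.get_transport(cfg.CONF, url=TRANSPORT_URL)
    target = oslo_messaging.Target(topic="bench", server="server-1")
    server = oslo_messaging.get_rpc_server(transport, target, [Echo()])
    server.start()
    server.wait()


def run_client(n_calls=1000):
    transport = oslo_messaging.get_transport(cfg.CONF, url=TRANSPORT_URL)
    client = oslo_messaging.RPCClient(transport, oslo_messaging.Target(topic="bench"))
    for _ in range(n_calls):
        client.call({}, "echo", payload="x" * 1024)  # synchronous round trip
```

Scaling such a pair to 10^3 or 10^4 processes is what puts pressure on the broker or router layer being compared in the test plan.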
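As a rough illustration of the extraction approach rcherrueau describes (reading the subunit v2 stream emitted by ostestr and collecting the SQL statements found in the attached test output), here is a hypothetical sketch; the file name, regex and class names are invented for the example and the real tool may work differently. The resulting JSON is the kind of output that can then be analysed, e.g. to spot the many SELECT 1 queries mentioned above.

```python
# Hypothetical sketch: scan a subunit v2 stream (e.g. produced by
# `ostestr --subunit > nova.subunit`) and dump every SQL-looking line
# found in the test attachments as JSON.
import json
import re
import sys

import subunit
import testtools

SQL_RE = re.compile(rb"\b(SELECT|INSERT|UPDATE|DELETE)\b.*", re.IGNORECASE)


class SQLCollector(testtools.StreamResult):
    """Accumulate every SQL-looking line attached to a test result."""

    def __init__(self):
        super(SQLCollector, self).__init__()
        self.queries = []

    def status(self, **kwargs):
        file_bytes = kwargs.get("file_bytes")
        if file_bytes:
            for line in file_bytes.splitlines():
                match = SQL_RE.search(line)
                if match:
                    self.queries.append(match.group(0).decode("utf-8", "replace"))


def main(path):
    collector = SQLCollector()
    with open(path, "rb") as stream:
        # non_subunit_name routes any raw output as an attachment instead of failing.
        subunit.ByteStreamToStreamResult(stream, non_subunit_name="stdout").run(collector)
    json.dump(collector.queries, sys.stdout, indent=2)


if __name__ == "__main__":
    main(sys.argv[1])  # e.g. python extract_sql.py nova.subunit
```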
15:37:13 <ad_rien_> ok
15:37:20 <ad_rien_> so I propose to move to the open discussion
15:37:33 <ad_rien_> and more precisely to the Dragonflow aspect
15:37:41 <ad_rien_> #topic open-discussions
15:37:54 <ad_rien_> regarding the PTG, I already mentioned that if you are interested please add your name
15:38:11 <ad_rien_> so oanson the floor is yours
15:38:16 <ad_rien_> ;-)
15:38:17 <oanson> Thanks
15:38:22 <ad_rien_> How do you want to proceed?
15:38:30 <ad_rien_> we have a webconference tool
15:38:32 <oanson> We can go over the short presentation I prepared
15:38:35 <oanson> The link is here: https://docs.google.com/presentation/d/1HdpS5FYyBnxNkarI356B7I8vwMrC4x76FMyHmU3Ml-c/edit?usp=sharing
15:38:37 <ad_rien_> or we can continue by IRC
15:38:58 <oanson> Let's go with IRC. I think just setting it up won't be worth the time
15:39:08 <ad_rien_> ok
15:39:10 <ad_rien_> let's go
15:39:24 <oanson> Basically, up to slide 5 is a very general introduction
15:39:33 <oanson> Slide 5 basically says:
15:39:50 <oanson> Cloud network solution: the general idea is a Neutron backend
15:40:12 <oanson> including taking care of connecting compute elements (VMs, containers, etc.), also across compute nodes.
15:40:33 <oanson> Fully distributed in that we try to avoid any bottlenecks, centralized elements, etc.
15:40:38 <ad_rien_> ok
15:41:11 <oanson> Fully open source in that everything is upstream, including design, implementation, strategies, etc. Being an OpenStack project, everything is on Gerrit.
15:41:18 <ad_rien_> ok
15:41:56 <oanson> Slides 6-7 go a bit more in depth on what networking we do. But I would like to skip ahead to the architecture (unless anyone has any preferences). We can always come back to it later
15:42:16 <ad_rien_> please go ahead
15:42:23 <ad_rien_> I think this is the right way to progress
15:42:25 <ad_rien_> ;-)
15:42:36 <oanson> So slide 9 shows the bird's-eye view of the architecture. The top-most node is the API layer - currently a Neutron server with Dragonflow plugins
15:42:54 <oanson> The Dragonflow elements here are ML2 plugins and service drivers
15:43:20 <oanson> These plugins only translate the Neutron objects to Dragonflow objects, and push them to the distributed database.
15:43:47 <oanson> The distributed database is in the centre in this schema, but in general it is distributed across the datacenter/cloud/deployment
15:44:19 <oanson> Each compute node (the satellites) pulls the information from the distributed database, and applies the policy locally.
15:44:46 <oanson> Additionally, there's also a pub/sub mechanism where the Neutron server publishes the policy, and the compute nodes subscribe to it.
15:45:17 <oanson> The compilation of the policy translates high-level objects (e.g., ports, networks, subnets, security groups) to OpenFlow rules.
15:45:42 <oanson> These rules are then installed in the (currently) OVS switch, which is the forwarding element for packets between compute elements
15:45:54 <oanson> That's it for this slide :)
15:45:57 <ansmith> oanson: is oslo.messaging used for pub/sub or a separate library?
15:46:06 <oanson> ansmith, a separate library.
15:46:18 <msimonin> oanson: what delivery guarantee (at-least / at-most / exactly once) is offered by the publish/subscribe?
15:46:41 <oanson> We have implemented pub/sub ourselves in a pluggable way, so we can utilize either distributed db-based pub/sub (as provided by Redis or Etcd) or use a broker-based one (currently ZMQ)
15:46:50 <oanson> msimonin, sorry?
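To make the slide 9 flow more concrete, here is a purely illustrative sketch of the pattern oanson describes: the plugin writes the translated object to the distributed database and publishes a change event, and the local controller on each compute node compiles it into OpenFlow rules. All names below (PubSubDriver, NbApi, LocalController, the db and ovs helpers) are invented for the example and are not Dragonflow's actual API.

```python
# Illustrative sketch only, not Dragonflow's real classes: DB is the source
# of truth, pub/sub propagates changes, the local controller programs OVS.
import abc
import json


class PubSubDriver(abc.ABC):
    """Pluggable pub/sub backend (Redis, etcd watches, ZMQ, ...)."""

    @abc.abstractmethod
    def publish(self, topic, event):
        """Send a small change notification to all subscribers."""

    @abc.abstractmethod
    def subscribe(self, topic, callback):
        """Invoke callback(event) for every notification on topic."""


class NbApi(object):
    """North-bound side: runs next to the Neutron server."""

    def __init__(self, db, pubsub):
        self.db = db          # client of the distributed key/value store
        self.pubsub = pubsub

    def create_port(self, df_port):
        # 1. The database is the source of truth ...
        self.db.put("port/" + df_port["id"], json.dumps(df_port))
        # 2. ... pub/sub is only an optimization to propagate the change quickly.
        self.pubsub.publish("ports", {"action": "create", "id": df_port["id"]})


class LocalController(object):
    """Runs on every compute node: turns DB objects into OpenFlow rules."""

    def __init__(self, db, pubsub, ovs):
        self.db = db
        self.ovs = ovs        # assumed helper wrapping flow installation in OVS
        pubsub.subscribe("ports", self.on_port_event)

    def on_port_event(self, event):
        port = json.loads(self.db.get("port/" + event["id"]))
        # Compile the high-level object into a flow and install it locally.
        self.ovs.add_flow(table=0,
                          match={"in_port": port["ofport"]},
                          actions=["set_metadata:" + port["network_id"], "resubmit(,10)"])
```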
15:47:19 <msimonin> What happens if messages cannot reach the satellite for a small period of time?
15:47:29 <ad_rien_> to all: may I ask you to add your questions to the etherpad
15:47:33 <ad_rien_> so we can keep track of them
15:47:34 <msimonin> can it lead to inconsistency in the satellite configuration?
15:47:44 <ad_rien_> and go through each of them
15:47:52 <oanson> The pub/sub mechanism is considered only an optimization. The 'gold standard' is what's in the database.
15:47:59 <ad_rien_> (just to facilitate the discussion, I think we all have a couple of questions)
15:48:18 <oanson> We rely on the distributed database to (eventually) contain the correct configuration, and therefore the correct configuration will eventually reach the satellites.
15:48:34 <msimonin> oanson: you mean that at any time an agent can get its whole configuration from the database?
15:48:47 <oanson> msimonin, yes. Exactly.
15:48:54 <msimonin> oanson: great, thanks
15:49:21 <oanson> So slides 10 and 11 show the architecture a bit more in depth. Slide 10 stresses DB and pub/sub pluggability.
15:49:33 <ad_rien_> oanson: would it be possible to go through slides 15 and after?
15:49:42 <oanson> Sure.
15:49:42 <ad_rien_> I think this is also an important aspect for the SiG
15:49:44 <ad_rien_> thanks
15:49:52 <oanson> Let me just say 2 words on slide 10:
15:49:58 <ad_rien_> sure
15:50:16 <oanson> Everything is pluggable, in that different db stores and pub/sub mechanisms can be used for each deployment, according to the deployment needs.
15:50:41 <oanson> e.g. in some cases Redis performs better than etcd (or vice versa), and in some cases you have to use ZMQ for pub/sub (which is db agnostic)
15:51:09 <oanson> Slides 15-16 show the two options we had in mind for fog/edge/massively distributed architectures.
15:51:46 <oanson> The first option is to treat each datacentre (small cloud in the diagrams) as a standalone deployment
15:52:10 <oanson> Then use cross-cloud connectivity solutions for the datacentres to communicate
15:52:36 <ad_rien_> like the Tricircle proposal, somehow
15:52:44 <oanson> Yes
15:52:47 <oanson> The second option (slide 16) is to have a single large Dragonflow deployment spread across multiple clouds, and use a tunneling solution to communicate
15:53:31 <oanson> The first option should be much easier to implement than the second. The only pro of the second (in my opinion) is that you have a single entity, and then, once it works, management should be easier.
15:53:51 <ad_rien_> just one point which is not clear
15:53:58 <oanson> Additionally, with a single entity more information is available for each controller, and perhaps better decision making is possible
15:54:03 <ad_rien_> why do you need to tunnel all traffic in the second possibility?
15:54:22 <ad_rien_> (sorry if my question looks naive but I'm not sure I'm understanding well)
15:54:34 <ad_rien_> in the context of a network operator
15:54:47 <oanson> ad_rien_, I'm guessing we would have to have an IPSec-like solution, since the traffic (probably) has to be encrypted.
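Going back to the consistency question earlier in the discussion, here is a minimal sketch, with hypothetical names, of how a satellite can periodically rebuild its configuration from the distributed database, so that lost pub/sub messages only delay convergence rather than breaking correctness; the db client and the apply/remove callbacks are assumptions made for the example.

```python
# Hypothetical sketch of the "database is the gold standard" behaviour:
# a periodic full sync against the distributed DB re-converges the local
# configuration even if pub/sub events were lost for a while.
import time


def reconcile(db, local_state, apply_fn, remove_fn, interval=30):
    """Periodically replace the locally applied policy with what the DB holds."""
    while True:
        desired = {key: db.get(key) for key in db.keys("port/*")}  # source of truth
        # Apply anything new or changed since the last sync.
        for key, obj in desired.items():
            if local_state.get(key) != obj:
                apply_fn(obj)
                local_state[key] = obj
        # Remove anything that no longer exists in the DB.
        for key in list(local_state):
            if key not in desired:
                remove_fn(local_state.pop(key))
        time.sleep(interval)
```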
15:54:47 <ad_rien_> you may consider having the mandatory ports opened
15:54:53 <ad_rien_> ok
15:55:01 <ad_rien_> so for data privacy/security isseus
15:55:04 <ad_rien_> issues
15:55:17 <oanson> ad_rien_, the tunnels exist because I assumed you want mesh connectivity - everyone can communicate with everyone
15:55:21 <ad_rien_> but this problem is already present in a traditional cloud system
15:55:27 <ad_rien_> yes
15:55:38 <oanson> And if you don't *have* to go through the main cloud, it's better to connect ad-hoc.
15:55:52 <ad_rien_> this is indeed an assumption we make for the P2P OpenStack scenario (at least right now, in order to make the complex story a bit easier)
15:56:03 <oanson> ad-hoc - in a mesh topology
15:56:04 <ad_rien_> ok
15:56:09 <ad_rien_> so 4 min left
15:56:20 <ad_rien_> may I propose to put all the questions you would like to discuss on the pad
15:56:25 <oanson> I think I went over the main points of the presentation.
15:56:31 <msimonin> thanks oanson !
15:56:33 <ad_rien_> and then we can discuss them during the next point
15:56:48 <oanson> I'll try to answer them on the pad as well
15:56:56 <ad_rien_> oanson: moreover, do you think it makes sense for you to take part in the brainstorming discussions related to the P2P OpenStack?
15:56:58 <rcherrueau> oanson: Thanks, any chance to implement Dragonflow in kolla-ansible?
15:57:08 <msimonin> rcherrueau: +1
15:57:10 <msimonin> :)
15:57:21 <oanson> rcherrueau, I think we already have a kolla image.
15:57:23 <ad_rien_> to be honest we are not yet at the level of the network (i.e. enabling communication between 2 VMs, each one being deployed on a dedicated site)
15:57:28 <oanson> ad_rien_, sure.
15:57:37 <ad_rien_> but if you think it is relevant from your side, it would be great to have you on board
15:57:48 <ad_rien_> ok
15:57:53 <ad_rien_> 3 min before ending the meeting
15:57:58 <oanson> ad_rien_, at the very least, it's important for us to know what problems exist to be solved
15:58:00 <rcherrueau> oanson: And in kolla-ansible, something like `enable_dragonflow: yes` or `neutron_plugin: dragonflow`
15:58:08 <ad_rien_> should we discuss any particular point?
15:58:18 <oanson> rcherrueau, I'll have to check. It was written by an external contributor
15:58:21 <msimonin> rcherrueau: +1000 :)
15:58:30 <rcherrueau> oanson: OK, thanks.
15:58:34 <ad_rien_> ok
15:58:43 <ad_rien_> parus:
15:58:49 <parus> Hello!
15:58:51 <ad_rien_> Hi
15:58:59 <ad_rien_> Unfortunately we are going to end the meeting
15:59:18 <ad_rien_> but I'm available on Skype if you want to discuss a bit
15:59:24 <ad_rien_> notes are on the etherpad too
15:59:28 <ad_rien_> so thanks all
15:59:32 <ad_rien_> and CU next meeting
15:59:33 <parus> Sorry about that.
15:59:34 <oanson> Thanks!
15:59:39 <ad_rien_> #endmeeting