15:02:05 <msimonin> #startmeeting fog-edge-massively-distributed-clouds
15:02:07 <openstack> Meeting started Wed Jun 21 15:02:05 2017 UTC and is due to finish in 60 minutes. The chair is msimonin. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02:08 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:02:10 <openstack> The meeting name has been set to 'fog_edge_massively_distributed_clouds'
15:02:23 <msimonin> #chair parnexius
15:02:24 <openstack> Current chairs: msimonin parnexius
15:02:44 <msimonin> #topic roll call
15:02:48 <parnexius> Hello everyone
15:02:50 <samP> hi..o/
15:02:50 <msimonin> Hello folks !
15:03:07 <msimonin> Let's see who's around :)
15:03:08 <rcherrueau> o/
15:03:10 <ansmith> o/
15:03:16 <samP> o/
15:03:16 <kgiusti> o/
15:03:45 <pbressan> o/
15:04:12 <parnexius> Adrien has posted an agenda for the meeting but will not be joining us. Mat and I will chair today.
15:04:27 <msimonin> #link https://etherpad.openstack.org/p/massively_distributed_ircmeetings_2017
15:04:50 <msimonin> today we are at line 810
15:05:10 <serverascode> o/
15:05:20 <msimonin> Feel free to add your name
15:05:44 <msimonin> It seems we can start
15:05:57 <msimonin> #topic announcements
15:06:19 <msimonin> As parnexius said, _ad_rien_ is not available today
15:06:37 <msimonin> so I'll try to transmit his announcements :)
15:07:07 <msimonin> First thing is that we are in touch with ETSI MEC
15:07:13 <msimonin> #link http://www.etsi.org/technologies-clusters/technologies/multi-access-edge-computing
15:07:39 <msimonin> They have some architecture work done there
15:08:09 <msimonin> and we are currently discussing with them to see if OpenStack can be used in their architecture
15:08:20 <msimonin> at least identify some building blocks
15:09:05 <parnexius> Great idea. I have started using some of their material in the use case development.
15:09:10 <parnexius> More on that later.
15:09:15 <msimonin> cool !
15:10:11 <msimonin> I think we can move to the next topic
15:10:29 <msimonin> #topic AMQP alternatives
15:10:49 <msimonin> kgiusti ansmith any updates ?
15:10:49 <kgiusti> hey!
15:11:05 <kgiusti> kolla work is progressing nicely
15:11:31 <kgiusti> andy and I have started an epad to define tests of the messaging bus
15:11:41 <kgiusti> link in epad
15:12:07 <kgiusti> mostly focused on messaging testing
15:12:07 <msimonin> thanks for this proposal kgiusti
15:12:13 <kgiusti> at this point.
15:12:36 <kgiusti> cloud level messaging test tools explored a bit
15:13:10 <kgiusti> but really need help with defining the openstack cloud deployment arch to help define the best messaging topology to test
15:13:34 <kgiusti> err - "most appropriate messaging topology" for the cloud deployment
15:14:06 <msimonin> by topology: you mean where to put the routers and what links between them ?
15:14:15 <kgiusti> so please weigh in if you have any suggestions for messaging-oriented tests, failure scenarios, etc.
15:14:32 <kgiusti> msimonin: yeah - the optimal distribution of routers "under" the cloud
15:14:43 <parnexius> kgiusti: which link is that in the epad (the one on strawman proposal) ?
15:15:00 <kgiusti> parnexius: line 834
15:15:50 <parnexius> kgiusti: by router, you mean message router (like rabbitmq instances) ?
15:16:09 <rcherrueau> kgiusti: I guess we should test several topologies
15:16:26 <kgiusti> parnexius: in general yes, but that really depends on the tech used (rabbitmq/zeromq/router)
15:16:45 <rcherrueau> I mean, putting a router on all physical nodes vs a router on one leader node
15:16:46 <kgiusti> for example: rabbitmq would be limited to clustering for scale
15:17:03 <parnexius> kgiusti: so if I understand the point right, a router topology would include whether the router is deployed in the edge node, or only in a central location.
15:17:04 <kgiusti> rcherrueau: that's one scenario for the router, yes
15:17:23 <kgiusti> parnexius: yes - the router model is the most flexible,
15:17:27 <rcherrueau> meshing is also important I guess. Going with a full mesh between routers is not the same thing as going with a star mesh
15:17:42 <kgiusti> and thus most complex - we need to consider redundancy, locality of traffic, etc
15:18:13 <kgiusti> rcherrueau: right - for example, how many network or router failures should be tolerated?
15:18:53 <parnexius> Is dynamic routing available in these solutions?
15:19:02 <rcherrueau> I don't know, as much as OpenStack tolerates, no? :)
15:19:44 <parnexius> So high availability would need to be provided by the solution (RabbitMQ, ZeroMQ, ...)
15:19:46 <kgiusti> parnexius: yes - all routing is dynamic
15:20:15 <parnexius> kgiusti: can you please elaborate?
15:20:49 <parnexius> What would happen if a router on a node becomes unreachable?
15:20:58 <kgiusti> parnexius: sure - regarding dynamic routing I'm assuming you mean re-routing around network failures
15:21:27 <kgiusti> parnexius: if the clients have a fail-over router configured, that will be tried in recovery.
15:21:55 <parnexius> OK! I would call that static, but you are right, it is a little dynamic.
15:22:20 <kgiusti> parnexius: if there are redundant routers on the node (hot standby) those would take over (assuming redundant network paths)
15:23:17 <kgiusti> parnexius: topology is one factor, the other is over-provisioning for throughput
15:23:45 <kgiusti> parnexius: e.g. if we lose a rabbit in a cluster, how does that reduce overall capacity?
15:23:58 <msimonin> I think we need time to get through what is written on the epad
15:24:13 <kgiusti> msimonin: yep
15:24:16 <msimonin> and give you feedback
15:24:26 <kgiusti> msimonin: and questions! :)
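The full-mesh vs star trade-off raised above can be made concrete with a back-of-the-envelope sketch. This is a hypothetical helper for illustration only, not part of any tooling discussed in the meeting:

```python
def link_count(n_routers: int, topology: str) -> int:
    """Inter-router links needed to connect n_routers.

    full-mesh: every router pairs with every other router.
    star: every edge router links only to one central hub.
    """
    if topology == "full-mesh":
        return n_routers * (n_routers - 1) // 2
    if topology == "star":
        return n_routers - 1
    raise ValueError(f"unknown topology: {topology}")

# A star scales linearly but the hub is a single point of failure;
# a full mesh survives any single link loss but needs O(n^2) links.
print(link_count(10, "full-mesh"))  # 45
print(link_count(10, "star"))       # 9
```

This is why kgiusti calls the router model "most flexible and thus most complex": anything between those two extremes (partial meshes, redundant hubs) trades link count against fault tolerance.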
15:24:51 <msimonin> #info kgiusti shared a preliminary "message plane load testing" epad
15:24:58 <msimonin> #link https://etherpad.openstack.org/p/1BGhFHDIoi
15:25:07 <msimonin> (just recording some stuff :) )
15:25:25 <msimonin> Can we iterate on the mailing list ?
15:25:45 <msimonin> and/or directly on the document until next meeting
15:25:47 <kgiusti> msimonin: yes - I did send an email just before the meeting regarding this
15:26:04 <parnexius> msimonin: Good point. We should all take action to be more active outside of IRC sessions.
15:26:07 <kgiusti> msimonin: so folks not present can get the epad
15:27:11 <msimonin> kgiusti ansmith anything else on qpid dispatch router ?
15:27:17 <kgiusti> fyi: http://lists.openstack.org/pipermail/openstack-dev/2017-June/118716.html
15:27:29 <kgiusti> I'm good - andy?
15:27:33 <ansmith> good
15:28:58 <parnexius> Should we move to the next topic then
15:29:13 <msimonin> right parnexius :)
15:29:20 <msimonin> *was reading the ml :) *
15:29:23 <msimonin> so
15:29:26 <msimonin> next topic
15:29:32 <msimonin> #topic cockroachdb
15:30:00 <msimonin> So I would say that, similarly to the work on the messaging layer,
15:30:41 <msimonin> there has been a mail on the ml about using cockroachdb as an alternative backend for keystone
15:31:13 <msimonin> CockroachDB could be very interesting also in the context of a massively distributed cloud
15:32:00 <parnexius> Would that mean the database is replicated to all edge nodes?
15:32:03 <serverascode> cool, will have to read through those notes
15:32:26 <msimonin> we had a meeting with some cockroachdb folks last week
15:32:30 <msimonin> a very first contact
15:33:07 <msimonin> parnexius: the same as qpid, the topology will need to be defined
15:33:14 <rcherrueau> parnexius: not necessarily. With cockroachdb, you can tell which tuples should be replicated and which should not.
15:34:13 <msimonin> So
15:34:31 <parnexius> But this is not an opensource solution!
15:34:42 <rcherrueau> no, it is
15:35:13 <rcherrueau> Spanner is proprietary, CockroachDB is opensource
15:35:22 * rcherrueau checks the licence
15:35:25 <parnexius> OK. I see it now.
15:35:44 <msimonin> So at Inria we are very much interested in evaluating this
15:36:06 <msimonin> but we'll need support !
15:36:59 <rcherrueau> parnexius: Apache License, Version 2.0, for community
15:37:30 <msimonin> Changing the DB backend here will require resurrecting the postgresql driver, choosing a service (keystone, nova, glance ...) for a primary evaluation
15:37:32 <msimonin> and so on
15:37:52 <parnexius> Ceilometer ?
15:38:18 <msimonin> parnexius: yes, it could be as well
15:38:35 <msimonin> so there's a lot of exciting stuff to do
15:38:38 <msimonin> so my question is
15:38:52 <msimonin> how to make people as excited as I am about this ?
15:39:36 <serverascode> that's a good question, there will be a fair amount of pushback on using cockroachdb
15:39:46 <parnexius> msimonin: why are you excited?
15:40:02 <serverascode> mostly I think around who will actually do the work in terms of the openstack testing system
15:40:37 <msimonin> parnexius: roughly speaking, cockroach is noSQL with ACID properties
15:41:32 <parnexius> and why is ACID important to our usecases? would performance be more important?
15:42:38 <serverascode> I would imagine a lot of code would have to be rewritten if the db were not ACID
15:43:21 <msimonin> I should add that cockroach is compatible with sqlalchemy
15:43:28 <rcherrueau> parnexius: ACID or not, Cockroach speaks the Postgres protocol. That means -- in principle -- using it would be as simple as configuring oslo.db to connect to Cockroach
15:44:57 <rcherrueau> parnexius: The thing is, you have to implement ACID properties to implement the pgsql protocol
15:45:43 <msimonin> To conclude here
15:45:49 <parnexius> Great discussion: some requirements are popping up here.
15:46:28 <msimonin> parnexius: yes ?
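On rcherrueau's point that Cockroach speaks the Postgres wire protocol: in principle the only change is the oslo.db connection string in the service's config file. A minimal sketch, using only the standard library to show the URL shape; host, user, and database name are illustrative, and 26257 is CockroachDB's default SQL port:

```python
from urllib.parse import urlsplit

# Hypothetical keystone.conf fragment -- CockroachDB accepts the same
# postgresql:// URLs that oslo.db/SQLAlchemy already use:
#   [database]
#   connection = postgresql://root@db-host:26257/keystone
conn = "postgresql://root@db-host:26257/keystone"

# Decompose the URL the way a driver would:
parts = urlsplit(conn)
print(parts.scheme, parts.port, parts.path.lstrip("/"))
# postgresql 26257 keystone
```

Whether it is actually "as simple as" this in practice is exactly what the proposed evaluation (resurrecting the postgresql driver, picking a service) would establish.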
15:46:35 <parnexius> Performance, ACID, noSQL, pgsql
15:47:21 <parnexius> we need to get our heads together and define what is important to massively distributed.
15:47:56 <msimonin> right parnexius
15:48:16 <parnexius> please conclude.
15:48:55 <rcherrueau> parnexius: It would be cool if someone could look at the Keystone db API/db requests and say which parts require strong consistency (ACID)
15:49:26 <msimonin> I'll take an action on this
15:49:32 <msimonin> I'll ping keystone guys
15:49:53 <msimonin> #action msimonin get in touch with keystone about cockroach
15:49:58 <parnexius> you should look at Nova too.
15:50:18 <rcherrueau> If none require strong consistency, then we can go with NoSQL -- but this means we have to rewrite the db part of Keystone
15:50:22 <parnexius> keystone db tends to be centralized... while nova can be distributed.
15:50:27 <msimonin> the step after will be nova
15:50:49 <msimonin> the initial idea was from keystone
15:50:59 <parnexius> My point is we should focus on the aspects that are more relevant to massively distributed.
15:51:02 <serverascode> how can nova be distributed? you mean cells?
15:51:37 <msimonin> #topic use-case definitions/discussions
15:51:37 <parnexius> I mean that nova has components in each cell or edge node... while keystone is usually central to a region or the master cell.
15:51:54 <parnexius> Great transition.
15:52:08 <msimonin> parnexius: serverascode: let's see what the keystone guy had in mind when proposing cockroach
15:52:28 <parnexius> About use cases.
15:52:34 <msimonin> yes go on :)
15:52:37 <rcherrueau> parnexius: In multi-region scenarios, some also put keystone in each region and make it consistent using galera
15:52:59 <parnexius> I have uploaded some slides on google doc. They touch on this very topic.
15:53:33 <Nil_> very bad experience with galera cluster. MySQL NDB: better experience, but it requires more services and is complicated.
15:53:51 <parnexius> #link https://docs.google.com/presentation/d/1sBczuC2Wu1d_misBmPahLPdvyhOI4QuVy129EHunuUM/edit?usp=sharing
15:54:13 <parnexius> I will follow kgiusti's example and send it in an email.
15:54:16 <serverascode> many people have run galera successfully, and that is where some pushback will come from
15:54:28 <rcherrueau> Nil_: I have no doubt on that :)
15:54:31 <msimonin> parnexius: +1 for the ml
15:54:49 <serverascode> but the most I've seen anyone mention is 12 regions with a shared galera over all of them
15:55:13 <serverascode> whereas hopefully with something like cockroachdb we could do many more regions
15:55:14 <parnexius> I was hoping that I could get some input from the team, and the discussion on galera and cockroach is relevant.
15:55:37 <msimonin> too many interesting topics here :)
15:55:51 <parnexius> Let's move this discussion to the mailing list.
15:55:55 <Nil_> maybe CockroachDB, i read the github and very interesting...
15:55:57 <rcherrueau> serverascode: Galera is not an option in our usecases because of the WAN latency
15:56:18 <msimonin> parnexius: ok
15:56:35 <serverascode> I'm just letting you know the pushback we will get :)
15:56:44 <parnexius> is everyone interested in this discussion, or should we create a subteam?
15:57:17 <msimonin> actually I think we should move this discussion to the beginning next time :)
15:57:29 <parnexius> msimonin: please give me the action to take the discussion to the ML, and let's move to close the meeting.
15:57:33 <msimonin> sure
15:57:56 <msimonin> #action parnexius starts a thread on the ml about massively distributed use cases
15:58:09 <msimonin> We have 2 min left :(
15:58:17 <msimonin> #topic open discussion
15:58:30 <pbressan> just leaving a question for next session: is there discussion about how to implement distributed UCs in terms of networking ?
15:58:46 <msimonin> UCs ?
15:58:48 <pbressan> ml2/3, tricircle, etc ?
15:58:55 <samP> just a quick intro.. from LCOO
15:58:57 <parnexius> UC = use cases.
15:59:01 <msimonin> kk :)
15:59:02 <samP> Hi, I’m Sampath from NTT, one of the persons driving Extreme testing (Destructive Testing) in LCOO with jamemcc
15:59:33 <msimonin> sorry guys
15:59:34 <msimonin> =)
15:59:50 <msimonin> let's try to iterate as much as possible between meetings
15:59:56 <msimonin> on the ml for example
15:59:59 <parnexius> +1
16:00:01 <msimonin> #endmeeting
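An editorial footnote on the galera-over-WAN exchange above: Galera replication is synchronous (certification-based), so every commit waits on group communication with all nodes, which puts a hard floor under write latency in a multi-region cluster. A minimal sketch of that lower bound; the RTT figures are illustrative, not measurements:

```python
# Galera broadcasts each writeset to every node at commit time, so
# commit latency is bounded below by the round-trip time to the
# farthest replica. RTT values below are illustrative only.
rtt_ms = {
    "paris<->frankfurt": 12,
    "paris<->oregon": 140,
    "paris<->sydney": 280,
}

# Lower bound on commit latency for a cluster spanning all regions:
min_commit_ms = max(rtt_ms.values())
print(min_commit_ms)  # 280
```

This is the substance of rcherrueau's objection, and part of why a quorum-based store like CockroachDB (which waits on a majority of replicas rather than all of them) looks attractive for massively distributed deployments.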