#openstack-meeting log

00:05:47 <thinrichs> #startmeeting CongressTeamMeeting
00:05:48 <openstack> Meeting started Thu Jun  2 00:05:47 2016 UTC and is due to finish in 60 minutes.  The chair is thinrichs. Information about MeetBot at http://wiki.debian.org/MeetBot.
00:05:49 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
00:05:51 <openstack> The meeting name has been set to 'congressteammeeting'
00:06:01 <thinrichs> masahito ekcs ramineni_: courtesy ping
00:06:09 <ramineni_> hi
00:06:15 <masahito> hi
00:06:18 <ekcs> hi all.
00:06:46 <thinrichs> I just have status updates on the agenda.  Anything else?
00:07:55 <thinrichs> One other thing first: the gate
00:07:57 <thinrichs> #topic gate
00:08:20 <thinrichs> Looking at the reviews, we have a failure on a global requirements update.
00:08:28 <thinrichs> #link https://review.openstack.org/#/c/323883/
00:08:40 <thinrichs> One of the doctor driver tests is failing.
00:08:57 <thinrichs> Error says that it can't find a datasource.
00:09:26 <thinrichs> I wonder if this has to do with our translation of datasource IDs to names in the API.
00:09:43 <thinrichs> masahito: I added you as a reviewer since you're most familiar with that code
00:10:35 <masahito> thinrichs: yap, I'll check it.
00:10:49 <thinrichs> masahito: great!
00:11:26 <thinrichs> ramineni_: One of your patches is failing devstack too, but I couldn't seem to track down the error.
00:11:28 <thinrichs> #link https://review.openstack.org/#/c/320306/
00:12:10 <thinrichs> That patch has been ready to merge since last week
00:12:21 <thinrichs> so I'd be surprised if it had anything to do with the patch
00:12:27 <ramineni_> thinrichs: yes, changing to the rabbit driver is not working if started on multiple nodes
00:13:04 <thinrichs> ramineni_: I'd like to hear more on that.
00:13:06 <thinrichs> #topic status
00:13:06 <ramineni_> masahito raised a patch to solve the issue https://review.openstack.org/#/c/321459/
00:13:59 <thinrichs> masahito: I had a question on that patch similar to ramineni's…
00:14:23 <thinrichs> That patch assigns a different partition to every DSENode, right?
00:14:46 <masahito> yes
00:14:47 <thinrichs> Doesn't that stop them from communicating?  They'd all be listening on different topics.
00:15:07 <masahito> yes
00:16:08 <ramineni_> thinrichs: ya, i think policies must be synchronized across nodes?
00:16:27 <ekcs> policy synch is done via DB which is independent.
00:16:44 <ramineni_> ekcs: datasource policies are not synchronized via DB
00:17:17 <ramineni_> ekcs: they are stored in memory, right? not sure why the code doesn't store them in the DB
00:18:16 <ekcs> ramineni_: can you clarify what datasource policies need DSE2 to sync?
00:18:52 <thinrichs> masahito: so when we write a script that creates multiple DSENodes, we need to feed them all the same partition_id, right?
00:20:01 <thinrichs> masahito: (in a multi-node deployment where say the policy engine runs in one DseNode and the datasources run in a separate DseNode)
00:20:02 <masahito> thinrichs: with multiple PolicyEngine?
00:20:20 <masahito> oh
00:20:33 <masahito> in that case, yes.
00:20:52 <masahito> I forgot that usecase.
00:21:03 <ramineni_> ekcs: this is the bug we have now
00:21:06 <ramineni_> https://bugs.launchpad.net/congress/+bug/1585975
00:21:06 <openstack> Launchpad bug 1585975 in congress "Fails to start replica on rabbit bus" [Undecided,New] - Assigned to Masahito Muroi (muroi-masahito)
00:21:30 <ekcs> adding to what thinrichs is asking, sometimes we want multiple nodes to be on the same partition, sometimes we don’t. I wonder how we should control that. Right now it seems the patch is taking the node_id as partition_id. it’d be nice if we can separately specify partition_id, with the default being all on the same partition.
00:21:54 <masahito> But I'm not sure we allow users to deploy Congress with multiple DseNodes and one PE now.
00:22:54 <thinrichs> I'm confused why it matters how many PEs there are.
00:23:04 <ramineni_> PE1 and PE2 maintain their own datasource policies in memory,
00:23:06 <masahito> I understand the reason for ramineni_'s and thinrichs's questions.
00:23:10 <thinrichs> Maybe I'm missing something, so let me lay this out.
00:23:29 <thinrichs> Suppose we want PE to run on machine1 and all the datasources to run on machine2.
00:24:02 <thinrichs> Then on machine 1 we create a DseNode, and spin up the PE.
00:24:13 <thinrichs> And on machine 2 we create a DseNode and spin up all the datasources
00:24:36 <thinrichs> The way those 2 DseNodes communicate is via oslo-messaging, which means the DseNodes need to be talking on the same topic.
00:24:46 <thinrichs> So that would require them to be given the same partition_id
00:24:53 <thinrichs> Right?
00:24:55 <masahito> right.
00:25:17 <thinrichs> Today we don't have a mechanism/script for deploying that way, so it shouldn't be a problem.
00:25:42 <thinrichs> I would imagine this bug is due to the fact that the replica SHOULD be assigned a separate partition but it is not.
00:26:05 <ekcs> thinrichs: yes.
00:26:14 <masahito> yes
00:26:29 <thinrichs> ramineni_: does that sound right?
00:27:15 <ramineni_> thinrichs: confused :( if it's assigned a separate partition they can't listen on the same topic, right
00:28:09 <ramineni_> In case of HA we would want to listen on the same topic, so that any policy engine can serve the requests
00:28:10 <thinrichs> ramineni_: the replica is intended to be a totally separate instance of Congress …
00:28:12 <thinrichs> that does not communicate with the other version of Congress EXCEPT by synchronizing with the DB.
00:28:57 <thinrichs> ramineni_: We're not that far along yet.  Masahito's change is intended to just fix the existing HA test, not implement the HA/HT arch that ekcs is working on
00:29:36 <thinrichs> In effect, we want to handle 2 different use-cases for DseNode:
00:30:04 <thinrichs> (i) creating a new "root" DseNode that is isolated from all other "root" DseNodes
00:30:18 <thinrichs> (ii) creating a new DseNode that communicates with an existing "root" DseNode
00:30:43 <ramineni_> thinrichs: oh ok, got it
00:31:26 <thinrichs> masahito: does your change allow us to do both use-cases?
00:32:04 <masahito> no
00:32:23 <masahito> It allows only (i)
00:32:40 <masahito> I think we need additional work for (ii)
00:33:20 <thinrichs> I guess what we want is for the eventlet_server to take an argument that is the name of the DseNode partition
00:33:23 <ekcs> I think it’d be clearer to make it an explicit option to specify a different partition_id, rather than using the node_id as partition_id. Then we can also build on that to support (ii). but I’m also okay with this as a quick fix for now.
00:33:39 <thinrichs> ekcs: agreed
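The partition discussion above can be sketched in a few lines. This is only an illustration under stated assumptions, not Congress code: NodeConfig, DEFAULT_PARTITION, the topic string format and can_communicate are all hypothetical names. It shows an explicit, optional partition_id that defaults to one shared partition, so that use-case (ii) works out of the box and use-case (i) is an opt-in.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default: every node that does not ask for isolation
# ends up on the same partition and can therefore talk to the others.
DEFAULT_PARTITION = "default"


@dataclass(frozen=True)
class NodeConfig:
    """Hypothetical per-node settings; not the real DseNode signature."""
    node_id: str
    partition_id: Optional[str] = None

    def topic(self) -> str:
        # Assumed topic naming scheme: nodes exchange messages only
        # when they derive the same topic from their partition.
        partition = self.partition_id or DEFAULT_PARTITION
        return f"congress-dse.{partition}"


def can_communicate(a: NodeConfig, b: NodeConfig) -> bool:
    """Two nodes can talk only if they listen on the same topic."""
    return a.topic() == b.topic()


# Use-case (ii): PE node and datasource node share the default partition.
pe_node = NodeConfig(node_id="pe-node")
ds_node = NodeConfig(node_id="ds-node")

# Use-case (i): an isolated replica gets its own explicit partition.
replica = NodeConfig(node_id="replica", partition_id="replica-1")
```

With this shape, using node_id as partition_id (the quick fix discussed above) is just the special case where each node passes its own node_id as partition_id.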
00:34:12 <thinrichs> I'd also like us to get a multi-node version of Congress working (the non-replicated kind)
00:34:51 <thinrichs> and have scripts/code/something so that we can arrange multiple DseNodes to implement the different HA/HT architectures we're designing
00:35:06 <thinrichs> That is, we should do some testing that multiple DseNodes on different machines actually works.
00:37:27 <thinrichs> And have a tempest test or two that spins up a multi-node version of Congress and runs tests
00:37:39 <masahito> makes sense
00:37:49 <thinrichs> That seems like the next milestone for the dist-arch
00:38:13 <thinrichs> Once that's working, we can begin adding the code for HA/HT
00:38:26 <ramineni_> yes
00:38:28 <ekcs> thinrichs: agreed.
00:38:58 <thinrichs> We do have the script in scripts/start_process.py that was intended to start up several DseNodes
00:39:24 <thinrichs> That was written before we had DseNode, so it'll require some tweaking or rewriting
00:39:49 <thinrichs> But it's worth looking through at the very least, since I know it underwent several rounds of iteration
00:39:59 <ekcs> got it.
00:40:38 <thinrichs> Anyone want to spearhead writing the tempest-tests?
00:42:08 <ramineni_> thinrichs: i will look into it
00:42:42 <thinrichs> ramineni_: great!  Pull us in as you need to.
00:42:58 <ramineni_> sure
00:43:27 <thinrichs> I'd even start by spinning up 2-3 processes on a single machine, where each process has its own DseNode.
00:43:51 <thinrichs> That could be the first tempest test.
00:44:06 <thinrichs> I don't know if tempest supports multi-node stuff, so that may be the only test.
00:44:23 <thinrichs> The script start_process.py was designed for the multi-process, single-node case.
00:44:27 <ramineni_> yes, that's what I'm thinking too
00:44:53 <ramineni_> the current HA test starts 2 processes with different node ids
00:46:02 <thinrichs> ramineni_: perfect—so we have a model to start from
00:46:23 <thinrichs> 15 minutes left.  Let's discuss HA/HT.
00:46:32 <thinrichs> #topic High availability and throughput
00:46:47 <thinrichs> ekcs: how's that discussion going?
00:47:10 <ekcs> Good. I reworked HA spec with more details on the issues people are concerned about. Thanks to all the feedback, I think we're converging toward supporting active-active replication with symmetric nodes in the first phase. Please comment if you think we should take a different approach!
00:47:20 <ekcs> #link: https://review.openstack.org/#/c/318383/
00:47:43 <ekcs> I will be filling in the smaller details this week.
00:48:20 <ramineni_> I understand from spec that we are going for active-active for PE and active-passive for datasources node right?
00:48:44 <ekcs> ramineni_: right.
00:50:01 <thinrichs> And then what for action execution?  Leader election of some sort
00:50:07 <ramineni_> ekcs: but currently concentrating on all processes on a single node?
00:50:59 <ekcs> thinrichs: yes. not sure there is a consensus between local leader and global leader.
00:52:15 <ekcs> ramineni_: All services (PE, API, DSDs) in one single process node. Replicate that node N times, but disable DSDs on non-leaders.
00:52:53 <thinrichs> ekcs: "disable DSDs" means stop them from polling as well as taking them off the bus?
00:53:09 <ekcs> thinrichs: yes.
00:53:57 <ekcs> ramineni_: nodes will likely be replicated across different hosts.
00:54:15 <ekcs> ramineni_: up to the deployer but that’s something we need to support.
00:54:43 <ramineni_> ok
00:55:13 <ramineni_> this leader election only for action requests?
00:55:30 <thinrichs> The crucial bit there seems to be ensuring that (a) Pacemaker pushes leader election info into Congress correctly and (b) Congress properly disables DSDs on non-leaders.
00:55:56 <ekcs> ramineni_: leader election needed also for disabling DSDs on non-leaders.
00:56:04 <thinrichs> Do we know how Pacemaker picks a leader and how non-leaders find out they're non-leaders?
00:56:12 <ekcs> thinrichs: yes.
00:56:33 <thinrichs> (Running short on time.)
00:56:35 <thinrichs> ekcs: how
00:56:36 <thinrichs> ?
00:56:41 <ramineni_> ekcs : i thought on other nodes we could start only API and PE not datasources
00:57:01 <ekcs> thinrichs: We write custom scripts (called resource agents) that Pacemaker calls to promote and demote.
00:57:17 <ramineni_> like proposed in this patch https://review.openstack.org/#/c/307693/
00:57:19 <ekcs> those scripts can make API calls to tell a node to be leader
00:57:52 <thinrichs> ekcs: so basically we'll need to add API calls that tell a DseNode to disable its datasources and to enable its action-execution
00:58:10 <thinrichs> That seems straightforward
00:58:12 <ekcs> yes. pacemaker picks a leader based on underlying clustering as well as different weights you can configure.
00:58:47 <ekcs> underlying clustering provided by corosync or another service.
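The promote/demote flow described above can be sketched as follows. This is a toy model under stated assumptions, not the Congress API or a Pacemaker resource agent: ReplicaNode, promote, demote and apply_election are hypothetical names. It only shows the intended invariant: PE and API stay active on every replica, while datasource drivers (DSDs) and action execution are enabled on the elected leader alone.

```python
from typing import Iterable


class ReplicaNode:
    """Hypothetical symmetric replica; PE and API are always active."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.is_leader = False
        # On non-leaders the DSDs neither poll nor sit on the bus.
        self.datasources_enabled = False
        self.action_execution_enabled = False

    def promote(self) -> None:
        # What a "promote" call from a resource agent would trigger.
        self.is_leader = True
        self.datasources_enabled = True
        self.action_execution_enabled = True

    def demote(self) -> None:
        # What a "demote" call from a resource agent would trigger.
        self.is_leader = False
        self.datasources_enabled = False
        self.action_execution_enabled = False


def apply_election(nodes: Iterable[ReplicaNode], leader_id: str) -> None:
    """Stand-in for the cluster manager pushing its decision to each node."""
    for node in nodes:
        if node.node_id == leader_id:
            node.promote()
        else:
            node.demote()


nodes = [ReplicaNode("n1"), ReplicaNode("n2"), ReplicaNode("n3")]
apply_election(nodes, leader_id="n2")
```

Calling apply_election again with a different leader_id models a failover: the old leader is demoted and exactly one node has its datasources enabled afterwards.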
00:59:04 <ramineni_> ekcs: i think im confused on the leader approach, i will add my questions on the spec
00:59:47 <thinrichs> I think that seems good to me.  Only downside I can see is that we're taking a dependency on Pacemaker even if the user doesn't need action-execution.  Without action-execution, we can get by without leader election.
00:59:47 <ramineni_> ekcs: thanks for the very detailed spec, that's really helpful :)
00:59:48 <ekcs> ramineni_: yes that’s what we started out with. but then evolved to a new proposal. the spec discusses and compares both. look forward to your questions and comments!
01:00:33 <ekcs> thinrichs: yes. In the asymmetric node approach we don’t need a global leader, period. but it is more complex in other ways.
01:00:47 <thinrichs> But this way we'll support full functionality out of the box with HA/HT.
01:00:54 <ekcs> on the other hand, Pacemaker is essentially required and leader election comes at low marginal cost.
01:01:04 <ekcs> once you’ve set up pacemaker.
01:01:17 <thinrichs> We can provide an alternative deployment as an option later (e.g. for if you need HT datasources or don't need action-execution)
01:01:29 <thinrichs> Out of time for today.
01:01:32 <thinrichs> Thanks all!
01:01:36 <thinrichs> #endmeeting