00:05:47 <thinrichs> #startmeeting CongressTeamMeeting
00:05:48 <openstack> Meeting started Thu Jun 2 00:05:47 2016 UTC and is due to finish in 60 minutes. The chair is thinrichs. Information about MeetBot at http://wiki.debian.org/MeetBot.
00:05:49 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
00:05:51 <openstack> The meeting name has been set to 'congressteammeeting'
00:06:01 <thinrichs> masahito ekcs ramineni_: courtesy ping
00:06:09 <ramineni_> hi
00:06:15 <masahito> hi
00:06:18 <ekcs> hi all.
00:06:46 <thinrichs> I just have status updates on the agenda. Anything else?
00:07:55 <thinrichs> One other thing first: the gate
00:07:57 <thinrichs> #topic gate
00:08:20 <thinrichs> Looking at the reviews, we have a failure on a global requirements update.
00:08:28 <thinrichs> #link https://review.openstack.org/#/c/323883/
00:08:40 <thinrichs> One of the doctor driver tests is failing.
00:08:57 <thinrichs> Error says that it can't find a datasource.
00:09:26 <thinrichs> I wonder if this has to do with our translation of datasource IDs to names in the API.
00:09:43 <thinrichs> masahito: I added you as a reviewer since you're most familiar with that code
00:10:35 <masahito> thinrichs: yap, I'll check it.
00:10:49 <thinrichs> masahito: great!
00:11:26 <thinrichs> ramineni_: One of your patches is failing devstack too, but I couldn't seem to track down the error.
00:11:28 <thinrichs> #link https://review.openstack.org/#/c/320306/
00:12:10 <thinrichs> That patch has been ready to merge since last week
00:12:21 <thinrichs> so I'd be surprised if it had anything to do with the patch
00:12:27 <ramineni_> thinrichs: yes, changing to the rabbit driver is not working if started on multiple nodes
00:13:04 <thinrichs> ramineni_: I'd like to hear more on that.
00:13:06 <thinrichs> #topic status
00:13:06 <ramineni_> masahito raised a patch to solve the issue https://review.openstack.org/#/c/321459/
00:13:59 <thinrichs> masahito: I had a question on that patch similar to ramineni's…
00:14:23 <thinrichs> That patch assigns a different partition to every DSENode, right?
00:14:46 <masahito> yes
00:14:47 <thinrichs> Doesn't that stop them from communicating? They'd all be listening on different topics.
00:15:07 <masahito> yes
00:16:08 <ramineni_> thinrichs: ya, i think policies must be synchronized across nodes?
00:16:27 <ekcs> policy sync is done via DB which is independent.
00:16:44 <ramineni_> ekcs: datasource policies are not synchronized via DB
00:17:17 <ramineni_> ekcs: they are stored in mem right, not sure from the code why they are not stored in the DB
00:18:16 <ekcs> ramineni_: can you clarify what datasource policies need DSE2 to sync?
00:18:52 <thinrichs> masahito: so when we write a script that creates multiple DSENodes, we need to feed them all the same partition_id, right?
00:20:01 <thinrichs> masahito: (in a multi-node deployment where say the policy engine runs in one DseNode and the datasources run in a separate DseNode)
00:20:02 <masahito> thinrichs: with multiple PolicyEngine?
00:20:20 <masahito> oh
00:20:33 <masahito> in that case, yes.
00:20:52 <masahito> I forgot that usecase.
00:21:03 <ramineni_> ekcs: this is the bug we have now
00:21:06 <ramineni_> https://bugs.launchpad.net/congress/+bug/1585975
00:21:06 <openstack> Launchpad bug 1585975 in congress "Fails to start replica on rabbit bus" [Undecided,New] - Assigned to Masahito Muroi (muroi-masahito)
00:21:30 <ekcs> adding to what thinrichs is asking, sometimes we want multiple nodes to be on the same partition, sometimes we don’t. I wonder how we should control that. Right now it seems the patch is taking the node_id as partition_id. it’d be nice if we can separately specify partition_id, with the default being all on the same partition.
00:21:54 <masahito> But I'm not sure we allow users to deploy Congress as multiple DseNodes with one PE now.
00:22:54 <thinrichs> I'm confused why it matters how many PEs there are.
00:23:04 <ramineni_> PE1 and PE2 maintain their own datasource policies in memory,
00:23:06 <masahito> I understand the reason for ramineni_'s and thinrichs's questions.
00:23:10 <thinrichs> Maybe I'm missing something, so let me lay this out.
00:23:29 <thinrichs> Suppose we want PE to run on machine1 and all the datasources to run on machine2.
00:24:02 <thinrichs> Then on machine 1 we create a DseNode, and spin up the PE.
00:24:13 <thinrichs> And on machine 2 we create a DseNode and spin up all the datasources
00:24:36 <thinrichs> The way those 2 DseNodes communicate is via oslo-messaging, which means the DseNodes need to be talking on the same topic.
00:24:46 <thinrichs> So that would require them to be given the same partition_id
00:24:53 <thinrichs> Right?
00:24:55 <masahito> right.
00:25:17 <thinrichs> Today we don't have a mechanism/script for deploying that way, so it shouldn't be a problem.
00:25:42 <thinrichs> I would imagine this bug is due to the fact that the replica SHOULD be assigned a separate partition but it is not.
00:26:05 <ekcs> thinrichs: yes.
00:26:14 <masahito> yes
00:26:29 <thinrichs> ramineni_: does that sound right?
00:27:15 <ramineni_> thinrichs: confused :( if it's assigned a separate partition they can't listen on the same topic, right
00:28:09 <ramineni_> In case of HA we would want to listen on the same topic, so that any policy engine can serve the requests
00:28:10 <thinrichs> ramineni_: the replica is intended to be a totally separate instance of Congress …
00:28:12 <thinrichs> that does not communicate with the other version of Congress EXCEPT by synchronizing with the DB.
00:28:57 <thinrichs> ramineni_: We're not that far along yet.
Masahito's change is intended to just fix the existing HA test, not implement the HA/HT arch that ekcs is working on
00:29:36 <thinrichs> In effect, we want to handle 2 different use-cases for DseNode:
00:30:04 <thinrichs> (i) creating a new "root" DseNode that is isolated from all other "root" DseNodes
00:30:18 <thinrichs> (ii) creating a new DseNode that communicates with an existing "root" DseNode
00:30:43 <ramineni_> thinrichs: oh ok, got it
00:31:26 <thinrichs> masahito: does your change allow us to do both use-cases?
00:32:04 <masahito> no
00:32:23 <masahito> It allows only (i)
00:32:40 <masahito> I think we need additional work for (ii)
00:33:20 <thinrichs> I guess what we want is for the eventlet_server to take an argument that is the name of the DseNode partition
00:33:23 <ekcs> I think it’d be clearer to make it an explicit option to specify a different partition_id, rather than using the node_id as partition_id. Then we can also build on that to support (ii). but I’m also okay with this as a quick fix for now.
00:33:39 <thinrichs> ekcs: agreed
00:34:12 <thinrichs> I'd also like us to get a multi-node version of Congress working (the non-replicated kind)
00:34:51 <thinrichs> and have scripts/code/something so that we can arrange multiple DseNodes to implement the different HA/HT architectures we're designing
00:35:06 <thinrichs> That is, we should do some testing that multiple DseNodes on different machines actually work.
00:37:27 <thinrichs> And have a tempest test or two that spins up a multi-node version of Congress and runs tests
00:37:39 <masahito> makes sense
00:37:49 <thinrichs> That seems like the next milestone for the dist-arch
00:38:13 <thinrichs> Once that's working, we can begin adding the code for HA/HT
00:38:26 <ramineni_> yes
00:38:28 <ekcs> thinrichs: agreed.
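A minimal sketch of the two DseNode use-cases discussed above, assuming (as stated in the meeting) that nodes can talk only when they listen on the same topic and that the topic is determined by the partition_id. The names NodeConfig, DEFAULT_PARTITION, new_partition_id and can_communicate, and the topic string format, are hypothetical and are not taken from the Congress code or from oslo.messaging.

```python
import uuid
from dataclasses import dataclass

# Hypothetical default: every node lands on the same partition unless told otherwise.
DEFAULT_PARTITION = "default"


def new_partition_id() -> str:
    """Return a fresh partition id, for use-case (i): an isolated "root" node."""
    return uuid.uuid4().hex


@dataclass
class NodeConfig:
    node_id: str
    # Explicit option, kept separate from node_id as suggested in the discussion.
    partition_id: str = DEFAULT_PARTITION

    @property
    def topic(self) -> str:
        # Illustrative topic naming only; the real topic format may differ.
        return f"congress-dse.{self.partition_id}"


def can_communicate(a: NodeConfig, b: NodeConfig) -> bool:
    """Two nodes share a bus only if they listen on the same topic."""
    return a.topic == b.topic


# Use-case (ii): PE on machine1 and datasources on machine2 share a partition.
pe_node = NodeConfig(node_id="pe-node", partition_id="cluster-a")
ds_node = NodeConfig(node_id="ds-node", partition_id="cluster-a")

# Use-case (i): a replica gets its own partition and syncs only through the DB.
replica = NodeConfig(node_id="replica", partition_id=new_partition_id())
```

With this shape, a replica started for the HA test is isolated by construction, while a split PE/datasource deployment only has to pass the same partition_id to both processes.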
00:38:58 <thinrichs> We do have the script in scripts/start_process.py that was intended to start up several DseNodes
00:39:24 <thinrichs> That was written before we had DseNode, so it'll require some tweaking or rewriting
00:39:49 <thinrichs> But it's worth looking through at the very least, since I know it underwent several rounds of iteration
00:39:59 <ekcs> got it.
00:40:38 <thinrichs> Anyone want to spearhead writing the tempest-tests?
00:42:08 <ramineni_> thinrichs: i will look into it
00:42:42 <thinrichs> ramineni_: great! Pull us in as you need to.
00:42:58 <ramineni_> sure
00:43:27 <thinrichs> I'd even start by spinning up 2-3 processes on a single machine, where each process has its own DseNode.
00:43:51 <thinrichs> That could be the first tempest test.
00:44:06 <thinrichs> I don't know if tempest supports multi-node stuff, so that may be the only test.
00:44:23 <thinrichs> The script start_process.py was designed for the multi-process, single-node case.
00:44:27 <ramineni_> yes, that's what i'm thinking too
00:44:53 <ramineni_> the current HA test starts 2 processes with different node ids
00:46:02 <thinrichs> ramineni_: perfect, so we have a model to start from
00:46:23 <thinrichs> 15 minutes left. Let's discuss HA/HT.
00:46:32 <thinrichs> #topic High availability and throughput
00:46:47 <thinrichs> ekcs: how's that discussion going?
00:47:10 <ekcs> Good. I reworked HA spec with more details on the issues people are concerned about. Thanks to all the feedback, I think we're converging toward supporting active-active replication with symmetric nodes in the first phase. Please comment if you think we should take a different approach!
00:47:20 <ekcs> #link: https://review.openstack.org/#/c/318383/
00:47:43 <ekcs> I will be filling in the smaller details this week.
00:48:20 <ramineni_> I understand from the spec that we are going for active-active for PE and active-passive for the datasources node, right?
00:48:44 <ekcs> ramineni_: right.
00:50:01 <thinrichs> And then what for action execution? Leader election of some sort
00:50:07 <ramineni_> ekcs: but currently concentrating on all processes on a single node?
00:50:59 <ekcs> thinrichs: yes. not sure there is a consensus between local leader and global leader.
00:52:15 <ekcs> ramineni_: All services (PE, API, DSDs) in one single process node. Replicate that node N times, but disable DSDs on non-leaders.
00:52:53 <thinrichs> ekcs: "disable DSDs" means stop them from polling as well as taking them off the bus?
00:53:09 <ekcs> thinrichs: yes.
00:53:57 <ekcs> ramineni_: nodes will likely be replicated across different hosts.
00:54:15 <ekcs> ramineni_: up to the deployer but that’s something we need to support.
00:54:43 <ramineni_> ok
00:55:13 <ramineni_> is this leader election only for action requests?
00:55:30 <thinrichs> The crucial bit there seems to be ensuring that (a) Pacemaker pushes leader election info into Congress correctly and (b) Congress properly disables DSDs on non-leaders.
00:55:56 <ekcs> ramineni_: leader election is also needed for disabling DSDs on non-leaders.
00:56:04 <thinrichs> Do we know how Pacemaker picks a leader and how non-leaders find out they're non-leaders?
00:56:12 <ekcs> thinrichs: yes.
00:56:33 <thinrichs> (Running short on time.)
00:56:35 <thinrichs> ekcs: how
00:56:36 <thinrichs> ?
00:56:41 <ramineni_> ekcs : i thought on other nodes we could start only API and PE, not datasources
00:57:01 <ekcs> thinrichs: We write custom scripts (called resource agents) that Pacemaker calls to promote and demote.
00:57:17 <ramineni_> like proposed in this patch https://review.openstack.org/#/c/307693/
00:57:19 <ekcs> those scripts can make API calls to tell a node to be leader
00:57:52 <thinrichs> ekcs: so basically we'll need to add API calls that tell a DseNode to disable its datasources and to enable its action-execution
00:58:10 <thinrichs> That seems straightforward
00:58:12 <ekcs> yes.
pacemaker picks a leader based on underlying clustering as well as different weights you can configure.
00:58:47 <ekcs> underlying clustering provided by corosync or another service.
00:59:04 <ramineni_> ekcs: i think i'm confused on the leader approach, i will add my questions on the spec
00:59:47 <thinrichs> Think that seems good to me. Only downside I can see is that we're taking a dependency on Pacemaker even if the user doesn't need action-execution. Without action-execution, we can get by without leader election.
00:59:47 <ramineni_> ekcs: thanks for the very detailed spec, that's really helpful :)
00:59:48 <ekcs> ramineni_: yes that’s what we started out with. but then evolved to a new proposal. the spec discusses and compares both. look forward to your questions and comments!
01:00:33 <ekcs> thinrichs: yes. In the asymmetric node approach we don’t need a global leader, period. but it is more complex in other ways.
01:00:47 <thinrichs> But this way we'll support full functionality out of the box with HA/HT.
01:00:54 <ekcs> on the other hand, Pacemaker is essentially required and leader election comes at low marginal cost.
01:01:04 <ekcs> once you’ve set up pacemaker.
01:01:17 <thinrichs> We can provide an alternative deployment as an option later (e.g. for if you need HT datasources or don't need action-execution)
01:01:29 <thinrichs> Out of time for today.
01:01:32 <thinrichs> Thanks all!
01:01:36 <thinrichs> #endmeeting
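A minimal sketch of the promote/demote behaviour discussed under the HA/HT topic, assuming symmetric replicas where only the leader keeps its datasource drivers (DSDs) active and executes actions. ReplicaNode, promote, demote and apply_election are hypothetical names for illustration; they are not Congress API calls or Pacemaker resource-agent interfaces, and the election itself (done by Pacemaker in the discussion) is reduced to a leader name passed in.

```python
class ReplicaNode:
    """One symmetric replica: PE and API always run; DSDs run only on the leader."""

    def __init__(self, name, datasources):
        self.name = name
        self.datasources = list(datasources)
        self.is_leader = False
        self.active_datasources = set()  # DSDs currently polling and on the bus

    def promote(self):
        # What a hypothetical "promote" call would do: enable DSDs and actions.
        self.is_leader = True
        self.active_datasources = set(self.datasources)

    def demote(self):
        # What a hypothetical "demote" call would do: stop polling, leave the bus.
        self.is_leader = False
        self.active_datasources.clear()

    def can_execute_actions(self):
        return self.is_leader


def apply_election(nodes, leader_name):
    """Stand-in for the cluster manager telling each node its role."""
    for node in nodes:
        if node.name == leader_name:
            node.promote()
        else:
            node.demote()


nodes = [ReplicaNode(f"node{i}", ["nova", "neutron"]) for i in range(3)]
apply_election(nodes, "node1")   # node1 becomes leader
apply_election(nodes, "node2")   # failover: node2 takes over, node1 is demoted
```

The point of the sketch is the invariant raised in the meeting: after any election, exactly one node has active DSDs and action execution enabled.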