13:06:06 <joehuang> #startmeeting tricircle
13:06:07 <openstack> Meeting started Wed Aug 26 13:06:06 2015 UTC and is due to finish in 60 minutes. The chair is joehuang. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:06:08 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:06:10 <openstack> The meeting name has been set to 'tricircle'
13:06:50 <joehuang> #topic rollcall
13:07:02 <gampel> #info gampel
13:07:05 <joehuang> #info joehuang
13:07:31 <saggi> #info saggi
13:08:03 <irenab> hi
13:08:26 <joehuang> hi irena, rollcall now
13:08:29 * irenab will attend partially, has a conflicting meeting
13:08:45 <joehuang> understood, thanks
13:08:54 <joehuang> hi zhiyuan
13:08:58 <zhiyuan_> hi joe
13:09:07 <irenab> #info irenab
13:09:13 <zhiyuan_> #info zhiyuan
13:09:23 <joehuang> #topic recent progress
13:09:56 <gampel> saggi, can you please explain what you did in nova and why
13:10:11 <joehuang> we have a design meeting this Monday about network connectivity
13:10:23 <joehuang> yes, please
13:10:50 <saggi> As I spoke about in the previous meeting, I thought up a way to implement what we need without changing nova core code.
13:11:08 <joehuang> how
13:11:25 <saggi> What I did was hook up the scheduler and have the cascade_service appear to be multiple nova-compute hosts
13:11:26 <joehuang> sorry, not shown in last meeting
13:11:40 <saggi> joehuang: Takes time to type :)
13:12:29 <saggi> So the general idea is that when the user wants to run a VM, we get the scheduling information in the cascade_service since it registers as the scheduler. Look at the AZ and return the node_name of the site.
13:12:42 <saggi> In the cascade service we have a compute_service per site
13:13:05 <saggi> so the cascade service always gets the request.
13:13:10 <joehuang> the cascade service as a scheduler?
13:13:20 <saggi> and multiple compute nodes :)
13:13:37 <saggi> can you guys connect to imgur or is it blocked?
13:13:52 <joehuang> what's imgur
13:14:34 <zhiyuan_> i can access
13:14:45 <zhiyuan_> http://imgur.com/ this url, right?
13:14:49 <saggi> yes
13:14:51 <joehuang> which cascade service node will be called for reboot/etc. VM operations?
13:15:29 <saggi> http://i.imgur.com/za5kZpy.png
13:15:34 <joehuang> ok, I can access too
13:15:58 <saggi> In this case I have 2 fake sites and they look like two compute hosts
13:16:36 <saggi> They are all actually the cascade service
13:16:54 <saggi> in the hypervisor view you can see that they are cascade sites http://i.imgur.com/tNWeDIn.png
13:16:58 <joehuang> go on please
13:18:36 <saggi> This is what I have ATM. You can register as many sites as you want and they will appear as compute hosts.
13:18:56 <saggi> And you can control all the stats from the cascade service
13:19:10 <joehuang> shall the cascade service collect resource usage from the bottom OpenStacks?
13:19:18 <saggi> Yes
13:19:41 <saggi> We will use the site aggregate stats as the host stats
13:20:12 <saggi> What I need to add next is the actual scheduler logic that uses AZs to select the host
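[To make the scheme saggi describes concrete: the cascade service registers as the scheduler and exposes one fake compute host per site, so AZ-based scheduling reduces to mapping the requested AZ to that site's node_name. A minimal sketch follows; the class names, the simplified select_destinations signature, and the request_spec shape are all illustrative assumptions, not the actual patch.]

class Site(object):
    """A bottom OpenStack instance, exposed to Nova as one fake compute host."""

    def __init__(self, name, availability_zone):
        self.name = name  # the fake node_name Nova sees
        self.availability_zone = availability_zone


class CascadeScheduler(object):
    """Stands in for the normal scheduler driver.

    Because the cascade service is both the scheduler and the fake
    compute hosts, scheduling information never has to leave the
    cascade layer.
    """

    def __init__(self, sites):
        # one fake compute host (and compute_service) per bottom site
        self.sites_by_az = {s.availability_zone: s for s in sites}

    def select_destinations(self, request_spec):
        """Look at the requested AZ and return the site's fake node_name."""
        az = request_spec.get('availability_zone')
        site = self.sites_by_az.get(az)
        if site is None:
            raise RuntimeError('no cascade site registered for AZ %r' % az)
        # Nova then sends the build request to this fake host, which is
        # actually the cascade service itself.
        return [site.name]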
13:20:44 <joehuang> so one cascade service will handle one bottom openstack?
13:21:10 <saggi> no, one cascade service will handle N bottom openstacks
13:21:33 <saggi> At the start we will have only one cascade service
13:21:52 <joehuang> only one cascade service?
13:22:01 <saggi> yes
13:22:09 <saggi> For now
13:22:23 <saggi> The design allows for more.
13:22:25 <zhiyuan_> where is the compute service running? I think for each bottom OS we need one compute service
13:22:25 <joehuang> availability needs to be taken into consideration
13:22:41 <saggi> The compute service isn't running anywhere. It's fake.
13:22:46 <gampel> first we need to establish the flow end to end
13:23:35 <saggi> It represents a whole site.
13:23:54 <gampel> we are not sure that the cascading service will be the bottleneck; it depends on the runtime info module (push, pull) and who will do that job
13:24:23 <saggi> The design allows distributing the fake hosts across multiple cascade services.
13:24:57 <saggi> But we don't want to start coding all the synchronization that this requires just now
13:25:04 <joehuang> do you mean two nodes for one bottom openstack?
13:25:27 <joehuang> making the flow work is comparatively simple
13:25:33 <gampel> No, every cascade service handles one or more bottom sites
13:26:00 <zhiyuan_> ic, so there is no rpc between the scheduler and the compute service, just function calls?
13:26:07 <saggi> You need to have a single point handling requests for a single site to have correct ordering of operations.
13:26:32 <saggi> there is no communication between them in nova.
13:26:53 <saggi> But because we are both the scheduler and the compute host, we can pass information between them in the cascade layer.
13:27:11 <joehuang> so how to forward an RPC call like reboot VM from the API to the cascade service?
13:27:13 <saggi> So that we don't lose the scheduling information when passing the create call down to the bottom OS
13:27:38 <saggi> Nova will contact the fake host. Which is the cascade service itself.
13:28:58 <joehuang> ok, the RPC call will be forwarded to a fixed fake node, right?
13:29:36 <joehuang> that means if you add more cascade services
13:29:39 <saggi> yes, which is just an instance inside the cascade service.
13:29:54 <joehuang> the RPC call will still be forwarded to the same fake node
13:30:08 <saggi> Yes, since it's the one managing that host
13:30:36 <joehuang> then how to scale out?
13:31:10 <joehuang> and if this fake node failed
13:31:35 <joehuang> which cascade service node will be selected for the bottom openstack?
13:31:48 <saggi> The scheduler tells nova what fake host to use.
13:31:54 <joehuang> and how to redirect the API rpc call to the new cascade service node?
13:32:03 <saggi> This makes nova contact the correct cascade_service
13:32:10 <saggi> this allows you to scale out
13:32:47 <joehuang> but in the database all VMs have already been allocated to the fake node
13:33:25 <joehuang> if the cascade service will act as the new fake node (with the same old name)
13:33:38 <saggi> yes
13:33:45 <saggi> as for redundancy, you could have an active-passive setup where cascade services spin up a fake node on another cascade service and it will handle the requests.
13:34:06 <saggi> Spinning up a fake node is just listening on the proper queue
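[A rough illustration of "spinning up a fake node is just listening on the proper queue", using oslo.messaging. The FakeComputeEndpoint class, the method it exposes, and the exact topic/server naming are assumptions for illustration only; the real patch may differ.]

from oslo_config import cfg
import oslo_messaging


class FakeComputeEndpoint(object):
    """Handles the compute RPC calls Nova sends to one fake host."""

    def __init__(self, site_name):
        self.site_name = site_name

    def reboot_instance(self, ctxt, instance, **kwargs):
        # Instead of touching a local hypervisor, forward the operation
        # to the bottom OpenStack that this fake host represents.
        print('forwarding reboot of %s to site %s' % (instance, self.site_name))


def spin_up_fake_node(site_name):
    transport = oslo_messaging.get_rpc_transport(cfg.CONF)
    # Nova addresses a compute host by topic plus server name, so a
    # standby cascade service takes over a site simply by starting an
    # RPC server on the same topic/server pair.
    target = oslo_messaging.Target(topic='compute', server=site_name)
    server = oslo_messaging.get_rpc_server(
        transport, target, [FakeComputeEndpoint(site_name)],
        executor='threading')
    server.start()
    return server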
13:34:07 <gampel> we do not see a problem with HA/scaling in this design
13:34:15 <gampel> I think we need to agree that HA is in the design but will be handled after we have the end-to-end flow
13:34:19 <saggi> There are issues with VNC connections.
13:34:41 <saggi> which will probably have to be reestablished since the proxy IP will change.
13:34:56 <saggi> But all commands that use the message queue will be unaffected.
13:35:31 <gampel> i am not sure regarding the vnc; when we get there we could offload the connection directly to the bottom OS
13:35:34 <saggi> Since the passive cascade service will spin up a fake host and listen on that topic
13:35:45 <saggi> gampel: maybe
13:36:18 <joehuang> what's the benefit compared to the PoC, where one compute node proxies one bottom openstack?
13:37:34 <gampel> small code change, not intrusive, very clear to understand what we changed and why, and one service can handle multiple bottom sites
13:38:36 <joehuang> for the PoC code, all RPC from the scheduler/API was kept as before
13:38:51 <saggi> It's also easier for us, at least at the start, to assume a single cascade service and not worry about ordering and distribution of information across multiple nodes.
13:40:17 <joehuang> if one cascade service will be responsible for multiple bottom openstacks, then is there any issue with the fanout RPC call from the neutron API?
13:41:02 <saggi> You need to take control of the scheduler anyway so you don't lose the scheduling information in the cascade layer, so you can pass it to the bottom scheduler.
13:41:31 <joehuang> no duplicated fake node allowed across multiple cascade services
13:41:48 <gampel> the Neutron/Nova layer will not be aware of the cascading service layout, so it must do fanout
13:42:28 <gampel> can you say that again? i did not understand
13:43:44 <joehuang> if you use fanout, then no two fake nodes (cascade services) can work for one bottom openstack
13:44:08 <gampel> I suggest that Saggi and I add the new nova design to the design doc and we could discuss it there (we will add the high-level design for HA)
13:44:14 <joehuang> if you use fanout, then two fake nodes (cascade services) working for one bottom openstack is not allowed
13:44:52 <gampel> no, as saggi said, we have only one active CS working on a bottom site
13:45:32 <joehuang> if there are a lot of API calls for one bottom OpenStack, then the other fake nodes should be moved to another cascade service
13:45:42 <gampel> i suggest we discuss this in the document and on the mailing list so we will have time to discuss the status of other tasks
13:45:54 <joehuang> but unfortunately, that load can't be estimated
13:45:59 <joehuang> in the end
13:46:17 <joehuang> we would have to deploy one cascade service per bottom openstack
13:46:42 <gampel> saggi will send his patch today and we will have documentation about it
13:47:27 <joehuang> ok
13:48:00 <gampel> we do not agree with that statement, but let's discuss this with a proper design doc
13:48:37 <joehuang> good
13:48:51 <gampel> what is the status of the API, DAL --> Neutron, Nova?
13:48:57 <joehuang> the more discussion, the better
13:49:04 <saggi> joehuang: :)
13:49:16 <joehuang> zhiyuan is working on it
13:50:03 <joehuang> the keystone part has been settled
13:50:25 <zhiyuan_> yes, I find that we need to store the endpoint url in the database, since a normal user cannot get endpoints via "endpoint-list"
13:50:40 <joehuang> agree
13:51:07 <gampel> can you explain a bit more please
13:51:08 <saggi> as caching?
13:51:13 <joehuang> and I will change back the site tables for url storage
13:51:45 <joehuang> no caching, because endpoints can only be fetched by an admin
13:52:10 <zhiyuan_> the context and site id are passed to the DAL, then the DAL queries the database to get the endpoint url according to the site id and resource type
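[The DAL flow zhiyuan_ describes could look roughly like this in SQLAlchemy. The table follows the siteserviceconfiguration table mentioned in the design doc, but the exact column names and the get_endpoint_url helper are assumptions for illustration.]

from sqlalchemy import Column, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class SiteServiceConfiguration(Base):
    """One row per (site, service type), holding the endpoint url."""

    __tablename__ = 'site_service_configuration'

    config_id = Column(String(64), primary_key=True)
    site_id = Column(String(64), nullable=False)
    service_type = Column(String(64), nullable=False)  # 'compute', 'network', ...
    service_url = Column(String(512), nullable=False)


def get_endpoint_url(session, site_id, service_type):
    """DAL entry point: map (site id, resource type) to an endpoint url."""
    config = (session.query(SiteServiceConfiguration)
              .filter_by(site_id=site_id, service_type=service_type)
              .first())
    if config is None:
        raise LookupError('no endpoint registered for site %s / %s'
                          % (site_id, service_type))
    return config.service_url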
13:52:21 <joehuang> but we don't want to put the admin information in the configuration
13:52:45 <gampel> are we talking about the DAL to the --> TOP neutron, nova?
13:52:52 <saggi> zhiyuan_: How will we make sure everything is synced then?
13:52:59 <joehuang> so restore to the table design in the doc
13:53:55 <saggi> How will we make sure it's all configured correctly? Keystone and cascade?
13:54:16 <saggi> Make sure nothing was changed at only one end
13:55:01 <zhiyuan_> We have the siteserviceconfiguration table in the design doc; this is to store the url information.
13:55:28 <zhiyuan_> The user needs to register this information via the cascade API
13:55:35 <saggi> Yes, but if the admin changes the information in keystone, how will we know?
13:55:49 <joehuang> saggi wants to know how to validate the url
13:56:32 <joehuang> this could be done in the API. but if a later change happens in keystone, then the admin has to reconfigure the cascade service too
13:57:09 <zhiyuan_> Or we give the cascade service an admin account to sync the changes
13:58:13 <saggi> zhiyuan_: I think we will need an admin account anyway. For information from nova and neutron.
13:58:43 <joehuang> do we want to have admin account configuration in the cascade service?
13:58:45 <saggi> gampel: can you think of any APIs we need right now that are admin only?
13:58:59 <joehuang> if yes, then caching works
13:59:12 <joehuang> if not, then store the url in the db
13:59:42 <gampel> on the bottom we hope to avoid admin calls
13:59:47 <joehuang> the API could be controlled by policy
13:59:48 <saggi> Top
13:59:57 <saggi> joehuang: We could have a sync_keystone() call that requires an admin context.
14:00:18 <saggi> If syncing keystone is our only issue.
14:00:24 <gampel> i think this will work
14:00:39 <saggi> I'll probably call it something different though :)
14:00:56 <joehuang> yes, if we want to get endpoints from keystone, then an admin context is needed
14:01:21 <gampel> and then we could use the keystone regions
14:01:30 <saggi> What I mean is that instead of having an API to add URIs, have an API to sync that information.
14:01:46 <joehuang> so the conclusion is that we configure admin information in the cascade service
14:01:51 <zhiyuan_> if we have the admin account, I think we can also use it to get the endpoints. Is there any reason we should limit the use of the admin account?
14:03:37 <joehuang> I got gampel's idea: use one api to refresh endpoint information from keystone, and store it in the db
14:03:45 <gampel> I do not see a problem having admin API access to keystone and the TOP; i hope to avoid admin on the bottoms
14:04:35 <zhiyuan_> joehuang: so the db works as a cache?
14:04:55 <joehuang> I think so. how about your ideas
14:05:13 <annegentle> knock knock
14:05:26 <saggi> We gotta bail guys
14:05:27 <zhiyuan_> oh, we ran out of time again....
14:05:31 <annegentle> :)
14:05:32 <joehuang> we have to end the meeting now.
14:05:35 <gampel> let's switch to #openstack-tricircle
14:05:37 <saggi> #neutron-tricircle ?
14:05:38 <joehuang> bye
14:05:45 <joehuang> #endmeeting
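[The approach the meeting converged on, one admin-context call that refreshes endpoint urls from keystone into the db cache, might be sketched as below with keystoneauth1 and python-keystoneclient. The function name sync_endpoints_from_keystone is illustrative (saggi said he would name it differently), and it reuses the SiteServiceConfiguration model from the earlier sketch; the region-per-site mapping follows gampel's suggestion to use keystone regions.]

from keystoneauth1 import identity, session
from keystoneclient.v3 import client


def sync_endpoints_from_keystone(db_session, auth_url, username,
                                 password, project_name):
    # An admin account is required here: normal users cannot list
    # endpoints via keystone.
    auth = identity.Password(auth_url=auth_url,
                             username=username,
                             password=password,
                             project_name=project_name,
                             user_domain_id='default',
                             project_domain_id='default')
    ks = client.Client(session=session.Session(auth=auth))

    service_types = {s.id: s.type for s in ks.services.list()}
    for endpoint in ks.endpoints.list(interface='public'):
        # Upsert each endpoint into the site_service_configuration
        # cache, keyed on the keystone region (one region per site).
        record = SiteServiceConfiguration(
            config_id=endpoint.id,
            site_id=endpoint.region_id,
            service_type=service_types[endpoint.service_id],
            service_url=endpoint.url)
        db_session.merge(record)
    db_session.commit()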