13:06:06 #startmeeting tricircle
13:06:07 Meeting started Wed Aug 26 13:06:06 2015 UTC and is due to finish in 60 minutes. The chair is joehuang. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:06:08 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:06:10 The meeting name has been set to 'tricircle'
13:06:50 #topic rollcall
13:07:02 #info gampel
13:07:05 #info joehuang
13:07:31 #info saggi
13:08:03 hi
13:08:26 hi irena, rollcall now
13:08:29 * irenab : will attend partially, have a conflicting meeting
13:08:45 understood, thanks
13:08:54 hi zhiyuan
13:08:58 hi joe
13:09:07 #info irenab
13:09:13 #info zhiyuan
13:09:23 #topic recent progress
13:09:56 saggi, can you please explain what you did in nova and why?
13:10:11 we have a design meeting this Monday about network connectivity
13:10:23 yes, please
13:10:50 As I spoke about in the previous meeting, I thought up a way to implement what we need without changing nova core code.
13:11:08 how
13:11:25 What I did was to hook up the scheduler and have the cascade_service appear to be multiple nova-compute hosts
13:11:26 sorry, it was not shown in the last meeting
13:11:40 joehuang: Takes time to type :)
13:12:29 So the general idea is that when the user wants to run a VM, we get the scheduling information in the cascade_service, since it registers as the scheduler. Look at the AZ and return the node_name of the site.
13:12:42 In the cascade service we have a compute_service per site
13:13:05 so the cascade service always gets the request.
13:13:10 the cascade service as a scheduler?
13:13:20 and multiple compute nodes :)
13:13:37 can you guys connect to imgur or is it blocked?
13:13:52 what's imgur
13:14:34 i can access
13:14:45 http://imgur.com/ this url, right?
13:14:49 yes
13:14:51 which cascade service node will be called for reboot/etc VM operations?
13:15:29 http://i.imgur.com/za5kZpy.png
13:15:34 ok, I can access too
13:15:58 In this case I have 2 fake sites and they look like two compute hosts
13:16:36 They are all actually the cascade service
13:16:54 in the hypervisor view you can see that they are cascade sites http://i.imgur.com/tNWeDIn.png
13:16:58 go on please
13:18:36 This is what I have ATM. You can register as many sites as you want and they will appear as compute hosts.
13:18:56 And you can control all the stats from the cascade service
13:19:10 shall the cascade service collect resource usage from the bottom OpenStack?
13:19:18 Yes
13:19:41 We will use the site aggregate stats as the host stats
13:20:12 What I need to add next is the actual scheduler logic that uses AZs to select the host
13:20:44 so one cascade service will handle one bottom openstack
13:21:10 no, one cascade service will handle N bottom openstacks
13:21:33 At the start we will have only one cascade service
13:21:52 only one cascade service?
13:22:01 yes
13:22:09 For now
13:22:23 The design allows for more.
13:22:25 where is the compute service running? I think for each bottom OS we need one compute service
13:22:25 the availability needs to be taken into consideration
13:22:41 The compute service isn't running anywhere. It's fake.
13:22:46 first we need to establish the flow end to end
13:23:35 It represents a whole site.
13:23:54 we are not sure that the cascading service will be the bottleneck, it depends on the run time info module (push, pull) and who will do that job
13:24:23 The design allows distributing the fake hosts across multiple cascade services.
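
[Editor's note: the following is a minimal illustrative sketch of the idea described above, not the actual patch discussed in the meeting. It only shows the concept of the cascade service registering one fake compute host per bottom site and reporting the site's aggregate usage as that host's stats. All names (CascadeService, Site, get_host_stats, etc.) are hypothetical.]

    # Illustrative sketch only: one fake compute "host" per bottom OpenStack site.
    # The cascade service answers scheduler queries for these hosts and reports
    # each site's aggregate resource usage as if it were a single hypervisor.

    class Site(object):
        def __init__(self, name, az, total_vcpus, used_vcpus, total_ram_mb, used_ram_mb):
            self.name = name                  # also used as the fake nova-compute host name
            self.az = az                      # availability zone exposed to the top layer
            self.total_vcpus = total_vcpus
            self.used_vcpus = used_vcpus
            self.total_ram_mb = total_ram_mb
            self.used_ram_mb = used_ram_mb


    class CascadeService(object):
        """Pretends to be the scheduler plus N nova-compute hosts."""

        def __init__(self, sites):
            self.sites = {s.name: s for s in sites}

        def get_available_nodes(self):
            # Each bottom site shows up as one compute node in the top Nova.
            return list(self.sites)

        def get_host_stats(self, node_name):
            # The site aggregate usage is reported as the fake host's stats.
            site = self.sites[node_name]
            return {
                'hypervisor_hostname': site.name,
                'vcpus': site.total_vcpus,
                'vcpus_used': site.used_vcpus,
                'memory_mb': site.total_ram_mb,
                'memory_mb_used': site.used_ram_mb,
            }

        def select_host(self, requested_az):
            # The scheduler logic sketched in the meeting: pick the fake host
            # (site) whose availability zone matches the request.
            for site in self.sites.values():
                if site.az == requested_az:
                    return site.name
            raise LookupError('no site for AZ %s' % requested_az)


    if __name__ == '__main__':
        svc = CascadeService([Site('site-1', 'az-1', 64, 8, 131072, 4096),
                              Site('site-2', 'az-2', 32, 2, 65536, 2048)])
        print(svc.get_available_nodes())   # ['site-1', 'site-2']
        print(svc.select_host('az-2'))     # 'site-2'
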
13:24:57 But we don't want to start coding all the synchronization that this requires just now
13:25:04 do you mean two nodes for one bottom openstack?
13:25:27 making the flow work is comparatively simple
13:25:33 No, every cascade service handles one or more bottom sites
13:26:00 ic, so there is no rpc between the scheduler and the compute service, just a function call?
13:26:07 You need to have a single point to handle requests for a single site to have correct ordering of operations.
13:26:32 there is no communication between them in nova.
13:26:53 But because we are both the scheduler and the compute host, we can pass information between them in the cascade layer.
13:27:11 so how to forward an RPC call like reboot VM from the API to the cascade service
13:27:13 So that we don't lose the scheduling information when passing the create call down to the bottom OS
13:27:38 Nova will contact the fake host. Which is the cascade service itself.
13:28:42 ok, the RPC call will be forwarded to a fixed fake node, right?
13:29:36 that means if you add more cascade services
13:29:39 yes, which is just an instance inside the cascade service.
13:29:54 the RPC call will still be forwarded to the same fake node
13:30:08 Yes, since it's the one managing that host
13:30:36 then how to scale out
13:31:10 and if this fake node failed
13:31:35 which cascade service node will be selected for the bottom openstack
13:31:48 The scheduler tells nova what fake host to use.
13:31:54 and how to redirect the API rpc call to the new cascade service node
13:32:03 This makes nova contact the correct cascade_service
13:32:10 this allows you to scale out
13:32:47 but in the database all VMs have already been allocated to the fake node
13:33:25 if the cascade service will act as the new fake node (the same old name)
13:33:38 yes
13:33:45 as for redundancy, you could have an active-passive setup where cascade services spin up the fake node on another cascade service and it will handle the requests.
13:34:06 Spinning up a fake node is just listening on the proper queue
13:34:07 we do not see a problem of HA/scaling in this design
13:34:15 I think that we need to agree that HA is in the design but will be handled after we have the end to end flow
13:34:19 There are issues with VNC connections.
13:34:41 which will probably have to be reestablished since the proxy IP will change.
13:34:56 But all commands that use the message queue will be unaffected.
13:35:31 i am not sure regarding the vnc, when we get there we could offload the connection directly to the bottom OS
13:35:34 Since the passive cascade service will spin up a fake host and listen on that topic
13:35:45 gampel: maybe
13:36:18 what's the benefit compared to the PoC, where one compute-node will proxy one bottom openstack
13:37:34 small code change, not intrusive, very clear to understand what we changed and why, and one service could handle multiple bottom sites
13:38:36 for the PoC code, all RPC from the scheduler/API was kept as before
13:38:51 It's also easier for us, at least at the start, to assume a single cascade service and not worry about ordering and distribution of information across multiple nodes.
13:40:17 if one cascade service will be responsible for multiple bottom openstacks, then is there any issue for the fanout RPC call from the neutron API
13:41:02 You need to take control of the scheduler anyway so you don't lose the scheduling information in the cascade layer, so you can pass it to the bottom scheduler.
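
[Editor's note: a minimal sketch of the failover idea above ("spinning up a fake node is just listening on the proper queue"), assuming oslo.messaging is used for RPC. The endpoint method, topic and host names are illustrative only, not the project's actual code.]

    # A standby cascade service takes over a failed fake host simply by starting
    # an RPC server on that host's compute topic, so Nova's calls and casts
    # addressed to that host keep being handled.

    from oslo_config import cfg
    import oslo_messaging as messaging


    class FakeComputeEndpoint(object):
        """Handles the subset of nova-compute RPC we forward to the bottom site."""

        def __init__(self, site_name):
            self.site_name = site_name

        def reboot_instance(self, ctxt, **kwargs):
            # In the real service this would be translated into a request to the
            # bottom OpenStack's nova API for this site.
            print('forwarding reboot to bottom site %s: %s' % (self.site_name, kwargs))


    def take_over_fake_host(site_name):
        # Listening on topic 'compute' with server=<fake host name> is what makes
        # this process receive the RPC traffic Nova addresses to that host.
        transport = messaging.get_transport(cfg.CONF)
        target = messaging.Target(topic='compute', server=site_name)
        server = messaging.get_rpc_server(transport, target,
                                          [FakeComputeEndpoint(site_name)])
        server.start()
        return server
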
13:41:31 no duplicated fake node allowed across multiple cascade services
13:41:48 the Neutron/Nova layer will not be aware of the cascading service layout, so it must do fanout
13:42:28 can you say again? i did not understand
13:43:44 if you use fanout, then no two fake nodes (cascade services) can work for one bottom openstack
13:44:08 I suggest that Saggi and me add the new nova approach to the design doc and we could discuss it there (we will add the high level design for HA)
13:44:14 if you use fanout, then two fake nodes (cascade services) working for one bottom openstack are not allowed
13:44:52 no, as saggi said we have only one active CS working on a bottom site
13:45:32 if there are a lot of API calls for one bottom OpenStack, then other fake nodes should be moved to another cascade service
13:45:42 i suggest we discuss this in the document and mailing list, so we will have time to discuss the status of other tasks
13:45:54 but unfortunately, that's hard to estimate
13:45:59 so in the end
13:46:17 we have to deploy one cascade service for one bottom openstack
13:46:42 saggi will send his patch today and we will have documentation about it
13:47:27 ok
13:48:00 we do not agree with that statement, but let's discuss it once the proper design doc is introduced
13:48:37 good
13:48:51 what is the status of the API, DAL --> Neutron, Nova
13:48:57 the more discussion, the better
13:49:04 joehuang: :)
13:49:16 zhiyuan is working on it
13:50:03 the keystone part has been settled
13:50:25 yes, I find that we need to store the endpoint url in the database, since a normal user cannot get endpoints via "endpoint-list"
13:50:40 agree
13:51:07 can you explain a bit more please
13:51:08 as caching?
13:51:13 and I will change back the site tables for url storage
13:51:45 no caching, because the endpoint can only be obtained through admin
13:52:10 the context and site id are passed to the DAL, then the DAL queries the database to get the endpoint url according to the site id and resource type
13:52:21 but we don't want to configure the admin information in configuration
13:52:45 are we talking about the DAL to the --> TOP neutron, nova
13:52:52 zhiyuan_: How will we make sure everything is synced then?
13:52:59 so restore to the table design in the doc
13:53:55 How will we make sure it's all configured correctly? Keystone and cascade?
13:54:16 Make sure nothing was changed at only one end
13:55:01 We have the siteserviceconfiguration table in the design doc, this is to store the url information.
13:55:28 The user needs to register this information via the cascade API
13:55:35 Yes, but if the admin changes the information in keystone, how will we know?
13:55:49 saggi wants to know how to validate the url
13:56:32 this could be done in the API. but if a later change happens in keystone, then the admin has to reconfigure the cascade service too
13:57:09 Or we give the cascade service an admin account to sync the change
13:58:13 zhiyuan_: I think we will need an admin account anyway. For information from nova and neutron.
13:58:43 do we want to have admin account configuration in the cascade service
13:58:45 gampel: can you think about any APIs we need right now that are admin only.
13:58:59 if yes, then caching works
13:59:12 if not, then store the url in the db
13:59:42 in the bottom we hope to avoid admin calls
13:59:47 API could be controlled by policy
13:59:48 Top
13:59:57 joehuang: We could have a sync_keystone() call that requires an admin context.
14:00:18 If syncing keystone is our only issue.
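
[Editor's note: a minimal sketch of the DAL lookup described above: given a site id and resource type, the cascade layer resolves the bottom service URL from the site service configuration table instead of asking keystone, since normal users cannot run endpoint-list. Table and column names follow the design-doc wording mentioned in the meeting but are assumptions here; SQLAlchemy is used for illustration.]

    from sqlalchemy import Column, String, create_engine
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import sessionmaker

    Base = declarative_base()


    class SiteServiceConfiguration(Base):
        """One row per (site, service type): where the bottom endpoint lives."""
        __tablename__ = 'site_service_configuration'

        id = Column(String(36), primary_key=True)
        site_id = Column(String(36), nullable=False)
        service_type = Column(String(64), nullable=False)   # e.g. 'compute', 'network'
        service_url = Column(String(255), nullable=False)


    def get_bottom_endpoint(session, site_id, service_type):
        # The DAL query sketched in the meeting: site id + resource type -> url.
        row = (session.query(SiteServiceConfiguration)
                      .filter_by(site_id=site_id, service_type=service_type)
                      .first())
        return row.service_url if row else None


    if __name__ == '__main__':
        engine = create_engine('sqlite://')
        Base.metadata.create_all(engine)
        session = sessionmaker(bind=engine)()
        session.add(SiteServiceConfiguration(id='1', site_id='site-1',
                                             service_type='compute',
                                             service_url='http://10.0.0.5:8774/v2.1'))
        session.commit()
        print(get_bottom_endpoint(session, 'site-1', 'compute'))
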
14:00:24 i think this will work
14:00:39 I'll probably call it something different though :)
14:00:56 yes, if we want to get the endpoints from keystone, then an admin context is needed
14:01:21 and then we could use the keystone regions
14:01:30 What I mean is that instead of having an API to add URIs, have an API to sync that information.
14:01:46 so the conclusion is that we configure admin information in the cascade service
14:01:51 if we have the admin account, I think we can also use it to get the endpoints. Is there any reason we should limit the use of the admin account?
14:03:37 I got gampel's idea: use one api to refresh endpoint information from keystone, and store it in the db
14:03:45 I do not see a problem having admin API access to keystone and the TOP, i hope to avoid admin on the bottoms
14:04:35 joehuang: so the db works as a cache?
14:04:55 I think so. how about your ideas
14:05:13 knock knock
14:05:26 We gotta bail guys
14:05:27 oh, we ran out of time again....
14:05:31 :)
14:05:32 we have to end the meeting now.
14:05:35 let's switch to #openstack-tricircle
14:05:36 bye
14:05:37 #neutron-tricircle ?
14:05:38 bye
14:05:45 #endmeeting
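
[Editor's note: a minimal sketch of the sync idea the meeting converged on: an admin-only call that refreshes the endpoint cache in the db from keystone instead of having users register URIs by hand. The function name sync_keystone comes from the discussion (saggi said he would likely rename it); the keystoneclient/keystoneauth calls, credentials, and the save_endpoint callback are assumptions for illustration only.]

    from keystoneauth1 import session
    from keystoneauth1.identity import v3
    from keystoneclient.v3 import client as ks_client


    def make_admin_client():
        # Placeholder admin credentials; in the cascade service these would come
        # from its own configuration, as discussed in the meeting.
        auth = v3.Password(auth_url='http://keystone:5000/v3',
                           username='admin', password='secret',
                           project_name='admin',
                           user_domain_id='default', project_domain_id='default')
        return ks_client.Client(session=session.Session(auth=auth))


    def sync_keystone(keystone, save_endpoint):
        # Refresh the db cache: map each public endpoint to its service type and
        # keystone region (which the meeting suggested could identify a site).
        services = {s.id: s.type for s in keystone.services.list()}
        for ep in keystone.endpoints.list():
            if ep.interface != 'public':
                continue
            save_endpoint(region=getattr(ep, 'region_id', None),
                          service_type=services[ep.service_id],
                          url=ep.url)
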