14:03:40 <sgordon> #startmeeting telcowg
14:03:40 <DaSchab> :-)
14:03:41 <openstack> Meeting started Wed Jan 14 14:03:40 2015 UTC and is due to finish in 60 minutes. The chair is sgordon. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:42 <ajo> :)
14:03:43 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:44 <mkoderer> hi
14:03:45 <sgordon> let's try that again :)
14:03:46 <openstack> The meeting name has been set to 'telcowg'
14:03:49 <aveiga> hello
14:03:49 <sgordon> #topic roll call
14:03:54 <smazziotta> hi
14:03:55 <sgordon> #link https://etherpad.openstack.org/p/nfv-meeting-agenda
14:04:25 <sgordon> #topic action items from last week
14:04:39 <sgordon> #info amitry to cross reference ops mid-cycle signups with telco wg participants to determine crossover
14:04:51 <sgordon> i dont believe registration for the ops summit has opened yet
14:04:54 <amitry> signup opening this week
14:04:55 <amitry> correct
14:04:56 <sgordon> so not much to be done on this yet
14:05:02 <sgordon> will carry it over
14:05:06 <sgordon> #action amitry to cross reference ops mid-cycle signups with telco wg participants to determine crossover
14:05:54 <sgordon> for those who were still on vacation etc. last week, there is a general openstack operators midcycle meetup being held on march 9 and 10, hosted by amitry and comcast in philly
14:06:16 <sgordon> we are considering whether enough people are interested in or planning to attend that we could have a face to face session on telco use cases there
14:06:21 <amitry> https://etherpad.openstack.org/p/PHL-ops-meetup
14:06:29 <sgordon> #link https://etherpad.openstack.org/p/PHL-ops-meetup
14:06:37 <sgordon> any questions on that?
14:07:01 <sgordon> the etherpad contains all available detail at this time - expect to see registration details on the openstack-operators list real soon now
14:07:23 <sgordon> #info steveg was to send out a brief survey monkey to the list so we could select one use case to review in detail at next week's meeting
14:07:49 <sgordon> so this happened, i only got a handful of responses (6) so we can probably re-evaluate ordering again in the future
14:07:58 <sgordon> but the results were:
14:07:59 <sgordon> 1st: Virtual IMS
14:07:59 <sgordon> 2nd (tie): VPN Instantiation / Access to physical network resources
14:07:59 <sgordon> 4th: Security Segregation
14:07:59 <sgordon> 5th: Session border controller
14:08:18 <sgordon> i believe the vIMS use case was submitted by cloudon
14:08:28 <cloudon> yup, that's right
14:08:31 <sgordon> cloudon, are you available if we want to try a deep dive on this in the meeting today?
14:08:39 <cloudon> sure
14:08:50 <sgordon> #topic virtual IMS use case discussion
14:08:54 <ybabenko> hi
14:08:55 <sgordon> #link https://wiki.openstack.org/wiki/TelcoWorkingGroup/UseCases#Virtual_IMS_Core
14:08:55 <vks> hi, what about service chaining?
14:09:17 <ybabenko> vks: we have a draft here https://etherpad.openstack.org/p/kKIqu2ipN6
14:09:19 <sgordon> vks, we're going off the use cases that have been submitted to the wiki
14:09:26 <sgordon> there is a broader effort around service chaining
14:09:34 <sgordon> with discussion happening on the mailing list
14:09:53 <sgordon> ybabenko++
14:10:18 <ybabenko> sgordon: let us discuss our draft here today (comments, critique, etc) and later on i will put it into the wiki
14:10:53 <vks> saw it, is it going to fall in line with GBP?
14:11:01 <sgordon> let's focus on the vIMS case first
14:11:07 <vks> ok
14:11:11 <mkoderer> sgordon: +1
14:11:13 <sgordon> if we get time in other discussion we can loop back on service chaining
14:11:22 <vks> go ahead
14:11:32 <sgordon> ok
14:11:39 <ybabenko> sgordon: it looks to me like for a VNF such as IMS we need a serious HA setup for openstack
14:11:46 <sgordon> so cloudon you had already broken out some requirements in this one
14:11:52 <sgordon> with the main constraints being in HA
14:11:56 <ybabenko> does something like this exist already today in the form of a verified blueprint?
14:12:06 <sgordon> ybabenko, not quite
14:12:12 <ybabenko> exactly
14:12:22 <sgordon> in particular that second requirement about affinity/anti-affinity groups being nested
14:13:11 <sgordon> you can possibly force this by combining groups with host aggregate/az assignments
14:13:19 <sgordon> to mimic the same type of setup
14:13:31 <cloudon> the broader issue here I was trying to get at was how to represent the affinity requirements for services deployed as an N+k pool, with N large
14:13:32 <adrian-hoban> ybabenko: We should be clear to differentiate between HA deployment/config of the vIMS app and OpenStack HA from the controller perspective
14:13:36 <sgordon> #info implemented as a series of N+k compute pools; meeting a given SLA requires being able to limit the impact of a single host failure
14:13:43 <sgordon> #info potentially a scheduler gap here: affinity/anti-affinity can be expressed pair-wise between VMs, which is sufficient for a 1:1 active/passive architecture, but an N+k pool needs a concept equivalent to "group anti-affinity" i.e. allowing the NFV orchestrator to assign each VM in a pool to one of X buckets, and requesting OpenStack to ensure no single host failure can affect more than one bucket
14:13:58 <sgordon> adrian-hoban, here we're talking about the app itself
14:14:36 <cloudon> crudely: don't want too many of the service's VMs on the same host, but for perf reasons want them "nearby" for some definition of "nearby"
14:14:41 <ybabenko> adrian-hoban: i mean the OpenStack HA setup (core services like keystone) as IMS is normally deployed as a multi-site VNF
14:14:52 <vks> sgordon: do we have a doc on the affinity/anti-affinity stuff in place?
14:15:17 <imendels> are you sure IMS should be assumed to be multi-site?
14:15:30 <mkoderer> a deployed vIMS will be hosted in several DCs
14:15:31 <sgordon> vks, there is some coverage in the nova scheduler documentation
14:15:37 <vks> ok
14:15:52 <sgordon> it's fairly minimal but so is the functionality today
14:15:53 <ybabenko> imendels: yes
14:16:23 <adrian-hoban> ybabenko: sgordon: Seems like you guys are talking about different aspects of HA then...
14:16:24 <imendels> I have some references claiming different. Though I agree multi-site also makes sense
14:16:59 <cloudon> yes, in this use case I was concentrating solely on HA at the app level, not the platform level
14:17:18 <adrian-hoban> imendels: In the NFV architecture, I would see the multi-site aspects as being in the scope of NFVO
14:17:43 <sgordon> adrian-hoban, i was really just regurgitating what we have in the wiki so people understand the context
14:17:54 <sgordon> (and we have a record in meetbot of what we were referring to)
14:17:59 <imendels> adrian: right but can we assume that all apps (IMS in this case) will run multi-site because of it?
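[Editor's note: the pair-wise server group mechanics discussed above can be sketched with the Kilo-era nova CLI. This is a minimal illustration only - the image, flavor, and instance names are invented, and <server-group-uuid> stands for the ID returned by the first command:

    # Create a server group with the anti-affinity policy
    nova server-group-create vims-sprout-pool anti-affinity
    # Boot each pool member with the group UUID as a scheduler hint;
    # the scheduler will then never co-locate two members on one host
    nova boot --image clearwater-sprout --flavor m1.medium \
        --hint group=<server-group-uuid> sprout-1
    nova boot --image clearwater-sprout --flavor m1.medium \
        --hint group=<server-group-uuid> sprout-2

Note this only expresses "no two members share a host"; there is no way to say "no single host failure affects more than one of X buckets", which is exactly the group anti-affinity gap captured in the #info above.]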
14:18:05 <cloudon> adrian-hoban: +1; and further NFV explicitly allows deployment of a VNF across multiple underlying cloud platforms (NFVIs)
14:18:12 <vks> adrian-hoban: +1
14:18:25 <ybabenko> my point is: for a critical VNF such as vIMS we need a reference HA design for OpenStack services
14:18:33 <imendels> agree
14:18:49 <adrian-hoban> imendels: I think we can assume that it is a possibility but not required for all apps
14:18:57 <imendels> agree
14:19:07 <adrian-hoban> ybabenko: +1
14:19:19 <gmatefi1> Need for HA mainly depends on application elasticity
14:19:38 <sgordon> #info Require a reference HA design for OpenStack services for critical NFV such as vIMS
14:19:42 <vks> gmatefi1, not only elasticity
14:19:46 <aveiga> going to agree with adrian-hoban here: provide the functionality for things like multi-site, but don't make it a requirement. Remember that not everyone will deploy an app the same way
14:19:53 <cloudon> gmatefi1: not if you have an SLA to meet...
14:19:55 <sgordon> to me though there is a broader use case discussion here
14:19:58 <gmatefi1> for apps that are dynamically scaling, controller HA is a must-have as part of real-time operation
14:20:03 <ybabenko> sgordon: who is able to address this in the OS community? what is the right entry point for that?
14:20:12 <sgordon> in that there is a separate need to drill down on what multi-site means for telco
14:20:29 <imendels> in other words we need to assume that not all apps are ready for it, or not. But it's a fundamental assumption we must have
14:20:35 <sgordon> ybabenko, there is currently an effort underway to reinvigorate the HA documentation - maybe in the scope of that effort
14:20:46 <sgordon> they need helpers though
14:20:54 <ybabenko> sgordon: +1 can you link it?
14:21:24 <mkoderer> sgordon: +1 .. so are we talking about a multi-region setup? or different OS clusters...
14:21:26 <sgordon> here is the *current* scope of the guide:
14:21:29 <sgordon> #link http://docs.openstack.org/high-availability-guide/content/index.html
14:21:33 <sgordon> mkoderer, well this is the thing
14:21:36 <aveiga> sgordon: this is what I was getting at. multi-site in openstack effectively means that a VNF will exist in more than one "Region", but perhaps a telco may deploy one large region in multiple DCs. It's possible to schedule into different regions using the neutron network as well
14:21:38 <mkoderer> sgordon: ;)
14:21:41 <vks> sgordon, the NFV requirements put a lot of stress on HA
14:21:42 <sgordon> their focus currently is single site HA
14:21:48 <sgordon> if people want to expand that scope
14:21:49 <cloudon> better HA doc would help but there are still some fundamental issues such as seamless upgrade from one OS release to another
14:21:52 <sgordon> they need to get involved
14:21:53 <sgordon> ;)
14:22:08 <sgordon> trying to find the mailing list post(s)
14:22:36 <vks> for now single site HA would be fine
14:22:44 <mkoderer> sgordon: so what are we going to do now.. listing gaps in OS?
14:22:48 <imendels> that doc is more about OS control than anything else, no?
14:22:55 <ybabenko> sgordon: thanks. We are familiar with that and have a strong feeling that a lot still needs to happen in order to be able to deploy something like vIMS on HA OS
14:23:36 <sgordon> #link http://lists.openstack.org/pipermail/openstack-operators/2014-August/004987.html
14:23:41 <ybabenko> mkoderer: should we do a gap analysis and address/list the missing points?
14:24:03 <sgordon> ybabenko, yes - but again if nobody is speaking to the team working on it about that
14:24:06 <sgordon> they arent going to cover it
14:24:07 <sgordon> :)
14:24:09 <adrian-hoban> I'd like to suggest what we need to agree on first is what OpenStack should provide from an API perspective to support application HA configuration
14:24:38 <ybabenko> sgordon: who will address this?
14:25:08 <imendels> adrian-hoban: +1
14:25:09 <adrian-hoban> And by that I mean HA deployment configuration (not config of the app itself)
14:25:13 <mkoderer> sgordon: can we add all the gaps that we find during discussion to the use case?
14:25:35 <sgordon> it's a wiki, people can add anything they want :)
14:25:42 <mkoderer> and then start to find related blueprints
14:25:51 <sgordon> indeed
14:26:00 <mkoderer> and open specs if needed
14:26:13 <mkoderer> ok
14:28:25 <sgordon> so, a key question to adrian-hoban's point - what do we see as the 'API' here
14:28:38 <sgordon> given that e.g. server groups are implemented via scheduler hints
14:28:54 <sgordon> (albeit with some API calls for initial group creation)
14:29:41 <adrian-hoban> sgordon: I think the Heat APIs are probably the closest in scope to parts of what is required of NFVO functionality. Perhaps we start there?
14:30:02 <aveiga> adrian-hoban: +1
14:30:17 <ybabenko> are we still on vIMS?
14:30:20 <ybabenko> I am confused
14:30:24 <sgordon> yes
14:30:29 <cloudon> doesn't that assume an NFVO would use Heat? not sure that's the case
14:30:47 <vks> adrian-hoban, do you really think the heat APIs fit the NFV case?
14:30:50 <aveiga> cloudon: it might be a requirement if you're going to need coordination features
14:31:27 <ybabenko> vIMS -> need for OpenStack HA. Heat? We can use heat already today. But heat does not support multi-site configuration. How to address this?
14:31:37 <mkoderer> vks: not yet.. but we can try to change that
14:31:52 <sgordon> ybabenko, actually it does depending on what you mean by multi-site
14:31:52 <cloudon> aveiga: sorry - not sure I follow - a core part of an NFVO is co-ordination?
14:31:58 <sgordon> e.g. multi-region support was recently added
14:32:23 <sgordon> #link https://blueprints.launchpad.net/heat/+spec/multi-region-support
14:32:27 <vks> are we sticking to one site or looking at multi-site?
14:32:34 <sgordon> but i think that is still getting ahead of ourselves
14:32:40 <aveiga> cloudon: if you want to ensure that your app VMs are landing where you want them and are automatically rebuilt/scaled to meet your HA and load capabilities, then yes
14:32:41 <sgordon> i think stick to the requirements within a single site
14:32:56 <sgordon> as i said earlier multi-site for telco should be analysed as a separate use case imo
14:32:58 <vks> sgordon, +1
14:32:59 <adrian-hoban> vks: I'm not stating that. Just that Heat is close in functionality to some of the things NFVO is required to do. There is of course a likely path that NFVO implementations would drive the other APIs (Nova, Neutron) directly. I suggested we consider Heat APIs as a means of fleshing out what may be needed from other core APIs
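[Editor's note: a rough sketch of the Heat multi-region support sgordon links above, which lets a parent stack launch nested stacks in other regions. This assumes a Juno-or-later Heat; the region names and the per-site template vims_site.yaml are hypothetical, not from the meeting:

    cat > vims_two_region.yaml <<'EOF'
    heat_template_version: 2013-05-23
    resources:
      site_a:
        type: OS::Heat::Stack
        properties:
          context:
            region_name: RegionOne    # region names are deployment-specific
          template: { get_file: vims_site.yaml }
      site_b:
        type: OS::Heat::Stack
        properties:
          context:
            region_name: RegionTwo
          template: { get_file: vims_site.yaml }
    EOF
    heat stack-create vims-multi-region -f vims_two_region.yaml

Whether region-per-site of this kind is what "multi-site" means for telco is exactly the open question deferred below to mkoderer's use case.]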
14:33:00 <cloudon> aveiga: ok, but don't need Heat to do that
14:33:06 <sgordon> because it's more general, it's not specific to e.g. vIMS
14:33:42 <mkoderer> sgordon: I can write a use case for multi-site
14:33:57 <mkoderer> if needed
14:33:57 <sgordon> #action mkoderer to take a stab at documenting a use case for multi-site
14:34:03 <sgordon> mkoderer, thanks - that would be much appreciated
14:34:25 <sgordon> so we have the OS::Nova::ServerGroup resource in heat
14:34:36 <adrian-hoban> sgordon: +1. Agree we need to look at single site and multi-site deployments separately.
14:34:52 <vks> adrian-hoban, i just wanted to say the heat APIs in my point of view don't fit. yes, if we want to start with that, not a bad idea. But i think we should come up with new APIs in some time
14:34:52 <sgordon> which relates to the nova server group add call
14:34:56 <sgordon> under the hood
14:35:38 <sgordon> and then actual group membership is via the hints provided with the OS::Nova::Server resource
14:36:07 <sgordon> the key requirement here appears to be how do i express not only a relationship between servers in the group
14:36:14 <sgordon> but a relationship between those groups
14:36:23 <cloudon> so within a single site I want to deploy an N+k pool (which may just be a fraction of the overall service) - I still want to ensure no single host failure can knock out many VMs (and certainly no more than k...) - can server groups permit me to configure that?
14:36:36 <sgordon> "sort of"
14:36:54 <sgordon> so with the anti-affinity policy you obviously achieve that
14:37:01 <sgordon> at the expense that you dont get 'closeness'
14:37:01 <cloudon> sgordon: :)
14:37:10 <sgordon> that is, none of your servers/instances will reside on the same host
14:37:37 <sgordon> there have been proposals to implement "soft" anti-affinity that might be closer to what you want
14:37:39 <cloudon> ...which is too much spreading
14:37:47 <vks> sgordon, you mean service vms?
14:37:58 <sgordon> but again it would still only place on the same host after all options are exhausted
14:38:00 <sgordon> vks, no
14:38:06 <imendels> mkoderer: I suggest you distinguish between OS "control" and "servers" HA in the use case. Happy to assist if you want
14:38:08 <sgordon> vks, in the nova api instances are referred to as servers
14:38:11 <sgordon> hence "server groups"
14:39:41 <mkoderer> imendels: thx.. yep sure
14:39:44 <ybabenko> imendels: all the time we are speaking about OS HA
14:40:38 <cloudon> (hacky but might work) so could I define a host aggregate of a largish number of "close" hosts, then define my VMs to form a server group, then tell nova to instantiate them on the given aggregate with anti-affinity?
14:40:49 <imendels> ybabenko: not sure.. look at the server groups above... vs. whether your nova endpoint is HA and can be seamlessly upgraded
14:40:57 <vks> sgordon, here we are talking about special servers?
14:41:09 <vks> not the normal instances
14:41:14 <vks> right??
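[Editor's note: a minimal sketch of the OS::Nova::ServerGroup plus scheduler-hint pattern sgordon describes above, as a two-member pool. Image, flavor, and resource names are illustrative, not from the meeting:

    cat > vims_pool.yaml <<'EOF'
    heat_template_version: 2013-05-23
    resources:
      pool_group:
        type: OS::Nova::ServerGroup
        properties:
          policies: [anti-affinity]
      member_1:
        type: OS::Nova::Server
        properties:
          image: clearwater-sprout
          flavor: m1.medium
          scheduler_hints:
            group: { get_resource: pool_group }   # group membership via hint
      member_2:
        type: OS::Nova::Server
        properties:
          image: clearwater-sprout
          flavor: m1.medium
          scheduler_hints:
            group: { get_resource: pool_group }
    EOF
    heat stack-create vims-pool -f vims_pool.yaml

Note the policy only relates servers within one group; as sgordon says next, there is no way to express a relationship between groups.]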
14:41:19 <sgordon> vks, no - we're talking about any servers/instances you want to deploy in the manner cloudon refers to
14:41:40 <sgordon> cloudon, yes that was something i mentioned very early in the conversation
14:41:43 <sgordon> as a way to achieve it today
14:42:50 <vks> sgordon, but then you end up dealing with all the instances on the cloud instead of just the hosts which have special servers running on them
14:43:05 <sgordon> vks, i dont follow
14:43:10 <cloudon> ok, though a bit sub-optimal as it requires using a host aggregate to segment your hosts for app affinity purposes rather than physical capabilities
14:43:19 <sgordon> vks, you end up dealing with as many or as few instances as you add to the group
14:43:39 <sgordon> not all instances in the cloud need to be in a group, but those you want to place this way do
14:44:20 <adrian-hoban> You could also leverage host aggregates to help identify if the servers had a special config
14:44:39 <sgordon> yes
14:44:52 <vks> sgordon, ok that makes sense. but wherever those instances are running will need to be HA
14:44:58 <cloudon> the semantic you really want as an app is "instantiate VMs in this server group such that no more than X are on the same host" without reference to host aggregates unless the service needs some special physical capability
14:45:52 <sgordon> mmm
14:46:12 <sgordon> cloudon, what would the expected behavior be if i have exhausted all hosts
14:46:16 <ybabenko> can we just go line by line through the vIMS use case and agree on it?
14:46:16 <sgordon> that is, say X is 5
14:46:22 <sgordon> and all hosts have 5 instances
14:46:27 <sgordon> fail the request?
14:46:33 <ybabenko> i.e. "Mainly a compute application: modest demands on storage and networking." - what does "modest" mean?
14:46:37 <cloudon> if no option then overload - so more of a hint than a hard rule
14:46:48 <ybabenko> which features do we need from networking in order to support vIMS?
14:46:51 <ybabenko> IPv6?
14:46:57 <ybabenko> Distributed routing?
14:47:03 <sgordon> cloudon, right but at that point it's really no different than soft-affinity imo
14:47:03 <ybabenko> VRRP?
14:47:09 <DaSchab> LB?
14:47:10 <ybabenko> IPsec
14:47:11 <ybabenko> etc
14:47:12 <ybabenko> etc
14:47:13 <ybabenko> etc
14:47:16 <sgordon> unless you are suggesting it should stack the first host until it gets 5, and so on
14:47:32 <mkoderer> I would really like to see a transparent review of the use cases
14:47:34 <cloudon> no, definitely not stacking - that's an anti-pattern
14:47:52 <sgordon> mkoderer, can you expand on that
14:48:39 <mkoderer> should we move them to a git repo and do a gerrit review?... I would really like that
14:49:05 <adrian-hoban> cloudon: Do you see host separation as the only concern? What about rack-level separation or network-level separation?
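[Editor's note: the "hacky but might work" combination cloudon raises above - a host aggregate to bound "closeness" plus an anti-affinity server group for spreading - might look like this with the nova CLI. Host, aggregate, and flavor names are invented for illustration, <server-group-uuid> is the group from the earlier sketch, and AggregateInstanceExtraSpecsFilter is assumed to be enabled in the scheduler:

    # Group the "close" hosts into an aggregate and tag it
    nova aggregate-create vims-close-hosts
    # (subsequent commands take the aggregate ID returned above)
    nova aggregate-add-host <aggregate-id> compute-01
    nova aggregate-add-host <aggregate-id> compute-02
    nova aggregate-set-metadata <aggregate-id> vims_pool=true
    # Pin a flavor to that aggregate via extra specs
    nova flavor-key m1.vims set aggregate_instance_extra_specs:vims_pool=true
    # Boot pool members with the pinned flavor and the anti-affinity group hint
    nova boot --image clearwater-sprout --flavor m1.vims \
        --hint group=<server-group-uuid> sprout-3

As cloudon notes, this repurposes aggregates for app affinity rather than physical capability, and it still cannot express "no more than X members per host".]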
14:49:19 <sgordon> my concern with that approach is that we lose many of the people who dont know how to interact with it
14:49:41 <sgordon> (similar to how we lose some who cant/wont do irc meetings by having these sessions in irc)
14:50:11 <cloudon> adrian-hoban: indeed, yes, but was wary of introducing new semantics (especially physically motivated ones) for groupings of hosts that don't already exist in OS
14:50:25 <mkoderer> sgordon: but having it in the IRC meeting doesn't feel that productive
14:50:28 <cloudon> I see that more as a use for availability zones
14:50:54 <sgordon> mkoderer, i agree but is that because of the medium or because we spent 20 mins discussing broader HA issues
14:50:54 <adrian-hoban> cloudon: Agree with starting with incremental changes :-)
14:51:44 <mkoderer> cloudon: I guess we need additional features for AZ/host aggregates in general for NFV
14:52:14 <sgordon> basically from my pov i dont want to raise the bar on use case submission, i already have a couple that were emailed to me because people were unsure about adding to the wiki
14:52:17 <mkoderer> and the nova scheduling must be more flexible
14:52:23 <sgordon> i dont want to become the conduit for adding them to a git repo as well
14:52:34 <aveiga> sgordon: I think we should go through the use cases. We may find that there are more commonalities
14:52:48 <mkoderer> sgordon: I mean I can upload them to Gerrit...
14:52:52 <aveiga> just note down in the wiki that HA happens to be one that would be common
14:53:09 <cloudon> mkoderer: agree; there are many multi-site issues but even if solved that leaves scheduling gaps for what you ideally want within each site
14:53:12 <sgordon> #info possible commonalities around HA and multi-site requirements to identify as we progress through use cases
14:54:00 <sgordon> #info need more flexibility from Availability Zone and Host Aggregate placement, along with more flexible placement rules for the scheduler
14:54:20 <sgordon> mkoderer, with the scheduling are we referring specifically to the server group filters in this case
14:54:27 <sgordon> mkoderer, or are there other desirable tweaks
14:54:32 <adrian-hoban> I'd like it if we could complete the discussion on single site before tackling the multi-site items
14:55:05 <vks> adrian-hoban, +1
14:55:11 <cloudon> +1
14:55:12 <sgordon> +1
14:55:28 <sgordon> #info general agreement to focus on use cases in the context of single site deployment first
14:55:51 <mkoderer> adrian-hoban: yep we'll move this discussion to the multi-site use case
14:56:01 <sgordon> #info Is gerrit a better mechanism for use case review?
14:56:07 <cloudon> so are we agreed for the single site case that (a) there is an affinity issue for N+k groups (b) we could hack it with server groups + host aggregates (c) but that's not ideal?
14:56:45 <sgordon> that seems right from my pov, the question is really how an implementation that solves (a) in particular would work
14:57:08 <sgordon> can take that offline though
14:57:16 <sgordon> we only have ~3 min left
14:57:23 <ybabenko> cloudon: i am not up on the details of clearwater but maybe it would be a good idea to provide all these details in the wiki
14:57:32 <sgordon> but let's quickly touch on how to move somewhere on service chaining
14:57:51 <sgordon> mestery had mentioned on the m/l thread that this is a topic with much broader interest in neutron than just telco
14:58:08 <cloudon> ybabenko: the link in the use case gives full details - didn't want to over-burden the wiki
14:58:16 <sgordon> so it's a question of how to ensure the telco use case is documented and presentable when that comes around again at the vancouver summit
14:58:27 <mkoderer> sgordon: could you give us a link
14:58:33 <vks> sgordon, can we have everything in a single place?
14:58:46 <sgordon> vks, what is 'everything'?
14:59:08 <ybabenko> sgordon: in the [NFV] tag there is no email from mestery as far as i can see
14:59:17 <sgordon> that's really my point
14:59:18 <vks> use cases, and the plan of action
14:59:22 <sgordon> because he's not talking about NFV
14:59:28 <ybabenko> here is our draft https://etherpad.openstack.org/p/kKIqu2ipN6
14:59:45 <ybabenko> I would appreciate all the comments before putting it into the wiki
15:00:25 <sgordon> #link https://etherpad.openstack.org/p/kKIqu2ipN6
15:00:39 <sgordon> we're at time
15:00:53 <sgordon> let's jump over to #openstack-nfv while i find the link
15:01:08 <sgordon> but basically i cant force people who are having a generic discussion about service chaining in neutron
15:01:13 <sgordon> to tag it nfv / telco
15:02:10 <sgordon> #link http://lists.openstack.org/pipermail/openstack-dev/2015-January/053915.html
15:02:14 <sgordon> thanks all
15:02:18 <sgordon> #endmeeting