14:03:40 <sgordon> #startmeeting telcowg
14:03:40 <DaSchab> :-)
14:03:41 <openstack> Meeting started Wed Jan 14 14:03:40 2015 UTC and is due to finish in 60 minutes. The chair is sgordon. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:42 <ajo> :)
14:03:43 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:44 <mkoderer> hi
14:03:45 <sgordon> let's try that again :)
14:03:46 <openstack> The meeting name has been set to 'telcowg'
14:03:49 <aveiga> hello
14:03:49 <sgordon> #topic roll call
14:03:54 <smazziotta> hi
14:03:55 <sgordon> #link https://etherpad.openstack.org/p/nfv-meeting-agenda
14:04:25 <sgordon> #topic action items from last week
14:04:39 <sgordon> #info amitry to cross reference ops mid-cycle signups with telco wg participants to determine crossover
14:04:51 <sgordon> i dont believe registration for the ops summit has opened yet
14:04:54 <amitry> signup opening this week
14:04:55 <amitry> correct
14:04:56 <sgordon> so not much to be done on this yet
14:05:02 <sgordon> will carry it over
14:05:06 <sgordon> #action amitry to cross reference ops mid-cycle signups with telco wg participants to determine crossover
14:05:54 <sgordon> for those who were still on vacation etc. last week, there is a general openstack operators midcycle meetup being held on march 9 and 10, hosted by amitry and comcast in philly
14:06:16 <sgordon> we are considering whether enough people are interested in or planning to attend that we could have a face to face session on telco use cases there
14:06:21 <amitry> https://etherpad.openstack.org/p/PHL-ops-meetup
14:06:29 <sgordon> #link https://etherpad.openstack.org/p/PHL-ops-meetup
14:06:37 <sgordon> any questions on that?
14:07:01 <sgordon> the etherpad contains all available detail at this time - expect to see registration details on the openstack-operators list real soon now
14:07:23 <sgordon> #info steveg was to send out a brief survey monkey to the list so we could select one use case to review in detail at next week's meeting
14:07:49 <sgordon> so this happened, i only got a handful of responses (6) so we can probably re-evaluate ordering again in the future
14:07:58 <sgordon> but the results were:
14:07:59 <sgordon> 1st: Virtual IMS
14:07:59 <sgordon> 2nd (tie): VPN Instantiation / Access to physical network resources
14:07:59 <sgordon> 4th: Security Segregation
14:07:59 <sgordon> 5th: Session border controller
14:08:18 <sgordon> i believe the vIMS use case was submitted by cloudon
14:08:28 <cloudon> yup, that's right
14:08:31 <sgordon> cloudon, are you available if we want to try a deep dive on this in the meeting today?
14:08:39 <cloudon> sure
14:08:50 <sgordon> #topic virtual IMS use case discussion
14:08:54 <ybabenko> hi
14:08:55 <sgordon> #link https://wiki.openstack.org/wiki/TelcoWorkingGroup/UseCases#Virtual_IMS_Core
14:08:55 <vks> hi, what about service chaining?
14:09:17 <ybabenko> vks: we have a draft here https://etherpad.openstack.org/p/kKIqu2ipN6
14:09:19 <sgordon> vks, we're going off the use cases that have been submitted to the wiki
14:09:26 <sgordon> there is a broader effort around service chaining
14:09:34 <sgordon> with discussion happening on the mailing list
14:09:53 <sgordon> ybabenko++
14:10:18 <ybabenko> sgordon: let us discuss our draft here today (comments, critique, etc) and later on i will put it into the wiki
14:10:53 <vks> saw it, is it going to fall in line with GBP?
14:11:01 <sgordon> let's focus on the vIMS case first
14:11:07 <vks> ok
14:11:11 <mkoderer> sgordon: +1
14:11:13 <sgordon> if we get time in other discussion we can loop back on service chaining
14:11:22 <vks> go ahead
14:11:32 <sgordon> ok
14:11:39 <ybabenko> sgordon: it looks to me like for a VNF such as IMS we need a serious HA setup for openstack
14:11:46 <sgordon> so cloudon you had already broken out some requirements in this one
14:11:52 <sgordon> with the main constraints being in HA
14:11:56 <ybabenko> does something like this exist already today in the form of a verified blueprint?
14:12:06 <sgordon> ybabenko, not quite
14:12:12 <ybabenko> exactly
14:12:22 <sgordon> in particular that second requirement about affinity/anti-affinity groups being nested
14:13:11 <sgordon> you can possibly force this by combining groups with host aggregate/az assignments
14:13:19 <sgordon> to mimic the same type of setup
14:13:31 <cloudon> the broader issue here I was trying to get at was how to represent the affinity requirements for services deployed as an N+k pool, with N large
14:13:32 <adrian-hoban> ybabenko: We should be clear to differentiate between HA deployment/config of the vIMS app and OpenStack HA from the controller perspective
14:13:36 <sgordon> #info implemented as a series of N+k compute pools; meeting a given SLA requires being able to limit the impact of a single host failure
14:13:43 <sgordon> #info potentially a scheduler gap here: affinity/anti-affinity can be expressed pair-wise between VMs, which is sufficient for a 1:1 active/passive architecture, but an N+k pool needs a concept equivalent to "group anti-affinity" i.e. allowing the NFV orchestrator to assign each VM in a pool to one of X buckets, and requesting OpenStack to ensure no single host failure can affect more than one bucket
14:13:58 <sgordon> adrian-hoban, here we're talking about the app itself
14:14:36 <cloudon> crudely: don't want too many of the service's VMs on the same host, but for perf reasons want them "nearby" for some definition of "nearby"
14:14:41 <ybabenko> adrian-hoban: i mean the OpenStack HA setup (core services like keystone) as IMS is normally deployed as a multi-site VNF
14:14:52 <vks> sgordon: do we have a doc on the affinity/anti-affinity stuff in place?
14:15:17 <imendels> are you sure IMS should be assumed to be multi-site?
14:15:30 <mkoderer> a deployed vIMS will be hosted in several DCs
14:15:31 <sgordon> vks, there is some coverage in the nova scheduler documentation
14:15:37 <vks> ok
14:15:52 <sgordon> it's fairly minimal but so is the functionality today
14:15:53 <ybabenko> imendels: yes
14:16:23 <adrian-hoban> ybabenko: sgordon: Seems like you guys are talking about different aspects of HA then...
14:16:24 <imendels> I have some references claiming different. Though I agree multi-site also makes sense
14:16:59 <cloudon> yes, in this use case I was concentrating solely on HA at the app level, not the platform level
14:17:18 <adrian-hoban> imendels: In the NFV architecture, I would see the multi-site aspects as being in the scope of NFVO
14:17:43 <sgordon> adrian-hoban, i was really just regurgitating what we have in the wiki so people understand the context
14:17:54 <sgordon> (and we have a record in meetbot of what we were referring to)
14:17:59 <imendels> adrian: right but can we assume that all apps (IMS in this case) will run multi-site because of it?
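[Editor's note: the pair-wise server group mechanics discussed above can be sketched with the Kilo-era nova CLI. This is a minimal illustration only - the image, flavor, and instance names are invented, and <server-group-uuid> stands for the ID returned by the first command:

    # Create a server group with the anti-affinity policy
    nova server-group-create vims-sprout-pool anti-affinity
    # Boot each pool member with the group UUID as a scheduler hint;
    # the scheduler will then never co-locate two members on one host
    nova boot --image clearwater-sprout --flavor m1.medium \
        --hint group=<server-group-uuid> sprout-1
    nova boot --image clearwater-sprout --flavor m1.medium \
        --hint group=<server-group-uuid> sprout-2

Note this only expresses "no two members share a host"; there is no way to say "no single host failure affects more than one of X buckets", which is exactly the group anti-affinity gap captured in the #info above.]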
14:18:05 <cloudon> adrian-hoban: +1; and further NFV explicitly allows deployment of a VNF across multiple underlying cloud platforms (NFVIs)
14:18:12 <vks> adrian-hoban: +1
14:18:25 <ybabenko> my point is: for a critical VNF such as vIMS we need a reference HA design for OpenStack services
14:18:33 <imendels> agree
14:18:49 <adrian-hoban> imendels: I think we can assume that it is a possibility but not required for all apps
14:18:57 <imendels> agree
14:19:07 <adrian-hoban> ybabenko: +1
14:19:19 <gmatefi1> Need for HA mainly depends on application elasticity
14:19:38 <sgordon> #info Require a reference HA design for OpenStack services for critical NFV such as vIMS
14:19:42 <vks> gmatefi1, not only elasticity
14:19:46 <aveiga> going to agree with adrian-hoban here: provide the functionality for things like multi-site, but don't make it a requirement. Remember that not everyone will deploy an app the same way
14:19:53 <cloudon> gmatefi1: not if you have an SLA to meet...
14:19:55 <sgordon> to me though there is a broader use case discussion here
14:19:58 <gmatefi1> for apps that are dynamically scaling, controller HA is a must-have as part of real-time operation
14:20:03 <ybabenko> sgordon: who is able to address this in the OS community? what is the right entry point for that?
14:20:12 <sgordon> in that there is a separate need to drill down on what multi-site means for telco
14:20:29 <imendels> in other words we need to assume that not all apps are ready for it, or not. But it's a fundamental assumption we must have
14:20:35 <sgordon> ybabenko, there is currently an effort underway to reinvigorate the HA documentation - maybe in the scope of that effort
14:20:46 <sgordon> they need helpers though
14:20:54 <ybabenko> sgordon: +1 can you link it?
14:21:24 <mkoderer> sgordon: +1 .. so are we talking about a multi-region setup? or different OS clusters...
14:21:26 <sgordon> here is the *current* scope of the guide:
14:21:29 <sgordon> #link http://docs.openstack.org/high-availability-guide/content/index.html
14:21:33 <sgordon> mkoderer, well this is the thing
14:21:36 <aveiga> sgordon: this is what I was getting at. multi-site in openstack effectively means that a VNF will exist in more than one "Region", but perhaps a telco may deploy one large region in multiple DCs. It's possible to schedule into different regions using the neutron network as well
14:21:38 <mkoderer> sgordon: ;)
14:21:41 <vks> sgordon, the NFV requirements put a lot of stress on HA
14:21:42 <sgordon> their focus currently is single site HA
14:21:48 <sgordon> if people want to expand that scope
14:21:49 <cloudon> better HA doc would help but there are still some fundamental issues such as seamless upgrade from one OS release to another
14:21:52 <sgordon> they need to get involved
14:21:53 <sgordon> ;)
14:22:08 <sgordon> trying to find the mailing list post(s)
14:22:36 <vks> for now single site HA would be fine
14:22:44 <mkoderer> sgordon: so what are we going to do now.. listing gaps in OS?
14:22:48 <imendels> that doc is more about OS control than anything else, no?
14:22:55 <ybabenko> sgordon: thanks. We are familiar with that and have a strong feeling that a lot still needs to happen in order to be able to deploy something like vIMS on HA OS
14:23:36 <sgordon> #link http://lists.openstack.org/pipermail/openstack-operators/2014-August/004987.html
14:23:41 <ybabenko> mkoderer: should we do a gap analysis and address/list the missing points?
14:24:03 <sgordon> ybabenko, yes - but again if nobody is speaking to the team working on it about that
14:24:06 <sgordon> they arent going to cover it
14:24:07 <sgordon> :)
14:24:09 <adrian-hoban> I'd like to suggest what we need to agree on first is what OpenStack should provide from an API perspective to support application HA configuration
14:24:38 <ybabenko> sgordon: who will address this?
14:25:08 <imendels> adrian-hoban: +1
14:25:09 <adrian-hoban> And by that I mean HA deployment configuration (not config of the app itself)
14:25:13 <mkoderer> sgordon: can we add all the gaps that we find during discussion to the use case?
14:25:35 <sgordon> it's a wiki, people can add anything they want :)
14:25:42 <mkoderer> and then start to find related blueprints
14:25:51 <sgordon> indeed
14:26:00 <mkoderer> and open specs if needed
14:26:13 <mkoderer> ok
14:28:25 <sgordon> so, a key question to adrian-hoban's point - what do we see as the 'API' here
14:28:38 <sgordon> given that e.g. server groups are implemented via scheduler hints
14:28:54 <sgordon> (albeit with some API calls for initial group creation)
14:29:41 <adrian-hoban> sgordon: I think the Heat APIs are probably the closest in scope to parts of what is required of NFVO functionality. Perhaps we start there?
14:30:02 <aveiga> adrian-hoban: +1
14:30:17 <ybabenko> are we still on vIMS?
14:30:20 <ybabenko> I am confused
14:30:24 <sgordon> yes
14:30:29 <cloudon> doesn't that assume an NFVO would use Heat? not sure that's the case
14:30:47 <vks> adrian-hoban, do you really think the heat APIs fit the NFV case?
14:30:50 <aveiga> cloudon: it might be a requirement if you're going to need coordination features
14:31:27 <ybabenko> vIMS -> need for OpenStack HA. Heat? We can use heat already today. But heat does not support multi-site configuration. How to address this?
14:31:37 <mkoderer> vks: not yet.. but we can try to change that
14:31:52 <sgordon> ybabenko, actually it does depending on what you mean by multi-site
14:31:52 <cloudon> aveiga: sorry - not sure I follow - a core part of an NFVO is co-ordination?
14:31:58 <sgordon> e.g. multi-region support was recently added
14:32:23 <sgordon> #link https://blueprints.launchpad.net/heat/+spec/multi-region-support
14:32:27 <vks> are we sticking to one site or looking at multi-site?
14:32:34 <sgordon> but i think that is still getting ahead of ourselves
14:32:40 <aveiga> cloudon: if you want to ensure that your app VMs are landing where you want them and are automatically rebuilt/scaled to meet your HA and load capabilities, then yes
14:32:41 <sgordon> i think stick to the requirements within a single site
14:32:56 <sgordon> as i said earlier multi-site for telco should be analysed as a separate use case imo
14:32:58 <vks> sgordon, +1
14:32:59 <adrian-hoban> vks: I'm not stating that. Just that Heat is close in functionality to some of the things NFVO is required to do. There is of course a likely path that NFVO implementations would drive the other APIs (Nova, Neutron) directly. I suggested we consider Heat APIs as a means of fleshing out what may be needed from other core APIs
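[Editor's note: a rough sketch of the Heat multi-region support sgordon links above, which lets a parent stack launch nested stacks in other regions. This assumes a Juno-or-later Heat; the region names and the per-site template vims_site.yaml are hypothetical, not from the meeting:

    cat > vims_two_region.yaml <<'EOF'
    heat_template_version: 2013-05-23
    resources:
      site_a:
        type: OS::Heat::Stack
        properties:
          context:
            region_name: RegionOne    # region names are deployment-specific
          template: { get_file: vims_site.yaml }
      site_b:
        type: OS::Heat::Stack
        properties:
          context:
            region_name: RegionTwo
          template: { get_file: vims_site.yaml }
    EOF
    heat stack-create vims-multi-region -f vims_two_region.yaml

Whether region-per-site of this kind is what "multi-site" means for telco is exactly the open question deferred below to mkoderer's use case.]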
14:33:00 <cloudon> aveiga: ok, but don't need Heat to do that
14:33:06 <sgordon> because it's more general, it's not specific to e.g. vIMS
14:33:42 <mkoderer> sgordon: I can write a use case for multi-site
14:33:57 <mkoderer> if needed
14:33:57 <sgordon> #action mkoderer to take a stab at documenting a use case for multi-site
14:34:03 <sgordon> mkoderer, thanks - that would be much appreciated
14:34:25 <sgordon> so we have the OS::Nova::ServerGroup resource in heat
14:34:36 <adrian-hoban> sgordon: +1. Agree we need to look at single site and multi-site deployments separately.
14:34:52 <vks> adrian-hoban, i just wanted to say the heat APIs in my point of view don't fit. yes, if we want to start with that, not a bad idea. But i think we should come up with new APIs in some time
14:34:52 <sgordon> which relates to the nova server group add call
14:34:56 <sgordon> under the hood
14:35:38 <sgordon> and then actual group membership is via the hints provided with the OS::Nova::Server resource
14:36:07 <sgordon> the key requirement here appears to be how do i express not only a relationship between servers in the group
14:36:14 <sgordon> but a relationship between those groups
14:36:23 <cloudon> so within a single site I want to deploy an N+k pool (which may just be a fraction of the overall service) - I still want to ensure no single host failure can knock out many VMs (and certainly no more than k...) - can server groups permit me to configure that?
14:36:36 <sgordon> "sort of"
14:36:54 <sgordon> so with the anti-affinity policy you obviously achieve that
14:37:01 <sgordon> at the expense that you dont get 'closeness'
14:37:01 <cloudon> sgordon: :)
14:37:10 <sgordon> that is, none of your servers/instances will reside on the same host
14:37:37 <sgordon> there have been proposals to implement "soft" anti-affinity that might be closer to what you want
14:37:39 <cloudon> ...which is too much spreading
14:37:47 <vks> sgordon, you mean service vms?
14:37:58 <sgordon> but again it would still only place on the same host after all options are exhausted
14:38:00 <sgordon> vks, no
14:38:06 <imendels> mkoderer: I suggest you distinguish between OS "control" and "servers" HA in the use case. Happy to assist if you want
14:38:08 <sgordon> vks, in the nova api instances are referred to as servers
14:38:11 <sgordon> hence "server groups"
14:39:41 <mkoderer> imendels: thx.. yep sure
14:39:44 <ybabenko> imendels: all the time we are speaking about OS HA
14:40:38 <cloudon> (hacky but might work) so could I define a host aggregate of a largish number of "close" hosts, then define my VMs to form a server group, then tell nova to instantiate them on the given aggregate with anti-affinity?
14:40:49 <imendels> ybabenko: not sure.. look at the server groups above... vs. whether your nova endpoint is HA and can be seamlessly upgraded
14:40:57 <vks> sgordon, here we are talking about special servers?
14:41:09 <vks> not the normal instances
14:41:14 <vks> right??
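[Editor's note: a minimal sketch of the OS::Nova::ServerGroup plus scheduler-hint pattern sgordon describes above, as a two-member pool. Image, flavor, and resource names are illustrative, not from the meeting:

    cat > vims_pool.yaml <<'EOF'
    heat_template_version: 2013-05-23
    resources:
      pool_group:
        type: OS::Nova::ServerGroup
        properties:
          policies: [anti-affinity]
      member_1:
        type: OS::Nova::Server
        properties:
          image: clearwater-sprout
          flavor: m1.medium
          scheduler_hints:
            group: { get_resource: pool_group }   # group membership via hint
      member_2:
        type: OS::Nova::Server
        properties:
          image: clearwater-sprout
          flavor: m1.medium
          scheduler_hints:
            group: { get_resource: pool_group }
    EOF
    heat stack-create vims-pool -f vims_pool.yaml

Note the policy only relates servers within one group; as sgordon says next, there is no way to express a relationship between groups.]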
14:41:19 <sgordon> vks, no - we're talking about any servers/instances you want to deploy in the manner cloudon refers to
14:41:40 <sgordon> cloudon, yes that was something i mentioned very early in the conversation
14:41:43 <sgordon> as a way to achieve it today
14:42:50 <vks> sgordon, but then you end up dealing with all the instances on the cloud instead of just the hosts which have special servers running on them
14:43:05 <sgordon> vks, i dont follow
14:43:10 <cloudon> ok, though a bit sub-optimal as it requires using a host aggregate to segment your hosts for app affinity purposes rather than physical capabilities
14:43:19 <sgordon> vks, you end up dealing with as many or as few instances as you add to the group
14:43:39 <sgordon> not all instances in the cloud need to be in a group, but those you want to place this way do
14:44:20 <adrian-hoban> You could also leverage host aggregates to help identify if the servers had a special config
14:44:39 <sgordon> yes
14:44:52 <vks> sgordon, ok that makes sense. but wherever those instances are running will need to be HA
14:44:58 <cloudon> the semantic you really want as an app is "instantiate VMs in this server group such that no more than X are on the same host" without reference to host aggregates unless the service needs some special physical capability
14:45:52 <sgordon> mmm
14:46:12 <sgordon> cloudon, what would the expected behavior be if i have exhausted all hosts
14:46:16 <ybabenko> can we just go line by line through the vIMS use case and agree on it?
14:46:16 <sgordon> that is, say X is 5
14:46:22 <sgordon> and all hosts have 5 instances
14:46:27 <sgordon> fail the request?
14:46:33 <ybabenko> i.e. "Mainly a compute application: modest demands on storage and networking." - what does "modest" mean?
14:46:37 <cloudon> if no option then overload - so more of a hint than a hard rule
14:46:48 <ybabenko> which features do we need from networking in order to support vIMS?
14:46:51 <ybabenko> IPv6?
14:46:57 <ybabenko> Distributed routing?
14:47:03 <sgordon> cloudon, right but at that point it's really no different than soft-affinity imo
14:47:03 <ybabenko> VRRP?
14:47:09 <DaSchab> LB?
14:47:10 <ybabenko> IPsec
14:47:11 <ybabenko> etc
14:47:12 <ybabenko> etc
14:47:13 <ybabenko> etc
14:47:16 <sgordon> unless you are suggesting it should stack the first host until it gets 5, and so on
14:47:32 <mkoderer> I would really like to see a transparent review of the use cases
14:47:34 <cloudon> no, definitely not stacking - that's an anti-pattern
14:47:52 <sgordon> mkoderer, can you expand on that
14:48:39 <mkoderer> should we move them to a git repo and do a gerrit review?... I would really like that
14:49:05 <adrian-hoban> cloudon: Do you see host separation as the only concern? What about rack-level separation or network-level separation?
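[Editor's note: the "hacky but might work" combination cloudon raises above - a host aggregate to bound "closeness" plus an anti-affinity server group for spreading - might look like this with the nova CLI. Host, aggregate, and flavor names are invented for illustration, <server-group-uuid> is the group from the earlier sketch, and AggregateInstanceExtraSpecsFilter is assumed to be enabled in the scheduler:

    # Group the "close" hosts into an aggregate and tag it
    nova aggregate-create vims-close-hosts
    # (subsequent commands take the aggregate ID returned above)
    nova aggregate-add-host <aggregate-id> compute-01
    nova aggregate-add-host <aggregate-id> compute-02
    nova aggregate-set-metadata <aggregate-id> vims_pool=true
    # Pin a flavor to that aggregate via extra specs
    nova flavor-key m1.vims set aggregate_instance_extra_specs:vims_pool=true
    # Boot pool members with the pinned flavor and the anti-affinity group hint
    nova boot --image clearwater-sprout --flavor m1.vims \
        --hint group=<server-group-uuid> sprout-3

As cloudon notes, this repurposes aggregates for app affinity rather than physical capability, and it still cannot express "no more than X members per host".]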
14:49:19 <sgordon> my concern with that approach is that we lose many of the people who dont know how to interact with it
14:49:41 <sgordon> (similar to how we lose some who cant/wont do irc meetings by having these sessions in irc)
14:50:11 <cloudon> adrian-hoban: indeed, yes, but was wary of introducing new semantics (especially physically motivated ones) for groupings of hosts that don't already exist in OS
14:50:25 <mkoderer> sgordon: but having it in the IRC meeting doesn't feel that productive
14:50:28 <cloudon> I see that more as a use for availability zones
14:50:54 <sgordon> mkoderer, i agree but is that because of the medium or because we spent 20 mins discussing broader HA issues
14:50:54 <adrian-hoban> cloudon: Agree with starting with incremental changes :-)
14:51:44 <mkoderer> cloudon: I guess we need additional features for AZ/host aggregates in general for NFV
14:52:14 <sgordon> basically from my pov i dont want to raise the bar on use case submission, i already have a couple that were emailed to me because people were unsure about adding to the wiki
14:52:17 <mkoderer> and the nova scheduling must be more flexible
14:52:23 <sgordon> i dont want to become the conduit for adding them to a git repo as well
14:52:34 <aveiga> sgordon: I think we should go through the use cases. We may find that there are more commonalities
14:52:48 <mkoderer> sgordon: I mean I can upload them to Gerrit...
14:52:52 <aveiga> just note down in the wiki that HA happens to be one that would be common
14:53:09 <cloudon> mkoderer: agree; there are many multi-site issues but even if solved that leaves scheduling gaps for what you ideally want within each site
14:53:12 <sgordon> #info possible commonalities around HA and multi-site requirements to identify as we progress through use cases
14:54:00 <sgordon> #info need more flexibility from Availability Zone and Host Aggregate placement, along with more flexible placement rules for the scheduler
14:54:20 <sgordon> mkoderer, with the scheduling are we referring specifically to the server group filters in this case
14:54:27 <sgordon> mkoderer, or are there other desirable tweaks
14:54:32 <adrian-hoban> I'd like it if we could complete the discussion on single site before tackling the multi-site items
14:55:05 <vks> adrian-hoban, +1
14:55:11 <cloudon> +1
14:55:12 <sgordon> +1
14:55:28 <sgordon> #info general agreement to focus on use cases in the context of single site deployment first
14:55:51 <mkoderer> adrian-hoban: yep we'll move this discussion to the multi-site use case
14:56:01 <sgordon> #info Is gerrit a better mechanism for use case review?
14:56:07 <cloudon> so are we agreed for the single site case that (a) there is an affinity issue for N+k groups (b) we could hack it with server groups + host aggregates (c) but that's not ideal?
14:56:45 <sgordon> that seems right from my pov, the question is really how an implementation that solves (a) in particular would work
14:57:08 <sgordon> can take that offline though
14:57:16 <sgordon> we only have ~3 min left
14:57:23 <ybabenko> cloudon: i am not up on the details of clearwater but maybe it would be a good idea to provide all these details in the wiki
14:57:32 <sgordon> but let's quickly touch on how to move somewhere on service chaining
14:57:51 <sgordon> mestery had mentioned on the m/l thread that this is a topic with much broader interest in neutron than just telco
14:58:08 <cloudon> ybabenko: the link in the use case gives full details - didn't want to over-burden the wiki
14:58:16 <sgordon> so it's a question of how to ensure the telco use case is documented and presentable when that comes around again at the vancouver summit
14:58:27 <mkoderer> sgordon: could you give us a link
14:58:33 <vks> sgordon, can we have everything in a single place?
14:58:46 <sgordon> vks, what is 'everything'?
14:59:08 <ybabenko> sgordon: in the [NFV] tag there is no email from mestery as far as i can see
14:59:17 <sgordon> that's really my point
14:59:18 <vks> use cases, and the plan of action
14:59:22 <sgordon> because he's not talking about NFV
14:59:28 <ybabenko> here is our draft https://etherpad.openstack.org/p/kKIqu2ipN6
14:59:45 <ybabenko> I would appreciate all the comments before putting it into the wiki
15:00:25 <sgordon> #link https://etherpad.openstack.org/p/kKIqu2ipN6
15:00:39 <sgordon> we're at time
15:00:53 <sgordon> let's jump over to #openstack-nfv while i find the link
15:01:08 <sgordon> but basically i cant force people who are having a generic discussion about service chaining in neutron
15:01:13 <sgordon> to tag it nfv / telco
15:02:10 <sgordon> #link http://lists.openstack.org/pipermail/openstack-dev/2015-January/053915.html
15:02:14 <sgordon> thanks all
15:02:18 <sgordon> #endmeeting