#openstack-lbaas log

20:00:15 <johnsom> #startmeeting Octavia
20:00:16 <openstack> Meeting started Wed Dec  6 20:00:15 2017 UTC and is due to finish in 60 minutes.  The chair is johnsom. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:17 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:20 <openstack> The meeting name has been set to 'octavia'
20:00:28 <johnsom> Hi folks!
20:00:29 <cgoncalves> o/
20:00:30 <nmagnezi> o/
20:00:33 <longstaff> hi
20:00:40 <rm_work> o/
20:00:40 <bar_> hi
20:00:54 <johnsom> #topic Announcements
20:01:04 <jniesz> hi
20:01:14 <johnsom> On the top of the announcements list is the milestone 2 release for queens.
20:01:33 <johnsom> I will be posting our release patch right after the meeting
20:01:38 <xgerman_> o/
20:01:48 <johnsom> Thank you to everyone for the great work that has gone into this patch!
20:02:08 <xgerman_> +1
20:02:12 <johnsom> Well, milestone I guess.
20:02:33 <rm_work> yeah, we still have until Q-3 to finish up remaining features
20:02:40 <xgerman_> the next (and last) milestone is in 6 weeks!
20:02:57 <johnsom> This means, we are in the last segment before feature freeze at milestone 3.
20:03:10 <johnsom> Feature freeze for milestone 3 is Jan 22
20:04:02 <nmagnezi> i hope we'll get the specs in before that milestone
20:04:03 <johnsom> We have a lot of stuff in-flight, so please review patches and vote.
20:04:15 <johnsom> I certainly hope so...
20:04:23 <rm_work> it'd have to be specs AND code, no?
20:04:27 <rm_work> to make Queens
20:04:45 <xgerman_> yep, code
20:04:48 <johnsom> We can still merge specs,
20:04:53 <xgerman_> +1
20:05:15 <xgerman_> I think they got a bit more relaxed with keeping specs rolling
20:05:22 <nmagnezi> +200 rm_work , but we need those in first.. trying not to be greedy here :)
20:05:39 <rm_work> heh
20:06:27 <johnsom> Also note, the PTG in Dublin is open for registration and the hotel block is available.
20:06:56 <johnsom> Any other announcements today?
20:07:28 <johnsom> #topic Brief progress reports / bugs needing review
20:07:52 <jniesz> #link https://review.openstack.org/#/c/519509/
20:07:59 <johnsom> I have been working on the driver spec.
20:08:03 <johnsom> #link https://review.openstack.org/509957
20:08:11 <jniesz> looks like we are close on that one, just need one more core : ) Looks at rm_work
20:08:18 <johnsom> I would like to see some reviews on the QoS patch:
20:08:25 <johnsom> #link https://review.openstack.org/458308
20:08:30 <johnsom> That should pretty much be done.
20:09:08 <xgerman_> okey
20:09:22 <johnsom> We also have a UDP protocol spec in flight that can use some reviews:
20:09:29 <johnsom> #link https://review.openstack.org/503606
20:09:48 <xgerman_> #link https://review.openstack.org/#/c/525704/ - watch that space
20:10:32 <rm_work> yeah i've been having some crazy internal fires, but that just wrapped up ... if someone compiles a list of patches ordered by priority, I will work down the list, otherwise I'll just pick random stuff from this meeting's links I guess
20:10:57 <johnsom> I have put in the community goals update patch for the tempest plugin (we need to report for MS2).  Those could use some love as well.
20:11:19 <johnsom> rm_work I will try to put something together
20:12:10 <johnsom> Any other updates?
20:12:16 <nmagnezi> +1 , such list would be great to have
20:12:23 <jniesz> +1 same
20:13:26 <johnsom> Ok, we have a list of other topics on the agenda.  I will go down the list and ask the requester to speak to them.
20:13:37 <johnsom> #topic (BAR_RH) Members API Improvements Proposal
20:13:42 <bar_> k
20:13:49 <bar_> Currently the URL for members show is: rest_method:: GET /v2.0/lbaas/pools/{pool_id}/members/{member_id}
20:13:59 <bar_> Why not allow addressing a member WITHOUT POOL_ID? e.g.: GET /v2.0/octavia/members/{member_id}
20:14:16 <bar_> That's my proposal pretty much, since member_id is unique
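A minimal sketch of the proposal above, assuming member IDs really are globally unique. The in-memory store, the IDs, and both function names are hypothetical and are not Octavia code; the point is only that a direct lookup returns the same record as the pool-scoped one.

```python
# Hypothetical in-memory store; IDs and addresses are made up for illustration.
MEMBERS = {
    "m-1": {"id": "m-1", "pool_id": "p-1", "address": "192.0.2.10"},
    "m-2": {"id": "m-2", "pool_id": "p-2", "address": "192.0.2.11"},
}


def get_member_nested(pool_id, member_id):
    """Existing style: the caller supplies both pool_id and member_id."""
    member = MEMBERS.get(member_id)
    # The pool_id acts only as a consistency check, since member_id is unique.
    if member is None or member["pool_id"] != pool_id:
        return None
    return member


def get_member_direct(member_id):
    """Proposed style: member_id alone identifies the member."""
    return MEMBERS.get(member_id)
```

Keeping the nested lookup alongside the direct one mirrors the backward-compatibility concern raised later in the discussion: the new path would be an addition, not a replacement.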
20:14:38 <cgoncalves> +1
20:15:10 <xgerman_> so basically that saves people from writing two ids? Or is there another benefit?
20:15:15 <rm_work> yeah honestly i have been wondering this
20:15:18 <johnsom> Would it be for a situation where members are shared?
20:15:21 <rm_work> i think we were considering shared members
20:15:23 <nmagnezi> #link https://storyboard.openstack.org/#!/story/2001317
20:15:25 <rm_work> ^^ yeah that
20:15:41 <rm_work> but, our DB model doesn't work for that
20:15:42 <johnsom> This decision pre-dates me on the project, so I don't have the historical reasoning
20:15:46 <nmagnezi> I'm in favor of that. but on the other hand I also submitted the story..
20:15:52 <rm_work> since it puts pool_id directly in the member table schema
20:15:53 <rm_work> sooooo
20:16:15 <rm_work> this is really a shared-members discussion i think
20:16:20 <rm_work> schema is easier to change than API
20:16:28 <rm_work> if we change the API, we're stuck
20:16:57 <johnsom> Well, for backward compatibility we need to retain the current API path.
20:16:57 <xgerman_> +1
20:17:28 <nmagnezi> rm_work, If members would be shared between the pools we'll need to think about this a bit more. for example an ability to get a list of pools the member is a part of etc
20:17:43 <nmagnezi> i agree with johnsom here
20:18:10 <bar_> The new API must allow a list in the pool_ids field.
20:18:22 <openstackgerrit> Merged openstack/python-octaviaclient master: Complement Octavia client with a set of features  https://review.openstack.org/522666
20:18:25 <bar_> if shared-members is an option.
20:18:39 <bar_> yay!
20:18:49 <xgerman_> well, the other problem I see is that we treat members as part of the pool and delete them with the pool
20:19:01 <nmagnezi> bar_, awesome job on ^^ :)
20:19:27 <bar_> nmagnezi, thx
20:20:15 <nmagnezi> xgerman_, right. we actually discussed this in the providers spec review. currently when we delete a pool we cascade and delete members
20:20:21 <nmagnezi> and I think also health monitors
20:20:40 <cgoncalves> is there anyone we could ask the historical reasoning for pool_id in GET member?
20:21:33 <rm_work> umm
20:21:38 <rm_work> me, johnsom, xgerman_
20:21:39 <nmagnezi> cgoncalves, git log ? :D
20:21:41 <rm_work> sbalukoff_ :P
20:21:52 <rm_work> we made the spec as-is
20:21:52 <nmagnezi> dougwig :P
20:21:57 <rm_work> yes, heh
20:22:01 <cgoncalves> haha!
20:22:04 <rm_work> dougwig but he is not here for me to highlight T_T
20:22:10 <johnsom> We could try to reach out to folks, but most everyone is not on OpenStack anymore.  I think xgerman_ might be the only person that was in Atlanta when that API was designed.
20:22:18 <rm_work> <<---
20:22:26 <rm_work> johnsom: weren't you also?
20:22:30 <johnsom> Nope
20:22:35 <johnsom> Before my time
20:22:45 <rm_work> yeah xgerman_ and me and sbalukoff_ and jorgem
20:22:50 <rm_work> and dougwig and ...
20:22:58 <rm_work> evgeny and sam
20:23:25 <rm_work> i am pretty sure it was for future shared members
20:23:31 <cgoncalves> ok, they all have got a notification on their irc clients. hopefully they will shed some light later
20:23:31 <rm_work> i think that's the only reason
20:24:07 <rm_work> just sbalukoff_ and i doubt it :P
20:24:26 <nmagnezi> i think that even if we don't get a reply about this, there's value in looking at this with fresh eyes and deciding on it
20:24:30 <rm_work> so let's table that and call it "Shared Members? Yes/No/Abort/Retry/Fail"
20:25:31 <nmagnezi> rm_work, not sure this is what bar_ had in mind :D
20:25:39 <cgoncalves> nmagnezi: +1
20:26:06 <rm_work> Yes, but that's what the core of the discussion is
20:26:16 <bar_> I am aware of that notion, and my API should be established in light of new notions like shared-members.
20:26:27 <rm_work> "Can we drop the pool-id on the member lookup" equates to "are we going to drop the idea of shared members"
20:26:38 <johnsom> Ok.  Let's think about issues around shared members. (The only reason it matters is to limit multiple health monitors from pinging the members)
20:26:55 <rm_work> but, the HMs on different pools may be different ...
20:26:56 <rm_work> soooo
20:27:02 <rm_work> honestly i think that doesn't even matter
20:27:08 <rm_work> they SHOULD all do their own pings
20:27:09 <johnsom> I think we can consider *adding* a direct member query API.  That might be independent of shared members.
20:27:28 <rm_work> k. that's a larger discussion we can have offline
20:27:31 <rm_work> our agenda today is packed
20:27:39 <johnsom> Yep.
20:27:47 <nmagnezi> johnsom, nothing prevents a user from adding the IP address as a pool member for two different pools, right?
20:27:58 <johnsom> bar_ Are you good with the discussion progress so far?  Can we table it?
20:27:59 <nmagnezi> it's just a different member object
20:28:02 <rm_work> right
20:28:09 <johnsom> Right
20:28:14 <rm_work> the question was do we bother making member objects shared
20:28:18 <rm_work> i actually vote "no"
20:28:23 <rm_work> complication is not worth the return
20:28:32 <johnsom> Yeah, it gets pretty messy with different protocol pools, etc.
20:28:34 <bar_> let's continue offline them
20:28:38 <bar_> *then
20:28:57 <johnsom> #topic (sanfern / rm_work) Changes to DB schema for VRRP ports
20:29:03 <alee> rm_work, yo
20:29:04 <johnsom> #link https://review.openstack.org/#/c/521138/
20:29:11 <jniesz> I am covering for sanfern, since it is 2 am for him
20:29:35 <jniesz> this is from the discussion we had at the PTG about changing the vrrp_ table names
20:29:45 <rm_work> yeah, i agree with the goal, i love it
20:29:45 <rm_work> but
20:29:54 <rm_work> is this a smart thing to do on the DB-schema side?
20:30:03 <rm_work> should we maybe just change the models and leave it at that?
20:30:10 <rm_work> the repos can do the translation
20:30:32 <alee> rm_work, hey --- I think I asked this before but you were out -- is there a doc to describe how to set up octavia with barbican?
20:30:46 <rm_work> alee: yes -- but hold on, middle of our meeting :P
20:30:48 <jniesz> I remember the reason was to make it obvious since vrrp was already bad naming and when the user looks in db
20:30:55 <alee> rm_work, oops sorry :)
20:31:21 <johnsom> I think there are two issues here:
20:31:21 <johnsom> 1. Admin API output change for amphora field change.
20:31:21 <johnsom> 2. The change to the underlying database schema which would require a full control plane downtime.
20:31:49 <rm_work> ^^ I am concerned about #2
20:32:01 <rm_work> I think even as deep as the Models it would be good to change this stuff
20:32:09 <rm_work> but I think between schema/repo we should leave it
20:32:11 <rm_work> if that's possible
20:32:18 <johnsom> On #1, I'm not so concerned about this.  The paint hasn't dried on that admin API patch and we haven't shipped a release with it yet.
20:32:20 <rm_work> or, i could be wrong
20:32:34 <rm_work> and maybe no one cares about the db schema change
20:32:54 <rm_work> if all of you guys think it's OK, then I guess that's fine
20:33:24 <jniesz> for #2 I think control plane going down for short period of time for planned maintenance upgrade is ok
20:33:28 <johnsom> If people are coding directly to the database, shame on them.  I have no problem breaking that.
20:33:42 <johnsom> The issue comes down to a control plane outage for upgrade right?
20:34:46 <rm_work> yeah
20:34:50 <johnsom> Did we drop everyone?  There is a netsplit...
20:34:57 <jniesz> i'm still here
20:34:57 <rm_work> and i just seriously thought there were openstack guidelines around db schema changes
20:35:04 <nmagnezi> i'm still here
20:35:08 <johnsom> Ok, cool.
20:35:14 <nmagnezi> i think xgerman got disconnected
20:35:43 <jniesz> haven't we had to do schema changes in the past?
20:35:45 <johnsom> Well, for upgrade downtime there are assertions a team can make and get a governance tag. However, we have not yet made that assertion.
20:36:17 <johnsom> jniesz So far we have only done additions, no renames
20:37:09 <rm_work> yeah, no renames or removals
20:37:13 <rm_work> yet
20:37:40 <jniesz> well there is always a time for being first
20:37:46 <jniesz> : )
20:37:48 <rm_work> basically
20:37:50 <rm_work> #vote?
20:37:56 <rm_work> I'll abstain
20:38:07 <johnsom> To me, as long as we maintain the API and provide a migration path, I don't have a problem with it as we have not asserted rolling upgrade or zero downtime upgrade yet.
20:38:27 <johnsom> Please don't abstain, your vote counts.  Especially as you have this deployed.
20:38:35 <rm_work> right but
20:38:39 <jniesz> from my standpoint I think the naming makes more sense, and we started this patch based on feedback from the PTG
20:38:39 <rm_work> i would vote no
20:38:41 <rm_work> but
20:38:45 <rm_work> i don't care so much
20:39:06 <rm_work> ah i guess it makes sense just to vote that way, since we don't need unanimity
20:39:18 <jniesz> with active/active I think vrrp naming makes less sense than it does now
20:39:26 <rm_work> it already makes no sense
20:39:32 <rm_work> my objection has nothing to do with the renaming
20:39:40 <rm_work> i am a huge fan :P
20:39:47 <nmagnezi> I feel like i don't know enough about this (sorry for some reason that patch slipped my radar). so maybe I'm the one who needs to abstain
20:39:49 <johnsom> The other option is to do a two phase upgrade, one that adds and clones the field, then the operator updates the control plane, then one that removes the old field.
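The two phase upgrade described above can be sketched as follows, using an in-memory SQLite database so it runs standalone. This is an illustration under assumptions, not the patch under review: the table layout is reduced to two columns, and the new column name `frontend_ip` is hypothetical.

```python
import sqlite3

OLD_COL = "vrrp_ip"      # old column name, used here only as an example
NEW_COL = "frontend_ip"  # hypothetical new name, not taken from the patch


def expand(conn):
    """Phase 1: add the new column and clone existing values into it."""
    conn.execute(f"ALTER TABLE amphora ADD COLUMN {NEW_COL} TEXT")
    conn.execute(f"UPDATE amphora SET {NEW_COL} = {OLD_COL}")
    conn.commit()


def contract(conn):
    """Phase 2: remove the old column once no control plane code reads it."""
    # Rebuild the table rather than DROP COLUMN so older SQLite versions work.
    conn.execute(f"CREATE TABLE amphora_new (id TEXT PRIMARY KEY, {NEW_COL} TEXT)")
    conn.execute(f"INSERT INTO amphora_new SELECT id, {NEW_COL} FROM amphora")
    conn.execute("DROP TABLE amphora")
    conn.execute("ALTER TABLE amphora_new RENAME TO amphora")
    conn.commit()


def columns(conn):
    """Return the current column names of the amphora table."""
    return [row[1] for row in conn.execute("PRAGMA table_info(amphora)")]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE amphora (id TEXT PRIMARY KEY, {OLD_COL} TEXT)")
    conn.execute("INSERT INTO amphora VALUES ('amp-1', '10.0.0.5')")
    expand(conn)
    after_expand = columns(conn)
    # ... the control plane would be upgraded here, between the two phases ...
    contract(conn)
    after_contract = columns(conn)
    value = conn.execute(f"SELECT {NEW_COL} FROM amphora").fetchone()[0]
    return after_expand, after_contract, value
```

Note that this sketch does not keep the two columns in sync between the phases; a real migration would have to handle rows written by old control plane code before the contract step runs.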
20:40:33 <johnsom> Should we table this for a week and put it on the next agenda so everyone has time to review?
20:40:34 <rm_work> meh
20:40:44 <rm_work> yeah we have time
20:40:50 <rm_work> it'll go in quick once we decide, so
20:40:53 <rm_work> not super worried
20:41:07 <jniesz> would like to get this merged or decided as it has a lot of depen
20:41:12 <johnsom> Ok, we will vote next week about the DB schema change.
20:41:27 <jniesz> for example adding frontend network option
20:41:46 <rm_work> the vote would be
20:41:50 <jniesz> in a way it is holding us up from proceeding on other items
20:41:52 <rm_work> "can we rename fields in our db schema"
20:41:59 <rm_work> don't really need to review the specific patch
20:42:11 <rm_work> if you know your answer to that question nir
20:42:37 <johnsom> jniesz I would move forward with what you have.  The changes based on this vote should be very minimal.  Every access to the DB has to go through the abstraction layer.
20:43:48 <johnsom> Ok, in the interest of time:
20:43:51 <johnsom> #topic (rm_work / BAR_RH) Specify management IP addresses per amphora
20:43:58 <johnsom> #link https://review.openstack.org/#/c/505158/
20:44:04 <jniesz> but any patches for the abstraction layer would get impacted
20:44:10 <jniesz> that touched those tables
20:44:39 <bar_> rm_work?
20:44:55 <rm_work> So, there's another approach for this
20:44:58 <johnsom> jniesz Adding the option would use the abstraction which I don't think is in question here, just what is behind the abstraction
20:45:12 <rm_work> that I think is better than mucking with the flows and the actual logic we use
20:45:30 <rm_work> (also it is true that pre-creating that port is problematic for my install)
20:45:39 <rm_work> so yes I have a vested interest here
20:45:56 <johnsom> This has a long history....  It pre-dates the network namespace and was blocked by the old agent framework being broken.
20:46:21 <rm_work> the issue is basically "we want to bind the agent on the amp to the mgmt-ip directly and not listen on 0.0.0.0"
20:46:43 <rm_work> the problem is that we need to send the config for the agent at boot time, which is before the IP is assigned
20:46:46 <rm_work> and there's no way to update it
20:46:49 <rm_work> SO
20:47:53 <johnsom> We need a config update method, that is for sure.
20:47:53 <rm_work> the current solution is: pre-generate the port/IP for the mgmt interface, and bind it on amp boot, so we can send the IP in the pre-generated config
20:48:02 <rm_work> my proposal is that we do something simpler that requires a feature we want anyway: add the ability to send an updated config to the amp agent, and then we simply start on 0.0.0.0 and the first connection (which we are trying to do anyway) just sends a new config
20:48:16 <rm_work> and it restarts on the right interface
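The "start on 0.0.0.0, then rebind when a new config arrives" idea can be sketched with plain sockets. Everything here is a toy stand-in: the `AgentListener` class, the `bind_host` and `bind_port` keys, and the use of loopback in place of a real management IP are assumptions for illustration, and none of it is the amphora agent's actual code.

```python
import socket


class AgentListener:
    """Toy stand-in for the agent's listening socket (hypothetical)."""

    def __init__(self, bind_host="0.0.0.0", port=0):
        self.sock = None
        self._listen(bind_host, port)

    def _listen(self, host, port):
        # Close any previous socket, then listen on the requested address.
        if self.sock is not None:
            self.sock.close()
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.sock.bind((host, port))
        self.sock.listen(1)

    @property
    def bound_host(self):
        return self.sock.getsockname()[0]

    def apply_config(self, new_config):
        """Rebind only if the pushed config names a different bind_host."""
        host = new_config.get("bind_host", self.bound_host)
        if host != self.bound_host:
            self._listen(host, new_config.get("bind_port", 0))
```

In use, the listener would start with the default wildcard address and the controller's first successful connection would call something like `apply_config({"bind_host": mgmt_ip})`. Authentication of that first connection (the two-way cert auth mentioned later) is deliberately left out of the sketch.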
20:48:58 <xgerman_> and I am back
20:49:31 <rm_work> welcome back xgerman_
20:49:41 <nmagnezi> rm_work, to be frank, i see the binding to 0.0.0.0 as a potential security concern
20:49:41 <johnsom> rm_work for my background, what is the problem with pre-building the port?
20:49:42 <rm_work> oh ptoohill was also there in Atlanta IIRC :P
20:49:44 <cgoncalves> rm_work: couldn't another approach be leveraging cloud-init? I'm not certain but I guess we can get the IP at that phase and before configuring the agent
20:50:09 <rm_work> cgoncalves: hmmm I am not sure -- that's an interesting question
20:50:23 <rm_work> that could be a better approach than either of the ones i mentioned
20:50:24 <xgerman_> pretty sure the computer knows its IP on the mgmt port
20:50:27 <nmagnezi> rm_work, can't we both listen to a specific ip address and have the config updates you mentioned?
20:50:27 <johnsom> Yeah, there might be an option to have cloud-init set that setting in the config file
20:50:51 <rm_work> nmagnezi: A) that's what we do now; B) We booted the amp, so it only HAS interfaces we told it to have at boot; C) we still do two-way-cert-auth
20:51:20 <johnsom> Well, right now it binds to 0.0.0.0
20:51:23 <rm_work> right
20:51:31 <johnsom> in the default namespace
20:51:38 <rm_work> so we can have it bound to 0.0.0.0 for all of like .... 1 second
20:51:50 <rm_work> because we're literally polling the address to reach the amp agent
20:52:18 <rm_work> and if someone else managed to get there first and rebind it somewhere else... who cares, it's a blank amp, and we'll kill it momentarily anyway once we timeout (since our calls won't reach it)
20:52:20 <nmagnezi> btw that option for amphora agent ip bind also shows at octavia.conf , which does not make any sense. just a side note.
20:52:36 <rm_work> nmagnezi: err really? lol yeah that makes no sense
20:53:00 <nmagnezi> rm_work, yeah. bar_'s patch handles that as well
20:53:10 <rm_work> nmagnezi: where is that
20:53:13 <rm_work> i actually don't see it
20:53:17 <xgerman_> sorry for being late - we can also use the mgmt subnet instead of 0.0.0.0
20:53:28 <xgerman_> why is that now workable?
20:53:35 <xgerman_> now=not
20:53:44 <rm_work> xgerman_: that's what we're talking about -- HOW we do that
20:53:55 <johnsom> Those options were documented there as "amphora agent only" because they show up in the config settings.  That is why they were there.  people were putting up patches saying they were missing.
20:54:21 <xgerman_> we know the mgmt subnet before booting the amp so we can just set it in the config?
20:54:24 <rm_work> johnsom: but i really am not seeing them. where do you see them?
20:54:26 <xgerman_> the config
20:54:30 <nmagnezi> rm_work, so to answer you: A) that does not mean we should keep doing that B) indeed. but someone later on could update the instance via nova and add interface with dhcp, no? C) indeed. but that does not mean we still need to accept mgmt traffic on all available addresses.
20:54:44 <rm_work> nmagnezi: there is no later
20:54:44 <nmagnezi> i don't recall any openstack services that bind to *
20:54:51 <rm_work> we're saying we rebind it immediately on boot
20:55:10 <rm_work> [12:48:07]  <rm_work>	my proposal is that we do something simpler that requires a feature we want anyway: add the ability to send an updated config to the amp agent, and then we simply start on 0.0.0.0 and the first connection (which we are trying to do anyway) just sends a new config
20:55:24 <rm_work> ^^ the new config binds it to the right IP
20:55:44 <xgerman_> so a subnet CIDR is not good enough?
20:55:48 <jniesz> can't we pass the IP to bind through config drive, and then it binds on boot /start
20:55:59 <xgerman_> +1
20:56:08 <nmagnezi> rm_work, touche
20:56:09 <rm_work> xgerman_: how do we use that to set the bind?
20:56:11 <rm_work> jniesz: right the problem is that we don't know the IP until after the boot command is sent
20:56:14 <johnsom> xgerman_ You can't listen on a subnet, you can only listen (bind) to an address.
20:56:34 <xgerman_> I can iptable drop them if I want…
20:56:35 <rm_work> jniesz: that is exactly the current patch's approach -- which requires it to pre-make the port
20:56:44 <rm_work> which i'd like to avoid
20:56:46 <nmagnezi> jniesz, that's what bar_'s patch is doing
20:57:09 <johnsom> jniesz Yes, technically that comes in via config drive from nova.  That is the cloud-init option that was proposed earlier
20:57:34 <rm_work> we're about to be out of time here and i have a hard-stop in 3min T_T
20:57:40 <jniesz> or need something in the systemd script to read the IP off the interface
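One way to reconcile the "we know the mgmt subnet" and "you can only bind to an address" points is for a startup step to pick whichever local address falls inside the known management CIDR. The sketch below assumes the list of local addresses has already been collected by some other means; the function name and the example addresses are hypothetical.

```python
import ipaddress


def pick_bind_address(local_addresses, mgmt_cidr):
    """Return the first local address inside the management subnet, or None."""
    network = ipaddress.ip_network(mgmt_cidr)
    for addr in local_addresses:
        # Membership is False for addresses of the other IP version.
        if ipaddress.ip_address(addr) in network:
            return addr
    return None
```

For example, `pick_bind_address(["127.0.0.1", "192.0.2.25"], "192.0.2.0/24")` selects `192.0.2.25`, which the agent could then bind to instead of the wildcard address.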
20:57:42 <johnsom> Yeah, me too
20:57:46 <rm_work> maybe we need to punt this also
20:57:51 <johnsom> lol
20:57:51 <rm_work> since maybe people need to read the existing patch
20:57:54 <rm_work> and see what it's doing
20:58:06 <nmagnezi> rm_work, maybe it's just becoming a late hour for me, but i still didn't get why pre-creating that port is an issue O_o
20:58:26 <johnsom> Ok, I will put these on the next agenda too.  I might re-order for fairness of the other topic we didn't get to.
20:58:27 <rm_work> nmagnezi: in our environment, we can't specify a network/subnet for VMs
20:58:32 <rm_work> we have to let them create their own ports
20:58:40 <rm_work> so we *can't* pre-create a port and then bind it
20:58:49 <nmagnezi> aha.
20:58:54 <rm_work> IIRC CERN does something similar
20:59:04 <rm_work> from talking with them before
20:59:09 <johnsom> Yeah, I think there are AZ issues with some deployments like that
20:59:12 <rm_work> yes
20:59:23 <rm_work> IMO we *cannot* merge this as it is
20:59:24 <johnsom> Ok, thanks folks!
20:59:25 <xgerman_> yeah, we should avoid pre-creating the port in case people do fancy SR-IOV stuff
20:59:28 <nmagnezi> well. can't say it's not a valid usecase.
20:59:29 <rm_work> i am just trying to be diplomatic :)
20:59:44 <nmagnezi> rm_work, let's discuss this tomorrow
20:59:46 <rm_work> kk
20:59:46 <johnsom> SR-IOV is a good example too
20:59:50 <nmagnezi> maybe we can figure out a middle ground
20:59:53 <rm_work> i mean
21:00:00 <rm_work> i think you will be OK with my proposal
21:00:05 <jniesz> speaking of SR-IOV that has been giving me headaches with the X710
21:00:05 <johnsom> #endmeeting