16:01:26 <Sukhdev_> #startmeeting ironic_neutron
16:01:27 <openstack> Meeting started Mon Nov 23 16:01:26 2015 UTC and is due to finish in 60 minutes.  The chair is Sukhdev_. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:29 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:31 <openstack> The meeting name has been set to 'ironic_neutron'
16:01:43 <Sukhdev_> Good morning everybody!!
16:01:49 <kevinbenton> good morning
16:02:05 <Sukhdev_> #topic: Agenda
16:02:12 <Sukhdev_> #link: https://wiki.openstack.org/wiki/Meetings/Ironic-neutron#Meeting_November_23.2C_2015
16:02:22 <Sukhdev_> #topic: Announcements
16:02:37 <Sukhdev_> M1 will be going out sometime next week
16:03:22 <Sukhdev_> I do not have any announcements - does anybody want to announce anything?
16:03:35 <Sukhdev_> I have a very focused agenda this morning
16:03:48 <Sukhdev_> Let's dive into it then -
16:04:03 <Sukhdev_> #topic: CI discussion
16:04:06 <lazy_prince> o/
16:04:38 <Sukhdev_> This is the main thing I would like to get a consensus on, if possible
16:05:09 <Sukhdev_> lazy_prince and I had a chat about coming up with a plan for it -
16:05:35 <Sukhdev_> I also had a chat with kevinbenton (most of you know him) from neutron core team about this
16:06:27 <Sukhdev_> so, I invited kevinbenton to join us this morning to participate in this discussion as he is very familiar with the OVS and L2 agent part
16:06:42 <Sukhdev_> having said that - let's dive into the discussion.
16:06:47 <lazy_prince> and were you able to reach some conclusion..?
16:07:20 <Sukhdev_> no - we did not conclude anything - we thought we would do that here
16:07:34 <Sukhdev_> first lazy_prince can you please start with your findings
16:07:52 <Sukhdev_> based upon what you have tried and the roadblock you hit
16:08:41 <lazy_prince> So basically, when we use flat network to test network flipping,
16:09:04 <jroll> Sukhdev_: side note, updating some patches, can you press "restore" on this? https://review.openstack.org/#/c/213264/
16:09:54 <lazy_prince> we have one issue.. once a port is created, the dhcp agent will start serving the ip address irrespective of whether the port is bound or not..
16:10:39 <lazy_prince> so when nova boot is called, it will create a port for the tenant network..
16:11:17 <lazy_prince> which will be in unbound state..and then ironic will create another port on the provisioning network which will be bound.
16:12:20 <lazy_prince> so there will be two ports with the same mac address on two different flat networks. which means we get into a race condition of who serves the ip address first..
16:12:52 <kevinbenton> lazy_prince: so these flat networks are wired together to be on the same bridge then?
16:12:56 <lazy_prince> I hope I made my point.. did I..?
16:13:17 <lazy_prince> yes.. as it's a VM in ci/devstack
16:13:44 <kevinbenton> ack
16:14:10 <sambetts> lazy_prince: this shouldn't happen in a real env right? because of vlan separation?
16:14:19 <kevinbenton> so can someone give me a quick explanation of what it is that we are trying to validate with this test?
16:14:38 <lazy_prince> right.. but this is CI job specific issue that we are talking about..
16:14:44 <kevinbenton> sambetts: yes, normally two neutron networks wouldn't land on the same broadcast domain
16:14:47 <Sukhdev_> kevinbenton : basically the network flip logic
16:15:22 <sambetts> lazy_prince: can't we replicate the vlan separation with the OVS tags?
16:15:36 <jroll> kevinbenton: tl;dr attempting to test isolated networks in devstack
16:16:11 <lazy_prince> We want to stay in the limits of openstack and the OVS plugin does not support baremetal at the moment..
16:16:17 <kevinbenton> jroll, Sukhdev_: right, but if we put everything on the same flat network, how will we be testing isolated networks?
16:16:28 <jroll> kevinbenton: I have the same question :D
16:16:42 <lazy_prince> kevinbenton: good point...
16:17:08 <jroll> could we just drop an iptables rule between the two networks?
16:17:35 <lazy_prince> it's an L2 thing.. iptables i think plays in L3..
16:17:47 <lazy_prince> I could be wrong too...
16:18:13 <kevinbenton> but i'm still not sure what this test is validating, is it validating neutron, or just ironic's calls to neutron?
16:18:23 <jroll> the latter
16:18:55 <sambetts> could we set the tags on the ovs ports then the flows will just drop packets for the wrong network?
16:19:39 <kevinbenton> but if we are doing a bunch of manual wiring here, i'm not sure what we are validating...
16:20:04 <lazy_prince> kevinbenton: +1
16:20:46 <Sukhdev_> kevinbenton and I discussed another idea - kevinbenton, can you describe it for the benefit of all
16:20:54 <kevinbenton> so what i was discussing with Sukhdev_ is to make an adjustment to OVS/ML2 to be able to bind ports added to vswitch based on the switch_info dict
16:21:32 <kevinbenton> the idea is that we add a port to OVS with some arbitrary name (e.g. IRONICP1)
16:22:06 <kevinbenton> then create ports with switch info populated with a chassis ID of the hostname of the compute node and the port name of IRONICP1
16:23:18 <kevinbenton> then when the OVS agent is doing lookups to the server to find corresponding Neutron port objects to the interfaces, we can lookup based on the switch_info
16:23:40 <lazy_prince> basically adding support for baremetal in ovs mech driver/agent..
16:23:46 <kevinbenton> right now the lookup is based on the mac, a port UUID, or fragment of the UUID
16:23:48 <kevinbenton> lazy_prince: yes
16:23:56 <sambetts> +1
16:23:59 <lazy_prince> +1
16:24:07 <kevinbenton> if we go this route, we are testing everything
16:24:18 <sambetts> this is what I expected us to do
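The lookup kevinbenton describes above can be illustrated with a small sketch. This is not the Neutron ML2 code or API: the port records are plain dicts, and the key names `switch_info`, `chassis_id` and `port_name`, along with both function names, are assumptions made up for the example. It only shows why a MAC-based lookup is ambiguous when two ports share a MAC, while a lookup keyed on (hostname, interface name) is not.

```python
from typing import List, Optional

# Hypothetical stand-ins for Neutron port records (not the real schema).
# Both ports carry the same MAC, as in the devstack scenario discussed above.
PORTS = [
    {"id": "tenant-port", "mac": "52:54:00:aa:bb:cc", "switch_info": None},
    {
        "id": "prov-port",
        "mac": "52:54:00:aa:bb:cc",
        "switch_info": {"chassis_id": "compute-1", "port_name": "IRONICP1"},
    },
]


def find_ports_by_mac(ports: List[dict], mac: str) -> List[dict]:
    """MAC-based lookup: may return more than one candidate."""
    return [p for p in ports if p["mac"] == mac]


def find_port_by_switch_info(
    ports: List[dict], hostname: str, interface_name: str
) -> Optional[dict]:
    """Lookup keyed on the agent's hostname and the OVS interface name."""
    matches = [
        p
        for p in ports
        if p.get("switch_info")
        and p["switch_info"].get("chassis_id") == hostname
        and p["switch_info"].get("port_name") == interface_name
    ]
    if len(matches) > 1:
        raise ValueError("more than one port claims this switch port")
    return matches[0] if matches else None
```

In this toy data the MAC lookup returns both ports, whereas `find_port_by_switch_info(PORTS, "compute-1", "IRONICP1")` returns only the provisioning port, because the tenant port has no switch info yet.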
16:24:40 <lazy_prince> How is the neutron community going to react to this idea..? do we have their blessing..?
16:24:46 <jroll> sambetts: +1
16:24:55 <jroll> kevinbenton: so is the change purely in the ml2 thing for ovs, or also in ovs itself
16:25:00 <kevinbenton> it will require a change to the ML2 plugin because right now the port lookup stuff is hard-coded and not dependent on the driver
16:25:24 <jroll> nod, seems fine to me
16:25:25 <kevinbenton> jroll: definitely no OVS change, but possibly a modification on the OVS agent (i assume that's what you meant)
16:25:37 <kevinbenton> jroll: mainly a server-side change
16:25:54 <jroll> kevinbenton: nah, I meant ovs itself, just to be sure :)
16:25:55 <kevinbenton> lazy_prince: i can help push this forward
16:26:24 <kevinbenton> lazy_prince: there has been a TODO in the code for a while to move some of the port lookup logic into the ML2 drivers
16:26:32 <lazy_prince> kevinbenton: that would be awesome.. a new bp or soec is needed..?
16:26:50 <lazy_prince> s/soec/spec/
16:27:15 <kevinbenton> lazy_prince: might need a small spec
16:27:33 <kevinbenton> lazy_prince: i coded up most of the server changes yesterday and they aren't too invasive
16:27:53 <kevinbenton> lazy_prince: but it's a new ML2 driver API so it's probably good to have a spec for it anyway
16:28:03 <Sukhdev_> kevinbenton : ha ha and you told me that you might not be able to get to it :-)
16:28:20 <kevinbenton> Sukhdev_: i didn't get it clean enough to push up as gerrit reviews yet
16:28:36 <lazy_prince> kevinbenton: let me know when it's ready for testing.. I would like to test it..
16:29:12 <kevinbenton> lazy_prince: yes, so i was planning on getting it pushed up as a WIP patch today so people could try it out
16:29:20 <lazy_prince> does anyone see any other blockers besides this to get it included in ci..?
16:29:44 <Sukhdev_> lazy_prince good question
16:30:04 <kevinbenton> i have a quick question
16:30:18 <kevinbenton> so is the switch_info dict populated on both ports at the same time?
16:30:36 <kevinbenton> or will ironic only populate it on the port that it wants to be active?
16:30:41 <lazy_prince> nope.. only on one at a time..
16:31:05 <kevinbenton> excellent. the logic would have been more complex if i had multiple results and had to determine which one was currently bound
16:31:31 <kevinbenton> oh, one more thing. was this using a different vnic_type, or was it just a different device_owner?
16:31:51 <jroll> lazy_prince: fyi, it'll need to use a whole disk image because it won't be able to pxe boot the tenant image, but the disk image already exists in devstack so not a huge deal
16:31:52 <Sukhdev_> kevinbenton : nova boot initiated call will not have host_id and will not have this information as well
16:32:27 <lazy_prince> jroll: thanks.. but I have that covered...
16:32:32 <kevinbenton> Sukhdev_: don't you pass a port to nova boot that contains the switch_info?
16:32:45 <jroll> lazy_prince: cool, just making sure :)
16:32:47 <Sukhdev_> kevinbenton :no
16:33:20 <Sukhdev_> kevinbenton : just the network where the BM needs to attach (i.e. tenant network)
16:33:22 <kevinbenton> Sukhdev_: so in a normal deployment, how is anything supposed to be wired up correctly at boot time?
16:33:54 <kevinbenton> (a deployment with a real switch)
16:34:48 <lazy_prince> Sukhdev_: now you can share your ppt with kevinbenton
16:35:31 * Sukhdev_ will let Ironic experts answer it for kevinbenton
16:35:52 <Sukhdev_> lazy_prince : will do
16:36:14 <kevinbenton> if we need to wire ports for bare metal servers, we will always need to provide the switch_info for where they are connected
16:36:53 <jroll> kevinbenton: nova creates the tenant port, does not wire it up yet. ironic creates the port on the provisioning network with switch info, that's where we do the deploy. then we drop the provisioning port after deploy and wire up the tenant port
16:37:40 <kevinbenton> jroll: but the provisioning port must also have switch_info
16:37:47 <baoli> just a quick thought, if a port is not bound, then the information shouldn't be populated in dnsmasq, right? So although the tenant network neutron port is initially created, it shouldn't be added into dnsmasq when it's not bound.
16:37:55 <kevinbenton> jroll: otherwise how will neutron put the port on the correct network?
16:38:05 <jroll> kevinbenton: the provisioning port will have switch info
16:38:28 <sambetts> baoli: dnsmasq is populated whether or not the port is bound
16:38:30 <kevinbenton> jroll: oh, i thought Sukhdev_ was telling me that it wouldn't. that was the confusion
16:39:02 <jroll> kevinbenton: the port nova creates is the tenant port, and will initially not have switch_info; that's provided by ironic after the deploy is done
16:39:04 <jroll> make sense?
16:39:06 <lazy_prince> sambetts: I guess, baoli is proposing to change that behaviour..
16:39:10 <baoli> sambetts, would making a change like that solve the problem?
16:39:26 <kevinbenton> jroll: but doesn't it need the network to deploy?
16:39:39 <sambetts> baoli: yes, but it wouldn't solve the problem that the networks aren't isolated
16:39:40 <jroll> kevinbenton: yeah, that's the provisioning port that ironic creates
16:39:56 * Sukhdev_ getting lost in multiple conversations
16:40:07 <kevinbenton> baoli: we can't make that kind of change (i'll come back to that in a second)
16:40:47 <kevinbenton> jroll: so what's the initial port used for that nova creates?
16:40:47 <baoli> kevinbenton, sure.
16:41:20 <jroll> kevinbenton: it's the tenant port, wired up after the deploy is done. the provisioning port only lives for the lifetime of the deploy
16:41:55 <kevinbenton> jroll: oooh, okay. it's ultimately the port you want to use
16:41:59 <jroll> yep
16:42:08 <kevinbenton> jroll: and you will update it with switch_info after everything else is done
16:42:14 <jroll> yep
16:42:19 <kevinbenton> makes sense
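The port sequence jroll and kevinbenton have just worked through can be sketched as a toy simulation. `FakeNeutron`, its methods, the port IDs and the network names are all hypothetical and exist only for this example; they are not the Neutron or Ironic APIs. The point is the ordering: only the port that currently carries switch info is wired to the physical switch port.

```python
class FakeNeutron:
    """Toy in-memory model of ports; not the real Neutron client."""

    def __init__(self):
        self.ports = {}

    def create_port(self, port_id, network, switch_info=None):
        self.ports[port_id] = {"network": network, "switch_info": switch_info}

    def update_port(self, port_id, switch_info):
        self.ports[port_id]["switch_info"] = switch_info

    def delete_port(self, port_id):
        del self.ports[port_id]

    def wired_networks(self, switch_info):
        """Networks of the ports currently mapped to this switch port."""
        return sorted(
            p["network"]
            for p in self.ports.values()
            if p["switch_info"] == switch_info
        )


def deploy(neutron, link):
    # 1. nova creates the tenant port; it has no switch info yet.
    neutron.create_port("tenant", "tenant-net")
    # 2. ironic creates the provisioning port with switch info.
    neutron.create_port("prov", "provisioning-net", switch_info=link)
    during_deploy = neutron.wired_networks(link)
    # 3. After the deploy, the provisioning port is dropped ...
    neutron.delete_port("prov")
    # 4. ... and the tenant port is updated with the same switch info.
    neutron.update_port("tenant", switch_info=link)
    after_deploy = neutron.wired_networks(link)
    return during_deploy, after_deploy
```

Calling `deploy(FakeNeutron(), ("compute-1", "IRONICP1"))` yields the provisioning network during the deploy and the tenant network afterwards, with never more than one port mapped to the switch port at a time.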
16:42:27 <kevinbenton> ok, back to baoli's suggestion
16:43:16 <kevinbenton> baoli: first issue is that creating unbound neutron ports is a way to make DHCP reservations for stuff not managed by openstack
16:43:32 <kevinbenton> baoli: e.g. completely unmanaged bare metal
16:43:50 <kevinbenton> baoli: or whatever someone might want neutron to give an address to on a provider network
16:44:08 <kevinbenton> baoli: so if we don't allow unbound ports to have dhcp reservations, we will break that
16:44:32 <kevinbenton> baoli: the second issue is that it means the DHCP agent has to understand the port binding process
16:44:47 <kevinbenton> baoli: which doesn't necessarily exist in all core plugins
16:46:10 <kevinbenton> baoli: we could conceivably make a change to the dhcp agent to not offer leases to ports with admin_state_up set to False
16:46:49 <baoli> kevinbenton: yeah, that's what I'm thinking. It's just a matter of when to put the lease in the config file to be offered.
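The DHCP-side idea floated here (skip ports with admin_state_up set to False when building the lease configuration) can be sketched as below. This is an illustration only, not the Neutron DHCP agent: the function name, the port dict keys and the simplified `mac,ip` entry format are assumptions for the example.

```python
def dhcp_host_entries(ports):
    """Build simplified 'mac,ip' host entries, skipping administratively
    down ports. A port without the key is treated as up."""
    return [
        f"{p['mac']},{p['ip']}"
        for p in ports
        if p.get("admin_state_up", True)
    ]


# Two hypothetical ports sharing one MAC on different networks.
example_ports = [
    {"mac": "52:54:00:aa:bb:cc", "ip": "10.0.0.5", "admin_state_up": False},
    {"mac": "52:54:00:aa:bb:cc", "ip": "192.168.0.5", "admin_state_up": True},
]
```

With `example_ports`, only the second port produces an entry, so a single reservation exists for the shared MAC. As the discussion goes on to note, this would not stop the machine from reaching both broadcast domains.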
16:47:32 <sambetts> still doesn't solve that the machine can access both broadcast domains
16:48:03 <Sukhdev_> baoli : we could end up changing the behavior in neutron for many plugins
16:48:08 <kevinbenton> right, i would be much happier if we can have this testing more of a 'real' end-to-end setup
16:48:25 <baoli> or I'd say that we may not create the tenant network neutron port before the deploy is done if that's possible?
16:48:30 <jroll> agree
16:48:32 <kevinbenton> so if we have this bare metal like support in OVS. it's just like how it will work with a mech driver
16:49:07 <sambetts> baoli: that would require reworking logic inside nova
16:50:11 * Sukhdev_ time check 10 min
16:50:54 <kevinbenton> baoli: so going down that road is trying to make neutron work in a way that violates the assumptions about neutron networks (separate broadcast domains) so it will be hard to justify to the wider community
16:51:15 <kevinbenton> Sukhdev_: ack. so what i will do is push up my WIP code at the end of the day
16:51:23 <Sukhdev_> Folks, in the interest of time - are we in agreement with the proposal
16:51:40 <sambetts> I am
16:51:43 <jroll> +1 from me
16:52:10 <Sukhdev_> and I had already given +2 to it before the meeting :-):-)
16:52:21 <lazy_prince> +1
16:52:56 <Sukhdev_> #action: Sukhdev to work with kevinbenton to get the spec and the patch worked up in neutron for the network flip logic
16:53:36 <Sukhdev_> Folks, this was the main agenda item on my mind to reach a conclusion on
16:53:50 <Sukhdev_> #topic: Open Discussion
16:53:56 <baoli> I agree to the approach. some questions may be clarified after seeing the code
16:54:19 <jroll> jfyi, I updated the nova spec and rebased nova patches. hoping to find some time to hack on them to get them working this week
16:54:20 <Sukhdev_> I am going to skip everything on agenda and open for discussion
16:54:43 <baoli> Sukhdev, I have a question on how the vPC config is injected into the tenant's image in your test.
16:54:43 <kevinbenton> (sorry to eat so much time)
16:54:49 <Sukhdev_> jroll : I answered some of the question on the review comments on the nova spec
16:55:12 <Sukhdev_> kevinbenton : that was so nice of you to agree to join us and provide this help
16:55:24 <jroll> Sukhdev_: well, I updated the spec to answer them better as it was a pretty poor spec :P
16:55:32 <jroll> kevinbenton: indeed, thank you!
16:55:47 <Sukhdev_> jroll ::-)
16:56:20 <Sukhdev_> baoli : I did not understand your question - I use the standard image
16:56:45 <baoli> Sukhdev: do you need to create a bonded interface after it's booted
16:56:55 <sambetts> Sukhdev_: are you using cloud-init to setup the bonded interfaces?
16:57:22 <Sukhdev_> baoli sambetts : oh - I have not started to test the bonded interfaces yet
16:57:35 <Sukhdev_> lazy_prince : have you done any such testing yet ?
16:58:08 <Sukhdev_> Folks BTW, item 7 on the etherpad (under issues) still needs closure - https://etherpad.openstack.org/p/ironic-neutron-mid-cycle
16:58:16 <jroll> there's work that needs to be done in nova to support vlans and bonds in configdrive/metadata
16:58:34 <lazy_prince> nope... not yet.. but I will be starting on this very soon... a bit busy for the time being..
16:58:48 <Sukhdev_> lazy_prince : same here :-)
16:58:52 <baoli> is the plan to use the Racker cloud-init that supports network.json?
16:59:12 <jroll> Sukhdev_: I hope to fix that patch for item 7 this week
16:59:31 <Sukhdev_> jroll : cool - thanks
16:59:31 <jroll> baoli: the plan is to get those patches in cloud-init
16:59:48 * Sukhdev_ time check 1 min
16:59:58 <baoli> jroll: when is that going to happen? and is there someone pushing for that to happen?
17:00:34 <Sukhdev_> Folks time is up - I have to chair another meeting on this next - so, we have to close this - sorry :-)
17:00:39 <jroll> baoli: well, nova needs the vlan/bonding support first
17:01:01 <Sukhdev_> Thanks for attending today's meeting - we had excellent discussion and closures...
17:01:04 <kevinbenton> later!
17:01:05 <baoli> jroll, ok, got it, if a volunteer is needed, I can start looking into that.
17:01:10 <lazy_prince> and neutron needs support for trunk ports too...
17:01:11 <Sukhdev_> #endmeeting