16:01:26 <Sukhdev_> #startmeeting ironic_neutron
16:01:27 <openstack> Meeting started Mon Nov 23 16:01:26 2015 UTC and is due to finish in 60 minutes. The chair is Sukhdev_. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:29 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:31 <openstack> The meeting name has been set to 'ironic_neutron'
16:01:43 <Sukhdev_> Good morning everybody!!
16:01:49 <kevinbenton> good morning
16:02:05 <Sukhdev_> #topic: Agenda
16:02:12 <Sukhdev_> #link: https://wiki.openstack.org/wiki/Meetings/Ironic-neutron#Meeting_November_23.2C_2015
16:02:22 <Sukhdev_> #topic: Announcements
16:02:37 <Sukhdev_> M1 will be going sometime next week
16:03:22 <Sukhdev_> I do not have any announcements - does anybody want to announce anything?
16:03:35 <Sukhdev_> I have a very focused agenda this morning
16:03:48 <Sukhdev_> Let's dive into it then -
16:04:03 <Sukhdev_> #topic: CI discussion
16:04:06 <lazy_prince> o/
16:04:38 <Sukhdev_> This is the main thing I would like to get a consensus on, if possible
16:05:09 <Sukhdev_> lazy_prince and I had a chat about coming up with a plan for it -
16:05:35 <Sukhdev_> I also had a chat with kevinbenton (most of you know him) from neutron core team about this
16:06:27 <Sukhdev_> so, I invited kevinbenton to join us this morning to participate in this discussion as he is very familiar with the OVS and L2 agent part
16:06:42 <Sukhdev_> having said that - let's dive into the discussion.
16:06:47 <lazy_prince> and were you able to reach some conclusion..?
16:07:20 <Sukhdev_> no - we did not conclude anything - we thought we'd do that here
16:07:34 <Sukhdev_> first lazy_prince can you please start with your findings
16:07:52 <Sukhdev_> based upon what you have tried and the roadblock you hit
16:08:41 <lazy_prince> So basically, when we use flat network to test network flipping,
16:09:04 <jroll> Sukhdev_: side note, updating some patches, can you press "restore" on this? https://review.openstack.org/#/c/213264/
16:09:54 <lazy_prince> we have one issue.. once a port is created, the dhcp agent will start serving the ip address irrespective of whether the port is bound or not..
16:10:39 <lazy_prince> so when nova boot is called, it will create a port for the tenant network..
16:11:17 <lazy_prince> which will be in unbound state..and then ironic will create another port on provisioning network which will be bound.
16:12:20 <lazy_prince> so there will be two ports with the same mac id on two different flat networks. which means we get into race condition of who serves the ip address first..
16:12:52 <kevinbenton> lazy_prince: so these flat networks are wired together to be on the same bridge then?
16:12:56 <lazy_prince> I hope I made my point.. did I..?
16:13:17 <lazy_prince> yes.. as it's a VM in ci/devstack
16:13:44 <kevinbenton> ack
16:14:10 <sambetts> lazy_prince: this shouldn't happen in a real env right? because of vlan separation?
16:14:19 <kevinbenton> so can someone give me a quick explanation of what it is that we are trying to validate with this test?
16:14:38 <lazy_prince> right.. but this is CI job specific issue that we are talking about..
16:14:44 <kevinbenton> sambetts: yes, normally two neutron networks wouldn't land on the same broadcast domain
16:14:47 <Sukhdev_> kevinbenton : basically the network flip logic
16:15:22 <sambetts> lazy_prince: can't we replicate the vlan separation with the OVS tags?
16:15:36 <jroll> kevinbenton: tl;dr attempting to test isolated networks in devstack
16:16:11 <lazy_prince> We want to stay within the limits of openstack and the OVS plugin does not support baremetal at the moment..
16:16:17 <kevinbenton> jroll, Sukhdev_: right, but if we put everything on the same flat network, how will we be testing isolated networks?
16:16:28 <jroll> kevinbenton: I have the same question :D
16:16:42 <lazy_prince> kevinbenton: good point...
16:17:08 <jroll> could we just drop an iptables rule between the two networks?
16:17:35 <lazy_prince> it's an L2 thing.. iptables i think plays in L3..
16:17:47 <lazy_prince> I could be wrong too...
16:18:13 <kevinbenton> but i'm still not sure what this test is validating, is it validating neutron, or just ironic's calls to neutron?
16:18:23 <jroll> the latter
16:18:55 <sambetts> could we set the tags on the ovs ports then the flows will just drop packets for the wrong network?
16:19:39 <kevinbenton> but if we are doing a bunch of manual wiring here, i'm not sure what we are validating...
16:20:04 <lazy_prince> kevinbenton: +1
16:20:46 <Sukhdev_> kevinbenton and I discussed another idea - kevinbenton, can you describe it for the benefit of all
16:20:54 <kevinbenton> so what i was discussing with Sukhdev_ is to make an adjustment to OVS/ML2 to be able to bind ports added to vswitch based on the switch_info dict
16:21:32 <kevinbenton> the idea is that we add a port to OVS with some arbitrary name (e.g. IRONICP1)
16:22:06 <kevinbenton> then create ports with switch info populated with a chassis ID of the hostname of the compute node and the port name of IRONICP1
16:23:18 <kevinbenton> then when the OVS agent is doing lookups to the server to find corresponding Neutron port objects to the interfaces, we can lookup based on the switch_info
16:23:40 <lazy_prince> basically adding support for baremetal in ovs mech driver/agent..
16:23:46 <kevinbenton> right now the lookup is based on the mac, a port UUID, or fragment of the UUID
16:23:48 <kevinbenton> lazy_prince: yes
16:23:56 <sambetts> +1
16:23:59 <lazy_prince> +1
16:24:07 <kevinbenton> if we go this route, we are testing everything
16:24:18 <sambetts> this is what I expected us to do
16:24:40 <lazy_prince> How is neutron community going to react to this idea..? do we have their blessing..?
16:24:46 <jroll> sambetts: +1
16:24:55 <jroll> kevinbenton: so is the change purely in the ml2 thing for ovs, or also in ovs itself
16:25:00 <kevinbenton> it will require a change to the ML2 plugin because right now the port lookup stuff is hard-coded and not dependent on the driver
16:25:24 <jroll> nod, seems fine to me
16:25:25 <kevinbenton> jroll: definitely no OVS change, but possibly a modification on the OVS agent (i assume that's what you meant)
16:25:37 <kevinbenton> jroll: mainly a server-side change
16:25:54 <jroll> kevinbenton: nah, I meant ovs itself, just to be sure :)
16:25:55 <kevinbenton> lazy_prince: i can help push this forward
16:26:24 <kevinbenton> lazy_prince: there has been a TODO in the code for a while to move some of the port lookup logic into the ML2 drivers
16:26:32 <lazy_prince> kevinbenton: that would be awesome.. a new bp or soec is needed..?
16:26:50 <lazy_prince> s/soec/spec/
16:27:15 <kevinbenton> lazy_prince: might need a small spec
16:27:33 <kevinbenton> lazy_prince: i coded up most of the server changes yesterday and they aren't too invasive
16:27:53 <kevinbenton> lazy_prince: but it's a new ML2 driver API so it's probably good to have a spec for it anyway
16:28:03 <Sukhdev_> kevinbenton : ha ha and you told me that you might not be able to get to it :-)
16:28:20 <kevinbenton> Sukhdev_: i didn't get it clean enough to push up as gerrit reviews yet
16:28:36 <lazy_prince> kevinbenton: let me know when it's ready for testing.. I would like to test it..
16:29:12 <kevinbenton> lazy_prince: yes, so i was planning on getting it pushed up as a WIP patch today so people could try it out
16:29:20 <lazy_prince> does anyone see any other blockers other than this to get it included in ci..?
16:29:44 <Sukhdev_> lazy_prince good question
16:30:04 <kevinbenton> i have a quick question
16:30:18 <kevinbenton> so is the switch_info dict populated on both ports at the same time?
16:30:36 <kevinbenton> or will ironic only populate it on the port that it wants to be active?
16:30:41 <lazy_prince> nope.. only on one at a time..
16:31:05 <kevinbenton> excellent. the logic would have been more complex if i had multiple results and had to determine which one was currently bound
16:31:31 <kevinbenton> oh, one more thing. was this using a different vnic_type, or was it just a different device_owner?
16:31:51 <jroll> lazy_prince: fyi, it'll need to use a whole disk image because it won't be able to pxe boot the tenant image, but the disk image already exists in devstack so not a huge deal
16:31:52 <Sukhdev_> kevinbenton : nova boot initiated call will not have host_id and will not have this information as well
16:32:27 <lazy_prince> jroll: thanks.. but I have that covered...
16:32:32 <kevinbenton> Sukhdev_: don't you pass a port to nova boot that contains the switch_info?
16:32:45 <jroll> lazy_prince: cool, just making sure :)
16:32:47 <Sukhdev_> kevinbenton : no
16:33:20 <Sukhdev_> kevinbenton : just the network where the BM needs to attach (i.e. tenant network)
16:33:22 <kevinbenton> Sukhdev_: so in a normal deployment, how is anything supposed to be wired up correctly at boot time?
16:33:54 <kevinbenton> (a deployment with a real switch)
16:34:48 <lazy_prince> Sukhdev_: now you can share your ppt with kevinbenton
16:35:31 * Sukhdev_ will let Ironic experts answer it for kevinbenton
16:35:52 <Sukhdev_> lazy_prince : will do
16:36:14 <kevinbenton> if we need to wire ports for bare metal servers, we will always need to provide the switch_info for where they are connected
16:36:53 <jroll> kevinbenton: nova creates the tenant port, does not wire it up yet. ironic creates the port on the provisioning network with switch info, that's where we do the deploy. then we drop the provisioning port after deploy and wire up the tenant port
16:37:40 <kevinbenton> jroll: but the provisioning port must also have switch_info
16:37:47 <baoli> just a quick thought, if a port is not bound, then the information shouldn't be populated in dnsmasq, right? So although the tenant network neutron port is initially created, it shouldn't be added into dnsmasq when it's not bound.
16:37:55 <kevinbenton> jroll: otherwise how will neutron put the port on the correct network?
16:38:05 <jroll> kevinbenton: the provisioning port will have switch info
16:38:28 <sambetts> baoli: dnsmasq is populated whether or not the port is bound
16:38:30 <kevinbenton> jroll: oh, i thought Sukhdev_ was telling me that it wouldn't. that was the confusion
16:39:02 <jroll> kevinbenton: the port nova creates is the tenant port, and will initially not have switch_info; that's provided by ironic after the deploy is done
16:39:04 <jroll> make sense?
16:39:06 <lazy_prince> sambetts: I guess, baoli is proposing to change that behaviour..
16:39:10 <baoli> sambetts, would making a change like that solve the problem?
16:39:26 <kevinbenton> jroll: but doesn't it need the network to deploy?
16:39:39 <sambetts> baoli: yes, but it wouldn't solve the problem that the networks aren't isolated
16:39:40 <jroll> kevinbenton: yeah, that's the provisioning port that ironic creates
16:39:56 * Sukhdev_ getting lost in multiple conversations
16:40:07 <kevinbenton> baoli: we can't make that kind of change (i'll come back to that in a second)
16:40:47 <kevinbenton> jroll: so what's the initial port used for that nova creates?
16:40:47 <baoli> kevinbenton, sure.
16:41:20 <jroll> kevinbenton: it's the tenant port, wired up after the deploy is done. the provisioning port only lives for the lifetime of the deploy
16:41:55 <kevinbenton> jroll: oooh, okay. it's ultimately the port you want to use
16:41:59 <jroll> yep
16:42:08 <kevinbenton> jroll: and you will update it with switch_info after everything else is done
16:42:14 <jroll> yep
16:42:19 <kevinbenton> makes sense
16:42:27 <kevinbenton> ok, back to baoli's suggestion
16:43:16 <kevinbenton> baoli: first issue is that creating unbound neutron ports is a way to make DHCP reservations for stuff not managed by openstack
16:43:32 <kevinbenton> baoli: e.g. completely unmanaged bare metal
16:43:50 <kevinbenton> baoli: or whatever someone might want neutron to give an address to on a provider network
16:44:08 <kevinbenton> baoli: so if we don't allow unbound ports to have dhcp reservations, we will break that
16:44:32 <kevinbenton> baoli: the second issue is that it means the DHCP agent has to understand the port binding process
16:44:47 <kevinbenton> baoli: which doesn't necessarily exist in all core plugins
16:46:10 <kevinbenton> baoli: we could conceivably make a change to the dhcp agent to not offer leases to ports with admin_state_up set to False
16:46:49 <baoli> kevinbenton: yeah, that's what I'm thinking. It's just a matter of when to put the lease in the config file to be offered.
16:47:32 <sambetts> still doesn't solve that the machine can access both broadcast domains
16:48:03 <Sukhdev_> baoli : we could end up changing the behavior in neutron for many plugins
16:48:08 <kevinbenton> right, i would be much happier if we can have this testing more of a 'real' end-to-end setup
16:48:25 <baoli> or I'd say that we may not create the tenant network neutron port before the deploy is done if that's possible?
16:48:30 <jroll> agree
16:48:32 <kevinbenton> so if we have this bare metal like support in OVS, it's just like how it will work with a mech driver
16:49:07 <sambetts> baoli: that would require reworking logic inside nova
16:50:11 * Sukhdev_ time check 10 min
16:50:54 <kevinbenton> baoli: so going down that road is trying to make neutron work in a way that violates the assumptions about neutron networks (separate broadcast domains) so it will be hard to justify to the wider community
16:51:15 <kevinbenton> Sukhdev_: ack. so what i will do is push up my WIP code at the end of the day
16:51:23 <Sukhdev_> Folks, in the interest of time - are we in agreement with the proposal
16:51:40 <sambetts> I am
16:51:43 <jroll> +1 from me
16:52:10 <Sukhdev_> and I had already given +2 to it before the meeting :-):-)
16:52:21 <lazy_prince> +1
16:52:56 <Sukhdev_> #action: Sukhdev to work with kevinbenton to get the spec and the patch worked up in neutron for the network flip logic
16:53:36 <Sukhdev_> Folks, this was the main agenda item on my mind to reach a conclusion on
16:53:50 <Sukhdev_> #topic: Open Discussion
16:53:56 <baoli> I agree to the approach. some questions may be clarified after seeing the code
16:54:19 <jroll> jfyi, I updated the nova spec and rebased nova patches. hoping to find some time to hack on them to get them working this week
16:54:20 <Sukhdev_> I am going to skip everything on agenda and open for discussion
16:54:43 <baoli> Sukhdev, I have a question on how the vPC config is injected into the tenant's image in your test.
16:54:43 <kevinbenton> (sorry to eat so much time)
16:54:49 <Sukhdev_> jroll : I answered some of the question on the review comments on the nova spec
16:55:12 <Sukhdev_> kevinbenton : that was so nice of you to agree to join us and provide this help
16:55:24 <jroll> Sukhdev_: well, I updated the spec to answer them better as it was a pretty poor spec :P
16:55:32 <jroll> kevinbenton: indeed, thank you!
16:55:47 <Sukhdev_> jroll : :-)
16:56:20 <Sukhdev_> baoli : I did not understand your question - I use the standard image
16:56:45 <baoli> Sukhdev: do you need to create a bonded interface after it's booted
16:56:55 <sambetts> Sukhdev_: are you using cloud-init to setup the bonded interfaces?
16:57:22 <Sukhdev_> baoli sambetts : oh - I have not started to test the bonded interfaces yet
16:57:35 <Sukhdev_> lazy_prince : have you done any such testing yet ?
16:58:08 <Sukhdev_> Folks BTW, item 7 on the etherpad (under issues) still needs closure - https://etherpad.openstack.org/p/ironic-neutron-mid-cycle
16:58:16 <jroll> there's work that needs to be done in nova to support vlans and bonds in configdrive/metadata
16:58:34 <lazy_prince> nope... not yet.. but I will be starting on this very soon... a bit busy for the time being..
16:58:48 <Sukhdev_> lazy_prince : same here :-)
16:58:52 <baoli> is the plan to use the Racker cloud-init that supports network.json?
16:59:12 <jroll> Sukhdev_: I hope to fix that patch for item 7 this week
16:59:31 <Sukhdev_> jroll : cool - thanks
16:59:31 <jroll> baoli: the plan is to get those patches in cloud-init
16:59:48 * Sukhdev_ time check 1 min
16:59:58 <baoli> jroll: when is that going to happen? and is there someone pushing for that to happen?
17:00:34 <Sukhdev_> Folks time is up - I have to chair another meeting on this next - so, we have to close this - sorry :-)
17:00:39 <jroll> baoli: well, nova needs the vlan/bonding support first
17:01:01 <Sukhdev_> Thanks for attending today's meeting - we had excellent discussion and closures...
17:01:04 <kevinbenton> later!
17:01:05 <baoli> jroll, ok, got it, if a volunteer is needed, I can start looking into that.
17:01:10 <lazy_prince> and neutron needs support for trunk ports too...
17:01:11 <Sukhdev_> #endmeeting
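
Editor's note: the switch_info lookup and network flip agreed on in the CI discussion can be pictured with a small sketch. This is not Neutron or Ironic code. The port dictionaries, the "chassis_id" and "port_name" keys, and both function names are assumptions made up for illustration. It only models the two points raised in the meeting: the agent looks up a port by the host and interface name recorded in switch_info, and only one of the two ports carries switch_info at any time.

```python
from typing import Optional


def find_port_for_interface(ports, host, interface_name) -> Optional[dict]:
    """Return the port whose switch_info names this host and interface.

    Hypothetical stand-in for a server-side lookup keyed on switch_info
    instead of the MAC address or port UUID.
    """
    matches = [
        p for p in ports
        if p.get("switch_info")
        and p["switch_info"].get("chassis_id") == host
        and p["switch_info"].get("port_name") == interface_name
    ]
    if len(matches) > 1:
        # The meeting assumed switch_info is set on one port at a time.
        raise ValueError("switch_info is set on more than one port")
    return matches[0] if matches else None


def flip_to_tenant(ports, provisioning_port, tenant_port):
    """Drop the provisioning port and hand its switch_info to the tenant port."""
    switch_info = provisioning_port["switch_info"]
    ports.remove(provisioning_port)
    tenant_port["switch_info"] = switch_info


# Both ports share a MAC, so a MAC-based lookup would be ambiguous.
tenant = {"id": "tenant-port", "mac": "52:54:00:00:00:01", "switch_info": None}
provisioning = {
    "id": "provisioning-port",
    "mac": "52:54:00:00:00:01",
    "switch_info": {"chassis_id": "compute-1", "port_name": "IRONICP1"},
}
ports = [tenant, provisioning]

during_deploy = find_port_for_interface(ports, "compute-1", "IRONICP1")
flip_to_tenant(ports, provisioning, tenant)
after_deploy = find_port_for_interface(ports, "compute-1", "IRONICP1")
print(during_deploy["id"], "->", after_deploy["id"])
```

Because the lookup key is the physical attachment point rather than the MAC, the unbound tenant port never matches while the deploy is in progress, which is the property the CI job needs.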