16:01:16 #startmeeting ironic_neutron
16:01:17 Meeting started Mon Oct 19 16:01:16 2015 UTC and is due to finish in 60 minutes. The chair is Sukhdev. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:18 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:20 cragusa: hi
16:01:21 The meeting name has been set to 'ironic_neutron'
16:01:43 morning
16:02:06 #topic: Agenda
16:02:09 #link: https://wiki.openstack.org/wiki/Meetings/Ironic-neutron#Meeting_October_19.2C_2015
16:02:19 #topic: Announcements
16:02:53 Next week is Summit - will see ya all there in Tokyo
16:03:00 o/
16:03:05 sorry a bit late..
16:03:15 No meeting next week - everybody will be in transit
16:03:19 lazy_prince: you are right on time
16:03:24 fine with me..
16:04:16 The week after Summit - some of us off as well traveling through Japan and others are travelling back home
16:04:36 So, I thought we cancel the meeting week after summit as well
16:04:46 any body has any objection to that?
16:05:31 * Sukhdev waiting
16:06:15 seems like there are no objections - so, next two weeks no meetings
16:06:20 we will resume on Nov 9th
16:06:29 Moving on with the agenda
16:06:40 #topic: Integration Status
16:07:21 I have had all kinds of problems after installing the latest set of patches
16:07:42 Last week was a big struggle to get a BM to boot
16:07:54 It turns out there is an issue with the Image -
16:08:32 jroll tried to help with the images, and I am chatting with lucas in the ironic channel with image related issues
16:08:56 the prebuilt images should always work, ditto for the coreos builder. those are both CI'd extensively.
16:09:44 jroll: the latest on Friday was that the node will get into the reservation lock as you mentioned
16:10:15 and the node will proceed further than wait-callback state
16:10:20 Sukhdev: no, that's an edge case that can happen when code is broken, not a normal thing
16:10:29 anyway, we can take that stuff offline
16:11:42 yes, we can take this after this meeting - need to understand how to make forward progress on that
16:12:27 Other than this, I do not have any additional progress report to present
16:13:04 Anybody has anything else to share?
16:13:35 Can we ask all driver owners to validate if this works for them..
16:14:05 or if some drivers are broken, we can get that addressed in nick of time..?
16:14:24 just a thought..
16:14:52 lazy_prince: do we know if any other driver is using this other than HP and Arista?
16:15:12 lazy_prince: ironic or neutron drivers?
16:15:24 I was more particular about ironic drivers..
16:15:36 Sukhdev: also of note, I just proposed the Nova spec for this: https://review.openstack.org/#/c/237067/
16:15:56 #link: https://review.openstack.org/#/c/237067/
16:16:16 jroll: thanks - I'll review it after the meeting
16:16:23 lazy_prince: we still don't have instructions for testing :(
16:16:36 nor docs for configuration
16:16:56 we'll need those before asking people to test it; I'd also like it for my own testing
16:17:39 jroll: cragusa has put out the initial version of the doc, we should help him beef it up
16:17:54 hmm.. I think we should take some time to complete those docs..
16:18:06 Sukhdev: when I saw that it was just how to add portgroups :(
16:18:21 I actually have addressed Sukhdev comments, but need to check it before pushing it
16:18:56 I can do that tomorrow, if that's ok
16:19:02 jroll: we should have everything in this spec - so, that this become one "go to" place for all the answers
16:19:23 I mean from the ironic side of the documentation
16:19:31 right
16:19:37 agree..
16:19:41 well, everything a deployer needs to know, step by step
16:20:32 correct - if you see my comments on cragusa's spec, you will notice that I tried to allude to that - but, am sure I missed many steps
16:20:57 so, if we all chip in, we can get this beefed up with information
16:21:14 Sukhdev: tomorrow I'll push the updated version, ok?
16:21:32 cragusa: sounds good
16:22:06 We should all review Nova spec which jroll just mentioned - make sure we are all good with that as well
16:22:25 we should get it ratified before the summit
16:22:35 it probably needs some updates, I spent a whole 15 minutes on it :P
16:22:59 :) will make sure to review it..
16:23:15 I will do it as well
16:23:27 me also
16:24:01 thanks
16:24:17 me too
16:24:58 So, I think we have covered most of the items on the agenda - anybody wants to discuss anything else?
16:25:25 I answered a few comments on a patch, but I need to ask something
16:25:31 I will need help from jroll off-line to get past the issue that I am faced with so that I can make forward progress
16:25:48 it seems we need port/groups uuids to be unique, but how should this be enforced?
16:25:54 yhvh: please do
16:26:15 why do we need them unique?
16:26:17 or rather, where
16:27:13 https://review.openstack.org/#/c/206232/26/ironic/db/sqlalchemy/alembic/versions/5ea1b0d310e_added_port_group_table_and_altered_ports.py
16:27:16 second comment
16:27:39 I also seem to remember a comment on another patch but haven't found yet, possibly neutron side?
16:27:47 oh, so we need the mac address to be unique, not the uuid
16:27:55 sorry
16:28:22 that's a great question
16:28:44 so for the portgroups, is that mac address supplied by the client? (does it need to be?)
16:28:52 I'm wondering if we can just generate those for the client
16:29:13 yes, provided by client
16:30:01 is there a reason that needs to be sent by the client?
16:30:08 or can we just generate them like neutron does
16:30:14 not that I know of
16:30:28 I guess that still doesn't help...
16:30:41 the only thing I can imagine is a query to check, but that gets racy
16:30:49 actually.
16:31:02 they don't need to be unique, right?
16:31:15 if I have aa:aa and bb:bb, and put them into a portgroup
16:31:22 I could have the portgroup mac be aa:aa
16:31:27 actually, they need to be . but i could be wrong..
16:31:29 because neutron will only ever see the group mac
16:31:53 now, if I made the portgroup mac cc:cc, and had a port with NO group as cc:cc, that could be problematic
16:31:55 combo should be unique
16:31:58 say some node have only ports but others could have portgroup..
16:32:07 yeah
16:32:30 so they could start stepping into each others mac ids..
16:32:53 right
16:32:58 less likely but practically possible that they could land up in same network too..
16:33:06 which could be problematic..
16:33:23 as terrible as it is, I'm inclined to think we should check the other table and let races happen if they happen
16:33:26 and document it well
16:33:31 I don't see a better way to do this
16:33:45 I saw a hack with foreignkeys, then a constraint on that?
16:33:54 just seemed a little ugly
16:34:05 or we say, only one of it can be active.. either port or portgroups..
16:34:13 yeesh, yeah I don't love that
16:34:25 yhvh: like an intermediate table with macs?
16:34:57 it isn't the worst idea, but I'm not a huge fan
16:35:20 like, portgroup has a relation to the address in port, and you can constrain on the values in foreignkey and local column
16:35:31 devananda: ^ any thoughts?
tl;dr what's the best way to enforce uniqueness of port.mac and portgroup.mac
16:35:58 hmm
16:36:27 deva used to build really fast databases in exchange for money, so I'd like to leverage his wisdom here
16:36:32 whether he appears here or not
16:36:49 well if not I can ml and cc
16:37:31 yeah, might be a good idea as others could chime in
16:37:41 he'll respond eventually in irc, though
16:38:18 so, we deal with this kind of stuff in the switches as well - but, generally, it is user controlled
16:38:29 I mean users configure it -
16:39:25 just a thought, does a port group mac use one of the macs from its members? and the member macs are hardware macs in this case?
16:39:34 by virtue that combo has to be unique, by definition, you can not screw it up
16:40:33 another thing i have is say how do we ensure that the mac ids are unique across VM and baremetal..? We could still end up with a Mac id on BM allocated to a VM
16:40:33 Sukhdev: a portgroup mac could still conflict with a port mac, if the port doesn't have a group
16:41:11 lazy_prince: the deployer configures the MAC prefix used by virtual ports, and they should be using a prefix that isn't assigned to a manufacturer
16:41:28 I am not expert on this, but, I can go check with our experts to query as to how do we deal with this
16:41:29 baoli: it may but is not required, and yes
16:44:14 a wild devananda appears in a cloud of smoke
16:44:19 jroll: ohhai!
16:44:22 lol
16:44:24 haaaai
16:44:32 :-)
16:44:45 devananda: how wild?
16:44:46 what do you mean "uniqueness of ..." ?
16:44:57 devananda: https://review.openstack.org/#/c/206232/26/ironic/db/sqlalchemy/alembic/versions/5ea1b0d310e_added_port_group_table_and_altered_ports.py
16:45:10 devananda: imagine a port with no group with MAC aa:aa, and a portgroup with MAC aa:aa
16:45:20 boom.
16:45:26 heh
16:45:29 so, unique across tables
16:45:32 yah
16:45:44 my best shot at it is an intermediate table with the mac addresses
16:45:49 can be done.
but not in a reasonable or supportable-across-databases kind of way
16:45:59 jroll: how do you ensure that table is up to date?
16:46:01 or eat the race conditions and validate in code
16:46:12 eh?
16:46:31 port: id | mac_id portgroup: id | mac_id macs: id | mac_addr
16:46:38 mac_id fields are fk
16:46:46 macs.mac_addr is unique constraint
16:46:57 triggers, FKs
16:46:59 yah
16:47:16 jroll: but how do you know that there isn't a port AND a portgroup with the same MAC ?
16:47:26 FK on those two table won't ensure that
16:47:28 devananda: oh god
16:47:30 yeah
16:47:32 lol
16:47:41 idk if this is solvable in 12 minutes, but some help would be cool :)
16:47:54 right. I'll read the review. lets not try to solve it now
16:47:54 16:46:01 jroll | or eat the race conditions and validate in code
16:48:01 ^ I don't love this idea but it will work :P
16:48:17 it'll lead to a poor UX
16:48:29 right
16:48:40 question
16:48:48 where does the MAC for a portgroup come from?
16:48:49 another option is don't allow clients to supply portgroup.mac, but rather generate it
16:48:54 heh
16:48:59 reccomend prefixes that definitely aren't HW
16:49:02 devananda: client supplied
16:49:05 recommend*
16:49:16 can a switch require a specific MAC for a portgroup?
16:49:19 or is it totally arbitrary?
16:49:28 I have no clue
16:49:33 it seems arbitrary
16:49:38 what is the requirement within the network environment itself for the MAC address' uniqueness?
16:49:44 generally you tell the switch what the MAC is for the MLAG or whatever
16:49:54 eg, what happens if there are two switches and each has a portgroup with the same mac ?
16:49:55 seems bad
16:49:59 and the last question will depend on the deployment, I suspect
16:50:04 and seems like a problem that network folks have already solved
16:50:19 in our v2 deployment I think it would work as there's L2 -> VXLAN translation in the switch
16:50:25 Sukhdev: ^ ?
16:50:40 This is a well understood and solved problem by all switch vendors - I can go ask our experts how we deal with this and share with the team next meeting
16:50:43 (I think, I don't actually know what the group mac would do there)
16:50:53 jroll: ok. so portgroup must be unique within the VXLAN
16:50:55 Sukhdev: "next meeting" is three weeks from now :P
16:51:06 still seems like "just let the client set it" is not a great approach
16:51:09 devananda: yeah, I think so
16:51:11 right
16:51:14 especially if they don't need to
16:51:17 jroll: Oops - I can post on the review - as a comment
16:51:24 thanks
16:51:37 ironic or neutron API could accept a MAC then have it rejected by the switch because oops-not-unique-in-the-VXLAN
16:51:56 I think neutron might reject it at that point, but that's still too late
16:52:33 devananda: actually, in my Ironic testing, I often see error message saying mac address is already in use
16:52:42 Sukhdev: o.0
16:53:07 this happens if Ironic driver does not clean neutron ports and tries to reuse upon a new launch of VM
16:53:14 yeah
16:53:20 that's a nova->neutron thing
16:53:28 * jroll has been there
16:53:43 ugh
16:53:51 yup - so, it is checked somewhere
16:54:02 well, feel free to move on. i've got enough now to think about the db side of this, but won't have an answer right away
16:54:04 but "deploy time" is too late, regardless.
16:54:24 yeah, I need to bounce as well and stretch my legs before next meeting
16:54:55 good meeting, thanks Sukhdev
16:55:15 Thanks folks this was a great discussion
16:55:23 yes thanks all
16:55:25 We will see each other in Tokyo next week
16:55:32 thanks
16:55:45 bye
16:55:52 #endmeeting
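
Editor's note: the intermediate-table idea floated at 16:46:31 (port and portgroup each holding a mac_id foreign key into a macs table with a unique mac_addr) can be sketched as below. This is only an illustration under stated assumptions, not the schema from the review: it uses Python's built-in sqlite3 with an in-memory database, the table and column names are copied from the chat, and the helper names make_db and try_insert are hypothetical. It tries to show both halves of the exchange: the unique constraint stops a second row for the same address, yet nothing stops a port and a portgroup from pointing at the same macs row, which appears to be the gap raised at 16:47:16.

```python
import sqlite3

# Schema as described in the log: port.mac_id and portgroup.mac_id are
# foreign keys into macs, and macs.mac_addr carries a unique constraint.
SCHEMA = """
CREATE TABLE macs (
    id INTEGER PRIMARY KEY,
    mac_addr TEXT NOT NULL UNIQUE
);
CREATE TABLE port (
    id INTEGER PRIMARY KEY,
    mac_id INTEGER NOT NULL REFERENCES macs(id)
);
CREATE TABLE portgroup (
    id INTEGER PRIMARY KEY,
    mac_id INTEGER NOT NULL REFERENCES macs(id)
);
"""


def make_db():
    """Create an in-memory database with foreign key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, sql, params):
    """Return True if the insert succeeds, False on a constraint violation."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_db()

# The first row for aa:aa is accepted; a second row with the same address
# is rejected by the unique constraint on macs.mac_addr.
first = try_insert(conn, "INSERT INTO macs (id, mac_addr) VALUES (?, ?)", (1, "aa:aa"))
dup = try_insert(conn, "INSERT INTO macs (id, mac_addr) VALUES (?, ?)", (2, "aa:aa"))

# A port and a portgroup can both reference macs row 1, so the foreign keys
# alone do not prevent the two tables from sharing one MAC.
port_ok = try_insert(conn, "INSERT INTO port (mac_id) VALUES (?)", (1,))
group_ok = try_insert(conn, "INSERT INTO portgroup (mac_id) VALUES (?)", (1,))

print(first, dup, port_ok, group_ok)
```

Under these assumptions the last two inserts both succeed, so closing the gap would need something beyond the plain foreign keys, such as the triggers mentioned at 16:46:57 or validation in code as suggested at 16:46:01.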