22:03:13 #startmeeting neutron_drivers
22:03:13 Meeting started Thu Sep 22 22:03:13 2016 UTC and is due to finish in 60 minutes. The chair is armax. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:03:14 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:03:16 The meeting name has been set to 'neutron_drivers'
22:03:24 worst comes to worst we can make this meeting super brief
22:03:25 o/
22:03:31 I wanted to discuss the RC2 backlog
22:03:36 hi
22:03:38 it looks like we’re pretty close to squashing it
22:03:42 amotoki: hi!
22:03:58 you guys ready?
22:04:03 #link https://launchpad.net/neutron/+milestone/newton-rc2
22:04:07 YES
22:04:10 YES?
22:04:11 good
22:04:19 HenryG: are you sure?
22:04:24 NO
22:04:32 HI
22:04:40 NO SHOUTING PLEASE
22:04:47 I don't see Inessa's port calculation fix in. was it deferred to O?
22:05:00 ihrachys: you’re jumping the gun
22:05:04 ihrachys: go back to your place
22:05:17 ihrachys: I ask politely
22:05:20 mind you
22:05:32 I see!
22:05:35 you see?
22:05:36 good
22:05:47 just making sure we all know who rules here
22:05:55 and that’d be me
22:06:10 ok
22:06:14 jokes aside
22:06:33 let’s dive in
22:06:44 the provisional deadline for RC2 is Sep 29
22:06:58 so ihrachys and I will aim to have everything in tip top shape by the 27
22:07:05 so 27, we’ll cut RC2
22:07:17 that gives us 48 hours for the clusterfudge of the last minute
22:07:28 fair?
22:07:38 ihrachys: what do you reckon?
22:07:50 since you’re the release manager, I am asking you to stand up now
22:08:04 sounds fair to me
22:08:08 + to what the gentleman said
22:08:15 ihrachys: thank you sir
22:08:18 so
22:08:33 besides, I am on PTO from Sep 28 :)
22:08:49 so you don't show up on Mon?
22:08:55 28 is Wed
22:09:16 pff, oh well, sorry
22:09:17 ok, now with that in mind
22:09:25 ihrachys: it’s ok we know it’s late in your neck of the woods
22:09:41 let’s go from the lowest priority to the highest
22:10:03 bug 1537091
22:10:05 bug 1537091 in neutron "Prevent the attachment of a subnet to a router" [Wishlist,In progress] https://launchpad.net/bugs/1537091 - Assigned to Mathieu Rohon (mathieu-rohon)
22:10:12 it’s a low-hanging-fruit ish
22:10:34 let’s vote to agree on whether we want to work/review to have it in
22:10:52 I think the patch churned enough times and it’s ready-ish
22:10:56 so I am +1
22:11:24 +1 to work to allow it in RC2
22:11:43 if we review it extensively and we don’t like it so be it
22:11:45 +1
22:11:48 ok
22:11:50 +1 from me too. I just commented on variable naming in the review, but it looks good in general.
22:11:51 HenryG: ?
22:11:57 ihrachys: ?
22:12:10 I haven't reviewed the patch; I am also not sure why it's high priority on bgpvpn side (it seems like a lack of validation which is not critical in my worldview). but I don't mind either.
22:12:18 ok
22:12:25 let’s keep it targeted
22:12:29 next
22:12:30 bug 1625981
22:12:31 bug 1625981 in neutron "update response of ML2 doesn't include bumped revision number" [Medium,In progress] https://launchpad.net/bugs/1625981 - Assigned to Kevin Benton (kevinbenton)
22:12:40 I think kevinbenton wants this
22:12:54 and assuming he doesn’t screw up again, I am ok with helping him
22:13:07 +1. it makes the dhcp agent unable to distinguish between some stale port updates
22:13:09 :)
22:13:11 * armax loves kevinbenton
22:13:28 I reviewed the patches already
22:13:31 they’re ready-ish
22:13:50 HenryG, amotoki, ihrachys ?
22:13:51 +1 on that one, I will review tomorrow.
22:13:56 cool
22:14:06 +1 from me. i am reviewing it.
22:14:09 sweet
22:14:11 next one
22:14:13 bug 1623953
22:14:14 bug 1623953 in neutron "Updating firewall rule that is associated with a policy causes KeyError" [Medium,In progress] https://launchpad.net/bugs/1623953 - Assigned to Sridar Kandaswamy (skandasw)
22:14:20 njohnston: that’s in your realm
22:14:37 Yes, it's an edge case that SridarK found in testing
22:14:40 njohnston: I am happy to support you
22:15:00 amotoki, kevinbenton, HenryG, ihrachys ?
22:15:14 this is in fwaas
22:15:15 +1
22:15:17 yes
22:15:18 the fix is simple, and SridarK already has a patch up that includes a test
22:15:28 njohnston: ok, make sure the fix lands in master
22:15:29 +1, though who cares about fwaasv2 :P
22:15:32 and we’ll take care of the backport
22:15:46 I swear I'll read the doc this time! :-)
22:16:00 njohnston: just press the button
22:16:01 :)
22:16:01 I am not sure of the impact. njohnston any idea on priority?
22:16:12 this early in the release it’s probably the only time when it works
22:16:30 amotoki: looks like an edge case, but worth nailing down nonetheless
22:16:44 amotoki: SridarK didn't mention how major the edge case was that he found, so I have only his rating of 'medium' to go on
22:16:44 besides the fwaas gate is bleeding fast
22:16:55 +1 to try to have this in.
22:16:56 armax: bleeding?
22:16:57 because they don’t test anything
22:17:00 cough cough
22:17:26 njohnston: I mean that the fwaas gate is not as cumbersome and job intensive as the neutron one
22:17:43 armax: Ah, I see. :)
22:17:44 so approving/merging code is usually quick
22:17:53 besides not being in the integrated gate
22:18:02 patches are tested in isolation
22:18:14 and thus free of gate resets induced by other projects
22:18:14 Is the fwaas functional job running yet? Do we know if the models and migrations are in sync?
22:18:15 anyhow
22:18:19 HenryG: not yet
22:18:24 I don’t think
22:18:49 HenryG: Not yet https://review.openstack.org/#/c/359320/
22:18:51 but that’s a good point, if someone were to look into that and found niggly bits
22:19:17 I would not mind nailing issues down
22:19:29 HenryG: didn’t you reconcile the models a while back?
22:19:35 I don’t recall migrations going in recently
22:19:36 I tried
22:19:40 but failed?
22:19:49 I don't remember
22:19:50 I have run the model migration test manually a few times in the past week, it worked for me
22:20:01 cool
22:20:02 HenryG: oh boy
22:20:14 * njohnston will put "it worked for me" on his tombstone
22:20:17 HenryG: it must have been a failure then
22:20:27 HenryG: those are the events that the human mind tends to discard
22:20:30 anyhoo
22:20:33 let’s move on
22:20:33 I recall I got them working but the job was not up
22:20:40 bug 1623708
22:20:42 bug 1623708 in neutron "OVS trunk management does not tolerate agent failures" [Medium,In progress] https://launchpad.net/bugs/1623708 - Assigned to Armando Migliaccio (armando-migliaccio)
22:20:47 this one has two changes
22:20:52 one from me and one from rossella_
22:20:58 rossella_’s is nearly ready
22:21:18 mine I just posted it, I need to clean it up a little but I will be done in the next hour or so
22:21:33 then I need kuba and some other OVS guru to review it
22:21:42 obviously I’d love it in
22:21:55 amotoki, ihrachys, kevinbenton, HenryG?
22:22:03 definitely + on that one
22:22:07 no objection from me
22:22:09 +1
22:22:10 no objection
22:22:14 cool
22:22:22 I would also like this one
22:22:23 https://review.openstack.org/#/c/374388/
22:22:30 but I have to talk some sense into kevinbenton first
22:22:52 if that doesn’t make it though it’s not the end of the world
22:23:12 I’ll blame kevinbenton in case people want this
22:23:24 so,
22:23:40 next two issues get a little thorny
22:23:48 bug 1619253
22:23:49 bug 1619253 in neutron "Subnet update bumps revision_number for network but does not notify about the change on RPC wire" [Medium,In progress] https://launchpad.net/bugs/1619253 - Assigned to Darren Shaw (dronshaw)
22:23:57 this has a proposed fix but it’s broken
22:24:00 I pinged the author
22:24:03 no response so far
22:24:08 kevinbenton: seems important
22:24:16 no, this can be deferred
22:24:16 kevinbenton: do you want to spread your magic dust?
22:24:19 why is it rc2?
22:24:23 ok
22:24:37 + to defer, not critical and potentially scary
22:24:42 worth a back-port when it does get fixed (assuming not too complex)
22:24:42 so we want to defer it a priori?
22:24:46 ok
22:24:52 amotoki, HenryG you concur?
22:25:19 I don't understand the implications well enough
22:25:22 ok
22:25:27 that’s enough for me to take it out of RC2
22:25:29 I haven't understood the whole picture
22:25:31 it will only affect things in the future if we depend on using revision numbers to detect out of sync things
22:25:36 your comments say it all
22:25:38 done
22:25:47 ok
22:25:49 next one
22:25:50 bug 1622616
22:25:51 bug 1622616 in neutron "delete_subnet update_port appears racey with ipam" [High,In progress] https://launchpad.net/bugs/1622616
22:25:56 this one is well known I guess
22:26:04 we had a couple of stop gaps in place
22:26:16 it looks like the gate is back to behaving better
22:26:29 but we’re still prone to potential issues, afaik
22:26:29 i think this can be deferred at this point
22:26:39 Root cause not yet pin-pointed?
22:26:43 I did put a patch up
22:26:50 root cause is concurrent port updates
22:26:53 kevinbenton and I can try and whip it into shape
22:26:54 HenryG: I think we understand the cause
22:26:56 to the same port
22:26:58 if it looks good we could have it in
22:27:01 if not so be it
22:27:10 I am talking about this one
22:27:10 https://review.openstack.org/#/c/373536/
22:27:19 armax: post in the bug comments?
22:27:24 ihrachys: I will
22:27:26 which is pretty rare. we just had a bug with DHCP agent racing to update its port with the port update in delete_subnet
22:27:47 that patch has a -2 on it
22:27:53 kevinbenton: duh!
22:27:58 :)
22:28:07 can’t merge like that, now can it?
22:28:27 you and I need to whip it into shape remember
22:28:28 ?
22:28:29 also seems like catch is misplaced in that WIP?
22:28:57 it's in delete_subnet() while it should go into update_port()?
22:28:58 ihrachys: yeah, I literally didn’t give it more than 5 minutes worth of thought but I swear I want to get back to it
22:29:08 ok :)
22:29:17 I’ll work on it
22:29:27 and we can make the judgement call later
22:29:32 if we defer, not the end of the world
22:29:40 ok?
22:29:43 it would be nice to have it in if it's ready. I prefer we keep it in the list.
22:29:47 rally subnet tests are passing now even with high load
22:30:00 cool
22:30:01 ok
22:30:02 which is what uncovered this quite a bit before
22:30:05 we squashed this list
22:30:09 let’s move on to another list
22:32:23 #link
22:32:24 https://bugs.launchpad.net/neutron/+bugs?field.searchtext=&orderby=-importance&field.status%3Alist=NEW&field.status%3Alist=CONFIRMED&field.status%3Alist=TRIAGED&field.status%3Alist=INPROGRESS&field.status%3Alist=FIXCOMMITTED&field.status%3Alist=INCOMPLETE_WITH_RESPONSE&field.status%3Alist=INCOMPLETE_WITHOUT_RESPONSE&assignee_option=any&field.assignee=&field.bug_reporter=&field.bug_commenter=&field.subscriber=&field.structural_subscriber=&field.milestone%3Alist=79508&field.tag=newton-rc-potential+&field.tags_combinator=ANY&field.has_cve.used=&field.omit_dupes.used=&field.omit_dupes=on&field.affects_me.used=&field.has_patch.used=&field.has_branches.used=&field.has_branches=on&field.has_no_branches.used=&field.has_no_branches=on&field.has_blueprints.used=&field.has_blueprints=on&field.has_no_blueprints.used=&field.has_no_blueprints=on&search=Search
22:32:29 don’t get scared
22:32:31 it’s just 3 bugs
22:32:36 bug 1622002
22:32:38 bug 1622002 in neutron "dhcp_release6 can be called when it is not present" [High,In progress] https://launchpad.net/bugs/1622002 - Assigned to Brian Haley (brian-haley)
22:32:47 this seems thorny
22:33:13 I don’t think it’s doable
22:33:15 in time
22:33:18 +1 for RC2
22:33:46 what other things?
22:33:46 armax: define 'it'
22:33:56 ihrachys: it’s doable to have it in RC@
22:33:58 RC2
22:34:10 armax: depends on what we WANT to see in RC2 for that.
22:34:22 btw we need to nail the grenade patch if it didn’t already
22:34:30 ihrachys: what’s your suggestion?
22:34:44 I may argue it's mostly fine as-is right now
22:35:05 it fails with a traceback, yes, that's expected when you have your setup broken.
22:35:31 right, on that basis and assuming we are particularly careful to release note this
22:35:37 I think we should be ok if this doesn’t land
22:35:45 maybe graceful catch of the error is ok, but definitely I discourage going the path of api response
22:36:12 agree with ihar
22:36:17 ihrachys: agreed, it’s overkill for this type of scenario
22:36:26 the user should install the damn right tool and move on
22:36:40 if he/she wants IPv6
22:36:51 I mean dhcp v6 stateful
22:37:01 ok, so let’s keep it targeted for O-1?
22:37:11 i.e. out of RC2?
22:37:19 yeah, we can backport refining of error handling later
22:37:22 ok
22:37:32 next one is bug 1625221
22:37:34 bug 1625221 in neutron "Fullstack looses test workers if eventlet's Timeout is raised" [High,In progress] https://launchpad.net/bugs/1625221 - Assigned to Ihar Hrachyshka (ihar-hrachyshka)
22:37:34 +1
22:37:34 are we sure this doesn't break?
22:37:50 kevinbenton: you mean ^?
22:37:56 the bug I just referenced?
22:38:21 no, the dhcp release uncaught exception
22:38:51 break how?
22:39:38 this is going to prevent normal reload of allocations for a network
22:40:00 if '_release_unused_leases' leaks an exception
22:40:10 you mean the patch proposed or the existing code?
22:40:20 the existing code
22:40:43 ok, let’s take this offline and see if there are loose ends
22:40:48 ok
22:40:50 but let’s agree not to do any more
22:40:52 than that
22:40:59 ok?
22:41:23 yeah, we need to contain the failure for RC2 IMO
22:41:27 logs can be ugly
22:41:30 ok moving on
22:41:31 but it shouldn't interfere
22:41:33 to bug 1625221
22:41:34 bug 1625221 in neutron "Fullstack looses test workers if eventlet's Timeout is raised" [High,In progress] https://launchpad.net/bugs/1625221 - Assigned to Ihar Hrachyshka (ihar-hrachyshka)
22:41:42 ihrachys: what do you feel?
22:41:52 this needs more time?
22:42:00 that one... fullstack is non-voting, also it does not break gate in any way, so...
22:42:11 it just makes some tests not executed if another failed already
22:42:14 ok
22:42:19 I don't think it's rc2 critical
22:42:23 but it's nice to have
22:42:27 the fix is invasive though
22:42:33 so maybe better to wait
22:42:47 it feels like inception
22:42:48 ok, let’s keep it on the backburner for now
22:43:08 last one of the O-1+rc-potential pile
22:43:10 is bug 1611991
22:43:12 bug 1611991 in neutron "[ovs firewall] Port masking adds wrong masks in several cases." [High,In progress] https://launchpad.net/bugs/1611991 - Assigned to Inessa Vasilevskaya (ivasilevskaya)
22:43:27 we have this affecting mitaka as well as newton
22:43:42 amuller initially suggested to defer
22:43:54 but I don’t know if sending a defer signal is the right thing to do
22:44:10 we want people to gain confidence and adopt the driver
22:44:18 especially when it’s the only driver that works with trunks
22:44:35 at the same time I wouldn’t want the existing functionality to regress
22:44:39 so I am a bit on the fence here
22:44:41 this should be RC2
22:44:49 the existing functionality is busted :)
22:44:54 only in some cases
22:44:55 not much to regress
22:44:56 not all cases
22:45:03 depends where the wind blows
22:45:13 we have tests for the stuff though
22:45:19 we need to merge this!
22:45:22 !!!!~!~~!~!
22:45:23 kevinbenton: Error: "!!!~!~~!~!" is not a valid command.
22:45:23 if you want it, commit to it!
22:45:49 what do other people think/
22:45:50 ?
22:46:03 what happened with the idea of reusing Jakub's algorithm to validate the better one?
22:46:16 no-one is acting on it
22:46:22 i can act on it!
22:46:26 kevinbenton: oh boy
22:46:29 armax did just tell me to commit to it
22:46:32 I smell disaster
22:46:33 git commit
22:46:36 ok
22:46:55 kevinbenton: do you need a nudge? I give you a nudge!
22:47:05 kevinbenton: are you gonna commit with diversity in mind?
22:47:19 ok, then I hear consensus of having this RC2?
22:47:26 rc2 potential looks better.
22:47:27 + for rc2
22:47:33 ok
22:47:38 I’ll add this in a bit
22:47:47 let’s move on to the next and final list
22:48:02 and that’s the bugs that are marked potential
22:48:10 but not vetted yet and hence have no milestone
22:48:21 bug 1624079
22:48:22 bug 1624079 in neutron "KeyError on "subnet_dhcp_ip = subnet_to_interface_ip[subnet.id]"" [High,Confirmed] https://launchpad.net/bugs/1624079
22:48:42 this might be another of kevinbenton’s screw ups
22:48:52 but it needs triaging
22:48:59 anyone keen on it?
22:49:06 kevinbenton cough kevinbenton cough?
22:49:33 yeah, i can look at the cause of this
22:49:38 looks like nobody is quite sure yet
22:49:51 ok
22:49:55 let’s reassess offline
22:50:17 but eyes on it would be good
22:50:28 so if people could help look into it
22:50:31 that’d be good
22:50:33 next one
22:50:35 bug 1625305
22:50:36 bug 1625305 in neutron "neutron-openvswitch-agent is crashing due to KeyError in _restore_local_vlan_map()" [High,New] https://launchpad.net/bugs/1625305
22:50:43 this is somewhat troubling
22:50:48 but we don’t have enough to go by
22:51:06 it does sound scare
22:51:09 *scary
22:51:13 I have deja-vu
22:51:24 we had something similar back in mitaka rc* times :)
22:51:32 right
22:51:41 but it doesn’t seem the fix worked for the guy
22:52:48 i'm still collecting info
22:52:51 ok
22:52:57 to get a keyerror it seems the ports have to have different net uuids
22:53:14 I don’t understand enough of this code to judge
22:53:57 does anyone have an opinion?
22:54:10 no, I would need to read the code; I will tomorrow.
22:54:13 ok
22:54:14 thanks
22:54:22 last one of this pile
22:54:25 bug 1625305
22:54:26 bug 1625305 in neutron "neutron-openvswitch-agent is crashing due to KeyError in _restore_local_vlan_map()" [High,New] https://launchpad.net/bugs/1625305
22:54:33 oops
22:54:38 bug 1626010
22:54:39 bug 1626010 in neutron "Connectivity problem on trunk parent with MAC reuse and openvswitch firewall driver" [High,New] https://launchpad.net/bugs/1626010 - Assigned to Jakub Libosvar (libosvar)
22:54:47 this probably is going to stay untargeted for now
22:55:37 because we need to understand a bit more how ovs-fw can handle the same mac on different networks
22:56:08 dougwig: ping
22:56:41 ok, I have nothing else
22:56:50 for now
22:57:28 we have another couple of days to squash these
22:58:02 thanks everyone for watching stable backports and current fixes
22:58:14 any last minute comment?
22:58:18 if not
22:58:54 looks like not
22:59:00 #endmeeting