22:04:17 #startmeeting neutron_drivers
22:04:18 Meeting started Thu Dec 8 22:04:17 2016 UTC and is due to finish in 60 minutes. The chair is armax. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:04:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:04:21 The meeting name has been set to 'neutron_drivers'
22:04:36 pcarver: did you check that patch I sent you?
22:04:51 pcarver: if you still have outstanding questions I am happy to answer
22:04:54 but let’s go in order
22:05:00 so no formal agenda
22:05:05 armax: I gave it a quick look. I need to spend more time on it.
22:05:21 pcarver: ok, please have a deeper look
22:05:30 and then come back if your questions are still unanswered :)
22:05:35 we’ll be here
22:06:06 pcarver: needless to say I’ll review the patches you asked about between today and tomorrow
22:06:22 kevinbenton, ihrachys regarding package version minimums
22:06:56 ihrachys has https://review.openstack.org/#/c/402004/ and https://review.openstack.org/#/c/402003/
22:06:58 up for review
22:07:54 ihrachys wanna provide a quick summary?
22:08:02 ok, sec
22:08:32 so, first, since we don't gate on anything but xenial, we risk landing breaking patches for other platforms
22:08:40 specifically, centos/rhel
22:08:45 on another note I’d like to talk about kevinbenton’s attempt to switch from lazy to subqueries
22:09:01 that happened several times in the last months
22:09:22 ihrachys: you may want to add sles to that list :P
22:09:44 armax: no idea if that was ever broken, otherwise I am fine :)
22:10:05 so far the solution was either quick-revert, or patching code with some conditionals that would not use some 'newer' options from e.g. dnsmasq on older platforms
22:10:10 ihrachys: I mean, that’s yet another distro we may want to care about soon enough
22:10:29 reverts are not nice because it often means a bug re-exposed; while conditionals suck and add complexity
22:10:31 ihrachys: ok, as we stand we reverted and didn’t land any workaround
22:10:44 yep, and that sucks
22:10:47 agreed
22:11:33 I think we as upstream team need to agree on how to approach ‘bleeding edge’ development
22:11:33 so it would still be nice to have some way to express minimal versions that we really require (even if maybe quite conservative)
22:11:45 in other words
22:12:32 I am not sure if dnsmasq 2.67+ is 'bleeding'... ;)
22:12:37 when and how we bring new stuff that may affect/be affected by runtime deps
22:12:43 ihrachys: you know what I mean
22:13:19 ihrachys: scabbing edge?
22:13:33 I recall situations involving ipset, dnsmasq and ovs that affected what goes in our codebase
22:13:54 but first and foremost I would like to ask this question to you guys and see what you think
22:14:53 I think it makes sense to make an effort to support popular platforms
22:15:01 when Neutron ships on X (X being Ocata, Newton, etc) do we assume that Neutron supports platform A version R+, B version S+, C version T+
22:15:05 or the other way around?
22:15:31 like platform A version R+ supports Neutron X, X-1, X-2?
22:16:02 because the answer really changes how we go about this
22:16:16 since platforms are traditionally a lot slower, I think it's neutron supporting them, not vice versa. platforms can't adapt quickly enough for our style.
22:16:16 i sort of assume we can run on newer distros
22:16:43 kevinbenton: but it seems we must do so in a bw-compat way
22:16:52 now, EL is very special in its slowness; but even Xenial will stay on whatever they picked for the next years.
22:16:55 armax: other way around
22:17:01 but we have no clear mechanism to identify when to get rid of the plumbing code
22:17:16 armax: i assume the newer platforms have backwards compatibility in mind for the things we use
22:17:29 armax: but i suppose xenial being painful proved that wrong...
22:17:30 kevinbenton: not necessarily
22:17:38 because the 2.67 issue we hit demonstrated that
22:17:48 armax: how so?
22:17:59 did 2.67 break something that used to work?
22:18:00 apparently that was supposed to be bw-compat but we can’t really tell because 2.67 doesn’t mean anything as it’s not honored
22:18:13 armax: well that's probably because we don't have a clear protocol behind introducing new deps/bumping versions. I would think that e.g. carrying a conditional for a cycle (with a clear statement that it's going away in next release) is bearable.
22:18:18 no, I mean Neutron added something that was supposed to work on 2.67
22:18:42 kevinbenton: we started using --rr-param from dnsmasq 2.67+ and it turned out centos is on 2.66
22:18:43 but actually that’s not quite what happened
22:19:01 right
22:19:16 it was fortuitous
22:19:19 right, but that's the opposite of what i'm talking about
22:19:19 kevinbenton: and yes, we had already indicated with a sanity check that the minimal version is 2.67
22:20:01 i was saying i assume that xenial for example should work with ocata/newton/mitaka/liberty
22:20:09 ok, let me step back, I think there are two issues here:
22:20:20 a) in neutron we cannot really trust package versions
22:20:44 why not? i thought centos was on 2.66
22:21:03 kevinbenton: yeah, but 2.66 + ~20 patches from 2.67+, including features
22:21:12 b) when we introduce features we cannot do so by assuming the required dependency can be easily fulfilled
22:21:32 ihrachys, kevinbenton do you at least agree with the two issues?
22:22:12 armax: i'm not sure where we cut off
22:22:19 kevinbenton: elaborate
22:22:21 pls
22:22:28 armax: do we support centos 4?
22:22:50 yeah. though I believe that we should have some way to finally trust/assume some minimal versions and stop caring about anything lower.
22:22:56 well we need to come up with a reasonable answer we all agree on
22:22:57 armax: are we going to maintain a list of blessed distributions
22:23:06 kevinbenton: latest EL, latest SLES, latest ubuntu LTS
22:23:20 perhaps we support until the distro/version that worked on the latest supported Neutron upstream version
22:24:02 kevinbenton: I think there was already some global openstack discussion before where latest ubuntu and latest EL were identified as 'supported'. I would need to dig through the latest TC meeting logs to find links.
22:24:32 ihrachys: ack. i would feel comfortable doing that if other projects or the TC have gone down this path
22:24:38 armax: elaborate on the last suggestion.
22:24:56 armax: do you suggest we can't bump minimal distro dep until old stable is EOL?
22:24:57 ihrachys: certainly
22:25:27 ihrachys: first of all I am thinking we should make sure reviews like the one that triggered this discussion do not fall through the cracks
22:25:55 armax: fall as in land or as in reverted and not re-proposed?
22:25:56 so for now I’d suggest that anything that touches/needs runtime changes be identified for discussion/revision
22:26:02 at the team meeting or drivers meeting
22:26:14 ihrachys: correct
22:26:21 ihrachys: ideally we catch these premerge
22:26:30 ihrachys: so xenial is obviously tested, is it time to add others since we don't know we've broken on them sometimes for months (did that make sense?)
22:27:06 haleyb: adding others as in adding more jobs is not something we can quite do effectively if we were the only ones doing it
22:27:17 haleyb: hence it needs to be a wider community effort
22:27:30 right, we can't be the only ones in this boat
22:27:35 ihrachys: but let me elaborate on the point that ihrachys was asking about
22:27:54 say for instance that Mitaka was tested on Trusty
22:28:03 and we know it works
22:28:23 we can’t add anything to neutron that won’t work on Trusty until Mitaka is dropped
22:28:39 which is quite tricky to do
22:29:11 because this would mean that if we wanted CI for this we’d have to test Newton and Ocata on Trusty and that’s opposite to what infra is asking us to do
22:29:25 armax: even in Newton+? why can't we keep Mitaka on Trusty but Newton+ on Xenial? I don't see how you would achieve what you propose without gating.
22:29:46 meh, I don't think we should go that far. it makes our lives even more complex.
22:30:16 maybe the better answer is we can't backport something to stable/mitaka that doesn't work on trusty, which i think is what was just said ^^
22:30:50 I think it's fine to land anything Xenial-ish in Newton+; if we need to backport a patch, we will catch an issue with it in Mitaka if it's incompatible with Trusty, and we'll deal with it then
22:31:23 but that would mean backporting features too
22:31:41 we typically don't backport breakages like this since we don't backport features
22:31:41 armax: not necessarily; --rr-param thing was a bug fix
22:31:45 unless we’d attempt the backport just to try
22:31:59 and validate a dependency
22:32:35 (ok, rr-param was Wishlist; but e.g. dhcp_release6 was definitely a bug fix)
22:33:55 ihrachys: so I hear that we can only handle this on a case by case basis to keep our sanity?
22:34:35 perhaps we start adding experimental jobs that test/validate neutron on some distros that work in the gate
22:34:44 armax: I suggested that minimal deps are per-branch (which reflects infra way)
22:35:18 and by a combination of early catching patches and proactive review we can identify whether we have to put conditionals in place or not
22:35:34 as for exact problem of not being able to depend on dnsmasq 2.67+ version, I think a nice solution would be to have a clear protocol that allows us to track (and bump) minimal deps in tree
22:35:49 such conditionals should not be version-based for the reasons that ihrachys identified on that patch I linked above
22:36:08 and we would just make sure that if e.g. a new feature needs version X, then we bump up to X in next cycle only, and in current cycle, we add some conditional boilerplate
22:36:15 so that platforms have a chance to react
22:36:57 armax: I think at some point, it's ok to state minimal version and relax; just not at the exact same moment when we decide to land a patch using a new version
22:37:12 ihrachys: but I don’t see the point
22:37:12 just give some time for platforms to catch up; if they don't, let them deal with breakage
22:37:48 if we can’t trust versions once, we can’t trust them ever
22:38:01 armax: the point is in managing complexity on neutron side without keeping it indefinitely; while at the same time giving a chance for platforms to catch up without gate breakages
22:38:09 right
22:38:18 so I am thinking about the following
22:38:29 armax: I believe if we give clear signal that we are going to break any version below X in next cycle, that will make platforms react
22:38:43 each patch that interacts with the runtime should probably have extra scrutiny in review
22:39:05 perhaps a DistroImpact tag on the commit message can help us keep them all in one place
22:39:12 and make them easier to review
22:39:23 UpgradeImpact?
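The 22:35:49 remark that conditionals should not be version-based can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not Neutron's actual sanity check: the helper names (parse_version, option_in_help, probe_dnsmasq_option) are hypothetical, the version banner format is assumed, and --rr-param is simply the option name as written in this log. The reasoning is the one given at 22:21:03: a distro package that reports 2.66 may still carry backported features, so probing for the capability is likely more reliable than comparing version numbers.

```python
import re
import subprocess


def parse_version(version_output):
    """Extract (major, minor) from a banner such as 'Dnsmasq version 2.66 ...'.

    The banner format is an assumption; returns None when nothing matches.
    """
    match = re.search(r"version\s+(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def option_in_help(help_output, option):
    """Return True if `option` appears as a whole token in the help text."""
    # Reject matches that are part of a longer option name.
    pattern = r"(?<![\w-])" + re.escape(option) + r"(?![\w-])"
    return re.search(pattern, help_output) is not None


def probe_dnsmasq_option(option, binary="dnsmasq"):
    """Ask the installed binary whether its help output lists `option`.

    Hypothetical helper: returns False when the binary cannot be executed.
    """
    try:
        result = subprocess.run(
            [binary, "--help"], capture_output=True, text=True, check=False
        )
    except OSError:
        return False
    return option_in_help(result.stdout + result.stderr, option)
```

A caller would then gate use of the option on probe_dnsmasq_option("--rr-param") rather than on parse_version(...) >= (2, 67), so a patched 2.66 package that already has the feature is not treated as unsupported.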
22:39:44 ihrachys: that would work too
22:40:12 though
22:40:29 we’re being more stringent here as we’re asking any runtime related change to be vetted
22:40:48 I didn’t suggest to add the UpgradeImpact on the patch we reverted
22:40:55 because we’ve been running 2.67 in the gate for ages
22:41:27 anyhoo so long as we make folks aware of the stricter rules
22:41:32 UpgradeImpact is fine by me
22:41:59 I am also going to propose something formal on how we could also track minimal versions in distro friendly way (was planning this week, but had a week-long event happening here). I think we may come up with something that would allow us to eventually start assuming 2.67+ without being at fault for any breakage that may ensue.
22:42:01 at the same time if we put together some experimental jobs in our queue to test on distros we care about
22:42:23 for the time being, at least we can fire them up when we see an *-Impact patch
22:42:43 makes sense
22:42:50 we started on centos one in neutron gate
22:42:51 and make a decision accordingly, i.e. put conditionals in place
22:42:54 and whatnot
22:43:01 it fails, but we'll have folks from puppet looking at it
22:43:10 which one is it?
22:43:13 armax: we may need a policy update
22:43:24 armax: I can try to bake something if you like
22:43:37 armax: sec, looking in git
22:43:58 we can’t run the same full job we run but just changing the node?
22:44:14 I’d hate to depend on some other project except tempest and devstack to run these jobs
22:44:30 armax: https://review.openstack.org/#/c/402463/ the patch adding the job
22:44:52 armax: nah, I mean puppet folks have motivation to help us fix the job :)
22:45:25 we need to clean up the experimental queue a bit
22:45:28 armax: in theory we can run the same, it just fails. apparently devstack integration has issues to solve for the platform.
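The idea raised at 22:42:23, firing experimental distro jobs when an *-Impact patch shows up, could be supported by a small commit-message scan. The Python sketch below is an assumption-laden illustration, not an existing Neutron or infra tool: the function names are hypothetical, the tag is assumed to start a line of the commit message, and DistroImpact is only the tag floated at 22:39:05. How the jobs would actually be triggered is outside the sketch.

```python
import re

# Matches tags such as "UpgradeImpact" or "DocImpact" at the start of a line.
IMPACT_TAG = re.compile(r"^[ \t]*([A-Za-z]+Impact)\b", re.MULTILINE)


def impact_tags(commit_message):
    """Return the sorted, de-duplicated *Impact tags found in a commit message."""
    return sorted(set(IMPACT_TAG.findall(commit_message)))


def needs_distro_jobs(commit_message, watched=("UpgradeImpact", "DistroImpact")):
    """Return True if the message carries a tag that should get extra distro runs.

    The default `watched` tuple is an assumption based on the tags named in
    this discussion.
    """
    return any(tag in watched for tag in impact_tags(commit_message))
```

A reviewer-side script could call needs_distro_jobs() on each patch in flight and list the matches for the team meeting.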
22:46:14 armax: at least trusty flavours could be removed
22:46:27 also, ovs-native seems redundant now that it's for Mitaka only
22:46:46 ihrachys: ok, that’s an offline topic, I’ll take care of it later
22:47:01 ihrachys: so we have somewhat reached a conclusion?
22:47:21 a) be more strict on how we flag impacting changes that touch the runtime deps
22:47:48 b) add distro coverage on an experimental basis so that we have a tool to proactively see what we break ahead of merge
22:48:31 c) decide support/development strategy on a case-by-case basis?
22:48:38 kevinbenton: ^
22:48:59 sounds fine
22:49:06 how hard is it to make those experimental jobs?
22:49:15 kevinbenton: not hard
22:49:21 now looking at the experimental queue
22:49:24 I see postgres
22:49:27 don't they require some upkeep to ensure that infra doesn't break?
22:49:38 well it's a little different for the OS level isn't it?
22:49:47 maintaining images and whatnot
22:49:54 kevinbenton: we can add them on the periodic queue too
22:50:03 and we can monitor on a daily basis
22:50:21 kevinbenton: yes; for centos, we are currently looking for some staffing from RH to uplift support for the platform
22:51:17 about PG, do we want to ask reviewers to fire experimental runs to see how PG behaves on DB changes?
22:51:44 I mean, this is a separate discussion but now that we are talking about distro deps...
22:51:58 armax: + on the a-c), though we may consider changing c) from case-by-case to as-per-defined-protocol if we will later agree on such a protocol.
22:52:24 ihrachys: we’d have to document and evangelize this
22:52:47 ihrachys: if you can come up with a protocol proposal, I think you just volunteered to put this together in some doc, didn’t you? :)
22:53:08 armax: yes, and I think a-c) needs capturing irrespective of later changes, so we can start with that, and then follow up if needed.
22:53:19 I’ll add a section to the team meeting to highlight UpgradeImpact patches in flight
22:53:21 armax: yes, I will propose policy update
22:53:56 we have 7 minutes left
22:54:15 pcarver: you still there?
22:54:29 as for kevinbenton’s attempt to switch to subquery vs lazy
22:54:50 ihrachys: I was wondering if you can spare a moment and drop your thoughts on his patch
22:54:57 armax: link?
22:55:00 i think we should just adopt a policy of using subqueries for anything that isn't a 1:1
22:55:02 because I definitely miss context
22:55:10 it's not ready :)
22:55:14 armax was too fast
22:55:15 https://review.openstack.org/#/c/408143/
22:55:58 ok
22:56:00 to wrap up
22:56:10 #action armax to clean up the experimental queue for a bit
22:56:18 ok I will need to read some more
22:56:19 #action ihrachys to propose policy changes
22:56:29 #action kevinbenton to continue breaking neutron
22:56:30 we good?
22:56:37 :)
22:56:38 armax: re postgresql, I think it should go into some periodical dash on grafana
22:56:45 ihrachys: that’s what I am gonna do
22:56:53 ihrachys: I’ll tag you on the infra patch
22:56:55 are any other projects still supporting it?
22:56:57 +
22:57:08 kevinbenton: last time I checked nova gates on it
22:57:22 kevinbenton: afaik suse openstack still goes with postgresql
22:57:39 but maybe no-more
22:57:45 ihrachys: yeah, they do
22:57:46 :P
22:57:49 armax: you will learn it in a month!
22:58:18 I can’t see any pg- job in the nova queue
22:58:21 but I’ll double check
22:58:33 ihrachys: surely less than that :)
22:58:36 ok folks
22:58:38 2 minutes
22:58:41 ihrachys: go to bed
22:58:43 I think every time after RC, we get a list of 'critical' patches for PSQL to land
22:58:52 haleyb: thanks for joining
22:59:07 cheerio
22:59:13 #endmeeting