16:00:44 #startmeeting neutron_ci
16:00:45 Meeting started Tue Apr 11 16:00:44 2017 UTC and is due to finish in 60 minutes. The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:46 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:48 The meeting name has been set to 'neutron_ci'
16:00:53 #link https://wiki.openstack.org/wiki/Meetings/NeutronCI Agenda
16:01:10 hi everyone
16:01:22 o/hello
16:01:52 hi
16:02:46 #topic Action items from prev meeting
16:03:07 first is still "ihrachys fix e-r bot not reporting in irc channel", and the status is the same - no progress on that one
16:03:44 Unless someone wants to take it over, I will drop it from the list of actions to chase here, and will report when I actually get there (I track that by other means)
16:04:04 it never comes to the point when I prioritize it
16:04:27 I can take a look
16:04:34 ok nice, thanks
16:05:29 next was "ihrachys to report bugs for fullstack race in ovs agent when calling to enable_connection_uri" and it's still on me, but I will get there for sure, I saw the issue in non-test env
16:05:32 #action ihrachys to report bugs for fullstack race in ovs agent when calling to enable_connection_uri
16:06:10 next was "haleyb or mlavalle to report back on ha+dvr plan after l3 meeting"
16:06:32 I remember l3 team should have discussed that right?
16:06:39 ihrachys: correct
16:06:42 https://review.openstack.org/#/c/455406/
16:06:51 we discussed it during our meeting
16:07:38 hm so it just works?
16:08:14 ihrachys: plan is to move that non-voting job into the check queue to replace the dvr-multinode one
16:08:37 which is also non-voting?
16:08:44 then if we update the grafana page we can watch it for a bit to make it voting
16:08:54 yes, the current one was -nv as well
16:09:34 i didn't want to add a 3-node job without taking something away
16:09:50 any reason not to include grafana update in the patch?
16:10:28 but in general, have we seen it passing full run? or we will figure it out after the fact?
16:10:30 i thought that was a different repo, maybe i'm wrong
16:10:44 haleyb, no, it's same see grafana/neutron.yml in project-config
16:10:46 o\ /o
16:11:00 ihrachys: oh, then i'll update that too
16:12:00 ihrachys: unless we run 'check experimental' everywhere we'll never know if the job is good, the graphs over time is a better way (imo)
16:12:23 over time sure, but have we at least validated that it has a chance to pass?
16:12:44 last time I checked, it was consistently failing on some scheduler test
16:13:01 (that may have been fixed by late tempest changes in-tree)
16:13:24 I mean https://review.openstack.org/#/c/421155/
16:13:25 ihrachys: hmm, let me look
16:14:23 ah, that change, yes it would fix that bug
16:14:38 ok let's figure it out off-band, but my general take is, we should show that there is a pass for the job at least once somewhere before we move to triggering it for every patch
16:14:51 i will update the grafana page and check the job in an existing patch i have
16:15:10 so if you have a link to successful run would be nice to see it in the gerrit patch for config
16:15:15 ++
16:15:51 thanks for working on it, I am happy we make progress on that longstanding issue (I think we started talking about it ~Newton?)
16:16:12 ok next was "jlibosva to prepare py3 transition plan for Pike"
16:16:15 jlibosva, your stage
16:17:20 so I didn't prepare a plan yet
16:17:25 but I have done some research
16:17:56 should we start some etherpad to capture whatever we have on the topic?
16:18:18 and found that dims (?) already started tracking the job for all projects
16:18:20 https://etherpad.openstack.org/p/support-python3.5-functional-tests
16:18:27 :)
16:18:39 we probably want to add your functional suite there
16:18:52 as it turned to catch py3 related errors in the past
16:19:33 it seems like dims is tracking some tempest job for that
16:19:39 but we can do more I think
16:19:45 I plan to also send an rfe bug specific to neutron where we can track down issues
16:19:48 for one, functional and fullstack jobs
16:20:16 I am not fully sure what that gate-tempest-dsvm-nova-py35-ubuntu-xenial job mentioned there is
16:20:17 the links at the etherpad are outdated
16:20:24 aren't all dsvm jobs nova? :)
16:20:28 right
16:20:42 also, some 'issues' are not really issues, like the one about dhcp_release6
16:20:48 (we have it for py2 too)
16:21:04 so the neutron section definitely needs some update
16:21:19 is the document for tracking all py3 progress or just tempest job?
16:21:52 I think the answer to the question will decide if we need our own document, or we can hijack the existing one for other py3 things we could have for Pike
16:22:06 I'd rather go with our document
16:22:12 dims, what's the intent of https://etherpad.openstack.org/p/support-python3.5-functional-tests ? is it all things for py3 pike goal, or just a specific job?
16:22:16 just in sake of better overview
16:22:27 yeah, we can cross link
16:22:38 then here you go: https://etherpad.openstack.org/p/py3-neutron-pike
16:22:51 #link https://etherpad.openstack.org/p/py3-neutron-pike Etherpad to track py3 efforts for Pike goal
16:23:12 cool, thanks
16:23:16 let's start capturing what you have there
16:23:27 and draft some high level bullet points
16:23:28 I plan to look at it more closely this week
16:24:05 cool
16:24:19 #action jlibosva to follow up on py3 plan for pike
16:24:47 next was "ihrachys to chase infra to review https://review.openstack.org/#/c/439114/"
16:24:59 it's actually in already, so we can adopt the new dashboard for our needs
16:25:07 I will update the wiki page with the link to the board.
16:25:18 #action ihrachys to update wiki with the link to gerrit CI dashboard
16:25:38 the last one is "jlibosva document current openvswitch requirements for fullstack/functional in TESTING.rst"
16:26:12 whoa, totally missed that
16:26:25 * jlibosva hides under the rock
16:26:35 but it's good we track those :) you have the cake for the next week then
16:26:42 #action jlibosva document current openvswitch requirements for fullstack/functional in TESTING.rst
16:27:04 and that's about it for the action items
16:28:03 #topic Patches in review
16:28:22 now that we have manjeets's change for the neutron gerrit dashboard, we can have a look what's there
16:28:38 the link to it is at the top of http://status.openstack.org/reviews/ (see Neutron link)
16:28:49 sadly, the link is autogenerated and is too long to copy paste here
16:29:07 I see a single patch captured by it
16:29:15 which puzzles me, we should have some more
16:29:22 (or I think so)
16:29:28 I will have a look at what's missing later
16:29:47 #action ihrachys to figure out why gerrit dashboard seems to not show some gate-failure fixes
16:30:25 manjeets, I may need your help once/if I find missing patches, I will ping you if I do
16:31:36 anyhow, we have this patch for a sporadic tempest failure on project_id missing in resource payload on first GET: https://review.openstack.org/#/c/447781/
16:32:17 I see amotoki had some comments on the approach there, it's not fully clear to me whether it's a concern around the patch, or a future change that may go wrong
16:33:37 I will personally need some more time to understand the concern of amotoki
16:34:26 * ihrachys looks through the queue to see if any more CI fixes are there
16:34:32 I don't want to stick to my thought on how we can treat project_id and tenant_id equally, but I am not sure we need to treat project_id differently from tenant_id
16:34:52 but I don't want to block this if it blocks the gate
16:35:07 amotoki, it's not like the issue is too pressing, it shows from time to time
16:35:38 amotoki, so what would be your suggestion in this particular case to make treatment same?
16:35:47 I am sometimes looking at it but i haven't figured out what is happening
16:35:53 amotoki, cross-check project_id against tenant_id rules and vice versa?
16:36:43 IIRC in the proposed approach, project_id is checked for both project_id and tenant_id, but tenant_id is checked only for tenant_id
16:37:41 I think we need time to switch project-id and tenant-id and we cannot switch these two at once.
16:37:49 right. because project_id rules are not there (I was thinking about adding them in https://review.openstack.org/448238), and due to the nature of policy.json being a modifiable file, you can't guarantee them being there
16:38:40 ihrachys, sure let me know
16:38:48 personally I would like to treat both equally to avoid unexpected behavior. that is just my point
16:39:11 though the modifiable nature is probably not an argument here, we should not pretend to support that for owner definition :)
16:39:27 amotoki, ok, let's see what we can do, we'll proceed in gerrit
16:39:47 ihrachys: sure
16:39:53 ok, as for other patches up for review
16:39:57 https://tinyurl.com/ly76lmy tiny url
16:40:27 I have this https://review.openstack.org/454870 to fix a sporadic func test failure (not actually sure if it's the fix due to lack of data on failure in the branch where I spotted the failure the last time)
16:40:45 manjeets, but you need to generate every time to keep it fresh
16:41:06 I was hoping to offload that generation matters to infra :)
16:41:10 yep i just generated it
16:41:38 for the func test failure, we will need to land https://review.openstack.org/#/q/Ic5a3b347bea7e5aa8a5caee5035568e5954f58dc,n,z into stable branches to collect more data next time it fails there
16:43:10 we also had a nasty bug sneaked into stable branches where a network delete request could spin indefinitely in a loop spinning CPU up to 100%. That made grenade runs in master to fail sometimes with XXXNotFound errors on cleanup of resources.
16:43:15 it's fixed by https://review.openstack.org/#/q/topic:bug/1672701+message:Revert
16:43:39 but we will need a new Newton release with the patch since we happily released one with regression :-x
16:44:32 also the prev week I realized that most stadium projects forked os-testr in their trees and missed some fixes from there: https://review.openstack.org/#/q/topic:remove-subunit-trace-fork
16:44:49 that made some gates e.g. not fail when all tests were skipped (something that happened in lbaas)
16:44:55 so the patches should fix the wrong
16:45:52 I also have this https://review.openstack.org/#/c/453212/ to simplify our api_extensions configuration in tempest.conf
16:46:28 that's not pressing, but something I figured will make our lives easier since we won't need to maintain two almost identical lists of extensions for gate for DVR and non-DVR cases anymore
16:47:04 of other pressing issues, there is https://bugs.launchpad.net/neutron/+bug/1679815 open
16:47:07 Launchpad bug 1679815 in neutron "test_router_interface_ops_bump_router fails with "AssertionError: 5 not greater than 5"" [Critical,Confirmed] - Assigned to Kevin Benton (kevinbenton)
16:47:36 it made our unit tests crash randomly, our last bastion of stability in gate :)
16:47:54 the fix landed it seems: https://review.openstack.org/#/c/452691/
16:48:02 (or so we think, that it's a fix)
16:48:28 i tripped over this today and rebased to master, so will know soon
16:48:41 I see reedip_ commented there that it hit him. :-x
16:48:53 hey
16:48:54 yeah
16:48:55 so maybe it's not a fix in the end
16:49:01 will need to have another look
16:49:30 we also have https://bugs.launchpad.net/neutron/+bug/1680136 spooking stable gates
16:49:31 Launchpad bug 1680136 in neutron "Stable newton gate is broken" [Critical,Confirmed] - Assigned to Kevin Benton (kevinbenton)
16:49:44 will be fixed by https://review.openstack.org/#/q/Ieef10eebd93f99404dd2fd87ccbab9b75632945a,n,z
16:50:56 any other pressing patches we are aware of?
16:51:56 ok one more thing, we have that pike goal to switch to mod_wsgi for api
16:52:10 Victor recently respinned his patches https://review.openstack.org/#/q/status:open+topic:goal-deploy-api-in-wsgi+owner:%22Victor+Morales+%253Cvictor.morales%2540intel.com%253E%22
16:52:24 I haven't had a look yet but if someone has cycles, I would appreciate it
16:52:52 there is also somewhat related effort to switch to new devstack lib in gate: https://review.openstack.org/#/q/status:open+topic:new-neutron-devstack-in-gate
16:53:24 I was hoping that the latter would go first, and then we would be able to switch to new wsgi execution mode for lib/neutron only
16:53:38 but with the review pace for the devstack switch patches, I am not sure we will get there
16:53:55 again, maybe spend some review cycles on that one if you have any
16:54:07 ihrachys: regarding the stable/newton gate, https://review.openstack.org/#/c/453741/ still hasn't merged, seems stuck
16:54:43 oh right. hm, why
16:54:55 oh I W+1 again and now it's in merge queue
16:55:06 good you spotted it's stuck
16:55:11 we would waste another day :)
16:55:33 #topic Grafana
16:55:41 #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:55:55 we see unit tests spike, probably because of that floating_ip pagination thingy
16:56:19 we will need to figure that out after the meeting
16:56:36 fullstack is at ~100% failure rate
16:56:47 jlibosva, what's the reason there still?
16:57:13 ihrachys: didn't it improve after the ipconntrack patch?
16:57:15 * ihrachys also sees that scenarios are not in shape (almost 100%)
16:57:26 jlibosva, that's what I thought that it will
16:57:34 hmm, it was merged almost 24 hrs ago
16:57:43 I saw a failure in trunk but that was not consistent
16:57:46 well it's not 100% exactly, more like 80% now
16:58:01 yeah, the trend is that it goes down
16:58:03 so maybe that's an improvement in fullstack-speak
16:58:30 ok gotta figure out why it's still not shiny, as well for scenarios
16:58:51 jlibosva, do you want an action item for that?
16:59:02 * ihrachys is not greedy today
16:59:12 I'd wait when it stabilizes at some value
16:59:17 ok fair enough
16:59:33 #action ihrachys to review fullstack and scenario health before next meeting
16:59:41 and we are at the top of the hour
16:59:56 thanks all, and thanks for reviews and patches and joining
16:59:58 ciao
16:59:59 I found some bugs at dhcp tests in fullstack
17:00:00 #endmeeting