16:00:30 #startmeeting neutron_ci
16:00:33 Meeting started Tue Apr 3 16:00:30 2018 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:34 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:36 hi
16:00:37 The meeting name has been set to 'neutron_ci'
16:00:48 o/
16:01:14 hi
16:01:35 * mlavalle will have to drop out 15 minutes before the top of the hour
16:01:57 please give me 5 minutes because I'm still getting back home (big traffic)
16:02:12 slaweq: no problem
16:02:21 and I'm on a mobile connection now
16:07:13 ok, we can start now
16:07:23 sorry for being late
16:08:01 but I just got back home from Easter - 500 km which I usually do in about 5 hours took almost 9 because of traffic
16:08:10 ok, are You there? :)
16:08:31 mlavalle, haleyb?
16:08:41 slaweq: hey
16:08:55 I think that ihrachys and jlibosva are not here now
16:09:16 they haven't spoken up
16:09:53 i know kuba is at a meetup
16:10:04 ok, so I think we can start
16:10:10 #topic Actions from previous meetings
16:10:21 slaweq will write docs how to debug test jobs
16:10:40 I just pushed the first version of the patch: https://review.openstack.org/#/c/558537/
16:10:56 clarkb reviewed it for me so I will address his comments
16:11:04 but please check it also :)
16:11:30 next one is: haleyb to check router migrations issue
16:11:38 haleyb: any updates?
16:12:15 slaweq: i am still testing it, am on the systems now, so no update yet
16:12:39 ok, so I will keep it as an action for this week for You
16:12:41 ok?
16:12:48 sure
16:12:58 #action haleyb to check router migrations issue
16:13:14 next one was: ihrachys to take a look at problem with openstack-tox-py35-with-oslo-master periodic job
16:13:35 AFAIK it is fixed with https://review.openstack.org/#/c/557003/
16:13:57 so I think all is fine here now
16:14:14 so, next: slaweq to make fullstack job gating
16:14:26 done: https://review.openstack.org/#/c/557218/
16:14:39 and also a grafana dashboard: https://review.openstack.org/#/c/557266/
16:15:06 it didn't appear there yet, but as I asked the infra team, all looks fine, and it should probably appear once the fullstack job fails at least once - we will see
16:16:02 we can talk about the details of how it works later, so now moving on to the next action:
16:16:06 slaweq will check difference between neutron-tempest-multinode-full and neutron-tempest-dvr-ha-multinode-full
16:16:19 I didn't have time to do it last week
16:16:29 what?
16:16:29 I will do it this week for sure
16:16:39 mlavalle: what what?
16:16:42 :)
16:16:45 LOL
16:16:54 I know you've been on vacation
16:17:09 yes, I was
16:17:27 that's why I didn't have time to do this comparison of jobs :)
16:18:30 #action slaweq will check difference between neutron-tempest-multinode-full and neutron-tempest-dvr-ha-multinode-full
16:18:40 do You want to add anything/ask about something?
16:18:51 or can we go to the next topic?
16:19:13 I was kidding with you
16:19:27 about you not having time on your vacation to do it
16:19:28 mlavalle: I supposed that :)
16:20:39 ok, moving on?
16:22:09 I assume that we can go to the next topic
16:22:17 #topic Grafana
16:22:22 http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:23:30 I was checking those graphs from the last 7 days today
16:23:56 There was one big spike last Thursday but I think it was some problem with infra, because all jobs have the same spike there.
16:24:28 Except that, I think it was a pretty quite week.
16:24:56 *quiet
16:24:57 I see fullstack trending up
16:25:16 and also Rally
16:25:57 now yes, but it's still not a big failure rate
16:26:21 about rally I have a few examples of failures and I want to talk about them in a few minutes
16:26:46 ok
16:27:31 about fullstack, it could be because of me and my DNM patch: https://review.openstack.org/#/c/558259/
16:27:43 which failed on fullstack a few times today :)
16:27:57 except that I don't think there is any problem with it
16:28:19 so we can change the topic to fullstack now since we started on it :)
16:28:23 #topic Fullstack
16:28:24 that's the example for the doc revision you proposed, right?
16:28:32 mlavalle: right
16:29:05 and this example was failing because the timeout was reached
16:29:52 as I said, fullstack is IMO stable in both queues now (at least I didn't see any problems with it during the last week)
16:30:09 so I also checked bugs with the "fullstack" tag in launchpad
16:30:20 There is one bug with the "fullstack" tag now (except wishlist): https://bugs.launchpad.net/neutron/+bug/1744402
16:30:22 Launchpad bug 1744402 in neutron "fullstack security groups test fails because ncat process don't starts" [High,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
16:30:23 ok, cool
16:30:46 this bug is assigned to me - I'm aware of it and I want to check logs if it happens again
16:31:14 as I added a small change to the test a few days ago: https://review.openstack.org/#/c/556155/
16:31:28 but I haven't seen it since this patch was merged
16:31:42 I will still keep an eye on it :)
16:32:10 do You want to add something about fullstack?
16:33:04 no, thanks
16:33:09 ok
16:33:14 #topic Scenarios
16:33:32 the only problem which we have is neutron-tempest-plugin-dvr-multinode-scenario still at 100% failures
16:34:01 * mlavalle just reviewed https://review.openstack.org/#/c/558537. Since manjeets highlighted the entire text, to see where my comments apply, please move the cursor over the comments
16:34:13 but that is because of problems with migration
16:34:31 so haleyb is on that
16:34:39 mlavalle: ok, thx for review
16:35:27 mlavalle, sorry for making it a little hard to review, I should have done a file comment
16:35:53 I have a question about neutron-tempest-plugin-scenario-linuxbridge - should we maybe try to add it to the gate queue also (like fullstack) now?
16:36:11 what do You think about that?
16:36:32 how long has it been running in check?
16:37:05 I mean voting in the check queue?
16:37:18 It's voting since 14.03: https://review.openstack.org/#/c/552689/
16:37:37 mmhhh let's hold for a week
16:37:42 sure
16:38:06 especially since last week tended to be quiet, mostly towards the end
16:38:18 yes, right
16:38:32 let's wait and see if it will still be fine
16:38:35 :)
16:39:07 ok, next topic
16:39:10 #topic Rally
16:39:22 as mlavalle showed, it has had some failures recently
16:39:44 so I checked today and found 3 examples of failures from last week
16:39:58 all of them were because of reaching the global job timeout:
16:40:05 http://logs.openstack.org/18/558318/1/check/neutron-rally-neutron/fdee864/job-output.txt.gz
16:40:13 http://logs.openstack.org/84/556584/4/check/neutron-rally-neutron/8a4dc9d/job-output.txt.gz
16:40:17 ok, I saw the same with one of my patches
16:40:20 http://logs.openstack.org/81/552881/8/check/neutron-rally-neutron/fb6fb63/job-output.txt.gz
16:40:46 in one of those I think it was even stopped after all tests passed
16:41:22 I think that we should check what takes most of the time in those jobs and maybe try to speed it up a little bit
16:41:39 is there anyone who wants to check that? :)
16:42:02 This is the one I saw: http://logs.openstack.org/84/556584/4/check/neutron-rally-neutron/8a4dc9d/job-output.txt.gz#_2018-04-03_00_22_53_422472
16:42:56 I don't know if I will have the bandwidth this week, but if nobody takes a look, I will try
16:43:18 ok, thx
16:43:23 hopefully I won't be shamed by slaweq if I don't have time next week
16:43:32 for sure not :)
16:43:49 well, what if you are wearing your Hulk mask?
16:43:57 #action mlavalle to take a look at why rally jobs are taking so long
16:44:10 mlavalle: today I don't have it
16:44:20 but next week - we will see :P
16:45:11 ok, let's move on
16:45:14 #topic Periodic
16:45:36 openstack-tox-py35-with-oslo-master looks like it is fine again - thx ihrachys :)
16:45:58 neutron-tempest-postgres-full sometimes has 100% failures
16:46:14 I checked logs from the last two failures and what I found is:
16:46:21 once it wasn't a real failure but the timeout was reached (after all tests passed): http://logs.openstack.org/periodic/git.openstack.org/openstack/neutron/master/neutron-tempest-postgres-full/03ca3f3/job-output.txt.gz
16:46:32 Another time it was a failure not related to neutron: http://logs.openstack.org/periodic/git.openstack.org/openstack/neutron/master/neutron-tempest-postgres-full/d5c0933/job-output.txt.gz#_2018-03-30_07_15_04_817879
16:47:57 IMO if such timeouts happen more often we should try to check it - for now it was only once so it isn't the biggest problem
16:48:02 what do You think about it?
16:48:14 agree
16:48:23 let's keep an eye on it
16:48:29 yes
16:49:11 ok, so moving on to the last topic
16:49:16 #topic others
16:49:25 I have one more thing to ask
16:49:37 recently we added the new job openstack-tox-lower-constraints
16:49:48 to our queue
16:49:54 *queues
16:50:03 do we want to add it to our grafana dashboard?
16:50:05 it was a request from infra
16:50:13 IIRC
16:50:26 yes, I know - I just wanted to ask if we should add it to grafana :)
16:50:41 to have better visibility of what's going on with it
16:50:49 good point
16:50:52 yes
16:50:57 ok, so I will do it
16:51:15 #action slaweq will add openstack-tox-lower-constraints to grafana dashboard
16:51:26 +1
16:51:26 and last thing
16:51:56 I also checked bugs with the tag "gate-failure" on launchpad: https://tinyurl.com/y826rccx
16:52:27 there are quite a few such bugs with "high" priority, older than a few months and not assigned to anybody
16:52:44 maybe we should check them and close those which are not a problem anymore
16:52:48 what do You think?
16:52:55 yes, let's do it
16:53:13 do You want to go through them now?
16:53:25 or should I do it later maybe?
16:53:27 I have to drop out now
16:53:42 as I said at the beginning of the meeting
16:53:49 ok
16:53:55 but let's try to do them over the week
16:54:03 do you want to partner on that?
16:54:25 so I will try to check them this week and I will ask if I need something
16:54:31 thx a lot :)
16:54:32 ok
16:54:41 so that's all from my side
16:54:53 * mlavalle dropping out
16:54:56 sorry for being so quick but I was preparing it today in the car :)
16:54:59 o/
16:55:05 bye mlavalle
16:55:34 haleyb: do You have anything else? or can we finish a few minutes before time?
16:55:54 i am done, lunch here
16:56:06 #action slaweq will check old gate-failure bugs
16:56:33 ok, so bon appetit haleyb :)
16:56:37 and see You
16:56:44 #endmeeting
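
For the grafana action item above: the neutron failure-rate graphs are defined as grafyaml in the openstack-infra project-config repo (the same repo the linked dashboard patch went to), so adding openstack-tox-lower-constraints means adding a panel there. A minimal sketch of what such a panel could look like is below; the file path, row title, and graphite target pattern are assumptions for illustration only and should be copied from the existing panels in the real dashboard file, not from this sketch.

    # Hypothetical addition to grafana/neutron.yaml in openstack-infra/project-config.
    # The graphite target path is an assumed pattern; mirror whatever the
    # neighbouring check-queue panels in that file actually use.
    dashboard:
      title: Neutron Failure Rate
      rows:
        - title: Lower Constraints Failure Rate
          height: 320px
          panels:
            - title: openstack-tox-lower-constraints (check queue)
              span: 6
              type: graph
              targets:
                - target: >-
                    asPercent(transformNull(stats_counts.zuul.pipeline.check.job.openstack-tox-lower-constraints.FAILURE),
                    sum(stats_counts.zuul.pipeline.check.job.openstack-tox-lower-constraints.{SUCCESS,FAILURE}))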