16:00:19 #startmeeting neutron_ci
16:00:20 Meeting started Tue Dec 18 16:00:19 2018 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:21 hi
16:00:22 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:25 The meeting name has been set to 'neutron_ci'
16:00:51 o/
16:01:25 o/
16:01:54 ok, let's start
16:02:00 #topic Actions from previous meetings
16:02:06 mlavalle will continue debugging trunk tests failures in multinode dvr env
16:02:26 I did work on this, in fact I was having a chat with haleyb about it
16:02:57 I am finding that the instance that lands on the controller node cannot access the metadata service
16:03:37 mlavalle: I'm now looking at dvr multinode job results and I see a lot of errors like: http://logs.openstack.org/59/625359/4/check/neutron-tempest-plugin-dvr-multinode-scenario/0fcd454/controller/logs/screen-q-l3.txt.gz?level=ERROR
16:03:48 do You think that those may be related somehow?
16:04:00 * mlavalle looking
16:04:51 slaweq: i had meant to file a bug for that and another traceback i saw, need to find the tab
16:04:55 I saw those and they definitely need to be fixed
16:05:20 but I don't necessarily see the connection with the other bug
16:05:27 yes, but I wonder if that may be a reason why trunk tests are still failing, even with Your patch
16:05:33 I don't rule it out, but I don't see the connection now
16:05:37 ok
16:05:59 haleyb: will You file a bug for this one or should I do it?
16:06:14 I am also finding that the L3 agent in the controller doesn't seem to be processing the router
16:06:45 only the agent in the compute node is processing the router
16:06:56 i can file a bug, wonder about the JSON issues as well, could mean a bad message?
16:07:19 what JSON issue exactly?
16:07:25 i think i've seen these with a malformed rpc, but can't remember
16:07:37 late hi o/
16:07:45 http://logs.openstack.org/59/625359/4/check/neutron-tempest-plugin-dvr-multinode-scenario/0fcd454/controller/logs/screen-q-l3.txt.gz?level=ERROR#_Dec_17_23_29_17_340537
16:07:52 if that is actually the case, then there is no metadata proxy to process the request from the instance in the controller
16:07:56 slaweq: things like that^^
16:08:33 so here's what I plan to do:
16:08:42 1) Dig further in the logs
16:08:46 hmm, I haven't seen this one before
16:09:09 2) I will test locally the creation of the router
16:09:26 3) add some log debug statements and test
16:09:36 agree?
16:09:41 mlavalle: You are talking about debugging the trunk issue, right?
16:09:47 right
16:09:57 ok, that is fine for me
16:10:01 but at this point, it is not a trunk issue
16:10:07 #action mlavalle will continue debugging trunk tests failures in multinode dvr env
16:10:10 it is a router / L3 agent issue
16:10:16 slaweq: ^^^^^
16:10:24 and that spills to:
16:10:38 1) potentially the messages you are seeing in the logs
16:11:08 2) the other ssh timeouts that you gave me as homework last week
16:11:53 makes sense?
16:12:01 yep
16:12:21 ok
16:12:28 mlavalle: so do You think that we should report separate bugs for those issues from logs?
16:12:40 I think so but what's Your opinion? :)
16:12:47 yes
16:12:49 k
16:12:59 haleyb: will You report them?
16:13:02 and I'll point to them in the bug that I am working on
16:13:16 sure, will report them
16:13:19 so if I find a relationship, I keep the connection
16:13:21 #action haleyb to report bugs about recent errors in L3 agent logs
16:13:27 thx mlavalle and haleyb
16:14:13 ok, let's move on
16:14:20 next one: slaweq to continue debugging bug 1798475
16:14:21 bug 1798475 in neutron "Fullstack test test_ha_router_restart_agents_no_packet_lost failing" [High,In progress] https://launchpad.net/bugs/1798475 - Assigned to LIU Yulong (dragon889)
16:14:43 I asked L3 experts for help and liuyulong jumped in. Patch proposed: https://review.openstack.org/#/c/625054/
16:15:33 ok
16:15:36 great!
16:16:09 it is still WIP
16:16:16 so as liuyulong is working on it, I think we are in good hands with this one :)
16:16:29 yeap
16:17:27 ok, let's move on then
16:17:31 slaweq to continue fixing functional-py3 tests
16:17:59 I limited output from the functional job by disabling warnings and some logging to stdout
16:18:04 patches for that are proposed:
16:18:09 https://review.openstack.org/#/c/625555/
16:18:11 https://review.openstack.org/#/c/625569/
16:18:13 https://review.openstack.org/#/c/625704/
16:18:15 https://review.openstack.org/#/c/625571/
16:18:33 with those patches this functional job running on python3 should be (almost) good
16:18:56 almost, because I noticed that also 3 tests related to the SIGHUP signal are failing: http://logs.openstack.org/83/577383/19/check/neutron-functional/6470d68/logs/testr_results.html.gz
16:19:24 bcafarel: can You take a look at them and check if that is related to the issue with handling SIGHUP which we have already reported somewhere?
16:19:30 or maybe it's some different issue
16:21:06 ok, I think that bcafarel is not here now
16:21:15 I will ping him tomorrow about that issue
16:21:32 #action slaweq to talk with bcafarel about SIGHUP issue in functional py3 tests
16:21:44 next one:
16:21:46 hongbin to report and check failing neutron.tests.fullstack.test_l3_agent.TestHAL3Agent.test_gateway_ip_changed test
16:21:53 slaweq: o/ sorry was AFK I can take a look tomorrow (hopefully)
16:22:05 bcafarel: no problem, thx a lot
16:22:26 o/
16:22:39 there is a proposed patch for that
16:22:47 #link https://review.openstack.org/#/c/625359/
16:23:31 IMO, we can merge the patch and see if it is able to resolve the error
16:23:45 thx hongbin, I will take a look at it tomorrow
16:24:00 slaweq: thanks
16:24:34 ok, let's move to the next one then
16:24:35 slaweq to switch neutron-tempest-iptables_hybrid job to non-voting if it will be failing a lot because of bug 1807949
16:24:36 bug 1807949 in os-vif "os_vif error: [Errno 24] Too many open files" [High,Fix released] https://launchpad.net/bugs/1807949 - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
16:24:53 I did: patch https://review.openstack.org/#/c/624489/ is merged already
16:25:05 and a revert is also already proposed: https://review.openstack.org/#/c/625519/ - the proper fix is already in os_vif
16:25:11 so please review this revert :)
16:25:32 and thx ralonsoh for the proper fix in os_vif :)
16:26:20 next one:
16:26:21 slaweq to mark db migration tests as unstable for now
16:26:26 Patch https://review.openstack.org/#/c/624685/ - merged
16:27:25 and I recently found out that there was a mistake in this patch, so there is another one: https://review.openstack.org/#/c/625556/ and this is also merged
16:27:40 I hope our functional tests failure ratio should be better now
16:28:00 and that was all on my list from last week
16:28:23 anything else You want to ask/talk about from the previous week?
16:29:10 ok, let's move to the next topic then
16:29:14 #topic Python 3
16:29:28 I have 2 things related to python 3
16:29:42 1. Rally job switch to python3: https://review.openstack.org/#/c/624358/ - please review it
16:29:55 it required a fix on the rally side and it's merged already
16:30:06 so we should be good to switch this job to python3
16:30:45 2. Some info: as per gmann's comment in https://review.openstack.org/#/c/624360/3 - we will not get rid of the tempest-full python 2.7 job for now.
16:32:07 that's all from my side about python3 CI jobs
16:32:17 anything else You want to add?
16:32:48 nope, thanks for the update
16:32:58 ok
16:33:06 #topic Grafana
16:33:13 #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:34:41 do You see anything worth discussing now?
16:35:22 I think that all is more or less under control now - we still have some issues which we are aware of, but nothing new and nothing which would cause very high failure rates
16:35:51 neutron-tempest-iptables_hybrid job is clearly back in good shape after ralonsoh's fix for os_vif
16:35:54 it looks overall good
16:36:07 and neutron-tempest-plugin-dvr-multinode-scenario is back to 100% failures
16:36:21 right, I was about to mention that
16:36:22 but this was already discussed before and we know why it happens like that
16:36:46 but keep in mind that the work I am doing with the trunk tests addresses that
16:36:52 at least partially
16:37:17 fullstack tests are mostly failing because of those issues which hongbin and liuyulong are working on
16:37:30 That's what I figured
16:38:23 only the quite high failure rate for functional tests worries me a bit
16:38:32 I'm looking for some recent examples
16:40:15 I found something like this for example: http://logs.openstack.org/00/612400/16/check/neutron-functional/3e8729c/logs/testr_results.html.gz
16:41:10 and I see such an error for the first time
16:43:24 the other examples which I found look like they are related to the patches they were running on
16:43:33 so maybe there is no new issue with those tests
16:43:42 let's just monitor it for the next days :)
16:43:48 what do You think?
16:44:43 yes
16:44:48 let's monitor it
16:45:01 :)
16:45:05 ok, let's move on
16:45:09 #topic Periodic
16:45:42 I just wanted to mention that thanks to mriedem our neutron-tempest-postgres-full is good again :)
16:45:46 thx mriedem
16:45:59 and except that, all other periodic jobs look good now
16:46:24 ok, so last topic for today
16:46:26 #topic Open discussion
16:46:39 \o/
16:46:59 first of all I want to mention that the next 2 meetings will be cancelled as they would fall during Christmas and New Year
16:47:08 I will send an email about that today too
16:47:11 no meetings on the 25th and the 1st, right?
16:47:17 mlavalle: right
16:47:32 I don't have anything else
16:47:41 and the second thing which I want to raise here is a patch
16:47:47 #link https://review.openstack.org/573933
16:47:55 it has been waiting for a very long time for review
16:48:06 I have already mentioned it here a few times
16:48:11 please take a look at it
16:48:13 I'll comment today
16:48:19 thx mlavalle
16:48:25 as I've said before, I'm not a fan of it
16:48:44 personally I don't think we should merge it, but I don't want to block it if others think that it's good
16:49:06 ok, that's all from me for today
16:49:17 do You have anything else You want to talk about?
16:49:21 nope
16:49:47 if not, then I will give You back 10 minutes
16:49:54 thanks
16:50:12 have great holidays and a Happy New Year!
16:50:17 you too!
16:50:18 the same :)
16:50:25 Happy CI New Year even :)
16:50:29 although I still expect to see you in the Neutron channel
16:50:37 and see You all in January at the meetings
16:50:50 mlavalle: yes, I will be available until this Friday for sure
16:51:04 next week maybe, if I will need to rest from the family a bit :P
16:51:06 :-)
16:51:18 #endmeeting