15:00:13 #startmeeting neutron_ci
15:00:14 Meeting started Wed Feb 5 15:00:13 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:15 hi
15:00:16 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:18 The meeting name has been set to 'neutron_ci'
15:00:19 o/
15:01:13 ralonsoh: bcafarel haleyb: CI meeting, are You around?
15:01:20 hi
15:01:24 o/
15:01:34 I was waiting in the wrong channel
15:01:35 slaweq: thanks for the ping, I was looking for the correct window :)
15:01:40 slaweq: i'm in another meeting too, have one eye here :)
15:01:46 :)
15:01:49 ok, let's start
15:01:51 Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:02:03 please open it and we can move on
15:02:05 #topic Actions from previous meetings
15:02:10 slaweq to talk with gmann about vpnaas jobs on rocky
15:02:18 tbh I forgot about it :/
15:02:47 slaweq: i sent a summary of what amotoki and we discussed on the ML
15:02:49 maybe gmann is around now so we can ask him how to fix this issue with vpnaas rocky tempest jobs
15:03:02 gmann: ok, so I need to find this email then :)
15:03:42 #link http://lists.openstack.org/pipermail/openstack-discuss/2020-January/012241.html
15:03:45 4th point
15:04:01 slaweq: ping me once you are done, then we can discuss further
15:07:00 gmann: so basically we should backport to rocky https://review.opendev.org/#/c/695834/
15:07:10 or at least "partially" backport it
15:07:12 ?
15:07:41 slaweq: yeah, that will be good for long term maintenance and for how the py2 EOL things are happening
15:08:27 ok, and this will not be a problem if we have those tests in-tree but actually run them from the neutron-tempest-plugin repo?
15:08:56 i am fixing stable branches with stable u-c to use in the tempest tox run, which solves the issue, but backporting that is what i will suggest in case of another issue
15:09:03 true
15:09:16 ok, thx for the explanation
15:09:18 fixing current stable branches summary - http://lists.openstack.org/pipermail/openstack-discuss/2020-February/012371.html
15:09:36 #action slaweq to backport https://review.opendev.org/#/c/695834/ to stable branches in neutron-vpnaas
15:09:44 FYI, all stable branches till rocky are broken now
15:10:14 sigh
15:10:23 :/
15:11:19 ok, let's move on
15:11:24 next action was
15:11:26 slaweq to update grafana dashboard with missing jobs
15:11:30 and I also forgot about it :/
15:11:34 #action slaweq to update grafana dashboard with missing jobs
15:11:41 I will do it this week
15:12:14 any questions/comments on this topic?
15:12:44 nope
15:13:02 no
15:13:07 so we can move on to the next topic
15:13:08 #topic Stadium projects
15:13:21 migration to zuulv3
15:13:23 https://etherpad.openstack.org/p/neutron-train-zuulv3-py27drop
15:13:40 I was checking this etherpad a few days ago, and I even sent some small patches related to it
15:13:47 (but it still needs some work)
15:14:41 and generally we are pretty good there
15:14:57 most of the legacy jobs have got patches already in review
15:14:57 +1
15:15:18 huge thx to bcafarel for sending many related patches :)
15:15:38 np, some of them are still not working properly
15:16:02 slaweq: as you know neutron-functional well, if you have some time take a look at https://review.opendev.org/#/c/703601/
15:16:13 I can't seem to convince it to install/find neutron :(
15:16:24 (nothing urgent of course)
15:16:39 bcafarel: ok, I will take a look
15:17:39 thanks :)
15:17:54 np
15:18:00 anything else related to the stadium projects' ci?
15:18:25 nope
15:18:49 no
15:19:27 ok, let's move on then
15:19:55 #topic Grafana
15:19:57 http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:21:21 first of all, our gate jobs have not been running for a few days
15:21:44 and it's mostly due to the broken neutron-ovn-tempest-ovs-release job
15:22:07 the second issue is that we have some gap in data during last weekend and the beginning of this week
15:22:13 but that's probably an infra issue
15:23:21 Is neutron-ovn-tempest-ovs-release one of the missing jobs that needs to be updated in grafana? I can't find it.
15:23:30 njohnston: yes
15:23:33 sorry for that :/
15:23:33 ok
15:23:44 np :-)
15:24:50 also it seems we had some problem yesterday, as many jobs have got high numbers there
15:25:05 but I don't know about any specific issue from yesterday
15:26:55 but this spike can also be due to the low number of running jobs (or data stored) the day before
15:28:04 other than that I don't see anything really wrong
15:29:11 ok, let's move on then
15:29:13 #topic fullstack/functional
15:29:28 I have a couple of issues in fullstack tests for today
15:29:43 Error when connecting to the placement service (same as last week too):
15:29:50 https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f66/703143/3/check/neutron-fullstack/f667c93/controller/logs/dsvm-fullstack-logs/TestPlacementBandwidthReport.test_configurations_are_synced_towards_placement_NIC-Switch-agent_/neutron-server--2020-02-04--12-09-12-759753_log.txt
15:30:03 maybe lajoskatona or rubasov could take a look at it
15:30:57 I pinged rubasov to join this meeting
15:31:05 maybe he will join soon
15:31:48 hi
15:31:54 hi rubasov
15:32:19 recently we spotted a few times an issue like in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f66/703143/3/check/neutron-fullstack/f667c93/controller/logs/dsvm-fullstack-logs/TestPlacementBandwidthReport.test_configurations_are_synced_towards_placement_NIC-Switch-agent_/neutron-server--2020-02-04--12-09-12-759753_log.txt
15:32:38 neutron-server can't connect to the (fake) placement service
15:32:51 did You maybe see something like that before?
15:33:07 or do You know why it could happen?
15:33:28 did not see this before
15:35:04 rubasov: can You try to take a look into that?
15:35:12 I don't really have ideas right now
15:35:17 not today of course :) but if You will have some time
15:35:41 sure, we'll look into it with lajoskatona
15:35:55 he wrote that fake placement service originally IIRC
15:36:05 how frequent is this?
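
A brief aside on the ECONNREFUSED errors against the fake placement service discussed above: a minimal, generic readiness-wait sketch, not the actual fullstack fixture code — the helper name, host and port are made up. A check like this would rule out a startup race where neutron-server tries to reach the fixture before it is listening.

```python
# Illustrative sketch only (assumed helper name, host and port): wait until a
# locally started service accepts TCP connections before talking to it. This
# helps tell a startup race (fixture not listening yet) apart from a real
# crash of the fake placement service.
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Block until host:port accepts TCP connections, or raise RuntimeError."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return
        except OSError:
            time.sleep(interval)
    raise RuntimeError("%s:%s did not become reachable within %ss"
                       % (host, port, timeout))


if __name__ == "__main__":
    # Hypothetical local endpoint standing in for the fake placement service.
    wait_for_port("127.0.0.1", 8780)
```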
15:36:39 it's not very frequent, I saw it once per week or something like that
15:37:07 rubasov: I will open an LP to track it
15:37:09 okay, I'll put it on my todo list
15:37:29 and I will send it to You and Lajos - maybe You will have some time to take a look
15:38:08 that's even better, thank you
15:38:15 #action slaweq to open LP related to fullstack placement issue
15:38:19 thx rubasov
15:38:32 ok, another issue
15:38:35 in https://c355270b22583c2d2af0-42801c5a43c64ea303a559bec7f7cdd7.ssl.cf5.rackcdn.com/705903/2/check/neutron-fullstack/79cf197/controller/logs/dsvm-fullstack-logs/TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle_egress_/neutron-server--2020-02-05--11-00-19-067345_log.txt
15:38:37 thanks
15:38:52 it seems to me like neutron-server just hung and that caused the test timeout
15:39:38 actually not, it was also a connection issue: https://c355270b22583c2d2af0-42801c5a43c64ea303a559bec7f7cdd7.ssl.cf5.rackcdn.com/705903/2/check/neutron-fullstack/79cf197/controller/logs/dsvm-fullstack-logs/TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle_egress_.txt
15:39:42 this time to neutron-server
15:40:37 well, if the neutron server crashed hard then both the log would end precipitously like that and ECONNREFUSED is what clients would see
15:41:05 if it was a hang then the client would have timeouts
15:41:55 njohnston: but at the end of the logs You can see
15:41:57 2020-02-05 11:01:19.086 22341 DEBUG neutron.agent.linux.utils [-] Running command: ['kill', '-15', '3180'] create_process /home/zuul/src/opendev.org/openstack/neutron/neutron/agent/linux/utils.py:87
15:41:59 2020-02-05 11:01:19.280 22341 DEBUG neutron.tests.fullstack.resources.process [-] Process stopped: neutron-server stop /home/zuul/src/opendev.org/openstack/neutron/neutron/tests/fullstack/resources/process.py:85
15:42:08 so it seems that neutron-server was properly stopped at the end
15:42:21 if it had crashed earlier, wouldn't this be an error?
15:42:51 I think so...
15:43:03 which log is that in? I don't see it in https://c355270b22583c2d2af0-42801c5a43c64ea303a559bec7f7cdd7.ssl.cf5.rackcdn.com/705903/2/check/neutron-fullstack/79cf197/controller/logs/dsvm-fullstack-logs/TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle_egress_/neutron-server--2020-02-05--11-00-19-067345_log.txt
15:43:40 njohnston: it's in https://c355270b22583c2d2af0-42801c5a43c64ea303a559bec7f7cdd7.ssl.cf5.rackcdn.com/705903/2/check/neutron-fullstack/79cf197/controller/logs/dsvm-fullstack-logs/TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle_egress_.txt
15:43:50 this is the "test log"
15:44:30 ok
15:46:00 slaweq, do you have a bug for this one?
15:46:05 I can review it later
15:46:14 ralonsoh: nope, I saw it only once so far and I didn't open a bug for it
15:46:17 but I can
15:46:24 (at least this is not a QoS error)
15:47:00 I will ping You when I open the LP for that
15:47:09 thanks for the "present"
15:47:22 #action slaweq to open LP related to "hang" neutron-server
15:47:27 ralonsoh: yw :D
15:47:49 and that's all I have for today for functional/fullstack jobs
15:48:00 anything else You have maybe?
15:48:31 slaweq, https://review.opendev.org/#/c/705760/ is almost merged
15:48:51 I'll abandon https://review.opendev.org/#/c/705903/
15:49:27 ralonsoh: great
15:49:28 so, recheck time once 705760 is in?
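
A minimal sketch of the crash-versus-hang distinction njohnston makes above (plain Python, not neutron fullstack code; the endpoint is hypothetical): a dead listener shows up as ECONNREFUSED, while a process that still holds the port but does not answer shows up as a timeout.

```python
# Minimal probe distinguishing "refused" (nothing listening, e.g. the server
# crashed or was already stopped) from "timeout" (listening but hung).
# Host/port are placeholders for the neutron-server API endpoint used in the
# fullstack test.
import socket


def probe(host, port, timeout=5.0):
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
            if not sock.recv(1):
                return "accepted, then closed"
            return "responding"
    except ConnectionRefusedError:
        return "connection refused - no listener (crashed or stopped)"
    except socket.timeout:
        return "timeout - listener present but not answering (hang)"


if __name__ == "__main__":
    print(probe("127.0.0.1", 9696))
```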
15:49:43 that should unblock our gate, hopefully
15:50:12 excellent
15:51:11 ok, so we move to scenario/tempest tests now
15:51:24 #topic Tempest/Scenario
15:51:57 we already mentioned the broken neutron-ovn-tempest-ovs-release job, which should be fixed with 705760
15:53:38 from other issues, I have one with test_show_network_segment_range https://9f9aee74b45263b2a9d8-795792c1f104e79962e44448ab55e3f1.ssl.cf1.rackcdn.com/681466/2/check/neutron-tempest-plugin-api/b340c8e/testr_results.html
15:53:49 and I think I saw something similar a couple of times already
15:54:05 I'm not sure if that was always the same test but similar issue for sure
15:54:09 again?
15:54:16 it is the same one
15:54:28 KeyError on project_id??
15:54:38 I really don't understand why this specific key is not present
15:54:48 and this is not a trivial one, but project_id
15:55:31 yes, I also don't understand it
15:55:56 wait, actually this one was 21.01
15:56:01 so maybe it's an old issue
15:56:37 sorry for the noise then :)
15:57:12 no, but this test error is recurrent
15:57:40 yes, and actually I don't understand it exactly
15:58:46 if You look at the code: https://github.com/openstack/neutron-tempest-plugin/blob/master/neutron_tempest_plugin/api/admin/test_network_segment_range.py#L201
15:58:59 it failed after checking "id", "name" and other attributes
15:59:04 so it's not like the dict is empty
15:59:13 one question
15:59:14 there is "only" project_id missing from it
15:59:21 this test is using neutron-client
15:59:28 not os-client
15:59:32 is that correct?
15:59:59 idk
16:00:10 but IMO those tests are using tempest clients, no?
16:01:02 Ok, I'll check it
16:01:09 thx ralonsoh
16:01:18 #action ralonsoh to check missing project_id issue
16:01:20 we ran out of time
16:01:26 ok, we are out of time today
16:01:31 thx for attending
16:01:33 o/
16:01:34 bye
16:01:35 \o
16:01:35 #endmeeting
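
A simplified stand-in for the failing check in test_show_network_segment_range discussed at the end of the meeting (not the actual neutron-tempest-plugin code; field values are invented). Comparing expected fields against the response dict with a raw lookup raises KeyError on the first missing key, so a response carrying "id" and "name" but no "project_id" fails exactly as in the linked testr results; an explicit assertIn before the lookup would at least give a clearer failure message.

```python
# Simplified illustration of the failure mode (intentionally fails when run):
# the raw observed[field] lookup only blows up once the loop reaches the
# missing "project_id" key, after "id" and "name" already matched.
import unittest


class FakeNetworkSegmentRangeTest(unittest.TestCase):

    def test_show_network_segment_range(self):
        expected = {"id": "uuid-1", "name": "range-1", "project_id": "proj-1"}
        observed = {"id": "uuid-1", "name": "range-1"}  # simulated API reply

        for field, value in expected.items():
            # self.assertIn(field, observed) here would report the missing
            # key as an assertion failure instead of a bare KeyError.
            self.assertEqual(value, observed[field])


if __name__ == "__main__":
    unittest.main()
```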