Wednesday, 2020-01-29

slaweq#startmeeting neutron_ci15:00
Meeting started Wed Jan 29 15:00:04 2020 UTC
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
*** openstack changes topic to " (Meeting topic: neutron_ci)"15:00
openstackThe meeting name has been set to 'neutron_ci'15:00
slaweqwelcome to the CI meeting at the new hour and in the new room15:01
*** bcafarel has joined #openstack-meeting-315:01
slaweqok, let's start now15:02
slaweqGrafana dashboard:
slaweqplease open now :)15:02
slaweq#topic Actions from previous meetings15:02
*** openstack changes topic to "Actions from previous meetings (Meeting topic: neutron_ci)"15:02
slaweqralonsoh to increase log level for ovsdbapp in fullstack/functional jobs15:03
ralonsohone sec15:03
ralonsohit's still failing ...15:03
slaweqahh, this is the patch for that :)15:04
ralonsohI need to review how to configure properly the ml2 plugin in zuul15:04
slaweqI commented it today again15:04
slaweqbut now I think that it will not work like that15:04
slaweqas You need to set proper config options in test's setUp method15:04
slaweqprobably somewhere in
slaweqfor functional tests15:05
slaweqin those jobs we are not using config files at all15:05
ralonsohyou are right15:05
ralonsohnot in the FTs15:05
ralonsohbut I should configure like this in fullstack15:06
slaweqand in fullstack it is similar15:06
ralonsohok, I'll check it later today15:06
slaweqit's here:
ralonsohyou are right15:06
slaweqsorry that I didn't write it earlier15:06
ralonsohthe agent is configured there15:06
slaweqidk why I missed that and tried to fix Your patch in "Your way" :)15:07
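A minimal sketch of the point made above: since these jobs do not read config files, any setting has to be applied in the test's setUp method. The class name is hypothetical and is not taken from the neutron tree; the "ovsdbapp" logger name is an assumption based on the ovsdbapp.backend.ovs_idl.vlog log lines quoted later in this meeting. Only the Python standard library is used here, so the actual neutron config options are not shown.

```python
import logging
import unittest


class OvsdbappDebugLoggingTestCase(unittest.TestCase):
    """Hypothetical base class; not part of the neutron tree."""

    # Assumed logger name, taken from the "ovsdbapp.backend.ovs_idl.vlog"
    # prefix seen in the functional job logs.
    LOGGER_NAME = "ovsdbapp"

    def setUp(self):
        super().setUp()
        logger = logging.getLogger(self.LOGGER_NAME)
        previous_level = logger.level
        # Raise verbosity for this test only; no config file is involved.
        logger.setLevel(logging.DEBUG)
        # Restore the previous level so other tests are not affected.
        self.addCleanup(logger.setLevel, previous_level)

    def test_debug_enabled(self):
        logger = logging.getLogger(self.LOGGER_NAME)
        self.assertTrue(logger.isEnabledFor(logging.DEBUG))
```

Registering the restore step with addCleanup keeps the change scoped to a single test, which matters when many tests share one process.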
slaweqok, next one was:15:07
slaweqslaweq to open bug for issue with get_dp_id in os_ken15:07
slaweqI reported it here
openstackLaunchpad bug 1861269 in neutron "Functional tests failing due to failure with getting datapath ID from ovs" [High,Confirmed]15:08
ralonsohslaweq, I think I have a possible solution for this15:08
slaweqbut I didn't assign it to myself15:08
ralonsohI need to justify it15:08
ralonsohbut the point is, although we have multithreading because of os-ken15:08
bcafarelfunny, usually in this kind of review the "fix" is add a sleep, not remove one :)15:09
ralonsohthe ovs agent code should not give the GIL to other tasks15:09
ralonsohthat means: do not use sleep, which will stop the thread execution15:09
ralonsohif other threads are not expecting the GIL then those threads won't return it back15:09
*** jamesmcarthur has joined #openstack-meeting-315:09
ralonsohI've rechecked this patch several times, no errors in FT and fullstack (related)15:10
slaweqralonsoh: makes sense IMO15:10
ralonsohI'll add a proper explanation in the patch15:10
slaweqbcafarel: LOL, that's true, usually we need to add sleep to "fix" something :)15:10
slaweqralonsoh: please link Your patch to this bug also15:10
slaweqthx ralonsoh15:11
bcafarelwow that line is old, it comes directly from "Introduce Ryu based OpenFlow implementation"15:11
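The effect discussed above, where a sleep call hands control to other tasks, can be sketched with cooperative scheduling in general. This example uses asyncio from the standard library purely as a stand-in; it is not the os-ken or OVS agent code, and the greenthread setup mentioned in the meeting is not reproduced here. The names worker and run are hypothetical.

```python
import asyncio


async def worker(name, events, yield_control):
    """Record three steps, optionally yielding to other tasks after each."""
    for step in range(3):
        events.append((name, step))
        if yield_control:
            # Sleeping hands control back to the scheduler, so other
            # tasks may run before this one continues.
            await asyncio.sleep(0)


async def run(yield_control):
    """Run two workers concurrently and return the recorded step order."""
    events = []
    await asyncio.gather(
        worker("a", events, yield_control),
        worker("b", events, yield_control),
    )
    return events


# With sleep, steps from the two workers are interleaved.
interleaved = asyncio.run(run(yield_control=True))
# Without sleep, each worker finishes all of its steps before the next starts.
uninterrupted = asyncio.run(run(yield_control=False))
```

In a cooperative model, a task that never sleeps cannot be interrupted part-way through, which is the property the proposed fix relies on.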
slaweqok, next one15:12
slaweqslaweq to try to skip cleaning up neutron resources in fullstack job15:12
slaweqI was trying it locally and it was faster by about 4-5 seconds on test15:12
slaweqso I think it's not worth doing15:13
slaweqand I didn't send any patch15:13
ralonsohyeah, not too much (and maybe we can introduce new errors)15:13
slaweqralonsoh: exactly15:13
slaweqso the risk of unpredictable side effects is too high IMO15:14
*** jamesmcarthur has quit IRC15:14
slaweqok, that was all from last week15:14
slaweq#topic Stadium projects15:14
*** openstack changes topic to "Stadium projects (Meeting topic: neutron_ci)"15:14
slaweqas we talked yesterday, we finished dropping py2 support in Neutron15:15
slaweqso let's use etherpad to track migration to zuul v315:15
slaweqbut I have one more thing about dropping py2 support15:16
slaweqthere is patch
slaweqfor neutron-tempest-plugin15:16
slaweqit's not working properly for Rocky jobs15:16
slaweqand I have a question: shouldn't we first make a tag of the neutron-tempest-plugin repo and use this tag for rocky with py2715:17
slaweqand then go with this patch to drop py27 completely?15:17
slaweqor how will it work with rocky after we merge it?15:18
njohnstonthat makes sense to me15:18
ralonsohwe need to tag it first to use it in rocky tests15:18
slaweqgmann: njohnston: am I missing something here?15:18
njohnstonI am not completely sure - gmann has been very active in this area, I believe he has a plan, but I am not sure of all the details15:19
slaweqok, I will ask about it in review15:20
slaweqok, njohnston any other updates about zuulv3 migration?15:21
njohnstonI don't have any updates; it has been accepted as an official goal for the V cycle, so we are way ahead of schedule, but it will be good to finish up because of all the reasons15:22
njohnstonthere are only 3 or 4 stadium projects that have changes left15:22
slaweqnot too many15:22
njohnstonbcafarel is working on bgpvpn
bcafarelstadium-wise there is also amotoki's question on vpnaas failing on rocky and moving to use neutron-tempest-plugin there15:22
slaweqbcafarel: do You have any patch with failure example?15:23
bcafarelfor vpnaas?
bcafarelalso for bgpvpn I was wondering: there is an install job which does not run any tests (as per its name), should we migrate it or just drop it? I think other tests cover the "is it installable?" part15:24
njohnstonneutron-dynamic-routing also has bcafarel's magic touch ; I don't see anything zuulv3 related for networking-odl, networking-midonet, neutron-vpnaas15:25
njohnstonthat's it for me15:27
slaweqthx njohnston15:27
slaweqspeaking about this vpnaas rocky issue15:27
slaweqam I understanding correctly that if we pinned the tempest used for rocky then it would be fine?15:27
slaweqor should we use, for the rocky branch, the job defined in the neutron-tempest-plugin repo (like for master now)?15:28
njohnstonI'll defer on that to gmann15:29
slaweqok, I will talk with him about it15:30
slaweq#action slaweq to talk with gmann about vpnaas jobs on rocky15:30
* slaweq starts hating rocky branch now15:30
* njohnston welcomes slaweq to the club15:31
bcafarel"old but not enough" branch15:31
slaweqlol, that's true15:31
njohnstonI have had too many backports that go "train: green; stein: green; rocky: RED; queens: green"15:31
slaweqgood news is that it's just a few more weeks and it will be EM15:31
slaweqok, let's move on15:32
slaweqor do You have anything else related to stadium for today?15:33
njohnstonnope, nothing else15:33
slaweqok, so let's move on15:34
slaweq#topic Grafana15:34
*** openstack changes topic to "Grafana (Meeting topic: neutron_ci)"15:34
slaweqfrom what I can tell, we are much better with scenario jobs now15:34
slaweqbut our biggest problems are fullstack/functional jobs15:35
slaweqand grenade jobs15:35
slaweqand the biggest issue from those is functional job15:35
slaweqand I think that we are missing some ovn related job on the dashboard now15:37
slaweqI will check that and update dashboard if needed15:38
slaweq#action slaweq to update grafana dashboard with missing jobs15:38
bcafarelare all the jobs in? I think I saw some reviews on functional ovn (at least)15:38
slaweqbut functional tests will be run together with our "old" functional job I think15:39
bcafarelah ok :)15:39
slaweqanything else related to grafana for today?15:39
slaweqok, so let's move on then15:40
slaweq#topic fullstack/functional15:40
*** openstack changes topic to "fullstack/functional (Meeting topic: neutron_ci)"15:40
slaweqI have a few examples of failures in functional job15:40
slaweqfirst, again ovsdbapp command timeouts:15:41
slaweqbut I know ralonsoh is on it already15:41
slaweqso it's just to point to the new examples of this issue15:41
ralonsohyes, let's see if we can have more information with the patch uploaded15:42
ralonsohbut those are the main problems we see in FT and fullstack15:42
ralonsoh1) ovsdb timeouts15:42
ralonsoh2) the os-ken datapath timeout15:42
ralonsoh3) pyroute timeouts15:42
ralonsoh(did I say "timeout" before?)15:42
slaweqyeah, timeouts are our biggest nightmare now :/15:43
slaweqbut it seems logical that removing "sleep" from code may solve timeouts :P15:43
slaweqok, next one then (this one is new for me, at least I don't remember anything like that)15:44
slaweqfailure in neutron.tests.functional.agent.linux.test_linuxbridge_arp_protect.LinuxBridgeARPSpoofTestCase.test_arp_correct_protection_allowed_address_pairs15:44
slaweqthere are errors like: 2020-01-29 09:11:10.172 22333 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp: error parsing update: ovsdb error: Modify non-existing row: ovs.db.error.Error: ovsdb error: Modify non-existing row15:44
ralonsohI have no idea about this one15:45
ralonsohhow is a Linux Bridge test hitting an OVS error??15:45
* njohnston looks for otherwiseguy15:45
bcafarelsome race condition in test because of our timeout friend?15:45
slaweqme neither but I wonder why ovsdbapp is used in those Linuxbridge tests15:45
ralonsohthat's the point15:46
ralonsoheither LB or OVS15:46
slaweqand here is error in test:
slaweqso it seems that it failed on preparation of test env15:47
*** jamesmcarthur has joined #openstack-meeting-315:47
ralonsohslaweq, that seems an error from a previous test15:48
ralonsohand a blocked greenlet thread15:48
ralonsohmaybe it's too late15:48
ralonsohbut the use of greenthreads, IMO, was not a good option15:48
ralonsoh(remember python does NOT have multithreading at all)15:49
slaweqbut that's the only failed test in this job15:49
ralonsohI know...15:49
slaweqlet's see if we will have more such issues, as nobody saw it before so far maybe it will never happen again ;)15:50
* slaweq doesn't even believe himself15:51
slaweqand I have one more, like:
* bcafarel is not betting on it either15:52
slaweqand this one I think I saw at least twice this week15:52
ralonsohno no15:52
ralonsohthis is not a problem15:52
ralonsohthat was related to an error in the OVN functional tests15:52
ralonsohbut that's solved now15:52
ralonsohone sec15:52
ralonsoh(also I pushed a DNM patch to test this)15:53
bcafarela sorting order issue right?15:53
slaweqahh, so it's related to ovn migration, right?15:53
*** jamesmcarthur has quit IRC15:53
ralonsohplease, read the diff and you'll understand15:53
ralonsohBTW, that problem in FTs was tested in
slaweqso it was trying to use test_extensions.setup_extensions_middleware(sg_mgr) as security groups api, instead of "normal" one, right?15:56
ralonsohthat was needed in networking-ovn15:56
ralonsohbut NOT in neutron repo15:56
ralonsohif you are using the basetest class15:57
slaweqok, good that it's not "yet another new issue with functional tests" :)15:57
slaweqthx ralonsoh :)15:57
slaweqso maybe something similar will be needed to fix failures like
slaweqI saw it also only in ovn related patches15:58
slaweqand it seems that there is simply no needed route in neutron loaded15:58
ralonsohyes, first we need "part 1" patch15:58
ralonsohthen will handle "part 2"15:58
slaweqso those 2 from my list are fine then15:58
slaweqfor fullstack tests I saw such failure:15:59
slaweqit's issue with connection to placement service15:59
slaweqI will ask tomorrow rubasov and lajoskatona to take a look at it16:00
slaweqmaybe they can help with this16:00
slaweqand we are out of time now16:00
slaweqthx for the meeting guys16:00
ralonsohthey have experience on this16:00
slaweqsee You around16:00
*** openstack changes topic to "OpenStack Meetings ||"16:00
Meeting ended Wed Jan 29 16:00:34 2020 UTC.
openstackMinutes (text):
