Tuesday, 2020-12-01

*** tosky has quit IRC00:02
*** macz_ has quit IRC00:40
*** psachin has joined #openstack-meeting-303:35
*** hemanth_n has joined #openstack-meeting-305:18
*** belmoreira has joined #openstack-meeting-307:21
*** ralonsoh has joined #openstack-meeting-307:37
*** slaweq has joined #openstack-meeting-307:52
*** tosky has joined #openstack-meeting-308:40
*** psachin has quit IRC09:07
*** ricolin has joined #openstack-meeting-309:21
*** e0ne has joined #openstack-meeting-310:02
*** e0ne has quit IRC10:12
*** e0ne has joined #openstack-meeting-310:13
*** e0ne has quit IRC10:35
*** waverider has joined #openstack-meeting-311:09
*** Luzi has joined #openstack-meeting-312:36
*** hemanth_n has quit IRC13:09
*** baojg has joined #openstack-meeting-313:16
*** raildo has joined #openstack-meeting-313:25
*** baojg has quit IRC13:30
*** baojg has joined #openstack-meeting-313:32
*** obondarev has joined #openstack-meeting-313:33
*** slaweq has quit IRC13:34
*** slaweq has joined #openstack-meeting-313:36
*** Luzi has quit IRC13:43
*** liuyulong has joined #openstack-meeting-313:53
*** mlavalle has joined #openstack-meeting-313:58
*** lajoskatona has joined #openstack-meeting-313:58
slaweq#startmeeting networking14:00
openstackMeeting started Tue Dec  1 14:00:22 2020 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.14:00
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.14:00
*** openstack changes topic to " (Meeting topic: networking)"14:00
openstackThe meeting name has been set to 'networking'14:00
slaweqhi!14:00
mlavalleo/14:00
amotokio/14:00
ralonsohhi14:00
bcafarelhi14:00
obondarevhi14:01
rubasovhi14:01
lajoskatonai14:01
lajoskatonaHi14:01
slaweq#topic Announcements14:01
*** openstack changes topic to "Announcements (Meeting topic: networking)"14:01
slaweqAfter the migration to the new Gerrit, it seems that the automation scripts which integrate Launchpad and Gerrit are not working14:01
slaweqSo please update Your LPs when You push patch related to some bug report14:02
slaweqlike set LP to be in progress14:02
slaweqand paste link to the proposed fix in comment14:02
slaweqand also assign LP to yourself in such case14:02
slaweqnext one14:03
slaweqaccording to the https://releases.openstack.org/wallaby/schedule.html14:03
slaweqthis week we have Wallaby-1 milestone14:03
slaweqreleases of some libs are already proposed or even done for things like neutron-lib14:04
slaweqso we should be good there14:04
slaweqbut also, I would like to ask You all once again to spend some time reviewing the open specs: https://review.opendev.org/q/project:openstack/neutron-specs+status:open14:04
slaweqso hopefully authors of them will be able to start working on implementation soon14:05
slaweqand that's all announcements/reminders from me for today14:05
slaweqdo You have anything else You want to share?14:05
slaweqok, I guess that this means "no"14:07
slaweqso lets move on14:07
slaweq#topic Blueprints14:07
*** openstack changes topic to "Blueprints (Meeting topic: networking)"14:07
slaweqI moved things from W-1 to W-2 now14:07
slaweqhttps://bugs.launchpad.net/neutron/+milestone/wallaby-214:07
slaweqand I updated today BP https://blueprints.launchpad.net/neutron/+spec/enginefacade-switch which is almost done on neutron side14:07
slaweqlast week we merged ralonsoh's patch which we thought would finish that, but there were still some leftovers in the neutron code14:08
slaweqso I pushed couple of new patches today https://review.opendev.org/q/project:openstack/neutron-specs+status:open14:08
ralonsoh(sorry for that)14:08
bcafarelstill, worth some \o/14:08
slaweqralonsoh: np at all :)14:08
slaweqYou did worst part actually14:08
slaweqsorry, wrong link14:09
ralonsohbtw, correct link14:09
slaweqhttps://review.opendev.org/#/q/status:open+topic:bp/enginefacade-switch14:09
ralonsohhttps://review.opendev.org/q/topic:%22bp%252Fenginefacade-switch%22+(status:open%20OR%20status:merged)14:09
slaweqthx ralonsoh :)14:09
slaweqwe still need to make same transition for the stadium projects before we will mark this BP as completed14:10
mlavalleGreat work!14:10
njohnstonGreat job ralonsoh, slaweq, and everyone!  The dragon is slain (at least in neutron)!14:11
slaweq:)14:11
slaweqregarding other BPs, I created new one https://blueprints.launchpad.net/neutron/+spec/secure-bac-roles14:12
slaweqto track migration to secure rbac roles in Neutron14:13
slaweqI know that mlavalle and amotoki were volunteering to work on that, but recently it became high prio for Red Hat, so we have more pressure to work on that and I already started sending some patches14:13
mlavalleok14:14
slaweqhttps://review.opendev.org/q/topic:%2522secure-rbac%2522+(status:open+OR+status:merged)+project:openstack/neutron14:14
slaweqany help with that is of course welcome :)14:14
mlavalleok14:14
amotokislaweq: sorry for the delay and thanks for starting the work. I will help it.14:14
slaweqno need to be sorry, we all have our own priorities and limited time14:14
slaweqand thx for help and valuable reviews there :)14:15
amotokibtw, do you continue to use the current blueprint with typo.14:15
amotoki?14:15
slaweqamotoki: is there way to change it? or the only way is to create new BP?14:15
slaweqdo You know?14:16
amotokislaweq: I will check it. otherwise we can create a new one and mark it as superseded14:16
slaweqok, thx14:16
amotokirenamed it to https://blueprints.launchpad.net/neutron/+spec/secure-rbac-roles now14:16
amotokiwe can change it from Change details at the right top corner.14:17
slaweqthx a lot14:17
slaweqgood to know14:17
slaweqthat's all updates from me regarding BPs14:18
slaweqdo You have anything else14:19
slaweq?14:19
mlavalleI have an update / request regarding address groups14:19
mlavalleWe are facing some challenges updating the firewall after an address group has been updated. hangyang left a comment yesterday in https://review.opendev.org/c/openstack/neutron/+/75765014:19
mlavallecould ralonsoh take a look?14:20
ralonsohsure14:20
mlavalleThanks!14:20
mlavallethat's all14:20
slaweqahh, conjunctions14:21
slaweq:/14:21
slaweqgood luck :P14:21
mlavalleLOL14:21
lajoskatonaAnother conjunction: https://earthsky.org/astronomy-essentials/great-jupiter-saturn-conjunction-dec-21-202014:21
lajoskatonasorry for hijacking the meeting....14:22
slaweq:)14:22
bcafarelI very much prefer the latter one14:23
njohnstonlol, for sure14:23
slaweqok, if there are no other updates about BPs, lets move on14:24
slaweq#topic Community Goals14:24
*** openstack changes topic to "Community Goals (Meeting topic: networking)"14:24
slaweqregarding "Migrate RBAC Policy Format from JSON to YAML" I saw that gmann already pushed some patches14:24
slaweq https://review.opendev.org/c/openstack/neutron/+/76440114:24
slaweqhttps://review.opendev.org/c/openstack/neutron-lib/+/76441614:24
slaweqand there is also mail http://lists.openstack.org/pipermail/openstack-discuss/2020-November/019079.html14:24
slaweqamotoki: You are probably aware of all of that14:25
slaweqbut just in case I wanted to paste it all here :)14:25
amotokiyeah. I tried it and saw some unclear behavior.14:25
amotokiI will follow it up.14:26
slaweqthx amotoki14:27
slaweqamotoki: if You will need any help, please ping me14:27
amotokislaweq: sure. thanks14:27
slaweqso that one should be under control for now14:28
slaweqralonsoh: any updates about migration to privsep?14:28
ralonsohyes, but I'm facing some problems with one patch (maybe pyroute2 related)14:28
ralonsohand I have the "grenade" patch14:28
ralonsoh(one sec)14:28
ralonsohhttps://review.opendev.org/c/openstack/neutron/+/76401514:29
ralonsohthat will replace rootwrap, generically, with a privsep context14:29
ralonsohI know this is gross but effective14:29
slaweqyeah, so kind of "one patch to migrate them all", right?14:30
ralonsohyes14:30
ralonsohbut, of course, many errors still there14:30
ralonsohso I'll be focused first in the short ones14:30
slaweqok14:31
slaweqthx for working on that14:31
slaweqI think we can move on14:33
slaweqnext topic14:33
slaweq#topic Bugs14:33
*** openstack changes topic to "Bugs (Meeting topic: networking)"14:33
slaweqobondarev was bug deputy last week14:33
slaweqreport is at http://lists.openstack.org/pipermail/openstack-discuss/2020-November/019119.html14:33
slaweqobondarev: any bugs You would like to bring to the team now?14:33
obondarevnot really14:33
obondarevpretty quiet week14:34
ralonsoh(cool)14:34
bcafarelthat was until some PTL started filing bugs today :)14:34
slaweqyeah, I saw Your report14:34
slaweqthere is only one unassigned bug there14:34
slaweqhttps://bugs.launchpad.net/neutron/+bug/190555114:34
openstackLaunchpad bug 1905551 in neutron "functional: test_gateway_chassis_rebalance fails" [Medium,Confirmed]14:34
slaweqrelated to ovn functional tests14:35
slaweqso would be great if someone could take a look at it14:35
slaweqok, any other bugs You want to discuss today?14:37
slaweqif not, I have one14:38
slaweqI added it to the "On Demand Agenda" but we can discuss now14:39
slaweqhttps://bugs.launchpad.net/neutron/+bug/190353114:39
openstackLaunchpad bug 1903531 in neutron "Update of neutron-server breaks compatibility to previous neutron-agent version" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)14:39
slaweqafter some discussion on irc and ML last week I decided that it will be better to remove it from all branches14:39
slaweqand then propose it again to master but with properly provided backward compatibility14:40
slaweqpatches are proposed:14:40
slaweq     master - https://review.opendev.org/c/openstack/neutron/+/76418914:40
slaweq    victoria - https://review.opendev.org/c/openstack/neutron/+/76419014:40
slaweq    ussuri - https://review.opendev.org/c/openstack/neutron/+/76419114:40
slaweqliuyulong proposed https://review.opendev.org/c/openstack/neutron/+/764108 to avoid reverting original patch from the master branch14:40
slaweqso I wanted to discuss it once again here - should we not revert original patch from master and just go with liuyulong's additional patch? Or what is the best approach in Your opinion?14:42
slaweqbecause we basically broke upgrades/updates with that patch and we have to fix it somehow14:42
ralonsohthis is not a bad idea: to convert the message from the server and cap the version given to the agents14:43
bcafarelwell for master it does not change much, by wallaby release we should have updated API target and relevant code14:43
bcafarelwhether we revert fully and reapply or just bump14:43
bcafarelbut for stable... train to ussuri we will need old API to fix upgrade right?14:44
slaweqfor stable we need IMO to revert that patch14:44
ralonsohliu's patch should handle both14:44
ralonsoh(if server is updated first)14:44
bcafarelwhich means revert (breaking minor updates, but that probably cannot be helped, hence a possible release note here) and then only fix in wallaby14:44
slaweqas otherwise we will always have some point where we will have incompatibility14:44
slaweqso do You think it will be good to go with revert for stable/victoria and ussuri and with liuyulong's patch for master, right?14:45
amotokiI agree that we need to revert that patch in all stable branches.14:45
ralonsohyes for master, because Liu's patch can handle when server is bumped and agents not14:46
amotoki+1 for slaweq's proposal14:46
ralonsoh+114:46
slaweqok, thx14:46
slaweqso lets make it that way to fix it14:47
slaweqthx for Your opinions14:47
liuyulongWhy not backport the fix to the patch original merged release? V? or U?14:47
slaweqliuyulong: I don't think we should/can really backport things which bumps rpc version14:47
ralonsohexactly14:47
ralonsoh(and we already had this problem with this bug)14:48
slaweqthis original patch should have bumped rpc version and we should never backport it to stable branches tbh14:48
liuyulongIt should be there, we missed that, so I'm OK if we can break the law once.14:48
slaweqliuyulong: I disagree with that - we already broke that law once with that patch and made a lot of mess14:48
slaweqwe shouldn't do that again14:48
liuyulongI'm not saying the release before V or U release.14:49
slaweqand that's also what I got from the releases team when I discussed with them14:49
amotokiI am not sure we cannot bump RPC version in stable branches but we must keep backward compatibility between server and agent.14:49
slaweqamotoki: liuyulong: ok, I will ask once again stable main cores about opinion on that14:50
liuyulongThe patch was in stable/ussuri natively, just checked that.14:51
slaweqif they will tell me that we can go with that, then ok14:51
slaweqliuyulong: yes, I know14:51
amotokiin this case, agent is not upgraded and requests an older version, so the server should talk the older version at least.14:51
bcafarelif you follow a correct order in upgrade yes with versioned it works OK, though if you for some reason update agents first it will complain14:51
ralonsohbcafarel, well, the upgrade procedure says that you should update the servers first14:52
slaweqbcafarel: but we do support update when older agents can work with newer server14:52
slaweqnot vice versa14:52
slaweqand we should only really care about such scenario14:52
lajoskatonahttps://docs.openstack.org/project-team-guide/stable-branches.html#review-guidelines14:53
lajoskatonahere it mentions only Nova’s internal AMQP API, but I suppose that can be true for other projects as well14:53
bcafarelyes, so it may be acceptable (as in "security fix with proper RPC version bump")14:53
amotokiI think we are discussing two aspects: stable backports and upgrade scenarios.14:54
amotokiupgrade scenarios should be considered even if we are in development cycle.14:54
slaweqamotoki: for me now the question is if we can do backport of that patch which bumps rpc version14:54
slaweqat least for me :)14:55
amotokislaweq: yes you're right14:56
liuyulongActually the RPC was only called when iptables_hybrid is used and ebtables is enabled. The openflow firewall driver will not call this function. Openflow firewall uses local cache directly. Maybe that's why we missed that RPC bump.14:56
slaweqliuyulong: yes, because of that and because of lack of grenade testing14:56
amotokiI wonder whether during stable upgrade process either agent or server can be upgraded first... If we allow upgrading either first, we cannot upgrade RPC version at all.14:56
slaweqwe are testing grenade only with openvswitch firewall driver which is default in devstack14:56
slaweqamotoki: in the docs we guarantee that You can update server first and newer server will be compatible to work with older agents14:57
slaweqthat's what we broke with that patch actually14:57
bcafarelyep14:57
liuyulongamotoki, yes, if the client side has higher version, the RPC will fail directly.14:57
bcafarelseeing as time is running out, at least first steps seem OK for everyone right? fixing version in master, reverting ussuri/victoria14:58
amotokithanks. I missed that point.14:58
bcafareland then discuss if this can be backported again with new rpc14:58
ralonsohbcafarel, I think the proposal is different now14:59
slaweqwe are almost on top of the hour now, I will ask stable-maint cores about backporting liuyulong's patch to V/U also14:59
slaweqif that will be ok for them, I will abandon my reverts14:59
slaweqif not, then lets revert it from those branches and go with liuyulong's patch in master14:59
slaweqsounds good?14:59
bcafarelack14:59
liuyulongOK14:59
amotokisounds good14:59
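The compatibility rule discussed above (a newer server must keep serving older agents, while an agent newer than the server fails) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the version numbers 1.7/1.8 and the field names are hypothetical and are not taken from neutron or oslo.messaging.

```python
# Hypothetical sketch: none of these names are neutron or oslo.messaging APIs.


def parse_version(version):
    """Turn a 'major.minor' string such as '1.7' into (1, 7)."""
    major, minor = version.split(".")
    return int(major), int(minor)


def can_serve(server_version, requested_version):
    """Assumed rule: same major, and the requested minor is not newer
    than the minor the server implements (older agent, newer server)."""
    s_major, s_minor = parse_version(server_version)
    r_major, r_minor = parse_version(requested_version)
    return s_major == r_major and r_minor <= s_minor


def build_reply(payload, requested_version, introduced_in):
    """Cap a reply to what the requesting agent understands.

    introduced_in maps a field name to the version that added it;
    fields newer than the requested version are dropped.
    """
    requested = parse_version(requested_version)
    return {
        key: value
        for key, value in payload.items()
        if parse_version(introduced_in.get(key, "1.0")) <= requested
    }
```

With a server at a hypothetical version 1.8, an agent asking for 1.7 is served a reply without the 1.8-only fields, whereas an agent asking for 1.8 from a 1.7 server is rejected, which matches the "update the server first" ordering mentioned in the discussion.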
slaweqok, thx for attending the meeting15:00
slaweqhave a great week :)15:00
slaweq#endmeeting15:00
*** openstack changes topic to "OpenStack Meetings || https://wiki.openstack.org/wiki/Meetings/"15:00
openstackMeeting ended Tue Dec  1 15:00:13 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:00
openstackMinutes:        http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-12-01-14.00.html15:00
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-12-01-14.00.txt15:00
openstackLog:            http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-12-01-14.00.log.html15:00
lajoskatonao/15:00
slaweq#startmeeting neutron_ci15:00
openstackMeeting started Tue Dec  1 15:00:41 2020 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.15:00
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:00
*** openstack changes topic to " (Meeting topic: neutron_ci)"15:00
openstackThe meeting name has been set to 'neutron_ci'15:00
slaweqwelcome again :)15:00
bcafarelnot even time for coffee break :(15:00
ralonsohhi again15:00
lajoskatonao/15:00
obondarevo/15:01
slaweqok, lets start as we have couple of things to discuss here also :)15:02
slaweqGrafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate15:02
*** artom has quit IRC15:02
slaweq#topic Actions from previous meetings15:02
*** openstack changes topic to "Actions from previous meetings (Meeting topic: neutron_ci)"15:02
slaweqbcafarel to fix stable branches upper-constraints in stadium projects15:02
bcafareldone for victoria https://review.opendev.org/c/openstack/requirements/+/76402215:03
bcafarelussuri close https://review.opendev.org/c/openstack/requirements/+/76402115:03
* mlavalle has a doctor appointment. will skip this meeting o/15:03
bcafarelin the end this also required dropping neutron from blacklist15:04
slaweqtake care mlavalle :)15:04
bcafarelo/ mlavalle15:04
bcafarelwith requirements folks still hoping neutron-lib would be complete one day and remove need for these steps15:04
bcafarelbut well we know this will not be the case soon™15:04
bcafarelanyway at least this will be noted in my next action item15:05
slaweqwhat do You mean by "neutron-lib will be complete"?15:05
slaweqso all projects will import only neutron-lib, and not neutron?15:05
* mlavalle is only going for an eye exam. needs new eye glasses. that's all :-)15:05
bcafarelslaweq: indeed15:05
slaweqbcafarel: that can be hard, especially that we don't work on that too much recently :/15:06
bcafarelyes :/ so I think we will stay with the "need to update requirements after a release" step15:07
slaweqbcafarel: and to fix that in ussuri we need https://review.opendev.org/c/openstack/requirements/+/764021 right?15:07
bcafarelslaweq: yes that's the one (764022 is the merged one for victoria)15:08
slaweqok, so it's almost there15:08
slaweqok, lets move to the next one15:09
slaweq    bcafarel to check and update doc https://docs.openstack.org/neutron/latest/contributor/policies/release-checklist.html15:09
bcafarelbarely started, we can put that for next week15:10
slaweqok15:10
slaweq#action bcafarel to check and update doc https://docs.openstack.org/neutron/latest/contributor/policies/release-checklist.html15:10
slaweqso next one15:10
slaweqslaweq to explore options to fix https://bugs.launchpad.net/neutron/+bug/190353115:10
openstackLaunchpad bug 1903531 in neutron "Update of neutron-server breaks compatibility to previous neutron-agent version" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)15:10
slaweqwe already discussed that in the previous meeting15:10
bcafareljust a bit :)15:10
slaweqso no need to repeat it here15:10
slaweqnext one15:11
slaweqslaweq to report bug against rally15:11
slaweqI checked that and it's really not rally bug15:11
slaweqbut some red herring15:11
slaweqreal bug was that some subnet creation failed simply15:11
slaweqso I didn't report anything against rally15:11
slaweqand that's all actions from last week15:12
slaweqnext topic15:12
slaweq#topic Stadium projects15:12
*** openstack changes topic to "Stadium projects (Meeting topic: neutron_ci)"15:12
slaweqany updates about stadium projects ci?15:12
slaweqlajoskatona?15:12
lajoskatonanothing as I have seen15:12
lajoskatonathings are going on without much problem15:12
slaweqlajoskatona: that's good to hear15:13
slaweq#topic Stable branches15:13
*** openstack changes topic to "Stable branches (Meeting topic: neutron_ci)"15:13
slaweqVictoria dashboard: https://grafana.opendev.org/d/HUCHup2Gz/neutron-failure-rate-previous-stable-release?orgId=115:13
slaweqUssuri dashboard: https://grafana.opendev.org/d/smqHXphMk/neutron-failure-rate-older-stable-release?orgId=115:13
slaweqbcafarel: any updates/issues regarding ci of stable branches?15:14
bcafarelnot that I am aware of at least :)15:14
slaweqok15:15
slaweqso lets move on15:15
slaweq#topic Grafana15:15
*** openstack changes topic to "Grafana (Meeting topic: neutron_ci)"15:15
slaweqin master branch I don't think that things are going well15:15
slaweqwe have plenty of issues and failure rates are pretty high for some jobs15:15
slaweqespecially functional/fullstack recently15:15
*** waverider has quit IRC15:16
ralonsohif we see a recurrent error in the CI (on those jobs), report it and inform in IRC15:17
ralonsohjust to let everybody know that you are on it15:17
slaweqralonsoh: yes, I have couple of examples15:17
ralonsohperfect15:17
slaweqI found them today15:17
ralonsoh(test_walk_versions, for example)15:17
slaweqbut I didn't have time yet to report LPs15:17
slaweqok, regarding grafana I don't really have more to say15:18
slaweqI know that some graphs are a bit out of date recently, but I want to propose one update for that once all patches which change some jobs are merged15:18
slaweqI think there is still one or two in gerrit15:18
slaweqother than that, I think we can talk about some specific jobs now15:19
slaweqare You ok with that?15:19
ralonsohyes15:20
slaweq#topic fullstack/functional15:20
*** openstack changes topic to "fullstack/functional (Meeting topic: neutron_ci)"15:20
slaweqok15:20
slaweqfirst one is bug https://bugs.launchpad.net/neutron/+bug/1889781 which is still hitting us from time to time15:20
openstackLaunchpad bug 1889781 in neutron "Functional tests are timing out" [High,Confirmed]15:20
slaweqand I think that it's even more often recently15:20
slaweqI may try to limit number of logs sent to stdout during those tests15:21
slaweqbut if there is anyone else who wants to do that, that would be great :)15:21
slaweqplease then simply assign this bug to You15:21
slaweqand work on it15:22
ralonsohis that related to the size of the logs?15:22
slaweqralonsoh: most likely yes15:22
ralonsohok15:22
slaweqwe saw similar issue in the past in UT IIRC15:22
ralonsohbut I think this is because of some failing tests15:22
slaweqbasically it is some bug in stestr or something like that15:22
slaweqralonsoh: no15:22
ralonsohlike neutron.tests.functional.agent.linux.test_tc_lib.TcFiltersTestCase.test_add_tc_filter_vxlan [540.005735s] ... FAILED15:23
ralonsohspending too much time15:23
slaweqif You will see logs, there is always huge gap when nothing happens15:23
ralonsohbecause all workers are blocked in other tests15:23
slaweqsee for example:15:23
slaweq2020-11-30 10:03:00.937710 | controller | {1} neutron.tests.functional.agent.ovn.metadata.test_metadata_agent.TestMetadataAgent.test_agent_resync_on_non_existing_bridge [1.997655s] ... ok15:23
slaweq2020-11-30 10:43:39.465033 | RUN END RESULT_TIMED_OUT: [untrusted : opendev.org/openstack/neutron/playbooks/run_functional_job.yaml@master]15:23
ralonsohI know15:24
slaweqthose are 2 consecutive lines from the log15:24
slaweqso there is nothing for about 40 minutes there15:24
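The gap between the two quoted lines can be checked with a short script. This is a minimal sketch, assuming every job-output line starts with a `YYYY-MM-DD HH:MM:SS.ffffff | ` prefix as in the two lines above; the helper names and the 600-second threshold are hypothetical.

```python
from datetime import datetime

# The two consecutive job-output lines quoted in the discussion above.
LINES = [
    "2020-11-30 10:03:00.937710 | controller | {1} "
    "neutron.tests.functional.agent.ovn.metadata.test_metadata_agent."
    "TestMetadataAgent.test_agent_resync_on_non_existing_bridge "
    "[1.997655s] ... ok",
    "2020-11-30 10:43:39.465033 | RUN END RESULT_TIMED_OUT: "
    "[untrusted : opendev.org/openstack/neutron/playbooks/"
    "run_functional_job.yaml@master]",
]


def timestamp(line):
    """Parse the timestamp prefix (assumed format, see lead-in)."""
    prefix = line.split(" | ", 1)[0]
    return datetime.strptime(prefix, "%Y-%m-%d %H:%M:%S.%f")


def silent_gaps(lines, threshold_seconds=600):
    """Return (seconds, previous_line, next_line) for every pair of
    consecutive lines separated by more than threshold_seconds."""
    gaps = []
    for previous, following in zip(lines, lines[1:]):
        delta = (timestamp(following) - timestamp(previous)).total_seconds()
        if delta > threshold_seconds:
            gaps.append((delta, previous, following))
    return gaps
```

For the two lines above this reports a single silent gap of roughly 2438.5 seconds, i.e. 40 minutes and 38.5 seconds, between the last test result and the job timeout.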
ralonsohbut IMO this is because the other workers are blocked checking something15:24
slaweqand that was exactly the symptom of the issue with too much output and stestr15:24
slaweqralonsoh: maybe the root cause now is different than it was with that stestr issue15:26
slaweqidk really15:26
lajoskatonathere was a new release of stestr recently, not sure though what it fixes15:26
slaweqbut at first glance it looks similar to what we had in the past15:26
slaweqanyway, if someone will have some time, You can take a look at that bug :)15:28
slaweqlets move on15:28
slaweqnext one15:28
slaweqI noticed few times this week failures with TestSimpleMonitorInterface15:28
slaweqlike e.g.:15:29
slaweqhttps://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_93d/764365/1/gate/neutron-functional-with-uwsgi/93df51c/testr_results.html15:29
slaweqI need to report LP for that15:29
slaweqralonsoh: isn't that related to some of Your changes maybe? It looks like something what You could work on :)15:29
ralonsohsure, I'll check it15:30
ralonsohand I'll report a LP15:30
ralonsohahh I think you are talking about a fullstack patch15:31
slaweqralonsoh: in log I see something like:15:31
slaweq2020-11-30 10:40:38.271 61912 DEBUG neutron.agent.linux.utils [req-2aa4c2b1-90e9-4f8d-a708-61d18ad4f3ec - - - - -] Running command: ['sudo', '/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/bin/neutron-rootwrap', '/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/etc/neutron/rootwrap.conf', 'ovsdb-client', 'monitor', 'Interface',15:31
slaweq'name,ofport,external_ids', '--format=json'] create_process /home/zuul/src/opendev.org/openstack/neutron/neutron/agent/linux/utils.py:8815:31
slaweq2020-11-30 10:40:38.321 61912 DEBUG neutron.agent.common.async_process [-] Output received from [ovsdb-client monitor Interface name,ofport,external_ids --format=json]: None _read_stdout /home/zuul/src/opendev.org/openstack/neutron/neutron/agent/common/async_process.py:26415:31
slaweq2020-11-30 10:40:38.322 61912 DEBUG neutron.agent.common.async_process [-] Halting async process [ovsdb-client monitor Interface name,ofport,external_ids --format=json] in response to an error. stdout: [[]] - stderr: [[]] _handle_process_error /home/zuul/src/opendev.org/openstack/neutron/neutron/agent/common/async_process.py:22215:31
ralonsohslaweq, in OVS there are two monitors15:31
ralonsohone for the ports and another one for the bridges15:31
ralonsohI migrated the bridges one15:32
ralonsohbut I never finished the complex one, for ports15:32
ralonsohhttps://review.opendev.org/c/openstack/neutron/+/73520115:32
slaweqso this seems for me like monitor of Interfaces15:32
ralonsohyes15:32
slaweqname,ofport,external_ids15:32
ralonsoh(not ports, interfaces)15:32
slaweqok, do You want to investigate it? Or do You want me to check that?15:33
ralonsohI'll report and investigate it15:33
slaweqthx15:33
slaweq#action ralonsoh to report and check issue with TestSimpleMonitorInterface in functional tests15:34
slaweqin the meeting agenda https://etherpad.opendev.org/p/neutron-ci-meetings there is more examples of same failure15:34
slaweqlets move now to fullstack tests15:34
slaweqwhich are also not very stable recently15:34
slaweqmost often I saw issue with mysql killed by oom killer15:35
slaweq    bug reported https://launchpad.net/bugs/190636615:35
openstackLaunchpad bug 1906366 in neutron "oom killer kills mysqld process on the node running fullstack tests" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)15:35
slaweqI proposed patch to limit resources used there15:35
slaweqbut I saw that ralonsoh had some comments there15:35
slaweqI didn't have time yet to address them15:35
ralonsohwe test concurrency so I should not reduce to 1 the number of API workers15:36
ralonsohjust this15:36
slaweqralonsoh: but is that only this one test which You actually mentioned?15:36
slaweqor there are others also15:36
ralonsohI only found this one15:36
slaweqbecause if that's the only test which needs 2 workers, I can set 2 workers only for that test15:36
ralonsohperfect15:37
slaweqand use default to "1" for all other tests15:37
ralonsohyes, I think this is the only one15:37
slaweqok, great15:38
slaweqso I will update my patch15:38
lajoskatonathat is a good example, thanks for mentioning15:38
slaweqand also as I see in the results now, lowering number of test runner workers from 4 to 3 results in about 18 minutes more for whole job15:38
slaweqso should be acceptable15:38
lajoskatonaslaweq: if you are overloaded I can take care of this api_worker change, that dhcp test comes from us.15:39
slaweqlajoskatona: thx, if You could update my patch that would be great15:40
lajoskatonaslaweq: sure15:41
slaweqlajoskatona: thx a lot15:41
slaweqok, lets move on to the scenario/tempest jobs15:41
slaweq#topic Tempest/Scenario15:41
*** openstack changes topic to "Tempest/Scenario (Meeting topic: neutron_ci)"15:41
slaweqfirst of all neutron-tempest-plugin-api15:41
slaweqI noticed quite often that there is one test failing15:42
slaweqtest_dhcp_port_status_active15:42
slaweqe.g.:15:42
slaweqhttps://1973ad26b23f3d5a6239-a05b796fccac2efb122cdf71ce7f0104.ssl.cf5.rackcdn.com/763828/4/check/neutron-tempest-plugin-api/bda79c4/testr_results.html15:42
slaweqor15:42
slaweqhttps://38bbf4ec3cadfd43de08-7d0e556db3075d25d1b91bbdcc8a4562.ssl.cf2.rackcdn.com/764108/6/check/neutron-tempest-plugin-api/cc5cbc6/testr_results.html15:42
slaweqI need to report that one too15:42
slaweqfrom what I saw in neutron-ovs-agent logs it seems that the issue is with rpc loop iteration which takes long time and due to that port is not becoming ACTIVE in 60 seconds15:44
slaweqso one workaround for that could be to bump timeout in that test15:45
ralonsohis this because the VM is not spawned?15:45
slaweqbut I thought that maybe ralonsoh's patches which move sleep(0) to the end of the rpc loop iteration may help with that15:45
slaweqand second patch which lowers number of workers in the tests15:45
ralonsohagree15:45
slaweqis it also for neutron-tempest-plugin-api job?15:45
slaweqralonsoh: there is no really vm spawned in that test. It is checking just dhcp port15:46
slaweqbut that port needs to be provisioned by L2 entity also to be ACTIVE15:46
ralonsohit takes more than one minute to set the device UP15:48
slaweqralonsoh: yes15:48
ralonsohthat's insane...15:48
slaweqand You can check in neutron-ovs-agent's logs that rpc loop iteration takes about 80-90 seconds in that specific time15:48
ralonsohyeah15:49
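Why a 60-second wait cannot succeed when one rpc loop iteration takes 80-90 seconds can be shown with a small simulation. This is a hypothetical model, not the tempest or neutron code: it assumes the port only becomes ACTIVE once the current loop iteration finishes, and it uses a fake clock so nothing really sleeps.

```python
def wait_until(predicate, timeout, interval, clock, sleep):
    """Poll predicate until it is true or timeout seconds have passed.

    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


class FakeClock:
    """Simulated time source so the example runs instantly."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


def port_becomes_active(loop_duration, timeout):
    """Hypothetical model: the port turns ACTIVE only after the agent's
    current rpc loop iteration, lasting loop_duration seconds, ends."""
    clock = FakeClock()
    return wait_until(
        lambda: clock.time() >= loop_duration,
        timeout,
        1.0,
        clock.time,
        clock.sleep,
    )
```

In this model `port_becomes_active(85, 60)` times out while `port_becomes_active(85, 120)` succeeds, which is the trade-off between bumping the test timeout and shortening the loop iteration itself.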
slaweqso I thought about patch  https://review.opendev.org/c/openstack/neutron/+/755313 that maybe will help with that issue15:49
slaweqif that will be merged and we will still see the same issues, I will investigate it more15:49
ralonsohperfect15:49
*** artom has joined #openstack-meeting-315:50
slaweq#action slaweq to check if test_dhcp_port_status_active will be still failing after https://review.opendev.org/c/openstack/neutron/+/755313 will be merged15:50
slaweqbtw. lajoskatona if You can take a look at ^^ that would be great :)15:50
lajoskatonaslaweq: I check it15:51
slaweqlajoskatona: thx15:51
slaweqok, lets move on15:51
slaweqnext issue which I found was in     neutron-ovn-tempest-ovs-release-ipv6-only15:51
slaweqI saw few times ssh failures in that job15:51
slaweq    https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_cac/764356/1/check/neutron-ovn-tempest-ovs-release-ipv6-only/cacd054/testr_results.html15:51
slaweq    https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_08c/752795/24/check/neutron-ovn-tempest-ovs-release-ipv6-only/08c6400/testr_results.html15:51
*** irclogbot_3 has quit IRC15:53
slaweqin both cases it seems that even metadata wasn't reachable from the vm15:53
slaweqdo You know any issues which could cause that and are already reported/in progress?15:53
ralonsohyes but for OVS-DPDK15:53
ralonsoh(I think this is not related)15:54
ralonsohhttps://review.opendev.org/c/openstack/neutron/+/76374515:54
*** irclogbot_1 has joined #openstack-meeting-315:56
slaweqok, I will report that issue on LP and ask someone from OVN squad to take a look at it15:57
slaweq#action slaweq to report LP about SSH failures in the neutron-ovn-tempest-ovs-release-ipv6-only15:57
slaweqand with that I think it's all for today15:57
ralonsohgive me 10 secs, please. Fullstack related15:57
slaweqfrom me15:57
slaweqsure15:57
ralonsohliuyulong, https://review.opendev.org/c/openstack/neutron/+/73844615:57
ralonsohplease, take a look at the replies15:58
ralonsohand anyone else is welcome to review it15:58
ralonsohthanks a lot15:58
ralonsoh(that's all)15:58
slaweqok15:59
slaweqthx for attending the meeting15:59
ralonsohbye!15:59
slaweqsee You online15:59
slaweqo/15:59
lajoskatonaBye!15:59
slaweq#endmeeting15:59
*** openstack changes topic to "OpenStack Meetings || https://wiki.openstack.org/wiki/Meetings/"15:59
openstackMeeting ended Tue Dec  1 15:59:50 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:59
bcafarelo/15:59
openstackMinutes:        http://eavesdrop.openstack.org/meetings/neutron_ci/2020/neutron_ci.2020-12-01-15.00.html15:59
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/neutron_ci/2020/neutron_ci.2020-12-01-15.00.txt15:59
openstackLog:            http://eavesdrop.openstack.org/meetings/neutron_ci/2020/neutron_ci.2020-12-01-15.00.log.html15:59
*** obondarev has quit IRC16:05
*** liuyulong has quit IRC16:07
*** irclogbot_1 has quit IRC16:11
*** irclogbot_0 has joined #openstack-meeting-316:12
*** lajoskatona has left #openstack-meeting-316:15
*** macz_ has joined #openstack-meeting-316:15
*** macz_ has quit IRC16:17
*** belmoreira has quit IRC20:53
*** e0ne has joined #openstack-meeting-321:10
*** e0ne has quit IRC21:26
*** raildo has quit IRC21:46
*** slaweq has quit IRC21:47
*** slaweq has joined #openstack-meeting-321:49
*** slaweq has quit IRC22:56
*** haleyb has quit IRC22:58
*** tosky has quit IRC23:57
