*** tosky has quit IRC | 00:02 | |
*** macz_ has quit IRC | 00:40 | |
*** psachin has joined #openstack-meeting-3 | 03:35 | |
*** hemanth_n has joined #openstack-meeting-3 | 05:18 | |
*** belmoreira has joined #openstack-meeting-3 | 07:21 | |
*** ralonsoh has joined #openstack-meeting-3 | 07:37 | |
*** slaweq has joined #openstack-meeting-3 | 07:52 | |
*** tosky has joined #openstack-meeting-3 | 08:40 | |
*** psachin has quit IRC | 09:07 | |
*** ricolin has joined #openstack-meeting-3 | 09:21 | |
*** e0ne has joined #openstack-meeting-3 | 10:02 | |
*** e0ne has quit IRC | 10:12 | |
*** e0ne has joined #openstack-meeting-3 | 10:13 | |
*** e0ne has quit IRC | 10:35 | |
*** waverider has joined #openstack-meeting-3 | 11:09 | |
*** Luzi has joined #openstack-meeting-3 | 12:36 | |
*** hemanth_n has quit IRC | 13:09 | |
*** baojg has joined #openstack-meeting-3 | 13:16 | |
*** raildo has joined #openstack-meeting-3 | 13:25 | |
*** baojg has quit IRC | 13:30 | |
*** baojg has joined #openstack-meeting-3 | 13:32 | |
*** obondarev has joined #openstack-meeting-3 | 13:33 | |
*** slaweq has quit IRC | 13:34 | |
*** slaweq has joined #openstack-meeting-3 | 13:36 | |
*** Luzi has quit IRC | 13:43 | |
*** liuyulong has joined #openstack-meeting-3 | 13:53 | |
*** mlavalle has joined #openstack-meeting-3 | 13:58 | |
*** lajoskatona has joined #openstack-meeting-3 | 13:58 | |
slaweq | #startmeeting networking | 14:00 |
---|---|---|
openstack | Meeting started Tue Dec 1 14:00:22 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot. | 14:00 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 14:00 |
*** openstack changes topic to " (Meeting topic: networking)" | 14:00 | |
openstack | The meeting name has been set to 'networking' | 14:00 |
slaweq | hi! | 14:00 |
mlavalle | o/ | 14:00 |
amotoki | o/ | 14:00 |
ralonsoh | hi | 14:00 |
bcafarel | hi | 14:00 |
obondarev | hi | 14:01 |
rubasov | hi | 14:01 |
lajoskatona | i | 14:01 |
lajoskatona | Hi | 14:01 |
slaweq | #topic Announcements | 14:01 |
*** openstack changes topic to "Announcements (Meeting topic: networking)" | 14:01 | |
slaweq | After the migration to the new Gerrit, it seems that the automation scripts which integrate Launchpad and Gerrit are not working | 14:01 |
slaweq | So please update Your LPs when You push a patch related to some bug report | 14:02 |
slaweq | like set LP to be in progress | 14:02 |
slaweq | and paste link to the proposed fix in comment | 14:02 |
slaweq | and also assign LP to yourself in such case | 14:02 |
slaweq | next one | 14:03 |
slaweq | according to the https://releases.openstack.org/wallaby/schedule.html | 14:03 |
slaweq | this week we have Wallaby-1 milestone | 14:03 |
slaweq | releases of some libs are already proposed or even done for things like neutron-lib | 14:04 |
slaweq | so we should be good there | 14:04 |
slaweq | but also, I would like to ask You all once again to spend some time on reviewing open specs: https://review.opendev.org/q/project:openstack/neutron-specs+status:open | 14:04 |
slaweq | so hopefully authors of them will be able to start working on implementation soon | 14:05 |
slaweq | and that's all announcements/reminders from me for today | 14:05 |
slaweq | do You have anything else You want to share? | 14:05 |
slaweq | ok, I guess that this means "no" | 14:07 |
slaweq | so lets move on | 14:07 |
slaweq | #topic Blueprints | 14:07 |
*** openstack changes topic to "Blueprints (Meeting topic: networking)" | 14:07 | |
slaweq | I moved things from W-1 to W-2 now | 14:07 |
slaweq | https://bugs.launchpad.net/neutron/+milestone/wallaby-2 | 14:07 |
slaweq | and I updated today BP https://blueprints.launchpad.net/neutron/+spec/enginefacade-switch which is almost done on neutron side | 14:07 |
slaweq | last week we merged ralonsoh's patch which we thought would finish that but there were still some leftovers in the neutron code | 14:08 |
slaweq | so I pushed couple of new patches today https://review.opendev.org/q/project:openstack/neutron-specs+status:open | 14:08 |
ralonsoh | (sorry for that) | 14:08 |
bcafarel | still, worth some \o/ | 14:08 |
slaweq | ralonsoh: np at all :) | 14:08 |
slaweq | You did worst part actually | 14:08 |
slaweq | sorry, wrong link | 14:09 |
ralonsoh | btw, correct link | 14:09 |
slaweq | https://review.opendev.org/#/q/status:open+topic:bp/enginefacade-switch | 14:09 |
ralonsoh | https://review.opendev.org/q/topic:%22bp%252Fenginefacade-switch%22+(status:open%20OR%20status:merged) | 14:09 |
slaweq | thx ralonsoh :) | 14:09 |
slaweq | we still need to make same transition for the stadium projects before we will mark this BP as completed | 14:10 |
mlavalle | Great work! | 14:10 |
njohnston | Great job ralonsoh, slaweq, and everyone! The dragon is slain (at least in neutron)! | 14:11 |
slaweq | :) | 14:11 |
slaweq | regarding other BPs, I created new one https://blueprints.launchpad.net/neutron/+spec/secure-bac-roles | 14:12 |
slaweq | to track migration to secure rbac roles in Neutron | 14:13 |
slaweq | I know that mlavalle and amotoki were volunteering to work on that but recently it became high prio for Red Hat so we have more pressure to work on that and I already started sending some patches | 14:13 |
mlavalle | ok | 14:14 |
slaweq | https://review.opendev.org/q/topic:%2522secure-rbac%2522+(status:open+OR+status:merged)+project:openstack/neutron | 14:14 |
slaweq | any help with that is of course welcome :) | 14:14 |
mlavalle | ok | 14:14 |
amotoki | slaweq: sorry for the delay and thanks for starting the work. I will help it. | 14:14 |
slaweq | no need to be sorry, we all have our own priorities and limited time | 14:14 |
slaweq | and thx for help and valuable reviews there :) | 14:15 |
amotoki | btw, do you continue to use the current blueprint with typo. | 14:15 |
amotoki | ? | 14:15 |
slaweq | amotoki: is there way to change it? or the only way is to create new BP? | 14:15 |
slaweq | do You know? | 14:16 |
amotoki | slaweq: I will check it. otherwise we can create a new one and mark it as superseded | 14:16 |
slaweq | ok, thx | 14:16 |
amotoki | renamed it to https://blueprints.launchpad.net/neutron/+spec/secure-rbac-roles now | 14:16 |
amotoki | we can change it from Change details at the right top corner. | 14:17 |
slaweq | thx a lot | 14:17 |
slaweq | good to know | 14:17 |
slaweq | that's all updates from me regarding BPs | 14:18 |
slaweq | do You have anything else | 14:19 |
slaweq | ? | 14:19 |
mlavalle | I have an update / request regarding address groups | 14:19 |
mlavalle | We are facing some challenges updating the firewall after an address group has been updated. hangyang left a comment yesterday in https://review.opendev.org/c/openstack/neutron/+/757650 | 14:19 |
mlavalle | could ralonsoh take a look? | 14:20 |
ralonsoh | sure | 14:20 |
mlavalle | Thanks! | 14:20 |
mlavalle | that's all | 14:20 |
slaweq | ahh, conjunctions | 14:21 |
slaweq | :/ | 14:21 |
slaweq | good luck :P | 14:21 |
mlavalle | LOL | 14:21 |
lajoskatona | Another conjunction: https://earthsky.org/astronomy-essentials/great-jupiter-saturn-conjunction-dec-21-2020 | 14:21 |
lajoskatona | sorry for hijacking the meeting.... | 14:22 |
slaweq | :) | 14:22 |
bcafarel | I very much prefer the latter one | 14:23 |
njohnston | lol, for sure | 14:23 |
slaweq | ok, if there are no other updates about BPs, lets move on | 14:24 |
slaweq | #topic Community Goals | 14:24 |
*** openstack changes topic to "Community Goals (Meeting topic: networking)" | 14:24 | |
slaweq | regarding "Migrate RBAC Policy Format from JSON to YAML" I saw that gmann already pushed some patches | 14:24 |
slaweq | https://review.opendev.org/c/openstack/neutron/+/764401 | 14:24 |
slaweq | https://review.opendev.org/c/openstack/neutron-lib/+/764416 | 14:24 |
slaweq | and there is also mail http://lists.openstack.org/pipermail/openstack-discuss/2020-November/019079.html | 14:24 |
slaweq | amotoki: You are probably aware of all of that | 14:25 |
slaweq | but just in case I wanted to paste it all here :) | 14:25 |
amotoki | yeah. I tried it and saw some unclear behavior. | 14:25 |
amotoki | I will follow it up. | 14:26 |
slaweq | thx amotoki | 14:27 |
slaweq | amotoki: if You will need any help, please ping me | 14:27 |
amotoki | slaweq: sure. thanks | 14:27 |
slaweq | so that one should be under control for now | 14:28 |
slaweq | ralonsoh: any updates about migration to privsep? | 14:28 |
ralonsoh | yes, but I'm facing some problems with one patch (maybe pyroute2 related) | 14:28 |
ralonsoh | and I have the "grenade" patch | 14:28 |
ralonsoh | (one sec) | 14:28 |
ralonsoh | https://review.opendev.org/c/openstack/neutron/+/764015 | 14:29 |
ralonsoh | that will replace rootwrap, generically, with a privsep context | 14:29 |
ralonsoh | I know this is gross but effective | 14:29 |
slaweq | yeah, so kind of "one patch to migrate them all", right? | 14:30 |
ralonsoh | yes | 14:30 |
ralonsoh | but, of course, many errors still there | 14:30 |
ralonsoh | so I'll focus first on the short ones | 14:30 |
slaweq | ok | 14:31 |
slaweq | thx for working on that | 14:31 |
slaweq | I think we can move on | 14:33 |
slaweq | next topic | 14:33 |
slaweq | #topic Bugs | 14:33 |
*** openstack changes topic to "Bugs (Meeting topic: networking)" | 14:33 | |
slaweq | obondarev was bug deputy last week | 14:33 |
slaweq | report is at http://lists.openstack.org/pipermail/openstack-discuss/2020-November/019119.html | 14:33 |
slaweq | obondarev: any bugs You would like to bring to the team now? | 14:33 |
obondarev | not really | 14:33 |
obondarev | pretty quiet week | 14:34 |
ralonsoh | (cool) | 14:34 |
bcafarel | that was until some PTL started filing bugs today :) | 14:34 |
slaweq | yeah, I saw Your report | 14:34 |
slaweq | there is only one unassigned bug there | 14:34 |
slaweq | https://bugs.launchpad.net/neutron/+bug/1905551 | 14:34 |
openstack | Launchpad bug 1905551 in neutron "functional: test_gateway_chassis_rebalance fails" [Medium,Confirmed] | 14:34 |
slaweq | related to ovn functional tests | 14:35 |
slaweq | so would be great if someone could take a look at it | 14:35 |
slaweq | ok, any other bugs You want to discuss today? | 14:37 |
slaweq | if not, I have one | 14:38 |
slaweq | I added it to the "On Demand Agenda" but we can discuss now | 14:39 |
slaweq | https://bugs.launchpad.net/neutron/+bug/1903531 | 14:39 |
openstack | Launchpad bug 1903531 in neutron "Update of neutron-server breaks compatibility to previous neutron-agent version" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq) | 14:39 |
slaweq | after some discussion on irc and ML last week I decided that it will be better to remove it from all branches | 14:39 |
slaweq | and then propose it again to master but with properly provided backward compatibility | 14:40 |
slaweq | patches are proposed: | 14:40 |
slaweq | master - https://review.opendev.org/c/openstack/neutron/+/764189 | 14:40 |
slaweq | victoria - https://review.opendev.org/c/openstack/neutron/+/764190 | 14:40 |
slaweq | ussuri - https://review.opendev.org/c/openstack/neutron/+/764191 | 14:40 |
slaweq | liuyulong proposed https://review.opendev.org/c/openstack/neutron/+/764108 to avoid reverting original patch from the master branch | 14:40 |
slaweq | so I wanted to discuss it once again here - should we not revert original patch from master and just go with liuyulong's additional patch? Or what is the best approach in Your opinion? | 14:42 |
slaweq | because we basically broke upgrades/updates with that patch and we have to fix it somehow | 14:42 |
ralonsoh | this is not a bad idea: to convert the message from the server and cap the version given to the agents | 14:43 |
bcafarel | well for master it does not change much, by wallaby release we should have updated API target and relevant code | 14:43 |
bcafarel | whether we revert fully and reapply or just bump | 14:43 |
bcafarel | but for stable... train to ussuri we will need old API to fix upgrade right? | 14:44 |
slaweq | for stable we need IMO to revert that patch | 14:44 |
ralonsoh | liu's patch should handle both | 14:44 |
ralonsoh | (if server is updated first) | 14:44 |
bcafarel | which means revert (breaking minor updates but can probably not be helped, hence possible release note here) and then only fix in wallaby | 14:44 |
slaweq | as otherwise we will always have some point where we will have incompatibility | 14:44 |
slaweq | so do You think it will be good to go with revert for stable/victoria and ussuri and with liuyulong's patch for master, right? | 14:45 |
amotoki | I agree that we need to revert that patch in all stable branches. | 14:45 |
ralonsoh | yes for master, because Liu's patch can handle when server is bumped and agents not | 14:46 |
amotoki | +1 for slaweq's proposal | 14:46 |
ralonsoh | +1 | 14:46 |
slaweq | ok, thx | 14:46 |
slaweq | so lets make it that way to fix it | 14:47 |
slaweq | thx for Your opinions | 14:47 |
liuyulong | Why not backport the fix to the release where the patch was originally merged? V? or U? | 14:47 |
slaweq | liuyulong: I don't think we should/can really backport things which bump rpc version | 14:47 |
ralonsoh | exactly | 14:47 |
ralonsoh | (and we already had this problem with this bug) | 14:48 |
slaweq | this original patch should have bumped rpc version and we should never backport it to stable branches tbh | 14:48 |
liuyulong | It should be there, we missed that, so I'm OK if we can break the law once. | 14:48 |
slaweq | liuyulong: I disagree with that - we already broke that law once with that patch and made a lot of mess | 14:48 |
slaweq | we shouldn't do that again | 14:48 |
liuyulong | I'm not saying the release before V or U release. | 14:49 |
slaweq | and that's also what I got from the releases team when I discussed with them | 14:49 |
amotoki | I am not sure we cannot bump RPC version in stable branches but we must keep backward compatibility between server and agent. | 14:49 |
slaweq | amotoki: liuyulong: ok, I will ask once again stable main cores about opinion on that | 14:50 |
liuyulong | The patch was in stable/ussuri natively, just checked that. | 14:51 |
slaweq | if they will tell me that we can go with that, then ok | 14:51 |
slaweq | liuyulong: yes, I know | 14:51 |
amotoki | in this case, agent is not upgraded and requests an older version, so the server should talk the older version at least. | 14:51 |
bcafarel | if you follow a correct order in upgrade yes with versioned it works OK, though if you for some reason update agents first it will complain | 14:51 |
ralonsoh | bcafarel, well, the upgrade procedure says that you should update the servers first | 14:52 |
slaweq | bcafarel: but we do support update when older agents can work with newer server | 14:52 |
slaweq | not vice versa | 14:52 |
slaweq | and we should only really care about such scenario | 14:52 |
lajoskatona | https://docs.openstack.org/project-team-guide/stable-branches.html#review-guidelines | 14:53 |
lajoskatona | here it mentions only Nova’s internal AMQP API, but I suppose that can be true for other projects as well | 14:53 |
bcafarel | yes, so it may be acceptable (as in "security fix with proper RPC version bump") | 14:53 |
amotoki | I think we are discussing two aspects: stable backports and upgrade scenarios. | 14:54 |
amotoki | upgrade scenarios should be considered even if we are in the development cycle. | 14:54 |
slaweq | amotoki: for me now the question is if we can do backport of that patch which bumps rpc version | 14:54 |
slaweq | at least for me :) | 14:55 |
amotoki | slaweq: yes you're right | 14:56 |
liuyulong | Actually the RPC was only called when use iptables_hybrid and enable ebtables. The openflow firewall driver will not call this function. Openflow firewall uses local cache directly. Maybe that's why we missed that RPC bump. | 14:56 |
slaweq | liuyulong: yes, because of that and because of lack of grenade testing | 14:56 |
amotoki | I wonder whether during stable upgrade process both agent and server can be upgraded first... If we allow to upgrade either first, we cannot upgrade RPC version at all. | 14:56 |
slaweq | we are testing grenade only with openvswitch firewall driver which is default in devstack | 14:56 |
slaweq | amotoki: in the docs we guarantee that You can update server first and newer server will be compatible to work with older agents | 14:57 |
slaweq | that's what we broke with that patch actually | 14:57 |
bcafarel | yep | 14:57 |
liuyulong | amotoki, yes, if the client side has a higher version, the RPC will fail directly. | 14:57 |
bcafarel | seeing as time is running out, at least first steps seem OK for everyone right? fixing version in master, reverting ussuri/victoria | 14:58 |
amotoki | thanks. I missed that point. | 14:58 |
bcafarel | and then discuss if this can be backported again with new rpc | 14:58 |
ralonsoh | bcafarel, I think the proposal is different now | 14:59 |
slaweq | we are almost on top of the hour now, I will ask stable-maint cores about backporting liuyulong's patch to V/U also | 14:59 |
slaweq | if that will be ok for them, I will abandon my reverts | 14:59 |
slaweq | if not, then lets revert it from those branches and go with liuyulong's patch in master | 14:59 |
slaweq | sounds good? | 14:59 |
bcafarel | ack | 14:59 |
liuyulong | OK | 14:59 |
amotoki | sounds good | 14:59 |
slaweq | ok, thx for attending the meeting | 15:00 |
slaweq | have a great week :) | 15:00 |
slaweq | #endmeeting | 15:00 |
*** openstack changes topic to "OpenStack Meetings || https://wiki.openstack.org/wiki/Meetings/" | 15:00 | |
openstack | Meeting ended Tue Dec 1 15:00:13 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:00 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-12-01-14.00.html | 15:00 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-12-01-14.00.txt | 15:00 |
openstack | Log: http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-12-01-14.00.log.html | 15:00 |
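The idea raised in the discussion above, that the server converts its message and caps the version it gives to older agents, can be illustrated with a small sketch. This is only an illustration under stated assumptions: it does not use oslo.messaging or any neutron code, and every name in it (`parse_version`, `FIELDS_ADDED_IN`, `cap_payload`, the `port_mac` field, the version numbers) is hypothetical.

```python
# Hypothetical sketch; none of these names come from neutron or oslo.messaging.

def parse_version(version):
    """Turn '1.3' into (1, 3) so versions compare numerically, not as text."""
    major, minor = version.split(".")
    return int(major), int(minor)


# Assumed example data: payload fields introduced by each RPC version.
FIELDS_ADDED_IN = {
    "1.0": ["port_id", "security_groups"],
    "1.1": ["port_mac"],  # hypothetical field added by a newer server
}


def cap_payload(payload, agent_version):
    """Return only the fields an agent speaking agent_version understands."""
    limit = parse_version(agent_version)
    allowed = set()
    for version, fields in FIELDS_ADDED_IN.items():
        if parse_version(version) <= limit:
            allowed.update(fields)
    return {key: value for key, value in payload.items() if key in allowed}
```

With this shape an upgraded server can keep answering agents that have not been upgraded yet, which is the upgrade order (server first, agents later) discussed in the meeting.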
lajoskatona | o/ | 15:00 |
slaweq | #startmeeting neutron_ci | 15:00 |
openstack | Meeting started Tue Dec 1 15:00:41 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:00 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:00 |
*** openstack changes topic to " (Meeting topic: neutron_ci)" | 15:00 | |
openstack | The meeting name has been set to 'neutron_ci' | 15:00 |
slaweq | welcome again :) | 15:00 |
bcafarel | not even time for coffee break :( | 15:00 |
ralonsoh | hi again | 15:00 |
lajoskatona | o/ | 15:00 |
obondarev | o/ | 15:01 |
slaweq | ok, lets start as we have couple of things to discuss here also :) | 15:02 |
slaweq | Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate | 15:02 |
*** artom has quit IRC | 15:02 | |
slaweq | #topic Actions from previous meetings | 15:02 |
*** openstack changes topic to "Actions from previous meetings (Meeting topic: neutron_ci)" | 15:02 | |
slaweq | bcafarel to fix stable branches upper-constraints in stadium projects | 15:02 |
bcafarel | done for victoria https://review.opendev.org/c/openstack/requirements/+/764022 | 15:03 |
bcafarel | ussuri close https://review.opendev.org/c/openstack/requirements/+/764021 | 15:03 |
* mlavalle has a doctor appointment. will skip this meeting o/ | 15:03 | |
bcafarel | in the end this also required dropping neutron from blacklist | 15:04 |
slaweq | take care mlavalle :) | 15:04 |
bcafarel | o/ mlavalle | 15:04 |
bcafarel | with requirements folks still hoping neutron-lib would be complete one day and remove need for these steps | 15:04 |
bcafarel | but well we know this will not be the case soon™ | 15:04 |
bcafarel | anyway at least this will be noted in my next action item | 15:05 |
slaweq | what do You mean by "neutron-lib will be complete"? | 15:05 |
slaweq | so all projects will import only neutron-lib, and not neutron? | 15:05 |
* mlavalle is only going for an eye exam. needs new eye glasses. that's all :-) | 15:05 | |
bcafarel | slaweq: indeed | 15:05 |
slaweq | bcafarel: that can be hard, especially that we don't work on that too much recently :/ | 15:06 |
bcafarel | yes :/ so I think we will stay with the "need to update requirements after a release" step | 15:07 |
slaweq | bcafarel: and to fix that in ussuri we need https://review.opendev.org/c/openstack/requirements/+/764021 right? | 15:07 |
bcafarel | slaweq: yes that's the one (764022 is the merged one for victoria) | 15:08 |
slaweq | ok, so it's almost there | 15:08 |
slaweq | ok, lets move to the next one | 15:09 |
slaweq | bcafarel to check and update doc https://docs.openstack.org/neutron/latest/contributor/policies/release-checklist.html | 15:09 |
bcafarel | barely started, we can put that for next week | 15:10 |
slaweq | ok | 15:10 |
slaweq | #action bcafarel to check and update doc https://docs.openstack.org/neutron/latest/contributor/policies/release-checklist.html | 15:10 |
slaweq | so next one | 15:10 |
slaweq | slaweq to explore options to fix https://bugs.launchpad.net/neutron/+bug/1903531 | 15:10 |
openstack | Launchpad bug 1903531 in neutron "Update of neutron-server breaks compatibility to previous neutron-agent version" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq) | 15:10 |
slaweq | we already discussed that in the previous meeting | 15:10 |
bcafarel | just a bit :) | 15:10 |
slaweq | so no need to repeat it here | 15:10 |
slaweq | next one | 15:11 |
slaweq | slaweq to report bug against rally | 15:11 |
slaweq | I checked that and it's really not rally bug | 15:11 |
slaweq | but some red herring | 15:11 |
slaweq | real bug was that some subnet creation failed simply | 15:11 |
slaweq | so I didn't report anything against rally | 15:11 |
slaweq | and that's all actions from last week | 15:12 |
slaweq | next topic | 15:12 |
slaweq | #topic Stadium projects | 15:12 |
*** openstack changes topic to "Stadium projects (Meeting topic: neutron_ci)" | 15:12 | |
slaweq | any updates about stadium projects ci? | 15:12 |
slaweq | lajoskatona? | 15:12 |
lajoskatona | nothing as I have seen | 15:12 |
lajoskatona | things are going on without much problem | 15:12 |
slaweq | lajoskatona: that's good to hear | 15:13 |
slaweq | #topic Stable branches | 15:13 |
*** openstack changes topic to "Stable branches (Meeting topic: neutron_ci)" | 15:13 | |
slaweq | Victoria dashboard: https://grafana.opendev.org/d/HUCHup2Gz/neutron-failure-rate-previous-stable-release?orgId=1 | 15:13 |
slaweq | Ussuri dashboard: https://grafana.opendev.org/d/smqHXphMk/neutron-failure-rate-older-stable-release?orgId=1 | 15:13 |
slaweq | bcafarel: any updates/issues regarding ci of stable branches? | 15:14 |
bcafarel | not that I am aware of at least :) | 15:14 |
slaweq | ok | 15:15 |
slaweq | so lets move on | 15:15 |
slaweq | #topic Grafana | 15:15 |
*** openstack changes topic to "Grafana (Meeting topic: neutron_ci)" | 15:15 | |
slaweq | in master branch I don't think that things are going well | 15:15 |
slaweq | we have plenty of issues and failure rates are pretty high for some jobs | 15:15 |
slaweq | especially functional/fullstack recently | 15:15 |
*** waverider has quit IRC | 15:16 | |
ralonsoh | if we see a recurrent error in the CI (on those jobs), report it and inform in IRC | 15:17 |
ralonsoh | just to let everybody know that you are on it | 15:17 |
slaweq | ralonsoh: yes, I have couple of examples | 15:17 |
ralonsoh | perfect | 15:17 |
slaweq | I found them today | 15:17 |
ralonsoh | (test_walk_versions, for example) | 15:17 |
slaweq | but I didn't have time yet to report LPs | 15:17 |
slaweq | ok, regarding grafana I don't really have more to say | 15:18 |
slaweq | I know that some graphs are a bit not up to date recently but I want to propose one update for that when all patches which change some jobs will be merged | 15:18 |
slaweq | I think there is still one or two in gerrit | 15:18 |
slaweq | other than that, I think we can talk about some specific jobs now | 15:19 |
slaweq | are You ok with that? | 15:19 |
ralonsoh | yes | 15:20 |
slaweq | #topic fullstack/functional | 15:20 |
*** openstack changes topic to "fullstack/functional (Meeting topic: neutron_ci)" | 15:20 | |
slaweq | ok | 15:20 |
slaweq | first one is bug https://bugs.launchpad.net/neutron/+bug/1889781 which is still hitting us from time to time | 15:20 |
openstack | Launchpad bug 1889781 in neutron "Functional tests are timing out" [High,Confirmed] | 15:20 |
slaweq | and I think that it's even more often recently | 15:20 |
slaweq | I may try to limit number of logs sent to stdout during those tests | 15:21 |
slaweq | but if there is anyone else who wants to do that, that would be great :) | 15:21 |
slaweq | please then simply assign this bug to You | 15:21 |
slaweq | and work on it | 15:22 |
ralonsoh | is that related to the size of the logs? | 15:22 |
slaweq | ralonsoh: most likely yes | 15:22 |
ralonsoh | ok | 15:22 |
slaweq | we saw similar issue in the past in UT IIRC | 15:22 |
ralonsoh | but I think this is because of some failing tests | 15:22 |
slaweq | basically it is some bug in stestr or something like that | 15:22 |
slaweq | ralonsoh: no | 15:22 |
ralonsoh | like neutron.tests.functional.agent.linux.test_tc_lib.TcFiltersTestCase.test_add_tc_filter_vxlan [540.005735s] ... FAILED | 15:23 |
ralonsoh | spending too much time | 15:23 |
slaweq | if You will see logs, there is always huge gap when nothing happens | 15:23 |
ralonsoh | because all workers are blocked in other tests | 15:23 |
slaweq | see for example: | 15:23 |
slaweq | 2020-11-30 10:03:00.937710 | controller | {1} neutron.tests.functional.agent.ovn.metadata.test_metadata_agent.TestMetadataAgent.test_agent_resync_on_non_existing_bridge [1.997655s] ... ok | 15:23 |
slaweq | 2020-11-30 10:43:39.465033 | RUN END RESULT_TIMED_OUT: [untrusted : opendev.org/openstack/neutron/playbooks/run_functional_job.yaml@master] | 15:23 |
ralonsoh | I know | 15:24 |
slaweq | those are 2 consecutive lines from the log | 15:24 |
slaweq | so there is nothing for about 40 minutes there | 15:24 |
ralonsoh | but IMO this is because the other workers are blocked checking something | 15:24 |
slaweq | and that was exactly the symptom of the issue with too much output and stestr | 15:24 |
slaweq | ralonsoh: maybe the root cause now is different than it was with that stestr issue | 15:26 |
slaweq | idk really | 15:26 |
lajoskatona | there was a new release of stestr recently, not sure though what it fixes | 15:26 |
slaweq | but at first glance it looks similar to what we had in the past | 15:26 |
slaweq | anyway, if someone will have some time, You can take a look at that bug :) | 15:28 |
slaweq | lets move on | 15:28 |
slaweq | next one | 15:28 |
slaweq | I noticed few times this week failures with TestSimpleMonitorInterface | 15:28 |
slaweq | like e.g.: | 15:29 |
slaweq | https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_93d/764365/1/gate/neutron-functional-with-uwsgi/93df51c/testr_results.html | 15:29 |
slaweq | I need to report LP for that | 15:29 |
slaweq | ralonsoh: isn't that related to some of Your changes maybe? It looks like something that You could work on :) | 15:29 |
ralonsoh | sure, I'll check it | 15:30 |
ralonsoh | and I'll report a LP | 15:30 |
ralonsoh | ahh I think you are talking about a fullstack patch | 15:31 |
slaweq | ralonsoh: in log I see something like: | 15:31 |
slaweq | 2020-11-30 10:40:38.271 61912 DEBUG neutron.agent.linux.utils [req-2aa4c2b1-90e9-4f8d-a708-61d18ad4f3ec - - - - -] Running command: ['sudo', '/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/bin/neutron-rootwrap', '/home/zuul/src/opendev.org/openstack/neutron/.tox/dsvm-functional/etc/neutron/rootwrap.conf', 'ovsdb-client', 'monitor', 'Interface', | 15:31 |
slaweq | 'name,ofport,external_ids', '--format=json'] create_process /home/zuul/src/opendev.org/openstack/neutron/neutron/agent/linux/utils.py:88 | 15:31 |
slaweq | 2020-11-30 10:40:38.321 61912 DEBUG neutron.agent.common.async_process [-] Output received from [ovsdb-client monitor Interface name,ofport,external_ids --format=json]: None _read_stdout /home/zuul/src/opendev.org/openstack/neutron/neutron/agent/common/async_process.py:264 | 15:31 |
slaweq | 2020-11-30 10:40:38.322 61912 DEBUG neutron.agent.common.async_process [-] Halting async process [ovsdb-client monitor Interface name,ofport,external_ids --format=json] in response to an error. stdout: [[]] - stderr: [[]] _handle_process_error /home/zuul/src/opendev.org/openstack/neutron/neutron/agent/common/async_process.py:222 | 15:31 |
ralonsoh | slaweq, in OVS there are two monitors | 15:31 |
ralonsoh | one for the ports and another one for the bridges | 15:31 |
ralonsoh | I migrated the bridges one | 15:32 |
ralonsoh | but I never finished the complex one, for ports | 15:32 |
ralonsoh | https://review.opendev.org/c/openstack/neutron/+/735201 | 15:32 |
slaweq | so this seems for me like monitor of Interfaces | 15:32 |
ralonsoh | yes | 15:32 |
slaweq | name,ofport,external_ids | 15:32 |
ralonsoh | (not ports, interfaces) | 15:32 |
slaweq | ok, do You want to investigate it? Or do You want me to check that? | 15:33 |
ralonsoh | I'll report and investigate it | 15:33 |
slaweq | thx | 15:33 |
slaweq | #action ralonsoh to report and check issue with TestSimpleMonitorInterface in functional tests | 15:34 |
slaweq | in the meeting agenda https://etherpad.opendev.org/p/neutron-ci-meetings there are more examples of the same failure | 15:34 |
slaweq | lets move now to fullstack tests | 15:34 |
slaweq | which are also not very stable recently | 15:34 |
slaweq | most often I saw issue with mysql killed by oom killer | 15:35 |
slaweq | bug reported https://launchpad.net/bugs/1906366 | 15:35 |
openstack | Launchpad bug 1906366 in neutron "oom killer kills mysqld process on the node running fullstack tests" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq) | 15:35 |
slaweq | I proposed patch to limit resources used there | 15:35 |
slaweq | but I saw that ralonsoh had some comments there | 15:35 |
slaweq | I didn't have time yet to address them | 15:35 |
ralonsoh | we test concurrency so I should not reduce to 1 the number of API workers | 15:36 |
ralonsoh | just this | 15:36 |
slaweq | ralonsoh: but is that only this one test which You actually mentioned? | 15:36 |
slaweq | or there are others also | 15:36 |
ralonsoh | I only found this one | 15:36 |
slaweq | because if that's the only test which needs 2 workers, I can set 2 workers only for that test | 15:36 |
ralonsoh | perfect | 15:37 |
slaweq | and use default to "1" for all other tests | 15:37 |
ralonsoh | yes, I think this is the only one | 15:37 |
slaweq | ok, great | 15:38 |
slaweq | so I will update my patch | 15:38 |
lajoskatona | that is a good example, thanks for mentioning | 15:38 |
slaweq | and also as I see in the results now, lowering number of test runner workers from 4 to 3 results in about 18 minutes more for whole job | 15:38 |
slaweq | so should be acceptable | 15:38 |
lajoskatona | slaweq: if you are overloaded I can take care of this api_worker change, that dhcp test comes from us. | 15:39 |
slaweq | lajoskatona: thx, if You could update my patch that would be great | 15:40 |
lajoskatona | slaweq: sure | 15:41 |
slaweq | lajoskatona: thx a lot | 15:41 |
slaweq | ok, lets move on to the scenario/tempest jobs | 15:41 |
slaweq | #topic Tempest/Scenario | 15:41 |
*** openstack changes topic to "Tempest/Scenario (Meeting topic: neutron_ci)" | 15:41 | |
slaweq | first of all neutron-tempest-plugin-api | 15:41 |
slaweq | I noticed quite often that there is one test failing | 15:42 |
slaweq | test_dhcp_port_status_active | 15:42 |
slaweq | e.g.: | 15:42 |
slaweq | https://1973ad26b23f3d5a6239-a05b796fccac2efb122cdf71ce7f0104.ssl.cf5.rackcdn.com/763828/4/check/neutron-tempest-plugin-api/bda79c4/testr_results.html | 15:42 |
slaweq | or | 15:42 |
slaweq | https://38bbf4ec3cadfd43de08-7d0e556db3075d25d1b91bbdcc8a4562.ssl.cf2.rackcdn.com/764108/6/check/neutron-tempest-plugin-api/cc5cbc6/testr_results.html | 15:42 |
slaweq | I need to report that one too | 15:42 |
slaweq | from what I saw in neutron-ovs-agent logs it seems that the issue is with rpc loop iteration which takes long time and due to that port is not becoming ACTIVE in 60 seconds | 15:44 |
slaweq | so one workaround for that could be to bump timeout in that test | 15:45 |
ralonsoh | is this because the VM is not spawned? | 15:45 |
slaweq | but I thought that maybe ralonsoh's patches which moves sleep(0) to the end of the rpc loop iteration may help with that | 15:45 |
slaweq | and second patch which lowers number of workers in the tests | 15:45 |
ralonsoh | agree | 15:45 |
slaweq | is it also for neutron-tempest-plugin-api job? | 15:45 |
slaweq | ralonsoh: there is really no vm spawned in that test. It is checking just dhcp port | 15:46 |
slaweq | but that port needs to be provisioned by L2 entity also to be ACTIVE | 15:46 |
ralonsoh | it takes more than one minute to set the device UP | 15:48 |
slaweq | ralonsoh: yes | 15:48 |
ralonsoh | that's insane... | 15:48 |
slaweq | and You can check in neutron-ovs-agent's logs that rpc loop iteration takes about 80-90 seconds in that specific time | 15:48 |
ralonsoh | yeah | 15:49 |
slaweq | so I thought about patch https://review.opendev.org/c/openstack/neutron/+/755313 that maybe will help with that issue | 15:49 |
slaweq | if that will be merged and we will still see the same issues, I will investigate it more | 15:49 |
ralonsoh | perfect | 15:49 |
*** artom has joined #openstack-meeting-3 | 15:50 | |
slaweq | #action slaweq to check if test_dhcp_port_status_active will be still failing after https://review.opendev.org/c/openstack/neutron/+/755313 will be merged | 15:50 |
slaweq | btw. lajoskatona if You can take a look at ^^ that would be great :) | 15:50 |
lajoskatona | slaweq: I'll check it | 15:51 |
slaweq | lajoskatona: thx | 15:51 |
slaweq | ok, let's move on | 15:51 |
slaweq | next issue which I found was in neutron-ovn-tempest-ovs-release-ipv6-only | 15:51 |
slaweq | I saw ssh failures in that job a few times | 15:51 |
slaweq | https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_cac/764356/1/check/neutron-ovn-tempest-ovs-release-ipv6-only/cacd054/testr_results.html | 15:51 |
slaweq | https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_08c/752795/24/check/neutron-ovn-tempest-ovs-release-ipv6-only/08c6400/testr_results.html | 15:51 |
*** irclogbot_3 has quit IRC | 15:53 | |
slaweq | in both cases it seems that even metadata wasn't reachable from the vm | 15:53 |
slaweq | do You know any issues which could cause that and are already reported/in progress? | 15:53 |
ralonsoh | yes but for OVS-DPDK | 15:53 |
ralonsoh | (I think this is not related) | 15:54 |
ralonsoh | https://review.opendev.org/c/openstack/neutron/+/763745 | 15:54 |
*** irclogbot_1 has joined #openstack-meeting-3 | 15:56 | |
slaweq | ok, I will report that issue on LP and ask someone from OVN squad to take a look at it | 15:57 |
slaweq | #action slaweq to report LP about SSH failures in the neutron-ovn-tempest-ovs-release-ipv6-only | 15:57 |
slaweq | and with that I think it's all for today | 15:57 |
ralonsoh | give me 10 secs, please. Fullstack related | 15:57 |
slaweq | from me | 15:57 |
slaweq | sure | 15:57 |
ralonsoh | liuyulong, https://review.opendev.org/c/openstack/neutron/+/738446 | 15:57 |
ralonsoh | please, take a look at the replies | 15:58 |
ralonsoh | and anyone else is welcome to review it | 15:58 |
ralonsoh | thanks a lot | 15:58 |
ralonsoh | (that's all) | 15:58 |
slaweq | ok | 15:59 |
slaweq | thx for attending the meeting | 15:59 |
ralonsoh | bye! | 15:59 |
slaweq | see You online | 15:59 |
slaweq | o/ | 15:59 |
lajoskatona | Bye! | 15:59 |
slaweq | #endmeeting | 15:59 |
*** openstack changes topic to "OpenStack Meetings || https://wiki.openstack.org/wiki/Meetings/" | 15:59 | |
openstack | Meeting ended Tue Dec 1 15:59:50 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:59 |
bcafarel | o/ | 15:59 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/neutron_ci/2020/neutron_ci.2020-12-01-15.00.html | 15:59 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/neutron_ci/2020/neutron_ci.2020-12-01-15.00.txt | 15:59 |
openstack | Log: http://eavesdrop.openstack.org/meetings/neutron_ci/2020/neutron_ci.2020-12-01-15.00.log.html | 15:59 |
*** obondarev has quit IRC | 16:05 | |
*** liuyulong has quit IRC | 16:07 | |
*** irclogbot_1 has quit IRC | 16:11 | |
*** irclogbot_0 has joined #openstack-meeting-3 | 16:12 | |
*** lajoskatona has left #openstack-meeting-3 | 16:15 | |
*** macz_ has joined #openstack-meeting-3 | 16:15 | |
*** macz_ has quit IRC | 16:17 | |
*** belmoreira has quit IRC | 20:53 | |
*** e0ne has joined #openstack-meeting-3 | 21:10 | |
*** e0ne has quit IRC | 21:26 | |
*** raildo has quit IRC | 21:46 | |
*** slaweq has quit IRC | 21:47 | |
*** slaweq has joined #openstack-meeting-3 | 21:49 | |
*** slaweq has quit IRC | 22:56 | |
*** haleyb has quit IRC | 22:58 | |
*** tosky has quit IRC | 23:57 |