14:00:10 <mlavalle> #startmeeting neutron_drivers
14:00:11 <openstack> Meeting started Fri Jun 14 14:00:10 2019 UTC and is due to finish in 60 minutes.  The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:12 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:14 <openstack> The meeting name has been set to 'neutron_drivers'
14:00:36 <ralonsoh> hi
14:00:52 <slaweq> hi
14:03:00 <haleyb> hi
14:03:33 <mlavalle> let's wait 1 more minute
14:05:06 <mlavalle> ok, let's get going. we need the minimum legally required quorum
14:05:23 <mlavalle> #topic RFEs
14:05:57 <mlavalle> well, maybe not.... anyways. let's start discussing it: https://bugs.launchpad.net/neutron/+bug/1824856
14:05:58 <openstack> Launchpad bug 1824856 in neutron "[RFE] Add new state "ERROR" for HA routers" [Wishlist,Confirmed]
14:06:36 <yamamoto> hi
14:06:40 <yamamoto> sorry late
14:06:40 <davidsha> hey
14:06:45 <davidsha> also sorry
14:07:19 <mlavalle> we have one RFE for today: https://bugs.launchpad.net/neutron/+bug/1824856
14:07:20 <openstack> Launchpad bug 1824856 in neutron "[RFE] Add new state "ERROR" for HA routers" [Wishlist,Confirmed]
14:08:30 <slaweq> so this was originally reported by jlibosva in https://bugzilla.redhat.com/show_bug.cgi?id=1574950 - but I took care of it recently and pushed it upstream
14:08:31 <openstack> bugzilla.redhat.com bug 1574950 in openstack-neutron "HA router can become master but won't be reported in database" [Medium,New] - Assigned to amuller
14:10:56 <mlavalle> This would be visible to the operator in the "status" attribute of the router, right?
14:11:52 <slaweq> yes, as third possible state
14:11:58 <slaweq> sorry, status :)
14:13:19 <mlavalle> I think at some point we had something similar in mind, since the constant already exists: https://opendev.org/openstack/neutron-lib/src/branch/master/neutron_lib/constants.py#L402
14:14:16 <slaweq> I wasn't even aware of this constant :)
14:15:25 <yamamoto> how do you report error_message to api user? add the field to routers?
14:17:17 <yamamoto> mlavalle: it was used by vmware etc iirc
14:17:19 <slaweq> yamamoto: yes, I think we can add some field like "status details" or something like that
14:20:33 <yamamoto> will it be machine-readable? or just for humans?
14:21:05 <slaweq> yamamoto: what do You mean exactly?
14:21:30 <slaweq> are You asking about possible information in this new field?
14:22:26 <yamamoto> something like sub-error-code, or some random text in natural language
14:22:27 <mlavalle> well, it was added to Neutron here: https://review.opendev.org/#/c/399505/
14:23:14 <slaweq> yamamoto: I was rather thinking about some text in natural language, maybe pass the error message directly from the node on which it happened
14:23:30 <slaweq> so e.g. info about "can't access file ..." or something like that
14:24:57 <yamamoto> ok i got it
14:25:32 <liuyulong> Will all errors be recorded, or will there always be only one?
14:25:33 <mlavalle> and the constant was added for ha routers
14:25:54 <slaweq> liuyulong: I don't think we should record all errors from the past
14:26:02 <slaweq> so I was thinking about only last one
14:26:36 <mlavalle> yeah, I think the last one is the relevant one
14:26:42 <liuyulong> Yes, this will be important, otherwise the database may not be happy if we store a large amount of information
14:26:44 <slaweq> mlavalle: but now this constant isn't used in neutron at all: http://codesearch.openstack.org/?q=ROUTER_STATUS_ERROR&i=nope&files=&repos=
14:26:57 <mlavalle> oh I know
14:27:03 <mlavalle> I pointed that out in the RFE
14:27:21 <slaweq> yes, I see now, sorry :)
14:27:49 <mlavalle> the way it was used was also related to ha routers
14:28:46 <slaweq> yes, but it was used to show error on db level, during router creation
14:29:07 <slaweq> and now we want to use it during the router's lifecycle :)
14:29:39 <liuyulong> So if the router is back to normal state, then we need to manually or automatically delete the error info?
14:30:03 <mlavalle> yeap
14:30:34 <mlavalle> I am good with this RFE
14:30:56 <slaweq> liuyulong: I was thinking that it would be done automatically - in the same way as the status is changed now, e.g. from BACKUP to ACTIVE and vice versa
14:31:10 <liuyulong> Such as, a router admin state down-up can sometimes fix the router status.
14:31:55 <yamamoto> mlavalle: there might not have been a neutron constant. but i remember some plugins were using "ERROR" for router status
14:33:09 <liuyulong> yamamoto, l3_ha mode can now enter the ERROR state.
14:33:25 <yamamoto> i'm fine with the RFE
14:33:29 <mlavalle> yes, in the git history around this constant I see patches related to the decoupling of plugins from the neutron repo
14:34:12 <liuyulong> https://github.com/openstack/neutron/blob/master/neutron/db/l3_hamode_db.py#L421
14:34:35 <mlavalle> it was probably reintroduced by the patch above and then removed when we adopted neutron-lib, which used another constant: https://review.opendev.org/#/c/489398/
14:35:00 <slaweq> mlavalle: oh, so ^^ it looks like it's still there, as in the initial patch You linked earlier, but now another constant is used :)
14:35:22 <mlavalle> the value is the same: "ERROR"
14:35:55 <liuyulong> It should be fixed IMO, : )
14:36:20 <slaweq> yes, value is the same :)
14:37:01 <mlavalle> haleyb: what do you think?
14:37:04 <slaweq> so in such case also we can put some detailed info in this new field :)
14:37:12 <haleyb> +1 from me
14:37:13 <liuyulong> slaweq, what's the proper time on the l3 agent side to automatically remove the error info?
14:38:34 <slaweq> liuyulong: this error would usually come from neutron-keepalived-state-change, so if the state changes there, it should also be changed in Neutron's db
14:41:05 <liuyulong> So the original bugzilla bug says, neutron-keepalived-state-change may have some potential problem, so we will also rely on that to update the database?
14:41:53 <slaweq> liuyulong: yes
14:43:15 <yamamoto> ERROR etc without prefix were originally for lbaas. i guess we can remove them at this point.
14:44:26 <liuyulong> One issue I can imagine: if neutron-keepalived-state-change is not spawned, is the error cleaned or stored to the DB?
14:45:34 <slaweq> liuyulong: I think that if it does not spawn, the last status shouldn't be cleaned
14:46:08 <liuyulong> Maybe I'm overly concerned. : )
14:46:32 <mlavalle> so let's move ahead with this RFE, then
14:46:42 <liuyulong> +1
14:46:49 <slaweq> thx
14:47:48 <mlavalle> That's it for today guys
14:47:54 <mlavalle> thanks for attending
14:48:00 <mlavalle> have a great weekend!
14:48:10 <mlavalle> #endmeeting
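
The behaviour discussed for the RFE (a third router status, a free-text details field that keeps only the last error, and automatic clearing when the router returns to a normal state) could be sketched roughly as below. This is only an illustration under assumptions, not Neutron code: the RouterRecord class, the status_details field name, the report_error/report_state helpers and the 255-character limit are all hypothetical. Only the "ERROR" value and the ACTIVE/BACKUP states come from the discussion.

```python
from dataclasses import dataclass
from typing import Optional

# Value confirmed in the meeting: the existing constant is "ERROR".
ROUTER_STATUS_ERROR = "ERROR"

# Assumption: cap the stored text so the database never holds large messages.
MAX_DETAILS_LEN = 255


@dataclass
class RouterRecord:
    """Hypothetical stand-in for a router row; not the real Neutron model."""
    status: str = "BACKUP"
    # Hypothetical field name, after the "status details" idea in the log.
    status_details: Optional[str] = None


def report_error(router: RouterRecord, message: str) -> None:
    """Mark the router as ERROR and keep only the most recent message."""
    router.status = ROUTER_STATUS_ERROR
    # Overwrite rather than append: only the last error is kept.
    router.status_details = message[:MAX_DETAILS_LEN]


def report_state(router: RouterRecord, new_status: str) -> None:
    """Record a normal state change and clear any stale error text."""
    router.status = new_status
    router.status_details = None


router = RouterRecord()
report_error(router, "can't access file ...")
report_state(router, "ACTIVE")  # details are cleared automatically
```

Overwriting instead of appending reflects the point that only the last error is relevant, and clearing inside report_state mirrors the idea that the details follow the same path as the existing BACKUP/ACTIVE transitions.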