14:00:10 <mlavalle> #startmeeting neutron_drivers
14:00:11 <openstack> Meeting started Fri Jun 14 14:00:10 2019 UTC and is due to finish in 60 minutes. The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:12 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:14 <openstack> The meeting name has been set to 'neutron_drivers'
14:00:36 <ralonsoh> hi
14:00:52 <slaweq> hi
14:03:00 <haleyb> hi
14:03:33 <mlavalle> let's wait 1 more minute
14:05:06 <mlavalle> ok, let's get going. we need the minimum legally required quorum
14:05:23 <mlavalle> #topic RFEs
14:05:57 <mlavalle> well, maybe not.... anyways. let's start discussing it: https://bugs.launchpad.net/neutron/+bug/1824856
14:05:58 <openstack> Launchpad bug 1824856 in neutron "[RFE] Add new state "ERROR" for HA routers" [Wishlist,Confirmed]
14:06:36 <yamamoto> hi
14:06:40 <yamamoto> sorry late
14:06:40 <davidsha> hey
14:06:45 <davidsha> also sorry
14:07:19 <mlavalle> we have one RFE for today: https://bugs.launchpad.net/neutron/+bug/1824856
14:07:20 <openstack> Launchpad bug 1824856 in neutron "[RFE] Add new state "ERROR" for HA routers" [Wishlist,Confirmed]
14:08:30 <slaweq> so this was originally reported by jlibosva in https://bugzilla.redhat.com/show_bug.cgi?id=1574950 - but I took care of it recently and pushed it upstream
14:08:31 <openstack> bugzilla.redhat.com bug 1574950 in openstack-neutron "HA router can become master but won't be reported in database" [Medium,New] - Assigned to amuller
14:10:56 <mlavalle> This would be visible to the operator in the "status" attribute of the router, right?
14:11:52 <slaweq> yes, as third possible state
14:11:58 <slaweq> sorry, status :)
14:13:19 <mlavalle> I think at some point we had something similar in mind, since the constant already exists: https://opendev.org/openstack/neutron-lib/src/branch/master/neutron_lib/constants.py#L402
14:14:16 <slaweq> I wasn't even aware of this constant :)
14:15:25 <yamamoto> how do you report error_message to api user? add the field to routers?
14:17:17 <yamamoto> mlavalle: it was used by vmware etc iirc
14:17:19 <slaweq> yamamoto: yes, I think we can add some field like "status details" or something like that
14:20:33 <yamamoto> will it be machine-readable? or just for humans?
14:21:05 <slaweq> yamamoto: what do You mean exactly?
14:21:30 <slaweq> are You asking about possible information in this new field?
14:22:26 <yamamoto> something like sub-error-code, or some random text in natural language
14:22:27 <mlavalle> well, it was added to Neutron here: https://review.opendev.org/#/c/399505/
14:23:14 <slaweq> yamamoto: I was rather thinking about some text in natural language, maybe pass there directly error message from node on which it happened
14:23:30 <slaweq> so e.g. info about "can't access file ..." or something like that
14:24:57 <yamamoto> ok i got it
14:25:32 <liuyulong> All errors will be recorded, or there always be only one
14:25:33 <mlavalle> and the constant was added for ha routers
14:25:54 <slaweq> liuyulong: I don't think we should record all errors from the past
14:26:02 <slaweq> so I was thinking about only last one
14:26:36 <mlavalle> yeah, I think the last one is the relevant one
14:26:42 <liuyulong> Yes, this will be important, otherwise the database may not happy if we store large information
14:26:44 <slaweq> mlavalle: but now this constant isn't used in neutron at all: http://codesearch.openstack.org/?q=ROUTER_STATUS_ERROR&i=nope&files=&repos=
14:26:57 <mlavalle> oh I know
14:27:03 <mlavalle> I pointed that out in the RFE
14:27:21 <slaweq> yes, I see now, sorry :)
14:27:49 <mlavalle> the way it was used was also related to ha routers
14:28:46 <slaweq> yes, but it was used to show error on db level, during router creation
14:29:07 <slaweq> and now we want to use during router's lifecycle :)
14:29:39 <liuyulong> So if the router is back to normal state, then we need to manually or automatically delete the error info?
14:30:03 <mlavalle> yeap
14:30:34 <mlavalle> I am good with this RFE
14:30:56 <slaweq> liuyulong: I was thinking that it would be done automatically - in same way like now status is changed e.g. from BACKUP to ACTIVE and vice versa
14:31:10 <liuyulong> Such as, a router admin state down-up can sometimes fix the router status.
14:31:55 <yamamoto> mlavalle: there might not have been a neutron constant. but i remember some plugins were using "ERROR" for router status
14:33:09 <liuyulong> yamamoto, l3_ha mode can now enter the ERROR state.
14:33:25 <yamamoto> i'm fine with the RFE
14:33:29 <mlavalle> yes, in the git history around this constant I see patches related to the decoupling of plugins from the neutron repo
14:34:12 <liuyulong> https://github.com/openstack/neutron/blob/master/neutron/db/l3_hamode_db.py#L421
14:34:35 <mlavalle> it was probably reintroduced by the patch above and then removed when we adopted neutron-lib, which used another constant: https://review.opendev.org/#/c/489398/
14:35:00 <slaweq> mlavalle: oh, so ^^ looks that it's still there as in initial patch You linked earlier but now other constant is used :)
14:35:22 <mlavalle> the value is the same: "ERROR"
14:35:55 <liuyulong> It should be fixed IMO, : )
14:36:20 <slaweq> yes, value is the same :)
14:37:01 <mlavalle> haleyb: what do you think?
14:37:04 <slaweq> so in such case also we can put some detailed info in this new field :)
14:37:12 <haleyb> +1 from me
14:37:13 <liuyulong> slaweq, when or what's the proper time in the l3 agent side to automatically remove the error info?
14:38:34 <slaweq> liuyulong: this error would come usually from neutron-keepalived-state-change so if this would be changed, it should be changed in Neutron's db also
14:41:05 <liuyulong> So the original bugzilla bug says, neutron-keepalived-state-change may have some potential problem, so we will also rely on that to update the database?
14:41:53 <slaweq> liuyulong: yes
14:43:15 <yamamoto> ERROR etc without prefix were originally for lbaas. i guess we can remove them at this point.
14:44:26 <liuyulong> One issue I can imagine is if neutron-keepalived-state-change does not spawned, error clean or store to DB?
14:45:34 <slaweq> liuyulong: I think that now if it will not spawn, last status shouldn't be cleaned
14:46:08 <liuyulong> Maybe I'm over concerned. : )_
14:46:32 <mlavalle> so let's move ahead with this RFE, then
14:46:42 <liuyulong> +1
14:46:49 <slaweq> thx
14:47:48 <mlavalle> That's it for today guys
14:47:54 <mlavalle> thanks for attending
14:48:00 <mlavalle> have a great weekend!
14:48:10 <mlavalle> #endmeeting
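
Editor's note: the behaviour agreed in the discussion (a third "ERROR" status, a free-text details field that keeps only the last error, and automatic clearing when the router leaves ERROR) might be sketched roughly as follows. This is only an illustration, not Neutron code: the class RouterRecord, the method report and the field name status_details are hypothetical, and the set of valid statuses is assumed from the statuses named in the log.

```python
from dataclasses import dataclass
from typing import Optional

# The log confirms the constant's value is "ERROR"; it is redefined here
# so the sketch does not depend on neutron-lib.
ROUTER_STATUS_ERROR = "ERROR"

# Assumption: only the statuses mentioned in the discussion are modelled.
VALID_STATUSES = {"ACTIVE", "BACKUP", ROUTER_STATUS_ERROR}


@dataclass
class RouterRecord:
    """Hypothetical stand-in for a router row in the database."""

    router_id: str
    status: str = "BACKUP"
    # Hypothetical field: human-readable text of the last error only.
    status_details: Optional[str] = None

    def report(self, status: str, error_message: Optional[str] = None) -> None:
        """Apply a status report, e.g. one sent by the state-change monitor."""
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown router status: {status}")
        self.status = status
        if status == ROUTER_STATUS_ERROR:
            # Overwrite rather than append, so no error history accumulates.
            self.status_details = error_message
        else:
            # Leaving ERROR clears the details automatically.
            self.status_details = None


router = RouterRecord("router-1")
router.report(ROUTER_STATUS_ERROR, "can't access file ...")
router.report("ACTIVE")  # status_details is cleared again here
```

If no report arrives at all (for example, the monitor never spawns), nothing in this sketch touches the record, which matches the point made at 14:45:34 that the last status should not be cleaned in that case.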