amorin | yes, we have those lines here also, and actually we think it' | 08:48 |
---|---|---|
amorin | we think it's not that bad, it helps us make sure the agent is running fine | 08:48 |
amorin | we do some rotation every day anyway to prevent the files growing too much | 08:48 |
amorin | but I agree with you that this is maybe overkill | 08:49 |
amorin | moreover, we are using RPC ping to monitor the service, so this is mostly useful for a human | 08:49 |
amorin | I am not against removing such lines on my side | 08:49 |
felixhuettner[m] | i would also see these as quite valuable. | 08:50 |
felixhuettner[m] | We actually do monitoring on these to make sure that the l2 agents process their iterations in a reasonable time | 08:50 |
felixhuettner[m] | it also helps us to check if an issue we might be seeing is related to the l2 agent taking long | 08:50 |
amorin | we can still think about the best way to monitor. I do believe that monitoring agent health through logs is not a good idea and we try to avoid this most of the time | 08:53 |
felixhuettner[m] | yep, we actually use a custom patch for neutron that outputs the log in a parsable way | 09:00 |
felixhuettner[m] | https://gitlab.com/yaook/images/neutron-agent/-/blob/devel/neutron-agent/files/patch-ovs-iteration-status-victoria.patch | 09:00 |
amorin | nice! | 09:07 |
amorin | and you push that in /tmp | 09:07 |
amorin | the next move would be to push that in a "prometheus" compliant file | 09:07 |
amorin | and read that from node_exporter :) | 09:07 |
felixhuettner[m] | :) we use it in kubernetes liveness/readiness probes :) | 11:23 |
felixhuettner[m] | (or only readiness now, since killing a neutron-openvswitch-agent has not been the smartest move ever :) ) | 11:23 |
opendevreview | Merged openstack/ha-guide master: remove unicode literal from code https://review.opendev.org/c/openstack/ha-guide/+/851308 | 11:35 |
tobias-urdin | same, we use that output for monitoring so would not have it removed :) | 13:15 |
cstone | Interesting, thanks for the insight on how the log lines are used. We're moving to openvswitch and found it pretty noisy, but it sounds like it's useful to have the log lines to detect failures. We're not in production yet so maybe we haven't found out that they're useful yet. :D | 15:26 |
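
Note on the "prometheus compliant file" idea from 09:07: a minimal sketch of what that could look like, assuming node_exporter runs with its textfile collector pointed at a directory (via `--collector.textfile.directory`) and that something on the host already knows the iteration number and elapsed time of the last agent loop. The metric names, the file name `neutron_ovs_agent.prom`, and the helper functions are all hypothetical; none of them come from neutron or from the patch linked above.

```python
import os
import tempfile

# Hypothetical metric names, chosen for illustration only.
ELAPSED_METRIC = "neutron_ovs_agent_iteration_elapsed_seconds"
ITERATION_METRIC = "neutron_ovs_agent_iteration_number"


def render_metrics(iteration: int, elapsed_seconds: float) -> str:
    """Render the two gauges in the Prometheus text exposition format."""
    lines = [
        f"# HELP {ELAPSED_METRIC} Duration of the last agent loop iteration.",
        f"# TYPE {ELAPSED_METRIC} gauge",
        f"{ELAPSED_METRIC} {elapsed_seconds:.3f}",
        f"# HELP {ITERATION_METRIC} Number of the last completed iteration.",
        f"# TYPE {ITERATION_METRIC} gauge",
        f"{ITERATION_METRIC} {iteration}",
    ]
    return "\n".join(lines) + "\n"


def write_textfile(directory: str, iteration: int, elapsed_seconds: float,
                   name: str = "neutron_ovs_agent.prom") -> str:
    """Write the metrics into `directory` and return the final file path.

    The content goes to a temporary file first and is then renamed, so the
    collector never reads a half-written file. The temporary file does not
    end in .prom, which keeps the textfile collector from picking it up.
    """
    final_path = os.path.join(directory, name)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(render_metrics(iteration, elapsed_seconds))
        # mkstemp creates the file as 0600; make it readable by node_exporter.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

Calling `write_textfile("/path/to/textfile/dir", iteration, elapsed)` after each loop would keep the file current; the directory path here is a placeholder. An alert on the elapsed gauge, or on the file's age, could then replace log scraping for the "is the agent iterating in reasonable time" check discussed above.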