13:00:17 <witek> #startmeeting monasca
13:00:18 <openstack> Meeting started Tue Mar  3 13:00:17 2020 UTC and is due to finish in 60 minutes.  The chair is witek. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:19 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:21 <openstack> The meeting name has been set to 'monasca'
13:00:28 <witek> hello everyone!
13:00:56 <witek> the agenda for today:
13:01:00 <witek> https://etherpad.openstack.org/p/monasca-team-meeting-agenda
13:01:21 <witek> please add if you have any other topics
13:01:52 <witek> #topic updating alarm definitions
13:02:24 <bandorf> OK, I will explain a bit
13:02:31 <witek> please, go on
13:02:37 <bandorf> This is the topic Martin brought up last week already.
13:02:45 <bandorf> I've investigated further.
13:03:18 <bandorf> The reason is quite simple: For an alarm definition, "time" and "times" are not used when updating an alarm definition.
13:03:42 <bandorf> Besides, I detected that even an update for "function" won't have any impact.
13:04:11 <bandorf> According to documentation, everything can be updated, apart from "metric name" and "match by".
13:04:38 <bandorf> And I don't see any reason why the time, times and fucntion shouldn't be updated.
13:04:46 <bandorf> I was just a bit confused:
13:05:24 <bandorf> The queries to create the sub alarm definitions and updating sub alarm definitions are really located very close to each other.
13:05:50 <bandorf> So, I just wanted to confirm if anybody has any idea for a reason why these fields shouldn't be updated?
13:07:18 <witek> I think times, time and function could be encapsulated in `expression`
13:07:44 <bandorf> expression is stored for the alarm definition.
13:08:12 <witek> alright
13:08:13 <bandorf> For sub-alarm-definition, the expression is split up into function, operator, metric-name, time, times etc
13:08:23 <witek> fine
13:08:39 <bandorf> So, the expression in the alarm definition is stored correctly, even after an update.
13:08:50 <dougsz> I wonder if it is a thresholder limitation, or just an oversight
13:08:52 <bandorf> But sub-alarm definitions are not properly handled
13:09:20 <bandorf> dougsz: That's exactly why I wanted to discuss it here
13:10:30 <dougsz> I would be tempted to hack in support for updates and see if it behaves
13:11:03 <bandorf> I did that already, but haven't tested extensively yet.
13:11:29 <dougsz> Cool, I am guessing there is no tempest coverage?
13:11:31 <bandorf> As said: "function" (min, max, avg, ...) won't be reflected either in the update
13:11:56 <bandorf> No, not at all. That's even not that easy to do - a further topic, maybe for the next call
13:12:27 <bandorf> For the other params that must not be updated, checks have been implemented to prevent that.
13:12:33 <bandorf> But not for those...
13:12:51 <witek> we have `test_update_alarm_definition`
13:12:51 <bandorf> No, not at all: it's related to tempest tests.
13:13:00 <witek> but it updates from empty expression
13:13:39 <bandorf> This error is even difficult to detect in tempest tests: You can't read the sub-alarm definitions via api. It's not a logical entity for the user
13:14:07 <bandorf> And the expression itself, stored for the alarm definition is always correct
13:14:13 <witek> also I guess it just reads from alarm_definitions table, so won't find this bug
13:14:36 <bandorf> Exactly.
13:14:40 <dougsz> An indirect test, like checking an alarm fires with the new expression could work
13:14:58 <bandorf> Yes, that's one of the possible solutions
13:15:03 <witek> we would need a test which proves for more realistic scenario with test values and triggered alarm state change
13:15:16 <witek> :)
13:15:48 <bandorf> Another option would be to extend the API, so that sub alarm definitions can be read
13:16:21 <witek> you mean the response body, right?
13:17:14 <bandorf> I thought about a new operation: get sub alarm defs for an alarm def
13:17:35 <bandorf> Not intended for end-users
13:18:59 <witek> I'm not sure I like it, I'd prefer adding more details to existing requests
13:19:44 <dougsz> My concern would be that it would expose what could be considered 'implementation detail'
13:19:45 <bandorf> Well, the "problem" is, that sub alarm definitions are not visible for end-users. They just see alarm defs and the related expression.
13:19:57 <bandorf> dougsz: Yes, you're right
13:20:28 <bandorf> What I anyway want to avoid is direct access to the database.
13:20:37 <bandorf> So, there are only 2 options:
13:21:15 <bandorf> Indirect testing (but that may take long, waiting for an alarm with e..g 2*2 minutes
13:21:23 <bandorf> Or extending the API somehow
13:22:19 <dougsz> I feel quite happy with the first approach. I think having to wait for the alarm to fire is acceptable.
13:23:14 <dougsz> It could all be done via tempest too right, just polling for alarms
13:23:55 <bandorf> Yes.
13:24:24 <dougsz> i suppose before putting effort into that test, it would be nice to know if your patch works with some manual testing
13:24:29 <bandorf> In an error situation you still don't know where the problem is located (from updating alarm def. to triggering of alarms).
13:24:39 <bandorf> However, I think, we could live with this.
13:25:17 <witek> we could add some debug logging in API
13:25:58 <dougsz> +1, that could help
13:26:12 <bandorf> dougsz: Yes, of course. As said, I did already some testing - lokks good so far. Now, wanted to check if anybody knows any reason why these params shouldn't be updated.
13:26:41 <bandorf> Yes, I started already to add some debug logging.
13:26:52 <dougsz> excellent
13:27:41 <bandorf> OK, let me summarize:
13:27:45 <witek> it seems to me to be a bug, indeed, another option could be, that thresholding is a limitation here
13:28:05 <bandorf> o Nobody knows a concrete reason why not to do the update, but not 100% sure
13:28:11 <dougsz> agreed, it either needs a doc fix, or a code fix
13:29:21 <bandorf> To dos: Fix the issue, extend logging, extend tempest tests with a scenario that checks via alarms
13:29:33 <bandorf> Doc fix would be sufficent from your point of view?
13:30:09 <dougsz> Actually, we would probably want to prevent the operation as well, if the patch doesn't work
13:30:28 <bandorf> This issue came up with a Japanese customer. I'm not sure yet whether they will request a fix
13:31:32 <witek> code fix is better, but if it's too expensive, e.g. involves logic change in thresh, we'd have to document and perhaps limit API usage
13:32:55 <bandorf> As said: I'm pretty sure fix in api is sufficient. Just creation of tempest tests might be some more effort. However, I will exec. more manual tests. We can discuss again next week, when I have the results.
13:33:22 <witek> great, good work bandorf
13:33:55 <dougsz> yes, thanks bandorf, good work!
13:34:16 <bandorf> Thanks
13:35:08 <witek> do we have anything else to discuss?
13:35:22 <dougsz> Nothing from me.
13:35:32 <bandorf> Not from me either
13:36:04 <witek> we've got some comments on events topic from last week in review
13:36:15 <witek> https://review.opendev.org/708335
13:37:36 <witek> I guess we can follow up next week again
13:38:35 <witek> #topic PTG
13:38:41 <witek> Monasca will not be holding own session during the PTG in Vancouver
13:38:57 <witek> but we should meet and discuss plans for the next release cycle
13:39:44 <witek> I though, as we are all in Europe, do you think it would make sense to meet F2F?
13:42:04 <bandorf> In geneneral, I think F2F is much better, communication is easier.
13:42:25 <bandorf> I would have to check if we will get a budget...
13:42:31 <dougsz> I think I exceeded my carbon emissions quota last week burning the monasca notification source code
13:42:37 <witek> haha
13:42:45 <dougsz> But it's not impossible
13:43:05 <witek> so we'd have to come to Bristol ;)
13:43:32 <dougsz> You'd be most welcome, but I don't want to impose anything on anyone :)
13:44:05 <witek> I mean, it would only make sense if all parties can participate, we're too small group
13:44:06 <bandorf> As said, I can't confirm any travel budget for us
13:44:21 <bandorf> Yes, you're right.
13:44:40 <dougsz> agreed, perhaps we can discuss again next week
13:44:41 <bandorf> The least travel cost overall would be created if we do it in Munich, right?
13:45:08 <bandorf> OK, good idea, let's discuss next week
13:45:26 <witek> agreed
13:45:47 <witek> alright, I guess that's all for today then
13:46:01 <witek> thanks for joining and for the discussion
13:46:05 <dougsz> thanks all, bye
13:46:09 <witek> and see you next week
13:46:11 <witek> bye
13:46:12 <bandorf> Thanks, bye everybody
13:46:22 <witek> #endmeeting