13:00:17 <witek> #startmeeting monasca 13:00:18 <openstack> Meeting started Tue Mar 3 13:00:17 2020 UTC and is due to finish in 60 minutes. The chair is witek. Information about MeetBot at http://wiki.debian.org/MeetBot. 13:00:19 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 13:00:21 <openstack> The meeting name has been set to 'monasca' 13:00:28 <witek> hello everyone! 13:00:56 <witek> the agenda for today: 13:01:00 <witek> https://etherpad.openstack.org/p/monasca-team-meeting-agenda 13:01:21 <witek> please add if you have any other topics 13:01:52 <witek> #topic updating alarm definitions 13:02:24 <bandorf> OK, I will explain a bit 13:02:31 <witek> please, go on 13:02:37 <bandorf> This is the topic Martin brought up last week already. 13:02:45 <bandorf> I've investigated further. 13:03:18 <bandorf> The reason is quite simple: For an alarm definition, "time" and "times" are not used when updating an alarm definition. 13:03:42 <bandorf> Besides, I detected that even an update for "function" won't have any impact. 13:04:11 <bandorf> According to documentation, everything can be updated, apart from "metric name" and "match by". 13:04:38 <bandorf> And I don't see any reason why the time, times and fucntion shouldn't be updated. 13:04:46 <bandorf> I was just a bit confused: 13:05:24 <bandorf> The queries to create the sub alarm definitions and updating sub alarm definitions are really located very close to each other. 13:05:50 <bandorf> So, I just wanted to confirm if anybody has any idea for a reason why these fields shouldn't be updated? 13:07:18 <witek> I think times, time and function could be encapsulated in `expression` 13:07:44 <bandorf> expression is stored for the alarm definition. 13:08:12 <witek> alright 13:08:13 <bandorf> For sub-alarm-definition, the expression is split up into function, operator, metric-name, time, times etc 13:08:23 <witek> fine 13:08:39 <bandorf> So, the expression in the alarm definition is stored correctly, even after an update. 13:08:50 <dougsz> I wonder if it is a thresholder limitation, or just an oversight 13:08:52 <bandorf> But sub-alarm definitions are not properly handled 13:09:20 <bandorf> dougsz: That's exactly why I wanted to discuss it here 13:10:30 <dougsz> I would be tempted to hack in support for updates and see if it behaves 13:11:03 <bandorf> I did that already, but haven't tested extensively yet. 13:11:29 <dougsz> Cool, I am guessing there is no tempest coverage? 13:11:31 <bandorf> As said: "function" (min, max, avg, ...) won't be reflected either in the update 13:11:56 <bandorf> No, not at all. That's even not that easy to do - a further topic, maybe for the next call 13:12:27 <bandorf> For the other params that must not be updated, checks have been implemented to prevent that. 13:12:33 <bandorf> But not for those... 13:12:51 <witek> we have `test_update_alarm_definition` 13:12:51 <bandorf> No, not at all: it's related to tempest tests. 13:13:00 <witek> but it updates from empty expression 13:13:39 <bandorf> This error is even difficult to detect in tempest tests: You can't read the sub-alarm definitions via api. It's not a logical entity for the user 13:14:07 <bandorf> And the expression itself, stored for the alarm definition is always correct 13:14:13 <witek> also I guess it just reads from alarm_definitions table, so won't find this bug 13:14:36 <bandorf> Exactly. 13:14:40 <dougsz> An indirect test, like checking an alarm fires with the new expression could work 13:14:58 <bandorf> Yes, that's one of the possible solutions 13:15:03 <witek> we would need a test which proves for more realistic scenario with test values and triggered alarm state change 13:15:16 <witek> :) 13:15:48 <bandorf> Another option would be to extend the API, so that sub alarm definitions can be read 13:16:21 <witek> you mean the response body, right? 13:17:14 <bandorf> I thought about a new operation: get sub alarm defs for an alarm def 13:17:35 <bandorf> Not intended for end-users 13:18:59 <witek> I'm not sure I like it, I'd prefer adding more details to existing requests 13:19:44 <dougsz> My concern would be that it would expose what could be considered 'implementation detail' 13:19:45 <bandorf> Well, the "problem" is, that sub alarm definitions are not visible for end-users. They just see alarm defs and the related expression. 13:19:57 <bandorf> dougsz: Yes, you're right 13:20:28 <bandorf> What I anyway want to avoid is direct access to the database. 13:20:37 <bandorf> So, there are only 2 options: 13:21:15 <bandorf> Indirect testing (but that may take long, waiting for an alarm with e..g 2*2 minutes 13:21:23 <bandorf> Or extending the API somehow 13:22:19 <dougsz> I feel quite happy with the first approach. I think having to wait for the alarm to fire is acceptable. 13:23:14 <dougsz> It could all be done via tempest too right, just polling for alarms 13:23:55 <bandorf> Yes. 13:24:24 <dougsz> i suppose before putting effort into that test, it would be nice to know if your patch works with some manual testing 13:24:29 <bandorf> In an error situation you still don't know where the problem is located (from updating alarm def. to triggering of alarms). 13:24:39 <bandorf> However, I think, we could live with this. 13:25:17 <witek> we could add some debug logging in API 13:25:58 <dougsz> +1, that could help 13:26:12 <bandorf> dougsz: Yes, of course. As said, I did already some testing - lokks good so far. Now, wanted to check if anybody knows any reason why these params shouldn't be updated. 13:26:41 <bandorf> Yes, I started already to add some debug logging. 13:26:52 <dougsz> excellent 13:27:41 <bandorf> OK, let me summarize: 13:27:45 <witek> it seems to me to be a bug, indeed, another option could be, that thresholding is a limitation here 13:28:05 <bandorf> o Nobody knows a concrete reason why not to do the update, but not 100% sure 13:28:11 <dougsz> agreed, it either needs a doc fix, or a code fix 13:29:21 <bandorf> To dos: Fix the issue, extend logging, extend tempest tests with a scenario that checks via alarms 13:29:33 <bandorf> Doc fix would be sufficent from your point of view? 13:30:09 <dougsz> Actually, we would probably want to prevent the operation as well, if the patch doesn't work 13:30:28 <bandorf> This issue came up with a Japanese customer. I'm not sure yet whether they will request a fix 13:31:32 <witek> code fix is better, but if it's too expensive, e.g. involves logic change in thresh, we'd have to document and perhaps limit API usage 13:32:55 <bandorf> As said: I'm pretty sure fix in api is sufficient. Just creation of tempest tests might be some more effort. However, I will exec. more manual tests. We can discuss again next week, when I have the results. 13:33:22 <witek> great, good work bandorf 13:33:55 <dougsz> yes, thanks bandorf, good work! 13:34:16 <bandorf> Thanks 13:35:08 <witek> do we have anything else to discuss? 13:35:22 <dougsz> Nothing from me. 13:35:32 <bandorf> Not from me either 13:36:04 <witek> we've got some comments on events topic from last week in review 13:36:15 <witek> https://review.opendev.org/708335 13:37:36 <witek> I guess we can follow up next week again 13:38:35 <witek> #topic PTG 13:38:41 <witek> Monasca will not be holding own session during the PTG in Vancouver 13:38:57 <witek> but we should meet and discuss plans for the next release cycle 13:39:44 <witek> I though, as we are all in Europe, do you think it would make sense to meet F2F? 13:42:04 <bandorf> In geneneral, I think F2F is much better, communication is easier. 13:42:25 <bandorf> I would have to check if we will get a budget... 13:42:31 <dougsz> I think I exceeded my carbon emissions quota last week burning the monasca notification source code 13:42:37 <witek> haha 13:42:45 <dougsz> But it's not impossible 13:43:05 <witek> so we'd have to come to Bristol ;) 13:43:32 <dougsz> You'd be most welcome, but I don't want to impose anything on anyone :) 13:44:05 <witek> I mean, it would only make sense if all parties can participate, we're too small group 13:44:06 <bandorf> As said, I can't confirm any travel budget for us 13:44:21 <bandorf> Yes, you're right. 13:44:40 <dougsz> agreed, perhaps we can discuss again next week 13:44:41 <bandorf> The least travel cost overall would be created if we do it in Munich, right? 13:45:08 <bandorf> OK, good idea, let's discuss next week 13:45:26 <witek> agreed 13:45:47 <witek> alright, I guess that's all for today then 13:46:01 <witek> thanks for joining and for the discussion 13:46:05 <dougsz> thanks all, bye 13:46:09 <witek> and see you next week 13:46:11 <witek> bye 13:46:12 <bandorf> Thanks, bye everybody 13:46:22 <witek> #endmeeting