15:00:53 <witek> #startmeeting monasca
15:00:58 <openstack> Meeting started Wed May 22 15:00:53 2019 UTC and is due to finish in 60 minutes. The chair is witek. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:59 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:02 <openstack> The meeting name has been set to 'monasca'
15:01:10 <witek> hello everyone
15:01:13 <bandorf> Hello
15:01:14 <Dobroslaw> Hi
15:01:15 <dougsz> hey all
15:01:15 <koji_n> hello
15:01:25 <joadavis> greetings
15:01:37 <hosanai> hello
15:01:40 <witek> hi, agenda as usual:
15:01:44 <witek> https://etherpad.openstack.org/p/monasca-team-meeting-agenda
15:02:16 <witek> #topic monasca-thresh config tuning
15:03:17 <witek> bandorf: thanks for putting this up, could you shortly summarize which problems it addresses?
15:03:23 <bandorf> Regarding monasca-thresh: we made a change due to a customer issue where thresh eventually crashed because of too high load.
15:03:59 <bandorf> The change I uploaded is just a precondition so that the load can be controlled by setting topology.max.spout.pending
15:04:19 <bandorf> This setting will be used only if a message has a unique id.
15:04:41 <bandorf> The change simply generates and sets this unique id
15:05:32 <witek> what is the result of setting this value?
15:06:09 <bandorf> The value basically controls the number of messages in Storm not yet acknowledged.
15:06:22 <bandorf> Thus, RAM consumption can be controlled.
15:06:49 <witek> what about CPU?
15:07:00 <bandorf> In our scenario, RAM usage got higher and higher. Then the garbage collector ran frequently, without freeing much space.
15:07:20 <bandorf> The lower you set the value, the less RAM / CPU is being used.
15:07:36 <bandorf> However, it shouldn't be too low either.
15:07:51 <dougsz> Nice work bandorf, I will take it for a spin
15:08:10 <bandorf> CPU load gets high, I think, mainly because the garbage collector is running more frequently.
15:08:22 <witek> do you think we could set some reasonable default value in the DevStack plugin as well?
15:08:28 <bandorf> Anyway, if the value is decreased, CPU load will go down as well.
15:08:50 <bandorf> We did not do extensive testing with different combinations.
15:08:50 <dougsz> +1 for sensible default
15:09:31 <bandorf> Basically, we provided some more heap (256 MB -> 768 MB) and limited the number of not yet ACK'd messages to 500.
15:10:00 <bandorf> This seems a reasonable combination for smaller machines (16 GB - 32 GB RAM).
15:10:42 <bandorf> Anyway: with the change proposed, you can do this kind of limitation.
15:11:03 <bandorf> If the value max.spout.pending isn't set, nothing will change.
15:11:32 <bandorf> So, there's no risk in defining the unique id. You can then set max.spout.pending or not
15:12:16 <witek> right, sounds good to me
15:12:17 <bandorf> My understanding: devstack will mainly be used for testing, i.e. smaller machines.
15:12:32 <witek> regarding recommended values I see two options:
15:12:32 <bandorf> My idea would be to start with similar values as we did
15:12:58 <witek> * we can set these in the DevStack plugin as the reference setup
15:13:44 <witek> * write a short list of tunable options in the README
15:13:45 <bandorf> Of course, it would be nice to test on several machines, with several parameters.
15:13:56 <bandorf> But this would be some huge effort, I guess.
15:15:16 <bandorf> Regarding the generation of unique ids: I tested. On my machine, it takes 240 ms for 10**6 ids
15:16:27 <witek> short enough
15:17:25 <witek> I will definitely try these values in our setup and check what the impact is
15:18:17 <bandorf> I think you could play with values between 100 and 1000 for ...spout.pending.
15:18:33 <bandorf> I plan to do some further testing as well.
15:18:53 <witek> OK, sounds good
15:19:12 <dougsz> same - I will set a default in Kolla Ansible
15:19:20 <witek> then we can add some values to DevStack and document them in the README
15:19:43 <bandorf> So, can somebody then review the code I uploaded? This still has no significant impact, it's just the precondition
15:20:15 <witek> sure, adding myself
15:20:38 <dougsz> +1
15:20:42 <bandorf> OK, great, thanks
15:22:15 <witek> #action check optimal value of topology.max.spout.pending and heap size for monasca-thresh
15:22:43 <witek> any other comments?
15:22:47 <bandorf> In our config (docker), I changed one more param: # of partitions, I think. This had been set to 64. However, we don't use this parallelism, and it generates some overhead, to my understanding. We reduced it to 16
15:23:35 <witek> sounds reasonable to me
15:24:15 <bandorf> I haven't checked in devstack yet
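To make the mechanism bandorf describes concrete: Storm only tracks a tuple as "pending" when the spout emits it with a message id, and topology.max.spout.pending caps how many such un-acked tuples can be in flight at once, which in turn bounds the RAM they occupy. Below is a toy sketch of that bookkeeping, in Python purely for illustration; monasca-thresh itself is Java on Storm, and all names here are hypothetical:

```python
class ToySpout:
    """Illustrative model of Storm's max.spout.pending back-pressure;
    not monasca-thresh code."""

    def __init__(self, max_pending=500):       # 500 = the value bandorf's team used
        self.max_pending = max_pending
        self.pending = {}                      # msg_id -> message; this is the RAM being bounded
        self.next_id = 0

    def emit(self, message):
        if len(self.pending) >= self.max_pending:
            return False                       # cap reached: pause emitting until acks arrive
        self.next_id += 1                      # the unique id bandorf's change generates;
        self.pending[self.next_id] = message   # without one, Storm skips this tracking entirely
        return True

    def ack(self, msg_id):
        self.pending.pop(msg_id, None)         # an ack frees the slot (and the memory)
```

With no unique id there is no pending map at all, which matches bandorf's point that the change is risk-free on its own: behaviour only changes once max.spout.pending is actually set.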
15:24:37 <witek> #topic Falcon 2.0
15:24:59 <witek> https://storyboard.openstack.org/#!/story/2005695
15:25:11 <sc> interesting story
15:25:42 <witek> https://review.opendev.org/659264
15:26:10 <witek> unit and tempest tests for monasca-api are passing now
15:26:48 <witek> the code seems not to be backwards compatible
15:26:55 <witek> is it a problem for anyone?
15:27:16 <sc> monasca-tempest-python2-influxdb FAILURE in 42m 20s
15:27:44 <dougsz> Not for me
15:28:12 <witek> sc: that's a random failure, in PS7 it passed
15:28:23 <dougsz> Just the log API which needs updating, right?
15:28:27 <sc> OK
15:28:46 <sc> my idea is to go forward and not turn back
15:28:53 <witek> dougsz: right, Adrian works on it
15:29:11 <witek> I haven't seen any PS though
15:29:26 <dougsz> Great :) thanks to Adrian
15:30:09 <witek> perhaps you could sync with him tomorrow and see if there are any blockers?
15:30:36 <witek> otherwise I would suggest just setting this job to non-voting for now
15:30:41 <dougsz> sounds good to me
15:30:52 <witek> it has been blocking monasca-api for a while now
15:31:19 <dougsz> Will the log-api change get blocked on the Monasca API change not being merged?
15:31:43 <sc> yes, as far as I understand
15:31:57 <witek> I think not, it does not really require monasca-api
15:32:47 <sc> but IIRC log-api is failing the same way
15:33:08 <witek> right, so the monasca-api change should depend on monasca-log-api
15:33:27 <dougsz> anyway, as you say, if we hit an issue, we can just make the check non-gating temporarily
15:33:35 <witek> right
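For context on the kind of incompatibility a Falcon 2.0 bump brings: one commonly hit breaking change is that Falcon 2.0 stopped stripping trailing slashes from request paths by default. Whether this particular change is what affects monasca-api is an assumption here; the story above has the specifics. A minimal sketch of restoring the 1.x behaviour:

```python
import falcon

app = falcon.API()

# Falcon 2.0 changed the default of strip_url_path_trailing_slash from
# True to False, so a request to '/v2.0/metrics/' no longer matches a
# route registered as '/v2.0/metrics'. Restoring the 1.x default keeps
# old clients working while the routes are audited:
app.req_options.strip_url_path_trailing_slash = True
```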
15:34:14 <witek> #topic monasca-persister instrumentation
15:34:16 <witek> https://review.opendev.org/659264
15:34:47 <joadavis> As we had talked about deprecating the Java persister, I thought it might be worth finally merging that spec
15:35:07 <witek> sorry, wrong url
15:35:10 <witek> https://review.opendev.org/#/c/605782/
15:35:20 <joadavis> should we leave it in the rocky/approved location where it started, or move it forward to train/approved?
15:36:07 <witek> don't have a strong opinion
15:36:56 <joadavis> I'm ok leaving it as-is, if someone wants to workflow it
15:37:07 <witek> I think we can merge it as-is
15:38:01 <witek> #topic Train priorities
15:38:20 <witek> I've put together the results from the spreadsheet
15:38:25 <witek> https://review.opendev.org/660459
15:38:39 <witek> not all descriptions are included
15:39:12 <witek> please feel free to add
15:39:50 <witek> also, if anyone would like to take ownership of any task
15:40:22 <witek> I've pushed another change to fix the docs job in this repo:
15:40:28 <witek> https://review.opendev.org/660604
15:40:38 <witek> the table formatting changed
15:41:07 <witek> but I think it's not that critical
15:41:38 <witek> any comments on this?
15:42:03 <witek> #topic Meeting reminders
15:42:21 <witek> tomorrow is the meeting day :)
15:42:30 <joadavis> doc change looks good so far
15:42:42 <witek> joadavis: thanks
15:42:43 <sc> tomorrow????
15:43:00 <witek> yes, for the Telemetry team
15:43:08 <witek> and the Billing initiative
15:43:20 <sc> OK, time?
15:43:24 <witek> Telemetry - 2am and 8am UTC
15:43:24 <joadavis> Is the self-healing meeting this week also?
15:43:33 <witek> today, right!
15:44:16 <witek> Billing initiative - tomorrow, 2pm UTC
15:44:50 <sc> I'm busy both times
15:45:16 <witek> #topic open stage
15:45:19 <joadavis> I think I also have conflicts (church meeting, sleep)
15:45:48 <joadavis> for open stage - I am continuing on the py3 conversion for monasca-agent
15:45:49 <witek> joadavis: :) yes, sleep is an important calendar item
15:46:08 <witek> that's cool, thanks
15:46:13 <sc> joadavis: grand
15:46:45 <joadavis> I've hit some odd py2-to-py3 differences that are hard to google, but have made some progress this week
15:47:02 <witek> https://review.opendev.org/657829
15:47:10 <witek> that's the one, right?
15:47:14 <joadavis> yes
15:48:09 <joadavis> The failures on mock-open have me stumped. I think I can eventually figure out the binary string formatting.
15:49:33 <witek> adding Adrian to review as well
15:50:03 <witek> sc: did you have time to have a look at OSA?
15:50:41 <joadavis> I saw an email about OSA that someone was pushing to deprecate it, though sc had emailed that he would look at supporting it
15:50:54 <sc> not that much, I think the best decision is to let them abandon the roles and integrate new ones in the future
15:51:15 <witek> :(
15:52:11 <sc> tomorrow I'll have my mid-year review with my boss and I'll ask for time to develop on monasca
15:52:38 <witek> nice, fingers crossed
15:52:38 <sc> we had no time to speak since I got back from Denver
15:52:59 <sc> so everything depends on tomorrow afternoon at 2:30 GMT
15:53:25 <witek> do we have anything else?
15:53:55 <witek> I guess we can wrap up
15:54:00 <witek> thanks for joining
15:54:03 <joadavis> thanks all
15:54:13 <witek> see you next week
15:54:18 <dougsz> thanks all
15:54:20 <witek> #endmeeting
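On the mock-open failures joadavis mentions during open stage: a common py2-to-py3 stumbling block is that a file opened in binary mode must be mocked with bytes read_data, not str, since py3 no longer mixes the two in formatting or comparisons. A minimal sketch of the pattern, assuming that is the cause here (the file path and the parsing are hypothetical):

```python
from unittest import mock

# Under py2, str and bytes were interchangeable; under py3 a file opened
# with 'rb' yields bytes, so the mocked read_data must be bytes too, and
# any string formatting of the result needs an explicit decode.
m = mock.mock_open(read_data=b"cpu  4705 356 584\n")
with mock.patch("builtins.open", m):
    with open("/proc/stat", "rb") as f:   # hypothetical file the agent might read
        raw = f.read()

line = raw.decode("utf-8").splitlines()[0]
print("first field: %s" % line.split()[0])   # formatting raw bytes would print b'...' instead
```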