13:00:46 <witek> #startmeeting monasca
13:00:47 <openstack> Meeting started Tue Feb 11 13:00:46 2020 UTC and is due to finish in 60 minutes. The chair is witek. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:48 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:50 <openstack> The meeting name has been set to 'monasca'
13:00:56 <witek> hello everyone
13:01:02 <adriancz> Hi
13:01:05 <chaconpiza> Hello
13:01:08 <Dobroslaw> hi
13:01:45 <witek> the agenda seems to be light
13:01:49 <witek> https://etherpad.openstack.org/p/monasca-team-meeting-agenda
13:01:55 <witek> let's start
13:02:04 <witek> #topic ujson replacement
13:02:47 <witek> I've sent a question to the requirements team on openstack-discuss
13:02:56 <witek> http://lists.openstack.org/pipermail/openstack-discuss/2020-February/012376.html
13:03:59 <witek> orjson does not seem to be a good option for inclusion in global requirements
13:04:10 <witek> because of its complicated build process
13:04:52 <witek> do we have any performance evaluation results already?
13:05:24 <chaconpiza> I compared ujson, json and simplejson in devstack
13:05:48 <chaconpiza> all of these are already available in devstack
13:06:21 <chaconpiza> This is the way I did the checks:
13:06:37 <chaconpiza> 1. After stacking devstack, I stopped the log-agent, metric-agent and the persister.
13:06:50 <chaconpiza> 2. I modified the scale_perf/agent_simulator.py from Monasca-perf in order to work with devstack.
13:07:06 <chaconpiza> 3. I sent 690 calls with 1000 points each, using 4 Python processes.
Every point had this form:
13:07:18 <chaconpiza> {'dimensions': {'cloud_name': 'monasca',
13:07:18 <chaconpiza> 'cluster': 'compute',
13:07:18 <chaconpiza> 'component': 'vm',
13:07:18 <chaconpiza> 'container': 'container_0',
13:07:18 <chaconpiza> 'control_plane': 'ccp',
13:07:19 <chaconpiza> 'hostname': 'agent_0',
13:07:20 <chaconpiza> 'resource_id': '34c0ce14-9ce4-4d3d-84a4-172e1ddb26c4',
13:07:22 <chaconpiza> 'service': 'service_0',
13:07:24 <chaconpiza> 'tenant_id': '71fea2331bae4d98bb08df071169806d',
13:07:28 <chaconpiza> 'zone': 'nova'},
13:07:30 <chaconpiza> 'name': 'aaa.perf_428',
13:07:32 <chaconpiza> 'timestamp': 1581424933000,
13:07:34 <chaconpiza> 'value': 781}
13:07:40 <chaconpiza> where: 'name' can be [aaa.perf_000 to aaa.perf_999]
13:07:54 <chaconpiza> 'value' is a random number in [000 to 999]
13:08:03 <chaconpiza> and 'hostname': ['agent_0' to 'agent_3']
13:08:20 <chaconpiza> This point is quite similar to those we have in production.
13:08:37 <chaconpiza> Cardinality: 1000 metrics X 4 agents X 1 container X 1 service = 4000
13:08:50 <chaconpiza> 4. Changed the batch_size of the metrics in the persister from 30 to 1000
13:09:07 <chaconpiza> 5. Added a log.warn at the json.loads call https://github.com/openstack/monasca-persister/blob/master/monasca_persister/repositories/utils.py#L21
13:09:57 <chaconpiza> 6. Started the persister manually, redirecting the output to a file, and let it consume all the 690000 points waiting in kafka.
13:10:26 <chaconpiza> 7. Checked the time difference in the log from the first deserialization (log.warn) to the last.
13:11:21 <chaconpiza> I repeated the process for simplejson and json, by changing the import at
13:11:27 <chaconpiza> https://github.com/openstack/monasca-persister/blob/master/monasca_persister/repositories/utils.py#L16
13:12:21 <chaconpiza> The results: json took 2 min 20 sec in the first test and 2 min 22 sec in the second test.
13:13:02 <chaconpiza> ujson took 1 min 53 sec in the first test and 1 min 51 sec in the second
13:13:23 <chaconpiza> simplejson took 1 min 59 sec in both runs.
13:14:52 <witek> from this test simplejson seems to be a good compromise
13:15:15 <chaconpiza> and at least in devstack it is already available without any change
13:16:09 <witek> we could test rapidjson as well, but I don't expect it to be much faster than ujson
13:16:32 <chaconpiza> Yes, I have all the mini-infrastructure to test it
13:17:22 <chaconpiza> *Note that I only tested the deserialization (json.loads), I left the serialization (json.dumps) using ujson.
13:17:36 <chaconpiza> https://github.com/openstack/monasca-persister/blob/a4addd0f5e8c60f631a77fd280f2810b8f222203/monasca_persister/repositories/influxdb/metrics_repository.py#L48
13:18:47 <witek> that's fine, we were interested in deserialization because the persister is the bottleneck
13:18:55 <chaconpiza> cool
13:20:26 <witek> any other comments on that?
13:21:01 <Dobroslaw> well, it's good if there is no need to add it to global requirements and it is fast enough
13:21:49 <witek> agree
13:22:15 <chaconpiza> Probably the profiling test on the internet show differences because of the size of the string to be deserialized.
13:22:27 <chaconpiza> *tests
13:22:48 <witek> sure, it depends strongly on the structure of the object
13:23:38 <chaconpiza> yes, like the nesting level of the object
13:24:10 <chaconpiza> Apart from the dimensions, we use a 'flat' object
13:25:23 <witek> so do we agree on using simplejson, or do you still want to test rapidjson as well?
13:26:10 <chaconpiza> can rapidjson be installed easily with pip?
13:26:19 <adriancz> I think simplejson should be good enough
13:26:21 <chaconpiza> I can do the check just after the meeting
13:29:10 <witek> I'm fine with simplejson as well, we can still change it in the future if it proves to cause any problems
13:29:44 <witek> #topic aob
13:29:58 <witek> do we have any other topics for today?
13:30:38 <chaconpiza> not from my part
13:31:42 <witek> OK, let's wrap it up then
13:31:52 <witek> thanks for joining
13:31:57 <witek> and for the tests
13:32:04 <chaconpiza> thanks
13:32:10 <witek> see you next time
13:32:10 <Dobroslaw> thank you
13:32:11 <bandorf> Thanks, bye everybody
13:32:19 <witek> #endmeeting
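
Note appended to the log: the deserialization comparison discussed under "ujson replacement" can be approximated outside devstack with a small in-process micro-benchmark. The sketch below is only an illustration, not the procedure used in the meeting. It builds points shaped like the sample pasted at 13:07 (1000 metric names X 4 agents), serializes them once, and times only the `loads` call of each parser that happens to be installed. The function names (`make_point`, `make_messages`, `time_loads`, `available_parsers`) are hypothetical, and because Kafka, batching and the persister itself are left out, the absolute times will not match the figures reported above.

```python
import importlib
import json
import random
import time


def make_point(index, agent):
    """Build one metric point shaped like the sample pasted in the meeting."""
    return {
        "dimensions": {
            "cloud_name": "monasca",
            "cluster": "compute",
            "component": "vm",
            "container": "container_0",
            "control_plane": "ccp",
            "hostname": "agent_%d" % agent,
            "resource_id": "34c0ce14-9ce4-4d3d-84a4-172e1ddb26c4",
            "service": "service_0",
            "tenant_id": "71fea2331bae4d98bb08df071169806d",
            "zone": "nova",
        },
        "name": "aaa.perf_%03d" % index,
        "timestamp": 1581424933000,
        "value": random.randint(0, 999),
    }


def make_messages(metrics=1000, agents=4):
    """Serialize metrics x agents points, one JSON string per point."""
    return [json.dumps(make_point(i, a))
            for a in range(agents) for i in range(metrics)]


def time_loads(loads, messages, repeat=3):
    """Return the best wall-clock time (seconds) to deserialize all messages."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        for message in messages:
            loads(message)
        best = min(best, time.perf_counter() - start)
    return best


def available_parsers(names=("json", "simplejson", "ujson")):
    """Map module name -> loads function, skipping modules not installed."""
    parsers = {}
    for name in names:
        try:
            parsers[name] = importlib.import_module(name).loads
        except ImportError:
            continue
    return parsers


if __name__ == "__main__":
    messages = make_messages()
    for name, loads in available_parsers().items():
        print("%-10s %.3f s" % (name, time_loads(loads, messages)))
```

Taking the best of several repeats reduces noise from other processes. Other parsers could be tried by adding their module names to `names`, assuming they expose a `loads` function.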