09:01:09 <ifat_afek> #startmeeting vitrage
09:01:10 <openstack> Meeting started Wed Jan 6 09:01:09 2016 UTC and is due to finish in 60 minutes. The chair is ifat_afek. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:01:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:01:14 <openstack> The meeting name has been set to 'vitrage'
09:01:18 <alexey_weyl> Hello There :)
09:01:23 <eyalb> hello
09:01:27 <ifat_afek> Hi everyone, welcome back
09:01:34 <umargolin> hi
09:02:23 <lhartal> hi all :)
09:02:27 <emalin> hi
09:02:30 <Ohad> hi
09:02:39 <lhartal> long time...
09:02:47 <nadav_yakar> hi
09:03:05 <elisha_r> hi all
09:03:07 <emalin> long time no see
09:06:23 <ifat_afek> Our agenda:
09:06:32 <ifat_afek> * Current status and progress
09:06:38 <ayah> hi
09:06:40 <ifat_afek> * Review action items
09:06:50 <ifat_afek> * Next steps
09:06:57 <ifat_afek> * Open Discussion
09:07:07 <ifat_afek> #topic Current status and progress
09:07:25 <ifat_afek> A short update about Vitrage documentation: Maty checked this issue, and we cannot place our documentation in the official openstack place (http://docs.openstack.org) until we are accepted under the big tent.
09:07:37 <ifat_afek> I suggest that for now we add our detailed design diagrams to the Vitrage main page
09:07:58 <ifat_afek> Update on what I did: I started working on a Nagios plugin for the synchronizer. As a first stage, I’m going to implement the get_all for nagios services (tests).
09:08:01 <inbar_stolberg> hello
09:08:16 <ifat_afek> For notifications, we have decided not to register to Nagios event handlers, as it raises security issues (how will Nagios call vitrage). Instead, we will take Nagios snapshots periodically and compare them.
09:08:34 <ifat_afek> I was also involved in the discussions on the consistency process (with alexey_weyl, elisha_r and Asi), and plan to document the use cases and challenges. We should continue with the design this week.
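[Editor's note] The approach from the 09:08:16 message (take Nagios snapshots periodically and compare them, instead of registering event handlers) could be sketched roughly as below. This is only an illustration, not Vitrage code: the function name `diff_snapshots`, the `(host, service) -> state` snapshot shape and the sample states are all assumptions made for the example, and how the snapshots are actually fetched from Nagios is left out.

```python
from typing import Dict, List, Tuple

# Assumed snapshot shape (hypothetical): {(host, service): state}
ServiceKey = Tuple[str, str]
Snapshot = Dict[ServiceKey, str]
Event = Tuple[str, ServiceKey, str]


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[Event]:
    """Compare two snapshots and return (event, key, state) tuples."""
    events: List[Event] = []
    for key, state in new.items():
        if key not in old:
            events.append(("added", key, state))      # service appeared
        elif old[key] != state:
            events.append(("changed", key, state))    # state transition
    for key, state in old.items():
        if key not in new:
            events.append(("removed", key, state))    # service disappeared
    return events


# Two consecutive (made-up) snapshots
previous = {("host-1", "CPU load"): "OK", ("host-1", "Disk"): "OK"}
current = {("host-1", "CPU load"): "CRITICAL", ("host-2", "Disk"): "OK"}

events = diff_snapshots(previous, current)
for event in events:
    print(event)
```

Only the differences between two snapshots are turned into events, so the monitored system never has to call back into the service that polls it.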
09:08:57 <ifat_afek> We worked on the first integration of the synchronizer, processor, graph, api and UI.
09:09:04 <ifat_afek> alexey_weyl, can you update?
09:09:10 <alexey_weyl> I would love to
09:09:25 <amir_gur> Hi
09:09:59 <alexey_weyl> I have performed the integration of the synchronizer + processor + transformer. Now it works and runs
09:10:51 <alexey_weyl> In addition I have created an openstack service for "vitrage-graph" which runs the synchronizer, entity graph, consistency and api handler oslo services
09:11:17 <alexey_weyl> if you want to run it, you can do: "sudo pip install -e ."
09:11:34 <alexey_weyl> which will install the services, and then you can run "vitrage-graph"
09:11:39 <ifat_afek> cool!
09:12:03 <emalin> very nice
09:12:21 <ifat_afek> Ohad, can you update about our discussions with PinPoint?
09:12:26 <elishar_r> cool!
09:12:43 <Ohad> We had a meeting with PinPoint, an OPNFV project aiming to provide an RCA framework for the NFVI and VIM layers, focusing on network issues.
09:12:56 <Ohad> We found good alignment between use cases from both projects covering failures from physical and virtual layers. PinPoint are working on a gap analysis to find out which information/data exists in different projects or tools in order to understand the root cause of failures and to define the APIs needed for it.
09:13:32 <elishar_r> @Ohad - can you explain what "aiming" means?
09:13:33 <Ohad> It looks like Vitrage is a perfect match for providing the get physical/virtual topology and mappings APIs, and we will keep working together on this.
09:14:08 <lhartal> @alexey_weyl: cool - let's do the integration next week, including zones, hosts and instances :)
09:14:52 <alexey_weyl> @lhartal: sounds great :)
09:15:25 <ifat_afek> alexey_weyl: this is great, let's do the full integration next week
09:15:59 <ifat_afek> #action alexey_weyl continue with the integration, including zones, hosts and instances
09:16:01 <Ohad> Elisha: PinPoint is a requirement project
09:16:35 <alexey_weyl> Ok :)
09:17:06 <ifat_afek> Ohad, elisha_r: we are in the process of finalizing vitrage API so we can send the definition to PinPoint, and verify it matches their use cases
09:17:29 <eyalb> I am still working on the api
09:17:45 <eyalb> first version was written with a simple filter
09:18:05 <eyalb> next we will use a more complex filter
09:18:22 <eyalb> I did an integration with the UI
09:18:55 <eyalb> they are using the client and were able to retrieve a mock graph
09:19:02 <Ohad> Eyalb: once we have a version, please share it with PinPoint
09:19:23 <eyalb> still need to work with Dani to call the api handler
09:19:33 <eyalb> ohad sure
09:19:43 <eyalb> that's it
09:19:51 <ifat_afek> eyalb, so once Dani is done, we will have an end-to-end integration?
09:20:10 <eyalb> hopefully yes
09:20:16 <ifat_afek> great
09:20:38 <ifat_afek> nadav_yakar, can you update about the synchronizer status?
09:20:59 <nadav_yakar> we have finalized the synchronizer design, which includes hosts, zones and instances snapshotting and notifications propagation. I have checked in the synchronizer's plugin execution framework and worked with Alexey to integrate it with the vitrage graph
09:22:06 <inbar_stolberg> get_all for host and zone are ready
09:22:40 <nadav_yakar> yes, the instances snapshotting process is also checked in
09:22:57 <ifat_afek> great
09:23:24 <ifat_afek> who else wants to update?
09:24:52 <elishar_r> I've started compiling information on how Vitrage will work with Neo4J or Titan (or any other persistent GraphDB) that can replace NetworkX.
09:25:05 <emalin> I did a little research about oslo.service and its multi-thread support
09:26:07 <emalin> It seems that we can use one process with multiple greenlet threads while working with networkx
09:26:33 <emalin> And multiple processes while working with Neo4j
09:26:51 <emalin> or another graph db that supports access from multiple processes
09:27:24 <alexey_weyl> Sounds great! good solution!
09:27:29 <idan_hefetz> my update: currently working to implement the Get Topology query api over NetworkX, so we can request a filtered subgraph of the entity graph.
09:27:43 <ifat_afek> emalin, so the design for networkx is finished for now?
09:28:26 <emalin> ifat_afek: if think is finished
09:28:34 <ifat_afek> great
09:28:42 <emalin> I think it's finished
09:29:01 <ifat_afek> any other updates? if not, let's move on
09:29:28 <ifat_afek> #topic Review action items
09:29:39 <ifat_afek> • ifat_afek check Aodh integration workaround and update Ceilometer blueprints
09:29:50 <ifat_afek> I sent an email to Aodh mailing list, and specifically to Julien and Ryota. Got no reply, could be because of the holidays. Will try again in a week or two.
09:30:12 <ifat_afek> I also emailed Aodh and asked them not to remove the ability to send notifications about alarm status changes (they planned to remove it), because we want to register to these notifications.
09:30:26 <ifat_afek> • nadav_yakar checkin a basic synchronizer FW for the vitrage graph to interface with and see that we are on the same page
09:30:40 <nadav_yakar> done
09:30:44 <ifat_afek> • ifat_afek check how we should add vitrage documentation
09:30:47 <gsagie> I have a question: is NetworkX persistent, or does it have a way to keep the graph in RAM?
09:30:55 <gsagie> sorry for stepping in :)
09:31:13 <nadav_yakar> in memory only
09:31:19 <gsagie> cool, thanks
09:31:55 <gsagie> looks like an interesting project
09:32:03 <elishar_r> Let me expand a bit on NetworkX
09:32:53 <elishar_r> We started with NetworkX b/c it's pure python (and the only significant graph DB project in python we could find).
09:33:06 <elishar_r> It's in-memory
09:33:51 <elishar_r> However, for good performance we are already working now on a design that will allow using a persistent graph DB, such as Neo4J and Titan, instead.
09:34:52 <elishar_r> Already now, in our design, we use an interface called "Graph Driver" that will remain the same even when we replace the graph DB backend in the future.
09:35:32 <gsagie> elishar_r: why is there a performance problem? I would assume that in-memory should be faster than a persistent one in general
09:35:42 <gsagie> or the package itself (NetworkX) is not so good?
09:36:34 <elishar_r> NetworkX itself has reasonable performance, as-is. however, there are a few other performance issues we want to address.
09:36:58 <elishar_r> first of all, it's not persistent. that means that if Vitrage fails, we need to rebuild the DB when it goes back online
09:37:40 <elishar_r> second of all, in pure python there is little support for real multi-threading, while when using a persistent DB like Neo4J we can access it in parallel from different sources.
09:38:26 <gsagie> elisha_r: thanks for explaining, it makes sense
09:38:45 <elishar_r> finally, doing things in-memory means we cannot leverage distribution, and limits the graph size as well. Those are the main points
09:38:53 <elishar_r> sure :)
09:39:18 <emalin> But networkx is very useful for dev env
09:39:30 <emalin> you don't need to install any 3rd party DB
09:40:00 <ifat_afek> BTW, we found out that networkx is already in use in openstack (by TaskFlow project if I'm not mistaken)
09:40:41 <emalin> And networkx is really fast
09:41:10 <ifat_afek> ok, let's go back to the action items
09:41:12 <ifat_afek> • ifat_afek check how we should add vitrage documentation
09:41:16 <ifat_afek> done, already discussed it
09:41:31 <ifat_afek> • decide on Vitrage next use cases
09:43:09 <ifat_afek> We talked about the second use case. It will include the evaluator for RCA purposes, nagios synchronizer (only snapshots), and physical resources synchronizer
09:43:35 <ifat_afek> #topic Next Steps
09:43:42 <ifat_afek> so we already discussed the integration
09:43:54 <ifat_afek> and the second use case
09:43:58 <ifat_afek> anything else?
09:44:48 <nadav_yakar> I want to adapt our synchronizer framework
09:44:53 <emalin> We hope to start working on a message bus listener plugin
09:44:54 <ifat_afek> #action finalize get topology API
09:44:56 <emalin> for nova
09:45:35 <nadav_yakar> adapt our framework per our design changes and oslo conventions
09:45:45 <ifat_afek> #action ifat_afek update the documentation on vitrage main page with our latest design diagrams (of vitrage graph and the synchronizer)
09:46:13 <ifat_afek> #topic Open Discussion
09:46:26 <ifat_afek> I had a look yesterday at Telemetry and Monasca IRC meeting logs, to see if they are doing anything that interests us.
09:46:48 <ifat_afek> Monasca started working on a cassandra time-series DB. This is not related directly to Vitrage, but if they introduce cassandra to Openstack and handle cassandra installation, it might help us in our future “real” graph-database implementation.
09:47:00 <ifat_afek> #link https://blueprints.launchpad.net/monasca/+spec/monasca-cassandra
09:47:19 <ifat_afek> As for Ceilometer, I noticed two interesting issues:
09:47:40 <ifat_afek> They want to improve their alarm rules. They will define complex conditions of and/or over several threshold conditions.
09:47:51 <ifat_afek> #link https://blueprints.launchpad.net/ceilometer/+spec/composite-threshold-rule-alarm
09:48:07 <ifat_afek> one of their future targets (i.e. not for mitaka?) is application level monitoring
09:48:17 <ifat_afek> #link https://wiki.openstack.org/wiki/Telemetry/RoadMap
09:48:27 <ifat_afek> anything else?
09:51:29 <lhartal> we are going to present the first Vitrage demo next week
09:51:37 <lhartal> We're planning to display the first use case: Vitrage show topology
09:52:12 <lhartal> including zones, hosts and instances
09:54:01 <gsagie> cool, is this going to be recorded?
09:54:07 <lhartal> yes
09:54:11 <gsagie> great
09:55:02 <lhartal> we will put a link on the Vitrage website
09:55:38 <ifat_afek> cool! we will also email it to openstack dev list
09:56:15 <lhartal> #action: presenting first Vitrage demo
09:56:34 <alexey_weyl> Thumbs up!
09:56:34 <ifat_afek> an update on behalf of Marina: she prepared a dev stack that we can use for our tempest tests
09:57:33 <ifat_afek> see you next week then
09:57:54 <gsagie> cya!
09:57:57 <eyalb> bye
09:58:05 <alexey_weyl> bye bye :)
09:58:09 <elishar_r> bye
09:58:12 <lhartal> bye
09:58:17 <amir_gur> bye
09:58:32 <ifat_afek> #endmeeting
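
[Editor's note] The "Graph Driver" idea from the 09:34:52 message (one interface that stays the same while the graph backend changes) might look something like the sketch below. None of this is taken from the Vitrage code base: the class names, method names and the dict-backed backend are hypothetical, chosen only to show how callers can depend on an abstract interface while an in-memory backend is later swapped for a persistent one. A plain dict stands in for NetworkX so the example has no third-party dependencies.

```python
import abc
from typing import Any, Dict, Set


class GraphDriver(abc.ABC):
    """Hypothetical backend-neutral interface; method names are illustrative."""

    @abc.abstractmethod
    def add_vertex(self, vertex_id: str, **properties: Any) -> None:
        """Create a vertex, or update its properties if it exists."""

    @abc.abstractmethod
    def add_edge(self, source_id: str, target_id: str, label: str) -> None:
        """Connect two existing vertices with a labeled, directed edge."""

    @abc.abstractmethod
    def get_vertex(self, vertex_id: str) -> Dict[str, Any]:
        """Return a copy of the vertex properties."""

    @abc.abstractmethod
    def neighbors(self, vertex_id: str) -> Set[str]:
        """Return ids of vertices reachable over one outgoing edge."""


class InMemoryGraphDriver(GraphDriver):
    """Dict-backed stand-in for an in-memory backend (assumption)."""

    def __init__(self) -> None:
        self._vertices: Dict[str, Dict[str, Any]] = {}
        self._edges: Dict[str, Dict[str, str]] = {}  # source -> {target: label}

    def add_vertex(self, vertex_id: str, **properties: Any) -> None:
        self._vertices.setdefault(vertex_id, {}).update(properties)
        self._edges.setdefault(vertex_id, {})

    def add_edge(self, source_id: str, target_id: str, label: str) -> None:
        for vertex_id in (source_id, target_id):
            if vertex_id not in self._vertices:
                raise KeyError(f"unknown vertex: {vertex_id}")
        self._edges[source_id][target_id] = label

    def get_vertex(self, vertex_id: str) -> Dict[str, Any]:
        return dict(self._vertices[vertex_id])

    def neighbors(self, vertex_id: str) -> Set[str]:
        return set(self._edges[vertex_id])


def build_topology(driver: GraphDriver) -> None:
    """Caller code sees only the interface, never the backend."""
    driver.add_vertex("zone-1", type="zone")
    driver.add_vertex("host-1", type="host")
    driver.add_vertex("instance-1", type="instance")
    driver.add_edge("zone-1", "host-1", "contains")
    driver.add_edge("host-1", "instance-1", "contains")


driver = InMemoryGraphDriver()
build_topology(driver)
print(driver.neighbors("zone-1"))
print(driver.get_vertex("instance-1"))
```

Because `build_topology` accepts any `GraphDriver`, a persistent backend would only need to implement the same four methods; the zone, host and instance wiring would stay untouched.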