09:02:47 #startmeeting self-healing
09:02:48 Meeting started Wed Nov 21 09:02:47 2018 UTC and is due to finish in 60 minutes. The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:02:49 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:02:51 The meeting name has been set to 'self_healing'
09:03:08 Well it looks like we have critical mass for the first ever meeting :)
09:03:30 * witek waves
09:04:02 ifat_afek: yes, we discussed the meeting in Berlin. No, I meant to announce it on the list in the last day or two but I had too many things going on in my head so I forgot ;-/
09:04:06 hi witek!
09:04:25 and hi Xiangyu! really glad you could make it
09:04:32 aspiers: ok, thanks
09:04:56 #action aspiers to announce the meeting times on the list
09:05:22 #link http://eavesdrop.openstack.org/#Self-healing_SIG_Meeting
09:05:26 #topic Berlin recap
09:05:43 OK, let's do a quick recap of the two Berlin sessions
09:06:10 #link https://etherpad.openstack.org/p/berlin-self-healing-sig-brainstorm
09:06:20 That's the etherpad we used for both sessions
09:06:55 The first session was a BoF, and the second was the Forum's official working group session for the SIG
09:07:38 We tried to play the agenda for both by ear, depending on who was there and what they were interested in
09:08:14 #action aspiers to send a recap of the Berlin sessions to the list (and maybe blog about it too)
09:08:25 We had good attendance and some nice discussions
09:08:49 Also there seems to still be quite a bit of interest in compute node HA
09:09:55 There was one new interesting use case raised around memory leaks in OVS
09:10:19 I think we had a volunteer to submit a use case to the repo for that, but I forgot to take a name
09:10:48 #action try to find out who raised memory leak use case
09:11:12 I think we can talk about compute node HA as a separate topic in the next few minutes
09:11:29 but first, any questions about the items in the etherpad?
09:11:41 especially from people who weren't there
09:11:48 Unfortunately I couldn’t attend this summit, seems like you had interesting discussions
09:11:57 I’m going over them now...
09:12:37 Yeah it was good, people came and went from the two sessions so I think we must have had more than 40 people in total
09:13:24 yes, we had plenty of people even though the room was hard to reach
09:13:39 right, the room was the furthest away in the whole conference ;-)
09:13:43 ifat_afek: there were questions about what to use for notification
09:13:57 What kind of notification?
09:14:08 sorry, I mean monitoring ;-)
09:14:14 brain is not working yet
09:14:17 Oh :-)
09:14:30 I said that I thought the SIG should not be opinionated about that, but only assist in sharing knowledge
09:14:38 about what is possible, or what is being worked on
09:14:57 I’m familiar with Zabbix and a bit with Prometheus. I’m not sure if the SIG should recommend monitors
09:15:00 since some operators want to use monasca / zabbix etc.
09:15:08 I would expect a general architecture that allows using different monitors
09:15:13 exactly
09:15:17 Of course, Monasca as well
09:15:21 right
09:15:26 Monasca gives great monitoring possibilities, which should also be listed with other options
09:15:34 Right
09:16:02 #action aspiers to update the SIG scope on the wiki home page to be clear about not being opinionated about which components to use
09:16:02 Monasca has support for Prometheus, so Prometheus metrics can be scraped and stored in the Monasca backend
09:16:23 Sounds good
09:16:54 #topic compute HA
09:17:11 since we have Xiangyu here, let's talk about compute HA
09:17:23 there were two presentations on new solutions for compute HA
09:17:35 one from China Mobile
09:17:57 on Guardian
09:18:05 and I am trying to remember the other one :)
09:18:10 it's in my notes somewhere ...
09:18:30 ah, found it
09:18:36 https://www.openstack.org/summit/berlin-2018/summit-schedule/events/22100/a-better-vm-ha-solution-split-brain-solving-and-host-network-fault-awareness
09:18:57 sampath and I attended both talks and talked to the presenters afterwards
09:19:14 to ask if they would like to collaborate with upstream, especially with masakari
09:19:28 since masakari is already an official project doing compute HA
09:19:52 both of the new solutions have some interesting features which masakari does not have, but they also overlap a lot too
09:20:12 IIUC currently neither are open source yet, but there are plans to open source Guardian I think
09:20:30 Were they aware of Masakari?
09:20:45 I think so
09:21:28 the Fiberhome solution decided to avoid existing solutions, but the reasons they gave did not make sense to me
09:21:44 but maybe I was missing something
09:22:09 Xiangyu: would you like to say anything about Guardian?
09:22:22 e.g. possibility to collaborate upstream?
09:23:25 yes, we would like to share our gains.
09:23:57 BTW we heard from Chinese contributors in a separate session (TC community outreach) that IRC meetings are sometimes difficult for Chinese contributors, so I'm really glad to see you here :)
09:24:36 Xiangyu: I see you are on #openstack-masakari too, so we can discuss there also
09:25:02 Guardian is also focused on VM HA now, and has some special features.
09:25:28 yes it has some nice features, so it would be great to converge to one solution upstream
09:25:41 maybe we can set up a separate meeting with samP about this?
09:25:52 I'm also very glad to join in this meeting.
09:26:21 yes, of course.
09:26:56 OK, great! I am always available in IRC to discuss, since I know masakari reasonably well
09:27:09 and I know samP is keen to discuss too
09:27:20 OK, I guess we don't need to go into the details of that now
09:27:35 #topic raising awareness of the SIG
09:27:53 I raised this topic in a separate Forum session https://etherpad.openstack.org/p/expose-sigs-and-wgs
09:28:27 actually, Rico Lin raised it
09:28:48 but there are a few different aspects:
09:28:57 1. collecting feedback from users/ops
09:29:10 2. coordinating work across projects
09:29:15 3. finding developers
09:29:31 self-healing SIG currently needs help most with 1.
09:29:55 I think 2. is already working fine via Storyboard and we have a specs template too (not used yet, but hopefully soon!)
09:30:37 Sorry to interrupt, I need to leave for another meeting… I plan to join the second meeting today
09:30:38 IIRC the session agreed that 3. is not a good idea. If the devs are available, they are available. If not, there is no magic solution to find them
09:30:50 no problem ifat_afek, thanks a lot for coming and see you later!
09:31:26 if anyone has ideas on how to increase engagement with ops/users, it would be really helpful to hear them
09:31:47 either now or later on IRC or the mailing list
09:31:58 ad. 2 do we have the link to specs page on wiki?
09:32:10 good question
09:32:35 well, there is a link yes
09:32:45 second link under https://wiki.openstack.org/wiki/Self-healing_SIG#Community_Infrastructure_.2F_Resources
09:32:54 got it
09:33:00 but it's called "Official SIG documentation" so maybe it's not obvious enough
09:33:23 #action aspiers to clarify docs link on wiki page
09:33:46 #action aspiers to update meeting info in wiki page
09:33:52 just noticed that too ;-)
09:34:11 ad. 1 should we advertise StoryBoard as the place to collect user stories from ops?
09:34:24 is it the right tool?
09:34:27 I'm not sure
09:34:34 it's a lot better than nothing
09:34:38 :)
09:34:47 personally I think I would prefer them to first discuss on IRC or mailing list
09:34:51 and then submit to the git repo
09:35:07 but Storyboard could be used to track it too
09:35:19 Rico did originally suggest this
09:35:34 so we should probably try it and see how it goes
09:36:06 as of now, git repo holds implemented use cases and has a place for design documents
09:36:11 #action aspiers to put another call for user stories out to the list
09:36:22 right
09:36:42 but the git repo can hold unimplemented use cases too :)
09:36:53 that was always my intention
09:36:54 IRC or mailing list is good, proposing a spec is more effort and could be a barrier I think
09:37:03 although I'm not sure if I made that clear
09:37:14 I agree, spec is more about technical details of implementation
09:37:24 spec would only happen once there are developers on board
09:37:48 so maybe we need a "How To Contribute" document
09:37:51 to make this clear
09:38:02 Maybe some short instructions to write a story if not irc first
09:38:10 yes
09:38:49 ah
09:38:49 it is easy to write
09:38:52 I just remembered
09:38:55 https://docs.openstack.org/self-healing-sig/latest/meta/CONTRIBUTING.html
09:38:58 I wrote this :-)
09:39:34 #action improve CONTRIBUTING.rst to explain how / when to propose stories / specs
09:39:51 #action link to CONTRIBUTING.html from wiki page
09:40:05 OK
09:40:18 tojuvone: do you want to discuss Fenix a bit?
09:40:41 well, ifat_afek is not present
09:40:58 maybe just some words
09:41:05 ah, OK we can discuss later today too
09:41:07 #topic Fenix (new project)
09:41:14 So, there is a new project: Fenix - Rolling maintenance, upgrade and scaling
09:41:25 #link https://wiki.openstack.org/wiki/Fenix
09:41:27 I introduced it a bit in the session
09:41:48 and one can find link to it from the etherpad
09:42:01 read and reach out to know more
09:42:08 do you have an IRC channel?
09:42:29 just the bit for any self-healing project like Vitrage and Masakari
09:42:40 #link https://fenix.readthedocs.io/en/latest/notification/notifications.html#admin
09:43:01 So as Fenix does maintenance there is a notification telling host is in maintenance or not
09:43:23 so this awareness might be good for those projects
09:43:47 Yes, irc is: #openstack-fenix
09:43:52 * aspiers joins :)
09:44:19 OK cool
09:44:28 Maybe not more for here now
09:44:32 let's discuss again with ifat_afek at the meeting later today
09:44:38 thanks tojuvone :)
09:44:48 #topic AOB (Any Other Business)
09:44:55 does anyone else want to bring up any other topics?
09:45:04 witek: anything from you?
09:45:36 perhaps short update from previous virtual Vitrage PTG
09:45:46 oh that would be great!
09:45:58 I have joined to discuss Monasca - Vitrage integration
09:46:09 very nice :)
09:46:17 we've created three stories in StoryBoard for this
09:46:19 this was one of the big gaps in Vitrage IIUC
09:46:29 https://storyboard.openstack.org/#!/story/2004064
09:47:06 unfortunately we don't have anyone who could work on this now :(
09:47:22 ah :/
09:47:28 looking for developers
09:47:30 well, a plan is a good start anyway
09:47:43 maybe this SIG can help find developers
09:48:02 that's our hope, too
09:48:09 I have added the link in line 74 of https://etherpad.openstack.org/p/self-healing-project-integrations
09:48:39 thanks
09:48:44 I would love to work on that, but won't be able to any time soon :-/
09:48:47 maybe in the future
09:49:09 can you share a link to the virtual PTG?
09:49:43 the work item seems pretty well defined, so it should be well suited for new developers
09:49:49 nice
09:49:55 also, we're willing to help
09:50:08 I'd have to dig for the PTG link
09:50:36 I think it's https://etherpad.openstack.org/p/vitrage-ptg-queens
09:50:39 no
09:50:42 that's old
09:51:28 https://etherpad.openstack.org/p/vitrage-stein-ptg
09:51:38 that one, right?
09:52:03 yes, thanks
09:52:10 cool
09:52:21 OK, anything else anyone wants to mention?
09:52:25 I think we are pretty much done
09:53:08 nothing from my side
09:53:18 of course anyone is very welcome to also join the Americas/EMEA session in ~7 hours from now
09:53:29 hopefully ekcs will be joining then too
09:53:47 since I might have to drop early from that one :-o
09:54:07 and of course anyone is free to chat in this channel at any other times too
09:54:19 thanks a lot everyone and bye for now!
09:54:25 o/
09:54:39 thanks everyone, bye
09:54:49 thanks, byez
09:54:56 #endmeeting