16:00:54 <rakhmerov> #startmeeting Mistral
16:00:55 <openstack> Meeting started Mon Dec 21 16:00:54 2015 UTC and is due to finish in 60 minutes. The chair is rakhmerov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:56 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:58 <openstack> The meeting name has been set to 'mistral'
16:01:07 <rakhmerov> hi
16:01:11 <hparekh> hi
16:01:19 <melisha> Hi
16:01:36 <^Gal^> hi
16:01:37 <ddeja> hello
16:02:42 <LimorStotland_> hi
16:02:54 <akuznetsova_> hi
16:02:57 <rakhmerov> ok, let's get started
16:03:18 <gpaz> Hi everyone
16:03:23 <tshtilma> hi
16:03:24 <NikolayM> hi everyone!
16:03:30 <rakhmerov> hey
16:03:39 <rakhmerov> agenda: https://wiki.openstack.org/wiki/Meetings/MistralAgenda
16:04:19 <rakhmerov> #topic Review Action Items
16:04:33 <rakhmerov> 1. melisha: send "Mistral HA and multi-regional support" meeting minutes
16:04:35 <rakhmerov> done
16:04:40 <melisha> :-)
16:04:44 <rakhmerov> :)
16:05:03 <rakhmerov> melisha: we talked to akuznetsova_ quickly
16:05:13 <rakhmerov> looks like we don't have any docs on that
16:05:26 <akuznetsova_> yes
16:05:36 <rakhmerov> so we'll need to work on it just based on those meeting minutes and what's in our heads
16:05:51 <rakhmerov> 2. NikolayM: discuss with Renat and confirm appropriate time for planning M-2 on Wednesday Dec 16
16:05:55 <rakhmerov> that was done
16:05:55 <melisha> OK
16:06:03 <rakhmerov> 3. NikolayM: review patch "fix join on branch error"
16:06:05 <rakhmerov> done
16:06:10 <NikolayM> correct :)
16:06:18 <rakhmerov> #topic Current status (progress, issues, roadblocks, further plans)
16:06:30 <rakhmerov> ok, let's quickly share our statuses
16:06:45 <^Gal^> updating in regard to mistral dashboard
16:06:46 <^Gal^> 1. I'm working on "execution state" to show tooltip of "execution state info". That's the last thing in the execution blueprint.
16:06:46 <^Gal^> 2. I've registered 2 blueprints of missing pagination logic in mistral-engine and mistral-python-client for cron trigger and task.
16:06:47 <^Gal^> 3. I'm gonna start working on cron trigger screen real soon
16:06:47 <^Gal^> 4. Liat is doing well in action screen blueprint
16:07:00 <LimorStotland_> my status: waiting for review on bug https://bugs.launchpad.net/mistral/+bug/1527976
16:07:00 <openstack> Launchpad bug 1527976 in Mistral "Task timeout error message" [Undecided,In progress] - Assigned to Limor Stotland (limor-bortman)
16:07:08 <rakhmerov> rakhmerov's status: mostly doing reviews, also discussed HA with ALU and did planning for M-2
16:07:29 <hparekh> working on enabling mistral containers for kolla
16:07:38 <rakhmerov> ^Gal^: awesome, looks like you're making good progress )
16:07:50 <^Gal^> rakhmerov: yeah, things go fast now
16:07:53 <^Gal^> :)
16:07:54 <rakhmerov> LimorStotland_: ok
16:08:32 <LimorStotland_> thanks rakhmerov
16:08:38 <rakhmerov> hparekh: we reviewed your docker image patch, it looks good. Just one last thing I'd like to do is to actually build an image )
16:08:42 <rakhmerov> haven't done that yet
16:09:43 <hparekh> rakhmerov: actually in the previous meeting we discussed this and couldn't decide whether we have to build it on a per-merge basis or on a weekly basis
16:09:48 <akuznetsova_> I fixed dsvm gate
16:10:06 <rakhmerov> hparekh: hm... interesting
16:10:23 <rakhmerov> you mean we could have a sort of gate for this?
16:10:47 <rakhmerov> well, a gate might be a little bit heavy though to run it on every commit
16:10:57 <hparekh> yeah...
16:10:59 <rakhmerov> what are the alternatives?
16:10:59 <hparekh> hmmm
16:11:17 <rakhmerov> well, actually it depends on how long it takes to build
16:11:28 <rakhmerov> how long does it take for you to build it?
16:11:52 <hparekh> well it takes 20-30 min
16:12:00 <hparekh> on my machine
16:12:06 <rakhmerov> yeah... a little bit too long
16:12:18 <rakhmerov> hm.. ok, need to think about it
16:12:34 <hparekh> ok
16:12:42 <rakhmerov> maybe we could just do it manually once in a while to check that it works
16:13:18 <rakhmerov> ok, let's move forward
16:13:20 <hparekh> manually it works, I added a doc for that also
16:13:27 <rakhmerov> yep
16:13:35 <rakhmerov> anyway, good job hparekh!
16:13:37 <hparekh> about how to build it manually
16:13:43 <hparekh> Thanks !!
16:14:30 <rakhmerov> hparekh: I also see that there's a patch from you that you still need to work on
16:14:31 <rakhmerov> https://review.openstack.org/252317
16:14:40 <rakhmerov> address comments etc.
16:14:46 <rakhmerov> please don't forget about it
16:15:22 <rakhmerov> #topic "Work queue" pattern in oslo.messaging (https://review.openstack.org/#/c/256342/)
16:15:29 <hparekh> yeah actually I need to add FT for that.... I will do this once tempest plugin patch gets merged
16:15:40 <rakhmerov> FT?
16:15:45 <rakhmerov> what's that?
16:15:53 <hparekh> functional test
16:16:03 <rakhmerov> ooh, ok
16:16:10 <^Gal^> fruit tart
16:16:13 <rakhmerov> so
16:16:17 <rakhmerov> Work queue
16:17:13 <rakhmerov> as you probably know, we had a loooong discussion with oslo.messaging team about a new message delivery model that we need in Mistral
16:17:52 <rakhmerov> if interested, you can refer to https://review.openstack.org/#/c/229186/
16:18:20 <rakhmerov> WARNING: it's pretty complicated stuff and may consume your time significantly
16:18:22 <rakhmerov> :)
16:19:10 <rakhmerov> so based on that discussion one of the folks from oslo.messaging decided to push a spec in the patch I mentioned above
16:19:36 <rakhmerov> I wasn't going to discuss it here in detail
16:20:08 <rakhmerov> I just wanted to participate in that because it's one of the fundamental design things that we need to address
16:20:39 <rakhmerov> specifically, I'd like ddeja, _gryf, melisha to participate
16:20:44 <rakhmerov> and others too
16:21:25 <rakhmerov> in the discussion https://review.openstack.org/#/c/229186/ you can find a comprehensive vision that I have about this problem (see my latest comment)
16:21:25 <melisha> I will read it carefully. Thanks.
16:21:38 <rakhmerov> melisha: btw, it is related to HA
16:21:48 <rakhmerov> it's one of our current gaps
16:22:06 <melisha> rakhmerov: Yes. I remember. Thanks.
16:22:09 <rakhmerov> ok
16:22:29 <rakhmerov> that's all I wanted to share on this topic
16:22:29 <ddeja> rakhmerov: I'll read this. Thank you
16:22:36 <rakhmerov> ddeja: thanks )
16:23:07 <rakhmerov> ddeja: btw, is what you're doing on VM evacuation shared publicly somewhere?
16:23:35 <rakhmerov> if possible, I'd like to take a look at this
16:23:56 <ddeja> rakhmerov: yes, we have repo on github, but it's early work and right now I'm stuck with hardware problems
16:24:06 <rakhmerov> ddeja: ok
16:24:28 <ddeja> rakhmerov: I'll let you know as soon as I have something interesting
16:24:29 <rakhmerov> then I would ask you to let me know once something is ready to be looked at
16:24:36 <rakhmerov> yep, thanks
16:25:40 <rakhmerov> ok
16:26:09 <rakhmerov> so I also want to touch quickly on what we discussed with ALU on Mistral HA last week
16:26:21 <rakhmerov> #topic Mistral HA and stability
16:27:37 <rakhmerov> so, what we came to is that we need the following
16:28:05 <rakhmerov> * Add a gate that runs Mistral in HA mode
16:28:13 <rakhmerov> * Add more functional tests that are focused on HA tests
16:28:25 <rakhmerov> * Put together a list of known HA issues that are currently not handled (For example, if an executor dies immediately after dequeuing a task) and think of solutions.
16:28:46 <rakhmerov> * Expose Mistral load metrics to allow some external system to decide if it needs to scale Mistral components in / out.
16:29:13 <rakhmerov> the last one is a lower priority thing, I think it's for the far future
16:29:32 <rakhmerov> we already have a BP that's just called "Mistral HA"
16:29:44 <rakhmerov> but it doesn't contain any specific detailed info
16:30:21 <rakhmerov> I'm going to split it into smaller ones with prefix "HA" in their title
16:31:02 <rakhmerov> and as with oslo.messaging stuff I'd like everyone to think on the 3rd asterisk mark
16:31:48 <rakhmerov> by design, we have a number of issues and we need to complete the list of these issues
16:32:25 <rakhmerov> so that we could model situations for them in a special setup (most likely a new gate) and test them
16:32:58 <melisha> rakhmerov: Sounds good to me
16:33:14 <rakhmerov> like, for now, it's possible to lose a message going to an executor completely without an engine knowing about it
16:33:18 <rakhmerov> melisha: ok
16:33:36 <rakhmerov> team, any comments/remarks?
16:33:49 <NikolayM> nothing from me
16:33:58 <rakhmerov> ok
16:35:12 <rakhmerov> akuznetsova_: let's discuss tomorrow how we could approach implementing a new gate for this
16:35:40 <rakhmerov> #action rakhmerov: break "Mistral HA" BP down into smaller ones
16:36:03 <rakhmerov> #action rakhmerov, akuznetsova_: discuss implementing a new gate focusing on HA testing
16:36:04 <akuznetsova_> rakhmerov, sure
16:36:08 <rakhmerov> ok
16:36:15 <rakhmerov> #topic Open discussion
16:36:26 <melisha> rakhmerov, akuznetsova: If you need any help please feel free
16:36:30 <rakhmerov> folks, any other questions?
16:36:33 <rakhmerov> absolutely
16:36:36 <akuznetsova_> I have an idea of new tests that we can run on this gate
16:36:44 <rakhmerov> melisha: if needed, we can set up a meeting
16:36:55 <rakhmerov> akuznetsova_: can you share now?
16:37:00 <akuznetsova_> e.g. some destructive scenarios
16:37:07 <rakhmerov> yes
16:37:21 <akuznetsova_> like turning off one of the executor, and so on
16:37:28 <akuznetsova_> *executors
16:37:43 <rakhmerov> yeah
16:37:46 <rakhmerov> ok
16:37:51 <akuznetsova_> it is just an idea, need to think more
16:37:57 <rakhmerov> sure
16:38:33 <rakhmerov> I think we just need to push an initial draft of the spec for this into mistral-specs and start filling it with details
16:38:39 <melisha> We for example have a test that does SSH to localhost and cmd="hostname" and in the output we can see that it reached all executors
16:38:40 <rakhmerov> so that everyone could participate
16:39:09 <melisha> at least once
16:39:20 <rakhmerov> melisha: even though it was intended only for one of them?
16:40:05 <melisha> rakhmerov: No. We run 6 identical tasks - and expect to have at least 3 different hostnames
16:40:30 <rakhmerov> ok
16:41:27 <rakhmerov> melisha: is there a bug for this?
16:41:31 <rakhmerov> in LP
16:41:59 <melisha> rakhmerov: Why a bug? It is good :-)
16:42:09 <melisha> We have 3 executors on 3 VMs in our setup
16:42:14 <rakhmerov> hm... not sure I understood then
16:42:19 <rakhmerov> yes
16:42:29 <melisha> if we run WF with 6 tasks that do SSH to localhost and do hostname
16:42:39 <melisha> I expect that all executors will participate
16:42:46 <rakhmerov> right
16:43:09 <rakhmerov> I guess it will be the most probable behavior
16:43:19 <rakhmerov> but it is not necessary right
16:43:36 <melisha> Of course, so you could increase to 12 tasks instead of 6
16:43:40 <rakhmerov> it depends on how quickly executors poll messages and process them
16:43:47 <rakhmerov> yes
16:43:52 <rakhmerov> ok, got it
16:44:02 <melisha> If it is not going to all executors at least once - you probably have an issue
16:44:19 <rakhmerov> yes
16:44:47 <rakhmerov> melisha: btw, do you know that oslo dropped support of QPID?
16:45:05 <rakhmerov> I wonder if that's an issue for you
16:45:50 <melisha> rakhmerov: Yes. We saw that it will be removed in Mitaka. It forces us to move to RabbitMQ. Which is good :-)
16:46:48 <rakhmerov> melisha: ok, was there a strong reason for you to use QPid?
16:46:59 <rakhmerov> or it's just a historical choice?
16:47:45 <melisha> rakhmerov: We are using RedHat distro and when we began that was their default. At some point they moved the default to Rabbit but we did not change because we already had QPID in HA, etc.
16:48:38 <rakhmerov> ooh, I see
16:48:43 <rakhmerov> then it's ok
16:48:54 <rakhmerov> I was worrying for you )
16:49:24 <melisha> rakhmerov: Ohhhh... Thanks :-)
16:50:30 <rakhmerov> :)
16:50:33 <rakhmerov> alright
16:50:38 <rakhmerov> I don't have any more topics
16:50:47 <rakhmerov> anything else guys?
16:51:34 <melisha> I started looking into how to add Mistral specific rules to pep8. I saw it can theoretically be done with flake8. I am looking into it in my spare time
16:52:01 <rakhmerov> melisha: any specific reason for that?
16:52:15 <rakhmerov> what additional checks would you like to have?
16:52:58 <melisha> My goal is to save time for the committers and reviewers.
16:53:08 <melisha> For example, blank line before return statement
16:53:14 <rakhmerov> oooh
16:53:15 <melisha> Period at the end of comment
16:53:17 <melisha> ...
16:53:28 <rakhmerov> the things that I'm picky about )
16:53:32 <melisha> If we could fail on DevStack before push that will be great
16:53:33 <rakhmerov> yeah
16:54:06 <rakhmerov> it's actually my fault, I was gonna write a doc explaining "our" style rules
16:54:11 <rakhmerov> but never did that
16:54:27 <rakhmerov> if we could have pep8 checks for this it would be great
16:54:49 <akuznetsova_> rakhmerov, I already created bp and etherpad for that
16:55:01 <akuznetsova_> rakhmerov, you need to take a look at
16:55:13 <rakhmerov> link?
16:55:46 <akuznetsova_> rakhmerov, https://blueprints.launchpad.net/mistral/+spec/add-custom-code-style-checks
16:55:53 <rakhmerov> ok, thanks!
16:56:15 <rakhmerov> so, anything else?
16:56:22 <akuznetsova_> melisha, I think that it can be done like Rally did, using python
16:56:44 <rakhmerov> sahara has that too, AFAIK
16:56:48 <rakhmerov> we can peek
16:57:08 <melisha> akuznetsova_, rakhmerov: OK. I did not know that. I will take a look. Thanks!
16:57:16 <akuznetsova_> yes, they use quite scary regex)
16:57:52 <rakhmerov> :)
16:57:58 <rakhmerov> ok, let's end the meeting
16:58:10 <rakhmerov> unless you have anything else
16:58:15 <rakhmerov> counting down..
16:58:18 <rakhmerov> 5
16:58:19 <rakhmerov> 4
16:58:21 <rakhmerov> 3
16:58:22 <rakhmerov> 2
16:58:23 <rakhmerov> 1
16:58:27 <rakhmerov> bye!
16:58:28 <NikolayM> bye!
16:58:33 <rakhmerov> thanks for joining!
16:58:34 <LimorStotland_> bye
16:58:35 <melisha> bye!
16:58:35 <ddeja> bye
16:58:37 <hparekh> bye!
16:58:41 <rakhmerov> #endmeeting
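
Note on the custom style checks raised in the open discussion: the two rules melisha mentioned (a blank line before a return statement, a period at the end of a comment) could be prototyped roughly as below. This is only a sketch in plain Python, not the checks Rally or Sahara actually use and not tied to any flake8 plugin API. The function name `check_style`, the codes M001/M002, and the exception for a return that directly follows a block opener are all assumptions made for illustration.

```python
import re

# Hypothetical error codes, invented for this sketch.
M001 = "M001 comment should end with a period"
M002 = "M002 missing blank line before return"

# A line that consists only of a comment with some text in it.
COMMENT_RE = re.compile(r"^\s*#\s*(?P<text>\S.*)$")


def check_style(source):
    """Return a list of (line_number, message) for the two sample rules."""
    errors = []
    lines = source.splitlines()

    for i, line in enumerate(lines):
        stripped = line.strip()
        next_stripped = lines[i + 1].strip() if i + 1 < len(lines) else ""

        # Rule 1: the last line of a comment block must end with a period.
        match = COMMENT_RE.match(line)
        if match and not next_stripped.startswith("#"):
            if not match.group("text").rstrip().endswith("."):
                errors.append((i + 1, M001))

        # Rule 2: 'return' must follow a blank line, unless it is the
        # first statement of its block (previous line ends with ':').
        if stripped == "return" or stripped.startswith("return "):
            prev = lines[i - 1].strip() if i > 0 else ""
            if prev and not prev.endswith(":"):
                errors.append((i + 1, M002))

    return errors


BAD = "def f(x):\n    # Double the value\n    y = x * 2\n    return y\n"
GOOD = "def f(x):\n    # Double the value.\n    y = x * 2\n\n    return y\n"

if __name__ == "__main__":
    for lineno, message in check_style(BAD):
        print("line %d: %s" % (lineno, message))
```

A line-based check like this is easy to read but ignores strings and trailing comments; a real gate check would more likely be registered as a local flake8 check, as suggested in the discussion.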