15:00:06 <d0ugal> #startmeeting mistral
15:00:07 <openstack> Meeting started Mon Oct 1 15:00:06 2018 UTC and is due to finish in 60 minutes. The chair is d0ugal. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:08 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:11 <openstack> The meeting name has been set to 'mistral'
15:00:22 <d0ugal> Hey everyone. It's the Monday office hour
15:00:25 <d0ugal> Who is around?
15:00:40 <d0ugal> I am back and rested after 2 weeks vacation - so send me your review requests and other questions :)
15:03:08 <therve> d0ugal: That RPC bug was interesting
15:03:17 <therve> Also, welcome back :)
15:03:21 <d0ugal> therve: Thanks!
15:03:33 <bobh> o/
15:03:37 <d0ugal> therve: Yeah, I gotta admit I didn't fully understand the fix
15:03:44 <d0ugal> therve: must have been hard to track down?
15:03:47 <therve> d0ugal: I think there is an issue with running in a WSGI container and crons
15:03:52 <d0ugal> ah
15:04:02 <therve> d0ugal: Yeah a bit, easier once we got a proper trace
15:04:12 <therve> Still took a good day to reproduce it properly
15:04:41 <therve> d0ugal: https://github.com/openstack/mistral/blob/master/mistral/api/app.py#L58
15:04:46 <d0ugal> That is quite quick by openstack standards :)
15:05:08 <therve> I think that todo is critical if you don't use mistral-server
15:05:37 <d0ugal> Right, makes sense
15:05:46 <d0ugal> Running it in the API process always felt like a massive hack
15:06:50 <therve> d0ugal: Which brings the related remark: crons aren't probably tested by the devstack jobs
15:07:02 <therve> And I'm not sure by the tripleo jobs either
15:08:07 <d0ugal> We have discussed re-writing the cron trigger subsystem a few times. There is some hope it should be easy to do on the new scheduler work rakhmerov has been working on
15:08:37 <d0ugal> PING rakhmerov, apetrich, bobh, mcdoker181818, akovi, hardikjasani (Sorry, forgot to do the pings at the start of the office hour)
15:09:42 <openstackgerrit> Dougal Matthews proposed openstack/mistral stable/rocky: Add a release note for Ic98e2db02abd8483591756d73e06784cc2e9cbe3 https://review.openstack.org/606969
15:09:55 <openstackgerrit> Dougal Matthews proposed openstack/mistral stable/queens: Add a release note for Ic98e2db02abd8483591756d73e06784cc2e9cbe3 https://review.openstack.org/606970
15:10:04 <openstackgerrit> Dougal Matthews proposed openstack/mistral stable/pike: Add a release note for Ic98e2db02abd8483591756d73e06784cc2e9cbe3 https://review.openstack.org/606971
15:16:17 <d0ugal> Once these merge I am going to propose rocky, queens and pike releases.
15:17:23 <d0ugal> We got a few new bugs while I was away, so I'm going to do some triage: https://bugs.launchpad.net/mistral/+bugs?search=Search&field.status=New&orderby=id&start=0
15:19:21 <d0ugal> https://bugs.launchpad.net/mistral/+bug/1795068
15:19:21 <openstack> Launchpad bug 1795068 in Mistral "screen-mistral-engine.txt size is causing logstash index OOM" [Undecided,New]
15:19:27 <d0ugal> Damn, our logs are too verbose :)
17:20:59 <openstackgerrit> Merged openstack/mistral stable/queens: Add a release note for Ic98e2db02abd8483591756d73e06784cc2e9cbe3 https://review.openstack.org/606970
17:52:43 <openstackgerrit> Merged openstack/mistral stable/rocky: Add a release note for Ic98e2db02abd8483591756d73e06784cc2e9cbe3 https://review.openstack.org/606969
17:52:43 <openstackgerrit> Merged openstack/mistral stable/pike: Add a release note for Ic98e2db02abd8483591756d73e06784cc2e9cbe3 https://review.openstack.org/606971
06:32:30 <openstackgerrit> Merged openstack/mistral-lib master: Removed older version of python added 3.5 https://review.openstack.org/606340
11:13:17 <openstackgerrit> Dougal Matthews proposed openstack/mistral master: Use eventlet-aware threading events https://review.openstack.org/557487
12:17:32 <therve> d0ugal: Here to chat about that event patch?
12:17:50 <d0ugal> therve: Yeah? What do you make of it?
12:19:34 <therve> d0ugal: I have the symptoms, but I don't think I have the conditions of the fix
12:19:46 <d0ugal> Right
12:19:47 <therve> d0ugal: It should only happen when running kombu right?
12:19:57 <d0ugal> I'm still working my way through the bugzilla discussion
12:20:53 <therve> I have the issue right now on a deployed machine.
12:21:01 <therve> It's 100% cpu, doing nothing but epoll_wait
12:21:27 <therve> Currently it's the executor, but engine and event-engine had the same issue earlier on
12:23:08 <d0ugal> ouch
12:23:35 <d0ugal> therve: What do you mean by "I don't think I have the conditions of the fix"
12:23:50 <therve> d0ugal: The fix seems to be specific to the kombu server?
12:23:59 <d0ugal> Right
12:24:10 <therve> tripleo uses the regular oslo backend
12:24:34 <d0ugal> Right, so wouldn't that be fixed in oslo-messaging?
12:25:04 <therve> If it was the culprit, yes, but I don't think it is
12:25:17 <therve> Otherwise other services would be affected
12:26:47 <d0ugal> right
12:26:49 <d0ugal> Good point
12:28:09 <therve> Arf
12:28:12 <therve> Easy to reproduce
12:28:21 <therve> d0ugal: Another SIGHUP issue
12:29:16 <d0ugal> therve: Fun.
12:29:21 <d0ugal> How do you reproduce it?
12:29:39 <therve> Just kill -HUP the engine or the executor
12:30:56 <therve> Hum doesn't work the second time obviously :D
12:31:12 <therve> Oh it does
12:35:57 <d0ugal> lol
16:23:17 <openstackgerrit> Thomas Herve proposed openstack/mistral master: Wait for rpc server on shutdown https://review.openstack.org/607306
16:23:29 <therve> d0ugal: ^^ if you're still around
16:23:46 <d0ugal> therve: just leaving, I'll check it out tomorrow.
16:23:55 <therve> np
16:24:44 <therve> thrash|bbl maybe when you come back
16:24:47 * therve away
18:59:42 <thrash> therve: wassup?
19:20:32 <therve> thrash: Was looking at another bug where mistral services were using 100% cpus
19:20:43 <therve> thrash: I think https://review.openstack.org/607306 fixes it
19:21:12 <thrash> therve: awesome!
19:22:20 <therve> Still related to SIGHUP handling...
08:12:40 <rakhmerov> d0ugal: hi, is it a problem to backport a change if it adds a new config option (say with a default value that keeps it backward compatible)?
10:25:31 <rakhmerov> d0ugal: :) Please let me know when you can
10:25:54 <rakhmerov> maybe we can ask for an exception like we did before
10:26:10 <rakhmerov> I may potentially need to backport such a change
10:28:22 <d0ugal> rakhmerov: I think it is okay if there is a default and means no user action is required
10:28:46 <rakhmerov> yeah
10:28:47 <rakhmerov> ok
10:29:02 <rakhmerov> anyway, I'll ask to support me, if needed ;)
10:52:01 <d0ugal> rakhmerov: sure
20:35:27 <openstackgerrit> Oleg Ovcharuk proposed openstack/mistral master: Add started_at and finished_at to task execution. https://review.openstack.org/607703
04:51:07 <openstackgerrit> Renat Akhmerov proposed openstack/mistral master: WIP: change workflow completion logic https://review.openstack.org/607807
05:39:45 <openstackgerrit> Renat Akhmerov proposed openstack/mistral master: WIP: change workflow completion logic https://review.openstack.org/607807
07:40:53 <openstackgerrit> Oleg Ovcharuk proposed openstack/mistral master: Add started_at and finished_at to task execution. https://review.openstack.org/607703
07:58:40 <openstackgerrit> Renat Akhmerov proposed openstack/mistral master: WIP: change workflow completion logic https://review.openstack.org/607807
08:04:38 <openstackgerrit> Renat Akhmerov proposed openstack/mistral master: WIP: change workflow completion logic https://review.openstack.org/607807
08:06:39 <d0ugal> therve: Hey - should we also backport this? https://review.openstack.org/#/c/607306/
08:07:01 <therve> d0ugal: Depends what happens around log rotation...
08:07:10 <therve> If you want/need SIGHUP to work, yes :)
08:07:42 <d0ugal> I guess I do :)
08:07:48 <d0ugal> therve: would you mind adding a release note?
08:08:19 <therve> Nope!
08:08:25 <d0ugal> Thanks
08:10:15 <openstackgerrit> Thomas Herve proposed openstack/mistral master: Wait for rpc server on shutdown https://review.openstack.org/607306
08:10:36 <therve> d0ugal: We have various issues around SIGHUP handling in Heat as well
08:10:53 <d0ugal> oh, interesting. I'll take a look
08:10:55 <therve> I wonder if we should consider changing the behavior imposed by oslo.service
08:11:15 <therve> Restarting all the services to reload log/conf files seems dumb to me
08:11:23 <d0ugal> Agreed
08:11:52 <d0ugal> but I can see why it was done - it was probably the easiest fix.
08:11:56 <therve> I know long-running heat stacks are broken, I wonder about workflows
08:12:09 <d0ugal> How long is long-running?
08:12:21 <therve> In case of heat, hours
08:12:46 <d0ugal> I don't think we have any workflows that run that long (because of keystone token issues)
08:12:46 <therve> There is just no way that you're going to "wait" for completion in this case
08:12:53 <d0ugal> but rakhmerov may have some in a non-keystone setup
08:13:10 <therve> Even 20 minutes
08:13:28 <d0ugal> oh, we have that for sure with introspection
08:14:00 <therve> If you have a SIGHUP at say 10 minutes into 20, what do you do?
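[Editor's note] The alternative therve hints at above (reloading log/conf files on SIGHUP without restarting the service) can be sketched in plain Python. This is a toy illustration under stated assumptions, not how oslo.service or Mistral behaves: `ReloadableService`, `install_sighup_handler` and the `load_settings` callable are hypothetical names invented for the example.

```python
import signal


class ReloadableService:
    """Toy long-running service that reloads settings on SIGHUP instead of restarting.

    Hypothetical illustration only; not oslo.service or Mistral code.
    """

    def __init__(self, load_settings):
        self._load_settings = load_settings
        self.settings = load_settings()
        self.reload_requested = False
        self.completed_steps = 0

    def request_reload(self, signum=None, frame=None):
        # Keep the signal handler minimal: only record that a reload was asked for.
        self.reload_requested = True

    def run_step(self):
        # Apply a pending reload between units of work, so work that is
        # already in flight is never interrupted by the signal.
        if self.reload_requested:
            self.settings = self._load_settings()
            self.reload_requested = False
        self.completed_steps += 1


def install_sighup_handler(service):
    # SIGHUP is not available on every platform (e.g. Windows), hence the guard.
    if hasattr(signal, "SIGHUP"):
        signal.signal(signal.SIGHUP, service.request_reload)
```

With this shape, a SIGHUP arriving 10 minutes into a 20 minute job only flips a flag; the new settings are picked up at the next step boundary and progress made so far is kept.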
08:14:41 <d0ugal> ¯\_(ツ)_/¯
08:15:13 <rakhmerov> d0ugal: hey
08:15:17 <rakhmerov> what's the question?
08:16:01 <rakhmerov> yes, we have workflows running for days
08:16:13 <rakhmerov> in a non-keystone env, right
08:16:31 <d0ugal> rakhmerov: Not sure we have a specific question, but therve is investigating SIGHUP issues. Do you use the oslo or kombu rpc backend?
08:16:47 <rakhmerov> oslo
08:17:01 <rakhmerov> I would not recommend to use Kombu RPC, at least under load
08:17:13 <rakhmerov> it has a number of issues actually
08:17:14 <d0ugal> Good to know.
08:17:17 <rakhmerov> yeah
08:17:17 <d0ugal> rakhmerov: https://review.openstack.org/#/c/607306/
08:17:24 <d0ugal> We use oslo
08:20:44 <rakhmerov> haah..
08:20:48 <rakhmerov> tricky thing
08:48:04 <pgaxatte> hello
08:48:34 <pgaxatte> I have a small question related to the plugin system
08:49:04 <pgaxatte> this is more related to setuptools actually but since it used in mistral to extend the actions possible I figured I'd ask you :D
08:50:08 <pgaxatte> can I have mistral installed in the normal sys.path via my distribution's packages and register an action plugin deployed somewhere else in a venv?
08:50:39 <pgaxatte> mistral would be run without a venv and resides in /usr/lib/python....
08:51:11 <pgaxatte> whereas the code of the custom action would be totally outside the sys.path, in a venv somewhere
08:57:20 <openstackgerrit> Renat Akhmerov proposed openstack/mistral master: WIP: change workflow completion logic https://review.openstack.org/607807
09:21:31 <therve> pgaxatte: Well mistral needs to be able to import your code, right?
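[Editor's note] therve's point, that the running process must be able to import the plugin code, can be illustrated with a small self-contained sketch. It only covers plain imports; setuptools entry-point registration is not shown. The module name `my_custom_action` and the helper `import_from_extra_dir` are hypothetical, and a temporary directory stands in for the separate venv.

```python
import importlib
import os
import sys
import tempfile


def import_from_extra_dir(module_name, directory):
    """Import module_name after making `directory` importable for this process."""
    if directory not in sys.path:
        sys.path.append(directory)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


def demo():
    # A temporary directory stands in for a location outside sys.path.
    plugin_dir = tempfile.mkdtemp()
    with open(os.path.join(plugin_dir, "my_custom_action.py"), "w") as f:
        f.write("def run():\n    return 'hello from plugin'\n")

    # Before the directory is on sys.path, the import fails.
    try:
        importlib.import_module("my_custom_action")
        importable_before = True
    except ImportError:
        importable_before = False

    module = import_from_extra_dir("my_custom_action", plugin_dir)
    return importable_before, module.run()
```

Note that every module in the process still shares one `sys.path`, so this approach cannot give a single action its own version of a library that the host application also imports.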
09:21:42 <therve> If it's in a different venv, that doesn't sound doable
09:22:05 <pgaxatte> therve: that's what I thought so it raises a problem to me
09:22:31 <pgaxatte> because maybe I want to use a different version of some lib in a specific action
09:22:54 <therve> Yeah that won't work
09:23:24 <pgaxatte> and I don't want to have to know in advance the dependencies of all my custom actions so that I can build a mistral in a venv with everything it needs
09:25:05 <pgaxatte> if I know in advance WHERE my plugins venv will be, can't I extend mistral's sys.path before running it?
09:26:09 <therve> Right, but you'll use the same version of the lib everywhere then?
09:27:44 <pgaxatte> the different version of the lib was a hypothetical question, I'd probably need additional packages rather than different versions of them
09:33:40 <therve> OK that's a totally different issue though
09:54:12 <openstackgerrit> Renat Akhmerov proposed openstack/mistral master: WIP: change workflow completion logic https://review.openstack.org/607807
10:23:18 <openstackgerrit> Renat Akhmerov proposed openstack/mistral master: WIP: change workflow completion logic https://review.openstack.org/607807
11:24:13 <openstackgerrit> 98k proposed openstack/python-mistralclient master: build universal wheels https://review.openstack.org/607915
11:59:52 <openstackgerrit> Oleg Ovcharuk proposed openstack/mistral master: Add started_at and finished_at to task execution. https://review.openstack.org/607703
13:35:05 <openstackgerrit> Andras Kovi proposed openstack/mistral master: Fix state change propagation in workflows https://review.openstack.org/607960
14:35:13 <openstackgerrit> Brad P. Crochet proposed openstack/mistral master: Use SessionClient for Ironic actions https://review.openstack.org/607974
20:51:00 <openstackgerrit> Oleg Ovcharuk proposed openstack/mistral master: Add started_at and finished_at to task execution. https://review.openstack.org/607703
05:20:55 <openstackgerrit> Renat Akhmerov proposed openstack/mistral master: WIP: change workflow completion logic https://review.openstack.org/607807
06:22:30 <openstackgerrit> Renat Akhmerov proposed openstack/mistral master: WIP: change workflow completion logic https://review.openstack.org/607807
07:30:12 <openstackgerrit> Renat Akhmerov proposed openstack/mistral master: Add sqlalchemy.exc.OperationalError to the retry decorator https://review.openstack.org/608171
07:45:44 <rakhmerov> hi, can anybody look at http://logs.openstack.org/71/608171/1/check/openstack-tox-py27/2cd5c83/job-output.txt.gz#_2018-10-05_07_41_46_803612
07:45:54 <rakhmerov> it happens only on py27
07:46:24 <rakhmerov> seems like some problem with dependencies but I can't understand what's wrong
07:46:44 <rakhmerov> this module exists
07:47:24 <rakhmerov> d0ugal: ^
07:47:55 <rakhmerov> d0ugal: btw, I haven't seen apetrich for a while. Is he on vacation?
08:02:06 <therve> rakhmerov: Relative import. In mistral.db.utils you import "sqlalchemy", it thinks you try "mistral.db.sqlalchemy"
08:02:15 <openstack> d0ugal: Error: Can't start another meeting, one is in progress. Use #endmeeting first.
08:02:42 <openstack> d0ugal: Error: Can't start another meeting, one is in progress. Use #endmeeting first.
08:03:35 <d0ugal> oops!
08:03:37 <d0ugal> #endmeeting
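[Editor's note] The py27-only failure therve diagnoses comes from Python 2's implicit relative imports: inside a package, `import sqlalchemy` first looks for a sibling module of that name. The usual remedy is `from __future__ import absolute_import`; whether that is what the actual patch did is not stated in the log. The sketch below reproduces the situation with a throwaway package, where `demo_pkg` and its sibling `json.py` are hypothetical stand-ins for `mistral.db` and `mistral.db.sqlalchemy`.

```python
import importlib
import os
import sys
import tempfile


def build_demo_package(root):
    """Create demo_pkg/ with a sibling module that shadows a top-level name."""
    pkg = os.path.join(root, "demo_pkg")
    os.mkdir(pkg)
    files = {
        "__init__.py": "",
        # Sibling module with the same name as the stdlib json module.
        "json.py": "SHADOW = True\n",
        # With absolute imports, `import json` means the top-level json,
        # not demo_pkg.json. Python 3 behaves this way by default.
        "utils.py": "from __future__ import absolute_import\nimport json\n",
    }
    for name, body in files.items():
        with open(os.path.join(pkg, name), "w") as f:
            f.write(body)


def load_utils():
    root = tempfile.mkdtemp()
    build_demo_package(root)
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        return importlib.import_module("demo_pkg.utils")
    finally:
        sys.path.remove(root)
```

On Python 2.7 without the `__future__` line, `utils.json` would be the sibling `demo_pkg.json`; with it, or on Python 3, the top-level module is used.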