14:00:35 #startmeeting sahara
14:00:36 Meeting started Thu Dec 6 14:00:35 2018 UTC and is due to finish in 60 minutes. The chair is tellesnobrega. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:37 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:39 The meeting name has been set to 'sahara'
14:00:56 o/
14:01:04 o/
14:01:40 #topic News/Updates
14:02:10 I have been working on APIv2 stuff, updated the cluster scaling patch (please review)
14:02:44 I have some questions on the 500 issue for jeremyfreudberg, but we can have a specific topic for it later
14:02:59 oh, I have plenty of content for "specific topics" :)
14:03:21 what I did: I tested the current split plugin test repositories
14:03:34 it seems to be mostly working
14:03:55 I hit a few issues which do not seem to be regressions introduced by the splitting, especially on vanilla
14:04:13 one of the fixes was already sent and merged
14:04:13 tosky, nice, we can talk about each later
14:04:20 another one is the scaling patch (ok, for later)
14:04:38 sure
14:04:46 I also insisted a bit too much on python3, as you can see from the patches around
14:05:01 aaand I sent out a spec for sahara-tests (another small topic)
14:05:10 busy week for tosky
14:07:33 jeremyfreudberg, any updates?
14:08:04 let me start with the 500 issue, then we can talk about all your topics
14:08:12 uh, if you sent any mail to my red hat account in the past few days, i didn't get it (payroll changes)
14:08:27 and i'm looking at the tempest failures on the unversioned endpoints patch
14:08:52 that would be great
14:09:02 if you find something let me know
14:10:02 #action jeremyfreudberg to look into tempest failure on unversioned endpoints patch
14:10:04 #topic APIv2 - Fix 500 on malformed query string
14:10:10 nah, the notifications from gerrit go to your personal account; if you noticed them it'd be more than enough
14:10:22 yup
14:11:18 so, we talked a little about this last week, and I just got a chance to take a look into it
14:11:45 I was able to see the issue on GET; POST seems to work fine (but I need to check other calls)
14:13:53 jeremyfreudberg suggested we could use before_request to solve the issue, using a whitelist of the params that are acceptable
14:14:25 First question: what is the appropriate response we should use?
14:15:19 Second question: all requests go through _mroute, can't we just validate this there?
14:15:28 i think you're right that it doesn't always cause an issue -- only where get_request_args() is used haphazardly (example: cluster list)
14:16:51 uh, yeah, i think it can go in handler() in sahara.utils.api
14:16:58 (which is "inside" mroute)
14:17:20 I will give it a try
14:17:38 what response do we want to give when that happens?
14:18:00 400?
14:18:37 almost definitely 400
14:18:48 ok
14:19:00 you will see a patch soon
14:19:10 tosky, what do you want to talk about first?
14:19:11 and i remember that the api-sig said that we should try to give the most specific and helpful error message possible (at the very least, reporting the whitelist to the user)
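One possible shape for the fix discussed above: a whitelist check in a Flask before_request hook (or, equivalently, inside handler() in sahara.utils.api) that rejects unknown query parameters with a 400 and reports the whitelist in the message, as the api-sig recommends. This is a minimal sketch; the parameter names and error payload are illustrative assumptions, not the actual sahara patch:

```python
# Hypothetical sketch of the whitelist idea discussed above; parameter
# names and payload shape are invented for illustration.
import flask

app = flask.Flask(__name__)

# Acceptable query parameters (illustrative values only)
ALLOWED_QUERY_PARAMS = {'limit', 'marker', 'sort_by', 'plugin_name'}


@app.before_request
def validate_query_string():
    # Iterating request.args yields the query parameter names
    unknown = set(flask.request.args) - ALLOWED_QUERY_PARAMS
    if unknown:
        # Returning a response from before_request short-circuits the
        # request: 400 instead of letting a bad param bubble up as a 500.
        return flask.make_response(
            flask.jsonify(
                error='Malformed query string',
                unknown_params=sorted(unknown),
                allowed_params=sorted(ALLOWED_QUERY_PARAMS)),
            400)
```

The same check could live in handler() instead, which is the spot suggested above since every versioned call already passes through it.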
14:20:27 tellesnobrega: I have: split plugins, python3, and the scenario tests spec
14:20:30 sounds good
14:20:32 any order is fine
14:20:44 I will do that
14:20:55 #topic split plugins
14:20:58 tosky, you have the floor
14:21:50 as I mentioned, I hit two issues with vanilla: EDP jobs failing (but it's not a regression, apparently)
14:22:11 and cluster scaling, which is fixed by https://review.openstack.org/616193 - or at least by the previous iteration of the patch
14:22:38 does that mean the current one fails, or you didn't test it yet?
14:23:08 I didn't get to test it yet
14:23:40 what was the EDP failure?
14:24:00 some jobs in failed state, but I wasn't able to pinpoint the reason
14:24:14 or better, now that I think about it
14:24:26 vanilla 2.8.2 seems to fail only on the Hive job, which is a painful and known issue
14:24:37 vanilla 2.7.1 (on centos) returned other errors
14:24:45 then I moved on to other tests
14:25:10 yes, hive is always a bit of a funny one
14:25:28 but like you said, none of that is really a regression, i think
14:25:29 the relevant point is that the split plugin codebase behaves like current master, which is good
14:25:58 I discussed on rdo-dev how to handle the packaging after the split; the idea is to use a variable to handle the bootstrapping case
14:26:59 aka, at the beginning of each cycle: build the packages from openstack-sahara without any references to the plugins; build the packages of the plugins;
14:27:10 then rebuild the openstack-sahara source packages so that the binary package openstack-sahara depends on the binary packages of the plugins
14:27:26 this is already done with openstack-tempest and openstack-tempest-all
14:27:48 doesn't seem like too much trouble
14:28:01 we just need to remember to flip the flag
14:28:12 the discussion started here: https://lists.rdoproject.org/pipermail/dev/2018-November/008972.html
14:28:40 next step: complete the draft of the email for openstack-discuss@ and send it
14:29:10 thanks for doing this work tosky
14:29:19 then we can probably proceed, so that we can close it hopefully before the end-of-year vacations (at least my vacations :)
14:29:44 we need some help from the puppet people at least (I can probably help more on the ansible side)
14:31:17 sounds good
14:32:11 anything else on this topic?
14:32:29 nothing else from me
14:32:37 jeremyfreudberg?
14:32:39 comments?
14:33:08 it makes sense
14:33:38 cool
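Mechanically, what makes the split workable is that plugins are discovered at runtime through setuptools entry points rather than imported directly, so plugin code can live in separate repositories and packages. A minimal sketch of that pattern using stevedore; the namespace string below is an assumption for illustration, not necessarily the exact one sahara uses:

```python
# Sketch of entry-point-based plugin discovery with stevedore; the
# namespace is assumed for illustration.
from stevedore import extension


def load_plugins():
    mgr = extension.ExtensionManager(
        namespace='sahara.cluster.plugins',  # assumed namespace
        invoke_on_load=True)
    # Each split-out package (e.g. a hypothetical sahara-plugin-vanilla)
    # only declares an entry point in this namespace in its setup.cfg;
    # the core service then finds it without any import-time reference
    # to the plugin repository.
    return {ext.name: ext.obj for ext in mgr}


if __name__ == '__main__':
    for name in sorted(load_plugins()):
        print(name)
```

This is also why the RDO bootstrap described above works: the core package can be built with no reference to the plugins, then rebuilt later so that it merely depends on the plugin packages at install time.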
14:34:08 #topic python3
14:34:31 you may have noticed a few patches related to python3
14:34:46 yes
14:34:51 there are a few different issues, sometimes connected
14:35:19 the easy one: switch the default runner of sahara-scenario to python3
14:35:39 the original proposal by Doug is https://review.openstack.org/606712, but see the comments; my proposal is https://review.openstack.org/#/c/608211/
14:35:53 this is about running sahara-scenario with python3, not yet running sahara with python3
14:36:26 talking about running sahara itself with python3, this is more complex
14:36:56 I found a few smaller issues which are fixed by https://review.openstack.org/#/c/622611/
14:37:20 the patch reduces the number of warnings and errors, but it's still not enough
14:37:49 there is a major issue somewhere when the cluster status changes and it's written down in the database
14:38:21 and that's the reason for patches like https://review.openstack.org/#/c/600689/ but they are still not enough; check the exceptions in http://logs.openstack.org/89/600689/7/check/sahara-tests-scenario-py3/ba2e212/
14:38:49 it's a bit difficult to decipher, but it's here: http://logs.openstack.org/89/600689/7/check/sahara-tests-scenario-py3/ba2e212/controller/logs/screen-sahara-eng.txt.gz#_Dec_05_22_09_15_243246
14:39:25 I tried to deploy devstack locally on bionic, found a few issues (hence https://review.openstack.org/#/c/623078/ )
14:39:40 I've seen this before
14:39:42 but I'm still failing and I don't know why
14:40:20 I worked around the issue with notifications by cheating - I disabled them
14:40:32 see https://review.openstack.org/623193
14:40:44 but it's still not enough, not even locally, with weird errors on cinder
14:41:22 I'm trying now locally without volumes; still not sure why 623193 failed - because of course the patch disables notifications, so sahara-eng.log is useless :)
14:41:24 and that's it
14:41:48 that is a lot for sure
14:41:57 at some point I will start asking around for python experts; I suspect that the notification issue is some value which is passed with the wrong type (string vs bytes)
14:42:07 thanks for pushing this forward
14:42:19 yes, thanks for all the digging
14:42:42 do you know any python experts? if not, I can try to look around and find someone with some time to help out
14:42:57 swift is pretty much in the same position
14:43:45 I worked around swift as well by using ceph radosgw - it was easier than making sure that swift started with python2 works within a full python3 environment (I think I explained this in the past)
14:44:15 I think we are in better shape than swift - we have one or two major issues that, when fixed, should probably unblock the rest
14:44:19 compared to a full port
14:44:26 great
14:44:41 I mostly meant about needing a python expert
14:45:27 oh, let's see, there are a few in the openstack community
14:46:08 cool, if it gets to that point we'll call on them
14:46:27 want to move on to the scenario tests spec?
14:47:03 yep
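The string-vs-bytes suspicion above is the classic python2-to-python3 porting trap: code that implicitly mixed str and bytes under python2 raises TypeError (or mis-serializes) under python3. A tiny self-contained illustration of the pattern and the usual boundary fix; this is generic Python, not sahara's actual notification code:

```python
# Minimal illustration of the str-vs-bytes pitfall suspected in the
# notification path; generic Python, not sahara code.
import json


def build_payload(cluster_id, status):
    # Under python2, str and bytes were the same type, so callers could
    # pass either. Under python3 a bytes value leaks through and breaks
    # serialization: json.dumps(b'Active') raises TypeError.
    if isinstance(status, bytes):
        # The usual fix: decode at the boundary, once, explicitly.
        status = status.decode('utf-8')
    return json.dumps({'cluster_id': cluster_id, 'status': status})


print(build_payload('42', b'Active'))  # works only because of the decode
print(build_payload('42', 'Active'))   # the normal python3 case
```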
14:47:19 #topic scenario tests spec
14:48:06 this is about https://review.openstack.org/623193
14:48:49 I hope I explained everything there, but I suspect my writing may not have been understandable enough
14:49:22 I think you pasted the wrong patch
14:49:35 right, sorry
14:49:41 this is it: https://review.openstack.org/#/c/622492/
14:50:10 I have to re-read it
14:50:13 i'll have a look soon
14:50:23 anything to mention about it now?
14:51:30 the proposal started as a way to add support for running S3 jobs only when a cluster supports S3, without having to duplicate the templates, basically
14:51:43 maybe it's overengineered, but I hate hacks :D
14:52:10 I know it's not accepted yet, but I may quickly write down a draft of the code
14:52:41 writing the spec helped a lot to organize my ideas
14:53:16 from what I read, it mostly looks good; I will look again today
14:54:36 thanks!
14:54:44 feel free to work on the draft
14:55:12 to finalize:
14:55:22 let's review the APIv2 patches
14:55:26 we need to get them moving
14:55:59 review tosky's spec
14:56:02 and the python3 patches
14:56:38 anything else I missed?
14:57:05 there are a few smaller patches lying around, some of them with a +2 already
14:57:15 not many left at this point, but still
14:58:46 I will take a look
14:59:09 we have 1 minute left
14:59:36 anything else for today?
14:59:51 nope
15:00:02 if not, thanks everyone, great work. see you all next week
15:00:13 thank you all!
15:00:13 #endmeeting
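The spec under review (622492) is not summarized in the log, but its stated goal above - run S3 jobs only when a cluster supports S3, without duplicating templates - maps onto a feature-gating pattern. A hypothetical sketch of that idea; the 'requires' key, the job/cluster structures, and the names here are all invented for illustration and are not the spec's actual design:

```python
# Hypothetical illustration of feature-gated scenario jobs; the
# structures below are invented and do not reflect sahara-tests'
# actual template format.
CLUSTER_FEATURES = {'swift'}  # e.g. this cluster has no S3 support

JOBS = [
    {'name': 'pig-swift', 'requires': set()},
    {'name': 'spark-s3', 'requires': {'s3'}},   # skipped on this cluster
    {'name': 'hive-s3', 'requires': {'s3'}},    # skipped on this cluster
]


def runnable_jobs(jobs, features):
    # One template serves every cluster: jobs whose requirements are not
    # met are filtered out instead of being duplicated per cluster type.
    return [j for j in jobs if j['requires'] <= features]


for job in runnable_jobs(JOBS, CLUSTER_FEATURES):
    print('would run:', job['name'])
```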