15:12:27 #startmeeting 2015-10-15
15:12:28 Meeting started Thu Oct 15 15:12:27 2015 UTC and is due to finish in 60 minutes. The chair is vannif. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:12:28 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:12:28 The meeting name has been set to '2015_10_15'
15:12:55 hi everyone
15:13:30 daemontool_ would you start ?
15:13:33 ok
15:13:54 I've been working mainly on testing
15:14:06 helping vannif integrate testing with devstack
15:14:15 and also now I'm working on moving from pytest to testr
15:14:23 to be consistent with the other OpenStack projects
15:14:50 the issue with this
15:14:59 is that many of our unit tests are now incompatible
15:15:08 because of the use of monkeypatch from pytest
15:15:26 the unit tests need to be changed asap after https://review.openstack.org/235387 lands
15:15:45 I'm also planning this week to switch the stackforge/freezer repo to PBR
15:15:50 that's all from me
15:17:21 can you give a rough estimate of the effort to remove the dependency on pytest ?
15:17:39 I think it will take ~2 weeks
15:17:42 realistically
15:18:19 these are the bps for pbr and testr: https://blueprints.launchpad.net/freezer/+spec/switch-to-pbr https://blueprints.launchpad.net/freezer/+spec/swift-to-testr
15:18:29 I'll be full on it to have it completed
15:18:31 asap
15:18:44 reldan, I think this task involves you too
15:19:04 Yes, sure.
I have some pytest tests and can help with migration
15:19:31 I work with tests anyway, so I can replace pytest with unittest step by step
15:20:10 perfect
15:20:21 afaik the problem is only with monkeypatch
15:20:49 good. maybe everyone can help moving their own part of the code to unittest
15:21:34 that'd be fantastic
15:22:28 I've recently moved the tests for lvm to unittest. if you know the code, it's faster. even though, to be really good, the tests should be written by someone else ;)
15:22:36 ok.
15:22:51 reldan ?
15:23:03 Thank you
15:23:05 ok
15:23:40 I was improving test coverage. So I wrote additional tests and, for example, improved the ssh and local storages in terms of code deduplication
15:24:10 Now ssh only has stuff related to working with ssh, and local only os.* invocations
15:25:05 I have 62% code coverage overall. The biggest uncovered part is the scheduler, so I'm trying to write some tests for it. At least invoke the methods to catch syntax errors, if any
15:25:35 Another problem area is the OpenStack Cinder/Nova backups
15:26:16 Current coverage is http://pastebin.com/GfAdZ3cE
15:27:26 Also, ssh with paramiko is very, very slow. Probably we should try a different library after the release. And I have an idea how to read only part of the backups over ssh - probably it will improve the speed of our integration tests with ssh
15:27:41 Because they get all backups many, many times per run
15:27:46 That's all from my side
15:28:28 thank you. remember not to use pytest ;)
15:28:40 Yes, sure :) Thank you
15:28:51 on my side
15:29:27 the freezer-api devstack plugin is ready.
15:29:34 well, almost.
15:29:47 I just need to add a few lines in the readme
15:30:06 and the plugin for the web-ui also.
15:30:36 so, all it takes to have freezer, freezer-api and freezer-web-ui working in devstack is a couple of lines in local.conf
15:30:51 vannif, sorry one sec
15:30:58 reldan, why is paramiko slow, do you have an idea?
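[Editor's note] The monkeypatch incompatibility discussed above is the usual blocker when moving from pytest to testr: pytest injects a `monkeypatch` fixture, while unittest-style tests use `unittest.mock` (the `mock` package on Python 2). A minimal sketch of the migration; `backup_path` is a hypothetical stand-in for real freezer code:

```python
import os
import unittest
from unittest import mock  # on Python 2: `import mock`


def backup_path():
    # Hypothetical helper standing in for real freezer code.
    return os.path.expanduser("~/.freezer")


# pytest style (incompatible with testr's unittest runner):
#
#     def test_backup_path(monkeypatch):
#         monkeypatch.setattr(os.path, "expanduser", lambda p: "/tmp/home")
#         assert backup_path() == "/tmp/home"


class TestBackupPath(unittest.TestCase):
    # unittest style: mock.patch replaces the attribute for the
    # duration of the test and restores it automatically afterwards.
    @mock.patch("os.path.expanduser", return_value="/tmp/home")
    def test_backup_path(self, mock_expanduser):
        self.assertEqual(backup_path(), "/tmp/home")
        mock_expanduser.assert_called_once_with("~/.freezer")


suite = unittest.TestLoader().loadTestsFromTestCase(TestBackupPath)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("migrated test passed:", result.wasSuccessful())
```

The mechanical rule is: every `monkeypatch.setattr(obj, "name", value)` becomes a `mock.patch("module.name", ...)` decorator or context manager on the test.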
15:32:23 I have read a lot of threads on stackoverflow and in the paramiko repo. And I also saw a lot of branches where people are trying to resolve it. As far as I know, Fabric has its own fork of paramiko
15:33:03 But why paramiko is so slow - I don't know. https://www.google.ie/search?q=paramiko+is+slow&oq=paramiko+is+slow&aqs=chrome..69i57j69i60l2j0l3.2588j0j7&sourceid=chrome&es_sm=91&ie=UTF-8
15:33:51 ok...
15:33:57 Probably it has some problem with buffers
15:34:07 I read that it reads information byte-by-byte
15:34:13 ah ok
15:34:23 there's some for loop over binary_data in there....
15:34:25 I'm not sure, but it seems to be a common problem
15:34:27 ok
15:34:57 I have found this - but haven't tried it yet: https://github.com/wallix/pylibssh2
15:35:09 and it is old
15:35:32 let's have a meeting at some point about it
15:35:37 but how slow is it?
15:35:47 how many MB/s can we transfer?
15:35:50 do you have an idea?
15:36:16 Nope, but it is really slow - I work with hpcloud and even a listdir takes minutes
15:36:23 ah.....
15:36:25 ok
15:37:43 that's really slow
15:37:52 yes
15:38:05 I was trying to tune the window size - it didn't help
15:38:18 ok
15:38:27 the window size of the tcp connection ?
15:38:31 Yes
15:38:36 what does ansible use under the hood?
15:39:06 hmm. that should be a separate issue. the OS should take care of that ...
15:39:24 I'm not sure it's the tcp window
15:39:54 I think paramiko
15:40:07 anyway, we need to find a solution for it
15:40:16 but I have to say, reldan, with the tests from vannif
15:40:18 I know that Fabric uses paramiko
15:40:19 using ssh
15:40:21 # Development version of Paramiko, just in case we're in one of those phases.
15:40:22 -e git+https://github.com/paramiko/paramiko#egg=paramiko
15:40:23 # Pull in actual "you already have local installed checkouts of Fabric +
15:40:24 it wasn't that slow...
15:40:24 # Paramiko" dev deps.
15:40:25 -r dev-requirements.txt
15:40:40 vannif, do you remember the tests we did also for the demo?
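[Editor's note] For context, the "window size" being tuned here is the SSH channel window, not the TCP window; paramiko exposes it on `Transport`, along with the compression toggle mentioned later in the meeting. A rough, untested sketch of those knobs - host, user, and key path are placeholders, and no claim is made that these values fix the slowness:

```python
import paramiko

# Placeholders - substitute a real host and credentials.
HOST, USER, KEY_FILE = "backup.example.com", "freezer", "/path/to/id_rsa"

transport = paramiko.Transport(
    (HOST, 22),
    # Larger SSH channel window / packet sizes can help bulk SFTP
    # transfers; the defaults are fairly conservative.
    default_window_size=4 * 1024 * 1024,
    default_max_packet_size=256 * 1024,
)
# Freezer already ships compressed data, so transport-level zlib
# compression only burns CPU; make sure it stays off.
transport.use_compression(False)
transport.connect(username=USER,
                  pkey=paramiko.RSAKey.from_private_key_file(KEY_FILE))

sftp = paramiko.SFTPClient.from_transport(transport)
print(sftp.listdir("."))
transport.close()
```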
15:40:45 it was acceptable
15:41:16 we should try to transfer a few GB
15:41:30 I suppose it is because your ssh machine was not in hpcloud
15:41:31 no, sorry, ansible uses paramiko as a fallback
15:41:39 ah ok
15:41:41 it uses something else
15:41:54 we were using a devstack instance
15:42:10 we'll check this in the near future
15:42:32 :)
15:42:39 now let's fix the tests first
15:42:45 yes, definitely. it wouldn't be bad to have a summary of the upload/download speed of the backups
15:42:46 Sure!
15:42:55 Agree
15:43:13 Probably we need some test installation
15:43:21 With distributed metrics
15:44:26 Like http://graphite.wikidot.com/
15:44:27 you know that the temperature of the exhaust gas from an engine is an important parameter to understand whether the engine is healthy and in good operating condition? so should be the effective transfer rate for the freezer engines :)
15:45:40 Agree. It would be great to actually make an improvement and see that, after merging, we really have an improvement )
15:46:30 yes. I'll help you with that if you want. it's interesting
15:46:44 Let's say we can have a machine that will do a backup of 1 GB of data every 2 hours to the different storages and report transfer speed/errors/…
15:47:06 We can also use https://www.elastic.co/products/kibana for gathering errors from multiple instances
15:47:43 https://www.elastic.co/videos/kibana-logstash
15:47:52 that's the centralized logging tool used in helion :)
15:48:40 yes, with elasticsearch it will be easy to gather statistics and trigger alarms on threshold crossings
15:48:55 anyway ... back on track
15:48:59 So let's say we will have several machines that will check out new code from the repo, run big tests with actual data and show errors and performance
15:49:01 Ok
15:49:10 reldan, can you try disabling compression in paramiko, if it's enabled by default?
15:49:20 we need to investigate the effective speed and possible alternatives to paramiko, right ?
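[Editor's note] The "summary of the upload/download speed" proposed above needs only a thin timing wrapper around whatever storage call freezer makes; the metric could then be shipped to Graphite/Kibana as discussed. A minimal sketch with an in-memory dummy transfer (the real `transfer` callable would be a storage-backend upload, which is an assumption here, not freezer's actual API):

```python
import time


def measure_transfer(transfer, nbytes):
    """Run a transfer callable and report its effective rate in MB/s.

    `transfer` is any zero-argument function that moves `nbytes`
    bytes - a stand-in for a freezer storage-backend call.
    """
    start = time.time()
    transfer()
    elapsed = time.time() - start
    # Guard against a clock too coarse to see a very fast transfer.
    rate = (nbytes / (1024.0 * 1024.0)) / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate


# Example with a dummy in-memory "transfer" of 8 MiB:
payload = b"x" * (8 * 1024 * 1024)
sink = bytearray()
elapsed, rate = measure_transfer(lambda: sink.extend(payload), len(payload))
print("transferred %d bytes in %.3fs (%.1f MB/s)" % (len(payload), elapsed, rate))
```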
15:49:23 we are always sending compressed data with freezer
15:49:29 Yes, I have checked that - no improvement
15:49:32 ok
15:50:15 It would be great. I know how to reduce the number of listdir invocations significantly, but I still have no idea how to improve the transfer speed
15:51:25 which should play the biggest part in large backups, right ?
15:51:54 It depends on how many backups we already have in our repo
15:52:10 Right now I have one additional listdir per zero-level backup
15:52:44 I can actually reduce that number - if I only need the last one, for example - I should get the newest zero-level backup and then get all the incrementals only for that backup
15:53:05 So instead of n+1 backups I can get 2
15:53:12 reldan, did you take a look at this? http://asyncssh.readthedocs.org/en/latest/
15:53:12 not backups - listdirs
15:53:54 check under "Direct TCP connections"
15:54:15 so, you are doing listdirs that could be avoided ?
15:54:33 We can try
15:55:08 Yes, I can identify only the zero-level backup that I need and make only one additional check for its increments, instead of doing it for all zero-level backups
15:55:14 ok, I've never used it, just found it now, so it might not fit the purpose
15:55:40 We'll see
15:56:00 it seems it requires python 3.4
15:56:33 there's a port for python 2
15:56:38 http://trollius.readthedocs.org/
15:56:48 + https://github.com/ronf/asyncssh
15:57:12 damn back ports ... we'll never get rid of python2 ... nor IPv4 :)
15:57:24 haha
15:57:48 )))
15:59:22 let's move forward
15:59:24 :)
15:59:28 shall we move forward ?
15:59:31 ok
15:59:33 :)
15:59:56 so.
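[Editor's note] The listdir-reduction idea can be sketched against a toy layout (the directory names are illustrative, not freezer's actual on-disk scheme): instead of listing the increments of every zero-level backup, list the zero-levels once, pick the newest, and list only its increments - two listdir calls instead of n+1:

```python
# Toy layout: one directory per zero-level backup timestamp,
# each containing its incremental segments.
FAKE_FS = {
    "backups": ["1000", "2000", "3000"],          # zero-level backups
    "backups/1000": ["level_0", "level_1"],
    "backups/2000": ["level_0"],
    "backups/3000": ["level_0", "level_1", "level_2"],
}

LISTDIR_CALLS = []


def listdir(path):
    # Stand-in for the (slow) SFTP listdir; records each invocation.
    LISTDIR_CALLS.append(path)
    return FAKE_FS[path]


def latest_backup_chain(root="backups"):
    # 1st listdir: find all zero-level backups, pick the newest.
    newest = max(listdir(root), key=int)
    # 2nd listdir: fetch increments only for that backup.
    return newest, sorted(listdir("%s/%s" % (root, newest)))


chain = latest_backup_chain()
print(chain)               # ('3000', ['level_0', 'level_1', 'level_2'])
print(len(LISTDIR_CALLS))  # 2 listdir calls, however many zero-levels exist
```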
as I was saying, the plugins for devstack are ready
16:00:01 in review
16:00:25 they "work on my machine" :)
16:00:43 but you're encouraged to test them on your VMs
16:01:49 aah, one thing: you need to point the git repo at one on your local machine where you have the correct patchset, so that devstack will clone and install that repo/branch
16:02:17 thanks daemontool_ for the support :)
16:03:40 I'm also peeking at Saad's work from time to time; he's working on oslo.config and oslo.log, but I haven't taken a deep look at that yet. maybe it will be in review soon
16:04:39 I was also evaluating a simplification of the agent tasks. this is not high priority, I think. just an idea
16:04:58 as emerged in recent meetings, the plan is to have a sophisticated freezer-scheduler to manage the jobs, while the agent should be relatively dumb.
16:05:12 ATM the agent performs a somewhat articulated task: lock the tables, take the snapshot, release the table lock, upload the backup, release the snapshot
16:05:35 this should be broken into pieces and controlled by the scheduler.
16:06:43 as reldan was suggesting, maybe using manager objects with specific responsibilities: snapshots, db locking/unlocking
16:07:08 each one instructed to act at specific moments of the backup "workflow"
16:07:22 kind of "hooks"
16:07:36 Oh, that would be great. I was trying to refactor lvm/shadow - but it's really hard
16:08:07 yes, we also need to provide a consistent interface for lvm and shadow
16:08:50 it would be great to have single-responsibility objects; otherwise we are stopping the database in one place and starting it in the shadow/lvm code - it is counterintuitive
16:09:01 reldan, I can work on some diagrams and then we can discuss them
16:09:06 exactly
16:09:29 We can just sit together and try to understand it :)
16:09:31 the code is not just nested, it's even intermixed
16:09:37 Agree!
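[Editor's note] The manager-objects-with-hooks idea could look roughly like this (all class and hook names are invented for illustration, not freezer's actual design): the workflow driver owns the ordering, and each manager only knows its own resource, so db locking never leaks into lvm/shadow code:

```python
class DbLockManager(object):
    # Single responsibility: table locks.
    def pre_snapshot(self, log):
        log.append("lock tables")

    def post_snapshot(self, log):
        log.append("unlock tables")


class SnapshotManager(object):
    # Single responsibility: LVM/VSS snapshots.
    def pre_upload(self, log):
        log.append("take snapshot")

    def post_upload(self, log):
        log.append("release snapshot")


def run_backup(managers):
    """Dumb-agent workflow: the driver fixes the order, managers
    only react to the hooks they implement."""
    log = []
    for m in managers:
        if hasattr(m, "pre_snapshot"):
            m.pre_snapshot(log)
    for m in managers:
        if hasattr(m, "pre_upload"):
            m.pre_upload(log)
    # the table lock can be dropped as soon as the snapshot exists
    for m in managers:
        if hasattr(m, "post_snapshot"):
            m.post_snapshot(log)
    log.append("upload backup")
    for m in managers:
        if hasattr(m, "post_upload"):
            m.post_upload(log)
    return log


steps = run_backup([DbLockManager(), SnapshotManager()])
print(steps)
```

Run as written, this reproduces exactly the sequence described at 16:05:12: lock tables, take snapshot, unlock tables, upload backup, release snapshot.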
16:11:20 I think I'll send the small update to lvm.py I have been working on for review, and then plan some deeper refactoring
16:11:32 Deal
16:11:33 so, let's move on. that's all from me.
16:11:52 if you don't have any questions ...
16:12:38 m3m0_
16:12:56 I've been fixing some minor bugs in the ui
16:13:11 especially around the backup_id for the local and ssh modes
16:13:20 the id is a string with slashes
16:13:39 so the urls using those slashes break the ui
16:13:43 but that is fixed now
16:13:52 and also the retrieval of the freezer-api endpoint from the keystone catalog, right ?
16:14:23 yes, that was another bug, the retrieval was incorrect
16:14:36 at least the logic of the function
16:14:42 but now it's fixed and merged
16:14:54 and I'm working on UX improvements
16:15:06 any plans for future improvements of the web-ui ?
16:15:09 but they are not going to get into master soon
16:15:29 yes, first of all, an overhaul of the readme
16:15:53 then a specific tab for actions
16:15:56 and another for clients
16:16:15 improved unit testing
16:16:26 and ui testing with selenium
16:16:46 improvements to the action and job modal windows
16:16:55 more visual clues for the user
16:17:02 and an overview page as well
16:17:10 so there is a lot to do for the ui
16:17:25 yes, the ui hasn't been high on the priority scale. but in the end it's the thing that the user sees and interacts with. it sometimes also helps the developers understand whether all they have been working on makes sense :)
16:17:26 but I have very limited time for freezer now
16:18:11 agree with that. for us it seems simple, but new people don't even know what a job means
16:18:27 well, at least in the context
16:18:41 of backup and restore
16:19:01 Memo Garcia proposed stackforge/freezer-web-ui: Fix minor bugs in freezer dashboard https://review.openstack.org/235478
16:19:04 you can work on it on the upcoming long and boring winter sundays :)
16:19:57 winter is coming
16:19:57 good.
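[Editor's note] The slashes-in-backup_id bug is the classic case for percent-encoding an identifier before embedding it in a URL path. A minimal sketch - the example id and the route are hypothetical, not the dashboard's actual values:

```python
try:
    from urllib.parse import quote, unquote   # Python 3
except ImportError:
    from urllib import quote, unquote         # Python 2

# ids for the local/ssh modes contain '/' (hypothetical example id)
backup_id = "ssh_host/var/backups/level_0"

# safe='' forces '/' to be encoded too, so the whole id stays a
# single path segment instead of being split by the URL router.
encoded = quote(backup_id, safe="")
url = "/freezer/backups/%s/" % encoded
print(url)  # /freezer/backups/ssh_host%2Fvar%2Fbackups%2Flevel_0/
print(unquote(encoded) == backup_id)  # True - round-trips losslessly
```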
anything else to say ?
16:20:05 yes, windows
16:20:32 I need to rebase the changes and make sure that I didn't break anything by adding support for the new code
16:20:41 and that's it
16:21:26 btw, when we rework the code for the snapshots we will need your support for the windows snapshots
16:21:50 of course
16:21:54 just let me know
16:22:30 definitely. we'll need you: days are getting shorter, and the nights are long and full of terrors
16:22:43 thanks m3m0
16:22:47 it's dangerous to go alone :)
16:22:53 you're welcome
16:23:42 szaher
16:23:50 Thanks vannif
16:23:55 I know you've been involved in other tasks
16:24:07 I have been working on using oslo.log; it now works fine with the freezer api
16:24:08 but you had also taken a look at oslo log and conf, right ?
16:24:21 is it in review ?
16:25:27 not yet
16:26:20 I am just trying to add all the files, then I will commit
16:26:51 I need to change all the freezer-api files to use oslo.log
16:27:01 and I will do that in one commit
16:27:21 after that I will move on to the freezer scheduler, then the agent
16:27:28 if you want some early advice, you can send it in for review and mark it as work in progress
16:27:48 just to start the discussion and allow others to see and give hints
16:28:20 Ok, will do that today
16:28:27 there is also a problem with pylint
16:30:06 I'll check out the patchset and have a look at it. so we'll see how to deal with it
16:30:17 thanks szaher
16:30:25 thanks vannif :)
16:31:05 I think that's all, since slashme hasn't been involved on the public side
16:31:11 anything else to say ?
16:31:30 anyone ?
16:36:39 ok. thanks all
16:36:49 #endmeeting
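[Editor's note] The freezer-api conversion to oslo.log discussed above normally follows the standard oslo boilerplate; a rough, untested sketch of that setup pattern (the project name string is an assumption, and real freezer code may wire this differently):

```python
from oslo_config import cfg
from oslo_log import log as logging

CONF = cfg.CONF
LOG = logging.getLogger(__name__)


def setup_logging():
    # Register --log-file, --debug, etc. on CONF *before* parsing
    # the command line, then hand CONF to oslo.log.
    logging.register_options(CONF)
    CONF([], project="freezer-api")
    logging.setup(CONF, "freezer-api")


setup_logging()
LOG.info("freezer-api logging initialised via oslo.log")
```

The same three calls (register_options, CONF parse, setup) would then be repeated in the scheduler and the agent, as planned in the meeting.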