15:02:13 #startmeeting manila
15:02:14 Meeting started Thu Jul 2 15:02:13 2015 UTC and is due to finish in 60 minutes. The chair is bswartz. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02:16 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:02:18 The meeting name has been set to 'manila'
15:02:21 #link https://review.openstack.org/#/c/193664/
15:02:32 I just got back from vacation
15:02:48 I do not know if anyone has taken a look at this patch I submitted 10 days ago
15:02:50 welcome back!
15:02:54 bswartz: thanks!
15:03:06 ganso_: half the manila core team is still on vacation until next week
15:03:18 bswartz: o_O
15:03:42 bswartz: what's the occasion? I did not know there was a planned vacation
15:03:47 this week there are holidays in the USA
15:03:56 bswartz: oh, 4th of July
15:04:11 bswartz: nice
15:04:25 so, my patch is mostly an example implementation
15:04:39 it looks like it has no reviews so far
15:04:40 so we can take a look at what effort is needed to implement something that can work for Share Migration
15:04:44 it's small though so I can take a quick look
15:05:00 also, there is this patch for the generic driver, based on the one I just posted
15:05:01 #link https://review.openstack.org/#/c/193667/
15:05:17 ganso_: is this a PoC, or something that you are planning to get merged?
15:05:18 I did not have time to fix the unit tests before leaving for vacation though
15:05:23 I believe that u_glide started working on a proposal for share instances (2 IDs) last week
15:05:48 looks like a PoC
15:05:50 xyang1: this is initially a PoC; if we end up agreeing that it looks good and the proposal solves our problems for Share Migration, we may improve and merge it
15:05:59 ok, thanks
15:06:12 I also just stumbled across another scenario where this patch may be useful
15:06:20 my mate over here is working on a driver
15:06:34 I believe that u_glide's share instance work will be how we will ultimately implement 2 IDs per share
15:06:52 ganso_: what is the other scenario?
15:07:00 bswartz: has u_glide started working on it?
15:07:13 hi, sorry I'm late
15:07:21 ganso_: yes, but he and vponomaryov are both on vacation this week too
15:07:32 while implementing the Manage feature, today he either has to rename the share, or use private driver storage
15:07:47 rename the share on the backend
15:08:04 okay
15:08:14 both of those approaches work today
15:08:18 with this patch I created, he may store the original share name (from the backend) in the driver_id column
15:08:31 gah!
15:08:39 no, I don't think that would work
15:08:59 because even if we have another ID column, it shouldn't be modifiable by the driver
15:09:33 in fact it would return a model update to the manager, which would edit that field when the Manage method is finished
15:09:37 bswartz: so we will be comparing u_glide's approach with ganso_'s and deciding which one to use in the end?
15:10:01 xyang1: possibly
15:10:26 xyang1: yes
15:10:27 the share instance approach is probably the best, but will require more effort
15:10:52 my approach requires little effort in core code, and little effort in driver code
15:11:00 however u_glide's approach will be a more general version of this
15:11:13 it's good that ganso_ has this proposal to start working from
15:12:17 to avoid confusion, I think both of you should reference each other's patches then. Or in the commit message, state that there's another approach being worked on
15:12:35 I really want to see migration happen, so anything that can unblock ganso's work is something I'll support
15:12:47 xyang1: +1, I can update my commit message
15:12:58 I marked ganso's patch WIP
15:13:31 that is better. thanks
15:13:36 I can continue work on share migration with my patch
15:13:48 like, experiment more with the copy command
15:13:52 etc
15:13:54 I hope that u_glide's patch will arrive soon and ganso_ will be able to rebase on top of that
15:14:10 but if it takes another week or 2, then ganso can continue working based on this
15:14:43 I'll review it but most likely -1 with a recommendation that we hold off
15:14:50 sound good?
15:14:52 bswartz: that would be great
15:15:15 bswartz: but I am not really worried about that right now, whether eventually we accept one or the other
15:15:39 markstur: the meeting today is short as most people are on vacation
15:15:51 markstur: (and there is no agenda)
15:15:55 bswartz: I am worried about polishing Share Migration, I wish more people would get involved with the implementation before merging the ID patch
15:15:57 bswartz, Yes. It's been quiet this week
15:16:19 such as this #link https://review.openstack.org/179790
15:16:22 ganso_: let me know what I can do to help
15:16:44 okay, I'll review this too
15:16:59 bswartz: I remember the last topic we discussed regarding Share Migration (not related to IDs) was about the methods the drivers would need to implement to provide the network path
15:17:11 if that approach is ok, then it is up for review
15:17:41 the copy command is a bit unstable, but I am focusing on that now
15:17:48 unstable how?
15:17:58 rsync -vza sometimes fails to copy some files
15:18:04 did you decide that cp -a doesn't work?
15:18:06 I/O error or something like that
15:18:13 oh hmm
15:18:22 I am thinking about using a python library that copies all the files
15:18:24 just some files, not all of them?
15:18:34 yea, just some files, randomly
15:18:42 I performed several tests, it fails randomly
15:18:50 that's really weird -- which backends did you try?
15:18:59 only generic for now
15:19:06 like, 10 files, sometimes it copies all of them, sometimes copies 8, etc
15:19:20 how are you generating the initial set of files?
15:19:34 the most dangerous thing is that, after migration, the original share is deleted, so files not copied are lost forever
15:19:49 yikes
15:19:53 rsync -vza on the share root is recursive
15:20:06 yeah, I think it's important that the copy command returns failure if it detects any error so the migration can be stopped
15:20:33 so currently I do not generate a list of files. When using the python library I will be able to list all files, exploring the directory tree, and also know the copy progress
15:20:35 that would at least prevent the original from getting deleted
15:20:49 rsync doesn't return an error status?
15:20:50 yea, but there are some scenarios that cannot be copied
15:20:54 such as the lost+found folder
15:21:04 I mean, I'm not sure if it cannot be copied
15:21:16 but it seems like it, I haven't looked much into this particular case
15:21:53 okay, well please make sure you check for errors and fail the migration if everything doesn't go perfectly
15:22:10 if manila sometimes deletes people's data during migration we won't be very popular
15:22:29 we'll have to triple check all that logic in the code review too
15:22:59 rsync does return an error code if anything fails, although it continues to copy everything; but for instance, the lost+found folder is always failing. There is a way to pass a list of folders/files to ignore as parameters also... but the copy must be stable, and currently it is not
15:23:31 I think I can understand lost+found
15:23:39 so, using this python library I expect to have much finer-grained control
15:24:00 I don't remember the name now, I have to look it up, but openstack already uses it
15:24:03 that's created by the filesystem -- it shouldn't be user-modifiable
15:24:41 yes, but we need to warn the user that the files in his lost+found folder are going to be lost on migration
15:24:41 okay
15:25:10 or we need to move them to a differently-named folder during the migration
15:25:24 or ask the user to delete them before migrating
15:25:40 yes
15:25:59 I can see some strange user experiences coming out of this
15:26:22 various limitations around data copying
15:26:45 we'll need to make sure they get documented
15:26:50 that's why I think we need to have Share Migration unblocked for review, test, and general use asap, it takes a lot of time to get it ready for primetime
15:27:09 yeah
15:27:24 okay, thanks for this update
15:27:34 anything else from you, ganso_?
15:27:42 ganso_, So you'd really like driver vendors to start using it now?
15:28:18 bswartz: not really, I haven't made progress on the minimum requirements documentation task, it is still in the etherpad
15:28:22 markstur: I think he means people should test it out and provide code reviews
15:28:35 markstur: if driver vendor maintainers can start testing it, that would be great
15:28:48 markstur: if they expect to have it supported in Liberty
15:28:50 OK. I was thinking about whether he's ready for early adopters for the PoC. He should be recruiting a little.
15:29:19 the best code review is: I downloaded and ran your code and it blows up in this corner case you missed
15:29:22 markstur: from what has been decided so far, they would have to implement a few functions here and there, the network path.
15:29:45 bswartz: +1
15:29:47 yea
15:30:18 since this ID patch came after it, I should edit the commit message and add more description, like: for testing the PoC, it is advisable to download the ID patch
15:30:27 hopefully everyone knows about git review -d for downloading patches
15:31:10 did anyone have another topic for today?
15:31:12 bswartz: pardon my ignorance, I am not familiar with it
15:31:14 2 things...
15:31:22 bswartz: what is -d for?
15:31:27 ganso_: try it out
15:31:33 bswartz: ok
15:31:34 git review -d
15:31:45 it pulls a change out of gerrit into your local workspace
15:31:59 oh, I always do git fetch and rebase
15:32:06 it will make my life easier
15:32:06 markstur: go ahead
15:32:09 1. notice the manila-coverage (non-voting) job on patch sets
15:32:17 very handy coverage report
15:32:32 you should run your own anyway, but I like this for reviews
15:32:53 Boris suggested a way to make it voting if coverage gets worse! But I don't know if we want that yet
15:33:12 I'd leave that up to reviewers for now
15:33:22 2. Some of our scheduler stats tests don't consistently pass. I should have a fix for that soon.
15:33:35 cool
15:33:38 was seeing a lot of false negatives
15:34:02 bswartz, cool that tests are random, or cool coverage report?
15:34:03 :)
15:34:19 markstur: this seems to be measuring coverage of the test files too though...
15:34:41 what is the point of code coverage in the test files?
15:34:42 bswartz, Yeah. That's odd, but I just look at the other pieces
15:35:03 the aggregate number at the top (89%) might be too high because of that though
15:35:04 Actually, I don't think it is intended, but it can show you if you managed to create tests that are not being run
15:35:22 Yep. I wouldn't want tests to be included in the aggregate.
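Keeping test files out of the aggregate number is typically done with coverage.py's `omit` option. The snippet below is an assumption about how such a job could be configured, not what the manila-coverage job actually does; the path pattern is illustrative.

```ini
# .coveragerc -- exclude unit test files from coverage measurement
[run]
omit =
    */tests/*
```

With this in place, the top-line percentage reflects only production code, avoiding the inflated aggregate noted in the discussion.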
15:35:50 it's interesting that some unit test files have very low coverage though
15:35:59 I guess those would be worth looking into
15:36:35 and I'm not opposed to a gate test that fails if it detects a decrease in coverage -- although it strikes me that that could be dangerous
15:36:47 we'd want to keep it non-voting for a while
15:36:54 bswartz, Sure. So I think one step at a time.
15:36:59 the mechanism for detecting a decrease could be similar to what pylint does
15:37:43 so the other thing, about the scheduler stats tests
15:37:48 can you provide a link?
15:38:06 I'll look, not sure how quickly.
15:38:21 But there is a stats dict compare that fails on the timestamp of stat changes.
15:38:28 we've had unstable tests in the past but I'm not aware of the problem you mentioned
15:38:37 And in my tests I also see set() used with non-hashables
15:39:00 did you file an LP bug?
15:39:12 not yet
15:39:29 maybe file a bug and assign it to yourself (assuming you plan to fix it) and capture the problem there
15:40:01 http://logs.openstack.org/56/197256/3/check/gate-manila-tempest-dsvm-neutron/55771f1/console.html
15:40:28 Yes. I'll do the LP bug. Should've done that right away. Then I'd find the link faster.
15:40:56 oh wait, is this using the assert_dict_match function?
15:40:58 I don't know why the set() not hashable was only found locally. So I'll make that separate.
15:41:21 that function has some weird logic in it to do "fuzzy" matches which I really don't like
15:41:39 assertDictEqual
15:42:07 that's the one
15:42:11 but the stats timestamp changes, so our tests tend to fail as things get slower
15:42:17 okay
15:42:23 yeah, that's bugworthy
15:42:50 if you don't plan to fix it then we need to find an assignee with bandwidth
15:43:13 I'll fix it right after the meeting
15:43:26 I think everyone will be back in their respective offices on Monday, except toabctl
15:43:33 thanks markstur
15:43:52 okay, if nobody has other topics I think we're done
15:44:04 thanks for pointing out the coverage reports, those are cool
15:44:22 next week we'll have a regular meeting with an agenda
15:44:40 #endmeeting
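The flaky comparison described near the end of the meeting (assertDictEqual tripping over a timestamp that changes between capture and check) can be avoided by excluding volatile keys before comparing. A rough Python sketch, not the actual fix markstur landed; the helper name and the `timestamp` key are assumptions:

```python
def assert_stats_match(expected, actual, ignored=('timestamp',)):
    """Hypothetical sketch: compare two scheduler stats dicts while
    ignoring keys (such as a timestamp) that legitimately change
    between when the stats are produced and when they are checked."""
    trimmed_expected = {k: v for k, v in expected.items() if k not in ignored}
    trimmed_actual = {k: v for k, v in actual.items() if k not in ignored}
    assert trimmed_expected == trimmed_actual, (
        "stats mismatch: %r != %r" % (trimmed_expected, trimmed_actual))
```

This makes the test insensitive to how slowly the gate runs, while still failing on any real difference in the remaining stats fields.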