14:03:22 #startmeeting glance
14:03:22 Meeting started Thu Jul 3 14:03:22 2014 UTC and is due to finish in 60 minutes. The chair is markwash. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:23 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:26 The meeting name has been set to 'glance'
14:03:28 Hi everybody
14:03:43 o/
14:03:45 onoes
14:03:53 we appear agendaless
14:04:24 let's cobble together a list of suggestions here in the first 5 minutes. I'll take notes on https://etherpad.openstack.org/p/glance-team-meeting-agenda
14:04:32 Are we really doing that well? ;)
14:05:17 I've got a couple of work-in-progress questions on artifacts, tried to get in touch with zhiyan about them but it seems like our working hours are out of sync
14:05:41 Can we do "Whack a bug Day"?
14:05:50 ativelkov: yes (sorry for that)
14:06:15 I've got a (public) security bug we should talk about
14:06:24 ativelkov: probably we can book a time for it, if you're ok
14:06:45 let's start with the security then
14:06:50 +1
14:07:06 https://bugs.launchpad.net/glance/+bug/1315321
14:07:07 Launchpad bug 1315321 in glance "image_size_cap not checked in v2" [Undecided,In progress]
14:08:18 there is a patch up for review but it's been self -1'd on the workflow
14:08:19 https://review.openstack.org/#/c/91764/
14:08:31 I think I noticed that the patch puts the enforcement in the domain model
14:08:39 well, in the *core* domain model
14:08:42 which is sort of a no-no
14:08:56 it should probably be in the authorization or policy layer
14:09:23 anybody want to volunteer to help us regain momentum on this?
14:09:25 yeah, that seems to check too late
14:10:14 * nikhil__ can help discuss, review, push smaller changes to the MP
14:10:56 is Thomas here today?
14:11:18 jokke_: and possibly do you work with him?
14:11:18 https://review.openstack.org/#/c/90726/ is for download, does it make sense to do it in authorization or policy layer as well?
14:11:22 markwash: I doubt ... I think he is touring UK with his band
14:11:27 haha awesome
14:11:30 As I am trying to learn how these layers work, I am interested in investigating. But, well, I am just getting familiar with the codebase, so I am probably not the best candidate for this task as it seems pretty urgent
14:12:15 zhiyan: yes I think so
14:12:50 the proxy layer is instantiated many times, so that check would be duplicated many times if there were several layers that didn't need to override the get_data method
14:12:52 markwash: api common seems like a good place to keep this, no? and enforcement can happen at the right layers?
14:13:12 I have quite a bit on my plate as well ... can't promise to be able to give the time needed to sort this out
14:13:51 let's talk for a quick little bit about those domain model layers I guess
14:14:12 so the idea is that if we put every cross cutting concern in every operation, we quickly arrive at an unmaintainable situation
14:14:21 with functions that are 100+ lines long
14:14:38 agreed
14:14:39 or even if they aren't, they are conceptually responsible for 100+ lines worth of functionality
14:14:50 you can basically see this at play in the v1 api controllers
14:15:07 each controller call is responsible for everything, doing the task, sending notifications, enforcing rules about the task, etc.
14:15:28 so for v2 to manage this we instead have proxy layers that are responsible for each cross-cutting concern, broadly
14:15:36 and they are built together like the layers of an onion
14:16:00 the deepest layer is the core domain model
14:16:15 which is basically just a definition of what kinds of nouns exist and what they can do
14:16:26 domain/__init__.py
14:16:30 the layers around that take care of data persistence for us
14:16:54 and then there are layers above that that do enforcement of policy, notifications, other authorization checks, etc
14:17:09 so what is the best layer to enforce filesize checks? The outermost?
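The onion of layers described above can be illustrated with a small Python sketch. This is not Glance's actual code: the class names (`InMemoryImageRepo`, `NotifyingImageRepo`, `PolicyImageRepo`), the action strings, and the `build_image_repo` helper are all hypothetical, chosen only to show one layer per cross-cutting concern wrapped around a core object that knows nothing about them.

```python
class Image:
    """Core domain object: only describes what an image is."""

    def __init__(self, image_id, size=0):
        self.image_id = image_id
        self.size = size


class InMemoryImageRepo:
    """Innermost repo layer: persistence only (a dict stands in for a DB)."""

    def __init__(self):
        self._images = {}

    def add(self, image):
        self._images[image.image_id] = image

    def get(self, image_id):
        return self._images[image_id]


class NotifyingImageRepo:
    """Middle layer: records a notification after a successful add."""

    def __init__(self, base, events):
        self.base = base
        self.events = events

    def add(self, image):
        self.base.add(image)
        self.events.append(("image.create", image.image_id))

    def get(self, image_id):
        return self.base.get(image_id)


class PolicyImageRepo:
    """Outer layer: refuses actions the caller is not allowed to perform."""

    def __init__(self, base, allowed_actions):
        self.base = base
        self.allowed_actions = allowed_actions

    def _enforce(self, action):
        if action not in self.allowed_actions:
            raise PermissionError(action)

    def add(self, image):
        self._enforce("add_image")
        self.base.add(image)

    def get(self, image_id):
        self._enforce("get_image")
        return self.base.get(image_id)


def build_image_repo(allowed_actions, events):
    """Compose the layers at runtime, innermost first."""
    repo = InMemoryImageRepo()
    repo = NotifyingImageRepo(repo, events)
    repo = PolicyImageRepo(repo, allowed_actions)
    return repo
```

Because the policy layer sits outermost in this sketch, a denied request never reaches the notification or persistence layers, which is the property the discussion is after when it asks where a check such as a size cap should live.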
14:17:17 o/
14:17:54 ativelkov: we have two layers that are kind of overlapping concerns, maybe that is a problem we should fix or at least acknowledge
14:18:11 but those two candidates are the authorization layer and the policy enforcement layer
14:18:21 Is there a specific reason why we are not using more decorators to do this layering? Just trying to get my head around the code, the layering makes it a bit messy tbh.
14:18:23 I think either of those is an acceptable answer
14:18:29 jokke_: yes, I'm glad you asked
14:18:49 jokke_: so python decorators carry a fundamental limitation, which is that they are statically composed
14:19:13 so it's not really possible, or at least you have to bend way over backwards, to make dynamic decisions about what decorators to apply
14:19:29 a good example of how ugly this can get is to look at the decorators used for locking in nova/compute/manager.py
14:19:40 there are functions that are literally
14:19:50 def foo(instance_id):
14:20:08     @decorator_that_depends_on_instance_id(instance_id)
14:20:16     def real_foo(instance_id):
14:20:21         # do stuff
14:20:28     return real_foo(instance_id)
14:20:36 ouch
14:20:52 markwash: btw you as an experienced developer, may i know who designed/developed the domain stuff? you? i'm just a little interested in it, it was already there when i joined the team.
14:21:07 yeah it's mine so I take all the blame
14:21:09 (maybe i need to check git)
14:21:21 cool. thanks
14:21:26 anything that is easy to understand about it, iccha did, haha :-)
14:21:39 :)
14:21:47 iccha~
14:22:03 so that's my justification for the layers approach
14:22:07 take it as you will :-)
14:22:08 understanding is easy, adapting is tough :P
14:22:15 markwash: I think that example you gave cannot be only a result of limitations of the decorators though ;)
14:22:50 Where is the "onion-like" structure of proxies constructed?
14:22:51 markwash: but could you open up a bit why we should be able to do dynamic assignments for those?
14:23:15 ativelkov: glance.gateway
14:23:28 ativelkov: https://github.com/openstack/glance/blob/master/glance/gateway.py
14:23:33 got it, thanks
14:24:15 jokke_: it tends to come up, I'm not sure i have a generic answer
14:24:49 k
14:24:53 markwash: iirc, you (and iccha) had an idea to update/change domain before, do you still like that?
14:25:07 wat!
14:25:21 zhiyan: yeah, in my opinion the domain core does not delegate responsibility correctly
14:25:55 markwash: ok, so we just need to do a little refactoring right? to fix core-domain
14:26:00 so the whole issues of persistence are done a little incorrectly
14:26:05 it is most painful with the glance.store layer
14:26:13 indeed!
14:26:17 well, I think there is another question here
14:26:24 but we are moving it out :)
14:26:30 which is basically--is this effort even sustainable
14:26:46 * nikhil__ agrees
14:26:46 with the kind of development that our opensource community can offer
14:26:51 and in python (without strong types)
14:27:05 well, or
14:27:15 which effort is least unsustainable
14:27:21 haha
14:27:29 hehe
14:27:42 another option would be to consider the nova objects type approach
14:28:09 jokke_: one area where this layering approach really could deliver wins for us, that it doesn't today, is with constructing db queries
14:28:29 jokke_: for background, right now there is way way way too much business logic in the db driver
14:28:45 basically all of the authz visibility restriction decisions happen there
14:28:53 markwash: ok, yeah I kind of noticed when I was poking it for some reason
14:29:13 since we have the image_repo object
14:29:30 it would actually be possible to say "image_repo.get_image_repo_restricted_by_filter(filter)"
14:29:51 so we could say "image_repo = image_repo.restrict_to_tenant(t)"
14:30:05 and pass that image repo further up to users
14:30:25 so this repo will have "fixed" filters which are applied to all other requests?
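The `restrict_to_tenant` idea floated above can be sketched as follows. Only the method name `restrict_to_tenant` comes from the discussion; the `ImageRepo` class, its `restrict` and `list` methods, and the dict-based images are assumptions made for illustration, not Glance's repo API.

```python
class ImageRepo:
    """Toy repo over a list of image dicts, with optional fixed filters."""

    def __init__(self, images, fixed_filters=None):
        self._images = images
        self._fixed = dict(fixed_filters or {})

    def restrict(self, **filters):
        """Return a new repo whose fixed filters also include `filters`."""
        merged = dict(self._fixed)
        merged.update(filters)
        return ImageRepo(self._images, merged)

    def restrict_to_tenant(self, tenant_id):
        # Hypothetical convenience wrapper: pin the owner field.
        return self.restrict(owner=tenant_id)

    def list(self, **filters):
        """List images matching the caller's filters and the fixed ones."""
        wanted = dict(filters)
        wanted.update(self._fixed)  # fixed filters override caller input
        return [
            image for image in self._images
            if all(image.get(key) == value for key, value in wanted.items())
        ]
```

A layer that holds the restricted repo can hand it further up without the upper code being able to widen the query: in this sketch, `repo.restrict_to_tenant("t1").list(owner="t2")` still returns only images owned by `t1`.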
14:30:26 so that the business logic for restriction lives only in the authz layer
14:30:34 ativelkov: effectively yes
14:30:49 markwash: I think my biggest fear with this layered model everywhere approach is when we get the artifacts coming in. The v2 code is not the easiest to follow already now (especially if you want to take registry in the game)
14:31:08 That's why I am trying to learn it well now
14:31:25 Don't want to blindly copy anything without deep understanding
14:32:06 after some time you get used to it :)
14:32:11 so there is a definite tradeoff in this approach on understandability vs. maintainability
14:32:35 I personally found it way too easy to make bugs in the "big transaction script" model that is in nova and in glance v1
14:33:02 whereas in v2 images there is a big learning curve (bigger than necessary as well)
14:33:08 oh!
14:33:08 markwash: yes, I agree that it is not a maintainable approach either
14:33:12 one other super important thing
14:33:16 unit testing
14:33:30 Maybe we just need some document which will describe all these layers nicely?
14:33:39 I think it also probably needs restructuring
14:33:52 since I wrote most of it I've learned how confusing my brain is for others :-)
14:34:23 literally just laying out the files and modules in a more straightforward manner and including doc strings would be a huge help I think
14:34:35 +1
14:34:53 think that the trick is in the domain.proxy module. It's the most confusing and yet most powerful logic.
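For the domain.proxy module mentioned here, the basic idea of a pass-through base class can be sketched in a few lines. This is an illustration under assumptions, not the real glance/domain/proxy.py: `DictTaskRepo`, `TaskRepoProxy`, `RedactingTaskRepo`, and the dict-shaped tasks are hypothetical names and shapes.

```python
class DictTaskRepo:
    """Innermost layer: stores task dicts keyed by their id."""

    def __init__(self):
        self._tasks = {}

    def add(self, task):
        self._tasks[task["id"]] = task

    def get(self, task_id):
        return self._tasks[task_id]

    def remove(self, task_id):
        del self._tasks[task_id]


class TaskRepoProxy:
    """Pass-through layer: every method forwards to the layer below."""

    def __init__(self, base):
        self.base = base

    def add(self, task):
        return self.base.add(task)

    def get(self, task_id):
        return self.base.get(task_id)

    def remove(self, task_id):
        return self.base.remove(task_id)


class RedactingTaskRepo(TaskRepoProxy):
    """A layer that only changes how tasks are returned."""

    def get(self, task_id):
        task = self.base.get(task_id)
        # Return a copy with the message hidden; the stored task is untouched.
        return dict(task, message="<hidden>")
```

`RedactingTaskRepo` overrides a single method and inherits the forwarding versions of `add` and `remove`, so a layer only has to spell out the part of the surface it actually changes.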
14:35:09 nikhil__: +1
14:35:18 for others, the idea behind domain.proxy is this
14:35:22 if you get a good understanding of what that does, rest follows naturally
14:35:24 we create a lot of layers
14:35:36 each layer has something like 4 or more objects in it
14:35:41 with 4-5 methods each
14:35:46 so it's a big surface
14:35:56 any individual layer probably doesn't need to modify the whole surface
14:36:01 so there are a lot of "noop" methods
14:36:08 yes, he
14:36:27 so the idea of the proxy is to make helpers so you don't have to bother even writing those noop methods in your layer
14:36:34 if you inherit from the proxy
14:37:07 e.g. https://github.com/openstack/glance/blob/master/glance/domain/proxy.py#L47
14:37:17 if your layer just needs to modify how tasks are returned
14:37:31 you can just inherit glance.domain.proxy.TaskRepo
14:37:39 and only reimplement the "get" method
14:37:45 self.base is the underlying layer?
14:37:48 silly example I guess
14:37:49 yes
14:37:56 got it
14:38:10 oh I guess proxy helpers are kind of hard to understand as well
14:38:18 gosh that's a bad name for that object
14:38:19 haha
14:38:35 the proxy helper can do some simple wrapping and unwrapping of objects for you
14:38:53 So a wrapped image repo can easily return a wrapped image
14:39:04 okay, so a document somewhere to help out with this stuff
14:39:07 that sounds like a good action item
14:39:21 +1
14:39:22 is there any chance people can start an etherpad of "domain model WTFs"
14:39:23 +1
14:39:30 :-)
14:39:44 and I and others can try to use that to motivate the documentation best?
14:39:55 markwash: a couple of glancers had that initiated a bit ago
14:39:57 I think once the hardest-to-understand part is clear-ish, it starts to fall into place
14:40:11 markwash: this short explanation helped already a lot, but wouldn't really mind seeing a doc explaining it somewhere
14:40:16 (it's in the January mini-summit etherpad plan, guess)
14:40:26 nikhil__: thanks! I'll look there
14:40:42 okay, sorry i took so much of the time with my explanations though!
14:40:55 #topic artifacts spec
14:41:11 ativelkov: I'm sorry I have not checked back in since my review last friday
14:41:50 I fixed a couple of things since that
14:42:14 I have a question: do we need to completely design the JSON schema for all the APIs?
14:42:26 As it turns out to require too much time
14:43:21 my initial reaction is, if it's not essential I'd rather make it priority 2
14:43:41 or are you talking about for the spec itself?
14:43:50 yes, the spec
14:44:08 As it requires including the schema for each call
14:44:24 But I would prefer to already start implementation - we may update the spec as it goes
14:44:31 arnaud__: any thoughts/concerns on rolling forward and letting the schemas show up in the code submissions?
14:44:39 each call? seems it will need more overhead on perf
14:44:41 sounds good
14:45:18 ativelkov, I think that's a good plan
14:45:25 ativelkov: I think you get a pass on this one. . schemas now might not even be correct if they're written too far ahead of the implementation
14:46:08 or trying to follow them might complicate the code unnecessarily
14:46:15 Good, then I'll publish the API endpoints with description - and will proceed to the code
14:46:42 jokke_: heh good point!
14:46:56 ativelkov: regarding dynamic references
14:47:00 markwash: been there, seen that too many times
14:47:32 ativelkov: I think that early on, the heat use case will need these a lot more than they need fixed references
14:47:49 but we have a real mismatch in terms of priority and resources available in my opinion
14:47:58 so my proposed solution is
14:48:05 I was sure we've agreed that we do statics only
14:48:09 At the summit, I mean
14:48:12 what if we let dynamic references start out as a plugin-specific behavior
14:48:18 ativelkov: we did, I agree
14:48:39 I just got some feedback from randallburt that made me worry a bit
14:49:23 I may think about how to make them plugin-specific
14:49:58 speaking about resources - I may try asking Mirantis to allocate one more engineer for this task.
14:50:06 ativelkov: okay, sure thing. . let me know what you think. . my goal is to figure out a way for it to happen while still acknowledging it's out of scope for what you are working on
14:50:27 markwash: for this meeting at some point - I've a quick question about glance tests and integration tests (completely forgot about it until now)
14:50:35 #topic bug day
14:50:36 (fyi)
14:51:09 Yeah ... I just checked, we have 256 open bugs of which 98 are still in state New
14:52:10 my memory is bad, but it seems like we missed our post-summit bug day plan
14:52:16 arnaud__: ^^ am I remembering correctly?
14:52:32 :)
14:52:45 markwash: That's why I brought it up as I did not see any discussion about it since ;)
14:52:52 unfortunately
14:52:57 what is the timeframe that looks best for people?
14:52:59 it's good that you bring it up again
14:53:13 next week? right before summit? august?
14:53:35 I'm hoping to have a bit calmer next couple of weeks
14:53:52 here here
14:53:56 august for us at rackspace
14:54:03 haha, though I think it's supposed to be "hear hear"
14:54:16 this month would be very busy :/
14:54:21 I'm also hoping to be on holidays quite a bit of Aug :P
14:54:22 let's start an etherpad with people's preferences
14:54:28 +1
14:54:33 please post link
14:54:35 yep
14:54:50 #topic performance v2
14:54:54 * markwash cringes
14:55:02 #link https://etherpad.openstack.org/p/glance-bug-day
14:55:13 can we just make a remark that it's really bad ;)
14:55:29 yeah
14:55:37 so there is supposed to be some stuff to help us profile it
14:55:39 i will look into it in glanceclient tomorrow, my time
14:55:43 coming from rally / osprofiler
14:56:35 anyone have any ideas what is the problem? it's bad enough that any "fix" will be evident, we don't need a special measurement framework
14:57:10 no?
14:57:14 le sigh
14:57:15 haven't heard too many bad things about it yet
14:57:16 markwash: not yet ... I was hoping to be able to look into it from next week onwards when I get my current urgent task off my hands
14:57:28 jokke_: okay, check back in with us on that
14:57:34 we need to let nova use v2 to truly understand the limitations
14:57:54 oh but please let's fix the performance limitations before we ask nova to feel the pain
14:58:00 nikhil__: I give you one ... I have a test system with some 30k images ... image-list on v1 took less than 2min, v2 over 8
14:58:03 :)
14:58:09 at least the multi-second list queries :-)
14:58:28 i listed my investigation result in the mail, i think the issue is on the client side
14:58:30 markwash: ohk, we can try to do some profiling for our public images service
14:58:37 jokke_: ^
14:58:40 zhiyan: ah, okay great, thanks
14:58:53 http://lists.openstack.org/pipermail/openstack/2014-July/008237.html
14:59:14 yeah, that seems most likely the case as testing here at the rack did not reveal a significant hit on perf
14:59:18 so let's take it as an item for next week to see if we've learned anything convincingly
14:59:26 nikhil__: that is good news
14:59:45 markwash: should we talk about the tempest question in glance channel?
14:59:53 <10secs
15:00:06 and separated sample config file generation, if you're ok
15:00:19 sure, let's move over, but I have to run very soon
15:00:28 cd ../openstack-glance
15:00:30 #endmeeting