15:02:35 #startmeeting storyboard
15:02:35 Meeting started Mon Apr 7 15:02:35 2014 UTC and is due to finish in 60 minutes. The chair is krotscheck. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02:36 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:02:39 The meeting name has been set to 'storyboard'
15:02:44 o/
15:02:51 #topic MVP Status
15:02:58 morning all
15:03:02 Ok, so what’s outstanding?
15:03:33 I did a strawpoll in #storyboard earlier, current consensus seems to be that MVP is complete.
15:03:48 o/
15:04:00 I think I still need to go clean out some of the extra data from the current install - and then I think per last week's infra meeting we're going to start using it to track zuul and nodepool
15:04:08 agreed, that's why I added an MVP 1.01 topic for end of meeting :)
15:04:14 Got it, so MVP is done.
15:04:15 as a first step
15:04:20 * krotscheck pops the champagne
15:04:33 #topic easier access to logs
15:04:42 krotscheck: thanks for keeping us all honest and going forward :)
15:04:49 the task statuses aren't what we discussed at the sprint, but i assume that's an easy patch
15:04:54 cody-somerville: Is this a zombie topic from last week?
15:05:14 jeblair: Yes, that can be done quickly. Submit a story?
15:05:34 zombie topic from last week, yes. sorry.
15:05:38 Got it
15:05:46 #topic ux labs /resources
15:06:11 Personal question for me: Other than redhat, what UX labs / usability resources do we have available other than “make grumpy engineers use it”?
15:06:39 I think "make grumpy engineers use it" is what we have for now
15:06:56 Ok, that’s from HP. I can occasionally get some feedback from jcoufal, anything from Mirantis?
15:07:04 grumpy engineers are like 80% of our target population anyway
15:07:05 there's a UX dude at HP we can talk to to see if he'll do some UX studies on it (he ran some on horizon recently)
15:07:24 Alright, new user profile: Grumpy Engineer.
15:07:32 mordred: Send me an intro?
15:07:40 krotscheck: I shall
15:07:42 Good
15:07:46 yeah, it's piet kruithof
15:07:53 he might help with usability testing
15:07:59 #action mordred Send krotscheck intro to piet kruithof
15:08:03 we actually identified two profiles. Linux grumpy engineer and MacOs hipster dev
15:08:06 + grumpy release manager profile
15:08:14 :-P
15:08:16 Hipster?
15:08:24 the second profile likes stuff that resembles github
15:08:32 by principle
15:08:36 mordred: I think… I don’t know what I think about being called a hipster :)
15:08:41 ANYWAY
15:08:55 #topic design discussion: Search & DB support
15:09:07 krotscheck: i was not thinking about you
15:09:09 So, our list views right now fail hard.
15:09:21 As does any project dropdown.
15:09:41 Now, for the time being we can do a cached typeahead, however that’s not going to scale.
15:09:46 krotscheck: i suggested typing URLs by hand but somehow you didn't like it
15:09:52 (cached typeahead: Load all results, filter them in browser memory)
15:10:04 ttx: That’s orthogonal :)
15:10:29 ttx: That’s more of a “how to find a specific project” issue rather than “find all tasks like X” issue
15:10:39 krotscheck: but if the query fails on the server side, how would filtering help inside a browser?
15:11:01 krotscheck: list views fail ? or the project list view fails ?
15:11:33 NikitaKonovalov: The UX for actually finding a specific project, task, or story is bad. The sort order is wrong, items are shown that are closed, there’s no way to search by text, etc.
15:11:35 krotscheck: so - we probably want some amount of both things, right?
15:11:43 mordred: Yes.
15:11:50 krotscheck: I think users shall be able to subscribe to projects (and/or project groups) and get them in a default view
15:12:21 ttx: that still doesn't help when I want to find something for a project I'm not subscribed to - which i do in gerrit more than neer
15:12:25 they should almost never search for them
15:12:28 ttx: True. Difficult to do if you can’t find what you’re looking for in the first place.
15:13:03 krotscheck: it's difficult but not impossible. And with subscription it's a one-time pain
15:13:10 So the actual question I have is this: SQLAlchemy does _not_ have a good fulltext abstraction.
15:13:18 That’s more or less required for search
15:13:20 (agree it can use improvement)
15:13:25 (rather than browse. Browse is well understood)
15:13:57 * ttx uses CTRL-F in project list window
15:14:04 Should we A) pick either postgres or mysql, and use their own fulltext search, or B) take what ruhe suggested and get elastic search up and running?
15:14:30 Thoughts?
15:14:45 well, mysql would be fine for searching in 'name' fields
15:15:18 we may even let it fetch all the data (projects) and then filter in db_api
15:15:19 mordred: i think there are other full-text options for mysql, yeah?
15:15:23 NikitaKonovalov: Yes. What about description fields?
15:15:39 door number 3
15:15:41 Or ‘name’ and ‘description’ fields.
15:15:50 i'm pretty sure we would like to search across all fields, including descriptions and comments
15:15:52 jeblair: there is mysql sphinx
15:15:56 we should stay with mysql - and we should use sphinxsearch with it
15:16:01 krotscheck: no difference for those i guess
15:16:04 ruhe: that's what i was thinking of...
15:16:34 mordred: jeblair: That requires ditching postgres altogether, yes?
15:17:40 krotscheck: i believe so. i don't have a problem with that.
15:17:58 Alrightey, let’s vote on it.
15:18:02 fungi: you've poked at our elasticsearch cluster a bit more than i have...
15:18:14 * krotscheck assumes that he can figure out the vote syntax.
15:18:20 postgres is always several steps behind. most OpenStack projects don't have indexes configured for it
15:18:32 fungi: my thoughts are that the infrastructure and maintenance burden is too big for a problem like this
15:18:41 #vote Deprecate all secondary database support in favor of one.
15:18:48 yeah, i think the search limitations we have are imposed by us to avoid dos'ing the cluster currently
15:18:55 I'm not sure whether sphinx can work with postgres - but I have to be honest that I do not care about our postgres support personally
15:18:59 perhaps that's just because i've only seen it used in a frighteningly busy system
15:19:06 #startvote Deprecate all secondary database support in favor of one.
15:19:06 Unable to parse vote topic and options.
15:19:10 Dammit
15:19:14 Someone else do that
15:19:24 krotscheck: http://ci.openstack.org/irc.html#voting
15:19:32 krotscheck: only the meeting chair can :)
15:19:33 the quantity of data in storyboard is likely to never pose a problem for elasticsearch substring matching, by comparison with the volume of log data we're currently stashing in it
15:19:34 Oh
15:19:52 : #startvote Deprecate all secondary database support in favor of only one? Yes, No, Maybe
15:20:16 fungi: right, but are we going to need to run another 'cluster'? what's the minimal cluster look like?
15:20:48 krotscheck: once more without the ': '
15:21:03 oh, yeah from a sizing standpoint i have no idea, but am pretty sure we can run a one-node cluster and not do any sharding initially
15:21:06 jeblair: There isn’t one....
15:21:31 since that's what clarkb set up originally before we grew it to cope with the volume we have
15:21:54 #startvote Deprecate all secondary database support in favor of only one?
15:21:55 Begin voting on: Deprecate all secondary database support in favor of only one? Valid vote options are Yes, No.
15:21:56 Vote using '#vote OPTION'. Only your last vote counts.
15:22:10 #vote Yes
15:22:11 #vote Yes
15:22:13 i expect we could probably reuse the existing cluster instead if we want, though right now it's already at capacity so we might need to expand it further
15:22:16 #vote Yes
15:22:19 #vote Yes
15:22:23 #vote Abstain
15:22:30 :)
15:22:35 #endvote
15:22:36 Voted on "Deprecate all secondary database support in favor of only one?" Results are
15:22:52 Ok, so next issue: Which database to pick.
15:22:55 it might be reasonable to conduct a study of mysql+sphinx+sqlalchemy before the decision is made to build on top of these technologies
15:23:04 fungi: i'd rather not; i think storyboard search uptime is far more important than test log uptime
15:23:18 #vote Yes
15:23:20 fungi: so we wouldn't be able to take it offline for, say, a day in order to rebuild indexes anymore
15:23:25 sorry (got distracted)
15:23:38 Ok, ruhe: Do you have time to do this study?
15:23:59 jeblair: agreed. we could pilot with a one-node shardless cluster initially and then go to two or three with shards if we wanted better performance/capacity (if it ever became necessary)
15:24:15 might even be able to just stick the elasticsearch and storyboard instances on the same vm for better performance
15:24:47 #startvote Pick a database? mysql, postgres, sqlite
15:24:48 Begin voting on: Pick a database? Valid vote options are mysql, postgres, sqlite.
15:24:49 Vote using '#vote OPTION'. Only your last vote counts.
15:24:52 i'd want to pass this by clarkb later though since he very well may be able to point out that i'm being an idiot
15:24:53 is it hard to run elastic search locally?
15:24:59 #vote mysql
15:25:14 fungi, krotscheck: i agree, we should ask clarkb to weigh in on this; he's our elasticsearch expert
15:25:19 krotscheck: i do, if time is equal to 10 days. but i'd prefer ES instead of sphinx (maybe because i have osx and i'm kinda hipster)
15:25:21 #vote mysql
15:25:22 #vote mysql
15:25:28 #vote mysql
15:25:28 #vote mysql
15:25:35 #endvote
15:25:36 Voted on "Pick a database?" Results are
15:25:37 mysql (5): mordred, NikitaKonovalov, ttx, jeblair, ruhe
15:25:50 Ok, we’re using mysql, end of story
15:26:08 now we could dramatically simplify our initial migrations
15:26:16 Last question: We need search, we’ve got two options on the table - sphinx and elasticsearch. I do not know enough about either of these to make a decision, who wants to be our expert?
15:26:33 ttx just volunteered to fix our migrations
15:26:34 ttx: you mean squash all the existing migrations into a single one?
15:26:52 Or is there a third option? :)
15:26:53 ruhe: rather than keep the postgresql compat cruft yes
15:27:11 but then I don't know how the prod system would like that
15:27:58 #action ttx File a story in storyboard to get the postgres cruft out of our database migrations.
15:28:02 krotscheck: i think we should get clarkb and mordred, at least, into a conversation, as ES and sphinx experts respectively
15:28:10 ttx: they manage it somehow in giant projects like Nova. It should be a piece of cake in StoryBoard :)
15:28:26 * mordred will ping the guys at sphinxsearch and see if they're interested in helping
15:28:50 jeblair: You got it. I’ll grab clark and mordred offline and get a better sense of the benefits/tradeoffs.
15:28:52 mordred: what does that look like on the sysadmin side?
15:28:52 * ttx is multitasking on 3 cores
15:29:07 krotscheck: feel free to do it online. :)
15:29:15 #action krotscheck extract mordred and clarkb’s brain segments on elastic search vis-a-vis sphinx.
15:29:29 jeblair: Yeah, but that discussion does not have to happen in this meeting :)
15:29:42 krotscheck: yep
15:30:02 #topic FK or no FK
15:30:06 I’m bringing both FK and soft delete back up because I felt the discussion 2 weeks ago was prematurely cut off.
15:30:15 With regards to foreign keys, I propose keeping them. The argument _for_ is all about referential integrity, which is well understood and comes with clear benefits.
15:30:23 The argument _against_ is ‘performance’ and ‘code duplication’. Only the latter is understood as a benefit, while the former (performance) is a complete unknown vague promisey thing which nobody’s been able to put numbers around, and given our scale, isn’t even necessary yet.
15:30:32 Throwing away referential integrity in favor of something whose benefits are poorly explored because of optimizations we don’t yet need feels premature to me.
15:30:46 If foreign keys become a performance constraint, we can revisit and explore this. Until then I propose that we stick with FK’s.
15:31:39 Basically: I think referential integrity is a better benefit than code duplication, because we really should be testing the latter.
15:31:44 krotscheck: i feel you may have misapprehended my arguments for dropping FK support
15:31:56 jeblair: Oh?
15:33:08 krotscheck: i'm primarily interested in dropping it because of the complications introduced by defining the same relationships in multiple places
15:33:26 the multiple places being both the orm and the database
15:34:04 that is error prone and one can end up with problems resulting from mismatches there, as we've already seen
15:34:16 jeblair: The only place where I’ve seen that become an issue is in cross-database situations, are there others?
15:34:58 My hypothesis is that now that we’re on mysql only, most of those issues won’t crop up anymore.
15:35:03 btw, will it be valid if we set up constraints in migrations, but do not tell sqlalchemy anything?
15:35:15 krotscheck: i'm pretty sure it's come up in earlier problems with storyboard's migrations; also, i believe the current unit tests have FK errors
15:35:32 NikitaKonovalov: i think that's backwards -- sqlalchemy needs to know the constraints, the database does not
15:35:45 it's redundant to have two different systems checking those constraints
15:35:46 NikitaKonovalov: I want to do the opposite
15:35:49 yeah - what NikitaKonovalov said
15:35:51 jeblair: It sounds like we can test for the kinds of errors you’re concerned about, is that correct?
15:36:02 the important part is that the orm layer knows about them
15:36:11 that's VERY important
15:36:42 Does SQLAlchemy keep an in-memory running cache of queried records, or does it handle things on a per-request basis?
15:37:15 krotscheck: it handles sessions
15:37:24 krotscheck: during a session, records are cached; in our case a session is probably a single http request
15:38:23 * mordred has to drop off
15:38:53 Got it. I get that the ORM needs to know about relationships to simplify reading records and related records. The DB-side FK constraint to me is most useful when _deleting_ records.
15:39:07 krotscheck: the orm is responsible for that too
15:39:16 don't FKs make migrations slightly more complex/costly, which increase the downtime windows ?
15:39:38 ttx: Again, i think it’s testable.
15:40:05 I think the only valid argument FOR FKs is that the data will outlive the app. Which frankly, I doubt.
15:40:31 Ok, any more arguments for/against? I’d like to move to vote.
15:40:34 I've seen data migrated away from systems well before said systems are replaced by some other on top of same db
15:41:04 i.e. when someone replaces your system, it migrates data to new db. It doesn't rebuild on top of same db
15:41:17 #startvote Should we remove Foreign Keys and keep all relationship definitions in the ORM? Yes, No
15:41:18 Begin voting on: Should we remove Foreign Keys and keep all relationship definitions in the ORM? Valid vote options are Yes, No.
15:41:19 Vote using '#vote OPTION'. Only your last vote counts.
15:41:31 #vote Yes
15:41:31 #vote No
15:41:41 to be fair, the main advocate for FKs is not present
15:41:52 I think we can assume he’s voting no.
15:41:56 #vote Yes
15:41:57 So we can adjust the vote.
15:42:02 we need a 'don't know' option
15:42:03 krotscheck: that's fair
15:42:17 and that mordred votes Yes, I guess
15:42:22 i abstain
15:42:25 mordred: ^^ vote
15:42:36 krotscheck: mordred said he had to leave
15:42:38 kk
15:42:43 yes he dropped off
15:42:44 #endvote
15:42:45 Voted on "Should we remove Foreign Keys and keep all relationship definitions in the ORM?" Results are
15:42:46 Yes (2): ttx, jeblair
15:42:47 No (1): krotscheck
15:43:04 Ok, making the assumption that mordred is voting yes, and cody-somerville is voting no, that leaves it with “ORM keys” as the winner
15:43:08 Any disagreements/
15:43:09 ?
15:43:23 kk
15:43:30 #topic Soft Delete
15:43:33 krotscheck: I suspect the discussion will be reborn in the review, but at least that's a decision
15:43:39 Indeed
15:43:52 someone has to care enough to remove them
15:43:54 yeah, i think seeing some of this in practice will help
15:43:57 The arguments for keeping it are all about data retention for reporting, while the arguments against are performance based.
15:44:00 jeblair: I agree
15:44:07 Sorry, soft delete
15:44:16 Since we don’t know whether we need the performance boost, and the argument for keeping it _requires_ data history, I suggest that we don’t pull the rug out from underneath the opposing argument.
15:44:36 I'd suggest we split objects to soft-deletable(stories, comments, ...) and tokens
15:44:47 NikitaKonovalov: Yes, you’re correct.
15:45:07 things like tokens will populate the base pretty quickly, so let's remove them for real
15:45:07 Security bits are definitely things we want to make disappear when no longer used.
15:45:24 could you elaborate on "data retention for reporting" ?
15:45:29 krotscheck: i think this is going to be moderately hard to change later; i'm not normally in favor of premature optimization, but i don't think this is premature
15:46:08 to be clear, i am opposed to soft-delete for most things (especially comments, tasks, etc)
15:46:08 ttx: Number of stories closed/deleted over X time.
15:46:18 krotscheck: closed != deleted
15:46:27 jeblair: To be clear, I’m opposed to delete in most things :)
15:46:35 (soft or no)
15:46:40 I don't think we should be able to delete stories
15:46:48 ttx: i think i am in agreement
15:46:55 except in very very corner case
15:47:07 we shouldn't be able to delete comments either
15:47:12 ttx: we may need a superuser level exception for abuse/dmca, etc
15:47:15 (as in "the general public")
15:47:22 jeblair: right
15:47:27 that's the very corner case
15:47:39 basically the only thing i would be able to softdelete would be projects
15:47:53 tasks ? just recreate them
15:47:53 jeblair: Can you go into more details on how removing soft-delete in the future would be a problem?
15:48:04 comments ? just don't allow deletion
15:48:09 ttx: i can probably get behind not being able to delete comments (again, except for superuser)
15:48:36 stories is a bit in-between, i would let stories with no tasks fade from view
15:49:14 even for admin extraordinary situations, delete is probably actually "hide"
15:49:25 fungi: that's soft-delete
15:49:56 We’ve got 10 minutes and a lot more to go through, does anyone object to going to a vote given the several weeks of discussion we’ve had on this?
15:50:32 #startvote Should we support soft-delete? Yes, No, Case-by-case
15:50:33 Begin voting on: Should we support soft-delete? Valid vote options are Yes, No, Case-by-case.
15:50:34 Vote using '#vote OPTION'. Only your last vote counts.
15:50:39 in short, i think we should avoid soft-delete because there are only a few things we actually _want_ users to be able to delete, and they are not worth keeping around in the db
15:50:42 I think there are plenty of things we shouldn't be able to delete, and plenty of things where deletion should be final. Not so sure there is a large need/space for soft-deleted stuff
15:51:06 Hrm.
15:51:17 Ok, so feels to me like voting doesn’t give us a decent enough choice
15:51:20 #endvote
15:51:21 Voted on "Should we support soft-delete?" Results are
15:51:33 krotscheck: i'm not sure about those options -- i basically feel it should be case-by-case, but we should lean towards not using it.
15:51:34 could we have one example of stuff you would like to be able to soft-delete ?
15:52:03 ttx: I could probably come up with use cases for all records. Users, for instance.
15:52:13 we should have 3 buckets (non-deleteable, hard-delete, soft-delete) and put various objects into them
15:52:13 Though in practice users are active/inactive
15:52:24 if a bucket is near empty it might make sense to stop supporting that case
15:52:37 Alright, let’s try a quick straw poll on each current record type.
15:52:42 krotscheck: agreed, i don't think we'd want to delete users, just make them unable to log in
15:52:44 Users: Soft delete or hard?
15:52:50 Wait
15:52:55 krotscheck: users -> no delete
15:53:03 Users: soft delete, hard delete, or other state ( banned/blocked/active)
15:53:09 I’m for ‘other state'
15:53:16 'other state'
15:53:16 krotscheck: users -> other state
15:53:22 Projects?
15:53:44 i'm having trouble with this one
15:53:45 I think that would be other state too
15:53:49 (This one I’m worried about because of cascading references.
15:53:55 a project existed at some point and got shelved
15:54:00 (What if we have 10000 tasks assigned to a project?)
15:54:01 we need to keep cascading refs
15:54:10 so I wouldn't delete it
15:54:18 I would mark it obsolete or something
15:54:21 Feels like ‘other state'
15:54:23 projects -> no delete from me
15:54:25 yeah
15:54:31 so the difference between soft delete and other state here...
15:54:38 is whether random users would still be able to access it?
15:54:50 Is that the other state actually has a specified UI purpose, and makes a record discoverable, rather than creating a zombie in the database.
15:54:58 soft delete == only admin can recover it, other state == maybe users can find it if they look hard?
15:55:02 jeblair: imho it's different because it's still accessible by users
15:55:14 right, if they look hard enough.
15:55:16 jeblair: Right - A deprecated project is still available and accessible.
15:55:24 Though perhaps a banned user...?
15:55:33 okay, then i think our use cases would tend toward 'other state' for projects
15:55:41 I don’t think we want to get into the legal quandary of publicly shaming someone
15:55:45 krotscheck: 'other state' is like a specific soft-delete
15:55:51 Righto
15:55:56 Tasks?
15:56:01 (a column in the db)
15:56:05 tasks -> hard delete
15:56:06 I say we just delete them.
15:56:09 tasks: hard delete
15:56:15 Ok. Stories?
15:56:20 hard delete for tasks
15:56:20 (Stories without tasks?
15:56:37 Stories is a bit harder. I would keep them around as empty or completed
15:56:48 don't let end users delete them
15:56:51 stories -> no delete (admin can hard delete)
15:56:55 same for comments
15:57:13 I hear an awful lot of consensus here.
15:57:13 fwiw in LP you can't delete a bug
15:57:18 ttx: that brings us to the field of "who can do what"
15:57:29 comments -> no delete (admin can hard delete)
15:57:30 NikitaKonovalov: we should never delete a story
15:57:39 NikitaKonovalov: but there will be corner cases. Like DMCA notices
15:57:54 asking you to take down something
15:57:56 Ok, so we’re out of time.
15:58:09 I’m sorry that we didn’t get to ttx’ agenda items, we’ll move those to the top for next week
15:58:13 krotscheck: in summary I think we don't need universal soft-delete
15:58:25 #topic Open Discussion and Summary
15:58:26 krotscheck: we need specific state columns for users
15:58:32 a,d projects
15:58:34 Alright, so current actions are: ttx to get a story in about getting our migrations cleaned up for MySQL, and probably another one to fix our documentation.
15:58:35 and*
15:58:38 krotscheck will grab clarkb and mordred to talk about search options
15:58:40 #link https://storyboard.openstack.org/#!/story/60
15:58:43 We’re getting rid of soft delete in favor of actual use states for each record.
15:58:49 Tasks are hard delete, stories are not (though stories without tasks shouldn’t be surfaced, etc)
15:58:56 submitted story about changing task labels ^
15:58:57 Projects are other state, users are other state.
15:59:03 #action ttx to create a story about getting our migrations cleaned up for MySQL
15:59:05 Ah, thanks for reminding me jeblair
15:59:12 jeblair: I’ll put that first on my list
15:59:25 Anything else I forgot?
15:59:51 #endmeeting storyboard
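
Post-meeting note: the per-record straw poll above can be read as a small policy table. The sketch below is only one way to write it down, not StoryBoard's actual code. The record type names, the `DeletePolicy` enum, and the `can_delete` helper are all hypothetical, and the token entry reflects the suggestion in the discussion rather than a polled decision.

```python
from enum import Enum


class DeletePolicy(Enum):
    NO_DELETE = "no delete"        # only an admin may remove the record
    HARD_DELETE = "hard delete"    # the row is removed for real
    STATE_COLUMN = "other state"   # a status column replaces deletion


# Buckets as read from the straw poll. The keys are illustrative names,
# not StoryBoard's actual table names.
POLICY = {
    "user": DeletePolicy.STATE_COLUMN,
    "project": DeletePolicy.STATE_COLUMN,
    "task": DeletePolicy.HARD_DELETE,
    "story": DeletePolicy.NO_DELETE,
    "comment": DeletePolicy.NO_DELETE,
    "token": DeletePolicy.HARD_DELETE,
}


def can_delete(record_type, is_admin=False):
    """Return True if a delete request should remove the row."""
    policy = POLICY[record_type]
    if policy is DeletePolicy.HARD_DELETE:
        return True
    if policy is DeletePolicy.NO_DELETE:
        # Corner cases such as abuse or DMCA takedowns need an admin.
        return is_admin
    # STATE_COLUMN: change the record's state instead of deleting it.
    return False
```

With this layout no record type needs a generic soft-delete flag: users and projects would carry an explicit state column, and everything else is either removed outright or kept.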