19:01:52 <notmyname> #startmeeting swift
19:01:53 <openstack> Meeting started Wed Jun 11 19:01:52 2014 UTC and is due to finish in 60 minutes. The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:54 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:56 <openstack> The meeting name has been set to 'swift'
19:02:10 <notmyname> Thanks for coming. big stuff to talk about this week
19:02:15 <notmyname> #link https://wiki.openstack.org/wiki/Meetings/Swift
19:02:25 <notmyname> briancline updated the agenda perfectly :-)
19:02:34 <notmyname> I guess I'm pretty predictable :-)
19:03:01 <notmyname> first up, for some tangentially related logistics
19:03:36 <briancline> huzzah!
19:03:42 <notmyname> I'm having surgery tomorrow morning, early, and I've added clayg and torgomatic to the swift-ptl group in gerrit, temporarily
19:04:12 <notmyname> I expect to be back online this weekend. maybe friday pm
19:04:13 <clayg> o/
19:04:33 <peluse_> good luck man!
19:04:37 <notmyname> thanks
19:04:40 <notmyname> so, moving on to the current stuff in swift...
19:04:42 <portante> a/
19:04:49 <portante> o/
19:04:53 <notmyname> #topic storage policies merge
19:04:53 <portante> sorry I am late
19:05:01 <notmyname> portante: no worries. just getting started
19:05:24 <notmyname> just last night clayg proposed what I think is the "final" set of SP patches. ie the set with all the functionality
19:05:40 <notmyname> note that the new "end of chain" is https://review.openstack.org/#/c/99315/
19:05:51 <portante> 2 vector timestamps
19:05:59 <peluse_> reviewing now...
19:06:01 <portante> I'd like some time to review that
19:06:08 <notmyname> of course
19:06:17 <acoles> i've started but not done on 99315
19:06:34 <portante> would there be any revolt pushing this out from this week to next?
19:06:42 <clayg> REVOLT!
19:06:43 <notmyname> clayg: can you confirm that, other than discovered issues in the proposed patches, there is no additional functionality expected to be proposed
19:07:03 <clayg> notmyname: well i'm not sure the two vector timestamp stuff is really done *done*
19:07:08 <notmyname> ok
19:07:26 <clayg> notmyname: at a minimum it needs some extra probetests, and after two days of testing and testing i just sorta said - well i guess that's good enough
19:07:33 <notmyname> :-)
19:07:43 <notmyname> ok
19:08:20 <clayg> notmyname: I tried to audit everywhere that swift core was dealing with timestamps and I think it's all quite manageable, but w/o probetests my confidence in the consistency engine in the face of the internalized form is only like... 95%
19:08:33 <notmyname> ack
19:08:37 <clayg> maybe 92%
19:08:45 <notmyname> 93.4%?
19:08:48 <peluse_> clayg: I'm only a few files into it but so far it seems like an improvement even aside from the new functionality (cleaner)
19:09:03 <acoles> clayg: i have a concern that the offset needs to be absolute, like another timestamp, for it to be useful for the object metadata post use case
19:09:03 <briancline> (no, negotiate up!)
19:09:05 <clayg> peluse_: maybe the timestamp class is sorta nice
19:09:11 <peluse_> yup
19:09:23 <acoles> clayg: the Timestamp class IS nice
19:09:37 <clayg> acoles: maybe we can offline that - i'm pretty sure it's useless if the offset is absolute ;)
19:09:54 <clayg> fixed deterministic is the way to go!
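The "2 vector timestamps" and Timestamp class discussed above boil down to a wall-clock timestamp plus an integer offset, so internal operations that share a client-visible time still have a total order. A minimal sketch of the idea follows; the class, method names, and formats are illustrative, not Swift's actual implementation.

    # Minimal sketch of a "two vector" timestamp: a wall-clock part plus an
    # integer offset, so later internal operations can sort after earlier
    # ones that carry the same client-visible timestamp. Illustrative only.
    import functools

    @functools.total_ordering
    class SketchTimestamp(object):
        def __init__(self, timestamp, offset=0):
            self.timestamp = float(timestamp)
            self.offset = int(offset)

        @property
        def normal(self):
            # client-visible form, e.g. X-Timestamp: 1402518632.12345
            return '%016.05f' % self.timestamp

        @property
        def internal(self):
            # internalized form keeps the offset so stored names still sort
            if self.offset:
                return '%s_%016x' % (self.normal, self.offset)
            return self.normal

        def __eq__(self, other):
            return (self.timestamp, self.offset) == (other.timestamp, other.offset)

        def __lt__(self, other):
            return (self.timestamp, self.offset) < (other.timestamp, other.offset)

    # two operations at the same wall-clock time still have a total order:
    a = SketchTimestamp(1402518632.12345)
    b = SketchTimestamp(1402518632.12345, offset=1)
    assert a < b and a.normal == b.normal and a.internal != b.internal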
19:10:05 <acoles> clayg: yeah, discuss offline
19:10:34 <notmyname> ok, so it looks like we should target Monday instead of today/tomorrow for the SP merge
19:10:46 <notmyname> ie giving several more days for review
19:10:58 <peluse_> I think I'm washing my hair on Monday...
19:11:00 <clayg> notmyname: what about getting some of the other pending stuff merged
19:11:10 <notmyname> clayg: well that's my next topic :-)
19:11:13 <clayg> notmyname: did you want to cut a release 1.14 or some such before sp?
19:11:14 <acoles> peluse_: !
19:11:21 <portante> peluse_: wait, you have that much hair to wash? :)
19:11:26 <notmyname> we've had the soft freeze for a while, and there is stuff queued up
19:11:28 <portante> hair washing?
19:11:34 <portante> next topic?
19:11:42 <peluse_> :)
19:11:47 <notmyname> I'm not anticipating a 1.13.2 or 1.14 release before the SP release
19:12:07 <portante> notmyname: where are we at with the xLO fixes?
19:12:14 <notmyname> what I am expecting is that we'll land the SP chain and also the stuff that's queued up. and all of that will be in the release
19:12:21 <notmyname> portante: 2 +2s but not merged
19:12:27 <briancline> what's queued up for the SP release? or is that the bottom half of the priority reviews page
19:12:35 <portante> but does that fix anticw's concerns?
19:12:41 <briancline> (other than SP of course)
19:12:42 <notmyname> portante: yes it does
19:12:47 <portante> great
19:12:56 <notmyname> the bottom half of https://wiki.openstack.org/wiki/Swift/PriorityReviews has patches that have 1 +2 and could be in a release
19:13:08 <notmyname> I don't have the stuff with 2 +2s listed anywhere right now
19:13:30 <notmyname> portante: and, later, I've got to decide if we backport that to icehouse (likely, but not definite)
19:13:32 <briancline> could probably beat gerrit into submission to find that
19:13:36 <notmyname> portante: if so, I'll take care of it
19:13:45 <portante> notmyname: k
19:14:05 <notmyname> so that's the current tension: a bunch of queued reviews and the SP chain that we want to avoid a bunch of rebasing on
19:14:30 <notmyname> how about this: today I'll merge the stuff that's pending (2 +2s) and then clayg can rebase tomorrow after those land.
19:14:31 <notmyname> ?
19:14:35 <notmyname> clayg: thoughts?
19:14:58 * portante wonders if the 2-vector timestamps should go in first against master as is
19:15:02 <notmyname> the point of the soft freeze is to avoid a bunch of rebases, but if we're all in agreement, then we can do it on a limited basis
19:15:13 <notmyname> portante: stop your speculation ;-)
19:15:21 <portante> okay
19:15:24 <notmyname> lol
19:15:34 * portante wonders why it is not raining here ...
19:15:41 <clayg> rebases away!
19:16:05 <notmyname> everyone else ok with me landing the pending stuff and then having a monday target for SP landing?
19:16:13 <briancline> +1
19:16:16 <clayg> the only reason to avoid them is cause it's ugly in gerrit for reviewers - but everyone's tolerance seems to have built up against that
19:16:28 <clayg> notmyname: monday is a weird day to do anything
19:16:42 <notmyname> clayg: so are other days that end in "y"
19:16:43 <portante> I don't mind the patch set rebase if it does not hinder clayg's efforts
19:16:51 <notmyname> portante: that's my concern
19:16:59 <clayg> portante: no it's no trouble for me at all really
19:17:09 <clayg> portante: it's just annoying to reviewers
19:17:10 <portante> okay ... pig pile!
19:17:17 <briancline> *shrug* I've grown numb to long patch chains
19:17:17 <acoles> i don't mind being annoyed
19:17:41 * portante wonders if he is comfortably numb
19:17:47 <notmyname> the gate seems to have moved from "horrible" to simply "terrible", so it may take all night to merge stuff
19:17:48 <briancline> numb not because of this... that's glance's fault
19:18:02 <notmyname> or tomorrow
19:18:09 <portante> notmyname: what was the special way to land this patch set?
19:18:17 <portante> is it written up somewhere?
19:18:25 <notmyname> #action notmyname to land pending changes
19:18:27 <portante> the PS set
19:18:37 <portante> SP
19:18:52 <notmyname> portante: good question. nice transition :-)
19:19:08 * portante checks made out to ...
19:19:23 <notmyname> so, given the state of the gate (13.5 hours at a 50% pass rate now), we don't want to try to land 29 patches there
19:19:35 <acoles> notmyname: aww
19:19:36 <clayg> man... i remember when it was *only* 27
19:19:39 <notmyname> lol
19:19:48 <portante> ;)
19:19:50 <notmyname> so I've been talking with -infra to figure out a better way
19:19:58 <notmyname> here's what we've come up with:
19:20:05 <notmyname> review the current patches as normal
19:20:16 <notmyname> leave +1s and +2s
19:20:20 <notmyname> (or -1s)
19:21:28 <notmyname> when they all have all the necessary reviews, then we will have -infra build a new feature branch (probably "sp-review") and we'll force push all the patches there. then one merge commit will be proposed to master and reviewed in gerrit. I'll link the existing patch reviews (for historians) and we'll merge that one patch
19:21:51 <notmyname> the key is that all 29 patches will not have to be gated. just the final set of them all will be gated once
19:22:02 <notmyname> I'm working with mordred on this
19:22:12 <clayg> yay mordred!
19:22:15 <notmyname> make sense?
19:22:45 <portante> so will the individual commits be lost then?
19:22:50 <notmyname> portante: no
19:23:01 <portante> great, all for it then
19:23:30 <notmyname> portante: the individual commits (the 29 proposed) will still exist. but they will be added to master in one atomic commit (which is also nice for future bisects and bug tracking)
19:23:50 <notmyname> basically, this is how you are supposed to do git ;-)
19:24:03 <portante> nice
19:24:16 <notmyname> ok. so that takes us up to the "everything is on master" time
19:24:18 <zaitcev> you mean in one merge
19:24:18 <peluse_> cool
19:24:34 <notmyname> I'm hoping that we'll be there on tuesday (ie merge monday)
19:24:36 * portante wonders if a disney movie quote fits here ... fox and the hound
19:25:04 <notmyname> at that point, with the SP patches and the other queued up stuff, we'll cut an RC for the next release
19:25:20 <notmyname> and master is open for new patches
19:25:39 <notmyname> the RC period will be extended from the normal 3-4 days to two weeks
19:25:59 <portante> is anybody from rackspace here?
19:26:02 <notmyname> during this time, I'm hoping that everyone will be able to do their own testing in their labs for this release
19:26:14 <torgomatic> it amazes me that it takes a whole team effort to force Gerrit to work like Git wants it to :|
19:26:35 <briancline> are there any and/or do we need to define any parameters (logistically) for testing the RC?
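For reference, the single-merge-commit plan notmyname describes above has roughly this shape in plain git terms. The branch name "sp-review" comes from the log, but the exact commands -infra would run do not, so treat this purely as an illustration:

    # 1. a feature branch is created and the reviewed patch chain is
    #    force-pushed onto it (done by -infra, not by individual devs)
    git checkout -b sp-review origin/master
    git push --force gerrit sp-review

    # 2. one merge commit bringing the whole chain into master is proposed
    #    and gated once, instead of gating all 29 patches individually
    git checkout master
    git merge --no-ff sp-review
    git review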
19:26:48 <clayg> torgomatic is filled with astonishment
19:26:50 <notmyname> I have soft commitments from RAX, HP, softlayer, Red Hat, and maybe NeCTAR and maybe eNovance to do testing
19:26:57 <briancline> or just throw everything imaginable at it?
19:27:46 <notmyname> briancline: the most important thing is that existing clusters don't break. after that, look at the new features and do "stuff" to ensure it works as expected. IOW, what would happen if you deployed it to prod and turned it on :-)
19:28:22 <notmyname> I'm most concerned about regressions. then functionality
19:28:37 <zaitcev> do we have the 2-phase config in the latest SP or not? If yes, it has to be documented in some kind of readme 1) yum update or apt-get something, 2) edit swift.conf (on all nodes) and set SP_SCHEMA=true
19:28:55 <zaitcev> or is it implicit for >1 policies
19:29:10 <briancline> are there any metrics or other things in specific that all who are *extremely* familiar with this full set of patches would like us to make note of?
19:29:25 <notmyname> zaitcev: yes. updating the code is "safe". having >1 policy is the trigger for many of the code paths and is the "can't downgrade" point of no return
19:29:30 <briancline> aside from what we might usually do in our individual normal course of testing
19:29:35 <peluse_> zaitcev: the docs have updated info about the order to do upgrades
19:29:40 <notmyname> peluse_: ah good
19:29:59 <zaitcev> peluse_: thanks, I'll re-review
19:30:13 <notmyname> assuming nothing is found during the RC period that is not also fixed during the RC period, then at the end of it we will have the final release. that will be Swift v2.0
19:30:24 <peluse_> zaitcev: Cool, in the section called "Upgrading" or something like that
19:30:40 <peluse_> yes!
19:30:43 <notmyname> and I'm letting some upstream community, packagers, and marketing people know
19:30:57 <clayg> briancline: containers are going to be slower to fill up, at least in the pathological case - it may be hard to prove with all the object server and http connection overhead
19:31:03 <notmyname> unfortunately, as external-to-devs get involved, it puts more pressure on specific dates
19:31:21 <clayg> briancline: but if your benchmarking normally uses 100 containers - you might try it with only 10 - and get a before and after
19:31:30 <notmyname> all of this put together means an end-of-June release
19:31:44 <notmyname> any questions here? does this sound reasonable?
19:31:55 <peluse_> bueno
19:32:09 <briancline> clayg: works for me - I'll make a note for myself
19:32:20 <notmyname> peluse_: will y'all be testing at Intel? can I add your name to the "soft QA commitment" list?
19:32:41 <peluse_> we don't have production clusters but I planned on testing an upgrade on a real test cluster
19:32:52 <notmyname> ack
19:33:30 <notmyname> ok, so everyone review the SP patches, and when not looking at those, take a look at the ones listed at the bottom of https://wiki.openstack.org/wiki/Swift/PriorityReviews
19:33:51 <portante> ack
19:34:31 <notmyname> and, once again, thank you to everyone here. every time I see the Swift community come together, I'm struck by your awesomeness :-)
19:34:56 <peluse_> good luck tomorrow... say yes to morphine
19:35:06 <tdasilva> notmyname: good luck tomorrow
19:35:10 <notmyname> thanks
19:35:15 <portante> aw, com'on man, think wolverine!
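To make zaitcev's swift.conf question above concrete: there is no separate schema flag; policies are declared as sections in swift.conf, and deploying more than one is the ">1 policies" trigger notmyname calls the point of no return. A rough, illustrative sketch only; the upgrade ordering peluse_ mentions lives in the updated docs:

    # swift.conf (illustrative sketch, not copied from the merged docs)
    [swift-hash]
    swift_hash_path_suffix = changeme

    # policy 0 always exists and describes the legacy object ring;
    # upgrading the code with only this policy present stays downgrade-safe
    [storage-policy:0]
    name = Policy-0
    default = yes

    # adding a second policy (and its object-1 ring) is the ">1 policies"
    # trigger discussed above, i.e. the "can't downgrade" point of no return
    [storage-policy:1]
    name = silver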
19:35:23 <notmyname> as a note about the non-SP release, if there are other patches that need to be in the release, please add them to the bottom of that wiki page
19:35:24 <portante> did he have any morphine?
19:35:43 <peluse_> portante: you're right I think he passed
19:35:47 <notmyname> briancline had one more topic he wanted to discuss
19:35:55 <notmyname> #topic container sync questions
19:35:58 <notmyname> briancline: you're up
19:36:29 <briancline> right, so this is just a quick meta-question or two on container sync -
19:37:25 <briancline> in reviewing some of the innards and the doc on it (http://docs.openstack.org/developer/swift/overview_container_sync.html), it isn't quite clear how it handles syncing objects whose replica 0 lives on a downed storage node
19:37:56 <briancline> there's a brief mention of balancing distribution of work but not missing work, but the latter part isn't covered much that I saw
19:38:25 <briancline> I've got a WIP for the multinode instructions and figured if I can get some clarity on it then perhaps I could submit a change to clarify these
19:39:01 <clayg> briancline: the second sync point watches will march up and sync all rows - but it expects to short circuit when everyone is doing their job
19:39:07 <notmyname> I know that the container sync processes weren't very scalable (unlike the expirer). what have you seen in your testing pandemicsyn?
19:39:32 <clayg> notmyname: well you get replica count guys per container
19:39:59 <clayg> so smaller less active containers sync more quicklyish than larger more active ones - and it's really mostly dominated by the weight
19:40:17 <notmyname> clayg: I mean the single-thread, single-process syncer that is transporting for the whole cluster. I don't recall if that was improved
19:40:25 <clayg> but there's no idea of "container-sync is running slow, i'll run more"
19:40:45 <briancline> clayg: ahh ok, so the secondary/tertiary nodes should detect this from SP2? if so, will they have the intelligence to distinguish between the replica 0 node being down versus it taking a long time to complete a prior sync?
19:40:46 <clayg> notmyname: well it is single process per container server...
19:40:51 <notmyname> clayg: ah ok
19:40:59 <clayg> briancline: what?
19:41:13 <clayg> briancline: you're talking specifically about container sync with storage policies?
19:41:21 <notmyname> sync point, I think :-)
19:41:22 <clayg> or just scaling container sync in general?
19:41:26 <briancline> no, container sync itself
19:41:28 <clayg> oh heheheheh
19:41:33 <notmyname> briancline: SP now means storage policies :-)
19:41:47 <briancline> oh, haha
19:41:52 <briancline> sorry, sync point 2 :)
19:42:08 <clayg> briancline: no they don't distinguish, everyone moves everything eventually
19:42:54 <clayg> briancline: but at first they only try mod replica count and then wait for the second pass before doing all rows and hope the other guys make that second sweep quick
19:43:25 <briancline> alright, that helps a good bit
19:43:33 <clayg> briancline: you might be surprised
19:43:41 <clayg> briancline: but it *sounds* good!
19:43:54 <clayg> briancline: and it works... which is always nice
19:43:57 <briancline> I only mean it helps my understanding ;-)
19:44:59 <notmyname> cool. I think other design/usage questions should come up in #openstack-swift
19:45:07 <clayg> heheheh
19:45:29 <notmyname> and Rackspace is looking at it too, so you might want to ping them on it as well. and I'm hoping all of it results in patches to make it better :-)
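A rough sketch of the two-sync-point behaviour clayg describes above: each replica pushes only its share of rows (row id modulo replica count) on the first pass, and sweeps all older rows on a second pass in case a peer was down. The function and broker method names here are hypothetical, not the real container-sync code:

    # Illustrative sketch of the two-sync-point scheme described above; the
    # names (broker.rows_since, push_row, ...) are hypothetical.
    def sync_container(broker, node_index, replica_count, push_row):
        # Each replica of the container DB remembers two row ids:
        #   sync_point1: newest row this node has pushed "its share" of
        #   sync_point2: newest row below which this node has retried
        #                every row, regardless of ownership
        sync_point1, sync_point2 = broker.get_sync_points()

        # Second sweep: push *all* rows between sync_point2 and
        # sync_point1, in case a peer that owned them was down or slow.
        # When every peer is doing its job this range is empty and the
        # loop short circuits.
        for row in broker.rows_since(sync_point2):
            if row['ROWID'] > sync_point1:
                break
            push_row(row)
        sync_point2 = sync_point1

        # First sweep: push only the rows assigned to this node by row id
        # modulo the replica count, trusting peers to cover the rest.
        for row in broker.rows_since(sync_point1):
            if row['ROWID'] % replica_count == node_index:
                push_row(row)
            sync_point1 = row['ROWID']

        broker.save_sync_points(sync_point1, sync_point2)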
19:45:30 <briancline> so my current understanding is the contention point is we don't want to worry about a single coordination point that would solve some of the scaling issues, and that it's totally serial per container server, correct?
19:46:49 <briancline> if this is a bit too in the weeds I can take it offline
19:47:03 <notmyname> briancline: ya, I think it should be discussion in -swift
19:47:07 <notmyname> *discussed
19:47:32 <briancline> I mostly put it on the agenda since I've seen a lot of lonely souls ask about it without much input
19:47:35 <briancline> alright, cool
19:47:45 <notmyname> #topic other topics?
19:47:54 <notmyname> anything else to bring up as a group this week?
19:48:34 <creiht> howdy
19:48:35 <creiht> sorry
19:48:41 <notmyname> welcome :-)
19:48:46 <notmyname> creiht: we are just finishing up
19:48:51 <notmyname> actually, I think we're done
19:49:14 <creiht> perfect timing :)
19:49:14 <notmyname> thanks everyone for attending and participating
19:49:16 <clayg> creiht: i'll stay and talk with you if you want
19:49:19 <notmyname> #endmeeting