Friday, 2015-03-06

*** jrichli has quit IRC00:01
*** dmsimard is now known as dmsimard_away00:01
*** InAnimaTe has joined #openstack-swift00:05
InAnimaTehey all, devops/sysadmin here who just inherited a swift cluster that was lifted, and then never touched...00:06
InAnimaTethis chan and myself will most likely become very good friends over the next few months00:06
mattoliverauInAnimaTe: lol, welcome to channel :)00:08
notmynameInAnimaTe: glad to have you. anything you can share as you ask questions will be helpful00:09
notmynameInAnimaTe: anything bugging you today?00:09
InAnimaTeactually yes, compiling the necessary stuff and creating a pastie to share...give me a few minutes00:09
* InAnimaTe is already liking this community :)00:10
claygtorgomatic: man unit.proxy.controller.test_obj got sorta jacked by the earlier call to container_info :/00:16
InAnimaTe^well thats it00:17
InAnimaTetl;dr been getting 404's on some GET's, but completely random. Weird Expect 100 error in proxy logs, everything else seems fine00:17
InAnimaTeohh, and swift 1.8.000:18
InAnimaTenotmyname: yeah i have a blog ill be posting my experiences as best i can00:18
claygone dot *eight*00:19
* clayg remembers the good ol' days00:19
*** ho has joined #openstack-swift00:20
hogood morning guys!00:20
notmynameho: good morning00:20
mattoliverauho: morning00:21
honotmyname: mattoliverau: morning!00:21
notmynameInAnimaTe: just to get the question out of the way, I'm curious about your plans if any to upgrade00:21
notmynameInAnimaTe: I certainly want to help make you successful with what you have. but I like newer versions of swift better than older ones ;-)00:22
InAnimaTethe guy who stood this up said that as long as there are no ring changes in the new versions, it should be as easy as upgrading the package and restarting services00:22
notmynamelooking at the paste00:22
notmynameInAnimaTe: we're pretty serious about always allowing version upgrades without any end-user downtime00:22
notmynamebut yes there have been changes to the rings since 1.8 (most notably storage policies)00:23
InAnimaTehmm ok. ill have to dig through the docs and figure out how to handle those changes00:24
notmynamewe can help with that later. let's get to your issue today00:24
notmynameInAnimaTe: for future reference, here's a blog post on upgrading swift
*** bhakta has joined #openstack-swift00:25
claygInAnimaTe: topology would be good to know as well - i'm seeing some logs from "storage-proxy-01" *host* from both "proxy-server" and "container-server" - but no log lines from "object-server"00:25
bhaktaI have a question about the operator role00:25
claygbhakta: sounds very keystone-y00:26
bhaktaWe are using keystone middleware for authentication00:26
claygoic, you must be doing some syslog forwarding... I wonder if the object-servers' are misconfigured or just dropping udp messages or something...00:27
claygInAnimaTe: ^00:27
notmynamebhakta: what's your question?00:27
InAnimaTehmm good question.00:27
InAnimaTealso, any chance the chan admin could add this chan to ?00:27
InAnimaTethink it would be pretty worthwhile00:27
bhaktatrying to figure out who should belong to the SwiftOperator role00:27
notmynameInAnimaTe: what is that?00:28
bhaktaFrom the documentation I read it says SwiftOperator is allowed to create containers00:29
notmynamethis is a logged channel
bhaktaSo I created a SwiftOperator role in keystone and assign a user ..everything works from Horizon00:30
notmynamebhakta: that role is keystone's way of knowing if a given identity is the "owner" of a swift account. the owner has full read/write access to a given swift account. other users (non-owners) must be explicitly granted permissions00:30
InAnimaTenotmyname: ahh ok thx. botbot is a really nice log keeper with a great gui and search capabilities00:30
bhaktaBut my users who don't belong to the SwiftOperator role get unauthorized errors even when I explicitly gave permission to a container (using swift post -r)00:32
*** ChanServ changes topic to "Review Dashboard: | Overview Dashboard: | Priority Reviews: | Ideas: | Logs:"00:32
notmynameInAnimaTe: now in the topic :-)00:33
bhaktait seems that the user should be able to list the container he has read access to00:33
bhaktaam I missing something?00:33
InAnimaTeclayg: im putting together a tldr ver of our layout00:34
notmynamebhakta: that would be up to keystone to track (and AFAIK, they don't)00:34
mattoliveraubhakta: the user needs the rlistings acl to list the container00:35
notmynamebhakta: ah, my mistake. I misread your question. what mattoliverau is correct00:35
bhaktaI even tried setting .rlistings00:35
bhaktabut same ..could not get it to list the container00:35
*** dmorita has joined #openstack-swift00:36
bhaktaI can list the object within the container but not the container itself00:36
*** km has joined #openstack-swift00:36
notmynamebhakta: what's the request you're making that's getting the error?00:37
bhaktalet me get that00:37
bhaktaswift post -r "example_role,example_account:Test1,.rlistings" test_container00:41
notmynameis that what's getting the 401 response?00:42
notmynameso your user identity doesn't have access to set ACLs on that container00:43
claygif you're giving read access to a user the "refer" permissions don't matter - if a user has read access to a container they can do listings or get objects00:43
claygyeah can you swift list test_container?00:43
bhaktalistings of containers or objects?00:43
bhaktayes.. i was able to "swift list test_container"00:44
hobhakta: Does the user have operator role (default: swiftoperator)? to execute it the user need to have operator role.00:44
bhaktano..that's the I need to assign every user the "SwiftOperator" role00:45
claygwhat role do you have that's giving you access if not admin access - very strange00:45
bhaktathen what's the advantage of container level access. If I do that every user with that role can access everything00:45
hobhakta: if you put/post to containers, you always have to have operator role00:46
bhaktabut I am trying to do "00:46
bhaktaSwift list00:46
bhaktaas non SwiftOperator00:46
claygbhakta: yes - but to achieve that you need to set ACLs - and it sounds like *that's* the part giving you the 401 - so we have to fix that first00:47
openstackgerritTim Burke proposed openstack/python-swiftclient: Mention --segment-size option after 413 response
bhaktaI thought doing this would set the ACL "swift post -r "example_role,example_account:Test1,.rlistings" test_container" ? is that not right?00:48
bhaktaDo I need to set the "ACLs" on Keystone side?00:49
*** rmcall has quit IRC00:52
hobhakta: your command line looks good and you don't need to set ACL in keystone (I think Container-ACL is a function of swift)00:54
hobhakta: the user needs to have operator roles (swiftoperator) when you execute the command line.00:57
bhaktaso you are saying if I trying get container listing via REST API, it should return me the list?00:58
bhaktais this swift client issue then?00:58
bhaktaI am not able to get the container to show up in Horizon which is bad00:59
openstackgerritTim Burke proposed openstack/python-swiftclient: Remove all DLO segments on upload of replacement
notmynamebhakta: can you stat the container (HEAD) and give us the value of the read acl header?00:59
bhaktaI need to leave now to pick up my daughter but I will be logging back in in a few hours01:00
claygnotmyname: I know a GET to the container won't *return* X-Container-Read if you're not a swift_owner = True - but I don't know if the cli will just display the empty value regardless or if it will only display the thing if it's returned in the resp headers01:00
bhaktahopefully i will be able to continue the discussion01:01
notmynameclayg: right, good point. I was hoping for doing that with an account that has permissions01:01
InAnimaTefyi, all i know about is pretty much whats in /etc/swift01:02
notmynameInAnimaTe: good info, thanks01:02
*** bhakta has quit IRC01:05
notmynameInAnimaTe: ok, so the problem is the connection errors? the 404s? or just understanding what's going on?01:05
InAnimaTemainly the 404's01:05
InAnimaTei guess im wondering why exactly they are happening01:06
InAnimaTeand at what layer, proxy or storage01:06
InAnimaTeseems to be proxy to me01:06
notmynameInAnimaTe: yes, it looks like the proxy is returning a 404 to the client01:07
notmynameit only does that after it gets 404s (or errors) from all of the storage nodes it tried01:08
InAnimaTeahh ok01:08
notmynameit will try all of the primaries, and then several handoff nodes (the default is 2*<replica count>. I think it was that in 1.8.0 too)01:08
InAnimaTeso the "ERROR with Object server re: Expect: 100-continue" means it got a crappy reply and the node that should have it doesn't?01:08
torgomaticnotmyname: I wouldn't count on that01:08
notmynameInAnimaTe: ok, torgomatic says I'm wrong. that means I'm probably wrong01:09
notmynameno, look at that error line. it says ConnectionTimeout(0.5s)01:09
InAnimaTeahh yeah01:09
notmynamethat's the timeout on establishing the tcp connection with that box01:10
notmyname(technically, creating an HTTPConnection object)01:10
InAnimaTeyeah but does that directly mean it couldn't get the object? or just that that particular node didn't answer in time?01:10
notmynamethat log line means that the particular server didn't answer in time. but that's hidden from the end-user01:11
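A rough sketch of the behavior described above, not Swift's actual proxy code: the proxy walks a list of candidate nodes, skips any that time out, and only answers 404 once every node it tried has failed. The node names and the `fake_fetch` helper are hypothetical, and how many handoffs 1.8.0 really tries is left open in the discussion.

```python
def proxy_get(primaries, handoffs, fetch):
    """Try each node in order; return 200 on the first hit, else 404."""
    for node in list(primaries) + list(handoffs):
        try:
            status = fetch(node)
        except TimeoutError:
            # Logged by the proxy, hidden from the client; try the next node.
            continue
        if status == 200:
            return 200
    # Every node tried answered 404 or errored out.
    return 404


def fake_fetch(node):
    # Hypothetical cluster: primaries time out, a handoff holds the data.
    if node.startswith("primary"):
        raise TimeoutError("ConnectionTimeout (0.5s)")
    return 200


primaries = ["primary-1", "primary-2", "primary-3"]
handoffs = ["handoff-1", "handoff-2"]
print(proxy_get(primaries, [], fake_fetch))        # 404
print(proxy_get(primaries, handoffs, fake_fetch))  # 200
```

The first call shows how a client can get a 404 for an object that exists somewhere in the cluster: none of the nodes the proxy was willing to ask could answer.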
torgomaticnotmyname: request_node_count was in d79a67eb, which is in version (tag), but not 1.8.001:11
notmynameInAnimaTe: swift has some cli tools for figuring out where an object /should/ be. `swift-get-nodes`01:11
InAnimaTeahh good to know01:16
InAnimaTegonna derp around with that command a bit01:16
*** fandi has joined #openstack-swift01:16
*** aix has joined #openstack-swift01:16
*** Canaimero-e64b has joined #openstack-swift01:18
*** nexusz99 has joined #openstack-swift01:19
*** Canaimero-e64b7 has joined #openstack-swift01:20
InAnimaTeso i guess im not understanding why the first PUT of an object (L37) returns a 201 (which from what i know, means the object has been written to at least one node) but then subsequent GET's fail01:21
InAnimaTeand im guessing those connection timeouts are in relation to the HEAD which is checking if the object exists at all01:21
InAnimaTe <- L3701:22
InAnimaTeand even after two minutes, GET's are still failing01:22
*** Canaimero-e64b has quit IRC01:24
notmynameInAnimaTe: ah interesting. look at the transaction id tx02ee2f8d581549f393483d52c49b463b01:25
notmynameso the data got written01:26
notmynamespecirically to storage 3, 5,and 701:26
* notmyname can't English01:26
*** tsg has quit IRC01:26
notmynamehowever, you've got the connection timeouts01:26
*** tsg has joined #openstack-swift01:26
notmynameon .69 .71 and .6301:27
notmynameso I'm guessing swift-get-nodes would show that it should be on those 3 servers01:27
*** Canaimero-e64b7 has left #openstack-swift01:28
InAnimaTeohhhh. so it does actually exist on .69 .71 and .63 but the proxy doesn't get a response from them, so it returns a "nope, this object dont exist"01:28
notmynameso (and based on torgomatic's comment) I'm guessing that 1.8.0 doesn't check handoffs deep enough (or at all) on the GET01:29
notmynameI mean, I could actually look at the code. or I could just hypothesize01:30
notmynameInAnimaTe: but the issue you have, then, is that the data is on handoff nodes and not primary nodes, so you've got to figure out why (ie check processes on those boxes or just check that they are turned on and have network plugged in)01:32
InAnimaTewell, (
*** fandi has quit IRC01:33
*** fandi has joined #openstack-swift01:33
InAnimaTelol have network plugged in?01:34
*** tsg has quit IRC01:37
notmynameInAnimaTe: it's actually in the except below there. catching the timeout01:37
notmynameright? the ConnectionTimeout01:37
InAnimaTeahh yeah01:38
InAnimaTethats super fucking weird then. in no way do any of these boxes *seem* to have connection limits01:39
InAnimaTemight have to fiddle with timewait sysctl values01:40
notmynameis this in the midst of other concurrent requests?01:40
InAnimaTei would assume01:40
InAnimaTe(yes its continuously used)01:40
notmynameInAnimaTe: can you paste one of the container server configs?01:41
InAnimaTeyep hold on01:41
notmynamehmm..I was looking for the backlog setting. not there, so it's the default01:45
* notmyname checks out 1.8 version of swift01:45
notmynameThu Apr 4 04:15:54 2013 +000001:45
notmyname2 years old ;-)01:45
notmynamebacklog default is 409601:45
notmynameie it can queue 4096 connections per worker01:47
notmynameso I'm guessing that with 8 workers, you aren't seeing 8k connections to the storage servers.01:48
notmynamewell, you could, if you're seeing about 3k concurrent requests to the proxy01:48
InAnimaTeso it used to be two workers01:48
InAnimaTe(probably should have mentioned this)01:48
InAnimaTeI upped it last night to 801:48
*** zhill has quit IRC01:53
notmynamestill, probably ok01:53
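One reading of the numbers above, as back-of-the-envelope arithmetic only: the backlog figure is the 4096 default quoted in the discussion, while the replica count of 3 and the idea that each proxy request fans out to every replica are assumptions (that fan-out fits PUTs better than GETs).

```python
BACKLOG_PER_WORKER = 4096  # default quoted above for swift 1.8


def queue_capacity(workers, backlog=BACKLOG_PER_WORKER):
    """Connections that can wait in listen queues across all workers."""
    return workers * backlog


def backend_connections(proxy_requests, replicas=3):
    """Assumed worst case: every proxy request touches every replica."""
    return proxy_requests * replicas


print(queue_capacity(2))          # 8192, the old two-worker setup
print(queue_capacity(8))          # 32768, after raising workers to 8
print(backend_connections(3000))  # 9000
```

Under these assumptions, roughly 3k concurrent proxy requests is about where a two-worker backend's queues would start to fill, which is consistent with "still, probably ok" at 8 workers.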
InAnimaTeyeah we run the Account, Container, and Object servers with 8 workers each01:55
notmynamedid you confirm that those servers are online and accepting connections?01:56
InAnimaTeyeah but let me double check01:57
InAnimaTei mean if they weren't id have a lot more problems than intermittent 404's01:57
notmynamewhile you're there, check IO wait and CPU usage01:58
*** kei_yama has joined #openstack-swift02:04
InAnimaTe^some transactions that derped today02:04
InAnimaTeit looks like everything is pretty much running02:10
InAnimaTeone thing that isnt running on any of them is the object-auditor02:10
InAnimaTethe original admin disabled it for whatever reason02:10
*** haomaiwang has joined #openstack-swift02:13
InAnimaTeis it actually super important?02:13
*** tsg has joined #openstack-swift02:14
torgomaticdepends; do you like data integrity? /s02:17
torgomaticit's the only thing that checks offline for object-level bitrot, so it's sort of important02:18
torgomaticwhen you GET an object, the downloaded copy will also get checked, so if your objects are *all* accessed fairly often you might be able to do without, but why run the risk?02:18
torgomaticit is hard on disks though, so it should be tuned to run quite slowly02:19
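The bitrot check torgomatic describes boils down to re-reading an object and comparing its checksum against the one recorded at write time. A minimal sketch of that idea follows; it is not the auditor's real code, and `file_md5` and `is_intact` are hypothetical helpers. Reading every byte of every object is also why the auditor is hard on disks.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Hash a file in chunks so large objects never sit in memory whole."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, stored_etag):
    """True when the bytes on disk still match the recorded checksum."""
    return file_md5(path) == stored_etag
```

A real auditor would also throttle itself (files or bytes per second) and quarantine anything that fails the comparison rather than just reporting it.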
zaitcevI suspect that's the part that confuses people02:21
zaitcevAlso, we used to have a bug where ZBF just goes crazy regardless02:21
InAnimaTei wonder if it was turned off because its hard on the disks02:23
*** dmsimard_away is now known as dmsimard02:25
*** dmsimard is now known as dmsimard_away02:35
*** tsg has quit IRC02:39
*** bandarji has quit IRC02:40
*** rmcall has joined #openstack-swift02:58
*** amrith has left #openstack-swift02:58
*** rmcall has quit IRC03:20
*** david-lyle is now known as david-lyle_afk03:44
openstackgerritMerged openstack/swift: Use Container override header to update etag and size
openstackgerritMerged openstack/python-swiftclient: Mention --segment-size option after 413 response
*** fandi has quit IRC03:49
openstackgerritPete Zaitcev proposed openstack/swift: Pluggable Back-ends for account and container servers
*** Fin1te has joined #openstack-swift04:15
*** gyee has quit IRC04:23
*** Fin1te has quit IRC04:50
*** ppai has joined #openstack-swift04:58
*** DCWilliams_VA has joined #openstack-swift05:09
*** zaitcev has quit IRC05:10
*** DCWilliams_VA has quit IRC05:13
*** dmorita has quit IRC05:20
*** SkyRocknRoll has joined #openstack-swift05:47
openstackgerritClay Gerrard proposed openstack/swift: Fix EC download when data nodes is not divisiable by two.
*** Bsony has joined #openstack-swift06:13
*** Bsony has quit IRC06:17
hobhakta: notmyname: clayg: mattoliverau: I just reproduced bhakta's problem. I think step (2) in the following url is our target. . I will wait for bhakta's response.06:18
*** kei_yama has quit IRC06:19
mattoliverauho: thanks for the research. Yeah (2) will always fail because test1 has no rights to set the container ACLs, they have no rights at all. Only the owner or reseller admin can.06:29
mattoliverauho: so yeah hopefully that's what he's doing wrong.06:29
mattoliveraunice way to write it in an understandable way.06:30
mattoliverauand on that note I'm calling it a day. Night all.06:30
homattoliverau: good night!06:31
*** wshao has joined #openstack-swift06:39
wshaoI have an older version of swift 1.8.4, with swauth. I try to upgrade, but swauth does not work. Is that abandoned?06:40
howshao: hello, I know it worked with havana (1.10.x) so It should work with the version.06:44
howshao: sorry I miss understand. you upgrade from 1.8.4 to latest06:46
hos/miss understand/mis-understand/06:47
*** nexusz99 has quit IRC06:47
*** Bsony has joined #openstack-swift06:57
*** wshao has quit IRC07:10
*** jamielennox is now known as jamielennox|away07:16
*** torgomatic has quit IRC07:21
*** Bsony has quit IRC07:36
*** mmcardle has joined #openstack-swift07:57
*** zul has quit IRC08:01
openstackgerritYuan Zhou proposed openstack/swift: Fix EC PUT on HTTP_CONFLICT or HTTP_PRECONDITION_FAILED
*** rledisez has joined #openstack-swift08:08
*** chlong has quit IRC08:11
*** acoles_away is now known as acoles08:12
acolesclayg: yes i have shamelessly copied ;)08:13
*** zul has joined #openstack-swift08:14
acolesclayg: i thought 'rv' was something you guys drove on vacation08:15
*** geaaru has joined #openstack-swift08:17
*** notmyname has quit IRC08:30
*** zhill has joined #openstack-swift08:34
mattoliverauacoles: lol, love me some good dad jokes especially those aimed at the guys in the US ;p08:34
*** zhill has quit IRC08:36
acolesmattoliverau: so you have started your weekend, right?08:36
*** torgomatic has joined #openstack-swift08:38
*** ChanServ sets mode: +v torgomatic08:38
mattoliverauacoles: yup, and its a long weekend too! But probably still do some work.. Cause I'm a geek :p08:40
acolesmattoliverau: he, have a good one08:41
mattoliverauacoles: You too, happy Friday!08:43
*** jordanP has joined #openstack-swift08:44
*** jistr has joined #openstack-swift09:02
*** jistr is now known as jistr|biab09:36
*** jistr|biab is now known as jistr10:39
*** haomaiwang has quit IRC11:09
*** chlong has joined #openstack-swift11:22
*** nellysmitt has joined #openstack-swift11:25
*** Trixboxer has joined #openstack-swift11:50
*** bill_az has joined #openstack-swift11:54
*** ho has quit IRC12:10
*** aix has quit IRC12:38
*** km has quit IRC12:44
*** DCWilliams_VA has joined #openstack-swift13:04
*** mmcardle has quit IRC13:26
*** ppai has quit IRC13:27
*** mahatic has joined #openstack-swift13:28
*** aix has joined #openstack-swift13:30
*** mahatic has quit IRC13:37
*** ppai has joined #openstack-swift13:39
*** mmcardle has joined #openstack-swift13:48
*** DCWilliams_VA has quit IRC13:57
*** lpabon has joined #openstack-swift13:57
*** mahatic has joined #openstack-swift13:57
*** annegentle has joined #openstack-swift13:59
*** annegentle has quit IRC14:00
*** annegentle has joined #openstack-swift14:01
*** ppai has quit IRC14:01
*** fifieldt_ has quit IRC14:05
*** rdaly2 has joined #openstack-swift14:13
*** annegentle is now known as superanne14:31
*** chlong has quit IRC14:34
*** jrichli has joined #openstack-swift14:36
*** jordanP has quit IRC14:47
*** rmcall has joined #openstack-swift14:52
*** jrichli has quit IRC15:04
*** aarondelp has joined #openstack-swift15:12
*** jmacs has joined #openstack-swift15:12
jmacsHi all, question from a beginner - when object-replicator is copying partitions around with rsync, is there any signal to object-server that the replication has finished? I haven't seen anything in the code yet15:14
*** jordanP has joined #openstack-swift15:14
glangeno signal15:22
glangeout of place objects are deleted after replication is successful15:23
glangeand the proxy looks on multiple object servers when looking for an object15:24
jmacsSure - I was just wondering how the receiving node knew the replication was successful and ready to serve up objects15:24
glangeso, it's ok if one (or sometimes more copies) is out of place15:24
glangewell, on the receiving node, the object is either there or not15:25
glangeif it's there, it will serve the object15:25
*** jrichli has joined #openstack-swift15:26
glangedoes that help?15:26
jmacsIs the underlying data file moved to its destination atomically?15:26
*** dmsimard_away is now known as dmsimard15:28
jmacsYes, I think it is (just reading rsync's man page now)15:29
jmacsThat makes sense. Thanks.15:29
glangecool, I wasn't sure about that, I didn't see anything in the python code that makes that happen15:30
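The atomicity jmacs found in the rsync man page comes from a common pattern: write to a temporary file in the destination directory, then rename it into place, so a reader sees either no file or the complete file. A minimal Python sketch of that pattern, with `atomic_write` as a hypothetical helper rather than anything from Swift or rsync:

```python
import os
import tempfile


def atomic_write(dest, data):
    """Write bytes to dest so readers never observe a partial file."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # The temp file must live on the same filesystem for rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, dest)     # atomic on POSIX filesystems
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This is why no "replication finished" signal is needed: the object server either finds the renamed file or it does not.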
*** superanne has quit IRC15:59
*** superanne has joined #openstack-swift16:00
*** superanne has quit IRC16:07
*** superanne has joined #openstack-swift16:08
*** chlong has joined #openstack-swift16:33
*** Nadeem_ has joined #openstack-swift16:47
mahatichi, this could be very basic: given a URL like this, how does it know which one to call - AccountController, ContainerController, ObjectController?16:48
*** rmcall has quit IRC16:48
*** rmcall has joined #openstack-swift16:49
mahaticI have followed the SAIO installation. And my configs are from there. How does it pick identify devices from here: /etc/swift/container-server/1.conf and call the appropriate controllers?16:49
*** gyee has joined #openstack-swift16:49
tdasilvamahatic: Hi, the proxy server has a method that returns the correct controller based on the request url. Take a look here:,L26516:52
mahatictdasilva, looking. thanks so much for the response!16:53
*** jordanP has quit IRC16:53
tdasilvamahatic: I hope it helps :-)16:53
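A simplified sketch of the idea tdasilva describes: the proxy picks a controller by how many path segments the request URL carries. `pick_controller` is a hypothetical stand-in, not the proxy's real `get_controller`, which handles more cases than this.

```python
def pick_controller(path):
    """Map /version/account[/container[/object]] to a controller name."""
    # Split at most three times so object names may contain slashes.
    parts = path.strip("/").split("/", 3)
    if len(parts) < 2 or not parts[1]:
        raise ValueError("path needs at least /version/account")
    names = {
        2: "AccountController",
        3: "ContainerController",
        4: "ObjectController",
    }
    return names[len(parts)]


print(pick_controller("/v1/AUTH_test"))                # AccountController
print(pick_controller("/v1/AUTH_test/photos"))         # ContainerController
print(pick_controller("/v1/AUTH_test/photos/a/b.jpg")) # ObjectController
```

The account name `AUTH_test` and the container and object names are made up for illustration.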
openstackgerritOpenStack Proposal Bot proposed openstack/python-swiftclient: Updated from global requirements
*** notmyname has joined #openstack-swift16:54
*** ChanServ sets mode: +v notmyname16:54
notmynamegood morning16:55
notmynamelooks like rackspace restarted my znc box last night, so I missed the playback16:55
mahatictdasilva, get_controller is a default call on any URL? In a context where is handling "hosts", does every call to a URL like this "" first goes to get_controller?16:56
mahaticgood morning, notmyname16:58
jmacsmahatic: I think 6000 is always the object server16:58
jmacsBy default, it's configurable in /etc/swift/object-server.conf16:58
mahaticjmacs, true, the config is done that way. But, i'm asking how does it know to call the appropriate controller, does it always call get_controller17:01
jmacsThat I don't know17:02
tdasilvamahatic: I think I understood your question incorrectly. The get_controller is specifically in the proxy server. If a request is being set to the object-server, then there will be some other request handling function in the object server17:02
tdasilva*being sent17:03
tdasilvamahatic: Once I know which server the request is going to, I always start at the __call__ method of the So in the case of the object-server, I'd start here:
mahatictdasilva, actually, that also is not my question I guess :). So here in the config, /etc/swift/container-server/1.conf, [app:container-server] use = egg:swift#container17:06
mahaticthat sets the url to object/container/account server, correct?17:07
tdasilvamahatic: sorry...i'm probably confusing you more than helping17:08
*** superanne has quit IRC17:08
mahatictdasilva, nope. I'm aware how the code works (__call__ or other methods inside a file). My question was how does the config set? I don't understand the words "app" and "egg" in the config or how do they work.17:09
mahatictdasilva, this should help me I believe
mahatictdasilva, thanks for the effort! :)17:11
tdasilvamahatic: in the case of this line  use = egg:swift#container specifically17:12
tdasilvatake a look at setup.cfg17:12
tdasilvaand take a look at paste deploy17:12
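For readers following along, a small sketch of how a Paste Deploy `use = egg:swift#container` value breaks down: `egg` is the scheme, `swift` the installed distribution, and `container` the entry point name that setup.cfg maps to a factory function. `parse_use` is a hypothetical helper for illustration, not Paste Deploy's own parser.

```python
def parse_use(value):
    """Split 'egg:Dist#name' into (scheme, distribution, entry_point)."""
    scheme, _, rest = value.partition(":")
    dist, _, name = rest.partition("#")
    # Paste Deploy falls back to an entry point called "main" when
    # no '#name' part is given.
    return scheme, dist, name or "main"


print(parse_use("egg:swift#container"))  # ('egg', 'swift', 'container')
```

Paste Deploy then looks up that name in the distribution's `paste.app_factory` entry point group, which is how the container server's factory gets loaded for an `[app:container-server]` section.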
tdasilvamahatic: sorry, i'm back, but yeah, take a look here: that should help a little bit....17:17
tdasilvamahatic: sorry for the confusion17:17
jrichliI think she was asking how the port is assigned to a server in the config.  The "bind_port" only has to be specified if you want to use something17:18
jrichlibesides the default.17:18
mahatictdasilva, sure. And you didn't confuse me at all. Thanks for the info! :)17:18
notmynamejmacs: not true any more17:19
notmynamejrichli: ^17:19
notmynamejmacs: mistype17:19
notmynamejrichli: the bind port setting is required. it does not have a default value any more17:19
notmynamethat was the first step in changing the recommended ports to swift (which hasn't been done yet)17:19
mahaticyeah, that's true. I noticed the patch17:19
jrichlioh, ok.  I just figured she mentioned the egg b/c it was the only thing there in her config17:20
mahaticjrichli, I was not aware of the terms in config. But i'm now looking at links that help me understand. Thanks!17:20
*** rledisez has quit IRC17:22
openstackgerritPrashanth Pai proposed openstack/swift: Make object creation more atomic in Linux
openstackgerritAlistair Coles proposed openstack/swift: Multiple fragment Archive Index Support
openstackgerritAlistair Coles proposed openstack/swift: DiskFile refactoring towards per-policy classes
acoles^^ peluse clayg i hope i didn't break anything17:32
*** thebloggu has joined #openstack-swift17:37
*** jistr has quit IRC17:37
acolespeluse: i'm not convinced yield_hashes is right, left a TODO in there to come back to17:42
*** david-lyle_afk is now known as david-lyle17:43
*** acoles is now known as acoles_away17:43
*** jordanP has joined #openstack-swift17:43
theblogguI have a swift test cluster with 1 Proxy Node and 5 Storage Nodes and 3 zones. Of those 5 Storage Nodes 4 went down and I was able to recover 3. They're online but the other one has a filesystem corruption and is currently down. this happened before and the cluster recovered (after a long time). Why does it take so long to recover (please note I only have a few KB stored) and can I speed it up?17:45
openstackgerritPrashanth Pai proposed openstack/swift: Make object creation more atomic in Linux
*** jordanP has quit IRC17:53
*** zaitcev has joined #openstack-swift18:01
*** ChanServ sets mode: +v zaitcev18:01
*** superanne has joined #openstack-swift18:04
*** Nadeem_ has quit IRC18:13
*** hurricanerix_ has quit IRC18:15
*** hurricanerix has joined #openstack-swift18:16
*** morganfainberg is now known as needscoffeebadly18:19
*** needscoffeebadly is now known as CaptainMorgan18:22
mahaticnotmyname, I can't figure a scenario where rings could be different from the actual deployment. Won't the rings pick up from the devices and their config anyway? Could you help on that?18:23
notmynamemahatic: policy 1 could be all spinning drives. policy 2 could be all flash. policy 3 could be all in asia. policy 4 all in europe18:25
mahaticnotmyname, right. But before that, I was thinking about validate servers scenario in the recon18:26
*** superanne has quit IRC18:27
mahaticThe rings would have the info from the config (I'm still talking about policy 0, default)18:28
mahaticWhen could it be wrong?18:29
*** geaaru has quit IRC18:29
*** SkyRocknRoll has quit IRC18:34
*** mmcardle has quit IRC18:39
notmynameI don't understand18:40
openstackgerritClay Gerrard proposed openstack/swift: Fix EC download when data nodes is not divisiable by two.
*** bill_az has quit IRC18:40
claygtorgomatic: so are you going to use the pyeclib get_segment_info to get the last fragment size when you're doing the ranged GET stuff?  I only ask because calculating it by hand seems to be rife with danger ->
torgomaticclayg: dunno, haven't gotten that far yet18:43
claygmaybe you don't need the last fragment size... I mean if your ranged GET to the storage node is fragment_size aligned you can just start reading fragment payloads and getting out real segments - maybe shave a bit of the first segment one if the client request isn't quite aligned... but if they want all the way to the end of the file you'll just hand pyeclib whatever < fragment_size comes out of the object server...18:47
*** rmcall has quit IRC18:58
*** superanne has joined #openstack-swift19:04
torgomaticclayg: yeah, for bytes=M-N kind of requests, that's all I do. for bytes=-N suffix requests, I'll need some math to shave off any partial fragments before decode, but I haven't written that yet19:05
claygtorgomatic: oh right... well I'd still suggest letting pyeclib do the math :P19:07
claygok, back to reviewing tdasilva's patches19:14
claygtdasilva: I'm going to do the ec put/container headers first - then right into version middleware - but I feel like the open question you had about delete's to versioned container where the version-location doesn't have write acl's needs to be addressed before you can finish it?19:15
claygtdasilva: there was also the comment you had being half-way done with the "remove the pre-flight HEAD request"19:16
claygtdasilva: I'm fine with *either* finishing that out or leaving it for future work - but not merging with the unused method (that's probably stating the obvious)19:16
claygtdasilva: but anyway - i'd like to see that extraction wrapped up on master so we can start dealing with it (and the *other* proxy cleanup) making their way onto feature/ec ASAP19:17
claygtdasilva: so... what can I do to help versioned-writes-middleware get landed :D19:19
tdasilvaclayg: yeah, I think we need to address the acl question.19:19
tdasilvaclayg: and I'm leaning towards putting the "remove the pre-flight HEAD request" as a separate patch19:19
tdasilvaclayg: would you be ok with that?19:19
claygtdasilva: that all sounds so wonderful I could just cry19:20
tdasilvaclayg: lol..if we can deal with the acl behavior, I can write the correct tests for it19:20
claygtdasilva: I think we already identified a use case for "versioned container to protect users from themselves" - so if we have to pick one or the other to "fix" - I'd say fix delete19:20
claygI mean - what's the behavior when it breaks anyway, it just tries to COPY and fails - the "real" object is not over-written by the last version19:21
tdasilvaclayg: yeah, it won't delete anything19:22
claygzaitcev: whoot whoot -> is back!19:22
tdasilvaI think it actually breaks because it can't get a container listing19:22
zaitcevclayg: it never went anywhere, but I am trying to get lpabon and tdasilva to review it, so I updated it to the latest. Unfortunately, it's one of those forever-WIP things.19:23
*** thebloggu has quit IRC19:24
*** superanne has quit IRC19:24
*** aix has quit IRC19:24
claygtdasilva: oh, interesting - that makes sense19:24
lpabonzaitcev: clayg tdasilva, yeah, I am going to spend some time on it after Vault conference19:25
tdasilvaclayg, zaitcev: yes, I need to start looking into how to hook that up to swiftonfile19:25
tdasilvaor maybe lpabon is :-)19:25
lpabonis *also* you mean ;-)19:25
tdasilvaclayg: one more question on obj. versioning19:26
tdasilvaclayg: that unit test that you found was missing: test_cross_policy_DELETE_versioning. I was looking at that test and don't think it is applicable now19:28
tdasilvaclayg: I think a functional test would make more sense19:28
claygzaitcev: I need to think more about Backend-has-a-Broker, maybe it could get you what you want and still be a stepping stone to cleaning up our current dbbroker mess19:29
claygtdasilva: I'm cool with "this unit test was moved to functtests" - i was just trying to audit we didn't miss anything19:30
tdasilvaclayg: ok, thanks19:30
claygtdasilva: OTOH sometimes the unittests can give you quicker feedback when you're refactoring... so I never mind seeing them duplicated either19:30
zaitcevif it's not skipped for in-process functests and actually tests what it needs to test, I'm fine with such move... not that anyone asks my opinion19:31
clayglol - zaitcev tell us how you REALLY feel19:31
zaitcevlike an imposter19:31
*** lcurtis has joined #openstack-swift19:33
tdasilvazaitcev: not sure if we have cross-policy tests in in-process functests :\19:34
openstackgerritThiago da Silva proposed openstack/swift: fixing small typos in associated projects doc
claygtdasilva: lol - i saw that the other day :P19:37
tdasilvaclayg: hehe...just saw it today19:37
tdasilvaclayg: btw: got a new version of the saio-ansible project going up in a bit19:38
tdasilvaorganized the scripts a bit more19:38
claygnotmyname: I'm on a tear ->
claygtdasilva: you know honestly I haven't tried it yet - maybe sometime before Vancouver :\19:38
claygnotmyname: please to be making the wrist slapping as needed - I think one time I remember a bunch of core in the room being like "yeah if it's a simple doc fix just merge it and stop wasting my time" - but I can't remember if that was mostly *me* saying that?  or if we were drinking at the time?19:39
notmynameya, that's totally fine and good IMO19:39
claygtdasilva: oh, the override headers was already merged19:40
claygmattoliverau: peluse: thanks!19:41
tdasilvaclayg: did you have a problem with it? I remember you mentioning something about GETs failing for you19:41
tdasilvadon't remember if it was related to that patch specifically or something else you found19:42
claygtdasilva: no it was the other thing - i had a couple of draft comments that I didn't post - but looking at them now I don't care about them19:42
*** CaptainMorgan is now known as morganfainberg19:50
*** superanne has joined #openstack-swift19:50
*** superanne has quit IRC19:51
*** superanne has joined #openstack-swift19:52
*** superanne has quit IRC19:56
*** superanne has joined #openstack-swift19:57
claygdoes need a rebase or something - i thought we fixed all the py26 issues on master and feature/ec19:58
claygfunctests on feature/ec -> FAILED (SKIP=12, errors=26, failures=13)19:58
zaitcevOuch. A class called "helper" and it's rather big20:07
claygit's *really* helpful!20:08
* clayg kids20:08
*** nellysmitt has quit IRC20:15
notmynamemahatic: still around? (I know it's crazy late for you)20:16
notmynameportante: what's the awesome bash prompt? must be great if you're talking about it publicly :-)20:20
zaitcevwait, what20:21
zaitcevPeter is alive?20:21
*** rdaly2 has quit IRC20:31
*** rdaly2 has joined #openstack-swift20:35
*** dmsimard has quit IRC20:36
tdasilvaclayg: hi, i was looking at your patch 159739, and to fix the doc issue you just added that line back in index.rst. Do I also need to un-delete the overview_object_versioning file? I was looking to add a reference like this:  :ref:`versioned_writes` but that didn't work20:39
portanteaaron griffis, a former co-worker has a really cool .bashrc.prompt file I have been using for years, and I just copied it over to my mac and it worked flawlessly20:41
portantenotmyname: hopefully, he'll comment on g+ so that I can reference where he keeps the source20:41
claygpeluse: acoles_away: I keep trying to review the suffixes hashes changes and it's killing me - I *really* need to pitch in and help clean that up - but we've already got three cooks in that kitchen and at least TWO refactorings we're thinking about20:41
* portante I'm not dead (you're only foolin' yourself)20:42
claygpeluse: acoles_away: maybe I can *just* work on the suffix classes part and over acoles change - I know it'll mess up everyones dependent changes - but I think I've got some good ideas for the suffix classes that will pay off in the end20:42
* portante I feel happy, I feel happy ...20:43
claygonce we're all happy with 159637 we can merge it - peluse can go back to the reconstructor, acoles can keep chipping away at per policy diskfiles, and I'll... I'll probably just find something new to bitch about :\20:43
claygwhat's the easiest way to turn patch 159739 into a link?  I always end up just changing the url of some other patch I'm done looking at but didn't close the tab for :\20:44
claygtdasilva: oh shit, maybe something got messed up there - i needed to say something about include inline or some ya :\20:45
notmynameclayg: that's why I used to have the patch bot20:45
claygbut yeah - i don't see it in the diff20:46
claygpatch bot i miss you!20:46
claygtdasilva: oh I remember - i just copied the autodoc stanza that you added to middlewares over the content of the page referenced in the source/index20:47
*** patchbot has joined #openstack-swift20:48
notmynamepatchbot: hi!!20:48
notmynamepatchbot: p 15963720:49
notmynameclayg: ^^20:49
notmynameit was actually portante who made me write that20:50
notmynamehe always talked about patch numbers and I had no way to linkify them :-)20:50
claygthat portante20:51
* portante is a pain in the arse20:51
lcurtishello, is there a way to see what is causing so many 'container-updater: ERROR account update failed ' errors?21:06
clayglcurtis: does it say the path to the db file or the name of the container or anything?21:13
claygi mean it could just be one of the account servers is offline21:13
lcurtisboth are online21:18
lcurtisgetting connection timeout21:18
lcurtis18000 connections at the moment.and listening on port 600121:18
lcurtisbut trying to figure out why the timeout21:18
claygpoke at it with curl - backend api is /<device>/<part>/<account> - you should be able to grab any ol' request from the log21:19
claygi guess I'd try to correlate if it's just some subset of db's that's doing the timeouts - or some subset of nodes21:20
claygif the account server seems to be doing other work ok... zoom on a single request that failed (that's why i was asking if the container error log line said the db file or name of the container/account)21:20
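A minimal Python sketch of the check clayg describes, as an alternative to curl: build the backend path `/<device>/<part>/<account>` and time a single request against one account server. The helper names, the example IP, port, device, partition and account are all hypothetical placeholders; the real values would come from a request line in the server's log.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def build_backend_url(ip, port, device, part, account):
    """Build http://<ip>:<port>/<device>/<part>/<account> (hypothetical helper)."""
    path = "/".join(urllib.parse.quote(str(p), safe="") for p in (device, part, account))
    return "http://%s:%d/%s" % (ip, port, path)


def probe(url, timeout=5.0):
    """Send one HEAD request; return (outcome, seconds elapsed)."""
    req = urllib.request.Request(url, method="HEAD")
    start = time.monotonic()
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            outcome = str(resp.status)
    except urllib.error.HTTPError as err:
        # The server answered, just not with a 2xx.
        outcome = str(err.code)
    except OSError as err:
        # Covers connection refused, socket timeouts and URLError.
        outcome = "failed: %s" % err
    return outcome, time.monotonic() - start


if __name__ == "__main__":
    # Placeholder values only; substitute ones taken from a real log line.
    url = build_backend_url("127.0.0.1", 6002, "sdb1", 1234, "AUTH_test")
    print(url, probe(url, timeout=2.0))
```

Running the probe against each account node in turn would show whether the timeouts follow a subset of nodes or a subset of databases.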
clayglcurtis: might do a launchpad search, there's a few open bugs with container updates that pop up now and again - maybe we managed to ignore one and need to fix it.21:23
openstackgerritRicardo Ferreira proposed openstack/swift: Removing the ".data" makes it check *any* file for metadata, so it works with .meta and .ts filetypes.
claygrsFF: ^ lol!21:25
claygor sorry - no no no21:30
claygI meant like i just thought it was funny that was all it took - NICE WORK - i like the easy ones :P21:30
rsFFhaaa ok, i thought i had made a bad branching or something21:31
claygwhere's fifield when you need him :\21:31
claygyes, i should have realized you could have misunderstood me - it was insensitive - my apologies21:32
rsFFI am planning on taking this one:
openstackLaunchpad bug 1428866 in OpenStack Object Storage (swift) "swift-object-info display for sysmeta" [Wishlist,New]21:34
rsFFany objection/tips?21:34
notmynamedid we say anything about canceling next week's team meeting?21:36
*** mahatic has quit IRC21:39
*** superanne has quit IRC21:44
lcurtisclayg delayed thank you21:46
*** rdaly2 has quit IRC21:46
clayglcurtis: i did what now?21:46
lcurtisabout the container update error advice21:46
claygwell... but... i mean - what was the problem :)21:47
lcurtisim not sure...but gives me a place to start21:47
lcurtisim not sure how i would get a request from the log with curl though21:48
claygno i mean peek in the log for an example of the path for a request - then try to make that request with curl - see if it times out21:50
*** torgomatic has quit IRC21:50
lcurtisgot it21:50
*** torgomatic has joined #openstack-swift21:52
*** ChanServ sets mode: +v torgomatic21:52
*** superanne has joined #openstack-swift21:52
lcurtisi have asked before but any ideas on current largest swift implementations worldwide?21:56
*** panbalag has quit IRC21:58
clayglcurtis: single cluster?  object # or petabytes?21:59
lcurtisall of the above ;)21:59
claygi'm sure rax gives most a good run for their money - i'd bet they have the largest total "bytes managed by swift" with clusters in texas, chicago, london, australia (and others?)22:01
lcurtiswould make sense tho22:03
claygthey've been at it the longest!  that data is sticky and it takes awhile to build up22:09
*** lpabon has quit IRC22:09
seeing  ERROR rsync failed with 10 on replication22:13
lcurtisnode is up, rsync rsync://ip works22:14
clayghrmm.. wonder what error code 10 is22:15
lcurtissocket io ?22:16
lcurtismaybe something akin to connection refused22:17
lcurtisbut things are syncing22:17
clayghrmmm... max connection limit?22:20
lcurtisyes potentially22:21
lcurtiswould that be on OS side or swift side22:21
lcurtissounds like it22:23
claygyeah rsyncd.conf i think22:23
lcurtisgreat call22:23
lcurtisill bet that is it22:23
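For reference on the exit code wondered about above, here is a small lookup sketch in Python. The descriptions are from the EXIT VALUES section of the rsync man page and only a handful of codes are included; the function name is made up for illustration. The table confirms the "socket io" reading of code 10 but does not by itself settle whether a connection limit is the cause.

```python
# A few rsync exit values, as documented in the rsync man page.
RSYNC_EXIT_VALUES = {
    0: "Success",
    5: "Error starting client-server protocol",
    10: "Error in socket I/O",
    12: "Error in rsync protocol data stream",
    23: "Partial transfer due to error",
    30: "Timeout in data send/receive",
    35: "Timeout waiting for daemon connection",
}


def describe_rsync_exit(code):
    """Return the man-page description for an rsync exit code (hypothetical helper)."""
    return RSYNC_EXIT_VALUES.get(code, "unlisted exit code %d" % code)


if __name__ == "__main__":
    print(describe_rsync_exit(10))
```

If a connection cap is suspected, the daemon-side setting to look at is `max connections` in rsyncd.conf.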
*** chlong has quit IRC22:24
*** jrichli has quit IRC22:29
*** aarondelp has quit IRC22:42
claygpeluse: i'm mulling over if growing a X-Backend-Node-Index for the object server and diskfile interface (node_index) is more palatable than DiskFile growing a frag_index22:52
claygpeluse: I think that X-Object-Sysmeta-EC-Frag-Index is a good thing, but X-Backend-Node-Index I think makes more sense for the DiskFile interface22:52
claygpeluse: basically the way it's coded is to translate X-Backend-Node-Index to the DiskFile kwarg frag_index right there in the ObjectController methods - I think going one level down inside DiskFile would be better...22:54
claygI'll keep mulling22:54
*** superanne has quit IRC22:59
torgomaticso this looks interesting for Swift:  (not sure where I saw it; sorry if it was in here already)23:00
torgomaticthe threadpool reader dealie in the object server could first try a readv2 in the main thread, then if it got nothing, kick a blocking read out into a worker thread23:00
torgomaticso if the kernel is doing some readahead stuff, then that could save us a whole ton of overhead when the data is in buffer cache, but if it's not there, *then* we kick over to the big expensive reader threadpool23:03
torgomaticor we could rewrite the whole thing in Erlang. either one. /s23:04
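A rough Python sketch of the idea torgomatic outlines: try a non-blocking read in the calling thread, and only hand the read to a worker thread when nothing comes back. It assumes that `os.preadv` with `os.RWF_NOWAIT` (Python 3.7+, Linux 4.14+) is a fair stand-in for the "readv2" call mentioned; the link in the message is missing, so that mapping is a guess. `read_chunk` and the pool are hypothetical names, not Swift code.

```python
import errno
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the object server's reader threadpool.
_READER_POOL = ThreadPoolExecutor(max_workers=2)

# Errors that mean "no fast path available here", not "the read failed".
_FALL_BACK_ERRNOS = {errno.EAGAIN, errno.EOPNOTSUPP, errno.EINVAL, errno.ENOSYS}


def read_chunk(fd, offset, length):
    """Return (data, used_fast_path) for up to `length` bytes at `offset`."""
    nowait = getattr(os, "RWF_NOWAIT", None)
    if nowait is not None and hasattr(os, "preadv"):
        buf = bytearray(length)
        try:
            # Non-blocking attempt in the calling thread: it only succeeds
            # when the kernel can serve bytes without waiting on the disk.
            got = os.preadv(fd, [buf], offset, nowait)
        except OSError as err:
            if err.errno not in _FALL_BACK_ERRNOS:
                raise
        else:
            if got > 0:
                return bytes(buf[:got]), True
    # Got nothing (or no fast path): do the blocking read in a worker thread.
    # A real server would yield to its event loop here rather than block
    # on .result() as this sketch does.
    data = _READER_POOL.submit(os.pread, fd, length, offset).result()
    return data, False
```

The fast path may return fewer bytes than requested, so a caller would loop until it has what it needs.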
openstackgerritSamuel Merritt proposed openstack/swift: Small optimization to ring builder.
clayghell yeah make that thing go fast23:09
*** sandywalsh has quit IRC23:12
*** sandywalsh has joined #openstack-swift23:13
*** sneezewort has joined #openstack-swift23:18
sneezewortHello all. It appears only admins can add stuff to swift on my install. Is this normal, or am I doing something wrong?23:19
claygwhat's an "admin"23:25
claygtorgomatic: golang first23:26
claygtorgomatic: speaking of doing object servers in other languages the sorta off the rails expect 100 continue stuff is going to suck :'(23:27
claygtorgomatic: I still sorta feel like a pipelined POST would have been the way to go23:27
torgomaticclayg: don't think I haven't been tempted23:27
torgomaticclayg: yeah, depends on how good the http libs are in $otherlnag23:27
claygtorgomatic: didn't you somehow convince me that a PUT/POST on the same connection wouldn't work for Encryption?23:27
torgomaticclayg: yeah, there's a race condition in ther23:28
claygtorgomatic: I'd guess they'd be doing good to send anything that looks like a 100: Continue (we had to add it to eventlet in the first place) and *no* one would support this crazy "call a method on the file like readable that is the input from the web server to send extra headers" *crazy*ness23:28
claygtorgomatic: for Encryption or EC - I'm quite sure a POST that said write a durable for this timestamp isn't racy - and certainly no worse than what we've got - but there's something about encryption... you made me feel like the trailing metadata was the way to go - oh maybe that was it - not the expect 100 just the trailers23:30
torgomaticclayg: true; depends on how low-level your http libs are. if you've got a send_response_headers(socket, status, headers) function or something, you can pretty easily (ab)use that to get 1200-continue responses going23:30
claygso MIME encoding good, waiting on the request to "finish" by looking for a second set of information headers - crazy town23:30
* torgomatic can't type for carp today23:31
claygtorgomatic: fair enough - maybe it's wsgi that's the crazy ness23:31
torgomaticruby's rack isn't much better for this23:31
torgomaticclojure, perhaps? I wonder how ring handles things like this23:31
claygtorgomatic is a total clojure hipster23:32
torgomaticheh, if I'd written anything serious in clojure I'd probably know the answer to that question already23:32
claygredbo: dfg: glange: I want to add a policy param to hash_cleanup_listdir and it looks silly after the reclaim_age kwarg, but like all over the place we call that function with only positional arguments...23:43
claygredbo: dfg: glange: we have a bunch of node/disk/health checky type code but we never call hash_cleanup_listdir - you got any scripts that ever call it?  Maybe something for looking for rotten tombstones?23:44
*** dmsimard has joined #openstack-swift23:45
claygi guess I can just leave it hash_cleanup_listdir(hsh_path, reclaim_age=X, policy=None) and either default to the 0-policy or raise a TypeError - but I figure if a script says hash_cleanup_listdir(path, 2_DAY) they'll be somewhat disappointed with hash_cleanup_listdir(path, policy, reclaim_age=X)23:47
claygi guess the int would attribute error pretty quickly... maybe it's fine23:47
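The trade-off clayg is weighing can be sketched with two toy signatures. Everything below is a hypothetical stand-in (including the `Policy` class, the default reclaim age, and the function bodies), not the actual Swift function; it only shows how an old positional caller fares under each ordering.

```python
ONE_WEEK = 7 * 24 * 60 * 60   # assumed default reclaim age, in seconds
TWO_DAYS = 2 * 24 * 60 * 60


class Policy(object):
    """Toy storage policy with just an index (hypothetical)."""

    def __init__(self, idx):
        self.idx = idx


def cleanup_policy_last(hsh_path, reclaim_age=ONE_WEEK, policy=None):
    """Option A: policy appended as a kwarg; old positional calls still work."""
    if policy is None:
        policy = Policy(0)  # fall back to the 0-policy
    return hsh_path, reclaim_age, policy.idx


def cleanup_policy_second(hsh_path, policy, reclaim_age=ONE_WEEK):
    """Option B: policy inserted before reclaim_age."""
    # An old caller passing a reclaim age positionally lands it in `policy`,
    # and the attribute access below fails on the int.
    return hsh_path, reclaim_age, policy.idx


if __name__ == "__main__":
    print(cleanup_policy_last("/srv/node/sdb1/hash", TWO_DAYS))
    try:
        cleanup_policy_second("/srv/node/sdb1/hash", TWO_DAYS)
    except AttributeError as err:
        print("old-style call breaks:", err)
```

Under option B the old call fails loudly on first use, which matches the "attribute error pretty quickly" remark; under option A it keeps working unchanged.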
*** Fin1te has joined #openstack-swift23:56
sneezewortI figured it out. I forgot to create the SwiftOperator role.23:56
clayg^ notmyname weren't you trying to tell me that the keystone default setups were like that?23:57
claygnotmyname: and I was like - nah that'd be stupid23:57
claygnotmyname: maybe some docs somewhere are lacking23:57
claygsneezewort: were you following any specific instructions?  Maybe the SwiftOperator role should be in bold red or something?23:57
