*** kgriffs_afk is now known as kgriffs | 00:35 | |
*** flaper87 is now known as flaper87|afk | 00:36 | |
*** kgriffs is now known as kgriffs_afk | 00:45 | |
*** nos_ has joined #openstack-marconi | 00:54 | |
*** nos_ has quit IRC | 00:54 | |
*** nosnos has joined #openstack-marconi | 00:55 | |
*** amitgandhi has quit IRC | 01:33 | |
*** kgriffs_afk is now known as kgriffs | 01:36 | |
*** kgriffs is now known as kgriffs_afk | 01:45 | |
*** ayoung has quit IRC | 01:52 | |
*** whenry has joined #openstack-marconi | 01:55 | |
*** ayoung has joined #openstack-marconi | 02:11 | |
*** amitgandhi has joined #openstack-marconi | 02:22 | |
*** whenry has quit IRC | 02:29 | |
*** kgriffs_afk is now known as kgriffs | 02:36 | |
*** kgriffs is now known as kgriffs_afk | 02:45 | |
*** nosnos_ has joined #openstack-marconi | 02:47 | |
*** nosnos has quit IRC | 02:49 | |
*** nosnos_ has quit IRC | 02:55 | |
*** nosnos has joined #openstack-marconi | 02:55 | |
*** ayoung has quit IRC | 03:05 | |
*** amitgandhi has quit IRC | 03:17 | |
*** whenry has joined #openstack-marconi | 03:18 | |
*** whenry has quit IRC | 03:34 | |
*** kgriffs_afk is now known as kgriffs | 03:36 | |
*** kgriffs is now known as kgriffs_afk | 03:46 | |
*** kgriffs_afk is now known as kgriffs | 04:37 | |
*** kgriffs is now known as kgriffs_afk | 04:46 | |
*** whenry has joined #openstack-marconi | 05:23 | |
*** openstack has joined #openstack-marconi | 14:53 | |
flaper87 | nope, swift usecase is different from marconi's usecase | 14:54 |
flaper87 | swift has blob objects, we have messages | 14:54 |
oz_akan_ | in both case we have a resource, that we want to delete | 14:54 |
*** openstackgerrit has joined #openstack-marconi | 14:55 | |
flaper87 | but the use case is not the same. When talking to message systems, most of the time, message deletion is not something users care about | 14:55 |
oz_akan_ | why do you think data type is a differentiator for deletes? | 14:55 |
kgriffs | https://wiki.openstack.org/wiki/Marconi/specs/api/v1/responsecodes | 14:55 |
kgriffs | it is up to date assuming a couple pending patches are merged | 14:55 |
oz_akan_ | flaper87: deleting a message is important as it guarantees it won't be processed by others (though in this case there is claim id) | 14:56 |
flaper87 | I agree we should differentiate both cases, I don't agree w/ treating it as an error | 14:56 |
oz_akan_ | I think if you delete a message, you do it, because you care about it | 14:56 |
flaper87 | oz_akan_: right, when you ack a message in AMQP systems you don't get an error back if it was already acked | 14:56 |
kgriffs | brb (meeting) | 14:57 |
oz_akan_ | flaper87: got it | 14:57 |
oz_akan_ | 404 is just another response.. if not, what do you think we could return? | 14:57 |
flaper87 | oz_akan_: something that is not an error: 200 and 204? Not sure to be honest | 14:59 |
flaper87 | kgriffs: https://review.openstack.org/#/c/45070/1/global-requirements.txt | 15:00 |
*** key4 has quit IRC | 15:00 | |
*** key4 has joined #openstack-marconi | 15:01 | |
oz_akan_ | hmm, 200 for successful delete, 204 for not found message | 15:01 |
oz_akan_ | might be, still I liked 404 because of the definition in wiki : The requested resource could not be found but may be available again in the future.[2] Subsequent requests by the client are permissible. | 15:02 |
oz_akan_ | https://wiki.openstack.org/wiki/Marconi/specs/api/v1/responsecodes#Delete_Messages | 15:03 |
oz_akan_ | here | 15:03 |
oz_akan_ | Delete message from a non existing queue 204 | 15:03 |
oz_akan_ | zyuan_ had said that we return 404 in this case | 15:03 |
oz_akan_ | unfortunately he is off, can't verify at the moment | 15:03 |
oz_akan_ | kgriffs: flaper87 ^^ | 15:03 |
flaper87 | oz_akan_: in which case? | 15:07 |
oz_akan_ | Delete message from a non existing queue | 15:09 |
oz_akan_ | document says returns 204 | 15:09 |
oz_akan_ | while zyuan_ told that it returns 404 | 15:09 |
oz_akan_ | I think malini_afk had told the same | 15:09 |
oz_akan_ | I am not sure if wiki page is wrong | 15:09 |
oz_akan_ | flaper87: ^^ | 15:11 |
flaper87 | we return 204 https://github.com/stackforge/marconi/blob/master/marconi/transport/wsgi/queues.py#L85 | 15:12 |
oz_akan_ | that code is for deleting queue | 15:14 |
oz_akan_ | right? | 15:14 |
flaper87 | ah, sorry, delete from a non-existing queue | 15:14 |
flaper87 | T_T | 15:14 |
flaper87 | oz_akan_: https://github.com/stackforge/marconi/blob/master/marconi/transport/wsgi/messages.py#L303 | 15:15 |
oz_akan_ | flaper87: ok, we don't really check if queue exists to delete the message | 15:16 |
flaper87 | oz_akan_: nope, I was checking in the backend as well | 15:16 |
oz_akan_ | flaper87: so don't return anything specific | 15:16 |
oz_akan_ | only listing a queue that doesn't exist returns 404 | 15:16 |
oz_akan_ | documentation is correct | 15:17 |
oz_akan_ | flaper87: tks | 15:17 |
oz_akan_ | https://bugs.launchpad.net/marconi/+bug/1220768 | 15:17 |
oz_akan_ | I created this to consider delete message response | 15:17 |
flaper87 | oz_akan_: awesome, thanks about that! This definitely needs further discussion | 15:17 |
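[Editor's sketch of the idempotent-delete semantics flaper87 argues for above, mirroring AMQP ack behavior: deleting an already-deleted message is not an error. The handler and storage names here are illustrative, not Marconi's actual code.]

```python
# Hypothetical idempotent message delete, in the spirit of AMQP acks:
# a repeated delete is a harmless no-op, never a 404.
# The dict stands in for a real storage backend; all names are illustrative.

HTTP_204 = "204 No Content"

def on_delete(storage, queue, message_id):
    """Return 204 whether or not the message still exists."""
    # set.discard() (unlike set.remove()) does not raise if the id
    # is absent, so double-deletes succeed silently.
    storage.setdefault(queue, set()).discard(message_id)
    return HTTP_204

store = {"fizbit": {"msg-1", "msg-2"}}
assert on_delete(store, "fizbit", "msg-1") == HTTP_204  # real delete
assert on_delete(store, "fizbit", "msg-1") == HTTP_204  # repeat: still 204
```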
openstackgerrit | A change was merged to stackforge/marconi: fix: claimed message require claim_id to delete https://review.openstack.org/43339 | 15:26 |
oz_akan_ | flaper87: not that awesome, just a bug report :D | 15:31 |
oz_akan_ | I am out for lunch | 15:31 |
flaper87 | oz_akan_: enjoy :D | 15:31 |
flaper87 | kgriffs: ping | 15:31 |
oz_akan_ | thanks | 15:31 |
kgriffs | back | 15:54 |
kgriffs | flaper87: so, on this: https://bugs.launchpad.net/marconi/+bug/1220768 | 15:58 |
kgriffs | Is there a proposed solution? | 15:59 |
kgriffs | as in, what return code and/or body? | 15:59 |
flaper87 | not yet | 16:04 |
flaper87 | I think this is worth discussing in our next meeting | 16:04 |
flaper87 | kgriffs: btw, unrelated topic | 16:05 |
flaper87 | About the sqlalchemy / MySQL backend, I talked this morning w/ Yeela and she wanted to take that bp | 16:05 |
flaper87 | She was going to work on the proton / qpid one but, since the rel backend is our priority then she volunteered to work on it | 16:06 |
flaper87 | She'll attend our next meeting | 16:07 |
flaper87 | but, if you're ok w/ that, we could assign it to her so that other folks know someone is already going to work on that | 16:08 |
flaper87 | kgriffs: also, can you review test patches ? | 16:08 |
flaper87 | :D | 16:08 |
*** ayoung has quit IRC | 16:11 | |
kgriffs | flaper87: re backend, I'm cool with that. let me assign her | 16:11 |
flaper87 | kgriffs: thx | 16:12 |
kgriffs | yeela kaplan? | 16:13 |
flaper87 | kgriffs: yup | 16:13 |
kgriffs | cool, we need some more RedHat copyrights. :D | 16:14 |
* kgriffs loves contributors | 16:14 | |
flaper87 | kgriffs: YEAHHH!!! | 16:14 |
kgriffs | https://blueprints.launchpad.net/marconi/+spec/sql-storage-driver | 16:15 |
flaper87 | kgriffs: awesome! thanks! | 16:15 |
kgriffs | thoughts on this? | 16:15 |
kgriffs | https://blueprints.launchpad.net/marconi/+spec/redis-storage-driver | 16:15 |
*** openstackgerrit has quit IRC | 16:16 | |
*** openstackgerrit has joined #openstack-marconi | 16:16 | |
flaper87 | kgriffs: +1 for that. cppcabrera has some work already done on that | 16:17 |
flaper87 | https://github.com/cabrera/marconi-redis | 16:17 |
flaper87 | Developing it outside is the best test for: 1) Our test suite structure 2) our plugins stuff | 16:17 |
flaper87 | once it's done, I think he can submit it for review | 16:18 |
kgriffs | ok, so keep the blueprint but develop in a separate repo? | 16:20 |
flaper87 | until it's done, I guess. So, usually this kind of blueprint is implemented separately and then submitted in a single patch | 16:21 |
flaper87 | but, we could split it in several patches 1 for each controller | 16:21 |
flaper87 | which makes reviews easier | 16:21 |
flaper87 | What I meant w/ developing it in a separate repo is that I like the fact that he's doing that because that allows us to test both, the test suite and the plugins thing. | 16:22 |
flaper87 | but, I'd be happy to pull that backend into Marconi's source tree | 16:23 |
*** amitgandhi has quit IRC | 16:23 | |
kgriffs | yep, makes sense | 16:24 |
kgriffs | not a bad model actually | 16:24 |
kgriffs | future drivers can be essentially incubated in other/personal repos | 16:24 |
kgriffs | and then we can pull them in IFF it makes sense and the core team wants to take over maintenance | 16:24 |
*** amitgandhi has joined #openstack-marconi | 16:25 | |
*** amitgandhi has quit IRC | 16:25 | |
*** amitgandhi has joined #openstack-marconi | 16:25 | |
flaper87 | kgriffs: correct! awesome! | 16:27 |
flaper87 | brb, dinner | 16:27 |
kgriffs | ciao | 16:32 |
kgriffs | oz_akan_: does that load test delete messages, or leave them in the DB? | 16:40 |
kgriffs | (after running) | 16:40 |
*** ayoung has joined #openstack-marconi | 17:05 | |
*** kgriffs is now known as kgriffs_afk | 17:29 | |
*** kgriffs_afk is now known as kgriffs | 18:05 | |
flaper87 | kgriffs: ping | 18:18 |
flaper87 | kgriffs: could you take a look here? https://review.openstack.org/#/c/45070/1/global-requirements.txt | 18:18 |
kgriffs | yeah, sorry, been swamped and haven't been diligent about reviews today | 18:19 |
* kgriffs is looking | 18:19 | |
flaper87 | kgriffs: no worries, that's just a very quick one that you certainly know the answer to | 18:20 |
kgriffs | sometimes falcon does .postX | 18:20 |
flaper87 | https://github.com/racker/falcon/blob/master/falcon/version.py#L19 | 18:20 |
kgriffs | so, if we have ==0.1.6 then you would miss 0.1.6.postX | 18:21 |
kgriffs | the .post things are sort of silly, I know | 18:21 |
flaper87 | cool, I think that's all he wanted to know. I didn't know the answer w.r.t falcon and went lazy on it :P | 18:21 |
kgriffs | I should really just bump the first minor up more often and leave the second for interim stuff | 18:22 |
kgriffs | I can comment on that | 18:22 |
flaper87 | kk, thanks! | 18:22 |
kgriffs | commented | 18:24 |
flaper87 | danke sir! | 18:25 |
kgriffs | no problemo | 18:25 |
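[Editor's note on the pinning issue kgriffs describes above: an exact pin like `falcon==0.1.6` silently excludes post-releases such as `0.1.6.post3`. A range specifier picks them up. Illustrative requirements fragment, assuming PEP 440-style version ordering where post-releases sort just after the base release:]

```
# Exact pin: matches 0.1.6 only, misses 0.1.6.postN fix releases
falcon==0.1.6

# Range pin: includes 0.1.6 and its post-releases, stops before 0.2
falcon>=0.1.6,<0.2
```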
kgriffs | oz_akan_: I'm on the mongo primary now, attempting to reproduce the 404 issue. I'll let you know how it goes. | 18:41 |
oz_akan_ | ok | 18:41 |
oz_akan_ | server02 right? | 18:41 |
oz_akan_ | kgriffs: ^^ | 18:41 |
oz_akan_ | mng-02 to be exact | 18:41 |
kgriffs | mar-tst-ord-mng-02 | 18:42 |
kgriffs | right? | 18:43 |
oz_akan_ | right | 18:43 |
kgriffs | 166.78.112.25 | 18:43 |
kgriffs | cool | 18:43 |
oz_akan_ | I see 40000 messages, as if created with a loop, precisely | 18:43 |
*** JRow has left #openstack-marconi | 18:44 | |
oz_akan_ | kgriffs: any luck? | 19:38 |
kgriffs | as far as I can gather, it's not a timestamp issue | 19:44 |
kgriffs | if it were, you should see 204, not 404 | 19:44 |
kgriffs | the queue in the query definitely exists in the queues collection on primary | 19:45 |
kgriffs | I guess I can check the secondaries | 19:45 |
kgriffs | actually, one more thing also | 19:45 |
kgriffs | wait a sec | 19:46 |
kgriffs | eeeeenteresting | 19:46 |
kgriffs | mongodb.claims.create claims several messages by applying a claim ID to them | 19:46 |
kgriffs | then, it turns around and does a find for the same messages so it knows which were updated | 19:47 |
kgriffs | that would be a problem for an eventually consistent collection, but you said that you still get a 404 when not reading from secondaries? | 19:48 |
oz_akan_ | kgriffs: no | 19:49 |
oz_akan_ | kgriffs: I just had the very first request when not reading from secondaries | 19:49 |
oz_akan_ | just one request amongst thousands | 19:49 |
oz_akan_ | so I can say we have problem only when we write to primary and read from secondaries | 19:50 |
oz_akan_ | claim id logic might be the case then | 19:50 |
kgriffs | odd that you would still get a single 404 tho | 19:50 |
kgriffs | ok, so here is what the code does | 19:51 |
oz_akan_ | I didn't test write-read from primary enough times to say whether we always get a 404 on the first request | 19:52 |
kgriffs | ok | 19:52 |
kgriffs | well, we definitely have a race condition when reading from secondaries | 19:52 |
kgriffs | what happens is this: | 19:52 |
kgriffs | a batch of messages is tagged by ID | 19:52 |
oz_akan_ | (I feel like watching a thriller, with popcorn ) | 19:53 |
kgriffs | but then the claims controller reads those back as a sort of sanity check, in case in between creating a list of message IDs to update, and actually tagging them with the claim, some of them may have gotten claimed by another process | 19:53 |
kgriffs | LOL | 19:53 |
kgriffs | aaaanyway | 19:53 |
kgriffs | here's the scary part | 19:53 |
oz_akan_ | oh | 19:54 |
kgriffs | A Nobel Peach Prize laureate is about to start World War III | 19:54 |
kgriffs | no, wait, that's not it | 19:54 |
* kgriffs gets minds back on topic | 19:55 | |
kgriffs | s/peach/peace | 19:55 |
kgriffs | <sigh> | 19:55 |
oz_akan_ | (let me drink this cold coke) | 19:55 |
kgriffs | ok, so if the secondary is behind the primary by enough, then that final get to make the list of claimed IDs returns an empty result set. This then triggers mongodb.get to raise exceptions.ClaimDoesNotExist | 19:57 |
kgriffs | which is then propagated up the stack by mongodb.create | 19:57 |
kgriffs | and finally... | 19:57 |
* kgriffs queues scary music | 19:57 | |
* oz_akan_ hiding | 19:57 | |
kgriffs | wsgi.claims.on_post catches ClaimDoesNotExist and converts it to falcon.HTTPNotFound()!!!! | 19:58 |
kgriffs | <dun-dun-duuuuuuun!> | 19:58 |
oz_akan_ | what a tragedy | 19:58 |
kgriffs | you said it | 19:58 |
kgriffs | I cried the whole time | 19:58 |
oz_akan_ | :o | 19:58 |
kgriffs | (when I wasn't hiding) | 19:59 |
kgriffs | so, we need to either change the logic, add a retry loop, or always read from the primary for that one call | 19:59 |
kgriffs | so many choices, so little time… :p | 19:59 |
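[Editor's sketch of the race kgriffs just walked through, plus the retry option he mentions: a toy "secondary" that serves reads one replication step behind the primary, and a claim read that retries until the replica catches up. Stdlib-only; none of these names are Marconi code.]

```python
import itertools

class LaggingSecondary:
    """Toy replica set: reads from the secondary lag the primary by one tick."""
    def __init__(self):
        self.primary = {}
        self.secondary = {}

    def write(self, key, value):
        self.primary[key] = value            # writes go to the primary only

    def replicate(self):
        self.secondary = dict(self.primary)  # one replication "tick"

    def read(self, key, prefer_secondary=True):
        src = self.secondary if prefer_secondary else self.primary
        return src.get(key)

def read_claim_with_retry(db, claim_id, attempts=3):
    """Retry a secondary read; the lagging replica eventually catches up."""
    for i in itertools.count():
        found = db.read(claim_id)
        if found is not None:
            return found, i
        if i + 1 >= attempts:
            raise LookupError("ClaimDoesNotExist -> would become HTTP 404")
        db.replicate()  # stand-in for waiting out replication lag

db = LaggingSecondary()
db.write("claim-1", ["msg-a", "msg-b"])

# Immediate secondary read misses: this is the spurious 404...
assert db.read("claim-1") is None
# ...but a primary read, or a retry, succeeds.
assert db.read("claim-1", prefer_secondary=False) == ["msg-a", "msg-b"]
messages, retries = read_claim_with_retry(db, "claim-1")
assert messages == ["msg-a", "msg-b"] and retries == 1
```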
oz_akan_ | hmm | 19:59 |
oz_akan_ | fastest, w=3 | 20:00 |
oz_akan_ | so we will write to all, and read from secondaries preferred | 20:00 |
kgriffs | won't that hang if one of the nodes goes down? | 20:00 |
oz_akan_ | fastest, I mean, fastest to implement | 20:00 |
oz_akan_ | w=majority | 20:00 |
kgriffs | yes, but I am concerned about hanging when one node goes down | 20:00 |
kgriffs | majority I think mitigates that, correct? | 20:01 |
oz_akan_ | majority is safe | 20:01 |
oz_akan_ | if we set w=3, it would | 20:01 |
kgriffs | yeah | 20:01 |
kgriffs | that's what I'm worried about | 20:01 |
oz_akan_ | majority is a keyword | 20:01 |
oz_akan_ | ah.. | 20:01 |
oz_akan_ | sorry | 20:01 |
oz_akan_ | right | 20:01 |
oz_akan_ | majority is not good enough | 20:01 |
kgriffs | yeah | 20:02 |
oz_akan_ | maybe there is "all" .. I will check that | 20:02 |
oz_akan_ | ok, then at least if we read from primary we are fine | 20:02 |
kgriffs | yeah, I'm just wondering if that will impact performance any? | 20:02 |
oz_akan_ | it will, we can measure | 20:02 |
oz_akan_ | though I think we have so many writes that we have to go to primary most of the time anyways | 20:03 |
oz_akan_ | lets measure and see what we will have | 20:03 |
oz_akan_ | I think this could be a temporary solution anyways | 20:04 |
oz_akan_ | app needs to be able to cover this at some point | 20:04 |
oz_akan_ | app = marconi | 20:04 |
kgriffs | ok. retry would slow down claim creation anyway | 20:04 |
kgriffs | ok, let me prepare a patch for you to try | 20:04 |
kgriffs | I'll make it read from primary for just that one call (hopefully that won't be too convoluted ,heh) | 20:05 |
oz_akan_ | that one call | 20:09 |
kgriffs | doing the get from primary | 20:10 |
oz_akan_ | might not be easy as I think we decide that on the driver level while creating a connection | 20:10 |
kgriffs | I guess I could make all GET claim hit primary, or just when creating? | 20:10 |
kgriffs | mmm | 20:10 |
kgriffs | rings a bell | 20:10 |
kgriffs | seems like with RSE I had to keep two connections | 20:11 |
kgriffs | let me check it out | 20:11 |
oz_akan_ | oh that is trivky | 20:12 |
oz_akan_ | tricky means very tricky | 20:12 |
oz_akan_ | trivky means very tricky | 20:12 |
oz_akan_ | flaper87: missed the movie | 20:13 |
kgriffs | "You can specify a read preference mode on connection objects, database objects, collection objects, or per-operation." | 20:16 |
kgriffs | sweet | 20:16 |
kgriffs | let me try that | 20:16 |
oz_akan_ | per operation, amazing | 20:24 |
kgriffs | not sure if it will work for real, but we can try it. :p | 20:24 |
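[Editor's sketch of the per-operation read preference idea quoted above. Only the read-back inside claim creation needs to hit the primary; everything else can keep reading from secondaries. PyMongo layers preferences (client, database, collection, per-operation); this stdlib-only toy imitates that layering with illustrative class names.]

```python
PRIMARY, SECONDARY = "primary", "secondary"

class Node:
    """One member of a toy replica set."""
    def __init__(self, role):
        self.role = role
        self.data = {}

class ReplicaSetClient:
    """Routes each read by a per-operation preference, falling back to the
    client-wide default, loosely like pymongo's read_preference layering."""
    def __init__(self, default_pref=SECONDARY):
        self.nodes = {PRIMARY: Node(PRIMARY), SECONDARY: Node(SECONDARY)}
        self.default_pref = default_pref

    def insert(self, key, value):
        self.nodes[PRIMARY].data[key] = value  # writes always hit the primary

    def find_one(self, key, read_preference=None):
        pref = read_preference or self.default_pref
        return self.nodes[pref].data.get(key)

client = ReplicaSetClient()  # default: read from (possibly stale) secondary
client.insert("claim-1", "claimed")

# Default read goes to the stale secondary and misses...
assert client.find_one("claim-1") is None
# ...but this one call can opt into a primary read without changing
# the driver-wide connection settings.
assert client.find_one("claim-1", read_preference=PRIMARY) == "claimed"
```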
oz_akan_ | so you say this is the set up for the following episode | 20:26 |
oz_akan_ | I can't wait to watch it | 20:26 |
kgriffs | heh | 20:26 |
kgriffs | oz - what's the easiest way for me to get you the code change to try? | 20:33 |
oz_akan_ | a branch is easiest | 20:33 |
oz_akan_ | or like last time, I can apply a patch | 20:33 |
oz_akan_ | if you have a public fork I could install it easily | 20:34 |
kgriffs | ok | 20:35 |
kgriffs | let me do that | 20:35 |
oz_akan_ | ok | 20:35 |
*** ayoung is now known as ayoung-afk | 20:36 | |
kgriffs | oz: https://github.com/kgriffs/marconi | 21:03 |
kgriffs | I just set up a temporary fork to try this out | 21:03 |
kgriffs | (just use the master branch) | 21:03 |
kgriffs | not sure if this is easier than using gerrit, but whatever | 21:04 |
kgriffs | oz_akan_: ^^^ | 21:06 |
oz_akan_ | got it | 21:07 |
kgriffs | ok | 21:09 |
kgriffs | I was lazy and didn't test it locally (didn't want to set up a repl set) | 21:09 |
oz_akan_ | starting test | 21:10 |
oz_akan_ | w=1&readPreference=secondaryPreferred. so far no 404s | 21:11 |
oz_akan_ | am I experiencing a happy end? | 21:12 |
oz_akan_ | http://198.61.239.147:8000/log/20130904-2111/report.html | 21:12 |
oz_akan_ | I am not sure about the performance yet but with the patch it seems we don't get any more 404 | 21:13 |
oz_akan_ | please don't touch master branch on that repo | 21:13 |
oz_akan_ | I will run benchmark later this evening or tomorrow morning | 21:13 |
oz_akan_ | I have to leave now | 21:14 |
oz_akan_ | kgriffs: ^^ | 21:14 |
kgriffs | oh, ok. It doesn't include my perf patch for large queues, I just forked from master/mainline | 21:14 |
oz_akan_ | yes, there we have 60K+ messages | 21:14 |
oz_akan_ | I will run on an empty queue | 21:14 |
oz_akan_ | bye for now | 21:15 |
*** oz_akan_ has quit IRC | 21:15 | |
*** flaper87 is now known as flaper87|afk | 22:11 | |
*** tedross has quit IRC | 22:20 | |
*** oz_akan_ has joined #openstack-marconi | 22:27 | |
*** oz_akan_ has quit IRC | 22:31 | |
*** amitgandhi has quit IRC | 22:39 | |
*** oz_akan_ has joined #openstack-marconi | 23:11 | |
*** oz_akan_ has quit IRC | 23:53 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!