Thursday, 2015-06-04

*** alex_klimov has quit IRC00:04
*** Apoorva has quit IRC00:54
*** openstack has joined #openstack-api01:22
*** openstack has joined #openstack-api01:37
*** openstack has quit IRC01:52
*** openstack has joined #openstack-api01:53
openstackgerritAlex Xu proposed openstack/api-wg: Add guideline of api microverion bump  https://review.openstack.org/18789603:20
openstackgerritAlex Xu proposed openstack/api-wg: Add guideline for api microversion specification  https://review.openstack.org/18711203:20
*** krotscheck has quit IRC04:19
*** gilliard has quit IRC04:22
*** krotscheck has joined #openstack-api04:25
*** gilliard has joined #openstack-api04:34
*** ryansb has quit IRC05:36
*** ryansb has joined #openstack-api05:45
*** ryansb has quit IRC05:45
*** ryansb has joined #openstack-api05:45
*** markdstafford has joined #openstack-api06:00
*** flaper87 has quit IRC06:33
*** flaper87 has joined #openstack-api06:33
*** terrylhowe has quit IRC06:54
*** woodster_ has quit IRC07:00
*** alex_klimov has joined #openstack-api07:52
openstackgerritMatthew Gilliard proposed openstack/api-wg: s/call/request/ - This isn't RPC  https://review.openstack.org/18799107:55
*** lucasagomes has joined #openstack-api08:09
*** salv-orlando has quit IRC09:08
*** cdent has joined #openstack-api09:21
*** e0ne has joined #openstack-api09:44
*** salv-orlando has joined #openstack-api10:09
*** salv-orlando has quit IRC10:17
*** salv-orlando has joined #openstack-api10:20
*** salv-orlando has quit IRC10:21
*** salv-orlando has joined #openstack-api10:23
*** woodster_ has joined #openstack-api11:00
*** lucasagomes is now known as lucas-hungry11:27
*** e0ne is now known as e0ne_11:39
*** e0ne_ has quit IRC11:49
*** lucas-hungry is now known as lucasagomes12:07
*** e0ne has joined #openstack-api12:13
*** e0ne is now known as e0ne_13:01
*** e0ne_ is now known as e0ne13:04
*** terrylhowe has joined #openstack-api13:49
*** sigmavirus24_awa is now known as sigmavirus2413:55
*** e0ne is now known as e0ne_13:59
*** e0ne_ has quit IRC14:04
*** e0ne has joined #openstack-api14:06
gilliardHi. alex_xu14:28
alex_xugilliard: hi :)14:28
gilliardAbout your microversion bump devref...14:29
alex_xuyup14:29
gilliardI was looking at a review earlier, something to do with live migration. The exact thing doesn't matter, but it needed an API error for the case where there was an rpc timeout internally in nova.14:30
gilliardThis is an internal server error, which we can catch and return to the user.14:31
alex_xugilliard: emm...good point14:32
gilliardWe have a large number of things which can go wrong in nova, so I think 500 is sometimes a valid response code.  Let me find the exact patch for you.14:33
gilliardhttps://review.openstack.org/#/c/168916/14:33
alex_xugilliard: so if we decided to turn the timeout error into some specific error code, we need to bump the microversion14:34
gilliardYes. Although I don't know what else we might change it to...14:34
gilliardThere's nothing in the HTTP spec for "Expected server-side error" ;) But it's definitely in the 5xx range imo14:35
gilliardLike "we knew this might happen", as opposed to "something totally unexpected went wrong"14:35
alex_xugilliard: yea, I need to rephrase the wording, that is a good point14:36
alex_xugilliard: actually, nova trusts the other components, but when their behavior isn't what's expected, nova returns 50014:38
sdagueso, as folks here might have perspectives on this as well - this is a draft post about trying to explain the changes in the Nova API in a more accessible way - https://dague.net/?p=4461&shareadraft=baba4461_557061c963266 as part of our communications plan. If anyone wants to comment on concepts or confusing bits from an outside perspective, it would be appreciated. I'll do typos and grammar nits later14:38
alex_xusdague: cool!14:38
*** jxstanford has joined #openstack-api15:00
cdentis this a 0:00 week for the meeting?15:05
cdentah, my timezoning is brokenated15:09
elmiko16:0015:09
elmikoso, like 50 minutes15:10
elmikoi'm updating the agenda now =)15:10
*** annegentle has joined #openstack-api15:13
ryansbthe new ical files are *great*15:13
ryansbthe tool they wrote for it lets you generate your own files with just meetings you care about, so it's 1000x more useful15:14
*** gmann has quit IRC15:14
cdentI think my calendar got horkenated during a migration15:18
*** gmann has joined #openstack-api15:19
elmikoryansb: are you talking about the red hat calendar stuff?15:22
elmikoor is this something openstack specific?15:22
ryansbelmiko: no, I mean the openstack one15:22
ryansb1 sec, I'll link you15:22
elmikointeresting... yea, please =)15:22
*** notmars has joined #openstack-api15:23
ryansbok, so the project is openstack-infra/yaml2ical15:23
ryansband it grabs meetings from openstack-infra/irc-meetings15:24
ryansbyou can get the Big Giant Master ical here http://eavesdrop.openstack.org/irc-meetings.ical15:24
elmikoneat15:24
ryansbor the list http://eavesdrop.openstack.org/15:24
ryansbsince we start with "A", we're top o' the list15:25
elmikoi usually just grep through that latter list15:25
elmikohehe15:25
sdagueryansb: yeh, I need to go play with that15:25
*** peterstac has joined #openstack-api15:43
*** alex_klimov has quit IRC15:50
*** annegentle has quit IRC16:08
*** Apoorva has joined #openstack-api16:19
*** pballand has joined #openstack-api17:00
sdaguemiguelgrinberg: so, I think my only concern with trying to do the caching stuff in Liberty, is there is going to be a bunch of stuff coming out of the API group in Liberty as is and projects are going to need time to digest and see what they are going to change in their existing stuff17:00
miguelgrinbergsdague: so what I'm thinking is that for a project that does not want to get bothered with caching, a middleware that sets the safe headers to prevent caching would be useful17:01
*** nikhil_k is now known as nikhil-afk17:01
sdagueand it feels like just getting best practices written down would be good this cycle, and get folks used to digesting that, then attack something like that next cycle as a priority17:01
cdentthat's what I was going to say: a first step is preventing caching for everything17:01
miguelgrinbergcdent: +117:02
cdentbut (as usual) I agree with sdague that we don't want to do too much too quickly, or else no one will do anything17:02
miguelgrinbergbut writing that in a guideline would imply projects have to code all this on their own17:02
miguelgrinbergversus saying here, add this middleware and you have something to start from17:02
cdentI'm cool with that but doing caching correctly (the part that actually allows caching to happen) requires really deep knowledge that a generic middleware will never be able to have, so that caveat will need to be present from the start17:03
sdagueyeh, I'm a bit scared if it's done as the middleware bit first, it will stop there17:03
sdagueand it seems like actually looking through what's protocol cacheable, what projects are returning, and where there are some mismatches (like GET /servers/ID_NOT_YET_CREATED - 404 and cachable by spec)17:04
sdaguerequires some effort.17:05
miguelgrinbergsdague: it's really hard to decipher, at least for me, how a cache should work in the default case, when no caching headers are included in the response17:05
miguelgrinbergit seems it isn't well defined17:05
sdagueso, the spec actually does say what's cachable unless otherwise defined17:06
sdagueI have a review up for that17:06
cdentthere's what the spec says and what happens out in the world and the overlap is confused17:06
miguelgrinbergyes, it does, but it does not cover the "how"17:06
sdaguecdent: sure17:06
miguelgrinbergwithout any specific guidance, how does a cache know for how long to cache a response?17:06
miguelgrinbergand for who?17:07
sdagueso, I'll tell you right now, chrome caches a 301 for eternity17:07
sdaguewhich has made my life a pain, as flushing that bit of its cache is... fun17:08
cdent(and ignores vary headers, which burns my hide)17:08
miguelgrinbergthat very much confirms that we should be concerned in adding a no-cache baseline17:09
*** lucasagomes has quit IRC17:10
sdagueyeh, though it would suck to add that to 404s that were valid and were malformed so never would work17:10
sdaguebecause I want those cached forever17:10
sdagueand get the load off the server17:10
sdagueor even the versions doc on /17:11
*** salv-orlando has quit IRC17:12
miguelgrinberganyway, it seems we can only write a fairly generic guideline on caching, which will mostly say follow the RFC, and when in doubt, add no-cache headers.17:14
miguelgrinbergand maybe then we follow up with a middleware at some point17:15
cdentthat seems like a starting point, which is far better than nothing17:16
sdagueyep, sure17:16
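The baseline agreed on here (follow the RFC and, when in doubt, add no-cache headers) could be sketched as a small pure-WSGI middleware. This is only an illustration, not an existing OpenStack component: the class name `NoCacheMiddleware`, the set of headers it checks, and the `no-store` value are all assumptions. The wrapper leaves a response alone whenever the application has already set a caching header of its own.

```python
class NoCacheMiddleware:
    """Hypothetical WSGI middleware: forbid caching unless the app opted in."""

    # Headers whose presence means the application made its own caching decision.
    CACHING_HEADERS = frozenset(
        ["cache-control", "expires", "etag", "last-modified"]
    )

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        def guarded_start_response(status, headers, exc_info=None):
            present = {name.lower() for name, _ in headers}
            if not present & self.CACHING_HEADERS:
                # Assumed default: tell clients and proxies not to store anything.
                headers = list(headers) + [("Cache-Control", "no-store")]
            return start_response(status, headers, exc_info)

        return self.app(environ, guarded_start_response)
```

A project would wrap its WSGI application once, for example `application = NoCacheMiddleware(application)`, and could later replace the blanket default with per-resource caching.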
*** nikhil-afk is now known as nikhil_k17:17
*** notmars has quit IRC17:19
*** e0ne has quit IRC17:22
cdentcaring about stuff seems to be a recipe for going quickly insane17:29
*** notmars has joined #openstack-api17:38
*** salv-orlando has joined #openstack-api17:40
krotscheckI'm sorry, but I disagree with "Disable caching for everything is a good first step".17:41
krotscheckIt's basically saying "Doing nothing is good!"17:41
krotscheck(Sorry I'm late to the conversation)17:41
cdentthat's not quite accurate krotscheck17:42
cdentit protects against things being cached unexpectedly17:42
cdentwhich is an improvement over the current situation17:42
krotscheckThe goal is to improve performance. If I am a client who's polling the services, I cause load on the server no matter what the cache settings are.17:42
krotscheckSo adding something that checks the response for etag matching may not reduce load on the server, but it _does_ reduce load on the wire and any clients that are using it.17:43
krotscheckAnd while I agree that ultimately, the service itself should perform sane etag checking and provide a code path that minimizes server load, adding a middleware that does this kind of etag caching is only an improvement. It doesn't add _more_ load to the server when it comes to how clients use it.17:44
krotscheckcdent: Also, it's not the API's job to define how clients choose to use the API.17:45
cdentoh sure krotscheck, I _completely_ agree with that last statement, but I'm not clear on how that's germane?17:45
krotscheckYou're basically removing my ability to make sane, performant choices.17:45
cdentah, okay, perhaps17:46
krotscheckSo hypothetically speaking, let's say we have an etag middleware that checks for oslo's created_at and updated_at fields in the response body, and generates an etag off of that.17:46
krotscheckNow I'm a client who's polling a resource.17:47
krotscheckI'm going to invoke server load on every request, period.17:47
krotscheckbecause I'm polling.17:47
krotscheckAdding this middleware allows me to be awesome about my own performance, and I don't really care about what the backend implementation is.17:48
krotscheckSo if at some point in the future the implementation becomes awesome for the API, I am not impacted, because I'm already making awesome choices.17:48
krotscheckBut it doesn't make things _worse_ for the API.17:48
cdentmiguelgrinberg what do you think about ^17:49
miguelgrinbergsorry, missed this discussion. I agree with krotscheck, and I have actually implemented something like this before. Not on openstack and not as a middleware, but I've done it.17:51
miguelgrinbergkrotscheck: I used an MD5 of the whole response body for the etag17:51
miguelgrinbergI did not have access to created/updated_at, which when available could be used to add a last-modified header in addition to the etag17:52
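The two validators mentioned here (an MD5 of the whole response body as the ETag, and a Last-Modified header derived from an `updated_at` timestamp) can be sketched with the standard library alone. The helper names `body_etag` and `last_modified` are hypothetical, and the second helper assumes the timestamp is a timezone-aware `datetime`.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime


def body_etag(body: bytes) -> str:
    # ETag values are quoted strings; MD5 is used as a fingerprint here,
    # not for security.
    return '"%s"' % hashlib.md5(body).hexdigest()


def last_modified(updated_at: datetime) -> str:
    # HTTP dates are always expressed in GMT.
    # Assumes updated_at is timezone-aware.
    return format_datetime(updated_at.astimezone(timezone.utc), usegmt=True)
```

The ETag changes whenever any byte of the body changes, while Last-Modified only has one-second resolution, which is one reason to send both when the timestamp is available.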
krotscheckmiguelgrinberg: I'm a little hesitant about doing that, because search results could be large, and md5'ing something like that could actually have a substantial performance impact.17:54
krotscheckAlso, search results really should never cache, period.17:54
krotscheck(aaactually, I think I may be able to argue my way out of that one)17:54
krotscheck(But I digress)17:54
miguelgrinbergnever say never :)17:54
miguelgrinbergIf you have timestamps, then that could be used for the etag as well, as you suggest.17:55
krotscheckmiguelgrinberg: I would love it if we could ask the database to see whether individual records in the search result set have changed, easily :)17:55
krotscheckmiguelgrinberg: I'm pondering an etag middleware that accepts two parameters: created_fieldname and updated_fieldname.17:56
krotscheckIf those don't exist, it does nothing.17:56
miguelgrinbergdo we have any API that names these differently?17:56
miguelgrinbergcreated_at/updated_at seem pretty safe17:57
krotscheckmiguelgrinberg: No idea, but I do not put it beyond openstack's governance model to allow special snowflakes.17:57
miguelgrinbergthough adding a way to override those defaults does not hurt17:57
miguelgrinbergso why no caching on search results?17:57
miguelgrinbergif you can get a good etag on them it's no different17:57
krotscheckmiguelgrinberg: Weeellll I dunno anymore.17:57
cdentI think the concern with this line of thinking is that it only works in situations where you have clear resources and unfortunately there are plenty of requests that result in responses that can't clearly claim nice resource orientation17:58
cdentand it is those requests which need caching-armor of some kind17:58
krotscheckcdent: That's why I'm thinking that etags are only generated if it detects the existence of timestamps.17:58
krotscheckThe resource implicitly opts-in to cache support.17:58
miguelgrinbergthat works for me17:59
cdentwould the bad stuff implicitly opt in to no-cache-ing17:59
krotscheckA different approach would be to allow cache-expiry per regex-matched path.17:59
krotscheckcdent: If they don't have those timestamps, they don't get etag and/or last-modified support. Simple.18:00
miguelgrinbergin that case they get no-cache headers18:00
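The implicit opt-in described above might look like the following sketch: a resource that carries the timestamp fields gets an ETag derived from them, and anything else gets no-cache headers. The function name `caching_headers`, the `no-store` value, and the way the two timestamps are combined are assumptions made for illustration; only the configurable `created_at` / `updated_at` field names come from the discussion.

```python
import hashlib

# Assumed fallback for responses that do not opt in.
NO_CACHE = [("Cache-Control", "no-store")]


def caching_headers(resource, created_field="created_at",
                    updated_field="updated_at"):
    """Return response headers for an already-deserialized resource."""
    if not isinstance(resource, dict):
        # Lists and other shapes (e.g. search results) do not opt in.
        return list(NO_CACHE)

    created = resource.get(created_field)
    updated = resource.get(updated_field)
    if created is None and updated is None:
        return list(NO_CACHE)

    # Fingerprint only the timestamps, not the whole body.
    fingerprint = ("%s/%s" % (created, updated)).encode("utf-8")
    return [("ETag", '"%s"' % hashlib.md5(fingerprint).hexdigest())]
```

For example, `caching_headers({"id": 1, "updated_at": "2015-06-04T17:58:00Z"})` yields an ETag header, while `caching_headers({"id": 1})` yields the no-cache fallback.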
cdentwhat miguelgrinberg is saying is what I'm asking18:00
cdentbecause that represents the concerns that sdague was mentioning earlier18:00
cdentsorry, but I gotta run, dinner18:00
cdentbiab18:00
krotscheckmiguelgrinberg: As for search results, I feel that the response structure of search results vary more than simple resource requests. Some API's return arrays of objects, others return an object with a results: field.18:01
krotscheckkk18:01
*** annegentle has joined #openstack-api18:01
miguelgrinbergkrotscheck: right, so that's why I just hashed the whole thing when I implemented this in the past18:01
krotscheckmiguelgrinberg: Right18:02
miguelgrinbergit's a baseline implementation, the API is free to generate a better etag, then the middleware is not going to contest it18:02
krotscheckmiguelgrinberg: Though, if a search endpoint wants to start caching results, and they respond with an object, they _could_ just add the modified_at field to the result instance.18:02
krotscheckAlso, any middleware we build should definitely check for api-set cache headers and deactivate itself if that logic's already been provided.18:03
miguelgrinbergkrotscheck: ah, that's not a bad idea, they just put the newest of all the timestamps in the collection18:03
miguelgrinbergkrotscheck: yup, if the server sets its own caching, then the middleware becomes a pass-thru18:04
miguelgrinbergactually, not sure you can come up with a good updated_at value to represent a page of a collection18:05
miguelgrinbergsince individual results can come and go18:05
krotscheckmiguelgrinberg: Hrm. Maybe if, for search results, the query parameters are added?18:06
krotscheckThe etag then would become a statement of "For this set of query parameters, the most recent modified_at date is X"18:06
krotscheckOh, I get what you're saying.18:07
krotscheckIf an older record is deleted, shifting some things around, then you're still saying the results aren't fresh when they really are.18:07
miguelgrinbergkrotscheck: it can't be done I think. Hashing the whole thing works, but I can't think of anything more subtle that is safe.18:09
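A small worked example of the problem with collection pages, using made-up records: when an older record is deleted and another older record shifts onto the page, the newest timestamp on the page is unchanged even though the page content differs, so only a hash of the whole body notices. The function names and sample data below are hypothetical.

```python
import hashlib
import json


def newest_timestamp_etag(records):
    # ISO 8601 strings in the same format and timezone sort chronologically.
    return '"%s"' % max(r["updated_at"] for r in records)


def body_hash_etag(records):
    body = json.dumps(records, sort_keys=True).encode("utf-8")
    return '"%s"' % hashlib.md5(body).hexdigest()


page_before = [
    {"id": 1, "updated_at": "2015-06-01T10:00:00Z"},
    {"id": 2, "updated_at": "2015-06-03T12:00:00Z"},
]
# Record 1 was deleted; record 3 (older than record 2) moved onto this page.
page_after = [
    {"id": 2, "updated_at": "2015-06-03T12:00:00Z"},
    {"id": 3, "updated_at": "2015-05-20T08:00:00Z"},
]

same_by_timestamp = (
    newest_timestamp_etag(page_before) == newest_timestamp_etag(page_after)
)
same_by_hash = body_hash_etag(page_before) == body_hash_etag(page_after)
```

Here `same_by_timestamp` is true and `same_by_hash` is false, so a client revalidating with the timestamp-based tag would keep showing the stale page.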
*** notmars has quit IRC18:14
krotscheckmiguelgrinberg: Nothing short of determining the ordered ID's in the result set plus the last time those resources are modified. We'd have to see what's faster.18:14
krotscheckmiguelgrinberg: i.e. is it faster to just hash the whole thing, or to iterate through it all to generate some kind of a hashable key?18:15
miguelgrinberggood question18:15
krotscheckmiguelgrinberg: Lemme write a quick test script.18:15
krotscheckI'm really curious now.18:17
miguelgrinbergawesome18:17
miguelgrinbergI'm going to guess MD5 of the whole thing is faster, but also very curious to find out what your experiment tells us18:18
*** annegentle has quit IRC18:21
*** annegentle has joined #openstack-api18:22
*** notmars has joined #openstack-api18:26
*** cdent has quit IRC18:29
*** cdent has joined #openstack-api18:33
*** elmiko is now known as _elmiko18:53
*** Apoorva has quit IRC19:04
*** annegentle has quit IRC19:07
*** salv-orlando has quit IRC19:15
*** salv-orlando has joined #openstack-api19:23
krotscheckmiguelgrinberg: I actually think the larger overhead is going to be in how long it takes to convert the data into a hashable data type.19:24
* krotscheck is trying to figure out how to hash a list :/19:24
miguelgrinbergkrotscheck: can you hash the response body? (i.e. the JSON blob)19:25
krotscheckmiguelgrinberg: Yep. Trying to figure out if that should be part of the algorithm or not. Does the response body arrive pre-encoded in a middleware's post handler?19:27
*** elmiko has joined #openstack-api19:27
krotscheckActually, that's a good question. Would I have to deserialize the response body in the middleware?19:27
miguelgrinbergthe middleware gets the response as a byte sequence19:28
miguelgrinbergit's the rendered JSON in our case19:28
krotschecklooks like the webob response has a json property where the data arrives deserialized.19:31
cdentassuming the response is a webob19:36
cdentI think you need to assume that middleware is pure WSGI19:37
cdentand nothing else19:37
cdentin which case miguelgrinberg is correct: it's a bytesequence19:37
elmikocdent: +119:37
cdentwhy am I still here? It's nearly 9pm19:38
* cdent grumbles19:38
elmikoheh, go relax cdent!19:38
cdentI keep getting distracted by weird stuff19:38
elmikoi know the feeling19:38
cdentthe latest: https://bugs.launchpad.net/gnocchi/+bug/146207619:39
openstackLaunchpad bug 1462076 in Gnocchi "file-based tooz locking unreliable under concurrent measurement posts" [Undecided,New]19:39
cdentbut yeah, I'm going to bail now that I've at least reported that problem and avoid trying to fix it today19:39
cdenthave a good evening all, keep fighting the good fight etc19:39
* cdent waves19:39
elmikotake care19:39
*** cdent has quit IRC19:39
*** e0ne has joined #openstack-api19:55
*** Apoorva has joined #openstack-api20:10
krotscheckmiguelgrinberg: http://paste.openstack.org/show/263825/20:33
krotscheckThat's assuming a webob.response, so there's no additional deserialization step.20:34
krotscheckFull hash is "just hash the raw body".20:34
krotscheckSelective hash is "Loop over the results and read that"20:34
krotscheckPayload is # of parameters per object. Result size is the number of objects in the result set.20:35
krotscheckAlso, that's running in pydevd, not regular python, so the performance is way worse, but the relationship is pretty clear. Selectively going through a result set for particular properties is WAY more expensive (by a factor of 1000) than just hashing the result body.20:36
krotscheckYou can see the script I wrote here: http://paste.openstack.org/show/263910/20:37
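For readers who cannot open the paste, a comparison along these lines could be set up as follows. This is not the script linked above, only a sketch under the same stated assumption that the results are already deserialized; `make_results`, `full_hash`, and `selective_hash` are hypothetical names, and the timings it prints depend entirely on the machine and interpreter.

```python
import hashlib
import json
import timeit


def make_results(result_size, payload):
    # result_size = number of objects, payload = extra fields per object.
    return [
        dict(
            {"id": i, "updated_at": "2015-06-04T20:%02d:00Z" % (i % 60)},
            **{"field_%d" % p: "value" for p in range(payload)}
        )
        for i in range(result_size)
    ]


def full_hash(body):
    # Hash the raw, already-serialized response body in one call.
    return hashlib.md5(body).hexdigest()


def selective_hash(results):
    # Walk the deserialized results and hash only id + updated_at.
    digest = hashlib.md5()
    for item in results:
        key = "%s:%s;" % (item["id"], item["updated_at"])
        digest.update(key.encode("utf-8"))
    return digest.hexdigest()


if __name__ == "__main__":
    results = make_results(result_size=1000, payload=20)
    body = json.dumps(results).encode("utf-8")
    print("full hash     ", timeit.timeit(lambda: full_hash(body), number=200))
    print("selective hash",
          timeit.timeit(lambda: selective_hash(results), number=200))
```

Varying `result_size` and `payload` shows how each approach scales with the number of objects and the width of each object.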
*** e0ne has quit IRC21:05
*** notmars has quit IRC21:09
*** notmars has joined #openstack-api21:10
*** elmiko has quit IRC21:11
*** notmars has quit IRC21:25
*** salv-orlando has quit IRC21:35
*** salv-orlando has joined #openstack-api21:41
*** notmars has joined #openstack-api21:41
*** _elmiko is now known as elmiko21:47
miguelgrinbergkrotscheck: just got back from lunch, taking a look at your test21:56
krotscheckmiguelgrinberg: Incidentally, I was wrong about the performance difference, it's actually an order of magnitude greater.21:57
miguelgrinbergso my guess was based on the fact that MD5 is all native code, while selectively looking for data in each result goes at Python speed21:58
*** sigmavirus24 is now known as sigmavirus24_awa22:13
*** notmars has quit IRC22:14
*** annegentle has joined #openstack-api22:35
*** annegentle has quit IRC22:40
*** annegentle has joined #openstack-api22:42
krotscheckmiguelgrinberg: That makes sense. It probably drops to the c-lib22:43
krotscheckmiguelgrinberg: With that in mind, it may make the most sense to just md5 _everything_ instead of trying some wacky modified date lookup.22:44
miguelgrinbergyes, that's what I did in previous projects, and always worked well for me22:45
miguelgrinbergand you do this only if the response doesn't have the etag already in it, as this gives the server the option to generate its own more efficient etags22:45
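The conclusion reached here (MD5 the whole body, but only when the response does not already carry an ETag) could be sketched as a pure-WSGI middleware like the one below. It is an illustration under several assumptions, not an existing library: the class name `ETagMiddleware` is hypothetical, only `200` responses to `GET` are tagged, the whole body is buffered in memory, the legacy WSGI `write()` callable is not supported, and `If-None-Match` is compared as a single exact value rather than parsed as a list.

```python
import hashlib


class ETagMiddleware:
    """Hypothetical WSGI middleware adding an MD5 ETag when none is set."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold back the real start_response until the body is known.
            captured["status"] = status
            captured["headers"] = list(headers)
            return lambda data: None  # write() is ignored in this sketch

        result = self.app(environ, capture)
        try:
            body = b"".join(result)
        finally:
            if hasattr(result, "close"):
                result.close()

        status, headers = captured["status"], captured["headers"]
        names = {name.lower() for name, _ in headers}

        if (environ.get("REQUEST_METHOD") == "GET"
                and status.startswith("200")
                and "etag" not in names):
            etag = '"%s"' % hashlib.md5(body).hexdigest()
            headers.append(("ETag", etag))
            if environ.get("HTTP_IF_NONE_MATCH") == etag:
                # The client already has this representation.
                status, body = "304 Not Modified", b""
                headers = [(n, v) for n, v in headers
                           if n.lower() not in ("content-length",
                                                "content-type")]

        start_response(status, headers)
        return [body]
```

Because the server still builds the full response before hashing it, this saves bandwidth and client work rather than server load, which matches the trade-off discussed earlier in the channel.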
*** Apoorva has quit IRC23:03
*** Apoorva has joined #openstack-api23:04
*** annegentle has quit IRC23:08
*** salv-orlando has quit IRC23:53
