Thursday, 2014-11-13

*** shakamunyi has quit IRC00:01
*** shakamunyi has joined #openstack-dns00:03
*** EricGonczer_ has quit IRC00:05
*** EricGonczer_ has joined #openstack-dns00:14
*** jmcbride has quit IRC00:18
*** EricGonczer_ has quit IRC00:19
*** paul_glass has joined #openstack-dns00:20
*** paul_glass has quit IRC00:25
*** eandersson has quit IRC00:39
*** eandersson has joined #openstack-dns00:40
*** EricGonczer_ has joined #openstack-dns00:41
*** rjrjr has quit IRC00:45
*** stevelle has left #openstack-dns00:50
*** amcrn has quit IRC01:04
*** nkinder has joined #openstack-dns01:07
*** rmoe has joined #openstack-dns01:12
*** shakamunyi has quit IRC01:14
*** GonZoPT has quit IRC01:22
openstackgerritOpenStack Proposal Bot proposed openstack/designate: Updated from global requirements  https://review.openstack.org/13409201:23
*** jmcbride has joined #openstack-dns01:29
*** jmcbride has quit IRC01:30
*** jmcbride has joined #openstack-dns01:31
*** EricGonczer_ has quit IRC01:46
*** jmcbride has quit IRC01:46
*** nosnos has joined #openstack-dns02:06
*** stevelle has joined #openstack-dns02:44
*** richm has quit IRC02:45
*** nosnos has quit IRC03:25
*** nosnos has joined #openstack-dns03:57
*** pfreund_ has quit IRC04:05
*** CaptTofu has quit IRC04:06
*** pfreund_ has joined #openstack-dns04:06
*** CaptTofu has joined #openstack-dns04:07
*** harmw_ has joined #openstack-dns04:08
*** simonmcc_ has joined #openstack-dns04:10
*** ahu_ has joined #openstack-dns04:11
*** vipul has quit IRC04:12
*** harmw has quit IRC04:12
*** simonmcc has quit IRC04:12
*** filler has quit IRC04:12
*** ahu has quit IRC04:12
*** vipul has joined #openstack-dns04:12
*** simonmcc_ is now known as simonmcc04:14
*** f1ller has joined #openstack-dns04:14
*** alokj has joined #openstack-dns05:20
*** k4n0 has joined #openstack-dns06:09
*** cbaesema has quit IRC06:38
*** cbaesema has joined #openstack-dns06:42
*** rmoe has quit IRC07:45
*** yfujioka has joined #openstack-dns09:19
*** jordanP has joined #openstack-dns09:29
*** MasterPieceF has joined #openstack-dns10:15
*** MasterPiece has quit IRC10:15
*** MasterPieceF is now known as MasterPiece10:15
*** MasterPiece has quit IRC10:25
*** mwagner_lap has quit IRC10:51
openstackgerritEndre Karlson proposed openstack/designate: Support secondary zones  https://review.openstack.org/13368210:52
*** nosnos has quit IRC11:34
*** nosnos has joined #openstack-dns11:34
Kialleandersson: aha, with paris etc I totally forgot about the issue. I can take a look tomorrow, today is a busy day!11:35
openstackgerritMerged openstack/designate: Move import code to dnsutils  https://review.openstack.org/13368111:35
openstackgerritEndre Karlson proposed openstack/designate: Support secondary zones  https://review.openstack.org/13368211:37
*** nosnos has quit IRC11:39
*** stevelle has left #openstack-dns11:54
*** alokj has quit IRC12:25
*** yfujioka has quit IRC12:28
*** mwagner_lap has joined #openstack-dns12:34
openstackgerritEndre Karlson proposed openstack/python-designateclient: Move some useful code outside v1  https://review.openstack.org/13418612:53
openstackgerritEndre Karlson proposed openstack/python-designateclient: Move session creation up to shell  https://review.openstack.org/13320812:56
openstackgerritEndre Karlson proposed openstack/python-designateclient: Move some useful code outside v1  https://review.openstack.org/13418613:11
openstackgerritEndre Karlson proposed openstack/python-designateclient: V2 CLI  https://review.openstack.org/13367613:11
openstackgerritEndre Karlson proposed openstack/python-designateclient: V2 Bindings  https://review.openstack.org/13419613:11
ekarlsoKiall: ^13:11
openstackgerritEndre Karlson proposed openstack/python-designateclient: Move session creation up to shell  https://review.openstack.org/13320813:11
ekarlsothe session  / Move code thing13:12
openstackgerritEndre Karlson proposed openstack/python-designateclient: Move session creation up to shell  https://review.openstack.org/13320813:28
openstackgerritEndre Karlson proposed openstack/python-designateclient: V2 Bindings  https://review.openstack.org/13367513:29
openstackgerritKiall Mac Innes proposed openstack/designate: Add Server object validations  https://review.openstack.org/12962513:39
openstackgerritKiall Mac Innes proposed openstack/designate: Add basic validation functionality to DesignateObjects  https://review.openstack.org/12784613:39
openstackgerritKiall Mac Innes proposed openstack/designate: Support Nested/Recursive Object Validations  https://review.openstack.org/12989513:39
openstackgerritKiall Mac Innes proposed openstack/designate: Add Domain object validations  https://review.openstack.org/12990913:39
*** ryanpetrello has joined #openstack-dns13:49
openstackgerritKiall Mac Innes proposed openstack/designate: Add Server object validations  https://review.openstack.org/12962513:52
openstackgerritKiall Mac Innes proposed openstack/designate: Support Nested/Recursive Object Validations  https://review.openstack.org/12989513:52
openstackgerritKiall Mac Innes proposed openstack/designate: Add Domain object validations  https://review.openstack.org/12990913:52
*** ryanpetrello has quit IRC13:54
*** nkinder has quit IRC14:08
*** k4n0 has quit IRC14:10
*** richm has joined #openstack-dns14:12
*** MasterPiece has joined #openstack-dns14:21
*** MasterPiece has quit IRC14:22
betsyHey kiall/mugsie: I’ve got a question for y’all re the pool API14:30
*** mikeit has joined #openstack-dns14:44
*** jmcbride has joined #openstack-dns14:46
*** mikeit has quit IRC14:48
openstackgerritMerged openstack/designate: Updated from global requirements  https://review.openstack.org/13409214:54
*** nkinder has joined #openstack-dns14:54
Kiallbetsy: heya14:58
*** ryanpetrello has joined #openstack-dns15:04
betsykiall: hey15:06
betsySo, I just want to make sure the pool api is behaving as expected.15:07
betsythe default pool has null for the tenant id15:07
betsyI’m running a vagrant box with no-auth15:07
betsyBut when I make an api call to GET all pools, it comes back empty15:07
betsyIt’s trying to match where tenant-id is ‘no-auth’, and sense it’s null, it doesn’t match15:08
betsyI can elevate the context to tenant=all in the call to central, but I was wondering if that was the correct thing to do or not15:08
betsyBecause then any admin would get all the pools, but then we only have one now anyway15:08
betsyDoes that make sense?15:09
betsy^since15:09
betsy^ for the first ‘sense’ not the last one ;)15:10
KiallSorry - Was AFK :)15:11
betsynp15:11
KiallSo.. Humm.. I guess the Q is, should normal users be able to see pools that don't belong to them? Even shared/default pools?15:12
KiallI think we talked about them not being able to see them, but I can't remember 100% TBH.15:12
betsyThat’s what I thought. But now as an admin on a no-auth system, I can’t see the default pool15:13
betsySince the admin (me) didn’t create it and it has a null tenant-id15:13
betsyI’m not sure how to fix that tho, other than to change the all-tenant=True15:14
betsyand not sure if that’s the right thing15:14
KiallAh, Okay.. I get you...15:16
betsyOf course, it wouldn’t matter at this time, since we only have one pool and if the admin has permission to view pools, he should be able to see it15:17
KiallLet me re-read the code that does the tenant filtering to see what the "right" thing to do is..15:17
betsyOk. Thanks.15:17
betsyI’ll be glad to open a bug and fix it, just wanted to check with you first on what it should be15:18
openstackgerritGraham Hayes proposed openstack/designate: Added functionality to allow for zone ownership transfers  https://review.openstack.org/10782215:19
Kiallbetsy: so.. I think the right way for this to work is, (explaining in a roundabout way..), as an admin user, if I query for a resource which has a tenant_id field, I should by default see resources where tenant_id = admins_tenant_id OR tenant_id IS NULL.. while, as a normal user, I only get to see tenant_id = user_tenant_id15:22
KiallSo.. this method would be updated: https://github.com/openstack/designate/blob/master/designate/sqlalchemy/base.py#L11715:22
*** MasterPiece has joined #openstack-dns15:24
betsyOk. The _apply_tenant_criteria?15:29
betsyThat makes sense15:29
betsyOkay. I’ll open that bug and make the change15:29
betsyThanks15:29
KiallYea, we can do a policy.check(context, 'admin') in there and change that else block to do the `OR tenant_id == NULL` for admin users15:29
mugsieshould the pool not just be created with the managed tenant_id ?15:29
mugsiei thought that was part of the reason we added the managed_tenant_id config var15:30
betsyIt’s created during the db migration15:30
mugsieit should have access to the managed_tenant_id var in the migration afaik15:30
Kiallmugsie: I think that's stretching the "managed" concept into something it wasn't really intended for15:30
mugsienot really...15:31
Kialland - Maybe we do want users to be able to see (but not edit) "global" / "shared" resources .. NULL makes much more sense than granting access to some other tenant_id that has lots of other stuff in it.15:31
mugsieit is a managed resource15:31
KiallManaged resources are resources which we manage on behalf of a tenant.. A shared pool isn't a single tenants15:32
mugsiewe don't want to give access to pools directly - (as was discussed in Seattle)15:32
mugsieit is, its part of the admin / service tenant15:32
KiallYea.. That's what I remember too, but the concepts used here should map to all global things.. which may end up with something we DO want to show to users..15:32
Kialltenant_id=NULL for something which doesn't belong to a tenant is IMO the correct thing.15:33
mugsieand the managed_tenant_id for something that is managed by the admin is right IMO15:33
Kiallmanaged_tenant_id isn't "admin managed". It's a resource managed by the system itself on behalf of a tenant. I wrote it, I know what the intention behind it was ;)15:34
Kiallbesides.. seeing SELECT * FROM table WHERE tenant_id = UUID_A OR tenant_id = UUID_B; vs SELECT * FROM table WHERE tenant_id = UUID_B OR tenant_id IS NULL; - Which one is clearer?15:35
betsySorry, mugsie. I have to side with Kiall on this one. :)15:35
openstackgerritMerged openstack/designate: Add basic validation functionality to DesignateObjects  https://review.openstack.org/12784615:35
mugsiebetsy: fair enough :)15:35
mugsiehas to happen some of the time15:36
betsy:D15:36
betsySo, I’ll open the bug and fix it15:36
*** timsim has joined #openstack-dns15:37
Kiallbetsy: Cool :)15:37
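The filtering rule agreed above (admins see their own tenant's rows plus rows with a NULL tenant_id, normal users see only their own) can be sketched roughly as follows. This is only an illustration using the standard-library sqlite3 module, not the actual `_apply_tenant_criteria` code; the `tenant_criteria` and `list_pools` helpers, the `pools` table layout, and the sample tenant IDs are all assumptions made for the example.

```python
import sqlite3


def tenant_criteria(tenant_id, is_admin):
    """Return a (where_clause, params) pair for tenant filtering.

    Hypothetical helper: admins also match rows whose tenant_id is NULL,
    normal users only match their own tenant.
    """
    if is_admin:
        return "tenant_id = ? OR tenant_id IS NULL", (tenant_id,)
    return "tenant_id = ?", (tenant_id,)


def list_pools(conn, tenant_id, is_admin):
    """List pool names visible to the caller (illustrative schema)."""
    where, params = tenant_criteria(tenant_id, is_admin)
    rows = conn.execute(
        "SELECT name FROM pools WHERE " + where + " ORDER BY name", params
    )
    return [name for (name,) in rows]


# Sample data: one shared pool with a NULL tenant_id, two tenant-owned pools.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pools (name TEXT, tenant_id TEXT)")
conn.executemany(
    "INSERT INTO pools VALUES (?, ?)",
    [("default", None), ("private", "tenant-a"), ("other", "tenant-b")],
)
```

With this rule, a no-auth admin whose tenant is the literal string 'no-auth' still sees the default pool, which is the case betsy hit.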
Kialltimsim: hows the db indexing testing going BTW?15:41
*** paul_glass has joined #openstack-dns15:41
timsimKiall: Good, the indices I have up now improved performance massively. Not ideal, but it's a start. Went from falling over around 50k zones to staying alive past 700k zones, 2 mil records. Some of the response times were like five seconds, but no API failures.15:42
Kiallalso - betsy - re https://review.openstack.org/#/c/133549/ .. I'm not sure I understand the reasoning for exposing "pool_attributes" as a full-fledged resource, rather than linking it into the standard Pool resource?15:43
Kialltimsim: cool - you think they're OK to merge as is? (After a rebase + migration renumber).. If so, I'll have a look in the next few mins..15:44
betsyWell, started wondering that myself after I did it, but I was thinking I need those calls to migrate the server table to the pool_attributes table15:44
*** paul_glass1 has joined #openstack-dns15:44
KiallWouldn't the migration just use the DB directly?15:44
betsyWell, and to make the v1 api server create call work15:45
timsimKiall: Yep, I believe so. I'll rebase/up the migration number in a bit.15:45
KiallFor that - Can't we "attach" attributes to the Pool object, like we attach records to the RecordSet object?15:45
betsyhmm. let me rethink that15:45
*** EricGonczer_ has joined #openstack-dns15:47
betsyWell, we do still have records CRUD calls in central to support v115:47
timsimbetsy: Reading the scroll a bit, somewhat unrelated, but maybe still helpful for the future. If you pass an X-Auth-All-Projects:true header you can see the resource you're looking for, for all tenants: https://github.com/openstack/designate/blob/master/designate/api/middleware.py#L7215:47
KiallYea.. I think it's a tradeoff, in that we end up loading the attributes even if they aren't needed that way.. vs having 500+ lines of extra code and a larger central API.. I'm not convinced I actually have my mind made up on which is better ;)15:47
*** paul_glass has quit IRC15:48
Kiallbetsy: the record CRUD calls really need to be removed :)15:48
betsyok. let me look at my implementation some more on moving the v1 server call15:48
Kiall(Also .. I'm dying to remove piles of code like the record CRUD, and piles of almost-duplicate per-resource code... Hence every time I see more, I cry a little inside :P)15:49
KiallJust not sure what the best way to get rid of most of it is yet ;)15:50
betsyok, ok. I’ll try to reduce that. ;)15:50
betsytimsim: true, but won’t work in a production env. At least I hope not15:54
Kiallheh - Records APIv1 mostly uses the RRSet methods already.. Bonus ;)15:54
KiallAnd, I can see why... get record by ID isn't something you can do via the RRSet interfaces...15:56
openstackgerritTim Simmons proposed openstack/designate: Add some helpful SQL indices  https://review.openstack.org/12967515:56
Kialltimsim: Q - Why the index_exists method in ^?16:00
KiallAlso.. Do you have a SQL dump of the largest Designate DB you managed? Save me building one up!16:01
Kiall(I'm assuming it was built with test data...)16:01
*** EricGonczer_ has quit IRC16:02
timsimYeah I do actually. Somewhere.16:02
timsimThe index_exists is there just to make sure that the migration doesn't fail if something has gone awry. Ideally you wouldn't need it, but bad things happen if that index already exists. If you try and add one that exists, or drop one that doesn't, or really do anything that isn't in absolute ideal conditions, the whole thing blows up. So I had it there as a safeguard.16:04
timsimPlus, how can you not love: [str(x).split('.')[1] for x in index[1:]]16:04
Kiall:)16:06
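The safeguard timsim describes, checking whether an index exists before creating or dropping it, might look something like the sketch below. It uses the standard-library sqlite3 module and its `sqlite_master` catalog purely for illustration; the real migration targets MySQL through the project's migration tooling, and the `records` table, the `records_domain_id` index name, and the `upgrade`/`downgrade` bodies here are assumptions.

```python
import sqlite3


def index_exists(conn, table, index_name):
    """Return True if index_name is already defined on table."""
    rows = conn.execute(
        "SELECT 1 FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = ? AND name = ?",
        (table, index_name),
    )
    return rows.fetchone() is not None


def upgrade(conn):
    # Only create the index when it is missing, so a re-run is harmless.
    if not index_exists(conn, "records", "records_domain_id"):
        conn.execute("CREATE INDEX records_domain_id ON records (domain_id)")


def downgrade(conn):
    # Only drop the index when it is present.
    if index_exists(conn, "records", "records_domain_id"):
        conn.execute("DROP INDEX records_domain_id")


# Illustrative table, not the real Designate schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id TEXT, domain_id TEXT)")
```

Running `upgrade(conn)` or `downgrade(conn)` twice in a row is then a no-op the second time rather than an error.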
*** EricGonczer_ has joined #openstack-dns16:23
*** jmcbride has quit IRC16:24
*** jmcbride has joined #openstack-dns16:26
timsimKiall: Designate DB dump, 623k zones, 1.87 mil records http://964700e4a3d9dbf5b5ba-7a27b8c5d9fcdc26d383a194ab4f0ebe.r14.cf2.rackcdn.com/designatedump.sql16:28
timsim888 MB ^16:28
KiallCool, pulling it down now :)16:28
Kiallprobably going to take a while to import.. Heh..16:35
timsimYeah it took like...2 days to load up via the API, haha. Anything is better than that.16:39
*** mwagner_lap has quit IRC16:42
Kialltimsim: running on a VM or metal?16:43
timsimMany VMs.16:43
KiallWell, the DB, single or traditional replication, or percona cluster etc? :)16:44
timsim3-node galera cluster.16:44
timsimPercona xtrabackup.16:44
openstackgerritEndre Karlson proposed openstack/designate-specs: [designateclient] v1 support for keystone sessions  https://review.openstack.org/10503316:46
Kialltimsim: another Q ... Any tuning of the out of the box mysql/percona settings?16:49
timsimYep, just a sec16:49
timsimThey were 8GB vms. https://gist.github.com/TimSimmons/5b7fa48d95d7ee364d3816:50
timsimHold, on that might not be right.16:51
timsimYeah I think that was right.16:52
timsimIt wasn't much, just a bit of extra config.16:53
KiallK.. Still loading into MySQL anyway ;)16:53
timsimHah.16:53
timsimI did this at some point, not sure if I ever shared, might be of some interest: https://gist.github.com/TimSimmons/4bc5936d9ad8325d3bd216:54
Kialllol @ the ASCII art16:54
timsimI have no idea why I decided to do that. So dumb, it's not searchable!16:55
Kiall;)16:55
mugsietimsim: #v2/zones?name=*poo.com*16:56
mugsienice16:56
KiallSHOW FULL PROCESSLIST; while importing that data == bad idea.16:56
timsimAlways my default test domain.16:56
mugsie:D16:57
KiallI think the import is maxing out my SSD's write performance.. lol16:58
*** jmcbride has quit IRC16:59
timsimKiall: How do you feel about adding a column for domains/maybe others with the name field reversed?17:01
*** rmoe has joined #openstack-dns17:01
KiallWe talked about that in the past, as a possible speed up for sub/super domain checking as far as I recall17:01
mugsieyup17:02
KiallAt the time I seem to remember the discussion ending with something like "Let's test it and see"17:02
timsimYep, that was what I was thinking. I was going to try and test what kind of improvement we get17:02
timsimCool :P17:02
KiallYea, It's worth testing .. I'm expecting that once this data loads (still going...), we'll have some more stuff we can fixup in the DB.. e.g. even with your indexes, many queries are resulting in a filesort...17:04
timsimOh yeah. There's plenty more to do.17:05
timsimThere's probably going to have to be a migration that basically drops all the foreign key relationships, unique key constraints, creates a few more indexes, and adds it all back. But trying to run that in production *shudders*17:07
*** jmcbride has joined #openstack-dns17:11
KiallYea, a massive DB cleanup really means taking a downtime hit :/17:11
timsimNova has ~70 indexes, not counting Unique Constraints: https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py :O17:17
*** betsy has quit IRC17:24
* Kiall is probably going to regret running the latest migrations against your test DB..17:26
KiallYep.. #42 is a biggie :(17:26
KiallAha.. Luckily you had very few MX or SRV records ;)17:27
openstackgerritGraham Hayes proposed openstack/designate: Added functionality to allow for zone ownership transfers  https://review.openstack.org/10782217:28
*** openstackgerrit has quit IRC17:34
*** openstackgerrit has joined #openstack-dns17:34
*** ChanServ sets mode: +v openstackgerrit17:34
timsimhaha.17:34
timsimYeah that dump definitely had my indices in there already. I wonder what it'd be like to create them on something that large.17:35
*** GonZo2K has joined #openstack-dns17:36
*** jordanP has quit IRC17:36
*** paul_glass1 has quit IRC17:39
*** MasterPiece has quit IRC17:40
Kialltimsim: BTW.. You mentioned some queries/API calls were taking ages with that dataset, which ones in particular?17:42
*** rmoe has quit IRC17:43
KiallFetching all zones for T10 (24k zones) via the API is taking 30 seconds, with the vast majority of the time being spent in Python (most likely serializing/deserializing for passing over the wire on rabbit..)..17:44
*** rmoe has joined #openstack-dns17:45
*** jmcbride has quit IRC17:45
timsimI believe the first two queries in this doc were the main culprits: https://gist.github.com/TimSimmons/f573c797ddec1b37d2f117:47
timsimAccording to a DBA, the first one can be fixed by reordering the unique key constraint (wat). The second one would be using that reverse name column.17:48
KiallInterestingly, I'm seeing that query complete in 0.01 sec17:49
Kiall(for a different domain that exists in the dataset)17:49
timsimProbably not that one then. :P17:50
timsimIs that the one with the wildcard? or the inner join?17:50
Kiall:D17:50
KiallAh yea, Wildcard one is taking 2.3 sec...17:50
timsimYeah I think that was the main problem.17:50
Kiallinner join was 0.0.117:50
Kiall0.01*17:50
timsimMaybe I had that one because it got executed so many times or something. idk.17:51
timsimThe problem with the wildcard is the left wildcard can't use an index. If we flipped it and used a right, it'd be better supposedly.17:52
KiallI suppose that makes sense :)17:52
timsimI was going to try it out this afternoon and see17:52
timsimIf that got improved, and the record patch/put thing got resolved, that'd be a huge performance boost.17:53
KiallYea, just adding a reversed name now to test17:53
Kiallrecord patch/put thing?17:54
Kiallmysql> update domains set rname=REVERSE(name); <-- wonder how long this will take ;)17:54
timsimWhen you PUT to add a record to a recordset, it deletes all of them and recreates them plus the one you added?17:55
KiallUhh.. I'm 99% sure I wrote code to avoid doing that :/17:55
Kiallcrud -_-17:55
timsimI could check again, but I'm pretty sure I saw something do like 80 queries to add a record.17:56
mugsie-_-17:56
KiallYea.. I did: https://github.com/openstack/designate/blob/master/designate/storage/impl_sqlalchemy/__init__.py#L39817:56
KiallI wonder if that's not being triggered somehow17:56
KiallOO.. The API layer needs something similar or it won't work :/17:57
timsimYeah here are the queries https://gist.github.com/TimSimmons/4bc5936d9ad8325d3bd2#file-queries-sql-L23317:57
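The behaviour Kiall expected, touching only the records that actually changed instead of deleting and recreating the whole recordset, comes down to a set difference. A minimal sketch, assuming records can be compared by their data strings; `diff_records` is a hypothetical helper, not a Designate function.

```python
def diff_records(existing, desired):
    """Return (to_add, to_delete) so unchanged records are left alone.

    Both arguments are iterables of record data strings; the results
    are sorted lists so the output is deterministic.
    """
    existing, desired = set(existing), set(desired)
    return sorted(desired - existing), sorted(existing - desired)


# Adding one record to a two-record set needs one INSERT and no DELETEs.
to_add, to_delete = diff_records(
    ["192.0.2.1", "192.0.2.2"],
    ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
)
```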
* mugsie is listening to The Naked and Famous - Punching in a Dream17:59
mugsiegah - damn scrollback17:59
Kialltimsim: wow.. http://paste.openstack.org/show/132901/18:00
Kiall(normal vs reversed)18:00
timsimThat's....better.18:00
Kialltimsim: slightly18:00
KiallProbably not enough to be worth it though ;)18:00
mugsie2.36 vs 0 ?18:01
mugsiethat's definitely worth it ;)18:01
KiallYea, it's only a wee difference ;)18:01
timsimI'll say18:01
*** EricGonc_ has joined #openstack-dns18:02
Kiallthe actual col + index etc I added.. just for reference .. http://paste.openstack.org/show/132902/18:02
*** EricGonczer_ has quit IRC18:02
KiallSo.. I'd say adding that is a yes ;) But, ideally we hide it inside the SQLAlchemy plugin (it's a very SQL-specific optimization after all)18:03
mugsieyeah, I would agree18:03
Kiall(and.. we can't really use rname as the column name, since that has special meaning in DNS and will be confusing :P)18:03
*** openstackgerrit has quit IRC18:03
*** openstackgerrit has joined #openstack-dns18:04
*** ChanServ sets mode: +v openstackgerrit18:04
timsimYeah that'd be best18:04
mugsieit also would not necessarily be relevant in another DB18:04
timsimPostgres wouldn't benefit?18:05
mugsieit would, but mongo might not18:05
mugsieor couchDB18:05
* mugsie gets sick think of anyone trying to run this on couchdb18:06
mugsiethinking*18:06
timsimTrue, but they wouldn't be using SQLA for that?18:06
Kialltimsim: yea, hence hiding it inside SQLA :)18:06
Kiallinside the SQLA driver18:06
timsimOk, I've caught up now18:06
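The reversed-name idea works because a suffix match on the name ("%example.com.") is the same as a prefix match on the reversed name (".moc.elpmaxe%"), and a prefix pattern is something a B-tree index can serve. A rough sketch with the standard-library sqlite3 module follows; the `domains` table layout, the `reverse_name` column name, and the helper functions are assumptions for illustration, and whether a given engine actually uses the index depends on that engine and its collation settings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE domains (name TEXT, reverse_name TEXT)")
conn.execute("CREATE INDEX domains_reverse_name ON domains (reverse_name)")


def add_domain(name):
    """Store the name together with its character-reversed form."""
    conn.execute("INSERT INTO domains VALUES (?, ?)", (name, name[::-1]))


def find_by_suffix_leading_wildcard(suffix):
    # Leading wildcard: the pattern starts with '%', so a plain index
    # on name cannot narrow the search.
    rows = conn.execute(
        "SELECT name FROM domains WHERE name LIKE ? ORDER BY name",
        ("%" + suffix,),
    )
    return [name for (name,) in rows]


def find_by_suffix_reversed(suffix):
    # Trailing wildcard on the reversed column: same rows, prefix pattern.
    # Note: '_' and '%' inside the suffix would need escaping in real code.
    rows = conn.execute(
        "SELECT name FROM domains WHERE reverse_name LIKE ? ORDER BY name",
        (suffix[::-1] + "%",),
    )
    return [name for (name,) in rows]


for domain in ("example.com.", "www.example.com.", "example.org."):
    add_domain(domain)
```

Both lookups return the same rows; only the shape of the pattern changes, which is what lets the index help.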
Kialltimsim: any chance you created some zones with a massive # of records/recordsets?18:06
timsimUnfortunately not. I didn't want to with that bug that I thought was a known issue :P Vinod and I talked about it, probably should have those discussions with everyone18:07
KiallThis the "mDNS makes 4.5 billion queries to serve an AXFR" issue?18:08
Kiall(I was hoping to experiment with that exact "bug" ;))18:08
timsimNah the records add/delete bit18:08
timsimI'm actually not sure what you're talking about on that one^18:10
mugsiemDNS does quite a few calls to get the data for an AXFR18:11
KiallYea.. Try adding 100 RRSets to a zone, then do an AXFR and watch mdns do 101 or so DB queries ;)18:12
Kiall102 actually I think..18:12
KiallEasy enough fix, just hasn't been done yet18:12
timsimOop. I haven't tried that.18:12
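The "100 RRSets, 101 or so queries" pattern Kiall mentions is the classic one-query-per-parent shape, and the usual fix is a single JOIN. The sketch below shows the difference in query counts with the standard-library sqlite3 module; the two-table schema, the `query` counting wrapper, and the function names are assumptions for illustration and do not reflect the actual mDNS code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recordsets (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE records (recordset_id INTEGER, data TEXT);
""")
for i in range(100):
    conn.execute("INSERT INTO recordsets VALUES (?, ?)", (i, "host%d" % i))
    conn.execute("INSERT INTO records VALUES (?, ?)", (i, "192.0.2.%d" % i))

stats = {"queries": 0}


def query(sql, params=()):
    """Run a SELECT and count it, so the two approaches can be compared."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


def fetch_per_recordset():
    # One query for the recordsets, then one more per recordset.
    out = []
    for rs_id, name in query("SELECT id, name FROM recordsets ORDER BY id"):
        for (data,) in query(
            "SELECT data FROM records WHERE recordset_id = ?", (rs_id,)
        ):
            out.append((name, data))
    return out


def fetch_joined():
    # A single JOIN returns the same rows in one round trip.
    return query(
        "SELECT rs.name, r.data FROM recordsets rs "
        "JOIN records r ON r.recordset_id = rs.id ORDER BY rs.id"
    )
```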
*** jmcbride has joined #openstack-dns18:14
timsimOur initial test was just Create a ton, add some recordsets. Test the end to end time from API->MiniDNS.18:15
timsimActually, API->Bind9 via MiniDNS18:15
Kiallhttp://paste.openstack.org/show/132904/ <-- Happy with that.. Pulling 5k zones down == 5 sec, that's a fairly reasonable time for a call of that size.. 100 zones taking .2 sec18:16
timsimYeah that isn't bad at all. Did you do anything to optimize that?18:18
KiallNope... Designate etc running in a VirtualBox VM.. MySQL running on desktop with SSD with innodb_buffer_pool_size = 12G / key_buffer              = 1024M18:20
Kiall(stock mysql.. no cluster / slaves etc)18:20
*** mwagner_lap has joined #openstack-dns18:20
timsimYeah. I haven't had any issue with those gets. If we're going to add the reverse_name column, that should improve filtering performance too.18:21
timsimKiall: So I'll go ahead and work on that change, unless you want to?18:23
KiallGo ahead.. :)18:23
eanderssonMorning. :)18:26
mugsieeandersson: hey18:27
timsimhey eandersson18:28
*** amcrn has joined #openstack-dns18:30
eanderssonI was busy trying to figure out the threading issue in central yesterday.18:30
Kialltimsim: found an issue with your MySQL tuning ;)18:30
timsimNot surpised :P what'cha got?18:30
timsimGod I can't speel today.18:31
eanderssonhehe18:31
Kiallhttp://paste.openstack.org/show/132905/ You had your innodb_buffer_pool_size set to 3GB, while that dataset really needs 4.5+GB18:31
mugsieeandersson: which threading issue? I *thought* we got all of them18:32
timsimAh, figures. Performance did suffer a bit when it got real big. The configurations I put in for everything were intended as a proof of concept before our new DevOp came in. We ended up trying to murder it :P18:33
eanderssonI can't figure out why _increment_domain_serial is causing a database deadlock for us.18:33
eanderssonhttps://github.com/openstack/designate/blob/stable/juno/designate/central/service.py#L29318:33
mugsieeh.... this is sounding familiar...18:34
eanderssonI wrote a big bug report with lots of logs before I left Europe, but launchpad decided to error out.18:35
eanderssonand never actually published it18:35
mugsie-_-18:35
mugsieah, launchpad18:35
eanderssonOperationalError: (OperationalError) (1205, 'Lock wait timeout exceeded; try restarting transaction') 'UPDATE domains SET updated_at=%s, serial=%s WHERE domains.id = %s AND domains.deleted = %s' (datetime.datetime(2014, 10, 29, 22, 2, 26, 129759), 1414620146, 'bcaea39f389d4336b60de0adcd4513a2', '0')18:36
eanderssonTRX HAS BEEN WAITING 38 SEC FOR THIS LOCK TO BE GRANTED:18:36
eanderssonAt first I thought it could have been related to my Sink changes, but I saw the same thing when simply using the Designate Client to create new records.18:36
*** paul_glass has joined #openstack-dns18:39
mugsieyeah, we found some DB indexing issues at scale recently, but I am not sure if they got back ported to juno18:39
eanderssonHow does threading work in central?18:39
eanderssonI think I added a stupid fix for now, by wrapping that call in a lock :P18:40
eanderssonI am going to set up a new dev environment today, so that I can troubleshoot it properly.18:41
*** GonZo2K has quit IRC18:41
*** paul_glass has quit IRC18:42
*** paul_glass has joined #openstack-dns18:43
Kialleandersson: by any chance did you save all that detail?18:45
KiallAlso, were you using percona cluster or galera etc? (I remember asking, but can't remember the answer)18:47
*** openstackgerrit has quit IRC18:49
*** openstackgerrit has joined #openstack-dns18:49
*** ChanServ sets mode: +v openstackgerrit18:49
eanderssonNah, we are pretty much using the same setup as with the devstack.18:50
eanderssonfor Designate18:50
eanderssonPowerDNS with a MySQL DB18:50
eanderssonI think the reason this hasn't been discovered is because it requires a script to start multiple instances at the same time18:51
eanderssonI discovered it when one of our Devs ran a script to start 7 servers at the same time.18:51
KiallWell, we've done testing around creating multiple records in a single zone at once (which should be the same I guess), did you manage to save the SHOW ENGINE INNODB STATUS; output?18:52
eanderssonI did, but lets see if I can find it.18:52
eanderssonI posted it on the pastebin.18:52
eanderssonI think I sent the link to you as a private message a few weeks ago.18:52
KiallI've no PM history with you, so must have been in the room. The room is logged, any chance you remember the date? ;)18:53
eanderssonhttp://paste.openstack.org/show/07TzeLYlhUbKeEeQPDOg/18:53
eanderssonI found some saved logs in my mailbox. :P18:54
eanderssonI can't remember unfortunately.18:54
eanderssonThat is from when I reproduced it on my dev machine.18:54
KiallHumm - That only seem to be showing 1 of the TX's .. The second TX it's conflicting with would be useful :(18:56
eanderssonI don't think I kept the logs on my laptop. I am in the US now, so I don't have access to my PC unfortunately.18:58
eanderssonhttp://paste.openstack.org/show/132907/18:58
eanderssonI don't think that contains what you need unfortunately.18:58
eanderssonI am working on setting up a new dev environment, so I can probably get you those logs for tomorrow.18:59
KiallJust dialling into a call, back in 3019:01
*** amcrn_ has joined #openstack-dns19:01
*** amcrn has quit IRC19:04
*** amcrn_ is now known as amcrn19:04
eanderssonok reproduced it19:17
eanderssonThis is how you re-produce the issue19:18
eanderssonhttp://paste.openstack.org/show/132921/19:18
*** openstackgerrit has quit IRC19:18
*** openstackgerrit has joined #openstack-dns19:18
*** ChanServ sets mode: +v openstackgerrit19:18
eanderssonBasically really shitty example. :P19:19
*** paul_glass1 has joined #openstack-dns19:22
*** paul_glass1 has quit IRC19:23
*** paul_glass has quit IRC19:23
Kialleandersson: just off the call, give me 5 and I'll see if I hit it too with that19:24
*** paul_glass has joined #openstack-dns19:26
Kialleandersson: my test DB is well.. HUGE.. right now.. so it's falling over for other reasons right now ;)19:39
eanderssonWe are seeing issues with both updating the domain and insert into recordset19:43
ekarlsoeandersson: what errors ?19:47
KiallI'm not sure if my dev env is broke, or if your script is reproducing something much worse than a deadlock.. I can't even connect to mysql right now ;)19:47
Kiallekarlso: lots of scrollback with detail19:47
eanderssonyep19:47
eanderssonthat is what I was seeing on my dev instance as well19:47
Kialleandersson: opening the mysql client is failing for you too?19:47
KiallThat should never happen really...19:47
eanderssonnot in this environment19:48
eanderssonbut it was eating all the resources on my dev environment19:48
eanderssonso I couldn't do anything related to the MySQL DB19:48
ekarlsodeadlock issues ? sorry it looks like TLDR19:49
timsimekarlso: Yep. In the Juno release.19:50
eanderssonekarlso: I am basically seeing my Designate DB completely locked up when multiple create record calls are made at the same time.19:50
KiallOkay - My env is broke.. a single insert seems to be failing19:51
eanderssonSome additional logs if it helps19:52
eanderssonhttp://paste.openstack.org/show/132925/19:52
eanderssonWe can only see one query though. So not sure what is locking.19:53
KiallInto a table with 3 records, an insert is taking me like 45 seconds! Let me fix this up, then I can re-test with out script :)19:54
Kiallyour*19:54
KiallINSERT into*19:55
Kiallcan't type at all today -_-19:55
eanderssonI am still so jetlagged :p19:56
eanderssonhate traveling19:56
KiallOh.. Can't read either it seems.. was right the first time..19:56
*** openstackgerrit has quit IRC20:04
*** openstackgerrit has joined #openstack-dns20:04
*** ChanServ sets mode: +v openstackgerrit20:05
Kialleandersson: and a rebuild has done the same. I'm really really stumped to be honest. I've not seen behaviour like this before20:06
eanderssonI think that it only happens once you go into a more production like environment.20:07
eanderssonSince it is usually only triggered by DevOps type scripts that create multiple instances.20:08
KiallWe've done lots and lots of testing for that kinda stuff20:08
Kialleandersson: think I have it..20:10
KiallSo, the default mysql driver is the C mysql driver.. it can't be monkeypatched by eventlet, so doesn't work async..20:11
eanderssonah wow20:11
Kiallwhen the first one opens a TX, eventlet moves over to another "greenthread", which tries to use the DB, and blocks.20:11
KiallThat's why we only see 1 query running20:11
Kiallbut 2 TX's open20:12
KiallFrom memory, oslo.db (the openstack db lib) works around this.. But.. Clearly it isn't20:12
ekarlsooslo.db bug or smth+20:13
Kiallekarlso: possibly...20:14
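The effect Kiall describes, a blocking call inside a cooperative scheduler stalling every other task in the process, can be shown with a small analogy. The sketch below uses the standard-library asyncio module as a stand-in for eventlet, and `time.sleep` as a stand-in for a C database driver that cannot yield; it is not Designate or oslo.db code, and the function names are invented for the example.

```python
import asyncio
import time


def run(blocking):
    """Run a fake DB call alongside another request and record the order."""
    events = []

    async def db_call():
        events.append("query start")
        if blocking:
            time.sleep(0.05)           # blocks the whole event loop
        else:
            await asyncio.sleep(0.05)  # yields to other tasks while waiting
        events.append("query end")

    async def other_request():
        events.append("other request ran")

    async def main():
        await asyncio.gather(db_call(), other_request())

    asyncio.run(main())
    return events
```

With `blocking=True` the other request cannot run until the query returns. If that other request were the one holding the row lock the query is waiting on, neither could make progress until the lock wait timed out, which matches the "two TXs open, one query running" picture above.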
KiallI've gotta run, but I've got enough info to debug this properly tomorrow.20:17
ekarlsohow to reproduce ?20:17
ekarlsoKiall: beer time ? :P20:18
eanderssonI posted a terrible example :P http://paste.openstack.org/show/132921/20:18
Kiallyep20:18
ekarlsoKiall: -,,-20:18
eanderssonIs there a way to limit central to one thread?20:18
timsimekarlso: http://paste.openstack.org/show/132921/20:18
ekarlsohmm, 1 worker u mean eandersson ?20:18
eanderssonYea20:18
ekarlsoworkers=120:18
eanderssonSo that this bug cant be triggered20:18
eanderssonah that simple20:18
eanderssonawesome20:19
ekarlsothink so20:19
eanderssonalthough that didnt work on the sink afaik20:19
eanderssonI'll try it out though20:19
timsimWorkers would go under service:central like this: https://github.com/openstack/designate/blob/master/etc/designate/designate.conf.sample#L12620:20
ekarlsoeandersson: tested on kilo btw?20:20
eanderssonYea. Actually my initial logs were from the master.20:20
eanderssontimsim: Unless I am wrong, the notifications/requests are still handled async, even with one worker.20:22
eanderssonAt least that is what I was seeing with the sink.20:22
timsimYeah you're right, I wasn't sure that'd actually work.20:23
eanderssonMy initial fix for this was to add a thread lock in the Sink. Which works.20:23
timsimThere might be a way to only let it do one request at a time.20:23
eanderssonbut the problem is that it wont prevent API requests from causing this.20:23
timsimYeah that's an issue for sure.20:24
eanderssonYea, that would be perfect. Performance is the least of my concerns for DNS.20:24
eandersson(at least on the management level)20:24
ekarlsowhat's causing the issue ?20:29
eanderssonYou mean in our environment?20:32
ekarlsojust trying to understand the issue :P20:33
eanderssonBasically whenever you try to create 3+ instances (DNS records) at the same time20:33
ekarlsoit locks up ?20:33
eanderssonyep20:33
eanderssonIt opens two TX's, but only one query20:33
ekarlsowonder why that happens :20:34
eandersson<Kiall> So, the default mysql driver is the C mysql driver.. it can't be monkeypatched by eventlet, so doesn't work async..20:34
eanderssonSounds like that is the cause.20:34
ekarlsoyeah, that seems to be hanging here too :|20:36
ekarlsois it only on rrset ?20:37
ekarlsoor domains too ?20:37
eanderssonI haven't tried tbh, but recordsets are much more likely, as it would be rare to create multiple domains that fast.20:37
ekarlsohmm, eandersson we had some deadlock issues earlier this year...20:39
ekarlsoeandersson: what versions have you tried ?20:42
eandersson2014.220:45
eanderssonand master (the week before the summit)20:45
eanderssonI did not see this in 2014.120:46
ekarlsobut also I see it's doing something stupid it shouldn't need to in master I think20:55
ekarlsoor nvm20:56
*** EricGonc_ has quit IRC21:09
*** EricGonczer_ has joined #openstack-dns21:11
*** openstackgerrit has quit IRC21:19
*** openstackgerrit has joined #openstack-dns21:19
*** ChanServ sets mode: +v openstackgerrit21:19
*** nkinder has quit IRC21:56
eanderssonSo anyone has a clue how to limit Central to one record at a time? :D22:07
timsimMaybe rpc_thread_pool_size=1 http://docs.openstack.org/developer/designate/configuration.html or change this to one https://github.com/openstack/designate/blob/b254e98c78f2cdd0f0f038b22fbed985bd7a4bc0/designate/service.py#L4122:12
timsimeandersson^ major hack suggestions, lol.22:12
timsimThat assumes one thread=one concurrent request. *shrugs*22:13
*** ryanpetrello has quit IRC22:16
eanderssonYea. It's pretty much what I was going for.22:19
eanderssonMy current solution was to wrap everything in a lock :P22:19
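A lock-based workaround of the kind eandersson mentions might look roughly like the sketch below. It is not the code from his deployment: the decorator name serialized, the module-level lock and the create_record stand-in are all hypothetical. The idea is simply that every decorated call has to hold one shared lock, so only one runs at a time.

```python
import functools
import threading
import time

_central_lock = threading.Lock()  # hypothetical process-wide lock


def serialized(func):
    """Allow only one decorated call to run at a time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with _central_lock:
            return func(*args, **kwargs)
    return wrapper


# Bookkeeping used only to observe how many calls overlap.
_active = 0
_max_active = 0
_counter_lock = threading.Lock()


@serialized
def create_record(name):
    """Stand-in for a record-creating call."""
    global _active, _max_active
    with _counter_lock:
        _active += 1
        _max_active = max(_max_active, _active)
    time.sleep(0.01)  # pretend to talk to the database
    with _counter_lock:
        _active -= 1
    return name


def run_concurrently(n=5):
    """Fire n calls from separate threads; return the peak overlap seen."""
    threads = [threading.Thread(target=create_record, args=("rec%d" % i,))
               for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return _max_active
```

As noted in the discussion, a lock like this only protects the code paths it wraps, and it cannot coordinate separate processes.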
*** GonZo2K has joined #openstack-dns22:19
timsimYeah, my guess is that Kiall will get a fix up tomorrow ;)22:22
openstackgerritOpenStack Proposal Bot proposed openstack/designate: Updated from global requirements  https://review.openstack.org/13438222:27
openstackgerritTim Simmons proposed openstack/designate: WIP: Add a reverse name column to the domains table  https://review.openstack.org/13438722:36
*** timsim has quit IRC22:40
*** paul_glass has quit IRC22:41
*** jmcbride has quit IRC22:49
*** boris-42 has joined #openstack-dns23:16
*** nkinder has joined #openstack-dns23:17
*** EricGonczer_ has quit IRC23:20
Kialleandersson / ekarlso: to limit central to 1 at a time, you can limit the thread pool size to 1... BUT.. Increasing the # of workers would also resolve (part of) the issue..  The part that's not resolved is, we have generally "ignored" the deadlock situation around the same zone being updated concurrently (since the serial # has to be updated in a single row) because.. Frankly, it virtually never happens. (2 years running HP Cloud DNS, we've seen23:27
Kiallabout 4 deadlocks). The fix for this issue, as far as designate is concerned anyway, is to retry operations on deadlocks - which is basically what the DB expects you to do. Beyond that, we're limited by eventlet + MySQL-Python.... Deployers can choose to either use PyMySQL (a pure python mysql implementation), or they can add `use_tpool = True` to the [database] config section... Either should resolve the issue.23:27
Kiallwow.. longer than I thought that message was23:27
Kiall(also - re "virtually never happens".. I mean has never really happened for our use case @ HP ;))23:28
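The "retry operations on deadlocks" approach can be sketched as a small decorator. This is a minimal illustration, not Designate's actual fix: DeadlockDetected is a hypothetical exception standing in for whatever the database layer raises, and retry_on_deadlock with its parameters is likewise made up for the example. oslo.db has its own retry helper, which this sketch deliberately does not use.

```python
import functools
import time


class DeadlockDetected(Exception):
    """Hypothetical stand-in for the DB layer's deadlock error."""


def retry_on_deadlock(max_retries=3, delay=0.0):
    """Re-run the wrapped operation when it fails with a deadlock."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockDetected:
                    if attempt == max_retries:
                        raise  # out of retries, surface the error
                    # Back off a little longer on each attempt.
                    time.sleep(delay * (attempt + 1))
        return wrapper
    return decorator


# Usage: an operation that deadlocks twice before succeeding.
calls = {"count": 0}


@retry_on_deadlock(max_retries=3)
def update_serial():
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockDetected()
    return "updated"
```

The retry has to wrap the whole transaction, not a single statement, since the database rolls the transaction back when it picks it as the deadlock victim.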
boris-42Kiall: ekarlso hi guys23:31
boris-42Kiall ekarlso here is the patch that adds non voting rally job to designate https://review.openstack.org/#/c/134392/23:31
boris-42Kiall: ekarlso I am going to make patch in designate as well=)23:32
ekarlsoboris-42: coolio23:33
ekarlsoKiall: stop drinking beer and fix the deadlock issue :P23:34
boris-42ekarlso: LOL23:34
Kiallboris-42: nice :) I was going to add the same job to Designate after I saw you had it on rally23:34
boris-42Kiall: the only thing is that we have some issues with designate in Rally23:34
Kiallekarlso: lol.. I'm Irish, when have you ever known an Irishman to not drink beer?23:34
boris-42Kiall:  for some reason python client stopped working23:34
ekarlsothat's my fault :P23:34
Kialllol.. ekarlso what did you do? -_-23:35
boris-42Kiall: oh I need some business trip to Ireland23:35
boris-42Kiall:  =)23:35
boris-42Kiall:  in russian only vodka only hardcore23:35
ekarlsoKiall: the whole constructing the client with a token + ep without a auth_url causes it to try discovery and goes bork no url thingie23:35
boris-42=)23:35
Kiallboris-42: Russian right?23:35
boris-42Kiall: lol23:35
boris-42Kiall:  actually no=)23:35
Kiallheh.. well, our vodka isn't as good ;)23:35
boris-42Kiall:  I just live here for 15 years=)23:35
ekarlsoKiall: u dont want competitors like boris-42, you'll be under the table :P23:36
boris-42ekarlso: lol23:36
boris-42ekarlso: ya something like that=)23:36
Kiallmugsie will take him ;)23:36
ekarlsoKiall: dont forget russians are fed vodka as 40% of their breast feeding -,,-23:36
ekarlsoat least :P23:36
KiallAnyway.. Happy to add the job non-voting, fix the issues, and go from there.. the rally-jobs template though.. I assume that has lots of different jobs?23:37
Kiallekarlso: too far BTW ;)23:37
ekarlsoKiall: yeah, that one was a little off :p23:37
ekarlsoanyways, i'll take a look at the rally + designateclient situation tmrw boris-4223:38
ekarlsoshould be a simple fix :)23:38
KiallAh.. only 1 is added to the layout, so not an issue if theres more than just designate jobs..23:38
boris-42Kiall: ?)23:38
boris-42Kiall: we can add any amount of jobs to any of projects23:38
boris-42Kiall:  i just don't see for now case23:39
boris-42Kiall:  to do that23:39
KiallYea, But I wouldn't want gate-rally-trove or similar running ;) But.. it or similar won't since it's an explicit job name in the zuul layout.. (again, assumes rally-jobs has more than gate-rally-{project} in it...)23:40
Kiall(in other words, I should have read more before commenting...)23:40
boris-42Kiall: ahhh23:41
boris-42Kiall:  so lemme try to explain23:41
KiallNo need, I just misread :)23:42
boris-42Kiall: we have a few different job templates23:42
boris-42Kiall:  (for designate, for zaqar, for neutron, and just for nova-network)23:42
boris-42Kiall: they just have differently configured devstack-gate to install required stuff23:42
boris-42Kiall: so in your case we are adding rally job that is running against openstack with designate)23:43
KiallYep, I see it now :) It originally looked to me like all the $project+rally jobs, but that was just misreading :)23:43
boris-42Kiall: so basically you can write plugins for Rally inside designate project source23:45
KiallI am curious though, about what you consider a fail for when it comes to making it voting... Gating on performance - when dealing with cloud instances from different providers like we do - seems unlikely. You only fail the job when the APIs etc explode?23:45
ekarlsoboris-42: an evil way of just "fixing" it for now is to pass auth_url down to it23:45
eanderssonKiall: So the reason we didn't see this in 2014.1 is probably because we had multiple nodes running.23:45
boris-42ekarlso: why evil?)23:45
ekarlsoboris-42: ideally though you'd have Rally support keystone sessions23:45
Kiallekarlso: or you fix the code ;)23:45
ekarlsoKiall: pffftm it's not my fault, it's ksclient :P23:46
KiallIf it worked a week ago.. It needs to work today.23:46
eanderssonWe were only running a single instance with one worker for testing purposes, before we went live.23:46
boris-42Kiall: about different VM from nodepool23:46
boris-42Kiall: first of all you can use bigger values in criteria of success23:46
boris-42Kiall:  like avg duration < 2 * AVG23:46
ekarlsoKiall: well I think the generic.Token stuff would work, obviously it doesn't work like before :(23:46
boris-42Kiall: that will catch terrible changes. But won't be super precise23:47
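The loose success criterion boris-42 suggests (average duration below twice a baseline average) reduces to a one-line check. The sketch below is plain Python, not Rally's own criteria syntax; the function name within_threshold and its parameters are assumptions made for illustration.

```python
from statistics import mean


def within_threshold(durations, baseline_avg, factor=2.0):
    """Pass when the average duration stays below factor * baseline."""
    return mean(durations) < factor * baseline_avg


# Usage: a 1.0 s baseline tolerates noisy runs but flags a large regression.
noisy_run = [1.0, 1.2, 1.4]  # seconds per iteration
regressed_run = [2.5, 3.0]
```

A wide factor keeps the check from failing on slow gate nodes, at the cost of missing smaller regressions.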
ekarlsoKiall: though I wonder if I could just "hack it" and just pass a v2.Token vs generic.Token since I guess the token payload is the same *goes to test*23:47
boris-42Kiall: one of the topics that we are going to work is normalization of Rally results23:47
Kiallekarlso: but it needs to, it's a bug if it doesn't sadly! The client is an API - rally, horizon (at least in HP), and $random people's code depend on that API.. If it's fundamentally broken and unfixable, we need to back out the recent changes.23:47
boris-42Kiall: soo you'll multiply all values by some magic number and it will normalize results23:48
boris-42Kiall: but what I would like to say regression testing is just small part of what you can do with rally in gates..23:49
Kiallboris-42: Okay, makes sense.. +/-50% is an unlikely to occur thing on a single change purely due to the speed of the slave.. :)23:49
ekarlsoKiall: call me doh, I think i've used the wrong kind of cred -,,-23:49
boris-42Kiall: like somebody is saying that his patch improves performance => you can easily check that in gates. by running N times recheck23:49
boris-42Kiall: ya dsvm is run against really different VMs but we will find the way to normalize that=)23:50
Kiallboris-42: absolutely! Having the perf info on hand is the main reason I'd want the rally gate, i.e. not necessarily as something which -1's except in extreme cases.23:50
boris-42Kiall: so as well you'll get profiling in gates23:50
Kiallalso.. Having another client test out code means ekarlso's compatibility breaks won't happen as often ;)23:50
boris-42Kiall: when we finish some very last stuff in rally and add it to designate)23:51
boris-42Kiall: cause getting such traces under load http://boris-42.github.io/ngk.html makes totally sense23:51
ekarlsoKiall: sshhhh23:51
ekarlsosomeone has to do clientwork ! :P23:51
boris-42Kiall: e.g. if person is saying that he is fixing some specific part of code, you can just measure it separatly =)23:52
boris-42Kiall: and then you don't care about absolute values of whole iteration duration=)23:52
Kiallboris-42: http://boris-42.github.io/ngk.html, that looks like profiling output? tied to os-profiler?23:52
boris-42Kiall: it's osprofiler output23:52
boris-42Kiall: booting a VM from nova cli23:53
* Kiall loves that kinda info.. The more we have, the better :)23:53
boris-42Kiall: yaa23:53
boris-42Kiall: when I started rally it was the goal to automate profiling + load and reports and gates=)23:54
boris-42Kiall:  so in Kilo it will happen lol=)23:54
Kiallboris-42: I bet when you started you didn't know just how much of a performance difference there was between rackspace and hpcloud, the two providers of the gate ;) (Let's not get into the details of which one is worse :P)23:55
boris-42Kiall: I know that ...23:55
boris-42knew*23:55
Kiall:)23:55
boris-42Kiall: but it's just one of millions task that have to be resolved23:55
ekarlsoKiall: or dreamhost for that matter :p23:55
Kiallekarlso: none of the gate runs on dreamhost ;)23:56
Kiall(Unless that changed recently)23:56
boris-42Kiall: btw we are going to present some basic historical performance data for all projects23:56
ekarlso:P23:56
boris-42Kiall: like we will run an isolated job on the same hardware23:56
boris-42Kiall: every day from master deployed openstack23:56
boris-42Kiall: finally I got servers lol23:56
Kiallboris-42: Thats actually really interesting - Having identical hardware configs for N years of perf testing gives everyone a really good baseline vs .. well.. cloud.23:57
boris-42Kiall: yep but as well is just 1 small case=)23:57
Kiall1 consistent case is better than zero, or 100s of inconsistent ;)23:58
boris-42Kiall: btw did you try to run Rally locally?)23:58
boris-42Kiall: so we already have a lot =)23:58
ekarlsoKiall: crap23:58
KiallYep, ekarlso wrote the designate pieces and we started using it a bit.. We're still a little stuck on JMeter though due to .. well.. it works..23:59
ekarlsoKiall: so when using token_endpoint.Token ie the thing that does no discovery and all that crap it doesn't even do discovery to discover which version endpoint of designate it should use :p23:59
