Tuesday, 2016-01-12

*** jhesketh has quit IRC02:05
*** jhesketh has joined #openstack-solar02:05
*** dshulyak_ has joined #openstack-solar07:09
*** salmon_ has joined #openstack-solar07:54
*** openstackgerrit has quit IRC09:17
*** openstackgerrit has joined #openstack-solar09:17
openstackgerritDmitry Shulyak proposed openstack/solar: Implement db based lock mechanism for orchestration  https://review.openstack.org/25912709:22
openstackgerritJedrzej Nowak proposed openstack/solar: Use stevedore for handlers  https://review.openstack.org/26625509:26
pigmejdshulyak_: your patch looks nice, checking it now :)09:32
dshulyak_i broke tests a bit09:32
pigmej;D09:33
pigmejsomething with config09:33
pigmejdshulyak_: I added some comments :)09:40
*** tzn has joined #openstack-solar09:49
openstackgerritDmitry Shulyak proposed openstack/solar: Implement db based lock mechanism for orchestration  https://review.openstack.org/25912710:00
openstackgerritDmitry Shulyak proposed openstack/solar: Implement db based lock mechanism for orchestration  https://review.openstack.org/25912710:11
pigmejdshulyak_: what is correct way to test this PR ?10:26
dshulyak_pigmej: you will need new container, or stop existing and use - celery worker -A solar.orchestration.runner -P gevent -c 1000 -Q system_log,celery,scheduler10:28
dshulyak_before that select db backend10:28
pigmejyeah, I'm going to check all :)10:29
dshulyak_i described new config options for riak ensemble10:29
pigmejyup in commit msg :)10:29
dshulyak_basically i used your riak but with strong_consistency = on10:30
dshulyak_and then manually created bucket type from riak doc10:30
pigmejk10:33
pigmejhmm, I have a feeling that reset / restart is not working properly10:39
pigmejdshulyak_: something is not right10:41
pigmejit hangs sometimes10:41
pigmej+ I have some other errors10:41
dshulyak_pigmej: what kind of errors?10:42
pigmejI posted them10:43
pigmejbut10:43
pigmejhttps://bpaste.net/show/5bac6ba16e5f10:43
pigmejthese10:43
pigmejand because of them, sometimes everything just stops10:43
dshulyak_i haven't seen these errors10:44
dshulyak_what backend?10:45
pigmejn_val=110:46
pigmejI wonder how the heck set could change size10:47
pigmejI mean I know how but why :) in session_end....10:47
dshulyak_this is riak example?10:48
dshulyak_or ost?10:48
pigmejhosts10:50
pigmejto me it looks like celery fuckup10:50
pigmejdshulyak_: it happens time to time, not always10:50
dshulyak_tried to rerun hosts for couple of times - wasnt able to catch it10:57
pigmejehs10:57
dshulyak_this part shouldnt be affected by locking, maybe by gevent10:57
pigmejnot even10:57
pigmejit should be linear10:58
pigmejthis: `RuntimeError: Set changed size during iteration` sounds bad especially if called from session_end10:58
dshulyak_why it should be linear?10:59
pigmejbecause session_end11:00
pigmejit should be last call during "request"11:00
dshulyak_but session_end can be called in two different gevent threads11:01
dshulyak_it should be related to gevent11:03
dshulyak_there is 2 changes in this patch - one adds locks for scheduling and another changes celery backend for scheduler11:03
dshulyak_locks shouldnt affect this in anyway11:04
dshulyak_pigmej: are u sure that u didn't see these errors before?11:07
pigmejyep11:07
dshulyak_pigmej: can you please change number of threads to 1 and try to reproduce it?11:16
*** openstackgerrit has quit IRC11:17
*** openstackgerrit has joined #openstack-solar11:17
dshulyak_the only idea that i have is that lazy_save was somehow changed from another thread, it explains both of these exceptions11:19
pigmejdshulyak_: back :)11:25
pigmejso, lazy_save *could* be called from other thread11:25
pigmejhttps://bpaste.net/show/6b253b188e6a11:25
pigmejwhich means that it was called from session_end11:25
pigmejand session_end *should* be latest thingy11:26
pigmejso I wonder wtf11:26
dshulyak_what do you mean by latest?11:26
pigmejsession_end should be called only once11:26
pigmejat the end of "request"11:26
pigmejsave_all_lazy has no lock, intentionally11:28
dshulyak_are you sure that lock in __get__ ClassCache is correct? it seems it wont protect access to lazy_save set11:30
pigmejwell, nothing should access this set11:30
pigmejI mean, only object can add itself to lazy_save set11:31
dshulyak_if i understood correctly - same set can be iterated in two threads, A and B, and if in thread A it is in the middle of iteration, and B we will call cls._c.lazy_save.clear()11:32
pigmejdshulyak_: kinda11:33
pigmejBUT lazy_save is per greenlet11:33
pigmejeach main greenlet has its own lazy_save11:33
pigmej(but children of it can also have the same lazy_save set)11:33
pigmejdshulyak_: so yeah, this is how it works (as you described) BUT, it shouldn't be in session_end11:34
pigmejbecause session_end => it should be last call, nothing more should happen, nothing more should be in progress11:34
pigmejmaybe we leak greenlets...11:34
pigmejand main greenlet finishes, without waiting for children11:34
pigmejthen child executes save_all_lazy and session_end save crashes then11:35
pigmejbut still then we have second error11:35
pigmejhttps://bpaste.net/show/ca2896571aa311:35
pigmejdshulyak_: I created special "local" object for gevent, so all greenlets created from main greenlet share the same cache object11:37
pigmejif *any* of these children lives longer than the main greenlet we're in trouble11:38
pigmejmaybe that's the case11:39
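
The `RuntimeError: Set changed size during iteration` discussed above can be reproduced without gevent or Solar. The following is a minimal sketch in plain Python, assuming that some other code path adds to the same lazy_save set while it is being iterated. The names `save_all_lazy_unsafe`, `save_all_lazy_snapshot` and the `on_save` callback are hypothetical and are not Solar's actual code.

```python
def save_all_lazy_unsafe(lazy_save, on_save):
    """Iterate the live set; any add/remove during the loop breaks it."""
    for obj in lazy_save:
        on_save(obj)
    lazy_save.clear()


def save_all_lazy_snapshot(lazy_save, on_save):
    """Iterate a copy, so additions made meanwhile cannot break the loop."""
    saved = []
    for obj in list(lazy_save):
        on_save(obj)
        saved.append(obj)
    # Remove only what was saved; objects added meanwhile stay queued.
    lazy_save.difference_update(saved)
    return saved


# Stand-in for a second greenlet that adds to the shared set mid-iteration.
shared = {"a", "b"}
try:
    save_all_lazy_unsafe(shared, lambda obj: shared.add(obj + "_late"))
    unsafe_error = None
except RuntimeError as exc:
    unsafe_error = str(exc)

shared = {"a", "b"}
saved = save_all_lazy_snapshot(shared, lambda obj: shared.add(obj + "_late"))
```

Iterating over a snapshot only hides the symptom. If a child greenlet really outlives its main greenlet, as suspected in the discussion, the shared set is still being touched after session_end.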
openstackgerritDmitry Shulyak proposed openstack/solar: init script for solar-celery  https://review.openstack.org/26632112:13
openstackgerritDmitry Shulyak proposed openstack/solar: Implement db based lock mechanism for orchestration  https://review.openstack.org/25912712:15
pigmejdshulyak_: you're reverting gevent for celery?12:17
dshulyak_yes12:23
dshulyak_pigmej: you mean every childs shares same object with parent?12:23
pigmejnot every12:24
pigmejjust these created in special way12:24
dshulyak_i didnt get it, what if created in a special way?12:25
pigmejthese https://github.com/openstack/solar/blob/master/solar/dblayer/gevent_helpers.py#L2412:25
pigmejbasically every child created there in this pool, will share cache with its parent12:25
pigmejso multiget reuse cache12:25
pigmej;)12:25
pigmejwith threads => every thread has its own separate cache12:26
pigmejwith gevent by default also, but you can create in "special way" to share cache between them12:26
dshulyak_ok, now it is clear, this is to solve that problem with solar_map12:26
pigmejkinda, it's to make solar_map more efficient :)12:26
pigmejbut yeah :)12:26
pigmejwith threads there is no way that something is reused, with gevent there is a chance, I will debug this, maybe somewhere I made a mistake...12:27
dshulyak_        while 1:12:30
dshulyak_            tmp_c = getattr(c, '_nested_parent', c.parent)12:30
dshulyak_            if not tmp_c:12:30
dshulyak_                return c12:30
dshulyak_            c = tmp_c12:30
dshulyak_oh12:30
dshulyak_i thought it will print in in a better way12:30
pigmej:D12:30
pigmejwelcome to IRC :)12:30
pigmejdshulyak_: what about that while?12:31
dshulyak_sec i want to check original impl12:31
pigmejof gevent local?12:32
dshulyak_hm12:32
dshulyak_you will always use atleast 1st parent12:32
dshulyak_https://github.com/gevent/gevent/blob/master/gevent/local.py#L160-L16112:32
pigmejsure12:33
pigmejthat's intentional12:33
dshulyak_maybe thats the problem?12:33
pigmej?12:33
pigmejthat _nested_parent is assidned in pool.spawn12:33
pigmejassigned12:33
dshulyak_but you have default c.parent12:33
dshulyak_in getattr12:33
pigmejyup12:33
dshulyak_so…12:34
pigmejah, hmm, maybe for celery it's problem12:34
dshulyak_thats the difference with gevent implementation12:34
pigmejsure but it's intentional in all cases except celery12:34
pigmejit seems then that celery spawns worker greenlets from another greenlet12:35
dshulyak_which cases? solar_map?12:35
dshulyak_it looks like gevent local will be always shared, doesnt matter if _nested_parent is set or not12:35
pigmejno, it certainly works fine, you can check it :)12:36
pigmejBUT I probably haven't checked case like celery12:36
pigmejwhere they have their own nested greenlets12:36
pigmejso yeah that may be bad implementation for this case as you noticed12:37
pigmejyeah that may be it12:39
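
The effect of the `c.parent` default in the pasted loop can be seen with plain objects. The following is a minimal sketch, assuming a greenlet-like object with a `parent` attribute and an optional `_nested_parent` attribute. `FakeGreenlet` and both function names are hypothetical stand-ins, and the second variant is only one possible way to make sharing opt-in, not necessarily what the actual fix does.

```python
class FakeGreenlet:
    """Hypothetical stand-in for a greenlet: only `parent` matters here."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def cache_owner_parent_default(c):
    """Same loop as pasted above: falls back to c.parent at every step."""
    while True:
        tmp_c = getattr(c, "_nested_parent", c.parent)
        if not tmp_c:
            return c
        c = tmp_c


def cache_owner_explicit_only(c):
    """Variant that follows only explicitly assigned _nested_parent links."""
    while True:
        tmp_c = getattr(c, "_nested_parent", None)
        if not tmp_c:
            return c
        c = tmp_c


# Two unrelated workers spawned from the same root (as a worker pool might).
hub = FakeGreenlet("hub")
worker_a = FakeGreenlet("worker_a", parent=hub)
worker_b = FakeGreenlet("worker_b", parent=hub)

# A child that was spawned "in the special way" from worker_a.
child = FakeGreenlet("child", parent=hub)
child._nested_parent = worker_a

shared_a = cache_owner_parent_default(worker_a).name
shared_b = cache_owner_parent_default(worker_b).name
own_a = cache_owner_explicit_only(worker_a).name
own_b = cache_owner_explicit_only(worker_b).name
own_child = cache_owner_explicit_only(child).name
```

With the `c.parent` default, both workers resolve to the root, so they end up with the same cache. With the explicit-only lookup each worker owns its cache, and only the specially spawned child resolves to its nested parent.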
pigmejanyway: KeyError: 'lock_bucket_type' what about that ?12:40
dshulyak_pigmej: do you have it in config?12:44
pigmejyup12:44
pigmejI'm checking your PR12:44
pigmejand I wanted to run celery with ansible12:45
pigmej`ansible-playbook -v -i "localhost," -c local /vagrant/bootstrap/playbooks/celery.yaml --skip-tags install`12:45
pigmejah12:49
pigmejin that config class requires all variables set in all config files .solar_config_override12:49
pigmejyeah we will need to adjust this _setter for that case :D12:55
dshulyak_pigmej: take a look please - http://paste.openstack.org/show/483565/, is it correct test?13:06
dshulyak_it is not obviously == [], i just wanted to print result13:06
pigmejdshulyak_: you need to patch get_local13:08
pigmejbut hmm, maybe it's correct order too13:08
pigmejI will check it after meeting :)13:08
dshulyak_omg, this irc prints messages in random order13:09
openstackgerritLukasz Oles proposed openstack/solar: Add worpdress example from tutorial  https://review.openstack.org/26636013:14
dshulyak_pigmej: please checkout that test, for me that data is shared, but maybe i messed up smth13:59
pigmejbtw the first if is always invalid ?14:02
pigmejisn't it?14:02
pigmejI mean false14:03
dshulyak_if Model._local.test_id == 11: ? wo this if there will be AttributeError14:03
pigmejright14:04
* pigmej stupid...14:04
pigmejyeah it's shared14:04
pigmejBUT I'm starting to wonder if it's good or bad... ;d14:06
tznplanned stuff for 0.1.0 release https://etherpad.openstack.org/p/solar-release-0.1.014:10
tznplease add your comments14:11
pigmejdone :)14:12
pigmejdshulyak_: hmm, your test is a bit unfortunate14:17
pigmejit should be list of single item sets, isn't it?14:18
dshulyak_pigmej: yes, == [] is not correct, i just wanted to print out all items14:18
pigmejah ok14:19
pigmejI ran it without pytest14:19
openstackgerritJedrzej Nowak proposed openstack/solar: Fix gevent cache, by default it shouldn't be shared  https://review.openstack.org/26639514:24
pigmejdshulyak_: ^14:24
dshulyak_it won't break other cases?14:25
pigmeji shouldn't14:27
pigmejit*14:27
pigmejit should match thread approach now14:27
pigmejI have no idea why I put this c.parent there...14:27
pigmejI probably assumed that first parent will be always empty14:27
pigmejdshulyak_: check it if it works for you14:36
pigmejdshulyak_: I executed example 100 times never crashe14:44
pigmejcrashed14:44
dshulyak_pigmej: i will push patch for celery sqlite and check it with gevent worker14:46
pigmej;D14:46
pigmejdoes this sqlite works 'good enough' ?14:47
dshulyak_pigmej: it fails with database is locked14:54
pigmejhmm14:54
pigmejthat's knows sqlite error14:54
pigmejknown14:54
dshulyak_i changed celery to operate with its own db, and now its fine14:56
pigmejyeah14:56
pigmejI wanted to suggest it14:56
dshulyak_at first i shared solar.db with celery14:56
pigmejbecause sqlite likes to lock itself14:56
pigmejhttps://www.sqlite.org/lockingv3.html14:56
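
The "database is locked" failure and the effect of giving celery its own database file can be sketched with the standard library alone. This is a minimal illustration, assuming one connection holds SQLite's write lock while a second one tries to take it without waiting. `try_write_lock`, the table `t` and the file name `celery.db` are hypothetical, and it does not reproduce how celery itself talks to SQLite.

```python
import os
import sqlite3
import tempfile


def try_write_lock(path):
    """Try to take SQLite's write lock on `path` without waiting."""
    conn = sqlite3.connect(path, timeout=0, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")
        conn.execute("ROLLBACK")
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    solar_db = os.path.join(tmp, "solar.db")
    celery_db = os.path.join(tmp, "celery.db")  # hypothetical file name

    # One writer holds the write lock on the shared file.
    writer = sqlite3.connect(solar_db, isolation_level=None)
    writer.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    writer.execute("BEGIN IMMEDIATE")

    shared_result = try_write_lock(solar_db)     # same file as the writer
    separate_result = try_write_lock(celery_db)  # its own file

    writer.execute("ROLLBACK")
    writer.close()
```

SQLite allows only one writer per database file at a time, so two independent writers on separate files do not block each other.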
dshulyak_pushed changes, i will modify bootstrap with separate patch15:01
pigmejk I will check it:)15:01
pigmejdshulyak_: did it succeed ? Because still no PR there15:02
pigmejok found it15:04
dshulyak_pigmej: :( idk what is the problem, i added you to patch, gevent fix works for me, i will restore my gevent version, and i need to fix config files15:22
pigmej?15:24
pigmejI don't get your last message dshulyak_15:24
dshulyak_usually there is notification about pushed patch15:26
pigmejah this yeah :)15:28
pigmejopenstackgerrit: you alive ?15:29
pigmejdshulyak_: Yeah I tested  my patch on your gevent version15:29
pigmejit worked for me too15:29
pigmejsalmon_: https://review.openstack.org/#/c/266395/ workflow please there :)15:37
pigmejor dshulyak_15:38
pigmejdshulyak_: hmm, you rebased something ?15:41
openstackgerritDmitry Shulyak proposed openstack/solar: Change celery config to use sqlite backend messaging and results  https://review.openstack.org/26641815:41
pigmejbecause I see in your PR my changes  ;d15:41
dshulyak_in which one?15:41
pigmejin dockerfile I see my "torrent ports" for example15:42
pigmejnot that I care about ownership or sth but I wonder wf15:42
pigmejwtf15:42
dshulyak_https://review.openstack.org/#/c/259127/7/docker-compose.yml ?15:42
pigmejyeah15:43
pigmejhmm15:43
* pigmej wonders15:43
dshulyak_but they are not green15:43
pigmejthey were :D15:43
openstackgerritMerged openstack/solar: Fix gevent cache, by default it shouldn't be shared  https://review.openstack.org/26639515:43
pigmejsalmon_: I see that you prefer to work not talk ?:D15:43
salmon_pigmej: ?15:44
pigmejjoking ;P15:44
pigmejdshulyak_: good job with that locks, seems ok :)15:47
pigmejthough you have one riak = too much :)15:49
pigmejbut everything works fine :)15:49
*** dshulyak_ has quit IRC15:53
*** dshulyak_ has joined #openstack-solar15:54
*** dshulyak_ has quit IRC15:55
*** tzn has quit IRC16:04
*** tzn has joined #openstack-solar16:07
openstackgerritMerged openstack/solar: Set vagrant as a owner of /var/lib/solar/repositories  https://review.openstack.org/26634416:29
openstackgerritMerged openstack/solar: Change celery config to use sqlite backend messaging and results  https://review.openstack.org/26641816:29
openstackgerritJedrzej Nowak proposed openstack/solar: Added possibility to change computable input func  https://review.openstack.org/26647516:50
*** dshulyak_ has joined #openstack-solar17:48
*** dshulyak_ has quit IRC17:51
*** tzn has quit IRC18:34
*** tzn has joined #openstack-solar18:59
*** dshulyak_ has joined #openstack-solar20:15
*** dshulyak_ has quit IRC20:27
*** _tzn has joined #openstack-solar20:31
*** tzn has quit IRC20:34
*** salmon_ has quit IRC20:48
*** salmon_ has joined #openstack-solar20:57
*** _tzn has quit IRC22:31
*** tzn has joined #openstack-solar23:17
*** salmon_ has quit IRC23:46
