*** jhesketh has quit IRC | 02:05 | |
*** jhesketh has joined #openstack-solar | 02:05 | |
*** dshulyak_ has joined #openstack-solar | 07:09 | |
*** salmon_ has joined #openstack-solar | 07:54 | |
*** openstackgerrit has quit IRC | 09:17 | |
*** openstackgerrit has joined #openstack-solar | 09:17 | |
openstackgerrit | Dmitry Shulyak proposed openstack/solar: Implement db based lock mechanism for orchestration https://review.openstack.org/259127 | 09:22 |
openstackgerrit | Jedrzej Nowak proposed openstack/solar: Use stevedore for handlers https://review.openstack.org/266255 | 09:26 |
pigmej | dshulyak_: your patch looks nice, checking it now :) | 09:32 |
dshulyak_ | i broke tests a bit | 09:32 |
pigmej | ;D | 09:33 |
pigmej | something with config | 09:33 |
pigmej | dshulyak_: I added some comments :) | 09:40 |
*** tzn has joined #openstack-solar | 09:49 | |
openstackgerrit | Dmitry Shulyak proposed openstack/solar: Implement db based lock mechanism for orchestration https://review.openstack.org/259127 | 10:00 |
openstackgerrit | Dmitry Shulyak proposed openstack/solar: Implement db based lock mechanism for orchestration https://review.openstack.org/259127 | 10:11 |
pigmej | dshulyak_: what is correct way to test this PR ? | 10:26 |
dshulyak_ | pigmej: you will need a new container, or stop the existing one and use `celery worker -A solar.orchestration.runner -P gevent -c 1000 -Q system_log,celery,scheduler` | 10:28 |
dshulyak_ | before that select db backend | 10:28 |
pigmej | yeah, I'm going to check all :) | 10:29 |
dshulyak_ | i described new config options for riak ensemble | 10:29 |
pigmej | yup in commit msg :) | 10:29 |
dshulyak_ | basically i used your riak but with strong_consistency = on | 10:30 |
dshulyak_ | and then manually created bucket type from riak doc | 10:30 |
pigmej | k | 10:33 |
pigmej | hmm, I have a feeling that reset / restart is not working properly | 10:39 |
pigmej | dshulyak_: something is not right | 10:41 |
pigmej | it hangs sometimes | 10:41 |
pigmej | + I have some other errors | 10:41 |
dshulyak_ | pigmej: what kind of errors? | 10:42 |
pigmej | I posted them | 10:43 |
pigmej | but | 10:43 |
pigmej | https://bpaste.net/show/5bac6ba16e5f | 10:43 |
pigmej | these | 10:43 |
pigmej | and because of them, sometimes everything just stops | 10:43 |
dshulyak_ | i haven't seen these errors | 10:44 |
dshulyak_ | what backend? | 10:45 |
pigmej | n_val=1 | 10:46 |
pigmej | I wonder how the heck set could change size | 10:47 |
pigmej | I mean I know how but why :) in session_end.... | 10:47 |
dshulyak_ | this is riak example? | 10:48 |
dshulyak_ | or hosts? | 10:48 |
pigmej | hosts | 10:50 |
pigmej | to me it looks like celery fuckup | 10:50 |
pigmej | dshulyak_: it happens time to time, not always | 10:50 |
dshulyak_ | tried to rerun hosts a couple of times - wasn't able to catch it | 10:57 |
pigmej | ehs | 10:57 |
dshulyak_ | this part shouldnt be affected by locking, maybe by gevent | 10:57 |
pigmej | not even | 10:57 |
pigmej | it should be linear | 10:58 |
pigmej | this: `RuntimeError: Set changed size during iteration` sounds bad especially if called from session_end | 10:58 |
dshulyak_ | why it should be linear? | 10:59 |
pigmej | because session_end | 11:00 |
pigmej | it should be last call during "request" | 11:00 |
dshulyak_ | but session_end can be called in two different gevent threads | 11:01 |
dshulyak_ | it should be related to gevent | 11:03 |
dshulyak_ | there is 2 changes in this patch - one adds locks for scheduling and another changes celery backend for scheduler | 11:03 |
dshulyak_ | locks shouldnt affect this in anyway | 11:04 |
dshulyak_ | pigmej: are you sure that you didn't see these errors before? | 11:07 |
pigmej | yep | 11:07 |
dshulyak_ | pigmej: can you please change number of threads to 1 and try to reproduce it? | 11:16 |
*** openstackgerrit has quit IRC | 11:17 | |
*** openstackgerrit has joined #openstack-solar | 11:17 | |
dshulyak_ | the only idea that i have is that lazy_save was somehow changed from another thread, it explains both of these exceptions | 11:19 |
pigmej | dshulyak_: back :) | 11:25 |
pigmej | so, lazy_save *could* be called from other thread | 11:25 |
pigmej | https://bpaste.net/show/6b253b188e6a | 11:25 |
pigmej | which means that it was called from session_end | 11:25 |
pigmej | and session_end *should* be latest thingy | 11:26 |
pigmej | so I wonder wtf | 11:26 |
dshulyak_ | what do you mean by latest? | 11:26 |
pigmej | session_end should be called only once | 11:26 |
pigmej | at the end of "request" | 11:26 |
pigmej | save_all_lazy intentionally has no lock | 11:28 |
dshulyak_ | are you sure that lock in __get__ ClassCache is correct? it seems it wont protect access to lazy_save set | 11:30 |
pigmej | well, nothing should access this set | 11:30 |
pigmej | I mean, only object can add itself to lazy_save set | 11:31 |
dshulyak_ | if i understood correctly - same set can be iterated in two threads, A and B, and if in thread A it is in the middle of iteration, and B we will call cls._c.lazy_save.clear() | 11:32 |
pigmej | dshulyak_: kinda | 11:33 |
pigmej | BUT lazy_save is per greenlet | 11:33 |
pigmej | each main greenlet has its own lazy_save | 11:33 |
pigmej | (but children of it can also have the same lazy_save set) | 11:33 |
pigmej | dshulyak_: so yeah, this is how it works (as you described) BUT, it shouldn't be in session_end | 11:34 |
pigmej | because session_end => it should be last call, nothing more should happen, nothing more should be in progress | 11:34 |
pigmej | maybe we leak greenlets... | 11:34 |
pigmej | and the main greenlet finishes, without waiting for its children | 11:34 |
pigmej | then a child executes save_all_lazy and the session_end save crashes | 11:35 |
pigmej | but still then we have second error | 11:35 |
pigmej | https://bpaste.net/show/ca2896571aa3 | 11:35 |
pigmej | dshulyak_: I created special "local" object for gevent, so all greenlets created from main greenlet share the same cache object | 11:37 |
pigmej | if *any* of these children lives longer than the main greenlet we're in trouble | 11:38 |
pigmej | maybe that's the case | 11:39 |
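The failure mode discussed above (one greenlet iterating the shared lazy_save set while another clears it) can be reproduced without gevent. This is a minimal sketch, not Solar's code: the function names and the set contents are made up for illustration, and the "other greenlet" is simulated by mutating the set inside the loop body.

```python
def iterate_while_clearing():
    """Mutate a set in the middle of iterating it, as a second greenlet might."""
    lazy_save = {"a", "b", "c"}  # hypothetical stand-in for the shared lazy_save set
    try:
        for _item in lazy_save:
            # Simulates another greenlet calling lazy_save.clear() mid-iteration.
            lazy_save.clear()
    except RuntimeError as exc:
        return str(exc)
    return "no error"


def iterate_over_snapshot():
    """Iterate over a copy, so changes to the original set cannot break the loop."""
    lazy_save = {"a", "b", "c"}
    seen = []
    for item in list(lazy_save):  # snapshot taken before any mutation
        seen.append(item)
        lazy_save.discard(item)
    return sorted(seen)


print(iterate_while_clearing())
print(iterate_over_snapshot())
```

Iterating over a snapshot only hides the symptom; if a child greenlet really outlives the main one, both would still be saving the same objects.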
openstackgerrit | Dmitry Shulyak proposed openstack/solar: init script for solar-celery https://review.openstack.org/266321 | 12:13 |
openstackgerrit | Dmitry Shulyak proposed openstack/solar: Implement db based lock mechanism for orchestration https://review.openstack.org/259127 | 12:15 |
pigmej | dshulyak_: you're reverting gevent for celery? | 12:17 |
dshulyak_ | yes | 12:23 |
dshulyak_ | pigmej: you mean every child shares the same object with its parent? | 12:23 |
pigmej | not every | 12:24 |
pigmej | just these created in special way | 12:24 |
dshulyak_ | i didn't get it, what is created in a special way? | 12:25 |
pigmej | these https://github.com/openstack/solar/blob/master/solar/dblayer/gevent_helpers.py#L24 | 12:25 |
pigmej | basically every child created there in this pool will share the cache with its parent | 12:25 |
pigmej | so multiget reuses the cache | 12:25 |
pigmej | ;) | 12:25 |
pigmej | with threads => every thread has its own separate cache | 12:26 |
pigmej | with gevent by default also, but you can create greenlets in a "special way" to share the cache between them | 12:26 |
dshulyak_ | ok, now it is clear, this is to solve that problem with solar_map | 12:26 |
pigmej | kinda, it's to make solar_map more efficient :) | 12:26 |
pigmej | but yeah :) | 12:26 |
pigmej | with threads there is no way that something is reused, with gevent there is a chance, I will debug this, maybe I made a mistake somewhere... | 12:27 |
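The per-thread behaviour described above can be illustrated with the standard library's `threading.local`. This is only a sketch of the general idea, not Solar's cache: `get_lazy_save`, `worker` and `run_workers` are hypothetical names.

```python
import threading

_local = threading.local()  # each thread sees its own attributes on this object


def get_lazy_save():
    """Return the calling thread's private set, creating it on first use."""
    if not hasattr(_local, "lazy_save"):
        _local.lazy_save = set()
    return _local.lazy_save


def worker(name, results):
    """Add one item to this thread's set and record what the thread sees."""
    get_lazy_save().add(name)
    results[name] = set(get_lazy_save())


def run_workers():
    results = {}
    threads = [
        threading.Thread(target=worker, args=(name, results))
        for name in ("t1", "t2")
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


print(run_workers())
```

Each thread only ever sees the item it added itself, which is the isolation the gevent variant is expected to match unless sharing is requested explicitly.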
dshulyak_ | while 1: | 12:30 |
dshulyak_ | tmp_c = getattr(c, '_nested_parent', c.parent) | 12:30 |
dshulyak_ | if not tmp_c: | 12:30 |
dshulyak_ | return c | 12:30 |
dshulyak_ | c = tmp_c | 12:30 |
dshulyak_ | oh | 12:30 |
dshulyak_ | i thought it would print it in a better way | 12:30 |
pigmej | :D | 12:30 |
pigmej | welcome to IRC :) | 12:30 |
pigmej | dshulyak_: what about that while? | 12:31 |
dshulyak_ | sec i want to check original impl | 12:31 |
pigmej | of gevent local? | 12:32 |
dshulyak_ | hm | 12:32 |
dshulyak_ | you will always use at least the 1st parent | 12:32 |
dshulyak_ | https://github.com/gevent/gevent/blob/master/gevent/local.py#L160-L161 | 12:32 |
pigmej | sure | 12:33 |
pigmej | that's intentional | 12:33 |
dshulyak_ | maybe thats the problem? | 12:33 |
pigmej | ? | 12:33 |
pigmej | that _nested_parent is assigned in pool.spawn | 12:33 |
dshulyak_ | but you have default c.parent | 12:33 |
dshulyak_ | in getattr | 12:33 |
pigmej | yup | 12:33 |
dshulyak_ | so… | 12:34 |
pigmej | ah, hmm, maybe for celery it's problem | 12:34 |
dshulyak_ | thats the difference with gevent implementation | 12:34 |
pigmej | sure but it's intentional in all cases except celery | 12:34 |
pigmej | it seems then that celery spawns worker greenlets from another greenlet | 12:35 |
dshulyak_ | which cases? solar_map? | 12:35 |
dshulyak_ | it looks like gevent local will be always shared, doesnt matter if _nested_parent is set or not | 12:35 |
pigmej | no, it certainly works fine, you can check it :) | 12:36 |
pigmej | BUT I probably haven't checked case like celery | 12:36 |
pigmej | where they have their own nested greenlets | 12:36 |
pigmej | so yeah that may be bad implementation for this case as you noticed | 12:37 |
pigmej | yeah that may be it | 12:39 |
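The loop pasted at 12:30, restored with indentation, is shown below next to a variant that follows only the explicitly assigned `_nested_parent`. This is a sketch under assumptions: `FakeGreenlet` and both function names are made up, no real gevent objects are involved, and the second variant is only a guess at the kind of change being discussed, not the actual patch.

```python
class FakeGreenlet:
    """Hypothetical stand-in for a greenlet: just a name and a parent link."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def owner_with_parent_default(c):
    """Loop as pasted in the chat: falls back to c.parent when no _nested_parent is set."""
    while True:
        tmp_c = getattr(c, "_nested_parent", c.parent)
        if not tmp_c:
            return c
        c = tmp_c


def owner_explicit_only(c):
    """Assumed alternative: only follow links that were assigned explicitly."""
    while True:
        tmp_c = getattr(c, "_nested_parent", None)
        if not tmp_c:
            return c
        c = tmp_c


# Two workers spawned from the same parent, as a celery gevent pool might do.
hub = FakeGreenlet("hub")
worker_a = FakeGreenlet("worker_a", parent=hub)
worker_b = FakeGreenlet("worker_b", parent=hub)

# A child created in the "special way": it is told explicitly whose cache to share.
child = FakeGreenlet("child", parent=hub)
child._nested_parent = worker_a

print(owner_with_parent_default(worker_a).name, owner_with_parent_default(worker_b).name)
print(owner_explicit_only(worker_a).name, owner_explicit_only(child).name)
```

With the `c.parent` fallback both workers resolve to the same owner and would therefore share one cache; with the explicit-only lookup each worker owns its cache and only the specially created child joins `worker_a`.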
pigmej | anyway: KeyError: 'lock_bucket_type' what about that ? | 12:40 |
dshulyak_ | pigmej: do you have it in config? | 12:44 |
pigmej | yup | 12:44 |
pigmej | I'm checking your PR | 12:44 |
pigmej | and I wanted to run celery with ansible | 12:45 |
pigmej | `ansible-playbook -v -i "localhost," -c local /vagrant/bootstrap/playbooks/celery.yaml --skip-tags install` | 12:45 |
pigmej | ah | 12:49 |
pigmej | that config class requires all variables to be set in all config files, including .solar_config_override | 12:49 |
pigmej | yeah we will need to adjust this _setter for that case :D | 12:55 |
dshulyak_ | pigmej: take a look please - http://paste.openstack.org/show/483565/, is it correct test? | 13:06 |
dshulyak_ | it is obviously not == [], i just wanted to print the result | 13:06 |
pigmej | dshulyak_: you need to patch get_local | 13:08 |
pigmej | but hmm, maybe it's correct order too | 13:08 |
pigmej | I will check it after meeting :) | 13:08 |
dshulyak_ | omg, this irc prints messages in random order | 13:09 |
openstackgerrit | Lukasz Oles proposed openstack/solar: Add worpdress example from tutorial https://review.openstack.org/266360 | 13:14 |
dshulyak_ | pigmej: please checkout that test, for me that data is shared, but maybe i messed up smth | 13:59 |
pigmej | btw the first if is always invalid ? | 14:02 |
pigmej | isn't it? | 14:02 |
pigmej | I mean false | 14:03 |
dshulyak_ | if Model._local.test_id == 11: ? wo this if there will be AttributeError | 14:03 |
pigmej | right | 14:04 |
* pigmej stupid... | 14:04 | |
pigmej | yeah it's shared | 14:04 |
pigmej | BUT I'm starting to wonder if it's good or bad... ;d | 14:06 |
tzn | planned stuff for 0.1.0 release https://etherpad.openstack.org/p/solar-release-0.1.0 | 14:10 |
tzn | please add your comments | 14:11 |
pigmej | done :) | 14:12 |
pigmej | dshulyak_: hmm, your test is a bit unfortunate | 14:17 |
pigmej | it should be list of single item sets, isn't it? | 14:18 |
dshulyak_ | pigmej: yes, == [] is not correct, i just wanted to print out all items | 14:18 |
pigmej | ah ok | 14:19 |
pigmej | I ran it without pytest | 14:19 |
openstackgerrit | Jedrzej Nowak proposed openstack/solar: Fix gevent cache, by default it shouldn't be shared https://review.openstack.org/266395 | 14:24 |
pigmej | dshulyak_: ^ | 14:24 |
dshulyak_ | it won't break other cases? | 14:25 |
pigmej | it shouldn't | 14:27 |
pigmej | it should match thread approach now | 14:27 |
pigmej | I have no idea why I put this c.parent there... | 14:27 |
pigmej | I probably assumed that first parent will be always empty | 14:27 |
pigmej | dshulyak_: check it if it works for you | 14:36 |
pigmej | dshulyak_: I executed the example 100 times, it never crashed | 14:44 |
dshulyak_ | pigmej: i will push patch for celery sqlite and check it with gevent worker | 14:46 |
pigmej | ;D | 14:46 |
pigmej | does this sqlite work 'good enough'? | 14:47 |
dshulyak_ | pigmej: it fails with database is locked | 14:54 |
pigmej | hmm | 14:54 |
pigmej | that's a known sqlite error | 14:54 |
dshulyak_ | i changed celery to operate with its own db, and now its fine | 14:56 |
pigmej | yeah | 14:56 |
pigmej | I wanted to suggest it | 14:56 |
dshulyak_ | at first i shared solar.db with celery | 14:56 |
pigmej | because sqlite likes to lock itself | 14:56 |
pigmej | https://www.sqlite.org/lockingv3.html | 14:56 |
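The "database is locked" error mentioned above can be reproduced with the standard `sqlite3` module when two connections want to write to the same file. This is a minimal sketch of the contention itself, not of the celery setup: the file name, table and function name are made up, and `timeout=0` is used only so the second connection fails immediately instead of waiting.

```python
import os
import sqlite3
import tempfile


def demo_lock_contention():
    """Hold a write lock on one connection and try to take it on a second one."""
    path = os.path.join(tempfile.mkdtemp(), "shared.db")  # hypothetical shared file

    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled with explicit BEGIN/ROLLBACK.
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE t (x INTEGER)")
    writer.execute("BEGIN IMMEDIATE")  # takes the write lock right away
    writer.execute("INSERT INTO t VALUES (1)")

    other = sqlite3.connect(path, timeout=0, isolation_level=None)
    try:
        other.execute("BEGIN IMMEDIATE")  # second writer on the same file
        result = "no error"
    except sqlite3.OperationalError as exc:
        result = str(exc)
    finally:
        writer.execute("ROLLBACK")
        writer.close()
        other.close()
    return result


print(demo_lock_contention())
```

Giving each writer its own database file, as described in the chat, removes this contention because the two never compete for the same file lock.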
dshulyak_ | pushed changes, i will modify bootstrap with separate patch | 15:01 |
pigmej | k I will check it:) | 15:01 |
pigmej | dshulyak_: did it succeed ? Because still no PR there | 15:02 |
pigmej | ok found it | 15:04 |
dshulyak_ | pigmej: :( idk what is the problem, i added you to patch, gevent fix works for me, i will restore my gevent version, and i need to fix config files | 15:22 |
pigmej | ? | 15:24 |
pigmej | I don't get your last message dshulyak_ | 15:24 |
dshulyak_ | usually there is notification about pushed patch | 15:26 |
pigmej | ah this yeah :) | 15:28 |
pigmej | openstackgerrit: you alive ? | 15:29 |
pigmej | dshulyak_: Yeah I tested my patch on your gevent version | 15:29 |
pigmej | it worked for me too | 15:29 |
pigmej | salmon_: https://review.openstack.org/#/c/266395/ workflow please there :) | 15:37 |
pigmej | or dshulyak_ | 15:38 |
pigmej | dshulyak_: hmm, you rebased something ? | 15:41 |
openstackgerrit | Dmitry Shulyak proposed openstack/solar: Change celery config to use sqlite backend messaging and results https://review.openstack.org/266418 | 15:41 |
pigmej | because I see in your PR my changes ;d | 15:41 |
dshulyak_ | in which one? | 15:41 |
pigmej | in dockerfile I see my "torrent ports" for example | 15:42 |
pigmej | not that I care about ownership or sth but I wonder wf | 15:42 |
pigmej | wtf | 15:42 |
dshulyak_ | https://review.openstack.org/#/c/259127/7/docker-compose.yml ? | 15:42 |
pigmej | yeah | 15:43 |
pigmej | hmm | 15:43 |
* pigmej wonders | 15:43 | |
dshulyak_ | but they are not green | 15:43 |
pigmej | they were :D | 15:43 |
openstackgerrit | Merged openstack/solar: Fix gevent cache, by default it shouldn't be shared https://review.openstack.org/266395 | 15:43 |
pigmej | salmon_: I see that you prefer to work not talk ?:D | 15:43 |
salmon_ | pigmej: ? | 15:44 |
pigmej | joking ;P | 15:44 |
pigmej | dshulyak_: good job with that locks, seems ok :) | 15:47 |
pigmej | though you have one riak = too much :) | 15:49 |
pigmej | but everything works fine :) | 15:49 |
*** dshulyak_ has quit IRC | 15:53 | |
*** dshulyak_ has joined #openstack-solar | 15:54 | |
*** dshulyak_ has quit IRC | 15:55 | |
*** tzn has quit IRC | 16:04 | |
*** tzn has joined #openstack-solar | 16:07 | |
openstackgerrit | Merged openstack/solar: Set vagrant as a owner of /var/lib/solar/repositories https://review.openstack.org/266344 | 16:29 |
openstackgerrit | Merged openstack/solar: Change celery config to use sqlite backend messaging and results https://review.openstack.org/266418 | 16:29 |
openstackgerrit | Jedrzej Nowak proposed openstack/solar: Added possibility to change computable input func https://review.openstack.org/266475 | 16:50 |
*** dshulyak_ has joined #openstack-solar | 17:48 | |
*** dshulyak_ has quit IRC | 17:51 | |
*** tzn has quit IRC | 18:34 | |
*** tzn has joined #openstack-solar | 18:59 | |
*** dshulyak_ has joined #openstack-solar | 20:15 | |
*** dshulyak_ has quit IRC | 20:27 | |
*** _tzn has joined #openstack-solar | 20:31 | |
*** tzn has quit IRC | 20:34 | |
*** salmon_ has quit IRC | 20:48 | |
*** salmon_ has joined #openstack-solar | 20:57 | |
*** _tzn has quit IRC | 22:31 | |
*** tzn has joined #openstack-solar | 23:17 | |
*** salmon_ has quit IRC | 23:46 | |