*** shuyingy_ has joined #openstack-sahara | 00:23 | |
*** shuyingy_ has quit IRC | 00:27 | |
openstackgerrit | Siyi Luo proposed openstack/sahara-image-elements master: Update the documentation link for doc migration https://review.openstack.org/498281 | 00:59 |
---|---|---|
*** shuyingya has joined #openstack-sahara | 01:24 | |
*** shuyingya has quit IRC | 01:24 | |
*** shuyingya has joined #openstack-sahara | 01:24 | |
openstackgerrit | chao liu proposed openstack/sahara master: writing convention set to use "." to source script files https://review.openstack.org/498223 | 01:24 |
openstackgerrit | chao liu proposed openstack/sahara master: writing convention: do not use “-y” for package install https://review.openstack.org/498244 | 01:35 |
*** caowei has joined #openstack-sahara | 01:41 | |
*** shuyingy_ has joined #openstack-sahara | 02:17 | |
*** shuyingya has quit IRC | 02:18 | |
*** hoonetorg has quit IRC | 02:43 | |
*** hoonetorg has joined #openstack-sahara | 03:00 | |
*** esikachev has joined #openstack-sahara | 03:55 | |
*** esikachev has quit IRC | 03:59 | |
*** links has joined #openstack-sahara | 04:00 | |
*** shuyingy_ has quit IRC | 04:16 | |
*** shuyingya has joined #openstack-sahara | 04:22 | |
*** caowei has quit IRC | 05:09 | |
*** rcernin has joined #openstack-sahara | 05:30 | |
*** shuyingy_ has joined #openstack-sahara | 05:32 | |
*** shuyingya has quit IRC | 05:35 | |
*** caowei has joined #openstack-sahara | 05:43 | |
*** esikachev has joined #openstack-sahara | 06:10 | |
*** pgadiya has joined #openstack-sahara | 06:13 | |
openstackgerrit | ShangXiao proposed openstack/sahara master: [Trivialfix]Fix typos in sahara https://review.openstack.org/498328 | 06:35 |
*** pcaruana has joined #openstack-sahara | 06:36 | |
openstackgerrit | ShangXiao proposed openstack/sahara master: Replace http with https for doc links in sahara https://review.openstack.org/498329 | 06:46 |
*** anshul has joined #openstack-sahara | 07:11 | |
*** zemuvier has quit IRC | 07:15 | |
*** anshul has quit IRC | 07:20 | |
*** tesseract has joined #openstack-sahara | 07:31 | |
*** esikachev has quit IRC | 07:32 | |
*** esikachev has joined #openstack-sahara | 07:55 | |
*** esikachev has quit IRC | 07:59 | |
*** esikachev has joined #openstack-sahara | 08:03 | |
*** anshul has joined #openstack-sahara | 08:14 | |
*** tosky has joined #openstack-sahara | 08:49 | |
*** esikachev has quit IRC | 09:33 | |
*** esikachev has joined #openstack-sahara | 09:38 | |
openstackgerrit | Alina Nesterova proposed openstack/sahara-ci-config master: Add playbook to run Apache2 https://review.openstack.org/490002 | 10:04 |
*** zemuvier has joined #openstack-sahara | 10:09 | |
*** tosky has quit IRC | 10:21 | |
*** tosky has joined #openstack-sahara | 10:43 | |
*** jamielennox has quit IRC | 10:51 | |
*** jamielennox has joined #openstack-sahara | 10:52 | |
*** caowei has quit IRC | 10:55 | |
*** esikachev has quit IRC | 11:19 | |
*** dave-mccowan has joined #openstack-sahara | 11:32 | |
*** pgadiya has quit IRC | 11:51 | |
openstackgerrit | Merged openstack/sahara-image-elements master: Update the documentation link for doc migration https://review.openstack.org/498281 | 11:53 |
*** esikachev has joined #openstack-sahara | 12:08 | |
*** dave-mccowan has quit IRC | 12:38 | |
*** shuyingy_ has quit IRC | 12:45 | |
*** dave-mccowan has joined #openstack-sahara | 12:49 | |
*** esikachev has quit IRC | 13:03 | |
*** jeremyfreudberg has joined #openstack-sahara | 13:04 | |
*** lucasxu has joined #openstack-sahara | 13:05 | |
openstackgerrit | chao liu proposed openstack/sahara master: Fix to use "." to source script files https://review.openstack.org/498223 | 13:24 |
openstackgerrit | chao liu proposed openstack/sahara master: writing convention: do not use “-y” for package install https://review.openstack.org/498244 | 13:27 |
openstackgerrit | Merged openstack/sahara-image-elements master: Allow control of image output format https://review.openstack.org/498012 | 13:30 |
*** esikachev has joined #openstack-sahara | 14:00 | |
*** esikachev has quit IRC | 14:04 | |
*** shuyingya has joined #openstack-sahara | 14:17 | |
*** shuyingya has quit IRC | 14:19 | |
*** shuyingya has joined #openstack-sahara | 14:19 | |
*** links has quit IRC | 14:22 | |
openstackgerrit | Iwona Kotlarska proposed openstack/sahara master: Add export of cluster templates https://review.openstack.org/498484 | 14:45 |
*** iwonka has joined #openstack-sahara | 14:57 | |
iwonka | ping tellesnobrega | 14:58 |
tellesnobrega | iwonka, here | 14:58 |
iwonka | i have a problem with git (again...) | 14:58 |
iwonka | i have two of my commits on my branch | 14:58 |
iwonka | so i shouldn't do git-review | 14:59 |
iwonka | but one of them is also on master (i have no idea why) | 14:59 |
iwonka | so how can i do git-review only on the newer one? | 14:59 |
iwonka | the older has already been submitted for review | 15:00 |
iwonka | but i don't have a branch to start with to make the single commit i want | 15:00 |
*** esikachev has joined #openstack-sahara | 15:01 | |
tellesnobrega | iwonka, ok, let me be sure I understand | 15:01 |
tellesnobrega | you are working on a branch and you have two commits on it | 15:01 |
iwonka | yes, one is about ngt and the second one is about ct | 15:02 |
tellesnobrega | one of them is also on master | 15:02 |
tellesnobrega | but you are not working from master | 15:03 |
iwonka | yes, probably my mistake, it's not merged | 15:03 |
tellesnobrega | if you can you can reset master to HEAD | 15:04 |
tellesnobrega | you can cut a new branch and move the new patch to this branch and do git review from it | 15:05 |
tellesnobrega | also, you can reset master to HEAD | 15:05 |
tellesnobrega | and on the branch that you are working that has 2 patches | 15:05 |
*** esikachev has quit IRC | 15:05 | |
tellesnobrega | you can do a git rebase -i | 15:05 |
tellesnobrega | pick only the one that you want to submit | 15:05 |
tellesnobrega | and send the patch | 15:05 |
openstackgerrit | Merged openstack/sahara master: Fix to use "." to source script files https://review.openstack.org/498223 | 15:06 |
openstackgerrit | Merged openstack/sahara master: writing convention: do not use “-y” for package install https://review.openstack.org/498244 | 15:06 |
tellesnobrega | iwonka, does that make sense to you? | 15:07 |
iwonka | not exactly | 15:07 |
iwonka | git rebase says "There is no tracking information for the current branch." | 15:08 |
iwonka | when i'm on the branch for export of ct | 15:08 |
tellesnobrega | did you reset master? | 15:08 |
iwonka | yes | 15:08 |
tellesnobrega | now on the branch for export of ct | 15:08 |
tellesnobrega | you can do git rebase -i master | 15:08 |
iwonka | git rebase -i master? | 15:08 |
iwonka | a | 15:08 |
iwonka | okay | 15:08 |
iwonka | now i have to pick only one commit that i want? | 15:09 |
tellesnobrega | yes | 15:09 |
iwonka | ok, it worked, thanks! | 15:09 |
tellesnobrega | np | 15:10 |
tellesnobrega | I have to leave for a bit now | 15:10 |
tellesnobrega | lunch time | 15:10 |
tellesnobrega | if you have any more questions just leave them here or send them by mail | 15:10 |
tellesnobrega | I will reply asap | 15:10 |
iwonka | ok, thanks | 15:13 |
mnaser | i have a question | 15:15 |
jeremyfreudberg | mnaser - fire away | 15:15 |
mnaser | does the sahara api do some sort of busy loop? | 15:16 |
mnaser | i am trying to troubleshoot some performance issues | 15:16 |
mnaser | api requests are very slow | 15:16 |
mnaser | i did some profiling, saw an sql query took 4 seconds in sqlalchemy (nothing but a select all clusters) | 15:16 |
mnaser | upon doing stracing, the process was going nuts with epoll_wait non stop (on a zero timeout) | 15:16 |
mnaser | the cpu usage is relatively higher so i believe that busy loop is what is messing things up potentially | 15:16 |
mnaser | i just don't know where to go from here. a show cluster api operation takes 7 seconds right now | 15:17 |
jeremyfreudberg | mnaser, sadly not my area of expertise... i would say look into oslo.service etc (although perhaps you already have). also, if it's possible to benchmark sahara-engine as well, that might be useful to narrow down something | 15:23 |
mnaser | i saw some conductor code | 15:23 |
mnaser | i believe that it is just pipe work for an eventual conductor | 15:23 |
mnaser | db calls still happen in the api process? | 15:23 |
*** shuyingya has quit IRC | 15:29 | |
*** links has joined #openstack-sahara | 15:33 | |
jeremyfreudberg | https://bugs.launchpad.net/oslo.messaging/+bug/1518430 https://bugs.launchpad.net/mos/+bug/1380220 - you've probably already read these bugs. and I don't think that what we are seeing here is the same bug exactly, but it still might be good to know if you see different behavior on python3 | 15:38 |
openstack | Launchpad bug 1518430 in Ubuntu Cloud Archive kilo "liberty: ~busy loop on epoll_wait being called with zero timeout" [Medium,Fix committed] | 15:38 |
jeremyfreudberg | again, i know very little about this stuff (seems like most openstack devs take it for granted too). so doing my best to narrow things down | 15:38 |
openstack | Launchpad bug 1380220 in Mirantis OpenStack 10.0.x "OpenStack services excessively poll socket events when oslo.messaging is used" [Medium,Fix committed] - Assigned to MOS Oslo (mos-oslo) | 15:38 |
jeremyfreudberg | ^ mnaser | 15:38 |
mnaser | jeremyfreudberg yes i actually even found a redhat fix where they did a very in-depth investigation but even in the investigation after fixing it sahara was a heavy poller | 15:39 |
mnaser | https://bugzilla.redhat.com/show_bug.cgi?id=1384183 | 15:40 |
openstack | bugzilla.redhat.com bug 1384183 in python-oslo-messaging "busy looping on epoll_wait()" [Urgent,Closed: errata] - Assigned to jeckersb | 15:40 |
jeremyfreudberg | mnaser, ok, thanks for clarifying. that is tricky then... i guess it's "good" to know (although difficult to fix) that there is something specifically in sahara | 15:41 |
jeremyfreudberg | mnaser, have you benchmarked another service that uses sahara-style conductor? (I think most other services do not use something quite like that) | 15:42 |
mnaser | jeremyfreudberg https://bugzilla.redhat.com/show_bug.cgi?id=1384183#c24 -- that one specifically you see there | 15:42 |
openstack | bugzilla.redhat.com bug 1384183 in python-oslo-messaging "busy looping on epoll_wait()" [Urgent,Closed: errata] - Assigned to jeckersb | 15:42 |
mnaser | what do you mean by sahara-style conductor? | 15:42 |
jeremyfreudberg | mnaser, i'm not sure exactly what i mean. but i know that sahara has conductor/ folder which has to do with sqlalchemy and some oslo rpc stuff, but I was under the impression that some other openstack services do it a "different" way. but I could be confused | 15:45 |
jeremyfreudberg | just trying to think of what makes sahara service act different | 15:46 |
mnaser | jeremyfreudberg nova has the nova-conductor approach | 15:46 |
jeremyfreudberg | indeed it does | 15:47 |
*** pcaruana has quit IRC | 15:47 | |
jeremyfreudberg | and i'm guessing that the process behaves rather sanely? | 15:47 |
jeremyfreudberg | ah, i see on the rh bug that they saw significant improvement there | 15:49 |
mnaser | jeremyfreudberg but you notice that sahara-api is still drawing 81k epoll_waits over 30s | 15:50 |
jeremyfreudberg | mnaser, yes i do see that | 15:50 |
mnaser | on the other hand nova api is not even on the list, heat-engine/nova-conductor which are pretty busy usually are nowhere close | 15:50 |
mnaser | so thats why im wondering there must be something | 15:51 |
*** esikachev has joined #openstack-sahara | 16:02 | |
*** esikachev has quit IRC | 16:06 | |
*** rcernin has quit IRC | 16:10 | |
openstackgerrit | Iwona Kotlarska proposed openstack/sahara master: Add export of cluster templates https://review.openstack.org/498484 | 16:20 |
*** jeremyfreudberg has quit IRC | 16:22 | |
*** links has quit IRC | 16:27 | |
*** tesseract has quit IRC | 16:30 | |
openstackgerrit | Iwona Kotlarska proposed openstack/python-saharaclient master: Add export of cluster templates https://review.openstack.org/498520 | 16:35 |
*** rcernin has joined #openstack-sahara | 16:48 | |
*** ssmith has joined #openstack-sahara | 16:51 | |
*** shuyingya has joined #openstack-sahara | 17:02 | |
*** esikachev has joined #openstack-sahara | 17:03 | |
*** shuyingya has quit IRC | 17:07 | |
*** chlong has joined #openstack-sahara | 17:07 | |
*** esikachev has quit IRC | 17:07 | |
mnaser | sahara 13999 12.2 0.8 408456 246736 ? Ss Aug25 497:06 /usr/bin/python2 /usr/bin/sahara-engine --config-file /etc/sahara/sahara.conf | 17:07 |
mnaser | the cpu time is insane! | 17:07 |
*** jeremyfreudberg has joined #openstack-sahara | 17:08 | |
tellesnobrega | mnaser, hey just read your question, it is very interesting and I don't believe we were aware of this issue | 17:08 |
tellesnobrega | thanks for bringing it up | 17:08 |
mnaser | the engine suffers from the same issue so i suspect oslo.service might be the culprit but i dont know enough about the codebase right now | 17:10 |
mnaser | ill get profiling | 17:10 |
jeremyfreudberg | mnaser, thanks a ton for your help and interest | 17:11 |
tellesnobrega | thanks mnaser | 17:11 |
tellesnobrega | let me know if I can be of any help | 17:11 |
mnaser | yeah, we're trying to launch it in prod and this sort of thing is a breaker, thank you :) | 17:11 |
tellesnobrega | jeremyfreudberg, welcome back | 17:11 |
jeremyfreudberg | tellesnobrega, it's good to be back | 17:11 |
jeremyfreudberg | btw mnaser, i'm hoping you'll be at the ptg since you are (if i'm not mistaken) the incoming puppet PTL... if you happen to be around wed/thurs that week please feel free to stop by sahara room | 17:13 |
mnaser | jeremyfreudberg will do so, need to plan my stuff | 17:14 |
jeremyfreudberg | cool | 17:14 |
openstackgerrit | Iwona Kotlarska proposed openstack/sahara master: Add export of cluster templates https://review.openstack.org/498484 | 17:28 |
tellesnobrega | jeremyfreudberg, did you have a chance to read last meeting's log? | 17:40 |
jeremyfreudberg | tellesnobrega, i did, but quickly | 17:41 |
tellesnobrega | jeremyfreudberg, shu yingya described a problem with scale cluster after updating the cluster name | 17:43 |
tosky | and he added some comments about it | 17:43 |
tosky | in the bug | 17:43 |
tosky | it's the change in the heat template | 17:43 |
tosky | in the name | 17:43 |
tellesnobrega | I tried it on a devstack locally | 17:44 |
tellesnobrega | and it worked | 17:44 |
jeremyfreudberg | i can try soon to attempt to replicate the bug | 17:46 |
tellesnobrega | jeremyfreudberg, please do, that was where I wanted to get | 17:55 |
tellesnobrega | I couldn't replicate here with storm | 17:55 |
jeremyfreudberg | i'll try it on a few plugins, i guess (shu yingya said vanilla and storm only, but let's be sure) | 17:57 |
tellesnobrega | sure | 17:59 |
*** esikachev has joined #openstack-sahara | 18:04 | |
*** esikachev has quit IRC | 18:08 | |
*** zemuvier has quit IRC | 18:22 | |
*** zemuvier has joined #openstack-sahara | 18:23 | |
mnaser | jeremyfreudberg tellesnobrega do you know if its possible to run the flask app only somehow with sahara? | 18:36 |
mnaser | im trying to see if i can make use of the werkzeug profiler | 18:37 |
tellesnobrega | mnaser, did you try this sahara-venv/bin/sahara-api --config-file sahara-venv/etc/sahara.conf ? | 18:38 |
tellesnobrega | sahara-venv/bin/sahara-engine --config-file sahara-venv/etc/sahara.conf | 18:38 |
mnaser | tellesnobrega i can do that but that would start up the service using oslo.service rather than just running the sahara-api flask app directly | 18:38 |
tellesnobrega | hum | 18:38 |
mnaser | if i can run the sahara-api app alone and there is no crazy cpu usage | 18:38 |
tellesnobrega | I see | 18:38 |
jeremyfreudberg | https://github.com/openstack/sahara/blob/4439daca6de49430ad6d4a5b295e94a7eb4acf72/sahara/api/middleware/sahara_middleware.py#L31 | 18:38 |
mnaser | i can at least know that it's a oslo.service thing | 18:38 |
jeremyfreudberg | write something that imports from the linked file? | 18:39 |
mnaser | ok thats a good start, im using this as reference http://www.alexandrejoseph.com/blog/2015-12-17-profiling-werkzeug-flask-app.html | 18:39 |
mnaser | PATH: '/v1.1/8ffa97d00b9a4f55a5f4f40a5a05b600/clusters/4961ff69-a2f9-40a3-b44c-ff208ce5ca51' | 18:47 |
mnaser | 8433826 function calls (8321676 primitive calls) in 9.983 seconds | 18:47 |
mnaser | damn | 18:47 |
tellesnobrega | thats only sahara-api? | 18:48 |
mnaser | thats one GET call | 18:48 |
mnaser | to get a cluster | 18:48 |
tellesnobrega | thats a lot | 18:48 |
mnaser | getting a cluster template is 70k calls | 18:48 |
mnaser | so something is wrong | 18:48 |
mnaser | http://paste.openstack.org/show/619664/ | 18:49 |
mnaser | there seems to be some weird stuff going on | 18:50 |
mnaser | with the pymysql # of calls | 18:50 |
mnaser | http://paste.openstack.org/show/619665/ | 18:50 |
mnaser | thats a get cluster template | 18:50 |
*** esikachev has joined #openstack-sahara | 18:52 | |
tellesnobrega | mnaser, it is weird the number of sqlalchemy calls ther | 18:52 |
tellesnobrega | there | 18:52 |
tellesnobrega | how long is this taking? | 18:53 |
mnaser | 9.983 seconds according to that profile for that query | 18:54 |
mnaser | 0.133 to get a cluster template | 18:54 |
tellesnobrega | cluster_template seems like it should be ok | 18:55 |
mnaser | _read_row_from_packet is called 8 times in cluster template | 18:55 |
tellesnobrega | get cluster not so much | 18:55 |
openstackgerrit | Iwona Kotlarska proposed openstack/sahara master: Add export of cluster templates https://review.openstack.org/498484 | 18:55 |
mnaser | 4682 times in get cluster | 18:55 |
tellesnobrega | yeah, that is not right | 18:56 |
tellesnobrega | read_length_coded_string is called 698922 on get cluster and 1292 on get_cluster_template | 18:57 |
mnaser | tellesnobrega i think that's actually just an indirect effect of the bad _read_row_from_packet | 18:58 |
mnaser | the number scales linearly so it seems fine | 18:58 |
tellesnobrega | yeah, I was doing the math here | 18:58 |
mnaser | so the thing that inflates it seems to be _read_row_from_packet tellesnobrega which shows the big cumtime too | 18:58 |
tellesnobrega | mnaser, true. | 19:00 |
mnaser | 2 0.000 0.000 9.966 4.983 /usr/lib/python2.7/site-packages/sahara/db/sqlalchemy/api.py:284(_cluster_get) | 19:00 |
mnaser | def in the sqlalchemy layer | 19:00 |
*** tellesnobrega has left #openstack-sahara | 19:01 | |
*** tellesnobrega has joined #openstack-sahara | 19:01 | |
mnaser | i considered that it is a badly written query but i mean | 19:03 |
mnaser | there's almost no rows | 19:03 |
mnaser | even no indexes no way that can take long | 19:03 |
*** jeremyfreudberg_ has joined #openstack-sahara | 19:04 | |
tellesnobrega | mnaser, yeah, I don't see where this could be badly written | 19:04 |
*** jeremyfreudberg has quit IRC | 19:07 | |
mnaser | 4 0.000 0.000 8.947 2.237 /usr/lib/python2.7/site-packages/pymysql/connections.py:1379(_read_result_packet) | 19:07 |
mnaser | thats called 4 times regardless | 19:08 |
mnaser | but takes 8 seconds in the cluster get | 19:08 |
mnaser | _read_result_packet in turn calls read_length_encoded_integer in it | 19:09 |
mnaser | which is called ~924k times in the slow request | 19:09 |
mnaser | but only 282 times in the fast one | 19:09 |
* jeremyfreudberg_ jeremyfreudberg | 19:14 | |
jeremyfreudberg_ | oops | 19:14 |
*** jeremyfreudberg_ is now known as jeremyfreudberg | 19:14 | |
openstackgerrit | Iwona Kotlarska proposed openstack/python-saharaclient master: Add import of Cluster Templates https://review.openstack.org/498574 | 19:21 |
tellesnobrega | mnaser, thanks for digging this. I will try to take a look in more detail | 19:22 |
tellesnobrega | and if you find anything let me know | 19:22 |
mnaser | tellesnobrega yeah ill keep diggin | 19:23 |
tellesnobrega | mnaser, btw to start only sahara you did as jeremyfreudberg suggested? imported sahara_middleware and started it? | 19:28 |
mnaser | tellesnobrega nope i edited the file jeremyfreudberg mentioned following the instructions of the article i had there and launched it in foreground (profiling info goes in stderr) | 19:28 |
tellesnobrega | cool | 19:29 |
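The per-request profiling mnaser describes above (wrap the WSGI app, read the stats from stderr) can be sketched with the standard library alone. This is a minimal illustration, not the middleware he used: `ProfilingMiddleware` and `demo_app` are hypothetical names, and the class is only a stand-in for the Werkzeug profiler mentioned in the log.

```python
import cProfile
import io
import pstats
import sys


class ProfilingMiddleware:
    """Hypothetical stand-in for a WSGI profiler middleware (stdlib only)."""

    def __init__(self, app, stream=sys.stderr, limit=10):
        self.app = app
        self.stream = stream  # where the report is written (stderr by default)
        self.limit = limit    # number of pstats rows to print per request

    def __call__(self, environ, start_response):
        profiler = cProfile.Profile()
        # Consume the whole response inside the profiler so that lazily
        # generated bodies are measured as well.
        body = profiler.runcall(lambda: list(self.app(environ, start_response)))
        report = io.StringIO()
        stats = pstats.Stats(profiler, stream=report).sort_stats("cumulative")
        stats.print_stats(self.limit)
        self.stream.write(
            "PATH: %r\n%s" % (environ.get("PATH_INFO"), report.getvalue())
        )
        return body


def demo_app(environ, start_response):
    """Tiny WSGI app used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]
```

Wrapping the real application object instead of `demo_app` and running it in the foreground would give one report per request, sorted by cumulative time, which is the shape of the output pasted earlier in the log.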
openstackgerrit | Iwona Kotlarska proposed openstack/python-saharaclient master: Add export of cluster templates https://review.openstack.org/498520 | 19:31 |
iwonka | tellesnobrega: all that's left is dealing with ct in gui | 19:32 |
iwonka | i have export ready | 19:32 |
tellesnobrega | that is awesome iwonka | 19:32 |
iwonka | but about one old commit | 19:32 |
iwonka | jeremyfreudberg asked to do rebase | 19:33 |
iwonka | and i did it | 19:33 |
iwonka | and now it got this problem | 19:33 |
iwonka | Patch in Merge Conflict | 19:33 |
jeremyfreudberg | iwonka, yes, now it's a trickier rebase | 19:33 |
jeremyfreudberg | one that cannot be done from ui :( | 19:33 |
iwonka | uhh | 19:33 |
tellesnobrega | iwonka, you will have to do the following | 19:34 |
tellesnobrega | cut a branch from master | 19:34 |
tellesnobrega | and checkout that change into that branch | 19:34 |
tellesnobrega | fix conflict | 19:34 |
tellesnobrega | and send the review | 19:34 |
jeremyfreudberg | https://www.entropywins.wtf/blog/2013/07/01/resolving-a-merge-conflict-on-gerrit/ | 19:34 |
jeremyfreudberg | basically what tellesnobrega said | 19:34 |
jeremyfreudberg | i do a fresh clone folder, then `git review -d <patch number>` then do rebase stuff | 19:34 |
jeremyfreudberg | and by "rebase stuff" i mean conflict resolution | 19:35 |
tellesnobrega | works as well | 19:35 |
iwonka | ok, i'll try to fix that, thanks | 19:35 |
*** anshul has quit IRC | 19:37 | |
openstackgerrit | Luigi Toscano proposed openstack/python-saharaclient master: Reorganize the documentation following the new structure https://review.openstack.org/498580 | 19:39 |
iwonka | okay, the conflict was with version of oslo.log | 19:45 |
mnaser | ok so | 19:45 |
mnaser | the query to get a specific cluster | 19:45 |
iwonka | can I just take the bigger one and it will be fine? | 19:45 |
mnaser | has 10 joins in it. | 19:46 |
mnaser | and for a single cluster get it returns 2340 rows. | 19:46 |
jeremyfreudberg | iwonka, yes take the later one, but that shouldn't even be a conflict since i don't think you had a patch which also changed that file | 19:46 |
mnaser | http://paste.openstack.org/show/619676/ | 19:46 |
jeremyfreudberg | iwonka, or let me rephrase that, i'm surprised it was a conflict that could not be resolved automatically | 19:48 |
jeremyfreudberg | because your change to requirements.txt was a different line | 19:48 |
jeremyfreudberg | regardless, just take the newer version | 19:48 |
jeremyfreudberg | mnaser, looking | 19:48 |
mnaser | i think that was the query when you request progress/steps/whatever that query request | 19:48 |
mnaser | there's so much data sent over the line | 19:49 |
mnaser | mgmt private key * 200 times | 19:49 |
mnaser | 2000* | 19:49 |
jeremyfreudberg | mnaser, this is a little scary to look at... | 19:49 |
mnaser | jeremyfreudberg i just hit ctrl+f LEFT JOIN | 19:49 |
mnaser | didnt even want to haha | 19:49 |
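The row blow-up mnaser reports (many LEFT JOINs, thousands of rows for one cluster, the private key repeated on every row) can be reproduced in miniature. The sketch below uses an in-memory SQLite database with made-up table names, column names and sizes; it is not Sahara's schema, only an illustration of how joining one parent row to two independent child collections multiplies the rows and repeats the large parent column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clusters (id INTEGER PRIMARY KEY, private_key TEXT);
    CREATE TABLE node_groups (id INTEGER PRIMARY KEY, cluster_id INTEGER);
    CREATE TABLE provision_steps (id INTEGER PRIMARY KEY, cluster_id INTEGER);
""")

# One cluster with a large TEXT column, 5 node groups and 20 provision steps.
key = "K" * 1700
conn.execute("INSERT INTO clusters VALUES (1, ?)", (key,))
conn.executemany("INSERT INTO node_groups (cluster_id) VALUES (?)", [(1,)] * 5)
conn.executemany("INSERT INTO provision_steps (cluster_id) VALUES (?)", [(1,)] * 20)

# Everything in one query: the two child collections form a cross product.
joined = conn.execute("""
    SELECT c.private_key, ng.id, ps.id
    FROM clusters c
    LEFT OUTER JOIN node_groups ng ON ng.cluster_id = c.id
    LEFT OUTER JOIN provision_steps ps ON ps.cluster_id = c.id
    WHERE c.id = 1
""").fetchall()
joined_bytes = sum(len(row[0]) for row in joined)

# One query per table: the large column crosses the wire exactly once.
cluster = conn.execute("SELECT private_key FROM clusters WHERE id = 1").fetchone()
node_group_ids = conn.execute("SELECT id FROM node_groups WHERE cluster_id = 1").fetchall()
step_ids = conn.execute("SELECT id FROM provision_steps WHERE cluster_id = 1").fetchall()
separate_bytes = len(cluster[0])

print("joined rows:", len(joined), "key bytes:", joined_bytes)
print("separate rows:", 1 + len(node_group_ids) + len(step_ids), "key bytes:", separate_bytes)
```

With 5 and 20 children the joined query returns 5 x 20 = 100 rows, each carrying its own copy of the key; adding further joined collections multiplies the count again.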
openstackgerrit | Iwona Kotlarska proposed openstack/sahara-dashboard master: Add export of node group templates https://review.openstack.org/485215 | 19:51 |
iwonka | ok, i think i fixed it | 19:51 |
jeremyfreudberg | iwonka, looks like it | 19:51 |
iwonka | cool | 19:52 |
iwonka | thanks for help | 19:52 |
*** anshul has joined #openstack-sahara | 19:52 | |
jeremyfreudberg | so mnaser, where do we go from here? what's the road to optimization? | 19:55 |
mnaser | jeremyfreudberg i think this is a case of "lets make one single query to make it super optimized" | 19:56 |
mnaser | but ending up with something even less optimized | 19:56 |
jeremyfreudberg | mnaser, i'm not super familiar with sqlalchemy - how do we make the query "not weird" again, is really what i mean. or how do we wrangle sqlalchemy into being smarter/dumber in the queries it constructs | 19:59 |
mnaser | jeremyfreudberg thats what im trying to figure out. it seems to carry a lot of info that is not used | 20:00 |
jeremyfreudberg | mnaser, i guess my other question is do we see these kinds of underlying queries with other services as well (like nova, etc). | 20:03 |
jeremyfreudberg | I can look into this at some point too | 20:03 |
mnaser | jeremyfreudberg i think the problem is like the fact there are such huge fields | 20:03 |
mnaser | like describe clusters; | 20:05 |
mnaser | you got all those giant TEXT fields | 20:05 |
mnaser | for mgmt keys and stuff | 20:05 |
jeremyfreudberg | mnaser, that's true, but also nova has some mediumtext fields, and i'm sure other services use fields like that too. so that can't be the whole story | 20:08 |
mnaser | i feel like its a bit of a contributing factor as well, i am trying to see how much data we're sending on the line | 20:09 |
jeremyfreudberg | yes, i'm sure it has some impact, but just not all the impact | 20:10 |
mnaser | jeremyfreudberg small math | 20:13 |
mnaser | that query generates 12521818 bytes of data | 20:13 |
mnaser | aka 12.52 megabytes | 20:13 |
mnaser | that's 12.5 megabytes of data being parsed on every cluster get call | 20:13 |
jeremyfreudberg | mnaser, how can i play around with these myself - is it from werkzeug profiler, something else ...? I might have missed that. but yeah, 12.5 megs is too much. | 20:15 |
tellesnobrega | mnaser, that is too much | 20:17 |
mnaser | jeremyfreudberg next gen debugging... loop spamming 'show full processlist' in shell | 20:18 |
mnaser | while hitting the api | 20:18 |
mnaser | :p | 20:18 |
mnaser | thats how i got that SQL query | 20:18 |
mnaser | https://github.com/openstack/sahara/blob/master/sahara/db/sqlalchemy/models.py#L71-L82 | 20:23 |
mnaser | lazy='joined' is the culprit here i think | 20:23 |
mnaser | im gonna see the nova db code | 20:24 |
jeremyfreudberg | mnaser, pretty interesting that the change was originally done without explanation too https://review.openstack.org/#/c/40391 | 20:26 |
jeremyfreudberg | that's always "nice" | 20:26 |
mnaser | yep just saw the git blame too.. | 20:27 |
mnaser | jeremyfreudberg spoke with nova folks | 20:28 |
mnaser | 0 implicit joins in the nova code base, only explicit joins | 20:29 |
mnaser | let me try removing the joins and seeing what sort of performance i end up with | 20:29 |
mnaser | (the code will still work, it will just generate an extra query if it needs it) | 20:29 |
jeremyfreudberg | cool | 20:30 |
mnaser | oh god | 20:33 |
mnaser | this is messy | 20:33 |
mnaser | jeremyfreudberg the reason behind the joins is the to_dict code stuff | 20:33 |
mnaser | would resolve all of them | 20:33 |
jeremyfreudberg | mnaser, oh | 20:37 |
mnaser | DetachedInstanceError: Parent instance <Cluster at 0x8a562d0> is not bound to a Session; lazy load operation of attribute 'node_groups' cannot proceed | 20:37 |
mnaser | getting this when removing the join | 20:37 |
tosky | uhm, so the join is needed maybe to retrieve that attribute? | 20:39 |
tosky | I know about SQL but I really don't know about sqlalchemy | 20:39 |
mnaser | tosky sqlalchemy can do a join to prepopulate data | 20:41 |
mnaser | alternatively it can avoid doing that and automatically make a select when that attribute is called | 20:41 |
mnaser | the current behaviour loads everything and sends all the data across from the db layer back | 20:41 |
tosky | argh | 20:41 |
mnaser | right now if we disable that lazyload | 20:41 |
mnaser | those fields wont be loaded on the query | 20:41 |
tosky | apart from normalization, the first thing that you learn in SQL is: filter first, then join and sort | 20:42 |
mnaser | and when they are passed to the conductor | 20:42 |
mnaser | they become 'detached' as in ... the conductor doesnt know that they're anything but dicts. | 20:42 |
mnaser | we had 11 joins. | 20:42 |
mnaser | in the cluster_get | 20:42 |
tosky | isn't it possible to filter the unneeded data first? | 20:42 |
mnaser | http://paste.openstack.org/show/619676/ which returns 12.5 mb | 20:43 |
mnaser | i mean i think the issue comes from the fact that there are huge fields being repeated | 20:43 |
mnaser | for example the mgmt private key | 20:43 |
mnaser | that query returns 2500 records on this small empty dev env | 20:43 |
mnaser | i think this would be quite the refactor gr | 20:43 |
mnaser | to pretty much remove all the joins and the db layer returns specific things | 20:44 |
mnaser | or migrate to https://docs.openstack.org/oslo.versionedobjects/latest/ | 20:44 |
tosky | or leave the join, but filtered first | 20:44 |
mnaser | even if its filtered, because the join creates more rows, the data gets multiplied | 20:44 |
tosky | shouldn't that depend on the type of the join? | 20:45 |
*** anshul has quit IRC | 20:45 | |
mnaser | i think it was all left outer joins it was doing | 20:45 |
mnaser | ideally i think the cluster_get should give you a cluster, nothing more, nothing less. | 20:45 |
mnaser | i want to see how nova does this | 20:45 |
*** lucasxu has quit IRC | 20:46 | |
*** esikachev has quit IRC | 20:46 | |
*** jeremyfreudberg has quit IRC | 20:48 | |
mnaser | https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L1943-L1961 | 20:49 |
mnaser | deally | 20:49 |
mnaser | ideally | 20:49 |
mnaser | this is what we should be doing | 20:49 |
mnaser | explicit joins on demand | 20:49 |
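A rough sketch of the "load related data only on demand" idea discussed above, again using an in-memory SQLite database. Everything here is hypothetical: `cluster_get`, its `columns_to_join` parameter, `RELATED_QUERIES`, `make_demo_db` and the schema are illustrative names, not Sahara's or nova's actual code. For simplicity the opt-in relations are fetched with one extra SELECT each rather than an explicit JOIN.

```python
import sqlite3

# Hypothetical mapping from a relationship name to the query that loads it.
RELATED_QUERIES = {
    "node_groups": (
        "SELECT id, name FROM node_groups WHERE cluster_id = ? ORDER BY id"
    ),
    "provision_steps": (
        "SELECT id, step_name FROM provision_steps WHERE cluster_id = ? ORDER BY id"
    ),
}


def cluster_get(conn, cluster_id, columns_to_join=()):
    """Return the cluster row alone; load relations only when asked for."""
    row = conn.execute(
        "SELECT id, name FROM clusters WHERE id = ?", (cluster_id,)
    ).fetchone()
    if row is None:
        return None
    cluster = {"id": row[0], "name": row[1]}
    for relation in columns_to_join:
        # One extra, narrow query per requested relation.
        cluster[relation] = conn.execute(
            RELATED_QUERIES[relation], (cluster_id,)
        ).fetchall()
    return cluster


def make_demo_db():
    """Build a tiny demo database (illustrative schema only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE clusters (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE node_groups (
            id INTEGER PRIMARY KEY, cluster_id INTEGER, name TEXT);
        CREATE TABLE provision_steps (
            id INTEGER PRIMARY KEY, cluster_id INTEGER, step_name TEXT);
        INSERT INTO clusters VALUES (1, 'demo');
        INSERT INTO node_groups VALUES (1, 1, 'master'), (2, 1, 'worker');
        INSERT INTO provision_steps VALUES (1, 1, 'spawn');
    """)
    return conn
```

A caller that only needs the cluster itself would call `cluster_get(conn, 1)`, while one that renders node groups would pass `columns_to_join=("node_groups",)` and pay for that data only then.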
*** chlong has quit IRC | 20:53 | |
tellesnobrega | mnaser, that sounds like quite a bit of work and probably should be done soon | 21:12 |
mnaser | tellesnobrega yeah. i think ill pick it up. it shouldn't be too hard | 21:13 |
mnaser | just moving the logic | 21:13 |
tellesnobrega | mnaser, awesome | 21:13 |
tellesnobrega | let me know if I can be of any assistance | 21:13 |
mnaser | tellesnobrega thanks | 21:14 |
mnaser | hopefully the unit tests help | 21:14 |
*** chlong has joined #openstack-sahara | 21:26 | |
*** chlong has quit IRC | 21:31 | |
*** rcernin has quit IRC | 21:32 | |
*** chlong has joined #openstack-sahara | 21:43 | |
*** ssmith has quit IRC | 21:51 | |
openstackgerrit | Mohammed Naser proposed openstack/sahara master: Stop unnecessary joins and prefer lazy loading https://review.openstack.org/498611 | 22:27 |
mnaser | step #1 ^ | 22:27 |
openstackgerrit | Iwona Kotlarska proposed openstack/sahara-dashboard master: Add export of cluster templates to UI https://review.openstack.org/498612 | 22:31 |
*** iwonka_ has joined #openstack-sahara | 22:47 | |
*** iwonka has quit IRC | 22:47 | |
mnaser | woo | 22:53 |
mnaser | with that patch | 22:53 |
mnaser | a cluster get | 22:53 |
mnaser | 3242182 function calls (3188882 primitive calls) in 4.226 seconds | 22:53 |
mnaser | down from 8433826 | 22:53 |
*** iwonka has joined #openstack-sahara | 22:53 | |
*** iwonka_ has quit IRC | 22:57 | |
mnaser | went from 2500 rows to 780 | 22:57 |
mnaser | 4 fewer joins | 22:57 |
tosky | wow | 23:00 |
tosky | and apparently no changes in features? | 23:00 |
mnaser | tosky yep. if it needs to make the query, it'll just make one extra query | 23:04 |
mnaser | im working on figuring out the best way for the second one now | 23:05 |
*** tosky has quit IRC | 23:30 | |
*** chlong has quit IRC | 23:34 |