*** dimtruck is now known as zz_dimtruck | 00:02 | |
*** zz_dimtruck is now known as dimtruck | 00:23 | |
*** mriedem_away is now known as mriedem | 00:42 | |
*** dimtruck is now known as zz_dimtruck | 01:07 | |
*** mriedem has quit IRC | 01:13 | |
*** f13o has quit IRC | 01:31 | |
*** markvoelker has quit IRC | 01:46 | |
*** badari has quit IRC | 01:54 | |
*** badari has joined #openstack-performance | 02:08 | |
*** badari has quit IRC | 02:14 | |
*** arnoldje has joined #openstack-performance | 02:20 | |
*** rfolco has joined #openstack-performance | 02:21 | |
*** rfolco has quit IRC | 02:51 | |
*** badari has joined #openstack-performance | 05:19 | |
*** arnoldje has quit IRC | 05:25 | |
*** badari has quit IRC | 05:33 | |
*** harlowja_at_home has joined #openstack-performance | 05:57 | |
*** dims has quit IRC | 06:09 | |
*** dims has joined #openstack-performance | 06:14 | |
*** badari has joined #openstack-performance | 06:14 | |
*** badari has quit IRC | 06:22 | |
*** harlowja_at_home has quit IRC | 06:34 | |
*** mlgrneff has joined #openstack-performance | 08:54 | |
*** xek has joined #openstack-performance | 09:07 | |
*** amaretskiy has joined #openstack-performance | 09:22 | |
*** itsuugo has quit IRC | 10:03 | |
*** itsuugo has joined #openstack-performance | 10:05 | |
*** rfolco has joined #openstack-performance | 10:19 | |
*** mlgrneff has quit IRC | 11:37 | |
*** mlgrneff has joined #openstack-performance | 11:37 | |
*** regXboi has joined #openstack-performance | 12:25 | |
*** markvoelker_ has joined #openstack-performance | 13:49 | |
*** arnoldje has joined #openstack-performance | 14:03 | |
*** mdorman has joined #openstack-performance | 14:11 | |
*** markvoelker has joined #openstack-performance | 14:34 | |
*** markvoelker_ has quit IRC | 14:35 | |
*** mriedem has joined #openstack-performance | 14:36 | |
*** badari has joined #openstack-performance | 14:41 | |
*** arnoldje has quit IRC | 14:41 | |
*** mdorman has quit IRC | 14:46 | |
kun_huang | Is our meeting 5 minutes later or one hour later? | 14:54 |
DinaBelova | kun_huang - sadly I did not resend the email to collect feedback on that step - so in 5 mins | 14:55 |
DinaBelova | I'll send the proposal today | 14:55 |
DinaBelova | kun_huang - I forgot to do that, as I was suffering from a fever last week :( | 14:55 |
kun_huang | DinaBelova: welcome back now :) | 14:56 |
DinaBelova | kun_huang - yeah! | 14:56 |
*** manand has joined #openstack-performance | 14:58 | |
DinaBelova | #startmeeting Performance Team | 15:00 |
openstack | Meeting started Tue Dec 1 15:00:05 2015 UTC and is due to finish in 60 minutes. The chair is DinaBelova. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:00 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:00 |
openstack | The meeting name has been set to 'performance_team' | 15:00 |
DinaBelova | hello everyone! :) | 15:00 |
DinaBelova | raise your hands who's around :) | 15:00 |
kun_huang | o/ | 15:00 |
amaretskiy | hi | 15:00 |
DinaBelova | klindgren, harlowja? :) | 15:00 |
DinaBelova | hehe, people sleeping :) | 15:01 |
mlgrneff | hi, it's Rob from Flex Ciii | 15:01 |
DinaBelova | mlgrneff - hey! | 15:01 |
DinaBelova | nice to see u here | 15:01 |
mlgrneff | Glad to finally make it. | 15:01 |
DinaBelova | so let's wait for a few more moments :) to allow people to appear | 15:02 |
DinaBelova | mlgrneff ;) | 15:02 |
manand | hi | 15:02 |
DinaBelova | ok, so let's probably start. As usual with today's agenda | 15:03 |
DinaBelova | #link https://wiki.openstack.org/wiki/Meetings/Performance#Agenda_for_next_meeting | 15:03 |
DinaBelova | #topic Action Items | 15:03 |
DinaBelova | andreykurilin are you around? | 15:03 |
DinaBelova | we have an action item on you | 15:03 |
andreykurilin | sure | 15:03 |
DinaBelova | from last meeting | 15:03 |
*** mlgrneff is now known as RobNeff | 15:03 | |
andreykurilin | I'm here:) | 15:03 |
andreykurilin | hi | 15:03 |
DinaBelova | about when load pattern tests runner will be available in Rally | 15:04 |
DinaBelova | I know there is already some change for it on review - any feeling when it'll be merged? | 15:04 |
*** rvasilets___ has joined #openstack-performance | 15:04 | |
DinaBelova | #link https://review.openstack.org/#/c/234195/ | 15:04 |
*** AugieMena has joined #openstack-performance | 15:04 | |
DinaBelova | kun_huang or I may ask you as well - as a reviewer :) | 15:05 |
rvasilets___ | o/ | 15:05 |
DinaBelova | any bad/good feeling on this patch? | 15:05 |
DinaBelova | rvasilets___ o/ | 15:05 |
andreykurilin | rvasilets is the owner of this patch ) | 15:05 |
andreykurilin | he can give better estimates | 15:05 |
DinaBelova | well, his progress depends much on the reviewers :) | 15:05 |
DinaBelova | rvasilets___? | 15:05 |
DinaBelova | I remember this change was hot enough last time - any estimates on merging? | 15:06 |
rvasilets___ | The patch is ready. Just waiting for review. It already works | 15:06 |
*** Jeff__ has joined #openstack-performance | 15:06 | |
DinaBelova | ok, so I'm right and reviewing effort needs to be spent :) | 15:07 |
andreykurilin | The general news is: the patch by rvasilets adds the basic "stress runner" ability, but it doesn't decrease load on SLA failure | 15:07 |
rvasilets___ | pboldin and kun have reviewed it. That is all | 15:07 |
kun_huang | DinaBelova: I will update my review on new patch set ;) | 15:07 |
DinaBelova | andreykurilin - well, we need to do first steps first :) | 15:07 |
DinaBelova | kun_huang, ok, thanks sir :) | 15:07 |
andreykurilin | DinaBelova: I don't know anyone who works on the second step | 15:07 |
DinaBelova | so I'm waiting for it and really hoping to see it merged as a first step - we can add a work item for the second one | 15:08 |
DinaBelova | andreykurilin ack | 15:08 |
DinaBelova | cool, so let's go further | 15:08 |
*** dims has quit IRC | 15:08 | |
DinaBelova | so about devstack-gate n-api-meta job | 15:08 |
DinaBelova | there is an email to openstack-operators | 15:08 |
DinaBelova | lemme check it | 15:08 |
DinaBelova | mriedem proposed one job to run metadata as well | 15:09 |
DinaBelova | hm, cannot find the email in the archives | 15:10 |
DinaBelova | I'll check it later | 15:10 |
DinaBelova | so anyway, action item is done | 15:10 |
mriedem | neutron large ops | 15:10 |
DinaBelova | mriedem - yep, thank you sir | 15:10 |
DinaBelova | and about my action items - done both | 15:10 |
DinaBelova | really encouraging you guys to review https://review.openstack.org/#/c/251343/ | 15:11 |
DinaBelova | and all chain related | 15:11 |
*** pglass has joined #openstack-performance | 15:11 | |
DinaBelova | andreykurilin, harlowja, kun_huang - please feel free to leave comments :) | 15:11 |
DinaBelova | ok, cool, so that's it for action items | 15:11 |
kun_huang | got it | 15:12 |
DinaBelova | any questions regarding this topic? | 15:12 |
andreykurilin | my todo list is extended :) will review your patches a little bit later | 15:12 |
DinaBelova | andreykurilin ;) | 15:12 |
DinaBelova | cool, thanks andreykurilin kun_huang | 15:12 |
DinaBelova | let's move forward | 15:12 |
DinaBelova | #topic Nova-conductor performance issues | 15:12 |
DinaBelova | ok, so we have some news on that field :) | 15:12 |
DinaBelova | klindgren has sent me the results of the following patch | 15:13 |
DinaBelova | #link http://paste.openstack.org/show/480426/ | 15:13 |
DinaBelova | it's dumping the conductor workers info once a minute | 15:13 |
DinaBelova | visual results may be found here | 15:13 |
DinaBelova | #link https://drive.google.com/a/mirantis.com/file/d/0ByRtVrZu5ifzUkZTQVZzMERYQWc/view | 15:13 |
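The dump patch klindgren ran (the paste linked above) is not reproduced here; as a rough illustration only, a conductor worker could periodically dump the stacks of all its greenlets along the lines of the sketch below. This is an assumption about what such an instrumentation patch might look like, not the actual code from the paste, and the function names are invented.

```python
# Hypothetical sketch of a once-a-minute greenlet stack dump for an
# eventlet-based worker such as nova-conductor.
import gc
import logging
import traceback

import eventlet
import greenlet

LOG = logging.getLogger(__name__)


def _dump_greenlets():
    """Log the current stack of every live greenlet in this process."""
    for obj in gc.get_objects():
        if isinstance(obj, greenlet.greenlet) and obj.gr_frame is not None:
            stack = ''.join(traceback.format_stack(obj.gr_frame))
            LOG.info("greenlet %r:\n%s", obj, stack)


def start_periodic_dump(interval=60):
    """Spawn a background greenthread that dumps worker state each minute."""
    def _loop():
        while True:
            eventlet.sleep(interval)
            _dump_greenlets()

    eventlet.spawn_n(_loop)
```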
DinaBelova | and today dims and I had a chance to take a look | 15:14 |
kun_huang | opening it... | 15:14 |
DinaBelova | frankly speaking it's not super obvious... | 15:14 |
DinaBelova | there are dumps of 2 workers there | 15:14 |
*** dansmith has joined #openstack-performance | 15:14 | |
DinaBelova | i suspect that some of the reds are related to the bug we have observed in MOS as well - https://bugs.launchpad.net/mos/+bug/1380220 - this is related to the eventlet red svg boxes (that can be seen on the first picture from the conductor worker with pid 8401) - we did not observe this bug before in the community, so it looks like we need to file it upstream as well (related to the heartbeats probably) | 15:14 |
openstack | Launchpad bug 1380220 in Mirantis OpenStack "OpenStack services excessively poll socket events when oslo.messaging is used" [Medium,Triaged] - Assigned to MOS QA Team (mos-qa) | 15:14 |
DinaBelova | we have seen the same behavior as 8401 worker on some of the MOS installations | 15:15 |
johnthetubaguy | do we know what DB driver they are using for the nova-conductor? | 15:15 |
*** rpodolyaka has joined #openstack-performance | 15:15 | |
*** ctrath has joined #openstack-performance | 15:15 | |
DinaBelova | johnthetubaguy - it's not listed in the https://etherpad.openstack.org/p/remote-conductor-performance | 15:15 |
DinaBelova | hm... | 15:15 |
DinaBelova | I hope klindgren will appear :) | 15:16 |
mriedem | mysql-python i think | 15:16 |
DinaBelova | I know it's a bit early for him now | 15:16 |
johnthetubaguy | would be good to check if its pymysql | 15:16 |
mriedem | b/c it's older oslo.db pre-pymysql | 15:16 |
mriedem | johnthetubaguy: i don't think it is | 15:16 |
johnthetubaguy | mriedem: ah, that is what I was wondering | 15:16 |
dansmith | mriedem: they could still be running it with the older one, right? | 15:16 |
dansmith | it's just the default changed.. | 15:16 |
mriedem | oslo.db==1.7.1, MySQL-python==1.2.3 (kilo reqs are: oslo.db<1.8.0,>=1.7.0) | 15:16 |
DinaBelova | ah, stop-stop | 15:16 |
DinaBelova | oslo.db==1.7.1, MySQL-python==1.2.3 (kilo reqs are: oslo.db<1.8.0,>=1.7.0) | 15:16 |
DinaBelova | johnthetubaguy ^^ | 15:16 |
kun_huang | johnthetubaguy: has anyone deployed pymysql yet? | 15:16 |
dansmith | johnthetubaguy: either way, does that impact the messaging performance? | 15:16 |
DinaBelova | mriedem, yep, thanks | 15:16 |
mriedem | dansmith: it might impact eventlet, is probably why people are asking | 15:17 |
mriedem | since mysql-python doesn't support eventlet right? | 15:17 |
johnthetubaguy | dansmith: not sure what you mean by messaging performance | 15:17 |
johnthetubaguy | mriedem: yeah, each DB call locks up the whole thread | 15:17 |
dansmith | mriedem: it will affect eventlet, but the hotspots are banging hard on rabbit sockets, which doesn't seem like it would be related to the db driver | 15:17 |
DinaBelova | but still the 8401 worker is having the https://bugs.launchpad.net/mos/+bug/1380220 issue - but in fact it should not influence the %CPU used | 15:17 |
openstack | Launchpad bug 1380220 in Mirantis OpenStack "OpenStack services excessively poll socket events when oslo.messaging is used" [Medium,Triaged] - Assigned to MOS QA Team (mos-qa) | 15:17 |
johnthetubaguy | and eventlet lets each one in turn do its DB call, before letting it process the response, I am told | 15:18 |
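For readers following along: the driver question matters because MySQL-python is a C extension whose blocking calls eventlet cannot interrupt, while PyMySQL is pure Python and cooperates with eventlet's monkey-patched sockets. A minimal sketch of the difference at the SQLAlchemy level (generic connection URLs, not this deployment's configuration) might look like:

```python
# Illustration only: how the DB driver choice interacts with eventlet.
import eventlet
eventlet.monkey_patch()  # pure-Python socket I/O now yields to other greenthreads

import sqlalchemy

# MySQL-python (C extension): each query blocks the whole eventlet hub,
# so other greenthreads in the worker stall until it returns.
blocking_engine = sqlalchemy.create_engine("mysql://nova:secret@db-host/nova")

# PyMySQL (pure Python): queries go through monkey-patched sockets, so the
# hub can schedule other greenthreads while a query waits on the wire.
green_engine = sqlalchemy.create_engine("mysql+pymysql://nova:secret@db-host/nova")
```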
DinaBelova | interesting moment is with 8402 | 15:18 |
DinaBelova | lots of RabbitMQ-related timeouts | 15:18 |
johnthetubaguy | dansmith: ah, so my head is mush today, and I can't even open the files somehow | 15:18 |
dansmith | johnthetubaguy: right so what DinaBelova is talking about right now ... doesn't seem db driver related to me | 15:18 |
DinaBelova | dansmith - yep.. | 15:19 |
dansmith | johnthetubaguy: and that's what I observed when looking at their profile traces a couple weeks ago.. something seems to be banging really hard on rabbit | 15:19 |
DinaBelova | and the only feeling I have now is that we need to check that their CPU load is really related to RabbitMQ | 15:19 |
dansmith | johnthetubaguy: I thought maybe it was heartbeats or something, but they say that's disabled | 15:19 |
DinaBelova | so we need more dumps + top screens | 15:19 |
dansmith | DinaBelova: ++ | 15:19 |
DinaBelova | if yes | 15:19 |
*** bauzas has joined #openstack-performance | 15:19 | |
DinaBelova | one possible variant to fix it is rabbitmq upgrade 0_o | 15:19 |
DinaBelova | or some tcp / whatever tuning... | 15:20 |
DinaBelova | as we simply see RabbitMQ waiting to read things on the wire - and that's it | 15:20 |
DinaBelova | if all their CPU issues are about this - well, it'll be another (one more) RabbitMQ story | 15:20 |
DinaBelova | they have RabbitMQ 3.3.x | 15:20 |
johnthetubaguy | dansmith: ah, interesting, that does seem separate, unless eventlet is making it lie | 15:20 |
DinaBelova | johnthetubaguy - yep, sir | 15:21 |
DinaBelova | so I'll ask klindgren to make more dumps for more workers and for longer time + include tops for the same moments | 15:21 |
johnthetubaguy | dansmith: I know belliott hit some issues with DB locking sending the elapsed times crazy, but I don't remember the details now | 15:21 |
dansmith | johnthetubaguy: yeah, eventlet could be making it lie for sure, I'm just not sure that the db driver could be making the numbers look like they're elsewhere | 15:21 |
mriedem | dansmith: didn't they see a gain when turning off ssl too? | 15:21 |
dansmith | mriedem: a small gain | 15:21 |
dansmith | mriedem: the first big gain was because they mistyped the config :/ | 15:22 |
DinaBelova | #action DinaBelova klindgren more dumps for more workers and for longer time + include tops for the same moments to ensure we see the same RabbitMQ related reds at the same time conductors CPU is going crazy | 15:22 |
DinaBelova | mriedem - without ssl it was just a little drop | 15:22 |
johnthetubaguy | dansmith: oh, wait, this does sound like what brian found, you want to reduce the number of eventlet works, and it stops thrashing the hub | 15:22 |
*** rohanion has joined #openstack-performance | 15:22 | |
dansmith | johnthetubaguy: s/works/workers/ ? | 15:22 |
*** harlowja_at_home has joined #openstack-performance | 15:23 | |
johnthetubaguy | yeah, sorry | 15:23 |
johnthetubaguy | workers | 15:23 |
dansmith | johnthetubaguy: can you explain more? | 15:23 |
johnthetubaguy | well, greenlet threads I mean | 15:23 |
harlowja_at_home | \o | 15:23 |
johnthetubaguy | so I think eventlet got very busy scheduling between lots of active threads | 15:23 |
*** f13o has joined #openstack-performance | 15:23 | |
DinaBelova | harlowja_at_home o/ | 15:23 |
johnthetubaguy | so we turned down the number of workers (this was on the scheduler, rather than conductor) | 15:23 |
johnthetubaguy | and we found it was better at pushing through DB queries, when using mysql-python | 15:24 |
*** arnoldje has joined #openstack-performance | 15:24 | |
dansmith | hmm | 15:24 |
mriedem | scheduler workers? | 15:24 |
dansmith | johnthetubaguy: that means more queuing in rabbit instead of queuing in memory on a conductor itself? | 15:24 |
dansmith | mriedem: greenlet workers | 15:24 |
mriedem | oh | 15:24 |
DinaBelova | johnthetubaguy - interesting, could you please write your proposal up somewhere in https://etherpad.openstack.org/p/remote-conductor-performance | 15:24 |
johnthetubaguy | well, more waiting to be restored, while the thread is busy doing DB stuff | 15:24 |
dansmith | johnthetubaguy: well, what I mean is, | 15:25 |
johnthetubaguy | as it lets all the DB stuff happen before letting the python code process the response, or something like that? | 15:25 |
DinaBelova | johnthetubaguy - in fact, according to the cprofile data, DB operations were not so busy | 15:25 |
dansmith | johnthetubaguy: we don't dequeue a thousand things from rabbit and then try to balance them even though we can only do one at a time | 15:25 |
DinaBelova | but who knows | 15:25 |
dansmith | DinaBelova: right, that's fine | 15:25 |
dansmith | DinaBelova: they wouldn't in this case johnthetubaguy is talking about | 15:25 |
DinaBelova | dansmith - yeah, i just understood it | 15:26 |
DinaBelova | dansmith thanks | 15:26 |
johnthetubaguy | it's more a starvation issue, as I understood it | 15:26 |
dansmith | yeah, I guess I can see that | 15:26 |
dansmith | the thing is, | 15:26 |
dansmith | they don't have much if any real db traffic needing to be services | 15:26 |
dansmith | er, serviced | 15:26 |
dansmith | so I'm not sure why there would be a pile of requests needing a pile of threads | 15:27 |
dansmith | basically, just periodics and service checkins | 15:27 |
dansmith | their cloud is otherwise mostly idle | 15:27 |
DinaBelova | they have lots of nova metadata requests | 15:27 |
mriedem | isn't cells always syncing up too? | 15:27 |
DinaBelova | due to periodical puppet scripts running | 15:27 |
johnthetubaguy | mriedem: nova via conductor though | 15:27 |
dansmith | DinaBelova: that's true, forgot about those | 15:27 |
mriedem | johnthetubaguy: ok | 15:28 |
DinaBelova | and they sure have the cache as well, but metadata is periodically knocking the conductor | 15:28 |
mriedem | which is why we talked about turning on n-api-meta in one of the large ops jobs | 15:28 |
dansmith | so anyway, seems like worth a try | 15:28 |
johnthetubaguy | honestly, those service updates are all blocking DB calls, but they should be quick-ish though | 15:28 |
DinaBelova | mriedem precisely | 15:28 |
*** zz_dimtruck is now known as dimtruck | 15:28 | |
*** markvoelker_ has joined #openstack-performance | 15:28 | |
DinaBelova | johnthetubaguy - may you please add your idea to the https://etherpad.openstack.org/p/remote-conductor-performance just to have it written up? | 15:29 |
johnthetubaguy | ah... n-api-meta uses conductor... I never quite realised that | 15:29 |
johnthetubaguy | DinaBelova: will do | 15:29 |
dansmith | johnthetubaguy: yeah, so you can have a db-less compute node | 15:29 |
DinaBelova | johnthetubaguy - thank you sir | 15:29 |
DinaBelova | so yeah | 15:29 |
DinaBelova | we need more data! | 15:29 |
DinaBelova | will ping klindgren after the meeting :) | 15:29 |
DinaBelova | and I guess we may leave this topic for a while | 15:29 |
DinaBelova | let's move forward | 15:29 |
johnthetubaguy | dansmith: yeah, only just made that connection, oops | 15:29 |
DinaBelova | #topic Some hardware to reproduce the issues | 15:30 |
DinaBelova | kun_huang - your topic, sir | 15:30 |
kun_huang | oh | 15:30 |
kun_huang | since we are talking about performance issues every day | 15:30 |
DinaBelova | could you please explain what you mean here? do you have the HW or do you want to have some? | 15:30 |
DinaBelova | :) | 15:30 |
kun_huang | I have some | 15:31 |
kun_huang | and want to make good use of them | 15:31 |
DinaBelova | kun_huang wow, that will be simply perfect | 15:31 |
DinaBelova | and that will make it easier to debug possible issues, etc | 15:31 |
*** markvoelker has quit IRC | 15:32 | |
kun_huang | so my first question: who needs those first | 15:32 |
kun_huang | I know intel & rackspace have a public lab in the U.S. | 15:32 |
kun_huang | Has everyone used their resources? | 15:32 |
*** mdorman has joined #openstack-performance | 15:33 | |
*** markvoelker_ has quit IRC | 15:33 | |
DinaBelova | kun_huang - they have a pretty big env, yes. but this lab has a competing schedule between the people who want to use it afaik | 15:33 |
DinaBelova | we don't use it for now | 15:34 |
DinaBelova | it or any other big labs | 15:34 |
kun_huang | DinaBelova: I'll apply for some resources from my company | 15:34 |
kun_huang | at least, my boss supports this idea | 15:34 |
DinaBelova | kun_huang - that is very promising, thanks! | 15:35 |
kun_huang | and I need to write some materials... | 15:35 |
kun_huang | some paper work | 15:35 |
DinaBelova | in case of success - could you please write up some instructions | 15:35 |
DinaBelova | oh yeah :) | 15:35 |
DinaBelova | kun_huang - nobody loves it :) | 15:35 |
DinaBelova | kun_huang thanks in advance! | 15:35 |
*** mdorman has quit IRC | 15:35 | |
DinaBelova | kun_huang - the resources we're using inside mirantis sadly are for mirantis usage only... but we can extend the test plans based on people's opinions | 15:36 |
DinaBelova | I hope to solve the issue of where to place these documents with the TC | 15:36 |
DinaBelova | and then we'll start feedback collection | 15:36 |
DinaBelova | from you and others | 15:36 |
DinaBelova | really hope to make this stuff clearer this week | 15:37 |
DinaBelova | kun_huang - once more time - thanks for your effort | 15:37 |
* regXboi wanders in late | 15:37 | |
DinaBelova | regXboi o/ | 15:37 |
DinaBelova | regXboi I PROMISE to send an email with the suggestion to move the meeting start time by +1 hour | 15:37 |
DinaBelova | I feel people are suffering | 15:37 |
harlowja_at_home | :) | 15:37 |
kun_huang | okay, I'll keep this channel posted with any update | 15:37 |
DinaBelova | kun_huang thanks! | 15:38 |
kun_huang | or I need any help | 15:38 |
DinaBelova | sure, feel free to ping me | 15:38 |
* harlowja_at_home is suffering from not enough coffee | 15:38 | |
DinaBelova | harlowja_at_home :d | 15:38 |
DinaBelova | lol | 15:38 |
DinaBelova | #topic OSProfiler weekly update | 15:38 |
kun_huang | good morning guys harlowja_at_home regXboi | 15:38 |
DinaBelova | k, so let's go to the profiler | 15:38 |
* harlowja_at_home coffeeeeeee | 15:38 | |
regXboi | DinaBelova: my problem is that I've got too many meetings stacking up on each other :( | 15:38 |
* regXboi skims scrollback | 15:39 | |
DinaBelova | regXboi, yes, sir :( | 15:39 |
DinaBelova | that's the issue | 15:39 |
DinaBelova | timeframes comfortable for both US and Europeans are overcrouded | 15:39 |
DinaBelova | crowded* | 15:39 |
DinaBelova | :( | 15:39 |
regXboi | ack | 15:39 |
DinaBelova | so going back to osprofiler - harlowja_at_home - the chain https://review.openstack.org/#/c/251343/ is pretty much done | 15:39 |
DinaBelova | so I need reviews! | 15:40 |
DinaBelova | lol | 15:40 |
harlowja_at_home | cool beans, i'll check it out | 15:40 |
DinaBelova | and I need you to finish https://review.openstack.org/#/c/246116/ :) | 15:40 |
harlowja_at_home | it will be my today mission :) | 15:40 |
DinaBelova | so I can play with ELK for 100% here :) | 15:40 |
harlowja_at_home | yes ma'm | 15:40 |
DinaBelova | harlowja_at_home ack | 15:40 |
DinaBelova | :) | 15:40 |
harlowja_at_home | u are playing with elk? | 15:40 |
harlowja_at_home | is that a thing people do in europe? | 15:40 |
DinaBelova | #action harlowja_at_home review https://review.openstack.org/#/c/251343/ | 15:40 |
DinaBelova | harlowja_at_home :D | 15:40 |
DinaBelova | that's what tough Russian wives are doing in the meanwhile | 15:41 |
DinaBelova | lol | 15:41 |
harlowja_at_home | ;) | 15:41 |
DinaBelova | so speaking seriously - I want to continue working on this direction | 15:41 |
harlowja_at_home | dancing with elk (the new movie, based off dancing with wolves) | 15:41 |
DinaBelova | of adding more logging and analysis opportunities | 15:42 |
DinaBelova | :) | 15:42 |
harlowja_at_home | def | 15:42 |
DinaBelova | so I'm kindly asking you to polish your change | 15:42 |
harlowja_at_home | sure thing | 15:42 |
DinaBelova | and I'll be able to go with ELK here and make some experiments | 15:42 |
harlowja_at_home | (only if i get to dance with elk to) | 15:42 |
DinaBelova | harlowja_at_home thank you sir | 15:42 |
DinaBelova | :D | 15:42 |
DinaBelova | about spec news | 15:43 |
regXboi | DinaBelova: is there a plan to extend osprofiler deeper into what it tracks?:) | 15:43 |
harlowja_at_home | https://review.openstack.org/#/c/103825/ also btw, but dims had some questions there... | 15:43 |
harlowja_at_home | maybe boris can followup on 103825 (or other person?) | 15:43 |
regXboi | er s/plan/patch/ | 15:43 |
DinaBelova | regXboi - not patch but plans :) | 15:43 |
harlowja_at_home | 103825 is also somewhat ambiguous about whether it wants oslo to adopt osprofiler | 15:43 |
harlowja_at_home | it'd be nice if that was like stated (is that a goal?) | 15:44 |
DinaBelova | harlowja_at_home - indeed | 15:44 |
DinaBelova | harlowja_at_home yes, it is | 15:44 |
DinaBelova | Boris is currently communicating with dims about his concerns | 15:44 |
harlowja_at_home | k | 15:44 |
DinaBelova | as right now the development approach is a bit different | 15:44 |
DinaBelova | from what dims is asking about | 15:44 |
DinaBelova | the issue is that 2 years ago Boris got -2 on his patch to oslo.messaging and oslo.db | 15:45 |
harlowja_at_home | ya | 15:45 |
* regXboi wonders about decoration | 15:45 | |
DinaBelova | with that profiling thing | 15:45 |
harlowja_at_home | DinaBelova, it's been boris' life goal to get that merged | 15:45 |
harlowja_at_home | before boris retires he might get it merged | 15:45 |
DinaBelova | regXboi - you can use decoration now everywhere already | 15:45 |
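For context, the decorator-style profiling being discussed looks roughly like the sketch below; the method names follow the osprofiler documentation (profiler.init, profiler.trace, profiler.trace_cls), but exact signatures may vary between releases, and the function and class names here are made up for illustration.

```python
# Minimal sketch of osprofiler's decorator-based tracing; illustrative only.
from osprofiler import profiler

profiler.init(hmac_key="SECRET_KEY")  # start a trace context for this request


@profiler.trace("db-call", info={"db.statement": "SELECT 1"})
def get_instance(instance_id):
    # time spent here is recorded as a "db-call" span in the trace
    return {"id": instance_id}


@profiler.trace_cls("api")
class InstanceAPI(object):
    # every public method of the class is wrapped in an "api" span
    def show(self, instance_id):
        return get_instance(instance_id)
```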
DinaBelova | harlowja_at_home :D | 15:45 |
harlowja_at_home | boris the old, lol | 15:45 |
harlowja_at_home | hopefully before then it will merge | 15:46 |
regXboi | DinaBelova: yes, but it's not mentioned that I could see in 103825 | 15:46 |
DinaBelova | regXboi, hm, lemme check | 15:46 |
DinaBelova | it was there I believe | 15:46 |
DinaBelova | oh | 15:46 |
* harlowja_at_home remembers boris trying decoration, people still complain about random crap (like oh decoration will add code... blah blah) | 15:46 | |
regXboi | decoration can be a dependent add-on patch | 15:47 |
harlowja_at_home | anyways, let's work through these weird issues that people have, and finally get it in (i hope) | 15:47 |
DinaBelova | regXboi - it has disappeared | 15:47 |
regXboi | but folks are proposing decoration profiling in other projects | 15:47 |
regXboi | so it's silly not to have it here | 15:47 |
DinaBelova | regXboi - agreed | 15:47 |
regXboi | but - in order to get this merged | 15:47 |
regXboi | let's save that for a follow on :) | 15:47 |
DinaBelova | #action boris-42 add information about ways of profiling to 103825 | 15:47 |
DinaBelova | harlowja_at_home agreed | 15:47 |
DinaBelova | :D | 15:48 |
regXboi | keep it short and simple ( the OpenStack KISS :) ) | 15:48 |
DinaBelova | :) | 15:48 |
harlowja_at_home | merging before boris retires would be superb to, lol | 15:48 |
DinaBelova | ok, so anything else here about osprofiler for now? | 15:48 |
harlowja_at_home | it needs a dancing with elk logo | 15:48 |
DinaBelova | :D | 15:48 |
regXboi | I like that | 15:48 |
* harlowja_at_home can't draw though | 15:48 | |
regXboi | and I'd want *that* patch :) | 15:48 |
DinaBelova | ok, so open discussion | 15:49 |
DinaBelova | #topic Open Discussion | 15:49 |
DinaBelova | and u can joke here :D | 15:49 |
* harlowja_at_home i never joke | 15:49 | |
harlowja_at_home | i'm always serious | 15:49 |
DinaBelova | harlowja_at_home I suspected that | 15:49 |
DinaBelova | :d | 15:49 |
DinaBelova | ok, so any topics to cover? | 15:50 |
DinaBelova | ideas to share? | 15:50 |
harlowja_at_home | https://en.wikipedia.org/wiki/Dances_with_Wolves (the dancing with wolves movie) btw | 15:50 |
DinaBelova | possible work items to add here? https://etherpad.openstack.org/p/perf-zoom-zoom | 15:50 |
harlowja_at_home | so about that rally upload results, public website thing | 15:50 |
DinaBelova | RobNeff - probably you can share something? | 15:51 |
manand | we discussed figuring out the ratio of controllers to compute nodes a few weeks ago, is that a topic worth discussing? | 15:51 |
DinaBelova | harlowja_at_home, yep, sir? | 15:51 |
harlowja_at_home | do people think we should try to do that, or save it for later... | 15:51 |
DinaBelova | manand - yes, it is | 15:51 |
DinaBelova | manand - we just have not collected much response yet | 15:51 |
DinaBelova | probably I'll need to refresh the discussion pinging some of the operators directly | 15:51 |
* harlowja_at_home would really like a way for people to get involved, uploading there useful results with some metadata, and periodically do this, so that we as a community can gather data about each other, and use it for trending, analysis... | 15:52 | |
DinaBelova | harlowja_at_home - I think it's useful for sure | 15:52 |
DinaBelova | the only thing is that it's not only about rally | 15:52 |
DinaBelova | anything in fact may land there | 15:52 |
RobNeff | Do you have a how-to on the Rally Upload yet? | 15:52 |
andreykurilin | `so about that rally upload results, public website thing` I like this idea | 15:52 |
harlowja_at_home | DinaBelova, sure, although kitty-kat pictures hopefully aren't uploaded | 15:52 |
DinaBelova | harlowja_at_home :D | 15:52 |
DinaBelova | RobNeff - we do not have this web site yet | 15:53 |
DinaBelova | but we think it's a good idea | 15:53 |
DinaBelova | important moment here | 15:53 |
DinaBelova | rally results mean nothing without the cloud topology shared | 15:53 |
DinaBelova | :( | 15:53 |
DinaBelova | so people need to be open enough to share some of the details | 15:53 |
DinaBelova | harlowja_at_home - do you think it'll be possible? | 15:54 |
harlowja_at_home | agreed the critical part is open-enough | 15:54 |
harlowja_at_home | i think we have to start by letting people upload what they can, and we can improve on uploading what is better | 15:54 |
harlowja_at_home | *with more metadata about there topology... | 15:54 |
regXboi | DinaBelova: did I forget to point you at https://etherpad.openstack.org/p/hyper-scale ? | 15:54 |
DinaBelova | regXboi - yep :) | 15:54 |
regXboi | and if I did - I'm sorry | 15:54 |
DinaBelova | will go through it | 15:54 |
regXboi | that's open to all to go look at and read | 15:54 |
harlowja_at_home | but initially i think we need to just get people to upload the basics, and as they get less 'scared' or whatever they can upload more info | 15:54 |
DinaBelova | #action DinaBelova go through the https://etherpad.openstack.org/p/hyper-scale | 15:55 |
*** dims has joined #openstack-performance | 15:55 | |
regXboi | the back half is neutron specific | 15:55 |
DinaBelova | regXboi thanks! | 15:55 |
regXboi | but I'm wondering if the front half would make sense as a devref documentation *somewhere* | 15:55 |
DinaBelova | regXboi will take a look, if yes, it'll be really good to do that | 15:55 |
DinaBelova | harlowja_at_home - so about the web site - are you the volunteer here? :) | 15:55 |
regXboi | DinaBelova: if you can suggest a *where*, I'm all ears | 15:56 |
harlowja_at_home | DinaBelova, ummm, errr | 15:56 |
harlowja_at_home | let me get back to u on that, ha | 15:56 |
DinaBelova | regXboi - well, we can grad docs team and shake it a bit to find that out | 15:56 |
DinaBelova | grab* | 15:56 |
regXboi | ack | 15:56 |
DinaBelova | :) | 15:56 |
DinaBelova | harlowja_at_home - ok | 15:56 |
DinaBelova | harlowja_at_home - simply I'm a bit busy with profiler now and conductor investigations | 15:57 |
harlowja_at_home | (and elk dancing) | 15:57 |
DinaBelova | therefore right now personally I cannot work on that | 15:57 |
DinaBelova | yeah | 15:57 |
harlowja_at_home | np | 15:57 |
DinaBelova | so help is super appreciated | 15:57 |
DinaBelova | harlowja_at_home :) | 15:57 |
harlowja_at_home | understood | 15:57 |
DinaBelova | k, cool | 15:58 |
DinaBelova | anything else here? | 15:58 |
DinaBelova | thanks everyone for hot and productive discussion! | 15:58 |
johnthetubaguy | klindgren dansmith mriedem DinaBelova: I have attempted to write up my ideas around executor_thread_pool_size in the etherpad: https://etherpad.openstack.org/p/remote-conductor-performance let me know if any of that is unclear. | 15:58 |
DinaBelova | johnthetubaguy thank you sir! | 15:58 |
dansmith | johnthetubaguy: cool | 15:58 |
regXboi | DinaBelova: I'm pinging somebody I know in the docs project right now | 15:58 |
DinaBelova | ok, cool | 15:58 |
DinaBelova | bye! | 15:59 |
DinaBelova | #endmeeting | 15:59 |
openstack | Meeting ended Tue Dec 1 15:59:08 2015 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:59 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/performance_team/2015/performance_team.2015-12-01-15.00.html | 15:59 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/performance_team/2015/performance_team.2015-12-01-15.00.txt | 15:59 |
openstack | Log: http://eavesdrop.openstack.org/meetings/performance_team/2015/performance_team.2015-12-01-15.00.log.html | 15:59 |
*** Jeff__ has quit IRC | 15:59 | |
*** harlowja_at_home has quit IRC | 15:59 | |
mriedem | dansmith: johnthetubaguy: i see a few things in the n-api-meta flow that could be optimized, mostly around lazy-loading instance object fields, not sure if it would help much though | 16:00 |
mriedem | but we are loading up security groups, metadata and system_metadata after getting the instance object | 16:00 |
*** markvoelker has joined #openstack-performance | 16:00 | |
dansmith | mriedem: we checked for lazy loads in his logs I think | 16:00 |
dansmith | and found only a few | 16:00 |
mriedem | seems we should just join up front when getting the instance from the db | 16:00 |
klindgren | and once again /me meeting fail | 16:00 |
mriedem | dansmith: the SecurityGroupList is retrieved separately, rather than via the instance | 16:01 |
mriedem | so that's 3 fields that could be joined up front on the instance get query | 16:01 |
dansmith | mriedem: you mean not lazy loaded via objects, just separately-loaded? | 16:01 |
johnthetubaguy | mriedem: seems worth a try | 16:01 |
mriedem | right | 16:01 |
mriedem | i can write up a change to see how it looks | 16:01 |
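The optimization mriedem describes would, roughly, mean passing expected_attrs when fetching the instance so the extra fields are joined in one query instead of being lazy-loaded one by one. A hypothetical illustration follows (the expected_attrs argument is part of nova.objects.Instance in this era, but the function name is invented here and the actual change mriedem writes up may look different):

```python
# Hypothetical sketch: join the lazy-loaded fields up front for the
# metadata path instead of triggering separate round trips per field.
from nova import objects


def get_instance_for_metadata(context, instance_uuid):
    return objects.Instance.get_by_uuid(
        context,
        instance_uuid,
        expected_attrs=['metadata', 'system_metadata', 'security_groups'],
    )
```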
*** rohanion has quit IRC | 16:02 | |
johnthetubaguy | but it does sound like the nova-conductor is saturated (not in a CPU resources sense, but possibly in a throughput sense) which is slowing everything down | 16:02 |
mriedem | the metadata service also proxies back to neutron and we could be smarter in the neutron api code wrt filtering out fields that we don't need in the response, but that would probably not even be noticeable, just cuts down on the amount of content passed around | 16:02 |
mriedem | i was trying to figure out if the neutron-metadata-agent was polling the n-api-meta service, | 16:03 |
mriedem | but not getting any feedback on that in the -neutron channel | 16:03 |
mriedem | the q-meta agent is doing something every 30 seconds http://logs.openstack.org/43/251543/1/check/gate-tempest-dsvm-neutron-large-ops/eb34603/logs/screen-q-meta.txt.gz#_2015-11-30_22_06_48_075 | 16:03 |
mriedem | but that could just be health checking with the server, idk | 16:04 |
DinaBelova | klindgren o/ | 16:05 |
DinaBelova | mriedem - personally I was not able to reproduce enough load via nova-metadata calls, but that leads simply to API service saturation | 16:06 |
DinaBelova | not the metadata | 16:06 |
DinaBelova | :( | 16:07 |
DinaBelova | that's why I ended up with asking klindgren to make some dumps | 16:07 |
DinaBelova | klindgren - please go through today's nova-conductor section in the logs | 16:08 |
*** dims_ has joined #openstack-performance | 16:08 | |
klindgren | yep - will do | 16:08 |
johnthetubaguy | klindgren: sorry, if folks already asked, but do you know what db driver you are using here? mysql-python or pymysql? | 16:09 |
mriedem | mysql-python | 16:09 |
DinaBelova | MySQL-python==1.2.3 | 16:09 |
johnthetubaguy | mriedem: sorry, I thought that was an open question still, my bad | 16:10 |
*** dims has quit IRC | 16:10 | |
DinaBelova | johnthetubaguy - nope, that's listed in the etherpad you added info to :) | 16:10 |
DinaBelova | klindgren - so please take a look at the lines johnthetubaguy has added to the https://etherpad.openstack.org/p/remote-conductor-performance too | 16:12 |
DinaBelova | thanks :) | 16:12 |
johnthetubaguy | DinaBelova: I see it now, thanks | 16:12 |
johnthetubaguy | mriedem: added a note on the rackspace comments, I don't think we have tried using pymysql, I think thats actually a different thing they were talking about switching back to | 16:14 |
mriedem | johnthetubaguy: yeah, i talked to alaski about that a few weeks ago, | 16:14 |
mriedem | i don't think rax moved to that yet b/c the direct to sql code depends on mysql-python | 16:15 |
klindgren | so re: adding more workers, are you talking about adding more physical boxes? Or are you talking about setting workers=xx? | 16:16 |
alaski | mriedem: that's part of it. we don't use the mysql db api everywhere though, so we could try it in some places | 16:16 |
alaski | though where we're hitting the most performance issues we are running mysqldb, so that's really where we would want to try it | 16:17 |
mriedem | klindgren: do you see anything in the neutron-metadata-agent polling the n-api-meta service? | 16:17 |
klindgren | mriedem, we don't run the neutron-metadata-agent | 16:17 |
klindgren | we run metadata on the compute nodes directly | 16:17 |
johnthetubaguy | klindgren: there were two things, greenlet threads and workers, covered it more in the etherpad | 16:17 |
klindgren | since we run flat/dhcp networks | 16:18 |
klindgren | we just bind 169.254.169.254 to loopback and the normal iptables rules route metadata requests locally; occasionally it will fall back to another server for metadata - but 99% of the time it uses its local metadata instance - when I checked. | 16:19 |
johnthetubaguy | klindgren: have you tried executor_thread_pool_size=2 and rpc_conn_pool_size=2 in an attempt to reduce the CPU usage for each worker process? | 16:24 |
klindgren | we haven't. I will do that and see what it does | 16:24 |
johnthetubaguy | klindgren: awesome, that will be an interesting test, certainly seen that help with nova-scheduler in the past (although the caching scheduler sort of removes the need for that) | 16:25 |
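For reference, the tuning johnthetubaguy proposes maps to oslo.messaging options set in nova.conf; the sketch below just spells the two option names out. It assumes they live in the [DEFAULT] section for this release, which should be verified against the deployed oslo.messaging version; the output filename is made up.

```python
# Sketch only: write out the conductor tuning options discussed above.
import configparser

conf = configparser.ConfigParser()
conf["DEFAULT"] = {
    # greenthreads handling incoming RPC requests per conductor process
    "executor_thread_pool_size": "2",
    # pool of connections to RabbitMQ shared by those greenthreads
    "rpc_conn_pool_size": "2",
}
with open("conductor-tuning.conf", "w") as fh:
    conf.write(fh)
```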
klindgren | under juno we used to run workers=40 on two boxes and never had cpu alarms, in kilo we started getting 100% cpu alarms. I adjusted the workers down a bit to relieve pressure on the boxes, and we added a third physical server. | 16:26 |
klindgren | adding the third box just chewed up 100% cpu on that box and did nothing to relieve the cpu usage on the other two servers | 16:26 |
johnthetubaguy | klindgren: what did the rabbit queue lengths look like during this? | 16:27 |
johnthetubaguy | for conductor | 16:27 |
klindgren | I can try starting up multiple conductor workers in different complete processes too and see if that makes everything better. i.e. have 2 worker threads but 10 conductor services | 16:28 |
klindgren | I can check; generally we run with pretty much no queue lengths - except for notifications.info where we have to do work in external systems that are not super fast | 16:28 |
johnthetubaguy | klindgren: I guess I assumed these were separate processes, good point | 16:28 |
klindgren | we do see ~500-1k messages/second in this cell | 16:29 |
klindgren | typically in the 500 range | 16:30 |
*** dims has joined #openstack-performance | 16:30 | |
*** harlowja_at_home has joined #openstack-performance | 16:31 | |
*** dims_ has quit IRC | 16:33 | |
johnthetubaguy | klindgren: thanks for the extra context, it does sound like a regression beyond the stuff I am suggesting, although I'm interested to see if they help | 16:40 |
klindgren | yea - I am starting some testing/changes now - will keep the channel/etherpad updated | 16:40 |
*** RobNeff has quit IRC | 16:47 | |
*** arnoldje has quit IRC | 16:53 | |
*** mriedem is now known as mriedem_meeting | 16:58 | |
*** arnoldje has joined #openstack-performance | 17:17 | |
*** amaretskiy has quit IRC | 17:37 | |
*** mriedem_meeting is now known as mriedem | 17:43 | |
*** RobNeff has joined #openstack-performance | 17:45 | |
*** harlowja_at_home has quit IRC | 17:51 | |
*** RobNeff has quit IRC | 18:04 | |
*** ctrath has quit IRC | 18:06 | |
*** ctrath has joined #openstack-performance | 18:13 | |
*** pglass has quit IRC | 18:25 | |
*** arnoldje has quit IRC | 18:28 | |
*** arnoldje has joined #openstack-performance | 18:49 | |
*** ctrath has quit IRC | 18:55 | |
*** mriedem has quit IRC | 18:56 | |
*** pglass has joined #openstack-performance | 18:58 | |
*** mriedem has joined #openstack-performance | 19:01 | |
*** ctrath1 has joined #openstack-performance | 19:03 | |
*** AugieMena has quit IRC | 19:38 | |
*** ctrath1 has quit IRC | 19:56 | |
*** ctrath has joined #openstack-performance | 19:58 | |
*** pglass has quit IRC | 20:01 | |
*** ctrath has quit IRC | 20:03 | |
*** arnoldje has quit IRC | 20:03 | |
*** ctrath has joined #openstack-performance | 20:11 | |
*** arnoldje has joined #openstack-performance | 20:24 | |
harlowja | DinaBelova ok, i added some hopefully useful feedback on https://review.openstack.org/#/c/247005/ | 20:30 |
*** rfolco has quit IRC | 20:31 | |
harlowja | DinaBelova what tooz does | 20:31 |
harlowja | https://github.com/openstack/tooz/blob/master/tooz/coordination.py#L407 | 20:31 |
harlowja | (taskflow is a little more complicated than that, but similar) | 20:31 |
harlowja | so if we are going to do connection_url for things, imho we might as well do something similar... | 20:32 |
*** SpamapS is now known as TheKettle | 21:04 | |
*** TheKettle is now known as SpamapS | 21:04 | |
*** ctrath1 has joined #openstack-performance | 21:12 | |
*** ctrath has quit IRC | 21:15 | |
*** nihilifer has quit IRC | 21:25 | |
*** nihilifer has joined #openstack-performance | 21:27 | |
*** harlowja has quit IRC | 21:27 | |
*** harlowja has joined #openstack-performance | 21:28 | |
klindgren | DinaBelova, so just for reference we see ~130-150 messages/s on the conductor queue but a backlog of 0 | 22:09 |
klindgren | johnthetubaguy, I set executor_thread_pool_size = 5 and rpc_request_workers to 2 | 22:09 |
klindgren | no real change | 22:09 |
*** arnoldje has quit IRC | 22:12 | |
klindgren | the biggest changes in cpu consumption were setting the metadata cache timeout to 3 minutes, turning off ssl to rabbitmq, and adding in that change from dansmith to stop doing flavor migrations | 22:13 |
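The two config-side wins klindgren lists correspond roughly to the nova.conf settings sketched below. The option names (metadata_cache_expiration and rabbit_use_ssl) are a best guess for this era of nova/oslo.messaging rather than a copy of his actual configuration, and the flavor-migration fix is a code patch, not a config option.

```python
# Sketch only: approximate nova.conf settings for the two config changes.
import configparser

conf = configparser.ConfigParser()
conf["DEFAULT"] = {
    # cache metadata responses for 3 minutes so the repeated puppet-driven
    # metadata requests stop reaching nova-conductor every time
    "metadata_cache_expiration": "180",
}
conf["oslo_messaging_rabbit"] = {
    # plain TCP to RabbitMQ; TLS was adding measurable CPU overhead
    "rabbit_use_ssl": "false",
}
with open("metadata-and-rabbit.conf", "w") as fh:
    conf.write(fh)
```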
harlowja | klindgren did u try pymysql ? | 22:17 |
klindgren | newp | 22:17 |
klindgren | not yet | 22:17 |
harlowja | k | 22:17 |
harlowja | it may help (or not) | 22:17 |
klindgren | starting with the easier wins/attempts | 22:17 |
klindgren | and letthign those bake fore a while to see what we see | 22:17 |
klindgren | letting* | 22:18 |
harlowja | k | 22:19 |
harlowja | fair nuff | 22:19 |
*** harlowja has quit IRC | 22:19 | |
*** harlowja has joined #openstack-performance | 22:21 | |
*** arnoldje has joined #openstack-performance | 22:32 | |
klindgren | trying running multiple separate nova-conductor processes - currently running 4 separate ones with 5 workers | 22:39 |
*** ctrath1 has quit IRC | 22:52 | |
*** harlowja has quit IRC | 22:57 | |
*** mriedem has quit IRC | 22:58 | |
*** ctrath has joined #openstack-performance | 22:59 | |
*** harlowja has joined #openstack-performance | 23:01 | |
*** regXboi has quit IRC | 23:02 | |
*** dimtruck is now known as zz_dimtruck | 23:10 | |
*** arnoldje has quit IRC | 23:24 | |
*** manand has quit IRC | 23:37 | |
klindgren | also no real change in cpu usage | 23:38 |