15:00:02 <DinaBelova> #startmeeting Performance Team
15:00:03 <openstack> Meeting started Tue Nov 17 15:00:02 2015 UTC and is due to finish in 60 minutes. The chair is DinaBelova. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:07 <openstack> The meeting name has been set to 'performance_team'
15:00:17 <DinaBelova> hello folks!
15:00:22 <rvasilets___> o/
15:00:24 <rohanion> Hi!
15:00:26 <ozamiatin> o/
15:00:33 <kun_huang> good evening :)
15:00:39 <kun_huang> o/
15:00:42 <DinaBelova> kun_huang - good evening sir
15:00:49 <DinaBelova> so today's agenda
15:00:52 <boris-42> hi
15:00:56 <DinaBelova> #link https://wiki.openstack.org/wiki/Meetings/Performance#Agenda_for_next_meeting
15:01:16 <DinaBelova> there was a complaint last time that there was not enough time to fill it
15:01:27 <DinaBelova> although this time it looks not so big as well :)
15:01:35 <DinaBelova> so let's start with action items
15:01:41 <DinaBelova> #topic Action Items
15:01:50 <DinaBelova> last time we had two action items
15:02:18 <DinaBelova> #1 was about filling the etherpad https://etherpad.openstack.org/p/rally_scenarios_list with information about Rally scenarios used
15:02:24 <DinaBelova> in your companies :)
15:02:40 <DinaBelova> well, it looks like nothing has changed since the previous meeting
15:02:41 <DinaBelova> :(
15:02:57 <DinaBelova> I really hoped augiemena3, Kristian_, patrykw_ would fill it
15:03:07 <DinaBelova> although I do not see them here today
15:03:24 <kun_huang> I know Kevin had shared a topic about rally and neutron's control plane benchmarking
15:03:36 <DinaBelova> kun_huang - oh, that's cool
15:03:43 <DinaBelova> do you have a link to that info?
15:04:04 <kun_huang> a topic in tokyo, wait a minute
15:04:07 <AugieMena> Dina - my bad, should have filled in with some info
15:04:18 * mriedem joins late
15:04:27 * dims waves hi
15:04:32 <kun_huang> #link https://www.youtube.com/watch?v=a0qlsH1hoKs
15:04:37 <DinaBelova> #action everyone (who uses Rally for OpenStack testing inside your companies) fill the etherpad https://etherpad.openstack.org/p/rally_scenarios_list with the scenarios used
15:04:44 <DinaBelova> AugieMena - :)
15:04:55 <DinaBelova> please spend some time on filling this etherpad
15:05:11 <DinaBelova> if we want to create a standard, it'll be useful to collect some info beforehand
15:05:15 <DinaBelova> mriedem, dims o/
15:05:25 <DinaBelova> kun_huang thank you sir
15:05:31 <DinaBelova> lemme take a quick look
15:05:38 <DinaBelova> ah, that's a video
15:05:43 <DinaBelova> so after the meeting :)
15:05:58 <DinaBelova> #action DinaBelova go through https://www.youtube.com/watch?v=a0qlsH1hoKs
15:05:58 <kun_huang> no problem
15:06:05 <DinaBelova> ok, cool
15:06:18 <DinaBelova> so one more action item was on Kristian_
15:06:40 <DinaBelova> he promised to collect information about Rally blanks inside ATT
15:06:51 <DinaBelova> it looks like he was not able to join us today
15:07:13 <DinaBelova> #action DinaBelova ping Kristian_ about internal ATT Rally feedback gathering
15:07:31 <DinaBelova> so it looks like we went through the action items :)
15:07:46 <DinaBelova> just one more time - please fill https://etherpad.openstack.org/p/rally_scenarios_list
15:08:05 <DinaBelova> that will be super useful for future recommendations / methodology creation
15:08:24 <DinaBelova> I guess we may go to the next topic
15:08:29 <DinaBelova> #topic Nova-conductor performance issues
15:08:39 <DinaBelova> #link https://etherpad.openstack.org/p/remote-conductor-performance
15:08:51 <boris-42> DinaBelova: can we return back to the previous topic?
15:09:01 <DinaBelova> boris-42 heh :)
15:09:20 <DinaBelova> I dunno how to make that easy using the bot controls
15:09:27 <dansmith> #undo
15:09:33 <DinaBelova> thanks!
15:09:35 <DinaBelova> #undo
15:09:35 <openstack> Removing item from minutes: <ircmeeting.items.Link object at 0xaf28d90>
15:09:43 <DinaBelova> boris-42 - feel free
15:09:46 <boris-42> dansmith: nice
15:10:06 <boris-42> DinaBelova: so we (the Rally team) recently started working on a certification task
15:10:20 <boris-42> #link https://github.com/openstack/rally/tree/master/certification/openstack
15:10:29 <DinaBelova> boris-42 - sadly I do not have much info about this initiative
15:10:34 <DinaBelova> lemme take a quick look
15:10:36 <boris-42> DinaBelova: which is a much better way to share your experience
15:10:37 * bauzas waves
15:11:00 <boris-42> DinaBelova: than just using etherpads
15:11:13 <DinaBelova> boris-42 - that may be cool
15:11:24 <DinaBelova> so it's some kind of task for cloud validation
15:11:25 <boris-42> DinaBelova: so basically it's a single task that accepts a few arguments about the cloud and should generate proper load and test everything that you specified
15:11:35 <DinaBelova> boris-42 - a-ha, cool
15:12:10 <boris-42> DinaBelova: so basically it's an executable etherpad
15:12:17 <DinaBelova> ok, so that may be very useful for this purpose
15:12:19 <boris-42> DinaBelova: that you are trying to collect
15:12:19 <DinaBelova> thank you sir
15:12:31 <DinaBelova> we may definitely use it
15:12:33 <mriedem> so rally as defcore?
15:12:42 <kun_huang> boris-42: Has the mirantis team used this feature?
15:12:53 <boris-42> mriedem: so nope
15:12:57 <regXboi> mriedem: I'm trying to wrap my head around that :)
15:13:06 <DinaBelova> #info we may use https://github.com/openstack/rally/tree/master/certification/openstack to collect information about Rally scenarios used in various companies
15:13:28 <boris-42> mriedem: it's a pain in the neck to use rally to validate OpenStack
15:13:31 <DinaBelova> kun_huang - I did not hear about this, frankly speaking
15:13:41 <boris-42> mriedem: because you need to create such a task and it usually takes 2-3 weeks
15:13:45 <DinaBelova> kun_huang - but as boris-42 said, this initiative is fairly new
15:14:07 <boris-42> mriedem: so we decided to create it once and avoid duplication of effort
15:14:17 <DinaBelova> boris-42 - very useful, thank you sir
15:14:18 <boris-42> mriedem: our goal is not to say whether it is openstack or not
15:14:37 <boris-42> kun_huang: so we just recently made it
15:14:56 <boris-42> kun_huang: I know about only 1 usage and there were a bunch of issues that I am going to address soon
15:15:02 <mriedem> ok, maybe the readme there needs more detail
15:15:16 <DinaBelova> ok, very cool. thanks boris-42! something else to mention here?
15:15:20 <boris-42> mriedem: what would you like to see there
15:15:25 <boris-42> mriedem: ?
15:15:28 <mriedem> what it is and what it's used for
15:15:31 <dims> boris-42 why is it called "certification" then? :)
15:15:33 <mriedem> note that i'm not a rally user
15:15:41 <mriedem> right, 'certification' makes me think defcore
15:15:44 <DinaBelova> :)
15:15:46 <dims> y
15:16:03 <AugieMena> would someone provide a one-liner on what the purpose of it is?
15:16:32 <kun_huang> I would like to say that it is some kind of task template
15:16:49 <kun_huang> tasks template
15:16:51 <DinaBelova> AugieMena - a single task to check the whole OpenStack cloud. And you may fill it with all the scenarios you like
15:16:58 <DinaBelova> kun_huang - is that accurate?
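[Editor's note: the "single task that accepts a few arguments about the cloud" boris-42 describes is driven by a task-arguments file. The sketch below is illustrative only - the key names and file paths are assumptions, not the exact schema shipped in the Rally certification directory linked above.]

```python
import json

# Illustrative arguments for Rally's OpenStack "certification" task:
# one entry point that scales its load based on a few facts about the cloud.
# All key names here are assumptions for illustration.
task_arguments = {
    "service_list": ["keystone", "nova", "glance"],  # services to load-test
    "concurrency": 4,        # parallel iterations per scenario
    "computes": 250,         # cloud-size hints used to scale the load
    "controllers": 3,
    "smoke": False,          # False -> full run rather than a quick pass
}

with open("task_arguments.json", "w") as f:
    json.dump(task_arguments, f, indent=2)

# The file would then be fed to the single certification task, e.g.:
#   rally task start <certification task file> --task-args-file task_arguments.json
```

The point of the single-task design, per the discussion, is that the arguments file (not a loose list of scenarios) is the unit operators share and re-run.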
15:16:59 <boris-42> AugieMena: just that it will put proper load and SLA on your cloud
15:17:07 <rvasilets___> I guess to run one big task against the cloud and to see measures of different resources
15:17:14 <boris-42> dims: nope not scenarios
15:17:19 <boris-42> DinaBelova: nope not scenarios
15:17:21 <kun_huang> DinaBelova: my understanding
15:17:21 <boris-42> dims: sorry
15:17:56 <DinaBelova> boris-42 :)
15:17:57 <AugieMena> so how will it help make it easier to gather info about what Rally scenarios various companies are using?
15:18:12 <boris-42> It's a single task that contains a bunch of subtasks that will test the specified services with proper load (based on size & quality of cloud) and proper SLA
15:18:36 <DinaBelova> AugieMena - boris-42 just proposed to create these lists in the form of these "certification" tasks to be able to run them
15:18:57 <AugieMena> OK, I see
15:19:11 <DinaBelova> ack!
15:19:12 <boris-42> AugieMena: a separate scenario doesn't mean anything
15:19:32 <boris-42> AugieMena: without its arguments, context, runner....
15:19:45 <DinaBelova> boris-42 - moving forward? :)
15:19:55 <boris-42> DinaBelova: there is still one question
15:20:07 <DinaBelova> boris-42 - go ahead :)
15:20:13 <boris-42> dims: so "certification" was picked because it's like "Rally certification of your cloud"
15:20:32 <kun_huang> boris-42: DinaBelova pls make a note to describe rally's certification work, blogs or slides... I will help to understand
15:20:36 <boris-42> dims: it certifies the scalability & performance of everything..
15:20:51 * DinaBelova guesses boris-42 meant Dina
15:21:11 <AugieMena> boris-42 - ok, understand the need to provide specifics about arguments used in the scenarios
15:21:17 <DinaBelova> #idea describe rally's certification work, blogs or slides - kun_huang can help with it
15:21:39 <dims> boris-42 : i understand, some link to the official certification activities would help evangelize this better. you will get this question asked again and again :)
15:22:10 <boris-42> dims: )
15:22:23 <DinaBelova> dims - yep, documentation is everything here :)
15:22:31 <boris-42> dims: honestly we can rename this directory to anything
15:22:47 <boris-42> dims: but personally I don't like the word validation because validation is what Tempest is doing
15:22:48 <boris-42> =)
15:23:04 <rvasilets___> )
15:23:17 <rvasilets___> or not doing)
15:23:37 <DinaBelova> rvasilets___ :)
15:23:40 <DinaBelova> ok, anything else here?
15:24:04 <DinaBelova> ok, moving forward
15:24:18 <DinaBelova> #topic Nova-conductor performance issues
15:24:29 <DinaBelova> ok, so some historical info
15:25:01 <DinaBelova> during the Tokyo summit several operators including GoDaddy (ping klindgren) mentioned issues observed around nova-conductor
15:25:08 <DinaBelova> #link https://etherpad.openstack.org/p/remote-conductor-performance
15:25:29 <DinaBelova> Rackspace mentioned it as well
15:25:46 * klindgren waves
15:25:56 <DinaBelova> so it was decided it'll be a cool idea to investigate this issue
15:26:10 <DinaBelova> currently all known info is collected in the etherpad ^^
15:26:29 <DinaBelova> SpamapS has started the investigation of the issue in a local lab
15:26:55 <DinaBelova> afaik he had to switch to something else yesterday, so not sure if anything new has happened
15:27:17 <mriedem> i'd be interested to know if moving to oslo.db >= 1.12 helps anything
15:27:30 <rpodolyaka1> why would it?
15:27:31 <dansmith> also, is everyone still using mysqldb-python in these tests?
15:27:38 <mriedem> dansmith: right
15:27:43 <mriedem> b/c oslo.db < 1.12
15:27:53 <dansmith> mriedem: is that a yes, or agreement with the question?
15:27:54 <mriedem> rpodolyaka1: oslo.db 1.12 switched to pymysql
15:27:57 <rpodolyaka1> oslo.db >= 1.12 does not mean they use pymysql
15:27:59 <mriedem> dansmith: that's agreement
15:28:00 <mriedem> and yes
15:28:04 <rpodolyaka1> it's only used in oslo.db tests
15:28:14 <rpodolyaka1> it's up to the operator to specify the connection string
15:28:17 <dansmith> right
15:28:20 <mriedem> ooo
15:28:21 <rpodolyaka1> you may use mysql-python as well
15:28:32 <mriedem> have we deprecated mysql-python?
15:28:51 <DinaBelova> and afaik Rackspace fixed this (or probably something looking like this) issue by moving back to MySQL-Python
15:29:02 <rpodolyaka1> mriedem: I think we actually run the unit tests for it in oslo.db
15:29:07 <dansmith> DinaBelova: I think you're conflating two things there
15:29:09 <mriedem> rax has an out of tree change (that's also a DNM in nova) for direct sql for some db APIs
15:29:24 <alaski> DinaBelova: rackspace went back to an out of tree db api
15:29:31 <mriedem> this is what rax has https://review.openstack.org/#/c/243822/
15:29:35 <DinaBelova> dansmith - probably, I just remember a conversation at the Tokyo summit about an issue like that
15:29:41 <alaski> essentially dropping sqlalchemy for some calls
15:29:54 <DinaBelova> alaski - a-ha, thank you sir
15:30:08 <DinaBelova> thanks dansmith, mriedem
15:30:29 <mriedem> it'd also be good to know what the conductor/compute ratios are
15:30:46 <DinaBelova> klindgren ^^
15:31:01 <mriedem> there is some info in the etherpad
15:31:16 <rpodolyaka1> mriedem: e.g. https://review.openstack.org/#/c/246198/ , there is a separate gate job for mysql-python
15:31:37 <mriedem> rpodolyaka1: so why isn't that deprecated? we want people to move to pymysql don't we?
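[Editor's note: as rpodolyaka1 says, the MySQL driver is chosen by the SQLAlchemy connection string, not by the oslo.db version. A minimal sketch of that distinction, using only the standard library; the URLs are illustrative:]

```python
from urllib.parse import urlparse

# SQLAlchemy selects the DB-API driver from the URL scheme:
#   mysql://...          -> the dialect's default driver
#                           (historically MySQL-Python / mysqldb)
#   mysql+pymysql://...  -> the pure-Python PyMySQL driver
legacy_url = "mysql://nova:secret@db-host/nova"
pymysql_url = "mysql+pymysql://nova:secret@db-host/nova"

def driver_of(url):
    """Return the explicit driver part of a SQLAlchemy URL, if any."""
    scheme = urlparse(url).scheme            # e.g. "mysql+pymysql"
    dialect, _, driver = scheme.partition("+")
    return driver or "(dialect default)"

print(driver_of(legacy_url))
print(driver_of(pymysql_url))
```

So an operator on oslo.db >= 1.12 who never changed the `connection` option in their config keeps using mysql-python, which is why the library version alone proves nothing.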
15:31:39 <alaski> that being said, we are using mysqldb
15:31:53 <DinaBelova> mriedem - yeah, conductor service with 20 workers per server (2 servers, 16 cores per server), 250 HV in the cell
15:32:00 <klindgren> Do you want to see if oslo.db >= 1.12 works better? Or if pymysql works better
15:32:08 <mriedem> klindgren: pymysql
15:32:10 <klindgren> right now 20 computes * 3 servers
15:32:20 <rpodolyaka1> mriedem: we let them decide which one they want to use
15:32:20 <klindgren> 2 servers are 16 core boxes, one is an 8 core box
15:32:22 <mriedem> but that requires at least oslo.db >= 1.12 if i'm understanding the change history correctly
15:32:40 <dansmith> klindgren: so 2.5 conductor boxes for 20 computes?
15:32:40 <klindgren> 20 conductors*
15:32:43 <mriedem> rpodolyaka1: yeah but mysql-python is not python 3 compliant and has known issues with eventlet right?
15:33:00 <klindgren> for 250 computes
15:33:07 <dansmith> klindgren: that's waaaay low
15:33:10 <rpodolyaka1> mriedem: right, but as rax experience shows, pymysql does not shine on busy clouds :(
15:33:27 <dansmith> rpodolyaka1: I don't think that's what their experience shows
15:33:33 <rpodolyaka1> anyway, are we sure that's a bottleneck?
15:33:35 <mriedem> rpodolyaka1: i think those are unrelated
15:33:36 <harlowja_at_home> \o
15:33:59 <DinaBelova> rpodolyaka1 - not yet, sir. Investigation in progress, we're just collecting ideas of where to look
15:34:04 <alaski> rpodolyaka1: rax hasn't tried pymysql yet. it's on our backlog to test but we don't have any data on it
15:34:05 <DinaBelova> harlowja_at_home - morning sir!
15:34:06 <mriedem> rpodolyaka1: rax uses mysqldb b/c their direct-to-mysql change uses mysql-python
15:34:07 <mriedem> https://review.openstack.org/#/c/243822/
15:34:10 <klindgren> rpodolyaka1, I am getting "Model server went away" errors randomly from nova-computes
15:34:13 <harlowja_at_home> DinaBelova, hi! :)
15:35:07 <rpodolyaka1> alaski: mriedem: ah, I must have confused them with someone else then. I was pretty sure someone blamed pymysql for causing the load on nova-conductors, and that mysql-python was a solution
15:35:23 <DinaBelova> SpamapS wanted to check if switching to some other JSON lib will help, and I'm going to work on this issue as well (probably starting tomorrow)
15:35:24 <dansmith> rpodolyaka1: I'm pretty sure not
15:35:27 <klindgren> dansmith, what would you recommend as the ratio of servers dedicated to nova-conductor to nova-compute?
15:35:28 <rpodolyaka1> ok
15:35:50 <mriedem> DinaBelova: unless you're on python 2.6, i don't know that the json change in oslo.serialization will make a difference
15:35:55 <alaski> rpodolyaka1: we blame sqlalchemy right now :) but are hopeful that pymysql will be better
15:36:01 <rpodolyaka1> haha
15:36:05 <dansmith> klindgren: it all depends on your environment and your load.. but I just want to clarify.. above you seemed to confuse a few things
15:36:16 <dims> alaski lol
15:36:16 <dansmith> klindgren: 250 computes and how many physical conductor machines running how many workers?
15:36:26 <DinaBelova> mriedem - well, SpamapS is experimenting here, I'll probably start with some meaningful profiling
15:36:33 <rpodolyaka1> ++
15:36:33 <DinaBelova> if I am able to reproduce it
15:36:44 <klindgren> 3 physical boxes, one server has 8 cores, the others have 16
15:36:49 <klindgren> running 20 workers each
15:36:55 <dansmith> klindgren: so three total boxes for 250 computes, right?
15:37:00 <klindgren> yep
15:37:05 <dansmith> klindgren: right, so that's insanely low, IMHO
15:37:20 <mriedem> plus, you have $workers > ncpu on those conductor boxes,
15:37:22 <dansmith> klindgren: and the answer is: keep increasing conductor boxes until the load is manageable :)
15:37:36 <klindgren> that's a pretty shit answer
15:37:37 <dansmith> mriedem: well, with mysqldb you have to have that
15:37:40 <klindgren> imho
15:37:41 <DinaBelova> klindgren :D
15:37:41 <kun_huang> hah
15:38:01 <dansmith> klindgren: so run some conductors on every compute if you want
15:38:18 <mriedem> dansmith: although local conductor is now deprecated
15:38:22 <dansmith> klindgren: the load is all the same, conductor just concentrates it on a much smaller number of boxes if you choose it to be small
15:38:31 <DinaBelova> dansmith - heh, afair conductors were created to avoid local conductoring?
15:38:31 <dansmith> mriedem: sure, but they can still run conductor on compute if they don't want upgrades to work
15:38:40 <klindgren> fyi this environment has always been remote conductor
15:38:46 <klindgren> and load only started being an issue
15:38:48 <rpodolyaka1> klindgren: can you run nova-conductor under cProfile on one of the nodes? We haven't seen anything like that on our 200-compute-node deployments
15:38:49 <klindgren> when we went to kilo
15:38:56 <dansmith> klindgren: are you still on kilo?
15:39:05 <klindgren> "still"
15:39:10 <klindgren> liberty *just* came out
15:39:21 <dansmith> klindgren: that's an important detail, maybe you're just experiencing the load of the flavor migrations
15:39:21 <mriedem> hmmm, flavor migrations in kilo maybe?
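[Editor's note: a minimal sketch of the cProfile pass rpodolyaka1 suggests, run here against a stand-in workload since profiling a live nova-conductor requires a deployment; the nova-conductor invocation in the comment is an assumption about a typical setup.]

```python
import cProfile
import io
import pstats

def fake_db_call(n):
    # Stand-in for the per-RPC work a conductor worker does; on a real
    # node you would profile the service itself instead, e.g. something like
    #   python -m cProfile -o conductor.prof /usr/bin/nova-conductor
    # (path depends on the deployment), then inspect conductor.prof with pstats.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
for _ in range(100):
    fake_db_call(1000)
profiler.disable()

# Summarize: which callables consumed the most cumulative time?
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print("fake_db_call" in report)
```

Sorting by cumulative time is what surfaces hot call trees (e.g. SQLAlchemy object loading) rather than just tight inner loops.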
15:39:31 <dansmith> klindgren: that's a hugely important data point :)
15:39:54 <klindgren> we ran all the flavor migration commands after upgrade
15:40:03 <dansmith> klindgren: right, but there is still overhead
15:40:03 <klindgren> btw all of this is in the etherpad
15:40:16 <dansmith> klindgren: and it turned out to be higher than we expected even after the migrations were done
15:40:24 <alaski> even after migrations we still saw overhead as well
15:40:24 <DinaBelova> dansmith, mriedem - yep, these details are in the etherpad as well :)
15:40:24 <dansmith> klindgren: but it's gone in liberty because the migration is complete
15:40:36 <dansmith> DinaBelova: I've read the etherpad and didn't get the impression this was just a kilo thing
15:40:49 <DinaBelova> dansmith, ok
15:41:20 <mriedem> DinaBelova: klindgren: i don't see anything about flavor migrations in the etherpad
15:41:36 <DinaBelova> mriedem - I meant the kilo-based cloud
15:41:41 <dansmith> DinaBelova: I see that they say they started getting alarms after kilo, but the rest of the text makes it sound like this has always been a problem and just now tipped over the edge
15:42:24 <mriedem> yeah, i just added the notes on the flavor migrations
15:42:25 <dansmith> klindgren: so I think you should add some more capacity for conductors until you move to liberty, at which time you'll probably be able to drop it back down
15:42:31 <DinaBelova> mriedem thanks!
15:42:44 <mriedem> fyi on the flavor migrations for the kilo upgrade https://wiki.openstack.org/wiki/ReleaseNotes/Kilo#Upgrade_Notes_2
15:42:55 <dansmith> klindgren: going forward, we have some better machinery to help us avoid the continued overhead once everything is upgraded
15:43:04 <DinaBelova> ok, so any other points for investigators to look at (except flavor migrations and JSON libs)? // not mentioning some profiling to find the real bottleneck //
15:43:13 <dansmith> klindgren: and also, the flavor migration was about the largest migration we could have done, so it almost can't be worse in the future
15:43:23 <dansmith> DinaBelova: I don't think there is a bottleneck to find, it sounds like
15:43:31 <mriedem> DinaBelova: i'm always curious about rogue periodic tasks on the compute nodes hitting the db too often and pulling too many instances
15:43:35 <dansmith> DinaBelova: I think this is likely due to the flavor migrations we were doing in kilo and nothing more
15:43:49 <dansmith> DinaBelova: conductor-specific bottlenecks I mean
15:43:54 <mriedem> but rogue periodic tasks pulling too much data could also mean you need to purge your db
15:44:44 <alaski> dansmith: not conductor-specific bottlenecks, but there are db bottlenecks which conductor amplifies
15:44:49 <DinaBelova> dansmith - that may be a very probable answer, I just want to reproduce the same situation klindgren is seeing, confirm it's about flavor migrations, and check everything is ok on liberty
15:44:52 <dansmith> alaski: yes, totes
15:45:02 <DinaBelova> that is also an answer
15:45:23 <DinaBelova> not mentioning that something interesting may be found in what alaski has mentioned
15:45:50 <DinaBelova> ok, cool.
15:45:56 <dansmith> I shouldn't have said "no bottleneck to find" - I meant that I think the kilo-centric bit that is the immediate problem is flavor migrations
15:46:13 <DinaBelova> dansmith, yep, gotcha
15:46:23 <dansmith> I'm also amazed that they _were_ fine with 2.5 conductor boxes for 250 computes
15:46:51 <klindgren> is it possible to turn off flavor migrations under kilo to see if things get better?
15:47:02 <dansmith> klindgren: not really, no
15:47:11 <mriedem> not configurable, it happens in the code
15:47:16 <DinaBelova> klindgren, suffer :)
15:47:20 <dansmith> klindgren: we can have a back alley chat about some hacking you can do if you want
15:47:34 <klindgren> dansmith, can you provide what in your mind is an acceptable conductor -> compute ratio?
15:47:53 <dansmith> klindgren: and if I may say, the next time you hit some spike when you roll to a release, please come to the nova channel and raise it :)
15:48:14 <DinaBelova> dansmith - I think if klindgren is ok with trying some code hacking, this session will be very useful
15:48:30 <dansmith> klindgren: as I said, there is no magic number.. 1% is much lower than I would have expected would be reasonable for anyone, but you're proving it's doable, which also points to there being no magic number :)
15:49:18 <DinaBelova> #idea check if the issue GoDaddy is facing is related to the flavor migrations or just to a too low conductor/compute ratio
15:49:42 <DinaBelova> klindgren - are you interested in the hacking session dansmith has proposed?
15:49:45 <dansmith> I think it's also worth pointing out,
15:50:01 <dansmith> since my answer was "shit" about having enough boxes to handle the load,
15:50:07 <bauzas> maybe running a GMR ?
15:50:11 <dansmith> that conductor separate from computes is mostly an upgrade story
15:50:30 <harlowja_at_home> just out of curiosity since there isn't a magic number, has any bunch of companies shared their conductor ratios with the world, so we can derive a 'suggested' number from those shared values...?
15:50:37 <klindgren> technically 2 -> 250 was working as well. Adding another physical box didn't actually fix anything, it just resulted in burning cpu on that server as well.
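[Editor's note: dansmith's "1%" is the conductor-box-to-compute ratio implied by the numbers in this log - roughly 2.5 boxes' worth of conductor capacity for 250 hypervisors. A quick sketch of that arithmetic; counting the 8-core box as half of a 16-core one is an assumption made for the estimate.]

```python
def conductor_ratio(conductor_box_equivalents, computes):
    """Conductor-to-compute ratio, as a fraction of the compute count."""
    return conductor_box_equivalents / computes

# Two 16-core boxes plus one 8-core box, counting the 8-core box as half:
boxes = 2 + 0.5
computes = 250
ratio = conductor_ratio(boxes, computes)
print(f"{ratio:.1%}")  # 1.0%
```

As the discussion concludes, there is no universally "safe" value here; the ratio only makes sense relative to the cloud's actual control-plane load.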
15:50:39 <dansmith> if you don't care about that, you can run a few conductor workers on every compute and distribute the load everywhere
15:51:08 <DinaBelova> dansmith thanks for the note
15:51:11 <klindgren> I mean if local conductor is deprecated - and remote conductor is an upgrade story - people are going to need to know a conductor to compute ratio that is "safe"
15:51:16 <DinaBelova> harlowja_at_home - did not hear about that :(
15:51:33 <dansmith> klindgren: 100% is safe
15:51:41 <klindgren> otherwise people are going to be blowing up their clouds
15:51:45 <harlowja_at_home> :-/
15:51:54 <DinaBelova> klindgren probably we need to write an email to the operators email list
15:52:04 <dansmith> klindgren: let me ask you a question.. how many api nodes should everyone run?
15:52:08 <DinaBelova> and try to find out what ratio other folks have
15:52:12 <klindgren> then un-deprecate local-conductor because obviously remote-conductor is not well planned out
15:52:41 <harlowja_at_home> DinaBelova, i'd like that
15:53:03 <DinaBelova> #action DinaBelova klindgren compose an email to the operators list and find out what conductors/computes ratio is used
15:53:23 <mriedem> can you do rolling upgrades with cells though? i thought not.
15:53:31 <DinaBelova> dansmith - well, I guess there is no right answer here :)
15:53:56 <dansmith> DinaBelova: right, that's what I'm trying to get at.. if I never create/destroy instances, I can use one api worker for 250 computes :)
15:54:06 <DinaBelova> dansmith :D
15:54:16 <klindgren> it's almost always been possible in the past to run n-1 in cells
15:54:32 <manand> while we are on the subject of ratios, is this something we should look at across other components, such as the network node to compute ratio etc.?
15:54:37 <dansmith> klindgren: just so you know, we think that's crazy :)
15:54:49 <alaski> klindgren: that has been by chance though. there's no code to ensure it works
15:54:49 <DinaBelova> manand - yep, great note
15:54:51 <dansmith> whether or not it works :)
15:55:14 <DinaBelova> ok, folks, we've spent much time on this item
15:55:15 <mriedem> reminds me of the rpc compat bug in the cells code i saw last week...
15:55:22 <dansmith> yeah
15:55:26 <DinaBelova> it looks like we'll return back to it after the meeting
15:55:54 <DinaBelova> so let's move forward, as we're running out of time
15:55:59 <DinaBelova> #topic OSProfiler weekly update
15:56:34 <DinaBelova> ok, so last time we agreed that if we want to use osprofiler for tracing/profiling needs we need to #1 fix it and #2 make it better
15:56:47 <DinaBelova> harlowja_at_home has created an etherpad
15:56:49 <DinaBelova> #link https://etherpad.openstack.org/p/perf-zoom-zoom
15:56:52 <harlowja_at_home> i put some code up for an idea of a different notifier that just uses files!! :-P
15:56:58 <harlowja_at_home> more zoom zoom
15:56:58 <harlowja_at_home> lol
15:57:05 <DinaBelova> harlowja_at_home - yep, saw it
15:57:30 <DinaBelova> and I left a comment - lemme create a change regarding https://github.com/openstack/osprofiler/blob/master/doc/specs/in-progress/multi_backend_support.rst first
15:57:43 <DinaBelova> so as not to have two drivers for backward compatibility
15:58:00 <DinaBelova> so in short - I was able to make osprofiler work ok with ceilometer events
15:58:06 <harlowja_at_home> cool
15:58:12 <DinaBelova> it's limited now and some ceilometer work needs to be done
15:58:21 <DinaBelova> one of the Ceilo devs will work on it
15:58:31 <DinaBelova> and I've moved to the https://github.com/openstack/osprofiler/blob/master/doc/specs/in-progress/multi_backend_support.rst task
15:58:47 <DinaBelova> harlowja_at_home - I'll ping you once I push the change to gerrit
15:58:51 <harlowja_at_home> kk
15:58:53 <harlowja_at_home> thx
15:58:56 <DinaBelova> so you'll be able to rebase your code
15:58:58 <DinaBelova> np
15:59:02 <harlowja_at_home> sounds good to me
15:59:23 <DinaBelova> boris-42 - did you have a chance to update the osprofiler -> oslo spec?
15:59:31 <DinaBelova> for mitaka?
15:59:56 <DinaBelova> a-ha, I see, not yet
15:59:59 <DinaBelova> #action boris-42 update osprofiler spec to fit Mitaka cycle
16:00:07 <DinaBelova> ok, so we ran out of time
16:00:15 <DinaBelova> any last questions to mention?
16:00:29 <DinaBelova> thank you guys!
16:00:32 <harlowja_at_home> boris-42, where are u!
16:00:34 <harlowja_at_home> come in boris!
16:00:35 <harlowja_at_home> lol
16:00:39 <DinaBelova> :D
16:00:40 <DinaBelova> #endmeeting
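[Editor's note: harlowja's idea of "a different notifier that just uses files" can be pictured as below. This is an illustrative sketch only, not the code actually pushed to the perf-zoom-zoom etherpad review; the class name, `notify` method, and JSON-lines format are all assumptions, and osprofiler's real notifier API may differ.]

```python
import json
import threading
import time

class FileNotifier:
    """Toy profiler notifier: append each trace point as one JSON line.

    Illustrative only - not osprofiler's actual driver interface.
    """

    def __init__(self, path):
        self._path = path
        # Serialize writers so concurrent trace points don't interleave lines.
        self._lock = threading.Lock()

    def notify(self, payload):
        record = dict(payload)
        record.setdefault("timestamp", time.time())
        with self._lock, open(self._path, "a") as f:
            f.write(json.dumps(record) + "\n")

notifier = FileNotifier("trace-points.jsonl")
notifier.notify({"name": "db.query-start", "trace_id": "abc123"})
notifier.notify({"name": "db.query-stop", "trace_id": "abc123"})
```

The appeal of a file backend in this context is zero infrastructure: no Ceilometer or message bus needed to capture a trace, and the JSON-lines file can be post-processed into whatever backend the multi_backend_support spec ends up defining.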