15:00:02 <DinaBelova> #startmeeting Performance Team
15:00:03 <openstack> Meeting started Tue Nov 17 15:00:02 2015 UTC and is due to finish in 60 minutes. The chair is DinaBelova. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:07 <openstack> The meeting name has been set to 'performance_team'
15:00:17 <DinaBelova> hello folks!
15:00:22 <rvasilets___> o/
15:00:24 <rohanion> Hi!
15:00:26 <ozamiatin> o/
15:00:33 <kun_huang> good evening :)
15:00:39 <kun_huang> o/
15:00:42 <DinaBelova> kun_huang - good evening sir
15:00:49 <DinaBelova> so today's agenda
15:00:52 <boris-42> hi
15:00:56 <DinaBelova> #link https://wiki.openstack.org/wiki/Meetings/Performance#Agenda_for_next_meeting
15:01:16 <DinaBelova> there was a complaint last time that there was not enough time to fill it
15:01:27 <DinaBelova> although this time it looks not so big as well :)
15:01:35 <DinaBelova> so let's start with action items
15:01:41 <DinaBelova> #topic Action Items
15:01:50 <DinaBelova> last time we had two action items
15:02:18 <DinaBelova> #1 was about filling the etherpad https://etherpad.openstack.org/p/rally_scenarios_list with information about Rally scenarios used
15:02:24 <DinaBelova> in your companies :)
15:02:40 <DinaBelova> well, it looks like nothing has changed since the previous meeting
15:02:41 <DinaBelova> :(
15:02:57 <DinaBelova> I really hoped augiemena3, Kristian_, patrykw_ would fill it
15:03:07 <DinaBelova> although I do not see them here today
15:03:24 <kun_huang> I know Kevin had shared a topic about rally and neutron's control plane benchmarking
15:03:36 <DinaBelova> kun_huang - oh, that's cool
15:03:43 <DinaBelova> do you have a link to that info?
15:04:04 <kun_huang> a topic in tokyo, wait a minute
15:04:07 <AugieMena> Dina - my bad, should have filled in with some info
15:04:18 * mriedem joins late
15:04:27 * dims waves hi
15:04:32 <kun_huang> #link https://www.youtube.com/watch?v=a0qlsH1hoKs
15:04:37 <DinaBelova> #action everyone (who uses Rally for OpenStack testing inside your companies) fill the etherpad https://etherpad.openstack.org/p/rally_scenarios_list with the scenarios used
15:04:44 <DinaBelova> AugieMena - :)
15:04:55 <DinaBelova> please spend some time on filling this etherpad
15:05:11 <DinaBelova> if we want to create a standard, it'll be useful to collect some info beforehand
15:05:15 <DinaBelova> mriedem, dims o/
15:05:25 <DinaBelova> kun_huang thank you sir
15:05:31 <DinaBelova> lemme take a quick look
15:05:38 <DinaBelova> ah, that's a video
15:05:43 <DinaBelova> so after the meeting :)
15:05:58 <DinaBelova> #action DinaBelova go through https://www.youtube.com/watch?v=a0qlsH1hoKs
15:05:58 <kun_huang> no problem
15:06:05 <DinaBelova> ok, cool
15:06:18 <DinaBelova> so one more action item was on Kristian_
15:06:40 <DinaBelova> he promised to collect information about Rally blanks inside ATT
15:06:51 <DinaBelova> it looks like he was not able to join us today
15:07:13 <DinaBelova> #action DinaBelova ping Kristian_ about internal ATT Rally feedback gathering
15:07:31 <DinaBelova> so it looks like we went through the action items :)
15:07:46 <DinaBelova> just one more time - please fill https://etherpad.openstack.org/p/rally_scenarios_list
15:08:05 <DinaBelova> that will be super useful for future recommendations / methodology creation
15:08:24 <DinaBelova> I guess we may go to the next topic
15:08:29 <DinaBelova> #topic Nova-conductor performance issues
15:08:39 <DinaBelova> #link https://etherpad.openstack.org/p/remote-conductor-performance
15:08:51 <boris-42> DinaBelova: can we return back to the previous topic?
15:09:01 <DinaBelova> boris-42 heh :)
15:09:20 <DinaBelova> I dunno how to make that easy using the bot controls
15:09:27 <dansmith> #undo
15:09:33 <DinaBelova> thanks!
15:09:35 <DinaBelova> #undo
15:09:35 <openstack> Removing item from minutes: <ircmeeting.items.Link object at 0xaf28d90>
15:09:43 <DinaBelova> boris-42 - feel free
15:09:46 <boris-42> dansmith: nice
15:10:06 <boris-42> DinaBelova: so we (the Rally team) recently started working on a certification task
15:10:20 <boris-42> #link https://github.com/openstack/rally/tree/master/certification/openstack
15:10:29 <DinaBelova> boris-42 - sadly I do not have much info about this initiative
15:10:34 <DinaBelova> lemme take a quick look
15:10:36 <boris-42> DinaBelova: which is a much better way to share your experience
15:10:37 * bauzas waves
15:11:00 <boris-42> DinaBelova: than just using etherpads
15:11:13 <DinaBelova> boris-42 - that may be cool
15:11:24 <DinaBelova> so it's some kind of task for cloud validation
15:11:25 <boris-42> DinaBelova: so basically it's a single task that accepts a few arguments about the cloud and should generate proper load and test everything that you specified
15:11:35 <DinaBelova> boris-42 - a-ha, cool
15:12:10 <boris-42> DinaBelova: so basically it's an executable etherpad
15:12:17 <DinaBelova> ok, so that may be very useful for this purpose
15:12:19 <boris-42> DinaBelova: that you are trying to collect
15:12:19 <DinaBelova> thank you sir
15:12:31 <DinaBelova> we may definitely use it
15:12:33 <mriedem> so rally as defcore?
15:12:42 <kun_huang> boris-42: Has the mirantis team used this feature?
15:12:53 <boris-42> mriedem: so nope
15:12:57 <regXboi> mriedem: I'm trying to wrap my head around that :)
15:13:06 <DinaBelova> #info we may use https://github.com/openstack/rally/tree/master/certification/openstack to collect information about Rally scenarios used in various companies
15:13:28 <boris-42> mriedem: it's a pain in the neck to use rally to validate OpenStack
15:13:31 <DinaBelova> kun_huang - I did not hear about this, frankly speaking
15:13:41 <boris-42> mriedem: because you need to create such a task and it usually takes 2-3 weeks
15:13:45 <DinaBelova> kun_huang - but as boris-42 said, this initiative is fairly new
15:14:07 <boris-42> mriedem: so we decided to create it once and avoid duplication of effort
15:14:17 <DinaBelova> boris-42 - very useful, thank you sir
15:14:18 <boris-42> mriedem: our goal is not to say whether it is openstack or not
15:14:37 <boris-42> kun_huang: so we just recently made it
15:14:56 <boris-42> kun_huang: I know about only 1 usage and there were a bunch of issues that I am going to address soon
15:15:02 <mriedem> ok, maybe the readme there needs more detail
15:15:16 <DinaBelova> ok, very cool. thanks boris-42! something else to mention here?
15:15:20 <boris-42> mriedem: what would you like to see there
15:15:25 <boris-42> mriedem: ?
15:15:28 <mriedem> what it is and what it's used for
15:15:31 <dims> boris-42 why is it called "certification" then? :)
15:15:33 <mriedem> note that i'm not a rally user
15:15:41 <mriedem> right, 'certification' makes me think defcore
15:15:44 <DinaBelova> :)
15:15:46 <dims> y
15:16:03 <AugieMena> would someone provide a one-liner on what the purpose of it is?
15:16:32 <kun_huang> I would like to say that it is some kind of task template
15:16:49 <kun_huang> tasks template
15:16:51 <DinaBelova> AugieMena - a single task to check the whole OpenStack cloud. And you may fill it with all the scenarios you like
15:16:58 <DinaBelova> kun_huang - is that accurate?
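[Editor's note: the "single task that accepts a few arguments about the cloud" boris-42 describes is driven by a task-arguments file. The sketch below is illustrative only - the key names and file paths are assumptions, not the exact schema shipped in the Rally certification directory linked above.]

```python
import json

# Illustrative arguments for Rally's OpenStack "certification" task:
# one entry point that scales its load based on a few facts about the cloud.
# All key names here are assumptions for illustration.
task_arguments = {
    "service_list": ["keystone", "nova", "glance"],  # services to load-test
    "concurrency": 4,        # parallel iterations per scenario
    "computes": 250,         # cloud-size hints used to scale the load
    "controllers": 3,
    "smoke": False,          # False -> full run rather than a quick pass
}

with open("task_arguments.json", "w") as f:
    json.dump(task_arguments, f, indent=2)

# The file would then be fed to the single certification task, e.g.:
#   rally task start <certification task file> --task-args-file task_arguments.json
```

The point of the single-task design, per the discussion, is that the arguments file (not a loose list of scenarios) is the unit operators share and re-run.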
15:16:59 <boris-42> AugieMena: just that it will put proper load and SLA on your cloud
15:17:07 <rvasilets___> I guess to run one big task against the cloud and to see measures of different resources
15:17:14 <boris-42> dims: nope not scenarios
15:17:19 <boris-42> DinaBelova: nope not scenarios
15:17:21 <kun_huang> DinaBelova: my understanding
15:17:21 <boris-42> dims: sorry
15:17:56 <DinaBelova> boris-42 :)
15:17:57 <AugieMena> so how will it help make it easier to gather info about what Rally scenarios various companies are using?
15:18:12 <boris-42> It's a single task that contains a bunch of subtasks that will test the specified services with proper load (based on size & quality of cloud) and proper SLA
15:18:36 <DinaBelova> AugieMena - boris-42 just proposed to create these lists in the form of these "certification" tasks to be able to run them
15:18:57 <AugieMena> OK, I see
15:19:11 <DinaBelova> ack!
15:19:12 <boris-42> AugieMena: a separate scenario doesn't mean anything
15:19:32 <boris-42> AugieMena: without its arguments, context, runner....
15:19:45 <DinaBelova> boris-42 - moving forward? :)
15:19:55 <boris-42> DinaBelova: there is still one question
15:20:07 <DinaBelova> boris-42 - go ahead :)
15:20:13 <boris-42> dims: so "certification" was picked because it's like "Rally certification of your cloud"
15:20:32 <kun_huang> boris-42: DinaBelova pls make a note to describe rally's certification work, blogs or slides... I will help to understand
15:20:36 <boris-42> dims: it certifies the scalability & performance of everything..
15:20:51 * DinaBelova guesses boris-42 meant Dina
15:21:11 <AugieMena> boris-42 - ok, understand the need to provide specifics about arguments used in the scenarios
15:21:17 <DinaBelova> #idea describe rally's certification work, blogs or slides - kun_huang can help with it
15:21:39 <dims> boris-42 : i understand, some link to the official certification activities would help evangelize this better. you will get this question asked again and again :)
15:22:10 <boris-42> dims: )
15:22:23 <DinaBelova> dims - yep, documentation is everything here :)
15:22:31 <boris-42> dims: honestly we can rename this directory to anything
15:22:47 <boris-42> dims: but personally I don't like the word validation because validation is what Tempest is doing
15:22:48 <boris-42> =)
15:23:04 <rvasilets___> )
15:23:17 <rvasilets___> or not doing)
15:23:37 <DinaBelova> rvasilets___ :)
15:23:40 <DinaBelova> ok, anything else here?
15:24:04 <DinaBelova> ok, moving forward
15:24:18 <DinaBelova> #topic Nova-conductor performance issues
15:24:29 <DinaBelova> ok, so some historical info
15:25:01 <DinaBelova> during the Tokyo summit several operators including GoDaddy (ping klindgren) mentioned issues observed around nova-conductor
15:25:08 <DinaBelova> #link https://etherpad.openstack.org/p/remote-conductor-performance
15:25:29 <DinaBelova> Rackspace mentioned it as well
15:25:46 * klindgren waves
15:25:56 <DinaBelova> so it was decided it'll be a cool idea to investigate this issue
15:26:10 <DinaBelova> currently all known info is collected in the etherpad ^^
15:26:29 <DinaBelova> SpamapS has started the investigation of the issue in a local lab
15:26:55 <DinaBelova> afaik he had to switch to something else yesterday, so not sure if anything new has happened
15:27:17 <mriedem> i'd be interested to know if moving to oslo.db >= 1.12 helps anything
15:27:30 <rpodolyaka1> why would it?
15:27:31 <dansmith> also, is everyone still using mysqldb-python in these tests?
15:27:38 <mriedem> dansmith: right
15:27:43 <mriedem> b/c oslo.db < 1.12
15:27:53 <dansmith> mriedem: is that a yes, or agreement with the question?
15:27:54 <mriedem> rpodolyaka1: oslo.db 1.12 switched to pymysql
15:27:57 <rpodolyaka1> oslo.db >= 1.12 does not mean they use pymysql
15:27:59 <mriedem> dansmith: that's agreement
15:28:00 <mriedem> and yes
15:28:04 <rpodolyaka1> it's only used in oslo.db tests
15:28:14 <rpodolyaka1> it's up to the operator to specify the connection string
15:28:17 <dansmith> right
15:28:20 <mriedem> ooo
15:28:21 <rpodolyaka1> you may use mysql-python as well
15:28:32 <mriedem> have we deprecated mysql-python?
15:28:51 <DinaBelova> and afaik Rackspace fixed this (or probably something looking like this) issue by moving back to MySQL-Python
15:29:02 <rpodolyaka1> mriedem: I think we actually run the unit tests for it in oslo.db
15:29:07 <dansmith> DinaBelova: I think you're conflating two things there
15:29:09 <mriedem> rax has an out of tree change (that's also a DNM in nova) for direct sql for some db APIs
15:29:24 <alaski> DinaBelova: rackspace went back to an out of tree db api
15:29:31 <mriedem> this is what rax has https://review.openstack.org/#/c/243822/
15:29:35 <DinaBelova> dansmith - probably, I just remember a conversation at the Tokyo summit about an issue like that
15:29:41 <alaski> essentially dropping sqlalchemy for some calls
15:29:54 <DinaBelova> alaski - a-ha, thank you sir
15:30:08 <DinaBelova> thanks dansmith, mriedem
15:30:29 <mriedem> it'd also be good to know what the conductor/compute ratios are
15:30:46 <DinaBelova> klindgren ^^
15:31:01 <mriedem> there is some info in the etherpad
15:31:16 <rpodolyaka1> mriedem: e.g. https://review.openstack.org/#/c/246198/ , there is a separate gate job for mysql-python
15:31:37 <mriedem> rpodolyaka1: so why isn't that deprecated? we want people to move to pymysql don't we?
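[Editor's note: as rpodolyaka1 says, the MySQL driver is chosen by the SQLAlchemy connection string, not by the oslo.db version. A minimal sketch of that distinction, using only the standard library; the URLs are illustrative:]

```python
from urllib.parse import urlparse

# SQLAlchemy selects the DB-API driver from the URL scheme:
#   mysql://...          -> the dialect's default driver
#                           (historically MySQL-Python / mysqldb)
#   mysql+pymysql://...  -> the pure-Python PyMySQL driver
legacy_url = "mysql://nova:secret@db-host/nova"
pymysql_url = "mysql+pymysql://nova:secret@db-host/nova"

def driver_of(url):
    """Return the explicit driver part of a SQLAlchemy URL, if any."""
    scheme = urlparse(url).scheme            # e.g. "mysql+pymysql"
    dialect, _, driver = scheme.partition("+")
    return driver or "(dialect default)"

print(driver_of(legacy_url))
print(driver_of(pymysql_url))
```

So an operator on oslo.db >= 1.12 who never changed the `connection` option in their config keeps using mysql-python, which is why the library version alone proves nothing.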
15:31:39 <alaski> that being said, we are using mysqldb
15:31:53 <DinaBelova> mriedem - yeah, conductor service with 20 workers per server (2 servers, 16 cores per server), 250 HV in the cell
15:32:00 <klindgren> Do you want to see if oslo.db >= 1.12 works better? Or if pymysql works better
15:32:08 <mriedem> klindgren: pymysql
15:32:10 <klindgren> right now 20 computes * 3 servers
15:32:20 <rpodolyaka1> mriedem: we let them decide which one they want to use
15:32:20 <klindgren> 2 servers are 16 core boxes, one is an 8 core box
15:32:22 <mriedem> but that requires at least oslo.db >= 1.12 if i'm understanding the change history correctly
15:32:40 <dansmith> klindgren: so 2.5 conductor boxes for 20 computes?
15:32:40 <klindgren> 20 conductors*
15:32:43 <mriedem> rpodolyaka1: yeah but mysql-python is not python 3 compliant and has known issues with eventlet right?
15:33:00 <klindgren> for 250 computes
15:33:07 <dansmith> klindgren: that's waaaay low
15:33:10 <rpodolyaka1> mriedem: right, but as rax experience shows, pymysql does not shine on busy clouds :(
15:33:27 <dansmith> rpodolyaka1: I don't think that's what their experience shows
15:33:33 <rpodolyaka1> anyway, are we sure that's a bottleneck?
15:33:35 <mriedem> rpodolyaka1: i think those are unrelated
15:33:36 <harlowja_at_home> \o
15:33:59 <DinaBelova> rpodolyaka1 - not yet, sir. Investigation in progress, we're just collecting ideas of where to look
15:34:04 <alaski> rpodolyaka1: rax hasn't tried pymysql yet. it's on our backlog to test but we don't have any data on it
15:34:05 <DinaBelova> harlowja_at_home - morning sir!
15:34:06 <mriedem> rpodolyaka1: rax uses mysqldb b/c their direct-to-mysql change uses mysql-python
15:34:07 <mriedem> https://review.openstack.org/#/c/243822/
15:34:10 <klindgren> rpodolyaka1, I am getting "Model server went away" errors randomly from nova-computes
15:34:13 <harlowja_at_home> DinaBelova, hi! :)
15:35:07 <rpodolyaka1> alaski: mriedem: ah, I must have confused them with someone else then. I was pretty sure someone blamed pymysql for causing the load on nova-conductors, and that mysql-python was a solution
15:35:23 <DinaBelova> SpamapS wanted to check if switching to some other JSON lib will help, and I'm going to work on this issue as well (probably starting tomorrow)
15:35:24 <dansmith> rpodolyaka1: I'm pretty sure not
15:35:27 <klindgren> dansmith, what would you recommend as the ratio of servers dedicated to nova-conductor to nova-compute?
15:35:28 <rpodolyaka1> ok
15:35:50 <mriedem> DinaBelova: unless you're on python 2.6, i don't know that the json change in oslo.serialization will make a difference
15:35:55 <alaski> rpodolyaka1: we blame sqlalchemy right now :) but are hopeful that pymysql will be better
15:36:01 <rpodolyaka1> haha
15:36:05 <dansmith> klindgren: it all depends on your environment and your load.. but I just want to clarify.. above you seemed to confuse a few things
15:36:16 <dims> alaski lol
15:36:16 <dansmith> klindgren: 250 computes and how many physical conductor machines running how many workers?
15:36:26 <DinaBelova> mriedem - well, SpamapS is experimenting here, I'll probably start with some meaningful profiling
15:36:33 <rpodolyaka1> ++
15:36:33 <DinaBelova> if I am able to reproduce it
15:36:44 <klindgren> 3 physical boxes, one server has 8 cores, the others have 16
15:36:49 <klindgren> running 20 workers each
15:36:55 <dansmith> klindgren: so three total boxes for 250 computes, right?
15:37:00 <klindgren> yep
15:37:05 <dansmith> klindgren: right, so that's insanely low, IMHO
15:37:20 <mriedem> plus, you have $workers > ncpu on those conductor boxes,
15:37:22 <dansmith> klindgren: and the answer is: keep increasing conductor boxes until the load is manageable :)
15:37:36 <klindgren> that's a pretty shit answer
15:37:37 <dansmith> mriedem: well, with mysqldb you have to have that
15:37:40 <klindgren> imho
15:37:41 <DinaBelova> klindgren :D
15:37:41 <kun_huang> hah
15:38:01 <dansmith> klindgren: so run some conductors on every compute if you want
15:38:18 <mriedem> dansmith: although local conductor is now deprecated
15:38:22 <dansmith> klindgren: the load is all the same, conductor just concentrates it on a much smaller number of boxes if you choose it to be small
15:38:31 <DinaBelova> dansmith - heh, afair conductors were created to avoid local conductoring?
15:38:31 <dansmith> mriedem: sure, but they can still run conductor on compute if they don't want upgrades to work
15:38:40 <klindgren> fyi this environment has always been remote conductor
15:38:46 <klindgren> and load only started being an issue
15:38:48 <rpodolyaka1> klindgren: can you run nova-conductor under cProfile on one of the nodes? We haven't seen anything like that on our 200-compute-node deployments
15:38:49 <klindgren> when we went to kilo
15:38:56 <dansmith> klindgren: are you still on kilo?
15:39:05 <klindgren> "still"
15:39:10 <klindgren> liberty *just* came out
15:39:21 <dansmith> klindgren: that's an important detail, maybe you're just experiencing the load of the flavor migrations
15:39:21 <mriedem> hmmm, flavor migrations in kilo maybe?
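[Editor's note: a minimal sketch of the cProfile pass rpodolyaka1 suggests, run here against a stand-in workload since profiling a live nova-conductor requires a deployment; the nova-conductor invocation in the comment is an assumption about a typical setup.]

```python
import cProfile
import io
import pstats

def fake_db_call(n):
    # Stand-in for the per-RPC work a conductor worker does; on a real
    # node you would profile the service itself instead, e.g. something like
    #   python -m cProfile -o conductor.prof /usr/bin/nova-conductor
    # (path depends on the deployment), then inspect conductor.prof with pstats.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
for _ in range(100):
    fake_db_call(1000)
profiler.disable()

# Summarize: which callables consumed the most cumulative time?
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print("fake_db_call" in report)
```

Sorting by cumulative time is what surfaces hot call trees (e.g. SQLAlchemy object loading) rather than just tight inner loops.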
15:39:31 <dansmith> klindgren: that's a hugely important data point :)
15:39:54 <klindgren> we ran all the flavor migration commands after upgrade
15:40:03 <dansmith> klindgren: right, but there is still overhead
15:40:03 <klindgren> btw all of this is in the etherpad
15:40:16 <dansmith> klindgren: and it turned out to be higher than we expected even after the migrations were done
15:40:24 <alaski> even after migrations we still saw overhead as well
15:40:24 <DinaBelova> dansmith, mriedem - yep, these details are in the etherpad as well :)
15:40:24 <dansmith> klindgren: but it's gone in liberty because the migration is complete
15:40:36 <dansmith> DinaBelova: I've read the etherpad and didn't get the impression this was just a kilo thing
15:40:49 <DinaBelova> dansmith, ok
15:41:20 <mriedem> DinaBelova: klindgren: i don't see anything about flavor migrations in the etherpad
15:41:36 <DinaBelova> mriedem - I meant the kilo-based cloud
15:41:41 <dansmith> DinaBelova: I see that they say they started getting alarms after kilo, but the rest of the text makes it sound like this has always been a problem and just now tipped over the edge
15:42:24 <mriedem> yeah, i just added the notes on the flavor migrations
15:42:25 <dansmith> klindgren: so I think you should add some more capacity for conductors until you move to liberty, at which time you'll probably be able to drop it back down
15:42:31 <DinaBelova> mriedem thanks!
15:42:44 <mriedem> fyi on the flavor migrations for the kilo upgrade https://wiki.openstack.org/wiki/ReleaseNotes/Kilo#Upgrade_Notes_2
15:42:55 <dansmith> klindgren: going forward, we have some better machinery to help us avoid the continued overhead once everything is upgraded
15:43:04 <DinaBelova> ok, so any other points for investigators to look at (except flavor migrations and JSON libs)? // not mentioning some profiling to find the real bottleneck //
15:43:13 <dansmith> klindgren: and also, the flavor migration was about the largest migration we could have done, so it almost can't be worse in the future
15:43:23 <dansmith> DinaBelova: I don't think there is a bottleneck to find, it sounds like
15:43:31 <mriedem> DinaBelova: i'm always curious about rogue periodic tasks on the compute nodes hitting the db too often and pulling too many instances
15:43:35 <dansmith> DinaBelova: I think this is likely due to the flavor migrations we were doing in kilo and nothing more
15:43:49 <dansmith> DinaBelova: conductor-specific bottlenecks I mean
15:43:54 <mriedem> but rogue periodic tasks pulling too much data could also mean you need to purge your db
15:44:44 <alaski> dansmith: not conductor-specific bottlenecks, but there are db bottlenecks which conductor amplifies
15:44:49 <DinaBelova> dansmith - that may be a very probable answer, I just want to reproduce the same situation klindgren is seeing, confirm it's about flavor migrations, and check everything is ok on liberty
15:44:52 <dansmith> alaski: yes, totes
15:45:02 <DinaBelova> that is also an answer
15:45:23 <DinaBelova> not mentioning that something interesting may be found in what alaski has mentioned
15:45:50 <DinaBelova> ok, cool.
15:45:56 <dansmith> I shouldn't have said "no bottleneck to find" - I meant that I think the kilo-centric bit that is the immediate problem is flavor migrations
15:46:13 <DinaBelova> dansmith, yep, gotcha
15:46:23 <dansmith> I'm also amazed that they _were_ fine with 2.5 conductor boxes for 250 computes
15:46:51 <klindgren> is it possible to turn off flavor migrations under kilo to see if things get better?
15:47:02 <dansmith> klindgren: not really, no
15:47:11 <mriedem> not configurable, it happens in the code
15:47:16 <DinaBelova> klindgren, suffer :)
15:47:20 <dansmith> klindgren: we can have a back alley chat about some hacking you can do if you want
15:47:34 <klindgren> dansmith, can you provide what in your mind is an acceptable conductor -> compute ratio?
15:47:53 <dansmith> klindgren: and if I may say, the next time you hit some spike when you roll to a release, please come to the nova channel and raise it :)
15:48:14 <DinaBelova> dansmith - I think if klindgren is ok with trying some code hacking, this session will be very useful
15:48:30 <dansmith> klindgren: as I said, there is no magic number.. 1% is much lower than I would have expected would be reasonable for anyone, but you're proving it's doable, which also points to there being no magic number :)
15:49:18 <DinaBelova> #idea check if the issue GoDaddy is facing is related to the flavor migrations or just to a too low conductor/compute ratio
15:49:42 <DinaBelova> klindgren - are you interested in the hacking session dansmith has proposed?
15:49:45 <dansmith> I think it's also worth pointing out,
15:50:01 <dansmith> since my answer was "shit" about having enough boxes to handle the load,
15:50:07 <bauzas> maybe running a GMR ?
15:50:11 <dansmith> that conductor separate from computes is mostly an upgrade story
15:50:30 <harlowja_at_home> just out of curiosity since there isn't a magic number, has any bunch of companies shared their conductor ratios with the world, so we can derive a 'suggested' number from those shared values...?
15:50:37 <klindgren> technically 2 -> 250 was working as well. Adding another physical box didn't actually fix anything, it just resulted in burning cpu on that server as well.
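[Editor's note: dansmith's "1%" is the conductor-box-to-compute ratio implied by the numbers in this log - roughly 2.5 boxes' worth of conductor capacity for 250 hypervisors. A quick sketch of that arithmetic; counting the 8-core box as half of a 16-core one is an assumption made for the estimate.]

```python
def conductor_ratio(conductor_box_equivalents, computes):
    """Conductor-to-compute ratio, as a fraction of the compute count."""
    return conductor_box_equivalents / computes

# Two 16-core boxes plus one 8-core box, counting the 8-core box as half:
boxes = 2 + 0.5
computes = 250
ratio = conductor_ratio(boxes, computes)
print(f"{ratio:.1%}")  # 1.0%
```

As the discussion concludes, there is no universally "safe" value here; the ratio only makes sense relative to the cloud's actual control-plane load.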
15:50:39 <dansmith> if you don't care about that, you can run a few conductor workers on every compute and distribute the load everywhere
15:51:08 <DinaBelova> dansmith thanks for the note
15:51:11 <klindgren> I mean if local conductor is deprecated - and remote conductor is an upgrade story - people are going to need to know a conductor to compute ratio that is "safe"
15:51:16 <DinaBelova> harlowja_at_home - did not hear about that :(
15:51:33 <dansmith> klindgren: 100% is safe
15:51:41 <klindgren> otherwise people are going to be blowing up their clouds
15:51:45 <harlowja_at_home> :-/
15:51:54 <DinaBelova> klindgren probably we need to write an email to the operators email list
15:52:04 <dansmith> klindgren: let me ask you a question.. how many api nodes should everyone run?
15:52:08 <DinaBelova> and try to find out what ratio other folks have
15:52:12 <klindgren> then un-deprecate local-conductor because obviously remote-conductor is not well planned out
15:52:41 <harlowja_at_home> DinaBelova, i'd like that
15:53:03 <DinaBelova> #action DinaBelova klindgren compose an email to the operators list and find out what conductors/computes ratio is used
15:53:23 <mriedem> can you do rolling upgrades with cells though? i thought not.
15:53:31 <DinaBelova> dansmith - well, I guess there is no right answer here :)
15:53:56 <dansmith> DinaBelova: right, that's what I'm trying to get at.. if I never create/destroy instances, I can use one api worker for 250 computes :)
15:54:06 <DinaBelova> dansmith :D
15:54:16 <klindgren> it's almost always been possible in the past to run n-1 in cells
15:54:32 <manand> while we are on the subject of ratios, is this something we should look at across other components, such as the network node to compute ratio etc.?
15:54:37 <dansmith> klindgren: just so you know, we think that's crazy :)
15:54:49 <alaski> klindgren: that has been by chance though. there's no code to ensure it works
15:54:49 <DinaBelova> manand - yep, great note
15:54:51 <dansmith> whether or not it works :)
15:55:14 <DinaBelova> ok, folks, we've spent much time on this item
15:55:15 <mriedem> reminds me of the rpc compat bug in the cells code i saw last week...
15:55:22 <dansmith> yeah
15:55:26 <DinaBelova> it looks like we'll return back to it after the meeting
15:55:54 <DinaBelova> so let's move forward, as we're running out of time
15:55:59 <DinaBelova> #topic OSProfiler weekly update
15:56:34 <DinaBelova> ok, so last time we agreed that if we want to use osprofiler for tracing/profiling needs we need to #1 fix it and #2 make it better
15:56:47 <DinaBelova> harlowja_at_home has created an etherpad
15:56:49 <DinaBelova> #link https://etherpad.openstack.org/p/perf-zoom-zoom
15:56:52 <harlowja_at_home> i put some code up for an idea of a different notifier that just uses files!! :-P
15:56:58 <harlowja_at_home> more zoom zoom
15:56:58 <harlowja_at_home> lol
15:57:05 <DinaBelova> harlowja_at_home - yep, saw it
15:57:30 <DinaBelova> and I left a comment - lemme create a change regarding https://github.com/openstack/osprofiler/blob/master/doc/specs/in-progress/multi_backend_support.rst first
15:57:43 <DinaBelova> so as not to have two drivers for backward compatibility
15:58:00 <DinaBelova> so in short - I was able to make osprofiler work ok with ceilometer events
15:58:06 <harlowja_at_home> cool
15:58:12 <DinaBelova> it's limited now and some ceilometer work needs to be done
15:58:21 <DinaBelova> one of the Ceilo devs will work on it
15:58:31 <DinaBelova> and I've moved to the https://github.com/openstack/osprofiler/blob/master/doc/specs/in-progress/multi_backend_support.rst task
15:58:47 <DinaBelova> harlowja_at_home - I'll ping you once I push the change to gerrit
15:58:51 <harlowja_at_home> kk
15:58:53 <harlowja_at_home> thx
15:58:56 <DinaBelova> so you'll be able to rebase your code
15:58:58 <DinaBelova> np
15:59:02 <harlowja_at_home> sounds good to me
15:59:23 <DinaBelova> boris-42 - did you have a chance to update the osprofiler -> oslo spec?
15:59:31 <DinaBelova> for mitaka?
15:59:56 <DinaBelova> a-ha, I see, not yet
15:59:59 <DinaBelova> #action boris-42 update osprofiler spec to fit Mitaka cycle
16:00:07 <DinaBelova> ok, so we ran out of time
16:00:15 <DinaBelova> any last questions to mention?
16:00:29 <DinaBelova> thank you guys!
16:00:32 <harlowja_at_home> boris-42, where are u!
16:00:34 <harlowja_at_home> come in boris!
16:00:35 <harlowja_at_home> lol
16:00:39 <DinaBelova> :D
16:00:40 <DinaBelova> #endmeeting
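[Editor's note: harlowja's idea of "a different notifier that just uses files" can be pictured as below. This is an illustrative sketch only, not the code actually pushed to the perf-zoom-zoom etherpad review; the class name, `notify` method, and JSON-lines format are all assumptions, and osprofiler's real notifier API may differ.]

```python
import json
import threading
import time

class FileNotifier:
    """Toy profiler notifier: append each trace point as one JSON line.

    Illustrative only - not osprofiler's actual driver interface.
    """

    def __init__(self, path):
        self._path = path
        # Serialize writers so concurrent trace points don't interleave lines.
        self._lock = threading.Lock()

    def notify(self, payload):
        record = dict(payload)
        record.setdefault("timestamp", time.time())
        with self._lock, open(self._path, "a") as f:
            f.write(json.dumps(record) + "\n")

notifier = FileNotifier("trace-points.jsonl")
notifier.notify({"name": "db.query-start", "trace_id": "abc123"})
notifier.notify({"name": "db.query-stop", "trace_id": "abc123"})
```

The appeal of a file backend in this context is zero infrastructure: no Ceilometer or message bus needed to capture a trace, and the JSON-lines file can be post-processed into whatever backend the multi_backend_support spec ends up defining.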