16:02:07 <xarses> #startmeeting Fuel
16:02:08 <openstack> Meeting started Thu Jun 4 16:02:07 2015 UTC and is due to finish in 60 minutes. The chair is xarses. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:02:09 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:02:11 <openstack> The meeting name has been set to 'fuel'
16:02:16 <xarses> #chair xarses
16:02:16 <openstack> Current chairs: xarses
16:02:18 <mattymo> xarses, guest host?
16:02:23 <xarses> morning folks
16:02:27 <mwhahaha> hola
16:02:31 <agordeev> o/
16:02:36 <xarses> mattymo: surprise
16:02:41 <mattymo> I'm shocked. Also, hi
16:02:51 <alex_didenko> hi
16:02:52 <maximov> hi Andrew
16:03:00 <xarses> so let's get started then
16:03:07 <aglarendil> hi
16:03:11 <vkramskikh> hi
16:03:11 <akasatkin> hi
16:03:15 <xarses> #topic HCF status: outstanding issue with fuel-agent (https://bugs.launchpad.net/fuel/+bug/1461532) dpyzhov
16:03:16 <openstack> Launchpad bug 1461532 in Fuel for OpenStack "MCollective has not answered after provisioning and starting of granular_deploy" [High,Confirmed] - Assigned to Vladimir Sharshov (vsharshov)
16:04:00 <xarses> dpyzhov isn't here
16:04:29 <angdraug> warpc__: ^ ?
16:04:41 <xarses> ok, let's come back to it.
16:04:53 <warpc__> Guys, I am still investigating it: https://bugs.launchpad.net/fuel/+bug/1461532
16:04:53 <openstack> Launchpad bug 1461532 in Fuel for OpenStack "MCollective has not answered after provisioning and starting of granular_deploy" [High,Confirmed] - Assigned to Vladimir Sharshov (vsharshov)
16:05:12 <sbog> hi all
16:05:13 <ashtokolov> hi
16:05:23 <ikalnitsky> o/
16:05:25 <dpyzhov> hi
16:05:31 <aglarendil> hi
16:05:32 <IvanKliuk> hi
16:05:35 <xarses> ok, nevermind.
16:05:46 <xarses> dpyzhov: we are talking about https://bugs.launchpad.net/fuel/+bug/1461532
16:05:53 <xarses> and HCF
16:06:01 <angdraug> is that the only remaining HCF blocker?
16:06:08 <maximov> I believe this bug should be fixed before HCF
16:06:34 <maximov> as it can break deployment
16:06:36 <maximov> in some cases
16:06:54 <dpyzhov> In the python team we have a few more open bugs
16:06:57 <maximov> do we know the reason? any ideas?
16:07:30 <dpyzhov> but I think we can announce HCF without fixes for them
16:07:57 <maximov> dpyzhov: I agree, looks like this one is the only bug that we should fix before HCF
16:09:00 <warpc__> Guys, I need 1 more hour for it. At the moment I have only a weak idea about the id which nailgun-agent got after we asked the node
16:10:17 <xarses> Ok, so we want to address this prior to HCF, are there any others we have to get to before HCF?
16:10:18 <maximov> warpc__: ok, please ask our colleagues from the US for help if you feel that you cannot fix it today
16:10:31 <warpc__> maximov: ok, i will
16:10:34 <maximov> I don't think we have any others in fuel
16:11:26 <xarses> angdraug: what about the UCS IBP issue?
16:11:45 <xarses> should we wait for more analysis on that also?
16:12:09 <angdraug> not sure it should be an HCF blocker
16:12:16 <angdraug> link?
16:12:29 <xarses> I don't have one =(
16:12:34 <rmoe> there is a webex scheduled for tomorrow to troubleshoot it
16:12:53 <angdraug> pretty sure there was an LP bug about it, unless I'm confusing it with something else
16:13:05 <rmoe> oh, maybe we're talking about 2 different things
16:13:27 <dpyzhov> xarses: they are working with an old env and we don't have logs, afaik
16:13:32 <rmoe> there is some UCS IBP issue that is going to be looked at tomorrow morning
16:13:36 <angdraug> #link https://bugs.launchpad.net/fuel/+bug/1461126
16:13:36 <openstack> Launchpad bug 1461126 in Fuel for OpenStack "IBP: GBUB failed to recognize very large disk 10+TB" [High,Triaged] - Assigned to Aleksandr Gordeev (a-gordeev)
16:13:38 <angdraug> that?
16:13:46 <agordeev> nope
16:14:16 <agordeev> that one was reproduced on our side
16:14:18 <angdraug> ok, rmoe: please make sure there's a bug reported about it
16:14:29 <rmoe> will do
16:14:41 <angdraug> moving on?
16:14:49 <xarses> #topic Package-based updates delivery (aglarendil)
16:15:10 <aglarendil> so far folks we have set up all the infrastructure and jobs for package-based updates delivery
16:15:36 <aglarendil> we are going to switch to stable/6.1 branches after HCF is called and glue everything
16:15:43 <aglarendil> together by the release date
16:16:08 <mihgen> aglarendil: sounds cool. How can we make doubly sure that patching flows work and application of patches is ok?
16:16:39 <xarses> so will we also deliver fuel-master (and container) updates this way?
16:16:48 <mihgen> I was thinking about the following: why don't we take an ISO which is 2 weeks old, and use current repos as updates
16:16:55 <mihgen> update package by package
16:16:56 <bookwar> aglarendil: do you mean these jobs https://review.fuel-infra.org/#/c/6827/ ?
16:17:00 <aglarendil> mihgen: we're gonna re-review all the code once again
16:17:02 <mihgen> and see what actions have to be done
16:17:29 <aglarendil> and do patching acceptance testing during the last week of the release cycle
16:17:32 <mihgen> just to see how it is gonna look, and ensure we don't miss anything important
16:17:37 <aglarendil> bookwar: yep, these ones
16:18:10 <aglarendil> Mike, obviously we will do code review for jobs and tests several additional times
16:18:12 <aglarendil> do not worry
16:18:40 <mihgen> did we package everything?
16:18:46 <mihgen> including docker containers?
16:19:24 <aglarendil> mihgen: yes, and it was discussed two or three weeks ago already
16:19:28 <mattymo> mihgen, ^ yes. fuel-docker-images package
16:19:52 <mihgen> Just making sure.
16:19:53 <ikalnitsky> honestly, i don't like the idea of packaging docker containers :(
16:20:02 <ikalnitsky> it should be done via a docker registry
16:20:03 <mihgen> question - how did you verify that everything is packaged?
16:20:25 <mihgen> ikalnitsky: we might consider this for later, there were holywars about this…
16:20:50 <mattymo> ikalnitsky, I wanted a registry, but packing and using a registry on the ISO is much slower and consumes more disk space. I really gave it a full effort
16:21:03 <mattymo> almost 2x as slow
16:21:34 <angdraug> just this morning in the osci+linux discussion we recalled the discussion we had about splitting fuel-web packages
16:21:39 <ikalnitsky> mattymo, if we're retrieving packages from some online source, why can't we retrieve docker images from some online source too?
16:21:46 <ikalnitsky> but ok, let's discuss it later
16:21:51 <xarses> mattymo: can we provide a public registry along with the packages?
16:21:58 <mattymo> the registry doesn't transfer data compressed. it'll be a 1.2gb download, just fyi
16:22:15 <mattymo> xarses, yes, but custom isos will be more interesting. final releases are not a big problem
16:22:39 <angdraug> and now we've combined all our containers into a single package, the exact opposite...
16:22:56 <xarses> mattymo: hmm, make the startup process get the specific container hash...
16:22:57 <mattymo> it was a tarball. now it's a tarball in a package
16:23:09 * xarses sighs
16:23:20 <mattymo> xarses, even more metadata to bundle into the iso outside of an RPM. that adds even greater complexity
16:23:39 <xarses> yep
16:23:53 <aglarendil> folks, it was part of a discussion several months ago - you should have raised your voices then, not now
16:24:04 <mihgen> aglarendil: nurla: so how did we verify that everything is packaged?
16:24:11 <angdraug> a tarball in a package doesn't sound like a full effort
16:24:44 <aglarendil> mihgen: Mike, there is a noop test for this
16:24:49 <mattymo> angdraug, I was referring to the implementation of a registry local on the Fuel Master instead of exported and compressed images
16:25:14 <aglarendil> also, if we miss anything, we will be able to ship this stuff in a package later - there is no risk
16:25:49 <mihgen> ok thanks
16:26:07 <angdraug> also no mention of fuel-docker-images in the patching spec
16:26:07 <xarses> moving on?
16:26:18 <angdraug> ok
16:26:34 <xarses> let's follow up on the ML about it
16:27:01 <xarses> #action follow up on the ML about possible improvements to fuel-docker-images packaging
16:27:11 <xarses> #topic role-as-a-plugin (ikalnitsky)
16:27:35 <maximov> QA implemented a few test jobs for fuel-* repos, first of all for the library, and they run tests for master node packages and other parts of the library
16:27:41 <ikalnitsky> ok, guys, honestly i don't have much info to share. all i can say is that we have started thinking about scope
16:27:53 <maximov> sorry, it was the previous topic
16:27:57 <ikalnitsky> and Andrii Popovich has started writing a spec
16:27:59 <ikalnitsky> here's the link
16:28:02 <ikalnitsky> #link https://review.openstack.org/#/c/185267/
16:28:20 <ikalnitsky> unfortunately, i haven't reviewed it yet.. and i ask you guys to review it when you get free time.
16:28:37 <ikalnitsky> in brief, i can highlight some statements.
16:29:11 <ikalnitsky> first, role-as-a-plugin is about exporting some role with disk partitioning and deployment tasks
16:30:05 <ikalnitsky> second, this blueprint is not about changing the deployment tasks of other roles. i.e. the plugin won't be able to remove some tasks from compute (if we have a node with both roles)
16:30:24 <ikalnitsky> that's it.
16:30:34 <ikalnitsky> questions?
16:30:37 <mihgen> ikalnitsky: what about network roles?
16:30:40 <xarses> so will the plugin role's tasks be included in the normal deployment graph or still be limited to pre- and post-deployment?
16:30:56 <ikalnitsky> xarses, it will be included in the normal deployment graph
16:31:10 <alex_didenko> cool
16:31:32 <mihgen> if you can insert it in the deployment graph (which is cool), why can't we override/remove some tasks then?
16:31:36 <ikalnitsky> mihgen, we don't consider network-role-as-plugin. i think it's another story and it should be implemented after advanced networking
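A rough sketch of the kind of role export ikalnitsky describes above — a plugin shipping a role together with its disk partitioning and deployment tasks, so that the tasks land in the normal deployment graph. The structure, role name, and volume ids below are hypothetical and only illustrate the idea; the spec linked above is the authoritative reference.

```yaml
# Hypothetical role-as-a-plugin metadata; names and layout are illustrative only.
roles:
  standalone-rabbitmq:
    name: "Standalone RabbitMQ"
    description: "Runs the messaging service on its own nodes"

volumes:                          # disk partitioning exported by the plugin
  standalone-rabbitmq:
    - id: "os"
    - id: "rabbitmq-data"

tasks:                            # deployment tasks wired into the main graph
  - id: rabbitmq-standalone
    type: puppet
    role: [standalone-rabbitmq]
    requires: [netconfig]
    required_for: [deploy_end]
    parameters:
      puppet_manifest: puppet/manifests/rabbitmq.pp
      timeout: 3600
```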
16:32:01 <ikalnitsky> we can't remove tasks because it requires injecting some python code
16:32:10 <ikalnitsky> currently, plugins are unable to do it
16:32:12 <xarses> ikalnitsky: we need to implement them together
16:32:16 <ikalnitsky> because of many reasons
16:32:40 <xarses> ikalnitsky: we will need to think of a DSL to replace / remove other tasks if this role provides it
16:32:42 <ikalnitsky> i think it should be done in a separate blueprint.. i mean python code in plugins
16:32:50 <mihgen> xarses: you'd need core to have the functionality before you can write a plugin using it
16:32:53 <mattymo> maybe you can do it such that a plugin can override a specific task?
16:33:12 <xarses> for example, a role-as-a-plugin would provide rabbitmq
16:33:17 <mattymo> just like a hierarchy where plugin tasks have higher priority
16:33:30 <xarses> it must be able to remove the rabbitmq task from the controller role
16:33:40 <ikalnitsky> hm.. you propose to override it by name?
16:33:44 <ikalnitsky> or remove it by name?
16:33:51 <xarses> maybe provide the same name
16:34:26 <xarses> and that it's in the graph somewhere may be OK.
16:34:54 <ikalnitsky> i didn't think about it.. but at first glance the idea of overriding tasks by name looks ok
16:35:09 <ikalnitsky> but i don't know our current limitations.
16:35:24 <mihgen> ikalnitsky: take a look please.. it can help our partners a lot
16:35:28 <xarses> I think we might have to change the tasks slightly so that all tasks are evaluated at the same time regardless of role, so that many roles may be orchestrated at once
16:35:49 <ikalnitsky> mihgen, yep, i'll
16:36:50 <mihgen> moving on?
16:37:06 <xarses> #topic flexible (advanced) networking feature status (akasatkin)
16:37:17 <akasatkin> Hi
16:37:21 <akasatkin> Design is in progress. It is mostly clear what tasks are to be done within this feature.
16:37:28 <akasatkin> It was proposed recently to change some relations (a network role is connected to a task instead of a node role, network role definitions are stored in manifests instead of nailgun). There were discussions on that. I hope most questions will be finally resolved on Monday (we have a meeting arranged).
16:37:40 <akasatkin> There was a discussion on Admin-PXE network limitations also (the Admin-PXE role is always mapped to it, it cannot be deleted or moved, it cannot be included in a lacp bond). It is not a design limitation though. It will be described in metadata and can be adjusted. It can be removed completely after appropriate testing of the final implementation. The library team is not sure we can just drop these limitations right now.
16:37:52 <akasatkin> We are going to start (hopefully tomorrow) implementation of the first part: merge existing (and planned) vanilla network cases into one (with the Neutron ML2 agent). So, there will be an ability to have several VLAN/VxLAN/GRE backends in one env. It will not cover the UI at first.
16:38:03 <akasatkin> The spec is https://review.openstack.org/#/c/115340/ . Basic considerations on the DB schema and new API are in the spec already. Tasks are defined, some of them can be solved in parallel (in the nailgun part). Welcome to review. It is not yet addressed there that a network role is connected to a task (ETA - Tuesday).
16:38:42 <akasatkin> Questions?
16:38:59 <xarses> thanks
16:39:15 <mihgen> thanks akasatkin.
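A hypothetical illustration of the "network role is connected to a task" relation akasatkin describes: the task itself declares the network roles it needs, instead of that mapping living on the node role. The network_roles field and the example values are assumptions made up for this sketch, not the format from the spec.

```yaml
# Illustrative only: a task declaring the network roles it requires.
- id: rabbitmq
  type: puppet
  role: [controller]
  network_roles:                  # hypothetical field name
    - mgmt/messaging              # the endpoint this task provides/listens on
  parameters:
    puppet_manifest: puppet/manifests/rabbitmq.pp
    timeout: 3600
```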
16:39:34 <mihgen> I was thinking about what network roles to create and where to reuse
16:40:04 <mihgen> so I think that if your deployment task provides any service, like rabbitmq, then you must create a new network role for it
16:40:37 <mihgen> if your deployment task just consumes the services (like an oslo client to rabbit) - you just refer to the existing network role
16:40:55 <xenolog> Yes, by design each plugin should have the ability to add a number of new network roles.
16:40:56 <mihgen> so the corresponding network would be created at the place where you need to consume the services
16:41:31 <mihgen> another thing which just came up, and we will have a meeting tomorrow on this
16:41:43 <mihgen> is that a deployment task may require different sets of network roles
16:41:56 <mihgen> depending on some other metadata, for instance a checkbox on the settings tab
16:42:07 <mihgen> set*, not sets
16:42:17 <akasatkin> actually, networks will be set up once for the entire node, xenolog ?
16:42:18 <mihgen> so we need to think how to better handle this
16:42:48 <akasatkin> yes, there are choices here
16:42:54 <xenolog> each task should have a list of NetworkRoles that will be used.
16:43:21 <ikalnitsky> maybe if the task requires various sets then it should be decomposed into a few tasks?
16:43:35 <ikalnitsky> s/sets/netroles/g
16:43:38 <mihgen> so it's gonna be the same for a plugin too: if you provide a new service, then you'd need to create a new network role and associate it with the task
16:44:02 <mihgen> if your plugin just consumes some service, you'd need to refer to the existing network role which is gonna be created by the provider task
16:44:29 <ikalnitsky> that's why folks want to expose required network roles in tasks. that means we can get it from plugins with the same mechanism.
16:44:38 <mihgen> ikalnitsky: we thought about it. decomposition is unlikely to happen
16:44:46 <mihgen> you may have a large puppet module
16:44:52 <mihgen> and it will be duplication of code
16:45:04 <ikalnitsky> we can decompose it on the "tasks.yaml" level
16:45:04 <mihgen> it's gonna be quite hard to separate in many cases
16:45:13 <ikalnitsky> just two different declarations with different conditions
16:45:34 <mihgen> let's talk tomorrow more about it. are you invited to the call?
16:45:40 <ikalnitsky> nope
16:45:48 <mihgen> ok, I'll send an invite
16:45:53 <ikalnitsky> thanks
16:45:59 <akasatkin> conditions are in the release, far from the task definitions though..
16:46:23 <akasatkin> i mean it is additional work to sync them
16:47:11 <mihgen> yeah we need to figure this out. I kinda have a general concept in my mind, but not sure how it will map onto what we have and what I'm not aware of
16:47:24 <mihgen> moving on folks? 13 min left..
16:47:30 <mattymo> +1 moving on
16:47:47 <xarses> #topic Granular deployment enhancements feature status (mattymo) - https://blueprints.launchpad.net/fuel/+spec/detach-components-from-controllers
16:47:52 <mattymo> thanks xarses
16:48:04 <mattymo> So, expanding granular deployment is meant to meet the needs of users with massively scaled deployments. This means creating custom roles for deployment where controller services are split across 2 or more distinct node roles.
16:48:14 <mattymo> For example, keystone and horizon on one role, but nova/cinder/glance/neutron/ceilometer/heat on another.
16:48:24 <mattymo> adidenko started work on this a little while ago already. I'm coming on board starting this week, so now it's just a matter of scoping and understanding all the tasks. Right now, he's working on unmarrying all the services from the "controller" role, so that they can be deployed independently as granular tasks.
16:48:59 <mattymo> So far, we've found 2 issues that slow implementation here. One is role-as-a-plugin, which I found out just now is underway
16:49:20 <mihgen> why is it related?
16:49:27 <mihgen> role-as-plugin?
16:49:28 <mattymo> the other is the constraint for "at least one controller" - which is a problem if you now have these new foo1 and foo2 roles that split the controller role
16:49:42 <mattymo> it's related because we can't define custom roles as a plugin. it must be done with a yaml and a fuel cli command
16:50:17 <mattymo> it's not stopping development, just making it a little less accessible to users
16:50:25 <mattymo> but now there's this other feature, so that covers it
16:50:33 <xarses> mattymo: afaict the restriction is simply in the release yaml, we could prepare / modify the release yaml to remove it
16:50:37 <mihgen> mattymo: understood. But we can provide a set of yamls / set of commands to the user, right?
16:50:42 <alex_didenko> in the scope of our BP we only add the possibility to detach components, but the real detachment should be done in plugins
16:50:53 <alex_didenko> so they're kind of connected
16:50:54 <xarses> the bigger issue i had when removing the 'controller' role is that there are some bad lookups done for it
16:51:06 <mattymo> xarses, we're going to address that piece by piece
16:51:08 <xarses> in other manifests, like compute
16:51:27 <mihgen> mattymo: the second issue you referred to, at least one controller - are we in sync with nailgun folks to remove this restriction?
16:51:32 <mattymo> mihgen, nope
16:51:37 <mihgen> for such granular cases
16:51:43 <mattymo> I am still gathering all requirements to present to the python team
16:51:55 <alex_didenko> we have a workaround for it to patch a live Fuel master node :)
16:52:08 <xarses> mihgen: in this case, i think a plugin is allowed to update data in the release
16:52:20 <xarses> maybe strange
16:52:39 <mihgen> alex_didenko: let's get someone from the python team to merge it into master..
16:53:13 <xarses> moving on?
16:53:22 <sbog> +1
16:53:23 <alex_didenko> we'll take care of that, Matt will gather a list of reqs
16:53:27 <alex_didenko> one sec
16:53:32 <alex_didenko> Using patches from the detach-components-from-controllers blueprint and Hiera overrides, we were able to deploy rabbitmq+mysql separately in their own corosync cluster. Also we were able to deploy and run the following controller services separately from rabbitmq/mysql/keystone: nova, cinder, heat, ceilometer, neutron. Currently we're working on: keystone, horizon, glance and swift. Some of them are on review already with a WIP tag.
16:53:41 <alex_didenko> now we can move on :)
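A minimal sketch of the Hiera-override approach alex_didenko describes: overrides that point the remaining controller services at nodes carrying a detached role instead of the controllers. The file path, keys, role name, and addresses here are hypothetical; the real keys depend on the fuel-library manifests.

```yaml
# Hypothetical /etc/hiera/override/detach.yaml
# Messaging and database now live on their own corosync cluster.
amqp_hosts: "10.20.0.11:5673, 10.20.0.12:5673, 10.20.0.13:5673"   # illustrative addresses
database_vip: 10.20.0.10            # VIP of the detached mysql cluster (illustrative key)
corosync_roles:
  - standalone-rabbitmq-mysql       # made-up role name for the example
```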
16:53:50 <xarses> #topic SSL feature status
16:54:00 <sbog> Hello again folks
16:54:06 <sbog> Since we have mostly decided on the scope, I've been writing some code.
16:54:14 <sbog> ssl for the master node - PoC done (needs a rewrite to generate the cert with some domain name in the CN)
16:54:16 <mihgen> thanks alex_didenko, excellent progress!
16:54:21 <sbog> ssl for openstack endpoints - key generation PoC done (need to rewrite the astute part a little to push the public VIP hostname into the key's CN, planned to be done this week)
16:54:27 <sbog> key distribution done, a new haproxy module for the nodes that supports SSL done, and adapting it done
16:54:31 <sbog> patches to create https public endpoints in keystone for most of the services (all except sahara, ceilometer, murano) done
16:54:35 <sbog> code that applies ssl to the haproxy public VIP done
16:54:40 <sbog> UI part mostly done (need to add one field to point to the public VIP hostname - easy task, I'm working on it now; need to add a field to upload user certs - I asked Vitaly K., he said it will take about one and a half weeks to be done)
16:54:48 <sbog> Also some tweaks for haproxy and nginx that will exclude unsafe ssl mechanisms need to be done
16:55:12 <mihgen> sbog: so do we allow a user to upload his own certs?
16:55:46 <sbog> Yes, I spoke with the UI team about it - they will do the patch to the UI, I'll write the logic to check the user cert
16:55:55 <alex_didenko> where exactly do we terminate SSL for endpoints? in haproxy?
16:56:03 <sbog> Yes, in HAProxy
16:56:20 <sbog> It is the only scalable solution, actually
16:56:29 <alex_didenko> ok, thanks
16:56:37 <mihgen> sbog: sounds great. and there is gonna be an option to use self-generated certs, and do I understand right that you already have code for it (need rewrite astute part a little to push public VIP …)
16:56:43 <sbog> Terminating SSL on the OpenStack services is a bad idea now - it is not scalable
16:57:05 <sbog> Yes, I already have code to generate self-signed certs
16:57:20 <mihgen> sbog: in 7.0 we would get keystone behind apache, would it have any impact on your work?
16:57:31 <sbog> I need something from the astute part to get it done - and I'm working on it right now.
16:57:44 <sbog> I don't think so
16:57:47 <xarses> moving on?
16:57:57 <sbog> couple secs
16:58:19 <sbog> We can push ssl certs to apache for keystone - it should work ok too, so there is no problem, I think
16:58:25 <alex_didenko> keystone is behind haproxy so moving it under apache won't affect SSL
16:58:28 <dstanek> sbog: can't we use one of the terminators like stud?
16:58:50 <sbog> We can, yes. Tests showed that stud works well, actually
16:59:29 <sbog> But we already have HAProxy, so I decided not to add complexity
16:59:39 <mihgen> dstanek: any specific benefit of using it?
16:59:47 <angdraug> time
16:59:58 <angdraug> let's move to #fuel-dev
17:00:01 <dstanek> mihgen: simplicity
17:00:10 <xarses> #endmeeting