16:01:10 #startmeeting Fuel
16:01:11 Meeting started Thu Aug 13 16:01:10 2015 UTC and is due to finish in 60 minutes. The chair is kozhukalov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:15 The meeting name has been set to 'fuel'
16:01:17 sorry fuel guys, we'll get out
16:01:20 #chair kozhukalov
16:01:21 hi
16:01:21 Current chairs: kozhukalov
16:01:23 hey
16:01:24 hi
16:01:28 \o
16:01:30 hey everyone
16:01:30 hi
16:01:31 hi
16:01:31 hi
16:01:32 hi
16:01:34 hi
16:01:36 hi
16:01:39 hi
16:01:41 Hi
16:01:46 aloha
16:01:59 #topic Ubuntu-based bootstrap (msemenov)
16:02:04 hi all
16:02:12 The Ubuntu bootstrap feature needs more testing right now. I've just written an email to openstack-dev about it.
16:02:20 Currently, we have 2 bugs from the scale team:
16:02:21 hi
16:02:24 \~/
16:02:28 https://bugs.launchpad.net/mos/+bug/1482049
16:02:28 Launchpad bug 1482049 in Mirantis OpenStack "Ubuntu boostrap uses wrong network interface to retrive the root filesystem image" [High,In progress] - Assigned to Alexei Sheplyakov (asheplyakov)
16:02:35 https://bugs.launchpad.net/fuel/+bug/1481721
16:02:35 Launchpad bug 1481721 in Fuel for OpenStack "Only 100 of 200 nodes booted successfully with Ubuntu based bootstrap" [High,In progress] - Assigned to Alexei Sheplyakov (asheplyakov)
16:02:41 and we have added bvt_with_ubuntu_botstrap to SWARM:
16:02:48 http://jenkins-product.srt.mirantis.net:8080/job/7.0.system_test.ubuntu.bvt_ubuntu_bootstrap/
16:03:00 so, we need to test more, please help us.
16:03:19 hi
16:03:26 next week we want to switch swarm tests to the new ubuntu bootstrap
16:03:44 msemenov: bugs sound pretty serious
16:03:50 msemenov, do you mean we need manual testing?
16:03:50 because we have only 2 weeks left to decide whether to make Ubuntu bootstrap the default or not
16:03:59 first one is perhaps fixed?
16:04:10 mihgen: but unreproducible right now
16:04:39 I'd vote for enabling it by default then if we can't reproduce the bugs and there are no known blockers
16:04:55 kozhukalov: yes, we need everyone to just try to deploy (switch to the new bootstrap) on different hardware. Especially partners and plugin developers.
16:04:58 which would prevent us from testing other functionality
16:05:24 mihgen: then -> when?
16:05:32 the only way is to just force everyone by making it default )
16:05:35 mihgen: yes, only switching it to default right now can give us a big amount of testing
16:05:56 mihgen: so, if you approve, Alexei will prepare a change request
16:06:10 angdraug: ? I'm saying that we should rather switch now
16:06:20 I'd like to hear other opinions
16:06:39 how much will it disrupt swarm?
16:06:39 any objections?
16:06:44 +1 for switching right now, early - better
16:06:56 anyone from QA?
16:07:24 nurla, around?
16:07:38 I had a conversation with nurla today. The main concern from her side is partners and plugin devs
16:07:46 I don't object, as long as it doesn't cause bvt disruption or swarm regression
16:07:51 so, we decided to post it to openstack-dev
16:08:01 do we have plugins that rely on the centos based bootstrap?
16:08:02 I prefer to use the centos bootstrap by default in 7.0 and make the switch only in 8.0
16:08:37 angdraug, probably mlnx
16:08:41 we have another team that uses a custom centos bootstrap
16:08:57 I can ping mlnx guys
16:09:03 which team? they still have time to switch
16:09:03 any others?
16:09:08 angdraug, they built their custom bootstrap, as far as i remember
16:09:13 also, we don't remove centos, do we?
16:09:22 so it would be easy to enable if someone needs it
16:09:26 mihgen: correct
16:10:07 nurla: what about bvt and swarm? what's the expected impact?
16:10:14 I don't know why we would slow down a change.. we better try and then see who complains
16:10:22 +1
16:10:25 we don't even know who could be broken now
16:10:36 I need to talk with my engineer
16:11:20 so folks - what's the decision? we've got to move on
16:11:20 we should write code for switching, actually we don't have such functionality in swarm
16:11:26 ok, it looks like we are pretty much done on this topic
16:11:30 moving on?
16:11:50 nurla: what code?
16:11:59 why do you need to write anything in swarm?
16:11:59 nurla: but from the customer's side, we already wrote a script for switching
16:12:12 and also you can do it in fuelmenu
16:12:12 for switching from centos to ubuntu bootstrap
16:12:17 swarm should not even be aware whether it's centos or anything else
16:12:31 you just pxeboot and get what's given by default
16:12:51 we use Alexey's patch for switching, it isn't a production way
16:13:40 nurla: mihgen but it seems we are speaking now about making Ubuntu bootstrap the default.
16:13:43 kozhukalov: msemenov Do we have code in master for switching on a live fuel master?
16:13:45 one impact I can think of is more time to generate the bootstrap image for each master deploy test
16:13:57 mihgen: yes
16:14:22 steps are the following
16:14:23 1. Make sure the master node can access the Ubuntu (http://archive.ubuntu.com/ubuntu) and MOS (http://mirror.fuel-infra.org/mos-repos) APT repositories.
16:14:23 2. Run the following command on the master node: fuel-bootstrap-image-set ubuntu
16:14:23 3. Just in case, restart dnsmasq: dockerctl shell cobbler service dnsmasq restart
16:14:27 angdraug: we could pregenerate one for our tests.. but this is where it will take some QA resources
16:14:27 4. Reboot the slave nodes.
16:14:29 just for information
16:14:45 actually I think we already absorbed this impact
16:14:56 mihgen, live fuel master? i don't even understand this
16:15:14 correct me if I'm wrong: we're generating the ubuntu bootstrap even though it's not used by default
16:15:32 angdraug: yes
16:15:32 kozhukalov: he means, change it after the master node is deployed
16:15:37 angdraug, correct
16:15:51 folks, 15 minutes -- lets move on
16:15:56 and yes, we can change it after deploy - steps are above
16:16:05 a feature with Critical priority should not affect GA as a whole
16:16:20 ok, so Alexei will prepare a request for making the ubuntu-based bootstrap the default
16:16:35 and I'll write to mos-dev about the merge
16:16:38 ok?
16:16:42 this image is built during master node deployment, as far as i understand; after deployment we can again run fuelmenu and choose another bootstrap flavour
16:16:50 and then redeploy
16:17:18 it'll change the cobbler configuration so it points to the ubuntu profile
16:17:29 ubuntu bootstrap profile
16:17:49 moving on
16:18:03 #topic CentOS 6.6 based Fuel Master (amnk)
16:18:17 this patch has been merged
16:18:22 yep
16:18:26 is there any outstanding impact or cleanup that still needs to be done?
16:18:30 no
16:18:47 that was quick :)
16:18:47 so, we are on 6.6 now
16:18:55 moving on?
16:18:55 yep, we are on 6.6
16:18:57 amnk: please check documentation... I'm sure there are refs to 6.5 still
16:19:08 amnk: thank you for this work
16:19:09 +1
16:19:10 mihgen: ok, I'll double-check it
16:19:37 #action amnk checks references in docs on 6.5
16:19:43 moving on
16:19:56 #topic SSL support status (sbog)
16:20:28 sbog, around?
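The four bootstrap-switch steps quoted during the Ubuntu-based bootstrap topic above can be collected into one script. This is a sketch based only on the commands mentioned in the meeting, intended to be run on a Fuel 7.0 master node; the repository-reachability check with `curl` is an added assumption, everything else is taken from the log:

```shell
#!/bin/sh
# Sketch of the bootstrap switch discussed in the meeting (steps 1-4).
# Assumes a Fuel 7.0 master node with fuel-bootstrap-image-set and
# dockerized cobbler available; not runnable outside that environment.
set -e

# 1. Make sure the Ubuntu and MOS APT repositories are reachable.
for repo in http://archive.ubuntu.com/ubuntu http://mirror.fuel-infra.org/mos-repos; do
    curl -fsI "$repo" > /dev/null || { echo "cannot reach $repo" >&2; exit 1; }
done

# 2. Switch the default bootstrap flavour to Ubuntu.
fuel-bootstrap-image-set ubuntu

# 3. Just in case, restart dnsmasq inside the cobbler container.
dockerctl shell cobbler service dnsmasq restart

# 4. Reboot the slave nodes (out-of-band, e.g. via IPMI or the Fuel CLI).
echo "Now reboot the slave nodes so they pick up the new bootstrap image."
```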
16:20:41 looks like he's not here
16:20:44 moving on
16:20:51 sbog: ^^
16:21:09 #topic Trove plugin plans (bmidgley)
16:21:15 hi all
16:21:24 hi
16:21:26 sorry I came in with a different id, I'm bmidgley
16:21:30 not a bot
16:21:40 :)
16:21:57 so I saw mihgen in https://bugs.launchpad.net/fuel/+bug/1403384 saying that trove in fuel would be entirely customer driven
16:21:57 Launchpad bug 1403384 in Fuel for OpenStack "Create Trove plugin for Fuel" [Medium,Invalid] - Assigned to sbartel (samuel-bartel)
16:22:06 is that the current place we are at?
16:22:21 we need to build it in our MOS cloud, investigating options
16:22:51 do you need trove, or dbaas?
16:23:09 we want trove so we get the db management too
16:23:10 flamebot: sounds like it is. Anyone from the partners team?
16:23:13 iirc dbaas can be achieved via murano
16:23:40 we are actually investigating that in a parallel track but people are asking for trove management
16:24:14 I have not heard if anyone else wants to work on a Trove plugin...
16:24:35 flamebot: this is the 1st time I hear about this plugin
16:24:52 so looks like it's yours :) Do you have any issues with trying to build it yourself?
16:25:00 several paths are set out
16:25:13 plugin, package
16:25:13 looks like a trove package and some deployment puppet code would be needed
16:25:16 flamebot: first I'm hearing too, although Irina_p usually knows before I do ;-)
16:25:27 ipackage is preferred for upstream inclusion?
16:26:04 ipackage?
16:26:49 so the question is do you need the trove package to be in the fuel/mos package mirrors?
16:26:58 typo, just making sure an uncertified plugin was the ideal target
16:27:09 you could create the package yourself and deliver it as part of the plugin
16:27:22 and you can actually certify the plugin afterwards
16:27:42 Irina_P: we don't have any limitations on whether the package has to be on our mirrors or part of the plugin, do we?
16:27:52 mihgen: we don't
16:28:01 a plugin can include its own package repo
16:28:31 flamebot: we could do code review and check if everything is okay.. then you could put it into DriverLog or to fuel-infra
16:28:38 flamebot: so I think for this type of extension, the pluggable architecture just fits the best
16:28:47 flamebot: and sure - get this validated by us
16:29:21 ok we are parsing this :)
16:29:22 Irina_p, what about testing?
16:29:52 are we going to run tests on mos ci?
16:30:01 kozhukalov: we could put this into DriverLog/fuel-infra if flamebot at least runs some baseline tests
16:30:23 kozhukalov: plugins should have their own CI or at least manual testing performed
16:30:43 Irina_p, thanx for the clarification
16:30:50 moving on?
16:31:05 #topic RabbitMQ (dmitryme)
16:31:15 flamebot: feel free to ask questions in openstack-dev if any..
16:31:20 flamebot: feel free to ask in fuel-dev then about details, will be happy to help
16:31:28 hello guys, should I tell you our current status?
16:31:28 thanks
16:31:30 yeah or in #fuel-dev IRC
16:31:35 * status on RabbitMQ
16:31:51 dmitryme, yes, please
16:31:57 if you have
16:32:00 sure
16:32:50 RabbitMQ 3.5.4 was merged last Saturday. Before that we ran tests with a custom ISO with the new RabbitMQ on a 50-node lab and didn't spot any degradation in behaviour
16:33:26 so, basically that is it. Starting from this week all new ISOs contain the new RabbitMQ
16:33:42 Have there been any QA issues related to this merge that you know of?
16:34:12 what are the main pros of using the new Rabbit?
16:34:26 HA?
16:34:50 the main advantage is that we get the freshest fixes + we get community support
16:35:07 the community is not really interested in discussing issues in old RabbitMQ
16:35:29 dmitryme, ok, i see
16:35:42 thanx for your update
16:35:49 any q here?
16:35:53 moving on
16:36:05 mihgen: discussion yesterday re: RMQ, do we need to follow an SCF exception process next time?
16:36:25 SheenaG: yes we certainly need to ask for an exception next time.
16:36:35 no major pkg upgrades should be done after FF
16:36:43 mihgen: we will definitely do that
16:36:45 Okay, noted. Anything after FF we will raise an exception to openstack-dev.
16:36:52 thanks.
16:37:08 I hope that our HA will become even better with the new MQ :)
16:37:09 RMQ 3.5.4 is much more stable
16:37:27 #topic Keystone wsgi issues
16:37:33 many many fixes were added, and its durable queues work as expected by design
16:37:35 I'm looking for some concrete numbers on downtime in sec, etc...
16:37:59 Guys, we have a bunch of problems with WSGI
16:38:26 sgolovatiuk, bugs, links?
16:38:31 we have process-based WSGI, which creates a huge number of connections to MySQL and memcache
16:38:38 I remember we had to revert it in 6.1...
16:38:54 all reviews are linked https://review.openstack.org/#/c/209589/3
16:38:56 sgolovatiuk: was it discussed with the Keystone team / puppet-openstack?
16:39:00 bugs are included
16:39:23 also wsgi blocks apache, sometimes it doesn't want to stop
16:39:31 just hanging
16:40:07 we made a number of reviews that should fix the apache problems
16:40:25 I am testing all of them on a custom ISO
16:40:40 Over the weekend the scale team will test it at scale
16:40:40 +1 to mihgen, who did you talk to about this?
16:41:03 if there are no improvements, we'll be reverting back to eventlet
16:41:28 mihgen: I am working with A. Makarov and B. Bobrov
16:41:38 they made a fix for the WSGI module
16:41:48 added a critical section
16:42:02 sgolovatiuk: +1 on the plan but only if you guys discuss it with keystone upstream
16:42:11 it looks like it helped with the thread-based model as well as apache stop operations
16:42:38 anyhow, I am working with them to provide a stable keystone service under Apache WSGI
16:43:15 I will have more results after the rally and destructive tests we are going to do on the scale lab over the weekend
16:45:00 i think mihgen means that your cooperation with those guys should be reflected in the ml or at least in irc logs
16:45:01 sgolovatiuk: thank you. I hope you guys nail it down..
16:45:12 I hope so :)
16:45:19 ok, moving on
16:45:40 #topic Keystone openrc issues (LP: #1479879)
16:45:41 Launchpad bug 1479879 in Fuel for OpenStack "fix auth_url in keystone providers for v3/v2.0 api split" [High,In progress] https://launchpad.net/bugs/1479879 - Assigned to Matthew Mosesohn (raytrac3r)
16:45:47 ohh, keystone again
16:46:00 :)
16:46:23 mattymo_: around?
16:46:39 let me ping him
16:46:45 angdraug, yes sorry for the delay
16:46:49 amaksimov, no worries
16:47:31 so it seems that there is a healthy debate on openstack-dev and on the puppet-keystone review about how we should approach openrc/env vars being set and then trying to create keystone_* resources
16:48:21 several folks believe that internalurl and publicurl should be set in the service catalog with no version in keystone...
but that so far doesn't seem to work either
16:48:52 so in the interests of meeting HCF on time, I think we should fork from upstream for this commit until they've decided what is best and what meets the most people's needs
16:49:18 at the same time, I will continue pushing this patch upstream since it seems to be necessary
16:49:56 mattymo, thanx for the status
16:49:56 mattymo_: +1 to fork it, we need to test it before HCF..
16:50:27 it's well tested for Fuel's purposes, mihgen, but not enough for acceptance tests upstream
16:50:34 https://review.openstack.org/#/q/topic:bug/1479879,n,z for those of you who want to see the patches
16:51:18 moving on?
16:51:22 yes, please kozhukalov
16:51:39 #topic default network allocation for CEPH-based installations (xenolog)
16:51:48 hi, guys!
16:52:04 I want to ask about the storage network and its usage
16:52:14 In 7.0 we have admin, management, public, and storage networks by default.
16:52:25 If a CEPH-based deploy was chosen, the storage network will be used for ceph replication and the management network for ceph-public (access from VMs to RBD).
16:52:25 Do we need to change this default scheme in 7.0?
16:53:11 Some people complain about why storage traffic goes through the management network
16:53:54 can we now have 2 storage networks in 7.0?
16:54:04 no
16:54:13 so what exactly can we change?
16:54:37 We can leave it all as is.
16:54:51 Or change the default scheme
16:54:53 moving RBD client traffic to the storage network and combining it with RADOS replication traffic isn't recommended by Ceph
16:55:04 yes.
16:55:06 can we at least have example templates which would do the right schema
16:55:24 there was a problem with cluster/rabbit/HA glitches due to overload of the management network by VM-to-RBD traffic
16:56:04 in the most typical case with dedicated compute and ceph-osd nodes,
16:56:33 computes don't need rados replication so they can use storage for rbd traffic
16:56:39 why can't we have two storage networks?
16:56:47 ceph-osd nodes don't need public, so they can use that for rbd traffic
16:57:11 can we provide a template that does different role allocation for compute and osd nodes?
16:57:24 Maybe a "move ceph public traffic to the storage network" checkbox in the settings tab is a solution for 7.0?
16:57:26 different network _role_ allocation...
16:57:50 xenolog13: that's stealing from Peter to feed Paul...
16:58:27 2 minutes
16:58:30 +1 to kozhukalov's question, why can't we add a new network type yet? is it out of scope of network templates in 7.0?
16:59:09 and time :(
16:59:11 because SCF was done
16:59:31 yeah I'd say it is too late for any major changes
16:59:37 ok, let's move our discussion somewhere else
16:59:42 so I vote for providing better docs, and a template example..
16:59:51 so people can use templated networking if needed
16:59:55 thanx everyone for attending
16:59:59 closing
17:00:08 #endmeeting
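The template example voted for at the end of the Ceph topic would, in spirit, map the ceph network roles onto different networks per node role: computes keep RBD traffic on the storage network, while ceph-osd nodes reuse the public network for RBD and keep replication on storage. The fragment below is a purely hypothetical YAML sketch of that intent; the key names are illustrative and do not claim to match the actual Fuel 7.0 network-template schema:

```yaml
# Hypothetical illustration of the per-role network mapping discussed above.
# Key names are illustrative only; this is not the real Fuel 7.0 schema.
network_roles:
  compute:
    ceph/replication: null        # computes don't need RADOS replication
    ceph/public: storage          # storage net carries their RBD traffic
  ceph-osd:
    ceph/public: public           # osd nodes don't need public, reuse it for RBD
    ceph/replication: storage     # keep replication on the dedicated storage net
```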