16:01:31 #startmeeting Fuel
16:01:33 Meeting started Thu Jan 22 16:01:31 2015 UTC and is due to finish in 60 minutes. The chair is kozhukalov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:34 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:37 The meeting name has been set to 'fuel'
16:01:41 #chair kozhukalov
16:01:42 Current chairs: kozhukalov
16:01:46 Hi
16:01:50 hi everyone
16:01:51 hello
16:01:52 o/
16:01:53 hi
16:01:56 hi
16:01:57 \o
16:01:57 agenda as usual
16:02:00 hi
16:02:03 hi
16:02:06 #link https://etherpad.openstack.org/p/fuel-weekly-meeting-agenda
16:02:07 o/
16:02:14 hi
16:02:26 #topic granular deployment (dshulyak)
16:02:30 Guest12691 is docaedo, nick issues this morning :(
16:02:53 as you know we merged the MVP that is required for library modularization, and that process has already started; i know that Alex D is able to provide more details
16:03:08 Guest12691: ok, got that
16:03:19 the next step is to provide an API for partial deployment; all patches for this stuff are on review already, and we are currently testing it
16:03:40 also we have all the necessary stuff to easily plug tasks into pre/post stages, these patches are also on review, and of course there is an ongoing process of moving pre/post tasks from astute to the library
16:04:23 well, and warpc is working on a pluggable reboot task, but this is another story
16:04:34 any questions guys?
16:04:49 dshulyak: great to read that
16:05:26 it looks like i started too early)
16:05:42 dshulyak: you started on time
16:06:00 do we really need tasks like apt-get update, yum update, yum clean, etc? can't one just say 'puppet install latest package of this' and the update will be made automatically?
16:06:00 everyone is reading what you just wrote
16:06:04 hi guys, I'm a bit late
16:06:17 dshulyak: we're late, not the other way
16:06:34 did we start the meeting or not?
16:06:45 mihgen: we did
16:07:06 mihgen: the meeting is going on
16:07:30 the topic is granular deployment
16:07:47 ok, looks like we are on time with this feature for 6.1
16:07:49 so what about seeg's question?
16:07:51 seeg: what if a package is required in pre-deployment?
16:08:11 so basically your suggestion is to do the same stuff implicitly with puppet
16:08:47 also we may want to know whether the repo is valid right after we upload it
16:08:48 dshulyak: are we planning to have network configuration as a separate stage?
16:09:08 kozhukalov: we already have it as a separate task
16:09:34 kozhukalov: https://github.com/stackforge/fuel-library/blob/master/deployment/puppet/osnailyfacter/modular/tasks.yaml#L1
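For context, the tasks.yaml linked above is where granular deployment tasks are declared. A single entry takes roughly the shape sketched below; the id, groups and dependency names here are illustrative assumptions, not the exact contents of the file:

    - id: netconfig
      type: puppet
      groups: [primary-controller, controller, compute, cinder, ceph-osd]
      requires: [hiera]
      required_for: [deploy_end]
      parameters:
        puppet_manifest: /etc/puppet/modules/osnailyfacter/modular/netconfig.pp
        puppet_modules: /etc/puppet/modules
        timeout: 3600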
16:09:35 and even more, i'm interested in this feature in the context of IBP
16:09:58 currently we use cloud-init as our initial configuration tool
16:10:19 can we somehow use puppet instead of cloud-init
16:10:44 and is it possible to create a separate granular stage for that
16:11:08 so that it is not connected to any task which we currently track?
16:11:17 kozhukalov: it looks like another task in the pre_deployment stage
16:12:02 kozhukalov: evgeniyl___: i would say that we can refactor provisioning to use the task api, or even merge provisioning into the pre_deployment stage
16:12:09 evgeniyl___: yes, but i wonder if it is possible to deal with that pre-deployment stage in terms of granular deployment
16:12:52 for example we put astute.yaml into /etc/astute.yaml during provisioning and before the first reboot
16:13:24 dshulyak: ok, let's move this discussion to the ML
16:13:38 that would be great if it were possible
16:13:42 ok moving on
16:13:52 i don't see any problems, we just need to decide how to do it properly and that's it
16:14:00 #topic 200 nodes and image-based provisioning (vkozhukalov)
16:14:16 ok, just to make sure everyone is aware
16:14:44 it is an official decision to make IBP production-ready by 6.1
16:15:03 and it is to be the default provisioning option
16:15:18 \o/
16:15:19 yay
16:15:29 we have tried it on the 100-node scale lab and it works pretty well
16:15:46 it looks like 200 nodes is not a problem
16:16:11 besides, IBP is gonna solve some important issues
16:16:33 so it sounds rational to have it as our default provisioning method
16:16:55 is anyone from the scale team around?
16:17:07 kozhukalov: how long does it take to provision 200 nodes on the scale lab?
16:17:38 provisioning itself takes about 2 minutes on the 100-node lab
16:17:58 + reboot takes around 3-4 minutes
16:18:14 compared to 20 minutes for traditional provisioning?
16:18:41 we also have some suboptimal code when we reboot nodes
16:19:03 we reboot them one by one, one per second
16:19:20 so when it is just 5 nodes, it is not a big deal
16:19:41 but when you have 200 nodes it's gonna take 200 seconds
16:20:02 so it is just a place where we can improve our system
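To make the arithmetic above concrete, here is a minimal Python sketch of the serial-versus-batched reboot idea; reboot_node() and the batch size are hypothetical stand-ins for illustration, not Astute's actual reboot code:

    import time
    import concurrent.futures

    def reboot_node(node):
        # hypothetical: send a reboot command to a single node
        ...

    def reboot_serial(nodes, delay=1.0):
        # behaviour described in the meeting: one node per second,
        # so kicking off reboots on 200 nodes takes ~200 seconds
        for node in nodes:
            reboot_node(node)
            time.sleep(delay)

    def reboot_batched(nodes, batch_size=20):
        # possible improvement: issue reboots concurrently in fixed-size
        # batches instead of strictly one per second
        with concurrent.futures.ThreadPoolExecutor(max_workers=batch_size) as pool:
            list(pool.map(reboot_node, nodes))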
16:20:18 what about not rebooting them at all?
16:20:32 angdraug: actually traditional provisioning works pretty well too
16:20:59 it takes around 11 minutes (including 2 reboots)
16:21:18 so overall it's 6 minutes vs 11?
16:21:29 and it looks like 200 nodes for traditional provisioning is also not a problem
16:21:45 angdraug: yes, exactly
16:22:17 kozhukalov: looks like you are going to cover IBP status. I have something to add. or should i do it separately?
16:22:18 we increased a lot of timeouts in traditional installers back in 6.0, so that it didn't flip out so much. But with suboptimal networking, it could hang for some time in the beginning
16:22:39 angdraug: about not rebooting, it really sounds interesting, it was xarses's idea, but i have not tried it yet
16:23:24 pivot root would be interesting
16:23:45 agordeev: ok, i just would like all fuelers to be aware that IBP is in our scope for 6.1 and will be the default
16:23:53 i thought we had cobbler only booting nodes in chunks of 20 so that it's not overloaded
16:24:13 that was something that the scale lab came up with to get to 100 nodes
16:24:41 xarses: according to what i've seen in the logs it is not true
16:24:49 nodes are rebooted one by one
16:25:00 ok, moving on
16:25:10 hmm ok
16:25:25 let me change the order of our topics
16:25:47 #topic image based provisioning (agordeev)
16:25:49 xarses, 1 at a time because of timeouts and failures to download the preseed file. We hacked debian-installer to retry 10 times instead of just 1
16:25:56 sorry for interrupting
16:25:56 agordeev: please go ahead
16:26:14 2 new blueprints were approved and targeted to 6.1 as high
16:26:16 https://blueprints.launchpad.net/fuel/+spec/build-ubuntu-images-on-masternode
16:26:18 https://blueprints.launchpad.net/fuel/+spec/fuel-agent-improved
16:26:20 the next step is to prepare the specs for them
16:26:22 good news: IBP has been switched to automatic build tests, so new bugs will possibly appear more frequently.
16:26:24 regarding bugs: still a few on review, most of the high ones were merged to master and are waiting to be backported to 6.0.1
16:26:39 agordeev: we are looking forward to the specs
16:27:25 and next week we are planning to give a talk about IBP, mostly for the scale team
16:27:48 agordeev: are you done?
16:28:03 kozhukalov: yes, i'm done, have nothing to add
16:28:24 folks, did we discuss that we want ibp to be the default and run against 200 nodes
16:28:35 ok, if there are no other questions, let's move on
16:28:35 I don't like it when we put words like "improve" in blueprint names
16:28:40 mihgen: yes
16:28:47 good
16:29:02 angdraug: what is a better word for that?
16:29:13 be more specific
16:29:17 angdraug: it will be clearly defined in the spec what "improve" means
16:29:41 if you don't yet know what you're going to improve, why create a BP at all?
16:30:00 if it's just bugfixing, it shouldn't be a BP
16:30:20 angdraug: the word improve is used because we have 2 particular improvements
16:30:33 should be 2 separate BPs
16:30:48 we've had that with the pacemaker-improvement BP before
16:30:58 angdraug: we want it to be able to reconnect when disconnected and we want it to compare checksums of images
16:31:21 so, ibp-reconnect and ibp-image-checksums
16:31:26 angdraug: +1
16:31:39 open-ended BPs are never finished
16:31:47 maybe we'll split them, but to me it looks ok, because the planned improvements are pretty small
16:32:16 please split, you already plan to have separate specs for those two, no?
16:32:16 ok, let's make three then
16:32:22 agordeev: will you?
16:32:40 yes, i'll do separate BPs. thanks for the suggestions
16:32:54 thanks
16:33:02 #action agordeev splits IBP improvement BP into two separate BPs
16:33:09 moving on
16:33:21 #topic Empty role (evgeniyl)
16:33:36 We've tested and merged the feature, and we've also fixed a bug with the progress bar which was related to granular deployment.
16:33:44 The QA team has started to work on tests for this role.
16:33:59 Also, Meg is going to update the docs; I provided all the required information.
16:34:20 it's the base-os role, not empty role
16:34:58 Now a user can assign the Operating System role to a node, and no additional configuration will be performed
16:35:23 xarses: yes, we had a different name for this role
16:35:43 That is all.
16:35:49 Any questions?
16:36:56 looks like no one has
16:37:07 evgeniyl___: thanx
16:37:28 great to read that we have progress here
16:37:33 moving on
16:37:47 #topic downloadable ubuntu release (ikalnitsky)
16:37:58 We had a hot discussion about implementation details this week.
16:37:59 You can see them in the spec:
16:37:59 #link https://review.openstack.org/#/c/147838/1/
16:37:59 Well, we all agreed to use Nailgun for uploading the iso and creating tasks, while Astute would be used for repo extraction.
16:37:59 Also, we need to research Nginx's capabilities for uploading files. It looks like we can avoid blocking a Nailgun worker until the file is uploaded.
16:38:00 The most important questions about the design are resolved, so I think we can start the implementation tomorrow.
16:38:00 Questions?
16:38:12 you are really good at typing )
16:38:19 yes, i'm ))
16:38:44 thanks for pre-typing ikalnitsky
16:38:50 there is no support for uploading in upstream nginx iirc
16:38:55 it helps move the meeting along
16:39:10 so we will probably end up with an external module, unless there is some black magician around
16:39:19 barthalion: what is nginx iirc?
16:39:25 barthalion: my understanding is that nginx has problems with large files, and we will need to switch to a fork
16:39:34 rmoe knows about it well
16:39:34 ikalnitsky: iirc = if i remember correctly
16:39:36 ikalnitsky: iirc == if I remember correctly
16:39:47 thank you guys)
16:40:16 ikalnitsky: talk with rmoe regarding nginx uploads
16:40:36 xarses: ok
16:41:15 ok guys, what do you think: is it ok when a user waits for 1 minute to upload an iso, but then it is not the end, it just means an additional task has started?
16:41:18 #link https://dmsimard.com/2014/06/21/a-use-case-of-tengine-a-drop-in-replacement-and-fork-of-nginx/
16:41:21 ikalnitsky: ^
16:41:37 xarses: thanks
16:41:46 kozhukalov: we should have feedback that it's working
16:41:47 kozhukalov: good question, actually
16:41:48 we had a discussion about UX in the spec but maybe there are other opinions
16:42:12 should the fuel cli wait until the upload task and repo extraction are ready?
16:42:25 maybe use the notification area in the UI
16:42:39 ikalnitsky: yes, and no
16:42:58 xarses: notification is good, but it's about the UI, and we're interested in the fuel cli
16:43:16 ikalnitsky: if a request uploads something, there is no way not to block the request
16:43:21 correct, they should get a task id back
16:43:35 and if they want to wait, they should attach to monitor the task
16:43:42 so you get both
16:43:42 "With nginx, the upload took 1 minute 13 seconds. With Tengine, the upload took 41 seconds." sounds like micro-optimization
16:44:00 xarses: when we upload the iso there's no way to do it that way. it will be performed by pure http.
16:44:04 xarses: how are you going to return a task id if you are uploading the file?
16:44:41 xarses: you perform a post request with a huge body, and you have to complete the upload to get anything back from the server
16:45:00 please guys have a look at the spec and give your opinion, we have some arguments "for" and "against"
16:45:19 correct, i was talking about while astute is running in the background
16:45:54 ok, let's move on and leave this discussion for the spec and ML
16:45:55 for the upload I'm not sure if it matters if you have progress or not, openstack clients don't
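The client-side flow xarses and ikalnitsky are debating (the upload request blocks until the whole body is sent, then the client gets a task id back and can poll it while extraction runs in the background) might look roughly like the sketch below; the URL paths and response fields are assumptions for illustration, not the actual Nailgun API, which is still being designed in the spec linked above:

    import time
    import requests

    def upload_iso_and_wait(base_url, iso_path, poll_interval=5):
        # the upload itself blocks until the whole body has been sent
        with open(iso_path, 'rb') as iso:
            resp = requests.post(base_url + '/api/releases/upload', data=iso)
        resp.raise_for_status()
        task_id = resp.json()['task_id']  # hypothetical response field

        # repo extraction then runs as a background task; a client that
        # wants to wait simply polls the task status until it finishes
        while True:
            task = requests.get('%s/api/tasks/%s' % (base_url, task_id)).json()
            if task.get('status') in ('ready', 'error'):
                return task
            time.sleep(poll_interval)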
16:46:24 #topic Multi-HV support (adanin)
16:46:37 We are going to implement two hypervisors in one environment. They are KVM (or QEMU) and VMware vCenter.
16:46:37 Both of them will have their own Cinder backend and will be placed into different Availability Zones.
16:46:39 It'll be done to avoid attaching volumes with an unsupported type to a hypervisor.
16:46:40 A new role "cinder-vmdk" will be added. New vCenter-specific OSTF tests will be added. A new UI tab with vCenter settings will be added.
16:46:42 We still haven't decided which network backend will be used: nova-network or Neutron with a non-upstream ML2 DVS plugin.
16:46:43 Links to blueprints:
16:46:45 #link https://blueprints.launchpad.net/fuel/+spec/vmware-ui-settings
16:46:45 #link https://blueprints.launchpad.net/fuel/+spec/cinder-vmdk-role
16:46:46 #link https://blueprints.launchpad.net/fuel/+spec/vmware-dual-hypervisor
16:47:02 Questions?
16:47:10 why a new cinder-vmdk role?
16:47:28 does it need a dedicated node?
16:47:59 adanin: can't we run multiple hypervisors if we just move the compute tasks around and get multi-hv for all types?
16:48:09 also, cinder-vmdk role?
16:48:17 To allow a user to place the cinder-volume service on whatever node they want.
16:48:20 run it on the controllers like ceph
16:48:33 that would be a task decision, not a role
16:49:00 I already had to unbreak a cinder-vmdk deployment because it's not automatic
16:49:19 #link https://bugs.launchpad.net/fuel/+bug/1410517
16:49:45 xarses: cinder-volume in this case is just a proxy. I don't think it's a good idea to put it on a Controller node. It will increase IO for the node.
16:49:56 what about multiple cinder backends?
16:49:58 #link https://blueprints.launchpad.net/fuel/+spec/fuel-cinder-multi-backend
16:50:20 adanin: is cinder-volume in the data path?
16:50:24 with the new role a user will be able to place this service on whatever node they want.
16:50:31 again, the cinder-vmdk volume service should be a task, not a role
16:50:48 multi-backend is not applicable in that case.
16:51:35 xarses: and how would a user choose a particular node to run this task?
16:52:01 choose a backend
16:52:31 granular deployment is getting us there
16:53:11 once again, why does it have to be a particular node, is it in the data path?
16:53:17 and what about HA for that service?
16:53:27 angdraug: xarses: a user will have two Cinder backends simultaneously - one for KVM and another for vCenter. There is no way to use VMDK for KVM and vice versa.
16:54:27 adanin: correct
16:55:00 angdraug: HA - assign the cinder-vmdk role to two or more nodes. Each node will have identical settings for the cinder-volume service.
16:55:01 adanin: please let's review this offline, probably as a separate meeting
16:55:05 5 minutes
16:55:08 Like we do it for Ceph now.
16:55:32 I'm going to advertise it on the ML.
16:55:49 adanin: thanx
16:56:30 we have nothing more to discuss
16:56:38 open discussion?
16:56:44 3 minutes
16:56:52 does it make sense?
16:57:00 any announcements?
16:57:26 ok, closing then
16:57:32 thanx everyone
16:57:37 great meeting
16:57:40 thank you guys
16:57:51 #endmeeting