16:01:00 #startmeeting Fuel
16:01:01 Meeting started Thu May 7 16:01:00 2015 UTC and is due to finish in 60 minutes. The chair is kozhukalov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:02 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:05 The meeting name has been set to 'fuel'
16:01:08 #chair kozhukalov
16:01:14 Current chairs: kozhukalov
16:01:19 agenda as usual
16:01:28 #link https://etherpad.openstack.org/p/fuel-weekly-meeting-agenda
16:01:35 who is here?
16:01:47 hi
16:01:47 hola
16:01:48 hi
16:01:52 Hi
16:02:09 5 people
16:02:14 not so many
16:02:28 hi
16:02:43 ok let's start first topic
16:02:47 o/
16:02:57 #topic GRE MTU patch status (aglarendil, xenolog - please provide status)
16:03:21 aglarendil: around?
16:03:28 xenolog: ?
16:03:30 We are running a bunch of tests right now.
16:04:03 did we manage to merge a fix which is mtu - 42 bytes?
16:04:14 Day before yesterday we applied a partial fix that broke GRE deployments and it is being rewritten
16:04:35 o/
16:04:40 aglarendil: you mean 3.10 kernel?
16:04:44 mihgen: it was broken, so we reassigned it back to MOS-neutron team to change it
16:04:50 kozhukalov: nope
16:05:17 #link https://review.openstack.org/#/c/180510/
16:05:20 this is the revert
16:05:39 so what's the current plan here?
16:06:46 rewrite partial fix and finish testing to figure out whether we need performance adjustments
16:06:59 partial fix is being rewritten right now
16:07:20 do we have a new review?
16:07:27 xarses: AFAIK, not yet
16:07:32 but it should not be hard
16:07:32 ok
16:07:39 xarses: can you please provide what you do?
16:07:59 in this area
16:08:04 aglarendil: do we have anyone working on the fix now?
16:08:15 mihgen: Sergey Kolekonov from MOS Neutron team
16:08:22 as xarses is actually run around and *might* be able to help too
16:08:39 aglarendil: ok.
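For context on the "mtu - 42 bytes" figure mentioned above, here is a minimal sketch of where a 42-byte GRE overhead can come from. It assumes an IPv4 outer header, a GRE header carrying a key, and an encapsulated inner Ethernet header. The breakdown and the `tenant_mtu` helper are illustrative assumptions, not taken from the patch under review.

```python
# Assumed breakdown of GRE encapsulation overhead (illustrative only).
OUTER_IPV4_HEADER = 20      # outer IPv4 header without options
GRE_HEADER_WITH_KEY = 8     # 4-byte base GRE header + 4-byte key field
INNER_ETHERNET_HEADER = 14  # header of the encapsulated L2 frame

GRE_OVERHEAD = OUTER_IPV4_HEADER + GRE_HEADER_WITH_KEY + INNER_ETHERNET_HEADER


def tenant_mtu(physical_mtu: int, overhead: int = GRE_OVERHEAD) -> int:
    """Return the MTU left for tenant traffic after tunnel encapsulation."""
    if physical_mtu <= overhead:
        raise ValueError("physical MTU must be larger than the tunnel overhead")
    return physical_mtu - overhead


if __name__ == "__main__":
    print(GRE_OVERHEAD)
    print(tenant_mtu(1500))
```

With a standard 1500-byte physical MTU this leaves 1458 bytes for instance traffic, which is why the fix is described as "mtu - 42".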
16:08:41 we merged a similar review, let's make sure we don't create conflicts
16:08:51 #link https://review.openstack.org/#/c/179267/
16:09:09 xarses: they are not conflicting actually, so no need to worry
16:09:23 sounds good
16:09:30 ok then good, yeah
16:09:41 moving on?
16:09:47 kozhukalov: yep
16:09:56 #topic repos connectivity checks (xarses, ikalnitsky, dpyzhov, mkwiek - where are we with this?)
16:10:14 hi guys
16:10:41 today we had a meeting about repo connectivity checks and how they should be implemented
16:11:10 we have decided to not introduce hacks and workarounds and try to implement this via network checker
16:11:39 ikalnitsky: ok, sounds good. What exactly will net-checker do?
16:12:09 how hard is it going to be to implement?
16:12:26 1. it will check connectivity to the repo through default gateway (which may be master node or some other, if we change defaults in fuelmenu)
16:13:21 2. it will then set up a new default gateway (public network), perhaps in a separate network namespace, and check connectivity via public network
16:13:22 wait a sec, default gw for bootstrap nodes is always master node admin net IP
16:13:29 not always
16:13:36 we can change it via fuelmenu
16:13:41 I'm talking about 1). Why not always?
16:13:46 if user has some router in admin network
16:13:50 mattymo: what? ^^
16:14:24 no, it's not always, i checked my deployment and they share the same gateway, but it's not the fuelmaster itself
16:14:33 I thought we can configure another iface in fuel-menu for outside world, and fuel master will use it as default gw
16:14:34 mihgen, in fuelmenu we can change default gateway for slaves.. i mean, it affects our cobbler/dnsmasq configuration while setting up master node
16:14:53 interesting, why would we need it at all
16:15:07 but anyway good that you caught it
16:15:07 mihgen, we use it in system tests, for example
16:15:20 so how does this cover IBP's need to have the repos verified?
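The repo check described above (step 1, reachability through whatever default gateway the node currently has) could be sketched roughly as follows. This is a hedged illustration, not the network checker's actual code: the `repo_reachable` and `check_repos` names are hypothetical, and the sketch does not cover step 2 (switching the default gateway or using a separate network namespace).

```python
import urllib.error
import urllib.request


def repo_reachable(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if a request to the repo URL gets a non-error response.

    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Unreachable host, refused connection, timeout, HTTP error or bad URL.
        return False


def check_repos(urls, **kwargs):
    """Map every repo URL to the result of its reachability check."""
    return {url: repo_reachable(url, **kwargs) for url in urls}
```

Returning a per-URL result rather than a single boolean would let a UI or CLI report exactly which repository is unreachable.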
16:15:34 how does this solve the need to check the repos after l23network is complete
16:15:38 > perhaps in separate network namespace - why do we need this one?
16:15:40 so, that basically means that we need to check internet connectivity not only on slaves, but on master node itself
16:16:05 mihgen, sorry I stepped away from my desk for a moment
16:16:19 mattymo: no worries, resolved.
16:16:25 we need separate net namespace in order to be error-proof. what if something goes wrong and netcheck won't restore original default gateway? (one that was received via dhcp)
16:16:41 mihgen: we need to test in the namespace, because that's what we end up with from the vrouter on the controllers now
16:17:19 mihgen: for example on vcenter, one of the network security policies only breaks the internet traffic through the network namespace
16:17:35 ikalnitsky: xarses: a) I thought that vrouters won't change anything; b) if smth goes wrong - we anyway communicate with master over admin net, we don't actually need gw setup for bootstraps
16:17:49 xarses: ok, vcenter is interesting, yeah
16:17:54 if we could catch that
16:18:11 all right, i'm convinced with namespace if it's not gonna be complicated
16:18:15 well any network could have issues like that. and we need to check it
16:18:32 vcenter is a specific case that is easy to occur
16:18:44 so.. we 1) should check repo connectivity on master node (the patch is on review by mkwiek) 2) check repo connectivity from slaves via both admin net and external net
16:18:53 sounds reasonable folks, if it's almost no extra effort, please do net ns then..
16:19:16 loles will help mkwiek to do that. rough eta is monday.
16:19:19 so, now we need to make this netchecker task mandatory
16:19:43 xarses: not necessary. We show big warning in the UI now if user didn't run it
16:19:50 we can add info about repos in the warning
16:20:02 ikalnitsky: thank you
16:20:24 ok, moving on
16:20:28 mihgen: that most people ignore
16:20:29 * mihgen looking forward to see great UX in repo checks
16:20:46 xarses: why would you ignore it?
16:20:54 I'm not sure we need to make it mandatory
16:21:00 for repos, yes
16:21:29 xarses: I'm actually fine to make it mandatory BUT if you can still continue via CLI for instance
16:21:36 it breaks the deployment in odd ways, and lots of people have run into the issue already because they don't have proper settings
16:21:39 with --force
16:21:53 what do you guys think here?
16:22:06 why would you force it if we know that the repo access will fail?
16:22:23 meaning it's going to fail a step down the line, right?
16:22:24 why would you want to continue if the repos are known to be bad?
16:22:46 this seems like a pre-flight check that must be OK before proceeding
16:23:08 well that would be the user's choice, but we still need them to check it so they can make the choice
16:23:10 and does it hurt to also add it to the pre/post tests for a deployment step if we know that it's something that has to work
16:23:12 before continuing
16:23:42 by the way, looks like it would be also great to implement http proxy support
16:23:49 net verifier is not yet known to be very stable for > 150 nodes
16:23:55 so i guess I'm asking to force user to run check, green result is irrelevant at this point
16:24:06 which is why I didn't want to implement this kind of check here
16:24:09 this is the only thing I'm worried about, so user should be still able to proceed
16:24:22 we can just check before provisioning on master node
16:24:26 and again after l23network
16:24:35 xarses: then you'd fail deployment
16:24:42 which is bad
16:24:49 in a describable way
16:25:07 you won't be able to change configuration as we lock it
16:25:10 instead of apt failed trying to install a package using an IPv6 address
16:25:32 well the config lock is a separate problem, which we need to stop doing
16:25:47 xarses: we can't for now ;)
16:25:52 so it's chain of issues
16:26:09 that's why the easiest path seems to be to go with net-checker extension
16:26:33 is there a reason not to do both?
16:26:34 it has all things built-in, it has to be the easiest path to go with in my opinion
16:26:37 can we just call the verify repos separately from the rest of the net checker functions?
16:26:55 mwhahaha: I don't see a reason why we would not add additional check in puppet layer
16:27:06 where is Alex
16:27:09 <-
16:27:17 mwhahaha: is alex
16:27:21 ohh :)
16:27:32 here you are ) yeah let's use your work of course as well
16:27:54 ok
16:27:57 so if user insists and continues deploy, we would at least fail with clear message!
16:28:49 ok anything else here?
16:28:52 ok, I'd still like to see the repo check be called separately from the rest of the netchecker functions so we could eventually put a button on the repos UI to validate at change
16:29:07 xarses: agree, if this is possible
16:29:21 we should plan for it in 7.0+
16:29:25 +1 to validating the settings in the ui
16:29:37 I would add like a button near repo configs
16:29:45 it must be possible (separate repo check). let's do it data driven
16:30:03 ok if there are no other q here, let's move on
16:30:16 #topic proxy for create mirror / how to install everything without connectivity to repos (dpyzhov & other - where are we with this?)
16:30:18 let's plan for it folks but keep in mind that we don't really have many resources now
16:30:24 no heavy coding please at this point
16:30:32 thanks for going extra mile with this...
16:30:51 now proxy is another thing which we completely missed in our design :(
16:31:19 can't we just copy the code we used in the ISO make here?
16:31:29 yes, but it looks not so complicated to implement at least for ibp
16:31:43 i mean proxy
16:31:43 aglarendil mentioned in a meeting earlier today that we have an alternative script in fuel-web that does a full mirror over http
16:32:00 also, are we trying to create proxy, or local cache
16:32:08 angdraug: link?
16:32:15 aglarendil: link?
16:32:22 for the script
16:32:40 angdraug: fuel_package_updates directory in fuel-web repo
16:33:13 #link https://github.com/stackforge/fuel-web/tree/master/fuel_upgrade_system/fuel_package_updates
16:33:35 can't we use system wide http proxy settings?
16:34:17 the script uses urllib2, doesn't this library honor http_proxy env variable by default?
16:34:33 mihgen: i'm not sure all our components are able to use proxy from env
16:34:50 angdraug: not sure
16:34:58 let's try
16:35:30 mihgen: so are we trying to create a proxy, or a local 'mirror' because we seem to be talking about parts of both
16:35:49 xarses: i think we need both options
16:36:13 I think this agenda item is making sure that create mirror script works in an env where the only way to access internet is over an http proxy
16:36:20 correct me if it's about something else
16:37:24 actually, if there is any issue with external whatever net
16:37:36 and what is wrong with this script fuel-createmirror-6.1-3.mira3.noarch.rpm?
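On the question above of whether urllib2 honors the http_proxy environment variable: in Python's standard library the default opener installs a ProxyHandler that reads `<scheme>_proxy` variables from the environment, both in Python 2's urllib2 and in Python 3's urllib.request. A minimal sketch for Python 3 follows; the proxy address is a made-up placeholder and `proxy_from_env` is a hypothetical helper name.

```python
import os
import urllib.request


def proxy_from_env(scheme="http"):
    """Return the proxy URL the standard library would pick up for `scheme`."""
    return urllib.request.getproxies_environment().get(scheme)


if __name__ == "__main__":
    # Placeholder proxy address, assumed for illustration only.
    os.environ["http_proxy"] = "http://proxy.example.com:3128"
    print(proxy_from_env("http"))
```

This only shows that the variable is picked up by the library. Whether every Fuel component goes through the default opener is a separate question that would still need testing, as noted in the discussion.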
16:37:38 then we would be just fine getting all required repos on master node
16:38:00 if we can get all required repos on master node by any means (including createmirror script), then we are fine
16:38:29 we are even fine if we just write doc how to scp repos
16:39:03 I think for 6.1 we would still need proxy support in createrepo
16:39:22 +1 for solving the edge cases with a doc/how-to
16:39:24 mattymo: but I'm worried that we have some other script for updates repo sync
16:39:59 this tool syncs only updates repo
16:40:11 we would need doc anyway, as people might want to provide their own hacks - we just need a guide where to place repos, what to update in repo config on the UI
16:40:25 the problem with fuel-createmirror is that it uses rsync and won't work behind a proxy
16:40:31 mattymo: two different tools for repo sync..
16:41:01 angdraug: can't we use RSYNC_PROXY ?
16:41:03 rsync can be done via proxy, it just uses a different env var RSYNC_PROXY
16:41:05 mihgen, there was no collaboration between the two. one was needed sooner so QA could start using it
16:41:18 the other I wasn't even aware that it was ready
16:41:26 I thought it slipped 6.1 release
16:42:17 should we decide right now which script to use?
16:42:18 I didn't know about RSYNC_PROXY, that simplifies the problem considerably
16:42:54 angdraug: dpyzhov: did we identify who can work on this.. ?
16:43:03 angdraug: but the q still if squid for example (which is popular) supports RSYNC?
16:43:29 no, RSYNC_PROXY works over an HTTP proxy
16:43:32 and also, mattymo, angdraug, dpyzhov - Can you please sync and decide if we can merge scripts and use one single one?
16:43:43 squid doesn't need to be aware of rsync
16:44:04 mihgen, I have no idea about the source of fuel-createmirror. anyone knows?
16:44:06 please think about UX as well. We might want to add even some welcome message like "use this script to sync your updates repo"
16:44:21 if one is shell and the other is python, I'm not sure if it's possible to get done in this short time frame to HCF
16:44:23 mattymo: rvyalov can help you
16:44:26 rvyalov, ?
16:44:39 mattymo: source code is on vitaly parakhin github account (which is bad)
16:44:53 https://github.com/brain461/mirror-sync
16:44:54 aglarendil: I thought you guys synced on these scripts a long time ago
16:44:56 yes, we move code to the stackforge
16:45:01 kozhukalov, that means it's not even part of fuel yet...
16:45:02 kozhukalov: it's wrong one
16:45:09 *we will move
16:45:20 before HCF
16:45:41 at the moment code is placed in the review.fuel-infra gerrit
16:45:54 #link https://review.fuel-infra.org/#/admin/projects/packages/centos6/fuel-createmirror
16:46:09 this is the one, mattymo ^^
16:46:20 yes, but as tar+spec
16:46:28 mihgen: yes, but there is only tarball not source code itself
16:46:57 of course one can untar this and see the actual shell code
16:47:12 source is in tar of course, osci team promised to make a change so we will be able to store sources in there
16:47:22 it was initially designed for compiled linux code
16:47:23 yes, therefore we will move code to the fuel-web repository (for example)
16:47:30 but it is not our way of doing things when we want to review this
16:47:43 I agree
16:47:50 vitaly's script should be moved to fuel-web
16:47:59 rvyalov: great
16:48:20 it's 6 shell scripts and 2 python scripts
16:48:34 yes, and will be included to the nailgun packages (for ex)
16:48:49 in this case shell is preferable to python since it's more portable
16:49:01 angdraug, but python is more testable....
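The RSYNC_PROXY approach raised earlier in this topic can be sketched as below. Per the rsync manual, RSYNC_PROXY takes a `hostname:port` pair pointing at a web proxy and applies to rsync daemon connections (`rsync://` URLs or `host::module`), and the proxy has to permit proxied connections to port 873. The helper names, proxy host and mirror URL are hypothetical, and this is not how fuel-createmirror itself is written.

```python
import os
import subprocess


def rsync_env(proxy_host, proxy_port, base_env=None):
    """Return a copy of the environment with RSYNC_PROXY set to host:port."""
    env = dict(os.environ if base_env is None else base_env)
    env["RSYNC_PROXY"] = "{}:{}".format(proxy_host, proxy_port)
    return env


def mirror(source, dest, env):
    """Run rsync in archive mode with the given environment; return its exit code."""
    return subprocess.run(["rsync", "-a", source, dest], env=env).returncode


if __name__ == "__main__":
    # Placeholder values, assumed for illustration only.
    env = rsync_env("proxy.example.com", 3128)
    print(env["RSYNC_PROXY"])
    # mirror("rsync://mirror.example.com/repo/", "/var/www/mirror/", env)
```

Since only an environment variable is involved, the same setting works unchanged for a shell script that calls rsync, so it does not by itself favor either the shell or the Python tool.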
16:49:09 and we want operators to be able to run this script outside of fuel
16:49:12 let's put those 6 shell scripts in a separate directory
16:49:21 ok folks, I didn't get confirmation, please sync mattymo, rvyalov, dpyzhov, aglarendil - and decide what we do with two scripts. Merge them together, keep separated, proxy question, and (important!) UX for using them
16:49:28 angdraug, I wrote fuel-package-updates in shell and an overwhelming # of people insisted on python
16:49:34 and so it was rewritten
16:49:43 twice:)
16:50:05 I don't know why it didn't become part of fuel client, but anyway )
16:50:13 #action mattymo rvyalov dpyzhov aglarendil sync with each other about create-mirror script
16:50:16 mihgen, python team blocked it
16:50:26 at least from UX perspective
16:50:47 I would love everything to be tied to fuel client, it is gonna be easier
16:50:56 than to remember two more linux tools
16:51:18 ok, let's move on
16:51:26 ok folks please sync and let us know, let's move on
16:51:37 #topic Some recent activities
16:51:55 guys please skim through the list in agenda
16:52:07 and ask questions if you have any
16:52:34 Calamari plugin spec is available for review
16:52:52 #link https://review.openstack.org/#/c/180895
16:53:14 let's review it, yeah
16:53:20 please, those guys, who are experts in ceph, review this spec
16:53:41 kozhukalov: thanks for putting the status together in the etherpad
16:53:52 I think we can just review and raise questions?
16:53:55 mihgen: you are welcome
16:54:12 mihgen: great
16:54:26 I'm still working on image build env cleanup
16:54:43 and there are a bunch of bugs which are volume manager related
16:55:02 and maybe some of them we could address/workaround in 6.1
16:55:16 #link https://bugs.launchpad.net/fuel/+bug/1449186
16:55:16 Launchpad bug 1449186 in Fuel for OpenStack "mcollective was unable to start after provisioning" [High,Incomplete] - Assigned to Vladimir Kozhukalov (kozhukalov)
16:55:21 but i didn't have chance to go deep into them
16:55:25 any idea what logging / whatever to enable to catch it?
16:55:47 was there an mc log at all?
16:56:05 mihgen: yes, this one is floating and I've contacted listomin
16:56:07 doesn't init log what services it runs, did it at least attempt to run mc?
16:56:22 and have asked him which logs i need
16:56:52 at least i need to see mcollective log on a slave node to figure out if it tries to run at all
16:57:26 mihgen: exactly, there is no such log in diagnostic snapshot
16:57:49 kozhukalov: do we have syslog log in there?
16:57:57 does syslog contain init log?
16:59:05 the problem was on node-21, and there was no directory with logs from this node
16:59:16 while there were from others
16:59:42 no syslog from target system
16:59:54 ok, we are out of time
16:59:56 ok..
16:59:59 let's finish
17:00:05 thank you guys for updates
17:00:10 thanks everyone for attending
17:00:16 #endmeeting