10:31:43 <rluethi> #startmeeting training_labs
10:31:43 <openstack> Meeting started Wed Dec 16 10:31:43 2015 UTC and is due to finish in 60 minutes.  The chair is rluethi. Information about MeetBot at http://wiki.debian.org/MeetBot.
10:31:44 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
10:31:47 <openstack> The meeting name has been set to 'training_labs'
10:32:14 <rluethi> #topic liberty patch
10:32:57 <berndbausch> Anybody else?
10:33:27 <rluethi> berndbausch: I think the rest of the team is travelling this week :)
10:33:55 <berndbausch> OK great. It's all for me :)
10:34:06 <rluethi> We have a few issues to discuss wrt the liberty patch.
10:34:13 <rluethi> Some minor, some rather major.
10:34:17 <berndbausch> Good.
10:34:32 <rluethi> The first big one is the network reorg.
10:34:36 <berndbausch> I did some testing as you know, fixed some typos and such.
10:34:43 <berndbausch> That's the problem I am running into now.
10:34:53 <rluethi> liberty drops some networks and adds "manual" as a new type of network.
10:35:28 <berndbausch> I guess I need to do some reading. I don't know what "manual" is.
10:35:32 <rluethi> and the install-guide now has two versions just for liberty.
10:36:04 <rluethi> basically, it's a network interface reserved for one network but without a specific IP address preassigned to it.
10:36:44 <berndbausch> In the install guide there is a "provider" and "self-service" network option.
10:36:51 <berndbausch> Is that what you mean?
10:36:55 <rluethi> yup.
10:37:03 <berndbausch> OK.
10:37:27 <berndbausch> I think the self-service is more relevant, so let's implement that.
10:37:38 <rluethi> I agree.
10:37:44 <rluethi> preferably, we implement both.
10:38:16 <berndbausch> I was asking if the environment is flexible enough to provide such options. Yes, I would do that too.
10:38:37 <rluethi> the answer to your questions is that currently, it's tricky.
10:38:49 <berndbausch> Grin
10:39:06 <rluethi> you can easily have several configurations as long as you only change scripts.ubuntu_cluster or somesuch.
10:39:39 <rluethi> for instance, we could have scripts.ubuntu_cluster_provider.
10:39:50 <berndbausch> Yes, and I think that is good enough.
10:40:17 <berndbausch> Just tell users that the default is selfservice, and provider can be implemented by doing this and that
10:41:10 <berndbausch> Right now I am trying to test provider but hit a problem. Is it the right moment to raise this?
10:41:12 <rluethi> works for me. I am looking at how difficult it is to provide more options for customization.
10:41:18 <rluethi> sure
10:41:39 <berndbausch> It's because I am not sure how to change the script so that it works with two nodes instead of three.
10:42:10 <rluethi> which script exactly?
10:42:17 <berndbausch> Here is what I see. In scripts.ubuntu_cluster, there is a line "cmd queue config_external_network.sh"
10:42:30 <berndbausch> This is where the error occurs
10:42:52 <berndbausch> "no network 'external'" (paraphrased)
10:43:19 <berndbausch> In fact, the external network is set up in a different line....
10:43:36 <berndbausch> cmd queue ubuntu/setup_neutron_network.sh
10:43:43 <berndbausch> which is later in the script.
10:43:55 <berndbausch> And currently, it sets up the network node.
10:43:59 <rluethi> I think the install-guide moved the creation of networks from the middle to the end of the guide.
10:44:05 <rluethi> it is part of the testing part now.
10:44:09 <berndbausch> But my question is, why is this line later.
10:44:50 <berndbausch> So, in order for config_external_network to succeed, the setup_neutron_network script must be executed first
10:44:54 <rluethi> the answer is probably that the install-guide used to do it this way.
10:45:02 <berndbausch> and currently it is not. Thus error.
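The ordering problem described above can be checked mechanically. The following is a sketch only: the function name `queued_before` is hypothetical, and it assumes the queue file consists of lines of the form `cmd queue <script>`, as quoted in this meeting.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if a "cmd queue" line naming FIRST
# appears earlier in FILE than a "cmd queue" line naming SECOND.
# Usage: queued_before FILE FIRST SECOND
queued_before() {
    local file=$1 first=$2 second=$3
    awk -v first="$first" -v second="$second" '
        $1 == "cmd" && $2 == "queue" {
            # Remember the first line number on which each script is queued.
            if (!f && index($3, first))  f = NR
            if (!s && index($3, second)) s = NR
        }
        END { exit !(f && s && f < s) }
    ' "$file"
}
```

For the case discussed here, `queued_before scripts.ubuntu_cluster setup_neutron_network.sh config_external_network.sh` would be expected to fail until the scripts are reordered.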
10:45:34 <berndbausch> OK, I will therefore have to redo that script. I wondered if some of those commands are running in parallel.
10:45:46 <rluethi> yes. roughly, the content of config_external_network would now go into launch_instance.sh
10:45:51 <berndbausch> But they can't of course. There are snapshots.
10:46:06 <berndbausch> That would be one of the last steps I guess
10:46:50 <rluethi> we need to go through the install-guide systematically and reorder our scripts accordingly.
10:47:39 <berndbausch> Yes, that's right. So basically each install guide page is more or less represented by a shell script, and the shell scripts are sequenced in scripts.ubuntu_cluster and so on.
10:47:45 <rluethi> I seem to remember that before the RST conversion, the order was a bit more obvious than it is now.
10:47:53 <rluethi> right.
10:48:19 <rluethi> that is the concept in a nutshell.
10:48:20 <berndbausch> There are some jumps in the install guide which make you go somewhere else then return to the same place. Like a subroutine.
10:48:39 <berndbausch> I don't like it so much, but that's life.
10:48:42 <rluethi> yes, and the sidebar does not indicate that.
10:49:07 <rluethi> so you have to read the whole page to make sure you didn't miss a "subroutine".
10:49:35 <berndbausch> Sometimes I feel lost in the pages.
10:49:49 <berndbausch> So we have to recreate that flow in the scripts.something files.
10:50:00 <rluethi> yes.
10:50:20 <berndbausch> Since no more testing is possible without reordering the scripts.something, I will give it a try tomorrow.
10:50:35 <rluethi> excellent.
10:51:19 <rluethi> note that the install-guide changed some other things, too, including the private network range.
10:51:37 <rluethi> used to be 192.168..., now 172...
10:52:42 <berndbausch> So far I just went through the install guide pages rather mechanically and adapted the corresponding shell scripts or fixed the problems I saw. That is not so difficult; it just requires a certain level of precision.
10:52:46 <rluethi> I suggest we change that after we have dealt with the network refactoring. that way it should be easy to rebase.
10:52:56 <berndbausch> ok
10:53:49 <rluethi> usually, the tricky part is detecting and fixing the races. there are so many of them.
10:55:56 <berndbausch> I noticed that :)
10:56:21 <berndbausch> My system is a good testbed for race conditions. It's rather slow.
10:56:55 <rluethi> Some races seem to turn of only on fast systems. It is good to have a mix.
10:57:08 <rluethi> s/of/up/
10:57:32 <berndbausch> Strangely it is a rather powerful server, perhaps Virtualbox is missing something there.
10:57:49 <rluethi> define "slow", then.
10:57:55 <akamyshnikova_> ihrachys, I will look into failure of test_model_sync now
10:58:17 <ihrachys> akamyshnikova_: I am already looking
10:58:20 <berndbausch> The old Openvswitch race turned up on my system. It took 4-5 seconds to get OVS going
10:58:22 <ihrachys> akamyshnikova: there was a bug for that
10:58:40 <ihrachys> meh, sorry folks, bad channel
10:58:46 <rluethi> np
10:59:10 <rluethi> berndbausch: that's odd.
10:59:14 <berndbausch> Also, starting a cluster took some 30-50 minutes rather than the 20 it's supposed to be.
10:59:40 <rluethi> including basedisk build?
11:00:18 <berndbausch> Yes the whole thing I believe. Excluding the ISO download I *think*. Need to test this again if that interests you.
11:00:57 <rluethi> okay. 20 minutes is pretty fast for building basedisk and cluster.
11:01:33 <rluethi> building the basedisk depends on the internet (and its servers).
11:01:54 <rluethi> building the cluster takes about 10 minutes on a pretty fast machine.
11:02:24 <rluethi> I know there are opportunities to improve speed.
11:02:34 <berndbausch> I have a Proliant ML330 with a four core Xeon and 20GB memory
11:02:42 <rluethi> It just never was a big priority.
11:02:56 <berndbausch> We could build the nodes in parallel to an extent.
11:02:57 <rluethi> SSD?
11:03:03 <berndbausch> normal disk.
11:03:13 <berndbausch> single disk, no RAID.
11:03:20 <rluethi> The build is often disk-bound.
11:03:44 <berndbausch> that would explain it. It's true, the CPU seemed to be bored most of the time.
11:03:52 <rluethi> We can start building in parallel once we are bored because we ran out of races to fix :).
11:04:13 <berndbausch> yes it would provide us a few more races.
11:04:19 <rluethi> there are plenty of dependencies between the nodes.
11:04:41 <berndbausch> yes. The one I am running into, for example.
11:05:21 <rluethi> I opened a ticket for the one in cinder_volumes. The sleep 20 is ugly and not good enough.
11:06:21 <berndbausch> Do I have to remember that bug? Did I raise it :-$
11:07:20 <rluethi> Nah. We sort of patched it up for Juno (I think), but according to my testing it fails maybe once in 100 tests.
11:07:27 <berndbausch> Ah ok.
11:07:44 <berndbausch> It shouldn't be too hard to replace all sleeps by a proper test.
11:07:59 <berndbausch> testing the condition we are waiting for.
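Replacing a fixed sleep with a test of the awaited condition could look roughly like this. It is a generic sketch, not the project's code: `wait_for` is a hypothetical helper, and the actual condition to poll for in cinder_volumes is not named in this meeting.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a command until it succeeds or a timeout expires.
# Usage: wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
wait_for() {
    local timeout=$1
    shift
    # SECONDS is a bash builtin counting seconds since the shell started.
    local deadline=$((SECONDS + timeout))
    until "$@"; do
        if [ "$SECONDS" -ge "$deadline" ]; then
            echo "Timed out after ${timeout}s waiting for: $*" >&2
            return 1
        fi
        sleep 1
    done
}
```

Instead of `sleep 20`, a script would call something like `wait_for 60 test -e /some/marker` (the path is a placeholder). The script continues as soon as the condition holds, and it fails loudly rather than silently racing when it does not.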
11:08:25 <rluethi> for rare races, it becomes difficult to test the fix.
11:09:06 <berndbausch> by provoking somehow perhaps?
11:09:30 <rluethi> sure, if you know how.
11:09:43 <berndbausch> :)
11:10:24 <rluethi> can you take a look at the network refactoring? I realize the patch is too big for a thorough review. Maybe I need to break it into smaller pieces after all. But it's pretty easy to test, and we kinda need those changes for liberty.
11:10:42 <berndbausch> Is there a patch already?
11:11:14 <berndbausch> The refactoring first requires restructuring scripts.ubuntu_cluster and friends, right?
11:11:19 <rluethi> https://review.openstack.org/#/c/257063/
11:11:21 <berndbausch> That's what I wanted to do anyway
11:11:24 <berndbausch> OK good.
11:11:59 <rluethi> Actually, this refactoring only changes how the network configuration gets parsed and applied.
11:12:41 <rluethi> the patch is for kilo, in preparation for liberty.
11:13:01 <berndbausch> Good to know!
11:13:51 <rluethi> there is another patch in the queue, but that just improves KVM disk handling (somewhat faster and much less space usage). so that one is not urgent.
11:14:03 <berndbausch> looking at it now. It's too late to do much today, I will dive into this tomorrow.
11:14:39 <rluethi> sure, no rush.
11:15:18 <rluethi> feedback/questions welcome.
11:15:49 <berndbausch> sure!
11:17:01 <berndbausch> So I have two pieces of work, checking the network patch and creating a new scripts.ubuntu for Liberty. I feel satisfied.
11:17:24 <rluethi> :)
11:18:05 <rluethi> and we will go with selfservice for now.
11:18:26 <rluethi> I will look for a way to provide the alternative, too.
11:18:27 <berndbausch> agreed
11:19:12 <rluethi> any other issues?
11:19:38 <berndbausch> I don't know enough about it yet to have more issues.
11:19:53 <berndbausch> So, thanks. Good for now.
11:20:32 <rluethi> okay. thanks for coming, see you next week. (of course, there is always email, too)
11:20:46 <berndbausch> Yep, thanks.
11:20:51 <rluethi> #endmeeting