lxkong | strigazi: hi, i'm working on how to accelerate the launch time of magnum k8s cluster and i found there are several software deployments that are executed in sequence after kubemaster is alive. | 09:42 |
lxkong | strigazi: that's why i asked in heat channel | 09:42 |
strigazi | lxkong: give me a sec, I'll explain | 09:43 |
lxkong | sure | 09:43 |
strigazi | The improvement I have in mind is to parallelize the deployment of the master and worker nodes. Running the SDs in parallel won't help a lot. | 09:49
strigazi | lxkong: the major bottleneck is that the nodes wait for the master to pull everything. The deployment of the k8s manifests is async, so it is fast. | 09:50
strigazi | lxkong: I'm writing a PoC for this now | 09:52 |
lxkong | strigazi: in my devstack deployment, the core_dns_service takes 80s and kubernetes_dashboard takes 171s somehow | 10:00 |
lxkong | in sequence | 10:01 |
lxkong | for master and worker, i have created this https://storyboard.openstack.org/#!/story/2004573 for HA cluster | 10:01 |
lxkong | but for non-ha, the workers could start as soon as the master is created and has an ip address | 10:02
lxkong | rather than waiting until everything inside the master is finished | 10:02
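A minimal sketch of that idea in Heat template terms, with illustrative resource names (not the actual Magnum templates): the minion group depends only on the master servers being up, and takes just the master's address as input:

    kube_minions:
      type: OS::Heat::ResourceGroup
      depends_on: kube_masters   # servers only, not the master's software deployments
      properties:
        count: {get_param: number_of_minions}
        resource_def:
          type: kubeminion.yaml  # hypothetical nested template
          properties:
            kube_master_ip: {get_attr: [kube_masters, resource.0, kube_master_ip]}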
brtknr | lxkong: isn't that to do with your internet connection speed as it is mostly the time required to download containers? | 10:03 |
brtknr | strigazi: i like the idea of launching the master and workers in parallel! | 10:04 |
lxkong | brtknr: no, besides the container installation time, we also want to decrease the provisioning time for everything else | 10:04
lxkong | another k8s cluster was just created in my devstack, i can see: | 10:05 |
lxkong | https://www.irccloud.com/pastebin/xKUT4VOM/ | 10:06 |
lxkong | most of them are just `kubectl apply`, i don't know why they take so long to finish | 10:06
ricolin | IMO try decreasing the polling interval, it might help if you want it to run in a shorter time, just remember to change it back after you're done | 10:10
ricolin | https://github.com/openstack/heat-templates/blob/master/hot/software-config/boot-config/templates/fragments/os-collect-config.conf#L10 | 10:10
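For reference, the linked os-collect-config fragment is a small ini file that looks roughly like this (exact contents vary by release); polling_interval is the knob under discussion:

    [DEFAULT]
    command = os-refresh-config
    # Seconds between polls of the Heat metadata. A lower value picks up new
    # software deployments sooner, at the cost of more load on the Heat API.
    polling_interval = 30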
ricolin | strigazi, have you tried that yet? | 10:10
lxkong | ricolin: 'change it back after you're done' sounds like a hack instead of a solution :-( | 10:13 |
ricolin | lxkong, actually not a hack IMO, just a performance improvement for the vm | 10:14
ricolin | lxkong, you can always leave it with a quick polling interval | 10:14
lxkong | ricolin: you mean, in the magnum scenario, we give it a small value and revert it after setup is done? | 10:15
lxkong | but won't a small polling interval increase the api load? | 10:15
lxkong | ricolin: that may work, but not a perfect solution | 10:16 |
ricolin | lxkong, it will increase the api load to some extent. Not sure there's a better design than polling, maybe we can try to figure one out :) | 10:17
ricolin | lxkong, and yes, you should be able to change it after you're done | 10:18 |
lxkong | ricolin: actually, my original requirement is to execute some scripts inside the vm in parallel, currently we are using SDs, but it seems they don't support parallelism | 10:19
lxkong | unless we write all the scripts in one SD rather than using multiple SDs for a single vm | 10:20
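A minimal sketch of that "everything in one SD" workaround, using hypothetical fragment names (not the actual Magnum scripts):

    #!/bin/bash
    # Run the independent configuration steps concurrently inside a single
    # SoftwareDeployment, then wait for all of them before signalling Heat.
    /opt/magnum/enable-coredns.sh &
    /opt/magnum/enable-dashboard.sh &
    /opt/magnum/enable-calico.sh &
    wait  # exit (and signal Heat) only after every background step finishes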
ricolin | you can use a shell script to do that in an SD right? | 10:20
lxkong | ricolin: exactly and ugly :-) | 10:20 |
lxkong | i guess in magnum, we separate the scripts into different SDs to gain some modularity | 10:21
ricolin | lxkong, we can start discussing whether we should have that in the agent, we just have to figure out how to do it across a dozen agents | 10:22
ricolin | shell, ansible, puppet, etc | 10:22 |
lxkong | ricolin: sounds like a plan | 10:22 |
ricolin | lxkong, happy to help review and trigger a discussion | 10:23
ricolin | and it would be better if we got people to help with it | 10:23
ricolin | like partial success | 10:24 |
ricolin | we need people to do it for sure | 10:24 |
* ricolin hopes he gets more time on that | 10:24 | |
lxkong | ricolin: btw, not sure if you know anybody in the heat team who is familiar with Octavia and could help with this https://storyboard.openstack.org/#!/story/2004564? | 10:25
ricolin | ramishra, is the one | 10:26 |
lxkong | ramishra: it'd be much appreciated if you are available for this feature | 10:28 |
ricolin | lxkong, btw, it's not like everything listed in the etherpad will happen, but it would be much appreciated if you could leave your feedback in the etherpad with your name, so we can give more voice to that issue https://etherpad.openstack.org/p/heat-user-berlin | 10:29
ramishra | lxkong: you asking me to implement it? ;) I can review if someone works on it, have no time to do it myself. | 10:40 |
lxkong | ramishra: that's totally fine, thanks | 10:41 |
lxkong | strigazi: hi, do you have some time to discuss https://review.openstack.org/#/c/497144/? i'd be very happy to hear your suggestions | 10:48
brtknr | lxkong: i am currently trying to set up a simple lbaas using neutron, octavia seemed like overkill for what i was trying to do | 10:49
lxkong | but neutron-lbaas is already deprecated | 10:50
lxkong | but using neutron-lbaas is ok for this feature, we just need to add another hook for it | 10:50
brtknr | lxkong: yes, but it should still work... i think magnum uses neutron lbaas by default, you need to enable octavia explicitly in /etc/magnum/magnum.conf | 10:51 |
brtknr | I got as far as setting up the lbaas but my controller manager is complaining: http://paste.openstack.org/show/736864/ | 10:51 |
brtknr | I am currently digging around for solutions | 10:51 |
lxkong | brtknr: nope, if Octavia is deployed in your cloud, it will be used as default | 10:51
lxkong | brtknr: https://github.com/openstack/magnum/blob/f27bde71719905e6f274a1a57799595780bc50c2/magnum/drivers/heat/template_def.py#L342 | 10:53 |
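A simplified sketch of the selection logic behind that line; the real implementation lives in magnum/drivers/heat/template_def.py and differs in detail, and the helper name here is illustrative:

    from keystoneauth1 import exceptions as ks_exc

    # Illustrative helper: prefer Octavia whenever the service catalog
    # advertises a 'load-balancer' endpoint, otherwise fall back to
    # neutron-lbaas. `session` is a keystoneauth1 Session.
    def default_lb_provider(session):
        try:
            session.get_endpoint(service_type='load-balancer')
            return 'octavia'
        except ks_exc.EndpointNotFound:
            return 'neutron-lbaas'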
brtknr | oh okay | 10:55 |
strigazi | sorry, I was in a meeting | 10:55 |
brtknr | what about loadbalancer for services though? | 10:55 |
brtknr | strigazi: wb | 10:56 |
brtknr | my https://github.com/openstack/magnum/blob/f27bde71719905e6f274a1a57799595780bc50c2/magnum/drivers/heat/template_def.py#L342 | 10:56
brtknr | oops sorry | 10:56 |
strigazi | I'm reading the thread | 10:56 |
strigazi | lxkong: the time that matters is how much time the SDs take after the apiserver reports ok. | 10:58
brtknr | https://github.com/openstack/magnum/blob/53a1840d68382fd7bd6cc1f7c6752a37a632b50b/magnum/drivers/heat/k8s_template_def.py#L104 | 10:58 |
brtknr | looks like you're right lxkong | 10:58 |
strigazi | putting everything in one script might not help if the node performance is poor. The biggest gain is pulling everything in parallel. | 11:00
lxkong | strigazi: `pull everything in parallel` you mean for atomic images or heat sd? | 11:01 |
lxkong | strigazi: from the logs, all the SDs are executed in sequence, so the total time is about 4min after the apiserver is up and running | 11:03
lxkong | but given what those scripts do, 4min is not reasonable | 11:03
lxkong | and from ricolin's explanation, there is a param for os-collect-config `polling-interval` with default value 30, that may be the reason, but i'm not sure | 11:04 |
lxkong | os-collect-config/os-refresh-config/os-apply-config....so many things | 11:05 |
strigazi | maybe your bottleneck is somewhere else. Check what these scripts do. The times you present don't make sense to me. In our cloud the agent starts at 12:18:53 and the last SD finishes at 12:19:29 | 11:08
lxkong | strigazi: are you using the latest magnum code? | 11:09 |
strigazi | lxkong: just a warning, if you reduce the polling time, you will ddos your heat api. Just measure exactly where the time is spent. | 11:10
lxkong | my environment is a devstack with latest code for all the projects | 11:10 |
brtknr | a little while ago, when i was testing changes to magnum on devstack, the VM that i was using had its ip address throttled by openstack servers... I changed the floating ip and the problem was solved. | 11:10
strigazi | let me check my devstack, which is on master. | 11:10
lxkong | strigazi: yeah, i know | 11:10 |
strigazi | in devstack master: start Dec 07 18:29:55, end Dec 07 18:31:47 | 11:12 |
lxkong | hmm... | 11:13 |
lxkong | strigazi: do you mind pasting your heat-container-agent service log? | 11:15
lxkong | from service start until all the phases finish | 11:16
strigazi | http://paste.openstack.org/raw/736889/ | 11:21 |
lxkong | strigazi: in your cluster, you don't have calico_service and kubernetes_dashboard, right? | 11:25 |
lxkong | strigazi: btw, have you tried running `atomic install` in a multi-threaded or multi-process manner when deploying the k8s cluster? | 11:34
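What that might look like, sketched with illustrative image names and flags:

    # Prefetch the k8s system containers concurrently instead of one-by-one.
    for image in kubelet kube-apiserver kube-controller-manager kube-scheduler; do
        atomic pull --storage ostree "docker.io/openstackmagnum/${image}:${KUBE_TAG}" &
    done
    wait  # continue only once every pull has completed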
* lxkong has to get some sleep, will continue testing tomorrow | 11:38 | |
strigazi | lxkong: we don't use calico, maybe calico is slow | 12:11 |
strigazi | lxkong: I have the dashboard too | 12:13 |
strigazi | lxkong: paste.openstack.org cut my paste | 12:22
strigazi | lxkong: here it is https://paste.fedoraproject.org/paste/6-B--pO2cvkvx73-RDbXRg/raw | 12:22 |
strigazi | lxkong: https://paste.fedoraproject.org/paste/HfYsaoQLuNDpKsSJ80VCbw/raw with calico | 13:08 |
strigazi | lxkong: less than two mins | 13:09 |
brtknr | strigazi: have you started using cloud-controller-manager by any chance? | 13:53 |
openstackgerrit | Merged openstack/magnum stable/rocky: Add support for www_authenticate_uri in ContextHook https://review.openstack.org/623679 | 13:57 |
brtknr | anyone here started using cloud-controller-manager service? | 15:09 |
brtknr | with magnum? | 15:09 |
brtknr | looks like there is a service for it | 15:09 |
brtknr | https://hub.docker.com/r/openstackmagnum/openstack-cloud-controller-manager/ | 15:09 |
brtknr | is this wip? | 15:12 |
brtknr | looks like this change was abandoned in July and restored recently: https://review.openstack.org/#/c/577477/3/magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-master.sh | 15:18
strigazi | brtknr: the PS i pushed works | 15:18 |
strigazi | brtknr: you can test | 15:19 |
strigazi | brtknr: i pushed PS version 3 | 15:19 |
brtknr | strigazi: nice, what prompted you to look at occm? | 15:22 |
brtknr | there is nothing with the tag v0.2.0 on dockerhub btw | 15:23 |
brtknr | occm is tagged to download v0.2.0 | 15:23 |
strigazi | brtknr: http://paste.openstack.org/show/736905/ | 15:31 |
strigazi | brtknr: I was testing things upstream for the lbaas delete hook written by lxkong. At CERN we are still not interested. | 15:32 |
strigazi | brtknr: well we might need it for the autoscaler :) | 15:33 |
strigazi | brtknr: we need it for the autoscaler, but i didn't have it in mind xD | 15:33 |
brtknr | oops, i was inspecting the wrong provider | 15:34 |
brtknr | i'm currently looking at neutron-lbaasv2, using cloud-provider=openstack hasn't worked for me. i'm checking to see if cloud-provider=external will | 15:35
strigazi | brtknr: with octavia "It works in devstack"TM | 15:35 |
brtknr | lol | 15:36 |
brtknr | are you using rocky at cern or still queens? | 15:36 |
strigazi | 40% rocky | 15:36 |
strigazi | or more | 15:36 |
strigazi | we cherry-pick only what we need, it is difficult to rebase. Our network is too special | 15:37 |
brtknr | i see | 15:38 |
strigazi | we plan to upgrade in Jan, after holidays. | 15:40 |
strigazi | brtknr: we might be in queens but we use k8s v1.12.3. | 15:41 |
brtknr | oh, how are you using v1.12.3? the openstackmagnum docker hub only goes up to v1.11.5? or are you using the upstream k8s image? | 15:44 |
brtknr | strigazi: do you need kubelet running on master for occm to work? | 15:51 |
strigazi | brtknr yes | 16:27 |
brtknr | cool neutron lbaas v2 is working now :) | 16:41 |
brtknr | with cloud-provider=external | 16:42 |
brtknr | strigazi: just tested https://review.openstack.org/#/c/577477/3 and it works great! | 16:43 |
brtknr | can we cherry-pick https://review.openstack.org/#/c/571190 to queens? | 16:51 |
brtknr | i get merge conflict when i try to do it | 16:52 |
openstackgerrit | Bharat Kunwar proposed openstack/magnum stable/queens: k8s_fedora: Add cloud_provider_enabled label https://review.openstack.org/624132 | 17:01 |
openstackgerrit | Bharat Kunwar proposed openstack/magnum stable/queens: k8s_fedora: Add cloud_provider_enabled label https://review.openstack.org/624132 | 17:09 |
openstackgerrit | Bharat Kunwar proposed openstack/magnum stable/queens: k8s_fedora: Add cloud_provider_enabled label https://review.openstack.org/624132 | 17:10 |
openstackgerrit | Roberto Soares proposed openstack/magnum master: [k8s] Add vulnerability scanner https://review.openstack.org/598142 | 20:10 |
mordred | anybody know off the top of their head - using magnum with atomic hosts, is there a particular directory on each node that should be used for files that should persist across reboots? | 20:37 |
cbrumm_ | According to Red Hat's docs, /var is writable and persists through reboots | 21:01
lxkong | strigazi: hi, are you still here? Could you please give more suggestions on https://review.openstack.org/#/c/497144? I didn't fully understand your comment 'This is per coe. All other features are written per driver'. I want to make sure we agree on the plugin mechanism before I start writing docs for how to configure and test it. | 21:10
roukoswarf | does magnum allow you to set the deployment kubernetes version anywhere? | 21:15 |
lxkong | roukoswarf: `--labels kube_tag=YOUR_VERSION_HERE` | 21:16 |
lxkong | make sure you can find the version in the openstackmagnum account on docker hub if you are using the upstream images | 21:17
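For example (assuming the tag exists under the openstackmagnum Docker Hub namespace and your magnumclient supports overriding labels at create time):

    openstack coe cluster create my-cluster \
        --cluster-template k8s-fedora-atomic \
        --labels kube_tag=v1.11.5 \
        --node-count 2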
roukoswarf | ah okay, forgot about labels and didn't know where it was sourcing it | 21:17
roukoswarf | no versions beyond 1.11? is there a specific update frequency? | 21:18 |
roukoswarf | so even with kube_tag set to v1.11.5-1, i end up with nodes using v1.10.3+coreos.0, is there something i'm missing? | 22:18
mnaser | lxkong: strigazi ricolin i actually took a shot at improving the deploy time | 22:56
mnaser | it is failing, i don't know why, but if someone knows the right thing to do, it might pass | 22:57
mnaser | it involves using softwaredeploymentgroups | 22:57 |
mnaser | https://review.openstack.org/623724 | 22:58 |
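A minimal sketch of the SoftwareDeploymentGroup approach (resource names are illustrative, not the actual patch): one group references a single static SoftwareConfig and fans out one deployment per server, created in parallel:

    master_config_deployment:
      type: OS::Heat::SoftwareDeploymentGroup
      properties:
        config: {get_resource: master_config}          # one shared, static SoftwareConfig
        servers: {get_attr: [kube_masters, refs_map]}  # map of all master servers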
mnaser | btw, anyone going to be at kubecon? | 22:58 |
lxkong | mnaser: not me :-( | 23:05 |
mnaser | lxkong: aw bummer | 23:06 |
mnaser | btw, i would love reviews on this stack here -- https://review.openstack.org/#/c/623628/ | 23:07
mnaser | everything with the functional: prefix brings back functional tests for magnum and runs them in an env with nested virt! | 23:07
mnaser | also https://review.openstack.org/#/c/619643/ would be nice, because the issue it fixes breaks things for a lot of users :( | 23:09
lxkong | mnaser: thanks for proposing this https://review.openstack.org/#/c/623724/, i've added it to my review list and will test when i'm available | 23:09
mnaser | lxkong: yeah, i don't have the tools to repro right now but i think that's the direction we wanna go | 23:12
lxkong | mnaser: definitely | 23:12 |
mnaser | that way in non-ha deploys, the servers get deployed quickly and the minions can start spinning up already while softwaredeploys happen for masters | 23:12 |
mnaser | but i haven't had time to set up a dev env to hack on it, so yay for working functional tests so we can actually just see that change in CI :) | 23:12
lxkong | mnaser: what i thought is to spin up the workers just after the vm is created successfully | 23:13
lxkong | because what the workers need is just an ip address | 23:13
mnaser | lxkong: yep! and that's why ha goes up faster, because they take the lb ip address | 23:13
lxkong | in an ha cluster, it's the lb vip; for a non-ha cluster, it's master0's private ip | 23:13
mnaser | vs in non-ha we have to wait for the vm to go up | 23:13 |
mnaser | yup :D | 23:13 |
mnaser | also with the approach i took there | 23:14 |
lxkong | yeah | 23:14 |
mnaser | you reduce the # of resources you have | 23:14 |
mnaser | no need to have a softwareconfig+softwaredeployment per node | 23:14
mnaser | esp when the softwareconfigs are actually static, so it is less load on heat | 23:14
lxkong | i'll give it a try after review and testing this one https://review.openstack.org/#/c/561783/ | 23:15 |
mnaser | lxkong: awesome! :) | 23:29 |
mnaser | please do keep me updated | 23:29 |
lxkong | sure, i'll | 23:29 |
mnaser | if you need anything please feel free to ping! | 23:30
lxkong | mnaser: yep, thanks :-) | 23:30 |