openstackgerrit | Feilong Wang proposed openstack/magnum master: [k8s] Update cluster health status by native API https://review.openstack.org/572897 | 00:00 |
*** hongbin has joined #openstack-containers | 00:26 | |
*** livelace has quit IRC | 00:43 | |
*** livelace has joined #openstack-containers | 00:43 | |
*** janki has joined #openstack-containers | 01:28 | |
*** ricolin has joined #openstack-containers | 01:33 | |
*** Bhujay has joined #openstack-containers | 02:00 | |
openstackgerrit | Feilong Wang proposed openstack/magnum master: [k8s] Update cluster health status by native API https://review.openstack.org/572897 | 02:27 |
*** janki has quit IRC | 03:06 | |
*** Bhujay has quit IRC | 03:24 | |
*** Nel1x has quit IRC | 03:34 | |
*** ramishra has joined #openstack-containers | 03:40 | |
*** ykarel has joined #openstack-containers | 04:00 | |
*** udesale has joined #openstack-containers | 04:00 | |
*** ykarel has quit IRC | 04:06 | |
*** ykarel has joined #openstack-containers | 04:07 | |
*** hongbin has quit IRC | 04:10 | |
*** janki has joined #openstack-containers | 04:27 | |
*** Bhujay has joined #openstack-containers | 04:45 | |
*** Bhujay has quit IRC | 04:51 | |
*** Bhujay has joined #openstack-containers | 04:53 | |
*** flwang1 has quit IRC | 04:54 | |
*** ykarel has quit IRC | 05:07 | |
*** ykarel has joined #openstack-containers | 05:25 | |
*** janki has quit IRC | 06:06 | |
*** rcernin has quit IRC | 06:38 | |
*** rcernin has joined #openstack-containers | 06:41 | |
*** pcaruana has joined #openstack-containers | 06:42 | |
*** adrianc has joined #openstack-containers | 06:51 | |
*** rcernin has quit IRC | 06:51 | |
strigazi | ykarel: can you have a look: 592336 | 07:11 |
strigazi | imdigitaljim: http://paste.openstack.org/show/728455/ looks nice | 07:12 |
*** mattgo has joined #openstack-containers | 07:39 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/magnum-ui master: Imported Translations from Zanata https://review.openstack.org/594054 | 07:47 |
ykarel | strigazi, okk will check | 07:49 |
ykarel | strigazi, there are two test cases failing, is that a known issue? | 07:52 |
ykarel | No API token found for service account "default", retry after the token is automatically created and added to the service account | 07:52 |
strigazi | ykarel in the functional tests? | 07:54 |
ykarel | strigazi, yes, and the other is TypeError: delete_namespaced_service() takes exactly 4 arguments, which seems related to the kubernetes client version | 07:54 |
strigazi | ykarel this is known ^^ | 07:54 |
strigazi | the other must be happening because it tries to create something too quickly | 07:55 |
ykarel | strigazi, okk, +2 +W, for the known issue is there a patch already? | 07:56 |
strigazi | ykarel: no. It needs a change in the params of the client. | 07:57 |
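That TypeError is the usual signature drift in the python kubernetes client: some releases require a `body` argument on namespaced delete calls and others don't. A hedged sketch of a call that satisfies the stricter signature (service name is hypothetical; this is not the actual functional-test code):

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a local kubeconfig is available
core_v1 = client.CoreV1Api()

# passing DeleteOptions explicitly keeps the call valid on client releases
# where `body` is a required positional argument
core_v1.delete_namespaced_service(
    name='my-service',       # hypothetical service name
    namespace='default',
    body=client.V1DeleteOptions(),
)
```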
ykarel | strigazi, ack | 07:58 |
strigazi | ykarel: where do you see this? No API token found for service account "default", retry after the token is automatically created and added to the service account | 08:03 |
ykarel | http://logs.openstack.org/36/592336/3/check/magnum-functional-k8s/2c671f0/job-output.txt.gz#_2018-08-20_12_01_02_553214 | 08:03 |
ykarel | strigazi, ^^ | 08:03 |
*** yankcrime has joined #openstack-containers | 08:11 | |
*** suanand has joined #openstack-containers | 08:28 | |
*** olivenwk has joined #openstack-containers | 08:38 | |
*** flwang1 has joined #openstack-containers | 08:43 | |
openstackgerrit | Merged openstack/magnum master: [k8s] Add proxy to master and set cluster-cidr https://review.openstack.org/592336 | 08:46 |
flwang1 | strigazi: around for a quick sync? | 08:48 |
strigazi | flwang1: I'm going to a 1 hour meeting :( | 08:49 |
flwang1 | strigazi: no problem | 08:50 |
openstackgerrit | Feilong Wang proposed openstack/magnum master: [k8s] Update cluster health status by native API https://review.openstack.org/572897 | 08:50 |
flwang1 | strigazi: i can manage to get the versioned object working, but it doesn't play well with the wsme type system | 08:50 |
flwang1 | the wsme types can't support CoercedDict well | 08:51 |
flwang1 | as a result, in the api response, users can't see the 2nd layer dict of the health_status_reason | 08:53 |
*** adrianc has quit IRC | 10:00 | |
*** adrianc has joined #openstack-containers | 10:09 | |
openstackgerrit | suzhengwei proposed openstack/magnum master: service init or heartbeat without hostname https://review.openstack.org/594119 | 10:20 |
strigazi | flwang1: ping | 10:24 |
*** Bhujay has quit IRC | 10:26 | |
*** ykarel is now known as ykarel|lunch | 10:29 | |
flwang1 | strigazi: yes | 10:46 |
strigazi | flwang1: sync? | 10:46 |
flwang1 | strigazi: let's do it | 10:47 |
flwang1 | except rolling upgrade and health monitoring, anything else we want to get in rocky | 10:47 |
strigazi | I have a minor one, that came up today. It is for swarm-mode | 10:48 |
strigazi | Make the default overlay network CIDR of swarm-mode configurable | 10:49 |
strigazi | It conflicts with some of our public ips | 10:49 |
strigazi | The current default one | 10:49 |
strigazi | I guess you are not interested in swarm | 10:50 |
flwang1 | strigazi: yep, you're right | 10:50 |
flwang1 | so let's just focus on the upgrade one and cluster health monitoring? | 10:51 |
strigazi | yes | 10:51 |
flwang1 | currently, the health monitoring just works, but as i mentioned above, i'm dealing with the wsme types and oslo.versionedobjects | 10:51 |
strigazi | link to the code? | 10:52 |
flwang1 | https://review.openstack.org/572897 you mean patch link? | 10:52 |
strigazi | and file | 10:52 |
strigazi | https://review.openstack.org/#/c/572897/8/magnum/api/controllers/v1/cluster.py@141 | 10:52 |
flwang1 | https://review.openstack.org/#/c/572897/8/magnum/api/controllers/v1/cluster.py@141 | 10:52 |
flwang1 | yep | 10:53 |
flwang1 | i'm still testing to figure out the correct way to display the 2 layers nested dict | 10:53 |
strigazi | in the client? | 10:53 |
flwang1 | client does nothing, the error comes from server side | 10:54 |
strigazi | The api can return a valid json | 10:54 |
flwang1 | but we may need a client change to show the two new fields in the table | 10:54 |
flwang1 | api can't return a valid json | 10:54 |
strigazi | in a string | 10:55 |
strigazi | if it can't, right | 10:55 |
flwang1 | https://review.openstack.org/#/c/572897/8/magnum/drivers/common/k8s_monitor.py@196 | 10:56 |
flwang1 | this is current data structure of the health_status_reason | 10:56 |
flwang1 | with current structure, we can basically provide all the info we got from k8s to the cluster auto healer | 10:56 |
strigazi | looks good | 10:56 |
strigazi | Is there something that we're missing now? can we take this? https://review.openstack.org/#/c/570818/11/magnum/api/controllers/v1/cluster.py | 10:59 |
flwang1 | no, that patch will be updated in favor of the new health status reason structure | 11:00 |
flwang1 | never mind, i will figure out | 11:00 |
flwang1 | i just need your opinions about the whole workflow and the info we can provide with the health_status_reason | 11:01 |
flwang1 | and it would be nice if Ricardo can review it as well | 11:02 |
strigazi | I'm a little lost on the dependency of the patches. Do we need to decide on 572897 and then update 570818 ? | 11:02 |
strigazi | With the current state | 11:03 |
strigazi | we will have another periodic task, the one you created sync_cluster_health_status | 11:04 |
strigazi | that will check the _COMPLETE clusters | 11:05 |
strigazi | or even the ones with the statuses you listed | 11:05 |
strigazi | and it will update the status accordingly | 11:05 |
strigazi | if the api doesn't return ok or if any node is not ready the cluster will be unhealthy | 11:06 |
strigazi | makes sense? | 11:06 |
strigazi | flwang1: ^^ | 11:06 |
flwang1 | after figure out the working structure, i will update 570818 | 11:06 |
flwang1 | firstly | 11:06 |
flwang1 | yep, as i mentioned in the design policy, if any node or api is not in good status, then the overall cluster is unhealthy | 11:07 |
flwang1 | we can improve the algorithm later, for the first version, i'd like to make it strict | 11:07 |
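As a sketch of that strict first-version rule (a hypothetical helper, not the code under review): the cluster is healthy only when the API answers and every node reports Ready.

```python
def compute_health_status(api_ok, nodes_ready):
    """Strict rule: everything must be healthy, else the cluster isn't.

    api_ok: bool, whether the k8s API health check passed
    nodes_ready: dict mapping node name -> bool (the Ready condition)
    """
    if api_ok and all(nodes_ready.values()):
        return 'HEALTHY'
    return 'UNHEALTHY'
```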
strigazi | Isn't this strict enough? How can it be stricter? | 11:08 |
flwang1 | if you do have concern, we can hide the health_status and health_status_reason attributes for now | 11:08 |
strigazi | I don't have a concern about it | 11:09 |
flwang1 | i didn't verify, but for example, one of the nodes may have disk pressure, but it's still in ready status | 11:09 |
flwang1 | something like that | 11:09 |
strigazi | I think we can base the node status only on the Ready field | 11:09 |
flwang1 | because currently, i'm just using the 'Ready' to represent the health status of minion node | 11:10 |
*** ykarel|lunch is now known as ykarel | 11:10 | |
flwang1 | yep, that's my current design | 11:10 |
flwang1 | but i do want to provide all the conditions of the node for reference | 11:10 |
flwang1 | hence why i'm making the health_status_reason's data structure a little bit 'rich' | 11:11 |
strigazi | ok | 11:11 |
strigazi | not bad | 11:11 |
strigazi | what we can do | 11:11 |
strigazi | is leave the reason empty if it is healthy | 11:11 |
strigazi | and there are no 'issues' | 11:12 |
flwang1 | for the worst case, we can make the health_status_reason as a very simple dict | 11:13 |
*** Bhujay has joined #openstack-containers | 11:13 | |
flwang1 | e.g. if cluster is UNHEALTHY, then health_status_reason = {"node-0.Ready": False} | 11:14 |
flwang1 | or something like that | 11:14 |
strigazi | I think it is getting complicated like this. Complicated on the server side | 11:15 |
flwang1 | given it's a dict and only magnum internal auto healer will parse it, we should be OK | 11:15 |
flwang1 | yep | 11:15 |
flwang1 | i know | 11:15 |
flwang1 | we have to balance | 11:15 |
strigazi | As a first step, the simpler-to-implement-and-maintain solution sounds ideal to me | 11:16 |
flwang1 | sure, i will continue to investigate and discuss with you later | 11:18 |
strigazi | wait a moment | 11:18 |
strigazi | I'm still not sure what we haven't decided yet. It seems to me that only the health_status_reason field is not clear right? | 11:19 |
strigazi | flwang1: ^^ | 11:20 |
flwang1 | yes | 11:20 |
flwang1 | otherwise, it just works for me | 11:20 |
strigazi | So the missing part is nested vs not nested dict | 11:21 |
flwang1 | yep | 11:22 |
strigazi | Doesn't nested dict work? | 11:23 |
flwang1 | with health_status_reason = wtypes.DictType(str, o_fields.CoercedDict) | 11:23 |
flwang1 | the response will be "health_status_reason": {"k8scluster-wnd2jvqdmci3-master-0": {}, "api": {}, "k8scluster-wnd2jvqdmci3-minion-0": {}}, "user_id": "6bc2f37c4c424182967b51386270ec1c", "uuid": "cfd56d2b-73f1-4f27-a007-dc3473b681ee", "api_address": "https://172.24.4.19:6443", "master_addresses": ["172.24.4.19"], "node_count": 1, "project_id": "116161cb5f384bfa80c21b6ab0bff625", "status": "CREATE_COMPLETE", "docker_volume_si | 11:24 |
flwang1 | as you can see, the 2nd layer dict is {} | 11:25 |
*** udesale has quit IRC | 11:25 | |
flwang1 | but it's stored correctly in magnum db | 11:25 |
strigazi | So, wsme incompatibility | 11:25 |
flwang1 | we just need to figure out how to make wsme and oslo.versionedobjects work together as we want | 11:25 |
flwang1 | yep, if we can call it 'incompatibility' | 11:26 |
flwang1 | wsme just can't handle the CoercedDict type from oslo.versionedobject | 11:26 |
flwang1 | maybe we need a customized wsme type | 11:26 |
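One way such a custom type could look (a minimal sketch built on wsme's UserType conversion hooks; `NestedDictType` is hypothetical, not magnum code): round-trip the nested dict through JSON so wsme only ever handles text, matching the "in a string" idea above.

```python
from oslo_serialization import jsonutils
import wsme.types as wtypes

class NestedDictType(wtypes.UserType):
    """Hypothetical wsme user type serializing nested dicts as JSON text."""
    basetype = wtypes.text
    name = 'nesteddict'

    def tobasetype(self, value):
        # dict -> JSON text for the API response
        return jsonutils.dumps(value) if value is not None else None

    def frombasetype(self, value):
        # JSON text -> dict when reading input back in
        return jsonutils.loads(value) if value is not None else None

# usage sketch on the API controller:
# health_status_reason = wsme.wsattr(NestedDictType(), readonly=True)
```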
strigazi | that is one option | 11:27 |
strigazi | the other is to go for non-nested dict | 11:27 |
flwang1 | yep | 11:28 |
flwang1 | as we discussed above | 11:28 |
flwang1 | but a non-nested dict may make the data look weird | 11:28 |
flwang1 | will dig | 11:28 |
strigazi | IMO, it is not bad to go for the flat dict initially | 11:29 |
flwang1 | ok, that's my last resort | 11:30 |
flwang1 | are we on the same page now for the health monitoring? | 11:32 |
strigazi | yes | 11:32 |
strigazi | I would go for the flat option | 11:32 |
strigazi | fyi | 11:32 |
flwang1 | btw, i'd like to totally drop the https://github.com/openstack/magnum/blob/master/magnum/drivers/common/k8s_monitor.py#L41 in Stein | 11:33 |
flwang1 | it's useless | 11:33 |
strigazi | +1 | 11:33 |
strigazi | Should we also emit a notification when a cluster is not healthy? | 11:34 |
flwang1 | pls revisit https://review.openstack.org/572249 and let's merge it and backport to Rocky | 11:34 |
flwang1 | then we can drop the pull_data in Stein | 11:34 |
strigazi | ok | 11:34 |
flwang1 | strigazi: it's good to have but i'd like to wait until there is a real user requirement | 11:35 |
strigazi | You don't use notifications? | 11:35 |
flwang1 | we use, but don't consume it much TBH | 11:35 |
strigazi | We do, for magnum we are working on it | 11:36 |
strigazi | but for heat it looks great | 11:36 |
flwang1 | ok, cool, then we can have | 11:36 |
strigazi | eg | 11:36 |
flwang1 | use zaqar queue for status update? | 11:36 |
strigazi | in the weekend maybe some nodes went unhealthy and then healthy again | 11:37 |
flwang1 | ah, right | 11:37 |
strigazi | which status update? | 11:37 |
strigazi | We don't have zaqar | 11:37 |
flwang1 | nevermind then | 11:37 |
flwang1 | back to flat dict, then how do we define the key | 11:38 |
flwang1 | for api, we can just use "api": "ok", but how about minion nodes? | 11:38 |
strigazi | it needs to be one right? | 11:39 |
flwang1 | using something like "minion-node-1.Ready": False? | 11:39 |
strigazi | Can we have: {"api": ok, "minion-node-1.Ready": False, "minion-node-2.Ready": True} | 11:39 |
flwang1 | {"api": "ok", "node-0.Ready": True, "node-0.OutOfDisk": False, "node-1.Ready": True, "node-1.OutOfDisk": False, ... ...} | 11:40 |
flwang1 | that's what i'm suggesting above | 11:40 |
flwang1 | that's not perfect, but I think it's clean/clear enough | 11:40 |
strigazi | it is clear, very clear | 11:41 |
flwang1 | do you want to see the "node-0.OutOfDisk": False | 11:41 |
flwang1 | other conditions except Ready | 11:41 |
flwang1 | if yes, then we need the format "nodename.Ready" otherwise, just "nodename": 'ok' or "nodename": True | 11:42 |
strigazi | It is very good to see all, it just gets a bit heavy. | 11:42 |
strigazi | 5 per node, right? | 11:43 |
flwang1 | yes | 11:43 |
strigazi | Let's make a quick count of the data required though | 11:43 |
flwang1 | so how about just use 'Ready', but still keep the 'nodename.Ready' format for the future | 11:43 |
strigazi | ^^ better than just ok | 11:43 |
flwang1 | can you elaborate? | 11:44 |
strigazi | give me 5' | 11:45 |
flwang1 | still around? | 12:06 |
flwang1 | strigazi: ^ | 12:11 |
strigazi | here | 12:11 |
strigazi | for your question | 12:12 |
strigazi | nodename.Ready: True is better than nodename: ok | 12:12 |
flwang1 | ok, then next question, do we want to cover all the 5 for the first version? | 12:13 |
strigazi | how much space will we need for a 1000 node cluster? | 12:13 |
flwang1 | it could be long, but we're using Text | 12:14 |
strigazi | LONGTEXT is ~4GB i think | 12:15 |
flwang1 | that should be fine | 12:15 |
flwang1 | from another angle, can we fix issue like OutOfDisk? | 12:15 |
strigazi | replace the node, it is a 'fix' | 12:16 |
*** adrianc has left #openstack-containers | 12:16 | |
flwang1 | so how about just show the nodename.Ready for now and add more in the future after we figure out the whole picture | 12:16 |
strigazi | It sounds good to me | 12:17 |
flwang1 | deal | 12:17 |
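For reference, a sketch of building the agreed flat dict from the native API objects (a hypothetical helper using the kubernetes Python client; the real change lives in k8s_monitor.py):

```python
def build_health_status_reason(core_v1):
    """Hypothetical sketch of the agreed flat format:
    {"api": "ok", "<nodename>.Ready": True, ...}
    """
    reason = {}
    try:
        # a cheap call standing in for an API health check
        core_v1.get_api_resources()
        reason['api'] = 'ok'
    except Exception:
        reason['api'] = 'unhealthy'
    for node in core_v1.list_node().items:
        ready = any(c.type == 'Ready' and c.status == 'True'
                    for c in node.status.conditions or [])
        reason['%s.Ready' % node.metadata.name] = ready
    return reason
```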
strigazi | Incremental changes are better | 12:17 |
strigazi | long text is L + 4 bytes, where L < 2^32 | 12:17 |
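A back-of-the-envelope check of that size concern (illustrative arithmetic only): even the Ready flag for a 1000-node cluster is tiny next to the ~4 GB LONGTEXT ceiling.

```python
# ~45 bytes per serialized entry like '"k8scluster-abc-minion-999.Ready": false, '
per_node = 45
nodes = 1000
total = per_node * nodes          # 45,000 bytes, roughly 44 KB
longtext_limit = 2 ** 32 - 1      # ~4 GB
print(total, total / longtext_limit)  # a vanishing fraction of the limit
```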
strigazi | Are we good with health status? | 12:18 |
flwang1 | i feel very good | 12:19 |
flwang1 | question for the rolling upgrade | 12:20 |
strigazi | upgrades then, did you get the gist of it? Apart from some hypervisor reboot I'll make it fully functional today | 12:20 |
strigazi | tell me | 12:20 |
flwang1 | i need a series of ready to test patches | 12:20 |
flwang1 | i'm really keen to test it | 12:20 |
strigazi | The idea is to provide users with cluster templates and they just follow those | 12:23 |
flwang1 | yep, i understand that | 12:23 |
strigazi | Did you see my patch for adding the heat agent in the minions? | 12:23 |
flwang1 | yep | 12:23 |
flwang1 | i saw that | 12:23 |
flwang1 | is it ready to go? | 12:24 |
strigazi | The ssh part | 12:24 |
flwang1 | why do we need the ssh part? | 12:24 |
strigazi | to act as being in the host | 12:24 |
strigazi | some operations need to be in the same filesystem | 12:24 |
strigazi | eg https://review.openstack.org/#/c/561858/1/magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-minion.sh@157 | 12:25 |
strigazi | eg https://review.openstack.org/#/c/561858/1/magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-minion.sh@16 | 12:26 |
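As I read the ssh part (a hedged sketch of the idea, not the patch itself): the containerized heat agent executes each deployment fragment over ssh to the host, so host paths like the ones linked above resolve on the host filesystem rather than inside the container.

```python
import subprocess

def run_fragment_on_host(script_path):
    """Hedged sketch: run a software-deployment fragment on the host over
    ssh so it 'acts as being in the host'. The key path and user are
    assumptions, not the actual review's values.
    """
    subprocess.check_call([
        'ssh', '-i', '/var/lib/cloud/ssh-key',  # hypothetical key location
        'root@127.0.0.1', 'bash', script_path,
    ])
```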
flwang1 | ok, got it | 12:28 |
strigazi | We can continue in the meeting if you don't have a question now | 12:29 |
flwang1 | will you upload new patch set today? | 12:32 |
strigazi | yes | 12:32 |
flwang1 | cool, i just want to give it a try | 12:32 |
flwang1 | to understand it better | 12:32 |
strigazi | ok | 12:32 |
flwang1 | thanks for working on that, i know it's a hard one | 12:33 |
strigazi | It didn't go as well as I wanted | 12:33 |
strigazi | It took too much time | 12:34 |
flwang1 | i can imagine | 12:34 |
flwang1 | but it would be a great feature | 12:34 |
strigazi | I hope it is | 12:35 |
strigazi | I need to go, are you planning to sleep? :) | 12:36 |
flwang1 | yep | 12:36 |
flwang1 | will we have meeting after 9 hours? | 12:36 |
strigazi | Thanks for staying late Feilong, you have done great work! | 12:36 |
strigazi | yes | 12:37 |
flwang1 | cool, ttyl | 12:37 |
flwang1 | have a good one | 12:37 |
strigazi | good night | 12:37 |
strigazi | @all meeting: https://wiki.openstack.org/wiki/Meetings/Containers#Agenda_for_2018-08-21_2100_UTC | 12:38 |
strigazi | Tuesday, 21 August 2018 21:00UTC | 12:39 |
*** pbourke has quit IRC | 14:06 | |
*** pbourke has joined #openstack-containers | 14:07 | |
*** suanand has quit IRC | 14:53 | |
openstackgerrit | Spyros Trigazis proposed openstack/magnum stable/queens: [k8s] Add proxy to master and set cluster-cidr https://review.openstack.org/594264 | 14:55 |
*** hongbin has joined #openstack-containers | 14:56 | |
*** pcaruana has quit IRC | 15:09 | |
*** Bhujay has quit IRC | 15:27 | |
*** ykarel is now known as ykarel|away | 15:45 | |
*** strigazi has quit IRC | 15:46 | |
*** strigazi has joined #openstack-containers | 15:46 | |
*** ramishra has quit IRC | 15:54 | |
*** itlinux has joined #openstack-containers | 15:59 | |
*** ricolin has quit IRC | 16:06 | |
*** olivenwk has quit IRC | 16:31 | |
*** ykarel|away has quit IRC | 16:49 | |
*** mattgo has quit IRC | 16:54 | |
*** robertomls has joined #openstack-containers | 17:26 | |
*** cbrumm has quit IRC | 17:42 | |
*** dave-mccowan has quit IRC | 17:44 | |
*** portdirect has quit IRC | 17:54 | |
*** cbrumm has joined #openstack-containers | 17:55 | |
*** robertomls has quit IRC | 18:27 | |
*** dave-mccowan has joined #openstack-containers | 18:30 | |
*** dave-mccowan has quit IRC | 18:35 | |
*** vkmc has quit IRC | 18:37 | |
*** sahilsinha has quit IRC | 18:37 | |
*** fungi has quit IRC | 18:37 | |
*** vkmc has joined #openstack-containers | 18:40 | |
*** Chealion has quit IRC | 18:42 | |
*** tobberydberg has quit IRC | 18:42 | |
*** mnaser has quit IRC | 18:42 | |
*** fungi has joined #openstack-containers | 18:48 | |
*** mnaser has joined #openstack-containers | 19:08 | |
*** robertomls has joined #openstack-containers | 19:18 | |
*** spiette has quit IRC | 19:32 | |
*** sdake has quit IRC | 19:33 | |
*** sdake has joined #openstack-containers | 19:34 | |
*** spiette has joined #openstack-containers | 19:36 | |
*** ArchiFleKs has quit IRC | 19:39 | |
*** robertomls has quit IRC | 19:43 | |
*** flwang1 has quit IRC | 19:47 | |
*** robertomls has joined #openstack-containers | 19:47 | |
*** robertomls has quit IRC | 20:16 | |
*** robertomls has joined #openstack-containers | 20:16 | |
*** robertomls has quit IRC | 20:41 | |
strigazi | flwang: imdigitaljim are you here? | 20:58 |
*** canori02 has joined #openstack-containers | 20:59 | |
strigazi | I'll wait a bit before starting the meeting | 21:00 |
imdigitaljim | yeah | 21:00 |
imdigitaljim | sorry | 21:00 |
imdigitaljim | im available | 21:00 |
strigazi | I think flwang will join at some point | 21:01 |
strigazi | imdigitaljim: let's start then | 21:01 |
strigazi | #startmeeting containers | 21:01 |
openstack | Meeting started Tue Aug 21 21:01:48 2018 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot. | 21:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 21:01 |
*** openstack changes topic to " (Meeting topic: containers)" | 21:01 | |
openstack | The meeting name has been set to 'containers' | 21:01 |
strigazi | #topic Roll Call | 21:01 |
*** openstack changes topic to "Roll Call (Meeting topic: containers)" | 21:01 | |
imdigitaljim | o/ | 21:02 |
colin- | hi | 21:02 |
strigazi | o/ | 21:02 |
*** harlowja has joined #openstack-containers | 21:02 | |
strigazi | #topic Announcements | 21:02 |
*** openstack changes topic to "Announcements (Meeting topic: containers)" | 21:02 | |
strigazi | kubernetes v1.11.2 is up: https://hub.docker.com/r/openstackmagnum/kubernetes-kubelet/tags/ | 21:03 |
canori02 | o/ | 21:03 |
strigazi | #topic Blueprints/Bugs/Ideas | 21:03 |
*** openstack changes topic to "Blueprints/Bugs/Ideas (Meeting topic: containers)" | 21:03 | |
strigazi | hello canori02 | 21:03 |
imdigitaljim | i saw your proxy merge. I'll rebase those other ones and we'll be good to go =] | 21:04 |
imdigitaljim | on a sidenote we recently switched over to DS proxy as well | 21:04 |
imdigitaljim | so if you want i can throw a PR for that at some point? | 21:04 |
strigazi | imdigitaljim: you changed to DS downstream? | 21:04 |
imdigitaljim | yeah | 21:05 |
imdigitaljim | making motion towards self-hosted | 21:05 |
imdigitaljim | to make in-place upgrades smoother | 21:05 |
strigazi | I think it is better to move to DS if we have a plan for most of the components | 21:05 |
imdigitaljim | well ill get a PR up for it soon | 21:06 |
strigazi | you have only proxy as a DS? | 21:06 |
imdigitaljim | yeah currently | 21:06 |
imdigitaljim | everything else is static still for now | 21:06 |
strigazi | and calico is a DS right? | 21:06 |
imdigitaljim | yes | 21:06 |
imdigitaljim | and keystone-auth plugin is DS | 21:06 |
imdigitaljim | for masters | 21:06 |
imdigitaljim | nodeSelector: node-role.kubernetes.io/master: "" | 21:07 |
imdigitaljim | in other words | 21:07 |
imdigitaljim | ccm is the same as well | 21:07 |
strigazi | node selector can be used for api, scheduler and controller-manager too I guess | 21:08 |
imdigitaljim | in most deployments that is exactly whats its for | 21:08 |
imdigitaljim | but yeah that'd be appropriate too | 21:08 |
imdigitaljim | and a few tolerations | 21:09 |
imdigitaljim | i have some concerns about the reliability on self-hosted | 21:09 |
imdigitaljim | in terms of node reboot | 21:09 |
imdigitaljim | so i think ill have to plan those out | 21:09 |
imdigitaljim | and see what our 'competition' does | 21:10 |
imdigitaljim | but in general i think our near term goals will be in-place upgrade | 21:10 |
strigazi | static pods shouldn't be a problem if kubelet starts | 21:10 |
imdigitaljim | yeah thats what i was thinking | 21:11 |
imdigitaljim | except we'd probably need to keep etcd static as well | 21:11 |
imdigitaljim | without etcd it cant join | 21:11 |
imdigitaljim | what kubeadm does is it throws up a static set for rejoining | 21:11 |
imdigitaljim | and then tears it down when reconnected to the cluster | 21:11 |
imdigitaljim | which i think would be preferable | 21:11 |
strigazi | What do you mean with a static set? | 21:12 |
strigazi | for single master that I have tried, it is just a static pod, isn't it? | 21:12 |
imdigitaljim | it ends up being one yes | 21:12 |
imdigitaljim | i think i'll have to investigate the workflow a little more | 21:13 |
imdigitaljim | make sure im also understanding it correctly | 21:13 |
strigazi | if the data of etcd are in place, rebooting the node isn't a problem | 21:13 |
imdigitaljim | but if etcd is self-hosted as well | 21:13 |
strigazi | using a static pod or not | 21:14 |
imdigitaljim | theres no apiserver online to interact with it | 21:14 |
imdigitaljim | no? | 21:14 |
strigazi | kubelet can't start pods without an api server | 21:14 |
imdigitaljim | (multimaster scenario) | 21:14 |
imdigitaljim | exactly | 21:14 |
strigazi | In multimaster I don't know what kubeadm does | 21:15 |
strigazi | if it converts the static pods to a deployment | 21:15 |
strigazi | or ds | 21:15 |
imdigitaljim | i think this problem is significantly easier in single master | 21:15 |
strigazi | well if, you use static pods, it is the "same" for multi and single master | 21:16 |
imdigitaljim | yeah | 21:16 |
strigazi | if reboot one by one | 21:16 |
imdigitaljim | are you assuming etcd is also self-hosted in this? | 21:17 |
strigazi | if you reboot all of them in one go maybe the result is different | 21:17 |
imdigitaljim | or not | 21:17 |
strigazi | I don't see a difference | 21:17 |
strigazi | kubelet for static pods is like systemd for processes :) | 21:18 |
flwang | sorry i'm late | 21:18 |
colin- | welcome | 21:18 |
imdigitaljim | https://github.com/kubernetes/kubeadm/blob/master/docs/design/design_v1.10.md#optional-and-alpha-in-v19-self-hosting | 21:18 |
imdigitaljim | i believe there are some unaccounted for situations | 21:19 |
strigazi | oh, you mean make even etcd a ds | 21:20 |
strigazi | I wouldn't do that :) | 21:20 |
strigazi | it sounds terrifying | 21:21 |
imdigitaljim | haha | 21:21 |
imdigitaljim | yeah just some concerns to look at | 21:21 |
imdigitaljim | definitely possible to overcome though | 21:21 |
strigazi | ok, since flwang is also in, | 21:22 |
flwang | ds for etcd? | 21:22 |
colin- | yes, mad science lab :) | 21:22 |
strigazi | I was rebasing the upgrades api and I spent a good three hours debugging. (ds for etcd, yes) | 21:23 |
strigazi | The new change for the svc account keys was "breaking" the functionality | 21:23 |
flwang | strigazi: breaking? | 21:24 |
strigazi | but I finally figured it out: when magnum creates the dict of params to pass to heat | 21:24 |
strigazi | it adds and generates the keys for the service account | 21:25 |
strigazi | and since it generates them every time magnum creates the params, the keys will be different | 21:25 |
strigazi | "breaking" not really breaking | 21:25 |
flwang | ah | 21:26 |
flwang | so do we need any change for the key generating? | 21:26 |
strigazi | maybe, but for now I just ignore those two params | 21:26 |
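A minimal sketch of that workaround (function and helper names are hypothetical; it assumes Heat's PATCH-style update reuses the existing values of parameters that are omitted):

```python
def extract_cluster_params(cluster, is_update=False):
    params = {
        'node_count': cluster.node_count,
        # ... other template parameters ...
    }
    if not is_update:
        # generate the service account keypair only on create; on update,
        # omitting these params lets Heat keep the values already in the stack
        pub, priv = generate_service_account_keys()  # hypothetical helper
        params['kube_service_account_key'] = pub
        params['kube_service_account_private_key'] = priv
    return params
```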
strigazi | I'll push after the meeting | 21:26 |
imdigitaljim | thats a crazy issue | 21:27 |
imdigitaljim | yeah | 21:27 |
imdigitaljim | maybe we should save in barbican | 21:27 |
imdigitaljim | or | 21:27 |
flwang | strigazi: ok, i will keep an eye on that part | 21:27 |
imdigitaljim | whatever secret backend | 21:27 |
imdigitaljim | and extract or create depending on update/create? | 21:27 |
strigazi | we should do the same thing we do for the ca | 21:27 |
imdigitaljim | so yeah^ | 21:27 |
strigazi | imdigitaljim we can do that too | 21:27 |
strigazi | the only thing I don't like is having the secret in barbican and then passing it as a heat parameter | 21:28 |
colin- | i think that would be useful without too much cost | 21:28 |
colin- | oh | 21:28 |
strigazi | It will stay in the heat db forever | 21:29 |
colin- | that's lame | 21:29 |
imdigitaljim | thats true | 21:29 |
strigazi | encrypted, but still | 21:29 |
imdigitaljim | can trustee be extended to allow cluster to interact with barbican? | 21:29 |
strigazi | yes | 21:29 |
imdigitaljim | (or secret backend) | 21:29 |
strigazi | we don't have to do anything actually, the trustee with the trust can talk to barbican | 21:30 |
imdigitaljim | yeah | 21:30 |
imdigitaljim | thats what i was hoping :] | 21:30 |
strigazi | vault or other is different | 21:30 |
imdigitaljim | i havent actually tried it | 21:30 |
imdigitaljim | people can add support for other backend as they need imho | 21:30 |
strigazi | Let's see in stein | 21:30 |
flwang | imdigitaljim: i'm interested in your keystone implementation btw | 21:31 |
imdigitaljim | oh sure | 21:31 |
flwang | imdigitaljim: especially if it needs api restart | 21:31 |
imdigitaljim | ive been working through some changes on new features here and we just accepted some alpha customers so we've been tied up | 21:31 |
imdigitaljim | but ill revisit those PR's and then some | 21:31 |
imdigitaljim | which api restart? | 21:32 |
flwang | k8s api server | 21:32 |
imdigitaljim | oh im not doing a restart, im confused? | 21:33 |
strigazi | why does it need an api restart? | 21:33 |
flwang | because based on testing, you have to restart k8s api server after you got the service URL of keystone auth service | 21:33 |
flwang | if it's deployed as DS | 21:33 |
imdigitaljim | oh yeah i havent needed to at all | 21:33 |
strigazi | flwang: when I tried I didn't restart the k8s-api | 21:33 |
strigazi | flwang you can use host-network | 21:33 |
imdigitaljim | ^ | 21:34 |
imdigitaljim | i dont know if you're using that | 21:34 |
flwang | then I really want to know how you did that if you didn't deploy it on master | 21:34 |
flwang | master's kubelet | 21:34 |
imdigitaljim | but we are doing hostNetwork: true | 21:34 |
imdigitaljim | i do | 21:34 |
imdigitaljim | tolerations: | 21:34 |
imdigitaljim | - key: dedicated | 21:34 |
imdigitaljim | value: master | 21:34 |
imdigitaljim | effect: NoSchedule | 21:34 |
imdigitaljim | - key: CriticalAddonsOnly | 21:34 |
imdigitaljim | value: "True" | 21:34 |
imdigitaljim | effect: NoSchedule | 21:34 |
imdigitaljim | nodeSelector: | 21:34 |
imdigitaljim | node-role.kubernetes.io/master: "" | 21:34 |
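Assembled, that paste corresponds to a pod spec like the following (a sketch via the kubernetes Python client; the container name and image are hypothetical): hostNetwork plus tolerations for the master taints and a nodeSelector on the master role label.

```python
from kubernetes import client

# master-pinned DS pod spec as pasted above
pod_spec = client.V1PodSpec(
    host_network=True,
    node_selector={'node-role.kubernetes.io/master': ''},
    tolerations=[
        client.V1Toleration(key='dedicated', value='master',
                            effect='NoSchedule'),
        client.V1Toleration(key='CriticalAddonsOnly', value='True',
                            effect='NoSchedule'),
    ],
    containers=[client.V1Container(
        name='k8s-keystone-auth',
        image='example/k8s-keystone-auth:latest',  # hypothetical image
    )],
)
```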
imdigitaljim | we're putting most kube-system resources on masters | 21:35 |
flwang | so your k8s-keystone-auth service is running as DS and not running on master, right? | 21:35 |
imdigitaljim | not like dashboards and whatnot | 21:35 |
imdigitaljim | it's running as a DS on master | 21:35 |
imdigitaljim | and not on minion | 21:35 |
flwang | ah, right | 21:35 |
flwang | then yes, for that case, it's much easier and could avoid the api restart | 21:36 |
imdigitaljim | oh okay yeah | 21:36 |
flwang | so, should we just get the kubelet back on master? | 21:36 |
strigazi | it seems too reasonable :) | 21:36 |
strigazi | to not get it | 21:37 |
imdigitaljim | o/ ill be glad to add it back | 21:37 |
imdigitaljim | also strigazi: could we make the flannel parts also a software deployment | 21:37 |
imdigitaljim | are those necessary to be a part of cloud-init? | 21:37 |
strigazi | we could, and no they don't need to be | 21:37 |
imdigitaljim | i've noticed we're nearing the capacity on cloud-init data | 21:38 |
flwang | if we all agree 'kubelet' back on master, then it's easy | 21:38 |
imdigitaljim | yeah | 21:38 |
flwang | we just need to drop some 'if' check for calico | 21:38 |
strigazi | that should be enough | 21:39 |
imdigitaljim | how do you all feel about leveraging helm for prometheus/dashboard/etc | 21:39 |
imdigitaljim | instead of using our scripts going forward? | 21:39 |
imdigitaljim | helm charts and such are much cleaner/easier to maintain | 21:39 |
strigazi | we were discussing this today | 21:39 |
strigazi | yes, we could | 21:39 |
flwang | to be more clear, should kubelet on master in Rocky? | 21:40 |
strigazi | the question is | 21:40 |
strigazi | if we don't put it in Rocky, how many users will cherry-pick downstream | 21:40 |
imdigitaljim | i think it would be appropriate for rocky | 21:40 |
imdigitaljim | stein is a bit far out for such a critical change | 21:41 |
flwang | Stein will be a very long release | 21:41 |
flwang | the longest so far IIRC | 21:41 |
strigazi | yes it will | 21:42 |
flwang | I think the risk is low and the benefit is big | 21:42 |
strigazi | +1 | 21:42 |
flwang | ok, then I will propose a patch this week | 21:43 |
flwang | I'm glad to see Magnum team is so productive | 21:43 |
imdigitaljim | \o/ | 21:43 |
strigazi | :) | 21:43 |
colin- | yeah i think that's worthwhile, it will provide a lot of benefit for consumers | 21:44 |
strigazi | The only area where we haven't pushed a lot is the stable -1 branch | 21:44 |
strigazi | usually stable and master are in very good shape and are up to date | 21:45 |
strigazi | but current-stable -1 is a little behind | 21:45 |
strigazi | I don't know if we can put a lot of effort into old branches | 21:46 |
strigazi | the ones of us that are present here run stable + patches | 21:47 |
strigazi | since we will push some more patches in rocky should we give it a timeline of two or three weeks? | 21:49 |
strigazi | the branch is cut, packagers will have everything in place, we can do as many releases as we want with non-breaking changes like these | 21:50 |
strigazi | makes sense? | 21:50 |
*** itlinux has quit IRC | 21:51 | |
strigazi | imdigitaljim: flwang colin- canori02 ^^ | 21:51 |
canori02 | Makes sense | 21:51 |
colin- | yeah that seems reasonable | 21:52 |
imdigitaljim | yeah | 21:52 |
imdigitaljim | that sounds great | 21:52 |
strigazi | imdigitaljim: colin- you use rpms? containers? ansible? | 21:52 |
imdigitaljim | for magnum? | 21:53 |
strigazi | flwang: canori02 you? | 21:53 |
strigazi | imdigitaljim: yes | 21:53 |
imdigitaljim | containers | 21:53 |
strigazi | kolla? | 21:53 |
imdigitaljim | puppet + containers | 21:54 |
canori02 | ansible here | 21:54 |
strigazi | interesting we have a spectrum | 21:54 |
strigazi | we use puppet + rpms | 21:54 |
flwang | sorry, was in standup meeting | 21:55 |
flwang | i'm reading the log | 21:55 |
strigazi | but we have a large koji infra for rpms | 21:55 |
flwang | we're using puppet+debian pkg | 21:56 |
strigazi | you deploy on debian sid? | 21:56 |
strigazi | it is stretch now | 21:57 |
strigazi | anything else folks? | 21:59 |
strigazi | see you next week or just around | 21:59 |
strigazi | #endmeeting | 22:00 |
*** openstack changes topic to "OpenStack Containers Team" | 22:00 | |
openstack | Meeting ended Tue Aug 21 22:00:06 2018 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 22:00 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/containers/2018/containers.2018-08-21-21.01.html | 22:00 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/containers/2018/containers.2018-08-21-21.01.txt | 22:00 |
openstack | Log: http://eavesdrop.openstack.org/meetings/containers/2018/containers.2018-08-21-21.01.log.html | 22:00 |
strigazi | canori02: For your coreos patch, it still doesn't work for me. Is it working with master? | 22:00 |
flwang | strigazi: thanks for hosting | 22:01 |
flwang | strigazi: what's our strategy for coreOS driver? | 22:02 |
flwang | should we try to do more catch up for that driver? | 22:02 |
openstackgerrit | Spyros Trigazis proposed openstack/magnum master: WIP: Add cluster upgrade to the API https://review.openstack.org/514959 | 22:03 |
canori02 | strigazi: I think I had it on 17.0.2. But I'll bring it up to master and fix accordingly. Was it just when providing the custom ca that it didn't work for you? | 22:03 |
strigazi | flwang: you are welcome. yes we should, we should invest in ignition | 22:04 |
strigazi | canori02: well the base64 way also needs decoding on the driver side | 22:05 |
strigazi | canori02: yes, the make-cert part is not working | 22:05 |
flwang | strigazi: is ignition the replacement for cloud-init on coreos? | 22:05 |
strigazi | flwang: yes, kind of | 22:05 |
flwang | got | 22:05 |
strigazi | ignition runs before the OS boots | 22:05 |
flwang | strigazi: thanks for the upgrade api patch ;) | 22:06 |
strigazi | but it can write config files | 22:06 |
canori02 | It's the replacement for coreos-cloudconfig | 22:06 |
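For illustration, a minimal Ignition config writing one file at first boot (expressed here as the Python dict it would be serialized from; the path and contents are placeholders, and the schema shown is Ignition spec 2.x):

```python
import json

# minimal Ignition v2 config: write a single file before the OS boots
ignition_config = {
    'ignition': {'version': '2.2.0'},
    'storage': {'files': [{
        'filesystem': 'root',
        'path': '/etc/kubernetes/ca.crt',          # hypothetical target
        'mode': 0o644,
        'contents': {'source': 'data:;base64,' + 'PEM_B64_HERE'},
    }]},
}
print(json.dumps(ignition_config))
```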
strigazi | flwang: it needs one more patch for separating the parameters for master and minion | 22:07 |
strigazi | canori02: maybe we can escape the \ in \n | 22:07 |
flwang | strigazi: when that patch will be pushed? | 22:07 |
strigazi | this or we encode in base64 and decode in the nodes | 22:07 |
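A sketch of that base64 round-trip (placeholder PEM content): the driver encodes the certificate so templating can't mangle the newlines, and the node decodes it back.

```python
import base64

ca_pem = "-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----\n"  # placeholder
# driver side: newline-safe value to pass through heat parameters
ca_b64 = base64.b64encode(ca_pem.encode()).decode()
# node side reverses it, e.g. `echo "$CA_B64" | base64 -d > ca.crt`
restored = base64.b64decode(ca_b64).decode()
assert restored == ca_pem
```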
strigazi | I'm trying to rebase | 22:07 |
flwang | strigazi: great, sorry for pushing | 22:09 |
strigazi | flwang: thanks for pushing! | 22:09 |
*** rcernin has joined #openstack-containers | 22:10 | |
flwang | strigazi: haha | 22:11 |
flwang | we're keen for that feature, so..... | 22:11 |
flwang | have a good night, i'm all good for today | 22:12 |
colin- | ttyl | 22:12 |
openstackgerrit | Spyros Trigazis proposed openstack/magnum master: k8s_atomic: Add upgrade node functionallity https://review.openstack.org/514960 | 22:18 |
strigazi | res | 22:19 |
flwang | imdigitaljim: still around? | 22:22 |
imdigitaljim | yeah | 22:22 |
flwang | imdigitaljim: when you rebase your tidy master patch, could you please consider that we will bring the kubelet back? | 22:23 |
imdigitaljim | yes absolutely | 22:23 |
flwang | to make all our life easier ;) | 22:23 |
imdigitaljim | i was planning to do so | 22:23 |
imdigitaljim | :] | 22:23 |
flwang | cool, i will add you as reviewer for the kubelet patch | 22:23 |
canori02 | strigazi: how can I pass a ca to a magnum cluster? I hadn't used that functionality before | 22:24 |
imdigitaljim | theres a few variables that exist for it | 22:24 |
imdigitaljim | if you mean a ca.crt | 22:25 |
strigazi | flwang: imdigitaljim I think it is cleaner to add kubelet first | 22:25 |
strigazi | canori02: the ca is passed already | 22:26 |
imdigitaljim | thats fine too | 22:26 |
imdigitaljim | i can make a cleanup pass after kubelet | 22:26 |
imdigitaljim | so flwang just do what you'd need and ill fix it up | 22:26 |
flwang | strigazi: yes, that's my plan | 22:28 |
flwang | imdigitaljim: awesome, thanks | 22:28 |
imdigitaljim | strigazi: what is the overall goal of this upgrade | 22:29 |
imdigitaljim | will you be upgrading api/scheduler/controller as well? | 22:30 |
strigazi | all components | 22:30 |
strigazi | that we have tags for | 22:30 |
imdigitaljim | awesome | 22:30 |
imdigitaljim | look forward to seeing it completed | 22:31 |
strigazi | :) | 22:32 |
*** hongbin has quit IRC | 22:44 | |
openstackgerrit | Merged openstack/magnum-ui master: Imported Translations from Zanata https://review.openstack.org/594054 | 23:40 |