*** flwang has joined #openstack-containers | 00:21 | |
flwang | brtknr: around? | 00:21 |
---|---|---|
*** ramishra has joined #openstack-containers | 01:18 | |
*** ramishra has quit IRC | 01:26 | |
openstackgerrit | Xinliang Liu proposed openstack/magnum master: Prevent scripts from exiting when there is no error https://review.opendev.org/700485 | 01:32 |
*** ramishra has joined #openstack-containers | 01:48 | |
openstackgerrit | Merged openstack/magnum stable/train: [bug] Fix regression when use_podman=false https://review.opendev.org/709782 | 02:26 |
openstackgerrit | Merged openstack/magnum stable/train: [k8s] Make metrics-server work without DNS https://review.opendev.org/709781 | 02:29 |
openstackgerrit | Merged openstack/magnum stable/train: Fix api-cert-manager=true blocking cluster creation https://review.opendev.org/709778 | 02:29 |
openstackgerrit | Merged openstack/magnum stable/train: k8s_coreos Set REQUESTS_CA for heat-agent https://review.opendev.org/709777 | 02:29 |
openstackgerrit | Merged openstack/magnum stable/train: Fix Field `health_status_reason[api]' cannot be None` https://review.opendev.org/709776 | 02:29 |
openstackgerrit | Merged openstack/magnum stable/train: Fix the load balancer description regex pattern for deleting cluster https://review.opendev.org/709775 | 02:29 |
*** iokiwi has quit IRC | 03:59 | |
*** iokiwi has joined #openstack-containers | 03:59 | |
*** udesale has joined #openstack-containers | 04:35 | |
*** ykarel|away is now known as ykarel | 05:08 | |
*** rcernin has quit IRC | 05:33 | |
*** rcernin has joined #openstack-containers | 05:33 | |
brtknr | flwang: hi | 05:53 |
*** rcernin has quit IRC | 06:24 | |
*** AJaeger has left #openstack-containers | 07:24 | |
*** ykarel is now known as ykarel|lunch | 07:35 | |
*** pcaruana has joined #openstack-containers | 07:42 | |
*** udesale has quit IRC | 08:31 | |
*** udesale has joined #openstack-containers | 08:32 | |
*** ykarel|lunch is now known as ykarel | 09:30 | |
cosmicsound | good day | 10:35 |
*** pcaruana has quit IRC | 10:41 | |
*** pcaruana has joined #openstack-containers | 10:55 | |
brtknr | cosmicsound: hi | 11:10 |
brtknr | strigazi: i don't understand the api versioning | 11:10 |
cosmicsound | hey brtknr | 11:11 |
brtknr | cosmicsound did you get anywhere? | 11:12 |
cosmicsound | brtknr , i got here: https://mdb.uhlhost.net/uploads/32e6d3c9ef5de88f/image.png | 11:19 |
cosmicsound | this is the last step, which eventually fails | 11:19 |
cosmicsound | grepped all kolla logs and searched for magnum | 11:19 |
cosmicsound | got this | 11:19 |
cosmicsound | http://paste.openstack.org/show/790049/ | 11:19 |
cosmicsound | this was on yesterday's deployment; today i start from scratch again | 11:20 |
cosmicsound | kube_deploy_cluster is the last step that fails, ending with a timeout; again not much info to debug on it | 11:20 |
cosmicsound | will get heat-agent logs from this new deployment | 11:23 |
cosmicsound | pity this is always empty: openstack software deployment output show --all --long | 11:24 |
cosmicsound | latest http://paste.openstack.org/show/790064/ cloud-init.log | 11:36 |
cosmicsound | latest http://paste.openstack.org/show/790065/ cloud-init-output.log | 11:37 |
cosmicsound | the network errors are weird, because the machines have local and public IPs | 11:41 |
*** ivve has joined #openstack-containers | 11:53 | |
*** udesale_ has joined #openstack-containers | 12:19 | |
*** udesale_ has quit IRC | 12:21 | |
*** udesale_ has joined #openstack-containers | 12:21 | |
*** udesale has quit IRC | 12:22 | |
*** mgariepy has joined #openstack-containers | 12:32 | |
*** mgariepy has quit IRC | 12:38 | |
*** alti_17 has joined #openstack-containers | 12:44 | |
*** iokiwi has quit IRC | 12:47 | |
*** iokiwi has joined #openstack-containers | 12:48 | |
alti_17 | Hello, FYI, for those who are using fedora coreos 31 stable (in our case 31.20200127): yesterday the Stable stream was updated to 31.20200223 https://getfedora.org/en/coreos/download/ . If you start cluster provisioning using the older image, it often fails with errors related to the SoftwareDeployment stage; looks like it's caused by coreos zincati | 12:52 |
alti_17 | https://docs.fedoraproject.org/en-US/fedora-coreos/auto-updates/ auto-updates feature. It starts an update and restart of the OS during heat script execution and interrupts it | 12:52 |
*** mgariepy has joined #openstack-containers | 12:52 | |
*** alti_17 has quit IRC | 12:53 | |
*** alti_17 has joined #openstack-containers | 12:54 | |
*** udesale_ has quit IRC | 13:05 | |
*** udesale_ has joined #openstack-containers | 13:05 | |
*** markguz_ has joined #openstack-containers | 13:06 | |
markguz_ | Hi, I'm trying to deploy a 1 master, 3 node cluster on fedora coreos using the latest version of magnum (git master) | 13:10 |
markguz_ | however consistently 1 out of 3 of the minions fails to be configured by heat. I see this error in the logs "Command failed, will not cache new data. Command 'os-refresh-config' died with <Signals.SIGTERM: 15>" | 13:11 |
markguz_ | and then the heat-container-agent dies | 13:11 |
markguz_ | i can't find any more detailed logging inside the vm.. anyone got any hints as to where to look? | 13:12 |
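For context, on the fedora coreos driver the heat agent runs as a podman container managed by systemd, so its output usually lands in the journal rather than a plain log file. A sketch of where one might look inside the vm (the `heat-container-agent` unit name is assumed from the fedora-coreos magnum driver and may differ per release):

```shell
# Agent output goes to the journal (unit name is an assumption, check with
# `systemctl list-units | grep heat` if it differs on your image).
sudo journalctl -u heat-container-agent --no-pager

# Per-script output, when the agent gets far enough to run anything, is
# typically written under /var/log/heat-config/.
sudo ls /var/log/heat-config/ 2>/dev/null
```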
brtknr | markguz_: try inside /var/log/heat-config/ | 13:12 |
markguz_ | brtknr: no such folder | 13:13 |
brtknr | markguz_: so heat container is failing for a different reason | 13:13 |
brtknr | markguz_: so when you have 1 master 1 worker its successful? | 13:13 |
brtknr | markguz_: which image version are you using? | 13:14 |
markguz_ | brtknr: i checked on the nodes that did configure and they don't have that folder either | 13:14 |
markguz_ | fedora-coreos-31.20200127.3.0 | 13:14 |
brtknr | markguz_: are you using heat container agent ussuri-dev | 13:14 |
markguz_ | brtknr: yes | 13:14 |
brtknr | markguz_: so when you have 1 master 1 worker its successful? | 13:15 |
markguz_ | brtknr: i will try that | 13:15 |
alti_17 | This is no longer the latest version | 13:16 |
alti_17 | fedora-coreos-31.20200127.3.0 | 13:16 |
markguz_ | alti_17: it was yesterday? | 13:16 |
markguz_ | that's when i downloaded it | 13:16 |
markguz_ | there was no newer version available | 13:16 |
markguz_ | did they release another? | 13:17 |
alti_17 | Yes, 1 Day ago new one was released | 13:17 |
alti_17 | 31.20200210.3.0 stable | 13:17 |
markguz_ | do y'all have a verified version that you know works? | 13:17 |
alti_17 | And it caused issues for me, because of https://docs.fedoraproject.org/en-US/fedora-coreos/auto-updates/ zincati auto updates.. it updates and restarts the vm during heat script execution, which causes errors for me. | 13:18 |
alti_17 | I'm testing right now; looks like the new version works, but I haven't verified it yet | 13:18 |
markguz_ | alti_17: YES! i saw those zincati things | 13:19 |
alti_17 | I think the magnum driver doesn't expect that the vm will be rebooted by zincati. It also causes issues for already provisioned clusters; we can't scale them now | 13:20 |
markguz_ | alti_17: does that mean coreos is broken for the moment? | 13:23 |
markguz_ | maybe zincati can be disabled in the user data before config starts.. | 13:25 |
alti_17 | I have just created 2 clusters in a row using the very latest version of coreos. I will start looking at how to disable/control coreos/zincati update behavior, but meanwhile maybe someone from the Magnum maintainers will share some thoughts about it; maybe there are some solutions which we missed | 13:29 |
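For the record, the fedora coreos auto-updates docs linked above describe a config drop-in that turns zincati off entirely; a minimal sketch of a file that could be shipped via ignition/user data before the heat scripts run (the drop-in filename is arbitrary, only the directory and keys matter):

```toml
# /etc/zincati/config.d/90-disable-auto-updates.toml
# Disables zincati-driven auto updates, and with them the mid-deploy reboots
# described in this discussion (per the fedora coreos auto-updates docs).
[updates]
enabled = false
```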
markguz_ | i'm just about to try it. was the zincati feature recently added to coreos? | 13:30 |
markguz_ | or perhaps using atomic 29 | 13:33 |
alti_17 | Just for history, this is the journalctl log from a node vm when you are not using the latest coreos. Reboot triggered. Master nodes are not rebooted. No nodes are rebooted if you use the latest coreos | 13:36 |
alti_17 | Feb 27 11:03:56 alti17-2xw75kfskmyd-node-0 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-tmpfiles-se> | 13:36 |
alti_17 | Persistent Storage... | 13:36 |
alti_17 | BOOT_IMAGE=(hd0,gpt1)/ostree/fedora-coreos-19190477fad0e60d605a623b86e06bb92aa318b6b79f78696b06f68f262ad5d6/vmlinuz-5.4.17-> | 13:36 |
*** lpetrut has joined #openstack-containers | 13:36 | |
markguz_ | hmm. ok i got the image uploaded just about to try spinning a cluster | 13:36 |
markguz_ | i hope they disabled auto rebooting.. that seems like a dumb thing to do with something like coreos. | 13:38 |
markguz_ | it's a dumb thing to do on anything that is designed to run services... | 13:39 |
alti_17 | This is, by the way, the coreos "killer feature" https://getfedora.org/en/coreos/ "Fedora CoreOS is an automatically-updating"; maybe we just don't know how to operate it properly | 13:41 |
*** waverider has joined #openstack-containers | 13:44 | |
markguz_ | auto updating... maybe.. auto rebooting.. definitely not | 13:49 |
*** alti_17 has quit IRC | 13:50 | |
*** alti_17 has joined #openstack-containers | 13:57 | |
*** pcaruana has quit IRC | 13:59 | |
*** alti_17 has quit IRC | 14:07 | |
*** alti_17 has joined #openstack-containers | 14:07 | |
*** ykarel is now known as ykarel|away | 14:08 | |
*** mgariepy has quit IRC | 14:14 | |
*** mgariepy has joined #openstack-containers | 14:55 | |
*** udesale_ has quit IRC | 14:57 | |
*** pcaruana has joined #openstack-containers | 15:01 | |
*** alti_17 has quit IRC | 16:08 | |
*** markguz_ has quit IRC | 16:31 | |
*** waverider has quit IRC | 16:40 | |
*** lpetrut has quit IRC | 16:40 | |
*** jmlowe has quit IRC | 16:54 | |
cosmicsound | what's the cli you guys use to upload the latest fedora-coreos image? | 16:58 |
cosmicsound | whenever i build it, it is not visible in magnum at template creation | 16:59 |
*** alti_17 has joined #openstack-containers | 17:04 | |
alti_17 | cosmicsound: something like this: openstack image create \ fedora-coreos-31.20200127.3.0 | 17:05 |
alti_17 | You might have missed os_distro or the public key | 17:06 |
cosmicsound | what is the right os_distro? | 17:06 |
cosmicsound | fedora-coreos? | 17:07 |
cosmicsound | do i also need to specify a default username, like on fedora-atomic? | 17:07 |
cosmicsound | i used this: openstack image create "fedora-coreos-31.20200127.3.0" --file fedora-coreos-latest.qcow2 --disk-format qcow2 --container-format=bare --min-disk 10 --min-ram 4096 --public --protected --property hw_scsi_model=virtio-scsi --property hw_disk_bus=scsi --property hw_qemu_guest_agent=yes --property os_distro=fedora-coreos --property os_admin_user=fedora --property os_version="31.20200127.3.0" | 17:08 |
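As background for this exchange: magnum picks its driver by matching the image's `os_distro` property, so that property has to be exactly `fedora-coreos` for the fedora-coreos driver. A minimal upload sketch with only the property magnum needs (the filename and image name are placeholders, and running it needs working cloud credentials):

```shell
# Minimal glance upload for a magnum fedora-coreos image; extra hw_* properties
# from the command above are optional tuning, not required for visibility.
openstack image create "fedora-coreos-31.20200223.3.0" \
  --file fedora-coreos-latest.qcow2 \
  --disk-format qcow2 \
  --container-format bare \
  --public \
  --property os_distro=fedora-coreos
```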
cosmicsound | https://mdb.uhlhost.net/uploads/330b518a122d9356/image.png in the end is only the fedora atomic visible | 17:09 |
alti_17 | Which version of openstack/magnum do you have? | 17:11 |
cosmicsound | train/2.5.0 | 17:11 |
cosmicsound | sorry | 17:13 |
cosmicsound | 2.17.0 | 17:13 |
cosmicsound | I upgraded from 2.15 | 17:13 |
cosmicsound | openstack 5.0.0 | 17:14 |
cosmicsound | Train release | 17:14 |
alti_17 | Sorry, a bit confused; magnum train has a 9.* version https://docs.openstack.org/releasenotes/magnum/train.html | 17:17 |
cosmicsound | that's the cli version, my bad | 17:17 |
alti_17 | starting from the 9.1.0 tag, fedora-coreos is supported | 17:18 |
cosmicsound | hmm, i am also confused about how to find the real magnum version | 17:18 |
cosmicsound | :D not the cli version | 17:18 |
*** alti_17 has quit IRC | 17:44 | |
cosmicsound | magnum-conductor --version | 17:51 |
cosmicsound | 9.2.0 | 17:51 |
cosmicsound | this is the version i run | 17:51 |
cosmicsound | still my image is not visible | 17:51 |
cosmicsound | i'll try to go to the latest image version | 17:52 |
*** jmlowe has joined #openstack-containers | 18:01 | |
*** jmlowe has quit IRC | 18:05 | |
*** jmlowe has joined #openstack-containers | 18:05 | |
*** jmlowe has quit IRC | 18:27 | |
*** jmlowe has joined #openstack-containers | 18:30 | |
*** jmlowe has quit IRC | 18:34 | |
*** jmlowe has joined #openstack-containers | 18:35 | |
*** jmlowe has quit IRC | 18:36 | |
*** jmlowe has joined #openstack-containers | 18:56 | |
*** jmlowe has quit IRC | 18:59 | |
-openstackstatus- NOTICE: Memory pressure on zuul.opendev.org is causing connection timeouts resulting in POST_FAILURE and RETRY_LIMIT results for some jobs since around 06:00 UTC today; we will be restarting the scheduler shortly to relieve the problem, and will follow up with another notice once running changes are reenqueued. | 19:10 | |
*** jmlowe has joined #openstack-containers | 19:31 | |
*** pcaruana has quit IRC | 19:38 | |
-openstackstatus- NOTICE: The scheduler for zuul.opendev.org has been restarted; any changes which were in queues at the time of the restart have been reenqueued automatically, but any changes whose jobs failed with a RETRY_LIMIT, POST_FAILURE or NODE_FAILURE build result in the past 14 hours should be manually rechecked for fresh results | 19:44 | |
*** jmlowe has quit IRC | 20:07 | |
*** jmlowe has joined #openstack-containers | 20:30 | |
*** rcernin has joined #openstack-containers | 21:44 | |
*** ivve has quit IRC | 22:57 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!