Friday, 2023-06-09

opendevreview	renliang proposed openstack/kolla master: Fixed an issue with zun-cin-daemon building images in aarch64. https://review.opendev.org/c/openstack/kolla/+/885729	08:34
basileus	Hi, hope everyone is doing great! After a bit of help from this IRC I come back with one very simple question: is there any resource / tutorial to fully dismantle a kolla-ansible environment by any chance? I want to retry installing openstack correctly this time with Veth and Vbridges, but I was wondering if there was a .sh script to uninstall / revert all changes? Thanks in advance!	08:52
basileus	For information I deployed it through virtual environment, don't know if that would impact anything regarding dismantling the environment	08:55
mmalchuk	basileus kolla-ansible has a few, take a look in the tools/ directory	08:58
mmalchuk	there you can find the cleanup-containers, cleanup-host and cleanup-images scripts	08:58
mmalchuk	maybe they will help you	08:59
basileus	I saw these scripts in there ! I will skim through them and see what is achievable, I wanted to reach a "semi" clean slate without having to reinstall the entire OS, thanks !	09:00
mmalchuk	but if you want a tool which can really help you - try the kayobe project	09:00
mmalchuk	try kayobe (https://doc.openstack.org/kayobe)	09:02
mmalchuk	https://docs.openstack.org/kayobe/latest/	09:02
mmalchuk	you can do:	09:02
mmalchuk	kayobe overcloud service deploy	09:03
mmalchuk	then	09:03
mmalchuk	kayobe overcloud service destroy	09:03
mmalchuk	and repeat it)	09:03
basileus	Oh that seems... neater than kolla	09:03
mmalchuk	kayobe uses kolla-ansible jfyi	09:03
mmalchuk	kayobe in this case some kind of wrapper with cli	09:04
basileus	I'm assuming, just like my previous install I'll need 2 Veth including an empty one for Neutron?	09:04
mmalchuk	don't remember. and with kayobe you can do any experiment you want and repeat deploy and redeploy	09:05
basileus	awesome ! Let me skim through the cleanup script and try to run it in a virtual environment ! Thanks a lot	09:05
mmalchuk	also, you can start to learn Kayobe from this: https://github.com/stackhpc/a-universe-from-nothing	09:06
mmalchuk	then dig into the documentation and etc.	09:07
basileus	Ay Ay captain, just had a question, how come some projects use CentOS while other more or less don't recommend it whatsoever?	09:08
mmalchuk	I choose Ubuntu. it is more stable for me. and this is my choice only.	09:09
mmalchuk	If you prefer RedHat based distros - have a look at RockyLinux	09:10
mmalchuk	it is well supported by Kolla/Kolla-ansible/Kayobe	09:10
mmalchuk	both	09:10
mmalchuk	Ubuntu and RockyLinux	09:10
basileus	Yeah I originally went for CentOS and quickly swapped back to ubuntu at that point, too many issues during install	09:11
mmalchuk	jfyi: https://docs.openstack.org/kolla-ansible/latest/user/support-matrix.html	09:11
basileus	Thank you so much ! Time to pull my hair out a bit !	09:13
hrw	basileus: "kolla-ansible destroy"	09:53
frickler	but I also strongly suggest to do a fresh OS deploy for a re-installation. you'll want to automate that part anyway to reduce work and error rates	09:56
basileus	I see, and what would be the best OS version for that ? Would 20.04 Ubuntu server LTS be best or should I stick to 22.04?	09:58
frickler	22.04 is required for latest openstack, why start with something old?	10:06
mmalchuk	frickler did you fix your internet?	10:18
frickler	good enough to do IRC at least	10:48
mmalchuk	may be able to merge docs too?	10:49
frickler	maybe, but I also dont see the urgency	11:52
mmalchuk	only because of no urgency and no rush today I ask... next week there can be problems and urgency to fix other issues maybe	12:00
mmalchuk	but if you're busy, then ok	12:01
spatel	I am running kolla-build and it starts building images but somehow it randomly gets stuck somewhere and doesn't move further.. the only option left is ctrl+c	14:00
spatel	How does it work with CI jobs? Because of that I am building images one by one.. instead of building all in a single shot	14:00
spatel	For example, this has been stuck since last night - https://paste.opendev.org/show/bf1rRhg9V3BCy4KmHPRg/	14:01
mmalchuk	stuck - is not an error	14:14
mmalchuk	what is the problem?	14:14
opendevreview	Juan Pablo Suazo proposed openstack/kolla-ansible master: Configures the tap-as-a-service neutron plugin https://review.opendev.org/c/openstack/kolla-ansible/+/885417	14:14
spatel	mmalchuk I have no idea what is the problem because its just not moving	14:39
spatel	If i ctrl+c and do again then in second run it works	14:39
spatel	I have 100G internet link so i don't think its internet related issue :)	14:39
mmalchuk	you can enter the intermediate container and execute the last command interactively	14:40
mmalchuk	and maybe you'll see an error	14:40
spatel	Hmm that is a good way to test.. How do I find the intermediate container? I believe it must show up in the docker ps output	14:41
mmalchuk	in the log you provided you can find the line about it	14:42
mmalchuk	but you showed only part of it... there is more about keystone-ssh, but we need the nova-compute container logs	14:43
spatel	Oh something like this - Removing intermediate container 7cddcee1dc66	14:43
mmalchuk	yep	14:43
mmalchuk	and lines with an arrow	14:43
spatel	I see, you are saying find intermediate container ID and get into container with exec -it mode and run last command by hand etc..	14:43
spatel	---> fb570654e598	14:43
mmalchuk	yep	14:44
spatel	Perfect! good to know that	14:44
mmalchuk	this is a successfully created layer (intermediate container)	14:44
spatel	May be nova-compute container got stuck..	14:45
spatel	I realized building image one-by-one is better way to see things..	14:46
mmalchuk	if you show the tail of the log - yes nova-compute	14:46
spatel	I have built all images with tag 2023.1 but now when I am running deploy I get an error saying - docker-reg:4000/kolla/fluentd:2023.1-ubuntu-jammy not found	14:47
spatel	do i need to use tag 2023.1-ubuntu-jammy ?	14:47
mmalchuk	you can always control how the build runs with config or command-line options	14:47
mmalchuk	there is one useful option for you - threads	14:48
mmalchuk	set it to 1, but you also need to control images already built, for example - skip_existing, or even control retries - retries	14:49
mmalchuk	also, to be more verbose, there is good reason to enable debug - debug	14:50
spatel	mmalchuk I see - The number of threads to use while building. (Note: setting to one will allow real time logging)	14:50
mmalchuk	yep	14:50
spatel	I always use --debug	14:50
mmalchuk	threads = 1, retries = 0, skip_existing = True and debug = True - my choice for troubleshooting)	14:51
spatel	+1	14:51
mmalchuk	also format = none to remove unneeded info in the tail of the logfile	14:52
mmalchuk	format = none	14:52
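The troubleshooting settings listed above (threads, retries, skip_existing, debug, format) could be collected in a kolla-build.conf fragment. A minimal sketch in Python, assuming these options belong in the [DEFAULT] section of kolla-build.conf; the function name troubleshooting_build_conf is hypothetical and only the five options named in the chat are written:

```python
import configparser
import io


def troubleshooting_build_conf() -> str:
    """Render a kolla-build.conf fragment with the debug-friendly options from the chat.

    Assumption: the options live in the [DEFAULT] section.
    """
    cfg = configparser.ConfigParser()
    cfg["DEFAULT"] = {
        "threads": "1",           # one thread, so log output appears in real time
        "retries": "0",           # do not retry a failed image build
        "skip_existing": "True",  # do not rebuild images that already exist
        "debug": "True",          # more verbose output
        "format": "none",         # no summary dump at the tail of the log
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Print the fragment; redirect it into a file to use it with kolla-build.
    print(troubleshooting_build_conf())
```

As noted later in the conversation, these settings make the build slower and are meant for debugging rather than regular builds.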
spatel	Why don't we document these option for best practice suggestion ?	14:53
spatel	Its handy for people like me :)	14:54
mmalchuk	also, as I can see, you have some problems with ubuntu repos (mirrors) - last two lines with different hosts for one file	14:54
spatel	Yes, its always stuck on some random mirror fetch	14:55
mmalchuk	it tries to download file number 243, but does not succeed from mirrors.cmich.edu, then tries from mirrors.advancedhosters.com	14:55
mmalchuk	this is very strange	14:55
mmalchuk	http://mirrors.cmich.edu/ is online and has an ubuntu repo	14:56
mmalchuk	maybe you have network issues?	14:56
mmalchuk	it's a good idea to use a local, reachable ubuntu mirror	14:57
spatel	maybe a regional issue, I am on the US east coast so not sure if something is going on there	14:57
spatel	+1 Yes.. I like that idea	14:57
spatel	This is my kolla-build.conf file - https://paste.opendev.org/show/bnI9OFmryOaaFbw3ugXN/	14:58
mmalchuk	you can even run your own mirror, it only takes about 2 TB)	14:58
mmalchuk	or use a caching proxy to work around slow network issues	14:58
spatel	haha! Not a bad idea, in that case how do i inject local mirror during build?	14:59
mmalchuk	about the documentation, you can always run kolla-build --help	14:59
spatel	I will try to poke and figure out..	14:59
spatel	We don't build images everyday but again its good to have it local.	15:00
mmalchuk	you can always use the override mechanisms - described in the documentation	15:00
spatel	I am getting this error during deploy command - docker-reg:4000/kolla/fluentd:2023.1-ubuntu-jammy not found	15:01
spatel	my tag is 2023.1 so where this -ubuntu-jammy extra suffix coming from?	15:01
spatel	In global.yml - openstack_release: "2023.1"	15:02
mmalchuk	# Docker image tag used by default.	15:05
mmalchuk	openstack_tag: "{{ openstack_release }}-{{ kolla_base_distro }}-{{ kolla_base_distro_version }}{{ openstack_tag_suffix }}"	15:05
mmalchuk	kolla-ansible/ansible/group_vars/all.yml	15:05
spatel	should I use openstack_tag: 2023.1 ?	15:08
mmalchuk	reasonable if you build images with this tag. but this is not default behaviour	15:10
spatel	Got it :)	15:12
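The extra -ubuntu-jammy suffix follows directly from the openstack_tag default quoted above from ansible/group_vars/all.yml. A small Python sketch of the same composition, using the values from this conversation; the function names are hypothetical and an empty openstack_tag_suffix is assumed:

```python
def compose_openstack_tag(release, base_distro, base_distro_version, tag_suffix=""):
    """Mirror the default openstack_tag template quoted from group_vars/all.yml."""
    return f"{release}-{base_distro}-{base_distro_version}{tag_suffix}"


def image_reference(registry, namespace, image, tag):
    """Build the full image reference that deploy tries to pull."""
    return f"{registry}/{namespace}/{image}:{tag}"


if __name__ == "__main__":
    # Default behaviour: the tag is derived from release, distro and distro version.
    default_tag = compose_openstack_tag("2023.1", "ubuntu", "jammy")
    print(image_reference("docker-reg:4000", "kolla", "fluentd", default_tag))

    # With openstack_tag overridden to the tag the images were actually built with.
    print(image_reference("docker-reg:4000", "kolla", "fluentd", "2023.1"))
```

The first reference matches the "not found" error; the second is what deploy would look for if openstack_tag is overridden to "2023.1" in globals.yml.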
*** hrww is now known as hrw		15:55
spatel	mmalchuk is this looks ok to you? - https://paste.opendev.org/show/bQyDbzxqA7ZY6yiGsm5l/	16:48
spatel	build.py version is 15.1.1 and kolla version 16.x.x	16:48
mmalchuk	not really, better to have the binary at the same major version as the code, or newer	16:50
spatel	mmalchuk I did a checkout of the 16.x.x tag, doesn't it include the ./build.py binary?	17:11
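The rule of thumb above (binary at the same major version as the source, or newer) can be written as a quick check. A minimal sketch in Python; the helper names are hypothetical, plain dotted version strings are assumed, and 16.0.0 stands in for the 16.x.x checkout:

```python
def major(version: str) -> int:
    """Return the major component of a dotted version string such as '15.1.1'."""
    return int(version.split(".")[0])


def binary_ok_for_source(binary_version: str, source_version: str) -> bool:
    """True when the binary's major version is the same as, or newer than, the source's."""
    return major(binary_version) >= major(source_version)


if __name__ == "__main__":
    # The mismatch reported in the chat: 15.1.1 binary against a 16.x.x checkout.
    print(binary_ok_for_source("15.1.1", "16.0.0"))
    # Binary installed from the same checkout.
    print(binary_ok_for_source("16.0.0", "16.0.0"))
```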
mmalchuk	don't know what you did)	17:11
mmalchuk	to get the binary you should use pip	17:11
mmalchuk	even if you build from the source	17:12
mmalchuk	https://docs.openstack.org/kolla/latest/admin/image-building.html	17:13
mmalchuk	did you see build.py usage here?	17:13
spatel	That is pip way to install kolla, I did git clone https:// way	17:13
mmalchuk	no. pip installs the binary into a system dir or into a virtualenv (better)	17:14
spatel	https://paste.opendev.org/show/bXLJAyII51oWGqdZsKLc/	17:15
spatel	you are suggesting to do python3 -m pip install kolla==16.0.0 ?	17:17
mmalchuk	ok. what next?	17:17
frickler	do you have an mtu < 1500 on your build host? that could affect downloads in docker containers unless you tell docker to use the lower mtu for networking, too	17:17
mmalchuk	it depends. but yes. pip install	17:18
mmalchuk	if you plan to install from pypi - pip install kolla==<version> (version is optional)	17:19
mmalchuk	if you plan to build from source - pip install path_to_source/	17:19
mmalchuk	but create the virtualenv first (preferred way)	17:20
spatel	I was reading this doc, look at top section - https://hlyani.github.io/notes/openstack/kolla_image_build.html	17:20
spatel	How did they install kolla?	17:20
spatel	I did same way and my binary versions are different as i show you earlier	17:21
spatel	https://paste.opendev.org/show/bXLJAyII51oWGqdZsKLc/	17:21
mmalchuk	japanese? queens? this is outdated and in most places incorrect documentation	17:22
spatel	That is example.. I know its old	17:22
spatel	Just trying to use same method for new release	17:22
mmalchuk	don't know how they (japanese?) do anything. this is not official documentation!	17:23
mmalchuk	please don't do it this way	17:23
spatel	Ok.. let me try python3 -m pip install kolla==16.0.0	17:23
mmalchuk	what you need:	17:23
mmalchuk	1. source dir - git clone ...	17:24
mmalchuk	2. virtualenv dir: python3 -m venv .... or virtualenv .... depends on OS	17:24
mmalchuk	3. pip install path_to_source/	17:25
mmalchuk	4. use kolla-build binary from the virtualenv path	17:25
mmalchuk	thats all	17:25
spatel	what pip install path_to_source would be?	17:26
mmalchuk	pip - python installer, install - an option, path_to_source - path to kolla source code	17:27
mmalchuk	if you do 'cd kolla' after 'git clone' - then use 'pip install .' for example	17:28
spatel	ohh!	17:29
spatel	Let me try	17:29
mmalchuk	have you created a virtualenv already?	17:30
spatel	Yes, I did (I use kolla-ansible to run from venv)	17:30
mmalchuk	cool. then proceed with 'pip install .'	17:30
spatel	I did install kolla on multiple place on production but never use own images. I always pull images from public repo and push them to local mirror	17:30
spatel	This time thinking to use own images to run kolla and that is where I am playing right now	17:31
spatel	I am planning to install kolla on 600 node cluster so better use own images.	17:32
spatel	Hope kolla support on that scale	17:32
mmalchuk	imho it's a bad idea to use images from the internet in production. the better way is to build your own and control everything.	17:32
spatel	mmalchuk 100% with you.. In past I deployed 10 to 20 node cluster and it was small environment so I didn't bother to build images.	17:33
spatel	This time its very large scale deployment so make it right	17:33
mmalchuk	no matter of size... build takes several minutes	17:34
spatel	Yep	17:34
spatel	My plan is to use 3x node for rabbit/DB and 3x for api to support 600 compute	17:35
mmalchuk	what reason to split bus/db with api ?	17:36
spatel	Just to have dedicated CPU/memory for rabbitMQ and DB	17:36
spatel	Putting everything on 3 node would be too much work	17:37
spatel	DB doesn't take lots of CPU/memory but rabbitMQ crush thing very fast	17:37
mmalchuk	by probability theory, one of 6 nodes will fail sooner than one of 3)	17:38
spatel	anything is possible when it comes to fail	17:39
mmalchuk	also in your case there will be big latency from api to backend because of network	17:39
spatel	with 600 nodes the control plane will be very chatty	17:39
spatel	I was thinking to use virtual machine for API layer to reduce locks or bottleneck	17:40
mmalchuk	the virtualisation layer adds even more latency	17:40
spatel	I meant multiple virtual VM for api instead of just 3	17:40
spatel	horizontal scale	17:41
mmalchuk	don't reinvent the bicycle)	17:41
mmalchuk	please read the РФ пгшву	17:41
mmalchuk	HA guide	17:41
spatel	I know what you saying but sometime its not about CPU power but number of request they handle	17:42
spatel	РФ пгшву ?	17:42
spatel	Do you have link ?	17:42
mmalchuk	my keyboard switches... sorry. I say 'please read the HA guide'	17:43
spatel	haha :)	17:43
mmalchuk	https://docs.openstack.org/ha-guide/	17:43
spatel	I did see that, and many other openstack summit scaling videos, to learn how to scale	17:45
spatel	I am running 3 large cloud already in production with openstack-ansible each cloud has 300 compute nodes with 3x controller nodes.	17:46
mmalchuk	ok then) let's get back to kolla	17:46
spatel	This is the first time I am pushing to do 600 to 800 compute nodes on a single control plane	17:46
spatel	using kolla	17:46
spatel	I am just little worried to use 3x node that is why decided to give dedicated nodes to rabbitmq	17:47
mmalchuk	how is your 'pip install .' ?	17:47
spatel	Its all good :) and images started building	17:47
mmalchuk	kolla-build --version ?	17:48
mmalchuk	same as checkout?)	17:48
spatel	its showing 16.0.0 :)	17:48
spatel	Yesss	17:48
mmalchuk	cool	17:48
mmalchuk	I'm proud of you)	17:48
spatel	I am using your options --threads 1 --skip-existing --cache --format none etc.. and not a single image has got stuck yet :)	17:49
spatel	You should be proud :)	17:49
mmalchuk	nota bene - this way it takes much more time to build	17:49
mmalchuk	this way is only for debug	17:49
spatel	I like slow and steady vs troubleshooting	17:50
mmalchuk	nice	17:50
spatel	can you share your kolla-build.conf file if possible.. I am curious to match with my one	17:51
mmalchuk	also with skip_existing and cache - in case of temporary failure and repeated build you also need more space for docker	17:51
spatel	If its safe to share	17:51
spatel	This time i use cache just to save download time during failure	17:52
mmalchuk	https://paste.openstack.org/show/bNsIeRFLYvj5UQlPJGce/	17:54
mmalchuk	note - this is template only. because I use Kayobe and full config created from this template dynamically	17:54
mmalchuk	and this is not from production, but from my dev lab	17:56
spatel	oh okay	17:56
spatel	is this your local image? registry-openstack.cloud.local/openstack/infra/ubuntu	17:56
mmalchuk	yep. the local docker registry	17:57
mmalchuk	with the fixed version of ubuntu from the internet.	17:57
mmalchuk	even if they have tag 20.04	17:58
mmalchuk	it updated from the internet when needed	17:58
spatel	I know what you saying because if you build image after few days then may be its 20.04.100	17:59
mmalchuk	lol)	17:59
spatel	I should push ubuntu image to local repo then	17:59
spatel	You don't put tag in build file?	18:00
mmalchuk	yep)	18:00
mmalchuk	no	18:00
mmalchuk	because it is common. the uncommon build options are passed on the command line when the build is started	18:01
mmalchuk	I can build xena or zed, depending on need	18:02
spatel	Hmmm	18:02
mmalchuk	also I can build only to test and not push	18:02
mmalchuk	so, some options are in the config and some are passed externally	18:02
mmalchuk	devops way)	18:03
spatel	We have artifactory so plan is to push images there	18:13
mmalchuk	not cheap	18:20
mmalchuk	why not opensource solution?	18:21
spatel	We are big company and we already have all those tools in production :)	18:49
mmalchuk	so why not the VMWare?)	18:55
spatel	we are big but not that big to use vmware. Our all tooling already working for openstack so why go for VMware? We have multiple business and each business has different budget to run.	19:02
opendevreview	Juan Pablo Suazo proposed openstack/kolla-ansible master: Adds support for Huawei backends https://review.opendev.org/c/openstack/kolla-ansible/+/869252	20:28
