opendevreview | renliang proposed openstack/kolla master: Fixed an issue with zun-cni-daemon building images in aarch64. https://review.opendev.org/c/openstack/kolla/+/885729 | 08:34 |
---|---|---|
basileus | Hi, hope everyone is doing great! After a bit of help from this IRC I'm back with one very simple question: is there any resource / tutorial for fully dismantling a kolla-ansible environment, by any chance? I want to try installing openstack again, correctly this time, with Veth and Vbridges, and I was wondering if there is a .sh script to uninstall / revert all changes? Thanks in advance! | 08:52 |
basileus | For information, I deployed it through a virtual environment; I don't know if that affects dismantling the environment | 08:55 |
mmalchuk | basileus kolla-ansible has a few, take a look in the tools/ directory | 08:58 |
mmalchuk | there you can find the cleanup-containers, cleanup-host and cleanup-images scripts | 08:58 |
mmalchuk | maybe they will help you | 08:59 |
basileus | I saw these scripts in there ! I will skim through them and see what is achievable, I wanted to reach a "semi" clean slate without having to reinstall the entire OS, thanks ! | 09:00 |
mmalchuk | but if you want a tool which can really help you - try the kayobe project | 09:00 |
mmalchuk | try kayobe (https://doc.openstack.org/kayobe) | 09:02 |
mmalchuk | https://docs.openstack.org/kayobe/latest/ | 09:02 |
mmalchuk | you can do: | 09:02 |
mmalchuk | kayobe overcloud service deploy | 09:03 |
mmalchuk | then | 09:03 |
mmalchuk | kayobe overcloud service destroy | 09:03 |
mmalchuk | and repeat it) | 09:03 |
basileus | Oh that seems... neater than kolla | 09:03 |
mmalchuk | kayobe uses kolla-ansible jfyi | 09:03 |
mmalchuk | kayobe in this case some kind of wrapper with cli | 09:04 |
basileus | I'm assuming, just like my previous install I'll need 2 Veth including an empty one for Neutron? | 09:04 |
mmalchuk | don't remember. and with kayobe you can do any experiment you want and repeat deploy and redeploy | 09:05 |
basileus | awesome ! Let me skim through the cleanup script and try to run it in a virtual environment ! Thanks a lot | 09:05 |
mmalchuk | also, you can start to learn Kayobe from this: https://github.com/stackhpc/a-universe-from-nothing | 09:06 |
mmalchuk | then dig into the documentation and etc. | 09:07 |
basileus | Ay Ay captain, just had a question, how come some projects use CentOS while other more or less don't recommend it whatsoever? | 09:08 |
mmalchuk | I chose Ubuntu. it's more stable for me. and this is only my choice. | 09:09 |
mmalchuk | If you prefer RedHat based distros - have a look at Rocky Linux | 09:10 |
mmalchuk | it's well supported by Kolla/Kolla-ansible/Kayobe | 09:10 |
mmalchuk | both | 09:10 |
mmalchuk | Ubuntu and Rocky Linux | 09:10 |
basileus | Yeah I originally went for CentOS and quickly swapped back to ubuntu at that point, too many issues during install | 09:11 |
mmalchuk | jfyi: https://docs.openstack.org/kolla-ansible/latest/user/support-matrix.html | 09:11 |
basileus | Thank you so much ! Time to pull my hair out a bit ! | 09:13 |
hrw | basileus: "kolla-ansible destroy" | 09:53 |
frickler | but I also strongly suggest doing a fresh OS deploy for a re-installation. you'll want to automate that part anyway to reduce work and error rates | 09:56 |
basileus | I see, and what would be the best OS version for that ? Would 20.04 Ubuntu server LTS be best or should I stick to 22.04? | 09:58 |
frickler | 22.04 is required for latest openstack, why start with something old? | 10:06 |
mmalchuk | frickler did you fix your internet? | 10:18 |
frickler | good enough to do IRC at least | 10:48 |
mmalchuk | may be able to merge docs too? | 10:49 |
frickler | maybe, but I also don't see the urgency | 11:52 |
mmalchuk | I'm asking precisely because there is no urgency and no rush today... next week there may be problems and urgency to fix other issues | 12:00 |
mmalchuk | but if you're busy, then ok | 12:01 |
spatel | I am running kolla-build and it starts building images but somehow it randomly gets stuck somewhere and doesn't move further.. the only option left is ctrl+c | 14:00 |
spatel | How does it work with CI jobs? Because of that I am building images one by one.. instead of building all in a single shot | 14:00 |
spatel | For example, this has been stuck since last night - https://paste.opendev.org/show/bf1rRhg9V3BCy4KmHPRg/ | 14:01 |
mmalchuk | stuck is not an error | 14:14 |
mmalchuk | what's the problem? | 14:14 |
opendevreview | Juan Pablo Suazo proposed openstack/kolla-ansible master: Configures the tap-as-a-service neutron plugin https://review.opendev.org/c/openstack/kolla-ansible/+/885417 | 14:14 |
spatel | mmalchuk I have no idea what the problem is because it's just not moving | 14:39 |
spatel | If I ctrl+c and run it again, then the second run works | 14:39 |
spatel | I have a 100G internet link so I don't think it's an internet related issue :) | 14:39 |
mmalchuk | you can enter the intermediate container and execute the last command interactively | 14:40 |
mmalchuk | and maybe you'll see an error | 14:40 |
spatel | Hmm that is a good way to test.. How do I find the intermediate container? I believe it must show up in the docker ps command | 14:41 |
mmalchuk | in the log you provided you can find the line about it | 14:42 |
mmalchuk | but you showed only part of it... there is more about keystone-ssh, but we need the nova-compute container logs | 14:43 |
spatel | Oh something like this - Removing intermediate container 7cddcee1dc66 | 14:43 |
mmalchuk | yep | 14:43 |
mmalchuk | and lines with an arrow | 14:43 |
spatel | I see, you are saying find the intermediate container ID, get into the container with exec -it, and run the last command by hand etc.. | 14:43 |
spatel | ---> fb570654e598 | 14:43 |
mmalchuk | yep | 14:44 |
spatel | Perfect! good to know that | 14:44 |
mmalchuk | this is a successfully created layer (intermediate container) | 14:44 |
spatel | Maybe the nova-compute container got stuck.. | 14:45 |
spatel | I realized building images one-by-one is a better way to see things.. | 14:46 |
mmalchuk | if that is the tail of the log - yes, nova-compute | 14:46 |
spatel | I have built all images with tag 2023.1 but now when I run deploy I get an error saying - docker-reg:4000/kolla/fluentd:2023.1-ubuntu-jammy not found | 14:47 |
spatel | do i need to use tag 2023.1-ubuntu-jammy ? | 14:47 |
mmalchuk | you can always control how the build runs with config or command-line options | 14:47 |
mmalchuk | there is one useful option for you - threads | 14:48 |
mmalchuk | set it to 1, but you also need to control what happens with images that are already built, for example - skip_existing, or even control retries - retries | 14:49 |
mmalchuk | also, to be more verbose, there is good reason to enable debug - debug | 14:50 |
spatel | mmalchuk I see - The number of threads to use while building. (Note: setting to one will allow real time logging) | 14:50 |
mmalchuk | yep | 14:50 |
spatel | I always use --debug | 14:50 |
mmalchuk | threads = 1, retries = 0, skip_existing = True and debug = True - my choice for troubleshooting) | 14:51 |
spatel | +1 | 14:51 |
mmalchuk | also format = none to remove unneeded info in the tail of the logfile | 14:52 |
mmalchuk | format = none | 14:52 |
spatel | Why don't we document these options as a best practice suggestion? | 14:53 |
spatel | It's handy for people like me :) | 14:54 |
mmalchuk | also, as I can see, you have some problems with ubuntu repos (mirrors) - the last two lines show different hosts for one file | 14:54 |
spatel | Yes, it's always stuck on some random mirror fetch | 14:55 |
mmalchuk | it tries to download file number 243, but does not succeed from mirrors.cmich.edu, then tries from mirrors.advancedhosters.com | 14:55 |
mmalchuk | this is very strange | 14:55 |
mmalchuk | http://mirrors.cmich.edu/ is online and has an ubuntu repo | 14:56 |
mmalchuk | maybe you have network issues? | 14:56 |
mmalchuk | it's a good idea to use a local, reachable ubuntu mirror | 14:57 |
spatel | maybe a regional issue, I am on the US East Coast so not sure if something is going on there | 14:57 |
spatel | +1 Yes.. I like that idea | 14:57 |
spatel | This is my kolla-build.conf file - https://paste.opendev.org/show/bnI9OFmryOaaFbw3ugXN/ | 14:58 |
mmalchuk | you can even make your own mirror, it takes only about 2 TB) | 14:58 |
mmalchuk | or use a caching proxy to mitigate slow network issues | 14:58 |
spatel | haha! Not a bad idea, in that case how do I inject the local mirror during build? | 14:59 |
mmalchuk | about the documentation, you can always run kolla-build --help | 14:59 |
spatel | I will try to poke and figure out.. | 14:59 |
spatel | We don't build images every day but again it's good to have it local. | 15:00 |
mmalchuk | you can always use the override mechanisms - described in the documentation | 15:00 |
spatel | I am getting this error during deploy command - docker-reg:4000/kolla/fluentd:2023.1-ubuntu-jammy not found | 15:01 |
spatel | my tag is 2023.1 so where is this extra -ubuntu-jammy suffix coming from? | 15:01 |
spatel | In globals.yml - openstack_release: "2023.1" | 15:02 |
mmalchuk | # Docker image tag used by default. | 15:05 |
mmalchuk | openstack_tag: "{{ openstack_release }}-{{ kolla_base_distro }}-{{ kolla_base_distro_version }}{{ openstack_tag_suffix }}" | 15:05 |
mmalchuk | kolla-ansible/ansible/group_vars/all.yml | 15:05 |
spatel | should I use openstack_tag: 2023.1 ? | 15:08 |
mmalchuk | reasonable if you build images with this tag. but this is not the default behaviour | 15:10 |
spatel | Got it :) | 15:12 |
*** hrww is now known as hrw | 15:55 | |
spatel | mmalchuk does this look ok to you? - https://paste.opendev.org/show/bQyDbzxqA7ZY6yiGsm5l/ | 16:48 |
spatel | build.py version is 15.1.1 and kolla version 16.x.x | 16:48 |
mmalchuk | not really, better to have the binary at the same major version as the code, or newer | 16:50 |
spatel | mmalchuk I checked out the 16.x.x tag, doesn't it include the ./build.py binary? | 17:11 |
mmalchuk | don't know what you did) | 17:11 |
mmalchuk | to get the binary you should use pip | 17:11 |
mmalchuk | even if you build from the source | 17:12 |
mmalchuk | https://docs.openstack.org/kolla/latest/admin/image-building.html | 17:13 |
mmalchuk | did you see build.py usage here? | 17:13 |
spatel | That is the pip way to install kolla, I did it the git clone https:// way | 17:13 |
mmalchuk | no. pip installs the binary into a system dir or into a virtualenv (better) | 17:14 |
spatel | https://paste.opendev.org/show/bXLJAyII51oWGqdZsKLc/ | 17:15 |
spatel | you are suggesting to do python3 -m pip install kolla==16.0.0 ? | 17:17 |
mmalchuk | ok. what next? | 17:17 |
frickler | do you have an mtu < 1500 on your build host? that could affect downloads in docker containers unless you tell docker to use the lower mtu for networking, too | 17:17 |
mmalchuk | it depends. but yes. pip install | 17:18 |
mmalchuk | if you plan to install from pypi - pip install kolla==<version> (version is optional) | 17:19 |
mmalchuk | if you plan to build from source - pip install path_to_source/ | 17:19 |
mmalchuk | but create the virtualenv first (preferred way) | 17:20 |
spatel | I was reading this doc, look at top section - https://hlyani.github.io/notes/openstack/kolla_image_build.html | 17:20 |
spatel | How did they install kolla? | 17:20 |
spatel | I did it the same way and my binary versions are different, as I showed you earlier | 17:21 |
spatel | https://paste.opendev.org/show/bXLJAyII51oWGqdZsKLc/ | 17:21 |
mmalchuk | japanese? queens? this is outdated and in most places incorrect documentation | 17:22 |
spatel | That is example.. I know its old | 17:22 |
spatel | Just trying to use same method for new release | 17:22 |
mmalchuk | don't know how they (japanese?) do anything. this is not official documentation! | 17:23 |
mmalchuk | please don't do it this way | 17:23 |
spatel | Ok.. let me try python3 -m pip install kolla==16.0.0 | 17:23 |
mmalchuk | what you need: | 17:23 |
mmalchuk | 1. source dir - git clone ... | 17:24 |
mmalchuk | 2. virtualenv dir: python3 -m venv .... or virtualenv .... depends on OS | 17:24 |
mmalchuk | 3. pip install path_to_source/ | 17:25 |
mmalchuk | 4. use kolla-build binary from the virtualenv path | 17:25 |
mmalchuk | that's all | 17:25 |
spatel | what would pip install path_to_source be? | 17:26 |
mmalchuk | pip - python installer, install - an option, path_to_source - path to kolla source code | 17:27 |
mmalchuk | if you do 'cd kolla' after 'git clone' - then use 'pip install .' for example | 17:28 |
spatel | ohh! | 17:29 |
spatel | Let me try | 17:29 |
mmalchuk | have you created a virtualenv already? | 17:30 |
spatel | Yes, I did (I use kolla-ansible to run from venv) | 17:30 |
mmalchuk | cool. then proceed with 'pip install .' | 17:30 |
spatel | I have installed kolla in multiple places in production but never used my own images. I always pull images from the public repo and push them to a local mirror | 17:30 |
spatel | This time I'm thinking of using my own images to run kolla and that is what I am playing with right now | 17:31 |
spatel | I am planning to install kolla on a 600 node cluster so better to use my own images. | 17:32 |
spatel | Hope kolla supports that scale | 17:32 |
mmalchuk | imho it's a bad idea to use images from the internet in production. the better way is to build your own and control everything. | 17:32 |
spatel | mmalchuk 100% with you.. In the past I deployed 10 to 20 node clusters and it was a small environment so I didn't bother to build images. | 17:33 |
spatel | This time it's a very large scale deployment so I want to get it right | 17:33 |
mmalchuk | no matter the size... the build takes several minutes | 17:34 |
spatel | Yep | 17:34 |
spatel | My plan is to use 3x nodes for rabbit/DB and 3x for api to support 600 computes | 17:35 |
mmalchuk | what's the reason to split bus/db from api ? | 17:36 |
spatel | Just to have dedicated CPU/memory for rabbitMQ and DB | 17:36 |
spatel | Putting everything on 3 nodes would be too much work | 17:37 |
spatel | DB doesn't take lots of CPU/memory but rabbitMQ crushes things very fast | 17:37 |
mmalchuk | per 'Probability theory', 1 of 6 nodes will fail sooner than 1 of 3) | 17:38 |
spatel | anything is possible when it comes to fail | 17:39 |
mmalchuk | also in your case there will be high latency from api to backend because of the network | 17:39 |
spatel | with 600 nodes the control plane will be very chatty | 17:39 |
spatel | I was thinking of using virtual machines for the API layer to reduce locks or bottlenecks | 17:40 |
mmalchuk | the virtualisation layer adds even more latency | 17:40 |
spatel | I meant multiple VMs for api instead of just 3 | 17:40 |
spatel | horizontal scale | 17:41 |
mmalchuk | don't invent the bicycle) | 17:41 |
mmalchuk | please read the РФ пгшву | 17:41 |
mmalchuk | HA guide | 17:41 |
spatel | I know what you're saying but sometimes it's not about CPU power but the number of requests they handle | 17:42 |
spatel | РФ пгшву ? | 17:42 |
spatel | Do you have link ? | 17:42 |
mmalchuk | my keyboard switches... sorry. I say 'please read the HA guide' | 17:43 |
spatel | haha :) | 17:43 |
mmalchuk | https://docs.openstack.org/ha-guide/ | 17:43 |
spatel | I did see that, and many other openstack summit scaling videos, to learn how to scale | 17:45 |
spatel | I am already running 3 large clouds in production with openstack-ansible; each cloud has 300 compute nodes with 3x controller nodes. | 17:46 |
mmalchuk | ok then) let's get back to kolla | 17:46 |
spatel | This is the first time I am pushing for 600 to 800 compute nodes on a single control plane | 17:46 |
spatel | using kolla | 17:46 |
spatel | I am just a little worried about using 3x nodes, that is why I decided to give dedicated nodes to rabbitmq | 17:47 |
mmalchuk | how is your 'pip install .' ? | 17:47 |
spatel | Its all good :) and images started building | 17:47 |
mmalchuk | kolla-build --version ? | 17:48 |
mmalchuk | same as checkout?) | 17:48 |
spatel | its showing 16.0.0 :) | 17:48 |
spatel | Yesss | 17:48 |
mmalchuk | cool | 17:48 |
mmalchuk | I'm proud of you) | 17:48 |
spatel | I am using your options --thread 1 --skip-existing --cache --format none etc.. and not a single image stuck yet :) | 17:49 |
spatel | You should be proud :) | 17:49 |
mmalchuk | nota bene - this way it takes much more time to build | 17:49 |
mmalchuk | this way is only for debugging | 17:49 |
spatel | I like slow and steady vs troubleshooting | 17:50 |
mmalchuk | nice | 17:50 |
spatel | can you share your kolla-build.conf file if possible.. I am curious to compare it with mine | 17:51 |
mmalchuk | also with skip_existing and cache - in case of a temporary failure and a repeated build you also need more space for docker | 17:51 |
spatel | If its safe to share | 17:51 |
spatel | This time I used cache just to save download time after a failure | 17:52 |
mmalchuk | https://paste.openstack.org/show/bNsIeRFLYvj5UQlPJGce/ | 17:54 |
mmalchuk | note - this is a template only, because I use Kayobe and the full config is created from this template dynamically | 17:54 |
mmalchuk | and this is not from production, but from my dev lab | 17:56 |
spatel | oh okay | 17:56 |
spatel | is this your local image? registry-openstack.cloud.local/openstack/infra/ubuntu | 17:56 |
mmalchuk | yep. the local docker registry | 17:57 |
mmalchuk | with a fixed version of ubuntu from the internet. | 17:57 |
mmalchuk | even if it has tag 20.04 | 17:58 |
mmalchuk | it's updated from the internet when needed | 17:58 |
spatel | I know what you're saying, because if you build the image after a few days then maybe it's 20.04.100 | 17:59 |
mmalchuk | lol) | 17:59 |
spatel | I should push ubuntu image to local repo then | 17:59 |
spatel | You don't put tag in build file? | 18:00 |
mmalchuk | yep) | 18:00 |
mmalchuk | no | 18:00 |
mmalchuk | because it's common. the uncommon build options are passed on the command line when the build is started | 18:01 |
mmalchuk | I can build xena or zed, depending on need | 18:02 |
spatel | Hmmm | 18:02 |
mmalchuk | also I can build only to test and not push | 18:02 |
mmalchuk | so, some options are in the config, some are passed externally | 18:02 |
mmalchuk | devops way) | 18:03 |
spatel | We have artifactory so plan is to push images there | 18:13 |
mmalchuk | not cheap | 18:20 |
mmalchuk | why not opensource solution? | 18:21 |
spatel | We are big company and we already have all those tools in production :) | 18:49 |
mmalchuk | so why not the VMWare?) | 18:55 |
spatel | we are big but not big enough to use vmware. All our tooling already works for openstack so why go for VMware? We have multiple businesses and each business has a different budget to run. | 19:02 |
opendevreview | Juan Pablo Suazo proposed openstack/kolla-ansible master: Adds support for Huawei backends https://review.opendev.org/c/openstack/kolla-ansible/+/869252 | 20:28 |
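
A note on the image tag question from the log: the `-ubuntu-jammy` suffix follows from the `openstack_tag` template quoted from `ansible/group_vars/all.yml`. The sketch below only mimics that template with plain string formatting. The helper name `default_openstack_tag` is hypothetical, the distro values are read off the error message in the log, and the empty suffix is an assumption.

```python
def default_openstack_tag(openstack_release, kolla_base_distro,
                          kolla_base_distro_version, openstack_tag_suffix=""):
    """Hypothetical helper mimicking the openstack_tag template from the log.

    Template (as quoted in the log):
    {{ openstack_release }}-{{ kolla_base_distro }}-{{ kolla_base_distro_version }}{{ openstack_tag_suffix }}
    """
    return (f"{openstack_release}-{kolla_base_distro}-"
            f"{kolla_base_distro_version}{openstack_tag_suffix}")


# Values taken from the log: release 2023.1, image name ending in ubuntu-jammy.
tag = default_openstack_tag("2023.1", "ubuntu", "jammy")
print(tag)  # 2023.1-ubuntu-jammy
```

This is why images built with the plain tag `2023.1` were not found at deploy time: either the images are built with the composed tag, or `openstack_tag` is overridden to match the tag that was actually used, as discussed in the log.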
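
The troubleshooting settings mmalchuk lists in the log (`threads = 1`, `retries = 0`, `skip_existing = True`, `debug = True`, `format = none`) can be written out as a config fragment. The sketch below renders them with Python's standard `configparser`. Placing them under a `[DEFAULT]` section is an assumption about the `kolla-build.conf` layout, and `render_build_conf` is a hypothetical helper, not part of kolla.

```python
import configparser
import io

# Option names and values as suggested in the log for troubleshooting builds.
TROUBLESHOOTING_OPTIONS = {
    "threads": "1",           # one thread, which allows real time logging
    "retries": "0",           # do not retry failed image builds
    "skip_existing": "True",  # skip images that are already built
    "debug": "True",          # verbose output
    "format": "none",         # drop the summary at the tail of the log
}


def render_build_conf(options):
    """Render options as INI text under [DEFAULT] (section name is an assumption)."""
    parser = configparser.ConfigParser()
    parser["DEFAULT"] = options
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


conf_text = render_build_conf(TROUBLESHOOTING_OPTIONS)
print(conf_text)
```

As noted in the log, these settings make a build much slower and are meant for debugging only. `kolla-build --help` lists the options that the installed version actually supports.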