opendevreview | Merged openstack/openstack-ansible-os_tempest master: Add blazar tempest support https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/904785 | 04:54 |
jrosser | noonedeadpunk: what do you think about converting the pre-gate playbooks to roles? that way they can be shared between parent and child jobs | 08:07 |
jrosser | question would be where to put them as zuul requires them to be in repo/roles/ dir | 08:08 |
jrosser | a job playbook can run a role from any zuul repo it has available | 08:09 |
noonedeadpunk | I need to process that... | 08:10 |
noonedeadpunk | We can deliver them as collections to zuul ? :D | 08:11 |
noonedeadpunk | or rather, can we? :) | 08:11 |
noonedeadpunk | I have seen somewhere that it can install them at runtime or smth... | 08:12 |
jrosser | so I can’t call a pre playbook that lives in openstack-ansible repo from os_magnum, for example | 08:12 |
jrosser | but what I can do is use job.roles to specify that a role from another repo would be available in os_magnum pre-playbook | 08:13 |
noonedeadpunk | you will still need an individual playbook to call the role? | 08:16 |
jrosser | for example in os_magnum yes | 08:16 |
jrosser | or gross, I just give up and copy the existing pre playbooks into os_magnum but that’s horrible | 08:17 |
jrosser | so as an example, we could put these roles in openstack-ansible-plugins/roles | 08:18 |
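A rough sketch of the wiring jrosser describes, assuming the shared roles land in openstack-ansible-plugins/roles; the job name, parent, and playbook path below are illustrative, not taken from the chat:

```yaml
- job:
    name: openstack-ansible-deploy-aio_capi   # hypothetical child job
    parent: openstack-ansible-deploy-aio
    # job.roles makes the roles/ dir of another repo available to this
    # job's playbooks, even though playbooks themselves cannot be shared
    roles:
      - zuul: openstack/openstack-ansible-plugins
    # the per-repo pre playbook then only needs to call the shared role
    pre-run: zuul.d/playbooks/pre-capi.yml
```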
noonedeadpunk | I'm still thinking that adding some variable to control whether or not to execute the pre-step will be easier tbh | 08:20 |
noonedeadpunk | As you can easily override variables at the job level | 08:20 |
jrosser | I still do not know how I would execute the pre step | 08:21 |
noonedeadpunk | Give me a min to come up with a sample | 08:21 |
jrosser | I see how to set a var to suppress it in the parent job | 08:21 |
jrosser | but then in the child you want to do “early tasks” followed by the original pre tasks | 08:22 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: DNM Allow to skip pre-step bootstrap https://review.opendev.org/c/openstack/openstack-ansible/+/905290 | 08:25 |
noonedeadpunk | ^ | 08:25 |
noonedeadpunk | Eventually, I guess you can also set ENV per Zuul job, so https://review.opendev.org/c/openstack/openstack-ansible/+/905290/1/zuul.d/playbooks/run.yml part might be irrelevant | 08:26 |
noonedeadpunk | As basically, if you unset SKIP_OSA_BOOTSTRAP_AIO and SKIP_OSA_RUNTIME_VENV_BUILD bootstrap will run just normally during run | 08:26 |
noonedeadpunk | not pre-run | 08:26 |
noonedeadpunk | And in pre-run you do whatever needed, as no bootstrap will happen there | 08:27 |
noonedeadpunk | Or am I missing smth? | 08:27 |
jrosser | I think so | 08:28 |
jrosser | in os_magnum I need to write some config before bootstrap happens | 08:28 |
noonedeadpunk | yeah, so you define this in pre-run | 08:28 |
jrosser | right | 08:29 |
noonedeadpunk | and with `osa_run_pre_bootstrap: false` it won't happen in pre stage | 08:29 |
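A minimal sketch of the gating noonedeadpunk proposes; the variable name `osa_run_pre_bootstrap` comes from the chat, while the playbook layout and paths are assumptions:

```yaml
# zuul.d/playbooks/pre.yml (structure assumed for illustration)
- hosts: all
  tasks:
    - name: Bootstrap Ansible and the AIO during pre-run unless a child job opts out
      ansible.builtin.shell: |
        ./scripts/bootstrap-ansible.sh
        ./scripts/bootstrap-aio.sh
      args:
        chdir: "{{ ansible_user_dir }}/src/opendev.org/openstack/openstack-ansible"
      # a child job such as the capi one sets this to false, writes its extra
      # config in its own pre-run, and lets the run stage do the bootstrap
      when: osa_run_pre_bootstrap | default(true) | bool
```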
jrosser | then how do I run pre-gate-scenario and stuff? | 08:29 |
noonedeadpunk | pre-gate-scenario will run | 08:30 |
noonedeadpunk | It will just not run bootstrap-ansible.sh and bootstrap-aio.sh | 08:30 |
jrosser | ok sure | 08:30 |
jrosser | but I do need those to happen | 08:30 |
noonedeadpunk | But then they will happen as part of gate-check-commit.sh during run? | 08:31 |
noonedeadpunk | As now they're skipped through `SKIP_OSA_BOOTSTRAP_AIO=1` | 08:31 |
noonedeadpunk | And you need them to happen after you paste your content, right? | 08:32 |
jrosser | maybe I'm missing a bunch of complexity we have in the gate-check-commit script | 08:32 |
jrosser | yes that’s right I need them after I drop some config | 08:33 |
noonedeadpunk | Like we've explicitly disabled bootstrap-ansible.sh and bootstrap-aio.sh from gate-check-commit.sh here https://opendev.org/openstack/openstack-ansible/src/branch/master/zuul.d/playbooks/run.yml#L22-L24 | 08:33 |
noonedeadpunk | As normally gate-check-commit does that | 08:34 |
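A hedged reconstruction of the run.yml behaviour being referenced: gate-check-commit.sh is launched with the bootstrap steps disabled because pre-run normally performs them already (the env var names appear in the chat; the task layout is assumed):

```yaml
- hosts: all
  tasks:
    - name: Run gate-check-commit with the bootstrap steps skipped
      ansible.builtin.shell: scripts/gate-check-commit.sh
      args:
        chdir: "{{ ansible_user_dir }}/src/opendev.org/openstack/openstack-ansible"
      environment:
        # both scripts already ran during pre-run, so the run stage skips them
        SKIP_OSA_BOOTSTRAP_AIO: "1"
        SKIP_OSA_RUNTIME_VENV_BUILD: "1"
```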
noonedeadpunk | And we already have logic to disable pre steps for upgrade jobs I guess | 08:34 |
noonedeadpunk | Eventually... You can likely even do it without variables, as scenario should be available there | 08:35 |
noonedeadpunk | and instead of `osa_run_pre_bootstrap` have `capi not in scenario` or smth | 08:36 |
jrosser | the complexity is now more than I can understand :( | 08:36 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: DNM Allow to skip pre-step bootstrap https://review.opendev.org/c/openstack/openstack-ansible/+/905290 | 08:44 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 08:46 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 08:46 |
noonedeadpunk | Unless I mixed up some logic here, I hope it will work... | 08:48 |
jrosser | ah ok | 09:06 |
noonedeadpunk | Not sure why exactly it failed, but according to the log it does what you want? | 09:06 |
jrosser | i think i had not appreciated that the bootstrap could happen in the run playbook | 09:06 |
noonedeadpunk | ah, well... | 09:07 |
noonedeadpunk | Maybe it's less of a problem today given that all repos are on a filesystem and parallel install is very fast.... but dunno | 09:08 |
noonedeadpunk | that sounds slightly simpler than doing a role and a playbook per repo. | 09:09 |
noonedeadpunk | But we can move that to a role as well... | 09:09 |
jrosser | so not quite https://zuul.opendev.org/t/openstack/build/6462a73d23e54f72bbc60f4ce15d5b58/log/job-output.txt#5161-5165 | 09:11 |
jrosser | oh wait | 09:11 |
jrosser | yes that's the thing that your patch skips | 09:11 |
jrosser | trouble is gate-check-commit.sh uses the same flags to skip bootstrapping? https://github.com/openstack/openstack-ansible/blob/master/scripts/gate-check-commit.sh#L95-L98 | 09:14 |
noonedeadpunk | um, what? | 09:19 |
noonedeadpunk | We don't run gate-check-commit.sh in "pre-run", we only run it during "run" | 09:20 |
noonedeadpunk | In "pre-run" we launch bootstrap-ansible.sh and bootstrap-aio.sh "manually" | 09:21 |
noonedeadpunk | And it's kinda vice-versa: `SKIP_OSA_BOOTSTRAP_AIO` is defined by default in our "run" playbook. My patch adds a variable that makes this part execute from gate-check-commit | 09:24 |
noonedeadpunk | rather than being skipped | 09:24 |
* noonedeadpunk is kinda a bad judge of what counts as a complex setup | 09:24 | |
gokhani | hello folks, do we need br-storage on controller nodes if we install glance and cinder on hyperconverged nodes (compute + ceph)? | 09:32 |
jrosser | gokhani: your controllers will need to be able to talk to the ceph networks | 09:35 |
jrosser | noonedeadpunk: ultimately that job fails because somehow the gate-check-commit.sh tries to do `ansible -m setup......` but bootstrap-ansible has not happened | 09:35 |
jrosser | gokhani: br-storage is used to connect your storage network into LXC containers on the controllers | 09:36 |
gokhani | jrosser, my customer separated the storage and data switches. On the controllers I have no connection to the storage switches. I thought that if I move glance and cinder containers to storage nodes, it may work | 09:38 |
noonedeadpunk | jrosser: oh, yes, you're right. | 09:39 |
jrosser | storage nodes? | 09:39 |
noonedeadpunk | Once the variable is defined, it doesn't matter if it's truthy or not | 09:40 |
jrosser | yes the test is for length zero? | 09:40 |
noonedeadpunk | yeah | 09:40 |
noonedeadpunk | well, I mean. It should be solvable :) | 09:41 |
jrosser | gokhani: you have controllers and converged computes i think? so what is "storage nodes"? | 09:42 |
gokhani | jrosser, sorry it is hyperconverged nodes (compute and ceph ) | 09:43 |
jrosser | noonedeadpunk: I am very wary of messing with any of the gate-check-commit stuff because personally I already view it as very fragile | 09:43 |
jrosser | particularly how the upgrade jobs work is super not obvious | 09:43 |
jrosser | gokhani: the simplest thing will be to connect your controllers to the storage network, if possible | 09:44 |
jrosser | gokhani: for example, how do you choose where to put your ceph mon? | 09:45 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: DNM Allow to skip pre-step bootstrap https://review.opendev.org/c/openstack/openstack-ansible/+/905290 | 09:46 |
gokhani | jrosser, in that situation I cannot connect controllers to the storage network. Can it be a problem to put ceph mon on controller nodes? You mean ceph mon also needs to connect to the storage network. | 09:52 |
gokhani | If it is not possible, I have to put ceph mon on some of the hyperconverged nodes | 09:53 |
jrosser | absolutely yes, the ceph mon must have connectivity to the ceph cluster | 09:53 |
jrosser | you can't just come up with an arbitrary architecture and say "it must work like this" when there are intrinsic requirements of the components | 09:53 |
jrosser | you can, if you want, make the mgmt network routable to the storage network for things like cinder and glance I expect | 09:54 |
jrosser | putting the ceph mon on the converged nodes would be possible but you do then come up with at least 3 "special" compute nodes that start to look more and more like controllers | 09:55 |
gokhani | in fact I am looking for alternatives if controller nodes have no connection to storage network. | 09:57 |
jrosser | well, like i say you would move a set of components to 3 chosen compute nodes that need connection to the storage network | 09:57 |
noonedeadpunk | or you might use ceph mons for that as well.... | 09:58 |
jrosser | but then you do make effectively 3 more "sort of controller" nodes | 09:58 |
jrosser | and you have to worry about uptime/cluster of those 3 differently than the rest of your computes | 09:58 |
noonedeadpunk | But then you need to have access from osa node to mons | 09:58 |
noonedeadpunk | *osa deploy host | 09:58 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: DNM Allow to skip pre-step bootstrap https://review.opendev.org/c/openstack/openstack-ansible/+/905290 | 10:03 |
gokhani | jrosser, thanks, I will tell the risk of this architecture to customer. Because it is not possible to add new network card to controller servers. | 10:06 |
jrosser | the data and storage switches can be linked and the storage traffic brought to the controller on a vlan | 10:07 |
jrosser | or routed if it is an L3 setup | 10:07 |
gokhani | jrosser, the customer is unfortunately against this. it is not an L3 setup. | 10:19 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: DNM Allow to skip pre-step bootstrap https://review.opendev.org/c/openstack/openstack-ansible/+/905290 | 11:28 |
jrosser | noonedeadpunk: perhaps we modify gate-check-commit.sh to have a default value for this? like `if [[ "${SKIP_OSA_BOOTSTRAP_AIO:-1}" == '0' ]]; then....` | 11:33 |
jrosser | then we specifically check for the condition | 11:33 |
jrosser | rather than logic that depends on if the env var is set or not | 11:33 |
jrosser | which seems unintuitive | 11:33 |
noonedeadpunk | yeah, could be better indeed | 11:34 |
noonedeadpunk | You just said you don't wanna touch it and I'm kinda short on time today to touch it either :( | 11:36 |
jrosser | tbh it is still hurting my head that we have one var that will skip the bootstrap in one place but not skip it in the other | 11:36 |
Tadios | I have seen this patch https://github.com/openstack/openstack-ansible-haproxy_server/commit/df2e7af9b3161eeca61b8695fa97e22c5683ec34 for the haproxy role to solve the issue "AnsibleUndefinedVariable: 'vip_interface' is undefined" when running the haproxy playbook, but the playbook is still failing on that task. I'm on 2023.1 (27.3.0) | 12:21 |
Tadios | I can see inside the ansible-role-requirements.yml the trackbranch is set to stable/2023.1, I don't know why | 12:21 |
Tadios | noonedeadpunk: ^ I think you were the one who patched it, right? | 12:37 |
noonedeadpunk | 27.3.0 does not have this patch fwiw | 12:41 |
noonedeadpunk | trackbranch can be ignored - it's for our tooling only. `version` is what is actually being installed | 12:42 |
Tadios | ohh, that's why | 12:45 |
noonedeadpunk | jrosser: should I also rename the file in https://review.opendev.org/c/openstack/openstack-ansible/+/905221 to smth more clearly referring to the runtime venv? | 12:45 |
noonedeadpunk | Tadios: we're going to make new release really soon | 12:59 |
jrosser | noonedeadpunk: it would be helpful to say “ansible runtime venv” rather than just venv I think | 13:01 |
jrosser | and yeah the file name could be more specific too | 13:01 |
jrosser | user-ansible-venv-requirements maybe | 13:02 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Add support for extra Python packages inside Ansible runtime https://review.opendev.org/c/openstack/openstack-ansible/+/905221 | 13:04 |
noonedeadpunk | Yeah, did exactly that :) | 13:04 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Add support for extra Python packages inside Ansible runtime https://review.opendev.org/c/openstack/openstack-ansible/+/905221 | 13:05 |
opendevreview | Merged openstack/openstack-ansible master: Return back /healtcheck URI verification https://review.opendev.org/c/openstack/openstack-ansible/+/904941 | 13:05 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.2: Return back /healtcheck URI verification https://review.opendev.org/c/openstack/openstack-ansible/+/905321 | 13:15 |
TheCompWiz | Good morning everyone! Can someone help me with an issue I'm getting with a new deployment? (of bobcat) I make it all the way through the setup-hosts.yml and the setup-infrastructure.yml ... but it dies on the os-keystone.yml on the "Create database for service" task. | 13:20 |
noonedeadpunk | TheCompWiz: can you kindly post some output of the failure? ie via https://paste.openstack.org/ | 13:22 |
TheCompWiz | https://paste.openstack.org/show/bHTjK5W0fHJDCk9RxxUf/ | 13:23 |
TheCompWiz | everything else before that is successful. | 13:24 |
TheCompWiz | ... and all the output is "censored" ... which is really annoying. | 13:24 |
jrosser | `failed: [alice_keystone_container-4ffef4a9 -> localhost]` that looks wrong | 13:24 |
jrosser | TheCompWiz: otherwise it would reveal your database credentials in the log, which is annoying in a different way | 13:25 |
TheCompWiz | jrosser: you're not wrong about that. | 13:25 |
TheCompWiz | what about it looks wrong to you? | 13:25 |
jrosser | the delegation of that task should be to the utility host | 13:26 |
jrosser | not to localhost, i.e ansible controller | 13:26 |
TheCompWiz | the "localhost" is my utility host. (I also use it for the deployment) | 13:28 |
jrosser | the utility host would be a container on one of your infra nodes for an LXC based deployment | 13:28 |
NeilHanlon | yeah, 'utility' host here is a specific terminology that OpenStack-Ansible uses | 13:29 |
TheCompWiz | is there an easy way to get the censored output without digging through all the playbooks? | 13:29 |
jrosser | the utility host has all the stuff necessary for the database client and openstack client | 13:30 |
jrosser | it is the host where the ansible tasks run when ansible modules need to interact with the db and the openstack API | 13:30 |
jrosser | TheCompWiz: you can look in the code https://github.com/openstack/openstack-ansible-plugins/blob/master/roles/db_setup/tasks/main.yml#L29 | 13:31 |
jrosser | TheCompWiz: you can also find the task names in the code easily like this https://codesearch.opendev.org/?q=Create%20database%20for%20service&i=nope&literal=nope&files=&excludeFiles=&repos= | 13:32 |
NeilHanlon | Yep! We are pretty good about making sure everything has an associated 'nerd knob' to influence the functionality, e.g. for debugging/troubleshooting, as well as customization | 13:32 |
TheCompWiz | Do I need to pre-install mariadb? ... or is it supposed to be installed by the playbook? | 13:33 |
jrosser | TheCompWiz: the thing here is, unless we resolve this issue of using localhost as the db setup host, it is not likely to be useful to see the output from that task | 13:33 |
jrosser | everything will be setup for you | 13:33 |
jrosser | TheCompWiz: have you run an all-in-one deployment first? | 13:34 |
TheCompWiz | jrosser: um.... no. Do you have an example script somewhere for the all-in-one? (I can google it if you don't have it handy) | 13:40 |
jrosser | https://docs.openstack.org/openstack-ansible/2023.2/user/aio/quickstart.html | 13:40 |
noonedeadpunk | TheCompWiz: what happens when you run `openstack-ansible playbooks/utility-install.yml`? | 14:03 |
noonedeadpunk | does the playbook succeed? | 14:03 |
noonedeadpunk | And I assume you have /openstack/venvs/utility-<version> on the "localhost"? | 14:03 |
*** jamesdenton__ is now known as jamesdenton | 14:04 | |
noonedeadpunk | TheCompWiz: to disclose the error you can run `openstack-ansible playbooks/os-keystone-install.yml -e_oslodb_setup_nolog=False ` | 14:07 |
noonedeadpunk | ugh, `openstack-ansible playbooks/os-keystone-install.yml -e _oslodb_setup_nolog=False` | 14:08 |
TheCompWiz | noonedeadpunk: interestingly enough... it's complaining about "python"... | 14:18 |
TheCompWiz | "module_stderr": "/bin/sh: 1: /openstack/venvs/utility-28.0.1.dev2/bin/python: not found\n", | 14:18 |
TheCompWiz | since ansible deployed the container... I find it funny that it can't find python. | 14:19 |
noonedeadpunk | So it boils down to the result of `openstack-ansible playbooks/utility-install.yml`, basically | 14:19 |
noonedeadpunk | It's delegated to the utility container, which in your case seems to be localhost | 14:19 |
TheCompWiz | utility-install.yml completes successfully... but skipped everything... so I assume something is wrong. | 14:19 |
noonedeadpunk | I guess that's where things potentially went messy | 14:19 |
jrosser | ah then do you have a utility host defined in openstack_user_config? | 14:19 |
noonedeadpunk | Do you have any host in utility_all group? | 14:20 |
noonedeadpunk | Because it's fine to have utility on localhost, but it should be properly defined anyway | 14:20 |
TheCompWiz | I have a utility container... | 14:21 |
TheCompWiz | lemme poke at it a bit & see if I can figure it out. Thanks for pointing me in the right direction. | 14:23 |
Tadios | noonedeadpunk: jrosser: what am i missing here, i am trying to install elk from the ops repo and i keep getting different variables undefined error | 14:23 |
Tadios | I followed the guide in the readme line by line, but it keeps giving me ansible_pkg_mgr, ansible_os_family, ansible_memtotal_mb, ansible_hostname undefined errors on different tasks | 14:23 |
Tadios | there is a line where it says "gather facts before running, openstack -m setup elk_all will gather the facts you will need" but I think there is a typo in the command openstack -m setup elk_all because that is not working | 14:24 |
Tadios | ansible -m setup elk_all works fine for gathering facts but I don't think it is saving them anywhere and I keep getting the same errors | 14:24 |
jrosser | that is because the openstack-ansible command is set up to disable ansible facts access through ansible_<foo>, they are only available in ansible_facts['foo'] | 14:25 |
noonedeadpunk | TheCompWiz: so by default, DB setup tasks are delegated to openstack_db_setup_host. If the variable is undefined for some reason, it will be localhost | 14:25 |
noonedeadpunk | TheCompWiz: and that goes down to "{{ groups['utility_all'][0] }}" | 14:26 |
jrosser | Here is the code that defines the delegation target for service and db setup https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/defaults/source_install.yml#L30-L34 | 14:27 |
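The linked defaults boil down to roughly the following (paraphrased from the chat, not copied verbatim from source_install.yml):

```yaml
# db and service setup tasks are delegated to the first utility host when one
# exists in the inventory; with no utility_all member they fall back to
# localhost, which is exactly the failure mode TheCompWiz is hitting
openstack_db_setup_host: "{{ groups['utility_all'][0] | default('localhost') }}"
openstack_service_setup_host: "{{ groups['utility_all'][0] | default('localhost') }}"
```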
Tadios | jrosser: what should I do then? | 14:28 |
noonedeadpunk | Tadios: you can do `export ANSIBLE_INJECT_FACT_VARS=True` I guess | 14:29 |
jrosser | ^ that should work i think | 14:29 |
Tadios | okay, let me check | 14:29 |
jrosser | ideally we need to get the content in the ops repo patched | 14:29 |
jrosser | Tadios: it would be good to know that the playbooks in the openstack-ansible-ops repo are really community supported efforts | 14:30 |
jrosser | so contributions to that code are most welcome | 14:30 |
Tadios | jrosser: okay, i'll do my best | 14:32 |
jrosser | it's not quite search/replace to fix ansible_<foo> to ansible_facts['foo'] but it is not too complicated if you were interested in making a patch | 14:33 |
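For illustration, the kind of change involved; the task and variables below are generic examples rather than code from the ops repo:

```yaml
# before: relies on injected fact variables, which break when fact
# injection is disabled (openstack-ansible's default)
- name: Show package manager (old style)
  ansible.builtin.debug:
    msg: "{{ ansible_hostname }} uses {{ ansible_pkg_mgr }}"

# after: reads from the ansible_facts dictionary; the keys lose their
# ansible_ prefix, which is why it is not a pure search/replace
- name: Show package manager (new style)
  ansible.builtin.debug:
    msg: "{{ ansible_facts['hostname'] }} uses {{ ansible_facts['pkg_mgr'] }}"
```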
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Add support for extra Python packages inside Ansible runtime https://review.opendev.org/c/openstack/openstack-ansible/+/905221 | 14:35 |
jrosser | now thats very very interesting | 14:37 |
jrosser | this patch passed check at 2.01PM https://review.opendev.org/c/openstack/openstack-ansible/+/905290 | 14:37 |
jrosser | and at the same time this one was failed seemingly as a result https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 14:38 |
noonedeadpunk | maybe I triggered recheck too early ? | 14:39 |
jrosser | 11:28:25 vs 11:25:48 | 14:40 |
noonedeadpunk | hm, and now it's not scheduled | 14:40 |
jrosser | oops 11:28:25 vs 11:28:48 | 14:40 |
jrosser | i have also noticed our jobs spending really quite some time in "queued" state | 14:41 |
NeilHanlon | it's the new year, they're feeling a bit lazy | 14:41 |
jrosser | oooh it's because this got updated underneath all of it https://review.opendev.org/c/openstack/openstack-ansible/+/905221 | 14:42 |
jrosser | huh | 14:43 |
noonedeadpunk | oh | 14:43 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: DNM Allow to skip pre-step bootstrap https://review.opendev.org/c/openstack/openstack-ansible/+/905290 | 14:43 |
noonedeadpunk | my bad | 14:43 |
jrosser | no worries | 14:43 |
jrosser | it's exactly the same as when I was trying to figure out why my job wasn't scheduled earlier this week | 14:44 |
jrosser | due to a rebase needed on a dependent patch, but not sure why it just seems to sit and wait rather than fail | 14:44 |
noonedeadpunk | I somehow thought it doesn't matter what the relation chain is for this DNM | 14:44 |
Tadios | jrosser: I'll definitely be looking into it, I have had a hard time with the deployment to begin with; for example, after running bootstrap-embedded-ansible.sh you are put into a python venv, and running the `ansible-playbook site.yml $USER_VARS` playbook as per the instructions results in ModuleNotFoundError: No module named 'ansible.compat.six', so once I sort out all my problems in my deployment, I'll look into contributing | 14:47 |
noonedeadpunk | o_O | 14:48 |
jrosser | tbh this all needs refactoring into a collection just like the magnum stuff | 14:50 |
jrosser | all the embedded ansible and surrounding complexity is trying to deal with a very similar situation of running somewhat out-of-tree code during a test defined really in the integrated repo | 14:51 |
noonedeadpunk | jrosser: ok, now capi actually seems to be running bootstrap-aio | 14:59 |
jrosser | it is | 14:59 |
jrosser | but i do not see it run the pre playbook from the child job | 15:00 |
jrosser | seems to go straight from `configure-mirrors` role to `opendev.org/openstack/openstack-ansible/zuul.d/playbooks/pre-gate-cleanup.yml@master` | 15:01 |
noonedeadpunk | I do see | 15:01 |
noonedeadpunk | 2024-01-11 14:56:49.492769 | PLAY [Bootstrap configuration for Vexxhost magnum-cluster-api driver] | 15:01 |
noonedeadpunk | I do not see ops collection being installed though | 15:03 |
jrosser | this probably is now much easier to debug | 15:04 |
jrosser | as will be possible to see what actually got into openstack_deploy in the logs | 15:05 |
noonedeadpunk | yeah | 15:05 |
noonedeadpunk | and it looks like generally it might work out | 15:05 |
jrosser | thank you for looking at the job running stuff | 15:05 |
jrosser | like i say i would hope that we can do something applicable also to ELK etc | 15:06 |
noonedeadpunk | yeah, that will be really nice | 15:06 |
noonedeadpunk | I think we can also run the same job for changes in ops repo inside mcapi_vexxhost | 15:07 |
jrosser | that would be a good idea | 15:09 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Add tempest tests for Blazar https://review.opendev.org/c/openstack/openstack-ansible/+/904786 | 15:13 |
jrosser | spatel: write a blog about autoscaling with an example :) | 15:14 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 15:16 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 15:17 |
spatel | jrosser I am about to do that.. :) | 15:18 |
spatel | before my buffer flushed out | 15:18 |
jrosser | spatel: excellent - i want to add some practical "day 2" examples to my documentation for this in OSA | 15:19 |
spatel | +1 | 15:19 |
admin1 | what day 2 topics are missing? | 15:19 |
jrosser | well anything at all | 15:19 |
jrosser | and by that i mean that it should be possible for operators who are not familiar with k8s particularly to be able to offer k8s-as-a-service | 15:20 |
jrosser | so enough guidance is needed in the documentation to exercise all the features so that you can 1) document them for your users 2) test if they work | 15:20 |
spatel | agreed jrosser | 15:21 |
jrosser | and right now i feel that there is a massive gap here | 15:21 |
spatel | Deploying stuff is easy but actually it's very important to test functionality :) | 15:22 |
admin1 | I think magnum is something everyone is interested in, but it's one that we are lagging behind on | 15:22 |
jrosser | i'm not sure about that | 15:22 |
jrosser | we have a complete ready to go solution for one of the capi drivers, none of the other tools offer that | 15:22 |
jrosser | but indeed documentation is missing all round | 15:23 |
admin1 | i was going to try capi | 15:23 |
spatel | We are lagging behind? | 15:23 |
admin1 | i meant in handholding for the coe tags and labels | 15:23 |
spatel | capi is awesome but it requires good operational k8s knowledge | 15:23 |
admin1 | any good capi docs for osa :D ? | 15:24 |
jrosser | admin1: there is little point with that any more as the old magnum stuff will be deprecated | 15:24 |
jrosser | all the madness with labels and version magic will go away | 15:24 |
admin1 | coz so far i am doing vms via terraform and then kubespray for the k8s via git pipelines | 15:24 |
admin1 | if there is anything new/beta to try, i can test it out | 15:25 |
spatel | There isn't any specific doc for capi because each deployment is different. I created doc for kolla-ansible if you want to take some pieces for learning - https://satishdotpatel.github.io/openstack-magnum-capi/ | 15:25 |
spatel | jrosser what tools are you using to deploy mgmt cluster on OSA ? | 15:26 |
spatel | Assuming you are creating lxc container and deploying k8s inside it using ansible tooling.. right? | 15:27 |
noonedeadpunk | admin1: jrosser is working on that currently | 15:28 |
noonedeadpunk | https://review.opendev.org/q/topic:%22capi%22 | 15:29 |
jrosser | spatel: yes that is right | 15:29 |
spatel | are you using kubespray or kubeadmin? | 15:30 |
jrosser | those patches need updating so probably won't work currently | 15:30 |
jrosser | kubeadm | 15:30 |
admin1 | i use terraform to boot/manage the master/worker nodes and , and then kubespray currently | 15:30 |
spatel | Cool! | 15:30 |
admin1 | but i saw a capi demo and i liked it | 15:30 |
admin1 | so was gathering info to make it work | 15:31 |
jrosser | i was hoping to have some working OSA CI for it this week | 15:31 |
jrosser | but that has been tricky to structure so maybe next week :) | 15:31 |
admin1 | if we need a multi node (all big kvm's) working osa of a specific version to test with, I can provide resources | 15:32 |
admin1 | to speed it up | 15:32 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 15:33 |
admin1 | if you use ipv6 for customers, what range do you normally add to openstack ? | 15:34 |
admin1 | like /64 or higher | 15:34 |
jrosser | each tenant network gets a /64 | 15:36 |
jrosser | it kind of has to, as that's the size of a subnet | 15:36 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 16:00 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: WIP - Bootstrapping playbook https://review.opendev.org/c/openstack/openstack-ansible-ops/+/902178 | 16:28 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/zed: Bump SHAs for Zed https://review.opendev.org/c/openstack/openstack-ansible/+/905344 | 16:33 |
TheCompWiz | noonedeadpunk: how on earth does the "utility" container get created? ... I cannot see anything in any documentation that shows how to configure it in the openstack_user_config.yml | 16:39 |
TheCompWiz | I feel like I am running around in circles... and getting nowhere fast. | 16:40 |
jrosser | TheCompWiz: there are example / starting point configs in the etc/ directory in the openstack-ansible repo | 16:43 |
TheCompWiz | jrosser: I know. That's what I'm basing my config on. But none of the examples mention a "utility" anything. | 16:44 |
jrosser | the reason I suggested to build an all-in-one is that the configuration is built automatically for you and is exactly the same as those we use in our CI jobs | 16:44 |
TheCompWiz | jrosser: I was trying to avoid the aio... just because I assume unpacking everything into a production environment would be a headache. | 16:47 |
jrosser | by default this is the setup for the utility host: inventory/env.d/utility.yml | 16:47 |
jrosser | the AIO will give you a reference to understand how things like the db setup should work | 16:48 |
jrosser | you can see then that the utility container is part of this inventory/env.d/shared-infra.yml | 16:49 |
jrosser | so should be created on whatever hosts you specify as shared-infra_hosts | 16:50 |
jrosser | (I’m looking at this on a phone so +/- some typos) | 16:50 |
TheCompWiz | sadly, it doesn't seem to get deployed on any of the shared-infra-hosts. | 16:51 |
TheCompWiz | that's where I'm stuck. | 16:51 |
jrosser | did any containers get created? can you share the output of lxc ls? | 16:52 |
TheCompWiz | several... yes. (cinder, glance, heat, horizon, keystone, nova_api, octavia, placement, repo) | 16:53 |
jrosser | you can paste things directly at paste.opendev.org btw | 16:54 |
jrosser | then we need to work back to find out why | 16:54 |
TheCompWiz | I thought that's what we were doing. | 16:54 |
noonedeadpunk | TheCompWiz: so what's the output of `/opt/openstack-ansible/scripts/inventory-manage.py -G | grep utility`? | 16:54 |
jrosser | there is scripts/inventory-manage.py that you can use to show what the inventory expects | 16:55 |
jrosser | snap :) | 16:55 |
TheCompWiz | noonedeadpunk: blank. | 16:56 |
noonedeadpunk | ok | 16:56 |
noonedeadpunk | then add `operator_hosts` to openstack_user_config | 16:57 |
noonedeadpunk | and run openstack-ansible playbooks/lxc-container-create.yml --limit utility_all,lxc_hosts | 16:58 |
noonedeadpunk | and then openstack-ansible playbooks/utility-install.yml | 16:58 |
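A minimal sketch of the suggested addition to /etc/openstack_deploy/openstack_user_config.yml; the host name and IP below are placeholders:

```yaml
operator_hosts:
  infra1:
    ip: 172.29.236.11
```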
jrosser | TheCompWiz: can you share your openstack_user_config with us? | 16:58 |
TheCompWiz | what goes in the operator_hosts? | 16:58 |
noonedeadpunk | Ok, so as you've noticed, we're using a dynamic inventory script to generate the required inventory for ansible. | 16:59 |
noonedeadpunk | It is described through env.d files. For utility it looks like this: https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/env.d/utility.yml | 16:59 |
noonedeadpunk | So basically it contains only the utility container :) But utility will also be created if you have defined shared-infra_hosts in openstack_user_config | 17:00 |
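Abridged from the linked env.d/utility.yml, showing how the utility container maps to host groups (trimmed to the relevant skeleton entries):

```yaml
component_skel:
  utility:
    belongs_to:
      - utility_all
container_skel:
  utility_container:
    belongs_to:
      - operator_containers
physical_skel:
  operator_containers:
    belongs_to:
      - all_containers
  operator_hosts:
    belongs_to:
      - hosts
```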
noonedeadpunk | And I guess jrosser's question about the openstack_user_config content was related to that, as shared-infra_hosts would be enough to get utility created | 17:01 |
TheCompWiz | here's the config I'm working with: https://paste.openstack.org/show/baUzHFwmoVk5oVWsJhtD/ | 17:04 |
jrosser | you have a typo in shared-infra-hosts | 17:05 |
jrosser | dash rather than underscore | 17:05 |
noonedeadpunk | TheCompWiz: that looks quite close to the cleura academy config? :) | 17:06 |
TheCompWiz | .... I think I'm going to go jump off a bridge now. | 17:06 |
TheCompWiz | noonedeadpunk: it is. was using that as a boilerplate. | 17:06 |
jrosser | haproxy on deploy host is unusual | 17:07 |
noonedeadpunk | yeah, actually the dynamic inventory has some ... interesting design decisions that are kept from year to year. One of which is parsing by the `_hosts` suffix | 17:07 |
jrosser | there may also be some duplication there? | 17:08 |
noonedeadpunk | Also, today a question was raised internally: why are we using underscores in lxc container names and then having the overhead of replacing them with dashes for hostnames | 17:10 |
noonedeadpunk | And I was hardly able to answer that | 17:10 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2023.1: Bump SHAs for 2023.1 https://review.opendev.org/c/openstack/openstack-ansible/+/905346 | 17:11 |
TheCompWiz | noonedeadpunk: .... thanks a billion. the underscore was the culprit. (It seems to be building the utility container now) | 17:14 |
jrosser | noonedeadpunk: that also makes a mess in the k8s patches | 17:15 |
noonedeadpunk | do you have any good reason why we might want to keep that? | 17:16 |
noonedeadpunk | or at least just any reason :) | 17:16 |
jrosser | none :) | 17:16 |
jrosser | other than how actually to change it | 17:17 |
jrosser | for an existing deployment | 17:17 |
noonedeadpunk | I think for existing we should keep it | 17:17 |
noonedeadpunk | (at least until OS upgrade?) | 17:17 |
jrosser | so this means some mess in the dynamic inventory script? | 17:18 |
jrosser | “more mess” :) | 17:18 |
noonedeadpunk | haven't evaluated that yet to be frank | 17:18 |
noonedeadpunk | was trying to think of reasons why it's needed first :) | 17:19 |
jrosser | it would be nice to fix | 17:19 |
jrosser | external things assume hostname == inventory_hostname and break in odd ways when it’s not | 17:19 |
jrosser | the k8s collection used ansible_hostname_short a bunch and that was just the wrong value | 17:20 |
noonedeadpunk | Can't say it's a valid assumption though.... | 17:20 |
noonedeadpunk | but having an inventory_hostname with symbols that are invalid for a hostname is weird | 17:21 |
jrosser | true | 17:21 |
noonedeadpunk | ok, I guess I know where it's coming from... From group names, which are separated with underscores per ansible requirements... | 17:28 |
jrosser | does that point to the group name requirements being resolved in the wrong place? | 17:32 |
jrosser | like encoded in the inventory hostname when actually it should be fixed elsewhere | 17:33 |
jrosser | because in principle group name and inventory name are completely independent | 17:33 |
spatel | I have a very stupid question. One of my customers wants to download a cinder volume-snapshot, and for that he is uploading the volume to a glance image, and because the volume size is so freaking big it's filling the openstack controller disk :( | 17:39 |
noonedeadpunk | jrosser: nah, it more or less takes this as part of the container name: https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/env.d/utility.yml#L23 | 17:40 |
spatel | why is it using the local controller disk to turn the volume-snapshot into a glance image? (we have ceph, so it should convert it in glance, which is in ceph, right?) | 17:40 |
spatel | I found this solution - https://xahteiwi.eu/xahteiwi.eu/resources/hints-and-kinks/importing-rbd-into-glance/ | 17:48 |
spatel | is this the right way to convert a volume snapshot to a glance image which is in ceph? | 17:49 |
jrosser | spatel: you can only do that with glance v1 api | 18:00 |
spatel | Hmm! I can do - glance --os-image-api-version 1 | 18:01 |
spatel | my pain is I don't want my controller to get tied up when I upload a volume image to glance | 18:02 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: WIP - Bootstrapping playbook https://review.opendev.org/c/openstack/openstack-ansible-ops/+/902178 | 20:28 |
spatel | jrosser here you go - https://satishdotpatel.github.io/kubernetes-cluster-autoscaler-with-magnum-capi/ | 21:00 |
jrosser | spatel: what is creating the load? | 21:02 |
spatel | kubectl scale deployment --replicas=120 nginx-deployment | 21:02 |
spatel | add more replicas.. | 21:03 |
spatel | 110 pods is the hard limit of a single worker node... so it will demand more worker nodes | 21:03 |
jrosser | so this is likely also me not understanding how it works :) | 21:04 |
spatel | :) | 21:04 |
jrosser | it is not scaling up in response to some load then | 21:04 |
spatel | Next I am going to add a metrics server and use CPU load to scale up and down | 21:04 |
jrosser | right | 21:04 |
spatel | We can use CPU-based scaling; again the logic is the same, telling the autoscaler to trigger more nodes | 21:05 |
spatel | the current scaling is based on the scheduler not being able to schedule pods (they stay pending), and it will ask for more nodes. | 21:05 |
spatel | I will set up a lab for the metrics-server-based approach, where CPU/memory will be used to check the load | 21:06 |
spatel | I have to leave now!! I will catch you tomorrow | 21:07 |
jrosser | we are missing ZUUL_SRC_PATH here https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/gate-check-commit.sh#L97 | 21:34 |
jrosser | it is needed here https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/get-ansible-collection-requirements.yml#L34 | 21:34 |
jrosser | and here https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/get-ansible-role-requirements.yml#L58 | 21:34 |
jrosser | in the case where bootstrap is run by the gate-check-commit script | 21:35 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: DNM Allow to skip pre-step bootstrap https://review.opendev.org/c/openstack/openstack-ansible/+/905290 | 21:40 |
TheCompWiz | oh joy... new problems. Made progress on the new deployment... but it appears now that when the galera container gets created... the mysql permissions do not get set... so the keystone service still can't do the database configuration. | 22:17 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: DNM Allow to skip pre-step bootstrap https://review.opendev.org/c/openstack/openstack-ansible/+/905290 | 22:25 |
jrosser | TheCompWiz: can you paste any relevant output or specific things about what you believe to be wrong in the galera container? | 22:28 |
jrosser | there should be a configured mysql client installed in the utility container which can be useful to check if the database is reachable | 22:33 |
jrosser | it's also worth knowing that the database connection is via haproxy so you can check that haproxy backend for the database is up | 22:34 |
jrosser | hatop should have been installed on whichever node has haproxy, useful for debugging | 22:34 |
TheCompWiz | jrosser: sorry for the delay. kids & homework. /sigh. https://paste.openstack.org/show/bThWOPt2R6dsWxt8hAeK/ | 23:22 |
TheCompWiz | before a reboot... I could attach to the galera container, and mariadb was running, and I could even connect to the server, but none of the users were created to allow keystone (or anything else) to connect. After a reboot, re-run of all the setup-hosts and setup-infrastructure... mariadb isn't running anymore. | 23:25 |
TheCompWiz | I might have to deal with this tomorrow. I really appreciate all of your help jrosser. | 23:31 |