recyclehero | jrosser: I see your name for this all over the place. BTW I set up my first debug task right | 00:08 |
---|---|---|
recyclehero | Oct 15 03:31:40 infra1-utility-container-3e3911b0 ansible-openstack.cloud.os_project[12159]: Invoked with cloud=default state=present name=service description=Keystone Identity Service domain_id=default endpoint_type=admin validate_certs=True interface=admin wait=True timeout=180 properties={} enabled=True auth_type=None auth=NOT_LOGGING_PARAMETER region_name=None availability_zone=None | 00:08 |
recyclehero | ca_cert=None client_cert=None client_key=NOT_LOGGING_PARAMETER api_timeout=None | 00:08 |
recyclehero | task [OS_keystone: Add service project] | 00:08 |
recyclehero | error: openstacksdk required | 00:08 |
recyclehero | the log is from the utility container | 00:08 |
recyclehero | I added a debug task and saw that regardless of what I set, keystone_service_setup_host: "{{ groups['utility_all'][0] }}" resolves to the utility container | 00:10 |
recyclehero | I can't proceed with setup-openstack | 00:11 |
recyclehero | it's my restore-attempt deployment | 00:11 |
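For reference, the kind of debug task recyclehero describes is a one-liner; a minimal sketch (its placement within the keystone play is an assumption):

```yaml
- name: Show which host keystone service setup will delegate to
  debug:
    var: keystone_service_setup_host
```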
*** gillesMo has joined #openstack-ansible | 00:11 | |
recyclehero | jrosser: I think it didn't respect my --regen switch when recreating the tokens! | 00:15 |
*** macz_ has joined #openstack-ansible | 00:15 | |
*** rf0lc0 has joined #openstack-ansible | 00:18 | |
*** macz_ has quit IRC | 00:20 | |
*** MickyMan77 has quit IRC | 00:42 | |
*** MickyMan77 has joined #openstack-ansible | 00:42 | |
*** MickyMan77 has quit IRC | 00:51 | |
*** gyee has quit IRC | 01:04 | |
openstackgerrit | Merged openstack/openstack-ansible-galera_server stable/train: Bump galera version https://review.opendev.org/757483 | 01:05 |
*** rf0lc0 has quit IRC | 01:06 | |
*** macz_ has joined #openstack-ansible | 01:09 | |
*** macz_ has quit IRC | 01:14 | |
*** MickyMan77 has joined #openstack-ansible | 01:20 | |
*** NewJorg has quit IRC | 01:28 | |
*** MickyMan77 has quit IRC | 01:29 | |
*** cshen has joined #openstack-ansible | 01:36 | |
*** cshen has quit IRC | 01:40 | |
*** spatel has joined #openstack-ansible | 01:44 | |
*** MickyMan77 has joined #openstack-ansible | 02:03 | |
*** MickyMan77 has quit IRC | 02:11 | |
*** NewJorg has joined #openstack-ansible | 02:45 | |
*** MickyMan77 has joined #openstack-ansible | 02:46 | |
*** MickyMan77 has quit IRC | 02:55 | |
*** MickyMan77 has joined #openstack-ansible | 03:35 | |
*** MickyMan77 has quit IRC | 04:19 | |
*** MickyMan77 has joined #openstack-ansible | 04:20 | |
*** evrardjp has quit IRC | 04:33 | |
*** evrardjp has joined #openstack-ansible | 04:33 | |
*** MickyMan77 has quit IRC | 04:35 | |
*** spatel has quit IRC | 04:38 | |
*** MickyMan77 has joined #openstack-ansible | 04:39 | |
*** nurdie has quit IRC | 04:43 | |
*** MickyMan77 has quit IRC | 04:48 | |
*** MickyMan77 has joined #openstack-ansible | 04:50 | |
*** MickyMan77 has quit IRC | 05:00 | |
*** MickyMan77 has joined #openstack-ansible | 05:03 | |
*** MickyMan77 has quit IRC | 05:07 | |
*** MickyMan77 has joined #openstack-ansible | 05:10 | |
*** MickyMan77 has quit IRC | 05:16 | |
*** miloa has joined #openstack-ansible | 06:02 | |
*** jbadiapa has joined #openstack-ansible | 06:38 | |
*** andrewbonney has joined #openstack-ansible | 07:01 | |
*** MickyMan77 has joined #openstack-ansible | 07:05 | |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible-os_magnum master: Use openstack_service_*uri_proto vars by default https://review.opendev.org/410681 | 07:21 |
*** rpittau|afk is now known as rpittau | 07:22 | |
jrosser | morning | 07:25 |
*** cshen has joined #openstack-ansible | 07:26 | |
noonedeadpunk | morning | 07:26 |
noonedeadpunk | jrosser: did you get the idea of https://review.opendev.org/#/c/758207 vs https://review.opendev.org/#/c/737221/ ? | 07:27 |
*** yolanda__ has joined #openstack-ansible | 07:27 | |
noonedeadpunk | and what does the first one even do? | 07:28 |
masterpe | Isn't it better to look at the number of cores/CPUs that are available? | 07:28 |
*** maharg101 has joined #openstack-ansible | 07:29 | |
masterpe | to determine the number of threads? | 07:29 |
jrosser | i am not sure i understand the second patch | 07:30 |
jrosser | https://review.opendev.org/#/c/758207/ <- that one | 07:30 |
noonedeadpunk | technically the second one posted is yours, but I guess you don't get the other one (same for me) | 07:30 |
noonedeadpunk | well, Adri2000 posted a comment on 737221 and that's how it came to my attention | 07:31 |
jrosser | the sed looks weird | 07:32 |
noonedeadpunk | and what `ANSIBLE_FORKS_VALUE` is - I have no idea | 07:32 |
jrosser | no me neither | 07:32 |
*** cloudnull has quit IRC | 07:33 | |
noonedeadpunk | ok, good, then it's not me not seeing something obvious :) or at least not only me | 07:33 |
jrosser | my number of 20 was kind of arbitrary | 07:33 |
jrosser | based on the recommendations for AIO cpus really | 07:34 |
noonedeadpunk | but an AIO kind of requires way more CPUs than are needed for just a deploy host? | 07:34 |
*** cloudnull has joined #openstack-ansible | 07:34 | |
jrosser | but it's based on the maximum number of containers in the control plane | 07:34 |
jrosser | yeah but it's deploy host threads though, and i'm not sure that means fully utilised CPUs | 07:35 |
jrosser | it's number of parallel tasks | 07:35 |
noonedeadpunk | like I have 2 cpus for some deploy host :p | 07:35 |
jrosser | right, but if you run 20 tasks in parallel which each take N seconds on the target you dont need N cpus on the deploy host at 100% to do that? | 07:36 |
jrosser | ^ confused use of N there, sorry | 07:36 |
jrosser | right, but if you run 20 tasks in parallel which each take N seconds on the target you dont need 20 cpus on the deploy host at 100% to do that? | 07:36 |
noonedeadpunk | depends on how many seconds each task takes, honestly | 07:36 |
noonedeadpunk | in terms of threads and cpu cycles | 07:37 |
jrosser | hmm well maybe we need something dynamic then | 07:38 |
noonedeadpunk | and what really disturbs me is the number of ssh sessions here. I know we talked about that, but we need to guarantee that we do increase them | 07:38 |
noonedeadpunk | or deployers will | 07:38 |
jrosser | what i saw was with an AIO (8 cpu / 8G) it would run the tasks in several batches, particularly for lots of containers on the controller | 07:39 |
jrosser | and you could get a good speedup by increasing the forks to make it do just one batch | 07:39 |
noonedeadpunk | and for aio we have https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/bootstrap-aio.yml#L69 which is not applied to regular deployments | 07:40 |
noonedeadpunk | and we also need to adjust this https://opendev.org/openstack/openstack-ansible/src/branch/master/doc/source/admin/maintenance-tasks/ansible-modules.rst#user-content-ansible-forks | 07:40 |
jrosser | tbh that feels like out-of-date info | 07:41 |
jrosser | but if this is difficult we can leave it, or just make an optimisation for CI | 07:42 |
noonedeadpunk | well, partially | 07:42 |
Adri2000 | in https://review.opendev.org/#/c/758207/ ANSIBLE_FORKS_VALUE is a placeholder in openstack-ansible.rc ... and that is replaced by the actual value | 07:42 |
noonedeadpunk | and what is that value? | 07:42 |
*** tosky has joined #openstack-ansible | 07:42 | |
Adri2000 | it's computed in scripts-library.sh | 07:43 |
Adri2000 | my patch assumes we don't remove https://review.opendev.org/#/c/737221/2/scripts/scripts-library.sh | 07:43 |
noonedeadpunk | well, it doesn't have ANSIBLE_FORKS_VALUE | 07:43 |
noonedeadpunk | and according to http://codesearch.openstack.org/?q=ANSIBLE_FORKS_VALUE&i=nope&files=&repos= this var is introduced with that patch | 07:44 |
Adri2000 | `sed -i "s|ANSIBLE_FORKS_VALUE|${ANSIBLE_FORKS}|g" /usr/local/bin/openstack-ansible.rc` < the ANSIBLE_FORKS_VALUE placeholder in openstack-ansible.rc is replaced by ${ANSIBLE_FORKS} which is computed in scripts-library.sh | 07:44 |
* jrosser confused | 07:44 | |
Adri2000 | this change https://review.opendev.org/#/c/758207/3/scripts/openstack-ansible.rc makes sure the ANSIBLE_FORKS env var is defined to either a user defined ANSIBLE_FORKS env var or to ANSIBLE_FORKS_VALUE which will be an actual number (it will have been replaced by bootstrap-ansible.sh) | 07:46 |
Adri2000 | this change https://review.opendev.org/#/c/758207/3/scripts/bootstrap-ansible.sh replaces ANSIBLE_FORKS_VALUE with the value ${ANSIBLE_FORKS} which is computed in scripts-library.sh (https://review.opendev.org/#/c/737221/2/scripts/scripts-library.sh) | 07:46 |
noonedeadpunk | I still didn't get why in the world we need an ANSIBLE_FORKS_VALUE env var | 07:47 |
Adri2000 | there is no ANSIBLE_FORKS_VALUE env var | 07:47 |
noonedeadpunk | and why we need to replace something that does not exist? | 07:47 |
noonedeadpunk | well, and you introduce it here ? https://review.opendev.org/#/c/758207/3/scripts/openstack-ansible.rc | 07:48 |
Adri2000 | `export ANSIBLE_FORKS="${ANSIBLE_FORKS:-ANSIBLE_FORKS_VALUE}"` will become e.g. `export ANSIBLE_FORKS="${ANSIBLE_FORKS:-10}"` after you've run bootstrap-ansible | 07:48 |
noonedeadpunk | by defining it as default for ANSIBLE_FORKS? | 07:48 |
Adri2000 | I define an ANSIBLE_FORKS env var | 07:48 |
Adri2000 | just look at the line before, it's the same model: `export ANSIBLE_PYTHON_INTERPRETER="${ANSIBLE_PYTHON_INTERPRETER:-OSA_ANSIBLE_PYTHON_INTERPRETER}"` | 07:48 |
Adri2000 | OSA_ANSIBLE_PYTHON_INTERPRETER is a placeholder as well | 07:49 |
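Pulling Adri2000's explanation together, the placeholder mechanism works like this (a sketch; the value 10 is an example):

```sh
# scripts/openstack-ansible.rc as shipped in the repo (template):
export ANSIBLE_FORKS="${ANSIBLE_FORKS:-ANSIBLE_FORKS_VALUE}"

# bootstrap-ansible.sh substitutes the placeholder with the value
# computed in scripts-library.sh:
sed -i "s|ANSIBLE_FORKS_VALUE|${ANSIBLE_FORKS}|g" /usr/local/bin/openstack-ansible.rc

# resulting line in the installed rc file:
export ANSIBLE_FORKS="${ANSIBLE_FORKS:-10}"
```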
noonedeadpunk | but we have this var defined https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/bootstrap-ansible.sh#L59 | 07:49 |
jrosser | imho this is confusing because of the overloading of ANSIBLE_FORKS | 07:49 |
Adri2000 | yes | 07:49 |
Adri2000 | this https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/bootstrap-ansible.sh#L46 defines ${ANSIBLE_FORKS} | 07:50 |
jrosser | to make this clean the only place that ANSIBLE_FORKS should be defined is in the .rc file | 07:50 |
jrosser | and we should have a different var OSA_ANSIBLE_FORKS calculated in scripts library which becomes the default value, based on the the deploy host CPUs | 07:50 |
noonedeadpunk | +1 | 07:51 |
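A sketch of what jrosser is proposing (the names and the CPU heuristic are illustrative, not merged code):

```sh
# scripts-library.sh: compute a default based on deploy host CPUs,
# capped at 20 as discussed above (this heuristic is an assumption)
CPUS=$(nproc)
OSA_ANSIBLE_FORKS=$(( CPUS > 20 ? 20 : CPUS ))

# openstack-ansible.rc: the only place ANSIBLE_FORKS gets defined,
# still deferring to any user-supplied value
export ANSIBLE_FORKS="${ANSIBLE_FORKS:-OSA_ANSIBLE_FORKS_VALUE}"
```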
Adri2000 | that's fine by me. I believe my patch works as is, but I can understand you find the naming confusing. I kind of wanted to make the patch as small as possible to fix the actual problem I had. (I had to use -f X on each openstack-ansible run to define the number of forks) | 07:52 |
jrosser | making the code more obvious is a good thing, so if the patch is a bit bigger then thats totally OK | 07:54 |
noonedeadpunk | another option for me would be just dropping ANSIBLE_FORKS_VALUE and the sed related to it | 07:54 |
noonedeadpunk | but I like jrosser's idea | 07:54 |
noonedeadpunk | but actually, I think we should decide whether we just want some static bump | 07:55 |
noonedeadpunk | as I like this idea as well, except for the nuance with MaxSessions | 07:55 |
noonedeadpunk | but considering that it will be pretty easy to override in bashrc... | 07:56 |
noonedeadpunk | maybe we should really just increase number of forks for aio? | 07:57 |
jrosser | simple is good too | 07:58 |
noonedeadpunk | but yeah, again, with my 2 cores I can really run way more threads than 2 | 07:58 |
jrosser | good question | 08:02 |
noonedeadpunk | so I see 3 roads: keep it based on the number of CPUs (which is not so effective, considering we should probably multiply it) and make use of ANSIBLE_FORKS; add a MaxSessions bump for sshd on the deploy host somehow; or just set a fixed number like 10? | 08:02 |
noonedeadpunk | well, actually starting to use ANSIBLE_FORKS will be applicable anyway | 08:03 |
jrosser | yes, because we can't even make an AIO/CI special case without that | 08:03 |
noonedeadpunk | so https://review.opendev.org/#/c/758207/ might be a good and backportable shot | 08:04 |
jrosser | fwiw i am using LXDs for deploy hosts, a bunch of them on the same machine | 08:04 |
jrosser | so they all have all the host CPUs should they need them | 08:04 |
jrosser | yes i think https://review.opendev.org/#/c/758207/ is good, if the variable names get regularised to OSA_.... for things replaced in the rc file | 08:06 |
noonedeadpunk | well, we actually need to bump ssh sessions just for lxc hosts | 08:06 |
Adri2000 | jrosser: will prepare and push that change right now so you can both have a look | 08:08 |
jrosser | recyclehero: the openstacksdk is installed into the utility container venv here https://github.com/openstack/openstack-ansible/blob/master/playbooks/utility-install.yml#L176 | 08:09 |
jrosser | recyclehero: the list of things which get installed into the utility venv is here https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/utility_all.yml#L75 | 08:10 |
jrosser | recyclehero: handlers run at the end of a play, but only if the task which notifies them has a status of 'changed' | 08:11 |
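A generic illustration of the handler behaviour jrosser describes (not OSA code):

```yaml
- hosts: all
  tasks:
    - name: Notify the handler only when this task reports 'changed'
      copy:
        dest: /tmp/example.conf
        content: "setting=1\n"
      notify: restart example

  handlers:
    # runs at the end of the play, and only if notified
    - name: restart example
      debug:
        msg: "handler triggered"
```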
openstackgerrit | Adrien Cunin proposed openstack/openstack-ansible master: Actually use ANSIBLE_FORKS in openstack-ansible.rc https://review.opendev.org/758207 | 08:11 |
*** CeeMac has joined #openstack-ansible | 08:12 | |
noonedeadpunk | I wish we dropped these seds | 08:24 |
*** yolanda__ has quit IRC | 08:24 | |
*** yolanda__ has joined #openstack-ansible | 08:24 | |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible-lxc_hosts master: Increase amount of MaxSessions https://review.opendev.org/758364 | 08:25 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible master: Increase default ansible forks from 5 to 20 https://review.opendev.org/737221 | 08:30 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible master: Increase default ansible forks from 5 to 20 https://review.opendev.org/737221 | 08:30 |
* noonedeadpunk still can't understand 758207 and why we set the default of ANSIBLE_FORKS to the undefined OSA_ANSIBLE_FORKS ;( | 08:33 | |
noonedeadpunk | that's kind of what we're doing http://paste.openstack.org/show/799066/ | 08:36 |
recyclehero | morning | 08:36 |
noonedeadpunk | ah ok | 08:36 |
recyclehero | jrosser: so I should see openstacksdk in /openstack/venvs/utility-21.0.1/bin inside the openstack container | 08:36 |
noonedeadpunk | in case we have ANSIBLE_FORKS defined we will replace it with the actual number | 08:36 |
recyclehero | *utility container | 08:37 |
jrosser | openstacksdk is the name of the python package | 08:37 |
recyclehero | for some reason it isn't present on the utility container | 08:38 |
recyclehero | or maybe a symlink isn't created | 08:39 |
recyclehero | some of these tasks have run_once on them | 08:40 |
recyclehero | could that be causing this? | 08:40 |
jrosser | how have you confirmed that openstacksdk is not present? | 08:41 |
recyclehero | task [OS_keystone: Add service project] error | 08:42 |
recyclehero | it says openstack sdk required | 08:42 |
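A more direct check than relying on the task error would be something like this inside the utility container (a sketch; the venv path is from recyclehero's earlier message):

```sh
/openstack/venvs/utility-21.0.1/bin/pip show openstacksdk
/openstack/venvs/utility-21.0.1/bin/python -c "import openstack"
```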
jrosser | you said you have some trouble with deploying the utility container | 08:44 |
jrosser | if that's somehow not worked then the rest of the roles are not going to work | 08:44 |
recyclehero | can I delete the container and redeploy it? | 08:45 |
jrosser | you can do that, yes | 08:47 |
recyclehero | the link is there openstack -> /openstack/venvs/utility-21.0.1/bin/openstack | 08:47 |
recyclehero | but the actual binary ain't | 08:47 |
recyclehero | so delete with lxc and run utility-install | 08:49 |
jrosser | you could try re-running the utility playbook with -e venv_rebuild=yes | 08:49 |
recyclehero | doh, too late. msg": "Destination /var/lib/lxc/infra1_utility_container-3e3911b0/config does not exist ! | 08:51 |
recyclehero | I think I should do setup-host too | 08:51 |
jrosser | recyclehero: there is a playbook specifically to create the containers, and you can use --limit to make it very specific which will speed things up considerably | 08:56 |
jrosser | it's worth spending some time understanding what's inside the setup-*.yml playbooks, because all of the more granular things can be called directly | 08:57 |
recyclehero | I will go with --limit first to see how it works, then I will check that out. thanks | 08:58 |
jrosser | https://docs.openstack.org/openstack-ansible/latest/admin/maintenance-tasks.html#destroy-and-recreate-containers | 09:01 |
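The documented flow boils down to something like this (a sketch based on the linked page; the container name is from recyclehero's log, and you may also need the parent host in the --limit):

```sh
# destroy just the broken container, recreate it, then rerun its playbook
openstack-ansible lxc-containers-destroy.yml --limit infra1_utility_container-3e3911b0
openstack-ansible lxc-containers-create.yml --limit infra1_utility_container-3e3911b0
openstack-ansible utility-install.yml
```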
MickyMan77 | which openstack-ansible version is the latest stable for use with CentOS 8? | 09:09 |
jrosser | MickyMan77: that would be 21.1.0 on the ussuri branch | 09:11 |
jrosser | noonedeadpunk: we need to fix the uwsgi role https://review.opendev.org/#/c/758108/ | 09:17 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Fix tempest init logic https://review.opendev.org/753393 | 09:26 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-galera_server master: Update galera to 10.5.6 https://review.opendev.org/742105 | 09:29 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-lxc_hosts master: Increase amount of MaxSessions https://review.opendev.org/758364 | 09:31 |
recyclehero | when does password setting take place? like for the placement service in keystone | 09:31 |
jrosser | https://github.com/openstack/openstack-ansible-os_placement/blob/master/tasks/main.yml#L87-L115 | 09:33 |
*** Nick_A has quit IRC | 09:34 | |
openstackgerrit | Merged openstack/openstack-ansible-os_placement stable/train: Trigger service restart https://review.opendev.org/757745 | 09:45 |
openstackgerrit | Merged openstack/openstack-ansible-os_cinder stable/ussuri: Trigger uwsgi restart https://review.opendev.org/757712 | 10:02 |
openstackgerrit | Erik Berg proposed openstack/openstack-ansible stable/ussuri: WIP/DNM: Upgrade ceph to octopus during run-upgrade.sh to ussuri https://review.opendev.org/758382 | 10:07 |
*** admin0 has quit IRC | 10:27 | |
*** yolanda__ is now known as yolanda | 10:52 | |
recyclehero | guys how do you deploy without getting logged out as an effect of security_hardening of the hosts? nohup, output redirection? | 11:08 |
recyclehero | I deploy from infra1 | 11:11 |
openstackgerrit | Merged openstack/ansible-role-uwsgi master: Add vars file for ubuntu bionic https://review.opendev.org/758108 | 11:11 |
recyclehero | when it takes a while, for example on wheel builds, I get logged out either on ssh or localhost login | 11:13 |
*** dave-mccowan has joined #openstack-ansible | 11:16 | |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible master: Fix infra jobs https://review.opendev.org/758399 | 11:20 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible-galera_server master: Update galera to 10.5.6 https://review.opendev.org/742105 | 11:21 |
*** yann-kaelig has joined #openstack-ansible | 11:32 | |
ebbex | recyclehero: i deploy from a vm with screen, so i'm not sure, but you can set "security_rhel7_session_timeout: 0" in user_variables.yml, and "ServerAliveInterval 300" in your .ssh/config | 11:33 |
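Spelling out ebbex's two settings (a sketch; the host alias is an example):

```yaml
# /etc/openstack_deploy/user_variables.yml on the deploy host
security_rhel7_session_timeout: 0
```

```
# ~/.ssh/config on the machine you ssh *from*
Host infra1
    ServerAliveInterval 300
```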
openstackgerrit | Merged openstack/openstack-ansible-os_nova master: Enable notifications when Designate is enabled https://review.opendev.org/757904 | 11:37 |
openstackgerrit | Merged openstack/openstack-ansible-os_nova stable/ussuri: Simplify scheduler filter additions https://review.opendev.org/757858 | 11:37 |
openstackgerrit | Erik Berg proposed openstack/openstack-ansible-os_nova stable/train: Simplify scheduler filter additions https://review.opendev.org/758404 | 11:43 |
recyclehero | ebbex: thanks, I changed the ServerAliveInterval. | 11:46 |
recyclehero | but it gets overwritten by ansible! | 11:53 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible-os_nova stable/ussuri: Remove backported release note https://review.opendev.org/758406 | 11:54 |
jrosser | recyclehero: ebbex uses a separate deploy host ("i deploy from a vm") so i believe he was referring to that host, not infra1 | 11:55 |
recyclehero | it actually didn't overwrite ServerAliveInterval in sshd_config, but I set it to 0 and still got disconnected. I should look for something like a session timeout in the playbook and reverse it. it kicks me out in the middle of deployment | 11:57 |
jrosser | if you are deploying from one of the target hosts then you'll need to set variables that change what the hardening role does | 11:57 |
ebbex | recyclehero: ServerAliveInterval 300 in .ssh/config on your local machine. The one you ssh *from* into infra1. | 11:58 |
jrosser | ebbex: i think infra1 is the deploy host here | 11:58 |
ebbex | yeah, and he's getting disconnected from infra1 right? | 11:59 |
jrosser | won't the session timeout always win? | 11:59 |
ebbex | session timeout will win yes. | 12:00 |
jrosser | i think your advice to adjust security_rhel7_session_timeout is whats needed | 12:00 |
ebbex | cause he's deploying from a host that gets deployed to? | 12:00 |
jrosser | either way i think? | 12:01 |
recyclehero | jrosser: yes, but I am on debian. | 12:01 |
recyclehero | yes it's common, as I don't see anything distro-specific in the role | 12:02 |
ebbex | he probably needs both. one to prevent session timeout on infra1, and two to prevent ssh disconnects from infra1 to his computer. | 12:02 |
openstackgerrit | Georgina Shippey proposed openstack/ansible-role-systemd_service master: Greater flexibility to timer templating https://review.opendev.org/758408 | 12:05 |
openstackgerrit | Adrien Cunin proposed openstack/openstack-ansible-os_nova stable/ussuri: Enable notifications when Designate is enabled https://review.opendev.org/758411 | 12:09 |
recyclehero | it's well hardened! declare -rx TMOUT="600" - can't easily override it (read-only). I am going cuckoo :D | 12:09 |
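This is what recyclehero is running into: the hardening role exports TMOUT read-only, and a readonly variable can't be reassigned in the same shell:

```sh
$ declare -rx TMOUT="600"
$ TMOUT=0
-bash: TMOUT: readonly variable
```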
openstackgerrit | Adrien Cunin proposed openstack/openstack-ansible-os_nova stable/train: Enable notifications when Designate is enabled https://review.opendev.org/758412 | 12:09 |
*** cshen has quit IRC | 12:09 | |
openstackgerrit | Adrien Cunin proposed openstack/openstack-ansible-os_nova stable/stein: Enable notifications when Designate is enabled https://review.opendev.org/758413 | 12:09 |
recyclehero | at least now I know I should press enter every 10 minutes. | 12:11 |
ebbex | that's what setting security_rhel7_session_timeout is for, no? | 12:12 |
ebbex | recyclehero: ^ | 12:12 |
*** rf0lc0 has joined #openstack-ansible | 12:15 | |
openstackgerrit | Erik Berg proposed openstack/openstack-ansible-os_nova stable/train: Simplify scheduler filter additions https://review.opendev.org/758404 | 12:17 |
recyclehero | ebbex: I wanted to do that in place. I am on it now | 12:22 |
jrosser | noonedeadpunk: i think for stable branch octavia jobs we are really stuck like this https://review.opendev.org/#/c/672556/ | 12:31 |
jrosser | i don't think there is a suitable amphora image for that openstack release | 12:31 |
noonedeadpunk | jrosser: let's just disable the tests then? | 12:34 |
jrosser | yep - that would do it, the patch makes sense otherwise | 12:35 |
noonedeadpunk | btw, any thoughts about https://review.opendev.org/#/c/752059/ ? | 12:35 |
jrosser | yeah - i think we are having similar issues | 12:35 |
jrosser | just not got round to check | 12:36 |
noonedeadpunk | got it | 12:36 |
noonedeadpunk | (I just got pinged in the related bug report about what state it's in) | 12:36 |
jrosser | i'll see if we can test it, but it's not going to be immediate | 12:37 |
jrosser | we try to collect logs with journalbeat only on the hosts, and i think some were missing | 12:40 |
jrosser | which could be broken mounts | 12:40 |
*** macz_ has joined #openstack-ansible | 12:40 | |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Fix tempest init logic https://review.opendev.org/753393 | 12:43 |
*** macz_ has quit IRC | 12:45 | |
*** cshen has joined #openstack-ansible | 12:57 | |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-os_octavia stable/rocky: Save iptables rules for all Debian derivative operating systems https://review.opendev.org/672556 | 13:04 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-lxc_hosts master: Increase amount of MaxSessions https://review.opendev.org/758364 | 13:05 |
MickyMan77 | Finally :) I have a working openstack farm except for the network part. I can see dhcp requests from the instances on the br-vlan interface when i do a tcpdump. | 13:09 |
MickyMan77 | The instances do not get any ip address on the nic. ----> http://paste.openstack.org/show/799080/ | 13:09 |
jrosser | MickyMan77: there are some quite comprehensive checklists here https://docs.openstack.org/openstack-ansible/latest/admin/troubleshooting.html | 13:22 |
openstackgerrit | Merged openstack/openstack-ansible-os_nova stable/ussuri: Remove backported release note https://review.opendev.org/758406 | 13:25 |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible-os_ironic master: Updated from OpenStack Ansible Tests https://review.opendev.org/755536 | 13:30 |
*** nurdie has joined #openstack-ansible | 13:36 | |
openstackgerrit | Merged openstack/openstack-ansible-lxc_hosts stable/train: copy the actual keyring https://review.opendev.org/731626 | 13:37 |
*** sshnaidm has quit IRC | 13:54 | |
openstackgerrit | Jonathan Rosser proposed openstack/openstack-ansible master: Switch from ansible-base + collections to ansible package https://review.opendev.org/758431 | 13:59 |
*** sshnaidm has joined #openstack-ansible | 14:00 | |
jrosser | goodness me how can ansible galaxy be *so* unreliable | 14:21 |
jrosser | like 80% of our jobs are failing | 14:21 |
*** macz_ has joined #openstack-ansible | 14:28 | |
*** gshippey has joined #openstack-ansible | 14:29 | |
* noonedeadpunk in ansible contributor summit and going to rent about that | 14:32 | |
noonedeadpunk | *rant | 14:32 |
*** macz_ has quit IRC | 14:33 | |
noonedeadpunk | jrosser: do you have the bug report you wrote somewhere handy? | 14:34 |
jrosser | hmm just need to find it! | 14:36 |
jrosser | noonedeadpunk: https://github.com/ansible/galaxy/issues/2302 | 14:37 |
noonedeadpunk | thanks! | 14:37 |
*** rgogunskiy has joined #openstack-ansible | 14:44 | |
*** miloa has quit IRC | 14:53 | |
*** macz_ has joined #openstack-ansible | 15:01 | |
Adri2000 | would be happy to get a few more opinions on https://review.opendev.org/#/c/729533/ - I think the topic basically boils down to how we define "data" in the context of OSA LXC containers. I have always assumed that OSA LXC containers' "data" were the directories bind mounted from /openstack/... into the LXC containers, such as /var/lib/mysql/ for galera containers. which means that there | 15:03 |
Adri2000 | is no actual "data" for most containers (galera being the main exception); i.e. it's possible to completely destroy/delete and then recreate from scratch (well as long as the OSA inventory is there) most of the containers. | 15:03 |
openstackgerrit | Merged openstack/openstack-ansible master: Remove glance-registry from docs https://review.opendev.org/739794 | 15:11 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible master: Make collections installation more reliable https://review.opendev.org/758454 | 15:22 |
jrosser | ^ nice :) | 15:23 |
jrosser | noonedeadpunk: i was also thinking about the job failures we get downloading upper-constraints.txt, we do that a lot on each run | 15:24 |
jrosser | that could be recovered from the requirements git repo on the CI node and maybe we put it in /openstack/requirements/<sha>/upper-constraints.txt or something | 15:26 |
jrosser | then change the url to file://.... | 15:26 |
noonedeadpunk | well, I think we can set requirements as required project? | 15:26 |
noonedeadpunk | and then zuul will get it - the only thing we need is to update a single variable in ci? | 15:27 |
jrosser | right - but we set a specific sha and we'd need a specific step to extract just the right version of the file | 15:27 |
noonedeadpunk | well, yes... | 15:27 |
jrosser | and putting it in /openstack/... makes it also be inside all the lxc :) | 15:27 |
jrosser | it's nowhere near as bad as the galaxy thing but maybe the next most frequent thing that breaks | 15:28 |
recyclehero | this python_venv_build behaves very strangely. sometimes on upgrades it retries 5 times for the version it wants, even after some failed runs which should have them locally. | 15:30 |
recyclehero | and now this | 15:30 |
recyclehero | fatal: [infra1_horizon_container-9cb968e5 -> 172.29.239.138]: FAILED! => {"changed": false, "msg": "file not found: /var/www/repo/os-releases/21.0.1/horizon-21.0.1-constraints.txt"} | 15:30 |
recyclehero | this is the task; but I think if I rerun it, it disappears | 15:31 |
recyclehero | TASK [python_venv_build : Slurp up the constraints file for later re-deployment] **** | 15:31 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible master: Make collections installation more reliable https://review.opendev.org/758454 | 15:32 |
*** gyee has joined #openstack-ansible | 15:39 | |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible master: Make collections installation more reliable https://review.opendev.org/758454 | 15:44 |
*** rpittau is now known as rpittau|afk | 15:52 | |
*** dave-mccowan has quit IRC | 15:53 | |
*** dave-mccowan has joined #openstack-ansible | 15:57 | |
recyclehero | I am blaming the https connection to releases.openstack.org | 15:57 |
recyclehero | what is the variable to make it retry like crazy rather than aborting the deployment after 5 tries? | 15:59 |
jrosser | there has been an outage at releases.openstack.org this afternoon | 16:03 |
jrosser | it should be back now | 16:04 |
recyclehero | not very stable | 16:04 |
recyclehero | but this one is sticky: TASK [python_venv_build : Slurp up the constraints file for later re-deployment] | 16:04 |
recyclehero | fatal: [infra1_horizon_container-9cb968e5 -> 172.29.239.138]: FAILED! => {"changed": false, "msg": "file not found: /var/www/repo/os-releases/21.0.1/horizon-21.0.1-constraints.txt"} | 16:04 |
recyclehero | what should I do? | 16:05 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible master: Make collections installation more reliable https://review.opendev.org/758454 | 16:07 |
recyclehero | I don't even know what Slurp means | 16:09 |
recyclehero | But should it be an address like this? https://opendev.com/openstack/requirements/raw/21.0.1/horizon-21.0.1-constraints.txt | 16:18 |
recyclehero | it's not correct, I am trying to put it in there manually | 16:19 |
*** spatel has joined #openstack-ansible | 16:21 | |
*** MickyMan77 has quit IRC | 16:25 | |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible master: Fix upgrade jobs for bind-to-mgmt https://review.opendev.org/758461 | 16:25 |
recyclehero | ignore_errors: yes :( | 16:29 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible-galera_server stable/stein: Bump galera version https://review.opendev.org/758462 | 16:30 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible-galera_client stable/stein: Bump galera version https://review.opendev.org/758464 | 16:31 |
jamesdenton | jrosser with your haproxy and baremetal efforts, have you seen a need to override openstack_service_bind_address for uwsgi-based services? | 16:35 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible-os_magnum master: Fix linter errors https://review.opendev.org/755569 | 16:55 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/openstack-ansible master: Fix upgrade jobs for bind-to-mgmt https://review.opendev.org/758461 | 16:58 |
jrosser | jamesdenton: no i've not - if you're having to override that maybe we have a wrong default somewhere | 17:02 |
jrosser | do you have an example? | 17:02 |
jamesdenton | well, i have haproxy running on the same node as ironic-api (on baremetal). default for uwsgi host ip is 0.0.0.0, but haproxy is already listening on the same ports | 17:03 |
jrosser | the patch that enabled bind-to-mgmt should have set openstack_service_bind_address to something like {{ management_address }} iirc | 17:04 |
jamesdenton | i might not have that patch | 17:04 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/all/all.yml#L38 | 17:07 |
noonedeadpunk | it's only in master iirc | 17:07 |
jamesdenton | bueno. i overrode to ansible_host, but looks good | 17:08 |
jamesdenton | running ussuri here | 17:08 |
jamesdenton | thank you | 17:08 |
noonedeadpunk | jamesdenton: and you're running U on metal? | 17:08 |
noonedeadpunk | I guess you are if you're asking :) | 17:09 |
noonedeadpunk | it kind of means that we've failed with our plan to make it a CI-only thing jrosser | 17:09 |
jamesdenton | kinda sorta. this is actually my home lab setup, which started as a rocky? environment a couple of years ago and has been upgraded. slowly moving lxc->baremetal | 17:10 |
jamesdenton | we run Stein on baremetal in prod, but have dedicated haproxy nodes there | 17:10 |
noonedeadpunk | jamesdenton: well, I think we will need some migration plan then.... | 17:10 |
noonedeadpunk | as what I did was https://review.opendev.org/#/c/758461/ | 17:11 |
noonedeadpunk | (not sure it's working at all at the moment) | 17:11 |
noonedeadpunk | but not suitable for prod for sure | 17:11 |
jamesdenton | or stein environments are greenfield, fwiw | 17:11 |
jamesdenton | *our | 17:12 |
jamesdenton | my migration plan has been to copy the env.d files to /etc/openstack_deploy/env.d per service, set baremetal=true, remove the existing lxc container from inventory, regenerate inventory, and redeploy the respective service. But these oddities show up, like haproxy. i'd have to be much more methodical about it | 17:13 |
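For reference, the env.d override jamesdenton describes typically looks like this (a sketch; the service and container names are examples, and the flag in OSA env.d files is is_metal under properties):

```yaml
# /etc/openstack_deploy/env.d/cinder.yml (copied from inventory/env.d)
container_skel:
  cinder_api_container:
    properties:
      is_metal: true
```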
noonedeadpunk | Well, I meant a bare metal deployment with external haproxy/tons of overrides on U, going to V bare metal with our bind-to-mgmt | 17:15 |
noonedeadpunk | and the bad thing about that is ppl might have tons of different workarounds.... | 17:16 |
jamesdenton | agreed | 17:16 |
noonedeadpunk | offtopic - dunno how to comment on that (it's just the beginning of the work) https://github.com/ansible-collections/cloud.roles | 17:17 |
recyclehero | guys in the repo container /var/www/repo/os-releases/21.0.1 every service has 4 files but horizon is missing the constraints one? | 17:18 |
recyclehero | . | 17:18 |
jamesdenton | i'm curious as to which ton of overrides you're referring to for external haproxy | 17:18 |
noonedeadpunk | I was referring to overrides in the case of internal haproxy) | 17:19 |
jamesdenton | oh ok | 17:19 |
recyclehero | what are these .txt files | 17:19 |
recyclehero | ? | 17:19 |
jamesdenton | yeah, well, like you said, who knows how many people were/are actually doing that? | 17:19 |
jamesdenton | i kinda feel like if you go off the reservation, you're on your own, to an extent. But if there's a "sanctioned" architecture and migration plan, that's the one you test against | 17:20 |
*** andrewbonney has quit IRC | 17:20 | |
noonedeadpunk | well yes, fair | 17:20 |
recyclehero | openstack-ansible repo-install.yml -e "venv_rebuild=True" | 17:29 |
recyclehero | would this help? | 17:29 |
noonedeadpunk | recyclehero: help with what? | 17:31 |
noonedeadpunk | you have error installing horizon? | 17:31 |
recyclehero | yes | 17:32 |
recyclehero | it complains about a file missing | 17:32 |
recyclehero | horizon-21.0.1-constraints.txt | 17:32 |
recyclehero | it should be in the repo server, and I checked it's there for them all except horizon. | 17:33 |
noonedeadpunk | openstack-ansible os-horizon-install.yml -e "venv_rebuild=True" | 17:33 |
recyclehero | then continue with setup-openstack ? | 17:34 |
noonedeadpunk | well, either this or manually run the rest of the playbooks for the services that you want to deploy | 17:37 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/setup-openstack.yml#L25-L50 | 17:38 |
noonedeadpunk | but for core I think horizon is one of the last ones | 17:38 |
recyclehero | noonedeadpunk: btw is this in 21.1.0? | 17:41 |
recyclehero | https://review.opendev.org/#/c/751724/ | 17:43 |
noonedeadpunk | um...... it's not, but I was pretty sure that it was... | 17:44 |
noonedeadpunk | ah wait | 17:44 |
*** MickyMan77 has joined #openstack-ansible | 17:48 | |
noonedeadpunk | recyclehero: yep, it's included | 17:49 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible/src/tag/21.1.0/ansible-role-requirements.yml#L44 | 17:49 |
recyclehero | great, how did you find out that this commit is included in that lxc_hosts version? | 17:53 |
noonedeadpunk | well by the commit SHA | 17:57 |
noonedeadpunk | you may see that for stable/ussuri that sha is exactly this commit https://opendev.org/openstack/openstack-ansible-lxc_hosts/commits/branch/stable/ussuri | 17:58 |
recyclehero | got it thanks | 17:59 |
MickyMan77 | Hi, I ran into a problem with cloud-init and SSH keys. "no authorized SSH keys fingerprints found for user debian" --> http://paste.openstack.org/show/799088/ | 18:00 |
MickyMan77 | is there any easy way to fix it? | 18:00 |
fridtjof[m] | Hey, I'm encountering a lot of instability in an environment. (especially when creating instances) nova-api-wsgi is losing connection to rabbitmq a lot - rabbit is closing the connections due to missing heartbeats, which would indicate that nova-api-wsgi is somehow failing to send those properly? | 18:01 |
fridtjof[m] | any ideas? | 18:01 |
fridtjof[m] | this env is on train 20.1.7 | 18:01 |
spatel | MickyMan77: did you try another distro? | 18:01 |
MickyMan77 | same issue with debian and centos 8 | 18:02 |
fridtjof[m] | whoa, i just got a huge exception in the log | 18:03 |
spatel | MickyMan77: I think the neutron metadata service provides that function, so make sure it's up and running - https://docs.openstack.org/nova/latest/user/metadata.html | 18:03 |
fridtjof[m] | http://paste.openstack.org/show/799093/ | 18:04 |
recyclehero | MickyMan77: these come to mind: 1) check metadata 2) maybe write some key directly to the volume using libguestfs tools and inspect more | 18:06 |
jrosser | MickyMan77: you did create/upload an ssh keypair? | 18:06 |
MickyMan77 | i did upload the key. | 18:07 |
spatel | fridtjof[m]: i had that kind of issue when i was using an F5 load-balancer and found the TCP timeout setting was different on the F5 and it was closing connections | 18:07 |
*** djhankb has quit IRC | 18:07 | |
fridtjof[m] | this is a plain haproxy setup unfortunately | 18:08 |
fridtjof[m] | just one infra node | 18:08 |
jrosser | MickyMan77: this is not good: 87.370969] cloud-init[419]: 2020-10-15 17:45:00,488 - util.py[WARNING]: No active metadata service found | 18:08 |
noonedeadpunk | fridtjof[m]: well, I'm wondering if everything is ok with the network, and assuming it is, I think it's worth checking whether you have some rabbit queue overflowing with unread messages | 18:08 |
MickyMan77 | I will take a look at metadata service | 18:09 |
spatel | yes, could be a network issue, mtu maybe, or an unhealthy MQ. | 18:09 |
noonedeadpunk | you can check that with `rabbitmqctl -p /nova list_queues | egrep -v "0$"` (but vhosts are nova,cinder,glance,neutron,etc) | 18:09 |
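Extending that check to every vhost at once (a sketch built on the command above):

```sh
# print queues with a non-zero message count in each vhost
for v in $(rabbitmqctl list_vhosts -q); do
  echo "== $v =="
  rabbitmqctl -p "$v" list_queues -q | egrep -v "0$"
done
```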
fridtjof[m] | network shouldn't be an issue, the containers are all on the same host in this case | 18:09 |
noonedeadpunk | could you be hitting OOM? :) | 18:09 |
fridtjof[m] | sorry for the dumb question, but how do I check rabbitmq's health? | 18:10 |
openstackgerrit | Merged openstack/openstack-ansible-os_octavia stable/rocky: Save iptables rules for all Debian derivative operating systems https://review.opendev.org/672556 | 18:10 |
fridtjof[m] | ah | 18:10 |
noonedeadpunk | fridtjof[m]: well, there's dashboard and cli util and... | 18:10 |
fridtjof[m] | 128GB on the infra host, oom shouldn't be a problem | 18:10 |
MickyMan77 | is the website https://docs.openstack.org/ down ? | 18:11 |
noonedeadpunk | well yeah | 18:11 |
fridtjof[m] | hm, queues are empty | 18:11 |
spatel | is this an AIO deployment? | 18:11 |
noonedeadpunk | MickyMan77: yep :( | 18:11 |
fridtjof[m] | not quite, it's one infra + storage host, and two compute hosts | 18:12 |
fridtjof[m] | what i find weird is that rabbitmq gives me this log output a lot: | 18:12 |
fridtjof[m] | 2020-10-15 18:09:11.722 [error] <0.17759.2> closing AMQP connection <0.17759.2> (10.1.70.227:50126 -> 10.1.70.29:5671 - uwsgi:9290:3e5c3b51-cd0f-4527-9035-cb29d21c23fd): | 18:12 |
fridtjof[m] | missed heartbeats from client, timeout: 60s | 18:12 |
fridtjof[m] | 227 is the nova-api container | 18:12 |
spatel | fridtjof[m]: it's normal i believe, i am seeing this in my network very randomly (i believe it's kind of a bug) | 18:13 |
spatel | run tcpdump on port 5671 | 18:13 |
fridtjof[m] | it correlates with nova-api constantly logging this: http://paste.openstack.org/show/799094/ | 18:14 |
noonedeadpunk | fridtjof[m]: oh, and you're running rabbit with ssl? | 18:14 |
fridtjof[m] | yeah, i'm just wondering if it's a load + timeout issue for my problems | 18:14 |
spatel | that is not a good error message | 18:14 |
fridtjof[m] | whatever the default in OSA is | 18:15 |
noonedeadpunk | I think by default we don't use ssl for rabbit/mysql for now | 18:15 |
noonedeadpunk | but the exception is thrown by the ssl module at the end | 18:16 |
noonedeadpunk | ah, we use ssl by default | 18:17 |
fridtjof[m] | hm, i'll change nova-api to use 5672 without ssl then | 18:19 |
noonedeadpunk | also, what I used to run to fix rabbit - openstack-ansible playbooks/rabbitmq-install.yml -e rabbitmq_upgrade=true | 18:20 |
noonedeadpunk | I think you can just change this for nova... | 18:20 |
noonedeadpunk | this re-creates queues so might fix things if rabbit starts acting weirdly | 18:21 |
fridtjof[m] | i did that yesterday (as part of a minor upgrade), but it didn't really help | 18:21 |
fridtjof[m] | oh, oops. I rebooted the entire host, not the container | 18:22 |
fridtjof[m] | that was dumb | 18:23 |
spatel | noonedeadpunk: why do we need to use -e rabbitmq_upgrade=true ? | 18:33 |
fridtjof[m] | alright, seems the reboot kind of helped | 18:35 |
fridtjof[m] | the rabbitmq related issues are gone (for now, at least) | 18:35 |
fridtjof[m] | but my base problem is still there >_> | 18:35 |
jrosser | jamesdenton: with the haproxy/metal/bind-to-mgmt there's kind of two things in play | 18:36 |
jrosser | without the bind-to-mgmt patches all the services were bound to 0.0.0.0, so that's the first thing that needs cleaning up, and those changes were a precursor to landing the haproxy+metal patch | 18:37 |
jrosser | however for the prod deploys you mention, where haproxy was on separate nodes, that wouldn't have been an issue, so i would think that the changes we have made in V might not be so impactful there | 18:38 |
fridtjof[m] | alright, i can pin down at least this issue now - creating an instance on an external network times out because network binding fails | 18:38 |
fridtjof[m] | and i can see a steady stream of exceptions on one compute http://paste.openstack.org/show/799095/ | 18:39 |
fridtjof[m] | "permission denied" sounds like the agent is misconfigured? | 18:39 |
noonedeadpunk | this sounds like missing sudo | 18:39 |
jrosser | but..... if there are deploys where great effort has been put into making haproxy co-exist with a metal deploy infra node, that's where the upgrade might be more tricky | 18:40 |
noonedeadpunk | do you have sudo binary on compute? | 18:40 |
fridtjof[m] | sudo is present | 18:40 |
fridtjof[m] | yeah | 18:40 |
*** djhankb has joined #openstack-ansible | 18:40 | |
noonedeadpunk | then probably some command/path is missing from /etc/neutron/rootwrap.d/ | 18:42 |
noonedeadpunk | but um, there are tons of stuff there | 18:42 |
noonedeadpunk | so if you can't tell which command it tries to execute before that stack trace, it's probably worth enabling debug and restarting the service to see on which exact command it fails to gain permissions | 18:43 |
fridtjof[m] | trying that | 18:45 |
fridtjof[m] | debug output doesn't give me the exact command line :/ | 18:50 |
fridtjof[m] | i'll resort to adding some log statements now lol | 18:50 |
recyclehero | guys I want to restore mariadb in a 2-node env. 1 controller, so 1-node galera. | 18:56 |
recyclehero | I am planning to | 18:56 |
recyclehero | 1) stop mariadb 2) restore from backup 3) /etc/init.d/mysql start --wsrep-new-cluster | 18:57 |
recyclehero | sounds okay? | 18:58 |
recyclehero | https://docs.openstack.org/openstack-ansible/ussuri/admin/maintenance-tasks.html#galera-cluster-recovery | 18:59 |
recyclehero | the "recovering primary component" link is broken on this page | 19:00 |
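recyclehero's plan, as commands (a sketch for a single-node galera; the init script is from his own message, and the backup restore step depends on the tooling used):

```sh
/etc/init.d/mysql stop
# restore the datadir (e.g. /var/lib/mysql) from backup here
/etc/init.d/mysql start --wsrep-new-cluster
```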
fridtjof[m] | huh, it's on both compute nodes though. | 19:00 |
fridtjof[m] | they both get "permission denied" | 19:01 |
*** cshen has quit IRC | 19:17 | |
*** cshen has joined #openstack-ansible | 19:17 | |
*** yann-kaelig has quit IRC | 19:18 | |
MickyMan77 | jrosser: when I check the metadata via haproxy I do get info.. --> http://paste.openstack.org/show/799098/ | 19:19 |
MickyMan77 | jrosser: but when I tcpdump the br-vlan nic I can't see any outgoing traffic from the newly created instances to the metadata service. | 19:20 |
*** gregwork has quit IRC | 19:26 | |
*** cshen has quit IRC | 19:26 | |
fridtjof[m] | okay, it's trying to add a link local ipv6 address to a brq interface...? and that gets a "permission denied" | 19:30 |
fridtjof[m] | looks like it's not a permission problem after all. | 19:32 |
fridtjof[m] | root@compute2-CP6NY03:~# ip a add fe80::ac59:20ff:fe4c:8cff/64 dev brqf7424189-aa | 19:32 |
fridtjof[m] | RTNETLINK answers: Permission denied | 19:32 |
fridtjof[m] | i replicated what it's trying to do | 19:32 |
fridtjof[m] | okay, seems like that address already exists on eth12 | 19:33 |
fridtjof[m] | I configured my environment according to https://docs.openstack.org/openstack-ansible/train/user/prod/example.html | 19:35 |
*** MickyMan77 has quit IRC | 19:48 | |
fridtjof[m] | okay, found the cause for the exception, but I have no idea why that is the case | 20:07 |
fridtjof[m] | when I launch an instance attached to a flat provider network (wired up on compute hosts through the br-vlan-veth/eth12 pair), a bridge "brq<guid>" gets created, and eth12 and the VM's tap adapter get added to it | 20:08 |
fridtjof[m] | then neutron-linuxbridge-agent is trying to add a link local ipv6 to the bridge, but /proc/sys/net/ipv6/conf/brqf7424189-aa/disable_ipv6 is 1 | 20:09 |
fridtjof[m] | and that's where the "permission denied" is coming from | 20:09 |
fridtjof[m] | now, the question remains: why is that set to 1, and why is it adding a link local v6 address to that adapter anyway? | 20:10 |
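What fridtjof pieced together can be reproduced like this (the commands and bridge name are from the log; the sysctl write at the end is for testing only):

```sh
# 1 here means the kernel refuses v6 addresses on the bridge (EACCES)
cat /proc/sys/net/ipv6/conf/brqf7424189-aa/disable_ipv6

ip a add fe80::ac59:20ff:fe4c:8cff/64 dev brqf7424189-aa
# RTNETLINK answers: Permission denied

# re-enable ipv6 on the bridge and retry the address add
echo 0 > /proc/sys/net/ipv6/conf/brqf7424189-aa/disable_ipv6
```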
Adri2000 | fridtjof[m]: hello, I just read through quickly, and that sounds like an issue I had very recently and spent a day debugging with the help of neutron developers... have a look at https://bugs.launchpad.net/neutron/+bug/1899141 and https://review.opendev.org/#/c/757107/ | 20:13 |
openstack | Launchpad bug 1899141 in neutron "Linuxbridge agent NetlinkError: (13, 'Permission denied') after Stein upgrade" [Medium,In progress] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez) | 20:13 |
fridtjof[m] | > Can you add debug logs in the "add_ip_address" method | 20:15 |
fridtjof[m] | oh my god, that's exactly what i ended up doing | 20:16 |
jamesdenton | jrosser the changes you made will be helpful in my scenario, with haproxy on the controller nodes. thank you | 20:16 |
fridtjof[m] | i wish i would've known earlier, i spent half my evening on this ;_; | 20:16 |
fridtjof[m] | it's the exact same issue, thank you Adri2000 | 20:16 |
Adri2000 | yw :) | 20:17 |
*** spatel has quit IRC | 20:25 | |
*** jeh has joined #openstack-ansible | 20:35 | |
*** nurdie has quit IRC | 20:38 | |
*** nurdie has joined #openstack-ansible | 21:09 | |
*** cshen has joined #openstack-ansible | 21:12 | |
*** cshen has quit IRC | 21:17 | |
*** jbadiapa has quit IRC | 21:17 | |
recyclehero | the "Create the neutron provider network facts" task works when neutron_provider_networks is not defined | 21:30 |
recyclehero | so it means I can't make a change to the network physical mappings? | 21:30 |
recyclehero | really simple question: how do I set host vars? I want to set neutron_provider_networks per host. I am looking for host_vars | 21:39 |
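For reference, OSA reads per-host overrides from /etc/openstack_deploy/host_vars/<hostname>.yml (a sketch; the hostname and mapping values are illustrative):

```yaml
# /etc/openstack_deploy/host_vars/compute1.yml
neutron_provider_networks:
  network_types: "vlan,flat"
  network_vlan_ranges: "physnet1:100:200"
```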
*** MickyMan77 has joined #openstack-ansible | 21:44 | |
*** gshippey has quit IRC | 21:48 | |
*** rh-jelabarre has quit IRC | 22:00 | |
*** yann-kaelig has joined #openstack-ansible | 22:46 | |
*** macz_ has quit IRC | 23:03 | |
*** cshen has joined #openstack-ansible | 23:13 | |
*** cshen has quit IRC | 23:17 | |
*** tosky has quit IRC | 23:23 |