*** tosky has quit IRC | 00:00 | |
*** luksky has quit IRC | 00:29 | |
*** nurdie has joined #openstack-ansible | 00:35 | |
*** macz_ has quit IRC | 00:39 | |
*** nurdie has quit IRC | 00:45 | |
*** zigo has quit IRC | 00:54 | |
-openstackstatus- NOTICE: The Gerrit service on review.opendev.org is being restarted quickly to make heap memory and jgit config adjustments, downtime should be less than 5 minutes | 01:08 | |
*** cshen has joined #openstack-ansible | 01:25 | |
*** cshen has quit IRC | 01:29 | |
openstackgerrit | Merged openstack/openstack-ansible-ceph_client stable/ussuri: Allow to proceed with role if ceph_conf_file is set https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/765952 | 01:52 |
*** macz_ has joined #openstack-ansible | 02:31 | |
*** rfolco has joined #openstack-ansible | 02:34 | |
*** macz_ has quit IRC | 02:36 | |
*** nurdie has joined #openstack-ansible | 02:42 | |
*** nurdie has quit IRC | 02:46 | |
*** nurdie has joined #openstack-ansible | 02:58 | |
*** nurdie has quit IRC | 03:03 | |
*** jamesdenton has quit IRC | 03:17 | |
*** jamesdenton has joined #openstack-ansible | 03:18 | |
*** cloudnull has quit IRC | 03:18 | |
*** cloudnull has joined #openstack-ansible | 03:18 | |
*** rfolco has quit IRC | 03:24 | |
*** cshen has joined #openstack-ansible | 03:25 | |
*** cshen has quit IRC | 03:30 | |
*** nurdie has joined #openstack-ansible | 03:38 | |
*** nurdie has quit IRC | 04:10 | |
*** nurdie has joined #openstack-ansible | 04:10 | |
*** nurdie has quit IRC | 04:15 | |
*** cshen has joined #openstack-ansible | 05:26 | |
*** simondodsley has quit IRC | 05:27 | |
*** simondodsley has joined #openstack-ansible | 05:29 | |
*** pto has quit IRC | 05:30 | |
*** pto_ has joined #openstack-ansible | 05:30 | |
*** cshen has quit IRC | 05:30 | |
*** evrardjp has quit IRC | 05:33 | |
*** evrardjp has joined #openstack-ansible | 05:33 | |
*** cshen has joined #openstack-ansible | 06:15 | |
*** pto has joined #openstack-ansible | 06:17 | |
*** pto_ has quit IRC | 06:17 | |
*** pto_ has joined #openstack-ansible | 06:17 | |
*** pto_ has quit IRC | 06:19 | |
*** cshen has quit IRC | 06:20 | |
*** pto_ has joined #openstack-ansible | 06:20 | |
*** pto_ has quit IRC | 06:21 | |
*** pto has quit IRC | 06:21 | |
*** pto_ has joined #openstack-ansible | 06:21 | |
*** gyee has quit IRC | 06:28 | |
*** miloa has joined #openstack-ansible | 07:00 | |
*** jbadiapa has joined #openstack-ansible | 07:09 | |
*** cshen has joined #openstack-ansible | 07:11 | |
*** pto_ has quit IRC | 07:49 | |
*** pto has joined #openstack-ansible | 07:49 | |
*** pto_ has joined #openstack-ansible | 08:01 | |
*** tosky has joined #openstack-ansible | 08:02 | |
*** pto has quit IRC | 08:04 | |
jrosser | morning | 08:05 |
jrosser | i wonder why my centos environment var patch works on lxc deploys but not metal | 08:05 |
jrosser | perhaps because the gate check script runs in a shell that exists before /etc/environment is modified | 08:07 |
*** mmethot has joined #openstack-ansible | 08:07 | |
*** mmethot_ has quit IRC | 08:10 | |
*** pto_ has quit IRC | 08:11 | |
*** pto has joined #openstack-ansible | 08:11 | |
*** andrewbonney has joined #openstack-ansible | 08:13 | |
*** rpittau|afk is now known as rpittau | 08:14 | |
*** pto_ has joined #openstack-ansible | 08:21 | |
*** pto has quit IRC | 08:24 | |
*** rfolco has joined #openstack-ansible | 09:00 | |
*** SiavashSardari has joined #openstack-ansible | 09:01 | |
*** CeeMac has quit IRC | 09:07 | |
snadge | how is the centos stream news going to affect openstack-ansible ? | 09:07 |
noonedeadpunk | snadge: have not decided yet, but I think we will eventually drop centos support in several releases | 09:08 |
snadge | i was going to deploy the next one on centos 8.. but it ends support next year, and i can get RHEL licenses for free | 09:08 |
snadge | its just laziness to deal with tracking registrations etc | 09:08 |
*** luksky has joined #openstack-ansible | 09:10 | |
jrosser | snadge: we did have a bit of discussion about this on irc yesterday | 09:18 |
jrosser | one of the issues is that no one on the openstack-ansible core team is running centos clouds | 09:19 |
jrosser | so it is a lot of maintenance overhead taken from people's day job on something that they are not benefiting from | 09:19 |
jrosser | if we were to have a set of committed contributors who supported the centos OS then things would certainly be easier | 09:20 |
snadge | yeah that's understandable.. i don't think I would be able to do that, but I could ask around at work | 09:24 |
snadge | companies like IBM and RedHat should see value in something like openstack-ansible, I know their employees do.. people use open source solutions all the time, simply because it's easier; even if you can get "free" licensing there's still a process you have to go through for registration etc | 09:25 |
jrosser | and also i don't really know much about how RHEL works, if it's going to be the same rolling release as Centos will become | 09:26 |
snadge | i think the basic idea is that centos stream becomes RHEL upstream.. so the changes in that will filter through to RHEL | 09:26 |
jrosser | it is pretty scary from a CI point of view to know that the thing that passed tests today is less reproducible than you would like | 09:27 |
snadge | it might push people away from using centos in the enterprise.. as a way of basically getting RHEL stability for free | 09:28 |
snadge | if stream is quality wise, somewhere in between fedora and rhel, it might become popular | 09:28 |
noonedeadpunk | snadge: we had kind of proof last year that RHEL does not care about OSA at all | 09:38 |
snadge | they have their own redhat branded openstack yes.. which helps people who want to pay for that in a commercial environment, and that's fine | 09:39 |
noonedeadpunk | I don't recall the details, but we had some simple request for Ansible at their Ansible Fest last year and they just ignored it and said they are not interested. I guess it's because they have tripleo and products to sell | 09:40 |
SiavashSardari | morning | 09:40 |
SiavashSardari | I wanna update openstack_openrc SHA on stable/ussuri, should I go on and upload a patch or there is some other process for stable branches? | 09:40 |
noonedeadpunk | SiavashSardari: we do bumps of SHAs in automated way once in 2 weeks | 09:40 |
noonedeadpunk | and do releases based on that | 09:40 |
noonedeadpunk | you can update it manually if you wish, but dunno why you would do that | 09:41 |
noonedeadpunk | jrosser: I was thinking about what we could replace centos with and have no idea. The best option was suse, but it was not stable either and we've already dropped it... | 09:42 |
SiavashSardari | noonedeadpunk oh didn't know that | 09:42 |
snadge | i dont think this has fully played out yet, there are a lot of people within these commercial organisations that value open source, including fedora etc | 09:42 |
SiavashSardari | I did a minor upgrade and wanna take advantage of https://review.opendev.org/c/openstack/openstack-ansible-openstack_openrc/+/763508 | 09:43 |
jrosser | SiavashSardari: you can just change the SHA in your user_variables if you like | 09:43 |
jrosser | for that one repo, all of the data is in playbooks/defaults/... so is very low precedence for ansible variables | 09:43 |
jrosser | it is designed this way so that you can customise it easily in your config | 09:44 |
snadge | i know that redhat want to sell their commercially supported versions of openstack/openshift etc, but they also want people to install open source versions and use it as well because thats more people who will potentially upgrade to a supported version later | 09:44 |
SiavashSardari | jrosser yeah, I'll do that till the next bumping. thanks | 09:44 |
noonedeadpunk | Fedora is not an option, lol. it's super unstable even compared to centos stream. and its support term is 6 months iirc | 09:44 |
snadge | i know, i just used it as an example because there are people within these commercial organisations, who are enthusiastic about fedora and use it internally for development purposes etc | 09:45 |
snadge | they're not running clouds off it no | 09:45 |
jrosser | i think that this comes down to "would anyone choose to use centos stream to deploy openstack-ansible in a production environment" | 09:46 |
jrosser | if there are enough of those folks who exist and want to take on the maintenance then that is all fine | 09:46 |
noonedeadpunk | yeah, exactly | 09:46 |
snadge | thats right, and I think that remains to be seen.. and im usually optimistic, but I'm not sure if I am this time | 09:46 |
jrosser | from a practical POV, we have already seen one *massive* OSA contributor migrate off centos | 09:47 |
noonedeadpunk | I think we probably end up dropping CI for centos, and leave things as is in an unsupported way | 09:47 |
jrosser | and with the centos stream news yesterday spatel already says he will look for alternatives | 09:47 |
noonedeadpunk | wondering what he will come up with... | 09:48 |
snadge | the guys at work have finally switched to openstack train on cent 7, which frees up the previous production system for something new | 09:48 |
snadge | thats a project that will start early in the new year some time | 09:49 |
*** macz_ has joined #openstack-ansible | 10:02 | |
*** macz_ has quit IRC | 10:07 | |
jrosser | noonedeadpunk: do you have any good ideas for metal jobs with this LIBSYSTEMD_VERSION environment variable? | 10:26 |
jrosser | seems to be either add it to the environment in zuul pre-gate playbook, or perhaps export an env var in the gate-check-commit script..... | 10:26 |
noonedeadpunk | hm, and https://zuul.opendev.org/t/openstack/build/aac08c7b31b047408fc1ab239b600130 has passed..... | 10:28 |
jrosser | oh wait | 10:29 |
noonedeadpunk | and the next recheck failed... | 10:29 |
noonedeadpunk | so maybe this one had non updated image... | 10:30 |
noonedeadpunk | but I suppose we would update it... | 10:30 |
noonedeadpunk | and we eventually placed the var https://51a12b4e70c36e2ff198-353a8055100be238a18e62fdcc374ef1.ssl.cf5.rackcdn.com/766030/4/check/openstack-ansible-deploy-aio_metal-centos-8/3ff1f97/logs/ara-report/results/218.html | 10:31 |
jrosser | i was thinking that the var may not be present in the shell running ansible | 10:32 |
jrosser | because that is the same one that runs gate-check-commit.sh, and starts before the env var is written | 08:07 |
noonedeadpunk | oh, yes, probably you're right.... | 10:32 |
jrosser | but then why does the job you linked pass, this is strange | 10:33 |
noonedeadpunk | also we would need to backport this to U :( | 10:34 |
jrosser | yes - just thankful that the centos upgrade job is nv right now otherwise we would be in even more difficulty | 10:39 |
noonedeadpunk | but, we change /etc/environment during setup-hosts, then we re-run openstack-ansible in the shell script... So shouldn't ansible load the correct env when we launch it? or because we call it from the script it will share the env... | 10:40 |
noonedeadpunk | well, that's pretty easy to check actually :) | 10:40 |
noonedeadpunk | yeah, I was thinking that centos 8 is pretty stable in terms of upgrade jobs and maybe it's time to make them voting :)) | 10:41 |
noonedeadpunk | really regret about these thoughts | 10:41 |
*** gshippey has joined #openstack-ansible | 10:44 | |
pto_ | Has anyone here integrated openstack with Windows AD/LDAP? | 10:59 |
*** pto_ is now known as pto | 10:59 | |
admin0 | pto, yes | 11:09 |
pto | admin0: I dont see any config in openstack ansible to enable tls. Do you know if its possible? | 11:10 |
admin0 | not sure pto .. i had a config and it worked :D .. so never looked into it | 11:10 |
pto | admin0: Could you share the working sample? | 11:11 |
admin0 | sure | 11:11 |
admin0 | one moment please | 11:11 |
admin0 | pto, https://gist.github.com/a1git/1f9b9e438c78683b900ff85d36d6ecc7 .. then after that, they can use domain.com and their AD user/pass | 11:14 |
admin0 | in our case, its internal traffic on a private vlan, so it worked on ldap .. never looked into anything else | 11:14 |
pto | admin0: Awesome! Thx. But i dont think you run secure ldap | 11:15 |
admin0 | though i am building a new one where i need to do ldaps | 11:15 |
admin0 | this one was AD on a secure vlan | 11:15 |
admin0 | in 1 week, i have to do a ldaps on openldap | 11:15 |
admin0 | i will know more in 1 week | 11:15 |
pto | admin0: I dont think its supported in the current config of openstack ansible. Cant find any reference to it | 11:16 |
admin0 | oh | 11:17 |
pto | admin0: nvm. https://github.com/openstack/openstack-ansible-os_keystone/blob/dcc16da7e20f50e1f9e9cd56170427ec9491d15c/tasks/keystone_ldap_setup.yml#L34 | 11:17 |
pto | admin0: Its just passing a dict to the template, so you can put anything in | 11:17 |
admin0 | did you tried ldaps and it failed ? | 11:18 |
pto | admin0: Its not accepting ldaps:// and if I enable tls and ldaps i get: AssertionError: Invalid TLS / LDAPS combination | 11:20 |
admin0 | oh | 11:23 |
admin0 | maybe best to check in #openstack-keystone | 11:23 |
*** SecOpsNinja has joined #openstack-ansible | 11:25 | |
pto | admin0: Channel is dead :-( | 11:25 |
kleini | pto: I am using ldaps: http://paste.openstack.org/show/800888/ | 11:38 |
kleini | Is it possible to use some wildcard to limit e.g. setup-hosts.yml to some host and all its containers? Something like --limit infra3,infra3_* | 11:40 |
kleini | answering again my own question: infra3-host_containers | 11:42 |
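Spelled out, the limit kleini lands on would be passed like this (playbook and host name as in the question above):

    openstack-ansible setup-hosts.yml --limit 'infra3,infra3-host_containers'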
pto | kleini: thx! I will have a look at it | 11:45 |
pto | LDAP error: http://paste.openstack.org/show/800889/ | 11:45 |
kleini | the error message clearly states that the connection is not possible. did you check the LDAP port and encryption, for example with openssl s_client and ldapsearch? | 11:48 |
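For reference, the two checks kleini suggests could look roughly like this; the hostname and global-catalog port come from pto's paste just below, while the bind DN and base DN are placeholders:

    openssl s_client -connect ADLDAP.srv.aau.dk:3269 -showcerts
    ldapsearch -H ldaps://ADLDAP.srv.aau.dk:3269 -D 'bind-user@srv.aau.dk' -W -b 'DC=srv,DC=aau,DC=dk'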
pto | openssl s_client -connect ADLDAP.srv.aau.dk:3269 gives http://paste.openstack.org/show/800890/ | 11:49 |
pto | Seems to be tls1.2 | 11:50 |
pto | I guess i need to run starttls and not ssl? | 11:51 |
kleini | I only heard about starttls with IMAP, never with LDAP | 11:51 |
kleini | output of openssl s_client looks good | 11:52 |
kleini | can you have a look at the network traffic from Keystone to LDAP, whether at least some SSL handshakes take place? | 11:52 |
admin0 | -l "infra3_*" | 11:52 |
noonedeadpunk | pto https://docs.openstack.org/openstack-ansible-os_keystone/latest/configure-keystone.html#implementing-ldap-or-active-directory-backends worked for me with AD perfectly | 11:55 |
pto | noonedeadpunk: Thx! I have already been through it multiple times. I am no expert in AD, and the customer is running an AD forest and I think its different from a traditional AD | 11:58 |
jrosser | do you need to ensure that the CA cert from AD is installed into keystone container? | 11:59 |
jrosser | pto: actually in your paste 'Verification error: self signed certificate in certificate chain' | 12:01 |
pto | jrosser: Nicely spotted. | 12:02 |
jrosser | been there - done that..... i made this https://github.com/openstack/openstack-ansible-openstack_hosts/blob/master/tasks/openstack_hosts_ca_certificates.yml | 12:02 |
jrosser | heres the whole patch so you can see which variables you need to use https://github.com/openstack/openstack-ansible-openstack_hosts/commit/1498d0d61de3ee8cea4b4e0ba8deb18c274ab1fe | 12:03 |
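Pulling the thread together, a minimal ldaps-backed domain in user_variables might look like the sketch below; the domain name, bind DN, password and CA path are placeholders, and the option names follow the os_keystone docs noonedeadpunk links later in the discussion:

    keystone_ldap:
      Users:                                            # domain name
        url: "ldaps://ADLDAP.srv.aau.dk:636"
        user: "CN=bind-user,OU=Service,DC=srv,DC=aau,DC=dk"
        password: "secret"
        user_tree_dn: "OU=People,DC=srv,DC=aau,DC=dk"
        tls_cacertfile: "/etc/ssl/certs/ad-ca.pem"      # CA that signed the AD certificate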
SecOpsNinja | hi to all. if you have multiple haproxys and glance containers, what do you use to keep all the files synchronized? ceph/nfs? because i don't want NFS (for its non redundancy) and don't have experience with ceph. has anyone tried with glusterfs? i tried to look in openstack-ansible but it doesn't seem to have any info regarding that | 12:05 |
noonedeadpunk | I'd use ceph tbh | 12:07 |
noonedeadpunk | it can provide you not only block storage, but also object storage and fs (and even combined with ganesha nfs) | 12:08 |
noonedeadpunk | object storage is compatible both with S3 and Swift | 12:09 |
SecOpsNinja | noonedeadpunk, yep i do have ceph in my todo list but atm i need to resolve this problem regarding the 2nd and 3rd haproxy and glance hosts. whats the correct way to remove them until i have the HA filesystem installed? for glance, would it be destroying the 2nd and 3rd containers and then removing them with the inventory-manage.py -r parameter? | 12:11 |
pto | jrosser: Is it possible to rerun the tasks openstack-ansible-openstack_hosts by running openstack-hosts-setup.yml or will it break the installtion? | 12:13 |
admin0 | since glance does not change much, you can very much set it up in raid filesystem and rsync every X hour | 12:13 |
admin0 | it does not change much .. works just fine | 12:13 |
*** waverider has joined #openstack-ansible | 12:13 | |
admin0 | and nothing will break if glance breaks .. all that you lose is the ability to launch new vms | 12:13 |
noonedeadpunk | yeah was thinking about lsyncd eventually, but with lsyncd you need to set one glance as the master where all uploads would happen | 12:14 |
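As a rough illustration of admin0's suggestion (hostname is hypothetical; the path is glance's default file store), a cron entry on the node chosen as the upload target could be:

    # push images to the second glance node every hour
    0 * * * * rsync -a --delete /var/lib/glance/images/ glance2:/var/lib/glance/images/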
*** kukacz has quit IRC | 12:15 | |
admin0 | pto, its safe to run the ansible playbooks multiple times | 12:15 |
pto | openssl s_client gives:Verify return code: 19 (self signed certificate in certificate chain) is that still a problem? | 12:21 |
SecOpsNinja | yep i understand the rsync part, but atm i have 3 infra hosts with keepalived and with 1 replica of glance and haproxy in each infra. because of that i have some lets encrypt certificates and glance images spread across all the infra nodes which aren't accessible on the current master. So to resolve this i was thinking of adding a glusterfs volume, but after thinking about how to configure the glance container to use it, i think its easier to remove the 2nd and 3rd replicas atm and put the openstack cluster working again (to create new vms) | 12:22 |
SecOpsNinja | and the openstack-ansible documentation is not very clear regarding removing these extra services/containers. I see the info regarding recreating them but not removing them | 12:23 |
kleini | pto: this self-signed certificate is the only problem; it's why your keystone is not connecting properly | 12:29 |
pto | kleini: That is also my understanding, man. if i can make ubuntu trust the cert, then it should work | 12:36 |
*** shyamb has joined #openstack-ansible | 12:38 | |
admin0 | why not *not* use self signed and use a signed one ? | 12:51 |
admin0 | meh . why not use a signed one ? | 12:51 |
admin0 | hmm.. The `lxc` module is not importable. Check the requirements. - what error is this ? | 12:55 |
admin0 | never seen this before . | 12:55 |
admin0 | so i had a poc on ubuntu 20.04 .. tag 21.1.0 .. poc went in fine .. now reformatted, added new hosts .. and this came up | 12:56 |
admin0 | deploy is unchanged ( just the facts and the old inventory removed) | 12:56 |
*** d34dh0r53 has quit IRC | 12:56 | |
jrosser | pto: yes you can use openstack-hosts-setup.yml --limit <containername> and it can be run as many times as you like | 13:06 |
*** shyamb has quit IRC | 13:06 | |
*** d34dh0r53 has joined #openstack-ansible | 13:13 | |
*** dave-mccowan has joined #openstack-ansible | 13:15 | |
*** waverider has quit IRC | 13:18 | |
*** maharg101 has joined #openstack-ansible | 13:19 | |
*** dave-mccowan has quit IRC | 13:21 | |
*** shyamb has joined #openstack-ansible | 13:22 | |
noonedeadpunk | just in case decided to place PR https://github.com/systemd/python-systemd/pull/89 | 13:25 |
pto | SSL trust is working, and the bare minimal keystone config is in place, but its not working: http://paste.openstack.org/show/800894/ | 13:27 |
pto | ldap.SERVER_DOWN: {'result': -1, 'desc': "Can't contact LDAP server", 'ctrls': [], 'info': '(unknown error code)'} | 13:28 |
*** kukacz has joined #openstack-ansible | 13:28 | |
noonedeadpunk | that won't help us though.... | 13:28 |
noonedeadpunk | just noticed that it hasn't been released for ages | 13:29 |
*** shyamb has quit IRC | 13:29 | |
jrosser | pto: may be another step in your certificates adventure.... | 13:38 |
mgariepy | pto, try connecting from python in the virtual env. | 13:39 |
pto | Good idea. Thx for helping | 13:39 |
admin0 | hi all . what could this error be, and anyone seen it before ? https://gist.githubusercontent.com/a1git/0f6c81533d45bd0781f2a9b324c4eefd/raw/6fbf26d40dec844c4816d1a84a4477a426adfa9b/gistfile1.txt | 13:39 |
mgariepy | also, maybe add some debug to ldap . it might log the ldap query so it might help you debug it a bit | 13:40 |
mgariepy | debug_level under the ldap section. | 13:41 |
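The option mgariepy mentions is a standard keystone [ldap] setting; in the generated domain config (or via the keystone_ldap dict) it would look something like this, with the value being just an example verbosity level:

    [ldap]
    debug_level = 4095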
SecOpsNinja | is there an easy way to remove a role from being installed on a node? or do i need to recreate the machine from scratch? i don't see the option to remove in the specified roles like haproxy_server and magnum | 13:41 |
SecOpsNinja | supposedly i can use the absent option but i'm having a bit of difficulty figuring out how to disable it... | 13:44 |
pto | hmnn... SimpleLDAPObject.simple_bind is successful | 13:45 |
jrosser | pto: there is a bunch of detail here https://docs.openstack.org/keystone/latest/admin/configuration.html#secure-the-openstack-identity-service-connection-to-an-ldap-back-end | 13:46 |
pto | Im out of time today and will dig deeper tomorrow. Thanks you all for helping! It has been very useful so far | 13:48 |
mgariepy | pto, then ldappool ;) or try to disable ldappool in the config. | 13:53 |
*** rfolco is now known as rfolco|brb | 13:59 | |
*** waverider has joined #openstack-ansible | 14:04 | |
*** spatel has joined #openstack-ansible | 14:08 | |
spatel | what variable i should use to change region globally ? | 14:20 |
mgariepy | spatel, service_region ? | 14:21 |
spatel | if i put that in user_variables then it will change it for every single service? | 14:22 |
mgariepy | i do believe so since it will override : https://github.com/openstack/openstack-ansible/blob/master/inventory/group_vars/all/all.yml#L94 | 14:23 |
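So the override spatel is after would simply be (region name is an example):

    # /etc/openstack_deploy/user_variables.yml
    service_region: foo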
*** johanssone has quit IRC | 14:23 | |
*** crazzy has quit IRC | 14:24 | |
*** crazzy has joined #openstack-ansible | 14:25 | |
*** johanssone has joined #openstack-ansible | 14:26 | |
spatel | mgariepy: in my case i already cooked my openstack using the RegionOne default and now i'm trying to change it. do you think if i put service_region: foo and re-run all playbooks it will fix it? | 14:28 |
admin0 | hi all.. what exactly does this error mean ? https://gist.github.com/a1git/0e100976a09669d01eecdd6ad0333fdb | 14:30 |
*** noonedeadpunk has quit IRC | 14:30 | |
*** noonedeadpunk_ has joined #openstack-ansible | 14:33 | |
*** ierdem has joined #openstack-ansible | 14:34 | |
ierdem | Hi everyone, can you explain how I can list the queues in rabbitmq? Thanks | 14:35 |
ierdem | When I execute the "rabbitmqctl list-queues <Name>" command it returns empty, I do not know why. | 14:36 |
spatel | ierdem: rabbitmqctl list_queues -p /nova | 14:36 |
*** SiavashSardari has quit IRC | 14:36 | |
spatel | rabbitMQ use vhost so you need to use -p key | 14:37 |
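For completeness, listing the vhosts first shows which -p value to use; both commands are standard rabbitmqctl:

    rabbitmqctl list_vhosts
    rabbitmqctl list_queues -p /nova name messages consumers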
ierdem | Oh, thank you, it works now | 14:37 |
jrosser | admin0: i am guessing that whichever host 172.29.236.13 is not setup correctly | 14:38 |
jrosser | i.e it has not had the lxc_hosts role run against it successfully for some reason | 14:38 |
admin0 | does having - in the hostname have any effect on this ? | 14:39 |
admin0 | that was the only diff from poc -> prod | 14:39 |
admin0 | reinstall with ubuntu 20, hostname had a lot of (dashes) | 14:39 |
jrosser | the error message seems unrelated to that really | 14:39 |
jrosser | i don't know what the task is as that is missing from the paste | 14:39 |
admin0 | jrosser, this one has more debug and detail https://gist.githubusercontent.com/a1git/0f6c81533d45bd0781f2a9b324c4eefd/raw/6fbf26d40dec844c4816d1a84a4477a426adfa9b/gistfile1.txt | 14:40 |
jrosser | but it looks to be an ansible lxc module, and the error message suggests that the lxc python bindings are missing from the target host | 14:40 |
admin0 | wouldn't setup host take care of that ? | 14:41 |
jrosser | that should be installed by the lxc_hosts role https://github.com/openstack/openstack-ansible-lxc_hosts/blob/master/vars/ubuntu-20.04-host.yml#L42 | 14:41 |
jrosser | you can check with apt if it is there | 14:41 |
admin0 | thanks for this link .. i will try to manually add these and see if it moved ahead | 14:41 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts master: Fix libsystemd version for Centos https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030 | 14:47 |
jrosser | ^ are you sure? | 14:48 |
jrosser | is pkg-config always there? | 14:48 |
noonedeadpunk_ | well, yes. in your option I got `LIBSYSTEMD_VERSION=(239-41.el8_3)` | 14:48 |
noonedeadpunk_ | systemctl --version - `systemd 239 (239-41.el8_3)` | 14:48 |
jrosser | oh no i mean the command pkg-config, won't we have to install that first? | 14:49 |
noonedeadpunk_ | not sure it's always there though.... but python-systemd relies on it... | 14:49 |
jrosser | only at build time though, and we target these tasks at all hosts right now | 14:49 |
noonedeadpunk_ | yeah, you're right | 14:50 |
noonedeadpunk_ | but we need another strange split then | 14:50 |
*** nurdie has joined #openstack-ansible | 14:50 | |
noonedeadpunk_ | also I've noticed that ansible does not consume /etc/environment, so you were right about it as well | 14:51 |
jrosser | had loads of distraction today, still poking around at that | 14:51 |
noonedeadpunk_ | even when I sourced it - that does not help | 14:51 |
jrosser | broke my AIO and had to start again as well | 14:51 |
noonedeadpunk_ | ansible was able to do lookup only after I did relogin | 14:52 |
jrosser | so do we perhaps want split[1] on the systemctl --version output? | 14:52 |
noonedeadpunk_ | http://paste.openstack.org/show/800900/ | 14:52 |
jrosser | actually its worse than that | 14:54 |
openstackgerrit | Marc Gariépy proposed openstack/openstack-ansible-os_horizon master: DNM simple test https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/766234 | 14:54 |
jrosser | you need pkg-config and systemd-devel everywhere that needs to run | 14:54 |
noonedeadpunk_ | agree. maybe we can use rpm then.... | 14:54 |
jrosser | so systemctl --version gives systemd 239 (239-41.el8_3) | 14:55 |
noonedeadpunk_ | yep | 14:55 |
jrosser | so we take the [1] element from split? | 14:55 |
jrosser | rather than -1 as my patch was which incorrectly takes the last thing | 14:55 |
* jrosser totally confused again because i'm sure we saw it do the right thing earlier | 14:58 | |
noonedeadpunk_ | maybe they changed it again lol | 14:58 |
noonedeadpunk_ | and that's why we saw it passing | 14:58 |
jrosser | https://51a12b4e70c36e2ff198-353a8055100be238a18e62fdcc374ef1.ssl.cf5.rackcdn.com/766030/4/check/openstack-ansible-deploy-aio_metal-centos-8/3ff1f97/logs/ara-report/results/218.html | 14:59 |
noonedeadpunk_ | maybe worth checking `rpm -qa systemd` and split on `-` | 14:59 |
noonedeadpunk_ | true | 14:59 |
jrosser | good call on the regexp though | 14:59 |
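For reference, the two version sources being compared look like this on the image discussed above (the systemctl line is quoted from the thread; the rpm output follows the usual name-version-release.arch format):

    $ systemctl --version | head -n1
    systemd 239 (239-41.el8_3)      # split(' ')[1] -> "239", split(' ')[-1] -> "(239-41.el8_3)"
    $ rpm -qa systemd
    systemd-239-41.el8_3.x86_64     # noonedeadpunk's alternative: strip the leading "systemd-"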
*** waverider has quit IRC | 15:01 | |
*** waverider has joined #openstack-ansible | 15:03 | |
*** nurdie has quit IRC | 15:06 | |
jrosser | noonedeadpunk_: the only other route we have is deployment_environment_variables but i'm not so sure there is any way we can add something extra to that | 15:07 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts master: Fix libsystemd version for Centos https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030 | 15:08 |
jrosser | ok that looks good | 15:08 |
jrosser | but still the issue of the metal jobs i think | 15:08 |
noonedeadpunk_ | yeah, most likely it is | 15:09 |
jrosser | i wonder if just using the zuul.d pre-gate playbook would do it | 15:10 |
jrosser | if theres enough between that and gate-check-commit.sh for there to be a new login shell | 15:10 |
*** macz_ has joined #openstack-ansible | 15:10 | |
jrosser | the same tasks could go there | 15:10 |
noonedeadpunk_ | but issue would still occur on real deployments then... | 15:11 |
noonedeadpunk_ | on real aio metal lol | 15:11 |
jrosser | right, but i think we have a weird case because the target is localhost? | 15:11 |
jrosser | would be interesting to see if the test you pasted behaves the same if the target is not local | 15:12 |
jrosser | but i agree that this is all really not nice | 15:12 |
SecOpsNinja | in openstack-ansible is there an easy way to remove installed components from a specific host? or is the only way to reinstall the host from scratch? the only info that i found was https://docs.openstack.org/openstack-ansible/pike/admin/maintenance-tasks/scale-environment.html#remove-a-compute-host but nothing regarding infra or storage nodes. | 15:13 |
*** nurdie has joined #openstack-ansible | 15:14 | |
*** macz_ has quit IRC | 15:15 | |
jrosser | SecOpsNinja: to remove a service you would destroy the containers, then use the inventory-manage tool in the scripts directory to remove them from the inventory, and also you have to clean up openstack_user_config so the stuff does not come back | 15:15 |
jrosser | after that you would have some cleanup to do on haproxy | 15:16 |
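A minimal sketch of that sequence, with a placeholder container name:

    openstack-ansible lxc-containers-destroy.yml --limit <container_name>
    /opt/openstack-ansible/scripts/inventory-manage.py -r <container_name>
    # then remove the matching entries from openstack_user_config.yml / conf.d so they don't come back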
SecOpsNinja | jrosser, but the problem, it seems, is with services that aren't installed in containers, like haproxy | 15:16 |
jrosser | well, i think thats really a symptom of ansible (and configuration managment tools in general) | 15:17 |
SecOpsNinja | for that kind of service it seems there isn't an easy way to remove them | 15:17 |
SecOpsNinja | some roles have the option to remove them but it doesn't seem to be the case in the openstack-ansible haproxy_server role | 15:17 |
jrosser | really the only thing we have is deleting lxc containers, nothing else really handles uninstalling | 15:18 |
SecOpsNinja | ok i will try to remove it manually and then see if i can make a commit to add an uninstall option | 15:18 |
SecOpsNinja | it was my mistake creating a cluster with only 1 infra + 1 compute + 1 storage and afterwards trying to add multiple ones without having distributed storage like glusterfs/ceph/nfs.... and now it seems hard to add ceph in the current state. so it seems best to remove the extra replicas and then with time try to add ceph storage to all the services | 15:20 |
jrosser | i think its a balance between trying to keep an environment like that running through huge structural change | 15:22 |
jrosser | or to just clean it and start again | 15:23 |
SecOpsNinja | yep it's been a learning phase for me regarding openstack-ansible and openstack management :D | 15:24 |
noonedeadpunk_ | jrosser: I think we should just somehow source /etc/environment in the shell session (after setup-hosts). the only thing is that the file has no export statements (which is correct), and just sourcing it doesn't work - the vars need to be exported | 15:25 |
*** miloa has quit IRC | 15:29 | |
jrosser | bit of stackoverflow suggests something like for env in $( cat /etc/environment ); do export $(echo $env | sed -e 's/"//g'); done | 15:32 |
noonedeadpunk_ | this should work as well | 15:33 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Apply /etc/environment for runtime after adjustment https://review.opendev.org/c/openstack/openstack-ansible/+/766244 | 15:33 |
jrosser | thats ok even without export? | 15:33 |
noonedeadpunk_ | gerrit is so laggy, that I can nowadays run `git review` and write in IRC about patch before it got pushed... | 15:33 |
noonedeadpunk_ | well set -a does export I think | 15:34 |
noonedeadpunk_ | at least that worked in my tiny script | 15:34 |
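The idea being tested here, roughly; both variants re-read the LIBSYSTEMD_VERSION entry written to /etc/environment during setup-hosts:

    # variant 1: auto-export while re-sourcing the file
    set -a; . /etc/environment; set +a
    # variant 2: the stackoverflow one-liner jrosser quoted
    for env in $(cat /etc/environment); do export $(echo $env | sed -e 's/"//g'); done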
jrosser | oh yeah, i find now its just slow enough that i start doing something else whilst waiting for the gerrit ui | 15:34 |
jrosser | thats ruined productivity as i forget to go back to it | 15:34 |
noonedeadpunk_ | well if applying env will work, then we probably just should mark centos metal job as nv for 766030 | 15:37 |
jrosser | ah yes set -a /me reads man page | 15:37 |
spatel | jrosser: noonedeadpunk_ i am getting this rabbitMQ error on production cluster - http://paste.openstack.org/show/800902/ | 15:47 |
spatel | looks like everyone is saying restart the cluster, so what is the best way to restart? i did systemctl restart rabbitmq but it didn't help | 15:47 |
spatel | I think i need to re-build cluster or something | 15:47 |
spatel | would rabbitmqctl stop_app / rabbitmqctl reset / rabbitmqctl join_cluster be enough ? | 15:48 |
spatel | or use the nuke way to fix it | 15:49 |
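The manual re-join spatel is describing is usually run on the broken member, pointing at a healthy one (node name is a placeholder; note the subcommand is join_cluster):

    rabbitmqctl stop_app
    rabbitmqctl reset
    rabbitmqctl join_cluster rabbit@<healthy-node>
    rabbitmqctl start_app
    rabbitmqctl cluster_status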
*** macz_ has joined #openstack-ansible | 15:52 | |
mgariepy | wow, how is the metal check supposed to work? for my horizon patch, osa sets up the repo role, which installs nginx, the default config binds to 80, then haproxy fails to bind to 80 and dies there.. | 15:56 |
mgariepy | only happens when testing horizon because it's the only service binding to 80. :/ | 15:56 |
jamesdenton | i guess one of those is binding to 0.0.0.0? | 15:59 |
mgariepy | nginx from repo | 15:59 |
mgariepy | the order of stuff is not quite correct :D | 16:00 |
mgariepy | lol | 16:00 |
jrosser | mgariepy: perhaps we never quite finished the work for haproxy / metal / more things / horizon combination | 16:00 |
mgariepy | yep, i'll fix repo role. | 16:00 |
mgariepy | it does try to remove the nginx config, then installs the pkg.. | 16:01 |
jrosser | well also horizon really should not bind to :80 either | 16:01 |
mgariepy | no it's not horizon the issue here. | 16:01 |
jrosser | hmm well actually thats... | 16:01 |
jrosser | yeah theres now an internal vip on .101 for metal jobs | 16:01 |
mgariepy | it's that the port is already occupied by nginx from repo_server | 16:01 |
*** nurdie has quit IRC | 16:02 | |
mgariepy | when haproxy tries to bind to it. | 16:03 |
admin0 | spatel, nuking works :) | 16:04 |
jrosser | mgariepy: it should be taking these https://github.com/openstack/openstack-ansible-repo_server/blob/master/defaults/main.yml#L51-L52 | 16:04 |
admin0 | but you have to run the setup-openstack again after that | 16:04 |
spatel | admin0: that is my last option, currently trying to see if i can recover | 16:05 |
admin0 | except database, i just nuke and redo it .. | 16:05 |
admin0 | save time | 16:05 |
spatel | i would use manual method to create all service account | 16:05 |
spatel | running full setup-openstack will be big deal | 16:05 |
mgariepy | it does for the slusshee service | 16:06 |
mgariepy | https://zuul.opendev.org/t/openstack/build/5bf01bb6687a41e0970220c461489297/log/logs/etc/host/nginx/sites-enabled/default.txt | 16:07 |
mgariepy | jrosser, ^^ | 16:07 |
jrosser | do we even need that? | 16:08 |
mgariepy | https://github.com/openstack/openstack-ansible-repo_server/blob/master/tasks/repo_pre_install.yml#L80 | 16:09 |
mgariepy | in pre, so the file is removed (but doesn't exist) then the pkg is installed. | 16:09 |
*** waverider has quit IRC | 16:09 | |
jrosser | right | 16:10 |
openstackgerrit | Marc Gariépy proposed openstack/openstack-ansible-repo_server master: Fix order for removing nginx file. https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/766257 | 16:10 |
jrosser | interestingly it used to be in post-install https://github.com/openstack/openstack-ansible-repo_server/commit/330459bc39113321e56f9fc28c126e961e25fc62 | 16:11 |
mgariepy | well. stuff changes ;D | 16:12 |
mgariepy | i was wondering how my changes failed all the metal job.. | 16:13 |
mgariepy | now i know. | 16:13 |
openstackgerrit | Marc Gariépy proposed openstack/openstack-ansible-repo_server master: Fix order for removing nginx file. https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/766257 | 16:13 |
kleini | I failed to extend a single node Galera to two nodes. I just ran the setup-infrastructure.yml without any additional options. after that MariaDB on already existing node failed after restart. | 16:13 |
jrosser | i am not sure galera is able to elect a primary node when there are only two | 16:14 |
kleini | Okay, now I kept MariaDB on existing node and just deployed Galera on second node and it joined successfully | 16:16 |
kleini | Now that I have two working nodes, I can try to redeploy first one | 16:16 |
jrosser | there is some explanation here https://galeracluster.com/library/kb/two-node-clusters.html | 16:17 |
openstackgerrit | Marc Gariépy proposed openstack/openstack-ansible-os_horizon master: Add ability to configure ALLOWED_HOSTS for horizon. https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/765998 | 16:23 |
openstackgerrit | Andrew Bonney proposed openstack/openstack-ansible master: Ensure kuryr repo is available within CI images https://review.opendev.org/c/openstack/openstack-ansible/+/765765 | 16:24 |
*** nurdie has joined #openstack-ansible | 16:24 | |
noonedeadpunk_ | uh no, it doesn't work :( | 16:39 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Apply /etc/environment for runtime after adjustment https://review.opendev.org/c/openstack/openstack-ansible/+/766244 | 16:45 |
*** rfolco|brb is now known as rfolco | 16:46 | |
kleini | jrosser: okay, so it would be better to have a 3 node cluster. my problem was more, that the first galera was restarted and that restart completely failed somehow. maybe due to starting Galera on second node and then a missing quorum | 16:48 |
jrosser | yes, as that doc says if something happens to one node (like a restart) then they both will become inactive | 16:48 |
jrosser | we had the same situation outside of openstack-ansible and had to add a third node | 16:49 |
noonedeadpunk_ | another solution is to add garbd, e.g. on the deploy host | 16:51 |
noonedeadpunk_ | I was using it when I didn't have capacity for a third host, and it prevents split brain | 16:52 |
noonedeadpunk_ | https://galeracluster.com/library/documentation/arbitrator.html | 16:52 |
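A sketch of running the arbitrator noonedeadpunk describes, with assumed node addresses and an assumed cluster name (match it to wsrep_cluster_name in your deployment):

    garbd --group openstack_galera_cluster \
          --address "gcomm://infra1:4567,infra2:4567" \
          --daemon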
kleini | deployment node is in my case not a physical node, just a systemd-nspawn container | 16:53 |
kleini | what about rabbitmq? does it require then three nodes, too, or is two good enough? | 16:54 |
*** nurdie has quit IRC | 16:54 | |
kleini | answering my own question: "Two node clusters are highly recommended against" | 16:56 |
noonedeadpunk_ | I'm not sure that rabbit does classic clustering, in the sense that 2 nodes shouldn't be too much of an issue for rabbit | 16:56 |
noonedeadpunk_ | ah, well, yeah | 16:57 |
kleini | https://www.rabbitmq.com/clustering.html#node-count | 16:58 |
noonedeadpunk_ | quorum queues are pretty modern and I'm not sure I was ever using mqtt | 16:58 |
noonedeadpunk_ | and having quorum is anyway a 3.8 requirement for mqtt | 16:59 |
kleini | I need 3 Galera instances, so there is no real additional effort to have 3 rabbitmq instances, too | 17:00 |
noonedeadpunk_ | so even in stein that worked for me | 17:00 |
kleini | infra1: 1 galera, 2 rabbitmq and infra2: 2 galera and 1 rabbitmq | 17:00 |
spatel | Oh boy, i have removed rabbitmq-server and trying to re-run playbook but getting this issue - http://paste.openstack.org/show/800903/ | 17:03 |
spatel | look like broken repo | 17:03 |
spatel | i am using queens | 17:03 |
spatel | does OSA use public repo to install rabbitMQ or store rpm to local repo container? | 17:03 |
fridtjof[m] | re: the whole centos mess - i'm interested to see what will come out of this: https://github.com/hpcng/rocky | 17:04 |
jrosser | he's the originator of centos? | 17:05 |
fridtjof[m] | yup | 17:05 |
spatel | jrosser: can i manually install rabbitMQ rpm and tell OSA to configure? | 17:05 |
jrosser | if you have the rpm probably | 17:06 |
spatel | i do have RPM i can install | 17:06 |
noonedeadpunk_ | you can provide a better url I think | 17:06 |
jrosser | it depends how the ansible works, if it takes it from a repo or a url | 17:06 |
spatel | here is the RPM - http://mirror.centos.org/centos/7/cloud/x86_64/openstack-queens/Packages/r/ | 17:06 |
jrosser | there has been a big mix of things in the past because the packages had to come from all different places depending on the OS / version / ... | 17:06 |
noonedeadpunk_ | spatel: btw https://opendev.org/openstack/openstack-ansible-rabbitmq_server/src/branch/stable/queens/vars/redhat.yml#L17 | 17:06 |
noonedeadpunk_ | queens has different url nowadays | 17:06 |
spatel | let me check | 17:07 |
spatel | so we do hard code URL in OSA? | 17:07 |
noonedeadpunk_ | ah, rabbitmq_install_method: external_repo | 17:07 |
noonedeadpunk_ | I think you can set rabbitmq_install_method: file and it should work then | 17:07 |
spatel | where i should set rabbitmq_install_method ? | 17:08 |
noonedeadpunk_ | oh, well, you have issue with erlang url actually | 17:08 |
spatel | trying to understand what is going on.. my my yum doesn't like repo | 17:09 |
noonedeadpunk_ | spatel: btw https://dl.bintray.com/rabbitmq/rpm/erlang/19/el/7/repodata/repomd.xml works for me | 17:12 |
*** rpittau is now known as rpittau|afk | 17:12 | |
spatel | it works for me if i do curl but yum doesn't like it, maybe i have a proxy issue - http://paste.openstack.org/show/800904/ | 17:12 |
spatel | why is the repo pointing to a proxy? | 17:13 |
noonedeadpunk_ | I think you should comment that out.... | 17:13 |
spatel | removing it and see if it works | 17:13 |
noonedeadpunk_ | I think one day we were having rpm proxy on repo server or smth like this... | 17:13 |
*** kukacz has quit IRC | 17:14 | |
jrosser | there was a cache on the repo server | 17:14 |
spatel | noonedeadpunk_: what is this one - http://paste.openstack.org/show/800905/ | 17:15 |
spatel | this is also causing issue | 17:15 |
spatel | or may be i have broken repo container | 17:15 |
spatel | This is broken URL - https://packagecloud.io/rabbitmq/rabbitmq-server/el/7/ | 17:17 |
spatel | its trying to install RabbitMQ from that repo | 17:17 |
noonedeadpunk_ | I'm not sure if it's broken | 17:18 |
noonedeadpunk_ | we use the same repo till now and it's working | 17:19 |
noonedeadpunk_ | It has been never browsable though | 17:19 |
spatel | baseurl = https://packagecloud.io/rabbitmq/rabbitmq-server/el/7/$basearch | 17:19 |
spatel | go to that URL | 17:19 |
spatel | it failed | 17:19 |
spatel | looks like they moved everyone to https://packagecloud.io/rabbitmq/rabbitmq-server/ and removed the /el/7 directory | 17:20 |
noonedeadpunk_ | yeah but repo url looks still valid for me | 17:21 |
noonedeadpunk_ | and heere are docs https://packagecloud.io/rabbitmq/rabbitmq-server/install#manual-rpm | 17:21 |
*** cshen has quit IRC | 17:22 | |
spatel | in my case its trying to reach this place and getting 404 error - https://packagecloud.io/rabbitmq/rabbitmq-server/repodata/repomd.xml | 17:23 |
spatel | noonedeadpunk_: should i run with the -e rabbitmq_upgrade=true option or without it? | 17:29 |
spatel | my playbook is getting stuck here: TASK [rabbitmq_server : Lock package versions] | 17:29 |
noonedeadpunk_ | well, you can set `rabbitmq_install_method` file I guess to install rabbitmq itself from URL | 17:31 |
noonedeadpunk_ | `rabbitmq_install_method: file` | 17:31 |
spatel | -e `rabbitmq_install_method: file` | 17:31 |
spatel | is this correct? | 17:32 |
noonedeadpunk_ | yep | 17:32 |
spatel | running openstack-ansible rabbitmq-install.yml -e 'rabbitmq_install_method: file' | 17:32 |
noonedeadpunk_ | for centos 8 I have `baseurl = https://packagecloud.io/rabbitmq/rabbitmq-server/el/8/$basearch`... | 17:33 |
spatel | I have centos 7 :( | 17:33 |
spatel | now its getting stuck at TASK [rabbitmq_server : Gather a list of the currently locked versions] | 17:34 |
noonedeadpunk_ | but by the logic, https://packagecloud.io/rabbitmq/rabbitmq-server/repodata should be the same? | 17:34 |
spatel | i think yum still keep looking for bad repo | 17:34 |
*** luksky has quit IRC | 17:35 | |
spatel | damn it totally stuck here - Gather a list of the currently locked versions | 17:36 |
spatel | not sure what its doing let me try -vvv | 17:36 |
noonedeadpunk_ | jrosser: I'm not sure what has happened, but https://review.opendev.org/c/openstack/openstack-ansible/+/766244 failed at exactly the same place, but when I added debug - it passed o_O | 17:36 |
noonedeadpunk_ | and now gerrit timeouts... | 17:37 |
spatel | noonedeadpunk_: is there a way in OSA i can disable RabbitMQ.repo for sometime :( | 17:41 |
noonedeadpunk_ | eventually if you set `rabbitmq_install_method: file` it should not add it | 17:42 |
noonedeadpunk_ | but if it's already there.... | 17:42 |
spatel | its already there currently | 17:43 |
spatel | do you think i should mv it and then re-run? | 17:43 |
spatel | let me try that | 17:43 |
spatel | els-erlang.repo is also broken so moving it too | 17:44 |
spatel | hope OSA won't install it back | 17:44 |
noonedeadpunk_ | you will need erlang | 17:45 |
spatel | that one also broken :( | 17:45 |
noonedeadpunk_ | I'm absolutely sure it's not | 17:45 |
noonedeadpunk_ | from what you've posted | 17:45 |
noonedeadpunk_ | have you commented out proxy setting? | 17:45 |
spatel | i did | 17:45 |
spatel | let me check again.. | 17:46 |
noonedeadpunk_ | well you posted working url.... | 17:46 |
spatel | noonedeadpunk_: it put RabbitMQ.repo back :( | 17:46 |
spatel | even i used file option | 17:46 |
noonedeadpunk_ | uh... | 17:46 |
spatel | damn it | 17:46 |
*** kukacz has joined #openstack-ansible | 17:47 | |
spatel | can i just comment this line in ansible role "Lock package versions" | 17:47 |
spatel | this is where its getting stuck | 17:48 |
*** gyee has joined #openstack-ansible | 17:57 | |
spatel | noonedeadpunk_: if i install the rpm by hand will OSA still go through yum? | 17:57 |
spatel | This is totally messed up :( | 17:57 |
noonedeadpunk_ | yep it will | 18:01 |
noonedeadpunk_ | unless you comment it out | 18:01 |
*** noonedeadpunk_ is now known as noonedeadpunk | 18:01 | |
spatel | comment out this playbook task right "Lock package versions" ? | 18:02 |
noonedeadpunk | it's pretty weird task tbh | 18:06 |
noonedeadpunk | I think you should check your current yum versionlock list | 18:07 |
noonedeadpunk | and clear all matches | 18:07 |
noonedeadpunk | and comment out task | 18:08 |
noonedeadpunk | maybe that's what is preventing you from using the repo | 18:08 |
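Concretely, inside the rabbitmq container that would be something like:

    yum versionlock list
    # delete the entries that 'list' printed for erlang/rabbitmq, or drop every lock:
    yum versionlock clear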
spatel | noonedeadpunk: this is what i have http://paste.openstack.org/show/800910/ | 18:09 |
noonedeadpunk | and btw, you should not get repo installed when set rabbitmq_install_method=file | 18:09 |
spatel | is this correct command openstack-ansible rabbitmq-install.yml -e 'rabbitmq_install_method: file' ? | 18:09 |
noonedeadpunk | as task run only when `rabbitmq_install_method == 'external_repo'` | 18:09 |
noonedeadpunk | -e rabbitmq_install_method=file | 18:09 |
spatel | damn it | 18:09 |
noonedeadpunk | `rabbitmq_install_method: file ` is proper for placing in user_variables.yml | 18:10 |
spatel | openstack-ansible rabbitmq-install.yml -e rabbitmq_install_method=file | 18:11 |
noonedeadpunk | honestly there's a bit of mess in queens role... | 18:11 |
spatel | without any quotes right? | 18:11 |
noonedeadpunk | you may add quotes if you wish, it's not critical | 18:11 |
spatel | noonedeadpunk: this is the error i got - http://paste.openstack.org/show/800911/ | 18:12 |
spatel | The checksum for /opt/rabbitmq-server.rpm did not match | 18:12 |
spatel | where is that checksum coming from? | 18:13 |
noonedeadpunk | rabbitmq_package_sha256 | 18:13 |
noonedeadpunk | then just set -e rabbitmq_package_sha256=f98a69b2c82c72c3e98bab263da5673e262c9148abb066ec5e9b0599bf280fdc | 18:13 |
spatel | adding that option with command | 18:14 |
-openstackstatus- NOTICE: The Gerrit service on review.opendev.org is currently responding slowly or timing out due to resource starvation, investigation is underway | 18:14 | |
spatel | re-running.. | 18:14 |
spatel | Same error The checksum for /opt/rabbitmq-server.rpm did not match | 18:15 |
spatel | can i comment out that line in Download the RabbitMQ package | 18:16 |
*** kukacz has quit IRC | 18:18 | |
noonedeadpunk | feel free | 18:19 |
noonedeadpunk | but ensure that url provides real rpm package... | 18:20 |
spatel | look like not | 18:20 |
spatel | rpm -ivh https://packagecloud.io/rabbitmq/rabbitmq-server/packages/el/7/rabbitmq-server-3.6.14-1.el7.noarch.rpm | 18:20 |
spatel | they changed something there | 18:20 |
spatel | this is webpage | 18:20 |
spatel | wget https://packagecloud.io/rabbitmq/rabbitmq-server/packages/el/7/rabbitmq-server-3.6.14-1.el7.noarch.rpm | 18:20 |
spatel | This looks like the real path - https://www.rabbitmq.com/releases/rabbitmq-server/v3.6.14/rabbitmq-server-3.6.14-1.el7.noarch.rpm | 18:21 |
noonedeadpunk | you can use https://www.rabbitmq.com/releases/rabbitmq-server/v3.6.14/ | 18:21 |
noonedeadpunk | yeah | 18:21 |
spatel | let me change it | 18:21 |
noonedeadpunk | you can set it with variable | 18:22 |
noonedeadpunk | rabbitmq_package_url | 18:22 |
spatel | in -e option right | 18:22 |
*** cshen has joined #openstack-ansible | 18:22 | |
noonedeadpunk | with -e or in user_variables | 18:22 |
spatel | done in user_variables rabbitmq_package_url: https://www.rabbitmq.com/releases/rabbitmq-server/v3.6.14/rabbitmq-server-3.6.14-1.el7.noarch.rpm | 18:23 |
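Collected in one place, the overrides used in this thread look like the following in user_variables (queens-era values; the sha256 has to be recomputed against whatever file that URL actually serves):

    # /etc/openstack_deploy/user_variables.yml
    rabbitmq_install_method: file
    rabbitmq_package_url: "https://www.rabbitmq.com/releases/rabbitmq-server/v3.6.14/rabbitmq-server-3.6.14-1.el7.noarch.rpm"
    # rabbitmq_package_sha256: "<sha256 of the rpm above>"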
spatel | let me re-run | 18:23 |
*** cshen has quit IRC | 18:26 | |
spatel | didn't work | 18:29 |
spatel | looks like its trying to install 3.6.16-1.el7 | 18:29 |
spatel | why didn't it use the URL i provided | 18:30 |
spatel | noonedeadpunk: - http://paste.openstack.org/show/800912/ | 18:30 |
noonedeadpunk | well the output complains about erlang, not rabbit... | 18:31 |
spatel | I have disabled this repo [rabbitmq_els-erlang] | 18:32 |
noonedeadpunk | and erlang can only be installed from the repo that was set up | 18:32 |
noonedeadpunk | the system repo does not contain the required erlang version | 18:32 |
spatel | hmm | 18:33 |
spatel | I have installed this manually [root@ostack-infra-02-rabbit-mq-container-aa705644 root]# rpm -ivh rabbitmq-server-3.6.14-1.el7.noarch.rpm | 18:34 |
spatel | let me see if OSA just configures my cluster | 18:35 |
spatel | i don't know why its trying to install erlang | 18:35 |
-spatel- [root@ostack-infra-02-rabbit-mq-container-aa705644 root]# rpm -qa | grep erlang | 18:35 | |
-spatel- erlang-19.3.6.8-1.el7.centos.x86_64 | 18:35 | |
spatel | i do have erlang already there | 18:36 |
spatel | looks like rpm -ivh rabbitmq-server-3.6.14-1.el7.noarch.rpm works.. now its doing the configuration of rabbit | 18:37 |
spatel | looks like OSA is just trying to upgrade erlang | 18:37 |
openstackgerrit | Marc Gariépy proposed openstack/openstack-ansible-os_horizon master: Add ability to configure ALLOWED_HOSTS for horizon. https://review.opendev.org/c/openstack/openstack-ansible-os_horizon/+/765998 | 18:38 |
*** kukacz has joined #openstack-ansible | 18:43 | |
kleini | Is that true, that CentOS 8 will go EOL end 2021? | 19:00 |
noonedeadpunk | yep - everybody is ranting for a while now | 19:00 |
kleini | you put so much work into CentOS8 support and this is all dog's breakfast. damn | 19:02 |
noonedeadpunk | yeah :( | 19:02 |
noonedeadpunk | that's really a disappointment. but even more for ppl who have upgraded CentOS 7 -> CentOS 8 | 19:03 |
*** andrewbonney has quit IRC | 19:03 | |
noonedeadpunk | luckily neither of us using centos in prod | 19:03 |
spatel | noonedeadpunk: thank you so much! | 19:03 |
spatel | my rabbitMQ is back | 19:03 |
spatel | this is not fun but :( | 19:03 |
spatel | Why doesn't OSA keep these binaries in a local repo so we don't need to deal with public servers | 19:04 |
noonedeadpunk | because we don't have local repo? | 19:04 |
noonedeadpunk | even in terms of resources | 19:04 |
spatel | what is repo container for? | 19:04 |
spatel | we can turn that to repo right? | 19:04 |
jrosser | spatel: there are all the hooks for you to host locally as file or to have mirrors locally | 19:05 |
noonedeadpunk | ah, well, it was for cache and that's why you had proxy | 19:05 |
noonedeadpunk | but we don't cache in repo anymore | 19:05 |
noonedeadpunk | and yes - you can define your own mirrors :) | 19:05 |
jrosser | in more recent releases all the caching is removed from the repo server because the assumption is if you want to host local versions you’ll need a local mirror anyway | 19:06 |
jrosser | so the repo cache was kind of pointless for that | 19:06 |
spatel | seriously something totally went wrong today, first rabbitmq.repo was broken.. i am going to investigate. | 19:06 |
spatel | Looks like these public repos moved and put a redirect on http, so they work in a browser but yum doesn't know how to handle it | 19:07 |
jrosser | also for queens the branch is extended-maintenance now so really no one is watching CI for it | 19:08 |
jrosser | things like rabbit url moving will go unnoticed | 19:09 |
spatel | agreed! i want to move off it but there are 1000 vms running on it and it would be a disaster to upgrade | 19:09 |
spatel | jrosser: may be i need to host my local CI server to keep checking these bugs | 19:10 |
jrosser | a local Jenkins job building the tag you care about might catch most major issues like repos moving | 19:10 |
jrosser | just AIO would do | 19:10 |
spatel | AIO would be good | 19:10 |
spatel | I'd like rabbitmq_install_method=file to be the default option :) | 19:11 |
noonedeadpunk | We're about to drop it.... | 19:13 |
spatel | drop what? | 19:13 |
noonedeadpunk | rabbitmq_install_method=file | 19:13 |
spatel | why ? | 19:13 |
noonedeadpunk | specifically the file method | 19:14 |
noonedeadpunk | because it adds much complexity. moreover you still need a repo for erlang, so it doesn't make much sense | 19:14 |
spatel | but it does work when you want to re-build cluster quickly. | 19:15 |
noonedeadpunk | well we never saw any issues with dropped packages from current repos in CI since Rocky at least | 19:15 |
*** cshen has joined #openstack-ansible | 19:16 | |
noonedeadpunk | can't say about before that | 19:16 |
spatel | agreed.. we can't keep feeding older shit... | 19:16 |
noonedeadpunk | also distro version of rabbit nowadays is valid, so you can probably use distro instead | 19:17 |
noonedeadpunk | (but not on centos 7) | 19:17 |
spatel | OSA use distro version to install rabbitmq right? | 19:18 |
spatel | what do you mean by use distro version? | 19:18 |
noonedeadpunk | no, we don't but it's available option for rabbitmq_install_method | 19:19 |
spatel | i think i should go back to lab and trying to play with it.. | 19:20 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible-rabbitmq_server/src/branch/master/releasenotes/notes/rabbit_install_method-b1defcd376f3bf87.yaml | 19:20 |
*** cshen has quit IRC | 19:20 | |
spatel | its hard to maintain openstack without knowledge of all playbooks and their function | 19:20 |
spatel | hmmm | 19:21 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Apply /etc/environment for runtime after adjustment https://review.opendev.org/c/openstack/openstack-ansible/+/766244 | 19:25 |
*** newtim has joined #openstack-ansible | 19:29 | |
*** mike44333 has joined #openstack-ansible | 19:33 | |
newtim | Has anyone found any workarounds for the systemd-python issue in Centos8? | 19:38 |
noonedeadpunk | I think we're about to land patches | 19:38 |
noonedeadpunk | https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/766030 | 19:38 |
newtim | awesome | 19:39 |
jrosser | noonedeadpunk: did you find why it apparently didnt work? | 19:39 |
jrosser | newtim: you can take that environment variable from the patch and use it locally if you need | 19:39 |
noonedeadpunk | https://review.opendev.org/c/openstack/openstack-ansible/+/766244 has 1 passed and 1 failed centos 8 metal jobs... | 19:39 |
noonedeadpunk | but what the debug showed is that the env var is set properly now | 19:40 |
jrosser | i looked in the venv build log on one of the failed jobs and you could see the CFLAGS being set to the wrong version string | 19:41 |
*** cshen has joined #openstack-ansible | 19:41 | |
jrosser | though the centos-8 job which passed was distro, so that won't be doing the venv build | 19:42 |
*** waverider has joined #openstack-ansible | 19:47 | |
*** waverider has quit IRC | 19:47 | |
*** waverider has joined #openstack-ansible | 19:47 | |
noonedeadpunk | I hope 766244 will pass, which means we can just set centos job to nv for 766030 | 19:48 |
noonedeadpunk | I haven't saved link for the results of the last run( | 19:48 |
*** waverider has quit IRC | 19:48 | |
*** waverider has joined #openstack-ansible | 19:48 | |
*** waverider is now known as adrian-a | 19:49 | |
jrosser | https://zuul.opendev.org/t/openstack/build/f94cbff02b194c6fb2a307c07383ab3a/log/logs/host/python_venv_build.log.txt | 19:50 |
*** adrian-a has quit IRC | 19:51 | |
*** adrian-a_ has joined #openstack-ansible | 19:54 | |
*** adrian-a_ has quit IRC | 19:54 | |
*** adrian-a has joined #openstack-ansible | 19:55 | |
spatel | noonedeadpunk: do you think after fixing the centos-8 job we can do the victoria release? | 19:58 |
admin0 | i had a hypervisor h1, which had to be renamed to h2. but now its old entry is still in the hypervisor list .. is there a way to delete it? | 19:59 |
spatel | yes we have a playbook to clean up | 20:01 |
spatel | admin0: https://docs.openstack.org/openstack-ansible/newton/developer-docs/ops-remove-computehost.html | 20:02 |
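The linked guide covers the full compute-host removal; for just the stale service/hypervisor record, the underlying CLI calls look roughly like the sketch below. This is not the OSA playbook itself: the cloud name and old hostname are placeholders, and it assumes a clouds.yaml entry the openstack CLI can use.

    # cleanup-stale-compute.yml (hypothetical sketch)
    - hosts: localhost
      gather_facts: false
      environment:
        OS_CLOUD: mycloud                    # assumption: clouds.yaml entry
      tasks:
        - name: List nova-compute services to spot the stale h1 record
          command: openstack compute service list --service nova-compute
          register: svc_list
          changed_when: false

        - name: Show the list so the stale service ID can be picked out
          debug:
            var: svc_list.stdout_lines

        # Once the ID of the old h1 entry is confirmed, deleting that service
        # record also drops the stale hypervisor entry:
        #   openstack compute service delete <service-id>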
*** maharg101 has quit IRC | 20:23 | |
*** adrian-a has quit IRC | 20:24 | |
*** adrian-a has joined #openstack-ansible | 20:25 | |
admin0 | how does volume attachment work when cinder is using ceph, but nova is using local storage? | 20:36 |
spatel | jrosser: can you explain why we add proxy=http://172.28.0.9:3142 in /etc/yum.conf? | 20:36 |
admin0 | will that compute node have /etc/ceph ? | 20:36 |
ThiagoCMC | admin0, I have that | 20:36 |
admin0 | i am trying to troubleshoot why, when the instance is on a non-ceph hypervisor, cinder is not attaching the volume | 20:37 |
admin0 | while if the instance is on a ceph backed hypervisor, the volume attaches fine | 20:37 |
spatel | noonedeadpunk: I found the issue why all my repos were messed up and nothing was working | 20:38 |
ThiagoCMC | Hmm... Well, my compute node has /etc/ceph and I can launch instances on local storage and attach ceph volumes. Or I can launch instances directly on the ceph vms pool too | 20:38 |
spatel | because of this proxy=http://172.28.0.9:3142 in /etc/yum.conf | 20:38 |
jrosser | spatel: on queens the repo server runs apt-cacher-ng and the nodes have config to get packages from there, iirc | 20:38 |
spatel | trying to understand why we need that setting on the rabbitMQ container? | 20:38 |
spatel | what kind of packages does it need for rabbitMQ? | 20:39 |
jrosser | it was on everything for queens I believe | 20:39 |
jrosser | the rabbitmq package and erlang | 20:39 |
spatel | so we do keep those binaries on the repo-container? | 20:40 |
jrosser | no, it’s a cache | 20:40 |
jrosser | the repo container runs a package cache | 20:40 |
*** nurdie has joined #openstack-ansible | 20:41 | |
jrosser | which does the upstream fetch of the package, so only done once for all hosts to then use | 20:41 |
spatel | So my rabbit-container ----------->[repo-container]------------>[public_repo_server] ? | 20:41 |
*** ThiagoCMC has left #openstack-ansible | 20:41 | |
*** ThiagoCMC has joined #openstack-ansible | 20:41 | |
jrosser | yes I think so | 20:41 |
spatel | if it downloads a package, then it keeps it in the cache for some days | 20:41 |
jrosser | I forget which release that all got removed | 20:42 |
*** nurdie has quit IRC | 20:42 | |
spatel | Looks like my whole issue was the repo server, not rabbitMQ | 20:42 |
jrosser | there is a release note in W for the removal of the package cache | 20:43 |
spatel | as soon as i removed that proxy all the yum repos are looking good and freaking fast.. | 20:43 |
jrosser | and I remember some extra tasks had to be added to clean up the proxy config to get rid of it | 20:43 |
spatel | jrosser: it would be great to not have that dependency.. these days the internet is so fast there's no point keeping a cache | 20:43 |
jrosser | queens is ancient :) | 20:44 |
jrosser | it is not there any more for many releases now | 20:44 |
spatel | jrosser: totally, but it's in my production and i have to keep feeding it until it dies. | 20:44 |
spatel | glad we removed that dependency :) | 20:45 |
jrosser | well I guess you know where to debug now, service on port 3142 of the repo server | 20:45 |
jrosser | because the same config will be laid down next time you run ansible, and probably exists everywhere else too | 20:45 |
spatel | debugging the repo server, i think i need a good monitoring script to keep checking its health | 20:46 |
jrosser | it’s via haproxy so that would be your first stop | 20:46 |
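A quick probe of that path can be scripted; a minimal sketch, reusing the address from the yum.conf proxy line quoted above and apt-cacher-ng's built-in report page:

    # check-pkg-cache.yml (hypothetical sketch)
    - hosts: localhost
      gather_facts: false
      tasks:
        - name: Probe the package cache behind haproxy on the repo server
          uri:
            url: http://172.28.0.9:3142/acng-report.html
            status_code: 200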
jrosser | I use Prometheus exporters for all of this stuff to get status | 20:47 |
spatel | I am using zabbix | 20:47 |
spatel | Do you use rally for continuously creating and deleting vms.. that kind of monitoring? | 20:48 |
spatel | I am looking for that kind of solution which creates and deletes a vm every 15 minutes (that way i can catch issues if anything breaks in the cluster) | 20:49 |
jrosser | we made a jenkins job to do that too :) | 20:54 |
jrosser | just with the ansible openstack modules | 20:54 |
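For reference, a rough sketch of such a canary job with the openstack.cloud collection; the cloud, image, flavor and network names are placeholders, and the play would be run from cron or jenkins every 15 minutes:

    # canary-server.yml (hypothetical sketch)
    - hosts: localhost
      gather_facts: false
      tasks:
        - name: Boot a throwaway canary instance
          openstack.cloud.server:
            cloud: mycloud              # assumption: clouds.yaml entry
            name: canary-test
            image: cirros               # placeholder image
            flavor: m1.tiny             # placeholder flavor
            network: private            # placeholder network
            state: present
            wait: true
            timeout: 600

        - name: Delete the canary instance again
          openstack.cloud.server:
            cloud: mycloud
            name: canary-test
            state: absent
            wait: true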
ierdem | Hi, I am trying to create a sahara cluster and I am encountering this error "reason: Heat stack failed with status Resource CREATE failed: ResourceInError: resources.new-master2.resources[0].resources.inst: Went to status ERROR due to "Message: No valid host was found. , Code: 500"". I have enough resources, nova services work fine. Do you have any | 20:55 |
ierdem | idea? Thanks | 20:55 |
*** newtim has quit IRC | 21:00 | |
admin0 | is there a way to have 2fa in keystone ? | 21:01 |
spatel | jrosser: does jenkins run that job periodically ? | 21:01 |
admin0 | i mean for users, on a kind of publicly open page | 21:01 |
admin0 | err .. in public facing clouds | 21:02 |
spatel | ierdem: for "Message: No valid host was found" i would look at nova logs like nova-scheduler, nova-conductor etc.. | 21:03 |
spatel | admin0: try googling it, i never thought about 2FA for keystone, but it should work | 21:05 |
jrosser | admin0: there is some documentation for keystone here and 2FA https://docs.openstack.org/keystone/latest/admin/multi-factor-authentication.html#multi-factor-authentication | 21:05 |
jrosser | but also if you use some external identity provider SSO then that might also be a route to getting 2FA | 21:06 |
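On the OSA side, the extra auth method (e.g. totp) has to be enabled in keystone.conf before the per-user MFA rules from the linked doc can require it. A hedged sketch using a config override; the methods list below is an assumption, so merge 'totp' into whatever your deployment already enables:

    # /etc/openstack_deploy/user_variables.yml (sketch)
    keystone_keystone_conf_overrides:
      auth:
        methods: password,token,application_credential,totp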
*** rfolco has quit IRC | 21:08 | |
*** kukacz has quit IRC | 21:11 | |
spatel | jrosser: restarting it with 'systemctl restart apt-cacher-ng.service' fixed the issue | 21:20 |
spatel | looking at the logs to see why it was not happy even though i can see it in the ps output | 21:20 |
*** gshippey has quit IRC | 21:28 | |
*** cshen has quit IRC | 21:50 | |
*** cshen has joined #openstack-ansible | 21:52 | |
*** adrian-a has left #openstack-ansible | 21:54 | |
spatel | Everyone here please sign the petition: https://www.change.org/p/centos-governing-board-do-not-destroy-centos-by-using-it-as-a-rhel-upstream | 22:11 |
admin0 | it might also split and become something else | 22:13 |
admin0 | centos i mean | 22:14 |
ThiagoCMC | I bet that now people will realize the same thing I realized back in 1998: Debian is the only way to go! :-P | 22:16 |
spatel | Until some big giant comes and buys Debian | 22:16 |
ThiagoCMC | Nah | 22:17 |
ThiagoCMC | Debian is truly awesome! It's light years ahead of everything else. | 22:18 |
spatel | not sure how many folks are running Debian on production clouds | 22:19 |
ThiagoCMC | Only the smart ones... lol | 22:20 |
ThiagoCMC | =P | 22:20 |
ThiagoCMC | Jokes apart, man, come on... CentOS is so hard. | 22:20 |
ThiagoCMC | It is not just a matter of opinion! | 22:20 |
ThiagoCMC | I'm happy it's gone. | 22:21 |
spatel | Believe me, CentOS was a very stable distro until IBM came into the picture. The only issue with ubuntu etc. is that they change a lot: every year a new distribution, and it's hard in production to catch up with all those changes | 22:25 |
*** cshen has quit IRC | 22:34 | |
spatel | is https://review.opendev.org/ down? | 22:38 |
*** kukacz has joined #openstack-ansible | 22:53 | |
*** jbadiapa has quit IRC | 22:54 | |
admin0 | i don't like ubuntu using snap in everything .. i migrated my servers to debian | 23:03 |
admin0 | ubuntu server is not bad though :) | 23:04 |
admin0 | if you have good resources, it just works fine | 23:05 |
admin0 | in my case it was centos + cpanel -> directadmin + debian | 23:07 |
*** spatel has quit IRC | 23:12 | |
*** SecOpsNinja has left #openstack-ansible | 23:17 | |
ThiagoCMC | My main problem with CentOS is that its repository is (was) so small that you have to use third party repos, or worse, maintain the repos yourself, or even worse, install things from tarballs... The Debian repo is huge, virtually everything is there for you to just "apt install" it. For example, when I was working with DPDK: with Ubuntu, just "apt install dpdk"; on CentOS? forget it. | 23:21 |
ThiagoCMC | And you can stick with LTS for years, don't have to update it every year... Even my Desktop is Ubuntu LTS, I don't care about the non-LTS flavors, at all. | 23:22 |
ThiagoCMC | The DPDK example is classic for me... Even OVS on Ubuntu comes with DPDK if you want.. Soooo easy and stable! I saw people suffering trying to make it work on CentOS and I was just rolling my eyes lol | 23:23 |
ThiagoCMC | admin0, I'll migrate to Debian as soon as they fix this: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=768073 ;-) | 23:24 |
openstack | Debian bug 768073 in wnpp "ITP: lxd -- The Linux Container Daemon" [Wishlist,Open] | 23:24 |
ThiagoCMC | Big fan of LXD! But hate snapd... lol | 23:25 |
admin0 | my octavia final test is blocked due to this: https://review.opendev.org/c/openstack/neutron/+/740588/ :D | 23:26 |
admin0 | after that will be trove | 23:26 |
*** lemko3 has joined #openstack-ansible | 23:28 | |
ThiagoCMC | admin0, you have to teach me about Octavia! | 23:30 |
ThiagoCMC | You can manually change the arp_protect.py, I'm doing this in my cloud. | 23:30 |
ThiagoCMC | https://bugs.launchpad.net/neutron/+bug/1887281 <- hit there, it affects you too! | 23:31 |
openstack | Launchpad bug 1887281 in neutron "[linuxbridge] ebtables delete arp protect chain fails" [Medium,Fix released] - Assigned to Lukas Steiner (steinerlukas) | 23:31 |
*** lemko has quit IRC | 23:31 | |
*** lemko3 is now known as lemko | 23:31 | |
admin0 | no manual work man .. manual means you forget and introduce drift .. i'd better have it merged and do it the proper way | 23:44 |
admin0 | there seems to be another way that jrosser was telling me about: clone the whole repo and use that checksum .. i will wait till tomorrow to see if it merges .. and if not, try it out | 23:49 |
ThiagoCMC | True, but it was printing errors, so I had to fix it manually. :-/ | 23:50 |
ThiagoCMC | Cool! | 23:50 |