noonedeadpunk | mornings | 07:48 |
---|---|---|
jrosser | good morning | 07:49 |
noonedeadpunk | huh, https://review.opendev.org/c/openstack/openstack-ansible/+/879069 is weird indeed | 07:50 |
noonedeadpunk | Sounds like https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/878929/2/vars/redhat-9.yml is off somehow | 07:51 |
noonedeadpunk | yup, no exclude here https://zuul.opendev.org/t/openstack/build/625d1c3798414708b0f14fdec1054688/log/logs/etc/openstack/aio1_neutron_server_container-1b9896cf/yum.repos.d/rdo-deps.repo.txt | 07:52 |
noonedeadpunk | but it's weird - distribution looks nice https://zuul.opendev.org/t/openstack/build/625d1c3798414708b0f14fdec1054688/log/logs/etc/host/openstack_deploy/ansible_facts/aio1_neutron_server_container-1b9896cf.txt#39 | 07:53 |
jrosser | that looks odd `changed: [aio1] => (item={'name': 'rdo-deps', 'file': 'rdo-deps', 'description': 'rdo-deps', 'baseurl': 'https://trunk.rdoproject.org/centos9-zed/deps/latest/', 'gpgcheck': False, 'module_hotfixes': True, 'exclude': False})` | 07:57 |
jrosser | exclude: False | 07:57 |
jrosser | oh i know what it is | 07:58 |
jrosser | there always needs to be a set of ( ) around the condition: `"{{ (foo == bar) | ternary(this, that) }}"` | 07:59 |
jrosser | otherwise it actually tests if foo is equal to `bar | ternary(this, that)` | 07:59 |
noonedeadpunk | damn it | 08:00 |
jrosser | `|` is a super high precedence operator | 08:00 |
noonedeadpunk | Yeah, I tend to always set round brackets, no idea how I managed to forget this time | 08:00 |
noonedeadpunk | btw I've spotted same issue here https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/878771/4/tasks/haproxy_service_config.yml damiandabrowski | 08:01 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts master: Fix package exclude condition for rdo-deps https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/879272 | 08:05 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible-haproxy_server master: Fix haproxy_service_configs format conversion https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/878771 | 10:52 |
Elnaz | Hi | 12:20 |
Elnaz | https://meetings.opendev.org/irclogs/%23openstack-ansible/%23openstack-ansible.2023-03-06.log.html | 12:21 |
Elnaz | > i have a pretty large deployment of ELK using that repo | 12:21 |
Elnaz | jrosser: ^ Can you explain what architecture you are using ELK with? | 12:21 |
opendevreview | Merged openstack/openstack-ansible-openstack_hosts master: Fix package exclude condition for rdo-deps https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/879272 | 12:25 |
admin1 | if nova is local disk, but cinder/glance was ceph .. i recall we needed to use some direct-path setting to make snapshot work else it will time out ( and also same for using images via horizon ) .. do i recall correctly ? | 12:29 |
admin1 | i am trying to remember the variable | 12:29 |
hamidlotfi | Hi there, | 12:40 |
hamidlotfi | According to the last post, I wanted to scale my OSA environment with your guide published at https://docs.openstack.org/openstack-ansible/latest/admin/scale-environment.html | 12:40 |
hamidlotfi | I added an infra04 to my env with the series of commands you gave me in my latest posts: | 12:40 |
hamidlotfi | ` 1# openstack-ansible playbooks/setup-hosts.yml --limit localhost,infra04,infra04-host_containers` | 12:40 |
hamidlotfi | `2# openstack-ansible playbooks/openstack-hosts-setup.yml` | 12:40 |
hamidlotfi | `3# openstack-ansible playbooks/setup-infrastructure.yml -e galera_force_bootstrap=true` | 12:40 |
hamidlotfi | `4# openstack-ansible playbooks/os-keystone-install.yml` | 12:40 |
hamidlotfi | `5# openstack-ansible playbooks/setup-openstack.yml --limit '!keystone_all',localhost,infra04,infra04-host_containers` | 12:40 |
hamidlotfi | In step 5, the cinder section shows this error: | 12:40 |
hamidlotfi | `FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_local'\n\nThe error appears to be in '/opt/openstack-ansible/playbooks/os-cinder-install.yml': line 69, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n # venv tag for all hosts in the 'cinder_all' host group.\n - name: Gather software version list\n ^ here\n"}` | 12:40 |
hamidlotfi | sorry for long message 😔 | 12:41 |
hamidlotfi | @jrosser | 12:47 |
jrosser | hello | 12:47 |
jrosser | @hamidlotfi can you look at what hosts you have in the cinder_all group? | 12:48 |
hamidlotfi | Let me check it | 12:48 |
hamidlotfi | [cinder_all:children] | 12:49 |
hamidlotfi | cinder_api | 12:49 |
hamidlotfi | cinder_backup | 12:49 |
hamidlotfi | cinder_scheduler | 12:49 |
hamidlotfi | cinder_volume | 12:49 |
hamidlotfi | [cinder_api] | 12:49 |
hamidlotfi | [cinder_backup] | 12:49 |
hamidlotfi | [cinder_scheduler] | 12:49 |
hamidlotfi | [cinder_volume] | 12:49 |
jrosser | noonedeadpunk: ^ this is the "add infra node" equivalent of the "add compute node" facts gathering trouble | 12:50 |
jrosser | hamidlotfi: hmm ok | 12:50 |
jrosser | hamidlotfi: the easy answer is to add "cinder_all" to the --limit on step 5 | 12:51 |
jrosser | hamidlotfi: i don't think it is possible to write a set of instructions here that works for every possible deployment | 12:51 |
hamidlotfi | It means this command like this: | 12:52 |
hamidlotfi | `openstack-ansible playbooks/setup-openstack.yml --limit '!keystone_all',localhost,infra04,infra04-host_containers,cinder_all` | 12:52 |
hamidlotfi | @jrosser Is it correct ? | 12:52 |
jrosser | hamidlotfi: honestly i do not know, because i don't know about your deployment | 12:53 |
jrosser | you need to make sure that the play runs against all the hosts in the cinder_all group | 12:54 |
hamidlotfi | I check it right now | 12:56 |
noonedeadpunk | I'm pretty sure that ansible-core 2.13 just fixed local facts gathering, so that limit doesn't affect them anymore... But for infra case it's waaay more tricky to add some var to skip these steps | 13:07 |
jrosser | these instructions with --limit look unhelpful | 13:10 |
noonedeadpunk | so we likely should evaluate other ways of deciding if migration is needed or if all hosts are executed | 13:10 |
jrosser | and still there is the case of wanting to handle having a node down too | 13:11 |
noonedeadpunk | for me if node is down - it's not a reason not to execute migration. It will likely cause troubles with this specific node once it's booted, but still migrations should pass. | 13:15 |
noonedeadpunk | we probably should have discussed that previous week actually | 13:17 |
noonedeadpunk | my bad it wasn't on the agenda, as it deserves to be there | 13:17 |
noonedeadpunk | I'm also trying to think of a good reason to restart all services like we do... | 13:22 |
noonedeadpunk | As I'm not sure on why we're doing that at all. Like to get new rpc version we need to update code. If we're updating it - services are restarted regardless. | 13:23 |
noonedeadpunk | ok, we're running in serial with limits... But again, each service will use the default rpc version that's in the code, so changing the venv or updating packages should be just enough to trigger all required restarts | 13:24 |
NeilHanlon | btw it does look like rocky 9 will end up with Python 3.11, so we should be "okay" in that respect | 13:25 |
noonedeadpunk | NeilHanlon: will it end up the same way like centos usually does - without any extra pre-built libs? | 13:25 |
NeilHanlon | unsure :( but i will find out more.. we just noticed some new python packages in the RHEL 8.8 beta | 13:26 |
noonedeadpunk | like libselinux-python or python-libvirt? | 13:26 |
noonedeadpunk | as without that it's close to be useless | 13:26 |
noonedeadpunk | So another usecase we have - db migrations. I kind of wonder if there might be a way to check if they are needed using nova-manage itself | 13:30 |
hamidlotfi | @jrosser It seems to pass the error. | 13:30 |
noonedeadpunk | or we can do like we do for all other services - running against last host in the group | 13:31 |
noonedeadpunk | So technically, I want to just drop all that https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/os-nova-install.yml#L31-L163 if it's possible ofc. But I don't have any multi-node sandbox to play with atm... | 13:32 |
noonedeadpunk | Also checking that https://docs.openstack.org/nova/latest/cli/nova-manage.html#db-online-data-migrations - it says "after upgrading database schema and nova services on all controller nodes" | 13:34 |
noonedeadpunk | So this is weird then https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/master/tasks/main.yml#L292-L302 | 13:34 |
noonedeadpunk | We should really refactor all that.... | 13:36 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix dstat run in gates https://review.opendev.org/c/openstack/openstack-ansible/+/879355 | 15:01 |
damiandabrowski | what is the current status of gates? are they still broken? | 16:05 |
damiandabrowski | openstack-ansible-deploy-infra_lxc-centos-9-stream failed twice for https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/878771 | 16:06 |
damiandabrowski | fatal: [aio1_repo_container-40ec3906]: FAILED! => {"changed": false, "msg": "Could not find the requested service systemd-tmpfiles-setup-dev: host"} | 16:06 |
noonedeadpunk | damiandabrowski: https://review.opendev.org/c/openstack/openstack-ansible/+/879069/2 is needed | 16:33 |
noonedeadpunk | you can use depends-on | 16:33 |
damiandabrowski | ahhh, thanks | 16:34 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix dstat run in gates https://review.opendev.org/c/openstack/openstack-ansible/+/879355 | 16:39 |
admin1 | is there a limit or timeout somehow when using local storage and glance (ceph ) based snapshots | 17:07 |
admin1 | i get a Broken pipe | 17:07 |
admin1 | i think it was some option in haproxy regarding http vs tcp | 17:07 |
admin1 | i recall this info in bits and pieces from conversation here | 17:07 |
admin1 | the best i recall is glance haproxy backend to be in tcp vs http | 17:13 |
damiandabrowski | glance can run behind uwsgi or standalone | 17:24 |
damiandabrowski | if you want to avoid issues with ceph, i'd recommend disabling uwsgi | 17:24 |
damiandabrowski | wait, I should have a bug report somewhere | 17:24 |
admin1 | is there a way to use variables to do that ? disable uwsgi and change http -> tcp ? | 17:24 |
admin1 | i am downloading the 2020 irc logs we had to check it out | 17:26 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix dstat run in gates https://review.opendev.org/c/openstack/openstack-ansible/+/879355 | 17:26 |
damiandabrowski | we have disabled uwsgi for glance with glance_use_uwsgi: False | 17:28 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Tune settings in galera server for reduced ram in all-in-one build https://review.opendev.org/c/openstack/openstack-ansible/+/877278 | 17:29 |
damiandabrowski | seems like switching to http is also an option but i didn't test it: https://bugs.launchpad.net/glance/+bug/1916482 | 17:30 |
damiandabrowski | perhaps you can switch to tcp by overriding haproxy_balance_type with haproxy_glance_api_service_overrides | 17:30 |
damiandabrowski | switching to tcp is also an option* | 17:32 |
noonedeadpunk | admin1: you need to either disable uwsgi or use tcp. `glance_use_uwsgi` is the variable to control that | 17:34 |
noonedeadpunk | But we have another bug that service likely won't be restarted when you change this variable. | 17:34 |
admin1 | haproxy or glance ? | 17:35 |
noonedeadpunk | switching to tcp is way less trivial | 17:35 |
noonedeadpunk | glance-api | 17:35 |
admin1 | yeah .. is that setting alone enough to fix it ? | 17:36 |
noonedeadpunk | yup | 17:36 |
noonedeadpunk | probably we should disable it by default to be frank... or at least when we see that ceph is going to be used | 17:38 |
jrosser | noonedeadpunk: we have both ceph read and write caches working here - is that something for ceph_client role? | 17:42 |
jrosser | it’s really all done on the computes | 17:43 |
noonedeadpunk | What do you have on mind for the role? As I guess you can place custom ceph.conf there or do overrides? | 17:44 |
jrosser | there’s some directories to make with the right permission, and a drop in is needed for the read cache service | 17:44 |
jrosser | and the package for the read cache installing actually | 17:45 |
jrosser | so pretty small really | 17:45 |
noonedeadpunk | well, we can introduce couple of new vars then I assume? | 17:47 |
jrosser | yes, two bools I guess to enable the read / write caches, and some defaults for dir paths… something like that | 17:48 |
noonedeadpunk | ++ | 17:48 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Disable uWSGI if ceph is used as a store https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/879370 | 17:48 |
jrosser | we got 3x throughput with the read cache | 17:49 |
jrosser | and similar with write, even with nvme osds | 17:49 |
jrosser | well write was more ops/sec actually, depends how you read FIO output | 17:50 |
noonedeadpunk | Well, I assume for write it's a bit different though, as it just does them in an async way | 17:50 |
noonedeadpunk | As it acks the write without network latency (only with the latency of the local drive) | 17:50 |
jrosser | right, but interestingly it seems also to be able to read from it | 17:50 |
jrosser | for recently written stuff | 17:50 |
noonedeadpunk | Huh, interesting... I'm kind of more sceptical about writes, as I'm quite afraid of what will be consequences of local drive failure | 17:51 |
noonedeadpunk | As I'd assume they will wear out really fast | 17:51 |
jrosser | oh indeed - it’s all totally dependent on the workload and on the consequences of failure | 17:52 |
jrosser | anyway - I will see if I can write some code as there’s basically no example anywhere of making this work | 17:53 |
noonedeadpunk | I have no idea about failure, I assume it will result in broken filesystems? As it should not block PGs, as PGs have no idea about locally cached actions I guess... | 17:53 |
noonedeadpunk | And well, compute failure is another interesting case - I guess you don't want to evacuate VMs with that anymore unless you really don't have any chance of bringing the compute back | 17:54 |
noonedeadpunk | but read cache is really appealing | 17:55 |
jrosser | yeah, it’s only caching the parent you snapshot from though, not anything you write subsequently | 17:56 |
jrosser | but for CI it’s probably really worth it, or any use case with huge read only datasets | 17:56 |
noonedeadpunk | yeah, totally worth having sample/docs on how to do that as it's very interesting and might be worth the risk - it's anyway not worse than just local drives but with comparable performance | 17:57 |
noonedeadpunk | btw it's an interesting question whether live-migration works with write cache on vms with intense iops... | 17:58 |
jrosser | as far as compute failure goes the cache is integrated with the lock on the volume | 17:58 |
noonedeadpunk | yeah, so you're having exclusive lock, right | 17:58 |
jrosser | oh yes live migration :) didn’t try that as it’s only set up on one node right now…. | 18:00 |
noonedeadpunk | I'd assume race conditions are possible there... | 18:01 |
noonedeadpunk | But I didn't test any of these for the matter of fact, so everything I'm saying is just speculation | 18:04 |
noonedeadpunk | or fud :D | 18:04 |
admin1 | the unable to restart when doing glance_use_uwsgi: False .. is that one time .. or every time ? | 18:05 |
admin1 | because if it's one time, i can manually stop/start the lxc container for example | 18:06 |
damiandabrowski | i can recall a situation when switching between glance uwsgi<>standalone was leaving the old process running and I had to kill it manually | 18:15 |
damiandabrowski | maybe the same happened to you | 18:15 |
jrosser | I guess it does “do the not uwsgi thing” rather than “undo the uwsgi thing” | 18:16 |
admin1 | changing to tcp seems to have worked . testing with a bigger image/snapshot | 19:09 |
admin1 | thanks guys | 19:09 |
opendevreview | Merged openstack/openstack-ansible master: Disable CentOS LXC jobs due to the bug in systemd packaging https://review.opendev.org/c/openstack/openstack-ansible/+/879069 | 19:22 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ceph_client stable/zed: Add EPEL GPG key for RHEL 9 https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/879186 | 20:02 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ceph_client stable/zed: Add thrift to includepkgs from EPEL https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/879187 | 20:03 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts stable/zed: Add openstack_hosts_file tag https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/879188 | 20:03 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts stable/yoga: Add openstack_hosts_file tag https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/879189 | 20:04 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts stable/xena: Add openstack_hosts_file tag https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/879390 | 20:04 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Tune settings in galera server for reduced ram in all-in-one build https://review.opendev.org/c/openstack/openstack-ansible/+/877278 | 20:30 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible master: Add support for TLS backends https://review.opendev.org/c/openstack/openstack-ansible/+/879085 | 20:30 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible master: Add support for TLS backends https://review.opendev.org/c/openstack/openstack-ansible/+/879085 | 20:33 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible-os_keystone master: Define CA cert when needed https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/879378 | 20:35 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible-os_keystone master: Rename keystone_ssl to keystone_backend_https https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/879379 | 20:35 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible-os_placement master: Add TLS support to placement backends https://review.opendev.org/c/openstack/openstack-ansible-os_placement/+/879380 | 20:37 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible-os_nova master: Add TLS support to nova API backends https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/874810 | 20:49 |
opendevreview | Merged openstack/openstack-ansible-plugins master: Revert "Ensure systemd-udev is installed for gluster" https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/878842 | 21:43 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible-os_nova master: Add TLS support to nova API backends https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/874810 | 23:44 |
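
The filter-precedence problem discussed at 07:59 can be illustrated with a small sketch. This is plain Python that only emulates the two ways the expression can be grouped; it is not Jinja itself, and the `ternary` helper, the `distribution` value, and the `pkg-a` / `pkg-b` strings are hypothetical stand-ins rather than the values from the actual review.

```python
def ternary(value, true_val, false_val):
    """Rough stand-in for the two-argument form of Ansible's `ternary` filter."""
    return true_val if value else false_val


# Hypothetical fact value, chosen only for illustration.
distribution = "CentOS"

# Intended grouping: (distribution == "Rocky") | ternary("pkg-a", "pkg-b")
# The comparison runs first, then the filter picks one of the two strings.
intended = ternary(distribution == "Rocky", "pkg-a", "pkg-b")

# Grouping without parentheses: distribution == ("Rocky" | ternary("pkg-a", "pkg-b"))
# The filter is applied to the literal "Rocky" (a truthy string), and only
# then is the result compared with `distribution`, which yields a boolean.
unparenthesized = distribution == ternary("Rocky", "pkg-a", "pkg-b")

print(intended)         # a string chosen by the filter
print(unparenthesized)  # a bare boolean
```

Under these assumptions the second form ends up as a boolean rather than one of the two strings, which is consistent with the `'exclude': False` seen in the task output quoted at 07:57.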