Monday, 2023-04-03

noonedeadpunk	mornings	07:48
jrosser	good morning	07:49
noonedeadpunk	huh, https://review.opendev.org/c/openstack/openstack-ansible/+/879069 is weird indeed	07:50
noonedeadpunk	Sounds like https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/878929/2/vars/redhat-9.yml is off somehow	07:51
noonedeadpunk	yup, no exclude here https://zuul.opendev.org/t/openstack/build/625d1c3798414708b0f14fdec1054688/log/logs/etc/openstack/aio1_neutron_server_container-1b9896cf/yum.repos.d/rdo-deps.repo.txt	07:52
noonedeadpunk	but it's weird - distribution looks nice https://zuul.opendev.org/t/openstack/build/625d1c3798414708b0f14fdec1054688/log/logs/etc/host/openstack_deploy/ansible_facts/aio1_neutron_server_container-1b9896cf.txt#39	07:53
jrosser	that looks odd `changed: [aio1] => (item={'name': 'rdo-deps', 'file': 'rdo-deps', 'description': 'rdo-deps', 'baseurl': 'https://trunk.rdoproject.org/centos9-zed/deps/latest/', 'gpgcheck': False, 'module_hotfixes': True, 'exclude': False})`	07:57
jrosser	exclude: False	07:57
jrosser	oh i know what it is	07:58
jrosser	theres always needing to be a set of ( ) "{{ (if foo == bar) | ternary(this, that) }}"	07:59
jrosser	otherwise it actually tests if foo is equal to bar | ternary(this, that)	07:59
noonedeadpunk	damn it	08:00
jrosser	| is super high precedence operator	08:00
noonedeadpunk	Yeah, I tend to always set round brackets, no idea how I managed to forget this time	08:00
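The precedence issue described above can be reproduced with a small playbook. This is only a sketch: the play, the variable names `foo` and `bar`, and their values are illustrative assumptions and are not taken from the openstack_hosts role. It relies on the fact that Jinja2 applies a filter before a comparison operator.

```yaml
# Hypothetical playbook; foo/bar and their values are illustrative only.
- hosts: localhost
  gather_facts: false
  vars:
    foo: "a"
    bar: "a"
  tasks:
    # Parsed as: foo == (bar | ternary('this', 'that'))
    # The filter runs on `bar` alone, then the result is compared to `foo`,
    # so the expression yields a boolean instead of 'this' or 'that'.
    - name: Without parentheses the filter binds to bar first
      ansible.builtin.debug:
        msg: "{{ foo == bar | ternary('this', 'that') }}"

    # The parentheses force the comparison to be evaluated first,
    # and ternary then selects between the two values.
    - name: With parentheses the comparison is evaluated first
      ansible.builtin.debug:
        msg: "{{ (foo == bar) | ternary('this', 'that') }}"
```

The boolean produced by the first form is consistent with the `'exclude': False` seen in the task output quoted earlier.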
noonedeadpunk	btw I've spotted same issue here https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/878771/4/tasks/haproxy_service_config.yml damiandabrowski	08:01
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-openstack_hosts master: Fix package exclude condition for rdo-deps https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/879272	08:05
opendevreview	Damian Dąbrowski proposed openstack/openstack-ansible-haproxy_server master: Fix haproxy_service_configs format conversion https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/878771	10:52
Elnaz	Hi	12:20
Elnaz	https://meetings.opendev.org/irclogs/%23openstack-ansible/%23openstack-ansible.2023-03-06.log.html	12:21
Elnaz	> i have a pretty large deployment of ELK using that repo	12:21
Elnaz	jrosser: ^ Can you explain what architecture you are using ELK with?	12:21
opendevreview	Merged openstack/openstack-ansible-openstack_hosts master: Fix package exclude condition for rdo-deps https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/879272	12:25
admin1	if nova is local disk, but cinder/glance was ceph .. i recall we needed to use some direct-path setting to make snapshot work else it will time out ( and also same for using images via horizon ) .. do i recall correctly ?	12:29
admin1	i am trying to remember the variable	12:29
hamidlotfi	Hi there,	12:40
hamidlotfi	According to the last post, I wanted to scale my OSA environment with your guide published at https://docs.openstack.org/openstack-ansible/latest/admin/scale-environment.html	12:40
hamidlotfi	I added an infra04 to my env with the series of commands you gave me in my latest posts:	12:40
hamidlotfi	` 1# openstack-ansible playbooks/setup-hosts.yml --limit localhost,infra04,infra04-host_containers`	12:40
hamidlotfi	`2# openstack-ansible playbooks/openstack-hosts-setup.yml`	12:40
hamidlotfi	`3# openstack-ansible playbooks/setup-infrastructure.yml -e galera_force_bootstrap=true`	12:40
hamidlotfi	`4# openstack-ansible playbooks/os-keystone-install.yml`	12:40
hamidlotfi	`5# openstack-ansible playbooks/setup-openstack.yml --limit '!keystone_all',localhost,infra04,infra04-host_containers`	12:40
hamidlotfi	In step 5, the cinder section shows this error	12:40
hamidlotfi	`FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'ansible_local'\n\nThe error appears to be in '/opt/openstack-ansible/playbooks/os-cinder-install.yml': line 69, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n # venv tag for all hosts in	12:40
hamidlotfi	the 'cinder_all' host group.\n - name: Gather software version list\n ^ here\n"}`	12:40
hamidlotfi	sorry for long message 😔	12:41
hamidlotfi	@jrosser	12:47
jrosser	hello	12:47
jrosser	@hamidlotfi can you look at what hosts you have in the cinder_all group?	12:48
hamidlotfi	Let me check it	12:48
hamidlotfi	[cinder_all:children]	12:49
hamidlotfi	cinder_api	12:49
hamidlotfi	cinder_backup	12:49
hamidlotfi	cinder_scheduler	12:49
hamidlotfi	cinder_volume	12:49
hamidlotfi	[cinder_api]	12:49
hamidlotfi	[cinder_backup]	12:49
hamidlotfi	[cinder_scheduler]	12:49
hamidlotfi	[cinder_volume]	12:49
jrosser	noonedeadpunk: ^ this is the "add infra node" equivalent of the "add compute node" facts gathering trouble	12:50
jrosser	hamidlotfi: hmm ok	12:50
jrosser	hamidlotfi: the easy answer is to add "cinder_all" to the --limit on step 5	12:51
jrosser	hamidlotfi: i don't think it is possible to write a set of instructions here that works for every possible deployment	12:51
hamidlotfi	It means this command like this:	12:52
hamidlotfi	`openstack-ansible playbooks/setup-openstack.yml --limit '!keystone_all',localhost,infra04,infra04-host_containers,cinder_all`	12:52
hamidlotfi	@jrosser Is it correct ?	12:52
jrosser	hamidlotfi: honestly i do not know, because i don't know about your deployment	12:53
jrosser	you need to make sure that the play runs against all the hosts in the cinder_all group	12:54
hamidlotfi	I check it right now	12:56
noonedeadpunk	I'm pretty sure that ansible-core 2.13 just fixed local facts gathering, so that limit doesn't affect them anymore... But for infra case it's waaay more tricky to add some var to skip these steps	13:07
jrosser	these instructions with --limit look unhelpful	13:10
noonedeadpunk	so we likely should evaluate other ways of deciding if migration is needed or if all hosts are executed	13:10
jrosser	and still there is the case of wanting to handle having a node down too	13:11
noonedeadpunk	for me if node is down - it's not a reason not to execute migration. It will likely cause troubles with this specific node once it's booted, but still migrations should pass.	13:15
noonedeadpunk	we probably should have discussed that previous week actually	13:17
noonedeadpunk	my bad it wasn't in agenda, as it's worth to be there	13:17
noonedeadpunk	I'm also trying to think of a good reason to restart all services like we do...	13:22
noonedeadpunk	As I'm not sure on why we're doing that at all. Like to get new rpc version we need to update code. If we're updating it - services are restarted regardless.	13:23
noonedeadpunk	ok, we're running in serial with limits... But again, each service will use default rpc version that's in code, so changing venv or updating packages should be just enough to trigger all required restarts	13:24
NeilHanlon	btw it does look like rocky 9 will end up with Python 3.11, so we should be "okay" in that respect	13:25
noonedeadpunk	NeilHanlon: will it end up the same way like centos usually does - without any extra pre-built libs?	13:25
NeilHanlon	unsure :( but i will find out more.. we just noticed some new python packages in the RHEL 8.8 beta	13:26
noonedeadpunk	like libselinux-python or python-libvirt?	13:26
noonedeadpunk	as without that it's close to be useless	13:26
noonedeadpunk	So another usecase we have - db migrations. I kind of wonder if there might be a way to check if they are needed using nova-manage itself	13:30
hamidlotfi	@jrosser It seems to pass the error.	13:30
noonedeadpunk	or we can do like we do for all other services - running against last host in the group	13:31
noonedeadpunk	So technically, I want to just drop all that https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/os-nova-install.yml#L31-L163 if it's possible ofc. But I don't have any multi-node sandbox to play with atm...	13:32
noonedeadpunk	Also checking that https://docs.openstack.org/nova/latest/cli/nova-manage.html#db-online-data-migrations - it says "after upgrading database schema and nova services on all controller nodes"	13:34
noonedeadpunk	So this is weird then https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/master/tasks/main.yml#L292-L302	13:34
noonedeadpunk	We should really refactor all that....	13:36
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix dstat run in gates https://review.opendev.org/c/openstack/openstack-ansible/+/879355	15:01
damiandabrowski	what is the current status of gates? are they still broken?	16:05
damiandabrowski	openstack-ansible-deploy-infra_lxc-centos-9-stream failed twice for https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/878771	16:06
damiandabrowski	fatal: [aio1_repo_container-40ec3906]: FAILED! => {"changed": false, "msg": "Could not find the requested service systemd-tmpfiles-setup-dev: host"}	16:06
noonedeadpunk	damiandabrowski: https://review.opendev.org/c/openstack/openstack-ansible/+/879069/2 is needed	16:33
noonedeadpunk	you can use depends-on	16:33
damiandabrowski	ahhh, thanks	16:34
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix dstat run in gates https://review.opendev.org/c/openstack/openstack-ansible/+/879355	16:39
admin1	is there a limit or timeout somehow when using local storage and glance (ceph ) based snapshots	17:07
admin1	i get a Broken pipe	17:07
admin1	i think it was some option in haproxy regarding http vs tcp	17:07
admin1	i recall this info in bits and pieces from conversation here	17:07
admin1	the best i recall is glance haproxy backend to be in tcp vs http	17:13
damiandabrowski	glance can run behind uwsgi or standalone	17:24
damiandabrowski	if you want to avoid issues with ceph, i'd recommend disabling uwsgi	17:24
damiandabrowski	wait, I should have a bug report somewhere	17:24
admin1	is there a way to use variables to do that ? disable uwsgi and change http -> tcp ?	17:24
admin1	i am downloading the 2020 irc logs we had to check it out	17:26
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Fix dstat run in gates https://review.opendev.org/c/openstack/openstack-ansible/+/879355	17:26
damiandabrowski	we have disabled uwsgi for glance with glance_use_uwsgi: False	17:28
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Tune settings in galera server for reduced ram in all-in-one build https://review.opendev.org/c/openstack/openstack-ansible/+/877278	17:29
damiandabrowski	seems like switching to http is also an option but i didn't test it: https://bugs.launchpad.net/glance/+bug/1916482	17:30
damiandabrowski	perhaps you can switch to tcp by overriding haproxy_balance_type with haproxy_glance_api_service_overrides	17:30
damiandabrowski	switching to tcp is also an option*	17:32
noonedeadpunk	admin1: you need to either disable uwsgi or use tcp. `glance_use_uwsgi` is the variable to control that	17:34
noonedeadpunk	But we have another bug that service likely won't be restarted when you change this variable.	17:34
admin1	haproxy or glance ?	17:35
noonedeadpunk	switching to tcp is way less trivial	17:35
noonedeadpunk	glance-api	17:35
admin1	yeah .. is that setting alone enough to fix it ?	17:36
noonedeadpunk	yup	17:36
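The two options discussed above could be expressed as deployment overrides roughly as follows. This is a sketch under assumptions: the file path is the usual OSA override location rather than something stated in the conversation, and the tcp variant was only suggested with "perhaps" and described as less trivial, so it is left commented out and untested.

```yaml
# Assumed location: /etc/openstack_deploy/user_variables.yml

# Option 1 (confirmed in the discussion): run glance-api standalone
# instead of behind uWSGI.
glance_use_uwsgi: False

# Option 2 (suggested with "perhaps", not verified here): switch the
# glance haproxy backend from http to tcp via the service overrides.
# haproxy_glance_api_service_overrides:
#   haproxy_balance_type: tcp
```

As noted in the conversation, glance-api may not be restarted automatically after this variable changes, so the old process may need to be stopped by hand.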
noonedeadpunk	probably we should disable it by default to be frank... or at least when we see that ceph is going to be used	17:38
jrosser	noonedeadpunk: we have both ceph read and write caches working here - is that something for ceph_client role?	17:42
jrosser	it’s really all done on the computes	17:43
noonedeadpunk	What do you have on mind for the role? As I guess you can place custom ceph.conf there or do overrides?	17:44
jrosser	there’s some directories to make with the right permission, and a drop in is needed for the read cache service	17:44
jrosser	and the package for the read cache installing actually	17:45
jrosser	so pretty small really	17:45
noonedeadpunk	well, we can introduce couple of new vars then I assume?	17:47
jrosser	yes, two bools I guess to enable the read / write caches, and some defaults for dir paths… something like that	17:48
noonedeadpunk	++	17:48
opendevreview	Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_glance master: Disable uWSGI if ceph is used as a store https://review.opendev.org/c/openstack/openstack-ansible-os_glance/+/879370	17:48
jrosser	we got 3x throughput with the read cache	17:49
jrosser	and similar with write, even with nvme osds	17:49
jrosser	well write was more ops/sec actually, depends how you read FIO output	17:50
noonedeadpunk	Well, I assume for write it's a bit different though, as it just does them in an async way	17:50
noonedeadpunk	As it ack write without latency (with latency of local drive)	17:50
jrosser	right, but interestingly it seems also to be able to read from it	17:50
jrosser	for recently written stuff	17:50
noonedeadpunk	Huh, interesting... I'm kind of more sceptical about writes, as I'm quite afraid of what will be consequences of local drive failure	17:51
noonedeadpunk	As I'd assume they will wear out really fast	17:51
jrosser	oh indeed - it’s all totally workload and dependant on consequences of failure	17:52
jrosser	anyway - I will see if I can write some code as there’s basically no example anywhere of making this work	17:53
noonedeadpunk	I have no idea about failure, I assume it will result in broken filesystems? As it should not block PGs, as PGs have no idea about locally cached actions I guess...	17:53
noonedeadpunk	And well, compute failures is another interesting case - I guess you don't want to evacuate VMs with that anymore unless you really don't have any chance of bringing in back compute	17:54
noonedeadpunk	but read cache is really appealing	17:55
jrosser	yeah, it’s only caching the parent you snapshot from though, not anything you write subsequently	17:56
jrosser	but for CI it’s probably really worth it, or any use case with huge read only datasets	17:56
noonedeadpunk	yeah, totally worth having sample/docs on how to do that as it's very interesting and might be worth a risk - it's anyway not worse than just local drives but with comparable performance	17:57
noonedeadpunk	btw interestingly if live-migration works with write cache on vms with intense iops...	17:58
jrosser	as far as compute failure goes the cache is integrated with the lock on the volume	17:58
noonedeadpunk	yeah, so you're having exclusive lock, right	17:58
jrosser	oh yes live migration :) didn’t try that as it’s only set up on one node right now….	18:00
noonedeadpunk	I'd assume race conditions are possible there...	18:01
noonedeadpunk	But I didn't test any of these for the matter of fact, so everything I'm saying is just speculation	18:04
noonedeadpunk	or fud :D	18:04
admin1	the unable to restart when doing glance_use_uwsgi: False .. is that one time .. or every time ?	18:05
admin1	because if its one time, i can manually stop/start the lxc container for example	18:06
damiandabrowski	i can recall a situation when switching between glance uwsgi<>standalone was leaving the old process running and I had to kill it manually	18:15
damiandabrowski	maybe the same happened to you	18:15
jrosser	I guess it does “do the not uwsgi thing” rather than “undo the uwsgi thing”	18:16
admin1	changing to tcp seems to have worked . testing with a bigger image/snapshot	19:09
admin1	thanks guys	19:09
opendevreview	Merged openstack/openstack-ansible master: Disable CentOS LXC jobs due to the bug in systemd packaging https://review.opendev.org/c/openstack/openstack-ansible/+/879069	19:22
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible-ceph_client stable/zed: Add EPEL GPG key for RHEL 9 https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/879186	20:02
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible-ceph_client stable/zed: Add thrift to includepkgs from EPEL https://review.opendev.org/c/openstack/openstack-ansible-ceph_client/+/879187	20:03
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts stable/zed: Add openstack_hosts_file tag https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/879188	20:03
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts stable/yoga: Add openstack_hosts_file tag https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/879189	20:04
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible-openstack_hosts stable/xena: Add openstack_hosts_file tag https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/879390	20:04
opendevreview	Jonathan Rosser proposed openstack/openstack-ansible master: Tune settings in galera server for reduced ram in all-in-one build https://review.opendev.org/c/openstack/openstack-ansible/+/877278	20:30
opendevreview	Damian Dąbrowski proposed openstack/openstack-ansible master: Add support for TLS backends https://review.opendev.org/c/openstack/openstack-ansible/+/879085	20:30
opendevreview	Damian Dąbrowski proposed openstack/openstack-ansible master: Add support for TLS backends https://review.opendev.org/c/openstack/openstack-ansible/+/879085	20:33
opendevreview	Damian Dąbrowski proposed openstack/openstack-ansible-os_keystone master: Define CA cert when needed https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/879378	20:35
opendevreview	Damian Dąbrowski proposed openstack/openstack-ansible-os_keystone master: Rename keystone_ssl to keystone_backend_https https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/879379	20:35
opendevreview	Damian Dąbrowski proposed openstack/openstack-ansible-os_placement master: Add TLS support to placement backends https://review.opendev.org/c/openstack/openstack-ansible-os_placement/+/879380	20:37
opendevreview	Damian Dąbrowski proposed openstack/openstack-ansible-os_nova master: Add TLS support to nova API backends https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/874810	20:49
opendevreview	Merged openstack/openstack-ansible-plugins master: Revert "Ensure systemd-udev is installed for gluster" https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/878842	21:43
opendevreview	Damian Dąbrowski proposed openstack/openstack-ansible-os_nova master: Add TLS support to nova API backends https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/874810	23:44
