jrosser | I think i've misunderstood the purpose of this patch https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/831640 | 10:47 |
---|---|---|
jrosser | i don't have time to look at it more though | 10:48 |
*** arxcruz is now known as arxcruz|off | 11:04 | |
noonedeadpunk | well... I guess TripleO just doesn't create that role for their deployments | 11:06 |
noonedeadpunk | we do https://opendev.org/openstack/openstack-ansible-os_heat/src/branch/master/tasks/heat_service_setup.yml#L62 | 11:07 |
noonedeadpunk | (sorry, must be L48) | 11:07 |
*** dviroel|out is now known as dviroel | 11:23 | |
jrosser | right, but they assign the heat_stack_owner to the tempest users, i guess i just don't quite follow if this is a special need for tempest users or if it's covering a deficiency elsewhere in tripleo | 11:46 |
noonedeadpunk | well, afaik `Add tempest users to heat_stack_owner role` has no point by itself, since dynamic credentials are used anyway, and not that user currently. But what they do is cover this task failure, since there's no such role to assign | 11:51 |
noonedeadpunk | we need to proceed with service role across projects I believe https://opendev.org/openstack/governance/src/branch/master/goals/selected/consistent-and-secure-rbac.rst#yoga-timeline-7th-mar-2022 | 12:28 |
noonedeadpunk | frustrating part is https://review.opendev.org/c/openstack/keystone-specs/+/818616 | 12:29 |
noonedeadpunk | if I'm reading governance right, it's a requirement for Y and the deadline is today? | 12:33 |
jrosser | oh goodness https://review.opendev.org/c/openstack/keystone-specs/+/818616 really is a mess in the comments | 12:53 |
jrosser | I wonder if osa and kolla need to work together to have more eyes / understanding on this from a deployer pov | 12:53 |
noonedeadpunk | kolla already landed this https://opendev.org/openstack/kolla-ansible/commit/2e933dceb591c3505f35c2c1de924f3978fb81a7 | 12:59 |
noonedeadpunk | but it's old one | 13:01 |
noonedeadpunk | the ansible openstack collection has 0 references to system scopes, just in case | 13:06 |
noonedeadpunk | Damn, each time I try to dig into this I get enough frustration to postpone it to better times... | 13:08 |
noonedeadpunk | charms has some sum-up https://etherpad.opendev.org/p/charms-secure-rbac | 13:10 |
noonedeadpunk | tbh with all these changes I'd say it must be apiv4 for keystone.... | 13:13 |
NeilHanlon | noticed something late last night - is it intentional that there's no default centralized logging set up in deployments w/o additional user config? e.g. one either has to set `journald_remote_enabled=1` or otherwise configure the vars for rsyslog_(client|server). Seems the background for not defaulting journald_remote is related to | 14:25 |
NeilHanlon | https://github.com/systemd/systemd/issues/2376 though I believe that change should have landed in most systemd builds by now | 14:25 |
noonedeadpunk | IIRC journald-remote is still a pita for the destination host and really poorly maintained. We mostly got rid of rsyslog, except for ceph, which I think still doesn't support journald | 14:27 |
noonedeadpunk | NeilHanlon: and we all do different ways for central logging. Some do graylog, some elk, some journald-remote. | 14:28 |
noonedeadpunk | So we don't do anything by default and let ppl choose | 14:28 |
noonedeadpunk | for graylog we have https://opendev.org/openstack/openstack-ansible-ops/src/branch/master/graylog/graylog-forward-logs.yml | 14:28 |
noonedeadpunk | for elk we have https://opendev.org/openstack/openstack-ansible-ops/src/branch/master/elk_metrics_7x | 14:28 |
NeilHanlon | that makes sense and tracks re: maintenance. Appreciate the info re: graylog and ELK! Very helpful | 14:47 |
NeilHanlon | do you think it would make sense to update the wording on https://docs.openstack.org/openstack-ansible-rsyslog_server/latest/ops-logging.html ? | 14:48 |
*** dviroel is now known as dviroel|lunch | 14:56 | |
admin11 | out of the blue, connecting to mysql via haproxy gives Lost connection to MySQL server at 'handshake: reading initial communication packet', system error: 11 | 15:58 |
admin11 | while the mysql container is good .. doing mysql -h IP of container .. works just fine | 15:58 |
admin11 | anyone seen this before and/or can provide me some pointers | 15:58 |
admin11 | i already ran setup-infra and haproxy setup a few times .. no issues | 15:58 |
noonedeadpunk | NeilHanlon: yeah, that makes total sense.... | 16:11 |
NeilHanlon | i'll take a poke at an update + docs addition for the main project re: logging | 16:11 |
*** dviroel|lunch is now known as dviroel | 16:13 | |
admin1 | editing keystone.conf to put the galera IP in directly worked .. but i am not sure what happened and why galera is not working via haproxy | 16:35 |
admin1 | is there a quick setting to override the galera IP for the playbooks .. so that i can give a direct IP address | 16:36 |
admin1 | would setting a galera_address in user_variables allow me to override the galera server setting ? | 16:37 |
admin1 | nvm .. i think so .. | 16:40 |
noonedeadpunk | yes it will, but you won't have ha then for galera | 16:49 |
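A minimal sketch of what that override might look like in `/etc/openstack_deploy/user_variables.yml`, assuming the OSA `galera_address` variable (the IP below is a placeholder); as noted above, this bypasses haproxy, so you lose HA for the database:

```yaml
# /etc/openstack_deploy/user_variables.yml
# Point services directly at one galera node instead of the haproxy VIP.
# Warning: this removes load balancing/failover - if this node goes down,
# database access for all services goes down with it.
galera_address: 172.29.236.10   # hypothetical galera container IP
```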
noonedeadpunk | it actually would be helpful to say what status haproxy reports | 16:49 |
noonedeadpunk | as it's hard to say why haproxy is unhappy when its reported backend state is unknown | 16:49 |
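One way to get the backend state haproxy reports is the admin stats socket; a sketch assuming the socket lives at `/var/run/haproxy.stat` (the path and the `galera` backend name vary per deployment):

```shell
# Dump haproxy's CSV stats and pick out the galera backend rows.
# Fields: 1=pxname (proxy), 2=svname (server), 18=status (UP/DOWN/MAINT).
echo "show stat" | socat stdio /var/run/haproxy.stat \
  | awk -F, '/galera/ {print $1, $2, $18}'
```

If the server shows DOWN, the health check (not the MySQL service itself) is usually what needs investigating, which would match "direct connection works, via haproxy doesn't".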
admin1 | i changed the galera db ... every other service except nova is fine .. nova constantly tries to connect to the old db ip ( captured via tcpdump) | 18:00 |
admin1 | and the playbooks also fail on the synchronize db task | 18:01 |
admin1 | is it possible that the connection could be cached internally in some python files ? | 18:01 |
admin1 | i restarted the container .. but no fix | 18:01 |
admin1 | i am going to destroy and rebuild | 18:05 |
noonedeadpunk | admin1: for nova, db access is also stored in the cell1 mapping | 18:56 |
noonedeadpunk | inside the db | 18:56 |
noonedeadpunk | on later releases we switched to a templated thing, so it's taken from the config directly, but on prior releases the db access is stored in the db | 18:57 |
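The cell mapping nova keeps inside its API database can be inspected and updated with `nova-manage`; a sketch, run from the nova container (the cell UUID, password, and IP below are placeholders):

```shell
# Show cell mappings, including the stored database_connection URL -
# this is where the old galera IP would still be lurking.
nova-manage cell_v2 list_cells --verbose

# Point cell1 at the new database host (placeholders, not real values).
nova-manage cell_v2 update_cell --cell_uuid <cell1-uuid> \
  --database_connection "mysql+pymysql://nova:SECRET@NEW_DB_IP/nova_cell1"
```

This would explain why restarting the container didn't help: the stale IP lives in the database record, not in any cached Python file.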
noonedeadpunk | but imo the solution of direct access is not really a good one, as it loses the point of having a cluster, and a single controller outage would result in a region outage | 18:58 |
admin1 | have another strange issue ... after the db change, all apis work ( via the command line ) .. but horizon fails with a 504 gateway timeout .. on the /project/ url .. | 18:59 |
admin1 | and i can't seem to find exactly where it's stuck or where that 504 is coming from | 18:59 |
opendevreview | Merged openstack/openstack-ansible stable/xena: Bump ansible.netcommon version https://review.opendev.org/c/openstack/openstack-ansible/+/831535 | 19:23 |
admin1 | when I login to horizon, i get 504 Gateway Time-out .. is there any way for me to check which api horizon is stuck on ? | 20:32 |
NeilHanlon | horizon runs under wsgi, so check the apache logs | 20:54 |
admin1 | NeilHanlon.. there is nothing there .. i mean no errors | 20:59 |
admin1 | is it possible to log all api calls | 20:59 |
admin1 | to figure out where exactly it gets stuck at | 20:59 |
NeilHanlon | the 504 should be generating a stack trace somewhere. Best idea is to aggregate logs from everywhere into one place to look at things from a whole-cluster approach | 21:00 |
NeilHanlon | e.g. https://paste.opendev.org/show/b6eC9dOwwIeajXtKuByD/ | 21:01 |
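A hedged sketch of where that trail usually starts in an OSA deployment (the paths assume a typical horizon container layout and an haproxy node, and may differ per release and distro):

```shell
# From the horizon container: apache/wsgi error output, where a hung
# upstream API call or a Python traceback would normally appear.
tail -f /var/log/apache2/*.log      # /var/log/httpd/ on EL-based systems

# From the haproxy node: confirm which frontend/backend produced the 504
# and how long the request waited before timing out.
grep ' 504 ' /var/log/haproxy.log
```

Given the earlier db-IP change, a plausible culprit is horizon waiting on one service endpoint that still points at the old database, so correlating the 504 timestamp with per-service API logs is a reasonable next step.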
*** dviroel is now known as dviroel|out | 22:25 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!