moha7 | A: https://i.ibb.co/4NsHT13/a.png | 07:12 |
moha7 | B: https://i.ibb.co/KWgYtmy/b.png | 07:12 |
moha7 | Is B equivalent to A? | 07:12 |
damiandabrowski | hi folks | 09:11 |
damiandabrowski | moha7: imo yes | 09:12 |
damiandabrowski | jrosser: thanks for your reviews, most of them are definitely valid. I'll work on this today | 09:13 |
jrosser | damiandabrowski: i think you can make it massively more simple | 09:20 |
jrosser | also i just found another problem in the haproxy changes, added another comment | 09:20 |
jrosser | damiandabrowski: the reason that i think we should not really adjust the way the vars are handled is that if we leave them alone then the service playbooks could be changed very simply to be like this (example for glance) https://paste.opendev.org/show/bh5S74g7x8lhoMDy0rWj/ | 09:30 |
jrosser | ^ pseudo-code, not tested | 09:30 |
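(The paste itself is not reproduced here, but based on the surrounding discussion — including the `haproxy_service_config.yml` tasks file mentioned later in the log — the approach might be sketched roughly like this. The play structure and task wording are assumptions, not the actual paste contents:)

```yaml
# Hypothetical sketch only (not the contents of the paste): a service
# playbook pulling per-service haproxy config in from the haproxy_server role.
- name: Configure haproxy frontends/backends for glance
  hosts: glance_all
  tasks:
    - name: Include haproxy service config tasks from the haproxy_server role
      ansible.builtin.include_role:
        name: haproxy_server
        tasks_from: haproxy_service_config.yml
```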
damiandabrowski | but in this case, we won't be able to fix the variable-scope issue (which is the main reason why I started working on this). | 09:52 |
damiandabrowski | for me it really sucks if users need to be aware of which service-specific variables (like glance_ssl) need to be defined globally rather than in group_vars. For 2 reasons: clarity and performance | 09:52 |
damiandabrowski | this problem was raised here before: https://review.opendev.org/c/openstack/openstack-ansible/+/821090/comments/4e4d8147_e0b4d087 | 09:53 |
jrosser | i think we just swap one for another really | 09:54 |
jrosser | it is equally confusing that the haproxy "role" now runs targeting all the different service groups, not the haproxy_all group | 09:54 |
jrosser | that took me a loooong time to finally realise when looking at the new patches | 09:55 |
jrosser | of course except when it's run from the haproxy playbook itself..... | 09:55 |
damiandabrowski | hmm, idk... for me, having e.g. the os_keystone playbook trigger haproxy_service_config.yml tasks from the haproxy_server role to configure its haproxy service isn't confusing at all | 10:01 |
damiandabrowski | (especially when we don't have any other way to fix issue with variables scope) | 10:02 |
moha7 | openstack-ansible /opt/openstack-ansible/playbooks/containers-lxc-destroy.yml | 10:41 |
moha7 | How can I remove only a specific container? | 10:41 |
moha7 | `containers-lxc-destroy.yml --limit infra01_repo_container-55a9a634` | 10:48 |
jrosser | moha7: you can use --limit with a specific container, or an ansible group name for all containers for a particular service | 10:56 |
jrosser | moha7: it's also useful to double-check what you are about to do by adding `--list-hosts`, which will just print the things that the playbook will target and exit | 10:57 |
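(Putting the two suggestions together — a command-line sketch; the container name is the one from the log and would differ per deployment:)

```console
# Dry-run first: print the hosts the play would target, without destroying anything
openstack-ansible /opt/openstack-ansible/playbooks/containers-lxc-destroy.yml \
    --limit infra01_repo_container-55a9a634 --list-hosts

# Then run for real, without --list-hosts, once the target list looks right
```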
moha7 | jrosser: Thanks | 11:46 |
moha7 | 1. Does OSA support Prometheus as an integrated monitoring service? I couldn't find anything in the docs. | 12:19 |
moha7 | 2. Does OSA deploy the Skyline dashboard? (https://docs.openstack.org/skyline-console/zed/) | 12:20 |
moha7 | For question 1, is it here: https://opendev.org/openstack/openstack-ansible-ops ? | 12:22 |
jrosser | damiandabrowski: i was just digging into nova / nvidia / mdev stuff through the mailing list and i saw you linked to this in a nova patch https://paste.openstack.org/show/bU4qYaIySl5y7qEsEWjR/ | 12:22 |
jrosser | damiandabrowski: we have a similar script (from Tobias on the ML) to recreate the mdevs, but there seems to be a race condition with nova-compute starting | 12:23 |
jrosser | damiandabrowski: did you manage to make `After=` in those units behave how you need? Like there is some time after the nvidia services have started before they are properly initialised, and you need some way to make the mdev creation script wait for that (sleep(10) /o\) | 12:25 |
jrosser | damiandabrowski: also nova-compute has to wait for the mdev to be actually created rather than the service just started | 12:26 |
jrosser | we get a lot of race conditions here using `After=`, and the systemd docs are not clear; it seems to behave like "After the oneshot service has successfully started" rather than "After the oneshot service has successfully finished" | 12:28 |
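(One possible angle on the ordering problem above: for `Type=oneshot` units, systemd considers the unit "started" only once `ExecStart` has exited, so dependents ordered `After=` it should wait for the script to finish — the "started vs finished" gap usually arises when the creation script runs under `Type=simple` or `forking`. A hedged sketch; the unit names and script path are made up, not from the deployment discussed:)

```ini
# Hypothetical /etc/systemd/system/create-mdevs.service — a sketch, not tested.
[Unit]
Description=Recreate nvidia mdev devices
After=nvidia-vgpu-mgr.service
Before=nova-compute.service

[Service]
# With Type=oneshot, the unit only counts as started once ExecStart has
# exited, so units ordered After= this one wait for the script to finish.
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/bin/create-mdevs.sh

[Install]
WantedBy=multi-user.target
```

This does not help with services that report "started" before they are internally initialised, which may still need a wait/retry loop in the script itself.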
jrosser | moha7: OSA does not deploy any monitoring natively, but there are settings to enable prometheus exporters on quite a few of the components such as mariadb, zookeeper and haproxy | 12:29 |
jrosser | moha7: there is an OSA role for Skyline here https://opendev.org/openstack/openstack-ansible-os_skyline | 12:30 |
moha7 | In user_variables.yml, what happens to `haproxy_keepalived_external_vip_cidr: "{{external_lb_vip_address}}/32"` if I set `external_lb_vip_address` to a domain name in the openstack_user_config.yml file? Would it be "sub.example.com/32"? | 12:32 |
jrosser | moha7: and an incomplete patch to add skyline support to the main repo https://review.opendev.org/c/openstack/openstack-ansible/+/859446 - please do contribute some developer time if you are interested in this | 12:32 |
moha7 | +1 | 12:33 |
jrosser | you would make something like `external_lb_vip_address: openstack.example.com` | 12:33 |
moha7 | I did the same as you, but `haproxy_keepalived_external_vip_cidr` reads from `external_lb_vip_address`, doesn't it? | 12:38 |
jrosser | i set `haproxy_keepalived_external_vip_cidr: "a.b.c.d/32"` in my user_variables | 12:40 |
jrosser | see https://opendev.org/openstack/openstack-ansible/src/branch/master/etc/openstack_deploy/user_variables.yml#L176-L179 | 12:41 |
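(In other words: when `external_lb_vip_address` is a domain name, the keepalived CIDR variables should be overridden with literal addresses rather than derived from it. A sketch with placeholder addresses — adjust to the real VIPs:)

```yaml
# user_variables.yml — placeholder addresses, not from the log
external_lb_vip_address: openstack.example.com
haproxy_keepalived_external_vip_cidr: "203.0.113.10/32"
haproxy_keepalived_internal_vip_cidr: "172.29.236.9/32"
```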
moha7 | What other services do you usually use alongside your OpenStack deployment? I was thinking of monitoring with Prometheus and log management with ELK/EFK. What else? | 15:39 |
jrosser | moha7: everyone has a different answer to this :) but here we use prometheus exporters for every component in the OSA deploy that supports it, plus ELK for log/journal collection, and linux/network snmp collection with Observium | 16:06 |
jrosser | on top of that there are some other things you can do with prometheus blackbox_exporter, libvirt_exporter and so on if you need them | 16:07 |
damiandabrowski | jrosser: looks like certbot-auto is deprecated and no longer available under https://dl.eff.org/certbot-auto | 16:16 |
damiandabrowski | https://eff-certbot.readthedocs.io/en/stable/install.html#certbot-auto-deprecated | 16:16 |
damiandabrowski | should we remove it from haproxy_server role and leave 'distro' as a single valid option for haproxy_ssl_letsencrypt_install_method? | 16:17 |
jrosser | yes i think we should | 16:17 |
jrosser | as far as i remember certbot-auto was what was in the haproxy role before lots of work got done to make it H/A | 16:18 |
jrosser | i added support for `haproxy_ssl_letsencrypt_install_method == 'distro'` to take the package from ubuntu repo | 16:18 |
jrosser | but i'm not sure what happens on RH, if that's possible at all | 16:18 |
jrosser | ah looks like it is in EPEL | 16:20 |
jrosser | what we need is someone to maintain the RH-alikes support | 16:20 |
damiandabrowski | don't you use centos-based openstack? :D | 16:23 |
damiandabrowski | lol either I don't understand it or I just found something really weird | 16:34 |
moha7 | jrosser: Good points, thanks. (I think Zabbix can do what Observium does, as I've previously worked with it.) | 16:34 |
damiandabrowski | according to LetsEncrypt docs, HTTP-01 challenge works only with port 80: https://letsencrypt.org/docs/challenge-types/#http-01-challenge | 16:34 |
damiandabrowski | "The HTTP-01 challenge can only be done on port 80. Allowing clients to specify arbitrary ports would make the challenge less secure, and so it is not allowed by the ACME standard." | 16:34 |
damiandabrowski | but in our docs we suggest spawning the haproxy certbot service listening on port 443, and it works fine: https://docs.openstack.org/openstack-ansible/latest/user/security/ssl-certificates.html#certbot-certificates | 16:36 |
damiandabrowski | so I think the HTTP-01 challenge does not really work as they describe it | 16:36 |
jrosser | we have a redirect from 80 -> 443 | 16:36 |
jrosser | and there is an ACL also on the port 80 part for the LE backend | 16:36 |
jrosser | `path_beg` or something i think? | 16:37 |
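(So the HTTP-01 request still arrives on port 80: an ACL on the port-80 frontend sends only ACME challenge paths to the certbot backend and redirects everything else to HTTPS. A hedged haproxy sketch — names and the backend port are illustrative, not OSA's generated config:)

```
frontend http-in
    bind :80
    # Match the ACME HTTP-01 challenge path
    acl letsencrypt path_beg /.well-known/acme-challenge/
    use_backend certbot_backend if letsencrypt
    # Everything else is redirected to HTTPS
    redirect scheme https code 301 if !letsencrypt

backend certbot_backend
    server certbot 127.0.0.1:8888
```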
damiandabrowski | ahhh | 16:37 |
damiandabrowski | haproxy_redirect_http_port: 80 | 16:37 |
damiandabrowski | i missed it, sorry | 16:37 |
jrosser | NeilHanlon: i'm sure we've seen this before https://zuul.opendev.org/t/openstack/build/54443b4af4a24a2e875fd55cd6d06d4a | 19:45 |
jrosser | do you remember what that is about? | 19:45 |
jrosser | i've rechecked it to see if it's just a mirrors / rocky image out-of-step thing | 19:46 |
*** noonedeadpunk_ is now known as noonedeadpunk | 21:51 |
moha7 | https://docs.openstack.org/openstack-ansible-rsyslog_client/latest/ops-logging.html --> this page last updated: 2016 | 22:10 |
moha7 | Setting `log_hosts` in the openstack_user_config file leads to error: https://paste.opendev.org/show/818616/ | 22:13 |
jrosser | moha7: that is long deprecated i think, which is why it's been so long since an update | 22:16 |
jrosser | here is the release note https://github.com/openstack/openstack-ansible/blob/master/releasenotes/notes/remove_rsyslog_roles-05893ed9f8534a39.yaml | 22:17 |
moha7 | Ah | 22:19 |
jrosser | moha7: i would recommend collecting somehow from the systemd journals | 22:22 |
jrosser | you'll find that they are also bind mounted from the LXC containers down to the host, so it should be possible to run some journal collection tool/agent just on the host and collect logs from pretty much everything | 22:23 |
jrosser | ELK + journalbeat, or the journal input to filebeat etc would be one way to do that | 22:24 |
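(For the filebeat route, its journald input can read the journals directly on the host, including the bind-mounted container journals. A hedged sketch — the journal path is an assumption about a typical host layout, not confirmed from the log:)

```yaml
# filebeat.yml — sketch only; verify where container journals land on your hosts
filebeat.inputs:
  - type: journald
    id: host-and-container-journals
    paths:
      - /var/log/journal
```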
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible master: Define some temporary vars for haproxy https://review.opendev.org/c/openstack/openstack-ansible/+/872328 | 22:34 |
MrR_ | Hi people, managed to get some time to work on my openstack setup today. playbooks completed successfully but i'm having trouble with neutron: in the container i kept getting "ERROR neutron.plugins.ml2.managers [-] No type driver for tenant network_type: vxlan. Service terminated", so i put neutron_ml2_drivers_type: "flat,vlan,vxlan" in my user_variables, and this now throws a massive error that i have pasted here: | 22:39 |
MrR_ | https://paste.openstack.org/show/818617/ and i have no idea what the actual issue is there. I also get errors in horizon when trying to view the network which say Unable to check if network availability zone extension is supported/Unable to check if DHCP agent scheduler extension is supported/Network list can not be retrieved - all 503 errors. | 22:39 |
MrR_ | Some background: i have a typical bonded setup with the relevant vlans as per the docs and generally followed the docs to set it up. i then have a (currently untagged) bridge on bond0 called br-outside that connects to a switch/router that talks to, you guessed it, the outside world. i assume this should just work, but i've also tried some variables i may have needed to set, to no avail. can anyone help me figure | 22:39 |
MrR_ | it out? It would be very much appreciated | 22:39 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible master: Prepare service roles for separated haproxy config https://review.opendev.org/c/openstack/openstack-ansible/+/871189 | 22:47 |
jrosser | MrR_: i think that your errors in horizon are likely down to the neutron API service not running? perhaps due to the error in your paste preventing it from starting properly | 23:07 |
MrR_ | yeah I figured that, just mentioned it as well. Not sure how to trace the error as i've not messed with any core files, only edited user config/variables | 23:09 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible-haproxy_server master: Prepare haproxy role for separated haproxy config https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/871188 | 23:10 |
MrR_ | have absolutely no idea what ValueError: :6642: bad peer name format means at all, to be honest | 23:11 |
jrosser | MrR_: unfortunately OVS is not something i am too familiar with, but if you are lucky jamesdenton might be around | 23:11 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible master: Prepare service roles for separated haproxy config https://review.opendev.org/c/openstack/openstack-ansible/+/871189 | 23:12 |
MrR_ | i haven't configured ovs, so this is just linuxbridge? I'm not sure why both ovs and ovn have been loaded into the container. i did try setting up the relevant ovs config but the above error persists, so i haven't used it | 23:14 |
jrosser | oh! | 23:16 |
MrR_ | this is a fresh setup so its not a remaining config either | 23:17 |
jrosser | which branch/release are you using? | 23:17 |
jrosser | the reason i assumed it was OVN is that's all over the stack trace in your paste | 23:18 |
MrR_ | one thing is i have become proficient in quickly setting up systems to test openstack haha, i'm using zed (26.0.0) | 23:18 |
jrosser | ok so in Zed OVN is the default, and you need to switch it specifically to linuxbridge if that's what you want | 23:19 |
jrosser | but linuxbridge is really not supported any more by the neutron developers, they have marked it as "experimental" | 23:20 |
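(If linuxbridge is still wanted despite its experimental status, the os_neutron role selects the mechanism driver via `neutron_plugin_type`. A hedged user_variables sketch — check the os_neutron role docs for the exact values supported on your branch:)

```yaml
# user_variables.yml — select the ML2 driver explicitly (Zed defaults to OVN)
neutron_plugin_type: ml2.lxb    # or ml2.ovn / ml2.ovs
```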
MrR_ | i'm not fussed in all honesty, just want it working | 23:20 |
MrR_ | thought linuxbridge was default thats all | 23:20 |
jrosser | yes, it used to be | 23:21 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible master: Prepare service roles for separated haproxy config https://review.opendev.org/c/openstack/openstack-ansible/+/871189 | 23:21 |
jrosser | though that has changed for Zed | 23:21 |
MrR_ | So, knowing that, is it automagically set up or do I now need to follow this: https://docs.openstack.org/openstack-ansible-os_neutron/latest/app-ovn.html | 23:23 |
jrosser | for a multinode deployment that would be what you'd do | 23:26 |
jrosser | if you were to build an all-in-one (https://docs.openstack.org/openstack-ansible/zed/user/aio/quickstart.html) then the default config will automagically set that up for OVN | 23:27 |
jrosser | i would always recommend setting up an all-in-one to use as a reference as this is the thing we test over and over in continuous integration tests | 23:27 |
MrR_ | It's a multi node setup, I'll get that done and see how I go, now that i know i was missing a step hopefully i'll be ok | 23:30 |
jrosser | oh also there is now a later release than 26.0.0 | 23:31 |
jrosser | that was the point we cut the Zed branch and there have been some pretty big fixes since then | 23:31 |
opendevreview | Damian Dąbrowski proposed openstack/openstack-ansible-specs master: Blueprint for separated haproxy service config https://review.opendev.org/c/openstack/openstack-ansible-specs/+/871187 | 23:31 |
jrosser | 26.0.1 is tagged as a release | 23:32 |
MrR_ | so just a git checkout 26.0.1 and follow the minor upgrade path | 23:35 |
MrR_ | i'll do that now, run through the ovn setup and pop back on here tomorrow at some point, i have some time over the next few days to work on this | 23:37 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!