NeilHanlon | prometheanfire: do you have a link to logs for that? | 00:44 |
---|---|---|
prometheanfire | sure, https://pastebin.com/raw/gF1sSJdA | 00:48 |
prometheanfire | don't have access to github gists from that laptop | 00:48 |
prometheanfire | the link to ovs commit's message sums it up fairly well though | 00:49 |
NeilHanlon | prometheanfire: just did a bit of digging.. from what I can tell, the packages from centos for openvswitch have the patch you mention for openvswitch 2.17, but *not* for openvswitch 2.15 | 01:26 |
prometheanfire | hmm, looked at it more, looks like it's the ovs version within the neutron venv | 01:39 |
prometheanfire | not the package | 01:39 |
NeilHanlon | i poked around at the os_neutron role and while I see it installing the centos-nfv release and installing `openvswitch`, i'm not actually sure where that dependency is being fulfilled... the repos only provide packages with the version in the name (e.g., openvswitch2.15 and openvswitch2.17). I wasn't able to see a version supplied anywhere. Perhaps | 05:25 |
NeilHanlon | jrosser may be able to shed some light | 05:25 |
noonedeadpunk | prometheanfire: is that for zed or ... ? As we do have OVN as default in gates, but we were missing https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/869042 quite recently | 08:28 |
admin1 | good morning .. i am trying to setup a new cluster .. tag 25.2.0 and stuck on openstack.osa.db_setup : Create database for service -> the output has been hidden due to the fact that 'no_log: true' was specified for this result ... how do I map this specific step to the exact file where I need to set no_log to false | 10:30 |
admin1 | or if there is an easy way to do no_log: false in command line for all | 10:30 |
admin1 | i tried -e no_log=false .. seems to not make a change | 10:30 |
noonedeadpunk | admin1: it's in /etc/ansible/ansible_collections/osa/roles/db_setup/tasks/main.yml | 10:34 |
noonedeadpunk | eventually we should make that easily configurable as you've suggested | 10:35 |
admin1 | did not work .. i made the changes in /etc/ansible/ansible_collections/openstack/osa/roles/db_setup/tasks/main.yml - as that is where it was on my ubuntu 22 | 10:38 |
noonedeadpunk | ah, yes, sorry, missed openstack folder (was typing from memory) | 10:39 |
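A rough sketch of the debugging hack discussed here, assuming the corrected path above and that the task uses a literal `no_log: true` (adjust the sed pattern to whatever the file really contains); remember to revert it afterwards, since the task handles DB credentials:

```shell
TASKS=/etc/ansible/ansible_collections/openstack/osa/roles/db_setup/tasks/main.yml
cp "$TASKS" "$TASKS.bak"                        # keep a backup to restore afterwards
sed -i 's/no_log: [Tt]rue/no_log: false/' "$TASKS"
# re-run the failing play, e.g. the keystone one from this conversation:
# openstack-ansible /opt/openstack-ansible/playbooks/os-keystone-install.yml
# and restore once the real error is visible:
# mv "$TASKS.bak" "$TASKS"
```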
noonedeadpunk | could you have overridden ANSIBLE_COLLECTIONS_PATHS or ANSIBLE_COLLECTIONS_PATH? | 10:42 |
admin1 | no . i always insisted on osa be able to do stuff out of the box | 10:43 |
admin1 | i never override anything | 10:43 |
admin1 | just follow the manual :) | 10:43 |
admin1 | grep -ri openstack.osa.db_setup /etc/ansible/ returns these files /etc/ansible/roles/os_keystone/tasks/main.yml: | 10:44 |
noonedeadpunk | Um, then I have no idea why it didn't work... | 10:44 |
noonedeadpunk | Yes, but there we just include the role | 10:44 |
noonedeadpunk | and it should come from $ANSIBLE_COLLECTIONS_PATH/openstack.osa.db_setup | 10:45 |
noonedeadpunk | well, you can check with debug if that's the path as well | 10:46 |
admin1 | checking if there is a one liner | 10:47 |
noonedeadpunk | eventually what can go wrong at db setup is either mariadb cluster is not healthy or my.cnf for client on utility container is wrong | 10:54 |
admin1 | yeah .. debugging those | 10:54 |
noonedeadpunk | So you can check these 2 things as well - maybe will be faster to solve these | 10:55 |
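A hedged pair of checks matching the two suspects above — cluster health from the galera container, and the client config on the utility container:

```shell
# on the galera container: expect wsrep_cluster_status=Primary and state 'Synced'
mysql -e "SHOW GLOBAL STATUS WHERE Variable_name IN ('wsrep_cluster_status','wsrep_cluster_size','wsrep_local_state_comment');"
# on the utility container: host should point at the internal VIP
cat /root/.my.cnf
mysql -e "SELECT 1;"   # quick end-to-end test through haproxy
```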
admin1 | suspect haproxy issue | 10:55 |
admin1 | strange .. my util container cannot ping any other container on mgmt even on the same server :D | 11:08 |
noonedeadpunk | is there IP configured on lxcbr0 on host? | 11:22 |
noonedeadpunk | but yeah, shouldn't be the reason | 11:22 |
noonedeadpunk | I'd suspect lxc-dnsmasq and try restarting it | 11:22 |
noonedeadpunk | ah, on mgmt, not eth0 | 11:23 |
admin1 | on br-mgmt | 11:25 |
noonedeadpunk | and forwarding is enabled for all interfaces on sysctl? | 11:26 |
admin1 | it is .. i think it has something to do with mtu | 11:40 |
admin1 | in the end, it's the mtu :D | 11:40 |
*** dviroel|ourt is now known as dviroel | 11:45 | |
noonedeadpunk | ah, ok, then it's not any of my faults at least :D | 12:33 |
noonedeadpunk | We need to merge this rabbitmq bump to recheck and pass Yoga upgrades https://review.opendev.org/c/openstack/openstack-ansible/+/869078 | 12:38 |
admin1 | hi noonedeadpunk, do you know what would cause this error: ? https://paste.openstack.org/raw/bvthX5495Y5KLkq6ipKU/ | 12:42 |
noonedeadpunk | I assume you don't have /etc/resolv.conf in controller? | 12:43 |
admin1 | its a softlink and its broken for some reason :D | 12:44 |
noonedeadpunk | systemd-resolved is dead? | 12:44 |
admin1 | Active: inactive (dead) | 12:44 |
noonedeadpunk | yeah. try restarting it | 12:44 |
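A small sketch of the check/repair for the broken symlink, assuming Ubuntu with systemd-resolved as in this deployment:

```shell
ls -l /etc/resolv.conf                 # normally a symlink into /run/systemd/resolve/
systemctl restart systemd-resolved
systemctl is-active systemd-resolved   # expect "active"
getent hosts opendev.org               # confirm name resolution works again
```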
admin1 | hmm.. thanks | 12:45 |
admin1 | removing ssl and ssl-verify-server-cert from .my.cnf and changing internal VIP to galera container IP works .. changing the IP to the VIP does not work .. i get ERROR 2013 (HY000): Lost connection to server at 'handshake: reading initial communication packet', system error: 115 | 13:24 |
admin1 | from utility container | 13:25 |
admin1 | this cluster has only 1 controller for now | 13:25 |
admin1 | entering the mysql ip directly in .my.cnf gives ERROR 2026 (HY000): TLS/SSL error: Validation of SSL server certificate failed when ssl and ssl-verify-server-cert are enabled in .my.cnf | 13:27 |
admin1 | what i recalled before was running setup hosts and infra, and then logging to util and hitting mysql ENTER .. if all was good, i was inside mysql and then i used to run setup-openstack | 13:27 |
admin1 | br-mgmt runs on top of unrouted network on its own dedi vlans .. so i think I am ok to not use extra ssl and tls for mysql connection .. how/where do I disable those ? | 13:30 |
noonedeadpunk | admin1: so the SSL cert is issued for the hostname and internal VIP (galera_address) only | 13:34 |
noonedeadpunk | Accessing through IP of galera container is supposed to fail validation | 13:34 |
noonedeadpunk | basically that's what's in cert: https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/defaults/main.yml#L236 | 13:35 |
noonedeadpunk | If you want to omit ssl you can use galera_use_ssl or if you don't want to verify certs, you can also `galera_ssl_verify: false` | 13:36 |
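A minimal sketch of the two overrides mentioned above, placed in user_variables.yml as usual for OSA (pick one; the playbook name is just an example of what to re-run):

```shell
cat >> /etc/openstack_deploy/user_variables.yml <<'EOF'
# either drop SSL for galera entirely...
galera_use_ssl: false
# ...or keep SSL but skip client-side certificate verification:
# galera_ssl_verify: false
EOF
# openstack-ansible /opt/openstack-ansible/playbooks/galera-install.yml
```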
admin1 | my internal vip is an ip address | 14:57 |
admin1 | that is no longer supported ? | 14:57 |
admin1 | i don't see the point of having 1 extra dns query per connection from dns -> ip for internal vip ? | 14:58 |
noonedeadpunk | admin1: as you might see from condition - galera_address can be either IP or FQDN. And by default it's `galera_address: "{{ internal_lb_vip_address }}"` | 15:00 |
noonedeadpunk | tbh it would be interesting to see output of `openssl x509 -in /etc/ssl/certs/galera.pem -text -noout` | 15:01 |
* noonedeadpunk places internal IP to /etc/hosts to minimize time for resolving dns | 15:02 | |
admin1 | still getting ERROR 2013 (HY000): Lost connection to server at 'handshake: reading initial communication packet', system error: 115 when entering mysql from util | 15:03 |
admin1 | something did change | 15:04 |
admin1 | galera backend is listed as DOWN .. while its running and fine in galera container | 15:07 |
admin1 | noonedeadpunk, that openssl command is to be run from which node ? util ? | 15:09 |
admin1 | gives unable to load cert | 15:09 |
admin1 | from controller as well as util | 15:09 |
noonedeadpunk | from galera | 15:09 |
noonedeadpunk | oh, well | 15:10 |
admin1 | returns a cert .. Subject: CN = c1-galera-container-44ae38ac | 15:10 |
noonedeadpunk | ` galera backend is listed as DOWN .. while its running and fine in galera container ` -> it's a bit different | 15:10 |
noonedeadpunk | in the cert it's the SAN that's interesting | 15:10 |
noonedeadpunk | but anyway, for galera haproxy checks aliveness via another service that now runs under systemd | 15:10 |
noonedeadpunk | let me recall | 15:11 |
noonedeadpunk | Should be mariadbcheck or smth | 15:12 |
noonedeadpunk | this thing https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/tasks/galera_server_post_install.yml#L47-L63 | 15:12 |
noonedeadpunk | so basically it's a socket that checks the return of /usr/local/bin/clustercheck | 15:13 |
noonedeadpunk | and based on it haproxy determines if galera backend is alive | 15:13 |
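A hedged way to poke that check by hand: clustercheck answers with a plain HTTP response on port 9200 through the systemd socket, so it can be hit with curl from a haproxy node (the IP below is a placeholder; the socket unit name is only guessed above, so verify it):

```shell
# from a haproxy node -- a healthy galera node returns HTTP 200 "... is synced."
curl -v http://<galera_container_ip>:9200/
# on the galera container itself:
/usr/local/bin/clustercheck
systemctl list-sockets | grep 9200       # find the exact socket unit name
```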
admin1 | running that command manually works fine | 15:16 |
spatel | I believe this could be a password issue, i had a similar issue where i changed the mysql monitoring password and the LB stopped sending traffic to mysql | 15:16 |
admin1 | brand new deployment .. usual checkout the current branch, run setup hosts and setup infra and nothing else | 15:17 |
admin1 | i will try to delete all containers, re-initialize the passwords and retry | 15:17 |
admin1 | if i do use a dns name for the internal vip, would it be added to /etc/hosts locally or would I have to add it to internal dns | 15:18 |
spatel | why redeploy.. i would say try to fix.. may be we have a bug | 15:18 |
admin1 | password is fine though .. if i change the .my.cnf to point to ip directly and not haproxy, it just works | 15:18 |
admin1 | the issue/bug is haproxy is seeing mysql as down when it is not | 15:19 |
admin1 | i think if that is fixed, i am unblocked | 15:19 |
spatel | what hatop saying? | 15:19 |
admin1 | i use the gui . ( never used hatop ) | 15:20 |
admin1 | says backend is down | 15:21 |
admin1 | galera-back DOWN | 15:21 |
spatel | Good! what is in allow_list in haproxy ? | 15:21 |
spatel | make sure 9200 is listening on galera nodes - https://paste.opendev.org/show/bAtJ6xJE7KpMUlrAfNgK/ | 15:22 |
spatel | 9200 is check script | 15:22 |
admin1 | 9200 is there .. but does not respond to queries from outside | 15:24 |
admin1 | hmm.. | 15:24 |
spatel | cool! now you know what to do | 15:24 |
noonedeadpunk | it only answers connections from a specific set of addresses | 15:24 |
spatel | check xinit.d | 15:24 |
noonedeadpunk | As socket has IPAddressAllow | 15:24 |
noonedeadpunk | spatel: it's not xinetd anymore | 15:24 |
noonedeadpunk | it's systemd socket | 15:24 |
spatel | oops!! | 15:25 |
noonedeadpunk | And allowed list defined here https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/galera_all.yml#L33-L39 | 15:25 |
spatel | it's just a systemd daemon.. but using xinetd (sorry, i misspelled it earlier) | 15:26 |
noonedeadpunk | but it should be reachable from any galera or haproxy host | 15:26 |
spatel | https://paste.opendev.org/show/bYp9w2EnPVtReHRcDsRD/ | 15:26 |
noonedeadpunk | nope, xinetd not used at all in Yoga+ | 15:26 |
spatel | only_from = 0.0.0.0/0 | 15:26 |
admin1 | let me install tcpdump and see what ip it comes as | 15:26 |
spatel | I am checking in wallaby (sorry) | 15:26 |
admin1 | i have multiple ips in the controller in the same range | 15:26 |
admin1 | it only has 1 listed | 15:26 |
noonedeadpunk | (or even Xena+) | 15:27 |
spatel | how does it handle upgrade? are we wiping out config during upgrade process? | 15:27 |
noonedeadpunk | admin1: well, you can override galera_monitoring_allowed_source if needed | 15:27 |
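A hedged sketch of that override (the CIDR matches the br-mgmt range in this deployment; check the linked group_vars default for the expected format and whether 127.0.0.1 should also stay in the list):

```shell
cat >> /etc/openstack_deploy/user_variables.yml <<'EOF'
galera_monitoring_allowed_source: "172.29.236.0/22"
EOF
# re-run the galera play so the socket unit gets re-templated, e.g.:
# openstack-ansible /opt/openstack-ansible/playbooks/galera-install.yml
```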
admin1 | my controller ip is .11, but .9 is also there .. now tcpdump shows IP 172.29.236.9.59070 > 172.29.239.67.9200 | 15:27 |
admin1 | so it was coming from a diff ip in the controller | 15:28 |
noonedeadpunk | spatel: yup. https://opendev.org/openstack/openstack-ansible-galera_server/src/branch/master/tasks/galera_server_post_install.yml#L16-L36 | 15:28 |
spatel | bravo!! | 15:28 |
admin1 | why not give it the br-mgmt range as default ? | 15:29 |
admin1 | if i put 172.29.236.0/22 there, does 127.0.0.1 also need to exist ? | 15:30 |
admin1 | will see | 15:30 |
spatel | it uses br-mgmt to poke 9200 (maybe you put your haproxy vip in a different range) | 15:30 |
admin1 | server base IP = 172.29.236.11 , VIP = 172.29.236.9 .. | 15:31 |
noonedeadpunk | admin1: that was always like that, like 5+ years | 15:31 |
admin1 | allow was 172.29.236.11 , while tcpdump showed the 9200 connection was from 236.9 | 15:31 |
admin1 | i now did galera_monitoring_allowed_source: 172.29.236.0/22 | 15:31 |
noonedeadpunk | we didn't touch that allowlist when switching from xinetd to systemd | 15:31 |
admin1 | IPAddressAllow=172.29.236.0/22, but still doesn't work | 15:33 |
admin1 | had to do a daemon-reload and stop/start manually | 15:33 |
admin1 | finally galera is UP from haproxy | 15:34 |
admin1 | i think i can finally run setup-openstack now :) | 15:34 |
admin1 | thanks guys | 15:34 |
noonedeadpunk | huh, wonder why daemon-reload and restart weren't performed | 15:34 |
noonedeadpunk | that could be a proper bug | 15:35 |
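For reference, a sketch of the manual workaround admin1 describes, run inside the galera container; the socket unit name is assumed, so confirm it first:

```shell
systemctl list-sockets | grep 9200                      # identify the exact unit name
systemctl daemon-reload
systemctl restart mariadbcheck.socket                   # name assumed per the chat above
systemctl show -p IPAddressAllow mariadbcheck.socket    # should now show the new range
```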
admin1 | can we allow br-mgmt range by default there ? as i am sure most people will have at least base and vip in the controller | 15:36 |
admin1 | and in my case, controller was listed but requests went via the VIP | 15:37 |
noonedeadpunk | admin1: are you adding the VIP as /32 or /24? | 15:37 |
admin1 | no | 15:38 |
admin1 | adding it as /22 .. same range as br-mgmt | 15:38 |
noonedeadpunk | well... it's worth being /32 | 15:38 |
noonedeadpunk | then you won't have issues like that | 15:38 |
noonedeadpunk | or well, should not, as the VIP won't be used as the source address, but simply as an alias | 15:39 |
admin1 | right .. | 15:40 |
noonedeadpunk | well, we can set it to the whole mgmt network as well, but I'm not sure it's the only place that can be problematic in the end | 15:47 |
noonedeadpunk | feel free to propose patch if you feel it's worth fixing | 15:49 |
*** dviroel is now known as dviroel|lunch | 16:01 | |
prometheanfire | noonedeadpunk: from what I can see it's a backtrace but not a crash, when I manually apply the patch to the venv I get no backtrace | 16:37 |
prometheanfire | and ya, zed | 16:38 |
prometheanfire | noonedeadpunk: you are saying that patch also fixes the error? | 16:39 |
noonedeadpunk | well, no, I don't know if it fixes your issue, but I know it fixes another bug in the case when northd is in the same place as neutron-server | 16:45 |
noonedeadpunk | as then they are both conflicting over the lock dir | 16:45 |
noonedeadpunk | but I decided to mention it just in case | 16:46 |
prometheanfire | ah, I don't think so, they are in separate containers from what I remember | 16:46 |
noonedeadpunk | I see that openvswitch2.17-2.17.0-57 is being installed in CI at least | 16:49 |
prometheanfire | ya, that's the package version, but is that package version also installing the python ovs package in the venv (that's the code I patched that fixed it)? | 16:51 |
prometheanfire | I imagine that it's ovs from pypi that's used in the venv | 16:52 |
prometheanfire | heh, fix committed in october or later iirc, last release in may https://pypi.org/project/ovs/#history | 16:53 |
admin1 | how is gluster used in these new releases ? | 17:05 |
admin1 | repo uses gluster now ? | 17:06 |
*** dviroel|lunch is now known as dviroel | 17:10 | |
admin1 | one more error .. https://paste.openstack.org/show/bNukeywgadvG2M9PvGsm/ -- not sure of this one as well | 17:18 |
admin1 | what would be the correct format for external ceph file | 17:23 |
admin1 | i think its a bug .. https://paste.openstack.org/show/bYPWEAipoL0VZaNfExwg/ on how the file is processed | 17:30 |
admin1 | oh | 17:30 |
admin1 | my bad | 17:30 |
admin1 | its = and not : | 17:31 |
noonedeadpunk | prometheanfire: that we take from upper constraints :) | 17:40 |
noonedeadpunk | admin1: yes, so for the repo container we dropped lsyncd. Now any shared filesystem can be mounted, like nfs, cephfs, s3fs, etc. | 17:41 |
noonedeadpunk | So you can pass mountpoint as a variable that will be consumed by systemd_mount role. | 17:41 |
noonedeadpunk | By default gluster is installed, but you can disable that in case you already have cephfs or smth at this point | 17:42 |
noonedeadpunk | prometheanfire: so yeah, for the venv we need a release of ovs on pypi and then a bump of the version in u-c | 17:43 |
spatel | noonedeadpunk i have question about billing :) you guys using ceilometer + gnocchi correct? | 17:44 |
noonedeadpunk | nah, we're not. I used that before though | 17:44 |
noonedeadpunk | like... 3 years ago? | 17:44 |
spatel | currently what you guys using for flavor based billing? | 17:44 |
spatel | consume notification and pass it to your home brew billing tool? | 17:45 |
noonedeadpunk | an in-house system that just polls APIs | 17:45 |
spatel | I am looking for just flavor based billing and not sure what tools i should use | 17:45 |
spatel | ceilometer/gnocchi would be overkill for that | 17:46 |
spatel | i hate cloudkitty.. | 17:46 |
noonedeadpunk | lol | 17:46 |
spatel | you too? | 17:46 |
noonedeadpunk | nah, never used it. | 17:46 |
spatel | Thinking i can just consume mysql table and run some python math to do monthly or per hour billing based on flavor name | 17:47 |
noonedeadpunk | or well, tried to adopt it, but was short on time, and then another department wrote a plugin for our billing system that calculated units based on gnocchi | 17:47 |
noonedeadpunk | the main problem is that you need to take into account when it was created / deleted and also resizes | 17:48 |
noonedeadpunk | as you might have 100 gb volume that was resized to 200 gb 1 min before your script run | 17:48 |
noonedeadpunk | so should you bill for 200gb or 100 gb.... | 17:49 |
spatel | we never resize (so i don't care about it), all i care about is creation time, and i just calculate hours based on that | 17:49 |
spatel | We have private cloud so billing is just to see numbers and costing.. just to show it to management | 17:50 |
spatel | nobody going to pay me :) | 17:50 |
spatel | We compare all 3 cloud and then decide which place we should run our service. | 17:50 |
spatel | If i have billing in my private cloud then i can tell costing etc.. | 17:51 |
noonedeadpunk | yeah, I dunno any good ready-made solution... maybe smth can be done just through prometheus and grafana | 17:51 |
noonedeadpunk | as using openstack exporter you can get flavors | 17:53 |
spatel | you mean prometheus exporter? | 17:55 |
noonedeadpunk | yeah.... but not sure tbh.... | 17:55 |
noonedeadpunk | at least I do recall having a grafana dashboard from gnocchi that could sum up usage per project | 17:55 |
noonedeadpunk | so I assume smth can be done with prometheus as well | 17:56 |
spatel | hmm.. why not just extract that info from mysql :) | 17:57 |
spatel | that info must be somewhere in db | 17:57 |
spatel | let me try or start from somewhere.. | 17:57 |
spatel | what you guys do for BW billing? | 17:58 |
admin1 | for one cluster, i am testing https://fleio.com/pricing | 18:05 |
admin1 | considering the amount of hours you have to spend doing it on your own, the price looks OK in that regard | 18:05 |
noonedeadpunk | ah, yes, I do remember these folks. But they would be quite costly given it's only for internal needs | 18:28 |
noonedeadpunk | fwiw fleio folks were hanging out here some time ago... But they also relied on ceilometer from what I know (at least were contributing to ceilometer role) | 18:29 |
noonedeadpunk | they were not active lately though :) | 18:30 |
noonedeadpunk | *:( | 18:30 |
jrosser | spatel: we used this for usage accounting (not billing) https://opendev.org/vexxhost/atmosphere | 18:53 |
jrosser | it consumes events as http from ceilometer and creates db entries you can then query | 18:54 |
spatel | jrosser Thanks, let me check.. | 18:54 |
spatel | how does it work.. there is no doc in repo :) | 18:54 |
jrosser | but like no documentation so you have to read the code | 18:54 |
spatel | why not consume events directly from rabbitMQ instead of ceilometer? maybe there are some extra metrics there like BW/IOPS etc | 18:55 |
jrosser | i didnt design it | 18:55 |
noonedeadpunk | Hm, I thought atmosphere now is their whole automation stack.... | 18:56 |
jrosser | that is also very confusing | 18:56 |
jrosser | anyway we have a combination of atmosphere, openstack exporter and then prometheus | 18:56 |
spatel | Yes atmosphere is openstack deployment tool using k8s :) | 18:56 |
spatel | https://vexxhost.com/private-cloud/atmosphere-openstack-depoyment/ | 18:57 |
noonedeadpunk | it's k8s AND ansible from what I know | 18:58 |
spatel | vexxhost was using OSA before, correct? | 18:58 |
noonedeadpunk | yup | 18:58 |
spatel | i talked to mnaser in Berlin about it.. and he told check it out in github and give it a shot :) | 18:59 |
spatel | They create rabbitMQ for each service so no single rabbit cluster for all | 19:00 |
spatel | everything is autoheal using k8s | 19:00 |
spatel | But the devil is in the details :) worth giving it a try for learning though | 19:01 |
noonedeadpunk | I'm suuuuuuper sceptical about auto-heal for everything | 19:04 |
spatel | Yes, it looks fancy but problems start when shit hits the fan :D | 19:13 |
noonedeadpunk | or well. maybe autoheal (which systemd does anyway out of the box), but totally not auto-scale. | 19:13 |
noonedeadpunk | and to auto-heal rabbit - you need to detect that it's not healthy first, which is the trickiest part, at least for me.... | 19:13 |
spatel | Anyway all openstack services are stateless (they basically heal themselves) | 19:14 |
spatel | Only rabbitMQ and Galera require some love. (I can tell you rabbitMQ is much more stable these days, maybe they fixed lots of bugs) | 19:14 |
spatel | If you create a dedicated rabbit instance for each service then there's much less chance that you will run into any issues. | 19:16 |
spatel | Biggest issue of rabbit is cluster and queue sync up. | 19:16 |
spatel | maybe we should try that in OSA :) an option to spin up rabbit for each service :) | 19:17 |
noonedeadpunk | You have that option | 19:18 |
noonedeadpunk | It's possible and we used that like 3 years ago | 19:18 |
spatel | Only nova/neutron use rabbit heavily, not the others | 19:18 |
spatel | I like that idea to have own rabbit instance for each service | 19:18 |
noonedeadpunk | Well, for nova you have cells | 19:18 |
noonedeadpunk | And for neutron - ovn :D | 19:18 |
noonedeadpunk | You would need either quite a powerful control plane or more than 3 of them | 19:19 |
noonedeadpunk | As running many rabbits is quite a heavy thing | 19:19 |
noonedeadpunk | rabbit per service is not very well documented though. It's only mentioned in trove role documentation | 19:21 |
noonedeadpunk | But idea is exactly the same. | 19:21 |
noonedeadpunk | and you can have galera per service as well | 19:22 |
spatel | Galera is very stable application | 19:23 |
noonedeadpunk | lol | 19:23 |
spatel | I had almost zero issue | 19:23 |
spatel | I don't know about you :D | 19:23 |
noonedeadpunk | you should have also said it has zero bugs | 19:24 |
noonedeadpunk | well, yeah, if it runs - it runs properly | 19:24 |
noonedeadpunk | until shit hits the fan | 19:24 |
spatel | I don't know what bug you encounter but it works well in my case.. | 19:24 |
spatel | May be you guys doing something which i am not aware. | 19:25 |
noonedeadpunk | broken IST, broken threading, and the time they broke root permissions during an upgrade because of weird bits | 19:25 |
noonedeadpunk | they fix everything, but it might be hard to find a stable version from time to time | 19:26 |
spatel | noonedeadpunk i agree on that. There was a bug in a newer version and the upgrade was not smooth | 19:26 |
spatel | May be we need noSQL db in future to not worry about it :) | 19:27 |
spatel | How do you perform housekeeping on mysql | 19:27 |
spatel | we have nova DB size of 150G :D | 19:27 |
noonedeadpunk | oh, that is smth we need to implement in OSA | 19:28 |
noonedeadpunk | as nova-manage allows trimming deleted records | 19:28 |
spatel | I was reading about nova-manage tool to purge but want to learn before i try | 19:28 |
noonedeadpunk | and I was thinking of making an optional systemd service for that, for nova and cinder at least | 19:28 |
spatel | do you have command handy to do cleanup? | 19:29 |
admin1 | heat always fails => https://paste.openstack.org/raw/bTxVwYrRztFQsaIt9eHn/ | 19:29 |
spatel | are there any outage during cleanup? | 19:29 |
noonedeadpunk | https://docs.openstack.org/nova/latest/cli/nova-manage.html#db-archive-deleted-rows | 19:29 |
noonedeadpunk | no-no, it just wipes records from the DB that are marked as deleted | 19:30 |
noonedeadpunk | so you first move to shadow, and then can purge from shadow | 19:30 |
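A hedged example of that archive-then-purge flow, run from a nova-api container/venv; the cutoff date is illustrative, and the flags should be checked against the nova-manage docs linked above for your release:

```shell
# move rows marked deleted before the cutoff into the shadow tables:
nova-manage db archive_deleted_rows --before "2022-12-01" --until-complete
# then drop them from the shadow tables:
nova-manage db purge --before "2022-12-01"
```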
noonedeadpunk | admin1: that usually means that wheels are not built | 19:31 |
spatel | noonedeadpunk tx | 19:31 |
admin1 | how to force rebuild a heat wheel | 19:31 |
spatel | venv_build=true (something like this in ansible command) | 19:32 |
noonedeadpunk | admin1: we also backported bugfix for that https://review.opendev.org/c/openstack/openstack-ansible-os_heat/+/865564 | 19:32 |
noonedeadpunk | maybe not released yet.... | 19:32 |
admin1 | right now, i did apt install git and its running again | 19:32 |
jrosser | but that is wrong | 19:32 |
noonedeadpunk | but that git binary is used only when it can't find repo container as target for wheels | 19:33 |
jrosser | admin1: it is so much more useful to understand why this happens | 19:33 |
jrosser | "git is missing" -> something is broken in your deploy that means wheels are not built properly | 19:33 |
admin1 | the playbooks finished | 19:34 |
admin1 | i can test heat if its all OK | 19:34 |
admin1 | or try to build the wheel again | 19:34 |
noonedeadpunk | admin1: they were not built in the first place | 19:34 |
jrosser | i think you're missing the point really | 19:34 |
admin1 | i know :) | 19:34 |
jrosser | something happened during your deploy that made the wheel build for heat not happen or break | 19:35 |
jrosser | then you probably tried again and it went further but then failed with "git not found" | 19:35 |
noonedeadpunk | So the thing is that when the role can't find a destination to build wheels on, it tries to just install into the venv, and needs git for that | 19:35 |
noonedeadpunk | oh.. well... maybe that as well... | 19:35 |
prometheanfire | noonedeadpunk: yep :D | 19:36 |
spatel | noonedeadpunk why don't we just stop there instead of building the wheel into the venv? | 19:37 |
noonedeadpunk | I kind of wonder if instead of https://opendev.org/openstack/ansible-role-python_venv_build/src/branch/master/vars/main.yml#L80 we should just define True and deal with consequences... | 19:37 |
noonedeadpunk | we are not building wheels at all then | 19:37 |
noonedeadpunk | we're just installing packages from pypi independently | 19:37 |
spatel | I would prefer to not do anything and go back and fix stuff instead of finding other ways out | 19:38 |
noonedeadpunk | spatel: it's a matter of whether we just fail the installation or try to proceed | 19:38 |
jrosser | when the wheel build fails it should delete the .txt files i think | 19:38 |
jrosser | that should make it repeatably try again | 19:38 |
noonedeadpunk | yep, we have block for that | 19:38 |
jrosser | so thats why it is interesting that for admin1 this seems not to happen | 19:39 |
jrosser | but why find root cause when a hack will do /o\ | 19:39 |
admin1 | no no .. i have to deliver this cluster by this evening .. .. i can create this same env again in dev tomorrow and go into root cause ) | 19:39 |
noonedeadpunk | I still think it's a matter of `venv_wheel_build_enable` being evaluated to false rather than anything else | 19:39 |
noonedeadpunk | which depends on quite complex logic to be fair | 19:40 |
jrosser | maybe we should make that a hard failure when there is no build target | 19:41 |
jrosser | rather than fall back to building in the venv | 19:41 |
jrosser | and only allow that if `venv_wheel_build_enable` is actually set to `False` | 19:41 |
jrosser | i think i am also suspicious about stale facts for this whole process | 19:43 |
jrosser | it relies on having architecture and OS facts available for the repo servers, else it just won't work | 19:43 |
jrosser | this should address that though https://github.com/openstack/ansible-role-python_venv_build/blob/4d766a1f9d9993e2bb3647fdcf19da23fffbae61/tasks/main.yml#L31-L39 | 19:44 |
jrosser | this is also pretty complicated logic https://github.com/openstack/ansible-role-python_venv_build/blob/4d766a1f9d9993e2bb3647fdcf19da23fffbae61/tasks/main.yml#L68 | 19:49 |
admin1 | i have checked the cluster using a heat file and it worked fine .. so it was just a matter of apt install git for me for this one .. should I build the wheels again and try ? or assume it works fine | 20:11 |
noonedeadpunk | admin1: it would be great if you could drop git from the container and paste the whole output of python_venv_build | 20:19 |
noonedeadpunk | and re-run with `-e venv_wheels_rebuild=true -e venv_rebuild=true` | 20:20 |
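Putting that suggestion together as a sketch (the container name is a placeholder; the playbook name assumes the standard heat play):

```shell
lxc-attach -n <heat_container_name> -- apt -y remove git
cd /opt/openstack-ansible/playbooks
openstack-ansible os-heat-install.yml \
  -e venv_wheels_rebuild=true -e venv_rebuild=true 2>&1 | tee /tmp/heat-venv-rebuild.log
```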
damiandabrowski | sorry, i just jumped in for a minute | 20:20 |
noonedeadpunk | yeah, facts shouldn't be the issue I guess... But I'm not sure what is.... | 20:20 |
damiandabrowski | venv_build issue reminds me of uwsgi issue we had in november | 20:21 |
damiandabrowski | so basically the current logic is: if there's only one container matching distro and architecture, then wheel won't be built on repo container | 20:22 |
noonedeadpunk | Well. It's same but different I guess.. | 20:22 |
damiandabrowski | so it's mainly the case for single controller deployment | 20:22 |
noonedeadpunk | ohhhhh | 20:22 |
noonedeadpunk | yes, that's true | 20:22 |
damiandabrowski | we tried to fix that for our CI previously: https://review.opendev.org/c/openstack/openstack-ansible/+/752311 | 20:23 |
noonedeadpunk | admin1: how many controllers/repo containers do you have ?:) | 20:23 |
damiandabrowski | but i think it would be good to set venv_wheel_build_enable: True in /opt/openstack-ansible/inventory/group_vars/all/all.yml | 20:24 |
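For a one-off run, a hedged alternative to editing the in-tree group_vars is to pass it as an extra var, which wins precedence regardless of where the default is defined:

```shell
# force wheel builds on the repo container for this run only
openstack-ansible /opt/openstack-ansible/playbooks/os-heat-install.yml \
  -e venv_wheel_build_enable=True
```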
noonedeadpunk | for some reason I assumed it's multiple controllers :D | 20:24 |
noonedeadpunk | fwiw 1 vote still needed for https://review.opendev.org/c/openstack/openstack-ansible/+/869078 | 20:24 |
admin1 | 1 controller , 1 repo | 20:25 |
noonedeadpunk | damiandabrowski: the thing is that in CI we see a quite serious performance penalty from using venv_wheel_build_enable: True, for AIO at least | 20:25 |
noonedeadpunk | admin1: then disregard all that | 20:25 |
noonedeadpunk | it's all good | 20:26 |
noonedeadpunk | except we need to release a new version with heat installing git for that specific usecase | 20:26 |
spatel | Quick question, can i add tags to a vm to identify it? | 20:29 |
spatel | like customer name as tags so i can query based on tags and list all vms | 20:29 |
spatel | Oh! nova server-tag-add vm1 cust1 | 20:33 |
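The openstackclient equivalents of that legacy nova command, hedged since tag support needs compute API microversion 2.26+ and a reasonably recent client:

```shell
openstack server set --tag cust1 vm1        # attach a tag to an existing server
openstack server list --tags cust1 --long   # list servers carrying that tag
```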
jrosser | damiandabrowski: where is the "if there's only one container matching distro and architecture, then wheel won't be built on repo container" in the code? | 20:47 |
jrosser | oh like here https://github.com/openstack/ansible-role-python_venv_build/blob/83998be6b81e756828edc723059e6a5405dd2da6/vars/main.yml#L80 | 20:49 |
jrosser | why do we do that? | 20:49 |
jrosser | hmm right i need to remember that | 20:51 |
jrosser | going to cause some unexpected outcomes when i add a single aarch64 repo node and one aarch64 compute node to my lab | 20:52 |
damiandabrowski | i guess we thought "if we have only one host, then there's no reason to build wheels" which is probably right as it won't optimize anything | 20:52 |
damiandabrowski | but on the other hand, more hosts can be added at any time and repo container is always deployed anyway | 20:53 |
damiandabrowski | so at the end it may make sense to always build wheels | 20:54 |
*** dviroel is now known as dviroel|afk | 21:19 | |
*** rgunasekaran_ is now known as rgunasekaran | 21:42 | |
prometheanfire | having problems creating ports with ovn, is ovn actually ready to be default? | 22:05 |
admin1 | how to tell osa to not treat one host as a network host ( so that it does not create network agents again when rerunning the playbooks ) | 22:12 |
admin1 | so say host h1 was specified as network node before, but not anymore in the user_config .. how to tell osa to not treat h1 as network node anymore ? | 22:12 |
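The question goes unanswered in the log; a hedged sketch of one common approach is to drop the host from the network group in openstack_user_config.yml, prune the stale inventory entries, and delete the orphaned agents (entry names are placeholders; verify the inventory-manage.py options on your release):

```shell
cd /opt/openstack-ansible
./scripts/inventory-manage.py -l | grep h1          # find h1's old network-node entries
./scripts/inventory-manage.py -r <entry_name>       # remove each stale entry
# then clean up the agents neutron still remembers:
# openstack network agent list --host h1
# openstack network agent delete <agent_id>
```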
opendevreview | Merged openstack/openstack-ansible stable/xena: Bump rabbitmq role SHA https://review.opendev.org/c/openstack/openstack-ansible/+/869078 | 23:05 |