opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_ironic master: Fix a typo in pxe_redfish definition https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/906353 | 08:39 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Extra PIP_OPTS in bootstrap_ansible script must be space separated https://review.opendev.org/c/openstack/openstack-ansible/+/906472 | 09:06 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 10:15 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Add SCENARIO_EXCLUDE environment variable to limit expansion of SCENARIO https://review.opendev.org/c/openstack/openstack-ansible/+/906388 | 10:17 |
jrosser | ^ i was thinking about this patch, and there are some different ways to do it | 10:18 |
jrosser | currently it makes a new env var, SCENARIO_EXCLUDE to remove things from the expanded scenario | 10:18 |
jrosser | but we could also do it in a single var with a new keyword to split on, e.g. SCENARIO=aio_lxc_magnum_octavia_exclude_heat | 10:19 |
jrosser | that way no changes to gate_check_commit cli parameters would be needed | 10:19 |
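A rough sketch of what that single-variable approach could look like in shell, splitting on an `_exclude_` keyword. The variable names and parsing below are illustrative only, not the actual gate-check-commit.sh implementation.

```bash
# Illustrative only: split SCENARIO on "_exclude_" so that
# SCENARIO=aio_lxc_magnum_octavia_exclude_heat keeps aio/lxc/magnum/octavia
# and drops heat. Not the real gate-check-commit.sh code.
SCENARIO=${SCENARIO:-aio_lxc_magnum_octavia_exclude_heat}

if [[ "${SCENARIO}" == *_exclude_* ]]; then
  INCLUDE_PART=${SCENARIO%%_exclude_*}   # "aio_lxc_magnum_octavia"
  EXCLUDE_PART=${SCENARIO#*_exclude_}    # "heat"
else
  INCLUDE_PART=${SCENARIO}
  EXCLUDE_PART=""
fi

EXPANDED=""
for component in ${INCLUDE_PART//_/ }; do
  case "_${EXCLUDE_PART}_" in
    *"_${component}_"*) ;;               # listed after _exclude_: skip it
    *) EXPANDED="${EXPANDED}_${component}" ;;
  esac
done
echo "expanded scenario: ${EXPANDED#_}"
```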
noonedeadpunk | I'm really not sure about the `_exclude_` separator frankly speaking | 10:43 |
noonedeadpunk | or about the SCENARIO_EXCLUDE thing overall. Like it feels it's better not to add things to the scenario and then remove them afterwards | 10:44 |
noonedeadpunk | but I see how the current approach is problematic sometimes | 10:44 |
jrosser | yeah, i don't really like it either | 10:46 |
jrosser | i was just looking at https://review.opendev.org/905619 | 10:47 |
jrosser | that fails here https://zuul.opendev.org/t/openstack/build/f3940c376d7a4806824c4d15301d7fc4/log/job-output.txt#13444 | 10:48 |
jrosser | because "Could not find a module for {{hostvars['aio1_repo_container-c0674b1d']['ansible_facts']['pkg_mgr']}}."} | 10:48 |
jrosser | i was wondering if this actually does anything https://zuul.opendev.org/t/openstack/build/f3940c376d7a4806824c4d15301d7fc4/log/job-output.txt#12053 | 10:49 |
jrosser | https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/repo-install.yml#L16-L20 | 10:50 |
jrosser | i think i will test this locally - maybe if the tasks list is empty it's a valid optimisation in ansible not to gather facts | 10:50 |
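For reference, the kind of play being discussed has roughly this shape (an illustrative sketch, not a copy of playbooks/repo-install.yml): a play whose only job is fact gathering, which is what may or may not be optimised away.

```yaml
# Sketch of a dedicated fact-gathering play for the repo containers
# (illustrative shape only - see playbooks/repo-install.yml for the real one).
- name: Gather repo facts
  hosts: repo_all
  gather_facts: true
  tasks: []
```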
jrosser | oh well https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/gate-check-commit.sh#L214 | 10:55 |
noonedeadpunk | yeah, I guess we're gathering them here though: https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/gate-check-commit.sh#L187 | 11:03 |
noonedeadpunk | and then skip down the road | 11:03 |
jrosser | that doesn't gather for lxc containers though | 11:05 |
jrosser | and this is only failing the lxc jobs | 11:05 |
noonedeadpunk | well, ok L210 | 11:05 |
jrosser | tbh i thought that python_venv_build role had some specific handling for ensuring that facts are present for the repo host | 11:05 |
noonedeadpunk | handy that we store gathered facts in logs actually (or at least should) | 11:07 |
noonedeadpunk | https://zuul.opendev.org/t/openstack/build/f3940c376d7a4806824c4d15301d7fc4/log/logs/ansible/facts-all.log.txt | 11:07 |
noonedeadpunk | so that should not really be a problem... | 11:08 |
jrosser | ok i'll make an AIO i think | 11:09 |
gokhan_ | hello folks, when upgrading from victoria to wallaby, I am getting rabbitmq crash errors. https://paste.openstack.org/show/blNvUeMG64dBChRchb92/ what could be the reason for this? | 11:16 |
noonedeadpunk | gokhan_: I really have no idea | 11:31 |
noonedeadpunk | it would make sense to check rabbitmq and erlang versions on all containers | 11:31 |
noonedeadpunk | And check they match: https://www.rabbitmq.com/which-erlang.html | 11:32 |
noonedeadpunk | Potentially - upgrade versions of rabbit/erlang to supported ones | 11:32 |
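One hedged way to do that comparison from the deploy host is ad-hoc ansible; the group name and package names below are the usual OSA/Ubuntu ones and may need adjusting for your deployment.

```bash
# Compare rabbitmq and erlang package versions across the rabbitmq containers.
ansible rabbitmq_all -m shell -a "dpkg -l | grep -E 'rabbitmq-server|erlang-base'"
```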
gokhan_ | noonedeadpunk, yes you are right | 11:35 |
gokhan_ | rabbitmq version is 3.11.3 and erlang version is 26 | 11:36 |
gokhan_ | erlang 26 is supported with rabbitmq 3.12.0 | 11:37 |
noonedeadpunk | gokhan_: yeah, so I assume that the rabbit versions are just missing from the repos, so you're falling back to the system version | 11:42 |
noonedeadpunk | gokhan_: we have variables `rabbitmq_package_version` and `rabbitmq_erlang_version_spec` to control these 2 things | 11:43 |
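A hedged example of what pinning those two variables could look like in /etc/openstack_deploy/user_variables.yml. The version strings are placeholders picked to match the compatibility matrix linked above, and the exact format of the erlang spec follows the rabbitmq_server role defaults for your distro - check what is actually published in your configured repositories.

```yaml
# Placeholders only - pick a rabbitmq/erlang pair that really exists in
# the repositories configured for your release.
rabbitmq_package_version: "3.12.13-1"
rabbitmq_erlang_version_spec: "1:26.2*"
```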
gokhan_ | thanks noonedeadpunk , I will check rabbitmq repos and resolve it. | 11:45 |
noonedeadpunk | yeah, basically that.... | 11:46 |
noonedeadpunk | gokhan_: otherwise, you can try using `rabbitmq_install_method: distro` which will potentially downgrade the cluster, but dunno how good that is given you're quite far from the distro provided versions | 11:46 |
noonedeadpunk | so that might be bad idea | 11:47 |
gokhan_ | noonedeadpunk, would it be a problem to use the same versions in xena? | 12:00 |
gokhan_ | sorry, xena is also a problem - rabbitmq-server 3.9.28-1 is not in the repo either | 12:02 |
noonedeadpunk | and you should be careful not to actually downgrade | 12:06 |
noonedeadpunk | as there might be flags in place that will prevent doing so (or rather, prevent startup with a lower version) | 12:07 |
noonedeadpunk | So if you're upgrading to antelope - just take the antelope version right away | 12:07 |
noonedeadpunk | and keep it through all upgrades | 12:07 |
noonedeadpunk | and basically just skip running further rabbitmq upgrades | 12:07 |
noonedeadpunk | that might work :) | 12:07 |
gokhan_ | yes I am upgrading to antelope | 12:08 |
gokhan_ | I will continue with the antelope rabbitmq version | 12:09 |
jrosser | so `Could not find a module for {{hostvars['aio1_repo_container-8475472c']['ansible_facts']['pkg_mgr']}}.` | 12:11 |
jrosser | means that this line did not actually template https://github.com/ansible/ansible/blame/stable-2.15/lib/ansible/plugins/action/package.py#L49 | 12:11 |
jrosser | and so the module name is not templated when it prints the error here https://github.com/ansible/ansible/blame/stable-2.15/lib/ansible/plugins/action/package.py#L66 | 12:12 |
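For context, the shape of the task that ends up in that code path is roughly the following (assumed, modeled on how python_venv_build delegates package installs to the repo host - the variable names are illustrative): the generic `package` action has to resolve the delegated host's `pkg_mgr` fact to pick apt/dnf, and that resolution is what is left untemplated in the error above.

```yaml
# Illustrative shape only - "package" looks up
# hostvars[<delegated host>]['ansible_facts']['pkg_mgr'] to decide which
# backend module (apt, dnf, ...) to run.
- name: Install build packages on the repo host
  ansible.builtin.package:
    name: "{{ venv_build_distro_package_list }}"  # hypothetical variable
    state: present
  delegate_to: "{{ venv_build_host }}"            # hypothetical variable
```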
noonedeadpunk | you're looking at ansible-core 2.15.7 issue? | 12:21 |
jrosser | yeah | 12:21 |
noonedeadpunk | tbh that kinda feels like a connection plugin thing | 12:22 |
noonedeadpunk | as it affects only lxc | 12:22 |
noonedeadpunk | so potentially we fail to propagate facts somehow, dunno... | 12:22 |
noonedeadpunk | or well | 12:22 |
noonedeadpunk | maybe not - we just don't actually delegate on metal | 12:22 |
jrosser | would you expect to get some templating exception then | 12:23 |
jrosser | like if it can't find hostvars['foo']['ansible_facts']['pkg_mgr'] | 12:24 |
noonedeadpunk | well the error there doesn't make much sense to me at all | 12:24 |
noonedeadpunk | I guess I'll just be looking for diffs between 2.15.5 and 2.15.7 | 12:24 |
noonedeadpunk | (I guess that's what we're upgrading for?) | 12:25 |
noonedeadpunk | there should not be many of them... | 12:25 |
jrosser | right, next thing to try | 12:25 |
noonedeadpunk | `Nested templating may result in an inability for the conditional to be evaluated. See the porting guide for more information.` | 12:26 |
noonedeadpunk | Like this one was introduced in 2.15.7: https://docs.ansible.com/ansible-core/2.15/porting_guides/porting_guide_core_2.15.html#playbook | 12:27 |
noonedeadpunk | but yeah, probably not directly related... | 12:27 |
noonedeadpunk | "``import_role`` reverts to previous behavior of exporting vars at compile time." -> that's actually an interesting change..... | 12:29 |
noonedeadpunk | As I can recall hassle specifically with python_venv_build regarding imports/includes messing up vars | 12:29 |
noonedeadpunk | I wonder if 2.15.8 might be covering it.... | 12:32 |
noonedeadpunk | Doubt though | 12:33 |
jrosser | 2.15.6 works, 2.15.7 fails | 12:34 |
jrosser | so it is in the diff between those | 12:34 |
noonedeadpunk | the main diff for 2.15.7 is CVE-2023-5764 | 12:46 |
noonedeadpunk | and 2.15.8 looks like bugfix release for this as well | 12:47 |
jrosser | git bisect says this is the commit that breaks it https://github.com/ansible/ansible/commit/fea130480d261ea5bf6fcd5cf19a348f1686ceb1 | 14:11 |
jrosser | TIL that you can git clone ansible, run source ./hacking/env-setup and OSA automatically picks up the git version | 14:12 |
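The workflow being described, roughly (tag names assumed to follow ansible-core's usual vX.Y.Z scheme):

```bash
# Bisect ansible-core between the known-good and known-bad releases while
# letting OSA pick up the checked-out tree via hacking/env-setup.
git clone https://github.com/ansible/ansible.git
cd ansible
source ./hacking/env-setup        # this checkout now provides "ansible"
git bisect start
git bisect bad v2.15.7            # fails
git bisect good v2.15.6           # works
# re-run the failing playbook/job at each step, then mark it:
#   git bisect good    # or: git bisect bad
```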
noonedeadpunk | so... there were a couple of patches to fix the unsafe regression in 2.15.8.... So... I assume you've tested that it's still not working? | 14:16 |
jrosser | the original patch tries to update to 2.15.8 | 14:23 |
jrosser | https://review.opendev.org/c/openstack/openstack-ansible/+/905619 | 14:23 |
noonedeadpunk | oh, ok | 14:24 |
noonedeadpunk | sorry | 14:24 |
spatel | Quick question: if I take a snapshot, move it to a new openstack cloud and spin up an instance, can I delete that snapshot from glance once the instance is up? | 14:38 |
spatel | I am using Ceph backend storage. | 14:38 |
spatel | My problem is I am planning to migrate all VMs and I don't want 1000s of snapshots sitting in the glance repo eating up space :( | 14:39 |
spatel | Getting this error in logs - rbd.PermissionError: [errno 1] RBD permission error (error listing children.) | 14:57 |
spatel | Looks like this is a parent/child relationship issue | 14:58 |
jrosser | this is OSA? | 14:58 |
spatel | no | 14:59 |
spatel | This should be some Ceph issue, right? | 14:59 |
spatel | I found this article - https://stackoverflow.com/questions/47346402/permissions-for-glance-user-in-ceph | 14:59 |
jrosser | well whatever you use to deploy openstack needs to set the right permissions | 14:59 |
jrosser | or rather openstack / ceph combination | 14:59 |
jrosser | there was a thread on the mailing list about this recently | 14:59 |
spatel | Technically you are allowed to delete a snapshot right after creating an instance from it, right? | 15:00 |
spatel | Do you have a link for that thread? | 15:00 |
jrosser | not without going and checking the ML archive | 15:00 |
spatel | This is my glance caps: caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=images | 15:01 |
spatel | Looks like my glance needs access to the volumes pool as well, otherwise how does it delete the snapshot image? | 15:02 |
jrosser | you can also check in the OSA vars how we set this up | 15:04 |
spatel | Here are the full error logs - https://paste.opendev.org/show/bsHiWAdBpqIO4oZFFI57/ | 15:05 |
noonedeadpunk | these caps should work I guess | 15:17 |
noonedeadpunk | But is it the glance user which is in use then? | 15:17 |
noonedeadpunk | I dunno | 15:18 |
spatel | after giving permission - caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=images, allow rx pool=volumes | 15:21 |
spatel | The error disappeared and I'm getting this msg - Unable to delete image '3708f961-fb74-49f1-ab9b-40cf7954abed' because it is in use. | 15:21 |
spatel | Looks like glance needs the allow rx pool=volumes permission to read the volume | 15:21 |
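A hedged example of applying that cap change with the ceph CLI. The client name and pool names are the usual defaults and deployment-specific, and since `ceph auth caps` replaces all caps for the client, keep whatever mon cap is already in place.

```bash
ceph auth get client.glance          # check the current caps first
ceph auth caps client.glance \
  mon 'allow r' \
  osd 'allow class-read object_prefix rbd_children, allow rwx pool=images, allow rx pool=volumes'
```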
spatel | noonedeadpunk How do I delete the snapshot then? | 15:22 |
spatel | I don't want 1000s of snapshots ending up in my glance repository :( | 15:22 |
noonedeadpunk | I"m kinda slightly confused about what you're trying to do I guess | 15:22 |
noonedeadpunk | Like snapshot first of all is protected when it appears in glance iirc | 15:23 |
noonedeadpunk | so to be removed it should be unprotected first. | 15:23 |
noonedeadpunk | But then - why not remove it from glance in the first place? | 15:23 |
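If an image does turn out to be protected, the unprotect-then-delete step is just the standard openstack CLI (spatel's image below is not protected, so this would only apply to images that are):

```bash
openstack image set --unprotected <image-uuid>
openstack image delete <image-uuid>
```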
spatel | it's not protected - https://paste.opendev.org/show/bsHiWAdBpqIO4oZFFI57/ | 15:23 |
spatel | protected | False | 15:23 |
spatel | Am i missing something ? | 15:24 |
noonedeadpunk | That is actually great question, as all images I do have in glance are protected | 15:25 |
spatel | I have two openstacks and I'm trying to migrate instances from A to B | 15:27 |
spatel | I'm taking a snapshot from A and moving it to B | 15:27 |
spatel | Spinning up an instance, and now I want to delete the snapshot to clean up space.. | 15:28 |
spatel | If I'm not able to delete the snapshots then I will have 1000s of them in glance which is crazy | 15:28 |
hamburgler | if a volume is built on top of a snapshot, I don't think you can delete the snapshot? | 15:32 |
spatel | Hmm! | 15:33 |
spatel | You are saying I should spin up the instance using it as an image instead of booting from volume? | 15:34 |
hamburgler | you could do either, but if the volume is created from snapshot, I don't think you can delete the snapshot after if the volume is in use | 15:35 |
spatel | This is very interesting point... | 15:37 |
hamburgler | maybe I don't have all the details, is this a completely separate cloud you're moving a volume to? | 15:37 |
spatel | I am taking snapshots in glance and moving them to the new cloud | 15:37 |
spatel | importing that snapshot into glance and just spinning up an instance | 15:38 |
hamburgler | and spinning the instance up works yes? - you will not be able to delete the 'image/volume' that the instance is based on | 15:41 |
spatel | I have created an instance and it works.. | 15:41 |
hamburgler | like for example if you have instances built off images, in image pool, you cannot delete the base image if volumes are built on it | 15:41 |
hamburgler | at least probably not safely i believe lol | 15:41 |
spatel | How do I check whether this image is tied to that volume? | 15:43 |
spatel | In case I want to run a script to find out who used this snapshot to create an instance | 15:43 |
hamburgler | can do it from ceph cli | 15:43 |
hamburgler | need to check my notes :) can't remember | 15:44 |
spatel | Hmm I am trying to google that :) | 15:48 |
jrosser | noonedeadpunk: https://github.com/ansible/ansible/blob/stable-2.15/lib/ansible/template/__init__.py#L744-L746 | 15:50 |
jrosser | wild behaviour there, just returning the thing you wanted templated, but not templated | 15:51 |
hamburgler | spatel: you can use rbd ls poolnamehere | 15:51 |
jrosser | with apparently no error or warning | 15:51 |
hamburgler | then rbd info poolname/volumename | 15:51 |
noonedeadpunk | IIRC, they treat "unsafe" exactly the same as "raw" in jinja | 15:51 |
hamburgler | and you will see parent image in output: | 15:52 |
noonedeadpunk | and I can recall a suggestion in the docs to use unsafe/raw as the same thing | 15:52 |
hamburgler | that is just for a volume on top of an image, but I think you can also see created snapshots another way | 15:52 |
jrosser | noonedeadpunk: though i am trying really hard to reproduce with a simple playbook delegating package task from one host to another and it pretty much just works | 15:52 |
hamburgler | spatel: https://paste.openstack.org/show/bSGo4DajpV9u4luVkEvm/ | 15:53 |
spatel | hamburgler I can see the image is the parent - https://paste.opendev.org/show/bu3c7lqf6Ue5tUqn7d6X/ | 15:53 |
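For reference, the parent/child relationship can be walked from the ceph CLI like this; the pool names and the `@snap` snapshot name are the usual glance/cinder defaults and may differ in a given deployment.

```bash
rbd ls volumes                           # list RBD images in the cinder pool
rbd info volumes/volume-<uuid>           # the "parent:" line points at the base image snapshot
rbd children images/<image-uuid>@snap    # list clones built on a glance image's snapshot
```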
hamburgler | yeah i think if you were to do a test and try to rm the image, it wouldn't allow it, it would say there are volumes built on it or something, but obviously only try with test images/volumes :D | 15:54 |
hamburgler | been awhile since I looked at that | 15:54 |
noonedeadpunk | jrosser: would the thing work if we drop our connection plugin and just ssh to the container normally? | 15:55 |
jrosser | good question | 15:55 |
noonedeadpunk | like any other host rather than the containers | 15:55 |
jrosser | i have a meeting now but will try that | 15:55 |
spatel | You are saying to use the rbd command to delete it from Ceph? | 15:55 |
hamburgler | mmm, only with a test image, just to see behaviour | 15:56 |
jrosser | you are moving the snapshot to a different ceph though? | 15:57 |
* jrosser very confusing | 15:57 |
hamburgler | I think he is wanting to delete the snapshot that was moved to a new openstack cluster, after an instance is spun up | 15:57 |
hamburgler | but I think that can't be done if an instance is now running on top of that snapshot | 15:58 |
jrosser | spatel: you have two openstack? and two ceph? or one? | 15:58 |
spatel | Yes... | 16:06 |
spatel | two openstack two ceph | 16:06 |
spatel | This is the latest and greatest openstack and I'm trying to move vms from old to new.. | 16:07 |
spatel | I wish I could directly move files from ceph to ceph and boot vms instead of doing this snapshot ping-pong :( | 16:07 |
hamburgler | not sure if there is a much better way to do it :( | 16:11 |
spatel | Hmm! This is crazy :O | 16:11 |
hamburgler | would be nice to import directly into ceph, but then the entries are not in the openstack database | 16:15 |
jrosser | it might be possible to rbd export / import then do a bunch of database manipulation | 16:15 |
jrosser | but that's a gigantic hack | 16:16 |
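Roughly what that hack would look like for a single volume (names are illustrative, and the cinder/nova database rows would still have to be rewritten by hand, which is why it's a gigantic hack):

```bash
# Stream an RBD image from the old cluster straight into the new one.
rbd export volumes/volume-<uuid> - | \
  ssh new-ceph-node "rbd import - volumes/volume-<uuid>"
```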
spatel | haha :O | 16:16 |
spatel | This is good read - https://docs.ceph.com/en/quincy/rbd/rbd-snapshot/#layering | 16:17 |
spatel | looks like ceph just does COW and keeps the image as a base image | 16:17 |
spatel | FLATTENING A CLONED IMAGE.. does anyone know what it's trying to say? | 16:22 |
jrosser | if you take a snapshot it's like a no-op, just points to the original data | 16:24 |
jrosser | but if you were to want to "detach" that from the original data then you need to fully resolve all the copy-on-write stuff into completely new "flat" data | 16:24 |
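The flatten step jrosser describes is a single rbd command per clone (image names illustrative):

```bash
rbd flatten volumes/volume-<uuid>     # copy all COW data so the clone stands alone
rbd info volumes/volume-<uuid>        # the "parent:" line should be gone afterwards
```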
spatel | hmm I think I need to flatten the image | 16:29 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 16:32 |
spatel | Very interesting - https://blueprints.launchpad.net/cinder/+spec/flatten-volume-from-image | 16:34 |
spatel | What if I use a QCOW2 image to import as a snapshot? | 16:35 |
spatel | Let me ask this question on the mailing list and see what others think about this | 16:37 |
spatel | I did an experiment with a QCOW2 image and it doesn't have a parent reference | 16:45 |
mgariepy | if you store qcow2 images they need to be converted for volumes | 16:46 |
spatel | mgariepy Yes, I know - they take time to download, convert and upload back | 16:55 |
spatel | but it lets me delete the snapshot | 16:55 |
mgariepy | you can enable image caching also. | 16:55 |
jrosser | noonedeadpunk: i completely reproduced it outside OSA | 16:55 |
jrosser | https://paste.opendev.org/show/bdr2qyCGXujPbCQj9x5M/ | 16:56 |
mgariepy | which will do it once and then all the others will use the cached image. | 16:56 |
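One hedged way to turn that on for an OSA deployment is cinder's image-volume cache via a conf override in user_variables.yml; the backend section name below is an assumption - use your own backend's name and tune the limits to your environment.

```yaml
cinder_cinder_conf_overrides:
  rbd:                                   # your cinder backend section name
    image_volume_cache_enabled: true
    image_volume_cache_max_count: 50     # optional limits
    image_volume_cache_max_size_gb: 200
```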
noonedeadpunk | jrosser: looks like a veeeeery reportable bug | 16:56 |
jrosser | it looks like the use of ansible_facts[...] to index into my_other_hosts makes the value be AnsibleUnsafeText | 16:57 |
jrosser | if i make that a regular var that i set with -e then it works | 16:57 |
opendevreview | Martin Oravec proposed openstack/openstack-ansible stable/zed: Keepalived WIP missing in proxy-protocol-networks mysql configuration. https://review.opendev.org/c/openstack/openstack-ansible/+/906447 | 16:58 |
noonedeadpunk | oh | 16:58 |
jrosser | even better example https://paste.opendev.org/show/btNogSfVyKmguabmOh6n/ | 17:01 |
jrosser | andrewbonney: is that patch related to what you were looking at this week? https://review.opendev.org/c/openstack/openstack-ansible/+/906447 | 17:03 |
jrosser | this has a familiar sound to it | 17:03 |
jrosser | spatel: do you miss a step in the description of what you do? don't you have to create an image from the snapshot in order to download it? | 17:05 |
noonedeadpunk | I really wonder if it's "as designed" or not.... | 17:09 |
noonedeadpunk | as if it is, it's getting really weird | 17:09 |
noonedeadpunk | regarding 906447 - I'm slightly confused why it's needed. Like if you have the exact same issue for a different reason - that would be interesting | 17:10 |
jrosser | we have the galera role used outside OSA and i know andrew has a ticket in our system to figure out what is wrong with the proxy protocol setup | 17:19 |
noonedeadpunk | ah, ok | 17:24 |
hamburgler | It looks like during a fresh install of Bobcat, or an upgrade switching from mirrored to quorum queues, that in os-cinder-install the backup service tries to start and ends up failing, because the new vhost doesn't actually get created until later on when cinder-api gets called - think this may need to be re-ordered? | 19:07 |
noonedeadpunk | hamburgler: yeah, that can be a really valid issue | 19:22 |
noonedeadpunk | jrosser: not sure if around... Do you have any idea on how to enable this? https://docs.openstack.org/keystone/latest/api/keystone.api.s3tokens.html | 19:23 |
noonedeadpunk | Like I see plenty of 404s towards /v3/s3tokens and that's annoying. | 19:23 |
noonedeadpunk | At this point I'm about to think it should be done through api-paste | 19:23 |
noonedeadpunk | but I don't see any | 19:24 |
noonedeadpunk | disregard, I guess I"m stupid | 19:25 |
spatel | jrosser noonedeadpunk hamburgler I found a solution :) it's ceph flatten - https://paste.opendev.org/show/bL03lJDvEomnxVHkZebP/ | 20:35 |
spatel | as soon as I flatten the image it lets me delete the snapshot | 20:35 |
hamburgler | awesome! :) glad you found a solution | 20:36 |
spatel | On the mailing list someone said it has been addressed in the new bobcat release - https://docs.openstack.org/releasenotes/glance_store/2023.2.html#relnotes-4-6-0-stable-2023-2 | 20:37 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-plugins master: Add openstack_resources role skeleton https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/878794 | 21:09 |