opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_ironic master: Fix a typo in pxe_redfish definition https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/906353 | 08:39 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Extra PIP_OPTS in bootstrap_ansible script must be space separated https://review.opendev.org/c/openstack/openstack-ansible/+/906472 | 09:06 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 10:15 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Add SCENARIO_EXCLUDE environment variable to limit expansion of SCENARIO https://review.opendev.org/c/openstack/openstack-ansible/+/906388 | 10:17 |
jrosser | ^ i was thinking about this patch, and there are some different ways to do it | 10:18 |
jrosser | currently it makes a new env var, SCENARIO_EXCLUDE to remove things from the expanded scenario | 10:18 |
jrosser | but we could also do it in a single var with a new keyword to split on, e.g. SCENARIO=aio_lxc_magnum_octavia_exclude_heat | 10:19 |
jrosser | that way no changes to gate_check_commit cli parameters would be needed | 10:19 |
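A rough sketch of what that single-variable approach could look like in shell, splitting on an `_exclude_` keyword. The variable names and parsing below are illustrative only, not the actual gate-check-commit.sh implementation.

```bash
# Illustrative only: split SCENARIO on "_exclude_" so that
# SCENARIO=aio_lxc_magnum_octavia_exclude_heat keeps aio/lxc/magnum/octavia
# and drops heat. Not the real gate-check-commit.sh code.
SCENARIO=${SCENARIO:-aio_lxc_magnum_octavia_exclude_heat}

if [[ "${SCENARIO}" == *_exclude_* ]]; then
  INCLUDE_PART=${SCENARIO%%_exclude_*}   # "aio_lxc_magnum_octavia"
  EXCLUDE_PART=${SCENARIO#*_exclude_}    # "heat"
else
  INCLUDE_PART=${SCENARIO}
  EXCLUDE_PART=""
fi

EXPANDED=""
for component in ${INCLUDE_PART//_/ }; do
  case "_${EXCLUDE_PART}_" in
    *"_${component}_"*) ;;               # listed after _exclude_: skip it
    *) EXPANDED="${EXPANDED}_${component}" ;;
  esac
done
echo "expanded scenario: ${EXPANDED#_}"
```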
noonedeadpunk | I'm really not sure about the `_exclude_` separator frankly speaking | 10:43 |
noonedeadpunk | or about the SCENARIO_EXCLUDE thing overall. Like it feels it's better not to add things to the scenario and then remove them afterwards | 10:44 |
noonedeadpunk | but I see how the current approach is problematic sometimes | 10:44 |
jrosser | yeah, i don't really like it either | 10:46 |
jrosser | i was just looking at https://review.opendev.org/905619 | 10:47 |
jrosser | that fails here https://zuul.opendev.org/t/openstack/build/f3940c376d7a4806824c4d15301d7fc4/log/job-output.txt#13444 | 10:48 |
jrosser | because "Could not find a module for {{hostvars['aio1_repo_container-c0674b1d']['ansible_facts']['pkg_mgr']}}."} | 10:48 |
jrosser | i was wondering if this actually does anything https://zuul.opendev.org/t/openstack/build/f3940c376d7a4806824c4d15301d7fc4/log/job-output.txt#12053 | 10:49 |
jrosser | https://opendev.org/openstack/openstack-ansible/src/branch/master/playbooks/repo-install.yml#L16-L20 | 10:50 |
jrosser | i think i will test this locally - maybe if the tasks list is empty it's a valid optimisation in ansible not to gather facts | 10:50 |
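For reference, the kind of play being discussed has roughly this shape (an illustrative sketch, not a copy of playbooks/repo-install.yml): a play whose only job is fact gathering, which is what may or may not be optimised away.

```yaml
# Sketch of a dedicated fact-gathering play for the repo containers
# (illustrative shape only - see playbooks/repo-install.yml for the real one).
- name: Gather repo facts
  hosts: repo_all
  gather_facts: true
  tasks: []
```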
jrosser | oh well https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/gate-check-commit.sh#L214 | 10:55 |
noonedeadpunk | yeah, I guess we're gathering them here though: https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/gate-check-commit.sh#L187 | 11:03 |
noonedeadpunk | and then skip down the road | 11:03 |
jrosser | that doesn't gather for lxc containers though | 11:05 |
jrosser | and this is only failing the lxc jobs | 11:05 |
noonedeadpunk | well, ok L210 | 11:05 |
jrosser | tbh i thought that python_venv_build role had some specific handling for ensuring that facts are present for the repo host | 11:05 |
noonedeadpunk | handy that we store gathered facts in logs actually (or at least should) | 11:07 |
noonedeadpunk | https://zuul.opendev.org/t/openstack/build/f3940c376d7a4806824c4d15301d7fc4/log/logs/ansible/facts-all.log.txt | 11:07 |
noonedeadpunk | so that should not really be a problem... | 11:08 |
jrosser | ok i'll make an AIO i think | 11:09 |
gokhan_ | hello folks, when upgrading from victoria to wallaby, I am getting rabbitmq crash errors. https://paste.openstack.org/show/blNvUeMG64dBChRchb92/ what could be the reason for this? | 11:16 |
noonedeadpunk | gokhan_: I really have no idea | 11:31 |
noonedeadpunk | it would make sense to check rabbitmq and erlang versions on all containers | 11:31 |
noonedeadpunk | And check they match: https://www.rabbitmq.com/which-erlang.html | 11:32 |
noonedeadpunk | Potentially - upgrade versions of rabbit/erlang to supported ones | 11:32 |
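One hedged way to do that comparison from the deploy host is ad-hoc ansible; the group name and package names below are the usual OSA/Ubuntu ones and may need adjusting for your deployment.

```bash
# Compare rabbitmq and erlang package versions across the rabbitmq containers.
ansible rabbitmq_all -m shell -a "dpkg -l | grep -E 'rabbitmq-server|erlang-base'"
```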
gokhan_ | noonedeadpunk, yes you are right | 11:35 |
gokhan_ | rabbitmq version is 3.11.3 and erlang version is 26 | 11:36 |
gokhan_ | erlang 26 is supported with rabbitmq 3.12.0 | 11:37 |
noonedeadpunk | gokhan_: yeah, so I assume that the rabbit versions are just missing from the repos, so you're falling back to the system version | 11:42 |
noonedeadpunk | gokhan_: we have variables `rabbitmq_package_version` and `rabbitmq_erlang_version_spec` to control these 2 things | 11:43 |
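A hedged example of what pinning those two variables could look like in /etc/openstack_deploy/user_variables.yml. The version strings are placeholders picked to match the compatibility matrix linked above, and the exact format of the erlang spec follows the rabbitmq_server role defaults for your distro - check what is actually published in your configured repositories.

```yaml
# Placeholders only - pick a rabbitmq/erlang pair that really exists in
# the repositories configured for your release.
rabbitmq_package_version: "3.12.13-1"
rabbitmq_erlang_version_spec: "1:26.2*"
```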
gokhan_ | thanks noonedeadpunk , I will check rabbitmq repos and resolve it. | 11:45 |
noonedeadpunk | yeah, basically that.... | 11:46 |
noonedeadpunk | gokhan_: otherwise, you can try using `rabbitmq_install_method: distro` which will potentially downgrade the cluster, but dunno how good that is given you're quite far from the distro provided versions | 11:46 |
noonedeadpunk | so that might be bad idea | 11:47 |
gokhan_ | noonedeadpunk, would it be a problem to use the same versions in xena? | 12:00 |
gokhan_ | sorry, xena is also a problem - rabbitmq-server 3.9.28-1 is not in the repo either | 12:02 |
noonedeadpunk | and you should be careful not to actually downgrade | 12:06 |
noonedeadpunk | as there might be flags in place that will prevent doing so (or rather, prevent startup with a lower version) | 12:07 |
noonedeadpunk | So if you're upgrading to antelope - just take the antelope version right away | 12:07 |
noonedeadpunk | and keep it through all upgrades | 12:07 |
noonedeadpunk | and basically just skip running further rabbitmq upgrades | 12:07 |
noonedeadpunk | that might work :) | 12:07 |
gokhan_ | yes I am upgrading to antelope | 12:08 |
gokhan_ | I will continue with the antelope rabbitmq version | 12:09 |
jrosser | so `Could not find a module for {{hostvars['aio1_repo_container-8475472c']['ansible_facts']['pkg_mgr']}}.` | 12:11 |
jrosser | means that this line did not actually template https://github.com/ansible/ansible/blame/stable-2.15/lib/ansible/plugins/action/package.py#L49 | 12:11 |
jrosser | and so the module name is not templated when it prints the error here https://github.com/ansible/ansible/blame/stable-2.15/lib/ansible/plugins/action/package.py#L66 | 12:12 |
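For context, the shape of the task that ends up in that code path is roughly the following (assumed, modeled on how python_venv_build delegates package installs to the repo host - the variable names are illustrative): the generic `package` action has to resolve the delegated host's `pkg_mgr` fact to pick apt/dnf, and that resolution is what is left untemplated in the error above.

```yaml
# Illustrative shape only - "package" looks up
# hostvars[<delegated host>]['ansible_facts']['pkg_mgr'] to decide which
# backend module (apt, dnf, ...) to run.
- name: Install build packages on the repo host
  ansible.builtin.package:
    name: "{{ venv_build_distro_package_list }}"  # hypothetical variable
    state: present
  delegate_to: "{{ venv_build_host }}"            # hypothetical variable
```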
noonedeadpunk | you're looking at ansible-core 2.15.7 issue? | 12:21 |
jrosser | yeah | 12:21 |
noonedeadpunk | tbh that kinda feels like a connection plugin thing | 12:22 |
noonedeadpunk | as it affects only lxc | 12:22 |
noonedeadpunk | so potentially we fail to propagate facts somehow, dunno... | 12:22 |
noonedeadpunk | or well | 12:22 |
noonedeadpunk | maybe not - we just don't actually delegate on metal | 12:22 |
jrosser | would you expect to get some templating exception then | 12:23 |
jrosser | like if it can't find hostvars['foo']['ansible_facts']['pkg_mgr'] | 12:24 |
noonedeadpunk | well the error there doesn't make much sense to me at all | 12:24 |
noonedeadpunk | I guess I'll just be looking for diffs between 2.15.5 and 2.15.7 | 12:24 |
noonedeadpunk | (I guess that's what we're upgrading for?) | 12:25 |
noonedeadpunk | there should not be many of them... | 12:25 |
jrosser | right, next thing to try | 12:25 |
noonedeadpunk | `Nested templating may result in an inability for the conditional to be evaluated. See the porting guide for more information.` | 12:26 |
noonedeadpunk | Like this one was introduced in 2.15.7: https://docs.ansible.com/ansible-core/2.15/porting_guides/porting_guide_core_2.15.html#playbook | 12:27 |
noonedeadpunk | but yeah, probably not directly related... | 12:27 |
noonedeadpunk | "``import_role`` reverts to previous behavior of exporting vars at compile time." -> that's actually an interesting change..... | 12:29 |
noonedeadpunk | As I can recall hassle specifically with python_venv_build regarding imports/includes messing up vars | 12:29 |
noonedeadpunk | I wonder if 2.15.8 might be covering it.... | 12:32 |
noonedeadpunk | Doubt though | 12:33 |
jrosser | 2.15.6 works, 2.15.7 fails | 12:34 |
jrosser | so it is in the diff between those | 12:34 |
noonedeadpunk | the main diff for 2.15.7 is CVE-2023-5764 | 12:46 |
noonedeadpunk | and 2.15.8 looks like bugfix release for this as well | 12:47 |
jrosser | git bisect says this is the commit that breaks it https://github.com/ansible/ansible/commit/fea130480d261ea5bf6fcd5cf19a348f1686ceb1 | 14:11 |
jrosser | TIL that you can git clone ansible, run source ./hacking/env-setup and OSA automatically picks up the git version | 14:12 |
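The workflow being described, roughly (tag names assumed to follow ansible-core's usual vX.Y.Z scheme):

```bash
# Bisect ansible-core between the known-good and known-bad releases while
# letting OSA pick up the checked-out tree via hacking/env-setup.
git clone https://github.com/ansible/ansible.git
cd ansible
source ./hacking/env-setup        # this checkout now provides "ansible"
git bisect start
git bisect bad v2.15.7            # fails
git bisect good v2.15.6           # works
# re-run the failing playbook/job at each step, then mark it:
#   git bisect good    # or: git bisect bad
```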
noonedeadpunk | so... there were a couple of patches to fix the unsafe regression in 2.15.8.... So... I assume you've tested that it's still not working? | 14:16 |
jrosser | the original patch tries to update to 2.15.8 | 14:23 |
jrosser | https://review.opendev.org/c/openstack/openstack-ansible/+/905619 | 14:23 |
noonedeadpunk | oh, ok | 14:24 |
noonedeadpunk | sorry | 14:24 |
spatel | Quick question: if I take a snapshot, move it to a new openstack cloud and spin up an instance, can I delete that snapshot from glance once the instance is up? | 14:38 |
spatel | I am using Ceph backend storage. | 14:38 |
spatel | My problem is I am planning to migrate all VMs and I don't want 1000s of snapshots sitting in the glance repo eating up space :( | 14:39 |
spatel | Getting this error in logs - rbd.PermissionError: [errno 1] RBD permission error (error listing children.) | 14:57 |
spatel | Looks like this is a parent/child relationship issue | 14:58 |
jrosser | this is OSA? | 14:58 |
spatel | no | 14:59 |
spatel | This should be some Ceph issue, right? | 14:59 |
spatel | I found this article - https://stackoverflow.com/questions/47346402/permissions-for-glance-user-in-ceph | 14:59 |
jrosser | well whatever you use to deploy openstack needs to set the right permissions | 14:59 |
jrosser | or rather openstack / ceph combination | 14:59 |
jrosser | there was a thread on the mailing list about this recently | 14:59 |
spatel | Technically you are allowed to delete a snapshot right after creating an instance from it, right? | 15:00 |
spatel | Do you have a link for that thread? | 15:00 |
jrosser | not without going and checking the ML archive | 15:00 |
spatel | This is my glance caps: caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=images | 15:01 |
spatel | Looks like my glance needs access to the volumes pool as well, otherwise how does it delete the snapshot image? | 15:02 |
jrosser | you can also check in the OSA vars how we set this up | 15:04 |
spatel | Here are the full error logs - https://paste.opendev.org/show/bsHiWAdBpqIO4oZFFI57/ | 15:05 |
noonedeadpunk | these caps should work I guess | 15:17 |
noonedeadpunk | But is it the glance user which is in use then? | 15:17 |
noonedeadpunk | I dunno | 15:18 |
spatel | after giving permission - caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=images, allow rx pool=volumes | 15:21 |
spatel | The error disappeared and I'm getting this msg - Unable to delete image '3708f961-fb74-49f1-ab9b-40cf7954abed' because it is in use. | 15:21 |
spatel | Looks like glance needs the allow rx pool=volumes permission to read the volume | 15:21 |
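A hedged example of applying that cap change with the ceph CLI. The client name and pool names are the usual defaults and deployment-specific, and since `ceph auth caps` replaces all caps for the client, keep whatever mon cap is already in place.

```bash
ceph auth get client.glance          # check the current caps first
ceph auth caps client.glance \
  mon 'allow r' \
  osd 'allow class-read object_prefix rbd_children, allow rwx pool=images, allow rx pool=volumes'
```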
spatel | noonedeadpunk How do I delete the snapshot then? | 15:22 |
spatel | I don't want 1000s of snapshots ending up in my glance repository :( | 15:22 |
noonedeadpunk | I"m kinda slightly confused about what you're trying to do I guess | 15:22 |
noonedeadpunk | Like snapshot first of all is protected when it appears in glance iirc | 15:23 |
noonedeadpunk | so to be removed it should be unprotected first. | 15:23 |
noonedeadpunk | But then - why not remove it from glance in the first place? | 15:23 |
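If an image does turn out to be protected, the unprotect-then-delete step is just the standard openstack CLI (spatel's image below is not protected, so this would only apply to images that are):

```bash
openstack image set --unprotected <image-uuid>
openstack image delete <image-uuid>
```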
spatel | it's not protected - https://paste.opendev.org/show/bsHiWAdBpqIO4oZFFI57/ | 15:23 |
spatel | protected | False | 15:23 |
spatel | Am i missing something ? | 15:24 |
noonedeadpunk | That is actually great question, as all images I do have in glance are protected | 15:25 |
spatel | I have two openstacks and I'm trying to migrate instances from A to B | 15:27 |
spatel | I'm taking a snapshot from A and moving it to B | 15:27 |
spatel | Spinning up an instance, and now I want to delete the snapshot to clean up space.. | 15:28 |
spatel | If I'm not able to delete the snapshots then I will have 1000s of them in glance which is crazy | 15:28 |
hamburgler | if a volume is built on top of a snapshot, I don't think you can delete the snapshot? | 15:32 |
spatel | Hmm! | 15:33 |
spatel | You are saying I should spin up the instance using it as an image instead of booting from volume? | 15:34 |
hamburgler | you could do either, but if the volume is created from snapshot, I don't think you can delete the snapshot after if the volume is in use | 15:35 |
spatel | This is very interesting point... | 15:37 |
hamburgler | maybe I don't have all the details, is this a completely separate cloud you're moving a volume to? | 15:37 |
spatel | I am taking snapshots in glance and moving them to the new cloud | 15:37 |
spatel | importing that snapshot into glance and just spinning up an instance | 15:38 |
hamburgler | and spinning the instance up works yes? - you will not be able to delete the 'image/volume' that the instance is based on | 15:41 |
spatel | I have created an instance and it works.. | 15:41 |
hamburgler | like for example if you have instances built off images, in image pool, you cannot delete the base image if volumes are built on it | 15:41 |
hamburgler | at least probably not safely i believe lol | 15:41 |
spatel | How do I check whether this image is tied to that volume? | 15:43 |
spatel | In case I want to run a script to find out who used this snapshot to create an instance | 15:43 |
hamburgler | can do it from ceph cli | 15:43 |
hamburgler | need to check my notes :) can't remember | 15:44 |
spatel | Hmm I am trying to google that :) | 15:48 |
jrosser | noonedeadpunk: https://github.com/ansible/ansible/blob/stable-2.15/lib/ansible/template/__init__.py#L744-L746 | 15:50 |
jrosser | wild behaviour there, just returning the thing you wanted templated, but not templated | 15:51 |
hamburgler | spatel: you can use rbd ls poolnamehere | 15:51 |
jrosser | with apparently no error or warning | 15:51 |
hamburgler | then rbd info poolname/volumename | 15:51 |
noonedeadpunk | IIRC, they treat "unsafe" exactly the same as "raw" in jinja | 15:51 |
hamburgler | and you will see parent image in output: | 15:52 |
noonedeadpunk | and I can recall a suggestion in the docs to use unsafe/raw as the same thing | 15:52 |
hamburgler | that is just for a volume on top of an image, but I think you can also see created snapshots another way | 15:52 |
jrosser | noonedeadpunk: though i am trying really hard to reproduce with a simple playbook delegating package task from one host to another and it pretty much just works | 15:52 |
hamburgler | spatel: https://paste.openstack.org/show/bSGo4DajpV9u4luVkEvm/ | 15:53 |
spatel | hamburgler I can see the image is the parent - https://paste.opendev.org/show/bu3c7lqf6Ue5tUqn7d6X/ | 15:53 |
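For reference, the parent/child relationship can be walked from the ceph CLI like this; the pool names and the `@snap` snapshot name are the usual glance/cinder defaults and may differ in a given deployment.

```bash
rbd ls volumes                           # list RBD images in the cinder pool
rbd info volumes/volume-<uuid>           # the "parent:" line points at the base image snapshot
rbd children images/<image-uuid>@snap    # list clones built on a glance image's snapshot
```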
hamburgler | yeah i think if you were to do a test and try to rm the image, it wouldn't allow it, it would say there are volumes built on it or something, but obviously only try with test images/volumes :D | 15:54 |
hamburgler | been awhile since I looked at that | 15:54 |
noonedeadpunk | jrosser: would the thing work if we drop our connection plugin and just ssh to the container normally? | 15:55 |
jrosser | good question | 15:55 |
noonedeadpunk | like any other host rather than the containers | 15:55 |
jrosser | i have a meeting now but will try that | 15:55 |
spatel | You are saying to use the rbd command to delete it from Ceph? | 15:55 |
hamburgler | mmm, only with a test image, just to see behaviour | 15:56 |
jrosser | you are moving the snapshot to a different ceph though? | 15:57 |
* jrosser very confusing | 15:57 |
hamburgler | I think he is wanting to delete the snapshot that was moved to a new openstack cluster, after an instance is spun up | 15:57 |
hamburgler | but I think that can't be done if an instance is now running on top of that snapshot | 15:58 |
jrosser | spatel: you have two openstack? and two ceph? or one? | 15:58 |
spatel | Yes... | 16:06 |
spatel | two openstack two ceph | 16:06 |
spatel | This is the latest and greatest openstack and I'm trying to move vms from old to new.. | 16:07 |
spatel | I wish I could directly move files from ceph to ceph and boot vms instead of doing this snapshot ping-pong :( | 16:07 |
hamburgler | not sure if there is a much better way to do it :( | 16:11 |
spatel | Hmm! This is crazy :O | 16:11 |
hamburgler | would be nice to import directly into ceph, but then the entries are not in the openstack database | 16:15 |
jrosser | it might be possible to rbd export / import then do a bunch of database manipulation | 16:15 |
jrosser | but that's a gigantic hack | 16:16 |
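Roughly what that hack would look like for a single volume (names are illustrative, and the cinder/nova database rows would still have to be rewritten by hand, which is why it's a gigantic hack):

```bash
# Stream an RBD image from the old cluster straight into the new one.
rbd export volumes/volume-<uuid> - | \
  ssh new-ceph-node "rbd import - volumes/volume-<uuid>"
```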
spatel | haha :O | 16:16 |
spatel | This is good read - https://docs.ceph.com/en/quincy/rbd/rbd-snapshot/#layering | 16:17 |
spatel | looks like ceph just does COW and keeps the image as a base image | 16:17 |
spatel | FLATTENING A CLONED IMAGE.. does anyone know what it's trying to say? | 16:22 |
jrosser | if you take a snapshot it's like a no-op, just points to the original data | 16:24 |
jrosser | but if you were to want to "detach" that from the original data then you need to fully resolve all the copy-on-write stuff into completely new "flat" data | 16:24 |
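The flatten step jrosser describes is a single rbd command per clone (image names illustrative):

```bash
rbd flatten volumes/volume-<uuid>     # copy all COW data so the clone stands alone
rbd info volumes/volume-<uuid>        # the "parent:" line should be gone afterwards
```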
spatel | hmm I think I need to flatten the image | 16:29 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 16:32 |
spatel | Very interesting - https://blueprints.launchpad.net/cinder/+spec/flatten-volume-from-image | 16:34 |
spatel | What if I use a QCOW2 image to import as a snapshot? | 16:35 |
spatel | Let me ask this question on the mailing list and see what others think about this | 16:37 |
spatel | I did an experiment with a QCOW2 image and it doesn't have a parent reference | 16:45 |
mgariepy | if you store qcow2 images they need to be converted for volumes | 16:46 |
spatel | mgariepy Yes, I know - they take time to download, convert and upload back | 16:55 |
spatel | but it lets me delete the snapshot | 16:55 |
mgariepy | you can enable image caching also. | 16:55 |
jrosser | noonedeadpunk: i completely reproduced it outside OSA | 16:55 |
jrosser | https://paste.opendev.org/show/bdr2qyCGXujPbCQj9x5M/ | 16:56 |
mgariepy | which will do it once and then all the others will use the cached image. | 16:56 |
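One hedged way to turn that on for an OSA deployment is cinder's image-volume cache via a conf override in user_variables.yml; the backend section name below is an assumption - use your own backend's name and tune the limits to your environment.

```yaml
cinder_cinder_conf_overrides:
  rbd:                                   # your cinder backend section name
    image_volume_cache_enabled: true
    image_volume_cache_max_count: 50     # optional limits
    image_volume_cache_max_size_gb: 200
```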
noonedeadpunk | jrosser: looks like a veeeeery reportable bug | 16:56 |
jrosser | it looks like the use of ansible_facts[...] to index into my_other_hosts makes the value be AnsibleUnsafeText | 16:57 |
jrosser | if i make that a regular var that i set with -e then it works | 16:57 |
opendevreview | Martin Oravec proposed openstack/openstack-ansible stable/zed: Keepalived WIP missing in proxy-protocol-networks mysql configuration. https://review.opendev.org/c/openstack/openstack-ansible/+/906447 | 16:58 |
noonedeadpunk | oh | 16:58 |
jrosser | even better example https://paste.opendev.org/show/btNogSfVyKmguabmOh6n/ | 17:01 |
jrosser | andrewbonney: is that patch related to what you were looking at this week? https://review.opendev.org/c/openstack/openstack-ansible/+/906447 | 17:03 |
jrosser | this has a familiar sound to it | 17:03 |
jrosser | spatel: do you miss a step in the description of what you do? don't you have to create an image from the snapshot in order to download it? | 17:05 |
noonedeadpunk | I really wonder if it's "as designed" or not.... | 17:09 |
noonedeadpunk | as if it is, it's getting really weird | 17:09 |
noonedeadpunk | regarding 906447 - I'm slightly confused why it's needed. Like if you have the exact same issue for a different reason - that would be interesting | 17:10 |
jrosser | we have the galera role used outside OSA and i know andrew has a ticket in our system to figure out what is wrong with the proxy protocol setup | 17:19 |
noonedeadpunk | ah, ok | 17:24 |
hamburgler | It looks like during a fresh install of Bobcat, or an upgrade switching from mirrored to quorum queues, that in os-cinder-install the backup service tries to start and ends up failing, because the new vhost doesn't actually get created until later on when cinder-api gets called - think this may need to be re-ordered? | 19:07 |
noonedeadpunk | hamburgler: yeah, that can be a really valid issue | 19:22 |
noonedeadpunk | jrosser: not sure if around... Do you have any idea on how to enable this? https://docs.openstack.org/keystone/latest/api/keystone.api.s3tokens.html | 19:23 |
noonedeadpunk | Like I see plenty of 404s towards /v3/s3tokens and that's annoying. | 19:23 |
noonedeadpunk | At this point I'm about to think it should be done through api-paste | 19:23 |
noonedeadpunk | but I don't see any | 19:24 |
noonedeadpunk | disregard, I guess I"m stupid | 19:25 |
spatel | jrosser noonedeadpunk hamburgler I found a solution :) it's ceph flatten - https://paste.opendev.org/show/bL03lJDvEomnxVHkZebP/ | 20:35 |
spatel | as soon as I flatten the image it lets me delete the snapshot | 20:35 |
hamburgler | awesome! :) glad you found a solution | 20:36 |
spatel | On the mailing list someone said it has been addressed in the new bobcat release - https://docs.openstack.org/releasenotes/glance_store/2023.2.html#relnotes-4-6-0-stable-2023-2 | 20:37 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-plugins master: Add openstack_resources role skeleton https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/878794 | 21:09 |