noonedeadpunk | well - https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921690 still fails :( | 06:30 |
noonedeadpunk | and likely worth reverting qmanager enablement :( | 06:31 |
noonedeadpunk | release patch is not merged yet :D | 06:31 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Revert "Enable oslomsg_rabbit_queue_manager by default" https://review.opendev.org/c/openstack/openstack-ansible/+/921726 | 06:32 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Revert "Enable oslomsg_rabbit_queue_manager by default" https://review.opendev.org/c/openstack/openstack-ansible/+/921726 | 06:36 |
noonedeadpunk | or maybe we can just do that for magnum... dunno | 06:42 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_magnum master: Temporary set CAPI jobs to NV https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921727 | 06:44 |
jrosser | oh dear /o\ capi job so broken | 07:09 |
jrosser | when really nothing in there changed, just stuff around it | 07:09 |
noonedeadpunk | in k8s world or openstack world? | 07:31 |
jrosser | well that is actually a great question | 07:32 |
jrosser | time to make another sandbox i guess | 07:39 |
jrosser | so i also saw that python3.12 is available in rocky9.4 | 07:45 |
jrosser | that would give us a route for moving the ansible version forward (use 3.12 just in the ansible runtime) | 07:46 |
jrosser | as it will probably be a very very long time before anything RH10-ish is usable | 07:47 |
noonedeadpunk | well, ansible-core 2.17 still has support for 3.10? | 07:49 |
* noonedeadpunk can't recall what comes in rocky out of the box though | 07:49 |
jrosser | it's py3.9 on rocky9 though | 07:49 |
noonedeadpunk | ah | 07:49 |
noonedeadpunk | yeah | 07:49 |
noonedeadpunk | and we also can proceed with deb822 patches now | 07:50 |
jrosser | that is probably a fairly simple thing to modify the bootstrap script to install 3.12 and use it just for the ansible venv | 07:50 |
noonedeadpunk | I can recall having smth like that already for EL | 07:51 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible/src/tag/yoga-eom/scripts/bootstrap-ansible.sh#L72 | 07:52 |
jrosser | ahha | 07:53 |
noonedeadpunk | though there was some selinux thingy as well | 07:54 |
noonedeadpunk | when we were symlinking it... but maybe it's different | 07:54 |
jrosser | i think thats gone away and the selinux handling is now internal to ansible | 07:56 |
noonedeadpunk | yeah, probably... | 07:59 |
noonedeadpunk | but I wonder how it's done given that there's no binding for 3.12 apparently... | 08:00 |
noonedeadpunk | anyway | 08:00 |
noonedeadpunk | like python38 and libselinux-python3 would install things for different python versions apparently | 08:00 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Update ansible to 2.17 https://review.opendev.org/c/openstack/openstack-ansible/+/921735 | 08:56 |
jrosser | noonedeadpunk: do you have any AIO around? | 10:17 |
noonedeadpunk | um, yeah, should be some | 10:23 |
jrosser | if you have a chance to try attaching a cinder volume to a vm in an AIO, it would be interesting to know if it's working|broken | 10:23 |
noonedeadpunk | we should be testing that in CI? | 10:24 |
noonedeadpunk | or you mean in lxc? | 10:25 |
noonedeadpunk | with ceph? | 10:25 |
jrosser | no just iscsi as we set it up without ceph | 10:25 |
noonedeadpunk | well... we test only metal :D | 10:26 |
noonedeadpunk | as I guess it's not in lxc.... | 10:26 |
noonedeadpunk | but I don't have lxc without ceph around | 10:27 |
noonedeadpunk | I have either metal ones or ceph ones... | 10:27 |
jrosser | noonedeadpunk: this is what happens https://paste.opendev.org/show/bNEQGSvkVwV9ShhN5VRp/ | 10:58 |
noonedeadpunk | yeah... seen that in glance when it tried to create image from volume | 10:58 |
noonedeadpunk | on lxc | 10:59 |
noonedeadpunk | I guess we also have a couple of bugs with smth similar... | 10:59 |
noonedeadpunk | though I couldn't understand what's wrong there quickly | 11:00 |
jrosser | i think it is the same thing as this https://bugs.launchpad.net/charm-cinder/+bug/1825809 | 11:00 |
noonedeadpunk | iscsid is stopped? | 11:01 |
noonedeadpunk | or well, like there's no uuid or smth... | 11:01 |
jrosser | yes it is stopped | 11:03 |
jrosser | root@aio1:/home/ubuntu# cat /etc/iscsi/initiatorname.iscsi | 11:03 |
jrosser | GenerateName=yes | 11:03 |
opendevreview | Amy Marrich proposed openstack/openstack-ansible master: Grammar and OS corrections https://review.opendev.org/c/openstack/openstack-ansible/+/921758 | 12:29 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Add sshpass_prompt to ssh connection options https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/921761 | 13:26 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: Restart magnum service after deployment https://review.opendev.org/c/openstack/openstack-ansible-ops/+/921767 | 14:22 |
jrosser | hmm i am wondering if magnum leaves its db connection somehow messed up after install / db migrate without a restart | 14:24 |
noonedeadpunk | that would be surprising frankly speaking | 14:25 |
jrosser | because i can reproduce the sql connection exception here locally | 14:25 |
noonedeadpunk | as <service>-manage should create their own | 14:25 |
jrosser | and i apply the qmanager patch from yesterday with the cluster stuck half created | 14:25 |
jrosser | restart the service | 14:25 |
jrosser | and bingo it completes almost instantly | 14:26 |
noonedeadpunk | huh | 14:26 |
jrosser | the qmanager patch certainly fixes a sherlock stacktrace | 14:26 |
jrosser | and then once i have done that once i can delete/create clusters just fine | 14:27 |
jrosser | obviously i am conflating applying the qmanager patch here but i think until we disable that globally there is a legitimate bug in magnum setup | 14:28 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-ops master: Restart magnum service after deployment https://review.opendev.org/c/openstack/openstack-ansible-ops/+/921767 | 14:30 |
noonedeadpunk | I've already proposed disablement... | 14:30 |
noonedeadpunk | but dunno, if we should do that or patch magnum first | 14:31 |
noonedeadpunk | eventually, we need to do both | 14:31 |
jrosser | ^ i just adjusted that (bad) restart magnum patch to depend on the qmanager disablement | 14:31 |
jrosser | so we see if that's enough | 14:31 |
noonedeadpunk | I think, you can define `handlers` in playbook | 14:33 |
noonedeadpunk | but not sure when to trigger | 14:33 |
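A minimal sketch of the playbook-level `handlers` idea being floated here, assuming the `magnum_all` inventory group and the `magnum-conductor` service seen in the logs above; the notifying task is a hypothetical placeholder, not the actual os_magnum change:

```yaml
# Handlers defined at play level run once at the end of the play,
# if any task notified them.
- hosts: magnum_all
  tasks:
    - name: Placeholder for a task that changes magnum configuration
      ansible.builtin.debug:
        msg: "pretend a config file changed"
      changed_when: true
      notify: Restart magnum services

  handlers:
    - name: Restart magnum services
      ansible.builtin.service:
        name: magnum-conductor
        state: restarted
```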
jrosser | i think what i was getting at before is there is a gross exception when magnum starts up with no db present https://zuul.opendev.org/t/openstack/build/893c3205310d4275a3aa2141d2123763/log/logs/openstack/aio1-magnum-container-8af236e8/magnum-conductor.service.journal-18-11-56.log.txt#350 | 14:35 |
jrosser | then kind of behind the back of that we use magnum-manage or whatever to initialise the db | 14:36 |
jrosser | and then without restarting the service after that we end up at another db related exception https://zuul.opendev.org/t/openstack/build/893c3205310d4275a3aa2141d2123763/log/logs/openstack/aio1-magnum-container-8af236e8/magnum-conductor.service.journal-18-11-56.log.txt#2026 | 14:37 |
jrosser | actually maybe it is restarted | 14:40 |
noonedeadpunk | it should be restarted at the very end of the role | 14:41 |
noonedeadpunk | with handlers... | 14:41 |
mgariepy | anyone uses unified limits here? | 14:50 |
noonedeadpunk | nah, not yet | 14:51 |
noonedeadpunk | there's actually some homework for us to do regarding them | 14:51 |
noonedeadpunk | as they'd need system scopes still | 14:52 |
noonedeadpunk | and some other oslo thingy.... | 14:52 |
mgariepy | i'd like to use them to manage quota on gpus | 14:52 |
noonedeadpunk | bauzas had a nice presentation in France about how exactly to configure them so they work. | 14:52 |
noonedeadpunk | So you can ask him for slides maybe :D | 14:52 |
noonedeadpunk | (they could be in french though) | 14:52 |
mgariepy | i do speak french ;p haha | 14:53 |
mgariepy | https://github.com/sbauza/sbauza.github.io/tree/master/2024/05/22 | 14:57 |
noonedeadpunk | ok, amazing, thanks! | 14:58 |
mgariepy | nice | 14:59 |
mgariepy | it was linked on linkedin :) | 14:59 |
noonedeadpunk | ah :D | 14:59 |
mgariepy | basic osint skills here haha | 15:00 |
mgariepy | it does confirm my idea tho. | 15:00 |
noonedeadpunk | #startmeeting openstack_ansible_meeting | 15:02 |
opendevmeet | Meeting started Tue Jun 11 15:02:27 2024 UTC and is due to finish in 60 minutes. The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:02 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:02 |
opendevmeet | The meeting name has been set to 'openstack_ansible_meeting' | 15:02 |
noonedeadpunk | #topic rollcall | 15:02 |
noonedeadpunk | o/ | 15:02 |
damiandabrowski | hi! good to be back after 2 weeks of absence \o/ | 15:02 |
hamburgler | o/ | 15:02 |
jrosser | o/ hello | 15:03 |
mgariepy | hey | 15:04 |
NeilHanlon | o/ | 15:04 |
noonedeadpunk | #topic office hours | 15:06 |
noonedeadpunk | so. we currently have a bug with magnum and qmanager | 15:06 |
noonedeadpunk | at least with magnum | 15:06 |
noonedeadpunk | and there're 2 ways kinda. | 15:06 |
jrosser | yeah - i think this is more widespread if you look in codesearch | 15:06 |
noonedeadpunk | first - merge the revert of the enablement (disable qmanager) | 15:07 |
noonedeadpunk | technically - the final release isn't cut yet | 15:07 |
noonedeadpunk | so if we merge that now - the final release can ship with it disabled and help avoid a mass bug | 15:08 |
noonedeadpunk | #link https://review.opendev.org/c/openstack/openstack-ansible/+/921726 | 15:08 |
noonedeadpunk | and then we can take time to do patches like https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921690 and backport them | 15:09 |
noonedeadpunk | (or not) | 15:09 |
noonedeadpunk | and I also proposed to set jobs to NV: https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921727 | 15:10 |
jrosser | i think it would be ok to backport those with the qmanager disabled | 15:10 |
jrosser | then we can opt in to test it easily - not really having a testing window for it is kind of how we ended up in trouble | 15:10 |
noonedeadpunk | yeah | 15:11 |
jrosser | what to think about is when we can settle on a new set of defaults and remove a lot of the complexity for switching queue types | 15:12 |
noonedeadpunk | let me backport right away | 15:12 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2024.1: Revert "Enable oslomsg_rabbit_queue_manager by default" https://review.opendev.org/c/openstack/openstack-ansible/+/921775 | 15:12 |
jrosser | making preparation for rabbitmq4 | 15:12 |
noonedeadpunk | so, ha queues are dropped with rabbitmq 4.0 | 15:12 |
noonedeadpunk | but then it could be a valid thing to just use CQv2 without HA | 15:12 |
jrosser | 'late 2024' for that | 15:13 |
jrosser | so really for D we need to be getting everything really solid for quorum queues and considering removing support for HA | 15:13 |
noonedeadpunk | I think, we'll release 2025.1 with rabbit 3.* still... | 15:13 |
noonedeadpunk | or that | 15:13 |
noonedeadpunk | but given we keep the option for CQv2 - all the complexity for migration will kinda stay | 15:14 |
jrosser | it depends if we want to be allowing to handle rabbitmq3 -> 4 and HA -> quorum in the same upgrade, which might be all kinds of /o\ | 15:14 |
noonedeadpunk | nah, I guess HA->quorum should end before 4 | 15:16 |
noonedeadpunk | and that's why I was thinking to still have 3.* for 2025.1... | 15:16 |
noonedeadpunk | but then CQ<->QQ still will be there, right? | 15:17 |
jrosser | well it's kind of a decision to make about what we support | 15:17 |
jrosser | and that vs complexity | 15:17 |
noonedeadpunk | historically there were a bunch of deployments that opted out of HA | 15:18 |
noonedeadpunk | which might be still valid for quorum | 15:18 |
noonedeadpunk | as CQv2 is still gonna be more performant I assume | 15:18 |
noonedeadpunk | dunno | 15:20 |
jrosser | ok - well whichever way we have some fixing up to do | 15:20 |
noonedeadpunk | Eventually, what we can do in 2024.2 is remove the HA policy | 15:22 |
noonedeadpunk | then what you're left with is either migrating to quorum or regular CQv2 with no HA | 15:23 |
noonedeadpunk | this potentially opens the path to the 4.0 upgrade whenever we're confident in it | 15:23 |
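For reference, a hedged `user_variables.yml` sketch of the choices being weighed; `oslomsg_rabbit_queue_manager` is the variable from the revert above, while `oslomsg_rabbit_quorum_queues` is assumed from recent openstack-ansible defaults, so verify both against your branch:

```yaml
# Stay on classic (CQv2) queues without HA instead of migrating to quorum:
oslomsg_rabbit_quorum_queues: false
# Keep the queue manager disabled while the magnum regression is open:
oslomsg_rabbit_queue_manager: false
```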
noonedeadpunk | but yes, first we potentially have some fixing to do... | 15:24 |
jrosser | ok so other thing i found today was broken cinder/lvm in aio | 15:26 |
noonedeadpunk | so, are we landing 921726 and backporting right away? | 15:26 |
noonedeadpunk | cinder/lvm/lxc I guess? | 15:26 |
noonedeadpunk | yeah... | 15:26 |
jrosser | yes i think we merge 921726 | 15:26 |
noonedeadpunk | ok, then I'm blocking https://review.opendev.org/c/openstack/releases/+/921502/2 | 15:28 |
noonedeadpunk | and we do RC3 | 15:28 |
noonedeadpunk | regarding cinder/lvm - were you able to find why it's in fact borked? | 15:33 |
noonedeadpunk | or still checking on that? | 15:33 |
jrosser | it is because the initiator id is not set | 15:35 |
jrosser | so pretty simple | 15:35 |
noonedeadpunk | so we should do some lineinfile or smth? | 15:35 |
jrosser | perhaps | 15:37 |
jrosser | i think this is what i am not sure of | 15:37 |
jrosser | from the charms patch `Cloud images including MAAS ones have "GenerateName=yes" instead of "InitiatorName=... on purpose not to clone the initiator name.` | 15:38 |
jrosser | and on debian/ubuntu there is a script run as part of starting the iscsid service to ensure that the ID is generated, if needed | 15:38 |
jrosser | but i can't see anything like this on centos/rocky | 15:39 |
jrosser | NeilHanlon: ^ ? | 15:39 |
jrosser | tbh i have never set up iscsi myself so i don't know where responsibility lies for creating the ID in a real deployment | 15:40 |
jrosser | so this might be a CI specific thing | 15:41 |
noonedeadpunk | though, ppl would expect it to work.... | 15:41 |
noonedeadpunk | I bet I've seen bugs | 15:41 |
noonedeadpunk | #link https://bugs.launchpad.net/openstack-ansible/+bug/1933279 | 15:42 |
noonedeadpunk | but there was more.... | 15:42 |
jrosser | oh actually `service iscsi start` is enough to generate the initiator name on rocky | 15:43 |
jrosser | so maybe this is just what we need to do for LVM | 15:43 |
noonedeadpunk | on side of ... cinder-volume I assume? | 15:44 |
jrosser | yeah | 15:44 |
jrosser | in here i guess https://github.com/openstack/openstack-ansible-os_cinder/blob/master/tasks/cinder_lvm_config.yml | 15:45 |
jrosser | ok i will make a patch for this | 15:46 |
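A minimal sketch of what such a task could look like in cinder_lvm_config.yml; the service name and condition are assumptions drawn from this discussion, not the merged patch:

```yaml
# On EL systems (e.g. Rocky), starting the iscsi unit once generates an
# InitiatorName in /etc/iscsi/initiatorname.iscsi.
- name: Ensure the iscsi service has started to generate an initiator name
  ansible.builtin.service:
    name: iscsi
    state: started
    enabled: true
  when: ansible_facts['os_family'] == 'RedHat'
```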
noonedeadpunk | jrosser: we have that: https://opendev.org/openstack/openstack-ansible-os_cinder/src/branch/master/tasks/cinder_post_install.yml#L150-L158 | 15:48 |
noonedeadpunk | so probably it's wrong or not enough now... | 15:48 |
jrosser | not quite https://opendev.org/openstack/openstack-ansible-os_cinder/src/branch/master/vars/debian.yml#L21 | 15:48 |
noonedeadpunk | as these are just lioadm/tgtadm which is different | 15:48 |
noonedeadpunk | but potentially having another service started somewhere nearby might make sense... | 15:49 |
jrosser | i think iscsid is for persistent config, and probably cinder makes exported volumes as needed on the fly | 15:50 |
NeilHanlon | hm. i'm not really sure why that would happen or be the case.. i don't really use that much iscsi myself, either | 15:53 |
NeilHanlon | actually, there's `iscsi-iname` from iscsi-initiator-utils -- perhaps this is what is needed | 15:58 |
NeilHanlon | >iscsi-iname generates a unique iSCSI node name on every invocation. | 15:58 |
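Combining `iscsi-iname` with the earlier lineinfile idea, a hedged sketch of generating and persisting the name explicitly; the paths and commands come from this discussion, the task flow itself is an assumption:

```yaml
# Only rewrite the file when no concrete InitiatorName is set, since
# iscsi-iname produces a new unique name on every invocation.
- name: Check whether an InitiatorName is already configured
  ansible.builtin.command: grep -q '^InitiatorName=' /etc/iscsi/initiatorname.iscsi
  register: _initiator_set
  failed_when: false
  changed_when: false

- name: Generate a unique iSCSI initiator name
  ansible.builtin.command: iscsi-iname
  register: _iscsi_iname
  changed_when: false
  when: _initiator_set.rc != 0

- name: Persist the generated InitiatorName
  ansible.builtin.copy:
    dest: /etc/iscsi/initiatorname.iscsi
    content: "InitiatorName={{ _iscsi_iname.stdout }}\n"
    mode: "0600"
  when: _initiator_set.rc != 0
```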
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Collect iscsid config for CI jobs https://review.opendev.org/c/openstack/openstack-ansible/+/921778 | 15:59 |
noonedeadpunk | makes sense | 16:00 |
noonedeadpunk | #endmeeting | 16:00 |
opendevmeet | Meeting ended Tue Jun 11 16:00:21 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:00 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-06-11-15.02.html | 16:00 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-06-11-15.02.txt | 16:00 |
opendevmeet | Log: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-06-11-15.02.log.html | 16:00 |
jrosser | fun for ansible 2.17 - nearly full house on failed jobs https://review.opendev.org/c/openstack/openstack-ansible/+/921735?tab=change-view-tab-header-zuul-results-summary | 16:04 |
noonedeadpunk | huh, failing for apt package pinning | 16:08 |
noonedeadpunk | sounds fun | 16:08 |
opendevreview | Mohammadreza Barkhordari proposed openstack/openstack-ansible-os_neutron master: add float_ip and gateway_ip QoS support https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/921779 | 16:09 |
jrosser | well hmm thats interesting https://review.opendev.org/c/openstack/openstack-ansible-ops/+/921767 | 16:50 |
noonedeadpunk | so just disabling qmanager... | 16:50 |
noonedeadpunk | or well. | 16:51 |
noonedeadpunk | and service restart | 16:51 |
jrosser | yes both | 16:52 |
noonedeadpunk | but wait | 16:54 |
noonedeadpunk | it was failing with the oslo_concurrency patch? | 16:54 |
noonedeadpunk | or it was never tried with it | 16:55 |
noonedeadpunk | but then it's likely the service restart that helped | 16:55 |
noonedeadpunk | as then https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921690 would pass | 16:55 |
jrosser | well https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921690 failed, so oslo_concurrency on its own is not enough | 17:12 |
noonedeadpunk | yep | 17:12 |
jrosser | but there are definitely lock related errors without that, till we merge the disabling | 17:13 |
jrosser | worst kind of bug - i have pretty much no clue what is happening :) | 17:13 |
noonedeadpunk | yeah, the code explicitly requires oslo.concurrency to be in place | 17:18 |
noonedeadpunk | so that is the correct thing to do | 17:18 |
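For context, the kind of deployer-side override that would pin a lock path for magnum; the variable name follows OSA's usual `<service>_<file>_overrides` pattern and is an assumption to check against the os_magnum role:

```yaml
# Assumed override variable; renders [oslo_concurrency]/lock_path
# into magnum.conf.
magnum_magnum_conf_overrides:
  oslo_concurrency:
    lock_path: "/var/lock/magnum"
```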
opendevreview | Merged openstack/openstack-ansible master: Revert "Enable oslomsg_rabbit_queue_manager by default" https://review.opendev.org/c/openstack/openstack-ansible/+/921726 | 18:17 |
noonedeadpunk | damn cherry-pick failed on ceph | 18:30 |
noonedeadpunk | but let's land it asap pretty much | 18:40 |
jrosser | grrr MODULE FAILURE there - just bad luck | 18:41 |
opendevreview | Merged openstack/openstack-ansible master: Grammar and OS corrections https://review.opendev.org/c/openstack/openstack-ansible/+/921758 | 19:11 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2024.1: Grammar and OS corrections https://review.opendev.org/c/openstack/openstack-ansible/+/921795 | 19:25 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone master: Define oslo_messaging_rabbit section if either RPC or Notifications are enabled https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/921796 | 19:37 |
mgariepy | https://github.com/openstack/openstack-ansible-ceph_client/blob/master/tasks/ceph_get_keyrings_from_mons.yml#L23-L25 | 19:43 |
mgariepy | how is it supposed to work? i did setup-hosts, infra and it fails at this when installing glance and cinder and nova.. | 19:43 |
noonedeadpunk | you provision ceph with some external tool? like cephadm? | 19:44 |
mgariepy | AIO | 19:45 |
mgariepy | bootstrap-aio.sh then the playbook. | 19:45 |
mgariepy | hmm. might be missing a step then ;(.. | 19:46 |
mgariepy | like: SCENARIO=aio_lxc_infra_ceph ./scripts/bootstrap-aio.sh | 19:48 |
mgariepy | hmm .. | 19:54 |
noonedeadpunk | that should kinda work.... | 19:55 |
noonedeadpunk | unless you're playing with latest ceph-ansible | 19:55 |
noonedeadpunk | as it dropped creation of clients for openstack | 19:55 |
mgariepy | checked out 2024.1 | 19:56 |
noonedeadpunk | hm... | 19:56 |
mgariepy | stable-7.0 | 19:56 |
mgariepy | 69a990392a94b59e1404eaeae7d6dfb5217ca71c | 19:57 |
noonedeadpunk | yeah, that should work... | 19:57 |
noonedeadpunk | and you do have client.cinder and client.glance in ceph on mon? | 19:57 |
mgariepy | nop | 19:57 |
mgariepy | no pool also. | 19:57 |
noonedeadpunk | then it feels like ceph-ansible failed | 19:57 |
noonedeadpunk | and somehow proceeded onwards | 19:57 |
noonedeadpunk | or it was skipped.... | 19:58 |
noonedeadpunk | or infra doesn't work nicely with ceph scenarios.... | 19:58 |
mgariepy | yep it failed. | 20:01 |
mgariepy | fun | 20:01 |
mgariepy | i had some issue with the ebtables modules, had to install the hwe kernel in the vm and then reboot. probably had something to do with the loop device.. | 20:02 |
mgariepy | have a nice evening. | 20:07 |