Tuesday, 2024-06-11

06:30 <noonedeadpunk> well - https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921690 still fails :(
06:31 <noonedeadpunk> and likely worth reverting qmanager enablement :(
06:31 <noonedeadpunk> release patch is not merged yet :D
06:32 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Revert "Enable oslomsg_rabbit_queue_manager by default"  https://review.opendev.org/c/openstack/openstack-ansible/+/921726
06:36 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Revert "Enable oslomsg_rabbit_queue_manager by default"  https://review.opendev.org/c/openstack/openstack-ansible/+/921726
06:42 <noonedeadpunk> or maybe we can just do that for magnum... dunno
06:44 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_magnum master: Temporary set CAPI jobs to NV  https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921727
07:09 <jrosser> oh dear /o\ capi job so broken
07:09 <jrosser> when really nothing in there changed, just stuff around it
07:31 <noonedeadpunk> in k8s world or openstack world?
07:32 <jrosser> well that is actually a great question
07:39 <jrosser> time to make another sandbox i guess
07:45 <jrosser> so i also saw that python3.12 is available in rocky 9.4
07:46 <jrosser> that would give us a route for moving the ansible version forward (use 3.12 just in the ansible runtime)
07:47 <jrosser> as we are probably a very very long time before anything RH10-ish is usable
07:49 <noonedeadpunk> well, ansible-core 2.17 still has support for 3.10?
07:49 * noonedeadpunk can't recall what comes in rocky out of the box though
07:49 <jrosser> it's py3.9 on rocky9 though
07:49 <noonedeadpunk> ah
07:49 <noonedeadpunk> yeah
07:50 <noonedeadpunk> and we also can proceed with deb822 patches now
07:50 <jrosser> it would probably be fairly simple to modify the bootstrap script to install 3.12 and use it just for the ansible venv
07:51 <noonedeadpunk> I can recall having smth like that already for EL
07:52 <noonedeadpunk> https://opendev.org/openstack/openstack-ansible/src/tag/yoga-eom/scripts/bootstrap-ansible.sh#L72
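[editor's note] The yoga-era bootstrap logic linked above could be revived along these lines — a hedged sketch only, not the eventual patch; the EL package name (`python3.12`) is an assumption, and the real bootstrap script puts the venv at `/opt/ansible-runtime` rather than under `/tmp`:

```shell
# Hypothetical bootstrap-ansible.sh fragment: prefer python3.12 for the
# Ansible runtime venv on EL9-family hosts, fall back to the system python3.
ANSIBLE_VENV_PATH="${TMPDIR:-/tmp}/ansible-runtime"   # real script: /opt/ansible-runtime
PYTHON_EXEC_PATH="$(command -v python3)"

if grep -qiE 'rocky|centos|rhel' /etc/os-release 2>/dev/null; then
    # dnf install -y python3.12   # package name on EL 9.4 is an assumption
    command -v python3.12 >/dev/null 2>&1 && PYTHON_EXEC_PATH="$(command -v python3.12)"
fi

# --without-pip keeps the sketch independent of ensurepip being packaged;
# the real bootstrap would install pip/ansible into this venv afterwards.
"${PYTHON_EXEC_PATH}" -m venv --without-pip "${ANSIBLE_VENV_PATH}"
"${ANSIBLE_VENV_PATH}/bin/python" --version
```

The point being that only the Ansible runtime venv moves to 3.12; the service venvs on target hosts keep the distro default interpreter.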
07:53 <jrosser> ahha
07:54 <noonedeadpunk> though there was some selinux thingy as well
07:54 <noonedeadpunk> when we were symlinking it... but maybe it's different
07:56 <jrosser> i think that's gone away and the selinux handling is now internal to ansible
07:59 <noonedeadpunk> yeah, probably...
08:00 <noonedeadpunk> but I wonder how it's done given that there's no binding for 3.12 apparently...
08:00 <noonedeadpunk> anyway
08:00 <noonedeadpunk> like python38 and libselinux-python3 would install things for different python versions apparently
08:56 <opendevreview> Jonathan Rosser proposed openstack/openstack-ansible master: Update ansible to 2.17  https://review.opendev.org/c/openstack/openstack-ansible/+/921735
10:17 <jrosser> noonedeadpunk: do you have any AIO around?
10:23 <noonedeadpunk> um, yeah, should be some
10:23 <jrosser> if you have a chance to try attaching a cinder volume to a vm in an AIO, it would be interesting to know if it's working|broken
10:24 <noonedeadpunk> we should be testing that in CI?
10:25 <noonedeadpunk> or you mean in lxc?
10:25 <noonedeadpunk> with ceph?
10:25 <jrosser> no, just iscsi as we set it up without ceph
10:26 <noonedeadpunk> well... we test only metal :D
10:26 <noonedeadpunk> as I guess it's not in lxc....
10:27 <noonedeadpunk> but I don't have lxc without ceph around
10:27 <noonedeadpunk> I have either metal ones or ceph ones...
10:58 <jrosser> noonedeadpunk: this is what happens https://paste.opendev.org/show/bNEQGSvkVwV9ShhN5VRp/
10:58 <noonedeadpunk> yeah... seen that in glance when it tried to create an image from a volume
10:59 <noonedeadpunk> on lxc
10:59 <noonedeadpunk> I guess we also have a couple of bugs with smth similar...
11:00 <noonedeadpunk> though I couldn't quickly understand what's wrong there
11:00 <jrosser> i think it is this https://bugs.launchpad.net/charm-cinder/+bug/1825809
11:00 <jrosser> *same thing as...
11:01 <noonedeadpunk> iscsid is stopped?
11:01 <noonedeadpunk> or well, like there's no uuid or smth...
11:03 <jrosser> yes it is stopped
11:03 <jrosser> root@aio1:/home/ubuntu# cat /etc/iscsi/initiatorname.iscsi
11:03 <jrosser> GenerateName=yes
12:29 <opendevreview> Amy Marrich proposed openstack/openstack-ansible master: Grammar and OS corrections  https://review.opendev.org/c/openstack/openstack-ansible/+/921758
13:26 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Add sshpass_prompt to ssh connection options  https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/921761
14:22 <opendevreview> Jonathan Rosser proposed openstack/openstack-ansible-ops master: Restart magnum service after deployment  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/921767
14:24 <jrosser> hmm i am wondering if magnum somehow leaves its db connection messed up after install / db migrate without a restart
14:25 <noonedeadpunk> that would be surprising, frankly speaking
14:25 <jrosser> because i can reproduce the sql connection exception here locally
14:25 <noonedeadpunk> as <service>-manage should create their own
14:25 <jrosser> and i apply the qmanager patch from yesterday with the cluster stuck half created
14:25 <jrosser> restart the service
14:26 <jrosser> and bingo, it completes almost instantly
14:26 <noonedeadpunk> huh
14:26 <jrosser> the qmanager patch certainly fixes a sherlock stacktrace
14:27 <jrosser> and then once i have done that once, i can delete/create clusters just fine
14:28 <jrosser> obviously i am conflating applying the qmanager patch here, but i think until we disable that globally there is a legitimate bug in magnum setup
14:30 <opendevreview> Jonathan Rosser proposed openstack/openstack-ansible-ops master: Restart magnum service after deployment  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/921767
14:30 <noonedeadpunk> I've already proposed disablement...
14:31 <noonedeadpunk> but dunno if we should do that or patch magnum first
14:31 <noonedeadpunk> eventually, we need to do both
14:31 <jrosser> ^ i just adjusted that (bad) restart magnum patch to depend on the qmanager disablement
14:31 <jrosser> so we see if that's enough
14:33 <noonedeadpunk> I think you can define `handlers` in a playbook
14:33 <noonedeadpunk> but not sure when to trigger
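[editor's note] Playbook-level handlers as suggested here would look roughly like the sketch below — hedged: the hosts pattern, service names, and the notifying task are illustrative assumptions, not the actual ops patch:

```yaml
# Hypothetical playbook fragment: restart magnum only when something
# actually changed, via a play-level handler.
- hosts: magnum_all
  tasks:
    - name: Drop magnum config override
      copy:
        src: magnum-override.conf          # placeholder source file
        dest: /etc/magnum/magnum.conf.d/override.conf
      notify: Restart magnum services

  handlers:
    - name: Restart magnum services
      service:
        name: "{{ item }}"
        state: restarted
      loop:
        - magnum-api
        - magnum-conductor
```

Handlers fire at the end of the play, which matches the "restart after db migrate" ordering being discussed.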
14:35 <jrosser> i think what i was getting at before is there is a gross exception when magnum starts up with no db present https://zuul.opendev.org/t/openstack/build/893c3205310d4275a3aa2141d2123763/log/logs/openstack/aio1-magnum-container-8af236e8/magnum-conductor.service.journal-18-11-56.log.txt#350
14:36 <jrosser> then, kind of behind the back of that, we use magnum-manage or whatever to initialise the db
14:37 <jrosser> and then without restarting the service after that we end up at another db related exception https://zuul.opendev.org/t/openstack/build/893c3205310d4275a3aa2141d2123763/log/logs/openstack/aio1-magnum-container-8af236e8/magnum-conductor.service.journal-18-11-56.log.txt#2026
14:40 <jrosser> actually maybe it is restarted
14:41 <noonedeadpunk> it should be restarted at the very end of the role
14:41 <noonedeadpunk> with handlers...
14:50 <mgariepy> anyone user unified limits here ?
14:50 <mgariepy> uses ** lol
14:51 <noonedeadpunk> nah, not yet
14:51 <noonedeadpunk> there's actually some homework for us to do regarding them
14:52 <noonedeadpunk> as they'd need system scopes still
14:52 <noonedeadpunk> and some other oslo thingy....
14:52 <mgariepy> i'd like to use them to manage quota on gpus
14:52 <noonedeadpunk> bauzas had a nice presentation in France about how exactly to configure them so they work.
14:52 <noonedeadpunk> So you can ask him for slides maybe :D
14:52 <noonedeadpunk> (they could be in french though)
14:53 <mgariepy> i do speak french ;p haha
14:57 <mgariepy> https://github.com/sbauza/sbauza.github.io/tree/master/2024/05/22
14:58 <noonedeadpunk> ok, amazing, thanks!
14:59 <mgariepy> nice
14:59 <mgariepy> was linked on linkedin :)
14:59 <noonedeadpunk> ah :D
15:00 <mgariepy> basic osint skill here ahha
15:00 <mgariepy> it does confirm my idea tho.
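[editor's note] For GPU quotas via unified limits, the keystone-backed workflow is roughly as sketched below — hedged: the resource class name and values are illustrative assumptions, and nova additionally needs `[quota] driver = nova.quota.UnifiedLimitsDriver` plus the system-scope "homework" mentioned above:

```shell
# Hypothetical example: a cloud-wide registered (default) limit, then a
# per-project override, for instances of flavors consuming a custom GPU
# resource class.
openstack registered limit create --service nova \
    --default-limit 2 class:CUSTOM_GPU

openstack limit create --service nova --project demo \
    --resource-limit 4 class:CUSTOM_GPU
```

Registered limits set the default ceiling; per-project `limit create` entries override it where needed.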
15:02 <noonedeadpunk> #startmeeting openstack_ansible_meeting
15:02 <opendevmeet> Meeting started Tue Jun 11 15:02:27 2024 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:02 <opendevmeet> The meeting name has been set to 'openstack_ansible_meeting'
15:02 <noonedeadpunk> #topic rollcall
15:02 <noonedeadpunk> o/
15:02 <damiandabrowski> hi! good to be back after 2 weeks of absence \o/
15:02 <hamburgler> o/
15:03 <jrosser> o/ hello
15:04 <mgariepy> hey
15:04 <NeilHanlon> o/
15:06 <noonedeadpunk> #topic office hours
15:06 <noonedeadpunk> so. we currently have a bug with magnum and qmanager
15:06 <noonedeadpunk> at least with magnum
15:06 <noonedeadpunk> and there are 2 ways, kinda.
15:06 <jrosser> yeah - i think this is more widespread if you look in codesearch
15:07 <noonedeadpunk> first - revert the revert of enabling it (disable qmanager)
15:07 <noonedeadpunk> technically - the final release isn't cut yet
15:08 <noonedeadpunk> so if we merge that now - the final release can contain it disabled and help avoid a mass bug
15:08 <noonedeadpunk> #link https://review.opendev.org/c/openstack/openstack-ansible/+/921726
15:09 <noonedeadpunk> and then we can take time to do patches like https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921690 and backport them
15:09 <noonedeadpunk> (or not)
15:10 <noonedeadpunk> and I also proposed to set jobs to NV: https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921727
15:10 <jrosser> i think it would be ok to backport those with the qmanager disabled
15:10 <jrosser> then we can opt in to test it easily, which is kind of how we ended up in trouble - by not really having a testing window for it
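[editor's note] The opt-in after the revert would presumably be a one-line override in `user_variables.yml` — the variable name is taken from the revert patch title; treat everything else as an assumption:

```yaml
# Hypothetical user_variables.yml override: re-enable the queue manager
# for testing once the default is reverted.
oslomsg_rabbit_queue_manager: True
```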
15:11 <noonedeadpunk> yeah
15:12 <jrosser> what to think about is when we can settle on a new set of defaults and remove a lot of the complexity for switching queue types
15:12 <noonedeadpunk> let me backport right away
15:12 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2024.1: Revert "Enable oslomsg_rabbit_queue_manager by default"  https://review.opendev.org/c/openstack/openstack-ansible/+/921775
15:12 <jrosser> making preparation for rabbitmq4
15:12 <noonedeadpunk> so, ha queues are dropped with rabbitmq 4.0
15:12 <noonedeadpunk> but then it could be a valid thing to just use CQv2 without HA
15:13 <jrosser> 'late 2024' for that
15:13 <jrosser> so really for D we need to be getting everything really solid for quorum queues and considering removing support for HA
15:13 <noonedeadpunk> I think we'll release 2025.1 with rabbit 3.* still...
15:13 <noonedeadpunk> or that
15:14 <noonedeadpunk> but given we keep the option for CQv2 - all the migration complexity will stay, kinda
15:14 <jrosser> it depends if we want to be allowing to handle rabbitmq3 -> 4 and HA -> quorum in the same upgrade, which might be all kinds of /o\
15:16 <noonedeadpunk> nah, I guess HA->quorum should end before 4
15:16 <noonedeadpunk> and that's why I was thinking to still have 3.* for 2025.1...
15:17 <noonedeadpunk> but then CQ<->QQ still will be there, right?
15:17 <jrosser> well it's kind of a decision to make about what we support
15:17 <jrosser> and that vs complexity
15:18 <noonedeadpunk> historically there were a bunch of deployments who signed off from HA
15:18 <noonedeadpunk> which might be still valid for quorum
15:18 <noonedeadpunk> as CQv2 is still gonna be more performant, I assume
15:20 <noonedeadpunk> dunno
15:20 <jrosser> ok - well whichever way, we have some fixing up to do
15:22 <noonedeadpunk> eventually, what we can do in 2024.2 is remove the HA policy
15:23 <noonedeadpunk> then what you're left with is either migrate to quorum, or regular CQv2 with no HA
15:23 <noonedeadpunk> this potentially opens the path to a 4.0 upgrade whenever we're confident in it
15:24 <noonedeadpunk> but yes, first we potentially have some fixing to do...
15:26 <jrosser> ok so the other thing i found today was broken cinder/lvm in the aio
15:26 <noonedeadpunk> so, are we landing 921726 and backporting right away?
15:26 <noonedeadpunk> cinder/lvm/lxc I guess?
15:26 <noonedeadpunk> yeah...
15:26 <jrosser> yes i think we merge 921726
15:28 <noonedeadpunk> ok, then I'm blocking https://review.opendev.org/c/openstack/releases/+/921502/2
15:28 <noonedeadpunk> and we do RC3
15:33 <noonedeadpunk> regarding cinder/lvm - were you able to find why it's in fact borked?
15:33 <noonedeadpunk> or still checking on that?
15:35 <jrosser> it is because the initiator id is not set
15:35 <jrosser> so pretty simple
15:35 <noonedeadpunk> so we should do some lineinfile or smth?
15:37 <jrosser> perhaps
15:37 <jrosser> i think this is what i am not sure of
15:38 <jrosser> from the charms patch: `Cloud images including MAAS ones have "GenerateName=yes" instead of "InitiatorName=... on purpose not to clone the initiator name.`
15:38 <jrosser> and on debian/ubuntu
15:39 <jrosser> there is a script run as part of starting the iscsid service to ensure that the ID is generated, if needed
15:39 <jrosser> but i can't see anything like this on centos/rocky
15:39 <jrosser> NeilHanlon: ^ ?
15:40 <jrosser> tbh i have never set up iscsi myself so i don't know where the responsibility lies for creating the ID in a real deployment
15:41 <jrosser> so this might be a CI specific thing
15:41 <noonedeadpunk> though ppl would expect it to work....
15:41 <noonedeadpunk> I bet I've seen bugs
15:42 <noonedeadpunk> #link https://bugs.launchpad.net/openstack-ansible/+bug/1933279
15:42 <noonedeadpunk> but there was more....
15:43 <jrosser> oh actually `service iscsi start` is enough to generate the initiator name on rocky
15:43 <jrosser> so maybe this is just what we need to do for LVM
15:44 <noonedeadpunk> on the side of... cinder-volume I assume?
15:44 <jrosser> yeah
15:45 <jrosser> in here i guess https://github.com/openstack/openstack-ansible-os_cinder/blob/master/tasks/cinder_lvm_config.yml
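[editor's note] A minimal sketch of what such an addition to `cinder_lvm_config.yml` might look like — hedged: the EL service names (`iscsid`/`iscsi`) are assumptions here, and a real patch would likely source them from the role's per-distro vars files:

```yaml
# Hypothetical task: start the iscsi initiator services on LVM backend
# hosts so /etc/iscsi/initiatorname.iscsi gets an InitiatorName generated.
- name: Ensure iscsi initiator services are started
  service:
    name: "{{ item }}"
    state: started
    enabled: true
  loop:
    - iscsid
    - iscsi
```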
15:46 <jrosser> ok i will make a patch for this
15:48 <noonedeadpunk> jrosser: we have that: https://opendev.org/openstack/openstack-ansible-os_cinder/src/branch/master/tasks/cinder_post_install.yml#L150-L158
15:48 <noonedeadpunk> so probably it's wrong or not enough now...
15:48 <jrosser> not quite https://opendev.org/openstack/openstack-ansible-os_cinder/src/branch/master/vars/debian.yml#L21
15:48 <noonedeadpunk> as these are just lioadm/tgtadm which is different
15:49 <noonedeadpunk> but potentially having another service started somewhere nearby might make sense...
15:50 <jrosser> i think iscsid is for persistent config, and probably cinder makes exported volumes as needed on the fly
15:53 <NeilHanlon> hm. i'm not really sure why that would happen or be the case.. i don't really use that much iscsi myself, either
15:58 <NeilHanlon> actually, there's `iscsi-iname` from iscsi-initiator-utils -- perhaps this is what is needed
15:58 <NeilHanlon> > iscsi-iname generates a unique iSCSI node name on every invocation.
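[editor's note] A hedged sketch of how `iscsi-iname` could be wired in, replacing the cloud image's `GenerateName=yes` stanza only when no real name exists yet — the file path is the standard open-iscsi one, but the restart handling and overall shape are illustrative assumptions:

```shell
# Hypothetical helper: give the host a persistent iSCSI initiator name.
# An existing InitiatorName= line is left alone so the ID stays stable
# across reruns (important: cloned images must NOT share an initiator name).
ensure_initiator_name() {
    conf="$1"
    if grep -q '^InitiatorName=' "$conf" 2>/dev/null; then
        return 0  # already provisioned, nothing to do
    fi
    # iscsi-iname ships with iscsi-initiator-utils / open-iscsi
    echo "InitiatorName=$(iscsi-iname)" > "$conf"
    # pick up the new name if iscsid is running; guarded to stay side-effect free
    command -v systemctl >/dev/null 2>&1 && systemctl try-restart iscsid || true
}

# Real usage, run as root on the cinder-volume host:
# ensure_initiator_name /etc/iscsi/initiatorname.iscsi
```

Running `service iscsi start` (as noted above for rocky) achieves the same end via the packaged init scripts; this sketch just makes the generation step explicit.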
15:59 <opendevreview> Jonathan Rosser proposed openstack/openstack-ansible master: Collect iscsid config for CI jobs  https://review.opendev.org/c/openstack/openstack-ansible/+/921778
16:00 <noonedeadpunk> makes sense
16:00 <noonedeadpunk> #endmeeting
16:00 <opendevmeet> Meeting ended Tue Jun 11 16:00:21 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
16:00 <opendevmeet> Minutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-06-11-15.02.html
16:00 <opendevmeet> Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-06-11-15.02.txt
16:00 <opendevmeet> Log:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2024/openstack_ansible_meeting.2024-06-11-15.02.log.html
16:04 <jrosser> fun for ansible 2.17 - nearly a full house of failed jobs https://review.opendev.org/c/openstack/openstack-ansible/+/921735?tab=change-view-tab-header-zuul-results-summary
16:08 <noonedeadpunk> huh, failing on apt package pinning
16:08 <noonedeadpunk> sounds fun
16:09 <opendevreview> Mohammadreza Barkhordari proposed openstack/openstack-ansible-os_neutron master: add float_ip and gateway_ip QoS support  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/921779
16:50 <jrosser> well hmm that's interesting https://review.opendev.org/c/openstack/openstack-ansible-ops/+/921767
16:50 <noonedeadpunk> so just disabling qmanager...
16:51 <noonedeadpunk> or well.
16:51 <noonedeadpunk> and service restart
16:52 <jrosser> yes, both
16:54 <noonedeadpunk> but wait
16:54 <noonedeadpunk> it was failing with the oslo_concurrency patch?
16:55 <noonedeadpunk> or it was never tried with it
16:55 <noonedeadpunk> but then it's likely the service restart that helped
16:55 <noonedeadpunk> as then https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921690 would pass
17:12 <jrosser> well https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/921690 failed, so oslo_concurrency on its own is not enough
17:12 <noonedeadpunk> yep
17:13 <jrosser> but there are definitely lock related errors without that, till we merge the disabling
17:13 <jrosser> worst kind of bug - i have pretty much no clue what is happening :)
17:18 <noonedeadpunk> yeah, the code explicitly requires oslo.concurrency to be in place
17:18 <noonedeadpunk> so that is the correct thing to do
18:17 <opendevreview> Merged openstack/openstack-ansible master: Revert "Enable oslomsg_rabbit_queue_manager by default"  https://review.opendev.org/c/openstack/openstack-ansible/+/921726
18:30 <noonedeadpunk> damn, cherry-pick failed on ceph
18:40 <noonedeadpunk> but let's land it asap pretty much
18:41 <jrosser> grrr MODULE FAILURE there - just bad luck
19:11 <opendevreview> Merged openstack/openstack-ansible master: Grammar and OS corrections  https://review.opendev.org/c/openstack/openstack-ansible/+/921758
19:25 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/2024.1: Grammar and OS corrections  https://review.opendev.org/c/openstack/openstack-ansible/+/921795
19:37 <opendevreview> Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_keystone master: Define oslo_messaging_rabbit section if either RPC or Notifications are enabled  https://review.opendev.org/c/openstack/openstack-ansible-os_keystone/+/921796
19:43 <mgariepy> https://github.com/openstack/openstack-ansible-ceph_client/blob/master/tasks/ceph_get_keyrings_from_mons.yml#L23-L25
19:43 <mgariepy> how is it supposed to work? i did setup-hosts, infra, and it fails at this when installing glance and cinder and nova..
19:44 <noonedeadpunk> you provision ceph with some external tool? like cephadm?
19:45 <mgariepy> AIO
19:45 <mgariepy> bootstrap-aio.sh then the playbook.
19:46 <mgariepy> hmm. might be missing a step then ;(..
19:48 <mgariepy> like: SCENARIO=aio_lxc_infra_ceph ./scripts/bootstrap-aio.sh
19:54 <mgariepy> hmm ..
19:55 <noonedeadpunk> that should kinda work....
19:55 <noonedeadpunk> unless you're playing with the latest ceph-ansible
19:55 <noonedeadpunk> as it dropped creation of clients for openstack
19:56 <mgariepy> checked out 2024.1
19:56 <noonedeadpunk> hm...
19:56 <mgariepy> stable-7.0
19:57 <mgariepy> 69a990392a94b59e1404eaeae7d6dfb5217ca71c
19:57 <noonedeadpunk> yeah, that should work...
19:57 <noonedeadpunk> and you do have client.cinder and client.glance in ceph on the mon?
19:57 <mgariepy> nope
19:57 <mgariepy> no pools either.
19:57 <noonedeadpunk> then it feels like ceph-ansible failed
19:57 <noonedeadpunk> and somehow proceeded onwards
19:58 <noonedeadpunk> or it was skipped....
19:58 <noonedeadpunk> or infra doesn't work nicely with ceph scenarios....
20:01 <mgariepy> yep it failed.
20:01 <mgariepy> fun
20:02 <mgariepy> i had some issues with the ebtables modules; had to install the hwe kernel in the vm and then reboot. probably had something to do with the loop device..
20:07 <mgariepy> have a nice evening.

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!