*** ysandeep|out is now known as ysandeep | 00:51 | |
opendevreview | Kevin Carter proposed openstack/openstack-ansible-rabbitmq_server master: Update rabbitmq to 3.10.7-1 https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/855985 | 01:10 |
opendevreview | Kevin Carter proposed openstack/openstack-ansible-os_nova master: Remove heartbeat_in_pthread functionality https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/855991 | 02:21 |
opendevreview | Kevin Carter proposed openstack/openstack-ansible-os_cinder master: Remove heartbeat_in_pthread functionality https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/855992 | 02:23 |
opendevreview | Kevin Carter proposed openstack/openstack-ansible-os_neutron master: Remove heartbeat_in_pthread functionality https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/855993 | 02:25 |
opendevreview | Kevin Carter proposed openstack/openstack-ansible-rabbitmq_server master: Update the heartbeat and handshake timeout https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/855996 | 02:50 |
opendevreview | Kevin Carter proposed openstack/openstack-ansible-rabbitmq_server master: Update the heartbeat and handshake timeout https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/855996 | 02:51 |
*** ysandeep is now known as ysandeep|afk | 03:36 | |
opendevreview | OpenStack Proposal Bot proposed openstack/openstack-ansible master: Imported Translations from Zanata https://review.opendev.org/c/openstack/openstack-ansible/+/856012 | 04:30 |
*** ysandeep|afk is now known as ysandeep | 05:29 | |
anskiy | jamesdenton: no, it still depends on it, this thing just prevents explicit enable/start and configuration for openvswitch | 07:10 |
noonedeadpunk | sorry cloudnull for being harsh on your PRs :p | 07:24 |
opendevreview | Merged openstack/openstack-ansible master: Imported Translations from Zanata https://review.opendev.org/c/openstack/openstack-ansible/+/856012 | 07:47 |
opendevreview | Ebbex proposed openstack/openstack-ansible-os_ironic master: Bind http and tftp services to the bmaas network https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/852122 | 08:03 |
opendevreview | Ebbex proposed openstack/openstack-ansible-os_ironic master: Ensure ironic inspector dhcp server listen address is defined https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/852173 | 08:03 |
*** ysandeep is now known as ysandeep|afk | 10:03 | |
*** ysandeep|afk is now known as ysandeep | 10:40 | |
opendevreview | Merged openstack/openstack-ansible-os_neutron master: Convert include to include_tasks https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/855803 | 11:02 |
*** dviroel|out is now known as dviroel | 11:19 | |
cloudnull | noonedeadpunk no worries. I’m just a drive-by committer these days 😉 | 12:41 |
cloudnull | On https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/855992 I left a comment about the error I was seeing in the rabbit logs. Gone after I changed the pthread setting for cinder, nova and neutron (the only three services that use it). I’ll leave the PRs up but feel free to abandon them if they’re not a good fit. I can work | 12:43 |
cloudnull | around it by using the existing options. | 12:43 |
noonedeadpunk | eventually, uwsgi will work suuuuper weirdly if you enable heartbeat_in_pthread | 12:45 |
noonedeadpunk | sorry, everything except uwsgi | 12:46 |
noonedeadpunk | uwsgi is the only thing that works with this thing | 12:47 |
cloudnull | Should that setting be applied to all the Oslo messaging services ? | 12:47 |
cloudnull | Oh. So it’s just uwsgi | 12:47 |
noonedeadpunk | Though, if you check the ML thread I've shown, people agreed there that these errors are harmless and the only disadvantage is disabling heartbeat_in_pthread by default and backporting stuff to stable branches | 12:47 |
noonedeadpunk | But for non-uwsgi services it's a disaster, as they could get stuck or stop responding | 12:48 |
cloudnull | Yeah they’re harmless. But they seem to correspond to spikes in CPU when they pop. It’s like 10% so nothing major. But that’s the reason I was even looking. | 12:48 |
cloudnull | I’ll abandon those prs for now. | 12:49 |
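For context on the pthread discussion above: `heartbeat_in_pthread` lives in the `[oslo_messaging_rabbit]` section of each service's config, so an operator can pin it per service from `user_variables.yml` without patching the roles. A minimal sketch, assuming the override variable names follow the standard `<service>_<file>_conf_overrides` pattern of the OSA roles:

```yaml
# user_variables.yml - rough sketch; the override variable names are assumed
# to follow the usual <service>_<file>_conf_overrides pattern of the OSA roles
cinder_cinder_conf_overrides:
  oslo_messaging_rabbit:
    heartbeat_in_pthread: false

nova_nova_conf_overrides:
  oslo_messaging_rabbit:
    heartbeat_in_pthread: false

neutron_neutron_conf_overrides:
  oslo_messaging_rabbit:
    heartbeat_in_pthread: false
```

As noted in the discussion, the option only behaves well for services running under uwsgi, so keeping it disabled for agents and other standalone services is the safer default.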
cloudnull | On a total aside, project works great with Debian 11. Great work on all that. | 12:51 |
noonedeadpunk | at least now we know it works lol | 12:52 |
noonedeadpunk | cloudnull: are you using ceph? As this is something that might not work as expected | 12:52 |
noonedeadpunk | or well, we just don't test it, so I don't really know | 12:53 |
cloudnull | no I don't run cephalopods | 12:56 |
cloudnull | ** ceph | 12:57 |
cloudnull | stupid auto-correct | 12:57 |
cloudnull | I run ZFS + NFS | 12:57 |
cloudnull | jrosser_ got me on the ZFS kick, and I never looked back 🙂 | 12:58 |
cloudnull | debian 11 is working perfectly. I need to push a change to the docs for base setup but otherwise it's flawless. | 13:00 |
kleini- | ZFS is really nice, although I only use it as root filesystem in Ubuntu | 13:17 |
kleini- | because Ceph cluster is provided and not maintained by me | 13:17 |
*** kleini- is now known as kleini | 13:17 | |
* noonedeadpunk still moves out from zfs+nfs to ceph | 13:42 | |
noonedeadpunk | the nasty thing about zfs+nfs is the nfs bit, which cannot recover from any network interruption | 13:43 |
*** ysandeep is now known as ysandeep|afk | 14:32 | |
NeilHanlon | o/ meeting today? | 15:02 |
noonedeadpunk | ah. it's timer | 15:03 |
noonedeadpunk | thanks) | 15:03 |
noonedeadpunk | #startmeeting openstack_ansible_meeting | 15:03 |
opendevmeet | Meeting started Tue Sep 6 15:03:27 2022 UTC and is due to finish in 60 minutes. The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:03 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:03 |
opendevmeet | The meeting name has been set to 'openstack_ansible_meeting' | 15:03 |
*** NeilHanlon is now known as alarmclock | 15:03 | |
noonedeadpunk | #topic rollcall | 15:03 |
alarmclock | o/ | 15:03 |
noonedeadpunk | o/ | 15:03 |
noonedeadpunk | lol | 15:03 |
*** alarmclock is now known as NeilHanlon | 15:03 | |
noonedeadpunk | #topic office hours | 15:12 |
noonedeadpunk | well, it seems it will be relatively quiet at the meeting today | 15:12 |
noonedeadpunk | NeilHanlon: regarding Rocky 9 - do you need any help with adding that? | 15:12 |
damiandabrowski | hello (sorry for being late) | 15:15 |
NeilHanlon | noonedeadpunk: not right now, i just need to get back to it. hoping i have some free time this week to try and get the changes in. i worked with mgariepy a bit ago to try and get the necessary centos-release-* packages into the Rocky base repos which will eliminate at least one change i have | 15:16 |
noonedeadpunk | ok, great | 15:17 |
noonedeadpunk | as we're getting closer and closer to the Zed release date, we need at least to understand where we are wrt distro support | 15:17 |
noonedeadpunk | I had an idea about how to optimize another chunk of our code, the part that handles openstack resource creation. | 15:18 |
mgariepy | hello :D | 15:18 |
noonedeadpunk | As a simple example - we have several places where we upload images - magnum, octavia, tempest, trove | 15:19 |
noonedeadpunk | So it sounds like we could create a role that will be in charge of all resource creation, and place it in plugins (and maybe in ansible-collections-openstack) in the future | 15:19 |
mgariepy | i wasn't me for the package in the repos :) | 15:19 |
mgariepy | it* | 15:19 |
NeilHanlon | sorry.. tab completion.. mnasiadka -- i think they do Kolla things :) | 15:20 |
noonedeadpunk | And basically what brings me to that - https://review.opendev.org/c/openstack/openstack-ansible/+/854235 | 15:20 |
mgariepy | no worries i just wanted to set the record straight :D haha | 15:21 |
noonedeadpunk | I can't tell what specifically I don't like in this patch, it just looks a bit off from what we're doing everywhere | 15:21 |
noonedeadpunk | this might also be a good step towards dropping resource creation from the os_tempest role | 15:22 |
rajesh_gunasekaran | Hey guys, I want to contribute, could you please help me with picking out some task to work on? | 15:22 |
noonedeadpunk | I was told that andrewbonney does have a big chunk of this and may advise on how we want to make the structure | 15:24 |
noonedeadpunk | rajesh_gunasekaran: how familiar are you with OSA? | 15:24 |
anskiy | noonedeadpunk: that's a nice step, but it's too restricted, I guess. But anyways, you can't cover everything, so as much as I would wonder about dropping my own playbook for resource creation, I don't see it happening :( | 15:24 |
damiandabrowski | noonedeadpunk: one thing concerns me: to create a resource you just need to execute one ansible module | 15:24 |
damiandabrowski | i wonder how including an external role can simplify that even more :D | 15:25 |
noonedeadpunk | damiandabrowski: it's not always the case though? Like for image lifecycle management you would need quite a complicated flow | 15:25 |
anskiy | It could be done via that ansible thing where you dynamically define which module to use | 15:25 |
noonedeadpunk | Same with creating a usable network - you need to create a net, subnet, l3 router, and wire things up | 15:26 |
noonedeadpunk | With images - you might want to download them from a remote server, unpack, convert, etc | 15:27 |
noonedeadpunk | I'm looking at octavia code now https://opendev.org/openstack/openstack-ansible-os_octavia/src/branch/master/tasks/octavia_amp_image.yml | 15:27 |
noonedeadpunk | it's not _that_ trivial given you want to drop old images wherever you can | 15:27 |
noonedeadpunk | Also for some roles, you might need to create both networks and images | 15:28 |
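To make the proposed resource-creation role a bit more concrete, most of it would be thin wrappers around the openstack.cloud collection. A sketch of the kind of tasks it would centralise - all names and variables here are illustrative, and module parameter names can differ between collection releases:

```yaml
# tasks/main.yml of a hypothetical "openstack_resources" role - sketch only
- name: Upload service images (e.g. amphora, trove or magnum images)
  openstack.cloud.image:
    cloud: default
    name: "{{ item.name }}"
    filename: "{{ item.file }}"
    disk_format: qcow2
    container_format: bare
    state: present
  loop: "{{ openstack_resources_images | default([]) }}"

- name: Create a usable network (net + subnet + L3 router)
  openstack.cloud.network:
    cloud: default
    name: demo-net
    state: present

- name: Create the subnet
  openstack.cloud.subnet:
    cloud: default
    name: demo-subnet
    network_name: demo-net
    cidr: 192.0.2.0/24
    state: present

- name: Wire the subnet to a router with an external gateway
  openstack.cloud.router:
    cloud: default
    name: demo-router
    network: public        # assumed external network name
    interfaces:
      - demo-subnet
    state: present
```

The download/unpack/convert and old-image rotation steps mentioned above (as in the octavia amp image tasks) would still need extra logic around this, which is where most of the real complexity sits.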
damiandabrowski | for images - i agree... (full message at https://matrix.org/_matrix/media/r0/download/matrix.org/iGYFmZWPnlHbHpADfrfTmAIG) | 15:29 |
*** dviroel is now known as dviroel|lunch | 15:29 | |
anskiy | noonedeadpunk: the problem with that role is that AFAIK the ansible module for images can't use glance's `image-create-via-import`, to which you can just simply pass a URL | 15:30 |
rajesh_gunasekaran | @noonedeadpunk OSA - doesn't ring a bell | 15:30 |
damiandabrowski | OSA - openstack-ansible | 15:31 |
noonedeadpunk | anskiy: well, you also need to configure glance properly for that and have storage where images will be placed locally and served through a web server | 15:32 |
noonedeadpunk | and even then it won't solve image rotation, as an example | 15:32 |
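On the `image-create-via-import` point: glance's interoperable image import (including the web-download method that accepts a URL) is gated by the `enabled_import_methods` option in glance-api.conf, so in an OSA deployment it could in principle be enabled with an override rather than role changes. A sketch, assuming the usual glance override hook exists under this name:

```yaml
# user_variables.yml - sketch; glance_glance_api_conf_overrides is assumed to
# be the os_glance role's override hook for glance-api.conf
glance_glance_api_conf_overrides:
  DEFAULT:
    enabled_import_methods: "[web-download,glance-direct]"
```

As noted above, the source images still need to be hosted somewhere reachable over HTTP, and this does not address rotating old images.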
NeilHanlon | you definitely don't wanna see how I did this in my lab... | 15:33 |
NeilHanlon | and it doesn't solve rotation anyways 😂 | 15:33 |
rajesh_gunasekaran | ah okay! I would say I am a newcomer with basic knowledge of openstack-ansible | 15:33 |
noonedeadpunk | rajesh_gunasekaran: then I would say you can help out with ensuring that AIO does survive reboots :) | 15:34 |
noonedeadpunk | firstly, an aio is a good way to get familiar with osa, and secondly, I can recall a bug report that was claiming it's not working | 15:35 |
noonedeadpunk | ok, so as an overall result it feels like the idea of a role for creating resources is useless | 15:36 |
rajesh_gunasekaran | okay sure! will follow as per your guidance | 15:36 |
damiandabrowski | noonedeadpunk: maybe we want to create a role just for image management and lifecycle management? that would be useful for sure | 15:36 |
noonedeadpunk | rajesh_gunasekaran: ie https://bugs.launchpad.net/openstack-ansible/+bug/1819792 and https://bugs.launchpad.net/openstack-ansible/+bug/1819790 | 15:37 |
noonedeadpunk | rajesh_gunasekaran: I know they're quite old, but it would be great to ensure that things are working fine | 15:37 |
noonedeadpunk | damiandabrowski: well, what I wanted to achieve is not to handle resource creation at all in any of the os_<service> roles | 15:38 |
noonedeadpunk | and manage and maintain that separately | 15:38 |
noonedeadpunk | so that when ansible-collection-openstack updates or changes, we won't need to search for where we use it and what we use out of it | 15:39 |
noonedeadpunk | rajesh_gunasekaran: you can find aio docs here: https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html | 15:40 |
rajesh_gunasekaran | @noonedeadpunk Thank you, will go through the docs | 15:41 |
anskiy | maybe the ability to just plug in user-defined playbooks at various stages (like pre-setup-hosts, post-setup-openstack...) could help?.. | 15:41 |
noonedeadpunk | and then if we need to create or add some resource we could just run a playbook rather than re-run some os_<service> role | 15:41 |
noonedeadpunk | anskiy: that is tricky. We do have that for adding a new compute, but it's done through a bash script | 15:42 |
noonedeadpunk | I _think_ we have that also for upgrade script... | 15:43 |
*** ysandeep|afk is now known as ysandeep | 15:43 | |
noonedeadpunk | I meant that https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/add-compute.sh#L22 | 15:44 |
anskiy | well, there is another downside: it's a shorter path to get things done and not contribute :) | 15:44 |
noonedeadpunk | well... I'd say that if we're talking about resources, you might want/need to run their creation after each role rather than pre-openstack and post-openstack | 15:45 |
noonedeadpunk | but not sure.... | 15:46 |
noonedeadpunk | Also I kind of run that full setup only in a sandbox lol | 15:46 |
noonedeadpunk | the rest would be just a single playbook | 15:46 |
noonedeadpunk | anyway | 15:46 |
noonedeadpunk | as it's quite arguable, I won't push it really. It's totally not top prio. | 15:48 |
noonedeadpunk | But thanks for the feedback! | 15:48 |
noonedeadpunk | Is there anything else we want to quickly discuss? | 15:48 |
* noonedeadpunk needs to run in 5 mins | 15:48 | |
anskiy | https://bugs.launchpad.net/cloud-archive/+bug/1988270 nothing's going on in here :( | 15:49 |
noonedeadpunk | ¯\(◉◡◔)/¯ | 15:49 |
*** ysandeep is now known as ysandeep|out | 15:49 | |
anskiy | noonedeadpunk: you've said you can reach the UCA guys somehow, or did I misunderstand? | 15:49 |
noonedeadpunk | nah, the idea was that we need to reach them via irc but I could not find any. Though yes, I can likely ping the charms folks, and hopefully they know who should take care of that | 15:50 |
noonedeadpunk | I will try to take care of that tomorrow morning | 15:51 |
anskiy | thank you :) | 15:52 |
noonedeadpunk | #endmeeting | 15:52 |
opendevmeet | Meeting ended Tue Sep 6 15:52:35 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:52 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-09-06-15.03.html | 15:52 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-09-06-15.03.txt | 15:52 |
opendevmeet | Log: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-09-06-15.03.log.html | 15:52 |
prometheanfire | I seem to get rabbitmq timeouts all the time, fairly standard OSA deployment, not sure why this happens though | 16:22 |
*** dviroel|lunch is now known as dviroel | 16:26 | |
cloudnull | prometheanfire++ I'm seeing it happen every couple of min. | 17:17 |
prometheanfire | cloudnull: ltns :D | 17:46 |
prometheanfire | might actually be something to do with ceph being borked | 17:50 |
noonedeadpunk | well, some rabbit timeouts can be fairly harmless... | 18:13 |
noonedeadpunk | jamesdenton: hey! Are you around?:) | 18:42 |
jamesdenton | noonedeadpunk you rang? | 18:47 |
noonedeadpunk | yeah! need help :) | 18:49 |
jamesdenton | what's up? | 18:50 |
noonedeadpunk | I ran the neutron playbook yesterday with tags. As a result, the ovs-agent was reconfigured with the br-mgmt local_ip instead of br-vxlan | 18:50 |
jamesdenton | oops | 18:50 |
noonedeadpunk | This is now reverted, but some networks randomly do not work | 18:50 |
jamesdenton | hmm ok | 18:51 |
jamesdenton | ovs/ linuxbridge? | 18:51 |
jamesdenton | ovs? linuxbridge? | 18:51 |
noonedeadpunk | So for `ovs-vsctl show` I see in ports https://paste.openstack.org/show/bu3ti1r03tuyyb65m5SR/ | 18:51 |
noonedeadpunk | ovs | 18:51 |
jamesdenton | ok | 18:52 |
noonedeadpunk | So network should be 10.149.20.0/24 and as you see remote_ip="10.149.8.52" | 18:52 |
jamesdenton | right, ok | 18:52 |
noonedeadpunk | Though, I'm not actually sure if I should patch it | 18:52 |
noonedeadpunk | as it seems each pair of computes should only have 1 vxlan port between them | 18:53 |
jamesdenton | i assume this is impacting vxlan networks between some of the hosts? | 18:53 |
jamesdenton | yeah, it's a mesh, so should be one to many | 18:53 |
noonedeadpunk | I see smth like that https://paste.openstack.org/show/bUrSBf9zwSH7BWdV7aoO/ | 18:54 |
noonedeadpunk | So should I patch vxlan-0a950835 options to fix remote_ip or try to drop it somehow? | 18:54 |
noonedeadpunk | Or actually, any thoughts? :D | 18:54 |
jamesdenton | and thats the same host on the remote end? | 18:54 |
noonedeadpunk | yes | 18:55 |
noonedeadpunk | just br-mgmt vs br-vxlan | 18:55 |
jamesdenton | right, ok. well, the flows will ultimately dictate which of those two ports to use. | 18:55 |
noonedeadpunk | neutron-ovs-agent was restarted n times | 18:55 |
jamesdenton | k | 18:55 |
jamesdenton | restarted cluster-wide? | 18:55 |
noonedeadpunk | well, on all computes at least | 18:56 |
jamesdenton | my first suggestion would be: make sure local_ip is restored on the hosts, delete the errant ports in ovs bridge, and restart the neutron ovs agent for good measure. | 18:56 |
noonedeadpunk | ok, so such port should be just deleted? | 18:56 |
jamesdenton | it can be, and should be restored (if needed) automatically. lemme bring up an ovs host real quick | 18:57 |
supamatt | make sure you restart nova-compute after | 18:57 |
noonedeadpunk | how to delete port?:) | 18:58 |
supamatt | ovs-vsctl del-port | 18:58 |
jamesdenton | ovs-vsctl del-port <bridge> <port> | 18:59 |
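Pulling the recovery advice together, a throwaway play along these lines would apply it across the computes - a rough sketch, not from the channel: the port name is just the one from the paste above (each host will have its own errant port), and the nova-compute restart follows supamatt's note:

```yaml
# fix-errant-vxlan-ports.yml - sketch against the OSA "compute_hosts" group
- hosts: compute_hosts
  gather_facts: false
  tasks:
    - name: Drop the vxlan port that was built against the br-mgmt address
      command: ovs-vsctl --if-exists del-port vxlan-0a950835

    - name: Restart the OVS agent so the tunnel mesh is rebuilt with the correct local_ip
      service:
        name: neutron-openvswitch-agent
        state: restarted

    - name: Restart nova-compute afterwards
      service:
        name: nova-compute
        state: restarted
```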
supamatt | jamesdenton: I had just switched to OVN 22.06 btw, with the latest yoga.. this patch (backported to yoga) is a godsend https://review.opendev.org/c/openstack/neutron/+/848845 . Seeing 0 dropped packets for live migration with OVN now, literally no network drops. | 19:01 |
jamesdenton | woot! | 19:01 |
supamatt | and I had been testing with a ping interval of 100ms ;D | 19:01 |
noonedeadpunk | ah, well, I thought I had to define the bridge, but it seems I don't :) | 19:01 |
jamesdenton | oh, good to know. | 19:01 |
jamesdenton | supamatt what storage are you using? | 19:02 |
supamatt | jamesdenton: shared storage, in this case purestorage | 19:02 |
jamesdenton | cool | 19:02 |
supamatt | should be the same with any shared storage solution, ie ceph | 19:02 |
noonedeadpunk | sounds like really good news | 19:03 |
jamesdenton | noonedeadpunk did that work for you? | 19:03 |
noonedeadpunk | But at moments when I do have such issues I wish I'd used LXB lol | 19:04 |
noonedeadpunk | jamesdenton: that is a super good question - I have no idea, as we can't reproduce it now with the networks we had in the region | 19:04 |
jamesdenton | oh, i see. well, it's possible that the port is just a leftover and the flow table was actually using the proper port | 19:05 |
noonedeadpunk | But random clients saw random issues, and we don't have access to their workloads obviously | 19:05 |
noonedeadpunk | and it was affecting not all networks, but only some | 19:05 |
noonedeadpunk | As an example - 2 nets were connected to the same VM, but 1 net was not working, and we could get to the VM through the other one | 19:06 |
jamesdenton | and both networks were overlay? | 19:06 |
noonedeadpunk | Both vxlans, yes | 19:06 |
noonedeadpunk | So we didn't even spot the misconfiguration right away | 19:07 |
noonedeadpunk | thanks jamesdenton and supamatt! | 19:10 |
noonedeadpunk | I just was not sure how safe it is to remove them and what's the best way to proceed | 19:10 |
jamesdenton | sure, good luck | 19:10 |
jamesdenton | yeah, i was hoping to pull up an ovs setup but i had a power failure earlier and everything was offline | 19:11 |
jamesdenton | so now i deal with the fallout of starting galera cluster | 19:11 |
jamesdenton | the output of "ovs-ofctl dump-flows br-int" or "br-vxlan" ought to give you insight into which of those vxlan ports are being used for the respective traffic. it's possible there's a mix | 19:12 |
noonedeadpunk | btw this was the root cause https://review.opendev.org/c/openstack/openstack-ansible/+/855977 | 19:13 |
noonedeadpunk | (or well, fix of the root cause) | 19:14 |
jamesdenton | ahh ok, good find | 19:15 |
noonedeadpunk | I wish it was done in a sandbox | 19:18 |
noonedeadpunk | *found | 19:18 |
jamesdenton | yeah, that's a bummer. Sorry. | 19:20 |
jamesdenton | i blame cloudnull | 19:20 |
noonedeadpunk | lol | 19:22 |
jamesdenton | anyone run into this error w/ keystone? Fernet key must be 32 url-safe base64-encoded bytes | 19:36 |
jamesdenton | well, nevermind. error was pretty accurate :D one of the keys on one of the infra nodes was just garage characters. | 19:44 |
jamesdenton | *garbage | 19:44 |
cloudnull | jamesdenton that's fair | 19:57 |
cloudnull | I'm not entirely sure what all that is, but I'm sure its my fault | 19:58 |
jamesdenton | you were there when it was made. thus, it is your fault | 19:58 |
cloudnull | fair | 19:59 |
*** dviroel is now known as dviroel|biab | 21:19 | |
*** dviroel|biab is now known as dviroel | 23:17 |