Tuesday, 2022-09-06

*** ysandeep|out is now known as ysandeep00:51
opendevreviewKevin Carter proposed openstack/openstack-ansible-rabbitmq_server master: Update rabbitmq to 3.10.7-1  https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/85598501:10
opendevreviewKevin Carter proposed openstack/openstack-ansible-os_nova master: Remove heartbeat_in_pthread functionality  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/85599102:21
opendevreviewKevin Carter proposed openstack/openstack-ansible-os_cinder master: Remove heartbeat_in_pthread functionality  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/85599202:23
opendevreviewKevin Carter proposed openstack/openstack-ansible-os_neutron master: Remove heartbeat_in_pthread functionality  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/85599302:25
opendevreviewKevin Carter proposed openstack/openstack-ansible-rabbitmq_server master: Update the heartbeat and handshake timeout  https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/85599602:50
opendevreviewKevin Carter proposed openstack/openstack-ansible-rabbitmq_server master: Update the heartbeat and handshake timeout  https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/85599602:51
*** ysandeep is now known as ysandeep|afk03:36
opendevreviewOpenStack Proposal Bot proposed openstack/openstack-ansible master: Imported Translations from Zanata  https://review.opendev.org/c/openstack/openstack-ansible/+/85601204:30
*** ysandeep|afk is now known as ysandeep05:29
anskiyjamesdenton: no, it still depends on it, this thing just prevents explicit enable/start and configuration for openvswitch07:10
noonedeadpunksorry cloudnull for being harsh on your PRs :p07:24
opendevreviewMerged openstack/openstack-ansible master: Imported Translations from Zanata  https://review.opendev.org/c/openstack/openstack-ansible/+/85601207:47
opendevreviewEbbex proposed openstack/openstack-ansible-os_ironic master: Bind http and tftp services to the bmaas network  https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/85212208:03
opendevreviewEbbex proposed openstack/openstack-ansible-os_ironic master: Ensure ironic inspector dhcp server listen address is defined  https://review.opendev.org/c/openstack/openstack-ansible-os_ironic/+/85217308:03
*** ysandeep is now known as ysandeep|afk10:03
*** ysandeep|afk is now known as ysandeep10:40
opendevreviewMerged openstack/openstack-ansible-os_neutron master: Convert include to include_tasks  https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/85580311:02
*** dviroel|out is now known as dviroel11:19
cloudnullnoonedeadpunk no worries. I’m just a drive by commiter these days 😉12:41
cloudnullOn https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/855992 I left a comment about the error I was seeing in the rabbit logs. Gone after I changed the pthread setting for cinder, nova, and neutron (the only three services that use it). I’ll leave the PRs up but feel free to abandon them if they’re not a good fit. I can work12:43
cloudnullaround it by using the existing options. 12:43
noonedeadpunkeventually, uwsgi will work suuuuper weirdly if you enable heartbeat_in_pthread12:45
noonedeadpunksorry, everything except uwsgi12:46
noonedeadpunkuwsgi is the only thing that works with this thing12:47
cloudnullShould that setting be applied to all the Oslo messaging services ? 12:47
cloudnullOh. So it’s just uwsgi 12:47
noonedeadpunkThough, if you check the ML thread I've shown, ppl agreed there that these errors are harmless and the only disadvantage is disabling heartbeat_in_pthread by default and backporting stuff to stable branches12:47
noonedeadpunkBut for non-uwsgi services it's a disaster, as they could get stuck or stop responding12:48
cloudnullYeah they’re harmless. But seem to correspond to spikes to cpu when they pop. It’s like 10% so nothing major. But that’s the reason I was even looking. 12:48
cloudnullI’ll abandon those prs for now. 12:49
cloudnullOn a total aside, project works great with Debian 11. Great work on all that. 12:51
noonedeadpunkat least now we know it works lol12:52
noonedeadpunkcloudnull: are you using ceph? As this is smth that might not work as expected12:52
noonedeadpunkor well, we just don't test it, so I don't really know12:53
cloudnullno I don't run cephalopods 12:56
cloudnull** ceph12:57
cloudnullstupid auto-correct 12:57
cloudnullI run ZFS + NFS 12:57
cloudnulljrosser_ got me on the ZFS kick, and I never looked back 🙂12:58
cloudnulldebian 11 is working perfectly. I need to push a change to the docs for base setup but otherwise its flawless. 13:00
kleini-ZFS is really nice, although I only use it as root filesystem in Ubuntu13:17
kleini-because Ceph cluster is provided and not maintained by me13:17
*** kleini- is now known as kleini13:17
* noonedeadpunk still moves out from zfs+nfs to ceph13:42
noonedeadpunknasty thing about zfs+nfs is nfs bit, which can not recover in case of any network interruption13:43
*** ysandeep is now known as ysandeep|afk14:32
NeilHanlono/ meeting today?15:02
noonedeadpunkah. it's timer15:03
noonedeadpunkthanks)15:03
noonedeadpunk#startmeeting openstack_ansible_meeting15:03
opendevmeetMeeting started Tue Sep  6 15:03:27 2022 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.15:03
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:03
opendevmeetThe meeting name has been set to 'openstack_ansible_meeting'15:03
*** NeilHanlon is now known as alarmclock15:03
noonedeadpunk#topic rollcall15:03
alarmclocko/15:03
noonedeadpunko/15:03
noonedeadpunklol15:03
*** alarmclock is now known as NeilHanlon15:03
noonedeadpunk#topic office hours15:12
noonedeadpunkwell, seems it will be relatively quiet today in the meeting15:12
noonedeadpunkNeilHanlon: regarding Rocky 9 - do you need any help with adding that?15:12
damiandabrowskihello (sorry for being late)15:15
NeilHanlonnoonedeadpunk: not right now, i just need to get back to it. hoping i have some free time this week to try and get the changes in. i worked with mgariepy a bit ago to try and get the necessary centos-release-* packages into the Rocky base repos which will eliminate at least one change i have15:16
noonedeadpunkok, great15:17
noonedeadpunkas we're getting closer and closer to the Zed release date, we need at least to understand where we are wrt distro support15:17
noonedeadpunkI had some idea about how to optimize another chunk of our code that is for openstack resources creation. 15:18
mgariepyhello :D15:18
noonedeadpunkAs simple example - we do have several places where we do upload images - magnum, octavia, tempest, trove15:19
noonedeadpunkSo it sounds like we could create a role, that will be in charge of all resource creation, and place it in plugins (and maybe to ansible-collections-openstack) in the future15:19
mgariepyi wasn't me for the package in the repos :)15:19
mgariepyit*15:19
NeilHanlonsorry.. tab completion.. mnasiadka -- i think they do Kolla things :) 15:20
noonedeadpunkAnd basically what brings me to that  - https://review.opendev.org/c/openstack/openstack-ansible/+/85423515:20
mgariepyno worries i just wanted to set the record straight :D haha15:21
noonedeadpunkI can't tell what specifically I don't like in this patch, it just looks a bit off from what we're doing everywhere15:21
noonedeadpunkthis might be also good step to drop resource creation from os_tempest role15:22
rajesh_gunasekaranHey Guys, i want to contribute, could you please help me with picking out some task to work on?15:22
noonedeadpunkI was told that andrewbonney does have a big chunk of this and may advise on how we want to make the structure15:24
noonedeadpunkrajesh_gunasekaran: how familiar are you with OSA?15:24
anskiynoonedeadpunk: that's a nice step, but it's too restricted, I guess. But, anyways, you can't cover everything, so as much as I would wonder about dropping my own playbook for resource creation, I don't see it happen :(15:24
damiandabrowskinoonedeadpunk: one thing concerns me, to create a resource you just need to execute one ansible module15:24
damiandabrowskii wonder how including external role can simplify that even more :D 15:25
noonedeadpunkdamiandabrowski: it's not always a case? Like for image lifecycle management you would need to have quite complicated flow15:25
anskiyIt could be done via that ansible thing, where you dynamically define which module to use15:25
noonedeadpunkSame with creating usable network - you need to create net, subnet,l3 router,  and wire things15:26
noonedeadpunkWith images - you might want to download it from remote server, unpack, convert, etc15:27
noonedeadpunkI'm looking at octavia code now https://opendev.org/openstack/openstack-ansible-os_octavia/src/branch/master/tasks/octavia_amp_image.yml15:27
noonedeadpunkit's not _that_ trivial given you want to drop old images wherever you can15:27
noonedeadpunkAlso for some role, you might need to create both networks and images15:28
damiandabrowskifor images - i agree... (full message at https://matrix.org/_matrix/media/r0/download/matrix.org/iGYFmZWPnlHbHpADfrfTmAIG)15:29
*** dviroel is now known as dviroel|lunch15:29
anskiynoonedeadpunk: the problem with that role is that AFAIK ansible module for image can't use glance's `image-create-via-import`, to which you can just simply pass URL15:30
rajesh_gunasekaran@noonedeadpunk OSA - doesn't ring a bell15:30
damiandabrowskiOSA - openstack-ansible15:31
noonedeadpunkanskiy: well, you need also to configure glance properly for that and have a storage where images will be locally placed and served through web server15:32
noonedeadpunkand even then it won't solve image rotation as example15:32
NeilHanlonyou definitely don't wanna see how I did this in my lab...15:33
NeilHanlonand it doesn't solve rotation anyways 😂 15:33
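The image rotation mentioned above (dropping old copies of an image once a newer one has been uploaded) can be illustrated with a minimal sketch. It only covers the selection step, not the download, conversion, or upload, and it is not the logic used in os_octavia or any other role. The `Image` record and the `images_to_rotate` helper are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class Image:
    """Hypothetical, minimal view of an image record (not a Glance API type)."""
    id: str
    name: str
    created_at: datetime


def images_to_rotate(images: Iterable[Image], name: str, keep: int = 1) -> List[Image]:
    """Return the images called `name` that fall outside the newest `keep`.

    The result is ordered newest first, so the caller can delete or
    deactivate the entries in whatever order it prefers.
    """
    if keep < 0:
        raise ValueError("keep must not be negative")
    matching = sorted(
        (image for image in images if image.name == name),
        key=lambda image: image.created_at,
        reverse=True,
    )
    return matching[keep:]
```

A caller would feed this the image list from whichever client it already uses and then act on the returned records; images with other names are never selected.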
rajesh_gunasekaranah okay! i would say am a new comer with basic knowledge about openstack-ansible15:33
noonedeadpunkrajesh_gunasekaran: then I would say you can help out with ensuring, that AIO does survive reboots :)15:34
noonedeadpunkfirstly aio is a good start to get aware about osa. and secondly, I can recall a bug report that was claiming it's not working15:35
noonedeadpunkok, so as an overall result it feels that idea about role for creating resources is useless15:36
rajesh_gunasekaranokay sure! will follow as per your guidance15:36
damiandabrowskinoonedeadpunk: maybe we want to create a role just for image management and lifecycle management? that would be useful for sure15:36
noonedeadpunkrajesh_gunasekaran: ie https://bugs.launchpad.net/openstack-ansible/+bug/1819792 and https://bugs.launchpad.net/openstack-ansible/+bug/1819790 15:37
noonedeadpunkrajesh_gunasekaran: I know they're quite old, but would be great to ensure that things are working fine15:37
noonedeadpunkdamiandabrowski: well, what I wanted to achieve is not to handle resource creation at all in any of os_<service> roles15:38
noonedeadpunkand manage and maintain that separately 15:38
noonedeadpunkso that when ansible-collection-openstack updates or changes, we won't need to search where we use it and what do we use out of it15:39
noonedeadpunkrajesh_gunasekaran: you can find aio docs here: https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html15:40
rajesh_gunasekaran@noonedeadpunk Thank you, will go through the docs15:41
anskiymaybe, ability to just plug in user-defined playbook on various stages (like, pre-setup-hosts, post-setup-openstack...) could help?..15:41
noonedeadpunkand then if we need to create or add some resource we could just run a playbook rather than re-run some os_<service> role15:41
noonedeadpunkanskiy: that is tricky. We do have that for adding new compute, but it's done through the bash script15:42
noonedeadpunkI _think_ we have that also for upgrade script...15:43
*** ysandeep|afk is now known as ysandeep15:43
noonedeadpunkI meant that https://opendev.org/openstack/openstack-ansible/src/branch/master/scripts/add-compute.sh#L2215:44
anskiywell, there is another downside: it's shorter path to get things done and not contribute :)15:44
noonedeadpunkwell... I'd say that if we're talking about resources, you might want/need to run their creation after each role rather than pre-openstack and post-openstack15:45
noonedeadpunkbut not sure....15:46
noonedeadpunkAlso I kind of run that full setup only in sandbox lol15:46
noonedeadpunkrest would be just single playbook15:46
noonedeadpunkanyway15:46
noonedeadpunkas it's quite arguable, I won't push it really. It's totally not top prio.15:48
noonedeadpunkBut thanks for the feedback!15:48
noonedeadpunkIs there anything else we want to quickly discuss? 15:48
* noonedeadpunk needs to run in 5 mins15:48
anskiyhttps://bugs.launchpad.net/cloud-archive/+bug/1988270 nothing's going on in here :(15:49
noonedeadpunk¯\(◉◡◔)/¯15:49
*** ysandeep is now known as ysandeep|out15:49
anskiynoonedeadpunk: you've said you can reach the UCA guys somehow, or did I misunderstand? 15:49
noonedeadpunknah, the idea was that we need to reach them via irc but I could not find any. Though yes, I can likely ping charms folks, and hopefully they know who should take care of that15:50
noonedeadpunkI will try to take care of that tomorrow morning15:51
anskiythank you :)15:52
noonedeadpunk#endmeeting15:52
opendevmeetMeeting ended Tue Sep  6 15:52:35 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:52
opendevmeetMinutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-09-06-15.03.html15:52
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-09-06-15.03.txt15:52
opendevmeetLog:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-09-06-15.03.log.html15:52
prometheanfireI seem to get rabbitmq timeouts all the time, fairly standard OSA deployment, not sure why this happens though16:22
*** dviroel|lunch is now known as dviroel16:26
cloudnullprometheanfire++ I'm seeing it happen every couple of min. 17:17
prometheanfirecloudnull: ltns :D17:46
prometheanfiremight actually be something to do with ceph being borked17:50
noonedeadpunkwell, some rabbit timeouts can be fairly harmless...18:13
noonedeadpunkjamesdenton: hey! Are you around?:)18:42
jamesdentonnoonedeadpunk you rang?18:47
noonedeadpunkyeah! need help :) 18:49
jamesdentonwhat's up?18:50
noonedeadpunkI ran neutron playbook yesterday with tags. As a result, ovs-agent was reconfigured for br-mgmt local_ip instead of br-vlan18:50
jamesdentonoops18:50
noonedeadpunkThis is now reverted, but some networks randomly do not work18:50
jamesdentonhmm ok18:51
jamesdentonovs/ linuxbridge?18:51
jamesdentonovs? linuxbridge?18:51
noonedeadpunkSo for `ovs-vsctl show` I see in ports https://paste.openstack.org/show/bu3ti1r03tuyyb65m5SR/18:51
noonedeadpunkovs18:51
jamesdentonok18:52
noonedeadpunkSo network should be 10.149.20.0/24 and as you see remote_ip="10.149.8.52" 18:52
jamesdentonright, ok18:52
noonedeadpunkThough, I'm not actually sure if I should patch it18:52
noonedeadpunkas it seems each compute should commute only 1 pair of vxlan ports18:53
jamesdentoni assume this is impacting vxlan networks between some of the hosts?18:53
jamesdentonyeah, it's a mesh, so should be one to many18:53
noonedeadpunkI see smth like that https://paste.openstack.org/show/bUrSBf9zwSH7BWdV7aoO/18:54
noonedeadpunkSo should I patch vxlan-0a950835 options to fix remote_ip or try to drop it somehow?18:54
noonedeadpunkOR actually, any thoughts ?:D18:54
jamesdentonand thats the same host on the remote end?18:54
noonedeadpunkyes18:55
noonedeadpunkjust br-mgmt vs br-vxlan18:55
jamesdentonright, ok. well, the flows will ultimately dictate which of those two ports to use.18:55
noonedeadpunkneutron-ovs-agent was restarted n times18:55
jamesdentonk18:55
jamesdentonrestarted cluster-wide?18:55
noonedeadpunkwell, on all computes at least18:56
jamesdentonmy first suggestion would be: make sure local_ip is restored on the hosts, delete the errant ports in ovs bridge, and restart the neutron ovs agent for good measure.18:56
noonedeadpunkok, so such port should be just deleted?18:56
jamesdentonit can be, and should be restored (if needed) automatically. lemme bring up an ovs host real quick18:57
supamattmake sure you restart nova-compute after 18:57
noonedeadpunkhow to delete port?:)18:58
supamattovs-vsctl del-port 18:58
jamesdentonovs-vsctl del-port <bridge> <port>18:59
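The cleanup discussed above can be sketched as a small helper that picks out tunnel ports pointing at the wrong network and builds the matching `ovs-vsctl del-port <bridge> <port>` command lines. This is a sketch under assumptions, not a tested procedure: it assumes the port suffix is the remote IPv4 address in hex (which is how a name like vxlan-0a950835 reads), the helper names are hypothetical, and the bridge name `br-tun` in the usage note is an assumption about the deployment. Nothing is executed; the commands are only returned for review.

```python
import ipaddress
from typing import Iterable, List


def remote_ip_from_port(port_name: str) -> ipaddress.IPv4Address:
    """Decode the remote IPv4 address from a port name such as 'vxlan-0a950835'.

    Assumption: the suffix is the 4-byte remote address written as 8 hex digits.
    """
    prefix, _, hex_part = port_name.partition("-")
    if prefix != "vxlan" or len(hex_part) != 8:
        raise ValueError(f"unexpected tunnel port name: {port_name!r}")
    return ipaddress.IPv4Address(bytes.fromhex(hex_part))


def stale_tunnel_ports(port_names: Iterable[str], tunnel_cidr: str) -> List[str]:
    """Return the ports whose remote address lies outside the tunnel network."""
    network = ipaddress.ip_network(tunnel_cidr)
    return [name for name in port_names if remote_ip_from_port(name) not in network]


def del_port_commands(bridge: str, port_names: Iterable[str]) -> List[List[str]]:
    """Build 'ovs-vsctl del-port <bridge> <port>' argument lists without running them."""
    return [["ovs-vsctl", "del-port", bridge, name] for name in port_names]
```

For example, `stale_tunnel_ports(["vxlan-0a950835", "vxlan-0a951405"], "10.149.20.0/24")` keeps only `vxlan-0a950835`, which decodes to 10.149.8.53 and therefore lies outside the expected overlay network. Passing that result to `del_port_commands("br-tun", ...)` yields the command to review before running it by hand.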
supamattjamesdenton: I had just switched to OVN 22.06 btw, with the latest yoga.. this patch (backported to yoga) is a godsend https://review.opendev.org/c/openstack/neutron/+/848845 . Seeing 0 dropped packets for live migration with OVN now, literally no network drops. 19:01
jamesdentonwoot!19:01
supamattand I had been testing with a ping interval of 100ms ;D 19:01
noonedeadpunkah, well, I thought I have to define bridge, but seems I don't :)19:01
jamesdentonoh, good to know. 19:01
jamesdentonsupamatt what storage are you using?19:02
supamattjamesdenton: shared storage, in this case purestorage19:02
jamesdentoncool19:02
supamattshould be the same with any shared storage solution, ie ceph19:02
noonedeadpunksounds like really good news19:03
jamesdentonnoonedeadpunk did that work for you?19:03
noonedeadpunkBut at the moments when I do have such issues I wish I used LXB lol19:04
noonedeadpunkjamesdenton: that is super good question - have no idea as we can't reproduce now with networks we had in the region19:04
jamesdentonoh, i see. well, it's possible that the port is just a leftover and the flow table was actually using the proper port19:05
noonedeadpunkBut random clients saw random issues, and we don't have access to their workloads obviously19:05
noonedeadpunkand it was affecting not all networks, but only some19:05
noonedeadpunkAs example - 2 nets were connected to the same VM, but 1 net was not working, and we could get to vm through another one19:06
jamesdentonand both networks were overlay?19:06
noonedeadpunkBoth vxlans, yes19:06
noonedeadpunkSo we haven't even spotted misconfiguration at once19:07
noonedeadpunkthanks jamesdenton and supamatt! 19:10
noonedeadpunkI was just not sure how safe removing them is and what's the best way to proceed19:10
jamesdentonsure, good luck19:10
jamesdentonyeah, i was hoping to pull up an ovs setup but i had a power failure earlier and everything was offline19:11
jamesdentonso now i deal with the fallout of starting galera cluster19:11
jamesdentonthe output of "ovs-ofctl dump-flows br-int" or "br-vxlan" ought to give you insight into which of those vxlan ports are being used for the respective traffic. it's possible there's a mix 19:12
noonedeadpunkbtw this was the root cause https://review.opendev.org/c/openstack/openstack-ansible/+/85597719:13
noonedeadpunk(or well, fix of the root cause)19:14
jamesdentonahh ok, good find19:15
noonedeadpunkI wish it was done in a sandbox19:18
noonedeadpunk*found19:18
jamesdentonyeah, that's a bummer. Sorry. 19:20
jamesdentoni blame cloudnull 19:20
noonedeadpunklol19:22
jamesdentonanyone run into this error w/ keystone? Fernet key must be 32 url-safe base64-encoded bytes19:36
jamesdentonwell, nevermind. error was pretty accurate :D one of the keys on one of the infra nodes was just garage characters.19:44
jamesdenton*garbage19:44
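The error message above describes a simple property that can be checked per key file before digging further. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, the key directory is whatever the deployment uses (it is a parameter here, not a known path), and the check only approximates what the message states rather than reproducing keystone's own validation.

```python
import base64
from pathlib import Path
from typing import List, Union


def is_valid_fernet_key(raw: Union[bytes, str]) -> bool:
    """Return True if `raw` decodes as url-safe base64 to exactly 32 bytes."""
    try:
        decoded = base64.urlsafe_b64decode(raw)
    except ValueError:
        # Covers binascii.Error (bad padding) and non-ASCII str input.
        return False
    return len(decoded) == 32


def invalid_keys_in(directory: Union[str, Path]) -> List[str]:
    """Return the names of files in `directory` that do not hold a valid key."""
    bad = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and not is_valid_fernet_key(path.read_bytes().strip()):
            bad.append(path.name)
    return bad
```

Running `invalid_keys_in` against the key repository on each infra node would point at the file holding garbage characters, assuming the keys are stored one per file.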
cloudnulljamesdenton that's fair 19:57
cloudnullI'm not entirely sure what all that is, but I'm sure its my fault 19:58
jamesdentonyou were there when it was made. thus, it is your fault19:58
cloudnullfair 19:59
*** dviroel is now known as dviroel|biab21:19
*** dviroel|biab is now known as dviroel23:17
