opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts master: Add retries to LXC base build command https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/888750 | 06:41 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: [DNM] Bump ansible-core to 2.15.1 and collections https://review.opendev.org/c/openstack/openstack-ansible/+/886527 | 06:46 |
noonedeadpunk | fwiw, I haven't caught any issues with rocky on my AIO | 06:52 |
noonedeadpunk | (today in the morning) | 06:52 |
hamidlotfi_ | Hello, | 07:39 |
hamidlotfi_ | I added a compute node following this manual: https://docs.openstack.org/openstack-ansible/latest/admin/scale-environment.html#add-a-compute-host | 07:39 |
hamidlotfi_ | but after adding this compute node, none of the instances on it can ping the network! | 07:39 |
hamidlotfi_ | can you help me? | 07:39 |
noonedeadpunk | hamidlotfi_: what neutron driver is it using? | 07:44 |
hamidlotfi_ | OVN | 07:44 |
hamidlotfi_ | and ZED version | 07:44 |
noonedeadpunk | and regarding network - you mean external network, or also internal network (ie between VMs for the same tenant)? | 07:45 |
hamidlotfi_ | external network | 07:46 |
noonedeadpunk | but VMs are reachable from other VMs? | 07:46 |
noonedeadpunk | including between computes? | 07:47 |
noonedeadpunk | as in this case I'm completely clueless - I have only a vague understanding of how external connectivity is done in OVN | 07:47 |
hamidlotfi_ | oh ok | 07:48 |
hamidlotfi_ | but let me check interconnect between the VMs. | 07:48 |
noonedeadpunk | but I assume that if VMs are getting spawned, then ovn-agent runs on compute properly, otherwise it would fail to bind port | 07:48 |
noonedeadpunk | but if interconnection between VMs on the same internal network does not work - I assume that smth is wrong with the interface that should be used for geneve | 07:49 |
noonedeadpunk | As I guess external connectivity is done through geneve still, it just goes to gateway nodes and then somehow routed/terminated/etc | 07:50 |
hamidlotfi_ | I have 3 compute nodes (compute01, compute02, compute03) and I just added compute02. | 08:01 |
hamidlotfi_ | All compute01 and compute03 instances see each other but not compute02 instances and vice versa. | 08:01 |
hamidlotfi_ | as I said before compute02 was added newly. | 08:02 |
noonedeadpunk | aha | 08:14 |
noonedeadpunk | Ok, then it should be easy :) | 08:15 |
anskiy | hamidlotfi_: you should see in `ovs-vsctl show` all of your computes -- that's between what geneve is set up | 08:15 |
hamidlotfi_ | what happened? | 08:16 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Installing systemd-udev with NVR https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/888753 | 08:16 |
anskiy | hamidlotfi_: the ovn-agent noonedeadpunk mentioned is `ovn-controller` in the case of OVN; you can check its logs. | 08:16 |
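As a reference for the checks suggested here: on a healthy OVN compute, the geneve mesh shows up as tunnel ports on br-int. A sketch of what to look for (port names, IPs and output below are illustrative, not from this deployment):

```shell
# On the affected compute: each peer chassis should appear as a geneve
# tunnel port under br-int (names/IPs below are illustrative).
ovs-vsctl show
#   Bridge br-int
#       Port ovn-comp01-0
#           Interface ovn-comp01-0
#               type: geneve
#               options: {csum="true", key=flow, remote_ip="172.17.222.21"}

# If a peer is missing, check the local ovn-controller log for errors:
tail -n 50 /var/log/ovn/ovn-controller.log
```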
noonedeadpunk | anskiy: I think in OVN you need to add interface to the bridge or smth, right? Or at least have an IP address on it? | 08:17 |
* noonedeadpunk just doesn't have any ovn sandbox handy | 08:18 |
opendevreview | Dmitriy Rabotyagov proposed openstack/ansible-role-systemd_mount master: Installing systemd-udev with NVR https://review.opendev.org/c/openstack/ansible-role-systemd_mount/+/888754 | 08:19 |
anskiy | noonedeadpunk: external one? yeah. | 08:19 |
hamidlotfi_ | yes, my problem on external network | 08:20 |
noonedeadpunk | As I guess this is the issue | 08:20 |
noonedeadpunk | hamidlotfi_: btw, can you ping between compute nodes by IP assigned on the external interface that is used for geneve? | 08:21 |
noonedeadpunk | huh, but where is it defined? | 08:22 |
anskiy | hamidlotfi_: you should see it here `ovs-vsctl list open` in `ovn-encap-ip` | 08:23 |
hamidlotfi_ | compute01: ovn-encap-ip="172.17.222.21", compute02: ovn-encap-ip="172.17.222.22", compute03: ovn-encap-ip="172.17.222.23" | 08:35 |
hamidlotfi_ | the only difference among them is the ovs_version: "2.17.7" on the new node vs "2.17.5" on the others | 08:35 |
hamidlotfi_ | And all the ovn-encap-ip addresses can ping each other | 08:36 |
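The check anskiy pointed at, spelled out as commands (the `.` addresses the single row of the Open_vSwitch table; the IPs are the ones pasted above):

```shell
# Print the geneve tunnel endpoint ovn-controller announces on this node:
ovs-vsctl get open_vswitch . external_ids:ovn-encap-ip

# From compute02, confirm the other endpoints are reachable:
ping -c 3 172.17.222.21
ping -c 3 172.17.222.23
```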
anskiy | hamidlotfi_: did you reinstall compute02? | 08:38 |
hamidlotfi_ | yes, I completely removed compute02 from the cluster and installed a new compute02 | 08:39 |
anskiy | do you see it in `openstack network agent list`, and in which state is it now? For each compute there should be two entries: `OVN Metadata agent` and `OVN Controller Gateway agent` | 08:42 |
noonedeadpunk | I'd say that if agent was not there - it won't be able to bind port to the VM | 09:07 |
noonedeadpunk | So VM creation would fail | 09:07 |
hamidlotfi_ | Because the ovs_version didn't match, I deleted and reinstalled it, but now neutron-ovn-metadata-agent.service shows me this error: | 09:11 |
hamidlotfi_ | "Error executing command (DbAddCommand): ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Chassis_Private with name" | 09:11 |
hamidlotfi_ | I think it's better to install compute02 from the beginning, right? | 09:13 |
noonedeadpunk | I don't think it's due to a version mismatch, to be frank - too minor a difference | 09:17 |
noonedeadpunk | hamidlotfi_: oh, `Cannot find Chassis_Private with name` is actually interesting | 09:17 |
noonedeadpunk | hamidlotfi_: does the compute's name in `openstack hypervisor list` follow the same naming convention? And is it the same in `openstack compute service list`? | 09:18 |
anskiy | AFAIR, ovs-vswitchd/ovn-controller should be adding info into southbound database, so you might try just restarting those | 09:18 |
noonedeadpunk | as we sometimes have a mess with .openstack.local vs bare hostnames which can lead to smth like that | 09:18 |
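The naming-consistency check noonedeadpunk describes can be done roughly like this (output depends on the deployment):

```shell
# All of these should agree on one naming convention
# (bare hostname vs FQDN) for compute02:
hostname && hostname -f
openstack hypervisor list
openstack compute service list --service nova-compute

# The chassis identity ovn-controller registers in the SB DB
# (the chassis hostname falls back to the system hostname when unset):
ovs-vsctl get open_vswitch . external_ids:system-id
```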
hamidlotfi_ | https://www.irccloud.com/pastebin/o31xoueA/ | 09:20 |
anskiy | so, nova-compute is down too? | 09:21 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: [DNM] Bump ansible-core to 2.15.2 and collections https://review.opendev.org/c/openstack/openstack-ansible/+/886527 | 09:24 |
hamidlotfi_ | I stopped it manually | 09:25 |
opendevreview | Dmitriy Rabotyagov proposed openstack/ansible-role-zookeeper master: Fix linters and metadata https://review.opendev.org/c/openstack/ansible-role-zookeeper/+/888610 | 09:31 |
opendevreview | Dmitriy Rabotyagov proposed openstack/ansible-role-zookeeper master: Fix linters and metadata https://review.opendev.org/c/openstack/ansible-role-zookeeper/+/888610 | 09:32 |
anskiy | hamidlotfi_: I would suggest checking ovn-controller logs on compute node (`/var/log/ovn/ovn-controller.log`) and `/var/log/openvswitch/ovs-vswitchd.log` as it could have some clue about what went wrong on adding chassis into OVN SB DB. | 09:34 |
anskiy | next place would be checking `chassis` and `chassis_private` tables in OVN with something like `ovn-sbctl --db tcp:<IP1>:6642,tcp:<IP2>:6642,tcp:<IP3>:6642 list chassis` | 09:36 |
anskiy | to see, if there is anything with `compute02` name and how does it differ from the others. | 09:38 |
anskiy | The last time I saw something similar, a compute was bootstrapped with the wrong OVS version (2.13 vs 2.17), and it clearly broke when trying to add itself to the SB, which was running 2.17. | 09:39 |
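Putting anskiy's last two suggestions together (the `<IPn>` placeholders are the southbound DB hosts, as above):

```shell
SB="tcp:<IP1>:6642,tcp:<IP2>:6642,tcp:<IP3>:6642"

# A chassis that exists in Chassis but not in Chassis_Private matches
# the "Cannot find Chassis_Private with name" error above:
ovn-sbctl --db "$SB" list chassis
ovn-sbctl --db "$SB" list chassis_private
```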
hamidlotfi_ | OK, I will check | 09:40 |
hamidlotfi_ | Thank you very much for your help with the details. | 09:40 |
opendevreview | Dmitriy Rabotyagov proposed openstack/ansible-role-zookeeper master: Do not use notify inside handlers https://review.opendev.org/c/openstack/ansible-role-zookeeper/+/888760 | 09:58 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-haproxy_server master: Do not use notify inside handlers https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/888762 | 10:37 |
Tad | mgariepy Here is my neutron and connectivity issue, instances can communicate with each other on their private addresses, i.e. 192.168.10.144 is able to ping 192.168.10.160 and vice versa, but they are not able to go out to the internet, i.e. ping 8.8.8.8 | 10:38 |
Tad | My configuration files are located at https://github.com/TadiosAbebe/OSA/blob/master/etc/openstack_deploy and I have modified the openstack_user_config.yml, user_variables.yml, ./group_vars/network_hosts, /env.d/neutron.yml and /env.d/nova.yml files. | 10:39 |
Tad | Here is a detailed description of what I tried and how my environment is set up https://pastebin.com/W9xy0EWC | 10:39 |
noonedeadpunk | Tad: first thing I can tell - `is_container_address` should be defined only once. And that supposed to be br-mgmt | 10:42 |
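In openstack_user_config.yml terms, that means only the br-mgmt entry carries the flag; a minimal sketch (interfaces and group_binds are illustrative, not Tad's actual file):

```yaml
global_overrides:
  provider_networks:
    - network:
        container_bridge: "br-mgmt"
        container_type: "veth"
        container_interface: "eth1"
        ip_from_q: "container"
        type: "raw"
        group_binds:
          - all_containers
          - hosts
        is_container_address: true      # only here
    - network:
        container_bridge: "br-storage"
        container_type: "veth"
        container_interface: "eth2"
        ip_from_q: "storage"
        type: "raw"
        group_binds:
          - cinder_volume
          - nova_compute
        # no is_container_address on any other network
```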
Tad | okay great i'll fix that. | 10:43 |
noonedeadpunk | one small nit - br-vxlan is super confusing given that this is geneve - 2 different overlay protocols. | 10:43 |
noonedeadpunk | That all is unrelated (likely) to your issue though | 10:43 |
noonedeadpunk | Also - you have all bridges except br-storage as linux bridges, and br-storage as ovs one? | 10:44 |
Tad | how so? do I need to specify openvswitch in my netplan for all bridges? | 10:45 |
noonedeadpunk | then, log_hosts will likely not have any effect - we've dropped the rsyslog roles as logs are managed with journald | 10:45 |
noonedeadpunk | Tad: well, I don't know, I'm asking you :D You have `container_bridge_type: "openvswitch"` only for 1 bridge https://github.com/TadiosAbebe/OSA/blob/master/etc/openstack_deploy/openstack_user_config.yml#L51C9-L51C45 | 10:46 |
Tad | nice, i'll remove the log_hosts too | 10:46 |
noonedeadpunk | so was kinda wondering why not to align that to same tech :) But that's not critical as well | 10:46 |
noonedeadpunk | What is critical, is that I don't see some required definitions of groups for OVN | 10:47 |
Tad | like what definitions? | 10:48 |
noonedeadpunk | Tad: I think you're missing `network-gateway_hosts` and `network-northd_hosts` | 10:49 |
noonedeadpunk | https://docs.openstack.org/openstack-ansible-os_neutron/latest/app-ovn.html#deployment-scenarios | 10:50 |
noonedeadpunk | and ovn gateway is exactly the thing that is repsonsible for external connectivity | 10:51 |
Tad | yes, I have seen that in the docs but I assumed they would be created automatically. And I can see an OVN Controller Gateway agent on my compute node when issuing `openstack network agent list` | 10:52 |
noonedeadpunk | I think that controller gateway is implicitly included in network-hosts | 10:53 |
noonedeadpunk | but ovn gateway is totally not | 10:53 |
noonedeadpunk | as there are multiple scenarios for where to place them, and usually that's not the control plane | 10:54 |
noonedeadpunk | either compute nodes or standalone network nodes | 10:54 |
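A sketch of the group definitions being discussed (hostnames and IPs are placeholders; see the deployment-scenarios page linked above for the authoritative layout):

```yaml
# neutron-server et al. stay on the control plane:
network_hosts:
  infra1:
    ip: 172.29.236.11

# ovn-northd, usually also on the control plane:
network-northd_hosts:
  infra1:
    ip: 172.29.236.11

# OVN gateway chassis - compute nodes or standalone network nodes:
network-gateway_hosts:
  compute1:
    ip: 172.29.236.21
```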
Tad | Ohh good to know, I’ll specify network-gateway_hosts on my compute node, what else do you see that is off right away? | 10:56 |
noonedeadpunk | actually, that's what made northd available https://github.com/TadiosAbebe/OSA/blob/master/etc/openstack_deploy/env.d/neutron.yml | 10:59 |
noonedeadpunk | that actually looks like it was taken from Yoga, as it should not be needed since Zed | 11:01 |
Tad | i did that from the following suggestion https://bugs.launchpad.net/openstack-ansible/+bug/2002897 so should i remove the neutron.yml config and just place my network-gateway_hosts on openstack_user_config.yml? | 11:03 |
noonedeadpunk | Well, I'd drop both that and nova.yml as well | 11:03 |
noonedeadpunk | and then defined network-gateway_hosts and network-northd_hosts | 11:04 |
Tad | great, what else? | 11:04 |
Tad | this "Also - you have all bridges except br-storage as linux bridges, and br-storage as ovs one?" is a good point out but i don't know what i should do? | 11:05 |
noonedeadpunk | Well, I would either have all bridges on controllers as OVS or all as just linux bridges, not mix of them. But there is technically nothing wrong in mixing them if there is a reason for that | 11:06 |
noonedeadpunk | in your case it should be likely easier to drop `container_bridge_type: "openvswitch"` and re-configure br-storage as simple bridge | 11:07 |
noonedeadpunk | also - once you will change env.d/conf.d/openstack_user_config you will likely need to re-run lxc-containers-create.yml playbook | 11:08 |
Tad | I don't have any reason for mixing them. "We opted to move over to the new OVN provider. This solved our issues and left the deprecated LinuxBrdige driver outside of the equation. Also, VXLAN was replaced with Geneve. Relevant configuration files were adjusted as follows:" is taken from https://bugs.launchpad.net/openstack-ansible/+bug/2002897 - that is why I opted for ovs | 11:09 |
noonedeadpunk | Tad: well, this is in the context of neutron driver. Linux bridges are indeed deprecated as a neutron drivers. But they are still part of the Linux :D | 11:10 |
noonedeadpunk | and OSA is quite agnostic of tech - you can even passthrough physical interfaces inside LXC containers and not having bridges at all | 11:11 |
noonedeadpunk | so it's kinda matter of prefference and taste | 11:11 |
Tad | ohh okay, then i'll drop the container_bridge_type: "openvswitch" | 11:13 |
Tad | what about the host_bind_override: "bond1" is this necessary? | 11:15 |
noonedeadpunk | Tad: to be frank - I don't remember :D But the idea behind it is that there might be no br-vlan bridge at all - it's absolutely fine to have just an interface instead of the bridge on network/compute hosts | 11:26 |
noonedeadpunk | and it is not needed on storage/controller hosts at all | 11:26 |
noonedeadpunk | so to avoid creating br-vlan bridge, you can just have an interface and mark that with `host_bind_override` | 11:26 |
noonedeadpunk | I can't recall if you can simply use interface in `container_bridge` or not... | 11:27 |
opendevreview | Merged openstack/openstack-ansible-galera_server master: Do not use notify inside handlers https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/887520 | 11:27 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-haproxy_server master: Do not use notify inside handlers https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/888762 | 11:30 |
Tad | oh okay. | 11:30 |
noonedeadpunk | but if you have br-vlan bridge on compute/network hosts - you don't need host_bind_override then | 11:31 |
noonedeadpunk | it's just a bit weird, as basically that bridge will just be added as an "interface" to another bridge, while br-vlan would have only 1 interface in it. It was named a "bridge" only for consistency, to name things the same way across all docs | 11:32 |
noonedeadpunk | same with br-vxlan actually | 11:32 |
noonedeadpunk | but it would work both ways :) | 11:33 |
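The two variants noonedeadpunk describes, sketched side by side (ranges, interface names and group_binds are illustrative):

```yaml
# Variant A: no br-vlan bridge on network/compute hosts - bind the
# provider network straight to the physical interface:
- network:
    container_bridge: "br-vlan"
    container_type: "veth"
    container_interface: "eth12"
    host_bind_override: "bond1"
    type: "vlan"
    range: "101:200"
    net_name: "vlan"
    group_binds:
      - neutron_ovn_controller

# Variant B: keep a real br-vlan bridge (with bond1 as its only port)
# on those hosts and drop host_bind_override entirely.
```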
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-plugins master: Do not use notify inside handlers https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/888766 | 11:33 |
Tad | oh okay. When specifying network-northd_hosts and network-gateway_hosts in openstack_user_config, do I need to remove the network_hosts portion or should I keep it? | 11:36 |
noonedeadpunk | no, keep that | 11:36 |
noonedeadpunk | it is needed for neutron-server | 11:37 |
Tad | oh okay, I'm now running the playbooks after changing what you suggested | 11:37 |
Tad | and when running the playbooks, what I have been doing so far is running all of setup-hosts, setup-infrastructure and setup-openstack after changing any configuration - is that the proper way? | 11:39 |
noonedeadpunk | well, it's the long way :) | 11:41 |
noonedeadpunk | it would work though | 11:41 |
noonedeadpunk | (with that pace you could run setup-everything.yml as well) | 11:42 |
noonedeadpunk | short path would be to run lxc-containers-create and then affected roles. For example, if you're changing neutron configuration - run os-neutron-install.yml afterwards | 11:42 |
noonedeadpunk | and lxc-containers-create is needed only when you expect changes in inventory, that would result in creating new containers | 11:43 |
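The short path, as a sketch of the commands involved (playbook names are from the conversation):

```shell
cd /opt/openstack-ansible/playbooks

# Only needed when an inventory change creates new containers:
openstack-ansible lxc-containers-create.yml

# Then just the affected role, e.g. after a neutron config change:
openstack-ansible os-neutron-install.yml
```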
Tad | yeah, it takes me more than 4 hours to deploy openstack on 3 nodes every time I make a change :) I'll try the lxc-containers-create route next time | 11:43 |
noonedeadpunk | in the current case, as you've dropped some env.d files, I would likely also re-create the neutron-server containers, as some ovn services ended up there while they shouldn't have | 11:44 |
noonedeadpunk | wow, that's too long kinda... | 11:44 |
Tad | yeah, imagine doing it when you have a lot to learn and experiment with | 11:44 |
noonedeadpunk | I wouldn't expect setup-everything to run for more than 2h to be frank, but that's still really long | 11:44 |
Tad | I think it took that long for me because it is installing the control plane on all three hosts | 11:45 |
noonedeadpunk | and what you can do - run `openstack-ansible lxc-containers-destroy.yml --limit neutron_server` | 11:45 |
noonedeadpunk | and re-create these with `openstack-ansible lxc-containers-create.yml --limit neutron_server,localhost` | 11:46 |
Tad | okay great, that would be handy, for now since i have changed my netplan and reapplied it i am running setup-everything to be on the safe side | 11:47 |
Tad | noonedeadpunk: on another note: when experimenting with OSA on three nodes, after a successful deployment, if there is a power interruption and all 3 of my servers lose power, the galera cluster won't start up and I have to manually run galera_new_cluster on the node where safe_to_bootstrap is 1 inside /var/lib/mysql/grastate.dat. Am I doing something wrong, or is there a more permanent solution? | 11:57 |
anskiy | Tad: yeah, it did. I would suggest revisiting your openstack_user_config, as, for example, in this bit: https://github.com/TadiosAbebe/OSA/blob/master/etc/openstack_deploy/openstack_user_config.yml#L68-L74 you're installing infrastructure services for the control plane (eg the galera cluster) on your compute and storage nodes. | 11:57 |
noonedeadpunk | I think it's intentional POC deployment ;) | 11:58 |
anskiy | could be, but that's an opportunity to speed things up a little bit :) | 11:59 |
noonedeadpunk | Tad: nah, I guess it's a "relatively" fair recovery process for galera. The same actually happens in split-brain: when it did not record the last sequence number (due to an unexpected shutdown), it doesn't know where the latest data is | 11:59 |
noonedeadpunk | so yes, you need to tell it which one should act as "master" | 12:00 |
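The recovery flow Tad describes can be sketched as below. The grastate.dat here is a fabricated example in /tmp so the logic is self-contained; on a real node the file lives at /var/lib/mysql/grastate.dat and the final step is actually running galera_new_cluster on the chosen node.

```shell
# Fake grastate.dat for illustration (real path: /var/lib/mysql/grastate.dat)
cat > /tmp/grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
safe_to_bootstrap: 1
EOF

# Bootstrap only from the node marked safe; on the other nodes just
# start mariadb normally afterwards.
if grep -q '^safe_to_bootstrap: 1' /tmp/grastate.dat; then
    echo "bootstrap this node: run galera_new_cluster"
else
    echo "do NOT bootstrap here; compare seqno across nodes first"
fi
```

If no node has safe_to_bootstrap: 1 (unclean shutdown everywhere), the usual approach is to compare seqno across nodes and mark the highest one by hand before bootstrapping.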
noonedeadpunk | Tad: well, actually, I dunno if you knew that or not, but you can create multiple containers of the same type on a single host: https://docs.openstack.org/openstack-ansible/latest/reference/inventory/configure-inventory.html#deploying-0-or-more-than-one-of-component-type-per-host | 12:01 |
noonedeadpunk | so you can have 3 galera containers on the 1 host to play with clustering, for instance | 12:02 |
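Per the configure-inventory doc linked above, that looks roughly like this (host name and IP are illustrative):

```yaml
shared-infra_hosts:
  infra1:
    ip: 172.29.236.11
    affinity:
      galera_container: 3   # three galera containers on this one host
```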
Tad | oh okay, I guess you wouldn't encounter power interruptions in a production environment. The thing is, I am experimenting with openstack at my office, where all 3 baremetal servers are located and there is no UPS - and we often encounter power interruptions | 12:04 |
noonedeadpunk | Well, power loss of a whole DC can happen, ofc, but mysql would be the least of your problems if that happens | 12:05 |
Tad | anskiy: as noonedeadpunk pointed out, it is a POC thing, but I think I could also leave out the HA part until I get neutron to work properly - as you said, it might speed things up a little | 12:09 |
Tad | noonedeadpunk: what else should I be concerned about on power interruption? | 12:10 |
noonedeadpunk | storage?:) | 12:10 |
Tad | what could go wrong with storage? I'm using cinder with an lvm backend and I'm not doing anything serious with the openstack cloud | 12:13 |
noonedeadpunk | well, running instances use write buffers, so they can easily end up with a broken FS inside them | 12:27 |
Tad | oh, you are right, I should really get a UPS for the future, but currently I am at the stage where I'm just running the cirros image | 12:31 |
Tad | With this openstack_user_config now https://pastebin.com/XLduZ9uK, setup-openstack fails on TASK [os_neutron : Setup Network Provider Bridges] with the error "ovs-vsctl: cannot create a bridge named br-vlan because a port named br-vlan already exists on bridge br-provider" | 12:43 |
noonedeadpunk | Um... honestly I'm not sure here. And you have pre-created br-vlan with the netplan config? | 12:47 |
Tad | yes i have br-vlan on bond1 of my netplan | 12:49 |
NeilHanlon | noonedeadpunk: how are rocky jobs looking? i tagged new ovn yesterday | 12:51 |
noonedeadpunk | I did some rechecks today but haven't checked their status yet. but my aio was good :) | 12:54 |
noonedeadpunk | And CI seems way greener today | 12:54 |
mgariepy | hey good morning. | 12:58 |
NeilHanlon | good to hear :) | 12:59 |
NeilHanlon | i'll follow up with amorlej to find out what I did wrong lol | 12:59 |
NeilHanlon | morning mgariepy | 12:59 |
mgariepy | NeilHanlon, how did moving all your stuff go? | 13:00 |
NeilHanlon | went pretty well, all things considered! starting to have some semblance of normalcy now | 13:01 |
NeilHanlon | movers cost almost 2x what they estimated us... but | 13:01 |
mgariepy | haha dust takes some time to settle :) | 13:01 |
mgariepy | wow. | 13:01 |
NeilHanlon | oh the dust is another thing altogether lol... my asthma hates me | 13:01 |
NeilHanlon | I've always moved in the fall/spring, so I didn't account for that it would take them on the longer end of their time estimate... since it was so hot | 13:02 |
NeilHanlon | plus they charged me $160 for moving my sofa lol | 13:02 |
mgariepy | 160 only for 1 sofa ? | 13:03 |
mgariepy | must be a huge sofa ;) | 13:03 |
NeilHanlon | supposedly because it was heavy (it's a powered recliner) | 13:03 |
NeilHanlon | another $100 for a fridge they moved downstairs... | 13:03 |
NeilHanlon | i get the feeling they knew it was the last time they were gonna move me and wanted to get their last $$$ | 13:04 |
mgariepy | how much does it cost, well it depends, how much do you have, let me get you a personalized quote ? | 13:05 |
mgariepy | well pretty much like anything else i guess. | 13:05 |
NeilHanlon | they had been good to me in the past, which is why I used them. it was like my 5th move with this same group | 13:06 |
NeilHanlon | but yeah, it felt a bit like a bait and switch | 13:07 |
mgariepy | they offer a good service/price for moving between apartments but charge way more for a house? | 13:07 |
mgariepy | haha | 13:07 |
Tad | noonedeadpunk: when I put host_bind_override: "bond1" back on container_bridge: "br-vlan", os-neutron-install.yml completed without error | 13:11 |
mgariepy | can i have some review on : https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/888314 and https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/888498 | 13:14 |
* NeilHanlon nods | 13:15 |
Tad | mgariepy: when you get time, here is the problem I was referring to yesterday: https://pastebin.com/W9xy0EWC and here is my latest openstack_user_config.yml file after making the changes suggested by noonedeadpunk https://pastebin.com/XLduZ9uK - the issue still persists, so any help would be appreciated | 13:58 |
jamesdenton | does the haproxy role provide the ability for each endpoint to have a unique fqdn using port 443? | 14:05 |
jamesdenton | @Tad did you create a neutron router and connect it to the external network as well as the tenant network? can you ping the default gateway from the VM? | 14:06 |
jamesdenton | (sorry, looks like i missed that on line 8) | 14:06 |
jamesdenton | If your public network truly is 10.20.30.0/24, do you have an upstream NAT? that network is not routable | 14:07 |
mgariepy | jamesdenton, haproxy does supports SNI | 14:07 |
noonedeadpunk | jamesdenton: you can do that with haproxy maps since Antelope relatively easily | 14:07 |
jamesdenton | but 2023.1 is the key there, huh? | 14:08 |
Tad | jamesdenton even though instances can ping each other they are not able to ping their gateway. | 14:08 |
Tad | and I don't have any NAT on that network; it is a vlan 102 network created on the router | 14:08 |
noonedeadpunk | I've heard of folks doing that even before, but with maps it should become relatively easy | 14:09 |
jamesdenton | Tad if you expect your tenant network to reach the internet, then the external network needs to either be routable or have some NAT upstream. The provider network itself needs a default gateway that can reach the internet | 14:09 |
jamesdenton | thanks noonedeadpunk mgariepy | 14:09 |
jamesdenton | Tad did you attach the router to the tenant network? openstack router add subnet <router> <tenant subnet> | 14:10 |
noonedeadpunk | Tad: so from VM you can't reach router gateway, right? | 14:10 |
noonedeadpunk | openstack router I mean | 14:10 |
Tad | jamesdenton: yes I did, but the VMs aren't able to ping either their gateway or the interface on the other end, which is my provider network. | 14:12 |
Tad | noonedeadpunk yes they cant reach the gateway | 14:12 |
jamesdenton | sure, so the first problem, then is getting them to ping the tenant gateway | 14:12 |
jamesdenton | there's only 1 compute? | 14:12 |
Tad | yes | 14:12 |
jamesdenton | can you do me a favor and provide the output of "ovs-vsctl list open_vswitch" from all 3 nodes? | 14:13 |
Tad | jamesdenton here you go https://pastebin.com/Hz3UP6u2 | 14:16 |
jamesdenton | thanks | 14:17 |
jamesdenton | can you also please provide: openstack network show and openstack subnet show for the 2 networks? | 14:20 |
Tad | jamesdenton here you go https://pastebin.com/70LJggp5 | 14:25 |
jamesdenton | and just to confirm, your VM IPs are really 10.0.0.x, not 192.168.10.x, right? | 14:26 |
Tad | yes, they were on the 192.168.10.0 network but they are on the 10.0.0.0 network now | 14:27 |
jamesdenton | DHCP seems to be working? | 14:28 |
Tad | yes it is | 14:28 |
Tad | jamesdenton can you see any problem on my openstack_user_config file here https://pastebin.com/XLduZ9uK or is it about right | 14:29 |
jamesdenton | Not sure if you intended on spreading out services across all three nodes or not | 14:30 |
jamesdenton | at a glance, the OVN bits seem OK | 14:31 |
jamesdenton | on the compute, can you show me the 'ovs-vsctl show' output? | 14:31 |
Tad | yes i wanted to test a hyperconverged control plane | 14:33 |
Tad | here you go https://pastebin.com/WCzpS4bA | 14:33 |
Tad | but with that openstack_user_config https://pastebin.com/XLduZ9uK, setup-openstack fails on TASK [os_neutron : Setup Network Provider Bridges] with the error "ovs-vsctl: cannot create a bridge named br-vlan because a port named br-vlan already exists on bridge br-provider", and when I add host_bind_override: "bond1" on container_bridge: "br-vlan", os-neutron-install.yml completes without error | 14:36 |
jamesdenton | ok, so I lied. I see ovn-bridge-mappings="vlan:bond1", which implies bond1 is a bridge, and that vsctl show output confirms it. I suspect you meant for bond1 to be the interface used, and it should be connected to a bridge (likely br-provider) | 14:36 |
jamesdenton | the openstack_user_config.yml shows br-vlan, though, so maybe br-provider was rolled by hand later? | 14:37 |
anskiy | there is a bridge called br-provider, which contains a port br-vlan (like in the error you mentioned before), and in your netplan config, br-vlan is the linux bridge with bond1. At the same time, bond1 is a port in the OVS bridge bond1... | 14:37 |
anskiy | You might need to delete the OVS bridge br-provider, maybe?.. As I don't really see where it's being used | 14:38 |
jamesdenton | Mine looks like this: https://paste.opendev.org/show/bLkYnCApAH4vXAULykQk/. Playbooks would create br-provider (if it doesn't exist) and connect bond1 to br-provider (bond1 must be an existing interface) | 14:40 |
Tad | so there is a bunch of "failed to add bond1 as port: File exists" inside ovs-vswitchd.log, and I think this is happening because I added back the host_bind_override: "bond1" in my config. But without that, the playbook fails | 14:40 |
jamesdenton | host_bind_override should only be for linuxbridge, IIRC | 14:41 |
jamesdenton | try using "network_interface: bond1" instead. In the meantime, you should be able to delete br-provider bridge. br-vlan is probably also unnecessary | 14:42 |
jamesdenton | also, if you have the playbook error that would be helpful | 14:42 |
Tad | how did the br-provider get created in the first place, though? | 14:44 |
jamesdenton | ¯\_(ツ)_/¯ | 14:46 |
Tad | and the thing about deleting br-provider or br-vlan is that I don't want to manually remove these things, because I want this to be a repeatable process. So when I move to testing the deployment on a different machine, I want to be able to run the playbooks and have them work. Is there any way I could control this from the openstack-ansible configs? | 14:48 |
jamesdenton | Well, the environment is in an incompatible state at the moment. The playbooks don't delete anything network related, only create/modify - but ideally you would setup openstack_user_config.yml and the correct bits would be done the first time | 14:50 |
jamesdenton | for OVS/OVN, you want an OVS provider bridge connected directly to a physical interface. The snippet i sent would result in the playbooks creating br-provider and connecting bond1. | 14:51 |
Tad | ohh great, it might be something that was created before, so let me clean-install the environment, run the playbooks with my latest openstack_user_config, and get back to you then. | 14:51 |
jamesdenton | wanna ship that config over first so we can glance at it? | 14:51 |
Tad | sure, https://pastebin.com/XLduZ9uK | 14:52 |
jamesdenton | and your netplan? | 14:52 |
Tad | let me collect them in one file, give me a min | 14:53 |
Tad | netplan: https://pastebin.com/6erRikBP user_variables: https://pastebin.com/7qxF9rBG | 14:55 |
Tad | jamesdenton: do i need to use openvswitch bridges on my host machines? | 14:56 |
jamesdenton | no | 14:56 |
Tad | so the above configs seems okay? | 14:56 |
jamesdenton | ok, so my suggestion would be to remove br-vlan from netplan | 14:56 |
jamesdenton | then, in openstack_user_config, in the br-vlan network block, add 'network_interface: "bond1"' under 'container_bridge: "br-vlan"' | 14:58 |
jamesdenton | br-vlan will end up being created as an OVS bridge | 14:58 |
jamesdenton | everything else looks pretty sane at a glance | 14:58 |
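jamesdenton's suggestion as a sketch of the br-vlan block (other keys as in Tad's existing file; range, net_name and group_binds are illustrative):

```yaml
# netplan: no br-vlan bridge at all - bond1 stays a plain bond.
# openstack_user_config.yml: the playbooks then create br-vlan as an
# OVS bridge and plug bond1 into it:
- network:
    container_bridge: "br-vlan"
    container_type: "veth"
    container_interface: "eth12"
    network_interface: "bond1"
    type: "vlan"
    range: "101:200"
    net_name: "vlan"
    group_binds:
      - neutron_ovn_controller
```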
jamesdenton | which version of OSA? | 14:59 |
Tad | zed | 14:59 |
Tad | 26.1.2 | 15:00 |
jamesdenton | k | 15:00 |
jamesdenton | cool, give that a go and let us know. should be around for a bit | 15:00 |
Tad | great, i'll do that and get back to you later. i have learned a lot today. jamesdenton anskiy noonedeadpunk thank you very much for your time. | 15:01 |
jamesdenton | and are you going to wipe and redeploy? | 15:01 |
Tad | what I usually do is take a timeshift snapshot before I run anything on the machines, so I restore to that point and go on from there | 15:01 |
noonedeadpunk | #startmeeting openstack_ansible_meeting | 15:02 |
opendevmeet | Meeting started Tue Jul 18 15:02:18 2023 UTC and is due to finish in 60 minutes. The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:02 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:02 |
opendevmeet | The meeting name has been set to 'openstack_ansible_meeting' | 15:02 |
jamesdenton | oh cool | 15:02 |
noonedeadpunk | #topic rollcall | 15:02 |
noonedeadpunk | o/ | 15:02 |
jamesdenton | o/ | 15:02 |
mgariepy | hey o/ | 15:03 |
damiandabrowski | hi! | 15:03 |
NeilHanlon | o/ | 15:04 |
noonedeadpunk | #topic office hours | 15:05 |
noonedeadpunk | so, gates look green now - thanks NeilHanlon for fixing OVN stuff! | 15:06 |
jamesdenton | yay | 15:06 |
noonedeadpunk | We are free to recheck things now | 15:06 |
NeilHanlon | wee | 15:06 |
jamesdenton | on the topic of OVN, looks like there's a new OVN agent coming in Bobcat. I will try and get that implemented ASAP | 15:07 |
jamesdenton | unless someone else gets to it first | 15:07 |
* noonedeadpunk was not planning to do that at least | 15:07 |
noonedeadpunk | I've also pushed 2 patches to fix CentOS LXC (systemd-udev), as it doesn't seem like they're going to do that from their side: https://review.opendev.org/q/topic:osa%252Fcentos_lxc | 15:08 |
* NeilHanlon will also follow up on that bugzilla ticket with the stream product manager because this is absurd | 15:08 |
noonedeadpunk | I see literally 0 move in the bug report | 15:08 |
noonedeadpunk | do they still reply to you, NeilHanlon? :D | 15:09 |
NeilHanlon | unclear :) | 15:09 |
NeilHanlon | to quote a friend... I don't want to live in interesting times anymore | 15:09 |
jamesdenton | loosely following along... Rocky has a path forward? | 15:10 |
* noonedeadpunk wonders why they still have IRC channels and not only the customer portal | 15:10 |
NeilHanlon | jamesdenton: until our lawyers tell us otherwise and/or RH cuts off the means by which we are accessing the sources, yep | 15:10 |
jamesdenton | right on | 15:10 |
NeilHanlon | i'm vaguely interested in trying to keep the stream stuff out of Experimental for our jobs, but will admit my fervor to do so has been ... lost | 15:11 |
noonedeadpunk | I really wonder who in their sane mind would deploy Stream in production... | 15:13 |
noonedeadpunk | Especially these days | 15:13 |
NeilHanlon | ostensibly, Meta (facebook) does. they contribute a lot to Stream. I just... often question if a hyperscaler's interests are aligned with everyone else. their needs aren't typical of 99% of operators of infrastructure IME | 15:14 |
mgariepy | someone with too much political leverage over the distro choice ;) | 15:14 |
noonedeadpunk | NeilHanlon: they are? o_O then I have some questions, like why things such as mcrouter are built only for Ubuntu... | 15:15 |
NeilHanlon | e.g., ELN (Fedora Rawhide but for Stream, which will become Stream 10), is upping the x86 microarch _again_ -- so you'll need a processor supporting at least x86-64-v3 to run EL10 (Stream/RHEL... whatever exists downstream of it now) | 15:16 |
jamesdenton | x86-64-v3 is what, Haswell? Or newer? | 15:17 |
NeilHanlon | haswell and newer | 15:19 |
noonedeadpunk | I'm not sure I have anything earlier than Haswell running, to be frank, but I know for sure some who do | 15:20 |
NeilHanlon | yeah, and there's a long legacy of supporting... legacy devices in the CentOS world | 15:20 |
noonedeadpunk | On another topic - I've pushed all (or close to all) changes for the new ansible-lint: https://review.opendev.org/q/topic:osa%252Fcore-2.15 Bad news - we still need a new ansible-lint tag for tests to pass. | 15:21 |
NeilHanlon | that's a lot of patches :) | 15:21 |
noonedeadpunk | yeah, some of them are legitimately failing, some would need rechecks | 15:21 |
noonedeadpunk | I'm also afraid of accidental mistakes that were made | 15:22 |
noonedeadpunk | Like forgotten quotes or strings wrongly split across multiple lines | 15:22 |
noonedeadpunk | I also have no idea why focal fails here: https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/888132 | 15:23 |
noonedeadpunk | We should probably just drop focal by now with https://review.opendev.org/c/openstack/openstack-ansible/+/886517, but the fact that it's failing is concerning | 15:23 |
noonedeadpunk | Also quorum queues are kinda ready for reviews: https://review.opendev.org/q/topic:osa/quorum_queues Only base services are covered for now - don't want to invest more time until the approach is accepted | 15:26 |
noonedeadpunk | Some time has been spent on fixing Zed as well. We had a circular dependency there, so I had to disable CI and restore it afterwards: https://review.opendev.org/q/parentproject:openstack/openstack-ansible+branch:%255Estable/zed+status:open++label:Verified | 15:27 |
NeilHanlon | i'll try and take a look at the ansible lint stuff for 'human mistakes' and such for you | 15:30 |
noonedeadpunk | Should do the same for Yoga I believe | 15:30 |
noonedeadpunk | NeilHanlon: Yeah, at some point I started feeling sick of doing these patches, so obviously some mistakes were made... | 15:30 |
noonedeadpunk | and YAML is damn hard to be frank | 15:30 |
NeilHanlon | yep, i know what you mean :) | 15:31 |
noonedeadpunk | in terms of all these spacing things with `|` and `>` and `-` in tags... | 15:31 |
noonedeadpunk | ugh | 15:31 |
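As a quick reference for the block-scalar indicators being lamented here (hypothetical keys; behavior per the YAML spec):

```yaml
literal: |     # '|' keeps newlines: value is "one\ntwo\n"
  one
  two
folded: >      # '>' folds newlines into spaces: value is "one two\n"
  one
  two
stripped: |-   # '-' chomping indicator strips the trailing newline: "one\ntwo"
  one
  two
```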
NeilHanlon | i need a drink just hearing you discuss it! | 15:33 |
noonedeadpunk | But that's kinda it I guess. Will try to have some progress with the openstack_resources role during next week | 15:33 |
noonedeadpunk | Talking about this patch https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/878794 | 15:34 |
noonedeadpunk | wanna add some image-related things | 15:34 |
NeilHanlon | oh, nice! | 15:35 |
NeilHanlon | possibly/probably relevant -- https://bugzilla.redhat.com/show_bug.cgi?id=2221820 | 15:53 |
noonedeadpunk | `RHEL 8.9 will still use 2.15 with python 3.11` | 16:00 |
noonedeadpunk | makes total sense (no) | 16:00 |
noonedeadpunk | #endmeeting | 16:00 |
opendevmeet | Meeting ended Tue Jul 18 16:00:59 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:00 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-07-18-15.02.html | 16:00 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-07-18-15.02.txt | 16:00 |
opendevmeet | Log: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2023/openstack_ansible_meeting.2023-07-18-15.02.log.html | 16:00 |
spatel | Are you guys using Ceilometer + Gnocchi for billing, or Ceilometer + Monasca (Ceilosca)? | 16:10 |
spatel | I am confused what to use for billing and why? | 16:11 |
spatel | I was thinking of using Prometheus but it's not a true billing tool | 16:11 |
spatel | Gnocchi is unmaintained, so it's not worth deploying for a new cloud | 16:14 |
noonedeadpunk | I'd say gnocchi is pretty much maintained as of today | 16:16 |
noonedeadpunk | it could be better maintained ofc, but it's not fully unmaintained either | 16:17 |
noonedeadpunk | Like monasca is waaay less maintained | 16:18 |
NeilHanlon | no or few commits doesn't always mean unmaintained | 16:22 |
spatel | Hmm! noonedeadpunk so you prefer gnocchi | 18:29 |
noonedeadpunk | I can't say I'm a fan of gnocchi, but it works quite nicely. But it's heavy as hell | 18:30 |
noonedeadpunk | Can't say monasca, with the whole software stack it requires, is lightweight either... | 18:31 |
noonedeadpunk | but fwiw, monasca was one step away from becoming a deprecated project the previous cycle | 18:31 |
NeilHanlon | i like gnocchi the food | 18:32 |
noonedeadpunk | as gnocchi the tech - you need to know how to cook it :D | 18:34 |
NeilHanlon | :) | 18:36 |
opendevreview | Merged openstack/openstack-ansible-haproxy_server master: Add ability to have different backend port. https://review.opendev.org/c/openstack/openstack-ansible-haproxy_server/+/888314 | 22:18 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!