*** ysandeep|out is now known as ysandeep|PTO | 00:05 | |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/train: Use cloudsmith repo for rabbit and erlang https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/861794 | 07:06 |
jrosser_ | ^ thats running an upgrade job on train too, which we probably want to get rid of? | 07:10 |
noonedeadpunk | Yeah, true | 07:11 |
noonedeadpunk | I also see no way of saving centos 7 lxc job | 07:11 |
noonedeadpunk | As centos has dropped their image for lxc | 07:11 |
noonedeadpunk | And for the legacy method we need infra proxy that was likely dropped or super outdated | 07:11 |
noonedeadpunk | Don't want to mess/fix it | 07:12 |
noonedeadpunk | I wonder what out of that we actually need https://opendev.org/opendev/base-jobs/src/branch/master/roles/mirror-info/templates/mirror_info.sh.j2#L83-L85 | 07:13 |
noonedeadpunk | or well, we'd need to fix that https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/mirror/templates/mirror.vhost.j2#L259-L262 | 07:16 |
noonedeadpunk | and replace with https://us.lxd.images.canonical.com/ | 07:17 |
noonedeadpunk | (but I'd rather drop) | 07:17 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Drop usage of lxc containers proxy https://review.opendev.org/c/openstack/openstack-ansible/+/861825 | 07:20 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/ussuri: Bump SHA for galera, rabbitmq and rally roles https://review.opendev.org/c/openstack/openstack-ansible/+/853029 | 07:41 |
noonedeadpunk | Nah, we can't drop it, as I guess the likes of Rocky use only this way of image retrieval | 07:42 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/train: Restrict pyOpenSSL to less then 20.0.0 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861831 | 07:45 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/train: Restrict pyOpenSSL to less then 20.0.0 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861831 | 07:47 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/train: Use cloudsmith repo for rabbit and erlang https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/861794 | 07:48 |
gokhanisi | hi folks, I am trying to do keystone ldap integration but it didn't work. For testing I created an openldap server on ubuntu focal and created this ldif file > https://paste.openstack.org/show/bEY20h8Nvj5XtaE2zDJC/ this is my keystone domain config > https://paste.openstack.org/show/boQuvnoOHaLZ2lmkFMYd/, I created the b3lab domain manually but in keystone logs it says the b3lab domain is not found. Maybe I have missed something. | 07:52 |
* noonedeadpunk has no experience in ldap integration | 08:01 | |
kleini | gokhanisi: this is my configuration for OSA to configure keystone with LDAP auth | 08:24 |
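(kleini's paste isn't in the log; as a rough sketch only, the os_keystone role's `keystone_ldap` variable is the usual OSA way to generate /etc/keystone/domains/keystone.<domain>.conf — the domain name, URL and DNs below are placeholders, not kleini's values:)

```yaml
# user_variables.yml - illustrative sketch, assuming os_keystone's
# keystone_ldap variable; all values below are placeholders.
keystone_ldap:
  b3lab:                                  # domain -> /etc/keystone/domains/keystone.b3lab.conf
    url: "ldap://ldap.example.com"
    user: "cn=admin,dc=example,dc=com"    # bind DN
    password: "secrete"
    user_tree_dn: "ou=Users,dc=example,dc=com"
    group_tree_dn: "ou=Groups,dc=example,dc=com"
```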
gokhanisi | kleini, and it worked for you? | 08:27 |
kleini | yes, it works | 08:34 |
kleini | gokhanisi: is your keystone domain configuration file located at /etc/keystone/domains/keystone.b3lab.conf? | 08:37 |
gokhanisi | kleini, yes it is like that https://paste.openstack.org/show/bjO6UVFPtOA71srmlcMq/ | 08:41 |
gokhanisi | and this is keystone.conf https://paste.openstack.org/show/bBggmfcI7WMFPZYMGo31/ | 08:43 |
kleini | maybe turn on verbose and debug logging in keystone. did you try to use ldapsearch to test connection to your LDAP? | 08:47 |
gokhanisi | kleini, it is working with ldap search > https://paste.openstack.org/show/bbOjfu6ZohKIh5PVZaZn/ | 08:54 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/ussuri: Switch to tracking stable/ussuri for EM release https://review.opendev.org/c/openstack/openstack-ansible/+/853029 | 08:54 |
gokhanisi | maybe I am typing the url wrong | 08:54 |
gokhanisi | kleini, thanks, it is working now :) and now how can we map openstack projects to ldap objects? | 09:02 |
gokhanisi | I can list groups and users on ldap | 09:02 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-rabbitmq_server stable/train: Use cloudsmith repo for rabbit and erlang https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/861794 | 09:13 |
kleini | just do normal role assignments. I create projects in the same domain - b3lab in your case - and then assign the member role to some group or user for that project | 09:16 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/train: Restrict pyOpenSSL to less then 20.0.0 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861831 | 09:17 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/train: Return jobs to voting https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861855 | 09:20 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/train: Disable upgrade jobs on EM branch https://review.opendev.org/c/openstack/openstack-ansible/+/861858 | 09:23 |
dokeeffe85 | Hi again, I lost my controller and computes to a power failure and when I rebooted them today after getting them back online I get the following errors https://paste.openstack.org/show/beSCxaSdw95snHjbRdhy/ when trying to attach & start containers. I obviously didn't get a chance to stop the containers before I lost the servers. Is there any way to fix this or is it a lxc-containers-destroy.yml + lxc-containers-create.yml again? Thanks in | 09:56 |
dokeeffe85 | advance | 09:56 |
noonedeadpunk | dokeeffe85: and what does /var/log/lxc/lxc-infra1_utility_container-f80f87fa.log say? | 09:58 |
admin1 | tag 25.1.1 fails on python_venv_build : Install python packages into the venv => ERROR: Error [Errno 2] No such file or directory: 'git' while executing command git version\nERROR: Cannot find command 'git' - do you have 'git' installed and in your PATH .. doh !! | 10:24 |
admin1 | and i have not done any changes or overrides | 10:24 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/queens: EOL OpenStack-Ansible Queens https://review.opendev.org/c/openstack/openstack-ansible/+/861868 | 10:25 |
admin1 | ssh to container, apt install git ; re-run playbook .. and its solved .. | 10:25 |
noonedeadpunk | admin1: is it for placement? | 10:25 |
admin1 | no .. c1_heat_api_container | 10:25 |
admin1 | ignore the c1_ | 10:26 |
noonedeadpunk | nah, then needs patching | 10:26 |
admin1 | where would i see our CI passing logs .. | 10:26 |
noonedeadpunk | Just for placement was fixed with https://opendev.org/openstack/openstack-ansible-os_placement/commit/6084c248fcae02c413133329b705678cd75c1bfe | 10:26 |
admin1 | i got no issues on placement | 10:26 |
noonedeadpunk | CI can be quite different. As we forcefully enable wheels building there | 10:27 |
admin1 | oh | 10:27 |
noonedeadpunk | And you will see the error if the wheels build is disabled for some reason | 10:27 |
noonedeadpunk | like running with limit is one option | 10:27 |
noonedeadpunk | and it's the result of one "fix" that now evaluates things more properly | 10:28 |
noonedeadpunk | but git issues in other places started arising when wheels are not built | 10:29 |
noonedeadpunk | so would be great if you could push some patch for that | 10:29 |
admin1 | this was in acceptance .. i will deploy tonight in a prod env . .. will have a 100% confirmation then .. | 10:31 |
dokeeffe85 | noonedeadpunk this is the entire file https://paste.openstack.org/show/bNys7633pfauez8qdhQV/ | 10:43 |
noonedeadpunk | and what if you add `-F` to lxc-start? | 10:45 |
noonedeadpunk | I just assume that smth is off with either /var/lib/machines mount or some net interface | 10:47 |
noonedeadpunk | But not sure what's exactly | 10:47 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_tacker master: Add deployment of tacker-scheduler https://review.opendev.org/c/openstack/openstack-ansible-os_tacker/+/861870 | 10:52 |
admin1 | when "external" ceph is enabled, ceph is also not installed in glance | 10:52 |
admin1 | that blocks the playbook on octavia when it wants to upload the amphora image | 10:52 |
admin1 | external ceph using cephadm, | 10:53 |
noonedeadpunk | I think it depends on what backends you've enabled for glance? | 10:54 |
admin1 | user_variables => glance_ceph_client: glance ; glance_default_store: rbd ; glance_rbd_store_pool: images | 10:55 |
noonedeadpunk | `glance_default_store: rbd` is the thing that should do the trick actually | 10:55 |
noonedeadpunk | as that's the condition on when ceph part does run https://opendev.org/openstack/openstack-ansible-os_glance/src/branch/master/tasks/main.yml#L157-L158 | 10:56 |
noonedeadpunk | and `_glance_available_stores: "{{ [ glance_default_store ] + glance_additional_stores }}"` | 10:57 |
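(as a sketch, the user_variables admin1 quotes above in YAML form — `glance_default_store: rbd` is the setting that should trigger the ceph client tasks per the condition linked above, and `glance_additional_stores` is optional:)

```yaml
# user_variables.yml - restating the overrides discussed above.
glance_ceph_client: glance
glance_default_store: rbd          # makes rbd part of _glance_available_stores
glance_rbd_store_pool: images
# glance_additional_stores:        # optional extra stores, also merged in
#   - http
```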
admin1 | i am going to return glance playbook with -vvv and grep rbd/ceph | 10:59 |
admin1 | return -> rerun | 10:59 |
noonedeadpunk | so if you go to interactive python (ie /openstack/venvs/glance-<version>/bin/python) and execute `import rbd` it will fail with import? | 10:59 |
admin1 | yeah .. no /etc/ceph and no packages | 10:59 |
noonedeadpunk | super weird | 11:00 |
admin1 | ceph_pkg_source: distro .. | 11:00 |
admin1 | without this, build does not work on 22.0.4 | 11:00 |
noonedeadpunk | it's 22.04? | 11:00 |
admin1 | yeah | 11:00 |
noonedeadpunk | rly weird | 11:01 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-tests stable/queens: Switch linters to EOL https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861873 | 11:08 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/queens: EOL OpenStack-Ansible Queens https://review.opendev.org/c/openstack/openstack-ansible/+/861868 | 11:08 |
admin1 | noonedeadpunk https://gist.github.com/a1git/5f5a129b62e57cd52c8791f5ecd0d986 | 11:11 |
admin1 | not sure if it helps | 11:11 |
admin1 | noonedeadpunk, is this a good way to force ? openstack-ansible os-glance-install.yml -e 'glance_default_store: rbd' | 11:12 |
noonedeadpunk | but it shows that ceph.conf is being copied | 11:12 |
noonedeadpunk | and ceph packages are symlinked properly | 11:13 |
admin1 | now i see ceph.conf :D | 11:13 |
admin1 | hmm.. | 11:13 |
noonedeadpunk | https://gist.github.com/a1git/5f5a129b62e57cd52c8791f5ecd0d986#file-gistfile1-txt-L418 | 11:14 |
admin1 | i will destroy this container and retry .. could be it fails the first time and then works the 2nd time | 11:14 |
noonedeadpunk | and all tasks are OK actually. Nothing was changed | 11:14 |
noonedeadpunk | um, then tasks would be in changed state | 11:14 |
noonedeadpunk | according to paste I can say nothing was done during this run | 11:15 |
noonedeadpunk | `c1_glance_container-bf88ca5b : ok=108 changed=1 ` and this changed is forceful user creation | 11:15 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/queens: EOL OpenStack-Ansible Queens https://review.opendev.org/c/openstack/openstack-ansible/+/861868 | 11:28 |
admin1 | tracked down the issue of glance not working to " installed ceph-common package post-installation script subprocess returned error exit status 6" | 11:39 |
admin1 | cannot even purge it .. | 11:40 |
admin1 | that is the only diff in relation to ceph i see in cinder vs glance container | 11:41 |
jrosser_ | try installing that by hand and see what the error is | 11:44 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_barbican stable/rocky: Stop old uwsgi service if exist https://review.opendev.org/c/openstack/openstack-ansible-os_barbican/+/861877 | 11:46 |
dokeeffe85 | noonedeadpunk it seems that lxcbr0 doesn't exist https://paste.openstack.org/show/bI9tQbBaDAeXkBgVqT4Z/ | 11:53 |
admin1 | no logs, nothing except /usr/bin/dpkg returned an error code .. | 11:53 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible stable/ussuri: Switch to tracking stable/ussuri for EM release https://review.opendev.org/c/openstack/openstack-ansible/+/853029 | 11:53 |
admin1 | installing strace to go deep | 11:53 |
admin1 | no logs and nothing | 11:53 |
noonedeadpunk | dokeeffe85: try restarting systemd-networkd | 11:54 |
noonedeadpunk | lxcbr0 is managed with it | 11:54 |
admin1 | write(6, "{\"jsonrpc\":\"2.0\",\"method\":\"org.d"..., 100) = 100 -- read(6, 0x562324022690, 4096) = -1 ECONNRESET (Connection reset by peer) -- EBADF (Bad file descriptor) | 11:55 |
admin1 | i am going to deploy it in prod and see if i face the same issue or not | 11:56 |
noonedeadpunk | so, train seems to be fixed way easier than ussuri | 11:56 |
admin1 | one quick question .. is there any override to tell glance and cinder to use the same container for instance :D | 11:56 |
dokeeffe85 | noonedeadpunk, no joy with that. "brctl show" only lists mgmt, storage & vxlan bridges | 11:56 |
admin1 | same repos, same packages .. ceph-common is good in one, bad in another | 11:56 |
admin1 | i have already lxc-destroyed the containers to recreate | 11:57 |
admin1 | so will try in a prod env now to make sure its not my env . | 11:57 |
noonedeadpunk | dokeeffe85: as I said check systemd-networkd | 11:58 |
dokeeffe85 | noonedeadpunk yep I restarted it and then tried restarting the container with -F and same result | 11:59 |
jrosser_ | if lxcbr0 is missing you need to look at the service that creates it and see why it is missing | 12:01 |
jrosser_ | there is no point moving on to restarting the container until the bridge is there | 12:01 |
admin1 | ip link set dev lxcbr0 up ? | 12:03 |
dokeeffe85 | lxcbr0 doesn't exist. Not sure which service creates it, jrosser_; it was all working fine until a reboot of the server, so it was created initially | 12:13 |
jrosser_ | you have /etc/network/interfaces.d/lxc-net-bridge.cfg ? | 12:16 |
admin1 | dokeeffe85 is it an aio ? | 12:25 |
admin1 | single node all in 1 install | 12:25 |
admin1 | dokeeffe85, reboot -- was it after some update/upgrade of packages ? | 12:26 |
noonedeadpunk | admin1: regarding the override for glance/cinder - it's definitely smth env.d related | 12:37 |
noonedeadpunk | should be doable | 12:37 |
noonedeadpunk | like create /etc/openstack_deploy/env.d/glance.yml with https://paste.openstack.org/show/bVCyhed2fVR46gT2FQPu/ | 12:39 |
noonedeadpunk | not 100% sure about that, so it's worth backing up openstack_inventory just in case :D | 12:39 |
noonedeadpunk | actually... it likely won't work, as the glance playbook won't run since no hosts will be in glance_all | 12:40 |
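(the paste above isn't reproduced here; purely to illustrate the env.d shape being discussed, an unverified sketch might look like the following — the container/component names are assumptions, and as noted the glance_all group mapping may still be a problem:)

```yaml
# /etc/openstack_deploy/env.d/glance.yml - rough, unverified sketch of
# co-locating glance_api in the cinder API container. Back up
# /etc/openstack_deploy/openstack_inventory.json before experimenting.
container_skel:
  cinder_api_container:
    contains:
      - cinder_api
      - glance_api      # assumption: add the glance component here
  glance_container:
    contains: []        # and empty out the dedicated glance container
```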
dokeeffe85 | jrosser_ yep I have that file. admin1 nope it's not aio. We had a power cut and I lost the three servers | 12:40 |
jrosser_ | admin1: i dont think that combining glance and cinder is useful to address whatever ceph trouble you had | 12:41 |
jrosser_ | as usual root cause needs to be found | 12:41 |
noonedeadpunk | then it's likely worth messing with other parts of env.d | 12:41 |
admin1 | yeah .. working to deploy in prod with full logs .. | 12:41 |
jrosser_ | dokeeffe85: then you can try to 'ifup' the interface | 12:41 |
admin1 | so that i can share | 12:41 |
noonedeadpunk | but should be overall doable | 12:41 |
jamesdenton | FWIW: I have resigned myself to adding an lxcbr0 bridge to my netplan config to avoid the issue mentioned (not being recreated on reboot). I could not easily replicate the issue, and since a) we do baremetal in prod and b) we don't often reboot controllers, i don't see it in the wild, either | 12:44 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Mark Zaqar as deprecated in role matrix https://review.opendev.org/c/openstack/openstack-ansible/+/861884 | 12:45 |
noonedeadpunk | well, there are 2 patches around for this topic - the first to move lxcbr0 fully to networkd, and the second to allow avoiding OSA trying to create it | 12:46 |
noonedeadpunk | (and manage) | 12:46 |
jamesdenton | /thumbsup | 12:47 |
*** frenzy_friday is now known as frenzyfriday|rover | 12:52 | |
dokeeffe85 | jrosser_ nope I can't as the interface doesn't exist, I can't see it anywhere. jamesdenton can you give me a paste of that netplan bridge you created please? | 13:01 |
jrosser_ | ifup is the command to bring up the interface with /etc/network/... type definitions afaik | 13:02 |
jrosser_ | it doesnt need to exist, you need to make it exist with some command | 13:02 |
jamesdenton | https://paste.opendev.org/show/bLtNoxDfwJGIHg9t2KQp/ -- netplan apply would bring it up in this case | 13:02 |
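(the paste itself isn't preserved; a netplan sketch of that kind of statically defined lxcbr0 — the 10.0.3.1/24 address matches OSA's default lxc_net_address but may differ in your deployment:)

```yaml
# /etc/netplan/99-lxcbr0.yaml - illustrative only; bring up with `netplan apply`.
network:
  version: 2
  bridges:
    lxcbr0:
      addresses:
        - 10.0.3.1/24
      parameters:
        stp: false
        forward-delay: 0
```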
admin1 | noonedeadpunk, i am trying the env.d override to have cinder/glance in the same container .. and changed the line in setup-openstack to install cinder first and then glance .. lets see what that does | 13:05 |
admin1 | if it works, then i can destroy this env and again start fresh | 13:05 |
jrosser_ | admin1: we already warned you about the ansible groups being empty doing that | 13:06 |
admin1 | :) | 13:07 |
jrosser_ | i cannot understand doing this rather than just debug an apt problem | 13:07 |
jrosser_ | tbh there are expected to be issues as you're attempting something thats not tested | 13:08 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Remove usage of rsyslog roles https://review.opendev.org/c/openstack/openstack-ansible/+/861886 | 13:11 |
jamesdenton | jrosser_ Re: OVN ref arch: I am thinking of creating three -- 1) 3 controller nodes + X compute nodes, with DVR and computes as gateway chassis. 2) 3 controller node + 3 network node (gateway chassis) + X compute (non-gateway). 3) 3 controller/network node (gateway chassis) + X compute node. I think that mirrors most of today's deployments. Thoughts? | 13:18 |
jrosser_ | yes that would cover it - though i'm not sure they actually call it DVR these days as it's a little different | 13:19 |
jrosser_ | i | 13:20 |
jamesdenton | i'll double check | 13:20 |
admin1 | jamesdenton, these days deployment demands are more towards HCI with ceph .. so 3x nodes ( controllers ) + 3x nodes ( hypervisor ) .. no network nodes .. the 3x controllers + 3x hypervisors all have ceph running .. | 13:22 |
admin1 | compute also act as network nodes | 13:22 |
jrosser_ | really? | 13:22 |
jamesdenton | Yes, I have seen that pushed more lately. | 13:22 |
admin1 | jrosser_ yes :) | 13:23 |
dokeeffe85 | Thanks jamesdenton that worked as far as starting the containers but there's other issues now after the reboot that I'll have to dig a bit deeper on before I ask any questions | 13:23 |
jamesdenton | dokeeffe85 sure, just let us know. | 13:23 |
dokeeffe85 | Will do thanks | 13:23 |
jrosser_ | must be fun constraining the memory on a combined ceph/hypervisor when things go $wrong in ceph | 13:24 |
admin1 | hypervisor is only doing the role of the osd | 13:24 |
admin1 | and not monitors and others .. its the controllers that do them | 13:24 |
jrosser_ | steady state yeah whatever, but it can get out of hand real quick on the OSD when things are "broken" | 13:24 |
jamesdenton | admin1 since we decouple OSA from Ceph deployment, I'll let the operator tack Ceph onto any one of those scenarios | 13:25 |
jrosser_ | lose a portion of your cluster due to a switch or some other problem and the memory usage can get very high | 13:25 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible master: Add release note about used ansible and ceph versions https://review.opendev.org/c/openstack/openstack-ansible/+/861889 | 13:26 |
jamesdenton | Anyone here using OSA+OpenDaylight? Or OSA+NSX-T? | 13:27 |
noonedeadpunk | hyperconverged infra is always "fun". The most fun for osd+hypervisor is RAM consumption, as you always need RAM for domains but OSDs also require RAM. So I think I would reserve a lot of RAM for the hypervisor in placement if I had to do that | 13:28 |
noonedeadpunk | there's a bunch of fixes passing for stable branches - would be good if we could merge them sooner rather than later, before they break again :D https://review.opendev.org/q/parentproject:openstack/openstack-ansible+branch:%255Estable/.*+status:open+label:Verified | 13:32 |
jrosser_ | huh lots of depends-on there | 13:35 |
jrosser_ | need to get the order right | 13:35 |
noonedeadpunk | yeah, quite some... But I expected it to be worse tbh | 13:36 |
noonedeadpunk | it's mostly rabbitmq/erlang thing | 13:36 |
noonedeadpunk | that's broken even back to Rocky | 13:37 |
noonedeadpunk | But I'm not sure I have enough motivation now to fix Rocky.... | 13:37 |
noonedeadpunk | and rocky not in terms of distro but in terms of openstack release | 13:40 |
noonedeadpunk | we should EOL it to get rid of this confusion lol | 13:40 |
jamesdenton | mgariepy I seem to recall you had some patches to os_neutron for placing OVN gateway chassis? versus all ovn-controllers being gateways? Or maybe you just mentioned wanting to do it, can't recall | 13:43 |
jamesdenton | if not, i can do it as part of this doc exercise | 13:43 |
mgariepy | let me look | 13:44 |
jamesdenton | My plan is to split out the gateway logic from the neutron_ovn_controller group, create a second group named neutron_ovn_gateway_chassis, and just manipulate inventory accordingly. | 13:46 |
mgariepy | https://review.opendev.org/c/openstack/openstack-ansible-os_neutron/+/760647 | 13:47 |
jamesdenton | yeah ok, same concept | 13:47 |
mgariepy | yep. | 13:47 |
mgariepy | add me to review i can help if you want :D | 13:47 |
jamesdenton | great. I'll resurrect that, thank you! | 13:47 |
jamesdenton | definitely | 13:47 |
jamesdenton | i'll put together some steps to duplicate this in 6-9 VMs, depending on the scenario | 13:48 |
noonedeadpunk | So regarding zookeeper. I was using fork of this repo https://opendev.org/windmill/ansible-role-zookeeper/src/branch/master | 13:49 |
noonedeadpunk | And I do see how we will struggle with it | 13:50 |
noonedeadpunk | Maybe worth trying to reach pabelanger but I kind of doubt he will be eager to add config_template... | 13:52 |
admin1 | destroyed everything .. retrying again .. | 13:53 |
admin1 | this time i will log everything from start | 13:53 |
noonedeadpunk | as he was not willing to listen about configuring cluster out of the box (https://github.com/openstack-archive/ansible-role-zookeeper/commit/b223d56660ea21f0feb8c6c7bf27dd4bac07a7fe) | 13:54 |
opendevreview | Merged openstack/openstack-ansible stable/queens: EOL OpenStack-Ansible Queens https://review.opendev.org/c/openstack/openstack-ansible/+/861868 | 13:55 |
admin1 | kolla+ovn works out of the box ( to get inspired from ) | 13:55 |
jamesdenton | it's not inspiration that's lacking :D | 13:56 |
jamesdenton | only time | 13:56 |
admin1 | getting inspired is subtly saying to copy to save time :) | 13:56 |
jamesdenton | it does work for OSA, too. Just looking to make it more complete so we can make it the default vs LXB | 13:56 |
admin1 | i also know ovn is now in starlingx .. but have not tested the latest one | 13:56 |
jamesdenton | yes, things have been borrowed here and there :) | 13:57 |
admin1 | no need to reinvent the wheel if it works | 13:57 |
jamesdenton | but I think OSA does a better job of spelling out different deployment scenarios versus some of the others, which tend to be a little more... prescriptive/opinionated | 13:57 |
admin1 | yes .. which is why i stick to osa for all prod and (paid) work .. and rest of the time, test others | 13:57 |
admin1 | we all are operators and so also OSA is better .. 2 weeks back i tried kolla + ovn and could not get octavia to work .. tried for 2 weeks .. zero replies :) | 13:59 |
admin1 | kolla/docker works .. and so when it works, there is no interaction because it just works and people have no questions .. and when it breaks or does not work for some reason, no one knows how to answer | 14:01 |
noonedeadpunk | fwiw we're having operator hours now in https://www.openinfra.dev/ptg/rooms/folsom | 14:01 |
jamesdenton | i am delayed by 25 more min | 14:04 |
opendevreview | Merged openstack/openstack-ansible-tests stable/ussuri: Restrict pyOpenSSL to less then 20.0.0 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861742 | 14:10 |
opendevreview | Merged openstack/openstack-ansible-tests stable/train: Restrict pyOpenSSL to less then 20.0.0 https://review.opendev.org/c/openstack/openstack-ansible-tests/+/861831 | 14:10 |
dokeeffe85 | jamesdenton I left my desk for a bit and when I came back I now have a horizon dashboard and my VM's are up and can ping out, so I don't know what happened but it's back. Thanks everyone | 14:47 |
*** dviroel is now known as dviroel|lunch | 15:46 | |
nixbuilder | I am confused (as is normal) about vxlan for private tenant networks. The documentation here (https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts-networkconfig.html) says that "Note that br-vxlan is not required to be a bridge at all, a physical interface or a bond VLAN subinterface can be used directly and will be more efficient." So what parameters do I use in openstack_user_config to use a | 16:17 |
nixbuilder | physical interface? | 16:17 |
noonedeadpunk | basically what you need is a consistent interface name across net nodes and compute nodes | 16:24 |
noonedeadpunk | It can be a bridge, but you can also name the interface vxlan in netplan or systemd-networkd | 16:24 |
jrosser_ | is it eventually what is specified here https://github.com/openstack/openstack-ansible/blob/master/etc/openstack_deploy/openstack_user_config.yml.example#L273 | 16:25 |
noonedeadpunk | to make it less confusing you might use `host_bind_override: $name` there as well | 16:26 |
noonedeadpunk | (iirc) | 16:26 |
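(as a sketch of what that looks like in openstack_user_config, based on the example file linked above but with the bridge swapped for a plain VLAN subinterface — the interface name, range and group_binds are placeholders and depend on the ML2 driver in use:)

```yaml
# openstack_user_config.yml - illustrative vxlan entry using a physical
# VLAN subinterface instead of br-vxlan; adjust names to your deployment.
global_overrides:
  provider_networks:
    - network:
        container_bridge: "bond1.30"   # plain subinterface, no bridge needed
        container_type: "veth"
        container_interface: "eth10"
        ip_from_q: "tunnel"
        type: "vxlan"
        range: "1:1000"
        net_name: "vxlan"
        group_binds:
          - neutron_linuxbridge_agent  # or the OVS/OVN agent group
```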
jamesdenton | i would expand that to say, the container_* bits are probably not important these days since neutron agents are on metal and not in LXC anymore (right?), but there is logic that uses the range based on type to populate ml2_conf.ini and the agent configs. we could probably stand to test/update this | 16:26 |
jamesdenton | host_bind_override would only be applicable to vlan type, i think | 16:26 |
jrosser_ | ultimately whatever the value of "{{ tunnel_address }}" is what matters in os_neutron | 16:27 |
noonedeadpunk | iirc it's applied everywhere in bare metal hosts | 16:27 |
jamesdenton | right, so i might try a deployment and it would be nice if we could eliminate container_* since it's irrelevant | 16:27 |
jrosser_ | it is *hugely* confusing | 16:27 |
*** dviroel|lunch is now known as dviroel | 16:28 | |
noonedeadpunk | or you can just define neutron_provider_networks like here https://opendev.org/openstack/openstack-ansible-os_neutron/src/branch/master/defaults/main.yml#L392-L399 in user_variables and forget about openstack_user_config :D | 16:28 |
jamesdenton | or that :) | 16:28 |
noonedeadpunk | (in terms of vxlan/vlan/flat nets) | 16:28 |
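(a sketch of that approach — the key names are recalled from os_neutron's defaults and should be checked against the linked defaults/main.yml; values are placeholders:)

```yaml
# user_variables.yml (or group_vars) - define the provider network data
# directly and skip the openstack_user_config provider_networks entries.
neutron_provider_networks:
  network_types: "vxlan,vlan"
  network_vxlan_ranges: "1:1000"
  network_vlan_ranges: "physnet1:100:200"
  network_mappings: "physnet1:br-provider"
  network_interface_mappings: "br-provider:bond1"
```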
jamesdenton | 1,000 ways... to shoot yourself in the foot | 16:29 |
noonedeadpunk | yup, seems we tried our best to make it confusing... | 16:29 |
noonedeadpunk | and really huge part is just historical | 16:29 |
noonedeadpunk | but we indeed need to review our docs | 16:30 |
jamesdenton | nixbuilder i think what we're saying is, that you need a VLAN dedicated to overlay traffic and that vlan interface can have an IP on it (the TEP) or the vlan interface could be in a bridge (i.e. br-vxlan) that has an IP on it (the TEP). Based on some logic, that IP will be automatically discovered and used for local_ip in neutron config files | 16:30 |
jamesdenton | noonedeadpunk agreed. it all comes back to docs <crying emoji> | 16:31 |
jamesdenton | nixbuilder it may be easier to not define at all in openstack_user_config, and instead use what jrosser_ mentioned. You're then defining everything manually | 16:32 |
noonedeadpunk | tbh what I like about metal deployments is how clean your openstack_user_config is.... | 16:33 |
nixbuilder | Thanks everyone... I'll put on another pot of coffee and digest all this. Thanks again. | 16:34 |
jrosser_ | https://github.com/openstack/openstack-ansible-os_neutron/blob/36a2f02561b9281ee7e46287601f2d21a7fbc142/defaults/main.yml#L385-L386 | 16:34 |
jamesdenton | sure, thanks for asking. it prods us to update the docs | 16:34 |
jamesdenton | thats a bad default | 16:34 |
jamesdenton | lol | 16:34 |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-lxc_hosts stable/ussuri: Use legacy image retrieval for CentOS 7 https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/861744 | 16:36 |
jrosser_ | i wonder if theres actually much point at all keeping the vxlan part of `provider_networks` | 16:36 |
jrosser_ | maybe it's needed for feeding into OVS or something, i don't know | 16:36 |
jamesdenton | well, i think the only benefit is the range logic. | 16:36 |
jamesdenton | we'd have to push that into a separate var or doc or whatever | 16:37 |
jrosser_ | but it could instead be just two obviously named vars | 16:37 |
jrosser_ | the address and the range | 16:37 |
jamesdenton | sure. we're gonna find out here shortly | 16:37 |
jamesdenton | i never liked all of those being "provider" networks, anyway. confusion. | 16:38 |
jrosser_ | imho it sort of turns into "container networks" which really is what you might want this for, wiring up the control plane | 16:38 |
jamesdenton | true | 16:38 |
jrosser_ | the neutron-ness of it is somehow making it conceptually really hard | 16:39 |
jrosser_ | but then also we do use it to create OVS bridges with the right settings? | 16:39 |
jamesdenton | not in this case, that only applies to vlan | 16:39 |
jrosser_ | :) | 16:39 |
jamesdenton | for vxlan, since the ip logic is separate, i think it's only the container wiring (not relevant) and range stuffs. So maybe new vars is the way to go, but then also need to keep the provider_networks override mechanism there, too. | 16:41 |
* noonedeadpunk is also confused by now | 16:42 | |
jamesdenton | for vlan, yeah, i think we use container_bridge for building the ovs bridge | 16:42 |
noonedeadpunk | I think I need to get mnaio admin1 promoted to really have a good play with all options, to come up with what we can drop and how to simplify | 16:43 |
jamesdenton | agreed | 16:43 |
jrosser_ | i dont specify any of this at all btw | 16:43 |
noonedeadpunk | We do have neutron_provider_networks | 16:44 |
jamesdenton | you're using neutron_provider_networks? | 16:44 |
jrosser_ | let me look | 16:44 |
noonedeadpunk | but I see lots of crap in openstack_user_config as well for $reason | 16:44 |
jamesdenton | yeah, i guess that was intended as the one-stop-shop for abstracting this stuff out? but these days specifying things in both places gets confusing | 16:45 |
jrosser_ | https://paste.opendev.org/show/bQFDXRzwYupWNh14989p/ | 16:46 |
jamesdenton | i see | 16:46 |
jrosser_ | its the same interface name everywhere - except where it's not | 16:46 |
noonedeadpunk | yeah, it's yet another option I'd say | 16:46 |
jamesdenton | well, we never defined the interface, anyway | 16:46 |
jamesdenton | i think we're relying on matching the CIDR? | 16:46 |
jrosser_ | so its easier to just not bother trying to do it in o_u_c and just do that in group_vars instead | 16:46 |
noonedeadpunk | yeah, I tend to agree here | 16:48 |
jrosser_ | there might have been a neater way to do that, but if you have non-uniformity then the provider_networks thing is very hard to use | 16:48 |
jrosser_ | imho variables are better | 16:48 |
jrosser_ | because you can have them in user_variables globally -> uniform deployment | 16:48 |
noonedeadpunk | but again, why would you have non-uniformity given you can name the interface in netplan/systemd-networkd | 16:48 |
jrosser_ | or you can put them wherever you need in group_vars | 16:48 |
noonedeadpunk | But yes, provider_networks is quite complex/unobvious for the neutron use case | 16:49 |
jrosser_ | naming interfaces is hard | 16:50 |
noonedeadpunk | so I'd rather place them somewhere else, and leave provider_networks only for the containers use case | 16:50 |
jrosser_ | as you need to have the mac of everything recorded somewhere to deploy those names | 16:50 |
noonedeadpunk | well... In maas and ironic you have I believe? | 16:50 |
jamesdenton | so, yank all of the neutron-specific "provider_networks" from o_u_c and direct folks (via docs) to using the neutron_provider_networks override? either as group or host vars? or global? | 16:51 |
jrosser_ | and fun times as the behaviour is surprising between focal/jammy and focal/focal+HWE kernel | 16:51 |
noonedeadpunk | jamesdenton: I'd say yes? | 16:51 |
jamesdenton | i just went thru this exercise yesterday... real PITA. (interface naming). | 16:51 |
jrosser_ | like PXEboot into focal, mess with the interfaces, install HWE kernel, reboot -> WTF | 16:51 |
jamesdenton | noonedeadpunk i think that's fair. keep the logic for upgraded deployments but remove the doc examples | 16:51 |
noonedeadpunk | yeah, fair | 16:51 |
jrosser_ | jamesdenton: if you have good ideas about interface naming would be interesting | 16:52 |
jrosser_ | we talked about it here this week off the back of nvidia/mlx changing theirs again | 16:52 |
noonedeadpunk | I totally want to get rid of br-vlan/br-vxlan naming.... | 16:52 |
jrosser_ | but did not have a great plan | 16:52 |
noonedeadpunk | but indeed seems that bridge is still really consistent in terms of naming... | 16:53 |
jamesdenton | pfft. seems we had been relying on biosdevname but our recent deploys don't have it. Just had to implement *.link files based on driver and using PATH as the name. Names are obnoxious, but consistent. But even between drivers you can have mild variance in the same slot (ens1f0s0 vs ens1f0s0np0) or something like that | 16:53 |
jamesdenton | yes, i think eliminating those bridges is a Good Thing™ | 16:54 |
jrosser_ | yes that np0 thing is what caught us out | 16:54 |
jrosser_ | theres now np<N> and nv<N> or smth for PF vs. VF | 16:54 |
jamesdenton | but it does add consistency. i don't like br-ex -> br-vlan -> bond1 though | 16:54 |
noonedeadpunk | true... that's why they're still there. | 16:55 |
noonedeadpunk | I feel physical pain for ppl who have br-vlan.100 added to neutron-lxb bridge though... | 16:55 |
jamesdenton | jrosser_ i have no good ideas, just complaints. | 16:56 |
jrosser_ | understood - we decided to do nothing and just fix everything up for the new names across focal->jammy | 16:56 |
jrosser_ | there was no good answer | 16:56 |
jamesdenton | noonedeadpunk Bridgeception | 16:57 |
jamesdenton | non-persistent persistent naming. gotta love it. | 16:57 |
mgariepy | i rename all the interfaces; this way i can upgrade the OS without having a surprise name when i upgrade ;p | 17:07 |
jamesdenton | so, i was under the impression that in order to rename an interface in netplan it first had to be identified as something (ie. rename ens1f0 -> management) but that if the interface came up as ens1f0np0 first, then it wouldn't work? Maybe you then also have to specify something more specific (like MAC?) | 17:08 |
mgariepy | i use MAC | 17:09 |
mgariepy | but i guess it would be too much work to do it via ansible for physical hw. | 17:10 |
jamesdenton | then i guess you just have to be conscious of a chassis swap or nic swap or something? maybe not too common | 17:10 |
jamesdenton | i think some of the tech debt we have in OSA is trying to be too clever | 17:10 |
mgariepy | i don't swap nic often. | 17:10 |
mgariepy | a couple of years back netplan was not too great about it either.. using the mac when adding vlans, it was trying to rename the vlan interface as well ... | 17:12 |
mgariepy | fun times. | 17:12 |
jamesdenton | seems to have matured a bit | 17:13 |
mgariepy | yeah it works ok now. | 17:13 |
opendevreview | Merged openstack/openstack-ansible-os_rally stable/ussuri: Move rally details to constraints https://review.opendev.org/c/openstack/openstack-ansible-os_rally/+/861730 | 17:15 |
opendevreview | Merged openstack/openstack-ansible-lxc_hosts stable/ussuri: Use legacy image retrieval for CentOS 7 https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/861744 | 17:15 |
mgariepy | https://netplan.io/reference#common-properties-for-physical-device-types | 17:19 |
mgariepy | you can match by driver also. | 17:20 |
mgariepy | but i do have my full inventory with macs most of the time so it's quite easy for me to set it up with renaming :D | 17:21 |
mgariepy | with interface renaming i get some consistency also. | 17:23 |
mgariepy | all my servers have the first 25G interface named 25G-1 whatever pci slot/brand it is. | 17:25 |
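(a minimal netplan sketch of that renaming — the MAC address is a placeholder; `match`/`set-name` are standard netplan properties, and the reference linked above also allows matching by driver instead:)

```yaml
# /etc/netplan/10-rename.yaml - illustrative rename to a speed-based name.
network:
  version: 2
  ethernets:
    25G-1:
      match:
        macaddress: "aa:bb:cc:dd:ee:01"   # placeholder MAC from inventory
      set-name: 25G-1
```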
mgariepy | you can probably get the info from setup module, and filter by speed/type and so on. | 17:26 |
spatel | I have a question folks, by mistake i have deleted the monitoring user in galera and now haproxy is saying mysql is down | 17:54 |
spatel | How do i quickly add monitoring user back ? | 17:54 |
spatel | what does that monitoring user have to do with haproxy? | 17:58 |
jamesdenton | might be as easy as: CREATE USER 'monitoring' IDENTIFIED BY '{{ galera_monitoring_user_password }}'; | 17:59 |
admin1 | you can do grant connect,select on *.* to monitoring@'%' identified by 'password from secrets' | 17:59 |
jamesdenton | https://github.com/openstack/openstack-ansible-galera_server/blob/5200b50cf650fb5ad5e0733b9e0ead207dbf6c6a/vars/main.yml#L31-L51 | 17:59 |
admin1 | well, you can just add quickly and later fix the specific permissions | 17:59 |
spatel | jamesdenton i didn't find any password in /etc/openstack_deploy/user_secrets.yml | 18:00 |
admin1 | spatel the pass could also be in the haproxy cfg | 18:01 |
spatel | Nothing here - cat /etc/openstack_deploy/user_secrets.yml | grep galera_monitoring_user_password | 18:01 |
jamesdenton | ok hmm | 18:02 |
spatel | nothing in haproxy.cfg file | 18:02 |
mgariepy | in the clustercheck script inside the galera container | 18:04 |
spatel | This is bizarre.. :( | 18:04 |
jamesdenton | looks like a fairly recent addition i guess | 18:04 |
spatel | This is the script - https://paste.opendev.org/show/bYkLV1m5l2ZjrlVjObsa/ | 18:06 |
spatel | No password there, may be it use mysql root password? | 18:07 |
spatel | MYSQL_PASSWORD="${2-}" ? | 18:07 |
jamesdenton | maybe theres no password? | 18:07 |
jamesdenton | i think that's a hash | 18:08 |
anskiy | there is a password, I can see it in user_secrets | 18:08 |
opendevreview | Merged openstack/openstack-ansible-rabbitmq_server stable/train: Use cloudsmith repo for rabbit and erlang https://review.opendev.org/c/openstack/openstack-ansible-rabbitmq_server/+/861794 | 18:08 |
spatel | anskiy why my user_secret not showing it | 18:08 |
jamesdenton | https://github.com/openstack/openstack-ansible/commit/302c8226e6ea51d9e0c76050b470d525dfb33d60 | 18:08 |
spatel | anskiy what is the name in user_secret? | 18:09 |
jamesdenton | see the release note? i think it implies there was no password | 18:09 |
anskiy | spatel: I think during the upgrade to Y, when you ran the script to check for missing secrets, you'd have had to add one | 18:09 |
jamesdenton | You can also override variable to | 18:09 |
jamesdenton | ``galera_monitoring_user_password: ""`` to not use password for auth and | 18:09 |
jamesdenton | preserve previous behaviour | 18:09 |
jamesdenton | maybe just start with creating the monitoring user w/ no pass and see if that works | 18:10 |
spatel | Now what should i do to bring it back quickly and fix my production :( | 18:10 |
spatel | I didn't know its so important | 18:10 |
anskiy | spatel: `galera_monitoring_user_password` is the variable name in my user_secrets | 18:11 |
spatel | anskiy i don't have that in my user_sec file | 18:11 |
jamesdenton | spatel what OSA version? anskiy what OSA version? | 18:11 |
spatel | Wallaby 23.3.0 | 18:12 |
anskiy | jamesdenton: I've added it when I've upgraded to Y (that was stable/yoga at the time) | 18:12 |
anskiy | oh, you're on W, nvm then, I guess... | 18:12 |
jamesdenton | right, so Wallaby wouldn't have that logic | 18:12 |
jamesdenton | spatel create a monitoring user with no password in galera | 18:13 |
jamesdenton | that ought to do it | 18:13 |
spatel | let me do ... | 18:13 |
spatel | done! - CREATE USER 'monitoring' IDENTIFIED BY ''; | 18:14 |
spatel | Look like that fixed my issue jamesdenton | 18:15 |
jamesdenton | good deal | 18:15 |
spatel | haproxy is happy now | 18:15 |
spatel | Thank you so much jamesdenton | 18:16 |
spatel | This is total mess :) | 18:16 |
jamesdenton | as anskiy mentioned, once you upgrade you may need to update secrets to add that var, as a password will be required | 18:16 |
jamesdenton | ^^ release notes ought to cover this | 18:16 |
admin1 | ^^ yes .. release note was there | 18:17 |
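(on upgrade, the variable anskiy mentions just needs to exist in user_secrets — a sketch; on current branches it would normally be generated with scripts/pw-token-gen.py rather than set by hand:)

```yaml
# /etc/openstack_deploy/user_secrets.yml - placeholder value shown;
# generate a real secret, e.g. with scripts/pw-token-gen.py.
galera_monitoring_user_password: "REPLACE_WITH_GENERATED_SECRET"
```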
admin1 | spatel, still have ovn on prod ? | 18:18 |
admin1 | any issues so far ? | 18:18 |
spatel | Yes i am running ovn in production without any issue. | 18:18 |
Adri2000 | hello jrosser_, I filed this bug about an issue with ansible-role-pki, would appreciate your input when possible :) thanks https://bugs.launchpad.net/openstack-ansible/+bug/1993575 | 19:00 |
Adri2000 | I took the opportunity to close this old (2016) related wishlist bug that's actually fixed since the introduction of openstack_host_ca_certificates: https://bugs.launchpad.net/openstack-ansible/+bug/1649844 (apparently I'm also the reporter of this bug report!) | 19:05 |
jrosser_ | Adri2000: the run_once certainly looks like it could be an issue | 19:08 |
jrosser_ | which playbook would you be expecting to install this CA for you? | 19:08 |
jrosser_ | either playbooks/containers-lxc-create.yml or playbooks/openstack-hosts-setup.yml i expect | 19:10 |
jrosser_ | if you are able to test it out removing that run_once it would be helpful - even better a patch :) | 19:10 |
Adri2000 | jrosser_: playbooks/containers-lxc-create.yml, as I'm targeting the Keystone containers. for now the workaround I used is to run the playbook limited to Keystone containers only, so the task will run_once on a Keystone container and therefore will have the variable correctly set. I guess removing run_once should work, but I'll test to be sure, and can push it as a patch | 19:15 |
Adri2000 | then | 19:15 |
jrosser_ | seems i used rather too much run_once in the PKI role | 19:15 |
jrosser_ | Adri2000: also take a look in vars/main.yml of the PKI role - there are other ways in there to provide your own certs in as many variables as you like | 19:18 |
Adri2000 | interesting, didn't know that | 19:28 |
*** dviroel is now known as dviroel|biab | 20:47 | |
*** dviroel|biab is now known as dviroel | 21:59 | |
*** dviroel is now known as dviroel|out | 23:03 |