Monday, 2019-12-16

openstackgerritzhangxuanyuan proposed openstack/octavia master: Remove option status_update_threads
openstackgerritzhangxuanyuan proposed openstack/octavia master: Remove configuration option 'amp_ssh_access_allowed'
*** tesseract has quit IRC08:11
hkominosmorning all. Is it possible to get ssh access to the loadbalancer to check the logs ? (maybe obtain octavia-ssh-key somehow )11:12
cgoncalveshkominos, it is. connect to the amphora on the lb-mgmt-net. get the IP from "openstack loadbalancer amphora list"11:17
cgoncalvesuser name depends on the distribution you're using for the amphorae. given in your case it's TripleO deployment, the amphorae are CentOS thus username 'centos' (or cloud-user? I always mix up)11:18
cgoncalveshkominos, this doc should apply to TripleO upstream too:
hkominoscgoncalves: surprisingly, "openstack loadbalancer amphora list" is empty11:31
hkominosI can find the internal ips from the console11:31
hkominosof course11:31
hkominosthe boot console which shows a user ssh key inserted and an ip11:32
cgoncalveshkominos, that's odd. is the list empty or your user does not have admin permissions?11:32
cgoncalvesit might also be that something wrong happened that led octavia to delete the amphora but in the process it failed to delete it in nova11:33
cgoncalvesaka zombie amphora11:33
hkominosso it seems that the instance has been deleted in nova as well11:33
hkominosSo octavia does lifecycle mgmt as well ?11:33
cgoncalveshkominos, lifecycle mgmt of its resources, yes11:34
cgoncalvesfor example, if the octavia health manager service for some reason detected that the amphora is not healthy it triggers an amphora failover11:35
hkominosI assume this is what happened. I am just trying to log in to the VM to see the logs and find out why I am seeing this networking (?) error.
cgoncalveshkominos, I'm guessing the certificates were not well configured. did you let TripleO handle them or did you specify your own?11:43
hkominosI created a set of certificates according to the instructions. But if the handshake fails I would see it in worker.log.11:45
hkominosNow I see something like 2019-12-16 11:37:46.574 24 INFO octavia.certificates.generator.local [-] Signing a certificate request using OpenSSL locally. 2019-12-16 11:37:46.574 24 INFO octavia.certificates.generator.local [-] Using CA Certificate from config. 2019-12-16 11:37:46.574 24 INFO octavia.certificates.generator.local [-] Using CA Private Key 11:46
hkominosfrom config. 2019-12-16 11:37:46.574 24 INFO octavia.certificates.generator.local [-] Using CA Private Key Passphrase from config. 11:46
hkominosso I assume it all went well11:46
hkominosSo i want to connect into the VM11:47
hkominosBut I am missing the private key I guess11:47
hkominosthe ssh private key i mean to connect from the controllers11:49
cgoncalveshkominos, not necessarily. those messages are logged when the worker creates the server-side (amphora) certificate. later, the worker tries to TLS connect to the amphora. if the two cannot establish a handshake, the connection fails and thus the amphora create reverts (that is why your amphora list output is empty)11:52
cgoncalveshkominos, TripleO upstream does not document how to provide your own certificates. OSP documentation does but is not very clear at the moment. try this (credits to johnsom):
hkominoscgoncalves: so we follow the instructions from RDO (kinda). Some of the config does not propagate correctly at the moment, but I have access to the controllers so I log in and manually inject the client and the server certificates in the containers. And it seemed to have worked until now11:58
hkominosPerhaps something is not injected properly to the VM though11:58
hkominosI will try again11:58
*** pcaruana has quit IRC12:27
*** hkominos has joined #openstack-lbaas14:27
hkominoscgoncalves: Are you also part of octavia integration in OOO ?14:27
cgoncalveshkominos, yes14:28
hkominosdo you mind if I run debug log by you ? I am trying to rerun the octavia installation with the new keys but some ssh key on the undercloud is giving me trouble14:28
cgoncalveshkominos, sure14:29
hkominosThis is the failing step: ansible-playbook -i inventory.yaml --extra-vars group_vars /usr/share/tripleo-common/playbooks/octavia-files.yaml  --private-key /var/lib/mistral/overcloud/ssh_private_key 14:32
hkominosI see that the error is found here : /usr/share/openstack-tripleo-common/playbooks/roles/octavia-undercloud/tasks/main.yml L35 14:34
hkominoswhich is an ssh key check14:34
hkominosmaybe created here ?
cgoncalveshkominos, tripleo rocky?14:42
hkominos-rwxr-xr-x. 1 42430 42430     1675 Dec 16 12:58 ssh_private_key 14:43
hkominosI just wanted to add that the permissions for the key seem ok. I see the ssh key from the mistral container.14:44
hkominosbut I think the test seems to think that it is unreadable for some reason14:45
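A quick way to look at a key file the way such a check might is to inspect its mode bits and readability directly. The sketch below is only an illustration and not the logic of the TripleO task at main.yml L35, which is not shown in this log. The helper name key_mode_problems is hypothetical. It relies on two assumptions: that the check runs as a particular uid that must be able to read the file, and that OpenSSH, as far as I recall, refuses a private key owned by the invoking user when any group/other permission bits are set.

```python
import os
import stat


def key_mode_problems(path):
    """Return human-readable problems with a private key file (hypothetical helper)."""
    st = os.stat(path)
    problems = []
    if not stat.S_ISREG(st.st_mode):
        problems.append("not a regular file")
    # Any group/other bit set: OpenSSH treats such a key as too open
    # when the file belongs to the user running ssh (assumption stated above).
    if st.st_mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(
            "mode %o is accessible by group/other" % stat.S_IMODE(st.st_mode)
        )
    # The process running the check must itself be able to read the file.
    if not os.access(path, os.R_OK):
        problems.append("not readable by the current user")
    return problems
```

Run against a file with the mode pasted above (-rwxr-xr-x, i.e. 0755), this would report the group/other bits; whether that is what the failing task objects to is not established here.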
cgoncalveshmm, sorry, trying to remember the little details of rocky. it handles things a bit differently from queens and >=stein14:50
cgoncalveswhat is amp_ssh_key_path set to?14:51
hkominosexcellent question !14:51
hkominosI dont know. I think that it is empty but it should not be14:51
hkominosI mean I dont know.14:51
hkominosI dont set it anywhere14:51
cgoncalvesso you didn't set OctaviaAmphoraSshKeyFile?14:52
cgoncalveswhen not set, default is to use public key from stack@undercloud:~/.ssh/id_rsa.pub14:52
hkominosOctaviaAmphoraSshKeyFile is new to me. I only set the yaml files here, as per our previous discussion14:53
cgoncalvesfolks, if this is too much TripleO for the channel let us know. we can move the conversation to #tripleo14:53
hkominosis amp_ssh_key_path visible in the plan ?14:54
hkominoslet me check14:54
hkominosOk. It is basically what you said. amp_ssh_key_path: { get_param: OctaviaAmphoraSshKeyFile }14:55
hkominosI dont set it up so it should be using the keys from the UC which should be fine14:55
cgoncalveshkominos, ok. do you have the keypair named "default" in the UC?14:56
hkominoskeypair ???14:58
cgoncalvesyeah, nova keypair14:59
hkominosI assume we are talking about the ansible user ?14:59
cgoncalvesanyway, if you haven't set it it shouldn't be entering that code path14:59
hkominosxm. I did not remember setting anything14:59
cgoncalvesit must have amp_ssh_key_path set (even if empty string) to something, else I don't know how it's entering the ansible block
cgoncalvesif only I could remember where the ansible vars are stored in the undercloud...15:00
hkominosI am speculating that it is entering because a keypair is already present there, because this deployment of octavia is overwriting a previous one (also broken) but with different keys15:01
hkominoslets find the ansible vars then15:02
hkominoslet me search15:02
hkominosit looks empty15:03
hkominos"amp_ssh_key_name": "octavia-ssh-key", "amp_ssh_key_path": "", 15:03
hkominosor at least that is visible in ansible.log15:04
cgoncalvesbeing it set to "" (empty) I'd have expected it to not enter that ansible block... :/15:05
hkominosI dont know. I am too confused15:05
hkominosDoes it make any sense to restart the mistral containers ?15:07
hkominosdoubt it, but what do I know.15:07
cgoncalvesargh, now that I'm thinking more about this I think I once got into the same/similar issue. I'm looking around15:10
*** ivve has quit IRC15:21
*** goldyfruit has joined #openstack-lbaas15:23
*** goldyfruit has quit IRC15:45
hkominosThere is also a permission denied issue in the previous log. Dunno where it comes from15:45
hkominos [WARNING]: Could not create retry file '/usr/share/tripleo-common/playbooks/octavia-files.retry'. [Errno 13] Permission denied: u'/usr/share/tripleo-common/playbooks/octavia-files.retry' 15:45
*** gcheresh has quit IRC16:05
cgoncalveshkominos, sudo ansible-playbook -i "/var/lib/mistral/overcloud/octavia-ansible/inventory.yaml" --extra-vars @"/var/lib/mistral/overcloud/octavia-ansible/group_vars/octavia_vars.yaml" /usr/share/tripleo-common/playbooks/octavia-files.yaml --private-key "/var/lib/mistral/overcloud/ssh_private_key"16:07
cgoncalvesshould be applicable to rocky16:08
cgoncalvesactually, that is for Rocky yeah16:08
hkominossudo ?16:11
hkominosI run it from within mistral_executor16:11
hkominosxm. now it ran somehow and bypassed the previous error16:13
hkominosand hit another error on a controller16:13
hkominossaying that ssh key changed16:13
cgoncalvesthat I'm not sure, sorry16:15
cgoncalveshkominos, can't you just run again?16:15
hkominosof course. This still happens. that is what I am trying to dissect16:15
hkominosa wait16:16
hkominosyeah I rerun the deployment. If i remove octavia it works like a charm16:16
hkominosonly when I enable octavia I hit this16:16
cgoncalvesyeah, it's something specific to octavia in tripleo so makes sense16:17
cgoncalvescan you share the content of /var/lib/mistral/overcloud/octavia-ansible/group_vars/octavia_vars.yaml please? redact any sensitive information16:17
hkominossure. I was just looking at that file.16:18
cgoncalvesthat last error (keys changed), I really don't know. it's just a SSH from the undercloud to controller-216:18
*** goldyfruit has joined #openstack-lbaas16:20
hkominosbtw I thought that ansible is using the ssh keys of the stack user16:21
hkominosbut the private key found in /var/lib/mistral/overcloud is not the same as my stack user's.16:22
hkominosIs that default behaviour?16:22
cgoncalvestripleo is in some places expected to run from the "stack" user. probably the case for Octavia16:27
cgoncalvesit may escalate to "root" at some parts, I don't know16:27
cgoncalveshkominos, can you get around the ssh key changed issue and re-run the same ansible-playbook command?16:28
hkominosok something went horribly wrong. All the nodes are unreachable from ansible16:41
cgoncalvescould you expand? what have you changed?16:49
hkominosnothing really. I just tried to rerun and none of the nodes is reachable. I SPECULATE that as part of the command that we ran before, we might have accidentally created a new key for the ansible-user. But that key is not injected to the undercloud nodes, therefore we fail.17:14
hkominosI rekicked a deployment to see what is going on17:14
*** goldyfruit has quit IRC17:17
cgoncalvesthe octavia ansible playbook does not touch system-wide ssh keys, so very unlikely unless I'm missing something17:23
*** openstackgerrit has joined #openstack-lbaas17:52
openstackgerritMikhail Ushanov proposed openstack/octavia stable/rocky: Limit spares pool to the spare_amphora_pool_size
openstackgerritBrian Haley proposed openstack/octavia master: Fix tests to correctly call reset_mock()
openstackgerritBrian Haley proposed openstack/octavia master: Support hacking 2.0.0
