*** Mudpuppy has joined #openstack-ansible | 00:00 | |
*** openstack has joined #openstack-ansible | 00:05 | |
*** Mudpuppy has joined #openstack-ansible | 00:05 | |
*** TheIntern has quit IRC | 00:07 | |
*** galstrom_zzz is now known as galstrom | 00:13 | |
*** sdake has joined #openstack-ansible | 00:30 | |
*** shaleh has quit IRC | 00:36 | |
*** sigmavirus24 is now known as sigmavirus24_awa | 00:37 | |
*** galstrom is now known as galstrom_zzz | 00:41 | |
*** galstrom_zzz is now known as galstrom | 00:41 | |
openstackgerrit | Miguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Identity Provider Configuration https://review.openstack.org/194259 | 00:44 |
*** Mudpuppy_ has joined #openstack-ansible | 00:53 | |
*** galstrom is now known as galstrom_zzz | 00:55 | |
*** sacharya has joined #openstack-ansible | 00:55 | |
*** Mudpuppy has quit IRC | 00:56 | |
*** sacharya1 has joined #openstack-ansible | 00:59 | |
*** sacharya has quit IRC | 00:59 | |
jwitko | when I run the setup-infrastructure.yml playbook I am getting "TypeError: argument of type 'NoneType' is not iterable" when gathering facts | 01:17 |
jwitko | this did not happen on the foundation playbook | 01:17 |
jwitko | anyone have any idea whats wrong ? | 01:17 |
*** sacharya1 has quit IRC | 01:19 | |
*** javeriak has quit IRC | 01:52 | |
cloudnull | jwitko: id imagine that its an issue in inventory / user config. does it happen on first run ? or explode on some task ? | 01:55 |
jwitko | first run | 02:00 |
jwitko | cloudnull, looks like its trying to execute on containers | 02:01 |
cloudnull | what play are you running ? | 02:03 |
jwitko | openstack-ansible setup-infrastructure.yml | 02:04 |
jwitko | it fails on the Install memcached play, under gathering facts | 02:04 |
cloudnull | if you execute the memcached-install.yml play directly does it fail the same way ? | 02:05 |
*** Mudpuppy_ is now known as Mudpuppy | 02:06 | |
*** alop has quit IRC | 02:07 | |
*** alop has joined #openstack-ansible | 02:31 | |
*** galstrom_zzz is now known as galstrom | 02:37 | |
*** galstrom is now known as galstrom_zzz | 02:38 | |
*** galstrom_zzz is now known as galstrom | 02:47 | |
jwitko | cloudnull, i'll try that. | 02:52 |
jwitko | yes, it does | 02:55 |
jwitko | cloudnull, http://pastebin.com/pbnJzkWH | 02:56 |
jwitko | i don't know much about containers | 02:56 |
*** galstrom is now known as galstrom_zzz | 02:56 | |
jwitko | but it looks like this is trying to reach out to some containers via ssh. If those are the hostnames it's creating dynamically then it's not surprising it can't SSH to them? | 02:57 |
cloudnull | did you by chance use vault on /etc/openstack_deploy/openstack_user_config.yml ? | 02:57 |
jwitko | yes | 02:58 |
cloudnull | decrypt that file and try again . | 02:58 |
jwitko | err no | 02:58 |
jwitko | sorry i did not, only on user_secrets.yml | 02:58 |
cloudnull | ok, nevermind then | 02:58 |
cloudnull | next, how was ansible installed? | 02:58 |
cloudnull | using the scripts ? | 02:59 |
jwitko | aye | 02:59 |
cloudnull | if you run lxc-ls -f | 02:59 |
cloudnull | do all your containers have ip addresses ? | 02:59 |
cloudnull | im guessing because i've not seen that issue before | 03:00 |
jwitko | yes they do | 03:00 |
jwitko | http://pastebin.com/wnAuy1xa | 03:01 |
jwitko | if i try to SSH to the memcached_container from the host itself i can get to a password prompt | 03:03 |
jwitko | but what I'm not understanding here is, isn't ansible attempting to SSH straight to the container name from my work-station ? | 03:04 |
jwitko | how would the work station have any idea how to resolve these hostnames? | 03:04 |
Sam-I-Am | dns | 03:07 |
*** sacharya has joined #openstack-ansible | 03:15 | |
*** sdake has quit IRC | 03:23 | |
*** galstrom_zzz is now known as galstrom | 03:24 | |
cloudnull | jwitko: it looks like the containers were created without the other interfaces | 03:29 |
cloudnull | http://cdn.pasteraw.com/316w5iw1f6rbmuc45a7prbrbgr9ev9k | 03:29 |
cloudnull | ^ that is a working set of containers. | 03:30 |
jwitko | oh, the vxlan stuff? | 03:30 |
cloudnull | mgmt is the interface that is missing | 03:30 |
cloudnull | the 10.x interfaces are from lxc | 03:31 |
cloudnull | my 172 addresses are what i use for management | 03:31 |
cloudnull | and then I have others for vxlan / vlan / flat | 03:31 |
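The paste above has since expired. For illustration only, a healthy `lxc-ls -f` listing looks roughly like this, with each container carrying both a 10.x LXC address and a management address (names and addresses below are invented):

```
NAME                                    STATE    IPV4                        AUTOSTART
os-ctrl1_memcached_container-a8401c9b   RUNNING  10.0.3.124, 172.29.238.17   YES
os-ctrl1_utility_container-062f5209     RUNNING  10.0.3.201, 172.29.238.54   YES
```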
jwitko | ok weird i wonder why they would be missing | 03:31 |
jwitko | i have my management network specified | 03:31 |
cloudnull | issue in config from before that is otherwise corrected now? | 03:33 |
cloudnull | id guess your containers inventory has all of the ip addresses set to None | 03:33 |
*** sdake has joined #openstack-ansible | 03:33 | |
jwitko | sorry, where is the containers inventory? | 03:35 |
cloudnull | # /etc/openstack_deploy/openstack_inventory.json | 03:35 |
jwitko | you're right | 03:36 |
jwitko | they're set to null | 03:36 |
jwitko | "os-ctrl1_memcached_container-a8401c9b": { | 03:36 |
jwitko | "ansible_ssh_host": null, | 03:36 |
jwitko | "component": "memcached", | 03:36 |
jwitko | "container_address": null, | 03:36 |
jwitko | cloudnull, any idea why its all populated with null? | 03:42 |
jwitko | hm... is provider networks a sub-item of global_overrides? | 03:49 |
*** tlian has quit IRC | 03:49 | |
*** galstrom is now known as galstrom_zzz | 03:55 | |
cloudnull | sorry being split brained | 03:56 |
cloudnull | its late : ) | 03:56 |
cloudnull | jwitko: check this http://cdn.pasteraw.com/2zw3eoupb535eqbs27gg130jv2oya0r | 03:56 |
cloudnull | if you run # /opt/os-ansible-deployment/scripts/inventory-manage.py -f /etc/openstack_deploy/openstack_inventory.json -l | 03:56 |
cloudnull | id imagine that your inventory output looks something similar. | 03:56 |
cloudnull | if thats the case then to get this back on the right track you just need to run the following little python snippet to reset the container addresses. http://paste.openstack.org/show/382642/ | 03:58 |
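The paste.openstack.org link above has expired. A minimal sketch of what such a reset script might look like, assuming the goal is simply to drop the null address fields from the inventory so the next container-create run re-assigns them; the file follows the standard Ansible dynamic-inventory `_meta.hostvars` layout. This is a reconstruction, not the original script:

```python
#!/usr/bin/env python
# Reconstruction of the expired paste: clear null address entries so the
# inventory generator can re-assign container IPs on the next run.
import json

INVENTORY = '/etc/openstack_deploy/openstack_inventory.json'

with open(INVENTORY) as f:
    inventory = json.load(f)

for host, host_vars in inventory.get('_meta', {}).get('hostvars', {}).items():
    for key in ('ansible_ssh_host', 'container_address'):
        if host_vars.get(key) is None:
            del host_vars[key]  # drop the null entry so it gets repopulated

with open(INVENTORY, 'w') as f:
    json.dump(inventory, f, indent=4)
```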
cloudnull | then rerun # openstack-ansible lxc-container-create.yml | 03:58 |
cloudnull | which will re-assign ip addresses to the various containers. | 03:59 |
cloudnull | jwitko: yes its a sub element . | 04:01 |
cloudnull | here is a complete config from one of my dev labs http://paste.openstack.org/show/382643/ | 04:01 |
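That dev-lab paste has also expired. Directly answering the question above, provider_networks nests under global_overrides; a trimmed sketch in the kilo-era openstack_user_config.yml format, with illustrative addresses and group names:

```yaml
global_overrides:
  internal_lb_vip_address: 172.29.236.10
  external_lb_vip_address: 192.168.1.10
  management_bridge: "br-mgmt"
  provider_networks:
    - network:
        container_bridge: "br-mgmt"
        container_interface: "eth1"
        type: "raw"
        ip_from_q: "container"
        group_binds:
          - all_containers
          - hosts
```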
*** KLevenstein has joined #openstack-ansible | 04:01 | |
*** galstrom_zzz is now known as galstrom | 04:08 | |
jwitko | cloudnull, yea so the issue was my indentation of the entire provider networks block | 04:11 |
jwitko | so I fixed the indentation and re-ran the $ openstack-ansible setup-hosts.yml | 04:11 |
jwitko | this has failed however on the lxc_container start ups | 04:11 |
jwitko | NOTIFIED: [lxc_container_create | Start Container] **************************** | 04:12 |
cloudnull | so once that play is done you should have some network bits | 04:12 |
jwitko | so the lxc container start failures are expected? | 04:15 |
*** misc has quit IRC | 04:17 | |
cloudnull | no | 04:19 |
jwitko | i see that it got the networking bits correct this time around | 04:19 |
jwitko | after i fixed the indentations | 04:19 |
cloudnull | ok | 04:19 |
cloudnull | great | 04:19 |
cloudnull | all the containers started up ? | 04:19 |
jwitko | no they are all failing to start | 04:20 |
jwitko | failed: [os-ctrl2_nova_cert_container-a9a3f4e2 -> os-ctrl2] => {"error": "Failed to start container [ os-ctrl2_nova_cert_container-a9a3f4e2 ]", "failed": true, "lxc_container": {"init_pid": -1, "interfaces": [], "ips": [], "state": "stopped"}, "rc": 1} | 04:20 |
jwitko | msg: The container [ %s ] failed to start. Check to lxc is available and that the container is in a functional state. | 04:20 |
jwitko | all containers are reporting that error | 04:22 |
jwitko | its still going | 04:22 |
cloudnull | if you do lxc-ls -f | 04:24 |
cloudnull | all are "stopped" | 04:25 |
*** misc has joined #openstack-ansible | 04:25 | |
cloudnull | if so it may just be easier to nuke the containers you have on disk and build new ones. | 04:26 |
cloudnull | openstack-ansible lxc-container-destroy.yml | 04:26 |
cloudnull | will remove all the containers, working or not. | 04:26 |
cloudnull | then rerun openstack-ansible lxc-container-create.yml | 04:26 |
*** alop has quit IRC | 04:27 | |
cloudnull | you could fix up the configs, which im guessing are mangled, however its likely faster to nuke and rebuild. | 04:27 |
jwitko | ok cool | 04:27 |
jwitko | should i run lxc-container-create.yml before or after the "openstack-ansible setup-hosts.yml" ? | 04:28 |
cloudnull | it wont harm your old inventory so it should just carry on. | 04:28 |
cloudnull | setup-hosts.yml is a meta play that calls lxc-container-create.yml | 04:28 |
jwitko | oh ok | 04:28 |
cloudnull | you can see the order of the calls by catting the file. | 04:29 |
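For reference, the meta play is just a short list of includes, roughly this shape (exact file names vary by release):

```yaml
# playbooks/setup-hosts.yml (approximate)
- include: openstack-hosts-setup.yml
- include: lxc-hosts-setup.yml
- include: lxc-containers-create.yml
```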
jwitko | yea i'm going to check it out after this errors | 04:29 |
cloudnull | im mostly afk for the rest of the night . | 04:29 |
jwitko | thanks so much for your help | 04:29 |
jwitko | have a good night | 04:29 |
cloudnull | let us know how it goes | 04:30 |
cloudnull | anytime. | 04:30 |
cloudnull | happy to help | 04:30 |
*** britthou_ has joined #openstack-ansible | 04:38 | |
*** KLevenstein has quit IRC | 04:39 | |
*** galstrom is now known as galstrom_zzz | 04:40 | |
*** britthouser has quit IRC | 04:40 | |
*** galstrom_zzz is now known as galstrom | 04:42 | |
prometheanfire | cloudnull: you happen to know of any cause for a bridge dropping all traffic? | 04:42 |
prometheanfire | cloudnull: since you are up | 04:42 |
Sam-I-Am | prometheanfire: it went from bridge to bitbucket? | 04:44 |
prometheanfire | Sam-I-Am: seems like it | 04:45 |
prometheanfire | Sam-I-Am: if you want we are talking about it in #gentoo-virtualization | 04:45 |
prometheanfire | it's something I've not heard of or reproduced | 04:45 |
Sam-I-Am | but someone else has figured out how to break a bridge? | 04:46 |
jwitko | damn... i did the lxc-container-destroy and rebuild | 04:50 |
jwitko | and it still fails the same way | 04:50 |
prometheanfire | Sam-I-Am: I'm thinking it has to do with rip/stp | 04:51 |
prometheanfire | Sam-I-Am: he's checking | 04:51 |
*** sdake has quit IRC | 04:56 | |
*** galstrom is now known as galstrom_zzz | 05:03 | |
*** Mudpuppy has quit IRC | 05:09 | |
*** mancdaz has quit IRC | 05:23 | |
*** mancdaz has joined #openstack-ansible | 05:24 | |
*** sacharya has quit IRC | 05:29 | |
*** markvoelker has joined #openstack-ansible | 05:41 | |
*** markvoelker_ has joined #openstack-ansible | 05:44 | |
*** markvoelker has quit IRC | 05:45 | |
*** annashen has joined #openstack-ansible | 05:51 | |
*** ig0r_ has joined #openstack-ansible | 06:08 | |
*** ig0r_ has quit IRC | 06:45 | |
*** ig0r_ has joined #openstack-ansible | 06:51 | |
openstackgerrit | Merged stackforge/os-ansible-deployment: Remove {{ from "with_items" and "when" statements https://review.openstack.org/202581 | 07:11 |
openstackgerrit | Merged stackforge/os-ansible-deployment: Fix haproxy service config when ssl is enabled https://review.openstack.org/202485 | 07:40 |
openstackgerrit | Jesse Pretorius proposed stackforge/os-ansible-deployment: Fix haproxy service config when ssl is enabled https://review.openstack.org/202911 | 07:48 |
*** annashen has quit IRC | 08:05 | |
mancdaz | git-harry ping | 08:09 |
git-harry | pong | 08:09 |
mancdaz | good morrow! | 08:11 |
mancdaz | I was just looking at your review for the rabbitmq install stuff | 08:11 |
git-harry | yeah, I know it's broken | 08:12 |
mancdaz | I'm probably being a bit dumb, but isn't it going to stop all the rabbits every time the playbooks are run? | 08:12 |
git-harry | oh | 08:12 |
git-harry | yes, I need to add a tag so you can skip that | 08:12 |
mancdaz | a tag? | 08:13 |
mancdaz | doesn't it need to determine automatically if they need stopping, or not | 08:13 |
git-harry | this is why we had the discussion yesterday about assumptions. | 08:14 |
git-harry | If I can assume that this is only ever run during a maintenance it doesn't matter if they all get shutdown | 08:14 |
mancdaz | well it's part of the 'install' play | 08:15 |
git-harry | I'm not following | 08:15 |
mancdaz | well, we were talking about when a person does a full juno > kilo upgrade, it would usually be in a maintenance window | 08:20 |
git-harry | I wasn't :P | 08:21 |
mancdaz | ok | 08:21 |
mancdaz | truthfully, I'd expect that any time the playbooks are run fully it would be in a maintenance window. But I couldn't guarantee that | 08:22 |
mancdaz | and also remember, we were speaking rpc specific practices | 08:22 |
mancdaz | this is osad, so other deployers might have different expectations | 08:22 |
git-harry | I know, but no one has reviewed it yet to express dissatisfaction at this method | 08:23 |
mancdaz | ha, I can add my comments there then. Was just double checking here that I wasn't misunderstanding the patch | 08:23 |
*** rcarrillocruz has quit IRC | 08:35 | |
*** markvoelker_ has quit IRC | 08:42 | |
openstackgerrit | git-harry proposed stackforge/os-ansible-deployment: Serialise rabbitmq playbook to allow upgrades https://review.openstack.org/202681 | 08:44 |
*** markvoelker has joined #openstack-ansible | 08:57 | |
*** markvoelker has quit IRC | 09:02 | |
openstackgerrit | Jesse Pretorius proposed stackforge/os-ansible-deployment: Keystone Federation Service Provider Configuration https://review.openstack.org/194395 | 09:07 |
*** markvoelker has joined #openstack-ansible | 09:12 | |
*** markvoelker has quit IRC | 09:17 | |
*** markvoelker has joined #openstack-ansible | 09:26 | |
*** markvoelker has quit IRC | 09:31 | |
*** markvoelker has joined #openstack-ansible | 09:41 | |
*** markvoelker has quit IRC | 09:45 | |
openstackgerrit | git-harry proposed stackforge/os-ansible-deployment: Serialise rabbitmq playbook to allow upgrades https://review.openstack.org/202681 | 09:52 |
*** markvoelker has joined #openstack-ansible | 09:55 | |
*** markvoelker has quit IRC | 10:00 | |
*** markvoelker has joined #openstack-ansible | 10:07 | |
*** markvoelker has quit IRC | 10:12 | |
*** markvoelker has joined #openstack-ansible | 10:22 | |
*** openstackgerrit has quit IRC | 10:31 | |
*** openstackgerrit has joined #openstack-ansible | 10:32 | |
*** markvoelker has quit IRC | 10:32 | |
*** markvoelker has joined #openstack-ansible | 10:37 | |
*** sdake has joined #openstack-ansible | 10:39 | |
*** markvoelker has quit IRC | 10:42 | |
*** markvoelker has joined #openstack-ansible | 10:51 | |
*** markvoelker has quit IRC | 10:56 | |
openstackgerrit | Jesse Pretorius proposed stackforge/os-ansible-deployment: Fix Horizon SSL certificate management and distribution https://review.openstack.org/202977 | 10:57 |
openstackgerrit | Merged stackforge/os-ansible-deployment: Fix repo section in example config file https://review.openstack.org/202377 | 11:00 |
*** markvoelker has joined #openstack-ansible | 11:06 | |
*** markvoelker has quit IRC | 11:11 | |
openstackgerrit | Jesse Pretorius proposed stackforge/os-ansible-deployment: Set haproxy install to use latest packages https://review.openstack.org/202981 | 11:15 |
*** markvoelker has joined #openstack-ansible | 11:19 | |
*** markvoelker has quit IRC | 11:24 | |
*** sdake has quit IRC | 11:32 | |
*** markvoelker has joined #openstack-ansible | 11:32 | |
*** markvoelker has quit IRC | 11:44 | |
*** mmasaki has left #openstack-ansible | 11:51 | |
*** markvoelker has joined #openstack-ansible | 11:55 | |
*** markvoelker has quit IRC | 12:00 | |
*** markvoelker has joined #openstack-ansible | 12:09 | |
*** markvoelker has quit IRC | 12:13 | |
*** markvoelker has joined #openstack-ansible | 12:16 | |
openstackgerrit | Jesse Pretorius proposed stackforge/os-ansible-deployment: Fix Horizon SSL certificate management and distribution https://review.openstack.org/202977 | 12:20 |
*** markvoelker has quit IRC | 12:21 | |
*** markvoelker has joined #openstack-ansible | 12:24 | |
*** markvoelker has quit IRC | 12:32 | |
*** sdake has joined #openstack-ansible | 12:33 | |
*** markvoelker has joined #openstack-ansible | 12:39 | |
*** markvoelker has quit IRC | 12:43 | |
*** markvoelker has joined #openstack-ansible | 12:53 | |
*** tlian has joined #openstack-ansible | 12:57 | |
*** markvoelker has quit IRC | 12:58 | |
mgariepy | hey, how long should it take to start lxc container on a infra hosts? | 13:01 |
mgariepy | is there some order that it respect ? or should it start them all at once on boot ? | 13:01 |
openstackgerrit | Jesse Pretorius proposed stackforge/os-ansible-deployment: Keystone SSL cert/key distribution and configuration https://review.openstack.org/194474 | 13:04 |
*** markvoelker has joined #openstack-ansible | 13:05 | |
odyssey4me | mgariepy if you start them all at once it'll take longer for them to get to a ready state, but they'll normalise and should be fine | 13:05 |
odyssey4me | it's probably best to do a bit of a delay between each of them and start them in some sort of sensible order based on dependencies | 13:06 |
odyssey4me | but if you start them all at once it'll probably work - you may just need to monitor and health check them a bit later | 13:06 |
mgariepy | what's the default behavior ? | 13:06 |
*** markvoelker_ has joined #openstack-ansible | 13:07 | |
odyssey4me | I could be wrong, but the default at this stage is all at once. | 13:07 |
mgariepy | because i rebooted a controller and it's been an hour and still one is not started yet. | 13:07 |
odyssey4me | mgariepy then you have a problem and had best start checking logs - they should've been up within 5 mins | 13:08 |
mancdaz | mgariepy I've seen that before | 13:08 |
mancdaz | though I can't remember exactly what it was :( | 13:08 |
mgariepy | i have a kind of old server to test. | 13:08 |
mgariepy | but still. shouldn't take an hour to boot haha | 13:08 |
mancdaz | mgariepy the server is still rebooting? | 13:09 |
mancdaz | or the containers haven't started after the reboot? | 13:09 |
mgariepy | it's longer to boot than to install haha :) | 13:09 |
mgariepy | they are starting one by one | 13:09 |
*** markvoelker has quit IRC | 13:09 | |
mgariepy | i'll reboot to see if it's the same | 13:27 |
*** markvoelker_ has quit IRC | 13:34 | |
*** ccrouch has joined #openstack-ansible | 13:36 | |
*** Mudpuppy has joined #openstack-ansible | 13:41 | |
*** Guest9887 has quit IRC | 13:48 | |
*** markvoelker has joined #openstack-ansible | 13:49 | |
*** blewis has joined #openstack-ansible | 13:49 | |
*** blewis is now known as Guest62465 | 13:49 | |
*** Mudpuppy has quit IRC | 13:50 | |
*** markvoelker has quit IRC | 13:54 | |
*** markvoelker has joined #openstack-ansible | 13:59 | |
*** spotz_zzz is now known as spotz | 14:00 | |
*** Mudpuppy has joined #openstack-ansible | 14:01 | |
*** Mudpuppy has quit IRC | 14:01 | |
*** Mudpuppy has joined #openstack-ansible | 14:02 | |
*** sigmavirus24_awa is now known as sigmavirus24 | 14:06 | |
*** markvoelker has quit IRC | 14:07 | |
cloudnull | morning | 14:07 |
*** markvoelker has joined #openstack-ansible | 14:14 | |
*** markvoelker has quit IRC | 14:18 | |
*** galstrom_zzz is now known as galstrom | 14:20 | |
jwitko | good morning cloudnull! :) | 14:21 |
cloudnull | hows it jwitko? | 14:21 |
jwitko | ah i went to bed shortly after you last night | 14:21 |
jwitko | i ran the lxc-containers-destroy, and it destroyed everything | 14:22 |
cloudnull | yea, i was tired . | 14:22 |
jwitko | then i ran the create, and it worked out well once again until starting containers | 14:22 |
jwitko | so in the end it failed in the same spot | 14:22 |
cloudnull | hum... | 14:22 |
cloudnull | do all of the bridges that are specified in the provider networks section exist on the host ? | 14:23 |
*** yaya has joined #openstack-ansible | 14:23 | |
cloudnull | are all of the containers in a "stopped" state ? | 14:23 |
cloudnull | if so, can you run # lxc-start -n $container_name | 14:23 |
jwitko | so I have three bridges that only exist on compute hosts | 14:23 |
cloudnull | it should kick out some details on why the container will not start | 14:23 |
jwitko | they are connected to my SAN | 14:24 |
jwitko | so on 2/3 hosts all containers are STOPPED | 14:24 |
jwitko | on one host, the memcached container is running with a single IP | 14:24 |
cloudnull | so your hosts should have all of the networks that are specified in the container_bridge key and are also bound to the groups. IE https://github.com/stackforge/os-ansible-deployment/blob/master/etc/openstack_deploy/openstack_user_config.yml.aio#L20 | 14:25 |
cloudnull | so your os-infra host should have br-mgmt and br-storage if you used the default names. | 14:26 |
jwitko | let me make a paste bin | 14:27 |
*** yaya has quit IRC | 14:27 | |
cloudnull | your compute hosts should have br-mgmt, br-storage, br-vlan and br-vxlan | 14:27 |
cloudnull | and your network hosts should have br-vlan and br-vxlan | 14:27 |
cloudnull | again if you had used all of the default names. | 14:27 |
jwitko | http://pastebin.com/XW3DNEE9 | 14:28 |
jwitko | so basically | 14:28 |
*** markvoelker has joined #openstack-ansible | 14:28 | |
jwitko | br-nfs1, br-iscsi1, br-iscsi2 --- these only exist on compute hosts | 14:29 |
jwitko | they are for the storage network | 14:29 |
jwitko | i'm not using br-vxlan because I don't plan to have tenant networks | 14:29 |
jwitko | cloudnull, so I'm guessing my setup is not a functional one | 14:31 |
*** markvoelker has quit IRC | 14:33 | |
*** yaya has joined #openstack-ansible | 14:35 | |
cloudnull | so the config had some syntax issues due to white space. http://paste.openstack.org/show/383824/ | 14:36 |
cloudnull | this is what it was reading before http://paste.openstack.org/show/383837/ | 14:37 |
cloudnull | and this is what it should read once fixed up http://paste.openstack.org/show/383839/ | 14:38 |
cloudnull | notice the difference in group_binds being a list | 14:38 |
cloudnull | palendae, Mudpuppy, sigmavirus24 we really should produce a config schema validator to help out with these types of issues. | 14:39 |
cloudnull | sigmavirus24: :) | 14:39 |
Mudpuppy | :) | 14:39 |
palendae | cloudnull: I talked to sigmavirus24 about it some yesterday...the thing I had in mind was verifying that the required keys are present, and maybe a syntax check | 14:39 |
sigmavirus24 | cloudnull: yeah yeah yeah | 14:40 |
sigmavirus24 | cloudnull: palendae and I are talking about that for the hackathon | 14:40 |
palendae | Though if we have a schema validator, we'll now need to have the config in 3 places - the schema, example files, and docs | 14:40 |
*** TheIntern has joined #openstack-ansible | 14:40 | |
sigmavirus24 | palendae: if we can generate the docs from the schema that'd be greeeat | 14:40 |
cloudnull | maybe we could use https://github.com/sigmavirus24/schema-validator | 14:40 |
sigmavirus24 | also maybe example files | 14:40 |
palendae | Or vice-versa | 14:40 |
sigmavirus24 | palendae: nah | 14:40 |
Mudpuppy | In this case it was valid yaml, but the white space messed stuff up, so pass one should check typos, and pass two syntax | 14:40 |
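A minimal sketch of the kind of check being discussed, assuming PyYAML and the openstack_user_config.yml layout shown earlier; the required key names are illustrative, and this is not project tooling:

```python
# Illustrative config check: confirm required keys exist and that values
# have the right shape (e.g. group_binds must be a list, not a string).
import sys
import yaml

REQUIRED_NETWORK_KEYS = {'container_bridge', 'container_interface',
                         'type', 'group_binds'}


def validate(path):
    with open(path) as handle:
        config = yaml.safe_load(handle)
    errors = []
    networks = config.get('global_overrides', {}).get('provider_networks', [])
    for index, entry in enumerate(networks):
        network = entry.get('network', {})
        missing = REQUIRED_NETWORK_KEYS - set(network)
        if missing:
            errors.append('network %d is missing: %s'
                          % (index, ', '.join(sorted(missing))))
        if not isinstance(network.get('group_binds', []), list):
            errors.append('network %d: group_binds should be a list' % index)
    return errors


if __name__ == '__main__':
    for error in validate(sys.argv[1]):
        print(error)
```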
sigmavirus24 | cloudnull: what is that magic project you speak of | 14:40 |
sigmavirus24 | =P | 14:41 |
cloudnull | i know, i know, NIH and all | 14:41 |
sigmavirus24 | the example schema and such are probably woefully out of date | 14:41 |
sigmavirus24 | =/ | 14:41 |
palendae | works for juno! | 14:41 |
cloudnull | sigmavirus24: likely | 14:41 |
cloudnull | this is something that we should invest time in | 14:41 |
cloudnull | IMO | 14:41 |
palendae | Yeah | 14:42 |
cloudnull | now just let me find that extra time.... | 14:42 |
palendae | cloudnull: I was thinking about writing a spec | 14:42 |
palendae | But yeah | 14:42 |
cloudnull | typie typie | 14:42 |
sigmavirus24 | cloudnull: I'd totally work on this today | 14:42 |
sigmavirus24 | but those days between sprints we were promised disappeared =P | 14:42 |
palendae | Also rpc-openstack grew a yaml-mergerating script that I think would be generally useful to anyone extending openstack-ansible | 14:42 |
cloudnull | these are not the days you were looking for | 14:42 |
sigmavirus24 | cloudnull: yep | 14:42 |
cloudnull | palendae: as long as its called the extenderator im game | 14:43 |
palendae | I'm for the name changing if it gets into upstream | 14:43 |
palendae | extenderatoring selfie-stick | 14:43 |
* cloudnull will +2 for more erators | 14:44 | |
jwitko | cloudnull, ok i believe i've fixed the syntax. not sure how it got like that | 14:44 |
cloudnull | jwitko: it happens , | 14:44 |
jwitko | is there any issue with my bridges? and having storage bridges that should only be on storage hosts? | 14:44 |
cloudnull | everything else looks fine | 14:44 |
jwitko | alright I'll run the destroy again | 14:45 |
jwitko | and then the setup-hosts | 14:45 |
cloudnull | try running lxc-container-create again | 14:45 |
jwitko | wouldn't those formatting issues have created some problems with the config generation ? | 14:46 |
cloudnull | yea, and being that you have nothing deployed presently, its probably best to nuke it. | 14:47 |
cloudnull | id also remove /etc/openstack_deploy/openstack_inventory.json | 14:47 |
cloudnull | once you do the container destroy | 14:47 |
cloudnull | which will be like starting inventory fresh | 14:47 |
cloudnull | Cores https://review.openstack.org/#/c/202268/ please review and do the needfuls, if you think its a good change that is. I've done 9 tests, number 10 is in progress and so far I've had 100% success. | 14:50 |
cloudnull | only 2 of the tests were done with the successerator reenabled. | 14:50 |
*** markvoelker has joined #openstack-ansible | 14:50 | |
cloudnull | itll be 3 if the last one is good too. but in all it looks like the change is helping general performance and ssh instabilities (even without the successerator). | 14:52 |
jwitko | cloudnull, sorry can you explain what you mean by nuke it? just run the lxc-destroy and delete the json ? Then am I running the setup-hosts.yml or the lxc-create yml ? | 14:54 |
cloudnull | run lxc-container-destroy.yml | 14:54 |
cloudnull | rm the json | 14:54 |
cloudnull | and then rerun lxc-container-create | 14:54 |
jwitko | ah ok | 14:54 |
cloudnull | the json is rerendered on every run, but the inventory generator preserves data within it for items already in inventory. so removing it is like starting from a clean state. | 14:55 |
jwitko | cool, thank you i'll try that now | 14:55 |
odyssey4me | cloudnull lol, a nice sneak attack there with MAX_RETRIES | 14:55 |
cloudnull | but destroy your containers first . | 14:55 |
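Collecting the reset sequence cloudnull describes into one place, using the playbook names as given in this conversation and assuming the playbooks directory of the checkout:

```bash
cd /opt/os-ansible-deployment/playbooks
openstack-ansible lxc-container-destroy.yml        # remove all existing containers first
rm /etc/openstack_deploy/openstack_inventory.json  # start the inventory from a clean state
openstack-ansible lxc-container-create.yml         # rebuild containers and re-assign IPs
```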
cloudnull | odyssey4me: i added it after the meeting yesterday | 14:56 |
odyssey4me | cloudnull that perhaps should be a separate patch, don't you think? | 14:56 |
cloudnull | if you think it best. | 14:56 |
*** markvoelker has quit IRC | 14:57 | |
*** markvoelker_ has joined #openstack-ansible | 14:57 | |
*** markvoelker_ has quit IRC | 14:57 | |
odyssey4me | that way if we need to revert the other one, or revert this one - they don't interfere with each other | 14:57 |
*** markvoelker has joined #openstack-ansible | 14:57 | |
cloudnull | ok one sec | 14:57 |
odyssey4me | no need for a bug on the successerator one | 14:57 |
odyssey4me | lol, you have no bug on that one anyway :p | 14:58 |
cloudnull | nope | 14:58 |
openstackgerrit | Kevin Carter proposed stackforge/os-ansible-deployment: Container create/system tuning https://review.openstack.org/202268 | 14:59 |
openstackgerrit | Kevin Carter proposed stackforge/os-ansible-deployment: Adds retries https://review.openstack.org/203063 | 15:00 |
*** jaypipes has joined #openstack-ansible | 15:01 | |
cloudnull | done | 15:01 |
odyssey4me | and done | 15:05 |
*** sdake has quit IRC | 15:13 | |
cloudnull | in related news please vote http://lists.openstack.org/pipermail/openstack-dev/2015-July/069857.html | 15:15 |
cloudnull | palendae: https://review.openstack.org/#/c/202821 is a fix for an upgrade issue that bjorne was seeing | 15:17 |
palendae | cloudnull: Ok, thanks. I'll look at it in a moment | 15:18 |
palendae | When I find what exchange has done with my openstack-dev mails | 15:18 |
palendae | cloudnull: Another thing I kinda wanted to spec out - bash linting on run-upgrades.sh | 15:20 |
palendae | But for valid linting on that, we'd need to do yaml and python, too | 15:20 |
*** alextricity has joined #openstack-ansible | 15:21 | |
cloudnull | +1 | 15:21 |
jwitko | cloudnull, failure :( | 15:21 |
cloudnull | same thing ? | 15:21 |
jwitko | all containers reporting same error: | 15:21 |
jwitko | failed: [os-ctrl1_utility_container-062f5209 -> os-ctrl1] => {"error": "Failed to start container [ os-ctrl1_utility_container-062f5209 ]", "failed": true, "lxc_container": {"init_pid": -1, "interfaces": [], "ips": [], "state": "stopped"}, "rc": 1} | 15:21 |
jwitko | msg: The container [ %s ] failed to start. Check to lxc is available and that the container is in a functional state. | 15:21 |
cloudnull | can you do # lxc-start -n os-ctrl1_utility_container-062f5209 | 15:22 |
cloudnull | what error is it reporting ? | 15:22 |
jwitko | http://paste.openstack.org/show/383961/ | 15:26 |
*** jaypipes is now known as blockedpipes | 15:27 | |
*** annashen has joined #openstack-ansible | 15:28 | |
cloudnull | looks like a network problem | 15:28 |
*** sdake has joined #openstack-ansible | 15:28 | |
cloudnull | jwitko: what bridge is vethT9STR7 connected to ? | 15:29 |
cloudnull | curious, do you have multiple eth0s ? | 15:29 |
jwitko | shouldn't? | 15:29 |
* cloudnull going back to look at your config | 15:29 | |
*** daneyon has joined #openstack-ansible | 15:31 | |
cloudnull | the container_interface is the name of the interface within the container. | 15:31 |
cloudnull | and is created by the lxc-container-create play. | 15:31 |
jwitko | so I can't find that interface | 15:32 |
jwitko | let me know if you need me to paste any updated files | 15:33 |
cloudnull | # /var/lib/lxc/os-ctrl1_utility_container-062f5209/config | 15:33 |
*** weezS has joined #openstack-ansible | 15:33 | |
*** annashen has quit IRC | 15:35 | |
jwitko | http://paste.openstack.org/show/383963/ | 15:36 |
cloudnull | in your config change line 30 | 15:37 |
cloudnull | # /container_interface: "eth0"/container_interface: "eth1" | 15:37 |
cloudnull | the base lxc system is using the default lxc network interface | 15:38 |
cloudnull | which is eth0 | 15:38 |
cloudnull | and your management network interface is being instructed to run on eth0 | 15:38 |
cloudnull | so they're conflicting. | 15:38 |
jwitko | oh ok | 15:39 |
cloudnull | you may have the same issue on line 76 too | 15:39 |
jwitko | so this eth1 will live only inside the containers? | 15:39 |
cloudnull | yes | 15:39 |
cloudnull | line 76 to eth2 | 15:39 |
cloudnull | line 85 to eth3 | 15:39 |
cloudnull | i shouldve caught that before sorry | 15:39 |
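Sketched against jwitko's bridge names, the relevant part of the fix: eth0 stays reserved for the default LXC NAT interface, and each provider network gets its own interface name inside the container. The group_binds values here are illustrative, not from jwitko's actual config:

```yaml
provider_networks:
  - network:
      container_bridge: "br-mgmt"
      container_interface: "eth1"   # was "eth0", which collided with the LXC default
      type: "raw"
      ip_from_q: "container"
      group_binds:
        - all_containers
        - hosts
  - network:
      container_bridge: "br-nfs1"
      container_interface: "eth2"   # storage networks each get their own interface
      type: "raw"
      group_binds:
        - glance_api
        - cinder_volume
```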
jwitko | no worries! | 15:40 |
jwitko | it is me who should be apologizing lol | 15:40 |
cloudnull | nah its us , we need moar better docs | 15:40 |
jwitko | so just to confirm | 15:40 |
cloudnull | Sam-I-Am: ^ typie typie | 15:40 |
jwitko | those changes are in openstack_user_config.yml | 15:41 |
cloudnull | yes | 15:41 |
jwitko | then i will destroy containers, remove json file, and create containers | 15:41 |
cloudnull | you should be able to make the change and rerun the lxc-container-create | 15:41 |
jwitko | Sam-I-Am is a king among men :) | 15:41 |
jwitko | i bow to his docs | 15:41 |
cloudnull | lol | 15:42 |
jwitko | cloudnull yay it made it past the error :) | 15:51 |
jwitko | i have to go to the dentist, but I'm sure i'll be in here again today poking you more | 15:51 |
jwitko | thanks again for the help | 15:51 |
cloudnull | anytime | 15:52 |
*** CheKoLyN has joined #openstack-ansible | 15:52 | |
cloudnull | have fun at the dentist | 15:52 |
cloudnull | ;) | 15:52 |
Sam-I-Am | lol dentists | 15:54 |
Sam-I-Am | better than or worse than neutron :) | 15:54 |
Mudpuppy | Nightmare: Your dentist is a neutron contributor | 15:56 |
palendae | Sounds right | 15:57 |
openstackgerrit | Jesse Pretorius proposed stackforge/os-ansible-deployment: Keystone Federation Service Provider Configuration https://review.openstack.org/194395 | 16:02 |
odyssey4me | miguelgrinberg ^ | 16:03 |
*** vdo has quit IRC | 16:03 | |
*** eglute has quit IRC | 16:04 | |
*** eglute has joined #openstack-ansible | 16:05 | |
*** yaya has quit IRC | 16:06 | |
*** weezS has quit IRC | 16:06 | |
*** sacharya has joined #openstack-ansible | 16:06 | |
*** yaya has joined #openstack-ansible | 16:09 | |
*** alop has joined #openstack-ansible | 16:12 | |
*** yaya has quit IRC | 16:13 | |
*** alop_ has joined #openstack-ansible | 16:15 | |
*** alop has quit IRC | 16:16 | |
*** alop_ is now known as alop | 16:16 | |
*** annashen has joined #openstack-ansible | 16:16 | |
*** TheIntern has quit IRC | 16:22 | |
*** eglute has quit IRC | 16:23 | |
*** eglute has joined #openstack-ansible | 16:24 | |
openstackgerrit | Merged stackforge/os-ansible-deployment: Adding missing role for maas user creation https://review.openstack.org/199639 | 16:32 |
*** eglute has quit IRC | 16:33 | |
*** eglute has joined #openstack-ansible | 16:34 | |
*** annashen has quit IRC | 16:36 | |
*** annashen has joined #openstack-ansible | 16:40 | |
*** sigmavirus24 has quit IRC | 16:52 | |
jwitko | cloudnull | 16:53 |
jwitko | So i have returned and it appears while some of the containers progressed past that step, not all did | 16:53 |
jwitko | the network name is still eth0 it appears | 16:54 |
*** sigmavirus24 has joined #openstack-ansible | 16:54 | |
jwitko | the error from starting the container is a little different | 16:55 |
jwitko | http://paste.openstack.org/show/384175/ | 16:56 |
*** sigmavirus24 has quit IRC | 16:57 | |
cloudnull | jwitko: try to run # ansible hosts -m shell -a '/usr/local/bin/lxc-system-manage veth-cleanup' | 16:58 |
*** alop has quit IRC | 16:58 | |
cloudnull | from your deployment box | 16:58 |
*** sigmavirus24 has joined #openstack-ansible | 17:00 | |
jwitko | cloudnull, ok ran it. the boxes that failed had a ton of "cannot find device" output | 17:01 |
jwitko | for the veth* | 17:01 |
jwitko | should i just run the lxc-containers-create.yml in its entirety again ? | 17:01 |
*** alop has joined #openstack-ansible | 17:02 | |
cloudnull | yea give it a go | 17:02 |
jwitko | so is that just some left over meta data? | 17:02 |
jwitko | and lxc has built-in scripts to clean it up? | 17:02 |
cloudnull | it is left over bits from earlier builds | 17:03 |
openstackgerrit | git-harry proposed stackforge/os-ansible-deployment: Serialise rabbitmq playbook to allow upgrades https://review.openstack.org/202681 | 17:03 |
cloudnull | we wrote that tool to do system clean up and such because it was a problem we've run into in the past | 17:03 |
jwitko | damn | 17:09 |
jwitko | some containers still failed to start | 17:09 |
cloudnull | what are the new errors ? | 17:09 |
jwitko | ok | 17:09 |
jwitko | so this one is looking for a bridge that is only available on compute servers | 17:09 |
jwitko | its the bridge that handles NFS storage] | 17:10 |
jwitko | so in my provider networks i have 3 SAN networks listed | 17:10 |
jwitko | nfs, iscsi1, iscsi2 | 17:10 |
jwitko | these are only meant to be applied to cinder/glance hosts | 17:10 |
jwitko | oh, i think i see the problem | 17:12 |
odyssey4me | cloudnull it would seem that this needs an update to a later sha: https://review.openstack.org/199126 | 17:15 |
odyssey4me | it blew out on neutron's db sync - duplicate serial | 17:15 |
cloudnull | yea master is not happy today | 17:16 |
cloudnull | or rather the last few days. | 17:16 |
cloudnull | upstream master that is | 17:16 |
*** alextricity has quit IRC | 17:17 | |
*** annashen has quit IRC | 17:17 | |
odyssey4me | perhaps best to leave it at the current sha for a bit then - we have enough gating shenanigans | 17:21 |
cloudnull | agreed. | 17:21 |
cloudnull | 100% | 17:22 |
*** daneyon has quit IRC | 17:23 | |
*** harlowja has quit IRC | 17:26 | |
*** harlowja has joined #openstack-ansible | 17:26 | |
openstackgerrit | Merged stackforge/os-ansible-deployment: Adjust key distribution mechanism for Swift https://review.openstack.org/199992 | 17:26 |
*** annashen has joined #openstack-ansible | 17:27 | |
jwitko | hey cloudnull, do you know how many IPs each server takes up with containers? | 17:35 |
*** eglute has quit IRC | 17:35 | |
*** eglute has joined #openstack-ansible | 17:36 | |
cloudnull | it depends on the infra, but it can be a bunch. generally on *infra hosts you're looking at somewhere around 32 containers per host | 17:41 |
sigmavirus24 | cloudnull: have we pinned down which upstream project is unhappy? | 17:52 |
cloudnull | neutron so far | 17:52 |
*** metral is now known as metral_zzz | 17:53 | |
sigmavirus24 | Can we upgrade everything but neutron? | 17:53 |
* sigmavirus24 is just curious | 17:53 | |
sigmavirus24 | Not necessary | 17:53 |
cloudnull | i've not looked since i rev'd the commit a few days ago | 17:54 |
* sigmavirus24 is curious | 17:55 | |
openstackgerrit | Miguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Identity Provider Configuration https://review.openstack.org/194259 | 18:04 |
openstackgerrit | Miguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Identity Provider Configuration https://review.openstack.org/194259 | 18:05 |
openstackgerrit | Miguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Identity Provider Configuration https://review.openstack.org/194259 | 18:08 |
*** TheIntern has joined #openstack-ansible | 18:19 | |
odyssey4me | sigmavirus24 you've been seeing galera failures in builds too, right? | 18:29 |
sigmavirus24 | odyssey4me: yep | 18:29 |
odyssey4me | has it only been with affinity: 1, or with larger affinity? | 18:29 |
sigmavirus24 | default affinity | 18:30 |
sigmavirus24 | I never think to set the affinity to 1 | 18:30 |
sigmavirus24 | hasn't happened for the last couple builds though | 18:30 |
odyssey4me | was it on re-runs of builds? | 18:30 |
odyssey4me | and are you using scripts/run-playbooks? | 18:30 |
sigmavirus24 | no I've gotten into the habit of doing `nova rebuild <server> <image> --poll` then setting things up from scratch | 18:31 |
sigmavirus24 | the reasons the keystone v3 patch kept failing was because I was working off an existing build | 18:31 |
sigmavirus24 | so I wasn't seeing all the problems that the gate was | 18:31 |
jwitko | cloudnull, finally got the setup-hosts.yml playbook to complete without error :) | 18:32 |
mgariepy | odyssey4me, mancdaz : the containers start very slowly because of a bug in lxc-autostart. | 18:32 |
*** markvoelker has quit IRC | 18:32 | |
mgariepy | odyssey4me, mancdaz http://paste.ubuntu.com/11893981/ | 18:33 |
odyssey4me | sigmavirus24 interesting - I had set affinity: 1 and was getting issues over and over again, so trying now with the standard setting | 18:33 |
odyssey4me | This may be why though: https://review.openstack.org/200054 | 18:33 |
odyssey4me | not a bad review/patch, but it does mean that previously we never had restarts - and now we do | 18:34 |
sigmavirus24 | odyssey4me: blame hughsaunders then =P | 18:34 |
odyssey4me | sigmavirus24 that's git-harry :p | 18:34 |
sigmavirus24 | well git-harry didn't approve it | 18:34 |
sigmavirus24 | =P | 18:34 |
* sigmavirus24 is only kidding | 18:34 | |
jwitko | cloudnull, during the verification however i am experiencing issues | 18:35 |
odyssey4me | it's likely that during the restarts the cluster is getting all mixed up - perhaps we need some sort of serial restart, instead of simultaneous restarts | 18:35 |
jwitko | the galera container doesn't seem to recognize the mysql command | 18:35 |
jwitko | in the process list i don't see mysql running either | 18:35 |
jwitko | (mariadb) | 18:35 |
odyssey4me | sigmavirus24 ah, that patch makes it run in serial :/ | 18:35 |
sigmavirus24 | odyssey4me: oh okay | 18:37 |
sigmavirus24 | odyssey4me: similar patch with redis is up for review | 18:37 |
palendae | redis or rabbitmq? | 18:38 |
palendae | https://review.openstack.org/#/c/202681/4 is the rabbitmq one | 18:38 |
palendae | Not necessarily exactly the same, but making sure that we restart rabbit cluster members in the correct order after upgrades | 18:39 |
sigmavirus24 | sorry | 18:39 |
* sigmavirus24 's brain is fried | 18:39 | |
palendae | Similar approach, at least | 18:40 |
odyssey4me | yup, I'm a fan of the approach | 18:43 |
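The pattern under discussion, sketched as a play header; `serial: 1` makes Ansible finish each rabbitmq container before moving to the next, so the cluster members restart one at a time instead of simultaneously. This is illustrative, not the actual patch:

```yaml
- name: Install rabbitmq server
  hosts: rabbitmq_all
  serial: 1          # one cluster member at a time, preserving quorum
  user: root
  roles:
    - role: "rabbitmq_server"
```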
*** TheIntern has quit IRC | 18:59 | |
*** TheIntern has joined #openstack-ansible | 19:02 | |
jwitko | has anyone ever had an issue where a container didn't seem to install properly? | 19:03 |
palendae | jwitko: Can you elaborate? | 19:04 |
jwitko | i'm attempting to validate the setup-hosts.yml run as the guide instructs, and while attached to my galera container there is no mariadb process running and no mysql binary to use to attach to a DB | 19:04 |
palendae | I'm seeing containers failing to start right now (with a juno install) because cgmanager claims there are 100 cgroups with that name | 19:04 |
jwitko | mine is the kilo | 19:04 |
palendae | Ah | 19:04 |
palendae | I don't think I've seen that behavior before | 19:05 |
odyssey4me | jwitko you haven't yet installed anything - only the containers are created | 19:07 |
jwitko | thanks odyssey4me, cloudnull just helped me realize that | 19:08 |
odyssey4me | you've only done the setup-hosts play, right, and haven't yet run the setup-infrastructure play | 19:08 |
palendae | D'oh, yeah | 19:08 |
palendae | Yep, need to do the next step | 19:08 |
odyssey4me | :) palendae needs rest, as do we all :) | 19:08 |
mgariepy | odyssey4me, mancdaz fix will be pushed in lxc 1.1.3 next week. | 19:09 |
palendae | mgariepy: Cool - I was just noticing the slow autostart, too. Thanks for that update! | 19:10 |
cloudnull | mgariepy: which fix was that ? | 19:11 |
mgariepy | https://github.com/lxc/lxc/compare/7faa223603a8...76e484a7093c | 19:11 |
cloudnull | ah. the auto start bits | 19:11 |
mgariepy | lxc-autostart -L -g onboot was duplicating some entries.. | 19:11 |
mgariepy | don't you guys need to reboot your servers ? some times ? haha | 19:14 |
palendae | mgariepy: ! | 19:15 |
palendae | mgariepy: So would that have caused some issues where a container had 100 cgroups? | 19:15 |
mgariepy | booting a node with 21 containers slept 15 seconds 231 times before the last one started. | 19:16 |
palendae | Ok, so not the same thing | 19:16 |
git-harry | sigmavirus24: odyssey4me you approve it, you buy it. | 19:29 |
odyssey4me | cloudnull sigmavirus24 miguelgrinberg so the issue I've been fighting seems to relate to horizon's keystoneclient being redirected to the internal endpoint every time, even though I've configured horizon to use the public endpoint | 19:30 |
odyssey4me | for some reason keystone seems to tell the client to use the internal endpoint (http) instead of the public one (https) | 19:31 |
miguelgrinberg | what part of the auth flow is this? | 19:31 |
miguelgrinberg | are we past all the redirects at this point? | 19:32 |
*** sdake has quit IRC | 19:32 | |
sigmavirus24 | odyssey4me: interesting | 19:32 |
odyssey4me | miguelgrinberg yes, we're already past shibboleth | 19:32 |
odyssey4me | so apache is no longer involved - it is now just horizon's auth backend | 19:33 |
miguelgrinberg | the final redirect takes us into a keystone endpoint, and then keystone needs to redirect back to horizon. Is that last one between keystone and horizon that is having this problem? | 19:34 |
odyssey4me | miguelgrinberg join #openstack-keystone for the running analysis - here's a log to peek at: http://paste.openstack.org/show/7LIzjZ09I8bRoVuOl7as/ | 19:35 |
*** galstrom is now known as galstrom_zzz | 19:36 | |
*** wmlynch has quit IRC | 19:41 | |
odyssey4me | so I can fudge this by making keystone's apache change any json/xml content that comes through the public endpoint... but that's less desirable than having this work right | 19:43 |
*** TheIntern has quit IRC | 19:43 | |
*** ig0r_ has quit IRC | 19:47 | |
*** sacharya has quit IRC | 19:56 | |
*** galstrom_zzz is now known as galstrom | 20:03 | |
*** TheIntern has joined #openstack-ansible | 20:14 | |
odyssey4me | miguelgrinberg sigmavirus24 silly me, I should have set public_endpoint in keystone.conf to tell it how to present itself to clients :p | 20:33 |
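The option in question: keystone advertises whatever base URL is set in public_endpoint back to clients, so pointing it at the public (https) endpoint stops the redirects to the internal one. A minimal sketch, reusing the test host mentioned later in this conversation:

```ini
# /etc/keystone/keystone.conf
[DEFAULT]
public_endpoint = https://test1.pigeonbrawl.net:5000
```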
cloudnull | can a core push the button on this https://review.openstack.org/#/c/203063/ | 20:34 |
miguelgrinberg | odyssey4me: oh, that was it? | 20:34 |
odyssey4me | miguelgrinberg yep - now one more issue | 20:34 |
odyssey4me | I find that on the first auth I get redirected to keystone's public endpoint instead of back to horizon - need to figure that one out | 20:34 |
miguelgrinberg | odyssey4me: isn't keystone supposed to take the redirect and then send you to horizon? I think that's how it is supposed to work | 20:36 |
odyssey4me | miguelgrinberg yep, and that happens on the second auth - but not the first | 20:37 |
miguelgrinberg | what do you mean by "first auth"? | 20:38 |
miguelgrinberg | odyssey4me: I'm using the etherpad as a reference for the auth flow, lines 229-234. | 20:39 |
odyssey4me | access horizon, choose to use the idp, login to the idp, the redirect from the idp goes back to keystone's endpoint | 20:39 |
odyssey4me | in the same session, go back to the login page and retry - this time I go through to the summary page | 20:40 |
odyssey4me | try it for yourself: https://test1.pigeonbrawl.net | 20:40 |
miguelgrinberg | okay, so you mean multiple login attempts | 20:40 |
miguelgrinberg | odyssey4me: and the problem is the JSON blob that appears on the browser? | 20:41 |
odyssey4me | miguelgrinberg that's the keystone endpoint simply responding to a normal GET | 20:41 |
odyssey4me | but yes, that's it | 20:41 |
odyssey4me | if you go back in the same browser session, re-auth, then it'll go through to the horizon summary page | 20:42 |
miguelgrinberg | odd, it doesn't for me, I keep getting the JSON page | 20:42 |
odyssey4me | miguelgrinberg every time? | 20:43 |
miguelgrinberg | it alternates between that JSON page and an error "unable to retrieve authorized projects" | 20:44 |
miguelgrinberg | are you sure this redirect isn't set incorrectly in your testshib config? | 20:45 |
odyssey4me | miguelgrinberg oh - you're still seeing that error? that's odd | 20:46 |
odyssey4me | miguelgrinberg do you have strict cookies or something set? | 20:47 |
miguelgrinberg | let me try another browser, I'm using chrome with default settings now | 20:47 |
miguelgrinberg | well, safari worked the first time | 20:48 |
miguelgrinberg | and the second time I get the JSON page | 20:48 |
odyssey4me | so this inconsistency is what I need to figure out | 20:49 |
odyssey4me | the redirecting isn't right every time | 20:49 |
miguelgrinberg | odyssey4me: clearly there are a few redirects in sequence during the auth flow, but it appears this page comes directly from a redirect from testshib | 20:52 |
openstackgerrit | Kevin Carter proposed stackforge/os-ansible-deployment: Change to ensure container networks are up https://review.openstack.org/202821 | 20:52 |
openstackgerrit | Kevin Carter proposed stackforge/os-ansible-deployment: Change to ensure container networks are up https://review.openstack.org/202821 | 20:53 |
odyssey4me | miguelgrinberg yes, but I think it uses either something from the metadata or from the original page referring it | 20:54 |
miguelgrinberg | odyssey4me: did you have to enter the root keystone URL in testshib? | 20:54 |
odyssey4me | I saw this issue without SSL, and without SSL I'll be able to sniff the data between them so maybe that's the way to get that sorted | 20:54 |
odyssey4me | miguelgrinberg nope | 20:55 |
miguelgrinberg | I wonder where it is coming from then | 20:55 |
odyssey4me | only the metadata, so this: https://test1.pigeonbrawl.net:5000/Shibboleth.sso/Metadata | 20:55 |
odyssey4me | I think it's a referrer | 20:55 |
miguelgrinberg | this sounds painful, but maybe we need to use wireshark to figure out exactly the traffic | 20:56 |
odyssey4me | miguelgrinberg yeah - suffice to say that this can wait until monday - the big hurdle has been resolved | 20:57 |
miguelgrinberg | odyssey4me: yes, sure. I submitted all the IdP fixes earlier today, still waiting for the gate, but I think it'll go through | 20:59 |
*** galstrom is now known as galstrom_zzz | 21:00 | |
miguelgrinberg | odyssey4me: if you add my key to your SP host I can take a look at the config, maybe a set of fresh eyes will help | 21:03 |
odyssey4me | miguelgrinberg I'm looking at marekd's comment here: https://review.openstack.org/#/c/194395/29/playbooks/roles/os_keystone/templates/keystone-httpd.conf.j2,cm | 21:04 |
openstackgerrit | Kevin Carter proposed stackforge/os-ansible-deployment: Change to ensure container networks are up https://review.openstack.org/202821 | 21:05 |
miguelgrinberg | odyssey4me: maybe he means the saml2 portion should be *, to not make it hardcoded to saml2 | 21:05 |
miguelgrinberg | odyssey4me: yeah, see how they do it in the docs: http://docs.openstack.org/developer/keystone/federation/shibboleth.html | 21:06 |
odyssey4me | miguelgrinberg the WSGIScriptAliasMatch line is different | 21:07 |
odyssey4me | which makes sense - it tells keystone to handle any protocol | 21:07 |
odyssey4me | but shibboleth should only handle saml, because it doesn't do other protocols | 21:07 |
miguelgrinberg | ah sorry, looked at the wrong line, yes | 21:07 |
odyssey4me | yeah, I don't think that needs to be different - will have to chat to him to find out what he means | 21:08 |
odyssey4me | he's unfortunately just jumped onto a plane | 21:09 |
miguelgrinberg | maybe he confused it with the alias line, like I did | 21:09 |
odyssey4me | miguelgrinberg I see some tweaks which we can add in the adfs patch sets which make it logout properly, etc - but for now I think once you've added the key distribution we're good | 21:12 |
odyssey4me | we're good for non-ssl anyway | 21:12 |
odyssey4me | I can do the ssl-related stuff in a subsequent patch | 21:13 |
miguelgrinberg | sounds good | 21:14 |
miguelgrinberg | I'll have the shib cert distribution soon, working on that now | 21:14 |
odyssey4me | I think I should probably kill or shutdown these servers now - I've happily been sharing a lot of info about them :p | 21:15 |
odyssey4me | miguelgrinberg both the IDP and SP patches need a DocImpact tag, and we should probably put more details in the commit message about what the patch does, any upgrade impact, etc | 21:26 |
miguelgrinberg | okay, I can attempt that once I have the cert distribution | 21:27 |
odyssey4me | miguelgrinberg thanks, see the commit message in https://review.openstack.org/202977 for an idea of what I think is good practise | 21:28 |
miguelgrinberg | okay, will do | 21:28 |
odyssey4me | I think it makes it easier for reviewers to get the gist of what's going on. | 21:28 |
odyssey4me | thanks, have a great weekend! :) | 21:28 |
openstackgerrit | Kevin Carter proposed stackforge/os-ansible-deployment: Change to ensure container networks are up https://review.openstack.org/202821 | 21:29 |
miguelgrinberg | you too! :) | 21:29 |
*** blockedpipes has quit IRC | 21:29 | |
cloudnull | later odyssey4me! | 21:31 |
cloudnull | have a good one | 21:31 |
*** spotz is now known as spotz_zzz | 21:44 | |
marekd | miguelgrinberg: odyssey4me just responded - my 1st comment was wrong. | 21:48 |
miguelgrinberg | marekd: odyssey4me is gone for the day I think, what was this about? | 21:50 |
marekd | miguelgrinberg: the thing you discussed about protocol '*' | 21:51 |
miguelgrinberg | marekd: ah okay, good, we guessed that was the case | 21:52 |
odyssey4me | marekd :) thanks for getting back to us, and thank you for your help earlier! | 21:53 |
*** Mudpuppy has quit IRC | 21:54 | |
openstackgerrit | Merged stackforge/os-ansible-deployment: Adds retries https://review.openstack.org/203063 | 21:59 |
marekd | odyssey4me: i will be in EU tz again from monday. | 22:01 |
odyssey4me | marekd awesome - where are you now? | 22:01 |
*** CheKoLyN has quit IRC | 22:02 | |
openstackgerrit | Kevin Carter proposed stackforge/os-ansible-deployment: Adds retries https://review.openstack.org/203257 | 22:03 |
marekd | odyssey4me: Boston Int Airport | 22:04 |
marekd | waiting for my flight | 22:04 |
marekd | killing time on IRC. | 22:05 |
marekd | :P | 22:05 |
openstackgerrit | Kevin Carter proposed stackforge/os-ansible-deployment: Fix general upgrade issues for Juno > Kilo https://review.openstack.org/202821 | 22:06 |
odyssey4me | marekd have a good flight - time for me to enjoy a glass of wine and spend some time with my wife :) | 22:08 |
odyssey4me | night all, for good this time :p | 22:08 |
odyssey4me | cloudnull have a great weekend! | 22:08 |
cloudnull | will do and you too! | 22:08 |
marekd | odyssey4me: sure thing! | 22:09 |
openstackgerrit | Miguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Service Provider Configuration https://review.openstack.org/194395 | 22:22 |
openstackgerrit | Miguel Grinberg proposed stackforge/os-ansible-deployment: Keystone Federation Identity Provider Configuration https://review.openstack.org/194259 | 22:30 |
*** annashen has quit IRC | 23:06 | |
*** annashen has joined #openstack-ansible | 23:08 | |
*** metral_zzz is now known as metral | 23:32 | |
openstackgerrit | Merged stackforge/os-ansible-deployment: Container create/system tuning https://review.openstack.org/202268 | 23:51 |
*** alop has quit IRC | 23:57 |