*** sacharya_ has quit IRC | 00:00 | |
alextricity25 | cloudnull still around? | 00:00 |
---|---|---|
cloudnull | thetrav: that issue looks fine. | 00:01 |
cloudnull | something to dig into for sure. | 00:01 |
cloudnull | alextricity25: sure, whats up? | 00:01 |
thetrav | you know what though | 00:01 |
*** markvoelker has quit IRC | 00:01 | |
thetrav | I realised I hadn't done an apt-get dist-upgrade -y | 00:01 |
alextricity25 | cloudnull: Have you tried building a multi-node from master lately? | 00:01 |
thetrav | I saw https://bugs.launchpad.net/openstack-ansible/+bug/1595323 which made me think of it | 00:01 |
openstack | Launchpad bug 1595323 in openstack-ansible "doc: kernel version requirements for mitaka install" [Undecided,Confirmed] - Assigned to Matt Dorn (madorn) | 00:01 |
thetrav | went from 3.13.0-40-generic to 3.13.0-91-generic | 00:02 |
*** markvoelker has joined #openstack-ansible | 00:02 | |
cloudnull | thetrav: maybe a kernel bug fix in there that makes that happier? | 00:02 |
*** markvoelker has quit IRC | 00:02 | |
thetrav | so far it appears to be holding steady at 16 handles | 00:02 |
thetrav | yeah | 00:02 |
cloudnull | interesting . | 00:02 |
*** markvoelker has joined #openstack-ansible | 00:02 | |
cloudnull | alextricity25: no not recently. | 00:02 |
thetrav | I mean, dist-upgrade does more than just the kernel | 00:02 |
thetrav | there was a whole bunch of stuff | 00:02 |
cloudnull | I did do the osic upgrade not long ago | 00:02 |
cloudnull | and that was liberty | 00:03 |
cloudnull | thetrav: for sure. | 00:03 |
cloudnull | I guess it could've been a whole host of fixes | 00:03 |
alextricity25 | cloudnull: It looks like the synchronize ansible module is broken | 00:03 |
cloudnull | which may have been something in python itself | 00:03 |
thetrav | so, I guess it's my bad. If I had to make a suggestion I'd recommend a dist-upgrade as part of the bootstrap script for the deployment node | 00:03 |
cloudnull | thetrav: we can do that | 00:03 |
alextricity25 | cloudnull: https://gist.github.com/alextricity25/20c7045737324f4cc991864fd8ba1f65 | 00:04 |
cloudnull | alextricity25: what version of ansible? | 00:04 |
alextricity25 | cloudnull: v2.1 | 00:04 |
thetrav | oop, no, spoke too soon | 00:04 |
thetrav | same error, same spot | 00:04 |
thetrav | ahh | 00:04 |
thetrav | I was running my lsof watch as ubuntu | 00:04 |
thetrav | couldn't see what root was up to | 00:04 |
thetrav | so ignore what I said. dist-upgrade fixes nothing | 00:05 |
cloudnull | alextricity25: hum. | 00:05 |
alextricity25 | cloudnull: Related? https://github.com/ansible/ansible/issues/15405 | 00:05 |
cloudnull | ill spin an env in a few and see how it goes. | 00:05 |
cloudnull | alextricity25: maybe related. | 00:06 |
cloudnull | ima go eat and spin an env and see what happens. | 00:06 |
alextricity25 | cloudnull: let me know what you get. Thanks buddy | 00:06 |
cloudnull | thetrav: ill look into the FS issue too. | 00:07 |
thetrav | \o/ | 00:07 |
thetrav | mostly I just hope you can reproduce it | 00:07 |
thetrav | well, no, that's not true, mostly I hope you can fix it ;) | 00:07 |
*** adrian_otto1 has quit IRC | 00:08 | |
*** adrian_otto has joined #openstack-ansible | 00:09 | |
*** mummer has quit IRC | 00:10 | |
thetrav | my biggest suspect is still opening sub-processes with pipes and not closing stdin stdout or stderr | 00:13 |
*** thorst has joined #openstack-ansible | 00:13 | |
thetrav | looking for ways to monkey patch the built ins | 00:13 |
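To illustrate the suspicion above (sub-processes opened with pipes whose stdin, stdout, or stderr are never closed), here is a minimal Python sketch. It is not taken from Ansible's code; the helper names `open_fd_count`, `spawn_child`, and `close_child` are hypothetical. It only shows that the parent holds three extra descriptors per piped child until it closes its ends of the pipes.

```python
import os
import subprocess
import sys


def open_fd_count(limit=1024):
    """Count descriptors in [0, limit) that are currently open in this process."""
    count = 0
    for fd in range(limit):
        try:
            os.fstat(fd)
        except OSError:
            continue  # not an open descriptor
        count += 1
    return count


def spawn_child():
    """Start a short-lived child with all three standard streams piped."""
    return subprocess.Popen(
        [sys.executable, "-c", "pass"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


def close_child(proc):
    """Wait for the child, then close the parent's ends of its pipes."""
    proc.wait()
    for stream in (proc.stdin, proc.stdout, proc.stderr):
        stream.close()


if __name__ == "__main__":
    before = open_fd_count()
    children = [spawn_child() for _ in range(5)]
    print("descriptors held while children are unclosed:", open_fd_count() - before)
    for child in children:
        close_child(child)
    print("descriptors held after closing:", open_fd_count() - before)
```

If a long-running program keeps spawning children like this without ever calling something equivalent to `close_child`, the count only grows, which is the pattern being suspected here.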
*** thorst has quit IRC | 00:21 | |
openstackgerrit | Darren Chan proposed openstack/openstack-ansible: [docs] Revise overview chapter in OSA install guide https://review.openstack.org/331966 | 00:28 |
*** michaelgugino has quit IRC | 00:38 | |
*** jthorne_ has joined #openstack-ansible | 00:40 | |
*** jthorne has quit IRC | 00:40 | |
*** adrian_otto has quit IRC | 00:42 | |
*** jthorne_ has quit IRC | 00:45 | |
*** asettle has joined #openstack-ansible | 00:45 | |
*** asettle has quit IRC | 00:50 | |
*** thorst has joined #openstack-ansible | 00:59 | |
*** wadeholler has joined #openstack-ansible | 01:01 | |
*** appprod0 has quit IRC | 01:02 | |
*** sacharya has joined #openstack-ansible | 01:11 | |
*** openstack has joined #openstack-ansible | 01:25 | |
*** ManojK has quit IRC | 01:43 | |
*** thorst has quit IRC | 01:43 | |
*** thorst has joined #openstack-ansible | 01:44 | |
*** daneyon has quit IRC | 01:50 | |
*** thorst has quit IRC | 01:52 | |
thetrav | so is anyone actually using the mitaka version of openstack-ansible? | 02:03 |
mcarden | I have spun up a few Mitaka AIOs on cloud VMs. | 02:11 |
*** sacharya_ has joined #openstack-ansible | 02:26 | |
*** appprod0 has joined #openstack-ansible | 02:27 | |
*** sacharya has quit IRC | 02:28 | |
*** raddaoui has joined #openstack-ansible | 02:32 | |
thetrav | mcarden All In One? using the openstack-ansible scripts? | 02:34 |
thetrav | I thought it was supposed to be all HA n stuff | 02:34 |
*** woodard has quit IRC | 02:37 | |
*** woodard has joined #openstack-ansible | 02:39 | |
*** woodard_ has joined #openstack-ansible | 02:41 | |
mcarden | thetrav: Yep. All In One via the scripts | 02:41 |
*** woodard has quit IRC | 02:42 | |
thetrav | so you just have one host in the openstack_user_config.yml file? | 02:42 |
mcarden | Lots of hosts - mostly containers. | 02:43 |
thetrav | oh | 02:44 |
*** wadeholler has quit IRC | 02:44 | |
thetrav | I thought the playbooks created containers? | 02:44 |
thetrav | or are you nesting them? | 02:44 |
*** wadeholler has joined #openstack-ansible | 02:44 | |
thetrav | I don't suppose you'd let me have a peek at your openstack_user_config.yml would you? | 02:44 |
thetrav | I've been trying to deal with this file descriptor leak and not getting anywhere. Wondering if I've set things up incorrectly | 02:45 |
mcarden | Sure. Let me get one. | 02:45 |
mcarden | thetrav: http://paste.openstack.org/show/523879/ | 02:47 |
*** woodard_ has quit IRC | 02:50 | |
*** thorst has joined #openstack-ansible | 02:50 | |
thetrav | cheers | 02:50 |
thetrav | looks quite similar to the one I've got | 02:52 |
thetrav | mine doesn't have those affinity in them | 02:52 |
thetrav | also mine has 3 hosts for each block where yours has only aio1 (except compute) | 02:53 |
thetrav | I wonder if I put mine back down to one host if it'll do better | 02:54 |
thetrav | also, does affinity: something: 1 mean only one instance? | 02:54 |
thetrav | so for example you have a galera cluster with only a single galera instance? | 02:54 |
mcarden | IIRC affinity 1 means two instances. | 02:55 |
thetrav | ok... starting at zero or something? | 02:56 |
mcarden | There's doc about it somewhere... | 02:56 |
*** chandanc_ has joined #openstack-ansible | 02:57 | |
*** thorst has quit IRC | 02:57 | |
mcarden | Looks like I'm wrong: http://docs.openstack.org/developer/openstack-ansible/install-guide/configure-initial.html#affinity | 02:58 |
cloudnull | thetrav: so far I've not been able to recreate the issue. | 03:06 |
thetrav | cloudnull would it help if I posted my config? | 03:07 |
cloudnull | I have built an env using 14 nodes + stable/mitaka | 03:07 |
cloudnull | maybe ? | 03:07 |
*** jamielennox is now known as jamielennox|away | 03:07 | |
cloudnull | do you have the openstack_user_config.yml file handy ? | 03:07 |
*** jamielennox|away is now known as jamielennox | 03:07 | |
thetrav | http://cdn.pasteraw.com/2457wjmllptyc6teg9ek4y0ftm0m5dt | 03:07 |
thetrav | you are invoking the setup-hosts.yml ? | 03:08 |
thetrav | so what I'm trying now, is splitting setup-hosts.yml into its component includes | 03:09 |
thetrav | if I run one include at a time that ensures all file handles are released | 03:09 |
thetrav | I'm in work around territory | 03:09 |
*** sacharya_ has quit IRC | 03:20 | |
*** sacharya has joined #openstack-ansible | 03:20 | |
*** weezS has joined #openstack-ansible | 03:36 | |
*** adrian_otto has joined #openstack-ansible | 03:42 | |
cloudnull | thetrav: The config looks just fine. | 03:42 |
cloudnull | how are the commands being invoked? | 03:43 |
cloudnull | is it ansible automation invoking openstack-ansible ? | 03:44 |
mhayden | aww mkrish left | 03:44 |
mhayden | and i finally had time to talk ipv6 | 03:44 |
* mhayden is filled with sads | 03:45 | |
cloudnull | mkrish should be back in the AM i'd imagine. | 03:45 |
cloudnull | thetrav: i just cant get it to explode. | 03:45 |
cloudnull | :| | 03:45 |
thetrav | nohup /usr/local/bin/ansible-playbook -vvv -e @/etc/openstack_deploy/user_secrets.yml -e @/etc/openstack_deploy/user_variables.yml --forks=1 setup-hosts.yml & | 03:46 |
thetrav | the initial failure used the openstack-ansible script, however I wanted more verbose output and fewer forks | 03:46 |
thetrav | so my cidr's are /24 instead of /22 (don't think it matters) | 03:47 |
thetrav | I have multiple hosts in all the infrastructure bits and there's no affinity | 03:47 |
thetrav | that's about all I can think of | 03:48 |
thetrav | the deploy_host is a small ubuntu node on openstack from the cloud_image | 03:48 |
thetrav | the targets are ubuntu as installed by MaaS | 03:48 |
mhayden | cloudnull: being on PDT is weird | 03:49 |
*** sacharya has quit IRC | 03:49 | |
thetrav | if you watch the output of `lsof | grep '^ansible-p' | grep /dev/null | wc -l` while the build executes do you notice the number climbing? | 03:49 |
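The `lsof` pipeline above counts descriptors that point at `/dev/null`. A rough Python equivalent for a single process is sketched below, assuming a Linux host with `/proc` mounted; the function name `count_fds_pointing_at` is hypothetical. As noted earlier in the log, the watcher has to run as the same user as the process (or as root), otherwise the listing raises a permission error.

```python
import os


def count_fds_pointing_at(pid, target="/dev/null"):
    """Count open descriptors of `pid` whose /proc symlink resolves to `target`.

    Linux-only sketch: relies on /proc/<pid>/fd. Raises PermissionError when
    the caller is not allowed to inspect that process.
    """
    fd_dir = f"/proc/{pid}/fd"
    count = 0
    for name in os.listdir(fd_dir):
        try:
            if os.readlink(os.path.join(fd_dir, name)) == target:
                count += 1
        except OSError:
            # The descriptor was closed between listing and reading it.
            continue
    return count


if __name__ == "__main__":
    # Inspect this process as a demonstration; pass the PID of the
    # ansible-playbook process to watch a real run.
    print(count_fds_pointing_at(os.getpid()))
```

Calling this in a loop with a short sleep gives the same "is the number climbing?" signal as re-running the `lsof` pipeline.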
*** sacharya has joined #openstack-ansible | 03:49 | |
thetrav | I assume you're using ubuntu 14.04? | 03:49 |
cloudnull | mhayden: have you seen a FD issue with the sec role by chance? | 03:50 |
mhayden | file descriptor? | 03:50 |
cloudnull | mhayden: BTW I love being on the Left coast :) | 03:51 |
mhayden | floppy disk? | 03:51 |
cloudnull | i miss my SF some times | 03:51 |
mhayden | it's chilly | 03:51 |
cloudnull | thetrav: is having a file descriptor issue | 03:51 |
thetrav | I thought it was summer up there? You guys should visit Melbourne if you want cold | 03:51 |
mhayden | it's like 53F in the mornings here | 03:52 |
mhayden | 11.7 C | 03:52 |
thetrav | yeah, we're getting down to 41F | 03:52 |
thetrav | 5C | 03:52 |
*** chandanc_ has quit IRC | 03:53 | |
* thetrav waits for a Canadian to show him up | 03:53 | |
*** bryan_att has quit IRC | 03:53 | |
cloudnull | The coldest winter I ever saw was the summer I spent in San Francisco. -- Mark Twain | 03:53 |
*** chandanc_ has joined #openstack-ansible | 03:53 | |
*** thorst has joined #openstack-ansible | 03:55 | |
cloudnull | thetrav: so i've got two deploys going right now and 1, running master, has 0 open FDs and the other, running mitaka, has 17. | 03:57 |
cloudnull | im poking it though | 03:58 |
thetrav | I tend to see a ton opened up in the security_hardening phase | 03:59 |
thetrav | any chance I can see your user_config and user_variables? | 03:59 |
thetrav | my user_variables is all commented out except glance_default_store: file | 03:59 |
thetrav | because apparently if you have the file but no variables in it you get an exception | 04:00 |
*** albertcard has quit IRC | 04:00 | |
*** thorst has quit IRC | 04:03 | |
cloudnull | thetrav: sure . let me put that together. | 04:03 |
cloudnull | thetrav: http://cdn.pasteraw.com/vuqdp84inzncp6xkji07luvyn3c3k6 | 04:06 |
cloudnull | thats the collection of "cat openstack_user_config.yml conf.d/*.yml" | 04:07 |
openstackgerrit | Michael Carden proposed openstack/openstack-ansible: conditionally include the scsi_dh kernel module https://review.openstack.org/335202 | 04:13 |
*** Drago1 has joined #openstack-ansible | 04:13 | |
*** Drago1 has joined #openstack-ansible | 04:14 | |
*** zerda2 has joined #openstack-ansible | 04:15 | |
*** Drago1 has quit IRC | 04:19 | |
*** markvoelker has quit IRC | 04:31 | |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/openstack-ansible-lxc_hosts: Update the version of LXC installed to the latest stable https://review.openstack.org/335301 | 04:40 |
cloudnull | thetrav: still beating on it, i've just not been able to make it die in a fire quite yet. | 04:41 |
thetrav | sorry, got pulled away, am just looking over the user config now | 04:41 |
cloudnull | no worries. | 04:42 |
thetrav | 22 mask is bigger than 24 right? | 04:42 |
cloudnull | yes | 04:42 |
thetrav | your used ip ranges are bigger too | 04:42 |
thetrav | the file looks like output from something rather than the thing you get when you follow the online instructions | 04:42 |
thetrav | all the notes not to write to this file | 04:43 |
cloudnull | yes my env was built using https://github.com/cloudnull/osa-multi-node-aio | 04:43 |
cloudnull | the build scripts create those files. | 04:44 |
cloudnull | the end result is a 14 node deployment if you change nothing and just run it | 04:45 |
cloudnull | where the file starts "---" is a new file. | 04:46 |
cloudnull | I just break out the file into multiple ones instead of having them all in one big one. | 04:46 |
thetrav | ok, so this is my equivalent: http://cdn.pasteraw.com/6c9jeuirth2vnq383itb1ds2g2zruqc | 04:49 |
thetrav | after that I ssh into deploy_host and run the ansible-playbook setup-hosts.yml thinger | 04:50 |
*** adrian_otto has quit IRC | 04:50 | |
*** adrian_otto has joined #openstack-ansible | 04:54 | |
cloudnull | thetrav: so this looks like the issue you're seeing https://github.com/ansible/ansible/issues/15182 | 04:55 |
cloudnull | which was an issue invoked through the api | 04:55 |
cloudnull | but in the end its a FD issue similar to what youre seeing. | 04:56 |
thetrav | first scan yes, looks very similar | 04:56 |
thetrav | the /dev/null thing too | 04:56 |
cloudnull | there was never any follow up on it. | 04:56 |
thetrav | unfortunately it's a bit challenging to parse his issue | 04:57 |
thetrav | is that code a library or something? | 04:57 |
thetrav | oh | 04:57 |
cloudnull | yea, it's using the Ansible internals instead of the cli clients. | 04:57 |
thetrav | it's programmatically invoking ansible | 04:57 |
cloudnull | just for shits and grins, would you mind trying to run ansible installed from git using the stable-1.9 branch ? | 04:58 |
cloudnull | there have been quite a few bug fixes that have gone in that were never part of a tag | 04:58 |
cloudnull | maybe helps? | 04:58 |
*** pcaruana has quit IRC | 04:58 | |
* cloudnull still grasping at straws | 04:58 | |
thetrav | so I've tried using the most recent 2.1 ansible | 04:58 |
thetrav | same result | 04:59 |
cloudnull | ok | 04:59 |
cloudnull | then no. | 04:59 |
cloudnull | are you running the command in screen or tmux ? | 04:59 |
thetrav | tmux | 05:00 |
thetrav | well | 05:00 |
thetrav | ssh + nohup | 05:00 |
*** sacharya has quit IRC | 05:00 | |
cloudnull | using additional logging when the shell was invoked? | 05:00 |
thetrav | not sure if that counts as what | 05:00 |
thetrav | I have -vvv switched on | 05:00 |
thetrav | nohup /usr/local/bin/ansible-playbook -vvv -e @/etc/openstack_deploy/user_secrets.yml -e @/etc/openstack_deploy/user_variables.yml --forks=1 setup-hosts.yml & | 05:00 |
thetrav | that is how I invoke it | 05:00 |
*** thorst has joined #openstack-ansible | 05:01 | |
cloudnull | interesting. so i think its the fact that its nohup. | 05:05 |
cloudnull | i can make it happen like so | 05:06 |
cloudnull | https://snag.gy/LuZa1f.jpg | 05:06 |
cloudnull | w/out nohup, even backgrounding it, nope. | 05:06 |
thetrav | ? | 05:06 |
thetrav | so nohup may be causing it to fail? | 05:06 |
*** M00nr41n has quit IRC | 05:06 | |
thetrav | well that's surprising | 05:07 |
cloudnull | that is | 05:07 |
cloudnull | when you had nohup in the command i never even thought to try that. | 05:07 |
*** thorst has quit IRC | 05:08 | |
mcarden | I'd have thought that using tmux would take away any need for nohup. | 05:08 |
thetrav | tmux = ? | 05:09 |
mcarden | Sorry, I thought you confirmed earlier using tmux | 05:09 |
thetrav | yeah I may have incorrectly assumed | 05:10 |
thetrav | ok, right | 05:10 |
thetrav | no | 05:10 |
thetrav | tmux is a specific program | 05:10 |
thetrav | not some fancy way of saying terminal emulation | 05:10 |
mcarden | Sorry. So I always use tmux for ssh to long running things. | 05:10 |
mcarden | yep. apt-get install tmux | 05:11 |
thetrav | so tmux is pretty much a fancy version of screen? | 05:11 |
cloudnull | yup | 05:11 |
mcarden | Yep | 05:11 |
thetrav | ok, cool | 05:11 |
thetrav | are you able to do your thing using tmux and not get the fd issue? | 05:11 |
cloudnull | good read https://gist.github.com/MohamedAlaa/2961058 | 05:11 |
thetrav | cause that'd take away my need for nohup | 05:11 |
cloudnull | I run my default shell in tmux | 05:11 |
cloudnull | when I login to my servers, im in tmux | 05:11 |
thetrav | gonna infer the 'yes' from that response :D | 05:12 |
cloudnull | yes. :) | 05:12 |
cloudnull | you could use screen too | 05:12 |
cloudnull | if you're more familiar with that | 05:12 |
mcarden | ...but tmux is cooler. :) | 05:12 |
thetrav | I have never learned to use screen properly | 05:12 |
thetrav | I don't even know how to scroll back in it | 05:12 |
thetrav | that's why I use nohup and tail -f | 05:13 |
thetrav | does tmux continue to operate when I disconnect? | 05:13 |
thetrav | similar to screen? | 05:13 |
cloudnull | this is a better cheatsheet https://tmuxcheatsheet.com/ | 05:13 |
cloudnull | yes | 05:13 |
*** adrian_otto has quit IRC | 05:13 | |
thetrav | thanks | 05:14 |
mcarden | If you get disconnected, just ssh back in and 'tmux attach -t session-name' and you'll be where you were | 05:14 |
thetrav | rad | 05:16 |
thetrav | ok, so I also just discovered that the deploy_host can't route to the container network ;P | 05:16 |
*** adrian_otto has joined #openstack-ansible | 05:16 | |
thetrav | so I'm gonna need the guy who controls the palo alto to update the routes for me | 05:16 |
thetrav | once that's set up I'll get back to you if I get success | 05:17 |
thetrav | really hoping it works out. Tired of supporting my own ansible playbooks | 05:18 |
thetrav | also trove | 05:18 |
thetrav | mmmm troooove | 05:18 |
cloudnull | so just to confirm, when I nohup ansible commands w/ lots of hosts it explodes quickly. https://asciinema.org/a/62w2uj7tfnqm1zy9tg6uyww55 | 05:21 |
cloudnull | IDK if the screen cast will be nice | 05:21 |
cloudnull | but that was my test case to see it die in a fire | 05:22 |
cloudnull | thetrav: have you worked on trove before? | 05:24 |
cloudnull | we don't have a trove role. | 05:24 |
* cloudnull makes sure of that | 05:24 | |
*** adrian_otto has quit IRC | 05:24 | |
cloudnull | but it'd be nice to make it go | 05:24 |
*** adrian_otto has joined #openstack-ansible | 05:24 | |
mcarden | Nice demo cloudnull. I guess the takeaway is "So don't do that" | 05:25 |
cloudnull | Yea. im not sure how to make ansible + nohup = happy, but i think the fix is "dont do that" :) | 05:26 |
thetrav | no, trove is new to me | 05:27 |
thetrav | I've bounced off it a couple of times | 05:27 |
thetrav | so it's not like, 100% new | 05:27 |
thetrav | but I haven't got a working install of the service, nor have I built any of my own db images | 05:28 |
thetrav | I noticed in mitaka it got added to the apt-get repo and install docs however | 05:28 |
thetrav | so that gives me hope that I can make it happen this time | 05:28 |
cloudnull | well if it's something you're interested in working on it'd be great to get a role together for it. | 05:29 |
cloudnull | and from what I just read you're the expert =) | 05:29 |
thetrav | heheh | 05:29 |
thetrav | if I can make a contribution I will do so | 05:29 |
cloudnull | cool . so im going to bed. | 05:29 |
cloudnull | good chat though | 05:30 |
thetrav | cheers, sleep well | 05:30 |
cloudnull | thetrav: if you get a moment to have a look at the launchpad issue raised and comment/close it I'd appreciate it. | 05:30 |
cloudnull | mcarden: cheers brother ttyl | 05:30 |
cloudnull | night all. | 05:30 |
mcarden | cya cloudnull | 05:31 |
*** markvoelker has joined #openstack-ansible | 05:32 | |
mcarden | thetrav: If you do end up interested in a trove role, here's the 'getting going' guide for role development: http://docs.openstack.org/developer/openstack-ansible/developer-docs/additional-roles.html#role-development-maturity | 05:36 |
*** markvoelker has quit IRC | 05:37 | |
*** adrian_otto has quit IRC | 05:44 | |
*** McMurlock1 has joined #openstack-ansible | 05:45 | |
*** McMurlock1 has quit IRC | 05:49 | |
*** javeriak has joined #openstack-ansible | 06:02 | |
*** appprod0 has quit IRC | 06:03 | |
*** thorst has joined #openstack-ansible | 06:05 | |
*** chhavi has joined #openstack-ansible | 06:07 | |
chhavi | Hi all, am facing issue while accessing the VM using the VNC console | 06:08 |
*** M00nr41n has joined #openstack-ansible | 06:08 | |
*** karimb has joined #openstack-ansible | 06:10 | |
chhavi | it's not accepting any keyboard input, does openstack-ansible block any ports? | 06:11 |
*** thorst has quit IRC | 06:12 | |
*** pcaruana has joined #openstack-ansible | 06:16 | |
*** pcaruana is now known as pcaruana|afk| | 06:19 | |
*** M00nr41n has quit IRC | 06:22 | |
*** M00nr41n has joined #openstack-ansible | 06:23 | |
*** deadnull has quit IRC | 06:28 | |
*** markvoelker has joined #openstack-ansible | 06:33 | |
*** M00nr41n has quit IRC | 06:33 | |
*** M00nr41n has joined #openstack-ansible | 06:34 | |
*** karimb has quit IRC | 06:36 | |
*** markvoelker has quit IRC | 06:37 | |
*** bootsha has joined #openstack-ansible | 06:38 | |
*** weezS has quit IRC | 06:39 | |
*** deadnull has joined #openstack-ansible | 06:46 | |
*** pcaruana|afk| is now known as pcaruana | 06:49 | |
*** raddaoui has quit IRC | 06:57 | |
*** appprod0 has joined #openstack-ansible | 07:00 | |
*** appprod0 has quit IRC | 07:05 | |
*** jiteka has joined #openstack-ansible | 07:08 | |
*** thorst has joined #openstack-ansible | 07:10 | |
*** chhavi has quit IRC | 07:12 | |
*** tlbr has quit IRC | 07:17 | |
*** thorst has quit IRC | 07:17 | |
*** tlbr has joined #openstack-ansible | 07:19 | |
*** chhavi has joined #openstack-ansible | 07:20 | |
*** karimb has joined #openstack-ansible | 07:26 | |
*** bootsha has quit IRC | 07:28 | |
*** jhyang has joined #openstack-ansible | 07:30 | |
*** bootsha has joined #openstack-ansible | 07:31 | |
*** jhyang is now known as derekjhyang | 07:32 | |
*** markvoelker has joined #openstack-ansible | 07:34 | |
*** markvoelker has quit IRC | 07:38 | |
*** vnogin_ has joined #openstack-ansible | 07:46 | |
*** vnogin has quit IRC | 07:46 | |
vnogin_ | good morning | 07:47 |
evrardjp | morning everyone | 07:58 |
*** tlbr has quit IRC | 08:00 | |
javeriak | morning evrardjp | 08:01 |
javeriak | im having some ansible ssh problems, it just wont reach the targets, while a direct ssh to that IP works, any idea what could be wrong | 08:02 |
*** karimb has quit IRC | 08:02 | |
javeriak | tried debug mode as well, all it says at the end is FAILED => SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh | 08:02 |
*** tlbr has joined #openstack-ansible | 08:03 | |
*** neilus has joined #openstack-ansible | 08:04 | |
*** tlbr has quit IRC | 08:04 | |
*** tlbr has joined #openstack-ansible | 08:07 | |
*** bootsha has quit IRC | 08:12 | |
*** thorst has joined #openstack-ansible | 08:15 | |
*** thetrav has quit IRC | 08:17 | |
ioni | javeriak, make sure that the host you are running ansible, can also connect on the network assigned for containers | 08:19 |
*** thorst has quit IRC | 08:23 | |
javeriak | ioni yes ssh works otherwise | 08:25 |
*** admin0 has joined #openstack-ansible | 08:26 | |
*** chandanc_ has quit IRC | 08:27 | |
*** admin0 has quit IRC | 08:29 | |
*** electrofelix|afk is now known as electrofelix | 08:31 | |
*** karimb has joined #openstack-ansible | 08:31 | |
*** bootsha has joined #openstack-ansible | 08:31 | |
*** markvoelker has joined #openstack-ansible | 08:34 | |
*** chandanc_ has joined #openstack-ansible | 08:35 | |
kamsz | what is the proper way to restart the containers on infra? | 08:36 |
kamsz | lxc-system-manage containers-restart? | 08:36 |
*** markvoelker has quit IRC | 08:38 | |
*** asettle has joined #openstack-ansible | 08:46 | |
evrardjp | javeriak: interesting | 09:01 |
evrardjp | javeriak: which version of ansible are you using, which connection plugin? | 09:01 |
evrardjp | did you try to have the triple v ? | 09:01 |
*** appprod0 has joined #openstack-ansible | 09:01 | |
evrardjp | I mean something like "openstack-ansible playbook.yml -vvv" | 09:02 |
javeriak | evrardjp yes tried with vvv, the last message is that FAIL SSH error | 09:02 |
javeriak | ansible should be old, whatever was pinned with OSA kilo | 09:02 |
evrardjp | kamsz: go to your deploy nodes, and do a good ansible -m shell -a reboot <the container group you want> | 09:02 |
evrardjp | pay attention to what you want, some (like galera and rabbit) don't like that | 09:03 |
*** bootsha has quit IRC | 09:03 | |
evrardjp | javeriak: more and more interesting if it's not a network issue | 09:04 |
kamsz | evrardjp: yeah, i've noticed that galera doesn't like it :p | 09:04 |
evrardjp | could you show what's in the -vvv ? | 09:04 |
evrardjp | kamsz: there is doc for that in the operations guide on the docs | 09:04 |
evrardjp | not only the error I mean | 09:04 |
evrardjp | javeriak: ^ | 09:04 |
kamsz | evrardjp: for cluster recovery? yeah, i've followed it and got galera up and running | 09:05 |
*** appprod0 has quit IRC | 09:06 | |
javeriak | evrardjp : the start has the ansible trace and then later i tried running the ssh command directly http://paste.ubuntu.com/18086570/ | 09:10 |
javeriak | so looks like it might not be ansible | 09:10 |
*** berendt has quit IRC | 09:12 | |
*** bootsha has joined #openstack-ansible | 09:15 | |
evrardjp | definitely ssh | 09:16 |
evrardjp | and a task issue | 09:17 |
evrardjp | what's your ansible.cfg ? | 09:20 |
javeriak | evrardjp yes, but i cant quite make sense of this trace, it has no apparent error, a simple "ssh <IP>" works, just not with these options | 09:20 |
*** thorst has joined #openstack-ansible | 09:20 | |
javeriak | the ansible.cfg is default https://github.com/openstack/openstack-ansible/blob/kilo/playbooks/ansible.cfg | 09:22 |
evrardjp | the command asked should be fine because it's ping module | 09:23 |
evrardjp | nothing fancy in the call itself | 09:23 |
evrardjp | the key exchange look fine, and the authentication seems fine | 09:23 |
evrardjp | but then you start your command, and don't get the info until you arrive on ControlPersist timeout | 09:24 |
evrardjp | could you try this ? | 09:25 |
evrardjp | /bin/sh -c LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python | 09:25 |
evrardjp | directly on the connected node | 09:25 |
evrardjp | (infra1 here) | 09:25 |
evrardjp | and see if it's fast or not | 09:25 |
evrardjp | hint: according to your trace, it should | 09:25 |
evrardjp | also you have to ssh 10.100.1.2 | 09:26 |
evrardjp | for consistency | 09:26 |
*** thorst has quit IRC | 09:28 | |
javeriak | evrardjp ive lost access to the system for now, so im just looking for leads, will give that a try in the morning | 09:29 |
evrardjp | you could have an intermittent network issue | 09:30 |
evrardjp | don't hesitate to try different connection plugins and different ssh connection configurations | 09:30 |
javeriak | so i should ssh into 10.100.1.2 first and then run /bin/sh -c LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python ? | 09:30 |
evrardjp | yes, this is like the first basic test | 09:31 |
javeriak | like paramiko? I've never tried that though | 09:31 |
javeriak | so what would we be testing with that command? | 09:31 |
evrardjp | it should be instantaneous to connect, and the command should return to your host instantly | 09:31 |
*** admin0 has joined #openstack-ansible | 09:31 | |
evrardjp | just to make sure the ping module could work and answer directly | 09:32 |
evrardjp | if you have an answer directly, it means it's not the cause of your timeout | 09:33 |
evrardjp | I strongly suspect this will not be the case | 09:33 |
admin0 | automagically: here ? | 09:36 |
javeriak | evrardjp im not quite following sorry; by the ping module do you mean the ansible ping? we know that doesnt work | 09:36 |
evrardjp | javeriak: I'll try to explain better | 09:36 |
evrardjp | by manually sshing and typing the command, you manually do the same things the ping module does | 09:37 |
evrardjp | however, you'll be using an interactive session | 09:37 |
evrardjp | you can have more feedback to it | 09:37 |
evrardjp | with it | 09:38 |
evrardjp | so by doing this, you can first see timings | 09:38 |
evrardjp | (understand how long it takes to connect and to execute this command) | 09:38 |
evrardjp | if the command executes instantly it's fine, it's the working behavior | 09:38 |
evrardjp | if not, you have python issues | 09:39 |
evrardjp | if the ssh connection takes long, you may have network issues, etc | 09:39 |
evrardjp | (maybe check the ssh server for UseDNS and things like this) | 09:39 |
evrardjp | every ssh setting you've defined as applying for the nodes you're connecting to (in your ~/.ssh/config) will apply to your interactive session, so you'll maybe discover things there | 09:40 |
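The timing check described above (how long the connection takes versus how long the command takes) can also be scripted. The sketch below is only an illustration: `timed_run` is a hypothetical helper, and the `ssh` invocations in the comments are assumptions about how one might call it against the host from the trace, not commands taken from the log.

```python
import subprocess
import sys
import time


def timed_run(argv, timeout=60):
    """Run argv, discard its output, and return (elapsed_seconds, returncode)."""
    start = time.monotonic()
    completed = subprocess.run(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    return time.monotonic() - start, completed.returncode


if __name__ == "__main__":
    # Local baseline: how long it takes just to start an interpreter.
    elapsed, rc = timed_run([sys.executable, "-c", "pass"])
    print(f"local python start: {elapsed:.2f}s (rc={rc})")
    # Against the remote host one might compare, for example:
    #   timed_run(["ssh", "10.100.1.2", "true"])                          # connection cost only
    #   timed_run(["ssh", "10.100.1.2", "/usr/bin/python", "-c", "pass"]) # connection + python
```

A large gap between the two remote timings would point at the host's Python, while a slow first timing would point at the network or the ssh server settings.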
javeriak | evrardjp ah yes i get it now; but once i've ssh'd in, wouldnt i be in that machine directly, unless i try executing that command through ssh | 09:40 |
*** bootsha has quit IRC | 09:45 | |
*** vnogin_ has quit IRC | 09:50 | |
*** karimb has quit IRC | 09:52 | |
chhavi | does the default openstack-ansible novnc configuration block non-SSL connections while accessing the VNC console? | 09:53 |
*** vnogin has joined #openstack-ansible | 09:55 | |
*** berendt has joined #openstack-ansible | 09:59 | |
*** admin0 has quit IRC | 09:59 | |
*** karimb has joined #openstack-ansible | 10:01 | |
openstackgerrit | Jirayut Nimsaeng proposed openstack/openstack-ansible-os_horizon: Make horizon to use policy as the same as other projects https://review.openstack.org/330397 | 10:04 |
winggundamth | evrardjp: https://review.openstack.org/#/c/333770/ I tried to recheck. Did it work? | 10:05 |
*** Andrew_jedi has joined #openstack-ansible | 10:07 | |
evrardjp | javeriak: you'd be in that machine, but you'll already "feel" how the machine is behaving | 10:07 |
evrardjp | like you get connection drops after 60 seconds etc | 10:08 |
evrardjp | it's just for you to better understand | 10:08 |
Andrew_jedi | Hello folks, Do we have any scripts in OSA to recover rabbitmq cluster from a network partition? | 10:08 |
evrardjp | winggundamth: I'll check | 10:08 |
winggundamth | evrardjp: thanks | 10:08 |
evrardjp | glance issue | 10:10 |
evrardjp | I'll do one more recheck, but we should maybe focus on the code now | 10:11 |
evrardjp | I'll mark it in my starred list, and I'll review it as soon as I can | 10:11 |
winggundamth | evrardjp: which code? mine or gate? | 10:11 |
evrardjp | yours | 10:11 |
winggundamth | evrardjp: okay | 10:11 |
evrardjp | making sure that's what we really want | 10:11 |
evrardjp | compared to the blueprint that passed long ago etc | 10:12 |
*** javeriak has quit IRC | 10:12 | |
*** chhavi has quit IRC | 10:15 | |
*** javeriak has joined #openstack-ansible | 10:17 | |
javeriak | evrardjp okay.. and worst case, if all works and is also responsive enough directly, any idea what else i could try... | 10:18 |
evrardjp | I'm pretty sure you have net drops | 10:18 |
evrardjp | but yes, move to paramiko as I said, or extend the control master | 10:19 |
evrardjp | are you sure there is no weird net stuff happening on this host | 10:19 |
evrardjp | ? | 10:19 |
evrardjp | did you check the system logs ? | 10:19 |
evrardjp | just in case | 10:19 |
evrardjp | :p | 10:19 |
evrardjp | winggundamth: don't hesitate to ask others to review too | 10:20 |
javeriak | i was hoping it wouldnt come to that :P | 10:20 |
javeriak | the auth logs are oddly empty | 10:20 |
*** johnmilton has quit IRC | 10:20 | |
evrardjp | lastlog is empty too ? | 10:20 |
javeriak | btw could this be relevant: https://github.com/ansible/ansible/issues/13401 | 10:20 |
javeriak | im on ansible 1.9 though | 10:20 |
javeriak | didnt check lastlog | 10:21 |
evrardjp | it's not really relevant here | 10:22 |
evrardjp | however, like I said earlier, adapting the ssh config and ansible.cfg ssh config could be useful | 10:23 |
evrardjp | like not using pipelining etc | 10:23 |
evrardjp | but I strongly doubt this is the cause of your issues | 10:23 |
evrardjp | most of the time it's network or host issue | 10:24 |
*** thorst has joined #openstack-ansible | 10:25 | |
javeriak | yea probably | 10:28 |
javeriak | well lets see, thanks for your help | 10:28 |
*** thorst has quit IRC | 10:33 | |
*** bootsha has joined #openstack-ansible | 10:35 | |
*** markvoelker has joined #openstack-ansible | 10:36 | |
*** bootsha has quit IRC | 10:38 | |
*** bootsha has joined #openstack-ansible | 10:39 | |
*** markvoelker has quit IRC | 10:40 | |
evrardjp | anytime | 10:46 |
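The two suggestions above (extend the SSH control master, or fall back to paramiko and skip pipelining) can be tried without editing ansible.cfg by exporting environment variables before a run. This is a minimal sketch, not the project's recommended settings: the timeout values are illustrative assumptions, and it assumes nothing else in the environment overrides these variables.

```shell
#!/bin/sh
# Option 1: keep the OpenSSH transport, but hold the control master open
# longer and send keepalives so idle connections are less likely to drop.
# The 30m / 30s values are illustrative assumptions, not tuned numbers.
export ANSIBLE_SSH_ARGS="-o ControlMaster=auto -o ControlPersist=30m -o ServerAliveInterval=30"

# Disable pipelining while debugging connection problems.
export ANSIBLE_SSH_PIPELINING=False

# Option 2 (alternative to option 1): use the paramiko transport instead.
# export ANSIBLE_TRANSPORT=paramiko

# Show what the next ansible run in this shell will pick up.
echo "ssh args:   $ANSIBLE_SSH_ARGS"
echo "pipelining: $ANSIBLE_SSH_PIPELINING"
```

If the drops persist with either option, that points back at the network or the host rather than at Ansible.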
*** chhavi has joined #openstack-ansible | 10:47 | |
*** bootsha has quit IRC | 10:51 | |
*** bootsha has joined #openstack-ansible | 10:53 | |
*** javeriak has quit IRC | 10:56 | |
*** neilus1 has joined #openstack-ansible | 10:59 | |
*** johnmilton has joined #openstack-ansible | 11:01 | |
*** neilus2 has joined #openstack-ansible | 11:01 | |
*** chandanc_ has quit IRC | 11:02 | |
*** appprod0 has joined #openstack-ansible | 11:02 | |
*** neilus has quit IRC | 11:03 | |
*** neilus1 has quit IRC | 11:03 | |
*** smatzek has joined #openstack-ansible | 11:05 | |
*** appprod0 has quit IRC | 11:07 | |
*** asettle has quit IRC | 11:26 | |
*** asettle has joined #openstack-ansible | 11:29 | |
*** bootsha has quit IRC | 11:29 | |
*** v1k0d3n has joined #openstack-ansible | 11:29 | |
*** bootsha has joined #openstack-ansible | 11:30 | |
*** bootsha has quit IRC | 11:35 | |
*** javeriak has joined #openstack-ansible | 11:36 | |
*** markvoelker has joined #openstack-ansible | 11:37 | |
*** McMurlock1 has joined #openstack-ansible | 11:40 | |
*** markvoelker has quit IRC | 11:41 | |
*** thorst has joined #openstack-ansible | 11:42 | |
*** deverter has joined #openstack-ansible | 11:43 | |
*** weshay has joined #openstack-ansible | 11:46 | |
Andrew_jedi | Folks, any idea about this error | 11:49 |
Andrew_jedi | (item={'src': u'/etc/rabbitmq/rabbitmq.pem', 'name': 'rabbitmq_ssl_cert', 'file_mode': '0640'}) => {"attempts": 5, "err": "Memcache key not found", "failed": true, "item": {"file_mode": "0640", "name": "rabbitmq_ssl_cert", "src": "/etc/rabbitmq/rabbitmq.pem"}, "rc": 1} | 11:49 |
*** GMAzrael has joined #openstack-ansible | 11:57 | |
*** prometheanfire has quit IRC | 12:03 | |
*** prometheanfire has joined #openstack-ansible | 12:03 | |
*** neilus has joined #openstack-ansible | 12:07 | |
*** neilus has quit IRC | 12:08 | |
*** markvoelker has joined #openstack-ansible | 12:08 | |
*** karimb has quit IRC | 12:08 | |
*** neilus has joined #openstack-ansible | 12:09 | |
*** neilus2 has quit IRC | 12:10 | |
*** admin0 has joined #openstack-ansible | 12:12 | |
*** bootsha has joined #openstack-ansible | 12:14 | |
*** neilus has quit IRC | 12:16 | |
Andrew_jedi | cloudnull odyssey4me ^^ | 12:17 |
Andrew_jedi | I am running a Kilo setup. | 12:18 |
*** psilvad has joined #openstack-ansible | 12:21 | |
mgariepy | good morning everyone | 12:25 |
*** psilvad has quit IRC | 12:29 | |
*** automagically has quit IRC | 12:29 | |
*** automagically has joined #openstack-ansible | 12:30 | |
*** neilus has joined #openstack-ansible | 12:32 | |
*** v1k0d3n has quit IRC | 12:35 | |
*** woodard has joined #openstack-ansible | 12:36 | |
*** aernhart has joined #openstack-ansible | 12:40 | |
*** ManojK has joined #openstack-ansible | 12:45 | |
*** GMAzrael has quit IRC | 12:47 | |
*** psilvad has joined #openstack-ansible | 12:48 | |
*** karimb has joined #openstack-ansible | 12:48 | |
*** ManojK has quit IRC | 12:55 | |
*** zerda2 has quit IRC | 12:58 | |
*** TxGirlGeek has joined #openstack-ansible | 13:00 | |
*** appprod0 has joined #openstack-ansible | 13:03 | |
*** messy has joined #openstack-ansible | 13:04 | |
*** appprod0 has quit IRC | 13:08 | |
*** ManojK has joined #openstack-ansible | 13:10 | |
*** deverter has quit IRC | 13:10 | |
*** M00nr41n has quit IRC | 13:11 | |
alextricity25 | Andrew_jedi: could you give more detail? What task were you running? Could you possibly send a paste of the entire task when it failed? | 13:13 |
*** javeriak has quit IRC | 13:14 | |
automagically | morning all | 13:15 |
Andrew_jedi | alextricity25: I was trying to reinstall rabbitmq cluster. http://paste.openstack.org/show/524038/ | 13:16 |
*** sdake has joined #openstack-ansible | 13:17 | |
alextricity25 | Andrew_jedi: At first glance it looks to me that the rabbitmq_ssl_cert is no longer in the memcache server. It probably expired from there and was removed. Let me try to see if I can find a way to refresh what's stored in memcache | 13:19 |
Andrew_jedi | alextricity25: Thanks, i have been searching for it for some time. Couldn't find anything :/ | 13:19 |
*** gregfaust has joined #openstack-ansible | 13:21 | |
*** sdake_ has joined #openstack-ansible | 13:22 | |
*** sdake has quit IRC | 13:22 | |
alextricity25 | Andrew_jedi: Do you think it might be this task that's failing? https://github.com/openstack/openstack-ansible/blob/kilo/playbooks/roles/rabbitmq_server/tasks/rabbitmq_ssl_key_store.yml#L16-L31 | 13:22 |
alextricity25 | Andrew_jedi: or maybe this one? https://github.com/openstack/openstack-ansible/blob/kilo/playbooks/roles/rabbitmq_server/tasks/rabbitmq_ssl_key_distribute.yml#L16-L32 | 13:24 |
Andrew_jedi | alextricity25: Possibly yes, i am trying to implement this fix for the issue https://github.com/openstack/openstack-ansible-rabbitmq_server/blob/c9773b9d9c85dbec0422839829a9dedbd07991d0/tasks/rabbitmq_ssl_key_distribute.yml | 13:24 |
Andrew_jedi | alextricity25: Looks like it worked ... | 13:24 |
alextricity25 | Andrew_jedi: awesome! | 13:25 |
Andrew_jedi | Andrew_jedi: New error :/, | 13:26 |
Andrew_jedi | alextricity25: New error, | 13:26 |
Andrew_jedi | failed: [controller3_rabbit_mq_container-48ecc3b2] => {"cmd": "/usr/sbin/rabbitmqctl -q -n rabbit add_user openstack a414b123025ff04c84fd", "failed": true, "rc": 2} | 13:26 |
Andrew_jedi | stderr: Error: user_already_exists: openstack | 13:26 |
*** sdake has joined #openstack-ansible | 13:26 | |
*** sdake_ has quit IRC | 13:27 | |
alextricity25 | Andrew_jedi: what task is that? | 13:28 |
Andrew_jedi | alextricity25: TASK: [rabbitmq_server | Ensure rabbitmq user] | 13:28 |
*** KLevenstein has joined #openstack-ansible | 13:29 | |
Andrew_jedi | alextricity25: fixed | 13:30 |
alextricity25 | Andrew_jedi: That's strange...the rabbitmq ansible module should skip it | 13:30 |
cloudnull | morning | 13:30 |
alextricity25 | Andrew_jedi: oh good | 13:30 |
alextricity25 | good morning cloudnull | 13:30 |
Andrew_jedi | cloudnull: good morning, saw your pics on twitter, nice place +1 | 13:31 |
*** deverter has joined #openstack-ansible | 13:31 | |
*** deverter has quit IRC | 13:34 | |
*** deverter has joined #openstack-ansible | 13:34 | |
*** karimb has quit IRC | 13:36 | |
*** ManojK has quit IRC | 13:38 | |
*** TxGirlGeek has quit IRC | 13:38 | |
*** karimb has joined #openstack-ansible | 13:38 | |
*** TxGirlGeek has joined #openstack-ansible | 13:38 | |
*** TxGirlGeek has quit IRC | 13:40 | |
*** ManojK has joined #openstack-ansible | 13:40 | |
*** TxGirlGeek has joined #openstack-ansible | 13:40 | |
automagically | Any cores available to review https://review.openstack.org/334506 ? | 13:52 |
andymccr | lgtm | 13:53 |
automagically | thx andymccr | 13:54 |
automagically | Can you +2. I just see a +w andymccr | 13:54 |
automagically | If you are so inclined | 13:54 |
*** raddaoui has joined #openstack-ansible | 13:54 | |
*** jiteka has quit IRC | 13:54 | |
*** ametts has joined #openstack-ansible | 13:56 | |
andymccr | automagically: sure, it should still gate/merge either way though. | 13:57 |
*** jthorne has joined #openstack-ansible | 13:57 | |
automagically | Ah, I thought it needed 2 +2 and a +w | 13:57 |
automagically | Thanks again | 13:57 |
*** michaelgugino has joined #openstack-ansible | 13:58 | |
*** ajo_ has joined #openstack-ansible | 14:00 | |
*** v1k0d3n has joined #openstack-ansible | 14:02 | |
*** ajo_ has quit IRC | 14:03 | |
*** bootsha has quit IRC | 14:03 | |
*** ajo_ has joined #openstack-ansible | 14:04 | |
*** jayc has joined #openstack-ansible | 14:04 | |
*** asettle has quit IRC | 14:05 | |
*** ajo_ has quit IRC | 14:07 | |
*** ajo_ has joined #openstack-ansible | 14:07 | |
cloudnull | Andrew_jedi: sorry was afk a min. thanks it was fun to be away :) | 14:08 |
cloudnull | automagically: looking now | 14:08 |
automagically | cloudnull: don’t bother, its on its way | 14:09 |
automagically | Appreciate it tho | 14:09 |
* cloudnull neverminding | 14:09 | |
*** ajo_ has quit IRC | 14:09 | |
*** TxGirlGeek has quit IRC | 14:11 | |
*** TxGirlGeek has joined #openstack-ansible | 14:11 | |
*** TxGirlGeek has quit IRC | 14:14 | |
*** TxGirlGeek has joined #openstack-ansible | 14:14 | |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/openstack-ansible-openstack_hosts: Updated the hostname generation https://review.openstack.org/323504 | 14:15 |
*** sdake has quit IRC | 14:16 | |
*** jmckind has joined #openstack-ansible | 14:16 | |
*** sdake has joined #openstack-ansible | 14:17 | |
*** sdake has quit IRC | 14:17 | |
*** klamath has joined #openstack-ansible | 14:19 | |
*** sdake has joined #openstack-ansible | 14:19 | |
*** TxGirlGeek has quit IRC | 14:20 | |
openstackgerrit | Matt Dorn proposed openstack/openstack-ansible-openstack_hosts: Add linux-image-extra-virtual to host packages https://review.openstack.org/335525 | 14:21 |
*** sc68cal has quit IRC | 14:23 | |
*** spotz_zzz is now known as spotz | 14:24 | |
*** sc68cal has joined #openstack-ansible | 14:25 | |
*** jorge_munoz has joined #openstack-ansible | 14:32 | |
*** jiteka has joined #openstack-ansible | 14:40 | |
*** Mudpuppy has joined #openstack-ansible | 14:41 | |
*** TxGirlGeek has joined #openstack-ansible | 14:45 | |
*** karimb has quit IRC | 14:46 | |
*** kstev has joined #openstack-ansible | 14:47 | |
*** jorge_munoz_ has joined #openstack-ansible | 14:48 | |
*** jorge_munoz has quit IRC | 14:48 | |
*** jorge_munoz_ is now known as jorge_munoz | 14:48 | |
*** eil397 has joined #openstack-ansible | 14:48 | |
Andrew_jedi | cloudnull: Do we have any script in osa to deal with network partitions, like this, https://gist.github.com/niedbalski/aceba280b0365bdff46f#file-partition-recover-rabbitmq-py | 14:52 |
* cloudnull looking | 14:53 | |
cloudnull | Andrew_jedi: no nothing in tree that im aware of | 14:54 |
cloudnull | though that looks useful . | 14:54 |
Andrew_jedi | ok, thanks! | 14:54 |
automagically | Contribute to https://github.com/openstack/openstack-ansible-ops Andrew_jedi | 14:54 |
cloudnull | +1 | 14:54 |
*** berendt has quit IRC | 14:56 | |
*** karimb has joined #openstack-ansible | 14:58 | |
evrardjp | Andrew_jedi: we still have some kind of mention of it in a template IIRC | 15:01 |
evrardjp | the rabbitmq.config.j2 | 15:01 |
evrardjp | don't know what you need or what you are talking about, just remembering that partitions were something I saw | 15:01 |
evrardjp | but then I guess it's probably the first steps to ops | 15:02 |
*** jiteka has quit IRC | 15:03 | |
*** appprod0 has joined #openstack-ansible | 15:04 | |
*** chhavi has quit IRC | 15:06 | |
*** neilus has quit IRC | 15:07 | |
*** cloader89 has joined #openstack-ansible | 15:07 | |
*** cloader89 has quit IRC | 15:08 | |
*** appprod0 has quit IRC | 15:08 | |
*** aernhart has quit IRC | 15:08 | |
*** alan__ has joined #openstack-ansible | 15:08 | |
Andrew_jedi | evrardjp: I had power failure yesterday, and as a result i have to deal with the network partition. Rabbitmq cluster got screwed. Still trying to fix it. | 15:08 |
*** cloader89 has joined #openstack-ansible | 15:09 | |
*** sacharya has joined #openstack-ansible | 15:09 | |
*** weezS has joined #openstack-ansible | 15:10 | |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/openstack-ansible-openstack_hosts: Updated the hostname generation https://review.openstack.org/323504 | 15:10 |
*** daneyon has joined #openstack-ansible | 15:11 | |
evrardjp | at some point it's easier to rebuild your cluster from scratch and rerun the os playbooks | 15:12 |
evrardjp | :D | 15:12 |
*** eil397 has quit IRC | 15:12 | |
evrardjp | I mean it's just rabbitmq queues | 15:12 |
cloudnull | Andrew_jedi: can you stop then start the cluster to reset the partitioning ? | 15:12 |
evrardjp | I guess he already tried that | 15:13 |
Andrew_jedi | cloudnull: Tried that, in fact restarted the entire setup, and then finally rebuilt the rabbitmq cluster, but still facing network partition :/ | 15:13 |
cloudnull | :'( | 15:14 |
evrardjp | that's weird | 15:14 |
evrardjp | rebuilt from new containers or from existing ones ? | 15:14 |
*** TxGirlGeek has quit IRC | 15:14 | |
Andrew_jedi | cloudnull: And to make the matter worse, this is a production setup | 15:14 |
evrardjp | oh | 15:14 |
*** TxGirlGeek has joined #openstack-ansible | 15:14 | |
cloudnull | Andrew_jedi: kilo ? | 15:15 |
Andrew_jedi | evrardjp: existing ones, stop-destroy-and recreate | 15:15 |
evrardjp | so complete process for the rabbit recreate | 15:15 |
openstackgerrit | Nolan Brubaker proposed openstack/openstack-ansible: Use in-tree env.d files, provide override support https://review.openstack.org/332595 | 15:15 |
Andrew_jedi | cloudnull: yep, scheduled for upgrade to Liberty in August. | 15:15 |
cloudnull | have you tried setting https://github.com/openstack/openstack-ansible/blob/kilo/playbooks/roles/rabbitmq_server/defaults/main.yml#L47 ? | 15:15 |
cloudnull | to autoheal ? | 15:15 |
*** sacharya_ has joined #openstack-ansible | 15:16 | |
Andrew_jedi | cloudnull: nope, let me try that. | 15:16 |
Andrew_jedi | evrardjp: Yep! | 15:16 |
*** catintheroof has joined #openstack-ansible | 15:16 | |
*** pcaruana has quit IRC | 15:17 | |
cloudnull | setting "rabbitmq_cluster_partition_handling: autoheal" in user_variables.yml and rerunning `openstack-ansible rabbitmq-install.yml --tags rabbitmq-config` should drop the needed config and restart the app. | 15:17 |
evrardjp | that variable is indeed used in the file I told earlier | 15:17 |
evrardjp | you can directly give it in CLI | 15:17 |
evrardjp | openstack-ansible rabbitmq-install.yml -e rabbitmq_cluster_partition_handling=autoheal | 15:18 |
evrardjp | not sure about what it does 'though, I'm no rabbit expert | 15:18 |
*** sacharya has quit IRC | 15:18 | |
cloudnull | Andrew_jedi: in reading https://www.rabbitmq.com/partitions.html it looks like autoheal is the way to go when dealing with network issues. It's the most aggressive way of recovering from partitioning; however, it should do the trick. | 15:21 |
Andrew_jedi | cloudnull: thanks, fingers crossed. | 15:21 |
cloudnull | **residual network issues caused by a major outage. | 15:21 |
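The override described above can be applied idempotently, so that rerunning it does not stack duplicate lines in the variables file. This is a minimal sketch: the variable name and the playbook command come from the discussion, while the `set_partition_handling` helper is hypothetical and `USER_VARS` defaults to a scratch file (on a deployment host it would typically be /etc/openstack_deploy/user_variables.yml, which is an assumption here).

```shell
#!/bin/sh
# USER_VARS defaults to a scratch file so the sketch is safe to run anywhere.
USER_VARS="${USER_VARS:-$(mktemp)}"
KEY="rabbitmq_cluster_partition_handling"

# Hypothetical helper: set KEY to the given mode, replacing any existing value.
set_partition_handling() {
    mode="$1"
    if grep -q "^${KEY}:" "$USER_VARS"; then
        # Key already present: rewrite the line in place (keeps a .bak copy).
        sed -i.bak "s/^${KEY}:.*/${KEY}: ${mode}/" "$USER_VARS"
    else
        # Key absent: append it.
        echo "${KEY}: ${mode}" >> "$USER_VARS"
    fi
}

set_partition_handling autoheal
set_partition_handling autoheal   # second call must not add a second line
grep "^${KEY}:" "$USER_VARS"

# Then apply it with the command quoted in the discussion:
#   openstack-ansible rabbitmq-install.yml --tags rabbitmq-config
```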
* cloudnull grabbing coffee, back in a min | 15:22 | |
*** eil397 has joined #openstack-ansible | 15:26 | |
*** chandanc_ has joined #openstack-ansible | 15:27 | |
evrardjp | This kind of hands on experience deserves docs IMP | 15:29 |
evrardjp | IMO* | 15:29 |
*** v1k0d3n has quit IRC | 15:29 | |
*** asettle has joined #openstack-ansible | 15:29 | |
*** eil397 has quit IRC | 15:30 | |
Andrew_jedi | cloudnull evrardjp : No cigars, http://paste.openstack.org/show/524089/ | 15:32 |
*** pcaruana has joined #openstack-ansible | 15:32 | |
*** eil397 has joined #openstack-ansible | 15:33 | |
cloudnull | is infra3 the only partitioned node? | 15:33 |
cloudnull | what does: rabbitmqctl cluster_status show? | 15:34 |
Andrew_jedi | cloudnull: only controller1 | 15:35 |
Andrew_jedi | cloudnull: pasting the output now | 15:35 |
*** michaelgugino has quit IRC | 15:36 | |
Andrew_jedi | cloudnull: http://paste.openstack.org/show/524091/ | 15:38 |
*** ManojK has quit IRC | 15:38 | |
evrardjp | rabbitmqctl status on node 1? | 15:39 |
cloudnull | idk If you've tried this but, can you login to the misbehaving node "controller1_rabbit_mq_container-2d645e7f" and run `rabbitmqctl stop_app; rabbitmqctl reset; rabbitmqctl join_cluster rabbit@controller2_rabbit_mq_container-8b157b02; rabbitmqctl start_app` | 15:40 |
*** ManojK has joined #openstack-ansible | 15:40 | |
evrardjp | status will show the app running | 15:40 |
evrardjp | please do that before :D | 15:40 |
evrardjp | for pasting purposes | 15:40 |
evrardjp | IIRC | 15:41 |
Andrew_jedi | cloudnull: Yes, i did. But it will fail to join the cluster. | 15:41 |
cloudnull | same reason ? | 15:41 |
Andrew_jedi | evrardjp: http://paste.openstack.org/show/524092/, No rabbit app active | 15:41 |
Andrew_jedi | cloudnull: let me show you | 15:41 |
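The reset-and-rejoin sequence suggested above can be chained with `&&` so that a failed `join_cluster` does not go on to start the app. This is a minimal sketch: the four rabbitmqctl subcommands are the ones quoted in the discussion, while the `run` and `rejoin_cluster` helpers and the `DRY_RUN` switch are hypothetical names added for illustration. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Hypothetical helper: reset the local node and join it to a healthy peer.
rejoin_cluster() {
    peer="$1"
    # "&&" stops at the first failure, so start_app never runs after a
    # failed reset or join.
    run rabbitmqctl stop_app &&
    run rabbitmqctl reset &&
    run rabbitmqctl join_cluster "$peer" &&
    run rabbitmqctl start_app
}

# Peer name taken from the discussion; run this inside the misbehaving container.
rejoin_cluster "rabbit@controller2_rabbit_mq_container-8b157b02"
```

Running it with `DRY_RUN=0` inside the container would execute the sequence for real; note that `rabbitmqctl reset` discards the node's local state.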
*** mummer has joined #openstack-ansible | 15:42 | |
evrardjp | rm -rf /var/lib/rabbitmq/mnesia | 15:43 |
cloudnull | if that continues to fail you could try destroying the node and rebuilding it to see if it'll rejoin. `openstack-ansible lxc-containers-destroy.yml lxc-containers-create.yml --limit controller1_rabbit_mq_container-2d645e7f; openstack-ansible rabbitmq-install.yml` | 15:43 |
Andrew_jedi | cloudnull: http://paste.openstack.org/show/524094/ | 15:45 |
Andrew_jedi | evrardjp: i thought rabbitmqctl reset will delete the mnesia | 15:45 |
*** Drago has joined #openstack-ansible | 15:46 | |
*** phalmos has joined #openstack-ansible | 15:46 | |
cloudnull | Andrew_jedi: can you ping the nodes in the cluster using the hostname from the misbehaving one? | 15:47 |
Andrew_jedi | cloudnull: Yes | 15:47 |
evrardjp | he has a fundamental mnesia problem I think, the process isn't listed in its erlang vm | 15:48 |
evrardjp | that's why I tried to trash the folder | 15:48 |
eil397 | good morning everyone | 15:48 |
mrhillsman | yo | 15:48 |
cloudnull | you try to reset it using the erl commands | 15:48 |
evrardjp | good morning | 15:48 |
mrhillsman | https://bugs.launchpad.net/openstack-ansible/+bug/1597410 | 15:48 |
openstack | Launchpad bug 1597410 in openstack-ansible "manual upgrade: memcached flush fails" [Undecided,New] | 15:48 |
cloudnull | morning eil397 | 15:48 |
mrhillsman | this is related to upgrade again cloudnull evrardjp | 15:48 |
Andrew_jedi | cloudnull: http://paste.openstack.org/show/524095/ | 15:48 |
cloudnull | yo mrhillsman | 15:48 |
mrhillsman | i wish i knew how to contribute :( | 15:49 |
mrhillsman | i put the change i made to make it work in the description | 15:49 |
Andrew_jedi | evrardjp: trashing it | 15:49 |
Andrew_jedi | evrardjp: Done | 15:49 |
cloudnull | Andrew_jedi: try: erl -sname "rabbit@controller1_rabbit_mq_container-2d645e7f" -mnesia dir | 15:50 |
evrardjp | restart rabbit and check status | 15:50 |
cloudnull | that will drop you into a shell | 15:50 |
cloudnull | if it can connect | 15:50 |
evrardjp | yes that's even better | 15:50 |
evrardjp | and then check mnesia info | 15:50 |
evrardjp | good idea cloudnull | 15:50 |
cloudnull | then: mnesia:delete_schema(['rabbit@controller1_rabbit_mq_container-2d645e7f']) | 15:50 |
*** adrian_otto has joined #openstack-ansible | 15:50 | |
eil397 | mrhillsman: you have issue with sending commit on review ? | 15:51 |
cloudnull | and try to stop_app, join_cluster, start_app | 15:51 |
evrardjp | that should be done by removing the folder completely | 15:51 |
cloudnull | ++ | 15:51 |
cloudnull | thats very true | 15:51 |
cloudnull | rm is the hammer way :) | 15:51 |
evrardjp | I tried the bazooka approach | 15:51 |
evrardjp | yes | 15:51 |
evrardjp | :D | 15:51 |
evrardjp | that's rabbit | 15:51 |
evrardjp | who cares | 15:51 |
evrardjp | eventually consistent, right ? | 15:52 |
evrardjp | anyway, rabbitmqctl status should give you at least the pid of the mnesia process | 15:52 |
cloudnull | mrhillsman: for that to fail like so it would mean the entire cluster was unreachable ? | 15:52 |
*** alan__ has quit IRC | 15:52 | |
cloudnull | was memcached running ? | 15:53 |
cloudnull | on any of the nodes? | 15:53 |
mrhillsman | the issue is when hostname has -l in it | 15:53 |
mrhillsman | the regex checks memcached.conf for -l | 15:53 |
evrardjp | cloudnull: I think it's problem with parsing the config of memcached | 15:53 |
mrhillsman | first line has hostname | 15:53 |
*** alan__ has joined #openstack-ansible | 15:53 | |
mrhillsman | so it returns Ansible - { print $2 } | 15:53 |
evrardjp | echo 'flush_all' | nc $(awk '/\\-l/ {print $2}' /etc/memcached.conf | 15:53 |
evrardjp | nc: port number invalid: 172.29.238.134 | 15:53 |
mrhillsman | right | 15:53 |
evrardjp | that's ... no luck ? | 15:54 |
mrhillsman | first line is about ansible managing the file for the hostname (melv7301-rpcops-lab) | 15:54 |
evrardjp | :p | 15:54 |
*** alan__ has quit IRC | 15:54 | |
mrhillsman | so that -l(ab) is the issue | 15:54 |
evrardjp | -ab | 15:54 |
evrardjp | that's the solution | 15:54 |
evrardjp | rename your host :p | 15:54 |
mrhillsman | hehe | 15:54 |
cloudnull | mrhillsman: http://cdn.pasteraw.com/efukadmfhg9uwt3hhwst75cmzlky6qw | 15:54 |
*** alan__ has joined #openstack-ansible | 15:54 | |
mrhillsman | right | 15:54 |
mrhillsman | i added that in the description as fix | 15:55 |
evrardjp | listen should be an ip anyway IMO | 15:55 |
mrhillsman | it is | 15:55 |
mrhillsman | nc ip port | 15:55 |
mrhillsman | but you get nc Ansible ip port | 15:55 |
mrhillsman | the ^ should be used anyway since it makes it more specific | 15:56 |
evrardjp | You're right | 15:56 |
evrardjp | it's starting by this | 15:56 |
Andrew_jedi | cloudnull: Is this right, "erl -sname "rabbit@controller1_rabbit_mq_container-2d645e7f" -mnesia /var/lib/rabbitmq/mnesia/" | 15:57 |
openstackgerrit | Travis Truman (automagically) proposed openstack/openstack-ansible: Define glance_default_store in group_vars https://review.openstack.org/335571 | 15:58 |
*** KLevenstein has quit IRC | 15:58 | |
cloudnull | Andrew_jedi: no. just dir at the end, If i remember right | 15:58 |
*** alan__ has quit IRC | 15:58 | |
*** phalmos has quit IRC | 15:59 | |
*** KLevenstein has joined #openstack-ansible | 15:59 | |
*** alan__ has joined #openstack-ansible | 15:59 | |
Andrew_jedi | cloudnull: ack! | 15:59 |
Andrew_jedi | thanks | 15:59 |
Andrew_jedi | cloudnull: http://paste.openstack.org/show/524097/ | 16:00 |
cloudnull | Andrew_jedi: so rabbit is running | 16:00 |
Andrew_jedi | cloudnull: this was from inside the rabbitmq container on controller1 | 16:00 |
cloudnull | and connected to the db | 16:00 |
openstackgerrit | Travis Truman (automagically) proposed openstack/openstack-ansible: Define glance_default_store in group_vars https://review.openstack.org/335571 | 16:01 |
*** TheIntern has joined #openstack-ansible | 16:01 | |
cloudnull | evrardjp: do you see the issue mrhillsman is seeing? -- when I run "echo 'flush_all' | nc $(awk '/\-l/ {print $2}' /etc/memcached.conf) $(awk '/\-p/ {print $2}' /etc/memcached.conf)" it works | 16:02 |
mrhillsman | cloudnull | 16:02 |
evrardjp | I understand why he wants to add ^ | 16:02 |
mrhillsman | the first line in your memcached.conf | 16:02 |
mrhillsman | does it have a -l anywhere? | 16:02 |
mrhillsman | if not, it will work | 16:02 |
cloudnull | my bad. | 16:02 |
mrhillsman | if it does, as does mine because of my hostname, it fails | 16:02 |
cloudnull | i see it now. | 16:02 |
mrhillsman | cool | 16:03 |
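The memcached flush failure discussed above can be reproduced with a scratch config file. This is a minimal sketch: the file contents are a made-up stand-in for /etc/memcached.conf whose first line is a management comment containing the hostname, and the joined-output awk form is used only to make the comparison easy to read. It shows why the unanchored pattern picks up an extra word and why anchoring with `^` avoids it.

```shell
#!/bin/sh
# Made-up memcached.conf: the header comment contains "-l" because the
# hostname ends in "-lab".
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Ansible managed: generated on melv7301-rpcops-lab
-m 64
-p 11211
-l 172.29.238.134
EOF

# Unanchored: matches every line containing "-l", including the comment,
# so the second field of the comment ("Ansible") leaks into the result.
unanchored=$(awk '/-l/ {out = out sep $2; sep = " "} END {print out}' "$conf")

# Anchored: matches only lines that start with "-l".
anchored=$(awk '/^-l/ {print $2}' "$conf")

echo "unanchored: $unanchored"
echo "anchored:   $anchored"
rm -f "$conf"
```

With the unanchored result, nc receives "Ansible" as the host and the IP address as the port, which matches the "port number invalid" error quoted earlier.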
mrhillsman | other than that, manual upgrade with the rabbit changes succeeded | 16:04 |
cloudnull | cool | 16:04 |
cloudnull | evrardjp: do you have a PR in the works? | 16:04 |
*** karimb has quit IRC | 16:05 | |
Andrew_jedi | cloudnull: Aha, i got the shell | 16:05 |
cloudnull | awesome | 16:05 |
evrardjp | I'm writing it right now | 16:05 |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible: Fix memcached flush if -l is in hostname https://review.openstack.org/335574 | 16:06 |
evrardjp | quick thing, so it could need more love | 16:06 |
*** jmckind_ has joined #openstack-ansible | 16:07 | |
evrardjp | mrhillsman: from which version are you upgrading from/to | 16:07 |
*** jmckind has quit IRC | 16:10 | |
cloudnull | Andrew_jedi: were you able to nuke the DB and get the node to reconnect? | 16:11 |
*** phalmos has joined #openstack-ansible | 16:14 | |
*** admin0 has quit IRC | 16:15 | |
*** phalmos has quit IRC | 16:16 | |
cloudnull | evrardjp: https://review.openstack.org/#/c/335574 -- reviewed | 16:16 |
cloudnull | anchor works but needs to be moved. | 16:16 |
Andrew_jedi | cloudnull: http://paste.openstack.org/show/524100/ | 16:17 |
evrardjp | thanks dslexia | 16:17 |
evrardjp | dyslexia | 16:17 |
*** neilus has joined #openstack-ansible | 16:17 | |
Andrew_jedi | still same issue | 16:17 |
*** neilus has quit IRC | 16:17 | |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible: Fix memcached flush if -l is in hostname https://review.openstack.org/335574 | 16:17 |
evrardjp | I didn't test this ^ | 16:17 |
evrardjp | mrhillsman: could you test it ? | 16:18 |
cloudnull | Andrew_jedi: try rebuilding that container? | 16:18 |
cloudnull | if that continues to fail you could try destroying the node and rebuilding it to see if it'll rejoin. `openstack-ansible lxc-containers-destroy.yml lxc-containers-create.yml --limit controller1_rabbit_mq_container-2d645e7f; openstack-ansible rabbitmq-install.yml` | 16:18 |
*** neilus has joined #openstack-ansible | 16:19 | |
mrhillsman | was in a meeting | 16:19 |
mrhillsman | looking | 16:19 |
*** appprod0 has joined #openstack-ansible | 16:20 | |
openstackgerrit | Jean-Philippe Evrard proposed openstack/openstack-ansible: Fix memcached flush if -l is in hostname https://review.openstack.org/335574 | 16:20 |
*** david-lyle has joined #openstack-ansible | 16:20 | |
Andrew_jedi | cloudnull: Roger that. | 16:20 |
mrhillsman | yeah | 16:20 |
mrhillsman | cloudnull | 16:20 |
cloudnull | when you rerun the rabbitmq-install.yml play you can elect a different cluster node | 16:20 |
mrhillsman | evrardjp commented | 16:20 |
evrardjp | Andrew_jedi: don't forget to limit on the destroy :p | 16:21 |
*** david-lyle has quit IRC | 16:21 | |
cloudnull | IE: openstack-ansible rabbitmq-install.yml -e "rabbitmq_primary_cluster_node=controller3_rabbit_mq_container-48ecc3b2" | 16:21 |
*** phalmos has joined #openstack-ansible | 16:21 | |
*** david-lyle has joined #openstack-ansible | 16:21 | |
evrardjp | mrhillsman: I don't see your comment :/ | 16:22 |
*** david-lyle has quit IRC | 16:22 | |
*** david-lyle has joined #openstack-ansible | 16:23 | |
mrhillsman | why you add the -F,... part? | 16:24 |
*** berendt has joined #openstack-ansible | 16:24 | |
*** david-lyle has quit IRC | 16:25 | |
*** david-lyle has joined #openstack-ansible | 16:25 | |
evrardjp | mrhillsman: that's how master is behaving | 16:27 |
evrardjp | I barely added ^ | 16:27 |
cloudnull | Andrew_jedi: I just caused my cluster to get into a partitioned state and then ran: openstack-ansible lxc-containers-destroy.yml lxc-containers-create.yml --limit infra1_rabbit_mq_container-d3c5f2d5 && openstack-ansible rabbitmq-install.yml -e "rabbitmq_primary_cluster_node=infra3-rabbit-mq-container-ff06bfc8" ## Note my hostnames are different than yours ## and it seemed to recover nicely. | 16:28 |
*** phalmos has quit IRC | 16:28 | |
cloudnull | idk if that will help your situation but its worth a shot | 16:28 |
*** cloader89 has quit IRC | 16:30 | |
cloudnull | you shouldnt need the cluster node part but its worth noting | 16:30 |
Andrew_jedi | cloudnull: Thanks, I appreciate this. I am trying the same thing now | 16:30 |
cloudnull | ** cluster node part == setting rabbitmq_primary_cluster_node | 16:30 |
*** sulo has joined #openstack-ansible | 16:30 | |
cloudnull | afk lunching Andrew_jedi best of luck , let us know how it goes. | 16:32 |
*** jorge_munoz has quit IRC | 16:33 | |
openstackgerrit | Travis Truman (automagically) proposed openstack/openstack-ansible: Add conditional for overlay network settings https://review.openstack.org/335579 | 16:33 |
*** asettle has quit IRC | 16:33 | |
automagically | cloudnull, hope you don’t mind that cherry-pick of your change to master ^ | 16:34 |
evrardjp | good change btw | 16:34 |
*** berendt has quit IRC | 16:35 | |
evrardjp | will star it, and be back on it later | 16:35 |
*** pcaruana has quit IRC | 16:36 | |
*** TxGirlGeek has quit IRC | 16:38 | |
*** neilus has quit IRC | 16:44 | |
*** Andrew_jedi has quit IRC | 16:48 | |
*** karimb has joined #openstack-ansible | 16:48 | |
*** karimb has quit IRC | 16:48 | |
*** Andrew_jedi has joined #openstack-ansible | 16:48 | |
openstackgerrit | Travis Truman (automagically) proposed openstack/openstack-ansible: Add conditional for overlay network settings https://review.openstack.org/335579 | 16:48 |
*** KLevenstein has quit IRC | 16:53 | |
*** KLevenstein has joined #openstack-ansible | 16:54 | |
*** TxGirlGeek has joined #openstack-ansible | 16:55 | |
evrardjp | see you tomorrow everyone! | 16:56 |
automagically | later evrardjp | 16:56 |
eil397 | have a good one | 16:57 |
*** krotscheck is now known as krotscheck_vaca | 17:01 | |
*** krotscheck_vaca is now known as krot_vaca_jul19 | 17:01 | |
spotz | bye evrardjp | 17:03 |
*** javeriak has joined #openstack-ansible | 17:03 | |
*** sdake_ has joined #openstack-ansible | 17:08 | |
*** sdake has quit IRC | 17:10 | |
*** weezS has quit IRC | 17:11 | |
*** asettle has joined #openstack-ansible | 17:12 | |
Andrew_jedi | bye evrardjp | 17:16 |
*** eil397 has quit IRC | 17:16 | |
*** TheIntern has quit IRC | 17:16 | |
*** TheIntern has joined #openstack-ansible | 17:17 | |
*** PrestonBannister has joined #openstack-ansible | 17:28 | |
*** eil397 has joined #openstack-ansible | 17:29 | |
*** javeriak_ has joined #openstack-ansible | 17:33 | |
*** javeriak has quit IRC | 17:34 | |
*** TheIntern has quit IRC | 17:37 | |
*** ManojK has quit IRC | 17:40 | |
*** ManojK has joined #openstack-ansible | 17:40 | |
*** TheIntern has joined #openstack-ansible | 17:40 | |
*** electrofelix has quit IRC | 17:41 | |
*** TheIntern has quit IRC | 17:42 | |
*** McMurlock1 has quit IRC | 17:46 | |
openstackgerrit | Nolan Brubaker proposed openstack/openstack-ansible: Use in-tree env.d files, provide override support https://review.openstack.org/332595 | 17:48 |
*** chandanc_ has quit IRC | 17:48 | |
*** admin0 has joined #openstack-ansible | 17:55 | |
*** admin0 has quit IRC | 17:55 | |
*** sdake_ has quit IRC | 17:55 | |
cloudnull | Andrew_jedi: whats the word? | 17:55 |
Andrew_jedi | cloudnull: Part of the problem is hardware. Waiting for that to get fixed. Faulty cable. | 17:56 |
Andrew_jedi | cloudnull: I will update you within an hour. | 17:56 |
cloudnull | ah. that makes networking angry | 17:56 |
cloudnull | :P | 17:56 |
Andrew_jedi | cloudnull: Lol ;) | 17:57 |
cloudnull | any cores around that might want to give this a shove https://review.openstack.org/#/c/323504/ | 17:57 |
*** permalac has quit IRC | 17:57 | |
*** albertcard has joined #openstack-ansible | 18:00 | |
jmccrory | cloudnull: got it | 18:02 |
cloudnull | jmccrory: tyvm | 18:02 |
openstackgerrit | Merged openstack/openstack-ansible-openstack_hosts: Updated the hostname generation https://review.openstack.org/323504 | 18:03 |
*** TxGirlGeek has quit IRC | 18:05 | |
openstackgerrit | Anton Khaldin proposed openstack/openstack-ansible-galera_client: Add ignore_errors to fix minor bug with fallback source for apt-key. https://review.openstack.org/335233 | 18:07 |
*** TxGirlGeek has joined #openstack-ansible | 18:17 | |
*** berendt has joined #openstack-ansible | 18:17 | |
openstackgerrit | Anton Khaldin proposed openstack/openstack-ansible-galera_client: Add ignore_errors to fix minor bug with fallback source for apt-key. https://review.openstack.org/335233 | 18:18 |
*** jorge_munoz has joined #openstack-ansible | 18:20 | |
*** mrhillsman has quit IRC | 18:21 | |
*** TxGirlGeek has quit IRC | 18:23 | |
*** johnmilton has quit IRC | 18:25 | |
*** cloader89 has joined #openstack-ansible | 18:26 | |
*** mrhillsman has joined #openstack-ansible | 18:26 | |
alextricity25 | cloudnull: odyssey4me: This is something worth looking into, as it will block anyone building a multi-node with master: https://bugs.launchpad.net/openstack-ansible/+bug/1597475 | 18:36 |
openstack | Launchpad bug 1597475 in openstack-ansible "swift_rings_distribute.yml synchronize task broken on multi-node" [Undecided,New] | 18:36 |
cloudnull | alextricity25: i was looking into that last night. | 18:37 |
alextricity25 | cloudnull: did you get it too? | 18:37 |
cloudnull | no | 18:37 |
*** daneyon has left #openstack-ansible | 18:37 | |
alextricity25 | ....well then... | 18:37 |
*** asettle has quit IRC | 18:37 | |
cloudnull | i was using the multi-node-aio env | 18:38 |
alextricity25 | Are you sure you were building with master? Ansible version 2.1.0? | 18:38 |
cloudnull | and ive not been able to replicate it | 18:38 |
openstackgerrit | Merged openstack/openstack-ansible-openstack_hosts: Added the ip_vs kernel module to all openstack hosts https://review.openstack.org/334701 | 18:38 |
cloudnull | alextricity25: yes im on e6d2f771b8d2b9fd9578396d276398ed1bdaafa2 | 18:39 |
cloudnull | i have another env being kicked right now | 18:39 |
cloudnull | so more soon , but thus far I cant recreat that | 18:39 |
cloudnull | *recreate | 18:40 |
javeriak_ | hey guys, i have an issue with ansible ssh not working, direct ssh works, but not through ansible; here are my traces: http://paste.ubuntu.com/18086570/ | 18:41 |
cloudnull | javeriak_: so running: ssh 10.100.1.2 works? | 18:48 |
*** TxGirlGeek has joined #openstack-ansible | 18:48 | |
javeriak_ | cloudnull yes | 18:48 |
cloudnull | do you by chance have a lot of keys loaded in your ssh-agent? | 18:49 |
cloudnull | ssh-add -L | 18:49 |
javeriak_ | btw where can i find the actual code for the ansible core modules on my deploy? for example i want to see what the ansible ping module does | 18:49 |
javeriak_ | cloudnull nope, only the deploy node key | 18:50 |
cloudnull | javeriak_: /opt/ansible-runtime | 18:50 |
javeriak_ | i have the installed ansible version under there, /opt/ansible_v1.9.3-1/ ? | 18:52 |
cloudnull | javeriak_: is this master? | 18:52 |
javeriak_ | its kilo | 18:52 |
cloudnull | ah. | 18:52 |
javeriak_ | 10.1.11 | 18:52 |
*** TxGirlGeek has quit IRC | 18:53 | |
cloudnull | that should be in /usr/local/lib/python2.7/dist-packages/ansible | 18:54 |
*** admin0 has joined #openstack-ansible | 18:54 | |
*** ManojK has quit IRC | 18:54 | |
errr | if I am using developer mode on horizon where does the gitrepo I specify get checked out to? | 18:55 |
javeriak_ | cloudnull found it, thanks | 18:55 |
errr | oh it only grabs the egg anyway, so never mind. | 18:56 |
javeriak_ | so back to the issue, i dont know whats wrong with the connection, the debug trace just times out | 18:56 |
cloudnull | errr: its cloning the git repo directly and then installing it using the local clone as a constraint | 18:56 |
errr | cloudnull: so the whole repo is cloned then? where does it put it? | 18:57 |
cloudnull | errr: pip puts it in /tmp/build i believe | 18:58 |
javeriak_ | cloudnull btw does it place this ping module somewhere on the target node too? | 18:58 |
cloudnull | the task builds the constraint file here /opt/developer-pip-constraints.txt | 18:58 |
cloudnull | then the regular install process happens using the local constraints | 18:59 |
errr | cloudnull: ah so its gone after its built. I was just wanting to check the sha in it vs what this other box we are deving on has | 18:59 |
*** TxGirlGeek has joined #openstack-ansible | 19:01 | |
cloudnull | i do believe its gone post build, and the default is set to use the master branch when dev mode is enabled | 19:01 |
cloudnull | so it may be hard to track | 19:02 |
cloudnull | you can activate the venv | 19:02 |
cloudnull | and see what the installed version is | 19:02 |
*** ScarZy has quit IRC | 19:02 | |
*** vnogin has quit IRC | 19:03 | |
cloudnull | javeriak_: idk. | 19:04 |
cloudnull | i believe the module is copied over at runtime. | 19:04 |
javeriak_ | cloudnull yes but it doesnt persist | 19:05 |
cloudnull | no i dont believe so | 19:05 |
javeriak_ | i mean i cant find it on the target, so i suppose i can just modify the master one and use that | 19:05 |
cloudnull | if you suspect the ping module to be misbehaving you can try a shell command. | 19:06 |
cloudnull | ansible infra1 -m shell -a 'echo hi' | 19:06 |
javeriak_ | i think the module is fine, but im trying to understand what its doing, because the ssh debug trace only shows it running /usr/bin/python and then nothing, boom output | 19:07 |
*** Andrew_jedi has quit IRC | 19:10 | |
*** woodard has quit IRC | 19:12 | |
*** woodard has joined #openstack-ansible | 19:13 | |
*** woodard has quit IRC | 19:13 | |
*** woodard has joined #openstack-ansible | 19:14 | |
*** sdake has joined #openstack-ansible | 19:14 | |
*** vnogin has joined #openstack-ansible | 19:15 | |
*** ManojK has joined #openstack-ansible | 19:15 | |
*** ScarZy has joined #openstack-ansible | 19:15 | |
*** TM1 has quit IRC | 19:27 | |
*** Andrew_jedi has joined #openstack-ansible | 19:27 | |
*** javeriak_ has quit IRC | 19:32 | |
*** javeriak has joined #openstack-ansible | 19:34 | |
*** asettle has joined #openstack-ansible | 19:38 | |
*** asettle has quit IRC | 19:42 | |
*** catintheroof has quit IRC | 19:43 | |
eil397 | can someone review a one-line bug fix? https://review.openstack.org/#/c/335233/ | 19:49 |
cloudnull | eil397: looking now | 19:50 |
cloudnull | ++ | 19:50 |
eil397 | cloudnull: thanks | 19:50 |
cloudnull | thank you for putting it together :) | 19:51 |
alextricity25 | cloudnull: i have a repo-build question for you | 19:52 |
cloudnull | sure | 19:52 |
alextricity25 | How does repo-build determine what wheels to build? | 19:52 |
eil397 | cloudnull: it was David Wilde. who found and described it. I just that opporunity to send my first commit to osa. hope I will be able to add value. | 19:52 |
eil397 | s\just that\jsut used that\g | 19:53 |
alextricity25 | cloudnull: I imagine that the repo-build play does some sort of logic around requirements.txt | 19:53 |
openstackgerrit | Matt Dorn proposed openstack/openstack-ansible-openstack_hosts: Add linux-image-extra-virtual to host packages https://review.openstack.org/335650 | 19:55 |
alextricity25 | cloudnull: If I wanted to tell the repo-build server to not build wheels for a specific role, how would I do that? | 19:56 |
openstackgerrit | Travis Truman (automagically) proposed openstack/openstack-ansible-os_nova: Remove tags from functional testing playbooks https://review.openstack.org/335651 | 19:57 |
openstackgerrit | Merged openstack/openstack-ansible-galera_client: Add ignore_errors to fix minor bug with fallback source for apt-key. https://review.openstack.org/335233 | 19:58 |
*** Guest20454 is now known as mgagne | 19:58 | |
*** mgagne has joined #openstack-ansible | 19:58 | |
*** asettle has joined #openstack-ansible | 19:59 | |
*** appprod0 has quit IRC | 20:05 | |
*** TxGirlGeek has quit IRC | 20:07 | |
*** TxGirlGeek has joined #openstack-ansible | 20:07 | |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/openstack-ansible-repo_build: Updated repo-build to store package sources https://review.openstack.org/334110 | 20:09 |
cloudnull | alextricity25: which role ? | 20:09 |
alextricity25 | cloudnull: I want the repo-build playbook to skip building wheels for an RPC-O role, beaver. | 20:10 |
*** weezS has joined #openstack-ansible | 20:10 | |
alextricity25 | cloudnull: I think i figured it out though...it looks for ansible variables postfixed with "pip_packages"? | 20:10 |
alextricity25 | or any variant of BUILD_IN_PIP_PACKAGE_VARS? | 20:11 |
alextricity25 | s/BUILD/BUILT/ | 20:11 |
cloudnull | alextricity25: you can change pkg_locations to not include the beaver role. | 20:11 |
cloudnull | rather the location of the beaver role. | 20:11 |
cloudnull | if you just dont want wheels built for that role you can modify the vars too. | 20:12 |
*** TxGirlGeek has quit IRC | 20:12 | |
*** TxGirlGeek has joined #openstack-ansible | 20:12 | |
alextricity25 | cloudnull: I just deleted the beaver role from the code tree :P | 20:12 |
alextricity25 | and all its variables | 20:12 |
alextricity25 | ha | 20:13 |
alextricity25 | cloudnull: Where is this pkg_locations variable you speak of? | 20:13 |
cloudnull | alextricity25: https://github.com/openstack/openstack-ansible/blob/master/playbooks/repo-build.yml#L24 | 20:14 |
cloudnull | values found here by default https://github.com/openstack/openstack-ansible/blob/master/playbooks/repo-build.yml#L44-L47 | 20:14 |
*** TxGirlGeek has quit IRC | 20:14 | |
cloudnull | RPC-O may be storing roles in one of those locations or overriding the default. | 20:14 |
*** TxGirlGeek has joined #openstack-ansible | 20:15 | |
mhayden | cloudnull: nice find on the nohup | 20:16 |
cloudnull | that was an odd one. | 20:16 |
mhayden | i wonder why that happens | 20:16 |
*** Drago has left #openstack-ansible | 20:17 | |
cloudnull | it looks like stdout is just left open for the entire run | 20:17 |
*** asettle has quit IRC | 20:19 | |
*** appprod0 has joined #openstack-ansible | 20:20 | |
*** mkrish004c has joined #openstack-ansible | 20:22 | |
*** Andrew_jedi has quit IRC | 20:25 | |
*** asettle has joined #openstack-ansible | 20:26 | |
*** Andrew_jedi has joined #openstack-ansible | 20:29 | |
Andrew_jedi | cloudnull: Still not fixed. It may be a hardware issue. We first saw the network partition when we introduced bonding on one of these setups. | 20:30 |
*** alan__ has quit IRC | 20:34 | |
cloudnull | Andrew_jedi: maybe switch configs ? | 20:36 |
Andrew_jedi | cloudnull: Could you pls spare 2 mins and have a look at this, http://paste.openstack.org/show/524128/ | 20:40 |
Andrew_jedi | This is the network config we introduced when we implemented bonding | 20:41 |
Andrew_jedi | bond0 replaced eth0 and bond1 replaced eth1 basically | 20:42 |
Andrew_jedi | It could be a switch issue but we are not even using vlans on this setup. | 20:43 |
*** TxGirlGeek has quit IRC | 20:45 | |
*** asettle has quit IRC | 20:51 | |
*** woodard_ has joined #openstack-ansible | 20:56 | |
*** asettle has joined #openstack-ansible | 20:57 | |
*** woodard has quit IRC | 21:00 | |
cloudnull | Andrew_jedi: looking | 21:00 |
*** Mudpuppy has quit IRC | 21:01 | |
*** javeriak has quit IRC | 21:03 | |
*** woodard has joined #openstack-ansible | 21:08 | |
*** psilvad has quit IRC | 21:09 | |
*** javeriak has joined #openstack-ansible | 21:10 | |
cloudnull | Andrew_jedi: i dont see anything wrong with the interface file. | 21:10 |
cloudnull | that should work. But if you're seeing issues with the bond you might want to try disabling a channel to see if the connection stabilizes | 21:11 |
*** TxGirlGeek has joined #openstack-ansible | 21:11 | |
*** asettle has quit IRC | 21:11 | |
*** woodard_ has quit IRC | 21:12 | |
*** woodard has quit IRC | 21:13 | |
*** ManojK has quit IRC | 21:13 | |
openstackgerrit | Kevin Carter (cloudnull) proposed openstack/openstack-ansible-repo_build: Updated repo-build to store package sources https://review.openstack.org/334110 | 21:13 |
*** ManojK has joined #openstack-ansible | 21:15 | |
*** pester has joined #openstack-ansible | 21:15 | |
*** TxGirlGeek has quit IRC | 21:16 | |
*** fxpester has quit IRC | 21:17 | |
Andrew_jedi | cloudnull: thx! looking into this. | 21:20 |
*** thorst has quit IRC | 21:21 | |
*** kstev has quit IRC | 21:23 | |
*** smatzek has quit IRC | 21:27 | |
*** mkrish004c has quit IRC | 21:36 | |
*** PrestonBannister has quit IRC | 21:38 | |
*** spotz is now known as spotz_zzz | 21:39 | |
*** thorst has joined #openstack-ansible | 21:46 | |
*** Andrew_jedi has quit IRC | 21:46 | |
*** ManojK has quit IRC | 21:48 | |
*** thorst has quit IRC | 21:50 | |
*** woodard has joined #openstack-ansible | 21:55 | |
*** messy has quit IRC | 21:57 | |
*** adrian_otto has quit IRC | 21:57 | |
*** jmckind_ has quit IRC | 21:58 | |
*** TxGirlGeek has joined #openstack-ansible | 22:00 | |
mrda | Morning all | 22:03 |
*** TxGirlGeek has quit IRC | 22:06 | |
*** thorst has joined #openstack-ansible | 22:07 | |
admin0 | morning mrda (0:10 AM here) | 22:10 |
*** berendt has quit IRC | 22:10 | |
*** TxGirlGeek has joined #openstack-ansible | 22:11 | |
*** asettle has joined #openstack-ansible | 22:12 | |
mrda | :) | 22:15 |
*** ametts has quit IRC | 22:16 | |
*** asettle has quit IRC | 22:17 | |
eil397 | morning mrda | 22:18 |
*** TxGirlGeek has quit IRC | 22:21 | |
mrda | o/ | 22:23 |
*** adrian_otto has joined #openstack-ansible | 22:25 | |
*** aernhart has joined #openstack-ansible | 22:33 | |
*** cloader89 has quit IRC | 22:37 | |
*** admin0 has left #openstack-ansible | 22:40 | |
*** admin0 has quit IRC | 22:40 | |
*** sdake_ has joined #openstack-ansible | 22:42 | |
*** sdake has quit IRC | 22:45 | |
*** KLevenstein has quit IRC | 22:46 | |
*** sdake_ has quit IRC | 22:49 | |
*** thorst has quit IRC | 22:58 | |
*** weshay has quit IRC | 22:58 | |
*** thorst has joined #openstack-ansible | 22:59 | |
*** sdake has joined #openstack-ansible | 23:01 | |
*** sdake has quit IRC | 23:04 | |
*** thorst has quit IRC | 23:07 | |
*** woodard has quit IRC | 23:12 | |
*** jamielennox is now known as jamielennox|away | 23:13 | |
*** deverter has quit IRC | 23:14 | |
*** asettle has joined #openstack-ansible | 23:28 | |
*** asettle has quit IRC | 23:33 | |
*** daneyon has joined #openstack-ansible | 23:38 | |
*** daneyon_ has joined #openstack-ansible | 23:39 | |
*** daneyon has quit IRC | 23:43 | |
*** jamielennox|away is now known as jamielennox | 23:51 | |
*** sacharya_ has quit IRC | 23:56 | |
*** mummer has quit IRC | 23:58 | |
*** eil397 has left #openstack-ansible | 23:59 |
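
Note on the repo-build exchange around 20:10: alextricity25 guesses that repo-build picks up any Ansible variable whose name is postfixed with "pip_packages", and cloudnull suggests modifying the vars to keep wheels from being built for one role. The sketch below only illustrates that guess in plain Python. It is not the openstack-ansible lookup code; the function name `collect_pip_packages`, the `skip_prefixes` parameter, and the sample variables are all hypothetical.

```python
def collect_pip_packages(role_vars, suffix="pip_packages", skip_prefixes=()):
    """Gather package names from every variable whose name ends with `suffix`.

    `role_vars` is assumed to be a dict of already-loaded role variables.
    Variables whose names start with one of `skip_prefixes` are ignored,
    which mimics excluding a single role (hypothetical behaviour).
    """
    packages = set()
    for name, value in role_vars.items():
        if not name.endswith(suffix):
            continue  # not a package list, e.g. a service name
        if any(name.startswith(prefix) for prefix in skip_prefixes):
            continue  # role explicitly excluded by the caller
        if isinstance(value, str):
            value = [value]  # tolerate a single package given as a string
        packages.update(value)
    return sorted(packages)


# Hypothetical variables, as they might look after loading role defaults.
example_vars = {
    "beaver_pip_packages": ["beaver"],
    "nova_pip_packages": ["nova", "python-novaclient"],
    "nova_service_name": "nova-api",
}

all_packages = collect_pip_packages(example_vars)
without_beaver = collect_pip_packages(example_vars, skip_prefixes=("beaver_",))
```

Under these assumptions, removing or renaming a role's `*_pip_packages` variables has the same effect as the `skip_prefixes` filter: nothing from that role is left to build.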