*** Marga_ has joined #openstack-containers | 00:26 | |
*** dboik has quit IRC | 00:28 | |
*** jay-lau-513 has quit IRC | 00:44 | |
*** suro-patz has joined #openstack-containers | 01:25 | |
*** achanda has joined #openstack-containers | 01:28 | |
*** suro-patz has quit IRC | 01:32 | |
*** suro-patz has joined #openstack-containers | 01:32 | |
*** suro-patz has quit IRC | 01:34 | |
*** achanda has quit IRC | 01:40 | |
*** achanda has joined #openstack-containers | 01:47 | |
*** achanda has quit IRC | 01:50 | |
*** dims has quit IRC | 02:15 | |
*** dims has joined #openstack-containers | 02:16 | |
*** erkules has joined #openstack-containers | 02:16 | |
*** erkules_ has quit IRC | 02:19 | |
*** dims has quit IRC | 02:20 | |
*** Marga_ has quit IRC | 02:31 | |
*** achanda has joined #openstack-containers | 02:38 | |
*** Marga_ has joined #openstack-containers | 02:44 | |
*** Marga_ has quit IRC | 02:50 | |
*** Marga_ has joined #openstack-containers | 02:50 | |
*** Marga_ has quit IRC | 03:05 | |
*** achanda has quit IRC | 03:27 | |
*** sdake_ has joined #openstack-containers | 03:52 | |
*** sdake has quit IRC | 03:53 | |
*** sdake has joined #openstack-containers | 03:53 | |
*** sdake_ has quit IRC | 03:57 | |
*** raginbaj- has joined #openstack-containers | 03:57 | |
*** raginbajin has quit IRC | 03:58 | |
*** raginbaj- is now known as raginbajin | 03:59 | |
*** adrian_otto has quit IRC | 04:03 | |
*** vilobhmm has joined #openstack-containers | 04:07 | |
*** hongbin has joined #openstack-containers | 04:08 | |
hongbin | sdake: The new image didn't seem to work.... I wrote the details on the bug https://bugs.launchpad.net/magnum/+bug/1434468 . | 04:10 |
openstack | Launchpad bug 1434468 in Magnum "Magnum default images used for kubeclt should be upgraded" [Critical,Triaged] - Assigned to Steven Dake (sdake) | 04:10 |
*** hongbin has quit IRC | 04:12 | |
*** sdake_ has joined #openstack-containers | 04:15 | |
*** sdake has quit IRC | 04:19 | |
* sdake_ groans | 04:21 | |
sdake_ | hongbin looks like the minions are registering | 04:24 |
sdake_ | I suspect a problem with the template | 04:24 |
sdake_ | perhaps a binary not being started that should be | 04:24 |
sdake_ | do you have ssh access to the machine? | 04:25 |
*** achanda has joined #openstack-containers | 04:34 | |
*** hongbin has joined #openstack-containers | 04:36 | |
hongbin | sdake_: yes I have | 04:40 |
sdake_ | systemctl | grep kube | fpaste | 04:41 |
sdake_ | on both nodes pls | 04:41 |
sdake_ | My dev env is busted atm | 04:41 |
sdake_ | which is why I can't test | 04:41 |
sdake_ | and super busy releasing kolla | 04:41 |
sdake_ | I plan to get into my magnum blueprint next week | 04:41 |
hongbin | sdake_: will do that | 04:43 |
hongbin | first I see an error on console | 04:43 |
hongbin | [ 59.959326] cloud-init[841]: <resource>node_wait_handle</resource>2015-03-22 03:37:56,708 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts) ci-info: +++++++++Authorized keys from /home/minion/.ssh/authorized_keys for user minion++++++++++ [ 60.092661] cloud-initci-info: +---------+-------------------------------------------------+---------+--------------- | 04:43 |
sdake_ | systemctl | grep failed | 04:43 |
hongbin | $ sudo systemctl | grep failed ● cloud-final.service loaded failed failed Execute cloud user/final scripts ● flanneld.service loaded failed failed Flanneld overlay address etcd agent | 04:44 |
*** funzo has quit IRC | 04:45 | |
hongbin | sdake_: will go to sleep soon | 04:45 |
sdake_ | if flanneld is busted magnum will not work i think | 04:45 |
hongbin | .... | 04:45 |
*** funzo has joined #openstack-containers | 04:45 | |
sdake_ | enjoy | 04:45 |
*** funzo has quit IRC | 04:45 | |
sdake_ | without an overlay network nothing works i think | 04:46 |
hongbin | ... | 04:46 |
hongbin | I will get back to it tomorrow | 04:47 |
sdake_ | ok | 04:47 |
hongbin | see you | 04:47 |
sdake_ | debug the failed services and see what you can find | 04:47 |
*** hongbin has quit IRC | 04:47 | |
sdake_ | the template is probably out dated | 04:47 |
*** sdake_ has quit IRC | 04:54 | |
*** sdake has joined #openstack-containers | 04:55 | |
*** dims has joined #openstack-containers | 05:08 | |
*** sdake_ has joined #openstack-containers | 05:20 | |
*** sdake has quit IRC | 05:24 | |
*** sdake_ has quit IRC | 05:26 | |
*** dims has quit IRC | 05:39 | |
*** vilobhmm has quit IRC | 06:55 | |
*** dims has joined #openstack-containers | 07:24 | |
*** Marga_ has joined #openstack-containers | 07:32 | |
*** Marga_ has quit IRC | 07:56 | |
*** dims has quit IRC | 07:58 | |
*** oro has joined #openstack-containers | 08:18 | |
*** dims has joined #openstack-containers | 09:24 | |
*** achanda has quit IRC | 09:41 | |
*** dims has quit IRC | 09:58 | |
*** oro has quit IRC | 10:15 | |
*** oro has joined #openstack-containers | 10:26 | |
*** dims has joined #openstack-containers | 10:55 | |
*** dims has quit IRC | 11:28 | |
*** Marga_ has joined #openstack-containers | 11:51 | |
*** Marga_ has quit IRC | 12:26 | |
*** dims has joined #openstack-containers | 12:37 | |
*** jay-lau-513 has joined #openstack-containers | 12:45 | |
*** dims has quit IRC | 12:59 | |
*** Marga_ has joined #openstack-containers | 14:36 | |
*** Marga_ has quit IRC | 14:53 | |
*** hongbin has joined #openstack-containers | 15:26 | |
hongbin | good morning | 15:26 |
*** sdake has joined #openstack-containers | 15:30 | |
sdake | morning | 15:37 |
hongbin | sdake: Figuring the image out | 15:40 |
sdake | I think the image is correct | 15:40 |
hongbin | This is the flannel service log http://pastie.org/10045394 | 15:40 |
sdake | just needs to be setup properly | 15:40 |
hongbin | possibly | 15:41 |
sdake | rpm -qa | grep etcd | 15:41 |
sdake | on master | 15:41 |
sdake | rather | 15:41 |
sdake | systemctl | grep etcd | 15:41 |
hongbin | $ rpm -qa | grep etcd etcd-0.4.6-6.fc21.x86_64 | 15:42 |
sdake | use journalctl on flanneld | 15:42 |
sdake | it will give a full log | 15:43 |
sdake | systemd status only gives a partial log | 15:43 |
hongbin | k. doing that | 15:43 |
sdake | I believe the syntax is journalctl -xl -u flanndeld.service | 15:43 |
sdake | but not certain | 15:43 |
sdake | one thing that looks weird is your network is "novalocal" | 15:44 |
sdake | I have only seen novalocal with nova networking, not with neutron | 15:44 |
hongbin | I am running on a VM in a cloud | 15:45 |
hongbin | The log looks empty... | 15:45 |
hongbin | $ sudo journalctl -xl -u flanndeld.service -- Logs begin at Fri 2015-03-20 14:54:29 UTC, end at Sun 2015-03-22 15:45:39 UTC. -- | 15:45 |
sdake | spell it right :) | 15:46 |
sdake | flanneld.service | 15:46 |
sdake | I had a typo I think | 15:46 |
hongbin | yeah. get the output | 15:46 |
sdake | fpaste /etc/flannel-docker-bridge.conf | 15:46 |
hongbin | pasting them | 15:46 |
sdake | just pipe through the fpaste command, its faster ;) | 15:50 |
hongbin | There is no such file /etc/flannel-docker-bridge.conf | 15:50 |
sdake | find it | 15:50 |
sdake | its on the filesystem somewhere | 15:50 |
sdake | sudo updatedb | 15:50 |
hongbin | k | 15:50 |
sdake | locate flannel-docker-bridge.conf | 15:50 |
sdake | I feel like i have been working 7 days a week - because I have! | 15:51 |
hongbin | here u go | 15:52 |
hongbin | http://pastie.org/10045429 | 15:52 |
hongbin | pasting the output of the log | 15:52 |
sdake | there is this cat reviewing on kolla | 15:52 |
sdake | I swear he must put every commit through a spell checker ;) | 15:53 |
sdake | I don't know how he sees these things | 15:53 |
*** sdake_ has joined #openstack-containers | 15:54 | |
*** sdake__ has joined #openstack-containers | 15:56 | |
*** sdake has quit IRC | 15:56 | |
hongbin | don't have the fpaste command... scping them | 15:58 |
hongbin | http://paste.openstack.org/show/194874/ | 15:59 |
*** dims has joined #openstack-containers | 16:00 | |
*** sdake_ has quit IRC | 16:00 | |
sdake__ | bummer fpaste isn't in atomic | 16:00 |
* sdake__ groans | 16:00 | |
sdake__ | your hostname looks greater than 64 characters | 16:01 |
sdake__ | and you are using nova networking | 16:01 |
sdake__ | you need to be using neutron imo :) | 16:01 |
hongbin | how to do that? | 16:01 |
sdake__ | neutron is setup via devstack | 16:02 |
sdake__ | the FQDN of a host must be less than 64 characters, or linux busts | 16:02 |
hongbin | I am running the devstack according to the guide... | 16:02 |
sdake__ | maybe the guide is busted | 16:03 |
hongbin | I see. Let me fix that | 16:03 |
sdake__ | I couldn't get the devstack guide to work last time I tried | 16:03 |
sdake__ | but I only tried for 2-3 hours | 16:03 |
sdake__ | paste your localrc? | 16:03 |
sdake__ | I am pretty certain when novalocal is in the domain name, nova networking is being used | 16:03 |
hongbin | http://paste.openstack.org/show/194875/ | 16:04 |
sdake__ | well I could be wrong on the nova networking thing | 16:05 |
sdake__ | if that is your localrc :) | 16:05 |
sdake__ | on your host ps -ef | grep nova | grep net | 16:05 |
hongbin | http://paste.openstack.org/show/194876/ | 16:06 |
hongbin | One thing is that my devstack may be a little outdated. It is a month ago. | 16:11 |
sdake__ | well lets not go messing that up atm :) | 16:14 |
sdake__ | can you boot the original image and make sure it works? | 16:14 |
sdake__ | (or have you recently) | 16:14 |
sdake__ | and see if the hostnames are super long | 16:14 |
sdake__ | if the hostnames are greater than 63 characters, networking doesn't work | 16:15 |
hongbin | sure. Let me try that | 16:15 |
sdake__ | it's a hard-coded limit in the linux kernel | 16:15 |
sdake__ | it may be the new atomic has a different hostname mechanism | 16:15 |
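The hostname-length concern raised above can be checked directly on a node. The following is a minimal sketch, assuming the 64-character limit mentioned in the conversation (on Linux, `getconf HOST_NAME_MAX` reports the actual value); the function name `hostname_too_long` is made up for illustration.

```shell
# Hypothetical helper: succeed (exit 0) when the given name exceeds the limit.
# The default limit of 64 is an assumption taken from the discussion above;
# `getconf HOST_NAME_MAX` reports the real value on a given system.
hostname_too_long() {
    name="$1"
    limit="${2:-64}"
    [ "${#name}" -gt "$limit" ]
}
```

For example, `hostname_too_long "$(hostname)" && echo "hostname exceeds the limit"` prints a message only when the node's name is over the limit.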
*** sdake has joined #openstack-containers | 16:20 | |
*** sdake__ has quit IRC | 16:23 | |
*** Marga_ has joined #openstack-containers | 16:31 | |
hongbin | sdake__: so far, the old image seems to work | 16:31 |
sdake | run the hostname command and show results | 16:31 |
hongbin | $ hostname te-ewkbyzjh6uzy-1-a74torhnlmdn-kube-node-hce7dkyvysyt.novalocal | 16:32 |
sdake | ok so we know its not the host name length | 16:38 |
sdake | journalctl the flanneld service | 16:38 |
hongbin | old image? | 16:39 |
sdake | right | 16:39 |
sdake | what we are after is why flanneld starts on old not on new | 16:39 |
sdake | it is probably a configuration file problem | 16:39 |
hongbin | here u go http://paste.openstack.org/show/194892/ | 16:42 |
sdake | is your new cluster still up? | 16:44 |
hongbin | now | 16:44 |
hongbin | s/now/no/ | 16:44 |
sdake | yes | 16:44 |
sdake | oh | 16:44 |
hongbin | ..... | 16:44 |
hongbin | I have to bring it down to get the new cluster | 16:45 |
hongbin | But I can use another VM to bring both up | 16:45 |
sdake | put that paste in the bug | 16:45 |
sdake | that is the working flanneld output | 16:45 |
sdake | find the configuration file | 16:45 |
sdake | paste it as well | 16:45 |
sdake | put it in the same comment | 16:45 |
hongbin | k. will do that | 16:45 |
hongbin | Exactly which config file? | 16:46 |
sdake | locate flannel | grep conf after running sudo updatedb | 16:46 |
hongbin | $ sudo updatedb sudo: updatedb: command not found | 16:47 |
sdake | find /etc -name flannel\* -ls | 16:47 |
*** hongbin has quit IRC | 16:49 | |
*** hongbin has joined #openstack-containers | 16:49 | |
sdake | find /etc -name flannel\* -ls | 16:49 |
hongbin | this one? /etc/sysconfig/flanneld | 16:49 |
sdake | is that all there is? | 16:50 |
hongbin | this is the list http://paste.openstack.org/show/194901/ | 16:50 |
sdake | paste that one in, and the flannel journalctl log above to the bug | 16:50 |
sdake | ok, i'm pretty sure I know the problem | 16:51 |
sdake | flanneld has been updated | 16:51 |
sdake | it requires configuring its configuration file | 16:51 |
sdake | oh wait | 16:51 |
sdake | /etc/systemd/system/docker.service.d/flannel.conf | 16:51 |
sdake | paste that | 16:51 |
hongbin | k | 16:52 |
sdake | configure-flannel.sh creates the coreos.com keys in etcd | 16:55 |
sdake | I assume that only runs on the master node | 16:55 |
sdake | http://pastie.org/10045394 | 16:59 |
sdake | it says systemd timed out trying to start flanneld | 16:59 |
sdake | ok kill old cluster | 17:00 |
sdake | lets go back to new image | 17:00 |
sdake | and try to manually restart flanneld | 17:01 |
sdake | this will verify the master is creating the keys in etcd | 17:01 |
hongbin | k. will do that. | 17:03 |
hongbin | Need to get a lunch first | 17:03 |
sdake | i have a haircut | 17:03 |
sdake | so I may be out for a bit | 17:03 |
hongbin | k | 17:03 |
hongbin | see you later | 17:03 |
*** hongbin has quit IRC | 17:08 | |
*** dims has quit IRC | 17:22 | |
*** sdake_ has joined #openstack-containers | 17:31 | |
*** daneyon has quit IRC | 17:43 | |
*** Marga_ has quit IRC | 17:57 | |
*** vilobhmm has joined #openstack-containers | 18:21 | |
*** Marga_ has joined #openstack-containers | 18:37 | |
*** hongbin has joined #openstack-containers | 18:43 | |
*** achanda has joined #openstack-containers | 19:06 | |
*** Marga_ has quit IRC | 19:09 | |
*** Marga_ has joined #openstack-containers | 19:11 | |
*** Tango has joined #openstack-containers | 19:12 | |
*** Marga_ has quit IRC | 19:17 | |
*** Tango has quit IRC | 19:26 | |
*** daneyon has joined #openstack-containers | 19:39 | |
*** achanda has quit IRC | 19:40 | |
*** achanda has joined #openstack-containers | 20:07 | |
*** vilobhmm has quit IRC | 20:16 | |
*** dims has joined #openstack-containers | 20:21 | |
*** achanda has quit IRC | 20:21 | |
*** dims has quit IRC | 20:24 | |
*** achanda has joined #openstack-containers | 20:29 | |
*** sdake__ has joined #openstack-containers | 20:31 | |
*** sdake has quit IRC | 20:35 | |
sdake__ | hongbin did you make it back from lunch yet | 20:51 |
hongbin | sdake__: Yes, I am back | 20:52 |
hongbin | trying to bring up another VM | 20:52 |
sdake__ | did you try a manual restart of flannel (if you're still working on the problem) | 20:52 |
hongbin | Trying that | 20:52 |
sdake__ | sudo systemctl reset-failed flanneld.service | 20:52 |
sdake__ | sudo systemctl start flanneld.service | 20:53 |
hongbin | k | 20:53 |
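The two commands above form a common recovery sequence for a unit stuck in the failed state. A minimal sketch that wraps them follows; the function name `restart_failed_unit` and the `SYSTEMCTL` override variable are assumptions added for illustration and are not part of systemd.

```shell
# Sketch: clear a unit's failed state, then start it again.
# SYSTEMCTL is a hypothetical override (not a systemd feature) so the
# sequence can be dry-run, e.g. SYSTEMCTL="echo systemctl".
restart_failed_unit() {
    unit="$1"
    ${SYSTEMCTL:-sudo systemctl} reset-failed "$unit" &&
        ${SYSTEMCTL:-sudo systemctl} start "$unit"
}
```

Calling `restart_failed_unit flanneld.service` runs both steps in order and stops if the first one fails; `sudo journalctl -u flanneld.service` then shows the full log for the unit.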
hongbin | The old stack was blocked on delete_in_progress state. Trying to figure out why | 20:54 |
*** sdake has joined #openstack-containers | 20:56 | |
*** sdake__ has quit IRC | 21:00 | |
hongbin | sdake: K. Those commands work. The status of flanneld is running now. | 21:09 |
hongbin | somehow, the minions were still on state NotReady | 21:10 |
hongbin | maybe need a restart | 21:10 |
*** dims has joined #openstack-containers | 21:15 | |
*** achanda has quit IRC | 21:20 | |
sdake | i think they probably added a timeout to the systemd file for flannel | 21:40 |
sdake | which is a bit of a problem since we start the master second | 21:40 |
sdake | can you see if there is a timeout in the flannel systemd file | 21:40 |
hongbin | let me check | 21:41 |
hongbin | It seems there is no timeout specified http://paste.openstack.org/show/195001/ | 21:44 |
sdake | when you start the cluster does the master come up first? | 21:45 |
hongbin | Didn't pay attention for that | 21:46 |
sdake | can you check | 21:46 |
hongbin | I remembered the minion first | 21:46 |
hongbin | k, let me bring the cluster down and bring it up again | 21:47 |
sdake | we need to change the order | 21:47 |
sdake | so that minion launching blocks until master is up | 21:47 |
sdake | a depends_on in the minions should get the job done | 21:48 |
hongbin | I guess that will result in a cyclic dependency in heat | 21:48 |
hongbin | since the ip addresses are passed from minion to master | 21:48 |
hongbin | so the minions have to go first | 21:49 |
sdake | try rebooting the master | 21:50 |
sdake | then rebooting the minion | 21:50 |
sdake | and see if kubectl works | 21:50 |
sdake | with a delay to finish up between reboot | 21:51 |
hongbin | I restarted all process on minions | 21:51 |
hongbin | it works | 21:51 |
sdake | kubectl works? | 21:51 |
hongbin | works | 21:51 |
hongbin | I mean the restarted minion went to ready state | 21:52 |
sdake | kubernetes needs to be able to handle a down master | 21:52 |
sdake | and retry | 21:52 |
sdake | the minions that is | 21:52 |
hongbin | I am not sure if minion needs to talk to master. | 21:53 |
hongbin | On startup, for sure. Maybe not after that | 21:54 |
hongbin | Master periodically poll minions, according to the docs | 21:54 |
sdake | log into minion, run systemctl | grep kube | 21:55 |
hongbin | The cluster is down now.... | 21:55 |
hongbin | Testing the order | 21:55 |
sdake | ok | 21:55 |
sdake | I want to see if there is a retry option in the kube services on the minion | 21:56 |
sdake | if not there should be ;) | 21:56 |
hongbin | I think the problem is not retry | 21:56 |
hongbin | The problem seems to be | 21:56 |
hongbin | The flanneld process did not start successfully | 21:56 |
hongbin | the rest of the kube processes depend on flanneld | 21:57 |
hongbin | so none of them started | 21:57 |
hongbin | I saw that the status of kubelet was not started due to the dependency | 21:57 |
sdake | we can fix that | 21:59 |
hongbin | k | 21:59 |
sdake | http://www.freedesktop.org/software/systemd/man/systemd.service.html | 22:00 |
sdake | TimeoutStartSec=0 | 22:00 |
sdake | can you figure out how to hack that into the systemd file for flanneld | 22:01 |
hongbin | I can try | 22:02 |
sdake | let me try to make a new image while you try the heat template modification | 22:03 |
hongbin | k | 22:03 |
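One way to apply `TimeoutStartSec=0` without editing the packaged unit file is a systemd drop-in. The sketch below assumes the standard drop-in directory layout; the function name `write_no_timeout_dropin` and the file name `10-no-start-timeout.conf` are made up for illustration. Whether this setting resolves the flanneld failure is not confirmed by the conversation. Note that when a unit file specifies no timeout, systemd's default start timeout still applies.

```shell
# Sketch: write a systemd drop-in that disables the start timeout for a unit.
# The file name 10-no-start-timeout.conf is arbitrary; the target directory
# for flanneld would be /etc/systemd/system/flanneld.service.d (needs root).
write_no_timeout_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" &&
        printf '[Service]\nTimeoutStartSec=0\n' \
            > "$dropin_dir/10-no-start-timeout.conf"
}
```

Run as root, `write_no_timeout_dropin /etc/systemd/system/flanneld.service.d` followed by `systemctl daemon-reload` makes systemd pick up the override the next time the unit is started.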
hongbin | confirmed. Minion first, then master | 22:03 |
hongbin | I have a pull request, that can change the order https://github.com/larsks/heat-kubernetes/pull/14 | 22:04 |
hongbin | If that is merged, master will go first. | 22:04 |
hongbin | That could be another option | 22:05 |
sdake | do the minions have a wait condition on the master? | 22:05 |
hongbin | ??? | 22:05 |
sdake | does it work with your patch applied? | 22:06 |
hongbin | Oh, yes that is | 22:06 |
hongbin | I can try | 22:06 |
sdake | please do, if does, i'll merge that change | 22:07 |
hongbin | k | 22:07 |
sdake | does the curl request in your change dynamically register the minion? | 22:09 |
sdake | https://github.com/larsks/heat-kubernetes/pull/14/files#diff-a7f24c17d7f1801ed844d6044b8194a4R36 | 22:10 |
*** achanda has joined #openstack-containers | 22:11 | |
hongbin | Yes, that did the work. | 22:12 |
hongbin | I added a dependency on wait condition to ensure master go first | 22:12 |
sdake | patch looks good to me | 22:13 |
sdake | i want to make sure it is verified with kube 0.11 | 22:13 |
hongbin | sure | 22:14 |
hongbin | I am more confident if larsks have tested it | 22:14 |
hongbin | anyway, it works with the old image from my perspective | 22:15 |
sdake | thats good news, lets see if it works with new one :) | 22:15 |
hongbin | yes, bringing the cluster up | 22:16 |
sdake | i really like your patch | 22:17 |
sdake | nice work :) | 22:17 |
hongbin | thx! | 22:18 |
*** dims has quit IRC | 22:19 | |
*** sdake__ has joined #openstack-containers | 22:19 | |
hongbin | K. All the minions are Ready, testing pod creation | 22:20 |
*** prad has quit IRC | 22:21 | |
sdake__ | how many minions do you have? | 22:21 |
hongbin | two | 22:21 |
sdake__ | wfm | 22:21 |
*** sdake has quit IRC | 22:23 | |
*** dims has joined #openstack-containers | 22:23 | |
*** Marga_ has joined #openstack-containers | 22:23 | |
*** dims has quit IRC | 22:24 | |
*** vilobhmm has joined #openstack-containers | 22:27 | |
*** vilobhmm has quit IRC | 22:27 | |
sdake__ | hongbin is this accurate https://bugs.launchpad.net/magnum/+bug/1434468/comments/9 | 22:28 |
openstack | Launchpad bug 1434468 in Magnum "Magnum default images used for kubeclt should be upgraded" [Critical,Triaged] - Assigned to Steven Dake (sdake) | 22:28 |
sdake__ | earlier on irc you said the minion kube services were not running because of the flanneld failure to start | 22:29 |
hongbin | I think it is correct. | 22:30 |
hongbin | Since I cannot find kubelet in minion | 22:30 |
sdake__ | weird | 22:30 |
sdake__ | it should say failed or something | 22:30 |
sdake__ | not just not start and give no output ;) | 22:30 |
hongbin | yes, it should fail | 22:31 |
*** vilobhmm has joined #openstack-containers | 22:32 | |
sdake__ | do the pods start? | 22:32 |
hongbin | starting: It is very slow on VM inside VM :) | 22:33 |
sdake__ | oh virt on virt | 22:34 |
sdake__ | yikes :) | 22:34 |
sdake__ | I run magnum on bare metal, but my system is in a constant state of bustedness because of devstack as a result | 22:34 |
hongbin | sorry to hear that | 22:35 |
hongbin | :) | 22:35 |
sdake__ | hook up jumper cables to your machines - give em more juice :) | 22:36 |
hongbin | Yes, somehow, I cannot run the devstack recently. Not sure why | 22:36 |
sdake__ | my workstation has 128g ram | 22:36 |
sdake__ | after I bought it I plugged 8 16gb ram chips in | 22:36 |
sdake__ | and the computer said "memory busted" | 22:36 |
sdake__ | I was like "OH NO" | 22:36 |
sdake__ | 1700$ down the drain on busted memory | 22:36 |
sdake__ | but one wasn't seated properly | 22:37 |
hongbin | you have a lot of memory :) | 22:37 |
sdake__ | my 3 lab machines at my hosue have 32gb as well | 22:37 |
sdake__ | I use those mostly for kolla | 22:37 |
sdake__ | and my workstation mostly for magnum | 22:37 |
sdake__ | and if someone ever invents ironic for magnum, I'll use my lab machines for magnum too ;) | 22:37 |
hongbin | sdake__: The pod creation seems to be pending forever | 22:43 |
sdake__ | is the pod assigned an ip? | 22:44 |
hongbin | yes | 22:44 |
hongbin | $ kubectl get pods POD IP CONTAINER(S) IMAGE(S) HOST LABELS STATUS redis-master master kubernetes/redis:v1 10.0.0.5/ name=redis,redis-sentinel=true,role=master Pending sentinel kubernetes/redis:v1 | 22:44 |
sdake__ | ssh into 10.0.0.5 | 22:44 |
hongbin | in | 22:45 |
sdake__ | sudo docker images | 22:45 |
hongbin | empty list | 22:45 |
sdake__ | ping 8.8.8.8 | 22:45 |
hongbin | .......... | 22:46 |
hongbin | That is the problem | 22:46 |
sdake__ | ;) | 22:46 |
hongbin | unpingable | 22:46 |
sdake__ | masquerade ftw | 22:46 |
sdake__ | turn on a masquerade iptables rule | 22:46 |
sdake__ | on the host | 22:46 |
hongbin | I did that | 22:47 |
hongbin | with this command sudo iptables --table nat -A POSTROUTING -o eth0 -j MASQUERADE | 22:47 |
hongbin | no luck | 22:47 |
hongbin | I remembered the old image can ping | 22:48 |
*** sdake has joined #openstack-containers | 22:48 | |
sdake | sudo route | fpaste | 22:48 |
sdake | inside the vm 10.0.0.5 | 22:48 |
sdake | hongbin i'm not sure how iptables masquerade works with vm in vm | 22:49 |
sdake | I have never tried it | 22:49 |
sdake | I struggle to get neutron to work baremetal ;) | 22:49 |
hongbin | http://paste.openstack.org/show/195027/ | 22:49 |
hongbin | I confirmed that iptables masquerade works | 22:50 |
hongbin | in nested virtualization | 22:50 |
hongbin | since I made it before | 22:50 |
sdake | route table looks good | 22:50 |
sdake | inside the first layer vm, run ip link show | fpaste | 22:51 |
sdake | the one where your devstack is running | 22:51 |
*** sdake__ has quit IRC | 22:51 | |
hongbin | http://paste.openstack.org/show/195028/ | 22:51 |
sdake | sudo ip addr show | fpaste | 22:52 |
hongbin | http://paste.openstack.org/show/195029/ | 22:53 |
sdake | I would think you would have a bridge for 10.0.0.1 in your devstack vm | 22:54 |
sdake | and you do not | 22:54 |
sdake | the 10.0.0.5 vm is using a default gateway of 10.0.0.1 | 22:55 |
sdake | so the traffic is going to where? | 22:55 |
hongbin | I think it go to somewhere inside my cloud provider | 22:56 |
hongbin | which in turn route to internet | 22:56 |
hongbin | That is the cloud our lab built | 22:56 |
sdake | delete your bay | 22:57 |
sdake | boot the fedora-atomic original image | 22:57 |
sdake | and see if the routing works | 22:57 |
sdake | eg, can you ping 8.8.8.8 from the kubecluster | 22:57 |
hongbin | k | 22:58 |
sdake | let's see if the routing is busted in half the cases or all cases | 22:58 |
sdake | if its busted in all cases, I think we should proceed with merging the changes and generating a gerrit review | 22:58 |
sdake | and find another cat to test while you sort out the network | 22:59 |
hongbin | k | 22:59 |
sdake | the traffic should be flowing out of your floating ips | 23:00 |
sdake | not the 10.0.0.1 network | 23:00 |
sdake | but somehow neutron takes care of that | 23:00 |
sdake | the floating ips should be on your external network | 23:00 |
hongbin | My VM has internet access without floating IP | 23:01 |
sdake | what is the ip of your vm? | 23:01 |
hongbin | I am using a private IP through VPN | 23:01 |
sdake | my brain just imploded | 23:01 |
sdake | i'll stop asking questions I dont want to know the answer to now :) | 23:02 |
hongbin | Yes, the network setting is not straightforward | 23:02 |
hongbin | :) | 23:02 |
sdake | it is possible someone changed a setting in the cloud lab that broke your ip connectivity in vm on vm | 23:03 |
sdake | so lets see if thats the case from a known good working version of a fedora image | 23:03 |
hongbin | k | 23:04 |
sdake | the idea is magnum connects floating ips from your external network to the vm | 23:05 |
sdake | then external stuff goes out the float | 23:05 |
sdake | but I dont know what happens in nested virt case | 23:05 |
sdake | i am pretty sure your nested vm should have a 10.0.0.1 interface though | 23:06 |
sdake | you may need to add one | 23:06 |
hongbin | I see | 23:07 |
hongbin | Yes, both layer1 and layer2 use the 10.0.0.0 space | 23:07 |
hongbin | which may cause problem | 23:07 |
sdake | well lets try debug your neutron second, and debug the image first :) | 23:08 |
hongbin | I am bringing up the VM. | 23:08 |
sdake | is your baremetal machine lacking memory? | 23:09 |
sdake | you might have more fun if you run devstack on it :) | 23:09 |
hongbin | it has virtualized bit disabled... | 23:09 |
sdake | you can change that in bios no? | 23:10 |
hongbin | I guess not, but that is out of my control | 23:10 |
sdake | so you're running nested qemu with nested qemu | 23:11 |
sdake | it must take forever to get things done :) | 23:11 |
sdake | sounds like 3 layers of virt to me | 23:12 |
* sdake shudders | 23:12 | |
sdake | my grass is growing by the minute | 23:13 |
hongbin | no. two layer only :) | 23:13 |
hongbin | This time I brought up the VM, but cannot SSH to it. Very strange | 23:15 |
hongbin | Let me try again | 23:16 |
sdake | when in doubt reboot | 23:16 |
sdake | (your host machine) | 23:16 |
hongbin | You means I need to reboot the hosted VM? | 23:17 |
sdake | I would reboot the entire thing | 23:17 |
sdake | at the lowest layer | 23:17 |
hongbin | k | 23:17 |
sdake | so you have a hosted vm, and in that you run devstack, and that launches magnum, which launches new vms in the hosted vm? | 23:19 |
hongbin | yes, correct | 23:21 |
sdake | you connect to the hosted vm how, via laptop? | 23:22 |
*** yuanying has joined #openstack-containers | 23:22 | |
hongbin | yes, via laptop | 23:22 |
sdake | yuanying you around? | 23:22 |
sdake | hongbin that is where I'd run devstack ;-) | 23:22 |
yuanying | hi | 23:23 |
yuanying | good morning | 23:23 |
sdake | hey - mind running a quick test of magnum | 23:23 |
hongbin | :) | 23:23 |
sdake | morning fine sir | 23:23 |
hongbin | hey yuanying | 23:23 |
sdake | use this image https://fedorapeople.org/groups/heat/kolla/fedora-21-atomic-2.qcow2 | 23:24 |
sdake | hongbin can paste his magnum diff | 23:24 |
sdake | hongbin has a busted network I suspect | 23:25 |
sdake | I'd like a second opinion ;) | 23:25 |
hongbin | This is the pull request https://github.com/larsks/heat-kubernetes/pull/14 | 23:25 |
yuanying | Is Magnum broken at new fedora atomic image? | 23:26 |
sdake | yes | 23:26 |
hongbin | sdake: confirmed. the old image cannot ping 8.8.8.8 as well | 23:26 |
sdake | hongbin that is good news :) | 23:26 |
hongbin | yup | 23:27 |
sdake | if yuanying could confirm it works without the crazy network nested virt setup you got rolling, that would be grand :) | 23:27 |
sdake | hongbin/yuanying I just merged that pull request | 23:28 |
sdake | hongbin can you make a master patch for magnum and git review | 23:28 |
hongbin | yes, will do that | 23:29 |
sdake | after you're done, I'll add a second patch on top that depends on yours that adds the appropriate documentation | 23:29 |
hongbin | k | 23:30 |
sdake | boy I need some anestaphine | 23:32 |
hongbin | sdake: I need to leave for a while to get dinner. Will get back to the git review | 23:32 |
sdake | hongbin sounds good enjoy food | 23:33 |
sdake | can you paste your diff and I'll do the git review? | 23:33 |
*** oro has quit IRC | 23:46 | |
openstackgerrit | Steven Dake proposed stackforge/magnum: Merge heat-kubernetes pull request 14 https://review.openstack.org/166661 | 23:49 |
openstackgerrit | Steven Dake proposed stackforge/magnum: Modify documentation to point to kubernetes-0.11 atomic image https://review.openstack.org/166662 | 23:49 |
sdake | yuanying https://review.openstack.org/#/c/166662/ | 23:53 |
*** julim has quit IRC | 23:54 | |
*** sdake__ has joined #openstack-containers | 23:54 | |
*** sdake has quit IRC | 23:58 |