14:01:04 <tongli> #startmeeting interop_challenge
14:01:05 <openstack> Meeting started Wed May 3 14:01:04 2017 UTC and is due to finish in 60 minutes. The chair is tongli. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:06 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:08 <openstack> The meeting name has been set to 'interop_challenge'
14:01:10 <rarcea> o/
14:01:13 <tongli> hello, every one.
14:01:17 <topol> o/
14:01:21 <JASON_SHI> hello
14:01:23 <dmellado> hi folks
14:01:25 <GregWaines> hello
14:01:25 <wxy|> hello
14:01:37 <alexrobinson> hello
14:01:44 <jb4free> Hi
14:02:18 <ksumit> o/
14:02:39 <tongli> The agenda for today is just run the test and join the big cockroachdb cluster.
14:02:45 <tongli> and post results to the mailing list.
14:03:06 <tongli> #link https://etherpad.openstack.org/p/interop-challenge-meeting-2017-05-03
14:03:07 <JASON_SHI> cool
14:03:07 <zhipeng> o/
14:03:32 <tongli> We do have few items to talk before we dive into the actual run.
14:03:45 <dmellado> yep, I ran earlier today and also merged the commit, so we don't have to rely on in-review patchset
14:03:45 <tongli> Please see the etherpad at the agenda section.
14:04:05 <tongli> @dmellado, thanks.
14:04:13 <topol> awesome
14:04:13 <dmellado> zhipeng: I'd love +2 your commit ASAP too, so I'll do as soon as you address my comments there ;)
14:04:17 <tongli> @dmellado, one more review though.
14:04:33 <zhipeng> dmellado no problemo
14:04:40 <dmellado> tongli: the one for the tags? I assume that we'll need those for phase2
14:04:52 <tongli> all cores, please also look at the Mark's patch for destroying the preallocated floating IPs.
14:05:23 <tongli> @dmellado, not needed absolutely but it is nice to have when you test run it.
14:05:40 <dmellado> I'll take a look
14:05:47 <tongli> https://review.openstack.org/#/c/461591/
14:06:00 <tongli> less than 10 character change
14:06:20 <tongli> any way, please take a look at that.
we've also merged the nfv patch, yeah!!!
14:06:37 <tongli> now today's stuff.
14:07:07 <tongli> I've put two commands on the etherpad, these two commands are very important.
14:07:09 <dmellado> tongli: for the NFV one, not yet!
14:07:11 <dmellado> pls review also this one
14:07:13 <dmellado> https://review.openstack.org/#/c/462018/
14:07:40 <fredli> Hi all
14:07:54 <vkmc> o/
14:07:54 <tongli> @dmellado, u r right.
14:07:56 <vkmc> hey!
14:08:02 <dmellado> hi vkmc o/
14:08:06 <tongli> ok, please review that one as well.
14:08:19 <tongli> let me continue for today's stuff.
14:08:27 <tongli> the two commands, one for each phase.
14:09:04 <tongli> is everybody clear on that? how to run and what to expect?
14:09:31 <dmellado> just a question, the phase 1 acts with 'localhost' as a master
14:09:45 <GregWaines> what is the --skip-tags="apps" option ?
14:09:49 <tongli> @dmellado, not localhost, master node.
14:09:53 <dmellado> and then phase 2 tag would change that to use the master from the main cluster
14:10:01 <dmellado> yep, that's what I meant tongli ;)
14:10:16 <tongli> ok.
14:10:38 <dmellado> tongli: I mean, for the first tag
14:10:40 <tongli> @GregWaines, --skip-tags means do not run any tasks marked "apps"
14:10:46 <dmellado> should it act with own_master: True?
14:11:55 <tongli> @dmellado, if you only do phase #1, own_cluster set to False.
14:12:00 <GregWaines> @tongli, will it still start k8s and cockroach ?
14:12:08 <tongli> that way, you are not dealing with cockroachdb at all.
14:12:25 <tongli> if you set own_cluster == True, then you end up with one node cockroachdb cluster.
14:12:30 <tongli> which does not hurt anything.
14:12:31 <wxy|> dmellado: hi, the task: setup first cockroachdb node is skipped. I've configured the cockroachdb link in vars/ubuntu.yml. What should I do next to let it run?
14:12:59 <dmellado> tongli: yep, but that should be it for phase1-only participants, at least AFAIU
14:13:03 <dmellado> maybe we should make that clear
14:13:13 <tongli> ok.
14:13:22 <tongli> for phase #1 runners, here is the configuration.
14:13:32 <tongli> own_cluster set to False.
14:13:37 <tongli> and run this command:
14:13:52 <tongli> ansible-playbook -e "action=apply env=xxxx password=xxxx" site.yml --skip-tags="apps"
14:14:17 <tongli> doing that will only set you up a k8s cluster, has nothing to do with cockroachdb.
14:14:26 <tongli> now for phase2 runners.
14:14:40 <tongli> own_cluster set to False.
14:14:47 <tongli> public_node set to a IP
14:15:08 <tongli> first run phase 1 command:
14:15:15 <tongli> ansible-playbook -e "action=apply env=xxxx password=xxxx" site.yml --skip-tags="apps"
14:15:28 <tongli> then run this command to join the first cockroachdb cluster
14:15:40 <tongli> ansible-playbook -i run/runhosts -e "action=apply env=xxxx password=xxxx" site.yml --tags="info,apps"
14:15:49 <tongli> notice that the runhosts and tags
14:16:09 <tongli> the second command is for phase #2, which should take like 10 seconds.
14:16:30 <tongli> basically create a pod for on each k8s node.
14:16:35 <tongli> clear?
14:16:39 <dmellado> yep ;)
14:17:00 <tongli> everybody else?
14:17:08 <wxy|> very clear.
14:17:08 <GregWaines> yep
14:17:11 <daniela_ebert> clear
14:17:12 <wxy|> Thanks very much
14:17:18 <tongli> great.
14:18:17 <tongli> if you run the test using cached container image, this is important for phase #2 runners, please make sure that you are caching the latest cockroachdb container image.
14:18:45 <tongli> if not, the cockroachdb dashboard will show a warning, not end of the world but we do not want to see that.
14:18:51 <tongli> it will be a distraction.
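Editor's sketch: the two-phase run tongli walks through above can be written down as a pair of argument lists, which makes the difference between the phases easy to see (phase 2 adds the `run/runhosts` inventory and swaps `--skip-tags` for `--tags`). The function names and the `inventory` parameter are hypothetical helpers for illustration; only the `ansible-playbook` arguments themselves come from the log, and `env` / `password` remain placeholders as in the original.

```python
import shlex
from typing import List


def phase1_command(env: str, password: str) -> List[str]:
    """Phase 1: build the k8s cluster only; tasks tagged "apps" are skipped."""
    extra_vars = f"action=apply env={env} password={password}"
    return ["ansible-playbook", "-e", extra_vars, "site.yml", "--skip-tags=apps"]


def phase2_command(env: str, password: str,
                   inventory: str = "run/runhosts") -> List[str]:
    """Phase 2: reuse the generated inventory and run only the "info" and "apps" tasks."""
    extra_vars = f"action=apply env={env} password={password}"
    return ["ansible-playbook", "-i", inventory, "-e", extra_vars,
            "site.yml", "--tags=info,apps"]


def render(cmd: List[str]) -> str:
    """Render an argument list as a copy-pasteable shell command line."""
    return shlex.join(cmd)


if __name__ == "__main__":
    # Placeholder values; substitute your own cloud name and password.
    print(render(phase1_command("xxxx", "xxxx")))
    print(render(phase2_command("xxxx", "xxxx")))
```

Phase 2 runners would run both commands in order; phase 1 only runners stop after the first. In both cases `own_cluster` stays False in the configuration, as stated above.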
14:19:04 <alexrobinson> regarding the cached container image, I'd much prefer if we picked a specific tag for everyone to use than for everyone to just grab the "latest" image at different times
14:19:32 <tongli> alexrobinson, that means another patch.
14:20:01 <alexrobinson> sure, but if the "latest" versions that people have grabbed vary by enough, the versions might not play well together
14:20:20 <alexrobinson> we'll see today, I guess, but it's a non-trivial risk
14:20:59 <tongli> ok.
14:21:10 <alexrobinson> it's always a best practice when using docker to use specific version tags rather than ":latest" for this reason
14:21:19 <alexrobinson> (not just for cockroachdb, but for all container images)
14:21:43 <alexrobinson> anyways, sorry to interrupt
14:22:10 <alexrobinson> if it's an issue, everyone should switch to the "cockroachdb/cockroach:v1.0-rc.1" image
14:22:24 <dmellado> wxy|: you should be ok just with k8s after latest changes ;)
14:22:26 <tongli> point #2, right after you run the test, do not destroy your cluster just yet especially you are joining the big cluster.
14:23:14 <tongli> ok, back to point #1, container image, this is for phase #2 runners only.
14:23:37 <tongli> do you all want to change it that tag for cockroachdb image?
14:23:45 <wxy|> dmellado: I think so. Just checking now. tongli printed a very clear flow. Thanks.
14:23:50 <tongli> I have no issue with that change, just another small patch.
14:23:56 <dmellado> np wxy| !
14:24:46 <tongli> any one?
14:25:03 <tongli> on using the specific tag rather than the latest by default?
14:26:02 <tongli> or I have lost everybody?
14:26:24 <dmellado> tongli: heh, no, I'm still around
14:26:32 <dmellado> tbh I don't really care, I don't think that it'd affect too much
14:27:02 <tongli> ok, if you refresh your cached images, then I think you are ok.
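Editor's sketch: alexrobinson's point about ":latest" can be checked mechanically. The helper below is a hypothetical illustration, not part of the interop playbooks; it treats an image reference as pinned when it carries a digest or an explicit tag other than "latest", and relies on Docker's convention that a reference without a tag resolves to "latest".

```python
def image_tag(image: str) -> str:
    """Return the tag of a Docker image reference, defaulting to "latest"."""
    name = image.split("@", 1)[0]        # drop any @sha256:... digest
    last = name.rsplit("/", 1)[-1]       # ignore a registry host:port prefix
    if ":" in last:
        return last.rsplit(":", 1)[1]
    return "latest"                      # Docker's default when no tag is given


def is_pinned(image: str) -> bool:
    """True if the reference names a digest or a specific, non-"latest" tag."""
    if "@" in image:
        return True
    return image_tag(image) != "latest"


if __name__ == "__main__":
    for ref in ("cockroachdb/cockroach",
                "cockroachdb/cockroach:latest",
                "cockroachdb/cockroach:v1.0-rc.1"):
        print(ref, "pinned" if is_pinned(ref) else "floating")
```

With a check like this, only the `cockroachdb/cockroach:v1.0-rc.1` reference suggested above counts as pinned; the other two can resolve to different builds depending on when each cloud pulled or cached the image.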
14:27:20 <tongli> so #agreed, not to make a new patch for specific image
14:27:49 <tongli> point #3, stack size, need some input from Mark, Alex,
14:28:06 <tongli> when other cloud join in, how many nodes should we set?
14:28:22 <tongli> should each cloud have a different number of nodes join or same number of node join.
14:28:39 <tongli> from cockroachdb point of view, it should not matter, but to make the demo more dynamic,
14:28:46 <tongli> should the number vary a bit?
14:29:06 <dmellado> I guess that'd also depend on the size of the cloud
14:29:07 <tongli> if not, I suggest we set to 5, so each cloud have 4 nodes join the big cluster.
14:29:07 <dmellado> and the tenant
14:30:07 <tongli> any one else?
14:30:24 <tongli> any cloud should allow create 5 nodes, no?
14:31:03 <dmellado> hope so!
14:31:21 <tongli> #agreed, stack_size to be set to 5.
14:31:59 <tongli> @alexrobinson, do you have a IP we can use to join your cluster?
14:32:42 <vkmc> shouldn't be a problem with 5 nodes
14:33:38 <tongli> @alexrobinson, u still there?
14:33:48 <tongli> did you prepare a cluster so that we can join
14:34:11 <alexrobinson> @tongli, no I don't have a cluster
14:34:54 <SpencerKimball> @tongli, I think there's some confusion here. We (including Alex Polvi from CoreOS) were under the impression that we'd be using one of your clouds
14:35:38 <tongli> @alexrobinson, @SpencerKimball, hmmm, I do not think so.
14:36:16 <tongli> mark sent out an email right after the Austin rehearsal, that you guys will setup the first cluster at the session before our show.
14:36:40 <tongli> that has been communicated few times.
14:37:07 <tongli> let's hash that out offline.
14:37:28 <tongli> anyway, I have setup ibm cloud as the first cluster.
14:37:40 <tongli> can we all start run the phase 1?
14:37:51 <tongli> and post your run time and results to the mailing list?
14:38:07 <tongli> the first cluster information is on the etherpad.
14:38:30 <tongli> my screen is shared here #link https://apps.na.collabserv.com/meetings/join?id=1947-0912
14:40:18 <dmellado> tongli: password for the meeting?
14:40:23 <Wei_Liu> tongli, what's passcode?
14:40:33 <tongli> meettong
14:41:17 <daniela_ebert> works fine, I can see your screen
14:41:21 <dmellado> yep
14:41:28 <Wei_Liu> yep
14:41:33 <tongli> great. it shows the cockroachdb cluster.
14:41:44 <tongli> currently has 11 nodes, however 2 dead.
14:41:58 <dmellado> that should've my cluster around too
14:42:02 <GregWaines> emailed my/WindRiver results to interop-wg@lists.openstack.org <interop-wg@lists.openstack.org>
14:42:31 <dmellado> tongli: which kind of results would you need? I thought we were tracking this on the wiki page too
14:42:49 <dmellado> https://wiki.openstack.org/wiki/Interop_Challenge#Boston_Summit_On_Stage_Keynote_K8S_Demo_Commited_Parties
14:42:54 <tongli> wiki page has an example , at the very end.
14:43:22 <GregWaines> I just cut & pasted the timing and PLAY RECAP messages at the end
14:43:23 <tongli> it was made for the lampstack but should work for the k8s workload as well.
14:43:58 <GregWaines> ... sent email previously using template from wiki
14:44:04 <tongli> you can access cockroachdb dashboard by point a browser to your node as well.
14:44:18 <tongli> @GregWaines, thanks.
14:44:27 <dmellado> will send it after the meeting, then
14:45:00 <tongli> from screen, you can see that the interop database was created, and the size of the table demo will change
14:45:31 <tongli> it does that because each pod also has a data generator container running.
14:45:40 <tongli> which insert new values to the table.
14:46:39 <tongli> there are two nodes at 162.2.44.37, 162.2.44.22 are dead nodes.
14:47:41 <tongli> for phase #1, you should see something like this from k8s dashboard.
14:47:49 <wxy|> it's mine. :(
14:49:06 <tongli> @wxyl, ok, make sure everything is ok.
14:49:28 <tongli> if you switch namespace to kube-system from k8s dashboard, you should see more stuff on the k8s dashboard.
14:49:50 <tongli> get yourself familiar with the k8s dashboard especially for the phase #1 only runners.
14:50:11 <tongli> you should be able to just do this against your own cluster.
14:50:31 <Wei_Liu> ok
14:51:00 <tongli> if you switch to Pods section, you should see the Status to be running and Restarts to be 0
14:51:17 <tongli> if you see Restarts to be greater than 0, normally something is not right.
14:51:27 <tongli> it will be true for your cockroachdb pods as well.
14:52:08 <tongli> my screen shows that my cockroachdb pods restarted more than 0. which is not a good sign.
14:52:15 <wxy|> pods for k8s run well. But there is no cockroachdb pods
14:52:17 <dmellado> tongli: heh xD
14:52:29 <dmellado> wxy|: that's normal if you're running only phase 1
14:52:43 <wxy|> I run phase 2 as well.
14:52:45 <tongli> @wxyl, if you only run the phase #1, that will be correct
14:53:17 <wxy|> cool I saw it.
14:53:20 <tongli> if you run phase 2 as well, that won't be correct. switch the namespace.
14:53:29 <topol> :-)
14:53:34 <wxy|> but result is 2.:(
14:53:57 <tongli> @wxyl, that is correct. since you are not the first cluster, my cloud is.
14:54:21 <tongli> one of your node is used as k8s master node, which has no cockroachdb nodes running on.
14:54:28 <tongli> your stack_size must be 3.
14:54:35 <tongli> which is the default if you did not change it.
14:54:46 <tongli> ok, only few minutes left.
14:54:49 <wxy|> Does it means that I'm ok now?
14:55:00 <tongli> we need a count for who will be running phase #2
14:55:07 <dmellado> o/
14:55:30 <tongli> @wxyl, no, seems your cockroachdb nodes are dead.
14:56:10 <tongli> everybody, please indicate on the etherpad, who runs phase #2?
14:56:14 <dmellado> you can go in there and do kubectl get pods and check
14:56:17 <dmellado> wxy|:
14:56:18 <wxy|> How can I debug it?
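Editor's sketch: the health check described above (Status should be Running, Restarts should be 0, and each joining cloud contributes stack_size minus one cockroachdb nodes because the master runs none) can be scripted against the JSON that `kubectl get pods -o json` prints. The function names are hypothetical and the `default` namespace is an assumption; the log does not say which namespace the cockroachdb pods live in.

```python
import json
import subprocess
from typing import Dict, List, Tuple


def expected_db_nodes(stack_size: int) -> int:
    """One node in each stack is the k8s master, which runs no cockroachdb node."""
    return stack_size - 1


def problem_pods(pod_list: Dict) -> List[Tuple[str, str, int]]:
    """Return (name, phase, restarts) for pods that are not Running or have restarted."""
    problems = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        # Sum restarts over all containers; a pending pod may have no statuses yet.
        restarts = sum(c.get("restartCount", 0)
                       for c in status.get("containerStatuses", []))
        if phase != "Running" or restarts > 0:
            problems.append((name, phase, restarts))
    return problems


def load_pods(namespace: str = "default") -> Dict:
    """Fetch the pod list via kubectl; the namespace is an assumption, adjust as needed."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


if __name__ == "__main__":
    for name, phase, restarts in problem_pods(load_pods()):
        print(f"{name}: phase={phase} restarts={restarts}")
```

For the default stack_size of 3 this gives 2 expected database nodes, which matches the "result is 2" seen above, and 4 for the agreed stack_size of 5. For any pod the script flags, `kubectl describe pod <name>` and `kubectl logs <name>` are the usual next steps.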
14:56:29 <kongwei> o/
14:56:32 <wxy|> I'm a new guy for k8S. Sorry
14:57:08 <tongli> Anybody else for phase #2?
14:57:13 <dmellado> wxy|: using kubectl, but you could first try even with the dashboard
14:57:14 <daniela_ebert> I plan to run phase #2, but I still have an issue with the latest patches
14:57:30 <dmellado> daniela_ebert: what's on with those for you?
14:57:38 <GregWaines> Not me .... Wind River / Greg Waines / Brent Rowsell ... will be doing only Phase 1
14:57:56 <dmellado> tongli: I'm assuming that anyone who said 'public cloud' on the etherpad with the participant info
14:57:59 <dmellado> would be running it
14:58:02 <dmellado> even if not around today
14:58:07 <tongli> we need confirmation.
14:58:18 <tongli> the foundation has asked me yesterday.
14:58:41 <tongli> I can go with that if no objections.
14:58:53 <Wei_Liu> tongli, if I do not join phase #2, so I should config own_cluster as true and ignore --skip-tags="apps" in the command, right?
14:59:09 <vkmc> last meeting we said we were going to do the demo today, running the phase 2 so we make sure that with the last code we are all achieving the results we epect
14:59:21 <tongli> @Wei_Liu, please see the conversation earlier.
14:59:54 <vkmc> expect*
14:59:59 <Wei_Liu> yes, I was here all the time.
15:00:12 <dmellado> that's why I left the nodes connected to the DB, basically
15:00:19 <tongli> ok. one minute left.
15:00:29 <daniela_ebert> @dmellado: I am investigating in parallel. may I contact you later?
15:00:35 <tongli> contact me if you still have problems.
15:00:46 <dmellado> daniela_ebert: sure, feel free reaching me out (and tongli too ;) )
15:00:54 <tongli> #endmeeting