16:04:25 #startmeeting containers
16:04:25 Meeting started Tue May 16 16:04:25 2017 UTC and is due to finish in 60 minutes. The chair is adrian_otto. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:04:26 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:04:28 The meeting name has been set to 'containers'
16:04:44 #link https://wiki.openstack.org/wiki/Meetings/Containers#Agenda_for_2017-05-16_1600_UTC Our Agenda
16:04:51 #topic Roll Call
16:04:53 o/
16:04:54 Adrian Otto
16:04:57 Perry Rivera
16:04:58 Spyros Trigazis
16:04:59 Ton Ngo
16:04:59 kevinz
16:05:31 Kevin Lefevre
16:05:44 hello SamYaple juggler strigazi tonanhngo kevinz and ArchiFleKs
16:05:57 let's begin
16:06:03 #topic Announcements
16:06:12 1) The OpenStack summit was in Boston last week.
16:06:42 I know that strigazi and tonanhngo attended. Are there any other participants present who joined us?
16:06:53 I was here also
16:07:11 I was there
16:07:28 It was great to see you both! Thanks for joining.
16:08:09 2) Reminder that we switched from a weekly meeting cadence to alternating weeks.
16:08:36 If you ever want to know the meeting schedule, it is published here:
16:08:42 #link https://wiki.openstack.org/wiki/Meetings/Containers Meeting Schedule
16:08:55 Any other announcements from team members?
16:09:14 I'm getting
16:09:25 the k8s and swarm drivers for Newton up to date
16:09:39 it also might be worth mentioning that we reached a consensus to streamline our Cluster Upgrades feature.
16:10:02 For security reasons we need to have secure images and up-to-date kernels
16:10:07 so rather than being a major refactor, we will add cluster upgrades as a feature without redesigning the API at the same time.
16:10:55 thanks strigazi for the image update work
16:11:08 +1
16:11:32 That means we'll have a new API endpoint for upgrades, but the rest of the API won't be affected, minor changes only
16:12:05 I'm very happy with this approach. We can discuss it further in Open Discussion if there are concerns to address.
16:12:16 #topic Review Action Items
16:12:30 I believe this is a (none) but I am taking a quick look to verify that.
16:12:59 1) ACTION: adrian_otto to update team meeting schedule to */2 weeks (juggler, 16:21:41)
16:13:06 Status: Complete
16:13:30 #topic Blueprints/Bugs/Reviews/Ideas
16:13:50 In this section any team member may raise a work item for discussion
16:13:57 or request status
16:14:25 if we don't have any inquiries in a minute or two, we can enter Open Discussion
16:14:57 I have one
16:15:02 Hello
16:15:06 proceed kevinz
16:15:43 I have a BP for kuryr-kubernetes integration with magnum
16:15:45 https://blueprints.launchpad.net/magnum/+spec/integrate-kuryr-kubernetes
16:16:30 But I'm not sure whether this BP can be implemented. Are we ready to integrate the kuryr service?
16:16:42 kevinz do you think you can implement this in the Pike timeframe?
16:16:59 kevinz The only problem is how to add trunk ports with heat
16:17:02 The problem with this integration is that it opens a fundamental security hole
16:17:27 so it would need to be off by default with a sensible warning in the documentation about what the security risk of enabling it is
16:17:32 Yeah the heat trunk port patch has not merged
16:17:37 the ptl of kuryr has an example but I'm not sure how this can be easily added to magnum
16:17:48 let me fetch the example for you
16:17:49 basically anyone with access to the hosts that run this code has full access to your Neutron API
16:17:52 OK
16:18:28 kevinz https://github.com/danielmellado/kuryr_heat
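
(For context on the Heat gap mentioned above: trunk-port support was only proposed for Heat at the time of this meeting. Below is a minimal sketch of the kind of template fragment kuryr would need, assuming the then-proposed OS::Neutron::Trunk resource; the network names and segmentation ID are illustrative, not taken from the kuryr_heat example.)

    # Sketch only: OS::Neutron::Trunk had not merged when this meeting took
    # place; property names follow the proposed resource.
    resources:
      parent_port:
        type: OS::Neutron::Port
        properties:
          network: vm-net          # port the cluster node boots with
      pods_port:
        type: OS::Neutron::Port
        properties:
          network: pods-net        # VLAN-tagged subport handed to pods
      trunk:
        type: OS::Neutron::Trunk
        properties:
          port: { get_resource: parent_port }
          sub_ports:
            - port: { get_resource: pods_port }
              segmentation_type: vlan
              segmentation_id: 101
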
16:18:36 It's probably worth proceeding in parallel while the issues are sorted out
16:18:41 adrian_otto: In Pike I think it's hard to finish it, since heat is not ready
16:18:54 adrian_otto We can leverage the trust users and highlight the security implications
16:18:56 strigazi: Thanks
16:19:08 so my suggestion is to go ahead and implement it labeled EXPERIMENTAL with a security disclaimer, and then once we have that as a working feature (default off), we can add the required work to secure it.
16:19:13 In some use cases it might not be a problem.
16:19:26 agreed
16:19:30 Flannel will be for sure the default
16:19:32 Right, it depends on the use case
16:20:21 For CERN, for example, we don't have a use case at the moment but we may have in the future
16:21:03 kevinz based on your remarks above, it seems you are asking me not to target the blueprint for Pike. Do you think it's *possible* an initial version could be completed in Pike?
16:21:04 kevinz You plan to run kuryr in containers I guess, right?
16:21:52 adrian_otto: Yeah I think it is a "possible" :-)
16:22:08 ok, so I'll target it for Pike, and if we need to change it later we can.
16:22:28 strigazi: Yeah, I think it's a good choice
16:22:57 adrian_otto: thx
16:22:58 kevinz if you need any help, testing etc., don't hesitate to ping me
16:23:00 #action adrian_otto to target https://blueprints.launchpad.net/magnum/+spec/integrate-kuryr-kubernetes for Pike, with a comment that it's okay for this to slip to Queens if needed.
16:23:30 kevinz: I'd be happy to help
16:23:59 * adrian_otto nods approvingly to ArchiFleKs
16:24:06 strigazi: ArchiFleKs: That's great! Thanks a lot
16:24:35 cool, this is a great feature that I'd love to include in Pike.
16:24:37 kevinz If you will build new container images, have a look here; this is a way to make small python-based images: https://github.com/strigazi/heat-container-agent/blob/alpine-3.5/Dockerfile
16:24:59 good idea strigazi
16:25:36 Cool
16:26:15 I have something to propose that I don't like.
16:26:26 It is for the CI
16:26:34 go for it strigazi
16:27:04 I personally don't trust the convergence of our functional tests very much, and the rate of "random" failures is very high
16:27:18 So, I end up testing many things manually anyway.
16:27:58 I think we can reduce some of the *voting* tests that create pods or containers
16:28:07 make them non-voting
16:28:17 And focus more on the API
16:28:33 Be sure that the magnum service doesn't break
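
(A rough sketch of what this proposal would look like in the Zuul layout used by project-config at the time; the job names are illustrative, and "voting: false" is the mechanism that lets a job keep running and reporting without being able to block a merge.)

    # Sketch only: job names are illustrative, not the real gate job names.
    jobs:
      - name: gate-functional-dsvm-magnum-k8s
        voting: false      # still runs and reports, but cannot block a merge
      - name: gate-functional-dsvm-magnum-swarm
        voting: false

    projects:
      - name: openstack/magnum
        check:
          - gate-functional-dsvm-magnum-api    # stays voting: the API must not break
          - gate-functional-dsvm-magnum-k8s
          - gate-functional-dsvm-magnum-swarm
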
16:28:51 Is there a way to automate the separate manual tests?
16:29:14 or semi-automate them
16:29:16 Then we can run them ourselves
16:29:21 I'm working on adding 3rd party CIs but I can't wait.
16:29:33 We can.
16:29:41 One way is ansible
16:29:58 strigazi: Is there a way to run only some tests elsewhere?
I know I'm testing a lot manually also, but I think we could provide a limited number of concurrent instances at Osones with nested kvm for the CI
16:30:03 Have a playbook that creates devstack and runs the tests
16:30:33 It will be very limited, but at least it will be fast and limit failures due to timeouts
16:30:43 Or a heat template for clouds with heat running
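
(A minimal sketch of the ansible idea: a playbook that stands up devstack on a node and runs the magnum functional tests against it. The host group, paths, and tox environment name are assumptions for illustration, not an agreed layout.)

    # Sketch only: host group, paths and the tox env name are assumptions.
    - hosts: devstack_nodes
      tasks:
        - name: Fetch devstack
          git:
            repo: https://git.openstack.org/openstack-dev/devstack
            dest: /opt/devstack

        - name: Deploy devstack (local.conf must enable the magnum plugin)
          command: ./stack.sh
          args:
            chdir: /opt/devstack

        - name: Run the magnum functional tests
          command: tox -e functional-k8s
          args:
            chdir: /opt/stack/magnum
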
16:31:37 ArchiFleKs I think that can be complementary to our semi-manual/automatic tests.
16:32:12 strigazi: what about the CentOS CI?
16:32:20 I'm still waiting
16:32:39 Waiting for CoreOS to reply as well
16:32:54 #topic Open Discussion
16:33:23 So, what do you think about making some tests non-voting?
16:33:44 strigazi which ones?
16:34:34 For k8s and swarm. We will check that the stack is created, and this test will be voting. But not try to ping the API server
16:34:42 the same for swarm
16:35:17 in all honesty, I'm reluctant to do that.
16:35:27 For k8s specifically, even if our tests run successfully that doesn't mean anything
16:35:41 I'm worried that we will merge code with regressions that cause the COEs not to work
16:36:10 well, suppose we break etcd, for example
16:36:32 That means the API says ok. We don't test if pods can be created, or test more complicated features like service accounts
16:36:40 the func test for the affected driver should alert us to that
16:36:55 The tests will still run
16:37:00 but non-voting
16:37:08 but they fail most of the time now, I think
16:37:19 so we will become used to seeing those tests fail.
16:37:29 they fail on RAX, bluebox and infracloud
16:37:39 and probably will begin ignoring them completely because they do not vote
16:37:42 they work on OSIC and OVH
16:37:59 yep
16:38:25 does infra know this, and is there anything we can do to address the root cause of those failures?
16:38:38 adrian_otto I told them multiple times
16:38:39 I'm not satisfied that marking the tests as non-voting actually solves this.
16:39:10 I asked for nested virtualization and they don't provide it
16:39:20 They have it documented
16:39:25 just a sec
16:40:22 https://docs.openstack.org/infra/manual/testing.html#known-differences-to-watch-out-for
16:40:50 They say that with nested virt, some tests crash. For us it is the other way round
16:41:07 Almost the other way round
16:41:55 FYI, for k8s, we don't test etcd at the moment
16:42:25 So reducing the k8s voting tests won't cost anything.
16:43:57 team?
16:44:15 I read the differences list.
16:44:36 3rd bullet
16:45:22 long ago we did discuss setting up Nova in our devstack to use the libvirt LXC driver
16:45:24 The infra PTL told me that since clouds don't offer nested virt as an attribute, they can't offer nodepools with that feature
16:45:43 that would work around this issue because no hypervisor virt would be required to run the tests
16:45:50 If we test these manually to verify they are not broken, it may be workable
16:46:32 adrian_otto: we'll have to test, but I've been playing with the nova lxd driver, and running Docker inside can be painful and may not reflect the behavior inside kvm instances
16:46:53 fedora-atomic doesn't like lxd, I have tried
16:47:05 ArchiFleKs we can still test kvm instances on 3rd party CI per driver.
16:47:17 And we can't make decisions based on a slow CI
16:47:30 but I'd like to find a way to get around the nested virt problems for devstack and the basic gate tests
16:48:31 I'm talking specifically about 3 tests.
16:49:04 I heard some mumbling about "Elastic Recheck" at the summit
16:49:44 TestKubernetesAPIs.test_pod_apis, TestKubernetesAPIs.test_replication_controller_apis and TestKubernetesAPIs.test_service_apis
16:50:12 At one point we discussed automatically ordering rechecks that fail on clouds we know to be problematic. I wonder if ElasticRecheck is for that.
16:50:20 I meant 4, this one too: test_start_stop_container_from_api
16:50:38 adrian_otto I'll have a look
16:51:03 strigazi have we asked infra if there is a way for us to blacklist nodepools for clouds we know are incompatible?
16:51:13 so that we don't even try to test on those?
16:51:15 adrian_otto: from what I saw on openstack-ansible, it can point you to bugs you might have encountered during CI
16:51:30 But I've not looked deeper into it
16:51:49 I will ask again
16:51:52 ArchiFleKs if you can find a pointer to that, I'd love to take a look at it
16:51:54 On the ML this time
16:52:19 adrian_otto: there is some here http://status.openstack.org/elastic-recheck/
16:52:24 I'll try to find out more
16:52:31 because we are just wasting the Foundation's money/resources running tests that we know won't pass
16:53:15 adrian_otto I can make this point, they may like it
16:53:28 thanks strigazi
16:53:44 okay, we are scheduled to end our team meeting in just a few minutes.
16:53:54 any other remarks before we wrap up?
16:54:01 I'll ask on openstack-ansible, but I wonder how they are doing their tests; they must do some tests that require nested kvm and have encountered similar issues
16:54:18 good from this side
16:54:22 I wanted to add something but I don't know if we have time
16:54:50 Thanks everyone for attending. Our next team meeting will be 2017-05-30 at 1600UTC here in #openstack-meeting-alt :-)
16:54:56 #endmeeting