16:04:25 <adrian_otto> #startmeeting containers
16:04:25 <openstack> Meeting started Tue May 16 16:04:25 2017 UTC and is due to finish in 60 minutes. The chair is adrian_otto. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:04:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:04:28 <openstack> The meeting name has been set to 'containers'
16:04:44 <adrian_otto> #link https://wiki.openstack.org/wiki/Meetings/Containers#Agenda_for_2017-05-16_1600_UTC Our Agenda
16:04:51 <adrian_otto> #topic Roll Call
16:04:53 <SamYaple> o/
16:04:54 <adrian_otto> Adrian Otto
16:04:57 <juggler> Perry Rivera
16:04:58 <strigazi> Spyros Trigazis
16:04:59 <tonanhngo> Ton Ngo
16:04:59 <kevinz> kevinz
16:05:31 <ArchiFleKs> Kevin Lefevre
16:05:44 <adrian_otto> hello SamYaple juggler strigazi tonanhngo kevinz and ArchiFleKs
16:05:57 <adrian_otto> let's begin
16:06:03 <adrian_otto> #topic Announcements
16:06:12 <adrian_otto> 1) The OpenStack summit was in Boston last week.
16:06:42 <adrian_otto> I know that strigazi and tonanhngo attended. Are there any other participants present who joined us?
16:06:53 <ArchiFleKs> I was there also
16:07:11 <SamYaple> I was there
16:07:28 <adrian_otto> It was great to see you both! Thanks for joining.
16:08:09 <adrian_otto> 2) Reminder that we switched from a weekly meeting cadence to alternating weeks.
16:08:36 <adrian_otto> If you ever want to know the meeting schedule, it is published here:
16:08:42 <adrian_otto> #link https://wiki.openstack.org/wiki/Meetings/Containers Meeting Schedule
16:08:55 <adrian_otto> Any other announcements from team members?
16:09:14 <strigazi> I'm getting
16:09:25 <strigazi> the k8s and swarm drivers for Newton up to date
16:09:39 <adrian_otto> It also might be worth mentioning that we reached a consensus to streamline our Cluster Upgrades feature.
16:10:02 <strigazi> For security reasons we need to have secure images and up-to-date kernels
16:10:07 <adrian_otto> So rather than being a major refactor, we will add cluster upgrades as a feature without redesigning the API at the same time.
16:10:55 <adrian_otto> Thanks strigazi for the image update work
16:11:08 <juggler> +1
16:11:32 <strigazi> That means we'll add a new API endpoint for upgrades, but the rest of the API won't be affected; minor changes only
16:12:05 <adrian_otto> I'm very happy with this approach. We can discuss it further in Open Discussion if there are concerns to address.
16:12:16 <adrian_otto> #topic Review Action Items
16:12:30 <adrian_otto> I believe this is (none), but I am taking a quick look to verify that.
16:12:59 <adrian_otto> 1) ACTION: adrian_otto to update team meeting schedule to */2 weeks (juggler, 16:21:41)
16:13:06 <adrian_otto> Status: Complete
16:13:30 <adrian_otto> #topic Blueprints/Bugs/Reviews/Ideas
16:13:50 <adrian_otto> In this section any team member may raise a work item for discussion
16:13:57 <adrian_otto> or request status
16:14:25 <adrian_otto> If we don't have any inquiries in a minute or two, we can enter Open Discussion
16:14:57 <kevinz> I have one
16:15:02 <kevinz> Hello
16:15:06 <adrian_otto> proceed kevinz
16:15:43 <kevinz> I have a BP for integrating kuryr-kubernetes with magnum
16:15:45 <kevinz> https://blueprints.launchpad.net/magnum/+spec/integrate-kuryr-kubernetes
16:16:30 <kevinz> But I'm not sure whether this BP can be implemented yet. Are we ready to integrate the kuryr service?
16:16:42 <adrian_otto> kevinz do you think you can implement this in the Pike timeframe?
16:16:59 <strigazi> kevinz The only problem is how to add trunk ports with heat
16:17:02 <adrian_otto> The problem with this integration is that it opens a fundamental security hole
16:17:27 <adrian_otto> so it would need to be off by default, with a sensible warning in the documentation about what the security risk of enabling it is
16:17:32 <kevinz> Yeah, the heat trunk port patch has not merged yet
16:17:37 <strigazi> The PTL of kuryr has an example, but I'm not sure how this can be easily added to magnum
16:17:49 <adrian_otto> Basically anyone with access to the hosts that run this code has full access to your Neutron API
16:17:52 <kevinz> OK
16:17:52 <strigazi> let me fetch the example for you
16:18:28 <strigazi> kevinz https://github.com/danielmellado/kuryr_heat
16:18:36 <tonanhngo> It's probably worth proceeding in parallel while the issues are sorted out
16:18:41 <kevinz> adrian_otto: In Pike I think it's hard to finish, since heat is not ready
16:18:54 <strigazi> adrian_otto We can leverage the trust users and highlight the security implications
16:18:56 <kevinz> strigazi: Thanks
16:19:08 <adrian_otto> So my suggestion is to go ahead and implement it labeled EXPERIMENTAL with a security disclaimer, and then once we have that as a working feature (default off), we can add the required work to secure it.
16:19:13 <strigazi> In some use cases it might not be a problem.
16:19:26 <adrian_otto> agreed
16:19:30 <strigazi> Flannel will for sure be the default
16:19:32 <tonanhngo> Right, it depends on the use case
16:20:21 <strigazi> For CERN, for example, we don't have a use case at the moment, but we may in the future
16:21:03 <adrian_otto> kevinz based on your remarks above, it seems you are asking me not to target the blueprint for Pike. Do you think it's *possible* an initial version could be completed in Pike?
16:21:04 <strigazi> kevinz You plan to run kuryr in containers I guess, right?
16:21:52 <kevinz> adrian_otto: Yeah, I think it is a "possible" :-)
16:22:08 <adrian_otto> OK, so I'll target it for Pike, and if we need to change it later we can.
16:22:28 <kevinz> strigazi: Yeah, I think it's a good choice
16:22:57 <kevinz> adrian_otto: thx
16:22:58 <strigazi> kevinz if you need any help, testing, etc., don't hesitate to ping me
16:23:00 <adrian_otto> #action adrian_otto to target https://blueprints.launchpad.net/magnum/+spec/integrate-kuryr-kubernetes for Pike, with a comment that it's okay for this to slip to Queens if needed.
16:23:30 <ArchiFleKs> kevinz: I'd be happy to help
16:23:59 * adrian_otto nods approvingly to ArchiFleKs
16:24:06 <kevinz> strigazi: ArchiFleKs: That's great! Thanks a lot
16:24:35 <adrian_otto> Cool, this is a great feature that I'd love to include in Pike.
16:24:37 <strigazi> kevinz If you will be building new container images, have a look here; this is a way to make small Python-based images: https://github.com/strigazi/heat-container-agent/blob/alpine-3.5/Dockerfile
16:24:59 <adrian_otto> good idea strigazi
16:25:36 <kevinz> Cool
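
To illustrate the trunk-port question discussed above (16:16:59 onward): a minimal sketch of how a cluster node's trunk could be declared in a Heat template, assuming the OS::Neutron::Trunk resource that the then-unmerged Heat patch would add. The network names, segmentation ID, and the idea of handing the subport to kuryr are illustrative assumptions, not Magnum's actual driver templates; the kuryr_heat repository linked above shows a fuller approach.

    heat_template_version: 2016-10-14

    # Sketch only: assumes the OS::Neutron::Trunk resource from the in-progress
    # Heat patch; network names and segmentation details are illustrative.
    resources:
      parent_port:
        type: OS::Neutron::Port
        properties:
          network: private                # hypothetical node network

      container_subport:
        type: OS::Neutron::Port
        properties:
          network: container-net          # hypothetical network for container traffic

      node_trunk:
        type: OS::Neutron::Trunk
        properties:
          port: {get_resource: parent_port}             # parent port attached to the VM
          sub_ports:
            - port: {get_resource: container_subport}   # subport kuryr would consume
              segmentation_type: vlan
              segmentation_id: 101
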
16:26:15 <strigazi> I have something to propose that I don't like.
16:26:26 <strigazi> It is for the CI
16:26:34 <juggler> go for it strigazi
16:27:04 <strigazi> I personally don't trust the convergence of our functional tests very much, and the rate of "random" failures is very high
16:27:18 <strigazi> So, I end up testing many things manually anyway.
16:27:58 <strigazi> I think we can reduce some of the *voting* tests that create pods or containers
16:28:07 <strigazi> make them non-voting
16:28:17 <strigazi> And focus more on the API
16:28:39 <strigazi> Be sure that the magnum service doesn't break
16:28:51 <tonanhngo> Is there a way to automate the separate manual tests?
16:29:14 <juggler> or semi-automate them
16:29:16 <tonanhngo> Then we can run them ourselves
16:29:21 <strigazi> I'm working on adding 3rd party CIs, but I can't wait.
16:29:33 <strigazi> We can.
16:29:41 <strigazi> One way is ansible
16:29:58 <ArchiFleKs> strigazi: Is there a way to run only some tests elsewhere? I know I'm testing a lot manually also, but I think we could provide a limited number of concurrent instances at Osones with nested KVM for the CI
16:30:03 <strigazi> Have a playbook that creates a devstack and runs the tests
16:30:33 <ArchiFleKs> But it will be very limited; at least it will be fast and limit failures due to timeouts
16:30:43 <strigazi> Or a heat template for clouds with heat running
16:31:37 <strigazi> ArchiFleKs I think that can be complementary to our semi-manual/automatic tests.
16:32:12 <ArchiFleKs> strigazi: What about the CentOS CI?
16:32:20 <strigazi> I'm still waiting
16:32:39 <strigazi> Waiting for CoreOS to reply as well
16:32:54 <adrian_otto> #topic Open Discussion
16:33:23 <strigazi> So, what do you think about making some tests non-voting?
16:33:44 <adrian_otto> strigazi which ones?
16:34:34 <strigazi> For k8s and swarm. We will check that the stack is created, and this test will be voting. But not try to ping the API server
16:34:42 <strigazi> The same for swarm
16:35:17 <adrian_otto> In all honesty, I'm reluctant to do that.
16:35:27 <strigazi> For k8s specifically, even if our tests run successfully that doesn't mean anything
16:35:41 <adrian_otto> I'm worried that we will merge code with regressions that cause the COEs not to work
16:36:10 <adrian_otto> Well, suppose we break etcd, for example
16:36:32 <strigazi> That just means the API says OK. We don't test whether pods can be created, or more complicated features like service accounts
16:36:40 <adrian_otto> The func test for the affected driver should alert us to that
16:36:55 <strigazi> The tests will still run
16:37:00 <strigazi> but non-voting
16:37:08 <adrian_otto> But they fail most of the time now, I think
16:37:19 <adrian_otto> So we will become used to seeing those tests fail.
16:37:29 <strigazi> They fail on RAX, Bluebox and infracloud
16:37:39 <adrian_otto> and probably will begin ignoring them completely because they do not vote
16:37:42 <strigazi> They work on OSIC and OVH
16:37:59 <adrian_otto> yep
16:38:25 <adrian_otto> Does infra know this, and is there anything we can do to address the root cause of those failures?
16:38:39 <adrian_otto> I'm not satisfied that marking the tests as non-voting actually solves this.
16:38:41 <strigazi> adrian_otto I told them multiple times
16:39:10 <strigazi> I asked for nested virtualization and they don't provide it
16:39:20 <strigazi> They have it documented
16:39:25 <strigazi> just a sec
16:40:22 <strigazi> https://docs.openstack.org/infra/manual/testing.html#known-differences-to-watch-out-for
16:40:50 <strigazi> They say that with nested virt, some tests crash. For us it is the other way round
16:41:07 <strigazi> Almost the other way round
16:41:55 <strigazi> FYI, for k8s, we don't test etcd at the moment
16:42:25 <strigazi> So reducing the k8s voting tests won't cost anything.
16:43:57 <strigazi> Team?
16:44:15 <adrian_otto> I read the differences list.
16:44:36 <strigazi> 3rd bullet
16:45:22 <adrian_otto> Long ago we did discuss setting up Nova in our devstack to use the libvirt LXC driver
16:45:24 <strigazi> The infra PTL told me that since clouds don't offer nested virt as an attribute, they can't offer nodepools with that feature
16:45:43 <adrian_otto> That would work around this issue because no hypervisor virt would be required to run the tests
16:45:50 <tonanhngo> If we test these manually to verify they are not broken, it may be workable
16:46:32 <ArchiFleKs> adrian_otto: We'll have to test, but I've been playing with the nova-lxd driver, and running Docker inside can be painful and may not reflect the behavior inside KVM instances
16:46:53 <strigazi> fedora-atomic doesn't like lxd, I have tried
16:47:05 <adrian_otto> ArchiFleKs we can still test KVM instances on 3rd party CI per driver.
16:47:17 <strigazi> And we can't make decisions based on a slow CI
16:47:30 <adrian_otto> But I'd like to find a way to get around the nested virt problems for devstack and the basic gate tests
16:48:31 <strigazi> I'm talking specifically about 3 tests.
16:49:04 <adrian_otto> I heard some mumbling about "Elastic Recheck" at the summit
16:49:44 <strigazi> TestKubernetesAPIs.test_pod_apis, TestKubernetesAPIs.test_replication_controller_apis and TestKubernetesAPIs.test_service_apis
16:50:12 <adrian_otto> We discussed at one point automatically ordering rechecks for tests that fail on clouds we know to be problematic. I wonder if Elastic Recheck is for that.
16:50:20 <strigazi> I meant 4; this one too: test_start_stop_container_from_api
16:50:38 <strigazi> adrian_otto I'll have a look
16:51:03 <adrian_otto> strigazi have we asked infra if there is a way for us to blacklist nodepools for clouds we know are incompatible?
16:51:13 <adrian_otto> so that we don't even try to test on those?
16:51:15 <ArchiFleKs> adrian_otto: From what I saw on openstack-ansible, it can point you to bugs you might have encountered during CI
16:51:30 <ArchiFleKs> But I've not looked deeper into it
16:51:49 <strigazi> I will ask again
16:51:52 <adrian_otto> ArchiFleKs if you can find a pointer to that, I'd love to take a look at it
16:51:54 <strigazi> On the ML this time
16:52:19 <ArchiFleKs> adrian_otto: There is some here: http://status.openstack.org/elastic-recheck/
16:52:24 <ArchiFleKs> I'll try to find out more
16:52:31 <adrian_otto> Because we are just wasting the Foundation's money/resources running tests that we know won't pass
16:53:15 <strigazi> adrian_otto I can make this point, they may like it
16:53:28 <adrian_otto> thanks strigazi
16:53:44 <adrian_otto> Okay, we are scheduled to end our team meeting in just a few minutes.
16:53:54 <adrian_otto> Any other remarks before we wrap up?
16:54:01 <ArchiFleKs> I'll ask on openstack-ansible; I wonder how they are doing their tests, but they must have some tests that require nested KVM and have encountered similar issues
16:54:18 <juggler> good from this side
16:54:22 <ArchiFleKs> I wanted to add something but I don't know if we have time
16:54:50 <adrian_otto> Thanks everyone for attending. Our next team meeting will be 2017-05-30 at 1600 UTC here in #openstack-meeting-alt :-)
16:54:56 <adrian_otto> #endmeeting
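
Following up on the idea raised above of a playbook that creates a devstack and runs the functional tests outside the gate: a minimal sketch, assuming a node reachable as magnum-ci with the usual devstack 'stack' user, a local.conf kept next to the playbook that enables the magnum devstack plugin, and the functional tox environments in the magnum tree. The hostname, paths, and tox environment name are assumptions, not an existing playbook in the Magnum repository.

    # Sketch only: host name, paths, and the tox env name are assumptions.
    - hosts: magnum-ci
      become: true
      tasks:
        - name: Clone devstack
          git:
            repo: https://git.openstack.org/openstack-dev/devstack
            dest: /opt/stack/devstack

        - name: Install a local.conf that enables the magnum plugin
          copy:
            src: local.conf                    # assumed to live next to the playbook
            dest: /opt/stack/devstack/local.conf

        - name: Build the devstack (long-running)
          command: ./stack.sh
          args:
            chdir: /opt/stack/devstack
          become_user: stack                   # assumes the usual devstack 'stack' user

        - name: Run the magnum functional tests against the new devstack
          command: tox -e functional-k8s       # assumed tox environment name
          args:
            chdir: /opt/stack/magnum
          become_user: stack

A heat template that boots the same node and runs the steps via user data would be the equivalent option for clouds that already run heat, as suggested in the discussion.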