16:04:25 <adrian_otto> #startmeeting containers
16:04:25 <openstack> Meeting started Tue May 16 16:04:25 2017 UTC and is due to finish in 60 minutes.  The chair is adrian_otto. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:04:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:04:28 <openstack> The meeting name has been set to 'containers'
16:04:44 <adrian_otto> #link https://wiki.openstack.org/wiki/Meetings/Containers#Agenda_for_2017-05-16_1600_UTC Our Agenda
16:04:51 <adrian_otto> #topic Roll Call
16:04:53 <SamYaple> o/
16:04:54 <adrian_otto> Adrian Otto
16:04:57 <juggler> Perry Rivera
16:04:58 <strigazi> Spyros Trigazis
16:04:59 <tonanhngo> Ton Ngo
16:04:59 <kevinz> kevinz
16:05:31 <ArchiFleKs> Kevin Lefevre
16:05:44 <adrian_otto> hello SamYaple juggler strigazi tonanhngo kevinz and ArchiFleKs
16:05:57 <adrian_otto> let's begin
16:06:03 <adrian_otto> #topic Announcements
16:06:12 <adrian_otto> 1) The OpenStack summit was in Boston last week.
16:06:42 <adrian_otto> I know that strigazi and tonanhngo attended. Are there any other participants present who attended as well?
16:06:53 <ArchiFleKs> I was here also
16:07:11 <SamYaple> i was there
16:07:28 <adrian_otto> It was great to see you both! Thanks for joining.
16:08:09 <adrian_otto> 2) Reminder that we switched from a weekly meeting cadence to alternating weeks.
16:08:36 <adrian_otto> If you ever want to know the meeting schedule, it is published here:
16:08:42 <adrian_otto> #link https://wiki.openstack.org/wiki/Meetings/Containers Meeting Schedule
16:08:55 <adrian_otto> Any other announcements from team members?
16:09:14 <strigazi> I'm getting
16:09:25 <strigazi> the k8s and swarm drivers for Newton up to date
16:09:39 <adrian_otto> it also might be worth mentioning that we reached a consensus to streamline our Cluster Upgrades feature.
16:10:02 <strigazi> For security reasons we need to have secure images and up to date kernels
16:10:07 <adrian_otto> so rather than being a major refactor, we will add cluster upgrades as a feature without redesigning the API at the same time.
16:10:55 <adrian_otto> thanks strigazi for the image update work
16:11:08 <juggler> +1
16:11:32 <strigazi> That means we'll have a new API endpoint for upgrades, but the rest of the API won't be affected; minor changes only
16:12:05 <adrian_otto> I'm very happy with this approach. We can discuss it further in Open Discussion if there are concerns to address.
16:12:16 <adrian_otto> #topic Review Action Items
16:12:30 <adrian_otto> I believe this is a (none) but I am taking a quick look to verify that.
16:12:59 <adrian_otto> 1) ACTION: adrian_otto to update team meeting schedule to */2 weeks (juggler, 16:21:41)
16:13:06 <adrian_otto> Status: Complete
16:13:30 <adrian_otto> #topic Blueprints/Bugs/Reviews/Ideas
16:13:50 <adrian_otto> In this section any team member may raise a work item for discussion
16:13:57 <adrian_otto> or request status
16:14:25 <adrian_otto> if we don't have any inquiries in a minute or two, we can enter Open Discussion
16:14:57 <kevinz> I have one
16:15:02 <kevinz> Hello
16:15:06 <adrian_otto> proceed kevinz
16:15:43 <kevinz> I have a BP for kuryr-kubernetes integration with magnum
16:15:45 <kevinz> https://blueprints.launchpad.net/magnum/+spec/integrate-kuryr-kubernetes
16:16:30 <kevinz> But I'm not sure whether this BP can be implemented. Are we ready to integrate the kuryr service?
16:16:42 <adrian_otto> kevinz do you think you can implement this in the Pike timeframe?
16:16:59 <strigazi> kevinz The only problem is how to add trunk ports with heat
16:17:02 <adrian_otto> The problem with this integration is that it opens a fundamental security hole
16:17:27 <adrian_otto> so it would need to be off by default, with a sensible warning in the documentation about the security risk of enabling it
16:17:32 <kevinz> Yeah the heat trunk port patch has not merged
16:17:37 <strigazi> the PTL of kuryr has an example, but I'm not sure how this can be easily added to magnum
16:17:48 <strigazi> let me fetch the example for you
16:17:49 <adrian_otto> basically anyone with access to the hosts that run this code has full access to your Neutron API
16:17:52 <kevinz> OK
16:18:28 <strigazi> kevinz https://github.com/danielmellado/kuryr_heat
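For context, a rough sketch of how a trunk port might be declared in a Heat template once the in-review trunk support merges; the OS::Neutron::Trunk resource name and its properties follow the proposed patch, and the network names are placeholders, so treat all of it as an assumption rather than a finished interface:

    resources:
      parent_port:
        type: OS::Neutron::Port
        properties:
          network: private            # parent port attached to the cluster node
      container_port:
        type: OS::Neutron::Port
        properties:
          network: container-net      # subport that carries container traffic
      node_trunk:
        type: OS::Neutron::Trunk
        properties:
          port: { get_resource: parent_port }
          sub_ports:
            - port: { get_resource: container_port }
              segmentation_type: vlan
              segmentation_id: 101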
16:18:36 <tonanhngo> It's probably worth proceeding in parallel while the issues are sorted out
16:18:41 <kevinz> adrian_otto: In Pike I think it's hard to finish, since Heat is not ready
16:18:54 <strigazi> adrian_otto We can leverage the trust users and highlight the security implications
16:18:56 <kevinz> strigazi: Thanks
16:19:08 <adrian_otto> so my suggestion is to go ahead and implement it labeled EXPERIMENTAL with a security disclaimer, and then once we have that as a working feature (default off), we can add the required work to secure it.
16:19:13 <strigazi> In some use cases it might not be a problem.
16:19:26 <adrian_otto> agreed
16:19:30 <strigazi> Flannel will be for sure the default
16:19:32 <tonanhngo> Right, it depends on the use case
16:20:21 <strigazi> At CERN, for example, we don't have a use case at the moment, but we may in the future
16:21:03 <adrian_otto> kevinz based on your remarks above, it seems you are asking me not to target the blueprint for Pike. Do you think it's *possible* an initial version could be completed in Pike?
16:21:04 <strigazi> kevinz You plan to run kuryr in containers I guess right?
16:21:52 <kevinz> adrian_otto: Yeah I think it is a "possible" :-)
16:22:08 <adrian_otto> ok, so I'll target it for Pike, and if we need to change it later we can.
16:22:28 <kevinz> strigazi: Yeah, I think it's a good choice
16:22:57 <kevinz> adrian_otto: thx
16:22:58 <strigazi> kevinz if you need any help, testing etc., don't hesitate to ping me
16:23:00 <adrian_otto> #action adrian_otto to target https://blueprints.launchpad.net/magnum/+spec/integrate-kuryr-kubernetes for Pike, with a comment that it's okay for this to slip to Queens if needed.
16:23:30 <ArchiFleKs> kevinz: I'd be happy to help
16:23:59 * adrian_otto nods approvingly to ArchiFleKs
16:24:06 <kevinz> strigazi: ArchiFleKs: That's great! Thanks a lot
16:24:35 <adrian_otto> cool, this is a great feature that I'd love to include in Pike.
16:24:37 <strigazi> kevinz If you build new container images, have a look here; this is a way to make small Python-based images: https://github.com/strigazi/heat-container-agent/blob/alpine-3.5/Dockerfile
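A minimal sketch of that approach for anyone not following the link; the package names and agent script are placeholders, not the actual heat-container-agent contents:

    FROM alpine:3.5

    # Install Python and pip, add the required Python packages, then remove pip
    # so the final image stays small; the package list below is illustrative only.
    RUN apk add --no-cache python py-pip \
        && pip install --no-cache-dir <python-deps-for-the-agent> \
        && apk del py-pip

    COPY agent.py /srv/agent.py
    CMD ["python", "/srv/agent.py"]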
16:24:59 <adrian_otto> good idea strigazi
16:25:36 <kevinz> Cool
16:26:15 <strigazi> I have something to propose that I don't like.
16:26:26 <strigazi> It is for the CI
16:26:34 <juggler> go for it strigazi
16:27:04 <strigazi> I personally don't trust the convergence of our functional tests very much, and the rate of "random" failures is very high
16:27:18 <strigazi> So, I end up testing many things manually anyway.
16:27:58 <strigazi> I think we can reduce some of the *voting* tests that create pods or containers
16:28:07 <strigazi> make them non-voting
16:28:17 <strigazi> And focus more on the API
16:28:39 <strigazi> Be sure that the magnum service doesn't break
16:28:51 <tonanhngo> Is there a way to automate the separate manual tests?
16:29:14 <juggler> or semi-automate them
16:29:16 <tonanhngo> Then we can run them ourselves
16:29:21 <strigazi> I'm working on adding 3rd party CIs, but I can't wait.
16:29:33 <strigazi> We can.
16:29:41 <strigazi> One way is ansible
16:29:58 <ArchiFleKs> strigazi: Is there a way to run only some tests elsewhere? I know I'm testing a lot manually also, but I think we could provide a limited number of concurrent instances at Osones with nested KVM for the CI
16:30:03 <strigazi> Have a playbook that creates devstack and runs the tests
16:30:33 <ArchiFleKs> It will be very limited, but at least it will be fast and limit failures due to timeouts
16:30:43 <strigazi> Or a heat template for clouds with heat running
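As a strawman for that playbook idea, something along these lines could stand up devstack and kick off the functional tests; the inventory group, install paths, and tox environment name are assumptions and would need adjusting:

    - hosts: devstack-node          # assumed inventory group with a fresh VM
      vars:
        devstack_dir: /opt/stack/devstack
      tasks:
        - name: Clone devstack
          git:
            repo: https://git.openstack.org/openstack-dev/devstack
            dest: "{{ devstack_dir }}"

        - name: Drop in a local.conf that enables the magnum plugin
          copy:
            src: local.conf
            dest: "{{ devstack_dir }}/local.conf"

        - name: Run stack.sh as the (non-root) stack user
          command: ./stack.sh
          args:
            chdir: "{{ devstack_dir }}"

        - name: Run the magnum functional tests
          command: tox -e functional-api
          args:
            chdir: /opt/stack/magnum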
16:31:37 <strigazi> ArchiFleKs I think that can be complementary to our semi-manual/automatic tests.
16:32:12 <ArchiFleKs> strigazi: what about the centos ci?
16:32:20 <strigazi> I'm still waiting
16:32:39 <strigazi> Waiting for CoreOS to reply as well
16:32:54 <adrian_otto> #topic Open Discussion
16:33:23 <strigazi> So, what do you think about making some tests non-voting?
16:33:44 <adrian_otto> strigazi which ones?
16:34:34 <strigazi> For k8s and swarm: we will check that the stack is created, and this test will be voting, but we won't try to ping the API server
16:34:42 <strigazi> the same for swarm
16:35:17 <adrian_otto> in all honesty, I'm reluctant to do that.
16:35:27 <strigazi> For k8s specifically, even if our tests run successfully, that doesn't mean anything
16:35:41 <adrian_otto> I'm worried that we will merge code with regressions that cause the COEs not to work
16:36:10 <adrian_otto> well, suppose we break etcd, for example
16:36:32 <strigazi> That means the API says OK. We don't test whether pods can be created, or more complicated features like service accounts
16:36:40 <adrian_otto> the func test for the affected driver should alert us to that
16:36:55 <strigazi> The tests will still run
16:37:00 <strigazi> but non voting
16:37:08 <adrian_otto> but they fail most of the time now, I think
16:37:19 <adrian_otto> so we will become used to seeing those tests fail.
16:37:29 <strigazi> they fail on RAX, bluebox and infracloud
16:37:39 <adrian_otto> and probably will begin ignoring them completely because they do not vote
16:37:42 <strigazi> they work on OSIC and OVH
16:37:59 <adrian_otto> yep
16:38:25 <adrian_otto> does infra know this, and is there anything we can do to address the root cause of those failures?
16:38:38 <strigazi> adrian_otto I told them multiple times
16:38:39 <adrian_otto> I'm not satisfied that marking the tests as nonvoting actually solves this.
16:39:10 <strigazi> I asked for nested virtualization and they don't provide it
16:39:20 <strigazi> They have it documented
16:39:25 <strigazi> just a sec
16:40:22 <strigazi> https://docs.openstack.org/infra/manual/testing.html#known-differences-to-watch-out-for
16:40:50 <strigazi> They say that with nested virt, some tests crash. For us it is the other way round
16:41:07 <strigazi> Almost the other way round
16:41:55 <strigazi> FYI, for k8s, we don't test etcd at the moment
16:42:25 <strigazi> So reducing the k8s voting tests won't cost anything.
16:43:57 <strigazi> team?
16:44:15 <adrian_otto> I read the differences list.
16:44:36 <strigazi> 3rd bullet
16:45:22 <adrian_otto> long ago we did discuss setting up Nova in our devstack to use the libvirt LXC driver
16:45:24 <strigazi> The infra PTL told me that since clouds don't offer nested virt as an attribute, they can't offer nodepools with that feature
16:45:43 <adrian_otto> that would work around this issue because no hypervisor virt would be required to run the tests
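A partial local.conf sketch of that workaround, assuming the standard devstack LIBVIRT_TYPE knob and the magnum devstack plugin; untested here, so treat it as a starting point only:

    [[local|localrc]]
    # Have nova's libvirt driver boot LXC containers instead of KVM/QEMU guests,
    # so the gate nodes don't need nested hardware virtualization.
    LIBVIRT_TYPE=lxc
    enable_plugin magnum https://git.openstack.org/openstack/magnum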
16:45:50 <tonanhngo> If we test these manually to verify they are not broken, it may be workable
16:46:32 <ArchiFleKs> adrian_otto: we'll have to test, but I've been playing with the nova-lxd driver, and running Docker inside can be painful and may not reflect the behavior inside KVM instances
16:46:53 <strigazi> fedora-atomic doesn't like lxd, I have tried
16:47:05 <adrian_otto> ArchiFleKs we can still test kvm instances on 3rd party CI per driver.
16:47:17 <strigazi> And we can't make decisions based on a slow CI
16:47:30 <adrian_otto> but I'd like to find a way to get around the nested virt problems for devstack and the basic gate tests
16:48:31 <strigazi> I'm talking specifically about 3 tests.
16:49:04 <adrian_otto> I heard some mumbling about "Elastic Recheck" at the summit
16:49:44 <strigazi> TestKubernetesAPIs.test_pod_apis, TestKubernetesAPIs.test_replication_controller_apis and TestKubernetesAPIs.test_service_apis
16:50:12 <adrian_otto> We discussed at one point automatically ordering rechecks for jobs that fail on clouds we know to be problematic. I wonder if Elastic Recheck is for that.
16:50:20 <strigazi> I meant 4; this one too: test_start_stop_container_from_api
16:50:38 <strigazi> adrian_otto I'll have a look
16:51:03 <adrian_otto> strigazi have we asked infra if there is a way for us to blacklist nodepools for clouds we know are incompatible?
16:51:13 <adrian_otto> so that we don't even try to test on those?
16:51:15 <ArchiFleKs> adrian_otto: from what I saw on openstack-ansible, it can point you to bugs you might have encountered during CI
16:51:30 <ArchiFleKs> But I haven't looked deeper into it
16:51:49 <strigazi> I will ask again
16:51:52 <adrian_otto> ArchiFleKs if you can find a pointer to that, I'd love to take a look at it
16:51:54 <strigazi> On the ML this time
16:52:19 <ArchiFleKs> adrian_otto: there is some info here http://status.openstack.org/elastic-recheck/
16:52:24 <ArchiFleKs> I'll try to find out more
16:52:31 <adrian_otto> because we are just wasting the Foundation's money/resources running tests that we know won't pass
16:53:15 <strigazi> adrian_otto I can make this point, they may like it
16:53:28 <adrian_otto> thanks strigazi
16:53:44 <adrian_otto> okay, we are scheduled to end our team meeting in just a few minutes.
16:53:54 <adrian_otto> any other remarks before we wrap up?
16:54:01 <ArchiFleKs> I'll ask on openstack-ansible, but I wonder how they are doing their tests; they must do some tests that require nested KVM and have encountered similar issues
16:54:18 <juggler> good from this side
16:54:22 <ArchiFleKs> I wanted to add something but I don't know if we have time
16:54:50 <adrian_otto> Thanks everyone for attending. Our next team meeting will be 2017-05-30 at 1600UTC here in #openstack-meeting-alt :-)
16:54:56 <adrian_otto> #endmeeting