03:00:26 #startmeeting zun
03:00:28 Meeting started Tue Jul 26 03:00:26 2016 UTC and is due to finish in 60 minutes. The chair is hongbin. Information about MeetBot at http://wiki.debian.org/MeetBot.
03:00:29 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
03:00:31 The meeting name has been set to 'zun'
03:00:33 #link https://wiki.openstack.org/wiki/Zun#Agenda_for_2016-07-26_0300_UTC Today's agenda
03:00:38 #topic Roll Call
03:00:47 Wenzhi Yu
03:00:49 Madhuri Kumari
03:00:52 Shubham
03:00:56 Namrata
03:00:59 hi
03:02:04 o/
03:02:21 Thanks for joining the meeting Wenzhi mkrai shubhams_ Namrata yanyanhu flwang
03:02:27 #topic Announcements
03:02:32 We have a CLI now!
03:02:37 #link https://github.com/openstack/python-zunclient
03:02:45 #link https://review.openstack.org/#/c/337360/ service-list command is supported
03:02:49 cool
03:02:51 #link https://review.openstack.org/#/c/344594/ it is enabled in devstack
03:03:00 great
03:03:10 Thanks mkrai for the work :)
03:03:23 bravo
03:03:29 My pleasure :)
03:03:37 #topic Review Action Items
03:03:43 1. hongbin investigate an option for message passing (DONE)
03:03:49 #link https://etherpad.openstack.org/p/zun-container-state-management
03:04:01 We have a session below to discuss it
03:04:07 #topic Re-consider RabbitMQ. How about using key/value store (i.e. etcd) for passing messages (shubhams)
03:04:12 #link https://etherpad.openstack.org/p/zun-container-state-management
03:04:15 Here you go
03:05:01 Want a few minutes to work on the etherpad?
03:05:05 hongbin, thanks for the work
03:05:14 np
03:05:21 But it seems we haven't looked at taskflow
03:05:36 taskflow looks more high level
03:05:42 Shall we also look at it and then decide on a final option?
03:06:23 mkrai: do you have an idea about how to use taskflow?
03:06:31 No, not yet
03:06:36 But I can look at it
03:06:47 ok
03:06:58 taskflow has more use cases for a component like heat: dividing each operation into small tasks and then working on them
03:07:37 Sounds like it does not conflict with the choice of data store
03:07:57 You can have taskflows for tasks, but not for the data store
03:07:58 Yes, more of message passing
03:08:32 not sure for message passing
03:08:47 anyway
03:08:56 IMO rabbitmq is reliable for message passing
03:09:04 true
03:09:12 And etcd for storage
03:09:31 and it's easy to implement by importing oslo.messaging
03:09:46 Yes, it is
03:10:04 o/
03:10:10 sudipto: hey
03:10:27 sudipto: we are discussing this etherpad: https://etherpad.openstack.org/p/zun-container-state-management
03:10:39 hongbin, thanks. Looking.
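As a reference for the oslo.messaging suggestion above (03:09:31), here is a minimal sketch of an RPC server/client pair over RabbitMQ. This is not Zun code: the 'zun-compute' topic, the endpoint class, and the method name are all hypothetical.

```python
# A minimal, hypothetical sketch (not Zun code): RPC over RabbitMQ via
# oslo.messaging. The 'zun-compute' topic and the endpoint are made up.
from oslo_config import cfg
import oslo_messaging as messaging

class ComputeEndpoint(object):
    """Server-side endpoint; each public method is callable over RPC."""
    def container_create(self, ctxt, container_id):
        print('creating container %s' % container_id)

def run_server():
    # The transport URL (e.g. rabbit://...) is read from the service config
    transport = messaging.get_transport(cfg.CONF)
    target = messaging.Target(topic='zun-compute', server='host-1')
    server = messaging.get_rpc_server(transport, target, [ComputeEndpoint()],
                                      executor='blocking')
    server.start()
    server.wait()

def run_client():
    transport = messaging.get_transport(cfg.CONF)
    client = messaging.RPCClient(transport,
                                 messaging.Target(topic='zun-compute'))
    # cast() is fire-and-forget; call() would wait for a return value
    client.cast({}, 'container_create', container_id='c-1')
```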
03:11:02 OK. Maybe I could summarize
03:11:07 For data store
03:11:16 The choice is between etcd and db
03:11:34 Both have pros and cons
03:11:45 For options for passing messages
03:11:55 etcd is not designed for message passing but rabbitmq is
03:11:55 The choice is http or message queue
03:12:09 Wenzhi: yes
03:12:34 I think it's risky to employ etcd for message passing
03:13:05 No, etcd is not for passing messages
03:13:37 Wenzhi, I think that depends on how we design the communication mechanism between different modules
03:14:06 with etcd, watching on configuration changes is used to support it
03:14:21 but in most OpenStack services, messaging is the approach applied
03:15:10 yanyanhu: good summary
03:15:11 I'm also not so sure which one is better, but we do have two choices here
03:15:34 yanyanhu: yes I agree, etcd also could be used for message passing, but I don't think it's as reliable as rabbitmq
03:15:50 as I said, it's not designed for this
03:16:28 I see.
03:16:39 There are ways to deal with the unreliability I think
03:16:41 just have little experience with them
03:17:07 Yes, maybe we are sure about rabbitmq because it is used in OpenStack
03:17:16 And that's not the case with etcd
03:17:17 For example, periodically sync data from the data store
03:17:50 mkrai: good point. rabbitmq is easier to understand for us maybe
03:18:10 Then, I would suggest having both
03:18:29 We can abstract the message passing
03:18:34 hongbin: option #3 in the etherpad?
03:18:43 novice question w.r.t the etherpad - when you schedule the container - you are storing the 'data' of the container in the datastore - what is this data? The provisioning request? When you create the container - host agents again store back the 200 OK or SUCCESS states back to the data store?
03:18:43 shubhams_: yes
03:19:20 sudipto: I copied the idea from k8s
03:19:28 sudipto: here is how it works (i think)
03:19:34 sudipto, Is it architecture #1?
03:19:45 mkrai, either #1 or #3
03:19:51 sudipto: the api server writes the pod to etcd
03:20:21 sudipto: the scheduler assigns the pod to a node, and writes it to etcd, with state "pending"
03:20:28 It is the scheduling request
03:21:00 sudipto: kubelet watches the node, figures out there is a new pod
03:21:16 hongbin, and this figuring out - happens through some kind of polling?
03:21:24 sudipto: then, creates the pod. When finished, writes the state as "running"
03:21:42 sudipto: In k8s, they do both
03:21:56 sudipto: they call it list and watch
03:22:00 hongbin, or is it some kind of a rpc call back?
03:22:16 ok
03:22:30 sudipto: they periodically pull data from the api, and also watch data change
03:22:45 hongbin, ok
03:22:59 The watch will figure out there is a new pod
03:23:18 data change == data channel?
03:23:19 If the watch is missed, it can be caught in the next pull
03:23:39 I guess yes, it is basically a push
03:24:25 if kubelet watches a node, kube-apiserver will push an event to kubelet
03:24:50 I guess it happens via a http socket
03:24:59 hongbin, ok... however, it does sound like - we will re-implement a scheduler in zun then?
03:25:20 sudipto, which i think is mostly needed since we want to do container scheduling and the reference architecture.
03:25:35 sudipto: yes
03:25:49 hongbin, but i guess it should have some bypass mechanism while we are doing the COE support? Or should we just not think about it atm?
03:26:12 No, this is not for COE
03:26:19 This is the runtime architecture
03:26:31 which we call "referenced COE"
03:26:38 hongbin, alrite. The runtime architecture which is the referenced one.
03:26:44 yes
03:26:47 hongbin, got it!
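For reference, the list-and-watch flow described above could look roughly like the following, assuming the third-party etcd3 Python client; the key layout and state values are invented for illustration, not an agreed Zun schema.

```python
# Sketch of the k8s-style "list and watch" pattern against etcd, using the
# third-party etcd3 client; keys and states here are hypothetical.
import json
import etcd3

client = etcd3.client(host='127.0.0.1', port=2379)
PREFIX = '/containers/host-1/'

def schedule(container_id, spec):
    # API/scheduler side: persist the assignment with state "pending"
    client.put(PREFIX + container_id,
               json.dumps({'spec': spec, 'state': 'pending'}))

def handle(key, record):
    if record['state'] == 'pending':
        # ... create the container here, then record the result ...
        record['state'] = 'running'
        client.put(key, json.dumps(record))

def agent():
    # "List": a periodic full read catches any watch event that was missed
    for value, meta in client.get_prefix(PREFIX):
        handle(meta.key.decode(), json.loads(value.decode()))
    # "Watch": etcd pushes subsequent changes under the prefix
    events, cancel = client.watch_prefix(PREFIX)
    for event in events:
        if event.value:  # ignore delete events in this sketch
            handle(event.key.decode(), json.loads(event.value.decode()))
```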
03:27:17 hongbin, the model does seem nice to follow for me. The only thing though is - this is a bit orthogonal to openstack
03:27:47 sounds like a lot of investment from development per se - but may be it's what we need.
03:27:50 sudipto: yes, it is totally new for openstack devs
03:28:39 OK. Everyone, do you have a choice or need more time to decide?
03:29:13 I think option #3 is better
03:29:14 silent.....
03:29:27 agree with #3
03:29:31 But nothing can be said now as it all depends on performance
03:29:43 mkrai, +1
03:30:05 Then, I think option #3 is more flexible
03:30:10 agree with #3
03:30:17 #3 with a proof of concept
03:30:19 if we find performance is not good, we can switch the backends
03:30:27 ok
03:30:38 sounds like we agreed on #3
03:30:42 +1
03:30:43 Yes, the architecture remains the same
03:31:01 #agreed try option #3 in https://etherpad.openstack.org/p/zun-container-state-management
03:31:05 hongbin, the language of choice to code - is still python right?
03:31:15 sudipto: yes python
03:31:19 with the Global Interpreter Lock :D
03:31:32 ??
03:31:59 gah - sorry about that. Basically I was talking about the slowness of python.
03:32:11 haha
03:32:32 We are an OpenStack project, I guess we don't have a choice for the language
03:32:34 :)
03:32:41 even if it is slow
03:32:57 OK. Advance topic
03:32:59 #topic Runtimes API design
03:33:05 #link https://blueprints.launchpad.net/zun/+spec/api-design The BP
03:33:09 I have almost forgotten everything I learned about other languages since I started working on openstack :)
03:33:10 #link https://etherpad.openstack.org/p/zun-containers-service-api The etherpad
03:33:17 #link https://etherpad.openstack.org/p/zun-containers-service-api-spec The spec
03:33:39 I worked on the api patch last week but it is not yet ready
03:33:59 To test it we first need to finalize the backend services
03:34:26 Like what services will we have: conductor or compute or both?
03:34:44 mkrai: we just decided to go with #3
03:34:54 mkrai: that means no conductor basically
03:35:10 Ok, so compute and scheduler
03:35:13 mkrai: just api and agent (compute)
03:35:30 mkrai: yes
03:35:55 Ok, so I need to work on the compute agent as it doesn't have the docker api calls
03:36:12 This makes things simpler
03:36:36 I will work on the same
03:36:56 mkrai, can you explain what you meant by "doesn't have the docker api calls"?
03:37:35 sudipto, the compute code that has the essential container-related actions is not yet implemented
03:37:43 Like docker container create, etc.
03:37:58 mkrai, sigh! My bad. Got it.
03:38:10 I have a patch upstream for conductor that has all this
03:38:22 Now I will work on adding it to compute
03:38:25 :)
03:38:37 your calls to docker APIs were in the conductor?
03:38:38 mkrai: I'll help on that
03:38:46 Yes sudipto
03:39:07 We are following the openstack architecture and that was bad
03:39:18 s/are/were
03:39:28 oh - that doesn't sound like the place to put them anyway right? Since you would always have that in the compute agent - with the new design or otherwise.
03:39:47 Yes, that patch was submitted before the compute agent, I remember
03:39:52 yeah ok...
03:39:56 So that's why
03:40:09 i am up for doing code reviews - and helping wherever i can.
03:40:22 Thanks sudipto :)
03:40:36 hongbin, that's all from my side.
03:40:42 thanks mkrai
03:40:45 the design of the compute agent probably also should go into an etherpad?
03:40:58 before we start realizing it with code?
03:41:14 mkrai: this is the bp for compute agent https://blueprints.launchpad.net/zun/+spec/zun-host-agent
03:41:17 since i would imagine a stevedore plugin or some such thing to load drivers?
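A sketch of the stevedore-style driver loading suggested just above; the 'zun.container.driver' entry-point namespace and the create() interface are hypothetical, not an agreed design.

```python
# Sketch of loading a pluggable container driver with stevedore; the
# namespace and driver interface below are invented for illustration.
from stevedore import driver

class DockerDriver(object):
    """Example plugin. It would be registered in setup.cfg, e.g.:
    [entry_points]
    zun.container.driver =
        docker = zun.container.docker_driver:DockerDriver
    """
    def create(self, name, image):
        pass  # calls into the Docker API (e.g. via docker-py) would go here

def load_container_driver(name='docker'):
    mgr = driver.DriverManager(
        namespace='zun.container.driver',  # entry-point group to search
        name=name,                         # which plugin, e.g. from config
        invoke_on_load=True)               # instantiate the class on load
    return mgr.driver

# container_driver = load_container_driver()
# container_driver.create('test', 'cirros')
```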
03:41:26 Thanks Wenzhi
03:41:26 sudipto: I guess we can code first
03:41:37 I will connect with you later
03:41:47 okay
03:42:14 hongbin, i guess we should design first :) but that's me.
03:42:33 sudipto: The compute agent is basically implementation details, which cannot be explained clearly in an etherpad
03:42:45 since we have now adopted arch #3, which will vary from the traditional openstack projects.
03:43:04 sudipto: yes, maybe
03:43:08 but either way - i am good.
03:43:17 Yes, agree sudipto
03:43:32 sudipto: OK, then I will leave it to the contributor
03:43:50 sudipto: if the contributor would like to have the design in an etherpad, that works
03:43:56 hongbin, ok - shall be eager to work with mkrai and Wenzhi to have that in place.
03:44:01 sudipto: if the contributor wants to code first, that also works
03:44:13 sudipto: k
03:44:16 either works for me
03:44:27 ok
03:44:51 great progress so far :)
03:45:02 yes
03:45:06 let's advance topic
03:45:11 #topic COE API design
03:45:17 #link https://etherpad.openstack.org/p/zun-coe-service-api Etherpad
03:45:36 I guess we can leave the etherpad as homework
03:45:43 and discuss the details there
03:46:17 Anything you guys want to discuss about the COE api?
03:46:17 agree
03:46:46 I think we can discuss later
03:46:52 k
03:46:57 #topic Nova integration
03:47:01 #link https://blueprints.launchpad.net/zun/+spec/nova-integration The BP
03:47:06 #link https://etherpad.openstack.org/p/zun-containers-nova-integration The etherpad
03:47:19 hongbin, not related to the API design as such, but i think - we can have a way to plug the scheduler as well. For instance - we could plug a mesos scheduler and work with runtimes above it?
03:47:34 sudipto: good point
03:48:19 sudipto: I think this is a good idea; besides you, there are other people asking for that
03:48:32 sudipto, That's specific only to the COEs?
03:48:52 Let's discuss that in open discussion
03:49:05 mkrai, I would guess so - unless we build a zun runtime - that could be plugged to the mesos scheduler as well. So yeah, open discussions then.
03:49:05 Now nova integration :)
03:49:40 Namrata_: are you the one who volunteered to work on Nova integration?
03:49:44 yes
03:49:58 Namrata_: anything you want to update the team on?
03:49:58 i have gone through the ironic code
03:50:27 Nova conductor and scheduler are tightly coupled
03:50:31 https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L470
03:50:56 as we are considering a scheduler for zun
03:51:13 we have two schedulers
03:51:28 imo we will give preference to zun's scheduler
03:51:43 and discard the host info given by nova's scheduler
03:51:49 Yes, that's a pitfall for us
03:52:21 yanyanhu: what do you think?
03:52:25 i thought we were looking at it more like a replacement to the nova-docker driver? Is that still true?
03:52:38 Yes sudipto
03:52:55 But nova-docker doesn't have any scheduler
03:53:09 mkrai, yeah coz it's yet another compute driver to nova.
03:53:26 but i keep forgetting... the reason we want to think like ironic is because of the dual scheduler thingy?
03:53:47 For Ironic, it uses the nova scheduler
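For context on the nova-docker comparison above: a nova-docker-style integration would roughly take the shape of a Nova virt driver, in which case nova-scheduler has already picked the host before the driver runs, which is the dual-scheduler tension being discussed. A hypothetical sketch only; ZunClient and every call on it are invented.

```python
# Hypothetical sketch of a nova-docker-style virt driver; ZunClient and its
# methods do not exist and are invented for illustration.
from nova.virt import driver

class ZunDriver(driver.ComputeDriver):
    """A compute driver that would delegate container lifecycle to Zun."""

    def __init__(self, virtapi):
        super(ZunDriver, self).__init__(virtapi)
        self.client = ZunClient()  # hypothetical handle to Zun's API

    def spawn(self, context, instance, image_meta, injected_files,
              admin_password, network_info=None, block_device_info=None):
        # By the time spawn() runs, nova-scheduler has already chosen this
        # host -- exactly the dual-scheduler question raised above.
        self.client.container_create(name=instance.uuid,
                                     image=image_meta.name)

    def destroy(self, context, instance, network_info,
                block_device_info=None, destroy_disks=True):
        self.client.container_delete(name=instance.uuid)
```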
03:53:54 hongbin, sorry, just trapped by some other stuff
03:54:08 yanyanhu: we are discussing the scheduler
03:54:28 about the scheduler, if we can reuse the nova scheduler, that would be the best, I feel
03:54:35 but not sure when they will split it out
03:54:39 finally
03:55:30 I think it's not easy for us to reuse the nova scheduler
03:55:40 i am wondering - the API design we just had for the zun runtime - doesn't necessarily map to the nova VM states - so how do we want to deal with it?
03:55:48 a container is not like a server (virtual or physical)
03:56:00 Wenzhi, yes, they behave differently
03:56:09 yes
03:56:27 So does it mean the schedulers also differ?
03:56:28 ironic uses the nova scheduler because physical servers are also servers
03:56:31 just feel the scheduling mechanism could be similar
03:56:56 they can fit into the nova data model, but containers cannot, right?
03:57:06 just the placement decision
03:57:10 Wenzhi, precisely the point i thought...
03:57:35 #topic Open Discussion
03:57:43 sudipto, Doesn't the scheduler also?
03:57:46 based on resource capability and also other advanced strategy
03:58:01 s/capability/capacity
03:58:24 mkrai, the scheduler also - does not fit for containers, you mean?
03:58:31 Yes
03:58:52 I mean the nova scheduler for containers
03:59:04 mkrai, you mean for nova-docker?
03:59:09 mkrai, yeah - i would guess so. It's a rabbit hole there.
03:59:36 anyway, i don't mean to de-rail that effort... so either way again - it works for me.
03:59:50 OK. Time is up.
03:59:58 Thanks everyone for joining the meeting
04:00:03 Let's discuss on the zun channel
04:00:06 #endmeetings
04:00:08 Thanks..
04:00:11 Thanks!
04:00:11 #endmeeting