14:02:54 <shaohe_feng> #startmeeting openstack-cyborg-driver
14:02:55 <openstack> Meeting started Mon Jun  4 14:02:54 2018 UTC and is due to finish in 60 minutes.  The chair is shaohe_feng. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:02:56 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:02:58 <openstack> The meeting name has been set to 'openstack_cyborg_driver'
14:03:14 <shaohe_feng> #topic Roll Call
14:03:28 <Sundar_> #info Sundar
14:03:29 <shaohe_feng> #info shaohe
14:04:16 <Helloway> #info Helloway
14:05:53 <shaohe_feng> Sundar_, let's wait a few minutes for the others?
14:06:21 <shaohe_feng> evening wangzhh
14:06:23 <tony> hi
14:06:35 <shaohe_feng> hello tony
14:06:58 <wangzhh> hello everyone
14:07:11 <tony> hello everyone
14:07:32 <Sundar_> shaohe: Sure
14:08:12 <shaohe_feng> OK. let's start.
14:08:19 <Sundar_> Hi Tony
14:08:26 <Sundar_> Hi Wangzhh
14:08:29 <Guest4480> Hi
14:08:31 <shaohe_feng> welcome tony
14:08:39 <shaohe_feng> #topic current status of drivers
14:09:16 <tony> thx shaohe
14:09:17 <shaohe_feng> I have listed the tasks on the etherpad #link https://etherpad.openstack.org/p/cyborg-driver-tasks
14:09:53 <shaohe_feng> let's go through the tasks
14:10:20 <shaohe_feng> wangzhh, are you working on the VGPU?
14:10:31 <wangzhh> OK. Let me introduce my work.
14:10:50 <shaohe_feng> welcome.
14:11:08 <wangzhh> I'm going on the VGPU.
14:11:42 <wangzhh> And when I merged my code, I found some existing bugs :(
14:12:12 <wangzhh> cyborg-agent doesn't work well.
14:12:35 <wangzhh> Such as https://review.openstack.org/#/c/572080/
14:13:56 <wangzhh> So, before the VGPU driver, maybe I should fix them first.
14:14:13 <shaohe_feng> good catch.
14:14:49 <shaohe_feng> so this is an urgent fix.
14:15:10 <xinran__> Hi sorry for being late
14:15:21 <shaohe_feng> xinran__, evening.
14:15:50 <shaohe_feng> Li_liu is not on line.
14:16:11 <shaohe_feng> he introduced the deployable object.
14:17:04 <shaohe_feng> Sundar_, and other developers, please help to review wangzhh's bug fix
14:17:15 <shaohe_feng> #link https://review.openstack.org/#/c/572080/
14:18:05 <shaohe_feng> wangzhh, any other progress on VGPU? Can you help to update the task list? #link https://etherpad.openstack.org/p/cyborg-driver-tasks
14:19:12 <shaohe_feng> ^ wangzhh, update the status.
14:19:23 <shaohe_feng> OK, next.
14:19:29 <wangzhh> shaohe_feng: OK
14:19:46 <shaohe_feng> SPDK, Helloway are you on line?
14:20:38 <Helloway> yes
14:21:15 <shaohe_feng> so, any progress on the SPDK?
14:22:37 <Helloway> Not at the moment
14:24:38 <shaohe_feng> Helloway, can you help to update the SPDK status on the #link https://etherpad.openstack.org/p/cyborg-driver-tasks ?
14:24:58 <shaohe_feng> OK, next provider report.
14:25:52 <shaohe_feng> Sundar_, do we support multiple resource classes and nested providers in this release?
14:26:09 <Sundar_> Shaohe: IMHO, it is becoming risky
14:26:31 <shaohe_feng> Sundar_, what's the risk?
14:26:38 <Sundar_> We have been waiting too long. Even in today's discussion in Nova scheduler meeting, it is not clear that it is going to come soon
14:27:05 <shaohe_feng> what should we do?
14:27:08 <Sundar_> IMO, we should switch immediately to my originally proposed plan to use compute node as RP, till we get nRP
14:27:56 <Sundar_> This means we can have multiple devices on the same host but they must all have the same types/traits
14:28:02 <shaohe_feng> OK. Can you have a discussion with jaypipes about Cyborg's resource provider?
14:28:11 <shaohe_feng> during the summit?
14:28:30 <Sundar_> E.g. 2 GPUs, 2 FPGAs both from Xilinx with same device family, 2 FPGAs both from Intel (say A10) etc
14:29:22 <Sundar_> I discussed with edleafe etc., primarily on the comments that we should not have vendor/product/device names in traits.
14:29:40 <Sundar_> That got resolved, because we do need vendor/device names, but not product names. The spec has been updated
14:29:57 <Sundar_> On nRP, it is a larger discussion, that is still going on in the Nova community
14:30:09 <shaohe_feng> if we have 1 Xilinx and 1 Intel FPGA on one host, what's the resource name? and what are the trait names?
14:30:33 <Sundar_> We cannot have that unless we get nRPs. I'll explain why
14:30:47 <shaohe_feng> OK. please
14:30:47 * edleafe is here. Didn't know the meeting time changed
14:31:04 <Sundar_> For each device (GPU or FPGA), we will apply a trait like CUSTOM_GPU_AMD, CUSTOM_FPGA_INTEL
14:31:30 <Sundar_> In your example, it will be CUSTOM_FPGA_INTEL and CUSTOM_FPGA_XILINX
14:31:55 <Sundar_> We will also publish the resource class as CUSTOM_ACCELERATOR_FPGA
14:32:14 <edleafe> I would prefer 2 nested children, each with an inventory of 1 CUSTOM_FPGA. Then use traits to distinguish them
14:32:20 <Sundar_> However, without nRP, it will get applied on the compute node, not on the individual devices
14:32:47 <Sundar_> shaohe: that requires nRP. We may not get that soon enough to meet our Rocky goals.
14:33:26 <edleafe> There is currently discussion in placement on the potential upgrade problems when moving from non-nested to nested. It would be better to only plan on implementing on the nested model if possible
14:34:00 <Sundar_> Without nRP, it will get applied on the compute node. So, the compute node will advertise 2 units of CUSTOM_ACCELERATOR_FPGA, with 2 traits: CUSTOM_FPGA_INTEL and CUSTOM_FPGA_XILINX. But, Placement will see that as 2 RCs each with 2 traits
14:34:23 <Sundar_> There is no way to say that one unit of the RC has one trait, and the other has another trait
14:34:59 <Sundar_> Does that make sense?
14:35:07 <edleafe> Sundar_: precisely. That's the main reason for waiting for nested
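For reference, the nested shape edleafe prefers would look roughly like this (a sketch only; the child RP names are hypothetical):

    compute_node (RP, no FPGA inventory of its own)
        fpga_0 (child RP): inventory CUSTOM_ACCELERATOR_FPGA = 1, trait CUSTOM_FPGA_INTEL
        fpga_1 (child RP): inventory CUSTOM_ACCELERATOR_FPGA = 1, trait CUSTOM_FPGA_XILINX

Each unit of inventory then carries its own traits, so a request for a Xilinx FPGA can only match fpga_1.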
14:35:58 <shaohe_feng> edleafe, Sundar_ : there's an issue with this approach.
14:36:24 <Sundar_> edleafe: Thanks for joining. My concern is, we are already in June and even today's Nova sched discussion indicates concerns with rolling upgrades and nRPs
14:36:46 <Sundar_> It would be better to get something done with caveats, than nothing
14:37:23 <shaohe_feng> for example, a user requests one CUSTOM_FPGA_XILINX; then only the Intel unit of CUSTOM_ACCELERATOR_FPGA remains in inventory
14:37:49 <shaohe_feng> then if the user requests one more CUSTOM_FPGA_XILINX, what will happen?
14:38:04 <edleafe> But besides the issues noted with moving from non-nested to nested, Cyborg will also have to re-do a lot of the naming of custom resource classes
14:38:11 <Sundar_> shaohe: traits go with resource providers (RPs), not resource classes (RCs)
14:39:55 <shaohe_feng> Sundar_, yes, that's the issue.
14:40:41 <shaohe_feng> Sundar_, so the user can still get an FPGA, but not the Xilinx one he expects.
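To make the problem concrete, a minimal sketch of the flat modeling, assuming the standard Placement API and the example names above:

    # flat modeling: all inventory and traits sit on the compute node RP
    #   inventory: CUSTOM_ACCELERATOR_FPGA = 2
    #   traits:    CUSTOM_FPGA_INTEL, CUSTOM_FPGA_XILINX
    GET /allocation_candidates?resources=CUSTOM_ACCELERATOR_FPGA:1&required=CUSTOM_FPGA_XILINX

The request matches the compute node as a whole, so it can still succeed after the Xilinx card has been allocated and hand out the Intel card instead; nothing in the flat model ties a unit of inventory to a particular trait.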
14:44:16 <shaohe_feng> edleafe, any suggestion on this issue?
14:44:48 <Sundar_> Hi
14:45:02 <openstackgerrit> wangzhh proposed openstack/cyborg master: Fix Deployable get_by_host  https://review.openstack.org/572080
14:45:16 <Sundar_> Can you see my typing?
14:45:20 <shaohe_feng> Sundar_, welcome to come back
14:45:33 <shaohe_feng> Sundar_, just see "Hi"
14:45:37 <Sundar_> I got blocked for some reason -- whatever I typed did not show up
14:45:49 <shaohe_feng> edleafe, Sundar_, should we only support one kind of FPGA to avoid this issue?
14:46:13 <Sundar_> Yes, multiple devices are OK, but all of the same type
14:46:30 <Sundar_> We cannot have a GPU and a FPGA on the same host, or 2 kinds of FPGAs
14:46:39 <Sundar_> That should be ok for Rocky
14:48:05 <edleafe> shaohe_feng: sorry, had to step away for a moment
14:48:39 <edleafe> shaohe_feng: FPGA is a resource *class*. GPU is a resource *class*
14:48:49 <edleafe> They should be modeled that way from the start.
14:49:01 <edleafe> specific types should be distinguished with traits
14:49:27 <Sundar_> edleafe: That is exactly what we are doing
14:49:54 <edleafe> Sundar_: good. I really would hate to see things like CUSTOM_FPGA_XILINX
14:50:14 <Sundar_> It is just that, without nRPs, we will apply the traits to the compute node, for Rocky cycle alone. That means, all devices on the same host must have the same traits.
14:50:37 <Sundar_> edleafe: CUSTOM_FPGA_XILINX would be a trait on the RP, not a RC
14:50:39 <shaohe_feng> is CUSTOM_FPGA_XILINX a resource class or a trait?
14:50:55 <edleafe> We are still hoping to have NRP complete in Rocky
14:51:02 <Sundar_> shaohe: trait, not RC
14:51:25 <edleafe> ah, that wasn't clear. In context I thought you were using it as an RC
14:51:39 <Sundar_> edleafe: I understand, but with 2 months left to go, I think we are risking Cyborg's Rocky goals by waiting further
14:52:42 <edleafe> Sure, that's understandable. I just want to make sure you know that that will make moving to an NRP design later harder
14:53:08 <Sundar_> What makes the switch to nRPs hard?
14:53:54 <Sundar_> The set of RCs and traits will stay the same. But we will apply the traits to individual device RPs later
14:54:50 <edleafe> You will have inventory of devices for the compute node. When you upgrade, somehow these must be converted to a nested design, and any existing instances that are using those resources will have to have their allocations moved. That is the current upgrade discussion going on.
14:55:42 <edleafe> Remember, allocations are for an instance against the RP. When you move to nested, you now have a new child RP, and the allocations should be against it
14:56:28 <edleafe> But they will be against the compute node for any existing instances at the time of upgrade. How to reconcile all of this correctly is what we are trying to work out now
14:58:31 <shaohe_feng> OK
14:58:48 <Sundar_> OK. There are 2 options: (a) Do not support upgrade expectations for Cyborg in Rocky (IMHO, Rocky addresses the basic Cyborg use cases and lays a strong foundation for further development) (b) Support upgrades by providing some mechanism to delete traits on the compute node at a safe time (would appreciate your input here)
15:00:23 <Sundar_> edleafe: What do you think?
15:00:24 <shaohe_feng> Sundar_, can you summarize it, so we can reach a conclusion at Wednesday's meeting?
15:00:56 <shaohe_feng> ^ edleafe,  any suggestion on it?
15:01:00 <Sundar_> shaohe: Above is the summary :). What additional info do you want?
15:01:14 <edleafe> You would probably have to write some custom upgrade script to iterate over all machines to find old-style traits and allocations, convert them to the new style, and then delete the old ones
15:01:55 <shaohe_feng> Sundar_, are these 2 options mutually exclusive?
15:02:20 <Sundar_> edleafe: On an upgrade, the new agent/driver(s) will automatically create nested RPs and apply traits there, while the old traits on the compute node still exist
15:03:21 <Sundar_> Can we then delete the traits on the compute node, while instances are still running?
15:04:14 <Sundar_> If so, we can provide a script that the operator must run after upgrade, which deletes Cyborg traits on compute nodes
15:05:14 <Sundar_> shaohe: they are exclusive.
15:06:29 <edleafe> Sundar_: Traits aren't the issue. Allocations / Inventory is what is important to update
15:07:15 <edleafe> Otherwise, a compute node will have, say, inventory of 2 FPGAs, and will now have two child RPs with an FPGA inventory
15:07:38 <edleafe> In that example, Placement will now see 4 FPGAs on the one compute node.
15:08:06 <Sundar_> edleafe: True. Maybe the upgrade script can set the reserved to total for the compute node RP?
15:09:07 <edleafe> That's one way. The other would be to simply delete that inventory, since it really isn't the compute node's anymore
15:09:44 <Sundar_> Can we do that while the instances are still using that inventory?
15:09:44 <edleafe> Allocations will also have to be adjusted, because having a double allocation has an impact on things like measuring quotas
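The two remediation options above would look roughly like this against the Placement API (UUID and generation hypothetical):

    # option 1: reserve the flat inventory so nothing new lands on it
    PUT /resource_providers/{cn_uuid}/inventories/CUSTOM_ACCELERATOR_FPGA
    {"resource_provider_generation": 5, "total": 2, "reserved": 2}

    # option 2: delete the flat inventory outright; Placement rejects this
    # with a 409 while allocations still reference it
    DELETE /resource_providers/{cn_uuid}/inventories/CUSTOM_ACCELERATOR_FPGA

Either way, existing allocations still point at the compute node RP and would have to be rewritten against the new child RPs, which is the reconciliation edleafe describes.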
15:10:09 <edleafe> Sundar_: :) Now you
15:10:12 <edleafe> oops
15:10:33 <edleafe> Now you're seeing why I don't like to go the non-nested to nested path
15:10:43 <Sundar_> Can we do that while the instances are still using that inventory?
15:11:03 <Sundar_> i.e. delete the inventory on compute node RPs
15:11:05 <edleafe> There are a lot of issues, and we're trying to come up with a generic solution that will work for Cyborg as well as things like NUMA
15:11:40 <shaohe_feng> edleafe, does the resource provider model support NUMA at present?
15:12:11 <shaohe_feng> edleafe, how do we organize it?
15:12:21 <edleafe> shaohe_feng: yes, but in a non-nested way
15:12:59 <edleafe> We're trying to figure out how to do that upgrade, and it doesn't look easy
15:13:59 <shaohe_feng> e.g. one FPGA in NUMA node0, another in NUMA node1
15:14:38 <edleafe> The way NUMA has been modeled in Nova is a horrible hack that was done before Placement even existed
15:14:40 <Sundar_> Maybe it is simplest to go with option a: for upgrades with Cyborg, first stop all instances using accelerators, run a script that cleans up, then upgrade to new Cyborg. For other subsystems, I understand the issue. But, Cyborg is new and we can set expectations
15:15:14 <edleafe> Sundar_: yes, that is a luxury that a generic solution wouldn't have
15:16:39 <shaohe_feng> edleafe, ^ any suggestion on the ultimate solution for the Cyborg accelerator NUMA topology?
15:18:22 <edleafe> shaohe_feng: I would think it would look something like: compute_node -> NUMA node -> FPGA device -> FPGA regions - > FPGA function
15:18:52 <edleafe> But from what I know of NUMA, you can configure it multiple ways.
15:20:43 <shaohe_feng> compute_node is a provider, NUMA node is a provider, and FPGA device, FPGA regions, FPGA function are all providers?
15:20:49 <shaohe_feng> ^ edleafe,
15:21:15 <shaohe_feng> Sundar_, have you considered the NUMA topo for FPGA?
15:21:23 <edleafe> shaohe_feng: yes. The only inventory is the function, which is what the user wants
15:21:51 <shaohe_feng> edleafe, ok, got it. A 4-level provider tree.
15:21:53 <shaohe_feng> thanks
15:22:30 <Sundar_> shaohe: kinda. But, my suggestion is to focus on the basics for Rocky. If we try to throw in everything, we will not deliver anything
15:23:01 <edleafe> shaohe_feng: of course, it doesn't *have to* be that way, but it is one possibility
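For reference, the tree edleafe sketches, with inventory only at the leaf (one possible layout, not a settled design):

    compute_node (RP)
      NUMA node (child RP)
        FPGA device (child RP)
          FPGA region (child RP)
            FPGA function (child RP)  <-- the only inventory, which is what the user requests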
15:24:31 <shaohe_feng> edleafe, thanks again.
15:24:50 <shaohe_feng> Sundar_, let's go ahead?
15:25:00 <shaohe_feng> for next task?
15:25:29 <Sundar_> Yes, I will send an email to openstack-dev, proposing this (lack of) upgrade plan
15:25:55 <shaohe_feng> Sundar_, OK, please, thanks.
15:26:06 <shaohe_feng> next is Intel FPGA driver
15:26:40 <shaohe_feng> I see the owner is rorojo
15:26:58 <shaohe_feng> Sundar_, rorojo is not on line
15:27:09 <shaohe_feng> can you help to sync with him?
15:27:31 <Sundar_> Yes. Discussions are ongoing about the implementation
15:27:40 <shaohe_feng> great.
15:27:46 <shaohe_feng> any update there?
15:28:35 <Sundar_> Nothing significant. I helped Rodrigo to get started, with code browsing etc
15:28:57 <Sundar_> I had issues with devstack myself :), but I got that fixed, so I have a working deployment on MCP now
15:29:20 <Sundar_> I am now working on agent-driver API update
15:29:59 <shaohe_feng> Sundar_, Oh, we have improved the devstack doc
15:30:08 <shaohe_feng> let me find it for you.
15:31:02 <shaohe_feng> #link https://docs.openstack.org/cyborg/latest/contributor/devstack_setup.html#devstack-quick-start
15:31:06 <Sundar_> The issue was not with Cyborg plugin. I was hitting version conflicts on various components in oslo etc.
15:31:38 <shaohe_feng> OK, is there a bug in the oslo components?
15:32:13 <Sundar_> I had to do a 'pip install --upgrade' on many components, because they were lower version than the minimum
15:32:37 <Sundar_> Then, I took everything in /opt/stack/requirements/lower_constraints.txt and tried to do a mass upgrade
15:33:06 <shaohe_feng> should we submit a patch to upgrade the cyborg requirements?
15:33:09 <Sundar_> That failed because some components need Python 3. So, I excluded some things manually, and otherwise hacked, till it worked
15:34:02 <Sundar_> The issue is, devstack does not seem to upgrade the components to their minimum versions automatically.
15:34:08 <openstackgerrit> wangzhh proposed openstack/cyborg master: Fix Deployable get_by_host  https://review.openstack.org/572080
15:34:43 <Sundar_> This is not Cyborg-specific
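Roughly what Sundar describes, as a sketch only (his Python-3-only exclusions were ad hoc, so this is not a recipe):

    # force installed packages up to at least the declared minimum versions
    pip install --upgrade -r /opt/stack/requirements/lower_constraints.txt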
15:35:31 <shaohe_feng> FPGA programming, Li_liu is not on line. let's skip it.
15:35:54 <shaohe_feng> HPTS, yumeng is not online, skip it.
15:36:12 <shaohe_feng> Cyborg/Nova interaction, Sundar_
15:36:33 <Sundar_> You may have seen the spec updates
15:36:46 <Sundar_> Waiting for approval
15:36:50 <wangzhh> Sundar_: Maybe you can try PIP_UPGRADE=True in local.conf
15:37:22 <Sundar_> wangzhh: Thanks, will try it next time
15:37:40 <shaohe_feng> wangzhh, thanks. can you submit a patch for our developer guide?
15:38:13 <Guest4480> Hi Sundar, there is one problem we faced in our environment: suppose there are two GPUs in one host; when attaching one GPU to a VM, if the attachment fails, will nova try to attach the second GPU in this host to the VM?
15:38:23 <shaohe_feng> wangzhh, also with other doc bugs.  #link https://docs.openstack.org/cyborg/latest/contributor/devstack_setup.html#devstack-quick-start
15:38:37 <wangzhh> It's a common config. Do we need to maintain it?
15:39:04 <shaohe_feng> wangzhh, maybe we can add some note for developer.
15:39:05 <Guest4480> sorry to interrupt.
15:39:18 <shaohe_feng> Guest4480, no problem at all.
15:39:18 <wangzhh> OK.
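The note wangzhh suggests would amount to one line in local.conf (a standard devstack option):

    [[local|localrc]]
    PIP_UPGRADE=True

With PIP_UPGRADE=True, devstack passes --upgrade to pip when installing requirements, which avoids the stale-version conflicts Sundar hit.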
15:39:33 <shaohe_feng> Guest4480, go ahead.
15:39:55 <shaohe_feng> Sundar_, Guest4480 is asking for your help.
15:40:12 <Sundar_> Guest4480: since they are independent resources, failure to attach one should not affect the other one.
15:41:49 <shaohe_feng> next task.
15:42:26 <shaohe_feng> Cyborg/Nova interaction (nova side): Sundar_, any plan for it? Should we move it to the next release?
15:42:58 <Sundar_> What does that mean?
15:43:15 <shaohe_feng> Sundar_, I think you will talk with jaypipes, edleafe and other nova developers about it.
15:43:24 <Sundar_> Do you mean, nova compute calling into os-acc?
15:43:55 <Sundar_> Yes, we need to follow up. We have an os-acc spec, which is awaiting approval
15:44:36 <Sundar_> I will update the specs once more by this week
15:44:53 <shaohe_feng> zhipengh[m], zhuli_ will work on os-acc.
15:45:04 <edleafe> Sundar_: ping me in -nova when they are ready
15:45:30 <Sundar_> edleafe: Sure. Thanks
15:45:43 <shaohe_feng> Sundar_,  but we still need volunteers working on nova side.
15:45:50 <Guest4480> OK, we will follow the spec to see where to add our work (code) which has been done.
15:46:12 <edleafe> shaohe_feng: I can help with the nova side
15:46:55 <Sundar_> Excellent, edleafe. Nova compute needs to call into os-acc for specific instance events, as documented in os-acc spec
15:47:11 <Sundar_> #link https://review.openstack.org/#/c/566798/
15:47:18 <shaohe_feng> edleafe, thanks.
15:48:22 <shaohe_feng> Sundar_, you can have more details with edleafe.
15:48:30 <Sundar_> Yes
15:48:36 <shaohe_feng> thanks.
15:48:43 <shaohe_feng> next task.
15:48:45 <shaohe_feng> Quota for accelerator
15:48:51 <shaohe_feng> xinran__, are you on line?
15:49:14 <xinran__> Yes
15:49:15 <Sundar_> I need to drop off in 5 minutes
15:49:20 <xinran__> I’m here
15:49:25 <shaohe_feng> OK.
15:49:45 <shaohe_feng> Cyborg/Nova/Glance interaction in compute node, including os-acc
15:49:51 <shaohe_feng> Sundar_, is it ready?
15:50:19 <shaohe_feng> xinran__, how is the quota work going?
15:50:44 <openstackgerrit> wangzhh proposed openstack/cyborg master: Fix Deployable get_by_host  https://review.openstack.org/572080
15:50:50 <shaohe_feng> can you update the task status on the #link https://etherpad.openstack.org/p/cyborg-driver-tasks
15:50:57 <Sundar_> shaohe: I will respond to comments and update it. Hopefully, we are on the last iteration
15:51:14 <xinran__> I implemented quota reserve and commit in the API layer
15:51:14 <shaohe_feng> Sundar_, thanks.
15:51:26 <shaohe_feng> xinran__, ok, thanks.
15:51:56 <shaohe_feng> that's all for today's task status
15:52:06 <shaohe_feng> Sundar_, have a good day.
15:52:13 <shaohe_feng> it is too late in China.
15:52:18 <xinran__> But I got Sundar's comment that the first point where Nova enters Cyborg is the agent; I'm not sure about that
15:53:07 <shaohe_feng> do you mean nova will call the cyborg agent directly?
15:53:40 <Sundar_> Nova compute calls os-acc, which calls Cyborg agent
15:54:34 <Sundar_> Bye
15:54:57 <wangzhh> Bye
15:55:36 <shaohe_feng> no time left for us to discuss it in today's meeting; let's defer it to Wednesday's meeting.
15:55:49 <xinran__> okay bye
15:55:50 <shaohe_feng> since it is too late in China.
15:56:08 <shaohe_feng> okay, meeting adjourned
15:56:26 <shaohe_feng> #endmeeting