*** shaohe_feng has quit IRC | 00:00 | |
*** shaohe_feng has joined #openstack-cyborg | 00:01 | |
*** shaohe_feng has quit IRC | 00:11 | |
*** shaohe_feng has joined #openstack-cyborg | 00:12 | |
*** shaohe_feng has quit IRC | 00:21 | |
*** shaohe_feng has joined #openstack-cyborg | 00:22 | |
*** shaohe_feng has quit IRC | 00:31 | |
*** shaohe_feng has joined #openstack-cyborg | 00:32 | |
*** shaohe_feng has quit IRC | 00:41 | |
*** shaohe_feng has joined #openstack-cyborg | 00:44 | |
*** shaohe_feng has quit IRC | 00:52 | |
*** shaohe_feng has joined #openstack-cyborg | 00:53 | |
*** shaohe_feng has quit IRC | 01:02 | |
*** shaohe_feng has joined #openstack-cyborg | 01:03 | |
*** shaohe_feng has quit IRC | 01:12 | |
*** shaohe_feng has joined #openstack-cyborg | 01:13 | |
*** shaohe_feng has quit IRC | 01:22 | |
*** shaohe_feng has joined #openstack-cyborg | 01:23 | |
*** shaohe_feng has quit IRC | 01:33 | |
*** shaohe_feng has joined #openstack-cyborg | 01:33 | |
*** shaohe_feng has quit IRC | 01:43 | |
*** shaohe_feng has joined #openstack-cyborg | 01:47 | |
*** shaohe_feng has quit IRC | 01:53 | |
*** shaohe_feng has joined #openstack-cyborg | 01:56 | |
*** shaohe_feng has quit IRC | 02:03 | |
*** shaohe_feng has joined #openstack-cyborg | 02:04 | |
*** shaohe_feng has quit IRC | 02:14 | |
*** shaohe_feng has joined #openstack-cyborg | 02:14 | |
*** shaohe_feng has quit IRC | 02:24 | |
*** shaohe_feng has joined #openstack-cyborg | 02:25 | |
*** shaohe_feng has quit IRC | 02:34 | |
*** shaohe_feng has joined #openstack-cyborg | 02:37 | |
*** shaohe_feng has quit IRC | 02:44 | |
*** shaohe_feng has joined #openstack-cyborg | 02:46 | |
*** shaohe_feng has quit IRC | 02:55 | |
*** shaohe_feng has joined #openstack-cyborg | 02:56 | |
*** shaohe_feng has quit IRC | 03:05 | |
*** shaohe_feng has joined #openstack-cyborg | 03:05 | |
*** shaohe_feng has quit IRC | 03:15 | |
*** shaohe_feng has joined #openstack-cyborg | 03:17 | |
*** shaohe_feng has quit IRC | 03:25 | |
*** shaohe_feng has joined #openstack-cyborg | 03:27 | |
*** shaohe_feng has quit IRC | 03:36 | |
*** shaohe_feng has joined #openstack-cyborg | 03:37 | |
*** shaohe_feng has quit IRC | 03:46 | |
*** shaohe_feng has joined #openstack-cyborg | 03:47 | |
*** shaohe_feng has quit IRC | 03:56 | |
*** shaohe_feng has joined #openstack-cyborg | 04:00 | |
*** shaohe_feng has quit IRC | 04:06 | |
*** shaohe_feng has joined #openstack-cyborg | 04:07 | |
*** shaohe_feng has quit IRC | 04:17 | |
*** shaohe_feng has joined #openstack-cyborg | 04:18 | |
*** shaohe_feng has quit IRC | 04:27 | |
*** shaohe_feng has joined #openstack-cyborg | 04:31 | |
*** shaohe_feng has quit IRC | 04:37 | |
*** shaohe_feng has joined #openstack-cyborg | 04:39 | |
*** shaohe_feng has quit IRC | 04:47 | |
*** shaohe_feng has joined #openstack-cyborg | 04:48 | |
*** shaohe_feng has quit IRC | 04:58 | |
*** shaohe_feng has joined #openstack-cyborg | 04:59 | |
*** shaohe_feng has quit IRC | 05:08 | |
*** shaohe_feng has joined #openstack-cyborg | 05:09 | |
*** shaohe_feng has quit IRC | 05:18 | |
*** shaohe_feng has joined #openstack-cyborg | 05:20 | |
*** shaohe_feng has quit IRC | 05:28 | |
*** shaohe_feng has joined #openstack-cyborg | 05:29 | |
*** shaohe_feng has quit IRC | 05:39 | |
*** shaohe_feng has joined #openstack-cyborg | 05:40 | |
*** shaohe_feng has quit IRC | 05:49 | |
*** shaohe_feng has joined #openstack-cyborg | 05:50 | |
*** shaohe_feng has quit IRC | 05:59 | |
*** shaohe_feng has joined #openstack-cyborg | 06:01 | |
*** shaohe_feng has quit IRC | 06:09 | |
*** shaohe_feng has joined #openstack-cyborg | 06:13 | |
*** shaohe_feng has quit IRC | 06:20 | |
*** shaohe_feng has joined #openstack-cyborg | 06:20 | |
*** shaohe_feng has quit IRC | 06:30 | |
*** shaohe_feng has joined #openstack-cyborg | 06:32 | |
*** shaohe_feng has quit IRC | 06:40 | |
*** shaohe_feng has joined #openstack-cyborg | 06:42 | |
*** shaohe_feng has quit IRC | 06:50 | |
*** shaohe_feng has joined #openstack-cyborg | 06:51 | |
*** shaohe_feng has quit IRC | 07:01 | |
*** shaohe_feng has joined #openstack-cyborg | 07:01 | |
*** shaohe_feng has quit IRC | 07:11 | |
*** shaohe_feng has joined #openstack-cyborg | 07:12 | |
*** shaohe_feng has quit IRC | 07:21 | |
*** shaohe_feng has joined #openstack-cyborg | 07:22 | |
*** shaohe_feng has quit IRC | 07:31 | |
*** shaohe_feng has joined #openstack-cyborg | 07:34 | |
*** shaohe_feng has quit IRC | 07:42 | |
*** shaohe_feng has joined #openstack-cyborg | 07:42 | |
*** shaohe_feng has quit IRC | 07:52 | |
*** shaohe_feng has joined #openstack-cyborg | 07:53 | |
*** shaohe_feng has quit IRC | 08:02 | |
*** shaohe_feng has joined #openstack-cyborg | 08:03 | |
*** shaohe_feng has quit IRC | 08:12 | |
*** shaohe_feng has joined #openstack-cyborg | 08:14 | |
*** shaohe_feng has quit IRC | 08:23 | |
*** shaohe_feng has joined #openstack-cyborg | 08:25 | |
*** shaohe_feng has quit IRC | 08:33 | |
*** shaohe_feng has joined #openstack-cyborg | 08:34 | |
*** captaindutch has joined #openstack-cyborg | 08:38 | |
*** shaohe_feng has quit IRC | 08:43 | |
*** shaohe_feng has joined #openstack-cyborg | 08:44 | |
*** shaohe_feng has quit IRC | 08:53 | |
*** shaohe_feng has joined #openstack-cyborg | 08:55 | |
*** shaohe_feng has quit IRC | 09:04 | |
*** shaohe_feng has joined #openstack-cyborg | 09:06 | |
*** shaohe_feng has quit IRC | 09:14 | |
*** shaohe_feng has joined #openstack-cyborg | 09:17 | |
*** shaohe_feng has quit IRC | 09:24 | |
*** shaohe_feng has joined #openstack-cyborg | 09:27 | |
*** shaohe_feng has quit IRC | 09:34 | |
*** shaohe_feng has joined #openstack-cyborg | 09:37 | |
*** shaohe_feng has quit IRC | 09:45 | |
*** shaohe_feng has joined #openstack-cyborg | 09:48 | |
*** shaohe_feng has quit IRC | 09:55 | |
*** shaohe_feng has joined #openstack-cyborg | 09:57 | |
*** shaohe_feng has quit IRC | 10:05 | |
*** shaohe_feng has joined #openstack-cyborg | 10:08 | |
*** shaohe_feng has quit IRC | 10:15 | |
*** shaohe_feng has joined #openstack-cyborg | 10:17 | |
*** alex_xu_ has joined #openstack-cyborg | 10:17 | |
*** alex_xu has quit IRC | 10:20 | |
*** shaohe_feng has quit IRC | 10:26 | |
*** shaohe_feng has joined #openstack-cyborg | 10:26 | |
*** shaohe_feng has quit IRC | 10:36 | |
*** shaohe_feng has joined #openstack-cyborg | 10:37 | |
*** shaohe_feng has quit IRC | 10:46 | |
*** shaohe_feng has joined #openstack-cyborg | 10:47 | |
*** shaohe_feng has quit IRC | 10:56 | |
*** shaohe_feng has joined #openstack-cyborg | 10:58 | |
*** shaohe_feng has quit IRC | 11:07 | |
*** shaohe_feng has joined #openstack-cyborg | 11:07 | |
*** shaohe_feng has quit IRC | 11:17 | |
*** shaohe_feng has joined #openstack-cyborg | 11:18 | |
*** openstackgerrit has joined #openstack-cyborg | 11:22 | |
openstackgerrit | wangzhh proposed openstack/cyborg master: Fix Deployable get_by_host https://review.openstack.org/572080 | 11:22 |
*** shaohe_feng has quit IRC | 11:27 | |
*** shaohe_feng has joined #openstack-cyborg | 11:28 | |
*** shaohe_feng has quit IRC | 11:37 | |
*** shaohe_feng has joined #openstack-cyborg | 11:38 | |
*** shaohe_feng has quit IRC | 11:48 | |
*** shaohe_feng has joined #openstack-cyborg | 11:49 | |
*** shaohe_feng has quit IRC | 11:58 | |
*** shaohe_feng has joined #openstack-cyborg | 11:59 | |
*** shaohe_feng has quit IRC | 12:08 | |
*** shaohe_feng has joined #openstack-cyborg | 12:08 | |
*** shaohe_feng has quit IRC | 12:18 | |
*** shaohe_feng has joined #openstack-cyborg | 12:20 | |
*** shaohe_feng has quit IRC | 12:29 | |
*** shaohe_feng has joined #openstack-cyborg | 12:30 | |
*** openstackgerrit has quit IRC | 12:34 | |
*** shaohe_feng has quit IRC | 12:39 | |
*** shaohe_feng has joined #openstack-cyborg | 12:42 | |
*** shaohe_feng has quit IRC | 12:49 | |
*** shaohe_feng has joined #openstack-cyborg | 12:54 | |
*** shaohe_feng has quit IRC | 12:59 | |
*** shaohe_feng has joined #openstack-cyborg | 13:00 | |
*** shaohe_feng has quit IRC | 13:10 | |
*** shaohe_feng has joined #openstack-cyborg | 13:11 | |
*** shaohe_feng has quit IRC | 13:20 | |
*** jaypipes has joined #openstack-cyborg | 13:20 | |
*** shaohe_feng has joined #openstack-cyborg | 13:23 | |
*** shaohe_feng has quit IRC | 13:30 | |
*** shaohe_feng has joined #openstack-cyborg | 13:32 | |
*** shaohe_feng has quit IRC | 13:40 | |
*** shaohe_feng has joined #openstack-cyborg | 13:43 | |
*** shaohe_feng has quit IRC | 13:51 | |
*** shaohe_feng has joined #openstack-cyborg | 13:51 | |
*** Helloway has joined #openstack-cyborg | 13:58 | |
*** Sundar_ has joined #openstack-cyborg | 13:58 | |
*** shaohe_feng has quit IRC | 14:01 | |
*** shaohe_feng has joined #openstack-cyborg | 14:02 | |
shaohe_feng | #startmeeting openstack-cyborg-driver | 14:02 |
openstack | Meeting started Mon Jun 4 14:02:54 2018 UTC and is due to finish in 60 minutes. The chair is shaohe_feng. Information about MeetBot at http://wiki.debian.org/MeetBot. | 14:02 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 14:02 |
*** openstack changes topic to " (Meeting topic: openstack-cyborg-driver)" | 14:02 | |
openstack | The meeting name has been set to 'openstack_cyborg_driver' | 14:02 |
shaohe_feng | #topic Roll Call | 14:03 |
*** openstack changes topic to "Roll Call (Meeting topic: openstack-cyborg-driver)" | 14:03 | |
Sundar_ | #info Sundar | 14:03 |
shaohe_feng | #info shaohe | 14:03 |
*** tony has joined #openstack-cyborg | 14:03 | |
Helloway | #info Helloway | 14:04 |
shaohe_feng | Sundar_, shall we wait a few minutes for the others? | 14:05 |
*** wangzhh has joined #openstack-cyborg | 14:06 | |
shaohe_feng | evening wangzhh | 14:06 |
tony | hi | 14:06 |
shaohe_feng | hello tony | 14:06 |
wangzhh | hello everyone | 14:06 |
tony | hello everyone | 14:07 |
Sundar_ | shaohe: Sure | 14:07 |
shaohe_feng | OK. let's start. | 14:08 |
*** Guest4480 has joined #openstack-cyborg | 14:08 | |
Sundar_ | Hi Tony | 14:08 |
Sundar_ | Hi Wangzhh | 14:08 |
Guest4480 | Hi | 14:08 |
shaohe_feng | welcome tony | 14:08 |
shaohe_feng | #topic current status of drivers | 14:08 |
*** openstack changes topic to "current status of drivers (Meeting topic: openstack-cyborg-driver)" | 14:08 | |
tony | thanks shaohe | 14:09 |
shaohe_feng | I have listed the tasks on the etherpad #link https://etherpad.openstack.org/p/cyborg-driver-tasks | 14:09 |
shaohe_feng | let's go through the tasks | 14:09 |
shaohe_feng | wangzhh, are you working on the VGPU? | 14:10 |
wangzhh | OK. Let me introduce my work. | 14:10 |
shaohe_feng | welcome. | 14:10 |
wangzhh | I'm working on the VGPU. | 14:11 |
*** shaohe_feng has quit IRC | 14:11 | |
wangzhh | And when I merged my code, I found some existing bugs :( | 14:11 |
wangzhh | cyborg-agent doesn't work well. | 14:12 |
*** shaohe_feng has joined #openstack-cyborg | 14:12 | |
wangzhh | Such as https://review.openstack.org/#/c/572080/ | 14:12 |
wangzhh | So, before VGPU driver, maybe I should fix them first. | 14:13 |
shaohe_feng | good catch. | 14:14 |
*** xinran__ has joined #openstack-cyborg | 14:14 | |
shaohe_feng | so this is an urgent fix. | 14:14 |
xinran__ | Hi sorry for being late | 14:15 |
shaohe_feng | xinran__, evening. | 14:15 |
shaohe_feng | Li_liu is not online. | 14:15 |
shaohe_feng | he introduced the deployable object. | 14:16 |
shaohe_feng | Sundar_, and other developers, please help review wangzhh's bug fix | 14:17 |
shaohe_feng | #link https://review.openstack.org/#/c/572080/ | 14:17 |
shaohe_feng | wangzhh, any other progress on VGPU? can you help update the task list? #link https://etherpad.openstack.org/p/cyborg-driver-tasks | 14:18 |
shaohe_feng | ^ wangzhh, update the status. | 14:19 |
shaohe_feng | OK, next. | 14:19 |
wangzhh | shaohe_feng: OK | 14:19 |
shaohe_feng | SPDK: Helloway, are you online? | 14:19 |
*** alex_xu_ has quit IRC | 14:20 | |
Helloway | yes | 14:20 |
*** alex_xu has joined #openstack-cyborg | 14:20 | |
shaohe_feng | so any progress on the SPDK? | 14:21 |
*** shaohe_feng has quit IRC | 14:21 | |
Helloway | Temporarily no | 14:22 |
*** shaohe_feng has joined #openstack-cyborg | 14:22 | |
shaohe_feng | Helloway, can you help to update the SPDK status on the #link https://etherpad.openstack.org/p/cyborg-driver-tasks ? | 14:24 |
shaohe_feng | OK, next provider report. | 14:24 |
shaohe_feng | Sundar_, do we support multiple resource classes and nested providers in this release? | 14:25 |
Sundar_ | Shaohe: IMHO, it is becoming risky | 14:26 |
shaohe_feng | Sundar_, what's the risk? | 14:26 |
Sundar_ | We have been waiting too long. Even in today's discussion in Nova scheduler meeting, it is not clear that it is going to come soon | 14:26 |
shaohe_feng | what should we do? | 14:27 |
Sundar_ | IMO, we should switch immediately to my originally proposed plan to use compute node as RP, till we get nRP | 14:27 |
Sundar_ | This means we can have multiple devices on the same host but they must all have the same types/traits | 14:27 |
shaohe_feng | OK. Can you have a discussion with jaypipes about cyborg's resource provider? | 14:28 |
shaohe_feng | during the summit? | 14:28 |
Sundar_ | E.g. 2 GPUs, 2 FPGAs both from Xilinx with same device family, 2 FPGAs both from Intel (say A10) etc | 14:28 |
Sundar_ | I discussed with edleafe etc., primarily on the comments that we should not have vendor/product/device names in traits. | 14:29 |
*** Helloway has quit IRC | 14:29 | |
Sundar_ | That got resolved, because we do need vendor/device names, but not product names. The spec has been updated | 14:29 |
Sundar_ | On nRP, it is a larger discussion, that is still going on in the Nova community | 14:29 |
shaohe_feng | if we have 1 Xilinx and 1 Intel FPGA on one host, what's the resource name? and what are the trait names? | 14:30 |
Sundar_ | We cannot have that unless we get nRPs. I'll explain why | 14:30 |
shaohe_feng | OK. please | 14:30 |
* edleafe is here. Didn't know the meeting time changed | 14:30 | |
Sundar_ | For each device (GPU or FPGA), we will apply a trait like CUSTOM_GPU_AMD, CUSTOM_FPGA_INTEL | 14:31 |
Sundar_ | In your example, it will be CUSTOM_FPGA_INTEL and CUSTOM_FPGA_XILINX | 14:31 |
Sundar_ | We will also publish the resource class as CUSTOM_ACCELERATOR_FPGA | 14:31 |
*** shaohe_feng has quit IRC | 14:32 | |
edleafe | I would prefer 2 nested children, each with an inventory of 1 CUSTOM_FPGA. Then use traits to distinguish them | 14:32 |
Sundar_ | However, without nRP, it will get applied on the compute node, not on the individual devices | 14:32 |
Sundar_ | shaohe: that requires nRP. We may not get that soon enough to meet our Rocky goals. | 14:32 |
edleafe | There is currently discussion in placement on the potential upgrade problems when moving from non-nested to nested. It would be better to only plan on implementing on the nested model if possible | 14:33 |
*** shaohe_feng has joined #openstack-cyborg | 14:33 | |
Sundar_ | Without nRP, it will get applied on the compute node. So, the compute node will advertise 2 units of CUSTOM_ACCELERATOR_FPGA, with 2 traits: CUSTOM_FPGA_INTEL and CUSTOM_FPGA_XILINX. But, Placement will see that as 2 RCs each with 2 traits | 14:34 |
Sundar_ | There is no way to say that one unit of the RC has one trait, and the other has another trait | 14:34 |
Sundar_ | Does that make sense? | 14:34 |
edleafe | Sundar_: precisely. That's the main reason for waiting for nested | 14:35 |
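The point made above about traits sticking to the provider rather than to individual units can be sketched with a toy model. This is only an illustration, not Placement's actual code: the `Provider` class, the `candidates` helper, and the provider names (`compute1`, `compute1_fpga0`, `compute1_fpga1`) are all hypothetical; only the resource class and trait names come from the discussion.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Toy stand-in for a Placement resource provider (RP); hypothetical."""
    name: str
    inventory: dict = field(default_factory=dict)  # resource class -> free units
    traits: set = field(default_factory=set)


def candidates(providers, resource_class, required_trait):
    """Names of providers that still have the class in stock and carry the trait."""
    return [
        p.name
        for p in providers
        if p.inventory.get(resource_class, 0) > 0 and required_trait in p.traits
    ]


RC = "CUSTOM_ACCELERATOR_FPGA"

# Flat model: both devices are folded into the compute node RP,
# so both traits land on the same provider.
flat = [Provider("compute1", {RC: 2}, {"CUSTOM_FPGA_INTEL", "CUSTOM_FPGA_XILINX"})]

# Nested model: one child RP per device, each carrying only its own trait.
nested = [
    Provider("compute1_fpga0", {RC: 1}, {"CUSTOM_FPGA_INTEL"}),
    Provider("compute1_fpga1", {RC: 1}, {"CUSTOM_FPGA_XILINX"}),
]

flat_first = candidates(flat, RC, "CUSTOM_FPGA_XILINX")
nested_first = candidates(nested, RC, "CUSTOM_FPGA_XILINX")

# Simulate the single Xilinx device being consumed in each model.
flat[0].inventory[RC] -= 1
nested[1].inventory[RC] -= 1

# The flat model still claims a Xilinx match although only the Intel
# device is left; the nested model correctly reports no candidates.
flat_second = candidates(flat, RC, "CUSTOM_FPGA_XILINX")
nested_second = candidates(nested, RC, "CUSTOM_FPGA_XILINX")
```

In this sketch the flat model cannot tell which unit a trait belongs to, which is why the same-type-per-host restriction is needed until nested providers are available.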
shaohe_feng | edleafe, Sundar_ : there's an issue with this approach. | 14:35 |
Sundar_ | edleafe: Thanks for joining. My concern is, we are already in June and even today's Nova sched discussion indicates concerns with rolling upgrades and nRPs | 14:36 |
Sundar_ | It would be better to get something done with caveats, than nothing | 14:36 |
shaohe_feng | for example, a user requests one CUSTOM_FPGA_XILINX, then the inventory will have one Intel CUSTOM_ACCELERATOR_FPGA remaining | 14:37 |
shaohe_feng | then a user requests another CUSTOM_FPGA_XILINX, what will happen? | 14:37 |
edleafe | But besides the issues noted with moving from non-nested to nested, Cyborg will also have to re-do a lot of the naming of custom resource classes | 14:38 |
Sundar_ | shaohe: traits go with resource providers (RPs), not resource classes (RCs) | 14:38 |
shaohe_feng | Sundar_, yes, that's the issue. | 14:39 |
shaohe_feng | Sundar_, so the user can still get an FPGA, but not the Xilinx one that he expects. | 14:40 |
*** Helloway has joined #openstack-cyborg | 14:41 | |
*** shaohe_feng has quit IRC | 14:42 | |
*** shaohe_feng has joined #openstack-cyborg | 14:42 | |
*** Sundar_ has quit IRC | 14:42 | |
shaohe_feng | edleafe, any suggestion on this issue? | 14:44 |
*** Sundar_ has joined #openstack-cyborg | 14:44 | |
Sundar_ | Hi | 14:44 |
*** openstackgerrit has joined #openstack-cyborg | 14:45 | |
openstackgerrit | wangzhh proposed openstack/cyborg master: Fix Deployable get_by_host https://review.openstack.org/572080 | 14:45 |
Sundar_ | Can you see my typing? | 14:45 |
*** tony has quit IRC | 14:45 | |
shaohe_feng | Sundar_, welcome back | 14:45 |
shaohe_feng | Sundar_, I only see "Hi" | 14:45 |
Sundar_ | I got blocked for some reason -- whatever I typed did not show up | 14:45 |
shaohe_feng | edleafe, Sundar_ , so we only support one kind of FPGA to avoid this issue? | 14:45 |
Sundar_ | Yes, multiple devices ok, but all of the same type | 14:46 |
Sundar_ | We cannot have a GPU and a FPGA on the same host, or 2 kinds of FPGAs | 14:46 |
Sundar_ | That should be ok for Rocky | 14:46 |
*** tony has joined #openstack-cyborg | 14:47 | |
*** alex_xu has quit IRC | 14:47 | |
edleafe | shaohe_feng: sorry, had to step away for a moment | 14:48 |
edleafe | shaohe_feng: FPGA is a resource *class*. GPU is a resource *class* | 14:48 |
edleafe | They should be modeled that way from the start. | 14:48 |
edleafe | specific types should be distinguished with traits | 14:49 |
*** alex_xu has joined #openstack-cyborg | 14:49 | |
Sundar_ | edleafe: That is exactly what we are doing | 14:49 |
edleafe | Sundar_: good. I really would hate to see things like CUSTOM_FPGA_XILINX | 14:49 |
Sundar_ | It is just that, without nRPs, we will apply the traits to the compute node, for Rocky cycle alone. That means, all devices on the same host must have the same traits. | 14:50 |
Sundar_ | edleafe: CUSTOM_FPGA_XILINX would be a trait on the RP, not a RC | 14:50 |
shaohe_feng | is CUSTOM_FPGA_XILINX a resource class or a trait? | 14:50 |
edleafe | We are still hoping to have NRP complete in Rocky | 14:50 |
Sundar_ | shaohe: trait, not RC | 14:51 |
edleafe | ah, that wasn't clear. In context I thought you were using it as an RC | 14:51 |
Sundar_ | edleafe: I understand, but with 2 months left to go, I think we are risking Cyborg's Rocky goals by waiting further | 14:51 |
*** shaohe_feng has quit IRC | 14:52 | |
edleafe | Sure, that's understandable. I just want to make sure you know that that will make moving to an NRP design later harder | 14:52 |
Sundar_ | What makes the switch to nRPs hard? | 14:53 |
*** shaohe_feng has joined #openstack-cyborg | 14:53 | |
Sundar_ | The set of RCs and traits will stay the same. But we will apply the traits to individual device RPs later | 14:53 |
edleafe | You will have inventory of devices for the compute node. When you upgrade, somehow these must be converted to a nested design, and any existing instances that are using those resources will have to have their allocations moved. That is the current upgrade discussion going on. | 14:54 |
edleafe | Remember, allocations are for an instance against the RP. When you move to nested, you now have a new child RP, and the allocations should be against it | 14:55 |
edleafe | But they will be against the compute node for any existing instances at the time of upgrade. How to reconcile all of this correctly is what we are trying to work out now | 14:56 |
*** Helloway has quit IRC | 14:58 | |
shaohe_feng | OK | 14:58 |
Sundar_ | OK. There are 2 options: (a) Do not support expectations of upgrades for Cyborg in rocky (IMHO, Rocky addresses the basic Cyborg use cases and lays a strong foundation for further development) (b) Support upgrades by providing some mechanism to delete traits on compute node at a safe time (would appreciate your input here) | 14:58 |
Sundar_ | edleafe: What do you think? | 15:00 |
shaohe_feng | Sundar_, can you summarize it, so we can reach a conclusion at Wednesday's meeting? | 15:00 |
shaohe_feng | ^ edleafe, any suggestion on it? | 15:00 |
Sundar_ | shaohe: Above is the summary :). What additional info do you want? | 15:01 |
edleafe | You would probably have to write some custom upgrade script to iterate over all machines to find old-style traits and allocations, convert them to the new style, and then delete the old ones | 15:01 |
shaohe_feng | Sundar_, are these 2 options mutually exclusive? | 15:01 |
Sundar_ | edleafe: On an upgrade, the new agent/driver(s) will automatically create nested RPs and apply traits there, while the old traits on the compute node still exist | 15:02 |
*** shaohe_feng has quit IRC | 15:02 | |
Sundar_ | Can we then delete the traits on the compute node, while instances are still running? | 15:03 |
*** shaohe_feng has joined #openstack-cyborg | 15:03 | |
Sundar_ | If so, we can provide a script that the operator must run after upgrade, which deletes Cyborg traits on compute nodes | 15:04 |
Sundar_ | shaohe: they are exclusive. | 15:05 |
edleafe | Sundar_: Traits aren't the issue. Allocations / Inventory is what is important to update | 15:06 |
edleafe | Otherwise, a compute node will have, say, inventory of 2 FPGAs, and will now have two child RPs with an FPGA inventory | 15:07 |
edleafe | In that example, Placement will now see 4 FPGAs on the one compute node. | 15:07 |
Sundar_ | edleafe: True. Maybe the upgrade script can set the reserved to total for the compute node RP? | 15:08 |
edleafe | That's one way. The other would be to simply delete that inventory, since it really isn't the compute node's anymore | 15:09 |
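The double counting described above, and the two cleanups just mentioned, can be sketched as follows. This is a minimal model under stated assumptions: the provider names and the `total_available` and `make_tree` helpers are hypothetical, and the inventory records only borrow the `total` and `reserved` field names from Placement. Allocations held by running instances are deliberately left out, so this does not address the harder question of moving them.

```python
RC = "CUSTOM_ACCELERATOR_FPGA"


def total_available(providers, resource_class):
    """Sum usable units (total - reserved) of one class across all providers."""
    return sum(
        inv["total"] - inv["reserved"]
        for p in providers
        for rc, inv in p["inventory"].items()
        if rc == resource_class
    )


def make_tree():
    """State right after an upgrade: old root inventory plus new child RPs."""
    return [
        {"name": "compute1", "inventory": {RC: {"total": 2, "reserved": 0}}},
        {"name": "compute1_fpga0", "inventory": {RC: {"total": 1, "reserved": 0}}},
        {"name": "compute1_fpga1", "inventory": {RC: {"total": 1, "reserved": 0}}},
    ]


# Untouched, the same two devices are counted twice.
double_counted = total_available(make_tree(), RC)

# Cleanup 1: set reserved equal to total on the compute node RP.
reserved_fix = make_tree()
root_inv = reserved_fix[0]["inventory"][RC]
root_inv["reserved"] = root_inv["total"]
after_reserved_fix = total_available(reserved_fix, RC)

# Cleanup 2: delete the compute node's inventory for that class outright.
delete_fix = make_tree()
del delete_fix[0]["inventory"][RC]
after_delete_fix = total_available(delete_fix, RC)
```

Both cleanups bring the visible count back to two devices in this toy model; which one is safe while instances still hold allocations is the open question in the discussion.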
Sundar_ | Can we do that while the instances are still using that inventory? | 15:09 |
edleafe | Allocations will also have to be adjusted, because having a double allocation has an impact on things like measuring quotas | 15:09 |
edleafe | Sundar_: :) Now you | 15:10 |
edleafe | oops | 15:10 |
edleafe | Now you're seeing why I don't like to go the non-nested to nested path | 15:10 |
Sundar_ | Can we do that while the instances are still using that inventory? | 15:10 |
Sundar_ | i.e. delete the inventory on compute node RPs | 15:11 |
edleafe | There are a lot of issues, and we're trying to come up with a generic solution that will work for Cyborg as well as things like NUMA | 15:11 |
shaohe_feng | edleafe, does the provider support NUMA at present? | 15:11 |
shaohe_feng | edleafe, how do we organize | 15:12 |
edleafe | shaohe_feng: yes, but in a non-nested way | 15:12 |
edleafe | We're trying to figure out how to do that upgrade, and it doesn't look easy | 15:12 |
*** shaohe_feng has quit IRC | 15:13 | |
*** shaohe_feng has joined #openstack-cyborg | 15:13 | |
shaohe_feng | one in numa node0, another in numa node1 | 15:13 |
edleafe | The way NUMA has been modeled in Nova is a horrible hack that was done before Placement even existed | 15:14 |
Sundar_ | Maybe it is simplest to go with option (a): for upgrades with Cyborg, first stop all instances using accelerators, run a script that cleans up, then upgrade to new Cyborg. For other subsystems, I understand the issue. But Cyborg is new and we can set expectations | 15:14 |
edleafe | Sundar_: yes, that is a luxury that a generic solution wouldn't have | 15:15 |
shaohe_feng | edleafe, ^ any suggestion on the ultimate solution for cyborg accelerator NUMA topology? | 15:16 |
edleafe | shaohe_feng: I would think it would look something like: compute_node -> NUMA node -> FPGA device -> FPGA regions -> FPGA function | 15:18 |
edleafe | But from what I know of NUMA, you can configure it multiple ways. | 15:18 |
shaohe_feng | so compute_node is a provider, NUMA node is a provider, and FPGA device, FPGA regions, FPGA function are all providers? | 15:20 |
shaohe_feng | ^ edleafe, | 15:20 |
shaohe_feng | Sundar_, have you considered NUMA topology for FPGA | 15:21 |
edleafe | shaohe_feng: yes. The only inventory is the function, which is what the user wants | 15:21 |
shaohe_feng | ? | 15:21 |
shaohe_feng | edleafe, ok, got it. 4 level provider. | 15:21 |
shaohe_feng | thanks | 15:21 |
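The provider tree suggested above, with inventory only at the function level, might look like the following sketch. It is one possible shape, not a settled design: the `Node` class, the node names (`numa0`, `fpga0`, `region0`, `function0`), and the choice of resource class on the leaf are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy provider in a nested tree; hypothetical, not a Placement object."""
    name: str
    inventory: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def providers_with_inventory(node, path=()):
    """Walk the tree and return (path, inventory) for every provider holding inventory."""
    path = path + (node.name,)
    found = []
    if node.inventory:
        found.append((" -> ".join(path), node.inventory))
    for child in node.children:
        found.extend(providers_with_inventory(child, path))
    return found


# compute_node -> NUMA node -> FPGA device -> FPGA region -> FPGA function,
# with the only inventory on the function, as described in the discussion.
tree = Node("compute_node", children=[
    Node("numa0", children=[
        Node("fpga0", children=[
            Node("region0", children=[
                Node("function0", inventory={"CUSTOM_ACCELERATOR_FPGA": 1}),
            ]),
        ]),
    ]),
])

result = providers_with_inventory(tree)
```

Intermediate providers carry no inventory here; they exist only to express where the function sits in the topology.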
Sundar_ | shaohe: kinda. But, my suggestion is to focus on the basics for Rocky. If we try to throw in everything, we will not deliver anything | 15:22 |
edleafe | shaohe_feng: of course, it doesn't *have to* be that way, but it is one possibility | 15:23 |
*** shaohe_feng has quit IRC | 15:23 | |
*** shaohe_feng has joined #openstack-cyborg | 15:24 | |
shaohe_feng | edleafe, thanks again. | 15:24 |
shaohe_feng | Sundar_, let's go ahead? | 15:24 |
*** tony has quit IRC | 15:24 | |
shaohe_feng | for next task? | 15:25 |
Sundar_ | Yes, I will send an email to openstack-dev, proposing this (lack of) upgrade plan | 15:25 |
shaohe_feng | Sundar_, OK, please, thanks. | 15:25 |
shaohe_feng | next is Intel FPGA driver | 15:26 |
shaohe_feng | I see the owner is rorojo | 15:26 |
shaohe_feng | Sundar_, rorojo is not online | 15:26 |
shaohe_feng | can you help to sync with him? | 15:27 |
Sundar_ | Yes. Discussions are ongoing about the implementation | 15:27 |
shaohe_feng | great. | 15:27 |
shaohe_feng | any update there? | 15:27 |
Sundar_ | Nothing significant. I helped Rodrigo to get started, with code browsing etc | 15:28 |
Sundar_ | I had issues with devstack myself :), but I got that fixed, so I have a working deployment on MCP now | 15:28 |
Sundar_ | I am now working on agent-driver API update | 15:29 |
shaohe_feng | Sundar_, Oh, we have improved the devstack doc | 15:29 |
shaohe_feng | let me find it for you. | 15:30 |
shaohe_feng | #link https://docs.openstack.org/cyborg/latest/contributor/devstack_setup.html#devstack-quick-start | 15:31 |
Sundar_ | The issue was not with Cyborg plugin. I was hitting version conflicts on various components in oslo etc. | 15:31 |
shaohe_feng | OK, is there a bug in the oslo components? | 15:31 |
Sundar_ | I had to do a 'pip install --upgrade' on many components, because they were lower version than the minimum | 15:32 |
Sundar_ | Then, I took everything in /opt/stack/requirements/lower_constraints.text and tried to do a mass upgrade | 15:32 |
shaohe_feng | should we submit a patch to upgrade the cyborg requirements? | 15:33 |
Sundar_ | That failed because some components need Python 3. So, I excluded some things manually, and otherwise hacked, till it worked | 15:33 |
*** shaohe_feng has quit IRC | 15:33 | |
Sundar_ | The issue is, devstack does not seem to upgrade the components to their minimum versions automatically. | 15:34 |
openstackgerrit | wangzhh proposed openstack/cyborg master: Fix Deployable get_by_host https://review.openstack.org/572080 | 15:34 |
Sundar_ | This is not Cyborg-specific | 15:34 |
*** shaohe_feng has joined #openstack-cyborg | 15:35 | |
shaohe_feng | FPGA programming: Li_liu is not online. let's skip it. | 15:35 |
shaohe_feng | HPTS, yumeng is not online, skip it. | 15:35 |
shaohe_feng | Cyborg/Nova interaction, Sundar_ | 15:36 |
Sundar_ | You may have seen the spec updates | 15:36 |
Sundar_ | Waiting for approval | 15:36 |
wangzhh | Sundar_: Maybe you can try PIP_UPGRADE=True in local.conf | 15:36 |
Sundar_ | wangzhh: Thanks, will try it next time | 15:37 |
shaohe_feng | wangzhh, thanks. can you submit a patch for our developer guide? | 15:37 |
Guest4480 | Hi Sundar, there is one problem we faced in our environment: suppose there are two GPUs in one host. When attaching one GPU to a VM, if this attachment fails, will nova try to attach the second GPU in this host to the VM? | 15:38 |
shaohe_feng | wangzhh, also with other doc bugs. #link https://docs.openstack.org/cyborg/latest/contributor/devstack_setup.html#devstack-quick-start | 15:38 |
wangzhh | It's a common config. Do we need to maintain it? | 15:38 |
shaohe_feng | wangzhh, maybe we can add a note for developers. | 15:39 |
Guest4480 | sorry to interrupt. | 15:39 |
shaohe_feng | Guest4480, it does not matter. | 15:39 |
wangzhh | OK. | 15:39 |
shaohe_feng | Guest4480, go ahead. | 15:39 |
shaohe_feng | Sundar_, Guest4480 is asking for your help. | 15:39 |
Sundar_ | Guest4480: since they are independent resources, failure to attach one should not affect the other one. | 15:40 |
shaohe_feng | next task. | 15:41 |
shaohe_feng | Cyborg/Nova interaction (nova side), Sundar_, any plan for it? Should we move it to the next release? | 15:42 |
Sundar_ | What does that mean? | 15:42 |
shaohe_feng | Sundar_, I think you will talk with jaypipes, edleafe and other nova developers about it. | 15:43 |
Sundar_ | Do you mean, nova compute calling into os-acc? | 15:43 |
*** shaohe_feng has quit IRC | 15:43 | |
Sundar_ | Yes, we need to follow up. We have an os-acc spec, which is awaiting approval | 15:43 |
Sundar_ | I will update the specs once more by this week | 15:44 |
*** shaohe_feng has joined #openstack-cyborg | 15:44 | |
shaohe_feng | zhipengh[m], zhuli_ will work on os-acc. | 15:44 |
edleafe | Sundar_: ping me in -nova when they are ready | 15:45 |
Sundar_ | edleafe: Sure. Thanks | 15:45 |
shaohe_feng | Sundar_, but we still need volunteers working on nova side. | 15:45 |
Guest4480 | OK, we will follow the spec to see where to add our work (code) which has been done. | 15:45 |
edleafe | shaohe_feng: I can help with the nova side | 15:46 |
Sundar_ | Excellent, edleafe. Nova compute needs to call into os-acc for specific instance events, as documented in os-acc spec | 15:46 |
Sundar_ | #link https://review.openstack.org/#/c/566798/ | 15:47 |
shaohe_feng | edleafe, thanks. | 15:47 |
shaohe_feng | Sundar_, you can have more details with edleafe. | 15:48 |
Sundar_ | Yes | 15:48 |
shaohe_feng | thanks. | 15:48 |
shaohe_feng | next task. | 15:48 |
shaohe_feng | Quota for accelerator | 15:48 |
shaohe_feng | xinran__, are you online? | 15:48 |
xinran__ | Yes | 15:49 |
Sundar_ | I need to drop off in 5 minutes | 15:49 |
xinran__ | I’m here | 15:49 |
shaohe_feng | OK. | 15:49 |
shaohe_feng | Cyborg/Nova/Glance interaction in compute node, including os-acc | 15:49 |
shaohe_feng | Sundar_, is it ready? | 15:49 |
shaohe_feng | xinran__, how is quota going on? | 15:50 |
openstackgerrit | wangzhh proposed openstack/cyborg master: Fix Deployable get_by_host https://review.openstack.org/572080 | 15:50 |
shaohe_feng | can you update the task status on the #link https://etherpad.openstack.org/p/cyborg-driver-tasks | 15:50 |
Sundar_ | shaohe: I will respond to comments and update it. Hopefully, we are on the last iteration | 15:50 |
xinran__ | I implemented quota reserve and commit in the API layer | 15:51 |
shaohe_feng | Sundar_, thanks. | 15:51 |
shaohe_feng | xinran__, ok, thanks. | 15:51 |
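The reserve-and-commit idea mentioned above is a general two-phase pattern, which can be sketched as below. This is not Cyborg's implementation, which the log does not show: the `QuotaTracker` class, its method names, and the `QuotaExceeded` exception are all hypothetical, and a real service would persist this state and guard it against concurrent requests.

```python
class QuotaExceeded(Exception):
    """Raised when a reservation would push usage past the limit (hypothetical)."""


class QuotaTracker:
    """Toy in-memory two-phase quota: reserve first, then commit or roll back."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0
        self.reserved = 0

    def reserve(self, count):
        # Count pending reservations too, so parallel requests cannot overshoot.
        if self.in_use + self.reserved + count > self.limit:
            raise QuotaExceeded(f"requested {count}, limit is {self.limit}")
        self.reserved += count
        return count

    def commit(self, count):
        # The accelerator was actually allocated: turn the reservation into usage.
        self.reserved -= count
        self.in_use += count

    def rollback(self, count):
        # The allocation failed: release the reservation without consuming quota.
        self.reserved -= count


tracker = QuotaTracker(limit=2)
pending = tracker.reserve(1)   # hold one unit while the request is processed
tracker.commit(pending)        # request succeeded, usage is now 1
```

Splitting the check from the final accounting lets a failed request return its quota through `rollback` instead of leaking it.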
shaohe_feng | that's all for today's task status | 15:51 |
shaohe_feng | Sundar_, have a good day. | 15:52 |
shaohe_feng | it is too late in China. | 15:52 |
xinran__ | But I got Sundar's comment that the first point where nova enters cyborg is the agent; I'm not sure about that | 15:52 |
shaohe_feng | do you mean nova will call cyborg agent directly? | 15:53 |
Sundar_ | Nova compute calls os-acc, which calls Cyborg agent | 15:53 |
*** shaohe_feng has quit IRC | 15:54 | |
Sundar_ | Bye | 15:54 |
*** Sundar_ has quit IRC | 15:54 | |
wangzhh | Bye | 15:54 |
*** shaohe_feng has joined #openstack-cyborg | 15:55 | |
shaohe_feng | no time left for us to discuss it in today's meeting. let's leave it for Wed's meeting. | 15:55 |
xinran__ | okay bye | 15:55 |
shaohe_feng | For it is too late in China. | 15:55 |
shaohe_feng | okay, meeting adjourned | 15:56 |
shaohe_feng | #endmeeting | 15:56 |
*** openstack changes topic to "spec review day (Meeting topic: openstack-cyborg)" | 15:56 | |
openstack | Meeting ended Mon Jun 4 15:56:26 2018 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:56 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/openstack_cyborg_driver/2018/openstack_cyborg_driver.2018-06-04-14.02.html | 15:56 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/openstack_cyborg_driver/2018/openstack_cyborg_driver.2018-06-04-14.02.txt | 15:56 |
openstack | Log: http://eavesdrop.openstack.org/meetings/openstack_cyborg_driver/2018/openstack_cyborg_driver.2018-06-04-14.02.log.html | 15:56 |
shaohe_feng | xinran__, bye. have a good night | 15:56 |
*** shaohe_feng has quit IRC | 16:04 | |
*** shaohe_feng has joined #openstack-cyborg | 16:07 | |
*** shaohe_feng has quit IRC | 16:14 | |
*** shaohe_feng has joined #openstack-cyborg | 16:14 | |
*** shaohe_feng has quit IRC | 16:24 | |
*** shaohe_feng has joined #openstack-cyborg | 16:26 | |
*** shaohe_feng has quit IRC | 16:35 | |
*** shaohe_feng has joined #openstack-cyborg | 16:36 | |
*** shaohe_feng has quit IRC | 16:45 | |
*** shaohe_feng has joined #openstack-cyborg | 16:55 | |
*** shaohe_feng has quit IRC | 16:55 | |
*** shaohe_feng has joined #openstack-cyborg | 16:56 | |
*** shaohe_feng has quit IRC | 17:05 | |
*** shaohe_feng has joined #openstack-cyborg | 17:08 | |
*** shaohe_feng has quit IRC | 17:16 | |
*** shaohe_feng has joined #openstack-cyborg | 17:18 | |
*** shaohe_feng has quit IRC | 17:26 | |
*** shaohe_feng has joined #openstack-cyborg | 17:28 | |
*** shaohe_feng has quit IRC | 17:36 | |
*** shaohe_feng has joined #openstack-cyborg | 17:38 | |
*** shaohe_feng has quit IRC | 17:46 | |
*** shaohe_feng has joined #openstack-cyborg | 17:47 | |
*** shaohe_feng has quit IRC | 17:57 | |
*** shaohe_feng has joined #openstack-cyborg | 17:57 | |
*** wangzhh has quit IRC | 18:04 | |
*** xinran__ has quit IRC | 18:04 | |
*** shaohe_feng has quit IRC | 18:07 | |
*** shaohe_feng has joined #openstack-cyborg | 18:08 | |
*** shaohe_feng has quit IRC | 18:17 | |
*** shaohe_feng has joined #openstack-cyborg | 18:18 | |
*** shaohe_feng has quit IRC | 18:27 | |
*** shaohe_feng has joined #openstack-cyborg | 18:29 | |
*** shaohe_feng has quit IRC | 18:38 | |
*** shaohe_feng has joined #openstack-cyborg | 18:39 | |
*** shaohe_feng has quit IRC | 18:48 | |
*** shaohe_feng has joined #openstack-cyborg | 18:48 | |
*** shaohe_feng has quit IRC | 18:58 | |
*** shaohe_feng has joined #openstack-cyborg | 18:59 | |
*** shaohe_feng has quit IRC | 19:08 | |
*** shaohe_feng has joined #openstack-cyborg | 19:11 | |
*** shaohe_feng has quit IRC | 19:19 | |
*** shaohe_feng has joined #openstack-cyborg | 19:20 | |
*** shaohe_feng has quit IRC | 19:29 | |
*** shaohe_feng has joined #openstack-cyborg | 19:30 | |
*** shaohe_feng has quit IRC | 19:39 | |
*** shaohe_feng has joined #openstack-cyborg | 19:43 | |
*** shaohe_feng has quit IRC | 19:49 | |
*** shaohe_feng has joined #openstack-cyborg | 19:50 | |
*** shaohe_feng has quit IRC | 20:00 | |
*** shaohe_feng has joined #openstack-cyborg | 20:01 | |
*** shaohe_feng has quit IRC | 20:10 | |
*** shaohe_feng has joined #openstack-cyborg | 20:12 | |
*** shaohe_feng has quit IRC | 20:20 | |
*** shaohe_feng has joined #openstack-cyborg | 20:23 | |
*** shaohe_feng has quit IRC | 20:30 | |
*** shaohe_feng has joined #openstack-cyborg | 20:31 | |
*** shaohe_feng has quit IRC | 20:41 | |
*** shaohe_feng has joined #openstack-cyborg | 20:43 | |
*** shaohe_feng has quit IRC | 20:51 | |
*** shaohe_feng has joined #openstack-cyborg | 20:52 | |
*** captaindutch has quit IRC | 20:55 | |
*** shaohe_feng has quit IRC | 21:01 | |
*** shaohe_feng has joined #openstack-cyborg | 21:02 | |
*** shaohe_feng has quit IRC | 21:11 | |
*** shaohe_feng has joined #openstack-cyborg | 21:12 | |
*** shaohe_feng has quit IRC | 21:22 | |
*** shaohe_feng has joined #openstack-cyborg | 21:23 | |
*** shaohe_feng has quit IRC | 21:32 | |
*** shaohe_feng has joined #openstack-cyborg | 21:33 | |
*** shaohe_feng has quit IRC | 21:42 | |
*** shaohe_feng has joined #openstack-cyborg | 21:43 | |
*** shaohe_feng has quit IRC | 21:52 | |
*** shaohe_feng has joined #openstack-cyborg | 21:53 | |
*** shaohe_feng has quit IRC | 22:03 | |
*** shaohe_feng has joined #openstack-cyborg | 22:06 | |
*** shaohe_feng has quit IRC | 22:13 | |
*** shaohe_feng has joined #openstack-cyborg | 22:14 | |
*** shaohe_feng has quit IRC | 22:23 | |
*** shaohe_feng has joined #openstack-cyborg | 22:24 | |
*** shaohe_feng has quit IRC | 22:33 | |
*** shaohe_feng has joined #openstack-cyborg | 22:34 | |
*** shaohe_feng has quit IRC | 22:44 | |
*** shaohe_feng has joined #openstack-cyborg | 22:45 | |
*** shaohe_feng has quit IRC | 22:54 | |
*** shaohe_feng has joined #openstack-cyborg | 22:55 | |
*** shaohe_feng has quit IRC | 23:04 | |
*** shaohe_feng has joined #openstack-cyborg | 23:05 | |
*** shaohe_feng has quit IRC | 23:14 | |
*** shaohe_feng has joined #openstack-cyborg | 23:16 | |
*** shaohe_feng has quit IRC | 23:25 | |
*** shaohe_feng has joined #openstack-cyborg | 23:25 | |
*** shaohe_feng has quit IRC | 23:35 | |
*** shaohe_feng has joined #openstack-cyborg | 23:38 | |
*** shaohe_feng has quit IRC | 23:45 | |
*** shaohe_feng has joined #openstack-cyborg | 23:46 | |
*** masber has joined #openstack-cyborg | 23:49 | |
*** sum12 has quit IRC | 23:49 | |
*** masuberu has quit IRC | 23:52 | |
*** shaohe_feng has quit IRC | 23:55 | |
*** shaohe_feng has joined #openstack-cyborg | 23:57 |