*** helenafm has joined #openstack-cyborg | 07:31 | |
*** fanzhang_ has joined #openstack-cyborg | 08:33 | |
fanzhang_ | zhipeng hi, I've read your mail about the zero tolerance policy on padding activities, thanks for pointing it out. Yes, it's exactly what I've been confused about. Many Chinese developers could do better than just commit such patches. I talked about this with one contributor in patch https://review.openstack.org/#/c/605300/. And how can I join the wechat group? | 08:35 |
---|---|---|
zhipeng | Thanks fan :) you could add me by searching y cell number 18576658966 | 08:37 |
zhipeng | I can pull you in :) | 08:37 |
fanzhang_ | thanks so mucn | 08:37 |
fanzhang_ | much | 08:37 |
zhipeng | No problem :) | 08:37 |
*** jaypipes has joined #openstack-cyborg | 11:17 | |
*** Coco_gao has joined #openstack-cyborg | 14:02 | |
Coco_gao | Hi all | 14:02 |
Coco_gao | Good night or good morning~ | 14:02 |
*** Li_Liu has joined #openstack-cyborg | 14:03 | |
Li_Liu | HI Guys | 14:03 |
*** xinran has joined #openstack-cyborg | 14:04 | |
Li_Liu | #startmeeting openstack-cyborg | 14:05 |
openstack | Meeting started Wed Sep 26 14:05:13 2018 UTC and is due to finish in 60 minutes. The chair is Li_Liu. Information about MeetBot at http://wiki.debian.org/MeetBot. | 14:05 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 14:05 |
*** openstack changes topic to " (Meeting topic: openstack-cyborg)" | 14:05 | |
openstack | The meeting name has been set to 'openstack_cyborg' | 14:05 |
Coco_gao | #info Coco_gao | 14:05 |
Li_Liu | #topic Roll Call | 14:05 |
*** openstack changes topic to "Roll Call (Meeting topic: openstack-cyborg)" | 14:05 | |
Li_Liu | #info Li_Liu | 14:05 |
Li_Liu | Who else do we have | 14:06 |
xinran | hi | 14:06 |
sum12 | #info sum12 | 14:07 |
Li_Liu | Hi Xinran | 14:07 |
xinran | hi | 14:07 |
Li_Liu | Is Sundar around? | 14:07 |
xinran | #info xinran | 14:07 |
Li_Liu | I was hoping to discuss some nova related questions with him | 14:08 |
Li_Liu | #topic post-ptg work items | 14:08 |
*** openstack changes topic to "post-ptg work items (Meeting topic: openstack-cyborg)" | 14:08 | |
Li_Liu | Coco, how is the DB evolution going? | 14:09 |
Coco_gao | Since we discussed VARs at the PTG, we still need a VAR table. | 14:10 |
xinran | I have looked at the ptg summary; will we support a device profile in this release? | 14:10 |
Coco_gao | Yes | 14:10 |
Li_Liu | what's the dependency for VAR? | 14:10 |
Coco_gao | we need to support dev_profile in Stein | 14:10 |
Li_Liu | does it involve nova folks? | 14:10 |
Coco_gao | nova only uses VARs, and will update VARs' status by calling os-acc, I think | 14:11 |
xinran | what is VAR | 14:12 |
Coco_gao | We should have a discussion on which fields should remain and which should be removed, as well as the new table fields | 14:12 |
Li_Liu | Do you and XiaoHei need any help with this? | 14:13 |
Coco_gao | we may need Sundar's help on VAR table | 14:14 |
*** wangzhh has joined #openstack-cyborg | 14:14 | |
xinran | what is VAR ? | 14:15 |
Coco_gao | I think I can make a draft, then we can discuss it together. Since Sundar wrote the recent specs on device discovery and interactions with nova, he may have some advice, I think. | 14:16 |
Coco_gao | Virtual Accelerator Record (VAR) | 14:16 |
xinran | Coco_gao: thanks, | 14:17 |
Li_Liu | ok, Sundar is busy with his specs right now most likely | 14:17 |
Li_Liu | I will talk with him and see how to balance his work | 14:17 |
xinran | Seems there will be a change at the db layer? | 14:18 |
Li_Liu | I have a feeling we have too much work depending on him right now | 14:18 |
Li_Liu | xinran, yes. | 14:18 |
Coco_gao | Yes I think so, because he wrote the specs | 14:19 |
xinran | Li_Liu: could you please elaborate on it, or are there any docs concerning that? | 14:19 |
wangzhh | Yeap, most of specs. | 14:19 |
Coco_gao | We should follow the spec right? | 14:20 |
Li_Liu | Regarding the db change, I think Coco_gao is still drafting it | 14:21 |
Li_Liu | but you can find related information here: https://etherpad.openstack.org/p/cyborg-ptg-stein | 14:21 |
Coco_gao | But xiaohe and I told Sundar that he should not write the implementation details in his spec. Whoever writes the code decides the details | 14:21 |
xinran | OK | 14:22 |
xinran | thanks | 14:22 |
Li_Liu | Coco_gao, that's the right way | 14:22 |
Li_Liu | Ok, move on. | 14:23 |
sum12 | Li_Liu: were there any notes taken during the discussion with neutron team during PTG? | 14:23 |
Li_Liu | sum12, there are some notes here | 14:24 |
Li_Liu | https://etherpad.openstack.org/p/fpga-networking | 14:24 |
sum12 | thanks | 14:24 |
Coco_gao | sum12, also https://etherpad.openstack.org/p/cyborg-ptg-stein-summary | 14:25 |
Li_Liu | I saw Xinran has 4 patches waiting :P | 14:26 |
Li_Liu | I will take a look and provide comments | 14:26 |
sum12 | thanks Coco_gao | 14:26 |
Li_Liu | hope to clear them up soon | 14:26 |
xinran | Li_Liu: thank | 14:26 |
xinran | *thanks | 14:27 |
xinran | And I will submit another patch of os-acc soon | 14:27 |
Li_Liu | I am still working on the FPGA programming api. Just solved the glance client problem; should provide an updated patch sometime within the week | 14:27 |
Li_Liu | xinran, you are our super hero right now~~ | 14:27 |
xinran | About parsing the flavor.extra_spec into a format acceptable to the cyborg api | 14:28 |
xinran | Li_Liu: lol | 14:28 |
xinran | But these patches still need ovo land first | 14:28 |
Li_Liu | right, you might need to sync up with Coco and Xiaohei for that | 14:29 |
Li_Liu | Coco_gao wangzhh | 14:30 |
xinran | Li_Liu: yes, I have pulled their patch locally and tested it | 14:30 |
Coco_gao | the DB evolution is a huge change if we merge two tables (accelerators and deployables) as Sundar suggests. Should we keep the old one? | 14:30 |
wangzhh | :) Xiaohei... | 14:30 |
wangzhh | Got it. | 14:30 |
Coco_gao | and evolve to the new version? | 14:31 |
wangzhh | Agree. I suggest keeping the old one for now. | 14:32 |
Coco_gao | everything will change after the db evolution; the ovo and my patch on the device object would also need to be rewritten. | 14:33 |
Li_Liu | I agree, let's keep the old one and decide when to remove it later | 14:34 |
xinran | The old one is the current one? | 14:35 |
Coco_gao | yes | 14:35 |
wangzhh | Coco, not really. Most of them can be reused. And we should have a quick start. | 14:35 |
xinran | I mean these 2 tables, acc and dep | 14:35 |
Coco_gao | yes. | 14:35 |
xinran | Ok, but my placement report patch depends on ovo now | 14:38 |
Li_Liu | ok, I think we are clear on the work items and their dependencies | 14:42 |
Li_Liu | Let's do this :) | 14:42 |
Li_Liu | #topic AoB | 14:43 |
*** openstack changes topic to "AoB (Meeting topic: openstack-cyborg)" | 14:43 | |
Coco_gao | If we don't merge the two tables, then DB evolution is not urgent. Maybe I can do something more urgent right now. | 14:44 |
Li_Liu | Coco_gao, I think xinran is worried that if she's implementing based on the old DB design, once your new DB is merged, she might have to rewrite the code all over again | 14:46 |
*** helenafm has quit IRC | 14:47 | |
Coco_gao | I think so, not only xinran's code. | 14:47 |
Coco_gao | we finish rocky's target first? | 14:47 |
xinran | My code is now based on coco’s ovo patch :) | 14:48 |
Li_Liu | i see | 14:48 |
Coco_gao | Any feedback pls contact me directly, thanks xinran | 14:49 |
Li_Liu | Coco_gao, yes, try to finish up R's left over first | 14:49 |
xinran | So I think it’s ok for me if we use ovo design | 14:49 |
Coco_gao | I think the urgent thing is the docs~~ | 14:51 |
Li_Liu | alright, let's wrap up. | 14:51 |
Li_Liu | Coco_gao I will work with Sundar on that to speed it up | 14:51 |
Coco_gao | I can join from this week, feel free to contact me if I can help. | 14:52 |
Li_Liu | sure | 14:52 |
Li_Liu | thanks Coco | 14:52 |
Li_Liu | have a night night guys | 14:52 |
Li_Liu | #endmeeting | 14:52 |
*** openstack changes topic to "A zuul config error slipped through and caused a pile of job failures with retry_limit - a fix is being applied and should be back up in a few minutes" | 14:53 | |
wangzhh | Installation guide is urgent. | 14:53 |
openstack | Meeting ended Wed Sep 26 14:52:59 2018 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 14:53 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/openstack_cyborg/2018/openstack_cyborg.2018-09-26-14.05.html | 14:53 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/openstack_cyborg/2018/openstack_cyborg.2018-09-26-14.05.txt | 14:53 |
openstack | Log: http://eavesdrop.openstack.org/meetings/openstack_cyborg/2018/openstack_cyborg.2018-09-26-14.05.log.html | 14:53 |
wangzhh | Bye | 14:53 |
Li_Liu | 88 | 14:53 |
xinran | bye | 14:53 |
Coco_gao | Bye | 14:53 |
*** Li_Liu has quit IRC | 14:54 | |
*** munimeha1 has joined #openstack-cyborg | 15:45 | |
*** xinran has quit IRC | 17:02 | |
*** Coco_gao has quit IRC | 17:11 | |
*** wangzhh has quit IRC | 17:23 | |
*** Sundar has joined #openstack-cyborg | 19:33 | |
Sundar | efried: Please ping me when you can | 19:33 |
efried | Sundar: Hi there. | 19:38 |
Sundar | Hi Eric, in the nova spec 603955, you said that having Cyborg claim a device by default (without explicit whitelisting) is a security risk. Can you elaborate? | 19:42 |
Sundar | Line 108 in https://review.openstack.org/#/c/603955 | 19:45 |
Sundar | @efried: I'll be around for another 90 minutes. I'll step out after that and should be back by 5 pm PDT. | 20:26 |
efried | sorry I missed your response earlier. ffr best to tag me | 20:27 |
efried | Not claiming, but exposing | 20:27 |
efried | If you expose, say, the controller that's handling the root disk of the management partition, and someone manages to attach it to the VM, your end user (the owner of the VM) could wreak havoc. | 20:28 |
efried | Sundar: ^^ :) | 20:28 |
Sundar | Ah, yes, will make sure to tag you :) | 20:29 |
Sundar | efried: My statement was "if the operator has installed and configured Cyborg drivers, he has explicitly enabled the devices managed by those drivers." | 20:30 |
Sundar | Why is that a security risk? | 20:30 |
efried | Sundar: that should be fine. | 20:31 |
efried | sorry | 20:31 |
efried | strike that. | 20:31 |
Sundar | efried: NP. Will respond in the spec. Thanks. | 20:31 |
efried | Sundar: as worded, it implies that by installing cyborg, you've enabled the devices | 20:32 |
efried | Sundar: Is the "and configured" part where I would list which devices I wanted cyborg to manage? | 20:32 |
Sundar | Well, "installed Cyborg drivers". Installing a driver is a dleiberate act, right? | 20:33 |
Sundar | *deliberate | 20:33 |
efried | Yes. But installing a driver capable of handling X does *not* mean that I want *all* instances of X to be handled by cyborg and made available for attachment to VMs. | 20:33 |
Sundar | I don't follow why that is a security risk. You may want to disable some, but that could be an explicit blacklist | 20:34 |
Sundar | If the admin doesn't want all those devices, why would he install them? | 20:35 |
efried | I'm no expert, but that's not how security works. | 20:36 |
Sundar | If some of those devices come pre-installed in the physical servers, like built-in NICs, I can understand. | 20:37 |
Sundar | I don't think GPUs are pre-installed in data center class servers. (Desktops are a different story) | 20:37 |
*** edmondsw has joined #openstack-cyborg | 20:39 | |
efried | Sundar: I think it's less of a matter of what's *likely* to happen, and more about it just being a bad policy in general. | 20:39 |
efried | edmondsw: you're a security guy... | 20:39 |
efried | we're talking about whether, by installing a cyborg driver, you should automatically expose any device that driver can handle. | 20:39 |
efried | ...expose for attachability to VMs. | 20:39 |
efried | ...and if you want to reserve/restrict one, you have to explicitly blacklist it. | 20:40 |
edmondsw | yeah, that's probably not a great idea | 20:40 |
*** munimeha1 has quit IRC | 20:40 | |
Sundar | efried, edmondsw: If some of those devices come pre-installed in the physical servers, like built-in NICs, I can understand the need to blacklist by default. That is not the case for accelerators in the data center. | 20:41 |
efried | I mean, I'm having a tough time coming up with a practical scenario where it could cause you a real problem, it just feels wrong. | 20:41 |
edmondsw | do we think there will never be a case where it makes sense for an accelerator to be owned by the hypervisor itself? | 20:42 |
Sundar | edmondsw: Yes. That could be an explicit Cyborg configuration too. | 20:43 |
Sundar | Isn't one of the pain points of PCI whitelisting that there is so much operator labor required? That may be unavoidable for common elements like NICs, but hopefully can be avoided in most cases for accelerators. | 20:45 |
edmondsw | but accelerators are going to become more and more common | 20:45 |
edmondsw | could we compromise with a conf setting that allows the user to specify whether they want whitelist-by-default behavior or not? | 20:46 |
Sundar | edmondsw: That seems reasonable. That is a single setting for an entire deployment. | 20:47 |
Sundar | We still need that to coexist with Nova's PCI whitelists. The accelerator's PCI ID may be in the whitelist, due to operator error, or because an upgrade moved the device from Nova to Cyborg, or whatever | 20:49 |
Sundar | We can help that by providing a tool that the operator runs, wherein he feeds the Nova's PCI whitelist file and the output says which PCI IDs are in conflict | 20:49 |
Sundar | edmondsw, efried: What do you think? | 20:50 |
efried | Sundar: That sounds like an ambitious idea, but I'm sure it would be welcomed if you could pull it off :) | 20:51 |
efried | TBC, coexisting with the nova PCI whitelist in the sense that the set of devices managed by each are mutually exclusive. | 20:52 |
edmondsw | I'm not sure every conflict would be an error... but they could flick that conf setting to *not* whitelist-by-default in cyborg if that's the case | 20:52 |
edmondsw | I put that badly... I meant it wouldn't necessarily be a nova error... could be that they need to blacklist it in cyborg | 20:53 |
Sundar | edmondsw: Sure. | 20:54 |
Sundar | efried: Do you see practical issues with such a tool? It seems to me that the tool is merely scanning Cyborg's db for PCI IDs and the operator-provided PCI whitelist file for conflicts. I suppose wild cards need careful handling | 20:55 |
efried | heh, yeah, good luck duplicating the logic nova uses to process that whitelist. | 20:55 |
Sundar | efried: I'm probably being naive. How does Nova handle conflicts within the whitelist? Say one entry enables a device and another blacklists it. | 20:57 |
efried | Sundar: Don't know off the top of my head. | 20:57 |
Sundar | efried: NP. I only intend this to be a helper tool, not some official way that is fool-proof | 20:58 |
efried | Here are the notes I scribbled down a couple years ago when I was pawing through the code trying to understand how it's "working": https://etherpad.openstack.org/p/powervm-pci-passthrough-notes | 20:58 |
efried | gosh, maybe it was only a year ago | 20:59 |
Sundar | efried: Thanks for the notes. | 21:02 |
Sundar | efried, edmondsw: Thanks for the good feedback. :) | 21:02 |
*** Sundar has quit IRC | 23:05 | |
*** openstackgerrit has quit IRC | 23:49 |
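
The helper tool Sundar describes (compare the PCI IDs Cyborg manages against the operator's Nova PCI whitelist and report overlaps) could be sketched roughly as below. This is only an illustration under stated assumptions, not Cyborg or Nova code: `PciId`, `matches`, and `find_conflicts` are hypothetical names, whitelist entries are assumed to be already parsed into vendor/product pairs, and `*` is assumed to be the only wildcard that matters. It does not reproduce Nova's actual whitelist processing (address or device-name matching, for example), which efried warns is hard to duplicate.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, NamedTuple


class PciId(NamedTuple):
    """Hypothetical vendor/product pair; values are hex strings like '8086'."""
    vendor_id: str
    product_id: str


def matches(entry: PciId, device: PciId) -> bool:
    """Return True if a whitelist entry (which may contain '*') covers the device."""
    return (fnmatchcase(device.vendor_id.lower(), entry.vendor_id.lower())
            and fnmatchcase(device.product_id.lower(), entry.product_id.lower()))


def find_conflicts(nova_whitelist: Iterable[PciId],
                   cyborg_devices: Iterable[PciId]) -> List[PciId]:
    """Return the Cyborg-managed devices that a Nova whitelist entry also covers."""
    entries = list(nova_whitelist)
    return [dev for dev in cyborg_devices
            if any(matches(entry, dev) for entry in entries)]


if __name__ == "__main__":
    # Illustrative values only; a real tool would read these from the
    # operator-provided whitelist file and from Cyborg's database.
    whitelist = [PciId("8086", "*"), PciId("10de", "1db4")]
    devices = [PciId("8086", "09c4"), PciId("10DE", "1db4"), PciId("1172", "e003")]
    for dev in find_conflicts(whitelist, devices):
        print("conflict:", dev.vendor_id, dev.product_id)
```

As edmondsw notes, a reported overlap is not necessarily a Nova-side error; the operator might instead need to blacklist the device in Cyborg, so a tool like this would only flag candidates for review.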