Tuesday, 2020-06-02

*** songwenping_ has joined #openstack-cyborg [00:36]
*** songwenping__ has quit IRC [00:39]
*** songwenping__ has joined #openstack-cyborg [00:50]
*** songwenping_ has quit IRC [00:53]
*** songwenping_ has joined #openstack-cyborg [01:27]
*** songwenping__ has quit IRC [01:29]
*** tetsuro has joined #openstack-cyborg [04:22]
*** links has joined #openstack-cyborg [05:28]
*** Yumeng has joined #openstack-cyborg [06:07]
*** Sundar has joined #openstack-cyborg [06:08]
Sundar: Hello all [06:08]
brinzhang: Sundar means adding two Nova APIs to support binding and unbinding ARQs? [06:08]
*** chenke has joined #openstack-cyborg [06:08]
songwenping_: Hello all [06:09]
chenke: Hi, I moved here. [06:09]
brinzhang: hello all [06:09]
Sundar: brinzhang -- That is not clear to me. If we can do HTTP PATCH on the server object, we can apply or remove device profiles. [06:10]
*** xinranwang has joined #openstack-cyborg [06:10]
Sundar: So I think it is one API only [06:10]
brinzhang: Yeah, one or two are both OK for me [06:10]
Sundar: Second problem is with libvirt/qemu -- I don't think we can add PCI interfaces to a running VM. We have to stop it first, regenerate the doain XML and then restart the server [06:11]
Sundar: *domain XML [06:11]
*** s_shogo has joined #openstack-cyborg [06:12]
Sundar: What is the openstack CLI command you are thinking of? [06:12]
Sundar: One option is to avoid all this, and just do a resize with a different flavor, which has a different device profile [06:13]
Sundar: Then no need to change Nova API at all [06:13]
songwenping_: Sundar, I have tested adding and removing PCI interfaces on a running VM, like this: https://review.opendev.org/#/c/729945/5/nova/virt/libvirt/driver.py [06:13]
Sundar: songwenping_ : How does this work? How will libvirt change the domain XML on the fly? [06:14]
brinzhang: Sundar, I mean I want to add two Nova APIs to support binding and unbinding ARQs from Nova, so that we can use "nova bind-accelerators and/or unbind-accelerators" [06:14]
*** shaohe_feng has joined #openstack-cyborg [06:15]
*** haibin-huang has joined #openstack-cyborg [06:15]
brinzhang: this needs a spec for the Nova team [06:15]
Sundar: brinzhang: The Nova CLI is old. In the new openstack CLI, we say: $ openstack server create --flavor myflavor ..., where myflavor has the device profile. What is the openstack CLI you plan to use for hot attach/detach? [06:16]
songwenping_: `guest.detach_device(hostdevPCI, persistent=True, live=True)' this function with the "live=True" param can work for a running VM. [06:17]
songwenping_: we can dumpxml the VM and see the hostdev PCI attached. [06:17]
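For readers unfamiliar with the hostdev element mentioned above, here is a minimal sketch of the kind of device XML that a live attach or detach operates on. The helper name and the PCI address are hypothetical; only the element layout follows the libvirt domain XML format. The actual hot-unplug call is not executed here: with libvirt-python it would presumably be something like virDomain.detachDeviceFlags() with the live and config flags, which is what the Nova call quoted above is assumed to wrap.

```python
import xml.etree.ElementTree as ET


def hostdev_pci_xml(domain: int, bus: int, slot: int, function: int) -> str:
    """Build a libvirt <hostdev> element for a PCI passthrough device.

    Hypothetical helper for illustration; it only produces the XML string.
    """
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain:04x}",
        bus=f"0x{bus:02x}",
        slot=f"0x{slot:02x}",
        function=f"0x{function:x}",
    )
    return ET.tostring(hostdev, encoding="unicode")


# Assumed example address 0000:06:00.0 (not taken from the discussion).
device_xml = hostdev_pci_xml(0x0000, 0x06, 0x00, 0x0)
print(device_xml)

# With a libvirt connection, this string is what one would pass to a live
# attach/detach call; that step is left out because it needs a running VM.
```

After a live attach, the same element should show up in the output of "virsh dumpxml" for the VM, which is the check songwenping_ describes.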
brinzhang: Sundar, they are the same. We should complete the CLI in novaclient first and then support it in openstackclient, otherwise we can't complete this feature in Nova [06:17]
brinzhang: Sundar: binding/unbinding will be an independent operation in nova [06:18]
brinzhang: For now we don't plan to support the resize operation, that would be out of my control [06:19]
brinzhang: I think [06:19]
Sundar: brinzhang: will you pass a device profile to this CLI? If so, will one VM have 2 or more associated device profiles? [06:19]
brinzhang: Sundar: maybe you are right, but I am not sure I can answer your question. I don't know whether it needs to pass device_profile to the CLI [06:21]
Sundar: songwenping_: Great. I found this: https://libvirt.org/pci-hotplug.html Thanks for the pointer. [06:22]
brinzhang: If we want to bind an ARQ to an instance whose flavor has no accel:device_profile, maybe we should pass it. [06:23]
brinzhang: For unbind, I think we should check whether the instance has the accel:device_profile, and then accept or reject the unbind. [06:24]
songwenping_: Yeah, right Sundar. Libvirt has supported hotplug for PCI devices. [06:25]
Sundar: IMHO, it is better to support resize: it covers hot-add and is more generally useful also. It allows us to keep the model that one VM has one device profile. [06:25]
Sundar: It is a standard Nova operation: so less resistance, and maybe we can even get help from Nova developers [06:26]
shaohe_feng: Libvirt has supported hotplug for PCI devices for a long time -_- [06:27]
Sundar: brinzhang: on a resize, does Nova scheduler get involved? If so, can it pick a different host? [06:27]
brinzhang: Sundar, resize may need to change host, that is not easy to control. I know it should be supported, but it needs time [06:27]
xinranwang: I think it will do re-schedule [06:27]
Sundar: Libvirt does not support it for the default config: looks like you have to specify a different cpu model q35. [06:27]
Sundar: brinzhang: we already have the scheduler flow for device profiles and ARQs. Why do you think it is difficult? [06:29]
Yumeng: brinzhang: I think unshelve will also do re-schedule, right? [06:30]
brinzhang: unshelve just need to re-create ARQs, it's easy to get [06:31]
brinzhang: I have not looked into resize, so I cannot answer clearly. [06:32]
Sundar: We can bring this up in the Nova discussion. [06:32]
Sundar: Anyway, we need to talk to them about other VM ops for accelerators. [06:33]
shaohe_feng: Sundar: a question about scheduling. You have brought up a discussion about this: [06:33]
brinzhang: This release we have rebuild/evacuate, suspend/resume, shelve/unshelve, I think it's enough to do in V [06:33]
shaohe_feng: add a new filter for dynamic programming [06:34]
shaohe_feng: let cyborg do the filter, right? [06:34]
Sundar: shaohe_feng: not sure what you are referring to [06:34]
brinzhang: In https://review.opendev.org/#/c/729945/5/api-guide/source/accelerator-support.rst we record the operations supported in cyborg [06:34]
*** minmin has joined #openstack-cyborg [06:35]
Yumeng: brinzhang: I think we can also start to support resize, chenke can investigate it. Anyway, we can start discussing with nova during the session. [06:36]
brinzhang: after chenke talks with the nova team, I think we can get more info about resize. [06:37]
shaohe_feng: such as, currently the function of the FPGA is ovs with 4 VFs, but no VM uses it. And now no more FPGAs are available, so we can select this one ovs FPGA to be IPsec with 2 NICs, and change its function. [06:38]
shaohe_feng: ^ Sundar so this needs a scheduler improvement, right? [06:38]
*** zhurong has joined #openstack-cyborg [06:39]
shaohe_feng: and you brought up adding a new filter. [06:39]
shaohe_feng: I remember. [06:39]
Sundar: shaohe_feng: That doesn't involve the Nova scheduler. We can define a device profile with a trait that selects that type of FPGA, and an accel:bitstream that refers to ipsec. Then Placement may return some FPGA with ovs image if it is free. But Cyborg will notice it does not have ipsec and do the programming. [06:40]
Sundar: What we don't have is this: [06:41]
shaohe_feng: yes, that's the current flow in cyborg at present [06:42]
Sundar: Say, we have two free FPGAs, one with ovs and one with ipsec. In this case, we want the scheduler to pick the one with ipsec preferentially. That is not there today. [06:42]
Sundar: The scheduler has a weigher but it selects among hosts, not among allocation candidates from Placement [06:43]
shaohe_feng: you cannot put ovs or ipsec in the trait [06:43]
Yumeng: brinzhang: I was curious, given "unshelve just need to re-create ARQs, it's easy to get", does that mean the shelve operation does not actually release the accelerator in the hypervisor, and the accelerator is still a claimed allocation? Is that why there is no re-schedule in unshelve? [06:43]
shaohe_feng: but the weigher also does not support FPGA, right? [06:44]
Yumeng: please correct me if I am wrong [06:44]
*** Mudit has joined #openstack-cyborg [06:44]
Sundar: That is right. At least, if you do, it makes it too specific -- if you ask for ipsec and no FPGA with ipsec is free, the request will fail. No reprogramming. [06:44]
Sundar: What we really want is preferred traits: [06:44]
Sundar: We should be able to say I prefer the resource provider have this trait but, if not, give me something [06:44]
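The "preferred traits" idea described above can be illustrated with a small standalone sketch. It is not Placement or Nova code: the Candidate class, the pick() function and the trait names are all hypothetical, chosen only to show required traits acting as a hard filter while preferred traits merely influence ranking.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical stand-in for one allocation candidate (a device RP)."""
    name: str
    traits: set = field(default_factory=set)


def pick(candidates, required, preferred):
    """Return the best candidate, or None if none has all required traits."""
    # Required traits are a hard filter: a candidate missing one is dropped.
    eligible = [c for c in candidates if required <= c.traits]
    if not eligible:
        return None
    # Preferred traits only affect ranking. max() returns the first of
    # several equally good candidates, so input order breaks ties.
    return max(eligible, key=lambda c: len(preferred & c.traits))


# Assumed trait names, for illustration only.
fpgas = [
    Candidate("fpga-ovs", {"CUSTOM_FPGA", "CUSTOM_FUNCTION_OVS"}),
    Candidate("fpga-ipsec", {"CUSTOM_FPGA", "CUSTOM_FUNCTION_IPSEC"}),
]

best = pick(fpgas, {"CUSTOM_FPGA"}, {"CUSTOM_FUNCTION_IPSEC"})
print(best.name)

# If only the ovs FPGA is free, the request still succeeds and the device
# would then need reprogramming, which is the fallback Sundar asks for.
fallback = pick(fpgas[:1], {"CUSTOM_FPGA"}, {"CUSTOM_FUNCTION_IPSEC"})
print(fallback.name)
```

Putting the function name into the required set instead would reproduce the failure case mentioned above: with no ipsec FPGA free, pick() returns None and nothing is reprogrammed.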
Sundar: Alex Xu knows about this [06:44]
shaohe_feng: everyone also knows about it [06:45]
shaohe_feng: but we want to improve it [06:45]
Sundar: shaohe_feng: There are many weighers, and we can write custom weighers too. None of them cover FPGA, because they deal with hosts, not device RPs. [06:45]
shaohe_feng: and everyone knows placement is so weak :') [06:45]
shaohe_feng: we can consider how to improve it. [06:46]
Sundar: What I meant is, Alex Xu knows Cyborg wants preferred traits. I think he was looking into implementing it at some point, but he got busy with other things [06:46]
shaohe_feng: yes, we can consider writing a custom weigher for FPGA [06:47]
Sundar: Nope, that won't help, as I said [06:47]
shaohe_feng: he did not give us a good suggestion [06:47]
shaohe_feng: what he knows, others also know [06:47]
shaohe_feng: we want a good solution [06:48]
shaohe_feng: not the common knowledge [06:48]
brinzhang: Yumeng: when we shelve an instance we delete the instance's ARQs from nova and cyborg; when we unshelve we should create them again, so it needs to re-schedule in unshelve [06:48]
shaohe_feng: yes, we can improve placement, every one can improve it. but we can let upstream accept it. [06:49]
brinzhang: Yumeng: the PoC code is here, you can review https://review.opendev.org/#/c/729563/2 [06:49]
shaohe_feng: such as SDL can be used for placement, but it is a big change for placement, right? A vendor can improve it in this way, but it is difficult in upstream, right? [06:50]
Sundar: If we can get preferred traits, that would help. Shall we add that to Nova discussion? [06:51]
brinzhang: we will introduce the service version to support these operations, controlled in https://review.opendev.org/#/c/715326/13/nova/compute/api.py@288 [06:51]
shaohe_feng: preferred is just a case of SDL. SDL can cover it. [06:52]
shaohe_feng: and preferred cannot resolve the complex scenarios :') [06:54]
Sundar: Yumeng: ^ [06:54]
Sundar: brinzhang: ^ [06:54]
Yumeng: Sundar, yes sure. I will add operations to Nova etherpad [06:55]
shaohe_feng: Yumeng no, not this release. [06:55]
Sundar: No I mean discuss preferred traits [06:55]
shaohe_feng: the next release. [06:55]
shaohe_feng: I just want to know some history from Sundar [06:56]
Yumeng: I don't think we have enough time to discuss preferred traits. [06:56]
shaohe_feng: but not preferred history [06:56]
Yumeng: yes next release. [06:56]
brinzhang: I found the nova PTG time is too late for China, I will attend it asap, [06:56]
shaohe_feng: about the filter and weigher discussion [06:56]
Yumeng: this release, we discuss nova operations and smartnic integration [06:56]
brinzhang: Yumeng: you can ping me when this will start [06:56]
Yumeng: yes, nova session is 10:00-11:00pm Beijing time Friday. [06:57]
brinzhang: Yumeng: ack [06:58]
Sundar: Ok, anything else for me? :) [06:58]
Yumeng: Nope for today. Thank you Sundar ! [06:59]
Sundar: Thank you, Yumeng and all! Have a good day. [06:59]
brinzhang: Sundar, hope you have a good day, bye [06:59]
Yumeng: Have a good night. BTW, will you come tomorrow? [06:59]
Sundar: Yumeng, am I required for some topic tomorrow? [06:59]
Sundar: New drivers and driver programs -- sounds interesting [07:00]
Yumeng: Driver program API support? API attribute? [07:00]
Sundar: I'll try to join for 1 hour, but I have a long day tomorrow [07:00]
Sundar: Bye for now :) [07:01]
Yumeng: ok. Thanks! Then I will move New drivers and driver programs first, others later. [07:01]
Yumeng: Sundar: bye. [07:01]
*** Sundar has quit IRC [07:01]
brinzhang: Is everyone in zoom Chinese? [07:02]
Yumeng: hi. let's take a break for 10 mins. [07:02]
Yumeng: brinzhang: nope. We have shogo and one guest. [07:03]
Yumeng: after ten mins shall we go back to zoom? [07:03]
Mudit: Hello! from India this is Mudit [07:03]
chenke: is your phone ok with zoom? brin [07:03]
chenke: hello dudit [07:04]
chenke: hello mudit [07:04]
brinzhang: Yeah, I asked but no one responded, I think my network is not very good [07:04]
Yumeng: hello Mudit! very welcome! [07:04]
brinzhang: welcome Mudit^ [07:04]
chenke: you can ask again, if I am here, I will reply to you. [07:04]
s_shogo: Sure, I can't understand Mandarin,,,, thanks. [07:05]
haibin-huang: Do we have a cyborg-api for programming an FPGA? [07:05]
Yumeng: hi Mudit: are you interested in using accelerator devices? [07:05]
Mudit: yes looking at smart nic specifically [07:06]
Yumeng: what kind of smart nic? SRIOV or others? [07:06]
Mudit: SRIOV basic + FPGA based or ARM cores too [07:07]
s_shogo: haibin-huang cyborg has no dynamic programming API currently, that is under development in this patch. https://review.opendev.org/#/c/698190/ [07:07]
haibin-huang: ok, I will see it, thank you [07:08]
haibin-huang: when will this patch merge to the master branch [07:08]
brinzhang: s_shogo: you should update this patch and fix the comments, let's merge this in the V release [07:09]
Mudit: does cyborg address LCM of FPGAs as well [07:09]
s_shogo: haibin-huang I'll restart this work after the PTG, and I hope to complete it in June :) [07:09]
s_shogo: brinzhang yes, that's right. [07:09]
brinzhang: s_shogo: I think you can, come on ^^ [07:10]
haibin-huang: cool, s_shogo [07:10]
Yumeng: Mudit: sounds cool. cyborg is trying to implement SRIOV in this release. LCM of FPGAs is not supported yet. [07:11]
Mudit: thanks yumeng [07:12]
Yumeng: currently supported Intel FPGAs https://github.com/openstack/cyborg/blob/master/cyborg/accelerator/drivers/fpga/intel/driver.py [07:12]
*** shaohe_feng has quit IRC [07:12]
Yumeng: Mudit: if possible, can you share your cases? just add it to etherpad [07:13]
Mudit: yes sure I will [07:13]
Yumeng: we want to hear feedback from everywhere, vendors or operators.^^ [07:13]
Yumeng: Thanks Mudit! [07:14]
Mudit: does Cyborg treat a smart NIC with FPGA as another class than say standalone FPGA [07:14]
*** minmin has quit IRC [07:21]
*** links has quit IRC [07:49]
*** links has joined #openstack-cyborg [07:53]
s_shogo: Yumeng: related to tomorrow's topic, ( [Jun 3, 7:00 - 7:15 UTC] Implementation for Device Enable/Disable API ) [08:01]
s_shogo: important topics :) [08:01]
brinzhang: xinranwang, Yumeng: It seems Placement can filter data by resource name https://docs.openstack.org/api-ref/placement/?expanded=list-resource-providers-detail,list-resource-classes-detail,list-resource-provider-inventories-detail#list-resource-classes [08:08]
brinzhang: xinranwang: if we want to get all cyborg resources, I think it's difficult, but I think we can get one type of class and then make the diff in cyborg, just an idea [08:09]
*** Mudit has quit IRC [08:16]
*** tetsuro has quit IRC [08:42]
*** tetsuro has joined #openstack-cyborg [08:42]
*** tetsuro has quit IRC [09:29]
*** s_shogo has quit IRC [09:49]
*** links has quit IRC [10:16]
*** haibin-huang has quit IRC [10:20]
*** links has joined #openstack-cyborg [10:21]
*** brinzhang has quit IRC [11:29]
*** tetsuro has joined #openstack-cyborg [11:56]
*** dansmith has quit IRC [12:00]
*** dansmith has joined #openstack-cyborg [12:01]
openstackgerrit: zhangboye proposed openstack/cyborg-specs master: Add py38 package metadata  https://review.opendev.org/732549 [12:04]
openstackgerrit: zhangboye proposed openstack/python-cyborgclient master: Fix hacking min version to 3.0.1  https://review.opendev.org/732554 [12:08]
*** xinranwang has quit IRC [12:29]
*** Yumeng has quit IRC [13:54]
*** tetsuro has quit IRC [14:03]
*** chenke has quit IRC [14:09]
*** links has quit IRC [16:39]
openstackgerrit: Hervé Beraud proposed openstack/cyborg master: Stop to use the __future__ module.  https://review.opendev.org/732827 [18:09]
openstackgerrit: Hervé Beraud proposed openstack/python-cyborgclient master: Stop to use the __future__ module.  https://review.opendev.org/732918 [18:47]
