*** s_shogo has joined #openstack-cyborg | 01:15 | |
*** s_shogo has quit IRC | 01:36 | |
*** s_shogo has joined #openstack-cyborg | 01:59 | |
*** openstackgerrit has joined #openstack-cyborg | 02:06 | |
openstackgerrit | chenker proposed openstack/cyborg master: P4: Fix pep8 error in cyborg/api https://review.opendev.org/679172 | 02:06 |
openstackgerrit | Xinran WANG proposed openstack/cyborg master: Fill "driver_name" field in Deployable object https://review.opendev.org/677952 | 02:20 |
*** openstackgerrit has quit IRC | 02:37 | |
*** openstackgerrit has joined #openstack-cyborg | 02:46 | |
openstackgerrit | YumengBao proposed openstack/cyborg master: enable branch selection in devstack installation https://review.opendev.org/669303 | 02:46 |
*** shaohe_feng has joined #openstack-cyborg | 02:56 | |
openstackgerrit | chenker proposed openstack/cyborg master: P5: Fix pep8 error in cyborg/accelerator https://review.opendev.org/679175 | 02:59 |
*** Coco_gao_ has joined #openstack-cyborg | 03:04 | |
Coco_gao_ | Hi all | 03:04 |
*** Sundar has joined #openstack-cyborg | 03:04 | |
Coco_gao_ | Hi Sundar | 03:04 |
Coco_gao_ | Good evening | 03:04 |
Sundar | Hi Coco_gao_ | 03:04 |
shaohe_feng | Good evening. | 03:05 |
Sundar | #startmeeting openstack-cyborg | 03:05 |
openstack | Meeting started Thu Aug 29 03:05:12 2019 UTC and is due to finish in 60 minutes. The chair is Sundar. Information about MeetBot at http://wiki.debian.org/MeetBot. | 03:05 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 03:05 |
*** openstack changes topic to " (Meeting topic: openstack-cyborg)" | 03:05 | |
openstack | The meeting name has been set to 'openstack_cyborg' | 03:05 |
Coco_gao_ | #info Coco_gao_ | 03:05 |
Sundar | Hi all | 03:05 |
shaohe_feng | morning Coco_gao_ | 03:05 |
s_shogo | Hi all | 03:05 |
Sundar | #topic Attendance | 03:05 |
Coco_gao_ | morning shaohe | 03:05 |
*** openstack changes topic to "Attendance (Meeting topic: openstack-cyborg)" | 03:05 | |
*** chenke has joined #openstack-cyborg | 03:05 | |
Sundar | #info SUndar | 03:05 |
s_shogo | #info s_shogo | 03:05 |
*** Yumeng has joined #openstack-cyborg | 03:05 | |
chenke | Hi~ | 03:05 |
Sundar | Hi all | 03:05 |
Coco_gao_ | Hi chenke | 03:05 |
chenke | #info chenke | 03:05 |
shaohe_feng | #info shaohe_feng | 03:06 |
Yumeng | #info Yumeng | 03:06 |
yikun | #info yikun | 03:06 |
Sundar | Agenda: https://wiki.openstack.org/wiki/Meetings/CyborgTeamMeeting#Agenda | 03:06 |
chenke | Hi Coco_gao_ | 03:06 |
Sundar | Python 3: Since OpenStack Train release has some Python 3 goals, due by Milestone 3, and it seems that we are close to fixing Py3 issues for Cyborg, | 03:07 |
Sundar | I have requested s_shogo to make Python 3 tests as a voting job in Zuul. | 03:08 |
Sundar | Any objections or comments? | 03:08 |
*** wangzhh has joined #openstack-cyborg | 03:08 | |
*** chunxiu has joined #openstack-cyborg | 03:08 | |
Sundar | I'll take the silence as agreement. ;) There were requests for fixing Python 3 in the cyborg client too. Luckily, it has taken only 1 patch so far, so we don't need to spend much time on it. | 03:10 |
chenke | +1 | 03:10 |
s_shogo | I'll do the py3 work in cyborg client, too. | 03:11 |
chenke | good job | 03:11 |
chenke | I have modified the tox.ini default env to support py36 and py37 | 03:12 |
Coco_gao_ | thank you | 03:12 |
Sundar | s_shogo: The catch is, the current client is for v1 API code and not based on the openstacksdk method. Bringing it to v2 is more important, right? | 03:12 |
Coco_gao_ | s_shaogo | 03:12 |
Coco_gao_ | s_shogo | 03:12 |
wangzhh | Cool. | 03:12 |
shaohe_feng | https://review.opendev.org/#/c/673228/ | 03:12 |
shaohe_feng | this is a python3 issue fix for client | 03:13 |
Sundar | But somebody else proposed a patch and it got merged. | 03:13 |
s_shogo | Sundar: I think so, My openstackSDK patch is made for the v2 Deployable API, now. | 03:13 |
s_shogo | And the P5-P9 patches don't include the migration code for the "Deployable" API from v1 to v2. | 03:14 |
Sundar | s_shogo: Great. Please add device profiles, as that is more important IMHO. Operators need to create device profiles to use Cyborg, but doing that with curl is not easy | 03:15 |
Coco_gao_ | agree, Sundar | 03:15 |
Sundar | As 2nd priority, I'd say devices -- that will give an inventory of accelerator devices in the cluster | 03:15 |
Sundar | IMHO, when devices are asked for, we can return the components like deployables and attributes, so the client gets a full picture | 03:16 |
shaohe_feng | yes, a client is more friendly than curl | 03:16 |
s_shogo | Regarding the client: the deadline for openstackSDK commits seems to be near, so I would like to begin committing to it before the APIv2 patches merge. | 03:17 |
Sundar | Yes, makes sense | 03:17 |
Sundar | Thanks, s_shogo! | 03:17 |
Sundar | The main thing that is holding me back is that I am testing P5-P9 with the notification and Placement report patches. Plus, Nova code changes create a merge conflict for me. | 03:18 |
Sundar | Once those are resolved, hope we can merge the P5-P9 patches | 03:18 |
Sundar | Any other comments on the client, anybody? | 03:19 |
shaohe_feng | yes, async job depends on P5-P9 | 03:19 |
s_shogo | My assumption is that the python-cyborg client and openstacksdk work could be completed before the Train release, | 03:19 |
shaohe_feng | great | 03:19 |
s_shogo | but I'm worried that my test code is insufficient, so please review it in the following patches, and help if necessary. | 03:19 |
Sundar | shaohe_feng: Agreed. I'll expedite as much as I can. | 03:19 |
Sundar | s_shogo: Agreed, we'll help for sure | 03:20 |
s_shogo | Thanks , Sundar | 03:20 |
shaohe_feng | maybe the test code can be added later. | 03:20 |
Sundar | wangzhh: Thanks for proposing the RBAC patch. I had some concerns/questions in the patch. Please take a look. | 03:20 |
shaohe_feng | first, let the client work. | 03:20 |
Coco_gao_ | s_shogo, thank you . We will review the code. | 03:21 |
wangzhh | Yep. I have updated my code. May commit after meeting. | 03:21 |
Sundar | Thanks, wangzhh | 03:21 |
s_shogo | shaohe_feng : OK, I'll do that preferentially. | 03:22 |
Sundar | shaohe_feng: Part of the issue is that some Nova developers want to test Cyborg code with Nova code in their env. Also, we need to show tempest working end-to-end. | 03:22 |
Sundar | Anybody else trying out the Placement report? With GPUs, AI chip, etc.? | 03:23 |
Coco_gao_ | What's the remaining work for tempest? | 03:23 |
shaohe_feng | yes, tempest can eliminate their concerns | 03:23 |
Sundar | Coco_gao_: It is mostly to get the patches to work together, I think | 03:24 |
Sundar | Xinran's patches look good IMO. Trying to make sure they work with P5-P9 | 03:24 |
Yumeng | I have tried the Placement report With GPUs | 03:24 |
Sundar | Yumeng: Good to know | 03:25 |
Sundar | #topic Nova functional tests | 03:25 |
*** openstack changes topic to "Nova functional tests (Meeting topic: openstack-cyborg)" | 03:25 | |
Sundar | There was talk at the PTG that we should propose functional tests for Nova, which mock the Cyborg API in a test fixture, and use that to test Nova patches | 03:26 |
Sundar | They seem to cover a few more scenarios than unit tests and tempest | 03:27 |
Coco_gao_ | mock cyborg API's return? | 03:27 |
chenke | I agree we need to import functional test for nova. | 03:27 |
Sundar | Coco_gao_: Yes | 03:28 |
Sundar | We have an entry in the Storyboard too. I have not seen any comments of late, but there is concern that it may come up at the last moment | 03:28 |
Sundar | Since there is lots of stuff in Nova runway, it can be tough to get a 2nd look if this issue comes up | 03:29 |
Sundar | Do we have any volunteers for writing Nova functional tests? I'll help as much as I can | 03:29 |
Sundar | Please think it over and LMK if you can. | 03:31 |
Sundar | shaohe_feng: Do you want to bring up the discussion about ARQ states and transitions, as followup? Or is it settled? | 03:32 |
shaohe_feng | yes | 03:33 |
shaohe_feng | one thing is: who deletes the ARQ | 03:33 |
shaohe_feng | when the delete API tags the state as delete_pending? | 03:34 |
Sundar | There is Nova code to delete the ARQ in some error cases and when VM is terminated | 03:34 |
shaohe_feng | maybe it is still in the bind process | 03:35 |
shaohe_feng | should the bind process delete it when it finds the state is delete_pending? | 03:35 |
Sundar | Yes. In that case, IMHO, it is best to let the bind complete and the traits get updated in Placement, and then unbind/delete the ARQ | 03:36 |
Sundar | If we try to interrupt FPGA programming, bad things can happen | 03:36 |
shaohe_feng | we will not add any rollback for bind in this release. just go through the whole process, even when deleting. | 03:36 |
Sundar | Agreed | 03:36 |
shaohe_feng | OK. | 03:37 |
Coco_gao_ | OK | 03:37 |
shaohe_feng | any state transition should be a transaction. | 03:37 |
Sundar | Yes, db transaction | 03:38 |
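One way to read "any state transition should be a db transaction" is as a compare-and-swap update: the row changes only if it is still in the expected old state, and the change commits atomically. The following is a minimal sketch of that idea using the standard-library `sqlite3` module, not Cyborg's actual SQLAlchemy code. The table name `arqs`, the state names, and the `ALLOWED` transition set are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical transition table; the state names are illustrative only
# and are not taken from the Cyborg code base.
ALLOWED = {
    ("initial", "bind_started"),
    ("bind_started", "bound"),
    ("bind_started", "delete_pending"),
    ("bound", "delete_pending"),
}


def transition(conn, uuid, old, new):
    """Move one ARQ row from `old` to `new` atomically.

    Returns True only if the row was still in `old` when the update ran.
    """
    if (old, new) not in ALLOWED:
        return False
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE arqs SET state = ? WHERE uuid = ? AND state = ?",
            (new, uuid, old),
        )
    return cur.rowcount == 1


# Illustrative in-memory database with a single ARQ row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE arqs (uuid TEXT PRIMARY KEY, state TEXT)")
conn.execute("INSERT INTO arqs VALUES ('arq-1', 'initial')")
conn.commit()
```

Because the `WHERE` clause includes the expected old state, a second caller racing on the same ARQ would see `rowcount == 0` and could back off instead of overwriting the first caller's transition.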
*** xinranwang has joined #openstack-cyborg | 03:38 | |
shaohe_feng | seems there is a state machine in oslo lib | 03:39 |
Sundar | Any other issue, shaohe_feng? | 03:39 |
shaohe_feng | we will not introduce it in this release | 03:39 |
Sundar | Ok by me. What are the benefits of using that? | 03:40 |
shaohe_feng | because I need time to read up on it. | 03:40 |
shaohe_feng | do not look into it at present. | 03:40 |
Sundar | ok | 03:40 |
shaohe_feng | maybe after the whole flow code is finished | 03:40 |
shaohe_feng | we can have a look at the pros and cons | 03:41 |
Sundar | Sure. We'll trust your judgement on this :) | 03:41 |
shaohe_feng | another thing: should the async job time out? | 03:41 |
Sundar | On a different note, I am seeing this issue for allocating attach handles: https://opendev.org/openstack/cyborg/src/branch/master/cyborg/db/sqlalchemy/api.py#L269 The in_use field does not get written to db | 03:42 |
shaohe_feng | but there's still a problem. | 03:42 |
Sundar | The timeout should correspond to default Nova timeout | 03:42 |
shaohe_feng | maybe it is in programming or other critical job | 03:43 |
Sundar | The programming typically takes a few seconds, so default of 300 seconds (I think) is good enough | 03:43 |
shaohe_feng | a timeout can be a disaster | 03:43 |
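The concern about timing out a job that is mid-programming can be illustrated with the standard-library `concurrent.futures` module. This is a sketch only, not the Cyborg implementation: `slow_job` stands in for a long-running bind step, and the timeout values are arbitrary. The point it shows is that `Future.result(timeout=...)` only stops the caller from waiting; the worker thread keeps running to completion.

```python
import concurrent.futures
import time


def slow_job(seconds):
    # Stand-in for a long-running step such as device programming.
    time.sleep(seconds)
    return "done"


def run_with_timeout(executor, seconds, timeout):
    """Submit slow_job and wait at most `timeout` seconds for it.

    Returns (status, future). On timeout the job is NOT interrupted;
    the caller simply stops waiting for it.
    """
    future = executor.submit(slow_job, seconds)
    try:
        return future.result(timeout=timeout), future
    except concurrent.futures.TimeoutError:
        return "timed_out", future
```

Under these assumptions, a timeout on the waiting side would not by itself interrupt the critical job, which is consistent with letting the bind run to completion and handling deletion afterwards.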
shaohe_feng | another thing | 03:44 |
shaohe_feng | currently the bind process is specific to FPGA | 03:44 |
Sundar | Umm, bind if for all accelerators. Only programming is for FPGA. the bind means the ARQ is associated with a host and deployable in Cyborg's db, and the device is ready to use | 03:45 |
Sundar | *is for | 03:45 |
shaohe_feng | there should be good extensibility for other kinds | 03:45 |
shaohe_feng | I mean: | 03:46 |
shaohe_feng | 1. get the resource type. | 03:46 |
shaohe_feng | every resource type should have its own extended bind action | 03:46 |
shaohe_feng | for FPGA it is program. | 03:46 |
shaohe_feng | for others it may be env setup, not sure. | 03:47 |
shaohe_feng | 2. every resource should have its own placement report. | 03:47 |
shaohe_feng | the report info may be different | 03:48 |
shaohe_feng | so the code should be: | 03:48 |
shaohe_feng | type, num = arq.group_get_resource() | 03:49 |
shaohe_feng | for n in num: | 03:49 |
shaohe_feng | action = get_accelerator_action(type) # fpga is program | 03:50 |
shaohe_feng | action() | 03:50 |
shaohe_feng | something like this | 03:50 |
shaohe_feng | and this code should be split out of the arq object file | 03:50 |
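The pseudocode above could be fleshed out roughly as follows. This is a hedged sketch of a per-type dispatch, not code from Cyborg: `get_accelerator_action` is the name used in the chat, while `BIND_ACTIONS`, `register_bind_action`, `program_fpga`, `noop_action`, and `bind_group` are hypothetical names introduced here. Note that the loop iterates `range(num)`, since `num` is a count.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping a resource type to its extra bind action.
BIND_ACTIONS: Dict[str, Callable[[dict], str]] = {}


def register_bind_action(resource_type: str):
    """Decorator that records the bind action for one resource type."""
    def decorator(func):
        BIND_ACTIONS[resource_type] = func
        return func
    return decorator


@register_bind_action("FPGA")
def program_fpga(group: dict) -> str:
    # Placeholder for bitstream programming; no real driver call is made.
    return "programmed"


def noop_action(group: dict) -> str:
    # Default for device types that need no preparation during bind.
    return "noop"


def get_accelerator_action(resource_type: str) -> Callable[[dict], str]:
    # Fall back to the no-op action for unregistered types.
    return BIND_ACTIONS.get(resource_type, noop_action)


def bind_group(resource_type: str, num: int, group: dict) -> List[str]:
    """Run the type-specific action once per requested accelerator."""
    action = get_accelerator_action(resource_type)
    return [action(group) for _ in range(num)]
```

Keeping the registry in its own module would match the suggestion to split this logic out of the arq object file, and a new device type would only need to register one function.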
Sundar | In general, the process should be generic for all accelerators. The current code looks at the device profile request group to see if it has function_id or bitstream_id entries, which are specific to FPGA, to decide if programming is needed | 03:51 |
shaohe_feng | we may add other specs in | 03:52 |
Sundar | AFAIK, for non-FPGA devices in this release, there is nothing required to prepare the device, right? | 03:52 |
shaohe_feng | device profiles for different accelerators | 03:53 |
shaohe_feng | such as HDDL | 03:53 |
shaohe_feng | we can add | 03:53 |
shaohe_feng | "accel:affinity": true | 03:54 |
Sundar | Ok | 03:54 |
shaohe_feng | which means we need 4 accelerators on one card | 03:54 |
Sundar | We had an idea of a generic prepare_device API in the driver, which gets a dictionary as a parameter, where the dictionary values depend on the device type. | 03:54 |
shaohe_feng | yes, different devices may take different actions during bind. | 03:55 |
Sundar | Quick process check: Since we have only a few minutes left, should we continue this via email, copying all of us and openstack-ML? What do you all think? | 03:56 |
shaohe_feng | also another thing: where do we init the ThreadPoolExecutor? | 03:56 |
shaohe_feng | in the arq object file? | 03:56 |
shaohe_feng | seems not good. | 03:56 |
shaohe_feng | OK. | 03:57 |
Sundar | All, please look at this issue for allocating attach handles: https://opendev.org/openstack/cyborg/src/branch/master/cyborg/db/sqlalchemy/api.py#L269 The in_use field does not get written to db | 03:57 |
Sundar | All, we are seeing good review activity of late. Thank you all, and please keep it up. We are literally 2 weeks from the milestone. :) | 03:58 |
Sundar | #topic AoB | 03:58 |
*** openstack changes topic to "AoB (Meeting topic: openstack-cyborg)" | 03:58 | |
Sundar | shaohe_feng: if you prefer, I can initiate an email thread for the good points that you brought up. Good? | 03:59 |
shaohe_feng | OK | 03:59 |
Sundar | Anything else, folks? | 03:59 |
shaohe_feng | did you check that in_use is in the arguments of the update function? | 04:00 |
Sundar | Yes | 04:01 |
shaohe_feng | and does your DB really have the in_use field? | 04:01 |
shaohe_feng | directly use mysql command. | 04:01 |
Sundar | Oh yes. The ref.update has it, but it doesn't get written to db. Use mysql cmd from Python code? | 04:02 |
shaohe_feng | no | 04:02 |
shaohe_feng | such as: | 04:02 |
shaohe_feng | mysql -uroot -ppass cyborg | 04:02 |
Sundar | Yes, update command works from CLI | 04:02 |
Sundar | We'll follow up on this too by email. | 04:03 |
shaohe_feng | desc haddler; | 04:03 |
shaohe_feng | OK. | 04:03 |
Sundar | Thanks, everybody. Happy coding and reviewing :). Have a good day. Bye. | 04:03 |
Sundar | #endmeeting | 04:03 |
*** openstack changes topic to "Pending patches (Meeting topic: openstack-cyborg)" | 04:03 | |
openstack | Meeting ended Thu Aug 29 04:03:58 2019 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 04:04 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/openstack_cyborg/2019/openstack_cyborg.2019-08-29-03.05.html | 04:04 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/openstack_cyborg/2019/openstack_cyborg.2019-08-29-03.05.txt | 04:04 |
openstack | Log: http://eavesdrop.openstack.org/meetings/openstack_cyborg/2019/openstack_cyborg.2019-08-29-03.05.log.html | 04:04 |
shaohe_feng | you can update the field manully by mysql CLI, right? | 04:04 |
shaohe_feng | manually | 04:04 |
Sundar | Hi shaohe_feng: Yes | 04:07 |
shaohe_feng | can I log in to your env? | 04:07 |
Sundar | Sure, I'll send you the info separately | 04:07 |
shaohe_feng | OK. thanks. | 04:07 |
*** chunxiu has quit IRC | 04:34 | |
*** s_shogo has quit IRC | 04:34 | |
openstackgerrit | chenker proposed openstack/cyborg master: Fix pep8 error in cyborg/*.py and add Forbidden class https://review.opendev.org/679042 | 06:03 |
openstackgerrit | YumengBao proposed openstack/cyborg master: enable branch selection in devstack installation https://review.opendev.org/669303 | 06:07 |
*** xinranwang has quit IRC | 06:08 | |
openstackgerrit | chenker proposed openstack/cyborg master: P3: Fix pep8 error in cyborg/common and cyborg/conductor https://review.opendev.org/679062 | 06:24 |
openstackgerrit | chenker proposed openstack/cyborg master: P4: Fix pep8 error in cyborg/api https://review.opendev.org/679172 | 06:27 |
*** Coco_gao_ has quit IRC | 06:34 | |
openstackgerrit | YumengBao proposed openstack/cyborg master: enable branch selection in devstack installation https://review.opendev.org/669303 | 06:36 |
openstackgerrit | chenker proposed openstack/cyborg master: Fix pep8 error in cyborg/agent and cyborg/db https://review.opendev.org/679193 | 07:01 |
openstackgerrit | chenker proposed openstack/cyborg master: P4: Fix pep8 error in cyborg/api https://review.opendev.org/679172 | 07:07 |
openstackgerrit | chenker proposed openstack/cyborg master: Fix pep8 error in cyborg/*.py and add Forbidden class https://review.opendev.org/679042 | 07:21 |
openstackgerrit | chenker proposed openstack/cyborg master: P2: Fix pep8 error in cyborg/conf and cyborg/cmd https://review.opendev.org/679045 | 07:21 |
openstackgerrit | chenker proposed openstack/cyborg master: P3: Fix pep8 error in cyborg/common and cyborg/conductor https://review.opendev.org/679062 | 07:21 |
openstackgerrit | chenker proposed openstack/cyborg master: P4: Fix pep8 error in cyborg/api https://review.opendev.org/679172 | 07:21 |
*** shaohe_feng has quit IRC | 07:23 | |
openstackgerrit | chenker proposed openstack/cyborg master: P5: Fix pep8 error in cyborg/accelerator https://review.opendev.org/679175 | 07:26 |
openstackgerrit | chenker proposed openstack/cyborg master: P6: Fix pep8 error in cyborg/agent and cyborg/db https://review.opendev.org/679193 | 07:26 |
*** Sundar has quit IRC | 07:49 | |
chenke | Hi, all. When you have time, please help me review the pep8 related commit. I think this should be merged as soon as possible, which will facilitate the code in cyborg. Thanks. | 07:52 |
*** wangzhh has quit IRC | 07:58 | |
*** chenke has quit IRC | 09:59 | |
*** Yumeng has quit IRC | 11:18 | |
openstackgerrit | chenker proposed openstack/cyborg master: P6: Fix pep8 error in cyborg/agent and cyborg/db https://review.opendev.org/679193 | 12:39 |
openstackgerrit | chenker proposed openstack/cyborg master: P6: Fix pep8 error in cyborg/agent and cyborg/db https://review.opendev.org/679193 | 12:41 |
*** efried is now known as efried_afk | 13:47 | |
openstackgerrit | Merged openstack/cyborg master: enable branch selection in devstack installation https://review.opendev.org/669303 | 14:06 |
openstackgerrit | Merged openstack/cyborg master: Fill "driver_name" field in Deployable object https://review.opendev.org/677952 | 14:36 |
*** efried_afk is now known as efried | 15:13 | |
openstackgerrit | ShaoHe Feng proposed openstack/cyborg master: bug fix: update in DB instead of in cache https://review.opendev.org/679314 | 18:00 |
openstackgerrit | ShaoHe Feng proposed openstack/cyborg master: bug fix: update in DB instead of in cache https://review.opendev.org/679314 | 18:40 |
openstackgerrit | Merged openstack/cyborg master: Move to releases.openstack.org https://review.opendev.org/664774 | 21:13 |