*** wanghao has quit IRC | 00:14 | |
*** wanghao has joined #openstack-mogan | 00:14 | |
wanghao | morning mogan! | 00:43 |
---|---|---|
openstackgerrit | Merged openstack/python-moganclient master: Updated from global requirements https://review.openstack.org/494902 | 01:16 |
openstackgerrit | wanghao proposed openstack/mogan master: Manage existing BMs: Part-1 https://review.openstack.org/479660 | 01:22 |
*** wanghao_ has joined #openstack-mogan | 01:23 | |
*** wanghao has quit IRC | 01:26 | |
*** litao__ has joined #openstack-mogan | 01:29 | |
*** wanghao_ has quit IRC | 01:37 | |
*** wanghao has joined #openstack-mogan | 01:38 | |
zhenguo | morning all! | 01:58 |
openstackgerrit | Zhenguo Niu proposed openstack/mogan master: Clean up server node uuid on task revert https://review.openstack.org/496207 | 02:07 |
zhenguo | liusheng: please look at the above patch, seems I got the reason why placement refresh failed | 02:08 |
liusheng | zhenguo: ok | 02:10 |
zhenguo | sure | 02:35 |
* zhenguo brb | 02:37 | |
liusheng | zhenguo: why change it so that all building exceptions trigger a revert? | 02:44 |
zhenguo | liusheng: it's hard to check all exceptions there, and we have a check in the destroy method: if the provision_state is not in an unprovision state, we just remove the instance_uuid/instance_info | 02:50 |
zhenguo | liusheng: I find in some situations, we have the node uuid associated with the server, but the backend node is available | 02:51 |
liusheng | zhenguo: is that a bug in Ironic? | 02:51 |
zhenguo | liusheng: no | 02:51 |
zhenguo | liusheng: it's a bug for mogan | 02:51 |
liusheng | zhenguo: if a node is available, why should it have an instance associated with it? | 02:52 |
zhenguo | liusheng: because in our revert method, we don't clean it up | 02:52 |
liusheng | zhenguo: you mean Ironic allows that situation and doesn't care about it? | 02:53 |
zhenguo | liusheng: it's not related to ironic | 02:53 |
liusheng | zhenguo: why? the state and provision state are defined by Ironic | 02:53 |
zhenguo | liusheng: yes, but instance_uuid and instance_info are set by us | 02:54 |
liusheng | zhenguo: so we can also set the instance info on a node in available state? | 02:55 |
zhenguo | liusheng: sure | 02:55 |
liusheng | zhenguo: oh... | 02:55 |
zhenguo | liusheng: it's just a property of the node object | 02:55 |
liusheng | zhenguo: so should we skip this situation when reporting resources to placement | 02:56 |
zhenguo | liusheng: yes | 02:56 |
zhenguo | liusheng: when will we set allocations | 02:57 |
zhenguo | liusheng: if there's allocation, it should be ok | 02:57 |
liusheng | zhenguo: after scheduling | 02:57 |
zhenguo | liusheng: so, it seems ok | 02:57 |
zhenguo | liusheng: my problem is that the server is associated with a node uuid, so that rp will not get deleted as the allocations are still there | 02:58 |
zhenguo | liusheng: then I deleted the node in ironic and registered it again | 02:58 |
openstackgerrit | Merged openstack/mogan master: Updated from global requirements https://review.openstack.org/496211 | 02:59 |
zhenguo | liusheng: which causes refresh placement error | 02:59 |
zhenguo | liusheng: as the node uuid changed but the node name did not | 02:59 |
zhenguo | liusheng: registering the rp failed because of a name collision | 02:59 |
zhenguo | liusheng: if the old rp could get deleted it would be ok | 02:59 |
zhenguo | liusheng: so I tried to clean up the server node_uuid | 03:00 |
liusheng | zhenguo: if a rebuilding was reverted, the server will be in error state, right ? | 03:00 |
zhenguo | liusheng: rebuilding? | 03:01 |
liusheng | zhenguo: building ,sorry | 03:01 |
zhenguo | liusheng: that will cause rescheduling | 03:01 |
zhenguo | liusheng: if max attempts is exceeded, it will fail | 03:01 |
liusheng | zhenguo: oh, seems it will be rescheduled and destroyed in the driver layer | 03:02 |
zhenguo | liusheng: not in driver layer | 03:02 |
zhenguo | liusheng: in our workflow | 03:02 |
liusheng | zhenguo: yes, our workflow's destroy will call driver.destroy(), right? | 03:03 |
zhenguo | liusheng: yes | 03:03 |
zhenguo | liusheng: I changed the revert conditions, to make it always happen | 03:03 |
zhenguo | liusheng: if spawn failed | 03:04 |
liusheng | zhenguo: as we discussed last time, the periodic reporting task will ignore this situation until the node finishes cleaning and turns to available state; then the reporting task will delete the allocation | 03:05 |
zhenguo | liusheng: my situation is that the node is available, not get deleted at all | 03:08 |
zhenguo | liusheng: it's available | 03:08 |
zhenguo | liusheng: I manually deleted it in ironic then enrolled it again | 03:09 |
liusheng | zhenguo: just checked, yes, in this situation deleting the rp will be skipped | 03:16 |
zhenguo | liusheng: just because the error server is still associated with the node | 03:16 |
zhenguo | liusheng: with the already deleted node | 03:16 |
liusheng | zhenguo: just because in Mogan, there is a server associated with that node | 03:16 |
zhenguo | liusheng: yes | 03:16 |
liusheng | zhenguo: yes, so why not delete the error server to release it | 03:17 |
liusheng | zhenguo: ? | 03:17 |
liusheng | zhenguo: I remember in Nova, an error server may also occupy a node resource | 03:17 |
zhenguo | liusheng: as the node is available, I don't know why we need to associate it with the error server | 03:18 |
zhenguo | liusheng: if the node is really related with the error server, of course we should not delete the rp | 03:18 |
liusheng | zhenguo: if we release the node, how do we handle the quota computation? | 03:19 |
liusheng | zhenguo: an error state server also occupies quota | 03:20 |
zhenguo | liusheng: quota is for server instead of node | 03:20 |
zhenguo | liusheng: why do you think it is related to the backend node | 03:21 |
zhenguo | liusheng: currently even if building the network fails, the server will be associated with the scheduled node | 03:21 |
liusheng | zhenguo: not very sure, but maybe we cannot know all the reasons for an error state server; maybe in some cases we can recover the server, so it is better not to release the node | 03:22 |
liusheng | zhenguo: maybe we need to ping zhenyu to ask why Nova does that | 03:22 |
zhenguo | liusheng: how can you recover it | 03:23 |
zhenguo | liusheng: we are talking, the node is available | 03:23 |
liusheng | zhenguo: can't it be rebuilt? | 03:23 |
liusheng | zhenguo: same situation in Nova | 03:24 |
zhenguo | liusheng: I think it can not be | 03:24 |
liusheng | zhenguo: the compute node can also be normal state | 03:24 |
zhenguo | liusheng: there's no reason to occupy the node just in case you may rebuild it | 03:24 |
zhenguo | liusheng: if spawning a server failed, you can just spawn a new one, why rebuild it | 03:25 |
liusheng | zhenguo: ... | 03:25 |
liusheng | zhenguo: I can keep my opinion... | 03:25 |
zhenguo | liusheng: it's really not like nova | 03:26 |
zhenguo | liusheng: in our situation, we really occupied resources, but nova doesn't | 03:27 |
liusheng | zhenguo: why doesn't Nova? | 03:27 |
zhenguo | liusheng: why nova does | 03:27 |
zhenguo | liusheng: the vm wasn't created at all | 03:28 |
liusheng | zhenguo: nova scheduler will account for the resource occupied by the error server | 03:28 |
liusheng | zhenguo: no | 03:28 |
zhenguo | liusheng: I mean in the same situation with us | 03:29 |
zhenguo | liusheng: the backend node is available for us, which means it doesn't get deployed | 03:29 |
liusheng | zhenguo: I am saying the similar situation | 03:29 |
zhenguo | liusheng: I can call you in person with espace | 03:30 |
liusheng | zhenguo: Nova may also not have created the server in libvirt | 03:30 |
*** wanghao_ has joined #openstack-mogan | 03:37 | |
*** wanghao has quit IRC | 03:39 | |
*** zhenguo has quit IRC | 05:02 | |
*** Jeffrey4l has quit IRC | 05:02 | |
*** dims has quit IRC | 05:02 | |
*** zhangyang has quit IRC | 05:02 | |
*** zhangyang has joined #openstack-mogan | 05:02 | |
*** zhenguo has joined #openstack-mogan | 05:02 | |
*** Jeffrey4l has joined #openstack-mogan | 05:03 | |
*** dims has joined #openstack-mogan | 05:04 | |
*** wanghao_ has quit IRC | 05:24 | |
*** wanghao has joined #openstack-mogan | 05:26 | |
openstackgerrit | Zhenguo Niu proposed openstack/mogan master: Only set/update RPs when the node is available https://review.openstack.org/496497 | 05:52 |
openstackgerrit | Xinran WANG proposed openstack/mogan master: Get rid of flavor access https://review.openstack.org/495758 | 06:11 |
zhenguo | liusheng: I find some problems with the current resource update logic | 06:24 |
zhenguo | liusheng: when I enrolled some nodes in 'enroll' state, they were reported to placement, and mogan will use them like normal nodes | 06:25 |
zhenguo | liusheng: which will cause error | 06:25 |
liusheng | zhenguo: there is an unavailable "enroll" state? if so, that should be in the abnormal nodes list | 06:27 |
zhenguo | liusheng: I don't find that there's an abnormal list | 06:27 |
liusheng | zhenguo: mogan/baremetal/ironic/driver.py:572 no, here we will skip nodes we don't want to report | 06:28 |
zhenguo | liusheng: you can check the conditions | 06:29 |
zhenguo | liusheng: which only check for bad power state, resource class, and being associated with a server while in available state | 06:30 |
liusheng | zhenguo: yes, I mean we should add "enroll" state :) | 06:30 |
zhenguo | liusheng: which means nodes in any provision state will be reported to placement if power_state is good and they are not associated with a server | 06:30 |
zhenguo | liusheng: not only the enroll state, what about other failed/error states | 06:31 |
*** wanghao_ has joined #openstack-mogan | 06:32 | |
liusheng | zhenguo: seems yes | 06:35 |
*** wanghao has quit IRC | 06:35 | |
zhenguo | liusheng: we can keep an abnormal node in placement only if there's an error server associated with it, and now we can ensure that | 06:36 |
zhenguo | liusheng: and nodes in *ing states should be normal as well | 06:37 |
zhenguo | liusheng: or only 'available' node should be normal? | 06:37 |
zhenguo | liusheng: also seems using ironic_states in mogan/engine is not proper, as ironic is only one driver | 06:40 |
liusheng | zhenguo: seems NOSTATE is also an abnormal state | 06:43 |
liusheng | zhenguo: I don't know why I didn't include that... | 06:44 |
zhenguo | liusheng: I would like to change to only get 'available' nodes | 06:44 |
liusheng | zhenguo: oh, ironic_states.AVAILABLE and ironic_states.NOSTATE are normal states | 06:44 |
liusheng | zhenguo: sorry | 06:44 |
zhenguo | liusheng: and check bad_power_state and associated | 06:44 |
zhenguo | liusheng: that's for compatibility with old ironic | 06:44 |
liusheng | zhenguo: nova.virt.ironic.driver.IronicDriver#_node_resources_unavailable | 06:44 |
zhenguo | liusheng: I don't want to copy nova here | 06:44 |
zhenguo | liusheng: we can just think our use case | 06:45 |
zhenguo | liusheng: let's consider what the side effects of only getting 'available' nodes are | 06:46 |
liusheng | zhenguo: we need to also update the in-use nodes | 06:46 |
zhenguo | liusheng: why do we need to update them | 06:46 |
liusheng | zhenguo: node resources may be updated | 06:47 |
zhenguo | liusheng: then what benefit will updating them bring us | 06:47 |
zhenguo | liusheng: node resource here is just resource_class right? | 06:47 |
zhenguo | liusheng: we are not like nova's cpu/mem/... | 06:47 |
liusheng | zhenguo: no, I mean a node may be updated by admins; yes we don't have cpu/mem... but admins may add different resource traits, or attach special interfaces | 06:48 |
liusheng | zhenguo: right ? | 06:48 |
zhenguo | liusheng: traits should be retrieved from ironic? | 06:49 |
zhenguo | liusheng: interfaces are a possible use case, but we don't save them for now | 06:50 |
zhenguo | liusheng: so only for 'available' and 'active' nodes | 06:50 |
zhenguo | liusheng: we can get only nodes in those two provision states | 06:50 |
liusheng | zhenguo: sorry, I am not sure, does provision state include other states in which a node is in use? | 06:51 |
liusheng | zhenguo: deploying, wait call-back | 06:52 |
zhenguo | liusheng: if the node is in use, it will not get deleted from placement as there are allocations | 06:52 |
liusheng | zhenguo: yes | 06:52 |
zhenguo | liusheng: so for property updates, it's ok to just wait a moment until the node becomes active | 06:53 |
liusheng | zhenguo: from our use cases it looks reasonable, but in my understanding placement is used to manage normal resources and doesn't care about their status or power state, except for resources in abnormal states | 06:58 |
liusheng | zhenguo: my personal view | 06:58 |
zhenguo | liusheng: that depends on which states are abnormal | 06:59 |
zhenguo | liusheng: things in placement mean mogan can consume them; the only condition is that there are no allocations | 06:59 |
liusheng | zhenguo: yes | 07:00 |
zhenguo | liusheng: but there may be some problems if we only get 'available' state nodes: if a node is gone and comes back, I don't know what will happen with the aggregates relationship | 07:01 |
zhenguo | liusheng: I never tested that | 07:01 |
liusheng | zhenguo: not sure, maybe the members of the aggregates won't change, so if the uuid of the node didn't change, maybe it can still work | 07:03 |
zhenguo | liusheng: seems collisions will happen if we use the node name for the RP name | 07:05 |
zhenguo | liusheng: like I may remove the node then add a node with the same name in future | 07:05 |
liusheng | zhenguo: how about treating it as a new node, and deleting the aggregate relationship when removing the node | 07:07 |
zhenguo | liusheng: it seems like nova host | 07:07 |
zhenguo | liusheng: they also use name | 07:07 |
liusheng | zhenguo: we don't have a way to update the resource immediately after removing the node from driver | 07:08 |
zhenguo | liusheng: yes | 07:08 |
zhenguo | liusheng: need to think more about the resources update logic with placement | 07:08 |
zhenguo | liusheng: but seems we don't have time left for Pike :( | 07:09 |
zhenguo | liusheng: do you have another way to not import ironic_state in mogan/engine/manager? | 07:09 |
zhenguo | liusheng: it should only appear in ironic driver | 07:09 |
liusheng | zhenguo: I have considered that; I propose adding a driver method, maybe named "is_node_resource_normal" | 07:10 |
zhenguo | liusheng: sounds good | 07:10 |
liusheng | zhenguo: since the normal states in different may have different meanings | 07:11 |
liusheng | zhenguo: in different drivers | 07:11 |
zhenguo | liusheng: yes | 07:11 |
*** wanghao_ has quit IRC | 07:12 | |
liusheng | zhenguo: will add that :) | 07:12 |
zhenguo | liusheng: thanks | 07:12 |
*** wanghao has joined #openstack-mogan | 07:13 | |
zhenguo | liusheng: for get_available_nodes, I can keep the current logic and just add an 'enroll' state check for now | 07:13 |
liusheng | zhenguo: ok | 07:14 |
*** wanghao has quit IRC | 07:23 | |
openstackgerrit | Zhenguo Niu proposed openstack/mogan master: Add bad provision states check for nodes https://review.openstack.org/496497 | 07:25 |
zhenguo | liusheng: btw, please make a backup for the scripts in our physical env to avoid getting lost some day :D | 07:27 |
liusheng | zhenguo: ok, will do that | 07:27 |
liusheng | zhenguo: seems I caught a cold and am not in a good state :D | 07:28 |
zhenguo | liusheng: hah, feels better now :D | 07:28 |
zhenguo | liusheng: seems this will make the server always not associated with a node uuid https://review.openstack.org/#/c/496207/ | 07:34 |
zhenguo | liusheng: if our revert process works well | 07:35 |
liusheng | zhenguo: once it enters the rescheduling entry, the node uuid will be None, right? | 07:36 |
zhenguo | liusheng: yes | 07:36 |
liusheng | zhenguo: is that reasonable? hah | 07:37 |
zhenguo | liusheng: our revert is just designed for this, lol | 07:37 |
zhenguo | liusheng: we clean up the node, entworks,.. during the revert process | 07:37 |
zhenguo | *networks | 07:38 |
liusheng | zhenguo: ok, makes sense | 07:38 |
zhenguo | liusheng: I'm testing it now, at least it seems better than before with this | 07:38 |
liusheng | zhenguo: ok | 07:39 |
zhenguo | liusheng: seems every time we set up the physical env, we can find a lot of mogan bugs, lol | 07:39 |
*** wanghao has joined #openstack-mogan | 07:39 | |
liusheng | zhenguo: hah, it is ok, even Nova still has many bugs | 07:40 |
zhenguo | liusheng: sure, hah | 07:40 |
*** openstackgerrit has quit IRC | 08:03 | |
*** openstackgerrit has joined #openstack-mogan | 08:07 | |
openstackgerrit | liusheng proposed openstack/mogan master: Add affinity_zone field for server object https://review.openstack.org/495725 | 08:08 |
openstackgerrit | liusheng proposed openstack/mogan master: Add support for scheduler_hints https://review.openstack.org/463534 | 08:08 |
openstackgerrit | liusheng proposed openstack/mogan master: Use server group in scheduler https://review.openstack.org/496151 | 08:08 |
openstackgerrit | liusheng proposed openstack/mogan master: Use server group in scheduler https://review.openstack.org/496151 | 08:14 |
litao__ | ping zhenguo | 08:27 |
zhenguo | litao__: pong | 08:27 |
litao__ | zhenguo: I want to talk about the managing bare metal node API | 08:28 |
litao__ | zhenguo: I think we can update the neutron port profile in managing it | 08:29 |
zhenguo | litao__: not currently | 08:30 |
zhenguo | litao__: it's hard to control | 08:30 |
litao__ | zhenguo: if all tasks are left to admins, it is complex work | 08:30 |
zhenguo | litao__: not all tasks left to admins | 08:30 |
litao__ | zhenguo: you mean we do this work later? | 08:30 |
zhenguo | litao__: even if you don't have neutron port connected, we can manage it | 08:30 |
zhenguo | litao__: we can just make things simple for now, or at least before Pike | 08:31 |
*** wanghao has quit IRC | 08:35 | |
litao__ | zhenguo: if so, we can keep it simple, and after Pike we can add these functions | 08:36 |
*** wanghao has joined #openstack-mogan | 08:36 | |
litao__ | zhenguo: because managing a bare metal node is a huge amount of work | 08:38 |
litao__ | zhenguo: especially for network configuration | 08:39 |
zhenguo | litao__: yes, that part is too complex in my understanding | 08:43 |
zhenguo | litao__: we can keep it simple in mogan, at least it can work, but if you want to continue to make it smarter, you can continue in the next cycle :D | 08:43 |
litao__ | zhenguo: so mogan should do more to reduce the work for admins | 08:43 |
litao__ | zhenguo: ok | 08:44 |
zhenguo | litao__: hah | 08:44 |
*** litao__ has quit IRC | 08:51 | |
*** zhuli has quit IRC | 08:51 | |
*** zhuli has joined #openstack-mogan | 08:51 | |
*** litao__ has joined #openstack-mogan | 08:51 | |
*** zhenguo has quit IRC | 08:52 | |
*** zhenguo has joined #openstack-mogan | 08:52 | |
openstackgerrit | Zhenguo Niu proposed openstack/mogan master: Add unplug_vif to network task revert https://review.openstack.org/496569 | 09:19 |
openstackgerrit | wanghao proposed openstack/mogan master: Manage existing BMs: Part-1 https://review.openstack.org/479660 | 09:21 |
openstackgerrit | wanghao proposed openstack/mogan master: Manage existing BMs: Part-1 https://review.openstack.org/479660 | 09:26 |
zhenguo | wanghao: seems my comments on the previous patch set got lost for the above patch :D | 09:28 |
zhenguo | wanghao: it looks mostly good, please add the detailed api response parameters as well, thanks! | 09:28 |
wanghao | zhenguo: yeah, I missed the api-ref doc | 09:29 |
wanghao | zhenguo: okay, I will update the doc. | 09:29 |
zhenguo | wanghao: ok, thanks! | 09:30 |
wanghao | zhenguo: np :) | 09:30 |
*** wanghao has quit IRC | 09:31 | |
*** wanghao has joined #openstack-mogan | 09:31 | |
*** wanghao has quit IRC | 09:31 | |
*** wanghao has joined #openstack-mogan | 09:32 | |
*** wanghao has quit IRC | 09:32 | |
*** wanghao has joined #openstack-mogan | 09:32 | |
*** wanghao has quit IRC | 09:33 | |
*** wanghao has joined #openstack-mogan | 09:33 | |
*** wanghao has quit IRC | 09:34 | |
*** wanghao has joined #openstack-mogan | 09:34 | |
*** wanghao has quit IRC | 09:35 | |
*** wanghao has joined #openstack-mogan | 09:35 | |
*** wanghao has quit IRC | 09:35 | |
*** wanghao has joined #openstack-mogan | 09:36 | |
*** wanghao has quit IRC | 09:36 | |
*** wanghao has joined #openstack-mogan | 09:37 | |
openstackgerrit | Xinran WANG proposed openstack/mogan master: support specifying port_id when attaching interface https://review.openstack.org/494121 | 09:42 |
openstackgerrit | liusheng proposed openstack/mogan master: Use server group in scheduler https://review.openstack.org/496151 | 10:08 |
openstackgerrit | Zhenguo Niu proposed openstack/mogan master: Add unplug_vif to network task revert https://review.openstack.org/496569 | 10:46 |
openstackgerrit | Zhenguo Niu proposed openstack/mogan master: Leverage _detach_interface for destroying networks https://review.openstack.org/496569 | 10:49 |
*** litao__ has quit IRC | 11:57 | |
*** liusheng has quit IRC | 17:37 | |
*** liusheng has joined #openstack-mogan | 17:38 |
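
Editor's note on the 06:24 to 07:14 discussion: the node-filtering idea can be sketched roughly as below. This is a minimal illustration under stated assumptions, not Mogan's actual code. The `Node` class, `FakeIronicDriver`, the state constants, and `reportable_nodes` are all hypothetical names; only the method name `is_node_resource_normal` and the skip conditions (bad power state, missing resource class, associated with a server while available, 'enroll' state) come from the log. Which power states count as "bad" is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical state names modeled on the states mentioned in the log.
# They are plain strings here, not imported from Ironic or Mogan.
AVAILABLE = "available"
ACTIVE = "active"
ENROLL = "enroll"
POWER_ON = "power on"
POWER_ERROR = "error"

# Assumption: which states are treated as "bad" is a driver-level choice.
BAD_POWER_STATES = {POWER_ERROR, None}
BAD_PROVISION_STATES = {ENROLL}


@dataclass
class Node:
    """Hypothetical stand-in for a driver node object."""
    uuid: str
    provision_state: Optional[str]
    power_state: Optional[str]
    resource_class: Optional[str]
    instance_uuid: Optional[str] = None


class FakeIronicDriver:
    """Keeps driver-specific state knowledge out of the engine layer."""

    def is_node_resource_normal(self, node: Node) -> bool:
        if node.power_state in BAD_POWER_STATES:
            return False
        if node.resource_class is None:
            return False
        if node.provision_state in BAD_PROVISION_STATES:
            return False
        # An 'available' node that still carries an instance_uuid is the
        # stale association discussed earlier in the log; skip it.
        if node.provision_state == AVAILABLE and node.instance_uuid:
            return False
        return True


def reportable_nodes(driver: FakeIronicDriver, nodes: List[Node]) -> List[Node]:
    """Return only the nodes the driver considers safe to report."""
    return [n for n in nodes if driver.is_node_resource_normal(n)]
```

With this shape, the engine only calls `driver.is_node_resource_normal(node)` and never needs to import driver-specific state constants, which is the motivation given at 07:09 to 07:11.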