Wednesday, 2017-08-23

*** wanghao has quit IRC		00:14
*** wanghao has joined #openstack-mogan		00:14
wanghao	morning mogan!	00:43
openstackgerrit	Merged openstack/python-moganclient master: Updated from global requirements https://review.openstack.org/494902	01:16
openstackgerrit	wanghao proposed openstack/mogan master: Manage existing BMs: Part-1 https://review.openstack.org/479660	01:22
*** wanghao_ has joined #openstack-mogan		01:23
*** wanghao has quit IRC		01:26
*** litao__ has joined #openstack-mogan		01:29
*** wanghao_ has quit IRC		01:37
*** wanghao has joined #openstack-mogan		01:38
zhenguo	morning all!	01:58
openstackgerrit	Zhenguo Niu proposed openstack/mogan master: Clean up server node uuid on task revert https://review.openstack.org/496207	02:07
zhenguo	liusheng: please look at the above patch, seems I got the reason why placement refresh failed	02:08
liusheng	zhenguo: ok	02:10
zhenguo	sure	02:35
* zhenguo brb		02:37
liusheng	zhenguo: why change it so that all the building exceptions will be reverted?	02:44
zhenguo	liusheng: it's hard to check all exceptions there, and we have a check in the destroy method: if the provision_state is not in an unprovision state, we just remove the instance_uuid/instance_info	02:50
zhenguo	liusheng: I find in some situations, we have the node uuid associated with the server, but the backend node is available	02:51
liusheng	zhenguo: is that a bug in Ironic?	02:51
zhenguo	liusheng: no	02:51
zhenguo	liusheng: it's a bug for mogan	02:51
liusheng	zhenguo: if a node is available, why should it have an instance associated with it	02:52
zhenguo	liusheng: because in our revert method, we don't clean it up	02:52
liusheng	zhenguo: you mean Ironic allows that situation and doesn't care about it?	02:53
zhenguo	liusheng: it's not related to ironic	02:53
liusheng	zhenguo: why? the state or provision state is defined by Ironic	02:53
zhenguo	liusheng: yes, but instance_uuid and instance_info are set by us	02:54
liusheng	zhenguo: so we can also set the instance info on an available-state node?	02:55
zhenguo	liusheng: sure	02:55
liusheng	zhenguo: oh...	02:55
zhenguo	liusheng: it's just a property of the node object	02:55
liusheng	zhenguo: so should we skip this situation when reporting resources to placement	02:56
zhenguo	liusheng: yes	02:56
zhenguo	liusheng: when will we set allocations	02:57
zhenguo	liusheng: if there's allocation, it should be ok	02:57
liusheng	zhenguo: after scheduling	02:57
zhenguo	liusheng: so, it seems ok	02:57
zhenguo	liusheng: my problem is that the server is associated with a node uuid, so that rp will not get deleted as the allocations are there	02:58
zhenguo	liusheng: then I deleted the node in ironic and registered it again	02:58
openstackgerrit	Merged openstack/mogan master: Updated from global requirements https://review.openstack.org/496211	02:59
zhenguo	liusheng: which causes refresh placement error	02:59
zhenguo	liusheng: as the node uuid changed but the node name did not	02:59
zhenguo	liusheng: when registering an rp, it failed due to a name collision	02:59
zhenguo	liusheng: if the old rp could get deleted it would be ok	02:59
zhenguo	liusheng: so I tried to clean up the server node_uuid	03:00
liusheng	zhenguo: if a rebuilding was reverted, the server will be in error state, right ?	03:00
zhenguo	liusheng: rebuilding?	03:01
liusheng	zhenguo: building ,sorry	03:01
zhenguo	liusheng: that will cause rescheduling	03:01
zhenguo	liusheng: if max attempts is exceeded, it will fail	03:01
liusheng	zhenguo: oh, seems it will be rescheduled and destroyed in the driver layer	03:02
zhenguo	liusheng: not in driver layer	03:02
zhenguo	liusheng: in our workflow	03:02
liusheng	zhenguo: yes, our workflow will call driver.destroy(), right	03:03
zhenguo	liusheng: yes	03:03
zhenguo	liusheng: I changed the revert conditions, to make it always happen	03:03
zhenguo	liusheng: if spawn failed	03:04
liusheng	zhenguo: as we discussed last time, the periodic reporting task will ignore this situation until the node has finished cleaning and turned to available state, then the reporting task will delete the allocation	03:05
zhenguo	liusheng: my situation is that the node is available, not get deleted at all	03:08
zhenguo	liusheng: it's available	03:08
zhenguo	liusheng: I manually deleted it in ironic then enrolled again	03:09
liusheng	zhenguo: just checked, yes, in this situation deleting the rp will be skipped	03:16
zhenguo	liusheng: just because the error server is still associated with the node	03:16
zhenguo	liusheng: with the already deleted node	03:16
liusheng	zhenguo: just because in Mogan, there is a server associated with that node	03:16
zhenguo	liusheng: yes	03:16
liusheng	zhenguo: yes, so why not delete the error server to release it	03:17
liusheng	zhenguo: ?	03:17
liusheng	zhenguo: I remember that in Nova, an error server may also occupy node resources	03:17
zhenguo	liusheng: as the node is available, I don't know why we need to associate it with the error server	03:18
zhenguo	liusheng: if the node is really related with the error server, of course we should not delete the rp	03:18
liusheng	zhenguo: if we release the node, how do we handle the quota computation?	03:19
liusheng	zhenguo: an error state server also occupies quota	03:20
zhenguo	liusheng: quota is for server instead of node	03:20
zhenguo	liusheng: why do you think it is related to the backend node	03:21
zhenguo	liusheng: currently even if building the network failed, the server will be associated with the scheduled node	03:21
liusheng	zhenguo: not very sure, but maybe we cannot know all the reasons for an error state server; maybe in some cases we can recover the server, so it is better not to release the node	03:22
liusheng	zhenguo: maybe we need to ping zhenyu about why Nova does that	03:22
zhenguo	liusheng: how can you recover it	03:23
zhenguo	liusheng: we are talking, the node is available	03:23
liusheng	zhenguo: can't it be rebuilt?	03:23
liusheng	zhenguo: same situation in Nova	03:24
zhenguo	liusheng: I think it can not be	03:24
liusheng	zhenguo: the compute node can also be normal state	03:24
zhenguo	liusheng: there's no reason to occupy the node just in case you may rebuild it	03:24
zhenguo	liusheng: if spawning a server failed, you can just spawn a new one, why rebuild it	03:25
liusheng	zhenguo: ...	03:25
liusheng	zhenguo: I can keep my opinion...	03:25
zhenguo	liusheng: it's really not like nova	03:26
zhenguo	liusheng: in our situation, we really occupied resources, but nova doesn't	03:27
liusheng	zhenguo: why doesn't Nova?	03:27
zhenguo	liusheng: why nova does	03:27
zhenguo	liusheng: the vm wasn't created at all	03:28
liusheng	zhenguo: nova scheduler will account for the resources occupied by the error server	03:28
liusheng	zhenguo: no	03:28
zhenguo	liusheng: I mean in the same situation with us	03:29
zhenguo	liusheng: the backend node is available for us, which means it doesn't get deployed	03:29
liusheng	zhenguo: I am saying the similar situation	03:29
zhenguo	liusheng: I can call you in person with espace	03:30
liusheng	zhenguo: Nova may also not have created the server in libvirt	03:30
*** wanghao_ has joined #openstack-mogan		03:37
*** wanghao has quit IRC		03:39
*** zhenguo has quit IRC		05:02
*** Jeffrey4l has quit IRC		05:02
*** dims has quit IRC		05:02
*** zhangyang has quit IRC		05:02
*** zhangyang has joined #openstack-mogan		05:02
*** zhenguo has joined #openstack-mogan		05:02
*** Jeffrey4l has joined #openstack-mogan		05:03
*** dims has joined #openstack-mogan		05:04
*** wanghao_ has quit IRC		05:24
*** wanghao has joined #openstack-mogan		05:26
openstackgerrit	Zhenguo Niu proposed openstack/mogan master: Only set/update RPs when the node is available https://review.openstack.org/496497	05:52
openstackgerrit	Xinran WANG proposed openstack/mogan master: Get rid of flavor access https://review.openstack.org/495758	06:11
zhenguo	liusheng: I find some problems with the current resource update logic	06:24
zhenguo	liusheng: when I enrolled some nodes in 'enroll' state, they are reported to placement, and mogan will use them like normal nodes	06:25
zhenguo	liusheng: which will cause error	06:25
liusheng	zhenguo: there is an unavailable "enroll" state? if so, that should be in the abnormal nodes list	06:27
zhenguo	liusheng: I don't find that there's an abnormal list	06:27
liusheng	zhenguo: mogan/baremetal/ironic/driver.py:572 no, here we will skip nodes we don't want to report	06:28
zhenguo	liusheng: you can check the conditions	06:29
zhenguo	liusheng: which only check for bad power state, resource class, and associated with a server but in available state	06:30
liusheng	zhenguo: yes, I mean we should add "enroll" state :)	06:30
zhenguo	liusheng: which means nodes in all provision states will be reported to placement if power_state is good and not associated with a server	06:30
zhenguo	liusheng: not only enroll state, what about other failed/error states	06:31
*** wanghao_ has joined #openstack-mogan		06:32
liusheng	zhenguo: seems yes	06:35
*** wanghao has quit IRC		06:35
zhenguo	liusheng: we can keep an abnormal node in placement only if there's an error server associated with it, and now we can ensure that	06:36
zhenguo	liusheng: and *ing state nodes should be normal as well	06:37
zhenguo	liusheng: or only 'available' node should be normal?	06:37
zhenguo	liusheng: also seems using ironic_states in mogan/engine is not proper, as ironic is only one driver	06:40
liusheng	zhenguo: seems NOSTATE is also an abnormal state	06:43
liusheng	zhenguo: I don't know why I didn't include that...	06:44
zhenguo	liusheng: I would like to change to only get 'available' nodes	06:44
liusheng	zhenguo: oh, ironic_states.AVAILABLE, ironic_states.NOSTATE is normal state	06:44
liusheng	zhenguo: sorry	06:44
zhenguo	liusheng: and check bad_power_state and associated	06:44
zhenguo	liusheng: that's for compatibility with old ironic	06:44
liusheng	zhenguo: nova.virt.ironic.driver.IronicDriver#_node_resources_unavailable	06:44
zhenguo	liusheng: I don't want to copy nova here	06:44
zhenguo	liusheng: we can just think our use case	06:45
zhenguo	liusheng: let's consider what the side effects are of only getting 'available' nodes	06:46
liusheng	zhenguo: we need to also update the in-use nodes	06:46
zhenguo	liusheng: why do we need to update it	06:46
liusheng	zhenguo: node resources may be updated	06:47
zhenguo	liusheng: then what benefit will updating it bring us	06:47
zhenguo	liusheng: node resource here is just resource_class right?	06:47
zhenguo	liusheng: we are not like nova's cpu/mem/...	06:47
liusheng	zhenguo: no, I mean a node may be updated by admins; yes, we don't have cpu/mem... but admins may add different resource traits, or attach special interfaces	06:48
liusheng	zhenguo: right ?	06:48
zhenguo	liusheng: traits should be retrieved from ironic?	06:49
zhenguo	liusheng: interfaces are a possible use case, but we don't save them for now	06:50
zhenguo	liusheng: so only for 'available' and 'active' node	06:50
zhenguo	liusheng: we can get only nodes in those two provision_states	06:50
liusheng	zhenguo: sorry, I am not sure, does provision state include other states in which a node is in use?	06:51
liusheng	zhenguo: deploying, wait call-back	06:52
zhenguo	liusheng: if the node is in use, it will not get deleted from placement as there are allocations	06:52
liusheng	zhenguo: yes	06:52
zhenguo	liusheng: so for properties update, it's ok to just wait a moment until the node becomes active	06:53
liusheng	zhenguo: from our use cases it looks reasonable, but in my understanding, placement is used to manage normal resources and doesn't care about their status or power state, except for resources in abnormal states	06:58
liusheng	zhenguo: my personal view	06:58
zhenguo	liusheng: that depends on which states are abnormal	06:59
zhenguo	liusheng: things in placement mean mogan can consume them; the only condition is that there are no allocations	06:59
liusheng	zhenguo: yes	07:00
zhenguo	liusheng: but there may be some problems if we only get 'available' state nodes; if a node is gone and comes back, I don't know what will happen with the aggregates relationship	07:01
zhenguo	liusheng: I never tested that	07:01
liusheng	zhenguo: not sure, maybe the members of aggregates won't be changed, so if the uuid of the node didn't change, maybe it can still work	07:03
zhenguo	liusheng: seems collisions will happen if we use node name for RP name	07:05
zhenguo	liusheng: like I may remove the node then add the same name node in future	07:05
liusheng	zhenguo: how about treating it as a new node, and deleting the aggregate relationship when removing the node	07:07
zhenguo	liusheng: it seems like nova host	07:07
zhenguo	liusheng: they also use name	07:07
liusheng	zhenguo: we don't have a way to update the resource immediately after removing the node from driver	07:08
zhenguo	liusheng: yes	07:08
zhenguo	liusheng: need to think more about the resources update logic with placement	07:08
zhenguo	liusheng: but seems we don't have time left for Pike :(	07:09
zhenguo	liusheng: do you have another way to not import ironic_states in mogan/engine/manager?	07:09
zhenguo	liusheng: it should only appear in ironic driver	07:09
liusheng	zhenguo: I have considered that, I propose to add a driver method, maybe named "is_node_resource_normal"	07:10
zhenguo	liusheng: sounds good	07:10
liusheng	zhenguo: since the normal states in different may have different meanings	07:11
liusheng	zhenguo: in different drivers	07:11
zhenguo	liusheng: yes	07:11
*** wanghao_ has quit IRC		07:12
liusheng	zhenguo: will add that :)	07:12
zhenguo	liusheng: thanks	07:12
*** wanghao has joined #openstack-mogan		07:13
zhenguo	liusheng: for get_available_nodes, I can keep the current logic and just add an 'enroll' state check for now	07:13
liusheng	zhenguo: ok	07:14
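
The check discussed above (bad power state, missing resource class, 'available' node still associated with a server, plus the new 'enroll' check) could be moved behind a driver method roughly as follows. This is only a sketch under stated assumptions: `Node`, `SketchIronicDriver`, `reportable_nodes`, and the state strings are hypothetical stand-ins for illustration, not Mogan's or Ironic's actual code; only the method name `is_node_resource_normal` comes from the conversation.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative state strings (assumption); a real driver would use its own
# state constants instead of these literals.
BAD_PROVISION_STATES = {"enroll", "error", "deploy failed", "clean failed"}
BAD_POWER_STATES = {"error", None}


@dataclass
class Node:
    """Hypothetical stand-in for a driver's node object."""
    uuid: str
    provision_state: str
    power_state: Optional[str]
    resource_class: Optional[str]
    instance_uuid: Optional[str] = None


class BaseDriver:
    """Engine-facing interface, so the engine needs no Ironic state imports."""

    def is_node_resource_normal(self, node: Node) -> bool:
        raise NotImplementedError


class SketchIronicDriver(BaseDriver):
    def is_node_resource_normal(self, node: Node) -> bool:
        if node.power_state in BAD_POWER_STATES:
            return False
        if node.resource_class is None:
            return False
        if node.provision_state in BAD_PROVISION_STATES:
            return False
        # An 'available' node that still carries an instance_uuid is stale.
        if node.provision_state == "available" and node.instance_uuid:
            return False
        return True


def reportable_nodes(driver: BaseDriver, nodes: List[Node]) -> List[Node]:
    """Keep only the nodes the driver considers safe to report."""
    return [n for n in nodes if driver.is_node_resource_normal(n)]
```

With this shape, each driver decides what "normal" means for its own states, which is the point liusheng raises about different drivers.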
*** wanghao has quit IRC		07:23
openstackgerrit	Zhenguo Niu proposed openstack/mogan master: Add bad provision states check for nodes https://review.openstack.org/496497	07:25
zhenguo	liusheng: btw, please make a backup for the scripts in our physical env to avoid getting lost some day :D	07:27
liusheng	zhenguo: ok, will do that	07:27
liusheng	zhenguo: seems I caught a cold and not in a good state :D	07:28
zhenguo	liusheng: hah, feels better now :D	07:28
zhenguo	liusheng: seems this will make the server always not associated with a node uuid https://review.openstack.org/#/c/496207/	07:34
zhenguo	liusheng: if our revert process works well	07:35
liusheng	zhenguo: once it enters the rescheduling entry, the node uuid will be None, right?	07:36
zhenguo	liusheng: yes	07:36
liusheng	zhenguo: is that reasonable? hah	07:37
zhenguo	liusheng: our revert is just designed for this, lol	07:37
zhenguo	liusheng: we clean up the node, entworks,.. during the revert process	07:37
zhenguo	*networks	07:38
liusheng	zhenguo: ok, makes sense	07:38
zhenguo	liusheng: I'm testing it now, at least it seems better than before with this	07:38
liusheng	zhenguo: ok	07:39
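
The revert behaviour described above (always revert when spawn fails, destroy through the driver, and leave the server with no node uuid) can be illustrated with a plain-Python sketch. It deliberately does not use Mogan's real workflow code; `Server`, `FakeDriver`, and `build_server` are hypothetical names, and the actual patch may differ.

```python
class Server:
    """Hypothetical server record holding the scheduled node's uuid."""

    def __init__(self, uuid):
        self.uuid = uuid
        self.node_uuid = None
        self.status = "building"


class FakeDriver:
    """Stand-in driver whose spawn can be made to fail."""

    def __init__(self, fail_spawn=False):
        self.fail_spawn = fail_spawn
        self.destroyed = []

    def spawn(self, server):
        if self.fail_spawn:
            raise RuntimeError("spawn failed")

    def destroy(self, server):
        self.destroyed.append(server.uuid)


def build_server(server, node_uuid, driver):
    """Associate the node, spawn, and revert on any build exception."""
    server.node_uuid = node_uuid
    try:
        driver.spawn(server)
    except Exception:
        # Revert unconditionally: tear down through the driver, then clear
        # the association so the stale node uuid no longer pins the node.
        driver.destroy(server)
        server.node_uuid = None
        raise
    server.status = "active"
```

After a failed build the caller is free to reschedule, since the server no longer references the node it was first scheduled to.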
zhenguo	liusheng: seems everytime we setup the physical env, we can find a lot of mogan bugs, lol	07:39
*** wanghao has joined #openstack-mogan		07:39
liusheng	zhenguo: hah, it is ok, even Nova has many bugs now	07:40
zhenguo	liusheng: sure, hah	07:40
*** openstackgerrit has quit IRC		08:03
*** openstackgerrit has joined #openstack-mogan		08:07
openstackgerrit	liusheng proposed openstack/mogan master: Add affinity_zone field for server object https://review.openstack.org/495725	08:08
openstackgerrit	liusheng proposed openstack/mogan master: Add support for scheduler_hints https://review.openstack.org/463534	08:08
openstackgerrit	liusheng proposed openstack/mogan master: Use server group in scheduler https://review.openstack.org/496151	08:08
openstackgerrit	liusheng proposed openstack/mogan master: Use server group in scheduler https://review.openstack.org/496151	08:14
litao__	ping zhenguo	08:27
zhenguo	litao__: pong	08:27
litao__	zhenguo: I want to talk about the managing bare metal node API	08:28
litao__	zhenguo: I think we can update the neutron port profile in managing it	08:29
zhenguo	litao__: not currently	08:30
zhenguo	litao__: it's hard to control	08:30
litao__	zhenguo: if all tasks are left to admins, it is complex work	08:30
zhenguo	litao__: not all tasks left to admins	08:30
litao__	zhenguo: you mean we do this work later?	08:30
zhenguo	litao__: even if you don't have neutron port connected, we can manage it	08:30
zhenguo	litao__: we can just make things simple for now, or at least before Pike	08:31
*** wanghao has quit IRC		08:35
litao__	zhenguo: if so, we can simplify it, and after Pike, we can add these functions	08:36
*** wanghao has joined #openstack-mogan		08:36
litao__	zhenguo: because it is a huge amount of work to manage a bare metal node	08:38
litao__	zhenguo: especially for network configuration	08:39
zhenguo	litao__: yes, that part is too complex in my understanding	08:43
zhenguo	litao__: we can keep it simple in mogan, at least we can work, but if you want to continue to make it smarter, you can continue in the next cycle :D	08:43
litao__	zhenguo: so mogan should do more to reduce the work for admins	08:43
litao__	zhenguo: ok	08:44
zhenguo	litao__: hah	08:44
*** litao__ has quit IRC		08:51
*** zhuli has quit IRC		08:51
*** zhuli has joined #openstack-mogan		08:51
*** litao__ has joined #openstack-mogan		08:51
*** zhenguo has quit IRC		08:52
*** zhenguo has joined #openstack-mogan		08:52
openstackgerrit	Zhenguo Niu proposed openstack/mogan master: Add unplug_vif to network task revert https://review.openstack.org/496569	09:19
openstackgerrit	wanghao proposed openstack/mogan master: Manage existing BMs: Part-1 https://review.openstack.org/479660	09:21
openstackgerrit	wanghao proposed openstack/mogan master: Manage existing BMs: Part-1 https://review.openstack.org/479660	09:26
zhenguo	wanghao: seems my comments on the previous patch set got lost for the above patch :D	09:28
zhenguo	wanghao: it looks mostly good, please add the detail api response parameters as well, thanks!	09:28
wanghao	zhenguo: yeah, I miss the api-ref doc	09:29
wanghao	zhenguo: okay, I will update the doc.	09:29
zhenguo	wanghao: ok, thanks!	09:30
wanghao	zhenguo: np :)	09:30
*** wanghao has quit IRC		09:31
*** wanghao has joined #openstack-mogan		09:31
*** wanghao has quit IRC		09:31
*** wanghao has joined #openstack-mogan		09:32
*** wanghao has quit IRC		09:32
*** wanghao has joined #openstack-mogan		09:32
*** wanghao has quit IRC		09:33
*** wanghao has joined #openstack-mogan		09:33
*** wanghao has quit IRC		09:34
*** wanghao has joined #openstack-mogan		09:34
*** wanghao has quit IRC		09:35
*** wanghao has joined #openstack-mogan		09:35
*** wanghao has quit IRC		09:35
*** wanghao has joined #openstack-mogan		09:36
*** wanghao has quit IRC		09:36
*** wanghao has joined #openstack-mogan		09:37
openstackgerrit	Xinran WANG proposed openstack/mogan master: support specifying port_id when attaching interface https://review.openstack.org/494121	09:42
openstackgerrit	liusheng proposed openstack/mogan master: Use server group in scheduler https://review.openstack.org/496151	10:08
openstackgerrit	Zhenguo Niu proposed openstack/mogan master: Add unplug_vif to network task revert https://review.openstack.org/496569	10:46
openstackgerrit	Zhenguo Niu proposed openstack/mogan master: Leverage _detach_interface for destroying networks https://review.openstack.org/496569	10:49
*** litao__ has quit IRC		11:57
*** liusheng has quit IRC		17:37
*** liusheng has joined #openstack-mogan		17:38
