*** Qiming_ has joined #senlin | 01:03 | |
*** lawrancejing has joined #senlin | 01:10 | |
*** LiuWei has joined #senlin | 01:16 | |
*** Yanyanhu has joined #senlin | 01:32 | |
Qiming_ | admin__, there? | 01:50 |
---|---|---|
xuhaiwei | hi, Qiming, it's ok to ask a question now? | 01:55 |
Qiming_ | sure | 01:56 |
*** elynn has joined #senlin | 01:57 | |
xuhaiwei | about the set_status() method in actions/base | 01:57 |
*** lawrancejing has quit IRC | 01:57 | |
xuhaiwei | https://github.com/stackforge/senlin/blob/master/senlin/engine/actions/base.py#L310 | 01:57 |
xuhaiwei | when want to set the status as READY, the db api will do action_abandon? | 01:58 |
xuhaiwei | it seems not appropriate | 01:58 |
*** lawrancejing has joined #senlin | 01:58 | |
Qiming_ | when we want to enable an action to be retried later, we set its status back to READY, so that the scheduler will schedule it for execution later | 01:59 |
Qiming_ | setting status to READY is not the goal, it is a preparation for 'abandon' | 02:00 |
openstackgerrit | Yanyan Hu proposed stackforge/senlin: Reinitialize action properties after locking it https://review.openstack.org/220866 | 02:00 |
xuhaiwei | when setting the action to READY, it means we want some high priority action to be ran first? | 02:02 |
Qiming_ | there is no priorities among actions at the moment | 02:03 |
Qiming_ | when an action's status is READY and its lock released by the 'abandon' operation, we are expecting that the scheduler will schedule it for execution eventually | 02:04 |
Qiming_ | the action's current status, before 'abandon' should be RUNNING | 02:05 |
*** zhenguo_ has joined #senlin | 02:07 | |
*** elynn_ has joined #senlin | 02:08 | |
*** elynn_ has quit IRC | 02:08 | |
xuhaiwei | the dependency is added before setting the action's status, so there should be already priorities at that time | 02:10 |
xuhaiwei | https://github.com/stackforge/senlin/blob/master/senlin/engine/actions/cluster_action.py#L291 | 02:10 |
*** zhenguo has quit IRC | 02:15 | |
*** zhenguo_ is now known as zhenguo | 02:16 | |
Qiming_ | xuhaiwei, those are not 'priorities' | 02:16 |
Qiming_ | those are 'dependencies', expressed using the 'depends_on' and 'depended_by' fields in the action table | 02:17 |
xuhaiwei | shoud the actions marked as depended_by be ran first? | 02:17 |
Qiming_ | we don't know | 02:18 |
Qiming_ | we leave that to the scheduler | 02:18 |
Qiming_ | the scheduler is expected to run any action that is in READY state | 02:19 |
xuhaiwei | so why the 'dependency' field is needed | 02:19 |
Qiming_ | it is used to emulate fork-join | 02:19 |
openstackgerrit | Yanyan Hu proposed stackforge/senlin: Reinitialize action properties after locking it https://review.openstack.org/220866 | 02:20 |
Qiming_ | action A forks actions B, C, D, for example | 02:20 |
Qiming_ | using dependency, A puts itself into WAITING status, and B, C, D into READY status | 02:20 |
Qiming_ | when B, C, D are completed, they will put A back to READY status | 02:21 |
Qiming_ | then A continues to run | 02:21 |
xuhaiwei | so if A is dependened by B, A should be ran earlier than B? | 02:22 |
Qiming_ | yes | 02:23 |
Qiming_ | A will be run, because A is put into READY and B is in WAITING status | 02:23 |
Qiming_ | that is not a priority thing | 02:23 |
Qiming_ | priority, if you want to implement it, refers to fact that A is always executed before B, if both A and B are in READY status | 02:24 |
xuhaiwei | ok, got it | 02:24 |
*** branw has joined #senlin | 02:30 | |
xuhaiwei | so when we set the action to 'READY' status, we want it to be ran, but in action_abandon() we mark the status_reason as 'The action was abandoned.', that is confusing | 02:30 |
Qiming_ | xuhaiwei, you can propose a better, more generic logic if that is confusing you, :) | 02:34 |
xuhaiwei | I thought about it and felt READY and RES_RETRY should be treated differently, but not sure about it | 02:42 |
openstackgerrit | Yanyan Hu proposed stackforge/senlin: Reinitialize action properties after locking it https://review.openstack.org/220866 | 02:54 |
Qiming_ | Yanyanhu, about the patch 220866 | 02:55 |
Yanyanhu | yes? | 02:55 |
Qiming_ | I'm thinking of a revision to the ActionProc to solve the problem once for all | 02:55 |
Qiming_ | currently, line 468 is not a mandatory step | 02:56 |
Qiming_ | We can delete it "Step 1" completely | 02:56 |
Yanyanhu | yes, if we can build a 'context' for db access | 02:57 |
Yanyanhu | then we can load the action obj only after the action is locked successfully | 02:57 |
Qiming_ | The warning on line 478 can be changed to reflect the possible reaons: either the action doesn't exist, or it has been locked by another worker | 02:58 |
Yanyanhu | ok | 02:59 |
Qiming_ | we then materialize the action object by moving line 468 after line 479 | 02:59 |
Yanyanhu | right | 02:59 |
Qiming_ | .... | 03:00 |
Qiming_ | shit, this is incorrect either | 03:00 |
Yanyanhu | but for db access in line475, we still need a correct context | 03:00 |
Qiming_ | the key is to use the right context | 03:00 |
Qiming_ | yep | 03:00 |
Yanyanhu | so still the context problem... | 03:01 |
Yanyanhu | need a complete refactoring here | 03:01 |
Qiming_ | right, has to be done step by step | 03:02 |
Yanyanhu | em | 03:02 |
Qiming_ | first of all, modify __init__() to avoid the creation of new context if context is provided | 03:03 |
Yanyanhu | I have made a try to redesign context usage in action object as we discuessed last week, but met some issues | 03:03 |
Yanyanhu | yes | 03:03 |
Yanyanhu | so the basic idea is for action derived from existing action, we didn't create new context | 03:04 |
Yanyanhu | we create new context just for those actions created for RPC request | 03:04 |
Qiming_ | when __init__ is called, the testing can be simplified | 03:05 |
Qiming_ | just check if context is provided | 03:05 |
Yanyanhu | yes, if not, we use the user infor stored in target to initialize the new context for db acess | 03:07 |
*** Qiming__ has joined #senlin | 03:07 | |
*** Qiming_ has quit IRC | 03:07 | |
Yanyanhu | oh, I recalled those two questions I met when trying to make this change | 03:07 |
*** elynn has quit IRC | 03:08 | |
Yanyanhu | The first thing is we need a way to transmit 'is_admin' property. It is not contained in target object. | 03:09 |
*** elynn has joined #senlin | 03:09 | |
Yanyanhu | umm, this is the only issue. The second one about target obj can be solved using another way. | 03:11 |
Qiming__ | "is_admin" ? | 03:12 |
Qiming__ | why is it important? | 03:12 |
Yanyanhu | the context need it for initialization | 03:12 |
Yanyanhu | and the policy check will use it then | 03:13 |
Qiming__ | it is a problem need to be fixed | 03:13 |
Yanyanhu | seems for DB, we don't use this property | 03:14 |
Yanyanhu | another interim solution can be setting is_admin=true when new context is created in action initialization progress since we don't rely on policy to do any check in action layer | 03:27 |
Yanyanhu | or we remove context from action layer completely | 03:35 |
Yanyanhu | design another class for DB access | 03:36 |
Yanyanhu | or in first step, we define a class derives from RequestContext and override its methods like __init__ | 03:37 |
Qiming__ | another class sounds good | 03:38 |
Yanyanhu | and init those attributes just needed by DB access | 03:38 |
Qiming__ | don't mix it with the current 'RequestContext' | 03:38 |
Qiming__ | there is no | 03:38 |
Qiming__ | such thing once the request is turned into an internal action | 03:39 |
Yanyanhu | ok, then we know the 'context' in DB access progress is actually different from the one used in API/engine service layer | 03:39 |
Yanyanhu | yes | 03:39 |
*** Qiming__ has quit IRC | 03:44 | |
openstackgerrit | xu-haiwei proposed stackforge/senlin: Do some source cleanup https://review.openstack.org/220417 | 04:43 |
*** Qiming__ has joined #senlin | 05:23 | |
openstackgerrit | Merged stackforge/senlin: Do some source cleanup https://review.openstack.org/220417 | 05:32 |
*** branw has quit IRC | 05:50 | |
*** Shijia has joined #senlin | 05:52 | |
Yanyanhu | hi, Qiming, just quick looked through related code of olso_db, I think we may need to customize enginefacade module in oslo_db/sqlalchemy/ locally to use our own Session class to replace the default one. | 06:27 |
Yanyanhu | http://git.openstack.org/cgit/openstack/oslo.db/tree/oslo_db/sqlalchemy/orm.py#n63 | 06:29 |
Yanyanhu | http://git.openstack.org/cgit/openstack/oslo.db/tree/oslo_db/sqlalchemy/enginefacade.py#n363 | 06:29 |
Yanyanhu | the above two lines need to be revised | 06:29 |
Qiming__ | ... | 06:29 |
Qiming__ | that is going too far | 06:30 |
Yanyanhu | ... | 06:30 |
Qiming__ | we can just provide a DBSession class, for example, wrapping the current session reference into this new class | 06:31 |
Yanyanhu | oh | 06:31 |
Yanyanhu | this way | 06:31 |
Yanyanhu | I thought you mean adding some new properties like project into the Session class | 06:31 |
Qiming__ | don't introduce more inheritance please | 06:32 |
Yanyanhu | so I tried to find a way to override the definition of Session | 06:32 |
Yanyanhu | sure | 06:32 |
Qiming__ | it will make our code very dependent to the implementation details of other projects | 06:32 |
Yanyanhu | that will be too complicated and we need to maintain those code | 06:32 |
Yanyanhu | yes | 06:32 |
Yanyanhu | will be nightmare | 06:32 |
Yanyanhu | ok, will do the job by defining a DBSession class | 06:33 |
Qiming__ | the first job in the queue is to revise the Action constructor I think | 06:34 |
Yanyanhu | yes, we don't need context in action any more | 06:34 |
Yanyanhu | a DBSession obj will replace it | 06:34 |
Qiming__ | then we replace the (revised) context parameter with a DBSession object | 06:34 |
Yanyanhu | ok | 06:35 |
Yanyanhu | in first step, we don't change the param name defined in DB layer | 06:35 |
Yanyanhu | that will be a huge change | 06:35 |
Qiming__ | We have to make sure each step made is not worsening the situation at least | 06:35 |
Yanyanhu | ok | 06:36 |
Qiming__ | think about it | 06:36 |
Qiming__ | suppose you have a DBSession class at hand, how would you change the Action intialization code? | 06:36 |
Qiming__ | change the intialization logic, and at the same time, replace the context with an object of a new class? | 06:37 |
Yanyanhu | guess so | 06:37 |
Qiming__ | that is too dangerous a thing to do | 06:38 |
Qiming__ | the Action initialization part is already very sensitive to any changes | 06:38 |
Yanyanhu | hmm, that's true | 06:39 |
Qiming__ | and I'd strongly suggest you to do this by steps | 06:39 |
Yanyanhu | ok | 06:40 |
Yanyanhu | will give the definition and DB model support of DBSession class | 06:40 |
Yanyanhu | and then touch the Action base module | 06:40 |
Yanyanhu | oh, not DBSession class | 06:41 |
Yanyanhu | let me see | 06:41 |
Yanyanhu | we don't need a DB model for DBSession class | 06:42 |
Yanyanhu | just covert it to dict before storing | 06:42 |
Qiming__ | ... | 06:42 |
Qiming__ | DB model for DBSession class? | 06:42 |
Qiming__ | a new table? | 06:42 |
Yanyanhu | false statement | 06:43 |
Yanyanhu | plz just ignore it :) | 06:43 |
Qiming__ | ok | 06:43 |
Qiming__ | admin__, online? | 07:03 |
openstackgerrit | Yanyan Hu proposed stackforge/senlin: Reinitialize action properties after locking it https://review.openstack.org/220866 | 07:03 |
openstackgerrit | Merged stackforge/senlin: Reinitialize action properties after locking it https://review.openstack.org/220866 | 07:13 |
openstackgerrit | Yanyan Hu proposed stackforge/senlin: Implement DBSession class https://review.openstack.org/220904 | 08:02 |
Yanyanhu | hi, Qiming__, just proposed the patch to add DBSession class, please help to review. Thanks | 08:02 |
Qiming__ | any reasoning behind the late initialization of the DBSession._session? | 08:03 |
Yanyanhu | no special reason, just feel maybe this can reduce the opening time of session connection | 08:04 |
Qiming__ | don't understand | 08:06 |
Yanyanhu | I mean if we initialize the _session property of DBSession only when it is first referred, we don't need to create DB connection until the DBSession.session is really used to open DB transaction | 08:08 |
Qiming__ | if we don't use DBSession.session, why are we creating DBSession anyway? | 08:09 |
Yanyanhu | yes, we will use it. But not just after it is initialized. For example, in action.__init__, a DBSession object will be created. But the _session will not be used until we first time invoke a db_api interface with DBSession object. | 08:11 |
Yanyanhu | so I think we just need to initialize the _session in that time but not in action.__init__ | 08:12 |
Qiming__ | so .... the point is, we need to think through the whole life cycle of the DBSession object | 08:12 |
Yanyanhu | yep | 08:12 |
Yanyanhu | actually I'm wondering maybe we can have a place to store those opened session and control them by ourself | 08:12 |
Yanyanhu | rather than just let them been destroyed when the DBSession object is destructed | 08:13 |
Yanyanhu | but this may be dangerous... | 08:13 |
Qiming__ | surely if will be dangerous, given that concurrency is so difficult | 08:13 |
Yanyanhu | yes | 08:13 |
Yanyanhu | so I guess late initialization of DBSession._session might be able to provide some help for reducing the presure of DB connection | 08:15 |
Qiming__ | I'm not sure of that | 08:15 |
Yanyanhu | but no big effect actually | 08:15 |
Qiming__ | In senlin's execution model | 08:17 |
Qiming__ | how would that save us a lot resource? | 08:17 |
Qiming__ | That was the reason I suggested you to start with refactoring the Action initialization as the FIRST step | 08:18 |
Qiming__ | We are constructing a context when calling _from_db_record() | 08:19 |
Qiming__ | we are always creating a new context when __init__ is called | 08:19 |
Qiming__ | all those logic need to be revised | 08:19 |
Yanyanhu | yes | 08:19 |
Qiming__ | without revising that, replacing the context with DBSession or whatever won't be making things any better | 08:20 |
Yanyanhu | yes, that's true. But without the new DBSession class, I think we can't finish the refactoring just use RequestContext I think since there are too much trash properties in this class | 08:22 |
Yanyanhu | of course the key is to fix rebuilding of context/DBSession each time action is reloaded from DB | 08:23 |
Qiming__ | yes | 08:30 |
Yanyanhu | I guess we may need to know more about the DB session to understand how it effect the progress of DB accessing since we can't store the session object in DB. Will make more investigations about this issue. | 08:35 |
*** lixinhui has joined #senlin | 08:49 | |
Qiming__ | in senlin's execution model | 08:51 |
Qiming__ | an action is always retrieved from DB before execution (ref: scheduler and action.ActionProc) | 08:52 |
Qiming__ | thus any db session created in senlin.common.context.RequestContext won't get used during action execution | 08:53 |
Qiming__ | an action object is constructed before execution and it becomes garbage once it is finished | 08:53 |
Qiming__ | we have DBSession generated for action object's during this phase | 08:54 |
Qiming__ | then ... where is the need to persist DBSession object? | 08:54 |
Yanyanhu | yes, this is the problem. | 08:57 |
*** Shijia has quit IRC | 08:57 | |
Qiming__ | DBSession is only supposed to be used by an Action instance, I don't feel the need to persist it | 08:57 |
Yanyanhu | But we need to store some properties like project, user into DB since they should be part of DBSession | 08:58 |
Qiming__ | why into DB? | 08:58 |
Qiming__ | the DBSession object is always in memory | 08:58 |
Yanyanhu | yes, DBSession object is always in memory. But session property of action will be stored into DB which actually contains information of DBSession object | 09:00 |
Qiming__ | why? | 09:00 |
Yanyanhu | when action is reloaded from DB, its context/session property will be re-initialized based on this record | 09:00 |
Qiming__ | yes | 09:00 |
Qiming__ | is that a problem? | 09:01 |
Yanyanhu | sure | 09:01 |
Yanyanhu | it is | 09:01 |
Yanyanhu | the ideal way is reuse the context/session object that in memory when action is reloaed | 09:01 |
Qiming__ | when reloaded from DB, an action ALWAYS creates a new (in-memory) DBSession for execution | 09:01 |
Yanyanhu | yes | 09:02 |
Qiming__ | the best way to detect/handle concurrency problems is to have each action working with its own DB session | 09:03 |
Yanyanhu | but I think we may not be able to resolve this issue if we don't manage the lifecycle of orm.session.Session by ourself | 09:03 |
Yanyanhu | yes | 09:03 |
Yanyanhu | that's our goal | 09:03 |
Qiming__ | you may want to save your effort on that | 09:03 |
Qiming__ | if you dive into the orm model, you will get lost for sure | 09:04 |
Qiming__ | the session design is more complicated than we can imagine | 09:04 |
Yanyanhu | Yes, I found it's almost impossible to do that | 09:04 |
Yanyanhu | just need can't figure out how can we keep the session alive | 09:05 |
Qiming__ | and there is no benefit, other than the imagined session savings | 09:05 |
Yanyanhu | during the store/reload progress of action | 09:05 |
Qiming__ | ... I don't think keeping session alive is that important | 09:06 |
Yanyanhu | yes, that's right | 09:06 |
Yanyanhu | this is why the context object rebuilding in _from_db_record is allowed I think | 09:07 |
Yanyanhu | the difference between new/old context objects is their session property | 09:07 |
Qiming__ | we create new context/DBSession intentionally | 09:08 |
Yanyanhu | the context recreate in Action.__init__ can be removed | 09:08 |
Yanyanhu | you mean? | 09:09 |
Qiming__ | .... | 09:10 |
Qiming__ | we create new DBsession (or 'context') when we load an action from DB | 09:10 |
Qiming__ | we do this creation only once | 09:10 |
Qiming__ | that is the first step | 09:10 |
Yanyanhu | yes? | 09:10 |
Yanyanhu | then we spread it to derived actions(like node create action)? | 09:11 |
Yanyanhu | spread the DBSession object | 09:11 |
Qiming__ | then, if we have proofs to show that inherit 'context'/'dbsession' is something worth to do, we do it | 09:11 |
Qiming__ | if not, we don't do it | 09:12 |
Qiming__ | by the way | 09:12 |
Yanyanhu | inherit? You mean use the same DBSession object created in the first time action object is reloaded from DB? | 09:13 |
Qiming__ | I have tried that path ... having "child" actions to "inherit" DBsessions from "parent" action | 09:13 |
Qiming__ | it is not passible | 09:13 |
Qiming__ | it is impossible | 09:13 |
Yanyanhu | yes, that's the problem | 09:13 |
Qiming__ | don't understand 'inherit'? | 09:13 |
Yanyanhu | it's impossible since we are in a multi-region env | 09:13 |
Qiming__ | more than that | 09:13 |
Qiming__ | action triggering is done via IPC | 09:14 |
Qiming__ | RPC | 09:14 |
Yanyanhu | yes | 09:14 |
Qiming__ | you cannot pass in-memory object in that way | 09:14 |
Yanyanhu | so DB is the only way we can transmit the action information | 09:14 |
Qiming__ | no!!!! | 09:15 |
Qiming__ | DBSession is an in-memory object | 09:15 |
Qiming__ | session is an in-memory object | 09:15 |
Qiming__ | how would you handle DBMS restart? | 09:15 |
Yanyanhu | yes, but how we transmit them between two individual process without manage the memory sharing by ourself? | 09:16 |
Qiming__ | how do you persist a db-session object, serialize it and deserialize it? | 09:16 |
Qiming__ | we don't transmit them! | 09:16 |
Qiming__ | we don't pass session from one action to another | 09:17 |
Yanyanhu | Qiming__> how do you persist a db-session object, serialize it and deserialize it? This is just what I want to say... | 09:17 |
Qiming__ | we don't 'spread' it, we don't allow inherit it | 09:17 |
Yanyanhu | yes! | 09:17 |
Yanyanhu | sigh | 09:17 |
Yanyanhu | seems some misunderstanding here... | 09:17 |
Qiming__ | each action will have a db-session (in-memory) property during its execution | 09:18 |
Yanyanhu | yes | 09:18 |
Yanyanhu | that's right | 09:18 |
Qiming__ | then why the question: but how we transmit them between two individual process without manage the memory sharing by ourself? | 09:18 |
Yanyanhu | I thought you mean we don't store action/session into DB | 09:19 |
Yanyanhu | and directly share the same action/session object between multiple engine. I think this is impossible | 09:19 |
Qiming__ | we don't store it, yes, we don't | 09:19 |
Qiming__ | we don't share it | 09:19 |
Yanyanhu | ok | 09:20 |
Qiming__ | I never said sharing it | 09:20 |
Qiming__ | Qiming__> I have tried that path ... having "child" actions to "inherit" DBsessions from "parent" action | 09:20 |
Qiming__ | Qiming__> it is impossible | 09:20 |
Yanyanhu | ok | 09:20 |
Qiming__ | it is not possible even between parent and child actions | 09:20 |
Yanyanhu | yes | 09:20 |
Qiming__ | how could you enable it in other scenarios? | 09:21 |
Yanyanhu | ok. So if we neither store action into DB or share it, how we notify other workers to execute new actions created in current worker thread? | 09:23 |
Qiming__ | show me the line of code you are talking about? | 09:24 |
Yanyanhu | ? you mean the code of notify? | 09:24 |
Qiming__ | yes | 09:25 |
Yanyanhu | ok, let me check it | 09:25 |
Yanyanhu | http://git.openstack.org/cgit/stackforge/senlin/tree/senlin/engine/dispatcher.py | 09:27 |
Yanyanhu | http://git.openstack.org/cgit/stackforge/senlin/tree/senlin/engine/dispatcher.py#n108 | 09:28 |
Yanyanhu | this line I think | 09:28 |
Qiming__ | 1 min, talking to Ji Zi Lian | 09:28 |
Yanyanhu | ok | 09:29 |
Qiming__ | okay, why we need a context there? | 09:30 |
Yanyanhu | we don't | 09:31 |
Qiming__ | it seems we are passing 'context' on line 102 | 09:32 |
Yanyanhu | this is a historical problem I think | 09:32 |
Qiming__ | but the context you get on line 54 is a completely different thing | 09:32 |
Yanyanhu | yes, the same as the one from api to engine | 09:32 |
Yanyanhu | yes | 09:33 |
Yanyanhu | they are two objects in two different threads | 09:34 |
Yanyanhu | there is a serialization and deserialization progress | 09:36 |
Yanyanhu | when this param transmitted through rpc | 09:36 |
Qiming__ | so? | 09:40 |
Qiming__ | why is it needed? | 09:40 |
Yanyanhu | it is not needed. But when we started to write the code, context is still there | 09:41 |
Yanyanhu | in every coner of senlin service | 09:41 |
Yanyanhu | but it is not needed actually | 09:42 |
Qiming__ | so ... pls help revisit that code | 09:42 |
Yanyanhu | actually after the code execution is outof engine/service, we don't need context any more | 09:42 |
Yanyanhu | sure | 09:42 |
Qiming__ | right | 09:42 |
Yanyanhu | will recheck it | 09:42 |
Qiming__ | cool | 09:43 |
Yanyanhu | will propose a patch for this | 09:44 |
xuhaiwei | sorry to interpret | 09:44 |
Yanyanhu | hello | 09:45 |
xuhaiwei | is an action executed only when it is in 'READY' status? | 09:45 |
xuhaiwei | I found the set_status is not used to all the actions | 09:46 |
xuhaiwei | in engine/service layer, we dont set status for actions | 09:46 |
Yanyanhu | hi, xuhaiwei, I guess the action is in INIT status before the first time execution of an action | 09:48 |
Yanyanhu | let me have a look | 09:48 |
Yanyanhu | after the action is created, its status is INIT. Then after it is locked, it will be marked as RUNNING | 09:49 |
xuhaiwei | yes, in engine/service layer, the actions are INIT status, but in the cluster/node actions layer, they are marked READY before being executed | 09:49 |
xuhaiwei | that's right Yanyanhu, I just wonder if we need to check the action's status before locking it | 09:50 |
Yanyanhu | hmm, I guess we don't. | 09:52 |
Yanyanhu | or it's too complicated to check all cases? | 09:52 |
xuhaiwei | if we don't check the status, an action marked FAILED will also get the lock , though currently no actions are marked failed | 09:54 |
Yanyanhu | you mean the lock of cluster? | 09:54 |
Yanyanhu | or the lock of a cluster_action | 09:54 |
xuhaiwei | cluster_action | 09:55 |
Yanyanhu | so what you mean by 'an action marked FAILED will also get the lock' | 09:55 |
Yanyanhu | you mean an action marked as FAILED will be locked by a worker? | 09:55 |
xuhaiwei | currently we only set actions' status to READY, but if an action is marked as FAILED by set_status, it will also be executed by the scheduler | 09:57 |
xuhaiwei | If it is not, I think we dont need to set status to action | 09:58 |
Yanyanhu | hi, xuhaiwei, you're right if the work will check the action table by themselves periodically | 09:58 |
Yanyanhu | and that is one of our goal in future | 09:58 |
*** lawrancejing has quit IRC | 09:59 | |
Yanyanhu | but currently, action dispatching only happens after the action is first time created | 09:59 |
xuhaiwei | yes | 10:00 |
Yanyanhu | so we don't need to worry an action with FAILED status will be rescheduled by other worker threads | 10:00 |
xuhaiwei | yes, currently not, I just think we need to improve the action execution | 10:02 |
Yanyanhu | sure | 10:02 |
*** elynn has quit IRC | 10:03 | |
Yanyanhu | theoretically, worker should check the action status before acquiring its lock | 10:03 |
Yanyanhu | it should only acquire lock of those actions whose status is READY(or INIT?) | 10:04 |
xuhaiwei | I think READY is meaningful | 10:04 |
Qiming__ | the current scheduler is not doing scheduling at all | 10:06 |
Qiming__ | it only accepts a request and start an action immediately | 10:06 |
Qiming__ | the design/goal was to have it work like a scheduler | 10:06 |
Qiming__ | so it will do things like invoke this api: http://git.openstack.org/cgit/stackforge/senlin/tree/senlin/db/sqlalchemy/api.py#n1154 | 10:07 |
xuhaiwei | so the status logic is not working actually, right? | 10:07 |
Qiming__ | the pitfall is as what I have just outlined above | 10:08 |
Qiming__ | if you equalize that to "the status logic is not working actually, ", fine | 10:08 |
Yanyanhu | hi, xuhaiwei, the status logic is mainly used for dependency checking now I think | 10:09 |
Qiming__ | one of the most complicated handling of action status is here: http://git.openstack.org/cgit/stackforge/senlin/tree/senlin/engine/actions/cluster_action.py#n58 | 10:10 |
xuhaiwei | yes | 10:11 |
Qiming__ | if you read this set_status() function: http://git.openstack.org/cgit/stackforge/senlin/tree/senlin/engine/actions/base.py#n287 | 10:11 |
Qiming__ | you will get a better understanding how we have been making heavy use of action.status | 10:11 |
Qiming__ | when we call db_api.action_mark_failed() there, we are doing a lot in the database layer, all manipulating action.status | 10:14 |
Qiming__ | http://git.openstack.org/cgit/stackforge/senlin/tree/senlin/db/sqlalchemy/api.py#n1341 | 10:14 |
xuhaiwei | yes, but after I read the set_status method, I got a little confused, why in engine/service layer, we dont set status to action before executing it | 10:14 |
Qiming__ | we should | 10:14 |
Qiming__ | the code still works because "<Qiming__> the current scheduler is not doing scheduling at all" | 10:15 |
xuhaiwei | got it | 10:16 |
xuhaiwei | I'd like to make a patch to make it a little nearer to our goal | 10:16 |
Yanyanhu | xuhaiwei, that's great :) | 10:18 |
Qiming__ | ok | 10:18 |
*** Qiming__ has quit IRC | 10:53 | |
*** Yanyanhu has quit IRC | 10:56 | |
openstackgerrit | xu-haiwei proposed stackforge/senlin: Check action status before lock it https://review.openstack.org/220950 | 10:56 |
*** lixinhui has quit IRC | 11:12 | |
*** lawrancejing has joined #senlin | 11:33 | |
*** lixinhui has joined #senlin | 12:09 | |
*** Qiming_ has joined #senlin | 12:12 | |
*** lixinhui has quit IRC | 12:14 | |
Qiming_ | xuhaiwei, still on? | 12:19 |
*** Qiming_ has quit IRC | 12:33 | |
*** Qiming__ has joined #senlin | 12:33 | |
*** lawrance_ has joined #senlin | 12:35 | |
*** lawrancejing has quit IRC | 12:37 | |
openstackgerrit | Merged stackforge/senlin: Cluster action test case (1) https://review.openstack.org/220727 | 12:55 |
openstackgerrit | Merged stackforge/senlin: Cluster action test case (2) https://review.openstack.org/220785 | 12:59 |
*** zhenguo has quit IRC | 13:00 | |
*** lixinhui has joined #senlin | 13:02 | |
openstackgerrit | Cindia-blue proposed stackforge/senlin: Add a placement-policy to enable vSphere DRS functions https://review.openstack.org/219212 | 13:20 |
openstackgerrit | Merged stackforge/senlin: Add test case for Node Action https://review.openstack.org/211409 | 13:33 |
openstackgerrit | Qiming Teng proposed stackforge/senlin: Cluster action test case (3) https://review.openstack.org/221017 | 14:18 |
*** lixinhui has quit IRC | 14:46 | |
*** Qiming__ has quit IRC | 15:28 | |
*** lawrance_ has quit IRC | 15:28 | |
*** Qiming__ has joined #senlin | 23:41 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!