*** kevinz has joined #openstack-nimble | 00:50 | |
*** kevinz has quit IRC | 00:58 | |
*** kevinz has joined #openstack-nimble | 01:39 | |
*** kevinz has quit IRC | 02:34 | |
*** kevinz has joined #openstack-nimble | 02:40 | |
liusheng | zhenguo: ping | 02:57 |
---|---|---|
zhenguo | liusheng: pong | 02:57 |
liusheng | zhenguo: a question: don't we need to configure the [pxe] section in our devstack installation? | 02:58 |
zhenguo | liusheng: seems not | 02:58 |
liusheng | zhenguo: in ironic.conf | 02:58 |
liusheng | zhenguo: why | 02:58 |
zhenguo | liusheng: it has default values | 02:59 |
zhenguo | liusheng: we should not change it in devstack env | 03:00 |
liusheng | zhenguo: now my local devstack can also create an instance. I just compared the ironic.conf in my devstack with the one in the tempest job; the obvious difference is that the ironic.conf in the tempest job doesn't configure this section | 03:00 |
zhenguo | liusheng: oh, you mean your env is ok now? | 03:01 |
liusheng | zhenguo: yes, but I have re-installed my devstack in a vm, it is ok. seems it cannot work in a devstack installed based on a physical server. | 03:02 |
liusheng | zhenguo: it is strange | 03:02 |
zhenguo | liusheng: hah | 03:03 |
liusheng | zhenguo: I have checked the ironic.conf both in my env and in another ironic tempest job; it has the [pxe] section, but our tempest job doesn't include it | 03:04 |
zhenguo | liusheng: I just checked the tempest ironic.conf, it has a pxe section | 03:06 |
liusheng | zhenguo: oh, my mistake :( | 03:07 |
liusheng | zhenguo: yes, it has | 03:07 |
zhenguo | liusheng: but not sure why it doesn't work | 03:07 |
liusheng | zhenguo: I am crazy with that :( | 03:08 |
zhenguo | liusheng: can you read more information from the console log? | 03:08 |
liusheng | zhenguo: do you think it is related to "crazy" | 03:08 |
liusheng | enabled_drivers = fake,agent_ssh,agent_ipmitool,pxe_ssh,pxe_ipmitool | 03:08 |
zhenguo | liusheng: lol | 03:08 |
liusheng | this config option ? | 03:08 |
zhenguo | liusheng: no, the ironic node console log | 03:08 |
liusheng | zhenguo: I cannot find useful info from the console log | 03:09 |
zhenguo | liusheng: does it have a DHCP process | 03:09 |
liusheng | zhenguo: it is enabled_drivers = fake,agent_ssh,agent_ipmitool in our tempest job, but enabled_drivers = fake,agent_ssh,agent_ipmitool,pxe_ssh,pxe_ipmitool in my env | 03:10 |
zhenguo | liusheng: seems not related, as we only use agent_ssh. | 03:10 |
liusheng | zhenguo: you mean neutron-dhcp-agent ? | 03:11 |
zhenguo | liusheng: no, | 03:11 |
zhenguo | liusheng: I mean when the node starts, it should go through a DHCP process; I want to know if it can get an IP from neutron | 03:12 |
zhenguo | liusheng: I still suspect it's a network problem | 03:12 |
liusheng | zhenguo: I don't know how to confirm, the ironic-cond's log just says "timeout for waiting call-back" | 03:13 |
zhenguo | liusheng: the only way is to get some clues from ironic-bm-logs | 03:13 |
zhenguo | liusheng: you can check the ironic-bm-logs in your env | 03:17 |
zhenguo | liusheng: which VIM plugin do you use to make it human readable? | 03:18 |
liusheng | zhenguo: Nothing to boot: No such file or directory (http://ipxe.org/2d03e13b) | 03:19 |
liusheng | No more network devices | 03:19 |
liusheng | Press Ctrl-B for the iPXE command line... | 03:19 |
liusheng | No bootable device. | 03:19 |
liusheng | zhenguo: you can use less -R {filename} to read | 03:19 |
zhenguo | liusheng: ok, thanks | 03:19 |
liusheng | zhenguo: or install the plugin AnsiEsc | 03:19 |
liusheng | zhenguo: http://vim.sourceforge.net/scripts/script.php?script_id=302 | 03:20 |
liusheng | zhenguo: the above is the useful info from the console log | 03:20 |
zhenguo | liusheng: seems it's still a network problem | 03:22 |
zhenguo | liusheng: we don't get an IP from the neutron network DHCP | 03:22 |
liusheng | zhenguo: hmm, maybe it cannot support dynamically creating networks in the tempest job :( | 03:25 |
zhenguo | liusheng: I remember Ironic devstack plugin also set other options for tempest network | 03:25 |
zhenguo | liusheng: no, we have already created the network | 03:25 |
zhenguo | liusheng: some other options | 03:25 |
liusheng | zhenguo: maybe you can help, I haven't gotten in the door of ironic yet, lol | 03:26 |
zhenguo | liusheng: haha. sure | 03:27 |
zhenguo | liusheng: ironic's tempest job works, and it really creates and deletes instances | 03:39 |
zhenguo | liusheng: the best way is to follow it | 03:39 |
liusheng | zhenguo: yes, but I don't know what the difference is between ironic's and nimble's jobs | 03:40 |
*** kevinz has quit IRC | 03:58 | |
*** kevinz has joined #openstack-nimble | 04:50 | |
*** kevinz has quit IRC | 06:14 | |
zhenguo | liusheng: I will dig the tempest failure after taskflow work, maybe you can do other things first | 06:37 |
liusheng | zhenguo: thank you a lot! :) | 06:37 |
zhenguo | liusheng: np :D | 06:38 |
zhenguo | liusheng: Alex mentioned that nova has a config option for name, maybe we can follow that way | 06:38 |
zhenguo | liusheng: but on db side, we should not set name as unique, please continue the patch | 06:39 |
liusheng | zhenguo: OK, get it | 06:39 |
zhenguo | liusheng: thanks | 06:39 |
*** yuntongjin has joined #openstack-nimble | 06:55 | |
shaohe_feng | zhenguo: hi | 06:57 |
zhenguo | shaohe_feng: hi | 06:57 |
shaohe_feng | zhenguo: I created the etherpad, you can add it to the wiki. https://etherpad.openstack.org/p/nimble-task | 06:58 |
shaohe_feng | zhenguo: and I have seen you are working on configdrive | 06:58 |
*** kevinz has joined #openstack-nimble | 06:58 | |
shaohe_feng | zhenguo: so I will work on the quotas. | 06:58 |
zhenguo | shaohe_feng: ok, thanks | 06:59 |
zhenguo | shaohe_feng: I have taken over the taskflow work | 06:59 |
shaohe_feng | zhenguo: so many tasks for you. | 06:59 |
shaohe_feng | zhenguo: you can update the etherpad. | 07:00 |
zhenguo | shaohe_feng: I find a way to revert task, will update soon, hope the create taskflow work will be done by tomorrow. | 07:00 |
shaohe_feng | zhenguo: so I can help to work on the configdrive | 07:01 |
shaohe_feng | zhenguo: OK. another question. | 07:01 |
shaohe_feng | zhenguo: what if the delete API deletes the DB entry | 07:01 |
shaohe_feng | zhenguo: while the create is still in progress | 07:01 |
zhenguo | shaohe_feng: sure, you can work on that first :P | 07:01 |
zhenguo | shaohe_feng: I will focus on create/delete refactor task | 07:01 |
zhenguo | shaohe_feng: it's a tricky one | 07:02 |
shaohe_feng | zhenguo: something wrong with the nimble daemon | 07:02 |
zhenguo | shaohe_feng: maybe we need to add a lock | 07:02 |
shaohe_feng | zhenguo: the nimble restart | 07:02 |
shaohe_feng | zhenguo: I mean the nimble daemon restarts while it is running "create instance" | 07:03 |
shaohe_feng | zhenguo: how does it know which resources it should free? | 07:03 |
zhenguo | shaohe_feng: not sure, I will refactor the how process | 07:04 |
zhenguo | shaohe_feng: please check tomorrow, hah | 07:04 |
zhenguo | s/how/whole | 07:04 |
shaohe_feng | zhenguo: so resources such as networks and volumes will become zombies | 07:04 |
shaohe_feng | zhenguo: who will reap | 07:05 |
shaohe_feng | zhenguo: who will reap them? | 07:05 |
zhenguo | shaohe_feng: yes, nobody will remove them | 07:05 |
zhenguo | shaohe_feng: and as the instance has been deleted, you can't even find the network information | 07:05 |
shaohe_feng | zhenguo: can we avoid the zombie resources? | 07:06 |
zhenguo | shaohe_feng: you mean the scenario of deleting an instance while the create is still in progress | 07:06 |
zhenguo | shaohe_feng: or when creating the instance failed | 07:07 |
shaohe_feng | zhenguo: yes. | 07:08 |
zhenguo | shaohe_feng: do you think it makes sense to prevent deleting while we are in the building process? | 07:09 |
shaohe_feng | zhenguo: prevent is a simple design. | 07:11 |
zhenguo | shaohe_feng: yes, but simple doesn't mean bad, hah | 07:12 |
shaohe_feng | zhenguo: agree. | 07:14 |
zhenguo | shaohe_feng: why would you want to delete it while it's still in the building process | 07:14 |
zhenguo | shaohe_feng: if we can make sure it will not stay in the building process forever | 07:14 |
shaohe_feng | zhenguo: yes, we need a scenario. | 07:14 |
shaohe_feng | zhenguo: https://wiki.openstack.org/wiki/Nimble#Task_track | 07:15 |
zhenguo | shaohe_feng: yes, until someone asks us to do that, I think just preventing deletion is ok | 07:15 |
shaohe_feng | zhenguo: let me discuss it with ZangRui. | 07:15 |
shaohe_feng | zhenguo: OK. | 07:15 |
zhenguo | shaohe_feng: thanks | 07:15 |
zhenguo | shaohe_feng: and when an instance is in the building process and a delete request comes in, do you think we should just return an error, or wait for the process to finish and then delete it | 07:18 |
shaohe_feng | zhenguo: an error is simple. | 07:37 |
shaohe_feng | zhenguo: returning 202 and deleting with a task in the background is also OK. | 07:38 |
zhenguo | shaohe_feng: I think we can discuss it more, when the deleting refactor is in process. | 07:39 |
zhenguo | shaohe_feng: after I finished the create taskflow work | 07:39 |
shaohe_feng | zhenguo: OK. another idea. Add a "deleted" field in the DB, with default value "False". | 07:41 |
shaohe_feng | zhenguo: when a delete API request arrives, if the status is error, deploying or finished, just simply reap the resources and delete the instance. | 07:42 |
shaohe_feng | zhenguo: if the status is building, just mark the "deleted" field of the instance in the DB as "True" | 07:43 |
shaohe_feng | zhenguo: and the create task checks this "deleted" field and starts the revert. | 07:44 |
zhenguo | shaohe_feng: seems ok | 07:44 |
zhenguo | shaohe_feng: besides this, I also want to add a deleted field to the db | 07:45 |
zhenguo | shaohe_feng: as users may want to check last month's instance usage information | 07:45 |
shaohe_feng | zhenguo: if the nimble daemon restarts, in the initialization phase it can check the status and "deleted" field of all instances and then reap the zombie resources. | 07:46 |
shaohe_feng | zhenguo: let me check nova code, if there is "deleted" field of instance | 07:47 |
zhenguo | shaohe_feng: do you think it's a bit of a waste if we create all the resources and just delete them after that? | 07:47 |
zhenguo | shaohe_feng: there are deleted fields in all nova, cinder, glance tables. | 07:48 |
shaohe_feng | zhenguo: maybe users regret it after they create the instance for some reason, maybe some mistake. | 07:49 |
zhenguo | shaohe_feng: in every task in the create instance flow, we should check if it has been deleted and raise an InstanceNotFound exception to trigger the revert work, instead of waiting for all resources to be created | 07:50 |
shaohe_feng | zhenguo: such as when they use the wrong key to inject, or other reasons. | 07:50 |
shaohe_feng | zhenguo: agree. | 07:50 |
zhenguo | shaohe_feng: yes, we should gracefully handle users' requests instead of just returning a forbidden error | 07:51 |
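
The idea discussed above (mark the instance as deleted, let every create task check the flag, and raise InstanceNotFound to trigger the revert) could be sketched roughly as follows. This is only an illustration in plain Python, not nimble's actual taskflow code: `FAKE_DB`, `ensure_not_deleted`, `run_create_flow` and the task functions are all hypothetical names, and the in-memory dict stands in for the real instances table.

```python
class InstanceNotFound(Exception):
    """Raised when the instance was marked deleted during the build."""


# Hypothetical in-memory stand-in for the instances table.
FAKE_DB = {"inst-1": {"status": "building", "deleted": False}}


def ensure_not_deleted(instance_id):
    """Abort the build if a delete request has flagged the instance."""
    record = FAKE_DB.get(instance_id)
    if record is None or record["deleted"]:
        raise InstanceNotFound(instance_id)


def run_create_flow(instance_id, tasks):
    """Run (execute, revert) pairs in order; undo finished ones on delete."""
    done = []
    try:
        for execute, revert in tasks:
            ensure_not_deleted(instance_id)  # checked before every task
            execute(instance_id)
            done.append(revert)
    except InstanceNotFound:
        # Revert in reverse order so no zombie resources are left behind.
        for revert in reversed(done):
            revert(instance_id)
        return False
    FAKE_DB[instance_id]["status"] = "active"
    return True


log = []


def create_port(i):
    log.append("create_port")


def delete_port(i):
    log.append("delete_port")


def create_volume(i):
    log.append("create_volume")
    FAKE_DB[i]["deleted"] = True  # simulate a delete request arriving mid-build


def delete_volume(i):
    log.append("delete_volume")


def deploy(i):
    log.append("deploy")


def undeploy(i):
    log.append("undeploy")


ok = run_create_flow(
    "inst-1",
    [(create_port, delete_port), (create_volume, delete_volume), (deploy, undeploy)],
)
print(ok, log)
```

In this sketch the deploy step never runs, and the port and volume created earlier are released in reverse order, which is the behaviour the flag-and-revert proposal is after.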
zhenguo | shaohe_feng: do you know whether we need to proved floating ip associate/unassociate API? or neutron API can do that | 07:53 |
zhenguo | s/proved/provide | 07:53 |
shaohe_feng | zhenguo: https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L196 nova mark many deleted. | 07:55 |
zhenguo | shaohe_feng: yes | 07:57 |
shaohe_feng | zhenguo: the neutron API does that. Does nimble always request floating IPs from neutron? | 07:58 |
zhenguo | shaohe_feng: yes | 07:58 |
zhenguo | shaohe_feng: so we don't need to provide a separate API? | 07:59 |
zhenguo | shaohe_feng: I know we can allocate floating IPs from neutron, but can neutron also provide the instance associate API? | 08:00 |
shaohe_feng | zhenguo: we need to dig into how nova uses the deleted information. | 08:01 |
shaohe_feng | zhenguo: https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L2075 you can see the delete and soft_delete description | 08:01 |
*** yuntongjin has quit IRC | 08:02 | |
zhenguo | shaohe_feng: yes, but seems we don't need soft_deleted | 08:02 |
shaohe_feng | zhenguo: 'deleted' - only return (or exclude) deleted instances | 08:03 |
zhenguo | shaohe_feng: yes, it's a query option | 08:03 |
zhenguo | shaohe_feng: if you specify the deleted parameter when querying instances, it returns just the deleted instances | 08:04 |
zhenguo | shaohe_feng: but I remember someone said that nova will remove all 'deleted' parameters in its API | 08:05 |
shaohe_feng | zhenguo: yes. but we need to know more about how "deleted" takes effect. | 08:06 |
zhenguo | shaohe_feng: yes, | 08:07 |
shaohe_feng | zhenguo: Oh, so we need to check it. if it removes all "deleted", then should we support it? | 08:07 |
shaohe_feng | zhenguo: does RuiChen know about this "deleted" field in the DB? | 08:08 |
zhenguo | shaohe_feng: I am also worried about that, maybe you can ask Alex about it | 08:08 |
shaohe_feng | zhenguo: OK. if it removes all "deleted" parameters in the API, does it also remove the field in the DB? | 08:08 |
zhenguo | shaohe_feng: not sure | 08:09 |
*** kevinz has quit IRC | 08:13 | |
* zhenguo brb | 08:18 | |
*** yuntongjin has joined #openstack-nimble | 08:38 | |
zhenguo | shaohe_feng, liusheng: I found an issue when using node_cache: if many create requests are received, it seems they will all be scheduled to the same node :( | 08:51 |
zhenguo | we should add a threading lock around access to the node_cache, and after scheduling remove the selected node from the cache. | 08:57 |
liusheng | zhenguo: maybe we will support multiple workers? | 08:58 |
zhenguo | liusheng: you mean nimble-engine? | 08:59 |
liusheng | zhenguo: yes | 08:59 |
zhenguo | liusheng: yes, but not sure whether we need to support active-active mode | 08:59 |
zhenguo | liusheng: maybe just active and standby? | 08:59 |
liusheng | zhenguo: most other openstack services run multiple workers in a-a mode, right? | 09:01 |
zhenguo | liusheng: yes | 09:01 |
liusheng | zhenguo: if we support multiple workers in a-a mode with node_cache, we may need an external lock mechanism | 09:02 |
zhenguo | liusheng: I think we can use one worker for now, as I think the biggest baremetal cluster is about 4000 nodes | 09:03 |
zhenguo | liusheng: one worker should be enough to handle that | 09:04 |
liusheng | zhenguo: hah, hope so | 09:04 |
zhenguo | liusheng: hah, if there are more, we can change it in the future | 09:04 |
zhenguo | liusheng: but if there is more than one worker, I think we can't use the cache but should store the nodes in the DB | 09:05 |
liusheng | zhenguo: ok, yes, the DB naturally supports a lock mechanism | 09:06 |
zhenguo | liusheng: yes | 09:06 |
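
The single-worker fix mentioned above (take a threading lock around the node cache and remove the selected node once it is scheduled) might look something like this minimal sketch. `NodeCache`, `claim`, `release` and the node names are hypothetical and not taken from nimble's code; as noted in the discussion, an in-process lock like this does not help once there are several workers, where an external lock or the DB would be needed.

```python
import threading


class NodeCache:
    """Hypothetical in-memory cache of schedulable baremetal nodes."""

    def __init__(self, nodes):
        self._lock = threading.Lock()
        self._available = set(nodes)

    def claim(self):
        """Atomically pick one node and remove it; None if none are left."""
        with self._lock:
            if not self._available:
                return None
            return self._available.pop()

    def release(self, node):
        """Put a node back, e.g. after a failed or reverted create."""
        with self._lock:
            self._available.add(node)


cache = NodeCache(["node-1", "node-2", "node-3"])
claimed = []
claimed_lock = threading.Lock()


def handle_create_request():
    node = cache.claim()
    with claimed_lock:
        claimed.append(node)


# Five concurrent create requests compete for three nodes.
threads = [threading.Thread(target=handle_create_request) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

picked = [n for n in claimed if n is not None]
print(sorted(picked), claimed.count(None))
```

Because selection and removal happen under the same lock, no two requests can end up with the same node; the two requests that find the cache empty get None and would have to fail or retry.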
*** yuntongjin has quit IRC | 09:19 | |
*** Kevin_Zheng has quit IRC | 10:47 | |
*** RuiChen has quit IRC | 11:06 | |
openstackgerrit | Zhenguo Niu proposed openstack/nimble: [WIP] Add create instance taskflow https://review.openstack.org/403555 | 12:01 |
*** liusheng has quit IRC | 12:17 | |
openstackgerrit | Zhenguo Niu proposed openstack/nimble: Add create instance taskflow https://review.openstack.org/403555 | 12:47 |
-openstackstatus- NOTICE: Launchpad SSO is not currently working, so logins to our services like review.openstack.org and wiki.openstack.org are failing; the admins at Canonical are looking into the issue but there is no estimated time for a fix yet. | 16:24 | |
*** ChanServ changes topic to "Launchpad SSO is not currently working, so logins to our services like review.openstack.org and wiki.openstack.org are failing; the admins at Canonical are looking into the issue but there is no estimated time for a fix yet." | 16:24 | |
*** ChanServ changes topic to "Bugs: bugs.launchpad.net/nimble | Review: https://review.openstack.org/#/q/project:openstack/nimble,n,z" | 17:01 | |
-openstackstatus- NOTICE: Canonical admins have resolved the issue with login.launchpad.net, so authentication should be restored now. | 17:01 |