*** matsuhashi has joined #openstack-ironic | 00:18 | |
* NobodyCam wanders away... | 00:23 | |
*** hemna has quit IRC | 00:23 | |
*** hemna has joined #openstack-ironic | 00:27 | |
*** urulama has quit IRC | 00:45 | |
*** urulama has joined #openstack-ironic | 00:46 | |
*** urulama_ has joined #openstack-ironic | 00:49 | |
*** urulama has quit IRC | 00:50 | |
*** hemna has quit IRC | 00:51 | |
*** ctracey1 has quit IRC | 00:57 | |
*** urulama_ has quit IRC | 01:03 | |
*** hemna has joined #openstack-ironic | 01:05 | |
*** urulama has joined #openstack-ironic | 01:09 | |
*** nosnos has joined #openstack-ironic | 01:13 | |
*** nosnos has quit IRC | 01:13 | |
*** nosnos has joined #openstack-ironic | 01:14 | |
*** rloo has quit IRC | 01:18 | |
*** urulama has quit IRC | 01:19 | |
*** urulama has joined #openstack-ironic | 01:20 | |
*** urulama has quit IRC | 01:28 | |
*** urulama has joined #openstack-ironic | 01:28 | |
*** ctracey1 has joined #openstack-ironic | 01:34 | |
*** hemna has quit IRC | 01:34 | |
*** itooon has quit IRC | 01:38 | |
*** urulama_ has joined #openstack-ironic | 01:38 | |
*** urulama has quit IRC | 01:39 | |
*** urulama_ has quit IRC | 01:43 | |
*** ctracey1 has quit IRC | 01:43 | |
*** urulama has joined #openstack-ironic | 01:48 | |
*** itooon has joined #openstack-ironic | 01:52 | |
*** urulama has quit IRC | 01:57 | |
*** urulama has joined #openstack-ironic | 01:58 | |
*** urulama has quit IRC | 02:08 | |
*** urulama has joined #openstack-ironic | 02:08 | |
*** urulama has quit IRC | 02:12 | |
*** urulama has joined #openstack-ironic | 02:18 | |
*** urulama_ has joined #openstack-ironic | 02:27 | |
*** nosnos_ has joined #openstack-ironic | 02:27 | |
*** urulama has quit IRC | 02:28 | |
*** nosnos has quit IRC | 02:30 | |
*** urulama_ has quit IRC | 02:37 | |
*** urulama has joined #openstack-ironic | 02:38 | |
*** urulama has quit IRC | 02:42 | |
*** urulama has joined #openstack-ironic | 02:47 | |
*** urulama_ has joined #openstack-ironic | 02:57 | |
*** urulama has quit IRC | 02:57 | |
*** matsuhashi has quit IRC | 03:02 | |
*** matsuhashi has joined #openstack-ironic | 03:03 | |
*** matsuhashi has quit IRC | 03:03 | |
*** urulama_ has quit IRC | 03:07 | |
*** urulama has joined #openstack-ironic | 03:07 | |
*** urulama has quit IRC | 03:12 | |
openstackgerrit | Haomeng,Wang proposed a change to openstack/ironic: ipmitool SHOULD accept empty username/password https://review.openstack.org/54886 | 03:13 |
*** matsuhashi has joined #openstack-ironic | 03:13 | |
openstackgerrit | A change was merged to openstack/python-ironicclient: Add driver-list https://review.openstack.org/53683 | 03:14 |
openstackgerrit | A change was merged to openstack/python-ironicclient: Fix cmd usage msg for ironic port-create https://review.openstack.org/55898 | 03:16 |
openstackgerrit | A change was merged to openstack/python-ironicclient: Remove Python 2.4 all() implementation https://review.openstack.org/56051 | 03:16 |
*** urulama has joined #openstack-ironic | 03:16 | |
sandeepr | NobodyCam, you around? | 03:18 |
sandeepr | !ping lifeless | 03:19 |
openstack | pong | 03:19 |
lifeless | sandeepr: ? | 03:20 |
sandeepr | lifeless, from the scale tests i did on bm provisioning, i had found a bug https://bugs.launchpad.net/nova/+bug/1226170 | 03:24 |
lifeless | just one ? | 03:25 |
sandeepr | your comment "first recommendation is to turn off file injection" | 03:25 |
sandeepr | one out of the many few i guess | 03:25 |
lifeless | yes, absolutely. I have to go pick up daughter from kindy etc; I'll be back here in about 2.5 hours | 03:25 |
lifeless | sorry ;) | 03:26 |
sandeepr | ok, | 03:26 |
sandeepr | i'll ping you around that time | 03:26 |
sandeepr | thanks | 03:26 |
*** urulama_ has joined #openstack-ironic | 03:26 | |
*** urulama has quit IRC | 03:26 | |
*** urulama_ has quit IRC | 03:36 | |
*** urulama has joined #openstack-ironic | 03:36 | |
*** urulama has quit IRC | 03:41 | |
*** urulama has joined #openstack-ironic | 03:43 | |
*** matsuhashi has quit IRC | 03:45 | |
*** matsuhashi has joined #openstack-ironic | 03:46 | |
*** matsuhas_ has joined #openstack-ironic | 03:49 | |
*** matsuhashi has quit IRC | 03:50 | |
*** urulama has quit IRC | 04:36 | |
*** urulama has joined #openstack-ironic | 04:36 | |
*** nosnos_ has quit IRC | 04:44 | |
*** nosnos has joined #openstack-ironic | 04:44 | |
*** itooon has quit IRC | 05:03 | |
*** rameshg87 has joined #openstack-ironic | 05:48 | |
*** matsuhas_ has quit IRC | 05:58 | |
rameshg87 | Hi | 05:58 |
rameshg87 | i had a question regarding ironic compute node and its database | 05:59 |
Haomeng | rameshg, good morning | 05:59 |
Haomeng | !ping lifeless | 05:59 |
openstack | pong | 05:59 |
rameshg87 | good morning, Haomeng | 06:00 |
Haomeng | :) | 06:00 |
*** blamar has quit IRC | 06:00 | |
*** matsuhashi has joined #openstack-ironic | 06:00 | |
*** blamar has joined #openstack-ironic | 06:00 | |
openstackgerrit | Jenkins proposed a change to openstack/ironic: Imported Translations from Transifex https://review.openstack.org/55967 | 06:01 |
*** matsuhashi has quit IRC | 06:01 | |
Haomeng | feel free, any questions are welcome | 06:01 |
rameshg87 | when ironic is ready, a nova compute node which has ironic-baremetal driver will be running nova-compute and will be contacting ironic to get the jobs done, right ? | 06:01 |
*** matsuhashi has joined #openstack-ironic | 06:02 | |
Haomeng | yes, our Ironic driver for nova compute is under construction | 06:02 |
Haomeng | for now it is not ready to use | 06:02 |
lifeless | Haomeng: ? | 06:02 |
lifeless | Haomeng: when you do !ping I read that as 'not ping' | 06:03 |
Haomeng | lifeless: I have a question, can you help about https://review.openstack.org/#/c/55231/5 | 06:03 |
Haomeng | yes, but irc response is "<openstack> pong" | 06:03 |
Haomeng | :) | 06:04 |
rameshg87 | so, there can be more than one nova-compute node running this ironic in order to achieve some load-balancing as well, right ? | 06:04 |
Haomeng | I think the "!" is indicator for IRC internal commands "!ping" can probe the connection with your IRC client | 06:04 |
Haomeng | rameshg: great idea, we have such plan | 06:05 |
*** ctracey has quit IRC | 06:05 | |
lifeless | Haomeng: it's a command to the openstack irc bot, nothing to do with any person | 06:05 |
rameshg87 | when we run multiple instances of ironic, each ironic instance will have its own DB also, right ? | 06:05 |
Haomeng | rameshg87: this is our IRONIC TODO/UnderConstruction Actions, we can find the "HA / failover for ironic-conductor" in " Do these awesome things later" list | 06:06 |
lifeless | Haomeng: you may be thinking of CTCP which can do irc client to client commands | 06:06 |
Haomeng | ok, got | 06:07 |
Haomeng | :) | 06:07 |
*** yanhcdl has joined #openstack-ironic | 06:08 | |
Haomeng | lifeless: for https://review.openstack.org/#/c/55231/5, Sergey suggest to remove the behavior which dbapi can filter port with MacAddress as input | 06:08 |
Haomeng | my fix is just add condition for GET/UPDATE/DELETE api call | 06:08 |
Haomeng | I just want to know how do you think for such issue | 06:08 |
lifeless | Haomeng: so it can be useful to search for nodes by MacAddress | 06:09 |
Haomeng | I searched our code, some codes will call db api to get port with address as input | 06:09 |
lifeless | Haomeng: would that use the dbapi feature as well ? | 06:09 |
Haomeng | so I think for our API, we can export with UUID as input, but for internal, maybe the macaddress is required to filter, how do you think? | 06:09 |
Haomeng | let me show you current dbapi.get_node code | 06:12 |
lifeless | Haomeng: what code passes mac addresses to get_port ? | 06:13 |
Haomeng | let me check | 06:13 |
Haomeng | mostly it should be unittest code I think | 06:14 |
lifeless | ./api/controllers/v1/port.py _check_address does | 06:14 |
lifeless | but thats buggy, it's doing read-then-write | 06:14 |
lifeless | it should have a unique index and do write-then-catch instead. | 06:14 |
Haomeng | yes | 06:14 |
lifeless | and test code | 06:15 |
lifeless | so | 06:15 |
lifeless | I think sergey is right | 06:15 |
Haomeng | so can we modify this https://github.com/openstack/ironic/blob/master/ironic/db/sqlalchemy/api.py#L278, to remove the macaddress as filter supporting | 06:15 |
Haomeng | yes | 06:15 |
lifeless | but we need to fix ./api/controllers/v1/port.py _check_address first. | 06:15 |
Haomeng | I just discuss with you about the solution | 06:15 |
Haomeng | ok | 06:15 |
Haomeng | should we fix this on port object level, or db level | 06:16 |
Haomeng | current, my fix is on API level, to reject macaddress | 06:16 |
Haomeng | that is not 'root' I think, agree with sergey | 06:16 |
Haomeng | our _check_address is called when port is creating, for new port, it will input with portaddress | 06:18 |
Haomeng | so I think this logic is required for us | 06:19 |
Haomeng | so here, we have two keys, one is uuid, that is the physical key in db | 06:19 |
Haomeng | another one is macaddress, that is the layer-2 networking id | 06:19 |
Haomeng | so can I create new method to cover get_node_by_address | 06:20 |
Haomeng | should be get_port_by_macaddress | 06:20 |
Haomeng | rameshg87: I think we have single db for Ironic, not very sure:) | 06:21 |
rameshg87 | great, thanks .. | 06:21 |
rameshg87 | i am just reading through the blueprints regarding this | 06:22 |
rameshg87 | http://summit.openstack.org/cfp/details/112 | 06:22 |
rameshg87 | since you were in the middle of discussion, i didn't want to interrupt :-) | 06:22 |
Haomeng | welcome | 06:22 |
Haomeng | I am not in the discussion:) welcome you | 06:22 |
Haomeng | lifeless: so in summary, I want to create new dbapi dbapi.get_port_by_address to filter with address only, and modify the existing one dbapi.get_port to dbapi.get_port_by_uuid, and remove address filter from dbapi.get_port | 06:28 |
lifeless | Haomeng: I don't think get_port_by_address is needed; what might be needed is get_node_by_port_address | 06:29 |
lifeless | the rest sounds fine | 06:30 |
lifeless | Haomeng: _check_address should not try to lookup the port | 06:30 |
Haomeng | ok, but current _check_address will call dbapi.get_port with macaddress | 06:30 |
lifeless | Haomeng: it should just create the port and catch constraint violation errors | 06:31 |
lifeless | Haomeng: the current _check_address code is racy - read-then-write is an antipattern | 06:31 |
Haomeng | yes, no need to check, because we can get dbexception, yes, I have another defect to fix this issue - https://review.openstack.org/#/c/54537/ | 06:32 |
Haomeng | it changes Port create API to EAFP | 06:33 |
lifeless | cool | 06:33 |
lifeless | so once thats done, _check_address doesn't need to lookup the port | 06:33 |
lifeless | and looking up port by address is just unneeded. | 06:34 |
Haomeng | yes | 06:34 |
Haomeng | that is clear now:) | 06:34 |
Haomeng | thank you lifeless | 06:34 |
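The write-then-catch (EAFP) approach discussed above can be sketched as follows. This is a minimal illustration using Python's standard `sqlite3` module, not Ironic's actual SQLAlchemy code; the table layout, `make_db`, `create_port`, and `PortAlreadyExists` are hypothetical names chosen for the example.

```python
import sqlite3


class PortAlreadyExists(Exception):
    """Hypothetical error raised when an insert violates a uniqueness constraint."""


def make_db():
    # In-memory database with a UNIQUE constraint on the MAC address,
    # so the database itself rejects duplicates atomically.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ports (uuid TEXT PRIMARY KEY, address TEXT UNIQUE NOT NULL)"
    )
    return conn


def create_port(conn, uuid, address):
    # Write first, then catch the constraint violation, instead of
    # looking the address up before inserting (read-then-write is racy).
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO ports (uuid, address) VALUES (?, ?)", (uuid, address)
            )
    except sqlite3.IntegrityError as exc:
        raise PortAlreadyExists(address) from exc
```

With the unique index in place, two concurrent creators cannot both succeed, which is the property a lookup followed by an insert cannot guarantee.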
Haomeng | appreciate your supporting for my patches:) | 06:35 |
Haomeng | I am new guy for Ironic | 06:35 |
Haomeng | will try my best to understand current code and do more contributions | 06:36 |
Haomeng | lifeless: one more question | 06:37 |
Haomeng | can I fix this port address issue with one single patch? because we have dependency with another one https://review.openstack.org/#/c/54537/ | 06:37 |
lifeless | make your patch depend on 54537 | 06:38 |
lifeless | and then it can be simple yes | 06:38 |
Haomeng | ok | 06:38 |
Haomeng | what is action to set dependency with our codereview sys | 06:39 |
Haomeng | i see the Dependencies section | 06:39 |
Haomeng | but that looks like readonly | 06:39 |
lifeless | if you have two patches in git in a row | 06:40 |
lifeless | then do git review | 06:40 |
lifeless | it will create a dep on the higher patch on the lower patch | 06:41 |
lifeless | so just rebase your patches from separate branches to one branch with two patches | 06:41 |
Haomeng | ok | 06:42 |
Haomeng | thank you lifeless, but the other one has some unittest issue now, have to fix that one first:) | 06:42 |
sandeepr | !ping lifeless | 06:44 |
openstack | pong | 06:44 |
lifeless | sandeepr: the ! is for the bot, it doesn't actually get me. | 06:45 |
sandeepr | oh ok | 06:46 |
*** sjing has joined #openstack-ironic | 06:46 | |
sandeepr | i was asking on that bug "first recommendation is to turn off file injection" - how can file injection be turned off? | 06:47 |
lifeless | sandeepr: nova.conf | 06:47 |
lifeless | # If True, enable file injection for network info, files and | 06:47 |
lifeless | # admin password (boolean value) | 06:47 |
lifeless | #use_file_injection=true | 06:47 |
lifeless | sandeepr: there is a Heat value for it in TripleO too. | 06:47 |
sandeepr | what does this file injection actually do? | 06:50 |
*** matsuhashi has quit IRC | 06:53 | |
sandeepr | isn't that the cloud-init stuff? | 06:53 |
*** matsuhashi has joined #openstack-ironic | 06:53 | |
lifeless | sandeepr: nova mounts the disk image, rewrites arbitrary files within it, then unmounts it, then finally actually sends it to the node | 06:56 |
*** matsuhashi has quit IRC | 06:57 | |
sandeepr | when you say finally actually sends it to the node, do you mean the arbitrary files? | 07:01 |
lifeless | the disk image | 07:04 |
*** matsuhashi has joined #openstack-ironic | 07:06 | |
sandeepr | wow you mean the qcow2? by turning off file_injection what will we lose? | 07:11 |
*** nosnos_ has joined #openstack-ironic | 07:11 | |
*** nosnos has quit IRC | 07:14 | |
lifeless | if you're dependent on it for setting up /etc/network/interfaces, you'll break :) | 07:18 |
lifeless | sandeepr: anyhow, point is - we have a bunch of optimisations queued up | 07:18 |
lifeless | sandeepr: the dd thing being single threaded is something we might want to revisit, but too much concurrency will push the sum(average_time_to_live) up, but slow disks will be slower than the network - most single disks are around 1Gbps; and DC networks get up to 10Gbps | 07:19 |
GheRivero | morning all | 07:23 |
sandeepr | what is the 1Gbps on the disks? | 07:26 |
sandeepr | morning GheRivero | 07:30 |
*** urulama has quit IRC | 07:31 | |
lifeless | sandeepr: bandwidth to a spinning disk - 150MBps for linear I/O last I checked. | 07:31 |
*** nosnos_ has quit IRC | 07:31 | |
lifeless | sandeepr: which is 1.2Gbps | 07:31 |
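The arithmetic behind the 150MBps to 1.2Gbps figure is a bytes-to-bits conversion. A small sketch, assuming decimal units (1 Gb = 1000 Mb) as is usual for link speeds; the function name is made up for the example.

```python
def mbytes_per_s_to_gbits_per_s(mbytes_per_s):
    # 1 byte = 8 bits; 1 Gb = 1000 Mb (decimal units).
    return mbytes_per_s * 8 / 1000


disk_gbps = mbytes_per_s_to_gbits_per_s(150)
print(disk_gbps)  # 1.2
```

So a single spinning disk at that rate already exceeds a 1Gbps link, but sits well below a 10Gbps datacentre network.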
*** urulama has joined #openstack-ironic | 07:31 | |
anteaya | morning GheRivero | 07:31 |
*** nosnos has joined #openstack-ironic | 07:32 | |
anteaya | general call to ironic, neutron is experiencing issues with bringing up 150 vms: https://bugs.launchpad.net/neutron/+bug/1250168 | 07:33 |
anteaya | the communication traffic is too high and timeouts are preventing action | 07:33 |
anteaya | in comment #17 there is an idea proposed to move network allocation from the compute to api node and some acknowledgement of bare-metal | 07:34 |
lifeless | it's not ironic specific | 07:35 |
lifeless | I will point that out | 07:35 |
anteaya | wanted to let you know in the hopes someone or more than one has interest in following the conversation | 07:35 |
anteaya | lifeless: agreed | 07:35 |
anteaya | and participating in cross-project conversation if they can | 07:35 |
anteaya | just trying to tap as many wells as I can to facilitate a stable fix | 07:35 |
anteaya | I am willing to knock on closed doors to do so | 07:36 |
*** urulama has quit IRC | 07:36 | |
sandeepr | lifeless, the optimisations - is there a list which you guys have planned? | 07:40 |
lifeless | not really. | 07:41 |
lifeless | We have some cards open, and some bugs, here and there. | 07:41 |
lifeless | The problem is that we have lots of possible optimisations, but no @ scale test harness at the moment | 07:41 |
lifeless | sandeepr: so it requires lots of care | 07:42 |
*** sjing has quit IRC | 07:46 | |
sandeepr | lifeless, ok. | 07:55 |
sandeepr | lifeless, how does iscsi work - what is the request sent by nova to indicate the iscsi info to which the os needs to be installed? | 07:55 |
lifeless | sandeepr: EPARSE: I don't understand what you're actually asking. iscsi is an IETF standard - you can read the RFC for it to understand how it works. | 07:56 |
*** urulama has joined #openstack-ironic | 08:02 | |
sandeepr | lifeless, hmmm, during provisioning there is "expose disks via iscsi" | 08:05 |
sandeepr | deploy ramdisk exposes the nodes local disk via iscsi | 08:07 |
sandeepr | so how does nova know the iscsi info? | 08:08 |
sandeepr | so the image can be deployed | 08:08 |
*** jistr has joined #openstack-ironic | 08:18 | |
*** jistr is now known as jistr|mtg | 08:19 | |
lifeless | sandeepr: it pings deploy-helper | 08:26 |
openstackgerrit | Jenkins proposed a change to openstack/ironic: Updated from global requirements https://review.openstack.org/55854 | 08:26 |
*** Krast has joined #openstack-ironic | 08:32 | |
sandeepr | lifeless, what will be the response of the deploy-helper? will it send the iqn #? | 08:34 |
lifeless | sandeepr: I suggest you read the code; it will be faster and more comprehensive | 08:34 |
sandeepr | https://github.com/openstack/nova/blob/master/nova/virt/baremetal/volume_driver.py - this one? | 08:35 |
lifeless | baremetal_deploy_helper.py | 08:35 |
lifeless | and pxe.py | 08:35 |
Haomeng | lifeless: I found baremetal_deploy_helper.py is removed from Ironic, right? | 08:36 |
Haomeng | baremetal_deploy_helper.py is from our Nova bm | 08:36 |
*** matsuhashi has quit IRC | 08:36 | |
Haomeng | https://github.com/openstack/nova/blob/master/nova/cmd/baremetal_deploy_helper.py | 08:37 |
lifeless | Haomeng: sandeepr filed a bug about nova-bm | 08:37 |
*** matsuhashi has joined #openstack-ironic | 08:37 | |
Haomeng | ok, got | 08:37 |
lifeless | Haomeng: so thats what we're discussing :) | 08:37 |
lifeless | yes, in Ironic the deploy-helper is replaced by the Ironic API | 08:37 |
Haomeng | yes | 08:37 |
Haomeng | it is a service | 08:38 |
sandeepr | thanks Haomeng for the link | 08:39 |
sandeepr | lifeless, i'll go through them | 08:39 |
Haomeng | welcome:) | 08:39 |
sandeepr | lifeless, the test i did was on bulk deployment w/ diff images and flavor | 08:40 |
sandeepr | may i ask if you have any thought of other possible scenario i can test for the perf/scale? | 08:41 |
lifeless | most large deployments will have lots of identical images | 08:41 |
*** romcheg has joined #openstack-ironic | 08:49 | |
Haomeng | night | 08:59 |
*** matsuhashi has quit IRC | 09:08 | |
*** matsuhashi has joined #openstack-ironic | 09:11 | |
openstackgerrit | Haomeng,Wang proposed a change to openstack/ironic: Supporting both Python 2 and Python 3 with six https://review.openstack.org/56169 | 09:11 |
openstackgerrit | Haomeng,Wang proposed a change to openstack/ironic: ipmitool SHOULD accept empty username/password https://review.openstack.org/54886 | 09:15 |
Haomeng | lifeless: are you here, looks like you sleep very late:) | 09:21 |
*** lucasagomes has joined #openstack-ironic | 09:24 | |
*** derekh has joined #openstack-ironic | 09:28 | |
*** nosnos_ has joined #openstack-ironic | 09:33 | |
*** nosnos has quit IRC | 09:37 | |
*** matsuhashi has quit IRC | 09:45 | |
*** matsuhashi has joined #openstack-ironic | 09:46 | |
*** matsuhashi has quit IRC | 09:51 | |
*** matsuhashi has joined #openstack-ironic | 09:51 | |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: Required fields on nodes https://review.openstack.org/53664 | 10:00 |
*** Krast has quit IRC | 10:07 | |
*** Krast has joined #openstack-ironic | 10:07 | |
*** Krast has quit IRC | 10:07 | |
openstackgerrit | A change was merged to openstack/ironic: Pass Ironic API url to deploy ramdisk in PXE driver https://review.openstack.org/55302 | 10:19 |
*** rameshg87 has quit IRC | 10:23 | |
*** matsuhashi has quit IRC | 10:36 | |
*** matsuhashi has joined #openstack-ironic | 10:36 | |
*** matsuhashi has quit IRC | 10:41 | |
*** ctracey has joined #openstack-ironic | 10:43 | |
*** matsuhashi has joined #openstack-ironic | 10:49 | |
openstackgerrit | Haomeng,Wang proposed a change to openstack/ironic: Supporting both Python 2 and Python 3 with six https://review.openstack.org/56169 | 10:49 |
*** prekarat has joined #openstack-ironic | 10:51 | |
*** matsuhashi has quit IRC | 10:56 | |
*** matsuhashi has joined #openstack-ironic | 10:57 | |
*** matsuhas_ has joined #openstack-ironic | 11:00 | |
*** matsuhashi has quit IRC | 11:00 | |
*** romcheg1 has joined #openstack-ironic | 11:07 | |
*** romcheg has quit IRC | 11:11 | |
*** nosnos_ has quit IRC | 11:52 | |
*** nosnos has joined #openstack-ironic | 11:53 | |
*** nosnos has quit IRC | 11:53 | |
*** nosnos has joined #openstack-ironic | 11:54 | |
*** matsuhas_ has quit IRC | 11:58 | |
*** nosnos has quit IRC | 11:58 | |
*** matsuhashi has joined #openstack-ironic | 11:59 | |
*** nosnos has joined #openstack-ironic | 11:59 | |
*** jistr|mtg is now known as jistr | 12:02 | |
*** ndipanov_gone is now known as ndipanov | 12:03 | |
*** matsuhashi has quit IRC | 12:03 | |
*** nosnos has quit IRC | 12:04 | |
*** prekarat has quit IRC | 12:07 | |
*** jistr is now known as jistr|eng | 12:50 | |
*** lucasagomes is now known as lucas-hungry | 12:59 | |
*** urulama has quit IRC | 13:20 | |
*** jdob has joined #openstack-ironic | 13:29 | |
*** linggao has joined #openstack-ironic | 13:41 | |
*** rloo has joined #openstack-ironic | 13:45 | |
*** urulama has joined #openstack-ironic | 13:45 | |
*** jdob has quit IRC | 13:53 | |
*** jdob has joined #openstack-ironic | 13:53 | |
openstackgerrit | Yuriy Zveryanskyy proposed a change to openstack/ironic: Add power control to PXE driver https://review.openstack.org/50409 | 13:58 |
*** jbjohnso has joined #openstack-ironic | 14:00 | |
*** yuriyz has joined #openstack-ironic | 14:03 | |
*** ben_duyujie has joined #openstack-ironic | 14:05 | |
*** ndipanov_ has joined #openstack-ironic | 14:05 | |
*** ndipanov has quit IRC | 14:06 | |
*** lucas-hungry is now known as lucasagomes | 14:10 | |
linggao | morning rloo, lucasagomes, | 14:14 |
openstackgerrit | Yuriy Zveryanskyy proposed a change to openstack/ironic: Add power control to PXE driver https://review.openstack.org/50409 | 14:17 |
yuriyz | Morning all | 14:19 |
openstackgerrit | linggao proposed a change to openstack/ironic: Supports get node by instance uuid in API https://review.openstack.org/53262 | 14:19 |
lucasagomes | linggao, yuriyz hey morning | 14:21 |
GheRivero | morning all | 14:21 |
*** romcheg has joined #openstack-ironic | 14:22 | |
linggao | morning GheRivero yuriyz | 14:22 |
*** ndipanov_ is now known as ndipanov | 14:24 | |
linggao | lucasagomes, how do you like this: /v1/nodes/?associated=1 gives the following error: Invalid parameter value: 1, 'associated' can only be True/true or False/false. | 14:25 |
*** romcheg1 has quit IRC | 14:26 | |
linggao | I changed 'True or False' to 'True/true or False/false' because we use .lower() for the next link. | 14:26 |
lucasagomes | linggao, hmm I would change that, as it's a URL we should always go with lower case | 14:27 |
lucasagomes | so imo 'true or false' would look better | 14:27 |
linggao | lucasagomes, I agree with you. I'll make the change now. | 14:28 |
linggao | that error message looks overloaded with True/true or False/false. | 14:29 |
lucasagomes | hehe yea indeed | 14:30 |
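The validation being discussed could look roughly like the sketch below. It is an illustration only, not the code in the review: `parse_associated` is a hypothetical name, and the error text follows the lower-case wording agreed above.

```python
def parse_associated(value):
    """Hypothetical parser: accept only 'true' or 'false', in any case."""
    normalized = value.lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    # Anything else, including '1' and '0', is rejected.
    raise ValueError(
        "Invalid parameter value: %s, 'associated' can only be true or false."
        % value
    )
```

Lower-casing the input first means 'True' and 'true' are both accepted while the message only has to mention one spelling.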
openstackgerrit | linggao proposed a change to openstack/ironic: Supports get node by instance uuid in API https://review.openstack.org/53262 | 14:31 |
rloo | morning linggao | 14:34 |
romcheg | Morning everyone | 14:34 |
*** jistr|eng is now known as jistr | 14:34 | |
rloo | afternoon romcheg, lucasagomes, yuriyz | 14:35 |
lucasagomes | rloo, hey ya :) | 14:35 |
rloo | linggao - was wondering why 1/0 wasn't accepted for associated. | 14:35 |
max_lobur | g'morning Everyone | 14:38 |
linggao | rloo, that was the result of the discussion here a couple of weeks ago. | 14:39 |
GheRivero | one silly question. How are you people testing the API/client? I have a hand made environment, but fearing the day I have to recreate it | 14:39 |
rloo | linggao: I wondered about that. I think nova allows 1/0 so it seemed odd, but. wrt true/false, True/False, I would have thought there was a convention already but maybe not. | 14:40 |
rloo | GheRivero: c'mon. Be adventurous! :-) | 14:41 |
rloo | GheRivero: I've been using tripleo/ironic to test. I think devstack works too? | 14:41 |
romcheg | GheRivero: I have created several tempest tests for Ironic API | 14:43 |
romcheg | They are on their road to master | 14:43 |
romcheg | https://review.openstack.org/#/c/48109/ | 14:44 |
linggao | GheRivero, I use 2 ways. 1. unit test cases. 2. devstack to install then use browser to test. | 14:44 |
yuriyz | GheRivero, https://wiki.openstack.org/wiki/Ironic#Try_it_on_Devstack | 14:48 |
GheRivero | thanks all, i will take a look to all those options | 14:48 |
linggao | rloo, do you want to bring it up again? we have had a very lengthy discussion/debate on external url and CLI for the association. | 14:50 |
rloo | linggao: no, I trust that the experts know what they are talking about :-) | 14:50 |
linggao | rloo, I am in the same boat as you. I just make the plumbing work, letting the experts worry about the externals. | 14:51 |
rloo | linggao: I think what would be useful is to document these things, but I'm not sure where. I can only guess at what things are doing/meant to do, by looking at the code | 14:51 |
rloo | linggao: eg, if we've decided that for boolean parameters, we're only allowing True/False/true/false, we have to remember to be consistent throughout. | 14:52 |
linggao | rloo, yes. one reviewer mentioned about the doc, but we found out that the API doc is way out of sync, so I think lucasagomes opend a bug for the doc overall. | 14:53 |
linggao | open/opened | 14:53 |
lucasagomes | yup there's an open bug there | 14:54 |
rloo | linggao: yes, at least no one will be bored with nothing to do :-) | 14:54 |
lucasagomes | but also, soon we are going to change the way we document the API, it will be auto generated | 14:54 |
linggao | lol | 14:54 |
linggao | lucasagomes, that would be great! CLI does it today. | 14:55 |
linggao | I mean CLI does the on line help today. | 14:55 |
lucasagomes | yea :) | 14:56 |
lucasagomes | I start working on it soon | 14:57 |
rloo | hi yuriyz, do you have a few minutes? | 15:04 |
yuriyz | yes | 15:05 |
rloo | about your comments for https://review.openstack.org/#/c/54466 | 15:05 |
rloo | so I didn't know about the task_manager stuff. It seems then, that if you want to get info about the node, you should use the task's node, not the 'node_obj' passed to the function? | 15:06 |
yuriyz | 3 min .. | 15:07 |
rloo | eg, if I look through the code in conductor/manager.py, within the 'with task_manager.acquire() as task:, I see references to node_obj. That may not be quite correct? | 15:08 |
yuriyz | rloo, look at https://github.com/openstack/ironic/blob/master/ironic/conductor/resource_manager.py#L55 | 15:10 |
yuriyz | node manager get a node from db | 15:11 |
rloo | yuriyz, and that is done as part of the acquire, so the node info would not get changed, right? | 15:11 |
yuriyz | rloo, if we have a concurrent power task running, the exclusive lock is already in use | 15:12 |
rloo | ok, so nothing else would have changed the node info. Good. So I think the code in manager.py ought to be changed to use the task's node, rather than the node_obj passed into the functions. | 15:13 |
rloo | Thx for pointing it out. One day, I hope to be as good as you yuriyz! | 15:14 |
yuriyz | rloo, node info may be changed, but not target_power_state | 15:14 |
rloo | yuriyz, how would node info be changed outside of the task, since it has an exclusive lock? | 15:15 |
yuriyz | because target_power_state != None only inside task manager context (and node is locked) | 15:15 |
rloo | yuriyz: let me make sure I understand you. The code I put to check for if node_obj['target_power_state'] is not None, is not needed cuz it will never happen, right? | 15:17 |
rloo | yuriyz: and if I used task.node instead of node_obj, I wouldn't have had to do a node_obj.refresh() | 15:18 |
yuriyz | yes, if target_power_state is set by concurrent task we get exception 'node locked' | 15:19 |
yuriyz | but I am not sure that another race condition is not possible | 15:19 |
yuriyz | rloo, and let Devananda take a look :) | 15:22 |
rloo | yuriyz: yes, I hope he'll look at it :-) At least, we've fixed one race condition... | 15:23 |
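The locking behaviour described above (a concurrent task fails with 'node locked', and `target_power_state` is only non-None while the lock is held) can be sketched with an in-process lock. This is a stand-in under stated assumptions, not Ironic's task_manager: `Node`, `acquire`, `NodeLocked`, and `change_power_state` are hypothetical names for the example.

```python
import contextlib
import threading


class NodeLocked(Exception):
    """Hypothetical error raised when another task holds the node."""


class Node:
    def __init__(self, uuid):
        self.uuid = uuid
        self.target_power_state = None
        self._lock = threading.Lock()


@contextlib.contextmanager
def acquire(node):
    # Non-blocking acquire: a concurrent task fails fast instead of waiting.
    if not node._lock.acquire(blocking=False):
        raise NodeLocked(node.uuid)
    try:
        yield node
    finally:
        node._lock.release()


def change_power_state(node, new_state):
    with acquire(node) as locked_node:
        # target_power_state is only non-None while the lock is held.
        locked_node.target_power_state = new_state
        # ... the driver call would happen here ...
        locked_node.target_power_state = None
```

Working on the object yielded by `acquire` rather than a copy passed in from outside mirrors the point about using the task's node instead of `node_obj`.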
NobodyCam | Good Morning Ironic | 15:23 |
rloo | Thank you yuriyz! | 15:25 |
rloo | NobodyCam: Hello! | 15:26 |
linggao | morning NobodyCam. | 15:26 |
NobodyCam | morning rloo linggao and yuriyz :) | 15:28 |
yuriyz | Morning NobodyCam | 15:30 |
*** mdenny has joined #openstack-ironic | 15:30 | |
NobodyCam | yuriyz: did you get your hoodie? | 15:33 |
*** ndipanov has quit IRC | 15:33 | |
yuriyz | yes, i like it | 15:33 |
NobodyCam | w00t :) | 15:34 |
yuriyz | NobodyCam, have you tried the small drinks? | 15:35 |
NobodyCam | the vodka :) | 15:38 |
yuriyz | ukrainian vodka -> gorilka | 15:38 |
NobodyCam | :) | 15:39 |
*** urulama has quit IRC | 15:43 | |
NobodyCam | reboot ing... brb | 15:45 |
*** ndipanov has joined #openstack-ironic | 15:47 | |
rloo | hey lucasagomes, yt? | 15:53 |
lucasagomes | rloo, hey | 15:54 |
rloo | i think you missed a try/except in 53664 | 15:54 |
lucasagomes | yt!? | 15:54 |
lucasagomes | ahh | 15:54 |
rloo | https://review.openstack.org/#/c/53664/7/ironic/api/controllers/v1/port.py,unified | 15:54 |
rloo | lucasagomes: do you see it? the last .check_required_attributes()? | 15:55 |
rloo | lucasagomes: or is it in a bigger try/except. would be nice to see all the code. | 15:56 |
lucasagomes | rloo, ops! I see | 15:56 |
lucasagomes | the ports didn't have it on patch() only nodes | 15:56 |
lucasagomes | urgh I will fix that :) | 15:56 |
rloo | lucasagomes: ok, thx. sorry about that. | 15:56 |
lucasagomes | rloo, thanks for pointing it out! | 15:57 |
rloo | lucasagomes: yw, although I feel like I'm a pain-in-the-butt. I'm going to at least try to review things faster. | 15:57 |
lucasagomes | rloo, no that's grand :) | 15:58 |
rloo | lucasagomes. You say that now. Let me know when you think otherwise :D | 15:58 |
lucasagomes | haha, seriously it's fine | 15:58 |
rloo | lucasagomes: seriously, in the future! ^^ | 16:00 |
lucasagomes | :P | 16:00 |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: Required fields on nodes https://review.openstack.org/53664 | 16:04 |
NobodyCam | brb | 16:07 |
NobodyCam | hey hey lucasagomes look at 558884 | 16:15 |
NobodyCam | gah | 16:15 |
NobodyCam | 559994 | 16:15 |
NobodyCam | output file on print functions | 16:15 |
lucasagomes | NobodyCam, yes? | 16:16 |
NobodyCam | how / where would I enable that output | 16:16 |
lucasagomes | oh it's just to make tests/debugging easier | 16:16 |
lucasagomes | I don't think it should be something configurable to the user | 16:16 |
lucasagomes | that's what I understood from lifeless comments | 16:17 |
NobodyCam | ahh :) that makes sense :) | 16:17 |
lucasagomes | because before I had to fake out the stdout | 16:17 |
lucasagomes | so having a way to redirect it when testing is way easier and clear | 16:18 |
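The idea of an output-file argument on the print helpers can be illustrated as below. This is a sketch of the pattern only; `print_list` here is a simplified, hypothetical helper, not the python-ironicclient function or its signature.

```python
import io
import sys


def print_list(rows, out=sys.stdout):
    # Hypothetical helper: write one row per line to `out`.
    for row in rows:
        print(row, file=out)


# In a test, pass a StringIO instead of faking out sys.stdout.
buf = io.StringIO()
print_list(["node-1", "node-2"], out=buf)
```

The caller-facing behaviour is unchanged because `out` defaults to standard output; only tests need to pass something else.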
NobodyCam | where are lifeless's comments. on the review I just see LGTM? | 16:18 |
lucasagomes | it was on another patch | 16:18 |
lucasagomes | the unicode one | 16:18 |
NobodyCam | ahh | 16:18 |
lucasagomes | NobodyCam, https://review.openstack.org/#/c/54942/2/ironicclient/tests/test_utils.py | 16:18 |
NobodyCam | gotch ya | 16:19 |
NobodyCam | :-p | 16:19 |
lucasagomes | :) | 16:19 |
NobodyCam | lol I just +2 +a that one too | 16:19 |
NobodyCam | lol | 16:19 |
* NobodyCam need more coffee | 16:20 | |
lucasagomes | haha thanks :D | 16:20 |
openstackgerrit | A change was merged to openstack/python-ironicclient: Custom output file on the print_*() functions https://review.openstack.org/55994 | 16:29 |
openstackgerrit | A change was merged to openstack/python-ironicclient: Deal with unicode strings https://review.openstack.org/54942 | 16:29 |
NobodyCam | :) | 16:29 |
*** kobier has quit IRC | 16:50 | |
NobodyCam | bbt ... brb | 16:59 |
*** jistr has quit IRC | 17:02 | |
*** bauzas has quit IRC | 17:03 | |
*** blamar has quit IRC | 17:06 | |
max_lobur | Hi Everyone again | 17:15 |
romcheg | Oh, it looks like I got an enlightenment! | 17:17 |
max_lobur | has anybody been working with the serialize_remote_exception, deserialize_remote_exception methods | 17:17 |
max_lobur | and all that stuff about transferring exceptions over RPC | 17:17 |
romcheg | And by enlightenment I mean that I found why CI for Ironic fails now | 17:18 |
romcheg | max_lobur: It looks like it's something in unified object model, isn't it? | 17:18 |
max_lobur | yea, it's under openstack.common | 17:19 |
max_lobur | I was trying to find a way to fix https://bugs.launchpad.net/ironic/+bug/1244747 | 17:19 |
max_lobur | and the main problem is in ironic/openstack/common/rpc/common.py | 17:20 |
max_lobur | so I assume if I want to change that I need to go to Oslo, right? | 17:20 |
romcheg | Lemme walk to you. Will be faster :) | 17:20 |
max_lobur | =) | 17:20 |
yuriyz | max, I think this is in WSME code https://github.com/stackforge/wsme/blob/master/wsme/api.py#L202 | 17:24 |
yuriyz | this code formats exceptions for REST API clients | 17:27 |
*** ben_duyujie has quit IRC | 17:30 | |
*** yuriyz has quit IRC | 17:32 | |
max_lobur | yes, that | 17:42 |
max_lobur | * that's for the wsme level | 17:42 |
max_lobur | talked with romcheg , decided to try to push some code to Oslo to make this work properly. there is a silly deserialize_remote_exception method under ironic/openstack/common/rpc/common.py | 17:44 |
max_lobur | it builds Conductor traceback into the error message itself | 17:45 |
max_lobur | so it's not so easy to get it out from there =) | 17:45 |
*** hemna has joined #openstack-ironic | 17:46 | |
*** yuriyz has joined #openstack-ironic | 17:48 | |
*** ndipanov has quit IRC | 17:51 | |
*** blamar has joined #openstack-ironic | 17:51 | |
yuriyz | max, and I see it is hardcoded in oslo RPC code https://github.com/openstack/oslo-incubator/blob/master/openstack/common/rpc/common.py#L311 | 17:51 |
*** yuriyz has quit IRC | 17:51 | |
*** jimjiang has quit IRC | 17:54 | |
*** romcheg has quit IRC | 17:54 | |
*** derekh has quit IRC | 18:00 | |
max_lobur | general question - do I need to ping someone to review new bugs created by me (e.g. determine bug's Importance etc) or does our core team periodically review those? | 18:08 |
NobodyCam | max_lobur: Deva will be going thru them as he he does the Bp's based off the last summit, I believe. | 18:13 |
NobodyCam | s/he he does/he re does/ | 18:14 |
max_lobur | NobodyCam, ok thanks | 18:22 |
yjiang5 | hi, can anyone reply to the question on the mailing list at http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg08390.html ? That's also interesting to me. | 18:29 |
*** lucasagomes has quit IRC | 18:44 | |
jbjohnso | so I have a couple of potential areas for interest to mention | 19:06 |
jbjohnso | I do now have a remote media deployment that avoids tftp entirely, the size of my one right now is a 700 kilobyte iso image | 19:06 |
jbjohnso | so if remote media is happy enough to do 700 kilobytes | 19:08 |
jbjohnso | you can skip the whole tftp dance altogether | 19:11 |
jbjohnso | otherwise, you have to do ~136kibibytes of tftp | 19:11 |
jbjohnso | either way, tftp load is pretty much trivial | 19:13 |
jbjohnso | even when needed, the tftp load can be absolutely static | 19:13 |
jbjohnso | another thing is a boot configuration server that can sidestep a hard need to do a lot of interaction with dhcpd config | 19:14 |
*** epim has joined #openstack-ironic | 19:15 | |
*** bauzas has joined #openstack-ironic | 19:15 | |
jbjohnso | and finally, whether you want to delegate sensor readings and such to a separate process managing its own cache and such | 19:15 |
jbjohnso | or just call pyghmi directly in whatever fashion you feel like | 19:15 |
* devananda waves | 19:26 | |
jbjohnso | hello or goodbye? | 19:27 |
devananda | hello ;) | 19:27 |
jbjohnso | or 'go away' | 19:27 |
jbjohnso | ;) | 19:27 |
jbjohnso | anyhow, xCAT's been in the business of putting tftp out of its misery for a while | 19:27 |
NobodyCam | :) | 19:28 |
NobodyCam | welcome devananda | 19:28 |
jbjohnso | Windows UEFI boot is the last bastion of evil tftp | 19:28 |
jbjohnso | been trying to get with microsoft on that one.... esxi and linux were easy enough since both use open boot loaders | 19:28 |
NobodyCam | we have requests to support windows deploys | 19:28 |
jbjohnso | so in BIOS style boot | 19:28 |
jbjohnso | you can reasonably netboot a windows pe image (e.g. if you need to do driver injection to said image for it to boot) | 19:29 |
jbjohnso | without resorting to tftp | 19:29 |
jbjohnso | and also without a port 4011 server | 19:29 |
jbjohnso | going to UEFI style boot, back to tftp hell and also you need a sane looking port 4011 response | 19:29 |
jbjohnso | I have been hoping someone gets microsoft's ear and I can describe to them what I'd want out of bootmgfw.efi | 19:30 |
jbjohnso | I should do my next inquiry with all caps SECURITY in front of it, maybe that would work | 19:30 |
NobodyCam | jbjohnso: try and get primemin1sterp's ear | 19:32 |
NobodyCam | devananda: is that the correct nick ^^^^ | 19:32 |
jbjohnso | anyway, with that in place, you could merrily do http and https for things | 19:33 |
jbjohnso | http*s* can get complicated and fragile though | 19:33 |
jbjohnso | what with the whole 'your firmware clock is probably wrong' and 'your certificate may be so new that it isn't valid yet' | 19:34 |
jbjohnso | well, that and unless you replace the cert in the boot loader, you can't use self-signed or private certs at all | 19:34 |
devananda | jbjohnso: https://blueprints.launchpad.net/ironic/+spec/windows-pxe-localboot0 | 19:35 |
jbjohnso | actually, that doesn't seem very windows-specific to me | 19:35 |
jbjohnso | that's the way we have done for a long time, that netboot usually is boot to local disk | 19:35 |
devananda | jbjohnso: so, the purpose of the developer summit every 6 months is for us all to get together and make plans about what we're going to do | 19:36 |
jbjohnso | and only when it has something interesting would pxe say do something interesting | 19:36 |
devananda | jbjohnso: most of what we discussed was captured at a high level here: https://etherpad.openstack.org/p/IcehouseIronicNextSteps | 19:36 |
jbjohnso | though with UEFI boot and IPMI in play, things get much much more straightforward if you are in the process of requesting one-time-pxe boots | 19:36 |
jbjohnso | so when does the train come to RTP? ;) | 19:37 |
jbjohnso | redeploying ease with 'boot from network first' becomes harder when the OSes have the ability to remove network boot at will | 19:38 |
jbjohnso | fyi, ipxe continues to have a problem they haven't pulled a patch in for; the fix is in the xnba tree | 19:39 |
jbjohnso | I would recommend iPXE person perhaps add their voice to people asking for snponly.efi to work again | 19:40 |
jbjohnso | secure deploy is possible | 19:41 |
jbjohnso | but the strategy is not secure boot compatible ;) | 19:41 |
jbjohnso | for the hardware discovery piece.... | 19:41 |
jbjohnso | I can provide snooping and efficient search for ibm equipment easy enough | 19:42 |
jbjohnso | but can only fingerprint so many things in my lab | 19:42 |
jbjohnso | hey, I see Sun Jing owns replicating genesis, cool | 19:44 |
*** epim has quit IRC | 19:54 | |
*** epim has joined #openstack-ironic | 20:03 | |
devananda | jbjohnso: next summit will be in Atlanta in May. think you'll make it? :) | 20:04 |
jbjohnso | so hard to be hodophobic in this world | 20:10 |
*** bauzas has quit IRC | 20:28 | |
rloo | jbjohnso: i had to look it up. For real? | 20:30 |
jbjohnso | I could be being melodramatic | 20:31 |
jbjohnso | I'm certainly anxious about travel and am very very happy to be home after a trip, but it's not like I'm paralyzed or anything | 20:31 |
rloo | jbjohnso. I'm sure we can figure out a way (we = deva, ha) so you can be effective w/o going to atlanta. | 20:32 |
devananda | jbjohnso: can other folks from IBM proxy for you // use skype or something similar to give you a virtual presence during the summit sessions? | 20:34 |
devananda | jbjohnso: there were several IBM folks there last week | 20:34 |
rloo | devananda: has anyone looked into video conferencing at the summits? | 20:35 |
devananda | rloo: yes. we tried it back in san diego, IIRC. | 20:35 |
devananda | rloo: was barely used, very slow, and overall too expensive to put in every room | 20:36 |
rloo | devananda: ah. ok. phone line would be == skype or something similar I guess. | 20:36 |
devananda | also, the general noise of a design session doesn't translate well over a phone or mic | 20:37 |
devananda | regardless of quality | 20:37 |
devananda | it can be hard for the remote participant to make sense of the discussion when several voices are going | 20:37 |
devananda | (can be hard for the non-remote folks too ....) | 20:38 |
rloo | yeah, true. it wasn't always easy to hear even for the participants in the room. | 20:38 |
devananda | :) | 20:38 |
jbjohnso | just need telepresence bots | 20:38 |
jbjohnso | preferably large, bulletproof, and software to make my voice sound like Schwarzenegger | 20:39 |
devananda | ++ | 20:39 |
NobodyCam | lol | 20:39 |
rloo | jbjohnso. where are you located? maybe you can lobby for the summit to be held there in 2015 or ?? | 20:39 |
rloo | (maybe that's too long to wait!) | 20:39 |
jbjohnso | Raleigh area, but I can get over myself | 20:39 |
devananda | jbjohnso: that's why i asked about atlanta -- probably as close to Raleigh as you'll get | 20:40 |
devananda | the fall summit will be in Paris | 20:40 |
jbjohnso | oh, come on, in Raleigh we have.... uhh... pretty trees | 20:40 |
devananda | :) | 20:40 |
rloo | I'm SURE it is prettier than Atlanta... | 20:41 |
jbjohnso | I really need to go to the new(ish) redhat building sometime | 20:41 |
devananda | rloo: did you volunteer for any of the specific tasks in the last session? | 20:42 |
rloo | devananda. Nope :-) | 20:42 |
rloo | I think you had volunteers for everything that you listed. | 20:42 |
devananda | i forgot a few (didn't copy from earlier sessions) | 20:42 |
rloo | did you need volunteers for anything? | 20:42 |
devananda | https://etherpad.openstack.org/p/IcehouseIronicNextSteps | 20:42 |
devananda | look down to "Copied from other sessions" | 20:43 |
jbjohnso | so quick question, for sensor data | 20:43 |
hemna | devananda, howdy. I wanted to ping you about working on cinder support in ironic for boot from cinder volume. | 20:43 |
devananda | hemna: hi! | 20:43 |
jbjohnso | would you be wanting to recreate pyghmi objects every time | 20:43 |
jbjohnso | or wanting to persist and reuse them? | 20:43 |
hemna | devananda, I work with Gary Thunquest at HP (and work as core on Cinder) | 20:43 |
devananda | hemna: did we talk at the summit about it? | 20:43 |
devananda | hemna: ah! great | 20:43 |
jbjohnso | and if you didn't want to reuse, how about delegating to a process that would? | 20:43 |
hemna | devananda, yah you chatted with gary about it, but I was stuck in cinder sessions at the time. | 20:43 |
devananda | hemna: yep. np. also introduced gary to mikyung kang @ usc/isi, who also need it for their tilera cluster | 20:44 |
devananda | jbjohnso: recreating objects thousands of times per minute seems wasteful. | 20:44 |
jbjohnso | devananda, well, I mean like persisting them a long time | 20:45 |
jbjohnso | devananda, If I ask for sensor data now and then sensor data 5 minutes from now, it'd be nice if the same objects are coming at me | 20:45 |
jbjohnso | devananda, trying not to do the xCAT strategy where we required disk space | 20:45 |
devananda | hemna: care to file a BP to describe how you'd like to do it? | 20:45 |
jbjohnso | and also I start getting worried about dynamic SDRs if not tying caching to session lifetime | 20:46 |
devananda | hemna: and have you seen the etherpads with our notes on it? | 20:46 |
hemna | devananda, I need to plug into ironic first to get up to speed | 20:46 |
hemna | I haven't | 20:46 |
rloo | devananda: there was something in tripleo session about putting mac address in neutron as source-of-truth? | 20:47 |
devananda | hemna: i lied - we actually dont have notes on that .... but this may still give you some context: https://etherpad.openstack.org/p/IcehouseIronicNextSteps | 20:47 |
devananda | hemna: you'll find it under "important next steps" | 20:48 |
hemna | ok thanks | 20:48 |
devananda | hemna: once you're up to speed a bit, please file a BP: https://blueprints.launchpad.net/ironic | 20:48 |
devananda | describing, roughly, how you'll implement it | 20:48 |
rloo | devananda: wrt "Copied from other sessions", most I have no idea how to do, so ah, which would you like me to volunteer for? (no hurry) | 20:48 |
jbjohnso | devananda, also, my console server already shares sessions between things like ipmi 'commands' and sol data | 20:49 |
hemna | devananda, ok will do. I'll work with Gary a bit on getting a BP done as soon as we can. | 20:49 |
NobodyCam | bbiab | 20:49 |
devananda | jbjohnso: not sure i follow why you'd care to get the same objects one or five min later | 20:49 |
jbjohnso | devananda, an optimization in ipmi | 20:49 |
jbjohnso | devananda, when you use ipmitool to read sensors, it is dog slow | 20:49 |
jbjohnso | devananda, but reading sensors is generally very easy and fast | 20:49 |
devananda | hemna: thanks. also, please reach out to mkkang@isi.edu | 20:49 |
jbjohnso | the slow bit is retrieving SDRs | 20:49 |
devananda | hemna: I'd like this feature to be implemented in a way that both teams benefit, if at all possible | 20:50 |
jbjohnso | in xCAT, we cache those to disk if requested assuming that a given firmware version, mfg, and product tuple will have stable SDR data | 20:50 |
jbjohnso | which has thus far always been the case | 20:50 |
hemna | yah of course. this would be a big win for both IMO | 20:50 |
jbjohnso | I was thinking in pyghmi, the SDR cache data could instead live in memory and get populated per session connection | 20:50 |
devananda | rloo: no worries then. do what you can :) | 20:50 |
jbjohnso | 3.4 seconds to retrieve SDRs and read sensors, 0.9 seconds to read sensors with SDR cached | 20:52 |
jbjohnso | on a system I just checked | 20:52 |
jbjohnso | and that's with 800 milliseconds of overhead | 20:53 |
rloo | devananda: I'd like to get the power_state/provision_state /last_error done first. | 20:53 |
jbjohnso | so read all sensors is something that should take less than 0.1 seconds, but sdr retrieval adds 2.5 seconds | 20:53 |
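Editor's sketch of the in-memory SDR cache jbjohnso describes: the slow SDR retrieval happens once per session object, and later sensor reads reuse it. `SensorReader`, `fetch_sdrs`, and `read_raw` are hypothetical names; no pyghmi calls are used or implied, and the record layout (`name`, `number`, `scale`) is an assumption made for the example.

```python
class SensorReader:
    """Caches SDR records for the lifetime of one session object.

    fetch_sdrs: callable returning a list of SDR dicts (the slow step).
    read_raw:   callable taking a sensor number and returning a raw reading.
    Both callables are stand-ins supplied by the caller.
    """

    def __init__(self, fetch_sdrs, read_raw):
        self._fetch_sdrs = fetch_sdrs
        self._read_raw = read_raw
        self._sdrs = None  # populated on first use, dropped with the session

    def read_all(self):
        if self._sdrs is None:
            # Slow path: only taken on the first read of this session.
            self._sdrs = self._fetch_sdrs()
        # Fast path: one raw read per sensor, converted using the cached SDR.
        return {
            sdr["name"]: self._read_raw(sdr["number"]) * sdr["scale"]
            for sdr in self._sdrs
        }
```

Tying the cache to the session object, rather than to disk, sidesteps the stale-data worry about dynamic SDRs raised earlier in the log, at the cost of one retrieval per new session.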
devananda | rloo: ah! right, i need to review that :) | 20:54 |
jbjohnso | devananda, the other fun fact is that 'power state' without persistent session takes 14 packets, with persistent session, 2 packets | 20:55 |
jbjohnso | though generally the 12 packets are barely noticed | 20:55 |
devananda | rloo: random thought - the API for lock breaking wouldn't be that hard | 20:55 |
rloo | devananda: yeah, I started looking at provision states, and yuriy pointed out some stuff, so now i have some questions. (well, i had before too). | 20:56 |
devananda | rloo: it's basically some way to trigger "UPDATE nodes SET instance_uuid=NULL, *_state=NULL WHERE ...", with a few safeguards around it | 20:56 |
rloo | devananda: ok, if you say so :-) I'll put my name down for that, and ping you later about it. | 20:57 |
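Editor's sketch of the lock-breaking UPDATE devananda outlines, using an in-memory SQLite table. The table layout, the `reservation` column, and the single safeguard (the caller must name the holder it believes is stale) are assumptions for illustration and not Ironic's actual schema or API; the chat only says "a few safeguards".

```python
import sqlite3


def make_demo_db():
    """Build a throwaway database with one locked node (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (uuid TEXT PRIMARY KEY, reservation TEXT, "
        "instance_uuid TEXT, target_power_state TEXT)")
    conn.execute(
        "INSERT INTO nodes VALUES "
        "('node-1', 'conductor-a', 'inst-1', 'power on')")
    conn.commit()
    return conn


def break_lock(conn, node_uuid, expected_holder):
    """Clear the lock and in-flight state only if expected_holder still holds it.

    Returns True when exactly one row was updated, False otherwise.
    """
    cur = conn.execute(
        "UPDATE nodes SET reservation = NULL, instance_uuid = NULL, "
        "target_power_state = NULL "
        "WHERE uuid = ? AND reservation = ?",
        (node_uuid, expected_holder))
    conn.commit()
    return cur.rowcount == 1
```

Putting the holder in the WHERE clause makes the check and the update a single statement, so a lock that has changed hands in the meantime is left alone.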
*** jamespage_ has joined #openstack-ironic | 21:13 | |
*** jamespage_ has quit IRC | 21:13 | |
devananda | rloo: ok, on the power state stuff, you had questions? | 21:15 |
rloo | devananda: I'm wondering about the 'break' between the api call, and the rpc call. until you get the lock on a node, the node's state/info could have changed. | 21:16 |
devananda | rloo: correct | 21:16 |
rloo | and the code for provision state isn't all there. | 21:16 |
rloo | so if a node's info could change, does it mean that the checking that is done in the api part, needs to be rechecked after getting the lock? | 21:17 |
devananda | rloo: that's why i suggested to remove the checking in the api :) | 21:17 |
rloo | devananda. yes... but then, the code does 'wait' to get a lock, it craps out if it doesn't get a lock. | 21:18 |
rloo | doesn't I mean, not does. | 21:18 |
devananda | rloo: true. which is possible *anyway*. but perhaps less likely | 21:18 |
rloo | devananda. maybe it doesn't matter. if you want to change the power, and there's a lock on it already, you can't change it. | 21:18 |
devananda | the issue isn't the timeout or 'wait' | 21:19 |
devananda | it's the race condition between the API checking and then the manager actually getting a lock | 21:19 |
devananda | there is such a thing as an 'intent lock', which we may need to implement at some point to solve things like this | 21:20 |
rloo | yes, there's the race condition that we don't want. but from user's point of view. i issue a command to turn power on. and api forwards request but can't get lock. what kind of response do i get, 'try again later'? | 21:20 |
devananda | rloo: yes. regardless of our implementation, if user says "power on" but node is already locked (eg for firmware update that takes 10 minutes), user should get a 500 or 503 error, with a message to retry later | 21:23 |
devananda | (i'm open to debate whether that's a 4xx or 5xx class error) | 21:24 |
devananda | might be a 408 or 409? | 21:24 |
rloo | devananda: ok, so any? checking should be done after getting the lock. Gad, error classes, do we have an expert on that? | 21:25 |
devananda | lucas or martyn, i suspect | 21:25 |
rloo | devananda: i'll think about the error class or not; see if reviewers agree/disagree... | 21:26 |
devananda | rloo: as for a lock waiting -- afaik, acquire() doesn't wait. it raise()s if it cant lock | 21:26 |
rloo | devananda: thx, i'll redo things and see what you/yuriy think ;) | 21:26 |
rloo | devananda: yes, it doesn't wait. seems like there may be times where you want to wait. but anyway. | 21:27 |
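Editor's sketch of the behaviour agreed on above: try to take the node lock without waiting, raise if it is held, and only check the node's state once the lock is owned. `Node`, `NodeLocked`, and `change_power_state` are hypothetical names, and the sketch assumes the caller receives the result synchronously.

```python
import threading


class NodeLocked(Exception):
    """Raised when the node is already reserved by someone else."""


class Node:
    def __init__(self, power_state):
        self.power_state = power_state
        self.lock = threading.Lock()


def change_power_state(node, new_state):
    """Return True if the state changed, False if it was already new_state."""
    # Non-blocking acquire: fail immediately instead of waiting for the lock.
    if not node.lock.acquire(blocking=False):
        raise NodeLocked("node is locked; retry later")
    try:
        # The state check happens under the lock, so it cannot go stale
        # between the check and the change.
        if node.power_state == new_state:
            return False
        node.power_state = new_state
        return True
    finally:
        node.lock.release()
```

A caller on a synchronous path could translate `NodeLocked` into whichever retry-later status code the reviewers settle on.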
devananda | there's a catch | 21:27 |
devananda | change_node_power_state is a cast, not call | 21:27 |
jbjohnso | devananda, on things like firmware update, I actually have a more comprehensive thought | 21:27 |
devananda | we can't return an error from it | 21:27 |
rloo | devananda: ugh, right. unless we look at last_error. | 21:27 |
jbjohnso | devananda, so long as we are talking about ipmi class devices for now.... | 21:27 |
devananda | rloo: right. but if we fail to get the lock, we had best not change anything at all | 21:28 |
jbjohnso | then set firmware firewall so even the BMC would reject power control commands for that duration | 21:28 |
rloo | devananda. more ugh. | 21:28 |
jbjohnso | lock at a higher layer sure, but I'd feel cozier if the platform was protected at the lowest level too | 21:28 |
jbjohnso | could probably more simply reduce privilege level on a channel to 'user', which would forbid all sorts of operator action | 21:29 |
jbjohnso | you could still read sensors and get power state, but not actually modify state remotely until released | 21:30 |
jbjohnso | though at least in flex, the only way out of that is through a call to the enclosure manager... | 21:30 |
* devananda noodles on the locking problem | 21:30 | |
jbjohnso | to be honest it's been a while since I dealt with a firmware update that could corrupt a system | 21:31 |
devananda | jbjohnso: that was a huge concern from several folks in audience | 21:31 |
jbjohnso | well, ipmi level remote lockout is feasible in most cases | 21:32 |
jbjohnso | which would afford the greatest protection | 21:32 |
jbjohnso | firmware firewall against just the handful of relevant commands could be more fine grained... | 21:33 |
jbjohnso | the nice thing about that is if a firmware update process included that scheme | 21:34 |
jbjohnso | regardless of whether the management infrastructure is openstack or bob's ipmi scripts, it would protect itself | 21:34 |
jbjohnso | even if said infrastructure has no idea a firmware update is in progress | 21:35 |
jbjohnso | lots of firmware updates could be applied without trying to coordinate with the management solution anyhow | 21:35 |
*** epim has quit IRC | 21:38 | |
openstackgerrit | A change was merged to openstack/ironic: Supports get node by instance uuid in API https://review.openstack.org/53262 | 21:43 |
devananda | jbjohnso: problem needs to be solved at a higher level | 21:44 |
devananda | jbjohnso: eg, combine a PDU power driver with a ramdisk-based firmware update | 21:44 |
devananda | jbjohnso: ironic is an abstraction layer on top of _many_ types of hardware (or more precisely, that's our collective vision for it) | 21:45 |
devananda | rloo: yuriy has an interesting point as well, L173 of https://review.openstack.org/#/c/54466/4/ironic/conductor/manager.py | 21:46 |
rloo | devananda. right. if we assume that target_power_state will only get modified with a lock? | 21:47 |
devananda | rloo: right. under normal operations | 21:47 |
rloo | so what about non-normal operations? | 21:47 |
devananda | rloo: what happens if a conductor instance dies? | 21:48 |
rloo | devananda: i was hoping you'd tell me that! | 21:48 |
devananda | i suppose this mandates that our lock-breaking also clears target_power_state | 21:48 |
devananda | :) | 21:48 |
devananda | which seems reasonable | 21:48 |
rloo | ha ha. devananda, I guess you foresaw the future. | 21:49 |
devananda | so, leaving the check in the API exposes a race condition, a small one, but it's not as bad as i thought | 21:49 |
devananda | API check passes -> RPC -> conductor fails to get lock -> silent error, but no corruption of data | 21:49 |
devananda | that could happen if two requests come in at the same time. both pass the API check but only one can get the lock | 21:50 |
rloo | devananda: ok, i wouldn't call that a race condition since nothing gets corrupted. i'd call it bad timing. but yeah. | 21:50 |
devananda | from user perspective, it's broken. he gets a "202" but nothing happens | 21:50 |
devananda | and there's no error | 21:50 |
rloo | right. if check is done in api and target is set, all is 'good'. if target isn't set, and it can't get lock, then user gets some sort of error. | 21:51 |
rloo | if target isn't set, and can get lock, then if eg request is to turn power on, and power is on -- code returns exception. Is that considered an error? | 21:52 |
devananda | (in case anteaya is watching, I say "he" because statistically, men break things more often) ... (/me makes up random statistics to justify fictitious claims) | 21:53 |
rloo | ha ha. I agree cuz I'm sexist (my spouse would kill me) | 21:53 |
devananda | rloo: well, neither of your scenarios are quite right. in first case, user does not get any error, because manager.change_node_power_state _cant_ return anything | 21:54 |
anteaya | watch how I break things | 21:54 |
rloo | oh oh, anteaya is on the war path. | 21:54 |
devananda | rloo: and in second case, again, even though an exception is raised and logged, the user won't see it | 21:54 |
devananda | anteaya: hi :) | 21:54 |
rloo | devananda. sorry. right. in the first case, the user doesn't know what happened. | 21:54 |
anteaya | hi devananda | 21:54 |
rloo | devananda: in the second case, the user won't see the exception. why do we bother with the exception? | 21:55 |
devananda | at least the admin could see it in the logs??? :( | 21:55 |
rloo | devananda: but if someone requests to do something, and it is already in that state, is it worth logging? | 21:55 |
devananda | no | 21:55 |
devananda | well. maybe | 21:55 |
rloo | devananda: just wondering if baremetal is a special case. i have no idea, it just isn't | 21:56 |
rloo | normal | 21:56 |
rloo | devananda: if anything, it would seem to be an eg log.info? | 21:56 |
rloo | devananda: anyway, probably not a big deal. I was just trying to understand. | 21:57 |
devananda | rloo: for instance, node should be off, deploy is called, but for some reason, when it gets down to here, node is actually on | 21:58 |
devananda | rloo: now, the deploy is going to fail, because pxe deploy depends on the DHCP request from PXE module when machine starts | 21:59 |
devananda | rloo: this was dealt with in nova-bm by having deploy() call power-off, power-on | 21:59 |
devananda | and squelching any error from power-off if machine was already off | 21:59 |
devananda | if that logic isn't preserved in our pxe driver, someone should add it :) | 22:00 |
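Editor's sketch of the power-off, power-on sequence devananda describes, where an "already off" error from the first step is squelched so the node always goes through a fresh power-on (and so a fresh PXE/DHCP request). `FakePowerDriver`, `AlreadyInState`, and `power_cycle` are hypothetical names and not the nova-baremetal or Ironic driver interface.

```python
class AlreadyInState(Exception):
    """Raised by the demo driver when asked for the state it is already in."""


class FakePowerDriver:
    """In-memory stand-in for a real power driver (demo only)."""

    def __init__(self, state):
        self.state = state
        self.history = []

    def power_off(self):
        if self.state == "off":
            raise AlreadyInState("node is already off")
        self.state = "off"
        self.history.append("off")

    def power_on(self):
        if self.state == "on":
            raise AlreadyInState("node is already on")
        self.state = "on"
        self.history.append("on")


def power_cycle(driver):
    """Force the node off, ignoring 'already off', then turn it on."""
    try:
        driver.power_off()
    except AlreadyInState:
        # The node was off to begin with, which is the state we wanted.
        pass
    driver.power_on()
```

Only the power-off error is swallowed; a failure to power on still propagates, since a deploy cannot proceed without it.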
* devananda jumps on a ph call | 22:00 | |
rloo | devananda: hmm, need to think about it. thx. | 22:01 |
openstackgerrit | linggao proposed a change to openstack/python-ironicclient: Modifies CLI to show nodes by instance uuid https://review.openstack.org/53485 | 22:08 |
*** epim has joined #openstack-ironic | 22:08 | |
*** jdob has quit IRC | 22:16 | |
*** jbjohnso has quit IRC | 22:17 | |
*** linggao has quit IRC | 22:23 | |
*** hemna has quit IRC | 22:35 | |
*** epim has quit IRC | 22:37 | |
*** blamar has quit IRC | 22:39 | |
openstackgerrit | A change was merged to openstack/ironic: Imported Translations from Transifex https://review.openstack.org/55967 | 23:04 |
devananda | NobodyCam: ^ :) | 23:19 |
openstackgerrit | Haomeng,Wang proposed a change to openstack/ironic: Supporting both Python 2 and Python 3 with six https://review.openstack.org/56169 | 23:20 |
NobodyCam | :) | 23:21 |
NobodyCam | do I have to gen a pot file now :-p | 23:21 |
NobodyCam | seems I need to go walkies ... brb | 23:33 |
*** matsuhashi has joined #openstack-ironic | 23:57 |