*** ellenh has joined #openstack-ironic | 00:01 | |
*** eghobo has quit IRC | 00:04 | |
*** eghobo has joined #openstack-ironic | 00:05 | |
*** ellenh has quit IRC | 00:05 | |
openstackgerrit | Ruby Loo proposed a change to openstack/ironic: Driver interface's validate should return nothing https://review.openstack.org/97855 | 00:07 |
---|---|---|
openstackgerrit | Yongli He proposed a change to openstack/ironic: Rewrite ironic policy to use the new changes of common policy https://review.openstack.org/97731 | 00:23 |
*** rushiagr has quit IRC | 00:25 | |
lifeless | devananda: so I think we need to reevaluate the locking strategy around nodes | 00:30 |
lifeless | devananda: dealing with hardware that has any fragility is super hard because the ipmi background tasks just stall | 00:30 |
lifeless | devananda: and I think this ties into the async API discussion | 00:31 |
*** eghobo has quit IRC | 00:31 | |
lifeless | devananda: so - when you can, I'd like some high bw brainstorming, followed up by a spec for broad assessment | 00:31 |
*** rushiagr has joined #openstack-ironic | 00:32 | |
devananda | back | 00:34 |
devananda | lifeless: hi! i have ~1hr | 00:34 |
devananda | lifeless: is that enough // is now good? | 00:34 |
lifeless | devananda: 20m right now would rock | 00:35 |
lifeless | let me relocate | 00:35 |
devananda | lifeless: text? phone? vid? | 00:35 |
lifeless | g+ | 00:36 |
lifeless | devananda: g+ | 00:38 |
* devananda kicks browser | 00:38 | |
devananda | tried to answer, nothing happened. sec | 00:38 |
*** annegentle has quit IRC | 00:40 | |
devananda | lifeless: https://etherpad.openstack.org/p/ironic-and-fragile-hardware | 00:40 |
NobodyCam | through rain and hail we have arrived in Buffalo, WY | 00:47 |
NobodyCam | oh happy happy joy joy 97447 was blocked | 00:48 |
*** godp1301 has joined #openstack-ironic | 00:51 | |
*** rwsu has quit IRC | 00:51 | |
*** godp1301 has quit IRC | 00:54 | |
openstackgerrit | Chris Krelle proposed a change to openstack/ironic: Rework make_partitions logic when preserve_ephemeral is set https://review.openstack.org/97590 | 00:55 |
openstackgerrit | Chris Krelle proposed a change to openstack/ironic: Wipe any metadata from a nodes disk https://review.openstack.org/93133 | 00:57 |
devananda | NobodyCam: you missed a long conversation about it | 01:01 |
devananda | yes, it was blocked -- nova_bm was broken too, and the fix wasn't approved for that | 01:02 |
NobodyCam | :) just rebased the label patches | 01:03 |
NobodyCam | :-p | 01:03 |
NobodyCam | anyone want to review 93133 or 97590? | 01:04 |
NobodyCam | :-p | 01:04 |
NobodyCam | ok brb running to grab fast food. | 01:04 |
devananda | comstud: ping | 01:14 |
*** rloo has quit IRC | 01:21 | |
devananda | nvm | 01:22 |
devananda | lifeless: ok, see the etherpad for some post-brainstorm notes | 01:25 |
devananda | lifeless: i'm concerned this idea will be a massive rewrite of the driver API, even if it doesn't affect the REST API that much. you probably thought of a way to avoid doing that, though :) | 01:26 |
devananda | lifeless: so pls add it to etherpad, i'll check back later (your tomorrow, probably) | 01:26 |
NobodyCam | back but eating tacos | 01:29 |
devananda | finishing for the day ... bbtmw! | 01:33 |
*** coolsvap is now known as coolsvap|afk | 01:38 | |
*** nosnos has joined #openstack-ironic | 01:46 | |
NobodyCam | night devananda | 01:46 |
comstud | devananda: neverminding | 01:55 |
comstud | (was on the road) | 01:56 |
NobodyCam | we should change our status message as 97447 is no longer in the queue | 02:07 |
lifeless | devananda: it will be, but there should be a graceful way forward | 02:23 |
*** pcrews has quit IRC | 02:32 | |
ryanpetrello | devananda: any reservations on https://review.openstack.org/#/c/97475/ ? | 02:32 |
ryanpetrello | I’m held up on a few pecan reviews because of the failing tests in the stable branch :\ | 02:32 |
ryanpetrello | I could set ironic-stable to non-voting for a bit, but I’d really like to get the pecan tests gating against it :) | 02:33 |
*** Jatin360 has joined #openstack-ironic | 02:56 | |
*** Jatin360_ has joined #openstack-ironic | 02:58 | |
*** Jatin360 has quit IRC | 03:01 | |
*** Jatin360_ is now known as Jatin360 | 03:01 | |
*** coolsvap|afk is now known as coolsvap | 03:13 | |
*** vinbs has joined #openstack-ironic | 03:15 | |
*** nosnos has quit IRC | 03:43 | |
*** eghobo has joined #openstack-ironic | 03:45 | |
lifeless | devananda: rambled a bit on https://etherpad.openstack.org/p/ironic-and-fragile-hardware | 03:54 |
*** harlowja is now known as harlowja_away | 04:07 | |
*** rameshg87 has joined #openstack-ironic | 04:08 | |
*** godp1301 has joined #openstack-ironic | 04:27 | |
*** eghobo has quit IRC | 04:30 | |
*** eghobo has joined #openstack-ironic | 04:31 | |
*** jcoufal has joined #openstack-ironic | 04:36 | |
*** lazy_prince has joined #openstack-ironic | 04:38 | |
*** matsuhashi has joined #openstack-ironic | 04:39 | |
*** nosnos has joined #openstack-ironic | 04:41 | |
*** godp1301 has quit IRC | 04:52 | |
*** eguz has joined #openstack-ironic | 04:55 | |
*** eghobo has quit IRC | 04:58 | |
*** k4n0 has joined #openstack-ironic | 05:02 | |
*** Jatin360 has quit IRC | 05:03 | |
*** eghobo has joined #openstack-ironic | 05:04 | |
*** eguz has quit IRC | 05:04 | |
*** eghobo has quit IRC | 05:04 | |
*** eghobo has joined #openstack-ironic | 05:05 | |
*** Jatin360 has joined #openstack-ironic | 05:05 | |
*** sysexit has joined #openstack-ironic | 05:09 | |
*** eguz has joined #openstack-ironic | 05:10 | |
*** eguz has quit IRC | 05:10 | |
*** eguz has joined #openstack-ironic | 05:11 | |
*** eghobo has quit IRC | 05:11 | |
k4n0 | morning all | 05:12 |
*** rakesh_hs has joined #openstack-ironic | 05:14 | |
*** coolsvap is now known as coolsvap|afk | 05:38 | |
*** coolsvap|afk is now known as coolsvap | 05:52 | |
*** jcoufal has quit IRC | 05:56 | |
*** Jatin360_ has joined #openstack-ironic | 05:57 | |
*** Jatin360 has quit IRC | 06:00 | |
openstackgerrit | OpenStack Proposal Bot proposed a change to openstack/ironic: Imported Translations from Transifex https://review.openstack.org/96063 | 06:02 |
*** Jatin360_ has quit IRC | 06:02 | |
*** Jatin360 has joined #openstack-ironic | 06:03 | |
*** aweeks has quit IRC | 06:04 | |
*** aweeks has joined #openstack-ironic | 06:05 | |
*** Kai14 has joined #openstack-ironic | 06:15 | |
*** loki184 has joined #openstack-ironic | 06:27 | |
*** Mikhail_D_ltp has joined #openstack-ironic | 06:33 | |
*** lsmola_ has joined #openstack-ironic | 06:38 | |
*** lsmola has quit IRC | 06:39 | |
*** Jatin360 has quit IRC | 06:47 | |
GheRivero | morning all | 06:52 |
Mikhail_D_ltp | Good morning folks! :) | 06:55 |
*** Mikhail_D_ltp has left #openstack-ironic | 06:56 | |
*** Kai14 has quit IRC | 06:56 | |
*** Mikhail_D_ltp has joined #openstack-ironic | 06:58 | |
*** loki184 has quit IRC | 06:59 | |
*** Jatin360 has joined #openstack-ironic | 07:00 | |
mrda | Hi GheRivero and Mikhail_D_ltp | 07:01 |
Mikhail_D_ltp | mrda: Hi! :) | 07:02 |
*** Jatin360 has quit IRC | 07:05 | |
*** Jatin360 has joined #openstack-ironic | 07:06 | |
*** Haomeng|2 has joined #openstack-ironic | 07:06 | |
*** Haomeng has quit IRC | 07:06 | |
*** Kai14 has joined #openstack-ironic | 07:10 | |
*** Jatin360_ has joined #openstack-ironic | 07:11 | |
*** ndipanov has joined #openstack-ironic | 07:12 | |
*** Jatin360 has quit IRC | 07:15 | |
*** Jatin360_ is now known as Jatin360 | 07:15 | |
*** Jatin360_ has joined #openstack-ironic | 07:25 | |
openstackgerrit | Rakesh H S proposed a change to openstack/ironic: ipmi double bridging functionality https://review.openstack.org/95775 | 07:27 |
*** Jatin360 has quit IRC | 07:28 | |
*** Jatin360_ has quit IRC | 07:30 | |
*** Jatin360 has joined #openstack-ironic | 07:33 | |
Haomeng|2 | morning GheRivero, Mikhail_D_ltp, mrda :) | 07:41 |
mrda | Hi Haomeng|2 | 07:41 |
Haomeng|2 | and k4n0 :) | 07:42 |
Haomeng|2 | mrda: :) | 07:42 |
*** max_lobur has joined #openstack-ironic | 07:43 | |
Mikhail_D_ltp | Haomeng|2: hi! :) | 07:44 |
Haomeng|2 | Mikhail_D_ltp: :) | 07:44 |
mrda | Time to say good night. See you tomorrow! | 07:44 |
*** mrda is now known as mrda-away | 07:44 | |
*** Jatin360 has quit IRC | 07:51 | |
*** jistr has joined #openstack-ironic | 07:53 | |
*** Jatin360 has joined #openstack-ironic | 07:54 | |
Mikhail_D_ltp | k4n0: Hi! Are you around? | 07:55 |
Mikhail_D_ltp | k4n0: Are you working on this task -> https://bugs.launchpad.net/ironic/+bug/1282836 ??? | 07:56 |
*** pbrooko has joined #openstack-ironic | 07:57 | |
*** pelix has joined #openstack-ironic | 08:01 | |
Mikhail_D_ltp | k4n0: Oh! I already saw that you are not working on it :) | 08:02 |
openstackgerrit | lifeless proposed a change to openstack/ironic: Add in text for text mode on trusty https://review.openstack.org/98050 | 08:02 |
*** r0j4z0 has quit IRC | 08:03 | |
Haomeng|2 | FYI - all our Jenkins gates failed because of a recent nova code change; I raised a bug to track it - https://bugs.launchpad.net/ironic/+bug/1326680 | 08:04 |
dtantsur|afk | morning Ironic | 08:13 |
agordeev | morning Ironic | 08:14 |
agordeev | morning dtantsur|afk Haomeng|2 Mikhail_D_ltp | 08:14 |
dtantsur|afk | morning, Haomeng|2, that's a known bug :) we've been stuck with it since yesterday morning | 08:14 |
Haomeng|2 | morning dtantsur|afk, agordeev :) | 08:14 |
dtantsur|afk | agordeev, morning | 08:14 |
*** dtantsur|afk is now known as dtantsur | 08:14 | |
Mikhail_D_ltp | agordeev, dtantsur: g'morning :) | 08:14 |
dtantsur | Mikhail_D_ltp, morning | 08:15 |
Haomeng|2 | dtantsur|afk: ok, which one? let me update the bug I raised:) thank you for the information. | 08:15 |
dtantsur | Haomeng|2, already updated :) | 08:15 |
Haomeng|2 | dtantsur: thk | 08:15 |
dtantsur | actually the situation is terrible now: Nova patch is being _reverted_, so we're stuck again | 08:15 |
dtantsur | this is a nightmare :( | 08:15 |
Haomeng|2 | dtantsur: looks like nova guys will rollback the change | 08:17 |
Haomeng|2 | https://review.openstack.org/#/c/97447/ | 08:17 |
*** lucasagomes has joined #openstack-ironic | 08:17 | |
dtantsur | yeah, that's what I mean | 08:17 |
Haomeng|2 | dtantsur: ok, we can wait nova new patch:) | 08:18 |
*** martyntaylor has joined #openstack-ironic | 08:18 | |
dtantsur | Haomeng|2, we can't do anything anyway :) issued a reverify of the Nova patch, hope it lands asap... | 08:20 |
Haomeng|2 | dtantsur: yes | 08:20 |
Haomeng|2 | dtantsur: :) | 08:20 |
dtantsur | brb | 08:22 |
dtantsur | btw, someone update the topic of the channel with up-to-date information, please | 08:23 |
*** dtantsur is now known as dtantsur|afk | 08:23 | |
vinbs | Hello Ironic! | 08:25 |
vinbs | Haomeng | 08:26 |
Haomeng|2 | vinbs: welcome:) | 08:26 |
vinbs | thank you | 08:26 |
Haomeng|2 | vinbs: :) | 08:26 |
vinbs | Haomeng|2 I have got my openstack setup with Ironic now | 08:26 |
*** jcoufal has joined #openstack-ironic | 08:27 | |
Haomeng|2 | vinbs: cool | 08:27 |
Haomeng|2 | vinbs: does it work now? | 08:27 |
vinbs | Haomeng|2 , now if I launch an instance on a network which has dhcp enabled, the dhcp packets should reach the baremetal node, right? | 08:27 |
*** jcoufal has quit IRC | 08:27 | |
vinbs | I haven't tried it yet | 08:27 |
*** jcoufal has joined #openstack-ironic | 08:28 | |
vinbs | I have enrolled the baremetal node, created the baremetal flavor key too | 08:28 |
*** romcheg has joined #openstack-ironic | 08:28 | |
*** romcheg has left #openstack-ironic | 08:28 | |
vinbs | and I have the required images in glance | 08:28 |
*** sabah has joined #openstack-ironic | 08:29 | |
Haomeng|2 | vinbs: cool | 08:29 |
openstackgerrit | Sandhya Balakrishnan proposed a change to openstack/ironic: Updates Ironic Guide with deployment information https://review.openstack.org/94604 | 08:30 |
*** athomas has joined #openstack-ironic | 08:30 | |
Haomeng|2 | vinbs: did you check your net type | 08:30 |
Haomeng|2 | vinbs: let me show you the neutron command, if your network used by nova booting for baremetal is vlan, that is more complex | 08:30 |
vinbs | Haomeng|2, I want to try with a simple setup first | 08:31 |
Haomeng|2 | vinbs: ok | 08:31 |
vinbs | Haomeng|2, which network type should I choose for that? | 08:31 |
*** Kai14 has quit IRC | 08:31 | |
Haomeng|2 | vinbs: can you run "neutron net-show" to check the net type | 08:33 |
Haomeng|2 | vinbs: if it is "provider:network_type | local", that means the packets will not be sent to the outside network | 08:34 |
*** max_lobur has quit IRC | 08:35 | |
vinbs | I have network type: gre | 08:35 |
vinbs | Haomeng|2 here's my network details http://paste.openstack.org/show/82904/ | 08:38 |
Haomeng|2 | vinbs: your net is gre, it is difficult to debug I think | 08:44 |
*** Kai14 has joined #openstack-ironic | 08:44 | |
*** romcheg has joined #openstack-ironic | 08:45 | |
romcheg | Morning Ironic | 08:45 |
Haomeng|2 | vinbs: can you try with flat network? | 08:45 |
romcheg | Sorry, I had to disappear unexpectedly yesterday due to personal reasons | 08:46 |
vinbs | Haomeng|2 ok then.. let me change the network type to flat | 08:46 |
vinbs | Haomeng|2 I'll have to modify the ml2_conf.ini file for this right? | 08:47 |
*** Jatin360 has quit IRC | 08:47 | |
Haomeng|2 | vinbs: and make sure your host eth card can receive the dhcp request from our baremetal node, that is physical networking scope | 08:48 |
Haomeng|2 | vinbs: no, just create another net whose type is flat, and boot with this flat net | 08:48 |
vinbs | Haomeng|2 yes it can.. I have checked that by running my own dnsmasq service | 08:48 |
*** eglynn_ has joined #openstack-ironic | 08:49 | |
Haomeng|2 | vinbs: for such an issue, we have to debug end-to-end to see where the packets go missing | 08:49 |
Haomeng|2 | vinbs: so if you just want to try our ironic, I suggest using the simple network type - flat net | 08:50 |
romcheg | sudo: /usr/sbin/apache2ctl: command not found | 08:51 |
Haomeng|2 | vinbs: if your case requires GRE, we have to debug the whole GRE path to check where our DHCP request goes missing or if the dhcp response goes missing | 08:51 |
vinbs | Haomeng|2 yes I just want to try Ironic for now.. So I'm not particular about network type | 08:51 |
vinbs | Haomeng|2 I'll create a network with type flat | 08:51 |
Haomeng|2 | vinbs: that is not a problem, just play with them:) | 08:51 |
Haomeng|2 | vinbs: ok, good luck | 08:52 |
eglynn_ | Haomeng|2: good morning/evening! | 08:53 |
*** Jatin360 has joined #openstack-ironic | 08:53 | |
Haomeng|2 | eglynn_: morning:) | 08:54 |
eglynn_ | Haomeng|2: ... just a quick question about ironic emitting notifications with IPMI sensor data for ceilometer to consume | 08:54 |
Haomeng|2 | eglynn_: sure | 08:54 |
eglynn_ | Haomeng|2: (... discussed at summit https://etherpad.openstack.org/p/juno-ironic-and-ceilometer in a session led by devananda) | 08:55 |
eglynn_ | Haomeng|2: ... just wondering if a fresh blueprint is needed for this? | 08:55 |
Haomeng|2 | eglynn_: yes | 08:55 |
eglynn_ | Haomeng|2: ... and do you know if it'll be worked on in the juno-2 timeframe? | 08:56 |
eglynn_ | Haomeng|2: (assuming it's missed the boat on juno-1 at this stage) | 08:56 |
Haomeng|2 | eglynn_: to follow new process, for any blueprint, we need design spec, so I just plan to commit the spec first | 08:57 |
Haomeng|2 | eglynn_: then, I will work with ceilometer guys to confirm the solution, and will modify the existing patch to commit again for reviewing | 08:57 |
eglynn_ | Haomeng|2: ... cool, can you nominate me as a reviewer on the BP spec? (... from the ceilometer perspective) | 08:58 |
Haomeng|2 | eglynn_: sure | 08:58 |
Haomeng|2 | eglynn_: are you working on Ceilometer? | 08:58 |
Haomeng|2 | eglynn_: welcome your comments | 08:58 |
Haomeng|2 | eglynn_: and what is your launch pad id? | 08:58 |
eglynn_ | Haomeng|2: yeah I'm the ceilometer PTL for my sins ;) | 08:58 |
eglynn_ | Haomeng|2: launchpad ID == eglynn | 08:59 |
Haomeng|2 | eglynn_: thank you | 08:59 |
*** spearson has joined #openstack-ironic | 08:59 | |
Haomeng|2 | eglynn_: I will prepare the spec asap | 08:59 |
eglynn_ | Haomeng|2: ... excellent, thank you sir! | 08:59 |
eglynn_ | Haomeng|2: ... also the ceilometer contributor who's likely to be working on the ceilometer code to consume the Ironic notifications will be Chris Dent (LP ID == cdent) | 09:00 |
Kai14 | hi Ironic! | 09:02 |
Kai14 | I've got a question regarding the doc "Deploying Ironic with Devstack" http://docs.openstack.org/developer/ironic/dev/dev-quickstart.html#deploying-ironic-with-devstack | 09:04 |
Kai14 | is this supposed to work in a single node? | 09:04 |
Haomeng|2 | eglynn_: ok, I will work with Chris to complete this design, thanks for your information | 09:08 |
eglynn_ | Haomeng|2: thanks again! | 09:09 |
Haomeng|2 | eglynn_: in the icehouse release, I tried to find ceilometer guys to work together on this bp:) | 09:09 |
Haomeng|2 | eglynn_: :) | 09:09 |
eglynn_ | Haomeng|2: ... yeah, I think we dropped the ball on that interaction from the ceilometer side, apologies! | 09:09 |
eglynn_ | Haomeng|2: ... but for Juno, I've pushed it up the priority stack | 09:10 |
Haomeng|2 | eglynn_: don't worry, that is fine with me:) | 09:10 |
Haomeng|2 | eglynn_: thanks for your support:) | 09:10 |
eglynn_ | Haomeng|2: (... because it's important for the Tuskar/TripleO folks, that makes it important for Ceilometer also) | 09:10 |
eglynn_ | Haomeng|2: ... well, thank *you* for your efforts on this :) | 09:11 |
Haomeng|2 | eglynn_: yes | 09:11 |
Haomeng|2 | eglynn_: welcome:) | 09:11 |
lucasagomes | Kai14, I believe that's using nested vms, so yes | 09:11 |
Haomeng|2 | eglynn_: that is my pleasure:) | 09:11 |
Haomeng|2 | eglynn_: one more question | 09:11 |
eglynn_ | Haomeng|2: shoot | 09:12 |
Haomeng|2 | eglynn_: based on the current code, Ironic will send the message to Ceilometer directly; I'm not sure the message can be handled by Ceilometer. Will Ceilometer validate the message first with some trust model? | 09:13 |
Haomeng|2 | eglynn_: that is my concern:) | 09:13 |
lucasagomes | Kai14, https://etherpad.openstack.org/p/IronicDeployDevstack << this is not updated, but you can get an idea about how to use more VMs for ur tests | 09:13 |
eglynn_ | Haomeng|2: ... the idea discussed at summit was that Ironic will simply emit an AMQP notification with these sensor data | 09:14 |
Haomeng|2 | eglynn_: that is fine with my existing code:) | 09:14 |
eglynn_ | Haomeng|2: ... and we take the AMQP message bus to be "implicitly secure" | 09:14 |
Haomeng|2 | eglynn_: ok | 09:15 |
Haomeng|2 | eglynn_: thank you:) | 09:15 |
eglynn_ | Haomeng|2: ... I *think* the general principle in openstack seems to be that REST API calls are authenticated, but AMQP-mediated interactions are assumed to be secure | 09:15 |
Kai14 | lucasagomes, thank you. The problem is when i come to the command "nova boot" my ironic nodes get powered on but the instance gets stuck in spawning state | 09:15 |
Haomeng|2 | eglynn_: yes, we can discuss the interface between Ironic and Ceilometer, both API and AMQP message are ok for me :) | 09:16 |
Haomeng|2 | eglynn_: let me prepare the spec, and welcome your comments:) | 09:16 |
eglynn_ | Haomeng|2: excellent, thank you! | 09:17 |
Haomeng|2 | eglynn_: :) | 09:17 |
Kai14 | lucasagomes: the ironic node gets stuck in provisioning state "wait call-back" | 09:17 |
openstackgerrit | Rakesh H S proposed a change to openstack/ironic: ipmi double bridging functionality https://review.openstack.org/95775 | 09:22 |
*** igordcard has joined #openstack-ironic | 09:27 | |
lucasagomes | Kai14, right, and did you see the console of the machine ur trying to boot? | 09:28 |
romcheg | Morning lucasagomes! | 09:28 |
lucasagomes | Kai14, did they get an ip from the dhcp request and all? | 09:28 |
lucasagomes | romcheg, morning | 09:28 |
lucasagomes | morning Haomeng|2 :) | 09:28 |
*** rameshg87 has left #openstack-ironic | 09:28 | |
Haomeng|2 | lucasagomes: morning:) | 09:28 |
openstackgerrit | Yongli He proposed a change to openstack/ironic: Rewrite ironic policy to use the new changes of common policy https://review.openstack.org/97731 | 09:29 |
Kai14 | lucasagomes, no console and they also don't get an ip. But I didn't configure a dhcp server or TFTP or PXE. I think this should be all done by ironic. Or am I wrong? | 09:30 |
*** sabah has quit IRC | 09:30 | |
lucasagomes | Kai14, yeah dhcp will come from neutron and devstack should have configured tftp for u AFAIR | 09:31 |
lucasagomes | do you see any dnsmasq process running? (neutron uses dnsmasq as the dhcp server) | 09:31 |
*** max_lobur has joined #openstack-ironic | 09:32 | |
Kai14 | lucasagomes, dnsmasq is running. | 09:32 |
*** vinbs has quit IRC | 09:33 | |
*** vinbs has joined #openstack-ironic | 09:33 | |
lucasagomes | right hmm, it should have got an ip, I need to know what the booting machine is showing in the console :( | 09:33 |
lucasagomes | debugging this part without seeing it is pretty hard | 09:34 |
Kai14 | okay how can I connect to the console via virsh? | 09:34 |
*** Alexei_987 has joined #openstack-ironic | 09:35 | |
Kai14 | virsh list --all shows that one baremetal node is running | 09:35 |
*** derekh_ has joined #openstack-ironic | 09:37 | |
lucasagomes | Kai14, http://rwmj.wordpress.com/2011/07/08/setting-up-a-serial-console-in-qemu-and-libvirt/ might help, but instead of the grub.conf you gotta use the pxe config file | 09:38 |
Kai14 | lucasagomes, I know my host OS isn't supporting nested KVM (host os is rhel but I'm working with a ubuntu VM) could this be a problem? but you can use nested qemu instead of kvm therefore it should be no problem, right? | 09:41 |
Kai14 | lucasagomes, and thanks for the link I will work through that | 09:41 |
lucasagomes | Kai14, right, qemu will be pretty slow but might work | 09:42 |
lucasagomes | Kai14, you can also try things out without using nested vm, you set a bridge between the devstack vm and the vm u are trying to deploy | 09:43 |
openstackgerrit | Sirushti Murugesan proposed a change to openstack/ironic-specs: Whole Disk Image Support https://review.openstack.org/97150 | 09:45 |
Kai14 | lucasagomes, yeah I already thought about that but my problem is how will the devstack vm connect to the vm I want to deploy. I suppose I create an ironic node and pass a MAC address as parameter. but I'm not sure which (ironic) driver I should use? pxe_ssh? | 09:45 |
Kai14 | lucasagomes, sorry I think it's the ironic port where the MAC address parameter exists | 09:47 |
lucasagomes | Kai14, so the host is the glue, on devstack you can configure ironic to ssh into ur host and issue virsh commands to start/stop the machine | 09:47 |
lucasagomes | I mean, on the ironic running on the devstack machine | 09:47 |
Kai14 | lucasagomes, thank you. | 09:50 |
lucasagomes | Kai14, take a look at that etherpad, tho it's not updated you can picture how it's done | 09:50 |
*** dtantsur|afk is now known as dtantsur | 09:50 | |
dtantsur | g'afternoon, Ironic, now I'm finally here :) | 09:51 |
dtantsur | Haomeng|2, lucasagomes, romcheg shall we just disable failing tests to unblock our approves? No one of our patches currently touches scheduler | 09:52 |
dtantsur | it doesn't look like nova patch is going to be merged soon - it failed again in the gate | 09:52 |
lucasagomes | dtantsur, hmm hard decision, we have to disable py27 and py26 checks for that no? | 09:53 |
lucasagomes | I dunno if that's a good idea | 09:53 |
Haomeng|2 | dtantsur: it is a good idea, but we can sync with nova guys to see if new nova patch is ready soon | 09:53 |
dtantsur | lucasagomes, disable only 4 failing tests | 09:53 |
dtantsur | Haomeng|2, it's ready bug again failed both check and verification | 09:54 |
dtantsur | * bug = but | 09:54 |
lucasagomes | pff damn, /me thinking | 09:55 |
dtantsur | Let me know, what you think, folks. If you're ok, I think we can try. I'm afraid to rely on Nova fixing it soon enough (and Jenkins approve it) | 09:57 |
Haomeng|2 | lifeless: need your supporting here about this - https://review.openstack.org/#/c/97447/ | 09:58 |
Haomeng|2 | lifeless: we depend on the nova fix I think, not sure if nova will rollback the code? | 10:00 |
*** Jatin360 has quit IRC | 10:00 | |
*** Jatin360_ has joined #openstack-ironic | 10:00 | |
*** Jatin360_ is now known as Jatin360 | 10:00 | |
lifeless | Haomeng|2: https://review.openstack.org/#/c/97757/ | 10:00 |
Haomeng|2 | lifeless: ok thank you | 10:00 |
* lifeless crashes for sleep | 10:00 | |
Haomeng|2 | dtantsur: so we can wait https://review.openstack.org/#/c/97447/, code will be merged soon I think, it is approved already | 10:01 |
dtantsur | Haomeng|2, <dtantsur> it doesn't look like nova patch is going to be merged soon - it failed again in the gate | 10:01 |
dtantsur | Haomeng|2, I already reverified this morning and it's gonna fail again | 10:02 |
dtantsur | (I can see it in zuul) | 10:02 |
Haomeng|2 | dtantsur: yes, see your comments | 10:02 |
*** pradipta_away is now known as pradipta | 10:02 | |
Haomeng|2 | dtantsur: I think nova team can help fix quickly | 10:02 |
dtantsur | Haomeng|2, I don't think they can fix transient failures :) on the other hand, we can also hit them, so I don't know | 10:04 |
Haomeng|2 | dtantsur: :) | 10:04 |
*** Jatin360_ has joined #openstack-ironic | 10:05 | |
Haomeng|2 | dtantsur: so we can discuss with Deva, maybe he has idea | 10:05 |
dtantsur | maybe. I just wanted to merge something, while gate pipeline is not full :) but well, I'm ok with waiting for nova | 10:05 |
Haomeng|2 | dtantsur: :) | 10:06 |
*** Jatin360 has quit IRC | 10:08 | |
*** Jatin360_ is now known as Jatin360 | 10:08 | |
*** r0j4z0 has joined #openstack-ironic | 10:09 | |
*** romcheg has quit IRC | 10:10 | |
openstackgerrit | Sirushti Murugesan proposed a change to openstack/ironic-specs: Whole Disk Image Support https://review.openstack.org/97150 | 10:11 |
*** Jatin360 has quit IRC | 10:17 | |
dtantsur | Folks, do we have any docs on our spec review process? | 10:20 |
*** pbrooko has quit IRC | 10:22 | |
lucasagomes | dtantsur, the README at the ironic-spec repo? | 10:27 |
lucasagomes | dtantsur, https://github.com/openstack/ironic-specs/blob/master/README.rst | 10:27 |
lucasagomes | and the template as well | 10:27 |
lucasagomes | dtantsur, https://github.com/openstack/ironic-specs/blob/master/specs/template.rst | 10:28 |
*** matsuhashi has quit IRC | 10:31 | |
*** matsuhashi has joined #openstack-ironic | 10:34 | |
*** romcheg has joined #openstack-ironic | 10:46 | |
*** romcheg has quit IRC | 10:50 | |
*** romcheg has joined #openstack-ironic | 10:50 | |
dtantsur | thanks, lucasagomes. I would still like a more-or-less complete wiki page, but ok | 10:53 |
dtantsur | in the meanwhile https://review.openstack.org/#/c/97757/ has failed. Reverifying again... | 10:54 |
openstackgerrit | Imre Farkas proposed a change to openstack/ironic: Make driver validation asynchronous https://review.openstack.org/97789 | 10:57 |
lucasagomes | dtantsur, yeah wiki page would be better indeed | 11:01 |
*** romcheg1 has joined #openstack-ironic | 11:02 | |
*** romcheg has quit IRC | 11:02 | |
*** matsuhashi has quit IRC | 11:06 | |
*** romcheg1 is now known as romcheg | 11:06 | |
*** matsuhashi has joined #openstack-ironic | 11:06 | |
*** lazy_prince has quit IRC | 11:07 | |
*** matsuhashi has quit IRC | 11:10 | |
*** nosnos has quit IRC | 11:20 | |
*** lucasagomes is now known as lucas-hungry | 11:23 | |
*** matsuhashi has joined #openstack-ironic | 11:24 | |
*** matsuhashi has quit IRC | 11:24 | |
*** matsuhashi has joined #openstack-ironic | 11:24 | |
dtantsur | lucas-hungry and others: please have a look: https://wiki.openstack.org/wiki/Ironic/Specs_Process | 11:27 |
*** matsuhashi has quit IRC | 11:29 | |
dtantsur | also addition in the end of https://wiki.openstack.org/wiki/Ironic | 11:30 |
romcheg | dtantsur: /me is looking | 11:39 |
*** k4n0 has quit IRC | 11:54 | |
*** takadayuiko has joined #openstack-ironic | 11:57 | |
NobodyCam | good morning Irpnic | 12:05 |
romcheg | Morning NobodyCam! | 12:05 |
NobodyCam | :-p Ironic even | 12:05 |
dtantsur | morning NobodyCam :) | 12:05 |
NobodyCam | hey, Morning romcheg & dtantsur | 12:06 |
dtantsur | folks, I introduced a few official tags for our bugs, among them new 'driver' tag: https://bugs.launchpad.net/ironic/+bugs?field.tag=driver | 12:06 |
dtantsur | this is to help prioritize issues, as driver ones have quite high relative priority if we're going to merge with Nova | 12:07 |
NobodyCam | is Haomeng|2 here? | 12:08 |
romcheg | NobodyCam: I think it's late for him | 12:08 |
NobodyCam | ya | 12:08 |
NobodyCam | wanted to ask about his comment on https://review.openstack.org/#/c/97590 | 12:09 |
NobodyCam | and oh is our gate still broke? | 12:10 |
NobodyCam | broken even... | 12:10 |
* NobodyCam watches the coffee pot brew...wishing it would go faster | 12:11 | |
dtantsur | NobodyCam, it's a fun story | 12:11 |
dtantsur | NobodyCam, now they reverted Nova patch :) | 12:11 |
dtantsur | NobodyCam, and we're waiting for the revert to land | 12:11 |
dtantsur | and it does not land due to transient failures | 12:11 |
dtantsur | so I keep reverifying it in the hope that it will merge :) | 12:12 |
NobodyCam | oh ya | 12:12 |
dtantsur | ok. it failed. again. kmp >_< | 12:12 |
dtantsur | NobodyCam, I even suggested disabling 4 failing tests on our side, until fix is found | 12:13 |
*** vinbs_ has joined #openstack-ironic | 12:13 | |
dtantsur | other cores were not that sure about it... and I still think it may be a good idea | 12:13 |
NobodyCam | its our tests that are failing when its reverted? | 12:14 |
dtantsur | NobodyCam, no, I meant, if our work is blocked due to 4 unit tests, maybe just temporarily skip them? | 12:14 |
*** coolsvap is now known as coolsvap|afk | 12:15 | |
NobodyCam | I don't have mail open, do you have the link to the revert patch | 12:15 |
*** vinbs__ has joined #openstack-ironic | 12:15 | |
dtantsur | NobodyCam, https://review.openstack.org/#/c/97757/ | 12:15 |
*** vinbs has quit IRC | 12:15 | |
*** vinbs__ is now known as vinbs | 12:15 | |
*** vinbs_ has quit IRC | 12:17 | |
dtantsur | NobodyCam, oh, sdague is writing to ML that things are really bad with gates now and asks to stop approving things | 12:18 |
NobodyCam | oh joy :( | 12:21 |
*** vinbs has quit IRC | 12:22 | |
*** lucas-hungry is now known as lucasagomes | 12:24 | |
lucasagomes | dtantsur, thanks will do | 12:24 |
lucasagomes | morning NobodyCam | 12:24 |
NobodyCam | good morning lucasagomes | 12:28 |
NobodyCam | was just looking over 97757 | 12:28 |
NobodyCam | :-p | 12:28 |
NobodyCam | brb | 12:29 |
lucasagomes | :( | 12:32 |
lucasagomes | still not merged | 12:32 |
openstackgerrit | A change was merged to openstack/ironic-python-agent: Add missing methods to base HardwareManager class https://review.openstack.org/97631 | 12:32 |
lucasagomes | reverify again | 12:33 |
lucasagomes | let's see | 12:33 |
NobodyCam | :) | 12:34 |
lucasagomes | what a pain | 12:34 |
dtantsur | yeah, now your turn to reverify :) | 12:35 |
dtantsur | Yet Another (tm) reason to merge driver into Nova asap | 12:36 |
dtantsur | this would be their problem :D | 12:36 |
NobodyCam | yes, but our tests would still be failing | 12:36 |
romcheg | CI should be redesigned | 12:37 |
romcheg | Every time one project has a serious failure — everyone has it too :) | 12:38 |
*** godp1301 has joined #openstack-ironic | 12:38 | |
NobodyCam | romcheg: if you have thoughts on how to redesign, shoot an email to mordred on it : | 12:39 |
NobodyCam | :) | 12:39 |
romcheg | NobodyCam: I don't have now so I can just be complaining :) | 12:39 |
NobodyCam | lol | 12:39 |
NobodyCam | ++ | 12:39 |
lucasagomes | dtantsur, that would make it easier yes, tho the driver is still our problem | 12:41 |
*** godp13011 has joined #openstack-ironic | 12:41 | |
*** godp1301 has quit IRC | 12:43 | |
romcheg | I just came up with an idea regarding the specs process | 12:45 |
romcheg | I think it might be reasonable to add a section to the specification that is saying what configuration options are going to be added or removed and what are the reasonable defaults for them | 12:46 |
NobodyCam | oh I like that. | 12:47 |
lucasagomes | romcheg, we have that already | 12:48 |
lucasagomes | https://github.com/devananda/ironic-specs/blob/master/specs/template.rst | 12:48 |
*** linggao has joined #openstack-ironic | 12:48 | |
lucasagomes | Other deployer impact: What config options are being added? Should they be more generic than proposed (for example a flag that other hypervisor drivers might want to implement as well)? Are the default values ones which will work well in real deployments? | 12:48 |
*** rloo has joined #openstack-ironic | 12:48 | |
romcheg | Whoops, I didn't have enough patience to read it that carefully | 12:49 |
lucasagomes | heh | 12:49 |
lucasagomes | maybe it's a bit hidden and u want to have its own section for it | 12:49 |
lucasagomes | for the configs I mean | 12:49 |
NobodyCam | humm lucasagomes just off the top of your head do you know how parted would fail with a bad drive... see https://bugs.launchpad.net/bugs/1326172 | 12:50 |
lucasagomes | NobodyCam, oh not really | 12:52 |
lucasagomes | NobodyCam, but u could simulate a bad device | 12:53 |
lucasagomes | using device mapper | 12:53 |
lucasagomes | there's an "error" target that can be used for that | 12:53 |
lucasagomes | NobodyCam, http://linux.die.net/man/8/dmsetup | 12:54 |
NobodyCam | neat-o | 12:59 |
NobodyCam | brb | 12:59 |
openstackgerrit | Mikhail Durnosvistov proposed a change to openstack/ironic: Checking formatting to the specified filesystem https://review.openstack.org/98102 | 13:01 |
NobodyCam | I almost thought that was going to check for bad sectors | 13:04 |
NobodyCam | lol | 13:04 |
*** rloo has quit IRC | 13:14 | |
*** rloo has joined #openstack-ironic | 13:15 | |
*** matty_dubs|gone is now known as matty_dubs | 13:19 | |
Kai14 | lucasagomes, do you have a moment? I think I'm a step further with my problem. I did the second method: I will deploy a vm on my host using the devstack vm (so NO nested VM). | 13:19 |
Kai14 | lucasagomes, ironic found the node on the host "Found Mac address...." but the deployment fails. ironic node-show tells me something about an HTTP 404. what could be the problem? | 13:20 |
*** early has quit IRC | 13:21 | |
*** Mikhail_D_ltp has left #openstack-ironic | 13:22 | |
*** Mikhail_D_ltp has joined #openstack-ironic | 13:22 | |
romcheg | Mikhail_D_wk: in addition to my comments please also change the title | 13:23 |
romcheg | Mikhail_D_ltp: to something like "Check whether specified FS is supported" | 13:23 |
dtantsur | lucasagomes, re https://bugs.launchpad.net/python-ironicclient/+bug/1326749 why is it a bug? | 13:23 |
dtantsur | lucasagomes, I would say we should add unique constraint on instance_uuid, not fixing client | 13:24 |
*** early has joined #openstack-ironic | 13:24 | |
lucasagomes | Kai14, hey, hmm so did ironic start the vm? | 13:25 |
lucasagomes | dtantsur, ah, hmm yeah I didn't know where actually I should put that bug | 13:26 |
dtantsur | lucasagomes, I would highly vote for preventing this situation, not making client work with it | 13:27 |
lucasagomes | dtantsur, idk it's confusing to have 2 bug trackers one for the client and one for ironic | 13:27 |
Kai14 | lucasagomes, hmm not sure have to check. Maybe the 404 comes from glance. But you're right first I should check if ironic even started the vm | 13:27 |
dtantsur | lucasagomes, I also hate it :( | 13:28 |
lucasagomes | dtantsur, this bug in the client affects the nova driver, since the nova driver uses the libs | 13:28 |
lucasagomes | dtantsur, http://paste.openstack.org/show/wOdHiJurWONIhlYrejCj/ | 13:28 |
lucasagomes | the fix is really in Ironic, but urghh heh it's a mess | 13:28 |
lucasagomes | dtantsur, you think it would be easier to report the client errors in the ironic bug tracker? instead of having this division? | 13:29 |
dtantsur | lucasagomes, I would like one bug tracker per team. Otherwise we also need IPA bug tracker and so on... | 13:30 |
lucasagomes | dtantsur, right... but anyway, the fix of that bug is really in ironic | 13:30 |
dtantsur | btw, lucasagomes how are you going to fix the client, provided the error is from server (aka error 500)? | 13:31 |
dtantsur | oh, I see we're thinking of the same :) | 13:31 |
lucasagomes | dtantsur, yeah I wouldn't fix the client, the fix would go to Ironic | 13:31 |
lucasagomes | maybe i will just mark that bug as invalid | 13:31 |
dtantsur | yes please | 13:31 |
lucasagomes | cool | 13:31 |
dtantsur | lucasagomes, now my question is: what should node.get_by_instance_uuid return in this case? | 13:33 |
dtantsur | lucasagomes, the only answer I see is that we should prevent this situation from even happening | 13:33 |
dtantsur | what do you think? | 13:33 |
lucasagomes | dtantsur, +1, what I'm doing to fix it | 13:33 |
lucasagomes | is to add a unique constraint to the instance_uuid in Ironic | 13:34 |
lucasagomes | and guarantee that if the deployment fails | 13:34 |
lucasagomes | exception/timeout etc... | 13:34 |
lucasagomes | the instance gets unassociated | 13:34 |
lucasagomes | with that node | 13:34 |
lucasagomes | I'm working on that as part of https://bugs.launchpad.net/ironic/+bug/1326364 | 13:34 |
lucasagomes | this is part of the problem | 13:34 |
dtantsur | aha. these 2 steps both look good to me! | 13:35 |
lucasagomes | yeah | 13:35 |
lucasagomes | cause | 13:35 |
*** takadayuiko has quit IRC | 13:35 | |
lucasagomes | the reason why if one instance fails it tries to deploy another is because of the RetryFilter in the nova scheduler | 13:35 |
lucasagomes | so instance 1 fails for whatever reason | 13:35 |
lucasagomes | nova tries to deploy the next | 13:35 |
Kai14 | lucasagomes, cool! one vm starts another vm, it's kind of magic even though I understand how it works. Now it says deploy complete. Thank you very much, you really helped me a lot! | 13:35 |
lucasagomes | but the instance is still associated with the first node and there's no unique constraint | 13:36 |
lucasagomes | so nova will associate that instance with the second node | 13:36 |
lucasagomes | driver will use the lib to find the node associated with that instance | 13:36 |
lucasagomes | that will fail with that 500 error, cause now there are 2 nodes associated with the same instance | 13:36 |
dtantsur | ouch | 13:36 |
lucasagomes | and things just get really messy | 13:36 |
lucasagomes | heh | 13:36 |
lucasagomes | so it's a bunch of things tangled together | 13:37 |
lucasagomes | having the unique constraint and making sure that the instance gets unassociated if the deploy fails is a step to fix it | 13:37 |
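The two-part fix described above (a unique constraint on instance_uuid, plus unassociating the instance when a deploy fails) can be sketched with an in-memory SQLite table. This is only an illustration under stated assumptions: the `nodes` table layout and the `associate` / `unassociate` helpers are hypothetical and are not Ironic's actual schema or DB API.

```python
import sqlite3

# Hypothetical, simplified stand-in for a nodes table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes ("
    " id INTEGER PRIMARY KEY,"
    " instance_uuid TEXT UNIQUE)"  # NULL (unassociated) may appear many times
)
conn.executemany("INSERT INTO nodes (id) VALUES (?)", [(1,), (2,)])
conn.commit()


def associate(node_id, instance_uuid):
    """Return True if associated, False if the unique constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE nodes SET instance_uuid = ? WHERE id = ?",
                (instance_uuid, node_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def unassociate(node_id):
    """Clear the association, e.g. after a failed deploy."""
    with conn:
        conn.execute(
            "UPDATE nodes SET instance_uuid = NULL WHERE id = ?", (node_id,)
        )


first_try = associate(1, "inst-a")      # deploy on node 1 starts
retry_blocked = associate(2, "inst-a")  # retry while node 1 still holds inst-a
unassociate(1)                          # cleanup after the failed deploy
retry_ok = associate(2, "inst-a")       # the retry on node 2 can now proceed
```

In this sketch the constraint alone only turns the ambiguous lookup into a rejected retry; the retry succeeds only once the first node has been unassociated, which is why both steps are discussed together.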
lucasagomes | dtantsur, what u think? | 13:37 |
lucasagomes | Kai14, no worries, feel free to ask more questions. I know our documentation is not the best | 13:37 |
lucasagomes | Kai14, but we will try to improve it as we get things more stable | 13:38 |
dtantsur | lucasagomes, sounds good, but we need somehow to cope with the situation, if we fail to clean instance id | 13:38 |
lucasagomes | dtantsur, yup, looking at that right now | 13:38 |
dtantsur | lucasagomes, maybe the driver can check and clean instance uuid from previous node or so | 13:38 |
dtantsur | if it sees, that it failed to deploy | 13:39 |
dtantsur | I dunno, but so that we don't return "constraint violation" to nova | 13:39 |
lucasagomes | dtantsur, yeah, I'll soon submit a patch addressing that problem so u can take a look | 13:39 |
dtantsur | great! | 13:39 |
lucasagomes | I see 3 places where we can fail and leave the instance associated: when u issue the deploy and something goes bad; when the first part of the deploy is completed but the ramdisk didn't ping ironic back to continue it; or when something goes bad in the second part of the deployment | 13:40 |
lucasagomes | the ramdisk didn't ping ironic back = timeout problem | 13:41 |
*** rakesh_hs has quit IRC | 13:41 | |
Mikhail_D_ltp | Folks :) If I need to synchronize only one method from the Oslo code, can I avoid syncing all of the Oslo code? :) | 13:42 |
dtantsur | Mikhail_D_ltp, depends on the situation, I guess. It's better to sync at the file level to avoid conflicts in the future | 13:42 |
lucasagomes | dtantsur, uhul 97757 verified! | 13:43 |
lucasagomes | now going to gate | 13:44 |
dtantsur | fingers crossed... | 13:44 |
*** pradipta is now known as pradipta_away | 13:44 | |
NobodyCam | nice! | 13:45 |
godp13011 | From a new devstack this morning: stack@devstack:~/devstack$ openstack project create admin | 13:49 |
godp13011 | ERROR: openstackclient.shell Exception raised: six>=1.6.0 | 13:49 |
*** pcrews has joined #openstack-ironic | 13:54 | |
*** eghobo has joined #openstack-ironic | 13:55 | |
*** eguz has quit IRC | 13:56 | |
*** eghobo has quit IRC | 13:56 | |
dtantsur | errands, folks, hope to be online in a few hours | 14:05 |
*** dtantsur is now known as dtantsur|afk | 14:05 | |
*** rloo has quit IRC | 14:06 | |
*** rloo has joined #openstack-ironic | 14:06 | |
*** rloo has quit IRC | 14:06 | |
*** rloo has joined #openstack-ironic | 14:07 | |
*** rloo has quit IRC | 14:07 | |
*** rloo has joined #openstack-ironic | 14:07 | |
*** rloo has quit IRC | 14:13 | |
*** rloo has joined #openstack-ironic | 14:14 | |
*** athomas has quit IRC | 14:17 | |
*** athomas has joined #openstack-ironic | 14:21 | |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: Add unique constraint to instance_uuid https://review.openstack.org/98120 | 14:27 |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: Unassociate instance if the deployment fail https://review.openstack.org/98121 | 14:27 |
NobodyCam | humm seems my google rules are eating messages sent directly to me with openstack in the subject... hummm | 14:35 |
*** jgrimm has joined #openstack-ironic | 14:35 | |
*** Lingo_ has joined #openstack-ironic | 14:36 | |
*** Lingo_ is now known as TravelingBear | 14:37 | |
*** TravelingBear is now known as BadCub01 | 14:39 | |
NobodyCam | morning BadCub01 | 14:40 |
BadCub01 | Morning NobodyCam | 14:40 |
NobodyCam | our house in currently under attack by birds :-p | 14:43 |
BadCub01 | Creepy!!!! :-O | 14:44 |
NobodyCam | will 1326289 be our recheck bug once 97757 lands? | 14:46 |
Kai14 | hi! I've got the problem that ironic can start a baremetal node (in fact another VM on my host OS) but this vm doesn't get a dhcp offer. How can I get neutron to run on the network interface which is used by both vms? | 14:49 |
devananda | morning, all | 14:51 |
NobodyCam | good morning devananda :) | 14:51 |
romcheg | Morning devananda! | 14:51 |
BadCub01 | Morning devananda | 14:51 |
devananda | looks like the gate is still broken? | 14:52 |
NobodyCam | yeppers they are reverting see 97757 | 14:52 |
NobodyCam | devananda: we should update the topic/ status | 14:53 |
*** coolsvap|afk is now known as coolsvap | 14:53 | |
*** jgrimm has quit IRC | 14:53 | |
NobodyCam | Kai14: are you running devstack or devtest? | 14:54 |
Kai14 | NobodyCam: actually I started with devstack. But I'm not using nested vms now. I have one vm running with devstack/ironic and this vm powers on another vm on my host os | 14:55 |
Kai14 | NobodyCam, the devstack vm has two network interfaces; one is connected to a bridge on the host os. The vm which I want to deploy has one network interface which is also connected to the same bridge on the host os | 14:57 |
Kai14 | NobodyCam, but since I started with devstack. Ironic or dnsmasq is running on the wrong interface (I suppose). | 14:58 |
*** jistr has quit IRC | 14:58 | |
Kai14 | I mean neutron, or rather dnsmasq | 14:59 |
NobodyCam | Kai14: off the top of my head it sounds like a networking issue. Have you used something like tcpdump to see if the request is making it to the correct interface? | 14:59 |
*** jistr has joined #openstack-ironic | 15:00 | |
NobodyCam | bbt..brb | 15:02 |
Kai14 | NobodyCam, I will look into that. | 15:02 |
lucasagomes | devananda, morning | 15:04 |
lucasagomes | devananda, yes :( the nova patch is now being tested on gate | 15:04 |
lucasagomes | fingers crossed | 15:04 |
lucasagomes | devananda, when you get time can we talk a bit about #1326364 ? | 15:05 |
*** devananda changes topic to "ATTN: Ironic's gate queue is broken, pending a fix in nova: https://review.openstack.org/#/c/97757/" | 15:06 | |
BadCub01 | BadCub01 is off to another PM meeting :-p | 15:07 |
devananda | BadCub01: enjoy :p | 15:07 |
*** Kai14 has quit IRC | 15:11 | |
devananda | dtantsur|afk: on the 'driver' tag, this term is slightly overloaded | 15:16 |
devananda | dtantsur|afk: it could mean "ironic.drivers.*" or "nova.virt.ironic" | 15:16 |
Mikhail_D_ltp | devananda g'morning :) | 15:16 |
NobodyCam | morning Mikhail_D_ltp | 15:16 |
* devananda skims some scrollback... | 15:19 | |
lucasagomes | devananda, so on bug #1326364, there is more than one problem tangled together | 15:19 |
lucasagomes | devananda, what I see happening there is that nova tries to deploy a machine and it fails for some reason, then the RetryFilter picks another node to deploy that instance | 15:20 |
lucasagomes | but the instance_uuid from the first machine wasn't cleaned up, so when the driver which is now deploying the machine number two tries to find the machine associated with that instance, the libs fail with "InternalServerError: Multiple rows were found for one() (HTTP 500" | 15:20 |
devananda | right | 15:20 |
lucasagomes | because it's now returning multiple rows | 15:20 |
lucasagomes | where it was expected to return only one | 15:21 |
lucasagomes | and then things get messy heh | 15:21 |
devananda | so there are at least two things to address | 15:21 |
devananda | 1) make sure instance_uuid is cleaned up properly | 15:21 |
lucasagomes | devananda, are you comfortable with adding a unique constraint to the instance_uuid field? | 15:21 |
lucasagomes | devananda, yes | 15:21 |
devananda | 2) stop exploding when instance_uuid isn't unique | 15:21 |
devananda | lucasagomes: so the problem with making it unique is, if we don't get (1) done right, then the UX will break the ability to reschedule | 15:22 |
lucasagomes | right, yeah that's the two problems I'm seeing there, I put up 2 patches for that, one to add the unique constraint and the second one to clean the instance_uuid when the deploy fails | 15:22 |
lucasagomes | devananda, true | 15:22 |
lucasagomes | devananda, perhaps I should invert the order of the patches I think I'm adding the unique constraint before fixing the cleaning up | 15:22 |
devananda | :) | 15:22 |
lucasagomes | devananda, it seems to get fixed in my env | 15:23 |
lucasagomes | I'm raising some exception on random parts | 15:23 |
lucasagomes | driver and in our conductor | 15:23 |
devananda | fwiw, i thought we were already cleaning up the instance_uuid all the time. either I was wrong or something changed | 15:23 |
*** coolsvap is now known as coolsvap|afk | 15:23 | |
lucasagomes | checking if the instances get stuck in nova, it seems to be fixed, but I won't know for sure until I test on the env where you guys saw the bug | 15:23 |
*** rloo has quit IRC | 15:23 | |
lucasagomes | devananda, there's 3 parts where it needs to be cleaned | 15:23 |
devananda | ah | 15:24 |
*** rloo has joined #openstack-ironic | 15:24 | |
lucasagomes | devananda, when the deploy fails for whatever reason before going to the second phase of the deployment; when the timeout occurs waiting for the ramdisk to ping the ironic api back; when it fails for whatever reason in the 2nd phase of the deployment | 15:24 |
NobodyCam | brb | 15:25 |
*** Mikhail_D_ltp has quit IRC | 15:28 | |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: Add unique constraint to instance_uuid https://review.openstack.org/98120 | 15:28 |
openstackgerrit | Lucas Alvares Gomes proposed a change to openstack/ironic: Unassociate instance if the deployment fail https://review.openstack.org/98121 | 15:28 |
lucasagomes | inverted the order of the patches | 15:29 |
openstackgerrit | Aleksandr Gordeev proposed a change to openstack/ironic-python-agent: Improve GenericHardwareManager https://review.openstack.org/92847 | 15:36 |
openstackgerrit | Aleksandr Gordeev proposed a change to openstack/ironic-python-agent: Add hardware_utils https://review.openstack.org/92399 | 15:36 |
*** coolsvap|afk is now known as coolsvap | 15:36 | |
*** eghobo has joined #openstack-ironic | 15:38 | |
*** athomas has quit IRC | 15:42 | |
*** eghobo has quit IRC | 15:44 | |
*** eguz has joined #openstack-ironic | 15:44 | |
agordeev | Mikhail_D_wk: https://bugs.launchpad.net/ironic/+bug/1326849 fresh low-hanging fruit for you or for somebody else :) | 15:44 |
*** eguz has quit IRC | 15:44 | |
*** eghobo has joined #openstack-ironic | 15:44 | |
*** christop1eraedo has quit IRC | 15:45 | |
*** christopheraedo has joined #openstack-ironic | 15:46 | |
*** jistr has quit IRC | 15:46 | |
*** jistr has joined #openstack-ironic | 15:47 | |
*** jcoufal has quit IRC | 15:49 | |
*** athomas has joined #openstack-ironic | 15:52 | |
romcheg | agordeev: He's just left the office :) | 15:54 |
*** coolsvap is now known as coolsvap|afk | 15:56 | |
devananda | update on the gate: | 16:04 |
devananda | - 97757 is a revert for the nova change that broke us to begin with | 16:04 |
*** Haomeng|2 has quit IRC | 16:04 | |
JayF | aweeks: < agordeev> Mikhail_D_wk: https://bugs.launchpad.net/ironic/+bug/1326849 fresh low-hanging fruit for you or for somebody else :) | 16:05 |
devananda | - there was a discussion with infra, lifeless, myself, and a few nova cores about whether to approve the fix for ironic (97447) | 16:05 |
JayF | aweeks: ^ that would be a great first bug for you | 16:05 |
devananda | - but tripleo was still broken, pending a non-approved change to nova | 16:05 |
devananda | - so the decision was to revert the nova change (by landing 97757) | 16:05 |
devananda | - however, that didn't land (i think it hit transient failures, and wasn't promoted to the top of the queue again) | 16:06 |
devananda | - and i'm working on addressing that now | 16:06 |
*** godp13011 has quit IRC | 16:06 | |
JayF | thanks deva | 16:06 |
romcheg | Thank you for the update! | 16:07 |
lucasagomes | thanks | 16:07 |
rloo | devananda: thx. Yest, we weren't supposed to approve unless rebased on 97447. Today, should we just wait to see what happens before approving anything? | 16:08 |
NobodyCam | rloo: :) | 16:09 |
rloo | NobodyCam: sorry for not reviewing your two dear-to-the-heart patches sooner. | 16:11 |
NobodyCam | humm is there a way we can test and report a 3rd party nova driver back to nova... ie our driver, until it lands in nova | 16:12 |
*** Haomeng has joined #openstack-ironic | 16:13 | |
NobodyCam | rloo: nothing is landing now... I'm just trying to keep them up to date for when the gate is un-broken | 16:13 |
*** hemna_ has joined #openstack-ironic | 16:14 | |
devananda | who was working on elastic recheck patterns for ironic bugs? | 16:14 |
Shrews | devananda: adam_g maybe? | 16:17 |
jroll | agordeev: nice find on https://bugs.launchpad.net/ironic/+bug/1326849 | 16:18 |
*** ellenh has joined #openstack-ironic | 16:19 | |
devananda | Shrews: hi! busy? | 16:19 |
jroll | morning y'all :) | 16:19 |
Shrews | devananda: just trying to recover code from a power loss yesterday. what's up? | 16:20 |
romcheg | Morning jroll! | 16:20 |
NobodyCam | morning jroll JayF and Shrews ... | 16:20 |
*** openstackgerrit has quit IRC | 16:20 | |
devananda | Shrews: wondering if i can enlist your help in the gate breakage | 16:20 |
NobodyCam | Shrews: hope I didn't drink too much last night | 16:20 |
NobodyCam | at the party | 16:20 |
Shrews | NobodyCam: eh, you were a good boy | 16:20 |
Shrews | devananda: sure | 16:20 |
*** openstackgerrit has joined #openstack-ironic | 16:21 | |
*** jgrimm has joined #openstack-ironic | 16:21 | |
devananda | Shrews: two things on my $urgent plate: add elastic recheck queries. add unit tests to nova for the places that ironic's out of tree code depends on things. | 16:21 |
devananda | Shrews: want to help? | 16:22 |
Shrews | devananda: never touched elastic recheck, but i can probably tackle the unit test stuff | 16:22 |
Shrews | but i could learn elastic recheck if needed | 16:22 |
devananda | Shrews: ack. see the context in -infra. | 16:22 |
devananda | Shrews: tldr; first one: a unit test that would have caught this change: https://review.openstack.org/#/c/94043/2/nova/scheduler/host_manager.py | 16:23 |
devananda | Shrews: since that broke ironic and nova_bm | 16:23 |
devananda | Shrews: perhaps in a file named "test_ironic_api_contracts.py" with a gigantic # NOTE saying "if you change this, we'll hunt you down" | 16:24 |
devananda | Shrews: unit test should fail against current nova trunk and succeed on top of https://review.openstack.org/#/c/97757/ | 16:25 |
devananda | Shrews: then we should expand that to cover other internal APIs that ironic is depending on, eg, anywhere we're replacing nova classes/functions with code that's in ironic/nova/* | 16:26 |
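The kind of "API contract" unit test being proposed can be sketched as a check that pins a constructor signature. This is a minimal sketch, not the test that was actually written: the `HostState` class below is a local stand-in, and its parameter list is an illustrative assumption rather than nova's confirmed signature.

```python
import inspect


class HostState(object):
    """Stand-in for the scheduler class; the parameter list is illustrative."""

    def __init__(self, host, node, capabilities=None, service=None):
        self.host = host
        self.node = node


# The signature that out-of-tree subclasses are assumed to rely on.
EXPECTED_INIT_PARAMS = ["self", "host", "node", "capabilities", "service"]


def init_params(cls):
    """Return the ordered parameter names of cls.__init__."""
    return list(inspect.signature(cls.__init__).parameters)


def contract_holds(cls, expected=EXPECTED_INIT_PARAMS):
    """True if the constructor still accepts exactly the expected parameters."""
    return init_params(cls) == expected
```

A test built on `contract_holds` would fail as soon as a parameter is added, removed, or reordered, which is the class of change the discussion says went unnoticed.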
devananda | meanwhile i'll hit E-R and see what I can do there | 16:26 |
Shrews | devananda: so, the __init__ signature for HostState is the ultimate culprit, correct? | 16:27 |
Shrews | b/c you mentioned HostManager's signature in -infra, but I don't see that one changing | 16:27 |
devananda | Shrews: fwiw, all that ^ unit test stuff will help us avoid nova inadvertently breaking ironic in this sort of way again. even jogo and I weren't aware that this would break us :( | 16:27 |
devananda | Shrews: bah. typing fast. you're correct - HostState, not HostManager | 16:28 |
devananda | I think we depend on both | 16:28 |
Shrews | right... just trying to come up to speed quickly | 16:28 |
NobodyCam | Shrews: awesome TY | 16:29 |
aweeks | JayF: I'll take a look | 16:29 |
*** jistr has quit IRC | 16:31 | |
*** r0j4z0 has quit IRC | 16:33 | |
*** max_lobur has quit IRC | 16:35 | |
NobodyCam | rloo: for the save and raise comment... because I wanted the node uuid in the log | 16:37 |
*** romcheg has quit IRC | 16:38 | |
NobodyCam | I will take another look but I recall it wouldn't give node id.. but I could be wrong there | 16:38 |
*** martyntaylor has left #openstack-ironic | 16:39 | |
rloo | NobodyCam: 'sok to log the node id. But you're raising the same exception right? | 16:42 |
NobodyCam | yep | 16:43 |
rloo | NobodyCam: so you can do 'with excutils.save_and_reraise_exception(): LOG....'. no need to raise | 16:43 |
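The pattern rloo describes (log inside a context manager that re-raises the original exception, so no explicit raise is needed) can be sketched as below. The `save_and_reraise` class is a simplified stand-in written for this illustration, not the Oslo `excutils` implementation, and `partition_disk` is a hypothetical caller.

```python
import logging
import sys

LOG = logging.getLogger(__name__)


class save_and_reraise(object):
    """Simplified stand-in: remember the active exception, re-raise on exit."""

    def __enter__(self):
        # Must be entered from inside an except block.
        self.exc = sys.exc_info()[1]
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is None and self.exc is not None:
            raise self.exc  # body finished cleanly: re-raise the saved error
        return False  # an error raised inside the body propagates as-is


def partition_disk(node_uuid, dev):
    """Hypothetical caller: log the node UUID, let the same error propagate."""
    try:
        raise OSError("parted failed on %s" % dev)
    except OSError:
        with save_and_reraise():
            LOG.error("Partitioning failed for node %s", node_uuid)
```

The caller still sees the original `OSError`; the only addition is the log line carrying the node UUID.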
rloo | NobodyCam: see line 309, it is used there. | 16:45 |
NobodyCam | rloo: ok I can do that | 16:45 |
NobodyCam | ya | 16:45 |
*** coolsvap|afk is now known as coolsvap | 16:48 | |
*** harlowja_away is now known as harlowja | 16:50 | |
*** godp1301 has joined #openstack-ironic | 16:57 | |
openstackgerrit | Chris Krelle proposed a change to openstack/ironic: Rework make_partitions logic when preserve_ephemeral is set https://review.openstack.org/97590 | 16:59 |
*** matty_dubs is now known as matty_dubs|lunch | 17:02 | |
*** eghobo has quit IRC | 17:02 | |
*** derekh_ has quit IRC | 17:06 | |
lucasagomes | it's dinner time for me, have a good night everyone! | 17:07 |
*** lucasagomes is now known as lucas-dinner | 17:07 | |
JayF | jroll: russell_h: Others: email just went out to os-dev@ asking people to not approve patches until the gate clears | 17:11 |
JayF | apparently the queue has gotten large enough to cause headaches and expose more bugs, and they want to give the queue time to work through before making it longer | 17:11 |
*** jbjohnso has joined #openstack-ironic | 17:15 | |
*** rakesh_hs has joined #openstack-ironic | 17:20 | |
NobodyCam | will push in a bit... need to do quick walkies | 17:22 |
*** r0j4z0 has joined #openstack-ironic | 17:23 | |
*** BadCub01 has quit IRC | 17:26 | |
openstackgerrit | Jim Rollenhagen proposed a change to openstack/ironic: Fix concurrent deletes in virt driver https://review.openstack.org/98184 | 17:26 |
jroll | bugfix for the virt driver right there ^ | 17:26 |
*** coolsvap is now known as coolsvap|afk | 17:30 | |
*** athomas has quit IRC | 17:33 | |
*** eghobo has joined #openstack-ironic | 17:33 | |
*** eghobo has quit IRC | 17:34 | |
*** eghobo has joined #openstack-ironic | 17:37 | |
*** spearson has quit IRC | 17:39 | |
*** pelix has quit IRC | 17:47 | |
*** athomas has joined #openstack-ironic | 17:47 | |
*** Alexei_987 has quit IRC | 17:47 | |
*** eguz has joined #openstack-ironic | 17:48 | |
*** matty_dubs|lunch is now known as matty_dubs | 17:49 | |
*** eghobo has quit IRC | 17:51 | |
*** jbjohnso has quit IRC | 17:53 | |
devananda | jroll: ah shit, good catch. how many other places are affected by that? | 18:02 |
jroll | devananda: good question :P | 18:03 |
jroll | comstud actually found it | 18:03 |
JayF | devananda: hey an FYI, aweeks is Alex, and he's going to be working with us as well (not just for the summer, hopefully for-ever) | 18:03 |
aweeks | JayF: dawwww | 18:03 |
jroll | I hope none of us are working on this forever :) | 18:04 |
JayF | something something software is never done | 18:04 |
jroll | I want to retire one day | 18:04 |
jroll | before forever happens | 18:04 |
*** jbjohnso has joined #openstack-ironic | 18:05 | |
devananda | jroll: so that's not tagging any bug | 18:06 |
jroll | devananda: I don't see it anywhere else in the virt driver, not sure about elsewhere | 18:06 |
devananda | jroll: might it be a cause of https://bugs.launchpad.net/ironic/+bug/1326364 ? | 18:07 |
jroll | devananda: damn, I knew you would ask me to file a bug :) | 18:07 |
jroll | mmm, idk. this makes tear_down time out too quickly | 18:07 |
jroll | oh right, so no, it wouldn't | 18:07 |
jroll | well, maybe | 18:07 |
* jroll thinks | 18:07 | |
jroll | ok, so, this bug I fixed just makes nova time out tear_down too quickly | 18:08 |
jroll | but ironic does still do the tear_down | 18:08 |
jroll | so, instance_uuid should get cleared | 18:08 |
jroll | but imbw | 18:09 |
jroll | devananda: ^ | 18:09 |
comstud | devananda: Hm, I don't think those are the same bugs | 18:09 |
comstud | Unless | 18:09 |
comstud | there's another self.* somewhere causing that one | 18:09 |
jroll | yeah | 18:09 |
comstud | but yeah, the one jroll is fixing is just affecting teardown waiting | 18:09 |
jroll | I'm heading out to lunch, bbiab | 18:09 |
devananda | jroll: i haven't looked too closely yet. so IMBW. but it seemed possible | 18:09 |
jroll | sure, it's worth some thought | 18:09 |
comstud | self.tries being set by multiple greenthreads | 18:09 |
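The effect of a retry counter stored on a shared object can be sketched deterministically, using generators stepped round-robin in place of greenthreads. All names here (`SharedTriesDriver`, `wait_with_local_tries`, `MAX_TRIES`) are hypothetical and the loop is a simplification, not the virt driver's code.

```python
MAX_TRIES = 4  # hypothetical retry limit


class SharedTriesDriver(object):
    """Keeps the retry counter on the instance, so all waits share it."""

    def __init__(self):
        self.tries = 0

    def wait(self, polls_needed):
        polled = 0
        while polled < polls_needed:
            if self.tries >= MAX_TRIES:
                yield "timeout"
                return
            self.tries += 1
            polled += 1
            yield "waiting"
        yield "done"


def wait_with_local_tries(polls_needed):
    """Same loop, but the counter is local to this one wait."""
    tries = 0
    while tries < polls_needed:
        if tries >= MAX_TRIES:
            yield "timeout"
            return
        tries += 1
        yield "waiting"
    yield "done"


def run_round_robin(waits):
    """Step each wait in turn until all have finished; return final statuses."""
    results = [None] * len(waits)
    active = dict(enumerate(waits))
    while active:
        for idx in list(active):
            status = next(active[idx])
            if status != "waiting":
                results[idx] = status
                del active[idx]
    return results


driver = SharedTriesDriver()
shared_result = run_round_robin([driver.wait(3), driver.wait(3)])
local_result = run_round_robin(
    [wait_with_local_tries(3), wait_with_local_tries(3)]
)
```

Each wait needs only three polls, well under the limit of four, yet with the shared counter two concurrent waits exhaust the limit between them and both time out early.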
devananda | no, it totally is related | 18:12 |
devananda | _cleanup_deploy is what unsets the node.instance_uuid | 18:12 |
devananda | if _wait_for_provision_state times out and raises, it doesn't call _cleanup_deploy at all | 18:13 |
devananda | so if this loop short-circuits (eg, runs in less than the time it takes ironic api -> conductor -> db to update) then it will raise NovaException and not clear node.instance_uuid | 18:14 |
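The control-flow problem described here, where an exception from the wait step skips the cleanup that clears instance_uuid, can be sketched in isolation. Everything below is hypothetical (the exception class, the dict standing in for a node, both `destroy_*` functions), and whether cleanup should run unconditionally on a timeout is a design question this sketch does not settle.

```python
class ProvisionTimeout(Exception):
    """Hypothetical stand-in for the error raised when the wait gives up."""


def wait_for_state(reached):
    if not reached:
        raise ProvisionTimeout("timed out waiting for provision state")


def destroy_without_guard(node, reached):
    wait_for_state(reached)
    node["instance_uuid"] = None  # cleanup; skipped if the wait raises


def destroy_with_guard(node, reached):
    try:
        wait_for_state(reached)
    finally:
        node["instance_uuid"] = None  # cleanup runs even if the wait raises


def run(destroy, reached):
    """Run a destroy variant and report what is left on the node."""
    node = {"instance_uuid": "inst-a"}
    try:
        destroy(node, reached)
    except ProvisionTimeout:
        pass
    return node["instance_uuid"]


left_unguarded = run(destroy_without_guard, reached=False)
left_guarded = run(destroy_with_guard, reached=False)
```

With the unguarded variant a premature timeout leaves the node associated, which matches the stale association described for bug 1326364.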
*** eguz has quit IRC | 18:14 | |
* devananda edits commit message | 18:14 | |
*** eghobo has joined #openstack-ironic | 18:14 | |
devananda | well, before i do that, would be great to have another pair of eyes tell me i'm not crazy :) | 18:14 |
devananda | commented on the review | 18:25 |
devananda | comstud: not the same bug, but a contributing factor | 18:25 |
devananda | may help explain why 1326364 shows up more under concurrent workloads | 18:26 |
comstud | will look | 18:27 |
*** romcheg has joined #openstack-ironic | 18:28 | |
*** rloo has quit IRC | 18:29 | |
*** rloo has joined #openstack-ironic | 18:30 | |
*** igordcard has quit IRC | 18:32 | |
lifeless | morning | 18:40 |
lifeless | devananda: hi | 18:40 |
openstackgerrit | Chris Krelle proposed a change to openstack/ironic: Wipe any metadata from a nodes disk https://review.openstack.org/93133 | 18:40 |
NobodyCam | morning lifeless | 18:41 |
NobodyCam | :) | 18:41 |
NobodyCam | rloo: that look a little better? | 18:42 |
NobodyCam | :-p | 18:42 |
rloo | NobodyCam: you're asking me? Can't you tell? :-) In meeting, will look in 30 min. | 18:43 |
NobodyCam | lol | 18:44 |
NobodyCam | sorry on the road and bandwidth comes and goes | 18:44 |
NobodyCam | oh I see more comments.. doh will fix them | 18:44 |
*** rloo has quit IRC | 18:48 | |
*** rloo has joined #openstack-ironic | 18:48 | |
openstackgerrit | Rakesh H S proposed a change to openstack/ironic-specs: Enabling IPMI double bridge support https://review.openstack.org/98208 | 18:48 |
*** romcheg has quit IRC | 18:49 | |
openstackgerrit | Chris Krelle proposed a change to openstack/ironic: Rework make_partitions logic when preserve_ephemeral is set https://review.openstack.org/97590 | 18:49 |
openstackgerrit | Chris Krelle proposed a change to openstack/ironic: Wipe any metadata from a nodes disk https://review.openstack.org/93133 | 18:51 |
devananda | lifeless: hi | 18:52 |
*** romcheg has joined #openstack-ironic | 18:53 | |
NobodyCam | devananda: you working 96902? | 18:54 |
NobodyCam | lol http://downdetector.com/status/verizon-communications/map note big red spot in Wy so thats where my bandwidth went | 18:58 |
openstackgerrit | Rakesh H S proposed a change to openstack/ironic-specs: Enabling IPMI double bridge support https://review.openstack.org/98208 | 19:00 |
jroll | devananda: oh, nice | 19:03 |
jroll | you might be right | 19:03 |
*** ellenh has quit IRC | 19:03 | |
jroll | devananda: do you want a partial-fix tag or whatever on that, then? | 19:04 |
rloo | NobodyCam: fwiw, wrt 97590, I'm not convinced that it helps to solve bug 1317647; seems like a diff bug. Or at least I don't think the connection is made. | 19:06 |
devananda | jroll: related-bug | 19:08 |
devananda | jroll: if you agree with me that it's related | 19:09 |
NobodyCam | rloo: if you dont call parted then you dont encounter the bug | 19:09 |
jroll | devananda: sounds like, you're right, I'm going to poke around and decide for myself | 19:09 |
jroll | thanks | 19:09 |
devananda | jroll: thanks! | 19:09 |
jroll | :) | 19:10 |
devananda | Shrews: for the api contract tests, I'd add one for filters.BaseHostFilter:host_passes | 19:10 |
devananda | Shrews: and check with sdague if he thinks it's worth adding one for each public method of nova.virt.driver.virt_driver:ComputeDriver that we're using in ironic.nova.virt.ironic.driver.IronicDriver | 19:12 |
Shrews | devananda: ack | 19:13 |
devananda | hm, he's not in this channel | 19:13 |
rloo | NobodyCam: wrt 93133. The tests. I replied to your reply. | 19:14 |
devananda | Shrews: he says yes | 19:15 |
*** stevebaker has quit IRC | 19:15 | |
rloo | NobodyCam: also, what about more unit tests for get_dev_block_size & destroy_disk_metadata()? (I commented on this but wondering if you missed it.) | 19:15 |
*** stevebaker has joined #openstack-ironic | 19:15 | |
Shrews | devananda: ack x 2 | 19:15 |
devananda | pcrews: how's things? | 19:17 |
NobodyCam | rloo: humm thought i got them all, but missed that one. ack. will look at adding more test | 19:17 |
NobodyCam | s | 19:17 |
rloo | thx NobodyCam. I'm trying to get you to reach #60 :-) | 19:18 |
pcrews | devananda: well, my brain hasn't exploded...yet ;) Been delving into the mysteries of tripleo testing and whatnot | 19:19 |
pcrews | seemingly bumping into this - https://bugs.launchpad.net/ironic/+bug/1300589 | 19:19 |
openstackgerrit | Rakesh H S proposed a change to openstack/ironic-specs: Enabling IPMI double bridge support https://review.openstack.org/98208 | 19:24 |
devananda | pcrews: that's a very unspecific bug with no discernible cause, a proposed fix that says DO NOT MERGE, and the last lines of the traceback are something we already fixed | 19:24 |
devananda | pcrews: so if you've hit that recently, i'd be delighted if you wouldn't mind posting more details on the bug :) | 19:25 |
*** coolsvap|afk has quit IRC | 19:25 | |
pcrews | devananda: good to know :) I'll poke into it a bit more as I've been hitting it consistently | 19:26 |
openstackgerrit | Rakesh H S proposed a change to openstack/ironic-specs: Enabling IPMI double bridge support https://review.openstack.org/98208 | 19:31 |
*** coolsvap|afk has joined #openstack-ironic | 19:32 | |
comstud | devananda: Ok, I see. So, we ran into this as well... | 19:33 |
comstud | where instance is not unassigned | 19:33 |
comstud | So, we're finding nova not waiting long enough is 1 problem | 19:33 |
comstud | and then things get out of sync | 19:33 |
comstud | (at least with IPA, it's not waiting long enough) | 19:34 |
devananda | same is happening with PXE | 19:34 |
comstud | I tend to think nova is doing the right thing here | 19:34 |
devananda | once it gets out of sync, then the nova driver balks because .one() fails | 19:34 |
comstud | gotcha | 19:34 |
lifeless | comstud: I have a patch up for waiting longer | 19:34 |
lifeless | (in Ironic) | 19:34 |
comstud | From nova's perspective, it's thinking the unprovision failed... or is stuck | 19:35 |
devananda | lifeless: and there's a fix from jroll for a race that was shortening the wait time | 19:35 |
lifeless | devananda: can we put a unique constraint on the instance_uuid ? | 19:35 |
devananda | comstud: yea, unprovision gets stuck -- what should it do at that point? reschedule or stop? | 19:35 |
comstud | devananda: Even without the race, we're seeing nova not waiting long enough | 19:35 |
devananda | lifeless: yes | 19:35 |
devananda | lifeless: that'll cause reschedule to fail faster | 19:35 |
comstud | devananda: Well, it's unprovision... there's no 'reschedule' | 19:35 |
comstud | or I'm not sure what you mean | 19:35 |
lifeless | devananda: it will also prevent ironic getting panties in knot :) | 19:35 |
lifeless | comstud: nova scheduler reschedules 3 times | 19:36 |
devananda | comstud: if provision failed, it calls unprovision. if unprovision fails at that point, it still tries to schedule again | 19:36 |
comstud | lifeless: On build, yes | 19:36 |
devananda | i think | 19:36 |
comstud | yes | 19:36 |
comstud | that's correct | 19:36 |
devananda | so that's the race | 19:36 |
lifeless | comstud: I think you're saying | 19:36 |
comstud | Ok, we have a problem on just straight destroy as well | 19:36 |
devananda | if that happens, we get >1 node with same instance uuid | 19:36 |
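
A minimal sketch of the unique-constraint idea discussed above, using SQLite from the Python standard library rather than Ironic's real schema. The `nodes` table, its columns, and the `associate` helper are hypothetical names made up for illustration. With the constraint in place, a second node cannot take an instance UUID that is already assigned, so a racing reschedule fails at write time instead of surfacing later as a failed `.one()` lookup.

```python
import sqlite3


def make_db():
    """Create an in-memory table of nodes (hypothetical schema, not Ironic's)."""
    conn = sqlite3.connect(":memory:")
    # UNIQUE still allows many rows with a NULL instance_uuid (unassigned nodes).
    conn.execute(
        "CREATE TABLE nodes (uuid TEXT PRIMARY KEY, instance_uuid TEXT UNIQUE)"
    )
    conn.executemany(
        "INSERT INTO nodes (uuid) VALUES (?)", [("node-a",), ("node-b",)]
    )
    conn.commit()
    return conn


def associate(conn, node_uuid, instance_uuid):
    """Try to assign an instance to a node; return False if already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE nodes SET instance_uuid = ? WHERE uuid = ?",
                (instance_uuid, node_uuid),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def nodes_for_instance(conn, instance_uuid):
    """Count how many nodes claim the given instance."""
    row = conn.execute(
        "SELECT COUNT(*) FROM nodes WHERE instance_uuid = ?", (instance_uuid,)
    ).fetchone()
    return row[0]
```

In this sketch the first `associate` call for an instance succeeds and any later call for the same instance on another node returns `False`, which is the "fail faster" behaviour mentioned in the discussion.
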
comstud | There's like multiple problems :) | 19:36 |
lifeless | comstud: 'when nova delete foo fails it leaves crap behind' ? | 19:36 |
lifeless | comstud: I filed a bug for that too :) | 19:36 |
comstud | delete 10 instances at once | 19:36 |
devananda | comstud: right. that compounds when the destroy fails during provision | 19:36 |
comstud | self.tries is overwritten | 19:36 |
comstud | but also... | 19:36 |
comstud | even when self.tries is correct, we need a longer wait time for IPA it seems | 19:37 |
lifeless | so the driver rework devananda and I brainstormed yesterday will fix this I think | 19:37 |
comstud | k | 19:37 |
comstud | devananda: Right | 19:37 |
lifeless | in that we can poke the desired state into ironic without contention with whatever is going on in the conductor | 19:37 |
jroll | comstud: we've also talked about a different solution | 19:37 |
jroll | comstud: where we set some other state that nova sees equivalent to deleted ('decom' or something) | 19:38 |
devananda | lifeless: i skimmed the etherpad but didn't get through all your notes... will look in a bit. i need to wrap my head around an incremental plan for that and start writing it | 19:38 |
lifeless | and from nova's perspective all we need to do is *not* update the host resources during unprovision | 19:38 |
comstud | I kinda want nova to be hands off once it tells ironic to deprovision | 19:38 |
jroll | comstud: the reason for the long wait is because we wait for the agent to come up and do decom work | 19:38 |
lifeless | instead we'll pick it up automatically once ironic has successfully converged state | 19:38 |
lifeless | comstud: exactly | 19:38 |
comstud | because anything after that point is an ironic problem | 19:38 |
comstud | ok great!! | 19:38 |
devananda | yes, but -- question for the room :) | 19:38 |
lifeless | comstud: have you seen the etherpad? | 19:38 |
comstud | i have not | 19:38 |
comstud | I'm going to claim you stole my idea | 19:39 |
devananda | if a node under ironic's control fails (eg, hw fault) | 19:39 |
lifeless | devananda: the room is listening | 19:39 |
* comstud smirks | 19:39 | |
devananda | what should happen to the nova instance? | 19:39 |
comstud | right | 19:39 |
lifeless | devananda: nova should refuse to alter things at that point | 19:39 |
comstud | I think nova should consider it destroyed still | 19:39 |
comstud | and the node goes into maintenance or something in ironic | 19:39 |
lifeless | devananda: but it shouldn't power it off or consider its state changed | 19:39 |
comstud | so that nova doesn't try to use it again | 19:39 |
devananda | comstud: so if it's destroyed/deleted, then ironic contains a reference to something that doesn't exist | 19:39 |
lifeless | devananda: if the node is gone in ironic, nova should treat the instance as destroyed | 19:40 |
comstud | 'consider it destroyed' == the nova instance | 19:40 |
lifeless | devananda: (there's specific code in nova to do that already) | 19:40 |
devananda | lifeless: "if the node is gone" is not what I said | 19:40 |
comstud | well, can ironic de-associate the nova instance uuid immediately? | 19:40 |
lifeless | devananda: I know | 19:40 |
lifeless | comstud: I don't think that makes sense | 19:40 |
comstud | k. | 19:40 |
devananda | lifeless: hw fault => node is still there, preserving state (at least in ironic). | 19:40 |
lifeless | comstud: Ironic can only detect mgmt plane failures | 19:40 |
devananda | if the operator deletes the node from ironic, i agree, any instance in nova should be deleted immediately | 19:40 |
*** rakesh_hs has quit IRC | 19:40 | |
lifeless | comstud: what if the BMC blows up but the server is still running just fine | 19:40 |
devananda | ^^ exactly | 19:41 |
comstud | yeah | 19:41 |
lifeless | so what I'm proposing is that 'machine dead' -> ops remove from ironic, nova logs 'missing in hypervisor' and follows that codepath | 19:41 |
devananda | comstud: ironic can't initiate an action in nova (at least not today). we can propagate some changes via the various periodic tasks (eg resource tracker) | 19:41 |
comstud | yeah | 19:42 |
comstud | I was about to say: DO we need callbacks to nova? | 19:42 |
lifeless | 'transient fault' -> Ironic logs that it can't control the machine, we stop advertising its resources to the nova scheduler, but allow normal nova requests like 'unprovision' etc to be set in the Ironic API | 19:42 |
comstud | nova has the mechanisms to support it | 19:42 |
lifeless | they will either converge when its fixed, or it will be removed by ops eventually | 19:42 |
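
The "transient fault" handling proposed above can be sketched roughly as follows. This is an illustrative model only: the `Node` class, its fields, and the state names `deployed` and `available` are assumptions, not Ironic's actual data model or API. The point it shows is that a request to change state is always recorded, resources are hidden from the scheduler while the node is unreachable, and the recorded request is applied once control returns.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node record; field names are illustrative only."""
    cpus: int
    provision_state: str = "available"
    target_state: str = "available"
    reachable: bool = True  # False while the management plane is failing


def request_state(node, target):
    """Record the desired state; accepted even while the node is unreachable."""
    node.target_state = target


def converge(node):
    """Apply the desired state if the node can be controlled; report success."""
    if node.reachable and node.provision_state != node.target_state:
        node.provision_state = node.target_state
        return True
    return False


def advertised_cpus(node):
    """Resources reported to the scheduler; zero while faulted or in use."""
    if node.reachable and node.provision_state == "available":
        return node.cpus
    return 0
```

A node that is deployed and unreachable accepts an unprovision request but advertises nothing; once `reachable` flips back to `True`, the next `converge` call frees it and its resources show up again.
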
comstud | but we can also make nova try to sync state in a better way | 19:42 |
mat128 | I'm not sure the instance should be destroyed from nova | 19:42 |
lifeless | mat128: if the machine is unregistered from Ironic | 19:43 |
mat128 | Imagine you spun up an instance and it just disappeared from nova list?! | 19:43 |
lifeless | mat128: its precisely equivalent to someone using virsh directly to delete a kvm VM | 19:43 |
mat128 | unregistered, not unreachable | 19:43 |
mat128 | ok | 19:43 |
comstud | we could have a nova periodic task that manages these 'in progress' things | 19:43 |
lifeless | mat128: whatever happens in that case, should happen in this, no ? | 19:43 |
comstud | vs the inline polling we have within driver.destroy | 19:44 |
mat128 | lifeless: agreed. what are the mechanisms by which nova ensures that the vm is present in libvirt? | 19:44 |
lifeless | comstud: the scheduler sync one is all we need I think; other syncing will happen (like it does with virsh) when the user requests something | 19:44 |
comstud | lifeless: I'm talking about things like nova actually deleting instances etc | 19:44 |
lifeless | mat128: when it makes a query and its missing, it throws an ERROR state; when the hypervisor starts up it cross-checks what its meant to be managing and whats locally present. | 19:44 |
mat128 | lifeless: so it's in error, not completely removed from the instance list | 19:45 |
lifeless | comstud: so, I'm proposing that we write to the Ironic API the right settings to make it be deleted, and from novas side then stop. | 19:45 |
lifeless | mat128: there are config knobs at that point but yeah :) | 19:45 |
comstud | lifeless: When does nova mark instance as deleted? | 19:45 |
lifeless | mat128: point is we don't and shouldn't do anything special here. Just map Ironic's facilities into regular hypervisor | 19:45 |
mat128 | lifeless: ok then I agree, same should happen if physical node disappears. This is analogous to a kill -9 on the kvm process | 19:45 |
devananda | so the corollary for this is, what happens when a provision fails due to hardware fault? | 19:45 |
lifeless | comstud: there are two cases I know of: api DELETE and startup routine | 19:46 |
lifeless | comstud: and I may be wrong on the startup routine :) | 19:46 |
comstud | startup routine in nova does try to re-issue deletes | 19:46 |
comstud | for things stuck in 'deleting' task_state | 19:46 |
comstud | if that's what you're referring to | 19:47 |
lifeless | devananda: so, the node is still in ironic but ironic has detected a fault it can't converge around (e.g. IPMI non-responsive, power not coming on etc) | 19:47 |
lifeless | devananda: ? | 19:47 |
lifeless | comstud: that may be it | 19:47 |
comstud | lifeless: link to etherpad? | 19:47 |
comstud | https://etherpad.openstack.org/p/ironic-and-fragile-hardware this one? | 19:48 |
lifeless | https://etherpad.openstack.org/p/ironic-and-fragile-hardware | 19:48 |
lifeless | ah yes | 19:48 |
comstud | ty | 19:48 |
comstud | ok, probably a little much to digest right now | 19:49 |
lifeless | devananda: so ideally I'd expect nova to time out eventually, and initiate cleanup by resetting the node properties to not-deployed, and Ironic to only mark the node usable again when the fault has cleared | 19:49 |
comstud | but I like what I'm reading so far | 19:49 |
*** slamont has joined #openstack-ironic | 19:49 | |
lifeless | devananda: that implies a bunch of stuff we haven't written or considered yet | 19:49 |
JayF | The big thing in that etherpad, that I commented on | 19:52 |
JayF | is that we should be very careful about modelling specific decom steps for the agent in ironic | 19:53 |
JayF | because the idea is that the agent can have different hardware drivers that do different things | 19:53 |
JayF | and I'd want to ensure we didn't create a model that someone with wacky enough hardware couldn't follow. | 19:53 |
lifeless | JayF: so the state machine for the agent driver could just hand over to the agent | 19:58 |
lifeless | JayF: and let it run autonomously with its own state machine (stored in the same measured-side driver_info) | 19:58 |
lifeless | JayF: that would then get idempotency, support for rebooting in stages and other such things, for roughly free | 19:59 |
JayF | yeah something russell_h just mentioned though in meatspace to me is that there are some tasks, such as firmware updating, that need ironic cooperation or that ironic might drive (i.e. OOB firmware updates) | 19:59 |
JayF | so there's probably some middle ground there, where Ironic directs the agent to do some stuff, but maybe the agent can define additional things it wants to do for specific hardware | 20:00 |
JayF | i don't really know, but I want to ensure the model allows the agent to do more things in a generic 'decom' state than ironic needs to know about | 20:00 |
JayF | i.e. 5 different firmwares on a system with 5 different crappy utilities to update them, Ironic doesn't need to know we're doing 5 things 5 ways | 20:00 |
*** ellenh has joined #openstack-ironic | 20:02 | |
*** igordcard has joined #openstack-ironic | 20:05 | |
jroll | I would think the flow would be: ironic tells the agent to do any 'pre-decom things', then tells the agent to do x, y, z, then tells the agent 'do anything else you need to do post-decom' | 20:09 |
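
The flow described above (pre-decom hooks, then the steps Ironic asks for, then post-decom hooks) might look something like this sketch. All class and function names here are hypothetical and do not come from ironic-python-agent. The hardware-specific class decides what extra work it wants to do, and the caller only sees an ordered list of completed step names.

```python
from typing import Callable, List

# A step is any callable that does its work and returns a short name.
Step = Callable[[], str]


class GenericHardware:
    """Hypothetical agent-side hardware driver with no extra decom work."""

    def pre_decom_steps(self) -> List[Step]:
        return []

    def post_decom_steps(self) -> List[Step]:
        return []


class WackyHardware(GenericHardware):
    """Hypothetical driver that adds its own firmware work after decom."""

    def post_decom_steps(self) -> List[Step]:
        return [lambda: "flash-nic-firmware", lambda: "flash-raid-firmware"]


def run_decom(hardware: GenericHardware, core_steps: List[Step]) -> List[str]:
    """Run pre-decom hooks, the requested core steps, then post-decom hooks."""
    ordered = (
        hardware.pre_decom_steps()
        + list(core_steps)
        + hardware.post_decom_steps()
    )
    return [step() for step in ordered]
```

Here the core steps stand in for the "x, y, z" that Ironic directs, while the hooks let a driver for unusual hardware do additional work that Ironic does not need to model.
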
*** romcheg has quit IRC | 20:12 | |
*** romcheg has joined #openstack-ironic | 20:15 | |
*** romcheg has quit IRC | 20:15 | |
*** devananda has quit IRC | 20:19 | |
*** rloo has quit IRC | 20:22 | |
*** rloo has joined #openstack-ironic | 20:23 | |
*** rloo has quit IRC | 20:26 | |
*** rloo has joined #openstack-ironic | 20:26 | |
Shrews | devananda MIA? | 20:29 |
lifeless | JayF: it does | 20:33 |
lifeless | JayF: the whole point of saying its a state machine is to let the framework get out of the way | 20:34 |
lifeless | Shrews: he was here just before | 20:34 |
*** rloo has quit IRC | 20:34 | |
*** rloo has joined #openstack-ironic | 20:35 | |
*** rloo has quit IRC | 20:45 | |
*** eglynn_ has quit IRC | 20:45 | |
*** rloo has joined #openstack-ironic | 20:45 | |
*** stevebaker has quit IRC | 21:00 | |
*** stevebaker has joined #openstack-ironic | 21:00 | |
*** ndipanov has quit IRC | 21:01 | |
*** jbjohnso has quit IRC | 21:06 | |
lifeless | Shrews: he's having trouble with his irc proxy | 21:09 |
*** rloo has quit IRC | 21:09 | |
*** rloo has joined #openstack-ironic | 21:10 | |
Shrews | lifeless: no worries. i'm about to bug out for the day | 21:10 |
*** rloo has quit IRC | 21:11 | |
*** rloo has joined #openstack-ironic | 21:11 | |
*** rloo has quit IRC | 21:12 | |
*** rloo has joined #openstack-ironic | 21:12 | |
*** slamont has quit IRC | 21:16 | |
*** devananda has joined #openstack-ironic | 21:17 | |
* devananda throws rocks at the intertubes | 21:18 | |
*** linggao has quit IRC | 21:22 | |
lifeless | devananda: wb | 21:23 |
*** slamont has joined #openstack-ironic | 21:28 | |
*** rloo has quit IRC | 21:30 | |
*** rloo has joined #openstack-ironic | 21:30 | |
NobodyCam | Hello from YellowStone (or Billings Montana really) | 21:32 |
*** rloo has quit IRC | 21:32 | |
*** rloo has joined #openstack-ironic | 21:32 | |
*** rloo has quit IRC | 21:35 | |
*** mrda-away is now known as mrda | 21:35 | |
*** rloo has joined #openstack-ironic | 21:35 | |
mrda | Morning Ironic! | 21:35 |
*** rloo has quit IRC | 21:36 | |
*** rloo has joined #openstack-ironic | 21:36 | |
*** rloo has quit IRC | 21:37 | |
*** rloo has joined #openstack-ironic | 21:37 | |
*** rloo has quit IRC | 21:37 | |
*** rloo has joined #openstack-ironic | 21:37 | |
jroll | devananda: where is instance_uuid set to null? I couldn't find it quickly | 21:38 |
*** rloo has quit IRC | 21:38 | |
devananda | morning, mrda | 21:38 |
*** rloo has joined #openstack-ironic | 21:38 | |
lifeless | jroll: in the nova driver? | 21:39 |
devananda | jroll: https://github.com/openstack/ironic/blob/master/ironic/nova/virt/ironic/driver.py#L246 | 21:39 |
NobodyCam | morning mrda | 21:39 |
mrda | \o | 21:39 |
*** romcheg has joined #openstack-ironic | 21:39 | |
jroll | wow | 21:40 |
jroll | thanks devananda | 21:40 |
*** romcheg has left #openstack-ironic | 21:40 | |
jroll | I read that function at least 3 times and totally missed it every time | 21:40 |
devananda | np | 21:40 |
*** rloo has quit IRC | 21:41 | |
*** rloo has joined #openstack-ironic | 21:41 | |
jroll | devananda: so yeah, I think you're right wrt that bug | 21:42 |
devananda | jroll: awesome | 21:42 |
* jroll adds partial-bug to his commit message | 21:42 | |
openstackgerrit | Jim Rollenhagen proposed a change to openstack/ironic: Fix concurrent deletes in virt driver https://review.openstack.org/98184 | 21:43 |
lifeless | devananda: reminder: dinner with hernan :) | 21:43 |
devananda | lifeless: ack. i pinged him. no response | 21:43 |
lifeless | ack | 21:48 |
lifeless | devananda: when would you like to do another pass on the api stuff? | 21:48 |
devananda | lifeless: after i recover from nova meeting :) | 21:52 |
devananda | and rebuild my irssi host | 21:52 |
devananda | lifeless: timing will be hard, but it'd be great to get you and lucas in a hangout about the api, as he's done most of the work on it | 21:55 |
lifeless | can do | 21:58 |
lifeless | lucas-dinner: yo :) | 21:58 |
*** rloo has quit IRC | 21:58 | |
*** rloo has joined #openstack-ironic | 21:59 | |
*** stevebaker has quit IRC | 22:03 | |
*** stevebaker has joined #openstack-ironic | 22:03 | |
jroll | can someone assign lucas-dinner back to this? https://bugs.launchpad.net/ironic/+bug/1326364 | 22:03 |
jroll | I accidentally stole assignment | 22:04 |
jroll | with my commit | 22:04 |
jroll | (which anecdotally, isn't showing up as 'fix proposed' or whatever) | 22:04 |
rloo | jroll: done. but can't you do it? | 22:04 |
mrda | jroll: done | 22:04 |
rloo | ha ha | 22:05 |
jroll | rloo: 'you can only assign yourself because you're not part of the project team' or something dumb | 22:05 |
mrda | snap | 22:05 |
jroll | thanks mrda | 22:05 |
jroll | 'You may only assign yourself because you are not affiliated with this project and do not have any team memberships.' | 22:05 |
jroll | is the actual membership | 22:05 |
* devananda restarts irssi and stuff... bbiab | 22:05 | |
jroll | wow | 22:05 |
rloo | jroll, you need to go to the dark side and join us | 22:05 |
jroll | s/membership/message | 22:05 |
jroll | heh | 22:05 |
*** devananda has quit IRC | 22:05 | |
jroll | I just figured it meant core | 22:05 |
* jroll pokes around launchpad | 22:05 | |
mrda | rloo: does this mean I get to wear a cape now? | 22:06 |
mrda | jroll: not core, just bug team | 22:06 |
rloo | mrda: only if you wear black and a mask. | 22:06 |
*** devananda has joined #openstack-ironic | 22:06 | |
rloo | jroll: i can't remember now, how you sign up. i seem to recall it wasn't totally obvious. | 22:06 |
mrda | I'm happy to do that. Just so long as I don't have to dress like Charlie Chaplin | 22:06 |
jroll | mrda: hmm, so it's not something I can flip on I guess? | 22:06 |
jroll | oh | 22:06 |
jroll | rloo: launchpad. | 22:06 |
mrda | just join the group | 22:06 |
mrda | IIRC it will need approving, but that shouldn't be too hard to get :) | 22:07 |
jroll | :P | 22:07 |
jroll | now how to join the group.. | 22:07 |
*** devananda has quit IRC | 22:07 | |
JayF | jroll: if you figure out, show me | 22:08 |
JayF | heh | 22:08 |
mrda | https://launchpad.net/~ironic-bugs | 22:08 |
mrda | Click "join" :) | 22:08 |
jroll | wheeeeeeeee | 22:08 |
jroll | thanks mrda | 22:08 |
jroll | wait does this mean I need to help triage? | 22:08 |
* jroll runs away | 22:08 | |
mrda | I am a useful engine | 22:08 |
mrda | jroll: yes, you're part of the dark side now | 22:09 |
rloo | jroll: yeah. initiation rites... | 22:09 |
jroll | lol | 22:09 |
*** slamont has quit IRC | 22:09 | |
mrda | ooo, I missed out on that rloo | 22:09 |
mrda | can you show me the secret handshake? | 22:09 |
rloo | mrda: never too late. i'll have dtantsur|afk talk to you ;) | 22:10 |
mrda | ;) | 22:10 |
*** godp1301 has quit IRC | 22:13 | |
*** jgrimm has quit IRC | 22:16 | |
*** stevebaker has quit IRC | 22:28 | |
*** stevebaker has joined #openstack-ironic | 22:28 | |
*** aboutGod has joined #openstack-ironic | 22:31 | |
*** sysexit has quit IRC | 22:32 | |
*** slamont has joined #openstack-ironic | 22:34 | |
*** aboutGod has left #openstack-ironic | 22:36 | |
*** matty_dubs is now known as matty_dubs|gone | 22:36 | |
*** devananda has joined #openstack-ironic | 22:45 | |
*** devananda has quit IRC | 22:45 | |
*** athomas has quit IRC | 22:47 | |
openstackgerrit | Adam Gandelman proposed a change to openstack/ironic: Update Nova driver's list_instance_uuids() https://review.openstack.org/98268 | 22:47 |
*** devananda has joined #openstack-ironic | 22:48 | |
devananda | \o/ | 22:48 |
devananda | got all my irc things fixed, and switched to SSL while I was at it | 22:49 |
NobodyCam | nice | 22:49 |
jroll | \o/ for ssl | 22:55 |
openstackgerrit | Ellen Hui proposed a change to openstack/ironic-python-agent: Tries to advertise valid default IP https://review.openstack.org/96980 | 23:06 |
devananda | so folks, our gate is still broken, and the fix in nova (97757) hit a neutron failure the last time it made it to the top of the queue | 23:28 |
devananda | SO | 23:28 |
devananda | now it's at the bottom of the queue again | 23:28 |
devananda | what can YOU do to help? watch the rechecks page http://status.openstack.org/rechecks/ and help FIX THOSE | 23:29 |
devananda | also watch 97757 and, if it fails to merge, investigate the failure -- if it's a known bug, recheck bug ###. if there's a bug but no elastic-recheck, create an E-R query. | 23:30 |
devananda | in short, right now, we're suffering along with all of openstack, so helping to fix anything that's polluting the gate with random failures will help us, too | 23:30 |
mrda | thanks devananda - will try and keep a look out for this today | 23:31 |
devananda | Shrews: keep on what you were doing. that'll help us directly prevent some of this in the future :) | 23:31 |
devananda | and anyone who's working on fixing bugs (especially bugs that creep into the gate randomly) in ironic, please continue doing that! | 23:32 |
devananda | everyone else -- please help out. your features can't land until the gate gets better anyway. | 23:32 |
*** hemna_ has quit IRC | 23:37 | |
openstackgerrit | Chris Krelle proposed a change to openstack/ironic: Wipe any metadata from a nodes disk https://review.openstack.org/93133 | 23:44 |
NobodyCam | ok folks dinner is calling | 23:44 |
NobodyCam | devananda: should we be poking folks in infra to bump it up? | 23:46 |
devananda | NobodyCam: no. I already did. and they did. and it failed due to a neutron bug. it's in the queue again, and they know | 23:47 |
NobodyCam | ack :( | 23:47 |
*** ellenh has quit IRC | 23:51 | |
devananda | just sent an email to the list | 23:51 |