17:00:00 #startmeeting ironic
17:00:02 Meeting started Mon May 23 17:00:00 2016 UTC and is due to finish in 60 minutes. The chair is jroll. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:05 The meeting name has been set to 'ironic'
17:00:11 hi everyone
17:00:14 o/
17:00:15 o/
17:00:18 o/
17:00:19 \o
17:00:37 as always, our agenda is here:
17:00:40 #link https://wiki.openstack.org/wiki/Meetings/Ironic
17:00:45 o.
17:00:46 let's jump in
17:00:47 o/
17:00:47 o/
17:00:48 o/
17:00:52 o/
17:00:58 #topic announcements and reminders
17:01:02 o/
17:01:03 o/
17:01:24 reminder to take the midcycle dates poll by next monday
17:01:26 #link http://doodle.com/poll/gpug7ynd9fn4rdfe
17:01:37 also, we had a green grenade run last week \o/
17:01:41 huge thanks to all working on that
17:01:47 o/
17:01:51 does anyone else have announcements or reminders?
17:01:51 o/
17:01:59 o/
17:02:02 o/
17:02:17 o/
17:02:17 Just big kudos to both vsaienko and vdrok for all their work on Grenade :)
17:02:24 ++
17:02:32 \o
17:02:36 http://www.merriam-webster.com/dictionary/kudos
17:03:01 #topic subteam status reports
17:03:09 as always these are on the whiteboard:
17:03:11 #link https://etherpad.openstack.org/p/IronicWhiteBoard
17:03:17 starting around line 76
17:03:24 * jroll gives people time to review
17:03:24 o/
17:03:42 * dtantsur forgot about bug stats, will update soon
17:03:58 0/
17:04:08 not many surprises here
17:04:27 In case people don't know: we have gotten a Grenade run to pass in the gate, with about 15 unmerged patches. But it did pass :)
17:04:44 let's get those patches landed, then!
17:04:52 Can those patches be consolidated somewhere so we can all review them?
17:05:08 * thiagop misses tags on gerrit
17:05:19 I think there's a topic for them
17:05:19 JayF: they are consolidated https://etherpad.openstack.org/p/ironic-newton-grenade-whiteboard
17:05:20 Yep. We will need to work on cleaning them up. They do span multiple projects. But there are a few in Ironic that we can probably get in relatively quickly
17:05:21 * dtantsur updated
17:05:26 thiagop: you can use a topic name in gerrit like a tag
17:05:33 vsaienko: thanks! I'll have a look
17:05:36 jlvillal, w00t that's fantastic!
17:05:53 devananda: real tags would be great though
17:06:09 good stuff vsaienko vdrok (and all others involved)
17:06:15 btw, great job guys!
17:06:23 Thank you guys!
17:06:30 * jlvillal will be spending today going through the various Grenade related patches.
17:06:40 fantastic
17:06:44 vsaienko++ vdrok++
17:07:12 jlvillal: will do
17:07:26 <_milan_> o/
17:07:30 thanks to you all too :)
17:08:26 anything else on this topic?
17:09:23 I'll bring up JayF's topic before open discussion
17:09:30 #topic Should IPA HardwareManager interface changes require a spec?
17:09:41 JayF: dtantsur: bring your conversation here please :)
17:10:06 So I wanted to have a bigger discussion about what changes in IPA we might want to be more careful/explicit about
17:10:14 mainly around changing the hardware manager interface
17:10:20 jroll, it's not on topic
17:10:28 there have been several patches lately I've had to -1 because they break under multiple hardware manager support
17:10:40 and one was merged recently that (arguably) has the same problem
17:11:04 I'd like to propose we start protecting the HardwareManager interface in IPA the same way we would any other external Ironic interface; by requiring explicit design via spec when it changes
17:11:06 I don't agree with this assessment yet
17:11:20 JayF, are you committing to read all these specs in a timely fashion?
17:11:25 JayF: it sounds like the context that other hardware managers exist seems to be missing, I could see a specification helping in that regard, except I worry about the weight of such
17:11:26 you = you personally
17:11:58 otherwise other cores will approve it just as well
17:12:17 dtantsur: I don't think it's about me; I think it's about use cases (admittedly, ones I care about) being ignored when changing an interface in our software
17:12:26 I think the main problem is that very few people here understand how multiple hw managers work. I'm not one for sure
17:12:27 mistakes will happen, but the more we're explicit about design considerations
17:12:36 Is there any way for us to add tests to catch these cases? Or can we add docs/comments to emphasize multiple hardware manager support? /me doesn't know
17:12:36 the less the information about why some patterns are bad will be isolated to inside my head
17:13:07 jlvillal: somehow testing was my next question
17:13:14 JayF, so let's start with moving it out of your head, or you'll be the only ironic-specs-core who gets to review such changes
17:13:38 I'm not saying I have some kind of magic to find these things, or that I want to block all those changes via jay
17:13:43 very much not what I want at all
17:13:53 big warnings in the code?
17:13:57 and go from there?
17:13:59 I'm just saying the last few times this has happened, it took some discussion to tease out all the edge cases
17:14:08 like what happened with lucasagomes and the node caching patches
17:14:26 and that having a specs process allows us to ask these questions, have them answered in a written format, and start disseminating that information
17:14:32 yeah, from personal experience, it's _very_ hard today to land a change in the hw manager interface
17:14:45 * dtantsur would love to see a programming interface that is easy to extend without special knowledge
17:14:46 the design of it makes it hard to accommodate changes
17:14:51 lucasagomes++
17:15:04 I do not want to be personally responsible for documenting all the IPA/HWM edge cases; I'm not sure I know them all; but I do know lots more discussion needs to happen around these changes than is happening today
17:15:25 JayF, only if someone understanding the scope of the problem (= you) actually reviews these specs..
17:15:29 e.g. out of tree managers should _not_ inherit from other hw managers (like the generic one), they should always inherit from the base hw manager class
17:15:37 lucasagomes, WUT?
17:16:23 if so, then let's talk about fixing the hardware manager interface, not about bringing in more bureaucracy..
17:16:26 dtantsur, yeah, well not sure if I can dump my brain here in this meeting cause the info is not even organized in my head
17:16:29 JayF: didn't you mention that you wanted to see more hardware managers in tree as an example for downstream and consumers to be able to implement their own?
17:16:58 yeah, I don't want to rearchitect this in this meeting anyway
17:17:11 how about documentation describing the design of that application, how we intend folks to subclass/inherit from it, and API documentation for the base classes?
17:17:13 It almost sounds like we could use a spec on how to change the interface.
17:17:19 devananda++
17:17:27 devananda: ++
17:17:32 dtantsur, but think about a driver that inherits from another driver (say drac inherits from ilo, as an example). Out of tree managers are now inheriting from existing hw managers (the generic one) instead of using the base class
17:17:33 Yes documentation ++
17:17:35 maybe I'm wrong, but it seems like JayF has most of that in his head already
17:17:40 I know I do not
17:17:45 like this? http://docs.openstack.org/developer/ironic-python-agent/#hardware-managers
17:17:45 lucasagomes, and that's correct IMO
17:18:07 only missing API doc for it, which is here: http://docs.openstack.org/developer/ironic-python-agent/api/ironic_python_agent.hardware.html
17:18:18 ¯\_(ツ)_/¯
17:18:21 Which I wrote the majority of, fwiw
17:18:28 (me and josh iirc)
17:18:28 lucasagomes, "Custom HardwareManagers should subclass hardware.HardwareManager or hardware.GenericHardwareManager."
17:18:40 so it seems to be fine to subclass ¯\_(ツ)_/¯
17:19:28 anyway, the fix to my patch (according to Jay) is a one line dispatch -> dispatch_to_all, I'm not sure why all this panic
17:19:40 dtantsur, yeah, I can show the example with the os_get_install_device() outside the meeting
17:19:46 I'm also saying if we had written a spec for this, I would've -1'd the design as well
17:19:47 we have a good record of landing a spec and then realizing we need to rewrite half of it
17:19:50 dtantsur: it seems to be a proposal, not a panic
17:19:57 but overall, we need to re-think some of its interface and document how to change it
17:20:07 because you changed the HWM interface when it wasn't really needed; we already have a method to initialize the hardware in a manager: evaluate_hardware_support()
17:20:16 which is run on every manager before IPA is fully started up
17:20:16 jroll, panic = rushing to revert the patch which only might break someone in some case
17:20:37 since a given HWM only knows it can support hardware if it initializes it first
17:20:45 okay let's slow down
17:20:50 JayF, evaluate is not the best word for "initialize", is it?
17:20:56 dtantsur: revert-then-converse is the accepted openstack way of handling post-merge core reviewer objections
17:21:06 dtantsur: I'm not arguing the interface is good, lol
17:21:24 JayF, we can deadlock on it ;)
17:21:31 so everything around this seems to be documented, we as reviewers tend to miss these things anyway
17:21:36 anyway, we're not talking about this particular patch, right?
17:21:43 jay has proposed the spec process as a way to miss less of this
17:21:57 people seem opposed to this - what other proposals do those folks have to fix it?
17:22:06 jroll, I'm not opposed to the spec process here
17:22:17 (keeping in mind this is a driver interface, essentially)
17:22:28 jroll, I'm pointing out that we'll get blocked on JayF's reviews, cause otherwise we don't quite understand the whole thing
17:22:48 dtantsur: what do you need to understand it better?
17:22:56 jroll, first thing is documentation (action point to Jay) so we all can read and better understand hw managers when reviewing them
17:22:58 I've written documentation, tried to solicit feedback on the interfaces in meetings over the last year or so as they changed
17:22:58 for changes to a driver interface in ironic, I would expect a spec -- why is this any different?
17:23:03 lucasagomes: like this? http://docs.openstack.org/developer/ironic-python-agent/#hardware-managers
17:23:08 I don't know what else I would need to do to disseminate this information.
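The "dispatch -> dispatch_to_all" point raised at 17:19:28 can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the real ironic_python_agent.hardware code: the class names, the numeric support levels, and the helpers `ranked`, `dispatch_to_first`, and `dispatch_to_all` are all hypothetical stand-ins. Only `evaluate_hardware_support()` is a name taken from the discussion. The sketch shows why a per-manager setup step that is dispatched only to the best-matching manager never reaches the other loaded managers.

```python
# Toy model only: every class and helper here is a hypothetical stand-in,
# not the real ironic_python_agent.hardware API.


class BaseManager:
    """Hypothetical base class; real managers would define more methods."""
    name = "base"

    def evaluate_hardware_support(self):
        # Assumed convention for this sketch: 0 means "not supported",
        # a larger number means more specific support.
        return 0


class GenericManager(BaseManager):
    name = "generic"

    def evaluate_hardware_support(self):
        return 1

    def list_disks(self):
        return ["/dev/sda"]

    def prepare(self):
        return "generic prepared"


class VendorManager(BaseManager):
    name = "vendor"

    def evaluate_hardware_support(self):
        return 3

    def prepare(self):
        return "vendor prepared"


def ranked(managers):
    """Return supported managers, most specific first."""
    supported = [m for m in managers if m.evaluate_hardware_support() > 0]
    return sorted(supported,
                  key=lambda m: m.evaluate_hardware_support(),
                  reverse=True)


def dispatch_to_first(managers, method):
    """Call `method` on the most specific manager that implements it."""
    for manager in ranked(managers):
        func = getattr(manager, method, None)
        if func is not None:
            return func()
    raise LookupError("no manager implements %s" % method)


def dispatch_to_all(managers, method):
    """Call `method` on every manager that implements it."""
    results = {}
    for manager in ranked(managers):
        func = getattr(manager, method, None)
        if func is not None:
            results[manager.name] = func()
    return results


MANAGERS = [GenericManager(), VendorManager()]

# Only the vendor manager runs its setup step here.
first_only = dispatch_to_first(MANAGERS, "prepare")

# Both managers run their setup step here.
everyone = dispatch_to_all(MANAGERS, "prepare")
```

In this model `first_only` holds just the vendor result, so the generic manager is never prepared even though `list_disks` still falls through to it. That is the kind of edge case that only shows up when more than one manager is loaded.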
17:23:32 * lucasagomes reads
17:23:40 I also think the spec process makes sense here
17:23:46 JayF: my apologies for forgetting that these docs exist - as I look at them, I recall reviewing them, but it's been a while
17:24:13 jroll, this documentation is good, but it won't prevent me from doing the same patch :)
17:24:31 I probably need to understand more about how multiple hw managers are even used
17:24:56 I can think of adding clean steps and extending inspection, but I struggle to think beyond that
17:25:03 Maybe the code should have a link to the documentation. I imagine people making changes may not know about the documentation. Apologies if the code already contains the links
17:25:04 Would you all be interested in kind of an ironic-tech-talk on how multiple managers can interact?
17:25:05 sure, and reviewers that review hw mgr changes should also do such a thing
17:25:12 JayF, +1000
17:25:33 Cool; I'm already working on documentation and training this week, I'll add that to the list of things.
17:25:43 JayF: a tech talk on implementing a moderately-useful third-party HWM would be great
17:26:06 devananda: my process would be "rip proprietary bits out of onmetal downstream hwm, walk through how it works and why we did it that way"
17:26:25 oh great
17:26:35 I don't think there's a general understanding of how ... complex you can make a HWM :)
17:27:12 I think that would be great too
17:27:20 cool, so JayF is going to do that
17:27:30 I still think spec process wouldn't be a bad thing, do people agree/disagree?
17:27:31 JayF: I suspect you're correct, and I suspect that that information right now exists only in the minds of a few downstream developers at large companies
17:27:35 I'll work on it, give me a couple of weeks. But my suggestion still stands, and I think there's still value in it
17:27:37 getting it out in the open would be really helpful
17:27:40 and I hate specs too :/
17:27:59 devananda, ++ (and JayF jroll thanks for the docs, I didn't know they existed)
17:28:05 can we just make a folder in ironic-python-agent repo, and add the specs along with code?
17:28:19 with like a couple of required sections
17:28:28 vdrok, how does it change the latency before spec is approved?
17:28:28 I'd prefer to just do it in ironic-specs
17:28:35 jroll: practical consideration for specs here - would we need a separate repo, or could we file them in with ironic-specs?
17:28:38 heh
17:28:50 I mean, IPA is like ironic-lib
17:28:53 prefix title with "agent:" or something
17:28:55 it's tightly coupled to Ironic
17:29:03 * jroll not opinionated so prefers less infra
17:29:03 so why wouldn't it go in ironic-specs with all the others?
17:29:06 interesting question actually. not all IPA hw manager changes directly affect Ironic
17:29:15 I consider IPA a vital part of Ironic (since now the bash way is gone), so ironic-specs seems fine
17:29:16 i.e. the one in question does not
17:29:18 and that's okay
17:29:26 "this change doesn't impact ironic"
17:29:30 bam done
17:29:38 who wants to volunteer to add the agent impact section?
17:29:44 o/
17:29:49 thanks
17:29:58 devananda: wanna send an informational email or shall I?
17:29:59 I'm pretty familiar with the spec tooling, should be easy to add that
17:30:10 devananda, jroll, let's expand the agent impact then into several sections
17:30:15 dtantsur: yah
17:30:26 jroll: don't care. I can do that today
17:30:27 sure, we can discuss those in gerrit
17:30:28 1. agent API impact, 2. hw manager impact, (3. inspection impact? :)
17:30:35 I trust deva for a good first try at that
17:30:40 devananda: that'd be awesome, thanks
17:30:54 dtantsur: ramdisk environment impact :)
17:31:00 lol
17:31:15 somewhat serious, new dependencies are hard with 3-4 different builders
17:31:37 okay, anything else on this topic?
17:32:38 #topic open discussion
17:32:45 have at it folks
17:33:00 here i go?
17:33:05 :-)
17:33:10 jroll, I'm working on Neutron integration patches. I hope to upload an updated version tomorrow
17:33:29 vsaienko: awesome, thanks
17:33:37 chopmann: open to anyone :)
17:34:20 jaoh and me are working on https://bugs.launchpad.net/ironic/+bug/1583065
17:34:21 Launchpad bug 1583065 in Ironic "[RFE] Support network switches provisioning" [Wishlist,Incomplete] - Assigned to Cornelio Hopmann (hopmann-n)
17:34:45 indeed
17:34:56 Talking about IPA and testing ... I think we have a need (now-ish but really in the future too) for a DIB job in IPA (i.e., testing IPA inside a dib ramdisk). Does anyone know who in our community works on those the most?
17:35:09 (we are OpenSource contribution noobs :-P btw)
17:35:15 JayF, HPE folks and us
17:35:21 I just know in my project-config proposed patch, we'd have voting coreos-src and tinyipa-src jobs, just seems like we should have a dib-src job as well to ensure we don't miss any deps there
17:35:29 dtantsur asked: Hi! What exactly remains to implement to support such switches?
17:35:46 JayF: because the customization of HWMs is something we expect operators to need to do, WDYT about moving that info here? http://docs.openstack.org/developer/ironic/drivers/ipa.html
17:36:00 the idea being all ramdisks get a -src job in IPA to verify that works, then we go up to Ironic and test them against the ramdisk best suited for CI (tinyipa) and presume that the IPA gate jobs "protect" us from other ramdisk breakages
17:36:07 JayF: or somewhere here - http://docs.openstack.org/developer/ironic/#administrator-s-guide
17:36:11 We don't really know which parts are affected by our request.
17:36:17 Our use case is not among the example use cases.
17:36:24 chopmann, yeah, this RFE could use some details. did you even try doing that with whatever we have now? :)
17:36:25 devananda: It's kinda hard, right? I know when I want dev docs for IPA, I look in IPA codebase not in Ironic codebase.
17:36:38 devananda: if we moved some of them, we should move all of them, and change IPA docs themselves to just be a pointed
17:36:39 The "Image" is just a config file (Text).
17:36:40 *pointer
17:36:49 JayF: but "add or customize hardware manager" is something we need ops folks to do for *their* hardware
17:36:53 JayF: ++ then we could also run that dib-src job against dib
17:37:03 JayF: that's different than "change how IPA works"
17:37:16 devananda: there aren't even any docs in Ironic proper on customizing your ramdisk, are there?
17:37:24 devananda: all the build instructions and tooling are in IPA as well
17:37:24 JayF: no. that's my point.
17:37:27 we can't use the ramdisk, these are embedded systems (like Cisco IOS)
17:37:28 jroll, JayF: +1 to DIB job
17:37:28 devananda: JayF: I'd put words in drivers/ipa.html about "you may need to customize your IPA ramdisk, see this link"
17:37:39 JayF: we tell operators they need to customize the ramdisk, but we don't give them any info on *how*
17:37:46 chopmann, what are you even trying to achieve? just configuring these switches?
17:37:47 chopmann: this is totally something I'd like to do, fwiw
17:37:56 chopmann: maybe you should avoid iscsi and agent deployment drivers and devise your own deployment interface
17:37:59 jroll: ++ that's what I mean, just a pointer one way or another :) either move IPA docs to Ironic, and put a pointer in IPA, or add a blurb to Ironic docs that points to the IPA docs
17:38:01 chopmann: is this something you've implemented or looked into?
17:38:11 jroll, JayF: and I think doc'ing how to create a new HWM should be written for operators
17:38:17 not for developers
17:38:27 dtantsur: AIUI, clean and provision a thing with a base config for a new tenant
17:38:35 devananda: well, I'm told operators can't write python
17:38:36 yep, the first step is configuring the switch. Point it to talk to an sdn controller
17:38:36 JayF: I would <3 a dib job, we have one in bifrost at least for making sure the main element works, and that has helped us catch a number of things in the past
17:38:38 (and yea I know, devops and things ... but there is a difference)
17:38:54 jroll, for what definition of "provision"? provided that an image is a text file..
17:38:56 dtantsur: chopmann: and then have neutron or something for configuration beyond that
17:39:03 devananda: ConfigDrivenHardwareManager :P input a yaml file, it changes it into prioritized clean steps running utils.execute() on commands it's given, hehe
17:39:08 dtantsur: well, tell the switch to read it and do the thing
17:39:15 JayF, sounds not so bad tbh
17:39:16 dtantsur: update base OS, etc
17:39:21 hmm
17:39:25 dtantsur: I'm only half joking, I don't think it's a bad idea at all actually
17:39:34 jroll: oh, totally. so we should describe how to write a hardware manager in ruby ;) :P
17:39:34 jroll, how is it different from kickstart which we actively rejected?
17:39:41 this is actually a perfect use case for ansible deploy driver
17:40:05 we are working on a PoC. Normal dhcp+tftp combo and point it to an instance (sdn controller) inside OpenStack
17:40:09 dtantsur: mostly that these can't run a ramdisk :)
17:40:21 dtantsur, jroll: so many of these switches can get an init config via TFTP, think that's what they are going for, basically just using ironic to configure the TFTP and serve it to the switch
17:40:23 chopmann: wouldn't the switch OS image itself be the image you deploy?
17:40:35 chopmann: and then pass in the config (text file) like you would pass in user data?
17:40:41 jroll, yes.. still it's an OS configuration, not quite installation. like kickstart. I'm not against both of them, but I know that devananda is not fond of configuring OS
17:40:42 yes and now.
17:40:45 and no
17:40:59 yeah, I'd love for "deployment" to be "pull down your new software via tftp"
17:41:23 so whitebox switches == deploy an OS on, these are blackbox switches == os already installed but they need bootstrapping
17:41:37 sambetts++
17:41:56 sambetts: blackbox switch may have a different means of loading a new "OS", but they do need to update that sometimes too
17:42:18 sambetts: "boot OS" could be "reset firmware, apply base configs"
17:42:42 I'd like to see: copy tftp flash0:; copy run start; reload
17:42:46 be this driver
17:42:48 that's what I meant by yes and no. the images you load on those switches are sometimes 1+GB (and some sort of linux)
17:43:29 we don't want to do more than point it to the proper sdn controller
17:43:57 the rest can be taken care of with ironic/netman
17:44:11 err neutron
17:44:21 jroll: sure. though, the exact sequence of commands will vary between vendor/model/revision
17:44:26 chopmann, do you plan to write a spec on it (sorry if you already did)?
17:44:35 devananda: yeah, it's a strawman
17:44:40 yes, after this :-)
17:45:08 my other concern is that I don't think an Ironic running at the same level as the neutron that's going to control that switch makes sense, right?
17:45:34 JayF: to your earlier point of -src jobs in IPA -- yes please
17:45:59 devananda: yeah; I'm saying I'm working on it for the existing jobs. There's no dib-src job at all, and I think the scaffolding for it isn't in devstack yet either
17:46:08 ironic in an undercloud providing a switch to an overcloud neutron makes sense, but I'm not sure about them at the same level, seems like a layer break
17:46:08 devananda: so more a "call for someone to do the work" than volunteering, fwiw
17:46:10 lol
17:46:22 sambetts, chopmann: I'm not clear on that. how would you "point it to an sdn controller" to update the switch?
17:46:26 or do you mean, to apply configuration?
17:47:41 JayF, the patches are up for that IIRC
17:47:49 standing up an OVS-DB instance next to ironic, then booting a switch OS that supports OVS onto a switch, passing in enough configuration & credentials for it to connect back to the OVS-DB... that seems like all you'd need
17:47:51 "to hand it over to the tenant" so he can use it in his env
17:48:00 JayF, I'll keep an eye on them, as we do use DIB-based agent in prod
17:48:26 chopmann: oh! I wasn't thinking of that use case
17:48:59 chopmann: yea, spec please, with some clear description of your intended use case, would be very helpful to get the discussion going
17:49:07 yeah, the use case I've always had in mind is, provision to tenant, give them cli access
17:49:19 gotcha. not what I was thinking of
17:49:23 jroll: I think in our minds it's been more LBs or GWs rather than switches
17:49:24 ++ to a spec
17:49:30 jroll: but the whole pattern could make sense together
17:49:40 for public cloud it could be a security problem (you can brick the switch), but if you do only OpenFlow, there is not that much you can break
17:50:48 JayF, devananda: https://review.openstack.org/#/c/264579/
17:51:15 JayF, devananda: work is ongoing for adding DIB support to ironic devstack ^
17:51:40 chopmann: how do you see a user interacting with it? nova boot ?
17:51:42 sambetts: perfect; that'll be easy to integrate into tests once it's complete
17:51:51 sambetts: is that related to the switch discussion?
17:52:04 oh - I see
17:52:06 alright then. We'll work on a spec. With our use-case outline, should we add more to it, or "expand/refine it" as the discussion grows?
17:52:58 sambetts: either through the sdn controller itself or neutron
17:53:16 chopmann: so the sdn controller would trigger a build in ironic?
17:54:20 chopmann: add as much info as you have today, and we'll go from there
17:55:53 chopmann: describe the use case and problem in the opening sections of a spec, following the template here: https://github.com/openstack/ironic-specs/blob/master/specs/template.rst
17:56:38 chopmann: we can discuss on the spec review and help you flesh out the rest, or put it on the agenda here if it needs another broader discussion
17:57:08 thanks guys! :-)
17:57:19 no problem good luck
17:57:20 (and girls)
17:57:38 2 minute warning
17:57:40 err and girls. thank you everyone ;-)
17:57:44 :)
17:58:04 okay sounds like that's it
17:58:08 thanks everyone
17:58:10 #endmeeting