15:00:22 #startmeeting ironic
15:00:22 Meeting started Mon Oct 8 15:00:22 2018 UTC and is due to finish in 60 minutes. The chair is TheJulia. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:23 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:25 Good morning everyone!
15:00:26 The meeting name has been set to 'ironic'
15:00:31 o/
15:00:35 \o
15:00:37 o/
15:00:38 \o
15:00:38 o/
15:00:40 o/
15:00:52 o/
15:01:00 Our agenda this week is fairly straightforward, and can be found on the wiki.
15:01:01 #link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_next_meeting
15:01:09 o/
15:01:36 #topic Announcements / Reminders
15:01:44 o、
15:01:48 o/
15:02:01 #info We have published the priorities document for the cycle! \o/
15:02:08 o/
15:02:09 woo
15:02:16 #link http://specs.openstack.org/openstack/ironic-specs/priorities/stein-priorities.html
15:02:50 There should be a story for everything in storyboard at this time. There is also a high level worklist page on storyboard now
15:03:14 #link https://storyboard.openstack.org/#!/worklist/494
15:04:09 Does anyone have anything else to announce or remind us of this week?
15:04:16 * etingof would like to introduce the newborn ironicer - iurygregory \o/
15:04:42 Welcome, iurygregory :-)
15:04:53 welcome!
15:04:58 Welcome iurygregory!
15:05:09 thanks everyone o/
15:05:13 welcome
15:05:23 One reminder, I will be effectively completely AFK all day tomorrow.
15:05:29 welcome
15:05:49 welcome iurygregory
15:06:09 #topic Review action items from previous meeting
15:06:19 #info No action items last week, so moving on!
15:06:47 #topic Review subteam status reports
15:07:11 #link https://etherpad.openstack.org/p/IronicWhiteBoard
15:07:27 Starting around line 173
15:08:37 I've put initial statuses on some of the items. If you own one of the items according to the priorities, please indicate a status, even if work has not started on that item yet.
15:09:31 dtantsur: awesome about a prototype
15:09:44 :)
15:11:28 o/
15:12:08 Greetings mjturek
15:12:29 Regarding python3 first, have any 3rd party CI operators had a chance to look at setting up a job or two to run python3?
15:13:15 TheJulia not much of an update, but our CI guru doesn't see any issues with it
15:14:19 Okay, it should be fairly easy, duplicate and pass USE_PYTHON3=True into the CI job :)
15:14:45 * TheJulia wonders if we need a "Just for fun" category
15:15:13 Anyway, has everyone had a chance to review and update statuses?
15:15:33 And with that, are we ready to proceed?
15:15:38 TheJulia: We're looking into it.
15:16:00 Just a question of which to cut over.
15:16:16 rpioso: in your guys' case, you have tons of jobs, I would just cut over the ones you feel most exercise your driver to use Python3
15:16:30 TheJulia: +1
15:17:54 Everyone good to proceed?
15:18:27 * TheJulia gets out the crickets
15:19:02 Okay, I guess we're good to proceed then
15:19:21 #topic Deciding on priorities for the coming week
15:19:35 #link https://etherpad.openstack.org/p/IronicWhiteBoard
15:20:13 Starting at line 105, I pre-populated a list based on looking through review this morning
15:20:27 Is there anything anyone feels is missing? That they feel needs to be added or removed?
15:21:53 seems reasonable
15:22:55 Any objections, or are we good to proceed?
15:23:00 * TheJulia brews more coffee and hands it out
15:23:24 * etingof would love to have the long-running ipmitool processes idea criticized
15:23:56 meaning https://review.openstack.org/#/c/607949/
15:23:57 patch 607949 - ironic - WIP: Avoid long-pending ipmitool processes - 1 patch set
15:23:58 looks good to me
15:24:08 * jroll already gave his feedback :)
15:24:25 etingof: I've not had a chance to look yet, could we take that to discussion?
15:24:31 sure
15:24:33 thank you!
15:24:35 Okay, Moving on then
15:24:39 #topic Discussion
15:24:55 First topic of the day is: Do we have a forward path on deploy templates?
15:25:15 I'm still waiting to see jay's proposal for the full picture
15:25:21 but I think we have enough to start building it out?
15:25:37 jroll: I'm not sure he is going to given the way the discussion went :(
15:25:58 jroll: I wasn't planning on continuing any formal proposal considering the discussions.
15:26:16 erm
15:26:22 I thought we landed on agreement with that proposal
15:26:45 other than folks saying it would take too long, I guess
15:26:46 I think we can do the internals without many issues or disagreements, the what information we act upon and how we get that in or populated seems to be lacking agreement
15:26:48 jroll: I essentially capitulated.
15:27:08 jaypipes: don't blame you, I guess it's my optimism hoping you were continuing writing instead of arguing :)
15:27:44 TheJulia: again, I think everybody came to agreement in that thread, with a small chunk of "we don't have time!!!!!!"
15:27:51 maybe I read it wrong
15:28:08 jroll: the official line from nova is that virt drivers should feel free to use the required traits list as signals to the virt driver to configure an instance. We advise the virt driver not to put non-schedulable/non-placement-influencing things as required traits.
15:28:11 the recent mail from Chris Friesen suggested traits should only be used for configuration of booleans
15:28:17 I feel like with the side discussion, is that the mechanism would be capabilities, and we would just ignore traits
15:28:36 and capabilities would essentially now live forever
15:28:37 capabilities?
15:28:48 jroll: that said, we're not keen to add any deploy_template_X stuff to os-traits standard traits library, so the deploy template traits should be prefixed with CUSTOM_.
15:28:51 And then we would have an external override mechanism
15:28:57 I thought the purpose of the deploy templates was to turn deploy steps into booleans?
15:29:03 jaypipes: We were never ever suggesting that
15:29:04 jaypipes: yep, hear everything you're saying loud and clear
15:29:18 TheJulia: johnthetubaguy was.
15:29:29 jaypipes: oh... well... Ummm.. Hmm. :(
15:29:46 jaypipes: he wanted to use standard traits, but not like that
15:30:21 jaypipes: I'll put it this way, my impression and understanding is that we would simply rely upon existing traits, but not do anything like a template definition in os-traits, since it is completely freeform with CUSTOM_
15:30:24 if there is a sensible standard trait, then it could be used. He wasn't suggesting putting garbage into os-traits
15:30:29 mgoddard: we'd still support standard traits being added to os-traits like BOOT_MODE_UEFI/BOOT_MODE_BIOS or STORAGE_RAID5 etc
15:30:40 so where I believe we're at (or did before this meeting) - we have a path for boolean config things like UEFI, we still need to determine the path for more complex configuration data, and there's a good proposal from jaypipes in that thread. I still think this is the path we should take
15:30:57 ^ curious folks' take on that
15:31:00 * jroll gets links
15:31:06 jroll: I concur
15:31:12 TheJulia: ++
15:31:24 * TheJulia doesn't know why she didn't do that sooner
15:31:38 this is the proposal I think we should take:
15:31:41 #link http://lists.openstack.org/pipermail/openstack-dev/2018-October/135300.html
15:32:43 and I can't find the simple boolean proposal now :|
15:33:16 ah, the simple part:
15:33:19 #link http://lists.openstack.org/pipermail/openstack-dev/2018-October/135446.html
15:33:46 jroll: so, yeah, I'd love to see that type of solution long term, but it ain't a reality any time soon given current state of thinking in nova.
15:34:05 jroll: I'm referring to the first link above.
15:34:33 jaypipes: yeah, social problem, not technical. can be overcome.
15:34:42 jroll: and yes, cfriesen's email represents the agreed, simple approach.
15:34:57 jroll: It was on another thread if I remember correctly
15:35:08 TheJulia: yes, I linked it :)
15:35:18 jroll: to which I responded with the Ironic-ness here: http://lists.openstack.org/pipermail/openstack-dev/2018-October/135474.html
15:35:28 * devananda looks for the jar of unicorn dust, and adds some to their coffee
15:35:28 jaypipes: indeed
15:35:53 anyway, to the original question,
15:35:55 so if we were to build a solution for the boolean proposal using traits, and until --config-data exists, abuse it for non-booleans, how would that sit with everyone?
15:36:03 let's proceed... yes what mgoddard said :)
15:36:10 er wait
15:36:25 no, I don't want to abuse traits like that
15:36:37 just wait for complex things until someone cares enough to push for the right solution
15:36:39 so no support for non-booleans?
15:36:43 I totally didn't see jaypipes's further comments below
15:37:12 mgoddard: that's my opinion, yes, we're going to dig ourselves another compatibility hole like capabilities
15:37:42 if we build a generic mechanism, can we stop people?
15:37:43 mgoddard: I'm good with that
15:37:54 mgoddard: maybe if we also sprinkle some of this unicorn dust devananda has been hiding
15:38:11 s/sprinkle/sprinkle in/
15:38:14 we can't stop people
15:38:19 but we can yell that it isn't supported
15:38:23 and then not care about breaking it
15:38:45 people will just ignore it
15:39:03 that's fine
15:39:04 which might be ok, but we should understand that
15:39:06 mgoddard: well, if things don't match up or are not viable and don't pass validate, drivers should fail and prohibit deployment
15:39:12 and they may be broken later
15:39:29 are we leaning towards delaying RAID for eternity more?
15:39:29 (we might need to augment the list in the ironic virt driver for interfaces cared about at some point)
15:39:46 dtantsur: I think we can consider a boolean of raid, but not information about a raid
15:39:56 right, so a template still?
15:39:58 boolean is quite subtle here - you could argue that RAID5 is a boolean - you either have it or you don't
15:40:07 yeah, this ^^ is my question
15:40:32 dtantsur: I think it would still be a template, default configurations would need to be populated in the boolean scenario
15:40:51 once we have something with metadata references, then we can allow more dynamic pass-in of raid configuration
15:41:19 * TheJulia wonders if we're all on the same page and in a relatively happy place
15:41:20 so RAID is a boolean?
15:41:43 (that's not what I understand boolean to mean...)
15:42:36 so, the gap is the fact that our model requires the configuration for raid to be set, the template stored in ironic... I guess in theory could replace the raid template
15:42:38 very few boot-time-configurable traits are true booleans, because many of them interact with other settings. how about secure boot mode <-> legacy BIOS setting?
15:42:38 err
15:42:40 raid config
15:43:18 So in theory, we could have RAID5, RAID10, and a deploy template could swap default configurations around :\
15:43:44 * TheJulia doesn't want raid to derail the boolean nature of secure boot
15:44:08 TheJulia: I'm pointing out that secure boot actually isn't a simple boolean, unfortunately
15:44:36 what if I request secureboot=true, and biosmode=true?
15:44:39 devananda: I think settings would need to fall into the entire bios setting side of the universe where an operator could advertise a specific trait on nodes based upon bios settings they have applied, and as time goes on we could iterate that
15:45:26 now I'm thinking - drivers can already read the traits passed to ironic, for the simple things. the simple proposal just adds some mechanisms to nova to pass additional traits to the virt driver; our side is done. I imagine the more complex --config-data proposal would pass this data in a different way, and I think that's where deploy templates need to come in.
15:45:27 devananda: validate() code would need to be sufficient to recognize such a condition and prohibit deployment.
15:45:37 I don't mean to derail, but I don't think this problem is limited to RAID settings
15:46:05 so maybe the deploy templates work needs to propose the ironic side of the --config-data bits, and then we can complete the work in the nova api
15:46:19 devananda: I absolutely agree with you there, but we can only announce 80-ish possible traits to be scheduled upon anyway.
15:46:52 devananda: and all of those things are booleans, they exist or not
15:47:02 I see
15:47:26 jroll: I think that is reasonable as well
15:47:40 * TheJulia thinks we just need to go off and hack on code at this point
15:47:54 jroll: I'd be wary of implementing something without buy in from nova on a high level design
15:48:48 mgoddard: I didn't see anyone from nova opposed to this proposal for reasons other than "it'll take too long and we need to solve this asap"
15:48:58 jaypipes: ^ would you agree with that?
15:49:47 We also kind of reached a similar point in prior in-person discussions and it felt like we were kind of at that point where an ID value was blessed, and even ironic could receive that in the post to move to active state, and then go lookup the data if needed
15:50:01 jroll: reading back, one sec
15:50:22 Well, it should likely be set first, that way validate can do the needful to determine if the deployment is actually possible or not
15:50:50 * TheJulia thinks we still call validate right before deployment anyway, so she is just rambling off into the wind
15:51:03 jaypipes: more pointedly, would you agree that the only resistance to your proposal from nova folks was about the time it will take?
15:53:07 jroll: that was the primary resistance, yes.
15:53:27 nod
15:53:34 jroll: and that resistance was from ironic as well :)
15:53:37 sure
15:54:04 mgoddard: so I'm not too worried about technical resistance from nova, but we can get a nova spec up sooner than later, so we don't implement it without some sort of buyoff
15:55:16 jroll: +1 for submitting a nova spec
15:55:16 jaypipes: I think a good chunk of that was also because we didn't want to go to an extreme to get started, but it feels like (with the current discussion) that a happy place has been obtained
15:56:08 So 5 more minutes for our scheduled time block, I'd like to jump to etingof's ad-hoc topic if we feel that we're at a happy place on the current topic
15:56:30 one more question: who's doing the nova spec? :)
15:57:02 Additional to that, do we want to get it done before berlin?
15:57:25 TheJulia: ++
15:58:02 If needed, I can take the nova spec on my todo list.
15:58:16 But I'm totally adding a just for fun review category then :)
15:58:29 (It's the only way I'll retain sanity)
15:59:02 * TheJulia takes silence as consensus
15:59:15 TheJulia: take "delegate the nova spec" on your todo list instead :)
15:59:32 jroll: heh
15:59:48 etingof: So, https://review.openstack.org/#/c/607949/ :)
15:59:49 patch 607949 - ironic - WIP: Avoid long-pending ipmitool processes - 1 patch set
15:59:55 so I think I've discovered that we can't trust ipmitool's timeout/retries values we pass it. if we do, ironic gets blocked for up to 250 secs on a dead node. in the patch I've proposed I am trying to play it safe with ipmitool not to get blocked for long.
16:00:31 I remember we discussed this a long long long time ago (many years) and couldn't reach consensus because it required really cracking open ipmitool's code to understand what it was doing.
16:00:50 I'm +1 to fixing the behavior
16:00:51 well, I've debugged it a bit
16:01:09 it has adaptive delays here and there
16:01:39 etingof: just curious, which implementation and version of ipmitool are you having that problem with?
16:01:46 but that does not matter, the bad thing is that once we call ipmitool on a dead node, we are blocked out for some time
16:02:01 devananda, 1.8
16:02:36 etingof: I mean, which implementation? there are different codebases out there, which different distros package under similar package names, which all create a binary called "ipmitool"
16:03:00 again, don't mean to derail, but that was one of the fixes that I found way-back-when ...
16:03:14 one of the implementations was a lot less "locky" than others
16:03:15 devananda, oh, I used one packaged for centos and fedora
16:03:37 k
16:03:41 devananda, but does it matter? do we want to depend on some specific ipmitool?
16:04:03 rather than on the one being shipped with a distro
16:04:09 if there's a bug in an external package, should we fix it in ironic?
16:04:24 devananda, it does not sound like a bug
16:04:30 ah
16:04:33 We kind of already do, I seem to remember we've got comments stating 1.8.15 in our docs.
16:04:37 devananda, I take it as a way to deal with broken BMCs
16:04:50 etingof: we already depend on a given implementation: https://github.com/openstack/ironic/blob/master/ironic/drivers/modules/ipmitool.py#L27
16:05:47 we also already implemented backoff timing in ironic, to work around issues like this in external commands. I don't know if that code is still around (/me goes and looks for it)
16:06:14 * dtantsur suspects we should wrap up the meeting..
16:06:17 devananda, that backoff thing does not prevent ipmitool from taking as much time as it wants
16:06:21 indeed, I'm hungry
16:06:25 Yeah...
16:06:35 Anyone have anything else or we'll call this meeting a wrap?
16:06:37 * etingof is sorry for being boring tonight
16:06:51 etingof: you're not being boring :(
16:07:37 Okay, calling this meeting over, Thanks everyone!
16:07:40 #endmeeting
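
The closing topic raised the concern that ironic can be blocked for up to 250 secs on a dead node when it relies on the timeout/retries values passed to ipmitool. One generic way to bound that wait from the caller's side is a wall-clock deadline on the child process. The following is only a minimal sketch of that idea and is not taken from patch 607949; the helper name `run_with_deadline` and the deadline values are hypothetical, and a plain Python child process stands in for the ipmitool binary.

```python
import subprocess
import sys


def run_with_deadline(cmd, deadline_seconds):
    """Run cmd, giving up once deadline_seconds of wall-clock time pass.

    Hypothetical helper: returns (completed, stdout), where completed is
    False if the deadline expired before the command finished.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=deadline_seconds,  # enforced by the caller, not the tool
            check=False,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before
        # re-raising, so no long-pending process is left behind.
        return False, b""
    return True, result.stdout


# Stand-in commands (assumption: a real caller would pass an ipmitool
# command line here instead of a Python one-liner).
responsive = [sys.executable, "-c", "print('up')"]
unresponsive = [sys.executable, "-c", "import time; time.sleep(30)"]

print(run_with_deadline(responsive, 10))
print(run_with_deadline(unresponsive, 0.5))
```

With a deadline held by the caller, the worst-case blocking time no longer depends on which ipmitool implementation a distro ships or on how it applies its own adaptive delays.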