19:00:56 #startmeeting ironic
19:00:57 Meeting started Mon Jun 2 19:00:56 2014 UTC and is due to finish in 60 minutes. The chair is devananda. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:58 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:58 \o
19:01:00 The meeting name has been set to 'ironic'
19:01:05 hi everyone!
19:01:10 as usual, the agenda is available here:
19:01:12 hi devananda :)
19:01:13 #link https://wiki.openstack.org/wiki/Meetings/Ironic
19:01:18 g'evening :)
19:01:19 howdy folks :)
19:01:19 o/
19:01:26 Hi everyone!
19:01:33 hi
19:01:36 \o
19:01:58 actually, I don't have any big announcements today, just a few housekeeping items
19:02:07 \o
19:02:20 specs repo is online, and lots are coming in -- thanks everyone for starting to submit them, and hopefully review them too!
19:02:40 * mrda looks
19:02:50 I'm going to try to sync up with mikal today about the nova midcycle
19:03:17 after discussing it with ironic-core folks at the end of last week, I'd like to take some time during this meeting (not now, after the subteam reports) to talk more
19:03:42 that's it for my announcements ...
19:03:54 #topic progress reports
19:04:06 specs and blueprints are me
19:04:34 i focused on some initial big refactoring that we talked about at the summit
19:04:42 and wrote up a spec, which has gotten a lot of mixed feedback
19:04:56 #link https://review.openstack.org/94923
19:05:24 I'd really like us to come to an agreement on how to proceed with the work on fixing the API / RPC bottleneck
19:06:06 only one -1, can
19:06:06 * lucasagomes has to re-read the spec after the updates
19:06:11 can't be that bad!
19:06:32 * dtantsur actually had some doubts without using -1...
19:06:35 rloo: there were a lot of -1's on the previous rev
19:07:00 devananda: oh, well, one can only hope then ;)
19:07:09 dtantsur: Why?
19:07:16 a lot of concerns are related to debugging after this change lands, I guess
19:07:24 russell_h: AIUI, you guys have been hit the hardest by the issue which this is trying to solve
19:07:32 russell_h: so it'd be great to get some feedback on the spec from your team
19:08:07 devananda: I commented on... Friday or something
19:08:18 I think it's a good direction
19:08:32 russell_h: awesome. i'm still catching up from the weekend -- just read your comment
19:09:00 russell_h: so an intent-lock that can be check-and-set in the API layer would address your concerns, it sounds like
19:09:23 if that means what I think it means, yes
19:09:44 russell_h: http://en.wikipedia.org/wiki/Test-and-set
19:09:46 that ^
19:09:56 devananda: I have some concern over line 61: "these will be queued up locally and processed in the order received"
19:10:42 so the problem with test-and-set in the API is it serializes all API requests against the same resource
19:10:56 it's mainly the "locally" that I am concerned about, i.e. if the conductor should die, will we be able to pick up on what was in the queue
19:11:00 any client will get some sort of resource-is-busy (eg, NodeLocked) error while a previous request is still in progress
19:11:53 #action devananda to write up the pros/cons of test-and-set vs. serializing API actions on the spec
19:11:57 moving on ...
19:12:06 lucasagomes: hi! how's the overall code/driver clean up going?
19:12:29 devananda, about the instance_info we need the changes in devstack and tempest to land
19:12:32 to unblock the work
19:12:39 but pretty much all the patches are up already
19:12:49 lucasagomes: awesome. is the list in an etherpad?
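
For reference, the check-and-set intent lock discussed above amounts to a single atomic UPDATE whose WHERE clause does the test and whose SET does the set, so two concurrent API requests against the same node cannot both succeed. A minimal sketch, assuming a nodes table with a reservation column and a hypothetical reserve_node helper (not the spec's actual design):

    import sqlalchemy as sa

    metadata = sa.MetaData()
    nodes = sa.Table(
        'nodes', metadata,
        sa.Column('uuid', sa.String(36), primary_key=True),
        sa.Column('reservation', sa.String(255), nullable=True),
    )

    class NodeLocked(Exception):
        """Another request already holds this node."""

    def reserve_node(conn, node_uuid, owner):
        # Test-and-set in one statement: the row only matches when no
        # reservation is held, so the check and the set are atomic.
        result = conn.execute(
            nodes.update()
                 .where(nodes.c.uuid == node_uuid)
                 .where(nodes.c.reservation.is_(None))
                 .values(reservation=owner))
        if result.rowcount != 1:
            raise NodeLocked('node %s is already reserved' % node_uuid)

This is also why test-and-set serializes requests: a second caller gets a NodeLocked (resource-is-busy) error rather than queueing, which is the trade-off devananda takes as an action item above.
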
19:12:55 I need one more patch adding documentation
19:13:13 devananda, nope, lemme add the list of patches to the ironic whiteboard or something
19:13:28 lucasagomes: TY
19:13:43 lucasagomes: nice, thank you for taking that on
19:13:47 lucasagomes: thanks. would be great to get some other eyes on it. QA folks like to see our team review our QA patches
19:14:08 +1
19:14:14 #action lucasagomes to add links to driver-info related QA patches to https://etherpad.openstack.org/p/IronicWhiteBoard
19:14:18 devananda: oh, in that case, i need you to review a couple of tempest changes
19:14:28 #action anyone familiar with tempest/devstack to help by reviewing said patches
19:14:36 o/
19:14:54 +1
19:14:54 Shrews: fantastic. please add to the etherpad ^
19:14:56 lucasagomes: are there devtest patches too
19:15:01 I'll take a look at those
19:15:31 NobodyCam, yes, there's one adding the pxe_deploy_ramdisk and pxe_deploy_kernel information when enrolling the node
19:15:36 not only to the flavor
19:15:41 moving on to other subteam reports, since we're already sorta touching on them :)
19:15:57 #topic subteam: integration testing
19:16:19 sounds like several folks are making tempest changes now
19:17:06 anyone want to give a summary of all that?
19:17:15 wiki says adam_g is leading that subteam
19:18:02 I also have a link saved from last week where several folks are adding new tempest coverage for ironic
19:18:05 #link https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:bp/missing-baremetal-api-test,n,z
19:18:05 TBH I'm not too on top of in-flight tempest changes ATM. I've been busy the last week hammering on some large baremetal test racks, trying to iron out issues that are blocking ironic+tripleO there
19:18:17 i did see Shrews has some new scenario tests up for ironic, which is awesome
19:18:39 adam_g: yes, just added those to the etherpad. plz review if you get the chance
19:18:42 adam_g: np - the bugs/patches coming out of that work are great, too
19:19:28 I have changes up for a rebuild() test. And was working on a reboot() test last week until getting pulled away for a TripleO blocker bug
19:19:35 there's also the item of getting ironic gate jobs to report separately, similar to 3rd party CI
19:20:01 but i personally have not had a chance to look at the required zuul changes to make that work
19:20:34 Oh, also, d-g now creates the baremetal flavor with a 1G ephemeral partition
19:20:35 #link http://lists.openstack.org/pipermail/openstack-dev/2014-May/035827.html
19:20:45 joshua hesketh volunteered for the third-party-CI thing -- anyone know his IRC handle?
19:20:58 devananda: it's jhesketh
19:21:17 he's a cow orker of mine, so I can follow up if you like
19:21:34 mrda: awesome, thanks
19:21:54 adam_g: do you want to / have time to continue tracking the ironic CI efforts?
19:22:26 adam_g: i think you've been doing great, fwiw, but understand when priorities shift
19:22:30 #action mrda to follow up with joshua hesketh (jhesketh) on the third-party-CI thing
19:22:32 devananda, i would like to, yes. i should be freed up a bit from some of this other stuff in the next week or two
19:22:37 #chair NobodyCam
19:22:38 Current chairs: NobodyCam devananda
19:22:42 NobodyCam: try again :)
19:22:43 doh
19:22:47 #action mrda to follow up with joshua hesketh (jhesketh) on the third-party-CI thing
19:22:54 adam_g: great - thanks!
19:23:05 any other CI efforts folks want to share before we move on?
19:23:10 Just to grab your attention: https://review.openstack.org/#/c/97309/ <-- this seems to affect devstack running with Ironic enabled (at least for me)
19:23:16 I'll be starting to jump in a bit more on our Tempest efforts as well
19:23:24 Starting to get freed up more now
19:24:02 dwalleck: that's interesting. the config file is not in the default location -- is that true for other services as well?
19:24:47 dtantsur? ^
19:24:47 was that for me?
19:25:06 I think so :)
19:25:07 dwalleck, hmm. that's strange. i've had to manually hack on the ironic mysql tables in the past, so i know it should be used correctly. or at least it was at some point
19:25:11 woops. yes :)
19:25:13 er
19:25:16 !
19:25:25 Well, that's what I saw for some time: the default is to create an sqlite database in the current directory
19:25:51 dwalleck: you're popular already :P
19:26:00 dtantsur: hm, k. thanks for calling attention to that. doesn't look like lib/nova needs to do that...
19:26:20 moving on
19:26:28 #topic subteam: bug triage
19:26:36 dtantsur: you're up (again) :)
19:26:44 any updates for us?
19:26:58 devananda, $NOVA_BIN_DIR/nova-manage --config-file $NOVA_CELLS_CONF db sync <-- seems like nova does
19:27:03 ok, now to updates
19:27:23 except for this bug, I am now seeing something with SELinux inside DIB
19:27:25 dtantsur: 623: $NOVA_BIN_DIR/nova-manage db sync
19:27:51 see https://etherpad.openstack.org/p/IronicTestingOnFedora for both issues
19:28:07 I will investigate both a bit more though, did not have time the last week
19:28:27 for now, you have to build the deploy kernel/ramdisk with the selinux-permissive element
19:28:57 oh, I actually thought we were talking about Fedora issues :)
19:29:14 * dtantsur hates morning evenings
19:29:21 dtantsur: slightly out of order ... that's ok.
19:29:26 that was monday evening.....
19:29:39 fwiw, it looks like we have 21 new/undecided bugs -- https://bugs.launchpad.net/ironic/+bugs?field.status=NEW&field.importance=UNDECIDED
19:29:44 so much triage is still needed
19:29:49 yeah, a lot
19:30:00 I started with python-ironicclient and it looks quite ok
19:30:09 except for 2 bugs which nobody can close
19:30:29 hurray for launchpad unclosable bugs :(
19:30:45 now I'm pushing folks to have a look at what they're assigned to and check whether they're working on it
19:30:59 I appear unable to set importance on bugs, even the ones I submit. Am I just doing something dumb?
19:31:05 we have _a lot_ of assigned bugs with no visible progress
19:31:08 matty_dubs: join the bug triage team
19:31:39 dtantsur: ugh. I've seen a few of those and poked people, but I'm sure you're seeing more.
19:31:54 to sum it up, I only actually started with undecided bugs - triaged 1 or 2
19:31:59 * mrda wishes people would move bugs to "In Progress" from "Triaged" when they work on them
19:32:04 matty_dubs, https://launchpad.net/~ironic-bugs
19:32:08 will do much more this week
19:32:21 Thanks -- joined!
19:32:25 IMO, bugs shouldn't be Assigned to someone without a patch up for more than a day or two
19:32:33 dtantsur: if timing is an issue for you I am happy to help poke people
19:32:57 also just FYI: will be on PTO 9.06-13.06
19:33:31 devananda, or some updates in the bug at least
19:33:31 NobodyCam, cool! Actually most assignees are here, so if we all have a look at what we're assigned, that will be very good
19:33:41 any objection to some cleanup // un-assigning bugs where there is no visible progress (no patch) for more than 7 days after the bug was assigned?
19:33:46 devananda: i think 1-2 days is too short. how about 1 week?
19:33:57 1 week sounds right
19:34:03 +1
19:34:20 +1
19:34:22 devananda: i actually assigned some bugs during icehouse and since they were low priority, decided to wait til after refactoring. so am waiting still...
19:35:03 +1 for 1 week
19:35:08 great
19:35:14 maybe a sliding scale: crit = 1 day, high = 2-3 days, < high = 1 week
19:35:16 (ok, and one I assigned to keep as a low-hanging fruit in case someone was looking and there weren't any avail. guess i won't do that again.)
19:35:49 rloo: better to leave LHF unassigned -- and we should all do that, when we have bigger things to work on
19:35:58 ok, so I will just unassign when I see no activity for > 1 week
19:36:05 devananda, what to do with abandoned patches?
19:36:10 devananda: seems like the problem there is that someone snaps up the lhf and there aren't any left for a newbie.
19:36:19 for now I either ping people or just move the bug back to triaged
19:36:27 depending on how much time has passed
19:36:30 is that right?
19:36:36 devananda: no worries. i'll look for a typo next time ;)
19:36:50 :)
19:36:50 rloo: cultural problem there -- core reviewers and our most active developers should be leaving LHF and encouraging newcomers
19:36:58 rloo, I've tagged at least one more today :)
19:37:24 dtantsur: I prefer to give folks a chance, rather than /just/ unassign, especially if they're on IRC and I know them
19:37:43 dtantsur: but we've had a few where eg. folks I've never met and who aren't on IRC grab a bug and don't work on it
19:37:50 dtantsur: use your judgement :)
19:37:54 ack :)
19:38:00 thanks!
19:38:04 anything else on bug triage?
19:38:17 i think we should just say something now. so that it is 'official'. so the clock starts now...
19:38:18 that's it for me, will be triaging more this week
19:38:29 rloo: fair. i'll send an email
19:38:38 next week I'm on PTO, so no report from me next monday
19:38:41 #action devananda to send email to ML that we're doing a bug cleanup
19:39:05 #info dtantsur on PTO next monday
19:39:31 dtantsur: will you be around friday? if so, mind sending a little summary either to me (i'll paste it in the meeting) or to the ML?
19:39:39 o/
19:39:44 sorry from the antipodes
19:39:47 morning lifeless
19:39:54 devananda, sure, will try not to forget :)
19:40:02 dtantsur: thanks :)
19:40:10 #topic ironic python agent
19:40:20 russell_h: hi!
19:40:26 * jroll is here too :)
19:40:31 russell_h: status updates? need eyes on something in particular?
19:41:04 so, we've been working lately on getting ironic + IPA working on real hardware, with real switches, etc.
19:41:07 jroll: hi!
19:41:17 it's getting pretty good, not 100% but close
19:41:45 we've built a neutron extension to do network isolation things on real switches
19:41:58 jroll: that ^ is awesome
19:42:04 #link https://github.com/rackerlabs/ironic-neutron-plugin/
19:42:22 jroll: i've seen a lot of chatter on the ML about that work from the neutron folks. i assume you're working with them?
19:42:46 I believe we've started the conversation
19:42:53 morgabra is working on that side of things
19:43:04 he may be able to answer that
19:43:12 great
19:43:39 jroll: fwiw, i would like to see a spec describing the interaction between ironic<->neutron, what API changes will be required in ironic (if any), etc
19:43:44 I'm thinking people should start looking at the reviews related to the agent (and they have)
19:44:05 and I think the major patch will stabilize soon
19:44:08 devananda: agreed
19:44:18 I still owe you the IPA spec, too
19:44:45 jroll: I suspect some of that may not even be known to folks yet ... and that's fine. it'll give a good reference as the interaction/integration is proven. also, I bet the neutron team will want a spec too ;)
19:44:46 which is half-done, but I've been out of office slash juggling things
19:44:54 heh, of course
19:45:45 devananda: I think that's all I have
19:46:14 jroll: np on the spec not being ready yet. having that by J1 would be good, as I'd like to land it, if it's ready, by J2 or very soon thereafter, ideally
19:46:17 jroll: thanks!
19:46:22 #topic oslo
19:46:23 when is J1?
19:46:33 I'll try to be quick: oslo.db and oslo.i18n
19:46:35 GheRivero: hi! any news on the oslo libraries side of things?
19:46:42 oslo.db has a couple of blockers (config and the removal of tpool)
19:46:59 but the plan is to fix them in a couple of weeks, and release an alpha library
19:47:02 jroll: https://wiki.openstack.org/wiki/Juno_Release_Schedule
19:47:09 thanks rloo
19:47:27 there is a WIP patch for ironic just waiting for the config stuff to land, but nothing blocking
19:47:39 devananda: that timeline sounds right about perfect to me
19:47:45 on the oslo.i18n side, a release should be done really soon
19:47:56 for us, everything will be mostly the same, using _
19:48:02 GheRivero: i'll be looking for your input on when you think we should start moving to the oslo.db lib. I'm not aware of any reason for us to hurry into it
19:48:24 but we will have to use _LW, _LD... for the logs' i18n
19:48:43 devananda: it's just for testing, I know there is no hurry
19:48:50 I have a spec up for us to improve our logging -- https://review.openstack.org/#/c/94924/
19:48:58 but it's better to have something prepared upfront
19:49:14 GheRivero: do we need to move all logging to _LE, _LW, etc before switching to oslo.i18n?
19:49:21 I saw at least one logging improvement patch this morning
19:49:39 devananda: there is no need, we can still use _
19:49:51 but it is a recommendation. Can be done gradually
19:49:56 GheRivero: cool
19:50:11 ok, thanks!
19:50:29 that's it for subteam reports -- thanks everyone, that was really good!
19:50:39 #topic discussion - sfdisk vs parted
19:50:41 NobodyCam: hi!
19:50:42 Thank you everyone!
19:50:49 May I have a quick announcement, before the holy war starts?
19:50:54 hey hey so I've split 93133 up
19:50:57 NobodyCam: you've had a pretty painful time fixing the volume label bug, it looks like
19:51:08 hah -- ok, hold on
19:51:15 #topic something else
19:51:18 dtantsur: go :)
19:51:33 I've prepared two versions of a review dashboard: http://goo.gl/hqRrRw and http://goo.gl/5O3SWt :)
19:51:57 most of you have already seen them, just posting now to get everyone's attention. feedback is welcome
19:52:06 that's all for me, thanks :)
19:52:13 dtantsur: thanks! these look awesome - i've already started using the one you emailed
19:52:29 and btw they include IPA
19:52:32 dtantsur: wondering if it's valuable to split out IPA pull requests
19:52:41 JayF why?
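
For context on the _ vs _LW/_LE distinction above: each project keeps a small i18n module built on oslo.i18n's TranslatorFactory, roughly like the sketch below. The module layout is the usual per-project convention rather than something oslo.i18n provides, and the exact import path varies by release.

    # Sketch of the oslo.i18n marker convention: _ marks user-facing
    # strings, while the _L* markers mark log messages so they can be
    # translated separately per log level.
    import logging

    import oslo_i18n  # early releases used: from oslo import i18n

    _translators = oslo_i18n.TranslatorFactory(domain='ironic')

    _ = _translators.primary        # user-facing (API errors, exceptions)
    _LI = _translators.log_info     # log markers, one per level
    _LW = _translators.log_warning
    _LE = _translators.log_error

    LOG = logging.getLogger(__name__)

    # hypothetical usage: log messages get _L*, exception text gets _
    def reserve(node_uuid):
        LOG.warning(_LW("Node %s is locked by another process"), node_uuid)
        raise RuntimeError(_("Node %s could not be reserved") % node_uuid)
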
19:52:43 #info super awesome new review dashboards! http://goo.gl/hqRrRw and http://goo.gl/5O3SWt
19:52:51 dtantsur: Thanks a lot!
19:52:53 \o/
19:52:59 dtantsur, cool!
19:52:59 dtantsur: My honest answer is that I want it, but I know my needs might not align with others :)
19:53:24 #topic discussion - sfdisk vs parted
19:53:27 JayF let's discuss on #openstack-ironic tomorrow, ok?
19:53:29 NobodyCam: ok - you're really up now :)
19:53:35 :-p I have started to work on the preserve-ephemeral patch and wanted to get input: should we look at replacing parted, or keep it?
19:53:36 dtantsur: sure, and no is an OK answer to me as well :)
19:54:25 NobodyCam: AIUI, there's a bug in parted itself which is causing our problem
19:54:29 NobodyCam, preserve ephemeral?
19:54:39 NobodyCam: is this a new patch?
19:54:51 NobodyCam: and there's no way to work within parted to avoid this issue. Is that correct?
19:54:58 devananda: yes. lucasagomes: i have split 93133
19:55:03 could you remind us why we moved to parted in the first place? I remember it was for reasons...
19:55:10 lucasagomes: ^^ ?
19:55:11 Is there a bug or a note open with what the exact 'issue' being referenced is?
19:55:13 devananda, we are also doing things wrong in having the same label on multiple disks
19:55:24 lifeless: per review comments I have split 93133 into two patches
19:55:40 lucasagomes: the issue is that the conductor host's local disk labels may duplicate the node's disk labels
19:55:49 lucasagomes: not that >1 disk in a given host has duplicate labels
19:55:51 dtantsur, sfdisk was broken, so when i was fixing the issue and moving things to its own classes etc (DiskPartitioner)
19:56:24 one to wipe metadata (93133) and another, as yet to be pushed up, to rework the preserve-ephemeral logic
19:56:29 I thought about using parted because it was more convenient and also we would use the same tool for both MBR and GPT
19:56:56 devananda, right, yeah that's correct, in the IPA approach for e.g. we wouldn't have that problem
19:57:05 we were talking last week about generating random labels for the partitions
19:57:10 and adding it to the instance_info
19:57:11 lucasagomes: devananda: with IPA we can suffer the same problem
19:57:19 if there are two disks in a machine and the order has been switched
19:57:30 corner case, but can happen.
19:57:33 right
19:57:35 ah yeah
19:57:48 or eg if an operator moved disks between machines (eg, during maintenance)
19:57:52 Aren't there UUIDs for disks you can reference and look up?
19:58:08 JayF: yes but this isn't about what you 'can do', it's about what parted *does do*
19:58:12 JayF: parted checks label names
19:58:22 I guess I'm still missing some context about what the exact problem is. I looked at 93133 and didn't see a bug linked in there.
19:58:25 so names like "cloud-rootfs" conflict
19:58:27 JayF: and parted treats labels as unique in [some parts of] its code
19:58:30 JayF: incorrectly.
19:58:33 *could conflict
19:58:41 #link https://bugs.launchpad.net/ironic/+bug/1317647
19:58:42 Launchpad bug 1317647 in ironic "disk volume labels on conductor can conflict with nodes disk labels" [Critical,In progress]
19:58:43 JayF: ^
19:58:59 NobodyCam: I don't think we should change from parted in fixing the bug
19:59:18 NobodyCam: IMO parted should *only ever* be used on blank disks.
19:59:28 Yeah so that could happen in IPA, although I think it's more likely to happen with the pxe driver / on a conductor
19:59:32 (one minute) can we continue in channel?
19:59:33 NobodyCam: so whether we want to move from parted or not is irrelevant to this bug
19:59:47 yeah, removing the metadata seems the right thing to do
19:59:55 before re-partitioning the disk anyway
20:00:00 lucasagomes: right
20:00:10 that is up under 93133
20:00:32 summary: during deploy: wipe metadata, use parted. during rebuild: don't touch partition table and don't use parted. yes/no?
20:00:45 woops - time's up. let's continue in our channel
20:00:51 thanks everyone!
20:00:58 #endmeeting
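
To illustrate the closing summary (wipe metadata then use parted on deploy; leave the partition table alone on rebuild), a minimal sketch follows. The helper names and block counts are illustrative assumptions, not the actual 93133 change.

    # Sketch: zero the MBR/primary GPT at the start of the disk and the
    # backup GPT at the end, so parted never sees stale labels left over
    # from a previous deployment (or from the conductor's own disks).
    import subprocess

    def destroy_disk_metadata(dev):
        subprocess.check_call(
            ['dd', 'if=/dev/zero', 'of=%s' % dev, 'bs=512', 'count=2048'])
        # blockdev --getsz reports the device size in 512-byte sectors
        sectors = int(subprocess.check_output(
            ['blockdev', '--getsz', dev]).strip())
        subprocess.check_call(
            ['dd', 'if=/dev/zero', 'of=%s' % dev, 'bs=512', 'count=2048',
             'seek=%d' % max(0, sectors - 2048)])

    def partition(dev, preserve_ephemeral=False):
        if preserve_ephemeral:
            return  # rebuild: don't touch the existing partition table
        destroy_disk_metadata(dev)  # deploy: wipe metadata first
        subprocess.check_call(['parted', '-s', dev, 'mklabel', 'msdos'])
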