19:00:02 #startmeeting Ironic
19:00:02 #chair devananda
19:00:02 Welcome everyone to the Ironic meeting.
19:00:03 Meeting started Mon Mar 24 19:00:02 2014 UTC and is due to finish in 60 minutes. The chair is NobodyCam. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:04 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:07 The meeting name has been set to 'ironic'
19:00:08 Current chairs: NobodyCam devananda
19:00:09 Of course the agenda can be found at:
19:00:09 #link https://wiki.openstack.org/wiki/Meetings/Ironic#Agenda_for_next_meeting
19:00:16 #topic Greetings, roll-call and announcements
19:00:16 Who's here for the Ironic Meeting?
19:00:19 hi all!
19:00:20 o/
19:00:21 o/
19:00:22 o/
19:00:24 \o
19:00:27 o/
19:00:30 hello o/
19:00:32 \o
19:00:37 hi
19:00:37 \o/
19:00:41 welcome all :)
19:00:42 hey
19:00:47 o/
19:01:10 o/
19:01:11 hey lifeless, glad to see you
19:01:23 and welcome everyone else
19:01:47 o/
19:01:55 announcements:
19:02:04 o/
19:02:27 o/
19:02:34 Ironic is working in TripleO deploys... at least in testing... not quite ready for production
19:02:51 but we're getting really close
19:02:59 o/
19:03:14 NobodyCam: well, we can deploy from seed, but we can't deploy Ironic from tripleo
19:03:28 NobodyCam: due to https://bugs.launchpad.net/ironic/+bug/1295503
19:03:29 Launchpad bug 1295503 in ironic "ironic nova driver blocks nova-compute startup when ironic isn't available" [Medium,Incomplete]
19:03:32 lifeless: that is very true :)
19:03:50 let's come back to the tripleo integration bits in a bit
19:03:59 is there an etherpad tracking the TripleO integration and issues, similar to the one for devstack?
19:04:15 yeah that would be nice
19:04:27 we'll come back to that ...
there was
19:04:36 I'd like to discuss the open bug list for icehouse RC1, which should be getting tagged by EOW // early next week
19:04:42 #link https://launchpad.net/ironic/+milestone/icehouse-rc1
19:04:45 #topic Ironic RC1 milestone
19:05:56 I think we're making good progress fixing bugs but reviews are a bit slow still
19:06:24 devananda: should we start up the review jams again
19:06:31 NobodyCam, yeah
19:06:35 also I was away last week
19:06:37 I know several folks were gone last week
19:06:49 devananda: I posted a partial fix for https://bugs.launchpad.net/ironic/+bug/1289048, which the bug doesn't seem to reflect
19:06:50 Launchpad bug 1289048 in ironic "nova virt driver performance issue" [High,Triaged]
19:06:58 which merged
19:07:03 I can put aside some time for reviews this week
19:07:04 yup, so I guess we need to clean up the queue
19:07:05 The rest of it will require API changes, so
19:07:09 I don't think that happens for rc1
19:07:12 and would be up for a review jam
19:07:30 comstud: right. would you mind updating the bug? it may be reasonable to un-target it at this point
19:07:34 is anyone working on #1295874, default heartbeat_timeout too low, leads to random test failures?
19:07:48 Yeah, I'll update it.. I can't untarget it, of course
19:07:54 devananda: should https://bugs.launchpad.net/ironic/+bug/1295503 be on the rc-1 list?
19:07:56 Launchpad bug 1295503 in ironic "ironic nova driver blocks nova-compute startup when ironic isn't available" [Medium,Incomplete]
19:08:03 * devananda untargets it
19:09:13 NobodyCam: lifeless: AIUI, that is actually "ironic-nova driver blocks nova-compute when keystone is not configured"
19:09:23 if the ir-api service is offline, the retry code should handle it
19:09:36 devananda: at startup ?
19:10:06 I'll tweak the title, it's definitely wrong
19:10:08 lifeless: https://bugs.launchpad.net/ironic/+bug/1295870
19:10:10 Launchpad bug 1295870 in ironic "ironic.driver does not wrap all icli calls in _retry" [High,Triaged]
19:10:11 https://bugs.launchpad.net/ironic/+bug/1285507 can be closed based on the comments in it.
19:10:12 Launchpad bug 1285507 in ironic "Improve Ironic Conductor threading & locks" [High,In progress]
19:10:27 rloo: ah, right! thanks for pointing that out
19:10:38 rloo: agree
19:11:31 devananda: so https://bugs.launchpad.net/ironic/+bug/1295870 is related, but we'd need an arbitrarily long timeout for startup operations.
19:11:32 Launchpad bug 1295870 in ironic "ironic.driver does not wrap all icli calls in _retry" [High,Triaged]
19:11:53 devananda: because unlike responses to RPCs, if an error escapes there, startup fails and nova-compute exits
19:12:24 lifeless: yea. these are definitely separate-but-related issues
19:12:36 devananda: although it might be ok as long as it stays up long enough for upstart to believe it started
19:12:42 then we can finish the bootstrap
19:12:50 lifeless: ack. let's revisit the tripleo startup sequence problem in a bit :)
19:12:53 and upstart will restart it when it does fail
19:12:58 sorry, dog with bone
19:13:03 also C is up now, I need to go get her
19:13:13 :)
19:13:17 dumb question: https://bugs.launchpad.net/ironic/+bug/1291420 doesn't the fix have to go in nova?
19:13:19 Launchpad bug 1291420 in ironic "Scheduler not doing exact matching when picking an Ironic Node" [High,Triaged]
19:13:32 re: icehouse-rc1 bugs --> https://bugs.launchpad.net/ironic/+bug/1295874 is on the rc-1 list. Is the desired fix just to bump the default heartbeat_timeout or something more involved? If it's a timeout bump I'll gladly take care of it, if it's something more involved the bug might need more details.
19:13:34 Launchpad bug 1295874 in ironic "default heartbeat_timeout too low, leads to random test failures" [High,Triaged]
19:13:59 devananda, for #1295874 I can take a look at it if it is not taken.
19:14:01 JayF: I think that's the only solution we can do now
19:14:10 rloo: nova driver and host manager are in our tree atm
19:14:11 devananda: I can take care of that today then.
19:14:42 NobodyCam: ah, so we'd add the filter in our tree too. thx.
19:14:51 linggao: JayF: doesn't matter to me which of you takes it :)
19:14:55 devananda: 1295870 isn't targeted. should it be?
19:15:30 Shrews: it'd be great to have that fixed, yes
19:15:38 just not sure how much work that is. might be a lot
19:15:42 oh, JayF I did not see your message.
19:15:57 JayF it is yours then.
19:15:59 linggao: it's OK, we were both thinking about the same thing :)
19:16:22 linggao: ty, was hoping to get a low-hanging bug for my first direct ironic contribution, and that fits the bill. I'll ping you when it's done for a review if you want.
19:16:42 JayF, sure.
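[Editorial note: for context on why a too-low `heartbeat_timeout` causes *random* failures rather than consistent ones: a conductor is considered dead once its last heartbeat is older than the timeout, so a healthy conductor on a slow, loaded CI node can cross the threshold by accident. A toy illustration; the function name is invented, and the real check lives in ironic's conductor/DB layer.]

```python
def conductor_is_alive(last_heartbeat, now, heartbeat_timeout):
    """A conductor counts as alive iff its last heartbeat falls inside
    the timeout window.  Hypothetical helper for illustration only."""
    return (now - last_heartbeat) <= heartbeat_timeout


# With a small timeout, a healthy-but-slow conductor is declared dead,
# which is the intermittent-failure mode described in the bug:
assert not conductor_is_alive(last_heartbeat=100, now=112, heartbeat_timeout=10)
# Raising the default gives slow test environments the needed headroom:
assert conductor_is_alive(last_heartbeat=100, now=112, heartbeat_timeout=60)
```

This is why the fix under discussion is "just bump the default" rather than a behavior change.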
:)
19:16:44 * devananda looks through untriaged bugs
19:17:10 https://bugs.launchpad.net/ironic/+bug/1285806 has a review that needs a look over
19:17:12 Launchpad bug 1285806 in ironic "Ironic will poke the node continuously, if it fails to change power state" [Medium,In progress]
19:17:13 https://bugs.launchpad.net/ironic/+bug/1280267
19:17:14 Launchpad bug 1280267 in ironic "pxelinux config files: absolute paths and symlinks" [Undecided,New]
19:17:34 * NobodyCam looks
19:17:35 https://bugs.launchpad.net/ironic/+bug/1292733
19:17:38 Launchpad bug 1292733 in ironic "unplugging of instance VIFs fails if no VIFs associated with port" [Undecided,New]
19:17:55 devananda, er, that's invalid
19:17:56 let me update
19:18:01 thanks
19:18:39 devananda: https://bugs.launchpad.net/ironic/+bug/1280267 should have a map file
19:18:40 Launchpad bug 1280267 in ironic "pxelinux config files: absolute paths and symlinks" [Undecided,New]
19:19:25 yeah -- that's just a deployment detail handled in TripleO / devstack / etc
19:19:33 NobodyCam: does that need to be fixed for RC ?
19:19:38 https://github.com/openstack/tripleo-image-elements/blob/master/elements/ironic-conductor/install.d/68-ironic-tftp-support#L38-L39
19:19:53 ok I added it to the dib element
19:20:15 so devstack prob needs to create one
19:20:23 NobodyCam, we adjusted to match what TripleO was doing
19:21:05 the deploy-ironic element explicitly fetches its token- file from /tftpboot/token-$foo, so whoever sets up the tftpd needs to take that into account
19:21:12 adam_g: ya so you'll have a map file if you build with DIB
19:21:31 NobodyCam, yup, we just laid down a similar map with devstack
19:22:11 tho we were hitting problems fetching the token file /w tftpd, not the kernel or ramdisk, so i'm not sure about that bug
19:22:28 so is 1280267 a valid bug?
19:22:59 ya, maybe some more testing needed
19:23:19 yeah
19:23:27 we can discuss later
19:23:42 devananda: I would not tag that for RC just yet
19:23:47 k, untagging
19:24:09 or not tagging :)
19:24:14 :)
19:24:16 also sounds like you guys aren't sure it's even a bug
19:24:22 ya
19:24:32 that's almost tftp config
19:24:43 adam_g: can you update // mark incomplete ?
19:24:53 devananda, yeah
19:24:58 ty
19:25:00 :)
19:25:06 so we've still got a decent list of bugs to fix
19:25:25 let's plan on meeting 8am PST a couple days this week
19:25:32 to plough through them
19:25:33 when is the new review jam?
19:25:46 ideally, i'd like https://launchpad.net/ironic/+milestone/icehouse-rc1 all green by friday
19:26:01 :) I will be afk tomorrow from 8:15 pst to about 10 ish
19:26:02 *friday morning GMT
19:26:23 lucasagomes, romcheg1 - can you guys make 1500 GMT tmw?
19:26:39 devananda: I think I can
19:26:56 devananda: would non-cores be helpful at these jams?
19:27:00 jroll: yes
19:27:01 yeah I think it's fine for me as well
19:27:12 jroll, definitely
19:27:14 devananda: cool, I'll do my best to make it
19:27:26 jroll: we need a minimum of cores present; other folks are very helpful too, as long as we can all stay focused on landing code :)
19:27:37 meaning I may be closer to 8:15 PST for some of them :)
19:27:42 sure :)
19:27:46 devananda: I can reproduce the 'nova delete' does nothing issue btw
19:28:03 devananda: I haven't captured any traces yet (the default logging level appears fairly useless)
19:28:04 jroll: start times have not been exact in the past ;)
19:28:09 heh
19:28:21 lifeless: :(
19:28:21 I will have to leave the jam at 8:15
19:28:22 lifeless: bug # ?
19:28:34 NobodyCam: we'll probably have one wednesday morning too :)
19:28:46 devananda: I'll file one
19:28:56 :) that I can make :)
19:29:02 lifeless: thanks much
19:29:15 should we bump the topic to the next one?
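[Editorial note: for readers without tftp background, the "map file" discussed around 19:19 is a tftpd-hpa remap file, handed to the daemon via its map-file option, that rewrites client-requested paths onto the real tftp root so absolute requests like /tftpboot/token-$foo resolve correctly. The rule below is an illustrative example of the remap syntax, not copied from the tripleo element linked above.]

```
# tftpd-hpa remap file: 'r' rules rewrite the requested filename
# with a regex substitution before the file is looked up.
# Prefix bare (non-absolute) requests with the tftp root:
r ^([^/]) /tftpboot/\1
```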
19:29:16 ok - let's move on
19:29:18 heh
19:29:18 yes
19:29:22 #topic Ongoing integration & testing work
19:30:23 oh, so we've got an initial non-voting devstack gate check running, which gets us as far as failing tests on half the runs (issues when running on rackspace cloud still being worked out)
19:30:47 again, TripleO integration going well; lifeless jumped in late last week and got things on track for TripleO
19:31:29 anyone have anything on devstack?
19:31:34 adam_g: the implication from your statement is that it succeeds on half the runs
19:31:35 i've started brainstorming and putting together some functional testing for tempest - https://etherpad.openstack.org/p/IronicCI, but we'll likely be blocked on some broader tempest discussions that likely won't happen till post-release / summit
19:31:49 adam_g: if that's the case, we can ask infra to target it to just the working cloud
19:31:51 dtantsur: you hit a firewall issue this morning with fedora?
19:32:03 NobodyCam, yes, and a lot more :)
19:32:12 firewall issue was with port udp:69
19:32:19 adam_g: that is, if it looks like the issue with xen will be impractical to solve
19:32:53 devananda, yeah -- i think the xen issue can probably be worked around, another patch up for review now to help out with that. will ping infra about it later
19:32:55 dtantsur: was that from a fresh install?
19:33:02 I'm still keeping https://etherpad.openstack.org/p/jjWcLDThTK updated; will work on a patch for the firewall setting
19:33:03 adam_g: ah, great
19:33:18 dtantsur: awesome Thank You :)
19:33:25 NobodyCam, yes. I'm constantly reverting my devstack to a clean VM snapshot; the last time was today
19:33:39 dtantsur: i know that fedora/RH support is important to some folks, but again, I want to point out that IMO, it shouldn't be a priority for us right now
19:33:45 what is our CI roadmap for the rest of the cycle? is it a top priority still?
i'm not sure what we can accomplish right now other than the non-voting check /w failing tempest tests
19:33:57 dtantsur: simply because devstack isn't yet functional(ly tested) on those platforms anyway
19:34:11 dtantsur: so there's no way to ensure that, once you implement it, it doesn't get silently broken upstream again
19:34:33 devananda: is 12.04 (same as gate jobs) our supported Distro?
19:34:38 devananda, I understand, but I hope we're uncovering some broader issue (e.g. with firewall)
19:34:39 devananda, same applies to centos?
19:34:55 adam_g: goal is to get asymmetric tests running ASAP in general, not tied to the Icehouse release cycle
19:35:03 ifarkas: yes
19:35:22 dtantsur: it sounds like you are -- and I'm sure the folks working on broader support for devstack appreciate it :)
19:35:33 adam_g: has dwalleck been chatting with y'all about testing at all?
19:36:05 dtantsur: as do I, fwiw, just not as much of a priority for me until there's upstream support for testing it
19:36:06 jroll, yup, he's been working on getting a testable environment up himself
19:36:32 adam_g: awesome, he should be a big help in the near future
19:36:36 NobodyCam: 12.04 is what openstack CI tests on today. therefore it's what counts for devstack testing
19:37:09 devananda: ack :)
19:37:09 for those that might not have it, the context for why devstack/tempest is important even though we're not graduating in Icehouse
19:37:15 devananda, right -- just seems such testing taking place in tempest is not going to be an ASAP thing
19:37:27 is that the community will need us to have sufficient integration tests
19:37:31 in the gate
19:37:37 there are some broader issues that need resolving in tempest WRT non-default compute drivers
19:37:37 for "long enough" to build confidence
19:37:39 devananda, my current purpose is to collect and triage as many issues with Fedora as possible :) not trying to shift overall focus at any rate
19:37:58 adam_g: indeed there are ...
do you know if there's a session at the summit yet proposed for that?
19:38:19 adam_g: i spoke with clarkb ~1mo ago, and I think he might have been planning to propose something?
19:38:22 in the shorter term (pre-summit), i propose we lean on old devstack exercises to at least give ironic developers confidence, with tempest testing following later
19:38:30 adam_g: ++
19:38:42 adam_g: and it sounds like you can spin that up pretty easily
19:38:46 sdague: any objection to ^ ?
19:38:50 devananda, i need to touch base with #openstack-qa and find that out. i won't be at the summit to drive that, but i know it's a broader topic, esp. WRT defcore
19:39:00 devananda: I think sdague is driving it
19:39:32 devananda: what am I objecting to? :)
19:39:34 adam_g: hmmm, defcore is a future issue for Ironic, as it's not yet integrated
19:39:53 sdague: :)
19:40:09 sdague, we're just discussing where/how testing of a driver like ironic fits in tempest, in terms of defining and testing supported features
19:40:11 so, honestly, I wouldn't really bother with exercises, if you can get a simple scenario test instead
19:40:22 sdague: us leveraging devstack exercises in the short-term, while issues around tempest && nova-driver-differences get sorted out
19:40:22 it's fine to punt on API right now
19:40:40 but we basically don't run exercises many places, so they are fragile
19:40:48 and prone to breaking
19:41:12 just have a custom ironic job that skips the API tests for now
19:41:27 okay, i'll take an action for that ^
19:41:43 i have a WIP scenario test, just need a good way to run it without the API tests/scenarios that i know will fail
19:41:47 #action adam_g to implement a simple tempest scenario test, skipping the API part for now
19:41:50 adam_g: action ?
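[Editorial note: the "scenario test that skips the API tests" agreed above can be shaped roughly as below. The class name and the availability flag are invented for illustration; real tempest scenario tests subclass tempest's own base classes and read the flag from tempest.conf rather than a module constant.]

```python
import unittest

# Assumption: in real tempest this would be a service_available-style
# option in tempest.conf, not a module-level constant.
IRONIC_AVAILABLE = False


class BaremetalBasicOpsSketch(unittest.TestCase):
    """Rough shape of the scenario test discussed: drive a deploy
    through the nova API, and skip cleanly on setups where ironic is
    not the configured virt driver."""

    @unittest.skipUnless(IRONIC_AVAILABLE, "ironic is not the virt driver")
    def test_baremetal_server_ops(self):
        # Would: boot a server, wait for ACTIVE, verify, then delete.
        pass


# When ironic isn't available, the test is skipped rather than failed:
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(BaremetalBasicOpsSketch).run(result)
print("skipped:", len(result.skipped))  # -> skipped: 1
```

The design point is that skipping via a config-driven condition lets one "custom ironic job" reuse the shared suite without a fork.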
19:41:55 lol devananda beat me to it
19:42:25 I also wonder if eventually we really want to hit ironic through the nova api, or just hit the ironic api directly
19:42:34 sdague: need to hit it through nova
19:42:37 two reasons
19:42:53 we need to test the nova.virt.ironic driver -- that's an explicit requirement of our graduation
19:43:34 and we'd essentially have to rewrite all that driver code
19:43:37 oh that reminds me...
19:43:38 if we wanted to test without nova
19:43:40 which is silly
19:43:45 yeh, fair
19:44:09 we just can't implement the full set of libvirt functionality
19:44:32 hence the need to discuss what driver functionality can/should be tested && how to designate that in the test suite
19:44:48 sdague: we'll also probably run into timeout issues
19:44:56 eg, PXE booting nested virt is SLOOOWWWW :(
19:45:07 devananda, the timeouts are all configurable, should be okay there
19:45:09 so the way to handle that is to get the logic in lib/tempest to set the right enabled things based on the virt driver
19:45:23 ahh
19:45:51 I'm sure we'll talk more at the summit about all this
19:46:00 adam_g: I'm sad you won't make it
19:46:14 15 minutes left.. can we bump topic?
19:46:15 devananda, me too. first time since diablo
19:46:32 yep. let's move on. several things in the open discussion list
19:46:35 We still have agent architecture, and 15m will not be a lot of time for that.
19:46:40 #topic Open Discussion
19:46:42 adam_g: can't make it ?
19:46:58 JayF: the full depth of agent arch will not fit into a 1hr meeting anyway :p
19:46:59 open floor...
19:47:09 lifeless, unfortunately, no
19:47:12 tripleo bootstrapping
19:47:18 I'd like to get this unblocked
19:47:27 since an ironic seed with a nova-bm undercloud doesn't make a lot of sense
19:47:39 but I'd really like to bring this up and find out who's got what interests: ironic as a CMDB // the agent running without ironic at all.
19:47:46 ^ 19:47:55 I like to throw out there that we need to figure out how to run the nova driver tests... they are already out of date 19:48:16 devananda: that's coming from vkozhukalov afaik 19:48:22 NobodyCam, yeah :/ I don't think we want to enable that in our tree tho 19:48:22 I have an explicit interest in making sure the scope of the agent remains Ironic, as to avoid creating every agent possible/an agent framework rather than an Ironic Python Agent. 19:48:25 lifeless: I agree. but IIRC, the initial plan was to get ironic into undercloud and then figure out how to get it into seed. a lot may have changed tho 19:48:31 due the nova dependencies that would need to be installed anda ll 19:48:33 lifeless: so i'm ++ if there's a way to get it landed in both at once 19:48:54 cmdb is just a use case, nothing more, it is not critical 19:49:06 lucasagomes: we need something I, have landed patches that I know broke the current tests 19:49:07 jroll: devananda: i'm also interested in new python agent 19:49:13 JayF: i have a very strong interest -- and, practically speaking, a requirement -- to keep IPA <-> ir-cond bindigns very tight 19:49:14 devananda: so, seed works and undercloud fails, today, with my patchsets to incubator/images/templates. 19:49:17 agordeev2: I know :) 19:49:25 devananda: +1 19:49:29 devananda: the only reason undercloud fails is the bug I pointed you at about nova-compute startup. 19:49:35 vkozhukalov: are you ok with a non-pluggable heartbeat, then? 19:49:45 As a first note; the agent has been fully integrated into Openstack now. 19:49:48 #link http://git.openstack.org/cgit/openstack/ironic-python-agent/ 19:50:05 lucasagomes, NobodyCam - I think we can enable the noav.virt.ironic unit tets in our tree IFF we somehow clone the nova repo and make some symlinks 19:50:06 JayF: great! 19:50:07 NobodyCam, yeah... maybe updating the nova review with the new code and fixing the bugs? 19:50:18 lucasagomes: it's not pretty. 
perhaps someone with more unit test knowledge than I has a better way
19:50:25 Which means if you want to contribute, please follow the standard workflow; bugtracker/blueprints are shared with Ironic.
19:50:38 JayF: woot woot
19:50:53 devananda, yeah
19:51:00 worth trying :)
19:51:05 lifeless: right. so there's the related but won't-fix bug. But really what I see as the issue is
19:51:11 The team here will continue to have a laser focus on getting a working prototype by the summit. Do one thing before we try to do all the things :)
19:51:16 JayF, good stuff
19:51:19 clarkb: you happen to know how we could test a nova driver that's in the ironic repo?
19:51:29 lifeless: you're starting n-cpu + nova.virt.ironic before creating the required accounts in keystone
19:51:46 jroll: i prefer everything in the agent implemented in a pluggable manner
19:51:49 lifeless: so even if ir-api/ir-cond were running, it wouldn't matter
19:51:52 devananda: right, which we do for every other service, and only nova-compute w/ the Ironic driver has an issue
19:52:21 vkozhukalov: heartbeat will always go to ironic. I see no reason to put in the effort to make this pluggable.
19:52:30 lifeless: interesting. nova expects the hypervisor to provide some info during init
19:52:43 vkozhukalov: we are not building an agent framework, we are building an agent specifically for ironic
19:52:59 lifeless: there are several code paths later on in n-cpu's run that aren't going to be exercised if those things are not initialized
19:53:14 devananda: so I'm not suggesting that we don't run them, just defer it
19:53:24 devananda: rather than failing so fast upstart detects it
19:53:29 lifeless: cause n-cpu to block until it becomes available?
19:53:36 devananda: for instance, yes
19:53:44 just poll every 5 seconds, say
19:53:56 lifeless: that will render a huge backlog if any other nova service attempts to send RPC messages to n-cpu
19:53:58 cluster init will proceed asynchronously
19:54:21 devananda: yes, so you'd want to log a WARNING
19:54:41 'Cannot connect to Ironic <...>' or something
19:54:59 vkozhukalov: the ironic agent should not be designed in a way that does not assume it will talk to ironic
19:55:06 five (5) minute WARNING
19:55:06 vkozhukalov: which seems to be what you are proposing
19:55:19 doesn't nova currently do the same /w DB and MQ connections at startup?
19:55:30 adam_g: not meaningfully
19:56:33 jroll: devananda: ok, let's start from the tight version, not pluggable
19:56:49 another POV on this issue is that, from nova's perspective, the configured hypervisor is unavailable
19:57:00 thus nova-compute can't perform its job
19:57:21 so the etherpad you guys are using to track the agent decisions is https://etherpad.openstack.org/p/282Ocf7oXR?
19:57:32 https://etherpad.openstack.org/p/282Ocf7oXR
19:57:53 lucasagomes: discussion started here: https://etherpad.openstack.org/p/IronicPythonAgent
19:58:04 #link https://etherpad.openstack.org/p/282Ocf7oXR
19:58:08 An etherpad doesn't seem to be an awesome interface for the discussion to happen and decisions to be reached, though, honestly
19:58:10 ack, cheers jroll
19:58:12 the etherpad you linked is something I made as a draft of the new wiki page
19:58:15 time's almost up -- lifeless, let's continue in a bit. I want to solve this, just not sure how
19:58:18 devananda: I don't expect n-c to perform its job
19:58:20 #link https://etherpad.openstack.org/p/IronicPythonAgent
19:58:24 devananda: just to not bail and exit
19:58:30 and is my attempt at a cleaned-up version of the named etherpad
19:58:39 jroll: vkozhukalov: who added all the mentions of a CMDB to https://etherpad.openstack.org/p/IronicPythonAgent ?
19:58:45 not I
19:58:59 devananda: I believe those were there in the initial etherpad, which was put together by vkozhukalov iirc
19:59:04 I believe that was vkozhukalov but not 100% sure
19:59:04 *BEEP* One minute *BEEP*
19:59:09 JayF, heh indeed, but it's good to have some documentation somewhere
19:59:09 devananda: i added
19:59:16 can we continue in channel?
19:59:19 ^
19:59:24 I think we should take jroll's proposed wiki page, make it the wiki page, and put arch discussions in there/in a blueprint
19:59:25 vkozhukalov: ack. let's talk :)
19:59:46 thanks everyone! let's continue in channel - we've got a few good discussions going now
19:59:47 The etherpad is not a well organized or well attributed way to figure out what is good
19:59:54 thank you all for a great meeting
20:00:04 thanks
20:00:05 #endmeeting
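[Editorial footnote to the startup discussion around 19:53: lifeless's proposal -- don't let nova-compute exit when ir-api is unreachable, just poll every few seconds and log a WARNING -- could be sketched as below. All names, intervals, and parameters here are illustrative assumptions, not the fix that eventually landed in the driver.]

```python
import logging
import time

LOG = logging.getLogger(__name__)


def wait_for_ironic(get_client, interval=5.0, max_wait=None):
    """Poll until an ironic client can be obtained, instead of letting
    nova-compute die at startup.  Logs a WARNING per failed attempt so
    operators can see the service is waiting, not wedged.
    """
    waited = 0.0
    while True:
        try:
            return get_client()
        except Exception as exc:  # broad on purpose: any startup failure
            if max_wait is not None and waited >= max_wait:
                raise
            LOG.warning("Cannot connect to Ironic, retrying in %ss: %s",
                        interval, exc)
            time.sleep(interval)
            waited += interval
```

devananda's counterpoint in the log still applies to this shape: while the loop blocks, RPC messages sent to n-cpu back up, which is why the two of them deferred the design rather than settling it in-meeting.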