17:01:12 <johnthetubaguy> #startmeeting XenAPI
17:01:13 <openstack> Meeting started Wed Mar 6 17:01:12 2013 UTC. The chair is johnthetubaguy. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:01:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:01:16 <openstack> The meeting name has been set to 'xenapi'
17:01:22 <BobBall> Yay :)
17:01:34 <johnthetubaguy> hi everyone
17:02:02 <BobBall> Morning John
17:02:05 <BobBall> or afternoon
17:02:09 <BobBall> depending on where everyone is
17:02:25 <matelakat> hi
17:02:29 <johnthetubaguy> #topic actions from last meeting
17:02:38 <johnthetubaguy> So I had a few actions
17:03:04 <johnthetubaguy> #action johnthetubaguy needs to do the actions from last week
17:03:14 <BobBall> haha
17:03:16 <BobBall> nice action
17:03:29 <johnthetubaguy> been stuck with XCP on CentOS, so not really started on the CentOS install docs
17:03:32 <johnthetubaguy> hey ho
17:03:34 <matelakat> I guess it was a busy week.
17:03:53 * johnthetubaguy bangs head against wall
17:03:59 <johnthetubaguy> anyways...
17:04:07 <BobBall> What's the progress on XCP on CentOS?
17:04:16 <johnthetubaguy> #topic blueprints
17:04:38 <johnthetubaguy> we got a meeting on that Tuesday 17.00 UTC in #centos-devel
17:04:59 <johnthetubaguy> summary: broken with odd permissions errors, no one really can tell why
17:05:05 <BobBall> sounds fun
17:05:14 <johnthetubaguy> joyous
17:05:22 <johnthetubaguy> so… blueprints?
17:05:36 <johnthetubaguy> just a call to look at the etherpad for the summit and add things
17:05:41 <BobBall> oh
17:05:43 <johnthetubaguy> we added the odd bit last time
17:05:48 <BobBall> I forgot to add stuff
17:05:52 <BobBall> lemme have a quick look
17:06:05 <johnthetubaguy> #link https://etherpad.openstack.org/HavanaXenAPIRoadmap
17:06:11 <BobBall> yeah
17:06:14 <BobBall> just got it from the web page
17:06:17 <BobBall> my bad, sorry
17:06:27 <johnthetubaguy> np
17:06:34 <BobBall> okay
17:06:46 <BobBall> there is one key thing that isn't there
17:06:55 <BobBall> or if it is then I can't see...
17:07:01 <BobBall> is quantum support for XS
17:07:06 <BobBall> didn't make grizzly-3
17:07:13 <johnthetubaguy> it's a nova session though
17:07:26 <BobBall> so it should definitely be on the roadmap even if we don't need a new blueprint
17:07:28 <BobBall> ahhhhh
17:07:33 <johnthetubaguy> it was covered in the last summit
17:07:33 <BobBall> didn't spot that from etherpad!
17:07:43 <johnthetubaguy> implementation agreed, code pushed for review
17:07:48 <BobBall> *nod*
17:07:53 <johnthetubaguy> but only one quantum core ever reviewed
17:08:03 <johnthetubaguy> just to document / share
17:08:13 <BobBall> Tis a shame it didn't make the grizzly cut!
17:08:41 <johnthetubaguy> indeed, we half planned a backport to folsom
17:09:06 <johnthetubaguy> but no one will review it, but hopefully that will change now
17:09:19 <johnthetubaguy> so anything else?
17:09:25 <johnthetubaguy> blueprint wise
17:09:53 <BobBall> not from my end
17:10:01 <johnthetubaguy> #topic docs
17:10:04 <johnthetubaguy> any news?
17:10:09 <johnthetubaguy> I didn't do my things
17:10:21 <johnthetubaguy> there was a note on the mailing list about live-migration docs
17:10:28 <BobBall> but I guess there will be an AOB for adding new features to the roadmap?
17:10:32 <johnthetubaguy> they look fairly poor, might need to expand them
17:10:50 <johnthetubaguy> sure, we can do
17:10:57 <BobBall> we've got a plan to look at some docs - Mate has an action this week to go through and look at what's there
17:11:03 <johnthetubaguy> cool
17:11:20 <johnthetubaguy> #action matelakat to look at state of XenAPI docs and report back next week
17:11:23 <BobBall> there we go :D
17:11:29 <BobBall> I was going to say can you add an #action
17:11:41 <johnthetubaguy> catch me on IRC if there are questions
17:11:51 <matelakat> #link https://github.com/citrix-openstack/bugstat/blob/master/bugreport/main_report.md#openstack-manuals----20
17:12:13 <johnthetubaguy> cool
17:12:36 <matelakat> A lot to do...
17:13:06 <johnthetubaguy> some of those don't affect XenAPI
17:13:23 <matelakat> it's just a dumb search
17:13:31 <johnthetubaguy> sure, no worries
17:13:38 <matelakat> I will look at them, and put them into categories.
17:13:42 <BobBall> e.g. https://bugs.launchpad.net/openstack-manuals/+bug/1095095
17:13:43 <uvirtbot> Launchpad bug 1095095 in openstack-manuals "Configuring for resize with KVM" [Medium,Confirmed]
17:13:59 <BobBall> just says KVM docs aren't as good as XenServer for resize
17:14:19 <matelakat> So it includes the string "XenServer"
17:14:22 <BobBall> indeed
17:14:35 <BobBall> but only in the context of "XenServer doesn't have this bug" :)
17:15:07 <johnthetubaguy> it's probably worth manually adding xenserver tags, and having a tag only search
17:15:22 <johnthetubaguy> anyways, let's move on
17:15:36 <johnthetubaguy> any more for any more?
17:15:38 <BobBall> I'd like to keep the dumb search, but use the tagged search to say "this has been triaged by someone who knows it's a XS bug"
17:15:47 <johnthetubaguy> +1
17:16:07 <johnthetubaguy> that is what I meant
17:16:34 <BobBall> ah right
17:16:50 <johnthetubaguy> #topic QA and Bugs
17:16:59 <johnthetubaguy> anything major worrying people?
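[Editor's sketch] The triage idea agreed at the end of the docs topic (keep the broad string search, and add a tag-only search for bugs someone has confirmed as XenServer issues) could look roughly like the following. This is only an illustration under stated assumptions: the `Bug` record, the function names and the sample data are hypothetical and are not taken from the bugstat tool or from the Launchpad API. The description text of the first sample bug is a paraphrase of the discussion, not the real bug text.

```python
from dataclasses import dataclass, field


@dataclass
class Bug:
    """Hypothetical, minimal bug record (not the Launchpad data model)."""
    id: int
    title: str
    description: str = ""
    tags: set = field(default_factory=set)


def keyword_search(bugs, keyword="xenserver"):
    """Broad ("dumb") search: any bug whose title or description mentions the keyword."""
    needle = keyword.lower()
    return [b for b in bugs if needle in f"{b.title} {b.description}".lower()]


def tagged_search(bugs, tag="xenserver"):
    """Tag-only search: bugs that someone has explicitly tagged during triage."""
    return [b for b in bugs if tag in b.tags]


def untriaged(bugs, keyword="xenserver", tag="xenserver"):
    """Keyword hits that carry no tag yet, i.e. the list still waiting for a human."""
    tagged_ids = {b.id for b in tagged_search(bugs, tag)}
    return [b for b in keyword_search(bugs, keyword) if b.id not in tagged_ids]


# Sample data. Bug 1095095 and its title come from the log; everything else is invented.
SAMPLE_BUGS = [
    Bug(1095095, "Configuring for resize with KVM",
        "KVM resize docs are thinner than the XenServer ones (paraphrase)."),
    Bug(2, "Hypothetical XenServer install doc gap", tags={"xenserver"}),
    Bug(3, "Hypothetical unrelated bug"),
]
```

With this split, `keyword_search` keeps surfacing false positives such as the KVM resize bug, `tagged_search` returns only confirmed XenServer bugs, and `untriaged` gives the remaining keyword hits that nobody has looked at yet.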
17:17:11 <johnthetubaguy> matelakat have you got the link to your bug finder?
17:17:26 <BobBall> I guess one thing that surprised me is that devstack multihost doesn't seem to be tested by anyone else
17:17:49 <matelakat> #link https://github.com/citrix-openstack/bugstat
17:18:01 <guitarzan> we'd like to get some eyes on this: https://review.openstack.org/#/c/23662/
17:18:41 <BobBall> *has a butchers*
17:20:32 <BobBall> That one's an interesting issue!
17:20:42 <guitarzan> and a little painful :)
17:21:13 <guitarzan> I'm going to try it out
17:21:23 <BobBall> So has the whole SR gone away?
17:21:34 <guitarzan> hopefully with the patch, yes
17:21:49 <BobBall> ahhh - this is an iSCSI SR?
17:21:52 <guitarzan> yes
17:21:53 <johnthetubaguy> I guess the point is, if your iSCSI target dies, then VM will not start
17:22:00 <guitarzan> johnthetubaguy: exactly
17:22:04 <BobBall> yup - not surprising.
17:22:05 <BobBall> okay
17:22:05 <BobBall> got it
17:22:13 <BobBall> I was getting a little confused
17:22:45 <guitarzan> I'm not sure what happens in the other HVs case
17:22:59 <guitarzan> but I also don't have to worry about that case
17:23:07 <BobBall> *grin*
17:23:34 <BobBall> I'll have to have a think about this one
17:23:56 <johnthetubaguy> yeah, sounds like an excessive timeout, would be nice to be able to specify that in the check call
17:24:12 <guitarzan> s1rp and I talked about it quite a bit, and this seems to be the best we could come up with on short notice
17:24:18 <guitarzan> mad props to him for making it work
17:24:23 <BobBall> you mean the XAPI timeout?
17:24:55 <johnthetubaguy> erm, timeout in the xapi operation
17:25:06 <BobBall> I guess this is currently a critical issue for you guys?
17:25:12 <guitarzan> well, it's an ugly one
17:25:20 <guitarzan> requires ops to go in and nuke the SR
17:25:28 <guitarzan> it doesn't happen often
17:25:35 <johnthetubaguy> maybe some kind of health check would be better, with tunable timeout
17:25:59 <guitarzan> it should only happen if something happens to the network or we lose a storage node
17:26:01 <johnthetubaguy> I like scan SR because it should be quick in the working cases
17:26:01 <BobBall> so XS doesn't timeout the SR?
17:26:36 <johnthetubaguy> oh, I see you only call that in error cases
17:26:55 <guitarzan> johnthetubaguy: yeah, we don't do anything unless it doesn't boot
17:27:14 <johnthetubaguy> makes sense
17:27:31 <BobBall> how long ago had the SR gone away?
17:27:34 <BobBall> or had it only just gone?
17:27:55 <johnthetubaguy> shame we can't have a non destructive error case again, do we need to tell cinder we detached the volume?
17:27:57 <guitarzan> the sr is still there, it just can't make the iscsi connection
17:28:10 <guitarzan> johnthetubaguy: compute manager does that
17:28:14 <BobBall> Also, could you just post the XS error log to the bug so that we've got a traceback
17:28:28 <guitarzan> the fun part was propagating the bad devices back up to compute
17:28:38 <BobBall> *grin* that does look fun
17:28:43 <johnthetubaguy> ah, got ya, didn't get there yet
17:29:11 <guitarzan> BobBall: I'll try to remember to paste a stack
17:29:28 <BobBall> this _handle_bad_volumes_detached case?
17:29:53 <guitarzan> well, I'll grab the xen log from the failed boot
17:30:04 <BobBall> that's perfect
17:30:36 <johnthetubaguy> pull out the network cable between your iscsi target and hypervisor, it should repro OK
17:30:41 * BobBall is impressed with this one
17:30:46 <BobBall> I like that bug
17:30:48 <guitarzan> glad you like it
17:30:56 <BobBall> is there a Bug Of The Month award?
17:31:01 <guitarzan> we were hoping XS would boot without all the volumes, but alas
17:31:08 <BobBall> yup
17:31:21 <BobBall> well we might also be able to patch ISCSISR.py to do something
17:31:22 <BobBall> not sure
17:31:24 <johnthetubaguy> good old xapi trying to protect us from doing bad things again
17:31:26 <BobBall> depends how the SR is failing
17:31:35 <BobBall> unlikely to be XAPI
17:31:52 <johnthetubaguy> oh, OK
17:32:02 <BobBall> is it the vm start that fails? I'm almost surprised the shutdown works ok if the SR is timing out
17:32:17 <johnthetubaguy> it probably got shut down before that right?
17:32:22 <guitarzan> I'm not sure
17:32:28 <johnthetubaguy> or this is the first start?
17:32:30 <guitarzan> it's a reboot, so it wasn't really shut down
17:32:35 <johnthetubaguy> ah
17:33:00 * johnthetubaguy remembers bug report…drrr
17:33:08 <BobBall> ok well might I suggest that John, you and I take an action to look at it?
17:33:26 <BobBall> I see you've already added yourself!
17:33:30 <BobBall> hah :)
17:33:43 <johnthetubaguy> make sure the SR is behaving correctly, for the "graceful" fix
17:34:09 <BobBall> guitarzan, do you happen to know if this is a soft or hard reboot?
17:34:49 <johnthetubaguy> #action johnthetubaguy guitarzan to look into broken SR issues https://review.openstack.org/#/c/23662
17:35:19 <guitarzan> BobBall: not sure
17:35:24 <guitarzan> I'll try both
17:35:31 <johnthetubaguy> maybe hard because soft failed...
17:35:33 <johnthetubaguy> cool
17:35:39 <BobBall> probably both tbh
17:35:47 <guitarzan> that's my guess
17:35:58 <johnthetubaguy> cool
17:36:02 <BobBall> *not sure if XAPI handles the SRs differently for the two cases*
17:36:05 <BobBall> Anyway - let's move on :)
17:36:08 <johnthetubaguy> indeed
17:36:13 <johnthetubaguy> any more bugs?
17:36:33 <johnthetubaguy> me guessing that is a no...
17:36:54 <johnthetubaguy> #topic Open Discussion
17:37:07 <johnthetubaguy> so, bobball has a few things?
17:37:22 <s1rp> ohai guys...
17:37:34 <johnthetubaguy> hey
17:37:34 <guitarzan> we were just talking about you
17:37:37 <s1rp> yeah the clean_reboot operation hangs for 120 secs...
17:37:50 <s1rp> luckily a subsequent SR.scan seems to be quick-ish
17:37:56 <BobBall> ahhhhh
17:37:59 <s1rp> only the first one after unplugging seems to be slow
17:38:11 <s1rp> it's like it stores some data somewhere marking it as failed (?)
17:38:12 <guitarzan> BobBall: I haven't tried that iscsi patch you sent me yet
17:38:18 <johnthetubaguy> that figures, cool
17:38:57 <BobBall> s1rp, I thought it was the SR scan that waited 120 seconds
17:39:14 <BobBall> failing fast in clean_reboot is likely to be a XAPI thing waiting for the SR to respond to its attach request
17:39:15 <s1rp> that too... lemme clarify
17:39:32 <BobBall> sorry john - we're derailing the agenda :D
17:39:38 <s1rp> so if you do an sr-scan w/o a reboot, then that call will take 120 secs (this is what i was doing on the command line to troubleshoot this)
17:40:03 <johnthetubaguy> it's OK, it's important
17:40:07 <s1rp> but, and i'm not 100% sure on this, but if you do a clean_reboot, that will cause an underlying timeout, but i *think* the next SR.scan will actually fail-fast
17:40:08 <BobBall> ah - but failed reboot followed by sr-scan to find the failing device is fast
17:40:27 <s1rp> BobBall: yeah, need to triple check that case, but i believe so
17:40:28 <BobBall> unfortunately that might mean the timeout is in iscsiadm ?
17:40:37 <BobBall> ... or fortunately :)
17:40:43 <BobBall> that might be easy to fix
17:40:50 <johnthetubaguy> right, hack the RD
17:40:54 <johnthetubaguy> lol SR
17:41:01 <BobBall> or just an other-config
17:41:07 <BobBall> I think we can pass some iscsiadm flags through
17:41:09 <johnthetubaguy> even better
17:41:18 <BobBall> not 100% on that though. Maybe only 73% sure.
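[Editor's sketch] The "health check with tunable timeout" suggested earlier in this discussion could be approximated on the caller's side as below. It is a generic wrapper only: `probe` and the `check` callable are hypothetical names, and nothing here is the nova XenAPI driver code or a real XAPI call. In practice `check` would wrap whatever operation is suspected of hanging (for example an SR scan); that wiring is an assumption and is not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def probe(check, timeout_s):
    """Run check() in a worker thread and classify the outcome.

    Returns "ok" if check() returns within timeout_s seconds,
    "timeout" if it does not, and "failed" if it raises.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(check)
    try:
        future.result(timeout=timeout_s)
        return "ok"
    except FutureTimeout:
        # The caller stops waiting, but the worker thread is not cancelled:
        # the underlying call still runs until it returns on its own.
        return "timeout"
    except Exception:
        return "failed"
    finally:
        # Do not block here waiting for a possibly hung worker.
        pool.shutdown(wait=False)
```

A caller would use something like `probe(scan_sr, timeout_s=10)`, where `scan_sr` is a hypothetical zero-argument function. The limitation is the one noted in the comment: this bounds how long the caller waits, not how long the underlying operation takes, so it does not replace shortening the timeout at its source (iscsiadm or the SR configuration, as discussed above).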
17:41:54 <BobBall> btw john, my stuff on libvirt can wait until next week if we have other things to get through :)
17:42:14 <johnthetubaguy> BobBall: thanks
17:43:19 <BobBall> s1rp, Was saying to guitarzan that we'd like some of the XS logs in the bug report just for traceability if that's ok
17:44:35 <s1rp> BobBall: cool, we can get those over to you; luckily this is very easy to replicate!
17:44:55 <johnthetubaguy> sounds good, any more on that one?
17:45:26 <BobBall> no :) Let's leave that one for now
17:45:27 <johnthetubaguy> can always take it to the ML
17:45:44 <johnthetubaguy> cool, bobball summit stuff you wanted to mention?
17:46:18 <BobBall> Uhhhh maybe? I don't remember which summit stuff you're referring to?
17:47:12 <johnthetubaguy> ok, misunderstood
17:47:24 <johnthetubaguy> put stuff on the etherpad to help discuss at the summit
17:47:47 <BobBall> Sorry - I could have been clearer! :)
17:47:51 <johnthetubaguy> assuming that session goes ahead, if there is loads, might ask for extra sessions
17:48:04 <BobBall> Summit stuff then - looking forward to it. matelakat and I have booked our flights so we'll see you there
17:48:24 <johnthetubaguy> sounds like a Xen on libvirt vs XenAPI discussion might be good, as long as it stays sensible and not too religious
17:48:50 <johnthetubaguy> I was thinking in the summit
17:48:57 <johnthetubaguy> but bob you wanted to bring that up this weel?
17:49:00 <johnthetubaguy> week?
17:49:19 <BobBall> Well what I'd like to understand is what the primary value is that the XenAPI integration is using from XAPI that can't be provided by libvirt
17:50:22 <BobBall> not sure if we've got enough time to explore that question properly today
17:50:24 <BobBall> which is fine :)
17:50:58 <s1rp> we do a lot of weird stuff w/ dom0 plugins, but that probably could be handled with proper hooks in the libvirt layer
17:51:37 <johnthetubaguy> s1rp: we can't use both today, we would freak out xapi
17:51:44 <johnthetubaguy> but yes, I get your point
17:52:13 <johnthetubaguy> I think the question is, should we evolve XAPI/XCP or should we evolve libvirt
17:52:30 <johnthetubaguy> and how much effort is each approach, at this point
17:52:50 <johnthetubaguy> I guess what is missing between libvirt+Xen vs xapi+Xen in openstack today
17:53:19 <BobBall> Well the question is more that there are lots of things that are getting first-dibs in libvirt and whether a libvirt-on-xen/xapi hybrid approach would bring us much and what level of pain it would be for XAPI to tolerate such a hybrid approach
17:53:22 <johnthetubaguy> I get the idea that gap could be quite small, but I never got libvirt+Xen working that well, but didn't try very hard
17:53:47 <johnthetubaguy> hmm, maybe
17:54:02 <johnthetubaguy> but that is like how many years out?
17:54:29 <BobBall> It's not this week, that's true
17:54:30 <johnthetubaguy> well, maybe that fits into evolving XAPI actually...
17:54:49 <johnthetubaguy> I wondered about using xenopsd instead
17:55:26 <johnthetubaguy> that's the thing under xapi
17:56:09 <johnthetubaguy> anyways, we can take this offline
17:56:13 <johnthetubaguy> anything else?
17:57:22 <johnthetubaguy> cool
17:57:29 <johnthetubaguy> #endmeeting