15:01:11 <johnthetubaguy> #startmeeting XenAPI
15:01:12 <openstack> Meeting started Wed Dec 4 15:01:11 2013 UTC and is due to finish in 60 minutes. The chair is johnthetubaguy. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:16 <johnthetubaguy> Hi everyone
15:01:17 <openstack> The meeting name has been set to 'xenapi'
15:01:19 <matel> hi
15:01:24 <BobBall> hi
15:01:25 <johnthetubaguy> who is around for today's meeting?
15:01:34 <thouveng> hi
15:01:42 <matel> I'm here
15:01:47 <johnthetubaguy> cool
15:01:51 <johnthetubaguy> so let's get cracking
15:02:05 <johnthetubaguy> #link https://wiki.openstack.org/wiki/Meetings/XenAPI
15:02:17 <johnthetubaguy> #topic Blueprints
15:02:32 <johnthetubaguy> Icehouse-1 is closing, and Icehouse-2 is starting
15:02:42 <johnthetubaguy> anyone got any worries or plans with that?
15:03:00 <johnthetubaguy> any blueprints we need to get in during Icehouse-2?
15:03:01 <BobBall> nope - but there is a BP that thouveng will want to talk about for I-2
15:03:11 <pvo> o/
15:03:13 <BobBall> first - hi thouveng!
15:03:24 <thouveng> hello everybody
15:03:30 <thouveng> hi BobBall
15:03:32 <johnthetubaguy> thouveng: hello! feel free to discuss your blueprint, and give a quick intro
15:03:44 <thouveng> now?
15:03:48 <johnthetubaguy> sure
15:03:53 <thouveng> ok
15:03:56 <BobBall> Just as a brief intro - thouveng is from bull.net, who are interested in PCI passthrough and potentially vGPU in future
15:04:19 <thouveng> So first, here is the link: https://blueprints.launchpad.net/nova/+spec/pci-passthrough-xenapi
15:04:27 <johnthetubaguy> ah, cool
15:04:42 <thouveng> The goal of this bp is to add support for PCI passthrough in the xenapi driver
15:05:07 <thouveng> I see two tasks. First, add support for updating the status of the compute host
15:05:17 <thouveng> and second, add the mechanism to attach a PCI device to a VM when booting.
15:05:56 <BobBall> What do you mean by status of the host? The PCI devices that it has available?
15:06:02 <johnthetubaguy> sounds good, I have added them as work items in the blueprint
15:06:20 <thouveng> The PCI devices that are available for PCI passthrough
15:06:52 <BobBall> So what's the process for getting this prioritised / cores assigned, johnthetubaguy?
15:07:05 <johnthetubaguy> does the current structure in Nova look OK for wiring up to XenAPI?
15:07:25 <johnthetubaguy> BobBall: just target it to a milestone, and it should get reviewed, I am kinda doing that as we type
15:07:44 <thouveng> johnthetubaguy: Yes, it seems to have all the needed wires
15:08:15 <johnthetubaguy> cool, I will add a note in the blueprint saying we expect no new configuration settings, does that seem fair?
15:09:22 <thouveng> Yes. For the configuration I have some doubts, but I think I only need to catch PCI devices that are passed on the dom0 command line when booting the host.
15:09:38 <thouveng> So I think that we don't need extra configuration.
15:09:53 <johnthetubaguy> thouveng: how are you going to wire up with xenapi, does it need a plugin?
15:10:19 <thouveng> johnthetubaguy: Yes, exactly.
15:10:22 <johnthetubaguy> thouveng: that sounds right to me, the filtering and grouping that is getting added in nova can be wired up once it drops
15:10:49 <johnthetubaguy> thouveng: cool, it would be good to get these details in the blueprint, helps set expectations, and review the direction the implementation will take.
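For reference, the first work item discussed above (the compute host reporting which PCI devices are free for passthrough) might look roughly like the sketch below on the driver side. This is an illustration only: the helper name, the plugin function it calls, and the exact dict keys are assumptions, not the confirmed schema nova's PCI resource tracker expects or the code the blueprint will actually deliver.

    import json

    def get_pci_passthrough_devices(session):
        """Sketch: list PCI devices in dom0 that are available for passthrough.

        'session' is assumed to be a nova xenapi session wrapper; the plugin
        function name 'get_pci_device_details' is hypothetical and is assumed
        to return a JSON string describing devices found in dom0.
        """
        raw = session.call_plugin('xenhost', 'get_pci_device_details', {})
        devices = []
        for dev in json.loads(raw):
            devices.append({
                'address': dev['address'],        # e.g. '0000:04:00.0'
                'vendor_id': dev['vendor_id'],
                'product_id': dev['product_id'],
                'dev_type': 'type-PCI',
                'label': 'label_%s_%s' % (dev['vendor_id'], dev['product_id']),
            })
        # Whether nova wants this as a list or a JSON string, and the exact
        # field names, would need checking against the PCI tracker code.
        return devices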
15:10:55 <johnthetubaguy> let me just add that in
15:10:56 <thouveng> I should be able to add the function into an existing plugin. So I just need to upgrade the plugin version.
15:11:36 <johnthetubaguy> BobBall: does this really need a plugin, are we missing some XenAPI functions for this stuff?
15:11:51 <BobBall> The issue is we need to list the PCI devices on the host
15:12:04 <BobBall> of course anything on the host can be exposed through XAPI - but XAPI doesn't do that ATM
15:12:17 <johnthetubaguy> right, I guess that was it, that's OK
15:12:23 <BobBall> so any use of a plugin could be considered "missing functionality" from XAPI
15:12:52 <BobBall> In particular thouveng is going to look at the boot command line and/or the output of lspci -Dv to see which modules are loaded for the devices
15:13:02 <BobBall> To me that feels too specific to be exposed through XAPI
15:13:05 <BobBall> and that's what plugins are for
15:13:14 <johnthetubaguy> yeah, probably
15:13:25 <johnthetubaguy> sounds good
15:13:28 <BobBall> thouveng isn't proposing a plugin that _modifies_ state from dom0 - just reads it
15:13:31 <johnthetubaguy> … final question
15:13:36 <BobBall> all the modification can be done through XAPI
15:14:09 <thouveng> agree
15:14:15 <johnthetubaguy> BobBall: cool, that's what I remember from looking at this, I have added it to the blueprint
15:14:19 <johnthetubaguy> so that question
15:14:33 <johnthetubaguy> do you think the code will be ready during Icehouse-2?
15:14:52 <thouveng> Sorry, I don't remember the date for I-2?
15:15:05 <BobBall> 23rd Jan
15:15:22 <johnthetubaguy> yeah, so it needs to be up for review early Jan
15:15:38 <BobBall> or 23th Jan if you want to be pedantic and copy the schedule
15:15:43 <BobBall> #link https://wiki.openstack.org/wiki/Icehouse_Release_Schedule
15:15:46 <johnthetubaguy> we can always move things if we have to, but sooner is better
15:16:02 <leif> Hey, quick question: since this is piggy-backing libvirt support, does libvirt handle hot-plugging?
15:16:10 <thouveng> The first part about updating host status will be available.
15:16:27 <johnthetubaguy> OK, well that's enough to say I-2 for now, and see how it goes
15:16:46 <johnthetubaguy> leif: I kinda assume this is attach at boot, not hot-plug right now, but thouveng?
15:16:48 <thouveng> I hope that the part that attaches the PCI device will be ready too, but I haven't done that much on that part.
15:16:48 <BobBall> think so leif - but I might be missing something from the code. don't know to be honest.
15:17:11 <thouveng> johnthetubaguy: attach at boot, yes. No hotplug.
15:17:22 <johnthetubaguy> I think it's just at boot via the flavour right now, so that's all cool
15:17:40 <thouveng> johnthetubaguy: exactly
15:17:46 <leif> Yeah, no problem with level. Was asking since hotplug is supported on libvirt, what do we return back?
15:18:00 <leif> "level" - only support at boot.
15:18:23 <BobBall> no idea leif
15:18:24 <leif> Wanted to know if this is an immediate error or post launch error.
15:18:34 <johnthetubaguy> well, should just be extra wiring up, it's the reporting status I have more worries about
15:18:35 <leif> no problem, just asking at this point.
15:18:50 <johnthetubaguy> immediate error?
15:19:06 <BobBall> Actually, looking at the code - I might be mistaken
15:19:09 <BobBall> https://review.openstack.org/#/c/39891/40/nova/virt/libvirt/driver.py
15:19:12 <BobBall> at least in that changeset
15:19:33 <BobBall> the "hotplugging" only happens during VM lifecycle operations (e.g. reboot, suspend, snapshot)
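The read-only dom0 plugin described above might, very roughly, parse lspci output along the following lines. Everything in the sketch is an assumption made for illustration: the function name, the use of lspci's machine-readable mode rather than the plain -Dv mentioned in the discussion, and the idea of spotting passthrough candidates by the driver bound to them. The real plugin code is whatever lands with the blueprint.

    # Hypothetical dom0-side helper: list PCI devices and the driver bound to
    # each one by parsing machine-readable lspci output. Illustrative only.
    import subprocess

    def list_pci_devices():
        """Return a list of dicts, one per PCI device visible in dom0."""
        # -D: include the PCI domain in the slot; -vmmk: verbose, machine-
        # readable records that include the kernel driver in use.
        output = subprocess.check_output(['lspci', '-D', '-vmmk'])
        devices = []
        for record in output.decode().strip().split('\n\n'):
            fields = {}
            for line in record.splitlines():
                key, _, value = line.partition(':')
                fields[key.strip()] = value.strip()
            devices.append({
                'address': fields.get('Slot'),   # e.g. '0000:04:00.0'
                'vendor': fields.get('Vendor'),
                'device': fields.get('Device'),
                'driver': fields.get('Driver'),  # e.g. 'pciback' if hidden from dom0 (assumption)
            })
        return devices

    if __name__ == '__main__':
        for dev in list_pci_devices():
            print(dev)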
15:19:47 <BobBall> so might just be reqs from libvirt for a reboot to work etc
15:19:53 <johnthetubaguy> yeah, there is no "attach PCI device" api call I know about
15:20:00 <johnthetubaguy> but there could be I suppose
15:20:19 <johnthetubaguy> anyways, let's not get sidetracked
15:20:25 <johnthetubaguy> blueprint approved
15:20:26 <leif> agreed. :-)
15:20:45 <BobBall> w00t. So - john - how do we get another core signed up?
15:20:49 <thouveng> :)
15:21:06 <johnthetubaguy> BobBall: you ask them, or you get me to ask them
15:21:29 <johnthetubaguy> but to be honest, just see who is interested and adds their name in the first instance
15:21:32 <BobBall> heh :) I thought this was now a managed process
15:21:54 <johnthetubaguy> well, it's crowd sourced
15:22:02 <BobBall> Ah - if it's low priority then that's one thing... I assumed this would be a medium blueprint so needed two cores to sign up before approval or something
15:22:07 <johnthetubaguy> I see blueprints I want, so I sign up to do them
15:22:14 <johnthetubaguy> or if I see xenapi ones, I sign up
15:22:32 <johnthetubaguy> it's low because there are not two cores signed up
15:22:42 <johnthetubaguy> it can be promoted if another core signs up
15:23:01 <BobBall> ok
15:23:01 <johnthetubaguy> it's nice and loose
15:23:05 <johnthetubaguy> anyways
15:23:23 <johnthetubaguy> awesome stuff, I remember talking about this in San Diego, and never getting time to do it :D
15:23:37 <johnthetubaguy> let's move on..
15:23:46 <johnthetubaguy> #topic Docs
15:23:58 <johnthetubaguy> anyone got any doc things?
15:24:09 <BobBall> no doc fun this week
15:26:11 <BobBall> john?
15:26:55 <matel> oh
15:26:59 <johnthetubaguy> yeah, sorry
15:26:59 <matel> network issues?
15:27:08 <johnthetubaguy> #topic Bugs & QA
15:27:14 <johnthetubaguy> how is the tempest work going?
15:27:21 * johnthetubaguy looks at matel
15:27:25 <BobBall> matel first?
15:27:32 <matel> Okay.
15:27:57 <matel> So I am working on scripts to create a cloud-ready XenServer
15:28:13 <matel> #link https://github.com/citrix-openstack/xenapi-in-the-cloud
15:28:36 <johnthetubaguy> is that "working" now?
15:29:01 <matel> it's working, but I am speeding it up - so that we have an image that could just be launched.
15:29:07 <johnthetubaguy> … and have we done a tempest run in that cloud-based XenServer yet?
15:29:11 <BobBall> Thanks to antonym - tap devices were fixed :)
15:29:15 <matel> Yes, we did smoke.
15:29:24 <johnthetubaguy> how "fast" was smoke?
15:29:28 <johnthetubaguy> and did it work?
15:29:39 <matel> Ran 223 tests in 624.476s
15:29:59 <matel> tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern.test_volume_boot_pattern[compute,image,volume] 164.053
15:29:59 <matel> tempest.scenario.test_snapshot_pattern.TestSnapshotPattern.test_snapshot_pattern[compute,image,network] 98.140
15:29:59 <matel> tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario[compute,image,network,volume] 91.124
15:29:59 <matel> tempest.api.compute.servers.test_server_rescue.ServerRescueTestJSON.test_rescue_unrescue_instance[gate,smoke] 38.828
15:30:04 <matel> tempest.scenario.test_server_basic_ops.TestServerBasicOps.test_server_basicops[compute,network] 36.813
15:30:07 <matel> tempest.thirdparty.boto.test_ec2_volumes.EC2VolumesTest.test_create_volume_from_snapshot[gate,smoke] 33.709
15:30:10 <matel> tempest.api.compute.v3.servers.test_server_actions.ServerActionsV3TestXML.test_rebuild_server[gate,smoke] 31.742
15:30:13 <matel> tempest.api.compute.servers.test_server_rescue.ServerRescueTestXML.test_rescue_unrescue_instance[gate,smoke] 23.964
15:30:14 <johnthetubaguy> hmm, OK, that could be worse
15:30:16 <matel> tempest.scenario.test_dashboard_basic_ops.TestDashboardBasicOps.test_basic_scenario[dashboard] 23.038
15:30:19 <matel> tempest.api.compute.v3.servers.test_server_actions.ServerActionsV3TestJSON.test_rebuild_server[gate,smoke] 18.980
15:30:22 <matel> Some really slow ones.
15:30:24 <matel> So that's smoke.
15:30:30 <matel> No full runs yet
15:30:39 <johnthetubaguy> any errors with smoke?
15:30:42 <matel> Bob has fixes to make the test runs more stable.
15:30:42 <johnthetubaguy> I guess not?
15:30:49 <matel> They were a bit unstable.
15:30:53 <johnthetubaguy> hmm, OK
15:30:58 <matel> I'll pass it to Bob.
15:31:02 <johnthetubaguy> so can we wire this up into Zuul now?
15:31:08 <matel> As he found the reason.
15:31:28 <BobBall> I've found some really fun errors in full tempest...
15:31:30 <matel> I would need someone who would help wire it up to that black box.
15:31:51 <BobBall> Ranging from lack of memory leading to compute memory fragmentation leading to VBD plug failures which then clearly fails the tempest test
15:31:52 <johnthetubaguy> OK, so have you seen the docs and other changes where people added tests? I can help with that
15:31:57 <matel> I would say we are getting closer to wiring it up, but it's measured in weeks...
15:32:14 <johnthetubaguy> why weeks?
15:32:37 <matel> Because there is work to be done, and work needs time.
15:32:54 <BobBall> Lots of unknowns still in exactly how to integrate with the existing zuul stuff
15:33:02 <johnthetubaguy> agreed, but digging into that, can I help make that go faster, or is it not that kind of work?
15:33:07 <BobBall> like exactly where the split between creating a VM and putting it in the nodepool should be etc
15:33:11 <matel> Sure, you can make it faster.
15:33:26 <matel> I would need to have a description of the zuul entry points.
15:33:42 <matel> AFAIK, it's "PREPARE IMAGE" and "RUN DEVSTACK"
15:33:50 <johnthetubaguy> Yup, OK, so I will have a chat with people, but I can come and sit next to you tomorrow and help move it forward?
15:34:05 <matel> And we would need to find out how to "customise" the localrc - so that it knows about XenServer.
15:34:24 <johnthetubaguy> yup, that all makes sense
15:34:29 <matel> Yes, so I am working on a script that is producing an image.
15:34:38 <johnthetubaguy> that sounds good
15:34:45 <johnthetubaguy> I am happy to look into Zuul bits if that helps
15:34:46 <matel> which is really the first step - I am close (this week)
15:34:55 <johnthetubaguy> anyways, we are making good progress
15:34:55 <matel> Okay, stay in touch
15:34:58 <johnthetubaguy> let's move on
15:35:04 <johnthetubaguy> http://ci.openstack.org/
15:35:06 <johnthetubaguy> BTW
15:35:18 <BobBall> those docs are rubbish
15:35:24 <johnthetubaguy> perhaps
15:35:25 <BobBall> we were looking at them yesterday :)
15:35:37 <matel> Yes, I am so upset about the lack of documentation.
15:35:42 <johnthetubaguy> OK, any more on QA?
15:35:42 <BobBall> I know more of the overview than is in the docs just from playing
15:35:48 <BobBall> Well, we can talk about my tempest fun
15:35:52 <johnthetubaguy> sure
15:35:59 <johnthetubaguy> you got patches and bugs up?
15:36:00 <matel> So john, if you could put together a nice diagram and put it on the docs page, that would help.
15:36:04 <BobBall> As I said - if you give domU too little memory, all sorts of things can go wrong
15:36:12 <johnthetubaguy> matel: my plan is to read the code
15:36:14 <BobBall> I've got a patch up for devstack - but you'll need to be aware of it in production
15:36:26 <matel> Sure, and create a diagram as well.
15:36:29 <johnthetubaguy> BobBall: yes, well, we try not to run out of memory I guess
15:36:29 <BobBall> if your compute VM is swapping then you might be hitting an issue where VBD plug fails
15:36:53 <johnthetubaguy> we don't run rabbit in our compute domUs, oddly
15:36:55 <BobBall> it's not running out of memory that's the problem - it's memory fragmentation that prevents you from allocating a 128k block - which is quite large in memory terms
15:37:01 <johnthetubaguy> but it's a good catch though
15:37:13 <BobBall> so if you have a long-running and busy compute VM with only just enough memory, you _will_ hit this sooner or later
15:37:42 <BobBall> I suspect all three of those will apply at Rackspace since I know you squeeze compute's memory to get more VMs on a box
15:38:03 <johnthetubaguy> yeah, not sure about the headroom though, needs a good look
15:38:12 <johnthetubaguy> what's the fix?
15:38:16 <BobBall> more memory
15:38:23 <johnthetubaguy> lol, OK
15:38:23 <BobBall> or reboot compute to stop fragmentation
15:38:33 <johnthetubaguy> yeah, that's possible
15:38:46 <BobBall> but who knows how often you have to reboot
15:38:57 <BobBall> parallel tempest was hitting it within 60 minutes
15:38:58 <johnthetubaguy> never under normal operation, but hey
15:39:27 <BobBall> Anyway - next fun - we already synchronize on VBD.plug, but we have to synchronize on VBD.unplug too (both at the same time), otherwise we get the same random racy failures
15:39:36 <BobBall> #link https://review.openstack.org/#/c/59856/
15:39:47 <johnthetubaguy> Ah, yes, I did ask about that at the time, I guessed we might need that
15:40:35 <BobBall> and there are some weird things I'm looking at ATM but I don't have a solution for
15:40:38 <johnthetubaguy> cool, any more?
15:40:40 <BobBall> timeouts with volumes
15:40:52 <BobBall> the really annoying thing is that full tempest does pass
15:40:59 <johnthetubaguy> oh, is that the old kernel issue with iSCSI on the same box again?
15:40:59 <BobBall> so it's timeouts / races that we're fighting against
15:41:03 <BobBall> no
15:41:06 <BobBall> Mate fixed that
15:41:16 <johnthetubaguy> but it could be related?
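For readers following the VBD race a few lines up: the review BobBall links (59856) is the authoritative fix; the snippet below is only a toy illustration of the underlying idea, namely funnelling both VBD.plug and VBD.unplug through the same lock so the two calls can never interleave. The lock type and helper names are placeholders, not what nova actually uses.

    # Illustrative only: serialize VBD.plug and VBD.unplug through one lock.
    # Nova uses its own synchronisation helpers; see the linked review for the
    # real change. 'session' is assumed to be a XenAPI session object.
    import threading

    _vbd_lock = threading.Lock()

    def plug_vbd(session, vbd_ref):
        with _vbd_lock:  # same lock as unplug, so the operations never overlap
            session.xenapi.VBD.plug(vbd_ref)

    def unplug_vbd(session, vbd_ref):
        with _vbd_lock:
            session.xenapi.VBD.unplug(vbd_ref)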
15:41:31 <BobBall> I doubt it at this point
15:41:39 <johnthetubaguy> you get deadlocks in tapdisk, is that related?
15:41:41 <BobBall> the iscsi issue is definitely mitigated
15:41:55 <BobBall> there may be other deadlocks - but not that one
15:42:08 <BobBall> that one is mitigated by adding a new memcopy in the process
15:42:24 <BobBall> if we're still seeing deadlocks then it's a different issue
15:42:29 <matel> Let me dig up the change...
15:42:34 <johnthetubaguy> OK, no worries
15:42:43 <johnthetubaguy> just checking the things that came to mind
15:42:48 <BobBall> (the memcopy isn't actually real - it's about changing the IO mode used by the SR)
15:43:05 <matel> #link https://github.com/citrix-openstack/qa/blob/master/install-devstack-xen.sh#L314
15:43:49 <johnthetubaguy> cool, making progress though, which is good to see
15:43:57 <johnthetubaguy> so, in other news...
15:44:17 <johnthetubaguy> I plan to use all your good work, and copy and paste it to run cloud cafe tests, assuming we keep using those
15:44:27 <johnthetubaguy> but that's just for context
15:44:42 <johnthetubaguy> so...
15:44:48 <johnthetubaguy> #topic Open Discussion
15:44:56 <johnthetubaguy> any more for any more?
15:45:19 <BobBall> can't for the life of me think what it was
15:45:22 <BobBall> but I was going to say something
15:46:08 <matel> I'm okay, I just need zuul-capable people.
15:46:11 <matel> to talk to.
15:46:24 <BobBall> Just ask on -infra
15:46:30 <BobBall> they usually answer
15:46:36 <matel> Oh, yes.
15:46:52 <BobBall> and they only charge £5 per question
15:46:54 <BobBall> quite reasonable
15:46:57 <johnthetubaguy> yeah, so I am happy to take on the zuul integration with you if that helps?
15:47:37 <matel> Let's find entry points, that's good enough.
15:47:38 <johnthetubaguy> while you make the stand-alone script stable, and Bob works on getting tempest running stably
15:47:46 <johnthetubaguy> yup, I am happy to work on that
15:47:49 <BobBall> johnthetubaguy: other question - while we're here - guy in #openstack-dev asking a question I don't know the answer to...
15:47:58 <matel> okay, sounds like a plan.
15:48:02 <BobBall> hi any idea why two rules are used instead of a single rule, line numbers 137,147 https://github.com/openstack/nova/blob/master/plugins/xenserver/networking/etc/xensource/scripts/ovs_configure_vif_flows.py
15:48:17 <BobBall> thought you might know :)
15:48:24 <johnthetubaguy> nope, sorry, I am the wrong guy to ask about that stuff
15:48:36 <BobBall> I guessed it was different implementations of ARP - some using the host IP and others using 0.0.0.0 as the response
15:48:40 <BobBall> ah well
15:49:07 <johnthetubaguy> yeah, it's deep in the niggles of network land, I know the basics, not the details there :(
15:49:28 <BobBall> indeed
15:49:37 <johnthetubaguy> cool, are we all done?
15:49:48 <johnthetubaguy> a good meeting today I feel, thank you all for your hard work!
15:49:52 <BobBall> I know a fair bit, particularly around that tenant isolation, but that one had me stumped...
15:50:04 <BobBall> heh - you're welcome?
15:50:34 <johnthetubaguy> ...I was meaning more generally, not just the hard work of the meeting
15:50:36 <johnthetubaguy> anyways
15:50:39 <johnthetubaguy> any more questions?
15:51:02 <BobBall> yeah
15:51:20 <BobBall> what's the deal with this furby boom thing? I hear it's a popular xmas toy but I just don't get it
15:51:23 <BobBall> maybe I'm too old...
15:51:31 <BobBall> hmmm - do you think that question is off-topic?
15:52:17 <johnthetubaguy> maybe
15:52:26 <johnthetubaguy> I never get these crazes
15:52:33 <johnthetubaguy> they always seem to pass me by
15:52:38 <johnthetubaguy> but maybe I am just tight
15:52:41 <johnthetubaguy> anyways...
15:52:45 <johnthetubaguy> #endmeeting