13:08:44 <ndipanov> #startmeeting sriov
13:08:46 <openstack> Meeting started Tue Mar 15 13:08:44 2016 UTC and is due to finish in 60 minutes. The chair is ndipanov. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:08:47 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:08:49 <openstack> The meeting name has been set to 'sriov'
13:09:01 <ndipanov> ok so I guess we could see where we stand on bugs...
13:09:19 <ndipanov> https://etherpad.openstack.org/p/sriov_meeting_agenda
13:09:50 <lbeliveau> for my cold migration bug (https://review.openstack.org/#/c/242573/17), I had to -1 myself, something got broken
13:10:05 <ndipanov> there were 3 bugs we were hoping to get into mitaka
13:10:20 <ndipanov> lbeliveau, ah yes I saw that
13:10:40 <vladikr> Hi everyone
13:10:53 <ndipanov> hello
13:10:58 <lbeliveau> Hey
13:11:18 <moshele> hi
13:11:25 <ndipanov> vladikr, what's up with this one? https://review.openstack.org/#/c/283198/4
13:12:00 <vladikr> ndipanov, it merged I think
13:12:02 <vladikr> sec
13:12:20 <ndipanov> ah nice, I was looking at an outdated revision
13:12:30 <vladikr> yea :)
13:12:52 <lbeliveau> nice
13:14:06 <ndipanov> https://review.openstack.org/#/c/216049 merged too!
13:14:44 <lbeliveau> yeap
13:15:10 <lbeliveau> I still need to open a new bug on this issue related to migration
13:15:18 <ndipanov> lbeliveau, how come?
13:16:30 <lbeliveau> I think something got broken; on devstack I see a PCI device get allocated on the destination node, but it's not saved with the instance, so my stuff in neutronv2 doesn't see the new PCI devices
13:16:43 <lbeliveau> still need to troubleshoot, but I was sidetracked
13:16:48 <ndipanov> lbeliveau, ah you think you've hit a new bug
13:16:51 <ndipanov> ok I see
13:17:03 <ndipanov> well that's good that we are finding bugs I guess :)
13:17:19 <ndipanov> if there are no more (known) bug fixes we wanna rush in
13:17:25 <ndipanov> we can talk briefly about
13:17:28 <ndipanov> THE FUTURE
13:17:39 <ndipanov> aka newton work we wanna try and get done
13:17:45 <lbeliveau> I was not seeing that before, but I haven't run my patch in devstack for a while since the fixes I was making only needed tox
13:17:59 <moshele> lbeliveau: I can try your patch on my setup
13:18:23 <moshele> lbeliveau: it will take me some time to prepare one
13:18:52 <lbeliveau> moshele: that would be great, but I won't be able to resume this activity for the next couple of days
13:19:30 <yonglihe> lbeliveau: I hope I can help review your code several days later, is that too late?
13:20:10 <lbeliveau> yonglihe: no, it's not too late, I have to fix a related bug before I can submit this one
13:20:50 <ndipanov> yonglihe, hi - I saw you mentioned some fixes for cold migration on the ML the other day
13:20:58 <yonglihe> lbeliveau: ok, let's do that. I spent lots of time debugging the PCI migration itself, hope I can help
13:20:59 <ndipanov> could that be what's tripping lbeliveau up?
13:21:26 <yonglihe> lbeliveau: my patch is too old, and the logic is ugly
13:22:18 <lbeliveau> not sure why I saw it working before (many times), maybe I was just getting lucky or something changed on master
13:23:15 <ndipanov> when I last looked at the code, which was a few weeks ago
13:23:25 <ndipanov> I think I concluded that it would not work as expected
13:23:26 <moshele> lbeliveau: we tested your patch in QA and it worked, but it was like 2 months ago
13:24:08 <lbeliveau> moshele: ok so that might confirm that something got broken
13:24:38 <lbeliveau> anyhow, it means we don't have a unit test for that :)
13:24:48 <moshele> lbeliveau: the test was to put 2 VMs on 2 different computes and then migrate one of them
13:25:41 <lbeliveau> moshele: sounds right, and it's best to use PCI devices with different PCI addresses to make sure new ones are allocated on the destination node
13:26:31 <lbeliveau> I'll work on it in a couple of days and will get back to you guys with my findings (and questions)
13:28:11 <ndipanov> cool
13:28:26 <ndipanov> anything regarding newton we wanna talk about here?
13:29:05 <moshele> yes I have this spec, Add scheduling with NIC capabilities https://review.openstack.org/#/c/286073/
13:29:34 <lbeliveau> I'll have a look
13:30:18 <moshele> so we want nova to be aware of NIC capabilities for scheduling
13:30:52 <ndipanov> moshele, I will too
13:30:57 <lbeliveau> we want to propose a BP to make it easier to use SR-IOV, by combining the neutron port creation and the boot in one step
13:32:06 <moshele> lbeliveau: we had something like this before but it was rejected
13:32:09 <ndipanov> moshele, so this has been discussed before but I'll take a look
13:32:11 <ndipanov> and comment
13:32:17 <ndipanov> lbeliveau, that sounds useful
13:32:27 <ndipanov> moshele, how come?
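[editor's note: the migration check lbeliveau and moshele describe above — distinct PCI addresses on source and destination, so a successful cold migration must leave the instance holding new devices — can be sketched as follows; the helper name and addresses are hypothetical, not the actual test code]

```python
# Hypothetical sketch of the invariant discussed above: after cold
# migration, none of the instance's pre-migration (source-host) PCI
# addresses should still be attached -- the destination node must have
# allocated new devices.
def devices_reallocated(before, after):
    """True if no PCI address survived the migration unchanged."""
    return not (set(before) & set(after))

# Source and destination expose different (made-up) addresses:
src = ["0000:05:00.1"]
dst = ["0000:81:00.1"]
print(devices_reallocated(src, dst))  # new device allocated on destination
print(devices_reallocated(src, src))  # stale source device kept -> bug
```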
13:32:37 <moshele> lbeliveau: let me look for the spec a sec
13:33:28 <moshele> see this https://review.openstack.org/#/c/138808/
13:35:06 <ndipanov> moshele, I know there are folks who would hate to have that kind of logic live in nova
13:35:09 <lbeliveau> moshele: that looks very similar to what we wanted to propose! we actually have something like that in our product
13:35:41 <lbeliveau> ndipanov: what is their argument?
13:36:04 <lbeliveau> it can also avoid race condition issues when two instances are trying to bind to the same SR-IOV port
13:36:22 <ndipanov> well, that nova should not do any sort of orchestration like that and that it's up to user scripts/heat etc.
13:36:57 <ndipanov> I could see arguments for both personally
13:37:52 <lbeliveau> ndipanov: I see, but using heat and all doesn't make it easier to use, if usability is a concern
13:38:01 <moshele> when I was at the Vancouver summit it seemed that every one of the cores was against it
13:38:02 <ndipanov> lbeliveau, yeah
13:38:27 <ndipanov> moshele, yeah - I don't feel too strongly; there are good arguments against it
13:38:33 <lbeliveau> should we try again?
13:38:33 <ndipanov> though
13:38:42 <lbeliveau> or is it a lost cause?
13:38:49 <ndipanov> if you have the energy - I would
13:38:55 <ndipanov> neutron is slightly different I'd say
13:39:02 <ndipanov> since in order for SR-IOV to work
13:39:09 <ndipanov> you need to configure nova properly too
13:39:28 <ndipanov> so some automation around that could be useful regardless of the "policy"
13:39:29 <ndipanov> I guess
13:39:38 <vladikr> I had a similar problem with another spec at that time.. the main argument was that the setting was too specific - and they wanted me to use some kind of an abstraction
13:40:22 <vladikr> instead of queues=8, use the term "fast" or anything like that
13:40:39 <vladikr> wasn't very useful in my case
13:42:09 <ndipanov> vladikr, that's not necessarily the same thing though
13:42:16 <lbeliveau> not sure for newton, but we still haven't got a way to boot with SR-IOV with horizon
13:42:45 <moshele> lbeliveau: but this can be fixed in horizon
13:42:53 <ndipanov> lbeliveau, how come - we can't create an SR-IOV port via horizon?
13:43:30 <lbeliveau> ndipanov: I looked last week and I didn't find a way, maybe just an oversight on my part
13:43:55 <moshele> lbeliveau: I know we can create a direct port in horizon, I am not sure if we can select a port for vm boot
13:44:17 <lbeliveau> moshele: will have a closer look next time I have a devstack running
13:44:31 <moshele> lbeliveau: also I think that horizon has a bug on that
13:44:32 <ndipanov> that sounds like a good fix to have for sure
13:46:09 <vladikr> If we still have time to talk about bugs, there is one that I ran into yesterday, https://review.openstack.org/#/c/291847/ - looks like it stops claiming PCI devices in claim_test. I thought we were claiming the devices on purpose, to avoid a race? (maybe we are not cleaning up, but that's a different story)
13:46:34 <ndipanov> should we raise a bug for the horizon issue?
13:47:03 <lbeliveau> ndipanov: I'll have a look, and if it is indeed a bug I'll take the action of raising one
13:47:04 <ndipanov> vladikr, that sounds like a regression
13:47:09 <ndipanov> lbeliveau, thanks
13:47:17 <ndipanov> if it's a regression we should hold up the RC for that
13:47:26 <ndipanov> so I propose you raise the priority to critical
13:47:31 <ndipanov> and tag the bug just in case
13:48:41 <ndipanov> vladikr, but just from a brief look I'm not sure it needs to do that there...
13:48:53 <ndipanov> will look closer
13:49:33 <mriedem> i've marked https://bugs.launchpad.net/nova/+bug/1549984 as mitaka-rc-potential
13:49:33 <openstack> Launchpad bug 1549984 in OpenStack Compute (nova) "PCI devices claimed on compute node during _claim_test()" [High,In progress] - Assigned to Jay Pipes (jaypipes)
13:49:53 <ndipanov> mriedem, thanks
13:49:56 <jaypipes> mriedem: ty
13:50:10 <ndipanov> lurkers
13:50:20 <mriedem> :)
13:51:40 <yonglihe> mriedem: hi
13:52:11 <yonglihe> good morning, mriedem
13:52:14 <mriedem> hi
13:52:31 <moshele> lbeliveau: horizon bug https://bugs.launchpad.net/horizon/+bug/1402959
13:52:31 <openstack> Launchpad bug 1402959 in OpenStack Dashboard (Horizon) "Support Launching an instance with a port with vnic_type=direct" [Medium,In progress] - Assigned to Itxaka Serrano (itxakaserrano)
13:53:10 <lbeliveau> moshele: thanks, good!
13:53:15 <yonglihe> I might have gotten the wrong message, do you want sriov-test comments on nova?
13:53:43 <lbeliveau> yonglihe, what do you mean?
13:53:51 <mriedem> yonglihe: i'm assuming you're asking about 3rd party CI?
13:54:08 <yonglihe> mriedem: you might just want the NFV test to recheck a patch? yes, third-party CI
13:54:37 <mriedem> ndipanov: are you in open discussion? i don't want to derail your meeting
13:54:38 <lbeliveau> yonglihe: intel-pci?
13:54:56 <ndipanov> mriedem, I think we are probably done?
13:54:57 <ndipanov> folks?
13:55:03 <lbeliveau> I'm good
13:55:05 * ndipanov is hungry
13:55:09 <moshele> I think so
13:55:13 <mriedem> yonglihe: let's talk in -nova
13:56:47 <ndipanov> ok folks thanks for the meeting
13:56:51 <ndipanov> #endmeeting sriov