09:00:47 #startmeeting scientific_wg
09:00:48 Meeting started Wed Aug 17 09:00:47 2016 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:49 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:51 The meeting name has been set to 'scientific_wg'
09:01:17 #link Agenda from wiki https://wiki.openstack.org/wiki/Scientific_working_group#IRC_Meeting_August_17th_2016
09:01:31 Good morning all
09:01:44 good morning!
09:01:50 Good morning
09:01:58 hi
09:02:09 Hi there!
09:02:34 evening
09:02:40 blairo: where's the '1'?
09:02:47 #chair blairo
09:02:48 Current chairs: blairo oneswig
09:02:49 wondered whether you'd notice
09:03:01 I miss nothing (except family birthdays)
09:03:03 coming to you live from the office tonight
09:03:25 The Presidential suite?
09:03:28 lol
09:03:29 (where i have my client configured properly for my irc bouncer)
09:04:05 Aha, well let's get the show on the road - I don't think there's a massive amount this week, it's holiday season over here
09:04:14 #topic Accounting and Scheduling
09:04:33 Did you see the mail from Danielle Mundle?
09:04:41 #link http://lists.openstack.org/pipermail/user-committee/2016-August/001186.html
09:04:45 re. quota study?
09:04:55 that's the one. Seemed relevant to some in this WG
09:05:18 yep, saw that
09:05:20 yes i volunteered (if they can accommodate me being away the next two weeks)
09:05:47 I think a couple of guys here might be interested, but they're on holiday for the next two weeks ;-)
09:05:50 bad timing
09:05:56 in the last UX thing i did (horizon mock-up review) i made some comments around quotas and the nectar use-cases
09:06:12 Quite an involved approach to have a video interview, but it sounds like it could be a useful way to capture feedback
09:06:22 would be good to formalise that
09:06:37 it's particularly relevant to large distributed clouds running cells i think
09:07:21 yeah i guess they are after qualitative rather than quantitative data
09:08:04 I recall Tim Bell's gripe was the combination of a user quota and a group quota
09:08:15 I hope that's well covered in their survey
09:08:25 reminds me i have not been back to review, summarise and share the results of the research/science cloud survey we knocked up for austin
09:08:49 perfect holiday activity?
09:09:18 could be (if my wife is not looking ;-)
09:09:48 You take concerns over data confidentiality very seriously, that's impressive :-)
09:10:06 OK well I think that's the UX message covered
09:10:13 In Chameleon the main pain point about quotas is that they get out of sync regularly and have to be corrected manually in the database. I have seen it reported on Launchpad - anyone here seen it as well?
09:10:20 Was there more on accounting and scheduling to cover?
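
For context on the quota drift just mentioned: one way operators spot-check it is to compare Nova's recorded usage against a live count of instances per project. The sketch below is a minimal illustration, assuming direct read access to the Nova database and the schema of the era (quota_usages and instances tables with in_use/deleted columns); the connection string, table and column names, and the 'instances' resource key are assumptions to verify against your own deployment, not a supported tool.

    #!/usr/bin/env python
    # Sketch: report projects whose recorded 'instances' quota usage has
    # drifted from the actual count of non-deleted instances.
    # Assumes the Nova DB schema of the era (quota_usages, instances) and
    # direct DB access - both are assumptions, check before relying on it.
    from sqlalchemy import create_engine, text

    # Hypothetical connection string - substitute real credentials/host.
    engine = create_engine("mysql+pymysql://nova:secret@db-host/nova")

    RECORDED = text("SELECT project_id, in_use FROM quota_usages "
                    "WHERE resource = 'instances' AND deleted = 0")
    ACTUAL = text("SELECT project_id, COUNT(*) FROM instances "
                  "WHERE deleted = 0 GROUP BY project_id")

    with engine.connect() as conn:
        recorded = dict(conn.execute(RECORDED).fetchall())
        actual = dict(conn.execute(ACTUAL).fetchall())

    for project, in_use in sorted(recorded.items()):
        real = actual.get(project, 0)
        if in_use != real:
            print("%s: quota_usages says %d, instances table says %d"
                  % (project, in_use, real))
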
09:10:22 i think cern in particular want a nested quota structure that works for a department - project/team - individual layered approach
09:10:47 same here blairo
09:11:00 we actually discussed that with Tim back in March I guess
09:11:13 we would *love* to have the same thing at the EBI
09:11:19 priteau: yes we have that problem in many large deployments i think
09:11:20 it would solve many problems
09:11:34 Good points people - stand up and be heard
09:11:39 in nectar our cron box runs a regular quota sync task
09:11:58 like the "Give the PI a big quota, and then let him decide how to split it among his people" kind of thing
09:12:28 dariov: ok, well i'll make sure the UX folks are aware of that requirement and the people interested in it
09:12:29 this would save our cloud guys some pain
09:12:38 blairo, thanks
09:12:58 tim has blogged pretty extensively on this already so i think we can just say: that please
09:13:10 #link Sign up here for UX discussions http://doodle.com/poll/7tid473za2hpi6e7
09:13:47 Any more to cover on this?
09:14:18 maybe one thing
09:15:11 priteau: i can point you to what we're doing in nectar for quota sync (i'm not intimately familiar with the code but sounds like it may be the same problem)
09:15:32 that was all
09:15:49 #topic User stories
09:15:53 blairo: I'd like to take a look if you've got the scripts in a public repo. I have found another one on GitHub as well
09:16:59 OK I'm still working away on the OpenStack/HPC paper but I've realised I need some more data points, perhaps you can help
09:17:16 priteau: details here: http://lists.openstack.org/pipermail/openstack-operators/2015-March/006596.html
09:17:23 thanks blairo
09:18:22 Thanks blairo looks handy
09:18:43 You might have seen on the operators list, I'm looking for an IB user, ideally in production
09:18:52 Thought there were loads, turns out not so much
09:19:10 Does anyone know one?
09:19:52 you might be first oneswig o_0
09:20:14 I don't have IB (although I might be getting some second-hand kit to experiment with)
09:20:41 so, just because I'm a noob, what's an IB?
09:20:46 Jon Mills said they used to, but found IPoIB was slower than 10GE
09:20:50 IB = InfiniBand, sorry
09:20:58 ah yes, crossing my link-layers - cambridge has SN2700 ethernet fabric too
09:20:59 ah-ah
09:21:03 thnx
09:21:40 Morning. I will have it shortly, as I've already told oneswig.
09:21:54 Thanks verdurin, and good morning
09:22:36 oneswig: are you wanting folks that are doing IB all the way to the guest, i.e. sriov, or just using IB as their DC interconnect?
09:22:50 Well I'm not sure, whatever turns up
09:22:54 (i assumed the former, but realised i could be wrong)
09:23:00 ok
09:23:14 I'm trying to think of ways an HPC user might think of what OpenStack can't do
09:23:45 I think I'm trying to prove an apple is an orange
09:24:17 so long as there are no lemons involved i don't see a problem
09:24:39 Right! Bowled you a slow one there :-)
09:25:00 like australia against sri lanka
09:25:24 I think I missed that one
09:25:28 underarm?
09:25:52 clean sweep of the latest test series to them
09:26:20 There's a strange randomness to cricket outcomes I don't fully understand
09:26:23 OK, I'll keep looking. Anything else to cover on user stories?
09:26:55 i'm finally reading your draft now oneswig
09:27:10 blairo: great thanks, appreciate that
09:28:10 #topic Bare metal
09:28:57 I don't think I have anything on bare metal this week. What's new in the Ironic world?
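
On the nested quota requirement above (department -> project/team -> individual): there was no finished upstream implementation to point at here, so the sketch below is purely illustrative of the semantics being asked for - a node can only hand out capacity it has not already delegated to its children. It is plain Python with invented names and numbers, not an OpenStack API.

    # Illustrative sketch of layered quotas: a parent can only delegate
    # what it has not already handed to other children. This mirrors the
    # "give the PI a big quota and let them split it" model; it is not
    # based on any real OpenStack interface.
    class QuotaNode(object):
        def __init__(self, name, limit, parent=None):
            self.name = name
            self.limit = limit
            self.children = []
            if parent is not None:
                parent.grant(self)

        def delegated(self):
            # Capacity already handed down to children.
            return sum(child.limit for child in self.children)

        def grant(self, child):
            remaining = self.limit - self.delegated()
            if child.limit > remaining:
                raise ValueError("%s: cannot grant %d to %s, only %d left"
                                 % (self.name, child.limit, child.name,
                                    remaining))
            self.children.append(child)

    dept = QuotaNode("physics-dept", limit=200)
    pi = QuotaNode("pi-project", limit=100, parent=dept)
    QuotaNode("alice", limit=60, parent=pi)
    QuotaNode("bob", limit=30, parent=pi)
    QuotaNode("carol", limit=20, parent=pi)  # raises ValueError: only 10 left
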
09:29:42 verdurin: will your new cluster be bare metal with IB or virtualised?
09:31:15 Think he's gone...?
09:32:07 OK let's move on
09:32:10 oneswig: All the multi-tenant network support appears to have been committed, but we have yet to evaluate it
09:32:13 probably to get coffee
09:32:39 priteau: great, thanks for the update on that, good to know
09:33:09 I'll be interested to see how they are mapping physical machines to physical network ports - do you know?
09:33:31 I think you have to define the port number when registering a node
09:33:54 Ah OK, defer the mapping problem...
09:34:18 This might be something the Ironic Inspector could learn, I wonder
09:34:21 cogs whirring
09:34:38 priteau: BTW did you get any further with your patches for Blazar?
09:35:05 oneswig: not yet unfortunately
09:36:05 ok, just wondering. A big set of patches can be like being carried away by a helium balloon - the longer you leave it, the more it'll hurt
09:36:56 OK, move on?
09:37:09 #topic Parallel filesystems
09:38:09 I believe the team at Cambridge have got Lustre into their VMs but I don't know how it's performing. Progress for them though.
09:38:29 oneswig: virtualised, at least initially
09:38:42 We plan to investigate GPFS in our VMs
09:39:06 good to hear
09:39:07 verdurin: via IB I assume?
09:39:50 oneswig: preferably. Ethernet is possible, too.
09:39:56 verdurin: how close are you to getting this system up and running?
09:40:17 ours on M3 is going fine, still tuning the filesystem for performance at a more basic level, no performance issues caused by sriov at this stage
09:40:41 oneswig: not very - super busy with other stuff but people are starting to ask about it more, so I'll have to find the time
09:40:54 blairo: That's Lustre right?
09:41:01 yep
09:41:06 using o2ib LNET
09:41:56 dariov: what do you use at EBI?
09:42:12 oneswig, NetApp I think
09:42:27 but the guys are moving loads of stuff lately
09:42:32 so I might need to check with them
09:42:46 Be interesting to hear what they migrate to
09:42:52 If that's the plan
09:42:59 I don't think so
09:43:25 we've got "some" new kit coming in the near future
09:44:02 when it's here they'll start migrating everything to Mitaka
09:44:03 oneswig: one thing to be aware of that i discovered recently is that device passthrough, e.g. for sriov NICs/HCAs, means that transparent huge pages cannot be allocated by the host
09:44:46 Giving a consequence for memory-intensive workloads?
09:44:52 (because IOMMU requires guest memory to be pinned)
09:45:02 but the storage backend will be the same
09:45:42 If guest memory must be pinned, what effect does that have on things like KSM, overcommitment and ballooning?
09:45:43 yes, will probably be very sucky for memory-intensive workloads if you have very large guests (as a bigger memory footprint means more top-level TLB misses)
09:46:10 e.g. 200+GB of 4kB pages
09:46:32 but! you just have to know to use static huge pages instead
09:46:57 Can you note this in the doc? Pearls of wisdom from Down Under
09:46:57 i don't have any numbers yet that quantify this, but i hope to produce something at least anecdotal
09:47:07 copy
09:47:11 Thanks!
09:47:31 OK, AOB?
09:47:33 and yes, that would impact KSM etc, but you already want those disabled if you care about HPC-like performance
09:47:43 right
09:47:56 but perhaps this means you must have them disabled?
09:48:26 #topic AOB
09:48:39 I had an interesting problem this week
09:48:49 Mellanox NICs, anyone got those? ;-)
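
A quick way to check blairo's static huge pages point on a compute node: because pinned (passthrough/SR-IOV) guest memory cannot be backed by transparent huge pages, the pages have to be reserved up front, and /proc/meminfo shows whether enough are free for a given guest. The sketch below only reads standard meminfo fields; the 200 GB guest size is an example figure taken from the discussion, and the flavor-side setting (hw:mem_page_size in Nova) is not shown here.

    # Sanity check on a hypervisor: are enough static huge pages free to
    # back a pinned guest of a given size? Pinned memory (IOMMU/SR-IOV)
    # cannot use transparent huge pages, so it must be reserved up front.
    def meminfo():
        """Parse /proc/meminfo into {field: integer value}."""
        fields = {}
        with open("/proc/meminfo") as f:
            for line in f:
                key, rest = line.split(":", 1)
                fields[key] = int(rest.split()[0])
        return fields

    info = meminfo()
    page_kb = info["Hugepagesize"]     # e.g. 2048 kB (2 MiB) or 1048576 kB (1 GiB)
    free_pages = info["HugePages_Free"]
    guest_gb = 200                     # example: a large-memory guest

    needed = guest_gb * 1024 * 1024 // page_kb
    print("huge page size %d kB, free %d, needed for %d GB guest: %d"
          % (page_kb, free_pages, guest_gb, needed))
    if free_pages < needed:
        print("not enough static huge pages reserved; the guest would fall "
              "back to 4 kB pages")
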
09:48:59 not sure about ballooning, as does that even work reliably in libvirt+kvm? but for KSM and overcommit it just means they won't work
09:49:11 We've had this kernel barf whenever a TCP connection is initiated via VXLAN
09:49:15 Does anyone see that?
09:49:32 though if you are overcommitting by force, like with the nova scheduler, it probably means you'll end up DOS-ing your compute nodes
09:49:48 i.e. they'll OOM
09:50:11 haven'
09:50:33 haven't tried any VXLAN traffic on them yet, but we are quite close to doing that with midonet
09:51:10 Midonet? Interesting. We have VXLAN+OVS and VLAN+SRIOV networks
09:51:13 sounds worrisome
09:51:32 Only the VXLAN ones have this issue with kernel backtraces
09:51:37 latest firmware+driver i take it?
09:52:02 I think so.
09:52:14 oh, do you the Pro models specifically that support VXLAN and GRE offload?
09:52:22 Working on it now with support - I'll report back
09:52:23 *you mean
09:52:36 Don't think so, we have ConnectX4-LX
09:52:54 Don't recall any reference to pro on these ones
09:52:56 yeah all newer cards just support that
09:53:10 but earlier CX-3 cards did not
09:53:29 mellanox will tell you they are all pro now ;-)
09:53:33 I think we must be doing something wrong: VXLAN bandwidth is ~1.6Gbit/s at best - on a 50G link
09:53:49 fineprint: except for the bugs
09:54:01 That's for iperf. If there's a hardware offload, it's not engaging
09:54:12 yeah definitely
09:54:31 Question is, what counts as decent bandwidth in a VXLAN world? I don't have much hope for it.
09:55:18 Through SR-IOV, we sustained 10.5GBytes/s (bi-directional)
09:55:20 brb
09:56:08 well i believe early testing in nectar land with iperf over midonet (which is just vxlan once established) managed 4-5Gbps melbourne to queensland (no h/w offload)
09:56:30 midonet is based on OVS too, right?
09:56:39 yeah OVS dataplane
09:56:59 standard MTU? It appears to be massively interrupt-dominant in the hypervisor
10:16:26 #endmeeting
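
Following up on the VXLAN throughput thread above: one quick check for whether the NIC is actually offloading VXLAN segmentation is the tx-udp_tnl-segmentation feature reported by ethtool -k. If it shows off (or off [fixed]), the host is segmenting tunnelled TCP in software, which would fit an interrupt-bound ~1.6 Gbit/s result on a 50G link. The sketch below just shells out to ethtool; the interface name is a placeholder, not from the discussion.

    # Check whether UDP tunnel (VXLAN) segmentation offload is enabled on a
    # NIC by parsing `ethtool -k`. The interface name is a placeholder.
    import subprocess

    IFACE = "ens1f0"  # hypothetical - substitute the VXLAN-carrying interface

    features = subprocess.check_output(["ethtool", "-k", IFACE]).decode()
    for line in features.splitlines():
        if "udp_tnl" in line:
            print(line.strip())
    # Expect something like "tx-udp_tnl-segmentation: on"; "off" means
    # tunnelled traffic is being segmented in software on the hypervisor.
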