11:00:06 <oneswig> #startmeeting scientific-sig
11:00:07 <janders> g'day! :)
11:00:07 <openstack> Meeting started Wed Dec 19 11:00:06 2018 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:00:08 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:00:10 <openstack> The meeting name has been set to 'scientific_sig'
11:00:17 <oneswig> janders: you are quick :-)
11:00:22 <oneswig> g'day back
11:00:36 <janders> :) how are things?
11:00:37 <priteau> Hello there
11:00:44 <oneswig> #link Agenda for today https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_December_19th_2018
11:01:08 <verdurin> Morning.
11:01:15 <oneswig> All good here, thanks. Been a busy couple of weeks. I presented 3 times in two weeks, on 3 different subjects...
11:01:25 <oneswig> Morning verdurin priteau
11:01:32 <oneswig> all well?
11:01:38 <janders> that is very busy indeed! :)
11:02:04 <oneswig> All done now, just need to catch up on the Christmas shopping
11:02:12 <janders> haha same here :)
11:02:25 <janders> we're sorting out a couple different procurement activities if that counts as Christmas shopping
11:02:40 <oneswig> ooh, like what?
11:02:44 <janders> looking good, should have some cool things to work on next year
11:02:54 <janders> mostly hardware
11:03:08 <janders> for the cyber system based on SuperCloud architecture
11:03:17 <janders> s/cyber/cybersecurity
11:03:26 <oneswig> Is the SuperCloud scaling up for production now?
11:03:46 <janders> we'll be writing prod-ready provisioning code from early jan
11:04:05 <janders> we switched focus to other things (like BeeGFS) while in tender mode
11:04:28 <janders> when the new equipment lands we should be able to build it pretty quickly
11:04:47 <oneswig> I met some BeeGFS guys at a conference last week and got a couple of lovely led-bejewelled BeeGFS pin badges from it :-)
11:05:03 <janders> excellent! :)
11:05:15 <oneswig> by provisioning code, you're talking tripleo heat templates or what?
11:05:18 <janders> I finally had a chance to think a bit about what we want from BeeGFS from the OpenStack angle
11:05:28 <janders> yes, among other things
11:05:38 <janders> but that's a fair part of it
11:06:22 <oneswig> How is it differing from the pilot system you're currently developing on?
11:06:46 <janders> we might try to apply the ephemeral hypervisor philosophy to more OpenStack components, so the tripleo run might be just one stage of building the system up
11:06:59 <janders> more emphasis on resiliency
11:07:07 <janders> and security
11:07:27 <oneswig> It will be fascinating to see where you take this in 2019.
11:08:05 <janders> the prototype focused on proving that the concept is sound and demonstrating the baremetal capability
11:08:26 <oneswig> Seems you're not the only one blurring the baremetal boundaries.
11:08:38 <janders> the next one will have all that, but should be able to withstand a fair bit more punishment and hardware failures :)
11:09:14 <oneswig> mgoddard shared this spec out for review on future Ironic-Kubernetes integration: https://review.openstack.org/#/c/625730
11:10:06 <mgoddard> I'd guess it's quite a way off yet, but interesting to see where ironic could be in a few years
11:10:23 <oneswig> Is there a term to describe this pattern of running different layers of services mixed up together?
11:10:24 <mgoddard> maybe less, depending on how keen RH are
11:10:47 <janders> excellent work! :)
11:11:31 <oneswig> Anyway, I guess we should get to the agenda (there's not much on it, but we are starting with AOB)
11:11:48 <oneswig> #topic 2018 retrospective
11:12:08 <oneswig> I realised this week I did nothing to write up the activities from the Berlin summit
11:12:26 <oneswig> Which was unfortunate as there was a good deal going on.
11:13:26 <oneswig> I finally managed a write-up of the presentation I did at Ceph days Berlin, so it must be next...
11:13:44 <oneswig> #link HBP presentation from Ceph days Berlin https://www.stackhpc.com/ceph-on-the-brain-a-year-with-the-human-brain-project.html
11:14:24 <oneswig> The most interesting piece from this was the performance quirks of Intel NVMes using LVM and Bluestore (current Ceph best practice)
11:17:18 <oneswig> From my SIG-centric POV it's been unfortunate that Blair is not so hands-on with OpenStack in his new job - hopefully he'll get back to that...
11:17:19 <janders> excellent work guys!
11:17:35 <oneswig> thanks janders :-)
11:17:36 <janders> I'm quiet cause I'm barely keeping up to read through the links
11:17:46 <janders> but topics are so interesting I can't resist reading right now
11:17:55 <janders> bluestore seems groundbreaking
11:18:16 <oneswig> it made a huge difference in our tests
11:19:45 <janders> have you guys looked into IOPS on NVMe-ceph much?
11:20:12 <janders> I just skimmed through the later part of the article, sorry if I missed it
11:20:45 <oneswig> Not so much on IOPS, I was testing at the RADOS level and looked almost exclusively at aggregate bandwidth.
11:21:40 <oneswig> We've been doing some work recently on processing the latency histograms from multiple client runs of fio and resampling to generate a single latency histogram, which is pretty cool
11:23:30 <oneswig> I was also interested in gathering input on what we should do differently in 2019. Any thoughts on things we should be doing?
11:23:56 <janders> I haven't done any work with ceph/bluestore myself but the concept makes perfect sense and your results reflect that
11:24:34 <janders> I like the final comment on xfs vs bluestore on ceph blog:
11:24:35 <oneswig> It was a good result, I was happy to see it!
11:24:43 <janders> In the end, we found there was nothing wrong with XFS; it was simply the wrong tool for the job.
11:24:43 <janders> :)
11:24:59 <oneswig> ha, that's nice.
11:26:18 <janders> I am very happy with the SIG, lots of good discussions and inspiring ideas, not sure what I'd change
11:26:52 <oneswig> OK that's good to know, thanks. Any thoughts from anyone else?
11:26:56 <janders> location of the at-the-Summit meeting rooms perhaps :)
11:27:09 <janders> (the walk to the Berlin one was a bit of a challenge)
11:27:13 <oneswig> We filled that room!
11:27:42 <janders> true! :) scientific stackers are a particularly hardy and stubborn type
11:28:01 <oneswig> We enjoy a hike
11:28:11 <oneswig> #topic Conferences for 2019
11:28:39 <oneswig> I had a few mails this week and it got me thinking we should gather details on conferences that might be of interest.
11:28:54 <janders> great idea
11:29:09 <janders> do you guys know what's the talk proposals deadline for the Lugano conference?
11:29:17 <oneswig> #link London - UKRI cloud workshop https://www.eventbrite.co.uk/e/ukri-cloud-workshop-tickets-53580893896
11:29:28 <oneswig> verdurin: are you involved in this?
11:29:49 <verdurin> oneswig: Yes, I am. We had a call about it on Monday.
11:30:09 <oneswig> I've been a couple of times and found it to be a very informative day.
11:30:11 <verdurin> Abstract submissions welcome, and accepted until mid-January.
11:30:31 <oneswig> Is there a particular theme this year?
11:31:48 <verdurin> We've suggested themes in the CfP - there's no overall theme beyond that.
11:32:00 <oneswig> ok, thanks verdurin
11:32:07 <oneswig> #link Lugano, Switzerland - HPC Advisory Council http://hpcadvisorycouncil.com/events/2019/swiss-workshop/submissions.php
11:32:23 <oneswig> Very beautiful place to visit
11:32:48 <oneswig> Last year there was a great discussion on hpc container infra, really useful.
11:33:33 <oneswig> There's a sister conference in Perth janders - well worth the trip I'd say
11:34:01 <oneswig> #link Singapore - SRECon Asia/Australia https://www.usenix.org/conference/srecon19asia/call-for-participation
11:34:30 <janders> I see Perth is late August
11:34:36 <janders> good to know
11:34:45 <oneswig> I went to an SRECon before and found it pretty useful (although mostly in abstract terms, there's little consideration for OpenStack content)
11:34:47 <janders> good excuse to escape the end of the winter here in Canberra :)
11:35:25 <oneswig> janders: winter in Canberra... you'll be telling me there's a ski season next :-)
11:35:40 <janders> not quite here but 2.5 drive away - yes!
11:36:13 <oneswig> Any other conferences/workshops people would like to announce?
11:36:41 <janders> it would be good to know what's the submission deadline for Lugano if you happen to know
11:36:55 <janders> wasn't able to find it on the event website
11:37:05 <janders> do you remember what was it like last year?
11:37:41 <oneswig> I can't see a date for that either.
11:38:17 <verdurin> Is it too early to start thinking about running something at ISC?
11:38:39 <oneswig> verdurin: I was thinking about that too, we've not done that before and perhaps we should.
11:38:43 <oneswig> Do you go?
11:39:00 <verdurin> The same - haven't done before, quite likely to this time.
11:39:53 <oneswig> John T is a regular there, I'll see what he thinks.
11:41:08 <janders> oneswig: https://photos.google.com/photo/AF1QipPLFErIFTfFqE2GPdq3EtVKgQutLsKIV8vAqB7t that's from June, somewhere over Perisher Valley, New South Wales
11:41:48 <oneswig> That link's not working for me, alas
11:43:31 <janders> https://photos.app.goo.gl/y8FyirV1G5XCEDLH7
11:43:52 <janders> sorry, google rewrites the URL after clicking, I copy pasted the re-written one
11:44:23 <oneswig> Snow in Australia, looks good!
11:44:34 <janders> good xc skiing up there
11:45:32 <janders> Lugano or ISC would likely be good timing to present the work we're planning to do with Bright
11:45:54 <janders> Lugano might be better if we make it early enough
11:46:45 <oneswig> Sounds good to me - but much better to present work actually delivered than still in the conceptual phase :-)
11:47:50 <oneswig> #topic AOB
11:48:02 <oneswig> What else is new?
11:48:20 <janders> my guys are making good progress with benchmarking the BeeGFS
11:48:33 <janders> I think they got up to 180GB/s on a quarter of the cluster
11:48:56 <oneswig> That's huge!
11:49:01 <oneswig> Are you OPA limited?
11:49:06 <janders> 2xEDR
11:49:31 <oneswig> Ah, of course, I forgot.
11:49:45 <janders> I think the 24NVMes can do 26-28GB/s per node but 2xEDR ports cut that to around 24GB/s per node
11:50:32 <oneswig> "only" 24GB/s...
11:51:02 <oneswig> How many server nodes are delivering 180GB/s?
11:51:12 <janders> John's hints were invaluable in making decisions on some of the config details - thanks for that
11:51:17 <janders> checking the numbers now..
11:51:50 <oneswig> janders: you're welcome - share and enjoy
11:52:42 <janders> 8 servers
11:52:45 <janders> not sure how many clients
11:52:54 <janders> and more precisely it's 180GB/s peak and 160GB/s sustained
11:53:36 <janders> I think they used to run 8 clients, but they might have doubled or tripled that since, can't see the client number in the test report
11:53:41 <oneswig> That's pretty much where the data point is on the graphs I've seen from Cambridge, and performance scales linearly from there to 24 servers.
11:54:16 <janders> wow! that is great news for us
11:54:22 <oneswig> #link "Hero Performance Numbers" - slide 35 https://www.stackhpc.com/resources/2018-11-12-Berlin-HBP-Ceph.pdf
11:54:25 <janders> have you tested past 24?
11:54:56 <oneswig> I don't think so. Not sure how many storage servers they have available
11:55:15 <janders> we have 32, so not much more
11:55:34 <janders> capacity requirement was quite high, hence 32 nodes with 24NVMes each
11:55:57 <oneswig> That's a lot of NVMe
11:56:18 <janders> I never quite liked the NVMe density as this requires a fair bit of blocking but it does buy the capacity :)
11:56:46 <janders> if it wasn't PCIe3 we'd perhaps be looking at 72GB/s per node
11:57:17 <oneswig> janders: ever considered bcache with backing to something bigger and slower?
11:58:08 <oneswig> Looking ahead to 2019, we have sessions planned on 2FA and iRODS in the pipeline for January. Anything else people would like to see?
11:58:38 <janders> not at CSIRO - in my days at RHAT I did look at something similar
11:58:46 <oneswig> If we are looking at iRODS, we might also want to look at Rucio
11:59:26 <oneswig> OK, we are at time - anything more to add?
11:59:28 <janders> one last comment on BeeGFS - I finally spent some time thinking about BeeGFS-OpenStack integration ideas - will share a googly doc
11:59:45 <oneswig> janders: the people I met were very enthusiastic about this, please do!
11:59:50 <verdurin> janders: yes, would be interested to see
12:00:03 <janders> what I have right now is nothing groundbreaking but it'll be a start
12:00:14 <janders> and other than that - Merry Christmas All! :)
12:00:27 <oneswig> Same to you janders - season's greetings everyone
12:00:29 <janders> I'll be away 21 Dec - 6 Jan so speak after the break
12:00:38 <oneswig> #endmeeting
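
Editorial note on the BeeGFS figures quoted under AOB (11:48 to 11:55): the per-node and aggregate bandwidth numbers can be sanity-checked with a few lines of arithmetic. The sketch below is only a back-of-envelope check, not anything run by the participants. It assumes a nominal 100 Gb/s per EDR InfiniBand port (real payload bandwidth is somewhat lower, which is consistent with the "around 24GB/s" quoted), and the function names are hypothetical, chosen for illustration.

```python
# Back-of-envelope check of the BeeGFS bandwidth figures quoted in the log.
# Inputs marked "log" are quoted by janders; the EDR rate is an assumption.

EDR_GBIT_PER_PORT = 100    # assumption: nominal EDR InfiniBand rate per port
PORTS_PER_NODE = 2         # log: "2xEDR"
QUOTED_NODE_GBYTES = 24    # log: "around 24GB/s per node"
SERVERS_TESTED = 8         # log: "8 servers"
PEAK_GBYTES = 180          # log: "180GB/s peak"
SUSTAINED_GBYTES = 160     # log: "160GB/s sustained"
TOTAL_SERVERS = 32         # log: "32 nodes with 24NVMes each"


def network_ceiling_gbytes(ports: int, gbit_per_port: float) -> float:
    """Nominal network ceiling per node in GB/s (8 bits per byte)."""
    return ports * gbit_per_port / 8


def aggregate_gbytes(servers: int, per_node_gbytes: float) -> float:
    """Aggregate bandwidth if every server delivers its per-node figure."""
    return servers * per_node_gbytes


def efficiency(measured_gbytes: float, servers: int, per_node_gbytes: float) -> float:
    """Fraction of the aggregate per-node figure that was actually measured."""
    return measured_gbytes / aggregate_gbytes(servers, per_node_gbytes)


if __name__ == "__main__":
    print("nominal 2xEDR ceiling per node (GB/s):",
          network_ceiling_gbytes(PORTS_PER_NODE, EDR_GBIT_PER_PORT))
    print("8-server aggregate at quoted per-node rate (GB/s):",
          aggregate_gbytes(SERVERS_TESTED, QUOTED_NODE_GBYTES))
    print("peak efficiency:",
          round(efficiency(PEAK_GBYTES, SERVERS_TESTED, QUOTED_NODE_GBYTES), 3))
    print("sustained efficiency:",
          round(efficiency(SUSTAINED_GBYTES, SERVERS_TESTED, QUOTED_NODE_GBYTES), 3))
    # Hypothetical projection only: assumes the linear scaling oneswig
    # describes up to 24 servers continues to all 32, which was not tested.
    print("32-server projection at quoted per-node rate (GB/s):",
          aggregate_gbytes(TOTAL_SERVERS, QUOTED_NODE_GBYTES))
```

On these assumptions, 8 servers at the quoted 24 GB/s give 192 GB/s, so the reported 180 GB/s peak and 160 GB/s sustained correspond to roughly 94% and 83% of that figure. The 32-server number is an extrapolation, not a measurement.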