21:00:26 <oneswig> #startmeeting scientific-sig
21:00:26 <openstack> Meeting started Tue Jun 12 21:00:26 2018 UTC and is due to finish in 60 minutes.  The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:30 <openstack> The meeting name has been set to 'scientific_sig'
21:00:45 <oneswig> Think I got the spelling right this week...
21:00:59 <oneswig> #link Agenda for today https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_June_12th_2018
21:02:33 <janders> g'day everyone
21:02:41 <oneswig> Hello - Jacob?
21:02:53 <janders> yes, that's me :)
21:03:03 <janders> how are you Stig?
21:03:14 <oneswig> Good, but busy...
21:03:36 <oneswig> Rolling out a deploy of bare metal + IB, as it happens...
21:03:49 <oneswig> (although we are not onto the interesting bits yet)
21:03:50 <janders> sounds familiar :) that'd do it to you
21:04:07 <oneswig> How's your work going?
21:04:24 <janders> I keep hearing new MOFED with eth_ipoib will be out soon (even if only a beta)
21:04:57 <janders> it's good. I've been focusing mostly on the networking lately, running into un-interesting ethernet problems... (solved)
21:04:58 <oneswig> I'm quite pleased we decided not to depend on it.
21:05:23 <janders> I read through your solution - clever! :)
21:05:27 <oneswig> All ethernet problems are interesting, in the right audience :-)
21:05:39 <janders> the more interesting part is multirail
21:05:42 <oneswig> thanks janders - borne of necessity
21:05:57 <janders> I got it working, though there are some "interesting" issues
21:06:06 <janders> have you ever tried multirail IB with Ironic?
21:06:23 <oneswig> No.  We only have one NIC and we use one port as IB and the other as Ethernet
21:06:32 <janders> I need it for 1) storage 2) AI (DGX) systems
21:06:37 <oneswig> What happens?
21:06:50 <janders> the issue I'm seeing is - if I have multiple IB ports defined, instance creation fails
21:06:54 <janders> "neutron timeout"
21:07:01 <janders> (though no errors in neutron)
21:07:05 <oneswig> dug any deeper?
21:07:27 <janders> however if I have one ironic port on instance creation and add the other three after - that works and I can attach ports OK
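A minimal sketch of the workaround described above — boot the bare metal instance with a single Neutron port, then attach the remaining IB rails once it is active. It assumes openstacksdk; the network/flavor/image names are purely illustrative and proxy methods such as create_server_interface should be checked against your SDK release.

    # Hypothetical sketch: one Ironic port at boot, attach the other rails afterwards.
    # Names are illustrative; verify the openstacksdk calls for your SDK version.
    import openstack

    conn = openstack.connect(cloud="mycloud")        # cloud name is an assumption
    net = conn.network.find_network("ib-rail-1")

    # Boot the bare metal instance with a single port
    server = conn.compute.create_server(
        name="dgx-node-0",
        flavor_id=conn.compute.find_flavor("baremetal-dgx").id,
        image_id=conn.image.find_image("centos7-ib").id,
        networks=[{"uuid": net.id}],
    )
    server = conn.compute.wait_for_server(server)

    # Attach the remaining IB rails once the instance is ACTIVE
    for rail in ("ib-rail-2", "ib-rail-3", "ib-rail-4"):
        rail_net = conn.network.find_network(rail)
        port = conn.network.create_port(network_id=rail_net.id)
        conn.compute.create_server_interface(server, port_id=port.id)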
21:07:44 <b1airo> Morning. A bit hectic here, need 10 mins to get kids organised...
21:07:45 <janders> I suspect something between neutron server and NEO. I need to chat to my mlnx friends..
21:08:01 <janders> good morning Blair
21:08:02 <oneswig> I was wondering if you were getting the wrong "MAC" for DHCP on the wrong physical rail
21:08:04 <oneswig> Hey b1airo
21:08:08 <oneswig> #chair b1airo
21:08:10 <openstack> Current chairs: b1airo oneswig
21:08:11 <martial_> Hi Stig, Blair
21:08:19 <oneswig> aha, hi martial_
21:08:23 <oneswig> #chair martial_
21:08:25 <openstack> Current chairs: b1airo martial_ oneswig
21:08:40 <oneswig> How many DockerCons have you been to this year martial_? :-)
21:08:55 <martial_> two this year :)
21:09:09 <martial_> Federal one + the real US one ... not the EU one
21:09:12 <oneswig> Can't get enough of the stuff, eh...
21:09:38 <martial_> something like that :)
21:10:19 <oneswig> How's the level of interest in GPU/AI/ML (and by some extension, HPC)?
21:11:33 <martial_> will tell you when it really starts (ie tomorrow)
21:11:40 <oneswig> Ah, the calm before the storm.
21:11:51 <oneswig> Has DataMachines got a booth?
21:11:51 <martial_> today is more see who is out there, what ...
21:11:56 <martial_> OpenStack has a booth
21:12:01 <martial_> DMC does not
21:12:15 <oneswig> You're manning the OpenStack booth?
21:12:33 <martial_> The scientific effort will be in a SIG tomorrow. I will attend
21:12:45 <martial_> I am not, Chris Hoge and others are
21:13:50 <oneswig> Something I don't know, what's the overlap between Kata and Docker?  Lots?  None?
21:14:04 <janders> this DockerCon SIG meeting sounds interesting - maybe it's something we could add to next week's agenda for this meeting? :)
21:14:27 <martial_> some, kata is a container solution native to OpenStack, Docker is still not
21:14:33 <oneswig> I'm guessing the images are different, given the Kata ones are booted rather than started
21:15:07 <janders> this reminds me of something from the Vancouver Summit that might be worth mentioning
21:15:12 <martial_> I will tell you after I talk to the people of the SIG (Christine and Christian are going to be there)
21:15:12 <b1airo> Yes, but you can make a Kata container from a Docker image I believe
21:15:35 <janders> some of you attended DK Panda's talk on the last day
21:15:46 <janders> he was talking about optimising overheads in containers
21:15:47 <martial_> yes, same as CharlieCloud or Singularity (or docker save to tar)
21:15:57 <oneswig> martial_: conversely, thanks to Kolla, OpenStack runs natively well on Docker (once you disable all that networking...)
21:16:20 <janders> I had to leave early to attend a meeting so didn't get to ask this question but where did he see these overheads?
21:16:23 <janders> networking I suppose?
21:17:00 <oneswig> janders: networking is painful owing to the overlays and bridging on the host side.
21:17:18 <janders> indeed
21:17:53 <janders> but I wonder if he's hit container overheads in other areas
21:17:56 <oneswig> I'm interested in the scope for VF pass-through, SRIOV style.  In a way that doesn't circumvent the overlays - wonder if that is possible
21:18:38 <oneswig> janders: filesystems of many overlays can often be cited - each adding a layer of indirection
21:18:58 <janders> re SRIOV/vf - to the best of my knowledge nothing like this exists today with k8s and kuryr, however it would be of interest to us as well
21:20:00 <janders> speaking of filesystems - do you have experience with passing through an RDMA-native system mounted on the "container host" to containers?
21:20:04 <martial_> I think the Cloud Native foundation will be there as well, I want to talk to them as well
21:20:46 <oneswig> janders: We've used GlusterFS like that I believe and I think it JFWed
21:21:16 <oneswig> martial_: keep an eye out for Michael Jennings - I saw a really great presentation from him in April on Charliecloud
21:22:36 <oneswig> janders: it ought to work because the container's seeing a filesystem, the VFS implementation shouldn't be affected by the encapsulation (in theory)
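As a rough illustration of the approach being discussed — the parallel filesystem (GlusterFS, BeeGFS, GPFS) is mounted on the container host and simply exposed to the container as a bind mount, so the container only ever sees a VFS. A sketch using the docker Python SDK; the mount point and image are illustrative assumptions.

    # Sketch: expose a host-mounted RDMA filesystem to a container via a bind mount.
    # /mnt/beegfs and the image name are illustrative.
    import docker

    client = docker.from_env()
    output = client.containers.run(
        "centos:7",
        "df -hT /mnt/beegfs",
        volumes={"/mnt/beegfs": {"bind": "/mnt/beegfs", "mode": "rw"}},
        remove=True,
    )
    print(output.decode())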
21:23:27 <janders> oneswig: thanks! :) I will likely try a similar approach soon
21:24:04 <oneswig> martial_: be very interested to know if you hear more about https://rootlesscontaine.rs or similar efforts - unprivileged container runtimes
21:24:40 <b1airo> Yeh, I think someone here at Monash tried it on a DGX and it works
21:24:59 <oneswig> janders: for extra points, orchestrate your GlusterFS volume using Manila
21:25:01 <martial_> interesting indeed
21:25:11 <martial_> do not know if they are here but I can try to check
21:25:39 <trandles> Hey there.  Sorry, been lurking and working...I can ask Michael Jennings if he's going to be at <insert name here> if someone fills in that blank for me.
21:25:50 <oneswig> Hi Tim!  Smoked you out :-)
21:25:55 <janders> oneswig: we'll likely use BeeGFS or perhaps GPFS
21:25:56 <trandles> indeed
21:25:59 <oneswig> DockerCon
21:26:19 <martial_> Hi Tim, yes DockerCon :)
21:26:27 <janders> b1airo: do you happen to run your DGXes with OpenStack?
21:26:28 <oneswig> janders: I have an ansible module under my wing for BeeGFS but it's a little alpha right now.
21:26:46 <oneswig> How are you provisioning the filesystem and mounting the client?
21:27:18 <b1airo> janders: not today, but we are planning to put Ironic under everything eventually
21:27:34 <trandles> @mej: Nope, no DockerCon for me.  Would love to go sometime, though.
21:27:53 <oneswig> brb
21:28:29 <janders> oneswig: with our BeeGFS, we've finished speccing out the hardware and the kit is "in flight" so we haven't done much hands-on work yet, but it is coming soon
21:29:03 <oneswig> janders: might have these roles on Ansible Galaxy by then, will let you know if so...
21:29:33 <janders> b1airo: thanks! DGX1 is one of my motivations behind implementing multi-rail IB with Ironic.
21:30:02 <janders> oneswig: great! we would be more than happy to work with you on this one
21:30:19 <oneswig> Guess we ought to look at the agenda...
21:30:24 <janders> good call
21:30:43 <oneswig> b1airo: martial_: can you guys drive a sec - family bed-time matters to attend to...
21:31:15 <b1airo> Copy oneswig
21:31:24 <martial_> sure thing
21:32:02 <martial_> joined a little after the hour ... need to quickly check the agenda link
21:32:05 <b1airo> #link https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_June_12th_2018
21:32:50 <b1airo> #topic sensitive data processing on OpenStack
21:33:10 <martial_> mmmh extra agenda item?
21:33:14 <martial_> how sensitive?
21:33:23 <b1airo> Firstly, we're still collecting input on this and trying to form up what efforts here should look like
21:34:16 <martial_> so my experience with this is related to a private cloud type work
21:34:17 <oneswig> back now.  Good topic :-)
21:34:20 <b1airo> I think we can at least extend the book by a chapter or two
21:34:33 <oneswig> agreed, if we can
21:34:39 <martial_> not a shared cloud
21:34:49 <martial_> for SC18?
21:34:54 <martial_> (re book)
21:34:59 <oneswig> martial_: private in the most private sense, ie one tenant only?
21:35:15 <b1airo> Maybe, not sure whether we'll have it ready
21:35:49 <b1airo> Ha, well tenant per hypervisor isolation is one control we've considered using
21:36:06 <b1airo> But remember, "defence in depth"!
21:36:17 <oneswig> I'm interested in hearing all the hoops people jump through and what each gives them
21:36:50 <martial_> more with known tenants
21:37:01 <martial_> otherwise we have to create separate hardware instances
21:37:12 <oneswig> How would you implement per-tenant hypervisor isolation - is there a filter for doing that or would it be about creating host aggregates with restricted visibility (if that's possible)
21:37:21 <martial_> good news, Ansible makes it easier obviously
21:37:25 <b1airo> #link https://etherpad.openstack.org/p/Scientific-SIG-Controlled-Data-Research
21:37:44 <janders> martial: even easier with ansible + ironic
21:37:53 <b1airo> oneswig: yeah there is an isolation filter
21:38:27 <b1airo> But other ways with aggregates too, depends on the overall cloud design I guess
21:38:51 <oneswig> Might be worth a line item under the technologies section b1airo
21:39:07 <oneswig> janders: what are you doing for cleaning - how thorough?
21:39:24 <b1airo> Yeah I still haven't dumped much in the etherpad yet, definitely have more to add
21:39:39 <oneswig> b1airo: the perfect accompaniment to breakfast :-)
21:39:51 <janders> I think AggregateMultiTenancyIsolation is one of the keywords
21:40:02 <b1airo> Ha, I'm already smearing honey on my phone so why not!
21:40:06 <janders> I personally haven't used it though, just remember it exists
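For reference, AggregateMultiTenancyIsolation works by enabling the filter in nova.conf ([filter_scheduler] enabled_filters) and tagging a host aggregate with a filter_tenant_id key so only that project lands on those hypervisors. A hedged openstacksdk sketch of the aggregate side follows; the aggregate proxy method names are assumptions to verify against your SDK release, and host names and the project ID are illustrative.

    # Sketch: dedicate an aggregate of hypervisors to one project with the
    # AggregateMultiTenancyIsolation scheduler filter (enable it first in
    # nova.conf under [filter_scheduler] enabled_filters).
    # openstacksdk aggregate calls are assumptions - check your SDK version.
    import openstack

    conn = openstack.connect(cloud="mycloud")         # cloud name is illustrative

    agg = conn.compute.create_aggregate(name="sensitive-data-hosts")
    for host in ("compute-10", "compute-11"):         # hostnames are illustrative
        conn.compute.add_host_to_aggregate(agg, host)

    # Only this project can be scheduled onto hosts in the aggregate
    conn.compute.set_aggregate_metadata(agg, filter_tenant_id="0123456789abcdef")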
21:41:05 <oneswig> Thanks janders, good to know
21:41:16 <janders> oneswig: not cleaning yet, though will be looking at this shortly. Are you thinking storage, firmware or both?
21:41:25 <b1airo> Ok, shall we move on briefly..?
21:41:46 <oneswig> janders: the first spanner is local disks, if you have more than one.
21:42:05 <oneswig> We can follow up offline, we've got some handy tools up on github for when you tackle this
21:42:16 <janders> oneswig: ok! thanks
21:42:22 <b1airo> oneswig: do self encrypting drives help there?
21:42:54 <oneswig> yes - insta-scramble - without that one of our bare metal flavors takes 29 hours to clean.  Not much use...
21:43:16 <janders> ouch!
21:43:30 <b1airo> Lol, yeah I think our users would assume Nova was out to lunch if that happened
21:43:50 <oneswig> Thankfully we only have one of those nodes, but it's too long.  What we really want is to erase LVM signatures and partition tables.
21:44:11 <oneswig> That's enough to prevent it breaking the next deployment.
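Roughly speaking, a metadata-only clean amounts to wiping filesystem/LVM signatures and zapping the partition tables so stale state cannot confuse the next deployment, rather than overwriting every block. A hypothetical sketch of that idea with standard tools is below; the device list is invented, and in production the equivalent is Ironic's metadata erase clean step rather than hand-rolled scripts.

    # Rough sketch of a metadata-only disk clean: remove filesystem/RAID/LVM
    # signatures and the partition tables. Destructive by design; device names
    # are illustrative.
    import subprocess

    DISKS = ["/dev/sda", "/dev/sdb"]

    for disk in DISKS:
        # wipe filesystem, RAID and LVM signatures
        subprocess.run(["wipefs", "--all", disk], check=True)
        # zap GPT and MBR partition tables
        subprocess.run(["sgdisk", "--zap-all", disk], check=True)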
21:44:25 <janders> I'm hoping to use some of the smarts in the SSDs to help
21:44:32 <oneswig> anyway, let's move on...
21:44:43 <martial_> fair enough
21:44:44 <b1airo> #topic workload management
21:45:29 <oneswig> We have a summer intern to work on polishing a slurm-as-a-service packaging
21:46:00 <b1airo> Sounds interesting oneswig , what needs polishing?
21:46:18 <b1airo> And how does that relate to what's in openhpc?
21:46:24 <oneswig> Right now, we've got a bag of tools, it really needs some better integration without loss of modularity
21:47:19 <oneswig> There's a pair of Ansible modules on Galaxy for creating heat stacks or magnum clusters and spitting out a structured ansible inventory for further config.
21:47:50 <oneswig> It works pretty well for people who know its innards.  Should be easier to use though.
21:48:02 <b1airo> Ok, sounds interesting
21:48:49 <oneswig> #link heat clusters https://galaxy.ansible.com/stackhpc/cluster-infra/
21:48:52 <b1airo> Is that largely about setting up the inventory then, or is the idea to go all the way to a complete running SLURM?
21:49:11 <oneswig> #link magnum k8s/swarm clusters: https://galaxy.ansible.com/stackhpc/os-container-infra/
21:49:52 <oneswig> b1airo: we follow a policy of only doing the infra in heat because it's like working in a small dark cupboard when you're debugging something
21:50:06 <oneswig> Ansible takes over pretty much as soon as the infra's up.
21:50:21 <oneswig> Using OpenHPC, it's pretty easy to put Slurm on top
21:50:55 <b1airo> Right
21:51:07 <oneswig> That ansible's here: https://github.com/stackhpc/p3-appliances - somewhat ragged round the edges
21:51:33 <b1airo> So SLURM seems to be all the rage but has anyone had any luck getting it to behave nicely in a dynamic environment?
21:51:55 <b1airo> oneswig: ragged around the edges, I can relate!
21:51:56 <oneswig> To make a slurm config file from an ansible inventory: https://github.com/stackhpc/p3-appliances/blob/master/ansible/roles/openhpc_runtime/templates/slurm.conf.j2
21:52:30 <oneswig> It turns out Ansible's facts have just enough data to populate what Slurm requires - phew
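The idea, roughly, is that per-host Ansible facts (sockets, cores, memory) map straight onto Slurm's NodeName definitions. A simplified sketch of that templating step follows, with hostnames and fact values invented for illustration; the real template is the slurm.conf.j2 linked above.

    # Simplified sketch of rendering Slurm node definitions from Ansible-style facts.
    # Hostnames and fact values are invented for illustration.
    from jinja2 import Template

    facts = {
        "cn01": {"sockets": 2, "cores": 16, "memory_mb": 191000},
        "cn02": {"sockets": 2, "cores": 16, "memory_mb": 191000},
    }

    template = Template(
        "{% for host, f in facts.items() %}"
        "NodeName={{ host }} Sockets={{ f.sockets }} CoresPerSocket={{ f.cores }} "
        "RealMemory={{ f.memory_mb }} State=UNKNOWN\n"
        "{% endfor %}"
        "PartitionName=compute Nodes={{ facts.keys()|join(',') }} Default=YES\n"
    )
    print(template.render(facts=facts))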
21:52:47 <janders> oneswig: nice work!
21:52:53 <trandles> b1airo: by "dynamic environment" do you mean adding/removing compute nodes?
21:52:59 <trandles> on the fly
21:53:06 <oneswig> The dynamic one's interesting.  A lot of people configure the maximum-sized cluster and run with most of it missing.
21:53:10 <b1airo> trandles: yep
21:53:29 <trandles> last I looked closely there was functionality around "cloud burst" type stuff
21:53:51 <oneswig> The thorny bit - would love to see a working demo of this - is scaling down
21:54:00 <trandles> the max cluster stuff oneswig references is another way, you just look like you have a load of down nodes when slurmd isn't running on cloud nodes
21:54:01 <b1airo> Is it really cloud burst or actually thinly disguised power management...
21:54:32 <trandles> b1airo: I can't answer that, never gave it a serious look
21:54:44 <b1airo> Can you rename nodes without bringing slurmd down?
21:55:23 <trandles> there are annoying things to deal with...such as telling slurmctld and the slurmd's that they shouldn't care if the config files differ
21:55:25 <oneswig> b1airo: I don't think so.  Or perhaps I'm thinking of Ceph OSDs
21:55:42 <b1airo> Another point is, I'm not sure how suitable SLURM really is for handling high throughput workloads that would be bread and butter for a dynamic cluster
21:55:51 <oneswig> trandles: you can do that? revolutionary. How?
21:56:16 <janders> it will be interesting to see when cluster management software vendors will add "native" OpenStack support - as in using OpenStack APIs to create all the infrastructure from scratch
21:56:23 <b1airo> From what I've seen at a distance it is not very good at handling huge numbers of jobs in queue
21:56:25 <trandles> oneswig: there's a config param in later versions, not sure if it only appeared in 17.X or earlier, to say "don't hash the configs"
21:56:33 <janders> I was having a chat about this with Bright recently, they seem to be working on it
21:56:54 <oneswig> b1airo: huge as in?  Cambridge Slurm, apparently the queue was normally 4 days long
21:56:55 <trandles> we've gone slurm-only at LANL because moab couldn't handle large queues efficiently
21:57:23 <trandles> slurm can limit queue sizes, both max and by UID
21:57:29 <trandles> so one user can't hose up the works
21:57:37 <oneswig> janders: Bright's doing some interesting stuff around mixing bare metal clusters and openstack dynamically
21:57:45 <b1airo> Huge as in millions of jobs
21:58:01 <oneswig> that'll do it.
21:58:08 <trandles> I know we handle 10's of thousands, not sure about millions
21:58:35 <oneswig> b1airo: what's the alternative? HTCondor?
21:58:36 <trandles> does any HPC WLM handle millions without a 2 hour scheduling cycle?
21:59:18 <trandles> <aside> oneswig: Charliecloud v0.2.5 just released with new functionality to better handle MPI jobs
21:59:23 <b1airo> Yeah we had problems with a user trying to submit 10s of thousands recently and I was really surprised because I used to run experiments with millions of jobs using Nimrod (against PostgreSQL) years back
21:59:28 <janders> oneswig: indeed - however till now it was mostly running OpenStack on a cluster fully managed by Bright. I think now they are starting to realise it might make sense to do it the other way round. Have you seen any of this functionality yet?
21:59:36 <oneswig> trandles: nice work!  What about OpenHPC packaging?
22:00:10 <trandles> oneswig: OpenHPC in the next release.  One of the Intel devs was just working on an RPM spec file for Charliecloud in the upcoming release.
22:00:15 <b1airo> Yikes, time's up!
22:00:22 <oneswig> janders: no but you're absolutely right, it was an issue that their OpenStack was basically proprietary
22:00:27 <oneswig> Oops...
22:00:30 <b1airo> #topic final comments...
22:00:58 <oneswig> Nothing major from here
22:01:18 <oneswig> Having fun with new Ceph on old hardware
22:01:33 <b1airo> Seems like we could at least do more knowledge and pain sharing in this space anyway
22:01:56 <oneswig> Always seemed to be a good thing
22:02:00 <b1airo> Did it go a lot faster oneswig ?
22:02:13 <b1airo> We're just about to try the mimic upgrade...
22:02:19 <oneswig> Not sure yet - trying to figure out multipathing runes for it currently...
22:02:24 <janders> we didn't get to talk much about the focus areas for this cycle... next week?
22:02:30 <oneswig> ooh, mimic's just out isn't it?
22:02:47 <b1airo> Multipath, urgh. So Lustre hardware then
22:02:50 <oneswig> janders: sounds good, come prepared...
22:02:59 <oneswig> b1airo: got it in one
22:03:08 <b1airo> Tentacles loaded!
22:03:21 <b1airo> Ok, thanks all!!
22:03:25 <oneswig> I'm tempted by mimic now
22:03:31 <oneswig> Thanks everyone
22:03:34 <b1airo> #endmeeting