21:01:01 #startmeeting scientific-sig
21:01:02 Meeting started Tue Jan 7 21:01:01 2020 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:01:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:01:05 The meeting name has been set to 'scientific_sig'
21:01:30 A quiet one for today I think
21:01:44 martial?
21:02:00 Is there a meeting today oneswig ?
21:02:12 Hi trandles, perhaps...
21:02:13 Also, hello oneswig and happy new year
21:02:29 right back at ya, hope you had a good break. All well here.
21:03:02 As a resident of the high plains, I expect you'd struggle to picture how bracing the sea is round our way on Christmas day :-)
21:03:06 Yes, thanks, good break. Luce and I spent the holidays at home and put a 3-day trip to San Francisco in between
21:03:45 I've spent Christmas in Gosport. 2C along the sea feels a LOT colder than 2C in Santa Fe. ;)
21:04:02 I have no doubt
21:04:33 In the office, we have our 6-monthly design summit next week. Following the discussion last month I put podman onto the list for discussion
21:04:52 excellent
21:05:12 I'm just about (in an hour or so) to deploy Train into production using OpenStack-Ansible.
21:05:35 Ooh, exciting.
21:05:48 New system or upgrade of a previous one?
21:05:57 Other than dealing with gnocchi, it hasn't been too bad. Just the normal lack of documentation of the key variables that need to go into your config.
21:06:01 New system
21:06:08 The previous was testbed-only
21:07:43 As always, I'd be interested to hear how it goes. Does OSA still build LXC containers on the hosts?
21:07:58 Yes, LXC containers.
21:08:59 Does it support SELinux? Something else on the agenda.
21:09:14 I don't believe so
21:09:28 But I'm likely to be wrong there. I didn't look into it.
21:11:25 A fiddly problem, every time I get caught out by it.
21:12:43 IIRC the LXC containers follow the lightweight OS model, right? Do they also have private network namespaces, IP addresses, etc?
21:12:57 Yes, they have their own IPs
21:13:18 They're also running systemd to init multiple processes per container
21:13:52 Makes sense.
21:14:36 I was hearing earlier about a new ARM CI resource - apparently it'll be all IPv6 on the control plane (for fun, perhaps?)
21:14:44 So there is a nova container with api, conductor, and scheduler all running, for instance
21:15:10 I have a couple ARM nodes in our production cloud.
21:15:20 Haven't booted them in anger yet though
21:16:08 Otherwise, haven't done much with ARM myself. We have a couple ARM clusters from Cray...
21:16:26 I think I heard that somewhere...
21:18:35 The news at this end for today has been a new project testing out Mellanox VF-LAG. We didn't get to the point of integrating it yet but that should come in tomorrow. Has promise, I think
21:18:41 although someone's dog ate all the documentation
21:18:49 lol
21:19:13 If it works for us, we'll write up the experience for sure.
21:19:43 Your new deploy, is that going to be bare metal compute?
21:19:56 Your use cases are pretty far advanced from ours. We're just going into production with a basket of internal use cases and a mob of users with pitchforks and torches screaming for "cloud" but without any ability to write a one-page proposal on what they really think they're going to use it for.
21:21:33 We need this because I have to exercise a filesystem with massive connectivity using as few (virtualised) clients as possible. The tangled webs we weave.
21:22:06 The first use of the production cloud will be bog standard VMs. We have some internal use cases for containers on bare metal but that's stage 2. I'm still stuck as the only one tasked with getting this up and going and it's only something <40% of my time. As you can imagine, having to execute quickly and efficiently to put a completely new class of resource into production here with only 0.4 FTE to do so is highly suboptimal.
21:22:56 We've tried twice in the past 16 months to hire a dedicated FTE to do nothing but cloud and been thwarted both times. :(
21:22:57 trandles: get your logging and monitoring in good order and you'll have a better chance.
21:23:59 sorry to hear the recruitment fell through, I'd have thought it's an excellent job (for a respectable American!)
21:24:09 The one thing we have boatloads of is monitoring and logging. All of my logs will be going to rabbitmq and straight into splunk. We have a team of devs that are splunk gurus and they're working up custom dashboards for me right now.
21:24:50 I am sure that will help. One could lose oneself without it.
21:26:38 Perhaps we should wrap up to let you kick off that deploy! Over here I have bioinformatics to think about.
21:27:01 I'm hoping to get some friendlies working on it by the end of the week. If initial testing goes well we'll be quickly ramping up to ~100 hypervisors and ~1.5PB of block and object storage.
21:27:12 How's the meetup going in London?
21:27:36 Two people from our team are there (johnthetubaguy and wasaac)
21:28:00 Apparently very good discussions and lots of input, from what I've heard. Very positive.
21:28:16 I've not been to a meetup. Is the London one well attended? Audience from all over Europe or just mainly the UK?
21:28:46 I gather it's mostly UK, some Europe, a few Americans.
21:29:11 I don't have numbers to make that data at all useful.
21:30:14 Have fun with bioinformatics. That reminds me, I need to schedule a meeting with some genomics folks here at the lab who are using Charliecloud and BEE (a workflow engine) to discuss their next steps porting everything to Common Workflow Language.
21:30:59 Coming from a physics background, the bio community is a whole different beast.
21:31:16 workflows - BEE - I've heard of a few but not that one.
21:31:38 for sure. But a beast for whom cloud is a beauty :-)
21:32:21 indeed
21:33:10 Be interesting to see how Charliecloud gets integrated into these workflows. Does this research become public at some point?
21:33:33 It's all open sourced
21:33:56 BEE is currently being refactored, but that's a much longer conversation
21:34:39 We're now running extremely large production codes at scale (~15000+ MPI ranks) using Charliecloud
21:36:19 That's great. When people talk about MPI in containers, it so often boils down to some abominable hack involving spawning ORTE via ssh for mapping the ranks.
21:36:52 possibly commented along the lines of "must fix this after graduation"...
21:37:09 meee-ow
21:37:31 The amount of FUD around MPI and containers is a major point of contention for me. It's not black magic. We run MPI applications from containers without even having MPI installed on the host
21:38:14 All of the misconceptions, and dare I say lies, about MPI and containers need to stop. It's holding up adoption in general.
21:38:46 ORTE is deprecated anyway :P
21:38:47 But the containers in your case are launched within a process framework like PMIx, right?
21:39:13 Yes, using something like PMI2 or PMIx solves it
21:39:31 ORTE is deprecated by Open MPI. mpirun is deprecated by Open MPI.
21:40:01 And your network fabric is based on verbs or libfabric as well?
21:40:02 If you're not already using PMI2 or PMIx then you're in for a very rude awakening
21:41:35 Ideally you'd use something like UCX but I'm not sure it's production ready yet. We did have this big ugly compatibility matrix on a whiteboard of various interconnects, BTL, MCA, etc. but it just made everyone sad.
21:42:00 Things should get much better with Open MPI 4+
21:42:13 MVAPICH just seems to work
21:42:14 I think the injection of host MPI libraries into the container (like shifter) raises eyebrows, but that's only for proprietary host environments, right?
21:42:46 Yeah, injection like that is for proprietary fabrics like Cray's
21:43:15 there was an effort a while back to create a ceph messenger class based on UCX but alas I think it didn't yield.
21:43:56 I also think ultimate portability is overblown. Our big code teams have convinced me that at least for the workloads I need to care about, portability of a container image isn't very valuable.
21:44:39 how does that square with UDSS?
21:44:56 The toolchains used to reliably build container images mean they're very happy to just build a new image customized for the platform they're targeting. The benefits of containerization are provenance of the application runtime and the support libraries.
21:45:42 UDSS seems to still hold the same value in that it allows bringing in environments/runtimes that normally couldn't run at all.
21:45:49 I just hope you never sit next to the creator of ld.so at a party :-)
21:46:41 Things like TensorFlow, DASK, etc., where the effort to support it for a small community at the lab is way too large, and the user either can't or doesn't want to figure out how to build and run from a home or project directory.
21:48:25 We have a code with over 300 dependencies that get managed using environment modules, LD_LIBRARY_* hacks, etc. For the code teams to build a container image with exactly the versions of everything they need, eliminating the many opportunities to make mistakes with modules and environment variables, is a massive win.
21:50:09 I was looking for the code using ssh and ORTE for MPI rank assignment I referred to (in kubeflow), but it looks like the openmpi support has been taken out.
21:52:48 That just sounds horribly ugly
21:53:08 trandles: certainly, I can see the advantages of UDSS in a diverse environment (which is any environment, surely!)
21:54:00 perhaps the removed code was put out of its misery.
21:56:23 Anyway, time to wrap up...
21:56:35 I was challenged at a conference last year to prove my statement that it doesn't matter what version of MPI is on the host compared to what's in the container. I ran a multi-node containerized MPI job on a cluster that didn't have MPI installed. I think what you're referencing (ssh and ORTE, etc.) is a symptom of what Reid and I have been fighting since our rejected 2016 SC paper on Charliecloud. Everyone is working way too hard; it's not difficult.
21:57:13 Another good SIG meeting. ;)
21:57:14 trandles: That specific example is due to running in k8s
21:57:21 lacking PMI
21:57:45 Ah, yes, well, k8s is the poster child of "working too hard."
21:58:06 I think the challenge you had might be because people assume the crafty work done by shifter is generally applicable.
21:58:43 It was a Singularity dev who told me I was wrong about MPI
21:58:45 whereas what's needed is /dev/uverbs (or whatever it was)
21:59:03 hmmm
21:59:15 did they mean different MPIs within a single application?
21:59:49 I have no idea what the Singularity folks are on about most of the time, so I'm not sure.
22:00:20 good god, git pull of kubeflow was 205 MB, perhaps reinforcing your point :-)
22:00:27 woah
22:00:31 ah, time to close.
22:00:44 good chatting trandles, until next time
22:00:53 Likewise! Have a good night. :)
22:00:55 Now to have fun with the OS-A deploy!
22:01:01 #endmeeting
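
A minimal sketch of the launch model described in the discussion above, assuming a Slurm cluster with the PMIx plugin and a Charliecloud image that already contains Open MPI and mpi4py; the image path, task counts, and file names below are illustrative, not taken from the meeting:

    # mpi_hello.py -- lives inside the container image; the host needs no MPI install,
    # because Slurm's PMIx plugin wires up the ranks instead of mpirun/ORTE.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    print(f"rank {comm.Get_rank()} of {comm.Get_size()} on {MPI.Get_processor_name()}")

    # Illustrative launch (image directory and job size are assumptions):
    #   srun --mpi=pmix -N 4 --ntasks-per-node=16 \
    #        ch-run /var/tmp/mpi-image -- python3 /mpi_hello.py

On PMI2-only installations, srun --mpi=pmi2 plays the same role. The point made in the log is that the MPI library only has to exist inside the image, with rank wire-up handled by the resource manager rather than by spawning ORTE over ssh.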