21:00:14 #startmeeting scientific-sig
21:00:15 Meeting started Tue Jan 5 21:00:14 2021 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:16 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:19 The meeting name has been set to 'scientific_sig'
21:00:31 32 cores, 128GB mem and an Nvidia GPU
21:00:32 Thanks for the reminder Martial, was just watching "agents of shield" with young sir.
21:01:03 Is that like a Jetson Nano?
21:01:16 Or are these beefy x86 cores?
21:01:26 well it might be a quiet meeting and you can get back to it soon
21:01:54 beefy cores: AMD 5950x 16 cores, 32 "threads"
21:02:10 Sounds like a new gaming rig :-)
21:02:24 So what else is new?
21:02:27 well yes, that too ;)
21:02:55 Hi all. Sorry I'm late
21:03:08 here not much, end of the year time
21:03:16 welcome Tim
21:03:16 Tim Randles is never late, he arrives exactly when he means to
21:03:25 and Jim :)
21:03:44 Hey guys, happy new year
21:05:08 Same to you
21:05:14 Happy new year :)
21:05:22 Ditto.
21:05:32 OpenStack-themed resolutions for 2021?
21:05:50 Absolutely: Full production cloud in H1C21
21:06:25 Ironic for all HPC-owned system provisioning and deployment (it's looking like I might win that battle)
21:06:28 same for me actually
21:06:29 Awesome
21:06:49 production in H1C21
21:06:51 Ooh!
21:06:58 A colleague has been working on OpenShift-on-OpenStack. Mixed results so far.
21:07:17 What's the mix? I'd have thought that would work out of the box
21:07:38 But we're about to have paid RHAT support for that config soon, so the issues we're having might go away sooner than later.
21:08:32 In meetings with RHAT they will support OpenShift-on-OpenStack but they seem to feel it's a little more bleeding edge than OpenShift on VMWare or RHV or bare metal
21:09:02 more info as we get it
21:09:22 surprising inversion against their own infrastructure product
21:09:47 I think for me my first resolution is to do a lot more about CI. Again.
21:10:25 oneswig: Does that refer to Continuous Integration?
21:10:26 RE: CI - one of our projects just switched from travis to github actions and they're really happy with it.
21:10:51 This time we are really going to do it :-) julianp: yes - regression testing, builds, all that jazz.
21:11:01 Righto.
21:11:21 trandles: I've heard similar happy stories about GitHub Actions.
21:11:42 Typically we've used whatever's available, got a jack-of-all-trades experience with different CI frameworks
21:12:51 jmlowe: when does your hardware show up?
21:13:13 Looking more and more like beginning of April
21:13:50 Ah, that's a bit of slippage, too bad
21:14:24 Do you know what's causing the delay, jmlowe?
21:14:32 Any specific component?
21:14:41 Between NVIDIA redstone boards and Milan, I think we are looking at about 6 weeks later than expected
21:16:11 Maybe there will be a working pip by then :-) Been having troubles with the new dependency resolver today, never completing.
21:16:38 Curious because the tri-labs are working on a new round of commodity systems. I think we were hoping to start taking deliveries late summer, but no contracts are awarded or anything. I have no idea if "NVIDIA redstone boards and Milan" will matter or not.
21:18:05 HPC clusters, private clouds or somewhere between the two trandles?
21:18:41 Are you getting non DGX A100's (redstone)? Are you planning on AMD Milan chips? If the answer is no then it's not a concern
21:19:21 serious setup, all this
21:19:41 HPC clusters
21:20:15 7 to 8 figure price tags
21:21:16 If Livermore's one of the Tri-labs, I spent some happy weeks cabling HPC clusters with Quadrics interconnect. Such a tidy sub-floor :-)
21:21:23 So LANL/LLNL/SNL purchase common commodity tech HPC clusters. It's something like "We will buy X scalable units and we want an option to by Y more, over the next 3-5 years."
21:21:39 *buy Y more*
21:22:07 the last round was from Penguin Computing
21:22:16 Intel procs, OPA interconnect
21:22:29 NVIDIA GPUs for those who want them
21:22:42 I can't imagine that went too well
21:23:10 Or did you get the world's first solid OPA interconnect?
21:23:11 It went ok. We have ~10000 nodes total across a bunch of discrete clusters
21:23:38 OPA has been ok.
21:23:55 Once we got rid of gobs of bad cables
21:24:08 I'll be fascinated to see how the spin-off company wanting to revive it works out
21:25:51 trandles: have you spoken to Penguin about Ironic for provisioning?
21:26:17 oneswig: nope
21:26:52 I wonder how flexible their Penguin-on-demand service is.
21:29:52 DMC has had success with Linode but our use case is once again specific
21:30:30 Not tried Linode before.
21:30:59 I've never used Penguin on-demand. 10+ years ago when I was in a uni physics department they pitched it HARD to the faculty.
21:32:18 We hosted one of the PoD racks for a few years in our data center
21:33:02 It's a losing business model, people just don't have opex, just capex
21:33:09 and it's nearly impossible to convert
21:34:13 Perhaps another resolution is to get to grips with ceph orchestrator. We've got an Ansible role taking shape to drive it from yaml data, so it fits better into a sane deployment process.
21:38:12 o/
21:39:03 martial: DMC must be a big user of linode, it's made a case study - https://www.linode.com/spotlight/data-machines-brian-dennis/
21:39:52 yes, like I said very specific use case; GPU for one of our ML projects
21:40:21 and that is a burst on top of our Data Center hardware
21:41:32 martial: interesting. How do you share the datasets in this hybrid model?
21:41:59 happy to ask Brian if he can talk about it directly at a follow up meeting if that would be of interest to the group
21:42:35 sounds good to me
21:42:58 will reach out ... maybe the next USA friendly meeting if he is able to
21:43:10 cool
21:44:56 slack-ing him right now
21:45:51 Hello cmart!
21:46:10 howdy. joining from Matrix, appears to be working
21:46:12 We can host a google meet like the presentation Rion did a while back
21:46:20 sounds good
21:48:50 julianp: what's new with you?
21:49:20 Quite a bit. Lots of irons in the fire.
21:50:03 cmart implemented a mechanism to white label Exosphere, and Jetstream has a beta version now.
21:50:18 https://exosphere.jetstream-cloud.org/
21:50:43 white label?
21:51:00 Customizing the theme. Colors, logos, etc.
21:51:21 This file is all that's required to make the Jetstream version: https://exosphere.jetstream-cloud.org/exosphere/config.js
21:51:32 ah, ok. Interesting
21:52:03 No recompilation required. Just works.
21:54:08 And since there are no services required for Exosphere to work (other than some CORS proxies) it's easy to provide a customized version. GitHub pages is sufficient to host it.
21:55:12 I wonder if this could ever work with OAuth2, in theory. Just feel a little uneasy about typing my credentials into somebody else's portal.
21:55:29 we actually need to build that in the next few months, oneswig
21:55:43 ooh, now that would be neat.
21:55:48 the answer is probably yes, Keystone has support for OpenID connect on top of OAuth
21:56:28 if you go to https://iu.jetstream-cloud.org/ you'll see an option for "OpenID Connect", and it will take you to Globus to authenticate yourself
21:56:52 jmlowe:
21:56:58 Very nice.
21:57:04 jmlowe set it up a few weeks ago
21:57:35 the idea is that exosphere can tickle the same Keystone endpoint that Horizon does to make this work
21:58:30 oneswig: What's the eta on your ironic cloud for public good projects?
21:59:29 That's where the Ceph work's going on currently... Not sure on an ETA, but early access friends welcome perhaps in a few weeks
21:59:47 (also small nitpick, it's not "somebody else's portal" anymore, see the subdomain of jetstream-cloud.org :D )
22:00:12 cmart: got it :-) I was mostly thinking of the trial site
22:00:22 ah, we are out of time. Any more comments?
22:01:07 no Sir, thanks everybody :)
22:01:31 #endmeeting
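The remark at 21:57:35 about Exosphere "tickling the same Keystone endpoint that Horizon does" refers to Keystone's standard v3 token API. As a rough, hedged illustration only (this is not from the meeting, not the Jetstream setup, and the URL and credentials are placeholders), the sketch below shows the kind of password-based token request any such client could make; the OpenID Connect flow mentioned by the speakers would instead go through Keystone's federation endpoints and an external identity provider such as Globus.

```python
# Minimal sketch: requesting an unscoped token from Keystone v3, the same
# token API that Horizon (and, per the discussion, Exosphere) talks to.
# All values below are hypothetical placeholders, not Jetstream's.
import requests

KEYSTONE_URL = "https://keystone.example.org:5000"  # placeholder endpoint

payload = {
    "auth": {
        "identity": {
            "methods": ["password"],
            "password": {
                "user": {
                    "name": "demo",                       # placeholder user
                    "domain": {"name": "Default"},
                    "password": "replace-me",             # placeholder secret
                }
            },
        }
    }
}

resp = requests.post(f"{KEYSTONE_URL}/v3/auth/tokens", json=payload, timeout=30)
resp.raise_for_status()
# Keystone returns the issued token in a response header, not the body.
token = resp.headers["X-Subject-Token"]
print("Unscoped token obtained:", token[:12], "...")
```

The point of the exchange is that because Exosphere is a purely client-side application, any OAuth2/OpenID Connect support ultimately reduces to exercising these existing Keystone APIs rather than standing up a new backend service.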