21:00:23 <oneswig> #startmeeting scientific-sig
21:00:24 <openstack> Meeting started Tue Apr 28 21:00:23 2020 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:25 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:27 <openstack> The meeting name has been set to 'scientific_sig'
21:00:30 <oneswig> ship ahoy!
21:00:42 <oneswig> I had an agenda here somewhere.
21:00:44 <jmlowe> hey everybody
21:00:52 <oneswig> Hi jmlowe
21:00:57 <oneswig> #link agenda for today https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_April_28th_2020
21:01:15 <oneswig> Who got in on time?
21:01:49 <oneswig> jmlowe: how's things with you?
21:01:54 <martial> on time?
21:02:00 <oneswig> hey martial
21:02:03 <oneswig> #chair martial
21:02:04 <openstack> Current chairs: martial oneswig
21:02:38 <jmlowe> surviving, life is much the same, just more people underfoot
21:03:14 <oneswig> sounds familiar
21:03:34 <oneswig> Had a peach of an issue with CephFS recently, thought you might be piqued.
21:03:50 <oneswig> All the files (~6TB) disappeared.
21:04:05 <oneswig> The root dir was empty
21:04:10 <jmlowe> oh
21:04:21 <oneswig> oops :-)
21:04:29 <martial> ouch
21:04:31 <jmlowe> I'm planning on some very heavy cephfs usage in the next 6-12 months
21:04:40 <fungi> did you manage to recover the data, eventually?
21:04:55 <oneswig> After a few days of getting not very far with fsck, I found I could access directories two levels down...
21:05:07 <oneswig> i.e. / was empty, but /home/stig had stuff in it.
21:05:13 <oneswig> hmm.
21:05:41 <oneswig> I made a new directory in / and the MDS had a hiccup and returned the missing dirs.
21:05:58 <jmlowe> weird, version?
21:06:08 <oneswig> In a parallel universe, it would have deleted all the unexpected inodes for me instead ...
21:06:13 <oneswig> This was Mimic.
21:06:34 <oneswig> We upgraded to Nautilus over the weekend with the Ceph-Ansible rolling upgrade. A good experience.
21:07:46 <oneswig> We aren't quite sure how it got into that state, but it has spent a good deal of time recently with intermittent connectivity and multiple MDSes
21:08:28 <oneswig> Anyway, we should get the show on the road.
21:08:37 <oneswig> #topic Open Infra Labs
21:08:44 <oneswig> msdisme: are you there?
21:08:49 <msdisme> here!
21:09:00 <oneswig> Hello and thanks for joining.
21:09:13 <oneswig> What's the story?
21:09:15 <msdisme> Hi - thanks for inviting me
21:09:34 <jmlowe> I'm on Nautilus and looking hard at Octopus
21:09:37 <msdisme> So apologies in advance for the potential wall of text - I dropped some stuff in a paste buffer :-)
21:10:09 <msdisme> OpenInfra Labs (OI Labs) is a new effort established in partnership with the OpenStack Foundation and the Mass Open Cloud. OI Labs was created to expand the existing community and to simplify and standardize how different institutions deploy and operate open source cloud infrastructure and cloud-native software. Initially, OI Labs will prioritize the needs of the MOC and NERC environments. Longer term
21:10:09 <msdisme> the goal is to see more organizations globally (especially in the academic and research space) stand up multiple consistent clouds that can enable hybrid and federated use cases. If you are building or operating open source based clouds and would like to help standardize the process for creating them, we invite you to get involved and participate in OpenInfra Labs today!
(links later I promise :-0)
21:10:45 <oneswig> What's NERC?
21:10:48 <msdisme> Some history: in 2012 Boston University, Harvard University, MIT, Northeastern University, and the University of Massachusetts, in partnership with the state of Massachusetts, Dell EMC, and Cisco, completed the Massachusetts Green High Performance Computing Center (https://www.mghpcc.org/) in Holyoke, Massachusetts.
21:10:48 <msdisme> The MGHPCC is a LEED Platinum Certified 30 megawatt data center with high speed fiber connections. 90% of the energy used to run the data center is from carbon free resources.
21:11:07 <msdisme> MOC: In 2014, PIs Orran Krieger and Peter Desnoyers created the Mass Open Cloud (MOC), a project to explore an alternative model to the traditional public cloud. We call it an Open Cloud eXchange (OCX) - where many stakeholders can participate in implementing and operating a shared cloud. The project, initially funded in 2015 by a $3M seed grant from the Commonwealth of Massachusetts, along with
21:11:07 <msdisme> investments by industry partners, has demonstrated that it is viable to create an alternative to today’s public clouds that is more economical for academic institutions, that offers a rich set of open source services, and that enables broad industry and research participation.
21:11:07 <msdisme> One thing that became clear is that standing up even a small scale commercial cloud is HARD. Running it is HARD. Discussions with our partners (Intel, Red Hat and Two Sigma) led to discussions with the OpenStack Foundation, which led to OpenInfra Labs.
21:11:08 <msdisme> We think of the constellation of projects around the MOC and OI Labs as an Open Cloud Initiative, and it consists of:
21:11:22 <msdisme> The MOC (massopen.cloud)
21:11:22 <msdisme> Openinfralabs.org and an associated GitLab repository where we are gathering sample code as well as epics and user stories around the first iteration of OI Labs to support the MOC and the New England Research Cloud (a partnership between BU, Harvard and the State of Massachusetts) to build a cloud based on the OI Labs scripts.
21:11:22 <msdisme> The NSF “Open Cloud Testbed” (OCT) project will build and support a testbed for research and experimentation into new cloud platforms – the underlying software that provides cloud services to applications. Testbeds such as OCT are critical for enabling research into new cloud technologies – research that requires experiments that potentially change the operation of the cloud itself. The OCT will
21:11:22 <msdisme> combine proven software technologies from both the CloudLab and the Mass Open Cloud projects and make FPGAs widely available to systems researchers, building on Intel’s generous donation of FPGA boards currently available on the MOC.
21:11:24 <msdisme> And the Operate First Initiative, an open effort with the MOC, OI Labs, Red Hat (and hopefully you!) to open source cloud operations at scale. One example is that the work being done to enable Operate First is occurring in the Open Infrastructure Labs git repository located at https://gitlab.com/open-infrastructure-labs.
21:11:26 <jmlowe> My major challenge is trying to figure out a sane way to do full bandwidth with native cephfs, I think I need to push the virtual routers into the switching gear with something like opendaylight or try to get everybody to add a second routed provider network at instance creation
21:11:28 <msdisme> Whew, sorry
21:13:32 <msdisme> The NERC (New England Research Cloud) is a partnership between Boston University and Harvard to create a research cloud with research facilitators and
21:13:45 <jmlowe> This sounds like it might fill the gap left when Intel closed everything down
21:14:05 <msdisme> a support staff - the MOC is a best effort sort of deal and the NERC will have somewhat more meaningful SLAs
21:14:10 <jmlowe> specifically the OCT
21:14:57 <oneswig> Is OCT for hardware testing or software testing or something else?
21:15:53 <jmlowe> oneswig: The view from Bloomington today https://photos.app.goo.gl/e2GuBJw2VfSEQX2u5
21:15:54 <msdisme> it's based on https://cloudlab.us/ - meant to allow reproducibility
21:16:41 <msdisme> so you may get bare metal access and run your hw on it.
21:17:30 <oneswig> That's cool. How is CloudLab implemented? Is it open source itself?
21:17:31 <msdisme> and because CloudLab usage goes to 100% around paper deadlines we are looking at sharing hardware from the MOC and OCT depending on loads
21:18:06 <msdisme> It is - looking for a link off their page
21:18:34 <martial> CloudLab looks interesting too; they have FPGAs too it looks like
21:18:48 <msdisme> the work we did on MOC for hardware sharing is moving into Ironic and Ansible as ESI (Elastic Secure Infrastructure)
21:19:21 <jmlowe> The PI for CloudLab, Robert Ricci, sits on our advisory board
21:20:13 <fungi> #link https://gitlab.flux.utah.edu/emulab
21:20:14 <oneswig> What are the components of ESI? Sounds like it may have a good deal of alignment with Kayobe (also Ironic + Ansible)
21:20:14 <jmlowe> fun fact, the ARM test harness for Ceph uses CloudLab
21:20:54 <fungi> that's what they link to where it says "cloudlab is open source" on the main docs page anyway
21:20:56 <oneswig> msdisme: how does CloudLab compare to the NSF Chameleon project?
21:21:07 <oneswig> thanks fungi
21:21:14 <jmlowe> Competing projects from the same directorate
21:21:34 <jmlowe> similar to the relationship between Jetstream and Bridges
21:22:09 <msdisme> I'm probably not qualified to compare them, but the PIs are generally pretty connected - (link hunting)
21:22:20 <fungi> #link https://www.tacc.utexas.edu/systems/chameleon
21:22:34 <fungi> (for reference)
21:22:56 <fungi> saw some folks on stage at ocw mention chameleon and cloudlab in the same breath, anyway ;)
21:23:04 <oneswig> also https://www.chameleoncloud.org/
21:23:33 <msdisme> we hosted Kate in 2018: https://www.bu.edu/rhcollab/2018/10/30/colloquium-chameleon-new-capabilities-for-experimental-computer-science/
21:23:35 <fungi> ahh, yep, that's a better link
21:23:51 <jmlowe> Kate Keahey, the PI for Chameleon, also sits on our advisory board
21:24:16 <oneswig> jmlowe: seems like you are exceedingly well advised :-)
21:24:36 <fungi> oh, right, the renci folks are involved in chameleon too, basically in my old back yard
21:24:52 <jmlowe> The NSF strongly suggested we get an advisory board, we did our best
21:25:02 <msdisme> we have some great PIs
21:25:03 <oneswig> I think Chameleon has an interesting take on network isolation, including switches that support "network-slicing" OpenFlow. How does CloudLab approach multi-tenant network isolation?
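(Editor's note: a minimal sketch of the "second routed provider network" approach jmlowe describes at 21:11:26, using the stock OpenStack CLI; the network name, physnet label, VLAN segment and address range are illustrative assumptions, not details from the meeting.)

    # Operator creates a shared routed provider network carrying the Ceph public network
    openstack network create ceph-frontend --share \
      --provider-network-type vlan \
      --provider-physical-network physnet1 \
      --provider-segment 200
    openstack subnet create ceph-frontend-v4 \
      --network ceph-frontend --subnet-range 10.20.0.0/24

    # Users then attach it as a second NIC at instance creation
    openstack server create my-instance \
      --image CentOS-8 --flavor m1.large \
      --network tenant-net --network ceph-frontend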
21:25:40 <msdisme> Jim - sorry, I feel I should know, but what are the projects you are tied to?
21:26:14 <jmlowe> I keep Jetstream running
21:26:24 <jmlowe> https://jetstream-cloud.org
21:26:30 <msdisme> also, re. earlier ESI question - if you drop me a note with "ESI info" in the subject line I'll figure out a way to connect folks for a discussion
21:26:42 <msdisme> Ahh - cool!
21:27:15 <oneswig> I think rbudden is lurking, he's ex Bridges, as mentioned earlier
21:27:16 <fungi> i recall larsks et al discussing the network isolation and control model for the esi proposal at ocw in a breakout one afternoon
21:27:20 <jmlowe> I'm John Michael, odds are you probably know me as Mike
21:28:05 <rbudden> @oneswig: ah yes, on another meeting, lurking, apologies
21:28:19 <oneswig> fungi: we usually approach this using networking-generic-switch but that can limit your choice of switches.
21:28:38 <oneswig> rbudden: :-)
21:28:49 <msdisme> in retrospect I should have grabbed the PIs for this - I'll grab an answer to the Chameleon question and NW isolation question above
21:29:29 <oneswig> msdisme: so MOC has been going for a good number of years now, when did OIL start?
21:30:26 <msdisme> one area we want to look at a lot this summer is how to connect OpenStack and OpenShift/Kube into the HPC billing systems - we know NSF is funding some work there and would love to make it part of OI Labs
21:30:38 <martial> the rbudden? :)
21:31:17 <oneswig> msdisme: that sounds like XSEDE, if I've got my names right. In Europe we have something similar (cASO)
21:32:11 <deardooley> @msdisme are you thinking of a second tier charge for the orchestration service on top of the existing VM charge?
21:32:19 <msdisme> ideally we'd want to tie to that eventually too - one of the real goals is federated clouds
21:33:28 <fungi> oneswig: oilabs more or less formally started with announcements at ocw this year
21:33:31 <fungi> #link https://massopen.cloud/events/2020-open-cloud-workshop
21:33:40 <fungi> though there were discussions leading up to that
21:33:46 <msdisme> For MOC, figuring out a payment model has been a nightmare because the silos within openstack/kube think of things differently, so in general it's a "free" userbase. NERC will need to be much more tied to costs
21:34:30 <msdisme> (which was me ducking a question :-) )
21:34:52 <oneswig> MOC has had a few years to figure this out though, right? :-)
21:35:04 <deardooley> there's a whole community of startups addressing that right now. It winds up becoming an issue of metric granularity in k8s/openshift.
21:35:35 <oneswig> I think it's an interesting idea that you could get your VMs and your Cinder volumes from different vendors in the same cloud, and pay separate bills for each
21:36:03 <deardooley> I'm interested in understanding who pays for the "dead" time when the system is up, but no workloads are running.
21:36:48 <oneswig> Who can use OIL resources? I assume it isn't universal.
21:36:53 <msdisme> we did, we spent a ton of time on various models and various ways to capture usage - like Edison, we learned a tremendous number of ways to NOT do it.
21:37:47 <deardooley> k8s has their storageclass primitive to address that. does cinder have a concept of block storage QoS or multiple backends?
21:38:15 <oneswig> deardooley: yes, it does support those
21:38:32 <msdisme> I think of OIL as a place to create opinionated scripts for setting up and running (monitoring) a cloud so that day 3, 5, 100 are simpler.
21:38:47 <deardooley> link?
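(Editor's note: a minimal sketch of the Cinder multi-backend and QoS support confirmed above; the backend names, pools and limits are illustrative assumptions, not the MOC or Jetstream configuration. rbudden posts the authoritative documentation link just below.)

    # cinder.conf - one cinder-volume service exposing two Ceph RBD backends
    [DEFAULT]
    enabled_backends = ceph-ssd,ceph-hdd

    [ceph-ssd]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    volume_backend_name = ceph-ssd
    rbd_pool = volumes-ssd

    [ceph-hdd]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    volume_backend_name = ceph-hdd
    rbd_pool = volumes-hdd

    # Map each backend to a volume type, and attach a front-end QoS limit to one of them
    openstack volume type create ssd
    openstack volume type set --property volume_backend_name=ceph-ssd ssd
    openstack volume type create hdd
    openstack volume type set --property volume_backend_name=ceph-hdd hdd
    openstack volume qos create --consumer front-end --property total_iops_sec=5000 ssd-limit
    openstack volume qos associate ssd-limit ssd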
21:38:49 <msdisme> and the goal is for them to be available to any/everyone
21:39:31 <oneswig> msdisme: how is the cloud infra deployed and what form do those scripts take today?
21:39:32 <msdisme> openinfralabs.org (currently a pretty static page with links to the mailing list and IRC)
21:40:05 <msdisme> https://gitlab.com/open-infrastructure-labs is a currently very sparse repository for dropping off samples
21:40:15 <deardooley> lol. ok, didn't know if that was the same acronym, or if OI referenced a resource description language of some kind.
21:40:38 <rbudden> @deardooley we deploy Cinder with multibackend
21:40:39 <msdisme> e.g. a group at RH is dropping off their monitoring scripts
21:41:16 <msdisme> MOC will be contributing our Adjutant-based onboarding and a microservice to onboard for Kube/OpenShift
21:41:28 <jmlowe> You still can't quota individual cinder backends, can you?
21:41:52 <rbudden> https://docs.openstack.org/cinder/latest/admin/blockstorage-multi-backend.html
21:41:53 <oneswig> msdisme: is that the "service assurance framework" or something else?
21:42:02 <msdisme> Operate First is doing their work in the same repository and will begin holding meetings via IRC in the next few weeks
21:42:05 <rbudden> @jmlowe not that I am aware of
21:42:14 <jmlowe> that's unfortunate
21:42:18 <deardooley> I'd be more interested in how you'd enforce QoS guarantees on the different backends.
21:42:32 <fungi> msdisme: oh, wow, this is the first time i've seen anyone mention they're using adjutant... that's awesome!
21:43:39 <msdisme> it simplified SO much - Knikolla is working with them around a future wishlist
21:43:56 <oneswig> Not heard of that, is it like COmanage?
21:44:16 <msdisme> This page includes links to most of the projects: https://massopen.cloud/connected-initiatives
21:44:42 <fungi> #link https://docs.openstack.org/adjutant/
21:45:03 <fungi> oneswig: basically a business logic implementation framework service for openstack
21:45:13 <oneswig> wow
21:45:40 <knikolla> fungi: i think besides catalyst cloud, we're the only ones.
21:46:14 <fungi> yeah, they were the ones to start it
21:46:24 <fungi> but i'm happy to see some uptake
21:46:53 <msdisme> right now we use it for onboarding openstack users. As of next week we should have onboarding for openshift users on Intel and Power, and we're working with an intern on adding support for quota requests
21:47:31 <oneswig> msdisme: so is this project looking for others to get involved?
21:48:48 <msdisme> Right now presentation is via Horizon, will likely plug into the Open OnDemand front end for Research Computing (no idea what that will take yet, but it seems logical)
21:49:21 <msdisme> we absolutely are looking for others to get involved!
21:50:15 <msdisme> We are gathering epics and user stories (still very sparse) at
21:50:21 <fungi> #link https://openinfralabs.org/#get-involved
21:50:52 <msdisme> https://gitlab.com/open-infrastructure-labs/nerc-architecture/-/boards
21:51:14 <oneswig> We've got stewardship of 144 compute blades plus a small control plane, with the brief of doing good and open community things with them. Perhaps this is an option.
21:51:28 <msdisme> the ESI team is also pretty small
21:52:33 <fungi> opendev has also been talking to knikolla about maybe running some test workloads in there, i know mordred has gotten some credentials to try out at least
21:52:41 <msdisme> cool !
So part of the next month is going to be requirements gathering around NERC - turns out some of the MOC corporate partners' consulting groups have time and we want to use their expertise
21:53:19 <oneswig> fungi: that was another option, seems appealing
21:54:08 <msdisme> our expectation is that once NERC is running many of our research users will move there and we expect to have a lot of CI/CD opensource loads as well as (we hope) interesting hardware variants (e.g. we have Power 9 and Intel, hoping to add some ARM and the like)
21:54:38 <msdisme> re. ESI - team small and would love some folks to get involved.
21:54:52 <martial> it does sound very useful
21:55:11 <fungi> small team but full of wicked smart folk
21:55:20 <oneswig> Coincidentally I recently met up with a group running a Kolla-Ansible cloud with x86, ARM and POWER9 intermixed.
21:55:37 <msdisme> The Operate First model has a good video/slide presentation here from the Open Cloud Workshop: Welcome and Overview of Mass Open Cloud, Open Cloud Testbed, New England Research Cloud (NERC), OpenInfra Labs and Operate First – slides – video
21:55:41 <msdisme> oops, lost links:
21:55:47 <fungi> oneswig: that's a remarkably strange mix of architectures
21:56:01 <oneswig> fungi: necessity is the mother of invention
21:56:02 <msdisme> video: https://www.youtube.com/watch?v=gLWsORV4y9Y&feature=youtu.be
21:56:12 <fungi> oneswig: unless you're running a research cloud i guess (or a processor zoo)
21:56:13 <msdisme> slides: https://massopen.cloud/wp-content/uploads/2020/04/OpenCloudWorkshop2020-kickoff-1.pdf
21:57:02 <oneswig> We've kind of overrun the agenda...
21:57:32 <oneswig> Is that Orran Krieger speaking in the video?
21:57:38 <msdisme> Operate First is here: https://massopen.cloud/wp-content/uploads/2020/04/Autonomous-Open-Hybrid-Cloud-and-the-MOC-1.pdf
21:57:48 <msdisme> yep
21:58:34 <msdisme> yep re. Orran
21:58:48 <oneswig> I met him at OpenStack Tokyo. A memorable guy! :-)
21:59:19 <oneswig> What is the connection between OIL and the OSF?
21:59:24 <msdisme> actually it's a group of speakers - I think Peter Desnoyers is there
22:00:15 <oneswig> Ah, we are out of time - final closing comments please
22:00:17 <msdisme> It's a project hosted by OI Labs -
22:00:58 <msdisme> thanks everyone - very excited to see that the work we are doing resonates. I will go through and look for questions I missed and figure out how to answer them :-)
22:01:18 <fungi> osf is helping to represent oilabs, and provide some guidance on community formation et cetera
22:01:19 <martial> thanks :)
22:01:20 <oneswig> Thanks for coming along, a very stimulating discussion
22:01:43 <oneswig> #endmeeting