11:00:16 <oneswig> #startmeeting scientific-sig
11:00:17 <openstack> Meeting started Wed Feb 27 11:00:16 2019 UTC and is due to finish in 60 minutes.  The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:00:18 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:00:20 <openstack> The meeting name has been set to 'scientific_sig'
11:00:29 <ildikov> o/
11:00:32 <bogdando> o/
11:00:34 <oneswig> greetings o/
11:00:47 <belmoreira> o/
11:00:53 <mgoddard> \o
11:00:56 <oneswig> #link Agenda for today https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_February_27th_2019
11:01:22 <oneswig> hi ildikov bogdando belmoreira mgoddard
11:01:32 <janders> g'day all
11:01:38 <ildikov> hi :)
11:01:41 <mgoddard> hi
11:02:06 <oneswig> evening janders
11:02:24 <oneswig> I trust everyone has got their summit arrangements in place, for those that are going? :-)
11:02:38 <b1airo> yeah i should really do that
11:02:44 <oneswig> g'day b1airo
11:02:47 <oneswig> #chair b1airo
11:02:48 <openstack> Current chairs: b1airo oneswig
11:02:51 <janders> flights booked - rest in progress
11:02:54 <b1airo> how goes it?
11:02:59 <janders> who's staying around for the PTG?
11:03:20 <oneswig> #topic Denver Summit
11:03:36 <oneswig> I am - but only for 1 day.  Got to fly home for a family occasion at the weekend.
11:03:39 <mgoddard> I'll be there sunday - saturday
11:03:47 <oneswig> mgoddard: thought it was Monday?
11:03:53 <mgoddard> sorry, monday - saturday
11:04:07 <b1airo> unsure yet. probably. we seem to have a spot in the PTG for the SIG
11:04:20 <janders> indeed!
11:04:24 <mgoddard> in fact, travelling monday, summit on tuesday :)
11:04:37 <verdurin> Morning.
11:04:39 <priteau> No summit / PTG for me, but there should be two Blazar core contributors present
11:05:00 <oneswig> We do.  We have SIG sessions in the forum followed by SIG sessions at the PTG.
11:05:22 <oneswig> Different focus but we'll need to concentrate on channelling the discussion in each...
11:05:35 <oneswig> Hi verdurin priteau, morning
11:05:40 <mgoddard> How will those sessions differ?
11:06:15 <janders> great to see a fair bit of interest in the PTG. I propose we dedicate a part of one of the future IRC meetings to discuss the Scientific SIG agenda for the PTG
11:06:37 <oneswig> mgoddard: The forum session is always operator/user facing.  I don't think so many of those will be in the PTG.  Last one I was at we talked a lot with the Nova and Ironic folks who were really great and turned up to our session.
11:06:58 <oneswig> martial led the last PTG and I think you know more about that one than me :-)
11:07:16 <b1airo> seconded janders
11:07:21 <oneswig> janders: thirded
11:07:38 <mgoddard> will be a common theme I think, trying to maintain a useful distinction between these sessions with many of the same people present
11:07:39 <b1airo> did i make you backspace oneswig ?
11:08:01 <ildikov> oneswig: +1, the PTG is a great place for cross-project type discussions
11:08:10 <oneswig> What I saw was a really productive discussion on articulating use case requirements.  We did *cough* not follow up on our side of the ensuing participation admittedly
11:08:32 <oneswig> Should make a resolution to do that better this cycle.
11:08:55 <oneswig> Anyway, it's actually a good opportunity because the forum session will be fresh in our minds.
11:09:33 <oneswig> On the subject of the forum session, we are planning to split it into some lightning talks again, given that seems to work well.
11:09:59 <janders> are any of you attending the Red Hat Summit in addition to the Denver events?
11:10:09 <oneswig> janders: not me, I don't get out much
11:10:38 <oneswig> I'm due to be at a Dell community HPC event at the end of March though...
11:10:39 <b1airo> yeah, we could tee up some time with relevant project teams (e.g. Blazar, Cyborg, ...), but unless we have people who are using or have tried using those projects it is questionable as to how useful it would be. i suppose prep work and careful formatting is the key to making it worthwhile
11:11:14 <b1airo> those Dell events are generally decent
11:11:47 <b1airo> unlikely to do Red Hat, too long away from home
11:12:39 <b1airo> i did enjoy it last year though. they have good hands on stuff, but you really have to get in fast
11:12:51 <oneswig> If the PTG format is similar, we'll go in the early stage cross-project phase, and a lot of leads and cores from the larger projects tend to drop by.
11:13:30 <ildikov> I'm not sure about the format yet as it'll be shorter this time
11:13:44 <janders> speaking of Dell - do you guys have any experience with their PowerEdge C6420 platform?
11:13:54 <oneswig> ildikov: thanks, I've not seen anything on it
11:14:01 <oneswig> janders: yes, sure
11:14:20 <verdurin> janders: we have a bunch of those
11:14:21 <b1airo> yes
11:14:32 <ildikov> oneswig: it's still under scheduling based on the teams' requests
11:14:41 <janders> any stability issues (and specifically MCEs)?
11:14:43 <b1airo> nice dense boxes - good luck finding a location for 5 asset tags...
11:14:49 <oneswig> We've done fun things with configuring BIOS and RAID parameters via Ansible on those
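[A minimal sketch of the kind of out-of-band BIOS configuration oneswig mentions, here via the iDRAC Redfish API rather than the Ansible modules discussed; the resource path and attribute name are illustrative assumptions, not taken from the conversation — check your iDRAC's Redfish schema for the real ones.]

```python
# Sketch: build the body for a BIOS-attribute change on a Dell node via
# the iDRAC Redfish API. Path and attribute names below are illustrative
# assumptions; consult the BMC's Redfish schema before using them.
import json

BIOS_SETTINGS_PATH = "/redfish/v1/Systems/System.Embedded.1/Bios/Settings"

def bios_settings_payload(attributes):
    """Build the JSON body for a PATCH to the BIOS Settings resource."""
    return {"Attributes": dict(attributes)}

payload = bios_settings_payload({"SysProfile": "PerfOptimized"})
body = json.dumps(payload)
# A real run would PATCH `body` to https://<idrac>BIOS_SETTINGS_PATH with
# basic auth and then schedule a configuration job so the change is applied
# at next reboot; the network call is omitted to keep this sketch offline.
```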
11:15:19 <janders> (MCE = machine check exception)
11:15:29 <verdurin> We've seen some instability on (non-OpenStack) compute nodes, largely resolved with firmware updates so far.
11:15:34 <oneswig> Getting memory and PCI issues janders?
11:15:45 <janders> we are having a nightmare run on about 40 of those, seems a faulty batch
11:16:02 <janders> they reset with MCE in the SEL and no indication of what the problem is whatsoever. No kernel panics.
11:16:22 <oneswig> Does it go into the iDRAC event log?
11:16:30 <janders> yes
11:16:49 <janders> but usually MCE is accompanied by a hint what the reason was
11:16:51 <janders> not in this case
11:17:00 <b1airo> we had a big headache with vanilla (i.e. not Dell OEM) Mellanox CX-4s in them, but they wouldn't even boot. went away after updating the Mellanox firmware - had to cold boot 60 of them about 15 times each via iDRAC console to get the initial installs and updates done o_0
11:17:33 <oneswig> b1airo: think I remember that time
11:17:51 <janders> very interesting
11:18:06 <janders> I will check with the guys if they updated the Mellanox firmware - BIOS/iDRAC I'm sure they did
11:18:37 <oneswig> Good luck with that janders, always good to hear your reports from the coal face
11:18:37 <b1airo> iirc it was a PCI training error or something of that nature that kept coming up (before BIOS had even finished initialising)
11:19:00 <b1airo> try taking all peripherals out and see where that lands you
11:19:01 <oneswig> b1airo: seen a few of those on C4130 GPU nodes.
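[For what it's worth on the MCE thread above: when a machine-check does reach the OS, the raw IA32_MCi_STATUS value (as logged by mcelog/dmesg) can be decoded bit by bit. A minimal sketch of the architecturally defined flag bits, following the Intel SDM layout — the example status value is made up for illustration:]

```python
# Sketch: decode the architecturally defined flag bits of an x86
# IA32_MCi_STATUS machine-check register value (Intel SDM layout).
MCE_FLAGS = {
    63: "VAL (valid)",
    62: "OVER (overflow)",
    61: "UC (uncorrected)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}

def decode_mce_status(status):
    """Return (list of set flag names, 16-bit MCA error code)."""
    flags = [name for bit, name in MCE_FLAGS.items() if status & (1 << bit)]
    mca_error_code = status & 0xFFFF  # low 16 bits: MCA error code
    return flags, mca_error_code

# Example (fabricated) fatal status: valid, uncorrected, context corrupt.
flags, code = decode_mce_status(0xBE00000000800400)
```

The model-specific upper bits of the MCA error code are what usually carry the "hint" janders mentions being absent; when only the SEL entry survives a reset, this kind of decode is often all you get.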
11:19:32 <oneswig> Anyway, let's get on?
11:19:34 <janders> sorry to hijack the discussion with an off topic question - let's continue - we can touch on this more later time permitting (and if not I will chat to the team and likely email you with more questions)
11:19:42 <janders> sure! thanks heaps guys
11:19:51 <oneswig> #topic CFP Deadlines
11:19:59 <oneswig> belmoreira: how's preparation going?
11:20:10 <oneswig> Congratulations on your election btw!
11:20:44 <oneswig> #link CERN OpenStack days CFP - closes tomorrow https://openstackdayscern.web.cern.ch/present
11:21:57 <oneswig> Also closing tomorrow is the Open Infra days London - https://www.papercall.io/openinfradaysuk
11:22:36 <oneswig> belmoreira: I've been trying to reach Chiara but she's on holiday until tomorrow
11:23:41 <oneswig> OK, let's get onto edge computing
11:23:46 <bogdando> \o/
11:23:50 <oneswig> #topic Edge computing use cases
11:23:59 <ildikov> :)
11:24:17 <oneswig> Who wants to go first ildikov bogdando?
11:24:26 <bogdando> ildikov: please go for it
11:24:32 <ildikov> as a generic highlight here's the Edge WG wiki: https://wiki.openstack.org/wiki/Edge_Computing_Group
11:24:37 <ildikov> sorry
11:24:41 <ildikov> #link https://wiki.openstack.org/wiki/Edge_Computing_Group
11:25:04 <ildikov> and another wiki for our current use cases:
11:25:07 <ildikov> #link https://wiki.openstack.org/wiki/Edge_Computing_Group/Use_Cases
11:25:44 <ildikov> if you take a look at the second page you can see the first half of the list is Telecom heavy, like 5G and uCPE
11:25:59 <ildikov> and some of the items on the list are a little futuristic, like drones
11:26:18 <oneswig> OpenStack powered drones?
11:26:24 <ildikov> the Edge WG is planning to have Forum sessions as well as meet at the PTG
11:26:41 <ildikov> well, the SW on the drone itself might not be OpenStack
11:27:04 <ildikov> but the small edge site and further layers could as well be OpenStack powered :)
11:27:11 <b1airo> most important question i can think of, is there a community openstack edge cloud that i can have a Node of at home with a couple of raspberry pis and a liberal sprinkling of old usb keys?!
11:27:45 <oneswig> ildikov: the broader the use cases the less in common, how do you draw a line on defining what is edge computing?
11:27:59 <ildikov> we have a bi-weekly call to get further into the details of use cases as well as requirements and serving as a bridge towards the technical community who's doing the implementation work
11:28:39 <ildikov> we published a whitepaper last year to give that definition: https://www.openstack.org/edge-computing/cloud-edge-computing-beyond-the-data-center?lang=en_US
11:28:45 <oneswig> b1airo: Is Owncloud still going?
11:29:15 <b1airo> owncloud and nextcloud (fork) last i looked oneswig
11:29:25 <ildikov> the edge platform angle of OpenStack is StarlingX
11:29:46 <janders> definitely not a typical use case, but we once built an OpenStack-powered Slurm HTC system running on VMs distributed across most continents, just to prove it's possible. Controllers were in central locations, computes all over the place. Worked fine despite latency in order of 100-300ms
11:30:25 <ildikov> it's incorporating OpenStack services, Kubernetes, Ceph, etc to provide an edge platform that can be as small as one node
11:30:53 <ildikov> janders: that's good info
11:31:41 <ildikov> the WG is also working on reference architectures:
11:31:43 <ildikov> #link https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures
11:31:52 <oneswig> ildikov: in the edge model, how is the decentralised service kept coherent?  User management, storage and so on.
11:31:59 <ildikov> where we have the centralized model which of course doesn't work for all use cases
11:32:29 <ildikov> we are looking into enhancing federation in Keystone
11:32:54 <ildikov> StarlingX has a Distributed Cloud service which provides synchronization
11:33:38 <ildikov> so kind of trying to explore multiple ways of providing a solution as there's no one size fits all model here
11:33:48 <b1airo> it's interesting to see that first diagram on the reference architectures wiki page - proximity (in terms of latency) to the end-user/consumer is always how i'd looked at Edge
11:33:59 <janders> are there any facilities there that would help with mapping filesystem-level ownership in a federated context?
11:34:18 <oneswig> ildikov: I guess with the "edge" terminology, there's always assumed to be a center.  It's never a peer-to-peer decentralised model?
11:34:45 <janders> (how would UID:GID from site X map to UID:GID on site Y while the two have some sort of federation going)
11:34:47 <ildikov> janders: that's a good question, we didn't get down to that level of detail at least with the working group
11:35:02 <ildikov> janders: there might be an answer in the StarlingX community
11:35:19 <b1airo> oneswig: i'd always thought the centre could be relative to the user/consumer
11:35:54 <ildikov> oneswig: I would say yes based on what I've heard so far
11:36:43 <b1airo> maybe we need a OpenStack DHT project oneswig ?
11:37:09 <ildikov> I assume that partially comes from the assumption that the things we are currently talking about are running on an operator network at the end of the day
11:37:40 <ildikov> I wonder if there's any use case you have that's not on our wiki/radar yet?
11:37:55 <ildikov> which could also give us new ideas on architecture and requirements
11:37:55 <oneswig> As janders mentions, some kind of federated management of POSIX UID&GID is something that comes up a lot in the scientific space.
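[To make janders' UID:GID question concrete, a toy sketch of one common approach: a per-site ID-mapping table keyed on the shared federated identity, similar in spirit to NFSv4 idmapping or LDAP-backed schemes. The site names, usernames, and ID ranges are all invented for illustration:]

```python
# Toy sketch of per-site POSIX ID mapping in a federation: each site keeps
# a table from a global identity (e.g. a federated username) to its local
# UID/GID pair. All names and numeric ranges here are illustrative.
SITE_MAPS = {
    "site_x": {"alice": (10001, 10001), "bob": (10002, 10002)},
    "site_y": {"alice": (20501, 20500), "bob": (20502, 20500)},
}

def translate(identity, from_site, to_site):
    """Map an owner known on one site to the peer site via the shared
    federated identity; returns None if the identity is unknown."""
    if identity not in SITE_MAPS.get(from_site, {}):
        return None
    return SITE_MAPS.get(to_site, {}).get(identity)

# e.g. a dataset owned by alice (10001:10001) on site_x would surface as
# 20501:20500 when exported to site_y.
```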
11:39:01 <oneswig> I think the edge model follows the "totoro" analogy of having roles for a "big totoro" and a "little totoro"
11:39:02 <ildikov> is there some material or description to look at about that?
11:39:16 <bogdando> I have 5 minutes left before I have to eject - may I chime in with a short statement please?
11:39:24 <ildikov> bogdando: sure
11:39:27 <bogdando> So indeed some of the edge cases depend on enhanced autonomy (de-centralized models) and the multiple-control-planes requirement, involving sophisticated data replication and conflict-resolution techniques. Not limited to keystone federation or glance caching, basically.
11:39:49 <bogdando> I wanted to ask y'all to chime in on that research request, to do our best to fill that gap and ideally come up with a new or adapted causally consistent database or KVS solution for OpenStack in general (and not only), beyond just the StarlingX scope, to cover such cases. And the link I posted is just a draft for a position paper, which only states the challenges but doesn't propose much yet :)
11:40:40 <bogdando> or at least do a better problem statement/positioning job than I attempted ;)
11:41:11 <bogdando> that's it from my side
11:41:45 <oneswig> bogdando: I saw a good presentation yesterday on DynaFed, which is a project from the high-energy physics area for creating multi-site data federations and presenting a more user-friendly interface than (eg) S3
11:42:12 <bogdando> oneswig: thank you for info, that's interesting to look into
11:42:21 <oneswig> I will ask if I can share the slides from yesterday.
11:42:29 <oneswig> Might be interesting.
11:42:31 <bogdando> yes please
11:42:37 <ildikov> +1
11:42:43 <bogdando> thank you for getting my message folks
11:42:45 <bogdando> gtg now
11:42:47 <priteau> bogdando: Are you aware of the research work done by the Discovery project?
11:42:49 <priteau> #link http://beyondtheclouds.github.io/
11:42:58 <bogdando> not really, will gladly accept all links )
11:43:26 <priteau> They've done research on replacing the SQL database by KV databases like Redis
11:43:39 <priteau> They're also using Cockroach DB
11:43:53 <oneswig> Was this the work presented by Thierry Carrez and co at (IIRC) Austin?
11:43:56 <oneswig> Grid5000?
11:44:21 <priteau> This work was led by Adrien Lebre
11:44:35 <priteau> Using Grid'5000 for the experiments, so maybe it was
11:45:20 <priteau> bogdando: If you don't know Adrien I will get you in touch with him :-)
11:46:47 <ildikov> I'm also happy to put further pointers on the Edge WG wiki
11:46:59 <ildikov> to make sure people find all the good and interesting info
11:47:29 <b1airo> ildikov: a use-case that might not be present there could be around applying infrastructure as code to scientific instruments. typically things like microscopes come with vendor supplied or specified workstations that are only good enough for basic data collection... having a cloud in the closet next door with some special connectivity to the instrument stack would open up some interesting possibilities
11:47:52 <ildikov> I'm happy to put up federated management of POSIX UID&GID as a topic for one of the upcoming meetings as well to see if others have some input for that
11:48:36 <ildikov> b1airo: yeah, I have not heard this one yet
11:49:08 <ildikov> b1airo: would be a good candidate to draft on the wiki if you agree
11:49:12 <janders> that would be great!
11:49:33 <b1airo> sure
11:50:11 <ildikov> b1airo: would you take an action point to add some details to the use cases wiki I linked above?
11:50:33 <b1airo> sure
11:50:41 <ildikov> b1airo: thank you :)
11:51:22 <oneswig> What are the changes you are looking to make for federated authentication?
11:52:29 <ildikov> oneswig: we have some activities around x509 authentication to fix things up
11:52:32 <ildikov> #link http://lists.openstack.org/pipermail/edge-computing/2019-January/000520.html
11:53:14 <ildikov> #link https://bugs.launchpad.net/keystone/+bugs?field.tag=x509
11:53:42 <oneswig> Crikey... X509
11:53:46 <ildikov> basically make sure that code path works with the latest features like auto-provisioning
11:54:34 <b1airo> ildikov: heard of SciTokens yet...?
11:54:36 <ildikov> it came out from a discussion around an IDP master with shadow users scenario
11:54:39 <ildikov> #link https://wiki.openstack.org/wiki/Keystone_edge_architectures#Identity_Provider_.28IdP.29_Master_with_shadow_users
11:54:57 <ildikov> b1airo: I personally did not, but I'm also not an expert in this area
11:54:59 <oneswig> Auto-provisioning in this case means service accounts and automated actions on behalf of users?
11:55:54 <oneswig> Does this also relate to limitations using trusts (ie, Heat) for shadow users?
11:56:16 <ildikov> yes and I don't know :)
11:56:49 <oneswig> I guess application credentials also come in here.  It would be important to be able to use them in this scenario, and I am not sure that it works.
11:57:31 <ildikov> that's a separate layer, so we didn't look into that, not directly at least
11:57:35 <oneswig> Ah, we are nearly at time - final comments on this?
11:57:50 <ildikov> would be good to think of next steps
11:58:02 <ildikov> like what relevant challenges this group has that we could help with
11:58:08 <oneswig> ildikov: see you in the forum for that one, I guess
11:58:19 <ildikov> or solutions to use
11:58:30 <oneswig> #topic AOB
11:58:34 <ildikov> oneswig: Forum and PTG would be a good place to catch up!
11:58:39 <oneswig> final comments anyone?
11:58:47 <priteau> Very briefly before we finish
11:58:53 <janders> mgoddard: one of my team members was looking at your ansible UFM work (very cool!)
11:59:01 <ildikov> and we also have weekly calls for those who're interested in this stuff :)
11:59:04 <janders> quick question: can your role do HA-UFM?
11:59:04 <b1airo> i'm keen to talk to anyone who has used Rucio, dCache, or friends to proxy user access to HSM tape systems...?
11:59:13 <priteau> There is some interest from a joint Red Hat - MASS Open Cloud about using Blazar and Ironic: http://lists.openstack.org/pipermail/openstack-discuss/2019-February/002671.html
11:59:49 <priteau> There's some work required in Blazar to make it compatible with Ironic - I wonder if anyone else here would be interested in this use case?
11:59:58 <mgoddard> janders: nice. It doesn't support HA right now. It's a bit weird in that it runs systemd & multiple services in a container
12:00:12 <janders> ok! thank you
12:00:27 <janders> we might just deploy it on a bare metal os - but when we get back to containers and ha - I will get in touch
12:00:35 <janders> thank you all
12:00:38 <mgoddard> janders: picking apart the UFM init scripts into individual containers would have been challenging
12:00:49 <janders> yeah UFM is a complex beast..
12:00:50 <mgoddard> janders: sure, get in touch
12:01:06 <b1airo> priteau: yes, i know several folks who would be interested in reservations for bare-metal nodes. but i thought that's what Chameleon already did...?
12:01:13 <oneswig> priteau: Ironic + Blazar does sound generally useful.
12:01:24 <priteau> b1airo: it does but with patches to Nova, so not easy to replicate
12:01:29 <janders> I'd be interested, too! :)
12:01:32 <priteau> And not easy to maintain
12:01:33 <oneswig> How can we follow up if we form a group of interested people?
12:02:22 <priteau> Be willing to test some early code, or even join in to the coding ;-)
12:02:57 <oneswig> We are over time - priteau janders b1airo can we follow up on this offline?
12:03:17 <priteau> Sure
12:03:18 <janders> sure
12:03:20 <b1airo> no prob
12:03:24 <oneswig> OK, thanks all, until next time
12:03:27 <oneswig> #endmeeting