21:00:44 <oneswig> #startmeeting scientific-wg
21:00:46 <openstack> Meeting started Tue Sep 19 21:00:44 2017 UTC and is due to finish in 60 minutes.  The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:47 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:49 <openstack> The meeting name has been set to 'scientific_wg'
21:00:57 <oneswig> #chair martial
21:00:58 <openstack> Current chairs: martial oneswig
21:01:14 <oneswig> #link Agenda for today https://wiki.openstack.org/wiki/Scientific_working_group#IRC_Meeting_September_19th_2017
21:01:26 <rajulk> hi martial
21:01:30 <martial> Hi Stig
21:01:42 <oneswig> Hello martial rajulk, good evening
21:01:48 <oneswig> good afternoon, etc.
21:01:56 <rajulk> hi oneswig
21:02:02 <rajulk> good evening
21:02:06 <martial> I see Mr Kumar is already with us, wonderful :)
21:02:07 <priteau> Hello
21:02:22 <oneswig> Hey priteau, how are you?
21:02:24 <rajulk> :)
21:02:34 <hogepodge> hi
21:02:42 <simon-AS5591> Hello all
21:02:58 <oneswig> #chair b1airo
21:02:59 <openstack> Current chairs: b1airo martial oneswig
21:03:04 <oneswig> morning Blair
21:03:11 <b1airo> Morning
21:03:27 <oneswig> OK, let's roll?
21:03:43 <oneswig> #topic Scientific Hackathon
21:03:59 <b1airo> Yep, sounds good. I'm just feeding the animals here so will keep one eye on this.
21:04:07 <oneswig> Flanders mentioned the scientific hackathon is looking for mentors
21:04:17 <oneswig> #link http://hackathon.openstack.org.au/mentors/
21:04:33 <oneswig> What else would you do in a dump like Sydney while the jetlag wears off :-)
21:05:05 <rbudden> hello
21:05:21 <oneswig> Not that we are a mercenary bunch, I did see mention of summit tickets in return...
21:05:26 <oneswig> Hi rbudden
21:06:03 <oneswig> The hackathon starts Friday afternoon before the summit if anyone is around and interested in taking part
21:06:23 <martial> b1airo oneswig: I apologize, I need to sign off for a few minutes, will be back as soon as I can
21:06:29 <oneswig> #item OpenStack London
21:06:32 <oneswig> martial: np
21:06:41 <oneswig> Is next Tuesday!
21:06:55 <oneswig> We have a WG session for those attending.
21:07:07 <oneswig> Probably not a big win in this time zone...
21:07:57 <b1airo> Great to hear oneswig
21:07:59 <oneswig> I wanted to mention though, we are looking for lightning talks for the BoF, and in customary fashion there is now a small prize for best talk
21:08:08 <b1airo> Hopefully drum up some new members
21:08:40 <oneswig> I think so too.  Many of the presentations on the main schedule are scientifically oriented.
21:08:56 <oneswig> Am hoping to meet some new faces.
21:09:03 <oneswig> (and some old ones, of course)!
21:09:12 <priteau> oneswig: Do you know if talks will be recorded?
21:09:27 <oneswig> They were last year.  They usually are.
21:10:14 <oneswig> OK, that's all to add there I think.
21:10:39 <oneswig> #topic Opportunistic utilisation for HTC/HPC
21:11:00 <oneswig> We have Rajul Kumar with us today from Northeastern Uni in Boston MA
21:11:26 <oneswig> #link Rajul's presentation made in Boston https://www.openstack.org/videos/boston-2017/hpchtc-and-cloud-making-them-work-together-efficiently
21:11:43 <rajulk> hello everyone
21:11:49 <oneswig> #link Slides for the presentation are here: https://drive.google.com/file/d/0B6_MvTMovwvFcVppSy1xVmFVbDQ/view?usp=sharing
21:11:57 <oneswig> Hi rajulk, thanks for coming along
21:12:25 <rajulk> oneswig: thanks for inviting me
21:12:48 <oneswig> Thank martial, but he's missing the fun :-)
21:12:59 <oneswig> Can you describe the context for your work?
21:13:06 <oneswig> Is this something used in MOC?
21:13:11 <rajulk> sure
21:13:35 <rajulk> well, we are still working on it and it is not ready for use yet
21:13:56 <rajulk> but hopefully will soon be in shape
21:14:51 <rajulk> So the idea is to use the idle/underutilized resources in the OpenStack cloud to offer an HTC service
21:15:25 <rajulk> This will be a Slurm cluster comprising virtual machines on OpenStack
21:15:55 <oneswig> Is it a separate partition in slurm, or a separate instance of slurm altogether?
21:16:09 <rajulk> these VMs will be added to/removed from the cluster based on resource availability, driven by the resource utilization of the OpenStack cluster
21:16:54 <rajulk> It's a separate instance of Slurm - one that provides the HTC service on OpenStack
21:17:05 <b1airo> How is the OpenStack resource utilisation monitored?
21:17:43 <rajulk> we will be using Watcher, which will pull metrics from Monasca
21:18:15 <rajulk> and watcher will drive the VM provisioning
21:18:19 <oneswig> rajulk: you don't have that piece in place yet, or is it currently getting developed?
21:19:16 <rajulk> we have developed an initial Watcher strategy that does the job
21:19:22 <rajulk> but it is yet to be tested
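(A later answer in the log clarifies that the trigger is cluster-wide CPU and memory utilization over a period of time. Below is a minimal sketch of what such a check against the Monasca statistics API might look like; the endpoint, token, threshold, and window are all hypothetical, and this is not the actual Watcher strategy described in the meeting.)

```python
# Hypothetical sketch: query cluster-wide CPU utilization from Monasca
# and decide whether the HTC workers should grow or shrink.
import datetime

import requests

MONASCA_URL = "http://monasca.example.com:8070"  # hypothetical endpoint
TOKEN = "gAAAA..."                               # a valid Keystone token
CPU_THRESHOLD = 80.0                             # percent, hypothetical
WINDOW_MINUTES = 15                              # look-back window, hypothetical


def cluster_cpu_average():
    """Average cpu.percent across all hosts over the look-back window."""
    start = (datetime.datetime.utcnow()
             - datetime.timedelta(minutes=WINDOW_MINUTES)).isoformat() + "Z"
    resp = requests.get(
        f"{MONASCA_URL}/v2.0/metrics/statistics",
        headers={"X-Auth-Token": TOKEN},
        params={
            "name": "cpu.percent",
            "statistics": "avg",
            "start_time": start,
            "period": WINDOW_MINUTES * 60,
            "merge_metrics": "true",  # fold per-host series into one
        },
    )
    resp.raise_for_status()
    elements = resp.json()["elements"]
    # Each element carries rows of [timestamp, avg] in its "statistics" list.
    values = [row[1] for elem in elements for row in elem["statistics"]]
    return sum(values) / len(values) if values else 0.0


if cluster_cpu_average() > CPU_THRESHOLD:
    print("Cloud is busy: suspend HTC worker VMs")
else:
    print("Cloud has headroom: resume or add HTC worker VMs")
```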
21:20:47 <priteau> Are the Slurm instances pre-emptible?
21:21:02 <rajulk> yes
21:21:10 <rajulk> so the way it works is
21:21:32 <b1airo> rajulk: so do you rely on Nova scheduling or force
21:21:50 <rajulk> whenever the utilization goes beyond a certain threshold, the job and node are suspended on Slurm, and then the VM on OpenStack
21:22:08 <rajulk> we use nova scheduling
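(A minimal sketch of that two-level suspend sequence, assuming openstacksdk, local scontrol access, and a clouds.yaml entry named "moc"; the job, node, and server names are hypothetical.)

```python
# Hypothetical sketch of the two-level suspend described above:
# first quiesce the job and node in Slurm, then suspend the VM in Nova.
import subprocess

import openstack

conn = openstack.connect(cloud="moc")  # assumes a clouds.yaml entry "moc"


def preempt_worker(job_id, node_name, server_name):
    # 1. Suspend the running job so it can be resumed later.
    subprocess.run(["scontrol", "suspend", str(job_id)], check=True)
    # 2. Drain the node so Slurm schedules nothing else onto it.
    subprocess.run(
        ["scontrol", "update", f"NodeName={node_name}",
         "State=DRAIN", "Reason=preempted-for-cloud-load"],
        check=True,
    )
    # 3. Suspend (not delete) the backing VM to free hypervisor resources.
    server = conn.compute.find_server(server_name)
    conn.compute.suspend_server(server)


preempt_worker(1234, "htc-node-01", "htc-node-01")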
21:22:32 <oneswig> What do you do about re-creating the hpc cluster environment in OpenStack?  What about filesystems etc?
21:23:46 <rajulk> we have used Salt as the config management tool that drives the provisioning and configuration of a new node for Slurm
21:23:54 <rajulk> the controller remains static
21:25:02 <rajulk> We'll be using an NFS file system mounted on top of swift storage
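(For the growth direction, a sketch of what provisioning a new worker might look like with openstacksdk, leaving Salt to configure the node once it boots. The image, flavor, network, Salt master address, and bootstrap script are all hypothetical, not taken from the MOC setup.)

```python
# Hypothetical sketch: boot a new worker VM and enrol it as a Salt minion;
# a Salt highstate would then install slurmd and register the node.
import base64

import openstack

conn = openstack.connect(cloud="moc")  # hypothetical clouds.yaml entry

USER_DATA = """#!/bin/bash
curl -L https://bootstrap.saltstack.com | sh -s -- -A salt-master.example.com
"""

server = conn.compute.create_server(
    name="htc-node-02",
    image_id=conn.compute.find_image("centos-7").id,
    flavor_id=conn.compute.find_flavor("m1.large").id,
    networks=[{"uuid": conn.network.find_network("htc-net").id}],
    user_data=base64.b64encode(USER_DATA.encode()).decode(),  # Nova expects base64
)
server = conn.compute.wait_for_server(server)
print(f"{server.name} active at {server.access_ipv4}")
```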
21:26:20 <oneswig> sounds interesting.
21:26:41 <oneswig> It's an internal filesystem to OpenStack then?
21:27:06 <rajulk> yes. initially that's the plan
21:27:18 <rajulk> however we may extend it as required down the line
21:27:26 <b1airo> Have you looked at any Keystone integration, or would this service be separate from OpenStack from the user's perspective?
21:28:28 <rajulk> So at MOC we have developed a UI that deals with these scientific tools
21:28:35 <rajulk> that is integrated with keystone
21:28:57 <rajulk> and whenever a user comes on to use Slurm, he has to use it via that UI
21:29:29 <rajulk> as of now, there will be no direct interaction between slurm and a user
21:30:22 <b1airo> Ah ok, so you have a science gateway of some kind
21:30:33 <rajulk> exactly
21:31:12 <oneswig> If you ever get to the point of allowing users to login, I've previously used this to have them authenticated against keystone in PAM: https://github.com/stackhpc/pam-keystone
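(The idea behind a module like that is to validate the username/password pair against Keystone. A minimal sketch of the equivalent check with keystoneauth1; the auth URL and domain are placeholders, and this is not the pam-keystone code itself.)

```python
# Minimal sketch of validating credentials against Keystone, roughly what
# a PAM module such as pam-keystone does under the hood.
from keystoneauth1 import session
from keystoneauth1.exceptions import http as ks_http
from keystoneauth1.identity import v3


def keystone_check(username, password):
    auth = v3.Password(
        auth_url="https://keystone.example.com:5000/v3",  # placeholder
        username=username,
        password=password,
        user_domain_name="Default",                       # placeholder
        # No project scope is needed just to verify the credentials.
        unscoped=True,
    )
    try:
        session.Session(auth=auth).get_token()
        return True
    except ks_http.Unauthorized:
        return False


print(keystone_check("alice", "s3cret"))
```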
21:31:20 <rajulk> it's still in development phase though
21:31:25 <martial> (back, sorry)
21:31:41 <b1airo> Ok, so presumably the OpenStack SLURM is just one computational resource behind that gateway, which would also include traditional HPC systems?
21:31:45 <oneswig> welcome back martial :-)
21:31:47 <martial> (will check the logs for the presentation)
21:32:25 <b1airo> rajulk: what gateway software do you use there?
21:32:29 <rajulk> As of now yes, slurm is running behind the gateway for the user
21:33:32 <priteau> rajulk: when you say utilization goes beyond a threshold, is that looking at per-node utilization (CPU, mem, etc.) or number of instances across the whole cluster?
21:33:38 <rajulk> there is a different team working on that, but as far as I know it's built on top of Troposphere and Atmosphere from CyVerse
21:35:04 <rajulk> priteau: it's the utilization across the cluster on CPU and memory for a certain period of time
21:35:29 <priteau> And how do you decide which VM to kill?
21:36:16 <b1airo> Interesting possibilities to look for idle VMs in SLURM there...
21:36:42 <rajulk> priteau: well, there's no logic behind that yet. We just pick the first one in the list, and we don't kill a VM but just suspend it
21:37:05 <oneswig> rajulk: your presentation mentions multi-node jobs are future work, what's stopping that?
21:39:24 <rajulk> oneswig: theoretically nothing. We haven't tried that yet. The only initial concern was whether we should stop all the nodes in the cluster or just a portion of them
21:39:42 <rajulk> and if a portion, how to figure that out and work around it
21:40:11 <rajulk> so we began with the simpler case of single-node jobs
21:40:31 <oneswig> makes sense
21:42:20 <oneswig> OK - did we have any final questions for Rajul?
21:42:45 <b1airo> Just a thanks for joining!
21:42:58 <oneswig> seconded - thanks again Rajul
21:43:08 <armstrong> I just joined, so I couldn't see his presentation
21:43:24 <armstrong> any link to that please?
21:43:34 <oneswig> armstrong: https://drive.google.com/file/d/0B6_MvTMovwvFcVppSy1xVmFVbDQ/view?usp=sharing
21:43:37 <b1airo> I'm still wanting a generic OpenStack level preemptible instance, but we could use this approach for a lot of stuff too
21:43:49 <armstrong> Ok thanks
21:43:58 <oneswig> no worries
21:44:10 <oneswig> #topic Scientific OpenStack book
21:44:11 <rajulk> Thanks for the discussion. really helpful for me as well.
21:44:30 <rajulk> b1airo: totally agree
21:44:36 <martial> thank you Mr Kumar
21:44:46 <oneswig> Final week for rabid hacking on the text everyone!
21:44:59 <rajulk> martial: thanks a lot :)
21:45:13 <martial> oneswig: well we are adding a chapter about "Research in Production" :)
21:45:26 <oneswig> Today I received a great study on Lustre on OpenStack from the Sanger Institute team
21:45:30 <b1airo> What's our drop dead date again oneswig ?
21:45:40 <martial> rajulk: thank you for giving us the opportunity to learn from your expertise
21:45:56 <rbudden> b1airo: 22nd I believe
21:46:02 <oneswig> Kathy wants final copy end of next week, edits close end of this week
21:46:03 <b1airo> oneswig: excellent, was hoping for that!
21:46:05 <martial> oneswig: started reviewing ... I saw you spot me :)
21:46:58 <oneswig> It's coming together - apologies to anyone who has commented and I've not responded for a couple of days.  Quite a bit going on at once.
21:47:41 <oneswig> martial: not sure that was me... I've been largely elsewhere today.
21:48:01 <rbudden> might have been me
21:48:04 <oneswig> I'm working on an SKA infrastructure study to replace the Cray one, should have that in review tomorrow
21:48:07 <rbudden> i was making lots of edits before this meeting ;)
21:48:32 <rbudden> i sent oneswig an email for review, but any feedback is welcome
21:48:33 <oneswig> rbudden: looking for the prize for the most up-to-date entry, huh? :-)
21:48:36 <martial> oneswig: interesting, that person had your face icon ... you are being identity theft-ed on book reviews of your own chapter ... sneaky
21:48:49 <hogepodge> Sept 29 is the hard deadline
21:49:04 <hogepodge> Absolutely no edits after that. If it's not ready we won't have time to produce the book.
21:49:09 <oneswig> Thanks hogepodge
21:49:28 <rbudden> oh nice, i have a few more days ;)
21:49:49 <oneswig> We can propose cover images for this edition.
21:49:50 <hogepodge> (Kathy has entrusted me to remind everyone of that as often as possible... I don't want to let Kathy down) :-D
21:50:14 <oneswig> I need URLs of images that are creative commons licensed
21:50:42 <oneswig> If we get options, I'll set up a vote
21:51:14 <oneswig> So - impressive images please everyone from your cloudy science workloads!
21:51:22 <rbudden> i know someone mentioned last year that they liked the Bridges OPA layout image
21:51:32 <rbudden> it’s already in the previous book though, so maybe something fresh
21:51:40 <rbudden> but figured it’s up for grabs if needed
21:52:17 <oneswig> rbudden: I used a screenshot from the VRML widget.  Any way of getting something at a better resolution?
21:52:29 <rbudden> yes
21:52:39 <rbudden> I can ask and get the original
21:52:56 <oneswig> sounds good.
21:53:12 <oneswig> OK - any more on the book work?
21:53:26 <oneswig> #topic SC17
21:53:36 <oneswig> hogepodge: was this your item?
21:53:57 <hogepodge> Yes, we need point people to receive book and sticker shipments
21:54:12 <hogepodge> Denise got stickers approved, so I just need to know who to mail them to for the conference.
21:54:13 <rbudden> i can volunteer for that
21:54:16 <oneswig> IU or PSC booths might be good places to deliver to - rbudden?
21:54:21 <oneswig> ah, great
21:54:25 <rbudden> yes, i’ll be at SC Sun-Fri
21:54:47 <oneswig> b1airo: what's the latest on the BoF?
21:54:48 <rbudden> they can likely be shipped straight to the PSC booth or split between us and IU
21:54:51 <hogepodge> Great! Just send an email to me and I'll connect you with Denise to set it up.
21:54:59 <rbudden> hogepodge: will do, thx
21:55:26 <b1airo> oneswig: BoF is on
21:55:42 <oneswig> Did the merge happen or is it just us?
21:55:42 <rbudden> #action rbudden to get in touch with hogepodge about SC book and sticker shipments
21:55:45 <hogepodge> chris@openstack.org
21:55:48 <b1airo> Final participant list and format TBA
21:56:18 <b1airo> I believe the merge is happening, but I still haven't heard any more from Meredith on the other side
21:57:54 <oneswig> Any more on SC?
21:58:01 <b1airo> Not from me
21:58:05 <rbudden> do we have a list of presenters for booth talks?
21:58:09 <rbudden> or still TBA?
21:58:13 <rbudden> errr TBD
21:58:21 <b1airo> Other thing I wanted to raise is feedback (if any) on becoming a SIG??
21:58:59 <oneswig> rbudden: I think Mike Lowe was tracking that.
21:59:08 <b1airo> Plus! Forum topics for Sydney - we have not thrown anything into the brainstorming pile yet I think
21:59:09 <oneswig> #topic AOB - SIG etc.
21:59:33 <rbudden> oneswig: thanks, i’ll sync up with Mike
21:59:49 <oneswig> SIG, still seems fine to me.  The mailing list split (ideally involving less cross-posting) is the only question for me.
22:00:07 <martial> b1airo: our usual BoF / Lightning Talks
22:00:13 <oneswig> We upgraded to Pike today, it worked!
22:00:35 <b1airo> martial: those are already in, I mean the main Forum
22:00:59 <oneswig> b1airo: might be a good thread for openstack-sigs[scientific]?
22:01:00 <martial> (I am still confused about this Forum thing then)
22:01:34 <martial> oneswig: might be especially since we are out of time
22:01:35 <priteau> oneswig: Anything to look out for before moving to Pike? I keep cherry-picking bug fixes from it into our Ocata deployment
22:03:05 <oneswig> priteau: Mark and johnthetubaguy in our team did a fair bit of prep.  We use Kolla.  There were issues with docker_py -> docker.  Also, we had problems with RabbitMQ hosts mangling /etc/hosts with repeated entries.  Finally, a racy problem with updating haproxy which was fixed by a cherry-pick from master
22:03:21 <oneswig> And I forgot to merge a PR I'd left dangling, doh.
22:03:39 <oneswig> Kolla worked really well for us though.
22:03:51 <oneswig> Ah, we are way over time.
22:03:57 <rbudden> oneswig: I would be very interested in thoughts/war stories of Kolla
22:04:02 <rbudden> for another time then!
22:04:06 <oneswig> Better wrap up.
22:04:14 <oneswig> rbudden: will try to write something up!
22:04:17 <oneswig> cheers all
22:04:20 <oneswig> #endmeeting