21:00:44 <oneswig> #startmeeting scientific-wg
21:00:46 <openstack> Meeting started Tue Sep 19 21:00:44 2017 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:47 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:49 <openstack> The meeting name has been set to 'scientific_wg'
21:00:57 <oneswig> #chair martial
21:00:58 <openstack> Current chairs: martial oneswig
21:01:14 <oneswig> #link Agenda for today https://wiki.openstack.org/wiki/Scientific_working_group#IRC_Meeting_September_19th_2017
21:01:26 <rajulk> hi martial
21:01:30 <martial> Hi Stig
21:01:42 <oneswig> Hello martial rajulk, good evening
21:01:48 <oneswig> good afternoon, etc.
21:01:56 <rajulk> hi oneswig
21:02:02 <rajulk> good evening
21:02:06 <martial> I see Mr Kumar is already with us, wonderful :)
21:02:07 <priteau> Hello
21:02:22 <oneswig> Hey priteau, how are you?
21:02:24 <rajulk> :)
21:02:34 <hogepodge> hi
21:02:42 <simon-AS5591> Hello all
21:02:58 <oneswig> #chair b1airo
21:02:59 <openstack> Current chairs: b1airo martial oneswig
21:03:04 <oneswig> morning Blair
21:03:11 <b1airo> Morning
21:03:27 <oneswig> OK, let's roll?
21:03:43 <oneswig> #topic Scientific Hackathon
21:03:59 <b1airo> Yep, sounds good. I'm just feeding the animals here so will keep one eye on this.
21:04:07 <oneswig> Flanders mentioned the scientific hackathon is looking for mentors
21:04:17 <oneswig> #link http://hackathon.openstack.org.au/mentors/
21:04:33 <oneswig> What else would you do in a dump like Sydney while the jetlag wears off :-)
21:05:05 <rbudden> hello
21:05:21 <oneswig> Not that we are a mercenary bunch, but I did see mention of summit tickets in return...
21:05:26 <oneswig> Hi rbudden
21:06:03 <oneswig> The hackathon starts Friday afternoon before the summit if anyone is around and interested in taking part
21:06:23 <martial> b1airo oneswig: I apologize, I need to sign off for a few minutes, will be back as soon as I can
21:06:29 <oneswig> #item OpenStack London
21:06:32 <oneswig> martial: np
21:06:41 <oneswig> Is next Tuesday!
21:06:55 <oneswig> We have a WG session for those attending.
21:07:07 <oneswig> Probably not a big win in this time zone...
21:07:57 <b1airo> Great to hear oneswig
21:07:59 <oneswig> I wanted to mention, though, that we are looking for lightning talks for the BoF, and in customary fashion there is now a small prize for the best talk
21:08:08 <b1airo> Hopefully drum up some new members
21:08:40 <oneswig> I think so too. Many of the presentations on the main schedule are scientifically oriented.
21:08:56 <oneswig> Am hoping to meet some new faces.
21:09:03 <oneswig> (and some old ones, of course!)
21:09:12 <priteau> oneswig: Do you know if talks will be recorded?
21:09:27 <oneswig> They were last year. They usually are.
21:10:14 <oneswig> OK, that's all to add there I think.
21:10:39 <oneswig> #topic Opportunistic utilisation for HTC/HPC
21:11:00 <oneswig> We have Rajul Kumar with us today from Northeastern University in Boston, MA
21:11:26 <oneswig> #link Rajul's presentation made in Boston https://www.openstack.org/videos/boston-2017/hpchtc-and-cloud-making-them-work-together-efficiently
21:11:43 <rajulk> hello everyone
21:11:49 <oneswig> #link Slides for the presentation are here: https://drive.google.com/file/d/0B6_MvTMovwvFcVppSy1xVmFVbDQ/view?usp=sharing
21:11:57 <oneswig> Hi rajulk, thanks for coming along
21:12:25 <rajulk> oneswig: thanks for inviting me
21:12:48 <oneswig> Thank martial, but he's missing the fun :-)
21:12:59 <oneswig> Can you describe the context for your work?
21:13:06 <oneswig> Is this something used in MOC?
21:13:11 <rajulk> sure
21:13:35 <rajulk> well, we are still working on it and it is not ready for use yet
21:13:56 <rajulk> but hopefully it will soon be in shape
21:14:51 <rajulk> So the idea is to use the idle/underutilized resources in an OpenStack cloud to offer an HTC service
21:15:25 <rajulk> This will be a Slurm cluster comprising virtual machines on OpenStack
21:15:55 <oneswig> Is it a separate partition in Slurm, or a separate instance of Slurm altogether?
21:16:09 <rajulk> these VMs will be added/removed from the cluster based on resource availability, driven by the resource utilization of the OpenStack cluster
21:16:54 <rajulk> It's a separate instance of Slurm - one that provides an HTC service on OpenStack
21:17:05 <b1airo> How is the OpenStack resource utilisation monitored?
21:17:43 <rajulk> we will be using Watcher, which will pull metrics from Monasca
21:18:15 <rajulk> and Watcher will drive the VM provisioning
21:18:19 <oneswig> rajulk: you don't have that piece in place yet, or is it currently being developed?
21:19:16 <rajulk> we have developed an initial Watcher strategy that does the job
21:19:22 <rajulk> but it is yet to be tested
21:20:47 <priteau> Are the Slurm instances pre-emptible?
21:21:02 <rajulk> yes
21:21:10 <rajulk> so the way it works is
21:21:32 <b1airo> rajulk: so do you rely on Nova scheduling or force it?
21:21:50 <rajulk> whenever the utilization goes beyond a certain threshold, the job and node are suspended on Slurm and then the VM on OpenStack
21:22:08 <rajulk> we use nova scheduling
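[Editor's note] The preemption flow rajulk describes (utilization crosses a threshold, so the Slurm job and node are suspended first, then the backing VM) can be sketched with standard Slurm and OpenStack CLI calls. This is a minimal illustration under stated assumptions, not the MOC code: the Watcher strategy that evaluates the Monasca metrics is elided, the node/job/server names are hypothetical, and it assumes the Slurm node name matches the Nova server name. Suspending rather than deleting matches rajulk's later point that jobs are paused, not lost.

    #!/usr/bin/env python3
    """Illustrative sketch of the suspend/resume sequence discussed above.

    Assumes scontrol and the openstack CLI are on PATH and the caller is
    authenticated to OpenStack (OS_* environment variables). Not the
    actual MOC/Watcher implementation.
    """
    import subprocess


    def run(cmd):
        # Raise on failure so a half-completed preemption is noticed.
        subprocess.run(cmd, check=True)


    def preempt(node, job_id, server):
        # 1. Drain the node so Slurm schedules no new work onto it.
        run(["scontrol", "update", f"NodeName={node}",
             "State=DRAIN", "Reason=preempted-for-cloud-load"])
        # 2. Suspend (not kill) the running HTC job, as rajulk notes.
        run(["scontrol", "suspend", str(job_id)])
        # 3. Suspend the backing VM so its resources return to the cloud.
        run(["openstack", "server", "suspend", server])


    def resume(node, job_id, server):
        # Reverse order: VM first, then the job, then reopen the node.
        run(["openstack", "server", "resume", server])
        run(["scontrol", "resume", str(job_id)])
        run(["scontrol", "update", f"NodeName={node}", "State=RESUME"])


    if __name__ == "__main__":
        # Hypothetical names, purely for illustration.
        preempt(node="htc-node-01", job_id=4242, server="htc-node-01")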
21:22:32 <oneswig> What do you do about re-creating the HPC cluster environment in OpenStack? What about filesystems etc.?
21:23:46 <rajulk> we have used Salt as the config management tool that drives the provisioning and configuration of a new node for Slurm
21:23:54 <rajulk> the controller remains static
21:25:02 <rajulk> We'll be using an NFS file system mounted on top of Swift storage
21:26:20 <oneswig> sounds interesting.
21:26:41 <oneswig> It's an internal filesystem to OpenStack then?
21:27:06 <rajulk> yes. initially that's the plan
21:27:18 <rajulk> however we may extend it as required down the line
21:27:26 <b1airo> Have you looked at any Keystone integration, or would this service be separate from OpenStack from the user's perspective?
21:28:28 <rajulk> So at MOC we have developed a UI that deals with these scientific tools
21:28:35 <rajulk> that is integrated with Keystone
21:28:57 <rajulk> and whenever a user comes on to use Slurm, he has to use it via that UI
21:29:29 <rajulk> as of now, there will be no direct interaction between Slurm and a user
21:30:22 <b1airo> Ah ok, so you have a science gateway of some kind
21:30:33 <rajulk> exactly
21:31:12 <oneswig> If you ever get to the point of allowing users to log in, I've previously used this to have them authenticated against Keystone in PAM: https://github.com/stackhpc/pam-keystone
21:31:20 <rajulk> it's still in the development phase though
21:31:25 <martial> (back, sorry)
21:31:41 <b1airo> Ok, so presumably the OpenStack Slurm is just one computational resource behind that gateway, which would also include traditional HPC systems?
21:31:45 <oneswig> welcome back martial :-)
21:31:47 <martial> (will check the logs for the presentation)
21:32:25 <b1airo> rajulk: what gateway software do you use there?
21:32:29 <rajulk> As of now yes, Slurm is running behind the gateway for the user
21:33:32 <priteau> rajulk: when you say utilization goes beyond a threshold, is that looking at per-node utilization (CPU, mem, etc.) or the number of instances across the whole cluster?
21:33:38 <rajulk> there is a different team working on that, but as far as I know it's built on top of Troposphere and Atmosphere from CyVerse
21:35:04 <rajulk> priteau: it's the utilization across the cluster on CPU and memory for a certain period of time
21:35:29 <priteau> And how do you decide which VM to kill?
21:36:16 <b1airo> Interesting possibilities to look for idle VMs in Slurm there...
21:36:42 <rajulk> priteau: well, there's no logic behind that yet. We just pick the first one in the list, and we don't kill a VM but just suspend it
21:37:05 <oneswig> rajulk: your presentation mentions multi-node jobs as future work, what's stopping that?
21:39:24 <rajulk> oneswig: theoretically nothing. We haven't tried it yet. The only concern initially was whether we should stop all the nodes in the cluster or just a portion of it
21:39:42 <rajulk> and if a portion, how to figure that out and work around it
21:40:11 <rajulk> so we began with the simpler case of single-node jobs
21:40:31 <oneswig> makes sense
21:42:20 <oneswig> OK - did we have any final questions for Rajul?
21:42:45 <b1airo> Just a thanks for joining!
21:42:58 <oneswig> seconded - thanks again Rajul
21:43:08 <armstrong> I joined late so I couldn't see his presentation
21:43:24 <armstrong> any link to that please?
21:43:34 <oneswig> armstrong: https://drive.google.com/file/d/0B6_MvTMovwvFcVppSy1xVmFVbDQ/view?usp=sharing
21:43:37 <b1airo> I'm still wanting a generic OpenStack-level preemptible instance, but we could use this approach for a lot of stuff too
21:43:49 <armstrong> Ok thanks
21:43:58 <oneswig> no worries
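[Editor's note] The pam-keystone module oneswig linked above validates login credentials against Keystone from within PAM. For readers who want the shape of that check, here is a minimal sketch using keystoneauth1: an unscoped password authentication that either succeeds or raises. The Keystone URL and domain name are placeholders, and pam-keystone itself performs this inside the PAM stack rather than in a standalone script.

    """Sketch: validate a username/password against Keystone v3.

    Assumes keystoneauth1 is installed; KEYSTONE_URL is a placeholder.
    """
    from keystoneauth1 import exceptions, session
    from keystoneauth1.identity import v3

    KEYSTONE_URL = "https://keystone.example.org:5000/v3"  # placeholder


    def credentials_valid(username, password, user_domain="Default"):
        auth = v3.Password(
            auth_url=KEYSTONE_URL,
            username=username,
            password=password,
            user_domain_name=user_domain,
            unscoped=True,  # we only care whether the credentials work
        )
        sess = session.Session(auth=auth)
        try:
            sess.get_token()  # forces the authentication round-trip
            return True
        except exceptions.Unauthorized:
            return False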
21:44:10 <oneswig> #topic Scientific OpenStack book
21:44:11 <rajulk> Thanks for the discussion. really helpful for me as well.
21:44:30 <rajulk> b1airo: totally agree
21:44:36 <martial> thank you Mr Kumar
21:44:46 <oneswig> Final week for rabid hacking on the text everyone!
21:44:59 <rajulk> martial: thanks a lot :)
21:45:13 <martial> oneswig: well we are adding a chapter about "Research in Production" :)
21:45:26 <oneswig> Today I received a great study on Lustre on OpenStack from the Sanger Institute team
21:45:30 <b1airo> What's our drop-dead date again oneswig?
21:45:40 <martial> rajulk: thank you for giving us the opportunity to learn from your expertise
21:45:56 <rbudden> b1airo: the 22nd I believe
21:46:02 <oneswig> Kathy wants final copy by the end of next week; edits close at the end of this week
21:46:03 <b1airo> oneswig: excellent, was hoping for that!
21:46:05 <martial> oneswig: started reviewing ... I saw you spot me :)
21:46:58 <oneswig> It's coming together - apologies to anyone who has commented and I've not responded to for a couple of days. Quite a bit going on at once.
21:47:41 <oneswig> martial: not sure that was me... I've been largely elsewhere today.
21:48:01 <rbudden> might have been me
21:48:04 <oneswig> I'm working on an SKA infrastructure study to replace the Cray one, should have that in review tomorrow
21:48:07 <rbudden> i was making lots of edits before this meeting ;)
21:48:32 <rbudden> i sent oneswig an email for review, but any feedback is welcome
21:48:33 <oneswig> rbudden: looking for the prize for the most up-to-date entry, huh? :-)
21:48:36 <martial> oneswig: interesting, that person had your avatar ... your identity is being stolen on book reviews of your own chapter ... sneaky
21:48:49 <hogepodge> Sept 29 is the hard deadline
21:49:04 <hogepodge> Absolutely no edits after that. If it's not ready we won't have time to produce the book.
21:49:09 <oneswig> Thanks hogepodge
21:49:28 <rbudden> oh nice, i have a few more days ;)
21:49:49 <oneswig> We can propose cover images for this edition.
21:49:50 <hogepodge> (Kathy has entrusted me to remind everyone of that as often as possible... I don't want to let Kathy down) :-D
21:50:14 <oneswig> I need URLs of images that are Creative Commons licensed
21:50:42 <oneswig> If we get options, I'll set up a vote
21:51:14 <oneswig> So - impressive images please everyone, from your cloudy science workloads!
21:51:22 <rbudden> i know someone mentioned last year that they liked the Bridges OPA layout image
21:51:32 <rbudden> it’s already in the previous book though, so maybe something fresh
21:51:40 <rbudden> but figured it’s up for grabs if needed
21:52:17 <oneswig> rbudden: I used a screenshot from the VRML widget. Any way of getting something at better resolution?
21:52:29 <rbudden> yes
21:52:39 <rbudden> I can ask and get the original
21:52:56 <oneswig> sounds good.
21:53:12 <oneswig> OK - any more on the book work?
21:53:26 <oneswig> #topic SC17
21:53:36 <oneswig> hogepodge: was this your item?
21:53:57 <hogepodge> Yes, we need point people to receive book and sticker shipments
21:54:12 <hogepodge> Denise got stickers approved, so I just need to know who to mail them to for the conference.
21:54:13 <rbudden> i can volunteer for that
21:54:16 <oneswig> IU or PSC booths might be good places to deliver to - rbudden?
21:54:21 <oneswig> ah, great
21:54:25 <rbudden> yes, i’ll be at SC Sun-Fri
21:54:47 <oneswig> b1airo: what's the latest on the BoF?
21:54:48 <rbudden> they can likely be shipped straight to the PSC booth or split between us and IU
21:54:51 <hogepodge> Great! Just send an email to me and I'll connect you with Denise to set it up.
21:54:59 <rbudden> hogepodge: will do, thx
21:55:26 <b1airo> oneswig: BoF is on
21:55:42 <oneswig> Did the merge happen or is it just us?
21:55:42 <rbudden> #action rbudden to get in touch with hogepodge about SC book and sticker shipments
21:55:45 <hogepodge> chris@openstack.org
21:55:48 <b1airo> Final participant list and format TBA
21:56:18 <b1airo> I believe the merge is happening, but I still haven't heard any more from Meredith on the other side
21:57:54 <oneswig> Any more on SC?
21:58:01 <b1airo> Not from me
21:58:05 <rbudden> do we have a list of presenters for booth talks?
21:58:09 <rbudden> or still TBA?
21:58:13 <rbudden> errr, TBD
21:58:21 <b1airo> Other thing I wanted to raise is feedback (if any) on becoming a SIG??
21:58:59 <oneswig> rbudden: I think Mike Lowe was tracking that.
21:59:08 <b1airo> Plus! Forum topics for Sydney - we have not thrown anything into the brainstorming pile yet, I think
21:59:09 <oneswig> #topic AOB - SIG etc.
21:59:33 <rbudden> oneswig: thanks, i’ll sync up with Mike
21:59:49 <oneswig> SIG still seems fine to me. The mailing list split (ideally involving less cross-posting) is the only question for me.
22:00:07 <martial> b1airo: our usual BoF / Lightning Talks
22:00:13 <oneswig> We upgraded to Pike today, it worked!
22:00:35 <b1airo> martial: those are already in, I mean the main Forum
22:00:59 <oneswig> b1airo: might be a good thread for openstack-sigs[scientific]?
22:01:00 <martial> (I am still confused about this Forum thing then)
22:01:34 <martial> oneswig: might be, especially since we are out of time
22:01:35 <priteau> oneswig: Anything to look out for before moving to Pike? I keep cherry-picking bug fixes from it into our Ocata deployment
22:03:05 <oneswig> priteau: Mark and johnthetubaguy in our team did a fair bit of prep. We use Kolla. There were issues with the docker_py -> docker transition. Also, we had problems with RabbitMQ hosts mangling /etc/hosts with repeated entries. Finally, a racy problem with updating haproxy, which was fixed by a cherry-pick from master
22:03:21 <oneswig> And I forgot to merge a PR I'd left dangling, doh.
22:03:39 <oneswig> Kolla worked really well for us though.
22:03:51 <oneswig> Ah, we are way over time.
22:03:57 <rbudden> oneswig: I would be very interested in thoughts/war stories about Kolla
22:04:02 <rbudden> for another time then!
22:04:06 <oneswig> Better wrap up.
22:04:14 <oneswig> rbudden: will try to write something up!
22:04:17 <oneswig> cheers all
22:04:20 <oneswig> #endmeeting
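[Editor's addendum] On oneswig's Pike upgrade notes above: one symptom was services repeatedly appending entries to /etc/hosts. Purely as an illustration of spotting that symptom (the underlying cause would need fixing in the deployment tooling, not in a script like this), a minimal check might look as follows:

    """Sketch: report duplicated entries in /etc/hosts."""
    from collections import Counter


    def duplicate_host_lines(path="/etc/hosts"):
        counts = Counter()
        with open(path) as hosts:
            for line in hosts:
                entry = line.split("#", 1)[0].strip()  # drop comments/blanks
                if entry:
                    counts[" ".join(entry.split())] += 1  # normalise spacing
        return {entry: n for entry, n in counts.items() if n > 1}


    if __name__ == "__main__":
        for entry, n in duplicate_host_lines().items():
            print(f"{n}x {entry}")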