20:58:49 <martial> #startmeeting scientific-wg
20:58:50 <openstack> Meeting started Tue Feb 21 20:58:49 2017 UTC and is due to finish in 60 minutes.  The chair is martial. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:58:52 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:58:54 <openstack> The meeting name has been set to 'scientific_wg'
20:59:04 <martial> #chair oneswig
20:59:05 <openstack> Current chairs: martial oneswig
20:59:08 <oneswig> Hi Martial
20:59:14 <martial> Hi Stig
20:59:27 <zhipeng> hey
20:59:30 <oneswig> Hi all - are we a couple of minutes early?
20:59:42 <oneswig> Hi zhipeng, welcome
20:59:43 <martial> I started one minute early I guess
20:59:50 <martial> Hi zhipeng, welcome :)
20:59:54 <priteau> Hello
20:59:55 <oneswig> no problem
20:59:58 <oneswig> hi priteau
21:00:00 <zhipeng> :)
21:00:22 <oneswig> #link Agenda for today (a short one) https://wiki.openstack.org/wiki/Scientific_working_group#IRC_Meeting_February_21st_2017
21:00:31 <rbudden> hello
21:00:36 <oneswig> Hi rbudden
21:01:01 <simon-AS559> Hello all!
21:01:14 <oneswig> Evening simon-AS559
21:01:14 <b1airo> Hi simon-AS559
21:01:25 <jmlowe> Hello
21:01:28 <martial> oneswig: I missed doing my update from last week
21:01:28 <oneswig> g'day b1airo jmlowe
21:01:30 <b1airo> Howdy oneswig , good holiday?
21:01:33 <martial> #chair b1airo
21:01:34 <openstack> Current chairs: b1airo martial oneswig
21:01:41 <oneswig> fab-u-lous thanks
21:01:54 <oneswig> Martial you want to start with that?
21:02:05 <martial> can do
21:02:20 <martial> #topic Boston Cloud Declaration
21:03:05 <martial> So a couple weeks back (Monday 13th) was the first meeting (online) of people on the cloud declaration
21:03:41 <hogepodge> o/
21:03:52 <martial> a few things were discussed
21:04:03 <martial> the draft outline was created on google docs
21:04:13 <oneswig> got a link martial?
21:04:15 <martial> I am not sure if a link was shared yet (public one)
21:04:54 <martial> the document I see is still listed as edit by all
21:04:59 <martial> (who have the link)
21:05:23 <martial> so I need to check with Mr Yazdi if he has a finalized/shareable version
21:06:08 <jmlowe> invite by email?
21:06:11 <martial> I made a copy
21:06:18 <oneswig> Can you summarise on the overall direction?
21:06:24 <martial> so here is a view only shareable link:   https://docs.google.com/document/d/1CFRfRC4oFtXUsLbyJosKz06IhwKrWBo_Ag3ElGM8vtU/edit?usp=sharing
21:07:04 <martial> yes, the discussion was centered on figuring out who was interested in championing a task/topic
21:07:30 <martial> there were quite a few topics discussed as you can see
21:07:32 <jmlowe> oneswig: got a google docs enabled email?
21:07:52 <oneswig> jmlowe: try stig@stackhpc.com or stig@telfer.org
21:07:53 <martial> Identity Federation
21:07:53 <martial> Security Considerations (digital rights management, IP protection, privacy sensitive data management, etc…)
21:07:53 <martial> Authorized Shared Use Facilitation
21:07:53 <martial> Data Federation (Dataverse project)
21:07:55 <martial> Cost Sharing and Business Facilitation
21:08:29 <b1airo> On this topic... We of Nectar have been chatting to the Kiwis in both University of Auckland and NeSI (NZ eScience Initiative). I'm hopeful we can have an international federation started with them by Boston
21:08:43 <martial> great b1airo
21:09:16 <martial> the idea is to come up with a consensus before the Summit to be able to have a fruitful conversation
21:09:32 <martial> because we all agreed that the time allocated is short in Boston alone
21:09:51 <oneswig> b1airo: how much actual federating goes on in Nectar? As in, how often do people run workloads on the resources of other institutions?
21:10:18 <martial> you can see in the document I shared in red, the tasks in place
21:10:51 <martial> but it does cover the 5 topics listed above
21:11:01 <martial> the next meeting is ...
21:11:07 <b1airo> Constantly. The bulk of users tend to head towards their local zones, but some of them don't care and the Cells scheduler just puts them wherever the most free capacity is
21:11:23 <martial> next Monday
21:11:59 <martial> for those interested, I would email khalilyazdi@outlook.com
21:12:07 <trandles> o/  sorry I'm late...competing meetings...
21:12:16 <oneswig> Hi trandles
21:12:18 <martial> does this help with a quick recap
21:12:22 <martial> ?
21:13:01 <oneswig> Thanks martial I think so.
21:13:05 <b1airo> Tim Bell also gave a recap last meeting if you want to check the logs
21:13:43 <oneswig> b1airo: last WG meeting?
21:14:20 <b1airo> Yep
21:15:06 <b1airo> PS: I'm at the airport so may drop out for a while
21:15:13 <oneswig> Just looking at it now...
21:15:16 <martial> I see that
21:15:23 <martial> http://eavesdrop.openstack.org/meetings/scientific_wg/2017/scientific_wg.2017-02-15-09.02.log.txt
21:15:29 <oneswig> b1airo: before you go, did you get to see the talk on HPC monitoring?
21:15:53 <b1airo> oneswig: yes I did
21:16:07 <martial> #topic Update on scientific infrastructure monitoring
21:16:11 <b1airo> I was planning to follow up with Jordi over email and introduce you
21:16:19 <oneswig> Can you describe or are you tapping on a phone?
21:16:35 <b1airo> Yes & yes
21:17:20 <zhipeng> b1airo we from Cyborg project would like to work on the GPU topic
21:17:32 <b1airo> Basically he started with a fairly traditional architecture using rsyslog for consolidation and then pushing into logstash+kibana
21:17:37 <oneswig> #link abstract from NeSI HPC monitoring project http://www.eresearchnzconference.org.nz/wp-content/uploads/2016/12/9.-Blasco.pdf
21:17:43 <b1airo> zhipeng: great!
21:18:13 <b1airo> Found scalability issues for the number of nodes he wanted the system to handle
21:18:15 <martial> zhipeng: this is great news, thanks for confirming:)
21:18:56 <b1airo> So wrote his own custom metrics gathering and aggregation layer
21:19:01 <oneswig> What are the NeSI HPC resources - anything vendor specific?
21:19:57 <b1airo> oneswig: I don't think so anymore, they had a BlueGene P but it's now decommissioned
21:20:35 <b1airo> Anyway the metrics tooling can handle up to 15k nodes with a few hundred metrics
21:20:49 <b1airo> With just a modest host
21:21:32 <b1airo> The interesting thing was that they are pulling out HPC specific metrics and using them to target efficiency improvements
21:22:14 <b1airo> Efficiency as in number of nodes needed to complete same job
21:22:47 <oneswig> That is indeed interesting.  And if it's not vendor-specific, there's a hope for reuse in an openstack context
21:23:39 <oneswig> We've been talking over how to get IB verbs metrics into Monasca from our hypervisors
21:23:49 <oneswig> sounds like quite a similar theme
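A minimal sketch of the kind of collector being discussed: reading InfiniBand port counters from the sysfs layout (`/sys/class/infiniband/<dev>/ports/<n>/counters/`) into a dict that could be forwarded to a metrics service such as Monasca. The function and its name are illustrative, not anything from the projects mentioned above.

```python
import os

def read_ib_counters(port_dir):
    """Read all InfiniBand port counters from a sysfs-style directory.

    Returns a dict mapping counter name -> integer value, suitable for
    forwarding to a metrics service.
    """
    counters = {}
    counter_dir = os.path.join(port_dir, "counters")
    for name in os.listdir(counter_dir):
        path = os.path.join(counter_dir, name)
        try:
            with open(path) as f:
                counters[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Skip counters that are unreadable or non-numeric
            continue
    return counters
```

On a real hypervisor, `port_dir` would be something like `/sys/class/infiniband/mlx5_0/ports/1`.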
21:24:51 <b1airo> That sounds interesting
21:25:07 <martial> oneswig : the abstract is very interesting indeed
21:25:29 <oneswig> OK thanks b1airo, assume you're at the airport to do some skydiving, right?  Hate to distract you while you're packing your parachute
21:26:07 <oneswig> conferences in Queenstown NZ, indeed... :-)
21:26:08 <b1airo> By the way, I found a fun bug in libvirt+qemu last week - can't launch static hugepage backed guests much larger than 120GB
21:26:24 <oneswig> what's the symptom?
21:26:34 <jmlowe> bah, who would need to do that!
21:27:15 <b1airo> Kernel zeros the memory before really launching the process so libvirt times out after 30s waiting for QMP socket and then kills Qemu
21:27:31 <b1airo> jmlowe: lol, thanks Mr Gates
21:28:14 <b1airo> "120GB is more than anyone will ever need"
21:28:41 <oneswig> I'm surprised that would take 30s
21:29:01 <jmlowe> not dissimilar to when I mkfs.ext4 on a multi TB volume with discard support, it will try to do a sync discard on every block before laying down the filesystem
21:29:39 <b1airo> Oneswig: yeah seems to be single threaded in the kernel
21:29:42 <jmlowe> then I figured out -K
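For context on the `-K` flag mentioned here: `mkfs.ext4 -K` tells mke2fs to keep (not discard) existing blocks, skipping the per-block discard pass that makes formatting slow on large discard-capable volumes. A sketch using a sparse file so no real disk is touched (paths are illustrative):

```shell
# Create a sparse 2 GiB test image (consumes almost no real space)
truncate -s 2G /tmp/bigvol.img

# -K skips the discard pass; -F forces mkfs on a non-block-device file
if command -v mkfs.ext4 >/dev/null 2>&1; then
  mkfs.ext4 -q -F -K /tmp/bigvol.img
fi
```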
21:30:04 <b1airo> Real solution is for libvirt to pass an already open socket to qemu for QMP
21:30:24 <jmlowe> bug filed?
21:30:36 <b1airo> But sadly for now I think we will have to patch the timeout and build our own packages
21:31:09 <b1airo> jmlowe: planning to push it to Redhat this week to ensure it gets attention
21:31:40 <oneswig> It's a nice thing to be able to alloc 120GB of contiguous physical memory in your host at all...
21:31:43 <jmlowe> enabling huge pages is on my wishlist
21:32:15 <oneswig> I'd imagine the time goes in "draining the swamp"
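For readers following the hugepage bug discussion, a sketch of the libvirt domain XML that triggers this path: a large guest backed by static 1 GiB hugepages, whose memory the kernel zeroes before qemu answers on the QMP socket. Sizes and names are illustrative only.

```xml
<!-- Fragment of a libvirt domain definition (illustrative sizes) -->
<domain type='kvm'>
  <name>big-hugepage-guest</name>
  <memory unit='GiB'>120</memory>
  <memoryBacking>
    <hugepages>
      <page size='1048576' unit='KiB'/>
    </hugepages>
  </memoryBacking>
  <!-- vcpu, os, devices, etc. omitted -->
</domain>
```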
21:33:07 <oneswig> ok should we move on?
21:33:21 <martial> #topic More input wanted on Identity Federation article
21:33:30 <oneswig> Aha, well this ties in nicely
21:33:40 <oneswig> Section 3A in the document Martial shared
21:34:03 <simon-AS559> One of the links under "further reading" seems dead (https://www.gitbook.com/book/indigo-dc/openid-keystone)
21:34:04 <oneswig> begins "Compile a summary of the current state of identity management solutions..."
21:35:01 <oneswig> Thanks simon-AS559 - that's one of the few things in a bare bones document
21:35:20 <jmlowe> the mechanics of federation are changing in Ocata?
21:35:38 <oneswig> #link Scientific WG studies on OpenStack github https://github.com/openstack/scientific-wg/blob/master/doc/source/openstack-and-federated-identity.rst
21:35:46 <jmlowe> afaik every federated IdP goes in their own domain?
21:35:55 <oneswig> jmlowe: in a backwards-incompatible way?
21:36:14 <jmlowe> oneswig: would you expect it to be any other way?
21:36:26 <oneswig> jmlowe: shocked, the cynicism
21:36:34 <oneswig> :-)
21:37:02 <oneswig> Anyway, I'm hoping and looking for contributions from people who can speak better from experience on this study
21:37:24 <priteau> This seems to be the repo for the gitbook: https://github.com/indigo-dc/openid-keystone-doc
21:37:38 <oneswig> #link Done in the usual way - gerrit review https://review.openstack.org/#/q/project:openstack/scientific-wg
21:37:44 <jmlowe> well I was ramping up to do a whole bunch of scripting to make openid connect federation work, now on hold to see how it's all changed
21:38:38 <rbudden> jmlowe: looks like we should wait for Ocata for the Jetstream/Bridges Federation
21:38:53 <martial> jmlowe: will it also be changed through the global API (not the project API)?
21:38:57 <jmlowe> I have a burning need to federate with globus auth doing openid connect, this is extremely helpful
21:38:59 <oneswig> jmlowe: anything to be gained from the experience of Indigo-DC or is that all wrapped up in the change in Ocata?
21:40:06 <jmlowe> I'm not sure, was just casually chatting about having to build a whole mapping thing and was told the modeling of federated users in keystone changes completely in Ocata
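For reference on the "whole mapping thing" jmlowe mentions: a sketch of a Keystone federation mapping for an OpenID Connect IdP. The remote attribute name (`OIDC-sub`) and the group/domain names are assumptions for illustration; actual attribute names depend on the Apache auth module configuration.

```json
[
  {
    "local": [
      {"user": {"name": "{0}"}},
      {
        "group": {
          "name": "federated_users",
          "domain": {"name": "federated_domain"}
        }
      }
    ],
    "remote": [
      {"type": "OIDC-sub"}
    ]
  }
]
```

Such a mapping would typically be registered with `openstack mapping create` and tied to an identity provider and protocol.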
21:41:19 <martial> jmlowe: okay, wait and see then. The PTG is right now, I wonder if the Keystone Etherpad reflects anything yet
21:41:30 <oneswig> It's my hope that when this study has more useful content, the documents will become available via CI/CD to somewhere under openstack.org.  I'd like to see that before Boston...
21:42:18 <oneswig> jmlowe: this relates to shadow users?
21:43:38 <oneswig> I thought that stuff got reworked only in Mitaka...
21:45:39 <jmlowe> not sure now
21:45:48 <martial> I do not see much comment on this unfortunately, I guess it is an open topic for the time being
21:46:14 <martial> Keystone meets from Wed-Fri, so no real updates until then
21:47:04 <oneswig> martial: I'm guessing they have moved on to Pike planning now anyway?
21:47:33 <oneswig> OK, I had a final item on the agenda for today wrt Boston
21:47:39 <martial> oneswig: yes but by the next meeting, we ought to see their notes on etherpad and know where they are planning to move to
21:47:43 <martial> #topic Boston summit activities
21:48:13 <oneswig> There's the question of what one does in the evenings in Boston.
21:48:33 <oneswig> trandles has a tremendously helpful local friend
21:49:02 <oneswig> I've looked through the options and have four broad categories of venue
21:49:22 <oneswig> #link lobsters http://www.unionoysterhouse.com/pages/meetings.html
21:49:33 <oneswig> #link brewpubs http://www.cambridgebrewingcompany.com
21:49:55 <oneswig> #link local bar near MIT http://www.miracleofscience.us/about.php
21:50:08 <oneswig> #link barbecue https://www.redbones.com
21:50:23 <oneswig> I used to live in Boston and happen to have been to all four :-)
21:50:43 <martial> how many are we going to be?
21:50:51 <oneswig> Last time it was 57
21:51:03 <oneswig> I'm guessing the same +- 10%
21:51:14 <oneswig> perhaps more for the US location?
21:51:57 <trandles> might depend on who's allowed through the airport :(
21:52:02 <oneswig> Is there a way of voting, perhaps in a couple of weeks?
21:52:15 <oneswig> trandles: oh, how topical
21:54:04 <martial> oneswig: seems reasonable
21:54:19 <oneswig> I'll follow up with all options and see if there's a strong contender.  I'll also see if we can get a vendor or two to fund it
21:54:46 <b1airo> The Mellanox OpenStack Scientific-WG
21:54:51 <b1airo> ;-)
21:54:57 <b1airo> C'mon Intel!
21:54:58 <rbudden> has a nice ring to it ;)
21:55:29 <oneswig> Other interconnects are also available :-)
21:55:39 <oneswig> I think Mellanox have thrown down a gauntlet!
21:56:18 <oneswig> OK that's all I had on the evening social for now - will report back
21:56:31 <martial> cool :)
21:56:33 <b1airo> Loving Cumulus at the moment - they helped us debug a Cisco problem recently
21:56:50 <b1airo> Stupid 9Ks throwing away unicast ARP replies
21:57:38 <oneswig> good work to pinpoint that to the switch, unless it happens every time
21:58:32 <oneswig> AOB?
21:59:59 <oneswig> ... nothing to declare from here.  I'm scoping a new project - Kolla + Ironic + stuff - might be interesting in the coming months
22:00:16 <rbudden> oneswig: I’m interested in that
22:00:17 <b1airo> Sounds fun oneswig
22:00:19 <martial> still working on our Dmoni and DataScience agnostic VM
22:00:25 <rbudden> been eyeing up the move to Kolla
22:00:25 <jmlowe> ooh stuff, I love stuff
22:00:26 <oneswig> Will keep you posted!
22:00:37 <martial> packer + ansible + heat as the backbone
22:00:39 <rbudden> oneswig: definitely keep me in the loop
22:00:51 <oneswig> time up, alas - until next time
22:00:58 <priteau> I would love to hear more about this too
22:01:12 <simon-AS559> bye!
22:01:14 <b1airo> rbudden: is your Slurm - OpenStack integration in public domain?
22:01:16 <trandles> +1 on kolla
22:01:18 <martial> #endmeeting