20:58:49 <martial> #startmeeting scientific-wg
20:58:50 <openstack> Meeting started Tue Feb 21 20:58:49 2017 UTC and is due to finish in 60 minutes. The chair is martial. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:58:52 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:58:54 <openstack> The meeting name has been set to 'scientific_wg'
20:59:04 <martial> #chair oneswig
20:59:05 <openstack> Current chairs: martial oneswig
20:59:08 <oneswig> Hi Martial
20:59:14 <martial> Hi Stig
20:59:27 <zhipeng> hey
20:59:30 <oneswig> Hi all - are we a couple of minutes early?
20:59:42 <oneswig> Hi zhipeng, welcome
20:59:43 <martial> I started one minute early I guess
20:59:50 <martial> Hi zhipeng, welcome :)
20:59:54 <priteau> Hello
20:59:55 <oneswig> no problem
20:59:58 <oneswig> hi priteau
21:00:00 <zhipeng> :)
21:00:22 <oneswig> #link Agenda for today (a short one) https://wiki.openstack.org/wiki/Scientific_working_group#IRC_Meeting_February_21st_2017
21:00:31 <rbudden> hello
21:00:36 <oneswig> Hi rbudden
21:01:01 <simon-AS559> Hello all!
21:01:14 <oneswig> Evening simon-AS559
21:01:14 <b1airo> Hi simon-AS559
21:01:25 <jmlowe> Hello
21:01:28 <martial> oneswig: I missed doing my update from last week
21:01:28 <oneswig> g'day b1airo jmlowe
21:01:30 <b1airo> Howdy oneswig, good holiday?
21:01:33 <martial> #chair b1airo
21:01:34 <openstack> Current chairs: b1airo martial oneswig
21:01:41 <oneswig> fab-u-lous thanks
21:01:54 <oneswig> Martial you want to start with that?
21:02:05 <martial> can do
21:02:20 <martial> #topic Boston Cloud Declaration
21:03:05 <martial> So a couple weeks back (Monday 13th) was the first meeting (online) of people on the cloud declaration
21:03:41 <hogepodge> o/
21:03:52 <martial> a few things were discussed
21:04:03 <martial> the draft outline was created on google docs
21:04:13 <oneswig> got a link martial?
21:04:15 <martial> I am not sure if a link was shared yet (public one)
21:04:54 <martial> the document I see is still listed as edit by all
21:04:59 <martial> (who have the link)
21:05:23 <martial> so I need to check with Mr Yazdi if he has a finalized/shareable version
21:06:08 <jmlowe> invite by email?
21:06:11 <martial> I made a copy
21:06:18 <oneswig> Can you summarise the overall direction?
21:06:24 <martial> so here is a view only shareable link: https://docs.google.com/document/d/1CFRfRC4oFtXUsLbyJosKz06IhwKrWBo_Ag3ElGM8vtU/edit?usp=sharing
21:07:04 <martial> yes, the discussion was centered on figuring out who was interested in championing a task/topic
21:07:30 <martial> there were quite a few topics discussed as you can see
21:07:32 <jmlowe> oneswig: got a google docs enabled email?
21:07:52 <oneswig> jmlowe: try stig@stackhpc.com or stig@telfer.org
21:07:53 <martial> Identity Federation
21:07:53 <martial> Security Considerations (digital rights management, IP protection, privacy sensitive data management, etc…)
21:07:53 <martial> Authorized Shared Use Facilitation
21:07:53 <martial> Data Federation (Dataverse project)
21:07:55 <martial> Cost Sharing and Business Facilitation
21:08:29 <b1airo> On this topic... We of Nectar have been chatting to the Kiwis in both University of Auckland and NeSI (NZ eScience Initiative). I'm hopeful we can have an international federation started with them by Boston
21:08:43 <martial> great b1airo
21:09:16 <martial> the idea is to come up with a consensus before the Summit to be able to have a fruitful conversation
21:09:32 <martial> because we all agreed that the time allocated in Boston alone is short
21:09:51 <oneswig> b1airo: how much actual federating goes on in Nectar? As in, how often do people run workloads on the resources of other institutions?
21:10:18 <martial> you can see in the document I shared, in red, the tasks in place
21:10:51 <martial> but it does cover the 5 topics listed above
21:11:01 <martial> the next meeting is ...
21:11:07 <b1airo> Constantly. Though the bulk of users tend to head towards their local zones, some of them don't care and the Cells scheduler just puts them wherever the most free capacity is
21:11:23 <martial> next Monday
21:11:59 <martial> for those interested, I would email khalilyazdi@outlook.com
21:12:07 <trandles> o/ sorry I'm late...competing meetings...
21:12:16 <oneswig> Hi trandles
21:12:18 <martial> does this help with a quick recap
21:12:22 <martial> ?
21:13:01 <oneswig> Thanks martial I think so.
21:13:05 <b1airo> Tim Bell also gave a recap last meeting if you want to check the logs
21:13:43 <oneswig> b1airo: last WG meeting?
21:14:20 <b1airo> Yep
21:15:06 <b1airo> PS: I'm at the airport so may drop out for a while
21:15:13 <oneswig> Just looking at it now...
21:15:16 <martial> I see that
21:15:23 <martial> http://eavesdrop.openstack.org/meetings/scientific_wg/2017/scientific_wg.2017-02-15-09.02.log.txt
21:15:29 <oneswig> b1airo: before you go, did you get to see the talk on HPC monitoring?
21:15:53 <b1airo> oneswig: yes I did
21:16:07 <martial> #topic Update on scientific infrastructure monitoring
21:16:11 <b1airo> I was planning to follow up with Jordi over email and introduce you
21:16:19 <oneswig> Can you describe or are you tapping on a phone?
21:16:35 <b1airo> Yes & yes
21:17:20 <zhipeng> b1airo we from the Cyborg project would like to work on the GPU topic
21:17:32 <b1airo> Basically he started with a fairly traditional architecture using rsyslog for consolidation and then pushing into logstash+kibana
21:17:37 <oneswig> #link abstract from NeSI HPC monitoring project http://www.eresearchnzconference.org.nz/wp-content/uploads/2016/12/9.-Blasco.pdf
21:17:43 <b1airo> zhipeng: great!
21:18:13 <b1airo> Found scalability issues for the number of nodes he wanted the system to handle
21:18:15 <martial> zhipeng: this is great news, thanks for confirming :)
21:18:56 <b1airo> So wrote his own custom metrics gathering and aggregation layer
21:19:01 <oneswig> What are the NeSI HPC resources - anything vendor specific?
21:19:57 <b1airo> oneswig: I don't think so any more, they had a BlueGene/P but it's now decommissioned
21:20:35 <b1airo> Anyway his metrics tooling can handle up to 15k nodes with a few hundred metrics
21:20:49 <b1airo> With just a modest host
21:21:32 <b1airo> The interesting thing was that they are pulling out HPC-specific metrics and using them to target efficiency improvements
21:22:14 <b1airo> Efficiency as in number of nodes needed to complete the same job
21:22:47 <oneswig> That is indeed interesting. And if it's not vendor-specific, there's a hope for reuse in an OpenStack context
21:23:39 <oneswig> We've been talking over how to get IB verbs metrics into Monasca from our hypervisors
21:23:49 <oneswig> sounds like quite a similar theme
21:24:51 <b1airo> That sounds interesting
21:25:07 <martial> oneswig: the abstract is very interesting indeed
21:25:29 <oneswig> OK thanks b1airo, assume you're at the airport to do some skydiving, right? Hate to distract you while you're packing your parachute
21:26:07 <oneswig> conferences in Queenstown NZ, indeed... :-)
21:26:08 <b1airo> By the way, I found a fun bug in libvirt+qemu last week - can't launch static hugepage-backed guests much larger than 120GB
21:26:24 <oneswig> what's the symptom?
21:26:34 <jmlowe> bah, who would need to do that!
21:27:15 <b1airo> Kernel zeros the memory before really launching the process, so libvirt times out after 30s waiting for the QMP socket and then kills Qemu
21:27:31 <b1airo> jmlowe: lol, thanks Mr Gates
21:28:14 <b1airo> "120GB is more than anyone will ever need"
21:28:41 <oneswig> I'm surprised that would take 30s
21:29:01 <jmlowe> not dissimilar to when I mkfs.ext4 on a multi-TB volume with discard support, it will try to do a sync discard on every block before laying down the filesystem
21:29:39 <b1airo> oneswig: yeah, seems to be single-threaded in the kernel
21:29:42 <jmlowe> then I figured out -K
21:30:04 <b1airo> Real solution is for libvirt to pass an already open socket to qemu for QMP
21:30:24 <jmlowe> bug filed?
21:30:36 <b1airo> But sadly for now I think we will have to patch the timeout and build our own packages
21:31:09 <b1airo> jmlowe: planning to push it to Red Hat this week to ensure it gets attention
21:31:40 <oneswig> It's a nice thing to be able to alloc 120GB of contiguous physical memory in your host at all...
21:31:43 <jmlowe> enabling huge pages is on my wishlist
21:32:15 <oneswig> I'd imagine the time goes in "draining the swamp"
21:33:07 <oneswig> ok should we move on?
21:33:21 <martial> #topic More input wanted on Identity Federation article
21:33:30 <oneswig> Aha, well this ties in nicely
21:33:40 <oneswig> Section 3A in the document Martial shared
21:34:03 <simon-AS559> One of the links under "further reading" seems dead (https://www.gitbook.com/book/indigo-dc/openid-keystone)
21:34:04 <oneswig> begins "Compile a summary of the current state of identity management solutions..."
21:35:01 <oneswig> Thanks simon-AS559 - that's one of the few things in a bare-bones document
21:35:20 <jmlowe> the mechanics of federation are changing in Ocata?
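[Editor's aside on the mkfs tip above: `-K` tells mkfs.ext4 to skip the discard pass that otherwise issues a synchronous discard for every block before the filesystem is laid down, which is what made jmlowe's multi-TB format so slow. A minimal sketch, using a small file-backed image (path and size are illustrative) so it is safe to run without a real block device:]

```shell
# Create a small scratch image; -F lets mkfs operate on a regular file,
# so no root privileges or real block device are needed for the demo.
dd if=/dev/zero of=/tmp/mkfs-demo.img bs=1M count=64 status=none

# -K skips the pre-format discard pass (the slow part on large
# discard-capable volumes); -q suppresses the usual output.
mkfs.ext4 -q -F -K /tmp/mkfs-demo.img

# Clean up the scratch image.
rm -f /tmp/mkfs-demo.img
```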
21:35:38 <oneswig> #link Scientific WG studies on OpenStack github https://github.com/openstack/scientific-wg/blob/master/doc/source/openstack-and-federated-identity.rst
21:35:46 <jmlowe> afaik every federated IdP goes in their own domain?
21:35:55 <oneswig> jmlowe: in a backwards-incompatible way?
21:36:14 <jmlowe> oneswig: would you expect it to be any other way?
21:36:26 <oneswig> jmlowe: shocked, the cynicism
21:36:34 <oneswig> :-)
21:37:02 <oneswig> Anyway, I'm hoping and looking for contributions from people who can speak better from experience on this study
21:37:24 <priteau> This seems to be the repo for the gitbook: https://github.com/indigo-dc/openid-keystone-doc
21:37:38 <oneswig> #link Done in the usual way - gerrit review https://review.openstack.org/#/q/project:openstack/scientific-wg
21:37:44 <jmlowe> well I was ramping up to do a whole bunch of scripting to make OpenID Connect federation work, now on hold to see how it's all changed
21:38:38 <rbudden> jmlowe: looks like we should wait for Ocata for the Jetstream/Bridges federation
21:38:53 <martial> jmlowe: will it also be changed through the global API (not the project API)?
21:38:57 <jmlowe> I have a burning need to federate with Globus Auth doing OpenID Connect, this is extremely helpful
21:38:59 <oneswig> jmlowe: anything to be gained from the experience of Indigo-DC or is that all wrapped up in the change in Ocata?
21:40:06 <jmlowe> I'm not sure, was just casually chatting about having to build a whole mapping thing and was told the modelling of federated users in keystone changes completely in Ocata
21:41:19 <martial> jmlowe: okay, wait and see then. The PTG is right now, I wonder if the Keystone etherpad reflects anything yet
21:41:30 <oneswig> It's my hope that when this study has more useful content, the documents will become available via CI/CD to somewhere under openstack.org. I'd like to see that before Boston...
21:42:18 <oneswig> jmlowe: this relates to shadow users?
21:43:38 <oneswig> I thought that stuff got reworked only in Mitaka...
21:45:39 <jmlowe> not sure now
21:45:48 <martial> I do not see much comment on this unfortunately, I guess it is an open topic for the time being
21:46:14 <martial> Keystone meets from Wed-Fri, so no real updates until then
21:47:04 <oneswig> martial: I'm guessing they have moved on to Pike planning now anyway?
21:47:33 <oneswig> OK, I had a final item on the agenda for today wrt Boston
21:47:39 <martial> oneswig: yes, but by the next meeting we ought to see their notes on etherpad and know where they are planning to move to
21:47:43 <martial> #topic Boston summit activities
21:48:13 <oneswig> There's the question of what one does in the evenings in Boston.
21:48:33 <oneswig> trandles has a tremendously helpful local friend
21:49:02 <oneswig> I've looked through the options and have four broad categories of venue
21:49:22 <oneswig> #link lobsters http://www.unionoysterhouse.com/pages/meetings.html
21:49:33 <oneswig> #link brewpubs http://www.cambridgebrewingcompany.com
21:49:55 <oneswig> #link local bar near MIT http://www.miracleofscience.us/about.php
21:50:08 <oneswig> #link barbecue https://www.redbones.com
21:50:23 <oneswig> I used to live in Boston and happen to have been to all four :-)
21:50:43 <martial> how many are we going to be?
21:50:51 <oneswig> Last time it was 57
21:51:03 <oneswig> I'm guessing the same ±10%
21:51:14 <oneswig> perhaps more for the US location?
21:51:57 <trandles> might depend on who's allowed through the airport :(
21:52:02 <oneswig> Is there a way of voting, perhaps in a couple of weeks?
21:52:15 <oneswig> trandles: oh, how topical
21:54:04 <martial> oneswig: seems reasonable
21:54:19 <oneswig> I'll follow up with all options and see if there's a strong contender. I'll also see if we can get a vendor or two to fund it
21:54:46 <b1airo> The Mellanox OpenStack Scientific-WG
21:54:51 <b1airo> ;-)
21:54:57 <b1airo> C'mon Intel!
21:54:58 <rbudden> has a nice ring to it ;)
21:55:29 <oneswig> Other interconnects are also available :-)
21:55:39 <oneswig> I think Mellanox have thrown down a gauntlet!
21:56:18 <oneswig> OK that's all I had on the evening social for now - will report back
21:56:31 <martial> cool :)
21:56:33 <b1airo> Loving Cumulus at the moment - they helped us debug a Cisco problem recently
21:56:50 <b1airo> Stupid 9Ks throwing away unicast ARP replies
21:57:38 <oneswig> good work to pinpoint that to the switch, unless it happens every time
21:58:32 <oneswig> AOB?
21:59:59 <oneswig> ... nothing to declare from here. I'm scoping a new project - Kolla + Ironic + stuff - might be interesting in the coming months
22:00:16 <rbudden> oneswig: I'm interested in that
22:00:17 <b1airo> Sounds fun oneswig
22:00:19 <martial> still working on our Dmoni and DataScience agnostic VM
22:00:25 <rbudden> been eyeing up the move to Kolla
22:00:25 <jmlowe> ooh stuff, I love stuff
22:00:26 <oneswig> Will keep you posted!
22:00:37 <martial> packer + ansible + heat as the backbone
22:00:39 <rbudden> oneswig: definitely keep me in the loop
22:00:51 <oneswig> time up, alas - until next time
22:00:58 <priteau> I would love to hear more about this too
22:01:12 <simon-AS559> bye!
22:01:14 <b1airo> rbudden: is your Slurm - OpenStack integration in public domain?
22:01:16 <trandles> +1 on kolla
22:01:18 <martial> #endmeeting
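[Editor's aside on the federation thread earlier in the meeting: the "whole mapping thing" jmlowe mentions is Keystone's federation mapping, a JSON ruleset that translates assertions from an external identity provider (for example Globus Auth over OpenID Connect) into local users and groups. A hedged sketch only - the group, domain, and `OIDC-*` claim names below are illustrative (claim prefixes depend on the Apache mod_auth_openidc configuration), and since the federated-user model was being reworked for Ocata, the details vary by release:]

```json
[
  {
    "local": [
      {"user": {"name": "{0}"}},
      {"group": {"name": "federated-users", "domain": {"name": "Default"}}}
    ],
    "remote": [
      {"type": "OIDC-sub"},
      {"type": "OIDC-iss", "any_one_of": ["https://auth.globus.org"]}
    ]
  }
]
```

[A ruleset like this would typically be loaded with `openstack mapping create --rules rules.json oidc_mapping` and then associated with an identity provider and protocol in Keystone.]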