21:01:25 <martial_> #startmeeting scientific-sig
21:01:26 <openstack> Meeting started Tue Mar  3 21:01:25 2020 UTC and is due to finish in 60 minutes.  The chair is martial_. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:01:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:01:29 <openstack> The meeting name has been set to 'scientific_sig'
21:01:33 <martial_> #chair oneswig
21:01:34 <openstack> Current chairs: martial_ oneswig
21:02:06 <martial_> Good day/evening/morning everybody and welcome to the 2020 March 3rd edition of the famous Scientific SIG weekly IRC meeting
21:02:08 <jonmills_nasa> gee...that openstack-discuss mailing list has been kinda wild lately, huh?
21:02:33 <oneswig> not been following it recently - any specific threads?
21:03:08 <jonmills_nasa> well there's the debacle over the glance team rejecting the OSC client, in favor of python-glanceclient
21:03:22 <rbudden> hello
21:03:25 <oneswig> I saw that go by
21:03:51 <oneswig> why oh why?
21:04:01 <jonmills_nasa> multiple governance questions from Mohammed Naser
21:04:08 <janders> g'day all
21:04:28 <jonmills_nasa> the thing about OpenDev splitting off from OpenStack?
21:04:41 <jonmills_nasa> just curious what folks think about stuff like that
21:05:45 <martial_> The OpenDev part, I think is not crazy
21:06:08 <martial_> they had one during the past Summit (was it in Vancouver last year?)
21:06:14 <martial_> no, Denver
21:07:30 <martial_> will see what happens with the Open Infra shift
21:08:19 <martial_> #topic UKRI cloud workshop
21:08:27 <martial_> Stig?
21:08:38 <oneswig> jonmills_nasa: that's quite a long thread... sorry was distracted
21:09:01 <oneswig> I've been up to London, for https://cloud.ac.uk
21:09:06 <jonmills_nasa> yeah no worries, and i don't want to distract us if there's specific business.
21:09:33 <oneswig> It's an annual 1-day conference for the UK research sector on cloud compute.
21:10:19 <oneswig> In recent years it has been an interesting bellwether for public vs private cloud (for example)
21:11:05 <jonmills_nasa> so how's it looking for private cloud?
21:11:11 <oneswig> Over recent years the presentations have shown maturing content, in terms of the form cloud adoption is taking
21:11:45 <oneswig> Well, I don't know if there's an unbiased estimate of that but the presentation content covered a lot more private cloud than last time.
21:12:17 <oneswig> Some promising presentations from a year ago on managing costs in public cloud had no follow-up this year, which was a pity.
21:12:46 <oneswig> At the end, someone summed up along the lines of "I can't believe after all these years it's still a question of someone getting their credit card out"
21:13:10 <jonmills_nasa> ha
21:13:16 <martial_> that last one seems strange ... of course compute requires money to run
21:13:30 <oneswig> Some good presentations from CERN, Sanger, Vscaler/Genomics England - and our gang at StackHPC
21:13:45 <oneswig> martial_: more about the means than the money
21:14:29 <oneswig> It was also observed that in the whole schedule, there were zero public cloud user stories involving significant data.
21:14:30 <martial_> 🙌
21:14:56 <oneswig> (as in volumes of data)
21:15:15 <jonmills_nasa> not even archival data?  what's that arctic service they have....
21:15:19 <martial_> agreed, "volumes" of data differs from user to user
21:15:29 <martial_> we have a 100TB use case but that is small
21:15:36 <oneswig> The hybrid story and the auto-cluster-deployment story - both looking strong
21:16:19 <oneswig> jonmills_nasa: I think the access times for archival data make it only suitable for archive...
21:16:27 <jonmills_nasa> i've never understood the hybrid story, when storage is involved.  any efficiency gained in compute seems like it would be lost in data transfers
21:16:52 <oneswig> I spoke in a session on controlled-access data - life sciences etc.  That got a good discussion going.
21:17:17 <oneswig> jonmills_nasa: you're right, it only works if there's lots of compute to not very much data.
21:17:50 <martial_> well there is a possible hybrid conversation: you ETL the data on the large cluster, push the dataset to be worked on to the public cloud, and then use specific compute components as needed to get the result
21:18:29 <martial_> it is a fairly common exercise in data optimization
21:18:49 <martial_> trick is that it requires some understanding of how to get to the result
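A minimal sketch of the hybrid pattern martial_ describes, assuming a pandas-based ETL step on the local cluster and boto3 for the object-storage push; the paths, bucket name, and column names are illustrative, not from the discussion:

```python
# Hypothetical sketch: reduce the dataset on the local cluster, then push
# only the working set to public-cloud object storage for burst compute.
import boto3
import pandas as pd

# ETL step on the large local cluster: filter the raw data down to the
# rows/columns the cloud job actually needs (names here are illustrative).
raw = pd.read_parquet("/scratch/project/raw_events.parquet")
working_set = raw[raw["quality_flag"] == 0][["timestamp", "value"]]
working_set.to_parquet("/scratch/project/working_set.parquet")

# Push only the reduced dataset, not the raw volume, to object storage.
s3 = boto3.client("s3")  # assumes credentials are already configured
s3.upload_file("/scratch/project/working_set.parquet",
               "my-burst-bucket", "inputs/working_set.parquet")
# Cloud instances then read s3://my-burst-bucket/inputs/... and run
# whatever specific compute components the result requires.
```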
21:18:53 <oneswig> John, Pierre and Steve from our firm gave a highly compressed overview of autoscaling, reservations and preemption - in 15 minutes.
21:19:17 <martial_> that is cool:)
21:20:32 <oneswig> collectively, we termed the fight against resource contention "the coral reef cloud"
21:21:56 <martial_> pretty but fragile?
21:22:18 <oneswig> The analogy was the fight for space on the sea bed.
21:22:42 <oneswig> Or perhaps "keep your friends close, and anemones closer"? :-)
21:22:54 <jonmills_nasa> ugh
21:23:01 <oneswig> sorry
21:23:55 <oneswig> This might be useful for a future hands-on conference workshop: https://www.instancehub.com/
21:24:12 <oneswig> Currently an academic's side project
21:24:45 <jonmills_nasa> SSL error, won't load in chrome
21:25:01 <oneswig> funny, you're the second person to say that, seems to work for most
21:25:13 <martial_> weird, works for me
21:25:36 <jonmills_nasa> maybe it's my gov computer rejecting it
21:25:51 <martial_> wish OpenStack passport would offer a similar service
21:26:35 <oneswig> martial_: that would indeed be cool.
21:27:16 <martial_> oh and do not forget to write an abstract for SuperComp Cloud for ISC :)
21:27:22 <oneswig> There were also some interesting benchmarks on Oracle bare metal cloud
21:27:42 <oneswig> martial_: thanks for the reminder!
21:27:57 <martial_> (saw Mike join, just posting everywhere :) )
21:28:38 <jmlowe_> excellent!
21:29:04 <jmlowe_> 10 min late, 20 min fiddling with my irc client
21:29:06 <jonmills_nasa> all federal travel is in jeopardy at this moment FYI (due to coronavirus)
21:29:24 <jmlowe_> this makes me very unhappy
21:29:37 <martial_> not entirely surprised that said
21:29:42 <jmlowe_> although it does seem to have increased the availability of upgrades
21:30:08 <janders> haha
21:31:19 <oneswig> Definitely a move not to shake hands at today's event.  (I only forgot a couple of times)
21:32:15 <jonmills_nasa> and, assuming rbudden and I would be allowed to travel to Vancouver in June for OpenDev....we don't even know if that's the correct conference to attend, or if it should be the Summit in Berlin in Oct (which may be yet harder to attend)
21:33:23 <rbudden> yeah, travel at the moment is interesting… even just heading on-base next week
21:33:27 <martial_> Vancouver is PTG only, no presentation
21:33:29 <oneswig> The way I see it, OpenDev could be a good opportunity to share knowledge from one deployment to another in a hands-on way.
21:33:42 <rbudden> but as Jonathan mentioned we are actively looking at whether the OpenDev event or the Berlin Summit is the better option
21:33:59 <martial_> the "Summit" is Berlin
21:34:03 <jmlowe_> I'm leaning towards Berlin these days
21:35:18 <jonmills_nasa> Berlin, being in Oct, may hopefully be far enough away in time from this coronavirus mess, but it does fall in a new fiscal year, which is also messy for approvals.  but yeah
21:35:52 <martial_> 🤷
21:36:40 <martial_> Stig anything else on the UKRI topic?
21:36:52 <oneswig> I don't think so thanks martial_
21:37:14 <martial_> we are already well into AOB :) so let's make it official after this
21:37:23 <martial_> #topic SuperComp Cloud at ISC
21:37:28 <martial_> Mike?
21:37:39 <jmlowe_> um, yes, we need submissions
21:38:18 <jmlowe_> https://sites.google.com/view/supercompcloud
21:38:46 <martial_> 2nd Workshop on Interoperability of Supercomputing and Cloud Technologies
21:39:12 <jmlowe_> Extended Abstract Submission Deadline: March 20, 2020
21:39:19 <oneswig> The "interoperability" part was only lightly covered last time... are there specific themes?
21:39:50 <jmlowe_> There are 15 of them listed on the website
21:39:58 <jmlowe_> can copy and paste if desired
21:40:39 <oneswig> ah, just saw that.
21:40:39 <jmlowe_> It's quite broad, covers most anything that touches on cloudy and HPC type things
21:41:08 <jmlowe_> right down to workforce development for these types of environments
21:41:47 <martial_> Working on an abstract on our end
21:42:07 <jmlowe_> the program committee is, let's say, friendly
21:42:25 <martial_> and not for the topic everybody was waiting for :)
21:42:42 <martial_> #topic Spanish Inquisition
21:42:46 <martial_> oops
21:42:52 <jmlowe_> I was not expecting that
21:42:56 <martial_> #topic AOB
21:43:34 <martial_> got a few minutes left, it looks like
21:45:00 <oneswig> Nobody expects the spanish inquisition! :-)
21:45:45 <jonmills_nasa> @one
21:45:55 <martial_> I see a couple people still awake :)
21:45:57 <jonmills_nasa> oneswig we continue playing with monasca
21:46:27 <oneswig> jonmills_nasa: daveholland from the Sanger was asking recently about CloudKitty
21:46:42 <jonmills_nasa> i'd say my general impression is that it feels more substantial than ceilometer.  i kinda dig it
21:47:21 <oneswig> Good to hear it.  We find there's quite a bit of config you need to apply to make it useful - alarms etc.
21:47:47 <jonmills_nasa> so just got monasca-agent working with multi-tenant support (there's docs for it....kinda buried).  hopefully can replace ceilometer-compute agents
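For context, a hedged sketch of what a cross-tenant metrics query against the Monasca v2.0 REST API can look like; whether the tenant_id parameter is honoured depends on the caller holding a delegate role in the deployment, and the endpoint, token, and project ID below are all placeholders:

```python
# Hedged sketch: query Monasca's v2.0 API for another project's metrics.
# Requires the caller to have a cross-tenant (delegate) role; endpoint
# and token here are placeholders, not real values.
import requests

MONASCA_API = "https://monasca.example.com:8070"   # placeholder endpoint
TOKEN = "gAAAAAB..."                               # a valid Keystone token

resp = requests.get(
    f"{MONASCA_API}/v2.0/metrics",
    headers={"X-Auth-Token": TOKEN},
    params={"tenant_id": "a1b2c3d4e5"},            # project being queried
)
resp.raise_for_status()
for metric in resp.json()["elements"]:
    print(metric["name"], metric["dimensions"])
```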
21:47:49 <oneswig> Trying to gather more helpful dashboards than the standard
21:48:34 <oneswig> jonmills_nasa: that just reminded me I have a monasca-agent patch up for monitoring unmounted disks (very helpful for Bluestore Ceph, for example)
21:48:51 <oneswig> just need to ... document and unit-test
21:49:58 <jonmills_nasa> while monasca-agent can report on disk size (cinder volume size) for billing purposes, what it can't report on is the cinder volume_type, which makes it hard to bill more for faster disk vs slower or standard disk
21:50:26 <jonmills_nasa> so i may still use the cinder-volume-audit cron script to send data through ceilosca to monasca
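A minimal sketch of the kind of periodic audit being described, pulling volume_type per volume straight from the Cinder API with python-cinderclient; the credentials and auth URL are placeholders, and the all_tenants listing assumes an admin role:

```python
# Minimal sketch: list all tenants' volumes with their volume_type so a
# billing job can rate fast vs standard disk. Credentials are placeholders.
from keystoneauth1.identity import v3
from keystoneauth1 import session
from cinderclient import client as cinder_client

auth = v3.Password(auth_url="https://keystone.example.com:5000/v3",
                   username="admin", password="secret",
                   project_name="admin",
                   user_domain_id="default", project_domain_id="default")
sess = session.Session(auth=auth)
cinder = cinder_client.Client("3", session=sess)

# all_tenants=1 needs an admin role; each volume carries its type and size.
for vol in cinder.volumes.list(search_opts={"all_tenants": 1}):
    tenant = getattr(vol, "os-vol-tenant-attr:tenant_id", "unknown")
    print(tenant, vol.id, vol.volume_type, vol.size)
```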
21:51:00 <jmlowe__> How's the data usage of monasca, looked like they were making all the same mistakes ceilometer made in the bad old days, last time I checked
21:51:32 <jmlowe__> I remember a Mirantis writeup where a 100 node cluster would produce 6TB of ceilometer data per day
21:51:33 <jonmills_nasa> umm...i just set an influxdb retention period and don't worry about it
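A sketch of that retention setting using the influxdb Python client; the host, database, and policy names are illustrative:

```python
# Cap how long Monasca measurements are kept so the database can't grow
# without bound. Host and database names are illustrative.
from influxdb import InfluxDBClient

client = InfluxDBClient(host="localhost", port=8086, database="mon")
client.create_retention_policy(
    name="mon_90d",
    duration="90d",      # expire measurements after 90 days
    replication="1",
    database="mon",
    default=True,        # apply to writes that don't name a policy
)
```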
21:51:41 <oneswig> jonmills_nasa: unless you can get a hint from (eg) lsblk -o '+...', that kind of infrastructure data would have to come from outside the agent's scope.
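A sketch of the kind of lsblk-based check being alluded to, flagging whole disks with no mounted filesystem anywhere on them (as BlueStore OSD devices typically appear); lsblk's --json flag needs util-linux 2.27+, and which unmounted disks are actually expected is deployment-specific:

```python
# Sketch: use lsblk's JSON output to find whole disks with no mounted
# filesystem on them or their children (e.g. BlueStore OSDs look like this).
import json
import subprocess

out = subprocess.check_output(
    ["lsblk", "--json", "-o", "NAME,TYPE,MOUNTPOINT"])
tree = json.loads(out)

def has_mount(dev):
    # A device counts as "mounted" if it or any child has a mountpoint.
    if dev.get("mountpoint"):
        return True
    return any(has_mount(c) for c in dev.get("children", []))

for dev in tree["blockdevices"]:
    if dev["type"] == "disk" and not has_mount(dev):
        print(f"unmounted disk: /dev/{dev['name']}")
```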
21:52:15 <jmlowe__> ah, ok, they are putting the measurements in influx and not mongodb
21:52:24 <oneswig> jmlowe__: have a blog post in the wings on writing to Influxdb databases per tenancy, which helps a lot with that.
21:52:36 <jonmills_nasa> jmlowe__ correct, and I am also turning off unnecessary metrics
21:52:45 <jonmills_nasa> i.e. metrics I'll never bill on
21:53:15 <jmlowe__> that could be fun, billing only on unusual metrics
21:53:44 <oneswig> billing according to PID for example?
21:54:08 <jmlowe__> compute is free, $100/per console session
21:54:09 <jonmills_nasa> or billing for dropped tx packets on your tun device
21:55:01 <jmlowe__> $1k per security group change
21:55:09 <martial_> (5 minutes warning)
21:55:46 <jonmills_nasa> i'm done
21:55:55 <oneswig> I knew there was something
21:56:18 <oneswig> #link metrics exported as an Oslo module https://review.opendev.org/#/c/704733/
21:57:00 <oneswig> we get similar data in "black box" fashion from HAProxy but this has the potential to be insightful on cloud scaling and performance issues.
21:58:16 <oneswig> From the Large Scale SIG, there was a request for "scaling stories" to be added here - https://etherpad.openstack.org/p/scaling-stories
21:58:48 <martial_> with that, we are reaching the end of our time
21:59:11 <oneswig> What they are looking for is what broke, how it was observed, and how to fix/avoid it.
21:59:19 <oneswig> Thanks all
21:59:32 <martial_> Thanks everybody
21:59:36 <martial_> #endmeeting