Wednesday, 2021-03-10

oneswig#startmeeting scientific-sig11:00
openstackMeeting started Wed Mar 10 11:00:19 2021 UTC and is due to finish in 60 minutes.  The chair is oneswig. Information about MeetBot at
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.11:00
*** openstack changes topic to " (Meeting topic: scientific-sig)"11:00
openstackThe meeting name has been set to 'scientific_sig'11:00
oneswigHi all11:00
oneswig#link Agenda for today
oneswigWe have a discussion on Jupyter as today's main event11:02
oneswigHi eliaswimmer, thanks for coming along11:03
eliaswimmerI have prepared a few slides about my lessons learned using jupyterhub for lectures11:03
*** b1airo has joined #openstack-meeting11:03
oneswigI need to request access, can you make them world-readable?11:03
oneswigHi b1airo, evening11:04
oneswig#chair b1airo11:04
openstackCurrent chairs: b1airo oneswig11:04
b1airohowdy oneswig11:04
* b1airo yawns11:04
eliaswimmeroneswig: is it working now?11:05
oneswigYes, thanks that works11:05
oneswigI am in11:06
*** gaut_123 has quit IRC11:07
*** belmoreira has joined #openstack-meeting11:08
oneswigeliaswimmer: how much additional work does Zero to JupyterHub require to make into a production service?11:09
eliaswimmerThat really depends on your requirements11:10
eliaswimmerbut it works really well out of the box11:10
eliaswimmera lot of effort went into it11:11
eliaswimmersometimes it is a bit hard to follow the fast release updates11:11
eliaswimmerI spent most time with creating images and getting user creation right11:12
*** dh3 has joined #openstack-meeting11:12
oneswigWhat were the complexities in docker container image creation?11:13
eliaswimmerand of course on the underlying kubernetes setup11:13
verdurinI'm not really here, but Magic Castle includes Jupyterhub.11:13
b1airodoes Z2J already handle idle notebook cleanup?11:13
eliaswimmerb1airo: yes, that works quite well11:14
*** mpryor has joined #openstack-meeting11:14
eliaswimmeroneswig: mainly the diverse needs of our users, that's the place where all the features are set up11:15
dh3sorry for being late. we have a group which runs Jhub notebooks on k8s on Rancher on OpenStack (on turtles...) Their main difficulty was scheduling pods, not knowing if an arriving user was about to run something big or something small, and overcommitting resources.11:15
*** mpryor has quit IRC11:16
b1airowhat does your front-end proxy setup look like, any load or scaling issues with so many clients?11:16
dh3nothing special in our front end as far as I know, standard k8s ingresses11:18
*** mpryor has joined #openstack-meeting11:18
oneswigeliaswimmer: it sounds like your labs scale up and down a lot.  What's the highest scale the deployment has reached in terms of users online?  Is the Kubernetes auto-scaling working well?11:19
*** macz_ has joined #openstack-meeting11:19
eliaswimmerI have to admit, we just throw a lot of hardware onto it, so it was never 100% utilized11:19
mpryorI apologise for my lateness! Where I work we run Jupyter notebooks in two ways - we have JupyterHub running on Rancher Kubernetes which we manage for our users. We also run a system we have called "Cluster-as-a-Service" which can dynamically deploy Pangeo-based JupyterHub instances on our OpenStack cloud.11:19
eliaswimmeroneswig: autoscaling was never needed and tested so far11:20
oneswigah ok11:20
eliaswimmerI set limits for each lecture separately, which is easier for lectures where you know your requirements11:21
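A minimal sketch of the per-lecture limit approach described above, assuming each lecture has a known student count and fixed per-user CPU and memory limits. The lecture names, the numbers, and the `LectureLimits` helper are hypothetical and not taken from the discussion; the point is only that fixed per-lecture limits turn worst-case capacity planning into a simple sum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LectureLimits:
    """Hypothetical per-lecture resource limits (not from the meeting)."""
    students: int
    cpu_per_user: float      # CPU cores per notebook server
    mem_gib_per_user: float  # GiB of RAM per notebook server


def worst_case_demand(lectures):
    """Sum CPU and memory assuming every student of every lecture is online."""
    cpu = sum(l.students * l.cpu_per_user for l in lectures.values())
    mem = sum(l.students * l.mem_gib_per_user for l in lectures.values())
    return cpu, mem


# Illustrative values only.
lectures = {
    "intro-python": LectureLimits(students=150, cpu_per_user=0.5, mem_gib_per_user=1.0),
    "data-analysis": LectureLimits(students=40, cpu_per_user=2.0, mem_gib_per_user=4.0),
}

cpu, mem = worst_case_demand(lectures)
print(f"worst case: {cpu:g} cores, {mem:g} GiB")
```

Comparing these totals against the cluster's allocatable resources shows whether a set of lectures fits without relying on autoscaling.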
eliaswimmerDoes anyone use GPUs already in their setups?11:22
oneswigeliaswimmer: how is the storage interface working?11:22
eliaswimmeroneswig: right now I use CephFS via CSI driver directly for homes and shares11:23
oneswigWe have used GPUs with K8S in a couple of deploys11:23
*** macz_ has quit IRC11:24
eliaswimmerHow is it with utilization? I wonder how to do AI lectures with 150 students, I would need 150 GPUs for them!11:24
*** udesale_ has joined #openstack-meeting11:24
eliaswimmerThat is why I am looking into an additional KubeFlow setup for a GPU cluster11:25
mpryorHow many people that are running JupyterHub have Dask enabled? We have found that users like having that functionality.11:26
b1airoCould split them under k8s - MIG or vGPU11:26
eliaswimmerb1airo: therefore I would need tesla grade GPUs11:27
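To put rough numbers on the GPU question, here is a sketch of how many physical GPUs 150 students would need if each GPU is split into equal slices, as suggested above with MIG or vGPU. The figure of 7 slices per GPU is an assumption (the maximum number of MIG instances on an NVIDIA A100); the real split depends on the GPU model and on how much memory each student needs.

```python
import math


def gpus_needed(students: int, slices_per_gpu: int) -> int:
    """Physical GPUs required when each GPU is shared by `slices_per_gpu` users."""
    if slices_per_gpu < 1:
        raise ValueError("slices_per_gpu must be at least 1")
    return math.ceil(students / slices_per_gpu)


whole = gpus_needed(150, 1)   # one dedicated GPU per student
sliced = gpus_needed(150, 7)  # assumed 7 slices per GPU
print(whole, sliced)
```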
b1airoYep, our HPC integrated JHub has Dask built-in11:27
*** udesale has quit IRC11:28
eliaswimmerAnyone using SLURM spawner?11:28
verdurinThat's how the Magic Castle implementation works.11:28
b1airotrue eliaswimmer, though if they are already in an OpenStack cluster I'm assuming they're in server machines, in a data centre somewhere, so...11:28
b1airoyes, we're using Slurmspawner11:29
eliaswimmerb1airo: Do you have extra partitions for JupyterHub with shared nodes?11:30
dh3the system here has Dask (not sure how many people use it). not using SLURM (there is LSF elsewhere for those who want it)11:32
*** mpryor has quit IRC11:32
b1airoActually our spawner is kind of a mashup, as the Hub machine is not allowed the Slurm keys directly, so once a user authenticates (2fa via a custom PAM authenticator) we create a Kerberos credential that allows them temporary ssh access to a login node, so our version of batchspawner does things via ssh11:34
*** mpryor has joined #openstack-meeting11:36
oneswigeliaswimmer: what do you do to provide user data into Jupyter environments?11:36
b1airore partitions, we were just filling up space on gpu nodes to start with, but now have a dedicated interactive partition for jupyter and other modest jobs. partly because we had some requests for teaching postgrad labs on the environment11:37
eliaswimmeroneswig: that's a good question, upload and download capabilities are quite limited, so for lectures with huge data sets I provided the lecturers with an extra share server.11:38
mpryoreliaswimmer oneswig We have found that, especially when using Dask, it makes sense to have any large datasets in an object store.11:39
eliaswimmerb1airo: we do the same, as our clusters are getting smaller and smaller in terms of nodes11:39
mpryorFor our managed notebook service, even though the notebook servers are running in Kubernetes we actually mount home directories and shared filesystems, so the environment they see is much like they see if they SSH to our traditional batch platform.11:41
eliaswimmermpryor: Oh that sounds interesting, are you using a plugin for Jupyter to provide a view on the object store?11:41
mpryoreliaswimmer The community that we operate in (Earth Sciences) has built tools around a technology called Zarr that makes using the object store more or less transparent.11:42
eliaswimmermpryor: opencube?11:42
mpryoreliaswimmer I think we have some people using datacube-like technologies. However the most common software stack seems to be data on object store, accessed using Dask, XArray and Zarr. Data catalogs are provided by a tool called Intake.11:43
mpryorThis is basically the Pangeo stack -
mpryorThe Pangeo community also maintain a data catalog for CMIP6 -
*** dh3 has quit IRC11:45
*** yasufum has quit IRC11:45
eliaswimmermpryor: thank you! Our geoscientists are very interested in our setups, so that is a good starting point for me11:46
*** dh3 has joined #openstack-meeting11:46
mpryoreliaswimmer Pangeo is our standard setup that we provide via our Cluster-as-a-Service system.11:47
mpryorMostly it is oceanographers using it at the moment, but we have had interest from other groups including geo-type stuff.11:48
eliaswimmerdoes anyone use gpfs with manila and kubernetes?11:48
oneswigeliaswimmer: are you deploying all the k8s and jupyterhub environments for users or do they self-service somehow?11:48
eliaswimmeroneswig: right now I do everything on my own, but the plan is to eventually have a self-service platform with a service catalog11:49
eliaswimmerAnyone using JupyterHubs for teaching?11:52
mpryoreliaswimmer A few of our tenants have used the self-service hubs we offer via CaaS for workshops and teaching.11:52
mpryorThey are able to onboard all their own users, so we often don't find out about it though.11:53
eliaswimmerWe are planning to improve grading services (nbgrader and ngshare) a lot over summer, we will open source our code when ready11:53
eliaswimmermpryor: how do they manage authentication?11:54
verdurinMagic Castle was originally created for teaching at ComputeCanada, so I'm pretty sure they use the Jupyter part for that. It sets up IPA for auth.11:56
mpryoreliaswimmer One of the other cluster types we offer as part of our Cluster-as-a-Service is a central identity manager, which all other clusters connect to.11:56
verdurinI meant to ask if they wanted to contribute to this meeting, but forgot, and it's a bit early for Canada.11:56
oneswigI think it's a significant enough use case that a follow-on is warranted11:57
b1airoComputational notebooks for teaching.. some interesting pedagogical arguments around that I've come across, feels like a lost battle though11:57
mpryoreliaswimmer verdurin Our identity manager is FreeIPA + Keycloak. Keycloak is only there in a readonly capacity to provide OpenID Connect. Our Pangeo (JupyterHub) instances authenticate using the LDAP that you get from FreeIPA.11:59
oneswigah, we are out of time12:00
mpryorWe could have also used OpenID Connect, but OIDC is not fully supported yet by the JupyterHub OAuthenticator.12:00
oneswigfinal comments please12:00
b1airointeresting discussion - thanks all12:00
eliaswimmermpryor: keycloak is a great tool, I use it for our SLURM based setup, wrote a little extension for our 2fa auth12:00
b1airoseems everyone is doing Jupyter these days12:00
*** macz_ has joined #openstack-meeting12:00
eliaswimmerI think there is a lot to share, maybe we can set up an etherpad?12:01
oneswigThanks eliaswimmer and all, useful discussion12:01
b1airorelated, am keen to talk to anyone using OpenOnDemand12:01
*** mpryor has quit IRC12:01
oneswigeliaswimmer: some follow-up is definitely needed.12:01
oneswigOK, have to close the session.  Thanks all12:01
*** openstack changes topic to "OpenStack Meetings ||"12:01
openstackMeeting ended Wed Mar 10 12:01:52 2021 UTC.  Information about MeetBot at . (v 0.1.4)12:01
openstackMinutes (text):
liuyulong#startmeeting neutron_l314:00
openstackMeeting started Wed Mar 10 14:00:11 2021 UTC and is due to finish in 60 minutes.  The chair is liuyulong. Information about MeetBot at
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.14:00
*** openstack changes topic to " (Meeting topic: neutron_l3)"14:00
openstackThe meeting name has been set to 'neutron_l3'14:00
liuyulongHi there14:00
*** ociuhandu has quit IRC14:01
*** ociuhandu has joined #openstack-meeting14:01
liuyulongOK, looks like a one person meeting14:01
liuyulongLet's scan the bugs quickly.14:02
liuyulong#topic Bugs14:02
*** openstack changes topic to "Bugs (Meeting topic: neutron_l3)"14:02
liuyulongTwo lists from our bug deputy.14:02
liuyulongFirst one is related to DVR14:03
openstackLaunchpad bug 1916761 in neutron "[dvr] bound port permanent arp entries never deleted" [High,In progress] - Assigned to LIU Yulong (dragon889)14:03
liuyulongWe revert the code deletion of DVR arp control code.14:03
liuyulongThat DVR arp control code removal is for this change
liuyulongBut as we discussed in gerrit, another approach, the same as l2pop, should be introduced to handle such "ARP responder for allowed_address_pair IP"14:05
haleybhi, sorry i'm late, conflict14:05
liuyulongAnd this one is also related to 601336.14:06
liuyulongIf 601336 is not going to be accepted, maybe we should revert this as well.14:06
liuyulongI have created one:
liuyulongBut there are many conflicts.14:06
liuyulonghaleyb, hi14:07
haleybyes, some of those dvr changes are not a simple revert unfortunately14:08
*** jamesden_ is now known as jamesdenton14:09
liuyulongYes, I will read the git log again. My idea is to remove those changes.14:09
liuyulongHope you guys can also take a look at these changes. : )14:10
liuyulongNext one14:10
openstackLaunchpad bug 1917393 in neutron "[L3][Port forwarding] admin state DOWN/UP router will lose all pf-floating-ips and nat rules" [High,Fix released] - Assigned to LIU Yulong (dragon889)14:10
liuyulongA quick fix:
liuyulongThis should be backported to stable branches. I will do that.14:11
openstackLaunchpad bug 1752903 in neutron "Floating IPs should not allocate IPv6 addresses" [Medium,Won't fix]14:11
liuyulongI changed this to "wont fix" after my comments:
liuyulong"Add service type to the subnets of public network" can fix the problem ! Mission complete!14:12
*** b1airo has quit IRC14:12
*** jrollen is now known as jroll14:13
liuyulongLast one:14:13
openstackLaunchpad bug 1917409 in neutron "neutron-l3-agents won't become active" [High,New]14:13
liuyulongThere are gerrit changes uploaded.14:14
haleybi finally figured out there were two related changes there, i was confused14:15
liuyulongThe fix is simple, but we have no idea about why the port can be None from the call trace.14:16
liuyulonghaleyb, yes, I have left some comments to that.14:16
liuyulongThe author should merge these two changes to one.14:16
haleybliuyulong: right, and i also don't like just catching the error there if it's really the caller at fault14:17
haleybi'm guessing the port is '' or something14:17
haleybi put a comment14:19
liuyulongOK, I see the bug...14:19
liuyulongafter reading the report's l3 log, it is related to agent gateway port creation.14:20
liuyulongWe fixed that here14:20
openstackLaunchpad bug 1883089 in neutron "[L3] floating IP failed to bind due to no agent gateway port(fip-ns)" [Medium,Fix released] - Assigned to LIU Yulong (dragon889)14:21
liuyulongSame error as this bug.14:21
liuyulongOK, mission complete!14:21
*** lyarwood has joined #openstack-meeting14:22
liuyulongMark duplicate of bug #1883089.14:22
openstackbug 1883089 in neutron "[L3] floating IP failed to bind due to no agent gateway port(fip-ns)" [Medium,Fix released] - Assigned to LIU Yulong (dragon889)14:22
liuyulongOK, no more bugs from me now.14:22
*** lyarwood has left #openstack-meeting14:23
liuyulongOK, let's move on.14:24
liuyulong#topic On demand agenda14:24
*** openstack changes topic to "On demand agenda (Meeting topic: neutron_l3)"14:24
liuyulongAgain for  bp/distributed-dhcp-for-ml2-ovs:
liuyulongIt's pretty close now, reviews are welcomed.14:26
haleybyes, i have it on my list, just have a busy meeting morning14:26
liuyulongThe fullstack test case works fine now:
liuyulonghaleyb, thank you, your comments are all addressed. : )14:28
liuyulongOK, let's end the meeting here.14:34
*** openstack changes topic to "OpenStack Meetings ||"14:34
openstackMeeting ended Wed Mar 10 14:34:37 2021 UTC.  Information about MeetBot at . (v 0.1.4)14:34
openstackMinutes (text):
timburke__#startmeeting swift21:00
openstackMeeting started Wed Mar 10 21:00:50 2021 UTC and is due to finish in 60 minutes.  The chair is timburke__. Information about MeetBot at
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.21:00
*** openstack changes topic to " (Meeting topic: swift)"21:00
openstackThe meeting name has been set to 'swift'21:00
timburke__who's here for the swift meeting?21:01
rledisezhi o/21:01
timburke__as usual, the agenda's at
timburke__first up21:03
timburke__#topic next PTG21:03
*** openstack changes topic to "next PTG (Meeting topic: swift)"21:03
timburke__i probably should have brought this up last week, but i've gotten behind on the ML21:03
timburke__it's going to be another virtual PTG, April 19-2321:04
timburke__register (for free) at
timburke__i'll get a doodle poll up for next week to pick meeting times21:05
acolesI already have my ticket :)21:05
timburke__me too21:05
timburke__we'll see if we can get notmyname to drop in again :-)21:05
mattoliverauOh I should go get mine.. I assume my employer will let me go21:06
timburke__heh "go"21:06
timburke__but yeah, pretty sure your boss will be on board ;-)21:06
timburke__next up21:07
timburke__#topic elections21:07
*** openstack changes topic to "elections (Meeting topic: swift)"21:07
timburke__(again, probably should have brought this up last week)21:07
timburke__the nomination period is over!21:07
*** jmasud has quit IRC21:08
timburke__looks like there are currently 4 candidates for 5 slots on the TC, and no actual elections will need to be held :-/21:08
timburke__i don't really have much else to say about it, but wanted to keep people informed21:10
*** dklyle has quit IRC21:10
timburke__so on to Swift work!21:10
timburke__#topic CORS and s3api21:10
*** openstack changes topic to "CORS and s3api (Meeting topic: swift)"21:10
timburke__now that the CORS test suite landed, i'd like to keep pushing on the actual behavioral changes i'd like21:11
timburke__specifically, (which has s3api translate more cors-related headers)21:11
acolesoh yes, I got distracted from that, I'll try to re-review21:12
timburke__and (which allows preflight OPTIONS requests)21:12
*** ociuhandu has joined #openstack-meeting21:12
acolesI've noticed the zuul cors test has failed on a few patches :(21:12
timburke__thanks! fwiw, it looks like there's at least a couple more changes i ought to make to the gate job given the failure on that last patch, but i'm fairly certain tests will pass with the recheck21:13
timburke__acoles, did they seem to be timeouts during setup? that's what i saw on one, anyway...21:13
acolesTBH I didn't dig into the failure too much21:13
timburke__(though it got a little buried by a post-failure when we tried to copy some non-existent outputs)21:14
timburke__i'll be sure to dig in for next week21:14
acolesother than they weren't 'real' failures - the tests didn't seem to have run21:14
acolesok, or maybe they ran but the output wasn't posted. IDK21:15
timburke__as long as the tests actually run, i expect we won't see a post failure, fwiw21:15
timburke__next up21:16
timburke__#topic relinker21:16
*** openstack changes topic to "relinker (Meeting topic: swift)"21:16
timburke__i know we've got a few different threads we're pulling at now21:16
timburke__mattoliverau has a few patches for improved logging21:16
timburke__acoles has been working on some refactoring that should make cleanup behave a lot more like relinking (and they both should get a lot more tolerant of things like reapable tombstones)21:17
mattoliveraubut these will need to be rebased on top of acoles' changes once they stabilise. But they mostly affect the post_partition hook for the moment, so mostly separate.21:18
timburke__do we have a preference for order of landing?21:18
acolestoday I wrote --policy and --partition option patches21:18
mattoliverauoh cool, I'll check em out.21:19
acolesI'd really appreciate this one merging (subject to review of course)
timburke__nice, i should do some reviews :-)21:19
acolesp 778530 is blocking me doing some stuff on the refactor chain that I'd like21:20
timburke__...also, i should dust off the priority reviews wiki21:20
*** belmoreira has quit IRC21:20
timburke__acoles, i'll try to make sure it's got a review by the time you wake up21:21
mattoliverauwith the amount of churn al's code does on the main functions, it'll be easier to land that and then mine. Because mine will just need to be plugged into the partials for the post_part hook.21:21
timburke__sounds like a plan21:21
seongsoochoThere is no timestamp in the relinker log in my env. How can I add a timestamp to the log?21:21
acolesmattoliverau: yours and mine may not conflict, but IIRC some tests were failing on your logging patch?21:22
mattoliverauyeah, I'll poke it, I reworked it a bit yesterday and pushed stuff up at the end of my day.. I should've run tests locally first. But will check it out today.21:23
acolesmattoliverau: sorry, it looks like it may have just been the cors test failing21:23
mattoliverau\o/ (well kinda yay)21:23
timburke__seongsoocho, that sounds like a good logging enhancement to make. fwiw, it should "just" be a matter of updating format='%(message)s' in relinker.py21:24
timburke__would you mind writing up a bug (or even a patch!) for it, so we won't forget about it?21:24
seongsoochotimburke__: Ok, I will write up a bug and also write a patch!21:25
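The format change mentioned above can be sketched with Python's standard logging module: adding %(asctime)s to the format string prefixes every record with a timestamp. This is an illustration only, not the actual relinker.py code; the logger name and the log message are made up.

```python
import io
import logging

# Capture output in memory so the resulting line can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)

# '%(message)s' alone prints only the text; '%(asctime)s' adds the timestamp.
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))

logger = logging.getLogger('relinker-demo')  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

logger.info('relinked 3 files')
line = stream.getvalue()
print(line, end='')
```

The same format string can be passed as logging.basicConfig(format=...) when the logger is configured that way.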
acolesmattoliverau: so don't wait on me, if it is ready let's try to land it then I'll take care of any conflict (now I looked harder there will be some but nothing too tricky)21:26
mattoliveraunah it isn't. Like I said it should mostly be passing some extra params to the post_part hook partial call. Which you've just centralised, so actually makes it easier :)21:27
acolesand TBH I'd rather we merged smaller changes rather than rush through review of the bigger refactor21:28
acolesbut yeah, the partial call does get easier :)21:28
timburke__ok, one last topic21:29
timburke__#topic releases21:29
*** openstack changes topic to "releases (Meeting topic: swift)"21:29
mattoliverauTrue. We'll put them in "order" on the priorities page then. Doesn't matter which order, but if you don't mind a conflict then happy to get mine in first, I just want to finish the 2nd recon one (needs some tests)21:29
timburke__still need to get a swift release out -- anyone think they'll have a chance to proofread the changelog at ? make sure i'm not forgetting anything?21:30
mattoliverauI'll take a look today21:30
timburke__and separately, is anyone aware of any reason we should hold off on getting a release out? any critical bugs?21:30
acolesI'll take a look tomorrow21:30
mattoliveraumaybe relinker improvements?21:31
mattoliveraubut so long as all the races are fixed and landed then that's good21:31
timburke__i guess my main thought on the relinker is whether we've got anything that's demonstrably worse than it was in 2.26.021:32
acolestimburke__: maybe this ?21:32
mattoliverauBut who relinks very often, that could wait until next release :)21:32
timburke__i'll take a look21:33
timburke__that's all i've got21:33
timburke__#topic open discussion21:33
*** openstack changes topic to "open discussion (Meeting topic: swift)"21:33
timburke__anything else we ought to discuss this week?21:33
acolesjust an FYI that mattoliverau and I have some patches to improve sharder & swift-manage-shard-ranges config handling, and we're proposing to change a couple of the defaults that relate to shrinking21:35
mattoliverauoh yeah! good call to mention it :)21:36
acolesunless you use auto-sharding it should not have any impact21:36
acolesor 'swift-manage-shard-ranges compact'21:37
timburke__all right21:41
timburke__sounds like we've got a pretty good idea of what we want to look at next21:42
*** dklyle has joined #openstack-meeting21:42
timburke__thank you all for coming, and thank you for working on swift!21:42
*** openstack changes topic to "OpenStack Meetings ||"21:42
openstackMeeting ended Wed Mar 10 21:42:25 2021 UTC.  Information about MeetBot at . (v 0.1.4)21:42
openstackMinutes (text):
