#openstack-meeting log

21:00:22 <oneswig> #startmeeting scientific_sig
21:00:22 <opendevmeet> Meeting started Tue Jun 22 21:00:22 2021 UTC and is due to finish in 60 minutes.  The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:22 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:22 <opendevmeet> The meeting name has been set to 'scientific_sig'
21:00:37 * oneswig double-checks the spelling as usual
21:00:44 <julianp> Heh.
21:00:53 <oneswig> greetings julianp how are you?
21:01:24 <julianp> Hi oneswig. I'm doing well. Getting ready for summer vacation.
21:01:26 <julianp> How are you?
21:01:52 <oneswig> Well our CEO's on vacation and I'm holding the fort.  Just about.
21:02:14 <oneswig> Got a few extra things going on but I am not complaining.
21:02:27 <julianp> Excellent.
21:03:05 <oneswig> One of those things has been testing to confirm that one tenancy on the Scalable Metal Service cannot adversely impact other tenancies.  ie, that we've got the security model about right.
21:03:19 <oneswig> It has taken a little longer than anyone would have liked :-)
21:03:38 <julianp> Ooh. Yeah, I guess that's important to get right.
21:04:20 <oneswig> As a budget service, IPv4's getting pricey so we are embracing IPv6.  It's a weird form of embrace
21:04:55 <oneswig> I assume other people in the world are serious about IPv6 but it does take some getting used to
21:05:10 <julianp> Yes. I know what you mean.
21:06:13 <oneswig> What's new with Exosphere?
21:06:38 <oneswig> Do you have IPv6 support up and running?
21:07:09 <julianp> Quite a bit.
21:07:11 <julianp> Biggest news is that we're working on sharing workflows. Instead of having to maintain long-lived OpenStack images.
21:07:51 <julianp> We're leveraging this specification: https://repo2docker.readthedocs.io/en/latest/specification.html
21:08:08 <julianp> It's the same tool used by Jupyter's Binder project.
21:08:45 <julianp> We want to go where the community is, and this seems like a relatively low burden for minimal viable reproducible workflows.
21:08:56 <oneswig> ooh, interesting.  How will you use it?
21:09:08 <rbudden> hello
21:09:17 <oneswig> hey rbudden, nice to see you
21:09:17 <julianp> Hey rbudden! Long time no see.
21:10:07 <rbudden> yep, been awhile, good to see everyone
21:11:00 <oneswig> What's up with you rbudden?
21:11:04 <julianp> First approach is to allow a user to specify a GitHub repo (or Zenodo DOI, etc.) when launching a server. Then we use the repo2docker tool to create a container from that, and then have a link to launch the Jupyter Notebook, RStudio or whatever. Similar to how we do web terminal and remote desktop in the browser.
21:11:42 <rbudden> Knee deep in our new production OpenStack deploy, so playing with Wallaby
21:11:53 <oneswig> julianp: is what you're doing involving open ondemand or is that competing/irrelevant here?
21:12:19 <oneswig> rbudden: Wallaby?  Cutting edge :-)
21:13:46 <rbudden> haha, yeah, well we figured why deploy a CentOS 7 Train cloud like our TDS and immediately need to do upgrades ;)
21:14:06 <rbudden> so we’re building on CentOS Stream and Wallaby ATM
21:14:30 <julianp> oneswig: It's competing with the notebook interface for Open OnDemand, I think, but we go further. In particular we leverage REES to allow researchers to specify dependencies easily.
21:14:49 <oneswig> rbudden: nice.  I think we start on the same setup this week (but you guys roll your own deployments, right?)
21:14:58 <rbudden> correct
21:15:08 <rbudden> we’re in the process of moving from Puppet -> Ansible as well
21:15:30 <julianp> Whoah.
21:15:54 <julianp> rbudden: What was the motivation for moving to Ansible?
21:15:58 <oneswig> julianp: would be interesting for the team here to see a demo of your work, if you have time.  I think it would be informative.
21:16:31 <julianp> oneswig: We can definitely do that. I have time this week. Ping me on Slack.
21:17:17 <oneswig> Thanks julianp - I may not have time this week but I'll ping you all the same.
21:18:08 <rbudden> So we wrapped Ansible around our custom deploy model mainly for automation purposes.
21:19:43 <rbudden> We have our TDS nailed down to being able to rebuild our entire infrastructure from scratch with essentially a single playbook (two separate openstack clouds) from the metal -> xCAT imaging -> OpenStack deploy
21:21:24 <oneswig> Nice work.  I bet those playbooks look lovely
21:22:02 <rbudden> We liked the setup and decided we’d slowly replace the pieces over time. Additionally, for security reasons it’s also simpler for us to deploy over SSH
21:23:17 <rbudden> oneswig: they are getting there. still a lot of work to do, but I structured it similar to the Ansible setup I did at PSC
21:23:45 <oneswig> rbudden: did they go greenlake with bridges-2 do you know?
21:24:34 <rbudden> I’m not sure TBH, I’ve been out of touch with ppl there since COVID started
21:24:48 <rbudden> but I know they’ve switched directions on a few fronts
21:25:57 <oneswig> interesting, thanks rbudden
21:26:16 <oneswig> So how far on are you with Stream?
21:27:27 <rbudden> we have a full control plane built right now, I’m troubleshooting an issue with our Neutron server/api and communication to the Neutron network nodes
21:27:39 <oneswig> Are you using OVN?
21:28:02 <rbudden> no no, we’re still just doing provider networks, straight VLANs with LB
21:28:16 <martial> sorry for being late
21:28:39 <oneswig> rbudden: nice.  Did anything happen on linuxbridge and driver maintenance?
21:28:46 <oneswig> hi martial, you made it!
21:28:49 <oneswig> #chair martial
21:28:49 <opendevmeet> Current chairs: martial oneswig
21:29:35 <rbudden> I know Mills was fighting for LB support in the PTG, we’re getting nervous about continued support
21:30:21 <rbudden> we’ve both done OVS setups in the past so we may move towards that in the future and see what kind of VXLAN offload support could do for us, etc.
21:30:35 <rbudden> not very familiar with OVN yet, is that what StackHPC is deploying these days?
21:31:31 <oneswig> Yes unless required otherwise.
21:31:39 <rbudden> Our focus lately has been in the multi AZ/Cell setup pieces.
21:31:43 <oneswig> We've been having fun with OVN and hardware offloads for OVS flows.
21:31:52 <rbudden> well, I’m happy to have you convince me about OVN before we go production ;)
21:32:32 <oneswig> Any effort to convince you would entail a moral obligation to support it afterwards :-)
21:33:26 <rbudden> lol
21:34:15 <oneswig> The major advantage of it is that the project has a pulse.  The only advantage I can think of otherwise is that you can do load-balancers in SDN flow rules, instead of booting Amphora VMs running HAProxy.
21:34:19 <rbudden> we have a tight schedule actually, since we have to merge our two running OpenStacks into this new cluster, so we aren’t deviating into new tech much yet
21:34:45 <oneswig> wise move
21:34:58 <rbudden> the only major changes are to AZs really
21:35:14 <rbudden> so we can have localized storage/network/etc to each respective building
21:35:27 <rbudden> but also failover to another building if necessary
21:36:06 <rbudden> We are interested in BGP Unnumbered and other alternatives to our network stack though
21:36:28 <oneswig> That sounds neat.  I'm wary of AZs; it would be interesting to see them used well
21:36:53 <rbudden> they work well in our TDS aside from a few bugs that I’m hoping may have been fixed in Wallaby
21:36:56 <ewimmer> Hi oneswig, sorry I almost forgot about the meeting today.
21:37:05 <oneswig> Hi ewimmer, good to see you
21:37:36 <oneswig> rbudden: well hopefully someone else will have hit the same issues as you
21:37:57 <oneswig> ewimmer: we haven't covered BGP to the host, shall we?
21:38:18 <oneswig> Although jmlowe isn't here and he was the other respondent on it.
21:38:39 <ewimmer> oneswig: Would be nice to see if someone has done it before!
21:39:49 <oneswig> I'm sure it has been done but I just don't know anyone who has done it!
21:40:16 <rbudden> ditto, I don’t think anyone on the Slack channel admitted to actively using it
21:41:01 <ewimmer> So maybe we can think of the downsides if any?
21:41:59 <oneswig> I may be mixing this with routed networks but does it create tenant networks without L2 connectivity?
21:42:01 <ewimmer> Or maybe on the use case first! :)
21:43:20 <ewimmer> oneswig: is that possible?
21:43:52 <ewimmer> these networks have to be connected to somewhere
21:43:52 <oneswig> Yes - if you got instances on packet.net, they had L3-only networking due to a BGP-managed fabric
21:47:01 <ewimmer> So VXLAN only?
21:48:07 <oneswig> I think the way it worked was all hosts got their own /30 network.  So any communication with another host was via a gateway
21:48:28 <oneswig> This might also be how Calico works
21:49:07 <oneswig> But I'm not sure if that's the same experience as BGP to the host - perhaps you still do VXLAN overlays in that case.
21:50:42 <ewimmer> Calico uses BGP
21:51:35 <ewimmer> Hm I was more thinking of using VXLAN for segmentation, replacing my VLANs
21:52:17 <ewimmer> And using OVS/OVN on top.
21:53:10 <ewimmer> BGP would just help to get ECMP to the host. Replacing bonds.
21:53:12 <oneswig> That would be the extension of the Cumulus model, wouldn't it?
21:53:34 <oneswig> I mean, instead of VXLAN from the switch, VXLAN from within the hypervisor.
21:53:39 <ewimmer> Exactly, BGP EVPN
21:55:20 <ewimmer> Cumulus can do what they call multihoming, using BGP instead of an MC-LAG on the leaf switches, but still presenting a bond to the hosts.
21:55:38 <ewimmer> But this works for Mellanox ASICs only...
21:56:49 <ewimmer> So if I can't have this bonding feature, I will have to bring the routing protocol to the hosts.
21:58:11 <oneswig> I think so.  The quest is for a kolla user who sets a precedent.
22:00:13 <oneswig> Ah, we are out of time.
22:00:33 <ewimmer> I will tell you hopefully next time :)
22:00:36 <oneswig> Too bad we didn't get a user story on this
22:00:47 <oneswig> Good luck ewimmer :-)
22:00:57 <oneswig> Final comments anyone?
22:01:46 <oneswig> #endmeeting