11:00:34 <oneswig> #startmeeting scientific-sig
11:00:35 <openstack> Meeting started Wed Apr 22 11:00:34 2020 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:00:36 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:00:39 <openstack> The meeting name has been set to 'scientific_sig'
11:01:17 <oneswig> hello
11:01:29 <witek> hi
11:02:06 <oneswig> Hi witek, how are you?
11:02:14 <oneswig> #link agenda for today https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_April_22nd_2020
11:02:22 <oneswig> small agenda. Must try harder :-)
11:04:05 <janders> g'day
11:04:11 <oneswig> #topic Sig at virtual PTG
11:04:15 <witek> I'm fine, thanks
11:04:16 <oneswig> Hi janders, evening
11:05:19 <oneswig> janders: had a question for you about how you were using vxlan - on the IB NIC was that right?
11:05:46 <janders> I've done this some time back
11:07:16 <janders> I used this method for the fully virtualised part of the cluster running on IB kit
11:07:16 <oneswig> #link virtual PTG planning https://ethercalc.openstack.org/126u8ek25noy
11:09:26 <janders> I wouldnt recommend vxlan over IB for large scale, performance sensitive cases
11:09:32 <janders> but for small-medium it works pretty well
11:11:44 <oneswig> I put in for 1500UTC-1700UTC and 2100UTC-2300UTC for virtual PTG, hopefully the split session will work for more people.
11:12:12 <janders> comms issues sorry
11:12:26 <witek> the afternoon session should work for me
11:13:31 <oneswig> apologies, had a call
11:14:02 <oneswig> afternoon for me too, I'll see how much I've got left in the tank for the evening.
11:15:40 <oneswig> We should think about how to advocate scientific sig use cases
11:15:58 <oneswig> (and who to?)
11:17:46 <oneswig> #topic AOB
11:18:23 <oneswig> Just sitting in on a presentation on OpenStack changes for supporting CentOS8
11:18:45 <oneswig> plenty of changes to accommodate
11:20:48 <oneswig> Quite a big piece of work but CentOS 7 will rapidly go off, likely
11:20:50 <janders> top #3 challenges?
11:21:15 <janders> which release will support both el7 and el8?
11:21:31 <oneswig> I joined late, but DNF, PFtables, Python 3, and network scripts being deprecated all have to be resolved.
11:21:56 <oneswig> In the Kolla world, Train spans both.
11:22:09 <oneswig> Assume it's the same for TripleO
11:22:25 <janders> how well is el7->el8 migration going for people?
11:22:34 <janders> or is it irrelevant for OS?
11:22:49 <oneswig> It's a reinstall
11:23:01 <janders> wasn't that supposed to be an in-place?
11:23:27 <oneswig> I'm not sure for RHEL but for CentOS it's reinstall
11:23:42 <janders> I've done interesting el6->el7 migrations on Icehouse I think it was
11:23:46 <janders> back in the day
11:24:21 <janders> SRIOV cluster so no live migration
11:24:42 <janders> but /var/lib/nova/instances as a separate LV worked a treat
11:25:03 <oneswig> Right. I occasionally hear things about support for SRIOV live migration...
11:25:30 <janders> in the "maybe one day" context or ready for testing context?
11:27:21 <oneswig> Trying to remember where I saw it recently.
11:33:29 <oneswig> It's quiet today, anyone with AOB to add?
11:33:56 <janders> playing around with tripleo updates on OSP13
11:34:06 <janders> seems updates are heaps better than installs :)
11:34:18 <janders> mostly ansible
11:34:28 <oneswig> Good to hear it, although usually it's the other way around
11:34:31 <janders> as next versions come out, who knows, maybe tripleo will get usable
11:34:48 <janders> promise is less puppet more ansible, should be all ansible soon
11:35:06 <janders> another interesting find - CX6 drivers are in overcloud images, but not in IPA
11:35:10 <oneswig> Seems like a good path and one that converges with OS-A and Kolla.
11:35:15 <janders> we were rebuilding IPA with CX6 drivers this week
11:35:42 <oneswig> From our team Doug did our first production deploy with the Ironic SW RAID driver
11:36:15 <janders> nice!
11:36:26 <janders> i may move to that before I get VROC working in IPA :P
11:37:03 <oneswig> Those things are harder than they ought to be, I've heard.
11:37:12 <janders> confirmed :/
11:37:30 <oneswig> The price paid for always having the shiny toys!
11:37:46 <janders> indeed
11:38:47 <oneswig> janders: are you still deploying supercloud nodes with only one NIC?
11:39:17 <janders> for the cyber project im doing right now I have two
11:39:31 <janders> 100/200 HDR for storage only
11:39:35 <janders> and 100GE for everything else
11:39:55 <janders> but this cluster is a little HPC and more security focused
11:41:05 <oneswig> Are you using the m-key stuff on the IB, or is that outside of the user-facing world?
11:42:00 <janders> m-key?
11:43:21 <oneswig> There was a fix for bare metal multi-tenant IB security, I believe it related to preventing IB NICs from sending management datagrams.
11:43:45 <janders> right!
11:44:01 <janders> for older cards it was special locked-down firmware
11:44:19 <janders> unsure for the latest ones - in this cluster, IB fabric is not user facing (at least for now)
11:44:35 <oneswig> ok, makes sense.
11:44:35 <janders> so it's just the standard FW
11:45:07 <oneswig> Are your tenants untrusted in this system janders?
11:45:29 <janders> im less worried about the tenants and more about the stuff they will be working on
11:45:33 <janders> malware analysis
11:45:50 <janders> so IB is mostly used to connect volumes with their data to hypervisors at decent performance
11:46:13 <oneswig> NVMEoF?
11:46:50 <janders24> GPFS
11:47:07 <janders24> though NVMEoF is a good idea actually :)
11:47:27 <janders24> I've gone with GPFS cause it can do everything (glance/cinder backend + parallel FS)
11:47:57 <janders24> but im starting to lean towards splitting openstack backends off the parallel FS
11:47:57 <oneswig> What's the state of the OpenStack support?
11:48:08 <janders24> for OSP13 it *just works*
11:48:16 <oneswig> Nice, good to hear it.
11:48:29 <janders24> for OSP16 it's uncertain, sometimes I'm thinking about forward-porting it myself :)
11:48:31 <oneswig> Assuming you didn't mean it *just* works :-)
11:48:53 <janders24> yeah... there is some fiddling to get it to work
11:49:00 <janders24> and then it *just works* :)
11:49:25 <janders24> containerisation isn't helping to make things simple
11:49:58 <oneswig> how so?
11:50:25 <janders> passing through extra subtrees into a number of containers via tripleo isnt trivial
11:53:20 <oneswig> Something like /var/lib/nova?
11:53:57 <janders> spot on
11:53:59 <janders> and glance
11:54:20 <janders> cinder is even more interesting cause its pacemaker powered in OSP
11:55:22 <oneswig> Back in the day it was a huge problem upgrading a hypervisor or controller with OFED packages installed. How's that going now?
11:55:31 <janders24> hahaha
11:55:35 <janders24> it was killing us just this morning
11:55:50 <janders24> I fixed this some time back with elaborate repo management tactics
11:55:57 <janders24> but these days more and more stuff is in kernel
11:56:05 <janders24> so I usually get away with running without OFED
11:56:14 <janders24> i probably wouldnt do that on GPFS servers
11:56:20 <janders24> but GPFS clients are running fine
11:56:41 <janders24> and as mentioned earlier recent OSP13 overcloud images come with CX6 drivers included
11:56:49 <oneswig> Interestingly, it does seem possible to upgrade in-place, but not easy: https://www.centlinux.com/2020/01/how-to-upgrade-centos-7-to-8-server.html
11:57:14 <janders24> IPA may need injection of bits of OFED but that doesn't need to be upgradable, it's rebuild-as-needed
11:57:50 <oneswig> I remember benchmarking the iSER cinder driver, it was really good (but also a significant SPOF)
11:58:18 <janders24> tahts the beauty of GPFS
11:58:20 <janders24> no SPOFs
11:58:32 <oneswig> yes, very nice.
11:58:45 <janders24> yeah with el7>el8 I was expecting to see Fedora-like in-place upgrade mechanism
11:59:03 <janders24> unsure to what degree its applicable to OpenStack (or how worthwhile this would be)
11:59:15 <janders24> likes of tripleo will probably rebuild node after node
11:59:24 <janders24> but having said that i havent explored that path so not sure... yet
11:59:32 <janders24> OSP13 has a fair bit of life left on it and works well
12:00:07 <oneswig> Ah, we are out of time.
12:00:22 <janders24> thanks guys
12:00:23 <oneswig> Thanks janders24 witek good talking with you both
12:00:24 <janders24> stay safe!
12:00:28 <oneswig> and you
12:00:30 <oneswig> #endmeeting