11:00:34 #startmeeting scientific-sig
11:00:35 Meeting started Wed Apr 22 11:00:34 2020 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:00:36 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:00:39 The meeting name has been set to 'scientific_sig'
11:01:17 hello
11:01:29 hi
11:02:06 Hi witek, how are you?
11:02:14 #link agenda for today https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_April_22nd_2020
11:02:22 small agenda. Must try harder :-)
11:04:05 g'day
11:04:11 #topic SIG at virtual PTG
11:04:15 I'm fine, thanks
11:04:16 Hi janders, evening
11:05:19 janders: had a question for you about how you were using VXLAN - on the IB NIC, was that right?
11:05:46 I've done this some time back
11:07:16 I used this method for the fully virtualised part of the cluster running on IB kit
11:07:16 #link virtual PTG planning https://ethercalc.openstack.org/126u8ek25noy
11:09:26 I wouldn't recommend VXLAN over IB for large-scale, performance-sensitive cases
11:09:32 but for small-medium it works pretty well
11:11:44 I put in for 1500UTC-1700UTC and 2100UTC-2300UTC for the virtual PTG, hopefully the split session will work for more people.
11:12:12 comms issues, sorry
11:12:26 the afternoon session should work for me
11:13:31 apologies, had a call
11:14:02 afternoon for me too, I'll see how much I've got left in the tank for the evening.
11:15:40 We should think about how to advocate Scientific SIG use cases
11:15:58 (and who to?)
11:17:46 #topic AOB
11:18:23 Just sitting in on a presentation on OpenStack changes for supporting CentOS 8
11:18:45 plenty of changes to accommodate
11:20:48 Quite a big piece of work, but CentOS 7 will likely go stale rapidly
11:20:50 top 3 challenges?
11:21:15 which release will support both el7 and el8?
11:21:31 I joined late, but DNF, nftables, Python 3, and network-scripts being deprecated all have to be resolved.
11:21:56 In the Kolla world, Train spans both.
11:22:09 Assume it's the same for TripleO
11:22:25 how well is the el7->el8 migration going for people?
11:22:34 or is it irrelevant for OS?
11:22:49 It's a reinstall
11:23:01 wasn't that supposed to be in-place?
11:23:27 I'm not sure for RHEL, but for CentOS it's a reinstall
11:23:42 I've done interesting el6->el7 migrations on Icehouse, I think it was
11:23:46 back in the day
11:24:21 SR-IOV cluster, so no live migration
11:24:42 but /var/lib/nova/instances as a separate LV worked a treat
11:25:03 Right. I occasionally hear things about support for SR-IOV live migration...
11:25:30 in the "maybe one day" context or the ready-for-testing context?
11:27:21 Trying to remember where I saw it recently.
11:33:29 It's quiet today, anyone with AOB to add?
11:33:56 playing around with TripleO updates on OSP13
11:34:06 seems updates are heaps better than installs :)
11:34:18 mostly Ansible
11:34:28 Good to hear it, although usually it's the other way around
11:34:31 as next versions come out, who knows, maybe TripleO will get usable
11:34:48 the promise is less Puppet, more Ansible - should be all Ansible soon
11:35:06 another interesting find - CX6 drivers are in the overcloud images, but not in IPA
11:35:10 Seems like a good path and one that converges with OpenStack-Ansible and Kolla.
11:35:15 we were rebuilding IPA with CX6 drivers this week
11:35:42 From our team, Doug did our first production deploy with the Ironic SW RAID driver
11:36:15 nice!
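On the VXLAN-over-IB discussion above: a minimal sketch of what layering a VXLAN interface on an IPoIB device can look like, using pyroute2. The interface names (ib0, vxlan100), VNI, multicast group and addressing are illustrative assumptions, not details taken from the meeting.

```python
# Sketch: creating a VXLAN interface on top of an IPoIB device with pyroute2.
# Device name (ib0), VNI (100), multicast group and addresses are assumptions
# for illustration; requires root and the vxlan kernel module.
from pyroute2 import IPRoute

ipr = IPRoute()

# Find the IPoIB parent interface the VXLAN tunnel will run over.
ib_idx = ipr.link_lookup(ifname="ib0")[0]

# Create vxlan100 bound to ib0, using multicast for BUM traffic.
ipr.link(
    "add",
    ifname="vxlan100",
    kind="vxlan",
    vxlan_id=100,             # VNI
    vxlan_link=ib_idx,        # underlay device (IPoIB)
    vxlan_group="239.1.1.1",  # multicast group for flood-and-learn
    vxlan_port=4789,          # standard VXLAN UDP port
)

# Bring the new interface up and give it an address.
vx_idx = ipr.link_lookup(ifname="vxlan100")[0]
ipr.link("set", index=vx_idx, state="up")
ipr.addr("add", index=vx_idx, address="10.100.0.1", prefixlen=24)

ipr.close()
```

The encapsulation overhead on top of IPoIB is part of why this approach is noted above as fine for small-to-medium deployments but not recommended for large, performance-sensitive ones.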
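And on the Ironic software RAID deploy just mentioned: a hedged sketch of setting a node's target RAID config and triggering the cleaning steps that build it. The node name, cloud credentials and disk layout are assumptions; the target_raid_config structure follows the upstream Ironic RAID documentation, and the node's raid_interface must be one that supports software RAID (e.g. agent).

```python
# Sketch: configuring Ironic software RAID ahead of a deploy.
# Node name ("node-01"), endpoint and layout are illustrative assumptions;
# the target_raid_config structure follows the Ironic RAID docs.
from ironicclient import client
from keystoneauth1 import session
from keystoneauth1.identity import v3

auth = v3.Password(
    auth_url="https://keystone.example.com:5000/v3",  # assumed endpoint
    username="admin", password="secret",
    project_name="admin",
    user_domain_name="Default", project_domain_name="Default",
)
ironic = client.get_client(1, session=session.Session(auth=auth))

# Two md-backed logical disks: a RAID-1 root and a RAID-0 scratch volume.
target_raid_config = {
    "logical_disks": [
        {"size_gb": 100, "raid_level": "1", "controller": "software"},
        {"size_gb": "MAX", "raid_level": "0", "controller": "software"},
    ]
}
ironic.node.set_target_raid_config("node-01", target_raid_config)

# The config is applied by cleaning steps, executed by IPA on the node:
# tear down any existing RAID, then create the target layout.
clean_steps = [
    {"interface": "raid", "step": "delete_configuration"},
    {"interface": "raid", "step": "create_configuration"},
]
ironic.node.set_provision_state("node-01", "clean", cleansteps=clean_steps)
```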
11:36:26 I may move to that before I get VROC working in IPA :P
11:37:03 Those things are harder than they ought to be, I've heard.
11:37:12 confirmed :/
11:37:30 The price paid for always having the shiny toys!
11:37:46 indeed
11:38:47 janders: are you still deploying supercloud nodes with only one NIC?
11:39:17 for the cyber project I'm doing right now I have two
11:39:31 100/200 HDR for storage only
11:39:35 and 100GE for everything else
11:39:55 but this cluster is a little less HPC and more security focused
11:41:05 Are you using the m-key stuff on the IB, or is that outside of the user-facing world?
11:42:00 m-key?
11:43:21 There was a fix for bare metal multi-tenant IB security, I believe it related to preventing IB NICs from sending management datagrams.
11:43:45 right!
11:44:01 for older cards it was special locked-down firmware
11:44:19 unsure for the latest ones - in this cluster, the IB fabric is not user facing (at least for now)
11:44:35 ok, makes sense.
11:44:35 so it's just the standard FW
11:45:07 Are your tenants untrusted in this system, janders?
11:45:29 I'm less worried about the tenants and more about the stuff they will be working on
11:45:33 malware analysis
11:45:50 so IB is mostly used to connect volumes with their data to hypervisors at decent performance
11:46:13 NVMe-oF?
11:46:50 GPFS
11:47:07 though NVMe-oF is a good idea actually :)
11:47:27 I've gone with GPFS because it can do everything (glance/cinder backend + parallel FS)
11:47:57 but I'm starting to lean towards splitting the OpenStack backends off the parallel FS
11:47:57 What's the state of the OpenStack support?
11:48:08 for OSP13 it *just works*
11:48:16 Nice, good to hear it.
11:48:29 for OSP16 it's uncertain, sometimes I'm thinking about forward-porting it myself :)
11:48:31 Assuming you didn't mean it *just* works :-)
11:48:53 yeah... there is some fiddling to get it to work
11:49:00 and then it *just works* :)
11:49:25 containerisation isn't helping to make things simple
11:49:58 how so?
11:50:25 passing extra subtrees through into a number of containers via TripleO isn't trivial
11:53:20 Something like /var/lib/nova?
11:53:57 spot on
11:53:59 and glance
11:54:20 cinder is even more interesting because it's Pacemaker-powered in OSP
11:55:22 Back in the day it was a huge problem upgrading a hypervisor or controller with OFED packages installed. How's that going now?
11:55:31 hahaha
11:55:35 it was killing us just this morning
11:55:50 I fixed this some time back with elaborate repo management tactics
11:55:57 but these days more and more stuff is in-kernel
11:56:05 so I usually get away with running without OFED
11:56:14 I probably wouldn't do that on GPFS servers
11:56:20 but GPFS clients are running fine
11:56:41 and as mentioned earlier, recent OSP13 overcloud images come with CX6 drivers included
11:56:49 Interestingly, it does seem possible to upgrade in-place, but not easy: https://www.centlinux.com/2020/01/how-to-upgrade-centos-7-to-8-server.html
11:57:14 IPA may need injection of bits of OFED, but that doesn't need to be upgradable - it's rebuild-as-needed
11:57:50 I remember benchmarking the iSER cinder driver, it was really good (but also a significant SPOF)
11:58:18 that's the beauty of GPFS
11:58:20 no SPOFs
11:58:32 yes, very nice.
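On the "passing extra subtrees into containers" point: TripleO exposes per-service *OptVolumes parameters for adding extra bind mounts to service containers, which is one way to expose a GPFS mount like the one discussed. A minimal sketch generating such an environment file; the /gpfs path and the specific parameter names (NovaComputeOptVolumes, GlanceApiOptVolumes) are assumptions to verify against the tripleo-heat-templates of your release.

```python
# Sketch: generate a TripleO environment file that bind-mounts an extra
# subtree (a hypothetical /gpfs path) into the nova-compute and glance
# containers via per-service *OptVolumes parameters. Paths and the exact
# parameter set are assumptions; check tripleo-heat-templates for your
# release before relying on them.
import yaml

extra_mount = "/gpfs:/gpfs"  # host_path:container_path

environment = {
    "parameter_defaults": {
        "NovaComputeOptVolumes": [extra_mount],
        "GlanceApiOptVolumes": [extra_mount],
    }
}

with open("gpfs-bind-mounts.yaml", "w") as f:
    yaml.safe_dump(environment, f, default_flow_style=False)

# Then include it in the deploy:
#   openstack overcloud deploy ... -e gpfs-bind-mounts.yaml
```

As noted in the discussion, cinder is the harder case in OSP because the service is managed by Pacemaker rather than plain container configuration.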
11:58:45 yeah, with el7->el8 I was expecting to see a Fedora-like in-place upgrade mechanism
11:59:03 unsure to what degree it's applicable to OpenStack (or how worthwhile it would be)
11:59:15 the likes of TripleO will probably rebuild node after node
11:59:24 but having said that, I haven't explored that path, so not sure... yet
11:59:32 OSP13 has a fair bit of life left in it and works well
12:00:07 Ah, we are out of time.
12:00:22 thanks guys
12:00:23 Thanks janders24, witek - good talking with you both
12:00:24 stay safe!
12:00:28 and you
12:00:30 #endmeeting