11:04:27 <oneswig> #startmeeting scientific-sig
11:04:28 <openstack> Meeting started Wed Jul 3 11:04:27 2019 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:04:29 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:04:31 <openstack> The meeting name has been set to 'scientific_sig'
11:04:33 <oneswig> #chair b1airo
11:04:33 <openstack> Current chairs: b1airo oneswig
11:04:43 <oneswig> janders: used the 2600TP white box servers
11:04:44 <b1airo> sorry, laptop just froze at a very inopportune moment
11:04:59 <janders> oneswig: yes, those ones
11:05:11 <janders> do you have any experience with automating BIOS configuration?
11:05:12 <oneswig> Would prefer not to, bmc and bios firmware was not good.
11:05:28 <janders> oops then we might be a little screwed
11:05:37 <oneswig> janders: I don't think it is possible. I recall we had to upgrade NIC firmware on them and it was painful at the time.
11:05:46 <oneswig> (this was 2 years ago, things may have improved).
11:05:47 <janders> they have partial redfish support but it seems very read-only especially if done via ansible
11:06:04 <janders> there are some new tools my guys are testing
11:06:05 <oneswig> sounds like things have evolved a bit then.
11:06:18 <b1airo> were they cheap enough to warrant it...?
11:06:37 <janders> but I'm in discussion with their devs, they seem to be keen to make things better
11:07:08 <janders> allright.. it's good to know we're not alone
11:07:20 <janders> these seem pretty cool piece of kit other than this "little
11:07:24 <janders> " detail
11:07:41 <janders> I will try to work out something with Intel and happy to report back if anyone is interested
11:08:00 <oneswig> always like a discussion on things like that...
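
(A note on the BIOS-automation question above: where a node's BMC exposes a writable Redfish BIOS resource, a rough sketch using the sushy Redfish client library might look like the snippet below; Ansible's redfish_config module covers similar ground. The BMC URL, credentials, system path and attribute names are placeholders, and on the boxes discussed here the BIOS resource may well turn out to be read-only, as janders noted - this is a sketch of the general approach, not a claim that it works on that hardware.)

    import sushy

    # Connect to the node's BMC over Redfish (URL and credentials are placeholders).
    root = sushy.Sushy('https://bmc.example.com/redfish/v1',
                       username='admin', password='secret', verify=False)

    # Fetch the system resource; the identity path varies between vendors.
    system = root.get_system('/redfish/v1/Systems/1')

    # Inspect the current BIOS attributes...
    print(system.bios.attributes)

    # ...and, if the firmware allows PATCHing them, queue new values
    # (attribute names are vendor-specific and purely illustrative here).
    system.bios.set_attributes({'BootMode': 'Uefi'})
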
11:08:27 <oneswig> belmoreira: I have John and Mark here, are you available to discuss next week?
11:09:21 <belmoreira> should be fine. Send me your availability (day, hour) and I will check with the other guys
11:09:39 <oneswig> will do, thanks (and apologies for the thrash)
11:10:08 <belmoreira> oneswig no worries
11:10:35 <oneswig> How's the new control plane rollout going?
11:10:53 <b1airo> how's your control-plane k8s-ing going belmoreira ?
11:11:05 <b1airo> jinx
11:11:12 <oneswig> :-)
11:11:19 <belmoreira> :)
11:11:42 <belmoreira> we have production request going to the k8s cluster with glance
11:12:03 <belmoreira> next step is nova-api (maybe next week)
11:12:11 <janders> nice!
11:12:17 <belmoreira> and then have a full cell control plane
11:12:43 <belmoreira> the main motivation is to gain ops experience and see if it's makes sense to move the control plane to k8s
11:13:02 <janders> belmoreira: when complete, will you run a "small/internal" k8s to run OpenStack control plane that will then run "large/user-facing" k8s on top?
11:14:40 <belmoreira> janders not clear. What I would like is to have the k8s cluster deployed with magnum in the same infrastructure. Like the inception that we have today for the control plane VMs
11:15:41 <janders> inception... is the OpenStack running itself, essentially?
11:15:52 <b1airo> if i've understood belmoreira's current cell-level control-plane architecture correctly then each control-plane k8s cluster will itself be running across some subset of his production cloud! it's turtles, or dogfood - hopefully those things aren't equivalent - all the way down/round
11:16:57 <belmoreira> b1airo: that's it
11:17:54 <b1airo> memory can't be too rusty yet then, despite being part of a "fleshy von neumann machine" (referencing a conversation earlier this evening)
11:18:06 <b1airo> anyone played with Singularity 3.3 yet?
11:18:16 <b1airo> fakeroot builds and such?
11:18:48 <b1airo> (it's still in rc for the moment i think, so won't be surprised if not)
11:21:38 <oneswig> belmoreira: got any tips for ceph mds performance? We've got a deployment with CephFS as a cluster filesystem and it's apparently pegged in directory operations
11:23:58 <b1airo> yeesh, is telling them not to do that an option oneswig ?
11:24:44 <oneswig> belmoreira: back on the self-hosted control plane, what have you looked at in terms of disaster recovery scenarios? How to fix a broken control plane from within?
11:25:05 <oneswig> b1airo: It was my idea ...
11:25:37 <oneswig> I have to head off
11:26:20 <oneswig> b1airo: can you take the reins? Got to head out now
11:26:34 <b1airo> oh, well in that case it's a fine idea :-P. i guess that means it's not a prod deployment that you're trying to fix though?
11:26:40 <b1airo> sure oneswig
11:26:47 <oneswig> see you janders belmoreira b1airo, nice to briefly catch up
11:26:59 <oneswig> b1airo: no, a bit of scientific experimentation
11:27:03 <oneswig> cheers all
11:27:06 <janders> oneswig: thank you, till next time
11:27:07 <b1airo> o/
11:27:55 <belmoreira> sorry, was away for few moments
11:28:13 <b1airo> i'm assuming oneswig must have already looked at directory fragmentation/sharding across multiple MSD's, but perhaps worth mentioning anyway
11:28:23 <b1airo> *MDS's
11:28:51 <belmoreira> oneswig "ceph mds performance" I don't have great tips. Maybe Dan can help
11:30:38 <belmoreira> oneswig "disaster recovery scenarios" in case of catastrophe... we always keep few physical nodes to bootstrap the cloud. But that would be very unlikely
11:31:08 <b1airo> also, vul at unimelb might have some tips - they are running a HTC/HPC system against CephFS, so have probably seen some of these things
11:32:33 <b1airo> ok, i think we can probably call it quits for now. late here anyway
11:32:46 <janders> agreed
11:32:49 <janders> thanks guys!
11:32:52 <janders> till next time
11:33:15 <belmoreira> thanks
11:34:02 <b1airo> cheers!
11:34:06 <b1airo> #endmeeting
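
(A footnote on the CephFS MDS discussion above: one common way to relieve a metadata server that is pegged on directory operations is to allow more than one active MDS (ceph fs set <fsname> max_mds 2) and pin busy subtrees to specific ranks via the ceph.dir.pin virtual extended attribute, as b1airo alluded to. A minimal sketch from a client mount is below; the mount point and rank numbers are placeholders, and whether explicit pinning or the default dynamic subtree balancing is the better fit depends on the workload.)

    import os

    # Pin two busy project directories to different active MDS ranks by
    # setting the ceph.dir.pin virtual xattr on a CephFS client mount.
    # Paths and rank numbers are placeholders for illustration.
    os.setxattr('/mnt/cephfs/project_a', 'ceph.dir.pin', b'0')
    os.setxattr('/mnt/cephfs/project_b', 'ceph.dir.pin', b'1')

    # A pin of -1 hands the subtree back to the default balancer.
    # os.setxattr('/mnt/cephfs/project_a', 'ceph.dir.pin', b'-1')
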