11:00:10 #startmeeting scientific_sig
11:00:11 Meeting started Wed Feb 14 11:00:10 2018 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
11:00:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
11:00:14 The meeting name has been set to 'scientific_sig'
11:00:21 hi all
11:00:30 #link Agenda for today https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_February_14th_2018
11:00:33 hi!
11:00:43 o/
11:01:00 Got a few good things to cover today, let's get going!
11:01:06 morning
11:01:17 #topic CFPs and conferences
11:01:26 evening
11:01:36 I saw a couple. Anyone else with one to announce - please do!
11:01:38 hi b1airo
11:01:41 #chair b1airo
11:01:42 Current chairs: b1airo oneswig
11:01:56 a'lo oneswig
11:02:18 Tim Randles passed on details of the ScienceCloud workshop in Arizona, June 11th
11:02:33 #link ScienceCloud workshop https://sites.google.com/site/sciencecloudhpdc/
11:03:00 The HPCAC workshop is coming up in Lugano, 9-12 April
11:03:21 #link HPCAC conference https://www.cscs.ch/publications/press-releases/swiss-hpc-advisory-council-conference-2018-hpcxxl-user-group/
11:03:27 welcome to Lugano!
11:03:31 Any others to report?
11:03:43 I will be in Lugano as well
11:03:44 Hi mpasserini - good to see you. Ready for HPCAC?
11:03:58 excellent, I hope to return as well.
11:04:36 One question - does everyone also attend HPCXXL on the 4th day, or is that restricted attendance?
11:04:37 this is all starting to sound like an excuse for why i need to go
11:04:57 get on it b1airo
11:05:35 isn't HPCXXL IBM-specific?
11:05:35 I'll join the XXL. I'm not sure how the restriction works, I can ask
11:06:07 sounds like DellXL/DellHPC but for IBM/Lenovo
11:06:18 be interesting to know how the two events fit together.
11:06:35 b1airo: interest starting to fade...
11:06:46 :-)
11:07:29 b1airo: you can probably call in on your way home from Dell in Austin, if you're going that way!
11:08:06 OK, are there other events to announce? If not, let's move on
11:08:31 #topic Ironic for bare metal infrastructure management
11:08:56 OK - there's quite a bit going on here.
11:09:06 CERN team, would you like to describe your project?
11:09:26 * johnthetubaguy picks up his ear trumpet
11:09:31 Hi all
11:09:39 Hi makowals, welcome
11:09:51 For quite some time now we have been running Ironic for our users
11:10:07 At the moment we have ~600 physical nodes in use
11:10:56 They cover various use cases - we have "autonomous" users whose only request is to get physical resources, but we are also running Ironic for ourselves, which means we are deploying hypervisors using Ironic
11:11:10 And then offer virtual machines in the standard way to our users
11:11:28 Is that a Bifrost service?
11:11:41 Don't know what "bifrost" is, so I would guess no
11:12:00 Ah, it's basically Ironic run without OpenStack
11:12:18 Then no, we have Ironic fully integrated with the rest of OpenStack
11:12:20 I guess the VM cloud is a totally separate cloud from the Ironic one?
11:12:33 Are you using TripleO?
11:12:38 or just different cells?
11:12:55 We have only separate cells for the Ironic nodes
11:13:05 But it still runs as one cloud
11:13:15 This design works well for us at this moment
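For readers unfamiliar with how bare-metal and virtual machines coexist in one integrated cloud like this, here is a minimal sketch of one common way (not necessarily CERN's) to expose Ironic nodes through Nova, assuming the Pike-era resource-class scheduling model; the resource-class and flavor names are illustrative:

```bash
# Tag the Ironic node with a resource class (the name is illustrative)
openstack baremetal node set $NODE_UUID --resource-class baremetal.general

# Create a Nova flavor that consumes exactly one node of that class.
# The resource class is uppercased with a CUSTOM_ prefix, and the
# standard VCPU/memory/disk resources are zeroed out so scheduling is
# purely by whole node.
openstack flavor create --ram 131072 --disk 900 --vcpus 32 bm.general
openstack flavor set bm.general \
    --property resources:CUSTOM_BAREMETAL_GENERAL=1 \
    --property resources:VCPU=0 \
    --property resources:MEMORY_MB=0 \
    --property resources:DISK_GB=0
```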
11:13:25 are you doing anything to help users understand what images work where?
11:13:38 makowals: do you have a link to a blog post or to a documentation site where those interested can find the details?
11:14:17 johnthetubaguy: Our current strategy with images is to tell our users "you can use the same images as you are used to using with VMs, but now they also work on the physical machines"
11:14:24 zioproto: No, no blog post yet
11:14:41 would be great to see something on openstack-in-production
11:14:59 oneswig: +1
11:15:00 makowals: ah, interesting, that works I guess
11:15:12 oneswig: +1
11:15:18 Yes, the use cases I described work correctly, users are happy at the moment
11:15:22 We do the same thing on Chameleon: we generate images with diskimage-builder that work both on KVM and bare metal
11:15:36 priteau: GTK
11:15:40 We have one more use case in hand, which is containers-on-bare-metal. We don't expose it yet, but the work is at quite an advanced stage
11:16:03 makowals: for your "undercloud" Ironic cells (those you use to provision your hypervisors), are you just relying on policy and cell scheduler restrictions to stop end users accessing these?
11:16:09 makowals: You've got something like 20 models of server in the cloud, right? Does that cause issues for creating bare metal images, or are they comprehensive for all hardware?
11:16:48 b1airo: We are using both cell and flavor separation to keep users from accessing the nodes they shouldn't touch
11:17:11 oneswig: No, so far we have not had any issues with image compatibility between different hardware models
11:17:24 cells are quite nice for that (at least in the v1 world)
11:17:48 However, one big problem for now is the lack of software RAID support in Ironic; this is the biggest thing our users complain about now
11:18:07 There is an RFE open for that though
11:18:07 https://bugs.launchpad.net/ironic/+bug/1590749
11:18:09 Launchpad bug 1590749 in Ironic "RFE: LVM/Software RAID support in ironic-python-agent" [Wishlist,Confirmed]
11:18:20 makowals: also curious about security - you must have quite a few users with various levels of admin permission. Any issues managing that within this all-encompassing environment?
11:18:56 b1airo: Do you mean "admin permissions" for the operators running the cloud, or user-side?
11:19:07 makowals: ah, software RAID rather than the existing clean steps that configure hardware RAID? Or different RAID config based on flavor?
11:19:27 makowals: just wait another couple of years and software RAID will be effectively obsolete with everything running on NVMe
11:19:53 johnthetubaguy: The first step would be software RAID instead of the current hardware RAID setup, but as a next improvement it would be nice to be able to configure RAID dynamically, according to the user's request
11:20:03 yes, admin permissions across a large set of operators
11:20:23 b1airo: Our set of operators is quite small at this moment, so it's one team running all the OpenStack services
11:20:23 makowals: we are very keen on that level of capability too!
11:20:42 makowals: I am glad mark and I are working on that with traits :)
11:20:52 Regarding SW/HW RAID, it's more political; there are constant fights between these two camps ;)
11:21:23 makowals: we should point you at the specs to make sure they work for your use case, sounds identical
11:21:32 makowals: Sounds like more generally you need custom partitioning, which would include LVM and SW RAID
11:21:43 priteau: Yes, exactly this
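A hedged sketch of the Chameleon approach priteau mentions - building one image with diskimage-builder that boots both under KVM and on bare metal. The distro element and image name here are illustrative; the `vm` element is what adds the partition table and bootloader a whole-disk image needs:

```bash
pip install diskimage-builder

# Build a whole-disk CentOS 7 image usable on hypervisors and bare metal.
# Additional elements (cloud-init datasources, site tooling) are site-specific.
disk-image-create centos7 vm -o my-cloud-image

# Upload once, then reference it from both virtual and bare-metal flavors
openstack image create --disk-format qcow2 --container-format bare \
    --file my-cloud-image.qcow2 my-cloud-image
```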
11:21:51 makowals: There was also some work relating to automating the on-boarding of new hardware, right?
11:22:07 oneswig: Ahhh yes, this is the other side of the project
11:22:26 So at CERN we have another team responsible for the process of onboarding hardware arriving on site
11:22:39 They are using multiple in-house tools and databases to handle this process
11:23:40 Recently we started working with them to possibly first integrate their tools into our deploy image, next move their "onboarding" scripts into Ironic's "inspection" phase, and last merge their database with Ironic's
11:24:17 Nice. What does onboarding include in this context? Burn-in and hardware inventory?
11:24:18 At this moment what we would like to see in Ironic is the ability to add more states to the "node lifecycle graph"
11:24:34 priteau: Yes, burn-in/testing and getting the inventory
11:25:06 makowals: same instance of Ironic as the production environment, I guess?
11:25:13 At this very moment we are trying to move their "stuff" into the "inspection" state in Ironic
11:25:29 oneswig: I did not get the last question, sorry
11:26:02 as in, you're using the same Ironic service for on-boarding new hardware as for managing bare metal deployments in the production cloud?
11:26:20 Yes, that's the same Ironic deployment
11:26:44 what do you mean by "burn-in"?
11:26:55 That's interesting to know, thanks
11:27:00 zioproto: CPU, memory, disk and network tests
11:27:15 Also running some benchmarks to check that the node's performance is as expected
11:27:27 And to detect any discrepancies between nodes from the same delivery
11:27:48 makowals: this is like an extended version of the hardware benchmarks you can enable in inspection, I guess?
11:27:58 makowals: Adding more states to the lifecycle graph would be quite intrusive in the Ironic codebase. Have you considered adding tags to your nodes to describe the "sub-state" of the inspection?
11:28:46 oneswig: Yes, but we don't want to run it every time we do "node-inspect"; that's why we need a new state or something providing this functionality
11:29:03 priteau: How would that work with tagging the node?
11:29:16 seems a bit like a clean step, would that work?
11:29:33 i.e. clean before it becomes available
11:29:46 johnthetubaguy: Yes, but then we don't want it to be run every time an instance is deleted, as our burn-in takes 1-2 weeks
11:29:58 makowals: I wonder if you could select by changing which deploy image is associated with the node?
11:30:07 In fact we are getting into our next issue, which is cleaning, repurposing and retirement
11:30:25 If your deploy image could query the Ironic API (might need to embed some credentials), it could do a `node show` to check which tags have been added to the node object. If there is e.g. a tag saying "ram:ok", don't run RAM tests
11:30:36 oneswig: I don't think that would do the job, as it's more the custom hardware driver that describes what's done during the process
11:31:07 priteau: Oh yes, something like this would work perfectly; it's just the issue of doing "ironic node" from the deploy image which is problematic in terms of credentials
11:31:36 But thanks for the idea, this is definitely something to investigate more
11:31:46 create a user with only read-only privileges via policy.json
11:32:39 makowals: what were the issues with cleaning?
11:32:42 you could even pass in a short-lived pre-baked token
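A minimal sketch of priteau's suggestion - the ramdisk querying Ironic for "sub-state" markers before launching long burn-in steps. Assumptions not in the transcript: credentials for a limited-privilege account (or, later, a keystone application credential) are baked into the image, the markers live in the node's `extra` field, the `burnin_ram` key and `run_site_memtest` helper are hypothetical, and `NODE_UUID` has been discovered via the site's usual node-lookup mechanism:

```bash
#!/bin/bash
# Runs inside the deploy/inspection ramdisk (illustrative sketch only).
# OS_* environment variables point at the limited-privilege account.

if openstack baremetal node show "$NODE_UUID" -f json -c extra \
        | grep -q '"burnin_ram": "ok"'; then
    echo "RAM already validated on a previous pass, skipping memtest"
else
    run_site_memtest    # placeholder for the site's actual burn-in test
    # Record the sub-state so the next inspection pass can skip this step.
    # Note: writing the marker needs more than read-only access; with a
    # strictly read-only account the operator would record it out-of-band.
    openstack baremetal node set "$NODE_UUID" --extra burnin_ram=ok
fi
```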
11:32:55 Moving forward, the issue we see with cleaning is similar. For the standard cleaning we only want to delete data from the disks to leave them clean. But there are more complicated cases where, for example, the machine is going to be used by a different user and we need to do a more thorough cleaning of the machine
11:32:58 the new keystone application credentials stuff should make that easier
11:33:19 So something like different cleaning paths depending on what is going to happen with the machine afterwards
11:33:42 At cleaning time, who can predict the next user?
11:34:04 the egg oneswig
11:34:07 Me as an operator: I ask the user to delete their instances because I know I'm going to repurpose the machine
11:34:08 Regarding cleaning, has anyone ever used those disk encryption keys to make all data unreadable very quickly?
11:34:38 At this moment we only play with erasing disk metadata and shredding
11:34:40 I think we are doing that on the SKA AlaSKA machine
11:34:56 priteau: I believe we do - I've just come across blkdiscard for the same result (on SSD)
11:35:19 I think it's already built into the clean steps in Ironic
11:35:42 Yes, it's just a matter of configuring it to use exactly what you need, but the implementation is done
11:35:55 should call a 2 minute warning here for us to move on to the next topic...
11:36:07 Ok
11:36:27 So for our use case the current cleaning could stay as it is, but we would appreciate being able to execute a "clean better" procedure
11:36:28 i'm curious how cells v2 might shake things up (read: break the world) here?
11:36:29 ATA secure erase was the phrase I couldn't remember
11:36:59 the scheduling is global in the v2 world, i.e. placement has all hosts in its DB
11:37:00 makowals: an effective erase is instant on SSDs - and secure erase works for HDDs - is there a need for anything quicker?
11:37:07 I think we don't foresee big problems with going to cells v2, correct me belmoreira if I'm wrong
11:37:20 b1airo: scheduling will be more challenging
11:37:25 oneswig: Secure erase has to be supported by the disk
11:37:36 ah, ok, got it
11:37:51 That's our problem with having such big granularity in the hardware arriving
11:37:56 i'm sure you are up to the challenge belmoreira!
11:38:15 we had a related discussion - it came by on the openstack-dev list - what do people do in Ironic for managing firmware updates?
11:38:54 our policy has been to check during cleaning and fail cleaning if the firmware version does not match an expected version string. Then manually upgrade from maintenance mode.
11:39:01 On our side, we have not started to implement anything in this area yet, but we are aware this problem should be solved
11:39:35 At the same time we also want to focus on regenerating ipmi/bmc credentials
11:39:37 priteau: what happens on Chameleon?
11:40:17 oneswig: on Chameleon we have a separate API and client which can check that various hardware details (including BIOS firmware version) match what's expected. But it's not integrated into the Ironic workflow.
11:41:36 #link this is the widget for Dell BIOS upgrade checks in node cleaning https://github.com/stackhpc/stackhpc-ipa-hardware-managers
11:42:13 oneswig: Thanks! It looks like we could use this :)
11:42:21 Share and enjoy :-)
11:42:28 Let's move on - final thoughts on this?
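For reference, a sketch of the fast-erase options discussed above as you might run them by hand; ironic-python-agent implements equivalents as clean steps, prioritised via `[deploy]/erase_devices_priority` and `[deploy]/erase_devices_metadata_priority` in ironic.conf. Device names are illustrative:

```bash
# SSD: discard every block - effectively instant on most devices
blkdiscard /dev/sdX

# ATA secure erase (the drive must support it and must not be
# "security frozen" by the BIOS): set a temporary password, then erase
hdparm --user-master u --security-set-pass p /dev/sdX
hdparm --user-master u --security-erase p /dev/sdX

# Fallback for disks supporting neither: a slow overwrite
shred -n 1 -z /dev/sdX
```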
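And a bash-level sketch of the "fail cleaning on firmware mismatch" policy oneswig describes; the linked stackhpc repository implements this properly as an ironic-python-agent hardware manager in Python, and the expected version string below is hypothetical:

```bash
#!/bin/bash
# Compare the running BIOS version against what we expect for this model.
EXPECTED_BIOS="2.4.3"                      # hypothetical expected version
ACTUAL_BIOS=$(dmidecode -s bios-version)

if [ "$ACTUAL_BIOS" != "$EXPECTED_BIOS" ]; then
    # A failing clean step sends the node to "clean failed", where an
    # operator can upgrade the firmware from maintenance mode.
    echo "BIOS mismatch: got $ACTUAL_BIOS, want $EXPECTED_BIOS" >&2
    exit 1
fi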
11:42:55 #topic SWITCH - Kubernetes-as-a-Service
11:43:02 Hi zioproto - ready?
11:43:03 hello all
11:43:04 yes
11:43:13 zioproto: how much for that?
:-)
11:43:27 so at SWITCH we are working on Kubernetes, to understand how far we can go towards K8s as a service
11:43:43 we are running k8s on openstack instances
11:43:47 we deploy with this ansible playbook
11:43:52 #link https://github.com/zioproto/k8s-on-openstack
11:44:02 kargo, or your own?
11:44:03 the playbook will create the instances for you
11:44:11 watch the stars on that repo go up
11:44:16 oneswig: forked from an existing project of a company called infraly
11:44:21 see the fork history on github
11:44:35 it uses kubeadm under the hood
11:44:52 for the users
11:45:05 the idea is to give them just the ~/.kube/config file
11:45:10 The key idea is to reuse the existing openstack username and password to log in to K8s.
11:45:11 is there a quick way to compare this to Magnum creating the k8s clusters?
11:45:11 zioproto: is this a shared cluster for all the users?
11:45:52 johnthetubaguy: Magnum has a dependency on Heat; because we are always running several releases behind, we decided not to use that
11:46:05 belmoreira: it is possible, hold on, let me explain the rest
11:46:12 zioproto: cool, makes sense
11:46:19 thanks to Dims who developed this code
11:46:27 #link https://github.com/dims/k8s-keystone-auth
11:46:41 it is possible to delegate authentication and authorization to keystone
11:46:49 but we use only the authentication part
11:47:24 so keystone just tells k8s that I am really that user, and that my token is really scoped to a certain group
11:47:34 I can create an RBAC rule in k8s
11:47:41 where kind: Group is a keystone project
11:47:52 so I can use the same cluster for all the users
11:47:59 where every user sees just one namespace
11:48:09 or all users of a keystone project see 1 namespace
11:48:10 BUT
11:48:13 there is a big BUT
11:48:19 depending on the network solution you have
11:48:27 you have to implement the multi-tenancy at the network level yourself
11:48:35 at the moment in our solution all pods are on the same network
11:48:42 so users' actions are confined to 1 namespace
11:48:54 but I can, for example, connect over the network to a pod in a different namespace
11:49:01 at the moment we have fixed the IPv6 support
11:49:13 #link https://github.com/kubernetes/kubernetes/pull/59749
11:49:21 we are planning to deploy our Horizon
11:49:27 on top of this Kubernetes
11:49:39 and expose horizon on IPv4 and IPv6 using the nginx-ingress-controller
11:49:50 that runs as a docker container with host networking on the controller
11:49:55 so it can expose both IPv4 and IPv6
11:50:02 the cluster internally is IPv4
11:50:06 and we use the Neutron integration
11:50:11 so each node is assigned a pod subnet
11:50:17 and the k8s master talks to neutron
11:50:23 and injects routes into the neutron router
11:50:31 so that each pod subnet is routed to the right VM
11:50:34 questions so far?
11:50:53 What are SWITCH users doing with it?
11:50:59 #link https://cloudblog.switch.ch/2017/11/15/deploy-kubernetes-v1-8-3-on-openstack-with-native-neutron-networking/
11:51:18 oneswig: not yet released to users. At the moment we are running our staging Horizon deployment
11:51:32 as soon as the IPv6 patch I linked is merged in k8s, we go to production with horizon
11:51:41 then we will be ready to offer this to the users
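A minimal sketch of the RBAC rule zioproto describes, assuming the k8s-keystone-auth webhook is wired in for authentication and returns the keystone project as the group name; the namespace and project ID below are made up:

```bash
kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: keystone-project-edit
  namespace: project-demo                  # one namespace per keystone project
subjects:
- kind: Group
  name: 0c4e939acacf4376bdcd1129f1a054bb   # hypothetical keystone project ID
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                               # built-in role: read/write within the namespace
  apiGroup: rbac.authorization.k8s.io
EOF
```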
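The network-level multi-tenancy gap flagged above is what a NetworkPolicy like the following would close - but only when the CNI plugin actually enforces NetworkPolicy, which a plain pods-routed-via-Neutron network may not. A sketch, with the namespace name again illustrative:

```bash
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-from-other-namespaces
  namespace: project-demo
spec:
  podSelector: {}        # applies to every pod in the namespace
  ingress:
  - from:
    - podSelector: {}    # allow ingress only from pods in this same namespace
EOF
```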
11:51:42 so dogfood first?
11:51:48 b1airo: right
11:52:35 we still have a problem capturing the Horizon python logs
11:52:39 because
11:52:41 our docker container
11:52:47 is based on Apache + WSGI
11:52:56 Docker intercepts the STDERR and STDOUT of Apache
11:53:03 Why do you need docker containers on kubernetes?
11:53:04 but WSGI is yet another process in the container
11:53:25 armstrong: I mean, a pod is a collection of docker containers
11:53:50 we wrote the Dockerfile for our Horizon deployment
11:54:14 zioproto: so it's you (the operators) managing the cluster size for all users
11:54:15 all sounds like great work that lots of others will want to replicate
11:54:18 armstrong: we use Kubernetes with Docker containers; I am not aware of other container runtimes supported by k8s
11:54:31 belmoreira: right
11:54:42 armstrong: maybe I did not understand the question?
11:54:43 belmoreira: In this case how do you charge?
11:54:48 @zioproto Ok I see
11:54:58 when you do
11:55:03 kubectl logs podname
11:55:12 it is basically a wrapper around docker logs containername
11:55:12 interesting that we have heard about two distinct OpenStack use-cases this evening where the control-plane is merging into the production cloud
11:55:46 to conclude, I hope we go to production in the next weeks with the Horizon deployment
11:55:47 b1airo: good point. The stratification blurs!
11:55:54 we will run Horizon Pike on top of our OpenStack Newton cluster
11:56:30 on that note, do other deployments have a notion of tiers of OpenStack services?
11:56:31 once we learn more about the platform, we will offer it first to other teams inside SWITCH, and then to our cloud users
11:56:53 Did you consider running Heat from Pike and Magnum as an alternative?
11:57:26 oneswig: no, and I can explain why
11:57:51 oneswig: we prefer keeping the OpenStack footprint small - as few projects as possible - because upgrades are difficult
11:57:56 i had the same question next too
11:58:03 oneswig: we want a solution independent of an OpenStack upgrade
11:58:21 when I upgrade openstack, I don't want to have to upgrade magnum too
11:58:51 fair point zioproto, though there are no longer many upgrade dependencies across services
11:59:14 Final comments, the hour is upon us
11:59:15 stable REST APIs have helped with the inter-service stuff
11:59:31 zioproto: do you have a way to enforce quotas for users?
11:59:42 belmoreira: I don't know yet
11:59:53 I guess it is possible if you can enforce quotas on namespaces
12:00:02 in k8s namespaces I mean
12:00:24 join #sig-openstack on slack to get updates on this
12:00:37 I think it is slack.k8s.io
12:00:41 I would have to check the URL
12:00:51 practical example of a mixed-version cloud: https://trello.com/b/9fkuT1eU/nectar-openstack-versions
12:01:09 We are out of time, thanks everyone
12:01:24 thank you !
12:01:24 Thanks @oneswig
12:01:29 Thanks, cheers
12:01:35 Thanks all!
12:01:37 Thanks @zioproto
12:01:38 thanks
12:01:38 thanks all! especially makowals and zioproto
12:01:41 #endmeeting
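A footnote on the Horizon log-capture problem discussed above: a common container workaround (not necessarily what SWITCH adopted) is to point the application's log file at the stdio of PID 1, which is what `docker logs` and `kubectl logs` capture even when WSGI runs as a separate process; the log path here is hypothetical:

```bash
# In the Horizon image's build: link the Django log file to the stderr
# of PID 1 so writes from the WSGI process reach the container log stream
ln -sf /proc/1/fd/2 /var/log/horizon/horizon.log
```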
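And a footnote on belmoreira's quota question: Kubernetes can enforce per-namespace quotas with a ResourceQuota object, which maps naturally onto the namespace-per-keystone-project scheme described earlier. A sketch, with illustrative limits:

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: project-quota
  namespace: project-demo
spec:
  hard:
    requests.cpu: "20"       # total CPU requested by all pods combined
    requests.memory: 64Gi    # total memory requested by all pods combined
    pods: "50"               # maximum number of pods in the namespace
EOF
```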