derekokeeffe85 | Morning noonedeadpunk (if you're on that is) | 08:16 |
---|---|---|
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_tempest master: Update to cirros 0.6.2 https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/886165 | 09:08 |
ncuxo | hmm so the playbook is installing all the services in lxc containers? | 10:07 |
ncuxo | https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/run-playbooks.html the first playbook setup-hosts.yml is preparing the target hosts. If that is true then why do I need to prepare the host beforehand? All you should need beforehand is just to put your ssh keys inside the target host and that's it. Or am I missing a point here? | 10:14 |
jrosser | ncuxo: there are some pre-requisites, like networking that you must do yourself on the target hosts | 11:56 |
jrosser | the setup-hosts playbook is specific things required on all hosts for the openstack deployment | 11:57 |
jrosser | "All you should need before hand is just put your ssh keys inside the target host and thats it" - yes you are missing the point because to some extent every deployment is different at a physical/network level at least, number of interfaces, approach to H/A, storage local/NFS/infiniband/whatever | 11:59 |
jrosser | openstack-ansible allows a very large degree of operator freedom to architect the deployment to meet their own requirements, so it is really not a "shrink wrap installer" | 12:00 |
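For reference, the host preparation being described mostly comes down to creating the OSA bridges before any playbook runs. A minimal netplan sketch follows; the bridge names and subnets mirror the upstream examples, while the interface names, VLAN IDs and addresses are illustrative assumptions only.

```yaml
# /etc/netplan/30-osa-bridges.yaml -- illustrative sketch; interface names,
# VLAN IDs and addresses are assumptions, adjust for your own network design
network:
  version: 2
  ethernets:
    eno1: {}
  vlans:
    eno1.10:                     # management VLAN (assumed ID)
      id: 10
      link: eno1
    eno1.30:                     # storage VLAN (assumed ID)
      id: 30
      link: eno1
  bridges:
    br-mgmt:                     # container/management network
      interfaces: [eno1.10]
      addresses: [172.29.236.11/22]
    br-storage:                  # storage network
      interfaces: [eno1.30]
      addresses: [172.29.244.11/22]
```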
ncuxo | wait, what? all I've seen is: install some packages, set up ssh and do the network bridges depending on the node | 13:02 |
ncuxo | oh yeah, the storage; well, the storage can also be compartmentalised in a playbook | 13:03 |
ncuxo | there are even already-written playbooks for that: linux_system_roles/storage | 13:03 |
opendevreview | Simon Hensel proposed openstack/openstack-ansible-galera_server master: Add optional compression to mariabackup https://review.opendev.org/c/openstack/openstack-ansible-galera_server/+/886180 | 14:21 |
NeilHanlon | ncuxo: as jrosser said, the point of the project is not to do everything for the users, but provide a high degree of freedom for it to be customized to your environment. We cannot (and won't) make decisions about how your network, storage, etc, is configured | 18:22 |
NeilHanlon | Doing so would limit the amount of flexibility operators have to use OSA how they like | 18:22 |
ncuxo | okay so I have 5 servers, 3 of which will be in the initial install. I want to have everything on those 3 servers and scale out with whatever services I need. I'm trying to build an HCI deployment where openstack is self-sufficient and doesn't need anything except an external router. How should my storage be configured, then, if openstack doesn't manage my storage? | 18:53 |
NeilHanlon | lowercas_: https://review.opendev.org/c/openstack/openstack-ansible/+/869762/8 | 18:58 |
*** | lowercas_ is now known as lowercase | 18:59 |
opendevreview | Neil Hanlon proposed openstack/openstack-ansible stable/yoga: Drop `else` condition in the container_skel_load loop https://review.opendev.org/c/openstack/openstack-ansible/+/886143 | 19:02 |
opendevreview | Neil Hanlon proposed openstack/openstack-ansible stable/yoga: Drop `else` condition in the container_skel_load loop https://review.opendev.org/c/openstack/openstack-ansible/+/886143 | 19:19 |
opendevreview | Neil Hanlon proposed openstack/openstack-ansible stable/yoga: Add is_nest property for container_skel https://review.opendev.org/c/openstack/openstack-ansible/+/886206 | 19:19 |
jrosser | ncuxo_: openstack has a few different types of storage (volumes / object / images / ephemeral) and supports many backend implementations for those, for example for block storage you can choose from these https://docs.openstack.org/cinder/latest/configuration/block-storage/volume-drivers.html | 19:51 |
jrosser | so to answer "how should my storage be configured" you need to choose which of the storage types you want to implement and which backend you are going to use for them | 19:52 |
jrosser | as an example, it is pretty common to use ceph to provide volume, image and object storage | 19:52 |
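To make the "choose a backend" point concrete, this is roughly what selecting the ceph/RBD driver for block storage looks like in OSA's openstack_user_config.yml; the layout follows the upstream ceph example, with the host name and IP as placeholders.

```yaml
# openstack_user_config.yml fragment (sketch; host name and IP are placeholders)
storage_hosts:
  infra1:
    ip: 172.29.236.11
    container_vars:
      cinder_backends:
        limit_container_types: cinder_volume
        rbd:
          volume_driver: cinder.volume.drivers.rbd.RBDDriver
          volume_backend_name: rbd
          rbd_pool: volumes
          rbd_ceph_conf: /etc/ceph/ceph.conf
          rbd_user: cinder
```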
ncuxo_ | I wanna be able to implement all of them: block, file and object storage | 19:53 |
ncuxo_ | so I have to install ceph outside of openstack? | 19:53 |
jrosser | right - so it is your choice of backend driver | 19:53 |
jrosser | openstack-ansible can deploy ceph because it has an integration with ceph-ansible | 19:54 |
jrosser | though, for various reasons it is a popular choice not to have tight coupling between the ceph deployment and the openstack deployment | 19:54 |
jrosser | obviously that is hard to do with an HCI approach | 19:54 |
jrosser | but HCI does bring its own challenges | 19:55 |
ncuxo_ | exactly because I want to use all the resources each server has | 19:55 |
ncuxo_ | could you point out a few, please? | 19:56 |
jrosser | you would need to have a plan for dealing with resource contention between ceph OSD and your virtual machines, and the control plane processes | 19:56 |
jrosser | how will you prioritise which process should be killed by the OOM killer when ceph memory usage balloons during a large recovery event? | 19:56 |
jrosser | your vm libvirt? mariadb database for openstack? | 19:57 |
ncuxo_ | if openstack installs ceph shouldn't it take care of that ? | 19:57 |
jrosser | openstack is the set of projects that implement the APIs, like nova / cinder | 19:58 |
jrosser | openstack does not install ceph, openstack-ansible does | 19:58 |
ncuxo_ | ok, then doesn't openstack-ansible install a systemd unit that checks for stuff like that? | 19:59 |
jrosser | we have a reference implementation which is not HCI | 19:59 |
jrosser | and from an openstack-ansible perspective we would generally not recommend an HCI approach, though nothing stops you configuring a deployment like that | 19:59 |
ncuxo_ | https://docs.openstack.org/openstack-ansible/latest/user/ceph/full-deploy.html | 20:00 |
jrosser | yes | 20:00 |
jrosser | the compute hosts are separate from the controllers, and separate from the OSD hosts | 20:00 |
ncuxo_ | and since I'm planning to have it all in one I'm begging for trouble .... | 20:01 |
ncuxo_ | got it now | 20:01 |
mgariepy | what are you guys using for networking? | 20:02 |
jrosser | so you define in nova, for example, how much memory to keep spare for "other things" | 20:02 |
jrosser | and you would need to come up with a figure that was sufficient for ceph + 1/3rd of the control plane | 20:02 |
ncuxo_ | jrosser: only that? then that's not that hard, I can spare an easy 128g of ram for all those vm | 20:03 |
ncuxo_ | s/vm/vns | 20:03 |
ncuxo_ | ,,, cant type today | 20:03 |
jrosser | well like i say ceph memory usage can be wildly unpredictable | 20:04 |
jrosser | steady state is very different from when it's recovering from a major "event" in the cluster, like the loss of a node or something | 20:04 |
ncuxo_ | all my storage is 2g ssds and I'm not planning on making it larger, I prefer to scale out rather than add thicker drives | 20:04 |
ncuxo_ | 2T ssds ... | 20:05 |
jrosser | mgariepy: do you ever try anything converged like this? | 20:05 |
mgariepy | i would not. | 20:05 |
mgariepy | when 1 service needs debugging it's enough for me, i don't need all of them to be down at the same time | 20:05 |
ncuxo_ | mgariepy: but I have 3 hosts so they all should be replicated and in ha state? | 20:06 |
jrosser | normally those would be 3x control plane hosts then you add more as computes | 20:06 |
ncuxo_ | I don't care why something fails just rinse and repeat | 20:06 |
mgariepy | well. when this works sure. | 20:06 |
jrosser | controllers can be smaller resource-wise than compute hosts | 20:07 |
mgariepy | maybe openstack isn't the right solution ? | 20:07 |
ncuxo_ | and I have only beefy servers this is why I need everything to work on the control plane as well | 20:08 |
ncuxo_ | mgariepy: I'm trying to move away from the typical hypervisor infra. I've been checking baremetal k8s and baremetal openstack | 20:09 |
mgariepy | for only a couple server like that i would probably try proxmox | 20:10 |
ncuxo_ | I'm doing 3 as a start then I'll add 2 more and have another 20 waiting for the load | 20:10 |
jrosser | it feels wrong for that quantity of hardware not to have dedicated controllers | 20:11 |
mgariepy | maybe try to have 1 controller and a couple compute ? for the storage i'm not sure. | 20:12 |
jrosser | it really depends on the use case | 20:13 |
jrosser | you would build a cluster dedicated to CI jobs with no shared storage at all | 20:13 |
jrosser | but if uptime/availability were important then you would make different choices | 20:13 |
jrosser | there is not one correct way to build openstack, the point is you architect something that fits the use case | 20:13 |
mgariepy | i tend to build clusters dedicated to users without local storage instead :D but yeah, it depends on the use-case. | 20:14 |
ncuxo_ | jrosser: it really doesn't make sense to waste 48 cores and 768 per server just for the control plane | 20:14 |
jrosser | then personally i would also have some smaller hardware | 20:14 |
mgariepy | you can have a couple of 12 core 128gb nodes for the controllers.. | 20:14 |
jrosser | my test lab has 3x 4 core / 64g controllers for example | 20:15 |
jrosser | super cheap | 20:15 |
ncuxo_ | https://docs.openstack.org/openstack-ansible/latest/user/ceph/full-deploy.html ok, and then I still need separate hosts for ceph and compute | 20:15 |
jrosser | that's what the reference architecture says | 20:16 |
ncuxo_ | also, what about the LBs? I want them in openstack as well | 20:16 |
jrosser | nothing stops you co-locating ceph & compute, openstack-ansible will deploy that if it's what you want | 20:17 |
mgariepy | how many drive do you have per server? | 20:17 |
jrosser | but then remember it lets you have pretty much any architecture you want | 20:17 |
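A hedged sketch of what that co-located (HCI-style) layout could look like in openstack_user_config.yml: the same three nodes carry the control plane, the ceph mons/OSDs and compute. The group names come from the OSA ceph example, with shared-infra_hosts standing in for the full set of infra groups; host names and IPs are placeholders.

```yaml
# openstack_user_config.yml fragment (sketch; hosts and IPs are placeholders,
# and shared-infra_hosts stands in for the other infra/controller groups)
shared-infra_hosts: &hci_nodes
  node1: {ip: 172.29.236.11}
  node2: {ip: 172.29.236.12}
  node3: {ip: 172.29.236.13}
ceph-mon_hosts: *hci_nodes
ceph-osd_hosts: *hci_nodes     # OSDs on the same boxes, so the contention caveats above apply
compute_hosts: *hci_nodes      # compute on the same boxes, i.e. the HCI layout being discussed
```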
ncuxo_ | jrosser: hmm, if I have the compute and ceph on the same server I can simply add quotas on all my compute hosts to fill up to 80%, and this way even if ceph hogs the memory during a recovery the vms can migrate to the other hosts | 20:18 |
jrosser | like i say, you can tell nova through its config how much host memory should be reserved | 20:19 |
ncuxo_ | mgariepy: 10 drives per server 2t sas ssds | 20:19 |
jrosser | sas.... | 20:19 |
jrosser | no raid controller i hope | 20:20 |
mgariepy | you can also pin cores to vms. | 20:20 |
mgariepy | it's flexible :D | 20:20 |
ncuxo_ | the raid controller is in jbod | 20:20 |
jrosser | maybe reserve 50G, don't know, i'm just guessing | 20:21 |
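In OSA terms, the reservation being guessed at here (and the core pinning mentioned above) would normally land in user_variables.yml as a nova config override; the numbers below are illustrative assumptions only, to be sized against the ceph and control-plane footprint on each host.

```yaml
# user_variables.yml fragment (sketch; both values are illustrative assumptions)
nova_nova_conf_overrides:
  DEFAULT:
    reserved_host_memory_mb: 51200      # hold back ~50G of host RAM from VM scheduling
  compute:
    cpu_dedicated_set: "8-47"           # host cores usable for pinned (dedicated) VM vCPUs,
                                        # leaving 0-7 for ceph OSDs and the control plane
```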
ncuxo_ | probably will leave it at 68 and use the 700 | 20:21 |
jrosser | i am generally more concerned about "day 2 operations" when thinking about this stuff | 20:21 |
jrosser | like how do i upgrade my openstack version, what happens when i need to update the OS major release across the whole cluster | 20:22 |
ncuxo_ | well I have 6 months for planning and testing | 20:22 |
jrosser | what happens when the OS I have does not support the release of ceph that i need | 20:22 |
mgariepy | cephadm.. only needs podman.. lo | 20:23 |
mgariepy | lol | 20:23 |
jrosser | ^ all this is really what becomes your tasks, not worrying about whether you fully utilised some server with HCI or not | 20:23 |
ncuxo_ | I'm confused again ... why should I care about that stuff? if a server is broken, re-provision it and continue with my day? why do I feel I'm missing something here | 20:24 |
mgariepy | it's not micro-services deployed in k8s that auto-respawn when one goes offline. | 20:24 |
ncuxo_ | isn't that the point of self-healing infra, everything is ephemeral? | 20:25 |
mgariepy | you are talking of openstack. | 20:25 |
ncuxo_ | sure, isn't ironic responsible for reprovisioning your host? | 20:25 |
jrosser | not at all | 20:25 |
mgariepy | nop | 20:25 |
jrosser | ironic is a service you can deploy, which will manage baremetal host deployment for your users, as a service | 20:26 |
ncuxo_ | I feel I've been reading them and not understood a thing ... | 20:26 |
ncuxo_ | oh, so it's not meant for the operator, it's meant for the user ... | 20:26 |
jrosser | some tools (the now-deprecated tripleo for example) did use ironic to deploy openstack itself | 20:26 |
jrosser | but that is really not the core purpose of ironic | 20:27 |
jrosser | it can be, and in some cases is, used by the operator too, but that's kind of pretty advanced usage | 20:28 |
ncuxo_ | I'm really starting to think about baremetal k8s with ceph, kubevirt and ironic | 20:28 |
jrosser | right - so it entirely depends on your use case what is suitable | 20:29 |
jrosser | if you want multi-tenancy properly for example, that might be a factor | 20:29 |
ncuxo_ | my idea was to have a vm on my laptop with openstack-ansible as the deployment host, then provision a single host, install what is necessary, and from that one server expand everything out. That is what I'm looking at | 20:31 |
ncuxo_ | that server has all the services inside; it provisions the next server, and while the server count is less than 3 it moves the infra services and the core services across, until I reach a total of 3, then it's just ceph and nova | 20:33 |
jrosser | openstack-ansible is not self-replicating like that | 20:34 |
mgariepy | there is quite a lot of static stuff in osa. | 20:36 |
ncuxo_ | when you mentioned 12 cpu / 128 ram per control plane host, did you mean 12 cpus, not vcpus? | 20:36 |
ncuxo_ | mgariepy: you said earlier that all ceph requires is podman, but in the docs I've only seen docker and lxc containers. So podman is used just for ceph? | 20:39 |
mgariepy | cephadm deploys ceph in podman/docker | 20:39 |
ncuxo_ | I'd prefer it if podman was hardcoded and not docker, but well .... | 20:39 |
mgariepy | i got to run now. family time. | 20:40 |
ncuxo_ | thanks for explaining stuff to me | 20:40 |
jrosser | if openstack-ansible deploys ceph it does not use podman or cephadm | 20:41 |
ncuxo_ | jrosser: can openstack-ansible manage my LBs or do I have to have them separate? | 20:41 |
jrosser | it uses LXC (or not, if you don't want) and distro packages | 20:41 |
jrosser | but most people, when at decent scale choose to decouple ceph from openstack | 20:41 |
jrosser | ncuxo_: which LB? for your openstack API endpoint, or LBAAS via the octavia service? | 20:42 |
ncuxo_ | 25 servers is not so big, at least in my understanding; after listening to some podcasts, big infra is over 500 servers | 20:42 |
ncuxo_ | https://docs.openstack.org/openstack-ansible/latest/user/ceph/full-deploy.html not sure which one this one is | 20:43 |
jrosser | that is the LB for the dashboard and API endpoints | 20:44 |
jrosser | openstack-ansible deploys haproxy and keepalived for that by default | 20:44 |
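A minimal sketch of that built-in haproxy/keepalived wiring, assuming the VIP addresses and interface names are placeholders for your own environment:

```yaml
# openstack_user_config.yml fragment (sketch; addresses are placeholders)
global_overrides:
  external_lb_vip_address: 203.0.113.10    # public VIP for dashboard/API endpoints
  internal_lb_vip_address: 172.29.236.9    # management-network VIP

# user_variables.yml fragment (sketch; interface names are assumptions)
haproxy_keepalived_external_vip_cidr: "203.0.113.10/32"
haproxy_keepalived_internal_vip_cidr: "172.29.236.9/32"
haproxy_keepalived_external_interface: bond0
haproxy_keepalived_internal_interface: br-mgmt
```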
ncuxo_ | so I don't need something external ... sweet | 20:45 |
jrosser | again you can choose :) | 20:45 |
jrosser | some people like an F5-type appliance | 20:45 |
ncuxo_ | as I've said, outside of the firewall I want all the services to come from openstack: dhcp, dns, lb, ntp | 20:46 |
jrosser | i think this is also maybe not right | 20:47 |
jrosser | you need to provide NTP yourself, for example | 20:47 |
ncuxo_ | can't I have vms which live on openstack and provide the service ? | 20:49 |
jrosser | tbh i think it is worth stepping back and looking at what it takes to provide infrastructure as a service | 20:53 |
jrosser | your openstack hosts cannot, for example, validate SSL certificates unless they have accurate time | 20:54 |
jrosser | and unsynchronised host clocks are disastrous for ceph | 20:54 |
jrosser | so this tells you that as the platform operator, you must have proper sources of fundamentals like NTP as foundations to build your infrastructure on top of | 20:55 |
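One way to provide that foundation before OSA is even involved is a tiny standalone play that puts chrony on every host. This is only a sketch, not part of openstack-ansible; the pool name is an assumption to be replaced with your own trusted time source, and the paths/service name assume a Debian/Ubuntu host.

```yaml
# ntp-prereq.yml -- illustrative pre-deployment play, not part of OSA;
# point the pool at your own trusted time source
- hosts: all
  become: true
  tasks:
    - name: Install chrony
      ansible.builtin.package:
        name: chrony
        state: present

    - name: Configure an upstream time source
      ansible.builtin.copy:
        dest: /etc/chrony/chrony.conf          # /etc/chrony.conf on EL-family hosts
        content: |
          pool 0.pool.ntp.org iburst
          driftfile /var/lib/chrony/chrony.drift
          makestep 1.0 3
      notify: Restart chrony

  handlers:
    - name: Restart chrony
      ansible.builtin.service:
        name: chrony                           # chronyd on EL-family hosts
        state: restarted
        enabled: true
```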
ncuxo_ | jrosser: I can't find an article describing the prerequisite services before deploying openstack | 21:16 |
NeilHanlon | seems I've missed a lively conversation about HCI | 22:26 |
* | NeilHanlon is relieved he missed it | 22:27 |
ncuxo_ | :D we can continue if you were not relieved | 22:27 |
NeilHanlon | :P | 22:28 |