| gokhan | Hi everyone, I’m working on designing a large-scale architecture and could use some advice from the community. So far, I’ve been running about 50-100 hypervisors with 3 controllers without any issues, but I'm now planning to scale up to 1,000 nodes and I'm wondering what the best path forward is. At what point does Cells v2 become a 'must-have' rather than just an option? For those of you running massive setups, how many compute nodes do | 06:59 |
|---|---|---|
| gokhan | you typically manage per cell to keep things stable and limit the blast radius? I'd love to hear your experiences on whether a single-cell setup can actually handle 1,000 nodes with the right tuning, or if I should definitely start splitting things up now. Thanks a lot for any tips! | 06:59 |
| gibi | gokhan: I think 1000 nodes in a single cell is a huge stretch. Might work if you have a fairly static workload. I think our unofficial recommendation is to start considering cells around 200 nodes | 08:13 |
| gibi | gokhan: if you search around the net you will find reports from large scale openstack deployers, like CERN, about their size and tuning. For example https://superuser.openinfra.org/articles/large-scale-openstack-operators-tricks-and-tools-openinfra-live-recap/ or https://techblog.web.cern.ch/techblog/ | 08:17 |
| gokhan | thanks gibi for your insights :) | 08:53 |
| gibi | gokhan: also take the approach of scaling gradually if you can, and measure as you scale. If you see a bottleneck in certain operations then we can help by suggesting tunings for it | 08:55 |
| gibi | unfortunately there is no single good answer for scale | 08:56 |
| gokhan | thanks gibi, we will start with about 100 nodes per cell. | 09:07 |
| opendevreview | Merged openstack/nova master: docs: mention q35 machine type for UEFI guests https://review.opendev.org/c/openstack/nova/+/981413 | 09:19 |
| *** sambork_ is now known as sambork | 09:26 | |
| -opendevstatus- NOTICE: Recent POST_FAILURE job results with no logs were due to upload errors in one of our providers, which has been temporarily disabled now so rechecking those should be safe | 15:04 | |
| yessou-sami | Hi everyone, I was wondering if anyone has done an analysis of how safe it is to enable hw_qemu_guest_agent on OpenStack so that guest agents can be in contact with the hypervisor? | 15:38 |
| sean-k-mooney | that's not quite how that works | 15:59 |
| sean-k-mooney | when you enable hw_qemu_guest_agent it provides a channel from the host to allow the host to interact with an agent in the guest | 15:59 |
| sean-k-mooney | but it does not provide a way for the guest to interact with the hypervisor | 15:59 |
| sean-k-mooney | at least not by default | 15:59 |
| sean-k-mooney | hw_qemu_guest_agent is generally considered safe unless you are planning to develop a host agent that will interact with it or do some other custom integration | 16:00 |
| gibi | also setting hw_qemu_guest_agent alone is not enough, the image needs to have the agent installed. Nova neither checks for it nor installs it into the guest. | 16:59 |
| bauzas | fun that I answered the same question earlier today somewhere else | 17:00 |
| bauzas | if people want to ensure that GA is installed if users set the GA property, then they have to talk to Glance, not us | 17:00 |
| cardoe | hello all. Still hoping to see https://review.opendev.org/c/openstack/nova-specs/+/471815 get done. The code has been written for a few cycles. As have the tests. The code is deployed by a number of cloud operators as well. | 17:06 |
| sean-k-mooney | gibi: yes, and for the actually useful feature, i.e. quiescing the guest disk on snapshot, i think you need to set a second extra spec too | 17:29 |
| sean-k-mooney | os_require_quiesce | 17:30 |
| sean-k-mooney | although in theory an operator can use the qemu guest agent for other things | 17:30 |
| sean-k-mooney | i believe rackspace used to use it to configure some things by default in their cloud like dhcp/dns etc. | 17:31 |
| sean-k-mooney | bauzas: well hw_qemu_guest_agent isn't strictly saying it is installed in the current image. it's saying i want to use the guest agent. for some reason i might choose to install it after the fact with ansible or cloud-init in the image | 18:00 |
| sean-k-mooney | *in the vm rather than in the image | 18:00 |
| bauzas | then there is no way for Nova to introspect that VM | 18:00 |
| sean-k-mooney | but ya all that image property does is ensure the qemu channel for the guest agent is provisioned in qemu | 18:00 |
| sean-k-mooney | bauzas: correct, that is out of scope for nova | 18:01 |
| bauzas | at least I don't want to have Nova introspecting the guest :) | 18:01 |
| sean-k-mooney | nor do i | 18:01 |
| sean-k-mooney | it would be incorrect for nova to do so in this case | 18:01 |
| bauzas | my proposal was to make it enforced by Glance, provided the operator was wanting it | 18:01 |
| bauzas | but that's a Glance discussion | 18:01 |
| sean-k-mooney | oh no i don't think glance should enforce it either | 18:01 |
| sean-k-mooney | that to me would be a breaking change requiring at least a config option | 18:02 |
| sean-k-mooney | as i said this image property is for requesting the guest agent channel to be provisioned | 18:02 |
| sean-k-mooney | it's not declaring that the image has it installed | 18:02 |
| sean-k-mooney | those are two very different things | 18:02 |
| bauzas | surely, I'm just saying that the service that has the responsibility for introspecting the image is Glance as of today | 18:02 |
| sean-k-mooney | sure but i'm saying that the image property does not state that it's installed | 18:03 |
| sean-k-mooney | that's not what it's for | 18:03 |
| bauzas | sean-k-mooney: yup yup, but AFAICU, the request was to have GA be default | 18:03 |
| bauzas | since public cloud operators don't trust their users | 18:03 |
| sean-k-mooney | oh that's an entirely different conversation | 18:03 |
| sean-k-mooney | we could do that | 18:03 |
| sean-k-mooney | or make it configurable via the flavor so admins can opt in | 18:04 |
| sean-k-mooney | but that's very different from introspecting the image | 18:04 |
| gmaan | Gibi: Bauzas: Not sure if you got time to check this change (running graceful shutdown in both modes), but it will be good to do it before any breaking change. https://review.opendev.org/c/openstack/nova/+/978292 | 18:27 |
| gmaan | gibi: ^^ | 18:27 |
| gibi | gmaan: +2, thanks | 18:30 |
| gmaan | thanks | 18:30 |
| opendevreview | Ghanshyam Maan proposed openstack/nova stable/2026.1: DNM: test grenade job https://review.opendev.org/c/openstack/nova/+/981980 | 18:36 |
| opendevreview | Ghanshyam Maan proposed openstack/nova stable/2025.2: DNM: test grenade job https://review.opendev.org/c/openstack/nova/+/981982 | 18:39 |
| JayF | Uggla: I don't see OpenStack Nova with a slide in OpenInfra Live Gazpacho; is nova being repped in that at all? Y'all probably should have someone there :| | 20:53 |
| sean-k-mooney | i thought someone was going to attend but i have not really been involved | 20:58 |
| Uggla | JayF, I have just answered that I will cover Nova highlights. Thanks for the reminder. | 22:42 |
| JayF | awesome, thanks. It didn't look like many projects had opted in so I'm glad to get more | 22:43 |
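
As a rough illustration of the cell sizing discussed earlier in the log, the sketch below splits a node count into the fewest cells that stay under a per-cell cap. The function name `plan_cells` is hypothetical. The caps used in the example are only the figures mentioned in the conversation: gibi's unofficial "start considering cells around 200 nodes" and gokhan's chosen 100 nodes per cell. Neither is a tested limit, and real sizing should follow from measuring as you scale.

```python
import math


def plan_cells(total_nodes: int, max_nodes_per_cell: int) -> list[int]:
    """Return per-cell node counts using the fewest cells that respect the cap.

    Hypothetical helper for back-of-the-envelope planning only; it knows
    nothing about workload, database load, or message queue capacity.
    """
    if total_nodes < 0 or max_nodes_per_cell <= 0:
        raise ValueError("total_nodes must be >= 0 and max_nodes_per_cell > 0")
    if total_nodes == 0:
        return []
    # Fewest cells such that no cell exceeds the cap.
    cell_count = math.ceil(total_nodes / max_nodes_per_cell)
    # Spread nodes as evenly as possible across those cells.
    base, extra = divmod(total_nodes, cell_count)
    return [base + 1 if i < extra else base for i in range(cell_count)]


if __name__ == "__main__":
    # 1,000 nodes with the caps mentioned in the log (assumed, not official).
    for cap in (200, 100):
        cells = plan_cells(1000, cap)
        print(f"cap={cap}: {len(cells)} cells, sizes={cells}")
```

Spreading nodes evenly rather than filling cells to the cap keeps each cell's failure domain about the same size, which matches the "limit the blast radius" goal in the original question.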
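
The guest agent discussion above separates requesting the agent channel from requiring quiesce on snapshot. A minimal sketch of that split follows, assuming both are set as image properties with the value `yes`, which is my understanding of the Glance image property documentation and is not stated in the log. The helper names and the image name `my-image` are hypothetical. As the log stresses, setting these properties neither installs the agent in the guest nor proves that it is installed.

```python
def guest_agent_image_properties(require_quiesce: bool = False) -> dict[str, str]:
    """Build the image properties discussed in the log.

    hw_qemu_guest_agent only asks for the QEMU guest agent channel to be
    provisioned; it does not declare that the agent is installed.
    The "yes" values are an assumption, not taken from the log.
    """
    props = {"hw_qemu_guest_agent": "yes"}
    if require_quiesce:
        # Second property mentioned in the log for quiescing disks on snapshot.
        props["os_require_quiesce"] = "yes"
    return props


def as_image_set_args(image: str, props: dict[str, str]) -> list[str]:
    """Render the properties as arguments for `openstack image set`."""
    args = ["openstack", "image", "set"]
    for key, value in props.items():
        args += ["--property", f"{key}={value}"]
    return args + [image]


if __name__ == "__main__":
    props = guest_agent_image_properties(require_quiesce=True)
    # "my-image" is a placeholder image name.
    print(" ".join(as_image_set_args("my-image", props)))
```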