ttx | Large Scale SIG meeting here in one hour! | 07:59 |
---|---|---|
amorin | Will be there! | 08:17 |
ttx | songwenping: hoping you can join the meeting in 15 minutes so we address questions on your doc! | 08:42 |
stan | How do I observe the meeting, from here in this chat? | 08:49 |
amorin | stan: yes, this will be an IRC meeting | 08:55 |
ttx | And everyone can participate! | 08:56 |
ttx | #startmeeting large_scale_sig | 09:00 |
opendevmeet | Meeting started Wed Jun 19 09:00:01 2024 UTC and is due to finish in 60 minutes. The chair is ttx. Information about MeetBot at http://wiki.debian.org/MeetBot. | 09:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 09:00 |
opendevmeet | The meeting name has been set to 'large_scale_sig' | 09:00 |
ttx | Hi everyone, welcome to our monthly Large Scale SIG meeting! | 09:00 |
amorin | o/ | 09:00 |
ttx | #topic Rollcall | 09:00 |
ttx | ping felix.huettner songwenping | 09:00 |
ttx | Our agenda is at: | 09:01 |
ttx | #link https://etherpad.opendev.org/p/large-scale-sig-meeting | 09:01 |
ttx | Waiting a few minutes in case other participants join late | 09:02 |
songwenping | \o/ | 09:02 |
songwenping | hi ttx | 09:02 |
ttx | songwenping: hi! Glad you could make it | 09:02 |
ttx | OK, let's get started | 09:03 |
ttx | #topic Brainstorm OpenInfra Live next episode ideas | 09:03 |
ttx | In previous meetings we discussed a potential new episode, after unsuccessfully trying to crowdsource one from the rest of the community | 09:03 |
ttx | amorin: we were considering one around infrastructure for GPUs, did you manage to convince anyone at OVH around that? | 09:04 |
amorin | I completely forgot to talk about it unfortunately, sorry for this | 09:04 |
amorin | so I will ask, adding in my local todo right now | 09:04 |
ttx | I was wondering if we could get https://www.nexgencloud.com/ to talk | 09:05 |
ttx | They are one of the biggest buyers of GPUs recently and run an openstack cloud | 09:05 |
amorin | what is the idea in your mind? | 09:05 |
amorin | how openstack and gpu can work together? | 09:05 |
amorin | or how is it consumed by customers? | 09:05 |
amorin | or the usage of GPU in cloud? | 09:06 |
ttx | Specific challenges in providing a large scale GPU cloud, I guess | 09:06 |
ttx | identifying any gap | 09:06 |
songwenping | GPU management? our product adapt many kinds of GPUs, like A | 09:06 |
ttx | trying to anticipate questions the next GPU cloud deployer may have | 09:07 |
amorin | so, more related to infrastructure than customer use cases | 09:07 |
ttx | yeah... Would not mind some shiny workload example too, but that's a bit orthogonal to our SIG purpose | 09:07 |
songwenping | A100, A40, V100, P100 and so on. | 09:07 |
ttx | Could be more of a panel thing | 09:07 |
amorin | ok, I have a guy for this in the team, will ask if he is willing to join/talk about it | 09:08 |
ttx | Experience operating an OpenStack GPU cloud these days | 09:08 |
ttx | cool. We'll reach out to Nexgen to see if they are interested | 09:08 |
ttx | and then open it up to others | 09:08 |
ttx | probably something we'd do in ~October | 09:09 |
ttx | September we'll be busy at OpenInfra Summit Asia | 09:09 |
amorin | ack, so we have time to refine this, that's good | 09:09 |
ttx | and July-August will be tricky | 09:09 |
ttx | #agreed let's try to do a panel episode around Experience operating an OpenStack GPU cloud | 09:10 |
ttx | #action amorin to confirm an OVHCloud speaker | 09:10 |
amorin | ack | 09:10 |
ttx | #action ttx to see if someone from nexgen would be interested | 09:10 |
ttx | #info targeting October timeframe | 09:10 |
amorin | maybe have sylvain bauza in the talk as well? he is involved in GPU and openstack a lot | 09:11 |
ttx | yeah that's a good idea... | 09:11 |
ttx | #info Sylvain Bauza could bring the development angle | 09:11 |
ttx | I'll give it some extra thought and pull Allison in for extra ideas | 09:12 |
ttx | moving on to next topic | 09:12 |
ttx | #topic Large scale doc | 09:12 |
ttx | songwenping sent a great report to the mailing-list | 09:12 |
ttx | #link https://etherpad.opendev.org/p/large-scale-inspur | 09:12 |
ttx | There were some open followup questions | 09:12 |
amorin | yes, that's great, thanks! | 09:13 |
ttx | mnaser asked "How did you adjust the max number of conns for RabbitMQ and for the relay I assume you used https://docs.ovn.org/en/latest/tutorials/ovn-ovsdb-relay.html ?" | 09:13 |
ttx | than amorin had questions too | 09:13 |
ttx | then* | 09:13 |
amorin | yup, I am eager to learn more about what you wanted to achieve and what you exactly did to fix your deployment | 09:14 |
songwenping | amorin, good question. we want to manage more nodes as there is a big requirement from customers. | 09:16 |
ttx | songwenping: did you see those questions on the mailing-list? ideally you would respond there so that everyone benefits | 09:16 |
songwenping | we use k8s infrastructure to deploy openstack | 09:16 |
songwenping | sorry, maybe i missed the mail | 09:17 |
amorin | e.g. you mentioned booting 3k instances and having scheduler / placement issues. Is it because you asked for those 3k instances in one shot? | 09:17 |
ttx | songwenping: still here? | 09:19 |
songwenping | yeah | 09:20 |
songwenping | i am looking for the mail. | 09:20 |
songwenping | but still cannot find it :( | 09:20 |
ttx | ah, let me link | 09:20 |
ttx | #link https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/ISIG5TG4DYCTDTP4ZJNJFYCSUVYMX5BT/ | 09:21 |
ttx | you will see both questions there ^ | 09:21 |
songwenping | amorin, yes, we send requests to create 3k instances in one shot. | 09:21 |
amorin | that makes sense then, that's an unusual use case, amazing! | 09:22 |
ttx | ideally you would reply by email to the mailing-list again, addressing mnaser's and amorin's questions | 09:22 |
ttx | that way everyone else can see the answers | 09:22 |
amorin | yes, sounds good to me also | 09:22 |
ttx | songwenping: would that work for you? | 09:23 |
songwenping | ttx, could you please forward the mail to me? | 09:23 |
amorin | it's weird you did not receive it, maybe check your spam box? | 09:24 |
ttx | can you see them at the link I just posted? | 09:24 |
ttx | https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/ISIG5TG4DYCTDTP4ZJNJFYCSUVYMX5BT/ | 09:24 |
songwenping | i can see at the link | 09:24 |
ttx | ok perfect | 09:24 |
ttx | #action songwenping to reply to the questions on the mailing-list | 09:25 |
ttx | amorin: is there anything new in the report that could be documented in the large-scale sig doc? | 09:25 |
songwenping | but i cannot reply at the link. | 09:26 |
amorin | I believe yes, we can have something new to add to the doc | 09:26 |
ttx | I'll forward you both emails now | 09:27 |
amorin | however, we need to explain your use-case correctly also, because, e.g. max_connections = 100000 is unusual and maybe counterproductive | 09:27 |
songwenping | ttx, thanks. | 09:27 |
amorin | the rabbit config you did also, I need to understand the details of it | 09:28 |
amorin | maybe your situation could also be improved if you switch to quorum queues | 09:28 |
amorin | I don't know for now to be honest | 09:28 |
amorin | let's continue the mail thread | 09:28 |
ttx | OK emails forwarded... let me know if you receive them :) | 09:29 |
songwenping | amorin, we do not use quorum queues. | 09:29 |
ttx | OK let's continue the discussion on the mailing-list and we'll see if we can extract a few things from the story to add to the doc | 09:31 |
amorin | yup | 09:31 |
ttx | #topic Next meeting(s) | 09:31 |
songwenping | ttx, actually, not received yet. | 09:31 |
ttx | Normally the next meeting would be on July 17, but I won't be around. Should we skip for summer and do next one September 18? | 09:31 |
ttx | songwenping: sent to the inspur.com address you used to post | 09:32 |
songwenping | amorin, i will add the details of the rabbit optimization to the etherpad. | 09:32 |
ttx | great! | 09:32 |
amorin | thanks | 09:32 |
songwenping | ttx, received just now, thanks. | 09:33 |
amorin | july 17 I will also be off | 09:33 |
ttx | OK so that one is a skip for sure | 09:33 |
amorin | we can maybe skip meetings this summer, agreed | 09:33 |
ttx | We could keep the August 21 one if you are around | 09:34 |
amorin | I should be there | 09:35 |
ttx | OK let's keep it on the agenda | 09:35 |
ttx | #info next meeting, August 21 on IRC | 09:35 |
ttx | #topic Open discussion | 09:35 |
ttx | Anything else we should cover today? | 09:36 |
amorin | maybe stan you were there to talk about something? | 09:36 |
ttx | stan: still around? | 09:38 |
amorin | nothing more on my side | 09:39 |
ttx | alright then | 09:39 |
ttx | #endmeeting | 09:39 |
opendevmeet | Meeting ended Wed Jun 19 09:39:44 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 09:39 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/large_scale_sig/2024/large_scale_sig.2024-06-19-09.00.html | 09:39 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/large_scale_sig/2024/large_scale_sig.2024-06-19-09.00.txt | 09:39 |
opendevmeet | Log: https://meetings.opendev.org/meetings/large_scale_sig/2024/large_scale_sig.2024-06-19-09.00.log.html | 09:39 |
amorin | thank you! | 09:39 |
songwenping | nothing from my side | 09:39 |
songwenping | bye | 09:40 |
ttx | Thanks amorin and songwenping for participating! | 09:40 |