Friday, 2024-06-21

*** bauzas_ is now known as bauzas 00:07
*** bauzas_ is now known as bauzas 04:02
*** ralonsoh_ is now known as ralonsoh 06:20
*** bauzas_ is now known as bauzas 07:40
<jopdorp_> Hi 07:55
<jopdorp_> I was wondering if anyone has experience with H100 HGX systems, or maybe DGX, and GPU passthrough to KVM VMs in nova 07:56
<jopdorp_> We're seeing an issue where the nvidia-smi command hangs 07:56
<jopdorp_> We do install fabricmanager, and I've tried some different configs of it. 07:57
<jopdorp_> We're trying to pass one GPU per VM, on a host that has 8x H100 SXM GPUs 07:57
<jopdorp_> nv_open_q takes 100% of a single CPU core when nvidia-smi is invoked and hangs 07:57
*** bauzas_ is now known as bauzas 08:14
<sean-k-mooney> jopdorp_: even if our downstream customers asked that question, we would have to direct them to NVIDIA support 11:45
<sean-k-mooney> what nvidia-smi does internally is really a black box that only they understand 11:46
<opendevreview> sean mooney proposed openstack/nova master: refactor cmd dir for non eventlet console_scripts https://review.opendev.org/c/openstack/nova/+/904424 14:02
*** bauzas_ is now known as bauzas 15:03
*** ykarel is now known as ykarel|away 16:30
<opendevreview> sean mooney proposed openstack/nova master: update nova-next to use ubuntu 24.04 https://review.opendev.org/c/openstack/nova/+/922492 16:47
<opendevreview> sean mooney proposed openstack/nova master: Remove nova debugger funcitonality https://review.opendev.org/c/openstack/nova/+/922496 17:35
<opendevreview> sean mooney proposed openstack/nova master: Add IO Thread Pool Executor https://review.opendev.org/c/openstack/nova/+/922497 17:35
<opendevreview> sean mooney proposed openstack/nova master: remove eventlet.tpool https://review.opendev.org/c/openstack/nova/+/905287 18:03
*** bauzas_ is now known as bauzas 20:11
