*** gmann_afk is now known as gmann | 00:35 | |
*** pmannidi|AFK is now known as pmannidi | 00:42 | |
*** amoralej|off is now known as amoralej | 08:25 | |
dtantsur | morning ironic | 10:28 |
holtgrewe | dtantsur: good morning, the crowd here appears to be thinning out with the end of the year approaching | 10:32 |
dtantsur | yep, exactly :) | 10:32 |
dtantsur | holtgrewe: do you have a winter break? | 10:32 |
holtgrewe | dtantsur: usually, yes, but this year not | 10:35 |
dtantsur | oh | 10:35 |
* dtantsur is wondering if noon is a good time for breakfast | 10:50 | |
janders | good morning / afternoon dtantsur holtgrewe and Ironic o/ | 11:02 |
holtgrewe | o/ | 11:05 |
dtantsur | hey janders | 11:10 |
*** sshnaidm|afk is now known as sshnaidm | 11:26 | |
holtgrewe | I have assigned a capability "node:NODENAME" to my baremetal host. Now I want to target this particular baremetal host with nova. How would the scheduler hints look like? I'm trying to follow https://fatmin.com/2018/08/20/openstack-mapping-ironic-hostnames-to-nova-hostnames/ | 12:28 |
dtantsur | holtgrewe: this is tripleo-specific stuff. I don't think you can just use it on a generic Ironic. | 12:32 |
holtgrewe | dtantsur: :-( | 12:32 |
dtantsur | I'm curious why you need to target a specific host with nova. | 12:32 |
holtgrewe | And I'm on Kolla/Kayobe. | 12:32 |
dtantsur | This is considered an anti-pattern. | 12:32 |
holtgrewe | I want to reproduce my old xCAT based HPC deployment. | 12:33 |
dtantsur | If you have admin rights, I think you can ask nova for a specific hypervisor? | 12:33 |
dtantsur | each Ironic node becomes its own hypervisor in Nova | 12:33 |
dtantsur | I think hypervisor ID == node UUID | 12:33 |
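For reference, the mapping dtantsur describes can be checked directly; a minimal sketch, assuming admin credentials and python-openstackclient with the baremetal plugin installed:

```shell
# With the Ironic driver, each bare metal node is exposed as its own Nova
# hypervisor, so the two lists below should line up by UUID.
openstack hypervisor list
openstack baremetal node list -f value -c UUID -c Name
```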
holtgrewe | OK, I'm admin | 12:34 |
holtgrewe | life is hard, root passwords tend to help from time to time ;-) | 12:34 |
dtantsur | :) | 12:34 |
janders | see you tomorrow Ironic o/ | 12:55 |
janders | (and to those who are about to start their Christmas Holiday - Merry Christmas and Happy New Year! ) | 12:56 |
janders | I'll be back for one more day tomorrow | 12:56 |
holtgrewe | Is there a way to show the current nova configuration including the scheduler_filter section including applied defaults? | 13:03 |
dtantsur | holtgrewe: I don't know. Ironic in debug mode logs its configuration on start-up, but I'm not sure if Nova does it. | 13:09 |
holtgrewe | dtantsur: thanks | 13:13 |
*** amoralej is now known as amoralej|lunch | 13:16 | |
holtgrewe | dtantsur: you are right, this one is specific to https://github.com/openstack/tripleo-common/blob/master/tripleo_common/filters/capabilities_filter.py | 13:42 |
dtantsur | holtgrewe: maybe https://docs.openstack.org/nova/xena/admin/availability-zones.html#using-availability-zones-to-select-hosts can help? | 13:48 |
holtgrewe | dtantsur: that looks very helpful | 13:49 |
dtantsur | I remember there was a way to target a hypervisor specifically, but this is all I could find | 13:52 |
holtgrewe | I'll ask in #openstack-kolla what they think is the best way | 13:53 |
dtantsur | the old docs have this https://docs.openstack.org/nova/train/admin/availability-zones.html#using-explicit-host-and-or-node | 13:53 |
holtgrewe | I could inject something like the tripleo thing | 13:53 |
dtantsur | I'm not sure if it's still supported or not. worth trying? | 13:53 |
holtgrewe | dtantsur: it's still in the docs | 13:53 |
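For the record, the explicit host/node form from the linked docs would look roughly like this; a hedged sketch, where NODE_UUID and the flavor, image, and network names are placeholders:

```shell
# The availability zone argument accepts ZONE:HOST:NODE; leaving HOST empty
# and passing the Ironic node UUID as NODE asks Nova to land the instance on
# that specific node (admin only, and subject to this still being supported).
openstack server create \
  --flavor baremetal \
  --image my-image \
  --network provisioning-net \
  --availability-zone nova::NODE_UUID \
  my-instance
```

Newer Nova/OSC releases also expose a --hypervisor-hostname option on server create (compute API microversion 2.74+), which may be a cleaner alternative.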
* dtantsur does not use nova nowadays | 13:53 | |
holtgrewe | I'm deploying via the openstack collection... which only has this availability zone... | 13:55 |
TheJulia | Admin users should be able to request instances to be deployed to specific hypervisors in nova | 13:59 |
holtgrewe | TheJulia: that functionality appears not to be exposed via the Ansible collections. Anyway, mgoddard convinced me to see the light and define one flavor per host generation and, rather than the HPC hammer/nail approach, go for a bare metal cloud. | 14:00 |
TheJulia | Heh, okay. I mean… if you were to propose a patch to the ansible modules… we might know some people :) | 14:02 |
TheJulia | Flavor per host doesn’t sound that ideal either, but that is the far friendlier option to users | 14:03 |
holtgrewe | TheJulia: thanks for the offer. | 14:03 |
TheJulia | Anyway, time to wake up, make coffee, and load up the car | 14:03 |
holtgrewe | TheJulia: I guess all I needed was the reminder that one should try to use one's shiny new tools in their idiomatic ways. | 14:03 |
TheJulia | It does often help | 14:04 |
mgoddard | TheJulia: I think holtgrewe means flavor per host type/generation rather than per host | 14:05 |
holtgrewe | yes | 14:06 |
TheJulia | Oh, good | 14:06 |
* TheJulia hears tons of coyotes from her kitchen and wonders if it is because she is on IRC at the moment | 14:07 | |
holtgrewe | And here is my brand new flavor bm.2020-11-c6420 ... | 14:10 |
TheJulia | Yay! | 14:11 |
* TheJulia makes coffee otherwise the drive to flagstaff today will not be fun | 14:11 | |
timeu_ | holtgrewe: FYI we use traits or CUSTOM_RESOURCE on the flavors to target specific node types (cpu, memory, gpu, etc) for our test baremetal HPC cluster. works quite well. | 14:27 |
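A rough sketch of the resource-class approach timeu_ mentions, for anyone following along; the class and flavor names are illustrative, and the CUSTOM_* string follows the usual uppercase/underscore conversion of the node's resource class:

```shell
# Tag the node with a custom resource class.
openstack baremetal node set --resource-class bm.2020-11-c6420 <node>

# Create a flavor that requests exactly one unit of that class and zeroes out
# the virtual CPU/RAM/disk resources so scheduling is driven by the class.
openstack flavor create --vcpus 1 --ram 1024 --disk 10 bm.2020-11-c6420
openstack flavor set bm.2020-11-c6420 \
  --property resources:CUSTOM_BM_2020_11_C6420=1 \
  --property resources:VCPU=0 \
  --property resources:MEMORY_MB=0 \
  --property resources:DISK_GB=0

# Traits work similarly, e.g. requiring a node trait on the flavor:
#   openstack flavor set bm.2020-11-c6420 --property trait:CUSTOM_GPU=required
```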
dtantsur | good morning TheJulia | 14:27 |
holtgrewe | timeu_: thanks for the info, I'm actually dealing with ~8 delivery batches of hardware here, having a flavor for each works well enough I think | 14:28 |
holtgrewe | I need to put some labels statically into the slurm.conf anyway. | 14:28 |
* holtgrewe is bumping the quotas for the hpc project to 999,999,999 | 14:29 | |
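In case it helps anyone reading later, bumping project quotas like that is a one-liner; a hedged sketch, with "hpc" as a placeholder project name (-1 would also mean unlimited for most quotas):

```shell
openstack quota set --instances 999999999 --cores 999999999 --ram 999999999 hpc
```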
timeu_ | yeah we render our slurm config based on the heat output + ansible dynamically for the various node types. works quite well | 14:29 |
*** amoralej|lunch is now known as amoralej | 14:29 | |
holtgrewe | I'm not yet at the point where I am going down the rabbithole of heat | 14:29 |
* dtantsur feels like a lot of experience exchange can be happening | 14:30 | |
timeu_ | we have been using heat but we are considering switching to terraform | 14:30 |
timeu_ | because it gives you a bit more transparency into what is happening when you scale the stack up/down | 14:30 |
timeu_ | in general it works well, however writing the heat templates is much more verbose/cumbersome than HCL | 14:31 |
holtgrewe | timeu_: for now I'm replacing proxmox for vms and xcat for bare metal with openstack | 14:32 |
holtgrewe | I have existing Ansible playbooks for the old infrastructure and now want to do that switch first | 14:32 |
holtgrewe | next up would be looking at having neutron configure the switches (I understand that's possible) | 14:33 |
timeu_ | yeah sounds like a good plan | 14:33 |
holtgrewe | My use case is like ... one HPC | 14:33 |
holtgrewe | ;-) | 14:33 |
timeu_ | yeah that's basically what we are planning to do for the next iteration of our HPC system. Right now it's fully virtualized on OpenStack but the goal is to move it to a baremetal deployment on OpenStack. | 14:34 |
timeu_ | we do also integrate ironic with neutron but we are using cisco SDN | 14:34 |
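For the switch-configuration idea above, one commonly used option with plain Neutron (as opposed to a vendor SDN like timeu_'s Cisco setup) is the networking-generic-switch ML2 mechanism driver; a hedged sketch of what its config might look like, with the switch name, address, credentials, and device_type all placeholders:

```shell
# Sketch of the generic-switch bits; in practice these are merged into
# ml2_conf.ini (adding genericswitch to the existing mechanism_drivers line)
# on the host running neutron-server.
cat > /tmp/genericswitch-example.conf <<'EOF'
[ml2]
mechanism_drivers = openvswitch,genericswitch

[genericswitch:leaf-switch-1]
device_type = netmiko_cisco_ios
ip = 192.0.2.10
username = admin
password = secret
EOF
```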
dtantsur | have you considered doing a talk for the opendev summit next year? | 14:35 |
holtgrewe | Actually, once I have a sizeable portion of nodes ported over, iDRAC/BIOS settings cleaned up, everything booting from UEFI, ... I need to get my all-NVME CephFS up and running and tuned so I can office-space-printer-scene my HDD based GPFS system | 14:35 |
timeu_ | but it does work quite well, including trunk ports, so we can get rid of 200 1G cables and just go with the 100G ones (should also improve air flow in the racks); the 1G cables/NICs flap like crazy | 14:35 |
holtgrewe | timeu_: OK... I don't understand much about SDN | 14:35 |
timeu_ | make sure to record it then ;-) | 14:36 |
holtgrewe | timeu_: we have 2x10GbE/2x40GbE network in the old part of the cluster, old blade enclosures | 14:36 |
holtgrewe | and 2x25GbE/2x100GbE in the new part of the cluster | 14:36 |
holtgrewe | I currently have 10MB/sec throughput of the HDD file system | 14:36 |
timeu_ | quite a bit of connectivity there ;-) | 14:37 |
holtgrewe | with the NVME system I'm pretty certain the network will be the bottleneck | 14:37 |
holtgrewe | yeah, life science sure wants their I/O | 14:37 |
holtgrewe | CPU cycles per GB is quite low compared to things like FEM simulations | 14:37 |
timeu_ | yeah I think with NVME 100G is required. | 14:37 |
holtgrewe | some people might say that it's essentially "gunzip -c | awk | gzip" performance wise ;-) | 14:38 |
holtgrewe | And I'm not even looking at the imaging people who either do 100M images of 4kb or large HDF files with only random seeks. | 14:38 |
timeu_ | yeah we also have those, but it's true that an all-flash parallel filesystem gives you the biggest performance boost across the board for various HPC workloads | 14:39 |
timeu_ | but yeah, we also have all kinds of edge cases with regard to I/O workloads, including the thousands of small temporary files in one folder that are usually poison for a parallel filesystem ;-) | 14:40 |
timeu_ | I can feel the pain | 14:40 |
holtgrewe | timeu_: yes... if I can move my 24 recent nodes across data centers early next year, I can connect modern CPUs with 2x25GbE to the Ceph system on one switch. I'm interested in how the IO500 benchmark looks there | 14:41 |
holtgrewe | Well, rather thousands of small files on NVME than on HDD ;-) | 14:41 |
holtgrewe | even the GPFS metadata lives on HDDs, fast spinning ones, but still | 14:41 |
timeu_ | yeah, but we also saw our all-flash BeeGFS cluster crawl due to metadata operations in certain cases, though it's definitely better than rust ;-) | 14:42 |
holtgrewe | ;-) | 14:42 |
holtgrewe | What's your experience with BeeGFS? When I researched it, it always felt more like a /scratch than a /home | 14:43 |
* holtgrewe wonders whether that's actually OT here | 14:43 | |
timeu_ | yeah it's definitely more than a /scratch although it has been very reliable. Almost no crashes but we have a relatively small setup (~300TB, 12 nodes). | 14:45 |
holtgrewe | timeu_: sounds good | 14:45 |
timeu_ | there are features such as mirroring (metadata + data) but they come with performance overhead, so we don't use them | 14:45 |
holtgrewe | I've used two clusters ... 8 and 12 years ago that had a dying ... FraunhoferFS, as it was called back then | 14:46 |
holtgrewe | But that's ages ago | 14:47 |
holtgrewe | it's good to hear that things have improved | 14:47 |
opendevreview | Merged openstack/ironic-python-agent stable/ussuri: Re-read the partition table with partx -a https://review.opendev.org/c/openstack/ironic-python-agent/+/821785 | 15:38 |
opendevreview | Merged openstack/ironic-python-agent stable/ussuri: Re-read the partition table with partx -a, part 2 https://review.opendev.org/c/openstack/ironic-python-agent/+/821786 | 15:38 |
opendevreview | Verification of a change to openstack/ironic-python-agent stable/train failed: Re-read the partition table with partx -a https://review.opendev.org/c/openstack/ironic-python-agent/+/821791 | 15:38 |
dmellado | dtantsur: hi again man xD | 16:17 |
dtantsur | o/ | 16:18 |
dmellado | I'm hitting an issue which I'm not sure about | 16:18 |
dmellado | Failed to prepare to deploy: Could not link image http://192.168.200.8:8082/ipa.kernel from /var/lib/ironic/master_images/1b75d649-9e6d-5c0a-966c-36b1a84264fc.converted to /httpboot/ef4bbe4a-3ecd-4bae-888e-52c406707206/deploy_kernel, error: [Errno 18] Invalid cross-device link: '/var/lib/ironic/master_images/1b75d649-9e6d-5c0a-966c-36b1a84264fc.converted' -> | 16:18 |
dmellado | '/httpboot/ef4bbe4a-3ecd-4bae-888e-52c40 | 16:18 |
dmellado | 6707206/deploy_kernel' | 16:18 |
dmellado | does this ring a bell? | 16:18 |
dmellado | if not, I'll try debugging but it may be one of those 'oh, that' | 16:18 |
dtantsur | dmellado: yeah. something we should likely document: our caches work via hardlinking. | 16:19 |
dtantsur | so if you have /var and /httpboot on different devices, it work | 16:19 |
dtantsur | egh | 16:19 |
dtantsur | won't work | 16:19 |
dtantsur | is it the case? | 16:19 |
dmellado | let me check | 16:20 |
dmellado | https://paste.openstack.org/show/boRymb2crVqUTDjB73ds/ | 16:21 |
dmellado | I don't see anything extraordinary there | 16:21 |
dmellado | maybe I should try to hardlink that myself | 16:22 |
dtantsur | yeah, give it a try | 16:22 |
dmellado | aha, so it failed as well, interesting | 16:23 |
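The EXDEV error above is the classic symptom of the two directories living on different filesystems; a quick hedged way to confirm and reproduce it:

```shell
# If the device numbers / mount points differ, the paths are on different
# filesystems and a hard link between them fails with errno 18 (EXDEV).
stat -c '%d %m %n' /var/lib/ironic/master_images /httpboot

# Reproduce the failure directly with a scratch file.
touch /var/lib/ironic/master_images/.linktest
ln /var/lib/ironic/master_images/.linktest /httpboot/.linktest
```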
dtantsur | honestly, we should probably move Ironic stuff to /var/lib completely | 16:23 |
dmellado | ++ | 16:23 |
dtantsur | I guess these /tftpboot and /httpboot are historical | 16:23 |
dmellado | I assume a soft link won't work | 16:24 |
dtantsur | dmellado: no, it won't because of the way we deal with caches (we rely on being able to check the link count) | 16:24 |
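A tiny illustration of why the link count matters here (hedged, paths as above): the cache can tell whether a master image is still referenced by counting its hard links, which a symlink would not contribute to.

```shell
# %h prints the hard-link count: 1 means only the cached master copy exists,
# higher values mean deployed copies are still linked to it.
stat -c '%h %n' /var/lib/ironic/master_images/*
```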
dmellado | how about moving http_boot_folder to /var/lib/httpboot | 16:25 |
dmellado | ? | 16:25 |
dtantsur | this is what I'm pondering, yes | 16:25 |
dtantsur | you can try it. if it works, we should probably change the bifrost's default | 16:25 |
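A hedged sketch of trying that override with a playbook-based Bifrost install (the http_boot_folder variable name comes from the discussion above; the inventory path and target directory may differ per setup):

```shell
cd bifrost/playbooks
ansible-playbook -i inventory/target install.yaml \
  -e http_boot_folder=/var/lib/ironic/httpboot
```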
dmellado | checking | 16:27 |
dtantsur | I'll prepare a patch meanwhile | 16:29 |
dmellado | any way I can check its status, besides 'deploying'? | 16:33 |
dtantsur | its = which? | 16:34 |
dmellado | the server one | 16:34 |
dmellado | now it says provisioning state = wait call-back | 16:34 |
dtantsur | dmellado: ironic-conductor logs, the virtual console of the machine (if you have access) | 16:34 |
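For completeness, the node's own view of the deploy is also available from the CLI; a minimal sketch, with NODE as a placeholder and the journal unit name depending on how Ironic was installed:

```shell
# Provision state and the last recorded error for the node.
openstack baremetal node show NODE -f value -c provision_state -c last_error

# Follow the conductor log (single "ironic" service on a Bifrost-style install).
journalctl -u ironic -f
```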
dmellado | hmmm it seems that it gets stuck waiting on pxe | 16:39 |
dmellado | but that workaround re: the hardlink worked | 16:40 |
opendevreview | Dmitry Tantsur proposed openstack/bifrost master: Move /{tftp,http}boot to /var/lib/ironic https://review.opendev.org/c/openstack/bifrost/+/822743 | 16:41 |
dtantsur | dmellado: ^^ | 16:41 |
dmellado | hmmm stupid question | 16:49 |
dmellado | it tried to pxe | 16:50 |
dmellado | then it gets stuck waiting on net0, which is the mac address I set up in baremetal.json | 16:50 |
dmellado | that should be it, shouldn't it? | 16:50 |
dmellado | it responds to ipmi commands and reboots | 16:50 |
dmellado | on the ironic side it gets stuck on wait call-back | 16:51 |
dtantsur | dmellado: so, it doesn't DHCP? doesn't get the ramdisk? or? | 17:02 |
dmellado | it doesn't dhcp | 17:03 |
dmellado | may be some weird config on the host, though | 17:03 |
dmellado | network_interface: does it refer to the network interface it would use for dhcp | 17:04 |
dmellado | or the bmc one? | 17:04 |
dmellado | dtantsur: | 17:05 |
dmellado | ? | 17:06 |
dtantsur | dmellado: it's the host interface you're using for DHCP and other boot business | 17:06 |
dmellado | gotcha, then it was wrong | 17:06 |
dtantsur | not the BMC one | 17:06 |
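To spell that out: network_interface should name the interface on the Bifrost host that faces the provisioning network the nodes DHCP/PXE boot from, not the BMC/IPMI network. A hedged example of overriding it on reinstall (eno2 is a placeholder):

```shell
ansible-playbook -i inventory/target install.yaml \
  -e network_interface=eno2
```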
dtantsur | docs updates are very welcome | 17:06 |
dmellado | I'll put that there, as at least I'm finding some issues (and being annoying) | 17:06 |
dmellado | hope that at least it helps fix some stuff around, dtantsur ;) | 17:07 |
dtantsur | it helps a lot! | 17:08 |
dtantsur | especially the docs, it's hard to see that something is not obvious after 7 years on the project | 17:09 |
dmellado | ++ | 17:09 |
dmellado | d'oh kernel panic??? | 17:14 |
dtantsur | Oo | 17:15 |
dtantsur | I wonder which IPA image it is using by default | 17:15 |
dmellado | If you don't mind I may ping you tomorrow morning | 17:15 |
dmellado | and if you have some time to do a quick meet | 17:16 |
dtantsur | I'm on a break starting tomorrow | 17:16 |
dmellado | as it's probably me doing something stupid | 17:16 |
dmellado | oh, then after xmas | 17:16 |
dtantsur | dmellado: try setting use_tinyipa=false in bifrost if not already | 17:16 |
* dtantsur is curious why we even default to that... | 17:16 | |
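A hedged sketch of applying that suggestion with a playbook-based install (as noted below, a reinstall is needed for the change to take effect):

```shell
# Redeploy Bifrost with the full DIB-built IPA ramdisk instead of TinyIPA.
ansible-playbook -i inventory/target install.yaml \
  -e use_tinyipa=false
```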
dtantsur | dmellado: do you see which operating system it's trying to boot? CentOS or TinyCoreLinux? | 17:17 |
dmellado | just a huge kernel panic | 17:18 |
dmellado | I'll retry and be back later or my wife would kill me xD | 17:18 |
dtantsur | fair :D | 17:18 |
dmellado | do I need to reinstall | 17:18 |
dmellado | bifrost? | 17:19 |
dmellado | or would these changes be picked up on the fly, so I can just re-enroll and redeploy? | 17:19 |
dmellado | dtantsur: ? | 17:19 |
dmellado | off I go, will read later xD | 17:19 |
dmellado | thanks! | 17:19 |
dtantsur | reinstalling will be needed | 17:19 |
dmellado | ack | 17:19 |
opendevreview | Dmitry Tantsur proposed openstack/bifrost master: Change the default image to a DIB-built one https://review.opendev.org/c/openstack/bifrost/+/822751 | 17:22 |
dtantsur | dmellado: ^^ | 17:22 |
dtantsur | on this positive note, I'm wishing everyone a great rest of the week, great holidays for those celebrating, and see you in January! | 17:22 |
JayF | Have a great holiday Dmitry (& others)! | 17:32 |
*** amoralej is now known as amoralej|off | 17:34 |