Machine Overview
Currently, we are using the SuperMicro 4124GS-TNR server.
- Product Spec
- AS-4124GS-TNR User manual
- H12DSG-O-CPU Motherboard manual
- Information for Lot 9 of ErP (Ecodesign)
CPU: Epyc 7742
AMD states that theoretical double-precision floating-point performance can be calculated as: DP theoretical peak = #real_cores * 8 DP flop/clk * core frequency. For a 2-socket system: 2 * 64 cores * 8 DP flop/clk * 2.2 GHz = 2252.8 GFLOPS. This counts each FMA as two flops.
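The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the formula as quoted on this page; the function name `peak_dp_gflops` is made up for illustration, and the inputs (8 DP flop/clk, 2.2 GHz) are taken from the text above rather than from a datasheet.

```python
def peak_dp_gflops(sockets: int, cores_per_socket: int,
                   dp_flops_per_clk: int, freq_ghz: float) -> float:
    """Theoretical peak in GFLOPS: sockets * cores * flop/clk * GHz."""
    return sockets * cores_per_socket * dp_flops_per_clk * freq_ghz


# Values as quoted in the text above (assumed, not measured).
peak = peak_dp_gflops(sockets=2, cores_per_socket=64,
                      dp_flops_per_clk=8, freq_ghz=2.2)
print(f"{peak:.1f} GFLOPS")
```

Changing `dp_flops_per_clk` or `freq_ghz` shows how sensitive the headline number is to those two assumptions.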
GPU
RDMA
a1:00.0 Infiniband controller [0207]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
a1:00.1 Infiniband controller [0207]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
- Official Documentation
- Communication protocols (queue pair types) for IB cards https://www.rdmamojo.com/2013/06/01/which-queue-pair-type-to-use/
- Using OpenMPI http://scc.ustc.edu.cn/zlsc/user_doc/html/mpi-application/mpi-application.html
RAID
e6:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller [1b4b:9230] (rev 11) (prog-if 01 [AHCI 1.0])
Official brief: https://www.marvell.com/content/dam/marvell/en/public-collateral/storage/marvell-storage-88se92xx-product-brief-2012-04.pdf
The easiest way to configure the RAID controller is to press Ctrl+M during boot.
To boot a system from the RAID, use Legacy mode. If you switch to UEFI-only mode, the controller can no longer be found, even after you switch back. To fix this, see Supermicro FAQ Entry
Firmware
The firmware can be flashed; see Marvell 9230 Firmware Updates and similar resources. Our current firmware is 1070 (BIOS OPROM version). To flash different firmware, you might need to make a FreeDOS bootable disk.
Note: Back up before flashing!
Many links to firmware or utilities are broken. Station Drivers may still work. Also refer to the Marvell 92xx A1 Firmware Image Repository, which has a full collection of firmware images.
Supermicro's firmware is listed on the official site but cannot be downloaded there. Try downloading it from http://members.iinet.net.au/~michaeldd/.
NVMe
The drives are installed on an ASUS Hyper M.2 x16 Card V2: https://www.asus.com/us/Motherboards-Components/Motherboards/Accessories/HYPER-M-2-X16-CARD-V2/.
21:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
22:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
23:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
24:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
⚠️ The PCIe slot holding the NVMe card must be configured as 4x4x4x4 (bifurcation) for the system to recognize the drives correctly.
The card can be unreliable. If it does not work correctly, ask in Slack.
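A quick way to check that all four drives are visible is to count the NVMe controllers (PCI class `0108`) in `lspci -nn` output. The sketch below parses text in the format shown above; the helper name `nvme_addresses` and the sample string are illustrative only, and the regular expression assumes the default `lspci -nn` line layout.

```python
import re

NVME_CLASS = "0108"  # PCI class code for "Non-Volatile memory controller"

# Matches e.g. "21:00.0 Non-Volatile memory controller [0108]: ..."
# An optional 4-digit PCI domain prefix ("0000:") is also accepted.
LINE_RE = re.compile(
    r"^(?P<addr>(?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s"
    r".*?\[(?P<cls>[0-9a-f]{4})\]:"
)


def nvme_addresses(lspci_text: str) -> list[str]:
    """Return the PCI addresses of all NVMe controllers in `lspci -nn` text."""
    addrs = []
    for line in lspci_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group("cls") == NVME_CLASS:
            addrs.append(m.group("addr"))
    return addrs


# Sample taken from the listings on this page.
SAMPLE = """\
21:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
22:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
23:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
24:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
e6:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9230 PCIe SATA 6Gb/s Controller [1b4b:9230] (rev 11) (prog-if 01 [AHCI 1.0])
"""

found = nvme_addresses(SAMPLE)
print(len(found), found)
```

On a live machine, the same function could be fed `subprocess.run(["lspci", "-nn"], capture_output=True, text=True).stdout`; fewer than four addresses would suggest the slot is not set to 4x4x4x4.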
Other links
RAID Controller
MegaRAID
LSI_SAS_EmbMRAID_SWUG.pdf (2006)
ASrock
Win-Raid
Help-Problem-to-flash-the-Marvel-SE-card-resolve
Syba-SI-PEX-PCIe-Card-with-Marvell-SATA-Controller
http://members.iinet.net.au/~michaeldd/CDR-A1-UP_1.01_for_Intel_A1_UP_platform.zip
Supermicro SuperServer BIOS change can cause 960 NVMe drives to disappear
https://tinkertry.com/supermicro-superserver-bios-change-can-cause-960-pro-and-evo-to-hide-heres-the-fix