SGX Explained

  • ifconfig ib0 192.168.*
  • change ~/.bashrc so that setup is conditional on head vs. non-head node.
if [ -f /dev/ ]; then  # head node
setup eth .xxx ib() .xxx
fi
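  • 100 machines, NFS
  • Slurm starts running jobs as soon as it is up; machines boot one by one; tasks are reused
  • a script allocates start/stop, so machines can take turns sleeping and take turns running Slurm
  • MIG: two sets of startup commands / rmmod nvidia*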
  • Prometheus + Grafana for all chassis
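
The head vs. non-head branch in ~/.bashrc noted above could be sketched roughly as follows. The marker path `/etc/cluster-head` and the function name `node_role` are hypothetical, introduced only for illustration; the notes leave the actual test path and the eth/ib addresses elided, so they stay as placeholders here.

```shell
# Sketch only: decide the node role from the presence of a marker file.
# HEAD_MARKER is a hypothetical path; use whatever exists only on the head node.
HEAD_MARKER="${HEAD_MARKER:-/etc/cluster-head}"

node_role() {
    # Print "head" if the marker file exists, otherwise "compute".
    if [ -f "${1:-$HEAD_MARKER}" ]; then
        echo head
    else
        echo compute
    fi
}

case "$(node_role)" in
    head)    : ;;  # head-node-only setup (eth/ib addresses) goes here
    compute) : ;;  # compute-node setup goes here
esac
```

Keeping the test in one function means the same ~/.bashrc can be shared over NFS by every node, with only the marker file differing per machine.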

Performance

echo 2 > /proc/sys/vm/overcommit_memory
ulimit -a  # list all limits; raise the ones needed to unlimited
echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor > /dev/null
[ -e "/shared/opt/home/q-e" ] || sudo mount 10.0.0.8:/mnt/exports/shared/home /shared/home
...
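
A shell redirect cannot target several files at once, so the governor has to be written to each CPU's file individually. One way to do that is a small loop, sketched below. The function name `set_governor` and its optional second argument (an alternative root directory, handy for a dry run) are assumptions for illustration, not part of the original notes.

```shell
# Sketch only: write a cpufreq governor to every CPU's scaling_governor file.
# set_governor and its second argument are hypothetical, for illustration.
set_governor() {
    gov="$1"
    root="${2:-/sys/devices/system/cpu}"
    for f in "$root"/cpu*/cpufreq/scaling_governor; do
        # Skip unmatched globs and files we cannot write (e.g. not running as root).
        [ -w "$f" ] || continue
        echo "$gov" > "$f"
    done
}

# Usage (needs root on a real system):
# set_governor performance
```

The `-w` check makes the function a silent no-op on machines without cpufreq support, which may or may not be what you want on a mixed cluster.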