WIEN2k calculates the electronic structure of solids using density functional theory (DFT). It is based on one of the most accurate schemes for band-structure calculations: the full-potential (linearized) augmented plane-wave ((L)APW) + local orbitals (lo) method. Within DFT, either the local (spin) density approximation (LDA) or the generalized gradient approximation (GGA) can be used. WIEN2k is an all-electron scheme that includes relativistic effects.
Basic information
- Name: wien2k
- Hardware environment: Shanghai / SuSE 10u2
- Software version: Ver: wien2k09
- Compilers: ifort/icc
- Features: equilibrium structures, structure optimization
Features
- X-ray structure factors
- Bader's "atoms in molecules" analysis
- total energy and forces
- equilibrium structures and structure optimization
- molecular dynamics
- electric field gradients, isomer shifts, hyperfine fields
- spin polarization (ferromagnetic and antiferromagnetic structures), spin-orbit coupling
- X-ray emission and absorption spectra, electron energy loss spectra
- optical properties of solids
- Fermi surfaces
- LDA, GGA, meta-GGA, LDA+U, orbital polarization
- centrosymmetric and non-centrosymmetric lattices, with 230 built-in space groups
The graphical user interface w2web (WIEN to WEB) and the user's guide provide a friendly environment: input files are easy to generate and modify, and the interface helps the user run common tasks (electron density, density of states, etc.).
Installation and setup
Hardware environment
Shanghai / SuSE 10u2
Software version
Ver: wien2k09
Install the Intel compilers
ifort/icc
Ver: 11.083
Install Intel MKL
Ver: 10.1.2.024
Install MPICH v1.2.7
./configure -c++=icpc -cc=icc -f77=ifort -f90=ifort --prefix=/home/soft/mpi/mpich-1.2.7-intel
make
make install
Set the environment variables
vi ~/.bashrc
Append the following:
##############MPICH###########
export PATH=/home/soft/mpi/mpich-1.2.7-intel/bin:$PATH
################intel compiler###################
. /home/soft/intel/Compiler/11.0/083/bin/intel64/ifortvars_intel64.sh
. /home/soft/intel/Compiler/11.0/083/bin/intel64/iccvars_intel64.sh
###############intel mkl###################
export LD_LIBRARY_PATH=/home/soft/intel/mkl/10.1.2.024/lib/em64t/:$LD_LIBRARY_PATH
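After re-sourcing .bashrc, a quick sanity check that the toolchain is reachable can look like the sketch below; `check_tools` is a throwaway helper defined here, not part of any package:

```shell
# Hypothetical helper: report whether each named tool is on PATH.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found $tool"
    else
      echo "missing $tool"
    fi
  done
}
# The compilers and MPI wrappers the build below expects:
check_tools ifort icc mpif90 mpirun
```

Any "missing" line means the corresponding sourcing line or PATH export above did not take effect in the current shell.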
Install the FFTW library
tar zxf fftw-2.1.5.tar.gz
cd fftw-2.1.5/
export F77=ifort
export CC=icc
./configure --prefix=/home/soft/mathlib/fftwv215-mpich --enable-mpi
make
make install
Create the build directory
Switch to the installation user's home directory:
su - mjhe
mkdir ~/WIEN2k_09
cp WIEN2k_09.tar ~/WIEN2k_09
Unpack the archive
cd ~/WIEN2k_09
tar xf WIEN2k_09.tar
./expand_lapw
Build
./siteconfig_lapw
Several of the build parameters need to be changed (the following settings can serve as a reference):
specify a system
K Linux (Intel ifort 10.1 compiler + mkl 10.0 )
specify compiler
Current selection: ifort
Current selection: icc
specify compiler options, BLAS and LAPACK
Current settings:
O Compiler options: -FR -mp1 -w -prec_div -pc80 -pad -align -DINTEL_VML -traceback
L Linker Flags: $(FOPT) -L/home/soft/intel/mkl/10.1.2.024/lib/em64t/ -pthread -i-static
P Preprocessor flags '-DParallel'
Use the static MKL libraries:
R R_LIB (LAPACK+BLAS): /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_lapack.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libguide.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_core.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_em64t.a
configure Parallel execution
Shared Memory Architecture? (y/n):n
Remote shell (default is ssh) = ssh
Do you have MPI and Scalapack installed and intend to run
finegrained parallel? (This is usefull only for BIG cases)!
(y/n) n
Current selection: mpiifort
Current settings:
Use static libraries:
RP RP_LIB(SCALAPACK+PBLAS): -lmkl_intel_lp64 /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_scalapack_lp64.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_sequential.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_blacs_lp64.a /home/soft/mathlib/fftwv215-mpich/lib/libfftw_mpi.a /home/soft/mathlib/fftwv215-mpich/lib/libfftw.a -lmkl /home/soft/intel/mkl/10.1.2.024/lib/em64t/libguide.a
Alternatively (with FFTW and MKL installed under different prefixes):
RP RP_LIB(SCALAPACK+PBLAS): -lmkl_intel_lp64 /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_scalapack_lp64.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_sequential.a /home/soft/intel/mkl/10.1.2.024/lib/em64t/libmkl_blacs_lp64.a -L/data1/soft/lib/lib/ -lfftw_mpi -lfftw -lmkl /data1/soft/intel/mkl/10.0.3.020/lib/em64t/libguide.a
FP FPOPT(par.comp.options): $(FOPT)
MP MPIRUN commando : mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_
Dimension Parameters
This section can keep the default values, or, on machines with more than 4 GB of RAM, be set to:
PARAMETER (NMATMAX= 30000)
PARAMETER (NUME= 1000)
Enter the compilation menu:
Compile/Recompile
A Compile all programs (suggested)
Errors occur most often while building the five MPI-parallel executables, so after compiling verify that the following files exist:
./SRC_lapw0/lapw0_mpi
./SRC_lapw1/lapw1_mpi
./SRC_lapw1/lapw1c_mpi
./SRC_lapw2/lapw2_mpi
./SRC_lapw2/lapw2c_mpi
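The check above can be scripted; `check_mpi_bins` is a hypothetical helper, and `~/WIEN2k_09` is the source tree created in the steps above:

```shell
# Hypothetical helper: verify the five MPI executables exist and are executable.
check_mpi_bins() {   # $1 = WIEN2k source tree
  missing=0
  for f in SRC_lapw0/lapw0_mpi SRC_lapw1/lapw1_mpi SRC_lapw1/lapw1c_mpi \
           SRC_lapw2/lapw2_mpi SRC_lapw2/lapw2c_mpi; do
    if [ -x "$1/$f" ]; then
      echo "OK      $f"
    else
      echo "MISSING $f"
      missing=1
    fi
  done
  return $missing
}
check_mpi_bins "$HOME/WIEN2k_09" || echo "some MPI binaries are missing; rerun siteconfig_lapw and recompile"
```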
Post-installation setup
./userconfig_lapw
editor shall be: vi
Press Enter to accept the defaults for all other prompts.
Then edit .bashrc and comment out the following line:
#ulimit -s unlimited
Edit parallel_options:
setenv WIEN_MPIRUN "mpirun -machinefile _HOSTS_ -np _NP_ _EXEC_"
Configure the web interface
As root, start the Apache service:
service apache2 start
Then, as an ordinary user, run:
w2web
This opens port 7890 as the WIEN2k web interface.
Test cases
Serial run:
Using the bundled TiC example:
mkdir TiC
cd TiC
cp ../TiC.struct .
Generate the atom information:
instgen_lapw
Initialize the case:
init_lapw -b
Run the calculation:
run_lapw
The program output is written to the *.output* files; errors can be looked up in TiC.dayfile.
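A few grep one-liners are handy for inspecting the SCF run; the case name TiC follows the example above, and :ENE / :DIS are the standard WIEN2k labels in the case.scf file:

```shell
# Inspect SCF convergence of the TiC case (guarded so it is safe to run
# before run_lapw has produced any output).
if [ -f TiC.scf ]; then
  grep :ENE TiC.scf     # total energy of each SCF iteration
  grep :DIS TiC.scf     # charge distance (convergence measure)
else
  echo "TiC.scf not found -- run run_lapw first"
fi
```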
Parallel run:
Check that the parallel environment is set up:
testpara_lapw
Check the parallel status of the case:
testpara1_lapw
testpara2_lapw
The contents of the .machines file determine whether k-point or MPI parallelism is used:
k-point:
granularity:1
1:node31:1
1:node31:1
1:node32:1
1:node32:1
lapw0:node31:2 node32:2
extrafine:1
mpi:
granularity:1
1:node31:2
1:node32:2
lapw0:node31:2 node32:2
extrafine:1
Run:
run_lapw -p
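The k-point layout shown above can be generated from a flat node list; the node31/node32 hostnames and the lapw0 process counts are the same assumptions as in the example:

```shell
# Sketch: build a k-point-parallel .machines file from a node list
# (one entry per k-point slot; hostnames are hypothetical).
nodes="node31 node31 node32 node32"
{
  echo "granularity:1"
  for n in $nodes; do
    echo "1:$n:1"
  done
  printf "lapw0:"
  for n in $(printf '%s\n' $nodes | sort -u); do
    printf '%s:2 ' "$n"
  done
  printf '\n'
  echo "extrafine:1"
} > .machines
cat .machines
```

The PBS script in the next section automates exactly this, taking the node list from $PBS_NODEFILE instead of a hard-coded variable.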
Submitting jobs through the job scheduler
cat wien2k.pbs
###########################################################################
# #
# Script for submitting parallel wien2k_09 jobs to Dawning cluster. #
# #
###########################################################################
###########################################################################
# Lines that begin with #PBS are PBS directives (not comments).
# True comments begin with "# " (i.e., # followed by a space).
###########################################################################
#PBS -S /bin/bash
#PBS -N TiO2
#PBS -j oe
#PBS -l nodes=1:ppn=8
#PBS -V
#############################################################################
# -S: shell the job will run under
# -o: name of the job output file
# -j: merges stdout and stderr to the same file
# -l: resources required by the job: number of nodes and processors per node
# -l: resources required by the job: maximum job time length
#############################################################################
#########parallel mode is mpi/kpoint############
PARALLEL=mpi   # parallel mode: "mpi" or "kpoint"
echo $PARALLEL
################################################
NP=`cat ${PBS_NODEFILE} | wc -l`
NODE_NUM=`cat $PBS_NODEFILE|uniq|wc -l`
NP_PER_NODE=`expr $NP / $NODE_NUM`
username=`whoami`
export WIENROOT=/home/users/mjhe/wien2k_09/
export PATH=$PATH:$WIENROOT:.
WIEN2K_RUNDIR=/scratch/${username}.${PBS_JOBID}
export SCRATCH=${WIEN2K_RUNDIR}
#create scratch dir
if [ ! -d $WIEN2K_RUNDIR ]; then
mkdir -p $WIEN2K_RUNDIR
echo "Scratch directory $WIEN2K_RUNDIR created."
fi
cd $PBS_O_WORKDIR
###############creating .machines################
case $PARALLEL in
mpi)
echo "granularity:1" >.machines
for i in `cat $PBS_NODEFILE |uniq`
do
echo "1:"$i":"$NP_PER_NODE >> .machines
done
printf "lapw0:">> .machines
#####lapw0 in MPI parallel mode#############
for i in `cat ${PBS_NODEFILE}|uniq`
do
printf $i:$NP_PER_NODE" " >>.machines
done
#################################
####if lapw0 aborts with an MPI error, use the following line instead########
# printf `cat ${PBS_NODEFILE}|uniq|head -1`:1>>.machines
#############end#################
printf "\n" >>.machines
echo "extrafine:1">>.machines
;;
kpoint)
echo "granularity:1" >.machines
for i in `cat $PBS_NODEFILE`
do
echo "1:"$i":"1 >> .machines
done
printf "lapw0:">> .machines
#####lapw0 in MPI parallel mode#############
for i in `cat ${PBS_NODEFILE}|uniq`
do
printf $i:$NP_PER_NODE" " >>.machines
done
#################################
####if lapw0 aborts with an MPI error, use the following line instead########
# printf `cat ${PBS_NODEFILE}|uniq|head -1`:1>>.machines
#############end#################
printf "\n" >>.machines
echo "extrafine:1">>.machines
;;
esac
#################end creating####################
####### Run the parallel executable "WIEN2K"#########
instgen_lapw
init_lapw -b
clean_lapw -s
echo "##################start time is `date`########################"
run_lapw -p
echo "###################end time is `date`########################"
rm -rf $WIEN2K_RUNDIR
########################END########################
The parts that typically need editing are the PBS directives (job name, node count) and the PARALLEL variable.
The script also initializes the case itself, so it must be run in a directory that already contains the *.struct file.
Performance benchmark
Test platform: CB65, Shanghai 2382, 16 GB RAM, 147 GB SAS disk
Network: 1000Gb; MPI: mpich v1.2.7
TiO2 case (NMATMAX = 30000):
- 2-process k-point run (lapw0 via MPI; lapw1/lapw2 k-point parallel): 4m44s
- 4-process k-point run (same layout): 4m30s
- 8-process k-point run (same layout): 6m29s
- 2-process MPI run (lapw0, lapw1, lapw2 all via MPI): 7m53s
- 4-process MPI run (same layout): 6m56s
- 8-process MPI run (same layout): 9m5s
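For comparing the rows above, the m/s timings convert to seconds easily; `to_sec` is a throwaway helper, not a WIEN2k tool:

```shell
# Convert "XmYs" timings to seconds and compute speedup relative to the
# 2-process k-point run (4m44s).
to_sec() {
  echo "$1" | sed 's/m/ /; s/s//' | awk '{print $1 * 60 + $2}'
}
base=$(to_sec 4m44s)
for t in 4m44s 4m30s 6m29s; do
  s=$(to_sec "$t")
  awk -v t="$t" -v b="$base" -v s="$s" \
    'BEGIN { printf "%s = %d s, speedup %.2f\n", t, s, b / s }'
done
```

For the k-point rows, 4 processes give only a modest gain over 2, and 8 processes are slower than 4, consistent with the timings in the table.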
Standard benchmark:
The officially provided test case:
Serial:
test_case
export OMP_NUM_THREADS=1
time x lapw1 -c
SUM OF WALL CLOCK TIMES: 135.0 (INIT = 1.0 + K-POINTS = 133.9)
export OMP_NUM_THREADS=4
time x lapw1 -c
SUM OF WALL CLOCK TIMES: 62.0 (INIT = 1.0 + K-POINTS = 61.0)
export OMP_NUM_THREADS=8
time x lapw1 -c
SUM OF WALL CLOCK TIMES: 56.2 (INIT = 1.0 + K-POINTS = 55.2)
Parallel:
time x lapw1 -p
test_case
2 k-points:
test_case.output1: SUM OF WALL CLOCK TIMES: 62.0 (INIT = 1.0 + K-POINTS = 61.0)
test_case.output1_1: SUM OF WALL CLOCK TIMES: 138.5 (INIT = 1.0 + K-POINTS = 137.5)
4 k-points:
test_case.output1: SUM OF WALL CLOCK TIMES: 62.0 (INIT = 1.0 + K-POINTS = 61.0)
test_case.output1_1: SUM OF WALL CLOCK TIMES: 134.9 (INIT = 1.0 + K-POINTS = 133.9)
mpi-benchmark
2 processes:
mpi-benchmark.output1_1: TIME HAMILT (CPU) = 134.1, HNS = 116.4, HORB =0.0, DIAG=697.5
mpi-benchmark.output1_1: TOTAL CPU TIME: 950.0 (INIT = 1.9 + K-POINTS = 948.1)
mpi-benchmark.output1_1: SUM OF WALL CLOCK TIMES: 1138.9 (INIT =2.2 + K-POINTS =1136.7)
4 processes:
mpi-benchmark.output1_1: TIME HAMILT (CPU) = 67.8, HNS = 70.5, HORB = 0.0, DIAG = 420.6
mpi-benchmark.output1_1: TOTAL CPU TIME: 560.7 (INIT = 1.8 + K-POINTS = 558.9)
mpi-benchmark.output1_1: SUM OF WALL CLOCK TIMES: 643.2 (INIT = 2.2 + K-POINTS = 640.9)
8 processes:
mpi-benchmark.output1_1: TIME HAMILT (CPU) = 40.4, HNS = 44.9, HORB = 0.0, DIAG = 422.0
mpi-benchmark.output1_1: TOTAL CPU TIME: 509.3 (INIT = 1.9 + K-POINTS = 507.4)
mpi-benchmark.output1_1: SUM OF WALL CLOCK TIMES: 614.3 (INIT = 2.2 + K-POINTS = 612.0)
16 processes:
mpi-benchmark.output1_1: TIME HAMILT (CPU) = 22.6, HNS = 32.5, HORB = 0.0, DIAG = 140.5
mpi-benchmark.output1_1: TOTAL CPU TIME: 197.5 (INIT = 1.9 + K-POINTS = 195.7)
mpi-benchmark.output1_1: SUM OF WALL CLOCK TIMES: 1190.0 (INIT =2.8 + K-POINTS =1187.2)
The timings can be listed with grep TIME *output1*
Other
Troubleshooting
1. A local scratch directory /scratch must be created on every compute node:
mkdir /scratch
chmod 777 /scratch
2. Before each new calculation, clean the case and redo the initialization.
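Item 1 can be scripted; the node31/node32 hostnames are assumptions, and on a real cluster the mkdir/chmod pair would run over ssh on every node:

```shell
# Hypothetical helper: create a world-writable scratch directory.
make_scratch() {   # $1 = directory, normally /scratch
  mkdir -p "$1" && chmod 777 "$1"
}
# On a cluster, run it on every compute node, e.g. (hostnames assumed):
#   for h in node31 node32; do ssh "$h" 'mkdir -p /scratch && chmod 777 /scratch'; done
make_scratch ./scratch-demo   # local demonstration only
```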