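I have a program that reads a large number of files (about 100 files of 120 MB each) generated by another MPI program, which takes some time. Each file contains the variables of the corresponding subdomain. I want to read the variables from these files and store them in specific slices of 4-dimensional arrays. Because this takes quite a long time, I want to parallelize this code with OpenMP: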
SUBROUTINE read_old_restart
  INTEGER :: ii
  INTEGER :: thread_ID
  INTEGER :: OMP_GET_THREAD_NUM
  CHARACTER(LEN=21) :: file_name

  !$OMP PARALLEL DO PRIVATE(ii,file_name)
  DO ii=0,Nproc_old-1
    IF(ii < 10) THEN
      WRITE(file_name,401) "input/Restart_00", ii, ".out"
    ELSE IF(ii < 100) THEN
      WRITE(file_name,402) "input/Restart_0" , ii, ".out"
    ELSE
      WRITE(file_name,403) "input/Restart_"  , ii, ".out"
    END IF
    PRINT*, "Thread = ", OMP_GET_THREAD_NUM(), "Reading ", file_name
401 format(a16,I1,a4)
402 format(a15,I2,a4)
403 format(a14,I3,a4)
    OPEN (unit=321, file=TRIM(file_name), status="old", form="unformatted")
    READ(321) t                           , &
              old_u         (:,:,:,ii)    , &
              old_v         (:,:,:,ii)    , &
              old_w         (:,:,:,ii)    , &
              old_p         (:,:,:,ii)    , &
              old_uc        (:,:,:,ii)    , &
              old_vc        (:,:,:,ii)    , &
              old_wc        (:,:,:,ii)    , &
              old_un2       (:,:,:,ii)    , &
              old_vn2       (:,:,:,ii)    , &
              old_wn2       (:,:,:,ii)    , &
              old_un1       (:,:,:,ii)    , &
              old_vn1       (:,:,:,ii)    , &
              old_wn1       (:,:,:,ii)    , &
              old_p1        (:,:,:,ii)    , &
              old_viscu     (:,:,:,ii)    , &
              old_viscv     (:,:,:,ii)    , &
              old_viscw     (:,:,:,ii)    , &
              old_convu     (:,:,:,ii)    , &
              old_convv     (:,:,:,ii)    , &
              old_convw     (:,:,:,ii)    , &
              statindex                   , &
              old_umn       (:,:,:,ii)    , &
              old_uumn      (:,:,:,ii)    , &
              old_urms      (:,:,:,ii)    , &
              old_mass_frac (:,:,:,:,:,ii), &
              old_enthT     (:,:,:,:,ii)
    CLOSE (321)
  END DO
  !$OMP END PARALLEL DO
END SUBROUTINE read_old_restart
The code compiles, and each thread starts its first loop iteration. This is the output:
Thread = 3 Reading input/Restart_030.out
Thread = 7 Reading input/Restart_067.out
Thread = 2 Reading input/Restart_020.out
Thread = 6 Reading input/Restart_058.out
Thread = 9 Reading input/Restart_085.out
Thread = 8 Reading input/Restart_076.out
Thread = 5 Reading input/Restart_049.out
Thread = 4 Reading input/Restart_040.out
Thread = 11 Reading input/Restart_103.out
Thread = 0 Reading input/Restart_000.out
Thread = 1 Reading input/Restart_010.out
Thread = 10 Reading input/Restart_094.out
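The program appears to be running, but it hangs after printing the output above. When I run top, it shows no CPU usage. Any idea why it does not work as expected?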
Answer 0 (score: 4)
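You should use a dedicated integer variable for the unit number and set it to a different value in each thread. Opening the same file unit (321 in your code) from several threads at once is a recipe for trouble. I am surprised it did not crash.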
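As a minimal sketch of that suggestion, assuming a compiler with OpenMP enabled (for example `gfortran -fopenmp`), the self-contained program below gives each thread its own unit number. The names `base_unit`, `unit_no`, `slab`, and the `restart_demo_NNN.out` files are made up for illustration and are not from the original code. The program writes a few small unformatted files, reads them back in a parallel loop, and stops with an error if any slice is wrong.

```fortran
program per_thread_units
  use omp_lib, only: omp_get_thread_num
  implicit none
  integer, parameter :: nfiles = 8, n = 4   ! hypothetical sizes for the demo
  integer, parameter :: base_unit = 100     ! hypothetical base unit number
  real    :: slab(n, 0:nfiles-1)            ! one slice per file
  real    :: buf(n)
  integer :: ii, k, unit_no
  character(len=32) :: file_name

  ! Create small unformatted test files serially.
  do ii = 0, nfiles-1
    write(file_name, '(a,i3.3,a)') 'restart_demo_', ii, '.out'
    buf = [(real(100*ii + k), k = 1, n)]
    open(unit=base_unit, file=trim(file_name), status='replace', &
         form='unformatted')
    write(base_unit) buf
    close(base_unit)
  end do

  slab = -1.0

  ! unit_no is private, so no two threads ever share a unit.
  !$omp parallel do private(ii, file_name, unit_no)
  do ii = 0, nfiles-1
    unit_no = base_unit + omp_get_thread_num()
    write(file_name, '(a,i3.3,a)') 'restart_demo_', ii, '.out'
    open(unit=unit_no, file=trim(file_name), status='old', &
         form='unformatted')
    read(unit_no) slab(:, ii)               ! each iteration fills its own slice
    close(unit_no, status='delete')         ! remove the demo file again
  end do
  !$omp end parallel do

  ! Verify that every slice holds the values written to its file.
  do ii = 0, nfiles-1
    do k = 1, n
      if (slab(k, ii) /= real(100*ii + k)) error stop 'mismatch'
    end do
  end do
  print *, 'all slices read correctly'
end program per_thread_units
```

Two side notes on the sketch. The `i3.3` edit descriptor zero-pads the file index, so a single format covers the three cases handled by the formats 401 to 403 in the question. Also, in the question's loop the scalars `t` and `statindex` are shared and are written by every thread; reading them into private temporaries would probably be safer, although that is separate from the unit-number problem.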