Last edited by a1333888 on 2016-7-26 09:30
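Hi everyone, I am running the CESM model on a three-node cluster. The nodes are managed by Hydra and communicate over MPI. The command I used to create the case is: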
./create_newcase -case /cesm/cesm1_2_0/test4 -mach newmach -compset F_1850_CAM5 -res f19_g16
The problem appears when I run test4.run. The error log is as follows:
64 pes participating in computation
-----------------------------------
TASK# NAME
0 node1
1 node1
2 node1
3 node1
4 node2
5 node2
6 node2
7 node2
8 node3
9 node3
10 node3
11 node3
12 node1
13 node1
14 node1
15 node1
16 node2
17 node2
18 node2
19 node2
20 node3
21 node3
22 node3
23 node3
24 node1
25 node1
26 node1
27 node1
28 node2
29 node2
30 node2
31 node2
32 node3
33 node3
34 node3
35 node3
36 node1
37 node1
38 node1
39 node1
40 node2
41 node2
42 node2
43 node2
44 node3
45 node3
46 node3
47 node3
48 node1
49 node1
50 node1
51 node1
52 node2
53 node2
54 node2
55 node2
56 node3
57 node3
58 node3
59 node3
60 node1
61 node1
62 node1
63 node1
Opened existing file
/cesm/cesm1_2_0/inputdata/atm/cam/inic/fv/cami-mam3_0000-01-01_1.9x2.5_L30_c090306.nc
65536
Opened existing file
/cesm/cesm1_2_0/inputdata/atm/cam/topo/USGS-gtopo30_1.9x2.5_remap_c050602.nc
131072
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 12479 RUNNING AT node3
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:0:1@node2] HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:885): assert (!closed) failed
[proxy:0:1@node2] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
[proxy:0:1@node2] main (pm/pmiserv/pmip.c:206): demux engine error waiting for event
[mpiexec@node2] HYDT_bscu_wait_for_completion (tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
[mpiexec@node2] HYDT_bsci_wait_for_completion (tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
[mpiexec@node2] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:218): launcher returned error waiting for completion
[mpiexec@node2] main (ui/mpich/mpiexec.c:344): process manager error waiting for completion
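As a side check on the task listing in the log, here is a minimal sketch (not part of CESM or Hydra) that tallies how many MPI tasks landed on each node. The function name `tasks_per_node` and the `sample` text are made up for illustration; it only assumes the log keeps the two-column "task number, host name" layout shown above. It is a diagnostic aid for reading the log, not a fix for the crash.

```python
import re
from collections import Counter

def tasks_per_node(log_text):
    """Count MPI tasks per host from two-column lines such as '12 node1'.

    Hypothetical helper: assumes each task line is exactly
    '<task number> <host name>', as in the 'TASK# NAME' listing.
    """
    counts = Counter()
    for line in log_text.splitlines():
        # Match only lines made of an integer followed by a single host token,
        # so headers and other log lines are skipped.
        m = re.fullmatch(r"\s*(\d+)\s+(\S+)\s*", line)
        if m:
            counts[m.group(2)] += 1
    return counts

# Shortened, illustrative excerpt in the same layout as the log.
sample = """64 pes participating in computation
TASK# NAME
0 node1
1 node1
4 node2
8 node3
65536
"""

if __name__ == "__main__":
    for host, n in sorted(tasks_per_node(sample).items()):
        print(host, n)
```

Run against the full listing, this kind of tally makes it easy to see whether the 64 tasks are spread evenly over the three nodes.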
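The failure seems to happen while the input data is being read. I have tried several datasets and resolutions, and the same error occurs every time.
Does anyone know how to solve this, or have any suggestions?
Thanks a lot!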