MPJ Express error mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1

Time: 2014-04-01 22:52:29

Tags: spn mpj-express

I am trying to run some Sum-Product Network code from http://alchemy.cs.washington.edu/spn/

When I try to run it on my Mac (version 10.8.4), I get the following error:

mpjrun.sh -np 1 eval.Run -d O
MPJ Express (0.40) is started in the multicore configuration
[Rank=0] *** Parameters ***
[Rank=0]    domain=O
[Rank=0]    numSumPerRegion=20
[Rank=0]    numComponentsPerVar=4
[Rank=0]    sparsePrior=1.0
[Rank=0]    baseResolution=4
[Rank=0]    numSlavePerClass=50
[Rank=0]    numSlaveGrp=1
[Rank=0] <TIME> init 1687 ms

mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpjdev.Comm.recv(Comm.java:864)
        at mpi.Comm.recv(Comm.java:1294)
        at mpi.Comm.Recv(Comm.java:1255)
        at spn.SPN.recvUpdate(SPN.java:650)
        at spn.GenerativeLearning.learnHardEM(GenerativeLearning.java:52)
        at spn.GenerativeLearning.learn(GenerativeLearning.java:16)
        at eval.Run.runOlivetti(Run.java:147)
        at eval.Run.proc(Run.java:46)
        at eval.Run.main(Run.java:40)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at runtime.starter.MulticoreStarter$1.run(MulticoreStarter.java:277)
        at java.lang.Thread.run(Thread.java:744)
    java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at runtime.starter.MulticoreStarter$1.run(MulticoreStarter.java:277)
        at java.lang.Thread.run(Thread.java:744)
    Caused by: mpi.MPIException: mpi.MPIException: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpi.Comm.Recv(Comm.java:1259)
        at spn.SPN.recvUpdate(SPN.java:650)
        at spn.GenerativeLearning.learnHardEM(GenerativeLearning.java:52)
        at spn.GenerativeLearning.learn(GenerativeLearning.java:16)
        at eval.Run.runOlivetti(Run.java:147)
        at eval.Run.proc(Run.java:46)
        at eval.Run.main(Run.java:40)
        ... 6 more
    Caused by: mpi.MPIException: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpi.Comm.recv(Comm.java:1317)
        at mpi.Comm.Recv(Comm.java:1255)
        ... 12 more
    Caused by: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
        at mpjdev.Comm.recv(Comm.java:864)
        at mpi.Comm.recv(Comm.java:1294)
        ... 13 more

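This happens for every value of -np I give it. I assume the problem is not in the SPN code but in how I am using MPJ Express. I have tried MPJ Express versions 0.40 and 0.37 and got the same result.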

Thanks for your time.

1 Answer:

Answer 0 (score: 0)

I ran into the same problem when running the code and found the answer in the SPN user guide. The command for running SPN is:

mpjrun.sh -np [NUM_PROCESSOR] -dev niodev -mx8000m eval.Run [SPN OPTIONS] > [LOG FILE]

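Here NUM_PROCESSOR depends on the number of slave processes per image category and on the number of slave groups. It should equal (numSlavePerCat + 1) × numSlaveGroup; numSlavePerCat and numSlaveGroup can be found in common/Parameter.java. If you want to run on a machine that does not have that many processors, you can change numSlavePerCat to a smaller value.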
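
As a rough illustration of that formula, here is a minimal Java sketch that computes the value to pass to -np. The class and method names are hypothetical and not part of the SPN code. It also assumes that numSlavePerClass=50 and numSlaveGrp=1, as printed in the question's log, are the same settings this answer calls numSlavePerCat and numSlaveGroup.

```java
// Hypothetical helper, not part of the SPN distribution.
final class SpnProcessCount {

    // Process count from the formula above: (numSlavePerCat + 1) * numSlaveGroup.
    static int requiredProcesses(int numSlavePerCat, int numSlaveGroup) {
        if (numSlavePerCat < 0 || numSlaveGroup < 1) {
            throw new IllegalArgumentException("need numSlavePerCat >= 0 and numSlaveGroup >= 1");
        }
        // multiplyExact throws instead of silently overflowing on absurd inputs.
        return Math.multiplyExact(numSlavePerCat + 1, numSlaveGroup);
    }

    // True when the -np value matches what the parameters call for.
    static boolean npMatches(int np, int numSlavePerCat, int numSlaveGroup) {
        return np == requiredProcesses(numSlavePerCat, numSlaveGroup);
    }

    public static void main(String[] args) {
        // Assumption: values taken from the log in the question
        // (numSlavePerClass=50, numSlaveGrp=1).
        int numSlavePerCat = 50;
        int numSlaveGroup = 1;
        System.out.println("-np " + requiredProcesses(numSlavePerCat, numSlaveGroup));
    }
}
```

Under that assumption the formula gives (50 + 1) × 1 = 51, so -np 1 starts a communicator of size 1 in which rank 1 does not exist, which is consistent with the exception in the question.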