I am trying to run the Sum Product Network code from http://alchemy.cs.washington.edu/spn/
When I run it on my Mac (OS X 10.8.4), I get the following error:
mpjrun.sh -np 1 eval.Run -d O
MPJ Express (0.40) is started in the multicore configuration
[Rank=0] *** Parameters ***
[Rank=0] domain=O
[Rank=0] numSumPerRegion=20
[Rank=0] numComponentsPerVar=4
[Rank=0] sparsePrior=1.0
[Rank=0] baseResolution=4
[Rank=0] numSlavePerClass=50
[Rank=0] numSlaveGrp=1
[Rank=0] <TIME> init 1687 ms
mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
at mpjdev.Comm.recv(Comm.java:864)
at mpi.Comm.recv(Comm.java:1294)
at mpi.Comm.Recv(Comm.java:1255)
at spn.SPN.recvUpdate(SPN.java:650)
at spn.GenerativeLearning.learnHardEM(GenerativeLearning.java:52)
at spn.GenerativeLearning.learn(GenerativeLearning.java:16)
at eval.Run.runOlivetti(Run.java:147)
at eval.Run.proc(Run.java:46)
at eval.Run.main(Run.java:40)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at runtime.starter.MulticoreStarter$1.run(MulticoreStarter.java:277)
at java.lang.Thread.run(Thread.java:744)
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at runtime.starter.MulticoreStarter$1.run(MulticoreStarter.java:277)
at java.lang.Thread.run(Thread.java:744)
Caused by: mpi.MPIException: mpi.MPIException: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
at mpi.Comm.Recv(Comm.java:1259)
at spn.SPN.recvUpdate(SPN.java:650)
at spn.GenerativeLearning.learnHardEM(GenerativeLearning.java:52)
at spn.GenerativeLearning.learn(GenerativeLearning.java:16)
at eval.Run.runOlivetti(Run.java:147)
at eval.Run.proc(Run.java:46)
at eval.Run.main(Run.java:40)
... 6 more
Caused by: mpi.MPIException: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
at mpi.Comm.recv(Comm.java:1317)
at mpi.Comm.Recv(Comm.java:1255)
... 12 more
Caused by: mpjdev.MPJDevException: In Comm.irecv(), requested source 1 does not exist in communicator of size 1
at mpjdev.Comm.recv(Comm.java:864)
at mpi.Comm.recv(Comm.java:1294)
... 13 more
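This happens for every value of np I pass. I assume the problem is not in the SPN code itself but in how I am using MPJ Express. I have tried MPJ Express versions 0.40 and 0.37 and get the same result with both.
Thanks for your time.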
Answer 0 (score: 0)
I ran into the same problem when running the code, and found the answer in the SPN user guide. The command to run SPN is:
mpjrun.sh -np [NUM_PROCESSOR] -dev niodev -mx8000m eval.Run [SPN OPTIONS] > [LOG FILE]
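Here NUM_PROCESSOR depends on the number of slave processes per image category and on the number of slave groups. It should equal (numSlavePerCat + 1) × numSlaveGroup; numSlavePerCat and numSlaveGroup can be found in common/Parameter.java. If you want to run on a machine that does not have that many processors, you can modify numSlavePerCat.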
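As a rough worked example of that formula, the sketch below plugs in the values printed in the question's log. It assumes that the logged numSlavePerClass=50 and numSlaveGrp=1 are the same settings the user guide calls numSlavePerCat and numSlaveGroup; that mapping is a guess based on the names, not something the guide confirms here. The shell variable names are made up for illustration. With -np 1 the communicator contains only rank 0, which matches the "requested source 1 does not exist in communicator of size 1" message.

```shell
#!/bin/sh
# Values copied from the "*** Parameters ***" lines of the log in the question.
# Assumption: these correspond to numSlavePerCat and numSlaveGroup in the user guide.
num_slave_per_class=50
num_slave_grp=1

# NUM_PROCESSOR = (numSlavePerCat + 1) x numSlaveGroup
np=$(( (num_slave_per_class + 1) * num_slave_grp ))

echo "$np"   # prints 51

# Print (do not run) the command line this would give for the question's "-d O" option.
echo "mpjrun.sh -np $np -dev niodev -mx8000m eval.Run -d O"
```

Under these assumptions the default settings would call for -np 51, so any smaller value would leave the master waiting on a rank that does not exist.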