I have a standalone cluster setup with 1 master and 1 worker (separate VMs). In my job I read some data from Mongo and, after some analysis, write it back. I have tested it from Eclipse and it works fine (local mode). However, when I submit the job to the cluster master, it fails while trying to write the data back to Mongo.
Below is the error reported on the worker node. I can see that the tmp file (/tmp/hadoop-aga/attempt_20180620130637_0012_r_000000_0/_MONGO_OUT_TEMP/_out) is actually created on the worker, so it is probably not a filesystem permission issue, but the file is 0 bytes.
This is the API call I use to write to Mongo:
rdd.saveAsNewAPIHadoopFile(
    "file:///this-is-completely-unused",
    keyClass,
    BSONObject.class,
    MongoOutputFormat.class,
    outputConfig
);
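The `outputConfig` object is not shown in the post; a minimal sketch of how it is typically built for the mongo-hadoop connector follows (the database and collection names are hypothetical placeholders, and the host matches the server seen in the log below):

```java
import org.apache.hadoop.conf.Configuration;

// Sketch only: assumes the mongo-hadoop connector is on the classpath.
// "mydb.mycollection" is a placeholder, not taken from the original post.
Configuration outputConfig = new Configuration();
outputConfig.set("mongo.output.uri",
        "mongodb://192.168.1.6:27017/mydb.mycollection");
```

With this configuration, MongoOutputFormat ignores the output path argument entirely (hence the "file:///this-is-completely-unused" placeholder) and writes records to the collection named in `mongo.output.uri`.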
Error from the worker:
2018-06-20 13:06:53 INFO connection:71 - Opened connection [connectionId{localValue:3, serverValue:28}] to 192.168.1.6:27017
2018-06-20 13:06:53 INFO cluster:71 - Monitor thread successfully connected to server with description ServerDescription{address=192.168.1.6:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 0, 4]}, minWireVersion=0, maxWireVersion=3, maxDocumentSize=16777216, roundTripTimeNanos=716463}
2018-06-20 13:06:54 INFO MongoRecordWriter:60 - Writing to temporary file: /tmp/hadoop-aga/attempt_20180620130637_0012_r_000000_0/_MONGO_OUT_TEMP/_out
2018-06-20 13:06:54 ERROR Executor:91 - Exception in task 0.0 in stage 2.0 (TID 2)
java.lang.RuntimeException: Could not open temporary file for buffering Mongo output
at com.mongodb.hadoop.output.MongoRecordWriter.<init>(MongoRecordWriter.java:64)
at com.mongodb.hadoop.output.MongoRecordWriter.<init>(MongoRecordWriter.java:75)
at com.mongodb.hadoop.MongoOutputFormat.getRecordWriter(MongoOutputFormat.java:46)
at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.initWriter(SparkHadoopWriter.scala:344)
at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:118)
at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:79)
at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: ExitCodeException exitCode=-1073741515:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
at org.apache.hadoop.util.Shell.run(Shell.java:479)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:866)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:849)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at com.mongodb.hadoop.output.MongoRecordWriter.<init>(MongoRecordWriter.java:61)
... 12 more
Any help would be greatly appreciated!
Thanks
Answer 0 (score: 0)
Finally solved this; it turned out to be a permissions issue, combined with a missing runtime: exit code -1073741515 is 0xC0000135 (STATUS_DLL_NOT_FOUND), which Hadoop's winutils.exe raises on Windows when the Visual C++ 2010 runtime it depends on is not installed.
I installed "Microsoft Visual C++ 2010 (x64)" and downloaded winutils, then ran:
winutils.exe chmod 777 \tmp
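The full sequence might look like the following (a sketch for Windows cmd; the install path and the assumption that winutils.exe sits in %HADOOP_HOME%\bin are hypothetical):

```shell
:: Point Hadoop at the directory containing bin\winutils.exe
:: (C:\hadoop is a placeholder path)
set HADOOP_HOME=C:\hadoop
set PATH=%PATH%;%HADOOP_HOME%\bin

:: Grant full permissions on the \tmp directory that
:: RawLocalFileSystem.setPermission shells out to chmod
winutils.exe chmod 777 \tmp
```

This matters because the stack trace shows the failure inside RawLocalFileSystem.setPermission, which on Windows executes winutils.exe as an external command; if that executable cannot start (missing DLL) the shell call fails with the exit code seen above.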