I have been trying to run the EnronMail mongo-hadoop connector example (https://github.com/mongodb/mongo-hadoop/wiki/Enron-Emails-Example) without success. I get this error:
15/11/18 11:56:23 INFO util.MongoTool: Created a conf: 'Configuration: core-default.xml, core-site.xml, mongo_enron.xml, mapred-default.xml, mapred-site.xml, hdfs-default.xml, hdfs-site.xml' on {class com.mongodb.hadoop.examples.enron.EnronMail} as job named 'EnronMail'
15/11/18 11:56:23 INFO util.MongoTool: Setting up and running MapReduce job in foreground, will wait for results. {Verbose? true}
15/11/18 11:56:23 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
15/11/18 11:56:23 INFO mapred.JobClient: Cleaning up the staging area hdfs://MASTER1:8020/tmp/hadoop-mapred/mapred/staging/user/.staging/job_201511020757_0042
15/11/18 11:56:23 ERROR security.UserGroupInformation: PriviledgedActionException as:user (auth:SIMPLE) cause:java.io.IOException: No FileSystem for scheme: mongodb
15/11/18 11:56:23 ERROR util.MongoTool: Exception while executing job...
java.io.IOException: No FileSystem for scheme: mongodb
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2296)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2303)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:87)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2342)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2324)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:210)
at com.mongodb.hadoop.BSONFileInputFormat.getSplits(BSONFileInputFormat.java:79)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1079)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1096)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:177)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:995)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:948)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:948)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596)
at com.mongodb.hadoop.util.MongoTool.runMapReduceJob(MongoTool.java:230)
at com.mongodb.hadoop.util.MongoTool.run(MongoTool.java:100)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at com.mongodb.hadoop.examples.enron.EnronMail.main(EnronMail.java:197)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
after running this command in the Hadoop shell:
hadoop jar /home/user/Pruebas/jars/bigdata-0.0.3-SNAPSHOT.jar com.mongodb.hadoop.examples.enron.EnronMail -Dmongo.input.split_size=8 -Dmongo.job.verbose=true -Dmongo.input.uri=mongodb://192.168.1.187:27017/mongoHadoopConnector.messages -Dmongo.output.uri=mongodb://192.168.1.187:27017/mongoHadoopConnector.message_pairs
Note: I started the mongo server process on my machine (192.168.1.187), and it is reachable from the other machines on the LAN. The collection has data in it. I have tried several versions of the dependencies. My versions:
hadoop: Hadoop 2.0.0-cdh4.5.0
mongo: 3.0.7
This is the POM of my Maven project:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.company.test</groupId>
    <artifactId>bigdata-light</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>bigdata-light</name>
    <url>http://maven.apache.org</url>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
    </repositories>
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>2.0.0-cdh4.5.0</version>
        </dependency>
        <dependency>
            <groupId>org.mongodb.mongo-hadoop</groupId>
            <artifactId>mongo-hadoop-core</artifactId>
            <version>1.4.2</version>
        </dependency>
        <dependency>
            <groupId>org.mongodb</groupId>
            <artifactId>mongo-java-driver</artifactId>
            <version>3.0.3</version>
        </dependency>
    </dependencies>
    <build>
        <finalName>bigdata-0.0.3-SNAPSHOT</finalName>
        <plugins>
            <plugin>
                <artifactId>maven-antrun-plugin</artifactId>
                <version>1.7</version>
                <dependencies>
                    <dependency>
                        <groupId>org.apache.ant</groupId>
                        <artifactId>ant-jsch</artifactId>
                        <version>1.9.2</version>
                    </dependency>
                </dependencies>
                <executions>
                    <execution>
                        <phase>install</phase>
                        <configuration>
                            <target>
                                <ant antfile="${basedir}\build.xml">
                                    <target name="upload" />
                                </ant>
                            </target>
                        </configuration>
                        <goals>
                            <goal>run</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
Please, any help would be greatly appreciated =). I have been stuck for days... :$
Answer 0 (score: 0)
I found the solution; I am posting it to help anyone who runs into the same problem.

To read from a MongoDB collection, use

MapredMongoConfigUtil.setInputFormat(getConf(), com.mongodb.hadoop.mapred.MongoInputFormat.class);

instead of

MapredMongoConfigUtil.setInputFormat(getConf(), com.mongodb.hadoop.mapred.BSONFileInputFormat.class);

in the MapReduce configuration class. BSONFileInputFormat is the alternative for reading directly from the .bson files produced by mongodump; with a mongodb:// input URI it makes Hadoop try to resolve the URI as a filesystem path, which is exactly where the stack trace above fails ("No FileSystem for scheme: mongodb" in FileInputFormat.listStatus).
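For context, here is a minimal sketch of how that change might sit in a MongoTool-style driver class. Only the setInputFormat call is taken from the answer above; the class name EnronMailConfig and the surrounding structure are illustrative assumptions modeled on the mongo-hadoop examples, not copied from the actual EnronMail source.

```java
// Sketch of a mongo-hadoop (mapred API) driver configuration, assuming
// the MongoTool pattern used by the connector's examples. Illustrative,
// not the actual EnronMail code.
import com.mongodb.hadoop.mapred.MongoInputFormat;
import com.mongodb.hadoop.util.MapredMongoConfigUtil;
import com.mongodb.hadoop.util.MongoTool;

public class EnronMailConfig extends MongoTool {
    public EnronMailConfig() {
        // Read input splits from the live MongoDB collection named by
        // -Dmongo.input.uri, rather than from mongodump .bson files:
        MapredMongoConfigUtil.setInputFormat(getConf(), MongoInputFormat.class);
        // NOT BSONFileInputFormat, which extends FileInputFormat and
        // therefore asks Hadoop for a FileSystem matching the URI scheme,
        // triggering "No FileSystem for scheme: mongodb":
        // MapredMongoConfigUtil.setInputFormat(getConf(),
        //         com.mongodb.hadoop.mapred.BSONFileInputFormat.class);
    }
}
```

With this input format the mongodb:// URIs from the command line are handled by the connector itself instead of being passed to Hadoop's FileSystem resolution.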