I just want to write a program that needs to save a csv file into HDFS. The code runs fine in Eclipse, but when I try to execute the jar outside of Eclipse, it gives me this error:
2014-10-14 12:41:31 INFO SecurityManager:58 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(aroman)
Exception in thread "main" java.lang.ExceptionInInitializerError
at com.tekcomms.c2d.utils.MyWatchService.saveIntoHdfs(MyWatchService.java:362)
at com.tekcomms.c2d.utils.MyWatchService.processDataCastFile(MyWatchService.java:332)
at com.tekcomms.c2d.utils.MyWatchService.processCreateEvent(MyWatchService.java:224)
at com.tekcomms.c2d.utils.MyWatchService.watch(MyWatchService.java:180)
at com.tekcomms.c2d.main.FeedAdaptor.main(FeedAdaptor.java:40)
Caused by: com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'akka.version'
at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:115)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:136)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:150)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:155)
at com.typesafe.config.impl.SimpleConfig.getString(SimpleConfig.java:197)
at akka.actor.ActorSystem$Settings.<init>(ActorSystem.scala:136)
at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:470)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:104)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:152)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:202)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:53)
at com.tekcomms.c2d.utils.MySparkUtils.<clinit>(MySparkUtils.java:29)
... 5 more
This is the part responsible for writing into HDFS:
import java.util.Arrays;
import java.util.List;

import org.apache.log4j.Logger;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class MySparkUtils {

    final static Logger LOGGER = Logger.getLogger(MySparkUtils.class);

    private static JavaSparkContext sc;

    static {
        SparkConf conf = new SparkConf().setAppName("MySparkUtils");
        String master = MyWatchService.getSPARK_MASTER();
        conf.setMaster(master);
        // This is horrible! How can I get rid of it?
        String[] jars = { "target/feed-adapter-0.0.1-SNAPSHOT.jar" };
        conf.setJars(jars);
        sc = new JavaSparkContext(conf);
        LOGGER.debug("spark context initialized!");
    }

    public static boolean saveWithinHDFS(String path, StringBuffer sb) {
        LOGGER.debug("Trying to save in HDFS. path: " + path);
        boolean isOk = false;
        // Split the buffer into lines and distribute them as an RDD.
        String[] aStrings = sb.toString().split("\n");
        List<String> jsonDatab = Arrays.asList(aStrings);
        JavaRDD<String> dataRDD = sc.parallelize(jsonDatab);
        try {
            dataRDD.saveAsTextFile(path);
            isOk = true; // only report success once the write went through
        } catch (Exception e) {
            LOGGER.error("Could not write to " + path, e);
        }
        return isOk;
    }
}
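For context, this is roughly how the method gets called; the namenode host/port and the output directory below are placeholders, not my real setup:

// Hypothetical usage; "namenode", port 8020 and the output path are assumptions.
StringBuffer sb = new StringBuffer();
sb.append("first,csv,line\n");
sb.append("second,csv,line\n");
boolean ok = MySparkUtils.saveWithinHDFS("hdfs://namenode:8020/user/aroman/out", sb);

As for the hard-coded jar path: one candidate I have seen for replacing it is JavaSparkContext.jarOfClass(MySparkUtils.class), which should return the jar a given class was loaded from, i.e. conf.setJars(JavaSparkContext.jarOfClass(MySparkUtils.class)); I have not verified it in this project.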
This is my pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.tekcomms.c2d</groupId>
<artifactId>feed-adapter</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>feed-adaptor</name>
<description>A PoC that scans a remote filesystem every second looking for new csv files from datacast, loads each csv into memory, and matches every line against a set of pattern rules (matching_phone, matching_mac). Lines that match go into one string buffer and discarded data into another; finally both buffers are written as files into HDFS.</description>
<developers>
<developer>
<name>Alonso Isidoro Román</name>
<email>XXX</email>
<timezone>+1 Madrid</timezone>
<organization>XXXX</organization>
<url>about.me/alonso.isidoro.roman</url>
</developer>
</developers>
<dependencies>
<!-- StringUtils... -->
<dependency>
<groupId>commons-lang</groupId>
<artifactId>commons-lang</artifactId>
<version>2.6</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
</dependency>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.0.0</version>
<scope>compile</scope>
<optional>false</optional>
</dependency>
</dependencies>
<repositories>
<repository>
<id>Akka repository</id>
<url>http://repo.akka.io/releases</url>
</repository>
<repository>
<id>cloudera-repos</id>
<name>Cloudera Repos</name>
<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
</repositories>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>com.tekcomms.c2d.main.FeedAdaptor</mainClass>
</transformer>
</transformers>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
What am I doing wrong?
EDIT

In the end the problem was twofold: I had to find the exact jars matching my HDFS cluster (I had the wrong versions!), and there was also a very restrictive umask on the HDFS side, so my local user could not write into HDFS because of permissions.
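For illustration, the kind of version pinning this came down to looks roughly like the following; the CDH version string is only an example (it shows up in the Cloudera repository paths), and the right one depends on the cluster:

<!-- Example only: spark-core must match the cluster's CDH build -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>0.9.0-cdh5.0.0-beta-2</version>
</dependency>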
Answer 0 (score: 0)
In the end the problem was twofold: I had to find the exact jars matching my HDFS cluster (I had the wrong versions!), and there was also a very restrictive umask on the HDFS side, so my local user could not write into HDFS because of permissions.
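To make the umask part concrete: on Hadoop 2.x clusters the default mask for newly created files is controlled by the fs.permissions.umask-mode property (set in core-site.xml); a restrictive value such as 077 will stop other users from writing. A permissive example, for illustration only:

<property>
  <name>fs.permissions.umask-mode</name>
  <value>022</value>
</property>

A side note for anyone who lands here with the same "No configuration setting found for key 'akka.version'" trace: with jars built by maven-shade-plugin this is frequently caused by Akka's reference.conf files from different jars clobbering each other during shading. The commonly suggested fix (not what solved it in my case) is to merge them with an AppendingTransformer in the shade configuration:

<transformer
    implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
  <resource>reference.conf</resource>
</transformer>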