I have the following Spark word count program.
When I run it from Eclipse I get Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.CanSetDropBehind, but if I export it as a runnable jar and run it from the terminal with the spark-submit command shown after the code, it runs fine.
package com.sample.spark;

import java.util.Arrays;
import java.util.List;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.*;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFlatMapFunction;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;

import scala.Tuple2;

public class SparkWordCount {

    public static void main(String[] args) {
        // Local master pointing at the Spark 1.4.0 (Hadoop 1) installation.
        SparkConf conf = new SparkConf()
                .setAppName("wordcountspark")
                .setMaster("local")
                .setSparkHome("/Users/hadoop/spark-1.4.0-bin-hadoop1");
        JavaSparkContext sc = new JavaSparkContext(conf);
        //SparkConf conf = new SparkConf();
        //JavaSparkContext sc = new JavaSparkContext("hdfs", "Simple App", "/Users/hadoop/spark-1.4.0-bin-hadoop1", new String[]{"target/simple-project-1.0.jar"});

        // Read the input text from HDFS.
        JavaRDD<String> textFile = sc.textFile("hdfs://localhost:54310/data/wordcount");

        // Split every line into words.
        JavaRDD<String> words = textFile.flatMap(new FlatMapFunction<String, String>() {
            public Iterable<String> call(String s) {
                return Arrays.asList(s.split(" "));
            }
        });

        // Pair each word with a count of 1.
        JavaPairRDD<String, Integer> pairs = words.mapToPair(new PairFunction<String, String, Integer>() {
            public Tuple2<String, Integer> call(String s) {
                return new Tuple2<String, Integer>(s, 1);
            }
        });

        // Sum the counts per word.
        JavaPairRDD<String, Integer> counts = pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {
            public Integer call(Integer a, Integer b) {
                return a + b;
            }
        });

        // Write the (word, count) pairs back to HDFS.
        counts.saveAsTextFile("hdfs://localhost:54310/data/output/spark/outfile");
    }
}
This is the spark-submit command I run from the terminal:

bin/spark-submit --class com.sample.spark.SparkWordCount --master local /Users/hadoop/spark-1.4.0-bin-hadoop1/finalJars/SparkJar-v2.jar

My Maven pom looks like:
Answer 0 (score: 1)
When you run from Eclipse, the referenced jars are the only source the program has to run with. So, for some reason, the hadoop-core jar (where CanSetDropBehind lives) was not added correctly to the Eclipse build path from your local repository. You need to identify whether that is a proxy issue or some other problem with the pom.
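If it turns out to be a pom problem, one option is to declare the Hadoop client dependency explicitly so that m2e resolves it onto the Eclipse build path. The fragment below is only a sketch; the ${hadoop.version} property is hypothetical and should match the Hadoop build your Spark distribution was compiled against.

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <!-- hypothetical property: set it to the Hadoop version your cluster / Spark build uses -->
    <version>${hadoop.version}</version>
</dependency>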
When you run the jar from the terminal, it probably works because the required jars are already present on the classpath being referenced. Alternatively, you may have chosen to package those jars (including hadoop-core) into the jar itself as a fat jar; I assume you did not use that option when exporting the jar. With a fat jar, the references are picked up from inside the jar itself, without depending on the classpath.
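As an illustration of the fat-jar option (a sketch only, not the poster's actual build), the maven-shade-plugin can bundle the dependencies into the application jar at package time:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <!-- version shown is an example -->
    <version>2.4.1</version>
    <executions>
        <execution>
            <!-- binding shade to the package phase makes `mvn package` produce an uber jar -->
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
        </execution>
    </executions>
</plugin>

In Spark deployments the Spark dependency itself is usually marked as provided, so that only the application classes and any missing Hadoop classes end up inside the fat jar.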
Verify each step; that will help you identify the cause. Happy coding.
Answer 1 (score: 1)
Found that this was because the hadoop-common jar at version 0.23.11 does not contain the class. Changed the version to 2.7.0 and also added the following dependency:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.7.0</version>
</dependency>
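The version change mentioned above would correspond to a hadoop-common entry like the following (a sketch; the rest of the original pom is not shown in the question):

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.7.0</version>
</dependency>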
That got rid of the error, but I still see the following one:
Exception in thread "main" java.io.EOFException: End of File Exception between local host is: "mbr-xxxx.local/127.0.0.1"; destination host is: "localhost":54310; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException