I am writing a Spark program in Java; the code is as follows:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class SimpleApp {
    public static void main(String[] args) {
        // Run locally; the app name shows up in the Spark UI
        SparkConf conf = new SparkConf().setAppName("wordCount").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Read the test file as an RDD of lines
        JavaRDD<String> input = sc.textFile("/bigdata/softwares/spark-2.1.0-bin-hadoop2.7/testdata/a.txt");
        System.out.println();

        // Count the lines that contain "yes"
        Long bCount = input.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("yes"); }
        }).count();

        // Count the lines that contain "ywq"
        Long cCount = input.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("ywq"); }
        }).count();

        System.out.println("yes:" + bCount + " ywq:" + cCount + " all:");
        // sc.stop();
    }
}
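Since Function has a single abstract method, the two filters can also be written as Java 8 lambdas. This is only a more compact variant of the code above (a sketch assuming the project is compiled with Java 8) and is unrelated to the packaging problem:

// Same counts as above, written with Java 8 lambdas instead of anonymous classes
long yesCount = input.filter(s -> s.contains("yes")).count();
long ywqCount = input.filter(s -> s.contains("ywq")).count();
System.out.println("yes:" + yesCount + " ywq:" + ywqCount);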
The pom.xml is as follows:
<dependencies>
    <dependency> <!-- Spark dependency -->
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>2.3</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
Maven packages everything into a jar file without problems, but the following error is reported when I run it. I am just starting to learn Spark; can anyone explain what is wrong? Thanks. [error screenshot]
Answer 0 (score: 0)
You also have to launch the jar with spark-submit and specify the main class:
spark-submit --class <your.package>.SimpleApp testjar/spark-0.0.1-SNAPSHOT.jar
Answer 1 (score: 0)
You need to specify both the main class and the master:
./bin/spark-submit --class package.name.MainClass --master local[2] /testjar/spark-0.0.1-SNAPSHOT.jar
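If the runtime error is about a missing main class in the jar's manifest (only a guess, since the actual error screenshot is not readable here), an alternative to passing --class on every submit is to let the shade plugin write Main-Class into the manifest, so the shaded jar can also be run directly with java -jar. A sketch of that configuration:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.3</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <!-- Writes Main-Class: SimpleApp into META-INF/MANIFEST.MF of the shaded jar;
                         use the fully qualified name if SimpleApp lives in a package -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>SimpleApp</mainClass>
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>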