I am getting started with Apache Spark using the Java API, Maven, and IntelliJ IDEA.
I wrote a program, but when I submit it I get a `ClassNotFoundException`.
Mac-mini:~ username$ spark-submit --class test.FirstRDD --master local IdeaProjects/exercices/out/artifacts/exercices_jar/exercices.jar
2018-07-06 10:09:56 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
java.lang.ClassNotFoundException: test.FirstRDD
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2018-07-06 10:09:56 INFO  ShutdownHookManager:54 - Shutdown hook called
2018-07-06 10:09:56 INFO  ShutdownHookManager:54 - Deleting directory /private/var/folders/m_/y0g8dr8x4vx390cv6z23s47c0000gr/T/spark-e70c0995-5c0d-4bb5-9bc1-776b27786949
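To rule out a packaging problem, I wanted to see which classes are actually inside the artifact. A small diagnostic sketch I used (the default jar path below is the one from my spark-submit command; adjust as needed):

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class JarCheck {

    // Collects the names of all .class entries in a jar, so one can
    // verify whether test/FirstRDD.class was actually packaged.
    static List<String> listClasses(String jarPath) throws IOException {
        List<String> classes = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    classes.add(name);
                }
            }
        }
        return classes;
    }

    public static void main(String[] args) throws IOException {
        String jarPath = args.length > 0 ? args[0]
                : "IdeaProjects/exercices/out/artifacts/exercices_jar/exercices.jar";
        listClasses(jarPath).forEach(System.out::println);
    }
}
```

The same check can be done from the shell with `jar tf exercices.jar`; if `test/FirstRDD.class` is not listed, the spark-submit classloader cannot find the class.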
Here is my main class:
FirstRDD.java
package test;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import java.net.URISyntaxException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Comparator;
public class FirstRDD {
    public void run() throws URISyntaxException {
        SparkConf conf = new SparkConf().setAppName("Exercice 1").setMaster("local[*]");
        conf.setJars(new String[]{"/Users/username/IdeaProjects/exercices/out/artifacts/exercices_jar/exercices.jar"});
        JavaSparkContext sc = new JavaSparkContext(conf);
        //sc.addJar("/Users/username/IdeaProjects/exercices/out/artifacts/exercices_jar/exercices.jar");
        Path url = Paths.get(FirstRDD.class.getResource("/ratings.txt").getPath());
        JavaRDD<Rating> lines = sc.textFile(url.toString())
                .map(line -> line.split("\\t"))
                // the fourth column holds epoch seconds, so no *1000 factor is needed
                .map(row -> new Rating(Long.parseLong(row[0]), Long.parseLong(row[1]), Integer.parseInt(row[2]),
                        LocalDateTime.ofInstant(Instant.ofEpochSecond(Long.parseLong(row[3])), ZoneId.systemDefault())));
        double count = lines
                .filter(rating -> rating.user == 200)
                .count();
        double max = lines
                .filter(rating -> rating.user == 200)
                .mapToDouble(rating -> rating.rating)
                .max(Comparator.<Double>naturalOrder());
        double min = lines
                .filter(rating -> rating.user == 200)
                .mapToDouble(rating -> rating.rating)
                .min(Comparator.<Double>naturalOrder());
        double mean = lines
                .filter(rating -> rating.user == 200)
                .mapToDouble(rating -> rating.rating)
                .mean();
        System.out.println("Count : " + count + " | Max : " + max + " | Min : " + min + " | Mean : " + mean);
    }

    public static void main(String[] args) throws URISyntaxException {
        //System.out.println(new test.FirstRDD().getClass().getCanonicalName());
        new FirstRDD().run();
    }
}
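Aside: the fourth column of ratings.txt is an epoch timestamp in seconds (MovieLens-style data). A minimal standalone check of converting such a value to a LocalDateTime (the sample value 881250949 is my own illustration, not taken from my data):

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class EpochCheck {
    public static void main(String[] args) {
        long epochSeconds = 881250949L; // sample value, seconds since 1970-01-01
        // Instant.ofEpochSecond expects seconds; passing seconds * 1000
        // would land tens of thousands of years in the future.
        LocalDateTime dt = LocalDateTime.ofInstant(
                Instant.ofEpochSecond(epochSeconds), ZoneOffset.UTC);
        System.out.println(dt); // a date in late 1997
    }
}
```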
And here is my pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>spark</groupId>
    <artifactId>exercices</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <spark.version>2.3.1</spark.version>
        <scala.version>2.11</scala.version>
    </properties>

    <repositories>
        <repository>
            <id>Apache Spark temp - Release Candidate repo</id>
            <url>https://repository.apache.org/content/repositories/orgapachespark-1080/</url>
        </repository>
    </repositories>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.2</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
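Since the jar is produced by IntelliJ's artifact builder rather than by Maven, I also considered packaging with Maven directly. A sketch of a maven-jar-plugin configuration with an explicit Main-Class (untested on my setup; the plugin version is an assumption):

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-jar-plugin</artifactId>
    <version>3.1.0</version>
    <configuration>
        <archive>
            <manifest>
                <!-- records test.FirstRDD as the entry point in MANIFEST.MF -->
                <mainClass>test.FirstRDD</mainClass>
            </manifest>
        </archive>
    </configuration>
</plugin>
```

With this, `mvn package` would place the jar under target/, which could be passed to spark-submit instead of the IDEA artifact.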
So far I have tried:
Appending `.class` to `test.FirstRDD`
Running the SparkPi 10 example, which works