I'm trying to run an example from the Learning Spark book:

package com.example
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream
object SimpleExample {
  def main(args: Array[String]) {
    val master = args(0)
    val conf = new SparkConf().setMaster(master).setAppName("StreamingLogInput")
    // Create a StreamingContext with a 1 second batch size
    val ssc = new StreamingContext(conf, Seconds(1))
    // Create a DStream from all the input on port 7777
    val lines = ssc.socketTextStream("localhost", 7777)
    val errorLines = processLines(lines)
    // Print out the lines with errors, which causes this DStream to be evaluated
    errorLines.print()
    // Start our streaming context and wait for it to "finish"
    ssc.start()
    // Wait for 10 seconds then exit. To run forever, call awaitTermination() with no timeout.
    ssc.awaitTerminationOrTimeout(10000)
    ssc.stop()
  }

  def processLines(lines: DStream[String]): DStream[String] = {
    // Filter our DStream for lines with "error"
    lines.filter(_.contains("error"))
  }
}
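(Side note for anyone reproducing this: the stream reads from localhost:7777, so, assuming netcat is installed, you can feed it from a second terminal and type lines containing "error":

nc -lk 7777
)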
I'm building it as a Maven project; this is my pom.xml:
<project>
  <groupId>com.streaming.example</groupId>
  <artifactId>streaming-example</artifactId>
  <modelVersion>4.0.0</modelVersion>
  <name>example</name>
  <packaging>jar</packaging>
  <version>0.0.1</version>
  <dependencies>
    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.1.0</version>
      <scope>provided</scope>
    </dependency>
    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.2.0</version>
    </dependency>
  </dependencies>
  <properties>
    <java.version>1.7</java.version>
  </properties>
  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-compiler-plugin</artifactId>
          <configuration>
            <source>1.6</source>
            <target>1.6</target>
          </configuration>
        </plugin>
        <plugin>
          <groupId>net.alchim31.maven</groupId>
          <artifactId>scala-maven-plugin</artifactId>
          <version>3.1.6</version>
          <executions>
            <execution>
              <goals>
                <goal>compile</goal>
                <goal>testCompile</goal>
              </goals>
            </execution>
          </executions>
          <configuration>
            <args>
              <!-- work-around for https://issues.scala-lang.org/browse/SI-8358 -->
              <arg>-nobootcp</arg>
            </args>
          </configuration>
        </plugin>
      </plugins>
    </pluginManagement>
  </build>
</project>
But when I run it like this:
> spark-submit --class src/scala/com/example/SimpleExample.scala \
> target/streaming-example-0.0.1.jar local[4]
I get this error:
java.lang.ClassNotFoundException: src/scala/com/example/SimpleExample
    at java.lang.Class.forName0(Native Method)
EDIT: jar contents:
$ jar tf target/streaming-example-0.0.1.jar
META-INF/
META-INF/MANIFEST.MF
META-INF/maven/
META-INF/maven/com.streaming.example/
META-INF/maven/com.streaming.example/streaming-example/
META-INF/maven/com.streaming.example/streaming-example/pom.xml
Answer 0 (score: 1)
The --class argument is not a file path; it takes the fully qualified class name, resolved via the package structure. Try this:
spark-submit --class com.example.SimpleExample target/streaming-example-0.0.1.jar local[4]
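Note that local[4] stays at the end: spark-submit passes everything after the jar path to the application's main, and this program reads args(0) as the master URL, so without that argument it would fail with an ArrayIndexOutOfBoundsException. Alternatively, if the setMaster(args(0)) call is removed from the code, the master can be set through spark-submit itself:

spark-submit --master local[4] --class com.example.SimpleExample target/streaming-example-0.0.1.jar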
Answer 1 (score: 1)
Your SimpleExample.class is not in your jar — the listing above shows only META-INF, so no classes were compiled into it at all.

Check your Maven build plugins.
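In this particular pom, one likely culprit: both maven-compiler-plugin and scala-maven-plugin sit under <pluginManagement>, which only declares defaults — Maven does not run plugins from there unless they are also listed under <build><plugins>. So the Scala sources are never compiled, which matches the empty jar. Moving the declarations out of <pluginManagement>, roughly like this (the maven-compiler-plugin entry should move the same way), should make the sources compile:

<build>
  <plugins>
    <plugin>
      <groupId>net.alchim31.maven</groupId>
      <artifactId>scala-maven-plugin</artifactId>
      <version>3.1.6</version>
      <executions>
        <execution>
          <goals>
            <goal>compile</goal>
            <goal>testCompile</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

Two related points: scala-maven-plugin compiles sources from src/main/scala by default, while the path in the spark-submit command (src/scala/com/example/SimpleExample.scala) suggests a non-standard layout, so the sources may also need to move; and it is worth aligning spark-core and spark-streaming on the same version (for example, both 1.2.0) to avoid binary incompatibilities.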
You could also consider using the assembly plugin and building with:

mvn assembly:assembly

Since that creates an uber jar, it will include all the dependencies.
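For reference, a minimal assembly-plugin configuration using the built-in jar-with-dependencies descriptor might look like this (a sketch, not the only way to set it up):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
</plugin>

The uber jar then appears in target/ with a -jar-with-dependencies suffix, and that is the jar to pass to spark-submit. Dependencies marked provided, like spark-core here, are left out of it, which is what you want when the cluster already supplies Spark.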