java.lang.ClassNotFoundException:org.apache.spark.streaming.twitter.TwitterReceiver

时间:2015-11-08 13:03:23

标签: java eclipse maven apache-spark

这是我的第一个Spark Stream应用程序。设置如下

运行spark群集的3个节点。一个主节点和两个从节点。

该应用程序是一个简单的java应用程序流,来自twitter和maven管理的依赖项。

以下是应用程序的代码

public class SimpleApp {

    public static void main(String[] args) {

        SparkConf conf = new SparkConf().setAppName("Simple Application").setMaster("spark://rethink-node01:7077");

        JavaStreamingContext sc = new JavaStreamingContext(conf, new Duration(1000));

        ConfigurationBuilder cb = new ConfigurationBuilder();

        cb.setDebugEnabled(true).setOAuthConsumerKey("ConsumerKey")
                .setOAuthConsumerSecret("ConsumerSecret")
                .setOAuthAccessToken("AccessToken")
                .setOAuthAccessTokenSecret("TokenSecret");

        OAuthAuthorization auth = new OAuthAuthorization(cb.build());

        JavaDStream<Status> tweets = TwitterUtils.createStream(sc, auth);

         JavaDStream<String> statuses = tweets.map(new Function<Status, String>() {
             public String call(Status status) throws Exception {
                return status.getText();
            }
        });

         statuses.print();;

         sc.start();

         sc.awaitTermination();

    }

}

这是pom文件

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>SparkFirstTry</groupId>
    <artifactId>SparkFirstTry</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <dependencies>
        <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.5.1</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.10</artifactId>
            <version>1.5.1</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.twitter4j</groupId>
            <artifactId>twitter4j-stream</artifactId>
            <version>3.0.3</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-twitter_2.10</artifactId>
            <version>1.0.0</version>
        </dependency>

    </dependencies>

    <build>
        <sourceDirectory>src</sourceDirectory>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.3</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>com.test.sparkTest.SimpleApp</mainClass>
                        </manifest>
                    </archive>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
            </plugin>

        </plugins>
    </build>
</project>

应用程序成功启动,但没有推文,抛出此异常

15/11/08 15:55:46 WARN TaskSetManager: Lost task 0.0 in stage 4.0 (TID 78, 192.168.122.39): java.io.IOException: java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163)
    at org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1707)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1345)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
    at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:501)
    at org.apache.spark.rdd.ParallelCollectionPartition$$anonfun$readObject$1.apply$mcV$sp(ParallelCollectionRDD.scala:74)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1160)
    ... 20 more

我从Eclipse本地运行应用程序。很明显问题出在依赖关系中,但我无法弄清楚如何解决它。

0 个答案:

没有答案