java.lang.ClassNotFoundException: text.DefaultSource

Date: 2016-11-06 09:31:15

Tags: java scala maven apache-spark intellij-idea

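I have a Scala Maven application that uses Spark, and I work in IntelliJ IDEA. I built an executable jar from it, but when I try to launch it from the Windows console I get an error about a missing class. I cannot figure out what is wrong, because I have added the dependency to my .pom file. When I look inside the .jar, I can see the library that contains that class: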

[Image: needed library in .jar]

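I tried two plugins, maven-shade-plugin and maven-assembly-plugin, and the result is the same. I also tried adding this library to the classpath explicitly through Project Structure -> Libraries in IntelliJ: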

[Image: classpath in IDEA]

Any help would be appreciated! Here is my code:

import org.apache.spark.broadcast.Broadcast
import org.apache.spark.ml.recommendation.{ALS, ALSModel}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import scala.collection.{Map, Set}
import scala.collection.mutable.ArrayBuffer
import scala.util.Random
object RunRecommender {

  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder()
      .master("local")
      .appName("Recommender Engines with Audioscrobbler data")
      .config("spark.sql.warehouse.dir", "spark-warehouse")
      .getOrCreate()

    val rawUserArtistData: Dataset[String] = spark.read.textFile("user_artist_data.txt")
    val rawArtistData: Dataset[String] = spark.read.textFile("artist_data.txt")
    val rawArtistAlias: Dataset[String] = spark.read.textFile("artist_alias.txt")

    val runRecommender: RunRecommender = new RunRecommender(spark)
    runRecommender.preparation(rawUserArtistData, rawArtistData, rawArtistAlias)
    runRecommender.model(rawUserArtistData, rawArtistData, rawArtistAlias)
    runRecommender.evaluate(rawUserArtistData, rawArtistAlias)
    runRecommender.recommend(rawUserArtistData, rawArtistData, rawArtistAlias)
  }

}

Here is my .pom file:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>recommends</groupId>
    <artifactId>recommends</artifactId>
    <packaging>jar</packaging>
    <name>Recommender Engine with Audioscrobbler data</name>
    <version>1.0-SNAPSHOT</version>

    <repositories>
    <repository>
        <id>mavencentral</id>
        <name>Maven Central</name>
        <url>https://repo1.maven.org/maven2/</url>
        <layout>default</layout>
    </repository>
    </repositories>

    <build>
        <pluginManagement>
            <plugins>
                <plugin>
                    <groupId>net.alchim31.maven</groupId>
                    <artifactId>scala-maven-plugin</artifactId>
                    <version>3.2.1</version>
                </plugin>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <version>2.0.2</version>
                </plugin>
            </plugins>
        </pluginManagement>

        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.1</version>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>RunRecommender</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>scala-compile-first</id>
                        <phase>process-resources</phase>
                        <goals>
                            <goal>add-source</goal>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>scala-test-compile</id>
                        <phase>process-test-resources</phase>
                        <goals>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-shade-plugin</artifactId>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <createDependencyReducedPom>false</createDependencyReducedPom>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>RunRecommender</mainClass>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_2.11</artifactId>
            <version>2.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>2.0.0</version>
        </dependency>
    </dependencies>

</project>

This is the stack trace when I try to run the jar:

16/11/06 11:56:40 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
16/11/06 11:56:40 INFO SharedState: Warehouse path is 'spark-warehouse'.
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: text. Please find packages at http://spark-packages.org
        at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:145)
        at org.apache.spark.sql.execution.datasources.DataSource.providingClass$lzycompute(DataSource.scala:78)
        at org.apache.spark.sql.execution.datasources.DataSource.providingClass(DataSource.scala:78)
        at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:310)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
        at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:492)
        at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:528)
        at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:501)
        at RunRecommender$.main(RunRecommender.scala:20)
        at RunRecommender.main(RunRecommender.scala)
Caused by: java.lang.ClassNotFoundException: text.DefaultSource
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:130)
        at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:130)
        at scala.util.Try$.apply(Try.scala:192)
        at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:130)
        at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:130)
        at scala.util.Try.orElse(Try.scala:84)
        at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:130)
        ... 9 more
16/11/06 11:56:40 INFO SparkContext: Invoking stop() from shutdown hook

2 Answers:

Answer 0 (score: 0)

I had a similar problem. In my case, this element was missing from the Maven .pom file:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.3.2</version>
</dependency>
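
Seeing the library's classes inside the jar may not be enough to tell whether the `text` source can be found. As far as I know, Spark maps short names such as `text` to classes through `java.util.ServiceLoader` entries for `org.apache.spark.sql.sources.DataSourceRegister`, and it falls back to loading `text.DefaultSource` only when no registered name matches, which would explain the class name in the trace. The sketch below, which uses only the standard library, lists what a given jar registers. The names `ListDataSources` and `registeredProviders` are made up for illustration, and the jar path is whatever your build produces.

```scala
import java.util.jar.JarFile
import scala.io.Source

object ListDataSources {
  // Service file that lists the data source providers packaged in a jar.
  val ServiceEntry = "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister"

  /** Returns the provider class names registered in the jar, or Nil if the entry is absent. */
  def registeredProviders(jarPath: String): List[String] = {
    val jar = new JarFile(jarPath)
    try {
      Option(jar.getEntry(ServiceEntry)) match {
        case Some(entry) =>
          val in = jar.getInputStream(entry)
          try {
            // Skip blank lines and '#' comments, as the service file format allows both.
            Source.fromInputStream(in, "UTF-8").getLines()
              .map(_.trim)
              .filter(line => line.nonEmpty && !line.startsWith("#"))
              .toList
          } finally in.close()
        case None => Nil
      }
    } finally jar.close()
  }

  def main(args: Array[String]): Unit = {
    // args(0) is the path to the jar to inspect (an assumption of this sketch).
    val providers = registeredProviders(args(0))
    if (providers.isEmpty) println(s"No entries found in $ServiceEntry")
    else providers.foreach(println)
  }
}
```

If the list for the shaded jar contains no provider from the `org.apache.spark.sql.execution.datasources.text` package, the registration from spark-sql was probably lost or overwritten during packaging, since more than one Spark module ships a file with this name. In that case it may be worth checking whether the maven-shade-plugin `ServicesResourceTransformer`, which merges `META-INF/services` files, is configured.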

Answer 1 (score: 0)

You can resolve this easily by running the same jar with spark-submit instead of running it with java -jar in a terminal.