Spark cannot find dependencies

Date: 2019-06-14 10:09:54

Tags: java apache-spark

I'm new to Spark, and I'm trying to run NaiveBayes based on the following example: https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/ml/JavaLogisticRegressionSummaryExample.java

I know this is probably a silly question, but I've searched for a long time and still can't resolve it. I'm using NetBeans. The following imports show errors asking me to search for dependencies:

import org.apache.spark.ml.classification.BinaryLogisticRegressionTrainingSummary;
import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;

The only dependency I could find was this one from https://mvnrepository.com/artifact/org.apache.spark/spark-mllib_2.10/1.0.0, but the errors persist.

Can anyone tell me where to find these Maven dependencies? Thanks!

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_2.10</artifactId>
    <version>1.0.0</version>
    <scope>runtime</scope>
</dependency>

My pom:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.mycompany</groupId>
    <artifactId>BigDataLab02</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>
    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>0.20.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.2.0</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <version>2.4</version>
                <configuration>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <mainClass>com.mycompany.bigdatalab02.Demo</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
    </properties>
</project>


1 Answer:

Answer 0 (score: 0)

You need to add:

<properties>
  <spark.version>2.1.1</spark.version>
  <scala.version>2.11.8</scala.version>
  <scala.compat.version>2.11</scala.compat.version>
</properties>

to the root of your pom.xml file.

Then add:

<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>${scala.version}</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_${scala.compat.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_${scala.compat.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_${scala.compat.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
</dependency>

to your `<dependencies>` section.
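One caveat worth noting (not part of the original answer): Maven's `provided` scope puts a dependency on the compile classpath but leaves it off the runtime classpath. That is correct when the jar is submitted via `spark-submit` to a cluster that already ships the Spark jars, but running the main class directly from NetBeans will then fail with `NoClassDefFoundError`. For local IDE runs, a common workaround is to use the default `compile` scope instead, for example:

```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_${scala.compat.version}</artifactId>
    <version>${spark.version}</version>
    <!-- compile (the default scope) keeps Spark on the runtime classpath,
         so the example can be launched directly from the IDE -->
    <scope>compile</scope>
</dependency>
```

The same scope change would apply to the `spark-sql` and `spark-mllib` artifacts; switch back to `provided` when packaging for a real cluster so the jar stays small.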