我使用SparkSession创建代码spark但我无法运行此代码。 我想我在pom.xml或其他东西中缺少一些依赖项 -
import org.apache.spark.sql.SparkSession
val spark = SparkSession
.builder
.appName("loader")
.master("local")
.getOrCreate()
scala 2.11的pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>info.daviot</groupId>
<version>0.1-SNAPSHOT</version>
<artifactId>demo</artifactId>
<packaging>jar</packaging>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<scala.version>2.11.5</scala.version>
<java.version>1.7</java.version>
</properties>
<dependencies>
<dependency>
<artifactId>scala-library</artifactId>
<groupId>org.scala-lang</groupId>
<version>${scala.version}</version>
</dependency>
<!-- optional dependencies -->
<dependency>
<groupId>com.softwaremill.macwire</groupId>
<artifactId>macros_2.11</artifactId>
<version>0.8.0</version>
</dependency>
<dependency>
<groupId>com.typesafe.akka</groupId>
<artifactId>akka-actor_2.11</artifactId>
<version>2.3.9</version>
</dependency>
<dependency>
<groupId>com.github.nscala-time</groupId>
<artifactId>nscala-time_2.11</artifactId>
<version>1.4.0</version>
</dependency>
<dependency>
<groupId>com.propensive</groupId>
<artifactId>rapture-json-jawn_2.11</artifactId>
<version>1.1.0</version>
</dependency>
<!-- logs -->
<dependency>
<groupId>org.clapper</groupId>
<artifactId>grizzled-slf4j_2.11</artifactId>
<version>1.0.2</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>1.1.2</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.6</version>
</dependency>
<!-- tests -->
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_2.11</artifactId>
<version>2.2.2</version>
<scope>test</scope>
</dependency>
<dependency>
<artifactId>junit</artifactId>
<groupId>junit</groupId>
<version>4.10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-api-mockito</artifactId>
<version>1.5.5</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.1.6</version>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.0.2</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
当我尝试添加时,我得到了相同的错误:
<repositories>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/clouderarepos/</url>
</repository>
</repositories>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.0.0-cloudera1-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.0.0-cloudera1-SNAPSHOT</version>
</dependency>
导入有效:
导入dosn&#t>
import org.apache.spark.ml.feature.CountVectorizerModel
import org.apache.spark.ml.feature.StopWordsRemover
有错误无法解析符号
答案 0 :(得分:2)
你需要这些依赖
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.2.0</version>
</dependency>
答案 1 :(得分:1)
添加以下依赖项 -
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.0.0</version>
<scope>provided</scope>
</dependency>