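Maven package error when trying to build a simple Spark standalone Java application

Time: 2017-07-22 09:21:08

Tags: maven apache-spark

I am trying to build a simple Spark standalone Java application, following the Self-Contained Applications section of the Spark documentation.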

/* SimpleApp.java */
import org.apache.spark.sql.SparkSession;

public class SimpleApp {
    public static void main(String[] args) {
        String logFile = "YOUR_SPARK_HOME/README.md"; // Should be some file on your system
        SparkSession spark = SparkSession.builder().appName("Simple Application").getOrCreate();
        Dataset<String> logData = spark.read.textFile(logFile).cache();

        long numAs = logData.filter(s -> s.contains("a")).count();
        long numBs = logData.filter(s -> s.contains("b")).count();

        System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);

        spark.stop();
    }
}

The project layout is as follows:

./pom.xml
./src
./src/main
./src/main/java
./src/main/java/SimpleApp.java

Here is the pom.xml:

<project>
    <groupId>edu.berkeley</groupId>
    <artifactId>simple-project</artifactId>
    <modelVersion>4.0.0</modelVersion>
    <name>Simple Project</name>
    <packaging>jar</packaging>
    <version>1.0</version>
    <dependencies>
        <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.2.0</version>
        </dependency>
    </dependencies>
</project>

When I run mvn package, I get the following errors.

[ERROR] COMPILATION ERROR : 
[INFO] -------------------------------------------------------------
[ERROR] /Users/fengyich/Dev/Sandbox/SimpleApp/src/main/java/SimpleApp.java:[8,9] cannot find symbol
  symbol:   class Dataset
  location: class SimpleApp
[ERROR] /Users/fengyich/Dev/Sandbox/SimpleApp/src/main/java/SimpleApp.java:[8,40] cannot find symbol
  symbol:   variable read
  location: variable spark of type org.apache.spark.sql.SparkSession

3 Answers:

Answer 0: (score: 0)

Maybe you need: import org.apache.spark.sql.Dataset;

Answer 1: (score: 0)

Add one extra import line:

import org.apache.spark.sql.Dataset;

Then change

spark.read.textFile(logFile).cache();

to

spark.read().textFile(logFile).cache();

In Java, read is a method on SparkSession, so it has to be called with parentheses.

My pom.xml looks like this:

<project>
  <groupId>edu.berkeley</groupId>
  <artifactId>simple-project</artifactId>
  <modelVersion>4.0.0</modelVersion>
  <name>Simple Project</name>
  <packaging>jar</packaging>
  <version>1.0</version>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.3</version>
        <configuration>
            <source>1.8</source>
            <target>1.8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.11</artifactId>
      <version>2.2.0</version>
    </dependency>
  </dependencies>
</project>

This should solve your problem.
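The maven-compiler-plugin block in the pom.xml above is presumably there so that the lambdas in SimpleApp.java compile: older releases of that plugin default to a Java source level below 1.8. As a sketch of an alternative, and not something the original answer proposed, the same levels can be set with the standard maven.compiler.source and maven.compiler.target properties, placed directly under the project element:

```xml
<!-- Alternative to the <build> section above: set the Java level via properties. -->
<!-- Assumption: no other compiler-plugin configuration in the project overrides these. -->
<properties>
  <maven.compiler.source>1.8</maven.compiler.source>
  <maven.compiler.target>1.8</maven.compiler.target>
</properties>
```

Either form should have the same effect; the explicit plugin block also pins the plugin version, which the properties alone do not.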

Answer 2: (score: -1)

You can try adding the following plugin:
