This is a Spark application for reading an input JSON file, but I am unable to read the input file.
import java.io.IOException;

import org.apache.log4j.BasicConfigurator;
import org.apache.spark.SparkConf;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;

public class SampleApplication {
    public static void main(String[] args) throws IOException {
        BasicConfigurator.configure();
        SparkConf conf = new SparkConf().setMaster("local[*]");
        SparkSession spark = SparkSession
                .builder()
                .config(conf)
                .getOrCreate();
        // Create an Encoder for the Java bean class
        Encoder<Input> inputEncoder = Encoders.bean(Input.class);
        Dataset<Input> df = spark.read().option("multiline", "true").json(args[0]).as(inputEncoder);
        System.out.println("Finished !!!!!");
        df.show();
        spark.close();
    }
}
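The Input class referenced above is not included in the question. Encoders.bean requires a public class with a no-argument constructor and getters/setters for each field; a minimal sketch with hypothetical field names, using the Lombok dependency already declared in the build:

import java.io.Serializable;

import lombok.Data;

// Hypothetical bean -- the real Input class and its fields are not shown
// in the question. Lombok's @Data generates the getters/setters and the
// no-arg constructor that Encoders.bean(Input.class) needs.
@Data
public class Input implements Serializable {
    private String name;   // assumed field
    private long value;    // assumed field
}

With the multiline option set to true, the file passed as args[0] may contain a single pretty-printed JSON document or array, e.g. [{"name": "a", "value": 1}], rather than one JSON object per line.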
The Gradle dependencies I have included in the project:
dependencies {
    compile group: 'log4j', name: 'log4j', version: '1.2.17'
    compileOnly 'org.projectlombok:lombok:1.18.8'
    annotationProcessor 'org.projectlombok:lombok:1.18.8'
    compile group: 'com.fasterxml.jackson.core', name: 'jackson-core', version: '2.9.9'
    compile group: 'com.fasterxml.jackson.core', name: 'jackson-annotations', version: '2.9.9'
    compile 'org.apache.spark:spark-sql_2.11:2.3.0'
    compile 'org.apache.spark:spark-core_2.11:2.3.0'
    testCompile group: 'junit', name: 'junit', version: '4.12'
}
This is the error:
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Stopwatch.elapsedMillis()J
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:279)
at org.apache.spark.input.StreamFileInputFormat.setMinPartitions(PortableDataStream.scala:51)
at org.apache.spark.rdd.BinaryFileRDD.getPartitions(BinaryFileRDD.scala:51)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
.....
I did run gradle dependencies; I can't paste the entire output:
+--- org.apache.spark:spark-catalyst_2.11:2.3.0
| | +--- org.scala-lang:scala-reflect:2.11.8 (*)
| | +--- org.scala-lang.modules:scala-parser-combinators_2.11:1.0.4 (*)
| | +--- org.apache.spark:spark-core_2.11:2.3.0 (*)
| | +--- org.apache.spark:spark-tags_2.11:2.3.0 (*)
| | +--- org.apache.spark:spark-unsafe_2.11:2.3.0 (*)
| | +--- org.apache.spark:spark-sketch_2.11:2.3.0 (*)
| | +--- org.codehaus.janino:janino:3.0.8
| | | \--- org.codehaus.janino:commons-compiler:3.0.8
| | +--- org.codehaus.janino:commons-compiler:3.0.8
| | +--- org.antlr:antlr4-runtime:4.7
| | +--- commons-codec:commons-codec:1.10
| | \--- org.spark-project.spark:unused:1.0.0
| +--- org.apache.spark:spark-tags_2.11:2.3.0 (*)
| +--- org.apache.orc:orc-core:1.4.1
| | +--- com.google.protobuf:protobuf-java:2.5.0
| | +--- commons-lang:commons-lang:2.6
| | +--- io.airlift:aircompressor:0.8
| | \--- org.slf4j:slf4j-api:1.7.5 -> 1.7.25
| +--- org.apache.orc:orc-mapreduce:1.4.1
| | +--- com.esotericsoftware:kryo-shaded:3.0.3 (*)
| | +--- commons-codec:commons-codec:1.4 -> 1.10
| | \--- org.apache.hadoop:hadoop-mapreduce-client-core:2.6.4 -> 2.6.5 (*)
| +--- org.apache.parquet:parquet-column:1.8.2
| | +--- org.apache.parquet:parquet-common:1.8.2
| | | \--- org.slf4j:slf4j-api:1.7.5 -> 1.7.25
| | +--- org.apache.parquet:parquet-encoding:1.8.2
| | | +--- org.apache.parquet:parquet-common:1.8.2 (*)
| | | \--- commons-codec:commons-codec:1.5 -> 1.10
| | \--- commons-codec:commons-codec:1.5 -> 1.10
| +--- org.apache.parquet:parquet-hadoop:1.8.2
| | +--- org.apache.parquet:parquet-column:1.8.2 (*)
| | +--- org.apache.parquet:parquet-format:2.3.1
| | +--- org.apache.parquet:parquet-jackson:1.8.2
| | +--- org.codehaus.jackson:jackson-mapper-asl:1.9.11 -> 1.9.13 (*)
| | +--- org.codehaus.jackson:jackson-core-asl:1.9.11 -> 1.9.13
| | \--- org.xerial.snappy:snappy-java:1.1.1.6 -> 1.1.2.6
| +--- com.fasterxml.jackson.core:jackson-databind:2.6.7.1 (*)
| +--- org.apache.arrow:arrow-vector:0.8.0
| | +--- org.apache.arrow:arrow-format:0.8.0
| | | \--- com.vlkan:flatbuffers:1.2.0-3f79e055
| | +--- org.apache.arrow:arrow-memory:0.8.0
| | | +--- com.google.code.findbugs:jsr305:3.0.2
| | | \--- org.slf4j:slf4j-api:1.7.25
| | +--- joda-time:joda-time:2.9.9
| | +--- com.fasterxml.jackson.core:jackson-core:2.7.9 -> 2.9.9
| | +--- com.carrotsearch:hppc:0.7.2
| | +--- commons-codec:commons-codec:1.10
| | +--- com.vlkan:flatbuffers:1.2.0-3f79e055
| | +--- com.google.code.findbugs:jsr305:3.0.2
| | \--- org.slf4j:slf4j-api:1.7.25
| +--- org.apache.xbean:xbean-asm5-shaded:4.4
| \--- org.spark-project.spark:unused:1.0.0
+--- org.apache.spark:spark-core_2.11:2.3.0 (*)
\--- junit:junit:4.12
\--- org.hamcrest:hamcrest-core:1.3
Answer 0 (score: 0)
The elapsedMillis() method was removed from Guava long ago. It seems you need to update some of your libraries or use an older Guava version. To get the dependency tree, run gradle dependencies.
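One way to follow this advice in Gradle is to force a single Guava version across all configurations. The stack trace shows Hadoop's FileInputFormat calling Stopwatch.elapsedMillis(), so some dependency is pulling in a newer Guava where the method no longer exists. A sketch, assuming Guava 14.0.1 (an older release that still contains the method; it was deprecated in the 14.x line and removed in 16.0):

configurations.all {
    resolutionStrategy {
        // Pin Guava to a version that still ships Stopwatch.elapsedMillis(),
        // overriding any newer Guava pulled in transitively.
        force 'com.google.guava:guava:14.0.1'
    }
}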
Answer 1 (score: 0)
Include the following dependencies in Maven. This should fix the problem.
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>28.0-jre</version>
</dependency>
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>failureaccess</artifactId>
    <version>1.0.1</version>
</dependency>
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>listenablefuture</artifactId>
    <version>9999.0-empty-to-avoid-conflict-with-guava</version>
</dependency>
<dependency>
    <groupId>com.googlecode.concurrentlinkedhashmap</groupId>
    <artifactId>concurrentlinkedhashmap-lru</artifactId>
    <version>1.2_jdk5</version>
</dependency>
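Since the question's build uses Gradle rather than Maven, the equivalent declarations would look roughly like this (an untested sketch, using the same compile configuration as the rest of the build file):

dependencies {
    // Same coordinates as the Maven dependencies above
    compile 'com.google.guava:guava:28.0-jre'
    compile 'com.google.guava:failureaccess:1.0.1'
    compile 'com.google.guava:listenablefuture:9999.0-empty-to-avoid-conflict-with-guava'
    compile 'com.googlecode.concurrentlinkedhashmap:concurrentlinkedhashmap-lru:1.2_jdk5'
}

Whichever route you take, re-running gradle dependencies afterwards confirms which Guava version actually ends up on the classpath.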