There is a Spark job that uses Cassandra's CQLSSTableWriter. It runs fine as a standalone program, but fails with java.lang.VerifyError: Operand stack overflow when launched through spark-submit, even on a clean local installation. It doesn't even use executors, it is just a driver! So something on Spark's classpath must be causing it. The bytecode is somewhat messed up anyway, because Guava 19.0 has to be shaded to make the Cassandra library work (Spark provides Guava 14.0). How do I debug this thing?
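To make "something on Spark's classpath" more concrete, this is the kind of check I have in mind; it is only a throwaway sketch (ClasspathCheck is a made-up name and the resource paths are my guesses at what is worth looking at), run once with plain java and once under spark-submit:

import scala.collection.JavaConverters._

// Throwaway sketch: print every classpath entry that provides the classes involved,
// to see what spark-submit adds on the driver classpath on top of the assembly jar.
object ClasspathCheck extends App {
  val suspects = Seq(
    "org/apache/cassandra/config/ColumnDefinition.class", // class named in the VerifyError
    "com/google/common/collect/ImmutableList.class",      // unshaded Guava (Spark's 14.0)
    "shaded/google/common/collect/ImmutableList.class"    // shaded Guava 19.0 from the assembly
  )
  for (resource <- suspects) {
    println(resource)
    getClass.getClassLoader.getResources(resource).asScala.foreach(url => println(s"  $url"))
  }
}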
Here is a minimal example of the problem; it still hits the same wall:
Exception in thread "main" java.lang.VerifyError: Operand stack overflow
Exception Details:
  Location:
    org/apache/cassandra/config/ColumnDefinition.comparisonOrder(Lorg/apache/cassandra/config/ColumnDefinition$Kind;ZJLorg/apache/cassandra/cql3/ColumnIdentifier;)J @49: bipush
  Reason:
    Exceeded max stack size.
  Current Frame:
    bci: @49
    flags: { }
    locals: { 'org/apache/cassandra/config/ColumnDefinition$Kind', integer, long, long_2nd, 'org/apache/cassandra/cql3/ColumnIdentifier' }
    stack: { long, long_2nd, long, long_2nd }
  Bytecode:
    0x0000000: b200 429a 0019 2009 949b 000b 2014 0043
    0x0000010: 949b 000b bb00 4659 b700 4abf 2ab6 004e
    0x0000020: 8510 3d79 1b99 0009 1400 4fa7 0004 0981
    0x0000030: 2010 3079 8119 04b4 0055 1010 7d81 ad
  Stackmap Table:
    same_frame(@20)
    same_frame(@28)
    same_locals_1_stack_item_frame(@46,Long)
    full_frame(@47,{Object[#16],Integer,Long,Object[#82]},{Long,Long})

    at org.apache.cassandra.db.Columns.<clinit>(Columns.java:55)
    at org.apache.cassandra.db.PartitionColumns.<clinit>(PartitionColumns.java:35)
    at org.apache.cassandra.config.CFMetaData$Builder.build(CFMetaData.java:1464)
    at org.apache.cassandra.config.CFMetaData.compile(CFMetaData.java:499)
    at org.apache.cassandra.schema.SchemaKeyspace.compile(SchemaKeyspace.java:268)
    at org.apache.cassandra.schema.SchemaKeyspace.<clinit>(SchemaKeyspace.java:116)
    at org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.build(CQLSSTableWriter.java:517)
The job itself:
package me.synapse.cassandra_export

import java.nio.file.Files

import org.apache.cassandra.dht.Murmur3Partitioner
import org.apache.cassandra.io.sstable.CQLSSTableWriter

object CassandraExport extends App {
  val writer = CQLSSTableWriter.builder()
    .inDirectory(Files.createTempDirectory("export").toAbsolutePath.toString)
    .forTable("CREATE TABLE my_space.my_table (id int PRIMARY KEY, value text)")
    .using("INSERT INTO my_space.my_table (id, value) VALUES(?, ?)")
    .withPartitioner(new Murmur3Partitioner())
    .build()
}
The build file:
name := "cassandra_export"

scalaVersion := "2.11.8"

libraryDependencies := Seq(
  "org.apache.spark" %% "spark-sql" % "2.3.2" % "provided",
  ("org.apache.cassandra" % "cassandra-all" % "3.11.3").exclude("org.slf4j", "jcl-over-slf4j"),
  "com.datastax.cassandra" % "cassandra-driver-core" % "3.6.0",
  "io.netty" % "netty-all" % "4.0.56.Final" // to evict all other nettys
)

assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "shaded.google.common.@1")
    .inAll // Cassandra requires Guava 19.0 while Spark provides 14.0
)

assemblyMergeStrategy in assembly := {
  case PathList(ps @ _*) if Assembly.isReadme(ps.last) || Assembly.isLicenseFile(ps.last) =>
    MergeStrategy.rename
  case PathList("META-INF", xs @ _*) =>
    xs map { _.toLowerCase } match {
      case ("manifest.mf" :: Nil) | ("index.list" :: Nil) | ("dependencies" :: Nil) =>
        MergeStrategy.discard
      case ps @ (x :: xs) if ps.last.endsWith(".sf") || ps.last.endsWith(".dsa") =>
        MergeStrategy.discard
      case "plexus" :: xs =>
        MergeStrategy.discard
      case "services" :: xs =>
        MergeStrategy.filterDistinctLines
      case ("spring.schemas" :: Nil) | ("spring.handlers" :: Nil) =>
        MergeStrategy.filterDistinctLines
      case _ => MergeStrategy.discard
    }
  case _ => MergeStrategy.deduplicate
}
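For completeness, one way I imagine separating the shading question from the Spark question is a probe that does nothing but link the class named in the VerifyError, so the same assembly jar can be tried with plain java and with spark-submit (VerifyProbe is a made-up name, not part of the actual job):

package me.synapse.cassandra_export

object VerifyProbe extends App {
  // Linking a class verifies all of its methods, so this should fail with the same
  // VerifyError if the shaded bytecode itself is broken, regardless of the launcher.
  val cls = Class.forName("org.apache.cassandra.config.ColumnDefinition")
  println(s"verified ${cls.getName}")
  println(s"loaded from ${Option(cls.getProtectionDomain.getCodeSource).map(_.getLocation).getOrElse("bootstrap")}")
}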