We are running into the following issue when using Spark Structured Streaming with Kafka:
Ivy Default Cache set to: /root/.ivy2/cache
The jars for the packages stored in: /root/.ivy2/jars
:: loading settings :: url = jar:file:/usr/hdp/2.6.3.0-235/spark2/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.spark#spark-sql-kafka-0-10_2.11 added as a dependency
org.apache.kafka#kafka-clients added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found org.apache.spark#spark-sql-kafka-0-10_2.11;2.2.0 in central
found org.apache.spark#spark-tags_2.11;2.2.0 in local-m2-cache
found org.spark-project.spark#unused;1.0.0 in local-m2-cache
found org.apache.kafka#kafka-clients;0.10.1.0 in local-m2-cache
found net.jpountz.lz4#lz4;1.3.0 in local-m2-cache
found org.xerial.snappy#snappy-java;1.1.2.6 in local-m2-cache
found org.slf4j#slf4j-api;1.7.21 in local-m2-cache
:: resolution report :: resolve 3640ms :: artifacts dl 20ms
:: modules in use:
net.jpountz.lz4#lz4;1.3.0 from local-m2-cache in [default]
org.apache.kafka#kafka-clients;0.10.1.0 from local-m2-cache in [default]
org.apache.spark#spark-sql-kafka-0-10_2.11;2.2.0 from central in [default]
org.apache.spark#spark-tags_2.11;2.2.0 from local-m2-cache in [default]
org.slf4j#slf4j-api;1.7.21 from local-m2-cache in [default]
org.spark-project.spark#unused;1.0.0 from local-m2-cache in [default]
org.xerial.snappy#snappy-java;1.1.2.6 from local-m2-cache in [default]
:: evicted modules:
org.apache.kafka#kafka-clients;0.10.0.1 by [org.apache.kafka#kafka-clients;0.10.1.0] in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 8 | 2 | 2 | 1 || 7 | 0 |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
0 artifacts copied, 7 already retrieved (0kB/30ms)
18/03/14 15:52:02 INFO SparkContext: Running Spark version 2.2.0.2.6.3.0-235
18/03/14 15:52:02 INFO SparkContext: Submitted application: StructuredKafkaWordCount
...
18/03/14 15:52:05 INFO SparkContext: Added JAR file:/usr/hdp/2.6.3.0-235/spark2/examples/jars/spark-examples_2.11-2.2.0.2.6.3.0-235.jar at spark://172.16.10.53:31702/jars/spark-examples_2.11-2.2.0.2.6.3.0-235.jar with timestamp 1521022925004
18/03/14 15:52:05 INFO SparkContext: Added JAR file:/usr/hdp/2.6.3.0-235/spark2/examples/jars/scopt_2.11-3.3.0.jar at spark://172.16.10.53:31702/jars/scopt_2.11-3.3.0.jar with timestamp 1521022925006
18/03/14 15:52:05 INFO SparkContext: Added JAR file:/usr/hdp/2.6.3.0-235/spark2/examples/jars/spark-assembly_2.10-0.9.0-incubating.jar at spark://172.16.10.53:31702/jars/spark-assembly_2.10-0.9.0-incubating.jar with timestamp 1521022925006
18/03/14 15:52:05 INFO SparkContext: Added JAR file:/root/.ivy2/jars/org.apache.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar at spark://172.16.10.53:31702/jars/org.apache.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar with timestamp 1521022925006
18/03/14 15:52:05 INFO SparkContext: Added JAR file:/root/.ivy2/jars/org.apache.kafka_kafka-clients-0.10.1.0.jar at spark://172.16.10.53:31702/jars/org.apache.kafka_kafka-clients-0.10.1.0.jar with timestamp 1521022925006
18/03/14 15:52:05 INFO SparkContext: Added JAR file:/root/.ivy2/jars/org.apache.spark_spark-tags_2.11-2.2.0.jar at spark://172.16.10.53:31702/jars/org.apache.spark_spark-tags_2.11-2.2.0.jar with timestamp 1521022925006
18/03/14 15:52:05 INFO SparkContext: Added JAR file:/root/.ivy2/jars/org.spark-project.spark_unused-1.0.0.jar at spark://172.16.10.53:31702/jars/org.spark-project.spark_unused-1.0.0.jar with timestamp 1521022925007
18/03/14 15:52:05 INFO SparkContext: Added JAR file:/root/.ivy2/jars/net.jpountz.lz4_lz4-1.3.0.jar at spark://172.16.10.53:31702/jars/net.jpountz.lz4_lz4-1.3.0.jar with timestamp 1521022925007
18/03/14 15:52:05 INFO SparkContext: Added JAR file:/root/.ivy2/jars/org.xerial.snappy_snappy-java-1.1.2.6.jar at spark://172.16.10.53:31702/jars/org.xerial.snappy_snappy-java-1.1.2.6.jar with timestamp 1521022925007
18/03/14 15:52:05 INFO SparkContext: Added JAR file:/root/.ivy2/jars/org.slf4j_slf4j-api-1.7.21.jar at spark://172.16.10.53:31702/jars/org.slf4j_slf4j-api-1.7.21.jar with timestamp 1521022925007
18/03/14 15:52:05 INFO Executor: Starting executor ID driver on host localhost
18/03/14 15:52:05 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 11138.
18/03/14 15:52:05 INFO NettyBlockTransferService: Server created on 172.16.10.53:11138
18/03/14 15:52:05 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/03/14 15:52:05 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 172.16.10.53, 11138, None)
18/03/14 15:52:05 INFO BlockManagerMasterEndpoint: Registering block manager 172.16.10.53:11138 with 366.3 MB RAM, BlockManagerId(driver, 172.16.10.53, 11138, None)
18/03/14 15:52:05 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 172.16.10.53, 11138, None)
18/03/14 15:52:05 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 172.16.10.53, 11138, None)
18/03/14 15:52:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@10bea4{/metrics/json,null,AVAILABLE,@Spark}
18/03/14 15:52:07 INFO EventLoggingListener: Logging events to hdfs:///spark2-history/local-1521022925116
18/03/14 15:52:07 INFO SharedState: loading hive config file: file:/etc/spark2/2.6.3.0-235/0/hive-site.xml
18/03/14 15:52:07 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/usr/hdp/2.6.3.0-235/spark2/bin/spark-warehouse/').
18/03/14 15:52:07 INFO SharedState: Warehouse path is 'file:/usr/hdp/2.6.3.0-235/spark2/bin/spark-warehouse/'.
18/03/14 15:52:07 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@e700eba{/SQL,null,AVAILABLE,@Spark}
18/03/14 15:52:07 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@7186b202{/SQL/json,null,AVAILABLE,@Spark}
18/03/14 15:52:07 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@3d88e6b9{/SQL/execution,null,AVAILABLE,@Spark}
18/03/14 15:52:07 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@208205ed{/SQL/execution/json,null,AVAILABLE,@Spark}
18/03/14 15:52:07 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@2173a742{/static/sql,null,AVAILABLE,@Spark}
18/03/14 15:52:09 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
18/03/14 15:52:13 INFO StreamExecution: Starting [id = 7880cf41-0bbc-4b40-a284-c41f9dcbfbc8, runId = 5636831e-aad8-43e4-8d99-9a4b27b69b9f]. Use /tmp/temporary-a86e0bc9-99fd-45dd-b38a-4c5fc10def22 to store the query checkpoint.
18/03/14 15:52:13 ERROR StreamExecution: Query [id = 7880cf41-0bbc-4b40-a284-c41f9dcbfbc8, runId = 5636831e-aad8-43e4-8d99-9a4b27b69b9f] terminated with error
java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
at org.apache.spark.sql.kafka010.KafkaSourceProvider.createSource(KafkaSourceProvider.scala:74)
at org.apache.spark.sql.execution.datasources.DataSource.createSource(DataSource.scala:246)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2$$anonfun$applyOrElse$1.apply(StreamExecution.scala:158)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2$$anonfun$applyOrElse$1.apply(StreamExecution.scala:155)
at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:194)
at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2.applyOrElse(StreamExecution.scala:155)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2.applyOrElse(StreamExecution.scala:153)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:266)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:256)
at org.apache.spark.sql.execution.streaming.StreamExecution.logicalPlan$lzycompute(StreamExecution.scala:153)
at org.apache.spark.sql.execution.streaming.StreamExecution.logicalPlan(StreamExecution.scala:147)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:276)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:206)
Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.serialization.ByteArrayDeserializer
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 47 more
Exception in thread "main" Exception in thread "stream execution thread for [id = 7880cf41-0bbc-4b40-a284-c41f9dcbfbc8, runId = 5636831e-aad8-43e4-8d99-9a4b27b69b9f]" org.apache.spark.sql.streaming.StreamingQueryException: org/apache/kafka/common/serialization/ByteArrayDeserializer
=== Streaming Query ===
Identifier: [id = 7880cf41-0bbc-4b40-a284-c41f9dcbfbc8, runId = 5636831e-aad8-43e4-8d99-9a4b27b69b9f]
Current Committed Offsets: {}
Current Available Offsets: {}
Current State: INITIALIZING
Thread State: RUNNABLE
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:343)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:206)
Caused by: java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
at org.apache.spark.sql.kafka010.KafkaSourceProvider.createSource(KafkaSourceProvider.scala:74)
at org.apache.spark.sql.execution.datasources.DataSource.createSource(DataSource.scala:246)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2$$anonfun$applyOrElse$1.apply(StreamExecution.scala:158)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2$$anonfun$applyOrElse$1.apply(StreamExecution.scala:155)
at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:194)
at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2.applyOrElse(StreamExecution.scala:155)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2.applyOrElse(StreamExecution.scala:153)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:266)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:256)
at org.apache.spark.sql.execution.streaming.StreamExecution.logicalPlan$lzycompute(StreamExecution.scala:153)
at org.apache.spark.sql.execution.streaming.StreamExecution.logicalPlan(StreamExecution.scala:147)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:276)
... 1 more
Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.serialization.ByteArrayDeserializer
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 47 more
java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArrayDeserializer
at org.apache.spark.sql.kafka010.KafkaSourceProvider.createSource(KafkaSourceProvider.scala:74)
at org.apache.spark.sql.execution.datasources.DataSource.createSource(DataSource.scala:246)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2$$anonfun$applyOrElse$1.apply(StreamExecution.scala:158)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2$$anonfun$applyOrElse$1.apply(StreamExecution.scala:155)
at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:194)
at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2.applyOrElse(StreamExecution.scala:155)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$2.applyOrElse(StreamExecution.scala:153)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:266)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:306)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187)
at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:304)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:272)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:256)
at org.apache.spark.sql.execution.streaming.StreamExecution.logicalPlan$lzycompute(StreamExecution.scala:153)
at org.apache.spark.sql.execution.streaming.StreamExecution.logicalPlan(StreamExecution.scala:147)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:276)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:206)
Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.serialization.ByteArrayDeserializer
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 47 more
18/03/14 15:52:13 INFO SparkContext: Invoking stop() from shutdown hook
18/03/14 15:52:13 INFO AbstractConnector: Stopped Spark@30221a6b{HTTP/1.1,[http/1.1]}{0.0.0.0:4041}
18/03/14 15:52:13 INFO SparkUI: Stopped Spark web UI at http://172.16.10.53:4041
18/03/14 15:52:14 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/03/14 15:52:14 INFO MemoryStore: MemoryStore cleared
18/03/14 15:52:14 INFO BlockManager: BlockManager stopped
18/03/14 15:52:14 INFO BlockManagerMaster: BlockManagerMaster stopped
18/03/14 15:52:14 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/03/14 15:52:14 INFO SparkContext: Successfully stopped SparkContext
18/03/14 15:52:14 INFO ShutdownHookManager: Shutdown hook called
18/03/14 15:52:14 INFO ShutdownHookManager: Deleting directory /tmp/spark-bfc1f921-8877-4dc8-81ed-dfcba6da84c0
18/03/14 15:52:14 INFO ShutdownHookManager: Deleting directory /tmp/temporary-a86e0bc9-99fd-45dd-b38a-4c5fc10def22
Before this, we used spark-shell to import spark-sql-kafka-0-10_2.11-2.0.2.jar.
We even tried running the example in Hortonworks with the ./runexample command.
Versions:
Answer 0 (Score: 1)
It needs two jars, 'org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0' and 'org.apache.kafka:kafka-clients:0.10.1.0'; import both.
Instead of specifying the jars manually, you can run
spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0,org.apache.kafka:kafka-clients:0.10.1.0
and it will download them automatically.
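Spelled out against the job in the log above, the --packages form of the command might look like the sketch below. The example class and application jar are taken from the log; the broker address and topic arguments are placeholders, so substitute your own. The script only builds and prints the command, it does not run Spark here.

```shell
# Both coordinates from the answer: the Kafka source connector and kafka-clients.
# Ivy resolves and downloads them at submit time.
PACKAGES="org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0,org.apache.kafka:kafka-clients:0.10.1.0"

# Class and example jar as seen in the log above; broker/topic are placeholders.
echo spark-submit \
  --packages "$PACKAGES" \
  --class org.apache.spark.examples.sql.streaming.StructuredKafkaWordCount \
  /usr/hdp/2.6.3.0-235/spark2/examples/jars/spark-examples_2.11-2.2.0.2.6.3.0-235.jar \
  broker1:9092 subscribe topic1
```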
Answer 1 (Score: 0)
You seem to be missing the kafka-clients jar. Pass spark-sql-kafka-0-10_2.11-2.0.2.jar during spark-submit as well.
Answer 2 (Score: 0)
@Souvik is correct, but the jar he named is not the kafka-clients jar (and that jar is also required).
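The sample spark-submit statement originally attached to this answer did not survive. A plausible reconstruction, assuming both jars were resolved into the local Ivy cache (the jar paths below are the cache locations from the log above; the class, application jar, and broker/topic arguments are placeholders to adjust), is:

```shell
# Pass BOTH jars explicitly: the Kafka source connector AND kafka-clients.
# Paths are the Ivy-cache locations seen in the log above; adjust to your layout.
SQL_KAFKA_JAR=/root/.ivy2/jars/org.apache.spark_spark-sql-kafka-0-10_2.11-2.2.0.jar
KAFKA_CLIENTS_JAR=/root/.ivy2/jars/org.apache.kafka_kafka-clients-0.10.1.0.jar

# Print the submit command (placeholders for app class, jar, broker, topic).
echo spark-submit \
  --jars "$SQL_KAFKA_JAR,$KAFKA_CLIENTS_JAR" \
  --class org.apache.spark.examples.sql.streaming.StructuredKafkaWordCount \
  /usr/hdp/2.6.3.0-235/spark2/examples/jars/spark-examples_2.11-2.2.0.2.6.3.0-235.jar \
  broker1:9092 subscribe topic1
```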
Answer 3 (Score: 0)
Spark ships the spark-streaming-kafka-0-8 package with its distribution, so when you add the spark-sql-kafka-0-10 Structured Streaming package, its Kafka client classes conflict with the ones bundled in that streaming package.
That is why, if you follow the other answers under this question, you find they simply do not work.
It is also why you still get the ClassNotFound exception even after including the spark-sql-kafka-0-10 package in your pom, in '--jars', or in 'driver.extraClasspath'.
The only things you need to do are:
1. Delete the jar at ${SPARK_HOME}/jars/spark-streaming-kafka-0-8_2.11.jar.
2. Pass --jars ${your_lib}/spark-sql-kafka-0-10_2.11.jar,${your_lib}/kafka-clients-${your_version}.jar to the spark-submit command.
In addition, if your Spark libs live on HDFS (via spark.yarn.jars) and spark-streaming-kafka-0-8_2.11.jar is in that directory, you may have to delete that jar on HDFS as well.
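The two steps above can be sketched as a shell session. Everything here is a hedged template: ${SPARK_HOME} defaults to the HDP path from the log, while your_lib, your_version, and the application class/jar names are placeholders from the answer, not real values. The script only prints the commands; verify paths before actually deleting anything.

```shell
# Placeholders from the answer -- substitute your own values.
SPARK_HOME="${SPARK_HOME:-/usr/hdp/2.6.3.0-235/spark2}"
your_lib=/root/.ivy2/jars      # wherever your two jars live
your_version=0.10.1.0          # kafka-clients version matching your broker

# Step 1: move the conflicting 0.8 connector out of Spark's jars directory
# (mv rather than rm, so it can be restored if needed).
echo mv "$SPARK_HOME/jars/spark-streaming-kafka-0-8_2.11.jar" /tmp/

# Step 2: submit with the 0.10 connector plus a matching kafka-clients jar.
echo spark-submit \
  --jars "$your_lib/spark-sql-kafka-0-10_2.11.jar,$your_lib/kafka-clients-$your_version.jar" \
  --class YourStreamingApp your-app.jar
```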