I'm getting started with Spark Streaming. I want to consume a stream from Kafka using the sample code from the Spark documentation: https://spark.apache.org/docs/2.1.0/streaming-kafka-0-10-integration.html
Here is my code:
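import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object SparkStreaming {
  def main(args: Array[String]): Unit = {
    // local[*] runs Spark locally with one worker thread per available core
    val conf = new SparkConf().setAppName("Test_kafka_spark").setMaster("local[*]")
    val ssc = new StreamingContext(conf, Seconds(1)) // 1-second batch interval

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9093",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "test",
      "auto.offset.reset" -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val topics = Array("spark")
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      PreferConsistent,
      Subscribe[String, String](topics, kafkaParams)
    )

    stream.map(record => (record.key, record.value))
  }
}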
Everything seems to start fine, but the job stops immediately, with the following log:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/04/19 14:37:37 INFO SparkContext: Running Spark version 2.1.0
17/04/19 14:37:37 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/04/19 14:37:37 WARN Utils: Your hostname, thibaut-Precision-M4600 resolves to a loopback address: 127.0.1.1; using 10.192.176.101 instead (on interface eno1)
17/04/19 14:37:37 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
17/04/19 14:37:37 INFO SecurityManager: Changing view acls to: thibaut
17/04/19 14:37:37 INFO SecurityManager: Changing modify acls to: thibaut
17/04/19 14:37:37 INFO SecurityManager: Changing view acls groups to:
17/04/19 14:37:37 INFO SecurityManager: Changing modify acls groups to:
17/04/19 14:37:37 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(thibaut); groups with view permissions: Set(); users with modify permissions: Set(thibaut); groups with modify permissions: Set()
17/04/19 14:37:37 INFO Utils: Successfully started service 'sparkDriver' on port 41046.
17/04/19 14:37:37 INFO SparkEnv: Registering MapOutputTracker
17/04/19 14:37:37 INFO SparkEnv: Registering BlockManagerMaster
17/04/19 14:37:37 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/04/19 14:37:37 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/04/19 14:37:37 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-266e2f13-0eb2-40a8-9d2f-d50797099a29
17/04/19 14:37:37 INFO MemoryStore: MemoryStore started with capacity 879.3 MB
17/04/19 14:37:37 INFO SparkEnv: Registering OutputCommitCoordinator
17/04/19 14:37:38 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/04/19 14:37:38 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.192.176.101:4040
17/04/19 14:37:38 INFO Executor: Starting executor ID driver on host localhost
17/04/19 14:37:38 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39207.
17/04/19 14:37:38 INFO NettyBlockTransferService: Server created on 10.192.176.101:39207
17/04/19 14:37:38 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/04/19 14:37:38 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.192.176.101, 39207, None)
17/04/19 14:37:38 INFO BlockManagerMasterEndpoint: Registering block manager 10.192.176.101:39207 with 879.3 MB RAM, BlockManagerId(driver, 10.192.176.101, 39207, None)
17/04/19 14:37:38 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.192.176.101, 39207, None)
17/04/19 14:37:38 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.192.176.101, 39207, None)
17/04/19 14:37:38 WARN KafkaUtils: overriding enable.auto.commit to false for executor
17/04/19 14:37:38 WARN KafkaUtils: overriding auto.offset.reset to none for executor
17/04/19 14:37:38 WARN KafkaUtils: overriding executor group.id to spark-executor-test
17/04/19 14:37:38 WARN KafkaUtils: overriding receive.buffer.bytes to 65536 see KAFKA-3135
17/04/19 14:37:38 INFO SparkContext: Invoking stop() from shutdown hook
17/04/19 14:37:38 INFO SparkUI: Stopped Spark web UI at http://10.192.176.101:4040
17/04/19 14:37:38 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/04/19 14:37:38 INFO MemoryStore: MemoryStore cleared
17/04/19 14:37:38 INFO BlockManager: BlockManager stopped
17/04/19 14:37:38 INFO BlockManagerMaster: BlockManagerMaster stopped
17/04/19 14:37:38 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/04/19 14:37:38 INFO SparkContext: Successfully stopped SparkContext
17/04/19 14:37:38 INFO ShutdownHookManager: Shutdown hook called
17/04/19 14:37:38 INFO ShutdownHookManager: Deleting directory /tmp/spark-f28a1361-58ba-416b-ac8e-11da0044c1f2
Thanks for your help.
Answer 0 (score: 5)
It looks like you never started the StreamingContext. Try adding these two lines at the end of main:
ssc.start()
ssc.awaitTermination()
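To try the start/await lifecycle without a running Kafka broker, here is a minimal sketch that swaps the Kafka source for an in-memory `queueStream`. The object name `StreamingLifecycleSketch`, the `run` helper, the sample data, and the 5-second timeout are all assumptions made for illustration; none of them come from the question.

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical stand-in for the question's job: same lifecycle, no Kafka.
object StreamingLifecycleSketch {

  // Runs the stream for at most timeoutMs and returns what the batches produced.
  def run(timeoutMs: Long = 5000L): Seq[String] = {
    val conf = new SparkConf().setAppName("lifecycle_sketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))
    val collected = mutable.ArrayBuffer.empty[String]
    try {
      // Each queued RDD is served as one micro-batch (assumed sample data).
      val queue = mutable.Queue[RDD[String]](
        ssc.sparkContext.parallelize(Seq("a", "b", "c")))

      ssc.queueStream(queue)
        .map(_.toUpperCase) // transformation: only describes the work
        .foreachRDD { rdd => // output operation: makes each batch actually run
          collected.synchronized { collected ++= rdd.collect() }
        }

      ssc.start() // begins scheduling batches
      ssc.awaitTerminationOrTimeout(timeoutMs) // blocks the main thread meanwhile
    } finally {
      ssc.stop(stopSparkContext = true, stopGracefully = false)
    }
    collected.synchronized { collected.toList }
  }

  def main(args: Array[String]): Unit = println(run())
}
```

A long-running job would normally call `ssc.awaitTermination()` as suggested above; the timeout variant is used here only so that the sketch exits on its own.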
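Answer 1 (score: 1)
You are not calling any action (an output operation such as print or foreachRDD) on the DStream, so nothing is executed: map is a transformation, and transformations are lazy. You also need to start the StreamingContext.
Please take a look at this complete example.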
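Putting the two answers together, a corrected version of the question's program might look like the sketch below. It is not the example referred to above. It assumes the same broker address, topic, and consumer settings as the question, and the object name `SparkStreamingFixed` and the choice of `print()` as the output operation are illustrative.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

// Hypothetical name; settings are copied from the question.
object SparkStreamingFixed {

  val kafkaParams: Map[String, Object] = Map[String, Object](
    "bootstrap.servers" -> "localhost:9093",
    "key.deserializer" -> classOf[StringDeserializer],
    "value.deserializer" -> classOf[StringDeserializer],
    "group.id" -> "test",
    "auto.offset.reset" -> "latest",
    "enable.auto.commit" -> (false: java.lang.Boolean)
  )

  val topics: Array[String] = Array("spark")

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Test_kafka_spark").setMaster("local[*]")
    val ssc = new StreamingContext(conf, Seconds(1))

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      PreferConsistent,
      Subscribe[String, String](topics, kafkaParams)
    )

    // Fix 1: attach an output operation. Mapping to (key, value) first keeps
    // the non-serializable ConsumerRecord objects from being sent to the driver.
    stream.map(record => (record.key, record.value)).print()

    // Fix 2: start the context and keep the main thread alive.
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Both fixes are needed together. With only `ssc.start()` added and no output operation registered, Spark Streaming is expected to reject the start with an error along the lines of "No output operations registered, so nothing to execute".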