I wrote this very simple Spark Streaming program:
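package com.abhi

import java.util.Properties

import scala.io.Source

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

object TrendingHashTags {
  def main(args: Array[String]): Unit = {
    // Load the Twitter API credentials from a properties file on the classpath.
    val url = getClass.getResource("/twitterapi.properties")
    val source = Source.fromURL(url)
    val props = new Properties()
    props.load(source.bufferedReader())

    // twitter4j reads its OAuth credentials from these system properties.
    System.setProperty("twitter4j.oauth.consumerKey", props.getProperty("consumer_key"))
    System.setProperty("twitter4j.oauth.consumerSecret", props.getProperty("consumer_secret"))
    System.setProperty("twitter4j.oauth.accessToken", props.getProperty("access_token"))
    System.setProperty("twitter4j.oauth.accessTokenSecret", props.getProperty("access_token_secret"))

    // 2-second batches; checkpointing goes to the "checkpoint" directory.
    val conf = new SparkConf().setAppName("Abhishek Spark Streaming")
    val ssc = new StreamingContext(conf, Seconds(2))
    ssc.checkpoint("checkpoint")

    val tweets = TwitterUtils.createStream(ssc, None)
    val tweetText = tweets.map(t => t.getText)

    // Count words over a 10-second window that slides every 10 seconds.
    val counts = tweetText
      .flatMap(_.split("\\s+"))
      .map((_, 1))
      .reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(10), Seconds(10))

    // Print the number of distinct words in each window.
    counts.foreachRDD { rdd =>
      val c = rdd.count()
      println("Number of words " + c)
    }

    ssc.start()
    ssc.awaitTermination()
    ssc.stop(true)
  }
}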
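When I submit it with the following command:
spark-submit --class com.abhi.TrendingHashTags --master local[*] /app/SparkStreaming1-assembly-1.0.jar
the program keeps running (streaming), and I can see it writing a checkpoint periodically, but it never prints any of the statements from my code.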
16/05/19 01:57:46 INFO scheduler.JobScheduler: No jobs added for time 1463637466000 ms
16/05/19 01:57:46 INFO scheduler.JobGenerator: Checkpointing graph for time 1463637466000 ms
16/05/19 01:57:46 INFO streaming.DStreamGraph: Updating checkpoint data for time 1463637466000 ms
16/05/19 01:57:46 INFO streaming.DStreamGraph: Updated checkpoint data for time 1463637466000 ms
16/05/19 01:57:46 INFO storage.MemoryStore: Block input-0-1463637465800 stored as bytes in memory (estimated size 59.1 KB, free 2.5 MB)
16/05/19 01:57:46 INFO streaming.CheckpointWriter: Submitted checkpoint of time 1463637466000 ms writer queue
16/05/19 01:57:46 INFO streaming.CheckpointWriter: Saving checkpoint for time 1463637466000 ms to file 'hdfs://sandbox:9000/user/root/checkpoint/checkpoint-1463637466000'
16/05/19 01:57:46 INFO storage.BlockManagerInfo: Added input-0-1463637465800 in memory on localhost:52605 (size: 59.1 KB, free: 515.0 MB)
16/05/19 01:57:46 WARN storage.BlockManager: Block input-0-1463637465800 replicated to only 0 peer(s) instead of 1 peers
16/05/19 01:57:46 INFO receiver.BlockGenerator: Pushed block input-0-1463637465800
16/05/19 01:57:46 INFO streaming.CheckpointWriter: Deleting hdfs://sandbox:9000/user/root/checkpoint/checkpoint-1463637446000
16/05/19 01:57:46 INFO streaming.CheckpointWriter: Checkpoint for time 1463637466000 ms saved to file 'hdfs://sandbox:9000/user/root/checkpoint/checkpoint-1463637466000', took 5617 bytes and 33 ms
16/05/19 01:57:46 INFO storage.MemoryStore: Block input-0-1463637466400 stored as bytes in memory (estimated size 5.7 KB, free 2.5 MB)
16/05/19 01:57:46 INFO storage.BlockManagerInfo: Added input-0-1463637466400 in memory on localhost:52605 (size: 5.7 KB, free: 515.0 MB)
16/05/19 01:57:46 WARN storage.BlockManager: Block input-0-1463637466400 replicated to only 0 peer(s) instead of 1 peers
16/05/19 01:57:46 INFO receiver.BlockGenerator: Pushed block input-0-1463637466400
16/05/19 01:57:47 INFO storage.MemoryStore: Block input-0-1463637466800 stored as bytes in memory (estimated size 63.3 KB, free 2.5 MB)
16/05/19 01:57:47 INFO storage.BlockManagerInfo: Added input-0-1463637466800 in memory on localhost:52605 (size: 63.3 KB, free: 514.9 MB)
16/05/19 01:57:47 WARN storage.BlockManager: Block input-0-1463637466800 replicated to only 0 peer(s) instead of 1 peers
16/05/19 01:57:47 INFO receiver.BlockGenerator: Pushed block input-0-1463637466800
16/05/19 01:57:47 INFO storage.MemoryStore: Block input-0-1463637467000 stored as bytes in memory (estimated size 5.8 KB, free 2.6 MB)
16/05/19 01:57:47 INFO storage.BlockManagerInfo: Added input-0-1463637467000 in memory on localhost:52605 (size: 5.8 KB, free: 514.9 MB)
16/05/19 01:57:47 WARN storage.BlockManager: Block input-0-1463637467000 replicated to only 0 peer(s) instead of 1 peers
16/05/19 01:57:47 INFO receiver.BlockGenerator: Pushed block input-0-1463637467000
16/05/19 01:57:47 INFO storage.MemoryStore: Block input-0-1463637467400 stored as bytes in memory (estimated size 4.6 KB, free 2.6 MB)
16/05/19 01:57:47 INFO storage.BlockManagerInfo: Added input-0-1463637467400 in memory on localhost:52605 (size: 4.6 KB, free: 514.9 MB)
16/05/19 01:57:47 WARN storage.BlockManager: Block input-0-1463637467400 replicated to only 0 peer(s) instead of 1 peers
16/05/19 01:57:47 INFO receiver.BlockGenerator: Pushed block input-0-1463637467400
16/05/19 01:57:48 INFO dstream.ShuffledDStream: Slicing from 1463637460000 ms to 1463637468000 ms (aligned to 1463637460000 ms and 1463637468000 ms)
16/05/19 01:57:48 INFO storage.MemoryStore: Block input-0-1463637467800 stored as bytes in memory (estimated size 61.4 KB, free 2.6 MB)
16/05/19 01:57:48 INFO storage.BlockManagerInfo: Added input-0-1463637467800 in memory on localhost:52605 (size: 61.4 KB, free: 514.9 MB)
16/05/19 01:57:48 WARN storage.BlockManager: Block input-0-1463637467800 replicated to only 0 peer(s) instead of 1 peers
16/05/19 01:57:48 INFO receiver.BlockGenerator: Pushed block input-0-1463637467800
16/05/19 01:57:48 INFO storage.MemoryStore: Block input-0-1463637468000 stored as bytes in memory (estimated size 4.2 KB, free 2.6 MB)
16/05/19 01:57:48 INFO storage.BlockManagerInfo: Added input-0-1463637468000 in memory on localhost:52605 (size: 4.2 KB, free: 514.9 MB)
16/05/19 01:57:48 WARN storage.BlockManager: Block input-0-1463637468000 replicated to only 0 peer(s) instead of 1 peers
16/05/19 01:57:48 INFO receiver.BlockGenerator: Pushed block input-0-1463637468000
16/05/19 01:57:48 INFO scheduler.JobScheduler: Added jobs for time 1463637468000 ms
16/05/19 01:57:48 INFO scheduler.JobGenerator: Checkpointing graph for time 1463637468000 ms
16/05/19 01:57:48 INFO streaming.DStreamGraph: Updating checkpoint data for time 1463637468000 ms
16/05/19 01:57:48 INFO streaming.DStreamGraph: Updated checkpoint data for time 1463637468000 ms
16/05/19 01:57:48 INFO streaming.CheckpointWriter: Saving checkpoint for time 1463637468000 ms to file 'hdfs://sandbox:9000/user/root/checkpoint/checkpoint-1463637468000'
I expected to see the output of my print statements somewhere in the log above.