Why does my Spark Streaming demo not output anything?

Asked: 2016-11-24 05:29:39

Tags: spark-streaming

Recently I have been working through the book Learning Spark (O'Reilly, 2015). I tried to run the Spark Streaming example StreamingLogInput. The code is as follows:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream

val conf = new SparkConf().setMaster(master).setAppName("StreamingLogInput")
// Create a StreamingContext with a 1 second batch size
val ssc = new StreamingContext(conf, Seconds(1))
// Create a DStream from all the input on port 7777
val lines = ssc.socketTextStream("localhost", 7777)
val errorLines = processLines(lines)
// Print out the lines with errors, which causes this DStream to be evaluated
errorLines.print()
// start our streaming context and wait for it to "finish"
ssc.start()

def processLines(lines: DStream[String]) = {
  // Filter our DStream for lines with "error"
  lines.filter(_.contains("error"))
}

When I run the program on a single-node machine as follows:

$SPARK_HOME/bin/spark-submit \
--class com.oreilly.learningsparkexamples.scala.StreamingLogInput \
--master spark://singlenode:7077 \
/home/hadoop/project/learning-spark/target/scala-2.10/learning-spark-examples_2.10-0.0.1.jar \
spark://singlenode:7077 

and, in another window, run the command

nc -l 7777 

and type in some fake log lines, no error lines are printed. The log is as follows:

16/11/24 04:20:48 INFO BlockManagerInfo: Added input-0-1479932447800 in memory on singlenode:37112 (size: 32.0 B, free: 267.2 MB)
16/11/24 04:20:49 INFO JobScheduler: Added jobs for time 1479932449000 ms
16/11/24 04:20:50 INFO JobScheduler: Added jobs for time 1479932450000 ms
16/11/24 04:20:51 INFO JobScheduler: Added jobs for time 1479932451000 ms
16/11/24 04:20:51 INFO BlockManagerInfo: Added input-0-1479932451000 in memory on singlenode:37112 (size: 33.0 B, free: 267.2 MB) 
16/11/24 04:20:52 INFO JobScheduler: Added jobs for time 1479932452000 ms
16/11/24 04:20:53 INFO JobScheduler: Added jobs for time 1479932453000 ms
16/11/24 04:20:54 INFO JobScheduler: Added jobs for time 1479932454000 ms
16/11/24 04:20:55 INFO JobScheduler: Added jobs for time 1479932455000 ms
16/11/24 04:20:56 INFO JobScheduler: Added jobs for time 1479932456000 ms
16/11/24 04:20:57 INFO JobScheduler: Added jobs for time 1479932457000 ms
16/11/24 04:20:58 INFO JobScheduler: Added jobs for time 1479932458000 ms

Why does this happen? Any help is appreciated!

1 Answer:

Answer 0 (score: 0)

I solved it by giving the application more than one core (worker thread) when submitting it, for example by using the master local[3].
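For context, the likely reason this helps: a receiver-based input such as socketTextStream occupies one core for as long as the stream runs, so if the application has only one core in total, batches get queued ("Added jobs for time ...") but nothing is left to process them. The following is a minimal sketch of the fix under that assumption, not the book's original code. The object name StreamingLogInputLocal and the helper isErrorLine are hypothetical names introduced here for illustration, and the master is hard-coded to local[3] only to make the point visible.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical object name, used only for this sketch
object StreamingLogInputLocal {
  // Hypothetical helper: the same "error" filter as in the question,
  // pulled out as a plain function
  def isErrorLine(line: String): Boolean = line.contains("error")

  def main(args: Array[String]): Unit = {
    // local[3] runs Spark locally with 3 worker threads: the socket
    // receiver takes one, the remaining two are free to process batches.
    // Assumption: the master is set here instead of via spark-submit.
    val conf = new SparkConf().setMaster("local[3]").setAppName("StreamingLogInput")
    val ssc = new StreamingContext(conf, Seconds(1))
    // Same input as in the question: text lines arriving on port 7777
    val lines = ssc.socketTextStream("localhost", 7777)
    lines.filter(isErrorLine _).print()
    ssc.start()
    // Block until the context is stopped, so the driver does not exit early
    ssc.awaitTermination()
  }
}
```

With nc -l 7777 already listening in another window, lines containing "error" should then show up in the batch output. On a standalone cluster such as spark://singlenode:7077, the equivalent condition is that the application is granted more cores than it has receivers.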