我想使用Scala IDE并在Windows 7上运行spark代码。我已经安装了Scala IDE并开始创建一个scala项目。所以我需要知道:
是否有任何指令在Scala IDE中运行以下代码:
/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
object SimpleApp {
def main(args: Array[String]) {
val logFile = "D:/Spark_Installation/eclipse-ws/Scala/README.md" // Should be some file on your system
val conf = new SparkConf().setAppName("Simple Application")
.setMaster("spark://myhost:7077")
val sc = new SparkContext(conf)
val logData = sc.textFile(logFile, 2).cache()
val numAs = logData.filter(line => line.contains("a")).count()
val numBs = logData.filter(line => line.contains("b")).count()
println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
}
}
当我运行此代码时,我收到以下错误:
15/03/26 11:59:55 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@myhost:7077/user/Master...
15/03/26 11:59:58 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@myhost:7077
15/03/26 11:59:58 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:15 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@myhost:7077/user/Master...
15/03/26 12:00:17 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@myhost:7077
15/03/26 12:00:17 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:35 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@myhost:7077/user/Master...
15/03/26 12:00:37 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@myhost:7077
15/03/26 12:00:37 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:55 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
15/03/26 12:00:55 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
15/03/26 12:00:55 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
答案 0 :(得分:5)
你有火花大师设置吗?如果没有,请看看这个: http://spark.apache.org/docs/1.2.1/submitting-applications.html#master-urls
你最想使用
local[*]
将使用本地计算机拥有的每个核心,而不是使用:
spark://myhost:7077
spark://假设你在myhost上设置了一个spark master:7077
答案 1 :(得分:0)
如果您作为本地独立群集运行,请使用本地[*]表示使用您的计算机拥有的所有核心。所以现在你的sparkconf对象创建如下:MyGrid