Spark Cluster + Spring XD

Date: 2015-09-17 06:44:06

Tags: apache-spark spring-xd

I am trying to run a Spark processor on Spring XD for a streaming operation.

The Spark processor module on Spring XD works when spark.master points to local. When we point it at a Spark standalone cluster (running on the same machine) or at yarn-client, the processor fails to run. Is it possible to run a Spark processor against Spark standalone or YARN from within Spring XD, or is spark local the only option?

The processor module is defined as follows:

// Imports assumed for Spring XD 1.2's Scala processor API.
import java.util.Properties

import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream}
import org.springframework.xd.spark.streaming.SparkConfig
import org.springframework.xd.spark.streaming.scala.Processor

class WordCount extends Processor[String, (String, Int)] {

  def process(input: ReceiverInputDStream[String]): DStream[(String, Int)] = {
      val words = input.flatMap(_.split(" "))
      val pairs = words.map(word => (word, 1))
      val wordCounts = pairs.reduceByKey(_ + _)
      wordCounts
  }

  @SparkConfig
  def properties : Properties = {
    val props = new Properties()
    // Any specific Spark configuration properties would go here.
    // These properties always get the highest precedence
    //props.setProperty("spark.master", "spark://a.b.c.d:7077")
    props.setProperty("spark.master", "spark://abcd.hadoop.ambari:7077")
    props
  }

}

The processor works correctly when configured with a local master. Is there something I am missing in the declaration?

Thanks!

Edit: error log

// commands executed (start-all.sh on the host shell, the rest on xd-shell)
===================================================================
spark/sbin/start-all.sh

module upload --file /opt/igc_services/SparkDev/XdWordCount/build/libs/spark-streaming-wordcount-scala-processor-0.1.0.jar  --name scala-word-count --type processor

stream create spark-streaming-word-count --definition "http | processor:scala-word-count | log" --deploy


// Error Log 
====================================================================
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'log' for stream 'spark-streaming-word-count'
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor@6dbc4f81 moduleName = 'log', moduleLabel = 'log', group = 'spark-streaming-word-count', sourceChannelName = [null], sinkChannelName = [null], index = 2, type = sink, parameters = map[[empty]], children = list[[empty]]]
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Path cache event: path=/deployments/modules/allocated/4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06/spark-streaming-word-count.processor.processor.1, type=CHILD_ADDED
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'processor' for stream 'spark-streaming-word-count'
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor@5e16dafb moduleName = 'scala-word-count', moduleLabel = 'processor', group = 'spark-streaming-word-count', sourceChannelName = [null], sinkChannelName = [null], index = 1, type = processor, parameters = map[[empty]], children = list[[empty]]]
2015-09-16T14:28:49+0530 1.2.0.RELEASE WARN DeploymentsPathChildrenCache-0 util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-09-16T14:28:49+0530 1.2.0.RELEASE WARN sparkDriver-akka.actor.default-dispatcher-3 remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://sparkMaster@abcd.hadoop.ambari:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-09-16T14:29:09+0530 1.2.0.RELEASE WARN sparkDriver-akka.actor.default-dispatcher-4 remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://sparkMaster@abcd.hadoop.ambari:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-09-16T14:29:18+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Path cache event: path=/deployments/modules/allocated/8d07cdba-557e-458a-9225-b90e5a5778ce/spark-streaming-word-count.source.http.1, type=CHILD_ADDED
2015-09-16T14:29:18+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'http' for stream 'spark-streaming-word-count'
2015-09-16T14:29:18+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor@610e43b0 moduleName = 'http', moduleLabel = 'http', group = 'spark-streaming-word-count', sourceChannelName = [null], sinkChannelName = [null], index = 0, type = source, parameters = map[[empty]], children = list[[empty]]]
2015-09-16T14:29:19+0530 1.2.0.RELEASE INFO DeploymentSupervisor-0 zk.ZKStreamDeploymentHandler - Deployment status for stream 'spark-streaming-word-count': DeploymentStatus{state=failed,error(s)=Deployment of module 'ModuleDeploymentKey{stream='spark-streaming-word-count', type=processor, label='processor'}' to container '4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06' timed out after 30000 ms}
2015-09-16T14:29:29+0530 1.2.0.RELEASE WARN sparkDriver-akka.actor.default-dispatcher-4 remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://sparkMaster@abcd.hadoop.ambari:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-09-16T14:29:49+0530 1.2.0.RELEASE ERROR sparkDriver-akka.actor.default-dispatcher-3 cluster.SparkDeploySchedulerBackend - Application has been killed. Reason: All masters are unresponsive! Giving up.
2015-09-16T14:29:49+0530 1.2.0.RELEASE WARN DeploymentsPathChildrenCache-0 cluster.SparkDeploySchedulerBackend - Application ID is not initialized yet.
2015-09-16T14:29:49+0530 1.2.0.RELEASE ERROR sparkDriver-akka.actor.default-dispatcher-3 scheduler.TaskSchedulerImpl - Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
2015-09-16T14:29:50+0530 1.2.0.RELEASE INFO DeploymentSupervisor-0 zk.ContainerListener - Path cache event: path=/containers/4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06, type=CHILD_REMOVED
2015-09-16T14:29:50+0530 1.2.0.RELEASE INFO DeploymentSupervisor-0 zk.ContainerListener - Container departed: Container{name='4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06', attributes={groups=, host=abcd.hadoop.ambari, id=4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06, managementPort=54998, ip=a.b.c.d, pid=4597}}

1 Answer:

Answer 0 (score: 1)

The error looks like a version conflict. Make sure you are using Spark 1.2.1, which XD supports out of the box.

If you need a specific version, you can still make it work: remove the 1.2.1 Spark dependencies from XD_HOME/lib and replace them with the jars for the Spark version you are using.