Apache Bahir, sending content to an ActorReceiver

Date: 2017-01-28 19:18:09

Tags: scala apache-spark spark-streaming apache-bahir

I am trying to set up a simple process with Spark Streaming, using Apache Bahir to connect to Akka. I tried to follow their example together with an older one. I have a simple forwarder actor:

class ForwarderActor extends ActorReceiver {
  def receive = {
    case data: MyData => store(data)
  }
}

and I create a stream with:
val stream = AkkaUtils.createStream[RSVP](ssc, Props[ForwarderActor], actorName)

The configuration looks like this:

akka {
  actor {
    provider = "akka.remote.RemoteActorRefProvider"
  }
  remote {
    enabled-transports = ["akka.remote.netty.tcp"]
    netty.tcp {
      hostname = "localhost"
      port = 7777
    }
  }
}

My question is: how do I send messages to the Forwarder actor? Maybe I don't understand how Akka Remote should be used in this case. When the application starts, I see the log line

[akka.remote.Remoting] Remoting started; listening on addresses :[akka.tcp://test@localhost:7777]

and later on I see

[akka.remote.Remoting] Remoting now listens on addresses: [akka.tcp://streaming-actor-system-0@192.168.192.7:52369]

which seems to match the description in the ScalaDoc:
 /**
   * A default ActorSystem creator. It will use a unique system name
   * (streaming-actor-system-<spark-task-attempt-id>) to start an ActorSystem that supports remote
   * communication.
   */

All in all, I am not sure how I am supposed to send messages to the Forwarder actor. Thanks for any help!

1 answer:

Answer 0 (score: 0)

An Akka actor can send messages to other Akka actors running on a remote JVM. To do that, the sender actor needs to know the address of the intended receiver actor.

AkkaUtils (Bahir) enables you to create a Spark stream from messages received by a ReceiverActor. But where does it receive messages from? Well... some remote actor. And to send messages, that remote actor needs the address of the ReceiverActor running inside your Spark application.

In general, you cannot be sure which IP your Spark application will run on. So we will arrange things so that the actor running inside Spark tells the producer actor its own reference and asks it to send its data.

Make sure that both applications are written with the same version of Scala and are running the same JRE.
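One way to keep the Scala versions aligned is to pin the same scalaVersion in both projects' builds. The sketch below is only an illustration: the specific version numbers are assumptions roughly matching the Spark 2.x / Bahir 2.x era of this question, not something the answer prescribes; check the versions that apply to your cluster.

```scala
// build.sbt sketch for the Spark application.
// The standalone producer application only needs akka-remote.
// Spark 2.x is built against Scala 2.11, so both projects should pin it.
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-streaming"      % "2.1.0",
  "org.apache.bahir"  %% "spark-streaming-akka" % "2.1.0",
  "com.typesafe.akka" %% "akka-remote"          % "2.4.16"
)
```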

Now... let's first write the actor that will be the source of the data,

import akka.actor.{Actor, ActorRef, ActorLogging, ActorSystem, Props}
import akka.actor.Actor.Receive
import com.typesafe.config.{Config, ConfigFactory}

case class SendMeYourStringsRequest(requesterRef: ActorRef)
case class RequestedString(s: String)

class MyActor extends Actor with ActorLogging {

  val theListOfMyStrings = List("one", "two", "three")

  override def receive: Receive = {
    case SendMeYourStringsRequest(requesterRef) => {
      theListOfMyStrings.foreach(s => {
        requesterRef ! RequestedString(s)
      })
    }
  }
}

object MyApplication extends App {

  val config = ConfigFactory.parseString(
    """
      |akka{
      |  actor {
      |    provider = remote
      |  }
      |  remote {
      |    enabled-transports = ["akka.remote.netty.tcp"]
      |    untrusted-mode = off
      |    netty.tcp {
      |      hostname="my-ip-address"
      |      port=18000
      |    }
      |  }
      |}
    """.stripMargin
  )

  val actorSystem = ActorSystem("my-actor-system", config)

  val myActor = actorSystem.actorOf(Props(classOf[MyActor]), "my-actor")

}

Now... let's write the simple Spark application,

import akka.actor.{Actor, ActorPath, ActorRef, ActorLogging, ActorSystem, Props}
import akka.actor.Actor.Receive
import com.typesafe.config.{Config, ConfigFactory}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.akka.{ActorReceiver, AkkaUtils}

case class SendMeYourStringsRequest(requesterRef: ActorRef)
case class RequestedString(s: String)

class YourStringRequesterActor extends ActorReceiver {
  def receive = {
    case RequestedString(s) => store(s)
  }

  override def preStart(): Unit = {
    val myActorPath = ActorPath.fromString("akka.tcp://my-actor-system@my-ip-address:18000/user/my-actor")
    val myActorSelection = context.actorSelection(myActorPath)

    myActorSelection ! SendMeYourStringsRequest(self)
  }
}

object YourSparkApp {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("ActorWordCount")

    if (!sparkConf.contains("spark.master")) {
      sparkConf.setMaster("local[2]")
    }

    val ssc = new StreamingContext(sparkConf, Seconds(2))

    val stringStream = AkkaUtils.createStream[String](
        ssc,
        Props(classOf[YourStringRequesterActor]),
        "your-string-requester-actor"
    )

    // DStream has no per-element foreach; print() outputs the first elements
    // of each batch. The context must be started and kept alive.
    stringStream.print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Note :: Just take care of my-ip-address. If there are any other problems, please let me know in the comments.