Implementing a Receiver in Spark

Asked: 2014-07-10 04:12:51

Tags: scala apache-spark packet-capture spark-streaming

I have been trying to implement a receiver for Spark 0.9. I capture packets using the jNetPcap library and need to pass them to Spark in Scala. Is it enough to put the packet-capture code inside the `def receive()` method?

Edit: below is the code that captures packets with the jNetPcap library, taken from this link:

import java.util.Date
import java.lang.StringBuilder
import org.jnetpcap.Pcap
import org.jnetpcap.packet.PcapPacket
import org.jnetpcap.packet.PcapPacketHandler

object PacketCapture1 {
  def main(args: Array[String]){
    val snaplen = 64 * 1024 // Capture whole packets, no truncation
    val flags = Pcap.MODE_PROMISCUOUS // capture all packets
    val timeout = 10 * 1000
    // Pcap.openLive expects a java.lang.StringBuilder (imported above) as its error buffer
    val errbuf = new StringBuilder()
    val pcap = Pcap.openLive("eth0", snaplen, flags, timeout, errbuf)
    if (pcap == null) {
      println("Error : " + errbuf.toString())
    }
    println(pcap)
    val jpacketHandler = new PcapPacketHandler[String]() {

      def nextPacket(packet: PcapPacket, user: String) {
        // printf (not println) is needed for format-string substitution
        printf("Received packet at %s caplen=%4d len=%4d %s\n", new Date(packet.getCaptureHeader.timestampInMillis()),
          packet.getCaptureHeader.caplen(), packet.getCaptureHeader.wirelen(), user)
      }
    }
    pcap.loop(30, jpacketHandler, "jNetPcap works!")
    pcap.close()

  }
}

How can I implement a Spark receiver for the captured packets using this code?

1 answer:

Answer 0 (score: 5)

You have to create a custom NetworkReceiver (or a Receiver in Spark 1.0+) and implement its onStart() method. For Spark 0.9, see http://spark.apache.org/docs/0.9.1/streaming-custom-receivers.html

For Spark 1.0 (strongly recommended), see http://spark.apache.org/docs/latest/streaming-custom-receivers.html
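To make this concrete, here is a minimal sketch of what a Spark 1.0 custom Receiver wrapping the question's jNetPcap capture loop could look like. The class name `PcapReceiver`, the stored string format, and the batch size of 10 are illustrative assumptions, not part of the original code; error handling and packet parsing are left as an exercise.

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver
import org.jnetpcap.Pcap
import org.jnetpcap.packet.{PcapPacket, PcapPacketHandler}

// Hypothetical receiver: captures packets on `device` and feeds them to Spark Streaming.
class PcapReceiver(device: String)
    extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  def onStart(): Unit = {
    // Capture must run on its own thread; onStart() must not block.
    new Thread("Pcap Receiver") {
      override def run(): Unit = capture()
    }.start()
  }

  def onStop(): Unit = {
    // Nothing to do here: the capture thread checks isStopped() and exits on its own.
  }

  private def capture(): Unit = {
    val snaplen = 64 * 1024               // capture whole packets, no truncation
    val flags   = Pcap.MODE_PROMISCUOUS
    val timeout = 10 * 1000
    val errbuf  = new java.lang.StringBuilder()
    val pcap    = Pcap.openLive(device, snaplen, flags, timeout, errbuf)
    if (pcap == null) {
      // Ask Spark to restart the receiver rather than dying silently
      restart("Failed to open device " + device + ": " + errbuf.toString)
      return
    }
    val handler = new PcapPacketHandler[String] {
      def nextPacket(packet: PcapPacket, user: String): Unit = {
        // store() hands a single record to Spark Streaming; here we push a
        // simple summary string (an assumption -- you would serialize whatever
        // fields your job needs)
        store("caplen=" + packet.getCaptureHeader.caplen() +
              " wirelen=" + packet.getCaptureHeader.wirelen())
      }
    }
    // Capture in small batches so the loop can notice isStopped()
    while (!isStopped()) {
      pcap.loop(10, handler, "")
    }
    pcap.close()
  }
}
```

It would then be wired into a streaming job with something like `val packets = ssc.receiverStream(new PcapReceiver("eth0"))`, after which `packets` is an ordinary DStream of strings.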