Scala bit-field serialization

Time: 2014-09-19 03:27:02

Tags: scala binary-serialization

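I am very new to Scala and I am confused by its bit-manipulation facilities. Could someone point me in the right direction?

I have a byte array defined with the following bit fields: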

0-3 - magic number
  4 - version
5-7 - payload length in bytes
8-X - payload, of variable length, as indicated in bits 5-7

I would like to serialize this back and forth to a structure such as:

MagicNumber: Integer
Version: Integer
Length: Integer
payload: Array[Byte]

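What is the best way to handle the bits in this situation? Most of the examples I have seen deal with higher-level serialization, such as JSON. In this case I want to serialize and deserialize binary data sent over TCP.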

2 answers:

Answer 0: (score: 6)

You can use Scala Pickling, POF, or Google Protobuf, but if your format is this constrained, the simplest way is to write your own serializer:

case class Data(magicNumber: Int, version: Int, payload: Array[Byte])

// Header byte layout: magic number (4 bits) | version (1 bit) | payload length (3 bits)
def serialize(data: Stream[Data]): Stream[Byte] =
   data.flatMap(x =>
     Array(((x.magicNumber << 4) | (x.version << 3) | x.payload.length).toByte) ++ x.payload)

@scala.annotation.tailrec
def deserialize(binary: Stream[Byte], acc: Stream[Data] = Stream[Data]()): Stream[Data] =
   if (binary.nonEmpty) {
     // Mask before shifting: Byte is signed, so a bare >> would sign-extend
     // and yield a negative magic number whenever the top bit is set.
     val magicNumber = (binary.head & 0xF0) >> 4
     val version = (binary.head & 0x08) >> 3
     val size = binary.head & 0x07
     // payload is declared as Array[Byte], so no ByteVector wrapper here
     val data = Data(magicNumber, version, binary.tail.take(size).toArray)
     deserialize(binary.drop(size + 1), acc ++ Stream(data))
   } else acc
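
The hand-written version above does no range checking, so a magic number above 15 or a payload longer than 7 bytes would silently corrupt the header byte. A minimal sketch of a single-frame encoder and decoder with explicit checks could look like the following. The names `BitFieldFrames`, `Frame`, `encodeFrame` and `decodeFrame` are hypothetical and not part of the original answer, and the decoder assumes the input starts at a frame boundary.

```scala
// Hypothetical names throughout; a sketch under the 4/1/3-bit layout from the question.
object BitFieldFrames {
  final case class Frame(magicNumber: Int, version: Int, payload: Array[Byte])

  // Packs one frame: header byte = magic (4 bits) | version (1 bit) | length (3 bits),
  // followed by the payload bytes.
  def encodeFrame(f: Frame): Array[Byte] = {
    require(f.magicNumber >= 0 && f.magicNumber <= 0x0F, "magic number must fit in 4 bits")
    require(f.version == 0 || f.version == 1, "version must fit in 1 bit")
    require(f.payload.length <= 0x07, "payload length must fit in 3 bits")
    val header = (f.magicNumber << 4) | (f.version << 3) | f.payload.length
    header.toByte +: f.payload
  }

  // Decodes one frame from the start of `bytes`.
  // Returns None if the input is empty or shorter than the declared payload length.
  def decodeFrame(bytes: Array[Byte]): Option[Frame] =
    bytes.headOption.flatMap { b =>
      val header = b & 0xFF // widen the signed Byte to an unsigned 0..255 value
      val size = header & 0x07
      if (bytes.length < 1 + size) None
      else Some(Frame(header >> 4, (header >> 3) & 0x01, bytes.slice(1, 1 + size)))
    }
}
```

With this sketch, `BitFieldFrames.encodeFrame(Frame(9, 1, Array[Byte](1, 2, 3)))` produces four bytes whose first byte is 0x9B (binary 1001 1 011), and `decodeFrame` on that array recovers the same three fields.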

Or you can use the Scodec library (this option is better, because you get automatic value-range checking):

SBT:

  libraryDependencies += "org.typelevel" %% "scodec-core" % "1.3.0"

Codec:

  case class Data(magicNumber: Int, version: Int, payload: ByteVector)
  val codec = (uint(4) :: uint(1) :: variableSizeBytes(uint(3), bytes)).as[Data]

Usage:

  val encoded = codec.encode(Data(2, 1, bin"01010101".bytes)).fold(sys.error, _.toByteArray)
  val decoded = codec.decode(BitVector(encoded)).fold(sys.error, _._2)
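
To see which bytes the codec is expected to put on the wire for that sample value, the header byte can be worked out by hand from the layout in the question. The sketch below is plain Scala with no scodec dependency. `ExpectedWire` and its members are hypothetical names, and the result assumes the fields are written most-significant-bit first in the order they are declared.

```scala
// Hypothetical helper; computes by hand the bytes for Data(2, 1, bin"01010101".bytes).
object ExpectedWire {
  // bin"01010101" is a single byte, 0x55
  val payload: Array[Byte] = Array(0x55.toByte)

  // 0010 (magic = 2) | 1 (version = 1) | 001 (length = 1) = 0b00101001 = 0x29
  val header: Int = (2 << 4) | (1 << 3) | payload.length

  val expectedWire: Array[Byte] = header.toByte +: payload

  // Formats bytes as two-digit lowercase hex separated by spaces
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xFF}%02x").mkString(" ")
}
```

Here `ExpectedWire.hex(ExpectedWire.expectedWire)` gives "29 55": one header byte followed by the one-byte payload.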

Answer 1: (score: 3)

I would look at scodec. Based on the UDP example, it should be something like this (untested):

import scodec.Codec
import scodec.bits.{ BitVector, ByteVector }
import scodec.codecs._

case class Datagram(
  magicNumber: Int,
  version: Int,
  payload: ByteVector)

object Datagram {
  // Field widths follow the question's layout: 4-bit magic number,
  // 1-bit version, 3-bit unsigned payload length prefix.
  implicit val codec: Codec[Datagram] = {
    ("magic_number" | uint(4) ) ::
    ("version" | uint(1) ) ::
    variableSizeBytes(uint(3),
      ("payload" | bytes ))
  }.as[Datagram]
}