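I'm very new to Scala and I'm confused by its bit-manipulation features. Could someone point me in the right direction?
I have a byte array defined with the following bit fields: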
0-3 - magic number
4 - version
5-7 - payload length in bytes
8-X - payload, of variable length, as indicated in bits 5-7
I'd like to serialize this back and forth to a structure like the following:
MagicNumber: Integer
Version: Integer
Length: Integer
payload: Array[Byte]
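What is the best way to handle the bits in this situation? Most of the examples I've seen deal with higher-level serialization, such as JSON. In this case I want to serialize and deserialize binary TCP data.
Answer 0 (score: 6)
You could use Scala Pickling, POF, or Google Protobuf, but if your format is this constrained, the simplest approach is to write your own serializer: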
case class Data(magicNumber: Int, version: Int, payload: Array[Byte])

// Header byte layout: 4-bit magic number | 1-bit version | 3-bit payload length.
def serialize(data: Stream[Data]): Stream[Byte] =
  data.flatMap(x =>
    Array((x.magicNumber << 4 | x.version << 3 | x.payload.length).toByte) ++ x.payload)

@scala.annotation.tailrec
def deserialize(binary: Stream[Byte], acc: Stream[Data] = Stream[Data]()): Stream[Data] =
  if (binary.nonEmpty) {
    // Mask to 0..255 first: Byte is signed, so a plain >> would sign-extend.
    val header = binary.head & 0xFF
    val magicNumber = header >> 4
    val version = (header & 0x08) >> 3
    val size = header & 0x07
    // Data.payload is an Array[Byte] here; ByteVector belongs to the Scodec variant.
    val data = Data(magicNumber, version, binary.tail.take(size).toArray)
    deserialize(binary.drop(size + 1), acc ++ Stream(data))
  } else acc
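To look at the bit arithmetic in isolation, here is a minimal sketch of packing and unpacking the header byte with shifts and masks, with explicit range checks added. The names `BitHeader`, `pack`, and `unpack` are hypothetical, and the layout is assumed to be the same 4-bit magic number, 1-bit version, 3-bit length reading used in the serializer above.

```scala
// Hypothetical helper, not from any library.
// Assumed layout of the single header byte: mmmm v lll
//   mmmm = magic number (0..15), v = version (0..1), lll = payload length (0..7)
object BitHeader {
  // Pack the three fields into one byte, rejecting values that do not fit.
  def pack(magic: Int, version: Int, length: Int): Byte = {
    require(magic >= 0 && magic <= 0x0F, s"magic number out of range: $magic")
    require(version >= 0 && version <= 0x01, s"version out of range: $version")
    require(length >= 0 && length <= 0x07, s"length out of range: $length")
    ((magic << 4) | (version << 3) | length).toByte
  }

  // Unpack a header byte into (magic, version, length).
  // The byte is masked with 0xFF first so its sign bit is not extended.
  def unpack(header: Byte): (Int, Int, Int) = {
    val b = header & 0xFF
    ((b >> 4) & 0x0F, (b >> 3) & 0x01, b & 0x07)
  }
}
```

For example, `BitHeader.pack(10, 1, 5)` produces the byte `0xAD`, and `BitHeader.unpack` on that byte returns `(10, 1, 5)`. Out-of-range input such as a magic number of 16 fails the `require` check instead of silently corrupting neighbouring fields.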
Alternatively, you can use the Scodec library (this is the better option, because you get automatic value range checking):
SBT:
libraryDependencies += "org.typelevel" %% "scodec-core" % "1.3.0"
Codec:
import scodec._, scodec.bits._, scodec.codecs._
case class Data(magicNumber: Int, version: Int, payload: ByteVector)
val codec = (uint(4) :: uint(1) :: variableSizeBytes(uint(3), bytes)).as[Data]
Usage:
val encoded = codec.encode(Data(2, 1, bin"01010101".bytes)).fold(sys.error, _.toByteArray)
val decoded = codec.decode(BitVector(encoded)).fold(sys.error, _._2)
Answer 1 (score: 3)
I would look at scodec. Based on the UDP example, it should be something like this (untested):
import scodec._
import scodec.bits.{ BitVector, ByteVector }
import scodec.codecs._

case class Datagram(
  magicNumber: Int,
  version: Byte,
  payload: ByteVector)

object Datagram {
  implicit val codec: Codec[Datagram] = {
    ("magic_number" | int32) ::
    ("version"      | byte) ::
    // Assumes offsets 5-7 are bytes, so the length prefix is an unsigned 24-bit
    // integer; int(3) would be a signed 3-bit field.
    variableSizeBytes(uint(24),
      ("payload"    | bytes))
  }.as[Datagram]
}
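For comparison, the same byte-oriented reading of the layout can be sketched without any library, using `java.nio.ByteBuffer` from the JDK. This is only an illustration under stated assumptions: the offsets 0-3, 4, and 5-7 are taken to be bytes, the byte order is taken to be big-endian (network order), and the names `Frame` and `FrameCodec` are hypothetical.

```scala
import java.nio.ByteBuffer

// Hypothetical names; assumed byte layout (big-endian):
//   0-3 magic number, 4 version, 5-7 payload length, 8.. payload
final case class Frame(magicNumber: Int, version: Byte, payload: Array[Byte])

object FrameCodec {
  val HeaderSize = 8
  val MaxPayload = 0xFFFFFF // largest value a 3-byte length field can hold

  def encode(f: Frame): Array[Byte] = {
    val len = f.payload.length
    require(len <= MaxPayload, s"payload too large: $len bytes")
    val buf = ByteBuffer.allocate(HeaderSize + len) // big-endian by default
    buf.putInt(f.magicNumber)
    buf.put(f.version)
    // Write the 24-bit length one byte at a time, most significant byte first.
    buf.put((len >> 16).toByte).put((len >> 8).toByte).put(len.toByte)
    buf.put(f.payload)
    buf.array()
  }

  // Returns None if the input is shorter than the header or the declared length.
  def decode(bytes: Array[Byte]): Option[Frame] =
    if (bytes.length < HeaderSize) None
    else {
      val buf = ByteBuffer.wrap(bytes)
      val magic = buf.getInt()
      val version = buf.get()
      // Mask each byte with 0xFF so negative bytes do not sign-extend.
      val len = ((buf.get() & 0xFF) << 16) | ((buf.get() & 0xFF) << 8) | (buf.get() & 0xFF)
      if (buf.remaining() < len) None
      else {
        val payload = new Array[Byte](len)
        buf.get(payload)
        Some(Frame(magic, version, payload))
      }
    }
}
```

Since TCP delivers a byte stream rather than discrete messages, a reader built on this would typically read the 8-byte header first and then read exactly the declared number of payload bytes before calling `decode`. Note that `Frame` holds an `Array[Byte]`, so two frames compare by array reference; compare `payload.toSeq` when checking round trips.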