Reading lines and raw bytes from the same source in Scala

Asked: 2013-05-07 09:04:13

Tags: scala io

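I need to write code that does the following:

  1. Connect to a TCP socket
  2. Read a line terminated by "\r\n" that contains a number N
  3. Read N bytes
  4. Use those N bytes

I am currently using the following code: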

    val socket = new Socket(InetAddress.getByName(host), port)
    val in = socket.getInputStream;
    val out = new PrintStream(socket.getOutputStream)
    
    val reader = new DataInputStream(in)
    val baos = new ByteArrayOutputStream
    val buffer = new Array[Byte](1024)
    
    out.print(cmd + "\r\n")
    out.flush
    
    val firstLine = reader.readLine.split("\\s")
    
    if(firstLine(0) == "OK") {
      def read(written: Int, max: Int, baos: ByteArrayOutputStream): Array[Byte] = {
        if(written >= max) baos.toByteArray
        else {
          val count = reader.read(buffer, 0, buffer.length)
          baos.write(buffer, 0, count)
          read(written + count, max, baos)
        }
      }
    
      read(0, firstLine(1).toInt, baos)
    } else {
      // RAISE something
    }
    
    baos.toByteArray()
    

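The problem with this code is that DataInputStream#readLine triggers a deprecation warning, but I can't find a class that implements both read(...) and readLine(...). BufferedReader, for example, implements read, but it reads chars rather than bytes. I could convert those chars to bytes, but I don't think that is safe.

Is there another way to write something like this in Scala?

Thanks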

3 Answers:

Answer 0: (score: 3)

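Note that on the JVM a char is 2 bytes, so "\r\n" takes 4 bytes in memory. That is usually not the case for strings stored outside the JVM: encoded as ASCII or UTF-8 on the wire, "\r\n" is 2 bytes.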
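To make the size difference concrete, here is a small sketch that encodes "\r\n" with a few standard charsets. The object name `CrlfSizes` and its fields are made up for illustration.

```scala
import java.nio.charset.StandardCharsets

object CrlfSizes {
  val crlf = "\r\n"

  // Number of chars in memory; each JVM char is one 16-bit UTF-16 code unit.
  val charCount: Int = crlf.length

  // Encoded size depends on the charset used outside the JVM.
  val asciiBytes: Int = crlf.getBytes(StandardCharsets.US_ASCII).length
  val utf8Bytes: Int = crlf.getBytes(StandardCharsets.UTF_8).length
  // UTF_16BE is used here because it writes no byte-order mark.
  val utf16Bytes: Int = crlf.getBytes(StandardCharsets.UTF_16BE).length
}
```

So the terminator to look for in the raw socket data depends on the protocol's encoding, not on how the JVM stores strings.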

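I think the safest approach is to read the input as raw bytes until you find the binary representation of "\r\n". You can then create a Reader over just those first bytes (it turns the bytes into JVM-compatible chars), where you can be sure there is only text, parse it, and safely handle the rest as binary data.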
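A minimal sketch of that approach, assuming the header line is plain ASCII and has the form "OK N" as in the question. The names `FramedRead`, `readLineBytes` and `readFrame` are hypothetical; `DataInputStream.readFully` is a standard method and, unlike `readLine`, is not deprecated.

```scala
import java.io.{ByteArrayOutputStream, DataInputStream, EOFException, InputStream}
import java.nio.charset.StandardCharsets

object FramedRead {
  private val CR = '\r'.toInt
  private val LF = '\n'.toInt

  // Reads raw bytes up to and including "\r\n"; returns the line without the terminator.
  def readLineBytes(in: InputStream): String = {
    val line = new ByteArrayOutputStream
    var prev = -1 // last byte read that has not been written to `line` yet
    var done = false
    while (!done) {
      val b = in.read()
      if (b == -1) throw new EOFException("stream ended before CRLF")
      if (prev == CR && b == LF) done = true
      else {
        if (prev != -1) line.write(prev)
        prev = b
      }
    }
    // Assumption: the header is ASCII text.
    new String(line.toByteArray, StandardCharsets.US_ASCII)
  }

  // Reads the "OK N" header, then exactly N payload bytes from the same stream.
  def readFrame(in: InputStream): Array[Byte] = {
    val header = readLineBytes(in).split("\\s+")
    if (header.length < 2 || header(0) != "OK")
      throw new IllegalStateException("unexpected header: " + header.mkString(" "))
    val payload = new Array[Byte](header(1).toInt)
    // DataInputStream does no read-ahead, so wrapping here consumes only N bytes.
    new DataInputStream(in).readFully(payload)
    payload
  }
}
```

With the question's socket this could be called as `FramedRead.readFrame(new BufferedInputStream(socket.getInputStream))`. Wrapping the socket stream in a single BufferedInputStream keeps the byte-at-a-time header read cheap, as long as every later read goes through that same wrapper.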

Answer 1: (score: 2)

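You can get both read(...) and readLine(...) from a single class. The idea is to use BufferedReader.read(): Int. BufferedReader buffers its input, so you can read one character at a time without hurting performance. Note that read() returns a decoded character, so it corresponds to exactly one byte only for single-byte text such as ASCII.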
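The buffering is also why all reads have to go through the same reader: a BufferedReader pulls more bytes from the underlying stream than the line it returns. The sketch below illustrates this with an in-memory stream; `ReadAheadDemo` and `leftoverAfterReadLine` are illustrative names only.

```scala
import java.io.{BufferedReader, ByteArrayInputStream, InputStreamReader}
import java.nio.charset.StandardCharsets

object ReadAheadDemo {
  // Returns how many bytes remain on the raw stream after one readLine()
  // through a BufferedReader layered on top of it.
  def leftoverAfterReadLine(data: Array[Byte]): Int = {
    val raw = new ByteArrayInputStream(data)
    val reader = new BufferedReader(new InputStreamReader(raw, StandardCharsets.UTF_8))
    reader.readLine()
    raw.available()
  }
}
```

For a short input such as "OK 2\r\nab", the reader typically drains the whole stream while filling its buffer, so the payload "ab" is no longer reachable from the raw InputStream and has to be read from the reader instead.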

The change could look like this (without any Scala-style optimization):

import java.io.BufferedInputStream
import java.io.BufferedReader
import java.io.ByteArrayOutputStream
import java.io.PrintStream
import java.net.InetAddress
import java.net.Socket
import java.io.InputStreamReader


object ReadLines extends App {
  val host = "127.0.0.1"
  val port = 9090
  val socket = new Socket(InetAddress.getByName(host), port)
  val in = socket.getInputStream;
  val out = new PrintStream(socket.getOutputStream)

//  val reader = new DataInputStream(in)
  val bufIns = new BufferedInputStream(in)
  val reader = new BufferedReader(new InputStreamReader(bufIns, "utf8"));

  val baos = new ByteArrayOutputStream
  val buffer = new Array[Byte](1024)

  val cmd = "get:"
  out.print(cmd + "\r\n")
  out.flush

  val firstLine = reader.readLine.split("\\s")

  if (firstLine(0) == "OK") {
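    def read(written: Int, max: Int, baos: ByteArrayOutputStream): Array[Byte] = {
      if (written >= max) {
        println("get: " + new String(baos.toByteArray))
        baos.toByteArray()
      } else {
        // Read one char at a time, stopping at the buffer size, the remaining
        // payload length, or end of stream, whichever comes first.
        val limit = math.min(buffer.length, max - written)
        var count = 0
        var b = reader.read()
        while (b != -1) {
          buffer(count) = b.toByte
          count += 1
          if (count < limit) {
            b = reader.read()
          } else {
            b = -1
          }
        }
        // Fail instead of recursing forever if the stream ends early.
        if (count == 0) sys.error("stream ended before " + max + " bytes were read")
        baos.write(buffer, 0, count)
        read(written + count, max, baos)
      }
    }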

    read(0, firstLine(1).toInt, baos)
  } else {
    // RAISE something
  }

  baos.toByteArray()
}

For testing, here is the server code:

import java.io.PrintStream
import java.net.ServerSocket

object ReadLinesServer extends App {
  val serverSocket = new ServerSocket(9090)
  while (true) {
    val socket = serverSocket.accept()
    println("accepted a connection.")
    val ops = socket.getOutputStream()
    val printStream = new PrintStream(ops, true, "utf8")
    printStream.print("OK 2\r\n") // header line: status and payload length
    printStream.print("ab")       // payload: ASCII letters take 1 byte each
    printStream.flush()
    socket.close()
  }
}

Answer 2: (score: 0)

This seems to be the best solution I could find:

val reader = new BufferedReader(new InputStreamReader(in))
val buffer = new Array[Char](1024)

out.print(cmd + "\r\n")
out.flush

val firstLine = reader.readLine.split("\\s")

if(firstLine(0) == "OK") {
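  def read(readCount: Int, acc: List[Byte]): Array[Byte] = {
    if(readCount <= 0) acc.toArray
    else {
      // Never read past the announced payload length, and stop on end of stream.
      val count = reader.read(buffer, 0, math.min(buffer.length, readCount))
      if(count == -1) sys.error("stream ended with " + readCount + " bytes still expected")
      val asBytes = buffer.slice(0, count).map(_.toByte)

      read(readCount - count, acc ++ asBytes)
    }
  }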

  read(firstLine(1).toInt, List[Byte]())
} else {
  // RAISE
}

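That is, use buffer.map(_.toByte).toArray to convert the char array to a byte array without caring about the encoding. This is only lossless when the reader decodes exactly one byte per char, as ISO-8859-1 does; with a multi-byte charset such as UTF-8, the result can differ from the raw bytes.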
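To see when the char-to-byte conversion is safe, the sketch below pushes bytes through an InputStreamReader with a given charset and narrows each char back with toByte, as the code above does. `CharsetRoundTrip` and `throughReader` are hypothetical names used only for this illustration.

```scala
import java.io.{ByteArrayInputStream, InputStreamReader}
import java.nio.charset.{Charset, StandardCharsets}

object CharsetRoundTrip {
  // Decodes the bytes through a Reader, then narrows every char with toByte,
  // mirroring buffer.map(_.toByte) from the answer.
  def throughReader(bytes: Array[Byte], charset: Charset): Array[Byte] = {
    val reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)
    val out = Array.newBuilder[Byte]
    var c = reader.read()
    while (c != -1) {
      out += c.toByte
      c = reader.read()
    }
    out.result()
  }

  // Every possible byte value, 0x00 through 0xFF.
  val allBytes: Array[Byte] = Array.tabulate[Byte](256)(_.toByte)

  // ISO-8859-1 maps each byte to the char with the same numeric value.
  val latin1Preserved: Boolean =
    throughReader(allBytes, StandardCharsets.ISO_8859_1).sameElements(allBytes)

  // UTF-8 treats bytes >= 0x80 as parts of multi-byte sequences.
  val utf8Preserved: Boolean =
    throughReader(allBytes, StandardCharsets.UTF_8).sameElements(allBytes)
}
```

If the payload can contain arbitrary binary data, constructing the reader as `new InputStreamReader(in, "ISO-8859-1")` is one way to keep this answer's approach; the no-argument charset default depends on the platform.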