R readLines from socketConnection chokes when encoding = "UTF-8"

Asked: 2017-09-08 10:07:58

Tags: r sockets utf-8

By "chokes" I mean that readLines simply stops processing and blocks. How can I make it work with UTF-8?

When I set up a socketConnection and set encoding = "UTF-8" for the server or the client (or both), readLines chokes. When both server and client use encoding = "native.enc", it works fine.
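One possible workaround, sketched here under assumptions and not verified against the blocking behaviour described above: leave the connection at encoding = "native.enc" (the combination reported to work) and handle UTF-8 explicitly when writing and reading. The helper names write_utf8_lines and read_utf8_lines are hypothetical, and a temporary file stands in for the socket so that the sketch runs in a single R session.

```r
# Hypothetical helpers (not from the original post). The connection itself
# keeps its default encoding ("native.enc"); UTF-8 is handled explicitly.
write_utf8_lines <- function(con, text) {
  # useBytes = TRUE writes the UTF-8 bytes as they are, without re-encoding
  writeLines(enc2utf8(text), con = con, useBytes = TRUE)
}

read_utf8_lines <- function(con, n = 1L) {
  # encoding = "UTF-8" only marks the returned strings; it does not convert
  readLines(con, n = n, encoding = "UTF-8")
}

# Assumption: a temporary file replaces the socket for this demonstration
path <- tempfile()
out_con <- file(path, open = "wb")
write_utf8_lines(out_con, c("ls", "Gr\u00fc\u00dfe"))
close(out_con)

in_con <- file(path, open = "rb")
received <- read_utf8_lines(in_con, n = 2L)
close(in_con)
unlink(path)

print(received)
print(Encoding(received))
```

With sockets, the same two helpers would be called on a connection created with encoding = "native.enc" on both sides. Whether this avoids the blocking on the setup described here is untested.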

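readChar works in every setting, no matter which encoding the server and the client use. But it is too slow, and I think scan also uses readLines internally, so using readChar is not really an option for me.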
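If the bytes can be obtained in larger chunks instead of character by character, they can be split into lines by hand. The following is only a sketch of that idea: split_utf8_lines is a hypothetical helper, and how the raw bytes are fetched from the socket (for example with readBin) is assumed and not shown.

```r
# Hypothetical helper (not from the original post): split a raw vector that
# holds UTF-8 text into lines and mark the result as UTF-8.
split_utf8_lines <- function(bytes) {
  stopifnot(is.raw(bytes))
  ends <- which(bytes == as.raw(0x0a))            # positions of "\n"
  starts <- c(1L, ends[-length(ends)] + 1L)        # first byte of each line
  lines <- vapply(seq_along(ends), function(i) {
    if (ends[i] == starts[i]) return("")           # empty line
    chunk <- bytes[starts[i]:(ends[i] - 1L)]
    n <- length(chunk)
    # drop a trailing carriage return, if any
    if (chunk[n] == as.raw(0x0d)) chunk <- chunk[-n]
    rawToChar(chunk)
  }, character(1))
  Encoding(lines) <- "UTF-8"
  lines                                            # bytes after the last "\n" are ignored
}

# Example input: "ls", "2", a word starting with u-umlaut, "de"
bytes <- c(charToRaw("ls\n2\n"), as.raw(c(0xc3, 0xbc)), charToRaw("ber\nde\n"))
lines <- split_utf8_lines(bytes)
print(lines)
```

Incomplete trailing data (bytes after the last newline) is dropped here; a real reader would have to keep it and prepend it to the next chunk.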

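The example code (see socket_chockes.zip) defines 8 functions. I hope the names make clear what they do (see the code below). The table shows which combinations work (w) and which block (b):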

Rows: server function (server.R). Columns: client function (client.R).

                    consume_     consume_   consume_      consume_
                    char_native  char_utf8  lines_native  lines_utf8
serve_char_native   w            w          w             b
serve_char_utf8     w            w          w             b
serve_lines_native  w            w          w             b
serve_lines_utf8    b            b          b             b
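The pattern in the table can be checked mechanically. The sketch below copies the results into an R matrix and compares them with a simple rule read off the table: a combination blocks exactly when at least one side combines readLines with encoding = "UTF-8". The rule only summarizes the observations; it does not explain the cause.

```r
servers <- c("serve_char_native", "serve_char_utf8",
             "serve_lines_native", "serve_lines_utf8")
clients <- c("consume_char_native", "consume_char_utf8",
             "consume_lines_native", "consume_lines_utf8")

# Results copied from the table: "w" = works, "b" = blocks
results <- matrix(
  c("w", "w", "w", "b",
    "w", "w", "w", "b",
    "w", "w", "w", "b",
    "b", "b", "b", "b"),
  nrow = 4, byrow = TRUE,
  dimnames = list(server = servers, client = clients)
)

# Rule read off the table: blocks if either side is the readLines + UTF-8 variant
predicted <- outer(
  servers == "serve_lines_utf8",
  clients == "consume_lines_utf8",
  `|`
)

rule_matches <- all((results == "b") == predicted)
print(rule_matches)
```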

I captured the traffic with RawCap and can forward it by email if that is of any help. You can see that the client stops processing in s1c2, and the server in s2c2.

            server.R            client.R
s1c1.pcap:  serve_lines_native  consume_lines_native
s1c2.pcap:  serve_lines_native  consume_lines_utf8
s2c2.pcap:  serve_lines_utf8    consume_lines_utf8

My sessionInfo():

R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.4.1 tools_3.4.1   

Code of server.R

make_server_con  <- function(encoding) {
  socketConnection(
    server = TRUE,
    host = "localhost", port = 49125,
    blocking = TRUE, open = "r+", encoding = encoding
  )
}

serve_readLines <- function(con) {
  print(readLines(con, n = 1L))
  print(readLines(con, n = 1L))
  print(readLines(con, n = 1L))
  print(readLines(con, n = 1L))
  writeLines(
    c("0"),
    con = con
  )
}

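serve_readChar <- function(con) {
  # The counts match the client's four lines including the trailing "\n":
  # "ls\n" (3), "2\n" (2), "Overall\n" (8), "de\n" (3)
  print(readChar(con, n = 3L))
  print(readChar(con, n = 2L))
  print(readChar(con, n = 8L))
  print(readChar(con, n = 3L))
  # Reply with a single status line
  writeLines(
    c("0"),
    con = con
  )
}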

serve_example <- function(encoding, fun) {
  con <- make_server_con(encoding)
  fun(con)
  fun(con)
  close(con)
}

serve_char_native <- function() {
  serve_example("native.enc", serve_readChar)
}

serve_char_utf8  <- function() {
  serve_example("UTF-8",      serve_readChar)
}

serve_lines_native  <- function() {
  serve_example("native.enc", serve_readLines)
}

serve_lines_utf8  <- function() {
  serve_example("UTF-8",      serve_readLines)
}

Code of client.R

make_client_con  <- function(encoding) {
  socketConnection(
    host = "localhost", port = 49125,
    blocking = TRUE, open = "r+", encoding = encoding
  )
}

consume_readLines <- function(con) {
  writeLines(
    c("ls", "2", "Overall", "de"),
    con = con
  )
  print(readLines(con, n = 1L))
}

consume_readChar <- function(con) {
  # Send the four request lines, then read the reply "0\n" (2 characters)
  writeLines(
    c("ls", "2", "Overall", "de"),
    con = con
  )
  print(readChar(con, n = 2L))
}

consume_example <- function(encoding, fun) {
  con <- make_client_con(encoding)
  fun(con)
  fun(con)
  close(con)
}

consume_char_native <- function() {
  consume_example("native.enc", consume_readChar)
}

consume_char_utf8 <- function() {
  consume_example("UTF-8", consume_readChar)
}

consume_lines_native <- function() {
  consume_example("native.enc", consume_readLines)
}

consume_lines_utf8 <- function() {
  consume_example("UTF-8", consume_readLines)
}

0 answers:

No answers.