Reading the encoding of an existing connection object

Date: 2014-10-09 15:45:59

Tags: r character-encoding

Is there a way to get (and set) the encoding of an existing connection? For example:

con <- file(tempfile(), encoding = "UTF-8")
summary(con)

The summary lists the mode and whether the connection is open, but not the encoding the connection uses.

1 answer:

Answer 0 (score: 0)

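I'm not sure I fully understand what you need to do, but assuming that:

  • the connection refers to an existing file on disk
  • you definitely need to read from the file
  • you may also want to write to the file

and that you are happy to force UTF-8 encoding, you could do something like this: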

# Hypothetical connection used by the user (the file must exist on disk,
# hence the "w" here)
con <- file(tempfile(), open = "w", encoding = "UTF-8")

# Retrieve the attributes of the existing connection
con.attr <- summary(con)

# Build a list of parameters for a new connection that will replace
# the original one: same file, read mode (plus write if the original
# connection was writable), and UTF-8 encoding
newcon.attr <- list()
newcon.attr["description"] <- con.attr$description
newcon.attr["open"] <- paste0("r", ifelse(con.attr$'can write'=='yes', "+", ""))
newcon.attr["encoding"] <- "UTF-8"

# Close the original connection, then create the new one
close(con)
newcon <- do.call(what = file, args = newcon.attr)

# Check its attributes
summary(newcon)
# $description
# [1] "C:\\Users\\...\\Temp\\Rtmpo9ykjo\\file54744993321b"
#
# $class
# [1] "file"
#
# $mode
# [1] "r+"
#
# $text
# [1] "text"
#
# $opened
# [1] "opened"
# 
# $`can read`
# [1] "yes"
# 
# $`can write`
# [1] "yes"
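
Since summary() does not report the encoding, one alternative is to record it yourself when the connection is created, so it can be looked up later without re-creating anything. The following is only a minimal sketch: open_tracked() and tracked_encoding() are hypothetical helper names, not part of base R, and the approach assumes you control the code that creates the connection.

```r
# Hypothetical helper: create a file connection and keep the encoding
# next to it in a list, because summary() does not expose it.
open_tracked <- function(path, open = "", encoding = getOption("encoding")) {
  list(
    con = file(path, open = open, encoding = encoding),
    encoding = encoding
  )
}

# Hypothetical accessor for the recorded encoding.
tracked_encoding <- function(tc) tc$encoding

tc <- open_tracked(tempfile(), encoding = "UTF-8")

# summary() has no "encoding" field, but the wrapper remembers it.
has_encoding_field <- "encoding" %in% names(summary(tc$con))
recorded <- tracked_encoding(tc)

close(tc$con)
```

This does nothing for connections created elsewhere; for those, re-creating the connection as shown above may still be the only option.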

Checking whether the file's existing content is actually encoded in UTF-8 is another story, so this may or may not be useful in your case.
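
For that separate check, one possible approach is to read the raw lines and test them with validUTF8(), which is available in base R in reasonably recent versions. This is a sketch only: is_utf8_file() is a hypothetical helper name, and the two sample files are written byte by byte purely for illustration.

```r
# Hypothetical helper: TRUE if every line of the file is valid UTF-8.
is_utf8_file <- function(path) {
  lines <- readLines(path, warn = FALSE)
  all(validUTF8(lines))
}

# Two small sample files: "café" encoded as UTF-8 and as Latin-1.
utf8_path <- tempfile()
latin1_path <- tempfile()
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xc3, 0xa9, 0x0a)), utf8_path)
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), latin1_path)

utf8_ok <- is_utf8_file(utf8_path)
latin1_ok <- is_utf8_file(latin1_path)
```

Note that passing this test does not prove the file was written as UTF-8: plain ASCII content is valid UTF-8 too, so the check can only rule UTF-8 out, not confirm the author's intent.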