How do I read extra ASCII characters in R?

Asked: 2015-06-30 17:34:31

Tags: r utf-8 character-encoding utf-16

I am using the following function to read an input text file line by line:

lines_reader<-function(filename){
    conn<-file(filename,open="r")
    linn<-readLines(conn,encoding="UCS-2LE")
    close(conn)
    return(linn)
}

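If I try to display these lines in the R environment, accented letters are not shown properly: they appear as "Ã" or "Ã" instead of "à" or "è".

How can I deal with this? Which encoding should I choose?

Here is my session and locale information: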

> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252   
[3] LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C                  
[5] LC_TIME=Italian_Italy.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.2.0


> Sys.getlocale()
[1] "LC_COLLATE=Italian_Italy.1252;LC_CTYPE=Italian_Italy.1252;LC_MONETARY=Italian_Italy.1252;LC_NUMERIC=C;LC_TIME=Italian_Italy.1252"

1 answer:

Answer 0 (score: 0)

How about changing the encoding you are using:

lines_reader<-function(filename){
    conn<-file(filename,open="r")
    linn<-readLines(conn,encoding="UTF-8")
    close(conn)
    return(linn)
}
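
One detail worth knowing here: according to the R documentation, the encoding argument of readLines only marks the resulting strings as Latin-1 or UTF-8; it does not convert the bytes that are read. To have R convert while reading, the encoding can be declared on the connection instead. The following is a minimal sketch of that approach, not a definitive fix. It assumes the file is actually stored as UTF-8 (the "Ã" output suggests UTF-8 bytes interpreted as Windows-1252). The name lines_reader_utf8 and the file_encoding parameter are hypothetical, introduced only for illustration.

```r
# Hypothetical variant of lines_reader: the encoding is declared on the
# connection, so the input is converted to the session encoding while reading.
lines_reader_utf8 <- function(filename, file_encoding = "UTF-8") {
    conn <- file(filename, open = "r", encoding = file_encoding)
    on.exit(close(conn))  # close the connection even if readLines fails
    readLines(conn)
}

# Small round trip with a temporary file. The \u escapes keep this script
# ASCII-only; useBytes = TRUE writes the UTF-8 bytes unchanged.
tmp <- tempfile(fileext = ".txt")
writeLines(enc2utf8(c("citt\u00e0", "perch\u00e8")), tmp, useBytes = TRUE)

linn <- lines_reader_utf8(tmp)
print(linn)
unlink(tmp)
```

If the file really is UTF-16 little-endian, as the original "UCS-2LE" attempt suggests it might be, the same function could be called with file_encoding = "UTF-16LE" instead.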