R source() encoding bug?

Time: 2017-05-04 17:22:00

Tags: r encoding utf-8

I have found a very strange bug involving the encoding of character constants in R.

main.R:

options(encoding = "UTF-8")
print(Sys.getlocale())
print(getOption("encoding"))

print("first run")
source("internal.R")
print("")

print("second run")
source("internal.R", encoding = "UTF-8")
print("")

internal.R:

print(Sys.getlocale())
print(getOption("encoding"))
char_constant="Тут не просто живут баги, тут у них гнездо"
print(Encoding(char_constant))

Now let's look at the output in R after pressing the Source button:

[1] "ru_RU.UTF-8/ru_RU.UTF-8/ru_RU.UTF-8/C/ru_RU.UTF-8/ru_RU.UTF-8"
[1] "UTF-8"
[1] "first run"
[1] "ru_RU.UTF-8/ru_RU.UTF-8/ru_RU.UTF-8/C/ru_RU.UTF-8/ru_RU.UTF-8"
[1] "UTF-8"
[1] "unknown"
[1] ""
[1] "second run"
[1] "ru_RU.UTF-8/ru_RU.UTF-8/ru_RU.UTF-8/C/ru_RU.UTF-8/ru_RU.UTF-8"
[1] "UTF-8"
[1] "UTF-8"
[1] ""

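Note the difference in encoding: "unknown" on the first run and "UTF-8" on the second. There is also an apparent minor bug: source() ignores the default encoding option.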
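The two runs differ only in whether encoding is passed explicitly, so one possible stopgap for calls made from your own scripts is a thin wrapper that always passes it. This is only a minimal sketch: source_utf8 is a hypothetical name, not an R or RStudio function, and it assumes the sourced files are saved as UTF-8.

```r
# Hypothetical wrapper (not part of R): always pass the encoding explicitly,
# as in the "second run" above. Assumes the sourced file is saved as UTF-8.
source_utf8 <- function(file, ...) {
  source(file, encoding = "UTF-8", ...)
}

# Usage sketch with a throwaway ASCII-only script.
script <- tempfile(fileext = ".R")
writeLines("answer <- 6 * 7", script)
env <- new.env()
source_utf8(script, local = env)  # evaluate in env instead of the workspace
get("answer", envir = env)
unlink(script)
```

Other arguments of source(), such as local or echo, pass through unchanged via the dots.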

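The real problem is that mixing different encodings in a data.table causes a lot of trouble, and RStudio leaves strings marked as "UTF-8" when you execute a single line, but as "unknown" when you source the whole file.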

Does anyone know what is going on and how to work around it?

R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin14.5.0 (64-bit)
Running under: OS X 10.12.4 (unknown)

locale:
[1] ru_RU.UTF-8/ru_RU.UTF-8/ru_RU.UTF-8/C/ru_RU.UTF-8/ru_RU.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.3.0

1 answer:

Answer 0 (score: 0)

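On Windows, R's source function does not work for files that contain characters outside the current system encoding. You may not be able to use RStudio's "Run All" and "Source on Save" commands, because they rely on source().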

See: https://support.rstudio.com/hc/en-us/articles/200532197-Character-Encoding
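For the mixed marks described in the question, one possible workaround is to normalize the encoding mark after the fact. The sketch below is only an illustration: mark_utf8 is a hypothetical helper, not something from R, RStudio, or data.table, and it assumes the unmarked strings really do contain UTF-8 bytes (which should hold in a UTF-8 locale such as ru_RU.UTF-8). It uses a plain data frame to stay dependency-free.

```r
# Hypothetical helper (not from the post or any package): declare that strings
# without an encoding mark hold UTF-8 bytes. Only the mark changes, not the bytes.
mark_utf8 <- function(x) {
  stopifnot(is.character(x))
  unmarked <- Encoding(x) == "unknown"
  Encoding(x[unmarked]) <- "UTF-8"
  x
}

# Build a Cyrillic string from raw UTF-8 bytes (U+0431 U+0430 U+0433) so the
# example does not depend on how this snippet itself is encoded.
utf8_bytes <- as.raw(c(0xd0, 0xb1, 0xd0, 0xb0, 0xd0, 0xb3))
unmarked <- rawToChar(utf8_bytes)   # rawToChar() leaves the mark unset
marked <- mark_utf8(unmarked)

Encoding(unmarked)  # "unknown"
Encoding(marked)    # "UTF-8"

# Apply the helper to every character column of a data frame.
df <- data.frame(id = 1:2, txt = c(unmarked, "plain ascii"),
                 stringsAsFactors = FALSE)
df[] <- lapply(df, function(col) if (is.character(col)) mark_utf8(col) else col)
Encoding(df$txt)    # "UTF-8" "unknown" (pure ASCII strings are never marked)
```

If the unmarked strings might be in some other native encoding (for example a Windows code page), relabeling them like this would be wrong; converting with enc2utf8() or iconv() would be the safer route there.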