r - Sourcing a script file that contains Unicode (Farsi) characters

Time: 2018-08-08 18:29:43

Tags: r unicode

Write the following text into a buffer and save it as a .r script:

letters_fa <- c('الف','ب','پ','ت','ث','ج','چ','ح','خ','ر','ز','د')
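For a reproducible setup, the script can also be written from R with explicit UTF-8 bytes. This is only a sketch: the temporary file path and the shortened three-letter vector are assumptions for illustration, and the letters are given as Unicode escapes so that the snippet itself stays ASCII-only.

```r
# Three of the Farsi letters from the question, written as Unicode escapes
# (U+0628 beh, U+067E peh, U+062A teh); the shortened vector is illustrative.
letters_src <- "letters_fa <- c('\u0628', '\u067e', '\u062a')"

# Hypothetical location; the question uses "path/to/script.R".
script <- tempfile(fileext = ".R")

# Open in binary mode and write the UTF-8 bytes as-is, so the current
# locale does not re-encode the text on the way out.
con <- file(script, open = "wb")
writeLines(enc2utf8(letters_src), con, useBytes = TRUE)
close(con)
```

Afterwards, `script` can be passed to the `source()` calls in the question.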

Then try the following lines to source() it:

library(magrittr) # provides the %>% pipe used below
script <- "path/to/script.R"
file(script,
     encoding = "UTF-8") %>%
  readLines() # works fine

file(script,
     encoding = "UTF-8") %>%
  source() # works fine

source(script) # the Farsi letters in the environment are misrepresented

source(script,
       encoding = "UTF-8") # gives error

The last line throws an error. I tried to debug it, and I believe the problem lies in the following line inside the source function:

...
loc <- utils::localeToCharset()[1L]
...

The error is raised at the .Internal(parse( line.

...
exprs <- if (!from_file) {
      if (length(lines)) 
        .Internal(parse(stdin(), n = -1, lines, "?", 
          srcfile, encoding))
      else expression()
    }
    else .Internal(parse(file, n = -1, NULL, "?", srcfile, 
      encoding))
...
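Since the suspicion falls on `utils::localeToCharset()`, it may help to check what the session reports for its locale and character set. The snippet below is a small diagnostic sketch; the variable names are illustrative only.

```r
# What the current session believes about its locale and character set.
ctype   <- Sys.getlocale("LC_CTYPE")   # locale used for character handling
charset <- utils::localeToCharset()    # charset(s) source() derives from it
is_utf8 <- l10n_info()[["UTF-8"]]      # TRUE if the session runs in a UTF-8 locale

print(ctype)
print(charset)
print(is_utf8)
```

If `is_utf8` is `FALSE`, the session uses a non-UTF-8 native encoding, which matches the situation described in the question.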

The exact error is:

Error in source(script, encoding = "UTF-8") : 
  script.R:2:17: unexpected INCOMPLETE_STRING
1: #' @export
2: letters_fa <- c('
                   ^

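2 Answers:

Answer 0 (score: 1)

The solution is to change the OS locale to the native locale of the script's language (Persian in this case), or to change the locale of the R session with the built-in function Sys.setlocale(locale="Persian").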
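One way to apply this without leaving the session in a different locale is to switch temporarily and restore the old setting afterwards. The helper below is a hypothetical sketch, not part of R; `source_with_locale` and its arguments are names made up for illustration.

```r
# Hypothetical helper: switch LC_CTYPE, source a file, then restore the
# previous locale even if sourcing fails.
source_with_locale <- function(path, locale, ...) {
  old <- Sys.getlocale("LC_CTYPE")
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)

  # Sys.setlocale() returns "" (with a warning) when the locale is unavailable.
  new <- suppressWarnings(Sys.setlocale("LC_CTYPE", locale))
  if (identical(new, "")) {
    stop("locale not available on this system: ", locale)
  }

  source(path, ...)
}
```

A call would look like `source_with_locale("path/to/script.R", "Persian")`. Locale names are platform dependent: "Persian" is the name given in the answer, while other systems may need a name such as "fa_IR.UTF-8", assuming that locale is installed.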

Answer 1 (score: 0)

Use source without specifying an encoding, then fix the encoding of the resulting vector with Encoding.
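The answer's own code did not survive on this page, so the following is a reconstruction under assumptions. It mimics what presumably happens after `source(script)` in a non-UTF-8 locale: the string holds the right UTF-8 bytes but is not marked as UTF-8. The raw bytes stand in for the sourced vector so that the snippet runs anywhere.

```r
# Stand-in for the sourced vector: bytes D8 A8 are the UTF-8 encoding of
# the letter beh (U+0628), but rawToChar() leaves the encoding unmarked.
letters_fa <- rawToChar(as.raw(c(0xd8, 0xa8)))
enc_before <- Encoding(letters_fa)   # "unknown"

# Declare the bytes to be UTF-8; this changes only the mark, not the bytes.
Encoding(letters_fa) <- "UTF-8"
enc_after <- Encoding(letters_fa)    # "UTF-8"
```

With the script from the question, the equivalent would be `source(script)` followed by `Encoding(letters_fa) <- "UTF-8"`.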