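I can't seem to get to the bottom of this. I want to read a CSV file that contains Arabic characters, but it is not being read correctly.
Here is my sessionInfo: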
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_0.4.3 plyr_1.8.3
loaded via a namespace (and not attached):
[1] magrittr_1.5 R6_2.1.2 assertthat_0.1 parallel_3.2.4 DBI_0.3.1 tools_3.2.4
[7] Rcpp_0.12.4
I tried this:
ar <- read.csv(file.choose(), encoding = "UTF-8")
and this:
ar <- read.csv(file.choose(), encoding = "Windows-1256")
Neither worked for me. I also tried setting the locale to Arabic, with no luck:
Sys.setlocale("LC_ALL","Arabic")
Any suggestions?
Answer 0 (score: 0)
You can read the file with readLines, passing warn = FALSE, and then call read.csv with its text argument set to the result of readLines to parse those lines.
Contents of arabic.csv:
LabelName,Label1,Label2,SpeciesLabel,Group,Subgroup,Species
التسمية 1,Group 1,Subgroup 1,Species 1,1,1,1
التسمية 2,Group 1,Subgroup 1,Species 1,1,1,1
التسمية 3,Group 1,Subgroup 1,Species 1,1,1,1
R code to read the CSV file:
arabic <- readLines("arabic.csv", warn = FALSE, encoding = "UTF-8")
Data <- read.csv(text = arabic)
str(Data)
Output:
'data.frame': 3 obs. of 7 variables:
$ X.U.FEFF.LabelName: Factor w/ 3 levels "التسمية 1","التسمية 2",..: 1 2 3
$ Label1 : Factor w/ 1 level "Group 1": 1 1 1
$ Label2 : Factor w/ 1 level "Subgroup 1": 1 1 1
$ SpeciesLabel : Factor w/ 1 level "Species 1": 1 1 1
$ Group : int 1 1 1
$ Subgroup : int 1 1 1
$ Species : int 1 1 1
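The first column name in the output above, X.U.FEFF.LabelName, suggests that the file starts with a UTF-8 byte-order mark (U+FEFF) that ended up in the header. One possible way to avoid that, which is not part of the original answer, is to strip the mark from the first line before handing the lines to read.csv. The sketch below is self-contained: the temporary file, `path`, and `csv_lines` are illustrative stand-ins for arabic.csv.

```r
# Build a small UTF-8 CSV that starts with a byte-order mark (BOM).
# The file and its contents are hypothetical stand-ins for arabic.csv.
path <- tempfile(fileext = ".csv")
csv_lines <- c(
  "LabelName,Group",
  "\u0627\u0644\u062a\u0633\u0645\u064a\u0629 1,1",
  "\u0627\u0644\u062a\u0633\u0645\u064a\u0629 2,1"
)
con <- file(path, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)  # the three BOM bytes
writeBin(charToRaw(enc2utf8(paste0(paste(csv_lines, collapse = "\n"), "\n"))), con)
close(con)

# Read the raw lines and mark them as UTF-8, as in the answer.
arabic <- readLines(path, warn = FALSE, encoding = "UTF-8")

# Drop a leading U+FEFF from the header line, if one is present.
arabic[1] <- sub("^\ufeff", "", arabic[1])

# Parse the cleaned lines; the first column name no longer carries the BOM.
Data <- read.csv(text = arabic)
names(Data)
```

If no byte-order mark is present, the sub call leaves the header unchanged, so the extra step should be harmless for files saved without one.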