Question

I am trying to load a .csv file in R. I get something like this:

<f3>?<e9><U+00BC>?<e4><f3> .

I have set my default text encoding to UTF-8 in Global Options. Is it possible that R encodes apostrophes in some special way when exporting?

df = read.csv("text.csv", encoding="UTF-8",header=TRUE, stringsAsFactors=FALSE)

####Original CSV (Open in Notepad++)####
I don?ó?é¼?äót want
Jes?ÇÖs in the Family
others that wasn?ó?é¼?äót resolved and told
Am really happy with the this ?ƒÿü,
new ?ó?é¼?ôunbreakable?ó?é¼?¥ 
on the freeway?Çª.

####Load in R####
I don?<f3>?<e9><U+00BC>?<e4><f3>t want
Jes?<c7><d6>s in the Family
others that wasn?<f3>?<e9><U+00BC>?<e4><f3>t resolved and told
Am really happy with the this ?<U+0083><ff><fc>
new ?<f3>?<e9><U+00BC>?<f4>unbreakable?<f3>?<e9><U+00BC>?<U+00A5> 
on the freeway?<U+01EA>.

####What I want####
Because I don't want
Jes's in the Family
others that wasn't resolved and told
Am really happy with the this 
new 'unbreakable'
on the freeway….

Thanks.

Answer 1

You can do it like this.

Here x is your given data as a single string:

x <- "I don?ó?é¼?äót want Jes?ÇÖs in the Family others that wasn?ó?é¼?äót resolved and told Am really happy with the this ?ƒÿü, new ?ó?é¼?ôunbreakable? ?é¼?¥ on the freeway?Çª."

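You can combine gsub with iconv to get almost the desired result: iconv drops every non-ASCII character, and gsub then replaces each remaining run of question marks with an apostrophe. I am not sure how to get the smiley in your output: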

 gsub("\\?+","'",iconv(x, "latin1", "ASCII", sub=""))

Output:

[1] "I don't want
     Jes's in the Family
     others that wasn't resolved and told
     Am really happy with the this ',
     new 'unbreakable'on the freeway'."
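
As a rough sketch, the same two steps could be wrapped in a small helper and applied to a whole character vector, such as a column of the data frame. The name `clean_text` and the sample strings are hypothetical, and the sketch assumes the text is UTF-8 and that every run of question marks stands for an apostrophe, so a genuine "?" in the data would also be turned into an apostrophe.

```r
# Hypothetical helper (not from the original answer): remove non-ASCII
# bytes, then collapse each remaining run of "?" into one apostrophe.
clean_text <- function(x) {
  ascii_only <- iconv(x, from = "UTF-8", to = "ASCII", sub = "")
  gsub("\\?+", "'", ascii_only)
}

# Sample lines from the question, written with \u escapes so the script
# does not depend on the encoding of the source file itself.
sample_text <- c("I don?\u00f3?\u00e9\u00bc?\u00e4\u00f3t want",
                 "Jes?\u00c7\u00d6s in the Family")

cleaned <- clean_text(sample_text)
print(cleaned)
```

With a data frame loaded as in the question, this would be used as `df$text <- clean_text(df$text)`, where `text` is an assumed column name.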

Answer 2

You should try converting from UTF-8 to ASCII:

dt <- iconv(dt, 'utf-8', 'ascii', sub='')

Note that iconv is a base R function, so no extra package such as "tm" needs to be loaded.
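
To illustrate what the `sub` argument does in this conversion, here is a minimal sketch on made-up input (the vector `dt` below is illustrative only, not the asker's data). With `sub = ""` the bytes that cannot be represented in ASCII are dropped; with `sub = "byte"` they are kept as hex codes in angle brackets, which can help when inspecting what is actually in the file.

```r
# Illustrative input, written with a \u escape so the example does not
# depend on the encoding of the script file.
dt <- c("caf\u00e9", "plain ascii")

# Drop every byte that has no ASCII equivalent.
dropped <- iconv(dt, "UTF-8", "ASCII", sub = "")

# Keep unconvertible bytes visible as <xx> hex codes instead.
shown <- iconv(dt, "UTF-8", "ASCII", sub = "byte")

print(dropped)
print(shown)
```

Strings that are already pure ASCII pass through unchanged in both cases.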

Weird characters in R
