Using emoji in R

Date: 2016-02-11 00:04:00

Tags: r emoji

I have a csv file that contains a lot of emoji:

Person, Message,
A, 😉,
A, How are you?,
B, 🙍 Alright!,
A, 💃💃

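How can I read this into R with read.csv() so that the emoji do not turn into black symbols?

(I would like to track how the emoji are used.)

1 answer:

Answer 0 (score: 3)

My console uses a font that accepts these "characters":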

txt <- "Person, Message,
A, 😉,
A, How are you?,
B, 🙍 Alright!,
A, 💃💃"

Encoding(txt)
#[1] "UTF-8"
dput(txt)
#"Person, Message,\nA, \U0001f609,\nA, How are you?,\nB, \U0001f64d Alright!,\nA, \U0001f483\U0001f483"

> tvec <- scan(text=txt, what="")
Read 13 items
> dput(tvec)
c("Person,", "Message,", "A,", "\U0001f609,", "A,", "How", "are", 
"you?,", "B,", "\U0001f64d", "Alright!,", "A,", "\U0001f483\U0001f483"
)

> which(tvec == '\U0001f609,')
[1] 4

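When I used scan with a comma as sep to read that text, the leading space kept the equality test from succeeding, but the test did succeed with the two-character version (space plus emoji):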

> tvec <- scan(text=txt, what="", sep=",")
Read 14 items
> which(tvec == '\U0001f609')
integer(0)
> dput(tvec)
c("Person", " Message", "", "A", " \U0001f609", "", "A", " How are you?", 
"", "B", " \U0001f64d Alright!", "", "A", " \U0001f483\U0001f483"
)
> which(tvec == " \U0001f609")
[1] 5
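
The leading-space mismatch above can also be avoided when the text is read, rather than by matching the space explicitly. The following is a minimal sketch, assuming the same sample text (written with \U escapes so it does not depend on the editor font) and a UTF-8 session. strip.white and trimws() are base R; the variable names are only illustrative.

```r
# Sample text from above, written with \U escapes.
txt <- "Person, Message,\nA, \U0001f609,\nA, How are you?,\nB, \U0001f64d Alright!,\nA, \U0001f483\U0001f483"

# Option 1: let scan() drop leading/trailing blanks from each comma-separated field.
tvec_stripped <- scan(text = txt, what = "", sep = ",",
                      strip.white = TRUE, quiet = TRUE)

# Option 2: read the fields as they are and trim them afterwards.
tvec_trimmed <- trimws(scan(text = txt, what = "", sep = ",", quiet = TRUE))

# With the blanks gone, the one-character comparison has something to match
# (in a UTF-8 session this should point at the fifth field).
hits <- which(tvec_stripped == "\U0001f609")
```

Both options leave the empty fields produced by the trailing commas in place, so the positions stay the same as in the untrimmed vector.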

The output above was produced with Courier New as the console/editor font on a Mac. For an explanation of the Unicode representation, see ?Quotes {base}.
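
As a rough sketch of how this might carry over to the read.csv() part of the question: the example below writes the sample rows to a temporary file (a stand-in for the real csv file), reads them back with encoding = "UTF-8" and strip.white = TRUE, and tallies emoji per person. The helper count_emoji and the code point range 0x1F300 to 0x1FAFF are assumptions made for illustration. That range covers the three emoji in the sample, but not every character that counts as an emoji.

```r
# Write the sample rows to a temporary file (stand-in for the real csv file).
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("Person, Message,",
    "A, \U0001f609,",
    "A, How are you?,",
    "B, \U0001f64d Alright!,",
    "A, \U0001f483\U0001f483"),
  csv_path, useBytes = TRUE
)

# encoding = "UTF-8" marks the strings as UTF-8; strip.white = TRUE removes
# the blank after each comma. The trailing commas produce an empty third
# column, so only the two named columns are kept.
msgs <- read.csv(csv_path, encoding = "UTF-8", strip.white = TRUE,
                 stringsAsFactors = FALSE)[, c("Person", "Message")]

# Hypothetical helper: count code points that fall in an assumed emoji range.
count_emoji <- function(x, lo = 0x1F300, hi = 0x1FAFF) {
  vapply(x, function(s) {
    cp <- utf8ToInt(enc2utf8(s))   # one integer per Unicode code point
    sum(cp >= lo & cp <= hi)
  }, integer(1), USE.NAMES = FALSE)
}

msgs$n_emoji <- count_emoji(msgs$Message)

# Emoji used per person.
usage <- tapply(msgs$n_emoji, msgs$Person, sum)
```

Working on code points avoids depending on whether the console font can draw the characters; the display problem and the counting problem are separate.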