Converting Unicode code point format

Time: 2018-01-04 23:32:10

Tags: r encoding character-encoding

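Let's say I have a string that contains an emoji written as a Unicode code point in U+ notation:

string <- "This is a test. U+1F600"

How can I convert it to the escaped form below, so that it renders as the emoji itself?

string <- "This is a test. \U0001F600"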

1 answer:

Answer 0 (score: 1)

This is a bit of a hack, but it works for your case:

string <- c("This is a test. U+1F600", "Another test")

# change U+XXXXYYYY to \UXXXXYYYY, quote and encode special characters
expr <- gsub("U[+]([0-9A-Fa-f]{1,8})", "\\\\U\\1",
             encodeString(string, quote = '"'))

# evaluate the string as an R expression
vapply(parse(text = expr, keep.source = FALSE), eval, "")
#> [1] "This is a test. \U0001f600" "Another test"
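
If evaluating parsed text feels too hacky, the same substitution can probably be done without parse() and eval(). The following is a minimal sketch, not part of the original answer: uplus_to_char is a hypothetical helper name, and it assumes every U+ token names a valid code point (at most 10FFFF). It reads the hex digits with strtoi() and builds the character with intToUtf8().

```r
# Hypothetical helper (not from the original answer): replace each U+XXXX
# token with the character it names, without parse()/eval().
# Assumes every token is a valid Unicode code point (<= 10FFFF).
uplus_to_char <- function(x) {
  pattern <- "U[+][0-9A-Fa-f]{1,8}"
  m <- gregexpr(pattern, x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(tokens) {
    vapply(tokens, function(tok) {
      # drop the "U+" prefix, read the rest as hexadecimal,
      # then convert the integer code point to a UTF-8 string
      intToUtf8(strtoi(substring(tok, 3), 16L))
    }, "", USE.NAMES = FALSE)
  })
  x
}

string <- c("This is a test. U+1F600", "Another test")
result <- uplus_to_char(string)
```

Strings without a U+ token pass through unchanged, as in the answer above. Whether the emoji displays as a glyph or as an escape when printed still depends on the locale and the console font.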