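I am trying to analyze some downloaded Facebook messages in R. In some messages the apostrophe has been replaced with "â" followed by escaped code points, and I am trying to replace that sequence using str_replace_all().
Take the following data as an example.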
names <- c("Me", "Me", "You", "You", "Me", "You")
content <- c("Iâ<U+0080><U+0099>ve got my party on the 5th", "Hello", "Bears", "Four times four", "what do you want to eat?", "get some music")
date <- c("1/1/2001", "2/1/2001", "3/1/2001", "4/1/2001", "5/1/2001", "6/1/2001")
fbmessagesexample <- data.table(names, date, content)
Then I try to use str_replace_all:
fbmessagesexample[, content := str_replace_all(content, pattern = fixed("â<U\\+0080><U\\+0099>"), replacement=fixed("'"))]
The first row of content is not replaced. Am I doing something wrong?
Answer 0 (score: 1)
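Pass the pattern as a plain character vector instead of wrapping it in fixed(). fixed() makes stringr match the pattern literally, so the "\\+" in the pattern is searched for as a backslash followed by "+", which never occurs in the data. Without fixed(), the pattern is interpreted as a regular expression and "\\+" matches a literal "+".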
The following snippet produces the console output shown below.
library(data.table)
library(tidyverse)
names <- c("Me", "Me", "You", "You", "Me", "You")
content <- c("Iâ<U+0080><U+0099>ve got my party on the 5th", "Hello", "Bears", "Four times four", "what do you want to eat?", "get some music")
date <- c("1/1/2001", "2/1/2001", "3/1/2001", "4/1/2001", "5/1/2001", "6/1/2001")
fbmessagesexample <- data.table(names, date, content)
pattern <- c("â<U\\+0080><U\\+0099>")
fbmessagesexample[, content := str_replace_all(content, pattern, replacement=fixed("'"))]
Console:
> fbmessagesexample
   names     date                      content
1:    Me 1/1/2001 I've got my party on the 5th
2:    Me 2/1/2001                        Hello
3:   You 3/1/2001                        Bears
4:   You 4/1/2001              Four times four
5:    Me 5/1/2001     what do you want to eat?
6:   You 6/1/2001               get some music
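As a side note, keeping fixed() should also work as long as the "+" is left unescaped. The sketch below compares the two variants with the pattern from the question. It assumes, as the example data suggests, that the "<U+0080><U+0099>" parts are literal characters in the text rather than real Unicode escapes; the variable names are only for illustration.

```r
library(stringr)

# Same garbled string as in the question (assumption: "<U+0080>" is literal text).
x <- "Iâ<U+0080><U+0099>ve got my party on the 5th"

# Regex pattern: "+" is a quantifier in a regex, so it has to be escaped.
regex_result <- str_replace_all(x, "â<U\\+0080><U\\+0099>", "'")

# Literal pattern: inside fixed() nothing is special, so "+" stays unescaped.
fixed_result <- str_replace_all(x, fixed("â<U+0080><U+0099>"), "'")

# The question's pattern: fixed() plus escaping looks for a literal backslash,
# so nothing matches and the string comes back unchanged.
unchanged <- str_replace_all(x, fixed("â<U\\+0080><U\\+0099>"), "'")

print(regex_result)
print(fixed_result)
print(identical(unchanged, x))
```

Either variant can be dropped into the data.table call above. Which one to use is mostly a matter of taste: fixed() avoids having to think about regex escaping, while the regex form makes it easier to generalize the pattern later.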