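I have a data_frame (named chat) exported from a WhatsApp chat. To tidy the data, I have to check whether the value in the first column of each row has a specific format (that is, matches the regex string "^(\\d\\d\\.\\d\\d\\.\\d\\d,)"); if it does not, the entire row has to be shifted. I found the following solution online: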
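# loop over the rows whose first column does not match the date pattern
for (row in c(1:nrow(chat))[-grep("^(\\d\\d\\.\\d\\d\\.\\d\\d,)", chat[[1]])]) {
  # index of the first empty (NA) cell in this row
  end <- which(is.na(chat[row, ]))[1]
  # copy the cells before the first empty one into the columns from column 5 onward
  chat[row, 5:(4 + end)] <- chat[row, 1:(end - 1)]
  # blank out the original four columns
  chat[row, 1:4] <- NA
}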
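So this code loops over all rows of the object chat whose value in column [[1]] does not match the pattern "^(\\d\\d\\.\\d\\d\\.\\d\\d,)", determines the first empty column in that row, and then shifts the row's contents to the right so that they start in column 5.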
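To make the row selection step concrete, here is a minimal base R sketch on a toy vector. The names first_col and bad_rows are made up for this illustration. It uses which(!grepl(...)) instead of negative indexing with grep(), because x[-grep(...)] returns an empty vector when nothing matches the pattern, so no row would be processed in that case.

```r
# Toy first column: a matching date, an empty value, a non-date word, another date.
first_col <- c("24.05.16,", NA, "word", "24.05.16,")
pattern <- "^(\\d\\d\\.\\d\\d\\.\\d\\d,)"

# grepl() returns FALSE for NA, so empty rows count as non-matching.
bad_rows <- which(!grepl(pattern, first_col))

# Positions of the rows that would need shifting.
print(bad_rows)
```

With the toy vector above, the selected positions are the empty row and the "word" row.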
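Although this does work, I would like to write it with the map functions from purrr. I am still struggling to get the hang of them, though. The most important questions here are:
How can I use purrr to iterate over only certain elements of a vector? I thought of map_at and map_if, but I couldn't figure it out.
Then, once only the rows to be mutated have been selected, how do I apply a function to an entire row?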
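For reference, this is how I understand map_if and map_at to behave on a plain vector. It is only a toy sketch, assuming purrr is installed. The vector x, the use of toupper as a stand-in transformation, and the names by_predicate and by_position are all made up for illustration. It does not yet address the row-wise case.

```r
library(purrr)

x <- c("24.05.16,", NA, "word", "24.05.16,")
pattern <- "^(\\d\\d\\.\\d\\d\\.\\d\\d,)"

# map_if(): apply toupper() only where the predicate returns TRUE.
# The predicate is evaluated once per element.
by_predicate <- map_if(x, ~ !grepl(pattern, .x), toupper)

# map_at(): apply toupper() only at the given positions.
bad <- which(!grepl(pattern, x))
by_position <- map_at(x, bad, toupper)

# Both calls return a list with one entry per input element;
# untouched elements are passed through unchanged.
str(by_predicate)
str(by_position)
```

In this sketch map_if takes the condition itself, while map_at takes positions computed beforehand. Either way, only the selected elements are passed to the function.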
I would appreciate pointers to any good, in-depth purrr tutorial, but of course answers are welcome too.
To reproduce, here is an MWE:
library(tibble)

# creating an MWE tibble; the second row is empty, while the third row doesn't begin with a date:
chat <- tibble(
  X1 = c("24.05.16,", NA, "word", "24.05.16,", "24.05.16,", "24.05.16,"),
  X2 = c("09:04", NA, "word", "12:48,14:13", "16:16", "word"),
  X3 = c("word", NA, "word", "word", "word", "word"),
  X4 = c("word", NA, "word", "word", "word", "word")
)
# adding five columns to the end of the tibble
chat <- cbind(chat, matrix(nrow = nrow(chat), ncol = 5))
# applying the loop:
for (row in c(1:nrow(chat))[-grep("^(\\d\\d\\.\\d\\d\\.\\d\\d,)", chat[[1]])]) {
  end <- which(is.na(chat[row, ]))[1]
  chat[row, 5:(4 + end)] <- chat[row, 1:(end - 1)]
  chat[row, 1:4] <- NA
}
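One possible shape for a purrr version is sketched below. It is a sketch under assumptions, not a drop-in replacement for the loop. It works on a small stand-in data frame (chat_toy) whose spare columns are already character, and it uses a hypothetical helper shift_right with an assumed offset of four positions, so that column 1 lands in column 5 as in the loop. The result is a character matrix rather than a data frame.

```r
library(purrr)

pattern <- "^(\\d\\d\\.\\d\\d\\.\\d\\d,)"

# Stand-in for the MWE: 3 rows, 4 data columns plus 4 empty spare columns.
chat_toy <- data.frame(
  X1 = c("24.05.16,", NA, "word"),
  X2 = c("09:04", NA, "word"),
  X3 = c("word", NA, "word"),
  X4 = c("word", NA, "word"),
  X5 = NA_character_, X6 = NA_character_,
  X7 = NA_character_, X8 = NA_character_,
  stringsAsFactors = FALSE
)

# Hypothetical helper: shift a vector `by` positions to the right,
# padding with NA on the left and keeping the original length and names.
shift_right <- function(values, by = 4) {
  shifted <- c(rep(NA, by), unname(values))[seq_along(values)]
  stats::setNames(shifted, names(values))
}

# pmap() calls the function once per row; c(...) collects that row's cells.
rows <- pmap(chat_toy, function(...) c(...))

# map_if() rewrites only the rows whose first cell does not match the pattern.
rows <- map_if(rows, ~ !grepl(pattern, .x[[1]]), shift_right)

# Reassemble the rows into a character matrix with the original column names.
shifted <- do.call(rbind, rows)
print(shifted)
```

The idea is to turn each row into one list element first, so that map_if can treat a whole row as a single item. The predicate looks only at the first cell, and the function receives the entire row.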