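I have a data frame with more than 300,000 rows. I want to select the rows whose id matches any of three strings and move those rows to the end of the data frame. The rows that do not match must be kept in the final data frame. In the end the data will be plotted, and the reordered data frame will be written to an xls file.
Here is some sample data: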
mydata <- structure(list(id = structure(c(1L, 4L, 1L, 2L, 3L, 2L, 1L, 6L,
5L, 2L, 1L, 3L, 4L), .Label = c("match1", "match2", "match3",
"match4", "match8", "match9"), class = "factor"), A = structure(c(6L,
5L, 7L, 4L, 10L, 7L, 8L, 8L, 9L, 4L, 3L, 2L, 1L), .Label = c("19",
"2", "20", "3", "4", "6", "8", "H", "j", "T"), class = "factor"),
B = structure(c(2L, 2L, 2L, 3L, 4L, 2L, 4L, 5L, 2L, 3L, 5L,
3L, 1L), .Label = c("beside", "in", "out", "over", "under"
), class = "factor")), .Names = c("id", "A", "B"), row.names = c(NA,
-13L), class = "data.frame")
It looks like this:
id A B
match1 6 in
match4 4 in
match1 8 in
match2 3 out
match3 T over
match2 8 in
match1 H over
match9 H under
match8 j in
match2 3 out
match1 20 under
match3 2 out
match4 19 beside
I want to use this vector of strings to move the matching rows to the end of the data frame.
matchlist = c("match1", "match2", "match3")
The resulting data frame should look like this:
id A B
match4 4 in
match9 H under
match8 j in
match4 19 beside
match1 H over
match1 6 in
match1 8 in
match1 20 under
match2 3 out
match2 8 in
match2 3 out
match3 T over
match3 2 out
I need to keep the non-matching rows. I looked at the post "Select and sort rows of a data frame based on a vector", but that approach drops the non-matching data.
Answer 0 (score: 6)
Try this:
x <- as.character(mydata$id) %in% matchlist
rbind(mydata[!x,], mydata[x,])
# id A B
# 2 match4 4 in
# 8 match9 H under
# 9 match8 j in
# 13 match4 19 beside
# 1 match1 6 in
# 3 match1 8 in
# 4 match2 3 out
# 5 match3 T over
# 6 match2 8 in
# 7 match1 H over
# 10 match2 3 out
# 11 match1 20 under
# 12 match3 2 out
Answer 1 (score: 2)
Here is a solution without grep:
matched <- mydata$id %in% matchlist
mydata2 <- rbind(mydata[!matched,], mydata[matched,])
You can of course sort the matched rows before the rbind, which gives you the matched rows grouped by id, as in your example.
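A minimal sketch of that sorting step, assuming the matched rows should be grouped in the order the ids appear in matchlist. The small data frame below is a made-up stand-in for mydata (same columns, fewer rows), and `bottom` is just an illustrative name.

```r
# Small stand-in for mydata (same columns as the question, fewer rows)
mydata <- data.frame(
  id = c("match1", "match4", "match2", "match3", "match2", "match9", "match1"),
  A  = c("6", "4", "3", "T", "8", "H", "20"),
  B  = c("in", "in", "out", "over", "in", "under", "under"),
  stringsAsFactors = TRUE
)
matchlist <- c("match1", "match2", "match3")

matched <- mydata$id %in% matchlist

# Sort only the matched block, by each id's position in matchlist.
# order() is stable, so rows sharing an id keep their original relative order.
bottom <- mydata[matched, ]
bottom <- bottom[order(match(bottom$id, matchlist)), ]

# Non-matching rows stay on top in their original order.
mydata2 <- rbind(mydata[!matched, ], bottom)
mydata2
```

Sorting with match() rather than order(bottom$id) follows the order of matchlist instead of the factor level order; for the sample data the two happen to coincide.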
Answer 2 (score: 2)
Consider this dplyr solution:
library(dplyr)
mydata %>%
  arrange(id %in% matchlist)
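The same idea can be sketched in base R, in case dplyr is not available: sorting on the logical vector puts FALSE (no match) before TRUE (match). This is only an illustration on a made-up stand-in for mydata; `reordered` is a hypothetical name.

```r
# Small stand-in for mydata (same columns as the question, fewer rows)
mydata <- data.frame(
  id = c("match1", "match4", "match2", "match3", "match2", "match9", "match1"),
  A  = c("6", "4", "3", "T", "8", "H", "20"),
  B  = c("in", "in", "out", "over", "in", "under", "under"),
  stringsAsFactors = TRUE
)
matchlist <- c("match1", "match2", "match3")

# FALSE sorts before TRUE, so non-matching rows come first.
# Ties keep their original order, so neither block is reshuffled.
reordered <- mydata[order(mydata$id %in% matchlist), ]
reordered
```

Because only one sort key is used, the matched rows stay in their original order rather than being grouped by id.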
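Answer 3 (score: 0)
# grepl() returns a logical vector, so negating it is safe even when nothing matches
# (with -grep(), an empty match result would drop every row).
# The anchors ^ and $ keep "match1" from also matching ids such as "match10".
hit = grepl("^(match1|match2|match3)$", mydata$id)
top = mydata[!hit,]
bottom = mydata[hit,]
bottom = bottom[order(bottom$id),]
xls = rbind(top, bottom)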