Move some rows of a data frame to the end based on a vector of matches

Time: 2017-03-28 15:39:47

Tags: r dataframe match

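I have a data frame with more than 300,000 rows. I want to select the rows whose id matches any of three strings and move those rows to the end of the data frame. The non-matching rows must be kept in the final data frame. In the end, the data will be plotted and the reordered data frame will be written to an xls file.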

Here is some sample data:

mydata <- structure(list(id = structure(c(1L, 4L, 1L, 2L, 3L, 2L, 1L, 6L, 
5L, 2L, 1L, 3L, 4L), .Label = c("match1", "match2", "match3", 
"match4", "match8", "match9"), class = "factor"), A = structure(c(6L, 
5L, 7L, 4L, 10L, 7L, 8L, 8L, 9L, 4L, 3L, 2L, 1L), .Label = c("19", 
"2", "20", "3", "4", "6", "8", "H", "j", "T"), class = "factor"), 
    B = structure(c(2L, 2L, 2L, 3L, 4L, 2L, 4L, 5L, 2L, 3L, 5L, 
    3L, 1L), .Label = c("beside", "in", "out", "over", "under"
    ), class = "factor")), .Names = c("id", "A", "B"), row.names = c(NA, 
-13L), class = "data.frame")

It looks like this:

    id  A   B
match1  6   in
match4  4   in
match1  8   in
match2  3   out
match3  T   over
match2  8   in
match1  H   over
match9  H   under
match8  j   in
match2  3   out
match1  20  under
match3  2   out
match4  19  beside

I want to use this vector of strings to move the matching rows to the end of the data frame.

matchlist = c("match1", "match2", "match3")

The resulting data frame should look like this:

id  A   B
match4  4   in
match9  H   under
match8  j   in
match4  19  beside
match1  H   over
match1  6   in
match1  8   in
match1  20  under
match2  3   out
match2  8   in
match2  3   out
match3  T   over
match3  2   out

I need to keep the non-matching rows. I looked at the post "Select and sort rows of a data frame based on a vector", but that approach drops the non-matching data.

4 Answers:

Answer 0: (score: 6)

Try this:

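# TRUE for rows whose id appears in matchlist (mydata is the data frame from the question)
x <- as.character(mydata$id) %in% matchlist
# Non-matching rows first, matching rows last
rbind(mydata[!x,], mydata[x,])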

#        id  A      B
# 2  match4  4     in
# 8  match9  H  under
# 9  match8  j     in
# 13 match4 19 beside
# 1  match1  6     in
# 3  match1  8     in
# 4  match2  3    out
# 5  match3  T   over
# 6  match2  8     in
# 7  match1  H   over
# 10 match2  3    out
# 11 match1 20  under
# 12 match3  2    out
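
The same reordering can also be sketched in a single step with order(), since FALSE sorts before TRUE and order() leaves tied values in their original order. The small data frame `demo` below is a made-up stand-in for mydata, not the real data.

```r
# Hypothetical miniature of mydata, for illustration only
demo <- data.frame(
  id = c("match1", "match4", "match2", "match9"),
  A  = c("6", "4", "3", "H"),
  stringsAsFactors = FALSE
)
matchlist <- c("match1", "match2", "match3")

# TRUE for rows whose id is in matchlist
is_match <- demo$id %in% matchlist

# order() on a logical puts FALSE (non-matching) first and TRUE (matching) last;
# ties keep their original order, so rows are not shuffled within each group
reordered <- demo[order(is_match), ]

# Every row is kept, including the non-matching ones
stopifnot(nrow(reordered) == nrow(demo))
print(reordered)
```

This avoids building two intermediate data frames, which may matter little for 300,000 rows but keeps the code short.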

Answer 1: (score: 2)

Here is a solution without grep:

matched <- mydata$id %in% matchlist
mydata2 <- rbind(mydata[!matched,], mydata[matched,])

You can of course sort the matched rows before the rbind, so that they come out grouped by id as in your example output.
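
One way to sort the matched rows before the rbind, sketched here on a made-up stand-in (`demo`) for mydata, is to order them by their position in matchlist with match(). Sorting by position in matchlist rather than alphabetically is an assumption about the desired order.

```r
# Hypothetical miniature of mydata, for illustration only
demo <- data.frame(
  id = c("match2", "match4", "match1", "match9", "match2", "match3"),
  B  = c("out", "in", "in", "under", "in", "over"),
  stringsAsFactors = FALSE
)
matchlist <- c("match1", "match2", "match3")

matched <- demo$id %in% matchlist
bottom <- demo[matched, ]

# match() gives each id's position in matchlist; order() sorts by that position,
# and rows with the same id keep their original relative order
bottom <- bottom[order(match(bottom$id, matchlist)), ]

# Non-matching rows first (original order), then the sorted matching rows
result <- rbind(demo[!matched, ], bottom)
print(result)
```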

Answer 2: (score: 2)

Consider this dplyr solution:

library(dplyr)

mydata %>%
  arrange(id %in% matchlist)
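
If the matched rows should also be grouped in the order of matchlist, a second sort key can be passed to arrange(). A hedged sketch, assuming dplyr is installed and using a made-up stand-in (`demo`) for mydata:

```r
library(dplyr)

# Hypothetical miniature of mydata, for illustration only
demo <- data.frame(
  id = c("match3", "match4", "match1", "match9", "match1"),
  A  = c("T", "4", "6", "H", "8"),
  stringsAsFactors = FALSE
)
matchlist <- c("match1", "match2", "match3")

result <- demo %>%
  # First key: FALSE (non-matching) before TRUE (matching).
  # Second key: position of the id in matchlist, which groups the matching rows.
  arrange(id %in% matchlist, match(id, matchlist))
print(result)
```

The second key is NA for the non-matching rows, so it only affects the order of the matching block.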

Answer 3: (score: 0)

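# Anchor the pattern so that an id such as "match10" is not treated as a match
hit <- grepl("^(match1|match2|match3)$", mydata$id)
top <- mydata[!hit, ]                 # non-matching rows, original order
bottom <- mydata[hit, ]               # matching rows
bottom <- bottom[order(bottom$id), ]  # sort the matching rows by id
xls <- rbind(top, bottom)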
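
Two pitfalls of grep-based selection are worth checking against the real ids: an unanchored pattern such as "match1" also matches "match10", and negative indexing with an empty grep() result selects nothing rather than everything. The ids below are invented for illustration.

```r
# Hypothetical ids, for illustration only
ids <- c("match1", "match10", "match4")

# Unanchored pattern: "match10" contains "match1", so it matches too
loose <- grepl("match1", ids)
# Anchored pattern: only the exact id matches
strict <- grepl("^match1$", ids)

# grep() returns integer(0) when nothing matches ...
none <- grep("nomatch", ids)
# ... and -integer(0) is still integer(0), so this selects zero elements
kept_with_negative_index <- ids[-none]
# A logical index from grepl() keeps every non-matching element instead
kept_with_logical_index <- ids[!grepl("nomatch", ids)]

print(loose)
print(strict)
print(length(kept_with_negative_index))
print(length(kept_with_logical_index))
```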