Question

Sample data:

  dept    category   quantity  price
  1         r           s        t    
  1         .
  1         .           .
  1
  1
  1
  2
  2
  2
  2         .
  2         .
  2
  3
  3
  3
  3         .           .        .

I want to reduce the number of rows for each 'dept'. The number of rows to keep per dept is as follows:

  if(dept == 1) keep only 2 rows
  if(dept == 2) keep only 4 rows
  if(dept == 3) keep only 3 rows

The final data frame should look like this:

  dept    category   quantity  price
  1         r           s        t    
  1         .
  2
  2
  2
  2         .
  3
  3
  3

How can I do this easily?

Answer 1

Another way, without using any additional packages:

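# Example data frame: 3 departments with 5 rows each
df <- data.frame(dept = rep(1:3, each = 5),
                 data = sample(letters, 15, replace = TRUE))
# Rows to keep for dept 1, 2 and 3, in that order
rows_to_keep <- c(2, 4, 3)
# Split by dept, keep the first n rows of each piece, then recombine;
# head() returns fewer rows instead of NA rows when a group is short
do.call(rbind, lapply(split(df, df$dept), function(subdf)
    head(subdf, rows_to_keep[subdf$dept[1]])))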
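
The positional lookup `rows_to_keep[subdf$dept[1]]` relies on the dept values being exactly 1, 2, 3. If that is not guaranteed, a named lookup vector may be safer. The following is only a sketch of that idea in base R: the example data and the variable names `row_in_group` and `result` are my own assumptions, not part of the original answer.

```r
# Hypothetical example data: 3 departments with 5 rows each
df <- data.frame(dept = rep(1:3, each = 5),
                 data = letters[1:15])

# Named lookup: names are dept values, entries are the rows to keep
rows_to_keep <- c("1" = 2, "2" = 4, "3" = 3)

# Position of each row within its dept group (1, 2, 3, ...)
row_in_group <- ave(seq_len(nrow(df)), df$dept, FUN = seq_along)

# Keep rows whose within-group position does not exceed the dept's limit
result <- df[row_in_group <= rows_to_keep[as.character(df$dept)], ]
```

Because the lookup is by name, the same code should work for dept labels such as 10, 20, 30 or character codes, as long as every dept in the data has an entry in `rows_to_keep`.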

Answer 2

One way to do this is with plyr. First build a matrix that records how many rows you want to keep for each dept value, e.g. a <- cbind(1:3, c(2, 4, 3)). Then use

library(plyr)
# For each dept group, keep the first n rows; n is looked up in column 2 of a
ddply(data, .(dept),
    function(d) head(d, n = a[a[, 1] == d$dept[1], 2]))
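
To see what the function passed to ddply does for a single group, the lookup step can be tried on its own in base R. This is just a rough illustration: the helper `n_for_dept` and the small data frame `d` are made up for the example.

```r
# Lookup matrix from the answer: column 1 = dept value, column 2 = rows to keep
a <- cbind(1:3, c(2, 4, 3))

# Hypothetical helper: number of rows to keep for a single dept value
n_for_dept <- function(dept_value) a[match(dept_value, a[, 1]), 2]

# Hypothetical group: dept 2 with only three rows available
d <- data.frame(dept = rep(2, 3), quantity = c(10, 20, 30))

n <- n_for_dept(d$dept[1])   # limit configured for dept 2
kept <- head(d, n = n)       # head() returns at most n rows, no NA padding
```

Here the limit for dept 2 is larger than the number of rows in the group, so head() simply returns all three rows.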

How do I reduce the number of rows for a column in a data frame?
