How to get the last row with a positive value

Time: 2018-11-08 14:58:19

Tags: r dplyr


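If you look at column a, you will notice that the seventh row is the last row whose value is greater than 0 (positive). How can I get R to find the seventh row?

In other words, I want to filter so that rows 1-7 are kept and every row after row 7 (rows 8-10) is dropped, because row 7 is the last row with a positive value. Here is the tibble to get us started:

# A tibble: 10 x 1
       a
   <dbl>
 1    1.
 2    2.
 3    3.
 4    0.
 5    5.
 6    0.
 7    7.
 8    0.
 9    0.
10    0.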
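
The data shown in the question can be rebuilt with a sketch like the following. The construction is an assumption inferred from the printed values, and `data.frame()` is used so that it runs in base R; `wanted` is a name introduced here only to show the target result.

```r
# Assumed reconstruction of the question's data from the printed values.
# data.frame() keeps the sketch in base R; tibble::tibble(a = ...) would
# give a tibble like the one printed in the question.
df <- data.frame(a = c(1, 2, 3, 0, 5, 0, 7, 0, 0, 0))

# The desired result: rows 1 through 7, kept as a data frame.
# drop = FALSE stops a one-column data.frame from collapsing to a vector.
wanted <- df[1:7, , drop = FALSE]
print(wanted)
```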

1 answer:

Answer 0 (score: 3)

A concise approach is

df[1:max(which(df$a>0)),]
# A tibble: 7 x 1
#       a
#   <dbl>
# 1     1
# 2     2
# 3     3
# 4     0
# 5     5
# 6     0
# 7     7
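
To make the steps of that one-liner explicit, here is a minimal sketch that wraps it in a function. The helper name `keep_through_last_positive` and its `col` argument are hypothetical, and the guard for "no positive value at all" is an addition that the one-liner does not have.

```r
# Hypothetical helper: keep every row up to and including the last positive value.
keep_through_last_positive <- function(df, col = "a") {
  idx <- which(df[[col]] > 0)        # positions of positive values (NA is skipped)
  if (length(idx) == 0) {
    return(df[0, , drop = FALSE])    # no positive value: return zero rows
  }
  df[seq_len(max(idx)), , drop = FALSE]
}

# Data assumed from the values printed in the question.
df <- data.frame(a = c(1, 2, 3, 0, 5, 0, 7, 0, 0, 0))

which(df$a > 0)                      # 1 2 3 5 7
result <- keep_through_last_positive(df)
nrow(result)                         # 7
```

Without the guard, `max()` on an empty index vector returns `-Inf` with a warning, so the subsetting step fails when no value is positive.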

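Some alternatives:

# Only valid when a has no negative values; otherwise the running sum can peak before the last positive row
df[1:which.max(cumsum(df$a)),]
# Returns zero rows when the last row is itself positive, because head(df, 0) is empty
head(df,1-which.max(rev(df$a)>0))
df[rev(cumsum(rev(df$a>0)))>0,]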
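
These alternatives are not interchangeable on every input. The sketch below uses two made-up vectors (not from the original post) to show where the `cumsum()` and `head()` variants differ from the `max(which(...))` reference, while the reversed-`cumsum()` mask agrees with it.

```r
# Made-up inputs for illustration only.
with_negatives <- c(1, -1, 1, 0)   # last positive value is at position 3
ends_positive  <- c(0, 2, 0, 3)    # last positive value is the final element

# Reference: position of the last positive value.
max(which(with_negatives > 0))                    # 3
# cumsum() is 1 0 1 1, so which.max() picks the first peak.
which.max(cumsum(with_negatives))                 # 1

# n passed to head() is 1 - 1 = 0, so head() returns nothing.
n <- 1 - which.max(rev(ends_positive) > 0)
length(head(ends_positive, n))                    # 0

# The rev(cumsum(rev(...))) mask handles both inputs.
which(rev(cumsum(rev(with_negatives > 0))) > 0)   # 1 2 3
which(rev(cumsum(rev(ends_positive > 0))) > 0)    # 1 2 3 4
```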

Let's time all the approaches on a larger data frame and compare them:

library(microbenchmark)

df <- data.frame(a = rbinom(5000, 2, 0.2) - 1)

microbenchmark(
  df[1:max(which(df$a>0)),],
  df[1:which.max(cumsum(df$a)),],
  head(df,1-which.max(rev(df$a)>0)),
  df[rev(cumsum(rev(df$a>0)))>0,],
  df[1:tail(which(sign(df$a) == 1), 1),],
  times = 10000
)
# Unit: microseconds
#                                     expr     min       lq      mean   median       uq       max neval cld
#             df[1:max(which(df$a > 0)), ]  52.817  58.5800 102.80519  62.2160  71.5910  17108.65 10000 a  
#          df[1:which.max(cumsum(df$a)), ]  36.190  40.7620  65.68274  43.0785  49.7835  18827.08 10000 a  
#   head(df, 1 - which.max(rev(df$a) > 0)) 214.812 230.7590 355.37321 249.1085 297.4340  18158.22 10000   c
#     df[rev(cumsum(rev(df$a > 0))) > 0, ] 106.391 114.6345 192.44990 124.4690 141.5650  14473.12 10000  b 
#  df[1:tail(which(sign(df$a) == 1), 1), ] 106.152 116.8985 207.69863 125.6520 150.3425 195384.36 10000  b
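
Before reading too much into these timings, it may be worth checking that the expressions select the same rows on this kind of data. The benchmark vector contains negative values, so the `cumsum()` variant can stop earlier than the others, and subsetting fewer rows could make it look faster. A minimal sketch of such a check follows; the seed is arbitrary and the variable names are introduced here for illustration.

```r
set.seed(1)  # arbitrary seed, chosen only to make the sketch repeatable
a <- rbinom(5000, 2, 0.2) - 1   # same generator as the benchmark: values in -1, 0, 1

last_pos   <- max(which(a > 0))                  # reference: last positive position
cumsum_pos <- which.max(cumsum(a))               # position chosen by the running-sum variant
mask_rows  <- sum(rev(cumsum(rev(a > 0))) > 0)   # number of rows kept by the logical mask

# Compare the three side by side; equal values mean the variants agree.
c(last_pos = last_pos, cumsum_pos = cumsum_pos, mask_rows = mask_rows)
```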