Question

I have a data set:

Simulation  Time    BTC
   1          1     0
   1          2     0
   1          3     0
   2          1    23
   2          2    45
   2          3    55

I want R to delete an entire simulation if it contains a row where Time = 1 and BTC = 0.

The expected output is:

Simulation  Time    BTC 
   2          1      23
   2          2      45
   2          3      55

Any ideas on how to do this?

Answer 1

Use dplyr's group_by together with any:

library(dplyr)

df %>%
  group_by(Simulation) %>%
  mutate(n = any(Time == 1 & BTC == 0)) %>%  # TRUE for every row of a flagged simulation
  filter(!n) %>%
  select(-n)
# A tibble: 3 x 3
# Groups:   Simulation [1]
  Simulation  Time   BTC
       <int> <int> <int>
1          2     1    23
2          2     2    45
3          2     3    55
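
To see what the helper column `n` holds before the filter step, here is a small sketch. It assumes the question's data is stored as a data frame named `df` with integer columns (the question does not show how `df` was created, so the constructor below is a guess), and `flagged` is just an illustrative name.

```r
library(dplyr)

# Recreate the example data from the question (column types are an assumption)
df <- data.frame(
  Simulation = c(1L, 1L, 1L, 2L, 2L, 2L),
  Time       = c(1L, 2L, 3L, 1L, 2L, 3L),
  BTC        = c(0L, 0L, 0L, 23L, 45L, 55L)
)

# any() returns a single TRUE/FALSE per group; mutate() recycles it
# across all rows of that group
flagged <- df %>%
  group_by(Simulation) %>%
  mutate(n = any(Time == 1 & BTC == 0)) %>%
  ungroup()

flagged$n
# TRUE TRUE TRUE FALSE FALSE FALSE
```

Every row of Simulation 1 is flagged because one of its rows meets the condition, which is why `filter(!n)` removes the whole simulation rather than a single row.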

Answer 2

A base R solution using ave:

df[!with(df, ave(Time == 1 & BTC == 0, Simulation, FUN = any)), ]

#  Simulation Time BTC
#4          2    1  23
#5          2    2  45
#6          2    3  55
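
The one-liner above can be unpacked into steps, which may make it easier to follow. This is only a sketch: the data frame is recreated under the assumption that the columns are integers, and `hit`, `drop_row`, and `result` are names made up for illustration.

```r
# Recreate the example data from the question (column types are an assumption)
df <- data.frame(
  Simulation = c(1L, 1L, 1L, 2L, 2L, 2L),
  Time       = c(1L, 2L, 3L, 1L, 2L, 3L),
  BTC        = c(0L, 0L, 0L, 23L, 45L, 55L)
)

# Step 1: row-level condition, TRUE only on the offending row
hit <- df$Time == 1 & df$BTC == 0

# Step 2: ave() applies any() within each Simulation and writes the
# group result back to every row of that group
drop_row <- ave(hit, df$Simulation, FUN = any)

# Step 3: keep only the rows whose simulation was not flagged
result <- df[!drop_row, ]
```

Here `hit` is TRUE only for the first row, while `drop_row` is TRUE for all three rows of Simulation 1, so `result` contains only Simulation 2.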

This can be translated to dplyr (essentially the same as @Wen-Ben's answer):

library(dplyr)

df %>%
 group_by(Simulation) %>%
 filter(!any(Time == 1 & BTC == 0))

For completeness, a data.table translation:

library(data.table)
setDT(df)[, .SD[!any(Time == 1 & BTC == 0)], by = Simulation]

Another base R option, without any grouping:

df[!with(df, Simulation %in% Simulation[Time == 1 & BTC == 0]),]
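
The same idea written in two steps, as a sketch: first collect the IDs of the simulations that violate the condition, then drop every row belonging to them. The data frame is recreated with assumed integer columns, `bad_sims` and `result` are illustrative names, and the sketch assumes there are no NA values in Time or BTC.

```r
# Recreate the example data from the question (column types are an assumption)
df <- data.frame(
  Simulation = c(1L, 1L, 1L, 2L, 2L, 2L),
  Time       = c(1L, 2L, 3L, 1L, 2L, 3L),
  BTC        = c(0L, 0L, 0L, 23L, 45L, 55L)
)

# IDs of simulations that have a row with Time == 1 and BTC == 0
bad_sims <- unique(df$Simulation[df$Time == 1 & df$BTC == 0])

# Drop every row whose Simulation ID is in that set
result <- df[!(df$Simulation %in% bad_sims), ]
```

With the example data, `bad_sims` contains only simulation 1, so the three rows of Simulation 2 remain.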

How to delete an entire simulation based on a condition
