data.table: delete rows based on lagged values by group

Time: 2016-07-12 20:59:03

Tags: r data.table

I have a data.table of the following form:

library(data.table)
DT <- data.table(tag = rep(c("A", "B"), each = 10),
                 value =  c(0, 3, 3, 3, 0, 1, 1, 1, 3, 0,
                            0, 1, 3, 1, 0, 3, 0, 1, 1, 0))
> DT
    tag value
 1:   A     0
 2:   A     3
 3:   A     3
 4:   A     3
 5:   A     0
 6:   A     1
 7:   A     1
 8:   A     1
 9:   A     3
10:   A     0
11:   B     0
12:   B     1
13:   B     3
14:   B     1
15:   B     0
16:   B     3
17:   B     0
18:   B     1
19:   B     1
20:   B     0

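I want to delete every row whose value is 3 when its run of 3s directly follows a 0. That is, I want to delete rows 2, 3, 4 and 16, but need to keep rows 9 and 13.

Is there a way to do this?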

3 answers:

Answer 0: (score: 5)

A possible solution:

DT[, `:=` (threes = rleid(value==3), apz = value == 3 & shift(value) == 0)
   ][, if (all(!apz)) .SD, by = threes
     ][, c('threes','apz') := NULL]

which gives:

    tag value
 1:   A     0
 2:   A     0
 3:   A     1
 4:   A     1
 5:   A     1
 6:   A     3
 7:   A     0
 8:   B     0
 9:   B     1
10:   B     3
11:   B     1
12:   B     0
13:   B     0
14:   B     1
15:   B     1
16:   B     0
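
To see why this works, it may help to build the two helper columns step by step on a copy. The sketch below is only an illustration of the same idea; the names `tmp`, `flagged`, `drop_run` and `res` are made up here and are not part of the answer. Like the answer, it does not group by tag, so a run of 3s at the start of one tag would be compared with the last value of the previous tag.

```r
library(data.table)

DT <- data.table(tag = rep(c("A", "B"), each = 10),
                 value = c(0, 3, 3, 3, 0, 1, 1, 1, 3, 0,
                           0, 1, 3, 1, 0, 3, 0, 1, 1, 0))

# Work on a copy so DT itself is not modified by reference
tmp <- copy(DT)

# threes: run id that changes whenever value switches between 3 and not 3
tmp[, threes := rleid(value == 3)]

# apz: TRUE only for a 3 whose immediate predecessor is 0
tmp[, apz := value == 3 & shift(value) == 0]

# A run is dropped if any of its rows is flagged (illustrative name: drop_run)
flagged <- tmp[, .(drop_run = any(apz, na.rm = TRUE)), by = threes]

# Keep only the rows that belong to unflagged runs
res <- tmp[threes %in% flagged[drop_run == FALSE, threes], .(tag, value)]
print(res)
```

Printing `tmp` before the last step shows which rows each run id covers and where `apz` is TRUE.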

Answer 1: (score: 3)

DT[, prev.value := shift(value), by = tag][
   , prev.value := prev.value[1], by = .(tag, rleid(value))][
   !(value == 3 & prev.value == 0)]
#    tag value prev.value
# 1:   A     0         NA
# 2:   A     0          3
# 3:   A     1          0
# 4:   A     1          0
# 5:   A     1          0
# 6:   A     3          1
# 7:   A     0          3
# 8:   B     0         NA
# 9:   B     1          0
#10:   B     3          1
#11:   B     1          3
#12:   B     0          1
#13:   B     0          3
#14:   B     1          0
#15:   B     1          0
#16:   B     0          1
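
Note that the code above adds prev.value to DT by reference. If that is not wanted, the same steps could be wrapped in a function that works on a copy. This is a minimal sketch, not the answerer's code: the function name `drop_threes_after_zero` is hypothetical, and the `is.na()` guard (which keeps a leading 3 at the start of a tag) is an assumption about the desired behaviour.

```r
library(data.table)

DT <- data.table(tag = rep(c("A", "B"), each = 10),
                 value = c(0, 3, 3, 3, 0, 1, 1, 1, 3, 0,
                           0, 1, 3, 1, 0, 3, 0, 1, 1, 0))

# Hypothetical helper: same logic as the answer, applied to a copy
drop_threes_after_zero <- function(dt) {
  tmp <- copy(dt)
  # Previous value within each tag (NA for the first row of a tag)
  tmp[, prev.value := shift(value), by = tag]
  # Spread the value that preceded each run over the whole run
  tmp[, prev.value := prev.value[1], by = .(tag, rleid(value))]
  # Assumption: rows with no predecessor are always kept
  out <- tmp[is.na(prev.value) | !(value == 3 & prev.value == 0)]
  out[, prev.value := NULL]
  out[]
}

res <- drop_threes_after_zero(DT)
print(res)
```

After the call, DT still has only the columns tag and value.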

Answer 2: (score: 2)

Here is one way (props to @Procrastinatus for the improvement):

DT[setDT(rle(value))[, rep(!( values==3 & shift(values)==0 ), lengths)] ]

To see how it works, try running DT[, setDT(rle(value))], which shows how R summarizes runs of consecutive values, and read ?rle.

My original approach was:

DT[ rleid(value) %in% setDT(rle(value))[ , .I[!( values==3 & shift(values)==0 )]] ]

Try DT[, rleid(value)] and read ?rleid for details. The second approach is worse because the runs are computed twice (once with rleid and once with rle).
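
The run-length idea can also be applied separately within each tag, so that a run at the start of one tag is never compared with the end of the previous tag. The following is a rough sketch under two assumptions that are not stated in the thread: the rows of each tag are contiguous in DT, and a leading run of 3s in a tag should be kept. The names `runs`, `keep`, `prev`, `drop_run` and `res` are illustrative only.

```r
library(data.table)

DT <- data.table(tag = rep(c("A", "B"), each = 10),
                 value = c(0, 3, 3, 3, 0, 1, 1, 1, 3, 0,
                           0, 1, 3, 1, 0, 3, 0, 1, 1, 0))

# What rle() returns for tag A: one entry per run of equal values
r <- rle(DT[tag == "A", value])
runs <- data.table(values = r$values, lengths = r$lengths)
print(runs)

# One keep/drop flag per row, computed run by run within each tag.
# Assumes each tag's rows are contiguous, so V1 lines up with DT's row order.
keep <- DT[, {
  r <- rle(value)
  prev <- shift(r$values)                      # value of the preceding run
  drop_run <- r$values == 3 & !is.na(prev) & prev == 0
  rep(!drop_run, r$lengths)                    # expand back to row level
}, by = tag]$V1

res <- DT[keep]
print(res)
```

On the example data this keeps the same 16 rows as the other answers, since no tag starts with a 3.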