Question

I have the following simple data.table "test". I want to select all rows between rows 3 and 8 where X equals "A":

library(data.table)
set.seed(1)
test <- data.table(X=c(rep("A",5),rep("B",5)),Y=rnorm(10),Z=rnorm(10))

test[3:8 & X == "A"] # gives this undesired output:

1: A -0.6264538  1.5117812
2: A  0.1836433  0.3898432
3: A -0.8356286 -0.6212406
4: A  1.5952808 -2.2146999
5: A  0.3295078  1.1249309
Warning message:
  In 3:8 & X == "A" :
  longer object length is not a multiple of shorter object length

# desired outcome:

3: A -0.8356286 -0.62124058
4: A  1.5952808 -2.21469989
5: A  0.3295078  1.12493092

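Within rows 3:8, I only want to select those where X == "A". How can this be done? Note that test[3:8][X == "A"] does not seem to be an option, because I want to do some calculations on these rows and have the results saved in the original data table.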

Answer 1

Here the length of 3:8 is definitely different from that of the other expression (X == "A"), and in addition we are combining a numeric index with a logical index. Instead, use %in% on the sequence of row numbers to convert the first expression to a logical vector. Then two things happen: 1) the lengths are the same, and 2) the types are the same.
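
The reasoning above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: it assumes the same test table as in the question, uses seq_len(.N) as one way to build the sequence of row numbers, and the names bad, good and res are introduced here purely for illustration.

```r
library(data.table)
set.seed(1)
test <- data.table(X = c(rep("A", 5), rep("B", 5)), Y = rnorm(10), Z = rnorm(10))

# Why the original attempt misbehaves: 3:8 is numeric with length 6. Inside `&`,
# every non-zero number is coerced to TRUE and the vector is recycled to
# length 10 (hence the warning), so the result is simply X == "A".
bad <- suppressWarnings(3:8 & test$X == "A")

# seq_len(.N) %in% 3:8 is a logical vector of length 10 that is TRUE only for
# rows 3 to 8, so it has the same length and type as X == "A".
good <- test[, seq_len(.N) %in% 3:8 & X == "A"]

# Used directly in i, the combined condition keeps the rows in 3:8 with X == "A".
res <- test[seq_len(.N) %in% 3:8 & X == "A"]
```

Because the condition sits in i of the original table, it can also be combined with := in j to modify those rows in place, which is what the question asks for.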

Answer 2

library(data.table)
set.seed(1)
test <- data.table(X=c(rep("A",5),rep("B",5)),Y=rnorm(10),Z=rnorm(10))

test[test[, .I %in% 3:8 & X == "A"], Z := Z+3][]
#>     X          Y           Z
#>  1: A -0.6264538  1.51178117
#>  2: A  0.1836433  0.38984324
#>  3: A -0.8356286  2.37875942
#>  4: A  1.5952808  0.78530011
#>  5: A  0.3295078  4.12493092
#>  6: B -0.8204684 -0.04493361
#>  7: B  0.4874291 -0.01619026
#>  8: B  0.7383247  0.94383621
#>  9: B  0.5757814  0.82122120
#> 10: B -0.3053884  0.59390132

Created on 2019-06-21 by the reprex package (v0.3.0)

Answer 3

If you need to select rows from a specific index range (3:8) and then filter on a variable having a specific value (here X == 'A'), you can try the dplyr package:

library(data.table)
library(dplyr)
set.seed(1)
test <- data.table(X=c(rep("A",5),rep("B",5)),Y=rnorm(10),Z=rnorm(10))

test %>% slice(3:8) %>% filter(X == 'A')

Select using row numbers and a value filter
