I have the following data.table:
dt=structure(list(a = c("10", "10", "20", "30", "10", "25", "10"
), b = c("0.605887455840394", "0", "0.709466017509524", "0",
"0.585528817843856", "-0.109303314681054", "-0.453497173462763"
), c = c("-0.919322002474128", "0", "0.630098551068391", "0",
"-1.81795596770373", "-0.276184105225216", "-0.284159743943371"
), d = c("-0.750531994502331", "0", "1.81731204370422", "0",
"-0.116247806352002", "0.370627864257954", "0.520216457554957"
), e = c("0.298723699267293", "0", "-0.886357521243213", "0",
"0.816899839520583", "-0.331577589942552", "1.12071265166956"
), key = c("A", "A", "B", "B", "C", "C", "C")), .Names = c("a",
"b", "c", "d", "e", "key"), row.names = c(NA, -7L), class = c("data.table",
"data.frame"), sorted = "key")
This gives me a data table that looks like the one shown below.
a b c d e key
1: 10 0.605887455840394 -0.919322002474128 -0.750531994502331 0.298723699267293 A
2: 10 0 0 0 0 A
3: 20 0.709466017509524 0.630098551068391 1.81731204370422 -0.886357521243213 B
4: 30 0 0 0 0 B
5: 10 0.585528817843856 -1.81795596770373 -0.116247806352002 0.816899839520583 C
6: 25 -0.109303314681054 -0.276184105225216 0.370627864257954 -0.331577589942552 C
7: 10 -0.453497173462763 -0.284159743943371 0.520216457554957 1.12071265166956 C
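I want to do a subset operation that removes the rows in which columns b through e are all zero.
I was thinking of something along the lines of
dt[!(all(i[2:4]) == 0)]
but I'm not sure how to actually express this in data.table.
Any help is much appreciated.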
Answer 0 (score: 3)
This seems like a great opportunity to use a not-join. It requires setting the key to the columns you want to subset on:
keys <- names(dt)[2:5]
setkeyv(dt, keys)
dt[!as.list(rep("0", length(keys)))]
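Note that your key columns are currently character, which makes this work better than it would if they were numeric. The not-join matches the literal string "0" in every key column.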
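The not-join above can be tried on a smaller table. This is only a sketch under stated assumptions: the table, its values, and the column name `grp` are made up for illustration and are not the asker's data.

```r
library(data.table)

# Hypothetical toy table; columns are character, as in the question
dt <- data.table(
  a   = c("10", "10", "20", "30"),
  b   = c("0.6", "0", "0.7", "0"),
  c   = c("-0.9", "0", "0.6", "0"),
  grp = c("A", "A", "B", "B")
)

keys <- c("b", "c")   # columns that must all be "0" for a row to be dropped
setkeyv(dt, keys)     # sets the key and reorders dt by reference

# Not-join: keep every row that does NOT match ("0", "0") on the key columns
result <- dt[!as.list(rep("0", length(keys)))]
```

Keep in mind that `setkeyv` sorts `dt` in place, so the original row order is lost. If the order matters, work on a `copy(dt)` instead.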
Answer 1 (score: 2)
1) The first line creates a logical vector that selects the appropriate rows, and the second line selects them:
ok <- dt[, ! apply(.SD == 0, 1, all), .SDcols = 2:5]
dt[ok]
2) We can also express it in terms of any, saving one character plus a space:
ok <- dt[, apply(.SD != 0, 1, any), .SDcols = 2:5]
dt[ok]
3) For a small number of columns, this is even shorter:
dt[ apply(cbind(b, c, d, e) != 0, 1, any) ]
4) And for a small number of columns, this one is shorter and simpler still:
dt[ b != 0 | c != 0 | d != 0 | e != 0 ]
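Option 4 can be generalized to any number of columns by OR-ing the per-column tests together with `Reduce`. The sketch below is not part of the original answer. It assumes a made-up toy table and also assumes you are happy to convert the character columns to numeric first, which avoids relying on string comparison against "0".

```r
library(data.table)

# Hypothetical toy table; columns are character, as in the question
dt <- data.table(
  a   = c("10", "10", "20", "30"),
  b   = c("0.6", "0", "0.7", "0"),
  c   = c("-0.9", "0", "0.6", "0"),
  grp = c("A", "A", "B", "B")
)

cols <- c("b", "c")   # columns to test for zero

# Convert the tested columns to numeric by reference
dt[, (cols) := lapply(.SD, as.numeric), .SDcols = cols]

# One logical vector per column ("is non-zero"), combined with element-wise OR
keep <- dt[, Reduce(`|`, lapply(.SD, function(x) x != 0)), .SDcols = cols]

result <- dt[keep]    # rows with at least one non-zero value in cols
```

Because the comparison is numeric here, values such as "0.0" would also count as zero, which the character comparison would miss.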
Answer 2 (score: 1)
Here is a two-step solution:
dt[
!dt[,
.I[all(sapply(.SD,function(x)x=="0"))]
,by=1:nrow(dt),.SDcols=letters[2:5]]$V1
]
which yields
a b c d e key
1: 10 0.605887455840394 -0.919322002474128 -0.750531994502331 0.298723699267293 A
2: 20 0.709466017509524 0.630098551068391 1.81731204370422 -0.886357521243213 B
3: 10 0.585528817843856 -1.81795596770373 -0.116247806352002 0.816899839520583 C
4: 25 -0.109303314681054 -0.276184105225216 0.370627864257954 -0.331577589942552 C
5: 10 -0.453497173462763 -0.284159743943371 0.520216457554957 1.12071265166956 C
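The inner part selects the row indices `.I` of the rows that satisfy the condition. The outer brackets subset `dt` by excluding those rows with the negation operator `!`.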
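To make the two steps easier to follow, they can be written separately. This is a sketch on a made-up toy table; the names `cols`, `zero_idx`, and `grp` are introduced here for illustration only.

```r
library(data.table)

# Hypothetical toy table; columns are character, as in the question
dt <- data.table(
  a   = c("10", "10", "20", "30"),
  b   = c("0.6", "0", "0.7", "0"),
  c   = c("-0.9", "0", "0.6", "0"),
  grp = c("A", "A", "B", "B")
)

cols <- c("b", "c")

# Step 1: group by row number; .I is returned only for rows where
# every selected column equals "0"
zero_idx <- dt[,
  .I[all(sapply(.SD, function(x) x == "0"))],
  by = seq_len(nrow(dt)), .SDcols = cols
]$V1

# Step 2: drop those rows; a leading ! in i means "all rows except these"
result <- dt[!zero_idx]
```

Grouping by every row is simple to read but can be slow on large tables, so the vectorized approaches in the other answers are likely to scale better.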