I want to delete the {{1}} rows where value a = b, but I don't know how to do it.
Sample data:
df <- data.frame(day = c(1, 1, 2, 2, 3, 3), var = c("a", "b", "a", "b", "a", "b"), value = c(1, 2, 3, 3, 2, 1))
Output:
  day var value
1   1   a     1
2   1   b     2
3   2   a     3
4   2   b     3
5   3   a     2
6   3   b     1
Desired output:
  day var value
1   1   a     1
2   1   b     2
Answer 0 (score: 3)
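Here is a data.table solution that avoids reshaping from long to wide:
library(data.table)
dt <- data.table(df)
# keep a day's rows only when its a value is below its b value
dt[, if (value[var == 'a'] < value[var == 'b']) .SD, by = day]
Edit: I now realize that your desired output doesn't match your initial inequality, so I adjusted the inequality to match :)
EDIT2: If you don't want to do this in data.table, here is a dplyr solution:
library(dplyr)
df %>% group_by(day) %>% filter(value[var == 'a'] < value[var == 'b'])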
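For readers without either package, the same group-wise filter can be sketched in base R. This is only an illustration, not part of the original answer. It assumes each day has exactly one a row and one b row, and it keeps the days where a is strictly less than b, which is what the desired output in the question shows. The names keep_day and result are made up for this example.

```r
df <- data.frame(day = c(1, 1, 2, 2, 3, 3),
                 var = c("a", "b", "a", "b", "a", "b"),
                 value = c(1, 2, 3, 3, 2, 1))

# One logical per day: TRUE when that day's a value is below its b value.
# Assumption: exactly one a row and one b row per day.
keep_day <- sapply(split(df, df$day), function(g) {
  g$value[g$var == "a"] < g$value[g$var == "b"]
})

# Keep every row that belongs to a day flagged TRUE
result <- df[df$day %in% as.numeric(names(keep_day)[keep_day]), ]
print(result)
```

The decision is made once per group and then applied to all rows of that group, which is the same idea as returning .SD conditionally in data.table.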
EDIT3: If you want to put NAs in instead:
df %>% group_by(day) %>% mutate(value = if(value[var == 'a'] >= value[var == 'b']) as.numeric(NA) else value)
EDIT4: Note that this last solution seems to expose a bug where NAs are handled strangely; see: Why is dplyr removing values not met by condition?
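If the NA handling in dplyr is a concern, the masking step can also be sketched in base R. This is a hedged alternative, not the original author's code. It assumes exactly one a row and one b row per day, and a_vals, b_vals and masked are names invented for this example.

```r
df <- data.frame(day = c(1, 1, 2, 2, 3, 3),
                 var = c("a", "b", "a", "b", "a", "b"),
                 value = c(1, 2, 3, 3, 2, 1))

is_a <- df$var == "a"
is_b <- df$var == "b"

# Look up, for every row, the a value and the b value of that row's day.
# Assumption: each day has exactly one a row and one b row.
a_vals <- df$value[is_a][match(df$day, df$day[is_a])]
b_vals <- df$value[is_b][match(df$day, df$day[is_b])]

# Replace value with NA on every row of a day where a >= b
masked <- df
masked$value[a_vals >= b_vals] <- NA
print(masked)
```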
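Answer 1 (score: 3)
Shape's answer is the right way to solve the problem.
To extend Shape's answer, I'd like to contribute a more general solution.
The eav function from the dwtools package is designed to handle the Entity-attribute-value data structure by making it easier to compute measures. The function definition is given below, so you don't need the dwtools package.
It computes the rm variable for each group. The formula can be written exactly as you would pass it to the j arg of [.data.table after reshaping your EAV data to wide form, and before converting it back to EAV.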
library(data.table)
# Aggregate, cast EAV to wide, evaluate j per group, then melt back to EAV
eav = function(x, j,
               id.vars = key(x)[-length(key(x))],
               variable.name = key(x)[length(key(x))],
               measure.vars = names(x)[!(names(x) %in% key(x))],
               fun.aggregate = sum, shift.on = character(), wide = FALSE){
  stopifnot(is.data.table(x))
  r <- x[, lapply(.SD, fun.aggregate), c(id.vars, variable.name), .SDcols = measure.vars
         ][, dcast(.SD, formula = as.formula(paste(paste(id.vars, collapse = ' + '), paste(variable.name, collapse = ' + '), sep = ' ~ ')), fun.aggregate = fun.aggregate, value.var = measure.vars)
         ][, eval(j), by = eval(id.vars[!(id.vars %in% shift.on)])
         ]
  if(wide) r[] else melt(r, id.vars = id.vars, variable.name = variable.name, value.name = measure.vars)[, .SD, keyby = c(id.vars, variable.name)]
}
df = data.frame(day = c(1, 1, 2, 2, 3, 3), var = c("a", "b", "a", "b", "a", "b"), value = c(1, 2, 3, 3, 2, 1))
dt = as.data.table(df)
setkey(dt, day, var)
r = eav(dt, quote(rm := as.numeric(a >= b)))
print(r)
# day var value
#1: 1 a 1
#2: 1 b 2
#3: 1 rm 0
#4: 2 a 3
#5: 2 b 3
#6: 2 rm 1
#7: 3 a 2
#8: 3 b 1
#9: 3 rm 1
r[, if(value[var=="rm"] == 0) .SD, by = day
][var!="rm"] # you need to exclude temporary variable
# day var value
#1: 1 a 1
#2: 1 b 2
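This solution will probably also be slower than Shape's (you can populate a large data sample to measure it), but it may be easier to use for complex calculations over many measures in EAV, and it supports shifting; see {{3 }}.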
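A rough way to measure this is to build a larger sample with the same shape and time each approach with system.time. The sketch below is an assumption about how such a test could look, not something from the original answer. It times only the conditional .SD approach, keeping the days where a is below b; the eav call could be wrapped the same way once the function is defined and the table is keyed. The sizes and the names n_days, big, res and timing are arbitrary.

```r
library(data.table)

set.seed(1)
n_days <- 1e4  # arbitrary size; increase it for a more demanding test

# Same layout as the question's data: one a row and one b row per day
big <- data.table(day   = rep(seq_len(n_days), each = 2),
                  var   = rep(c("a", "b"), times = n_days),
                  value = sample(1:10, 2 * n_days, replace = TRUE))

# Time the by-group conditional filter
timing <- system.time(
  res <- big[, if (value[var == "a"] < value[var == "b"]) .SD, by = day]
)
print(timing)
```

Timings depend on the machine and the data.table version, so compare the two approaches on the same data in the same session.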