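In the matrix below, I want to output the difference in Change values according to the position. Example: for ID1, take the mean Change where Position1 = 1 and subtract the mean Change where Position1 = 0.
Output for ID1, Position1:
Position1 = average(0.59, -0.04, 0.37) - average(-0.18)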
IDs Change Position1 Position2
ID1 0.5941262037 1 1
ID1 -0.0418420656 1 1
ID1 0.3766006166 1 1
ID1 -0.1842130385 0 0
ID2 -1.3847740208 0 0
ID2 -1.2668185169 0 1
ID2 1.8034297622 1 1
...
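To make the target quantity concrete, it can be written as a formula. The symbols here are notation introduced only for this illustration: c_r is the Change value in row r, and S^1 and S^0 are the sets of rows of a given ID where the given Position column equals 1 and 0 respectively. The second line works through ID1 and Position1 using the rows shown above.

```latex
D_{\mathrm{ID},\,p}
  = \frac{1}{|S^{1}_{\mathrm{ID},p}|} \sum_{r \in S^{1}_{\mathrm{ID},p}} c_r
  - \frac{1}{|S^{0}_{\mathrm{ID},p}|} \sum_{r \in S^{0}_{\mathrm{ID},p}} c_r

D_{\mathrm{ID1},\,\mathrm{Position1}}
  = \frac{0.5941262037 - 0.0418420656 + 0.3766006166}{3} - (-0.1842130385)
  \approx 0.3096283 + 0.1842130
  \approx 0.4938413
```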
Edit:
My output should be one value per ID for each position.
ID1 - Position1:
ID2 - Position2:
Answer 0 (score: 2)
You can use dplyr together with tidyr to handle multiple Position columns
library(dplyr)
library(tidyr)
# Val == 1 / Val == 0 replaces !!Val / !Val: current dplyr reads !! as the unquote operator
dat %>%
  gather(Var, Val, starts_with("Position")) %>%
  group_by(IDs, Var) %>%
  summarise(Mean=mean(Change[Val == 1], na.rm=TRUE)-mean(Change[Val == 0], na.rm=TRUE)) %>%
  spread(Var, Mean)
which gives
# IDs Position1 Position2
#1 ID1 0.4938413 0.4938413
#2 ID2 3.1292260 1.6530796
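Since tidyr documents `gather()` and `spread()` as superseded by `pivot_longer()` and `pivot_wider()`, the same pipeline could be sketched with the newer verbs. This is a minimal sketch, assuming tidyr 1.0 or later and dplyr 1.0 or later; the name `res` is introduced only for this illustration, and the 0/1 comparisons are written as `Val == 1` and `Val == 0`.

```r
library(dplyr)
library(tidyr)

# Same example data as used in this answer
dat <- data.frame(
  IDs = c("ID1", "ID1", "ID1", "ID1", "ID2", "ID2", "ID2"),
  Change = c(0.5941262037, -0.0418420656, 0.3766006166, -0.1842130385,
             -1.3847740208, -1.2668185169, 1.8034297622),
  Position1 = c(1L, 1L, 1L, 0L, 0L, 0L, 1L),
  Position2 = c(1L, 1L, 1L, 0L, 0L, 1L, 1L)
)

res <- dat %>%
  # long format: one row per (original row, Position column)
  pivot_longer(starts_with("Position"), names_to = "Var", values_to = "Val") %>%
  group_by(IDs, Var) %>%
  # mean Change where the indicator is 1 minus mean Change where it is 0
  summarise(Mean = mean(Change[Val == 1], na.rm = TRUE) -
                   mean(Change[Val == 0], na.rm = TRUE),
            .groups = "drop") %>%
  # back to wide format: one column per Position
  pivot_wider(names_from = Var, values_from = Mean)

res
```

If an ID has no rows with a given indicator value, the corresponding mean is taken over an empty vector and the result for that cell is NaN.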
Alternatively, you can use data.table with reshape2
library(reshape2)
library(data.table)
DT <- data.table(melt(dat, id.vars=c("IDs", "Change")), key=c("IDs", "variable"))
dcast(DT[, list(mean(Change[!!value], na.rm=TRUE)-mean(Change[!value], na.rm=TRUE)),
by=list(IDs, variable)],
IDs~variable, value.var="V1")
# IDs Position1 Position2
#1 ID1 0.4938413 0.4938413
#2 ID2 3.1292260 1.6530796
Or using base R
do.call(`rbind`,
lapply(split(dat[,-1], dat$IDs),
function(x) {
apply(x[,-1], 2, function(y) mean(x[,1][!!y], na.rm=TRUE)-
mean(x[,1][!y], na.rm=TRUE))}))
# Position1 Position2
#ID1 0.4938413 0.4938413
#ID2 3.1292260 1.6530796
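Another base R route, offered only as a sketch, is to let `tapply()` build a table of group means per ID and indicator value and then subtract its two columns. The names `pos_cols`, `m` and `res` are introduced for this illustration, and the code assumes every Position column contains only 0 and 1, with both values present somewhere in the data.

```r
# Same example data as used in this answer
dat <- data.frame(
  IDs = c("ID1", "ID1", "ID1", "ID1", "ID2", "ID2", "ID2"),
  Change = c(0.5941262037, -0.0418420656, 0.3766006166, -0.1842130385,
             -1.3847740208, -1.2668185169, 1.8034297622),
  Position1 = c(1L, 1L, 1L, 0L, 0L, 0L, 1L),
  Position2 = c(1L, 1L, 1L, 0L, 0L, 1L, 1L)
)

# All indicator columns whose name starts with "Position"
pos_cols <- grep("^Position", names(dat), value = TRUE)

res <- sapply(pos_cols, function(p) {
  # m has one row per ID and one column per indicator value ("0" and "1")
  m <- tapply(dat$Change, list(dat$IDs, dat[[p]]), mean, na.rm = TRUE)
  m[, "1"] - m[, "0"]
})

res
```

The result is a matrix with one row per ID and one column per Position column. A cell is NA when an ID has no rows for one of the two indicator values.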
Data used above:
dat <- structure(list(IDs = c("ID1", "ID1", "ID1", "ID1", "ID2", "ID2",
"ID2"), Change = c(0.5941262037, -0.0418420656, 0.3766006166,
-0.1842130385, -1.3847740208, -1.2668185169, 1.8034297622), Position1 = c(1L,
1L, 1L, 0L, 0L, 0L, 1L), Position2 = c(1L, 1L, 1L, 0L, 0L, 1L,
1L)), .Names = c("IDs", "Change", "Position1", "Position2"), class = "data.frame", row.names = c(NA,
-7L))
Answer 1 (score: 1)
Splitting the data frame by IDs and performing the desired operation on each ID seems to be the most direct approach.
library(plyr)
X <- data.frame(IDs = c(1,1,1,1,2,2,2), change = 1:7, Position1 = c(1,1,1,0,0,0,1))
Y <- ddply(X, "IDs", function(df) {
  # mean change where Position1 == 1 minus mean change where Position1 == 0
  mean(subset(df, Position1 == 1)$change) -
    mean(subset(df, Position1 == 0)$change)
})
Y
# IDs V1
# 1 1 -2.0
# 2 2 1.5
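The same split-and-apply idea can also be sketched without plyr, using base R's `split()` and `sapply()`. This is an illustrative alternative rather than part of the original answer; `diffs` is a name introduced here, and `Y` is rebuilt as a plain data frame with the same two columns.

```r
# Same toy data as above
X <- data.frame(IDs = c(1,1,1,1,2,2,2), change = 1:7, Position1 = c(1,1,1,0,0,0,1))

# One data frame per ID, then one difference of means per piece
diffs <- sapply(split(X, X$IDs), function(df) {
  mean(df$change[df$Position1 == 1]) - mean(df$change[df$Position1 == 0])
})

# Collect the named vector into a two-column data frame
Y <- data.frame(IDs = names(diffs), V1 = unname(diffs))
Y
```

Here `IDs` comes back as character because it is taken from the names of the split list; convert it with `as.numeric()` if the original type matters.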