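I have this dataset. I want to find the mean Viability for any rows where wt.Df and founder are the same, and then replace those values in the dataset.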
Store;
founder wt.Df Replicate Block Food_Source Viability
1 A4 5905 1 1 Regular 0.9523810
2 A4 24834 1 1 Regular 0.8095238
3 A4 24834 2 1 Regular 0.8571429
4 A4 27861 1 1 Regular 0.8095238
5 A4 27861 2 1 Regular 0.9230769
12 A3 5905 1 1 Regular 0.9473684
13 A3 24834 1 1 Regular 0.9047619
14 A3 27861 1 1 Regular 0.8571429
I know this code finds the mean for one set of matching rows, but I don't know how to replace the values in the dataset:
tmp<- with(Store, mean(Viability[wt.Df == 27861 & founder == "A4"]))
Desired output:
founder wt.Df Replicate Block Food_Source Viability
1 A4 5905 1 1 Regular 0.9523810
2 A4 24834 1 1 Regular 0.8333333
4 A4 27861 1 1 Regular 0.8663004
12 A3 5905 1 1 Regular 0.9473684
13 A3 24834 1 1 Regular 0.9047619
14 A3 27861 1 1 Regular 0.8571429
Answer 0: (score: 2)
A few good options come to mind. First, plain old aggregate from base R:
aggregate( Viability ~ wt.Df + founder , FUN = mean , data = Store )
# wt.Df founder Viability
#1 5905 A3 0.9473684
#2 24834 A3 0.9047619
#3 27861 A3 0.8571429
#4 5905 A4 0.9523810
#5 24834 A4 0.8333333
#6 27861 A4 0.8663003
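Note that aggregate only returns the grouping columns and the mean. If you want to overwrite Viability in the original data frame and keep the other columns, here is a minimal base-R sketch using ave and duplicated. It assumes the data frame is called Store as in the question; the data frame is rebuilt here from the values in the post, and `result` is just an illustrative name.

```r
# Rebuild the question's data (values copied from the post).
Store <- data.frame(
  founder     = c("A4", "A4", "A4", "A4", "A4", "A3", "A3", "A3"),
  wt.Df       = c(5905, 24834, 24834, 27861, 27861, 5905, 24834, 27861),
  Replicate   = c(1, 1, 2, 1, 2, 1, 1, 1),
  Block       = 1,
  Food_Source = "Regular",
  Viability   = c(0.9523810, 0.8095238, 0.8571429, 0.8095238,
                  0.9230769, 0.9473684, 0.9047619, 0.8571429)
)

# ave() returns a vector as long as its input, with every element
# replaced by the mean of its wt.Df/founder group (FUN defaults to mean).
Store$Viability <- ave(Store$Viability, Store$wt.Df, Store$founder)

# Keep only the first row of each wt.Df/founder combination.
result <- Store[!duplicated(Store[c("wt.Df", "founder")]), ]
```

Because ave keeps the original row order, the first row of each group (Replicate 1) is the one that survives, which matches the desired output in the question.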
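Another good option is to use the data.table package and aggregate by the grouping variables. For the remaining columns I also take the first record of each group, e.g. Block = Block[1], which is what you have in your desired result...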
require( data.table )
Store <- data.table( Store )
Store[ , list( Viability = mean(Viability) , Block = Block[1], Replicate = Replicate[1] ) , by = list( wt.Df , founder ) ]
# wt.Df founder Viability Block Replicate
#1: 5905 A4 0.9523810 1 1
#2: 24834 A4 0.8333333 1 1
#3: 27861 A4 0.8663003 1 1
#4: 5905 A3 0.9473684 1 1
#5: 24834 A3 0.9047619 1 1
#6: 27861 A3 0.8571429 1 1
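If you would rather not list every remaining column by hand, one possible variation is to overwrite Viability by reference with := and then keep the first row per group with unique. This is only a sketch under the assumption that the data looks like the table in the question; the table is rebuilt here from the posted values and `result` is an illustrative name.

```r
library(data.table)

# Rebuild the question's data (values copied from the post).
Store <- data.table(
  founder     = c("A4", "A4", "A4", "A4", "A4", "A3", "A3", "A3"),
  wt.Df       = c(5905, 24834, 24834, 27861, 27861, 5905, 24834, 27861),
  Replicate   = c(1, 1, 2, 1, 2, 1, 1, 1),
  Block       = 1,
  Food_Source = "Regular",
  Viability   = c(0.9523810, 0.8095238, 0.8571429, 0.8095238,
                  0.9230769, 0.9473684, 0.9047619, 0.8571429)
)

# := modifies the Viability column in place; the group mean is
# recycled to every row of its wt.Df/founder group.
Store[, Viability := mean(Viability), by = .(wt.Df, founder)]

# unique() with by= keeps the first row of each wt.Df/founder group
# and carries all other columns along.
result <- unique(Store, by = c("wt.Df", "founder"))
```

The trade-off is that Store itself is modified, so copy it first (for example with copy(Store)) if you still need the per-replicate values.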
Answer 1: (score: 0)
I would try generating a summary dataset and then merging the two.
library(gdata)
library(plyr)
# summarise (not summary) computes one mean per group; the column is wt.Df, not wt.DF
avg_summary <- ddply(Store, .(wt.Df, founder), summarise, viability1 = mean(Viability))
Store <- join(Store, avg_summary)
# delete original Viability column
Store$Viability <- NULL
# rename viability1 -> Viability
Store <- rename.vars(Store, 'viability1', 'Viability')
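Joining the summary back onto Store keeps every original row, so both replicates of a group remain, each carrying the group mean. To get one row per group as in the desired output, the duplicates still have to be dropped. The same summarise-then-merge idea can be sketched in base R without plyr or gdata; the data frame is rebuilt from the values in the question, and `merged` and `result` are illustrative names.

```r
# Rebuild the question's data (values copied from the post).
Store <- data.frame(
  founder     = c("A4", "A4", "A4", "A4", "A4", "A3", "A3", "A3"),
  wt.Df       = c(5905, 24834, 24834, 27861, 27861, 5905, 24834, 27861),
  Replicate   = c(1, 1, 2, 1, 2, 1, 1, 1),
  Block       = 1,
  Food_Source = "Regular",
  Viability   = c(0.9523810, 0.8095238, 0.8571429, 0.8095238,
                  0.9230769, 0.9473684, 0.9047619, 0.8571429)
)

# Step 1: one mean per wt.Df/founder group, stored under a temporary name.
avg_summary <- aggregate(Viability ~ wt.Df + founder, data = Store, FUN = mean)
names(avg_summary)[names(avg_summary) == "Viability"] <- "viability1"

# Step 2: merge the means back onto the original rows.
merged <- merge(Store, avg_summary, by = c("wt.Df", "founder"))

# Step 3: overwrite the original column and drop the temporary one.
merged$Viability  <- merged$viability1
merged$viability1 <- NULL

# Step 4: keep one row per wt.Df/founder group.
result <- merged[!duplicated(merged[c("wt.Df", "founder")]), ]
```

merge sorts the result by the key columns, so the row order differs from the original data.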