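Mean of a frequency table using apply

Time: 2016-01-12 20:58:22

Tags: r algorithm

If we want to compute the mean frequency of events in the first dataset, we can use the following function.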

ID <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4","R3","R3","R3","R3","R2","R2","R2","R5","R6")
event <- c("a","b","b","M","s","f","y","b","a","a","a","a","s","c","c","b","m","a")
df <- data.frame(ID, event)

Function:

apply(table(df$ID,df$event),2,function(x) mean(x[x>0]))
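To make explicit what this one-liner computes, here is a minimal sketch that splits it into steps. The names `counts` and `mean_freq` are illustrative only and are not part of the original code.

```r
ID <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4","R3","R3","R3","R3","R2","R2","R2","R5","R6")
event <- c("a","b","b","M","s","f","y","b","a","a","a","a","s","c","c","b","m","a")
df <- data.frame(ID, event)

# Cross-tabulate: one row per ID, one column per event.
counts <- table(df$ID, df$event)

# For each event (column), average only the non-zero counts,
# i.e. over the IDs in which that event occurs at least once.
mean_freq <- apply(counts, 2, function(x) mean(x[x > 0]))

# Event "a" occurs once in R1, three times in R3, once in R4 and once in R6,
# so its mean frequency is (1 + 3 + 1 + 1) / 4 = 1.5.
mean_freq[["a"]]
```

Filtering with `x[x > 0]` means the denominator is the number of IDs that contain the event, not the total number of IDs.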

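I would like to know how to modify this code so that it computes the mean frequency of events while taking the levels of type into account. That is, I want to compute apply(table(df$ID,df$event),2,function(x) mean(x[x>0])) separately for each category of type. For example, within level aaa the mean frequency of a is 2/2. Within level cc, the mean frequency of b is 3/1.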

ID <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4","R3","R3","R3","R3","R2","R2","R2","R5","R6")
event <- c("a","b","b","M","s","f","y","b","a","a","a","a","s","c","c","b","m","a")
type <- c("ee","cc","cc","mm","mm","ff","yy","bb","cc","mm","ff","aaa","cc","ccc","ff","cc","mmm","aaa")

df <- data.frame(ID, event, type)

2 answers:

Answer 0: (score: 0)

I think you want the mean over the unique combinations of ID and type. If so, this is what you are after:

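# table(df) is a 3-way ID x event x type array; flatten it to one column per event.
# (Named `counts` rather than `table` to avoid shadowing base::table.)
counts <- apply(table(df), 2, function(x) x)
apply(counts, 2, function(x) mean(x[x > 0]))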

Thomas
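For comparison, the same quantity (the mean of the non-zero counts over ID and type combinations, per event) can be sketched in long format. This is an alternative formulation and not the answerer's code; `long`, `pos` and `res` are illustrative names.

```r
ID <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4","R3","R3","R3","R3","R2","R2","R2","R5","R6")
event <- c("a","b","b","M","s","f","y","b","a","a","a","a","s","c","c","b","m","a")
type <- c("ee","cc","cc","mm","mm","ff","yy","bb","cc","mm","ff","aaa","cc","ccc","ff","cc","mmm","aaa")
df <- data.frame(ID, event, type)

# One row per (ID, event, type) combination, with its count in column Freq.
long <- as.data.frame(table(df))

# Keep only the combinations that actually occur.
pos <- long[long$Freq > 0, ]

# Average the non-zero counts per event.
res <- aggregate(Freq ~ event, data = pos, FUN = mean)
res
```

With this data, only event b has a mean above 1, because R2 has three b rows of type cc and R4 has one b row of type bb.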

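Answer 1: (score: 0)

If all you want as the result is the "average count" per event within a 2-way table, you need to exclude the ID column and then apply along the rows rather than the columns: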

> table(df[-1])
     type
event aaa bb cc ccc ee ff mm mmm yy
    a   2  0  1   0  1  1  1   0  0
    b   0  1  3   0  0  0  0   0  0
    c   0  0  0   1  0  1  0   0  0
    f   0  0  0   0  0  1  0   0  0
    m   0  0  0   0  0  0  0   1  0
    M   0  0  0   0  0  0  1   0  0
    s   0  0  1   0  0  0  1   0  0
    y   0  0  0   0  0  0  0   0  1

> apply(table(df[-1]),1,function(x) mean(x[x>0]))
  a   b   c   f   m   M   s   y 
1.2 2.0 1.0 1.0 1.0 1.0 1.0 1.0 
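The choice of rows over columns follows from where the events sit in the table. A small sketch of that point, using the illustrative names `tab`, `by_row` and `by_col`:

```r
event <- c("a","b","b","M","s","f","y","b","a","a","a","a","s","c","c","b","m","a")
type <- c("ee","cc","cc","mm","mm","ff","yy","bb","cc","mm","ff","aaa","cc","ccc","ff","cc","mmm","aaa")

# Events in rows, types in columns, as in table(df[-1]).
tab <- table(event, type)

# MARGIN = 1 walks over rows, so each event is averaged over the types it occurs in.
by_row <- apply(tab, 1, function(x) mean(x[x > 0]))

# Transposing puts events in columns, so MARGIN = 2 then yields the same numbers.
by_col <- apply(t(tab), 2, function(x) mean(x[x > 0]))

by_row
by_col
```

Event a occurs in five types with counts 2, 1, 1, 1 and 1, which gives the 1.2 shown above.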

If you really want what you described, use a margin index of 3 instead:
> apply(table(df),3,function(x) mean(x[x>0]))
     aaa       bb       cc      ccc       ee       ff       mm      mmm       yy 
1.000000 1.000000 1.666667 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 

(I am not sure how this procedure produces a useful result.)
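One possible reading of the question is a mean per event within each type, which matches the 2/2 and 3/1 examples given there. Under that assumption, a minimal sketch keeps both the event and type dimensions when applying over the 3-way table; `counts` and `by_event_type` are illustrative names.

```r
ID <- c("R1","R2","R2","R3","R3","R4","R4","R4","R4","R3","R3","R3","R3","R2","R2","R2","R5","R6")
event <- c("a","b","b","M","s","f","y","b","a","a","a","a","s","c","c","b","m","a")
type <- c("ee","cc","cc","mm","mm","ff","yy","bb","cc","mm","ff","aaa","cc","ccc","ff","cc","mmm","aaa")
df <- data.frame(ID, event, type)

# 3-way array with dimensions ID x event x type.
counts <- table(df)

# MARGIN = c(2, 3) keeps event and type, so each cell is the mean of the
# non-zero per-ID counts for that event within that type.
# Cells with no occurrences come out as NaN, since mean() of an empty vector is NaN.
by_event_type <- apply(counts, c(2, 3), function(x) mean(x[x > 0]))

by_event_type["a", "aaa"]  # R3 and R6 each have one "a": 2/2 = 1
by_event_type["b", "cc"]   # only R2, with three "b": 3/1 = 3
```

The result is an event by type matrix, so a single type can be read off as a column, for example `by_event_type[, "cc"]`.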