I have a data frame of word frequencies, for example:
df <- data.frame(
Predictor = c("for","of","as","for","for","as","of","of","as","for"),
ToPredict = c("sure","course","much","him","keeps","far","them","this","an","petes"),
Freq = c(53,32,21,17,13,5,3,2,2,1))
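I want to compute a new column giving, for each ToPredict, its share of the total frequency of its Predictor.
So, in the example above, the values of this new column would be: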
df$Props = c(0.631,0.865,0.75,0.202,0.155,0.179,0.081,0.054,0.071,0.012)
Currently, I have a data frame of the sums:
sums <- aggregate(df$Freq, by=list(Category=df$Predictor), FUN=sum)
I tried:
df$Props <- with(df, Freq/sums$x[which(sums$Category == Predictor)])
Obviously, this doesn't work, but I can't figure out what would. Any help is much appreciated.
Answer 0 (score: 2)
Per thelatemail:
with(df, ave(Freq, Predictor, FUN=prop.table))
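For context, `ave()` applies `FUN` to the `Freq` values within each `Predictor` group and returns the results in the original row order, while `prop.table()` divides a vector by its sum. Here is a minimal sketch that reuses the question's data and compares the result with the values the question expects; the `expected` vector and the rounding to three decimals are only there for the comparison and are not part of the original answer.

```r
df <- data.frame(
  Predictor = c("for","of","as","for","for","as","of","of","as","for"),
  ToPredict = c("sure","course","much","him","keeps","far","them","this","an","petes"),
  Freq = c(53,32,21,17,13,5,3,2,2,1))

# Within each Predictor group, divide every Freq by that group's total
df$Props <- with(df, ave(Freq, Predictor, FUN = prop.table))

# Values listed in the question, used here only as a sanity check
expected <- c(0.631,0.865,0.75,0.202,0.155,0.179,0.081,0.054,0.071,0.012)
all.equal(round(df$Props, 3), expected)
```

Because the proportions are computed per group, they should sum to 1 within each `Predictor`.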
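Answer 1 (score: 1)
# Total frequency per predictor
a <- aggregate(df$Freq, by = list(df$Predictor), FUN = sum)
# Named vector of totals, keyed by predictor
a1 <- setNames(a[, 2], as.character(a[, 1]))
# Look up each row's total by name and divide
df$Props <- unname(df$Freq / a1[as.character(df$Predictor)])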
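As for why the attempt in the question fails: `sums$Category == Predictor` compares a length-3 vector with a length-10 vector element by element, recycling the shorter one, instead of looking up each row's predictor. One possible fix, sketched here on the assumption that `sums` is built exactly as in the question, is to use `match()`, which returns for each `Predictor` the position of its row in `sums`. The name `idx` is just an illustrative helper.

```r
df <- data.frame(
  Predictor = c("for","of","as","for","for","as","of","of","as","for"),
  ToPredict = c("sure","course","much","him","keeps","far","them","this","an","petes"),
  Freq = c(53,32,21,17,13,5,3,2,2,1))

# Per-predictor totals, as in the question
sums <- aggregate(df$Freq, by = list(Category = df$Predictor), FUN = sum)

# For each row of df, the index of its predictor's row in sums
idx <- match(df$Predictor, sums$Category)

# Divide each frequency by the total for its own predictor
df$Props <- df$Freq / sums$x[idx]
```

This keeps the existing `sums` data frame and only changes how each row finds its total.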