Dividing groups by the number of samples in each group in R

Asked: 2013-09-18 16:31:56

Tags: r

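I am trying to use ggplot to plot substrate composition at 6 different sites and 7 different times. The problem is that I have a different sample size for each sampling period and site. I basically want to code y = freq/(# of stations in that time period). Here is a sample of my data set: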

   Substrate     Time   Site Freq
1      Floc    July 11   P1    4
2      Fine    July 11   P1    2
3    Medium    July 11   P1   12
4    Coarse    July 11   P1    0
5   Bedrock    July 11   P1    3
6      Floc     Aug 11   P1    7
7      Fine     Aug 11   P1    1
8    Medium     Aug 11   P1    7
9    Coarse     Aug 11   P1    1
10  Bedrock     Aug 11   P1    4

So I would like

      Var1       Var2 Var3 Freq
1      Floc    July 11   P1    4/(21 - The number of samples taken in July).

Any ideas on how to write this code and then plot the results?

2 answers:

Answer 0 (score: 5)

Using data.table (from the package of the same name)...

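library(data.table)
DT <- data.table(dat)  # dat is the data frame from the question

# Divide each Freq by the total Freq of its Var2 group; := adds the column by reference
DT[, Freq2 := Freq / sum(Freq), by = Var2]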

which gives

       Var1    Var2 Var3 Freq     Freq2
 1:    Floc July 11   P1    4 0.1904762
 2:    Fine July 11   P1    2 0.0952381
 3:  Medium July 11   P1   12 0.5714286
 4:  Coarse July 11   P1    0 0.0000000
 5: Bedrock July 11   P1    3 0.1428571
 6:    Floc  Aug 11   P1    7 0.3500000
 7:    Fine  Aug 11   P1    1 0.0500000
 8:  Medium  Aug 11   P1    7 0.3500000
 9:  Coarse  Aug 11   P1    1 0.0500000
10: Bedrock  Aug 11   P1    4 0.2000000

EDIT: The question now has better column names, so it is clearer what "for ... period and site" means. As @DWin wrote in the comments, the answer is now:

DT[,Freq2:=Freq/sum(Freq),by='Time,Site']
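For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample rows from the question under the newer column names (Substrate, Time, Site, Freq); the `check` table is only an illustrative sanity check and is not part of the original answer.

```r
library(data.table)

# Sample rows from the question, using the newer column names.
DT <- data.table(
  Substrate = rep(c("Floc", "Fine", "Medium", "Coarse", "Bedrock"), times = 2),
  Time      = rep(c("July 11", "Aug 11"), each = 5),
  Site      = "P1",
  Freq      = c(4, 2, 12, 0, 3, 7, 1, 7, 1, 4)
)

# Proportion of each substrate within its Time/Site group.
DT[, Freq2 := Freq / sum(Freq), by = .(Time, Site)]

# Sanity check (illustrative): proportions in each group should sum to 1.
check <- DT[, .(total = sum(Freq2)), by = .(Time, Site)]
print(DT)
```

Here `by = .(Time, Site)` is equivalent to `by='Time,Site'`. A group whose Freq values are all zero would give NaN, so that case may need separate handling in real data.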

Answer 1 (score: 3)

See ?ave

df <- read.table(textConnection("
Var0 Var1       Var2 Var3 Freq
1      Floc    July 11   P1    4
2      Fine    July 11   P1    2
3    Medium    July 11   P1   12
4    Coarse    July 11   P1    0
5   Bedrock    July 11   P1    3
6      Floc     Aug 11   P1    7
7      Fine     Aug 11   P1    1
8    Medium     Aug 11   P1    7
9    Coarse     Aug 11   P1    1
10  Bedrock     Aug 11   P1    4"), header=TRUE, row.names=1)

# "July 11" is read as two fields, so Var1 holds the month; group by it
df$freq <- ave(df$Freq, df$Var1, FUN = function(x) x / sum(x))
df
#      Var0 Var1 Var2 Var3 Freq      freq
#1     Floc July   11   P1    4 0.1904762
#2     Fine July   11   P1    2 0.0952381
#3   Medium July   11   P1   12 0.5714286
#4   Coarse July   11   P1    0 0.0000000
#5  Bedrock July   11   P1    3 0.1428571
#6     Floc  Aug   11   P1    7 0.3500000
#7     Fine  Aug   11   P1    1 0.0500000
#8   Medium  Aug   11   P1    7 0.3500000
#9   Coarse  Aug   11   P1    1 0.0500000
#10 Bedrock  Aug   11   P1    4 0.2000000
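With the question's later column names, the same idea can be grouped on both Time and Site, and the result plotted. The following is only a sketch under assumptions: the data frame `dat` and the column `Prop` are names made up here for illustration, and the stacked-bar layout is one possible way to draw it, not something either answer specified. The plot is skipped if ggplot2 is not installed.

```r
# Sample rows from the question, using the newer column names.
dat <- data.frame(
  Substrate = rep(c("Floc", "Fine", "Medium", "Coarse", "Bedrock"), times = 2),
  Time      = rep(c("July 11", "Aug 11"), each = 5),
  Site      = "P1",
  Freq      = c(4, 2, 12, 0, 3, 7, 1, 7, 1, 4)
)

# ave() accepts several grouping vectors; FUN runs once per Time/Site combination.
dat$Prop <- ave(dat$Freq, dat$Time, dat$Site, FUN = function(x) x / sum(x))

# Stacked bars of substrate proportion per time period, one panel per site.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(dat, ggplot2::aes(x = Time, y = Prop, fill = Substrate)) +
    ggplot2::geom_col() +
    ggplot2::facet_wrap(~ Site)
  print(p)
}
```

Because Time is a character column here, the bars are ordered alphabetically; converting it to a factor with explicit levels would put the periods in chronological order.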