Question

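I have been trying to analyze my data set, and I believe I'm on the right track, but I need some confirmation. I am trying to analyze fish catch rates across several reaches of the same river and to evaluate the effectiveness of the gear types used during a two-year study.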

My data include:

1) 21 different sites
2) Sampling technique, classified as "active" or "passive"
3) 2 years of data collection, separated by month

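The sites were not sampled uniformly over the course of the study. They were not all sampled every month, not all sampled for the same amount of time, and not all sampled with the same technique. I believe you could classify this as non-repeated measures, since almost no two sampling periods were the same.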

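I think the correct way to analyze these data is a two-way randomized block ANOVA, with month as the blocking factor. I got some results, but I'm not sure the code I used is correct.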

Could someone check the code I used and confirm or deny that it is the correct code for a two-way randomized block design in R?

Fish<-read.csv(file.choose(),header=TRUE)
Fish
FishLM<-lm(Caught.Hr ~ Site + Method + Site:Method,Fish)
anova(FishLM)

Here is some sample data:

Site     Month  Year    Device  Method  Hrs/Month   Caught  Caught/Hr  
Reach 01    5   2014    BS      Active  0.7            0    0  
Reach 01    6   2014    BS      Active  7.92           0    0  
Reach 01    7   2014    BS      Active  5.73           0    0  
Reach 01    8   2014    BS      Active  1.82           0    0  
Reach 01    9   2014    BS      Active  10.08          0    0  
Reach 01    10  2014    BS      Active  10.08          0    0  
Reach 01    11  2014    BS      Active  6.9            0    0  
Reach 02    3   2013    BS      Active  2.5            0    0  
Reach 02    4   2013    BS      Active  2.5            0    0  
Reach 02    5   2013    BS      Active  3.75           0    0  
Reach 02    6   2013    BS      Active  17.3           0    0  
Reach 02    7   2013    BS      Active  2.5            0    0  
Reach 02    8   2013    BS      Active  2.5            0    0  
Reach 02    9   2013    BS      Active  2.5            0    0  
Reach 02    10  2013    BS      Active  2.5            0    0  
Reach 02    11  2013    BS      Active  2.5            0    0  
Reach 03    3   2013    BS      Active  3              0    0  
Reach 03    4   2013    BS      Active  3              0    0  
Reach 03    5   2013    BS     Active   2.5            0    0  
Reach 03    6   2013    BS     Active   3.5            1    0.285714286  
Reach 03    7   2013    BS     Active   3              0    0  
Reach 03    8   2013    BS     Active   3              0    0  
Reach 03    9   2013    BS     Active   3              1    0.333333333  
Reach 03    10  2013    BS     Active   8.75           2    0.228571429  
Reach 03    11  2013    BS      Active  3              0    0  
Reach 04    3   2013    MT      Passive           
Reach 04    4   2013    MT      Passive           
Reach 04    5   2013    MT      Passive           
Reach 04    6   2013    MT      Passive 72             0    0  
Reach 04    7   2013    MT      Passive 120            2    0.016666667  
Reach 04    8   2013    MT      Passive 120            0    0  
Reach 04    9   2013    MT      Passive 72             0    0  
Reach 04    10  2013    MT      Passive           
Reach 04    11  2013    MT      Passive           
Reach 07    3   2014    MF      Passive           
Reach 07    4   2014    MF      Passive 96             7    0.072916667  
Reach 07    5   2014    MF      Passive 96             5    0.052083333  
Reach 07    6   2014    MF      Passive 96             8    0.083333333  
Reach 07    7   2014    MF      Passive 96             1    0.010416667  
Reach 07    8   2014    MF      Passive 96             1    0.010416667  
Reach 07    9   2014    MF      Passive 96             3    0.03125  
Reach 07    10  2014    MF      Passive 96            10    0.104166667  
Reach 07    11  2014    MF      Passive

Thanks.

Answer 1

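If I'm reading your question correctly (and I'm not sure I am), you are trying to examine the variation in a continuous variable, Caught.Hr, that you assume is normally distributed, since you are using ANOVA. In addition, you have two treatment effects, Site and Method, and you took repeated measurements each month.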

So your model is

$$ Y_{ijk} = \mu + S_i + M_j + (SM)_{ij} + \epsilon_{ijk} $$

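where
- $Y_{ijk}$ is the catch rate at site i, using method j, in time period k,
- $\mu$ is the population mean catch rate,
- $S_i$ is the effect of each site,
- $M_j$ is the effect of each sampling method,
- $(SM)_{ij}$ is the interaction effect,
- $\epsilon_{ijk}$ is the random variation.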

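I can't tell from your description what the blocking factor is. It sounds like you have an unbalanced design. From your description, I don't think you have a randomized block design; you have a factorial design with two factors, which is also unbalanced.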
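One way to check whether the design is balanced is to cross-tabulate the two factors and see whether every cell has the same number of observations. The sketch below uses a small made-up data frame (`toy`, with hypothetical rows that are not your data); with the real data you would pass `Fish` instead.

```r
# Hypothetical toy data standing in for the real `Fish` data frame
toy <- data.frame(
  Site   = c("Reach 01", "Reach 01", "Reach 03", "Reach 03", "Reach 03", "Reach 04"),
  Month  = c(5, 6, 5, 6, 7, 6),
  Method = c("Active", "Active", "Active", "Active", "Active", "Passive")
)

# Number of observations in each Site x Method cell
cell_counts <- xtabs(~ Site + Method, data = toy)
print(cell_counts)

# A balanced design has the same count in every cell
is_balanced <- length(unique(as.vector(cell_counts))) == 1
is_balanced
```

In the sample rows you posted, each reach appears with only one method. If that also holds for the full data set, the table will contain empty cells, and the Site:Method interaction cannot be estimated for those combinations.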

However, this works:

FishLM<-lm(Caught.Hr ~ Site + Method + Site:Method,Fish)
anova(FishLM)
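Note that the formula above has no month term. If month really is meant to be the blocking factor, as described in the question, it would have to appear in the model. A rough sketch of what that could look like, fitted here to simulated stand-in data (`sim`, hypothetical values, fully crossed so that every term is estimable), not to your data:

```r
set.seed(1)

# Simulated stand-in data: every Site x Method x Month combination occurs once
sim <- expand.grid(
  Site   = c("Reach 01", "Reach 02", "Reach 03"),
  Method = c("Active", "Passive"),
  Month  = 3:11
)
sim$Caught.Hr <- rexp(nrow(sim), rate = 10)  # arbitrary positive catch rates

# Month enters as a factor (the block); Site * Method is shorthand for
# Site + Method + Site:Method
block_fit <- lm(Caught.Hr ~ factor(Month) + Site * Method, data = sim)
block_tab <- anova(block_fit)
block_tab
```

With unbalanced data, the sequential sums of squares reported by `anova()` depend on the order of the terms in the formula, so the table should be read with that in mind.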

Edit:

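I think what I said above is valid given your data. I am concerned about your using ANOVA, though. These appear to be count data, i.e. Poisson rather than normally distributed. For example: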

# Problem: this ignores the differing hours of effort per observation
Fishglm <- glm(Caught ~ Site + Method + Site:Method, data=Fish, 
  family= poisson(link = "log"))
# could use neg-binomial on the rate instead.
library(MASS)
Fishnb <- glm.nb(Caught.Hr ~ Site + Method + Site:Method, data=Fish)
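A common way to handle unequal sampling effort in a count model is to keep the raw count as the response and add the log of the hours as an offset, so that the coefficients describe catch per hour. Here is a minimal sketch using the Reach 03 and Reach 07 rows from the question. It assumes the hours column is read in as `Hrs.Month` (what `read.csv` would produce from `Hrs/Month` under its default name handling), and it uses only `Method` because site and method are confounded in these rows.

```r
# Reach 03 (Active) and Reach 07 (Passive) rows from the sample data,
# leaving out the rows with no recorded hours
catch <- data.frame(
  Method    = c(rep("Active", 9), rep("Passive", 7)),
  Hrs.Month = c(3, 3, 2.5, 3.5, 3, 3, 3, 8.75, 3, rep(96, 7)),
  Caught    = c(0, 0, 0, 1, 0, 0, 1, 2, 0, 7, 5, 8, 1, 1, 3, 10)
)

# Poisson regression on counts; offset(log(hours)) turns it into a rate model
fit_offset <- glm(Caught ~ Method + offset(log(Hrs.Month)),
                  data = catch, family = poisson(link = "log"))
summary(fit_offset)

# exp(intercept) is the estimated catch per hour for the Active method;
# exp(MethodPassive) is the Passive / Active rate ratio
exp(coef(fit_offset))
```

Rows with zero hours would need to be dropped first, since `log(0)` is not finite. If the counts turn out to be overdispersed, the same formula with the offset term can be passed to `MASS::glm.nb`.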
