R: automatically run an analysis (chi-square) on several variables by group

Time: 2018-09-25 14:51:20

Tags: r for-loop group-by automation subset

RING    SPECIES        SEX AGE  FAT WEIGHT  WING    WINGPRI BEAK    TARSUS   
H8309   ACCIPITER NISUS M   5   0   141     199     117     19,2    52      
K617    ACCIPITER NISUS F   4   0   288,5   232     167     20,4    62,2    
A264905 ACROCEPHALUS    F   4   2   11,8    64,5    NA      NA      NA      
A358705 ACROCEPHALUS    M   3   2   11      66      50      18,2    22      
A432721 ACROCEPHALUS    U   4   6   14,5    63      48      16      21,9    
O59461  AEGITHALOS      M   4   0   6,4     57      42      8,2     13,8    
O92094  AEGITHALOS      F   2   0   6,8     56      38      7,96    16,54   
O92095  AEGITHALOS      U   2   0   7       58      44      8,78    17,85   

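This is a small sample of my data frame ("amostra"). I am trying to find out whether the sexes differ within each species (my original df has more than 60 species), and I was told that the best way to do this is a chi-square test on the variables WEIGHT, WING, WINGPRI, BEAK and TARSUS.

So I need to run a chi-square test for each species, using each of these 5 variables independently, to determine whether there is a difference between the sexes.

I have been struggling with this for several days, and the best I have managed so far is this: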

for(i in unique(amostra$SPECIES)){
  for (j in 6:10){
  print(
    colnames(amostra[j]))
    names(amostra$SPECIES)
    print(
    chisq.test(amostra$SEX, amostra[,j]))}
}

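This gives me the correct output for each of the 5 variables, but repeated once for every unique species I have, so I get the same p-value for TARSUS 60 times instead of a distinct p-value per species. For example, for the variable TARSUS alone: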

[1] "TARSUS"

    Pearson's Chi-squared test

data:  amostra$SEX and amostra[, j]
X-squared = 1072, df = 758, p-value = 3.53e-13

I also tried:

subset1 <- amostra[, c(2,3,6:10)]

subset1$SPECIES<- as.factor(subset1$SPECIES)

analise<- function(subset1){
  for (i in 3:7){
    print(
      colnames(amostra[i]))
    print(
      chisq.test(amostra[,2],amostra[,i]))
  }
  subset1
}

by(subset1,subset1$SPECIES,FUN = analise)

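This gives me a huge output that I cannot see in full. The first part is the same as the output above, but now, instead of chi-square results grouped by species, I get this for every species: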

---------------------------------------------------------------------------------------- 
subset1$SPECIES: PASSER SP.
        SPECIES SEX WEIGHT WING WINGPRI  BEAK TARSUS
1522 PASSER SP.   F   25.5   74      55 14.64  21.51
1523 PASSER SP.   F     NA   76      56    NA     NA
1524 PASSER SP.   F   29.5   78      58 14.70  20.40
----------------------------------------------------------------------------------------

I hope I have made my problem clear. This is my first post, so I apologize for any mistakes.

Thanks in advance.

1 answer:

Answer 0 (score: 0)

Statistical validity aside, you can use a tidyverse / purrr approach to do the work of the loop:

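library(tidyverse)
library(broom)

# df is the data frame from the question (called amostra there)
df %>%
  select(-RING, -AGE, -FAT) %>%
  # long format: one row per bird and measurement
  gather(variable, value, -SPECIES, -SEX) %>%
  group_by(SPECIES, variable) %>%
  nest() %>%
  mutate(
    # one test per SPECIES x variable combination
    chi_sq_results = map(data, ~ chisq.test(.x$SEX, .x$value)),
    tidied = map(chi_sq_results, tidy)
  ) %>%
  # keep only the tidied results; newer tidyr versions deprecate unnest(.drop = TRUE)
  select(SPECIES, variable, tidied) %>%
  unnest(tidied)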

# # A tibble: 15 x 6
#    SPECIES         variable statistic p.value parameter method                                                      
#    <chr>           <chr>        <dbl>   <dbl>     <int> <chr>                                                       
#  1 ACCIPITER NISUS WEIGHT          0    1             1 Pearson's Chi-squared test with Yates' continuity correction
#  2 ACROCEPHALUS    WEIGHT          6.   0.199         4 Pearson's Chi-squared test                                  
#  3 AEGITHALOS      WEIGHT          6.   0.199         4 Pearson's Chi-squared test                                  
#  4 ACCIPITER NISUS WING            0    1             1 Pearson's Chi-squared test with Yates' continuity correction
#  5 ACROCEPHALUS    WING            6.   0.199         4 Pearson's Chi-squared test                                  
#  6 AEGITHALOS      WING            6.   0.199         4 Pearson's Chi-squared test                                  
#  7 ACCIPITER NISUS WINGPRI         0    1             1 Pearson's Chi-squared test with Yates' continuity correction
#  8 ACROCEPHALUS    WINGPRI         0    1             1 Pearson's Chi-squared test with Yates' continuity correction
#  9 AEGITHALOS      WINGPRI         6.   0.199         4 Pearson's Chi-squared test                                  
# 10 ACCIPITER NISUS BEAK            0    1             1 Pearson's Chi-squared test with Yates' continuity correction
# 11 ACROCEPHALUS    BEAK            0    1             1 Pearson's Chi-squared test with Yates' continuity correction
# 12 AEGITHALOS      BEAK            6.   0.199         4 Pearson's Chi-squared test                                  
# 13 ACCIPITER NISUS TARSUS          0    1             1 Pearson's Chi-squared test with Yates' continuity correction
# 14 ACROCEPHALUS    TARSUS          0    1             1 Pearson's Chi-squared test with Yates' continuity correction
# 15 AEGITHALOS      TARSUS          6.   0.199         4 Pearson's Chi-squared test
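
For readers who would rather stay in base R, the same per-species idea can be sketched with split() and lapply(). The difference from the loops in the question is that each test only sees the rows of one species, rather than the whole data frame. The toy data frame and the helper name test_one_species are made up for illustration and are not part of the original post.

```r
# Toy data, invented for illustration only (not the asker's measurements).
toy <- data.frame(
  SPECIES = rep(c("SP_A", "SP_B"), each = 6),
  SEX     = rep(c("F", "M"), times = 6),
  WING    = c(60, 62, 60, 62, 61, 62, 70, 70, 71, 72, 71, 72),
  TARSUS  = c(20, 21, 20, 21, 20, 21, 30, 31, 31, 30, 30, 31),
  stringsAsFactors = FALSE
)

# Measurement columns to test against SEX.
vars <- c("WING", "TARSUS")

# Hypothetical helper: runs one chi-square test per variable,
# using only the rows of a single species (the data frame d).
test_one_species <- function(d) {
  do.call(rbind, lapply(vars, function(v) {
    # suppressWarnings() hides the small-expected-count warning
    # that chisq.test() raises on tiny samples like this one.
    res <- suppressWarnings(chisq.test(d$SEX, d[[v]]))
    data.frame(
      SPECIES   = d$SPECIES[1],
      variable  = v,
      statistic = unname(res$statistic),
      df        = unname(res$parameter),
      p.value   = res$p.value,
      stringsAsFactors = FALSE
    )
  }))
}

# split() produces one data frame per species; each goes to the helper.
results <- do.call(rbind, lapply(split(toy, toy$SPECIES), test_one_species))
rownames(results) <- NULL
print(results)
```

On real data, a species with only one sex recorded, or a variable with a single distinct value within a species, makes chisq.test() stop with an error, so the call may need to be wrapped in tryCatch().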