Putting the correlation output from a function into a data frame

Time: 2017-07-19 20:31:16

Tags: r function loops dataframe correlation

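I am running a large number of correlations of population over time. I have split the data accordingly and pass each piece through a function with lapply. I would like to put the output of each correlation into a data frame (i.e. each row holds the information for one correlation, with the columns: name of the correlation, p-value, t-statistic, df, CI, corcoeff).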

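I have two problems:

  1. I don't know how to extract the name of each correlation from the split.
  2. I can get my function to run the correlations over the split (600+ correlations), but I can't get it to write the results into a data frame. To clarify: when I run the function without the loop, it performs all 600 correlations, one per group. When I add the loop, however, it produces NULL for every group in the split.

Here is what I have done so far: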

    > head(Birds) #Shortened for this Post
    Location      Species   Year Longitude Latitude Section Total Percent  Family
    1 Chiswell A  Kittiwake 1976 -149.5847 59.59559 Central   310 16.78397 Gull
    
    BigSplit<-split(Birds,list(Birds$Family, Birds$Location, 
    Birds$Section,Birds$Species), drop=T) #A list of Dataframes
    
    #Make empty data frame
    resultcor <- data.frame(Name = character(),
                            tvalue = character(),
                            degreeF = character(),
                            pvalue = character(),
                            CIs = character(),
                            corcoeff = character(),stringsAsFactors = F)
    
    WorkFunc <- function(dataset) {
         data.name = substitute(dataset) #Use "dataset" as substitute for actual dataset name
    
         #Correlation between Year and population Percent
         try({
              correlation <- cor.test(dataset$Year, dataset$Percent, method = "pearson")    
         }, silent = TRUE)
    
         for (i in 1:nrow(resultcor)) {
              resultcor$Name[i] <- ??? #These ??? are not in the code, just highlighting Issue 1
              resultcor$tvalue[i] <- correlation$dataset$statistic
              resultcor$degreeF[i] <- correlation$dataset$parameter
              resultcor$pvalue[i] <- correlation$dataset$p.value
              resultcor$CIs[i] <- correlation$dataset$conf.int
              resultcor$corcoeff[i] <- correlation$dataset$estimate
         }
    }
    
    lapply(BigSplit, WorkFunc)
    

Any help would be greatly appreciated, thanks!

1 Answer:

Answer 0: (score: 1)

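Consider using Map (a variant of mapply), passing it both the elements and the names of BigSplit. Map returns a list of data frames, which you can then row-bind at the end with do.call(). The WorkFunc rewrite in this answer assumes that BigSplit is a named list.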
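As background for the naming assumption: `split()` called with a list of grouping factors returns a named list whose names join the factor levels with a dot, so `names(BigSplit)` should already carry a label for each correlation. A minimal sketch on made-up data (the `toy` data frame and its values are invented for illustration and are not taken from the question):

```r
# Toy stand-in for Birds; all values here are invented for illustration
toy <- data.frame(
  Family   = c("Gull", "Gull", "Auk", "Auk"),
  Location = c("A", "A", "B", "B"),
  Year     = c(1976, 1977, 1976, 1977),
  Percent  = c(16.8, 17.2, 40.1, 39.5),
  stringsAsFactors = FALSE
)

# Splitting on a list of factors yields a named list of data frames
toy_split <- split(toy, list(toy$Family, toy$Location), drop = TRUE)
split_names <- names(toy_split)
print(split_names)

# Map() walks the elements and their names in parallel
row_counts <- Map(function(d, nm) paste(nm, nrow(d)), toy_split, split_names)
print(unlist(row_counts, use.names = FALSE))
```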

    WorkFunc <- function(dataset, dataname) {
        # Correlation between Year and population Percent
        tryCatch({
            correlation <- cor.test(dataset$Year, dataset$Percent, method = "pearson")
            CIs <- correlation$conf.int

            return(data.frame(
                Name = dataname,
                tvalue = correlation$statistic,
                degreeF = correlation$parameter,
                pvalue = correlation$p.value,
                CI_lower = ifelse(is.null(CIs), NA, CIs[[1]]),
                CI_higher = ifelse(is.null(CIs), NA, CIs[[2]]),
                corcoeff = correlation$estimate
            ))
        }, error = function(e)
            return(data.frame(
                Name = character(0),
                tvalue = numeric(0),
                degreeF = numeric(0),
                pvalue = numeric(0),
                CI_lower = numeric(0),
                CI_higher = numeric(0),
                corcoeff = numeric(0)
            ))
        )
    }

    # BUILD CORRELATION DATAFRAMES INTO LIST
    cor_df_list <- Map(WorkFunc, BigSplit, names(BigSplit))
    cor_df_list <- mapply(WorkFunc, BigSplit, names(BigSplit), SIMPLIFY=FALSE)   # EQUIVALENT

    # ROW BIND ALL DATAFRAMES TO FINAL LARGE DATAFRAME
    finaldf <- do.call(rbind, cor_df_list)
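
As for why the original version could not fill `resultcor`, three general R behaviours are probably involved. A function whose last expression is a `for` loop returns NULL. Assignments inside a function change only a local copy of a global data frame. And `1:nrow(x)` counts down as 1, 0 when `x` has no rows. A minimal sketch on made-up objects (the names `f`, `g` and `acc` are hypothetical and only serve the illustration):

```r
# 1. A function that ends with a for loop returns NULL
f <- function(x) {
  for (i in seq_along(x)) x[i] <- x[i] * 2
}
loop_result <- lapply(list(a = 1:3, b = 4:6), f)

# 2. Changes made inside a function affect a local copy only
acc <- data.frame(value = numeric(0))
g <- function() {
  acc <- rbind(acc, data.frame(value = 1))  # local copy; global acc untouched
  nrow(acc)
}
rows_inside  <- g()
rows_outside <- nrow(acc)

# 3. 1:nrow() counts down on an empty data frame; seq_len() yields no indices
bad_index  <- 1:nrow(acc)
good_index <- seq_len(nrow(acc))
```

Returning a one-row data frame per group, as WorkFunc does, and binding the rows afterwards sidesteps all three behaviours.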