dplyr does not work inside a function

Time: 2017-03-24 10:54:26

Tags: r function dplyr

I have little experience with functions in R. I am trying to build a function that computes a mean from a target variable (in my example: funded_final).

My data:

residential_status  funded_final
Living with parents 0
Rent                0
Rent                0
Own                 1
Own                 0
Own                 0
Rent                0
Rent                0
Rent                0
Living with parents 0
Rent                0
Rent                0
Rent                1

When I run this code outside a function, it works fine:

test2 %>%
  group_by(residential_status) %>%
  summarise(tar_average = round(mean(funded_final, na.rm = TRUE) * 100, 2),
            N = n()) %>%
  arrange(desc(tar_average)) %>%
  mutate(Perc = round(N / sum(N) * 100, 2),
         Cum_Perc = cumsum(Perc)) %>%
  print(n = nrow(.))

Result:

 residential_status tar_average     N  Perc Cum_Perc
           <fctr>       <dbl> <int> <dbl>    <dbl>
1                 Own       33.33     3 23.08    23.08
2                Rent       12.50     8 61.54    84.62
3 Living with parents        0.00     2 15.38   100.00

When I wrap it in a function, I only get the overall mean in every group:

group.by.func <- function(dataframe, target) {
  dataframe %>%
    group_by(residential_status) %>%
    summarise(tar_average = round(mean(target, na.rm = TRUE) * 100, 2),
              N = n()) %>%
    arrange(desc(tar_average)) %>%
    mutate(Perc = round(N / sum(N) * 100, 2),
           Cum_Perc = cumsum(Perc)) %>%
    print(n = nrow(.))
}
group.by.func(test2, test2$funded_final)

Result:

residential_status tar_average     N  Perc Cum_Perc
           <fctr>       <dbl> <int> <dbl>    <dbl>
1 Living with parents       15.38     2 15.38    15.38
2                 Own       15.38     3 23.08    38.46
3                Rent       15.38     8 61.54   100.00

Thanks in advance!

1 Answer:

Answer 0: (score: 0)

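The problem is that dplyr::summarise uses non-standard evaluation and expects column names as unquoted symbols. In your case the variable target is not a column name but a vector containing the column's values. The function cannot relate that vector to the data.frame, so the grouping is not applied to target: in each evaluation on a group of the data.frame, the mean is taken over the whole vector target.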
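As a rough illustration of the point above, the following base-R sketch rebuilds the question's data (the data.frame construction is an assumption based on the table shown in the question) and contrasts the mean of the whole vector with the per-group means that were actually intended.

```r
# Toy data copied from the table in the question
test2 <- data.frame(
  residential_status = c("Living with parents", "Rent", "Rent", "Own", "Own",
                         "Own", "Rent", "Rent", "Rent", "Living with parents",
                         "Rent", "Rent", "Rent"),
  funded_final = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1)
)

# What the function effectively computes for every group:
# the mean of the full vector, ignoring the grouping
overall <- round(mean(test2$funded_final, na.rm = TRUE) * 100, 2)

# What was intended: one mean per residential_status
per_group <- round(
  tapply(test2$funded_final, test2$residential_status, mean, na.rm = TRUE) * 100,
  2
)

print(overall)
print(per_group)
```

The single overall value is the one repeated in every row of the question's second result.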

You can solve this by passing the column name as a string and using the 'standard evaluation' version, dplyr::summarise_:

group.by.func <- function(dataframe, target) {
  dataframe %>%
    group_by(residential_status) %>%
    summarise_(.dots = list(
      tar_average = paste0("round((mean(", target, ", na.rm=TRUE))*100,2)"),
      N = "n()")) %>%
    arrange(desc(tar_average)) %>%
    mutate(Perc = round((N / sum(N)) * 100, 2), Cum_Perc = cumsum(Perc)) %>%
    print(n = nrow(.))
}
group.by.func(test2, "funded_final")
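Later dplyr releases deprecated the underscore verbs such as summarise_. As a possible alternative, and not part of the original answer, the sketch below uses the embrace operator {{ }} from tidy evaluation, which assumes a reasonably recent dplyr and rlang (0.4.0 or later). The function name group_by_func and the reconstructed test2 data are illustrative assumptions.

```r
library(dplyr)

# Toy data copied from the table in the question
test2 <- data.frame(
  residential_status = c("Living with parents", "Rent", "Rent", "Own", "Own",
                         "Own", "Rent", "Rent", "Rent", "Living with parents",
                         "Rent", "Rent", "Rent"),
  funded_final = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1)
)

# Hypothetical rewrite: `target` is passed as a bare column name and
# forwarded into summarise() with {{ }}, so it is evaluated per group
group_by_func <- function(dataframe, target) {
  dataframe %>%
    group_by(residential_status) %>%
    summarise(
      tar_average = round(mean({{ target }}, na.rm = TRUE) * 100, 2),
      N = n()
    ) %>%
    arrange(desc(tar_average)) %>%
    mutate(Perc = round(N / sum(N) * 100, 2),
           Cum_Perc = cumsum(Perc))
}

res <- group_by_func(test2, funded_final)
print(res, n = nrow(res))
```

With this version the call takes the unquoted column name, group_by_func(test2, funded_final), rather than a string or test2$funded_final.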