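I'm trying to write a function that computes the proportion of one column (outcome) based on the values of another column. The code looks like this: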
thresh_measure <- function(data, indicator, thresh_value) {
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum(indicator <= thresh_value)) %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  return(d1)
}
final_test <- thresh_measure(df, 'pass_rate', 0.8)
There seems to be a bug in the summarize call: as written, the function returns all 0s. When I change it to look like this, it works:
thresh_measure <- function(data, indicator, thresh_value) {
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum(pass_rate <= thresh_value)) %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  return(d1)
}
final_test <- thresh_measure(df, 'pass_rate', 0.8)
I have tried setting the value through .GlobalEnv, and I have detached every library except dplyr, but it still doesn't work.
Answer 0 (score: 0)
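You have to deal with the name of the column being passed as an argument. Inside summarize, indicator is the string 'pass_rate' rather than the column of that name, so the comparison never looks at the data. One way around this is to rename the column, for example (better approaches certainly exist):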
thresh_measure <- function(data, indicator, thresh_value) {
  d1 <- data
  # Rename the requested column to the literal name "indicator",
  # so the bare name used inside summarize() resolves to that column
  names(d1)[names(d1) == indicator] <- "indicator"
  d1 <- d1 %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum(indicator <= thresh_value)) %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  return(d1)
}
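To see why the original version returns all 0s, here is a minimal base R sketch. The values are hypothetical toy data, not taken from the question. When a string is compared with a number, R coerces the number to a string, so the test becomes a single string comparison that is FALSE in typical locales (digits sort before letters).

```r
# Hypothetical toy values, for illustration only
indicator    <- "pass_rate"        # what the function argument actually holds
thresh_value <- 0.8
pass_rate    <- c(0.5, 0.9, 0.7)   # what the column would hold

# 0.8 is coerced to "0.8", so this compares two strings, once
string_cmp <- indicator <= thresh_value

# This is the element-wise comparison the function was meant to do
column_cmp <- pass_rate <= thresh_value

sum(string_cmp)  # counts a single FALSE
sum(column_cmp)  # counts the values at or below the threshold
```

Renaming the column, as above, makes the bare name indicator resolve to the data instead of to the string argument.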
Answer 1 (score: 0)
Two alternative approaches that should work:
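# alternative I: pass the column name as a string
thresh_measure <- function(data, indicator, thresh_value) {
  # Turn the string into a symbol that can be unquoted inside summarize()
  ind_quo <- rlang::sym(indicator)
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    # `!!` is the operator form of UQ(), which recent rlang releases deprecate
    summarize(n = sum((!!ind_quo) <= thresh_value)) %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  return(d1)
}
final_test <- thresh_measure(df, 'pass_rate', 0.8)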
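# alternative II: pass the column as a bare (unquoted) name
thresh_measure <- function(data, indicator, thresh_value) {
  # Capture the expression the caller supplied instead of evaluating it
  ind_quo <- enquo(indicator)
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    # `!!` is the operator form of UQ(), which recent rlang releases deprecate
    summarize(n = sum((!!ind_quo) <= thresh_value)) %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  return(d1)
}
final_test <- thresh_measure(df, pass_rate, 0.8)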
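With newer versions of dplyr and rlang, the same two ideas can be written a little more compactly. The following is only a sketch under assumptions: it assumes dplyr 1.0 or later (for the .groups argument) and rlang 0.4 or later (for the `{{ }}` operator), the toy data frame is made up, and the names thresh_measure_chr and thresh_measure_sym are hypothetical, chosen here to keep the two variants apart.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; only the column names follow the question
df <- data.frame(
  class_number = c(1, 1, 1, 2, 2),
  outcome      = c("a", "b", "a", "a", "b"),
  pass_rate    = c(0.5, 0.9, 0.7, 0.95, 0.6)
)

# Variant for a column name given as a string: look it up via the .data pronoun
thresh_measure_chr <- function(data, indicator, thresh_value) {
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum(.data[[indicator]] <= thresh_value), .groups = "drop") %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  d1
}

# Variant for a bare column name: embrace the argument with {{ }}
thresh_measure_sym <- function(data, indicator, thresh_value) {
  d1 <- data %>%
    group_by(class_number, outcome) %>%
    summarize(n = sum({{ indicator }} <= thresh_value), .groups = "drop") %>%
    spread(outcome, n)
  d1$thresh_value <- thresh_value
  d1
}

res_chr <- thresh_measure_chr(df, "pass_rate", 0.8)
res_sym <- thresh_measure_sym(df, pass_rate, 0.8)
```

Both calls should give the same table: one row per class_number, one count column per outcome, plus the thresh_value column.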