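I am building a dplyr pipeline so that I can run a custom function over the columns of a data frame in a single chunk of code.
My function currently looks like this: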
funx <- function(x) {
  # flag columns whose maximum is below 400
  logchoice <- if (max(x) < 400) { 'T' } else { 'F' }
  # count how many values survive a log10 transform
  logtest <- suppressWarnings(log10(x))
  remaining <- length(logtest[which(!is.na(logtest) & is.finite(logtest))])
  # keep the log10 values only if more than 75% of them are usable
  x <- if (remaining > 0.75 * length(x)) { suppressWarnings(log10(x)) } else { x }
  x <- x[which(!is.na(x) & is.finite(x))]
  y <- diptest::dip.test(x)
  z <- tibble(pvalue = y$p.value, Transform = logchoice)
  return(z)
}
and the dplyr pipeline is as follows:
mtcars %>%
  sample_n(30) %>%
  select(colnames(mtcars)[2:5]) %>%
  summarise_all(list(~ list(funx(.)))) %>%
  gather %>%
  unnest %>%
  arrange(pvalue) %>%
  rename(Parameter = key)
This gives me:
Parameter pvalue Transform
1 cyl 0.00000000 T
2 drat 0.03026093 T
3 hp 0.04252001 T
4 disp 0.06050505 F
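I would like to know how to access the column name inside the function, mainly because I want to change the names in the result table to something like paste('log10', original_column_name, sep = '_') when the function applied the log transform, while keeping the original name when it decided not to.
So the expected output is: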
Parameter pvalue Transform
1 log10_cyl 0.00000000 T
2 log10_drat 0.03026093 T
3 log10_hp 0.04252001 T
4 disp 0.06050505 F
Answer 0 (score: 0)
You are very close. You just need to add a mutate() at the end.
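The exact mutate() call from this answer is not preserved here, so the following is only a sketch of what such a step might look like. It assumes the result table from the question (rebuilt by hand below so the example runs on its own) and uses dplyr's if_else to prefix the name only where Transform is 'T'.

```r
library(dplyr)

# Result table copied from the question's output (rebuilt by hand for illustration)
res <- tibble(
  Parameter = c("cyl", "drat", "hp", "disp"),
  pvalue    = c(0.00000000, 0.03026093, 0.04252001, 0.06050505),
  Transform = c("T", "T", "T", "F")
)

# Assumed form of the extra step: prefix the name only when Transform is 'T'
renamed <- res %>%
  mutate(Parameter = if_else(Transform == "T",
                             paste0("log10_", Parameter),
                             Parameter))
```

In the original pipeline, the same mutate() would go after rename(Parameter = key), since that is where the Parameter column is created.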
Answer 1 (score: 0)
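I am answering in a separate post because the approach is different. To get the column names into print(), I pass them into the function and use purrr::map_dfr to build the data frame of results. The small changes I made were to take the column name as col_name and to pass in the data frame explicitly. I tried several ways of getting the column name from inside your original function, but without success.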
library(dplyr)

logtest_pval <- function(col, df) {
  col_name <- col
  # extract the column named by the string `col`
  x <- df %>% pull(!!col)
  # flag columns whose maximum is below 400
  logchoice <- max(x) < 400
  logtest <- log10(x)
  remaining <- length(logtest[which(!is.na(logtest) & is.finite(logtest))])
  # keep the log10 values only if more than 75% of them are usable
  x <- if (remaining > 0.75 * length(x)) { suppressWarnings(log10(x)) } else { x }
  x <- x[which(!is.na(x) & is.finite(x))]
  y <- diptest::dip.test(x)
  z <-
    tibble(
      transform = logchoice,
      column = ifelse(logchoice, paste0("log10_", col_name), col_name),
      pvalue = y$p.value
    )
  print(paste0(z, collapse = " | "))
  return(z)
}
Then you can build the data frame:
purrr::map_dfr(
  .x = names(mtcars), # the columns to use
  .f = logtest_pval,  # the function to use
  df = mtcars         # additional arguments needed
)
Here is another example:
df <-
  mtcars %>%
  select_if(is.numeric)
pvalues <-
  purrr::map_dfr(names(df), logtest_pval, df)
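As a possible alternative to passing names and the data frame separately, purrr::imap_dfr hands each column's values and its name to the function together. The sketch below is not from the answer: label_col is a hypothetical helper, and the dip test is left out so the example depends only on dplyr and purrr.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: imap_dfr calls it with the column values and the column name
label_col <- function(x, name) {
  use_log <- max(x) < 400  # same threshold as in the question
  tibble(
    Parameter = if (use_log) paste0("log10_", name) else name,
    Transform = use_log
  )
}

# One row per column, bound into a single data frame
labels <- mtcars %>%
  select(cyl, disp, hp, drat) %>%
  imap_dfr(label_col)
```

The p-value could be added to the tibble inside label_col in the same way as in logtest_pval above.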