I have the following data frame:
library(tidyverse)
dat <- structure(list(charge.Group3 = c(0.167, 0.167, 0.1, 0.067, 0.033,
0.033, 0.067, 0.133, 0.2, 0.067, 0.133, 0.114, 0.167, 0.033,
0.1, 0.033, 0.133, 0.267, 0.133, 0.233, 0.1, 0.167, 0.067, 0.133,
0.1, 0.133, 0.1, 0.133, 0.1, 0.067, 0.167, 0), hydrophobicity.Group3 = c(0.267,
0.467, 0.067, 0.167, 0.267, 0.1, 0.367, 0.233, 0.367, 0.233,
0.133, 0.205, 0.333, 0.267, 0.267, 0.067, 0.133, 0.3, 0.233,
0.267, 0.5, 0.333, 0.2, 0.5, 0.5, 0.4, 0.033, 0.3, 0.233, 0.5,
0.233, 0.033), class = c("Negative", "Negative", "Positive",
"Positive", "Positive", "Positive", "Positive", "Negative", "Positive",
"Positive", "Positive", "Positive", "Positive", "Positive", "Negative",
"Positive", "Negative", "Negative", "Negative", "Negative", "Negative",
"Negative", "Negative", "Negative", "Negative", "Negative", "Positive",
"Positive", "Positive", "Negative", "Positive", "Negative")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -32L))
dat
#> # A tibble: 32 x 3
#> charge.Group3 hydrophobicity.Group3 class
#> <dbl> <dbl> <chr>
#> 1 0.167 0.267 Negative
#> 2 0.167 0.467 Negative
#> 3 0.1 0.067 Positive
#> 4 0.067 0.167 Positive
#> 5 0.033 0.267 Positive
#> 6 0.033 0.1 Positive
#> 7 0.067 0.367 Positive
#> 8 0.133 0.233 Negative
#> 9 0.2 0.367 Positive
#> 10 0.067 0.233 Positive
#> # ... with 22 more rows
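What I want to do, for each feature (charge.Group3 and hydrophobicity.Group3), is run wilcox.test between the Negative and Positive classes, and finally get the p-values as a data frame or tibble.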
Note that in reality there are more than 2 features. How can I achieve this?
Answer 0 (score: 2)
Here is one way, using dplyr::summarize_at and tidyr::gather:
library(tidyverse)
dat %>%
summarize_at(c("charge.Group3","hydrophobicity.Group3"),
~wilcox.test(.x ~ .y)$p.value, .$class) %>%
gather(features, pvalue)
# # A tibble: 2 x 2
# features pvalue
# <chr> <dbl>
# 1 charge.Group3 0.109
# 2 hydrophobicity.Group3 0.039
To summarize all variables except class:
dat %>%
summarize_at(vars(-class),
~wilcox.test(.x ~ .y)$p.value,
.$class) %>%
gather(features,pvalue)
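In dplyr 1.0.0 and later, summarize_at is superseded by across(), and in tidyr 1.0.0 and later, gather is superseded by pivot_longer. The following is a minimal sketch of the same idea with those functions, not part of the original answer. The tibble toy and its values are hypothetical stand-ins for dat, chosen without ties so the test runs without warnings.

```r
library(tidyverse)

# Hypothetical toy data with the same column layout as `dat`
toy <- tibble(
  charge.Group3         = c(0.10, 0.20, 0.30, 0.40, 0.15, 0.25, 0.35, 0.45),
  hydrophobicity.Group3 = c(0.50, 0.10, 0.40, 0.20, 0.55, 0.05, 0.45, 0.25),
  class                 = rep(c("Negative", "Positive"), each = 4)
)

res_across <- toy %>%
  # one Wilcoxon test per column other than `class`; `.x` is the current column
  summarise(across(-class, ~ wilcox.test(.x ~ class)$p.value)) %>%
  # wide (one column per feature) to long (one row per feature)
  pivot_longer(everything(), names_to = "features", values_to = "pvalue")

res_across
```

Because the lambda refers to the class column directly, there is no need to pass .$class as an extra argument.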
Answer 1 (score: 2)
If you only need the p-value from the test, you don't actually need broom.
library(tidyverse)
dat %>%
gather(group, value, -class) %>% # reshape data
nest(-group) %>% # for each group nest data
mutate(pval = map_dbl(data, ~wilcox.test(value ~ class, data = .)$p.value)) %>% # get p value for wilcoxon test
select(-data) # remove data column
# # A tibble: 2 x 2
# group pval
# <chr> <dbl>
# 1 charge.Group3 0.109
# 2 hydrophobicity.Group3 0.0390
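Reshaping first lets you apply this procedure no matter how many columns you have, assuming class is the only extra variable.
Or you can even avoid map, as @Moody_Mudskipper suggested, using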
dat %>%
gather(group, value, -class) %>%
group_by(group) %>%
summarize(results = wilcox.test(value ~ class)$p.value)
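To illustrate why reshaping first generalizes to any number of features, here is a small sketch with three feature columns. The tibble toy3, its column names (feat_a, feat_b, feat_c) and its values are hypothetical and only serve the illustration.

```r
library(tidyverse)

# Hypothetical toy data: three feature columns plus `class`
toy3 <- tibble(
  feat_a = c(0.10, 0.20, 0.30, 0.40, 0.15, 0.25, 0.35, 0.45),
  feat_b = c(0.50, 0.10, 0.40, 0.20, 0.55, 0.05, 0.45, 0.25),
  feat_c = c(0.01, 0.02, 0.03, 0.04, 0.91, 0.92, 0.93, 0.94),
  class  = rep(c("Negative", "Positive"), each = 4)
)

# Long format: one row per (observation, feature) pair,
# so 8 observations x 3 features = 24 rows
long <- toy3 %>% gather(group, value, -class)
long

# The same grouped test as above, unchanged regardless of the number of features
pvals <- long %>%
  group_by(group) %>%
  summarize(pval = wilcox.test(value ~ class)$p.value)
pvals
```

Adding a fourth or fifth feature column only adds rows to the long table; the test code itself stays the same.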
If you really want to involve broom, you can do
library(broom)
dat %>%
gather(group, value, -class) %>%
nest(-group) %>%
mutate(results = map(data, ~tidy(wilcox.test(value ~ class, data = .)))) %>%
select(-data) %>%
unnest(results)
# # A tibble: 2 x 5
# group statistic p.value method alternative
# <chr> <dbl> <dbl> <chr> <chr>
# 1 charge.Group3 170. 0.109 Wilcoxon rank sum test with continuity correction two.sided
# 2 hydrophobicity.Group3 183 0.0390 Wilcoxon rank sum test with continuity correction two.sided
This returns more columns, but you can keep just the p-value if you want.
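As a sketch of keeping only the p-value from the broom output, a final select() can be appended to the pipeline. The tibble toy below is a hypothetical stand-in for dat, and nest(data = -group) assumes tidyr 1.0.0 or later, where the nested column is named explicitly.

```r
library(tidyverse)
library(broom)

# Hypothetical toy data with the same column layout as `dat`
toy <- tibble(
  charge.Group3         = c(0.10, 0.20, 0.30, 0.40, 0.15, 0.25, 0.35, 0.45),
  hydrophobicity.Group3 = c(0.50, 0.10, 0.40, 0.20, 0.55, 0.05, 0.45, 0.25),
  class                 = rep(c("Negative", "Positive"), each = 4)
)

res_tidy <- toy %>%
  gather(group, value, -class) %>%
  nest(data = -group) %>%                  # one nested data frame per feature
  mutate(results = map(data, ~ tidy(wilcox.test(value ~ class, data = .)))) %>%
  select(-data) %>%
  unnest(cols = results) %>%
  select(group, p.value)                   # drop statistic, method, alternative

res_tidy
```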