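I have a dataset in which I want to calculate the median first-flower date for each origin (native and exotic) in each plot.
My ultimate goal is to test whether the median first-flower date differs significantly between native and exotic species under warmed conditions.
Here is a subset of my data: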
dput(umbs_firstflower[8:16,])
structure(list(site = c("umbs", "umbs", "umbs", "umbs", "umbs",
"umbs", "umbs", "umbs", "umbs"), plot = c("A1", "A1", "A1", "A2",
"A2", "A2", "A2", "A3", "A3"), species = c("Sogi", "Sone", "Syla",
"Cest", "Poco", "Popr", "Ruac", "Cest", "Dasp"), origin = c("Native",
"Native", "Native", "Exotic", "Exotic", "Exotic", "Exotic", "Exotic",
"Native"), state = c("ambient", "ambient", "ambient", "warmed",
"warmed", "warmed", "warmed", "ambient", "ambient"), first.flower = c("248",
"240", "227", "195", "169", "155", "156", "194", "185")), row.names = c(NA,
-9L), class = c("tbl_df", "tbl", "data.frame"))
Here is a sample of the code I wrote to try to do this:
umbs <- umbs_firstflower %>% group_by(plot, origin) %>% summarize(mean.firstflw = mean(as.numeric(date))) %>% ungroup()
Answer 0 (score: 0)
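There are many significance tests you could use here, so I will use one (kruskal.test()) to demonstrate the solution. Note, however, that there is disagreement about the best way to test for significant differences between the medians of 3 or more groups, so you may want to swap this test out for another.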
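Steps:
1. Create a grp variable that matches the combinations of interest in the categorical columns.
2. Use pivot_wider() to reshape the data, with the groups as columns of first.flower values.
library(tidyverse)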
library(magrittr)
# df is the data frame from the question
df_wide <-
  df %>%
  mutate(first.flower = as.numeric(first.flower),
         grp = case_when(
           origin == "Native" & state == "ambient" ~ "nat_ambi",
           origin == "Native" & state == "warmed"  ~ "nat_warm",
           origin == "Exotic" & state == "ambient" ~ "exo_ambi",
           origin == "Exotic" & state == "warmed"  ~ "exo_warm",
           TRUE ~ NA_character_
         )) %>%
  pivot_wider(id_cols = 1:5, names_from = grp, values_from = "first.flower")
# The subset has no warmed native rows, so no nat_warm column is created here.
df_wide
# A tibble: 9 x 8
  site  plot  species origin state   nat_ambi exo_warm exo_ambi
  <chr> <chr> <chr>   <chr>  <chr>      <dbl>    <dbl>    <dbl>
1 umbs  A1    Sogi    Native ambient      248       NA       NA
2 umbs  A1    Sone    Native ambient      240       NA       NA
3 umbs  A1    Syla    Native ambient      227       NA       NA
4 umbs  A2    Cest    Exotic warmed        NA      195       NA
5 umbs  A2    Poco    Exotic warmed        NA      169       NA
6 umbs  A2    Popr    Exotic warmed        NA      155       NA
7 umbs  A2    Ruac    Exotic warmed        NA      156       NA
8 umbs  A3    Cest    Exotic ambient       NA       NA      194
9 umbs  A3    Dasp    Native ambient      185       NA       NA
Finally, use the %$% exposition pipe from magrittr to run the significance test directly:
df_wide %$% kruskal.test(list(nat_ambi, exo_warm, exo_ambi))
	Kruskal-Wallis rank sum test
data:  list(nat_ambi, exo_warm, exo_ambi)
Kruskal-Wallis chi-squared = 4.2667, df = 2, p-value = 0.1184
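The question also asks for the median first-flower date per plot and origin, which the wide layout above does not produce. A minimal sketch of one way to get it, and to run the same test without reshaping, is below. It rebuilds the nine-row subset from the question so that it runs on its own. The names plot_medians, long and kw are illustrative, and grouping by state as well assumes that each plot receives a single treatment, which holds in the subset but is not confirmed for the full data.

```r
library(dplyr)

# Subset from the question, rebuilt here so the block is self-contained.
umbs_firstflower <- data.frame(
  plot   = c("A1", "A1", "A1", "A2", "A2", "A2", "A2", "A3", "A3"),
  origin = c("Native", "Native", "Native", "Exotic", "Exotic",
             "Exotic", "Exotic", "Exotic", "Native"),
  state  = c("ambient", "ambient", "ambient", "warmed", "warmed",
             "warmed", "warmed", "ambient", "ambient"),
  first.flower = c("248", "240", "227", "195", "169",
                   "155", "156", "194", "185"),
  stringsAsFactors = FALSE
)

# first.flower is stored as character, so convert it before any arithmetic.
long <- umbs_firstflower %>%
  mutate(first.flower = as.numeric(first.flower),
         # One label per origin x state combination (assumed grouping of interest).
         grp = interaction(origin, state, drop = TRUE))

# Median first-flower day for each plot x origin combination.
# state is kept as a grouping column on the assumption that it is constant within a plot.
plot_medians <- long %>%
  group_by(plot, origin, state) %>%
  summarize(median.firstflw = median(first.flower), .groups = "drop")

plot_medians

# kruskal.test() also has a formula method, so the long data can be tested directly.
kw <- kruskal.test(first.flower ~ grp, data = long)
kw
```

On this subset the formula call compares the same three groups as the wide version, so it reports the same statistic (chi-squared = 4.2667, df = 2). With the full dataset, grp would normally have four levels, one per origin and state combination.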