Using dplyr

Time: 2017-10-11 18:48:09

Tags: r dplyr tidyr

Suppose:

x <- data.frame(Day = c(1,2,3,4,5,6,7,8,9,10),
                var1 = c(5,4,2,3,4,5,1,2,3,4),
                var2 = c(3,6,2,3,4,5,7,8,1,2),
                var3 = c(1,2,3,4,6,2,4,7,8,4),
                var4 = c(1,3,7,5,3,7,2,3,1,2))

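Here the Day variable is numeric, but it corresponds to weekdays: 1 = Monday, 5 = Friday, 6 = Monday, 10 = Friday. I want to collapse all rows for the same weekday together and average the values by day: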

z <- data.frame(Day = c("Monday", "Tuesday", "Wednesday", "Thursday","Friday"),
                var1 = c(5,2.5,2,3,4),
                var2 = c(4,6.5,5,2,3),
                var3 = c(1.5,3,5,6,5),
                var4 = c(4,2.5,5,3,2.5))

2 answers:

Answer 0 (score: 3)

Use the modulo operator %%

library(dplyr)
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
x %>% group_by(Day = days[(Day - 1) %% 5 + 1]) %>%  # map Day 1..10 onto indices 1..5
  summarise_all(mean)

# A tibble: 5 x 5
#        Day  var1  var2  var3  var4
#      <chr> <dbl> <dbl> <dbl> <dbl>
#1    Friday   4.0   3.0   5.0   2.5
#2    Monday   5.0   4.0   1.5   4.0
#3  Thursday   3.0   2.0   6.0   3.0
#4   Tuesday   2.5   6.5   3.0   2.5
#5 Wednesday   2.0   5.0   5.0   5.0
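
To see why the index is written as (Day - 1) %% 5 + 1 rather than Day %% 5, here is a minimal base R sketch. The names idx, day_names and naive are illustrative only and do not come from the answer.

```r
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
Day <- 1:10

# Shift to 0-based, wrap at 5, shift back to 1-based
idx <- (Day - 1) %% 5 + 1
idx                      # 1 2 3 4 5 1 2 3 4 5
day_names <- days[idx]   # Monday ... Friday, twice

# Without the shift, Fridays (5 and 10) map to 0,
# and R silently drops zero indices when subsetting
naive <- Day %% 5
naive                    # 1 2 3 4 0 1 2 3 4 0
length(days[naive])      # 8, not 10
```

Because the lookup depends only on the value of Day, it gives the same labels regardless of the row order of x.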

Answer 1 (score: 1)

If the data are ordered, create the grouping variable by replicating the day names, then use summarise_at to get the mean of the 'var' columns

library(dplyr)
v1 <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
x %>%
  group_by(Day = factor(rep(v1, 2), levels = v1)) %>%
  summarise_at(vars(matches('var')), mean)
# A tibble: 5 x 5
#         Day  var1  var2  var3  var4
#       <fct> <dbl> <dbl> <dbl> <dbl>
# 1    Monday   5.0   4.0   1.5   4.0
# 2   Tuesday   2.5   6.5   3.0   2.5
# 3 Wednesday   2.0   5.0   5.0   5.0
# 4  Thursday   3.0   2.0   6.0   3.0
# 5    Friday   4.0   3.0   5.0   2.5
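
As a rough illustration of the two things this approach relies on, the base R sketch below shows that the factor levels control the output order, and that labelling by position only works when the rows are already sorted by Day. The shuffled vector Day_shuffled and the names g, by_position and by_value are made up for this example.

```r
v1 <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Levels follow v1, so grouped results come out Monday..Friday, not alphabetically
g <- factor(rep(v1, 2), levels = v1)
levels(g)

# Hypothetical case: the first five rows arrive out of order
Day_shuffled <- c(3, 1, 2, 5, 4, 6, 7, 8, 9, 10)
by_position <- rep(v1, 2)                        # label by row position
by_value <- v1[(Day_shuffled - 1) %% 5 + 1]      # label by the Day value itself

# Position-based labels are wrong for the five shuffled rows
by_position == by_value
```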

If the data are not ordered, create a key/value dataset, join it with the original dataset, group by 'Day', and take the mean as above

x1 <- data.frame(Day = 1:10, DayC = c("Monday", "Tuesday", 
        "Wednesday", "Thursday","Friday"), stringsAsFactors= FALSE)

x %>%
  left_join(x1, by = "Day") %>%             # name the key explicitly to avoid the join message
  group_by(Day = DayC) %>%
  summarise_at(vars(matches('var')), mean) %>%
  arrange(factor(Day, levels = v1))
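
In dplyr 1.0.0 and later, the scoped verbs summarise_all and summarise_at are superseded by across(). The following is a sketch of the same join-based idea written that way, assuming a recent dplyr is installed. The names key and res are illustrative, and the rows are shuffled on purpose to show that the result does not depend on row order.

```r
library(dplyr)

x <- data.frame(Day = c(1,2,3,4,5,6,7,8,9,10),
                var1 = c(5,4,2,3,4,5,1,2,3,4),
                var2 = c(3,6,2,3,4,5,7,8,1,2),
                var3 = c(1,2,3,4,6,2,4,7,8,4),
                var4 = c(1,3,7,5,3,7,2,3,1,2))
v1 <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Key/value table: one row per numeric Day, labelled with a factor in weekday order
key <- data.frame(Day = 1:10, DayC = factor(rep(v1, 2), levels = v1))

res <- x[sample(nrow(x)), ] %>%                 # shuffle rows (illustration only)
  left_join(key, by = "Day") %>%
  group_by(DayC) %>%
  summarise(across(starts_with("var"), mean))   # mean of var1..var4 per weekday
res
```

Because DayC is a factor with levels in weekday order, the summary rows come out Monday through Friday without a separate arrange() step.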