I have a data frame with the following structure:
library(tibble)  # tibble() replaces the deprecated data_frame()

record <- 1:10
store <- c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
week <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
sales_1 <- c(3, 3, 3, 3, 3, 2, 5, 1, 2, 10)
sales_2 <- c(1, 2, 4, 5, 6, 2, 3, 6, 1, 8)
price_1 <- runif(10, 2, 6)  # random prices between 2 and 6
price_2 <- runif(10, 2, 6)
df <- tibble(record, store, week, sales_1, sales_2, price_1, price_2)
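Suppose I want to gather and reshape it so that the 'record', 'store', and 'week' columns are all retained, but I also create a new column named 'category' that holds the trailing digit at the end of each 'sales_' and 'price_' column name. Finally, I would consolidate the values of the 'sales' and 'price' columns into two columns (just 'sales' and 'price'). The result would look like this: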
record | store | week | category | sales | price
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     1 |     1 |    1 |        1 |     3 |  2.51
     1 |     1 |    1 |        2 |     1 |  5.50
     2 |     2 |    1 |        1 |     3 |  4.86
The original discussion came from here. Thanks to @markdly, who predicted I would end up here...
Answer 0 (score: 2)
You can gather the sales and price columns, separate the key into a new header and category, and then spread the header:
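library(tidyr)  # gather(), separate(), spread(), and the %>% pipe

df %>%
  gather(key, val, sales_1:price_2) %>%                  # stack the four value columns into key/val pairs
  separate(key, c('header', 'category'), sep = '_') %>%  # split e.g. "sales_1" into header "sales" and category "1"
  spread(header, val)                                    # one column per header: price and sales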
# A tibble: 20 x 6
# record store week category price sales
# * <int> <dbl> <dbl> <chr> <dbl> <dbl>
# 1 1 1 1 1 5.005186 3
# 2 1 1 1 2 4.184387 1
# 3 2 2 1 1 3.790764 3
# 4 2 2 1 2 4.668122 2
# ...
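
As a possible alternative, newer versions of tidyr mark gather() and spread() as superseded in favour of pivot_longer(), which can do the same reshape in one call. The sketch below is not part of the original answer. It assumes tidyr 1.1.0 or later (for names_transform), rebuilds the example data so it runs on its own, and adds a set.seed() call purely so the random prices are reproducible. The name `long` is just an illustrative choice.

```r
library(tibble)
library(tidyr)

set.seed(1)  # assumption: fixed seed only so the random prices are reproducible

# Same structure as the data frame in the question
df <- tibble(
  record  = 1:10,
  store   = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
  week    = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  sales_1 = c(3, 3, 3, 3, 3, 2, 5, 1, 2, 10),
  sales_2 = c(1, 2, 4, 5, 6, 2, 3, 6, 1, 8),
  price_1 = runif(10, 2, 6),
  price_2 = runif(10, 2, 6)
)

long <- df %>%
  pivot_longer(
    cols = sales_1:price_2,
    # ".value" keeps the part before "_" as the output column name
    # (sales / price); the part after "_" goes into "category"
    names_to = c(".value", "category"),
    names_sep = "_",
    # convert category from character to integer
    names_transform = list(category = as.integer)
  )

print(long)
```

With this approach the columns come out in the order record, store, week, category, sales, price, and category is an integer rather than the character column produced by separate(). If the character type is acceptable, the names_transform argument can simply be dropped.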