Converting data from long format to wide format

Time: 2020-07-02 12:45:30

Tags: r dplyr data.table reshape tidyr

I'm looking for a way to reshape my data from this:

> test
      policyID startYear   product
1: G000246-000      2014 Product 1
2: G000246-000      2014 Product 2
3: G000246-000      2014 Product 3
4: G000246-000      2015 Product 1
5: G000246-000      2015 Product 2
6: G000246-000      2015 Product 3

to this:

     policyID       2014         2015
1: G000246-000    Product 1    Product 1
2: G000246-000    Product 2    Product 2
3: G000246-000    Product 3    Product 3

I tried:

  reshape(test, idvar = "policyID", timevar = "startYear", direction = "wide")

but I get only a single row:

      policyID product.2014 product.2015
1: G000246-000    Product 1    Product 1

What is the best way to get the result I want?

Data:

structure(list(policyID = c("G000246-000", "G000246-000", "G000246-000", 
"G000246-000", "G000246-000", "G000246-000"), startYear = c(2014, 
2014, 2014, 2015, 2015, 2015), product = c("Product 1", "Product 2", 
"Product 3", "Product 1", "Product 2", "Product 3")), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"))

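2 answers:

Answer 0 (score: 2)

One tidyr solution, although it produces a warning because no column uniquely identifies the rows within each policyID/startYear combination, would be: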

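library(tidyr)

test %>% 
  # naming id_cols avoids relying on argument position
  pivot_wider(id_cols = policyID, names_from = startYear, values_from = product) %>%
  # the duplicated keys yield list-columns, so expand them back into rows
  unnest(starts_with("2"))   # or unnest(everything()), depending on your other columns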

# A tibble: 3 x 3
#   policyID    `2014`    `2015`   
#   <chr>       <chr>     <chr>    
# 1 G000246-000 Product 1 Product 1
# 2 G000246-000 Product 2 Product 2
# 3 G000246-000 Product 3 Product 3
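
The warning arises because several products share the same policyID and startYear, so pivot_wider() cannot tell which product belongs in which output row. A minimal sketch of one way around this, assuming dplyr is also available, is to number the rows within each group first and use that number as part of the id. The helper column name row_id is hypothetical and not part of the original answer; the data is rebuilt as a tibble so the snippet is self-contained.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, rebuilt as a tibble
test <- tibble(
  policyID  = "G000246-000",
  startYear = rep(c(2014, 2015), each = 3),
  product   = rep(paste("Product", 1:3), times = 2)
)

wide <- test %>%
  group_by(policyID, startYear) %>%
  mutate(row_id = row_number()) %>%   # row_id: hypothetical helper, 1..n within each year
  ungroup() %>%
  pivot_wider(
    id_cols     = c(policyID, row_id),  # now every id/year pair is unique
    names_from  = startYear,
    values_from = product
  ) %>%
  select(-row_id)                       # drop the helper column again

wide
```

Because each cell now holds exactly one value, no list-columns are created and the unnest() step is not needed.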

Answer 1 (score: 1)

Very similar to markus's comment:

test[, dcast(.SD, policyID + product ~ startYear, value.var = "product")
     ][, !"product"]

      policyID      2014      2015
1: G000246-000 Product 1 Product 1
2: G000246-000 Product 2 Product 2
3: G000246-000 Product 3 Product 3
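
The call above works because product doubles as a row identifier within each year. If that cannot be relied on (for example, when the same product appears twice in one year), a variant is to build an explicit per-group counter with data.table's rowid(). This is only a sketch under that assumption; the helper column name row_id is hypothetical.

```r
library(data.table)

# Same data as in the question
test <- data.table(
  policyID  = "G000246-000",
  startYear = rep(c(2014, 2015), each = 3),
  product   = rep(paste("Product", 1:3), times = 2)
)

# Work on a copy so the original table is not modified by reference
tmp <- copy(test)[, row_id := rowid(policyID, startYear)]  # 1, 2, 3 within each year

# Cast with the counter on the left-hand side, then drop it
wide <- dcast(tmp, policyID + row_id ~ startYear, value.var = "product")[, !"row_id"]

wide
```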

Data

library(data.table)

test <- data.table(
  policyID = c("G000246-000"), 
  startYear = rep(c(2014,2015), each = 3), 
  product = paste("Product", 1:3)
)
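
Returning to the reshape() attempt in the question: it collapses to one row because policyID alone does not identify a row within a year, so reshape() keeps only the first record per id and time. A rough base R sketch, assuming a plain data.frame and a hypothetical helper column row_id, adds a within-group counter to idvar:

```r
# Same data as in the question, as a plain data.frame
df <- data.frame(
  policyID  = rep("G000246-000", 6),
  startYear = rep(c(2014, 2015), each = 3),
  product   = rep(paste("Product", 1:3), times = 2),
  stringsAsFactors = FALSE
)

# row_id (hypothetical helper): counts 1..n within each policyID/startYear group
df$row_id <- ave(seq_along(df$startYear), df$policyID, df$startYear, FUN = seq_along)

wide <- reshape(
  df,
  idvar     = c("policyID", "row_id"),  # the id is now unique within each year
  timevar   = "startYear",
  direction = "wide"
)
wide$row_id <- NULL                     # drop the helper column

wide
```

The value columns keep the product.2014 / product.2015 naming that reshape() produced in the question; they can be renamed afterwards if plain year names are preferred.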