Problem with pivot_longer and pivot_wider

Time: 2019-10-10 22:13:18

Tags: r tidyr

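I am trying to use pivot_longer and pivot_wider, and they work fine in a standalone script. However, as soon as I use them inside a Shiny reactive, I get the following error: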

Warning: Values in `value` are not uniquely identified; output will contain list-cols.
* Use `values_fn = list(value = list)` to suppress this warning.
* Use `values_fn = list(value = length)` to identify where the duplicates arise
* Use `values_fn = list(value = summary_fun)` to summarise duplicates
Warning: Error in : Can't cast `x` <list_of<double>> to `to` <double>.

Data

d1 <- tibble::tribble(
      ~Date, ~apple_count, ~apple_sale, ~banana_count, ~banana_sale, ~orange_count, ~orange_sale, ~peaches_count, ~peaches_sale, ~watermelon_count, ~watermelon_sale, ~strawberry_count, ~strawberry_sale,
  "8/19/19",  10882.05495,      239575,             0,            0,             0,            0,              0,             0,       9643.600102,           630827,                 0,                0,
  "8/20/19",    516.29755,       11281,             0,            0,             0,            0,              0,             0,       6041.538067,           510219,           1694.44,           684210,
  "8/21/19",     949.4084,       20150,             0,            0,             0,            0,              0,             0,       5371.758106,           565440,           9105.89,          3695182,
  "8/22/19",    3950.5318,       88679,             0,            0,             0,            0,              0,             0,       5238.308826,           576678,           6179.47,          2501560,
  "8/23/19",   2034.02055,       45672,             0,            0,             0,            0,              0,             0,        4994.43054,           518081,           7366.31,          2984563,
  "8/24/19",   1770.50415,       38553,             0,            0,             0,            0,              0,             0,       5001.303585,           551733,           6275.43,          2531400,
)

Here is the code.

library(dplyr)
library(tidyr)

d1 %>%
  pivot_longer(cols = -Date) %>%
  separate(name, into = c('partner', 'parameter'), sep = '_') %>%
  pivot_wider(names_from = parameter, values_from = value) %>%
  dplyr::group_by(partner) %>%
  dplyr::summarise(Total_Count = sum(as.numeric(count)),
                   Total_Sale = sum(as.numeric(sale)))

What could be causing this problem?

Just updated the data and code. I was using gather and spread before, but have now switched to pivot_longer and pivot_wider.

1 answer:

Answer 0 (score: 1)

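Your example data does not actually trigger the problem, because it contains no duplicated dates. I assume duplicated dates are the issue in your real dataset, so I added a duplicate row to the example data: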

d1 <- tibble::tribble(
    ~Date, ~apple_count, ~apple_sale, ~banana_count, ~banana_sale, ~orange_count, ~orange_sale, ~peaches_count, ~peaches_sale, ~watermelon_count, ~watermelon_sale, ~strawberry_count, ~strawberry_sale,
    "8/19/19",  10882.05495,      239575,             0,            0,             0,            0,              0,             0,       9643.600102,           630827,                 0,                0,
    "8/19/19",  10882.05495,      239575,             0,            0,             0,            0,              0,             0,       9643.600102,           630827,                 0,                0,
    "8/20/19",    516.29755,       11281,             0,            0,             0,            0,              0,             0,       6041.538067,           510219,           1694.44,           684210,
    "8/21/19",     949.4084,       20150,             0,            0,             0,            0,              0,             0,       5371.758106,           565440,           9105.89,          3695182,
    "8/22/19",    3950.5318,       88679,             0,            0,             0,            0,              0,             0,       5238.308826,           576678,           6179.47,          2501560,
    "8/23/19",   2034.02055,       45672,             0,            0,             0,            0,              0,             0,        4994.43054,           518081,           7366.31,          2984563,
    "8/24/19",   1770.50415,       38553,             0,            0,             0,            0,              0,             0,       5001.303585,           551733,           6275.43,          2531400,
)
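
To confirm that repeated dates are what triggers the warning, one quick check is to count the rows per `Date`. The sketch below uses a small hypothetical tibble (`toy`, which stands in for the real data); the same `count()`/`filter()` call should apply to `d1`:

```r
library(dplyr)
library(tibble)

# Hypothetical miniature of the data: "8/19/19" appears twice
toy <- tribble(
  ~Date,     ~apple_count, ~apple_sale,
  "8/19/19", 10,           100,
  "8/19/19", 10,           100,
  "8/20/19", 5,            50
)

# Keep only the dates that occur on more than one row
dup_dates <- toy %>%
  count(Date, name = "n_rows") %>%
  filter(n_rows > 1)

dup_dates
```

Any date listed in `dup_dates` will produce more than one value per cell when the data is widened, which is what the "not uniquely identified" warning refers to.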

You can solve this by giving each duplicated row a unique ID number before widening:

library(dplyr)
library(tidyr)

d1 %>%
    pivot_longer(cols = -Date) %>%
    separate(name, into = c('partner', 'parameter'), sep = '_') %>%
    # Number the repeats within each Date/partner/parameter combination so
    # that every value is uniquely identified when the data is widened
    group_by(Date, partner, parameter) %>%
    mutate(row_num = row_number()) %>%
    ungroup() %>%
    pivot_wider(names_from = parameter, values_from = value) %>%
    dplyr::group_by(partner) %>%
    dplyr::summarise(Total_Count = sum(as.numeric(count)),
                     Total_Sale = sum(as.numeric(sale)))
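
An alternative, hinted at by the warning message itself, is to let `pivot_wider()` summarise the duplicates through its `values_fn` argument. The following is a minimal sketch, not part of the original answer. It runs on a hypothetical three-row tibble (`toy`) and assumes tidyr 1.1.0 or later, where `values_fn` accepts a bare function; under tidyr 1.0.0 the form shown in the warning, `values_fn = list(value = sum)`, would be needed instead.

```r
library(dplyr)
library(tidyr)

# Hypothetical miniature of the data with one duplicated date
toy <- tibble::tribble(
  ~Date,     ~apple_count, ~apple_sale,
  "8/19/19", 10,           100,
  "8/19/19", 10,           100,
  "8/20/19", 5,            50
)

totals <- toy %>%
  pivot_longer(cols = -Date) %>%
  separate(name, into = c("partner", "parameter"), sep = "_") %>%
  # Sum values that share the same Date/partner/parameter instead of
  # letting pivot_wider() build list-columns
  pivot_wider(names_from = parameter, values_from = value, values_fn = sum) %>%
  group_by(partner) %>%
  summarise(Total_Count = sum(count),
            Total_Sale = sum(sale))

totals
```

Because the final step sums everything per partner anyway, summing the duplicates during the widening step should give the same totals as the row-number approach. If the duplicated rows are data-entry errors rather than genuine repeats, removing them first (for example with `distinct()`) may be more appropriate.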