Duplicates and list-columns when using pivot_wider in R

Asked: 2020-06-15 13:05:22

Tags: r dplyr tidyverse spread

I am having a lot of trouble pivoting a dataset into a wider format.

The data frame looks like this:

> head(waste)
# A tibble: 6 x 5
  council_name Period          RowText              ColText                         Data              
  <fct>        <fct>           <chr>                <chr>                           <chr>             
1 Adur         Apr 06 - Jun 06 Mixed glass          Tonnage collected for recycling 243.39            
2 Adur         Apr 06 - Jun 06 Mixed paper &  card  Tonnage collected for recycling 632.80999999999995
3 Adur         Apr 06 - Jun 06 Co mingled materials Tonnage collected for recycling 97.36             
4 Adur         Apr 06 - Jun 06 Green waste only     Tonnage collected for recycling 24.68             
5 Adur         Apr 06 - Jun 06 Fridges & Freezers   Tonnage collected for recycling 2.76              
6 Adur         Jul 06 - Sep 06 Mixed glass          Tonnage collected for recycling 251.51
> tail(waste)
# A tibble: 6 x 5
  council_name Period          RowText              ColText                                                        Data  
  <fct>        <fct>           <chr>                <chr>                                                          <chr> 
1 York         Jan 19 - Mar 19 Co mingled materials Tonnage collected for recycling                                635.22
2 York         Jan 19 - Mar 19 Co mingled materials Tonnage collected for recycling but actually rejected/disposed 0     
3 York         Jan 19 - Mar 19 Co mingled materials No. of households receiving a collection                       89160 
4 York         Jan 19 - Mar 19 Co mingled materials Tonnage Collected for Reuse                                    0     
5 York         Jan 19 - Mar 19 Co mingled materials Tonnage Collected for reuse but actually rejected / disposed   0     
6 York         Jan 19 - Mar 19 Co mingled materials Collected Co-mingled?                                          Yes 

I would like something like this:

> df.desired
  council_name   period               type     data
1         Adur 1st 2006      glass.tonnage  243.390
2         Adur 1st 2006      paper.tonnage  632.809
3         Adur 1st 2006 co.mingled.tonnage   97.360
4         Adur 1st 2006 greenwaste.tonnage   24.680
5         Adur 1st 2006    fridges.tonnage    2.760
6         Adur 2nd 2006      glass.tonnage  251.510
7         Adur 2nd 2006      paper.tonnage 6555.430

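So basically, I want to pivot the data frame wider, with the combinations of "RowText" and "ColText" as columns. It is not obvious from the rows shown here, but in the full data the "ColText" column takes several values, not just "Tonnage collected for recycling".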

I have tried:

w1 <- waste %>%
  pivot_wider(id_cols = c("Period","council_name", "RowText", "ColText"),
              names_from = c(RowText, ColText),
              values_from = Data,
              names_sep = "_")

But I keep getting the following warning:

Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates

The output looks like this:

Period council_name `Mixed glass_To… `Mixed paper & … `Co mingled mat… `Green waste on… `Fridges & Free… `Mixed glass_No… `Mixed paper & … `Co mingled mat…
  <fct>  <fct>        <list>           <list>           <list>           <list>           <list>           <list>           <list>           <list>          
1 Apr 0… Adur         <chr [1]>        <chr [1]>        <chr [1]>        <chr [1]>        <chr [1]>        <NULL>           <NULL>           <NULL>          
2 Jul 0… Adur         <chr [1]>        <chr [1]>        <chr [1]>        <chr [1]>        <chr [1]>        <NULL>           <NULL>           <NULL>          
3 Oct 0… Adur         <chr [1]>        <chr [1]>        <chr [1]>        <chr [1]>        <chr [1]>        <NULL>           <NULL>           <NULL>          
4 Jan 0… Adur         <chr [1]>        <chr [1]>        <chr [1]>        <chr [1]>        <NULL>           <chr [1]>        <chr [1]>        <chr [1]>       
5 Apr 0… Adur         <chr [1]>        <chr [1]>        <chr [1]>        <chr [1]>        <NULL>           <chr [1]>        <chr [1]>        <chr [1]>       
6 Jul 0… Adur         <chr [1]>        <chr [1]>        <chr [1]>        <chr [1]>        <NULL>           <chr [1]>        <chr [1]>        <chr [1]>
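
The warning's own suggestion, `values_fn = length`, can be used to see which output cells receive more than one value. The following is only a minimal sketch on a small made-up tibble (`toy` is hypothetical, not the real `waste` data) in which one row is deliberately repeated. It assumes tidyr 1.1.0 or later, where `values_fn` accepts a single function.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data with the same column names as `waste`;
# the first two rows share the same key on purpose.
toy <- tribble(
  ~council_name, ~Period,           ~RowText,      ~ColText,                          ~Data,
  "Adur",        "Apr 06 - Jun 06", "Mixed glass", "Tonnage collected for recycling", "243.39",
  "Adur",        "Apr 06 - Jun 06", "Mixed glass", "Tonnage collected for recycling", "243.39",
  "Adur",        "Jul 06 - Sep 06", "Mixed glass", "Tonnage collected for recycling", "251.51"
)

# Count how many values land in each output cell;
# any count above 1 marks a duplicated key.
cell_counts <- pivot_wider(
  toy,
  id_cols     = c(council_name, Period),
  names_from  = c(RowText, ColText),
  values_from = Data,
  names_sep   = "_",
  values_fn   = length
)
print(cell_counts)
```

In this sketch only "council_name" and "Period" are passed as `id_cols`; the columns used in `names_from` are left out, since they already define the new column names.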

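When I check for duplicates, they seem to arise because pivot_wider does not distinguish the different values of "Period", so all the "Data" values for a given "council_name" get piled together and I end up with list-columns.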
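
One way to test that explanation is to count rows per key outside of pivot_wider. The sketch below uses a hypothetical `toy` tibble (not the real `waste` data) and lists every combination of "council_name", "Period", "RowText" and "ColText" that occurs more than once. If such rows exist in the real data, the list-columns would come from genuinely repeated keys rather than from "Period" being ignored. The `distinct()` step is just one possible resolution and rests on the assumption that repeated rows are exact copies.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical toy data; the first two rows are exact duplicates on purpose.
toy <- tribble(
  ~council_name, ~Period,           ~RowText,      ~ColText,                          ~Data,
  "Adur",        "Apr 06 - Jun 06", "Mixed glass", "Tonnage collected for recycling", "243.39",
  "Adur",        "Apr 06 - Jun 06", "Mixed glass", "Tonnage collected for recycling", "243.39",
  "Adur",        "Jul 06 - Sep 06", "Mixed glass", "Tonnage collected for recycling", "251.51"
)

# Key combinations that appear more than once.
dupes <- toy %>%
  count(council_name, Period, RowText, ColText, name = "n_rows") %>%
  filter(n_rows > 1)
print(dupes)

# Assumption: repeated rows are exact copies, so dropping them is safe.
wide <- toy %>%
  distinct() %>%
  pivot_wider(
    id_cols     = c(council_name, Period),
    names_from  = c(RowText, ColText),
    values_from = Data,
    names_sep   = "_"
  )
print(wide)
```

If the repeated rows carry different "Data" values, `distinct()` will not collapse them, and a summary function passed through `values_fn` would be needed instead.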

I have run out of ideas after trying many other things, so any help is very welcome.

Thanks!

0 answers:

No answers yet