I'm having a lot of trouble pivoting a dataset into wide format.
The data frame looks like this:
> head(waste)
# A tibble: 6 x 5
council_name Period RowText ColText Data
<fct> <fct> <chr> <chr> <chr>
1 Adur Apr 06 - Jun 06 Mixed glass Tonnage collected for recycling 243.39
2 Adur Apr 06 - Jun 06 Mixed paper & card Tonnage collected for recycling 632.80999999999995
3 Adur Apr 06 - Jun 06 Co mingled materials Tonnage collected for recycling 97.36
4 Adur Apr 06 - Jun 06 Green waste only Tonnage collected for recycling 24.68
5 Adur Apr 06 - Jun 06 Fridges & Freezers Tonnage collected for recycling 2.76
6 Adur Jul 06 - Sep 06 Mixed glass Tonnage collected for recycling 251.51
> tail(waste)
# A tibble: 6 x 5
council_name Period RowText ColText Data
<fct> <fct> <chr> <chr> <chr>
1 York Jan 19 - Mar 19 Co mingled materials Tonnage collected for recycling 635.22
2 York Jan 19 - Mar 19 Co mingled materials Tonnage collected for recycling but actually rejected/disposed 0
3 York Jan 19 - Mar 19 Co mingled materials No. of households receiving a collection 89160
4 York Jan 19 - Mar 19 Co mingled materials Tonnage Collected for Reuse 0
5 York Jan 19 - Mar 19 Co mingled materials Tonnage Collected for reuse but actually rejected / disposed 0
6 York Jan 19 - Mar 19 Co mingled materials Collected Co-mingled? Yes
I want something like this:
> df.desired
council_name period type data
1 Adur 1st 2006 glass.tonnage 243.390
2 Adur 1st 2006 paper.tonnage 632.809
3 Adur 1st 2006 co.mingled.tonnage 97.360
4 Adur 1st 2006 greenwaste.tonnage 24.680
5 Adur 1st 2006 fridges.tonnage 2.760
6 Adur 2nd 2006 glass.tonnage 251.510
7 Adur 2nd 2006 paper.tonnage 6555.430
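So basically, I want to pivot the data frame wider, using the combination of `RowText` and `ColText` as the columns. It isn't obvious from the excerpt above, but across the full data the `ColText` column takes several values, not just "Tonnage collected for recycling".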
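To make "the combination of `RowText` and `ColText`" concrete, here is a minimal sketch of the kind of combined label I mean. It assumes `tidyr::unite()` and uses a tiny made-up tibble called `toy` (a hypothetical stand-in, not my real `waste` data); the `"_"` separator is also just an assumption.

```r
library(tibble)
library(tidyr)

# Toy stand-in for `waste` (made-up rows, not the real data)
toy <- tribble(
  ~council_name, ~Period,           ~RowText,      ~ColText,                                   ~Data,
  "Adur",        "Apr 06 - Jun 06", "Mixed glass", "Tonnage collected for recycling",          "243.39",
  "Adur",        "Apr 06 - Jun 06", "Mixed glass", "No. of households receiving a collection", "100"
)

# Paste RowText and ColText together into one label column called `type`;
# unite() drops the two input columns by default
labelled <- unite(toy, "type", RowText, ColText, sep = "_")
labelled$type
```

Each distinct value of `type` is what I would like to end up as its own column.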
I tried:
w1 <- waste %>%
  pivot_wider(id_cols = c("Period", "council_name", "RowText", "ColText"),
              names_from = c(RowText, ColText),
              values_from = Data,
              names_sep = "_")
But I keep getting the following warning:
Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates
The output looks like this:
Period council_name `Mixed glass_To… `Mixed paper & … `Co mingled mat… `Green waste on… `Fridges & Free… `Mixed glass_No… `Mixed paper & … `Co mingled mat…
<fct> <fct> <list> <list> <list> <list> <list> <list> <list> <list>
1 Apr 0… Adur <chr [1]> <chr [1]> <chr [1]> <chr [1]> <chr [1]> <NULL> <NULL> <NULL>
2 Jul 0… Adur <chr [1]> <chr [1]> <chr [1]> <chr [1]> <chr [1]> <NULL> <NULL> <NULL>
3 Oct 0… Adur <chr [1]> <chr [1]> <chr [1]> <chr [1]> <chr [1]> <NULL> <NULL> <NULL>
4 Jan 0… Adur <chr [1]> <chr [1]> <chr [1]> <chr [1]> <NULL> <chr [1]> <chr [1]> <chr [1]>
5 Apr 0… Adur <chr [1]> <chr [1]> <chr [1]> <chr [1]> <NULL> <chr [1]> <chr [1]> <chr [1]>
6 Jul 0… Adur <chr [1]> <chr [1]> <chr [1]> <chr [1]> <NULL> <chr [1]> <chr [1]> <chr [1]>
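When I check for the duplicates, they seem to arise because pivot_wider does not distinguish between the different values of `Period`, so all the `Data` values for a given `council_name` get piled together and I end up with list-columns.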
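For reference, this is roughly how such a duplicate check can be done: count the rows per combination of the key columns and keep the combinations that occur more than once. It is only a sketch, assuming `dplyr` is loaded, run here on a made-up tibble `toy` (hypothetical, with one deliberately repeated row) rather than on the real `waste` data.

```r
library(dplyr)
library(tibble)

# Toy stand-in for `waste`; the second row deliberately repeats the first
toy <- tribble(
  ~council_name, ~Period,           ~RowText,      ~ColText,                          ~Data,
  "Adur",        "Apr 06 - Jun 06", "Mixed glass", "Tonnage collected for recycling", "243.39",
  "Adur",        "Apr 06 - Jun 06", "Mixed glass", "Tonnage collected for recycling", "243.39",
  "Adur",        "Jul 06 - Sep 06", "Mixed glass", "Tonnage collected for recycling", "251.51"
)

# Count rows per key combination and keep only combinations seen more than once
dupes <- toy %>%
  count(council_name, Period, RowText, ColText) %>%
  filter(n > 1)
dupes
```

Any row returned here is a key combination that maps to more than one `Data` value, which is the situation the `pivot_wider` warning describes.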
I've run out of ideas after trying a lot of other things, so any help is very welcome.
Thanks!