Merging rows that share the same value after spreading columns

Time: 2018-02-22 01:13:12

Tags: r

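I want to combine these rows based on the values in Date, so that each Date appears only once, with the values for the corresponding age groups collected on a single row. This is the result of using spread() from tidyr. If you look at the output, the dates are repeated.

The output of dput(dataframe) is: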

structure(list(Date = c("201740", "201740", "201740", "201740", 
"201741", "201741", "201741", "201741", "201742", "201742", "201742", 
"201742", "201743", "201743", "201743", "201743", "201743", "201743", 
"201744", "201744", "201744", "201744", "201744", "201744", "201745", 
"201745", "201745", "201745", "201745", "201745", "201746", "201746", 
"201746", "201746", "201746", "201746", "201747", "201747", "201747", 
"201747", "201747", "201747", "201748", "201748", "201748", "201748", 
"201748", "201748", "201749", "201749", "201749", "201749", "201749", 
"201749", "201750", "201750", "201750", "201750", "201750", "201750", 
"201751", "201751", "201751", "201751", "201751", "201751", "201752", 
"201752", "201752", "201752", "201752", "201752", "201801", "201801", 
"201801", "201801", "201801", "201801", "201802", "201802", "201802", 
"201802", "201802", "201802", "201803", "201803", "201803", "201803", 
"201803", "201803", "201804", "201804", "201804", "201804", "201804", 
"201804", "201805"), `0-4 yr` = c(NA, 0.1, NA, NA, NA, 0.2, NA, 
NA, NA, 0.2, NA, NA, NA, NA, 0.3, NA, NA, NA, NA, NA, 0.6, NA, 
NA, NA, NA, NA, 0.7, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA, 
1.8, NA, NA, NA, NA, NA, 2.7, NA, NA, NA, NA, NA, 3.3, NA, NA, 
NA, NA, NA, 5.2, NA, NA, NA, NA, NA, 7.9, NA, NA, NA, NA, NA, 
13.7, NA, NA, NA, NA, NA, 18.3, NA, NA, NA, NA, NA, 23.3, NA, 
NA, NA, NA, NA, 28.2, NA, NA, NA, NA, NA, 35.6, NA, NA, NA, 41.9
), `18-49 yr` = c(NA, 0.1, NA, NA, 0.1, NA, NA, NA, NA, 0.2, 
NA, NA, NA, 0.2, NA, NA, NA, NA, NA, 0.4, NA, NA, NA, NA, NA, 
0.5, NA, NA, NA, NA, NA, 0.7, NA, NA, NA, NA, NA, 1, NA, NA, 
NA, NA, NA, 1.4, NA, NA, NA, NA, NA, 1.9, NA, NA, NA, NA, NA, 
2.7, NA, NA, NA, NA, NA, 4.2, NA, NA, NA, NA, NA, 6.6, NA, NA, 
NA, NA, NA, 9.3, NA, NA, NA, NA, NA, 12.5, NA, NA, NA, NA, NA, 
15.2, NA, NA, NA, NA, NA, 17.7, NA, NA, NA, NA, NA), `5-17 yr` = c(0, 
NA, NA, NA, 0.1, NA, NA, NA, 0.1, NA, NA, NA, 0.1, NA, NA, NA, 
NA, NA, 0.2, NA, NA, NA, NA, NA, 0.3, NA, NA, NA, NA, NA, 0.5, 
NA, NA, NA, NA, NA, 0.7, NA, NA, NA, NA, NA, 0.9, NA, NA, NA, 
NA, NA, 1.2, NA, NA, NA, NA, NA, 1.7, NA, NA, NA, NA, NA, 2.5, 
NA, NA, NA, NA, NA, 3.5, NA, NA, NA, NA, NA, 4.3, NA, NA, NA, 
NA, NA, 5.9, NA, NA, NA, NA, NA, 7.3, NA, NA, NA, NA, NA, 9, 
NA, NA, NA, NA, NA, NA), `50-64 yr` = c(NA, NA, 0.2, NA, NA, 
NA, 0.3, NA, NA, NA, 0.5, NA, NA, NA, NA, NA, 0.8, NA, NA, NA, 
NA, NA, 1.1, NA, NA, NA, NA, NA, 1.6, NA, NA, NA, NA, NA, 2.2, 
NA, NA, NA, NA, NA, 3.1, NA, NA, NA, NA, 4.1, NA, NA, NA, NA, 
NA, 5.4, NA, NA, NA, NA, NA, NA, 8.1, NA, NA, NA, NA, NA, 13.7, 
NA, NA, NA, NA, 21.7, NA, NA, NA, NA, NA, NA, 32.6, NA, NA, NA, 
NA, NA, 42.9, NA, NA, NA, NA, NA, 52, NA, NA, NA, NA, NA, 60.2, 
NA, NA), `65+ yr` = c(NA, NA, NA, 0.5, NA, NA, NA, 1, NA, NA, 
NA, 2.1, NA, NA, NA, NA, NA, 3, NA, NA, NA, NA, NA, 3.9, NA, 
NA, NA, NA, NA, 5.1, NA, NA, NA, NA, NA, 6.5, NA, NA, NA, NA, 
NA, 9.2, NA, NA, NA, NA, NA, 14.3, NA, NA, NA, NA, NA, 20.5, 
NA, NA, NA, NA, NA, 30.2, NA, NA, NA, NA, NA, 50.2, NA, NA, NA, 
NA, NA, 90.1, NA, NA, NA, NA, NA, 137.9, NA, NA, NA, NA, NA, 
179.5, NA, NA, NA, NA, NA, 217.4, NA, NA, NA, NA, NA, 251.8, 
NA)), .Names = c("Date", "0-4 yr", "18-49 yr", "5-17 yr", "50-64 yr", 
"65+ yr"), class = "data.frame", row.names = c(NA, 97L))


1 Answer:

Answer 0 (score: 2)

You could try aggregating. This could be done before you spread, but here it is applied after the fact:

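library(tidyverse)
# Collapse the rows of each Date by summing every column, ignoring NAs.
# across() replaces the deprecated funs() helper (requires dplyr >= 1.0).
dataframe %>%
    group_by(Date) %>%
    summarise(across(everything(), ~ sum(.x, na.rm = TRUE)))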

I used sum() here because it is not clear how you want to summarise.

A more appropriate way might be:
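One thing to keep in mind with sum(): with na.rm = TRUE it returns 0 for a group whose values are all NA, so a missing age group would show up as 0 rather than NA. If each Date has at most one real value per age group, a small helper that picks the first non-missing value is one possible alternative. This is only a sketch: the toy data and the first_non_na() helper are made up for illustration, not taken from the question.

```r
library(dplyr)

# Toy data in the same shape as the question (hypothetical values)
toy <- data.frame(
  Date = c("201740", "201740", "201741", "201741"),
  `0-4 yr` = c(NA, 0.1, 0.2, NA),
  `65+ yr` = c(0.5, NA, NA, NA),
  check.names = FALSE
)

# Hypothetical helper: first non-missing value, or NA if the group has none
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else x[1]
}

# One row per Date; columns with no value in a group stay NA instead of 0
collapsed <- toy %>%
  group_by(Date) %>%
  summarise(across(everything(), first_non_na))

print(collapsed)
```

With this helper the `65+ yr` value for 201741 stays NA, whereas sum(., na.rm = TRUE) would report 0 there.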

dataframe %>%
    gather("age_group", "value", -Date) %>%
    filter(!is.na(value)) %>%
    spread(age_group, value)

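Here we gather the data back into the long format that was probably your original input, filter out the NA rows, and then spread again.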
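In newer versions of tidyr (1.0 and later), gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same reshape could be sketched as below, assuming each Date and age group pair has at most one non-missing value. The toy data is hypothetical and only mimics the shape of the question's data frame.

```r
library(dplyr)
library(tidyr)

# Toy data in the same shape as the question (hypothetical values)
toy <- data.frame(
  Date = c("201740", "201740", "201741", "201741"),
  `0-4 yr` = c(NA, 0.1, 0.2, NA),
  `65+ yr` = c(0.5, NA, NA, NA),
  check.names = FALSE
)

wide <- toy %>%
  # Long format: one row per Date / age_group pair, dropping the NA rows
  pivot_longer(-Date, names_to = "age_group", values_to = "value",
               values_drop_na = TRUE) %>%
  # Back to wide format: one row per Date, one column per age group
  pivot_wider(names_from = age_group, values_from = value)

print(wide)
```

If a Date and age group pair had more than one non-missing value, spread() would stop with an error and pivot_wider() would warn and return list-columns, so in that case the aggregation approach above is the safer choice.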