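I want to combine these columns based on the values in "Date", so that each unique date appears only once, with the values for the corresponding age groups collapsed into that row. This is the result of using spread() from tidyr. If you look at the Date values, they are duplicated.
dput(dataframe) reads: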
structure(list(Date = c("201740", "201740", "201740", "201740",
"201741", "201741", "201741", "201741", "201742", "201742", "201742",
"201742", "201743", "201743", "201743", "201743", "201743", "201743",
"201744", "201744", "201744", "201744", "201744", "201744", "201745",
"201745", "201745", "201745", "201745", "201745", "201746", "201746",
"201746", "201746", "201746", "201746", "201747", "201747", "201747",
"201747", "201747", "201747", "201748", "201748", "201748", "201748",
"201748", "201748", "201749", "201749", "201749", "201749", "201749",
"201749", "201750", "201750", "201750", "201750", "201750", "201750",
"201751", "201751", "201751", "201751", "201751", "201751", "201752",
"201752", "201752", "201752", "201752", "201752", "201801", "201801",
"201801", "201801", "201801", "201801", "201802", "201802", "201802",
"201802", "201802", "201802", "201803", "201803", "201803", "201803",
"201803", "201803", "201804", "201804", "201804", "201804", "201804",
"201804", "201805"), `0-4 yr` = c(NA, 0.1, NA, NA, NA, 0.2, NA,
NA, NA, 0.2, NA, NA, NA, NA, 0.3, NA, NA, NA, NA, NA, 0.6, NA,
NA, NA, NA, NA, 0.7, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA,
1.8, NA, NA, NA, NA, NA, 2.7, NA, NA, NA, NA, NA, 3.3, NA, NA,
NA, NA, NA, 5.2, NA, NA, NA, NA, NA, 7.9, NA, NA, NA, NA, NA,
13.7, NA, NA, NA, NA, NA, 18.3, NA, NA, NA, NA, NA, 23.3, NA,
NA, NA, NA, NA, 28.2, NA, NA, NA, NA, NA, 35.6, NA, NA, NA, 41.9
), `18-49 yr` = c(NA, 0.1, NA, NA, 0.1, NA, NA, NA, NA, 0.2,
NA, NA, NA, 0.2, NA, NA, NA, NA, NA, 0.4, NA, NA, NA, NA, NA,
0.5, NA, NA, NA, NA, NA, 0.7, NA, NA, NA, NA, NA, 1, NA, NA,
NA, NA, NA, 1.4, NA, NA, NA, NA, NA, 1.9, NA, NA, NA, NA, NA,
2.7, NA, NA, NA, NA, NA, 4.2, NA, NA, NA, NA, NA, 6.6, NA, NA,
NA, NA, NA, 9.3, NA, NA, NA, NA, NA, 12.5, NA, NA, NA, NA, NA,
15.2, NA, NA, NA, NA, NA, 17.7, NA, NA, NA, NA, NA), `5-17 yr` = c(0,
NA, NA, NA, 0.1, NA, NA, NA, 0.1, NA, NA, NA, 0.1, NA, NA, NA,
NA, NA, 0.2, NA, NA, NA, NA, NA, 0.3, NA, NA, NA, NA, NA, 0.5,
NA, NA, NA, NA, NA, 0.7, NA, NA, NA, NA, NA, 0.9, NA, NA, NA,
NA, NA, 1.2, NA, NA, NA, NA, NA, 1.7, NA, NA, NA, NA, NA, 2.5,
NA, NA, NA, NA, NA, 3.5, NA, NA, NA, NA, NA, 4.3, NA, NA, NA,
NA, NA, 5.9, NA, NA, NA, NA, NA, 7.3, NA, NA, NA, NA, NA, 9,
NA, NA, NA, NA, NA, NA), `50-64 yr` = c(NA, NA, 0.2, NA, NA,
NA, 0.3, NA, NA, NA, 0.5, NA, NA, NA, NA, NA, 0.8, NA, NA, NA,
NA, NA, 1.1, NA, NA, NA, NA, NA, 1.6, NA, NA, NA, NA, NA, 2.2,
NA, NA, NA, NA, NA, 3.1, NA, NA, NA, NA, 4.1, NA, NA, NA, NA,
NA, 5.4, NA, NA, NA, NA, NA, NA, 8.1, NA, NA, NA, NA, NA, 13.7,
NA, NA, NA, NA, 21.7, NA, NA, NA, NA, NA, NA, 32.6, NA, NA, NA,
NA, NA, 42.9, NA, NA, NA, NA, NA, 52, NA, NA, NA, NA, NA, 60.2,
NA, NA), `65+ yr` = c(NA, NA, NA, 0.5, NA, NA, NA, 1, NA, NA,
NA, 2.1, NA, NA, NA, NA, NA, 3, NA, NA, NA, NA, NA, 3.9, NA,
NA, NA, NA, NA, 5.1, NA, NA, NA, NA, NA, 6.5, NA, NA, NA, NA,
NA, 9.2, NA, NA, NA, NA, NA, 14.3, NA, NA, NA, NA, NA, 20.5,
NA, NA, NA, NA, NA, 30.2, NA, NA, NA, NA, NA, 50.2, NA, NA, NA,
NA, NA, 90.1, NA, NA, NA, NA, NA, 137.9, NA, NA, NA, NA, NA,
179.5, NA, NA, NA, NA, NA, 217.4, NA, NA, NA, NA, NA, 251.8,
NA)), .Names = c("Date", "0-4 yr", "18-49 yr", "5-17 yr", "50-64 yr",
"65+ yr"), class = "data.frame", row.names = c(NA, 97L))
Answer 0 (score: 2)
You could try an aggregation. This could be done before you spread, but it works afterwards as well:
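library(tidyverse)
# Collapse to one row per Date by summing each age-group column, ignoring NAs.
# across() needs dplyr >= 1.0.0; it replaces the deprecated funs().
dataframe %>%
  group_by(Date) %>%
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE)))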
I used sum() here because it is not clear how you want to summarise.
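If each Date has at most one non-missing value per age group, as the dput() output suggests, then taking the first non-missing value gives the same collapse without implying that anything is being added up. A minimal sketch on a small made-up data frame (toy and first_non_na are hypothetical names, not from the question, and the values are invented):

```r
library(dplyr)

# Toy data mimicking the shape of the question's data frame (made-up values).
toy <- data.frame(
  Date = c("201740", "201740", "201741", "201741"),
  `0-4 yr` = c(NA, 0.1, 0.2, NA),
  `5-17 yr` = c(0, NA, NA, NA),
  check.names = FALSE
)

# Hypothetical helper: first non-missing value, or NA if the group has none.
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else x[1]
}

# One row per Date; each age-group column keeps its single observed value.
collapsed <- toy %>%
  group_by(Date) %>%
  summarise(across(everything(), first_non_na))
```

Unlike sum(., na.rm = TRUE), this leaves a Date with no observation for an age group as NA rather than turning it into 0.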
A more suitable approach might be:
dataframe %>%
  gather("age_group", "value", -Date) %>%  # wide -> long: one row per Date/age group
  filter(!is.na(value)) %>%                # drop the NA padding
  spread(age_group, value)                 # long -> wide: one row per Date
Here we gather the data back into what was probably your original input, filter out the missing values, and then spread it again.
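In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same round trip can be sketched with those functions, assuming a recent tidyr and again using a small made-up data frame (toy and reshaped are hypothetical names):

```r
library(dplyr)
library(tidyr)

# Toy data mimicking the shape of the question's data frame (made-up values).
toy <- data.frame(
  Date = c("201740", "201740", "201741", "201741"),
  `0-4 yr` = c(NA, 0.1, 0.2, NA),
  `5-17 yr` = c(0, NA, NA, 0.1),
  check.names = FALSE
)

reshaped <- toy %>%
  # wide -> long, dropping the NA padding in the same step
  pivot_longer(-Date, names_to = "age_group", values_to = "value",
               values_drop_na = TRUE) %>%
  # long -> wide: one row per Date, one column per age group
  pivot_wider(names_from = age_group, values_from = value)
```

The values_drop_na = TRUE argument takes the place of the separate filter(!is.na(value)) step. The column order of the result may differ from what spread() returns.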