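I want to expand across three dimensions in R. I want to combine daily county-level information for three years into one data frame that contains every county for every year, including every month and every day (e.g. 31). The problem is that the `using` data does not contain an observation for every county/day, because the event did not occur in that county on that date. For me, these are zero observations.
To create my master file, I made a list of all counties. I then want to expand it so that there is exactly one observation for each county/year/month/day combination.
I'll spare you my full code. I have a data.frame containing the counties, and I will generate the years, months, and days. So far I have been using expand from the tidyverse.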
Edit:
library(tidyverse)
# This is my list of all counties from an official source
counties <- data.frame("county" = c("A", "B" ,"c"))
# This is what I have, the data includes counties (not all),
# for year (not all),
# months (not all)
# and days (not all)
using <- data.frame("county" = c("A", "A", "A", "B", "B", "B", "B"),
                    "year" = c(2015, 2016, 2017, 2015, 2016, 2017, 2018),
                    "month" = c(1, 2, 7, 2, 3, 2, 4),
                    "day" = c(1, 2, 22, 3, 21, 14, 5))
# This is my attempt to get at least all county/month combinations
county.month <- expand(counties, county, 1:12)
# But I wish I could get all county/year/month/day combinations
Best,
Daniel
Answer 0: (score: 0)
I'm not sure what output you want, but I think you want tidyr's complete function rather than expand?
For example:
using %>%
complete(month, nesting(county, year))
# A tibble: 35 x 4
month county year day
<dbl> <fct> <dbl> <dbl>
1 1 A 2015 1
2 1 A 2016 NA
3 1 A 2017 NA
4 1 B 2015 NA
5 1 B 2016 NA
6 1 B 2017 NA
7 1 B 2018 NA
8 2 A 2015 NA
9 2 A 2016 2
10 2 A 2017 NA
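If the goal is the full county/year/month/day grid with zeros for missing combinations, complete() can also be given explicit value ranges together with its fill argument. The following is only a sketch under some assumptions: the `event` column is a hypothetical indicator added for illustration, the year range 2015:2018 is read off the example data, and `counties` and `using` are the objects from the question.

```r
library(dplyr)
library(tidyr)

counties <- data.frame(county = c("A", "B", "c"), stringsAsFactors = FALSE)
using <- data.frame(
  county = c("A", "A", "A", "B", "B", "B", "B"),
  year   = c(2015, 2016, 2017, 2015, 2016, 2017, 2018),
  month  = c(1, 2, 7, 2, 3, 2, 4),
  day    = c(1, 2, 22, 3, 21, 14, 5),
  stringsAsFactors = FALSE
)

filled <- using %>%
  mutate(event = 1) %>%            # hypothetical indicator: 1 = event observed
  complete(
    county = counties$county,      # also include counties absent from `using`
    year   = 2015:2018,            # assumed study period
    month  = 1:12,
    day    = 1:31,                 # note: still includes impossible dates such as Feb 31
    fill   = list(event = 0)       # combinations without an observation become 0
  )
```

Rows that were present in `using` keep event = 1, and every other county/year/month/day combination gets event = 0.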
Answer 1: (score: 0)
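This approach should do what you asked: it expands all possible county/year/month/day combinations (assuming every month has 31 days... ;)). The key is to use factors.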
library(tidyverse)
counties <- data.frame("county" = c("A", "B", "c"), stringsAsFactors = FALSE)
using <- tibble("county" = c("A", "A", "A", "B", "B", "B", "B"),
                "year" = c(2015, 2016, 2017, 2015, 2016, 2017, 2018),
                "month" = c(1, 2, 7, 2, 3, 2, 4),
                "day" = c(1, 2, 22, 3, 21, 14, 5))
using %>%
  # turn every column into a factor so that unobserved levels can be added
  mutate_if(is.character, as_factor) %>%
  mutate_if(is.numeric, as.ordered) %>%
  # add the counties, months and days that never occur in the data
  mutate(county = fct_expand(county, counties$county),
         month = fct_expand(month, as.character(1:12)),
         day = fct_expand(day, as.character(1:31))) %>%
  # expand() uses all factor levels, not only the observed values
  expand(county, year, month, day) %>%
  arrange(year, month, day)
# A tibble: 4,464 x 4
county year month day
<fct> <ord> <ord> <ord>
1 A 2015 1 1
2 B 2015 1 1
3 c 2015 1 1
4 A 2015 1 2
5 B 2015 1 2
6 c 2015 1 2
7 A 2015 1 3
8 B 2015 1 3
9 c 2015 1 3
10 A 2015 1 5
# … with 4,454 more rows
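As a possible alternative that avoids factors, the same 4,464-row grid can be built by passing the value ranges directly to crossing(), and the impossible dates (e.g. February 31) can then be dropped. This is only a sketch: the year range 2015:2018 is an assumption taken from the example data, and `grid`, `valid` and `date` are names made up for illustration.

```r
library(dplyr)
library(tidyr)

counties <- data.frame(county = c("A", "B", "c"), stringsAsFactors = FALSE)

# All county/year/month/day combinations, with 31 days for every month
grid <- crossing(
  county = counties$county,
  year   = 2015:2018,   # assumption: study period taken from the example data
  month  = 1:12,
  day    = 1:31
)

# Build a real date; combinations that are not calendar dates parse to NA
valid <- grid %>%
  mutate(date = as.Date(sprintf("%d-%02d-%02d", year, month, day),
                        format = "%Y-%m-%d")) %>%
  filter(!is.na(date))
```

Because the columns stay plain integers, arrange(year, month, day) sorts them numerically, whereas in the factor version the levels added by fct_expand() come after the observed ones (which is why day 5 follows day 3 in the output above).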
Answer 2: (score: 0)
Perhaps what you want is every date in the years covered by your data. In that case, use the seq() function with by = "1 day".
library(tidyverse)
library(lubridate)
counties <- data.frame("county" = c("A", "B" ,"c"), stringsAsFactors = FALSE)
start_date <- as_date("2015-01-01")
end_date <- as_date("2018-12-31")
all_dates <- seq(start_date, end_date, by = "1 day")
allcounties_alldates <- crossing(counties, all_dates)
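To get from this county/date grid to the zero observations mentioned in the question, the observed data could be joined back onto it. A minimal sketch, assuming the `using` data from the question; the column names `date` and `n_events` are hypothetical and only chosen for this example.

```r
library(dplyr)
library(tidyr)
library(lubridate)

counties <- data.frame(county = c("A", "B", "c"), stringsAsFactors = FALSE)
using <- data.frame(
  county = c("A", "A", "A", "B", "B", "B", "B"),
  year   = c(2015, 2016, 2017, 2015, 2016, 2017, 2018),
  month  = c(1, 2, 7, 2, 3, 2, 4),
  day    = c(1, 2, 22, 3, 21, 14, 5),
  stringsAsFactors = FALSE
)

# Count the observed events per county and calendar date
events <- using %>%
  mutate(date = make_date(year, month, day)) %>%
  count(county, date, name = "n_events")

# Full county x date grid; combinations without an event are set to zero
all_dates <- seq(as_date("2015-01-01"), as_date("2018-12-31"), by = "1 day")
panel <- crossing(county = counties$county, date = all_dates) %>%
  left_join(events, by = c("county", "date")) %>%
  mutate(n_events = replace_na(n_events, 0L))
```

Working with real dates means only valid calendar days appear, so there is no need to assume 31 days per month.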