How to expand using multiple variables or dimensions

Time: 2019-05-07 13:12:32

Tags: r tidyverse expand dimensions

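I want to expand across three dimensions in R. I want to combine daily county-level information for three years into one data frame that contains all counties for all years, including all months and all days (e.g. 31). The problem is that the data I am using does not have an observation for every county/day. That is because the event did not occur in that county on that day. So, for me, these are zero observations.

To create my master file, I made a list of all counties. I then want to expand it so that there is one unique observation for each county/year/month/day combination.

I'll spare you most of the code. I have a data.frame that contains the counties. I will generate the year, month and day. So far I have been using expand from the tidyverse.

Edit: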

library(tidyverse)

# This is my list of all counties from an official source
counties <- data.frame("county" = c("A", "B" ,"c"))

# This is what I have, the data includes counties (not all),
# for year (not all),
# months (not all)
# and days (not all)

using <- data.frame("county"  = c("A", "A", "A", "B", "B", "B", "B"),
                    "year"  = c(2015,2016,2017,2015,2016,2017,2018),
                    "month" = c(1,2,7,2,3,2,4),
                    "day" = c(1,2,22,3,21,14,5))

# This is my attempt to get at least all county/month combinations
county.month <- expand(counties, county, 1:12)

# But I wish I could get all county/year/month/day combinations

Best,

Daniel

3 answers:

Answer 0: (score: 0)

I'm not sure what output you want, but I think you want tidyr's complete function rather than expand.

For example:

using %>% 
    complete(month, nesting(county, year))


# A tibble: 35 x 4
   month county  year   day
   <dbl> <fct>  <dbl> <dbl>
 1     1 A       2015     1
 2     1 A       2016    NA
 3     1 A       2017    NA
 4     1 B       2015    NA
 5     1 B       2016    NA
 6     1 B       2017    NA
 7     1 B       2018    NA
 8     2 A       2015    NA
 9     2 A       2016     2
10     2 A       2017    NA
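
If the goal is the full county/year/month/day grid with zeros for missing days, complete can also be given explicit value ranges instead of only the values seen in the data. The following is a minimal sketch, not a definitive solution: the column n (one event per observed row) is a hypothetical addition, and the ranges 2015:2018, 1:12 and 1:31 are assumptions taken from the question.

```r
library(dplyr)
library(tidyr)

counties <- data.frame(county = c("A", "B", "c"), stringsAsFactors = FALSE)
using <- data.frame(
  county = c("A", "A", "A", "B", "B", "B", "B"),
  year   = c(2015, 2016, 2017, 2015, 2016, 2017, 2018),
  month  = c(1, 2, 7, 2, 3, 2, 4),
  day    = c(1, 2, 22, 3, 21, 14, 5),
  stringsAsFactors = FALSE
)

# Hypothetical event count: each observed row counts as one event
using$n <- 1

# Supply the full range for every dimension; combinations that are
# missing from the data are added, and their n is filled with 0
full <- using %>%
  complete(
    county = counties$county,
    year   = 2015:2018,
    month  = 1:12,
    day    = 1:31,
    fill   = list(n = 0)
  )
```

Like the question, this treats every month as having 31 days, so the result contains impossible dates such as February 31.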

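Answer 1: (score: 0)

This approach should do what you are asking: it enumerates all possible county/year/month/day combinations (assuming every month has 31 days... ;)). The key is to use factors.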

library(tidyverse)
counties <- data.frame("county" = c("A", "B" ,"c"), stringsAsFactors = FALSE)
using <- tibble("county"  = c("A", "A", "A", "B", "B", "B", "B"),
                    "year"  = c(2015,2016,2017,2015,2016,2017,2018),
                    "month" = c(1,2,7,2,3,2,4),
                    "day" = c(1,2,22,3,21,14,5))

using %>% 
  mutate_if(is.character, as_factor) %>%
  mutate_if(is.numeric, as.ordered) %>%
  mutate(county = fct_expand(county, counties$county),
         month = fct_expand(month, as.character(1:12)),
         day = fct_expand(day, as.character(1:31))) %>%
  expand(county, year, month, day) %>%
  arrange(year, month, day)

# A tibble: 4,464 x 4
   county year  month day  
   <fct>  <ord> <ord> <ord>
 1 A      2015  1     1    
 2 B      2015  1     1    
 3 c      2015  1     1    
 4 A      2015  1     2    
 5 B      2015  1     2    
 6 c      2015  1     2    
 7 A      2015  1     3    
 8 B      2015  1     3    
 9 c      2015  1     3    
10 A      2015  1     5    
# … with 4,454 more rows
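
Note that the printed output jumps from day 3 to day 5, which suggests that the levels added by fct_expand are appended after the observed ones rather than sorted numerically. If plain numeric columns in natural order are preferred, the same grid can be sketched without factors using tidyr's crossing. This is an alternative to the factor approach above, and the year range 2015:2018 is an assumption taken from the sample data.

```r
library(tidyr)

counties <- data.frame(county = c("A", "B", "c"), stringsAsFactors = FALSE)

# One row per county/year/month/day combination, sorted by each column
# (31 days per month is assumed, as in the factor approach)
grid <- crossing(
  county = counties$county,
  year   = 2015:2018,
  month  = 1:12,
  day    = 1:31
)
```

The result has the same 4,464 rows, but year, month and day stay integers, so later joins and date arithmetic need no conversion back from factors.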

Answer 2: (score: 0)

Perhaps what you want is every date in the years covered by your data. In that case, use the seq() function with by = "1 day".

library(tidyverse)
library(lubridate)
counties <- data.frame("county" = c("A", "B" ,"c"), stringsAsFactors = FALSE)

start_date<-as_date("2015-01-01")
end_date<-as_date("2018-12-31")

all_dates<-seq(start_date, end_date, by='1 day')

allcounties_alldates<-crossing(counties, all_dates)
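
To turn the county/date grid into the master file described in the question, the observed rows can be joined onto it and the missing days set to zero. This is a rough sketch under a few assumptions: the using data frame is the sample from the question, each of its rows counts as one event, and the column names date and n_events are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(lubridate)

counties <- data.frame(county = c("A", "B", "c"), stringsAsFactors = FALSE)
using <- data.frame(
  county = c("A", "A", "A", "B", "B", "B", "B"),
  year   = c(2015, 2016, 2017, 2015, 2016, 2017, 2018),
  month  = c(1, 2, 7, 2, 3, 2, 4),
  day    = c(1, 2, 22, 3, 21, 14, 5),
  stringsAsFactors = FALSE
)

# Every real calendar date in the period, crossed with every county
all_dates <- seq(as_date("2015-01-01"), as_date("2018-12-31"), by = "1 day")
allcounties_alldates <- crossing(counties, date = all_dates)

# Count the observed events per county and date
events <- using %>%
  mutate(date = make_date(year, month, day)) %>%
  count(county, date, name = "n_events")

# Keep every county/date row; days without an event get a count of 0
master <- allcounties_alldates %>%
  left_join(events, by = c("county", "date")) %>%
  mutate(n_events = replace_na(n_events, 0L))
```

Because the grid is built from real dates, it handles month lengths and the 2016 leap day without the 31-days-per-month assumption.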