I have a set of survey data in which each survey spans multiple days. Here is a sample of the data in its current form:
| Survey | Dates | Result |
|--------|--------------|--------|
| A | 11/30 - 12/1 | 33% |
| B | 12/2 - 12/4 | 26% |
| C | 12/4 - 12/5 | 39% |
This example can be created with:
frame <- data.frame(Survey = c('A','B','C'),
Dates = c('11/30 - 12/1', '12/2 - 12/4', '12/4 - 12/5'),
Result = c('33%', '26%', '39%'))
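What I want is a column for each date, with the survey's result placed in the cell whenever that date falls within the survey's range. It would look like this: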
| Survey | 11/30 | 12/1 | 12/2 | 12/3 | 12/4 | 12/5 |
|--------|-------|------|------|------|------|------|
| A | 33% | 33% | | | | |
| B | | | 26% | 26% | 26% | |
| C | | | | | 39% | 39% |
Any help would be greatly appreciated.
Answer 0 (score: 3)
Here is an idea:
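library(dplyr)
library(tidyr)

frame %>%
  # one row per range endpoint
  separate_rows(Dates, sep = " - ") %>%
  # the data has no year, so as.Date() fills in the current one
  mutate(Dates = as.Date(Dates, format = "%m/%d")) %>%
  group_by(Survey) %>%
  # add a row for every missing day between the two endpoints
  complete(Dates = seq(min(Dates), max(Dates), 1)) %>%
  # carry each survey's result down into the added rows
  fill(Result) %>%
  # one column per date
  spread(Dates, Result)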
which gives:
# Survey `2017-11-30` `2017-12-01` `2017-12-02` `2017-12-03` `2017-12-04` `2017-12-05`
#* <fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr>
#1 A 33% 33% NA NA NA NA
#2 B NA NA 26% 26% 26% NA
#3 C NA NA NA NA 39% 39%
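If you want the column names back in month/day form and blank cells instead of NA, as in the question's desired output, something along these lines might work. This is only a sketch: it assumes tidyr 1.0 or later (for `pivot_wider()`), and it hard-codes the year 2017 purely so the result is reproducible, since the data itself carries no year. Note that `format()` zero-pads the day, so the columns come out as `12/01` rather than `12/1`.

```r
library(dplyr)
library(tidyr)

frame <- data.frame(Survey = c("A", "B", "C"),
                    Dates = c("11/30 - 12/1", "12/2 - 12/4", "12/4 - 12/5"),
                    Result = c("33%", "26%", "39%"),
                    stringsAsFactors = FALSE)

wide <- frame %>%
  separate_rows(Dates, sep = " - ") %>%
  # assumption: every date is in 2017; the source data has no year
  mutate(Dates = as.Date(paste0("2017/", Dates), format = "%Y/%m/%d")) %>%
  group_by(Survey) %>%
  complete(Dates = seq(min(Dates), max(Dates), by = "day")) %>%
  fill(Result) %>%
  ungroup() %>%
  # sort first so the new columns appear in chronological order
  arrange(Dates) %>%
  mutate(Dates = format(Dates, "%m/%d")) %>%
  # a named list for values_fill replaces missing cells with ""
  pivot_wider(names_from = Dates, values_from = Result,
              values_fill = list(Result = ""))

wide
```

A range that crosses New Year (say `12/30 - 1/2`) would need the year handled per endpoint, which this sketch does not attempt.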
Answer 1 (score: 0)
A tidyverse solution, but it requires you to preprocess the Dates column a little:
#install.packages('tidyverse')
library(tidyverse)

dframe <- data.frame(Survey = c('A','B','C'),
                     Dates = c('11/30 - 12/1', '12/2 - 12/4', '12/4 - 12/5'),
                     Result = c('33%', '26%', '39%'), stringsAsFactors = FALSE)

# Expand each "start - end" range into every day in between, still joined by " - ".
# vapply() keeps Dates a plain character column; lapply() would make it a list.
dframe$Dates <- vapply(strsplit(dframe$Dates, split = " - "), function(x) {
  x <- strptime(x, "%m/%d")
  x <- seq(min(x), max(x), '1 day')
  paste0(strftime(x, "%m/%d"), collapse = " - ")
}, character(1))

dframe %>%
  separate_rows(Dates, sep = " - ") %>%
  spread(Dates, Result)
This should give:
Survey 11/30 12/01 12/02 12/03 12/04 12/05
A 33% 33% <NA> <NA> <NA> <NA>
B <NA> <NA> 26% 26% 26% <NA>
C <NA> <NA> <NA> <NA> 39% 39%
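For anyone who would rather skip the tidyverse, the same range expansion can be sketched in base R with Date objects. `expand_range` is a made-up helper name, and the default year of 2017 is an assumption (the data has none); treat this as a rough illustration rather than a polished solution.

```r
dframe <- data.frame(Survey = c("A", "B", "C"),
                     Dates = c("11/30 - 12/1", "12/2 - 12/4", "12/4 - 12/5"),
                     Result = c("33%", "26%", "39%"),
                     stringsAsFactors = FALSE)

# Hypothetical helper: turn "m/d - m/d" into every Date in that range.
# The year is an assumption, since the source data does not include one.
expand_range <- function(range, year = 2017) {
  ends <- as.Date(paste(year, strsplit(range, " - ", fixed = TRUE)[[1]], sep = "/"),
                  format = "%Y/%m/%d")
  seq(min(ends), max(ends), by = "day")
}

days <- lapply(dframe$Dates, expand_range)   # one Date vector per survey
all_days <- sort(unique(do.call(c, days)))   # every day covered by any survey

# One row per survey: the result where the day is in range, "" elsewhere.
wide <- t(sapply(seq_along(days), function(i) {
  ifelse(all_days %in% days[[i]], dframe$Result[i], "")
}))
dimnames(wide) <- list(dframe$Survey, format(all_days, "%m/%d"))

wide
```

The result is a character matrix with surveys as row names; wrap it in `as.data.frame()` if you need a data frame.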
I hope this helps.