This post is related to the recently posted question "transform date into dummy variable in R", but it is more complicated. I have this data:
df=structure(list(Data = structure(c(4L, 5L, 6L, 7L, 8L, 9L, 10L,
1L, 2L, 3L), .Label = c("01.01.2018", "02.01.2018", "03.01.2018",
"25.12.2017", "26.12.2017", "27.12.2017", "28.12.2017", "29.12.2017",
"30.12.2017", "31.12.2017"), class = "factor"), Y = 1:10), .Names = c("Data",
"Y"), class = "data.frame", row.names = c(NA, -10L))
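I had to convert the dates into day-of-week dummy variables: 1 if the date falls on that weekday, 0 otherwise.
The solution provided by Paweł Kozielski-Romaneczko helped me: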
library(dplyr)
library(lubridate)
library(tidyr)
df %>%
  mutate(weekDay = lubridate::dmy(Data) %>% weekdays(),
         value = 1) %>%
  spread(key = weekDay, value = value, fill = 0)
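One caveat with this approach: weekdays() returns names in the session's locale, and spread() orders the new columns alphabetically, so the result may not come out in a Mon-to-Sun layout. A minimal base R sketch, assuming English abbreviations in Monday-first order are wanted. The names day_labels, day_index and weekday_dummies are illustrative, and the data frame is rebuilt by hand from the sample above with plain character dates:

```r
# Sample dates from the question, kept as character strings (assumption: not factors)
df <- data.frame(
  Data = c("25.12.2017", "26.12.2017", "27.12.2017", "28.12.2017", "29.12.2017",
           "30.12.2017", "31.12.2017", "01.01.2018", "02.01.2018", "03.01.2018"),
  Y = 1:10,
  stringsAsFactors = FALSE
)

# Column labels in the order they should appear
day_labels <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# "%u" is the ISO weekday number (1 = Monday ... 7 = Sunday); it does not depend on the locale
day_index <- as.integer(format(as.Date(df$Data, format = "%d.%m.%Y"), "%u"))

# Build one 0/1 column per weekday, Monday first
weekday_dummies <- sapply(seq_along(day_labels), function(i) as.integer(day_index == i))
colnames(weekday_dummies) <- day_labels

result <- cbind(df, weekday_dummies)
print(result)
```

This is a swapped-in base R technique, not the dplyr/tidyr pipeline above; it trades the compact pipeline for a fixed column order and fixed labels.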
But now I have to add columns for holidays, i.e. is the date a holiday?
I have an auxiliary dataset that indicates which dates are holidays:
df1=structure(list(Data = structure(1:2, .Label = c("01.01.2018",
"08.03.2018"), class = "factor"), name = structure(c(2L, 1L), .Label = c("International Women's Day",
"New Year"), class = "factor")), .Names = c("Data", "name"), class = "data.frame", row.names = c(NA,
-2L))
So I need the holidays in the output, like this:
Data        Y   Mon  Tue  Wed  Thu  Fri  Sat  Sun  New Year  International Women's Day
25.12.2017  1   1    0    0    0    0    0    0    0         0
26.12.2017  2   0    1    0    0    0    0    0    0         0
27.12.2017  3   0    0    1    0    0    0    0    0         0
28.12.2017  4   0    0    0    1    0    0    0    0         0
29.12.2017  5   0    0    0    0    1    0    0    0         0
30.12.2017  6   0    0    0    0    0    1    0    0         0
31.12.2017  7   0    0    0    0    0    0    1    0         0
01.01.2018  8   1    0    0    0    0    0    0    1         0
02.01.2018  9   0    1    0    0    0    0    0    0         0
03.01.2018  10  0    0    1    0    0    0    0    0         0
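How can I add the holidays as dummy variables, with the column names taken from the auxiliary dataset?
P.S. If you think this topic belongs in my previous post, let me know and I will delete it.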
Answer 0 (score: 1)
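Using your example, I extend it here. Depending on your needs, use left_join or full_join. I used full_join, so "International Women's Day" shows up in the result even though its date is not in df.
I use as.character to clean up name, because in your example it is a factor. If name is not a factor, as.character is not needed. Finally, I drop the No_Holiday column.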
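To see why as.character matters here: ifelse() drops the factor attributes, so a factor passed through it comes back as its internal level codes rather than its labels. A small base R sketch (holiday_name is a made-up vector for illustration, not part of the data above):

```r
# Illustrative factor with one missing value, mimicking `name` after the join
holiday_name <- factor(c("New Year", NA, "International Women's Day"))

# Without as.character(): the non-missing entries turn into level codes, not labels
raw <- ifelse(is.na(holiday_name), "No_Holiday", holiday_name)

# With as.character(): the labels survive
clean <- ifelse(is.na(holiday_name), "No_Holiday", as.character(holiday_name))

print(raw)
print(clean)
```

Only the second form gives values that are usable as column names when spreading.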
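library(dplyr)
library(tidyr)

df %>%
  full_join(df1, by = "Data") %>%  # also keeps holidays whose dates are not in df
  mutate(weekDay = lubridate::dmy(Data) %>% weekdays(),
         # flag holidays before the NA names are replaced on the next line
         holiday = ifelse(is.na(name), 0, 1),
         name = ifelse(is.na(name), "No_Holiday", as.character(name)),
         value = 1) %>%
  spread(key = weekDay, value = value, fill = 0) %>%
  spread(key = name, value = holiday, fill = 0) %>%
  select(-No_Holiday)  # drop the placeholder column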
    Data        Y   Friday  Monday  Saturday  Sunday  Thursday  Tuesday  Wednesday  International Women's Day  New Year
1   01.01.2018  8   0       1       0         0       0         0        0          0                          1
2   02.01.2018  9   0       0       0         0       0         1        0          0                          0
3   03.01.2018  10  0       0       0         0       0         0        1          0                          0
4   08.03.2018  NA  0       0       0         0       1         0        0          1                          0
5   25.12.2017  1   0       1       0         0       0         0        0          0                          0
6   26.12.2017  2   0       0       0         0       0         1        0          0                          0
7   27.12.2017  3   0       0       0         0       0         0        1          0                          0
8   28.12.2017  4   0       0       0         0       1         0        0          0                          0
9   29.12.2017  5   1       0       0         0       0         0        0          0                          0
10  30.12.2017  6   0       0       1         0       0         0        0          0                          0
11  31.12.2017  7   0       0       0         1       0         0        0          0                          0
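If the extra row for 08.03.2018 (with Y = NA) is not wanted, but a column for every holiday in df1 still is, one option is to spread the holiday table first and then left_join it onto df. A sketch under those assumptions; holidays_wide and holiday_cols are names introduced here, and both data frames are rebuilt with plain character columns rather than factors:

```r
library(dplyr)
library(tidyr)

# Data from the question, rebuilt with character columns (assumption: not factors)
df <- data.frame(
  Data = c("25.12.2017", "26.12.2017", "27.12.2017", "28.12.2017", "29.12.2017",
           "30.12.2017", "31.12.2017", "01.01.2018", "02.01.2018", "03.01.2018"),
  Y = 1:10,
  stringsAsFactors = FALSE
)
df1 <- data.frame(
  Data = c("01.01.2018", "08.03.2018"),
  name = c("New Year", "International Women's Day"),
  stringsAsFactors = FALSE
)

# Spread the holiday table on its own, so every holiday gets a column
# even when none of its dates occur in df
holidays_wide <- df1 %>%
  mutate(holiday = 1) %>%
  spread(key = name, value = holiday, fill = 0)

# left_join keeps exactly the rows of df; dates without a holiday come back as NA
result <- df %>%
  left_join(holidays_wide, by = "Data")

# Replace the NAs introduced by the join with 0
holiday_cols <- setdiff(names(holidays_wide), "Data")
result[holiday_cols] <- lapply(result[holiday_cols], function(x) ifelse(is.na(x), 0, x))

print(result)
```

Because left_join leaves the rows and order of df unchanged, the weekday step from the pipeline above can be applied to result afterwards.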