I have a sample data frame in which one column stores three letters per row. The data frame also has two additional columns, Date and Colour:
Alphabet Date Colour
ABC 2018-09-10 green
DEF 2017-06-11 red
GHI 2016-05-12 blue
JKL NA yellow
MNO NA orange
PQR Unknown brown
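Some dates in this data frame are missing or unknown. I have another data frame that also has an Alphabet column and a Date column. The second data frame contains the dates that are missing from the first one: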
Alphabet Date
JKL 2017-06-07
MNO 2018-08-03
PQR 2019-10-07
STU 2019-11-08
VWX 2019-12-08
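I want to fill in the missing dates in the first data frame by matching the Alphabet values across the two data frames and copying the corresponding dates from the second data frame into the first.
Desired output: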
Alphabet Date Colour
ABC 2018-09-10 green
DEF 2017-06-11 red
GHI 2016-05-12 blue
JKL 2017-06-07 yellow
MNO 2018-08-03 orange
PQR 2019-10-07 brown
Thanks for your help.
Answer 0 (score: 1)
One option is a join with data.table
library(data.table)
setDT(df1)[df2, Date := i.Date, on = .(Alphabet)]
df1
# Alphabet Date Colour
#1: ABC 2018-09-10 green
#2: DEF 2017-06-11 red
#3: GHI 2016-05-12 blue
#4: JKL 2017-06-07 yellow
#5: MNO 2018-08-03 orange
#6: PQR 2019-10-07 brown
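Using the new 'df2a' dataset (which also contains rows that have no match in df1), update only the rows whose Date is NA or "Unknown"
i1 <- is.na(df1$Date)|df1$Date %in% "Unknown"
setDT(df1)[df2a[df2a$Alphabet %in% df1$Alphabet[i1],],
     Date := i.Date, on = .(Alphabet)]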
df1
# Alphabet Date Colour
#1: ABC 2018-09-10 green
#2: DEF 2017-06-11 red
#3: GHI 2016-05-12 blue
#4: JKL 2017-06-07 yellow
#5: MNO 2018-08-03 orange
#6: PQR 2019-10-07 brown
Or use match from base R (this assumes every Alphabet in df2 is present in df1)
i1 <- match(df2$Alphabet, df1$Alphabet)
df1$Date[i1] <- df2$Date
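The match approach above overwrites every matched row, and it would not work as written when the second data frame has an Alphabet that is absent from df1, because match returns NA for it. The following is a minimal base R sketch that fills only the missing or "Unknown" dates and skips unmatched rows. It rebuilds the sample data from the question; the names missing_rows, pos and found are illustrative and do not come from the original answer.

```r
# Sample data copied from the question; df2a includes the unmatched rows STU and VWX
df1 <- data.frame(
  Alphabet = c("ABC", "DEF", "GHI", "JKL", "MNO", "PQR"),
  Date     = c("2018-09-10", "2017-06-11", "2016-05-12", NA, NA, "Unknown"),
  Colour   = c("green", "red", "blue", "yellow", "orange", "brown"),
  stringsAsFactors = FALSE
)
df2a <- data.frame(
  Alphabet = c("JKL", "MNO", "PQR", "STU", "VWX"),
  Date     = c("2017-06-07", "2018-08-03", "2019-10-07", "2019-11-08", "2019-12-08"),
  stringsAsFactors = FALSE
)

# Rows of df1 whose Date is missing or recorded as "Unknown" (illustrative name)
missing_rows <- which(is.na(df1$Date) | df1$Date %in% "Unknown")

# Position of each of those Alphabet values in df2a (NA when there is no match)
pos <- match(df1$Alphabet[missing_rows], df2a$Alphabet)

# Fill only the rows that actually have a match in df2a
found <- !is.na(pos)
df1$Date[missing_rows[found]] <- df2a$Date[pos[found]]

df1
```

Looking up from df1 into df2a, rather than the other way round, means the extra rows in df2a never produce an NA index, and dates that were already present in df1 are left untouched.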
Data
df1 <- structure(list(Alphabet = c("ABC", "DEF", "GHI", "JKL", "MNO",
"PQR"), Date = c("2018-09-10", "2017-06-11", "2016-05-12", NA,
NA, "Unknown"), Colour = c("green", "red", "blue", "yellow",
"orange", "brown")), class = "data.frame", row.names = c(NA,
-6L))
df2 <- structure(list(Alphabet = c("JKL", "MNO", "PQR"), Date = c("2017-06-07",
"2018-08-03", "2019-10-07")), class = "data.frame", row.names = c(NA,
-3L))
df2a <- structure(list(Alphabet = c("JKL", "MNO", "PQR", "STU", "VWX"
), Date = c("2017-06-07", "2018-08-03", "2019-10-07", "2019-11-08",
"2019-12-08")), class = "data.frame", row.names = c(NA, -5L))
Answer 1 (score: 1)
Using dplyr, we can left_join df1 and df2, then use coalesce to fill in the missing values.
library(dplyr)
left_join(df1, df2, by = "Alphabet") %>%
  mutate(Date = coalesce(Date.y, Date.x)) %>%
  select(-Date.x, -Date.y)
# Alphabet Colour Date
#1 ABC green 2018-09-10
#2 DEF red 2017-06-11
#3 GHI blue 2016-05-12
#4 JKL yellow 2017-06-07
#5 MNO orange 2018-08-03
#6 PQR brown 2019-10-07
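The pipeline above prefers the date from the second data frame whenever one exists. As a variation, not part of the original answer, the sketch below keeps the date already in df1 and falls back to the second data frame only when the existing value is NA or "Unknown". It assumes dplyr is installed, rebuilds the sample data from the question, and stores the output in a variable named result, which is an illustrative name.

```r
library(dplyr)

# Sample data copied from the question; df2a includes the unmatched rows STU and VWX
df1 <- data.frame(
  Alphabet = c("ABC", "DEF", "GHI", "JKL", "MNO", "PQR"),
  Date     = c("2018-09-10", "2017-06-11", "2016-05-12", NA, NA, "Unknown"),
  Colour   = c("green", "red", "blue", "yellow", "orange", "brown"),
  stringsAsFactors = FALSE
)
df2a <- data.frame(
  Alphabet = c("JKL", "MNO", "PQR", "STU", "VWX"),
  Date     = c("2017-06-07", "2018-08-03", "2019-10-07", "2019-11-08", "2019-12-08"),
  stringsAsFactors = FALSE
)

# left_join keeps all rows of df1 and drops df2a rows with no match (STU, VWX)
result <- left_join(df1, df2a, by = "Alphabet") %>%
  # na_if turns the "Unknown" placeholder into NA so that coalesce can replace it;
  # Date.x is the original df1 date, Date.y the candidate from df2a
  mutate(Date = coalesce(na_if(Date.x, "Unknown"), Date.y)) %>%
  # Restore the column order shown in the desired output
  select(Alphabet, Date, Colour)

result
```

Selecting the columns by name at the end also returns them in the Alphabet, Date, Colour order of the desired output, whereas dropping Date.x and Date.y leaves Date as the last column.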