Given two data frames containing dates:
d1
# dates
# 2016-08-01
# 2016-08-02
# 2016-08-03
# 2016-08-04
d2
# dates
# 2016-08-02
# 2016-08-03
# 2016-08-04
# 2016-08-05
# 2016-08-06
How can I create a third data frame containing the values that are not common to both?
d3
# dates
# 2016-08-01
# 2016-08-05
# 2016-08-06
Data:
df1 <- structure(list(dates = structure(c(17014, 17015, 17016, 17017 ),
class = "Date")), .Names = "dates", row.names = c(NA, -4L), class =
"data.frame")
df2 <- structure(list(dates = structure(c(17015, 17016, 17017, 17018,
17019), class = "Date")), .Names = "dates", row.names = c(NA, -5L), class
= "data.frame")
Answer 0 (score: 2)
Suppose you have two vectors x and y; the elements they do not share are
c(x[!(x %in% y)], y[!(y %in% x)])
If you are working with data frames, then as long as your dates column is "character" or "Date" rather than "factor", you can do
rbind(subset(df1, !(df1$dates %in% df2$dates)),
subset(df2, !(df2$dates %in% df1$dates)))
A simple vector example
x <- 1:5
y <- 3:8
c(x[!(x %in% y)], y[!(y %in% x)])
# [1] 1 2 6 7 8
Vectors of class "Date"
x <- seq(from = as.Date("2016-01-01"), length = 5, by = 1)
y <- seq(from = as.Date("2016-01-03"), length = 5, by = 1)
c(x[!(x %in% y)], y[!(y %in% x)])
# [1] "2016-01-01" "2016-01-02" "2016-01-06" "2016-01-07"
The example data frames from the question
rbind(subset(df1, !(df1$dates %in% df2$dates)),
subset(df2, !(df2$dates %in% df1$dates)))
# dates
#1 2016-08-01
#4 2016-08-05
#5 2016-08-06
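The rbind/subset pattern above can be wrapped in a small function so it works for any pair of data frames and any column. This is only a sketch: the name non_common_rows and its arguments are made up for illustration, and the data frames are rebuilt here so the snippet runs on its own.

```r
# Hypothetical helper (not part of the original answer): returns the rows of
# `a` and `b` whose value in column `col` does not occur in the other data frame.
non_common_rows <- function(a, b, col) {
  only_a <- a[!(a[[col]] %in% b[[col]]), , drop = FALSE]
  only_b <- b[!(b[[col]] %in% a[[col]]), , drop = FALSE]
  out <- rbind(only_a, only_b)
  rownames(out) <- NULL  # renumber the rows 1..n
  out
}

# Same dates as in the question
df1 <- data.frame(dates = seq(as.Date("2016-08-01"), by = "day", length.out = 4))
df2 <- data.frame(dates = seq(as.Date("2016-08-02"), by = "day", length.out = 5))

df3 <- non_common_rows(df1, df2, "dates")
print(df3)
```

Using drop = FALSE keeps the result a data frame even when only one column is present, which is the case here.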
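Answer 1 (score: 1)
You could simply use a join, as others have already shown. Personally, I like using the set operations in base R (see ?setops). Like this: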
# if they are just character/factor variables
setdiff(d1$dates, d2$dates)
# if they are date variables
setdiff(as.character(d1$dates), as.character(d2$dates))
# then convert back to as.Date(setdiff(...))
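Note that setdiff(x, y) is one-directional: it returns only the elements of x that are missing from y. To get the d3 from the question you need both directions. A minimal sketch, assuming d1 and d2 hold the dates from the question (they are rebuilt here so the snippet is self-contained); going through character strings sidesteps any question of whether setdiff() keeps the "Date" class:

```r
# Rebuild the question's data (assumed contents of d1 and d2)
d1 <- data.frame(dates = as.Date(c("2016-08-01", "2016-08-02",
                                   "2016-08-03", "2016-08-04")))
d2 <- data.frame(dates = as.Date(c("2016-08-02", "2016-08-03", "2016-08-04",
                                   "2016-08-05", "2016-08-06")))

a <- as.character(d1$dates)
b <- as.character(d2$dates)

# Symmetric difference: in d1 but not d2, followed by in d2 but not d1
only_in_one <- c(setdiff(a, b), setdiff(b, a))

# Convert back to Date and wrap in a data frame
d3 <- data.frame(dates = as.Date(only_in_one))
print(d3)
```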
With the setdiff() result you can then filter the data.frame, or, as @ZheyuanLi indirectly pointed out, exclude rows by matching:
# If they are date variables
d2[!as.character(d2$dates) %in% as.character(d1$dates),]
# If they are character/factor variables
d2[!d2$dates %in% d1$dates,]
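The join approach mentioned at the top of this answer is not shown in the thread. One possible base R version, offered only as a sketch: do a full outer merge() and keep the rows that came from just one side. The marker columns in_d1 and in_d2 are invented for this example, and d1 and d2 are rebuilt from the question's dates.

```r
# Rebuild the question's data (assumed contents of d1 and d2)
d1 <- data.frame(dates = as.Date(c("2016-08-01", "2016-08-02",
                                   "2016-08-03", "2016-08-04")))
d2 <- data.frame(dates = as.Date(c("2016-08-02", "2016-08-03", "2016-08-04",
                                   "2016-08-05", "2016-08-06")))

# Marker columns (hypothetical names) record which side each row came from
d1$in_d1 <- TRUE
d2$in_d2 <- TRUE

# Full outer join on dates; rows missing from one side get NA in its marker
m <- merge(d1, d2, by = "dates", all = TRUE)

# Keep the dates that appear in exactly one of the two data frames
d3 <- m[is.na(m$in_d1) | is.na(m$in_d2), "dates", drop = FALSE]
print(d3)
```

Compared with the %in% filters above, this costs an extra merge but handles both directions in a single step.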