Consider these two small data frames, a and b:
a <- structure(list(date = structure(c(16071, 16072, 16073, 16074, 16075,
16076, 16077, 16078, 16079, 16080,
16081), class = "Date"),
value = c(3L, 5L, 6L, 6L, 15L, 2L, 7L, 12L, 20L, 22L, 100L)),
.Names = c("date", "value"), row.names = c(NA, -11L), class = "data.frame")
b <- structure(list(date = structure(c(16071, 16072, 16073, 16074, 16075,
16076, 16077, 16078, 16079, 16080, 16081),
class = "Date"),
value = c(200L, 5L, 202L, 101L, 204L, 205L, 7L, 206L, 1000L,
456L, 555L)),
.Names = c("date", "value"), row.names = c(NA, -11L),
class = "data.frame")
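I want to create a third data frame containing the rows where a and b match, together with the rows two days above and two days below each match. For example, these are the two rows that match (after merging the a and b data frames):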
date value
1 2014-01-02 5
2 2014-01-07 7
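Correction: use only ONE record above and one below each matching record INSTEAD of two. The final data set should contain, for both a and b, 1 record above and 1 record below each matching record,
like this:
FROM dataframe "a"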
date value
1 2014-01-01 3 #one record above the matching record from dataframe "a"
2 2014-01-02 5 #matching record from "a" and "b"
3 2014-01-03 6 #one record below the matching record from dataframe "a"
6 2014-01-06 2 #one record above the matching record from dataframe "a"
7 2014-01-07 7 #matching record from "a" and "b"
8 2014-01-08 12 #one record below the matching record from dataframe "a"
FROM dataframe "b"
date value
1 2014-01-01 200 #one record from above
2 2014-01-02 5 #matching record from dataframe "a" and "b"
3 2014-01-03 202 #one record from below
6 2014-01-06 205 #one record from above
7 2014-01-07 7 #matching record from dataframe "a" and "b"
8 2014-01-08 206 # one record from below
The final product should be a combination of a and b and look like this:
date value date value
1 2014-01-01 3 2014-01-01 200
2 2014-01-02 5 2014-01-02 5
3 2014-01-03 6 2014-01-03 202
6 2014-01-06 2 2014-01-06 205
7 2014-01-07 7 2014-01-07 7
8 2014-01-08 12 2014-01-08 206
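Answer 0 (score: 0)
I would comment, but I don't have enough reputation yet. If you want the new data frame to contain only the rows that appear in both a and b, you could take the following approach: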
foo <- rbind(a, b)
foo.match <- unique(foo[duplicated(foo), ])
rownames(foo.match) <- 1:nrow(foo.match)
> foo.match
date value
1 2014-01-02 5
2 2014-01-07 7
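As an alternative sketch (not part of the original answer), the same matching rows can be found with base R's merge(), which by default joins on every column name the two data frames share, here date and value. Unlike the rbind() plus duplicated() approach, it does not flag rows that are merely repeated within a single data frame. The name matches is only illustrative, and a and b are rebuilt here so the snippet runs on its own.

```r
# Example data from the question, rebuilt compactly
a <- data.frame(date = as.Date("2014-01-01") + 0:10,
                value = c(3L, 5L, 6L, 6L, 15L, 2L, 7L, 12L, 20L, 22L, 100L))
b <- data.frame(date = as.Date("2014-01-01") + 0:10,
                value = c(200L, 5L, 202L, 101L, 204L, 205L, 7L, 206L,
                          1000L, 456L, 555L))

# merge() with no `by` argument joins on all shared column names
# (date and value), so only rows present in both a and b are kept
matches <- merge(a, b)
matches
```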
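I don't understand the second part of your question (two days above and two days below). What exactly do you want the output to look like?
Edit:
Here is a quick solution to your problem. Assumption: this solution assumes that a and b always have the same number of rows and the same dates, and that "value" is always an integer.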
# sort the data by date to be safe
a <- a[order(a$date), ]
b <- b[order(b$date), ]
# determine which rows in a and b match
match.rows <- which(a$date == b$date & a$value == b$value)
# get the rows indices above and below the matching values
idx <- c(match.rows, match.rows + 1, match.rows - 1)
final.data <- cbind(a[idx, ], b[idx, ])
> final.data
date value date value
2 2014-01-02 5 2014-01-02 5
7 2014-01-07 7 2014-01-07 7
3 2014-01-03 6 2014-01-03 202
8 2014-01-08 12 2014-01-08 206
1 2014-01-01 3 2014-01-01 200
6 2014-01-06 2 2014-01-06 205
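This is a simple case. If one of the matching rows falls on the first or last day, for example, the code has to be adjusted, because match.rows - 1 or match.rows + 1 would then point outside the data. Cases where a and b have different sizes or different dates are somewhat more complicated and may need more thought.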
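One possible way to cover those edge cases is sketched below; it is an illustration under stated assumptions, not a tested general solution. The helper neighbour_rows() and its window argument are hypothetical names introduced here. The sketch assumes each date appears at most once per data frame, aligns a and b by date with merge() so that different sizes or dates are tolerated, and drops neighbour indices that fall outside the data. Note that the result has a single date column with value.a and value.b, rather than the side-by-side layout produced by cbind().

```r
# Example data from the question, rebuilt compactly
a <- data.frame(date = as.Date("2014-01-01") + 0:10,
                value = c(3L, 5L, 6L, 6L, 15L, 2L, 7L, 12L, 20L, 22L, 100L))
b <- data.frame(date = as.Date("2014-01-01") + 0:10,
                value = c(200L, 5L, 202L, 101L, 204L, 205L, 7L, 206L,
                          1000L, 456L, 555L))

# Hypothetical helper: indices within `window` rows of each match,
# restricted to 1..n, de-duplicated and returned in row order
neighbour_rows <- function(match.rows, n, window = 1) {
  offsets <- -window:window
  idx <- as.vector(outer(match.rows, offsets, "+"))
  sort(unique(idx[idx >= 1 & idx <= n]))
}

# Align the two data frames by date (assumption: dates are unique within
# each data frame); dates missing from either side are dropped
both <- merge(a, b, by = "date", suffixes = c(".a", ".b"))
both <- both[order(both$date), ]

# Rows where the values agree, plus one neighbour on each side
match.rows <- which(both$value.a == both$value.b)
final.data <- both[neighbour_rows(match.rows, nrow(both)), ]
final.data
```

Passing window = 2 to neighbour_rows() would give two rows above and below each match, as in the first version of the question.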