Select a range of dates above and below another date

Time: 2014-07-12 03:29:39

Tags: r

Consider these two small data frames, a and b:

a <- structure(list(date = structure(c(16071, 16072, 16073, 16074, 16075, 
                                       16076, 16077, 16078, 16079, 16080, 
                                       16081), class = "Date"), 
                    value = c(3L, 5L, 6L, 6L, 15L, 2L, 7L, 12L, 20L, 22L, 100L)), 
               .Names = c("date", "value"), row.names = c(NA, -11L), class = "data.frame")

b <- structure(list(date = structure(c(16071, 16072, 16073, 16074, 16075, 
                                       16076, 16077, 16078, 16079, 16080, 16081), 
                                     class = "Date"), 
                    value = c(200L, 5L, 202L, 101L, 204L, 205L, 7L, 206L, 1000L, 
                              456L, 555L)), 
               .Names = c("date", "value"), row.names = c(NA, -11L), 
               class = "data.frame")

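I want to create a third data frame containing the rows two days above and two days below each match. For example, after merging the a and b data frames, the two matching records are: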

        date value
1 2014-01-02     5
2 2014-01-07     7

Correction: only ONE record above and one below each matching record is needed INSTEAD of two. The final data set should contain one record below and one record above each matching record from a and b.

Like this:

FROM data frame "a"

        date value
1 2014-01-01     3  # one record above the matching record from data frame "a"
2 2014-01-02     5  # matching record from "a" and "b"
3 2014-01-03     6  # one record below the matching record from data frame "a"
6 2014-01-06     2  # one record above the matching record from data frame "a"
7 2014-01-07     7  # matching record from "a" and "b"
8 2014-01-08    12  # one record below the matching record from data frame "a"

FROM data frame "b"

        date value
1 2014-01-01   200  # one record from above
2 2014-01-02     5  # matching record from data frames "a" and "b"
3 2014-01-03   202  # one record from below
6 2014-01-06   205  # one record from above
7 2014-01-07     7  # matching record from data frames "a" and "b"
8 2014-01-08   206  # one record from below

The final product should be a combination of a and b and look like this:

        date value       date value
1 2014-01-01     3 2014-01-01   200
2 2014-01-02     5 2014-01-02     5
3 2014-01-03     6 2014-01-03   202
6 2014-01-06     2 2014-01-06   205
7 2014-01-07     7 2014-01-07     7
8 2014-01-08    12 2014-01-08   206

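1 Answer:

Answer 0: (score: 0)

I would comment, but I don't have the reputation yet. If you want the new data frame to contain only the rows that appear in both a and b, you can do the following: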

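# stack both data frames, then keep the rows that occur more than once
foo <- rbind(a, b)
foo.match <- unique(foo[duplicated(foo), ])
# reset row names to 1..n (also safe when there are no matches)
rownames(foo.match) <- NULL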

> foo.match
        date value
1 2014-01-02     5
2 2014-01-07     7
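
As a possible alternative, and only as a sketch, base R's `merge()` joins on all shared column names by default, so it returns the rows where both `date` and `value` agree. The data frames are rebuilt here with `data.frame()` purely so the snippet runs on its own; the name `both` is an illustrative choice, not part of the original answer.

```r
# Rebuild the question's data frames so this sketch is self-contained.
dates <- as.Date("2014-01-01") + 0:10
a <- data.frame(date = dates,
                value = c(3L, 5L, 6L, 6L, 15L, 2L, 7L, 12L, 20L, 22L, 100L))
b <- data.frame(date = dates,
                value = c(200L, 5L, 202L, 101L, 204L, 205L, 7L, 206L,
                          1000L, 456L, 555L))

# With no `by` argument, merge() joins on every column name the two
# data frames share (here: date and value), i.e. an inner join.
both <- merge(a, b)
print(both)
```

Unlike the `rbind()`/`duplicated()` approach, this does not flag a row that happens to be repeated within a single data frame.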

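I don't understand the second part of your question (two days above and two days below). What exactly do you want the output to look like?

Edit

Here is a quick solution to your problem. Assumption: this solution assumes that a and b always have the same number of rows and the same dates, and that "value" is always an integer.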

# sort the data by date to be safe                                             
a <- a[order(a$date), ]                                                        
b <- b[order(b$date), ]                                                        

# determine which rows in a and b match                                        
match.rows <- which(a$date == b$date & a$value == b$value)                     

# get the rows indices above and below the matching values                     
idx <- c(match.rows, match.rows + 1, match.rows - 1)                           
final.data <- cbind(a[idx, ], b[idx, ])                                        

> final.data                                                                     
        date value       date value                                              
2 2014-01-02     5 2014-01-02     5                                              
7 2014-01-07     7 2014-01-07     7                                              
3 2014-01-03     6 2014-01-03   202                                              
8 2014-01-08    12 2014-01-08   206                                              
1 2014-01-01     3 2014-01-01   200                                              
6 2014-01-06     2 2014-01-06   205

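This is the simple case. If, for example, one of the matching rows falls on the first or last day, the code has to be adjusted, because match.rows - 1 or match.rows + 1 then points outside the data. Cases where a and b have different sizes or different dates are somewhat more complicated and may need more thought.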
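
One way the first-day/last-day adjustment might look is sketched below. The helper `neighbour_rows()` and its parameter `k` are hypothetical names introduced for illustration, not part of the original answer. The sketch still assumes that a and b have the same number of rows and the same dates; it only clamps the indices to the valid range, removes duplicates, and sorts them so the rows come out in date order.

```r
# Rebuild the question's data frames so this sketch is self-contained.
dates <- as.Date("2014-01-01") + 0:10
a <- data.frame(date = dates,
                value = c(3L, 5L, 6L, 6L, 15L, 2L, 7L, 12L, 20L, 22L, 100L))
b <- data.frame(date = dates,
                value = c(200L, 5L, 202L, 101L, 204L, 205L, 7L, 206L,
                          1000L, 456L, 555L))

# Hypothetical helper: indices within k rows of any matching row,
# restricted to 1..n, de-duplicated and sorted.
neighbour_rows <- function(match.rows, n, k = 1L) {
  offsets <- seq(-k, k)
  idx <- as.vector(outer(match.rows, offsets, `+`))
  sort(unique(idx[idx >= 1L & idx <= n]))
}

# Same matching rule as above: equal date and equal value, row by row.
match.rows <- which(a$date == b$date & a$value == b$value)
idx <- neighbour_rows(match.rows, nrow(a))
final.data <- cbind(a[idx, ], b[idx, ])
print(final.data)
```

Setting `k = 2L` would give the two-days-above-and-below variant mentioned at the start of the question.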