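Merge based on similar but not exact dates

Time: 2016-10-15 19:14:34

Tags: r merge

I have two data frames. I want to merge them only where the Id is the same and the TripDate values are no more than three days apart. Merging is simple, but how do I match on a date range rather than on an exact date?

Here is an example: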

DF1:


DF2:


The output should be:

structure(list(V1 = structure(c(5L, 1L, 1L, 2L, 3L, 3L, 3L, 4L
), .Label = c("1", "2", "3", "4", "Id"), class = "factor"), V2 = structure(c(7L, 
5L, 6L, 5L, 3L, 4L, 2L, 1L), .Label = c("2012-01-02", "2012-02-03", 
"2012-02-14", "2012-03-06", "2012-05-23", "2014-07-13", "VisitDate"
), class = "factor"), V3 = structure(c(8L, 2L, 4L, 5L, 1L, 6L, 
7L, 3L), .Label = c("12", "2", "22", "23", "33", "43", "54", 
"Another column"), class = "factor")), .Names = c("V1", "V2", 
"V3"), class = "data.frame", row.names = c(NA, -8L))

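1 answer:

Answer 0: (score: 2)

If your data is not too large, you can simply join on Id and then filter down to the rows whose dates are no more than 3 days apart.

For example, with the tidyverse: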

library(tidyverse)

df1 = structure(list(Id = c(1, 1, 2, 3, 3, 3, 4), 
                     VisitDate = structure(c(15483, 16264, 15483, 15384, 15405, 15373, 15341), class = "Date"),
                     Column = c(2, 4, 5, 1, 6, 7, 3)), 
                .Names = c("Id", "VisitDate", "Column"), 
                row.names = 1:7, 
                class = "data.frame")

df2 = structure(list(Id = c(1, 2, 3, 4), 
                     VisitDate = structure(c(15536, 15485, 15386, 15347), class = "Date"), 
                     Column = c(3, 1, 4, 2)), 
                .Names = c("Id", "VisitDate", "Column"), 
                row.names = 1:4, 
                class = "data.frame")


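# Pair every df1 row with every df2 row that has the same Id,
# then keep only the pairs whose dates are at most 3 days apart.
df1 %>%
    left_join(df2, by = "Id", suffix = c(".df1", ".df2")) %>%
    filter(abs(VisitDate.df1 - VisitDate.df2) <= 3)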
#>   Id VisitDate.df1 Column.df1 VisitDate.df2 Column.df2
#> 1  2    2012-05-23          5    2012-05-25          1
#> 2  3    2012-02-14          1    2012-02-16          4
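
For readers who would rather not load the tidyverse, the same join-then-filter idea can be sketched in base R. This is only an illustration under stated assumptions: the data frames are rebuilt here from the date values in the answer, and the names `merged`, `gap_days` and `result` are made up for this sketch. Note that `merge()` performs an inner join by default, which gives the same rows as the left join above once the filter is applied.

```r
# Sample data rebuilt from the answer's df1 and df2 (dates written out as strings).
df1 <- data.frame(
  Id = c(1, 1, 2, 3, 3, 3, 4),
  VisitDate = as.Date(c("2012-05-23", "2014-07-13", "2012-05-23", "2012-02-14",
                        "2012-03-06", "2012-02-03", "2012-01-02")),
  Column = c(2, 4, 5, 1, 6, 7, 3)
)
df2 <- data.frame(
  Id = c(1, 2, 3, 4),
  VisitDate = as.Date(c("2012-07-15", "2012-05-25", "2012-02-16", "2012-01-08")),
  Column = c(3, 1, 4, 2)
)

# Step 1: pair every df1 row with every df2 row that shares the same Id.
merged <- merge(df1, df2, by = "Id", suffixes = c(".df1", ".df2"))

# Step 2: subtracting two Date vectors gives the gap in days.
gap_days <- abs(as.numeric(merged$VisitDate.df1 - merged$VisitDate.df2))

# Step 3: keep only the pairs that are at most 3 days apart.
result <- merged[gap_days <= 3, ]
rownames(result) <- NULL
print(result)
```

Because the intermediate table holds every same-Id pair before filtering, this stays practical only while the number of rows per Id is modest.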

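Another approach is to replicate the data rows, probably in df1, once for each date in the 3-day window, and then join on the exact date. This may be more efficient if the same Id appears on many different dates.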

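# Expand each df1 row into 7 rows, one per day from 3 days before to 3 days after,
# then do an exact join on Id and the expanded date.
df1 %>%
    mutate(date = map(VisitDate, function(x){seq(x - 3, x + 3, by = 1)})) %>%
    unnest(date) %>%
    inner_join(df2, by = c("Id", "date" = "VisitDate"), suffix = c(".df1", ".df2"))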
#>   Id  VisitDate Column.df1       date Column.df2
#> 1  2 2012-05-23          5 2012-05-25          1
#> 2  3 2012-02-14          1 2012-02-16          4
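
The row-replication approach can also be sketched in base R, which may help make the mechanics visible. This is a rough illustration, not the answerer's code: the data is rebuilt from the answer's values, and `offsets`, `expanded` and `result` are hypothetical names introduced here. Each df1 row is repeated once per day offset from -3 to 3, and an exact `merge()` on Id and the shifted date then finds the near matches.

```r
# Sample data rebuilt from the answer's df1 and df2 (dates written out as strings).
df1 <- data.frame(
  Id = c(1, 1, 2, 3, 3, 3, 4),
  VisitDate = as.Date(c("2012-05-23", "2014-07-13", "2012-05-23", "2012-02-14",
                        "2012-03-06", "2012-02-03", "2012-01-02")),
  Column = c(2, 4, 5, 1, 6, 7, 3)
)
df2 <- data.frame(
  Id = c(1, 2, 3, 4),
  VisitDate = as.Date(c("2012-07-15", "2012-05-25", "2012-02-16", "2012-01-08")),
  Column = c(3, 1, 4, 2)
)

# Day offsets covering the window of 3 days before to 3 days after.
offsets <- -3:3

# Repeat every df1 row once per offset (7 copies of each row, kept together).
expanded <- df1[rep(seq_len(nrow(df1)), each = length(offsets)), ]

# Shift each copy by its offset; rep() recycles the offsets in the same row order.
expanded$date <- expanded$VisitDate + rep(offsets, times = nrow(df1))

# Exact join: df2's VisitDate must equal one of the shifted dates for the same Id.
result <- merge(expanded, df2,
                by.x = c("Id", "date"), by.y = c("Id", "VisitDate"),
                suffixes = c(".df1", ".df2"))
print(result)
```

In the result, `date` holds the df2 visit date that matched and `VisitDate` holds the original df1 date. The expanded table has 7 times as many rows as df1, so the trade-off against join-then-filter depends on how many rows share each Id.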