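I have two data frames. I want to merge them only where the id is the same and the TripDate values are no more than three days apart. A plain merge is easy, but how do I match on a date range rather than on an exact date?
Here is an example: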
DF1:
DF2:
The output should be:
structure(list(V1 = structure(c(5L, 1L, 1L, 2L, 3L, 3L, 3L, 4L
), .Label = c("1", "2", "3", "4", "Id"), class = "factor"), V2 = structure(c(7L,
5L, 6L, 5L, 3L, 4L, 2L, 1L), .Label = c("2012-01-02", "2012-02-03",
"2012-02-14", "2012-03-06", "2012-05-23", "2014-07-13", "VisitDate"
), class = "factor"), V3 = structure(c(8L, 2L, 4L, 5L, 1L, 6L,
7L, 3L), .Label = c("12", "2", "22", "23", "33", "43", "54",
"Another column"), class = "factor")), .Names = c("V1", "V2",
"V3"), class = "data.frame", row.names = c(NA, -8L))
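Answer 0 (score: 2)
If your data is not too large, you can simply join on Id and then keep only the rows whose dates are no more than 3 days apart.
For example, with the tidyverse: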
library(tidyverse)
df1 = structure(list(Id = c(1, 1, 2, 3, 3, 3, 4),
VisitDate = structure(c(15483, 16264, 15483, 15384, 15405, 15373, 15341), class = "Date"),
Column = c(2, 4, 5, 1, 6, 7, 3)),
.Names = c("Id", "VisitDate", "Column"),
row.names = 1:7,
class = "data.frame")
df2 = structure(list(Id = c(1, 2, 3, 4),
VisitDate = structure(c(15536, 15485, 15386, 15347), class = "Date"),
Column = c(3, 1, 4, 2)),
.Names = c("Id", "VisitDate", "Column"),
row.names = 1:4,
class = "data.frame")
df1 %>%
left_join(df2, by = "Id", suffix = c(".df1", ".df2")) %>%
filter(abs(VisitDate.df1 - VisitDate.df2) <= 3)
#> Id VisitDate.df1 Column.df1 VisitDate.df2 Column.df2
#> 1 2 2012-05-23 5 2012-05-25 1
#> 2 3 2012-02-14 1 2012-02-16 4
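If the tidyverse is not available, the same join-then-filter idea can be sketched in base R. This is only an illustration: the sample data is rebuilt with data.frame() using the same dates as above, and the names joined, gap and result are made up for this example.

```r
# Same sample data as above, written with data.frame() for readability.
df1 <- data.frame(
  Id = c(1, 1, 2, 3, 3, 3, 4),
  VisitDate = as.Date(c("2012-05-23", "2014-07-13", "2012-05-23",
                        "2012-02-14", "2012-03-06", "2012-02-03",
                        "2012-01-02")),
  Column = c(2, 4, 5, 1, 6, 7, 3)
)
df2 <- data.frame(
  Id = c(1, 2, 3, 4),
  VisitDate = as.Date(c("2012-07-15", "2012-05-25", "2012-02-16",
                        "2012-01-08")),
  Column = c(3, 1, 4, 2)
)

# Step 1: join on Id only; every df1 row is paired with every df2 row
# that shares its Id.
joined <- merge(df1, df2, by = "Id", suffixes = c(".df1", ".df2"))

# Step 2: keep the pairs whose dates are at most 3 days apart.
gap <- abs(as.numeric(joined$VisitDate.df1 - joined$VisitDate.df2))
result <- joined[gap <= 3, ]
print(result)
```

Because the join on Id happens first, the intermediate table can grow quickly when an Id has many rows in both data frames, which is why this approach suits smaller data.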
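Another approach is to duplicate the data rows, for instance in df1, so that each row is repeated once for every date within 3 days of its VisitDate, and then join on the exact date. This may be more efficient if the same ID appears on many dates.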
df1 %>%
mutate(date = map(VisitDate, function(x){seq(x - 3, x + 3, by = 1)})) %>%
unnest(date) %>%
inner_join(df2, by = c("Id", "date" = "VisitDate"), suffix = c(".df1", ".df2"))
#> Id VisitDate Column.df1 date Column.df2
#> 1 2 2012-05-23 5 2012-05-25 1
#> 2 3 2012-02-14 1 2012-02-16 4
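A third option, not part of the original answer, is a non-equi join that expresses the date window directly. The sketch below assumes dplyr 1.1.0 or later, where join_by() accepts a between() condition; the helper columns lower and upper and the names df2_window and result are made up for this example.

```r
library(dplyr)

# Same sample data as above.
df1 <- data.frame(
  Id = c(1, 1, 2, 3, 3, 3, 4),
  VisitDate = as.Date(c("2012-05-23", "2014-07-13", "2012-05-23",
                        "2012-02-14", "2012-03-06", "2012-02-03",
                        "2012-01-02")),
  Column = c(2, 4, 5, 1, 6, 7, 3)
)
df2 <- data.frame(
  Id = c(1, 2, 3, 4),
  VisitDate = as.Date(c("2012-07-15", "2012-05-25", "2012-02-16",
                        "2012-01-08")),
  Column = c(3, 1, 4, 2)
)

# Add the +/- 3 day window to df2 (hypothetical helper columns).
df2_window <- df2 %>%
  mutate(lower = VisitDate - 3, upper = VisitDate + 3)

# Match rows with equal Id where df1's VisitDate falls inside df2's window.
result <- inner_join(
  df1, df2_window,
  by = join_by(Id, between(VisitDate, lower, upper)),
  suffix = c(".df1", ".df2")
)
print(result)
```

This avoids both the large intermediate table of the join-then-filter approach and the row duplication of the expansion approach, at the cost of requiring a recent dplyr.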