R: Subset rows from a data frame based on given time ranges

Time: 2020-06-01 10:35:21

Tags: r datetime dplyr subset

Suppose I have df1:

Start_Date    End_Date     Value
2001-01-01    2001-12-31   1
2002-01-01    2002-12-31   2
2003-01-01    2003-12-31   3
2004-01-01    2004-12-31   4
2005-01-01    2005-12-31   5

and df2, which has a DateTime column.

What I want to do is bring Value from df1 into df2 by checking which Start_Date to End_Date range in df1 each DateTime in df2 falls within. The expected result is df2 with the matching Value next to each DateTime.

Please advise.

2 answers:

Answer 0: (score: 1)

Another option: with lubridate you can check which interval a date falls within.

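library(tidyverse)
library(lubridate)  # provides %within% and interval(); not attached by tidyverse before 2.0.0

# For each row of df2, pick the Value whose date range contains DateTime.
# This assumes every DateTime falls inside exactly one range in df1.
df2 %>% 
  rowwise() %>% 
  mutate(out = df1$Value[(DateTime %within% interval(df1$Start_Date, df1$End_Date))])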
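
The snippet above fails when a DateTime matches no range, because the indexing then returns a zero-length vector. A minimal self-contained sketch of the same lubridate idea follows. The sample rows and the helper name lookup_value are assumptions made for illustration; only the column names come from the question.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample data; only the column names come from the question.
df1 <- data.frame(
  Start_Date = ymd(c("2003-01-01", "2004-01-01", "2005-01-01")),
  End_Date   = ymd(c("2003-12-31", "2004-12-31", "2005-12-31")),
  Value      = c(3, 4, 5)
)
df2 <- data.frame(DateTime = ymd(c("2003-05-09", "2004-12-31", "2010-01-01")))

# Build the intervals once, outside the row-wise loop.
ranges <- interval(df1$Start_Date, df1$End_Date)

# Hypothetical helper: Value of the first range containing d, or NA if none does.
lookup_value <- function(d) {
  hit <- which(d %within% ranges)
  if (length(hit) == 0) NA_real_ else df1$Value[hit[1]]
}

result <- df2 %>%
  rowwise() %>%
  mutate(out = lookup_value(DateTime)) %>%
  ungroup()
```

Returning NA for unmatched dates keeps every row of df2; filtering on !is.na(out) afterwards would drop them instead.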

Answer 1: (score: 0)

Using dplyr and lubridate provides a solution.

Before starting, make sure all dates are in the same format:

library(dplyr)
library(lubridate)

# Parse the date columns so they can be compared.
# dmy() assumes End_Date is stored day-month-year; use ymd() if it is already year-month-day.
df1 <- df1 %>%
  mutate(Start_Date=ymd(Start_Date), End_Date=dmy(End_Date))

df2 <- df2 %>%
  mutate(DateTime=ymd(DateTime))

In your case, this is only needed for the End_Date column.

First, I cross join the two data.frames, because I can't see a simple way to combine the two dfs directly.

# With no shared column names, merge() returns the Cartesian product of the rows.
df3 <- merge(df1, df2, all=TRUE)

Next, use filter and between:

df3 %>% filter(between(DateTime, Start_Date, End_Date)) %>% select(-c(Start_Date, End_Date))

which gives

  Value   DateTime
1     3 2003-01-01
2     3 2003-05-09
3     4 2004-12-31
4     5 2005-01-31
5     5 2005-08-13

Another option is to use the data.table package.
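
The answer's own data.table code is not included here. As a hedged sketch of one common way to do this with data.table, a non-equi join matches each DateTime to the range that contains it without building the full cross join first. The sample rows below are assumptions for illustration and this is not necessarily what the answerer wrote.

```r
library(data.table)

# Hypothetical sample data; only the column names come from the question.
df1 <- data.table(
  Start_Date = as.Date(c("2003-01-01", "2004-01-01", "2005-01-01")),
  End_Date   = as.Date(c("2003-12-31", "2004-12-31", "2005-12-31")),
  Value      = c(3, 4, 5)
)
df2 <- data.table(DateTime = as.Date(c("2003-05-09", "2004-12-31", "2010-01-01")))

# Non-equi join: for each row of df2, find the df1 row whose range contains DateTime.
# nomatch = NULL drops DateTimes that fall inside no range.
# i.DateTime refers to the DateTime column of df2 (the table in the i position).
result <- df1[df2,
              on = .(Start_Date <= DateTime, End_Date >= DateTime),
              nomatch = NULL,
              .(Value, DateTime = i.DateTime)]
```

Compared with the merge-then-filter approach, this avoids materialising every df1 and df2 row pair, which matters when both tables are large.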