Joining/merging data frames on inconsistent timestamps with long and wide data

Time: 2017-04-01 20:50:50

Tags: r

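I have two data frames.

df3 has a timestamp with both date and time, a user ID, and heart rate with multiple observations per day (long data).

df4 has a timestamp with only the date, a user ID, calories, and sleep (wide data).

I want to combine them so that each row of the long data frame is filled in with the values from the wide data frame, matched on date and user ID.

Below is code for a toy data set with a similar layout: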

df3 <- data.frame(
  time_stamp = c('2016-11-01 10:29:41', '2016-11-01 10:53:11', '2016-11-02 01:07:54',
                 '2016-11-02 02:00:40', '2016-11-02 04:01:33', '2016-11-02 05:23:53',
                 '2016-11-02 13:20:17'),
  users_user_id = c(7, 7, 7, 7, 7, 7, 7),
  avg_heart_rate = c(94, 90, 88, 85, 91, 89, 95))
df4 <- data.frame(time_stamp = c('2016-11-01', '2016-11-02'), users_user_id = c(7, 7),
                  calories = c(1800, 2000), sleep = c(480, 560))
df3$time_stamp <- as.POSIXct(df3$time_stamp)
df4$time_stamp <- as.POSIXct(df4$time_stamp)

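I tried splitting the date from the time, but when I ran a full_join in dplyr on the timestamp and user ID, I was left with a lot of NAs. I also looked into using reshape2 to melt my data, but I got lost on how that would help me...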

2 answers:

Answer 0 (score: 1)

A tidy way:

library(tidyr)
library(dplyr)

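# Split "YYYY-MM-DD HH:MM:SS" into a date part and a time part
df3 <- separate(df3, time_stamp, into = c("date_stamp", "time_stamp"), sep = " ")
df3$date_stamp <- as.POSIXct(df3$date_stamp)

# Match df3's date_stamp against df4's time_stamp, plus the shared user ID
left_join(df3, df4, by = c("date_stamp" = "time_stamp", "users_user_id"))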


  date_stamp time_stamp users_user_id avg_heart_rate calories sleep
1 2016-11-01   10:29:41             7             94     1800   480
2 2016-11-01   10:53:11             7             90     1800   480
3 2016-11-02   01:07:54             7             88     2000   560
4 2016-11-02   02:00:40             7             85     2000   560
5 2016-11-02   04:01:33             7             91     2000   560
6 2016-11-02   05:23:53             7             89     2000   560
7 2016-11-02   13:20:17             7             95     2000   560
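
As a side note on why joining on the raw timestamps left so many NAs: a full_join keeps unmatched rows from both sides, and a timestamp such as 10:29:41 never equals the midnight value that a date-only timestamp is parsed to. A minimal base R sketch of that comparison follows; the vectors `ts_long` and `ts_daily` are hypothetical stand-ins for the two time_stamp columns, and `tz = "UTC"` is an assumption added only to make the result reproducible.

```r
# Hypothetical stand-ins for df3$time_stamp and df4$time_stamp.
# tz = "UTC" is an assumption, not part of the original question.
ts_long  <- as.POSIXct(c("2016-11-01 10:29:41", "2016-11-02 01:07:54"), tz = "UTC")
ts_daily <- as.POSIXct(c("2016-11-01", "2016-11-02"), tz = "UTC")

# Compared as exact instants, no long-format timestamp equals a midnight value,
# so a join keyed on the raw timestamps finds no matches.
raw_match <- as.numeric(ts_long) %in% as.numeric(ts_daily)

# Compared on the calendar date only, every long-format row finds its day.
date_match <- format(ts_long, "%Y-%m-%d") %in% format(ts_daily, "%Y-%m-%d")

print(raw_match)
print(date_match)
```

Here `raw_match` is all FALSE and `date_match` is all TRUE, which is why the join only works once both sides are reduced to a date-only key.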

Answer 1 (score: 0)

You can create a new column that contains only the date and merge on that column:

df3$date <- as.Date(df3$time_stamp)
df4$date <- as.Date(df4$time_stamp)
merge(df3, df4, by = c("date", "users_user_id"))

This gives you:

        date users_user_id        time_stamp.x avg_heart_rate time_stamp.y calories sleep
1 2016-11-01             7 2016-11-01 10:29:41             94   2016-11-01     1800   480
2 2016-11-01             7 2016-11-01 10:53:11             90   2016-11-01     1800   480
3 2016-11-02             7 2016-11-02 01:07:54             88   2016-11-02     2000   560
4 2016-11-02             7 2016-11-02 02:00:40             85   2016-11-02     2000   560
5 2016-11-02             7 2016-11-02 04:01:33             91   2016-11-02     2000   560
6 2016-11-02             7 2016-11-02 05:23:53             89   2016-11-02     2000   560
7 2016-11-02             7 2016-11-02 13:20:17             95   2016-11-02     2000   560
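
One caveat with this approach: as.Date() on a POSIXct value uses tz = "UTC" by default, so timestamps that were parsed in a local time zone can land on the neighbouring calendar day. The sketch below is one possible way to guard against that, assuming all timestamps are meant to be in UTC (swap in the real zone if not). It uses a shortened copy of the toy data, and the name `merged` is illustrative. It also drops the daily time_stamp column before merging, which avoids the time_stamp.x / time_stamp.y pair.

```r
# Shortened toy data mirroring the question. tz = "UTC" is an assumption made
# here so the result does not depend on the machine's local time zone.
df3 <- data.frame(
  time_stamp = as.POSIXct(c("2016-11-01 10:29:41", "2016-11-01 10:53:11",
                            "2016-11-02 01:07:54", "2016-11-02 13:20:17"),
                          tz = "UTC"),
  users_user_id = c(7, 7, 7, 7),
  avg_heart_rate = c(94, 90, 88, 95)
)
df4 <- data.frame(
  time_stamp = as.POSIXct(c("2016-11-01", "2016-11-02"), tz = "UTC"),
  users_user_id = c(7, 7),
  calories = c(1800, 2000),
  sleep = c(480, 560)
)

# Derive the calendar date in the same time zone the timestamps were parsed in.
df3$date <- as.Date(df3$time_stamp, tz = "UTC")
df4$date <- as.Date(df4$time_stamp, tz = "UTC")

# Keep only the key and value columns of the daily data, then merge on the key.
daily <- df4[, c("date", "users_user_id", "calories", "sleep")]
merged <- merge(df3, daily, by = c("date", "users_user_id"))
print(merged)
```

Every heart-rate row keeps its full timestamp and picks up the calories and sleep values for its day.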