Adding a time difference to a series of files

Time: 2019-03-27 16:08:46

Tags: r if-statement lubridate

I am trying to write a function that batch-processes a folder of CSV files. All of the CSV files contain incorrect timestamps, so I have another file that holds the difference between each incorrect timestamp and the corresponding correct one. For example, my files look like this:

df1
ID        Visit    Difference (in seconds)
1002      V2       35
2038      V1       86786

df2
ID        Visit    startTime
1002      V2       2017-12-01 19:47:11
1002      V2       2017-12-01 19:49:55
1002      V2       2017-12-01 19:50:42
1002      V2       2017-12-01 20:18:24

...

I tried to write an if statement that adds the difference when the ID and Visit match:

if (df1$ID == df2$ID &
    df1$Visit == df2$Visit) {
  df2$startTime <- df2$startTime + df1$Difference
}

It repeatedly adds 35 seconds, then 86786 seconds, then 35 again, and so on.

I want it to add only 35 seconds. Is there a way to do this?

1 Answer:

Answer 0: (score: 1)

I think this can help you.
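Some background on why the `if` in the question behaves that way. `if` expects a single TRUE or FALSE, not one value per row. Given a vector condition, older versions of R use only the first element (with a warning), and R 4.2.0 and later stop with an error. When the body does run, it runs once on the whole column, and the two-element `df1$Difference` is recycled across the rows of `df2`. A minimal base R sketch of that recycling, using values copied from the question (the names `diff_sec`, `start`, `shifted` and `added` are made up for illustration, and the UTC time zone is an assumption):

```r
# Hypothetical miniature of the question's data
diff_sec <- c(35, 86786)  # stands in for df1$Difference
start <- as.POSIXct(c("2017-12-01 19:47:11",
                      "2017-12-01 19:49:55",
                      "2017-12-01 19:50:42",
                      "2017-12-01 20:18:24"),
                    tz = "UTC")  # stands in for df2$startTime; UTC is assumed

# The length-2 vector is recycled over the 4 rows: +35, +86786, +35, +86786
shifted <- start + diff_sec

# Seconds actually added to each row
added <- as.numeric(difftime(shifted, start, units = "secs"))
print(added)
```

Matching each row to its own offset first, which is what the join does, avoids the recycling.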

# load packages
library(dplyr)
library(lubridate)
# reproduce similar data
df1 <-
  data.frame(
    "ID" = c(1002, 2038),
    "Visit" = as.character(c("V2", "V1")),
    "Difference" = c(35, 86786)
  )
df2 <-
  data.frame(
    "ID" = c(rep(1002, 3), 2038),
    Visit = as.character(rep("V2", 4)),
    startTime = ymd_hms(
      "2017-12-01 19:47:11",
      "2017-12-01 19:49:55",
      "2017-12-01 19:50:42",
      "2017-12-01 20:18:24"
    )
  )
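# join first, so that each row of df2 carries the offset for its own ID/Visit
df <- left_join(df2, df1, by = c("ID", "Visit"))
# add the offset (in seconds) where a match exists; otherwise keep startTime
df %>%
  mutate(new_time = if_else(!is.na(Difference),
                            startTime + Difference,
                            startTime))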
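
Since the question is about a whole folder of CSV files, the same idea could be wrapped in a function along these lines. This is only a sketch in base R, not a tested pipeline: the function names `shift_start_times` and `correct_folder` are made up, and it assumes every file has `ID`, `Visit` and `startTime` columns, that `startTime` is written as `YYYY-MM-DD HH:MM:SS`, that times can be treated as UTC, and that the offsets table has at most one row per ID/Visit pair.

```r
# Sketch only: column names, timestamp format and UTC are assumptions
# taken from the example data in the question.
shift_start_times <- function(data, offsets) {
  # Remember the original row order, because merge() sorts by the keys
  data$.row <- seq_len(nrow(data))
  # Keep every row of `data`; rows without a matching ID/Visit get NA
  out <- merge(data, offsets, by = c("ID", "Visit"), all.x = TRUE)
  out <- out[order(out$.row), ]
  # Unmatched rows are shifted by 0 seconds, i.e. left unchanged
  secs <- ifelse(is.na(out$Difference), 0, out$Difference)
  out$startTime <- as.POSIXct(out$startTime, tz = "UTC") + secs
  # Drop the helper columns and reset the row names
  out$.row <- NULL
  out$Difference <- NULL
  rownames(out) <- NULL
  out
}

correct_folder <- function(in_dir, out_dir, offsets) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  files <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)
  for (f in files) {
    fixed <- shift_start_times(read.csv(f, stringsAsFactors = FALSE), offsets)
    # Write the timestamps back in the same text format they were read in
    fixed$startTime <- format(fixed$startTime, "%Y-%m-%d %H:%M:%S")
    write.csv(fixed, file.path(out_dir, basename(f)), row.names = FALSE)
  }
  invisible(files)
}
```

With the data above, a call might look like `correct_folder("raw", "corrected", df1)`, where `"raw"` and `"corrected"` are placeholder folder names. Writing to a separate output folder keeps the original files intact, so running it twice does not add the offsets twice.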