Convert Date to a proper datetime format using R

Time: 2019-08-28 06:38:43

Tags: r datetime

I get the following output from an API as a Date:

|  news_time  |
---------------
 23 Aug 19
 24 Aug 19
 11 hours ago
 12 hours ago
 5 minutes ago
 44 minutes ago

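I want to convert this API input, which usually arrives as CHARACTER data, into a proper POSIXct format.

Is it possible to convert the data above into the following?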

Current Time: 28-08-2019 10:00:00

|  news_time  |    converted_time   |
-------------------------------------
 23 Aug 19    | 23-08-2019 00:00:00 |  
 24 Aug 19    | 24-08-2019 00:00:00 |
 6 hours ago  | 28-08-2019 04:00:00 |
 2 hours ago  | 28-08-2019 08:00:00 |
 5 minutes ago| 28-08-2019 09:55:00 |
 4 minutes ago| 28-08-2019 09:56:00 |

If not, I would like to sort the news times from earliest to latest.

2 Answers:

Answer 0: (score: 1)

Data and libraries

library(tidyverse)
library(lubridate)
library(glue)

df <- structure(list(news_time = c(" 11 hours ago", " 12 hours ago", " 23 Aug 19", 
                                   " 24 Aug 19", " 44 minutes ago", " 5 minutes ago")),
                class = "data.frame", row.names = c(NA, -6L))

Code

This function does the trick:

get_time <- function(news_time) {
  res <- vector("list", length(news_time))
  ## we assume that entries in the form "xx .* ago" can be either 
  ## seconds, minutes or hours
  units <- list(minute = minutes, second = seconds, hour = hours) 
  ## the marker for periods is the word "ago"
  periods <- grepl("ago", news_time)
  ## keep just the numbers
  amt <- if_else(periods, as.numeric(gsub("[^0-9]*", "", news_time)), NA_real_)
  unit_traf <- units[gsub(glue(".*({paste0(names(units), collapse = '|')})",
                                              "s*.*"), 
                                         "\\1", news_time)]
  ## reference time standing in for "now" (midnight, 28 Feb 2019 GMT); change if needed
  ref_time <- dmy("28-02-2019", tz = "GMT")
  ## for "normal" time stamps just use lubridate::dmy
  res[!periods] <- as.list(dmy(news_time[!periods], tz = "GMT"))
  ## for period time stamps loop over amount and units to do the calculation
  res[periods]  <- map2(amt[periods], unit_traf[periods], 
                        function(amt, unit) ref_time - unit(amt))
  ## transform list of POSIXct to vector
  do.call(c, res)
}

df %>%
  as_tibble() %>%
  mutate(time_stamp = get_time(news_time))

# # A tibble: 6 x 2
#   news_time         time_stamp         
#   <chr>             <dttm>             
# 1 " 11 hours ago"   2019-02-27 13:00:00
# 2 " 12 hours ago"   2019-02-27 12:00:00
# 3 " 23 Aug 19"      2019-08-23 00:00:00
# 4 " 24 Aug 19"      2019-08-24 00:00:00
# 5 " 44 minutes ago" 2019-02-27 23:16:00
# 6 " 5 minutes ago"  2019-02-27 23:55:00
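
The question also asks for the news times sorted from earliest to latest. Once the column is POSIXct, ordering is ordinary sorting. A minimal base R sketch, assuming a data frame shaped like the result above; `res` is a hypothetical stand-in built by hand from the printed output, not an object created by the code in this answer:

```r
# Hypothetical stand-in for the result of get_time(); the values are
# copied from the printed tibble above.
res <- data.frame(
  news_time = c(" 11 hours ago", " 12 hours ago", " 23 Aug 19",
                " 24 Aug 19", " 44 minutes ago", " 5 minutes ago"),
  time_stamp = as.POSIXct(
    c("2019-02-27 13:00:00", "2019-02-27 12:00:00", "2019-08-23 00:00:00",
      "2019-08-24 00:00:00", "2019-02-27 23:16:00", "2019-02-27 23:55:00"),
    tz = "GMT"
  ),
  stringsAsFactors = FALSE
)

# order() returns the row permutation that sorts time_stamp ascending
sorted <- res[order(res$time_stamp), ]
sorted$news_time
```

With the tidyverse already loaded, `arrange(time_stamp)` after the `mutate()` call should give the same ordering.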

Answer 1: (score: 0)

This works for a single news_time string, so you will need to loop it over the column values, but I'm sure you can manage that.

library(lubridate)
library(stringr)


i <- "6 minutes ago"
#i <- "24 Aug 19"
#i <- "5 hours ago"


if (str_detect(string = i, pattern = "ago")) {
  # the leading number is the amount to subtract from the current time
  x <- as.integer(strsplit(i, " ")[[1]][1])
  y <- now()
  if (str_detect(string = i, pattern = "hour")) {
    hour(y) <- hour(y) - x
  } else if (str_detect(string = i, pattern = "minute")) {
    minute(y) <- minute(y) - x
  }
} else {
  # plain dates such as "24 Aug 19"
  y <- as.POSIXct(i, format = "%d %b %y")
}

print(y)
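
One possible way to apply the same idea to a whole column at once is sketched below in base R. `parse_news_time()` and its `ref_time` and `tz` arguments are hypothetical names introduced for illustration, and the sketch assumes the relative entries only use seconds, minutes or hours. Passing the reference time explicitly makes the result reproducible; here it is set to the "Current Time" from the question.

```r
# parse_news_time() is a hypothetical helper, not part of any package.
parse_news_time <- function(x, ref_time = Sys.time(), tz = "GMT") {
  x <- trimws(x)
  # length of each supported unit in seconds (assumed set of units)
  secs_per_unit <- c(second = 1, minute = 60, hour = 3600)
  is_rel <- grepl("ago$", x)
  out <- rep(NA_real_, length(x))

  # relative entries: subtract amount * unit length from the reference time
  amount <- as.numeric(sub("^([0-9]+).*$", "\\1", x[is_rel]))
  unit <- sub("^[0-9]+ +([a-z]+?)s? ago$", "\\1", x[is_rel], perl = TRUE)
  out[is_rel] <- as.numeric(ref_time) - amount * secs_per_unit[unit]

  # absolute entries such as "23 Aug 19"; note that %b depends on the locale
  out[!is_rel] <- as.numeric(
    as.POSIXct(x[!is_rel], format = "%d %b %y", tz = tz)
  )

  as.POSIXct(out, origin = "1970-01-01", tz = tz)
}

ref <- as.POSIXct("2019-08-28 10:00:00", tz = "GMT")
converted <- parse_news_time(
  c(" 23 Aug 19", " 6 hours ago", " 5 minutes ago"),
  ref_time = ref
)
format(converted, "%d-%m-%Y %H:%M:%S")
```

Entries with an unrecognised unit come back as `NA` rather than raising an error, which may or may not be what you want.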

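Keep in mind that when a POSIXct value falls exactly on midnight, printing it does not show the time of day. It still prints the time zone.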
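
A small illustration of that printing behaviour, and of forcing the time of day to appear with an explicit format string. The variable names are made up for the example, and the format string is the one used in the question's desired output:

```r
midnight <- as.POSIXct("2019-08-23 00:00:00", tz = "GMT")

# Default printing drops the time of day for a value that is exactly midnight
print(midnight)
default_text <- format(midnight)

# An explicit format keeps the 00:00:00 part
full_text <- format(midnight, "%d-%m-%Y %H:%M:%S")
full_text
```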