I get the following output from an API as dates:
| news_time |
---------------
23 Aug 19
24 Aug 19
11 hours ago
12 hours ago
5 minutes ago
44 minutes ago
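I want to convert this API output, which arrives as CHARACTER data, into a proper POSIXct format.
Is it possible to convert the data above into the form shown below?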
Current Time: 28-08-2019 10:00:00
| news_time     | converted_time      |
|---------------|---------------------|
| 23 Aug 19     | 23-08-2019 00:00:00 |
| 24 Aug 19     | 24-08-2019 00:00:00 |
| 6 hours ago   | 28-08-2019 04:00:00 |
| 2 hours ago   | 28-08-2019 08:00:00 |
| 5 minutes ago | 28-08-2019 09:55:00 |
| 4 minutes ago | 28-08-2019 09:56:00 |
If not, I would at least like to sort news_time chronologically, from earliest to latest.
Answer 0 (score: 1)
Data and libraries
library(tidyverse)
library(lubridate)
library(glue)
df <- structure(list(news_time = c(" 11 hours ago", " 12 hours ago", " 23 Aug 19",
" 24 Aug 19", " 44 minutes ago", " 5 minutes ago")),
class = "data.frame", row.names = c(NA, -6L))
Code
This function does the job:
get_time <- function(news_time) {
  res <- vector("list", length(news_time))
  ## we assume that entries of the form "xx .* ago" are given in
  ## seconds, minutes or hours
  units <- list(minute = minutes, second = seconds, hour = hours)
  ## the marker for relative times is the word "ago"
  periods <- grepl("ago", news_time)
  ## keep just the digits (only used for the relative entries)
  amt <- if_else(periods, as.numeric(gsub("[^0-9]*", "", news_time)), NA_real_)
  ## look up the lubridate period constructor matching each entry's unit
  unit_traf <- units[gsub(glue(".*({paste0(names(units), collapse = '|')})",
                               "s*.*"),
                          "\\1", news_time)]
  ## reference time standing in for "now"; change if needed
  ref_time <- dmy("28-02-2019", tz = "GMT")
  ## for "normal" time stamps just use lubridate::dmy
  res[!periods] <- as.list(dmy(news_time[!periods], tz = "GMT"))
  ## for relative time stamps loop over amounts and units to do the calculation
  res[periods] <- map2(amt[periods], unit_traf[periods],
                       function(amt, unit) ref_time - unit(amt))
  ## transform the list of POSIXct values into a vector
  do.call(c, res)
}
df %>%
as_tibble() %>%
mutate(time_stamp = get_time(news_time))
# # A tibble: 6 x 2
# news_time time_stamp
# <chr> <dttm>
# 1 " 11 hours ago" 2019-02-27 13:00:00
# 2 " 12 hours ago" 2019-02-27 12:00:00
# 3 " 23 Aug 19" 2019-08-23 00:00:00
# 4 " 24 Aug 19" 2019-08-24 00:00:00
# 5 " 44 minutes ago" 2019-02-27 23:16:00
# 6 " 5 minutes ago" 2019-02-27 23:55:00
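The question also asks about ordering by time. Once the column is POSIXct this is an ordinary sort. A minimal sketch, assuming the converted column is called time_stamp as above; the `converted` tibble here is a made-up stand-in with three of the rows, not the output of get_time() itself:

```r
library(dplyr)

# Hypothetical stand-in for the converted result (three rows only)
converted <- tibble(
  news_time  = c("11 hours ago", "23 Aug 19", "5 minutes ago"),
  time_stamp = as.POSIXct(
    c("2019-02-27 13:00:00", "2019-08-23 00:00:00", "2019-02-27 23:55:00"),
    tz = "GMT"
  )
)

# Sort from earliest to latest; use arrange(desc(time_stamp)) for the reverse
sorted <- converted %>% arrange(time_stamp)
sorted$news_time
```

Sorting the raw character column would not work, since "11 hours ago" and "23 Aug 19" only compare meaningfully after conversion.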
Answer 1 (score: 0)
This works for a single news_time string, so you will need to loop it over the column values, but I am sure you can manage that.
library(lubridate)
library(stringr)
i <- "6 minutes ago"
#i <- "24 Aug 19"
#i <- "5 hours ago"
if (str_detect(string = i, pattern = "ago")) {
  ## relative time: the leading number is the amount to subtract from now
  x <- strsplit(i, " ")[[1]][1] %>% as.integer()
  if (str_detect(string = i, pattern = "hour")) {
    y <- now()
    hour(y) <- hour(y) - x
  } else if (str_detect(string = i, pattern = "minute")) {
    y <- now()
    minute(y) <- minute(y) - x
  }
} else {
  ## absolute date such as "24 Aug 19"
  y <- as.POSIXct(i, format = "%d %b %y")
}
print(y)
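For a whole column, the same idea can be written without an explicit loop. This is only a sketch in base R, not the code from the answer: the name `convert_news_time` and its `ref` argument are made up for illustration, it assumes the relative entries are in hours, minutes or seconds, and `%b` assumes an English locale for month names like "Aug".

```r
# Hypothetical helper: vectorised conversion of mixed date / "x ago" strings
convert_news_time <- function(x, ref = Sys.time()) {
  x <- trimws(x)
  is_ago <- grepl("ago$", x)
  # leading number of each entry (only used for the "ago" entries)
  n <- suppressWarnings(as.numeric(sub(" .*", "", x)))
  # seconds per unit; NA when no known unit is present
  secs <- ifelse(grepl("hour", x), 3600,
          ifelse(grepl("minute", x), 60,
          ifelse(grepl("second", x), 1, NA)))
  # absolute dates parse directly; "ago" entries become NA and are filled next
  out <- as.POSIXct(x, format = "%d %b %y", tz = "GMT")
  out[is_ago] <- ref - n[is_ago] * secs[is_ago]
  out
}

# Reference time taken from the question instead of the real current time
ref <- as.POSIXct("2019-08-28 10:00:00", tz = "GMT")
res <- convert_news_time(c("23 Aug 19", "6 hours ago", "5 minutes ago"), ref)
format(res, "%d-%m-%Y %H:%M:%S")
```

Passing the reference time as an argument makes the result reproducible; leaving it at the default uses the current system time, as the answer does with now().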
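Keep in mind that if a POSIXct value falls exactly on midnight, the time part is not printed; you only see the date and the time zone.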
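The time is still stored, it is just omitted from the default display. A small sketch of how an explicit format string brings it back, using the dd-mm-yyyy layout from the question (the variable name `midnight` is only for illustration):

```r
midnight <- as.POSIXct("2019-08-24 00:00:00", tz = "GMT")

# Default formatting drops the time part for a midnight value
default_txt <- format(midnight)

# An explicit format string always includes it
full_txt <- format(midnight, "%d-%m-%Y %H:%M:%S")
full_txt
```

Note that format() returns character, so it is best applied only for display, after any sorting or arithmetic on the POSIXct column.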