Question

I have this vector of timestamps:

c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
"01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
"01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
"01/09/2019 10:52:20")

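I want to remove the minutes and seconds from the character vector so that I am left with only 01/09/2019 9 and 01/09/2019 10.

What is the most efficient way to do this?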

Answer 1

Here is one way.

datevec <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
      "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
      "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
      "01/09/2019 10:52:20")

format(as.POSIXct(datevec, format = "%d/%m/%Y %H:%M:%OS"), "%d/%m/%Y %H")

# Result
 [1] "01/09/2019 09" "01/09/2019 09" "01/09/2019 09" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"
 [7] "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"
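Note that the result above has a zero-padded hour ("09"), while the question asks for "9". A minimal sketch of one way to drop the padding, assuming the day/month/year order used in this answer; the `tz = "UTC"` argument and the shortened two-element input are illustrative choices, not part of the original answer:

```r
# Shortened input for illustration
datevec <- c("01/09/2019 9:51:03", "01/09/2019 10:01:41")

# Parse, then format down to the hour (tz = "UTC" is an assumption
# made here to keep the result independent of the local time zone)
parsed <- as.POSIXct(datevec, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")
padded <- format(parsed, "%d/%m/%Y %H")

# Remove the leading zero of the hour: replace the first " 0" with " "
unpadded <- sub(" 0", " ", padded, fixed = TRUE)
print(unpadded)
```

Because the date part contains no spaces, the only place " 0" can occur is in front of a single-digit hour.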

Answer 2

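What class do you want the output to be? If a date-time object is acceptable, how about the following, which returns a POSIXlt object rather than a character vector: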

v <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
  "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
  "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
  "01/09/2019 10:52:20")


strptime(v, "%m/%d/%Y %H")
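To make the question about the output class concrete, here is a small sketch of what `strptime` hands back. It assumes the day/month/year order from Answer 1 (this answer's format string reads the same text as month/day/year, and the data alone cannot settle which is meant); `tz = "UTC"` and the shortened input are also assumptions for illustration:

```r
# Shortened input for illustration
v <- c("01/09/2019 9:51:03", "01/09/2019 10:52:20")

# The format stops at the hour; the rest of each string is ignored
parsed <- strptime(v, "%d/%m/%Y %H", tz = "UTC")

class(parsed)   # a POSIXlt date-time object, not a character vector
parsed$hour     # hour component of each element
parsed$min      # minutes were not parsed, so they are zero
```

If a character vector is needed in the end, the result would still have to go through `format()`.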

Answer 3

This seems to work well:

unlist(strsplit(mystring, split = ":", fixed=TRUE))[c(TRUE, FALSE,FALSE)]

(with help from here)

An alternative could be:

sapply(strsplit(mystring, split=':', fixed=TRUE), `[`, 1)
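The two approaches above differ in one respect worth sketching: the recycled index `c(TRUE, FALSE, FALSE)` relies on every string splitting into exactly three pieces, whereas picking the first piece per element does not. The input below, with one timestamp lacking seconds, is hypothetical and only serves to show the difference:

```r
# Hypothetical input: the first element has no seconds part
mystring <- c("01/09/2019 9:51", "01/09/2019 10:01:41")

parts <- strsplit(mystring, split = ":", fixed = TRUE)

# Recycled logical index: keeps every third piece of the flattened vector,
# which goes out of step as soon as one element has fewer than three pieces
by_index <- unlist(parts)[c(TRUE, FALSE, FALSE)]

# First piece of each element, independent of how many pieces it has
by_element <- vapply(parts, `[`, character(1), 1)

print(by_index)
print(by_element)
```

For the question's data, where every timestamp has both minutes and seconds, the two give the same result.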

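Here are some benchmarks. Following Ronak's recent comment, fixed = TRUE makes the methods a lot faster, and method four (the unlist approach above) turns out to be the fastest: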

mystring <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
              "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
              "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
              "01/09/2019 10:52:20")

library(microbenchmark)
library(stringr)  # provides str_extract, used in "seven"

microbenchmark(one = sapply(strsplit(mystring, split=':', fixed=TRUE), `[`, 1),
               two = unlist(lapply(mystring,function(x) strsplit(x,":", fixed=TRUE)[[1]][1])),
               three = strptime(mystring, "%m/%d/%Y %H"),
               four = unlist(strsplit(mystring, split = ":", fixed=TRUE))[c(TRUE, FALSE,FALSE)],
               five = format(as.POSIXct(mystring, format = "%d/%m/%Y %H:%M:%OS"), "%d/%m/%Y %H"),
               six = gsub("(.*?):.*", "\\1", mystring),
               seven = str_extract(mystring, ".+(?=:.+:)"),
               times = 100000)



    Unit: microseconds
  expr     min      lq      mean  median       uq        max neval
   one  42.792  49.471  85.63742  52.572  57.1310  669280.96 1e+05
   two  64.637  70.618 114.16364  73.252  77.6840  582466.94 1e+05
 three 129.456 134.771 156.82308 136.188 139.2030  339715.94 1e+05
  four  12.860  15.641  22.75699  17.254  18.5440  305703.52 1e+05
  five 482.888 505.647 633.15388 512.880 552.1155  551274.28 1e+05
   six  37.889  43.121  52.79030  45.567  49.1880   32954.59 1e+05
 seven  53.432  59.051  88.05015  62.326  69.9320 1180361.17 1e+05
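When reading these timings, it may help to check that the methods being compared return the same thing. The sketch below uses base R only and a shortened input; `tz = "UTC"` is an assumption added here, and the variable names simply mirror the benchmark labels:

```r
# Shortened input for illustration
mystring <- c("01/09/2019 9:51:03", "01/09/2019 10:01:41")

one  <- sapply(strsplit(mystring, split = ":", fixed = TRUE), `[`, 1)
four <- unlist(strsplit(mystring, split = ":", fixed = TRUE))[c(TRUE, FALSE, FALSE)]
six  <- gsub("(.*?):.*", "\\1", mystring)
five <- format(as.POSIXct(mystring, format = "%d/%m/%Y %H:%M:%OS", tz = "UTC"),
               "%d/%m/%Y %H")

# The string-splitting and regex methods agree with each other
identical(one, four)
identical(one, six)

# The date-parsing method differs: it zero-pads the hour
identical(one, five)
```

Method three is different again, since `strptime` returns a date-time object rather than a character vector.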

Answer 4

Another one:

dates <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
                  "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
                  "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
                  "01/09/2019 10:52:20")
unlist(lapply(dates,function(x) strsplit(x,":")[[1]][1]))

which gives

 [1] "01/09/2019 9"  "01/09/2019 9"  "01/09/2019 9"  "01/09/2019 10" "01/09/2019 10"
 [6] "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"

Answer 5

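Here is another one using gsub.

Capture the pattern with () and refer to the captured group with \\1. The ? is needed to make the regex lazy, since the string contains more than one :.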

gsub("(.*?):.*", "\\1", dates)
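To illustrate why the lazy quantifier matters, here is a small sketch comparing the pattern above with a greedy variant. The greedy pattern and the two-element input are added here for comparison only:

```r
# Shortened input for illustration
dates <- c("01/09/2019 9:51:03", "01/09/2019 10:01:41")

# Lazy: the group stops at the first colon
lazy <- gsub("(.*?):.*", "\\1", dates)

# Greedy (for comparison): the group runs up to the last colon,
# so the minutes are kept
greedy <- gsub("(.*):.*", "\\1", dates)

print(lazy)
print(greedy)
```

Since the pattern matches the whole string once, `sub` would presumably behave the same as `gsub` here.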

Answer 6

You can also use str_extract from the stringr package.
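The code for this answer did not survive, so the following is only a sketch. It borrows the pattern from the `seven` entry of the benchmark in Answer 3, which is assumed (not confirmed) to be this answer's approach, and it requires the stringr package to be installed:

```r
library(stringr)

# Shortened input for illustration
dates <- c("01/09/2019 9:51:03", "01/09/2019 10:01:41")

# Match as much as possible, as long as what follows is
# a colon, some characters, and another colon (lookahead)
hours <- str_extract(dates, ".+(?=:.+:)")
print(hours)
```

The lookahead requires two further colons after the match, so the match ends just before the first colon.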

Remove minutes and seconds from character dates in R
