I have a data frame in R with a column named created_at that contains text I want to parse as a datetime. Here is a quick preview:
head(pushes)
created_at repo.url repository.url
1 2013-06-17T00:14:04Z https://github.com/Mindful/blog
2 2013-07-31T21:08:15Z https://github.com/leapmotion/js.leapmotion.com
3 2012-11-04T07:08:15Z https://github.com/jplusui/jplusui
4 2012-06-21T08:16:22Z https://github.com/LStuker/puppet-rbenv
5 2013-03-10T09:15:51Z https://github.com/Fchaubard/CS108FinalProject
6 2013-10-04T11:34:11Z https://github.com/cmmurray/soccer
actor.login payload.actor actor_attributes.login
1 Mindful
2 joshbuddy
3 xuld
4 LStuker
5 ststanko
6 cmmurray
I wrote a statement that works on some test data:
xts::.parseISO8601("2012-06-17T00:14:04",tz="UTC")$first.time
It returns the correct POSIX date.
But when I apply it to the column with this statement:
pushes$created_at <- xts::.parseISO8601(substr(pushes$created_at,1,nchar(pushes$created_at)-1),tz="UTC")$first.time
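every row in the data frame ends up with the same repeated date, 2012-06-17 00:14:04 UTC.
It is as if the function ran only once, on the first row, and the result was then repeated for the remaining rows. Could you help me apply it correctly to every row of the created_at column?
Thanks.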
Answer 0 (score: 1)
The first argument of .parseISO8601 should be a single character string, not a vector. You need to use sapply (or an equivalent) to loop over your vector.
library(xts)  # provides .parseISO8601
created_at <-
c("2013-06-17T00:14:04Z", "2013-07-31T21:08:15Z", "2012-11-04T07:08:15Z",
"2012-06-21T08:16:22Z", "2013-03-10T09:15:51Z", "2013-10-04T11:34:11Z")
# Only parses first element
.parseISO8601(substr(created_at,1,nchar(created_at)-1),tz="UTC")$first.time
# [1] "2013-06-17 00:14:04 UTC"
firstParseISO8601 <- function(x) .parseISO8601(x,tz="UTC")$first.time
# parse all elements
datetimes <- sapply(sub("Z$","",created_at), firstParseISO8601, USE.NAMES=FALSE)
# note that "simplifying" the output strips the POSIXct class, so we re-add it
datetimes <- .POSIXct(datetimes, tz="UTC")
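If the timestamps always have the fixed form shown in the preview, a base R alternative that avoids the loop may be worth considering. The following is a minimal sketch, not part of the original answer: it assumes every value ends in a literal "Z" and is in UTC, and the small `pushes` data frame is a hypothetical stand-in for the asker's data.

```r
# Sample timestamps copied from the example above
created_at <- c("2013-06-17T00:14:04Z", "2013-07-31T21:08:15Z", "2012-11-04T07:08:15Z")

# as.POSIXct() is vectorized; the format string matches the literal "T" and "Z"
datetimes <- as.POSIXct(created_at, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# Hypothetical stand-in for the asker's `pushes` data frame
pushes <- data.frame(created_at = created_at, stringsAsFactors = FALSE)

# Overwrite the text column with parsed datetimes, one value per row
pushes$created_at <- as.POSIXct(pushes$created_at,
                                format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
```

Because the result is already a POSIXct vector, there is no class to re-add afterwards. Values that do not match the format come back as NA, so a quick anyNA(pushes$created_at) check after the conversion can help catch unexpected input.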