Sum aggregation in Elasticsearch

Time: 2015-09-04 06:14:16

Tags: r data-manipulation

I have data in this format (df). I need to convert the timestamp column (tweetCreatedAt) to a date-time object so that I can manipulate the data further.

    tweetCreatedAt                comment_text
1   2014-05-17T00:00:49.000Z      @truthout: India Elects Hard-Right Hindu
2   2014-05-17T00:00:49.000Z      Narendra Modi is welcome to visit US !

Any help?

I tried the following:

df[,1] <- lapply(df[,1],function(x) as.POSIXct(x, '%Y-%m-%dT%H:%M:%S'))

But now I only get the date, not the actual time.

1 Answer:

Answer 0 (score: 0)

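I'm not sure this is the cause, but it could be. As I mentioned in the comments, because of the process that generated this data set, the elements of the column may be either single values or lists. Consider this example: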

# simplified example
dt = read.table(text = "tweetCreatedAt  comment_text
1   2014-05-17T00:00:49.000Z      @truthout 
2   2014-05-19T00:00:49.000Z     Narendra", header=T)

dt$tweetCreatedAt = as.character(dt$tweetCreatedAt)

# data set looks like
dt

#             tweetCreatedAt comment_text
# 1 2014-05-17T00:00:49.000Z    @truthout
# 2 2014-05-19T00:00:49.000Z     Narendra

as.POSIXct(dt$tweetCreatedAt, format='%Y-%m-%dT%H:%M:%S')

# [1] "2014-05-17 00:00:49 BST" "2014-05-19 00:00:49 BST"


# let's manually change this element to a list
dt$tweetCreatedAt[2] = list(c("2014-05-19T00:00:49.000Z","2014-05-20T00:00:49.000Z"))

# data set now looks like this
dt

#                                       tweetCreatedAt comment_text
# 1                           2014-05-17T00:00:49.000Z    @truthout
# 2 2014-05-19T00:00:49.000Z, 2014-05-20T00:00:49.000Z     Narendra

as.POSIXct(dt$tweetCreatedAt, format='%Y-%m-%dT%H:%M:%S')

# Error in as.POSIXct.default(dt$tweetCreatedAt, format = "%Y-%m-%dT%H:%M:%S") : 
#   do not know how to convert 'dt$tweetCreatedAt' to class “POSIXct”
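
If the column does turn out to contain list elements, one possible way forward is to flatten it before converting. The following is only a minimal sketch under stated assumptions: the data frame is rebuilt by hand to mirror the example above, `n_values` and `flat` are hypothetical names, and repeating a row once per timestamp it holds is an assumed policy that may not suit the real data. Note also that the second positional argument of `as.POSIXct` is `tz`, not `format`, so the format string is passed by name here; `tz = "UTC"` is chosen on the assumption that the trailing `Z` in the timestamps denotes UTC.

```r
# Hypothetical data mirroring the example above:
# the second element of tweetCreatedAt holds two timestamps.
dt <- data.frame(comment_text = c("@truthout", "Narendra"),
                 stringsAsFactors = FALSE)
dt$tweetCreatedAt <- list("2014-05-17T00:00:49.000Z",
                          c("2014-05-19T00:00:49.000Z",
                            "2014-05-20T00:00:49.000Z"))

# Number of values held by each element; anything above 1 needs attention.
n_values <- lengths(dt$tweetCreatedAt)

# Assumed policy: repeat each row once per timestamp it holds.
flat <- data.frame(
  tweetCreatedAt = unlist(dt$tweetCreatedAt),
  comment_text   = rep(dt$comment_text, times = n_values),
  stringsAsFactors = FALSE
)

# Name the format argument explicitly; passed positionally it would be
# interpreted as the time zone.
flat$tweetCreatedAt <- as.POSIXct(flat$tweetCreatedAt,
                                  format = "%Y-%m-%dT%H:%M:%S",
                                  tz = "UTC")
```

Inspecting `n_values` first shows which rows are affected, which may help in deciding whether to expand rows as above or to keep only one timestamp per row.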