This is the data frame I have:
library(dplyr)
# tibble() replaces the deprecated data_frame(); dplyr re-exports it
df <- tibble(
  id = rep(c(1, 2, 3), 3),
  timestamp = as.integer(Sys.time()),
  timezone = rep(
    c('America/New_York', 'America/Chicago', 'America/Los_Angeles'), 3
  )
)
df
# A tibble: 9 x 3
id timestamp timezone
<dbl> <int> <chr>
1 1 1540887120 America/New_York
2 2 1540887120 America/Chicago
3 3 1540887120 America/Los_Angeles
4 1 1540887120 America/New_York
5 2 1540887120 America/Chicago
6 3 1540887120 America/Los_Angeles
7 1 1540887120 America/New_York
8 2 1540887120 America/Chicago
9 3 1540887120 America/Los_Angeles
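I want to convert the epoch seconds to a date-time in each group's time zone, using that group's timezone string. However, when I try this, it doesn't work: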
library(dplyr)
df %>%
  group_by(id) %>%
  mutate(
    new_timestamp = as.POSIXct(
      timestamp, origin = '1970-01-01', tz = first(timezone)
    )
  )
# A tibble: 9 x 4
# Groups: id [3]
id timestamp timezone new_timestamp
<dbl> <int> <chr> <dttm>
1 1 1540887120 America/New_York 2018-10-30 08:12:00
2 2 1540887120 America/Chicago 2018-10-30 08:12:00
3 3 1540887120 America/Los_Angeles 2018-10-30 08:12:00
4 1 1540887120 America/New_York 2018-10-30 08:12:00
5 2 1540887120 America/Chicago 2018-10-30 08:12:00
6 3 1540887120 America/Los_Angeles 2018-10-30 08:12:00
7 1 1540887120 America/New_York 2018-10-30 08:12:00
8 2 1540887120 America/Chicago 2018-10-30 08:12:00
9 3 1540887120 America/Los_Angeles 2018-10-30 08:12:00
Is there a way to make this work?
The following gives me the correct result, but as a list of data frames, which is not what I want:
lapply(
  split(df, df$id), function(x) {
    x$new_timestamp <- as.POSIXct(
      x$timestamp, tz = unique(x$timezone), origin = '1970-01-01'
    )
    x
  }
)
$`1`
# A tibble: 3 x 4
id timestamp timezone new_timestamp
<dbl> <int> <chr> <dttm>
1 1 1540887806 America/New_York 2018-10-30 04:23:26
2 1 1540887806 America/New_York 2018-10-30 04:23:26
3 1 1540887806 America/New_York 2018-10-30 04:23:26
$`2`
# A tibble: 3 x 4
id timestamp timezone new_timestamp
<dbl> <int> <chr> <dttm>
1 2 1540887806 America/Chicago 2018-10-30 03:23:26
2 2 1540887806 America/Chicago 2018-10-30 03:23:26
3 2 1540887806 America/Chicago 2018-10-30 03:23:26
$`3`
# A tibble: 3 x 4
id timestamp timezone new_timestamp
<dbl> <int> <chr> <dttm>
1 3 1540887806 America/Los_Angeles 2018-10-30 01:23:26
2 3 1540887806 America/Los_Angeles 2018-10-30 01:23:26
3 3 1540887806 America/Los_Angeles 2018-10-30 01:23:26
Binding these into a single data frame converts the timestamps back to UTC, which is also not what I want:
bind_rows(
  lapply(
    split(df, df$id), function(x) {
      x$new_timestamp <- as.POSIXct(
        x$timestamp, tz = unique(x$timezone), origin = '1970-01-01'
      )
      x
    }
  )
)
# A tibble: 9 x 4
id timestamp timezone new_timestamp
<dbl> <int> <chr> <dttm>
1 1 1540887806 America/New_York 2018-10-30 08:23:26
2 1 1540887806 America/New_York 2018-10-30 08:23:26
3 1 1540887806 America/New_York 2018-10-30 08:23:26
4 2 1540887806 America/Chicago 2018-10-30 08:23:26
5 2 1540887806 America/Chicago 2018-10-30 08:23:26
6 2 1540887806 America/Chicago 2018-10-30 08:23:26
7 3 1540887806 America/Los_Angeles 2018-10-30 08:23:26
8 3 1540887806 America/Los_Angeles 2018-10-30 08:23:26
9 3 1540887806 America/Los_Angeles 2018-10-30 08:23:26
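For context on the output above, here is a minimal sketch of what a time zone means on a POSIXct value in base R. The names ny and chi are made up for illustration, and the epoch value is the one from the first printout. A POSIXct vector stores seconds since the epoch plus a single "tzone" attribute for the whole vector, so two values built with different zones are the same instant and differ only in how they print. This is presumably why one combined column cannot display three zones at once, though that reading is my assumption.

```r
# Hypothetical names; the epoch value is taken from the first printout
ny  <- as.POSIXct(1540887120, origin = '1970-01-01', tz = 'America/New_York')
chi <- as.POSIXct(1540887120, origin = '1970-01-01', tz = 'America/Chicago')

# Same underlying instant (seconds since the epoch)
as.numeric(ny) == as.numeric(chi)

# The zone is one attribute on the whole vector, not a per-element value
attr(ny, 'tzone')
attr(chi, 'tzone')

# Only the printed representation differs
format(ny, '%H:%M')
format(chi, '%H:%M')
```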
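Note: leaving it as a string works (wrapping the as.POSIXct() call in as.character()). However, I would like to keep the timestamp type for other calculations within each group.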
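To make the string workaround from the note concrete, here is a rough sketch of it. The sample data mirrors the data frame at the top, except that the epoch value is fixed to the one shown in the first printout so the example is reproducible, and res is a name I made up. The result column is character, so it is no longer usable for date-time arithmetic.

```r
library(dplyr)

# Sample data mirroring the question, with a fixed epoch value (an assumption
# made for reproducibility instead of calling Sys.time())
df <- tibble(
  id = rep(c(1, 2, 3), 3),
  timestamp = 1540887120L,
  timezone = rep(
    c('America/New_York', 'America/Chicago', 'America/Los_Angeles'), 3
  )
)

# Convert within each group, then immediately turn the result into text,
# so each group keeps the local time of its own zone
res <- df %>%
  group_by(id) %>%
  mutate(
    new_timestamp = as.character(
      as.POSIXct(timestamp, origin = '1970-01-01', tz = first(timezone))
    )
  ) %>%
  ungroup()

res
```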