Convert the same column in multiple data frames

Time: 2017-03-04 03:35:25

Tags: r loops posix

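I have multiple data frames in my global environment; let's call them a, b, and c.

Each data frame has a column named start_time that needs to be converted to a POSIX class, but I'm looking for a way to do this without writing out the same code for every data frame. The code is: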

 a$start_time <- strptime(a$start_time, format = '%Y-%m-%d %H:%M:%S')

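This only converts start_time in a.

Given the names of the data frames, how can I loop over each one and convert start_time to POSIX?

This attempt with lapply only works for the first data frame...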

ll <- list(a, b, c)
lapply(ll,function(df){
  df$start_time <- strptime(df$start_time, format = '%Y-%m-%d %H:%M:%S')         

})

2 answers:

Answer 0: (score: 1)

Data: df1, df2, df3

df1 <- data.frame(start_time = seq(Sys.time(), Sys.time() + 100, 10))    
df2 <- data.frame(start_time = seq(Sys.time(), Sys.time() + 100, 10))    
df3 <- data.frame(start_time = seq(Sys.time(), Sys.time() + 100, 10))

# create a vector with names of the data frames   
data_vec <- c('df1', 'df2', 'df3')

# loop through the data_vec and modify the start_time column
a1 <- lapply(data_vec, function( x ) {
  x <- get( x )
  x <- within(x, start_time <- strptime(start_time, format = '%Y-%m-%d %H:%M:%S') )
  return( x )
  })

# assign names to the modified data in a1
names(a1) <- data_vec

# list objects in global environment
ls()
# [1] "a1"       "data_vec" "df1"      "df2"      "df3" 

# remove df1, df2, df3 from global environment
rm(list = c('df1', 'df2', 'df3') )

# confirm the removal of data
ls()
# [1] "a1"       "data_vec"

# assign the named list in a1 as data in global environment
list2env(a1, envir = .GlobalEnv)

# list objects in global environment and confirm that the data appeared again
ls()
# [1] "a1"       "data_vec" "df1"      "df2"      "df3"     

# output
head(df1)
#            start_time
# 1 2017-03-03 22:49:54
# 2 2017-03-03 22:50:04
# 3 2017-03-03 22:50:14
# 4 2017-03-03 22:50:24
# 5 2017-03-03 22:50:34
# 6 2017-03-03 22:50:44

head(df2)
#            start_time
# 1 2017-03-03 22:49:54
# 2 2017-03-03 22:50:04
# 3 2017-03-03 22:50:14
# 4 2017-03-03 22:50:24
# 5 2017-03-03 22:50:34
# 6 2017-03-03 22:50:44
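
The steps above can also be sketched more compactly with mget(), which collects objects by name into an already-named list. This is only a sketch under stated assumptions: the sample data is made up, start_time is assumed to be stored as character strings, and the time zone is fixed to UTC to keep the result reproducible.

```r
# Hypothetical sample data: start_time stored as character strings
df1 <- data.frame(start_time = c("2017-03-03 22:49:54", "2017-03-03 22:50:04"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(start_time = c("2017-03-04 08:00:00", "2017-03-04 08:00:10"),
                  stringsAsFactors = FALSE)

# mget() returns a named list, so the names survive the lapply() call
dfs <- mget(c("df1", "df2"), envir = .GlobalEnv)

converted <- lapply(dfs, function(df) {
  df$start_time <- as.POSIXct(df$start_time,
                              format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
  df  # return the whole data frame, not just the column
})

# Overwrite the originals in the global environment
invisible(list2env(converted, envir = .GlobalEnv))

class(df1$start_time)
```

Because the list is named from the start, the separate names() and rm() steps are not needed; list2env() simply replaces the existing objects of the same name.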

Answer 1: (score: 1)

In the OP's code, the data frame is not returned from the function. So it should basically be

lapply(ll,function(df){
  df$start_time <- strptime(df$start_time, format = '%Y-%m-%d %H:%M:%S')         
  df
})
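
To illustrate why the extra line matters: an R function returns the value of its last expression, and the value of an assignment is the assigned value, so without the trailing df each list element is just the converted column. Note also that lapply() returns a new list and leaves the original data frames untouched. A small sketch with made-up data (as.POSIXct and tz = "UTC" are used here only to keep the example simple and reproducible):

```r
# Hypothetical sample data frame
a <- data.frame(id = 1:2,
                start_time = c("2017-03-03 22:49:54", "2017-03-03 22:50:04"),
                stringsAsFactors = FALSE)
ll <- list(a = a)

# Last expression is the assignment, so only the converted column comes back
without_return <- lapply(ll, function(df) {
  df$start_time <- as.POSIXct(df$start_time,
                              format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
})

# Last expression is df, so the whole data frame comes back
with_return <- lapply(ll, function(df) {
  df$start_time <- as.POSIXct(df$start_time,
                              format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
  df
})

is.data.frame(without_return$a)  # FALSE
is.data.frame(with_return$a)     # TRUE
class(a$start_time)              # "character": the original a is unchanged
```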

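However, transform is an option that avoids both the anonymous function and the explicit return of the object. Also, strptime returns an object of class POSIXlt. If we only need POSIXct, use as.POSIXct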

lapply(ll, transform, start_time = as.POSIXct(start_time,  format = '%Y-%m-%d %H:%M:%S'))
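
The difference between the two classes can be checked directly. The following is a minimal sketch with a made-up timestamp and an assumed tz = "UTC"; it also runs the transform call on a one-element list to confirm that a full data frame comes back.

```r
x <- "2017-03-03 22:49:54"  # hypothetical sample value

lt <- strptime(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
ct <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

class(lt)  # "POSIXlt" "POSIXt"
class(ct)  # "POSIXct" "POSIXt"

# POSIXlt is a list of components (sec, min, hour, ...);
# POSIXct is a single number of seconds since the epoch
is.list(unclass(lt))     # TRUE
is.numeric(unclass(ct))  # TRUE

# transform() evaluates start_time inside each data frame
# and returns the modified copy
ll <- list(a = data.frame(start_time = x, stringsAsFactors = FALSE))
res <- lapply(ll, transform,
              start_time = as.POSIXct(start_time,
                                      format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))
class(res$a$start_time)
```

POSIXct is usually the more convenient choice for a data frame column, since it is stored as a plain numeric vector.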

Or, to make it more compact

library(lubridate)
lapply(ll, transform, start_time = ymd_hms(start_time))