R: Collecting class information for the variables in a data_frame

Time: 2016-01-15 19:19:00

Tags: r dataframe dplyr

I want to collect the class of certain columns of a data_frame. So:

df_x = data_frame(date1 = seq.POSIXt(from = as.POSIXct("2016-01-01 12:00:00 UTC"), 
                              to = as.POSIXct("2016-01-03 12:00:00 UTC"), by = "hour", 
                              tz = "UTC"),
           date2 = seq.POSIXt(from = as.POSIXct("2016-01-01 12:00:00 UTC"), 
                              to = as.POSIXct("2016-01-03 12:00:00 UTC"), by = "hour", 
                              tz = "UTC"))

class(df_x$date1)
class(df_x$date2)

How can I collect the results of calling class on a bunch of columns? If I keep only the first element of the returned vector, calling summarise_each works:

# summarize_each
get_class = function(x) class(x)[1]
df_x %>% 
  summarise_each(funs(get_class), date1, date2)

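What I would like to know is how, for a bunch of variables in a data_frame, I can get a data_frame with as many rows as the maximum number of classes of any column. My guess is that some combination of a call to do and some post-processing could work here.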

Desired result:

# desired result
df_result = data_frame(date1 = c("POSIXct", "POSIXt"),
                       date2 = c("POSIXct", "POSIXt"))

> df_result
Source: local data frame [2 x 2]

    date1   date2
    (chr)   (chr)
1 POSIXct POSIXct
2  POSIXt  POSIXt

2 Answers:

Answer 0 (score: 0)

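I haven't used dplyr, so I'll stick to base R.

Create a data frame whose columns have different numbers of classes, then collect the class vector of each column in a list.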

df_x <- data.frame(date1 = seq.POSIXt(from = as.POSIXct("2016-01-01 12:00:00 UTC"),
                                      to = as.POSIXct("2016-01-03 12:00:00 UTC"), by = "hour",
                                      tz = "UTC"),
                   date2 = seq.POSIXt(from = as.POSIXct("2016-01-01 12:00:00 UTC"),
                                      to = as.POSIXct("2016-01-03 12:00:00 UTC"), by = "hour",
                                      tz = "UTC"),
                   character1 = sample(LETTERS, 49, replace = TRUE),
                   numeric1 = sample(1:100, 49, replace = TRUE))

df_x.class <- lapply(df_x, class)

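Create a function that pads every vector in the list with NA to the same length and binds them column-wise:

cbind.fill <- function(n) {
    # Length of the longest vector in the list
    vec.len <- max(unlist(lapply(n, length)))
    # Pad shorter vectors with NA up to vec.len
    list.appended <- lapply(n, function(m) {
        if (length(m) < vec.len) {
            vec.append <- vec.len - length(m)
            m <- c(m, rep(NA, vec.append))
        }
        m
    })
    # Bind the equal-length vectors column-wise into a matrix
    do.call(cbind, list.appended)
}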

cbind.fill(df_x.class)
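
The same padding idea can probably be written more compactly with the base replacement function `length<-`, which extends a vector and fills the new positions with NA. The block below is only a sketch: `df_small`, `classes`, `n_max`, `padded`, and `class_matrix` are hypothetical names, and the two-column data frame is a shortened stand-in for the example above, not the original data.

```r
# Small stand-in data frame (hypothetical; not the 49-row example above).
df_small <- data.frame(
  date1    = as.POSIXct("2016-01-01 12:00:00", tz = "UTC") + 3600 * 0:2,
  numeric1 = c(1.5, 2.5, 3.5)
)

# One character vector of class names per column.
classes <- lapply(df_small, class)

# The longest class vector determines the number of rows.
n_max <- max(lengths(classes))

# `length<-` extends each vector to n_max, filling with NA.
padded <- lapply(classes, `length<-`, n_max)

# Bind the padded vectors column-wise into a character matrix.
class_matrix <- do.call(cbind, padded)
class_matrix
```

Here date1 contributes two class names and numeric1 only one, so the numeric1 column is padded with a single NA.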

Answer 1 (score: 0)

With your example, I would try the following one-liner:

as.data.frame(lapply(df_x, class))
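
To illustrate, here is a minimal sketch of the one-liner on a shortened, three-row version of the question's data (the shortening is an assumption made to keep the block self-contained). `stringsAsFactors = FALSE` is an addition that is not in the one-liner above; it keeps the class names as character in R versions before 4.0. The name `res` is hypothetical.

```r
# Shortened stand-in for the question's df_x (three hourly timestamps).
df_x <- data.frame(
  date1 = as.POSIXct("2016-01-01 12:00:00", tz = "UTC") + 3600 * 0:2,
  date2 = as.POSIXct("2016-01-01 12:00:00", tz = "UTC") + 3600 * 0:2
)

# One column per variable, one row per class name.
res <- as.data.frame(lapply(df_x, class), stringsAsFactors = FALSE)
res
```

This relies on every column having the same number of classes. When the counts differ, data.frame either recycles the shorter vectors (if the lengths divide evenly) or stops with an error, so padding with NA first is the safer route in that case.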