I am generating a simple summary table with the code below:
# Data
data("mtcars")
# Lib
require(dplyr)
# Summary
mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1"))
The code produces the expected result:
> head(mt_sum)
Source: local data frame [2 x 10]
am mpg_min cyl_min mpg_mean cyl_mean mpg_median cyl_median mpg_max cyl_max Freq
(chr) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (int)
1 0 10.4 4 17.14737 6.947368 17.3 8 24.4 8 19
2 1 15.0 4 24.39231 5.076923 22.8 4 33.9 8 13
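However, I am not happy with the way the columns are ordered. In particular, I would like to:
1. Order the columns by name
2. Achieve this via select() in dplyr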
The desired order is as follows:
> names(mt_sum)[order(names(mt_sum))]
[1] "am" "cyl_max" "cyl_mean" "cyl_median" "cyl_min" "Freq" "mpg_max"
[8] "mpg_mean" "mpg_median" "mpg_min"
Ideally, I would like to order the columns inside select() using names(mt_sum)[order(names(mt_sum))]. However, the code:
mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1")) %>%
  select(names(.)[order(names(.))])
returns the expected error:
Error: All select() inputs must resolve to integer column positions. The following do not: * names(.)[order(names(.))]
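In my real data I am generating a large number of summary columns. Hence my question: how can I dynamically pass the sorted column names to select() in dplyr so that it understands them and applies them to the data.frame at hand?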
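My focus is on finding a way to pass dynamically generated column names to select(). I know that I could sort the columns in base R or type out the names as discussed here.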
Answer 0 (score: 8)
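You are definitely on the right track. Instead of select(), you can index the piped data frame with the sorted names as the last step: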
mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1")) %>%
  .[, names(.)[order(names(.))]]
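The same base-R indexing can be tried on its own, outside a pipe. The following is only a minimal sketch: the toy data frame `df` and its column names are made up for illustration and are not taken from the question's data.

```r
# Toy data frame with hypothetical column names (not from the question)
df <- data.frame(mpg_min = 10.4, am = "0", cyl_max = 8, freq_n = 19,
                 stringsAsFactors = FALSE)

# Sorted column names as a character vector; sort(x) is equivalent to x[order(x)]
sorted_names <- sort(names(df))

# Base subsetting accepts a character vector of column names.
# drop = FALSE keeps a data frame even if only one column is selected.
df_sorted <- df[, sorted_names, drop = FALSE]

names(df_sorted)
```

Note that sorting character vectors depends on the locale's collation rules, so where a mixed-case name such as "Freq" lands can differ between systems.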
Answer 1 (score: 6)
You only need:
mt_sum %>% select(order(names(.)))
#Source: local data frame [2 x 10]
#
# am cyl_max cyl_mean cyl_median cyl_min Freq mpg_max mpg_mean mpg_median mpg_min
# (chr) (dbl) (dbl) (dbl) (dbl) (int) (dbl) (dbl) (dbl) (dbl)
#1 0 8 6.947368 8 4 19 24.4 17.14737 17.3 10.4
#2 1 8 5.076923 4 4 13 33.9 24.39231 22.8 15.0
It works because order returns integer column positions, which is what select requires.
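To make the difference concrete, here is a small sketch on a toy data frame (the data frame `df` and its column names are hypothetical). The second variant assumes a newer dplyr/tidyselect release that provides all_of() (tidyselect 1.0.0 or later); it is not available in the dplyr version that produced the output shown in the question.

```r
library(dplyr)

# Toy data frame with hypothetical column names (not from the question)
df <- data.frame(mpg_min = 10.4, am = "0", cyl_max = 8,
                 stringsAsFactors = FALSE)

# order() returns integer positions of the names in sorted order
pos <- order(names(df))

# select() accepts those integer positions directly
by_position <- df %>% select(order(names(.)))

# Assumption: tidyselect >= 1.0.0, where all_of() lets a character
# vector of column names be passed to select()
by_name <- df %>% select(all_of(sort(names(.))))

names(by_position)
names(by_name)
```

Both variants should give the same column order; the all_of() form is convenient when the names come from a character vector built elsewhere in the code.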