我正在寻找一个基于函数中定义的变量名称以升序或降序对变量进行排名的函数。
我可以手动进行排名,但是我希望能够调用该函数以简化df
的代码。寻找某人向我展示如何在宽而长的df
上执行该功能。我的示例代码如下。我希望收看电视节目,并且分钟数排名递增,其他各列均按降序排列。如果有人可以向我展示如何执行该函数,以便为升序和降序变量定义变量名,以及仅定义要按降序排列的变量,而所有其他列均默认为变量的另一种选择,那就太好了升序。
library(tidyverse)
df <- tibble::tribble(
~Name, ~Team, ~minutes, ~ftm, ~fta, ~oreb, ~dreb, ~treb, ~ast, ~stl, ~blk, ~tov, ~pts, ~eff,
"Russell Westbrook", "OKC", 34.6, 8.8, 10.4, 1.7, 9, 10.7, 10.4, 1.6, 0.4, 5.4, 31.6, 33.8,
"James Harden", "HOU", 36.4, 9.2, 10.9, 1.2, 7, 8.1, 11.2, 1.5, 0.5, 5.7, 29.1, 32.4,
"Isaiah Thomas", "BOS", 33.8, 7.8, 8.5, 0.6, 2.1, 2.7, 5.9, 0.9, 0.2, 2.8, 28.9, 24.7,
"Anthony Davis", "NOP", 36.1, 6.9, 8.6, 2.3, 9.5, 11.8, 2.1, 1.3, 2.2, 2.4, 28, 31.1,
"DeMar DeRozan", "TOR", 35.4, 7.4, 8.7, 0.9, 4.3, 5.2, 3.9, 1.1, 0.2, 2.4, 27.3, 22.7,
"Damian Lillard", "POR", 35.9, 6.5, 7.3, 0.6, 4.3, 4.9, 5.9, 0.9, 0.3, 2.6, 27, 24.5,
"DeMarcus Cousins", "NOP", 34.2, 7.2, 9.3, 2.1, 8.9, 11, 4.6, 1.4, 1.3, 3.7, 27, 28.5,
"LeBron James", "CLE", 37.8, 4.8, 7.2, 1.3, 7.3, 8.6, 8.7, 1.2, 0.6, 4.1, 26.4, 31,
"Kawhi Leonard", "SAS", 33.4, 6.3, 7.2, 1.1, 4.7, 5.8, 3.5, 1.8, 0.7, 2.1, 25.5, 25.3,
"Stephen Curry", "GSW", 33.4, 4.1, 4.6, 0.8, 3.7, 4.5, 6.6, 1.8, 0.2, 3, 25.3, 25.2
)
df_wide <- df %>%
mutate_at(vars(ftm, ast), funs(rank = rank(desc(.)))) %>%
mutate_at(vars(tov, minutes), funs(rank = rank((.))))
df_wide
#> # A tibble: 10 x 18
#> Name Team minutes ftm fta oreb dreb treb ast stl blk
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Russ~ OKC 34.6 8.8 10.4 1.7 9 10.7 10.4 1.6 0.4
#> 2 Jame~ HOU 36.4 9.2 10.9 1.2 7 8.1 11.2 1.5 0.5
#> 3 Isai~ BOS 33.8 7.8 8.5 0.6 2.1 2.7 5.9 0.9 0.2
#> 4 Anth~ NOP 36.1 6.9 8.6 2.3 9.5 11.8 2.1 1.3 2.2
#> 5 DeMa~ TOR 35.4 7.4 8.7 0.9 4.3 5.2 3.9 1.1 0.2
#> 6 Dami~ POR 35.9 6.5 7.3 0.6 4.3 4.9 5.9 0.9 0.3
#> 7 DeMa~ NOP 34.2 7.2 9.3 2.1 8.9 11 4.6 1.4 1.3
#> 8 LeBr~ CLE 37.8 4.8 7.2 1.3 7.3 8.6 8.7 1.2 0.6
#> 9 Kawh~ SAS 33.4 6.3 7.2 1.1 4.7 5.8 3.5 1.8 0.7
#> 10 Step~ GSW 33.4 4.1 4.6 0.8 3.7 4.5 6.6 1.8 0.2
#> # ... with 7 more variables: tov <dbl>, pts <dbl>, eff <dbl>,
#> # ftm_rank <dbl>, ast_rank <dbl>, tov_rank <dbl>, minutes_rank <dbl>
df_long <- df %>%
gather(key = data_col, value = "stat_value", 3:14) %>%
group_by(data_col) %>%
mutate(rank = if_else(data_col %in% c("tov", "minutes"), rank(stat_value, ties.method = "first"), rank(-stat_value, ties.method = "first")))
df_long
#> # A tibble: 120 x 5
#> # Groups: data_col [12]
#> Name Team data_col stat_value rank
#> <chr> <chr> <chr> <dbl> <int>
#> 1 Russell Westbrook OKC minutes 34.6 5
#> 2 James Harden HOU minutes 36.4 9
#> 3 Isaiah Thomas BOS minutes 33.8 3
#> 4 Anthony Davis NOP minutes 36.1 8
#> 5 DeMar DeRozan TOR minutes 35.4 6
#> 6 Damian Lillard POR minutes 35.9 7
#> 7 DeMarcus Cousins NOP minutes 34.2 4
#> 8 LeBron James CLE minutes 37.8 10
#> 9 Kawhi Leonard SAS minutes 33.4 1
#> 10 Stephen Curry GSW minutes 33.4 2
#> # ... with 110 more rows
我想要的输出将与上面列出的df
相同。我正在寻找创建一个清理手册if_else和上面两行代码的函数。假设该函数名为stat_rank
。我希望代码可以执行以下操作:
df_wide <- df %>%
mutate_at(vars(ftm, ast, tov, minutes), funs(rank = stat_rank(.))))
df_long <- df %>%
gather(key = data_col, value = "stat_value", 3:14) %>%
group_by(data_col) %>%
mutate(rank = stat_rank(stat_value))
答案 0 :(得分:1)
如果我们需要一个功能,那么
#owl-featured > div > div {
margin-bottom: -45px;
}
请注意,在上述实现中,列名是在函数内进行硬编码的。如果需要更灵活,那么可以通过列名作为另一参数
stat_rank <- function(x) {
col1 <- deparse(substitute(x))
if(col1 %in% c('ftm', 'ast')) {
rank(desc(x))
} else rank(x)
}
df %>%
mutate_at(vars(ftm, ast, tov, minutes), funs(rank = stat_rank))
# A tibble: 10 x 18
# Name Team minutes ftm fta oreb dreb treb ast stl blk tov pts eff ftm_rank ast_rank tov_rank minutes_rank
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 Russell Wes… OKC 34.6 8.8 10.4 1.7 9 10.7 10.4 1.6 0.4 5.4 31.6 33.8 2 2 9 5
# 2 James Harden HOU 36.4 9.2 10.9 1.2 7 8.1 11.2 1.5 0.5 5.7 29.1 32.4 1 1 10 9
# 3 Isaiah Thom… BOS 33.8 7.8 8.5 0.6 2.1 2.7 5.9 0.9 0.2 2.8 28.9 24.7 3 5.5 5 3
# 4 Anthony Dav… NOP 36.1 6.9 8.6 2.3 9.5 11.8 2.1 1.3 2.2 2.4 28 31.1 6 10 2.5 8
# 5 DeMar DeRoz… TOR 35.4 7.4 8.7 0.9 4.3 5.2 3.9 1.1 0.2 2.4 27.3 22.7 4 8 2.5 6
# 6 Damian Lill… POR 35.9 6.5 7.3 0.6 4.3 4.9 5.9 0.9 0.3 2.6 27 24.5 7 5.5 4 7
# 7 DeMarcus Co… NOP 34.2 7.2 9.3 2.1 8.9 11 4.6 1.4 1.3 3.7 27 28.5 5 7 7 4
# 8 LeBron James CLE 37.8 4.8 7.2 1.3 7.3 8.6 8.7 1.2 0.6 4.1 26.4 31 9 3 8 10
# 9 Kawhi Leona… SAS 33.4 6.3 7.2 1.1 4.7 5.8 3.5 1.8 0.7 2.1 25.5 25.3 8 9 1 1.5
#10 Stephen Cur… GSW 33.4 4.1 4.6 0.8 3.7 4.5 6.6 1.8 0.2 3 25.3 25.2 10 4 6 1.5
对于长格式数据,可以使用一个函数
stat_rank <- function(x, descCols) {
col1 <- deparse(substitute(x))
if(col1 %in% descCols) {
rank(desc(x))
} else rank(x)
}
df %>%
mutate_at(vars(ftm, ast, tov, minutes),
funs(rank = stat_rank(., descCols = c('ftm', 'ast'))))