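I have a data frame in which every column holds numeric values. Is there a way to create a new column that lists, for each row, the column names ordered from largest to smallest value? For example, please see my expected output below.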
df
COlA COlB COLC COLD
34 40 4.5 50
35 70 2.0 30
90 40 4.0 10
Expected output
df
COlA COlB COLC COLD NewCOl
34 40 4.5 50 COLD > COLB > COLA > COLC
35 70 2.0 30 COLB > COLA > COLD > COLC
90 40 4.0 10 COLA > COLB > COLD > COLC
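Answer 0: (score: 2)
In base R you can do it like this (this is only a correct solution when there are no ties; tied columns are still joined by ' > ', in their original column order):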
# Split the values by row index, then paste the column names in decreasing order of value
dat$NewCol <- by(
  unlist(dat),
  row(dat),
  function(x) paste(names(dat)[order(x, decreasing = TRUE)], collapse = ' > ')
)
dat
# COlA COlB COLC COLD NewCol
# 1 34 40 4.5 50 COLD > COlB > COlA > COLC
# 2 35 70 2.0 30 COlB > COlA > COLD > COLC
# 3 90 40 4.0 10 COlA > COlB > COLD > COLC
Data:
dat <- structure(
  list(
    COlA = c(34L, 35L, 90L),
    COlB = c(40L, 70L, 40L),
    COLC = c(4.5, 2, 4),
    COLD = c(50L, 30L, 10L)
  ),
  class = "data.frame",
  row.names = c(NA, -3L)
)
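The caveat about ties can be made concrete. The following is a minimal sketch, not part of the original answer, that joins tied columns with ' = ' instead of ' > '. The helper name rank_with_ties and the two-row data frame tied are made up for illustration, and the exact equality test is only safe for values that compare exactly (such as integers).

```r
# Hypothetical helper: column names in decreasing order of value,
# joined by " = " where neighbouring values are equal and " > " otherwise.
rank_with_ties <- function(x) {
  o <- order(x, decreasing = TRUE)
  v <- x[o]
  # One separator per gap between consecutive sorted values
  sep <- ifelse(diff(v) == 0, " = ", " > ")
  # Pad with "" so that the last name gets no trailing separator
  paste0(names(x)[o], c(sep, ""), collapse = "")
}

# Illustrative data: the second row has a tie between COlA and COlB
tied <- data.frame(
  COlA = c(34, 40),
  COlB = c(40, 40),
  COLC = c(4.5, 2),
  COLD = c(50, 30)
)

# apply() passes each row as a numeric vector named by column
tied$NewCol <- apply(tied, 1, rank_with_ties)
tied$NewCol
```

For floating-point columns, a tolerance-based comparison would probably be a better choice than diff(v) == 0.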
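Answer 1: (score: 1)
After creating a row-number column, we can reshape the data to 'long' format, build the new column by pasting together (with str_c) the 'name' column ordered by decreasing 'value', reshape back to 'wide' format, and select the columns.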
library(dplyr)
library(tidyr)   # pivot_longer() / pivot_wider() need tidyr >= 1.0.0
library(stringr)
df %>%
  mutate(rn = row_number()) %>%
  pivot_longer(cols = -rn) %>%
  # or use gather for older versions
  # gather(name, value, -rn) %>%
  group_by(rn) %>%
  mutate(NewCol = str_c(name[order(-value)], collapse = ' > ')) %>%
  pivot_wider(names_from = name, values_from = value) %>%
  # or use spread for older versions
  # spread(name, value) %>%
  ungroup %>%
  select(names(df), NewCol)
# A tibble: 3 x 5
# COlA COlB COLC COLD NewCol
# <dbl> <dbl> <dbl> <dbl> <chr>
#1 34 40 4.5 50 COLD > COlB > COlA > COLC
#2 35 70 2 30 COlB > COlA > COLD > COLC
#3 90 40 4 10 COlA > COlB > COLD > COLC
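To show what the long-format step is doing, here is a rough base R sketch of the same idea without tidyr. It is an illustration rather than the answer's own code; the object names wide, long and new_col are made up.

```r
wide <- data.frame(
  COlA = c(34, 35, 90),
  COlB = c(40, 70, 40),
  COLC = c(4.5, 2, 4),
  COLD = c(50, 30, 10)
)

# Long format: one row per (row number, column name, value) triple.
# unlist() walks the data frame column by column, so rn cycles within each column.
long <- data.frame(
  rn    = rep(seq_len(nrow(wide)), times = ncol(wide)),
  name  = rep(names(wide), each = nrow(wide)),
  value = unlist(wide, use.names = FALSE)
)

# Sort by row number, then by decreasing value within each row
long <- long[order(long$rn, -long$value), ]

# Collapse the ordered names of each original row into one string
new_col <- tapply(long$name, long$rn, paste, collapse = " > ")

wide$NewCol <- as.vector(new_col)
wide
```

Because the grouping variable is the integer row number, tapply() returns the groups in row order, so the result lines up with the original rows.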
Another option is pmap, which processes the data frame row by row:
library(purrr)
df %>%
  mutate(NewCol = pmap_chr(., ~ c(...) %>%
    {names(.)[order(-.)]} %>%
    str_c(collapse = " > ")))
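The row-wise idea behind pmap, where each row's values arrive as named arguments, can also be sketched in base R with mapply. This is an assumed equivalent for illustration only; row_order and d are hypothetical names, and the approach would break if a column were named like an mapply argument (for example FUN or SIMPLIFY).

```r
d <- data.frame(
  COlA = c(34, 35, 90),
  COlB = c(40, 70, 40),
  COLC = c(4.5, 2, 4),
  COLD = c(50, 30, 10)
)

# Hypothetical helper: receives one row as named arguments
row_order <- function(...) {
  x <- c(...)  # named numeric vector, names are the column names
  paste(names(x)[order(-x)], collapse = " > ")
}

# Each column becomes one named argument to mapply, so row_order is called
# once per row as row_order(COlA = ..., COlB = ..., COLC = ..., COLD = ...)
d$NewCol <- do.call(mapply, c(list(FUN = row_order), d))
d
```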
If the data frame also contains character columns, restrict it to the numeric columns with select_if:
df %>%
  mutate(NewCol = pmap_chr(select_if(., is.numeric), ~ c(...) %>%
    {names(.)[order(-.)]} %>%
    str_c(collapse = " > ")))
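The reason for dropping character columns is that c(...) would coerce every value in the row to character, after which the unary minus inside order() fails. A small base R sketch of the same filtering step follows; the id column and the names mixed and num are invented for the example.

```r
# Illustrative data with one non-numeric column
mixed <- data.frame(
  id   = c("a", "b", "c"),
  COlA = c(34, 35, 90),
  COlB = c(40, 70, 40),
  COLC = c(4.5, 2, 4),
  COLD = c(50, 30, 10)
)

# Keep only the numeric columns before ranking
num <- mixed[vapply(mixed, is.numeric, logical(1))]

# Rank the numeric columns within each row
mixed$NewCol <- apply(num, 1, function(x) {
  paste(names(x)[order(-x)], collapse = " > ")
})
mixed
```

In dplyr 1.0.0 and later, select(., where(is.numeric)) is the recommended replacement for select_if(., is.numeric).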
Data:
df <- structure(
  list(
    COlA = c(34L, 35L, 90L),
    COlB = c(40L, 70L, 40L),
    COLC = c(4.5, 2, 4),
    COLD = c(50L, 30L, 10L)
  ),
  class = "data.frame",
  row.names = c(NA, -3L)
)