Is there a way to capture the sequence of values based on their rank?

Time: 2019-11-21 04:31:35

Tags: r

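Hi all, I have a data frame. I need to add one column per category that gives the position of that category within ColA. For example, see the expected output below.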

df
ColB   ColA       
  X    A>B>C  
  U    B>C>A
  Z    C>A>B

Expected output

df1
ColB    ColA     A       B       C
  X     A>B>C    1       2       3
  U     B>C>A    3       1       2
  Z     C>A>B    2       3       1

2 answers:

Answer 0 (score: 1)

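We can first split a copy of ColA into separate rows, group_by ColB, give each entry its row number within the group, and then use pivot_wider to reshape the data to wide format.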

library(dplyr)
library(tidyr)

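df %>%
  # keep ColA intact and split a copy into one row per category
  mutate(ColC = ColA) %>%
  separate_rows(ColC, sep = ">") %>%
  # number the categories within each ColB in order of appearance
  group_by(ColB) %>%
  mutate(row = row_number()) %>%
  # spread the categories into columns holding their positions
  pivot_wider(names_from = ColC, values_from = row)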

#  ColB  ColA      A     B     C
#  <fct> <fct> <int> <int> <int>
#1 X     A>B>C     1     2     3
#2 U     B>C>A     3     1     2
#3 Z     C>A>B     2     3     1
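
To make the intermediate step easier to follow, here is a minimal sketch that stops before pivot_wider and shows the long format. It rebuilds the example data with character columns rather than factors (an assumption made for readability), and the name `long` is only illustrative. Note that the approach relies on every ColB value being unique, since ColB is what identifies a row when numbering the entries.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question, built with plain character columns.
df <- data.frame(
  ColB = c("X", "U", "Z"),
  ColA = c("A>B>C", "B>C>A", "C>A>B"),
  stringsAsFactors = FALSE
)

# Long format: one row per (ColB, category) pair, numbered within each ColB.
long <- df %>%
  mutate(ColC = ColA) %>%
  separate_rows(ColC, sep = ">") %>%
  group_by(ColB) %>%
  mutate(row = row_number()) %>%
  ungroup()

print(long)
```

Each original row becomes three rows here, and the `row` column already holds the rank that pivot_wider later moves into the A, B and C columns.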

Data

df <- structure(list(ColB = structure(c(2L, 1L, 3L), .Label = c("U", 
"X", "Z"), class = "factor"), ColA = structure(1:3, .Label = c("A>B>C", 
"B>C>A", "C>A>B"), class = "factor")), class = "data.frame", row.names = c(NA, -3L))

Answer 1 (score: 0)

We can do this in base R:

df[LETTERS[1:3]] <- t(sapply(regmatches(df$ColA, gregexpr("[A-Z]", 
      df$ColA)), match, x = LETTERS[1:3]))
df
#  ColB  ColA A B C
#1    X A>B>C 1 2 3
#2    U B>C>A 3 1 2
#3    Z C>A>B 2 3 1
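
The call above hard-codes the labels as LETTERS[1:3] and matches single capital letters. As a hedged variation, not part of the original answer, the sketch below splits on ">" with strsplit and derives the labels from the data, which may help when categories are longer than one character. The names `parts`, `cats` and `ranks` are illustrative, and it assumes at least two distinct categories so that vapply returns a matrix.

```r
# Same example data as in the question, built with plain character columns.
df <- data.frame(
  ColB = c("X", "U", "Z"),
  ColA = c("A>B>C", "B>C>A", "C>A>B"),
  stringsAsFactors = FALSE
)

# Split each ranking on the literal ">" separator.
parts <- strsplit(as.character(df$ColA), ">", fixed = TRUE)

# Derive the category labels from the data instead of hard-coding them.
cats <- sort(unique(unlist(parts)))

# For each row, find the position of every category within that row's ranking.
ranks <- t(vapply(parts, function(p) match(cats, p), integer(length(cats))))

# Add one column per category; a category missing from a row would get NA.
df[cats] <- ranks
print(df)
```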

Data

df <- structure(list(ColB = structure(c(2L, 1L, 3L), .Label = c("U", 
"X", "Z"), class = "factor"), ColA = structure(1:3, .Label = c("A>B>C", 
"B>C>A", "C>A>B"), class = "factor")), class = "data.frame", row.names = c(NA, -3L))