Split values into new columns based on duplicates in rows, in R

Time: 2019-10-17 21:18:22

Tags: r

I have the following example data frame:

  first         second       third
 ----------------------------------
  A             1            A1
  A             2            A2
  A             3            A2
  B             1            B1
  C             1            C1
  C             2            C2

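Is it possible to split the second and third columns into new columns based on the repeated values/rows in the first column? Like this: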

  first         second       second.2      second.3    third   third.2     third.3
  A             1            2             3           A1      A2          A2
  B             1            NA            NA          B1      NA          NA
  C             1            2             NA          C1      C2          NA

2 answers:

Answer 0 (score: 2)

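One option is pivot_wider. Here the 'second' column also happens to be the sequence column within each 'first' group, so copy that column with mutate and then use pivot_wider to reshape from 'long' to 'wide'.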

library(dplyr)
library(tidyr)
df1 %>%
   mutate(rn = second) %>% 
   pivot_wider(names_from  = rn, values_from = c(second, third), names_sep = ".")
# A tibble: 3 x 7
#  first second.1 second.2 second.3 third.1 third.2 third.3
#  <chr>    <int>    <int>    <int> <chr>   <chr>   <chr>  
#1 A            1        2        3 A1      A2      A2     
#2 B            1       NA       NA B1      <NA>    <NA>   
#3 C            1        2       NA C1      C2      <NA>   

Data

df1 <- structure(list(first = c("A", "A", "A", "B", "C", "C"), second = c(1L, 
2L, 3L, 1L, 1L, 2L), third = c("A1", "A2", "A2", "B1", "C1", 
"C2")), class = "data.frame", row.names = c(NA, -6L))
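The answer above relies on 'second' already being a 1, 2, 3, ... sequence within each 'first' group. If that were not the case, one possible variation (a sketch, not part of the original answer) is to build the within-group index explicitly with row_number(). The name rn is reused from the answer; df1 is redefined here only so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question, rebuilt so the snippet is self-contained
df1 <- data.frame(
  first  = c("A", "A", "A", "B", "C", "C"),
  second = c(1L, 2L, 3L, 1L, 1L, 2L),
  third  = c("A1", "A2", "A2", "B1", "C1", "C2"),
  stringsAsFactors = FALSE
)

wide <- df1 %>%
  group_by(first) %>%
  mutate(rn = row_number()) %>%   # within-group row index, independent of 'second'
  ungroup() %>%
  pivot_wider(names_from = rn,
              values_from = c(second, third),
              names_sep = ".")

print(wide)
```

For this data the result should have the same shape as the output shown in the answer, since 'second' and the row index coincide.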

Answer 1 (score: 2)

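You can also use data.table::dcast to cast to wide format. The right-hand side of the ~ is like the names_from argument of pivot_wider, and value.var is like the values_from argument. So the columns will be named [value.var name].rowid(first), where rowid(first) creates a within-group row number, and the groups are determined by the value of first.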
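The code for this answer does not appear above, so the following is only a minimal sketch of what such a dcast call might look like. The choice of sep = "." (to get names like second.1) and the use of as.data.table() are assumptions; df1 is rebuilt from the question's data so the snippet runs on its own.

```r
library(data.table)

# Same example data as in the question, rebuilt so the snippet is self-contained
df1 <- data.frame(
  first  = c("A", "A", "A", "B", "C", "C"),
  second = c(1L, 2L, 3L, 1L, 1L, 2L),
  third  = c("A1", "A2", "A2", "B1", "C1", "C2"),
  stringsAsFactors = FALSE
)

# rowid(first) numbers the rows within each value of 'first';
# each value.var column is spread across those row numbers
wide <- dcast(as.data.table(df1),
              first ~ rowid(first),
              value.var = c("second", "third"),
              sep = ".")

print(wide)
```

Groups with fewer rows (B and C here) are padded with NA in the extra columns.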
