I have the following example data frame:
first second third
----------------------------------
A 1 A1
A 2 A2
A 3 A2
B 1 B1
C 1 C1
C 2 C2
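Is it possible to spread the second and third columns into new columns based on the repeated values/rows in the first column? Like this: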
first second second.2 second.3 third third.2 third.3
A 1 2 3 A1 A2 A2
B 1 NA NA B1 NA NA
C 1 2 NA C1 C2 NA
Answer 0 (score: 2)
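One option is pivot_wider. Here the "second" column also happens to be a sequence within each "first" group, so copy it with mutate and then use pivot_wider to reshape the data from "long" to "wide":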
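library(dplyr)
library(tidyr)
df1 %>%
  # copy 'second' so it can serve as the per-group index in the new column names
  mutate(rn = second) %>%
  # spread 'second' and 'third' into one column per index value
  pivot_wider(names_from = rn, values_from = c(second, third), names_sep = ".")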
# A tibble: 3 x 7
# first second.1 second.2 second.3 third.1 third.2 third.3
# <chr> <int> <int> <int> <chr> <chr> <chr>
#1 A 1 2 3 A1 A2 A2
#2 B 1 NA NA B1 <NA> <NA>
#3 C 1 2 NA C1 C2 <NA>
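The approach above depends on "second" already being a 1, 2, 3, ... sequence inside each "first" group. If that cannot be guaranteed, a minimal sketch of a variant is to build the index explicitly with row_number() per group. The data frame is rebuilt here so the snippet runs on its own, and the name `wide` is only illustrative:

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df1 <- data.frame(
  first  = c("A", "A", "A", "B", "C", "C"),
  second = c(1L, 2L, 3L, 1L, 1L, 2L),
  third  = c("A1", "A2", "A2", "B1", "C1", "C2")
)

wide <- df1 %>%
  group_by(first) %>%
  mutate(rn = row_number()) %>%   # within-group row number, independent of 'second'
  ungroup() %>%
  pivot_wider(names_from = rn, values_from = c(second, third), names_sep = ".")
```

For this sample data the result should match the output shown above, since "second" and the row number coincide.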
# Data used above
df1 <- structure(list(first = c("A", "A", "A", "B", "C", "C"), second = c(1L,
2L, 3L, 1L, 1L, 2L), third = c("A1", "A2", "A2", "B1", "C1",
"C2")), class = "data.frame", row.names = c(NA, -6L))
Answer 1 (score: 2)
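You can also use data.table::dcast to convert to wide format. The right-hand side of ~ plays the role of pivot_wider's names_from argument, and value.var plays the role of values_from. The resulting columns are therefore named [value.var name].rowid(first), where rowid(first) generates a within-group row number and the groups are defined by the values of first.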
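The answer's own code is not included here, so the following is only a sketch of what such a dcast call could look like. The table is rebuilt from the question's sample data, and `dt` and `wide` are illustrative names. dcast joins the value name and the row id with "_" by default, so sep = "." is passed here to get dotted names like those in the question:

```r
library(data.table)

# Sample data from the question, as a data.table
dt <- data.table(
  first  = c("A", "A", "A", "B", "C", "C"),
  second = c(1L, 2L, 3L, 1L, 1L, 2L),
  third  = c("A1", "A2", "A2", "B1", "C1", "C2")
)

# One row per 'first'; one column per (value.var, within-group row id) pair
wide <- dcast(dt, first ~ rowid(first),
              value.var = c("second", "third"), sep = ".")
```

Groups with fewer rows than the largest group are padded with NA, as in the pivot_wider result.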