I have a nested df that I want to clean up.
Sample data:
df <-
tibble::tribble(
~idTeam, ~ptsTotalBehindFirst, ~ptsOverall, ~ptsDiffLastPeriod, ~rankOverall, ~ptsBattingBehindFirst, ~ptsBatting, ~ptsDiffBattingLastPeriod, ~dataBatting, ~rankBatting, ~ptsPitchingBehindFirst, ~ptsPitching, ~ptsDiffPitchingLastPeriod, ~dataPitching, ~rankPitching,
"2", "0", "111", "-4", 1L, "0", "65", "0", list(abbr = c("OBP", "HR", "RBI", "R", "SB"), roto_points = c(13, 13, 13, 13, 13), value = c(0.3663, 384, 1012, 1102, 164), diff = c(0, 0, 0, 0, 0), rank = c(1, 1, 1, 1, 1)), 1L, "5", "46", "-4", list(abbr = c("S", "W", "K", "ERA", "WHIP"), roto_points = c(12, 6, 11, 8, 9), value = c(94, 89, 1576, 3.946, 1.2179), diff = c(0, -2, -2, 0, 0), rank = c(2, 8, 3, 6, 5)), 3L,
"8", "13.5", "97.5", "2", 2L, "13", "52", "0", list(abbr = c("OBP", "HR", "RBI", "R", "SB"), roto_points = c(12, 11, 11, 12, 6), value = c(0.3576, 323, 954, 1011, 89), diff = c(0, 0, 0, 0, 0), rank = c(2, 3, 3, 2, 8)), 3L, "5.5", "45.5", "2", list(abbr = c("S", "W", "K", "ERA", "WHIP"), roto_points = c(2, 7.5, 10, "13", 13), value = c(56, 91, 1508, 3.688, 1.1474), diff = c(-1, 1.5, 0.5, 1, 0), rank = c(12, 6, 4, 1, 1)), 4L
)
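The data I am trying to unnest is stored in the dataBatting and dataPitching columns. I want to unnest all of the columns inside both of them and bind the results together as rows. This is similar to pivot_longer, but I am not sure of the right approach when the same set of repeated columns is nested inside 2 separate list-columns.
My attempt at this is: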
df %>%
unnest_wider(dataBatting) %>%
unnest(c(abbr, roto_points, value, diff, rank)) %>%
unnest_wider(dataPitching) %>%
unnest(c(abbr, roto_points, value, diff, rank))
Error is:
Error: Column names `abbr`, `roto_points`, `value`, `diff`, `rank` must not be duplicated.
Use .name_repair to specify repair.
Call `rlang::last_error()` to see a backtrace
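My issue is that I want to bind the columns from dataPitching onto the columns from dataBatting that have the same names (abbr, roto_points, value, diff, rank).
I would also like to rename the duplicated columns. Is tidyr::hoist a better approach?
Desired df: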
tibble::tribble(
~idTeam, ~ptsTotalBehindFirst, ~ptsOverall, ~ptsDiffLastPeriod, ~rankOverall, ~ptsBattingBehindFirst, ~ptsBatting, ~ptsDiffBattingLastPeriod, ~abbr, ~roto_points5, ~value, ~diff, ~rank, ~rankPitching, ~ptsPitchingBehindFirst, ~ptsPitching, ~ptsDiffPitchingLastPeriod,
2, 0, 111, -4, 1, 0, 65, 0, "OBP", 13, 0.3663, 0, 1, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "HR", 13, 384, 0, 1, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "RBI", 13, 1012, 0, 1, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "R", 13, 1102, 0, 1, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "SB", 13, 164, 0, 1, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "S", 12, 94, 0, 2, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "W", 6, 89, -2, 8, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "K", 11, 1576, -2, 3, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "ERA", 8, 3.946, 0, 6, 3, 5, 46, -4,
2, 0, 111, -4, 1, 0, 65, 0, "WHIP", 9, 1.2179, 0, 5, 3, 5, 46, -4,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "OBP", 12, 0.3576, 0, 2, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "HR", 11, 323, 0, 3, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "RBI", 11, 954, 0, 3, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "R", 12, 1011, 0, 2, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "SB", 6, 89, 0, 8, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "S", 2, 56, -1, 12, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "W", 7.5, 91, 1.5, 6, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "K", 10, 1508, 0.5, 4, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "ERA", 13, 3.688, 1, 1, 4, 5.5, 45.5, 2,
8, 13.5, 97.5, 2, 2, 13, 52, 0, "WHIP", 13, 1.1474, 0, 1, 4, 5.5, 45.5, 2
)
Answer 0 (score: 1)
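One option is to loop over the 'dataBatting' and 'dataPitching' column names, apply unnest_wider to each one separately, unnest the columns of interest, and then bind the rows together (map_dfr: the 'dfr' suffix means it returns a data frame built by row-binding a list of data.frames or tibbles). One thing to note is that many tidyverse functions are type sensitive. Here, some of the list elements have different types (for example, roto_points in the second row of dataPitching is character because it contains "13"), which causes a problem in unnest unless 'ptype' is specified. To avoid this, we can use type.convert to change the types automatically based on the values, and then do the unnesting.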
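The approach described above can be sketched roughly as follows. This is a minimal illustration, not the answer's original code: the sample data is trimmed to two scalar columns to keep it short, as_stat_tbl and nested_cols are hypothetical names introduced here, and the sketch coerces types explicitly with as.numeric instead of type.convert, and converts each nested list to a tibble instead of calling unnest_wider.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Trimmed copy of the sample data (assumption: only idTeam and rankOverall
# are kept from the scalar columns; the nested lists are unchanged).
df <- tibble::tribble(
  ~idTeam, ~rankOverall, ~dataBatting, ~dataPitching,
  "2", 1L,
  list(abbr = c("OBP", "HR", "RBI", "R", "SB"), roto_points = c(13, 13, 13, 13, 13),
       value = c(0.3663, 384, 1012, 1102, 164), diff = c(0, 0, 0, 0, 0), rank = c(1, 1, 1, 1, 1)),
  list(abbr = c("S", "W", "K", "ERA", "WHIP"), roto_points = c(12, 6, 11, 8, 9),
       value = c(94, 89, 1576, 3.946, 1.2179), diff = c(0, -2, -2, 0, 0), rank = c(2, 8, 3, 6, 5)),
  "8", 2L,
  list(abbr = c("OBP", "HR", "RBI", "R", "SB"), roto_points = c(12, 11, 11, 12, 6),
       value = c(0.3576, 323, 954, 1011, 89), diff = c(0, 0, 0, 0, 0), rank = c(2, 3, 3, 2, 8)),
  list(abbr = c("S", "W", "K", "ERA", "WHIP"), roto_points = c(2, 7.5, 10, "13", 13),
       value = c(56, 91, 1508, 3.688, 1.1474), diff = c(-1, 1.5, 0.5, 1, 0), rank = c(12, 6, 4, 1, 1))
)

nested_cols <- c("dataBatting", "dataPitching")

# Hypothetical helper: turn one nested list into a tibble and force every
# column except abbr to numeric, so all pieces share the same types.
as_stat_tbl <- function(x) {
  tibble::as_tibble(x) %>%
    mutate(across(-abbr, as.numeric))
}

# For each nested column: build a list-column of tibbles, drop both original
# nested columns, unnest, and let map_dfr row-bind the two results.
long <- map_dfr(nested_cols, function(col) {
  df %>%
    mutate(stats = map(.data[[col]], as_stat_tbl)) %>%
    select(-all_of(nested_cols)) %>%
    unnest(stats)
})
```

With the two-team sample this gives 20 rows: the batting rows for both teams first, then the pitching rows. Sorting by idTeam afterwards should regroup them by team as in the desired output, and any renaming of the stat columns can be done with rename on the result.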