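I'm not sure how to describe the operation I'm trying to do. I have a data frame with two columns (movie and actor). From it I'd like to build a list of the unique pairs of actors who appeared in the same movie. The code below creates an example of the data frame I have, plus a second data frame showing the result I want.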
start_data <- tibble::tribble(
~movie, ~actor,
"titanic", "john",
"star wars", "john",
"baby driver", "john",
"shawshank", "billy",
"titanic", "billy",
"star wars", "sarah",
"titanic", "sarah"
)
end_data <- tibble::tribble(
~movie, ~actor1, ~actor2,
"titanic", "john", "billy",
"titanic", "john", "sarah",
"titanic", "billy", "sarah",
"star wars", "john", "sarah"
)
Any help is appreciated, thanks! Bonus points if it's short.
Answer 0 (score: 3)
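You can use combn(..., 2) to find every two-actor combination within a movie. Each set of combinations can be converted to a two-column tibble and stored in a list column with summarise; to get a flat data frame, use unnest: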
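library(tidyverse)
start_data %>%
  group_by(movie) %>%
  summarise(acts = list(
    # combn() needs at least two actors; a movie with one actor yields no pairs
    if (length(actor) > 1) {
      pairs <- combn(actor, 2)  # 2 x n matrix, one pair per column
      tibble(actor1 = pairs[1, ], actor2 = pairs[2, ])
    } else {
      tibble(actor1 = character(), actor2 = character())
    }
  )) %>%
  unnest(acts)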
# A tibble: 4 x 3
# movie actor1 actor2
# <chr> <chr> <chr>
#1 star wars john sarah
#2 titanic john billy
#3 titanic john sarah
#4 titanic billy sarah
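To make the combn step less opaque, here is a minimal base R sketch of what it returns for a single movie. The `actors` vector is just the titanic cast from the question, copied by hand for illustration.

```r
# Cast of one movie, in the order it appears in start_data
actors <- c("john", "billy", "sarah")

# combn(x, 2) returns a matrix with 2 rows and one column per pair
pairs <- combn(actors, 2)

# Transposing gives one pair per row, which is the shape end_data needs
pair_rows <- t(pairs)
colnames(pair_rows) <- c("actor1", "actor2")
print(pair_rows)
```

Because combn keeps the input order, the pairs come out as john/billy, john/sarah, billy/sarah, matching the rows in the desired output.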
Answer 1 (score: 2)
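library(tidyverse)  # also attaches stringr, which provides str_c()
# Self-join on movie so every actor is paired with every co-star
inner_join(start_data, start_data, by = "movie") %>%
  filter(actor.x != actor.y) %>%  # drop self-pairs
  rowwise() %>%
  # Sort each pair alphabetically so (a, b) and (b, a) collapse to one key
  mutate(combo = str_c(min(actor.x, actor.y), "_", max(actor.x, actor.y))) %>%
  ungroup() %>%
  select(movie, combo) %>%
  distinct() %>%
  # Split only on the underscore, so names containing spaces stay intact
  separate(combo, c("actor1", "actor2"), sep = "_")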
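The same self-join idea can probably be shortened by filtering on alphabetical order instead of building and splitting a combined key. This is only a sketch in base R, not the answerer's code: it rebuilds the example data as a plain data frame and uses merge, whose `suffixes` argument is set here so the two actor columns come out as `actor1` and `actor2`.

```r
# Example data from the question, rebuilt as a plain data frame
start_data <- data.frame(
  movie = c("titanic", "star wars", "baby driver", "shawshank",
            "titanic", "star wars", "titanic"),
  actor = c("john", "john", "john", "billy", "billy", "sarah", "sarah"),
  stringsAsFactors = FALSE
)

# Self-join on movie: each actor is paired with every actor in the same movie,
# including themselves and including both orderings of each pair
pairs <- merge(start_data, start_data, by = "movie", suffixes = c("1", "2"))

# A strict "<" drops self-pairs and keeps exactly one ordering of each pair
pairs <- pairs[pairs$actor1 < pairs$actor2, ]
rownames(pairs) <- NULL
print(pairs)
```

One difference from end_data: each pair is ordered alphabetically (billy before john), not in order of appearance. The set of pairs per movie is the same.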