I have a variable containing actor names.
(actor=structure(c(4L, 1L, 6L, 2L, 5L, 3L), .Label = c("Christian Bale, Tom Hardy, Anne Hathaway, Gary Oldman",
"Jamie Foxx, Christoph Waltz, Leonardo DiCaprio, Kerry Washington",
"Jennifer Lawrence, Josh Hutcherson, Liam Hemsworth, Stanley Tucci",
"Leonardo DiCaprio, Joseph Gordon-Levitt, Ellen Page, Ken Watanabe",
"Leonardo DiCaprio, Mark Ruffalo, Ben Kingsley, Max von Sydow",
"Robert Downey Jr., Chris Evans, Scarlett Johansson, Jeremy Renner"
), class = "factor"))
# [1] Leonardo DiCaprio, Joseph Gordon-Levitt, Ellen Page, Ken Watanabe
# [2] Christian Bale, Tom Hardy, Anne Hathaway, Gary Oldman
# [3] Robert Downey Jr., Chris Evans, Scarlett Johansson, Jeremy Renner
# [4] Jamie Foxx, Christoph Waltz, Leonardo DiCaprio, Kerry Washington
# [5] Leonardo DiCaprio, Mark Ruffalo, Ben Kingsley, Max von Sydow
# [6] Jennifer Lawrence, Josh Hutcherson, Liam Hemsworth, Stanley Tucci
# 6 Levels: Christian Bale, Tom Hardy, Anne Hathaway, Gary Oldman ...
I want to extract all the full actor names (name + surname) from it and put them in an output matrix.
Answer 0 (score: 3)
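If you want to extract the unique actor names, you can convert the factor to character strings with as.character, split each string on ", " with strsplit, combine all the vectors in the resulting list with unlist, and keep only the distinct names with unique: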
(all.actors <- unique(unlist(strsplit(as.character(actor), ", "))))
# [1] "Leonardo DiCaprio" "Joseph Gordon-Levitt" "Ellen Page" "Ken Watanabe"
# [5] "Christian Bale" "Tom Hardy" "Anne Hathaway" "Gary Oldman"
# [9] "Robert Downey Jr." "Chris Evans" "Scarlett Johansson" "Jeremy Renner"
# [13] "Jamie Foxx" "Christoph Waltz" "Kerry Washington" "Mark Ruffalo"
# [17] "Ben Kingsley" "Max von Sydow" "Jennifer Lawrence" "Josh Hutcherson"
# [21] "Liam Hemsworth" "Stanley Tucci"
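Because it uses as.character(actor), this code only picks up actors that actually appear in the factor actor, even if the factor has additional unused levels. If you use levels(actor) instead, you get every actor named in the factor levels, whether or not that level is used in actor. You can use whichever you prefer when defining all.actors.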
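The difference between the two only shows up when the factor has unused levels. Here is a minimal sketch on a made-up toy factor (f, used and all.lv are hypothetical names and are not part of the question's data):

```r
# Hypothetical toy factor: "C, D" is a level, but no element uses it.
f <- factor(c("A, B", "A, C"), levels = c("A, B", "A, C", "C, D"))

# Names taken from the values actually present in the factor
used <- unique(unlist(strsplit(as.character(f), ", ")))
used
# [1] "A" "B" "C"

# Names taken from all levels, used or not
all.lv <- unique(unlist(strsplit(levels(f), ", ")))
all.lv
# [1] "A" "B" "C" "D"
```

Note also that as.character follows the order of the elements, while levels follows the order of the levels, so the two results can be ordered differently even when they contain the same names.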
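If you want a matrix indicating, for each actor, which elements of actor contain that name (one row per actor, one column per element), you can do: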
mat <- sapply(strsplit(as.character(actor), ", "), function(x) all.actors %in% x)
row.names(mat) <- all.actors
mat
# [,1] [,2] [,3] [,4] [,5] [,6]
# Leonardo DiCaprio TRUE FALSE FALSE TRUE TRUE FALSE
# Joseph Gordon-Levitt TRUE FALSE FALSE FALSE FALSE FALSE
# Ellen Page TRUE FALSE FALSE FALSE FALSE FALSE
# Ken Watanabe TRUE FALSE FALSE FALSE FALSE FALSE
# Christian Bale FALSE TRUE FALSE FALSE FALSE FALSE
# Tom Hardy FALSE TRUE FALSE FALSE FALSE FALSE
# Anne Hathaway FALSE TRUE FALSE FALSE FALSE FALSE
# Gary Oldman FALSE TRUE FALSE FALSE FALSE FALSE
# Robert Downey Jr. FALSE FALSE TRUE FALSE FALSE FALSE
# Chris Evans FALSE FALSE TRUE FALSE FALSE FALSE
# Scarlett Johansson FALSE FALSE TRUE FALSE FALSE FALSE
# Jeremy Renner FALSE FALSE TRUE FALSE FALSE FALSE
# Jamie Foxx FALSE FALSE FALSE TRUE FALSE FALSE
# Christoph Waltz FALSE FALSE FALSE TRUE FALSE FALSE
# Kerry Washington FALSE FALSE FALSE TRUE FALSE FALSE
# Mark Ruffalo FALSE FALSE FALSE FALSE TRUE FALSE
# Ben Kingsley FALSE FALSE FALSE FALSE TRUE FALSE
# Max von Sydow FALSE FALSE FALSE FALSE TRUE FALSE
# Jennifer Lawrence FALSE FALSE FALSE FALSE FALSE TRUE
# Josh Hutcherson FALSE FALSE FALSE FALSE FALSE TRUE
# Liam Hemsworth FALSE FALSE FALSE FALSE FALSE TRUE
# Stanley Tucci FALSE FALSE FALSE FALSE FALSE TRUE
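Once you have a logical matrix like this, it is easy to summarise. The sketch below rebuilds a smaller matrix from three of the question's six entries so that it runs on its own; the names actor.sub, parts, all.sub, mat.sub, counts and shared are made up for this illustration.

```r
# Shortened copy of the question's data (three of the six entries)
actor.sub <- factor(c(
  "Leonardo DiCaprio, Joseph Gordon-Levitt, Ellen Page, Ken Watanabe",
  "Christian Bale, Tom Hardy, Anne Hathaway, Gary Oldman",
  "Jamie Foxx, Christoph Waltz, Leonardo DiCaprio, Kerry Washington"
))

# Same construction as above: one row per actor, one column per entry
parts <- strsplit(as.character(actor.sub), ", ")
all.sub <- unique(unlist(parts))
mat.sub <- sapply(parts, function(x) all.sub %in% x)
rownames(mat.sub) <- all.sub

# Number of entries each actor appears in
counts <- rowSums(mat.sub)

# Actors who appear in more than one entry
# (here only Leonardo DiCaprio, who is in 2 of the 3 entries)
shared <- counts[counts > 1]
shared

# Transposed 0/1 version: one row per entry, one column per actor
t(mat.sub) * 1L
```

Multiplying by 1L is just one way to turn TRUE/FALSE into 1/0; keeping the matrix logical works equally well if you only need it for indexing.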