I have a tibble.
library(tidyverse)
df <- tibble(
  id = 1:4,
  genres = c("Action|Adventure|Science Fiction|Thriller",
             "Adventure|Science Fiction|Thriller",
             "Action|Crime|Thriller",
             "Family|Animation|Adventure|Comedy|Action")
)
df
I want to split the genres on "|" and fill the empty columns with NA.
This is what I did:
df %>%
separate(genres, into = c("genre1", "genre2", "genre3", "genre4", "genre5"), sep = "|")
However, it splits after every single character.
Answer 0: (score: 2)
I think you only need to supply into and can leave out sep:
df <- tibble::tibble(
  id = 1:4,
  genres = c("Action|Adventure|Science Fiction|Thriller",
             "Adventure|Science Fiction|Thriller",
             "Action|Crime|Thriller",
             "Family|Animation|Adventure|Comedy|Action")
)
df %>% tidyr::separate(genres, into = c("genre1", "genre2", "genre3",
                                        "genre4", "genre5"))
Result:
# A tibble: 4 x 6
     id genre1    genre2    genre3    genre4   genre5
* <int> <chr>     <chr>     <chr>     <chr>    <chr>
1     1 Action    Adventure Science   Fiction  Thriller
2     2 Adventure Science   Fiction   Thriller <NA>
3     3 Action    Crime     Thriller  <NA>     <NA>
4     4 Family    Animation Adventure Comedy   Action
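Edit: Or, as RichScriven wrote in the comments: df %>% tidyr::separate(genres, into = paste0("genre", 1:5)).
Note that the default separator matches any run of non-alphanumeric characters, so the space in "Science Fiction" is also treated as a split point, as the result above shows. To split exactly on |, use sep = "\\|".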
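To illustrate why the unescaped "|" splits after every character, here is a minimal sketch in base R. It assumes only that sep is interpreted as a regular expression, as the escaping advice above implies; the sample string and variable names are made up for illustration.

```r
x <- "Action|Crime"

# Unescaped: "|" is regex alternation between two empty patterns,
# so it matches the empty string and the split happens between characters.
split_unescaped <- strsplit(x, "|")[[1]]

# Escaped: "\\|" matches a literal pipe character.
split_escaped <- strsplit(x, "\\|")[[1]]

# fixed = TRUE treats the pattern as a plain string, so no escaping is needed.
split_fixed <- strsplit(x, "|", fixed = TRUE)[[1]]

print(split_unescaped)
print(split_escaped)
print(split_fixed)
```

The escaped and fixed variants should give the same two pieces, while the unescaped one breaks the string into many small pieces.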
Answer 1: (score: 0)
Well, it helps to write the regular expression correctly.
df %>%
separate(genres, into = paste0("genre", 1:5), sep = "\\|")
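Because the rows have different numbers of genres, separate() pads the short rows with NA and warns about it. As a possible refinement (not part of the original answers), the sketch below adds fill = "right", which I believe pads on the right without the warning. It uses a plain data.frame and a direct function call instead of the pipe so that it only needs tidyr.

```r
library(tidyr)

df <- data.frame(
  id = 1:4,
  genres = c("Action|Adventure|Science Fiction|Thriller",
             "Adventure|Science Fiction|Thriller",
             "Action|Crime|Thriller",
             "Family|Animation|Adventure|Comedy|Action"),
  stringsAsFactors = FALSE
)

# sep is a regular expression, so the pipe is escaped.
# fill = "right" pads rows that have fewer than five genres with NA on the right.
out <- separate(df, genres, into = paste0("genre", 1:5),
                sep = "\\|", fill = "right")
print(out)
```

With the escaped separator, "Science Fiction" stays in a single column.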