How can we split a column into multiple columns on "|"?

Time: 2017-06-16 01:20:27

Tags: r tidyverse

I have a tibble.

library(tidyverse)
df <- tibble(
  id = 1:4,
  genres = c("Action|Adventure|Science Fiction|Thriller", 
        "Adventure|Science Fiction|Thriller",
        "Action|Crime|Thriller",
        "Family|Animation|Adventure|Comedy|Action")
)
df


I want to split the genres on "|" and fill the empty columns with NA.

This is what I did:

df %>% 
  separate(genres, into = c("genre1", "genre2", "genre3", "genre4", "genre5"), sep = "|")

However, the string gets split after every single character.


2 answers:

Answer 0: (score: 2)

I think specifying into alone is enough here:

df <- tibble::tibble(
  id = 1:4,
  genres = c("Action|Adventure|Science Fiction|Thriller", 
             "Adventure|Science Fiction|Thriller",
             "Action|Crime|Thriller",
             "Family|Animation|Adventure|Comedy|Action")
)
df %>% tidyr::separate(genres, into = c("genre1", "genre2", "genre3", 
                 "genre4", "genre5"))

Result:

# A tibble: 4 x 6
     id    genre1    genre2    genre3   genre4   genre5
* <int>     <chr>     <chr>     <chr>    <chr>    <chr>
1     1    Action Adventure   Science  Fiction Thriller
2     2 Adventure   Science   Fiction Thriller     <NA>
3     3    Action     Crime  Thriller     <NA>     <NA>
4     4    Family Animation Adventure   Comedy   Action

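Edit: Or, as RichScriven wrote in the comments, df %>% tidyr::separate(genres, into = paste0("genre", 1:5)). Note that with the default sep, "Science Fiction" in the result above is also split at the space, because the default pattern matches any run of non-alphanumeric characters. To split exactly on |, use sep = "\\|".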
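To illustrate why the separator needs escaping, here is a minimal base R sketch. It uses strsplit() as a stand-in for what separate() does with its sep pattern; that substitution is an assumption for illustration, and tidyr's internals may differ. The variable names are made up for this example.

```r
# Base R sketch: how the separator pattern is interpreted.
x <- "Action|Crime"

# As a regular expression, "|" is alternation between two empty patterns,
# so it matches the empty string and the input is cut between characters.
by_regex <- strsplit(x, "|")[[1]]

# Escaping the pipe makes it a literal character.
by_escaped <- strsplit(x, "\\|")[[1]]

# fixed = TRUE turns off regex interpretation altogether.
by_fixed <- strsplit(x, "|", fixed = TRUE)[[1]]

print(by_regex)
print(by_escaped)
print(by_fixed)
```

Both the escaped and the fixed variant should yield the two genres, while the unescaped pattern breaks the string into much smaller pieces.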

Answer 1: (score: 0)

Well, it helps to write the regular expression correctly.

df %>% 
  separate(genres, into = paste0("genre", 1:5), sep = "\\|")
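As a self-contained sketch of the same idea, the example below uses a smaller, made-up tibble (not the data from the question) and adds fill = "right", which pads short rows with NA on the right without emitting a warning. The column count of three is chosen only to fit this sample data.

```r
library(tidyr)

# Hypothetical sample data: rows with three, two, and one genre.
df <- tibble::tibble(
  id = 1:3,
  genres = c("Action|Science Fiction|Thriller", "Action|Crime", "Family")
)

# "\\|" matches a literal pipe, so "Science Fiction" stays in one piece.
# fill = "right" fills missing pieces with NA on the right-hand side.
out <- separate(
  df, genres,
  into = paste0("genre", 1:3),
  sep = "\\|",
  fill = "right"
)

print(out)
```

In newer tidyr releases separate() is marked as superseded in favor of separate_wider_delim(), which takes a literal delimiter rather than a regular expression, so it may be worth checking which version is installed.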