Applying str_split to a column in a data frame

Time: 2016-11-04 12:31:21

Tags: r strsplit

I have the following data frame, named i:

structure(list(price = c(11772, 14790, 2990, 1499, 21980, 27999
), fuel = c("diesel", "petrol", "petrol", "diesel", "diesel", 
"petrol"), gearbox = c("manual", "manual", "manual", "manual", 
"automatic", "manual"), colour = c("white", "purple", "yellow", 
"silver", "red", "rising blue metalli"), engine_size = c(1685, 
1199, 998, 1753, 2179, 1984), mileage = c(18839, 7649, 45058, 
126000, 31891, 100), year = c("2013 hyundai ix35", "2016 citroen citroen ds3 cabrio", 
"2007 peugeot 107 hatchback", "2007 ford ford focus hatchback", "2012 jaguar xf saloon", 
"2016 volkswagen scirocco coupe"), doors = c(5, 2, 3, 5, 4, 3
)), .Names = c("price", "fuel", "gearbox", "colour", "engine_size", 
"mileage", "year", "doors"), row.names = c(NA, 6L), class = "data.frame")

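Some of the words in the column 'year' are duplicated, and I want to remove them. As a first step, I want to split the strings in this column into separate words. I can do this for a single string, but when I try to apply it to the whole data frame, I get an error: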

unlist(str_split( "2013 hyunday ix35", "[[:blank:]]"))

[1] "2013"    "hyunday" "ix35"

for( k in 1:nrow(i))
+ i[k,7]<-unlist(str_split( i[k, 7], "[[:blank:]]"))

Error in `[<-.data.frame`(`*tmp*`, k, 7, value = c("2013", "hyundai", : replacement has 3 rows, data has 1

2 Answers:

Answer 0: (score: 2)

We can split on one or more spaces (\\s+), loop over the resulting list with sapply, and paste the unique elements of each string back together:

i$year <- sapply(strsplit(i$year, "\\s+"), function(x) paste(unique(x), collapse=' '))
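To make the intermediate steps visible, here is a minimal sketch of the same idea on a small character vector. The names x, words and deduped are hypothetical and only stand in for i$year and its intermediate results.

```r
# Hypothetical stand-in for i$year, using two values from the question
x <- c("2016 citroen citroen ds3 cabrio", "2007 ford ford focus hatchback")

# strsplit() returns a list holding one character vector of words per string
words <- strsplit(x, "\\s+")

# unique() drops repeated words (keeping the first occurrence);
# paste(collapse = " ") joins the remaining words back into one string
deduped <- sapply(words, function(w) paste(unique(w), collapse = " "))

# deduped is now c("2016 citroen ds3 cabrio", "2007 ford focus hatchback")
print(deduped)
```

Because each string is collapsed back into a single value, the result has one element per row and can be assigned to the column directly, unlike the three-word vector that triggered the error in the question.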

Answer 1: (score: 2)

Using dplyr and stringr (with a little help from purrr to handle the list), you can do this:

library(dplyr)
i %>%
  mutate(newyear = purrr::map_chr(
    stringr::str_split(year, pattern = "[[:blank:]]"), 
    ~ paste(unique(.x), collapse = " ")
    ))
#>   price   fuel   gearbox              colour engine_size mileage
#> 1 11772 diesel    manual               white        1685   18839
#> 2 14790 petrol    manual              purple        1199    7649
#> 3  2990 petrol    manual              yellow         998   45058
#> 4  1499 diesel    manual              silver        1753  126000
#> 5 21980 diesel automatic                 red        2179   31891
#> 6 27999 petrol    manual rising blue metalli        1984     100
#>                              year doors                        newyear
#> 1               2013 hyundai ix35     5              2013 hyundai ix35
#> 2 2016 citroen citroen ds3 cabrio     2        2016 citroen ds3 cabrio
#> 3      2007 peugeot 107 hatchback     3     2007 peugeot 107 hatchback
#> 4  2007 ford ford focus hatchback     5      2007 ford focus hatchback
#> 5           2012 jaguar xf saloon     4          2012 jaguar xf saloon
#> 6  2016 volkswagen scirocco coupe     3 2016 volkswagen scirocco coupe
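If the extra packages are not available, a similar type-checked result can probably be had in base R with vapply, which, like purrr::map_chr, insists that every element yields a single string. The following is only a sketch under that assumption. The data frame cars is a hypothetical two-row subset of the question's data, not the original i.

```r
# Hypothetical two-row subset with the same column name as in the question
cars <- data.frame(
  year = c("2013 hyundai ix35", "2007 ford ford focus hatchback"),
  stringsAsFactors = FALSE
)

# Split each string on blanks, keep the unique words, and collapse them again.
# character(1) makes vapply() fail loudly if any result is not one string.
cars$newyear <- vapply(
  strsplit(cars$year, "[[:blank:]]"),
  function(w) paste(unique(w), collapse = " "),
  character(1)
)

# cars$newyear is now c("2013 hyundai ix35", "2007 ford focus hatchback")
print(cars)
```

Writing to a new column, as the dplyr version does, keeps the original year values available for comparison.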