Split comma-separated strings in an R column without spreading them into multiple columns

Time: 2019-06-07 04:39:53

Tags: r dataframe

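I need to split the comma-separated strings in the R data frame given below. I have also shown what the output df should look like. The code below only partially does the job, because it creates a list out of each row of the column. What should I do?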

Df1:

Fruits
"Apple, Mango, papaya"  
"Apple, Mango"                                                                                                                                                                                                                                   
"Avocado"                                                                                                                                                                                                                                            
"Papaya, Raspberry, Strawberry, Blueberry" 

Desired Output df:

Fruits
"Apple", "Mango", "papaya"  
"Apple", "Mango"                                                                                                                                                                                                                                   
"Avocado"                                                                                                                                                                                                                                            
"Papaya", "Raspberry", "Strawberry", "Blueberry" 

My code:

Df1$Fruits <- strsplit(Df1$Fruits, ",")

This is the output:

Fruits
c("Apple", "Mango", "papaya")  
c("Apple", "Mango")                                                                                                                                                                                                                                   
"Avocado"                                                                                                                                                                                                                                            
c("Papaya", "Raspberry", "Strawberry", "Blueberry")

How should I unlist these strings and get them in the desired format?

1 answer:

Answer 0 (score: 0)

This can be done with tidyr::separate and some data manipulation.

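library(tidyr)
library(dplyr)
# `df` is the data frame defined in the data block at the end of this answer.
# Split on a comma plus optional whitespace so the pieces carry no leading
# space; rows with fewer than four items are padded with NA on the right.
newdf <- df %>%
  tidyr::separate(fruits, paste0('col', 1:4), sep = ',\\s*',
                  remove = TRUE, fill = 'right')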

# One data frame per position; na.omit() drops the rows with no item there.
Fruits_1 <- newdf %>%
  dplyr::select(col1) %>% 
  na.omit()
Fruits_2 <- newdf %>% 
  dplyr::select(col2) %>% 
  na.omit()
Fruits_3 <- newdf %>% 
  dplyr::select(col3) %>% 
  na.omit()
Fruits_4 <- newdf %>% 
  dplyr::select(col4) %>% 
  na.omit()
Data:
df <- data.frame(fruits = c("Apple, Mango, papaya",
                            "Apple, Mango", "Avocado",
                            "Papaya, Raspberry, Strawberry, Blueberry"))
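Since the question asks to keep everything in a single column, here is a minimal base-R sketch of one way that might be approached. It is not part of the answer above. It assumes the desired result is an ordinary character column in which each item is wrapped in double quotes, as in the "Desired Output df" listing; the intermediate name `pieces` is made up for illustration.

```r
# Example data mirroring the question (column name `Fruits` as in Df1)
Df1 <- data.frame(
  Fruits = c("Apple, Mango, papaya",
             "Apple, Mango",
             "Avocado",
             "Papaya, Raspberry, Strawberry, Blueberry"),
  stringsAsFactors = FALSE
)

# Split on a comma plus any following whitespace; the result is a list
# holding one character vector per row (`pieces` is a hypothetical name)
pieces <- strsplit(Df1$Fruits, ",\\s*")

# Wrap each piece in double quotes and join the pieces of a row back into
# a single string, so the column stays a plain character vector
Df1$Fruits <- vapply(
  pieces,
  function(x) paste0('"', x, '"', collapse = ", "),
  character(1)
)

# writeLines() prints the strings without escaping the embedded quotes
writeLines(Df1$Fruits)
# "Apple", "Mango", "papaya"
# "Apple", "Mango"
# "Avocado"
# "Papaya", "Raspberry", "Strawberry", "Blueberry"
```

If a list column is what is actually wanted, the `strsplit()` result can be assigned directly, as in the question; the `c(...)` shown in the question's output is just how R prints list elements that hold more than one string.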
