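I need to split the comma-separated strings in the R data frame given below. I have also shown what the output df should look like. The code below only does part of the job: it creates a list in each row of the column. What should I do?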
Df1:
Fruits
"Apple, Mango, papaya"
"Apple, Mango"
"Avocado"
"Papaya, Raspberry, Strawberry, Blueberry"
Desired Output df:
Fruits
"Apple", "Mango", "papaya"
"Apple", "Mango"
"Avocado"
"Papaya", "Raspberry", "Strawberry", "Blueberry"
My code:
Df1$Fruits <- strsplit(Df1$Fruits, ",")
This is the output:
Fruits
c("Apple", "Mango", "papaya")
c("Apple", "Mango")
"Avocado"
c("Papaya", "Raspberry", "Strawberry", "Blueberry")
How should I unlist these strings and get them in the desired format?
Answer 0 (score: 0)
This can be done with tidyr::separate and some data manipulation.
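library(tidyr)
library(dplyr)
# Sample data, defined before it is used
df <- data.frame(fruits = c("Apple, Mango, papaya",
                            "Apple, Mango", "Avocado",
                            "Papaya, Raspberry, Strawberry, Blueberry"),
                 stringsAsFactors = FALSE)
# Split on a comma plus any following whitespace; shorter rows are padded with NA on the right
newdf <- df %>%
  tidyr::separate(fruits, paste0('col', 1:4), sep = ',\\s*', remove = TRUE, fill = 'right')
# One data frame per position, with the NA rows dropped
Fruits_1 <- newdf %>%
  dplyr::select(col1) %>%
  na.omit()
Fruits_2 <- newdf %>%
  dplyr::select(col2) %>%
  na.omit()
Fruits_3 <- newdf %>%
  dplyr::select(col3) %>%
  na.omit()
Fruits_4 <- newdf %>%
  dplyr::select(col4) %>%
  na.omit()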
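If the goal is literally the format shown in the question (one row per original string, with each fruit wrapped in its own quotes), a base R alternative might look like the sketch below. It is not part of the answer above. It assumes the desired column is a plain character vector rather than a list column, and `quote_join` is a hypothetical helper name introduced only for illustration.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps Fruits as character
Df1 <- data.frame(
  Fruits = c("Apple, Mango, papaya",
             "Apple, Mango",
             "Avocado",
             "Papaya, Raspberry, Strawberry, Blueberry"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: wrap each piece in double quotes and join with ", "
quote_join <- function(pieces) {
  paste0('"', trimws(pieces), '"', collapse = ", ")
}

# Split each row on a comma plus optional whitespace (this yields a list),
# then collapse every list element back into a single string
parts <- strsplit(Df1$Fruits, ",\\s*")
Df1$Fruits <- vapply(parts, quote_join, character(1))

# writeLines shows the raw strings without R's escaping of the inner quotes
writeLines(Df1$Fruits)
```

Using `vapply` with `character(1)` guarantees one string per row, so the column stays an ordinary character vector instead of the list that `strsplit` alone produces.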
R: Splitting one column (different lengths) into new columns