R - Convert a character array of unequal lengths to a data frame

Time: 2018-01-03 11:15:49

Tags: r dataframe

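I am new to R and would be very grateful if someone could help me with this problem. I am trying to convert a list of character strings (a character vector) into a data frame in R.

The character vector looks like this.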

 chr [1:6]
 [1] "Colour: Gold|Style: Without Offers"
 [2] "Colour: Gold|Style: Without Offers|Verified Purchase"
 [3] "Colour: Gold|Style: With Offers|Verified Purchase"
 [4] "Colour: Gold|Style: Without Offers|Verified Purchase"
 [5] "Colour: Black|Verified Purchase"
 [6] "Colour: Gold|Style: Without Offers"

The desired output looks like this:

     Colour  Style           PurchaseType
     ====================================
[1]  Gold    Without Offers  NA
[2]  Gold    Without Offers  Verified Purchase
[3]  Gold    With Offers     Verified Purchase
[4]  Gold    Without Offers  Verified Purchase
[5]  Black   NA              Verified Purchase
[6]  Gold    Without Offers  NA

Please suggest a solution.

1 Answer:

Answer 0: (score: 3)

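We can split each string on the `|` delimiter (stripping the "Colour:" and "Style:" labels in the same pass), rearrange the fields so that a missing Style becomes NA, and then rbind the results into a data frame.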

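# v1 is the character vector from the question.
# Split on "|" and on the "Key: " labels, so only the values remain.
lst <- lapply(strsplit(v1, "\\||\\w+\\:\\s*", perl = TRUE), function(x) {
  x1 <- x[nzchar(x)]  # drop the empty strings left where the labels were
  # No "Offer" value means Style is missing: put NA in the second position.
  if (length(grep("Offer", x1)) == 0) c(x1[1], NA, x1[2]) else x1
})

# Pad every element with NA to the longest length, then bind the rows.
d1 <- as.data.frame(do.call(rbind, lapply(lst, `length<-`,
                                          max(lengths(lst)))),
                    stringsAsFactors = FALSE)
names(d1) <- c("Colour", "Style", "PurchaseType")
d1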
#   Colour          Style      PurchaseType
#1   Gold Without Offers              <NA>
#2   Gold Without Offers Verified Purchase
#3   Gold    With Offers Verified Purchase
#4   Gold Without Offers Verified Purchase
#5  Black           <NA> Verified Purchase
#6   Gold Without Offers              <NA>
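
The approach above depends on the Style value containing the word "Offer". As a possible alternative, here is a minimal base R sketch that reads the "Key: value" labels instead, so a missing field becomes NA regardless of its wording. The helper name `parse_fields` and the result name `d2` are hypothetical, and the sketch assumes that any field without a colon is the purchase type, which holds for the sample data but may not hold in general.

```r
# Input vector from the question (named v1 to match the answer's code).
v1 <- c("Colour: Gold|Style: Without Offers",
        "Colour: Gold|Style: Without Offers|Verified Purchase",
        "Colour: Gold|Style: With Offers|Verified Purchase",
        "Colour: Gold|Style: Without Offers|Verified Purchase",
        "Colour: Black|Verified Purchase",
        "Colour: Gold|Style: Without Offers")

# Hypothetical helper: turn one string into a named vector of three fields.
parse_fields <- function(s) {
  parts <- strsplit(s, "|", fixed = TRUE)[[1]]
  out <- c(Colour = NA_character_, Style = NA_character_,
           PurchaseType = NA_character_)
  for (p in parts) {
    if (grepl(":", p, fixed = TRUE)) {
      # "Key: value" field: store the value under its own key.
      key <- trimws(sub(":.*$", "", p))
      val <- trimws(sub("^[^:]*:", "", p))
      if (key %in% names(out)) out[[key]] <- val
    } else {
      # Assumption: an unlabelled field is the purchase type.
      out[["PurchaseType"]] <- trimws(p)
    }
  }
  out
}

# One named vector per input string, bound row-wise into a data frame.
d2 <- as.data.frame(do.call(rbind, lapply(v1, parse_fields)),
                    stringsAsFactors = FALSE)
d2
```

Because every row starts from a full set of NA fields, no padding step is needed before `rbind`, and labelled fields that are not one of the three expected columns are silently ignored.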