Paste character vectors, removing NAs and the separators between NAs

Asked: 2013-08-13 09:12:53

Tags: regex string r

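I want to paste several character vectors together. The problem is that some of them are very sparse, so when I paste them I get NAs and extra separators. How can I efficiently remove the NAs and the extra separators while still joining the vectors?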

I have something like this:

n1 = c("goats", "goats", "spatula", NA, "rectitude", "boink")
n2 = c("forever", NA, "...yes", NA, NA, NA)
cbind(paste(n1,n2, sep=", "))

which gives me:

[1,] "goats, forever" 
[2,] "goats, NA"      
[3,] "spatula, ...yes"
[4,] "NA, NA"         
[5,] "rectitude, NA"  
[6,] "boink, NA" 

But I would like:

[1,] "goats, forever" 
[2,] "goats"          
[3,] "spatula, ...yes"
[4,] <NA>
[5,] "rectitude"      
[6,] "boink"

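Doing this with a lot of regular expressions and string splitting is clearly inefficient and tedious. Is there anything quick and simple?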

3 answers:

Answer 0 (score: 5)

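It doesn't take a lot of regex: one line to strip the pasted "NA" strings, and one more to turn the empty results back into NA.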

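n1 <- c("goats", "goats", "spatula", NA, "rectitude", "boink")
n2 <- c("forever", NA, "...yes", NA, NA, NA)
n3 <- cbind(paste(n1, n2, sep = ", "))
# Drop each literal "NA" that paste() produced, plus the ", " before it if present
n3 <- gsub("(, )?NA", "", n3)
# Rows that were NA in both vectors are now empty strings; turn them back into NA
n3[n3 == ""] <- NA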
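One caveat with the pattern above: it also removes the letters "NA" inside ordinary words ("NASA, forever" becomes "SA, forever"), and it leaves a leading ", " when only the first vector is NA. A minimal sketch that sidesteps both by checking is.na() instead of matching text is shown here. The helper name paste_two is hypothetical (not part of base R), and it assumes both vectors are character vectors of the same length.

```r
n1 <- c("goats", "goats", "spatula", NA, "rectitude", "boink")
n2 <- c("forever", NA, "...yes", NA, NA, NA)

# paste_two is a hypothetical helper name, not a base R function.
# Assumes a and b are character vectors of equal length.
paste_two <- function(a, b, sep = ", ") {
  out <- paste(a, b, sep = sep)      # default: both values present
  out[is.na(b)] <- a[is.na(b)]       # b missing: keep a alone (NA if a is missing too)
  only_b <- is.na(a) & !is.na(b)
  out[only_b] <- b[only_b]           # a missing: keep b alone
  out
}

res <- paste_two(n1, n2)
cbind(res)
```

Because no text matching is involved, a genuine string such as "NASA" passes through untouched, and the whole thing stays vectorized.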

Answer 1 (score: 5)

Code (no regex or string splitting):

vec <- apply(cbind(n1,n2),1,function(x)
    ifelse(all(is.na(x)), NA, paste(na.omit(x),collapse=", ")) )

Result:

> vec # as a vector
[1] "goats, forever"  "goats"  "spatula, ...yes"  NA  "rectitude"  "boink"

> cbind(vec) # as a matrix
     vec              
[1,] "goats, forever" 
[2,] "goats"          
[3,] "spatula, ...yes"
[4,] NA               
[5,] "rectitude"      
[6,] "boink"
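The same row-wise idea extends to more than two vectors. The following is only a sketch: the wrapper name paste_na and the extra vector n3 are made up for illustration, and it assumes all inputs are character vectors of the same length. It uses a scalar if/else rather than ifelse() and returns NA_character_ so the result stays a character vector.

```r
n1 <- c("goats", "goats", "spatula", NA, "rectitude", "boink")
n2 <- c("forever", NA, "...yes", NA, NA, NA)
n3 <- c(NA, "x", NA, "y", NA, NA)   # extra vector, invented for this example

# paste_na is a hypothetical wrapper name, not a base R function.
paste_na <- function(..., sep = ", ") {
  m <- cbind(...)                    # one column per input vector
  apply(m, 1, function(x) {
    x <- x[!is.na(x)]                # keep only the non-missing entries of the row
    if (length(x) == 0) NA_character_ else paste(x, collapse = sep)
  })
}

two   <- paste_na(n1, n2)
three <- paste_na(n1, n2, n3)
cbind(three)
```

With two inputs this should give the same vector as vec above; with three, row 4 becomes "y" instead of NA because n3 supplies a value there.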

Answer 2 (score: 1)

Here is an option using the qdap package (though the other options seem better to me, since they use base R):

library(qdap)
gsub(" ", ", ", blank2NA(Trim(gsub("NA", "", paste(n1, n2)))))

## [1] "goats, forever"  "goats"           "spatula, ...yes" NA               
## [5] "rectitude"       "boink"

Or...

## gsub(" ", ", ", blank2NA(gsub("NA| NA", "", paste(n1, n2))))