Split a character vector based on the number of words in each element

Time: 2014-10-23 12:28:36

Tags: r vector split

I have a character vector in which each element contains a different number of words, for example:

myVector <- c("a quick", "brown", "fox jumped over", "a", "deer")

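I would like to split this vector into two vectors, one holding the single-word elements and the other holding the multi-word elements. How can I achieve this? I tried the following: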

split.it <- function(x){

mult.vec <- character()

  if (length(unlist(strsplit(x,split=" ")))>1) {
    return(append(mult.vec, x))
  }

} 

and then called it with

kj <- sapply(myVector , FUN=split.it)

but it does not give the desired result. Can anyone help?

3 answers:

Answer 0: (score: 1)

Try

library(stringr)
split(myVector,(str_count(myVector, "\\S+")>1)+1)
#$`1`
# [1] "brown" "a"     "deer" 

# $`2`
# [1] "a quick"         "fox jumped over"

This also works when there are leading/trailing spaces:
 myVector1 <- c(myVector, " foxy")
 split(myVector1,(str_count(myVector1, "\\S+")>1)+1)
 #$`1`
 #[1] "brown" "a"     "deer"  " foxy"

 #$`2`
 #[1] "a quick"         "fox jumped over"

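Or modify your function:

split.it2 <- function(x){
  # split each element on single spaces and count the resulting pieces
  lst <- strsplit(x, " ")
  Length <- sapply(lst, length)
  # group 1: single-word elements; group 2: multi-word elements
  split(x, (Length > 1) + 1)
}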

split.it2(myVector)
#$`1`
#[1] "brown" "a"     "deer" 

#$`2`
#[1] "a quick"         "fox jumped over"
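
If stringr is not available, the same word-counting idea can be sketched in base R. This is only an illustration: `count_words` is a hypothetical helper name, not part of any package, and it assumes that a "word" means any run of non-whitespace characters.

```r
# Hypothetical helper: count runs of non-whitespace characters per element.
count_words <- function(x) {
  matches <- gregexpr("\\S+", x, perl = TRUE)
  # gregexpr() returns -1 when nothing matches, so count only real positions
  vapply(matches, function(m) sum(m > 0), integer(1))
}

myVector1 <- c("a quick", "brown", "fox jumped over", "a", "deer", " foxy")

# Group 1 holds the single-word elements, group 2 the multi-word elements
res <- split(myVector1, (count_words(myVector1) > 1) + 1)
res
```

Because the count is based on non-whitespace runs rather than on splitting at each space, an element such as " foxy" should still land in the single-word group.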

Answer 1: (score: 1)

Perhaps not very elegant, but this is a very simple approach:

vect_multi<-myVector[grepl(" ",myVector)]
vect_single<-myVector[!grepl(" ",myVector)]
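
One caveat: `grepl(" ", ...)` treats any element containing a space as multi-word, so a value with a leading or trailing space (such as " foxy") would be misclassified. A possible variation, assuming that two words always means two non-whitespace characters separated by whitespace, is to compute the test once with a stricter pattern:

```r
myVector1 <- c("a quick", "brown", "fox jumped over", "a", "deer", " foxy")

# TRUE only when whitespace sits between two non-whitespace characters
is_multi <- grepl("\\S\\s+\\S", myVector1, perl = TRUE)

vect_multi  <- myVector1[is_multi]
vect_single <- myVector1[!is_multi]
```

Storing the logical vector in `is_multi` also avoids running the same regular expression twice.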

Answer 2: (score: 1)

The word_count function from the qdap package is a convenient wrapper. It also has some arguments that may be useful:

library(qdap)
split(x = myVector, f = word_count(myVector) > 1)
# $`FALSE`
# [1] "brown" "a"     "deer" 
# 
# $`TRUE`
# [1] "a quick"         "fox jumped over"
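
List names such as `FALSE`/`TRUE` need backticks when used with `$`. If descriptive names are preferred, one option is to split on a labelled factor. The sketch below uses a base R regular expression instead of qdap so that it runs without extra packages; the labels "single" and "multi" are arbitrary choices made for illustration.

```r
myVector <- c("a quick", "brown", "fox jumped over", "a", "deer")

# TRUE when an element contains at least two whitespace-separated words
is_multi <- grepl("\\S\\s+\\S", myVector, perl = TRUE)

# A factor with fixed levels keeps both groups even if one turns out empty
groups <- split(
  myVector,
  factor(is_multi, levels = c(FALSE, TRUE), labels = c("single", "multi"))
)

groups$single
groups$multi
```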