I have a character vector whose elements contain different numbers of words, for example
myVector <- c("a quick", "brown", "fox jumped over", "a", "deer")
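I want to split the vector into two vectors: one holding the single-word elements and the other holding the multi-word elements. How can I do that? I tried the following,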
split.it <- function(x){
mult.vec <- character()
if (length(unlist(strsplit(x,split=" ")))>1) {
return(append(mult.vec, x))
}
}
and then called it with
kj <- sapply(myVector , FUN=split.it)
but it does not give the desired result. Can anyone help?
Answer 0: (score: 1)
Try
library(stringr)
split(myVector,(str_count(myVector, "\\S+")>1)+1)
#$`1`
# [1] "brown" "a" "deer"
# $`2`
# [1] "a quick" "fox jumped over"
This also works when there are leading/trailing spaces:
myVector1 <- c(myVector, " foxy")
split(myVector1,(str_count(myVector1, "\\S+")>1)+1)
#$`1`
#[1] "brown" "a" "deer" " foxy"
#$`2`
#[1] "a quick" "fox jumped over"
Or modify your function:
split.it2 <- function(x){
lst <- strsplit(x, " ")
Length <- sapply(lst, length)
split(x, (Length>1) +1)
}
split.it2(myVector)
#$`1`
#[1] "brown" "a" "deer"
#$`2`
#[1] "a quick" "fox jumped over"
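One caveat: splitting on a single space yields an empty first token when a string starts with a space, so " foxy" would be counted as two words by split.it2. A minimal base R sketch that tolerates surrounding whitespace follows; the name split.it3 is hypothetical and not part of the original answer.

```r
# Hypothetical variant of split.it2 (the name and approach are illustrative only).
split.it3 <- function(x) {
  # Drop leading/trailing whitespace, then split on runs of whitespace
  words <- strsplit(trimws(x), "\\s+")
  # lengths() returns the number of words in each element
  n_words <- lengths(words)
  # Group 1 = single-word elements, group 2 = multi-word elements
  split(x, (n_words > 1) + 1)
}

myVector1 <- c("a quick", "brown", "fox jumped over", "a", "deer", " foxy")
res <- split.it3(myVector1)
res
```

With this input, " foxy" should land in group 1 alongside the other single-word elements, and the original (untrimmed) strings are returned.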
Answer 1: (score: 1)
Maybe not very elegant, but here is a very simple approach:
vect_multi<-myVector[grepl(" ",myVector)]
vect_single<-myVector[!grepl(" ",myVector)]
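Since a plain space match also fires on leading or trailing spaces, here is a hedged sketch of the same idea that computes the index once and requires whitespace between two non-whitespace characters. The variable is_multi and the regular expression are assumptions added for illustration, not part of the original answer.

```r
myVector1 <- c("a quick", "brown", "fox jumped over", "a", "deer", " foxy")

# TRUE only where whitespace sits between two non-whitespace characters,
# i.e. the element contains at least two words
is_multi <- grepl("\\S\\s+\\S", myVector1)

vect_multi  <- myVector1[is_multi]
vect_single <- myVector1[!is_multi]
```

Computing the logical index once avoids running grepl twice and keeps the two result vectors consistent with each other.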
Answer 2: (score: 1)
The word_count function in the qdap package is a handy wrapper. The function also has a few arguments that may be useful:
library(qdap)
split(x = myVector, f = word_count(myVector) > 1)
# $`FALSE`
# [1] "brown" "a" "deer"
#
# $`TRUE`
# [1] "a quick" "fox jumped over"