Merging specific strings in a character vector

Time: 2014-02-15 10:31:03

Tags: r vector

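I have a character vector in which each element is a word. It was generated from a text in which some segments are marked with angle brackets. These segments vary in length. I need to merge each marked segment into a single element of the vector.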

The input looks like this:

c("This","is","some","text","with","<marked","up","chunks>[L]","in","it")

I need the output to look like this:

c("This","is","some","text","with","<marked up chunks>[L]","in","it")

Thanks.

1 answer:

Answer 0: (score: 0)

Here is one approach, which also works when the vector contains multiple chunks:

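vec <- c("This","is","some","text","with","<marked","up","chunks>[L]","in","it")

# positions of the elements that open and close each marked chunk
from <- grep("<", vec)
to <- grep(">", vec)

# one index sequence per chunk
idx <- mapply(seq, from, to, SIMPLIFY = FALSE)

# paste the words of each chunk into a single string
new_strings <- sapply(idx, function(x) 
  paste(vec[x], collapse = " "))

# merged string in the first slot of each chunk, NA in the remaining slots
replacement <- unlist(mapply(function(x, y) c(y, rep(NA, length(x) - 1)), 
                             idx, new_strings, SIMPLIFY = FALSE))

# insert the replacements, drop the NA placeholders, and strip the attributes added by na.omit
new_vec <- "attributes<-"(na.omit(replace(vec, unlist(idx), replacement)), NULL)
new_vec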


[1] "This"                  "is"                   
[3] "some"                  "text"                 
[5] "with"                  "<marked up chunks>[L]"
[7] "in"                    "it"
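
For comparison, the same merge can probably also be done by assigning a group number to every element and pasting each group together. This is only a sketch, not the method used in the answer above: `merge_marked` is a hypothetical helper name, and it assumes that chunks are not nested and that every element containing `<` is eventually followed by (or is itself) an element containing `>`.

```r
# Hypothetical helper: merge each <...> chunk into one element.
# Assumes non-nested chunks and a matching ">" for every "<".
merge_marked <- function(vec) {
  opens  <- grepl("<", vec, fixed = TRUE)
  closes <- grepl(">", vec, fixed = TRUE)

  # 1 while a chunk is still open after this element, 0 otherwise
  depth_after <- cumsum(opens) - cumsum(closes)

  # an element belongs to a chunk if the chunk is still open or closes here
  inside <- depth_after > 0 | closes

  # a new output element starts at every plain word and at every chunk opener
  starts_group <- !inside | opens
  group <- cumsum(starts_group)

  # paste the words of each group back together
  unname(vapply(split(vec, group), paste, character(1), collapse = " "))
}

vec <- c("This","is","some","text","with","<marked","up","chunks>[L]","in","it")
merge_marked(vec)
```

Because the grouping is computed with `cumsum`, no `NA` placeholders are needed, so any `NA` values already present in the input would not be dropped by `na.omit` as a side effect.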