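I have a character vector in which each element is a single word. It was generated from a text in which some segments are marked up with angle brackets. These segments vary in length. I need to merge each marked segment into a single element of the vector.
The input looks like this: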
c("This","is","some","text","with","<marked","up","chunks>[L]","in","it")
I need the output to look like this:
c("This","is","some","text","with","<marked up chunks>[L]","in","it")
Thanks.
Answer 0 (score: 0)
Here is one approach, which also works when the vector contains multiple chunks:
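vec <- c("This","is","some","text","with","<marked","up","chunks>[L]","in","it")
# Positions of the words that open and close each marked chunk
from <- grep("<", vec)
to <- grep(">", vec)
# One index sequence per chunk (here 6:8)
idx <- mapply(seq, from, to, SIMPLIFY = FALSE)
# Collapse the words of each chunk into a single string
new_strings <- sapply(idx, function(x) paste(vec[x], collapse = " "))
# Per chunk: the merged string, then NA placeholders for the remaining positions
replacement <- unlist(mapply(function(x, y) c(y, rep(NA, length(x) - 1)),
                             idx, new_strings, SIMPLIFY = FALSE))
# Write the replacements in, drop the NAs, and strip the attributes added by na.omit()
new_vec <- "attributes<-"(na.omit(replace(vec, unlist(idx), replacement)), NULL)
new_vec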
[1] "This"                  "is"
[3] "some"                  "text"
[5] "with"                  "<marked up chunks>[L]"
[7] "in"                    "it"
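For comparison, the same merge can be sketched without NA placeholders by giving every word a group number and pasting each group together. This is only an illustrative alternative, not the answer's method: the function name `merge_marked` is made up here, and the sketch assumes that chunks are not nested and that every word containing "<" is eventually followed by (or is itself) a word containing ">".

```r
# Hypothetical helper (name chosen for illustration only).
# Assumes non-nested chunks and a matching ">" for every "<".
merge_marked <- function(vec) {
  opens  <- grepl("<", vec, fixed = TRUE)
  closes <- grepl(">", vec, fixed = TRUE)
  # depth is 1 from the opening word through the closing word, 0 elsewhere;
  # the closes are shifted by one so the closing word still counts as inside
  depth  <- cumsum(opens) - cumsum(c(FALSE, head(closes, -1)))
  inside <- depth > 0
  # A new group starts at every word outside a chunk and at every opening word
  group  <- cumsum(!inside | opens)
  # Paste the words of each group back together
  unname(vapply(split(vec, group), paste, character(1), collapse = " "))
}

vec <- c("This","is","some","text","with","<marked","up","chunks>[L]","in","it")
merge_marked(vec)
```

Because the grouping is built from running counts, a vector with several chunks, including one-word chunks such as "<e>", is handled in the same pass.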