Split a string on a delimiter, keeping the delimiter at the start of each piece

Time: 2018-06-08 09:37:53

Tags: r

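An obvious extension of the question "R split on delimiter (split) keep the delimiter (split)" is: how do you split a string so that each delimiter stays at the beginning of the piece that follows it?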

x <- "What is this?  It's an onion.  What! That's| Well Crazy."

The solution

unlist(strsplit(x, "(?<=[?.!|])", perl=TRUE))

gives:

"What is this?"    "  It's an onion." "  What!" " That's|" " Well Crazy."

I am looking for:

"What is this"    "? It's an onion" ".  What" "! That's" "| Well Crazy."

Changing the positive lookbehind into a positive lookahead does not solve the problem.
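A minimal sketch of that failed attempt, for reference. The explanation in the comments is my own reading of how strsplit() consumes the string, not something stated in the question: once a piece has been cut off, the remainder begins with the delimiter, so the zero-width lookahead matches again at position 0 and the delimiter appears to be peeled off as a piece of its own.

```r
x <- "What is this?  It's an onion.  What! That's| Well Crazy."

# Lookahead only: the split point sits right before each delimiter.
ahead <- unlist(strsplit(x, "(?=[?.!|])", perl = TRUE))
print(ahead)

# The pieces asked for, with the delimiter leading each following piece.
wanted <- c("What is this", "?  It's an onion", ".  What",
            "! That's", "| Well Crazy.")

# FALSE: the lookahead-only split does not produce the wanted pieces.
print(identical(ahead, wanted))

# Zero-width splits drop no characters, so the pieces still rebuild x.
print(identical(paste(ahead, collapse = ""), x))
```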

1 answer:

Answer 0: (score: 1)

I managed to solve it using a positive lookahead followed by a word boundary marker:

x <- "What is this?  It's an onion.  What! That's| Well Crazy."
unlist(strsplit(x, "(?=[?.!|].)\\b", perl=TRUE))

[1] "What is this"     "?  It's an onion" ".  What"          "! That's"        
[5] "| Well Crazy."
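As a rough sketch, the same pattern can be wrapped in a small helper and checked. The name split_keep_leading() and its delims argument are hypothetical, introduced here only for illustration. Two assumptions are built into the pattern: every delimiter is directly preceded by a word character (otherwise \b does not match), and the trailing `.` inside the lookahead means a delimiter at the very end of the string is not split on. My reading of why \b helps is that, after a piece is cut off, the remainder starts with the delimiter and has no word character before it, so the pattern cannot match again at position 0.

```r
# Hypothetical helper; the name and the delims argument are illustrative only.
split_keep_leading <- function(s, delims = "[?.!|]") {
  # Builds (?=[?.!|].)\b : a delimiter followed by at least one character,
  # at a position that is also a word boundary.
  pattern <- paste0("(?=", delims, ".)\\b")
  unlist(strsplit(s, pattern, perl = TRUE))
}

x <- "What is this?  It's an onion.  What! That's| Well Crazy."
parts <- split_keep_leading(x)
print(parts)

# The split is zero-width, so pasting the pieces back together rebuilds x.
stopifnot(identical(paste(parts, collapse = ""), x))
```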
