Hello, I have a string that represents a Linux command, like this:
x <- "cd/etc/init[BKSP][BKSP]it.d[ENTER]"
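I want to split the string character by character while keeping the contents of the square brackets intact. Basically, I want to preserve the key presses. The desired output, if using str_split, is: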
c("c","d","/","e","t","c","i",......"BKSP","BKSP","i","t",".", "d", "ENTER")
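Can someone help me with this? I have been playing around with regular expressions and have not figured out how to achieve it.
I tried /.*?[^[A-Z*?]/
but it did not work. I also tried adding the delimiter to a match group to split the string.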
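Answer 0 (score: 2)
Match the whole bracketed substrings, and match any other single character, using a branch reset group. Then remove the outer square brackets from the matches found: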
> x <- c("cd/etc/init[BKSP][BKSP]it.d[ENTER]", "abc]", "[abc")
> matches <- regmatches(x, gregexpr("(?s)(?|\\[([^][]*)]|(.))", x, perl=TRUE))
> sapply(matches, sub, pattern="\\[(.*)\\]", replacement="\\1")
[[1]]
[1] "c" "d" "/" "e" "t" "c" "/" "i" "n" "i" "t" "BKSP" "BKSP" "i" "t" "." "d" "ENTER"
[[2]]
[1] "a" "b" "c" "]"
[[3]]
[1] "[" "a" "b" "c"
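The two steps above could be wrapped in a small helper, sketched here under a few assumptions. The name split_keys is hypothetical and does not appear in the answer. The sketch also swaps sapply for lapply, since sapply may simplify the result to a matrix when every input yields the same number of tokens, and it anchors the bracket-stripping pattern so that only a token that is a complete [...] group is changed.

```r
# Hypothetical helper (not part of the original answer) wrapping its two steps.
split_keys <- function(x) {
  # Step 1: match either a whole [...] group or any single character
  matches <- regmatches(x, gregexpr("(?s)(?|\\[([^][]*)]|(.))", x, perl = TRUE))
  # Step 2: strip the brackets, but only from tokens that are a complete [...] group
  lapply(matches, sub, pattern = "(?s)^\\[(.*)\\]$", replacement = "\\1", perl = TRUE)
}

# One list element per input string
split_keys(c("cd[BKSP]x[ENTER]", "abc]", "[abc"))
```

Unbalanced brackets, as in "abc]" and "[abc", are kept as ordinary single-character tokens, matching the behaviour shown in the output above.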
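See the regex demo. Details:
(?s) - the DOTALL modifier, which lets . match newline characters
(?|\[([^][]*)]|(.)) - a branch reset group, in which the capturing groups in different alternation branches share the same group ID:
\[([^][]*)] - a [ followed by zero or more characters other than [ and ], captured into Group 1, and then a ]
| - or
(.) - any single character (again captured into Group 1).
Answer 1 (score: 0)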
Here is my rather long-winded solution to your problem (I adapted the approach from this post: split string with regex)
x <- "[a] + [bc] + 1"
foo <- function(x){
  #Mark square brackets with commas
  x <- gsub("\\[",",[",x)
  x <- gsub("\\]","],",x)
  #Separate the string based on commas
  x <- unlist(strsplit(x,","))
  #Find which vector elements contain brackets
  ind <- grepl("\\[.*\\]", x)
  y <- character()
  for (a in seq_along(x)){
    if (nchar(x[a])!=0){
      if (ind[a]){
        y <- c(y, x[a]) #Store original character
      } else {
        y <- c(y, unlist(strsplit(x[a], ""))) #Store split character
      }
    }
  }
  #Remove the brackets
  y <- gsub("\\[", "", y)
  y <- gsub("\\]", "", y)
  return(y)
}
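Because foo() uses commas as markers, any comma already present in the input is consumed by the split. A minimal sketch of the same mark-and-split idea that avoids this is shown below. The function name split_marked and the choice of the control character "\001" as separator are assumptions for illustration, not part of the answer, and the sketch expects a single input string that does not itself contain that character.

```r
# Hypothetical variant of foo(): same mark-and-split idea, but the marker is a
# control character, so commas in the input are kept as ordinary characters.
split_marked <- function(x, sep = "\001") {
  # Surround every complete [...] group with the separator
  marked <- gsub("(\\[[^][]*\\])", paste0(sep, "\\1", sep), x, perl = TRUE)
  # Split on the separator and drop the empty pieces it leaves behind
  parts <- unlist(strsplit(marked, sep, fixed = TRUE))
  parts <- parts[nzchar(parts)]
  # A bracketed part becomes one token; any other part is split per character
  tokens <- lapply(parts, function(p) {
    if (grepl("^\\[.*\\]$", p)) substr(p, 2, nchar(p) - 1) else strsplit(p, "")[[1]]
  })
  unlist(tokens)
}

split_marked("[a], [bc] + 1")
```

Unlike foo(), this variant strips brackets only from complete groups, so a stray [ or ] in the input remains as its own token.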