Question

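I need to replace every run of more than 2 consecutive 1s with the same number of zeros. Currently I can find the matches as shown below, but I don't know how to replace each match with exactly as many zeros as it contains: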

ind<-c(1,1,0,0,0,1,1,1,1,0,1,1,0,0,0,1,1,0,1,0,0,1,0,1,0,1,0,1,1,1,1,1,0,1,0,1,0,1,1,1,0)
gsub("([1])\\1\\1+","0",paste0(ind,collapse=""))

which gives

"11000001100011010010101000101000"

because it replaces each match with a single 0, but I need

"11000000001100011010010101000000010100000"

Answer 1

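You can use the following gsub replacement:

gsub("1(?=1{2,})|(?!^)\\G1", "0", paste0(ind, collapse=""), perl=TRUE)

[1] "11000000001100011010010101000000010100000"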

See the IDEONE demo.

The regex is Perl-based (so gsub needs perl=TRUE), since it uses a lookahead and the \G operator.

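This regex matches:

1 - a literal 1 if ...
(?=1{2,}) - ... it is followed by 2 or more 1s, or ...
(?!^)\\G1 - any 1 that comes right after the previous match (but not at the start of the string).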
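
To see how the two alternatives cooperate, here is a minimal sketch on a shorter input. The helper name zero_long_runs is hypothetical and only wraps a gsub call built from the parts listed above; it assumes the input is a single string of 0s and 1s.

```r
# Hypothetical helper: zero out every run of 3 or more consecutive 1s.
zero_long_runs <- function(s) {
  # 1(?=1{2,}) : a 1 that still has at least two more 1s right after it
  # (?!^)\\G1  : a 1 sitting directly after the previous match
  #              (the (?!^) keeps \G from firing at the very start)
  gsub("1(?=1{2,})|(?!^)\\G1", "0", s, perl = TRUE)
}

zero_long_runs("0110111011110")
# [1] "0110000000000"
```

The lookahead alternative consumes the leading 1s of a long run, and the \G alternative picks up the last two 1s that the lookahead can no longer accept. The pair "11" is left alone because neither alternative can start a match there.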

For details on the \G operator, see "What good is \G in a regular expression?" at perldoc.perl.org and the SO post "When is \G useful application in a regex?".

Answer 2

A solution that uses rle instead of a regex:

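x = rle(ind)                           # run lengths and the value of each run
x$values[x$lengths>2 & x$values] <- 0  # runs of 1s longer than 2 become runs of 0s
inverse.rle(x)                         # expand back to a vector of the original length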

#[1] 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0
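
As a rough illustration of what rle produces along the way, the sketch below runs the same idea on a short vector. The wrapper zero_runs_rle and its min_len parameter are hypothetical names, not part of the answer above.

```r
# Hypothetical wrapper around the rle() approach; min_len is the shortest
# run of 1s that gets zeroed (3 corresponds to "more than 2").
zero_runs_rle <- function(v, min_len = 3) {
  r <- rle(v)
  # r$lengths: length of each run; r$values: the value repeated in that run
  r$values[r$lengths >= min_len & r$values == 1] <- 0
  inverse.rle(r)
}

v <- c(0, 1, 1, 0, 1, 1, 1, 0)
rle(v)$lengths    # 1 2 1 3 1
rle(v)$values     # 0 1 0 1 0
zero_runs_rle(v)  # 0 1 1 0 0 0 0 0
```

Because only the run values change and the run lengths are left untouched, the result always has the same length as the input.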

gsub: replace regex matches with a regex replacement string
