Replace every Nth occurrence of a character in a string with something else

Time: 2019-04-26 21:11:03

Tags: r regex

Consider a = paste(1:10,collapse=", "), which results in

a = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

I would like to replace every n-th (say, every 4th) occurrence of "," with something else (say, "\n"). The desired output would be:

"1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

I am looking for code that uses gsub (or something equivalent) and some form of regular expression to achieve this.

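4 Answers:

Answer 0: (score: 6)

You can replace ((?:\d+, ){3}\d), with \1\n

Basically, this captures everything before the fourth comma in group 1: three items, each followed by a comma and a space, and then one more digit. The whole match, including the fourth comma, is then replaced with \1\n, that is, the group 1 text followed by a newline, which gives the expected result. Note that the final \d matches exactly one digit, so the pattern as written assumes every fourth item is a single digit, which holds for the example input.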

R Code demo

gsub("((?:\\d+, ){3}\\d),", "\\1\n", "1, 2, 3, 4, 5, 6, 7, 8, 9, 10")

Prints

[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

Edit:

To generalize the above solution to any text, we can change \d to [^,] (any character other than a comma):

New R code demo

gsub("((?:[^,]+, ){3}[^,]+),", "\\1\n", "1, 2, 3, 4, 5, 6, 7, 8, 9, 10")
gsub("((?:[^,]+, ){3}[^,]+),", "\\1\n", "a, bb, ccc, dddd, 500, 600, 700, 800, 900, 1000")

Output

[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
[1] "a, bb, ccc, dddd\n 500, 600, 700, 800\n 900, 1000"
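The hard-coded {3} and the comma can also be turned into parameters. Below is a minimal sketch of that idea, not something from the answer itself: the function name replace_every_nth and its arguments are made up for illustration, it uses [^,]* (so empty items are allowed), and it assumes the character is safe to place inside a bracket expression.

```r
# Hypothetical helper, not part of the answer above.
# Replaces every n-th occurrence of the single character `ch` in `x` with `rp`.
replace_every_nth <- function(x, ch, rp, n) {
  stopifnot(nchar(ch) == 1, n >= 1)
  # Assumption: `ch` is safe inside a bracket expression. This sketch does not
  # handle "^", "]", "-" or a backslash as `ch`.
  item <- sprintf("[^%s]*", ch)
  # Group 1: (n - 1) items each followed by `ch`, then one more item.
  # The n-th `ch` sits outside the group, so it is the part that gets replaced.
  pattern <- sprintf("((?:%s[%s]){%d}%s)[%s]",
                     item, ch, as.integer(n) - 1L, item, ch)
  # Assumption: `rp` contains no backslashes, since gsub would interpret them.
  gsub(pattern, paste0("\\1", rp), x, perl = TRUE)
}

a <- paste(1:10, collapse = ", ")
replace_every_nth(a, ",", "\n", 4)
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
replace_every_nth("a;b;c;d;e", ";", "|", 2)
#> [1] "a;b|c;d|e"
```

perl = TRUE is used here only so that the (?:...) group is handled by PCRE; the replacement string works the same way as in the calls above.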

Answer 1: (score: 1)

Also using regex with gsub:

a = paste(1:10,collapse=", ")
x <- gsub("([^,]*,[^,]*,[^,]*,[^,]*),", '\\1\n', a)
x
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

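Answer 2: (score: 1)

Regex is the best way to go, but here is another approach without a regular expression (using a from the question): split the string on spaces, then replace the comma in every fourth piece.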

> str_vec <- strsplit(a, " ")[[1]] 
> where <- seq_along(str_vec) %% 4 == 0
> str_vec[where] <- sub(",", "\n", str_vec[where])
> paste(str_vec, collapse=" ")
[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
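The same split-and-rejoin idea can be written so that it does not depend on a space following each comma. This is only a sketch under some assumptions: the name replace_every_nth_fixed is hypothetical, x is a single string, and a trailing separator at the very end of x would be lost, because strsplit drops it.

```r
# Hypothetical helper, not part of the answer above.
# Splits on the literal separator `sep` and rejoins the pieces, using `rp`
# in place of every n-th separator.
replace_every_nth_fixed <- function(x, sep, rp, n) {
  # fixed = TRUE treats `sep` as a literal string, not as a regex
  parts <- strsplit(x, sep, fixed = TRUE)[[1]]
  k <- length(parts) - 1L          # number of separators between the pieces
  if (k < 1L) return(x)            # nothing to replace
  seps <- rep(sep, k)
  seps[seq_len(k) %% n == 0] <- rp # every n-th separator becomes `rp`
  # Glue each piece to the separator that follows it (none after the last one)
  paste0(parts, c(seps, ""), collapse = "")
}

a <- paste(1:10, collapse = ", ")
replace_every_nth_fixed(a, ",", "\n", 4)
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
```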

Answer 3: (score: 1)

regmatches as another option:

a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

fn <- ","
rp <- "\n"
n <- 4

regmatches(a, gregexpr(fn, a)) <- list(c(rep(fn,n-1),rp))
a
#[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

As a function:

a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

replN <- function(x, fn, rp, n) {
    regmatches(x, gregexpr(fn, x)) <- list(c(rep(fn,n-1),rp))
    x
}
replN(a, ",", "\n", 4)
#[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

You can even extend it to be vectorised over the replacement argument:

a = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

replN <- function(x,fn,rp,n) {
    sel <- rep(fn, n*length(rp))
    sel[seq_along(rp)*n] <- rp
    regmatches(x, gregexpr(fn, x)) <- list(sel)
    x
}
replN(a, fn=",", rp=c("1st","2nd"), n=4)
#[1] "1, 2, 3, 41st 5, 6, 7, 82nd 9, 10"
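One thing to keep in mind with replN is that fn is used both as a regex (in gregexpr) and as the literal text put back for the untouched matches. If the character is a regex metacharacter such as ".", a variant with fixed = TRUE may be safer. The sketch below is an assumption on my part rather than part of the answer, and the name replN_fixed is made up; like the original, it relies on regmatches<- recycling the replacement vector over all matches.

```r
# Hypothetical variant of replN that treats `fn` as a literal string.
replN_fixed <- function(x, fn, rp, n) {
  # fixed = TRUE: match `fn` literally instead of as a regular expression
  m <- gregexpr(fn, x, fixed = TRUE)
  # Keep the first n - 1 matches as they are and replace the n-th;
  # the vector is recycled over the remaining matches
  regmatches(x, m) <- list(c(rep(fn, n - 1), rp))
  x
}

replN_fixed("a.b.c.d.e", ".", "|", 2)
#> [1] "a.b|c.d|e"
replN_fixed("1, 2, 3, 4, 5, 6, 7, 8, 9, 10", ",", "\n", 4)
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
```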