Question

I'm trying to design a regular expression that produces the following two results:

foobar_foobar_190412_foobar_foobar.jpg  =>  190412
foobar_20190311_2372_foobar.jpg         =>  20190311

The regular expression I came up with is close, but I can't work out how to make it output only the first number:

.*_(\d+)_(\d*).*                        =>  $1

foobar_foobar_190412_foobar_foobar.jpg  =>  190412
foobar_20190311_2372_foobar.jpg         =>  (no match)

Does anyone know how?
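
For anyone who wants to reproduce the behaviour, the pattern can be run in a JavaScript console. The leading `.*` is greedy, so it swallows as much of the name as it can, and `(\d+)` lands on the last underscore-delimited number rather than the first. This is only a sketch, assuming JavaScript's regex engine; other tools may report the second case differently:

```javascript
// The pattern from the question, unchanged.
const greedy = /.*_(\d+)_(\d*).*/;

const names = [
  'foobar_foobar_190412_foobar_foobar.jpg',
  'foobar_20190311_2372_foobar.jpg',
];

// Replace each whole name with capture group 1.
const results = names.map((fileName) => fileName.replace(greedy, '$1'));

console.log(results); // [ '190412', '2372' ]
```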

Answer 1

With the options -P (Perl-compatible regex) and -o (print only the matching part):

grep -Po '^\D+\K\d+' file.txt
190412
20190311

Explanation:

^           # beginning of line
  \D+       # 1 or more non digit, you can use \D* for 0 or more non digits
  \K        # forget all we have seen until this position
  \d+       # 1 or more digits
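
`\K` is a PCRE feature that JavaScript's regex engine does not support. As a rough equivalent, a capture group can stand in for it. The sketch below assumes Node.js or a browser console, and `firstNumber` is a hypothetical helper name, not something from the answer:

```javascript
// Capture the first run of digits that follows one or more non-digits.
// The group (\d+) plays the role of "\K\d+" in the grep version.
function firstNumber(line) {
  const m = /^\D+(\d+)/.exec(line);
  return m ? m[1] : null; // null when the line has no such number
}

console.log(firstNumber('foobar_foobar_190412_foobar_foobar.jpg')); // 190412
console.log(firstNumber('foobar_20190311_2372_foobar.jpg'));        // 20190311
```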

Edit, after I misread the grep tag:

You can do this instead:

Find:    ^\D+(\d+)_.*$
Replace: $1
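
The find/replace can be tried out in JavaScript as well. This is only a sketch: it uses `^\D+(\d+)_.*$` (with `\D+`, so any number of leading non-digits is allowed) and handles each line separately, because `\D` also matches newline characters. The third input line is a made-up name without digits, to show that non-matching lines pass through unchanged:

```javascript
const input = [
  'foobar_foobar_190412_foobar_foobar.jpg',
  'foobar_20190311_2372_foobar.jpg',
  'no_digits_here.jpg', // hypothetical name, not from the question
].join('\n');

// Apply the find/replace to every line on its own.
const output = input
  .split('\n')
  .map((line) => line.replace(/^\D+(\d+)_.*$/, '$1'))
  .join('\n');

console.log(output);
```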

Answer 2

If you care about matching the underscores, here is a sed version:

sed -E 's/[^0-9]*_([0-9]+)_.*/\1/' file
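
To illustrate why the underscores might matter, here is a small JavaScript sketch. The name `img2_foobar_190412_foobar.jpg` is hypothetical (it is not from the question) and contains a digit that is not enclosed by underscores:

```javascript
const fileName = 'img2_foobar_190412_foobar.jpg'; // hypothetical

// First run of digits after the leading non-digits, underscores or not.
const anyDigits = fileName.match(/^\D+(\d+)/)[1];

// First run of digits enclosed by underscores.
const delimited = fileName.match(/_(\d+)_/)[1];

console.log(anyDigits); // 2
console.log(delimited); // 190412

// The sed pattern, ported as-is, applied to a name from the question.
const sedLike = /[^0-9]*_([0-9]+)_.*/;
const ported = 'foobar_20190311_2372_foobar.jpg'.replace(sedLike, '$1');
console.log(ported); // 20190311
```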

Answer 3

This is what I was after:

find:    \D+_(\d+)_.*
replace: $1

I didn't know about the "non-digit" character class!

Answer 4

If we wish to capture the first number, we can use the following simple expression:

_([0-9]+)?_

or

.+?_([0-9]+)?_.+

and then replace the match with $1.
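
The two expressions behave differently under replacement, which a quick JavaScript sketch can show. The double-underscore name at the end is hypothetical and only there to show what the optional `?` on the group allows:

```javascript
const fileName = 'foobar_foobar_190412_foobar_foobar.jpg';

// Short expression: only "_190412_" is matched, so the rest of the name stays.
const partial = fileName.replace(/_([0-9]+)?_/, '$1');

// Long expression: the whole name is matched, so only the number remains.
const full = fileName.replace(/.+?_([0-9]+)?_.+/, '$1');

console.log(partial); // foobar_foobar190412foobar_foobar.jpg
console.log(full);    // 190412

// Hypothetical name with a double underscore: the optional group matches
// nothing there, so $1 is empty.
const empty = 'foo__bar_123_baz.jpg'.replace(/.+?_([0-9]+)?_.+/, '$1');
console.log(JSON.stringify(empty)); // ""
```

If names with empty fields can occur, dropping the `?` after the group makes the pattern skip ahead to `_123_` instead.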

RegEx Circuit

jex.im visualizes regular expressions.

Demo

This snippet just shows how the capturing groups work:

const regex = /_([0-9]+)?_/gm;
const str = `foobar_foobar_190412_foobar_foobar.jpg
foobar_20190311_2372_foobar.jpg`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}

How can I replace only the first instance using grep / regex?
