正则表达式,用于匹配单词后跟斜杠和10位数字

时间:2019-05-16 23:21:08

标签: python regex regex-lookarounds regex-group regex-greedy

我有一个字符串,我试图搜索所有以mystring/开头并以10位ID号结尾的字符串。我想输出带有附件字符串的所有这些ID的列表。

我不太了解regex,但是我猜想这是这里要使用的库。我已经从下面开始了:

import re
source = 'mystring/1234567890 hello world mystring/2345678901 hello'
re.findall("mystring/",source)

3 个答案:

答案 0 :(得分:0)

您可以使用

r"\bmystring/(\d{10})(?!\d)"

请参见regex demo

详细信息

  • \bmystring/-单词边界,整个单词仅与mystring匹配,结尾为/
  • (\d{10})-捕获第1组:10位数字
  • (?!\d)-后面没有数字。

Python demo

import re
source = 'mystring/1234567890 hello world mystring/2345678901 hello'
matches = re.finditer(r"\bmystring/(\d{10})(?!\d)", source)
for match in matches:
    print("Whole match: {}".format(match.group(0)))
    print("Group 1: {}".format(match.group(1)))

输出:

Whole match: mystring/1234567890
Group 1: 1234567890
Whole match: mystring/2345678901
Group 1: 2345678901

或者,只需使用

print(re.findall(r"\bmystring/(\d{10})(?!\d)", source))

将输出ID列表:['1234567890', '2345678901']

答案 1 :(得分:0)

在这里,我们可以使用两个捕获组,并捕获带有和不带有ID的mystring:

// Start button will disappear after click and countDown method will begin
function startTimer(){
    startButton.style.display="none"; 
    counter.style.display = "";  // <----------------------HERE
    for (var i = 0; i < pausePlay.length; i++) {
        pausePlay[i].style.display = "block";
      }


    countDown(10);
}

测试

(mystring\/([0-9]{10}))

enter image description here

RegEx

如果不需要此表达式,可以在regex101.com中对其进行修改/更改。

RegEx电路

jex.im可视化正则表达式:

enter image description here

演示

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"(mystring\/([0-9]{10}))"

test_str = "hello mystring/1234567890 hello world mystring/2345678901 hellomystring/1234567890 hello world mystring/2345678901 hello"

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):

    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))

    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1

        print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.

答案 2 :(得分:0)

您可以使用单词边界\b来防止mystring成为较大单词的一部分,然后使用quantifier将正斜杠后跟10个数字\d{10}进行匹配:

\bmystring/\d{10}

Regex demo | Python demo

例如:

import re
source = 'mystring/1234567890 hello world mystring/2345678901 hello'
print(re.findall(r"\bmystring/\d{10}",source))

结果:

['mystring/1234567890', 'mystring/2345678901']

如果您只想列出数字,则可以在后面使用正向查找:

(?<=\bmystring/)\d{10}(?!\S)
  • (?<=\bmystring/)向后看,断言左边直接是mystring
  • \d{10}匹配10位数字
  • (?!\S)负向查找,断言右边直接不是非空格字符

Regex demo | Python demo