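Python - Using Regex

Time: 2019-05-29 16:43:19

Tags: python regex

I am trying to use a regex to extract a date (in the format yyyymmddhhmmss) from a string, but I can't work out which pattern to use.

I am trying the following code: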

import re
string = "date file /20190529050003/folder "
regex = re.compile(r'\b\d{4}\d{2}\d{2}\s\d{2}\d{2}\d{2}\b')
result = regex.findall(string)[0],
print(result)

But I get the following error:

result = regex.findall(string)[0],
IndexError: list index out of range

How can I use a regular expression to return "20190529050003" from the string in my script?

Thanks!

4 answers:

Answer 0 (score: 3)

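If our date comes right after a slash, we can simply use the following expression:

.+\/(\d{4})(\d{2})(\d{2}).+

Then, if necessary, we can add more boundaries, for example:

.+\/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2}).+

Or:

^.+\/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})\/.+$

Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r".+\/(\d{4})(\d{2})(\d{2}).+"

test_str = "date file /20190529050003/folder "

# Rewrite the whole match as year-month-day using the three captured groups
subst = "\\1-\\2-\\3"

# You can limit the number of replacements by changing count (0 means replace all)
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.

If we want to get all the digits, we can use another expression:

.+\/(\d+)\/.+

Test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r".+\/(\d+)\/.+"

test_str = "date file /20190529050003/folder "

# Replace the whole match with the captured run of digits
subst = "\\1"

# You can limit the number of replacements by changing count (0 means replace all)
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.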

RegEx Circuit

jex.im visualizes regular expressions.
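
As a rough sketch of how the six capture groups might be read directly instead of going through re.sub, the snippet below uses re.search and returns the groups. The helper name split_timestamp is hypothetical, and the pattern assumes the timestamp is enclosed in slashes, as in the sample string from the question.

```python
import re

# Six capture groups: year, month, day, hour, minute, second.
# Assumption: the 14-digit timestamp sits between two slashes.
PATTERN = re.compile(r"/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})/")


def split_timestamp(text):
    """Return (full_timestamp, field_tuple) for the first match, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    fields = match.groups()
    # Joining the groups rebuilds the 14-digit string without the slashes.
    return "".join(fields), fields


found = split_timestamp("date file /20190529050003/folder ")
if found is not None:
    full, fields = found
    print(full)
    print(fields)
```

Using re.search here avoids the leading and trailing .+ that re.sub needs in order to consume the rest of the string.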

Answer 1 (score: 1)

Your regex pattern is off because the target timestamp contains no whitespace. Here is a simple way to do the search:

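import re

string = "date file /20190529050003/folder "
# \d{14} matches the full yyyymmddhhmmss run; \b keeps it from matching inside a longer number
matches = re.findall(r'\b\d{14}\b', string)
print(matches)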

This prints:

['20190529050003']

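We could try to make the pattern more targeted, for example by allowing only valid values for the hour, minute, and other fields. However, that would take more work, and if you don't expect your text to contain any 14-digit numbers that are not timestamps, I suggest not making the pattern more complicated than it needs to be.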
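
If stricter validation does turn out to be needed, one option that avoids a more complicated pattern is to keep the simple regex and let datetime.strptime reject impossible values. This is only a sketch under that assumption; the helper name find_valid_timestamps is made up for illustration.

```python
import re
from datetime import datetime


def find_valid_timestamps(text):
    """Return a datetime for every standalone 14-digit run that parses as yyyymmddhhmmss."""
    results = []
    for candidate in re.findall(r"\b\d{14}\b", text):
        try:
            results.append(datetime.strptime(candidate, "%Y%m%d%H%M%S"))
        except ValueError:
            # 14 digits, but not a real date and time: skip it.
            continue
    return results


print(find_valid_timestamps("date file /20190529050003/folder "))
print(find_valid_timestamps("id 99999999999999 end"))
```

The regex stays short and readable, and the range checks for months, days, hours, and so on are left to the standard library.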

Answer 2 (score: 1)

Remove the \s from the expression. The timestamp has no whitespace between its date and time parts, so the pattern can never match with it.
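
For illustration, here is the pattern from the question with and without the \s, run against the same sample string. The variable names with_space and without_space are chosen only for this sketch.

```python
import re

string = "date file /20190529050003/folder "

# Pattern from the question: requires whitespace between the date and time parts.
with_space = re.compile(r"\b\d{4}\d{2}\d{2}\s\d{2}\d{2}\d{2}\b")

# Same pattern with the \s removed.
without_space = re.compile(r"\b\d{4}\d{2}\d{2}\d{2}\d{2}\d{2}\b")

no_matches = with_space.findall(string)
matches = without_space.findall(string)

print(no_matches)  # empty list, which is why indexing [0] raised IndexError
print(matches)
```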

Answer 3 (score: 0)

I suggest splitting the line that causes the error into two lines:

matches = regex.findall(string)
result = matches[0]

Now you can inspect matches to see what it contains.
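
Building on that split, a small guard can avoid the IndexError altogether when nothing matches. The sketch below assumes that returning None is an acceptable way to signal "no match"; the helper name first_match is hypothetical.

```python
import re


def first_match(pattern, text):
    """Return the first match of pattern in text, or None if there is none."""
    matches = re.findall(pattern, text)
    if not matches:
        # An empty list means the pattern did not match; indexing it would raise IndexError.
        return None
    return matches[0]


string = "date file /20190529050003/folder "

print(first_match(r"\b\d{8}\s\d{6}\b", string))  # pattern with \s: no match
print(first_match(r"\b\d{14}\b", string))
```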