re.search matches through to the last part of the string

Time: 2018-10-09 15:34:53

Tags: python regex python-3.x

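I'm using the re.search function to search for a substring. The problem is that the end of the string contains repeated data, and I only want to match the first data set, up to the first occurrence of the end pattern.

Here is the code: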

import re

# Read the binary file and convert its contents to a hex string
with open("flash-ori", "rb") as f:
    file = f.read().hex()

DTC_data = re.search("0080040004000100(.*)010202010202020202020202", file)

print(DTC_data.group())

This is what I get:

0080040004000100**DATA**01020201020202020202020202020202010202020102020202010202020202020202020102020202020a0202020202020202020202020a02020102020202020202020202020202020202020202020202020202020202020202020101010102020101010102020101010102020101010102020101010101020201010101010202010102010101020202010102010101020202010202020102020201020202020102020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202010202010202010202010202010202020202010202020202010202020202010202020202020202020202020201020201020201020201020

This is what I want:

0080040004000100**DATA**010202010202020202020202

Grateful for any solutions.

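2 answers:

Answer 0 (score: 0)

By default, regex quantifiers are greedy: given .*, they consume as much of the string as they can while still allowing the rest of the pattern to match. You can switch a quantifier to non-greedy mode by appending ?, which gives this regex: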

r"0080040004000100(.*?)010202010202020202020202"

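Also note that I added a leading r to make it a raw string literal. It makes no difference at all here, but it is best to always use raw string literals for regexes, because not doing so will eventually bite you, for example when you need the word boundary r'\b' and end up searching for the ASCII backspace character '\b' instead.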
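To illustrate the raw-string point, here is a minimal sketch. The sample text and variable names are made up for the demonstration and are not part of the question.

```python
import re

# Hypothetical sample text, chosen only to show the difference.
text = "cat concat cat"

# In a normal string literal, "\b" is a single backspace character (0x08).
backspace_pattern = "\bcat\b"
# In a raw string literal, r"\b" stays as backslash + "b",
# which the re module interprets as a word boundary.
boundary_pattern = r"\bcat\b"

word_hits = re.findall(boundary_pattern, text)        # whole-word "cat" only
backspace_hits = re.findall(backspace_pattern, text)  # needs literal backspaces

print(word_hits)
print(backspace_hits)
```

The raw-string pattern finds the two standalone words, while the non-raw pattern finds nothing because the text contains no backspace characters.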

Answer 1 (score: 0)

Change your regex to:

DTC_data = re.search("0080040004000100(.*?)010202010202020202020202", file)

The ? makes the quantifier non-greedy, so the match stops at the first occurrence of the end pattern.
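The difference can be seen on a small stand-in string. This is only a sketch: the sample data below is invented (a short payload followed by two copies of the end pattern) and is not the contents of the real file.

```python
import re

START = "0080040004000100"
END = "010202010202020202020202"

# Hypothetical hex string: the end pattern occurs twice, as in the question.
sample = START + "aabbcc" + END + "0202" + END + "ff"

# Greedy: .* runs to the LAST place where END can still match.
greedy = re.search(START + r"(.*)" + END, sample)
# Non-greedy: .*? stops at the FIRST place where END matches.
lazy = re.search(START + r"(.*?)" + END, sample)

print(greedy.group(1))  # payload plus the first END and the bytes after it
print(lazy.group(1))    # aabbcc
```

Note that group() returns the whole match including the start and end patterns, while group(1) returns only the captured data between them.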