input = """
endless gibberish
some more stuff
color texture mytexture
[640 480 1]
'BE4C16FFBD4B15FFBD4B15FFBD4B15FFBD4B15FFBE4C16FFBE4C16FFBD4B15FFBD4B15FF'
'BE4C16FFBE4C16FFBD4B15FFBC4A14FFBC4A14FFBC4A14FFBC4A14FFBC4A14FFBE4C16FF'
'BF4C16FFBF4C16FFBE4B15FFBE4B15FFBC4913FFBC4913FFBC4913FFBB4812FFBC4913FF'
'BC4A14FFBB4913FFBB4812FFBB4812FFBA4812FFBA4812FFBB4913FFBC4A16FFBB4915FF'
'B84612FFB84612FFB94713FFB84612FFB64410FFB64410FFB64410FFB4420EFFB3410DFF'
'FB03E0AFFB13F0BFB13F0BFFAE3C08FFAA3804FFAD3B07FFB03E0AFFB3410DFFB4420EFF'
'B4400DFFB13D0AFFB23C0AFFB03C09FFB23E0BFFB5410EFFB74310FFB94512FFB84411FF'
color texture mytexture2
[640 480 1]
'BE4C16FFBD4B15FFBD4B15FFBD4B15FFBD4B15FFBE4C16FFBE4C16FFBD4B15FFBD4B15FF'
'BE4C16FFBE4C16FFBD4B15FFBC4A14FFBC4A14FFBC4A14FFBC4A14FFBC4A14FFBE4C16FF'
(... etc...)
"""
需要在带有“纹理”的行之后获取括号和二进制blob数据之间的内容,直到到达空行。有几个“纹理”段落。
这是我到目前为止所得到的:
p = re.compile(r'texture\s+(\S+)\s+\[(\d+\s+\d+\s+\d+)\]\s+(\'.+\')')
matches = p.findall(data)
for match in matches:
print match[0]
print match[1]
print match[2]
print "---------------"
提供以下输出:
mytexture
640 480 1
'BE4C16FFBD4B15FFBD4B15FFBD4B15FFBD4B15FFBE4C16FFBE4C16FFBD4B15FFBD4B15FF'
---------------
mytexture2
640 480 1
'BE4C16FFBD4B15FFBD4B15FFBD4B15FFBD4B15FFBE4C16FFBE4C16FFBD4B15FFBD4B15FF'
---------------
我很确定re.MULTILINE应该用于获取整个 blob,但我不清楚如何抓取所有二进制行。 我的问题基本上是:如何抓住多条线并知道何时“停止”(即:到达空行)。
答案 0 :(得分:1)
re.MULTILINE
会影响^
和$
锚点的含义。我认为你想要的是re.DOTALL
,没有它.
字符将永远不会与换行符匹配。
要将所有文字与下一个空行匹配,您可以使用(.*?)\n\s*\n
之类的内容。这似乎可以满足您的需求?
p = re.compile(r'texture\s+(\S+)\s+\[(\d+\s+\d+\s+\d+)\]\s+(.*?)\n\s*\n', re.DOTALL)
matches = p.findall(input)
for match in matches:
print match[0]
print match[1]
print match[2]
print "---------------"
在示例文本中,这会产生:
mytexture
640 480 1
'BE4C16FFBD4B15FFBD4B15FFBD4B15FFBD4B15FFBE4C16FFBE4C16FFBD4B15FFBD4B15FF'
'BE4C16FFBE4C16FFBD4B15FFBC4A14FFBC4A14FFBC4A14FFBC4A14FFBC4A14FFBE4C16FF'
'BF4C16FFBF4C16FFBE4B15FFBE4B15FFBC4913FFBC4913FFBC4913FFBB4812FFBC4913FF'
'BC4A14FFBB4913FFBB4812FFBB4812FFBA4812FFBA4812FFBB4913FFBC4A16FFBB4915FF'
'B84612FFB84612FFB94713FFB84612FFB64410FFB64410FFB64410FFB4420EFFB3410DFF'
'FB03E0AFFB13F0BFB13F0BFFAE3C08FFAA3804FFAD3B07FFB03E0AFFB3410DFFB4420EFF'
'B4400DFFB13D0AFFB23C0AFFB03C09FFB23E0BFFB5410EFFB74310FFB94512FFB84411FF'
---------------
mytexture2
640 480 1
'BE4C16FFBD4B15FFBD4B15FFBD4B15FFBD4B15FFBE4C16FFBE4C16FFBD4B15FFBD4B15FF'
'BE4C16FFBE4C16FFBD4B15FFBC4A14FFBC4A14FFBC4A14FFBC4A14FFBC4A14FFBE4C16FF'
(... etc...)
---------------
答案 1 :(得分:0)
你是现场的,只是没有实际实现它;)
import re
input = """
endless gibberish
some more stuff
texture mytexture
'01AB01AB01AB01BA'
'01AB01AB01AB01BA'
'01AB01AB01AB01BA'
'01AB01AB01AB01BA'
"""
matches = re.findall(r'01AB01AB01AB01BA', input, re.M)
print matches