Question

I'm trying to use a regular expression to extract/match data from a string, but I can't seem to get it working.

I want to extract i386 (the text between the last - and .iso) from the following string:

/xubuntu/daily/current/lucid-alternate-i386.iso

This should also work for:

/xubuntu/daily/current/lucid-alternate-amd64.iso

Depending on the input, the result should be either i386 or amd64.

Thanks a lot for your help.

Answer 1

In this case you can also use split (instead of a regular expression):

>>> path = "/xubuntu/daily/current/lucid-alternate-i386.iso"
>>> path.split(".iso")[0].split("-")[-1]
'i386'

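split gives you a list of the pieces the string was split into. You can then pick out the piece you need with Python's indexing and slicing syntax; here [0] takes everything before ".iso" and [-1] takes the last dash-separated piece.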
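To make the intermediate steps visible, here is a minimal sketch of the same split approach. The function name extract_arch is hypothetical, and the sketch assumes the path ends in ".iso" and contains at least one dash.

```python
def extract_arch(path):
    """Return the text between the last '-' and the '.iso' suffix."""
    # Everything before ".iso":
    # '/xubuntu/daily/current/lucid-alternate-i386'
    stem = path.split(".iso")[0]
    # Pieces separated by dashes:
    # ['/xubuntu/daily/current/lucid', 'alternate', 'i386']
    parts = stem.split("-")
    # The last piece is the architecture.
    return parts[-1]


print(extract_arch("/xubuntu/daily/current/lucid-alternate-i386.iso"))   # i386
print(extract_arch("/xubuntu/daily/current/lucid-alternate-amd64.iso"))  # amd64
```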

Answer 2

r"/([^-]*)\.iso/"

The bit you want will be in the first capture group.

Answer 3

First, let's make our lives simpler and get just the filename.

>>> import os
>>> os.path.split("/xubuntu/daily/current/lucid-alternate-i386.iso")
('/xubuntu/daily/current', 'lucid-alternate-i386.iso')

Now it's just a matter of capturing everything between the last dash and '.iso'.
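The answer stops short of showing that capture step, so here is one possible way to do it. The function name arch_from_path and the exact pattern are assumptions, not part of the original answer; the pattern requires the name to end in ".iso".

```python
import os
import re


def arch_from_path(path):
    """Return the text between the last '-' and '.iso', or None if absent."""
    # Keep only the filename, e.g. 'lucid-alternate-i386.iso'.
    filename = os.path.split(path)[1]
    # [^-]+ cannot cross a dash, so the group can only start after the
    # last dash; \.iso$ anchors the match at the end of the filename.
    match = re.search(r"-([^-]+)\.iso$", filename)
    return match.group(1) if match else None


print(arch_from_path("/xubuntu/daily/current/lucid-alternate-i386.iso"))   # i386
print(arch_from_path("/xubuntu/daily/current/lucid-alternate-amd64.iso"))  # amd64
```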

Answer 4

If you want to match several of these lines, it is more efficient to call re.compile() once and keep the resulting regular expression object for reuse.

import re

s1 = "/xubuntu/daily/current/lucid-alternate-i386.iso"
s2 = "/xubuntu/daily/current/lucid-alternate-amd64.iso"

# Compile once, then reuse the pattern object for every line.
pattern = re.compile(r'^.+-(.+)\..+$')

m = pattern.match(s1)
print(m.group(1))  # i386

m = pattern.match(s2)
print(m.group(1))  # amd64

Answer 5

The expression should not have the leading slash.

import re

line = '/xubuntu/daily/current/lucid-alternate-i386.iso'
rex = re.compile(r"([^-]*)\.iso")
m = rex.search(line)
print(m.group(1))

This yields 'i386'.

Answer 6

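import re

# \w+ cannot match '-', so the group starts right after the last dash.
reobj = re.compile(r"(\w+)\.iso$")
match = reobj.search(subject)
if match:
    result = match.group(1)
else:
    result = ""  # no ".iso" suffix found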

subject contains the filename together with its path.
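As a usage sketch, the same pattern can be wrapped in a small helper and run over several paths. The helper name arch_or_empty and the third, non-matching path are made up for illustration.

```python
import re

# Same pattern as above, compiled once for reuse.
reobj = re.compile(r"(\w+)\.iso$")


def arch_or_empty(subject):
    """Return the word right before '.iso', or '' when there is no match."""
    match = reobj.search(subject)
    return match.group(1) if match else ""


subjects = [
    "/xubuntu/daily/current/lucid-alternate-i386.iso",
    "/xubuntu/daily/current/lucid-alternate-amd64.iso",
    "/xubuntu/daily/current/README.txt",  # hypothetical non-matching path
]
for subject in subjects:
    print(repr(arch_or_empty(subject)))
```

The empty-string fallback mirrors the else branch above, so callers do not need to handle None.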

Answer 7

>>> import os
>>> path = "/xubuntu/daily/current/lucid-alternate-i386.iso"
>>> file, ext = os.path.splitext(os.path.split(path)[1])
>>> processor = file[file.rfind("-") + 1:]
>>> processor
'i386'

Python: matching some characters in a string
