Question

I want to strip the version numbers from strings, for example:

Microsoft Visual C++ 2008 Redistributable - x86 9.0.30729.4148 9.0.30729.4148
Microsoft Visual C++ 2008 Redistributable - x86 9.0.30729.6161 9.0.30729.6161
Microsoft_VC80_DebugCRT_x86_x64 1.0.0
Microsoft_VC80_DebugCRT_x86 1.0.0
Windows UPnP Browser 0.1.01
CamStudio
Microsoft Visual C++ 2008 Redistributable - x86 9.0.30729.4148 9.0.30729.4148
Microsoft Visual C++ 2008 Redistributable - x86 9.0.30729.6161 9.0.30729.6161
Microsoft_VC80_DebugCRT_x86_x64 1.0.0
Microsoft_VC80_DebugCRT_x86 1.0.0

I want it to become:

Microsoft Visual C++ 2008 Redistributable - x86 
Microsoft Visual C++ 2008 Redistributable - x86 
Microsoft_VC80_DebugCRT_x86_x64 
Microsoft_VC80_DebugCRT_x86 
Windows UPnP Browser
CamStudio
Microsoft Visual C++ 2008 Redistributable - x86 
Microsoft Visual C++ 2008 Redistributable - x86 
Microsoft_VC80_DebugCRT_x86_x64 
Microsoft_VC80_DebugCRT_x86

Here is my code:

s="MicrosoftVisualC++2008Redistributable-x86\nMicrosoftVisualC++2008Redistributable-x86\nMicrosoft_VC80_DebugCRT_x86_x64\nMicrosoft_VC80_DebugCRT_x86\nWindowsUPnPBrowser\nCamStudioe\nMicrosoftVisualC++2008Redistributable-x86\nMicrosoftVisualC++2008Redistributable-x86\nMicrosoft_VC80_DebugCRT_x86_x64\nMicrosoft_VC80_DebugCRT_x86"
s1=s.replace('\r','').split('\n')
s2=[]
for s in s1:
    m = re.search('(?<=([ ]+[\.\d]*)*$)', s)
    s2.append(m.group(0))
print(s2)

I get this error:

error: look-behind requires fixed-width pattern

Is there a better way to do this?

Answer 1

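The trick is that a pattern can include parts that are not returned in any capturing group (that is, they are part of group(0) but not of any other group). Here is what I came up with: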

import re

# put the lines to clean in a string
s='''Microsoft Visual C++ 2008 Redistributable - x86 9.0.30729.4148 9.0.30729.4148
Microsoft Visual C++ 2008 Redistributable - x86 9.0.30729.6161 9.0.30729.6161
Microsoft_VC80_DebugCRT_x86_x64 1.0.0
Microsoft_VC80_DebugCRT_x86 1.0.0
Windows UPnP Browser 0.1.01
CamStudio
Microsoft Visual C++ 2008 Redistributable - x86 9.0.30729.4148 9.0.30729.4148
Microsoft Visual C++ 2008 Redistributable - x86 9.0.30729.6161 9.0.30729.6161
Microsoft_VC80_DebugCRT_x86_x64 1.0.0
Microsoft_VC80_DebugCRT_x86 1.0.0'''

# use findall to return the parts we want
print(re.findall(r'(.+?)(?: (?:[\d\.]+))*(?:\n|\Z)', s))

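Explanation of the regex:
(.+?) is a non-greedy capture of one or more characters.
(?: [\d\.]+)* is a non-capturing group, repeated zero or more times, that starts with a space followed only by digits or '.' characters (on each repetition).
(?:\n|\Z) matches a newline or the end of the string. If your string may contain carriage returns, you can use \r?(?:\n|\Z) instead.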
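
As a rough sketch of the carriage-return variant mentioned above, the following runs the pattern with \r? added on a small sample that uses Windows-style line endings. The sample text is made up for illustration; it is not the asker's data.

```python
import re

# Hypothetical sample with \r\n line endings (an assumption, not from the question)
text = "Microsoft_VC80_DebugCRT_x86 1.0.0\r\nCamStudio\r\nWindows UPnP Browser 0.1.01"

# Same pattern as in the answer, with an optional \r before the line terminator
pattern = r'(.+?)(?: [\d.]+)*\r?(?:\n|\Z)'

names = re.findall(pattern, text)
print(names)
```

Because the name group is non-greedy, it stops growing as soon as the rest of the pattern can match, so the \r is consumed by \r? rather than ending up in the captured name.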

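For a regex with exactly one capturing group, re.findall returns group(1) of each match in the string, which is exactly what you want. The other parts of the regex must still match, but because they are not captured, they are not returned.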
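
To make the difference between group(0) and group(1) concrete, here is a small sketch that compares re.finditer with re.findall on two of the sample lines. The variable names are chosen here for illustration only.

```python
import re

text = "Windows UPnP Browser 0.1.01\nCamStudio"
pattern = re.compile(r'(.+?)(?: [\d.]+)*(?:\n|\Z)')

# group(0) is the whole match: name, version tokens, and the newline if present
whole = [m.group(0) for m in pattern.finditer(text)]

# group(1) is only the captured name
captured = [m.group(1) for m in pattern.finditer(text)]

print(whole)
print(captured)
print(pattern.findall(text) == captured)
```

With a single capturing group, findall gives the same list as collecting group(1) from each match, which is why the version numbers and newlines disappear from the result.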
