Question

I am trying to format strings with a regular expression in Python 3 (the re module).

My input:

{'factorial.2.0.0.zip', 'Microsoft ASP.NET Web API 2.2 Client Libraries 5.2.3.zip', 'Newtonsoft.Json.9.0.1.zip'}

I am trying to get only the name and the version of each package, for example:

factorial.2.0.0.zip
- factorial
- 2.0.0
Microsoft ASP.NET Web API 2.2 Client Libraries 5.2.3.zip
- Microsoft ASP.NET Web API 2.2 Client Libraries
- 5.2.3

And so on. Here is my code:

if diff is not None:
    for values in diff.values():
        for value in values:
            temp = ''
            temp1 = ''
            temp = re.findall('[aA-zZ]+[0-9]*', value) #name pack
            temp1 = re.findall('\d+', value) #version
            print(temp)
            print(temp1)

My incorrect output:

 temp:
 ['Microsoft', 'ASP', 'NET', 'Web', 'API', 'Client', 'Libraries', 'zip']
 ['Newtonsoft', 'Json', 'zip']
 ['factorial', 'zip']

temp1:
['2', '0', '0']
['2', '2', '5', '2', '3']
['9', '0', '1']

Expected output:

temp:
['Microsoft', 'ASP', 'NET', 'Web', 'API', 'Client', 'Libraries']
['Newtonsoft', 'Json']
['factorial']

temp1:
['2', '0', '0']
['5', '2', '3']
['9', '0', '1']

How can I fix this so that "zip" and the extra digits are left out of the matches? Maybe there is another way to solve my problem.

Answer 1

Something like this?

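import re

a = {'factorial.2.0.0.zip', 'Newtonsoft.Json.9.0.1.zip',
     'Microsoft ASP.NET Web API 2.2 Client Libraries 5.2.3.zip',
     'namepack010.0.0.153.212583'}

for b in a:
    # Groups: name, three-part version, then either '.zip' or a fourth version part.
    # The name and the version are separated by a literal '.' or a space.
    c = re.findall(r'(.*?)[. ](\d+\.\d+\.\d+)(\.zip|\.\d+)$', b)[0]
    if c[2] == '.zip':
        print(c[0], '||', c[1])
    else:
        # No '.zip' suffix: the last group is part of the version.
        print(c[0], '||', c[1] + c[2])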

Output (the order may vary, because a set is unordered):

Newtonsoft.Json || 9.0.1
namepack010 || 0.0.153.212583
Microsoft ASP.NET Web API 2.2 Client Libraries || 5.2.3
factorial || 2.0.0
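
If the exact lists from the question are needed, the same idea can be wrapped in a small function. This is only a sketch: PACKAGE_RE, split_package and tokens are hypothetical names introduced here, and the pattern assumes every file name ends in a dotted numeric version with at least three parts, optionally followed by .zip.

```python
import re

# Hypothetical helper pattern. Assumed layout: <name><'.' or ' '><version>[.zip],
# where the version has at least three dot-separated numeric parts.
PACKAGE_RE = re.compile(
    r'^(?P<name>.+?)[. ](?P<version>\d+(?:\.\d+){2,})(?:\.zip)?$'
)

def split_package(filename):
    """Return (name, version), or None if the file name does not match."""
    m = PACKAGE_RE.match(filename)
    if m is None:
        return None
    return m.group('name'), m.group('version')

def tokens(filename):
    """Return (words in the name, version parts), as in the question."""
    parsed = split_package(filename)
    if parsed is None:
        return None
    name, version = parsed
    return re.findall(r'[a-zA-Z]+', name), version.split('.')

for f in ['factorial.2.0.0.zip',
          'Microsoft ASP.NET Web API 2.2 Client Libraries 5.2.3.zip']:
    print(split_package(f), tokens(f))
```

Because the version must have at least three parts, the "2.2" inside the Microsoft package name is not mistaken for the version. Returning None for a non-matching name avoids the IndexError that indexing an empty findall result with [0] would raise.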

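Do not use [aA-zZ] to match all letters. The range A-z also covers the ASCII characters that sit between Z and a ([, \, ], ^, _ and `), so it matches those as well. Use [a-zA-Z] instead.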

See this question for more detail: Why is this regex allowing a caret?
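
A quick way to see the difference between the two character classes is to test them against the ASCII characters that lie between Z and a. The variable name extras is only illustrative.

```python
import re

# The ASCII characters between 'Z' (code 90) and 'a' (code 97).
extras = ''.join(chr(code) for code in range(ord('Z') + 1, ord('a')))
print(extras)

# [aA-zZ] is read as: 'a', the range A-z, and 'Z', so it matches all of them.
print(re.findall(r'[aA-zZ]', extras))

# [a-zA-Z] contains only the two letter ranges, so nothing matches.
print(re.findall(r'[a-zA-Z]', extras))
```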
