[Python3] RegEx to match multiple strings

Time: 2016-09-26 21:06:58

Tags: regex python-3.x

I am trying to match multiple strings, some of which also contain an optional capture group.

My RegEx:

(\[[A-Za-z]*\])(.*) - (.*)(.[0-9]{2}\.[0-9]{2}\.[0-9]{2}.)?(\[.*\])

The strings:

[Test]Kyubiikitsune - Company Of Wolves[20.06.96][Hi-Res]
[TEst]_ANother - Company Of 2[Hi-Res]
[Yes]coOl__ - some text_[20.06.96][Hi-Res]

How can I match all of these and optimize my RegEx? I am still new to this.

2 answers:

Answer 0: (score: 0)

I think this is what you want (with the global flag set in the regex tester): r"\[(.*?)\](.*?)\s*-\s*(.*?)(?:\[(\d{2}\.\d{2}\.\d{2})\])?\[(.*?)\]"
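To check this pattern against the three sample strings from the question, a minimal sketch with the standard `re` module might look like the following. The names `PATTERN`, `SAMPLES`, and `parse` are illustrative and not part of the original post.

```python
import re

# Pattern from this answer: lazy (.*?) groups plus a non-capturing,
# optional wrapper around the bracketed date.
PATTERN = re.compile(
    r"\[(.*?)\](.*?)\s*-\s*(.*?)(?:\[(\d{2}\.\d{2}\.\d{2})\])?\[(.*?)\]"
)

# The three sample strings from the question.
SAMPLES = [
    "[Test]Kyubiikitsune - Company Of Wolves[20.06.96][Hi-Res]",
    "[TEst]_ANother - Company Of 2[Hi-Res]",
    "[Yes]coOl__ - some text_[20.06.96][Hi-Res]",
]


def parse(line):
    """Return the five capture groups as a tuple, or None if nothing matches."""
    m = PATTERN.match(line)
    return m.groups() if m else None


for line in SAMPLES:
    # The date group is None for the string that has no date.
    print(parse(line))
```

Because the groups before the date are lazy, the optional date group gets a chance to match before the final bracketed tag is consumed.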

Answer 1: (score: 0)

Consider using pandas for this, as follows:

import pandas as pd

# create a Series object containing the strings to be searched
s = pd.Series([
    '[Test]Kyubiikitsune - Company Of Wolves[20.06.96][Hi-Res]',
    '[TEst]_ANother - Company Of 2[Hi-Res]',
    '[Yes]coOl__ - some text_[20.06.96][Hi-Res]'
])

# use pandas' string methods to perform the regex extraction; a DataFrame is
# returned because your regex contains more than one capture group
s.str.extract(r'(\[[A-Za-z]*\])(.*) - (.*)(.[0-9]{2}\.[0-9]{2}\.[0-9]{2}.)?(\[.*\])', expand=True)

# returns the following
        0              1                            2    3         4
0  [Test]  Kyubiikitsune  Company Of Wolves[20.06.96]  NaN  [Hi-Res]
1  [TEst]       _ANother                 Company Of 2  NaN  [Hi-Res]
2   [Yes]         coOl__         some text_[20.06.96]  NaN  [Hi-Res]
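
The NaN values in column 3 appear because the greedy `(.*)` in front of the optional date group swallows the date, so the optional group has nothing left to match. A minimal sketch of one possible fix, using only the standard `re` module rather than pandas, makes the two free-text groups lazy and replaces the `.` on either side of the date with literal brackets. The names `GREEDY`, `LAZY`, and `groups_for` are illustrative, and the `LAZY` pattern is an assumed variant, not something from the original post.

```python
import re

# The original pattern: the greedy (.*) before the optional date group
# consumes the date, so group 4 ends up as None.
GREEDY = re.compile(
    r"(\[[A-Za-z]*\])(.*) - (.*)(.[0-9]{2}\.[0-9]{2}\.[0-9]{2}.)?(\[.*\])"
)

# Assumed variant: lazy (.*?) groups and literal brackets around the date.
LAZY = re.compile(
    r"(\[[A-Za-z]*\])(.*?) - (.*?)(\[[0-9]{2}\.[0-9]{2}\.[0-9]{2}\])?(\[.*\])"
)


def groups_for(pattern, text):
    """Return the capture groups for text, or None if there is no match."""
    m = pattern.match(text)
    return m.groups() if m else None


with_date = "[Test]Kyubiikitsune - Company Of Wolves[20.06.96][Hi-Res]"
without_date = "[TEst]_ANother - Company Of 2[Hi-Res]"

print(groups_for(GREEDY, with_date))    # date stays glued to the third group
print(groups_for(LAZY, with_date))      # date lands in its own group
print(groups_for(LAZY, without_date))   # date group is None
```

The same lazy pattern can be passed to `s.str.extract` in place of the original one.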