Python - Splitting a string with parentheses based on a pattern

Time: 2014-10-06 23:08:42

Tags: python string split

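I have a problem in Python: I have a pattern that can repeat anywhere from 1 to XXX times.

The pattern is a string of the form

Author (Affiliation) Author (Affiliation) and so on, for however many author/affiliation pairs there are.

What is the best way in Python to split this string when you don't know whether there will be 1 Author (Affiliation) instance or 100?

Edit - Viktor Leis* (Technische Universität München) Alfons Kemper (Technische Universität München) Thomas Neumann (Technische Universität München, Germany)

This is the sample string I'm working with. I tried re.split / re.findall and had no luck. I assume I'm getting the regular expression wrong.

Edit 2 - '\w+{1,3}(\w{1,10})' is the pattern I was trying to use.

My logic is that a name is 1-3 words, followed by a (. Then an affiliation is between 1 and 10 words and ends with a ).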

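3 answers:

Answer 0 (score: 1)

Here is an example. It looks like you want to match text that contains no ) or (, followed by the text between a ( and a ). Below is one way to do it, assuming the input looks exactly like the string above.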

import re
text = 'Viktor Leis* (Technische Universität München) Alfons Kemper (Technische Universität München) Thomas Neumann (Technische Universität München, Germany)'
# Text without parentheses, a space, then a parenthesized group.
pattern = r'[^()]* \([^()]+\)'
result = re.findall(pattern, text)
print(result)

Output:

['Viktor Leis* (Technische Universität München)', ' Alfons Kemper (Technische Universität München)', ' Thomas Neumann (Technische Universität München, Germany)']

You may want to use strip to remove the leading and trailing whitespace from each match.
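As a minimal sketch of that cleanup step, the same kind of pattern can be given two capture groups so that author and affiliation come back separately, with strip applied to each. The names `pairs`, `author`, and `affiliation` are illustrative, and the sample string is assumed to match the one from the question.

```python
import re

text = ('Viktor Leis* (Technische Universität München) '
        'Alfons Kemper (Technische Universität München) '
        'Thomas Neumann (Technische Universität München, Germany)')

# Group 1: text before "(", group 2: text between "(" and ")".
pattern = r'([^()]+)\(([^()]+)\)'

# findall returns one (group 1, group 2) tuple per match;
# strip removes the surrounding spaces left over from the match.
pairs = [(author.strip(), affiliation.strip())
         for author, affiliation in re.findall(pattern, text)]

for author, affiliation in pairs:
    print(author, '|', affiliation)
```

This assumes parentheses are never nested and never appear inside a name.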

Answer 1 (score: 0)

This is the first thing that came to mind:

import re
s = 'Bob (ABC) Steve (XYZ) Mike (ALPHA)'
pattern = r'\w+ \(\w+\)'

>>> re.findall(pattern,s)
['Bob (ABC)', 'Steve (XYZ)', 'Mike (ALPHA)']
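Note that this pattern only covers single-word names and single-word affiliations, so it would likely not match the multi-word sample from the question. A possible sketch that follows the logic stated in the question (a name of 1-3 words, then an affiliation of 1-10 words in parentheses) is shown below. The token definitions are assumptions: a name word may contain "*" and an affiliation word may contain ",", which is what the sample string seems to need.

```python
import re

text = ('Viktor Leis* (Technische Universität München) '
        'Alfons Kemper (Technische Universität München) '
        'Thomas Neumann (Technische Universität München, Germany)')

# Assumed token definitions (not from the original post):
# a name word may contain "*", an affiliation word may contain ",".
name = r'[\w*]+(?: [\w*]+){0,2}'         # 1 to 3 words
affiliation = r'[\w,]+(?: [\w,]+){0,9}'  # 1 to 10 words
pattern = re.compile(r'({}) \(({})\)'.format(name, affiliation))

# One (name, affiliation) tuple per match.
matches = pattern.findall(text)
print(matches)
```

Other punctuation in names or affiliations (hyphens, periods, ampersands) would need to be added to the character classes.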

Answer 2 (score: 0)

You can do it like this:

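thing = "Author1 (Affiliation) Author2 (Affiliation) Author3 (Affiliation)"
# Splitting on ') ' drops the ')' from every piece except the last one.
pieces = thing.split(') ')

result = []  # not named "list", which would shadow the built-in
for piece in pieces:
    if not piece.endswith(')'):
        result.append(piece + ')')  # restore the ')' removed by split
    else:
        result.append(piece)
print(result)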
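The same split can probably be done in one step with re.split and a lookbehind, so the closing parenthesis never has to be added back. This is a sketch of an alternative, not part of the original answer; `parts` is an illustrative name.

```python
import re

thing = "Author1 (Affiliation) Author2 (Affiliation) Author3 (Affiliation)"

# Split on whitespace that directly follows a ")".
# The lookbehind (?<=\)) is zero-width, so the ")" stays in each piece.
parts = re.split(r'(?<=\))\s+', thing)
print(parts)
```

Like the loop version, this assumes a ")" followed by whitespace only ever marks the end of an Author (Affiliation) pair.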