Question

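I want to split a text string on multiple criteria. Specifically, I want to keep all of the text that comes before a given item (a job title). The titles may contain several spaces between words, not just the single space shown here, and I would like to handle that as well.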

There are two problems:

1. Looping over multiple titles (not shown here)
2. Handling a varying number of spaces between the words

I tried the following:

import re

job_titles = ['senior payroll specialist', 'employment coordinator']

# Adjacent string literals inside parentheses are joined into one string.
string = ('some text that has a bunch of words in it Blank Name senior payroll specialist '
          'with a bunch of words after this that are not needed')
# Keep only the text before the title.
out = re.split('senior payroll specialist', string)[0]
out = re.split('senior payroll specialist', out)[0]

Thanks

Answer 1

Perhaps consider combining your split strings into a single regular expression. For example:

bash-3.2$ python3
Python 3.6.2 (default, Jul 17 2017, 16:44:32) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> job_titles = ['senior payroll specialist', 'employment coordinator']
>>> string = ('some text that has a bunch of words in it '
... 'Blank Name senior payroll specialist with the words '
... 'employment coordinator and words after this that are not needed')

>>> import re, pprint
>>> pat = "(" + "|".join(job_titles) + ")"
>>> pprint.pprint( re.split( pat, string ))
['some text that has a bunch of words in it Blank Name ',
 'senior payroll specialist',
 ' with the words ',
 'employment coordinator',
 ' and words after this that are not needed']
>>>
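
To keep only the text before the first title and tolerate extra spaces, the same idea can be extended roughly as below. This is a minimal sketch, not the only way to do it: the helper name `text_before_titles` is made up for illustration, and it assumes titles should match case-sensitively and that any run of whitespace between the words of a title is acceptable.

```python
import re

job_titles = ['senior payroll specialist', 'employment coordinator']

def text_before_titles(text, titles):
    """Return the part of text that precedes the first matching title.

    Hypothetical helper for illustration; not part of the re module.
    """
    # Escape each word of a title, then allow one or more whitespace
    # characters between consecutive words.
    alternatives = [r'\s+'.join(re.escape(word) for word in title.split())
                    for title in titles]
    pattern = re.compile('|'.join(alternatives))
    # maxsplit=1 stops at the first title found; without a capturing
    # group the matched title itself is not returned.
    return pattern.split(text, maxsplit=1)[0]

string = ('some text that has a bunch of words in it Blank Name '
          'senior   payroll  specialist with a bunch of words after this')
print(repr(text_before_titles(string, job_titles)))
```

If no title occurs in the text, the whole string comes back unchanged. The result keeps any trailing space before the title, so call `.rstrip()` on it if that matters.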

Splitting text on multiple conditions
