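I want to split a text string based on multiple conditions: I want to keep all the text that comes before any one of a set of known items (job titles). The words within a title may be separated by several spaces, not just the single space shown here, and I would like to handle that as well.
There are two issues:
I tried the following: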
import re
job_titles = ['senior payroll specialist', 'employment coordinator']
# Adjacent literals inside parentheses are joined into one string
string = ('some text that has a bunch of words in it Blank Name senior payroll specialist '
          'with a bunch of words after this that are not needed')
# Keep only the text before each title, one title at a time
out = re.split('senior payroll specialist', string)[0]
out = re.split('employment coordinator', out)[0]
Thanks
Answer 0 (score: 0)
Consider combining your split strings into a single regular expression. For example:
bash-3.2$ python3
Python 3.6.2 (default, Jul 17 2017, 16:44:32)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> job_titles = ['senior payroll specialist', 'employment coordinator']
>>> string = ('some text that has a bunch of words in it '
... 'Blank Name senior payroll specialist with the words '
... 'employment coordinator and words after this that are not needed')
>>> import re, pprint
>>> pat = "(" + "|".join(job_titles) + ")"
>>> pprint.pprint( re.split( pat, string ))
['some text that has a bunch of words in it Blank Name ',
'senior payroll specialist',
' with the words ',
'employment coordinator',
' and words after this that are not needed']
>>>
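To get only the text before the first title, and to tolerate several spaces between the words of a title, the same idea can be extended roughly as follows. This is a minimal sketch, not a definitive solution: the helper names `build_title_pattern` and `text_before_first_title` are made up for illustration, and it assumes titles should match case-sensitively and that any run of whitespace (spaces, tabs, newlines) may separate the words of a title.

```python
import re

job_titles = ['senior payroll specialist', 'employment coordinator']


def build_title_pattern(titles):
    """Compile one regex that matches any title (hypothetical helper).

    Each title is split into words, every word is escaped with re.escape,
    and the words are rejoined with \\s+ so that one or more whitespace
    characters between words still match.
    """
    alternatives = [
        r'\s+'.join(re.escape(word) for word in title.split())
        for title in titles
    ]
    return re.compile('|'.join(alternatives))


def text_before_first_title(text, titles):
    """Return everything before the first matching title (hypothetical helper).

    If no title occurs in the text, the whole text is returned unchanged.
    """
    pattern = build_title_pattern(titles)
    # maxsplit=1 stops at the first match; element 0 is the text before it.
    return pattern.split(text, maxsplit=1)[0]


sample = ('some text that has a bunch of words in it Blank Name '
          'senior   payroll  specialist with the words '
          'employment coordinator and words after this that are not needed')
print(repr(text_before_first_title(sample, job_titles)))
```

Note that the result keeps the whitespace that precedes the title; call `.rstrip()` on it if that is not wanted. Because the pattern here has no capturing group, `split` does not include the matched title in its output, unlike the parenthesized pattern in the session above.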