Question

I want to split a sentence at the first occurrence of any of a set of words. Let me illustrate:

message = 'I wish to check my python code for errors to run the program properly with fluency'

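I want to split the message above at the first occurrence of for/to/with, so the result for that message would be check my python code for errors to run the program properly with fluency

I also want to include the word the sentence was split on, so my final result would be: to check my python code for errors to run the program properly with fluency

My code doesn't work: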

import re
message = 'I wish to check my python code for errors to run the program properly with fluency'
result = message.split(r"for|to|with",1)[1]
print(result)

What should I do?

Answer 1

message = 'I wish to check my python code for errors to run the program properly with fluency'
array = message.split(' ')
number = 0
message_new = ''
for i in range(len(array)):
    if array[i] in ('to', 'for', 'with'):
        number=i
        break
for j in range(number,len(array)):
    message_new += array[j] + ' '
print(message_new)

Output:

to check my python code for errors to run the program properly with fluency
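
For comparison, the same word-by-word idea can be written more compactly. This is only a sketch: the helper name `split_from_first_keyword` and the choice to return an empty string when no keyword is present are assumptions, not part of the answer above.

```python
def split_from_first_keyword(message, keywords=("for", "to", "with")):
    """Return the message starting at the first whole word found in keywords.

    Hypothetical helper; returns '' when no keyword occurs (an assumed choice).
    """
    words = message.split()
    # Index of the first word that is one of the keywords, or None if absent.
    index = next((i for i, word in enumerate(words) if word in keywords), None)
    if index is None:
        return ""
    return " ".join(words[index:])


message = 'I wish to check my python code for errors to run the program properly with fluency'
print(split_from_first_keyword(message))
```

This leaves no trailing space, but splitting on whitespace collapses runs of spaces, and a keyword followed by punctuation (such as `to,`) is not recognised.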

Answer 2

split does not take a regular expression as an argument (perhaps you are thinking of Perl).

The following does what you need:

import re
message = 'I wish to check my python code for errors to run the program properly with fluency'
result = re.search(r'\b(for|to|with)\b', message)
print(message[result.start(1):])

This uses no substitution, rejoining, or loops; it simply searches for the required string and uses the position of the match.
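
To illustrate the first point: `str.split` treats its argument as a literal separator, whereas `re.split` accepts a pattern. The sketch below shows one possible way to stay close to the original attempt, using a capturing group so that the matched word is kept. The variable names are illustrative only.

```python
import re

message = 'I wish to check my python code for errors to run the program properly with fluency'

# str.split looks for the literal text "for|to|with", which never occurs,
# so the message comes back unsplit as a one-element list.
literal_parts = message.split("for|to|with", 1)
print(len(literal_parts))

# re.split takes a pattern. The capturing group keeps the matched keyword in
# the result, and maxsplit=1 stops after the first occurrence.
parts = re.split(r"\b(for|to|with)\b", message, maxsplit=1)

# parts is [text before, keyword, text after] when a keyword was found.
result = "".join(parts[1:]) if len(parts) == 3 else ""
print(result)
```

Because the literal split returns a single element, indexing it with `[1]` as in the question raises `IndexError`.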

Answer 3

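This question is answered in "how to remove all characters before a specific character in python", but that only works for one specific delimiter. With multiple delimiters, you first have to find out which one occurs first; that is covered in "how can i find the first occurrence of a substring in a python string".

Start with a first guess (I don't have much imagination, so let's call it bestDelimiter = firstDelimiter), find the position of its first occurrence, and save it as bestPosition. Then look up the positions of the remaining delimiters; whenever you find one that occurs before the current bestPosition, update bestDelimiter and bestPosition. In the end, the delimiter that occurs first will be bestDelimiter, and you can then apply the required operation using bestDelimiter.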
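
The procedure described above might look like the following sketch. The helper name `cut_at_first_delimiter` is hypothetical, and it uses plain `str.find`, so it matches substrings rather than whole words (for example `to` inside `tomorrow`); that limitation is accepted here to keep the example simple.

```python
def cut_at_first_delimiter(text, delimiters):
    """Return text from the earliest-occurring delimiter onward.

    Hypothetical helper. Returns text unchanged if no delimiter is found.
    """
    best_position = -1
    for delimiter in delimiters:
        position = text.find(delimiter)  # -1 when the delimiter is absent
        if position == -1:
            continue
        # Keep this position if it is the first hit or occurs earlier.
        if best_position == -1 or position < best_position:
            best_position = position
    return text if best_position == -1 else text[best_position:]


message = 'I wish to check my python code for errors to run the program properly with fluency'
print(cut_at_first_delimiter(message, ["for", "to", "with"]))
```

Only the best position needs to be tracked here, since slicing from that position already keeps the delimiter itself.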

Answer 4

My guess is that this simple expression would likely do it:

.*?(\b(?:to|for|with)\b.*)

Of the five methods below, re.match is probably the fastest:

Test with `re.findall`

import re

regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"
print(re.findall(regex, test_str))

Test with `re.sub`

import re

regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"
subst = "\\1"

result = re.sub(regex, subst, test_str)

if result:
    print (result)

Test with `re.finditer`

import re

regex = r".*?(\b(?:to|for|with)\b.*)"

test_str = "I wish to check my python code for errors to run the program properly with fluency"

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):

    # FULL MATCH
    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))

    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1

        print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))

Test with `re.match`

import re

regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"

print(re.match(regex, test_str).group(1))

Test with `re.search`

import re

regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"

print(re.search(regex, test_str).group(1))

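The expression is explained in the top right panel of this demo, if you wish to explore or modify it further, and in this link you can watch how it matches against some sample inputs.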
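
Whether `re.match` really is the fastest depends on the Python version and the input, so it is worth measuring. The following is a rough sketch using the standard `timeit` module. The `candidates` dictionary and the repetition count are arbitrary choices, and no timings are shown because they vary by machine.

```python
import re
import timeit

regex = re.compile(r".*?(\b(?:to|for|with)\b.*)")
test_str = "I wish to check my python code for errors to run the program properly with fluency"

# Each candidate returns the text from the first keyword onward.
candidates = {
    "findall": lambda: regex.findall(test_str)[0],
    "sub": lambda: regex.sub(r"\1", test_str),
    "finditer": lambda: next(regex.finditer(test_str)).group(1),
    "match": lambda: regex.match(test_str).group(1),
    "search": lambda: regex.search(test_str).group(1),
}

# Check that all five agree before comparing their speed.
results = {name: func() for name, func in candidates.items()}
assert len(set(results.values())) == 1

for name, func in candidates.items():
    seconds = timeit.timeit(func, number=10_000)
    print(f"{name:10s}{seconds:.4f} s")
```

The pattern is compiled once so that the comparison measures the matching calls rather than repeated pattern lookups.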

Answer 5

You can first find all instances of for, to, and with, split the string on those same words, and then pair up the pieces and rejoin them:

import re
message = 'I wish to check my python code for errors to run the program properly with fluency'
vals, [_, *s] = re.findall(r"\bfor\b|\bto\b|\bwith\b", message), re.split(r"\bfor\b|\bto\b|\bwith\b", message)
result = ''.join('{} {}'.format(a, re.sub(r"^\s+", "", b)) for a, b in zip(vals, s))

Output:

'to check my python code for errors to run the program properly with fluency'
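
It may help to see the intermediate values of that one-liner. The sketch below unpacks the same steps; the names `pattern`, `keywords`, and `pieces` are introduced here only for illustration.

```python
import re

message = 'I wish to check my python code for errors to run the program properly with fluency'
pattern = r"\bfor\b|\bto\b|\bwith\b"

# Every keyword occurrence, in order of appearance.
keywords = re.findall(pattern, message)
# The text between the keywords; the first piece is whatever precedes
# the first keyword, so it is dropped.
_, *pieces = re.split(pattern, message)

# Pair each keyword with the text that follows it and rebuild the string.
result = "".join("{} {}".format(k, p.lstrip()) for k, p in zip(keywords, pieces))
print(keywords)
print(result)
```

Since only the first keyword matters for the question, finding and rejoining every occurrence does more work than strictly needed.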
