Split a string at the first substring found

Time: 2019-07-06 18:55:39

Tags: python regex

I want to split a sentence at the first occurrence of any one of several words. Let me illustrate:

message = 'I wish to check my python code for errors to run the program properly with fluency'

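I want to split the message above at the first occurrence of for/to/with, so that the result for this message is: check my python code for errors to run the program properly with fluency

I also want to keep the word at which the sentence was split, so my final result would be: to check my python code for errors to run the program properly with fluency

My code does not work: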

import re
message = 'I wish to check my python code for errors to run the program properly with fluency'
result = message.split(r"for|to|with",1)[1]
print(result)

What should I do?

5 answers:

Answer 0: (score: 1)

message = 'I wish to check my python code for errors to run the program properly with fluency'
words = message.split(' ')
start = 0
# Find the index of the first word that is one of the delimiters.
for i, word in enumerate(words):
    if word in ('to', 'for', 'with'):
        start = i
        break
# Rejoin everything from the first delimiter onward.
message_new = ' '.join(words[start:])
print(message_new)

Output:

to check my python code for errors to run the program properly with fluency

Answer 1: (score: 1)

str.split does not accept a regular expression as its argument (perhaps you were thinking of Perl).

The following does what you need:

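import re
message = 'I wish to check my python code for errors to run the program properly with fluency'
# Find the first whole-word occurrence of any delimiter.
result = re.search(r'\b(for|to|with)\b', message)
# Slice from the start of the matched delimiter to the end of the message.
print(message[result.start(1):])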

This uses no substitution, rejoining, or loops; it simply searches for the required string and uses the position of the match.
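
For readers who would rather stay close to the split call in the question, here is a minimal sketch using re.split. It relies on two documented behaviours: a capturing group in the pattern keeps the delimiter in the result list, and maxsplit=1 stops after the first match. The fallback for a message with no delimiter is an assumption, not something the question specifies.

```python
import re

message = 'I wish to check my python code for errors to run the program properly with fluency'

# The capturing group makes re.split keep the matched delimiter in the list;
# maxsplit=1 stops after the first match.
parts = re.split(r'\b(for|to|with)\b', message, maxsplit=1)

# parts is [text_before, delimiter, text_after] when a delimiter is found,
# or [message] when none is present.
if len(parts) == 3:
    _, delimiter, rest = parts
    result = delimiter + rest
else:
    result = message  # assumption: return the whole message if nothing matches

print(result)
```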

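Answer 2: (score: 1)

This question has already been answered in how to remove all characters before a specific character in python, but that only works for one specific delimiter. With multiple delimiters you first have to find out which delimiter occurs first, which is covered here: how can i find the first occurrence of a substring in a python string. Start with a first guess (I don't have much imagination, so let's call it bestDelimiter = firstDelimiter), find the position of its first occurrence, and save it as bestPosition. Then look up the positions of the remaining delimiters; whenever you find a delimiter that occurs before the current bestPosition, update both bestDelimiter and bestPosition. In the end, the delimiter that occurs first is bestDelimiter, and you can apply the required operation using it.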
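
The procedure described above could be sketched roughly as follows. The function name split_at_first_delimiter is hypothetical, the variable names follow the answer's bestDelimiter and bestPosition, and returning the input unchanged when no delimiter is present is an assumption.

```python
def split_at_first_delimiter(text, delimiters):
    """Return text from the earliest-occurring delimiter onward (hypothetical helper)."""
    best_delimiter = None
    best_position = -1
    for delimiter in delimiters:
        position = text.find(delimiter)  # str.find returns -1 when absent
        if position == -1:
            continue
        # Keep this delimiter if it is the first one found or occurs earlier.
        if best_position == -1 or position < best_position:
            best_delimiter, best_position = delimiter, position
    if best_delimiter is None:
        return text  # assumption: leave the input unchanged if nothing is found
    return text[best_position:]


message = 'I wish to check my python code for errors to run the program properly with fluency'
result = split_at_first_delimiter(message, ['for', 'to', 'with'])
print(result)
```

Note that str.find matches plain substrings, so this sketch would also stop at "to" inside a word such as "tomorrow"; the regex answers avoid that with \b word boundaries.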

Answer 3: (score: 0)

My guess is that this simple expression might do it:

.*?(\b(?:to|for|with)\b.*)
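
To make the pieces of the expression easier to read, here is the same pattern written with re.VERBOSE and one comment per part. This is only an annotated restatement of the expression above; the variable names are illustrative.

```python
import re

pattern = re.compile(r"""
    .*?                      # lazily skip as few characters as possible
    (                        # group 1: the part to keep
        \b(?:to|for|with)\b  # the first whole-word delimiter
        .*                   # everything after it, up to the end of the line
    )
""", re.VERBOSE)

test_str = "I wish to check my python code for errors to run the program properly with fluency"
match = pattern.match(test_str)
# group(1) starts at the first delimiter; None if no delimiter is present.
result = match.group(1) if match else None
print(result)
```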

re.match is probably the fastest of the following five methods:

Testing with re.findall

import re

regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"
print(re.findall(regex, test_str))

Testing with re.sub

import re

regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"
subst = "\\1"

result = re.sub(regex, subst, test_str)

if result:
    print (result)

Testing with re.finditer

import re

regex = r".*?(\b(?:to|for|with)\b.*)"

test_str = "I wish to check my python code for errors to run the program properly with fluency"

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):

    # FULL MATCH
    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))

    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1

        print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))

Testing with re.match

import re

regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"

print(re.match(regex, test_str).group(1))

Testing with re.search

import re

regex = r".*?(\b(?:to|for|with)\b.*)"
test_str = "I wish to check my python code for errors to run the program properly with fluency"

print(re.search(regex, test_str).group(1))

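The expression is explained in the top-right panel of this demo, if you want to explore or modify it further, and in this link you can watch how it matches against some sample inputs.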
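
The guess that re.match is the fastest of the five can be checked on your own machine with timeit. The following is a rough sketch, not a rigorous benchmark: it assumes a precompiled pattern and 10,000 repetitions per method, both arbitrary choices, and the timings will vary with Python version and hardware.

```python
import re
import timeit

regex = re.compile(r".*?(\b(?:to|for|with)\b.*)")
test_str = "I wish to check my python code for errors to run the program properly with fluency"

# One callable per method shown above; each returns the same extracted string.
candidates = {
    "re.findall": lambda: regex.findall(test_str)[0],
    "re.sub": lambda: regex.sub(r"\1", test_str),
    "re.finditer": lambda: next(regex.finditer(test_str)).group(1),
    "re.match": lambda: regex.match(test_str).group(1),
    "re.search": lambda: regex.search(test_str).group(1),
}

# Time each callable and print the results from fastest to slowest.
timings = {name: timeit.timeit(func, number=10000) for name, func in candidates.items()}
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("{:<12} {:.4f} s".format(name, seconds))
```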

Answer 4: (score: 0)

You can first find all instances of for, to, and with, split the message on those values, and then splice and rejoin the pieces:

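import re
message = 'I wish to check my python code for errors to run the program properly with fluency'
pattern = r"\bfor\b|\bto\b|\bwith\b"
# vals holds every delimiter found; s holds the text after each delimiter
# (the first piece from re.split, the text before the first delimiter, is discarded).
vals, [_, *s] = re.findall(pattern, message), re.split(pattern, message)
# Pair each delimiter with the text that follows it, dropping leading whitespace.
result = ''.join('{} {}'.format(a, re.sub(r"^\s+", "", b)) for a, b in zip(vals, s))

Output: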

'to check my python code for errors to run the program properly with fluency'