How do I remove all words before a specific word in Python (when that word occurs more than once)?

Time: 2019-04-14 08:27:49

Tags: python regex python-3.x postgresql

I want to remove all the words before a specific word. However, that word appears several times in my sentence. Here is an example:

dvdrentalLOG: statement: SELECT email, actor.last_name, count(actor.last_name) FROM (SELECT email, actor_id FROM (SELECT email, film_id FROM (SELECT email, inventory_id FROM customer as cu JOIN rental ON cu.customer_id = rental.customer_id ORDER BY email) as sq JOIN inventory ON sq.inventory_id = inventory.inventory_id) as sq2 JOIN film_actor ON sq2.film_id = film_actor.film_id) as sq3 JOIN actor ON sq3.actor_id = actor.actor_id GROUP BY email, actor.last_name ORDER BY COUNT(actor.last_name) DESC

In the example above, I want to remove everything before the first SELECT. I have already tried "How to remove all characters before a specific character in Python?"

Any idea what I need to do?

2 answers:

Answer 0: (score: 2)

You can use this regular expression and replace the match with an empty string:

^.+?(?=SELECT)

Like this:

import re
result = re.sub(r"^.+?(?=SELECT)", "", your_string)

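Explanation:

Since you want to remove everything before the first SELECT, the match starts at the beginning of the string with ^. You then lazily match any characters with .+? until SELECT comes next. The lookahead (?=SELECT) checks for SELECT without consuming it, so SELECT itself is not removed.

Alternatively, drop the lookahead and replace the match with SELECT: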

result = re.sub(r"^.+?SELECT", "SELECT", your_string)

Edit:

I found another way, using partition:

partitions = your_string.partition("SELECT")
result = partitions[1] + partitions[2]
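One thing to watch with partition: if SELECT is not in the string, partitions[1] and partitions[2] are both empty, so result becomes an empty string. A small sketch of a guard for that case follows; the helper name drop_prefix is made up for illustration and is not part of any library.

```python
def drop_prefix(text, keyword="SELECT"):
    """Return text starting at the first occurrence of keyword.

    Hypothetical helper: if keyword is absent, the text is returned
    unchanged instead of being reduced to an empty string.
    """
    before, separator, after = text.partition(keyword)
    if not separator:  # keyword not found: partition returns (text, "", "")
        return text
    return separator + after


print(drop_prefix("dvdrentalLOG: statement: SELECT a FROM (SELECT b FROM t) AS sq"))
print(drop_prefix("LOG: connection received"))
```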

Answer 1: (score: 1)

If you only care about the first occurrence of the word, this is easy to do. Consider the following example:

import re

txt = 'blah blah blah SELECT something SELECT something another SELECT'
# count=1 limits the substitution to the first match only
output = re.sub(r'.*?(?=SELECT)', '', txt, count=1)
print(output)  # SELECT something SELECT something another SELECT

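I used a so-called zero-length assertion (a lookahead) inside the pattern, so it only matches text that is followed by SELECT. I also passed 1 as the fourth argument of re.sub (count), which means only one replacement is made.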
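The same idea can be wrapped into a reusable function, sketched below under a few assumptions: the name strip_before is hypothetical, the keyword is assumed to consist of word characters only (so that \b behaves as expected), and re.DOTALL is added so the prefix may span several lines.

```python
import re


def strip_before(text, word):
    """Remove everything before the first whole-word occurrence of word.

    Hypothetical helper. Assumes word is made of word characters only,
    so the \\b anchors on both sides are meaningful.
    """
    # re.escape guards against regex metacharacters in the keyword.
    pattern = r".*?(?=\b" + re.escape(word) + r"\b)"
    # count=1: replace the first match only; DOTALL: let . cross newlines.
    return re.sub(pattern, "", text, count=1, flags=re.DOTALL)


txt = 'blah blah blah SELECT something SELECT something another SELECT'
print(strip_before(txt, "SELECT"))
print(strip_before("LOG:\nstatement: SELECT 1", "SELECT"))
print(strip_before("SELECTED rows SELECT x", "SELECT"))
```

The word boundaries mean that SELECTED in the last call is skipped, and a string without the keyword comes back unchanged.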