Given the following tweets:
Brick Brewing Co Limited (BRB) Downgraded by Cormark to Market Perform
Brinker International Inc (EAT) Upgraded by Zacks Investment Research to Hold
how can I write a regex that removes "by Cormark" and "by Zacks Investment Research"?
I tried:
"by ([A-Za-z ]+\w to)"
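in Python, but it also consumes the word "to". I want the regex to stop before it captures "to".
It would also be interesting if someone could show me how to write a regex that captures capitalized (camel-case) names such as "Zacks Investment Research".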
Answer 0 (score: 3)
You can use a positive look-ahead to keep the word to out of the match:
>>> s1 = "Brick Brewing Co Limited (BRB) Downgraded by Cormark to Market Perform"
>>>
>>> s2 = "Brinker International Inc (EAT) Upgraded by Zacks Investment Research to Hold"
>>>
>>> import re
>>> re.sub(r'by[\w\s]+(?=to)','',s1)
'Brick Brewing Co Limited (BRB) Downgraded to Market Perform'
>>> re.sub(r'by[\w\s]+(?=to)','',s2)
'Brinker International Inc (EAT) Upgraded to Hold'
>>>
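Note that the regex [\w\s]+ matches any combination of word characters and whitespace. If you only want to match letters and whitespace, you can use [a-z\s] together with the re.I flag (ignore case).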
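As a hedged variation on the look-ahead above: a bare (?=to) also matches the letters "to" inside a longer word, so a rating such as the made-up "Stock Pick" can be cut in the wrong place. The sketch below assumes analyst names contain only letters and whitespace; the helper name strip_analyst and the third headline are hypothetical and not part of the question.

```python
import re

# \bby\b        the whole word "by"
# [a-z\s]+?     letters/whitespace, as few as possible (case-insensitive via re.I)
# (?=\bto\b)    stop right before the whole word "to", without consuming it
PATTERN = re.compile(r'\bby\b[a-z\s]+?(?=\bto\b)', flags=re.I)

def strip_analyst(text):
    """Remove 'by <analyst> ' and keep everything from 'to' onward."""
    return PATTERN.sub('', text)

s1 = "Brick Brewing Co Limited (BRB) Downgraded by Cormark to Market Perform"
s2 = "Brinker International Inc (EAT) Upgraded by Zacks Investment Research to Hold"
s3 = "Acme Corp (ACM) Upgraded by Cormark to Stock Pick"  # hypothetical headline

for s in (s1, s2, s3):
    print(strip_analyst(s))
```

The word boundaries and the lazy quantifier are a design choice for robustness, not something the original answer requires for the two sample tweets.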
Answer 1 (score: 2)
To remove all capitalized words after by, you can use
by [A-Z][a-z]*(?: +[A-Z][a-z]*)*
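See the regex demo.
Explanation:
by  - a literal sequence of 3 characters: b, y and a space
[A-Z][a-z]* - a capitalized word (one uppercase letter followed by zero or more lowercase letters)
(?: +[A-Z][a-z]*)* - zero or more sequences of...
 +[A-Z][a-z]* - one or more spaces followed by an uppercase letter, followed by zero or more lowercase letters.
You can replace the regular space in the pattern with \s to match any whitespace. Also, to match CaMeL words, you can replace every [a-z] with [a-zA-Z].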
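For what it's worth, here is a minimal Python sketch of how this pattern might be applied, using the [a-zA-Z] variant so that CaMeL-style words also match. The function names remove_analyst and extract_analyst are made up for illustration, and the trailing " *" is an addition of this sketch: without it, the space that followed the name stays behind and leaves a double space in the result.

```python
import re

# One capitalized word, then zero or more further capitalized words
# separated by spaces ([a-zA-Z] also accepts CaMeL-style words).
NAME = r'[A-Z][a-zA-Z]*(?: +[A-Z][a-zA-Z]*)*'

def remove_analyst(text):
    """Delete 'by <Capitalized Words>' plus the spaces that follow it."""
    return re.sub(r'\bby ' + NAME + r' *', '', text)

def extract_analyst(text):
    """Return the capitalized name after 'by', or None if there is none."""
    m = re.search(r'\bby (' + NAME + r')', text)
    return m.group(1) if m else None

s1 = "Brick Brewing Co Limited (BRB) Downgraded by Cormark to Market Perform"
s2 = "Brinker International Inc (EAT) Upgraded by Zacks Investment Research to Hold"

print(remove_analyst(s1))
print(extract_analyst(s2))
```

Because the lowercase word "to" is not a capitalized word, the match stops on its own before "to" and no look-ahead is needed here.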
Answer 2 (score: 0)
You can also do this with the str method index, then slice and concatenate:
>>> def remove_name(s):
        b = s.index(' by ')
        t = s.index(' to ')
        s = s[:b]+s[t:]
        return s

>>>
>>> s = 'Brick Brewing Co Limited (BRB) Downgraded by Cormark to Market Perform'
>>> remove_name(s)
'Brick Brewing Co Limited (BRB) Downgraded to Market Perform'
>>>
>>> s = "Brinker International Inc (EAT) Upgraded by Zacks Investment Research to Hold"
>>> remove_name(s)
'Brinker International Inc (EAT) Upgraded to Hold'
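One caveat with the index approach: str.index raises ValueError when ' by ' or ' to ' is missing. A possible defensive variant, sketched here with str.find (which returns -1 instead of raising), leaves such strings unchanged and only looks for ' to ' after the ' by '. The name remove_name_safe is hypothetical.

```python
def remove_name_safe(s):
    """Like remove_name, but returns s unchanged if a marker is missing."""
    b = s.find(' by ')
    if b == -1:
        return s  # no ' by ' at all
    # Search for ' to ' only after ' by ', so an earlier ' to ' is ignored.
    t = s.find(' to ', b)
    if t == -1:
        return s  # nothing to anchor the end of the name
    return s[:b] + s[t:]

print(remove_name_safe('Brick Brewing Co Limited (BRB) Downgraded by Cormark to Market Perform'))
print(remove_name_safe('A headline without an analyst'))
```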