My text is as follows.
mytext = "This is AVGs_ABB and NMN_ABB and most importantly GFD_ABB This is so important that you have to CLEAN the lab everyday"
I want to convert it to lowercase, except for words that contain _ABB.
So my output should look like this.
mytext = "this is AVGs_ABB and NMN_ABB and most importantly GFD_ABB this is so important that you have to clean the lab everyday"
My current code is as follows.
splits = mytext.split()
newtext = []
for item in splits:
    if '_ABB' not in item:
        item = item.lower()
        newtext.append(item)
    else:
        newtext.append(item)
However, I would like to know whether there is a simpler way to do this, possibly in one line?
Answer 0 (score: 11)
You can use a one-liner that splits the string into words, checks each word with str.endswith(), and then joins the words back together:
' '.join(w if w.endswith('_ABB') else w.lower() for w in mytext.split())
# 'this is AVGs_ABB and NMN_ABB and most importantly GFD_ABB this is so important that you have to clean the lab everyday'
Of course, use the in operator instead of str.endswith() if '_ABB' can actually appear anywhere in the word, not just at the end.
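To make the difference between the two checks concrete, here is a minimal sketch. The helper name `lower_except`, its `anywhere` flag, and the sample string are made up for illustration and are not part of the original answer.

```python
def lower_except(text, keep, anywhere=False):
    """Lowercase every whitespace-separated word except those marked by `keep`.

    anywhere=False keeps only words that end with `keep`;
    anywhere=True keeps words that contain `keep` at any position.
    """
    def is_kept(word):
        return keep in word if anywhere else word.endswith(keep)

    return ' '.join(w if is_kept(w) else w.lower() for w in text.split())


# Hypothetical sample: the first marked word is followed by a comma.
sample = "See GFD_ABB, NMN_ABB and PLAIN"
print(lower_except(sample, '_ABB'))                 # see gfd_abb, NMN_ABB and plain
print(lower_except(sample, '_ABB', anywhere=True))  # see GFD_ABB, NMN_ABB and plain
```

With str.endswith(), the token "GFD_ABB," is lowercased because it ends with a comma, whereas the in check still recognizes it.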
Answer 1 (score: 3)
An extended regular expression approach:
import re
mytext = "This is AVGs_ABB and NMN_ABB and most importantly GFD_ABB This is so important that you have to CLEAN the lab everyday"
result = re.sub(r'\b((?!_ABB)\S)+\b', lambda m: m.group().lower(), mytext)
print(result)
Output:
this is AVGs_ABB and NMN_ABB and most importantly GFD_ABB this is so important that you have to clean the lab everyday
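Details:
\b - word boundary
(?!_ABB) - negative lookahead assertion, ensuring that the text _ABB does not start at the current position
\S - a non-whitespace character
\b((?!_ABB)\S)+\b - the whole pattern matches a word that does not contain the substring _ABB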
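The same pattern can be written with re.VERBOSE so that each piece carries its own comment. This is only a sketch: the names `PATTERN` and `lower_unmarked` and the second sample string are invented here, and a non-capturing group (?:...) is swapped in for the capturing group, which should not change what is matched.

```python
import re

PATTERN = re.compile(r"""
    \b              # start at a word boundary
    (?:             # one character at a time:
        (?!_ABB)    #   the text _ABB must not start at this position
        \S          #   then consume one non-whitespace character
    )+
    \b              # end at a word boundary
""", re.VERBOSE)


def lower_unmarked(text):
    # Only the matched spans are replaced; everything else, including
    # whitespace and the _ABB words, is left exactly as it was.
    return PATTERN.sub(lambda m: m.group().lower(), text)


print(lower_unmarked("Keep NMN_ABB but lower CLEAN, and  Double  spaces"))
# keep NMN_ABB but lower clean, and  double  spaces
```

Because re.sub() rewrites only the matches, runs of several spaces survive, which is not the case when the text is split and re-joined with ' '.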
Answer 2 (score: 0)
Here is another possible (not elegant) one-liner:
mytext = "This is AVGs_ABB and NMN_ABB and most importantly GFD_ABB This is so important that you have to CLEAN the lab everyday"
print(' '.join(map(lambda x : x if '_ABB' in x else x.lower(), mytext.split())))
Which outputs:
this is AVGs_ABB and NMN_ABB and most importantly GFD_ABB this is so important that you have to clean the lab everyday
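Note: this assumes your text is separated only by whitespace, so split() is sufficient. If your text contains punctuation such as ",!.", you will need to use a regular expression to split the words.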
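As a rough sketch of the punctuation case, one option is to locate the words with re.sub() and the pattern \w+ instead of splitting at all. This is a swapped-in technique rather than a literal regex split, and the function name `lower_words_except` and the sample sentence are assumptions made for illustration.

```python
import re


def lower_words_except(text, marker='_ABB'):
    # \w+ matches runs of letters, digits and underscores, so "GFD_ABB,"
    # is seen as the word "GFD_ABB" followed by an untouched comma.
    def convert(match):
        word = match.group()
        return word if marker in word else word.lower()

    return re.sub(r'\w+', convert, text)


print(lower_words_except("Use GFD_ABB, NMN_ABB! Then CLEAN the Lab."))
# use GFD_ABB, NMN_ABB! then clean the lab.
```

Since the underscore counts as a word character, each _ABB word is matched as a whole, and punctuation and spacing are passed through unchanged.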