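I need to write a function that converts a word appearing in several different capitalizations to lowercase.
For example, a paragraph contains the word 'something' in different forms, such as 'Something', 'SomeThing', 'SOMETHING', and 'SomeTHing', and every form of the word needs to be converted to the lowercase 'something'.
How can I write a function that replaces them with the downcased version?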
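Answer 0 (score: 2)
You can split the paragraph into words, use the slugify module to generate a slug for each word, and compare that slug with "something". If it matches, replace the word with its lowercase form.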
In [1]: text = "This paragraph contains Something, SOMETHING, AND SomeTHing"
In [2]: from slugify import slugify
In [3]: for word in text.split(" "):  # Split the text on spaces and iterate through the words
   ...:     # unicode() exists only in Python 2; on Python 3, pass word directly
   ...:     if slugify(unicode(word)) == "something":  # Compare the word's slug with "something"
   ...:         text = text.replace(word, word.lower())
   ...:
In [4]: text
Out[4]: 'This paragraph contains something, something, AND something'
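Answer 1 (score: 1)
Split the text into individual words and check whether each word, written in lowercase, is "something". If it is, change it to lowercase: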
for word in text.split():  # text holds the paragraph
    if word.lower() == "something":
        text = text.replace(word, "something")
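To learn how to split text into words, see this question.
Another approach is to iterate over the individual letters and check whether each letter is the first letter of "something":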
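The split-and-compare idea can be sketched as a complete function. This is only a minimal sketch: the name `lowercase_word` and the `target` parameter are hypothetical, and it assumes words are separated by single spaces and may carry ASCII punctuation at either end, which is stripped for the comparison only.

```python
import string


def lowercase_word(text, target="something"):
    """Lowercase every space-separated token that equals target, ignoring case."""
    result = []
    for token in text.split(" "):
        # Strip surrounding punctuation for the comparison only
        core = token.strip(string.punctuation)
        if core.lower() == target:
            # Swap the matched core for the lowercase form, keeping the punctuation
            token = token.replace(core, target)
        result.append(token)
    return " ".join(result)


print(lowercase_word("This paragraph contains Something, SOMETHING, AND SomeTHing"))
# This paragraph contains something, something, AND something
```

Rebuilding the text token by token avoids calling `text.replace` on the whole paragraph for every match.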
text = "Many words: SoMeThInG, SOMEthING, someTHing"
for n in range(len(text) - 8):
    if text[n:n+9].lower() == "something":  # check whether "something" starts here
        text = text.replace(text[n:n+9], "something")
print(text)
Answer 2 (score: 1)
You can also use re.findall to search and split the paragraph into words and punctuation, and replace all the different cases of "Something" with the lowercase version:
import re
text = "Something, Is: SoMeThInG, SOMEthING, someTHing."
to_replace = "something"
words_punct = re.findall(r"[\w']+|[.,!?;: ]", text)
new_text = "".join(to_replace if x.lower() == to_replace else x for x in words_punct)
print(new_text)
Which outputs:
something, Is: something, something, something.
Note: re.findall requires a hardcoded regular expression to search for content in the string. Your actual text may contain characters that the regex above does not cover, and you will need to add them as needed.
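One way to avoid listing punctuation characters by hand is a different technique from the answer above: a case-insensitive `re.sub` with word boundaries. The sketch below is an alternative, not the answerer's method; the function name `lowercase_word_re` and the `target` parameter are hypothetical.

```python
import re


def lowercase_word_re(text, target="something"):
    """Replace whole-word, case-insensitive matches of target with target itself."""
    # re.escape guards against regex metacharacters in target;
    # \b restricts matches to whole words
    pattern = re.compile(r"\b" + re.escape(target) + r"\b", re.IGNORECASE)
    return pattern.sub(target, text)


print(lowercase_word_re("Something, Is: SoMeThInG, SOMEthING, someTHing."))
# something, Is: something, something, something.
```

Because only the matched words are substituted, every other character in the text passes through unchanged, so no character class needs to be maintained.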