I need to do a fuzzy search for a substring within a string and replace the matched part. For example:
str_a = "Alabama"
str_b = "REPLACED"
orig_str = "Flabama is a state located in the southeastern region of the United States."
print(fuzzy_replace(str_a, str_b, orig_str)) # fuzzy_replace code should be implemented
# Output: REPLACED is a state located in the southeastern region of the United States.
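Searching with the fuzzywuzzy module is easy enough, but it only gives a similarity ratio between the strings. Is there a way to find the position of the fuzzy-matched substring in the original string?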
Answer 0 (score: 3)
Try this:
from fuzzywuzzy import fuzz

def fuzzy_replace(str_a, str_b, orig_str):
    l = len(str_a.split())  # Number of words in str_a; orig_str is scanned in windows of this size
    splitted = orig_str.split()
    for i in range(len(splitted) - l + 1):
        test = " ".join(splitted[i:i+l])
        if fuzz.ratio(str_a, test) > 75:  # Use the fuzzywuzzy library to score the window
            before = splitted[:i]
            after = splitted[i+l:]  # Skip all l matched words, not just the first one
            return " ".join(before + [str_b] + after)  # Sandwich str_b between the two parts
    return orig_str  # No window scored above the threshold; leave the string unchanged

str_a = "Alabama is a"
str_b = "REPLACED"
orig_str = "Flabama is a state located in the southeastern region of the United States."
print(fuzzy_replace(str_a, str_b, orig_str))
Prints:
REPLACED state located in the southeastern region of the United States.
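The word-based approach above returns the replaced string but not the match position the question asks about, and it rebuilds the string from split words, so the original spacing is not preserved. A character-level sliding window is one possible way to get the position. The sketch below is only an illustration: it swaps in the standard library's difflib.SequenceMatcher for fuzzywuzzy so that it runs without extra installs, and the names fuzzy_find and fuzzy_replace_at and the 0.75 threshold are hypothetical choices, not part of either library.

```python
from difflib import SequenceMatcher


def fuzzy_find(needle, haystack, threshold=0.75):
    """Return (start, end, score) for the best-scoring window, or None.

    Hypothetical helper: slides a window of len(needle) characters over
    haystack and scores each window with SequenceMatcher.ratio() (0.0 to 1.0).
    """
    n = len(needle)
    if n == 0:
        return None  # An empty needle would trivially match everywhere
    best = None
    for start in range(len(haystack) - n + 1):
        window = haystack[start:start + n]
        score = SequenceMatcher(None, needle, window).ratio()
        # Keep the first window with the highest score at or above the threshold
        if score >= threshold and (best is None or score > best[2]):
            best = (start, start + n, score)
    return best


def fuzzy_replace_at(needle, replacement, haystack, threshold=0.75):
    """Replace the best fuzzy match of needle, or return haystack unchanged."""
    match = fuzzy_find(needle, haystack, threshold)
    if match is None:
        return haystack
    start, end, _ = match
    return haystack[:start] + replacement + haystack[end:]


orig_str = "Flabama is a state located in the southeastern region of the United States."
print(fuzzy_find("Alabama", orig_str)[:2])  # (0, 7)
print(fuzzy_replace_at("Alabama", "REPLACED", orig_str))
# REPLACED is a state located in the southeastern region of the United States.
```

Because the window has a fixed width equal to len(needle), this sketch will not line up well with matches that are noticeably longer or shorter than the search string, and it is quadratic in the worst case, so it is best suited to short inputs.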