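I need to apply several regex substitutions to every element in a list. I wrote a function so I would not repeat myself, but there is still too much repetition. How can I optimize this?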
import re

def cleanlist(mylist, regex, substitution):
    # Apply one regex substitution to every string in the list.
    cleaned_list = [re.sub(regex, substitution, line) for line in mylist]
    return cleaned_list
# Raw strings keep regex escapes such as \s and \. from being read as string escapes.
create_table_parts = cleanlist(create_table_parts, r"(SET).+?(\n)", "\n")
create_table_parts = cleanlist(create_table_parts, r"(__|\(__).*?\n|(^\)|(?<=\n)(\n))", "")
create_table_parts = cleanlist(create_table_parts, r'"', "")
create_table_parts = cleanlist(create_table_parts, r"(?<=CREATE\sTABLE\s).+?(\.)", "")
create_table_parts = cleanlist(create_table_parts, r"(PRIMARY\sKEY\s).+?(\n)|(FOREIGN\sKEY\s).+?(\n)|", "")
create_table_parts = cleanlist(create_table_parts, r"(CREATE_INDEX\s).+?(\n)", "")
Answer 0 (score: 4)
Put your patterns in a list and loop over them:
# Each entry is a (pattern, substitution) pair, applied in this order.
patterns = [
    (r"(SET).+?(\n)", "\n"),
    (r"(__|\(__).*?\n|(^\)|(?<=\n)(\n))", ""),
    (r'"', ""),
    (r"(?<=CREATE\sTABLE\s).+?(\.)", ""),
    (r"(PRIMARY\sKEY\s).+?(\n)|(FOREIGN\sKEY\s).+?(\n)|", ""),
    (r"(CREATE_INDEX\s).+?(\n)", ""),
]
for patt, sub in patterns:
    create_table_parts = cleanlist(create_table_parts, patt, sub)
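As a variation on the loop above, the loop can live inside a helper so the call site becomes a single line. This is only a minimal sketch: `clean_all`, `PATTERNS`, `COMPILED`, and the sample string are hypothetical names and data, not part of the original question. It precompiles each pattern once and walks the list a single time, applying the pairs in order to each string.

```python
import re

# Hypothetical sample patterns, for illustration only.
PATTERNS = [
    (r"SET.+?\n", "\n"),  # collapse a SET statement to a bare newline
    (r'"', ""),           # strip double quotes
]

# Compile each pattern once instead of passing the raw string on every call.
COMPILED = [(re.compile(patt), sub) for patt, sub in PATTERNS]

def clean_all(lines, compiled=COMPILED):
    """Apply every (pattern, substitution) pair, in order, to each string."""
    cleaned = []
    for line in lines:
        for regex, sub in compiled:
            line = regex.sub(sub, line)
        cleaned.append(line)
    return cleaned

sample = ['SET search_path = x;\nCREATE TABLE "t" (\n']
print(clean_all(sample))
```

Because each string is processed independently, applying all patterns to one string before moving to the next gives the same result as applying one pattern to the whole list at a time.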
You could even use reduce() instead of the for loop:
from functools import reduce  # on Python 3, reduce is no longer a builtin

create_table_parts = reduce(lambda ctp, patt: cleanlist(ctp, *patt),
                            patterns, create_table_parts)
Whether that is more or less readable is a judgment call, though.
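If the reduce() style appeals, one possible arrangement is to fold the patterns over a single string and map that over the list, so no intermediate list is built per pattern. The sketch below assumes Python 3; `clean_line`, `clean_lines`, and the sample patterns are hypothetical names chosen for illustration.

```python
import re
from functools import reduce

# Hypothetical sample patterns, for illustration only.
patterns = [
    (r"PRIMARY\sKEY\s.+?\n", ""),  # drop a PRIMARY KEY line
    (r'"', ""),                    # strip double quotes
]

def clean_line(line, patterns):
    """Fold the (pattern, substitution) pairs over one string, in order."""
    return reduce(lambda text, ps: re.sub(ps[0], ps[1], text), patterns, line)

def clean_lines(lines, patterns):
    """Clean every string in the list with a single pass over the list."""
    return [clean_line(line, patterns) for line in lines]

sample = ['"id" int,\nPRIMARY KEY ("id")\n);\n']
print(clean_lines(sample, patterns))
```

The initial value passed to reduce() here is the string itself, and each step returns the string after one more substitution.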