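Suppose you have a piece of code that accepts either a list or a file name, and must filter the items from whichever one is provided by applying the same criteria: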
    import argparse

    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group(required = True)
    group.add_argument('-n', '--name', help = 'single name', action = 'append')
    group.add_argument('-N', '--names', help = 'text file of names')
    args = parser.parse_args()

    results = []
    if args.name:
        # We are dealing with a list.
        for name in args.name:
            name = name.strip().lower()
            if name not in results and len(name) > 6: results.append(name)
    else:
        # We are dealing with a file name.
        with open(args.names) as f:
            for name in f:
                name = name.strip().lower()
                if name not in results and len(name) > 6: results.append(name)
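I want to remove as much redundancy as possible from the code above. I tried creating the following function for strip and lower, but it does not remove much of the duplicated code: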
    def getFilteredName(name):
        return name.strip().lower()
Is there a way to iterate over both a list and a file in the same function? How should I cut down as much code as possible?

Answer 0 (score: 1)
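You have duplicated code that can be simplified: lists and file objects are both iterables. If you write a function that takes an iterable and returns the correct output, you reduce the code duplication (DRY).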
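Choice of data structure:
You do not want duplicate items, which means a set() or a dict() is better suited to collect the data you parse. By design they eliminate duplicates, so you do not have to check whether an item is already in a list. If you need to keep the input order, use an OrderedDict from collections, or a plain dict on Python 3.7+, where insertion order is guaranteed. If the order does not matter, use a set().
Either of the choices above removes duplicates for you.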
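As a minimal sketch of the point above, a plain dict (Python 3.7+ assumed) can collect the names so that duplicates disappear without an explicit `in` check, and the same function accepts a list or any other iterable of strings. The function name `unique_long_names` and the `min_length` parameter are hypothetical and not part of the question's code.

```python
def unique_long_names(iterable, min_length=7):
    """Return unique, stripped, lower-cased names with at least min_length
    characters, in first-seen order (relies on dict ordering, Python 3.7+)."""
    seen = {}
    for raw in iterable:
        name = raw.strip().lower()
        if len(name) >= min_length:
            # Assigning an existing key again keeps its original position.
            seen[name] = None
    return list(seen)


names = ["Joh3333n", "Ji3333m", "joh3333n ", "jim"]
print(unique_long_names(names))
```

Swapping the dict for a set() (with `.add(name)`) would also drop duplicates, at the cost of losing the input order.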
Test code:
    import argparse
    from collections import OrderedDict  # on Python 3.7+ a plain dict also keeps insertion order

    def get_names(args):
        """Takes the parsed arguments and returns a list of all unique lower-cased
        names that are longer than 6 characters."""
        seen = OrderedDict()  # or dict or set

        def add_names(iterable):
            """Takes care of adding the stuff to your return collection."""
            k = [n.strip().lower() for n in iterable]  # do the strip().lower() only once
            # using a generator expression to update - use .add() for set()
            seen.update((n, None) for n in k if len(n) > 6)

        if args.name:
            # We are dealing with a list:
            add_names(args.name)
        elif args.names:
            # We are dealing with a file name:
            with open(args.names) as f:
                add_names(f)

        # return as list
        return list(seen)

    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group(required = True)
    group.add_argument('-n', '--name', help = 'single name', action = 'append')
    group.add_argument('-N', '--names', help = 'text file of names')
    args = parser.parse_args()

    results = get_names(args)
    print(results)
Output for -n Joh3333n -n Ji3333m -n joh3333n -n Bo3333b -n bo3333b -n jim:
    ['joh3333n', 'ji3333m', 'bo3333b']
Input file, created with:
    with open("names.txt", "w") as names:
        for n in ["a"*k for k in range(1,10)]:
            names.write(f"{n}\n")
Output for -N names.txt:
    ['aaaaaaa', 'aaaaaaaa', 'aaaaaaaaa']

Answer 1 (score: 0)
Subclass list and make the subclass a context manager:
    class F(list):
        def __enter__(self):
            return self
        def __exit__(self, *args, **kwargs):
            pass
The conditional statement can then decide what to iterate over:
    if args.name:
        # We are dealing with a list.
        thing = F(args.name)
    else:
        # We are dealing with a file name.
        thing = open(args.names)
And the iteration code can be factored out:
    results = []
    with thing as f:
        for name in f:
            name = name.strip().lower()
            if name not in results and len(name) > 6: results.append(name)
Here is a similar solution: it creates an io.StringIO object from either the file or the list, then processes it with a single set of instructions.
    import io

    if args.name:
        # We are dealing with a list.
        f = io.StringIO('\n'.join(args.name))
    else:
        # We are dealing with a file name.
        with open(args.names) as fileobj:
            f = io.StringIO(fileobj.read())

    results = []
    for name in f:
        name = name.strip().lower()
        if name not in results and len(name) > 6: results.append(name)
It has the drawback of reading the entire file into memory, which is a problem if the file is large and memory is scarce.
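One possible way around that drawback, sketched here under the assumption of Python 3.7+ (needed for `contextlib.nullcontext`), is to wrap the list in a no-op context manager rather than copying the file into a `StringIO`, so the file is still read line by line. The helper names `open_names` and `filter_names` are hypothetical; their parameters mirror `args.name` and `args.names` from the question.

```python
import contextlib


def open_names(name_list=None, file_name=None):
    """Return a context manager that yields an iterable of raw names.

    Hypothetical helper: name_list mirrors args.name, file_name mirrors args.names.
    """
    if name_list:
        # nullcontext(x) is a no-op context manager whose __enter__ returns x.
        return contextlib.nullcontext(name_list)
    # A file object is its own context manager and iterates lazily, line by line.
    return open(file_name)


def filter_names(iterable):
    """Strip, lower-case and de-duplicate names longer than 6 characters."""
    results = []
    for name in iterable:
        name = name.strip().lower()
        if name not in results and len(name) > 6:
            results.append(name)
    return results


with open_names(name_list=["Joh3333n", "joh3333n", "jim"]) as f:
    print(filter_names(f))
```

With the parsed arguments from the question, the call would look like `open_names(args.name, args.names)`; the filtering loop is then written only once and never needs the whole file in memory.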