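I am trying to create a function to cut down on a lot of repetitive code that assigns values to variables.
Currently, if I do it this way, it works fine: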
from pyquery import PyQuery as pq
import pandas as pd
d = pq(filename='20160319RHIL0_edit.xml')
# from nominations
res = d('nomination')
nomID = [res.eq(i).attr('id') for i in range(len(res))]
horseName = [res.eq(i).attr('horse') for i in range(len(res))]
zipped = list(zip(nomID, horseName))
frames = pd.DataFrame(zipped)
print(frames)
This produces the following output:
         0                  1
0   171115            Vergara
1   187674      Heavens Above
2   184732         Sweet Fire
3   181928            Alegria
4   158914            Piamimi
5   171408          Blendwell
6   166836     Adorabeel (NZ)
7   172933           Mary Lou
8   182533      Skyline Blush
9   171801         All Cerise
10  181079  Gust of Wind (NZ)
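However, to keep adding to this I would need to create many more variables like the next one (below). The only parts that change are the variable name and the attribute, in this case attr('horse'):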
horseName = [res.eq(i).attr('horse') for i in range(len(res))]
So it seems logical to follow DRY and create a function that takes a list of attributes as its argument:
from pyquery import PyQuery as pq
import pandas as pd
d = pq(filename='20160319RHIL0_edit.xml')
# from nominations
res = d('nomination')
aList = []
def inputs(args):
    '''function to get elements matching criteria'''
    optsList = ['id', 'horse']
    for item in res:
        for attrs in optsList:
            if res.attr(attrs) in item:
                aList.append([res.eq(i).attr(attrs) for i in range(len(res))])

zipped = list(zip(aList))
frames = pd.DataFrame(zipped)
print(frames)
Answer 0 (score: 1)
attrs = ('id', 'horse', ...)
...
data = [[res.eq(i).attr(x) for x in attrs] for i in range(len(res))]
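To try the same nested comprehension without the XML file, here is a minimal sketch that swaps pyquery for the standard library's xml.etree.ElementTree. The helper name `attr_rows`, the `root` wrapper element, and the two-row sample are assumptions made for illustration. Only the `nomination` elements and their `id` and `horse` attributes come from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the real file's root element and other attributes are
# unknown, so only <nomination id=... horse=...> mirrors the question.
SAMPLE_XML = """
<root>
  <nomination id="171115" horse="Vergara"/>
  <nomination id="187674" horse="Heavens Above"/>
</root>
"""


def attr_rows(elements, attrs):
    """Return one row per element, holding one value per name in attrs."""
    # Element.get returns None when an attribute is missing.
    return [[el.get(name) for name in attrs] for el in elements]


root = ET.fromstring(SAMPLE_XML.strip())
nominations = root.findall('nomination')

# The tuple of attribute names is the only thing that changes per column.
data = attr_rows(nominations, ('id', 'horse'))
print(data)
```

Because each inner list is already a full row, no zip step is needed. With pyquery the inner expression stays `res.eq(i).attr(x)` as in the answer, and the result can be passed to `pd.DataFrame(data, columns=list(attrs))` to get named columns.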