Good style for using zip() to iterate over tuples with more than 4 items

Time: 2019-06-06 07:50:26

Tags: python

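For a preprocessing workflow, I need to run the same steps (collecting data from .csv files, cleaning, aggregating, etc.) for several quantities (temperature, relative humidity, etc.). To do this, I use a for loop that iterates over tuples containing meta-information about each quantity. These statements tend to get very long. Is there a cleaner, more elegant way to achieve this?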

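I have considered initializing the tuples beforehand and iterating over a list of tuples, but IMHO that does not really make the code more readable.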

    for quantity, inputFileName, aggregationMethod, locationShapeFile in zip(
            [temperature, relativeHumidity, wind, radiation, precipitation],
            ['temp.csv', 'rh.csv', 'wind.csv', 'rad.csv', 'prec.csv'],
            ['mean', 'mean', 'mean', 'mean', 'sum'],
            ['locTemp.shp', 'locRH.shp', 'locWind.shp', 'locRad.shp', 'locPrec.shp']):
        collect(quantity, inputFileName, aggregationMethod)
        aggregate(aggregationMethod, locationShapeFile)
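
For comparison, the list-of-tuples variant mentioned above might look roughly like the sketch below. Here `collect` and `aggregate` are stand-ins that only record their arguments, and the quantities are placeholder strings, since the real functions and data objects are not shown in the question.

```python
# Stand-ins for the real functions (assumption: here they only log their arguments).
calls = []

def collect(quantity, input_file_name, aggregation_method):
    calls.append(("collect", quantity, input_file_name, aggregation_method))

def aggregate(aggregation_method, location_shape_file):
    calls.append(("aggregate", aggregation_method, location_shape_file))

# Placeholder values; in the real workflow these would be the data objects.
temperature, relative_humidity, wind, radiation, precipitation = (
    "temperature", "relativeHumidity", "wind", "radiation", "precipitation")

# One row per quantity, so everything that belongs together sits on one line.
jobs = [
    (temperature,       "temp.csv", "mean", "locTemp.shp"),
    (relative_humidity, "rh.csv",   "mean", "locRH.shp"),
    (wind,              "wind.csv", "mean", "locWind.shp"),
    (radiation,         "rad.csv",  "mean", "locRad.shp"),
    (precipitation,     "prec.csv", "sum",  "locPrec.shp"),
]

for quantity, input_file_name, aggregation_method, location_shape_file in jobs:
    collect(quantity, input_file_name, aggregation_method)
    aggregate(aggregation_method, location_shape_file)
```

Compared with zip() over four parallel lists, adding or removing a quantity touches a single row, at the cost of a longer literal.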

2 answers:

Answer 0: (score: 0)

Using zip for this already gives fairly compact code. Note that shorter variable names reduce the character count by 23%, for example:

for q, inp, agg, loc in zip(
    [temperature, relativeHumidity, wind, radiation, precipitation],
    ['temp.csv', 'rh.csv', 'wind.csv', 'rad.csv', 'prec.csv'],
    ['mean', 'mean', 'mean', 'mean', 'sum'],
    ['locTemp.shp', 'locRH.shp', 'locWind.shp', 'locRad.shp', 'locPrec.shp']):

    collect(q, inp, agg)     
    aggregate(agg, loc)

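An alternative is numpy.vectorize, which applies a function element-wise over parallel iterables. Note that it is provided for convenience rather than performance (it is essentially a for loop), and that without the otypes argument it calls the function one extra time on the first elements to infer the output type, which matters when the function has side effects: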

import numpy as np

def f(q, inp, agg, loc):
    collect(q, inp, agg)     
    aggregate(agg, loc)

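# otypes is given so that NumPy does not call f an extra time on the first
# elements just to infer the output type (f has side effects)
np.vectorize(f, otypes=[object])(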
    [temperature, relativeHumidity, wind, radiation, precipitation],
    ['temp.csv', 'rh.csv', 'wind.csv', 'rad.csv', 'prec.csv'],
    ['mean', 'mean', 'mean', 'mean', 'sum'],
    ['locTemp.shp', 'locRH.shp', 'locWind.shp', 'locRad.shp', 'locPrec.shp'])
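
If NumPy is not otherwise needed, the built-in map() also accepts several iterables and applies a function to them in parallel. The following is a minimal sketch of that idea, not a drop-in replacement: `collect` and `aggregate` are hypothetical stand-ins that only record their arguments, and the quantities are placeholder strings.

```python
calls = []  # records what the stand-in functions were asked to do

def collect(quantity, input_file, method):
    # Hypothetical stand-in for the real collect()
    calls.append(("collect", quantity, input_file, method))

def aggregate(method, shape_file):
    # Hypothetical stand-in for the real aggregate()
    calls.append(("aggregate", method, shape_file))

def f(q, inp, agg, loc):
    collect(q, inp, agg)
    aggregate(agg, loc)

# Placeholder strings instead of the real data objects (assumption)
quantities = ["temperature", "relativeHumidity", "wind", "radiation", "precipitation"]

# map() is lazy in Python 3, so its result must be consumed for f to run;
# list() does that here and collects the (unused) None return values.
results = list(map(
    f,
    quantities,
    ['temp.csv', 'rh.csv', 'wind.csv', 'rad.csv', 'prec.csv'],
    ['mean', 'mean', 'mean', 'mean', 'sum'],
    ['locTemp.shp', 'locRH.shp', 'locWind.shp', 'locRad.shp', 'locPrec.shp']))
```

Whether this reads better than the plain for loop is a matter of taste, since map() is more commonly used for its return values than for side effects.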

Answer 1: (score: 0)

As @Sayse already suggested in the comments, define the variables before the loop:

quantities = [temperature, relativeHumidity, wind, radiation, precipitation]
csv_files = ['temp.csv', 'rh.csv', 'wind.csv', 'rad.csv', 'prec.csv']
methods = ['mean', 'mean', 'mean', 'mean', 'sum']
shp_files = ['locTemp.shp', 'locRH.shp', 'locWind.shp', 'locRad.shp', 'locPrec.shp']
for quantity, csv_file, method, shp_file in zip(quantities, csv_files, methods, shp_files):
    collect(quantity, csv_file, method)     
    aggregate(method, shp_file)

You can also shorten it by deriving the file names from a short label:

data_points = ['Temp', 'RH', 'Wind', 'Rad', 'Prec'] # maybe there is better name than data_points?
quantities = [temperature, relativeHumidity, wind, radiation, precipitation]
methods = ['mean', 'mean', 'mean', 'mean', 'sum']
for dp, quantity, method in zip(data_points, quantities, methods):
    collect(quantity, f'{dp.lower()}.csv', method)     
    aggregate(method, f'loc{dp}.shp')
  • f-strings require Python 3.6+
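
The same derivation of file names can also be attached to a small record type, so that each quantity carries its own metadata. The sketch below is one possible layout under stated assumptions: the `Quantity` class, its field names, and the stand-in `collect` and `aggregate` functions are hypothetical, and dataclasses require Python 3.7+.

```python
from dataclasses import dataclass

calls = []  # filled by the stand-in functions below

def collect(quantity, csv_file, method):
    # Hypothetical stand-in for the real collect()
    calls.append(("collect", quantity, csv_file, method))

def aggregate(method, shp_file):
    # Hypothetical stand-in for the real aggregate()
    calls.append(("aggregate", method, shp_file))

@dataclass(frozen=True)
class Quantity:
    label: str            # short label used to build file names, e.g. 'Temp'
    data: object          # the data object itself (placeholder strings below)
    method: str = "mean"  # aggregation method; most quantities use the mean

    @property
    def csv_file(self):
        return f"{self.label.lower()}.csv"

    @property
    def shp_file(self):
        return f"loc{self.label}.shp"

QUANTITIES = [
    Quantity("Temp", "temperature"),
    Quantity("RH", "relativeHumidity"),
    Quantity("Wind", "wind"),
    Quantity("Rad", "radiation"),
    Quantity("Prec", "precipitation", method="sum"),
]

for q in QUANTITIES:
    collect(q.data, q.csv_file, q.method)
    aggregate(q.method, q.shp_file)
```

The loop body then no longer needs tuple unpacking at all, and the default value makes the one quantity that is summed rather than averaged stand out.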