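For a preprocessing workflow, I need to run the same steps (collect data from .csv files, clean, aggregate, and so on) for several quantities (temperature, relative humidity, etc.). To do this, I use a for loop that iterates over tuples containing meta-information about each quantity. These statements tend to get very large. Is there a cleaner, more elegant way to achieve this?
I have already considered initializing the tuples beforehand and iterating over a list of tuples, but IMHO that does not really make the code more readable.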
for quantity, inputFileName, aggregationMethod, locationShapeFile in zip([temperature, relativeHumidity, wind, radiation, precipitation],
        ['temp.csv', 'rh.csv', 'wind.csv', 'rad.csv', 'prec.csv'],
        ['mean', 'mean', 'mean', 'mean', 'sum'],
        ['locTemp.shp', 'locRH.shp', 'locWind.shp', 'locRad.shp', 'locPrec.shp']):
    collect(quantity, inputFileName, aggregationMethod)
    aggregate(aggregationMethod, locationShapeFile)
Answer 0 (score: 0)
Using zip for this already gives fairly compact code. Note that shorter variable names reduce the character count by 23%, for example:
for q, inp, agg, loc in zip(
        [temperature, relativeHumidity, wind, radiation, precipitation],
        ['temp.csv', 'rh.csv', 'wind.csv', 'rad.csv', 'prec.csv'],
        ['mean', 'mean', 'mean', 'mean', 'sum'],
        ['locTemp.shp', 'locRH.shp', 'locWind.shp', 'locRad.shp', 'locPrec.shp']):
    collect(q, inp, agg)
    aggregate(agg, loc)
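An alternative is numpy.vectorize, which applies a function element-wise across parallel sequences. Keep in mind that it is essentially a convenience wrapper around a Python loop, not a performance tool. It also converts its inputs to NumPy arrays, so it only suits quantities that are scalar-like objects rather than arrays themselves. Unless otypes is given, it calls the function one extra time on the first elements to infer the output type, which matters for a function with side effects like this one:
import numpy as np

def f(q, inp, agg, loc):
    collect(q, inp, agg)
    aggregate(agg, loc)

# otypes=[object] skips the extra probing call on the first elements
np.vectorize(f, otypes=[object])(
    [temperature, relativeHumidity, wind, radiation, precipitation],
    ['temp.csv', 'rh.csv', 'wind.csv', 'rad.csv', 'prec.csv'],
    ['mean', 'mean', 'mean', 'mean', 'sum'],
    ['locTemp.shp', 'locRH.shp', 'locWind.shp', 'locRad.shp', 'locPrec.shp'])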
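For comparison, the same parallel application can be sketched with only the built-in zip and argument unpacking, which calls the function exactly once per row and never converts the inputs to arrays. The collect and aggregate stubs below are hypothetical stand-ins (the question does not show their bodies), and the quantities are replaced by placeholder strings:

```python
calls = []  # records every call so the behaviour can be inspected


def collect(quantity, input_file, method):
    # Hypothetical stand-in for the question's collect()
    calls.append(("collect", quantity, input_file, method))


def aggregate(method, shape_file):
    # Hypothetical stand-in for the question's aggregate()
    calls.append(("aggregate", method, shape_file))


def process(q, inp, agg, loc):
    collect(q, inp, agg)
    aggregate(agg, loc)


# One list per "column"; placeholder strings replace the real quantity objects
columns = (
    ["temperature", "precipitation"],
    ["temp.csv", "prec.csv"],
    ["mean", "sum"],
    ["locTemp.shp", "locPrec.shp"],
)

# zip(*columns) turns the columns into rows; *row unpacks each row into arguments
for row in zip(*columns):
    process(*row)
```

This keeps the "one function, many parallel inputs" shape of the vectorize version without a NumPy dependency.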
Answer 1 (score: 0)
As @Sayse already suggested in the comments, define the variables before the loop:
quantities = [temperature, relativeHumidity, wind, radiation, precipitation]
csv_files = ['temp.csv', 'rh.csv', 'wind.csv', 'rad.csv', 'prec.csv']
methods = ['mean', 'mean', 'mean', 'mean', 'sum']
shp_files = ['locTemp.shp', 'locRH.shp', 'locWind.shp', 'locRad.shp', 'locPrec.shp']
for quantity, csv_file, method, shp_file in zip(quantities, csv_files, methods, shp_files):
    collect(quantity, csv_file, method)
    aggregate(method, shp_file)
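One risk with separate parallel lists is that they drift out of sync, and zip silently stops at the shortest one. A minimal sketch of a guard follows. The helper name zip_equal is made up for illustration, and on Python 3.10+ the built-in zip(..., strict=True) offers a similar check:

```python
def zip_equal(*sequences):
    """Zip sequences, raising ValueError if their lengths differ."""
    lengths = {len(s) for s in sequences}
    if len(lengths) > 1:
        raise ValueError(f"length mismatch: {sorted(lengths)}")
    return zip(*sequences)


csv_files = ['temp.csv', 'rh.csv', 'wind.csv', 'rad.csv', 'prec.csv']
methods = ['mean', 'mean', 'mean', 'mean']  # deliberately one entry short

error = None
try:
    rows = list(zip_equal(csv_files, methods))
except ValueError as exc:
    error = str(exc)  # the mismatch is reported instead of silently truncated
```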
The lists can also be shortened by deriving the file names from a common label:
data_points = ['Temp', 'RH', 'Wind', 'Rad', 'Prec']  # maybe there is a better name than data_points?
quantities = [temperature, relativeHumidity, wind, radiation, precipitation]
methods = ['mean', 'mean', 'mean', 'mean', 'sum']
for dp, quantity, method in zip(data_points, quantities, methods):
    collect(quantity, f'{dp.lower()}.csv', method)
    aggregate(method, f'loc{dp}.shp')
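Going one step further, each quantity's metadata could live in a single record, so that adding a quantity means adding one line instead of editing several lists. This is only a sketch: the Job class and its field names are assumptions, the file names follow the naming pattern shown above, and placeholder strings stand in for the real quantity objects:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical record bundling everything one quantity needs."""
    label: str        # e.g. 'Temp', used to derive both file names
    quantity: object  # the data object; a placeholder string in this sketch
    method: str       # aggregation method, e.g. 'mean' or 'sum'

    @property
    def csv_file(self):
        return f"{self.label.lower()}.csv"

    @property
    def shp_file(self):
        return f"loc{self.label}.shp"


JOBS = [
    Job("Temp", "temperature", "mean"),
    Job("RH", "relativeHumidity", "mean"),
    Job("Wind", "wind", "mean"),
    Job("Rad", "radiation", "mean"),
    Job("Prec", "precipitation", "sum"),
]

# The loop body would then call collect(job.quantity, job.csv_file, job.method)
# and aggregate(job.method, job.shp_file); here we only list the derived names.
file_pairs = [(job.csv_file, job.shp_file) for job in JOBS]
```

Whether this reads better than parallel lists is a matter of taste. It mainly pays off when the number of quantities or fields grows.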