How to assign new variables in a for loop - Pandas DataFrame variable name assignment

Time: 2018-12-13 06:36:33

Tags: python pandas

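I have a pandas DataFrame holding a time series, and I want to resample it at intervals that grow in 5-minute steps. I want to assign a separate variable name to each 5-minute increment. For example: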

df_5min = df.resample('5min').first()
df_10min = df.resample('10min').first()
.
.
.
df_7200min = df.resample('7200min').first()

I would rather keep them as separately named DataFrames in RAM than save each DataFrame to disk and load it later, which would be simple to write:

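# Start at 5 (a 0-minute interval is invalid) and resample before saving
for i in range(5, 7201, 5): df.resample(str(i) + 'min').first().to_csv('/path/df_' + str(i) + 'min.csv')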

How can I assign a variable name to each one so that I can analyze each DataFrame independently within the same script?

1 Answer:

Answer 0: (score: 1)

You can create a dictionary of DataFrames, because a solution based on globals is not recommended:

# Python 3.6+
dfs = {f'{x}min': df.resample(f'{x}min').first() for x in range(5, 7201, 5)}
# Python below 3.6
dfs = {'{}min'.format(x): df.resample('{}min'.format(x)).first() for x in range(5, 7201, 5)}
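If the comprehension is hard to read, the same dictionary can probably be built with an ordinary for loop, which is closer to the loop in the question. This is only a minimal sketch: the one-minute sample data and the shortened range (5 to 30 minutes) are assumptions made to keep it small.

```python
import pandas as pd

# Hypothetical sample data: one row per minute for two hours.
rng = pd.date_range('2017-04-03 12:00:00', periods=120, freq='1min')
df = pd.DataFrame({'a': range(120)}, index=rng)

# Build the dictionary step by step instead of creating df_5min, df_10min, ...
dfs = {}
for x in range(5, 31, 5):
    key = f'{x}min'                      # e.g. '5min', '10min', ...
    dfs[key] = df.resample(key).first()  # first row of each interval

print(list(dfs))
print(dfs['10min'].head())
```

Each key plays the role of the variable name, so dfs['10min'] stands in for df_10min.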

Sample:

import pandas as pd

rng = pd.date_range('2017-04-03 12:15:10', periods=5, freq='11min')
df = pd.DataFrame({'a': range(5)}, index=rng)
print(df)
                     a
2017-04-03 12:15:10  0
2017-04-03 12:26:10  1
2017-04-03 12:37:10  2
2017-04-03 12:48:10  3
2017-04-03 12:59:10  4

dfs = {f'{x}min': df.resample(f'{x}min').first() for x in range(5, 16, 5)}
print(dfs)
{'5min':                        a
2017-04-03 12:15:00  0.0
2017-04-03 12:20:00  NaN
2017-04-03 12:25:00  1.0
2017-04-03 12:30:00  NaN
2017-04-03 12:35:00  2.0
2017-04-03 12:40:00  NaN
2017-04-03 12:45:00  3.0
2017-04-03 12:50:00  NaN
2017-04-03 12:55:00  4.0, '10min':                      a
2017-04-03 12:10:00  0
2017-04-03 12:20:00  1
2017-04-03 12:30:00  2
2017-04-03 12:40:00  3
2017-04-03 12:50:00  4, '15min':                      a
2017-04-03 12:15:00  0
2017-04-03 12:30:00  2
2017-04-03 12:45:00  3}

Then select each DataFrame by its dict key:

print(dfs['5min'])
                       a
2017-04-03 12:15:00  0.0
2017-04-03 12:20:00  NaN
2017-04-03 12:25:00  1.0
2017-04-03 12:30:00  NaN
2017-04-03 12:35:00  2.0
2017-04-03 12:40:00  NaN
2017-04-03 12:45:00  3.0
2017-04-03 12:50:00  NaN
2017-04-03 12:55:00  4.0
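To analyze every DataFrame independently in the same script, one option is to loop over the dictionary with .items(). The sketch below rebuilds the sample data from above so that it runs on its own; the row and missing-value counts are only a hypothetical stand-in for whatever analysis is actually needed.

```python
import pandas as pd

# Same sample data as in the example above.
rng = pd.date_range('2017-04-03 12:15:10', periods=5, freq='11min')
df = pd.DataFrame({'a': range(5)}, index=rng)
dfs = {f'{x}min': df.resample(f'{x}min').first() for x in range(5, 16, 5)}

# Hypothetical analysis: number of rows and of missing values per frequency.
summary = {}
for name, frame in dfs.items():
    summary[name] = (len(frame), int(frame['a'].isna().sum()))

for name, (rows, missing) in summary.items():
    print(name, rows, missing)
# 5min 9 4
# 10min 5 0
# 15min 3 0
```

Because each value in the dictionary is a separate DataFrame, changing one of them inside the loop does not affect the others.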