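I have a pandas DataFrame time series that increments in 5-minute steps. I want to assign a variable name to each 5-minute increment. For example: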
df_5min = df.resample('5min').first()
df_10min = df.resample('10min').first()
.
.
.
df_7200min = df.resample('7200min').first()
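I would rather keep them as separately named DataFrames held in RAM than store each DataFrame on disk and load it again later, which I could do by simply writing: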
for i in range(0,7201,5):
    df.to_csv('/path/df_' + str(i) + 'min.csv')
How can I assign a variable name to each DataFrame so that I can analyze each one independently within the same script?
Answer 0 (score: 1)
You can create a dictionary of DataFrames, because creating variables dynamically through globals is not recommended.
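To illustrate the difference, here is a minimal sketch using a small made-up frame (the sample `df` and the intervals 5 and 10 are assumptions for illustration only). Creating names through `globals()` does work, but such names are hard to loop over and invisible to linters, while a dict keeps all frames in one object:

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
rng = pd.date_range('2017-04-03 12:00:00', periods=6, freq='5min')
df = pd.DataFrame({'a': range(6)}, index=rng)

# Discouraged: inject dynamically named variables into the module namespace.
for x in (5, 10):
    globals()['df_{}min'.format(x)] = df.resample('{}min'.format(x)).first()

# Preferred: collect the same frames in one dict, keyed by the interval.
dfs = {'{}min'.format(x): df.resample('{}min'.format(x)).first() for x in (5, 10)}

# Both approaches hold identical data; only the dict is easy to iterate over.
same = dfs['10min'].equals(globals()['df_10min'])
print(same)
```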
Solution:
# Python 3.6+
dfs = {f'{x}min': df.resample(f'{x}min').first() for x in range(5,7201,5)}
# Python below 3.6
dfs = {'{}min'.format(x): df.resample('{}min'.format(x)).first() for x in range(5,7201,5)}
Example:
import pandas as pd
rng = pd.date_range('2017-04-03 12:15:10', periods=5, freq='11min')
df = pd.DataFrame({'a': range(5)}, index=rng)
print (df)
                     a
2017-04-03 12:15:10  0
2017-04-03 12:26:10  1
2017-04-03 12:37:10  2
2017-04-03 12:48:10  3
2017-04-03 12:59:10  4
dfs = {f'{x}min': df.resample(f'{x}min').first() for x in range(5,16,5)}
print (dfs)
{'5min':                        a
2017-04-03 12:15:00  0.0
2017-04-03 12:20:00  NaN
2017-04-03 12:25:00  1.0
2017-04-03 12:30:00  NaN
2017-04-03 12:35:00  2.0
2017-04-03 12:40:00  NaN
2017-04-03 12:45:00  3.0
2017-04-03 12:50:00  NaN
2017-04-03 12:55:00  4.0, '10min':                      a
2017-04-03 12:10:00  0
2017-04-03 12:20:00  1
2017-04-03 12:30:00  2
2017-04-03 12:40:00  3
2017-04-03 12:50:00  4, '15min':                      a
2017-04-03 12:15:00  0
2017-04-03 12:30:00  2
2017-04-03 12:45:00  3}
Then access each DataFrame by its dict key:
print (dfs['5min'])
                       a
2017-04-03 12:15:00  0.0
2017-04-03 12:20:00  NaN
2017-04-03 12:25:00  1.0
2017-04-03 12:30:00  NaN
2017-04-03 12:35:00  2.0
2017-04-03 12:40:00  NaN
2017-04-03 12:45:00  3.0
2017-04-03 12:50:00  NaN
2017-04-03 12:55:00  4.0
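Because all frames live in one dict, the same analysis can also be run on each of them independently in a single loop. A minimal sketch that rebuilds the sample data from the example above; the statistic chosen here (row count and number of non-missing values per frame) is only an assumption for illustration:

```python
import pandas as pd

# Same sample data as in the example above.
rng = pd.date_range('2017-04-03 12:15:10', periods=5, freq='11min')
df = pd.DataFrame({'a': range(5)}, index=rng)
dfs = {'{}min'.format(x): df.resample('{}min'.format(x)).first() for x in range(5, 16, 5)}

# One analysis per resampled frame: (number of rows, number of non-missing values in 'a').
summary = {name: (len(frame), int(frame['a'].count())) for name, frame in dfs.items()}
print(summary)
# {'5min': (9, 5), '10min': (5, 5), '15min': (3, 3)}
```

A single frame remains available as `dfs['15min']` whenever it needs separate treatment.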