Python Pandas: dynamically creating DataFrames

Date: 2017-11-04 10:53:21

Tags: python pandas dataframe

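The code below produces the desired output in ONE DataFrame. However, I would like to create the DataFrames dynamically in a for loop and then assign the shifted values to each of them. For example, the DataFrame df_lag_12 would contain only column1_t12 and column2_t12. Any ideas would be greatly appreciated. I tried using exec statements to create the 12 DataFrames dynamically, but a Google search suggests that this is bad practice.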

import pandas as pd

d = {'column1': list(range(0, 20)),
     'column2': list(range(19, -1, -1))}
df = pd.DataFrame(d)

# Build every lag (12 down to 1) of every column in a single DataFrame
df_lags = pd.DataFrame()
for col in df.columns:
    for i in range(12, 0, -1):
        df_lags[col + '_t' + str(i)] = df[col].shift(i)
    df_lags[col] = df[col].values
print(df_lags)

# Attempt to create 12 separately named DataFrames with exec.
# The loop variable is i, not df, so the original df is not overwritten.
for i in range(12, 0, -1):
    exec('model_data_lag_' + str(i) + '=pd.DataFrame()')

Desired output for the dynamically created DataFrame df_lags_12:

var_list=['column1_t12','column2_t12']
df_lags_12=df_lags[var_list]  
print(df_lags_12)

2 Answers:

Answer 0: (score: 6)

I think the best approach is to create a dictionary of DataFrames:

d = {}
for i in range(12,0,-1):
    d['t' + str(i)] = df.shift(i).add_suffix('_t' + str(i))

If you need to select the columns first:

d = {}
cols = ['column1','column2']
for i in range(12,0,-1):
    d['t' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))

Dict comprehension solution:

d = {'t' + str(i): df.shift(i).add_suffix('_t' + str(i)) for i in range(12,0,-1)}
print (d['t10'])
    column1_t10  column2_t10
0           NaN          NaN
1           NaN          NaN
2           NaN          NaN
3           NaN          NaN
4           NaN          NaN
5           NaN          NaN
6           NaN          NaN
7           NaN          NaN
8           NaN          NaN
9           NaN          NaN
10          0.0         19.0
11          1.0         18.0
12          2.0         17.0
13          3.0         16.0
14          4.0         15.0
15          5.0         14.0
16          6.0         13.0
17          7.0         12.0
18          8.0         11.0
19          9.0         10.0
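
As a rough sketch of how such a dictionary might then be used (the names `lags`, `df_lag_12` and `combined` are illustrative and not part of the original answer; `df` is rebuilt here from the question's data so the snippet is self-contained), a single lag can be looked up by key, all of them can be looped over, and they can be joined back into one wide frame with `pd.concat`:

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame({'column1': list(range(0, 20)),
                   'column2': list(range(19, -1, -1))})

# One DataFrame per lag, keyed 't12' ... 't1' (illustrative name: lags)
lags = {'t' + str(i): df.shift(i).add_suffix('_t' + str(i))
        for i in range(12, 0, -1)}

# Look up a single lag by key instead of using a separately named variable
df_lag_12 = lags['t12']
print(df_lag_12.columns.tolist())

# Iterate over all lagged frames
for key, frame in lags.items():
    print(key, frame.shape)

# Optionally join them side by side into one wide DataFrame
combined = pd.concat(list(lags.values()), axis=1)
print(combined.shape)
```

Keeping the frames in one dictionary means the set of lags can be iterated, filtered or passed to a function as a unit, which is awkward with twelve separately named variables.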

Edit: This is also possible with global variables, but a dictionary is the better choice:

# Creates module-level variables df12 ... df1 via globals()
cols = ['column1','column2']
for i in range(12,0,-1):
    globals()['df' + str(i)] =  df[cols].shift(i).add_suffix('_t' + str(i))

print (df10)
    column1_t10  column2_t10
0           NaN          NaN
1           NaN          NaN
2           NaN          NaN
3           NaN          NaN
4           NaN          NaN
5           NaN          NaN
6           NaN          NaN
7           NaN          NaN
8           NaN          NaN
9           NaN          NaN
10          0.0         19.0
11          1.0         18.0
12          2.0         17.0
13          3.0         16.0
14          4.0         15.0
15          5.0         14.0
16          6.0         13.0
17          7.0         12.0
18          8.0         11.0
19          9.0         10.0

Answer 1: (score: 0)

for i in range(1, 16):
    text=f"Version{i}=pd.DataFrame()"
    exec(text)

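Combining exec with an f-string, as in exec(f"..."), can help you do this. The statements above are useful if you need iterations or versions of the same variable.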
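
For comparison, the same fifteen empty frames could also be held in a dictionary without `exec`, in line with the dictionary approach from Answer 0. This is only a minimal sketch; the name `versions` is hypothetical and does not appear in the original answer:

```python
import pandas as pd

# Keys 'Version1' ... 'Version15', each mapped to an empty DataFrame
versions = {f"Version{i}": pd.DataFrame() for i in range(1, 16)}

print(len(versions))
print(versions["Version3"].empty)
```

Each frame is then reached as `versions["Version3"]` rather than through a variable name built from a string.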