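I did the following task by hand, and I'm sure there is a way to do it with a loop, but I'm not sure how to write that loop in Python.
The data looks like this: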
df
                a   b   c  market    ret
date       id
2015-01-01 1   10   4   2      10   0.02
2015-01-01 2   20   3   5      15   0.03
2015-01-01 3   30   2   3      20   0.05
2015-01-01 4   40   1  10      25   0.01
2015-01-02 1   15   8   4      15  -0.03
2015-01-02 2   10   6   1      10   0.02
2015-01-02 3   25  10   2      22   0.06
2015-01-02 4   30   3   7      26   0.06
2015-01-03 1   25   2   2      16  -0.07
2015-01-03 2   10   6   1      18   0.01
2015-01-03 3    5   8   5      26   0.04
2015-01-03 4   30   1   6      21  -0.05
I do the following:
dfa = df
dfa['market'] = dfa.groupby(level = ['id']).market.shift()
dfa['port'] = dfa.groupby(['date'])['a'].transform(lambda x: pd.qcut(x, 4, labels = False))
# value-weighted portfolio returns
dfa = dfa.set_index(['port'], append = True)
dfa['tmktcap'] = dfa.groupby(['date','port'])['market'].transform(sum)
dfa['w_ret'] = (dfa.market / dfa.tmktcap) * dfa.ret
#reshape long to wide
dfa = dfa.groupby(['date', 'port'])['w_ret'].sum().shift(-4)
dfa = dfa['2006-01-01':].rename('a')
dfa = dfa.unstack()
dfa[4.0] = dfa[3.0] - dfa[0.0]
dfa = dfa.stack().reset_index().set_index(['date'])
dfa['port'] = dfa['port'].map({0.0:'a0',1.0:'a1',2.0:'a2',3.0:'a3',4.0:'aL-S'})
dfa = dfa.reset_index().set_index(['date', 'port']).unstack()
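But then I repeat this task for b and c.
So I first set dfb = df, change a to b, then do the same for c, following the same process each time.
In total I have to do this for the variables a through h (only some sample data is shown here), so any help with writing a loop would be amazing!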
Answer 0 (score: 1)
Loop over the columns you want to process, then save each result in an array, a list, or a dictionary. Here is an example with a list.
results = []  # this list will store your results
columns_to_process = ['a', 'b', 'c', 'd', 'f']
for col in columns_to_process:
    data = df.copy()
    data['market'] = data.groupby(level=['id']).market.shift()
    data['port'] = data.groupby(['date'])[col].transform(lambda x: pd.qcut(x, 4, labels=False))
    # do whatever you want with data
    results.append(data)  # stores the result at position 0, then 1, then 2, etc.
# then use your results:
results[0]  # for the dfa
results[1]  # for dfb, etc.
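If it helps, the body of the loop can also be moved into a function that takes the column name. The following is only a rough sketch, not your exact pipeline: the name `portfolio_returns` and its `n_ports` parameter are made up for illustration, it weights by the `market` column shown in the sample data, and it leaves out the `shift` steps and the date slice from the question.

```python
import pandas as pd

def portfolio_returns(df, col, n_ports=4):
    """Value-weighted return of each `col`-sorted portfolio, one column per portfolio."""
    data = df.copy()  # leave the caller's frame untouched
    # Bucket 0..n_ports-1 of `col` within each date.
    data["port"] = data.groupby(level="date")[col].transform(
        lambda x: pd.qcut(x, n_ports, labels=False)
    )
    # Weight each return by its share of the portfolio's total market value.
    total = data.groupby(["date", "port"])["market"].transform("sum")
    data["w_ret"] = data["market"] / total * data["ret"]
    wide = data.groupby(["date", "port"])["w_ret"].sum().unstack()
    # Long-short column: top bucket minus bottom bucket.
    wide[n_ports] = wide[n_ports - 1] - wide[0]
    wide.columns = [f"{col}{i}" for i in range(n_ports)] + [f"{col}L-S"]
    return wide

# Toy frame: the first date of the sample data, columns a and b only.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2015-01-01"]), [1, 2, 3, 4]], names=["date", "id"]
)
df = pd.DataFrame(
    {"a": [10, 20, 30, 40], "b": [4, 3, 2, 1],
     "market": [10, 15, 20, 25], "ret": [0.02, 0.03, 0.05, 0.01]},
    index=index,
)

# One wide frame per sorting variable, keyed by the column name.
results = {col: portfolio_returns(df, col) for col in ["a", "b"]}
```

A dictionary keyed by column name is used here instead of a list so that `results["b"]` reads more clearly than `results[1]`; either container works.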
Alternatively, you may want to store all the results in a single DataFrame. To do that, just select the columns you need and save them in a DataFrame.
df['result_a'] = data.columns_i_want_to_save
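One way to collect everything in a single DataFrame is to gather one Series per column in a dictionary and hand it to `pd.concat`. This is a small sketch on toy data copied from the first two dates of the sample; the `port_` prefix and the names `ports` and `combined` are just illustrative choices.

```python
import pandas as pd

# Toy data shaped like the question's sample (first two dates, columns a and b).
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2015-01-01", "2015-01-02"]), [1, 2, 3, 4]],
    names=["date", "id"],
)
df = pd.DataFrame(
    {
        "a": [10, 20, 30, 40, 15, 10, 25, 30],
        "b": [4, 3, 2, 1, 8, 6, 10, 3],
    },
    index=index,
)

ports = {}
for col in ["a", "b"]:
    # Quartile bucket (0..3) of `col` within each date.
    ports["port_" + col] = df.groupby(level="date")[col].transform(
        lambda x: pd.qcut(x, 4, labels=False)
    )

# The dictionary keys become the column names of the combined frame.
combined = pd.concat(ports, axis=1)
```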
You asked:
#Do I just change a to col where I change name of the column?
dfa['port'].map({0.0:'a0',1.0:'a1',2.0:'a2',3.0:'a3',4.0:'aL-S'})
You can do some "string addition" (concatenation). Something like this:
dfa['port'].map({0.0:col+'0',
1.0:col+'1',
2.0:col+'2',
3.0:col+'3',
4.0:col+'L-S'})
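If the number of portfolios might change, the same mapping can be built with a dictionary comprehension rather than typed out. A minimal sketch, where `port_labels` and `n_ports` are hypothetical names chosen for this example:

```python
def port_labels(col, n_ports=4):
    """Map bucket numbers 0.0..n_ports-1 to '<col><i>' and n_ports to '<col>L-S'."""
    labels = {float(i): f"{col}{i}" for i in range(n_ports)}
    labels[float(n_ports)] = f"{col}L-S"  # the long-short column
    return labels

example = port_labels("b")
```

Inside the loop this would be used as `dfa['port'].map(port_labels(col))`.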