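Fill DataFrame rows with the sum of the previous and current rows' values

Asked: 2019-03-20 13:32:00

Tags: python pandas

The following code creates the DataFrame that serves as my starting point: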

import pandas as pd
import numpy as np


d= {'PX_LAST':[1,2,3,3,3,1,2,2,1,1,3,3],'ma':[2,2,2,2,2,2,2,2,2,2,2,2],'action':[0,0,1,0,0,1,0,0,1,0,1,0]}
df_zinc = pd.DataFrame(data=d)

#add column buy_sell
mask1 = df_zinc['action'] != 0
mask2 = df_zinc['PX_LAST'] < df_zinc['ma']
mask3 = df_zinc['PX_LAST'] > df_zinc['ma']

df_zinc['buy_sell'] = np.select([mask1 & mask2, mask1 & mask3], [-1,1], 0)
df_zinc

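What I am trying to do below is add a 'weight' column in which each row is the sum of three values: the value of 'weight' in the previous row, the current value of 'operational_col', and the current value of 'buy_sell'.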

#empty operational column and weight column
df_zinc['operational_col']=0
df_zinc['weight']=0

#weight column
while index < df_zinc.shape[0]:
    df_zinc['weight'][index] = df_zinc['weight'][index-1] + df_zinc['operational_col'][index] + df_zinc['buy_sell'][index]
    index = index + 1

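This produces a column containing only zeros instead of the values I am looking for. Can anyone help?

1 Answer:

Answer 0 (score: 0)

In your example, index is not defined before it is accessed in the while loop, so the code raises a NameError. Here is the loop rewritten as a for loop that uses loc to select the values in each column: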

for index in range(1, len(df_zinc)):
    df_zinc.loc[index, 'weight'] = df_zinc.loc[index-1, 'weight'] + \
        df_zinc.loc[index, 'operational_col'] + df_zinc.loc[index, 'buy_sell']
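
Because each weight is just the previous weight plus the current row's increment, the same column can likely be built without an explicit loop by taking a cumulative sum. The following is a minimal sketch, not a definitive implementation. It assumes that operational_col does not depend on weight (it is all zeros in the question) and that row 0 should keep a weight of 0, as the loop above leaves it. The name step is introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question so the sketch is self-contained.
d = {'PX_LAST': [1, 2, 3, 3, 3, 1, 2, 2, 1, 1, 3, 3],
     'ma': [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
     'action': [0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0]}
df_zinc = pd.DataFrame(data=d)

mask1 = df_zinc['action'] != 0
mask2 = df_zinc['PX_LAST'] < df_zinc['ma']
mask3 = df_zinc['PX_LAST'] > df_zinc['ma']
df_zinc['buy_sell'] = np.select([mask1 & mask2, mask1 & mask3], [-1, 1], 0)
df_zinc['operational_col'] = 0

# Per-row increment: what each row adds to the running total.
# "step" is a hypothetical helper name, not part of the original code.
step = df_zinc['operational_col'] + df_zinc['buy_sell']

# The loop starts at row 1, so row 0 contributes nothing to the total.
step.iloc[0] = 0

# Running total of the increments, equivalent to the row-by-row loop.
df_zinc['weight'] = step.cumsum()

print(df_zinc['weight'].tolist())
# [0, 0, 1, 1, 1, 0, 0, 0, -1, -1, 0, 0]
```

If operational_col is later filled with values that are computed from weight itself, the cumulative sum no longer applies and the explicit loop is the safer choice.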