Filling a data frame in Python with a loop

Asked: 2017-10-25 15:55:50

Tags: python pandas numpy

I'm new to Python and I'm facing a problem that is driving me crazy. I have a list of unique products, as follows:

array(['NEG_00_04', 'NEG_04_08', 'NEG_08_12', 'NEG_12_16', 'NEG_16_20',
       'NEG_20_24', 'POS_00_04', 'POS_04_08', 'POS_08_12', 'POS_12_16',
       'POS_16_20', 'POS_20_24'], dtype=object)

I have written a function that returns a list of results with their respective dates. For example, for a single product, resultado(waps_df1, 'NEG_00_04') gives:

     datum_von  Results
0   2017-10-10    1.10
1   2017-10-11    2.74
2   2017-10-12   3.96
3   2017-10-13   11.85
4   2017-10-14    7.83
5   2017-10-15   14.64
6   2017-10-16    5.11
7   2017-10-17   12.09
8   2017-10-18    8.47
9   2017-10-19    6.34
10  2017-10-20    7.68
11  2017-10-21   13.40
12  2017-10-22   25.53
13  2017-10-23    2.85
14  2017-10-24    5.80
15  2017-10-25    4.09

I created this data frame:

            NEG_00_04  NEG_04_08  NEG_08_12  NEG_12_16  NEG_16_20  NEG_20_24  \
datum_von                                                                      
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-10          1          1          1          1          1          1   
2017-10-11          1          1          1          1          1          1   
2017-10-11          1          1          1          1          1          1   
2017-10-11          1          1          1          1          1          1   
2017-10-11          1          1          1          1          1          1 

The index is datum_von.

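But now I'm not sure how to insert each product's results at the matching dates. The dates in datum_von, the index of my frame, and the dates returned by my function are all the same, so I want to fill my data frame with the corresponding results given by the following function: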

def resultado(waps_df1,prod):

    NEG_00_04_p =  waps_df1[waps_df1['produktname']== prod] #one prodkt NEG_00_04
    NEG_00_04_p = NEG_00_04_p.reset_index()
    NEG_00_04_p['Diff'] = -NEG_00_04_p['wap'].diff(-1) 
    NEG_00_04_p['Diff'].shift(+1).fillna(0)
    NEG_00_04_p['Diff'] = NEG_00_04_p['Diff'].shift(+1).fillna(0)
    NEG_00_04_p['Results'] = NEG_00_04_p['Diff'] + NEG_00_04_p['wap']
    NEG_00_04_p = NEG_00_04_p.drop('index', 1)
    NEG_00_04_p = NEG_00_04_p.drop('produktname', 1)
    NEG_00_04_p = NEG_00_04_p.drop('Diff', 1)
    NEG_00_04_p = NEG_00_04_p.drop('wap', 1)

    return NEG_00_04_p

The result of this function looks like this, but it needs to be run for every product. This output is for product NEG_00_04:

 datum_von  Results
0   2017-10-10     2.10
1   2017-10-11     4.74
2   2017-10-12    39.96
3   2017-10-13    11.85
4   2017-10-14     7.83
5   2017-10-15    14.64
6   2017-10-16     5.11
7   2017-10-17    12.09
8   2017-10-18     8.47
9   2017-10-19     6.34
10  2017-10-20     7.68
11  2017-10-21    13.40
12  2017-10-22    25.53
13  2017-10-23     2.85
14  2017-10-24     5.80
15  2017-10-25     4.09

Any ideas? The cause is probably that I'm not yet confident manipulating data frames, so I can't solve this on my own.

So far I have written the following code:

waps_df1 = get_marketdata(sql,'marketdata_db','prod6')
Products = waps_df1['produktname'].unique()
del waps_df1['datum_bis']
Days = waps_df1['datum_von'].unique()

###
indexed_df = waps_df1['datum_von']
columns = [Products]
df_ = pd.DataFrame(index = indexed_df, columns=columns)
df_ = df_.fillna(0)


dias = 0
x = 0
k = 0
i=0
y = df_.index.tolist()
for prod in Products:
    df_r = resultado(waps_df1,prod)
    if k == len(Products):
        break

    for dias in y:

        #df_.ix[dias,prod] = 1
        df_.ix[dias, prod] = df_r['Results'][i]
        i=i+1
    k=k+1
#df_.ix[dias,prod] = 1 

1 Answer:

Answer 0: (score: 0)

Consider a horizontal merge with pd.concat, which lets you join all the per-product data frames on the datum_von index. See the adjusted function:

import pandas as pd

def resultado(df, prod):
    # Select the rows of a single product, e.g. NEG_00_04
    tmp = df[df['produktname'] == prod]
    tmp = tmp.reset_index()
    # Forward difference (next wap minus current wap), then shifted down one row,
    # so each row holds its change from the previous row (0 for the first row)
    tmp['Diff'] = -tmp['wap'].diff(-1)
    tmp['Diff'] = tmp['Diff'].shift(+1).fillna(0)
    tmp[prod] = tmp['Diff'] + tmp['wap']             # name each results column after its product
    tmp = tmp.drop(['index', 'produktname', 'Diff', 'wap'], axis=1)
    tmp = tmp.set_index('datum_von')                 # index each data frame by date

    return tmp

# Join the per-product frames side by side on the shared datum_von index
finaldf = pd.concat([resultado(waps_df1, p) for p in Products], axis=1)
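
To try the idea without the database, here is a minimal, self-contained sketch. The small waps_df1 below is made-up toy data standing in for the real query result (an assumption, not the asker's data), and the helper is a condensed variant that returns a Series named after the product rather than a one-column frame. It relies on wap + wap.diff().fillna(0) producing the same values as the Diff/shift steps above.

```python
import pandas as pd

# Toy stand-in for waps_df1; the values are invented for illustration only.
waps_df1 = pd.DataFrame({
    "datum_von": pd.to_datetime(["2017-10-10", "2017-10-11", "2017-10-12"] * 2),
    "produktname": ["NEG_00_04"] * 3 + ["POS_00_04"] * 3,
    "wap": [1.0, 2.0, 4.0, 10.0, 10.0, 7.0],
})

def resultado(df, prod):
    """Return one product's results as a Series indexed by datum_von."""
    tmp = df[df["produktname"] == prod].set_index("datum_von")
    # wap plus its change from the previous row (0 for the first row)
    return (tmp["wap"] + tmp["wap"].diff().fillna(0)).rename(prod)

products = waps_df1["produktname"].unique()

# One column per product, aligned on the shared date index
finaldf = pd.concat([resultado(waps_df1, p) for p in products], axis=1)
print(finaldf)
```

Because pd.concat aligns on the index, no manual counters or row-by-row assignment are needed. This sketch assumes each product has at most one row per date.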