Python: accessing the correct values in an array

Date: 2018-03-29 09:05:11

Tags: python arrays pandas

I want to compute the following equation:

calc = value_a(2D) - (value_b(0D) + value_b(1D))/10000

value_a(2D) corresponds to type **a**, year **2D** and value **1.1275**
value_b(0D) corresponds to type **b**, year **0D** and value **0**
value_b(1D) corresponds to type **b**, year **1D** and value **0.125**

The result should be:

1.1274875
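As a quick sanity check of that expected result, the arithmetic can be written out directly. The variable names below are only illustrative and do not come from the original code.

```python
# Values taken from the three rows named in the equation above.
value_a_2d = 1.1275   # type a, year 2D
value_b_0d = 0.0      # type b, year 0D
value_b_1d = 0.125    # type b, year 1D

# calc = value_a(2D) - (value_b(0D) + value_b(1D)) / 10000
calc = value_a_2d - (value_b_0d + value_b_1d) / 10000
print(calc)
```

Floating-point rounding may show up in the last printed digits, so compare with a tolerance rather than with `==`.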

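However, I'm not sure how to access the correct data inside the loop. I would like to keep the structure of my code.

The code looks like this: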

import pandas as pd

data = pd.read_csv('C:/Book1.csv').fillna('')
pd_date = pd.DatetimeIndex(data['date'].values)
data['date'] = pd_date
index_data = data.set_index('date')

for current_date in index_data.index.unique():
    for index, row in index_data.iterrows():
        if index == current_date:
            for index2, row2 in index_data.iterrows(): 
                if index2 == current_date:    
                    if row['type'] in {'a', 'b'} and row2['type'] in {'a', 'b'}:
                        if row['year'] in {'0D','1D','2D'}:
                            print(row['value'])

The data looks like this:

date          type    year    value
2015-02-09    a       2D      1.1275
2015-02-09    b       10M     58.125
2015-02-09    b       11M     68.375
2015-02-09    b       1M      3.345
2015-02-09    b       1W      0.89
2015-02-09    b       1Y      79.375
2015-02-09    b       2M      7.535
2015-02-09    b       2W      1.8
2015-02-09    b       3M      11.61
2015-02-09    b       3W      2.48
2015-02-09    b       4M      16.2
2015-02-09    b       5M      21.65
2015-02-09    b       6M      27.1
2015-02-09    b       7M      33.625
2015-02-09    b       8M      41.375
2015-02-09    b       9M      49.5
2015-02-09    b       0D      0
2015-02-09    b       1D      0.125

1 answer:

Answer 0 (score: 2)

It looks like you could really use a MultiIndex here:

In [4]: df.reset_index(inplace=True)

In [5]: df
Out[5]:
   type year        date    value
0     a   2D  2015-02-09   1.1275
1     b  10M  2015-02-09  58.1250
2     b  11M  2015-02-09  68.3750
3     b   1M  2015-02-09   3.3450
4     b   1W  2015-02-09   0.8900
5     b   1Y  2015-02-09  79.3750
6     b   2M  2015-02-09   7.5350
7     b   2W  2015-02-09   1.8000
8     b   3M  2015-02-09  11.6100
9     b   3W  2015-02-09   2.4800
10    b   4M  2015-02-09  16.2000
11    b   5M  2015-02-09  21.6500
12    b   6M  2015-02-09  27.1000
13    b   7M  2015-02-09  33.6250
14    b   8M  2015-02-09  41.3750
15    b   9M  2015-02-09  49.5000
16    b   0D  2015-02-09   0.0000
17    b   1D  2015-02-09   0.1250

In [6]: df.set_index(['type','year'], inplace=True)


In [7]: df
Out[7]:
                 date    value
type year
a    2D    2015-02-09   1.1275
b    10M   2015-02-09  58.1250
     11M   2015-02-09  68.3750
     1M    2015-02-09   3.3450
     1W    2015-02-09   0.8900
     1Y    2015-02-09  79.3750
     2M    2015-02-09   7.5350
     2W    2015-02-09   1.8000
     3M    2015-02-09  11.6100
     3W    2015-02-09   2.4800
     4M    2015-02-09  16.2000
     5M    2015-02-09  21.6500
     6M    2015-02-09  27.1000
     7M    2015-02-09  33.6250
     8M    2015-02-09  41.3750
     9M    2015-02-09  49.5000
     0D    2015-02-09   0.0000
     1D    2015-02-09   0.1250

Then simply:

In [8]: df.loc['a','2D'].value  - (df.loc['b', '0D'].value + df.loc['b','1D'].value)/10000
Out[8]: 1.1274875
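For readers who want to reproduce the lookup outside the IPython session, here is a minimal self-contained sketch. It assumes a frame built by hand from a few of the rows shown above, and the helper name `calc` is hypothetical. It uses full `(type, year)` tuples with `.loc`, which returns a scalar when each pair is unique.

```python
import pandas as pd

# A small subset of the rows from the question (values copied as shown).
df = pd.DataFrame({
    "date": ["2015-02-09"] * 4,
    "type": ["a", "b", "b", "b"],
    "year": ["2D", "1M", "0D", "1D"],
    "value": [1.1275, 3.345, 0.0, 0.125],
}).set_index(["type", "year"])  # MultiIndex on (type, year)


def calc(frame):
    """Return value_a(2D) - (value_b(0D) + value_b(1D)) / 10000."""
    a_2d = frame.loc[("a", "2D"), "value"]
    b_0d = frame.loc[("b", "0D"), "value"]
    b_1d = frame.loc[("b", "1D"), "value"]
    return a_2d - (b_0d + b_1d) / 10000


result = calc(df)
print(result)
```

This only works as written when each `(type, year)` pair appears once in the frame. With several dates in one frame, the same lookup returns a Series instead of a scalar.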

Note: suppose I have data for more than one date (I made this frame by simply concatenating df with itself):

In [24]: df2
Out[24]:
   type year        date    value
0     a   2D  2015-02-09   1.1275
1     b  10M  2015-02-09  58.1250
2     b  11M  2015-02-09  68.3750
3     b   1M  2015-02-09   3.3450
4     b   1W  2015-02-09   0.8900
5     b   1Y  2015-02-09  79.3750
6     b   2M  2015-02-09   7.5350
7     b   2W  2015-02-09   1.8000
8     b   3M  2015-02-09  11.6100
9     b   3W  2015-02-09   2.4800
10    b   4M  2015-02-09  16.2000
11    b   5M  2015-02-09  21.6500
12    b   6M  2015-02-09  27.1000
13    b   7M  2015-02-09  33.6250
14    b   8M  2015-02-09  41.3750
15    b   9M  2015-02-09  49.5000
16    b   0D  2015-02-09   0.0000
17    b   1D  2015-02-09   0.1250
18    a   2D  2015-02-10   1.1275
19    b  10M  2015-02-10  58.1250
20    b  11M  2015-02-10  68.3750
21    b   1M  2015-02-10   3.3450
22    b   1W  2015-02-10   0.8900
23    b   1Y  2015-02-10  79.3750
24    b   2M  2015-02-10   7.5350
25    b   2W  2015-02-10   1.8000
26    b   3M  2015-02-10  11.6100
27    b   3W  2015-02-10   2.4800
28    b   4M  2015-02-10  16.2000
29    b   5M  2015-02-10  21.6500
30    b   6M  2015-02-10  27.1000
31    b   7M  2015-02-10  33.6250
32    b   8M  2015-02-10  41.3750
33    b   9M  2015-02-10  49.5000
34    b   0D  2015-02-10   0.0000
35    b   1D  2015-02-10   0.1250

In [25]: df2.iloc[-2,-1] = 100000 # this corresponds to (b, 0D) on 2015-02-10 and used to be 0

As @cᴏʟᴅsᴘᴇᴇᴅ noted, you can group by the 'date' column:

In [26]: df2.groupby('date').apply(
     ...:     lambda df:
     ...:         df.loc['a','2D'].value
     ...:         - (df.loc['b', '0D'].value + df.loc['b','1D'].value)
     ...:         / 10000
     ...: )
Out[26]:
date
2015-02-09    1.127487
2015-02-10   -8.872513
dtype: float64
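The grouped call above relies on each group being indexed by `['type', 'year']`, a step the transcript does not show for `df2`. The sketch below makes that step explicit and keeps an explicit loop over dates, closer to the structure in the question. The rows are a hand-built subset of the data, and the names `rows`, `by_key` and `results` are illustrative assumptions.

```python
import pandas as pd

# Subset of the two-date data; (b, 0D) on 2015-02-10 is set to 100000,
# mirroring the modification made in the answer.
rows = [
    ("2015-02-09", "a", "2D", 1.1275),
    ("2015-02-09", "b", "0D", 0.0),
    ("2015-02-09", "b", "1D", 0.125),
    ("2015-02-10", "a", "2D", 1.1275),
    ("2015-02-10", "b", "0D", 100000.0),
    ("2015-02-10", "b", "1D", 0.125),
]
df2 = pd.DataFrame(rows, columns=["date", "type", "year", "value"])

results = {}
for current_date, group in df2.groupby("date"):
    # Within one date, index the values by (type, year) for direct lookup.
    by_key = group.set_index(["type", "year"])["value"]
    results[current_date] = (
        by_key.loc[("a", "2D")]
        - (by_key.loc[("b", "0D")] + by_key.loc[("b", "1D")]) / 10000
    )

# Collect the per-date results into a Series indexed by date.
result = pd.Series(results, name="calc")
print(result)
```

Looping over `groupby("date")` visits each date once with only its own rows, so the nested `iterrows()` scans in the question are not needed.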