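I feel like this should be obvious, but I'm a bit stuck.
I have a DataFrame (df) with a 3-level MultiIndex on the rows. One of the MultiIndex levels is ccy, which gives the currency of the information in that row. Each row has 3 columns of data.
I want to convert all of the data into a reference currency (say USD). To do that, I have a Series (forex) holding the exchange rates for the relevant currencies.
So the goal is simple: multiply all the data in each row of df by the value in forex that corresponds to that row's entry in the ccy level of df's index.
The mechanical setup looks like this: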
import pandas as pd
import numpy as np
import itertools
np.random.seed(0)
tuples = list(itertools.product(
    list('abd'),
    ['one', 'two', 'three'],
    ['USD', 'EUR', 'GBP']
))
np.random.shuffle(tuples)
idx = pd.MultiIndex.from_tuples(tuples[:-10], names=['letter', 'number', 'ccy'])
df = pd.DataFrame(np.random.randn(len(idx), 3), index=idx,
                  columns=['val_1', 'val_2', 'val_3'])
forex = pd.Series({'USD': 1.0,
                   'EUR': 1.3,
                   'GBP': 1.7})
I can get what I need by running:
df.apply(lambda col: col.mul(forex, level='ccy'), axis=0)
But it seems odd to me that I need pd.DataFrame.apply for something this simple. I expected the following syntax (or something very close to it) to work:
df.mul(forex, level='ccy', axis=0)
But that gives me:
ValueError: cannot reindex from a duplicate axis
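Obviously, the apply approach is no disaster. But it seems odd that I can't find syntax that does this directly across all columns with mul. Is there a more direct way to handle this? If not, is there an intuitive reason why the mul syntax shouldn't be enhanced to work this way?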
Answer 0 (score: 3)
This now works in master / 0.14. See the issue: https://github.com/pydata/pandas/pull/6682
In [11]: df.mul(forex,level='ccy',axis=0)
Out[11]:
                      val_1     val_2     val_3
letter number ccy
a      one    GBP -2.172854  2.443530 -0.132098
d      three  USD  1.089630  0.096543  1.418667
b      two    GBP  1.986064  1.610216  1.845328
       three  GBP  4.049782 -0.690240  0.452957
a      two    GBP -2.304713 -0.193974 -1.435192
b      one    GBP  1.199589 -0.677936 -1.406234
d      two    GBP -0.706766 -0.891671  1.382272
b      two    EUR -0.298026  2.810233 -1.244011
d      one    EUR  0.087504  0.268448 -0.593946
              GBP -1.801959  1.045427  2.430423
b      three  EUR -0.275538 -0.104438  0.527017
a      one    EUR  0.154189  1.630738  1.844833
b      one    EUR -0.967013 -3.272668 -1.959225
d      three  GBP  1.953429 -2.029083  1.939772
              EUR  1.962279  1.388108 -0.892566
a      three  GBP  0.025285 -0.638632 -0.064980
              USD  0.367974 -0.044724 -0.302375

[17 rows x 3 columns]
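As a quick sanity check on what the level-based mul is doing, here is a minimal sketch. It assumes pandas 0.14 or later and uses a small frame with made-up values rather than the question's random data; the names rates and expected are introduced here purely for illustration. It compares the mul call against looking up each row's rate by hand.

```python
import itertools

import numpy as np
import pandas as pd

# A small frame shaped like the question's df (values are arbitrary).
tuples = list(itertools.product(list("abd"), ["one", "two"], ["USD", "EUR", "GBP"]))
idx = pd.MultiIndex.from_tuples(tuples, names=["letter", "number", "ccy"])
df = pd.DataFrame(
    np.arange(len(idx) * 3, dtype=float).reshape(len(idx), 3),
    index=idx,
    columns=["val_1", "val_2", "val_3"],
)
forex = pd.Series({"USD": 1.0, "EUR": 1.3, "GBP": 1.7})

# Broadcast forex across the 'ccy' level of the row index.
converted = df.mul(forex, level="ccy", axis=0)

# Cross-check (hypothetical helper names): fetch each row's rate from its
# 'ccy' label and multiply row-wise.
rates = df.index.get_level_values("ccy").map(forex)
expected = df.mul(np.asarray(rates, dtype=float), axis=0)

# Subtraction aligns on the index, so this check does not depend on row order.
assert (converted - expected).abs().to_numpy().max() < 1e-12
```

The manual lookup is only there to make the semantics explicit; the single mul call is the direct form the question was asking for.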
Here is another way to do it (this also requires master / 0.14):
In [127]: df = df.sortlevel()
In [128]: df
Out[128]:
                      val_1     val_2     val_3
letter number ccy
a      one    EUR  0.118607  1.254414  1.419102
              GBP -1.278149  1.437371 -0.077705
       three  GBP  0.014873 -0.375666 -0.038224
              USD  0.367974 -0.044724 -0.302375
       two    GBP -1.355714 -0.114103 -0.844231
b      one    EUR -0.743856 -2.517437 -1.507096
              GBP  0.705641 -0.398786 -0.827197
       three  EUR -0.211952 -0.080337  0.405398
              GBP  2.382224 -0.406024  0.266445
       two    EUR -0.229251  2.161717 -0.956931
              GBP  1.168273  0.947186  1.085487
d      one    EUR  0.067311  0.206499 -0.456881
              GBP -1.059976  0.614957  1.429661
       three  EUR  1.509445  1.067775 -0.686589
              GBP  1.149076 -1.193578  1.141042
              USD  1.089630  0.096543  1.418667
       two    GBP -0.415745 -0.524512  0.813101

[17 rows x 3 columns]
idx = pd.IndexSlice
In [129]: pd.concat([ df.loc[idx[:,:,x],:]*v for x,v in forex.iteritems() ])
Out[129]:
                      val_1     val_2     val_3
letter number ccy
a      one    EUR  0.154189  1.630738  1.844833
b      one    EUR -0.967013 -3.272668 -1.959225
       three  EUR -0.275538 -0.104438  0.527017
       two    EUR -0.298026  2.810233 -1.244011
d      one    EUR  0.087504  0.268448 -0.593946
       three  EUR  1.962279  1.388108 -0.892566
a      one    GBP -2.172854  2.443530 -0.132098
       three  GBP  0.025285 -0.638632 -0.064980
       two    GBP -2.304713 -0.193974 -1.435192
b      one    GBP  1.199589 -0.677936 -1.406234
       three  GBP  4.049782 -0.690240  0.452957
       two    GBP  1.986064  1.610216  1.845328
d      one    GBP -1.801959  1.045427  2.430423
       three  GBP  1.953429 -2.029083  1.939772
       two    GBP -0.706766 -0.891671  1.382272
a      three  USD  0.367974 -0.044724 -0.302375
d      three  USD  1.089630  0.096543  1.418667

[17 rows x 3 columns]
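The session above was recorded on an old pandas. In current releases DataFrame.sortlevel and Series.iteritems no longer exist; sort_index and items are their replacements. A minimal sketch of the same slice-and-concat idea on a recent pandas follows, using a tiny made-up frame and the illustrative names ix, pieces and converted (none of which come from the original answer).

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("a", "one", "EUR"), ("a", "one", "GBP"), ("b", "two", "USD"), ("b", "two", "GBP")],
    names=["letter", "number", "ccy"],
)
df = pd.DataFrame(
    {"val_1": [1.0, 2.0, 3.0, 4.0], "val_2": [10.0, 20.0, 30.0, 40.0]},
    index=idx,
)
forex = pd.Series({"USD": 1.0, "EUR": 1.3, "GBP": 1.7})

# sort_index() replaces sortlevel(); slicing a MultiIndex with IndexSlice
# needs a lexsorted index.
df = df.sort_index()

ix = pd.IndexSlice
# items() replaces iteritems(). Each piece is the block of rows for one
# currency, scaled by that currency's rate.
pieces = [df.loc[ix[:, :, ccy], :] * rate for ccy, rate in forex.items()]
converted = pd.concat(pieces)
```

As in the answer's output, the result comes back grouped by currency rather than in the original row order, so a final sort_index() may be wanted. This sketch also assumes every currency in forex appears in df.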
Here is yet another way, via a merge:
In [36]: f = forex.to_frame('value')
In [37]: f.index.name = 'ccy'
In [38]: pd.merge(df.reset_index(),f.reset_index(),on='ccy')
Out[38]:
   letter number  ccy     val_1     val_2     val_3  value
0       a    one  GBP -1.278149  1.437371 -0.077705    1.7
1       b    two  GBP  1.168273  0.947186  1.085487    1.7
2       b  three  GBP  2.382224 -0.406024  0.266445    1.7
3       a    two  GBP -1.355714 -0.114103 -0.844231    1.7
4       b    one  GBP  0.705641 -0.398786 -0.827197    1.7
5       d    two  GBP -0.415745 -0.524512  0.813101    1.7
6       d    one  GBP -1.059976  0.614957  1.429661    1.7
7       d  three  GBP  1.149076 -1.193578  1.141042    1.7
8       a  three  GBP  0.014873 -0.375666 -0.038224    1.7
9       d  three  USD  1.089630  0.096543  1.418667    1.0
10      a  three  USD  0.367974 -0.044724 -0.302375    1.0
11      b    two  EUR -0.229251  2.161717 -0.956931    1.3
12      d    one  EUR  0.067311  0.206499 -0.456881    1.3
13      b  three  EUR -0.211952 -0.080337  0.405398    1.3
14      a    one  EUR  0.118607  1.254414  1.419102    1.3
15      b    one  EUR -0.743856 -2.517437 -1.507096    1.3
16      d  three  EUR  1.509445  1.067775 -0.686589    1.3

[17 rows x 7 columns]
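The merge above only attaches the rate to each row; the multiplication itself is not shown. One possible way to finish the job is sketched below on a tiny made-up frame. The names value_cols, merged and converted are assumptions for illustration and are not part of the original answer.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("a", "one", "EUR"), ("a", "one", "GBP"), ("b", "two", "USD"), ("b", "two", "GBP")],
    names=["letter", "number", "ccy"],
)
df = pd.DataFrame(
    {"val_1": [1.0, 2.0, 3.0, 4.0], "val_2": [10.0, 20.0, 30.0, 40.0]},
    index=idx,
)
forex = pd.Series({"USD": 1.0, "EUR": 1.3, "GBP": 1.7})

# Turn the rates into a one-column frame keyed by 'ccy', as in the answer.
f = forex.to_frame("value")
f.index.name = "ccy"

# Attach each row's rate as a 'value' column.
merged = pd.merge(df.reset_index(), f.reset_index(), on="ccy")

# Scale the data columns by the rate, drop the helper column, and restore
# the original three-level index.
value_cols = ["val_1", "val_2"]
merged[value_cols] = merged[value_cols].mul(merged["value"], axis=0)
converted = merged.drop(columns="value").set_index(["letter", "number", "ccy"])
```

Because this is an inner merge, rows whose currency is missing from forex would be dropped silently; passing how='left' would keep them with NaN rates instead.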