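I have a DataFrame with a few thousand rows of artificial FX trade data. The first ten rows look like this:
I want to iterate over this set and, for each row, calculate CommonCurrency, which in this case will be USD. So for each row I look at the CurrencyPair, DeskRate and OrderQty columns and calculate CommonCurrency: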
for i in range(len(order_data)):
    if order_data['CurrencyPair'][i] == 'GBP/USD':
        order_data['CommonCurrency'][i] = order_data['DeskRate'][i] * order_data['OrderQty'][i]
    elif order_data['CurrencyPair'][i] == 'AUD/USD':
        order_data['CommonCurrency'][i] = order_data['DeskRate'][i] * order_data['OrderQty'][i]
    elif order_data['CurrencyPair'][i] == 'EUR/USD':
        order_data['CommonCurrency'][i] = order_data['DeskRate'][i] * order_data['OrderQty'][i]
    elif order_data['CurrencyPair'][i] == 'USD/CHF':
        order_data['CommonCurrency'][i] = order_data['DeskRate'][i] / order_data['OrderQty'][i]
    elif order_data['CurrencyPair'][i] == 'EUR/GBP':
        pass  # a different calculation is needed here
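This doesn't seem like the right way to do it, especially if there are a large number of different currency pairs. The other problem I run into is when I get to EUR/GBP, because now I have to derive EUR/USD from both the DeskRate and the GBP/USD rate, and I can't see how to do that with this approach.
Any tips?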
Answer 0 (score: 2)
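One of the interesting features of pandas is the concept of indexing. There are more Pythonic ways to do this, but with loc you can assign values to part of a DataFrame using a Series (column). The assignment is aligned on the index, so only the rows selected by the mask are written: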
# Write the results to CommonCurrency, not CurrencyPair, so the masks below stay valid
order_data.loc[order_data['CurrencyPair'].isin(('GBP/USD', 'AUD/USD', 'EUR/USD')), 'CommonCurrency'] = order_data['DeskRate'] * order_data['OrderQty']
order_data.loc[order_data['CurrencyPair'] == 'USD/CHF', 'CommonCurrency'] = order_data['DeskRate'] / order_data['OrderQty']
# some_func is a placeholder for whatever calculation EUR/GBP needs
order_data.loc[order_data['CurrencyPair'] == 'EUR/GBP', 'CommonCurrency'] = some_func(order_data['DeskRate'], order_data['OrderQty'])
This avoids any for loops.
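To make the loc approach concrete, here is a minimal, self-contained sketch. The sample rows and the gbp_usd_rate value are made up for illustration, and the EUR/GBP branch assumes that a single GBP/USD reference rate is available to convert the GBP amount into USD (EUR/USD = EUR/GBP * GBP/USD). In real data that rate would more likely vary per trade or per timestamp. The USD/CHF formula is copied from the question as-is.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only.
order_data = pd.DataFrame({
    "CurrencyPair": ["GBP/USD", "USD/CHF", "EUR/GBP", "AUD/USD"],
    "DeskRate": [1.25, 0.80, 0.50, 0.75],
    "OrderQty": [1000.0, 2000.0, 400.0, 800.0],
})

# Assumption: one fixed GBP/USD reference rate used to convert GBP into USD.
gbp_usd_rate = 1.25

pair = order_data["CurrencyPair"]
rate_times_qty = order_data["DeskRate"] * order_data["OrderQty"]

# Start with an empty (NaN) float column, then fill it one mask at a time.
order_data["CommonCurrency"] = float("nan")

# Pairs quoted in USD: DeskRate * OrderQty is already a USD amount.
usd_quoted = pair.isin(["GBP/USD", "AUD/USD", "EUR/USD"])
order_data.loc[usd_quoted, "CommonCurrency"] = rate_times_qty

# USD/CHF: same formula as in the question's loop.
usd_chf = pair == "USD/CHF"
order_data.loc[usd_chf, "CommonCurrency"] = (
    order_data["DeskRate"] / order_data["OrderQty"]
)

# EUR/GBP: DeskRate * OrderQty is a GBP amount, converted to USD with the
# assumed GBP/USD reference rate.
eur_gbp = pair == "EUR/GBP"
order_data.loc[eur_gbp, "CommonCurrency"] = rate_times_qty * gbp_usd_rate

print(order_data)
```

Each right-hand side is computed for the whole column, but loc only writes the rows where the mask is True, so adding another currency pair means adding one more mask and one more assignment rather than another branch in a loop.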