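I am working on code that takes a pandas.DataFrame of trade transactions as input, pairs opening and closing transactions, and calculates metrics. The code expands each transaction into a large list whose length is the transaction quantity multiplied by an instrument-specific multiplier. This happens for every transaction (possibly 1_000s) of every instrument traded (possibly 1_000s). The code then loops over that list to calculate the metrics.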
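I need to optimize the loop in this function to improve performance. I have tried vectorizing it with pandas and with numpy, but I could not get the logic right or improve performance. I keep getting stuck on the conditional statements inside the loop and on the mutation of the price_stack deque.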
from collections import deque
from math import copysign

import numpy as np
import pandas as pd


def round_trips(transactions):
    roundtrips = []
    # group all transactions by symbol and iterate through each group
    for sym, trans_sym in transactions.groupby("symbol"):
        trans_sym = trans_sym.sort_index()
        price_stack = deque()
        # per-unit commissions of units opened when a trade flips the position
        commission_stack = deque()
        dt_stack = deque()
        trans_sym["signed_price"] = trans_sym.trade_price * np.sign(trans_sym.quantity)
        trans_sym["abs_amount"] = trans_sym.quantity.abs().astype(int) * trans_sym.multiplier.astype(int)
        # for each transaction, extract date as dt and transaction details as t
        for dt, t in trans_sym.iterrows():
            # create a list of the signed price where len(...) == the trade amount (qty * multiplier)
            indiv_prices = [t.signed_price] * t.abs_amount
            # create a list of commission per unit where len(...) == the trade amount (qty * multiplier)
            indiv_commissions = [t.commission / t.abs_amount] * t.abs_amount
            # this is an opening trade
            if (len(price_stack) == 0) or (
                copysign(1, price_stack[-1]) == copysign(1, t.quantity)
            ):
                price_stack.extend(indiv_prices)
                dt_stack.extend([dt] * len(indiv_prices))
            else:
                # close round-trip
                gross_pnl = 0
                cur_open_dts = []
                # this could loop tens of millions of times
                for price, commission in zip(indiv_prices, indiv_commissions):
                    if len(price_stack) != 0 and (
                        copysign(1, price_stack[-1]) != copysign(1, price) or
                        abs(price) == 0
                    ):
                        # match against the oldest open unit (FIFO)
                        prev_price = price_stack.popleft()
                        prev_dt = dt_stack.popleft()
                        gross_pnl += -(price + prev_price)
                        cur_open_dts.append(prev_dt)
                    else:
                        # nothing left to close: the remaining units open a new position
                        price_stack.append(price)
                        commission_stack.append(commission)
                        dt_stack.append(dt)
                roundtrips.append({
                    "gross_pnl": gross_pnl,
                    "open_dt": cur_open_dts[0],
                    "close_dt": dt,
                    "long": price < 0,
                    "symbol": sym,
                })
    return pd.DataFrame(roundtrips)
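One possible way around the per-unit expansion, as a rough sketch rather than a drop-in replacement: keep one deque entry per open lot (signed price, remaining units, open date) and match closing trades against those lots in chunks, so the inner loop runs once per lot instead of once per unit. The names match_lots_fifo and open_lots are hypothetical. The sketch handles a single symbol, leaves commissions out, and assumes positive trade prices and non-zero quantities. Because it multiplies by the matched units instead of adding unit by unit, gross_pnl may differ from the original in the last floating-point digits.

```python
from collections import deque
from math import copysign


def match_lots_fifo(trades):
    """Pair opening and closing trades for ONE symbol, lot by lot (hypothetical helper).

    `trades` is an iterable of (dt, signed_price, units) tuples in time order,
    where signed_price = trade_price * sign(quantity) and
    units = abs(quantity) * multiplier, as computed in round_trips.
    """
    open_lots = deque()  # each entry: [signed_price, remaining_units, open_dt]
    roundtrips = []
    for dt, price, units in trades:
        same_side = (
            not open_lots
            or copysign(1, open_lots[-1][0]) == copysign(1, price)
        )
        if same_side:
            # opening trade: store one lot instead of `units` list entries
            open_lots.append([price, units, dt])
            continue

        gross_pnl = 0.0
        first_open_dt = None
        remaining = units
        # consume the oldest lots first (FIFO), a whole chunk at a time
        while remaining > 0 and open_lots:
            lot = open_lots[0]
            matched = min(lot[1], remaining)
            gross_pnl += -(price + lot[0]) * matched
            if first_open_dt is None:
                first_open_dt = lot[2]
            lot[1] -= matched
            remaining -= matched
            if lot[1] == 0:
                open_lots.popleft()
        if remaining > 0:
            # the trade flipped the position: the leftover units open a new lot
            open_lots.append([price, remaining, dt])
        roundtrips.append({
            "gross_pnl": gross_pnl,
            "open_dt": first_open_dt,
            "close_dt": dt,
            "long": price < 0,
        })
    return roundtrips


# toy data, not taken from the sample: buy 3 @ 10, buy 2 @ 11, sell 4 @ 12, sell 3 @ 9
toy_trades = [(1, 10.0, 3), (2, 11.0, 2), (3, -12.0, 4), (4, -9.0, 3)]
for rt in match_lots_fifo(toy_trades):
    print(rt)
```

With this layout the work per closing trade depends on the number of open lots it touches, not on quantity * multiplier, which is where the tens of millions of iterations come from.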
Here is a sample of the transactions input DataFrame:
                           created_on                        updated_on   id  user_id  instrument_id     amount  commission             trade_datetime  trade_key   proceeds  quantity  side  trade_price  margin_requirement  symbol  multiplier
146  2019-04-09 02:05:14.370164+00:00  2019-04-09 02:05:14.370191+00:00  148      100             70   44325.00        2.97  2018-04-17 09:51:35+00:00        1.1   44327.97        -1  sell       1.1820         -11081.2500   KCU18     37500.0
147  2019-04-09 02:05:14.390017+00:00  2019-04-09 02:05:14.390045+00:00  149      100             70   45731.25        2.97  2018-04-24 09:44:51+00:00        1.1   45734.22        -1  sell       1.2195         -11432.8125   KCU18     37500.0
148  2019-04-09 02:05:14.409739+00:00  2019-04-09 02:05:14.409767+00:00  150      100             70  -47793.75        2.97  2018-05-02 07:41:29+00:00        1.1  -47790.78         1   buy       1.2745          11948.4375   KCU18     37500.0
149  2019-04-09 02:05:14.432697+00:00  2019-04-09 02:05:14.432743+00:00  151      100             70  -47793.75        2.97  2018-05-02 07:41:29+00:00        1.1  -47790.78         1   buy       1.2745          11948.4375   KCU18     37500.0
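To give a sense of scale, here is a small worked example of how much the four sample rows above expand under the abs_amount calculation in round_trips. The variable names are only illustrative, and the (quantity, multiplier) pairs are copied from the sample.

```python
# (quantity, multiplier) for the four KCU18 rows in the sample
sample_trades = [(-1, 37500.0), (-1, 37500.0), (1, 37500.0), (1, 37500.0)]

# abs_amount as computed in round_trips: |quantity| * multiplier, both cast to int
abs_amounts = [abs(int(q)) * int(m) for q, m in sample_trades]

# the two sells open a short position, so every one of their units is pushed onto the deque
units_pushed = sum(abs_amounts[:2])
# the two buys close it, so the inner loop runs once per closing unit
inner_iterations = sum(abs_amounts[2:])

print(abs_amounts)       # [37500, 37500, 37500, 37500]
print(units_pushed)      # 75000
print(inner_iterations)  # 75000
```

So four single-contract trades already cost 75,000 deque pushes and 75,000 inner-loop iterations, before multiplying by thousands of trades and thousands of instruments.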