我有这个脚本来计算交易的盈利和亏损。它工作正常 但我认为它可以改进。摆脱for循环至少可以使代码看起来很紧凑。 有人可以帮帮我吗?
计算盈利/亏损的逻辑首先是将卖出交易与潜在买入交易相匹配。单个卖出交易可以与多个买入匹配。因此,成本可能会分配给多个购买。
步骤:
由于
以下是样本数据集
> structure(list(AsxCode = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "QAN", class = "factor"), Order.Type = structure(c(1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L), .Label = c("Buy", "Sell"), class = "factor"), Trade.Date = structure(c(13L, 12L, 12L, 11L, 10L, 9L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L), .Label = c("2014-03-28", "2014-05-22", "2014-11-07", "2014-11-18", "2014-12-04", "2015-03-02", "2015-03-24", "2015-03-27", "2015-05-11", "2015-05-15", "2015-08-21", "2016-04-15", "2016-04-18"), class = "factor"), Price = c(3.75, 4.05, 4.01, 3.55, 3.68, 3.38, 2.9, 2.98, 2.9, 2.05, 1.8, 1.65, 1.25, 1.07), Quantity = c(850L, 1350L, 150L, 1000L, 1500L, 1400L, 1091L, 2000L, 1750L, 600L, 366L, 375L, 500L, 500L), Consideration = c(3198.5, 5456.5, 590.5, 3561, 5531, 4743, 3152.9, 5949, 5086, 1241, 669.8, 629.75, 614, 546), match_status = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), match_vol = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), avg_price = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), profit_loss = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("AsxCode", "Order.Type", "Trade.Date", "Price", "Quantity", "Consideration", "match_status", "match_vol", "avg_price", "profit_loss"), row.names = c(NA, -14L), class = "data.frame")
AsxCode Order.Type Trade.Date Price Quantity Consideration match_status match_vol avg_price profit_loss
1 QAN Buy 2016-04-18 3.75 850 3198.50 NA 0 0 0
2 QAN Sell 2016-04-15 4.05 1350 5456.50 NA 0 0 0
3 QAN Sell 2016-04-15 4.01 150 590.50 NA 0 0 0
4 QAN Buy 2015-08-21 3.55 1000 3561.00 NA 0 0 0
5 QAN Buy 2015-05-15 3.68 1500 5531.00 NA 0 0 0
6 QAN Buy 2015-05-11 3.38 1400 4743.00 NA 0 0 0
7 QAN Sell 2015-03-27 2.90 1091 3152.90 NA 0 0 0
8 QAN Sell 2015-03-24 2.98 2000 5949.00 NA 0 0 0
9 QAN Buy 2015-03-02 2.90 1750 5086.00 NA 0 0 0
10 QAN Buy 2014-12-04 2.05 600 1241.00 NA 0 0 0
11 QAN Buy 2014-11-18 1.80 366 669.80 NA 0 0 0
12 QAN Buy 2014-11-07 1.65 375 629.75 NA 0 0 0
13 QAN Sell 2014-05-22 1.25 500 614.00 NA 0 0 0
14 QAN Buy 2014-03-28 1.07 500 546.00 NA 0 0 0
calculate.profit <- function(trades){
trades$match_vol <- 0
s <- trades[trades$Order.Type== 'Sell', ]
sell.trades <- s[order(s$Trade.Date, decreasing=FALSE),]
b <- trades[trades$Order.Type== 'Buy', ]
buy.trades <- b[order(b$Trade.Date, decreasing=FALSE),]
# Don't want to execute the for loop when there is no sell trades. In other words when there is no profit/loss unless you sell
if(nrow(sell.trades)==0){
return (buy.trades)
}
# for each sell find the associated buys
for(i in 1:nrow(sell.trades))
{
# calculate average price. The Consideration column contains total cost
s.price <- sell.trades[i, 'Consideration']/sell.trades[i,'Quantity']
for(j in 1:nrow(buy.trades))
{
# this part matches sell with a buy trade
# if sell volume and buy volume are same, the sell is fully matched otherwise it has to find the remaining sell units.
s.vol <- sell.trades[i,'Quantity'] - sell.trades[i,'match_vol']
b.vol <- buy.trades[j, 'Quantity'] - buy.trades[j, 'match_vol']
if (b.vol != 0)
{
b.price <- buy.trades[j, 'Consideration']/buy.trades[j, 'Quantity']
# contains the volume which is matched between buy and sell
# trades
match.vol <- min(s.vol, b.vol)
profit <- match.vol * (s.price - b.price)
buy.trades[j, 'match_vol'] <- match.vol + buy.trades[j, 'match_vol']
sell.trades[i, 'profit_loss'] <- profit + sell.trades[i, 'profit_loss']
sell.trades[i, 'match_vol'] <- match.vol + sell.trades[i, 'match_vol']
}
# sell parcel fully processed
if (sell.trades[i ,'match_vol'] == sell.trades[i ,'Quantity'])
{
j=1
break;
}
}
}
return (rbind(buy.trades, sell.trades))
}
答案 0 :(得分:0)
可以做出许多改进。
最明显的事情是预先分配对象尺寸。收到的智慧是inefficient to expand objects in loops。因此你会这样做:
# On example of a single column
sell.trades.vec <- vector(mode = "numeric", length = nrow(buy.trades))
以避免对象在循环中消耗。
seq_along()
从广义上讲,使用seq_along
代替1:something
很简洁,看看:
>> a <- NULL
>> 1:length(a)
[1] 1 0
>> seq_along(a)
integer(0)
>>
:
>> 1:0
[1] 1 0
>> seq_along(0)
[1] 1
>>
我猜你会(很可能)总是有一些明智的nrow
值,但seq_along
可能值得反思,以防有可能得到一些奇怪的数据。< / p>