JavaScript - Filtering an object with thousands of keys, thousands of times

Time: 2018-01-16 06:35:40

Tags: javascript jquery performance ecmascript-6 javascript-objects

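I have charts in my web application, and I need to update them every time new information arrives.

Right now I am running a simulation, so I am backtesting with about 100,000 data points (in JSON), but it could be millions if the browser and hardware can handle it.

For that reason, I need my code to be as optimized as possible.

I have an object like this: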

var trades = {"1515867288390":{price:"10", quantity:"500"},
            "1515867289541":{price:"9", quantity:"400"},
            "1515867295400":{price:"11", quantity:"750"},
            "1515867296500":{price:"7", quantity:"1100"},
            ...}

Each time I scan an entry in trades, I want the average price over the last X seconds, so I have a $.each(trades, getAverage...)

getAverage = function (trade_time) {

var total_quantity = 0;
var total_trade_value = 0;
var startfrom = trade_time - duration;

Object.keys(trades).forEach(function (time) {
    if (time < startfrom)
        delete trades[time];
});

$.each(trades, function (key, value) {
    total_quantity += parseFloat(value.quantity);
    total_trade_value += (value.price * value.quantity);
});

var average = (total_trade_value / total_quantity);
return average;
}

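The average execution time for 80,000 trades is about 7.5 seconds.

Not bad, I think, but the problem is that I need duration in var startfrom = trade_time - duration to be adjustable. That is a problem because getAverage deletes every element older than startfrom, which depends on duration. So if duration starts at 10 and later becomes 20, getAverage can still only look back over the last 10 seconds.

One solution would be to duplicate the object and keep a "full" copy, but then my function would iterate over every element each time and would be slower. The second option I tried was to not delete items and to use: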

Object.keys(trades).filter(t => t>=startfrom).forEach(function (time) {
    var value = trades[time];
    total_quantity += parseFloat(value.quantity);
    total_trade_value += (value.price * value.quantity);
});

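It is about 300 times slower, so that is a really bad choice. What would you suggest?

Thanks.

PS: I was thinking about using an array, since my keys are always numeric (timestamps), but if I used an array indexed by timestamp I would end up with millions of empty indices. Wouldn't that be slow as well?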

3 answers:

Answer 0: (score: 1)

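Maybe a lower-level implementation is faster. For that you could create a new ArrayBuffer to store your data:

 const buffer = new ArrayBuffer(10 ** 4 * (3 * 4)); // 10,000 trades x 3 fields x 4 bytes each

To actually work with the buffer, we need a view. I think an int32 is enough to store each of the timestamp, price and quantity (3 * 4 bytes per trade). Note that millisecond timestamps such as 1515867288390 exceed the 32-bit range, so they would have to be stored as seconds or as offsets from a base time, and fractional prices would be truncated. All of this can be bundled in a class: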

 class TradeView {
  constructor(buffer, start, length){
   this.buffer = buffer;
   this.trades = new Uint32Array(buffer, start, length);
  }
  //...
}

Now, to add a trade, we go to the relevant position and store the data there:

   //TradeView.addTrade
   addTrade(index, timestamp, {quantity, price}){
    this.trades[index * 3] = +timestamp;
    this.trades[index * 3 + 1] = +price;
    this.trades[index * 3 + 2] = +quantity;
  }

Or to read it back:

 //TradeView.getTrade
 getTrade(index){
   return {
     timestamp: this.trades[index * 3],
     price: this.trades[index * 3 + 1],
     quantity: this.trades[index * 3 + 2],
  };
}

Now we need to fill it with the object data (this is slow, so it should be called whenever you receive a small chunk from the backend):

 const trades = new TradeView(buffer);
 let end = 0;

 function loadChunk(newTrades){
   for(const [timestamp, data] of Object.entries(newTrades))
     trades.addTrade(end++, timestamp, data);
}

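Now the really cool part: a buffer can have multiple data views. That means we can "filter" the trades array without copying the data. For that we just need to find the start and end index: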

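 //TradeView.getRangeView
 // `count` (an added parameter) is the number of trades actually stored;
 // trades are assumed to be stored in ascending timestamp order.
 getRangeView(startTime, endTime, count = this.trades.length / 3){
   let start = -1, end = count * 3;
   for(let i = 0; i < count * 3; i += 3){
      if(start < 0 && this.trades[i] >= startTime)
         start = i;
      if(this.trades[i] > endTime){
         end = i; // exclusive end index
         break;
      }
   }
   if(start < 0) start = end; // nothing in range: return an empty view
   // The typed array constructor expects a byte offset, not an element index.
   const byteOffset = this.trades.byteOffset + start * this.trades.BYTES_PER_ELEMENT;
   return new TradeView(this.buffer, byteOffset, end - start);
 }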
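
Because the records are stored in timestamp order, the start and end positions could also be located by binary search rather than a linear scan. The following is a minimal sketch of that alternative, not the answer's own code: lowerBound, averageBetween and the sample data are hypothetical names, and a Float64Array is assumed here so that millisecond timestamps fit without wrapping.

```javascript
// Hypothetical helper: index of the first record whose timestamp is >= target.
// `data` holds [timestamp, price, quantity] triples sorted by timestamp;
// `count` is the number of records actually stored.
function lowerBound(data, count, target) {
  let lo = 0, hi = count;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (data[mid * 3] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Hypothetical helper: quantity-weighted average price for timestamps in
// [startTime, endTime]. Assumes integer millisecond timestamps.
function averageBetween(data, count, startTime, endTime) {
  const from = lowerBound(data, count, startTime);
  const to = lowerBound(data, count, endTime + 1); // exclusive end
  let totalQuantity = 0, totalValue = 0;
  for (let i = from; i < to; i++) {
    totalQuantity += data[i * 3 + 2];
    totalValue += data[i * 3 + 1] * data[i * 3 + 2];
  }
  return totalValue / totalQuantity; // NaN when the range is empty
}

// Sample data taken from the question (assumption: Float64Array storage).
const data = new Float64Array(3 * 4);
[
  [1515867288390, 10, 500],
  [1515867289541, 9, 400],
  [1515867295400, 11, 750],
  [1515867296500, 7, 1100],
].forEach((rec, i) => data.set(rec, i * 3));

console.log(averageBetween(data, 4, 1515867289541, 1515867295400));
```

Each lookup then costs O(log n) comparisons, and only the records inside the window are summed.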

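Answer 1: (score: 1)

Here are a couple of (closely related) ideas.

Idea 1

As each trade arrives, push it onto an array (or splice it into the array if trades may arrive out of order). By hook or by crook, make sure the array stays in timestamp order. Then, when your code takes data from the socket (as a trade) and works out the mean, the data you want will always be at one end of the array. As soon as a non-qualifying trade is reached, the calculation can stop (break out of the loop).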
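
Idea 1 could be sketched roughly as follows. This is only an illustration under assumptions: the names trades, addTrade and getAverage and the record shape { time, price, quantity } are hypothetical, and out-of-order arrivals are assumed to be rare enough that a backward walk to the insertion point is cheap.

```javascript
// Hypothetical store: trades kept in ascending timestamp order.
const trades = [];

function addTrade(time, price, quantity) {
  const trade = { time: +time, price: +price, quantity: +quantity };
  // Walk back from the end to find the insertion point; for in-order
  // arrivals the loop does not run and this is effectively a push.
  let i = trades.length;
  while (i > 0 && trades[i - 1].time > trade.time) i--;
  trades.splice(i, 0, trade);
}

function getAverage(tradeTime, duration) {
  const startFrom = tradeTime - duration;
  let totalQuantity = 0, totalValue = 0;
  // The newest trades are at the end, so scan backwards and stop early.
  for (let i = trades.length - 1; i >= 0; i--) {
    const t = trades[i];
    if (t.time > tradeTime) continue; // later than the reference time
    if (t.time < startFrom) break;    // everything before this is older still
    totalQuantity += t.quantity;
    totalValue += t.price * t.quantity;
  }
  return totalValue / totalQuantity;
}

addTrade("1515867288390", "10", "500");
addTrade("1515867295400", "11", "750");
addTrade("1515867289541", "9", "400"); // out-of-order arrival
addTrade("1515867296500", "7", "1100");

console.log(getAverage(1515867296500, 2000));  // short window
console.log(getAverage(1515867296500, 10000)); // longer window, same data
```

Nothing is deleted while averaging, so the duration can be changed between calls, and the cost of each call depends on the number of trades inside the window rather than on the total number stored.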

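Idea 2

Similar to Idea 1, but instead of maintaining an array of raw trades, store an array of "stats objects", each representing a time slice: maybe as little as 15 seconds' worth of trades, but possibly as much as five minutes' worth.

In each stats object, aggregate trade.quantity and trade.quantity * trade.price. This allows the mean for a time slice to be calculated but, more importantly, allows two or more time slices to be combined by simple addition before the mean is calculated.

This can be implemented with two interdependent constructors: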

/*
 * Stats_store() Constructor
 * Description: 
 *    A constructor, instances of which maintain an array of StatObj() instances (each representing a time-slice), 
 *    and receive a series of timestamped "trade" objects of the form { price:"10", quantity:"500" }.
 *    On receipt of a trade object, an existing StatObj() instance is found (by key based on timestamp) or a new one is created,
 *    then the found/created StatObj's `.addTrade(trade)` method is called.
 * Methods: 
 *    .addTrade(timestamp, trade): called externally
 *    .getMean(millisecondsAgo): called externally
 *    .timeStampToKey(timestamp): called internally
 *    .findByKey(key): called internally
 * Example: var myStats_store = new Stats_store(101075933);
 * Usage: 
 */
const Stats_store = function(granularity) {
    this.buffer = [];
    this.granularity = granularity || 60000; // milliseconds (default 1 minute)
};
Stats_store.prototype = {
    'addTrade': function(timestamp, trade) {
        let key = this.timeStampToKey(timestamp);
        let statObj = this.findByKey(key);
        if (!statObj) {
            statObj = new StatObj(key);
            this.buffer.unshift(statObj);
        }
        statObj.addTrade(trade);
        return this;
    },
    'timeStampToKey': function (timestamp) {
        // Note: a key is a "granulated" timestamp - the leading edge of a timeslice.
        return Math.floor(timestamp / this.granularity); // faster than parseInt()
    },
    'findByKey': function(key) {
        for(let i=0; i<this.buffer.length; i++) {
            if(this.buffer[i].key === key) {
                return this.buffer[i];
            }
        }
        return null; // no time-slice exists yet for this key
    },
    'getMean': function(millisecondsAgo) {
        let key = this.timeStampToKey(Date.now() - millisecondsAgo);
        let s = { 'n':0, 'sigma':0 };
        let c = 0;
        for(let i=0; i<this.buffer.length; i++) {
            if(this.buffer[i].isFresherThan(key)) {
                s.n += this.buffer[i].n;
                s.sigma += this.buffer[i].sigma;
                c++;
            } else {
                break;
            }
        }
        console.log(c, 'of', this.buffer.length);
        return s.sigma / s.n; // arithmetic mean
    }
};

/*
 * StatObj() Constructor
 * Description: 
 *    A stats constructor, instances of which receive a series of "trade" objects of the form { price:"10", quantity:"500" }
 *    and aggregate data from the received trades:
 *       'this.key': represents a time window (passed on construction).
 *       'this.n': is an aggregate of Σ(trade.quantity)
 *       'this.sigma' is an aggregate of trade values Σ(trade.price * trade.quantity)
 *    Together, 'n' and 'sigma' are the raw data required for (or contributing to) an arithmetic mean (average).
 *    NOTE: If variance or SD was required, then the store object would need to accumulate 'sigmaSquared' in addition to 'n' and 'sigma'.
 * Methods: 
 *    .addTrade(trade): called externally
 *    .isFresherThan(key): called externally
 * Example: var myStatObj = new StatObj(101075933);
 * Usage: should only be called by Stats_store()
 */
const StatObj = function(key) {
    this.key = key;
    this.n = 0;
    this.sigma = 0;
}
StatObj.prototype = {
    'addTrade': function(trade) { // eg. { price:"10", quantity:"500" }
        this.n += +trade.quantity;
        this.sigma += +trade.quantity * +trade.price;
    },
    'isFresherThan': function(key) {
        return this.key >= key;
    }
};

Usage

// Initialisation
let mySocket = new WebSocket("ws://www.example.com/socketserver", "protocolOne");
const stats_store = new Stats_store(2 * 60000); // 2 minutes granularity

// On receiving a new trade (example)
mySocket.onmessage = function(event) {
    let trade = ....; // extract `trade` from event
    let timestamp = ....; // extract `timestamp` from event
    let mean = stats_store.addTrade(timestamp, trade).getMean(10 * 60000); // 10 minutes averaging timeslice.
    console.log(mean); // ... whatever you need to do with the calculated mean.
    // ... whatever else you need to do with `trade` and `timestamp`.
};

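A degree of flexibility is afforded by your choice of the values passed to new Stats_store() and .getMean(). Just make sure the first value is smaller than the second.

Light testing of (2) (on a medium-performance computer, Chrome browser under Win7) indicates that:

  • Performance should be at least adequate for the kind of "trade" rate you are talking about (100,000 in 12 hours, or about 140 per minute).
  • Memory usage is small and should not be a problem in the short term, but the buffer is never trimmed, so you may need a "housekeeping" process to sweep up the tail in the long run.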
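
A housekeeping step of that kind might look like the sketch below. It is not part of the answer's code: pruneBuffer is a hypothetical helper, and it assumes the buffer is ordered newest-first (as it is when time slices are added with unshift()) and that each entry carries a numeric key.

```javascript
// Hypothetical helper: drop time slices whose key is older than `oldestKey`.
// Assumes `buffer` is ordered newest-first, so stale entries sit at the tail.
function pruneBuffer(buffer, oldestKey) {
  let keep = buffer.length;
  while (keep > 0 && buffer[keep - 1].key < oldestKey) keep--;
  buffer.length = keep; // truncate in place
  return buffer;
}

// Stand-in data shaped like the stats objects described above (key, n, sigma).
const buffer = [
  { key: 105, n: 500, sigma: 5000 },
  { key: 104, n: 400, sigma: 3600 },
  { key: 103, n: 750, sigma: 8250 },
  { key: 102, n: 1100, sigma: 7700 },
];

pruneBuffer(buffer, 104);
console.log(buffer.map(s => s.key));
```

With the store above, this could be run on a timer against stats_store.buffer, using stats_store.timeStampToKey() of the oldest timestamp that the longest averaging window will ever need.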

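Finally, ideas (1) and (2) are not completely different.

As the granularity constant passed to new Stats_store() in (2) becomes smaller, the behaviour of (2) tends towards the behaviour of (1).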

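Answer 2: (score: 0)

You could convert it into a single loop, instead of the two loops in your code (one to delete and one to iterate), by inverting the if condition, as follows: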

Object.keys(trades).forEach(function (time) {
    if (time >= startfrom) {
      var value = trades[time];
      total_quantity += parseFloat(value.quantity);
      total_trade_value += (value.price * value.quantity);
    }
});