Question

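I have to design an order book data structure that lets me query the highest price among the orders that have been inserted and not yet deleted. The insert and delete operations are given up front in a file, where each operation looks like one of the following two: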

TIMESTAMP insert ID PRICE
TIMESTAMP delete ID

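where ID is an integer identifier of the order, timestamps are always in increasing order, and each ID appears exactly twice: once in an insert and once in a delete operation, in that order.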

From this list of operations, I need to output the time-weighted average of the highest price.

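As an example, say we have the following input (here I marks an insert and E a delete):

10 I 1 10
20 I 2 13
22 I 3 13
24 E 2
25 E 3
40 E 1

After the i-th operation, the maximum is 10, 13, 13, 13, 10 (the book is empty after the last operation), and the time-weighted average is

(10*(20-10) + 13*(22-20) + 13*(24-22) + 13*(25-24) + 10*(40-25)) / (40-10) = 315/30 = 10.5

because 10 is the highest price over the timestamp intervals [10,20] and [25,40], and 13 is the highest for the rest of the time.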

I am thinking of using an unordered_map<ID,price> and a multiset<price> to support:

insertion in O(log(n))
deletion in O(log(n))
getMax in O(1)

Here is an example of what I came up with:

struct order {
  int timestamp, id;
  char type;
  double price;
};

unordered_map<uint, order> M;
multiset<double> maxPrices;
double totaltime = 0;
double avg = 0;
double lastTS = 0;

double getHighest() {
  return !maxPrices.empty() ? *maxPrices.rbegin()
                            : std::numeric_limits<double>::quiet_NaN();
}

void update(const uint timestamp) {
  const double timeLeg = timestamp - lastTS;
  totaltime += timeLeg;
  avg += timeLeg * getHighest();
  lastTS = timestamp;
}

void insertOrder(const order& ord) {
  if (!maxPrices.empty()) {
    if (ord.price >= getHighest()) {
      // we have a new maxPrice
      update(ord.timestamp);
    }

  } else  // if there are no orders, this is the max for sure
    lastTS = ord.timestamp;

  M[ord.id] = ord;
  maxPrices.insert(ord.price);
}

void deleteOrder(
    const uint timestamp,
    const uint id_ord) {  // id_ord is assumed to exist in both M and maxPrices
  order ord = M[id_ord];
  if (ord.price >= getHighest()) {
    update(timestamp);
  }
  auto it = maxPrices.find(ord.price);
  maxPrices.erase(it);
  M.erase(id_ord);
}

This approach has a complexity of O(n log n), where n is the number of active orders.
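For comparison, here is a minimal self-contained sketch of the same unordered_map plus multiset idea. It is not the code above: it charges every interval between consecutive operations to the maximum that held during it, rather than updating only when the maximum changes, which gives the same result with less bookkeeping. The function name `averageHighest` is hypothetical, the `I`/`E` operation codes are taken from the example input, and intervals during which the book is empty are assumed not to count toward the average.

```cpp
#include <cmath>
#include <set>
#include <sstream>
#include <string>
#include <unordered_map>

// Sketch: time-weighted average of the highest live price.
// Assumed input format (from the example): "TS I ID PRICE" or "TS E ID".
double averageHighest(std::istream& in) {
    std::unordered_map<int, double> priceById;  // live orders: id -> price
    std::multiset<double> prices;               // live prices, kept sorted
    double weighted = 0.0, total = 0.0, lastTs = 0.0;

    double ts;
    char op;
    int id;
    while (in >> ts >> op >> id) {
        // Charge the elapsed interval to the maximum that held during it.
        // Assumption: time with an empty book is left out of the average.
        if (!prices.empty()) {
            weighted += *prices.rbegin() * (ts - lastTs);
            total += ts - lastTs;
        }
        lastTs = ts;

        if (op == 'I') {
            double price;
            in >> price;
            priceById[id] = price;
            prices.insert(price);                   // O(log n)
        } else {
            auto it = priceById.find(id);
            if (it == priceById.end()) continue;    // unknown id: ignore
            prices.erase(prices.find(it->second));  // erase one copy only
            priceById.erase(it);
        }
    }
    return total > 0.0 ? weighted / total : std::nan("");
}

// Convenience overload for in-memory input.
double averageHighest(const std::string& text) {
    std::istringstream in(text);
    return averageHighest(in);
}
```

Calling `averageHighest("10 I 1 10 20 I 2 13 22 I 3 13 24 E 2 25 E 3 40 E 1")` walks through the example above. Erasing through `prices.find(...)` matters: `multiset::erase(value)` would remove every order at that price, not just one.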

Is there an asymptotically faster and/or more elegant way to solve this problem?

Answer 1

I suggest you take the database approach.

Place all your records into a std::vector.

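Create an index table, a std::map, which will contain a key value and the index of the record in the vector. If you want the keys sorted in descending order, also supply a comparison functor.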

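This strategy lets you create many index tables without having to re-sort all the data. The map will give good search time for your keys. You can also iterate through the map to list all the keys in order.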
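A minimal sketch of such an index table, under some assumptions: the `Record` type and the function names are hypothetical, and a `std::multimap` is used in place of a plain map because several orders can share the same price. `std::greater` plays the role of the comparison functor for descending order.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

// Hypothetical record type; the field names are illustrative only.
struct Record {
    int id;
    double price;
};

// Index table: price (highest first) -> position of the record in the vector.
using PriceIndex = std::multimap<double, std::size_t, std::greater<double>>;

PriceIndex buildPriceIndex(const std::vector<Record>& records) {
    PriceIndex index;
    for (std::size_t i = 0; i < records.size(); ++i) {
        index.emplace(records[i].price, i);
    }
    return index;
}

// Ids in descending price order, read through the index.
// The records themselves are never moved or re-sorted.
std::vector<int> idsByDescendingPrice(const std::vector<Record>& records) {
    std::vector<int> ids;
    for (const auto& entry : buildPriceIndex(records)) {
        ids.push_back(records[entry.second].id);
    }
    return ids;
}
```

With this layout, the highest price is the first entry of the index, and a second index keyed by id could be built over the same vector. Indices stored this way stay valid only as long as records are not erased from the middle of the vector.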

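Note: With modern computers, you may need a huge amount of data before you see a significant timing difference between a binary search (map) and a linear search (vector).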
