Question

I need to store a large amount of financial time-series data in which different data points may have different attributes.

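For example, consider a database that needs to store time series for financial instruments that include both stocks and options. Both stocks and options have a price at any given point in time, but options have additional attributes such as the Greeks (delta, gamma, vega).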

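A relational database seems the most appropriate here. One possibility is to create a column for every attribute and set the unused attributes to NULL. In the example above, a record representing a stock would use only some of the columns, while a record representing an option would use some others as well.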

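The problem with this approach is that it is very inefficient (you end up storing a lot of NULLs) and very inflexible (you have to add or drop a column every time an attribute is added or removed).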

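An alternative would be to store all attributes in a vertical table (i.e., key, name, value), but this has the drawback of making every attribute type-unsafe (for example, they might all be stored as strings).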
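A minimal sketch of the vertical layout described above, to make the type-safety drawback concrete. The table name, column names, and sample rows are hypothetical, not part of the original question.

```sql
-- Hypothetical vertical (key, name, value) table; all names are illustrative.
create table instrument_attributes (
  inst_id    integer     not null,
  ts         timestamp   not null,
  attr_name  varchar(30) not null,
  attr_value varchar(50) not null,  -- every value is stored as a string
  primary key (inst_id, ts, attr_name)
);

insert into instrument_attributes values (1, '2020-01-02 09:30:00', 'price', '101.25');
insert into instrument_attributes values (2, '2020-01-02 09:30:00', 'delta', '0.52');

-- Reading a numeric attribute needs a cast on every row, and nothing in the
-- schema prevents a non-numeric string from being stored under 'delta'.
select inst_id, ts, cast(attr_value as decimal(18, 6)) as delta
from instrument_attributes
where attr_name = 'delta';
```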

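Another option I thought of is to store the attributes as an XML document in a single column of the time-series table. I tested this approach, and it is impractical from a performance perspective: parsing the XML in every row is too slow if you want to extract attributes for an arbitrarily large number of time-series records.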

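The ideal database technology would be a hybrid of NoSQL and an RDBMS, in which a key-timestamp pair behaves like a row in a relational table, but all attributes are stored in a row-level bag that gives fast access to each attribute.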

Does anyone know of such a system? Are there any other suggestions for storing the kind of data I have described?

Answer 1

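Use "financial_instruments" to store information common to all financial instruments. Use "stocks" to store attributes that apply only to stocks; use "options" to store attributes that apply only to options.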

create table financial_instruments (
  inst_id integer primary key,
  inst_name varchar(57) not null unique, 
  inst_type char(1) not null check (inst_type in ('s', 'o')),
  other_columns char(1), -- columns common to all financial instruments
  unique (inst_id, inst_type) -- required for the FK constraint below.
);

create table stocks (
  inst_id integer primary key,
  inst_type char(1) not null default 's' check (inst_type = 's'),
  other_columns char(1), -- columns unique to stocks.
  foreign key (inst_id, inst_type) references financial_instruments (inst_id, inst_type)
);

create table options (
  inst_id integer primary key,
  inst_type char(1) not null default 'o' check (inst_type = 'o'),
  other_columns char(1), -- columns unique to options; delta, gamma, vega.
  foreign key (inst_id, inst_type) references financial_instruments (inst_id, inst_type)
);

To simplify programming, you can build updatable views that join "financial_instruments" to each of its subtypes. Application code can then use only the views.
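As a rough sketch, a view for the options subtype might look like the following. The view name and the column aliases are hypothetical; the aliases are needed only because the placeholder column "other_columns" appears in both tables.

```sql
-- Hypothetical view joining the supertype to the options subtype.
create view option_instruments as
select fi.inst_id,
       fi.inst_name,
       fi.other_columns as common_columns,  -- placeholder for shared columns
       o.other_columns  as option_columns   -- placeholder for delta, gamma, vega
from financial_instruments fi
join options o
  on o.inst_id = fi.inst_id
 and o.inst_type = fi.inst_type;
```

Whether a join view like this accepts inserts and updates directly depends on the DBMS. In several systems you would need INSTEAD OF triggers (or an equivalent mechanism) to make it updatable.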

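Additional tables that store information relevant to all financial instruments would set a foreign key reference to "financial_instruments"."inst_id". Tables that store information relevant only to stocks would set a foreign key reference to "stocks"."inst_id".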
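Applied to the time series in the question, this could be sketched as follows. Both table names and all of their columns are assumptions for illustration. The options-only table references "options"."inst_id", by analogy with the stocks case described above.

```sql
-- Hypothetical: every instrument has a price, so this table references the supertype.
create table instrument_prices (
  inst_id integer        not null references financial_instruments (inst_id),
  ts      timestamp      not null,
  price   decimal(18, 6) not null,
  primary key (inst_id, ts)
);

-- Hypothetical: only options have Greeks, so this table references the options subtype.
create table option_greeks (
  inst_id integer   not null references options (inst_id),
  ts      timestamp not null,
  delta   decimal(18, 6),
  gamma   decimal(18, 6),
  vega    decimal(18, 6),
  primary key (inst_id, ts)
);
```

With this layout the foreign key prevents Greeks from being recorded for a stock, and each attribute keeps a proper numeric type.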

Answer 2

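Another option: a master table with subsidiary tables for the attributes of similar objects (think object-oriented inheritance). There is a 1-1 relationship between the master and the subs, with the master table's primary key used as the related primary key.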

What is the most suitable database technology for financial time-series data with heterogeneous attributes?
