Efficient database solution for system task tracking

Time: 2016-06-11 00:00:09

Tags: postgresql python-3.x cassandra voltdb bigdata

I am currently working on a data tracking system. The system is a multiprocess application written in Python, and it works as follows:

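  1. Every S seconds it selects the N most suitable tasks from the database (currently Postgres) and finds data for them.
  2. If there are no tasks, it creates N new tasks and returns to (1).

The problem is as follows: I currently have approx. 80 GB of data and 36M tasks, and queries against the tasks table are getting slower and slower (it is the most populated and most frequently used table).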
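
The polling cycle described in steps 1 and 2 can be sketched roughly as follows. This is only an illustration under assumptions: `fetch_tasks`, `create_tasks` and `process` are hypothetical callables standing in for the real database access and data lookup, and "returns to (1)" is read here as "wait for the next cycle".

```python
import time


def poll_once(fetch_tasks, create_tasks, process, n):
    """Run one cycle and return the number of tasks processed.

    fetch_tasks(n)  -> list of up to n tasks (hypothetical DB query)
    create_tasks(n) -> creates n new tasks    (hypothetical DB insert)
    process(task)   -> finds data for a task  (hypothetical worker step)
    """
    tasks = fetch_tasks(n)
    if not tasks:
        # Step 2: nothing to do, so create new tasks; the next cycle picks them up.
        create_tasks(n)
        return 0
    for task in tasks:
        process(task)
    return len(tasks)


def run(fetch_tasks, create_tasks, process, n, s, cycles):
    """Repeat the cycle `cycles` times, sleeping `s` seconds between cycles."""
    processed = 0
    for _ in range(cycles):
        processed += poll_once(fetch_tasks, create_tasks, process, n)
        time.sleep(s)
    return processed
```

In this shape, `fetch_tasks` is the only call that touches the slow query, so it is the piece to measure and optimize.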

The main performance bottleneck is the task tracking query:

    LOCK TABLE task IN ACCESS EXCLUSIVE MODE;
    SELECT * FROM task WHERE line = 1 AND action = ANY(ARRAY['Find', 'Get']) AND (stat IN ('', 'CR1') OR stat = 'ERROR' AND (actiondate <= NOW() OR actiondate IS NULL)) ORDER BY taskid, actiondate, action DESC, idtype, date ASC LIMIT 36;
    
                                        Table "public.task"
       Column   |            Type             |                    Modifiers
    ------------+-----------------------------+-------------------------------------------------
     number     | character varying(16)       | not null
     date       | timestamp without time zone | default now()
     stat       | character varying(16)       | not null default ''::character varying
     idtype     | character varying(16)       | not null default 'container'::character varying
     uri        | character varying(1024)     |
     action     | character varying(16)       | not null default 'Find'::character varying
     reason     | character varying(4096)     | not null default ''::character varying
     rev        | integer                     | not null default 0
     actiondate | timestamp without time zone |
     modifydate | timestamp without time zone |
     line       | integer                     |
     datasource | character varying(512)      |
     taskid     | character varying(32)       |
     found      | integer                     | not null default 0
    Indexes:
        "task_pkey" PRIMARY KEY, btree (idtype, number)
        "action_index" btree (action)
        "actiondate_index" btree (actiondate)
        "date_index" btree (date)
        "line_index" btree (line)
        "modifydate_index" btree (modifydate)
        "stat_index" btree (stat)
        "taskid_index" btree (taskid)
    
                                   QUERY PLAN                          
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=312638.87..312638.96 rows=36 width=668) (actual time=1838.193..1838.197 rows=36 loops=1)
       ->  Sort  (cost=312638.87..313149.54 rows=204267 width=668) (actual time=1838.192..1838.194 rows=36 loops=1)
             Sort Key: taskid, actiondate, action, idtype, date
             Sort Method: top-N heapsort  Memory: 43kB
             ->  Bitmap Heap Scan on task  (cost=107497.61..306337.31 rows=204267 width=668) (actual time=1013.491..1343.751 rows=914586 loops=1)
                   Recheck Cond: ((((stat)::text = ANY ('{"",CR1}'::text[])) OR ((stat)::text = 'ERROR'::text)) AND (line = 1))
                   Filter: (((action)::text = ANY ('{Find,Get}'::text[])) AND (((stat)::text = ANY ('{"",CR1}'::text[])) OR (((stat)::text = 'ERROR'::text) AND ((actiondate <= now()) OR (actiondate IS NULL)))))
                   Rows Removed by Filter: 133
                   Heap Blocks: exact=76064
                   ->  BitmapAnd  (cost=107497.61..107497.61 rows=237348 width=0) (actual time=999.457..999.457 rows=0 loops=1)
                         ->  BitmapOr  (cost=9949.15..9949.15 rows=964044 width=0) (actual time=121.936..121.936 rows=0 loops=1)
                               ->  Bitmap Index Scan on stat_index  (cost=0.00..9449.46 rows=925379 width=0) (actual time=117.791..117.791 rows=920900 loops=1)
                                     Index Cond: ((stat)::text = ANY ('{"",CR1}'::text[]))
                               ->  Bitmap Index Scan on stat_index  (cost=0.00..397.55 rows=38665 width=0) (actual time=4.144..4.144 rows=30262 loops=1)
                                     Index Cond: ((stat)::text = 'ERROR'::text)
                         ->  Bitmap Index Scan on line_index  (cost=0.00..97497.14 rows=9519277 width=0) (actual time=853.033..853.033 rows=9605462 loops=1)
                               Index Cond: (line = 1)
     Planning time: 0.284 ms
     Execution time: 1838.882 ms
    (19 rows)
    

Of course, all the fields involved are indexed. I am currently thinking in two directions:

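  1. How to optimize the query, and whether that would actually give me a performance gain in the long run (currently each query takes about 10 seconds, which is unacceptable for dynamic task tracking).
  2. Where and how to store the task data more efficiently. Perhaps I should use another database for this purpose: Cassandra, VoltDB, or some other Big Data store?

I think the data should be pre-sorted in some way so that the current tasks can be fetched as quickly as possible.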

Please also note that my current volume of 80 GB is most likely a minimum rather than a maximum.

Thanks in advance!

1 answer:

Answer 0 (score: 0)

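I don't fully understand your use case, but it doesn't look to me like your indexes are working very well. The query seems to rely mostly on the stat index. I think you need to look into a composite index, such as (action, line, stat).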
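
A minimal sketch of how one might try out such a composite index before touching the real database. Everything here is an assumption for illustration: it uses Python's built-in `sqlite3` module as a runnable stand-in for Postgres (the planners differ, so this only shows the mechanics), a trimmed-down copy of the `task` table, a simplified ORDER BY, and a hypothetical index name `task_action_line_stat_idx`. On Postgres the same `CREATE INDEX` statement applies, and the check would be to rerun the query under `EXPLAIN ANALYZE`.

```python
import sqlite3

# Trimmed-down stand-in for the real table (only columns the query touches).
SCHEMA = """
CREATE TABLE task (
    number     TEXT NOT NULL,
    idtype     TEXT NOT NULL DEFAULT 'container',
    stat       TEXT NOT NULL DEFAULT '',
    action     TEXT NOT NULL DEFAULT 'Find',
    line       INTEGER,
    actiondate TEXT,
    taskid     TEXT,
    PRIMARY KEY (idtype, number)
)
"""

# Hypothetical index name; the column order follows the suggestion above.
INDEX_NAME = "task_action_line_stat_idx"
CREATE_INDEX = f"CREATE INDEX {INDEX_NAME} ON task (action, line, stat)"

# SQLite spelling of the question's filter: IN instead of ANY(ARRAY[...]),
# datetime('now') instead of NOW().
QUERY = """
SELECT * FROM task
WHERE line = 1
  AND action IN ('Find', 'Get')
  AND (stat IN ('', 'CR1')
       OR stat = 'ERROR'
          AND (actiondate <= datetime('now') OR actiondate IS NULL))
ORDER BY taskid, actiondate
LIMIT 36
"""


def plan_details(conn, query):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return [row[-1] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    print("before:", plan_details(conn, QUERY))
    conn.execute(CREATE_INDEX)
    print("after: ", plan_details(conn, QUERY))
```

Comparing the plan before and after creating the index shows whether the planner actually picks it up; whether it helps on 36M rows can only be settled by measuring on the real data.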

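Another option is to split your data across multiple tables, partitioning it on some key with low cardinality. I don't use Postgres, but I don't think looking at another database solution will work any better unless you know exactly what you are optimizing for.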
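
The idea of splitting the data on a low-cardinality key can be sketched like this. It is a rough illustration only: it assumes `line` is such a key, uses hypothetical table names of the form `task_line_<n>`, keeps just a few columns, and runs against an in-memory `sqlite3` database as a stand-in for Postgres (which has its own partitioning mechanisms).

```python
import sqlite3


def partition_table(line):
    """Map a value of the (assumed) low-cardinality key to a table name."""
    line = int(line)
    if line < 0:
        raise ValueError("line must be non-negative")
    return f"task_line_{line}"


def create_partition(conn, line):
    """Create the per-line table if it does not exist yet."""
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {partition_table(line)} ("
        "number TEXT NOT NULL, stat TEXT NOT NULL DEFAULT '', "
        "action TEXT NOT NULL DEFAULT 'Find', taskid TEXT)"
    )


def insert_task(conn, line, number, action="Find", stat="", taskid=None):
    """Route a new task to the table that belongs to its line."""
    conn.execute(
        f"INSERT INTO {partition_table(line)} "
        "(number, stat, action, taskid) VALUES (?, ?, ?, ?)",
        (number, stat, action, taskid),
    )


def fetch_tasks(conn, line, limit):
    """Select pending tasks from one partition only, never the whole data set."""
    return conn.execute(
        f"SELECT number, action, stat FROM {partition_table(line)} "
        "WHERE action IN ('Find', 'Get') AND stat IN ('', 'CR1') "
        "ORDER BY taskid LIMIT ?",
        (limit,),
    ).fetchall()
```

With this layout the `line = 1` condition disappears from the query, because choosing the table already applies it.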