Expected storage size of tables and indexes

Time: 2012-11-29 08:21:31

Tags: postgresql

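I am using PostgreSQL 9.2.1 on Oracle Linux Server release 6.3.

I am trying to work out the expected storage size of a table and its index.

Following some advice from this site, I worked out my own formula for the table below...

- For the TABLE ...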

postgres=# \d test
           Table "public.test"
    Column     |         Type          | Modifiers 
---------------+-----------------------+-----------
 c1            | integer               | not null
 c2            | character varying(20) | not null
 c3            | character varying(8)  | not null
 c4            | character varying(6)  | not null
 c5            | character varying(15) | 
 c6            | character varying(20) | 
 c7            | character varying(20) | 
 c8            | character varying(20) | 
Indexes:
    "idx_test" PRIMARY KEY, btree (c1, c3, c4, c5)
Tablespace: "test"

postgres=# insert into test values(1, 
                                   '11111111111111111111', -- 20(exactly same with max length of each column)
                                   '11111111',             -- 8
                                   '111111',               -- 6
                                   '111111111111111',      -- 15
                                   '11111111111111111111', -- 20
                                   '11111111111111111111', -- 20
                                   '11111111111111111111');-- 20
INSERT 0 1

postgres=# select * from pgstattuple('test');           
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent                          
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
      8192 |           1 |        81 |          0.99 |                0 |              0 |                  0 |       8072 |        98.54 

postgres=# insert into test values(2, 
                                   '11111111111111111111', -- 20(exactly same with max length of each column)
                                   '11111111',             -- 8
                                   '111111',               -- 6
                                   '111111111111111',      -- 15
                                   '11111111111111111111', -- 20
                                   '11111111111111111111', -- 20
                                   '11111111111111111111');-- 20      
INSERT 0 1

postgres=# select * from pgstattuple('test');
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent 
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
      8192 |           2 |       162 |          1.98 |                0 |              0 |                  0 |       7980 |        97.41                    

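So I found that each page can hold 88 (X) tuples.

  • Actual size of a tuple in storage: 8072 (free_space after the first insert) - 7980 (free_space after the second insert) = 92
  • Fixed page overhead = 8192 - 8072 (free_space after the first insert) - 92 (tuple size) = 28
  • 8192 - 28 = 92 * X (maximum tuples per page), so X = floor(8164 / 92) = 88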
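The arithmetic above can be sketched in Python. This is only an illustration of the reasoning in the post; the function name `tuples_per_page` is hypothetical, and the sketch assumes every tuple consumes the same amount of space, which only holds when all rows have identical lengths.

```python
PAGE_SIZE = 8192  # page size used in the post's arithmetic


def tuples_per_page(free_after_first, free_after_second, page_size=PAGE_SIZE):
    """Estimate how many equally sized tuples fit in one page.

    Both arguments are free_space values reported by pgstattuple after the
    first and second insert into an otherwise empty relation.
    Returns (bytes per tuple, fixed page overhead, tuples per page).
    """
    # Space consumed by one additional tuple, including per-tuple overhead.
    per_tuple = free_after_first - free_after_second
    # Whatever is neither free nor used by the first tuple is fixed overhead.
    page_overhead = page_size - free_after_first - per_tuple
    return per_tuple, page_overhead, (page_size - page_overhead) // per_tuple


# Values from the two pgstattuple('test') calls above.
print(tuples_per_page(8072, 7980))  # (92, 28, 88)
```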

- For the INDEX ...

postgres=# \d idx_test
           Index "public.idx_test"
    Column     |         Type          |  Definition   
---------------+-----------------------+---------------
 c1            | integer               | c1
 c2            | character varying(20) | c2
 c3            | character varying(8)  | c3
 c4            | character varying(6)  | c4
primary key, btree, for table "public.test"

postgres=# truncate table test;
postgres=# vacuum;
postgres=# analyze;

postgres=# insert into test values(1, 
                                   '11111111111111111111', -- 20(exactly same with max length of each column)
                                   '11111111',             -- 8
                                   '111111',               -- 6
                                   '111111111111111',      -- 15
                                   '11111111111111111111', -- 20
                                   '11111111111111111111', -- 20
                                   '11111111111111111111');-- 20
INSERT 0 1

postgres=# select * from pgstattuple('idx_test');
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent 
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
     16384 |           1 |        56 |          0.34 |                0 |              0 |                  0 |       8088 |        49.37
(1 row)

postgres=# select * from pgstatindex('idx_test');
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation 
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       2 |          0 |       8192 |             1 |              0 |          1 |           0 |             0 |             0.79 |                  0
(1 row)

postgres=# insert into test values(2, 
                                   '11111111111111111111', -- 20(exactly same with max length of each column)
                                   '11111111',             -- 8
                                   '111111',               -- 6
                                   '111111111111111',      -- 15
                                   '11111111111111111111', -- 20
                                   '11111111111111111111', -- 20
                                   '11111111111111111111');-- 20

INSERT 0 1       

postgres=# select * from pgstattuple('idx_test');                                                                                                                                                                                                                                                   
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------             
     16384 |           2 |       112 |          0.68 |                0 |              0 |                  0 |       8028 |           49              
(1 row)                                                                                                                                                

postgres=# select * from pgstatindex('idx_test');                                                                                            
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation 
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       2 |          0 |       8192 |             1 |              0 |          1 |           0 |             0 |             1.52 |                  0 
(1 row)        

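Similarly, I found that each page can hold 135 (Y) tuples.

  • Actual size of a tuple in storage: 8088 (free_space after the first insert) - 8028 (free_space after the second insert) = 60
  • Fixed page overhead = 8192 - 8088 (free_space after the first insert) - 60 (tuple size) = 44
  • 8192 - 44 = 60 * Y (maximum tuples per page), so Y = floor(8148 / 60) = 135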
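The measured per-tuple costs (92 bytes for the table, 60 for the index) are larger than the reported tuple_len values (81 and 56). One way to account for the difference, sketched below, is to assume that each tuple is padded to an 8-byte boundary and carries a 4-byte line pointer in the page. Both constants are assumptions typical of 64-bit PostgreSQL builds, not values stated in the post, and the helper name is hypothetical.

```python
MAXALIGN = 8      # assumed alignment boundary (typical 64-bit build)
LINE_POINTER = 4  # assumed size of one line pointer per tuple


def on_page_footprint(tuple_len):
    """Bytes one tuple takes out of a page's free space under the assumptions above."""
    # Round tuple_len up to the next multiple of MAXALIGN.
    padded = -(-tuple_len // MAXALIGN) * MAXALIGN
    return padded + LINE_POINTER


print(on_page_footprint(81))  # 92, matches the table measurement
print(on_page_footprint(56))  # 60, matches the index measurement
```

Under the same assumptions, a tuple_len of 40 would cost 44 bytes per entry, so the per-page capacity changes as soon as the stored values are shorter than the test rows.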

When I inserted 1350 rows into the table, I got this...

postgres=# select * from pgstattuple('test');
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent 
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
    131072 |        1350 |    109350 |         83.43 |                0 |              0 |                  0 |       6424 |          4.9
(1 row)


postgres=# select * from pgstattuple('idx_test');
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent 
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
     90112 |        1350 |     54000 |         59.93 |                0 |              0 |                  0 |      13580 |        15.07
(1 row)

postgres=# select * from pgstatindex('idx_test');
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation 
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       2 |          1 |      81920 |             3 |              0 |          9 |           0 |             0 |            81.49 |                  0
(1 row)     

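File size of the table?

1350 (rows) / 88 (X) = 15.34 -> this means 16 pages are needed, so the file size is 16 * 8192 = 131072. That looks correct.

However, the index is different...

1350 (rows) / 135 (Y) = exactly 10... so the file should be 10 * 8192 = 81920 bytes, but it is 90112.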
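The same estimate, written out as a small Python sketch. The function name `estimated_file_size` is hypothetical; the inputs are the row count and the per-page capacities X and Y derived earlier, and the observed size is the table_len that pgstattuple reports for the index.

```python
PAGE_SIZE = 8192  # page size used in the post's arithmetic


def estimated_file_size(rows, per_page, page_size=PAGE_SIZE):
    """Return (pages, bytes) needed if every page holds per_page tuples."""
    pages = -(-rows // per_page)  # integer ceiling division
    return pages, pages * page_size


print(estimated_file_size(1350, 88))   # (16, 131072): matches table_len
print(estimated_file_size(1350, 135))  # (10, 81920): estimate for the index

# Observed index file size from pgstattuple('idx_test'), expressed in pages.
print(90112 // PAGE_SIZE)  # 11, one page more than the estimate
```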

Inserting one more row beyond that should extend index_size (if I am right), so I tried it, but nothing changed.

postgres=# insert into test values(27108,'sanjuk1052','20121022','233338','172,20,30,177','win7','IE','9,0');
INSERT 0 1
postgres=# select * from pgstattuple('test');
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent 
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
    131072 |        1351 |    109431 |         83.49 |                0 |              0 |                  0 |       6332 |         4.83
(1 row)

postgres=# select * from pgstattuple('idx_test');
 table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent 
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
     90112 |        1351 |     54040 |         59.97 |                0 |              0 |                  0 |      13536 |        15.02
(1 row)

postgres=# select * from pgstatindex('idx_test');
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation 
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       2 |          1 |      81920 |             3 |              0 |          9 |           0 |             0 |            81.55 |                  0
(1 row)

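I am not even sure this approach is sound, but even if it is not perfect, it has to get the job done...

I need my own formula, especially for the expected index size, including any additional files produced by TOAST...

Any advice would be greatly appreciated.

1 answer:

Answer 0: (score: 0)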

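One real problem you have here is that varchar does not use a constant amount of space. This can make your page estimates drift over time (your estimates are essentially maximums, and the real figures may be lower). Given your table structure, I do not think anything will be TOASTed.

This also affects your index, because it contains variable-length fields. So you should expect your estimates to represent maximums rather than known exact values. The actual size depends on your actual data, not just on the schema and the row count.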
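The figures posted in the question illustrate this. A minimal sketch, using only numbers from the pgstattuple output above (the helper name is hypothetical): the index entries in the 1350-row load average fewer bytes than the single maximum-length entry used to derive the per-page capacity.

```python
def average_tuple_len(total_tuple_len, tuple_count):
    """Average bytes per tuple, from pgstattuple's tuple_len and tuple_count."""
    return total_tuple_len / tuple_count


# pgstattuple('idx_test') after the 1350-row load: tuple_len=54000, tuple_count=1350
loaded_index_avg = average_tuple_len(54000, 1350)
# pgstattuple('idx_test') after the single maximum-length row: tuple_len=56
max_length_index = average_tuple_len(56, 1)

print(loaded_index_avg, max_length_index)              # 40.0 56.0
print(round(loaded_index_avg / max_length_index, 2))   # 0.71
```

A capacity derived from 56-byte entries therefore overstates the space needed for rows like these, which is consistent with treating the formula as an upper bound.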