Limiting the size of a schema in Amazon Redshift

Time: 2015-07-27 08:41:25

Tags: amazon-redshift

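We use Amazon Redshift in our project.

In our project, we assign different schemas to different groups of people. For example, marketing gets a separate schema where they can store their tables for analysis, and the sales team gets a separate schema of its own.

What usually happens is that analysts from one group fill most of the database with tables that are temporary in nature, and they never bother to drop or purge them. So the data grows explosively and nobody else has any space left. We then end up doing a lot of stock-taking.

I would like to know whether we can limit the size of a schema, or the amount of database space available to a specific group of users. Say we allocate 100 GB to the sales schema, 50 GB to the marketing schema, and so on.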

2 answers:

Answer 0: (score: 3)

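According to the Redshift documentation, Redshift does not appear to provide a feature for limiting the size of each schema or database, but there is a workaround.

Because you can get the data size of each table with the queries below, you can write a script that monitors usage and sends an alert when a limit is exceeded. Then simply run the script periodically via cron.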

  • Query to get the data size and row count of each table
select
  trim(pgdb.datname) as database, trim(pgn.nspname) as schema,
  trim(a.name) as Table, b.mbytes, a.rows
from
  (select db_id, id, name, sum(rows) as rows from stv_tbl_perm a group by db_id, id, name) as a
  join pg_class as pgc on pgc.oid = a.id
  join pg_namespace as pgn on pgn.oid = pgc.relnamespace
  join pg_database as pgdb on pgdb.oid = a.db_id
  join (select tbl, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
order by 1, 2, 3;
  • Example result
 database |     schema    |    table    | mbytes |   rows
----------+---------------+-------------+--------+----------
 test_db  | dev_schmea_1  | click_log   |     23 |     4653
 prod_db  | prod_schema_1 | click_log   |  16217 |  2112354
 prod_db  | prod_schema_1 | install_log |   5544 |   433538
  • Query to get the data size and row count of each schema
select
  trim(pgdb.datname) as database, trim(pgn.nspname) as schema,
  sum(b.mbytes) as mbytes, sum(a.rows) as rows
from
  (select db_id, id, name, sum(rows) as rows from stv_tbl_perm a group by db_id, id, name) as a
  join pg_class as pgc on pgc.oid = a.id
  join pg_namespace as pgn on pgn.oid = pgc.relnamespace
  join pg_database as pgdb on pgdb.oid = a.db_id
  join (select tbl, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
group by pgdb.datname, pgn.nspname
order by 1, 2;
  • Query to get the data size and row count of each database
select
  trim(pgdb.datname) as database, sum(b.mbytes) as mbytes, sum(a.rows) as rows
from
  (select db_id, id, name, sum(rows) as rows from stv_tbl_perm a group by db_id, id, name) as a
  join pg_class as pgc on pgc.oid = a.id
  join pg_namespace as pgn on pgn.oid = pgc.relnamespace
  join pg_database as pgdb on pgdb.oid = a.db_id
  join (select tbl, count(*) as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl
group by pgdb.datname
order by 1;
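
The monitoring script described above could look roughly like the following. This is only a minimal sketch under stated assumptions: the quota values come from the figures in the question, the names `QUOTAS_MB`, `fetch_usage` and `find_over_quota` are made up for illustration, and `cursor` is assumed to be any Python DB-API cursor connected to the cluster. How the alert is actually delivered (mail, chat, and so on) is left out.

```python
# Hypothetical per-schema limits in MB. stv_blocklist counts 1 MB blocks,
# so the "mbytes" column from the queries above is already in MB.
QUOTAS_MB = {
    "sales": 100 * 1024,     # 100 GB, the figure used in the question
    "marketing": 50 * 1024,  # 50 GB
}

# Trimmed-down variant of the per-schema query from this answer
# (current database only, size column only).
SCHEMA_USAGE_SQL = """
select trim(pgn.nspname) as schema, sum(b.mbytes) as mbytes
from (select db_id, id, name, sum(rows) as rows
      from stv_tbl_perm group by db_id, id, name) as a
  join pg_class as pgc on pgc.oid = a.id
  join pg_namespace as pgn on pgn.oid = pgc.relnamespace
  join (select tbl, count(*) as mbytes from stv_blocklist group by tbl) b
    on a.id = b.tbl
group by pgn.nspname
"""


def fetch_usage(cursor):
    """Run the usage query on a DB-API cursor; return (schema, used_mb) pairs."""
    cursor.execute(SCHEMA_USAGE_SQL)
    return [(schema, int(mbytes)) for schema, mbytes in cursor.fetchall()]


def find_over_quota(usage, quotas_mb):
    """Return one alert message per schema whose usage exceeds its limit.

    Schemas without an entry in quotas_mb are ignored.
    """
    alerts = []
    for schema, used_mb in usage:
        limit = quotas_mb.get(schema)
        if limit is not None and used_mb > limit:
            alerts.append(
                f"schema {schema} uses {used_mb} MB, over its {limit} MB limit"
            )
    return alerts
```

A cron job would open a connection with whichever PostgreSQL-compatible driver is in use, pass its cursor to `fetch_usage`, and send out whatever `find_over_quota` returns. Note that this only reports overuse; it does not stop anyone from writing more data.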

Answer 1: (score: 1)

This feature now exists in Redshift: Redshift Create Schema Docs

The relevant example from the documentation:

create schema us_sales authorization dwuser QUOTA 50 GB;