I'm trying to get the total size of each column in a table. It is basically a combination of these two queries:
SELECT pg_size_pretty(sum(pg_column_size(COLUMN_NAME))) FROM TABLE_NAME;
and
SELECT column_name FROM information_schema.columns WHERE table_schema = 'public' AND table_name = 'TABLE_NAME';
My first attempts were the following two queries:
=> SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM TABLE_NAME) FROM information_schema.columns WHERE table_schema = 'public' AND table_name = 'TABLE_NAME';
ERROR: column "columns.column_name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_siz...
^
=> SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM TABLE_NAME) FROM information_schema.columns WHERE table_schema = 'public' AND table_name = 'TABLE_NAME' GROUP BY column_name;
ERROR: more than one row returned by a subquery used as an expression
I also tried the following:
SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM TABLE_NAME) FROM information_schema.columns WHERE table_schema = 'public' AND table_name = 'TABLE_NAME' GROUP BY 1;
which returns:
ERROR: more than one row returned by a subquery used as an expression
When I add LIMIT 1, the result is incorrect:
SELECT column_name,
(SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM main_apirequest LIMIT 1)
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = 'main_apirequest'
GROUP BY 1;
It looks like this:
column_name | pg_size_pretty
------------------+----------------
api_key_id | 11 bytes
id | 3 bytes
...
It should look like this instead (which does not happen because of the LIMIT 1):
=> SELECT pg_size_pretty(sum(pg_column_size(id))) FROM main_apirequest;
pg_size_pretty
----------------
19 MB
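Answer 0 (score: 1)
Since you don't know the column names in advance but want to use them in the query, you have to use dynamic SQL. In your attempts, column_name inside the subquery is just a value read from information_schema.columns, not a reference to a column of your table, so pg_column_size measures the name itself rather than the data stored in that column. That is why you get figures like 11 bytes for api_key_id. Here is a simple example: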
CREATE TABLE t1 (id INTEGER, txt TEXT);
INSERT INTO t1
SELECT g, random()::TEXT
FROM generate_series(1, 10) g;
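The SQL that generates the query is then:
DO $$
DECLARE
    query TEXT;
BEGIN
    -- Build one aggregate expression per column; %I quotes the name as an identifier.
    SELECT 'SELECT ' || STRING_AGG(FORMAT('pg_size_pretty(sum(pg_column_size(%1$I))) AS %1$I', column_name), ', ') || ' FROM t1'
    INTO query
    FROM information_schema.columns
    WHERE table_schema = 'public'
    AND table_name = 't1';
    -- Only print the generated statement; nothing is executed here.
    RAISE NOTICE '%', query;
END $$;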
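The generated query is SELECT pg_size_pretty(sum(pg_column_size(id))) AS id, pg_size_pretty(sum(pg_column_size(txt))) AS txt FROM t1
It works the same way if you have hundreds of columns.
How to generate the query, run it, and return the result really depends on what you need. If you are happy to just print it to the screen, you can change it to something like this: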
DO $$
DECLARE
    query TEXT;
    result TEXT;
BEGIN
    -- Generate a query that returns a single string with one "name: size" line per column.
    SELECT 'SELECT CONCAT_WS(E''\n'', ' || STRING_AGG(FORMAT('''%1$s: '' || pg_size_pretty(sum(pg_column_size(%1$I)))', column_name), ', ') || ') FROM t1'
    INTO query
    FROM information_schema.columns
    WHERE table_schema = 'public'
    AND table_name = 't1';
    -- Run the generated query and print its single-row result.
    EXECUTE query
    INTO result;
    RAISE NOTICE '%', result;
END $$;
This prints:
id: 40 bytes
txt: 181 bytes
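As an aside for psql users only: the same generated statement can be run without a DO block by ending the generating query with the \gexec meta-command, which executes each value of the result as a new SQL statement. This is a minimal sketch that assumes psql 9.6 or later and the t1 table from above. Because \gexec is a client-side feature of psql, it is not available from application code or other clients.

```sql
-- Generate the per-column size query, then let psql execute the generated text.
-- \gexec replaces the terminating semicolon and is interpreted by psql, not by the server.
SELECT 'SELECT '
       || string_agg(format('pg_size_pretty(sum(pg_column_size(%1$I))) AS %1$I', column_name), ', ')
       || ' FROM t1'
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name = 't1'
\gexec
```

psql then shows the result of the generated query as an ordinary one-row result set, with one output column per table column.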
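If instead you want to return a record with multiple columns, I'm not quite sure how you would do that, since the number of columns and their names are unknown. The best I can think of is to return it as JSON: then only one thing is returned, and it contains a variable number of fields named after whatever the columns are called: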
CREATE OR REPLACE FUNCTION test1(_schema_name TEXT, _table_name TEXT)
RETURNS JSON AS
$$
DECLARE
    query TEXT;
    result JSON;
BEGIN
    -- Read from the table passed in as arguments (schema-qualified and quoted with %I)
    -- instead of a hard-coded t1.
    SELECT 'SELECT ROW_TO_JSON(cols) FROM (SELECT ' || STRING_AGG(FORMAT('pg_size_pretty(sum(pg_column_size(%1$I))) AS %1$I', column_name), ', ') || FORMAT(' FROM %I.%I) AS cols', _schema_name, _table_name)
    INTO query
    FROM information_schema.columns
    WHERE table_schema = _schema_name
    AND table_name = _table_name;
    -- Run the generated query; it yields exactly one JSON object.
    EXECUTE query
    INTO result;
    RETURN result;
END
$$
LANGUAGE plpgsql;
Run it: SELECT test1('public', 't1');
This returns: {"id":"40 bytes","txt":"181 bytes"}
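Another way around the unknown number of output columns is to return one row per column instead of one wide record. The following is only a sketch: the function name column_sizes and its output column names col_name and total_bytes are made up for illustration. It runs one dynamic query per column, so it scans the table once for every column, which may be slow on large tables.

```sql
CREATE OR REPLACE FUNCTION column_sizes(_schema_name TEXT, _table_name TEXT)
RETURNS TABLE (col_name TEXT, total_bytes BIGINT) AS
$$
DECLARE
    col RECORD;
BEGIN
    -- Loop over the table's columns in their declared order.
    FOR col IN
        SELECT c.column_name
        FROM information_schema.columns c
        WHERE c.table_schema = _schema_name
          AND c.table_name = _table_name
        ORDER BY c.ordinal_position
    LOOP
        col_name := col.column_name;
        -- %I quotes the column, schema and table names as identifiers.
        EXECUTE FORMAT('SELECT sum(pg_column_size(%I)) FROM %I.%I',
                       col.column_name, _schema_name, _table_name)
        INTO total_bytes;
        -- Emit the current (col_name, total_bytes) pair as one result row.
        RETURN NEXT;
    END LOOP;
END
$$
LANGUAGE plpgsql;
```

It can then be queried like a table, for example: SELECT col_name, pg_size_pretty(total_bytes) FROM column_sizes('public', 't1');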