Generating NULL counts by looping over the columns in Netezza

Asked: 2014-08-13 20:46:28

Tags: sql netezza

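I am trying to query a Netezza table to get, for every column, the number of rows in which that column is NULL. Specifically, suppose we have the following table (named merchants):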

business_name    phone                 email
-------------------------------------------------------
NULL             505-844-1234         john@example.com
Alibaba          NULL                 mary@domain.com
NULL             NULL                 harry@company.com

I would like to generate an output table such as:

column_name          NULL_count
-------------------------------
business_name          2
phone                  2
email                  0

I can produce this for one column at a time with:

select count(*)
from merchants
where <column_name> is null;

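However, my table has more than 100 columns, and I do not want to write the queries by hand. I know I could write Java/Python code to query the table programmatically, or even write a script that generates the ~100 queries. Still, the task does not seem complicated, and I think it should be achievable in pure SQL. The list of columns for any Netezza table can be obtained with: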

SELECT column_name 
FROM information_schema.columns 
WHERE LOWER(table_name) = 'merchants';

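I want to run the count(*) query above for each column in that list. I am having trouble working out the right join and/or how to use a stored procedure. So far I have tried adapting an existing example of a Netezza stored procedure, but I keep getting syntax errors even on the original code.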

TL;DR: How do I generate NULL counts for all columns of a table in Netezza?

1 Answer:

Answer 0: (score: 4)

Code for the sample table -

VARUNDBTEST.ADMIN(ADMIN)=> create table sample(business_name varchar(255), phone varchar(255), email varchar(255));
CREATE TABLE
VARUNDBTEST.ADMIN(ADMIN)=> insert into sample values(NULL,'505-844-1234','john@example.com');
INSERT 0 1
VARUNDBTEST.ADMIN(ADMIN)=> insert into sample values('Alibaba',NULL,'mary@domain.com');
INSERT 0 1
VARUNDBTEST.ADMIN(ADMIN)=> insert into sample values(null,NULL,'harry@company.com');
INSERT 0 1
VARUNDBTEST.ADMIN(ADMIN)=> select * from sample;
 BUSINESS_NAME |    PHONE     |       EMAIL
---------------+--------------+-------------------
 Alibaba       |              | mary@domain.com
               | 505-844-1234 | john@example.com
               |              | harry@company.com
(3 rows)

Reference table (the REFTABLE that the procedure returns) -

                   Table "TBL1"
  Attribute  |          Type          | Modifier | Default Value
-------------+------------------------+----------+---------------
 COLUMN_NAME | CHARACTER VARYING(255) |          |
 NULL_COUNT  | CHARACTER VARYING(255) |          |
Distributed on hash: "COLUMN_NAME"

Procedure code (getNullCount.sql) -

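CREATE OR REPLACE PROCEDURE getNullCount()
LANGUAGE NZPLSQL RETURNS REFTABLE(tbl1) AS

BEGIN_PROC

DECLARE
        p_abc     RECORD;   -- current column name from information_schema.columns
        p_bcd     RECORD;   -- NULL count returned for that column

BEGIN
        -- Outer loop: one iteration per column of SAMPLE.
        FOR p_abc IN
                SELECT column_name FROM information_schema.columns WHERE LOWER(table_name) = 'sample'
        LOOP
                -- Inner loop: run the dynamically built COUNT(*) query for this column.
                FOR p_bcd IN
                        execute 'SELECT COUNT(*) as col_null_count FROM SAMPLE WHERE '|| p_abc.column_name||' is null'
                LOOP
                        -- Store (column name, NULL count) in the procedure's result table.
                        execute immediate 'INSERT INTO '|| REFTABLENAME||' VALUES('||quote_literal(p_abc.column_name)||','|| p_bcd.col_null_count||')';
                END LOOP;

        END LOOP;

        return reftable;

END;
END_PROC;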
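For readers who want to try the nested-loop logic outside Netezza, here is a minimal Python sketch of the same idea. It is not the procedure above: it assumes an in-memory SQLite database in place of Netezza, uses `PRAGMA table_info` as a stand-in for `information_schema.columns`, and the helper name `null_counts` is hypothetical.

```python
import sqlite3


def null_counts(conn, table):
    """Return [(column_name, null_count), ...] for every column of `table`.

    Assumption: `table` is a trusted, simple identifier (no quoting needed).
    """
    # SQLite stand-in for the information_schema.columns lookup;
    # row[1] of each PRAGMA table_info row is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    results = []
    for column in columns:
        # One COUNT(*) query per column, mirroring the inner FOR ... EXECUTE loop.
        query = f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" IS NULL'
        (count,) = conn.execute(query).fetchone()
        results.append((column, count))
    return results


# Rebuild the three-row sample table from the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (business_name TEXT, phone TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [
        (None, "505-844-1234", "john@example.com"),
        ("Alibaba", None, "mary@domain.com"),
        (None, None, "harry@company.com"),
    ],
)

counts = null_counts(conn, "sample")
for column, count in counts:
    print(column, count)
```

As in the procedure, this issues one query per column, so a table with 100 columns is read 100 times.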

Creating and running the procedure -

VARUNDBTEST.ADMIN(ADMIN)=> \i getNullCount.sql
CREATE PROCEDURE


VARUNDBTEST.ADMIN(ADMIN)=> call getnullcount();
  COLUMN_NAME  | NULL_COUNT
---------------+------------
 BUSINESS_NAME | 2
 EMAIL         | 0
 PHONE         | 2
(3 rows)

Hope this helps.
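The procedure runs one COUNT(*) query per column. A possible alternative, not part of the original answer, relies on the standard SQL rule that COUNT(col) ignores NULLs, so COUNT(*) - COUNT(col) gives the NULL count and all columns can be handled in one statement. The sketch below only generates the query text; the function name `build_null_count_query` and the `_nulls` alias suffix are hypothetical, and it assumes plain column names that need no quoting.

```python
def build_null_count_query(table, columns):
    """Build one SELECT that returns a NULL count for each column.

    Assumption: `table` and `columns` are trusted, simple identifiers.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # COUNT(col) skips NULLs, so COUNT(*) - COUNT(col) is the number of NULLs.
    select_items = [
        f"COUNT(*) - COUNT({col}) AS {col}_nulls" for col in columns
    ]
    return "SELECT\n    " + ",\n    ".join(select_items) + f"\nFROM {table};"


query = build_null_count_query("sample", ["business_name", "phone", "email"])
print(query)
```

The generated statement returns a single wide row (one output column per source column) rather than the tall column_name / NULL_count layout, so it would need to be unpivoted if that exact shape is required.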