为什么空日期会导致交叉表功能失败?

时间:2012-04-03 23:02:45

标签: postgresql

我在postgres的交叉表函数中发现了一个奇怪的行为,我无法解释,但希望别人可以......

我使用的交叉表功能版本需要先构建一个初步表格。

此SQL成功创建了初步表:

SELECT 
    ST.studyabrv||' '||S.labid||' '||S.subjectid||' '||S.box::varchar||' '||S.well AS "rowname",
    M.marker AS "bucket", 
    G.allele1||' '||G.allele2 AS "bucket_value" 
INTO TABLE ct 
FROM 
    geno.gmarkers M, 
    geno.genotypes G, 
    geno.gsamples S, 
    geno.guploads U, 
    geno.gibg_studies ST 
WHERE 
    G.markers_id=M.id 
    AND G.gsamples_id=S.id 
    AND S.guploads_id=U.id 
    AND U.ibg_study_id=ST.id 
    AND ( M.id=5 OR M.id=6 OR M.id=2 OR M.id=4 OR M.id=3) 
    AND ( S.labid='CL100001' OR S.labid='CL100002' OR S.labid='CL100003' OR S.labid='CL100004' OR S.labid='CL100005' OR S.labid='CL100006' OR S.labid='CL100007' OR S.labid='CL100008' OR S.labid='CL100009' OR S.labid='CL100010' OR S.labid='CL100011' OR S.labid='CL100012' OR S.labid='CL100013' OR S.labid='CL100014' OR S.labid='CL100015') 
ORDER BY box,well;

产生的输出如下:

         rowname          |  bucket   | bucket_value 
--------------------------+-----------+--------------
 LTS CL100001 10011 1 A01 | 5HTTLPR-T | S La
 LTS CL100001 10011 1 A01 | 5HTTLPR-D | 14 16
 LTS CL100001 10011 1 A01 | DAT1      | 440 480
 LTS CL100001 10011 1 A01 | DRD4      | 475 475
 LTS CL100001 10011 1 A01 | Caspi     | 351 351
 LTS CL100009 10420 1 A02 | Caspi     |  
 LTS CL100009 10420 1 A02 | 5HTTLPR-T | La Lg
 LTS CL100009 10420 1 A02 | 5HTTLPR-D | 16 16
 LTS CL100009 10420 1 A02 | DAT1      | 440 480
 LTS CL100009 10420 1 A02 | DRD4      | 475 475
...

但是,如果我尝试包含一个全部为null的日期列,如:

SELECT 
    ST.studyabrv||' '||S.labid||' '||S.subjectid||' '||S.box::varchar||' '||S.well||' '||G.run_date::text AS "rowname", 
    M.marker AS "bucket", 
    G.allele1||' '||G.allele2 AS "bucket_value" 
INTO TABLE ct 
FROM 
    geno.gmarkers M, 
    geno.genotypes G, 
    geno.gsamples S, 
    geno.guploads U, 
    geno.gibg_studies ST 
WHERE 
    G.markers_id=M.id 
    AND G.gsamples_id=S.id 
    AND S.guploads_id=U.id 
    AND U.ibg_study_id=ST.id 
    AND ( M.id=5 OR M.id=6 OR M.id=2 OR M.id=4 OR M.id=3) 
    AND ( S.labid='CL100001' OR S.labid='CL100002' OR S.labid='CL100003' OR S.labid='CL100004' OR S.labid='CL100005' OR S.labid='CL100006' OR S.labid='CL100007' OR S.labid='CL100008' OR S.labid='CL100009' OR S.labid='CL100010' OR S.labid='CL100011' OR S.labid='CL100012' OR S.labid='CL100013' OR S.labid='CL100014' OR S.labid='CL100015') 
ORDER BY box,well;

这会产生输出:

 rowname |  bucket   | bucket_value 
---------+-----------+--------------
         | 5HTTLPR-T | S La
         | 5HTTLPR-D | 14 16
         | DAT1      | 440 480
         | DRD4      | 475 475
         | Caspi     | 351 351
         | Caspi     |  
         | 5HTTLPR-T | La Lg
         | 5HTTLPR-D | 16 16

如您所见,将run_date列添加到" rowname"的末尾。复合列渲染整个复合空白......这很疯狂。 如果我使用虚拟数据填充run_date,它将显示....但如果它为空或null,则会导致" rowname"空白。

我无法判断这是否是postgres中的错误,但如果可能的话,我想解决这个奇怪的结果。

TIA, rixter

2 个答案:

答案 0 :(得分:0)

|| coalesce(G.run_date, '')::text

答案 1 :(得分:0)

您应该将null视为unknown值。 null值不是数字或字符串,因此您无法像对待它们那样对它们进行操作。所以你应该确保使用一些会返回非空值的函数,例如coalesce(),它会从左到右返回第一个非null参数并强制默认值作为最右边的参数。