Varchar field has size 0 but is not listed in a DISTINCT SQL query

Time: 2016-01-10 13:16:49

Tags: sql postgresql

This is not directly programming-related, so it may be off-topic.

I have had this Postgres table for my photovoltaic generation data for a few years now. I ran a query like this:

WITH all_sources AS
(
    SELECT DISTINCT origin from solardata
    ORDER by origin
)
SELECT all_sources.origin, count(*) 
FROM solardata, all_sources
WHERE all_sources.origin = solardata.origin
GROUP BY all_sources.origin
ORDER BY all_sources.origin

and got this result:

                origin                | count  
--------------------------------------+--------
 kostal-log-parser: 10.1.log          |   5905
 kostal-log-parser: 5.5.log           |   6059
 kostal-log-parser: LogDaten_10_1.dat |   3474
 kostal-log-parser: LogDaten_5_5.dat  |   3369
 kostal-web-parser                    | 480869
 time-gridder                         |  18432
(6 rows)

On the other hand, if I run

select date_time, origin 
from solardata 
order by date_time limit 2;

I get

      date_time      | origin 
---------------------+--------
 2009-08-17 18:34:00 | 
 2009-08-17 18:34:00 | 

How is this possible? The empty origin never shows up in the DISTINCT-based result.

My Postgres version is 9.4.5.

Here is the solution. The cause was the inner join; what I needed was a left join.

WITH all_sources AS
(
    SELECT DISTINCT origin from solardata
    ORDER by origin
)
SELECT all_sources.origin, count(*) 
FROM solardata
LEFT JOIN all_sources on (all_sources.origin = solardata.origin)
GROUP BY all_sources.origin
ORDER BY all_sources.origin
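
The difference between the two joins can be reproduced on a tiny table. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made so it runs without a server), and the rows are made up for illustration rather than taken from the real solardata table.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE solardata (origin TEXT)")
conn.executemany(
    "INSERT INTO solardata (origin) VALUES (?)",
    [(None,), (None,), ("kostal-web-parser",), ("time-gridder",), ("time-gridder",)],
)

CTE = "WITH all_sources AS (SELECT DISTINCT origin FROM solardata) "

# Inner join: rows whose origin is NULL never satisfy a.origin = s.origin.
inner_counts = dict(conn.execute(
    CTE + "SELECT a.origin, count(*) FROM solardata s "
    "JOIN all_sources a ON a.origin = s.origin GROUP BY a.origin"
).fetchall())

# Left join: unmatched solardata rows are kept, with a.origin set to NULL.
left_counts = dict(conn.execute(
    CTE + "SELECT a.origin, count(*) FROM solardata s "
    "LEFT JOIN all_sources a ON a.origin = s.origin GROUP BY a.origin"
).fetchall())

print("inner join:", inner_counts)
print("left join: ", left_counts)
```

In this sketch the NULL rows appear only in the left-join result, where they form a group of their own under a None key.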

1 answer:

Answer 0: (score: 2)

The common table expression is redundant:

SELECT origin, count(*) 
from solardata 
GROUP BY origin 
ORDER BY origin; 

The problem with your query is this condition:

WHERE   all_sources.origin = solardata.origin

NULL = NULL => NULL (UNKNOWN), so rows whose origin is NULL never satisfy the condition and are skipped.
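
A quick way to see this three-valued logic in action is sketched below. It assumes Python's built-in sqlite3 module as a substitute for PostgreSQL; for these particular comparisons the two behave the same way.

```python
import sqlite3

# Assumption: SQLite is used here in place of PostgreSQL.
conn = sqlite3.connect(":memory:")

# NULL = NULL does not yield true; the result is NULL, which Python sees as None.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# An empty string is an ordinary value and does equal itself.
empty_eq_empty = conn.execute("SELECT '' = ''").fetchone()[0]

# WHERE keeps a row only if the condition is true, so a NULL condition drops it.
rows_kept = conn.execute("SELECT 1 WHERE NULL = NULL").fetchall()

print(null_eq_null, empty_eq_empty, rows_kept)
```

This is also why a zero-length varchar would have been counted by the original query, while a NULL origin is not.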

SqlFiddleDemo_GROUP_BY SqlFiddleDemo_Original

Output:

╔═════════╦═══════╗         ╔═════════╦═══════╗
║ origin  ║ count ║         ║ origin  ║ count ║
╠═════════╬═══════╣         ╠═════════╬═══════╣
║ (null)  ║     2 ║         ║         ║     2 ║
║         ║     2 ║   vs    ║ a       ║     1 ║
║ a       ║     1 ║         ║ b       ║     1 ║
║ b       ║     1 ║         ╚═════════╩═══════╝
╚═════════╩═══════╝

Note that the (null) group is missing from your version.

You should not use it (the CTE is redundant), but your query will also work if you change = to IS NOT DISTINCT FROM:

WITH all_sources AS
(
    SELECT DISTINCT origin from solardata
)
SELECT all_sources.origin, count(*) 
from solardata
JOIN all_sources
  ON all_sources.origin IS NOT DISTINCT FROM solardata.origin
GROUP BY all_sources.origin
ORDER BY all_sources.origin
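
The effect of a null-safe comparison can be checked with a small sketch. It rests on one loud assumption: it runs on SQLite through Python's sqlite3 module, and it uses SQLite's IS operator as the nearest counterpart of PostgreSQL's IS NOT DISTINCT FROM, because older SQLite builds do not accept the longer spelling. The sample rows mirror the groups in the output table above.

```python
import sqlite3

# Assumption: SQLite's "a IS b" plays the role of PostgreSQL's
# "a IS NOT DISTINCT FROM b"; both treat two NULLs as equal.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE solardata (origin TEXT)")
conn.executemany(
    "INSERT INTO solardata (origin) VALUES (?)",
    [(None,), (None,), ("",), ("",), ("a",), ("b",)],
)

# Join through the redundant CTE, but with a null-safe comparison.
null_safe_counts = dict(conn.execute(
    "WITH all_sources AS (SELECT DISTINCT origin FROM solardata) "
    "SELECT a.origin, count(*) FROM solardata s "
    "JOIN all_sources a ON a.origin IS s.origin "
    "GROUP BY a.origin"
).fetchall())

# The plain GROUP BY recommended in the answer, for comparison.
plain_counts = dict(conn.execute(
    "SELECT origin, count(*) FROM solardata GROUP BY origin"
).fetchall())

print(null_safe_counts)
print(plain_counts)
```

Both queries should report the same groups, including the NULL one, which supports the point that the CTE adds nothing beyond the plain GROUP BY.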