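This is not directly a programming question, so it may be off-topic.
I have had this postgres table for my photovoltaic generation data for several years. I ran a query like this:
WITH all_sources AS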
(
SELECT DISTINCT origin from solardata
ORDER by origin
)
SELECT all_sources.origin, count(*)
FROM solardata, all_sources
WHERE all_sources.origin = solardata.origin
GROUP BY all_sources.origin
ORDER BY all_sources.origin
and got this result:
origin | count
--------------------------------------+--------
kostal-log-parser: 10.1.log | 5905
kostal-log-parser: 5.5.log | 6059
kostal-log-parser: LogDaten_10_1.dat | 3474
kostal-log-parser: LogDaten_5_5.dat | 3369
kostal-web-parser | 480869
time-gridder | 18432
(6 rows)
But on the other hand, if I run
select date_time, origin
from solardata
order by date_time limit 2;
I get
date_time | origin
---------------------+--------
2009-08-17 18:34:00 |
2009-08-17 18:34:00 |
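How is that possible? The first result lists no group with an empty origin, yet the oldest rows have no origin at all.
My postgres version is 9.4.5.
Here is the solution. The cause was the inner join; what I needed was a left join.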
WITH all_sources AS
(
SELECT DISTINCT origin from solardata
ORDER by origin
)
SELECT all_sources.origin, count(*)
FROM solardata
LEFT JOIN all_sources on (all_sources.origin = solardata.origin)
GROUP BY all_sources.origin
ORDER BY all_sources.origin
Answer 0 (score: 2)
The common table expression is redundant:
SELECT origin, count(*)
from solardata
GROUP BY origin
ORDER BY origin;
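The problem with your query is this condition:
WHERE all_sources.origin = solardata.origin
NULL = NULL => NULL (UNKNOWN)
A WHERE clause keeps only rows for which the condition is true, so every row whose origin is NULL is skipped.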
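To make the comparison rule concrete, here is a minimal sketch that can be run in psql without touching solardata. The inline VALUES list and the column names a and b are made up for illustration only.

```sql
-- Compare three pairs of values with "=" and with IS NOT DISTINCT FROM.
-- The VALUES rows and the names a, b are illustrative, not from solardata.
SELECT a,
       b,
       a = b                    AS equals_result,       -- NULL when either side is NULL
       a IS NOT DISTINCT FROM b AS not_distinct_result  -- always true or false
FROM (VALUES ('x', 'x'),
             ('x', NULL),
             (NULL, NULL)) AS t(a, b);
```

For the (NULL, NULL) pair, equals_result is NULL while not_distinct_result is true, which is why a join or WHERE condition written with = drops those rows.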
SqlFiddleDemo_GROUP_BY
SqlFiddleDemo_Original
Output:
╔═════════╦═══════╗ ╔═════════╦═══════╗
║ origin ║ count ║ ║ origin ║ count ║
╠═════════╬═══════╣ ╠═════════╬═══════╣
║ (null) ║ 2 ║ ║ ║ 2 ║
║ ║ 2 ║ vs ║ a ║ 1 ║
║ a ║ 1 ║ ║ b ║ 1 ║
║ b ║ 1 ║ ╚═════════╩═══════╝
╚═════════╩═══════╝
Note that the (null) group is missing from your version.
You should not use it (the CTE is redundant), but your query would also work if you changed = to IS NOT DISTINCT FROM:
WITH all_sources AS
(
SELECT DISTINCT origin from solardata
)
SELECT all_sources.origin, count(*)
from solardata
JOIN all_sources
ON all_sources.origin IS NOT DISTINCT FROM solardata.origin
GROUP BY all_sources.origin
ORDER BY all_sources.origin