I have the following simple schema:
CREATE TABLE POSTS (
ID INT NOT NULL,
DATE DATE NOT NULL,
[Other stuff omitted]
);
CREATE TABLE TOPICS (
ID INT NOT NULL,
[Other stuff omitted]
);
CREATE TABLE THETA (
POST_ID INT NOT NULL,
TOPIC_ID INT NOT NULL,
WEIGHT FLOAT NOT NULL
);
I have a query that sums the weights in THETA across all posts, grouped by month (year and month of the post date) and topic ID:
SELECT THETA.TOPIC_ID as TopicID, POSTS.DATE as Date, SUM(THETA.WEIGHT) as Value
FROM POSTS INNER JOIN THETA
WHERE THETA.POST_ID=POSTS.ID
GROUP BY YEAR(POSTS.DATE), MONTH(POSTS.DATE), TopicID;
This works as expected and gives results like the following:
+---------+------------+---------------------+
| TopicID | Date | Value |
+---------+------------+---------------------+
| 0 | 2008-08-19 | 350.4930010139942 |
| 0 | 2008-09-18 | 1745.5010008439422 |
| 0 | 2008-10-03 | 1468.824001269415 |
| 0 | 2008-11-25 | 1079.579000659287 |
| 0 | 2008-12-11 | 1070.3860008455813 |
| 0 | 2009-01-24 | 1453.3730010837317 |
| 0 | 2009-02-20 | 1139.2920009773225 |
| 1 | 2008-08-19 | 288.09700035490096 |
| 1 | 2008-09-22 | 1307.5790000930429 |
| 1 | 2008-10-16 | 1050.1739999558777 |
| 1 | 2008-11-11 | 868.2280002105981 |
| 1 | 2008-12-18 | 897.6830000579357 |
| 1 | 2009-01-12 | 1148.5619999151677 |
| 1 | 2009-02-12 | 858.0710002686828 |
| 2 | 2008-08-19 | 415.83300026878715 |
...
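However, I would like to normalize Value by the number of posts in that month. For example, if there were 100 posts in the month of 2008-08-19 (August 2008), the first result row would have a value of 3.50493 and the 8th result row a value of 2.88097. The challenge is that the number of posts differs from month to month, so I'm not sure how to do this. Any ideas?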
Answer 0 (score: 1)
Maybe:
SELECT t.TOPIC_ID as TopicID, p.DATE as Date, SUM(t.WEIGHT)/s.Month_CT as Value
FROM POSTS p
JOIN THETA t
ON t.POST_ID = p.ID
JOIN (SELECT YEAR(DATE) as Yr, MONTH(DATE) as Mnth, COUNT(ID) as Month_CT
FROM POSTS
GROUP BY YEAR(DATE), MONTH(DATE)
)s
ON YEAR(p.DATE) = s.Yr
AND MONTH(p.DATE) = s.Mnth
GROUP BY YEAR(p.DATE), MONTH(p.DATE), TopicID;
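A minimal sketch for sanity-checking the idea above (join each post to a per-month post count, then divide the summed weight by that count) on toy data. It makes several assumptions that are not in the question: it runs on SQLite through Python's `sqlite3` module rather than MySQL, so `strftime('%Y-%m', ...)` stands in for `YEAR()`/`MONTH()`; the six posts and five THETA rows are invented for illustration; and the month count is added to the `GROUP BY` so the query does not rely on MySQL's lenient grouping.

```python
import sqlite3

# Toy data, invented for illustration: 4 posts in 2008-08, 2 posts in 2008-09.
# Post 4 has no THETA rows but still counts toward the August total.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE POSTS (ID INT NOT NULL, DATE DATE NOT NULL);
CREATE TABLE THETA (POST_ID INT NOT NULL, TOPIC_ID INT NOT NULL, WEIGHT FLOAT NOT NULL);
INSERT INTO POSTS VALUES
    (1, '2008-08-19'), (2, '2008-08-20'), (3, '2008-08-21'), (4, '2008-08-25'),
    (5, '2008-09-18'), (6, '2008-09-22');
INSERT INTO THETA VALUES
    (1, 0, 2.0), (2, 0, 4.0), (3, 1, 1.0), (5, 0, 3.0), (6, 1, 5.0);
""")

# Same shape as the answer's query: the subquery s holds one row per month
# with the number of posts in that month; each (month, topic) weight sum is
# divided by it. strftime('%Y-%m', ...) is the SQLite stand-in for YEAR/MONTH.
query = """
SELECT t.TOPIC_ID AS TopicID,
       s.Mnth AS Mnth,
       SUM(t.WEIGHT) / s.Month_CT AS Value
FROM POSTS p
JOIN THETA t
  ON t.POST_ID = p.ID
JOIN (SELECT strftime('%Y-%m', DATE) AS Mnth, COUNT(ID) AS Month_CT
      FROM POSTS
      GROUP BY strftime('%Y-%m', DATE)) s
  ON strftime('%Y-%m', p.DATE) = s.Mnth
GROUP BY s.Mnth, t.TOPIC_ID, s.Month_CT
ORDER BY t.TOPIC_ID, s.Mnth
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this toy data, topic 0 in 2008-08 has a weight sum of 6.0 over 4 posts, so its normalized value is 1.5. Note that the denominator counts every post in the month, including posts with no THETA rows; if only posts that appear in THETA should count, the subquery would need to join THETA as well.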
Answer 1 (score: 0)
SELECT t.topic_id AS TopicID,
CONCAT(mc.y, '-', mc.m, '-01') AS Date,
SUM(t.weight) / mc.cnt AS NormalizedValue
FROM (
-- one row per month with the number of posts in that month
SELECT YEAR(date) AS y,
MONTH(date) AS m,
COUNT(*) AS cnt
FROM posts
GROUP BY
y, m
) mc
JOIN posts p
-- match each post to its month with a range condition on the date
ON p.date >= '0000-01-01' + INTERVAL mc.y YEAR + INTERVAL mc.m - 1 MONTH
AND p.date < '0000-01-01' + INTERVAL mc.y YEAR + INTERVAL mc.m MONTH
JOIN theta t
ON t.post_id = p.id
GROUP BY
mc.y, mc.m, t.topic_id