MySQL - Getting results from complex criteria

Time: 2015-07-23 16:55:30

Tags: mysql sql

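My database holds statistics for many websites. The problem I'm facing is a fairly complex query, and I don't know how to write it (or whether it is even possible).

I have 2 tables: websites and visits. The former is a list of all websites and their properties, while the latter is a list of each user's visits to a particular website.

The program I'm building should pick out the websites that need to be "scanned". The interval between scans for each site depends on the site's total number of visits over the past 30 days. Here is a table with the expected scan intervals: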

[Image: table of scan intervals by visit count]

The tables have the following structure:

websites [Image: structure of the websites table]

visits [Image: structure of the visits table]

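What I want is a query that returns the websites that are past their respective update deadlines (which can be determined from the last_scanned column).

Is this easily achievable in a single query?

3 answers:

Answer 0: (score: 1)

Here is something you can try: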

SELECT main.* 
FROM ( 
  SELECT
    w.web_id,
    w.url,
    w.last_scanned,
    (SELECT COUNT(*)
     FROM visits v
     WHERE v.web_id = w.web_id
       AND TIMESTAMPDIFF(DAY,v.added_on, NOW()) <=30
    ) AS visit_count,
    TIMESTAMPDIFF(HOUR,w.last_scanned, NOW()) AS hrs_since_update
  FROM websites w
  ) main
WHERE
  (CASE 
    WHEN visit_count >= 0 AND visit_count <= 10 AND hrs_since_update >= 4320 THEN 1
    WHEN visit_count >= 11 AND visit_count <= 100 AND hrs_since_update >= 2160 THEN 1
    WHEN visit_count >= 101 AND visit_count <= 500 AND hrs_since_update >= 1080 THEN 1
    WHEN visit_count >= 501 AND visit_count <= 1000 AND hrs_since_update >= 720 THEN 1
    WHEN visit_count >= 1001 AND visit_count <= 2000 AND hrs_since_update >= 360 THEN 1
    WHEN visit_count >= 2001 AND visit_count <= 5000 AND hrs_since_update >= 168 THEN 1
    WHEN visit_count >= 5001 AND visit_count <= 10000 AND hrs_since_update >= 72 THEN 1
    WHEN visit_count >= 10001 AND hrs_since_update >= 24 THEN 1
    ELSE 0 
  END) = 1;

Here is a fiddle demo: http://sqlfiddle.com/#!9/1f671/1
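A side note on the visits filter in the query above: wrapping added_on in TIMESTAMPDIFF keeps MySQL from using an index range scan on that column. The sketch below shows the same count written as a plain range comparison. The web_id value and the idea of an index on (web_id, added_on) are assumptions for illustration, not something stated in the question. The boundary also differs slightly: TIMESTAMPDIFF(DAY, ...) <= 30 still counts visits that are up to just under 31 days old, while the range form counts exactly the last 30 days.

```sql
-- Count the last 30 days of visits for one site using a range
-- comparison on the bare column, so that an index such as
-- (web_id, added_on) could be used if one exists (assumed, not given).
SELECT COUNT(*) AS visit_count
FROM visits v
WHERE v.web_id = 1                              -- hypothetical site id
  AND v.added_on >= NOW() - INTERVAL 30 DAY;
```

The same condition can replace the TIMESTAMPDIFF line inside the correlated subquery without changing the rest of the query.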

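Answer 1: (score: 0)

First, I would create a subquery that gets the number of visits for each distinct web_id from the visits table. Then LEFT OUTER JOIN the websites table to that subquery. You can then check the result against every possible condition in the visits/update-frequency table, like this: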

SELECT websites.*
FROM websites
  LEFT OUTER JOIN (
    -- Count only the visits from the past 30 days, as the question requires
    SELECT visits.web_id, COUNT(*) AS visits_count
    FROM visits
    WHERE visits.added_on >= NOW() - INTERVAL 30 DAY
    GROUP BY visits.web_id
  ) v ON v.web_id = websites.web_id
  WHERE
    -- COALESCE treats sites with no recent visits (NULL from the LEFT JOIN) as 0
    (COALESCE(v.visits_count, 0) <= 10 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 4320 HOUR)) OR
    (v.visits_count BETWEEN 11 AND 100 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 2160 HOUR)) OR
    (v.visits_count BETWEEN 101 AND 500 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 1080 HOUR)) OR
    (v.visits_count BETWEEN 501 AND 1000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 720 HOUR)) OR
    (v.visits_count BETWEEN 1001 AND 2000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 360 HOUR)) OR
    (v.visits_count BETWEEN 2001 AND 5000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 168 HOUR)) OR
    (v.visits_count BETWEEN 5001 AND 10000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 72 HOUR)) OR
    (v.visits_count > 10000 AND websites.last_scanned <= DATE_SUB(NOW(), INTERVAL 24 HOUR));

Answer 2: (score: 0)

Just an improvement on @morgb's query, using a table for the visit-count ranges

SQL FIDDLE DEMO

create table visitCount (
  `min` bigint(20),
  `max` bigint(20),
  `frequency` bigint(20)
);


SELECT main.*
FROM ( 
  SELECT
    w.web_id,
    w.url,
    w.last_scanned,
    (SELECT COUNT(*)
     FROM visits v
     WHERE v.web_id = w.web_id
       AND TIMESTAMPDIFF(DAY,v.added_on, NOW()) <=30
    ) AS visit_count,
    TIMESTAMPDIFF(HOUR,w.last_scanned, NOW()) AS hrs_since_update
  FROM websites w
  ) main inner join
  visitCount v on visit_count between v.min and v.max
WHERE 
    main.hrs_since_update > v.frequency
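
The visitCount table needs one row per range before the join can match anything. A minimal sketch of how it might be filled, assuming frequency is stored in hours and reusing the ranges and thresholds from the CASE expression in answer 0. The upper bound of the last range is an assumption: the largest signed BIGINT, standing in for "no upper limit".

```sql
-- Assumed contents: ranges and hour thresholds copied from the
-- CASE expression in answer 0; frequency is taken to be in hours.
INSERT INTO visitCount (`min`, `max`, `frequency`) VALUES
  (0,     10,    4320),
  (11,    100,   2160),
  (101,   500,   1080),
  (501,   1000,  720),
  (1001,  2000,  360),
  (2001,  5000,  168),
  (5001,  10000, 72),
  (10001, 9223372036854775807, 24);  -- open-ended top range (max signed BIGINT)
```

Keeping the ranges in a table means an interval can be changed with an UPDATE rather than by editing the query. Note that this query compares with >, whereas the other answers use >=, so a site is returned only once it is strictly past its interval.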