Using MySQL to calculate IP address variation among accounts where the IP was sighted

Time: 2016-01-03 10:03:03

Tags: mysql

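In a previous question, the awesome amdixon was able to come up with a query that calculates the repeat level of IPs.

I have adapted it to look at a specific account by using WHERE earning_account_id = ?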
SELECT MAX(repeat_factor)
FROM
(
SELECT earning_ip, count(*) / rc.row_count AS repeat_factor
FROM earnings
CROSS JOIN (SELECT count(*) AS row_count FROM earnings WHERE earning_account_id = ?) rc
WHERE earning_account_id = ?
GROUP BY earning_ip
) q

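Now, however, I want to add an extra level of security.

I want to apply the same kind of query, but instead of restricting it to one earning_account_id, I want to restrict it to the group of all accounts that have used a specific IP address.

That way I can better detect proxy spam across the board if someone is using multiple alt accounts.

Note that I will no longer be restricting the query with WHERE earning_account_id = ?

In other words, if the ip_address is "45.55.80.86":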

+--------------------+-------------+-------------------------------------+
| earning_account_id | earning_ip  | select row for repeat_factor query? |
+--------------------+-------------+-------------------------------------+
|                  1 | 45.55.80.86 | YES                                 |
|                  1 | 45.55.80.86 | YES                                 |
|                  2 | 1.22.83.65  | NO                                  |
|                  2 | 91.15.76.37 | NO                                  |
|                  3 | 45.55.80.86 | YES                                 |
|                  4 | 61.25.76.37 | YES                                 |
|                  4 | 1.22.83.65  | YES                                 |
|                  4 | 45.55.80.86 | YES                                 |
|                  5 | 61.25.76.37 | NO                                  |
+--------------------+-------------+-------------------------------------+

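The value to return would be the repeat_factor of this IP across all earnings, ignoring every account that has never used this IP address.

In other words, what I am trying to find out is:

"How much is the IP address repeated across all accounts, looking only at accounts that have seen this IP address?"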
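To make the target value concrete, here is a minimal plain-Python sketch of one reading of the question: count how often the IP appears among the rows marked YES and divide by the number of those rows. The names `EARNINGS` and `repeat_factor_for_ip` are hypothetical and exist only for this illustration; the data is the sample table above.

```python
from fractions import Fraction

# Sample rows from the table above: (earning_account_id, earning_ip).
EARNINGS = [
    (1, "45.55.80.86"),
    (1, "45.55.80.86"),
    (2, "1.22.83.65"),
    (2, "91.15.76.37"),
    (3, "45.55.80.86"),
    (4, "61.25.76.37"),
    (4, "1.22.83.65"),
    (4, "45.55.80.86"),
    (5, "61.25.76.37"),
]


def repeat_factor_for_ip(rows, ip):
    """Share of rows with `ip` among all rows of accounts that ever used `ip`."""
    # Accounts that have seen the IP at least once.
    accounts = {account for account, row_ip in rows if row_ip == ip}
    # Every row of those accounts is a "YES" row, whatever its IP.
    selected = [row_ip for account, row_ip in rows if account in accounts]
    if not selected:
        return None  # the IP never appears, so there is nothing to divide by
    hits = sum(1 for row_ip in selected if row_ip == ip)
    return Fraction(hits, len(selected))


print(repeat_factor_for_ip(EARNINGS, "45.55.80.86"))  # 2/3
```

For the sample data this selects 6 rows (accounts 1, 3 and 4), 4 of which carry the IP, so the ratio is 4/6.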

2 answers:

Answer 0: (score: 1)

Update

Building on the idea from How to get multiple counts with one SQL query? and on @SteveChambers's answer, we can simplify this further.

SELECT sum(CASE WHEN earning_ip = ? THEN 1 ELSE 0 END) / count(*)
FROM earnings WHERE earning_account_id IN (
    SELECT DISTINCT earning_account_id FROM earnings WHERE earning_ip = ?
)

This also returns 0.6667 for the example IP 45.55.80.86.
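The simplified query can be checked against the sample rows with a small script. This is only a sketch: it uses Python's built-in `sqlite3` module rather than MySQL, and because SQLite divides integers as integers, the query is prefixed with `1.0 *` (MySQL's `/` does not need this). The helper name `simplified_repeat_factor` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (earning_account_id, earning_ip).
ROWS = [
    (1, "45.55.80.86"), (1, "45.55.80.86"),
    (2, "1.22.83.65"), (2, "91.15.76.37"),
    (3, "45.55.80.86"),
    (4, "61.25.76.37"), (4, "1.22.83.65"), (4, "45.55.80.86"),
    (5, "61.25.76.37"),
]

# Same query as above; "1.0 *" is an SQLite-only adjustment that forces
# floating-point division.
QUERY = """
SELECT 1.0 * sum(CASE WHEN earning_ip = ? THEN 1 ELSE 0 END) / count(*)
FROM earnings WHERE earning_account_id IN (
    SELECT DISTINCT earning_account_id FROM earnings WHERE earning_ip = ?
)
"""


def simplified_repeat_factor(ip):
    """Run the simplified query on an in-memory copy of the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE earnings (earning_account_id INTEGER, earning_ip TEXT)"
    )
    conn.executemany("INSERT INTO earnings VALUES (?, ?)", ROWS)
    (value,) = conn.execute(QUERY, (ip, ip)).fetchone()
    conn.close()
    return value  # None when no account has ever used the IP


print(round(simplified_repeat_factor("45.55.80.86"), 4))  # 0.6667
```

Both placeholders are bound to the same IP, once for the CASE expression and once for the subquery.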

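I am leaving the original answer here because parts of it may be useful for other queries.

Original answer

Modifying the subquery and working step by step, the following returns the account IDs that have used a given IP.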

SELECT earning_account_id
FROM earnings WHERE earning_ip = ?
GROUP BY earning_account_id

For the example with IP 45.55.80.86, the query returns 1, 3, 4.

Next, count the occurrences of the given IP per account ID.

SELECT earning_account_id, count(earning_ip) AS occurrence
FROM earnings
WHERE earning_account_id IN (
    SELECT earning_account_id
    FROM earnings WHERE earning_ip = ?
    GROUP BY earning_account_id
) AND earning_ip = ?
GROUP BY earning_account_id

For the example, this returns 1 => 2, 3 => 1, 4 => 1.

Then also count all IP rows for those IDs and join that with the previous result.

SELECT e.earning_account_id, count(e.earning_account_id) AS ip_count, o.occurrence
FROM earnings e
CROSS JOIN (
    SELECT earning_account_id, count(earning_ip) AS occurrence FROM earnings
    WHERE earning_account_id IN (
        SELECT earning_account_id FROM earnings WHERE earning_ip = ?
        GROUP BY earning_account_id
    ) AND earning_ip = ?
    GROUP BY earning_account_id
) o
WHERE e.earning_account_id = o.earning_account_id
GROUP BY e.earning_account_id

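For the example, the total number of IP rows per account is 1 => 2, 3 => 1, 4 => 3.

Finally, divide the sum of all occurrences by the sum of all IP rows in this subset of rows.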

SELECT sum(q.occurrence) / sum(q.ip_count) FROM (
    SELECT e.earning_account_id, count(e.earning_account_id) AS ip_count, o.occurrence
    FROM earnings e
    CROSS JOIN (
        SELECT earning_account_id, count(earning_ip) AS occurrence FROM earnings
        WHERE earning_account_id IN (
            SELECT earning_account_id FROM earnings WHERE earning_ip = ?
            GROUP BY earning_account_id
        ) AND earning_ip = ?
        GROUP BY earning_account_id
    ) o
    WHERE e.earning_account_id = o.earning_account_id
    GROUP BY e.earning_account_id
) q

For the example, this returns 0.6667, which corresponds to the 4 occurrences among the 6 rows marked YES.
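The intermediate numbers quoted in the steps above can be reproduced with a short script. As a sketch only, it runs the per-account query on SQLite through Python's `sqlite3` module instead of MySQL; the `ORDER BY` clause is added here just to make the printed order stable, and `per_account_counts` is a hypothetical helper name.

```python
import sqlite3

# Sample rows from the question: (earning_account_id, earning_ip).
ROWS = [
    (1, "45.55.80.86"), (1, "45.55.80.86"),
    (2, "1.22.83.65"), (2, "91.15.76.37"),
    (3, "45.55.80.86"),
    (4, "61.25.76.37"), (4, "1.22.83.65"), (4, "45.55.80.86"),
    (5, "61.25.76.37"),
]

# Third step of the answer: rows per account next to occurrences of the IP.
PER_ACCOUNT = """
SELECT e.earning_account_id, count(e.earning_account_id) AS ip_count, o.occurrence
FROM earnings e
CROSS JOIN (
    SELECT earning_account_id, count(earning_ip) AS occurrence FROM earnings
    WHERE earning_account_id IN (
        SELECT earning_account_id FROM earnings WHERE earning_ip = ?
        GROUP BY earning_account_id
    ) AND earning_ip = ?
    GROUP BY earning_account_id
) o
WHERE e.earning_account_id = o.earning_account_id
GROUP BY e.earning_account_id
ORDER BY e.earning_account_id
"""


def per_account_counts(ip):
    """Return (account, ip_count, occurrence) tuples for accounts that used `ip`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE earnings (earning_account_id INTEGER, earning_ip TEXT)"
    )
    conn.executemany("INSERT INTO earnings VALUES (?, ?)", ROWS)
    rows = conn.execute(PER_ACCOUNT, (ip, ip)).fetchall()
    conn.close()
    return rows


counts = per_account_counts("45.55.80.86")
for account, ip_count, occurrence in counts:
    print(account, ip_count, occurrence)

# Final step: sum of occurrences divided by sum of row counts.
total_occurrence = sum(occurrence for _, _, occurrence in counts)
total_rows = sum(ip_count for _, ip_count, _ in counts)
print(total_occurrence, total_rows)  # 4 6
```

Dividing the two totals gives 4/6, the same 0.6667 that the final query reports.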

Answer 1: (score: 1)

The rows to select can be obtained simply:

select e.*
from example e
join 
(select distinct earning_account_id
 from example
 where ip = '45.55.80.86') subq
on e.earning_account_id = subq.earning_account_id;

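At this point, if this were a SQL Server database, you could simply wrap this in a common table expression (CTE) and use its alias in place of the two references to the table name in amdixon's query. Unfortunately MySQL doesn't provide such a luxury, so we are limited to subqueries, each of which must have a unique alias. It is a bit ugly, but it does the job: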

select max(repeat_factor)
from
(
select ip, count(*) / rc.row_count as repeat_factor
from
(select e.*
 from example e
 join 
 (select distinct earning_account_id
  from example
  where ip = '45.55.80.86') subq
 on e.earning_account_id = subq.earning_account_id) cte1
cross join ( select count(*) as row_count from 
(select e.*
 from example e
 join 
 (select distinct earning_account_id
  from example
  where ip = '45.55.80.86') subq
 on e.earning_account_id = subq.earning_account_id) cte2
) rc
group by ip
) q;
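For comparison, here is a sketch of what the CTE form described above could look like. MySQL added `WITH` in version 8.0, after this answer was written; the script below runs on SQLite through Python's `sqlite3` module and has not been verified against MySQL. The `1.0 *` factor is an SQLite-only adjustment to avoid integer division, and the names `subset` and `max_repeat_factor` are hypothetical. The table `example` and column `ip` follow the naming used in this answer.

```python
import sqlite3

# Sample rows from the question: (earning_account_id, ip).
ROWS = [
    (1, "45.55.80.86"), (1, "45.55.80.86"),
    (2, "1.22.83.65"), (2, "91.15.76.37"),
    (3, "45.55.80.86"),
    (4, "61.25.76.37"), (4, "1.22.83.65"), (4, "45.55.80.86"),
    (5, "61.25.76.37"),
]

# The row subset is written once as a CTE and referenced twice,
# replacing the duplicated cte1/cte2 subqueries.
CTE_QUERY = """
WITH subset AS (
    SELECT e.*
    FROM example e
    JOIN (SELECT DISTINCT earning_account_id
          FROM example
          WHERE ip = ?) subq
      ON e.earning_account_id = subq.earning_account_id
)
SELECT max(repeat_factor)
FROM (
    SELECT ip, 1.0 * count(*) / rc.row_count AS repeat_factor
    FROM subset
    CROSS JOIN (SELECT count(*) AS row_count FROM subset) rc
    GROUP BY ip
) q
"""


def max_repeat_factor(ip):
    """Highest per-IP share among rows of accounts that have used `ip`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE example (earning_account_id INTEGER, ip TEXT)")
    conn.executemany("INSERT INTO example VALUES (?, ?)", ROWS)
    (value,) = conn.execute(CTE_QUERY, (ip,)).fetchone()
    conn.close()
    return value  # None when no account has ever used the IP


print(round(max_repeat_factor("45.55.80.86"), 4))  # 0.6667
```

Note that this query returns the largest share of any IP within the selected rows, which coincides with the share of 45.55.80.86 in the sample data but need not do so in general.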

See the SQL Fiddle Demo