Question

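I have a set of tables that I collect, and I am trying to count the distinct IP addresses (ip) shared between two different blacklists (typeId), essentially intersecting the table with itself. However, my query that joins the table to itself gives strange results.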

sqlite> .schema
CREATE TABLE feedz_ip2 (fileId INTEGER NOT NULL, ip NUMERIC, utime INT, typeId INTEGER);
CREATE TABLE feedz_ip_types (typeId INTEGER PRIMARY KEY, type STRING UNIQUE);
CREATE INDEX ip_id ON feedz_ip2(ip);
CREATE INDEX types_id ON feedz_ip2(typeId);
sqlite> select * from feedz_ip2 limit 4;
1|86176256|1347929568|2
1|247463936|1347929568|2
1|247476224|1347929568|2
1|247478272|1347929568|2
sqlite> select * from feedz_ip_types;

1 | malwaredomains
2 | spamhaus
3 | badipset
4 | abuse.ch
5 | malwarepatrol

sqlite> select a.typeId, b.typeId, count(a.ip) 
          from feedz_ip2 a 
    inner join feedz_ip2 b on a.typeId != b.typeId and a.ip=b.ip 
      ;

5 | 3 | 9265512

What I am looking for is the intersection of every pair of different lists, something like:

1|3|200
1|5|900
2|3|300

If two lists have no intersection (no IP addresses in common), that combination would simply not be listed.

I don't know whether the query is really confusing sqlite, or whether I am the one who is confused.

Answer 1

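OK, found the solution: the query was missing a GROUP BY clause. Without it, the aggregate collapses all joined rows into a single output row, which is why only one line came back.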

select a.typeId, b.typeId, count(*)
          from feedz_ip2 a
    inner join feedz_ip2 b on a.typeId != b.typeId and a.ip=b.ip
      group by a.typeId, b.typeId;
+--------+--------+----------+
| typeId | typeId | count(*) |
+--------+--------+----------+
|      1 |      3 |   471718 | 
|      1 |      4 |  3662405 | 
|      1 |      5 |   323609 | 
|      2 |      3 |      426 | 
|      3 |      1 |   471718 | 
|      3 |      2 |      426 | 
|      3 |      4 |   133002 | 
|      3 |      5 |    41596 | 
|      4 |      1 |  3662405 | 
|      4 |      3 |   133002 | 
|      5 |      1 |   323609 | 
|      5 |      3 |    41596 | 
+--------+--------+----------+
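
For anyone who wants to reproduce this, here is a minimal sketch using Python's built-in sqlite3 module and a handful of made-up rows (the IPs and file ids below are invented for illustration, not taken from the real feeds). It runs the original ungrouped query and the grouped one side by side. It also tries a variant that is only an assumption about what the question is after: a.typeId < b.typeId so each pair of lists shows up once instead of twice, and COUNT(DISTINCT a.ip) so an IP that appears several times in one list is counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedz_ip2 "
    "(fileId INTEGER NOT NULL, ip NUMERIC, utime INT, typeId INTEGER)"
)

# Hypothetical sample data: (fileId, ip, utime, typeId)
rows = [
    (1, 100, 0, 1), (1, 101, 0, 1), (1, 102, 0, 1),  # list 1
    (2, 100, 0, 3), (2, 101, 0, 3),                  # list 3
    (4, 100, 0, 3),                                  # ip 100 again in list 3
    (3, 102, 0, 5),                                  # list 5
    (5, 999, 0, 2),                                  # list 2 shares nothing
]
conn.executemany("INSERT INTO feedz_ip2 VALUES (?, ?, ?, ?)", rows)

# Original query: an aggregate with no GROUP BY yields exactly one row.
ungrouped = conn.execute("""
    SELECT a.typeId, b.typeId, COUNT(a.ip)
      FROM feedz_ip2 a
      JOIN feedz_ip2 b ON a.typeId != b.typeId AND a.ip = b.ip
""").fetchall()

# Query from the answer: one row per ordered (a, b) pair of lists.
grouped = conn.execute("""
    SELECT a.typeId, b.typeId, COUNT(*)
      FROM feedz_ip2 a
      JOIN feedz_ip2 b ON a.typeId != b.typeId AND a.ip = b.ip
     GROUP BY a.typeId, b.typeId
     ORDER BY a.typeId, b.typeId
""").fetchall()

# Variant (assumption): each unordered pair once, distinct IPs only.
pairs = conn.execute("""
    SELECT a.typeId, b.typeId, COUNT(DISTINCT a.ip)
      FROM feedz_ip2 a
      JOIN feedz_ip2 b ON a.typeId < b.typeId AND a.ip = b.ip
     GROUP BY a.typeId, b.typeId
     ORDER BY a.typeId, b.typeId
""").fetchall()

print(len(ungrouped), ungrouped[0][2])  # 1 8
print(grouped)  # [(1, 3, 3), (1, 5, 1), (3, 1, 3), (5, 1, 1)]
print(pairs)    # [(1, 3, 2), (1, 5, 1)]
```

In the grouped output every pair appears in both directions with the same count, just like the mirrored rows in the table above (1|3 and 3|1, and so on). The count for lists 1 and 3 is 3 rather than 2 because ip 100 is stored twice under list 3; the DISTINCT variant reports 2.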

