How can I select samples whose value is NULL in every row of the same table?

Asked: 2013-08-15 18:00:30

Tags: sql select sqlite

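I have these three tables (schemas attached below). After the schemas there is a link to sample data from the table "virustotalscans". That table has a column named "virustotal": each unique sample has its own number, e.g. 165, the next sample has 166, and so on.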

VIRUSTOTALS

CREATE TABLE virustotals (
                            virustotal INTEGER PRIMARY KEY,
                            virustotal_md5_hash TEXT NOT NULL,
                            virustotal_timestamp INTEGER NOT NULL,
                            virustotal_permalink TEXT NOT NULL
                    );
CREATE INDEX virustotals_md5_hash_idx
                    ON virustotals (virustotal_md5_hash);

VIRUSTOTALSCANS

CREATE TABLE virustotalscans (
                    virustotalscan INTEGER PRIMARY KEY,
                    virustotal INTEGER NOT NULL,
                    virustotalscan_scanner TEXT NOT NULL,
                    virustotalscan_result TEXT
            );
CREATE INDEX virustotalscans_result_idx
                    ON virustotalscans (virustotalscan_result);
CREATE INDEX virustotalscans_scanner_idx
                    ON virustotalscans (virustotalscan_scanner);
CREATE INDEX virustotalscans_virustotal_idx
                    ON virustotalscans (virustotal);

DOWNLOADS

CREATE TABLE downloads (
                            download INTEGER PRIMARY KEY,
                            connection INTEGER,
                            download_url TEXT,
                            download_md5_hash TEXT
                            -- CONSTRAINT downloads_connection_fkey FOREIGN KEY (connection) REFERENCES connections (connection)
                    );
CREATE INDEX downloads_connection_idx   ON downloads (connection);
CREATE INDEX downloads_md5_hash_idx
                    ON downloads (download_md5_hash);
CREATE INDEX downloads_url_idx
                    ON downloads (download_url);

Sample data from the table "virustotalscans": http://pastebin.com/7E7McZwT

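Now I need to select all samples for which the "virustotalscan_result" column is NULL in every row. In other words, I need all samples that no antivirus on VirusTotal detected. I tried this query: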

select distinct downloads.download_md5_hash from virustotalscans, virustotals, 
   downloads 
where downloads.download_md5_hash = virustotals.virustotal_md5_hash and 
   virustotals.virustotal = virustotalscans.virustotal and 
   virustotalscans.virustotalscan_result IS NULL;

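But I get the MD5 hashes of all samples. The likely reason is that every sample has at least one row where the result is NULL. That makes sense, because there is always some antivirus that fails to detect a given sample.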
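To illustrate the suspected cause, here is a minimal sketch using Python's sqlite3 module with an in-memory database. The rows, hashes (`hash_a`, `hash_b`) and scanner names are made up for illustration, and the tables are trimmed to the columns the query touches. Sample 1 is detected by one scanner and sample 2 by none, yet the row-level `IS NULL` filter returns both.

```python
import sqlite3

# In-memory database with trimmed versions of the three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE virustotals (
        virustotal INTEGER PRIMARY KEY,
        virustotal_md5_hash TEXT NOT NULL
    );
    CREATE TABLE virustotalscans (
        virustotalscan INTEGER PRIMARY KEY,
        virustotal INTEGER NOT NULL,
        virustotalscan_scanner TEXT NOT NULL,
        virustotalscan_result TEXT
    );
    CREATE TABLE downloads (
        download INTEGER PRIMARY KEY,
        download_md5_hash TEXT
    );
""")

# Hypothetical data: sample 1 has one detection, sample 2 has none.
conn.executemany("INSERT INTO virustotals VALUES (?, ?)",
                 [(1, "hash_a"), (2, "hash_b")])
conn.executemany("INSERT INTO downloads VALUES (?, ?)",
                 [(1, "hash_a"), (2, "hash_b")])
conn.executemany(
    "INSERT INTO virustotalscans "
    "(virustotal, virustotalscan_scanner, virustotalscan_result) "
    "VALUES (?, ?, ?)",
    [(1, "ScannerX", "Trojan.Generic"), (1, "ScannerY", None),
     (2, "ScannerX", None), (2, "ScannerY", None)],
)

# The query from the question: it keeps any sample that has at least
# one NULL row, not only samples where every row is NULL.
attempt = sorted(row[0] for row in conn.execute("""
    SELECT DISTINCT downloads.download_md5_hash
    FROM virustotalscans, virustotals, downloads
    WHERE downloads.download_md5_hash = virustotals.virustotal_md5_hash
      AND virustotals.virustotal = virustotalscans.virustotal
      AND virustotalscans.virustotalscan_result IS NULL
"""))
print(attempt)  # both hashes, including the detected sample
```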

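A better example: http://pastebin.com/y81DPpmQ. Here I need to select the sample numbers (column virustotal) for which every row has NULL in the column virustotalscan_result. In that example it would be, for instance, number 2.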

Can you help me?

Thank you very much for your replies.

1 Answer:

Answer 0: (score: 0)

SELECT download_md5_hash
FROM downloads
JOIN virustotals ON download_md5_hash = virustotal_md5_hash
WHERE virustotal IN (SELECT virustotal
                     FROM virustotalscans
                     GROUP BY virustotal
                     HAVING COUNT(virustotalscan_result) = 0)
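
The subquery relies on the fact that COUNT(column) counts only non-NULL values, so a group in which virustotalscan_result is NULL in every row has a count of 0. Below is a minimal sketch that runs the query through Python's sqlite3 module against an in-memory database. The rows, hashes (`hash_a`, `hash_b`) and scanner names are assumptions made up for illustration, and the tables are trimmed to the columns the query uses.

```python
import sqlite3

# In-memory database with trimmed versions of the three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE virustotals (
        virustotal INTEGER PRIMARY KEY,
        virustotal_md5_hash TEXT NOT NULL
    );
    CREATE TABLE virustotalscans (
        virustotalscan INTEGER PRIMARY KEY,
        virustotal INTEGER NOT NULL,
        virustotalscan_scanner TEXT NOT NULL,
        virustotalscan_result TEXT
    );
    CREATE TABLE downloads (
        download INTEGER PRIMARY KEY,
        download_md5_hash TEXT
    );
""")

# Hypothetical data: sample 1 has one detection, sample 2 has none.
conn.executemany("INSERT INTO virustotals VALUES (?, ?)",
                 [(1, "hash_a"), (2, "hash_b")])
conn.executemany("INSERT INTO downloads VALUES (?, ?)",
                 [(1, "hash_a"), (2, "hash_b")])
conn.executemany(
    "INSERT INTO virustotalscans "
    "(virustotal, virustotalscan_scanner, virustotalscan_result) "
    "VALUES (?, ?, ?)",
    [(1, "ScannerX", "Trojan.Generic"), (1, "ScannerY", None),
     (2, "ScannerX", None), (2, "ScannerY", None)],
)

# COUNT(virustotalscan_result) skips NULLs, so only groups where
# every result is NULL satisfy HAVING ... = 0.
undetected = [row[0] for row in conn.execute("""
    SELECT download_md5_hash
    FROM downloads
    JOIN virustotals ON download_md5_hash = virustotal_md5_hash
    WHERE virustotal IN (SELECT virustotal
                         FROM virustotalscans
                         GROUP BY virustotal
                         HAVING COUNT(virustotalscan_result) = 0)
""")]
print(undetected)  # ['hash_b']
```

Note that a sample with no rows at all in virustotalscans never appears in the subquery, so it is not returned.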