Extracting duplicates from two databases for presentation

Time: 2014-03-19 19:45:38

Tags: excel sqlite

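I have been trying to get my head around this problem.

I have an SQLite database with multiple tables and roughly 125k records, all relating to recovered files.

I have been asked to compare two groups of files for duplicates: accessible and inaccessible.

I am able to extract the two groups like this...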

ACCESSIBLE
SELECT fileoffset, fileName, folderName, hash, myDescription, myUnique, category
FROM c4p_index 
WHERE (category = 1 OR category = 2 OR category = 3 OR category = 4 OR category = 5)
AND myUnique=1
AND (fileoffset = 0 
AND folderName NOT LIKE '%\Lost Files\%' 
AND folderName NOT LIKE '%Unallocated Clusters%' 
AND fileName NOT LIKE '%Unallocated Clusters%' 
AND folderName NOT LIKE '%thumbs.db%' 
AND fileName NOT LIKE '%thumbs.db%' 
AND folderName NOT LIKE '%thumbcache%' 
AND fileName NOT LIKE '%thumbcache%' 
AND myDescription NOT LIKE '%deleted%' 
AND myDescription NOT LIKE '%recycled%')

INACCESSIBLE
SELECT fileoffset, fileName, folderName, hash, myDescription, myUnique, category
FROM c4p_index 
WHERE (category = 1 OR category = 2 OR category = 3 OR category = 4 OR category = 5)
AND myUnique=1
AND (fileoffset > 0 
OR folderName LIKE '%\Lost Files\%' 
OR folderName LIKE '%Unallocated Clusters%' 
OR fileName LIKE '%Unallocated Clusters%' 
OR folderName LIKE '%thumbs.db%' 
OR fileName LIKE '%thumbs.db%' 
OR folderName LIKE '%thumbcache%' 
OR fileName LIKE '%thumbcache%' 
OR myDescription LIKE '%deleted%' 
OR myDescription LIKE '%recycled%')

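My problem is this: having restricted each group to its unique values ('myUnique = 1'), I now need to compare the values in the hash column between the two separate outputs. I have managed to do it the long-winded way, by exporting both as .CSV to Excel and manipulating the data there.

But I'm sure I could do the whole thing in SQLite...?

My thinking is that the two outputs could be exported to a new database, and the final count of duplicates between the groups could then be exported to a spreadsheet for presentation.

Can anyone help?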

1 Answer:

Answer 0: (score: 0)

You can use ATTACH to open multiple database files on the same connection:

ATTACH 'C:\some\where\accessible.db' AS accessible;
ATTACH 'C:\some\where\inaccessible.db' AS inaccessible;
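As a rough illustration of how ATTACH behaves, here is a minimal sketch using Python's built-in sqlite3 module. The temporary directory, the file names, and the two-column stand-in for c4p_index are all made up for the demo; the real paths would be wherever your database files live.

```python
import os
import sqlite3
import tempfile

# Hypothetical file locations in a temporary directory (assumption for the demo).
workdir = tempfile.mkdtemp()
paths = {name: os.path.join(workdir, name + ".db")
         for name in ("accessible", "inaccessible")}

# Create two small stand-in database files, each with its own c4p_index table.
for name, path in paths.items():
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE c4p_index (fileName TEXT, hash TEXT)")
    con.execute("INSERT INTO c4p_index VALUES (?, ?)", (name + ".txt", "abc123"))
    con.commit()
    con.close()

# Open a working database and attach both files under schema names.
con = sqlite3.connect(":memory:")
con.execute("ATTACH DATABASE ? AS accessible", (paths["accessible"],))
con.execute("ATTACH DATABASE ? AS inaccessible", (paths["inaccessible"],))

# PRAGMA database_list yields (seq, name, file) for every schema on the connection.
attached = [row[1] for row in con.execute("PRAGMA database_list")]
print(attached)

# Tables in an attached database are addressed as schema.table.
rows = con.execute("SELECT fileName FROM inaccessible.c4p_index").fetchall()
print(rows)
```

Once attached, both files can be queried in a single statement, which is what makes the cross-database comparison possible.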

Then you can copy the data you need out of these databases:

CREATE TABLE hash_A AS
SELECT hash FROM accessible.c4p_index WHERE ...;
CREATE TABLE hash_I AS
SELECT hash FROM inaccessible.c4p_index WHERE ...;
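To make the CREATE TABLE ... AS SELECT step concrete, here is a small sketch in Python's sqlite3 module. It assumes, as in the question, that both groups live in one c4p_index table, so it reads from the main schema instead of attached files. The sample rows are invented, and the WHERE clauses are a shortened version of the question's filters, not the full conditions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE c4p_index (
    fileoffset INTEGER, fileName TEXT, folderName TEXT, hash TEXT,
    myDescription TEXT, myUnique INTEGER, category INTEGER)""")

# Made-up sample rows, for illustration only.
con.executemany("INSERT INTO c4p_index VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (0,   "a.doc", r"C:\Docs",         "h1", "",        1, 1),
    (0,   "b.doc", r"C:\Docs",         "h2", "",        1, 2),
    (512, "c.doc", r"C:\Docs",         "h1", "",        1, 1),
    (0,   "d.doc", r"C:\Lost Files\x", "h3", "",        1, 3),
    (0,   "e.doc", r"C:\Docs",         "h4", "deleted", 1, 4),
    (0,   "f.doc", r"C:\Docs",         "h5", "",        0, 1),
])

# Accessible group: shortened form of the question's first query.
# Raw strings keep the backslashes literal; SQLite's LIKE has no default escape character.
con.execute(r"""CREATE TABLE hash_A AS
    SELECT hash FROM c4p_index
    WHERE category IN (1, 2, 3, 4, 5)
      AND myUnique = 1
      AND fileoffset = 0
      AND folderName NOT LIKE '%\Lost Files\%'
      AND myDescription NOT LIKE '%deleted%'""")

# Inaccessible group: shortened form of the question's second query.
con.execute(r"""CREATE TABLE hash_I AS
    SELECT hash FROM c4p_index
    WHERE category IN (1, 2, 3, 4, 5)
      AND myUnique = 1
      AND (fileoffset > 0
           OR folderName LIKE '%\Lost Files\%'
           OR myDescription LIKE '%deleted%')""")

hashes_a = sorted(r[0] for r in con.execute("SELECT hash FROM hash_A"))
hashes_i = sorted(r[0] for r in con.execute("SELECT hash FROM hash_I"))
print(hashes_a, hashes_i)
```

Copying only the hash column keeps the working tables small, which should matter little at 125k rows but makes the later comparison queries easy to read.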

Then you can compare the two sets in whatever way you like:

SELECT hash FROM hash_I EXCEPT SELECT hash FROM hash_A;
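Note that EXCEPT returns the hashes found only in the inaccessible group. If what you want is the duplicates between the two groups, INTERSECT is probably the operator to reach for. The sketch below, again in Python's sqlite3 module with invented hash values, shows both operators, a duplicate count, and one possible way to produce CSV for a spreadsheet.

```python
import csv
import io
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hash_A (hash TEXT)")
con.execute("CREATE TABLE hash_I (hash TEXT)")

# Made-up hash values, for illustration only.
con.executemany("INSERT INTO hash_A VALUES (?)", [("h1",), ("h2",), ("h3",)])
con.executemany("INSERT INTO hash_I VALUES (?)", [("h2",), ("h3",), ("h4",)])

# EXCEPT: hashes present in the inaccessible group but not in the accessible one.
only_inaccessible = [r[0] for r in con.execute(
    "SELECT hash FROM hash_I EXCEPT SELECT hash FROM hash_A ORDER BY hash")]

# INTERSECT: hashes present in both groups, i.e. the duplicates.
duplicates = [r[0] for r in con.execute(
    "SELECT hash FROM hash_I INTERSECT SELECT hash FROM hash_A ORDER BY hash")]

# Count the duplicates directly in SQL.
duplicate_count = con.execute("""SELECT count(*) FROM (
    SELECT hash FROM hash_I INTERSECT SELECT hash FROM hash_A)""").fetchone()[0]

# Write the duplicates as CSV; an in-memory buffer stands in for a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["hash"])
writer.writerows([h] for h in duplicates)
csv_lines = buffer.getvalue().splitlines()

print(only_inaccessible, duplicates, duplicate_count)
```

Both EXCEPT and INTERSECT remove duplicate rows from their result, so each shared hash is counted once regardless of how many files carry it.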