Here is the record I am searching for:
A = {
field1: value1,
field2: value2,
...
fieldN: valueN
}
I have many records like this in my database.
Another record (B) almost matches record A if at least N-M of their fields are equal. Here is an example with M = 2:
B = {
field1: OTHER_value1,
field2: OTHER_value2,
field3: value3,
...
fieldN: valueN
}
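The differing fields can be any fields, not just the first ones.
I could write one very large combinatorial SQL query, but maybe there is a more elegant solution.
P.S.: My database is PostgreSQL.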
Answer 0 (score: 3)
A search condition like this cannot use any index, but it can be done...
SELECT
*
FROM
yourTable
WHERE
N-M <= CASE WHEN yourTable.field1 = searchValue1 THEN 1 ELSE 0 END
+ CASE WHEN yourTable.field2 = searchValue2 THEN 1 ELSE 0 END
+ CASE WHEN yourTable.field3 = searchValue3 THEN 1 ELSE 0 END
...
+ CASE WHEN yourTable.fieldN = searchValueN THEN 1 ELSE 0 END
Similarly, if your search criteria are in another table...
SELECT
*
FROM
yourTable
INNER JOIN
search
ON N-M <= CASE WHEN yourTable.field1 = search.field1 THEN 1 ELSE 0 END
+ CASE WHEN yourTable.field2 = search.field2 THEN 1 ELSE 0 END
+ CASE WHEN yourTable.field3 = search.field3 THEN 1 ELSE 0 END
...
+ CASE WHEN yourTable.fieldN = search.fieldN THEN 1 ELSE 0 END
(You need to fill in the value of N-M yourself.)
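The counting condition above can be tried out with a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, a hypothetical four-field table (N = 4, M = 2), and made-up sample rows.

```python
import sqlite3

# Assumed for illustration: N = 4 fields, at most M = 2 of them may differ.
N, M = 4, 2

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not PostgreSQL
conn.execute(
    "CREATE TABLE yourTable (id INTEGER PRIMARY KEY, field1, field2, field3, field4)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a", "b", "c", "d"),  # 4 fields equal to the search values
        (2, "x", "y", "c", "d"),  # 2 fields equal
        (3, "x", "y", "z", "d"),  # 1 field equal
    ],
)

search = ("a", "b", "c", "d")  # hypothetical search values

# Each CASE contributes 1 for a matching field; keep rows with >= N-M matches.
query = """
    SELECT id FROM yourTable
    WHERE ? <= CASE WHEN field1 = ? THEN 1 ELSE 0 END
             + CASE WHEN field2 = ? THEN 1 ELSE 0 END
             + CASE WHEN field3 = ? THEN 1 ELSE 0 END
             + CASE WHEN field4 = ? THEN 1 ELSE 0 END
    ORDER BY id
"""
almost_matching_ids = [row[0] for row in conn.execute(query, (N - M, *search))]
print(almost_matching_ids)  # [1, 2]
```

Row 3 is rejected because only one of its fields matches, which is below the threshold N-M = 2.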
EDIT:
A longer-winded approach that can make some use of indexes...
SELECT
id, -- your table would need to have a primary key / identity column
MAX(field1) AS field1,
MAX(field2) AS field2,
MAX(field3) AS field3,
...
MAX(fieldN) AS fieldN
FROM
(
SELECT * FROM yourTable WHERE field1 = searchValue1
UNION ALL
SELECT * FROM yourTable WHERE field2 = searchValue2
UNION ALL
SELECT * FROM yourTable WHERE field3 = searchValue3
...
SELECT * FROM yourTable WHERE fieldN = searchValueN
)
AS unioned_seeks
GROUP BY
id
HAVING
COUNT(*) >= N-M
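If each field has its own index and you expect relatively few matches per field, this may outperform the first option, at the cost of very repetitive code.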
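As a rough, runnable sketch of the UNION ALL idea, assuming the same hypothetical four-field table and sample rows, and again using Python's sqlite3 module rather than PostgreSQL: each branch can be served by a single-column index, and the outer query counts how many branches returned each id.

```python
import sqlite3

# Assumed for illustration: N = 4 fields, at most M = 2 of them may differ.
N, M = 4, 2

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not PostgreSQL
conn.execute(
    "CREATE TABLE yourTable (id INTEGER PRIMARY KEY, field1, field2, field3, field4)"
)
# One index per field, so each UNION ALL branch can be an index lookup.
for i in range(1, N + 1):
    conn.execute(f"CREATE INDEX idx_field{i} ON yourTable (field{i})")

conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a", "b", "c", "d"),
        (2, "x", "y", "c", "d"),
        (3, "x", "y", "z", "d"),
    ],
)

search = ("a", "b", "c", "d")  # hypothetical search values

# A row appears once per matching field; COUNT(*) is its number of matches.
query = """
    SELECT id, COUNT(*) AS matched
    FROM (
        SELECT id FROM yourTable WHERE field1 = ?
        UNION ALL
        SELECT id FROM yourTable WHERE field2 = ?
        UNION ALL
        SELECT id FROM yourTable WHERE field3 = ?
        UNION ALL
        SELECT id FROM yourTable WHERE field4 = ?
    ) AS unioned_seeks
    GROUP BY id
    HAVING COUNT(*) >= ?
    ORDER BY id
"""
matches = conn.execute(query, (*search, N - M)).fetchall()
print(matches)  # [(1, 4), (2, 2)]
```

This sketch selects only the id in each branch and would need a join back to the table to return the full row; whether the indexes actually help depends on the planner and on the data distribution.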
Answer 1 (score: 3)
I would use is not distinct from to handle NULL values.
You can also simplify the logic with a Postgres shorthand. One way is:
where ( (a.field1 is not distinct from b.field1)::int +
(a.field2 is not distinct from b.field2)::int +
. . .
(a.fieldn is not distinct from b.fieldn)::int
) >= N - M
I think this is easier to express using only M. So, just count the fields that differ:
where ( (a.field1 is distinct from b.field1)::int +
(a.field2 is distinct from b.field2)::int +
. . .
(a.fieldn is distinct from b.fieldn)::int
) <= M
Doing this against your own data requires a cross join, which is very expensive.
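The NULL-safe "count the differing fields" condition can be sketched in a runnable form as follows. The assumptions are significant: this uses Python's sqlite3 module, where the IS NOT operator plays the role of PostgreSQL's is distinct from and already yields 0 or 1, so no ::int cast is needed. The table name records, the three fields, M = 1, and the a.id < b.id filter (to skip self-pairs and mirrored pairs) are all illustrative choices, not part of the answer above.

```python
import sqlite3

M = 1  # assumed: at most one field may differ

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not PostgreSQL
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, field1, field2, field3)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        (1, "a", None, "c"),
        (2, "a", None, "z"),  # differs from row 1 only in field3; NULL equals NULL here
        (3, "q", "b", "z"),   # differs from row 1 in 3 fields, from row 2 in 2 fields
    ],
)

# SQLite's "x IS NOT y" is NULL-safe inequality and evaluates to 0 or 1.
query = """
    SELECT a.id, b.id
    FROM records AS a
    CROSS JOIN records AS b
    WHERE a.id < b.id
      AND ( (a.field1 IS NOT b.field1)
          + (a.field2 IS NOT b.field2)
          + (a.field3 IS NOT b.field3) ) <= ?
    ORDER BY a.id, b.id
"""
pairs = conn.execute(query, (M,)).fetchall()
print(pairs)  # [(1, 2)]
```

The cross join compares every pair of rows, so the work grows quadratically with the table size, which is the cost the answer warns about.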