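I need a SQL query/procedure that finds the best-matching hosts for a given visitor, based on gender, the most matching interests, and academic field. I have the following tables: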
HOSTS:
HOST_ID  GENDER  INTEREST_ONE_ID  INTEREST_TWO_ID  ACADEMIC_FIELD_ID  NUM_CAN_HOST
A        M       1                2                10                 2
B        M       5                4                3                  1
C        F       2                1                3                  2
D        F       1                2                10                 3
E        M       5                1                3                  1
F        M       5                1                6                  1
VISTORS:
VISTOR_ID  GENDER  INTEREST_ONE_ID  INTEREST_TWO_ID  ACADEMIC_FIELD_ID
1          M       2                1                10
2          M       5                4                3
3          M       1                2                2
4          F       4                1                6
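Note that all interest IDs come from the same list, and academic_field_id also comes from a single list (naturally a different one from the interest list). So I want a query/procedure that returns the top X best host matches for a given visitor, based first on gender and then on which host matches the most interests and the academic field. The position of an interest match does not matter (interest_one can match interest_two and vice versa). So the example output for Visitor 1 would be: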
BEST_MATCHES (for Visitor 1: only males, most matches first)
VISITOR_ID  HOST_ID  INTEREST_ONE_MATCH  INTEREST_TWO_MATCH  Academic_int_MATCH
1           A        x [one to two]      x [two to one]      x
1           B        -                   -                   -                   Next best, which is not too good!
And for Visitor 2:
BEST_MATCHES
VISITOR_ID  HOST_ID  INTEREST_ONE_MATCH  INTEREST_TWO_MATCH  Academic_int_MATCH
2           B        x                   x                   x
2           E        x                   -                   x
2           F        x                   -                   -
etc.
Answer 0 (score: 0)
This is an expensive query, but:
select hv.*
from (select h.host_id, v.visitor_id,
             (case when h.INTEREST_ONE_ID = v.INTEREST_ONE_ID then 'X' end) as INTEREST_ONE_ID_match,
             (case when h.INTEREST_TWO_ID = v.INTEREST_TWO_ID then 'X' end) as INTEREST_TWO_ID_match,
             . . . ,
             -- rank the hosts for each visitor by the number of matching attributes
             dense_rank() over (partition by v.visitor_id
                                order by ((case when h.INTEREST_ONE_ID = v.INTEREST_ONE_ID then 1 else 0 end) +
                                          (case when h.INTEREST_TWO_ID = v.INTEREST_TWO_ID then 1 else 0 end) +
                                          . . .
                                         ) desc
                               ) as seqnum
      from hosts h cross join
           visitors v
     ) hv
where seqnum = 1;
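
To try the idea against the sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database through Python's standard sqlite3 module. It is one possible reading of the requirements, not the query above: it assumes the gender filter is a hard join condition, counts an interest as matched regardless of position, ignores NUM_CAN_HOST, and breaks ties by HOST_ID. The names `BEST_MATCHES_SQL`, `build_sample_db`, `best_matches`, and the `SCORE` column are hypothetical, and the table and column names keep the question's spelling (`VISTORS`, `VISTOR_ID`).

```python
import sqlite3

# Sample rows copied from the question.
HOSTS = [
    ("A", "M", 1, 2, 10, 2),
    ("B", "M", 5, 4, 3, 1),
    ("C", "F", 2, 1, 3, 2),
    ("D", "F", 1, 2, 10, 3),
    ("E", "M", 5, 1, 3, 1),
    ("F", "M", 5, 1, 6, 1),
]
VISTORS = [
    (1, "M", 2, 1, 10),
    (2, "M", 5, 4, 3),
    (3, "M", 1, 2, 2),
    (4, "F", 4, 1, 6),
]

# Hypothetical query: one row per same-gender host, scored from 0 to 3.
# Each visitor interest counts if it equals either host interest, so the
# position of the match does not matter. NUM_CAN_HOST is ignored here.
BEST_MATCHES_SQL = """
SELECT v.VISTOR_ID,
       h.HOST_ID,
       (CASE WHEN v.INTEREST_ONE_ID IN (h.INTEREST_ONE_ID, h.INTEREST_TWO_ID)
             THEN 1 ELSE 0 END)
     + (CASE WHEN v.INTEREST_TWO_ID IN (h.INTEREST_ONE_ID, h.INTEREST_TWO_ID)
             THEN 1 ELSE 0 END)
     + (CASE WHEN v.ACADEMIC_FIELD_ID = h.ACADEMIC_FIELD_ID
             THEN 1 ELSE 0 END) AS SCORE
FROM VISTORS v
JOIN HOSTS h ON h.GENDER = v.GENDER
WHERE v.VISTOR_ID = ?
ORDER BY SCORE DESC, h.HOST_ID
LIMIT ?
"""


def build_sample_db():
    """Create an in-memory database holding the question's two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE HOSTS (HOST_ID TEXT, GENDER TEXT, INTEREST_ONE_ID INTEGER,"
        " INTEREST_TWO_ID INTEGER, ACADEMIC_FIELD_ID INTEGER, NUM_CAN_HOST INTEGER)"
    )
    conn.execute(
        "CREATE TABLE VISTORS (VISTOR_ID INTEGER, GENDER TEXT, INTEREST_ONE_ID INTEGER,"
        " INTEREST_TWO_ID INTEGER, ACADEMIC_FIELD_ID INTEGER)"
    )
    conn.executemany("INSERT INTO HOSTS VALUES (?, ?, ?, ?, ?, ?)", HOSTS)
    conn.executemany("INSERT INTO VISTORS VALUES (?, ?, ?, ?, ?)", VISTORS)
    return conn


def best_matches(conn, visitor_id, top_x):
    """Return up to top_x (VISTOR_ID, HOST_ID, SCORE) rows, best score first."""
    return conn.execute(BEST_MATCHES_SQL, (visitor_id, top_x)).fetchall()


if __name__ == "__main__":
    conn = build_sample_db()
    for row in best_matches(conn, 2, 3):
        print(row)  # (2, 'B', 3), then (2, 'E', 2), then (2, 'F', 1)
```

For Visitor 2 this reproduces the B, E, F ordering from the question. For Visitor 1, hosts E and F each share interest 1 with the visitor, so under this scoring rule they rank ahead of B after host A. Filtering on one visitor also avoids the full cross join, which may matter if the tables are large.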