OrientDB: recommendation query with weights

Time: 2016-03-04 16:41:01

Tags: orientdb

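In the blog post Movielens Recommendation Engine with OrientDB, they write a query that finds the movies rated 5 by the users who also gave a 5 to the movies that #16:0 rated 5.

Here is the model: Person -> (Rated) -> Movie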

create class Movie extends V
create property Movie.title String
create class Person extends V
create property Person.id String
create class Rated extends E
create property Rated.rating int

Here is the original recommendation query:

select title, count(*) as conto
  from (select expand(rid.outE('rated')[rating = 5].in)
          from (
        select @rid as rid, id as id, count(*) as conto
        from (select expand(outE('rated')
              [rating=5].in.inE('rated')[rating=5].out) from #16:0)
              where @rid <> #16:0 group by rid, id order by conto desc limit 10))
 where title not in (select out('rated').title from #16:0)
 group by title
 order by conto desc

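I'm looking for a way to add some weight: a user X who gave a 5 to 100 movies should weigh more than a user Y who gave a 5 to only 50 movies. Movies rated by X should be favored over those rated by Y.

This query computes the weight of the other users who rated the movies that #16:0 rated 5: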

select @rid as rid, count(*) as p from (
      select from (
        select expand(outE('rated')[rating=5].in.inE('rated').out
                 ) from #16:0
      ) where @rid <> #16:0
    ) group by @rid order by p desc
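
To make the counting explicit, here is a minimal in-memory Python sketch of what the weight query above computes. It is not OrientDB code: the `ratings` dictionary, the user and movie names, and the `overlap_weights` helper are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample data: user -> {movie title: rating}.
ratings = {
    "me": {"Alien": 5, "Brazil": 5, "Casino": 3},
    "x": {"Alien": 5, "Brazil": 4, "Dune": 5},
    "y": {"Alien": 2, "Eraserhead": 5},
}


def overlap_weights(ratings, target):
    """Count, per other user, the ratings they gave to movies the target rated 5."""
    liked = {movie for movie, r in ratings[target].items() if r == 5}
    weights = Counter()
    for user, rated in ratings.items():
        if user == target:  # mirrors "where @rid <> #16:0"
            continue
        for movie in rated:
            # Any rating counts here, because inE('rated') is not filtered.
            if movie in liked:
                weights[user] += 1
    return weights.most_common()  # mirrors "order by p desc"


weights = overlap_weights(ratings, "me")
print(weights)
```

With this sample data, user "x" overlaps on two movies and user "y" on one, so "x" gets the larger weight.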

But I don't know how to write the recommendation query so that it uses this weight.

I tried this, but it doesn't work:

select @rid as rid, title, count(*) from (
  select expand(rid.outE('rated')[rating=5].in) from (
    select @rid as rid, count(*) as p from (
      select from (
        select expand(outE('rated')[rating=5].in.inE('rated').out
                 ) from #16:0
      ) where @rid <> #16:0
    )group by @rid order by p desc
  )
)
where @rid not in (select out('rated').@rid from #16:0) group by @rid

1 answer:

Answer 0 (score: 0)

I found a query that retrieves the rated movies and their associated 'weight':

select a.@rid as rid, weight from (
    select u.outE("rated")[rating=5].in as a, count(*) as weight from (
      select @rid as u from (
        select expand(outE('rated')[rating=5].in.inE('rated')[rating=5].out) from #13:0
      ) where @rid <> #13:0
    ) group by u
  ) 
unwind rid

I can now get the number of occurrences and the total weight for each movie (I'm not sure that a sum is a good idea):

select rid, count(*) as c, sum(weight) as w from(the_query_above)
where rid not in (select out("rated").@rid from #13:0)
group by rid
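
As a rough illustration of the two aggregates above (the count `c` and the summed weight `w`), the following in-memory Python sketch scores candidate movies in the same way. It is not OrientDB code: the `ratings` data and the `weighted_recommendations` helper are hypothetical. It assumes, as in the query above, that a user's weight is the number of movies that both they and the target rated 5. The final sort by `w` is an addition for readability; the query itself does not order its results.

```python
from collections import defaultdict

# Hypothetical sample data: user -> {movie title: rating}.
ratings = {
    "me": {"Alien": 5, "Brazil": 5, "Casino": 3},
    "x": {"Alien": 5, "Brazil": 5, "Dune": 5, "Casino": 5},
    "y": {"Alien": 5, "Dune": 5, "Eraserhead": 5},
}


def weighted_recommendations(ratings, target):
    """Return (movie, c, w) tuples for movies the target has not rated."""
    liked = {m for m, r in ratings[target].items() if r == 5}
    seen = set(ratings[target])  # mirrors the "rid not in (...)" filter
    scores = defaultdict(lambda: {"c": 0, "w": 0})
    for user, rated in ratings.items():
        if user == target:
            continue
        fives = {m for m, r in rated.items() if r == 5}
        weight = len(fives & liked)  # shared 5-star movies with the target
        if weight == 0:
            continue  # this user shares no 5-star movie with the target
        for movie in fives - seen:
            scores[movie]["c"] += 1       # count(*)
            scores[movie]["w"] += weight  # sum(weight)
    rows = [(m, s["c"], s["w"]) for m, s in scores.items()]
    return sorted(rows, key=lambda row: (-row[2], row[0]))


recommendations = weighted_recommendations(ratings, "me")
print(recommendations)
```

In this sample, "Dune" is recommended by both similar users (c = 2, w = 2 + 1 = 3), while "Eraserhead" comes only from the lower-weight user (c = 1, w = 1), so the sum favors movies liked by users who overlap more with the target.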

I'm not sure this is the best approach; maybe someone has another idea?