Finding the k nearest neighbors with a SPARQL query

Time: 2013-01-28 21:27:09

Tags: algorithm machine-learning sparql nearest-neighbor

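I want to write a SPARQL query that finds the k nearest neighbors for a set of vectors. To find the average label of the 100 nearest neighbors of a single vector, I can use the following query: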

PREFIX : <ml://>
PREFIX vector: <ml://vector/>
PREFIX feature: <ml://feature/>

SELECT (AVG(?label) as ?prediction)
WHERE {
  {
    SELECT ?other_vector (COUNT(?common_feature) as ?similarity)
    WHERE { vector:0 :has ?common_feature . 
      ?other_vector :has ?common_feature .
    } GROUP BY ?other_vector ORDER BY DESC(?similarity) LIMIT 100
  }
  ?other_vector :hasLabel ?label .
}

Is there a way to do this for multiple vectors in a single query?

1 Answer:

Answer 0 (score: 0)

Unless I'm overlooking something, you can do this by replacing the URI vector:0 with a variable, like this:

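PREFIX : <ml://>

SELECT ?vector (AVG(?label) AS ?prediction)
WHERE {
  {
    SELECT ?vector ?other_vector (COUNT(?common_feature) AS ?similarity)
    WHERE { ?vector :has ?common_feature .
      ?other_vector :has ?common_feature .
      FILTER(?vector != ?other_vector)
    }
    # Every non-aggregated variable in SELECT must also appear in GROUP BY.
    # Note: LIMIT 100 caps the rows of the whole subquery, not the rows per ?vector.
    GROUP BY ?vector ?other_vector ORDER BY DESC(?similarity) LIMIT 100
  }
  ?other_vector :hasLabel ?label .
}
GROUP BY ?vector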

I added a filter condition that checks that ?vector and ?other_vector are not equal. Whether that is necessary is, of course, up to you :)
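One caveat: a LIMIT inside a subquery applies to the subquery's entire result, so with several vectors it keeps the 100 most similar pairs overall rather than 100 neighbors per vector. The following is a minimal sketch of one possible workaround in plain SPARQL 1.1, not a tested or tuned solution. It ranks each neighbor by counting how many other neighbors of the same vector have a strictly higher similarity, and keeps those with a rank below 100. The names ?better, ?better_similarity, ?f and ?g are introduced here purely for illustration.

```sparql
PREFIX : <ml://>

SELECT ?vector (AVG(?label) AS ?prediction)
WHERE {
  {
    # Keep a neighbor only if fewer than 100 neighbors of the same vector beat it.
    SELECT ?vector ?other_vector
    WHERE {
      {
        # Similarity of every (vector, neighbor) pair.
        SELECT ?vector ?other_vector (COUNT(?f) AS ?similarity)
        WHERE {
          ?vector :has ?f .
          ?other_vector :has ?f .
          FILTER(?vector != ?other_vector)
        } GROUP BY ?vector ?other_vector
      }
      OPTIONAL {
        {
          # The same similarity table again, used to find stronger neighbors.
          SELECT ?vector ?better (COUNT(?g) AS ?better_similarity)
          WHERE {
            ?vector :has ?g .
            ?better :has ?g .
            FILTER(?vector != ?better)
          } GROUP BY ?vector ?better
        }
        FILTER(?better_similarity > ?similarity)
      }
    }
    GROUP BY ?vector ?other_vector
    HAVING (COUNT(DISTINCT ?better) < 100)
  }
  ?other_vector :hasLabel ?label .
}
GROUP BY ?vector
```

Two things to keep in mind with this sketch: neighbors tied at the cutoff are all kept, so a vector can end up with more than 100 neighbors, and the self-join over the similarity table is likely to be expensive on large datasets.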

If you need to restrict the set of vectors for which matches are computed, you can use a VALUES clause to limit the possible bindings of ?vector:

VALUES ?vector { vector:0 vector:1 ... }
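As a rough illustration of where such a clause can go, the sketch below places VALUES inside the WHERE block of the similarity query, so that only the listed vectors are matched against the rest of the data. The two vectors vector:0 and vector:1 are just example bindings, and the query stops at the per-pair similarity table so that the result is easy to inspect.

```sparql
PREFIX : <ml://>
PREFIX vector: <ml://vector/>

SELECT ?vector ?other_vector (COUNT(?common_feature) AS ?similarity)
WHERE {
  # Only these vectors are used as query points (example bindings).
  VALUES ?vector { vector:0 vector:1 }
  ?vector :has ?common_feature .
  ?other_vector :has ?common_feature .
  FILTER(?vector != ?other_vector)
}
GROUP BY ?vector ?other_vector
# List each query vector's neighbors from most to least similar.
ORDER BY ?vector DESC(?similarity)
```

Because VALUES binds ?vector before the triple patterns are joined, the restriction applies only to the query side; ?other_vector can still range over every vector in the dataset.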