Randomly pick two distinct objects for a predicate

Time: 2015-04-07 17:06:41

Tags: random rdf sparql

I have an RDF dataset like this:

<subject1> <some_predicate> "Value 1" .
<subject1> <some_predicate> "Value 2" .
<subject1> <some_predicate> "Value 3" .
<subject1> <some_predicate> "Value 4" .
<subject1> <some_predicate> "Value 5" .

<subject2> <some_predicate> "Value 6" .
<subject2> <some_predicate> "Value 7" .
<subject2> <some_predicate> "Value 8" .
<subject2> <some_predicate> "Value 9" .
<subject2> <some_predicate> "Value 10" .

Now, for each subject, I would like two random values of "some_predicate". The two values must be distinct. So the expected result would be something like:

----------------------------------------------
| subject  | random_value_1 | random_value_2 |
==============================================
| subject1 | "Value 2"      | "Value 5"      |
| subject2 | "Value 6"      | "Value 7"      |
----------------------------------------------

I found the question "sparql: randomly select one connection for each node", but it only retrieves one value, and I need two distinct values.

1 answer:

Answer 0: (score: 1)

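You can do just about the same thing, but it is a bit more complicated. First, select a random value for each subject. Then, in the outer query, select a second random value in the same way, but one that differs from the first (removing the filter would allow the two to be the same):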

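select ?subject (sample(?v1) as ?value1) (sample(?v2) as ?value2) {
  # pair each first value with every other value of the same subject
  { select ?subject ?v1 ?v2 {
      # all (subject, value) pairs, in random order
      { select ?subject ?v1 {
          ?subject <some_predicate> ?v1
        }
        order by rand() }

      ?subject <some_predicate> ?v2
      filter(!sameTerm(?v1,?v2))   # drop this filter to allow value1 = value2
    }
    order by rand()
  }
}
group by ?subject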

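Note that the same caveat as in the linked question, "sparql: randomly select one connection for each node", applies here: the implementation of sample is not specified, so it is conceivable that you will get non-random results. Here is some sample output using Jena's ARQ: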

---------------------------------------
| subject    | value1     | value2    |
=======================================
| <subject1> | "Value 1"  | "Value 2" |
| <subject2> | "Value 10" | "Value 6" |
---------------------------------------

--------------------------------------
| subject    | value1    | value2    |
======================================
| <subject1> | "Value 4" | "Value 1" |
| <subject2> | "Value 8" | "Value 6" |
--------------------------------------

--------------------------------------
| subject    | value1    | value2    |
======================================
| <subject1> | "Value 4" | "Value 3" |
| <subject2> | "Value 6" | "Value 8" |
--------------------------------------
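
If the unspecified behaviour of sample is a concern, one alternative is to fetch all (subject, value) pairs with a plain query and do the random pick on the client. Note also that the two sample() aggregates are evaluated independently, so strictly speaking the spec does not tie ?value1 and ?value2 to the same row. The following is only a minimal sketch of that client-side idea, not part of the original answer: the rows list is hypothetical stand-in data for the result of `select ?subject ?v { ?subject <some_predicate> ?v }`, and pick_two_distinct is an illustrative name.

```python
import random
from collections import defaultdict

# Hypothetical (subject, value) pairs standing in for the query result;
# in practice these would come from your SPARQL client.
rows = [("subject1", f"Value {i}") for i in range(1, 6)] + \
       [("subject2", f"Value {i}") for i in range(6, 11)]


def pick_two_distinct(rows, rng=random):
    """Return {subject: (value1, value2)} with two distinct random values."""
    values_by_subject = defaultdict(set)
    for subject, value in rows:
        values_by_subject[subject].add(value)

    picks = {}
    for subject, values in values_by_subject.items():
        # Subjects with fewer than two values are skipped, which mirrors
        # what the filter in the query does.
        if len(values) >= 2:
            # sample() draws without replacement, so the two values differ.
            picks[subject] = tuple(rng.sample(sorted(values), 2))
    return picks


for subject, (v1, v2) in sorted(pick_two_distinct(rows).items()):
    print(subject, v1, v2)
```

This trades a larger result set for a pick that is uniform and guaranteed distinct, so it is probably only reasonable when the number of values per subject is modest.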