Question

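First, I have not made a minimal example, because I think my problem can be understood without one.

Second, I have not given you the data, because I think my problem can be solved without it. If you ask, though, I am happy to provide it.

This is my query: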

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rs: <http://www.SemanticRecommender.com/rs#>
PREFIX bo: <http://www.BookOntology.com/bo#>

select distinct (?x as ?likedItem) (?item as ?suggestedItem) ?similarity ?becauseOf
       ((?similarity * ?importance * ?levelImportance) as ?finalSimilarity)
where
{
  values ?user { bo:ania }
  # ?x is bound to the items that the user bo:ania has liked.
  ?user rs:hasRated ?ratings .
  ?ratings a rs:Likes .
  ?ratings rs:aboutItem ?x .
  ?ratings rs:ratesBy ?ratingValue .

  # level 0 class similarities
  {
    # Extract all items of the same class (type) as the liked items.
    # I assumed that being of the same class accounts for 50% of the similarity.
    # This value can be changed according to the test or the application domain.
    values ?classImportance { 0.5 }  # class level
    bind (?classImportance as ?importance)
    bind (4/7 as ?levelImportance)
    ?x a ?class .
    ?class rdfs:subClassOf ?mainClass .
    ?mainClass rdfs:subClassOf rs:RecommendableClass .
    ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
    ?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
    ?classSimilarity rs:appliedOnClass ?class .
    ?classSimilarity rs:hasClassSimilarityValue ?similarity .
    ?item a ?class .
    bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
  }
  union
  # level 0 instance similarities
  {
    # Extract the items that share the same value for important predicates with the already liked items.
    # I assumed that sharing the same instance for important predicates accounts for 100% of the similarity.
    # This value can be changed according to the test or the application domain.
    values ?instanceImportance { 1 }  # instance level
    bind (?instanceImportance as ?importance)
    bind (4/7 as ?levelImportance)
    ?x a ?class .
    ?class rdfs:subClassOf ?mainClass .
    ?mainClass rdfs:subClassOf rs:RecommendableClass .
    ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
    ?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
    ?propertySimilarity rs:appliedOnProperty ?property .
    ?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
    ?x ?property ?value .
    ?item ?property ?value .
    bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
  }
  filter (?x != ?item)
}

This is the result:

As you can see, the result contains multiple rows for the same suggestedItem. I want to group by suggestedItem and sum the finalSimilarity values.

I tried:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rs: <http://www.SemanticRecommender.com/rs#>
PREFIX bo: <http://www.BookOntology.com/bo#>

select ?item
       (SUM(?similarity * ?importance * ?levelImportance) as ?finalSimilarity)
       (group_concat(distinct ?x) as ?likedItem)
       (group_concat(?becauseOf ; separator = " ,and ") as ?reason)
where
{
  values ?user { bo:ania }
  # ?x is bound to the items that the user bo:ania has liked.
  ?user rs:hasRated ?ratings .
  ?ratings a rs:Likes .
  ?ratings rs:aboutItem ?x .
  ?ratings rs:ratesBy ?ratingValue .

  # level 0 class similarities
  {
    # Extract all items of the same class (type) as the liked items.
    # I assumed that being of the same class accounts for 50% of the similarity.
    # This value can be changed according to the test or the application domain.
    values ?classImportance { 0.5 }  # class level
    bind (?classImportance as ?importance)
    bind (4/7 as ?levelImportance)
    ?x a ?class .
    ?class rdfs:subClassOf ?mainClass .
    ?mainClass rdfs:subClassOf rs:RecommendableClass .
    ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
    ?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
    ?classSimilarity rs:appliedOnClass ?class .
    ?classSimilarity rs:hasClassSimilarityValue ?similarity .
    ?item a ?class .
    bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
  }
  union
  # level 0 instance similarities
  {
    # Extract the items that share the same value for important predicates with the already liked items.
    # I assumed that sharing the same instance for important predicates accounts for 100% of the similarity.
    # This value can be changed according to the test or the application domain.
    values ?instanceImportance { 1 }  # instance level
    bind (?instanceImportance as ?importance)
    bind (4/7 as ?levelImportance)
    ?x a ?class .
    ?class rdfs:subClassOf ?mainClass .
    ?mainClass rdfs:subClassOf rs:RecommendableClass .
    ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
    ?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
    ?propertySimilarity rs:appliedOnProperty ?property .
    ?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
    ?x ?property ?value .
    ?item ?property ?value .
    bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
  }
  filter (?x != ?item)
}
group by ?item
order by desc(?finalSimilarity)

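But the result is:

Something is wrong with my approach: finalSimilarity comes out as 1.7, but if you sum the values from the first query by hand you get 0.62. So I am doing something wrong. Could you help me find it?

Note that the two queries are identical; only the select clause differs.

Hint

I have been able to solve it using two selects: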

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rs: <http://www.SemanticRecommender.com/rs#>
PREFIX bo: <http://www.BookOntology.com/bo#>
PREFIX :<http://www.SemanticBookOntology.com/sbo#>

select ?suggestedItem
       (SUM(?finalSimilarity) as ?summedFinalSimilarity)
       (group_concat(distinct strafter(str(?likedItem), "#")) as ?becauseYouHaveLikedThisItem)
       (group_concat(?becauseOf ; separator = " ,and ") as ?reason)
where {
  select distinct (?x as ?likedItem) (?item as ?suggestedItem) ?similarity ?becauseOf
         ((?similarity * ?importance * ?levelImportance) as ?finalSimilarity)
  where
  {
    values ?user { bo:ania }
    # ?x is bound to the items that the user bo:ania has liked.
    ?user rs:hasRated ?ratings .
    ?ratings a rs:Likes .
    ?ratings rs:aboutItem ?x .
    ?ratings rs:ratesBy ?ratingValue .

    # level 0 class similarities
    {
      # Extract all items of the same class (type) as the liked items.
      # I assumed that being of the same class accounts for 50% of the similarity.
      # This value can be changed according to the test or the application domain.
      values ?classImportance { 0.5 }  # class level
      bind (?classImportance as ?importance)
      bind (4/7 as ?levelImportance)
      ?x a ?class .
      ?class rdfs:subClassOf ?mainClass .
      ?mainClass rdfs:subClassOf rs:RecommendableClass .
      ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
      ?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
      ?classSimilarity rs:appliedOnClass ?class .
      ?classSimilarity rs:hasClassSimilarityValue ?similarity .
      ?item a ?class .
      bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
    }
    union
    # level 0 instance similarities
    {
      # Extract the items that share the same value for important predicates with the already liked items.
      # I assumed that sharing the same instance for important predicates accounts for 100% of the similarity.
      # This value can be changed according to the test or the application domain.
      values ?instanceImportance { 1 }  # instance level
      bind (?instanceImportance as ?importance)
      bind (4/7 as ?levelImportance)
      ?x a ?class .
      ?class rdfs:subClassOf ?mainClass .
      ?mainClass rdfs:subClassOf rs:RecommendableClass .
      ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
      ?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
      ?propertySimilarity rs:appliedOnProperty ?property .
      ?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
      ?x ?property ?value .
      ?item ?property ?value .
      bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
    }
    filter (?x != ?item)
  }
}
group by ?suggestedItem
order by desc(?summedFinalSimilarity)

But this seems like a silly solution to me. There must be a smarter one that gets the aggregated data with a single select.

Answer 1

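Without seeing your data this is impossible to reproduce, and with a query this large it is probably not worth trying to debug the exact problem. But this can happen very easily if you have duplicates, which are easy to get, especially when you use a union in which some of the conditions can match both branches. For example, suppose you have data like this: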

@prefix : <urn:ex:> .

:x :similar [ :sim 0.10 ; :mult 2 ] ,
            [ :sim 0.12 ; :mult 1 ] ,
            [ :sim 0.12 ; :mult 1 ] ,  # yup, a duplicate
            [ :sim 0.15 ; :mult 4 ] .

Then, if you run this query, you get four result rows:

prefix : <urn:ex:>

select ?sim ((?sim * ?mult) as ?final) {
  :x :similar [ :sim ?sim ; :mult ?mult ] .
}

----------------
| sim  | final |
================
| 0.15 | 0.60  |
| 0.12 | 0.12  |
| 0.12 | 0.12  |
| 0.10 | 0.20  |
----------------

However, if you select distinct, you only see three:

select distinct ?sim ((?sim * ?mult) as ?final) {
  :x :similar [ :sim ?sim ; :mult ?mult ] .
}

----------------
| sim  | final |
================
| 0.15 | 0.60  |
| 0.12 | 0.12  |
| 0.10 | 0.20  |
----------------
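The difference between the two tables above can be modeled outside SPARQL. The following is only a minimal Python sketch, not part of the original answer: it treats the query results as a list of rows (a multiset) and models `distinct` as collapsing identical projected rows. The variable names are illustrative assumptions.

```python
from decimal import Decimal

# (sim, mult) pairs taken from the four blank nodes in the sample data
rows = [
    (Decimal("0.10"), 2),
    (Decimal("0.12"), 1),
    (Decimal("0.12"), 1),  # the duplicate
    (Decimal("0.15"), 4),
]

# Projection: select ?sim ((?sim * ?mult) as ?final) yields one row per solution
projected = [(sim, sim * mult) for sim, mult in rows]

# select distinct: identical projected rows collapse into a single row
distinct_rows = set(projected)

print(len(projected))      # 4
print(len(distinct_rows))  # 3
```

The two blank nodes are different resources, but they are not projected, so the rows they produce are indistinguishable once only `?sim` and `?final` are selected.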

Once you start grouping and summing, those non-distinct values all get included:

select (sum(?sim * ?mult) as ?final) {
  :x :similar [ :sim ?sim ; :mult ?mult ] .
}

---------
| final |
=========
| 1.04  |
---------

That sum is over all four terms, not the three distinct ones. Even if the data has no duplicate values, a union can introduce duplicate results:

@prefix : <urn:ex:> .

:x :similar [ :sim 0.10 ; :mult 2 ] ,
            [ :sim 0.12 ; :mult 1 ] ,
            [ :sim 0.15 ; :mult 4 ] .

prefix : <urn:ex:>

select (sum(?sim * ?mult) as ?final) {
  { :x :similar [ :sim ?sim ; :mult ?mult ] }
  union
  { :x :similar [ :sim ?sim ; :mult ?mult ] }
}

---------
| final |
=========
| 1.84  |
---------
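The doubling in the union result can be checked by hand. As a rough Python model (an illustration under the assumption that a union simply concatenates the solutions of its two branches, keeping duplicates), the same three rows are summed once and then twice. The helper name `total` is hypothetical.

```python
from decimal import Decimal

# (sim, mult) pairs from the duplicate-free sample data
base = [
    (Decimal("0.10"), 2),
    (Decimal("0.12"), 1),
    (Decimal("0.15"), 4),
]

def total(solutions):
    """Model of sum(?sim * ?mult) over a list of solutions."""
    return sum(sim * mult for sim, mult in solutions)

single = total(base)          # one branch only
unioned = total(base + base)  # both union branches match the same data

print(single)   # 0.92
print(unioned)  # 1.84
```

Each branch contributes 0.20 + 0.12 + 0.60 = 0.92, so two identical branches give the 1.84 shown in the table.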

Since you found that you needed group_concat(distinct ...), I would not be surprised if there are duplicates.
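This would also be consistent with the two-select workaround in the question producing a different total from the single select: the inner `select distinct` removes repeated rows before the outer `SUM` sees them. A minimal Python sketch of that effect, using the four-row sample data from this answer rather than the asker's real data (which was not provided):

```python
from decimal import Decimal

# (sim, mult) pairs from the sample data, including the duplicate
solutions = [
    (Decimal("0.10"), 2),
    (Decimal("0.12"), 1),
    (Decimal("0.12"), 1),
    (Decimal("0.15"), 4),
]

# Single select with sum(...): every solution contributes
sum_all = sum(sim * mult for sim, mult in solutions)

# Inner select distinct ?sim ?final, then outer sum(?final)
inner_distinct = {(sim, sim * mult) for sim, mult in solutions}
sum_after_distinct = sum(final for _sim, final in inner_distinct)

print(sum_all)             # 1.04
print(sum_after_distinct)  # 0.92
```

Whether collapsing the repeated rows is the right thing to do depends on whether they are genuinely separate contributions or artifacts of the query shape; that can only be decided by looking at the actual data.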
