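I am trying out the Berlin SPARQL Benchmark (BSBM) queries in Neo4j. I created the Neo4j graph from triples using http://michaelbloggs.blogspot.de/2013/05/importing-ttl-turtle-ontologies-in-neo4j.html
To summarize the data loading, my graph has the following structure,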
Subject => Node
Predicate => Relationship
Object => Node
If the predicate points to a primitive value (date, string, integer), a property is created instead of a relationship and is stored on the node.
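As a rough sketch of this mapping, a single offer and its product would end up in the graph roughly as shown here. The URIs and the price value are hypothetical, and the block assumes that the rdf:type of each subject becomes the node label, which is how the labels are used in the queries further down.

```cypher
// Hypothetical example data; URIs and values are made up for illustration.
// Assumption: the rdf:type of a subject is turned into the node label.
CREATE (offer:Offer {
  __URI__: 'http://example.org/instances/Offer1',
  price: 123.45                      // primitive object -> node property
})
CREATE (product:ProductType294 {
  __URI__: 'http://example.org/instances/Product1'
})
CREATE (offer)-[:`product`]->(product)  // resource object -> relationship
```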
Now I am trying to run the following queries, which are very slow in Neo4j,
Query 4: Feature with the highest ratio between price with that feature and price without that feature.
The corresponding SPARQL query for this,
prefix bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
prefix bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
Select ?feature ((?sumF*(?countTotal-?countF))/(?countF*(?sumTotal-?sumF)) As ?priceRatio)
{
{ Select (count(?price) As ?countTotal) (sum(xsd:float(str(?price))) As ?sumTotal)
{
?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/ProductType294> .
?offer bsbm:product ?product ;
bsbm:price ?price .
}
}
{ Select ?feature (count(?price2) As ?countF) (sum(xsd:float(str(?price2))) As ?sumF)
{
?product2 a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/ProductType294> ;
bsbm:productFeature ?feature .
?offer2 bsbm:product ?product2 ;
bsbm:price ?price2 .
}
Group By ?feature
}
}
Order By desc(?priceRatio) ?feature
Limit 100
Cypher query I created for this,
MATCH p1 = (offer1:Offer)-[r1:`product`]->(products1:ProductType294)
MATCH p2 = (offer2:Offer)-[r2:`product`]->(products2:ProductType294)-[:`productFeature`]->features
return (sum( DISTINCT offer2.price) * ( count( DISTINCT offer1.price) - count( DISTINCT offer2.price)) /(count(DISTINCT offer2.price)*(sum( DISTINCT offer1.price) - sum(DISTINCT offer2.price)))) AS cnt,features.__URI__ AS frui
ORDER BY cnt DESC,frui
This query is very slow. Please let me know if I am formulating the query in the wrong way.
Another query is Query 5: Show the most popular products of a specific product type for each country, by review count,
prefix bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
prefix bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
prefix rev: <http://purl.org/stuff/rev#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
Select ?country ?product ?nrOfReviews ?avgPrice
{
{ Select ?country (max(?nrOfReviews) As ?maxReviews)
{
{ Select ?country ?product (count(?review) As ?nrOfReviews)
{
?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/ProductType403> .
?review bsbm:reviewFor ?product ;
rev:reviewer ?reviewer .
?reviewer bsbm:country ?country .
}
Group By ?country ?product
}
}
Group By ?country
}
{ Select ?product (avg(xsd:float(str(?price))) As ?avgPrice)
{
?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/ProductType403> .
?offer bsbm:product ?product .
?offer bsbm:price ?price .
}
Group By ?product
}
{ Select ?country ?product (count(?review) As ?nrOfReviews)
{
?product a <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/ProductType403> .
?review bsbm:reviewFor ?product .
?review rev:reviewer ?reviewer .
?reviewer bsbm:country ?country .
}
Group By ?country ?product
}
FILTER(?nrOfReviews=?maxReviews)
}
Order By desc(?nrOfReviews) ?country ?product
The Cypher query I created for this is the following,
MATCH (products2:ProductType403)<-[:`reviewFor`]-(reviews:Review)-[:`reviewer`]->(rvrs)-[:`country`]->(countries)
with count(reviews) AS reviewcount,products2.__URI__ AS pruis, countries.__URI__ AS cntrs
MATCH (products1:ProductType403)<-[:`product`]-(offer:Offer)
with AVG(offer.price) AS avgPrice, MAX(reviewcount) AS maxrevs, cntrs
MATCH (products2:ProductType403)<-[:`reviewFor`]-(reviews:Review)-[:`reviewer`]->(rvrs)-[:`country`]->(countries)
with avgPrice, maxrevs,countries, count(reviews) AS rvs, countries.__URI__ AS curis, products2.__URI__ AS puris
where maxrevs=rvs
RETURN curis,puris,rvs,avgPrice
This query is very slow as well. Am I formulating the queries in the right way?
Answer 0 (score: 0)
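These look like global graph queries to me. What is the size of your dataset?
Are you creating a cartesian product between the two paths? Shouldn't the two paths be connected somehow?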
Shouldn't there be a type property on a ProductType label? (:ProductType {type:"294"})
That helps if you have an index on :ProductType(type), and possibly on :Order(orderNo).
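A minimal sketch of that modelling, assuming products carry a ProductType label and a type property as suggested above. This is not how the imported graph in the question is laid out, and the index name is made up. The `CREATE INDEX ... FOR ... ON` form is the syntax of recent Neo4j versions; Neo4j 2.x and 3.x use `CREATE INDEX ON :ProductType(type)` instead.

```cypher
// Assumption: product nodes are labelled ProductType and store the type
// number in a `type` property. The index name is hypothetical.
CREATE INDEX product_type_type FOR (p:ProductType) ON (p.type);

// Look the products up through the indexed property instead of
// through one label per product type.
MATCH (offer:Offer)-[:`product`]->(p:ProductType {type: "294"})
RETURN count(offer) AS offers;
```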
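I don't really understand the calculation.
The difference of the distinct price counts, times the sum of offer2's distinct prices, divided by the count of offer2's distinct prices times the delta between the two sums of offer prices?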
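Reading the SPARQL query rather than the Cypher one, the expression can be rearranged into a ratio of two averages. The symbols are the SPARQL variable names; the rearrangement is plain algebra and not something stated in the question.

```latex
\[
\mathit{priceRatio}
  = \frac{\mathit{sumF}\,(\mathit{countTotal} - \mathit{countF})}
         {\mathit{countF}\,(\mathit{sumTotal} - \mathit{sumF})}
  = \frac{\mathit{sumF} / \mathit{countF}}
         {(\mathit{sumTotal} - \mathit{sumF}) / (\mathit{countTotal} - \mathit{countF})}
\]
```

So it looks like the average price of offers whose product has the feature, divided by the average price of the remaining offers. Note that the SPARQL query uses plain count and sum, while the Cypher query applies DISTINCT to the prices, so the two would differ whenever several offers share the same price.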
MATCH (offer1:Offer)-[r1:`product`]->(products1:ProductType294)
MATCH (offer2:Offer)-[r2:`product`]->(products2:ProductType294)-[:`productFeature`]->features
RETURN (sum( DISTINCT offer2.price) *
( count( DISTINCT offer1.price) - count( DISTINCT offer2.price))
/ (count(DISTINCT offer2.price)*
(sum( DISTINCT offer1.price) - sum(DISTINCT offer2.price))))
AS cnt,features.__URI__ AS frui
ORDER BY cnt DESC,frui
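One possible way to avoid the cartesian product is to aggregate in two stages, so that the totals collapse to a single row before the per-feature pattern is matched. This is only a sketch under some assumptions: price is stored as a number (or a numeric string that toFloat can parse), DISTINCT is dropped to mirror the SPARQL query, and the LIMIT is taken from the SPARQL version. I have not run it against your data.

```cypher
// Stage 1: totals over all offers for products of this type.
// The aggregation leaves exactly one row.
MATCH (offer:Offer)-[:`product`]->(:ProductType294)
WITH count(offer.price) AS countTotal,
     sum(toFloat(offer.price)) AS sumTotal

// Stage 2: count and sum per feature. Because stage 1 produced a single
// row, this match is not multiplied by the number of offers.
MATCH (offer2:Offer)-[:`product`]->(:ProductType294)-[:`productFeature`]->(feature)
WITH feature, countTotal, sumTotal,
     count(offer2.price) AS countF,
     sum(toFloat(offer2.price)) AS sumF

// Same expression as ?priceRatio in the SPARQL query.
RETURN feature.__URI__ AS featureUri,
       (sumF * (countTotal - countF)) / (countF * (sumTotal - sumF)) AS priceRatio
ORDER BY priceRatio DESC, featureUri
LIMIT 100
```

The point of the WITH after the first MATCH is that the second MATCH then starts from one row instead of one row per offer, which is where the original query blows up.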