Question

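I want to analyze the structure of a graph, and one particular query I would like to try is extracting the distinct combinations of subject type - edge type - object type in the graph.

This is a follow-up to a couple of my earlier questions: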

How to generate all triples that fit a particular node type or/and edge type using SPARQL query?

How to list and count the different types of node and edge entities in the graph data using SPARQL query?

For example: suppose a semantic graph has edge types (property/predicate types) such as

IsCapitalOf
IsCityOf
HasPopulation, and so on.

and node types such as:

City
Country
River
Mountain, etc.

Then I should get:

City -> IsCapitalOf -> Country 4 tuples
City -> IsCityOf -> Country 21 tuples
River -> IsPartOf -> Country 3
River -> PassesThrough -> City 11

and so on...

Note: there are no literals in the object position, because I want every match to fit the unit subgraph pattern (subjecttype edgetype objecttype).

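To summarize, I think the approach would be:

a) count the distinct subject types in the graph
b) count the distinct edge types in the graph
c) count the distinct object types in the graph (my earlier questions already answered a/b/c)

Now d) generate all possible combinations (subject type -> edge type -> object type, no literals) together with the count of each such pattern (like a histogram).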

I hope the question is clearly stated.

Edit: adding sample data [a few lines from the full dataset]. This is the YAGO dataset, which is publicly available.

<Alabama>   rdf:type    <wordnet_country_108544813> .
<Abraham_Lincoln>   rdf:type    <wordnet_president_110467179> .
<Aristotle> rdf:type    <wordnet_writer_110794014> .
<Academy_Award_for_Best_Art_Direction>  rdf:type    <wordnet_award_106696483> .
<Academy_Award> rdf:type    <wordnet_award_106696483> .
<Actrius>   rdf:type    <wordnet_movie_106613686> .
<Animalia_(book)>   rdf:type    <wordnet_book_106410904> .
<Ayn_Rand>  rdf:type    <wordnet_novelist_110363573> .
<Allan_Dwan>    rdf:type    <wikicategory_American_film_directors> .
<Algeria>   rdf:type    <wordnet_country_108544813> .
<Andre_Agassi>  rdf:type    <wordnet_player_110439851> .
<Austro-Asiatic_languages>  rdf:type    <wordnet_language_106282651> .
<Afroasiatic_languages> rdf:type    <wordnet_language_106282651> .
<Andorra>   rdf:type    <wordnet_country_108544813> .
<Animal_Farm>   rdf:type    <wordnet_novelette_106368962> .
<Alaska>    rdf:type    <wordnet_country_108544813> .
<Aldous_Huxley> rdf:type    <wordnet_writer_110794014> .
<Andrei_Tarkovsky>  rdf:type    <wordnet_film_maker_110088390> .

Answer 1

Suppose you have data like this:

@prefix : <http://stackoverflow.com/q/24313367/1281433/> .

:City1 a :City .
:City2 a :City .

:Country1 a :Country .
:Country2 a :Country .
:Country3 a :Country .

:River1 a :River .
:River2 a :River .
:River3 a :River .

:City1 :isCapitalOf :Country1 .

:River1 :isPartOf :Country1, :Country2 .
:River2 :isPartOf :Country2, :Country3 .

:River1 :passesThrough :City1, :City2 .
:River2 :passesThrough :City2 .

Then this query gives you the kind of results you want, I think:

prefix : <http://stackoverflow.com/q/24313367/1281433/>

select ?type1 ?p ?type2 (count(distinct *) as ?count) where {
   [ a ?type1 ; ?p [ a ?type2 ] ] 
}
group by ?type1 ?p ?type2

----------------------------------------------
| type1  | p              | type2    | count |
==============================================
| :River | :passesThrough | :City    | 3     |
| :City  | :isCapitalOf   | :Country | 1     |
| :River | :isPartOf      | :Country | 4     |
----------------------------------------------
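As a cross-check on the counts in the table above, here is a minimal Python sketch that computes the same histogram without a SPARQL engine. It is not part of the original answer: `TYPES`, `EDGES`, and `type_pattern_histogram` are hypothetical names, and the data is a hand-copied, in-memory version of the sample Turtle with prefixes dropped.

```python
from collections import Counter

# Hypothetical in-memory copy of the sample data: node -> set of rdf:type values.
TYPES = {
    "City1": {"City"}, "City2": {"City"},
    "Country1": {"Country"}, "Country2": {"Country"}, "Country3": {"Country"},
    "River1": {"River"}, "River2": {"River"}, "River3": {"River"},
}

# The non-type triples from the sample data, as (subject, predicate, object).
EDGES = [
    ("City1", "isCapitalOf", "Country1"),
    ("River1", "isPartOf", "Country1"),
    ("River1", "isPartOf", "Country2"),
    ("River2", "isPartOf", "Country2"),
    ("River2", "isPartOf", "Country3"),
    ("River1", "passesThrough", "City1"),
    ("River1", "passesThrough", "City2"),
    ("River2", "passesThrough", "City2"),
]


def type_pattern_histogram(types, edges):
    """Count edges per (subject type, predicate, object type) combination.

    Only edges whose subject AND object both have at least one type are
    counted, which mirrors the query that requires a type on both ends.
    """
    histogram = Counter()
    for s, p, o in set(edges):  # set() so a repeated triple counts once
        for t1 in types.get(s, ()):
            for t2 in types.get(o, ()):
                histogram[(t1, p, t2)] += 1
    return histogram


histogram = type_pattern_histogram(TYPES, EDGES)
for (t1, p, t2), n in sorted(histogram.items()):
    print(f"{t1} -> {p} -> {t2}: {n}")
```

A node with several types contributes one count per type pair, which is also what the query does when a resource has more than one rdf:type.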

If you are not comfortable with the [ … ] blank node syntax, it may help to look at the expanded form:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT  ?type1 ?p ?type2 (count(distinct *) AS ?count)
WHERE
  { _:b0 rdf:type ?type1 .
    _:b0 ?p _:b1 .
    _:b1 rdf:type ?type2
  }
GROUP BY ?type1 ?p ?type2

But this only catches things that have a type. If you also want to include things that have no rdf:type, then you would want:

SELECT  ?type1 ?p ?type2 (count(distinct *) AS ?count) { 
    ?x ?p ?y
    optional { ?x a ?type1 }
    optional { ?y a ?type2 }
}
GROUP BY ?type1 ?p ?type2
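The effect of the two optional patterns can be sketched in Python as well. This is an illustration under assumptions, not the answer's method: `Literal`, `pattern_histogram_with_untyped`, and the tiny dataset (including `Town9`, `twinnedWith`, and the population value) are hypothetical. An untyped node shows up as `None`, playing the role of an unbound `?type1` or `?type2`. The `skip_literals` flag is an addition that the query above does not have; it reflects the question's requirement that literals be left out of the object position.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical marker for a literal object, e.g. a population figure."""
    value: str


def pattern_histogram_with_untyped(types, edges, skip_literals=True):
    """Count edges per (subject type, predicate, object type).

    Nodes without any type are reported with None instead of being dropped.
    """
    histogram = Counter()
    for s, p, o in set(edges):
        if skip_literals and isinstance(o, Literal):
            continue  # assumption: literal objects are excluded, per the question
        # An untyped node yields a single None, like an unbound variable.
        for t1 in types.get(s) or {None}:
            for t2 in types.get(o) or {None}:
                histogram[(t1, p, t2)] += 1
    return histogram


# Made-up data: Town9 has no type, and hasPopulation points at a literal.
types = {"City1": {"City"}, "Country1": {"Country"}}
edges = [
    ("City1", "isCapitalOf", "Country1"),
    ("City1", "twinnedWith", "Town9"),
    ("City1", "hasPopulation", Literal("120000")),
]

histogram = pattern_histogram_with_untyped(types, edges)
for pattern, n in histogram.items():
    print(pattern, n)
```

With `skip_literals=False` the `hasPopulation` edge is counted too, with `None` as its object type, which is closer to what the query with the two optional patterns would report.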

Computing custom histogram metrics to understand graph structure using SPARQL
