Select DBpedia resources whose abstract contains a chosen word at least N times?

Time: 2015-11-19 18:20:16

Tags: rdf sparql dbpedia

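I have this query, which returns some DBpedia resources together with their abstracts. How can I filter the results to keep only the resources whose abstract contains a given word at least a certain number of times?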

PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

select distinct ?resource ?url ?resume where {
   ?resource rdfs:label ?Nom.
   ?resource foaf:isPrimaryTopicOf ?url.
   ?resource dbo:abstract ?resume.
   FILTER langMatches( lang(?Nom), "EN" )
   FILTER langMatches( lang(?resume), "EN" )
   ?Nom <bif:contains> "apple".             
}  

This is the new query, without the BIND function:

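select ?resource ?url
       (strlen(replace(replace(lcase(?resume), 'jobs', '_'), '[^_]', '')) as ?nbr)
where {
   ?resource rdfs:label ?Nom.
   ?resource foaf:isPrimaryTopicOf ?url.
   ?resource dbo:abstract ?resume.
   FILTER langMatches( lang(?Nom), "EN" )
   FILTER langMatches( lang(?resume), "EN" )
   ?Nom <bif:contains> "Apple".
   # The ?nbr alias is not visible inside WHERE, so the expression is repeated here.
   # The search word is lowercase because the abstract is lowercased first.
   FILTER (strlen(replace(replace(lcase(?resume), 'jobs', '_'), '[^_]', '')) >= 1)
}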

2 answers:

Answer 0: (score: 2)

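This isn't absolutely perfect, but it should work relatively well for what you're trying to accomplish. You can use replace to replace every instance of the word you want to count with a single character (e.g., '_'). Then you can use replace again to replace everything except that character with the empty string. You are left with a string like '______', whose length is the number of times the word appears in the original string. For example, here is a query that counts "the" in the abstracts and keeps only the resources where "the" appears at least five times.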

select ?x ?nThe {
  values ?x { dbr:Horse dbr:Cat dbr:Dog }
  ?x dbo:abstract ?abs 
  filter langMatches(lang(?abs),'en')
  bind(strlen(replace(replace(?abs, '\\sthe\\s', '_'),'[^_]', '')) as ?nThe)
  filter (?nThe >= 5)
}
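The double-replace trick is easier to follow outside SPARQL. Here is a minimal Python sketch of the same logic, using `re.sub` in place of SPARQL's `replace`. The function name `count_by_double_replace` and the sample sentences are made up for illustration; they are not part of the query above.

```python
import re

def count_by_double_replace(text: str, pattern: str) -> int:
    """Count regex matches by mirroring the two nested replace() calls."""
    # Step 1: collapse every match of the pattern into a single marker character.
    marked = re.sub(pattern, "_", text)
    # Step 2: delete every character that is not the marker.
    only_markers = re.sub(r"[^_]", "", marked)
    # Step 3: the remaining length is the number of markers, i.e. of matches.
    return len(only_markers)

# Hypothetical sample text, not a real DBpedia abstract.
abstract = "In the morning the cat and the dog sat by the door."
print(count_by_double_replace(abstract, r"\sthe\s"))            # 4

# Two adjacent matches share a space, so only the first one is counted.
print(count_by_double_replace("see the the end", r"\sthe\s"))   # 1

# An underscore already present in the text is counted as if it were a match.
print(count_by_double_replace("a_b the c", r"\sthe\s"))         # 2
```

The last two calls show why the approach is only approximate: adjacent occurrences can be undercounted, and a marker character that already occurs in the abstract inflates the count. Picking a marker that is unlikely to appear in the text reduces the second problem.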


Answer 1: (score: 0)

Never mind, I found another form for my query:

PREFIX rdfs:   <http://www.w3.org/2000/01/rdf-schema#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo:     <http://dbpedia.org/ontology/> 
select distinct ?Nom ?resource ?url 
where {
   ?resource rdfs:label ?Nom.
   ?resource foaf:isPrimaryTopicOf ?url.
   ?resource dbo:abstract ?resume.
FILTER langMatches( lang(?Nom), "EN" )    
FILTER langMatches( lang(?resume), "EN" )
?Nom <bif:contains> "Apple".
FILTER regex(?resume,"Jobs")}
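Note that a plain regex filter like this one only tests that "Jobs" occurs at least once. One way to require N occurrences with a single pattern is to repeat "the word followed by anything" N times. The Python sketch below illustrates the idea; the helper name `mentions_at_least` and the sample text are hypothetical. A SPARQL counterpart would presumably look like `regex(?resume, "(Jobs.*){3}", "s")`, but that form is an untested assumption here.

```python
import re

def mentions_at_least(text: str, word: str, n: int) -> bool:
    """Return True if `word` occurs at least `n` times in `text`."""
    # Repeat "word, then any characters" n times; DOTALL lets '.' cross newlines.
    pattern = r"(?:%s.*?){%d}" % (re.escape(word), n)
    return re.search(pattern, text, flags=re.DOTALL) is not None

# Hypothetical sample text, not a real DBpedia abstract.
sample = "Jobs founded Apple. Jobs left.\nJobs returned."
print(mentions_at_least(sample, "Jobs", 3))  # True
print(mentions_at_least(sample, "Jobs", 4))  # False
```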

Thanks to everyone who tried to help me.