How to filter labels with a regular expression?

Asked: 2020-05-23 11:34:29

Tags: sparql wikidata

A filter with a regex clause is ignored: FILTER(!REGEX(STR(?aLabel), "^Q[0-9]+$")) ... How do I use a "label filter"?


Real case

SELECT ?a ?aLabel ?lat ?long WHERE {
  ?a wdt:P31 wd:Q274393 .   # bakery, or school, etc.
  ?a p:P625 ?statement .    # that has coordinate-location statement

  ?statement psv:P625 ?coordinate_node .
  ?coordinate_node wikibase:geoLatitude ?lat .
  ?coordinate_node wikibase:geoLongitude ?long .

  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]" .
  }
  #FILTER(!REGEX(STR(?aLabel), "^Q[0-9]+$")) # not working, no items !  
  #FILTER(!REGEX(STR(?a), "^Q[0-9]+$")) # not working, ignored !
}
ORDER BY (?aLabel)  # need to eliminate ugly items with no name



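PS: this is not the question itself, but another solution to the real-life problem would also be interesting: a clause that checks for "no label in the language" or "empty label".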

1 answer:

Answer 0: (score: 2)

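As @UninformedUser commented,

labels such as ?aLabel are magic variables that come from a special non-standard service, so they are bound only after the rest of the query has been evaluated

So, to avoid the magic, we can try isolating it in a subquery. It works fine!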

SELECT *
WHERE {
  # no constraints here in the main query, bypass the subquery
  { # subquery:
    SELECT ?a ?aLabel ?lat ?long 
    WHERE {
      ?a wdt:P31 wd:Q274393 .   # bakery, or school, etc.
      ?a p:P625 ?statement .    # that has coordinate-location statement
      ?statement psv:P625 ?coordinate_node .
      ?coordinate_node wikibase:geoLatitude ?lat .
      ?coordinate_node wikibase:geoLongitude ?long .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]" . }
    }
    ORDER BY (?aLabel)  
  }
  FILTER(!REGEX(STR(?aLabel), "^Q[0-9]+$")) # to eliminate ugly items with no name
}
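
A regex-free variant of the same idea, as a rough sketch. When the label service finds no label in the requested languages it falls back to the item's Q-id, so the outer filter can compare the label with the ID taken from the entity IRI. This assumes the usual `http://www.wikidata.org/entity/` namespace behind `wd:` and the prefixes that the Wikidata Query Service predefines. An item whose real label happens to equal its own Q-id would also be dropped.

```sparql
SELECT ?a ?aLabel ?lat ?long
WHERE {
  { # subquery: the label service binds ?aLabel in here
    SELECT ?a ?aLabel ?lat ?long
    WHERE {
      ?a wdt:P31 wd:Q274393 .
      ?a p:P625 ?statement .
      ?statement psv:P625 ?coordinate_node .
      ?coordinate_node wikibase:geoLatitude ?lat .
      ?coordinate_node wikibase:geoLongitude ?long .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]" . }
    }
  }
  # STR(?a) is the full IRI, e.g. "http://www.wikidata.org/entity/Q42",
  # so strip the namespace to get the bare Q-id before comparing.
  FILTER(STR(?aLabel) != STRAFTER(STR(?a), "http://www.wikidata.org/entity/"))
}
ORDER BY ?aLabel
```

The same detail explains why FILTER(!REGEX(STR(?a), "^Q[0-9]+$")) in the question appears to be ignored: STR(?a) is the full IRI, which never matches a pattern anchored at ^Q, so the negated filter keeps every row.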


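Alternative solution for filtering names

As noted at the end of the question, another way to solve the real-life problem is a clause that checks for "no label in the language" or "empty label". It needs neither a regex nor a subquery: just add the FILTER EXISTS below to the original query.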

SELECT ?a ?aLabel ?lat ?long WHERE {
  ?a wdt:P31 wd:Q274393 .   # bakery, or school, etc.
  ?a p:P625 ?statement .    # that has coordinate-location statement

  ?statement psv:P625 ?coordinate_node .
  ?coordinate_node wikibase:geoLatitude ?lat .
  ?coordinate_node wikibase:geoLongitude ?long .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]" .
  }
  FILTER EXISTS {
    ?a rdfs:label ?someLabel .
    FILTER(LANGMATCHES(LANG(?someLabel), "[AUTO_LANGUAGE]"))
  }
}
ORDER BY (?aLabel)
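
If the label service is not needed at all, the label can be fetched as an ordinary triple, which makes ?aLabel a normal variable that filters and ORDER BY see directly. A minimal sketch, assuming English labels only: the fixed "en" is an assumption that replaces [AUTO_LANGUAGE], and the `SERVICE wikibase:label` block is left out because ?aLabel is bound by hand.

```sparql
SELECT ?a ?aLabel ?lat ?long WHERE {
  ?a wdt:P31 wd:Q274393 .
  ?a p:P625 ?statement .
  ?statement psv:P625 ?coordinate_node .
  ?coordinate_node wikibase:geoLatitude ?lat .
  ?coordinate_node wikibase:geoLongitude ?long .
  # Plain triple pattern: items without an English label are dropped,
  # because this required pattern has no match for them.
  ?a rdfs:label ?aLabel .
  FILTER(LANG(?aLabel) = "en")
}
ORDER BY ?aLabel
```

Unlike the label service, this form has no fallback to other languages, so an item labeled only in, say, German would be filtered out as well.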
