Extracting resources containing an accent or umlaut

Time: 2015-07-28 16:34:28

Tags: regex sparql

I am looking for a specific person on DBpedia. I can retrieve the person and his name labels with the following query.

select distinct * where {
    :Antonio_Damasio a dbpedia-owl:Person;
                     rdfs:label ?name
}

If you look at the results, you will see quite a few labels for the name, some of which contain accented letters (e.g. á).


I now want to reverse the query: given a label, return the URI. A variation of the following does that:

select distinct * where {
    ?person a dbpedia-owl:Person;
            rdfs:label ?name
    filter(regex(?name, "^Antonio Damasio", "i"))
}


My problem is that the regular expression above omits the accented letters (such as á), and it only finds the resource because one of its many labels happens to be unaccented. However, not all resources have English labels. I want to be able to retrieve the resource when the user enters a label in plain Latin letters, without accents. Is there a way to write a query or regular expression that, given a label, checks for all the variants of a letter (e.g. u, ú, ü, ...), or do I simply need to write multiple queries?
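To make the single-query option concrete, here is a rough Python sketch that generates a character class for every base letter and splices the resulting pattern into the filter of the query above. The `VARIANTS` table and the helper names `strip_accents`, `accent_insensitive_pattern` and `build_query` are hypothetical and purely illustrative. The table is hand-picked and incomplete, and the sketch assumes that labels are stored in precomposed (NFC) form and that the endpoint predefines the `dbpedia-owl:` and `rdfs:` prefixes, as the queries above do.

```python
import unicodedata

# Hypothetical, hand-picked table of accented variants per base letter.
# It is incomplete on purpose; extend it for the letters you care about.
VARIANTS = {
    "a": "aáàâäãå",
    "c": "cç",
    "e": "eéèêë",
    "i": "iíìîï",
    "n": "nñ",
    "o": "oóòôöõ",
    "u": "uúùûü",
}

# Characters that have a special meaning in a regular expression.
REGEX_SPECIALS = set(".^$*+?()[]{}|\\")


def strip_accents(text):
    """Reduce accented letters to their base letters (á -> a)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_insensitive_pattern(label):
    """Build a regex in which every known base letter matches its variants."""
    parts = []
    for ch in strip_accents(label).lower():
        if ch in VARIANTS:
            parts.append("[" + VARIANTS[ch] + "]")
        elif ch in REGEX_SPECIALS:
            parts.append("\\" + ch)  # match the character literally
        else:
            parts.append(ch)
    return "^" + "".join(parts)


def build_query(label):
    """Embed the pattern in the reverse-lookup query from the question."""
    pattern = accent_insensitive_pattern(label)
    # Escape backslashes and quotes for a SPARQL string literal.
    literal = pattern.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "select distinct * where {\n"
        "    ?person a dbpedia-owl:Person;\n"
        "            rdfs:label ?name\n"
        '    filter(regex(?name, "' + literal + '", "i"))\n'
        "}"
    )


if __name__ == "__main__":
    print(build_query("Antonio Damasio"))
```

Under these assumptions, the user's input is first reduced to base letters, so `Antonio Damasio` and `António Damásio` produce the same pattern, and the `"i"` flag keeps the match case-insensitive. Whether this is preferable to issuing several queries is still the open question.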

0 answers:

No answers