SPARQL query for concatenating synonyms?

Asked: 2015-07-17 20:13:32

Tags: rdf sparql bioinformatics

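I am trying to convert data from an RDF source into the dictionary format expected by the @note2 biomedical text mining application.

Specifically, I am trying to collapse all synonyms of a concept onto a single line, as in dictionaries in this format: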

+----------+-----------+---------------+-------------------------+
|  class   |   term    |   synonyms    |      external IDs       |
+----------+-----------+---------------+-------------------------+
| food     | bread     | pan|brot      | source1;idA             |
| nutrient | vitamin C | ascorbic acid | source1;idC|source2;idD |
+----------+-----------+---------------+-------------------------+

I can get one synonym per row with a query like this:
SELECT ?term ?syn ?extid
FROM <http://bioportal.bioontology.org/ontologies/BTO>
WHERE
{
  ?extid <http://bioportal.bioontology.org/metadata/def/prefLabel> ?term .
  ?extid <http://www.geneontology.org/formats/oboInOWL#hasRelatedSynonym> ?syn .
}

which returns something like this:

+-------------------------+-------------------------+-----------------+
|          term           |           syn           |      extid      |
+-------------------------+-------------------------+-----------------+
| "stomach smooth muscle" | "gastric muscle"        | bto:BTO_0001818 |
| "stomach smooth muscle" | "gastric smooth muscle" | bto:BTO_0001818 |
| "stomach smooth muscle" | "stomach muscle"        | bto:BTO_0001818 |
+-------------------------+-------------------------+-----------------+

So... within SPARQL itself, is it possible to concatenate the synonyms and end up with something like this?
+-----------------------+----------------------------------------------------------------------------+-----------------+
|         term          |                                    syn                                     |      extid      |
+-----------------------+----------------------------------------------------------------------------+-----------------+
| stomach smooth muscle | gastric muscle|gastric smooth muscle|stomach smooth muscle |stomach muscle | bto:BTO_0001818 |
+-----------------------+----------------------------------------------------------------------------+-----------------+

If it makes any difference, I will be using Virtuoso Open Source.

1 answer:

Answer 0 (score: 2)

Thanks, @jdussault!

SELECT ?term (group_concat(distinct ?syn ; separator = "|") AS ?synset) ?extid
FROM <http://bioportal.bioontology.org/ontologies/BTO>
WHERE
{
  ?extid <http://bioportal.bioontology.org/metadata/def/prefLabel> ?term .
  ?extid <http://www.geneontology.org/formats/oboInOWL#hasRelatedSynonym> ?syn .
}
# ?extid is projected without an aggregate, so standard SPARQL 1.1 requires it in GROUP BY
GROUP BY ?term ?extid

+-----------------------+-----------------------------------------------------------------------------+-----------------+
|         term          |                                   synset                                    |      extid      |
+-----------------------+-----------------------------------------------------------------------------+-----------------+
| "3T3-F442A cell"      | "F442A cell|3T3-442A cell"                                                  | bto:BTO_0001169 |
| "stria terminalis"    | "terminal stria|Tarins tenia|tenia semicircularis|Fovilles fasciculus"      | bto:BTO_0004616 |
| "intervertebral disc" | "spinal disk|spinal disc|intervertebral fibrocartilage|intervertebral disk" | bto:BTO_0003625 |
+-----------------------+-----------------------------------------------------------------------------+-----------------+
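
For anyone who wants to see what the aggregation does, or who has to post-process results from an endpoint without GROUP_CONCAT, here is a minimal client-side sketch in Python. It is not part of the original answer: the `rows` list is simply the sample output from the question typed in by hand, and `collapse_synonyms` is a hypothetical helper name. It groups by both term and external ID, which is an assumption about the grouping you want.

```python
from collections import OrderedDict

# Sample rows copied by hand from the one-synonym-per-row result in the
# question: (term, synonym, external id).
rows = [
    ("stomach smooth muscle", "gastric muscle", "bto:BTO_0001818"),
    ("stomach smooth muscle", "gastric smooth muscle", "bto:BTO_0001818"),
    ("stomach smooth muscle", "stomach muscle", "bto:BTO_0001818"),
]


def collapse_synonyms(rows, separator="|"):
    """Group rows by (term, extid) and join the distinct synonyms.

    Hypothetical helper that mimics GROUP_CONCAT(DISTINCT ?syn) client-side.
    """
    groups = OrderedDict()
    for term, syn, extid in rows:
        syns = groups.setdefault((term, extid), [])
        if syn not in syns:  # mirrors DISTINCT
            syns.append(syn)
    return [
        (term, separator.join(syns), extid)
        for (term, extid), syns in groups.items()
    ]


collapsed = collapse_synonyms(rows)
for term, synset, extid in collapsed:
    # One tab-separated line per concept.
    print(term, synset, extid, sep="\t")
```

One difference worth knowing: this sketch keeps synonyms in first-seen order, whereas SPARQL does not guarantee any particular order inside a GROUP_CONCAT result.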