I wrote a query that joins six tables:
SELECT a.accession, b.value, c.name, d.description, e.value, f.seqlen, f.residues
FROM chado.dbxref a inner join chado.dbxrefprop b on a.dbxref_id = b.dbxref_id
inner join chado.biomaterial d on b.dbxref_id = d.dbxref_id
inner join chado.feature f on d.dbxref_id = f.dbxref_id
inner join chado.biomaterialprop e on d.biomaterial_id = e.biomaterial_id
inner join chado.contact c on d.biosourceprovider_id = c.contact_id;
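I am currently working with a PostgreSQL schema called Chado (http://gmod.org/wiki/Chado_Tables). Because I tried to stick to the pre-existing schema, several of the joined values are stored in the same table (two different values in the dbxrefprop table and three different values in the biomaterialprop table). As a result, querying the database produces a lot of redundant output. Is there a way to reduce this redundancy by modifying the query? Ideally, the output would look something like this: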
test001 | GB0101 | source011 | Faaberg,K.; Lyoo,K.; Korol,D.M. | serum | T1 | Iowa, USA | 01 Jan 2005 | 1234 | AUGAACGCCUUGCAUUACUAUGACUAUGAUU
Answer 0 (score: 0)
A working query:
SELECT a.accession,
       string_agg(DISTINCT b.value, ' | ' ORDER BY b.value) AS bvalue_list,
       c.name,
       d.description,
       string_agg(DISTINCT e.value, ' | ' ORDER BY e.value) AS evalue_list,
       f.seqlen,
       f.residues
FROM chado.dbxref a INNER JOIN chado.dbxrefprop b ON a.dbxref_id = b.dbxref_id
INNER JOIN chado.biomaterial d ON b.dbxref_id = d.dbxref_id
INNER JOIN chado.feature f ON d.dbxref_id = f.dbxref_id
INNER JOIN chado.biomaterialprop e ON d.biomaterial_id = e.biomaterial_id
INNER JOIN chado.contact c ON d.biosourceprovider_id = c.contact_id
GROUP BY a.accession, c.name, d.description, f.seqlen, f.residues;
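
Why this helps: each dbxref row matches two dbxrefprop rows and three biomaterialprop rows, so the plain join returns every combination (2 x 3 = 6 rows per accession). Grouping by the non-repeating columns and aggregating the property values collapses those combinations into one row. The sketch below reproduces the effect on a toy dataset. It is only an illustration under some assumptions: it uses Python's built-in sqlite3 module rather than PostgreSQL, SQLite's group_concat stands in for string_agg (with DISTINCT it accepts no custom separator and gives no ordering guarantee), the tables are cut down to a few columns, and all sample values are made up.

```python
import sqlite3

# Toy stand-in for a few Chado tables (simplified columns, made-up data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dbxref (dbxref_id INTEGER PRIMARY KEY, accession TEXT);
    CREATE TABLE dbxrefprop (dbxref_id INTEGER, value TEXT);
    CREATE TABLE biomaterial (biomaterial_id INTEGER PRIMARY KEY,
                              dbxref_id INTEGER, description TEXT);
    CREATE TABLE biomaterialprop (biomaterial_id INTEGER, value TEXT);

    INSERT INTO dbxref VALUES (1, 'test001');
    INSERT INTO dbxrefprop VALUES (1, 'GB0101'), (1, 'source011');
    INSERT INTO biomaterial VALUES (10, 1, 'serum');
    INSERT INTO biomaterialprop VALUES (10, 'T1'), (10, 'Iowa'), (10, '2005');
""")

# Same join shape as in the question, minus the feature and contact tables.
JOINS = """
    FROM dbxref a
    INNER JOIN dbxrefprop b ON a.dbxref_id = b.dbxref_id
    INNER JOIN biomaterial d ON b.dbxref_id = d.dbxref_id
    INNER JOIN biomaterialprop e ON d.biomaterial_id = e.biomaterial_id
"""

# Plain join: one row per (dbxrefprop, biomaterialprop) combination.
plain_rows = conn.execute(
    "SELECT a.accession, b.value, d.description, e.value" + JOINS
).fetchall()

# Grouped join: the property values are folded into one string per group.
# group_concat(DISTINCT ...) plays the role of PostgreSQL's string_agg here.
grouped_rows = conn.execute(
    "SELECT a.accession, group_concat(DISTINCT b.value), "
    "d.description, group_concat(DISTINCT e.value)"
    + JOINS
    + "GROUP BY a.accession, d.description"
).fetchall()

print(len(plain_rows))    # number of rows from the plain join
print(len(grouped_rows))  # number of rows after grouping
for row in grouped_rows:
    print(row)
```

One thing to keep in mind: DISTINCT is what removes the duplicates created by the cross-multiplication, but it also merges property values that are genuinely repeated, so two identical values in biomaterialprop would show up only once in the aggregated list.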