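Selecting intersecting and flanking regions in a MySQL query

Time: 2014-10-23 15:54:22

Tags: python mysql mysql-python

I am working with a MySQL database that contains a table of genomic islands, in this format: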

+----+-------+----------+----------+-----------------------------------------------+
| id | chrom | start    | end      | line_string                                   |
+----+-------+----------+----------+-----------------------------------------------+
|  1 |     1 | 36568608 | 36569851 | (binary geometry)                             |
|  2 |     1 | 82313020 | 82313491 | (binary geometry)                             |
+----+-------+----------+----------+-----------------------------------------------+

The line string is stored as GeomFromText('Linestring(chrom start, chrom end)'). The "start" and "end" numbers are base positions.
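To make the encoding concrete, here is a minimal sketch of how the WKT text for one region could be assembled in Python. The helper name `region_wkt` is hypothetical and not part of the original script; it only reproduces the `Linestring(chrom start, chrom end)` pattern described above, where the x coordinate carries the chromosome and the y coordinate carries the base position.

```python
def region_wkt(chrom, start, end):
    """Build the WKT text for a region (hypothetical helper).

    Both points share the chromosome as x, so the linestring is a
    vertical segment running from `start` to `end` on that chromosome.
    """
    return "Linestring(%d %d, %d %d)" % (chrom, start, chrom, end)


# First row of the example table above.
print(region_wkt(1, 36568608, 36569851))
```

Because both points have the same x value, the minimum bounding rectangle of the segment is just the interval between start and end on that chromosome, which is what MBRIntersects compares.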

I currently use the following query in my Python script to classify a region as Island or non-Island:

query = """
SELECT 'Island' AS Island FROM islands
WHERE MBRIntersects(GeomFromText('Linestring(%d %d, %d %d)'), line_string)
UNION ALL SELECT 'non-Island' LIMIT 1
""" % (Chr, Start, Chr, End)

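However, I would like to modify this query so that it also identifies island shores and shelves, defined as:

Island shore - within 2,000 bases of an island

Island shelf - between 2,000 and 4,000 bases from an island

1 Answer:

Answer 0 (score: 1):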

I solved this by using:

query = """
SELECT 'Island' AS Island FROM methylation.islands FORCE INDEX (locations)
    WHERE MBRIntersects(GeomFromText('Linestring(%d %d, %d %d)'), line_string)
UNION ALL SELECT 'Shore' FROM methylation.islands FORCE INDEX (locations)
    WHERE MBRIntersects(GeomFromText('Linestring(%d %d, %d %d)'), line_string)
UNION ALL SELECT 'Shelf' FROM methylation.islands FORCE INDEX (locations)
    WHERE MBRIntersects(GeomFromText('Linestring(%d %d, %d %d)'), line_string)
UNION ALL SELECT 'Other' LIMIT 1
""" % (Chr, Start, Chr, End,                   # Island: the region itself
       Chr, Start - 2000, Chr, End + 2000,     # Shore: region widened by 2,000 bases
       Chr, Start - 4000, Chr, End + 4000)     # Shelf: region widened by 4,000 bases

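This way, any region that intersects an island is labeled "Island". Failing that, if it lies within +/- 2,000 base pairs of an island it is labeled "Shore", and failing that, if it lies within +/- 4,000 base pairs it is labeled "Shelf". Everything else is labeled "Other". Because of LIMIT 1, only the first matching label is returned.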
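The same cascade can be sketched in plain Python, which may help when checking the query's results on a few known regions. This is only an illustration under stated assumptions: the `islands` list is a hypothetical in-memory stand-in for the table, `overlaps` and `classify` are made-up names, and closed-interval overlap is used as an approximation of what MBRIntersects does for these single-chromosome linestrings.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if closed intervals [a_start, a_end] and [b_start, b_end] share a position."""
    return a_start <= b_end and b_start <= a_end


def classify(chrom, start, end, islands):
    """Return the first matching label, mirroring the UNION ALL ... LIMIT 1 cascade.

    `islands` is a hypothetical list of (chrom, start, end) tuples that
    stands in for the rows of the islands table.
    """
    # Widen the query region step by step, as the SQL widens the linestring.
    for label, pad in (("Island", 0), ("Shore", 2000), ("Shelf", 4000)):
        for i_chrom, i_start, i_end in islands:
            same_chrom = i_chrom == chrom
            if same_chrom and overlaps(start - pad, end + pad, i_start, i_end):
                return label
    # Nothing within 4,000 bases on this chromosome.
    return "Other"


# First row of the example table, used as the only island.
islands = [(1, 36568608, 36569851)]
for region in [(1, 36569000, 36569100), (1, 36570000, 36570100),
               (1, 36573000, 36573100), (1, 36580000, 36580100)]:
    print(region, classify(*region, islands=islands))
```

The labels are tried from the narrowest window to the widest, so a region that would match several windows receives the most specific label, just as the first branch of the UNION ALL that produces a row wins under LIMIT 1.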