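I'm working with some data I get from work, and I'm trying to come up with a query that will make my life a lot easier (I already spend time importing this data into MySQL).
I have a bunch of samples, each with different values (area) at different times (a bit like a bell curve). For each sample I need to add up every peak (area) before the largest peak, and every peak (area) after the largest peak, giving two columns in total. The task sounds very simple, but I'm having trouble coming up with a working query.
I came up with something like the query below, but the problem is that I can't "group by" inside the where clause: the subquery returns the largest area in the whole table rather than one per sample, so I can't compare values within each sample. I've tried a few different approaches, but none of them got anywhere. Any help would be greatly appreciated.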
SELECT Sample_name, SUM(per_area) AS '% area'  /* for the areas before the peak */
FROM W_data.SEC_results
WHERE retention BETWEEN 0  /* retention = time */
  AND (SELECT retention
       FROM W_data.SEC_results
       WHERE per_area = (
           /* selects the largest area in the entire set, instead of a specific sample's */
           SELECT MAX(per_area)
           FROM W_data.SEC_results))
GROUP BY vial;
Table:
+-------------+------+-----------+----------+
| Sample_name | vial | retention | per_area |
+-------------+------+-----------+----------+
| a           |   74 |    14.146 |     0.08 |
| a           |   74 |    16.624 |    99.79 |
| a           |   74 |    20.343 |     0.13 |
| b           |   75 |    12.438 |     0.16 |
| b           |   75 |    13.653 |     1.85 |
| b           |   75 |    16.588 |    97.95 |
| b           |   75 |    20.316 |     0.04 |
+-------------+------+-----------+----------+

Desired output:

+-------------+---------------+--------------+
| sample_name | Area (before) | Area (after) |
+-------------+---------------+--------------+
| a           |          0.08 |         0.13 |
| b           |          2.01 |         0.04 |
+-------------+---------------+--------------+
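
To reproduce the problem, the sample rows from the first table could be loaded with a script along these lines. This is only a sketch: the column types are assumptions (the schema is not given here), and the table is created without the `W_data` database prefix.

```sql
-- Assumed schema: the column types below are guesses, not the real definition.
-- Table names can be case-sensitive in MySQL depending on the platform,
-- so keep the spelling consistent with the queries you run against it.
CREATE TABLE SEC_results (
    Sample_name VARCHAR(50),
    vial        INT,
    retention   DECIMAL(8,3),  -- retention time of the peak
    per_area    DECIMAL(6,2)   -- percent area of the peak
);

INSERT INTO SEC_results (Sample_name, vial, retention, per_area) VALUES
    ('a', 74, 14.146,  0.08),
    ('a', 74, 16.624, 99.79),
    ('a', 74, 20.343,  0.13),
    ('b', 75, 12.438,  0.16),
    ('b', 75, 13.653,  1.85),
    ('b', 75, 16.588, 97.95),
    ('b', 75, 20.316,  0.04);
```

Exact `DECIMAL` columns are used so that sums such as 0.16 + 1.85 come out as exactly 2.01; with `FLOAT` or `DOUBLE` columns the totals may show small rounding differences.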
Answer 0 (score: 1)
The logic is: first find the maximum per_area for each vial.
select vial,max(per_area) maxarea from sec_results group by vial
| 74 | 99.79 |
| 75 | 97.95 |
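Then find the corresponding time (the retention column) for each of them.
select sr.vial, sr.retention, mt.maxarea from sec_results sr,
(select vial, max(per_area) maxarea from sec_results group by vial) mt
where sr.vial = mt.vial and sr.per_area = mt.maxarea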
| 74 | 16.624 | 99.79 |
| 75 | 16.588 | 97.95 |
Finally, sum the values before and after those times separately.
select a.sample_name,
       sum(if(a.retention < temp.retention, a.per_area, 0)) Area_before,
       sum(if(a.retention > temp.retention, a.per_area, 0)) Area_after
from sec_results a,
     (select sr.vial, sr.retention, mt.maxarea
      from sec_results sr,
           (select vial, max(per_area) maxarea
            from sec_results
            group by vial) mt
      where sr.vial = mt.vial
        and sr.per_area = mt.maxarea
     ) temp
where a.vial = temp.vial
group by a.vial, a.sample_name;
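
As a possible alternative, the same idea can be sketched with a window function. This assumes MySQL 8.0 or later (the MySQL version is not stated in the question) and uses the standard `CASE` expression in place of MySQL's `IF()`. The alias `peak_retention` and the derived-table name `ranked` are made up for illustration.

```sql
-- Sketch only: needs window-function support (MySQL 8.0+ is an assumption).
SELECT sample_name,
       SUM(CASE WHEN retention < peak_retention THEN per_area ELSE 0 END) AS Area_before,
       SUM(CASE WHEN retention > peak_retention THEN per_area ELSE 0 END) AS Area_after
FROM (
    -- Tag every row with the retention time of the largest peak in its vial.
    SELECT sample_name, vial, retention, per_area,
           FIRST_VALUE(retention) OVER (PARTITION BY vial
                                        ORDER BY per_area DESC) AS peak_retention
    FROM sec_results
) ranked
GROUP BY vial, sample_name;
```

For the sample data in the question this should reproduce the requested totals (0.08 and 0.13 for sample a, 2.01 and 0.04 for sample b). If two peaks in one vial tie for the largest per_area, which of them is treated as the maximum is not defined, so a tie-breaker would have to be added to the `ORDER BY`.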
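Answer 1 (score: 0)
I had a play with this just out of interest, trying to come up with a way to get the right results without using IF inside the SUM.
I managed it, but I don't think it is efficient. Still, having put in the effort, I thought I would post the results here for interest.
First way: join a pair of sub-selects, each of which gets the sum either before or after the peak: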
SELECT DISTINCT a.sample_name, b.AreaBefore, c.AreaAfter
FROM sec_results a
LEFT OUTER JOIN (
    /* sr is the peak row of each vial; sr_max matches the rows before it */
    SELECT sr.vial, SUM(sr_max.per_area) AS AreaBefore
    FROM sec_results sr
    INNER JOIN (
        SELECT vial, max(per_area) AS maxarea
        FROM sec_results
        GROUP BY vial) Sub1
    ON sr.vial = Sub1.vial
    AND sr.per_area = Sub1.maxarea
    INNER JOIN sec_results sr_max
    ON sr.vial = sr_max.vial
    AND sr.retention > sr_max.retention
    GROUP BY sr.vial
) b
ON a.vial = b.vial
LEFT OUTER JOIN (
    /* same again, but sr_max matches the rows after the peak */
    SELECT sr.vial, SUM(sr_max.per_area) AS AreaAfter
    FROM sec_results sr
    INNER JOIN (
        SELECT vial, max(per_area) AS maxarea
        FROM sec_results
        GROUP BY vial) Sub1
    ON sr.vial = Sub1.vial
    AND sr.per_area = Sub1.maxarea
    INNER JOIN sec_results sr_max
    ON sr.vial = sr_max.vial
    AND sr.retention < sr_max.retention
    GROUP BY sr.vial
) c
ON a.vial = c.vial
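Second way: use a sub-select to get, for every record, the sum of the records before (or after) it, then join that against a sub-select that finds the maximum record.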
SELECT a_sec_result.vial, a_sec_result.sample_name, area_before.area AS AreaBefore, area_after.area AS AreaAfter
FROM (
    SELECT sr.vial, sr.sample_name, sr.retention
    FROM sec_results sr
    INNER JOIN (
        SELECT vial, sample_name, max(per_area) AS maxarea
        FROM sec_results
        GROUP BY vial, sample_name
    ) max_area_sub
    ON sr.vial = max_area_sub.vial
    AND sr.sample_name = max_area_sub.sample_name
    AND sr.per_area = max_area_sub.maxarea
) a_sec_result
INNER JOIN (
    SELECT sr.vial, sr.retention, SUM(sr2.per_area) AS area
    FROM sec_results sr
    LEFT OUTER JOIN sec_results sr2
    ON sr.vial = sr2.vial
    AND sr.retention > sr2.retention
    GROUP BY sr.vial, sr.retention
) area_before
ON a_sec_result.vial = area_before.vial
AND a_sec_result.retention = area_before.retention
INNER JOIN (
    SELECT sr.vial, sr.retention, SUM(sr2.per_area) AS area
    FROM sec_results sr
    LEFT OUTER JOIN sec_results sr2
    ON sr.vial = sr2.vial
    AND sr.retention < sr2.retention
    GROUP BY sr.vial, sr.retention
) area_after
ON a_sec_result.vial = area_after.vial
AND a_sec_result.retention = area_after.retention
Both should give the correct results.