MySQL: using GROUP BY to pull another varchar field from the same row

Time: 2014-01-31 23:42:07

Tags: mysql

I have a transactions table with the following structure:

+--------------------------------------+--------------------------+------------+
| contact_id                           | return_reason            | date       |
+--------------------------------------+--------------------------+------------+
| 2091c2ed-8f9b-bcfe-1884-50d3ab2cd02d | R01 - Insufficient Funds | 2014-01-25 |
| 2091c2ed-8f9b-bcfe-1884-50d3ab2cd02d | R08 - Payment Stopped    | 2013-09-15 |
| 2091c2ed-8f9b-bcfe-1884-50d3ab2cd02d | R01 - Insufficient Funds | 2013-08-15 |
| 2091c2ed-8f9b-bcfe-1884-50d3ab2cd02d | R01 - Insufficient Funds | 2013-07-31 |
| 2091c2ed-8f9b-bcfe-1884-50d3ab2cd02d | R01 - Insufficient Funds | 2013-05-31 |
| 10101a4f-eaf8-b05a-4813-51a682df2189 | R08 - Payment Stopped    | 2013-03-15 |
| 10101a4f-eaf8-b05a-4813-51a682df2189 | R08 - Payment Stopped    | 2013-04-15 |
| 10101a4f-eaf8-b05a-4813-51a682df2189 | R08 - Payment Stopped    | 2013-05-15 |
+--------------------------------------+--------------------------+------------+
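
To reproduce the problem, a setup script along these lines could be used. The column types are assumptions; the question only shows the column names and the sample values.

```sql
-- Column types are assumed; only the names and values come from the question.
CREATE TABLE transactions (
    contact_id    CHAR(36)    NOT NULL,
    return_reason VARCHAR(64) NOT NULL,
    date          DATE        NOT NULL
);

INSERT INTO transactions (contact_id, return_reason, date) VALUES
    ('2091c2ed-8f9b-bcfe-1884-50d3ab2cd02d', 'R01 - Insufficient Funds', '2014-01-25'),
    ('2091c2ed-8f9b-bcfe-1884-50d3ab2cd02d', 'R08 - Payment Stopped',    '2013-09-15'),
    ('2091c2ed-8f9b-bcfe-1884-50d3ab2cd02d', 'R01 - Insufficient Funds', '2013-08-15'),
    ('2091c2ed-8f9b-bcfe-1884-50d3ab2cd02d', 'R01 - Insufficient Funds', '2013-07-31'),
    ('2091c2ed-8f9b-bcfe-1884-50d3ab2cd02d', 'R01 - Insufficient Funds', '2013-05-31'),
    ('10101a4f-eaf8-b05a-4813-51a682df2189', 'R08 - Payment Stopped',    '2013-03-15'),
    ('10101a4f-eaf8-b05a-4813-51a682df2189', 'R08 - Payment Stopped',    '2013-04-15'),
    ('10101a4f-eaf8-b05a-4813-51a682df2189', 'R08 - Payment Stopped',    '2013-05-15');
```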

What I want to find is the most recent return_reason for each contact_id.

My current query is:

select contact_id,return_reason as most_recent_return_reason,max(date) as ram_date from transactions group by contact_id order by ram_date desc;

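My query does return the correct return_reason, but I am not sure whether that is because the query is actually correct or because the odds happen to be in my favor. I am worried because this very similar query returns the wrong date values: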

select contact_id,return_reason as most_recent_return_reason,date as ram_date from transactions group by contact_id order by ram_date desc;

3 answers:

Answer 0: (score: 1)

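You are right to say the "odds" are in your favor. This query is invalid in most other DBMSs, because return_reason is neither in the GROUP BY clause nor inside an aggregate function. MySQL is more lenient and lets you run it, but the value you get back for return_reason is undefined.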
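
One way to make MySQL flag this kind of query instead of silently picking a value is the ONLY_FULL_GROUP_BY SQL mode. The following is a minimal sketch, not part of the original answer. Note that it replaces the session's sql_mode with that single mode, which may not be appropriate on a server that relies on other modes.

```sql
-- Sketch only: replaces the session's SQL mode with ONLY_FULL_GROUP_BY alone.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode enabled, this should be rejected with error 1055,
-- because return_reason is neither grouped nor aggregated.
SELECT contact_id, return_reason, MAX(date) AS ram_date
FROM transactions
GROUP BY contact_id;
```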

What you need are the aggregate functions FIRST() and LAST(), which do not exist in MySQL.

Solution: you can use GROUP_CONCAT() to do what you want. See the documentation for how it works: http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_group-concat

Answer 1: (score: 1)

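As Fabien said, you are getting lucky, because you are relying on MySQL's extension to GROUP BY. As the MySQL documentation explains, a column in the SELECT list that is neither in the GROUP BY clause nor wrapped in an aggregate function takes its value from an arbitrary row of the group. That means return_reason comes from an arbitrary row, not necessarily from the row with the maximum date.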

Here is a fairly simple way to get the most recent reason:

select contact_id,
       substring_index(group_concat(return_reason order by date desc), ',', 1
                      ) as most_recent_return_reason,
       max(date) as ram_date
from transactions
group by contact_id
order by ram_date desc;
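
One caveat with the query above: GROUP_CONCAT() joins values with a comma by default, so a return_reason that itself contains a comma would be cut short by SUBSTRING_INDEX(). A hedged variant is sketched below with an explicit separator. The choice of '|' is an assumption; any string that cannot occur in return_reason would do.

```sql
-- Variant of the query above with an explicit separator.
-- '|' is an assumed separator that never appears in return_reason.
SELECT contact_id,
       SUBSTRING_INDEX(
           GROUP_CONCAT(return_reason ORDER BY date DESC SEPARATOR '|'),
           '|', 1
       ) AS most_recent_return_reason,
       MAX(date) AS ram_date
FROM transactions
GROUP BY contact_id
ORDER BY ram_date DESC;
```

GROUP_CONCAT() output is truncated at group_concat_max_len, but truncation only affects the tail of the concatenated string, so the first element survives as long as a single return_reason fits within that limit.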

Answer 2: (score: 1)

Actually, MAX in the SELECT list only shapes the value of that one output column; it does not filter which rows make up your result.

You can do something like this:

SELECT contact_id,return_reason AS most_recent_return_reason, date AS ram_date
FROM transactions AS t
WHERE t.date = (
  SELECT MAX(t2.date)
  FROM transactions AS t2
  WHERE t2.contact_id = t.contact_id
);
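
The same idea can also be written as a join against a derived table that holds each contact's latest date. This is an equivalent sketch rather than part of the original answer; the alias names latest and max_date are made up for illustration.

```sql
-- Find each contact's latest date once, then join back to fetch that row.
-- The aliases "latest" and "max_date" are illustrative names.
SELECT t.contact_id,
       t.return_reason AS most_recent_return_reason,
       t.date          AS ram_date
FROM transactions AS t
JOIN (
    SELECT contact_id, MAX(date) AS max_date
    FROM transactions
    GROUP BY contact_id
) AS latest
  ON latest.contact_id = t.contact_id
 AND latest.max_date   = t.date
ORDER BY ram_date DESC;
```

As with the correlated subquery, a contact with two rows sharing the same latest date would appear twice. With the sample data in the question there are no such ties.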

You can also see the difference in this SQL Fiddle: http://sqlfiddle.com/#!2/c3687/12