How to get the max average and the max timestamp

Time: 2017-01-03 11:41:24

Tags: mysql eloquent aggregate-functions

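Question: How do I get the top 3 organizations together with their latest review and the related data? Question 2: How do I implement this query in Laravel Eloquent?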

Best organization = the organization with the best average score in the review table

Latest review = the review with the highest timestamp, restricted to reviews of that particular organization

Related data = data from the tables address, city, country, activity, and user -> only name, surname, and ID

Table schema

So far I have this:

SELECT *, AVG(r.General) as average, COUNT(r.BuyerId) as countBuyer, COUNT(r.SupplierId) as countSupplier, COUNT(r.EmployeeId) as countEmployee, COUNT(r.OtherId) as countOther, COUNT(r.Id) as countReview
FROM organization as o 
INNER JOIN review as r ON o.Id = r.OrganizationId 
INNER JOIN user as u ON u.Id = r.UserId
INNER JOIN address as a ON a.Id = o.AddressId
INNER JOIN city as c ON c.Id = a.CityId
INNER JOIN country as co ON co.Id = c.CountryId
INNER JOIN activity as ac ON ac.Id = o.ActivityId
GROUP BY o.Id
ORDER BY `average` DESC  
LIMIT 3

1 answer:

Answer 0: (score: 1)

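Your problem: JOIN operations cause combinatorial explosions, meaning lots and lots of rows in your result set. When you combine JOIN operations with GROUP BY, rows can be duplicated many times, and that distorts sums, counts, and averages because things are counted more than once.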
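The distortion described above can be reproduced in a minimal sketch. It uses Python's built-in sqlite3 module rather than MySQL, and the `org_tag` table is a hypothetical one-to-many table that is not part of the question's schema; it is there only to multiply rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organization (Id INTEGER PRIMARY KEY);
    CREATE TABLE review (Id INTEGER PRIMARY KEY, OrganizationId INTEGER, General REAL);
    -- hypothetical one-to-many table, not in the question's schema
    CREATE TABLE org_tag (OrganizationId INTEGER, Tag TEXT);
    INSERT INTO organization VALUES (1);
    INSERT INTO review VALUES (1, 1, 5.0), (2, 1, 2.0);
    INSERT INTO org_tag VALUES (1, 'a'), (1, 'b'), (1, 'c');
""")

# Joining two one-to-many tables yields 2 reviews x 3 tags = 6 rows per organization.
fanned_out = conn.execute("""
    SELECT COUNT(r.Id), SUM(r.General), AVG(r.General)
      FROM organization o
      JOIN review r ON r.OrganizationId = o.Id
      JOIN org_tag t ON t.OrganizationId = o.Id
     GROUP BY o.Id
""").fetchone()

# Aggregating the review table on its own counts each review once.
clean = conn.execute("""
    SELECT COUNT(*), SUM(General), AVG(General)
      FROM review
     GROUP BY OrganizationId
""").fetchone()

print(fanned_out)  # (6, 21.0, 3.5)
print(clean)       # (2, 7.0, 3.5)
```

In this particular data set the average is unchanged because every review is duplicated the same number of times; counts and sums are inflated threefold. When the duplication is uneven across rows, the average shifts as well.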

Also, COUNT(some_column) counts the non-null values in that column, while COUNT(*) counts all rows. If you want the number of distinct buyers, I think you want COUNT(DISTINCT BuyerId).
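The three COUNT forms can be compared side by side in a small sketch. It runs on Python's built-in sqlite3 module instead of MySQL, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE review (Id INTEGER PRIMARY KEY, OrganizationId INTEGER, BuyerId INTEGER);
    -- buyer 10 reviews twice, one review has no buyer, buyer 11 reviews once
    INSERT INTO review VALUES (1, 1, 10), (2, 1, 10), (3, 1, NULL), (4, 1, 11);
""")

counts = conn.execute("""
    SELECT COUNT(*),                -- all rows
           COUNT(BuyerId),          -- rows where BuyerId is not NULL
           COUNT(DISTINCT BuyerId)  -- distinct non-NULL buyers
      FROM review
     WHERE OrganizationId = 1
""").fetchone()

print(counts)  # (4, 3, 2)
```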

If you want clean aggregates from the review table, you need to compute them in a subquery.

              SELECT OrganizationID,
                     AVG(General) as average, 
                     COUNT(DISTINCT BuyerId) as countBuyer, 
                     COUNT(DISTINCT SupplierId) as countSupplier, 
                     COUNT(DISTINCT EmployeeId) as countEmployee, 
                     COUNT(DISTINCT OtherId) as countOther, 
                     COUNT(*) as countReview
                FROM review
            GROUP BY OrganizationID   
            ORDER BY AVG(General) DESC
            LIMIT 3 

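This gives you a virtual table with one row per organization, showing the summary review data. It is limited to the top three by average.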

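The latest review is more complicated. The following should work as long as no organization has two reviews that share its maximum timestamp.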

             SELECT r.*
               FROM review r
               JOIN (
                       SELECT OrganizationId,
                              MAX(Timestamp) Timestamp
                         FROM review
                        GROUP BY OrganizationId
                    ) maxts   ON r.OrganizationId = maxts.OrganizationId
                             AND r.Timestamp = maxts.Timestamp
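The caveat about duplicate timestamps can be seen in a short sketch, together with one possible way around it. This is not part of the original answer: the tie-break on the highest Id is an assumption about what "latest" should mean when timestamps collide, and the code runs on Python's built-in sqlite3 module, so it has not been checked against MySQL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE review (Id INTEGER PRIMARY KEY, OrganizationId INTEGER, Timestamp TEXT);
    INSERT INTO review VALUES
        (1, 1, '2017-01-01 10:00:00'),
        (2, 1, '2017-01-02 10:00:00'),
        (3, 1, '2017-01-02 10:00:00'),  -- same timestamp as review 2
        (4, 2, '2017-01-01 09:00:00');
""")

# MAX(Timestamp) join: organization 1 comes back twice because of the tie.
join_ids = [row[0] for row in conn.execute("""
    SELECT r.Id
      FROM review r
      JOIN (
             SELECT OrganizationId, MAX(Timestamp) AS MaxTs
               FROM review
              GROUP BY OrganizationId
           ) maxts ON r.OrganizationId = maxts.OrganizationId
                  AND r.Timestamp = maxts.MaxTs
     ORDER BY r.Id
""")]

# Assumed tie-break: among equal timestamps, keep the review with the highest Id.
tie_broken_ids = [row[0] for row in conn.execute("""
    SELECT r.Id
      FROM review r
     WHERE r.Id = (
             SELECT r2.Id
               FROM review r2
              WHERE r2.OrganizationId = r.OrganizationId
              ORDER BY r2.Timestamp DESC, r2.Id DESC
              LIMIT 1
           )
     ORDER BY r.Id
""")]

print(join_ids)        # [2, 3, 4]
print(tie_broken_ids)  # [3, 4]
```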

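Then join these virtual tables to your other tables as if they were physical tables. That avoids the combinatorial-explosion distortion of your aggregates.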

   SELECT whatever, aggr.*, latest.*
     FROM organization as o
     JOIN (
              SELECT OrganizationID,
                     AVG(General) as average, 
                     COUNT(DISTINCT BuyerId) as countBuyer, 
                     COUNT(DISTINCT SupplierId) as countSupplier, 
                     COUNT(DISTINCT EmployeeId) as countEmployee, 
                     COUNT(DISTINCT OtherId) as countOther, 
                     COUNT(*) as countReview
                FROM review
            GROUP BY OrganizationID   
            ORDER BY AVG(General) DESC
            LIMIT 3 
          ) aggr ON o.Id = aggr.OrganizationId
     JOIN (
             SELECT r.*
               FROM review r
               JOIN (
                       SELECT OrganizationId,
                              MAX(Timestamp) Timestamp
                         FROM review
                        GROUP BY OrganizationId
                    ) maxts   ON r.OrganizationId = maxts.OrganizationId
                             AND r.Timestamp = maxts.Timestamp
          ) latest ON aggr.OrganizationId = latest.OrganizationId
     JOIN user as u ON u.Id = latest.UserId
     JOIN etc.

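Because you are joining activity, user, and organization, you may still get many rows in this result set. But at least your review aggregates will be correct.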
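The overall pattern (aggregate in one subquery, pick the latest review in another, then join both to organization) can be tried end to end in a reduced sketch. It runs on Python's built-in sqlite3 module rather than MySQL, and the `Name` column and sample rows are assumptions made for illustration; the user, address, and activity joins are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organization (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE review (Id INTEGER PRIMARY KEY, OrganizationId INTEGER,
                         General REAL, Timestamp TEXT);
    INSERT INTO organization VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO review VALUES
        (1, 1, 5, '2017-01-01'), (2, 1, 4, '2017-01-02'),
        (3, 2, 3, '2017-01-03'),
        (4, 3, 1, '2017-01-01'), (5, 3, 2, '2017-01-04'),
        (6, 4, 5, '2017-01-05');
""")

rows = conn.execute("""
    SELECT o.Name, aggr.average, aggr.countReview, latest.Id
      FROM organization o
      JOIN (
             -- one row per organization, top three by average score
             SELECT OrganizationId,
                    AVG(General) AS average,
                    COUNT(*) AS countReview
               FROM review
              GROUP BY OrganizationId
              ORDER BY AVG(General) DESC
              LIMIT 3
           ) aggr ON o.Id = aggr.OrganizationId
      JOIN (
             -- the review carrying each organization's maximum timestamp
             SELECT r.*
               FROM review r
               JOIN (
                      SELECT OrganizationId, MAX(Timestamp) AS MaxTs
                        FROM review
                       GROUP BY OrganizationId
                    ) maxts ON r.OrganizationId = maxts.OrganizationId
                           AND r.Timestamp = maxts.MaxTs
           ) latest ON aggr.OrganizationId = latest.OrganizationId
     ORDER BY aggr.average DESC
""").fetchall()

print(rows)  # [('D', 5.0, 1, 6), ('A', 4.5, 2, 2), ('B', 3.0, 1, 3)]
```

The outer ORDER BY is there because the ordering inside the aggregate subquery only decides which three organizations survive the LIMIT; it does not guarantee the order of the final result.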

Note that you are being bitten by MySQL's notorious nonstandard extension to GROUP BY. Read this: https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html

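Pro tip: don't use * in a SELECT with JOIN operations. No good can come of it. * returns duplicate Id columns, and it lulls you into thinking you don't need to understand what you are joining. In this case, because you are misusing the GROUP BY extension, you definitely don't understand it. As a professional programmer, you owe it to the users of these reviews to make it your business to understand.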