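jOOQ / SQL: find the average of a column over unique values in another column

Asked: 2019-03-14 15:58:30

Tags: java mysql sql jooq

I have a query that returns various figures from a combination of tables. I'm using jOOQ to run this query.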

final SiteSalesFigures siteSalesFigures =
dsl.select(
      countDistinct(LINE.TRANSACTION_ID).as("transactionCount"),
      sum(LINE.PROFIT).as("totalProfit"),
      sum(LINE.TOTAL).as("totalSalesAmount"),
      sum(LINE.QUANTITY).as("totalItemsSold"),
      sum(LINE.PROFIT).divide(sum(LINE.TOTAL)).multiply(100).as("profitMarginPercentage"),
      avg(TRANSACTIONS.NO_OF_ITEMS).as("averageItemsPerTransaction"),
      sum(LINE.TOTAL).divide(countDistinct(LINE.TRANSACTION_ID)).as("averageSalesTotalPerTransaction"),
      sum(LINE.PROFIT).divide(countDistinct(LINE.TRANSACTION_ID)).as("averageProfitTotalPerTransaction"))
    .from(TRANSACTIONS)
    .join(LINE).on(TRANSACTIONS.TRANSACTION_ID.equal(LINE.TRANSACTION_ID))
    .leftJoin(ITEM).on(LINE.ITEM_ID.equal(ITEM.ITEM_CODE))
    .where(TRANSACTIONS.SITE_ID.equal(siteId))
    .and(TRANSACTIONS.NO_OF_LINES.greaterThan(0))
    .and(TRANSACTIONS.START_TIME
      .between(new Timestamp(reportStartDate.toInstant().toEpochMilli()))
      .and(new Timestamp(reportEndDate.toInstant().toEpochMilli())))
    .and(TRANSACTIONS.TRANSACTION_TYPE_ID.notEqual(cancelledSaleID))
    .fetchOneInto(SiteSalesFigures.class);

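averageItemsPerTransaction turns out to be the problem. I fully understand why it isn't working, but I'm not sure how to make it work. Unfortunately, the join is required because of exclusions that use the LINE table.

If a transaction has 3 lines, then that transaction's details (including no_of_items) are duplicated 3 times, which leads to the wrong value.

I ran the average query against the transactions table alone, so I know the correct value.

This is how the joined table looks with just two transactions (the hidden columns aren't needed for this example):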

transaction_id                        no_of_lines  no_of_items
8abf1720-51f6-a1bf-4714-004b644cb99f  2            2
8abf1720-51f6-a1bf-4714-004b644cb99f  2            2
d239feab-38ea-7c8a-4814-7d5a38f74949  3            4
d239feab-38ea-7c8a-4814-7d5a38f74949  3            4
d239feab-38ea-7c8a-4814-7d5a38f74949  3            4

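You'll notice that the number of lines doesn't always equal the number of items (for example, a single line can hold one item scanned twice).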
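To make the skew concrete, here is a minimal plain-Java sketch (no jOOQ and no database involved) that reproduces the arithmetic on the five sample rows. The class and method names are made up for illustration; only the row values come from the table above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Illustration only: shows how join duplication skews AVG(no_of_items). */
final class JoinAverageDemo {

    /** One row of the joined result: transaction columns repeated once per LINE row. */
    static final class JoinedRow {
        final String transactionId;
        final int noOfLines;
        final int noOfItems;

        JoinedRow(String transactionId, int noOfLines, int noOfItems) {
            this.transactionId = transactionId;
            this.noOfLines = noOfLines;
            this.noOfItems = noOfItems;
        }
    }

    /** The five joined rows from the sample table. */
    static final List<JoinedRow> SAMPLE_ROWS = Arrays.asList(
        new JoinedRow("8abf1720-51f6-a1bf-4714-004b644cb99f", 2, 2),
        new JoinedRow("8abf1720-51f6-a1bf-4714-004b644cb99f", 2, 2),
        new JoinedRow("d239feab-38ea-7c8a-4814-7d5a38f74949", 3, 4),
        new JoinedRow("d239feab-38ea-7c8a-4814-7d5a38f74949", 3, 4),
        new JoinedRow("d239feab-38ea-7c8a-4814-7d5a38f74949", 3, 4));

    /** What AVG(no_of_items) sees after the join: every duplicate row counts. */
    static double naiveAverage(List<JoinedRow> rows) {
        return rows.stream().mapToInt(r -> r.noOfItems).average().orElse(0.0);
    }

    /** The intended figure: each transaction contributes its no_of_items once. */
    static double perTransactionAverage(List<JoinedRow> rows) {
        Map<String, Integer> itemsByTransaction = rows.stream()
            .collect(Collectors.toMap(r -> r.transactionId, r -> r.noOfItems, (a, b) -> a));
        return itemsByTransaction.values().stream()
            .mapToInt(Integer::intValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        System.out.println("naive: " + naiveAverage(SAMPLE_ROWS));                    // naive: 3.2
        System.out.println("per transaction: " + perTransactionAverage(SAMPLE_ROWS)); // per transaction: 3.0
    }
}
```

The joined rows give (2 + 2 + 4 + 4 + 4) / 5 = 3.2, while the transactions on their own give (2 + 4) / 2 = 3.0, because the 3-line transaction is counted three times.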

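Does anyone have a solution?

2 Answers:

Answer 0: (score: 1)

A solution using 2 queries

An obvious solution is to run two queries to get these results. The first query would be the one you already have (but without the averages), and the second would calculate only the averages: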

final SiteSalesFigures siteSalesFigures =
dsl.select(
      avg(TRANSACTIONS.NO_OF_ITEMS).as("averageItemsPerTransaction"),
      avg(TRANSACTIONS.PRICE).as("averageSalesTotalPerTransaction"),
      avg(TRANSACTIONS.PROFIT).as("averageProfitTotalPerTransaction"))
    .from(TRANSACTIONS)
    .where(TRANSACTIONS.SITE_ID.equal(siteId))
    .and(TRANSACTIONS.NO_OF_LINES.greaterThan(0))
    .and(TRANSACTIONS.START_TIME
      .between(new Timestamp(reportStartDate.toInstant().toEpochMilli()))
      .and(new Timestamp(reportEndDate.toInstant().toEpochMilli())))
    .and(TRANSACTIONS.TRANSACTION_TYPE_ID.notEqual(cancelledSaleID))
    .fetchOneInto(SiteSalesFigures.class);

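This may be considerably slower than doing everything in one go, depending on the size of the TRANSACTIONS table.

A solution using a weighted average

Because the join produces duplicate TRANSACTIONS rows, you have to calculate a weighted average rather than an ordinary average. Taking your example, if a TRANSACTIONS row is repeated 3 times, you have to divide that transaction's contribution by 3, so that the three copies add up to one. This would usually be quite complex, but given that your schema already stores pre-counted values per transaction (NO_OF_LINES, NO_OF_ITEMS), you're in luck. Without such a column, you'd have to pre-calculate it in a derived table.

In SQL / jOOQ: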

final SiteSalesFigures siteSalesFigures =
dsl.select(
      ...
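      // Each TRANSACTIONS row appears once per joined LINE row, so weight it by 1 / NO_OF_LINES
      // (assumes NO_OF_LINES matches the number of LINE rows the join actually returns)
      sum(TRANSACTIONS.NO_OF_ITEMS.divide(TRANSACTIONS.NO_OF_LINES))
        .divide(countDistinct(TRANSACTIONS.TRANSACTION_ID)).as("averageItemsPerTransaction"),
      sum(TRANSACTIONS.PRICE.divide(TRANSACTIONS.NO_OF_LINES))
        .divide(countDistinct(TRANSACTIONS.TRANSACTION_ID)).as("averageSalesTotalPerTransaction"),
      sum(TRANSACTIONS.PROFIT.divide(TRANSACTIONS.NO_OF_LINES))
        .divide(countDistinct(TRANSACTIONS.TRANSACTION_ID)).as("averageProfitTotalPerTransaction"))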
    .from(TRANSACTIONS)
    .join(...)
    ...
    .fetchOneInto(SiteSalesFigures.class);

Depending on your data types, you may need to cast to DOUBLE or NUMBER.

I've blogged about calculating weighted averages in SQL in more detail.
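As a rough check of the weighted-average idea, the following plain-Java sketch (again without jOOQ) applies it to the sample rows from the question. It assumes each transaction row is repeated exactly no_of_lines times in the joined result; the class name, the method name, and the shortened transaction IDs are illustrative only.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustration only: weighted average over joined rows with duplicated transaction columns. */
final class WeightedAverageDemo {

    // Sample joined rows (IDs abbreviated): transaction columns repeated once per LINE row.
    static final String[] TRANSACTION_ID = {"8abf1720", "8abf1720", "d239feab", "d239feab", "d239feab"};
    static final int[] NO_OF_LINES = {2, 2, 3, 3, 3};
    static final int[] NO_OF_ITEMS = {2, 2, 4, 4, 4};

    /** Mirrors SUM(no_of_items / no_of_lines) / COUNT(DISTINCT transaction_id). */
    static double weightedAverageItems(String[] ids, int[] lines, int[] items) {
        double weightedSum = 0.0;
        Set<String> distinctIds = new HashSet<>();
        for (int i = 0; i < ids.length; i++) {
            // Cast before dividing so that 4 / 3 is not truncated to 1.
            weightedSum += (double) items[i] / lines[i];
            distinctIds.add(ids[i]);
        }
        return weightedSum / distinctIds.size();
    }

    public static void main(String[] args) {
        // Expected to be approximately 3.0, up to floating-point rounding.
        System.out.println(weightedAverageItems(TRANSACTION_ID, NO_OF_LINES, NO_OF_ITEMS));
    }
}
```

Each copy of the 2-line transaction contributes 2 / 2 and each copy of the 3-line transaction contributes 4 / 3, so the copies of one transaction sum back to its real no_of_items, and (2 + 4) / 2 gives 3. The explicit cast to double is the same concern as the data-type note above.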

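Answer 1: (score: 0)

The solution was staring me in the face the whole time: I can just use the values I already have to get the one I need, dividing the summed LINE.QUANTITY by the distinct transaction count: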

final SiteSalesFigures siteSalesFigures =
dsl.select(
      countDistinct(LINE.TRANSACTION_ID).as("transactionCount"),
      sum(LINE.PROFIT).as("totalProfit"),
      sum(LINE.TOTAL).as("totalSalesAmount"),
      sum(LINE.QUANTITY).as("totalItemsSold"),
      sum(LINE.PROFIT).divide(sum(LINE.TOTAL)).multiply(100).as("profitMarginPercentage"),
      sum(LINE.QUANTITY).divide(countDistinct(LINE.TRANSACTION_ID)).as("averageItemsPerTransaction"),
      sum(LINE.TOTAL).divide(countDistinct(LINE.TRANSACTION_ID)).as("averageSalesTotalPerTransaction"),
      sum(LINE.PROFIT).divide(countDistinct(LINE.TRANSACTION_ID)).as("averageProfitTotalPerTransaction"))
    .from(TRANSACTIONS)
    .join(LINE).on(TRANSACTIONS.TRANSACTION_ID.equal(LINE.TRANSACTION_ID))
    .leftJoin(ITEM).on(LINE.ITEM_ID.equal(ITEM.ITEM_CODE))
    .where(TRANSACTIONS.SITE_ID.equal(siteId))
    .and(TRANSACTIONS.NO_OF_LINES.greaterThan(0))
    .and(TRANSACTIONS.START_TIME
      .between(new Timestamp(reportStartDate.toInstant().toEpochMilli()))
      .and(new Timestamp(reportEndDate.toInstant().toEpochMilli())))
    .and(TRANSACTIONS.TRANSACTION_TYPE_ID.notEqual(cancelledSaleID))
    .fetchOneInto(SiteSalesFigures.class);
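For comparison, here is a small plain-Java sketch of that calculation. The per-line quantities are hypothetical (the question does not show LINE.QUANTITY); they are chosen so that each transaction's quantities add up to its no_of_items (2 and 4), which is the assumption this approach relies on.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustration only: SUM(quantity) / COUNT(DISTINCT transaction_id) over LINE rows. */
final class LineQuantityAverageDemo {

    // Hypothetical LINE rows (IDs abbreviated): 2 lines for the first transaction, 3 for the second.
    static final String[] LINE_TRANSACTION_ID = {"8abf1720", "8abf1720", "d239feab", "d239feab", "d239feab"};
    // One line of the second transaction holds an item scanned twice.
    static final int[] LINE_QUANTITY = {1, 1, 2, 1, 1};

    static double averageItemsPerTransaction(String[] transactionIds, int[] quantities) {
        long totalQuantity = 0;
        Set<String> distinctIds = new HashSet<>();
        for (int i = 0; i < transactionIds.length; i++) {
            totalQuantity += quantities[i];
            distinctIds.add(transactionIds[i]);
        }
        // Cast to double so the division is not an integer division.
        return (double) totalQuantity / distinctIds.size();
    }

    public static void main(String[] args) {
        System.out.println(averageItemsPerTransaction(LINE_TRANSACTION_ID, LINE_QUANTITY)); // 3.0
    }
}
```

Because the sum runs over LINE columns, which are not duplicated by the join, no weighting is needed: 6 items over 2 distinct transactions gives 3.0.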