Limiting data in a crosstab query on MySQL

Time: 2016-05-11 21:29:01

Tags: mysql pivot aggregate-functions crosstab

I recently got help with this question:

MySQL Crosstab / Pivot Aggregation. Removing counts based on column in other table

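There I was trying to filter crosstab counts based on whether a string exists in another table.

Now I'm trying to filter the results further, this time by date.

What I want is to count only the last car added each day for a given Make_Model, Color and Year.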

For example, if I add a 2015 Black Ford Fusion at 12:15 PM on May 11, 2016, and add another at 1:33 PM the same day, I only want the 1:33 PM one counted in the crosstab.

Data

I changed my data slightly to account for the new information:

CAR_INVENTORY TABLE
CAR_ID  MAKE_MODEL      COLOR   YEAR    DATE_ADDED
1       Ford Fusion     Black   2015    2016-05-11 11:25:00
2       Tesla Model S   White   2014    2016-05-11 11:25:00
3       Acura ILX       Blue    2013    2016-05-11 11:25:00
4       Ford Fusion     Black   2013    2016-05-11 11:25:00
5       Toyota Corolla  Blue    2014    2016-05-11 11:25:00
6       Ford Fusion     Blue    2013    2016-05-11 11:25:00
7       Toyota Corolla  Blue    2012    2016-05-11 11:25:00
8       Acura ILX       Black   2015    2016-05-11 11:25:00
9       Ford Focus      Blue    2012    2016-05-11 11:25:00
10      Ford Fusion     White   2013    2016-05-11 11:25:00
11      Acura ILX       Black   2012    2016-05-11 11:25:00
12      Toyota Corolla  Black   2015    2016-05-11 11:25:00
13      Toyota Corolla  Blue    2014    2016-05-11 11:37:00
14      Ford Focus      White   2015    2016-05-11 11:25:00
15      Tesla Model S   Red     2015    2016-05-11 11:25:00
16      Acura TLX       White   2014    2016-05-11 11:25:00
17      Toyota Corolla  Blue    2014    2016-04-11 12:43:33
18      Ford Focus      Black   2013    2016-05-11 11:25:00
19      Ford Focus      White   2015    2016-05-11 14:29:12


INVENTORY_LOG TABLE
LOG_ID  CAR_ID  NOTE
1       7       Issue with Fuel Guage
2       3       Sweet Ride
3       16      Zippy
4       14      Issue with transmission
5       3       Fun to Drive
6       2       *NULL*
7       8       *NULL*
8       10      Economic
9       15      WOW
10      9       Good Fuel Economy
11      16      Minor issue with Shifting
12      7       Issue with Airbag
13      17      Great Mileage
14      1       Nice Tech
15      13      *NULL*
16      11      Trunk is small
17      12      *NULL*
18      2       Very Speedy
19      7       Good Mileage
20      10      Roomy
21      4       *NULL*
22      6       Nice Tech Package
23      5       Good Economy
24      18      Cool
25      19      Nice ride, but bad fuel econ

Here is what I'm trying to get:

DESIRED RESULT
MAKE            Black   Blue    White   
Acura           1       1       0
Ford            3       1       2
Tesla           0       0       1
Toyota          1       2       0

The following four cars were removed:

DUE TO ISSUES:
car_id  car                         issues
7       2012 Blue Toyota Corolla    2
14      2015 White Ford Focus       1
16      2014 White Acura TLX        1

DUE TO TIMESTAMP:
car_id  car
5       2014 Blue Toyota Corolla (car_id 13 is later)
14      2015 White Ford Focus* (car_id 19 is later)
*also had issue

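Note that we're also not counting the red one, as it's not in the crosstab.

The car_inventory table has one row for each car in inventory. The inventory_log table contains at least one entry for every car listed in car_inventory, and a car may have many log entries. The note in an inventory_log entry can be NULL.

What I've done so far

Thanks to some help on the previous question, I came up with the following query, which generates the crosstab and removes the cars with "issues".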

SELECT
    CASE
        WHEN ci.make_model LIKE "Acura%" THEN "Acura"
        WHEN ci.make_model LIKE "Ford%" THEN "Ford"
        WHEN ci.make_model LIKE "Toyota%" THEN "Toyota"
        WHEN ci.make_model LIKE "Tesla%" THEN "Tesla"
    END AS Make,
    SUM(CASE WHEN ci.color = "Black" THEN 1 ELSE 0 END) as Black,
    SUM(CASE WHEN ci.color = "Blue" THEN 1 ELSE 0 END) as Blue,
    SUM(CASE WHEN ci.color = "White" THEN 1 ELSE 0 END) as White
FROM car_inventory ci
WHERE 
    (ci.year > 2012) AND
    (ci.car_id NOT IN (
        SELECT DISTINCT il.car_id 
        FROM inventory_log il 
        WHERE il.note LIKE '%issue%'
    ))
GROUP BY Make
ORDER BY Make;

I also worked this out:

SELECT
    ci.car_id
FROM
    car_inventory ci
GROUP BY
    ci.make_model,
    ci.color,
    ci.year,
    DATE(ci.date_added)
ORDER BY
    ci.car_id;

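This limits the result to one car_id per day for each make_model, color and year. However, it returns the earliest car_id rather than the latest. If it returned the latest, I could use it as another subquery.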
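For context on why the earliest car_id shows up: car_id is neither aggregated nor listed in the GROUP BY, so MySQL is free to return it from any row in each group, and with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5) the query is rejected outright. A minimal sketch of one common way to pin the choice to the latest row is to compute MAX(date_added) per group and join back to the table. The alias names `latest` and `latest_added` are illustrative and not from the original post.

```sql
-- Sketch: one car_id per (make_model, color, year, day), taking the latest timestamp.
-- Assumes the car_inventory table shown above; `latest` and `latest_added` are made-up names.
SELECT ci.car_id
FROM car_inventory ci
JOIN (
    -- Latest date_added within each make_model / color / year / calendar day
    SELECT make_model,
           color,
           year,
           MAX(date_added) AS latest_added
    FROM car_inventory
    GROUP BY make_model, color, year, DATE(date_added)
) latest
  ON  latest.make_model   = ci.make_model
  AND latest.color        = ci.color
  AND latest.year         = ci.year
  AND latest.latest_added = ci.date_added
ORDER BY ci.car_id;
```

Against the sample data this should return every car_id except 5 and 14, which are superseded by 13 and 19. If two otherwise identical cars share exactly the same date_added, both are returned, so a tie-breaker such as the higher car_id would be needed in that case.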

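Questions

  1. How do I limit the selection so that it returns only one car_id per day (for each make_model, year, color combination), and only the latest one of that day?

  2. How do I use that in the crosstab query? Can I add another subquery in the WHERE clause, something like AND (ci.car_id IN (SELECT ...))? Is having this many subqueries bad for performance?

  3. Is there a better way to do this with joins or some other construct? I have the same question about my original subquery.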
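As a rough illustration of how questions 2 and 3 could fit together, the sketch below joins car_inventory to a derived table holding the latest date_added per make_model, color, year and day, and swaps NOT IN for a correlated NOT EXISTS. It assumes the two tables shown above. The aliases `latest` and `latest_added` are hypothetical, and whether this performs better than stacked subqueries depends on the MySQL version and the available indexes, so it is worth checking with EXPLAIN.

```sql
-- Sketch: crosstab counting only the latest car per day and combination,
-- excluding cars whose log mentions an issue. Alias names are illustrative.
SELECT
    CASE
        WHEN ci.make_model LIKE 'Acura%'  THEN 'Acura'
        WHEN ci.make_model LIKE 'Ford%'   THEN 'Ford'
        WHEN ci.make_model LIKE 'Toyota%' THEN 'Toyota'
        WHEN ci.make_model LIKE 'Tesla%'  THEN 'Tesla'
    END AS Make,
    SUM(CASE WHEN ci.color = 'Black' THEN 1 ELSE 0 END) AS Black,
    SUM(CASE WHEN ci.color = 'Blue'  THEN 1 ELSE 0 END) AS Blue,
    SUM(CASE WHEN ci.color = 'White' THEN 1 ELSE 0 END) AS White
FROM car_inventory ci
JOIN (
    -- Keep only the latest timestamp per combination and calendar day
    SELECT make_model, color, year, MAX(date_added) AS latest_added
    FROM car_inventory
    GROUP BY make_model, color, year, DATE(date_added)
) latest
  ON  latest.make_model   = ci.make_model
  AND latest.color        = ci.color
  AND latest.year         = ci.year
  AND latest.latest_added = ci.date_added
WHERE ci.year > 2012
  AND NOT EXISTS (
        -- Drop cars with any log note mentioning an issue
        SELECT 1
        FROM inventory_log il
        WHERE il.car_id = ci.car_id
          AND il.note LIKE '%issue%'
      )
GROUP BY Make
ORDER BY Make;
```

With the sample data this should reproduce the desired result (Acura 1/1/0, Ford 3/1/2, Tesla 0/0/1, Toyota 1/2/0). Note that the "latest" car is chosen first and the issue filter is applied afterwards, so a day whose latest car has an issue contributes nothing, even if an earlier identical car had no issue. The match on '%issue%' also relies on a case-insensitive collation to catch "Issue with ...".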

1 Answer:

Answer 0 (score: 0)

I'm not getting the exact counts you expect, but this may help:

SELECT  CASE
            WHEN MAKE_MODEL LIKE 'Acura%' THEN 'Acura'
            WHEN MAKE_MODEL LIKE 'Ford%' THEN 'Ford'
            WHEN MAKE_MODEL LIKE 'Toyota%' THEN 'Toyota'
            WHEN MAKE_MODEL LIKE 'Tesla%' THEN 'Tesla'
        END AS Make,
        SUM(CASE WHEN ci.color = 'Black' THEN ci.COUNT ELSE 0 END) as Black,
        SUM(CASE WHEN ci.color = 'Blue' THEN ci.COUNT ELSE 0 END) as Blue,
        SUM(CASE WHEN ci.color = 'White' THEN ci.COUNT ELSE 0 END) as White
FROM    (-- GET DISTINCT COUNTS
         SELECT ci.MAKE_MODEL,
                ci.COLOR,
                ci.YEAR,
                DATE(ci.DATE_ADDED) AS DATE,
                COUNT(DISTINCT MAKE_MODEL) AS COUNT
         FROM   CAR_INVENTORY ci
         WHERE  YEAR > 2012
                AND CAR_ID NOT IN (
                    SELECT DISTINCT il.car_id 
                    FROM inventory_log il 
                    WHERE il.note LIKE '%issue%'
                )
         GROUP BY ci.MAKE_MODEL,
                ci.COLOR,
                ci.YEAR,
                DATE(DATE_ADDED)
        ) ci
GROUP BY CASE
            WHEN MAKE_MODEL LIKE 'Acura%' THEN 'Acura'
            WHEN MAKE_MODEL LIKE 'Ford%' THEN 'Ford'
            WHEN MAKE_MODEL LIKE 'Toyota%' THEN 'Toyota'
            WHEN MAKE_MODEL LIKE 'Tesla%' THEN 'Tesla'
        END

Result:
Make   Black       Blue        White
------ ----------- ----------- -----------
Acura  1           1           0
Ford   3           1           2
Tesla  0           0           1
Toyota 1           2           0

I only get 1 for Acura Black, because there are only 2 of them and one is a 2012.

SQL Fiddle Demo
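
A note on how the query in this answer arrives at its numbers: make_model is one of the grouping columns of the derived table, so COUNT(DISTINCT MAKE_MODEL) is always 1 within a group. The derived table therefore amounts to a list of distinct (make_model, color, year, day) combinations that have at least one issue-free car. The sketch below expresses the same idea with SELECT DISTINCT. The aliases `d` and `added_on` are illustrative, and NOT EXISTS is swapped in for NOT IN.

```sql
-- Sketch: equivalent formulation of the answer's query using SELECT DISTINCT.
-- Each distinct (make_model, color, year, day) combination counts once.
SELECT
    CASE
        WHEN d.make_model LIKE 'Acura%'  THEN 'Acura'
        WHEN d.make_model LIKE 'Ford%'   THEN 'Ford'
        WHEN d.make_model LIKE 'Toyota%' THEN 'Toyota'
        WHEN d.make_model LIKE 'Tesla%'  THEN 'Tesla'
    END AS Make,
    SUM(CASE WHEN d.color = 'Black' THEN 1 ELSE 0 END) AS Black,
    SUM(CASE WHEN d.color = 'Blue'  THEN 1 ELSE 0 END) AS Blue,
    SUM(CASE WHEN d.color = 'White' THEN 1 ELSE 0 END) AS White
FROM (
    SELECT DISTINCT
           ci.make_model,
           ci.color,
           ci.year,
           DATE(ci.date_added) AS added_on
    FROM car_inventory ci
    WHERE ci.year > 2012
      AND NOT EXISTS (
            SELECT 1
            FROM inventory_log il
            WHERE il.car_id = ci.car_id
              AND il.note LIKE '%issue%'
          )
) d
GROUP BY Make
ORDER BY Make;
```

Because cars with issues are filtered out before the grouping, this approach counts a combination whenever any car added that day is issue-free, not only when the latest one is. The sample data does not distinguish the two readings, since car_id 14 is both the earlier car and the one with an issue.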