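I have a table containing all bookings and user data for a travel agency, and I would like to find out the following:
A traveller can have multiple records in the table. Travellers are identified by the Dedup column.
For each month of each year, I want to know how many of the travellers who travelled in that month had also travelled in the previous year. (How can I find the number of travellers for a specific month and year who also travelled the year before?)
Is it possible to get a similar result table with one extra column containing the number of travellers who also travelled in the previous year?
I am new to SQL but eager to learn. I have been playing with it for two days and so far only have this query, which counts all unique travellers for each month and year: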
SELECT BookingYear, BookingMonth, COUNT( DISTINCT (Dedup) )
AS 'Count of travellers'
FROM DATA
GROUP BY BookingYear, BookingMonth
It produces this data set:
+-------------+--------------+---------------------+
| BookingYear | BookingMonth | Count of travellers |
+-------------+--------------+---------------------+
| 2009 | 11 | 384 |
| 2009 | 12 | 1084 |
| 2010 | 1 | 4641 |
| 2010 | 2 | 1922 |
| 2010 | 3 | 1453 |
| 2010 | 4 | 1032 |
| 2010 | 5 | 967 |
| 2010 | 6 | 1095 |
| 2010 | 7 | 2490 |
| 2010 | 8 | 2425 |
| 2010 | 9 | 920 |
| 2010 | 10 | 213 |
| 2010 | 11 | 1140 |
| 2010 | 12 | 1981 |
| 2011 | 1 | 3514 |
| 2011 | 2 | 1284 |
| 2011 | 3 | 1424 |
| 2011 | 4 | 867 |
| 2011 | 5 | 1395 |
| 2011 | 6 | 1318 |
| 2011 | 7 | 3182 |
| 2011 | 8 | 2491 |
| 2011 | 9 | 1119 |
| 2011 | 10 | 144 |
| 2011 | 11 | 1937 |
| 2011 | 12 | 3092 |
| 2012 | 1 | 4752 |
| 2012 | 2 | 1266 |
| 2012 | 3 | 949 |
| 2012 | 4 | 1107 |
| 2012 | 5 | 1352 |
| 2012 | 6 | 1454 |
| 2012 | 7 | 3365 |
| 2012 | 8 | 1590 |
| 2012 | 9 | 656 |
| 2012 | 10 | 209 |
| 2012 | 11 | 2445 |
| 2012 | 12 | 3769 |
| 2013 | 1 | 7570 |
| 2013 | 2 | 4646 |
| 2013 | 3 | 2329 |
| 2013 | 4 | 2666 |
| 2013 | 5 | 2506 |
| 2013 | 6 | 1973 |
| 2013 | 7 | 3336 |
| 2013 | 8 | 2229 |
| 2013 | 9 | 398 |
+-------------+--------------+---------------------+
This is the table structure:
+----------------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------------+---------------+------+-----+---------+-------+
| BookingCode | int(15) | YES | | NULL | |
| Dedup | varchar(50) | YES | | NULL | |
| BookToDep | int(4) | YES | | NULL | |
| BookingYear | int(4) | YES | | NULL | |
| BookingMonth | int(2) | YES | | NULL | |
| DepartureYear | int(4) | YES | | NULL | |
| DepartureMonth | int(2) | YES | | NULL | |
| GroupCount | int(3) | YES | | NULL | |
| Duration | int(3) | YES | | NULL | |
| Gender | varchar(10) | YES | | NULL | |
| Age | int(3) | YES | | NULL | |
| Birthdate | datetime | YES | | NULL | |
| Country | varchar(2) | YES | | NULL | |
| AccoCountry | varchar(50) | YES | | NULL | |
| AccoRegion | varchar(50) | YES | | NULL | |
| AccoDestination | varchar(50) | YES | | NULL | |
| RevenueEntireBooking | decimal(15,2) | YES | | NULL | |
+----------------------+---------------+------+-----+---------+-------+
This is an excerpt of the data:
+-------------+-------------------------+-----------+-------------+--------------+---------------+----------------+------------+----------+--------+------+---------------------+---------+-------------+-----------------+-----------------+----------------------+
| BookingCode | Dedup | BookToDep | BookingYear | BookingMonth | DepartureYear | DepartureMonth | GroupCount | Duration | Gender | Age | Birthdate | Country | AccoCountry | AccoRegion | AccoDestination | RevenueEntireBooking |
+-------------+-------------------------+-----------+-------------+--------------+---------------+----------------+------------+----------+--------+------+---------------------+---------+-------------+-----------------+-----------------+----------------------+
| 948757 | EMMENROCUS14390 | 188 | 2009 | 11 | 2010 | 5 | 7 | 8 | M | 73 | 1939-05-25 00:00:00 | NL | Turkije | Turkse Riviera | Lara | 4136.00 |
| 948757 | EMMENANTONETTA28626 | 188 | 2009 | 11 | 2010 | 5 | 7 | 8 | F | 34 | 1978-05-16 00:00:00 | NL | Turkije | Turkse Riviera | Lara | 4136.00 |
| 948757 | HEESTERSWESLEY34719 | 188 | 2009 | 11 | 2010 | 5 | 7 | 8 | M | 17 | 1995-01-20 00:00:00 | NL | Turkije | Turkse Riviera | Lara | 4136.00 |
| 948757 | EMMENHUBERDINA25710 | 188 | 2009 | 11 | 2010 | 5 | 7 | 8 | F | 42 | 1970-05-22 00:00:00 | NL | Turkije | Turkse Riviera | Lara | 4136.00 |
| 948757 | HEESTERSANTHONY25917 | 188 | 2009 | 11 | 2010 | 5 | 7 | 8 | M | 41 | 1970-12-15 00:00:00 | NL | Turkije | Turkse Riviera | Lara | 4136.00 |
| 948757 | VANDERHOEVENRONALD27069 | 188 | 2009 | 11 | 2010 | 5 | 7 | 8 | M | 38 | 1974-02-09 00:00:00 | NL | Turkije | Turkse Riviera | Lara | 4136.00 |
| 948757 | HEESTERSMIRTHE35781 | 188 | 2009 | 11 | 2010 | 5 | 7 | 8 | C | 14 | 1997-12-17 00:00:00 | NL | Turkije | Turkse Riviera | Lara | 4136.00 |
| 949055 | BOGERSPATRICK26350 | 184 | 2009 | 11 | 2010 | 5 | 4 | 11 | M | 40 | 1972-02-21 00:00:00 | NL | Turkije | Turkse Riviera | Belek | 1922.00 |
| 949055 | BOGERSJORDI37246 | 184 | 2009 | 11 | 2010 | 5 | 4 | 11 | C | 10 | 2001-12-21 00:00:00 | NL | Turkije | Turkse Riviera | Belek | 1922.00 |
| 949055 | DEBREEESTHER25664 | 184 | 2009 | 11 | 2010 | 5 | 4 | 11 | F | 42 | 1970-04-06 00:00:00 | NL | Turkije | Turkse Riviera | Belek | 1922.00 |
+-------------+-------------------------+-----------+-------------+--------------+---------------+----------------+------------+----------+--------+------+---------------------+---------+-------------+-----------------+-----------------+----------------------+
Answer 0 (score: 0)
If you are using SQL Server:
with cte as (
select
d.BookingYear, d.BookingMonth, count(distinct(d.Dedup)) as [Count of travellers]
from DATA as d
group by d.BookingYear, d.BookingMonth
)
select
c.BookingYear, c.BookingMonth,
c.[Count of travellers],
isnull(c2.[Count of travellers], 0) as [Count of travellers previous year]
from cte as c
left outer join cte as c2 on
c2.BookingMonth = c.BookingMonth and c2.BookingYear = c.BookingYear - 1
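The CTE-and-self-join idea can be tried out on a toy table before running it against the real data. The following is only a minimal sketch: it assumes Python's built-in sqlite3 module as a stand-in database, uses COALESCE in place of SQL Server's ISNULL, and the sample rows and the shortened aliases `travellers` and `travellers_previous_year` are made up for illustration.

```python
import sqlite3

# In-memory stand-in for the DATA table (only the columns the query needs).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA (Dedup TEXT, BookingYear INTEGER, BookingMonth INTEGER)"
)
conn.executemany(
    "INSERT INTO DATA VALUES (?, ?, ?)",
    [
        ("A", 2010, 5),
        ("B", 2010, 5),
        ("A", 2011, 5),
        ("C", 2011, 6),
    ],
)

# Aggregate once per year/month, then join each row to the same month
# of the previous year to pick up that month's count.
cte_query = """
WITH cte AS (
    SELECT BookingYear, BookingMonth, COUNT(DISTINCT Dedup) AS travellers
    FROM DATA
    GROUP BY BookingYear, BookingMonth
)
SELECT c.BookingYear, c.BookingMonth, c.travellers,
       COALESCE(c2.travellers, 0) AS travellers_previous_year
FROM cte AS c
LEFT JOIN cte AS c2
       ON c2.BookingMonth = c.BookingMonth
      AND c2.BookingYear = c.BookingYear - 1
ORDER BY c.BookingYear, c.BookingMonth
"""
cte_result = conn.execute(cte_query).fetchall()
for row in cte_result:
    print(row)
```

With these sample rows, May 2011 picks up the two travellers counted in May 2010, while months without a counterpart in the previous year get 0.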
For MySQL, you can use:
-- MySQL quotes identifiers with backticks, not square brackets
select
d.BookingYear, d.BookingMonth,
count(distinct d.Dedup) as `Count of travellers`,
count(distinct d2.Dedup) as `Count of travellers previous year`
from DATA as d
left outer join DATA as d2 on
d2.BookingMonth = d.BookingMonth and d2.BookingYear = d.BookingYear - 1
group by d.BookingYear, d.BookingMonth
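Both queries above report how many travellers booked in the same month of the previous year. If the goal is instead to count how many of a given month's travellers also appear anywhere in the previous year, one option is to match on Dedup in the join as well. The following is a minimal sketch under that assumption, again using Python's built-in sqlite3 module as a stand-in database; the sample rows and the aliases `travellers` and `travellers_also_previous_year` are hypothetical.

```python
import sqlite3

# In-memory stand-in for the DATA table (only the columns the query needs).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DATA (Dedup TEXT, BookingYear INTEGER, BookingMonth INTEGER)"
)
conn.executemany(
    "INSERT INTO DATA VALUES (?, ?, ?)",
    [
        ("A", 2010, 5),
        ("B", 2010, 5),
        ("A", 2011, 5),  # A comes back one year later
        ("A", 2011, 5),  # second record for the same traveller
        ("C", 2011, 5),  # C is new in 2011
        ("B", 2011, 7),  # B comes back in a different month
    ],
)

# For every booking row, look for the same traveller (Dedup) in the
# previous booking year; the DISTINCT subquery keeps the join small.
returning_query = """
SELECT d.BookingYear, d.BookingMonth,
       COUNT(DISTINCT d.Dedup) AS travellers,
       COUNT(DISTINCT p.Dedup) AS travellers_also_previous_year
FROM DATA AS d
LEFT JOIN (SELECT DISTINCT Dedup, BookingYear FROM DATA) AS p
       ON p.Dedup = d.Dedup
      AND p.BookingYear = d.BookingYear - 1
GROUP BY d.BookingYear, d.BookingMonth
ORDER BY d.BookingYear, d.BookingMonth
"""
returning_result = conn.execute(returning_query).fetchall()
for row in returning_result:
    print(row)
```

The SQL inside `returning_query` uses only standard constructs, so it should carry over to MySQL or SQL Server with little or no change. It follows the original query in using BookingYear and BookingMonth; if "travelled" should mean the departure date, DepartureYear and DepartureMonth would be the columns to use instead.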