I'm using MySQL and have three tables like this:
Create Table users (
  firstName VARCHAR,
  lastName VARCHAR,
  userName VARCHAR,
  email VARCHAR,
  created DATETIME,
  etc.
)
Create Table data_2013 (
  uid VARCHAR,
  d1 INT,
  d2 INT,
  d3 INT,
  etc.
)
Create Table data_2016 (
  uid VARCHAR,
  d1 INT,
  d2 INT,
  d3 INT,
  etc.
)
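The uid in both data tables matches the userName field in the users table.
Each user appears in the users table twice (or more), but always with matching firstName and lastName.
A subset of these users (about 100) appear in the data_xxxx tables.
For the 2013 data, userName is an 8-character string. For the 2016 data, userName is the user's current email address (not necessarily the same email address they used in 2013).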
I can get all the users who have 2016 data with a query like this:
SELECT firstName, lastName, userName
FROM users
WHERE created > '2016-01-01'
  AND userName IN (SELECT uid FROM data_2016);
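What I want now is a query that gives me the list of userName values for those users that have 2013 data. As I said, though, the userName (or uid) values won't match, but the firstName and lastName values should.
I need something like this, in pseudocode: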
SELECT userName
FROM users
WHERE created < '2014-01-01'
and firstName,lastName IN (
SELECT firstName,lastName
FROM users
WHERE created > '2016-01-01'
AND userName IN(SELECT uid FROM data_2016))
I'm sure a UNION or a JOIN is the answer, but I can't figure it out.
Any hints?
Thanks
EDIT
Here is some sample data from the users table:
+--------------------+-----------+----------+----------------------+---------------------+
| userName           | firstName | lastName | email                | created             |
+--------------------+-----------+----------+----------------------+---------------------+
| rwhite             | ROBERT    | WHITE    | xxxxxxxxxx@gmail.com | 2013-08-05 13:13:23 |
| rwhite@company.com | Robert    | White    | rwhite@company.com   | 2016-10-23 20:26:52 |
+--------------------+-----------+----------+----------------------+---------------------+
Sample 2013 data for the user above:
+--------+---------------------+----+----+----+----+----+
| uid    | created             | d1 | d2 | d3 | d4 | d5 |
+--------+---------------------+----+----+----+----+----+
| rwhite | 2013-08-05 13:24:24 | 38 | 31 |  7 | 22 | 46 |
+--------+---------------------+----+----+----+----+----+
Sample 2016 data for the user above:
+--------------------+---------------------+----+----+----+----+----+
| uid                | created             | d1 | d2 | d3 | d4 | d5 |
+--------------------+---------------------+----+----+----+----+----+
| rwhite@company.com | 2016-10-24 12:37:29 | 38 | 48 | 59 | 71 | 17 |
+--------------------+---------------------+----+----+----+----+----+
EDIT2
I forgot that I have a fourth table containing extra data for certain customers:
Create Table users_custA (
userName VARCHAR,
id_num VARCHAR,
etc.
)
An example of the same user in this table:
+--------------------+-----------+
| userName           | id_num    |
+--------------------+-----------+
| rwhite             | N00123450 |
| rwhite@company.com | N00123450 |
+--------------------+-----------+
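This id_num is guaranteed to be unique for a given person (i.e., R White is a single person with two entries in the users_custA table).
The question remains: how do I construct a query that produces the list of userNames that have data in both data_xxxx tables?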
Answer 0: (score: 0)
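In general, expecting names to be unique and consistent over time is a bit unreliable, but if you're sure that holds in your data, you can adapt your query like this (assuming you have a case-insensitive collation):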
SELECT u2013.userName
FROM users AS u2013
WHERE u2013.created >= '2013-01-01'
  AND u2013.created < '2014-01-01'
  AND EXISTS (
      SELECT 1
      FROM users AS u2016
      WHERE u2016.created >= '2016-01-01'
        AND u2016.created < '2017-01-01'
        AND u2016.firstName = u2013.firstName
        AND u2016.lastName = u2013.lastName
        AND EXISTS (SELECT 1 FROM data_2016 WHERE data_2016.uid = u2016.userName));
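I used WHERE EXISTS rather than WHERE ... IN because, as I understood it, MySQL only supports a single column on the left of IN. (MySQL does in fact accept row constructors, so WHERE (col1, col2) IN (SELECT ...) is also valid; EXISTS is just one way to write the same condition.)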
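For comparison, MySQL's row-constructor syntax lets the multi-column IN from the question's pseudocode be written directly. The following is only a sketch, assuming the column names from the question and a case-insensitive collation on firstName and lastName. The filter on data_2013 is an addition that is not in the query above; it keeps only 2013 accounts that actually have rows in data_2013.

```sql
-- Sketch only: 2013 userNames whose (firstName, lastName) pair also belongs
-- to a 2016 account that has rows in data_2016.
SELECT u2013.userName
FROM users AS u2013
WHERE u2013.created < '2014-01-01'
  -- Added assumption: the 2013 account must also have 2013 data.
  AND u2013.userName IN (SELECT uid FROM data_2013)
  -- Row constructor: both columns are compared as a pair.
  AND (u2013.firstName, u2013.lastName) IN (
        SELECT u2016.firstName, u2016.lastName
        FROM users AS u2016
        WHERE u2016.created >= '2016-01-01'
          AND u2016.userName IN (SELECT uid FROM data_2016));
```

With the sample rows from the question and a case-insensitive collation, this should return rwhite. As with the EXISTS form, two different people who share a first and last name would be treated as one person.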
EDIT
You can bring in the users_custA table this way to get a more reliable match:
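-- Return users_custA rows for every person (id_num) who has one account
-- created in 2013 and another created in 2016.
SELECT *
FROM users_custA
WHERE id_num IN (
    SELECT id_num
    FROM (
        -- One row per person with a 2013 account
        SELECT DISTINCT a.id_num
        FROM users AS u
        JOIN users_custA AS a ON u.userName = a.userName
        WHERE u.created >= '2013-01-01'
          AND u.created < '2014-01-01'
        UNION ALL
        -- One row per person with a 2016 account
        SELECT DISTINCT a.id_num
        FROM users AS u
        JOIN users_custA AS a ON u.userName = a.userName
        WHERE u.created >= '2016-01-01'
          AND u.created < '2017-01-01'
    ) AS union_subquery
    GROUP BY id_num
    -- COUNT(*) = 2 means the id_num appeared in both halves of the UNION ALL
    HAVING COUNT(*) = 2
);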
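The query above matches people by account creation year but does not look at the data tables themselves. One possible variant, sketched under the assumption that users_custA has the userName and id_num columns shown in the question and that both of a person's rows share the same id_num, pairs the two userNames and keeps only pairs with rows in both data_2013 and data_2016. The aliases a13, a16, d13 and d16 are made up for this illustration.

```sql
-- Sketch only: pair each person's 2013 userName with their 2016 userName
-- through the shared id_num, then require data in both yearly tables.
SELECT a13.id_num,
       a13.userName AS userName_2013,
       a16.userName AS userName_2016
FROM users_custA AS a13
JOIN users_custA AS a16
  ON a16.id_num = a13.id_num
WHERE EXISTS (SELECT 1 FROM data_2013 AS d13 WHERE d13.uid = a13.userName)
  AND EXISTS (SELECT 1 FROM data_2016 AS d16 WHERE d16.uid = a16.userName);
```

Because the match goes through id_num rather than names, it does not depend on collation or on names staying the same over time, but it only covers people who are present in users_custA.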