MySQL: compare the values of two fields in a subquery

Date: 2016-11-30 02:20:40

Tags: mysql select union where-in

I'm using MySQL and have three tables like this:

Create Table users (
  firstName VARCHAR,
  lastName VARCHAR,
  userName VARCHAR,
  email VARCHAR,
  created DATETIME,
  etc.
)

Create Table data_2013 (
  uid VARCHAR,
  d1 INT,
  d2 INT,
  d3 INT,
  etc.
)

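Create Table data_2016 (
  uid VARCHAR,
  d1 INT,
  d2 INT,
  d3 INT,
  etc.
)

  • The uid in both data tables matches the userName field in the users table.

  • Each user appears twice (or more) in the users table, but always with a matching firstName and lastName.

  • A subset of these users (about 100) has data in the "data_xxxx" tables.

  • For the 2013 data, userName is an 8-character string. For the 2016 data, userName is the user's current email address (not necessarily the same email address they used in 2013).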

I can get all users who have 2016 data with a query like this:

SELECT firstName,lastName,userName 
FROM users 
WHERE created > '2016-01-01' 
AND userName IN (SELECT uid FROM data_2016)

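What I want now is a query that gives me the list of userNames for those same users that have 2013 data. However, as I said, the userName (or uid) will not match between the two years, but the firstName and lastName values should.

I need something like this, in pseudocode: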

SELECT userName 
FROM users 
WHERE created < '2014-01-01' 
and firstName,lastName IN (
    SELECT firstName,lastName 
    FROM users 
    WHERE created > '2016-01-01' 
    AND userName IN(SELECT uid FROM data_2016))

I'm sure a UNION or a JOIN is the answer, but I can't figure it out.

Any tips?

Thanks

EDIT

Here is some sample data from the users table:


    +------------------------+-----------+----------+------------------------+---------------------+
    | userName               | firstName | lastName | email                  | created             |
    +------------------------+-----------+----------+------------------------+---------------------+
    | rwhite                 | ROBERT    | WHITE    | xxxxxxxxxx@gmail.com   | 2013-08-05 13:13:23 | 
    | rwhite@company.com     | Robert    | White    | rwhite@company.com     | 2016-10-23 20:26:52 | 
    +------------------------+-----------+----------+------------------------+---------------------+

Sample 2013 data for the user above:


    +--------+---------------------+----+----+----+----+----+
    | uid    | created             | d1 | d2 | d3 | d4 | d5 |
    +--------+---------------------+----+----+----+----+----+
    | rwhite | 2013-08-05 13:24:24 | 38 | 31 |  7 | 22 | 46 |
    +--------+---------------------+----+----+----+----+----+

Sample 2016 data for the user above:


    +--------------------+---------------------+----+----+----+----+----+
    | uid                | created             | d1 | d2 | d3 | d4 | d5 |
    +--------------------+---------------------+----+----+----+----+----+
    | rwhite@company.com | 2016-10-24 12:37:29 | 38 | 48 | 59 | 71 | 17 |
    +--------------------+---------------------+----+----+----+----+----+

EDIT2

I forgot that I have a fourth table, which holds additional data for certain customers:

Create Table users_custA (
  userName VARCHAR,
  id_num VARCHAR,
  etc.
)

Example rows for the same user in this table:

+--------------------+-----------+
| userName           | id_num    |
+--------------------+-----------+
| rwhite             | N00123450 | 
| rwhite@company.com | N00123450 | 
+--------------------+-----------+

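id_num is guaranteed to be unique for a given person (i.e., R White is a single person with two entries in the users_custA table).

The question remains: how do I build a query that produces the list of userNames that have data in both data_xxxx tables?

1 Answer:

Answer 0 (score: 0)

In general, expecting names to be unique and consistent over time is somewhat unreliable, but if you are sure that holds for your data, you can adjust your query like this (assuming you have a case-insensitive collation):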

SELECT userName 
FROM users As u2013
WHERE created >= '2013-01-01' 
AND created < '2014-01-01'
AND EXISTS (
    SELECT 1 
    FROM users As u2016
    WHERE created >= '2016-01-01' 
    AND created < '2017-01-01'
    AND u2016.FirstName = u2013.FirstName
    AND u2016.LastName = u2013.LastName
    AND EXISTS (SELECT 1 FROM data_2016 WHERE data_2016.uid = u2016.userName));

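WHERE EXISTS is used here rather than WHERE ... IN. My understanding was that WHERE (col1, col2) IN ... is not supported and that IN takes only a single column; that is true of some databases, but MySQL does accept a row constructor on the left of IN, so either form works there.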
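
MySQL documents row subqueries, so the two-column comparison from the question's pseudocode can also be written with a row constructor on the left of IN. A minimal sketch, assuming the table and column names from the question and a case-insensitive collation on firstName and lastName; it has not been run against the asker's data:

```sql
-- Sketch only: row-constructor form of the same name-based filter.
-- Assumes the users and data_2016 tables described in the question.
SELECT u2013.userName
FROM users AS u2013
WHERE u2013.created >= '2013-01-01'
  AND u2013.created <  '2014-01-01'
  -- Compare both name columns at once against the 2016 users with data.
  AND (u2013.firstName, u2013.lastName) IN (
        SELECT u2016.firstName, u2016.lastName
        FROM users AS u2016
        WHERE u2016.created >= '2016-01-01'
          AND u2016.created <  '2017-01-01'
          AND u2016.userName IN (SELECT uid FROM data_2016)
      );
```

With the sample rows from the question, the 2016 subquery yields (Robert, White), which matches the 2013 row (ROBERT, WHITE) only if the comparison is case-insensitive.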

EDIT

You can bring in the users_custA table this way to get a more deterministic match:

Select *
  From users_custA 
  Where id_num In (
    SELECT id_num
      FROM (
        SELECT DISTINCT id_num 
          FROM users As u
          JOIN users_custA As a On u.userName = a.userName
          WHERE created >= '2013-01-01' 
          AND created < '2014-01-01'
        UNION ALL
        SELECT DISTINCT id_num
          FROM users As u
          JOIN users_custA As a On u.userName = a.userName
          WHERE created >= '2016-01-01'
          AND created < '2017-01-01') As union_subquery
      GROUP BY id_num
      HAVING COUNT(*) = 2);
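
The query above pairs accounts by creation year but does not itself look at the data tables. If the goal is specifically people with rows in both data_2013 and data_2016, one possible variant is to self-join users_custA on id_num and test each data table with EXISTS. This is a sketch under the assumption that every relevant user has rows in users_custA; the aliases a13, a16 and d are illustrative names, not part of the original schema:

```sql
-- Sketch only: pair each person's 2013 and 2016 userNames via id_num,
-- keeping only people who have rows in both data tables.
SELECT a13.id_num,
       a13.userName AS userName_2013,
       a16.userName AS userName_2016
FROM users_custA AS a13
JOIN users_custA AS a16
  ON a16.id_num = a13.id_num
WHERE EXISTS (SELECT 1 FROM data_2013 AS d WHERE d.uid = a13.userName)
  AND EXISTS (SELECT 1 FROM data_2016 AS d WHERE d.uid = a16.userName);
```

For the sample rows in the question, only the pairing of rwhite (2013) with rwhite@company.com (2016) satisfies both EXISTS conditions. Users missing from users_custA would not appear, so the name-based query remains the fallback for them.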