JOIN error in a BigQuery SELECT statement: using the OR operator with LEFT JOIN

Time: 2018-10-09 15:46:46

Tags: sql google-cloud-platform google-bigquery

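I have the following SELECT statement. I am using a LEFT JOIN to bring two tables together. The LEFT JOIN should match under either of two conditions:

Condition 1: ATTOM_ID. The ATTOM ID is a unique identifier.
Condition 2: ZIP, last name, and full address. These are string fields, and all three must match for the rows to join.

Any other case should result in NULLs, hence the LEFT JOIN. If either condition passes, the join should happen, which is why I want to use an OR here.

For some reason, Google BigQuery does not like this query because it contains an OR. The error I get is: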

LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join.

Here is the SQL statement. Every other part of this statement works. Is there a BigQuery limitation on using LEFT JOINs with the OR operator? Thanks.

  SELECT 
  Source, 
  FirstName, 
  LastName,
  MiddleName, 
  Gender, 
  Age, 
  DOB, 
  Address, 
  Address2,
  City, 
  State, 
  Zip, 
  Zip4, 
  TimeZone, 
  Income, 
  HomeValue, 
  Networth, 
  MaritalStatus, 
  IsRenter, 
  HasChildren, 
  CreditRating, 
  Investor, 
  LinesOfCredit, 
  InvestorRealEstate, 
  Traveler, 
  Pets, 
  MailResponder, 
  Charitable, 
  PolicalDonations, 
  PoliticalParty, 
  coalesce(P.ATTOM_ID, T.ATTOM_ID) as ATTOM_ID,
  coalesce(P.GEOID, T.GEOID) as GEOID,
  Score,
  Score1,
  Score2,
  Score3,
  Score4,
  Score5,
  PropertyLatitude AS Latitude,
  PropertyLongitude AS Longitude
  FROM `db.ds.table1` P
 LEFT JOIN `db.ds.table2` T
ON 1 = 
CASE 
    WHEN (P.ATTOM_ID = T.ATTOM_ID) 
        THEN 1
    WHEN P.Zip = T.PropertyAddressZIP
            AND ( 
                    LOWER(P.LastName) = LOWER(T.DeedOwner1NameLast)
                    OR LOWER(P.LastName) = LOWER(T.PartyOwner1NameLast)
                )
            AND ( 
                    STRPOS(LOWER(P.Address), LOWER(T.PropertyAddressFull) ) > 0
                    OR STRPOS(LOWER(T.PropertyAddressFull), LOWER(P.Address) ) > 0 
                )
            AND IFNULL(T.PropertyAddressFull,'') != ''
        THEN 1
    ELSE 0 END

4 Answers:

Answer 0 (score: 1)

Maybe a different approach? Could you split the JOIN at the OR condition into two INNER JOIN queries and then UNION them together?

SELECT 
    Source, 
    FirstName, 
    LastName,
    MiddleName, 
    Gender, 
    Age, 
    DOB, 
    Address, 
    Address2,
    City, 
    State, 
    Zip, 
    Zip4, 
    TimeZone, 
    Income, 
    HomeValue, 
    Networth, 
    MaritalStatus, 
    IsRenter, 
    HasChildren, 
    CreditRating, 
    Investor, 
    LinesOfCredit, 
    InvestorRealEstate, 
    Traveler, 
    Pets, 
    MailResponder, 
    Charitable, 
    PolicalDonations, 
    PoliticalParty, 
    coalesce(P.ATTOM_ID, T.ATTOM_ID) as ATTOM_ID,
    coalesce(P.GEOID, T.GEOID) as GEOID,
    Score,
    Score1,
    Score2,
    Score3,
    Score4,
    Score5,
    PropertyLatitude AS Latitude,
    PropertyLongitude AS Longitude
FROM `db.ds.Table1` P
INNER JOIN `db.ds.Table2` T ON (P.ATTOM_ID = T.ATTOM_ID) 

-- BigQuery standard SQL requires an explicit ALL or DISTINCT after UNION.
UNION DISTINCT

SELECT 
    Source, 
    FirstName, 
    LastName,
    MiddleName, 
    Gender, 
    Age, 
    DOB, 
    Address, 
    Address2,
    City, 
    State, 
    Zip, 
    Zip4, 
    TimeZone, 
    Income, 
    HomeValue, 
    Networth, 
    MaritalStatus, 
    IsRenter, 
    HasChildren, 
    CreditRating, 
    Investor, 
    LinesOfCredit, 
    InvestorRealEstate, 
    Traveler, 
    Pets, 
    MailResponder, 
    Charitable, 
    PolicalDonations, 
    PoliticalParty, 
    coalesce(P.ATTOM_ID, T.ATTOM_ID) as ATTOM_ID,
    coalesce(P.GEOID, T.GEOID) as GEOID,
    Score,
    Score1,
    Score2,
    Score3,
    Score4,
    Score5,
    PropertyLatitude AS Latitude,
    PropertyLongitude AS Longitude
FROM `db.ds.Table1` P
INNER JOIN `db.ds.Table2` T ON P.Zip = T.PropertyAddressZIP
            AND ( 
                    LOWER(P.LastName) = LOWER(T.DeedOwner1NameLast)
                    OR LOWER(P.LastName) = LOWER(T.PartyOwner1NameLast)
                )
            AND ( 
                    STRPOS(LOWER(P.Address), LOWER(T.PropertyAddressFull) ) > 0
                    OR STRPOS(LOWER(T.PropertyAddressFull), LOWER(P.Address) ) > 0 
                )
            AND IFNULL(T.PropertyAddressFull,'') != ''
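
Note that two INNER JOINs return only matched rows, so Table1 rows with no match under either condition drop out, unlike the LEFT JOIN in the question. Below is a minimal sketch of one way to keep them, assuming a fallback is acceptable: try the ID match first, and try the second condition only where the ID match finds nothing. It chains two LEFT JOINs, each with its own equality condition, and combines them with COALESCE. The table names, columns, and sample rows are hypothetical stand-ins, not the asker's schema. It is also not strictly equivalent to OR, because a Table1 row that matches by ID never picks up additional rows that match only by ZIP and name.

```sql
#standardSQL
-- Hypothetical stand-in data; replace with the real tables and columns.
WITH table1 AS (
  SELECT CAST(NULL AS INT64) AS id, '12345' AS zip, 'abc' AS name UNION ALL
  SELECT 2, '23456', 'vwu' UNION ALL
  SELECT 4, '12347', 'abd'
), table2 AS (
  SELECT 2 AS id, '12346' AS zip, 'xyz' AS name UNION ALL
  SELECT 3, '12345', 'abc'
)
SELECT
  p.id   AS p_id,
  p.zip  AS p_zip,
  p.name AS p_name,
  -- Prefer the ID match; fall back to the ZIP + name match.
  COALESCE(by_id.id, by_zip.id) AS matched_id
FROM table1 AS p
-- First attempt: plain equality on the unique identifier.
LEFT JOIN table2 AS by_id
  ON p.id = by_id.id
-- Second attempt: only for rows the first join left unmatched.
LEFT JOIN table2 AS by_zip
  ON by_id.id IS NULL
 AND p.zip = by_zip.zip
 AND p.name = by_zip.name
```

With this sample data, the row with id 4 matches nothing under either condition, so it should come back with a NULL matched_id instead of disappearing.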

Answer 1 (score: 1)

Below is a simplified example of the problem:

#standardSQL
WITH `db.ds.Table1` AS (
  SELECT NULL id, '12345' zip, 'abc' name UNION ALL
  SELECT 2, '23456', 'vwu' UNION ALL
  SELECT 4 id, '12347' zip, 'abd' name 
), `db.ds.Table2` AS (
  SELECT 2 id, '12346' zip, 'xyz' name UNION ALL
  SELECT 3, '12345' zip, 'abc' name 
)
SELECT p, t FROM `db.ds.Table1` p
LEFT JOIN `db.ds.Table2` t
ON p.id = t.id OR p.zip = t.zip   

It produces Error: LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join.

You can rewrite it as:

#standardSQL
WITH `db.ds.Table1` AS (
  SELECT NULL id, '12345' zip, 'abc' name UNION ALL
  SELECT 2, '23456', 'vwu' UNION ALL
  SELECT 4 id, '12347' zip, 'abd' name 
), `db.ds.Table2` AS (
  SELECT 2 id, '12346' zip, 'xyz' name UNION ALL
  SELECT 3, '12345' zip, 'abc' name 
)
SELECT 
  COALESCE(p.id, t.id) AS id,
  p.zip,
  p.name
FROM (
  SELECT ANY_VALUE(p) p , ANY_VALUE(IF(p.id = t.id OR p.zip = t.zip, t, NULL)) t
  FROM `db.ds.Table1` p
  CROSS JOIN `db.ds.Table2` t
  GROUP BY TO_JSON_STRING(p)
)   

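Here you simply move all the conditions into the IF() function and replace the LEFT JOIN with a CROSS JOIN.

The result is:

Row  id  zip    name
1    3   12345  abc
2    2   23456  vwu
3    4   12347  abd

As you can see, the row with id = 4 from Table1 is not lost.

Hopefully you can apply this to your specific query (it should be a straightforward copy-paste).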

Answer 2 (score: 0)

Try moving the last part of the join condition to the WHERE clause:

SELECT 
  Source, 
  <lots_of_columns>
  FROM `db.ds.Table1` P
 LEFT JOIN `db.ds.Table2` T
 ON (P.ATTOM_ID = T.ATTOM_ID)
 OR (
   P.Zip = T.PropertyAddressZIP
  AND ( LOWER(P.LastName) = LOWER(T.DeedOwner1NameLast)
    OR LOWER(P.LastName) = LOWER(T.PartyOwner1NameLast))
  AND ( STRPOS(LOWER(P.Address), LOWER(T.PropertyAddressFull) ) > 0
    OR STRPOS(LOWER(T.PropertyAddressFull), LOWER(P.Address) ) > 0 )
  )
  WHERE IFNULL(T.PropertyAddressFull,'') != '';

That condition may be the source of the problem, because it does not reference both tables in the join.
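
One thing to watch with this rewrite: a WHERE filter on a column of the right-hand table runs after the join. Table1 rows with no match have T.PropertyAddressFull set to NULL, so they are filtered out and the LEFT JOIN effectively behaves like an INNER JOIN. The sketch below contrasts the two placements of the filter. The tables, columns, and rows are hypothetical stand-ins, and the join uses a single equality condition so that it stays within what BigQuery accepts.

```sql
#standardSQL
-- Hypothetical stand-in data: id 2 has no counterpart in table2.
WITH table1 AS (
  SELECT 1 AS id UNION ALL
  SELECT 2
), table2 AS (
  SELECT 1 AS id, '12 Main St' AS address
), filter_in_where AS (
  -- The filter runs after the join, so the unmatched row
  -- (t.address IS NULL) is removed.
  SELECT p.id, t.address
  FROM table1 AS p
  LEFT JOIN table2 AS t
    ON p.id = t.id
  WHERE IFNULL(t.address, '') != ''
), filter_in_on AS (
  -- The filter is part of the match, so the unmatched row
  -- is kept with a NULL address.
  SELECT p.id, t.address
  FROM table1 AS p
  LEFT JOIN table2 AS t
    ON p.id = t.id
   AND IFNULL(t.address, '') != ''
)
SELECT 'filter in WHERE' AS variant, COUNT(*) AS row_count
FROM filter_in_where
UNION ALL
SELECT 'filter in ON', COUNT(*)
FROM filter_in_on
```

The WHERE variant should report 1 row and the ON variant 2, since only the ON variant keeps the Table1 row that has no match.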

Answer 3 (score: 0)

Maybe try using CASE as the join condition?

SELECT 
    Source, 
    FirstName, 
    LastName,
    MiddleName, 
    Gender, 
    Age, 
    DOB, 
    Address, 
    Address2,
    City, 
    State, 
    Zip, 
    Zip4, 
    TimeZone, 
    Income, 
    HomeValue, 
    Networth, 
    MaritalStatus, 
    IsRenter, 
    HasChildren, 
    CreditRating, 
    Investor, 
    LinesOfCredit, 
    InvestorRealEstate, 
    Traveler, 
    Pets, 
    MailResponder, 
    Charitable, 
    PolicalDonations, 
    PoliticalParty, 
    coalesce(P.ATTOM_ID, T.ATTOM_ID) as ATTOM_ID,
    coalesce(P.GEOID, T.GEOID) as GEOID,
    Score,
    Score1,
    Score2,
    Score3,
    Score4,
    Score5,
    PropertyLatitude AS Latitude,
    PropertyLongitude AS Longitude
FROM `db.ds.Table1` P
LEFT JOIN `db.ds.Table2` T ON 1 = 
CASE 
    WHEN (P.ATTOM_ID = T.ATTOM_ID) 
        THEN 1
    WHEN P.Zip = T.PropertyAddressZIP
            AND ( 
                    LOWER(P.LastName) = LOWER(T.DeedOwner1NameLast)
                    OR LOWER(P.LastName) = LOWER(T.PartyOwner1NameLast)
                )
            AND ( 
                    STRPOS(LOWER(P.Address), LOWER(T.PropertyAddressFull) ) > 0
                    OR STRPOS(LOWER(T.PropertyAddressFull), LOWER(P.Address) ) > 0 
                )
            AND IFNULL(T.PropertyAddressFull,'') != ''
        THEN 1
    ELSE 0 END