Hive left join not working

Time: 2017-02-25 06:41:14

Tags: sql join hive

My user table contains the following data as columns:

region,cust no,mobileno,null,host,null,usage,null,usageduration
AP      404070620021081 Prepaid 919848052151    NULL    Facebook        NULL    2.9384765625    NULL    1.726
AP      404070620021081 Prepaid 919848052151    NULL    HTTP    NULL    1.0146484375    NULL    0.232
AP      404070620021081 Prepaid 919848052151    NULL    Bing    NULL    8.8642578125    NULL    0.746
AP      404070620021081 Prepaid 919848052151    NULL    Crashlytics     NULL    19.4599609375   NULL    48.765
AP      404070620021081 Prepaid 919848052151    NULL    DNS     NULL    17.4296875      NULL    584.596
AP      404070620021081 Prepaid 919848052151    NULL    Doubleclick     NULL    6.908203125     NULL    1.362
AP      404070620021081 Prepaid 919848052151    NULL    Dropbox NULL    37.0380859375   NULL    42.174
AP      404070620021081 Prepaid 919848052151    NULL    Facebook        NULL    21.1533203125   NULL    29.689
AP      404070620021081 Prepaid 919848052151    NULL    Google  NULL    49.0732421875   NULL    28.456
AP      404070620021081 Prepaid 919848052151    NULL    Google APIs     NULL    213.8642578125  NULL    49.866
AP      404070620021081 Prepaid 919848052151    NULL    Google Ads      NULL    5.7314453125    NULL    0.932
AP      404070620021081 Prepaid 919848052151    NULL    Google Calendar NULL    0.201171875     NULL    0.06
AP      404070620021081 Prepaid 919848052151    NULL    Google Cloud Messaging  NULL    8.5419921875    NULL    143.50799999999998
AP      404070620021081 Prepaid 919848052151    NULL    Google Play     NULL    228.7880859375  NULL    88.77600000000001
AP      404070620021081 Prepaid 919848052151    NULL    HTTP    NULL    0.29296875      NULL    1.16
AP      404070620021081 Prepaid 919848052151    NULL    NTP     NULL    0.1484375       NULL    0.122
AP      404070620021081 Prepaid 919848052151    NULL    SSL     NULL    96.095703125    NULL    452.88
AP      404070620021081 Prepaid 919848052151    NULL    Skype   NULL    93.6953125      NULL    67.649
AP      404070620021081 Prepaid 919848052151    NULL    TCP     NULL    93.591796875    NULL    117.32900000000001
AP      404070620021081 Prepaid 919848052151    NULL    WhatsApp        NULL    165780.6171875  NULL    1097.055
AP      404070620021081 Prepaid 919848052151    NULL    XMPP    NULL    62.4453125      NULL    350.03700000000003


My top20 table contains host, rank:
SSL                     1
TCP                     2
DNS                     3
HTTP                    4
Facebook                5
Google Play             6
Google Cloud Messaging  7
YouTube                 8
UDP                     9
XMPP                    10
Skype                   11
WhatsApp                12
Bittorrent              13
Google                  14
STUN                    15
Google APIs             16
Doubleclick             17
Apple                   18
MDNS                    19
Google Ads              20

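I need the usage and usage duration of each customer for these top 20 hosts. If a customer has not used a host, it should show 0, but every customer must have 20 rows. I did a left join but got 420 rows with all combinations, which is wrong. Please suggest how to get 20 rows for each customer.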

2 Answers:

Answer 0: (score: 0)

One approach is to use a cross join to build all combinations of cust_no and host, and then left join the user table to that result.

select t.cust_no,
    t.host,
    coalesce(u.usage, 0) usage,
    coalesce(u.usageduration, 0) usageduration
from (
    select *
    from (
        select distinct cust_no
        from user
        ) u
    cross join top20 t
    ) t
left join user u on t.cust_no = u.cust_no
    and t.host = u.host;
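
One caveat with the sample data: Facebook and HTTP each appear twice for the same customer, so joining the raw rows would return 22 rows rather than 20. A minimal sketch of the same cross join idea with the usage summed per customer and host first is shown below. It runs in SQLite through Python's sqlite3 module as a stand-in for Hive. The names user_usage and host_rank are assumptions chosen to avoid keyword clashes, and the table holds only the columns the query needs.

```python
import sqlite3

CUST = "404070620021081"
# (host, usage, usageduration) rows copied from the question's sample data.
USAGE = [
    ("Facebook", 2.9384765625, 1.726), ("HTTP", 1.0146484375, 0.232),
    ("Bing", 8.8642578125, 0.746), ("Crashlytics", 19.4599609375, 48.765),
    ("DNS", 17.4296875, 584.596), ("Doubleclick", 6.908203125, 1.362),
    ("Dropbox", 37.0380859375, 42.174), ("Facebook", 21.1533203125, 29.689),
    ("Google", 49.0732421875, 28.456), ("Google APIs", 213.8642578125, 49.866),
    ("Google Ads", 5.7314453125, 0.932), ("Google Calendar", 0.201171875, 0.06),
    ("Google Cloud Messaging", 8.5419921875, 143.50799999999998),
    ("Google Play", 228.7880859375, 88.77600000000001),
    ("HTTP", 0.29296875, 1.16), ("NTP", 0.1484375, 0.122),
    ("SSL", 96.095703125, 452.88), ("Skype", 93.6953125, 67.649),
    ("TCP", 93.591796875, 117.32900000000001),
    ("WhatsApp", 165780.6171875, 1097.055),
    ("XMPP", 62.4453125, 350.03700000000003),
]
# Hosts in rank order, as listed in the question's top20 table.
TOP20 = ["SSL", "TCP", "DNS", "HTTP", "Facebook", "Google Play",
         "Google Cloud Messaging", "YouTube", "UDP", "XMPP", "Skype",
         "WhatsApp", "Bittorrent", "Google", "STUN", "Google APIs",
         "Doubleclick", "Apple", "MDNS", "Google Ads"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_usage "
             "(cust_no TEXT, host TEXT, usage REAL, usageduration REAL)")
conn.execute("CREATE TABLE top20 (host TEXT, host_rank INTEGER)")
conn.executemany("INSERT INTO user_usage VALUES (?, ?, ?, ?)",
                 [(CUST, h, u, d) for h, u, d in USAGE])
conn.executemany("INSERT INTO top20 VALUES (?, ?)",
                 [(h, i) for i, h in enumerate(TOP20, start=1)])

# Joining the raw rows: duplicated hosts produce extra output rows.
raw_count = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT DISTINCT cust_no FROM user_usage) c
    CROSS JOIN top20 t
    LEFT JOIN user_usage u ON u.cust_no = c.cust_no AND u.host = t.host
""").fetchone()[0]

# Summing per (cust_no, host) first gives exactly one row per top20 host.
rows = conn.execute("""
    SELECT c.cust_no, t.host,
           COALESCE(a.usage, 0) AS usage,
           COALESCE(a.usageduration, 0) AS usageduration
    FROM (SELECT DISTINCT cust_no FROM user_usage) c
    CROSS JOIN top20 t
    LEFT JOIN (
        SELECT cust_no, host,
               SUM(usage) AS usage, SUM(usageduration) AS usageduration
        FROM user_usage
        GROUP BY cust_no, host
    ) a ON a.cust_no = c.cust_no AND a.host = t.host
    ORDER BY c.cust_no, t.host_rank
""").fetchall()

print(raw_count, len(rows))  # 22 20
```

The SQL inside the strings is plain enough that it should carry over to Hive with the real table and column names, though that has not been checked against a Hive installation here.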

Answer 1: (score: 0)

Hi, I created a query as shown below:

SELECT
  imsi,
  msisdn,Subscription_plan,
  cust_nationality,
  application_name,
  rank,is_app,
  total_data_volume_host,
  SUM(total_data_volume_app) AS total_data_volume_app,
  event_duration_host,
  SUM(event_duration_app) AS event_duration_app
FROM (SELECT
  ps_data.cust_nationality,
  ps_data.imsi,ps_data.Subscription_plan,
  ps_data.msisdn,
  ps_data.host,
  top20.host_app AS application_name,
  top20.rank,
  top20.is_app,
  ps_data.total_data_volume_host,
  ps_data.total_data_volume_app,
  ps_data.event_duration_host,
  ps_data.event_duration_app
FROM (SELECT
  circle,
  host_app,
  rank,
  is_app
FROM ps_top20_host_app_1_month
WHERE is_app = '1') top20
LEFT JOIN (SELECT
  cust_nationality,
  imsi,Subscription_plan,
  msisdn,
  NULL AS host,
  application_name,
  NULL AS total_data_volume_host,
  SUM(COALESCE(total_data_volume,0)) / 1024 AS total_data_volume_app,
  NULL AS event_duration_host,
  SUM(COALESCE(data_transfer_time_dl, 0)/1000 + COALESCE(data_transfer_time_ul, 0)/1000) AS event_duration_app
FROM ps_data_up_segg_1_day
WHERE content_provider='1' and cust_nationality is not null and UPPER(cust_nationality) not in ('UNKNOWN','NULL IN SOURCE') and 
msisdn is not null and UPPER(msisdn) not in ('UNKNOWN','NULL IN SOURCE') AND dt >= '1487615400000'AND dt < '1487701800000' AND imsi='404070620021081'
GROUP BY cust_nationality,
         imsi,
         msisdn,Subscription_plan,
         host,
         application_name) ps_data
  ON (
  top20.circle = ps_data.cust_nationality) where ( top20.host_app is not null and top20.host_app=ps_data.application_name ) )t2
GROUP BY imsi,
         msisdn,Subscription_plan,
         cust_nationality,
         host,
         application_name,
         rank,is_app,
         total_data_volume_host,
         event_duration_host

But I only get 14 rows, not all 20:

404070620021081 919848052151 Prepaid AP DNS 3 1 NULL 17.4296875 NULL 584.596
404070620021081 919848052151 Prepaid AP Doubleclick 17 1 NULL 6.908203125 NULL 1.362
404070620021081 919848052151 Prepaid AP Facebook 5 1 NULL 24.091796875 NULL 31.415
404070620021081 919848052151 Prepaid AP Google 14 1 NULL 49.0732421875 NULL 28.456
404070620021081 919848052151 Prepaid AP Google APIs 16 1 NULL 213.8642578125 NULL 49.866
404070620021081 919848052151 Prepaid AP Google Ads 20 1 NULL 5.7314453125 NULL 0.932
404070620021081 919848052151 Prepaid AP Google Cloud Messaging 7 1 NULL 8.5419921875 NULL 143.50799999999998
404070620021081 919848052151 Prepaid AP Google Play 6 1 NULL 228.7880859375 NULL 88.77600000000001
404070620021081 919848052151 Prepaid AP HTTP 4 1 NULL 1.3076171875 NULL 1.392
404070620021081 919848052151 Prepaid AP SSL 1 1 NULL 96.095703125 NULL 452.88
404070620021081 919848052151 Prepaid AP Skype 11 1 NULL 93.6953125 NULL 67.649
404070620021081 919848052151 Prepaid AP TCP 2 1 NULL 93.591796875 NULL 117.32900000000001
404070620021081 919848052151 Prepaid AP WhatsApp 12 1 NULL 165780.6171875 NULL 1097.055
404070620021081 919848052151 Prepaid AP XMPP 10 1 NULL 62.4453125 NULL 350.03700000000003

This is wrong for this user. I need 20 records, with 0 usage for the hosts that were not used.
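
A likely reason for the 14 rows is the condition top20.host_app = ps_data.application_name sitting in the WHERE clause. A WHERE filter runs after the left join, so it discards the rows where ps_data is NULL and the join behaves like an inner join. The sketch below reproduces that on a cut-down version of the two tables, again using SQLite through Python's sqlite3 module as a stand-in for Hive. The reduced column set and the name host_rank are assumptions for illustration only.

```python
import sqlite3

# Distinct hosts the customer in the question actually used.
USED = ["Facebook", "HTTP", "Bing", "Crashlytics", "DNS", "Doubleclick",
        "Dropbox", "Google", "Google APIs", "Google Ads", "Google Calendar",
        "Google Cloud Messaging", "Google Play", "NTP", "SSL", "Skype",
        "TCP", "WhatsApp", "XMPP"]
# Hosts in rank order, as listed in the question's top20 table.
TOP20 = ["SSL", "TCP", "DNS", "HTTP", "Facebook", "Google Play",
         "Google Cloud Messaging", "YouTube", "UDP", "XMPP", "Skype",
         "WhatsApp", "Bittorrent", "Google", "STUN", "Google APIs",
         "Doubleclick", "Apple", "MDNS", "Google Ads"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE top20 "
             "(circle TEXT, host_app TEXT, host_rank INTEGER)")
conn.execute("CREATE TABLE ps_data "
             "(cust_nationality TEXT, imsi TEXT, application_name TEXT)")
conn.executemany("INSERT INTO top20 VALUES ('AP', ?, ?)",
                 [(h, i) for i, h in enumerate(TOP20, start=1)])
conn.executemany("INSERT INTO ps_data VALUES ('AP', '404070620021081', ?)",
                 [(h,) for h in USED])

# Host match in WHERE: unmatched top20 hosts are filtered out after the join.
in_where = conn.execute("""
    SELECT t.host_app, p.imsi
    FROM top20 t
    LEFT JOIN ps_data p ON t.circle = p.cust_nationality
    WHERE t.host_app IS NOT NULL AND t.host_app = p.application_name
""").fetchall()

# Host match in ON: every top20 host survives, with NULLs where unused.
in_on = conn.execute("""
    SELECT t.host_app, p.imsi
    FROM top20 t
    LEFT JOIN ps_data p
      ON t.circle = p.cust_nationality
     AND t.host_app = p.application_name
    WHERE t.host_app IS NOT NULL
    ORDER BY t.host_rank
""").fetchall()

unused = [host for host, imsi in in_on if imsi is None]
print(len(in_where), len(in_on))  # 14 20
print(unused)
```

Moving the condition into ON brings back all 20 hosts, but the customer columns (imsi here) are NULL on the six unused ones. With more than one customer in the data, that is why the list of distinct customers still needs to be cross joined with top20 before the left join, as in the first answer.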