Combining similar data in MySQL

Time: 2016-11-15 11:32:43

Tags: mysql sql hive hiveql

I have the following data:

 date     | source              | session | device
 5/1/2016 | facebook.com/social | 5       | mobile
 5/1/2016 | facebook.com/post   | 50      | desktop
 5/1/2016 | facebook.com/commun | 25      | mobile
 5/1/2016 | pintrest.com/social | 15      | mobile
 5/1/2016 | pintrest.com/commun | 25      | mobile

I need the following result:

 date     | source              | session | device
 5/1/2016 | facebook            | 30      | mobile
 5/1/2016 | facebook            | 50      | desktop
 5/1/2016 | pintrest            | 40      | mobile

I am using a MySQL database.

1 answer:

Answer 0 (score: 1)

Assuming you can treat the first occurrence of a dot ('.') as the end of the shortened URL, the following should work for you.

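-- Group rows by date, device, and the part of source before the first dot.
-- Assumes the column is named session, as in the question's sample table.
select
  date
  , LEFT(source, LOCATE('.', source) - 1) as short_source
  , sum(session) as sessions
  , device
from date
group by
  date
  , LEFT(source, LOCATE('.', source) - 1)
  , device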

SQL Fiddle
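
To try the query without the fiddle, a minimal setup might look like the sketch below. The table name `date` is taken from the query in this answer; the column types, and the column name `session` (copied from the question's sample table), are assumptions rather than the asker's actual schema.

```sql
-- Hypothetical schema: column types are guessed from the sample rows.
CREATE TABLE `date` (
  `date`  VARCHAR(10),
  source  VARCHAR(50),
  session INT,
  device  VARCHAR(10)
);

-- Sample rows copied from the question.
INSERT INTO `date` (`date`, source, session, device) VALUES
  ('5/1/2016', 'facebook.com/social', 5,  'mobile'),
  ('5/1/2016', 'facebook.com/post',   50, 'desktop'),
  ('5/1/2016', 'facebook.com/commun', 25, 'mobile'),
  ('5/1/2016', 'pintrest.com/social', 15, 'mobile'),
  ('5/1/2016', 'pintrest.com/commun', 25, 'mobile');

-- Sum sessions per date, shortened source, and device.
SELECT
    `date`
  , LEFT(source, LOCATE('.', source) - 1) AS short_source
  , SUM(session) AS sessions
  , device
FROM `date`
GROUP BY
    `date`
  , LEFT(source, LOCATE('.', source) - 1)
  , device;
```

With these rows the result should contain the three lines asked for in the question (facebook/mobile 30, facebook/desktop 50, pintrest/mobile 40); the row order is not guaranteed without an ORDER BY.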

OK, to cater for a table that contains invalid URLs (in this case, URLs without a dot):

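-- LEFT() returns '' (not NULL) when LOCATE() finds no dot, so NULLIF turns
-- the empty string into NULL before COALESCE substitutes the label.
select
  date
  , COALESCE(NULLIF(LEFT(source, LOCATE('.', source) - 1), ''), 'invalid_url') as short_source
  , sum(session) as sessions
  , device
from date
group by
  date
  , COALESCE(NULLIF(LEFT(source, LOCATE('.', source) - 1), ''), 'invalid_url')
  , device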
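
When handling dot-less values it is worth checking what the expression actually yields: in MySQL, `LOCATE` returns 0 when the substring is not found, and `LEFT` with a non-positive length returns an empty string rather than NULL. The short probe below illustrates this, alongside `SUBSTRING_INDEX` as a possible alternative. The literal 'localhost' is a made-up test value, not a row from the table.

```sql
-- 'localhost' is a hypothetical dot-less value used only for illustration.
SELECT
    LOCATE('.', 'localhost')                        AS dot_position  -- 0: no dot found
  , LEFT('localhost', LOCATE('.', 'localhost') - 1) AS left_part     -- '' (empty string, not NULL)
  , SUBSTRING_INDEX('localhost', '.', 1)            AS whole_value   -- 'localhost': unchanged when there is no dot
  , SUBSTRING_INDEX('facebook.com/social', '.', 1)  AS before_dot;   -- 'facebook'
```

If dot-less values should simply be grouped under their full text instead of a placeholder label, `SUBSTRING_INDEX(source, '.', 1)` could stand in for the `LEFT`/`LOCATE` expression in both the select list and the group by.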