Postgres: some kind of join?

Date: 2013-07-02 09:22:56

Tags: postgresql join subquery

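I'm very new to postgres and I'm stuck on something I need badly. Also, I'm not working in a proper editor; it's some kind of web-based editor. Please keep that in mind.

Here is my query: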

select coalesce('user') as user_src,
       coalesce(root_domain(hostname), hostname, 'unknown') as web_domain,
       count (*) as nohits
from $log
where coalesce(root_domain(hostname), hostname, 'unknown') in
    (select coalesce(root_domain(hostname), hostname, 'unknown') as web_domain
          from $log
          group by web_domain
          limit 10
    ) 
group by user_src, web_domain
order by user_src, web_domain, nohits desc

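But the results don't look the way I want. I want every user together with that user's own top 10 websites. Right now I get all users and the overall top 10 websites, split across all users. -> Some users just show 0, because they never visited any of the overall top 10.

Thanks for looking into this!

Edit: this is how I changed it (not working; it fails with: ERROR: column "hostname" does not exist)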

select  coalesce('user') as user_src,
        coalesce(root_domain(hostname), hostname, 'unknown') as web_domain,
        count (*) as nohits
from
    (select coalesce('user') as user_src,
            coalesce(root_domain(hostname), hostname, 'unknown') as web_domain,
            count (*) as nohits,
            rank() over (partition by coalesce('user') order by coalesce('user'), count (*) desc) as rank
    from $log
    group by user_src, web_domain) w
where rank <= 2
order by user_src, rank

This, for example, does work (just to confirm that 'hostname' exists):

select  coalesce('user') as user_src,
        coalesce(root_domain(hostname), hostname, 'unknown') as web_domain,
        count (*) as nohits
from $log
group by user_src, web_domain
order by user_src, nohits

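1 answer:

Answer 0: (score: 1)

The query as posted can't show a breakdown by user, because the coalesce('user') part is a single value: 'user' in single quotes is a string literal, not a column, so every row gets the same constant. What will work for you is one of PostgreSQL's window functions. I'll demonstrate a simple example that uses RANK() to get the top N for each user.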

begin;

drop table if exists weblog;
create table weblog (
"user"    int,
url     text
);

insert into weblog values
(1,'http://www.1.com'),
(1,'http://www.1.com'),
(1,'http://www.2.com'),
(1,'http://www.2.com'),
(1,'http://www.3.com'),
(1,'http://www.4.com'),
(1,'http://www.5.com'),
(1,'http://www.6.com'),

(2,'http://www.2.com'),
(2,'http://www.2.com'),
(2,'http://www.3.com'),
(2,'http://www.4.com'),
(2,'http://www.4.com'),
(2,'http://www.4.com'),
(2,'http://www.5.com'),
(2,'http://www.6.com');


select  "user",
        url,
        hits,
        rank
from    (select "user",
                url,
                count(*) as hits,
                rank() over (partition by "user" order by count(*) desc,url) as rank
        from weblog
        group by "user",url) w
where rank <= 2
order by "user",rank;

 user |       url        | hits | rank 
------+------------------+------+------
    1 | http://www.1.com |    2 |    1
    1 | http://www.2.com |    2 |    2
    2 | http://www.4.com |    3 |    1
    2 | http://www.2.com |    2 |    2


rollback;

Hope this works for you.
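
To make the ranking step easier to follow, here is a rough Python sketch of what the query above computes on the same sample rows. It is only an illustration, not how PostgreSQL evaluates the query; the function name top_n_per_user and the rows list are made up for this example.

```python
from collections import Counter, defaultdict

# Same sample rows as the weblog table above: (user, url).
rows = [
    (1, "http://www.1.com"), (1, "http://www.1.com"),
    (1, "http://www.2.com"), (1, "http://www.2.com"),
    (1, "http://www.3.com"), (1, "http://www.4.com"),
    (1, "http://www.5.com"), (1, "http://www.6.com"),
    (2, "http://www.2.com"), (2, "http://www.2.com"),
    (2, "http://www.3.com"), (2, "http://www.4.com"),
    (2, "http://www.4.com"), (2, "http://www.4.com"),
    (2, "http://www.5.com"), (2, "http://www.6.com"),
]


def top_n_per_user(rows, n):
    """Return (user, url, hits, rank) tuples with rank <= n for each user."""
    # Equivalent of: count(*) ... group by "user", url
    hits = Counter(rows)

    # Equivalent of: partition by "user"
    per_user = defaultdict(list)
    for (user, url), count in hits.items():
        per_user[user].append((url, count))

    result = []
    for user in sorted(per_user):
        # Equivalent of: order by count(*) desc, url
        ordered = sorted(per_user[user], key=lambda pair: (-pair[1], pair[0]))
        # url is unique within a user's groups, so the ordering has no ties
        # and a plain position counter gives the same numbers as rank().
        for rank, (url, count) in enumerate(ordered, start=1):
            if rank <= n:
                result.append((user, url, count, rank))
    return result


for row in top_n_per_user(rows, 2):
    print(row)
```

With n = 2 this yields the same four rows as the query result shown above. The point is that the counting and the ranking both happen inside each user's partition, which is what the global "limit 10" subquery in the question cannot do.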


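[After the OP's edit:]

Your outer query should only pull columns out of the inner query, not redo the same steps. The subquery w exposes just user_src, web_domain, nohits and rank; hostname is not visible outside it, which is why you get the "column does not exist" error. Try the following (based on your most recent edit):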

select  user_src,
        web_domain,
        nohits
from
    (select coalesce('user') as user_src,
            coalesce(root_domain(hostname), hostname, 'unknown') as web_domain,
            count (*) as nohits,
            rank() over (partition by coalesce('user') order by coalesce('user'), count (*) desc) as rank
    from $log
    group by user_src, web_domain) w
where rank <= 2
order by user_src, rank
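
The scoping rule behind that error can be reproduced in a small self-contained sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for PostgreSQL, so the error wording differs, and the table name log and its rows are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table log (hostname text)")
conn.executemany(
    "insert into log values (?)",
    [("a.example",), ("a.example",), ("b.example",)],
)

# Derived table that exposes only the aliases web_domain and nohits.
inner = (
    "(select hostname as web_domain, count(*) as nohits "
    "from log group by hostname) w"
)

# Referring to the base column from outside the subquery fails,
# because only the subquery's output columns are visible there.
try:
    conn.execute(f"select hostname from {inner}").fetchall()
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)
print("outer query on hostname:", error)

# Referring to the aliases works.
ok = conn.execute(
    f"select web_domain, nohits from {inner} order by web_domain"
).fetchall()
print("outer query on aliases:", ok)
```

The first outer query is rejected because hostname is not among the subquery's output columns, while the second one succeeds because it only uses the aliases the subquery defines.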