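Get count of UNIQUE records per value

Time: 2014-12-01 20:48:30

Tags: oracle oracle11g

This is related to the following question: split records into buckets based on a sum of counts

The new problem is that one person can have many faxes, and I am trying to get a certain number of people. I usually end up with fewer than I need, because the same person has more than one fax.

In the attached example, is there a way to get the count of UNIQUE people?

This is my table, called NR_PVO_120: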

OtherID     Fax
12365092    2762364204
12005656    2762364204
12484936    2762364204
39003042    2762364204
12365597    2762364204
12635922    2762364204
12332346    2762364204
12365092    4387267572
12005656    4387267572
12365092    4422911281
12005656    4422911281
12484936    4422911281
12651239    4422911281
12388710    4422911281
12686953    4422911281
12365092    4423311213
12005656    4423311213
12709544    4423311213
12484936    4423311213
12005656    4424450542
12346839    4424450542
12365120    4424450542
12484936    4424450542
12086512    4424450542

Based on this table, I created the following query, which is used in the function from the linked question:

SELECT   Fax
     ,COUNT(OtherID) CountOfPeople
    FROM NR_PVO_120
GROUP BY Fax

So the counts look like this:

Fax         CountOfPeople
4422911281  6
4387267572  2
4423311213  4
4424450542  5
2762364204  7

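If you add up all the counts you get 24 people, but there are really only 14, because one person can have more than one fax.

Is there a way to count, for the second fax, only those people who were not already counted for the first fax? Then, for the third fax, only those not counted in the first two, and so on?

So the result would be: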

2762364204  7
4387267572  0
4422911281  3
4423311213  1
4424450542  3

the first fax (2762364204) has 7 people
the second fax (4387267572) has 2, but both of those were already counted in the first fax, so no new unique people were added
the third fax (4422911281) has 6, but only 3 of those haven't already been counted
the fourth fax (4423311213) has 4, but only 1 hasn't been counted earlier
the fifth fax (4424450542) has 5, but only 3 weren't counted earlier

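I know this is not the proper way to create counts, and they are not the true numbers, but that is okay. I just want to get all fax numbers that together have a certain number of people. Say I need 10 people: I have to pick 10 people while making sure that all people on a fax number stay together. If you look at my NR_PVO_120 table and take the first 10 people, you will see that the 9th person starts another fax number that runs through the 11th. I would not take that fax. I would look for a fax with 1 person attached or, if there is none, stop at 9. The point is to get 10 people, but to make sure that all people with the same fax stay together.

Or is there some other way to count only UNIQUE providers (which should equal 14)?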

2 Answers:

Answer 0: (score: 2)

I made a test table:

create table nr_pvo_120 (
   otherid,
   fax
)
as
select 12365092    , 2762364204 from dual union all
select 12005656    , 2762364204 from dual union all
select 12484936    , 2762364204 from dual union all
select 39003042    , 2762364204 from dual union all
select 12365597    , 2762364204 from dual union all
select 12635922    , 2762364204 from dual union all
select 12332346    , 2762364204 from dual union all
select 12365092    , 4387267572 from dual union all
select 12005656    , 4387267572 from dual union all
select 12365092    , 4422911281 from dual union all
select 12005656    , 4422911281 from dual union all
select 12484936    , 4422911281 from dual union all
select 12651239    , 4422911281 from dual union all
select 12388710    , 4422911281 from dual union all
select 12686953    , 4422911281 from dual union all
select 12365092    , 4423311213 from dual union all
select 12005656    , 4423311213 from dual union all
select 12709544    , 4423311213 from dual union all
select 12484936    , 4423311213 from dual union all
select 12005656    , 4424450542 from dual union all
select 12346839    , 4424450542 from dual union all
select 12365120    , 4424450542 from dual union all
select 12484936    , 4424450542 from dual union all
select 12086512    , 4424450542 from dual
/

My first shot was: for each person (otherid), take only their first fax number, and then do a normal group by and count on that:

select first_fax, count(*) firstcount
  from (
   select otherid, min(fax) first_fax
     from nr_pvo_120
    group by otherid
       )
 group by first_fax
 order by first_fax
/

The output will be:

 FIRST_FAX FIRSTCOUNT
---------- ----------
2762364204          7
4422911281          3
4423311213          1
4424450542          3
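
As a sanity check on the total, the FIRSTCOUNT values above add up to 7 + 3 + 1 + 3 = 14. A minimal sketch of counting the unique people directly, assuming the nr_pvo_120 test table created above (the alias unique_people is just an illustrative name):

```sql
-- Assumes the nr_pvo_120 test table from the create table statement above.
-- count(distinct ...) counts each otherid once, no matter how many faxes it has.
select count(distinct otherid) unique_people
  from nr_pvo_120
/
```

With the sample data this should return 14, the same as the sum of the per-fax first counts.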

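Then I noticed that your desired output also includes fax number 4387267572, but with a count of zero. That can be done, for example, like this: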

select fax, count(*) normalcount, count(otherid_on_first_fax) countunused
  from (
   select fax, otherid,
          case
             when fax = min(fax) over (partition by otherid order by fax)
             then otherid
          end otherid_on_first_fax
     from nr_pvo_120
       )
 group by fax
 order by fax
/

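In this output, column NORMALCOUNT is the number of people who have that fax. Column COUNTUNUSED is the number of people who have not already been "used" in the previous counts: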

       FAX NORMALCOUNT COUNTUNUSED
---------- ----------- -----------
2762364204           7           7
4387267572           2           0
4422911281           6           3
4423311213           4           1
4424450542           5           3

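The trick is that otherid_on_first_fax holds the value of otherid only on that person's first fax number; on the rest of the person's fax numbers otherid_on_first_fax is NULL. count(otherid_on_first_fax) then counts all non-null values, of which there are none for fax 4387267572.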
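
Since the goal in the question is to pick fax numbers until a target number of people is reached, a running total over the "unused" counts may help. This is only a sketch, assuming the same nr_pvo_120 test table and assuming faxes are taken in ascending fax order; the aliases newpeople and runningtotal are made up for illustration:

```sql
-- Assumption: faxes are processed in ascending order of fax number.
-- newpeople    = people whose first (lowest) fax is this fax
-- runningtotal = cumulative number of unique people up to and including this fax
select fax, newpeople,
       sum(newpeople) over (order by fax) runningtotal
  from (
   select fax, count(otherid_on_first_fax) newpeople
     from (
      select fax,
             case
                when fax = min(fax) over (partition by otherid)
                then otherid
             end otherid_on_first_fax
        from nr_pvo_120
          )
    group by fax
       )
 order by fax
/
```

With the sample data the running total should come out as 7, 7, 10, 11, 14, so wrapping this in another select with where runningtotal <= 10 would keep the first three fax numbers. It does not go looking for a smaller fax further down when one overshoots the target; that part would need extra logic.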

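Answer 1: (score: 1)

OK, now I get it.

A person can have multiple numbers, but in the result table we see numbers, not people. So the question is: what rule decides which number a person is counted under? If it does not matter: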

SQL> with t as (
  select 12365092 OtherID, 2762364204 Fax from dual union all
  select 12005656, 2762364204 from dual union all
  select 12484936, 2762364204 from dual union all
  select 39003042, 2762364204 from dual union all
  select 12365597, 2762364204 from dual union all
  select 12635922, 2762364204 from dual union all
  select 12332346, 2762364204 from dual union all
  select 12365092, 4387267572 from dual union all
  select 12005656, 4387267572 from dual union all
  select 12365092, 4422911281 from dual union all
  select 12005656, 4422911281 from dual union all
  select 12484936, 4422911281 from dual union all
  select 12651239, 4422911281 from dual union all
  select 12388710, 4422911281 from dual union all
  select 12686953, 4422911281 from dual union all
  select 12365092, 4423311213 from dual union all
  select 12005656, 4423311213 from dual union all
  select 12709544, 4423311213 from dual union all
  select 12484936, 4423311213 from dual union all
  select 12005656, 4424450542 from dual union all
  select 12346839, 4424450542 from dual union all
  select 12365120, 4424450542 from dual union all
  select 12484936, 4424450542 from dual union all
  select 12086512, 4424450542 from dual)
select mx, count(otherid) 
  from (select otherid, max(fax) mx 
          from t
         group by otherid)
 group by mx; 

        MX COUNT(OTHERID)
---------- --------------
4423311213              2
4424450542              5
2762364204              4
4422911281              3

If you need to define the order of the numbers, you can use:

SQL> with t as (<see previous example>)
select fax, count(otherid)
  from (select fax, otherid, row_number() over (partition by otherid order by fax) rn
          from t)
 where rn = 1
 group by fax;

       FAX COUNT(OTHERID)
---------- --------------
4423311213              1
4424450542              3
2762364204              7
4422911281              3

The order by in the analytic function defines which number each person is shown under in the result.
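
To illustrate that last point, here is a sketch that flips the ordering to fax desc, so each person is counted on their highest fax number instead of their lowest. It assumes the nr_pvo_120 test table from the other answer rather than the inline t:

```sql
-- Assumes the nr_pvo_120 table from the other answer.
-- order by fax desc makes rn = 1 the highest fax number of each person.
select fax, count(otherid)
  from (select fax, otherid,
               row_number() over (partition by otherid order by fax desc) rn
          from nr_pvo_120)
 where rn = 1
 group by fax;
```

With the sample data this should give the same counts as the max(fax) query above (2, 5, 4 and 3 for the four fax numbers that appear), just computed with row_number().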