如何创建大小均匀的小组

时间:2016-04-19 23:26:47

标签: c# linq postgresql grouping

我有一张按日期列出汽车信息的表格。

car_id date       <aditional info>
1      20160101
2      20160102
1      20160103
2      20160104
1      20160105
1      20160106
1      20160107
1      20160108
1      20160109
1      20160110

如果汽车有<= 5条记录,则只有一个组。 CASE car_id = 2

如果汽车有<= 10条记录,则会有两个均匀分组。 CASE car_id = 1按日期排序

  • 20160101 - 20160106 GROUP 1
  • 20160107 - 20160110 GROUP 2

如果汽车有'&gt; 10'然后将有三个均匀分布的组。

DESIRE OUTPUT。

car_id date         group_id
1      20160101        1
1      20160103        1
1      20160105        1
1      20160106        1
------------------------
1      20160107        2
1      20160108        2
1      20160109        2
1      20160110        2
------------------------
2      20160102        1
2      20160104        1

我尝试使用ntile(),但不能使群号为dinamic。

SQL Fiddle Demo

SELECT car_id, 
       "date",
       ntile(3) over (partition by car_id order by "date") as group_id
FROM Table1      
  

奖励积分如果可以直接在C# LINQ完成,否则我会在postgres上创建一个函数。

aditional info 我将显示不同颜色(组)的历史汽车信息因此,具有少量数据的汽车将以单一颜色显示。最大颜色数为3。

4 个答案:

答案 0 :(得分:1)

我会使用row_number()count()手动计算群组:

select t1.*,
       (case when cnt <= 5 then 1
             when car_id * 2 <= cnt then 1
             else 2
        end) as grp
from (select t1.*,
             row_number() over (partition by car_id order by date) as seqnum,
             count(*) over (partition by car_id) as cnt
     from table1
    ) t
where cnt <= 10
order by car_id, grp, date;

答案 1 :(得分:1)

您可以使用ntile

SELECT car_id, "date",
        ntile(CASE WHEN c <= 5 THEN 1
                   WHEN c <= 10 THEN 2
                   ELSE 3
              END)  OVER (PARTITION BY car_id ORDER BY "date") AS group_id
FROM (SELECT car_id, "date",COUNT(*) OVER(PARTITION BY car_id) AS c
      FROM Table1) AS s

SqlFiddleDemo

输出:

╔════════╦══════════╦══════════╗
║ car_id ║   date   ║ group_id ║
╠════════╬══════════╬══════════╣
║      1 ║ 20160101 ║        1 ║
║      1 ║ 20160103 ║        1 ║
║      1 ║ 20160105 ║        1 ║
║      1 ║ 20160106 ║        1 ║
║      1 ║ 20160107 ║        2 ║
║      1 ║ 20160108 ║        2 ║
║      1 ║ 20160109 ║        2 ║
║      1 ║ 20160110 ║        2 ║
║      2 ║ 20160102 ║        1 ║
║      2 ║ 20160104 ║        1 ║
╚════════╩══════════╩══════════╝

答案 2 :(得分:1)

<强> SQL Fiddle Demo

正如戈登所说。

  • 首先计算每个car_id
  • 的日期数
  • 然后根据cnt
  • 指定我想要的群组数量
  • 然后使用groups作为ntile(grp)分析函数

WITH car_dates_count as (
  select t1.*,             
         count(*) over (partition by car_id) as cnt
  from table1 t1
),
car_groups as (
  select cdc.*,
       (case when cnt <= 5  then 1
             when cnt <= 10 then 2
                            else 3
        end) as grp
  from car_dates_count cdc
) 
SELECT *,
       ntile(grp) over (partition by car_id order by "date") as group_id
FROM car_groups;  

<强>输出

| car_id |     date | cnt | grp | group_id |
|--------|----------|-----|-----|----------|
|      1 | 20160101 |   9 |   2 |        1 | \
|      1 | 20160102 |   9 |   2 |        1 |  |
|      1 | 20160103 |   9 |   2 |        1 |  |
|      1 | 20160104 |   9 |   2 |        1 |  |
|      1 | 20160105 |   9 |   2 |        1 |  |=> (cnt 9 <= 10) Mean two groups
|      1 | 20160106 |   9 |   2 |        2 |  |
|      1 | 20160107 |   9 |   2 |        2 |  |
|      1 | 20160108 |   9 |   2 |        2 |  |
|      1 | 20160109 |   9 |   2 |        2 | /
--------------------------------------------
|      2 | 20160101 |   5 |   1 |        1 | \
|      2 | 20160102 |   5 |   1 |        1 |  |
|      2 | 20160103 |   5 |   1 |        1 |  |=> (cnt 5 <= 5) Mean one group
|      2 | 20160104 |   5 |   1 |        1 |  |
|      2 | 20160105 |   5 |   1 |        1 | /
--------------------------------------------
|      3 | 20160101 |  16 |   3 |        1 | \
|      3 | 20160102 |  16 |   3 |        1 |  |
|      3 | 20160103 |  16 |   3 |        1 |  |  
|      3 | 20160104 |  16 |   3 |        1 |  |
|      3 | 20160105 |  16 |   3 |        1 |  |
|      3 | 20160106 |  16 |   3 |        1 |  |
|      3 | 20160107 |  16 |   3 |        2 |  |
|      3 | 20160108 |  16 |   3 |        2 |  |=> (cnt 16 > 10) Mean three groups
|      3 | 20160109 |  16 |   3 |        2 |  |
|      3 | 20160110 |  16 |   3 |        2 |  |
|      3 | 20160111 |  16 |   3 |        2 |  |
|      3 | 20160112 |  16 |   3 |        3 |  |
|      3 | 20160113 |  16 |   3 |        3 |  |
|      3 | 20160114 |  16 |   3 |        3 |  |
|      3 | 20160115 |  16 |   3 |        3 |  |
|      3 | 20160116 |  16 |   3 |        3 | /

答案 3 :(得分:0)

假设您已按如下方式定义Car类,则可以使用Linq执行此操作。

public class Car
{
    public int car_id;
    public DateTime date;       
    // additional info
}

我们需要在car_id上执行两次分组,然后在均匀分割到多个组时应用分组。我更喜欢使用Linq

    var grouped = cars.GroupBy(c=>c.car_id)
        .Select(c => 
                new 
                {
                    car_grp_id = c.Key,
                    splits = c.Select((s,i)=> 
                    new 
                    {
                        grp_id = i/(c.Count() <= 5 ? 5 :(c.Count() %2 ==0)? c.Count() /2 : (c.Count() /3 +1)),
                        item = s
                    }), 

                })          
        .Select(s=> 
                new 
                {                       
                    grouponcars = s.splits.GroupBy(g=>g.grp_id)
                        .Select(x=>
                        new 
                        {
                            group_id = x.Key,
                            cars = x.Select(y=>y.item)
                        })
                })
        .ToList();

<强>输出

groupid : 0 -       Car_Id : 1 -         Date -1/1/2016 12:00:00 AM
groupid : 0 -       Car_Id : 1 -         Date -1/3/2016 12:00:00 AM
groupid : 0 -       Car_Id : 1 -         Date -1/5/2016 12:00:00 AM
groupid : 0 -       Car_Id : 1 -         Date -1/6/2016 12:00:00 AM
----------------------------------------
groupid : 1 -       Car_Id : 1 -         Date -1/7/2016 12:00:00 AM
groupid : 1 -       Car_Id : 1 -         Date -1/8/2016 12:00:00 AM
groupid : 1 -       Car_Id : 1 -         Date -1/9/2016 12:00:00 AM
groupid : 1 -       Car_Id : 1 -         Date -1/10/2016 12:00:00 AM
----------------------------------------
groupid : 0 -       Car_Id : 2 -         Date -1/2/2016 12:00:00 AM
groupid : 0 -       Car_Id : 2 -         Date -1/4/2016 12:00:00 AM

选中此Demo