Question

I derived the following table

CustomerID | OrderType | Date
=============================
1          | A         | 1/1
1          | B         | 2/1
1          | A         | 3/1
2          | A         | 1/1
2          | A         | 4/1
....

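from a table containing other ID columns, by grouping on CustomerID and OrderType.

When producing this table, I was surprised to find that I also had to select the Date column. Otherwise, each CustomerID-OrderType pair appears only once (for example, without that column the third row would not exist).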
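
The collapsing behaviour described above can be reproduced in a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, and the table name `orders` is a hypothetical stand-in for the source table, filled only with the five sample rows shown in the question.

```python
import sqlite3

# Hypothetical in-memory table holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (CustomerID INTEGER, OrderType TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "A", "1/1"), (1, "B", "2/1"), (1, "A", "3/1"),
     (2, "A", "1/1"), (2, "A", "4/1")],
)

# Grouping on the pair alone collapses repeated (CustomerID, OrderType) rows.
pairs = conn.execute(
    "SELECT CustomerID, OrderType FROM orders "
    "GROUP BY CustomerID, OrderType ORDER BY CustomerID, OrderType"
).fetchall()

# Adding Date to the grouping keeps each order as its own row.
with_date = conn.execute(
    "SELECT CustomerID, OrderType, Date FROM orders "
    "GROUP BY CustomerID, OrderType, Date ORDER BY CustomerID, Date"
).fetchall()

print(len(pairs), pairs)  # 3 [(1, 'A'), (1, 'B'), (2, 'A')]
print(len(with_date))     # 5
```

GROUP BY returns one row per distinct combination of the grouping columns, so the two type-A orders of customer 1 merge unless a column that distinguishes them (here Date) is part of the grouping.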

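In the next step, I want to count the number of orders of each OrderType per customer. I could do this in R (it is easy with dplyr). However, since the file is very large (and memory is an issue in MS SQL Server Management Studio), I would rather obtain a table of the following form directly: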

Customer ID | Count(Type_A) | Count(Type_B)
===========================================
1           | 2             | 1
2           | 2             | 0 
....

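As I said, this is a simple task in R. Can it be done in SQL? I believe an implementation might need some self-joins, but so far I have not been able to work it out.

Any hints?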

Answer 1

There are several ways to achieve this.

A simple example using conditional aggregation (you can either count or sum this way):

select 
    CustomerID,
    -- CASE without ELSE yields NULL for non-matching rows, and COUNT ignores NULLs
    COUNT(case when OrderType = 'A' then 1 end) [COUNT(Type_A)],
    COUNT(case when OrderType = 'B' then 1 end) [COUNT(Type_B)]
from myTable
group by
    CustomerID
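
To make the "count or sum" remark concrete, here is a minimal sketch that runs both variants side by side. It assumes Python's built-in sqlite3 module and an in-memory copy of the sample rows instead of the real SQL Server table, and it uses plain aliases (`Count_Type_A`, `Count_Type_B`) chosen for illustration.

```python
import sqlite3

# In-memory stand-in for myTable, filled with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (CustomerID INTEGER, OrderType TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(1, "A", "1/1"), (1, "B", "2/1"), (1, "A", "3/1"),
     (2, "A", "1/1"), (2, "A", "4/1")],
)

# COUNT variant: non-matching rows become NULL and are not counted.
count_rows = conn.execute("""
    SELECT CustomerID,
           COUNT(CASE WHEN OrderType = 'A' THEN 1 END) AS Count_Type_A,
           COUNT(CASE WHEN OrderType = 'B' THEN 1 END) AS Count_Type_B
    FROM myTable
    GROUP BY CustomerID
    ORDER BY CustomerID
""").fetchall()

# SUM variant: non-matching rows contribute an explicit 0.
sum_rows = conn.execute("""
    SELECT CustomerID,
           SUM(CASE WHEN OrderType = 'A' THEN 1 ELSE 0 END) AS Count_Type_A,
           SUM(CASE WHEN OrderType = 'B' THEN 1 ELSE 0 END) AS Count_Type_B
    FROM myTable
    GROUP BY CustomerID
    ORDER BY CustomerID
""").fetchall()

print(count_rows)  # [(1, 2, 1), (2, 2, 0)]
print(sum_rows)    # [(1, 2, 1), (2, 2, 0)]
```

With SUM, the `ELSE 0` matters: without it, a customer with no matching rows would get NULL rather than 0 for that column.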

Answer 2

For those interested in using SQL to manipulate data frames in R, the sqldf function accepts a query written as a string:

df <- read.table(text ="CustomerID  OrderType  Date
                        1           A          1/1
                        1           B          2/1
                        1           A          3/1
                        2           A          1/1
                        2           A          4/1", 
                 header =TRUE)
library(sqldf)
sqldf("select 
    CustomerID,
    COUNT(case when OrderType = 'A' then 1 end) [COUNT(Type_A)],
    COUNT(case when OrderType = 'B' then 1 end) [COUNT(Type_B)]
from df
group by
    CustomerID")

Output:

  CustomerID COUNT(Type_A) COUNT(Type_B)
1          1             2             1
2          2             2             0

SQL: aggregating groups into columns
