I have loan data that I want to group by date, with the total amount broken out per product.
My data looks like this:
disbursementdate | amount | product | cluster
2017-01-01 | 1000 | HL | West
2018-02-01 | 1000 | PL | East
After the query, I would ideally like the result to look like this:
Month | HL | PL
January 2018 | 1000 | 0
February 2018 | 100 | 1000
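Note that there may be more products, and there is no way to know in advance how many there will be, so a hard-coded SUM(CASE WHEN ...) per product will not work.
I am struggling to work out the query.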
Answer 0 (score: 0)
You can use Pandas and its dedicated method pd.DataFrame.pivot_table:
import pandas as pd
# read data
df = pd.read_csv('file.csv')
# extract month
df['Month'] = pd.to_datetime(df['disbursementdate']).apply(lambda x: x.replace(day=1))
# pivot results
res = df.pivot_table(index='Month', columns='product', values='amount',
                     aggfunc='sum', fill_value=0).reset_index()
# reformat month
res['Month'] = res['Month'].dt.strftime('%B %Y')
print(res)
product Month HL PL
0 January 2017 1000 0
1 February 2018 0 1000
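The same pivot can be tried without a CSV file. The following is a minimal sketch, assuming a small in-memory DataFrame whose rows are illustrative only (not the asker's real data) and using `dt.to_period("M")` as an alternative way to collapse dates to months. Month names from `%B` depend on the locale; an English locale is assumed here.

```python
import pandas as pd

# Illustrative rows with the same columns as the question (values are made up).
df = pd.DataFrame({
    "disbursementdate": ["2017-01-01", "2018-02-01", "2018-02-15"],
    "amount": [1000, 1000, 100],
    "product": ["HL", "PL", "HL"],
    "cluster": ["West", "East", "West"],
})

# Collapse every date to its calendar month.
df["Month"] = pd.to_datetime(df["disbursementdate"]).dt.to_period("M")

# One column per product; the product list is taken from the data itself.
res = df.pivot_table(index="Month", columns="product", values="amount",
                     aggfunc="sum", fill_value=0).reset_index()

# Format the month only after pivoting, so rows stay in chronological order.
res["Month"] = res["Month"].dt.strftime("%B %Y")
res.columns.name = None
print(res)
```

Because the columns come from whatever values appear in `product`, a new product shows up as a new column without any change to the code.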
Answer 1 (score: 0)
You can do this in MySQL by building the statement dynamically, for example:
DROP TABLE IF EXISTS T;
CREATE TABLE T(disbursementdate DATE, amount INT, product VARCHAR(2), cluster VARCHAR(4));
INSERT INTO T VALUES
('2017-01-01' , 1000 , 'HL' , 'West'),
('2017-01-01' , 1000 , 'OL' , 'West'),
('2018-02-01' , 1000 , 'PL' , 'East'),
('2018-02-01' , 100 , 'HL' , 'West'),
('2018-02-01' , 1000 , 'HL' , 'West');
-- Build one SUM(CASE ...) column per distinct product; CHAR(39) is a single quote.
SET @SQL =
(SELECT CONCAT('SELECT DISBURSEMENTDATE,',
        GROUP_CONCAT(CONCAT('SUM(CASE WHEN PRODUCT = ', CHAR(39), S.PRODUCT, CHAR(39), ' THEN AMOUNT ELSE 0 END) AS ', S.PRODUCT))
        ,' FROM T GROUP BY DISBURSEMENTDATE;')
 FROM
 (SELECT DISTINCT PRODUCT FROM T) S
)
;
-- Prepare and run the generated statement, then release it.
PREPARE SQLSTMT FROM @SQL;
EXECUTE SQLSTMT;
DEALLOCATE PREPARE SQLSTMT;
+------------------+------+------+------+
| DISBURSEMENTDATE | HL | OL | PL |
+------------------+------+------+------+
| 2017-01-01 | 1000 | 1000 | 0 |
| 2018-02-01 | 1100 | 0 | 1000 |
+------------------+------+------+------+
2 rows in set (0.00 sec)
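The dynamic-SQL idea above can also be driven from application code. The sketch below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module with an in-memory database instead of MySQL, groups by month via SQLite's `strftime('%Y-%m', ...)` rather than by the full date, and the helper `quote_ident` is a hypothetical name introduced here. Product values are passed as bound parameters; only the column aliases are spliced into the statement text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T (disbursementdate TEXT, amount INTEGER, product TEXT, cluster TEXT)"
)
conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", [
    ("2017-01-01", 1000, "HL", "West"),
    ("2017-01-01", 1000, "OL", "West"),
    ("2018-02-01", 1000, "PL", "East"),
    ("2018-02-01", 100, "HL", "West"),
    ("2018-02-01", 1000, "HL", "West"),
])

# Discover the product list at run time instead of hard-coding it.
products = [row[0] for row in
            conn.execute("SELECT DISTINCT product FROM T ORDER BY product")]

def quote_ident(name):
    """Hypothetical helper: quote a value for use as a column alias."""
    return '"' + name.replace('"', '""') + '"'

# One SUM(CASE ...) column per product; the product value itself is a bound parameter.
columns = ", ".join(
    "SUM(CASE WHEN product = ? THEN amount ELSE 0 END) AS " + quote_ident(p)
    for p in products
)
sql = (
    "SELECT strftime('%Y-%m', disbursementdate) AS month, " + columns +
    " FROM T GROUP BY month ORDER BY month"
)

rows = conn.execute(sql, products).fetchall()
print(["month"] + products)
for row in rows:
    print(row)
```

With the sample rows from the answer, this yields one row per month with columns month, HL, OL, PL: ('2017-01', 1000, 1000, 0) and ('2018-02', 1100, 0, 1000).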