Question

I have a lot of data in PostgreSQL, but I need to build some pivot tables the way SPSS does. For example, I have a table of cities and states.

create table cities
(
    city integer,
    state integer
);
insert into cities(city,state) values (1,1);
insert into cities(city,state) values (2,2);
insert into cities(city,state) values (3,1);
insert into cities(city,state) values (4,1);

This table actually holds 4 cities and 2 states. I want to build a pivot table with percentages, like this:

city\state |state-1| state-2|
city1      |33%    |0%      |
city2      |0%     |100%    |
city3      |33%    |0%      |
city4      |33%    |0%      |
totalCount |3      |1       |

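I understand how to do this in SQL for this particular case. What I want, though, is to cross one variable against another (count the distinct values and divide by count(*) where variable_in_column_names = 1, and so on) using some stored function. I am currently looking at PL/Python. My questions are: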

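How do I output a record set, without a temporary table, whose shape fits the number and types of the output columns?
Is there perhaps a working solution already?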

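As I see it, the input would be the table name, the column name of the first variable, and the column name of the second variable. The function body would run a lot of queries (count(*), looping over every distinct value of each variable and counting it, and so on) and then return a table of percentages.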

I don't actually have many rows in a single query (around 10k), so perhaps the best way to do this is in plain Python rather than PL/Python?

Answer 1

You may want to try pandas, an excellent Python data analysis library.

To query the PostgreSQL database:

import psycopg2
import pandas as pd

conn_string = "host='localhost' dbname='mydb' user='postgres' password='password'"
conn = psycopg2.connect(conn_string)
# Load the query result into a DataFrame
df = pd.read_sql('select * from cities', con=conn)

where df is a DataFrame like:

   city  state
0     1      1
1     2      2
2     3      1
3     4      1

You can then create a pivot table with pivot_table and divide by the totals to get percentages:

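# Number of cities in each state
totals = df.groupby('state').size()
# Count each city/state pair, then divide every column by its state total
pivot = pd.pivot_table(df, index='city', columns='state', aggfunc='size', fill_value=0) / totals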

which gives you:

state   1   2
city        
1    0.333333   0
2    0          1
3    0.333333   0
4    0.333333   0

Finally, to get the layout you want, you just need to rename the index and columns and append the totals:

# Turn the per-state totals into a single row labelled 'totalCount'
totals_frame = pd.DataFrame(totals).T
totals_frame.index = ['totalCount']

pivot.index = ['city%i' % item for item in pivot.index]
# Stack the totals row under the percentages
final_result = pd.concat([pivot, totals_frame])
final_result.columns = ['state-%i' % item for item in final_result.columns]

which gives you:

            state-1     state-2
city1       0.333333    0
city2       0.000000    1
city3       0.333333    0
city4       0.333333    0
totalCount  3.000000    1
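
The question asks for something reusable that takes two variable names and returns the table of percentages. A minimal sketch of that idea follows, assuming the data is already in a DataFrame. The function name percent_crosstab and its parameters are hypothetical, and it uses pd.crosstab for the counts rather than pivot_table. Percentages are rounded to whole numbers to match the layout in the question.

```python
import pandas as pd


def percent_crosstab(df, row_var, col_var):
    """Cross-tabulate row_var against col_var as column percentages.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Count how often each (row, column) pair occurs; absent pairs become 0.
    counts = pd.crosstab(df[row_var], df[col_var])
    # Column totals, e.g. the number of cities in each state.
    totals = counts.sum(axis=0)
    # Share of each column total, rendered as whole-number percentage strings.
    percents = (counts / totals * 100).round().astype(int).astype(str) + "%"
    # Label the rows after the row variable, e.g. city1, city2, ...
    percents.index = [f"{row_var}{value}" for value in percents.index]
    # Append the totals as a final row labelled totalCount.
    totals_row = totals.astype(str).to_frame("totalCount").T
    result = pd.concat([percents, totals_row])
    # Label the columns after the column variable, e.g. state-1, state-2.
    result.columns = [f"{col_var}-{value}" for value in result.columns]
    return result


# Same data as the cities table in the question.
cities = pd.DataFrame({"city": [1, 2, 3, 4], "state": [1, 2, 1, 1]})
table = percent_crosstab(cities, "city", "state")
print(table)
```

With around 10k rows this runs comfortably in plain Python, so the DataFrame can come straight from the query above rather than from a PL/Python function.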

Answer 2

Take a look at PostgreSQL window functions. They may give you a non-(PL/)Python solution. http://blog.hashrocket.com/posts/sql-window-functions
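
As a rough sketch of what a window-function query for this could look like, the snippet below computes each city's share of its state total. To stay self-contained it runs against an in-memory SQLite database (window functions need SQLite 3.25 or newer) as a stand-in for PostgreSQL. That stand-in is an assumption made for illustration; the SQL uses only standard syntax and should also run on PostgreSQL. The result is in long form, one row per city/state pair, not yet pivoted into columns.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city INTEGER, state INTEGER)")
conn.executemany(
    "INSERT INTO cities (city, state) VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 1), (4, 1)],
)

# Inner query: count rows per (city, state) pair.
# Outer query: divide each count by the total for its state,
# obtained with SUM(...) OVER (PARTITION BY state).
QUERY = """
SELECT city,
       state,
       n * 1.0 / SUM(n) OVER (PARTITION BY state) AS share
FROM (
    SELECT city, state, COUNT(*) AS n
    FROM cities
    GROUP BY city, state
) AS counts
ORDER BY city, state
"""

rows = conn.execute(QUERY).fetchall()
for city, state, share in rows:
    print(f"city{city} state-{state} {share:.0%}")
conn.close()
```

Pairs that never occur (such as city 1 in state 2) do not appear in this output, so the zero cells of the pivot table would still have to be filled in when reshaping.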

Create a pivot table in SQL like in SPSS
