Creating a crosstab with multiple field values

Time: 2019-04-05 08:00:37

Tags: sql postgresql

Here is a sample of the current table:

game_id |player_id|player_name       |event_id|event_desc        |count  |
--------|---------|------------------|--------|------------------|-------|
1       |        1|player1           |       3|Shot              |      1|
1       |        1|player1           |       5|Rebound           |      3|
1       |        1|player1           |       7|Foul              |      1|
1       |        1|player1           |      14|Assist            |      1|
1       |        1|player1           |      17|Subbed in         |      4|
1       |        1|player1           |      18|Subbed out        |      3|
1       |        1|player1           |      19|Drew a Foul       |      2|
1       |        1|player1           |      20|Free Throws Scored|      3|
1       |        1|player1           |      21|Free Throws Missed|      1|
1       |        2|player2           |       3|Shot              |      7|
1       |        2|player2           |       4|Miss              |     10|
1       |        2|player2           |       5|Rebound           |      2|
1       |        2|player2           |       7|Foul              |      1|
1       |        2|player2           |      14|Assist            |      1|
1       |        2|player2           |      17|Subbed in         |      4|
1       |        2|player2           |      18|Subbed out        |      4|
1       |        2|player2           |      19|Drew a Foul       |      2|

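I need to create a view based on this that groups each player's statistics per game. Each statistic is the count for one particular event_id. I have about 20 different IDs to map.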

game id | player_id | shot | Miss | Rebound |Foul | Assist | ...
1       |1          |1     |0     |3        |1    |1
1       |2          |7     |10    |2        |1    |1

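I think I have to use the crosstab function for this, but I'm not sure exactly how to write the code, as I don't really know this area. I'd appreciate it if anyone could help.

I tried the following code: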

select
    game_id,
    player_id,
    player_name,
    sum(case when event_desc = 'Jump Ball' then wjxbfs1 else 0) as Jump_Ball,
    sum(case when event_desc = 'Shot' then wjxbfs1 else 0) as Shot,
    sum(case when event_desc = 'Miss' then wjxbfs1 else 0) as Miss,
    sum(case when event_desc = 'Rebound' then wjxbfs1 else 0) as Rebound,
    sum(case when event_desc = 'Assist' then wjxbfs1 else 0) as Assist,
    sum(case when event_desc = 'Block' then wjxbfs1 else 0) as Block,
    sum(case when event_desc = 'Steal' then wjxbfs1 else 0) as Steal,
    sum(case when event_desc = 'Turnover' then wjxbfs1 else 0) as Turnover,
    sum(case when event_desc = 'Foul' then wjxbfs1 else 0) as Foul,
    sum(case when event_desc = 'Free Throws Taken' then wjxbfs1 else 0) as FT_Taken,
    sum(case when event_desc = 'Free Throws Scored' then wjxbfs1 else 0) as FT_Scored,
    sum(case when event_desc = 'Free Throws MIssed' then wjxbfs1 else 0) as FT_Missed,
    sum(case when event_desc = 'Timeout' then wjxbfs1 else 0) as Timeout,
    sum(case when event_desc = 'Violation' then wjxbfs1 else 0) as Violation,
    sum(case when event_desc = 'Subbed in' then wjxbfs1 else 0) as Subbed_In,
    sum(case when event_desc = 'Subbed out' then wjxbfs1 else 0) as Subbed_Out,
    sum(case when event_desc = 'Drew a Foul' then wjxbfs1 else 0) as Drew_Foul,
    sum(case when event_desc = 'Ejection' then wjxbfs1 else 0) as Ejected   
from
    stats
group by
    game_id,
    player_id,
    player_name 

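It returns the following error: SQL Error [42601]: ERROR: syntax error at or near ")" Position: 119

This was resolved by replacing "else 0" with "end": every CASE expression has to be closed with END.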
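A minimal sketch of the corrected expression, in case it helps to see it run. It assumes the value column is called count, as in the sample table (the query in the question uses a different column name), and it inlines a few sample rows as a CTE named stats so the statement is self-contained. It shows both ways of closing the CASE: "else 0 end" and a bare "end".

```sql
-- Sample rows copied from the question's table. The CTE stands in for the
-- real "stats" table; the column name "count" is an assumption taken from
-- the sample table, not from the question's query.
with stats(game_id, player_id, player_name, event_desc, count) as (
    values
        (1, 1, 'player1', 'Shot',    1),
        (1, 1, 'player1', 'Rebound', 3),
        (1, 2, 'player2', 'Shot',    7),
        (1, 2, 'player2', 'Miss',   10),
        (1, 2, 'player2', 'Rebound', 2)
)
select
    game_id,
    player_id,
    player_name,
    -- "else 0 end": an event the player never had is reported as 0
    sum(case when event_desc = 'Shot' then count else 0 end) as shot,
    sum(case when event_desc = 'Miss' then count else 0 end) as miss,
    -- bare "end": the CASE yields NULL for non-matching rows, so the sum
    -- is NULL when the player has no row for that event
    sum(case when event_desc = 'Miss' then count end) as miss_or_null
from
    stats
group by
    game_id,
    player_id,
    player_name
order by
    game_id,
    player_id;
```

With these rows, player1 has no Miss event, so miss is 0 while miss_or_null is NULL. Keeping "else 0 end" matches the desired output in the question, which shows 0 for missing events.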

3 Answers:

Answer 0: (score: 1)

You can try this using a CASE WHEN expression.

Answer 1: (score: 1)

max(case when . . . end) is perfectly good syntax. However, in Postgres, I am fond of the filter keyword, which can express the same aggregation.

Note that this is slightly faster, and the syntax is ISO/ANSI standard (although I don't think any other database supports it).
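A sketch of what the filter form might look like for the table in the question. The column name count and the inlined sample rows are assumptions made so the statement runs on its own, and the coalesce wrapper is an addition (not part of the answer) that turns a missing event into 0 instead of NULL. The FILTER clause on aggregates requires PostgreSQL 9.4 or later.

```sql
-- The CTE stands in for the real "stats" table from the question.
with stats(game_id, player_id, player_name, event_desc, count) as (
    values
        (1, 1, 'player1', 'Shot',    1),
        (1, 1, 'player1', 'Rebound', 3),
        (1, 2, 'player2', 'Shot',    7),
        (1, 2, 'player2', 'Miss',   10),
        (1, 2, 'player2', 'Rebound', 2)
)
select
    game_id,
    player_id,
    player_name,
    -- Only rows passing the FILTER condition are fed to sum();
    -- coalesce maps "no matching rows" (NULL) to 0.
    coalesce(sum(count) filter (where event_desc = 'Shot'), 0)    as shot,
    coalesce(sum(count) filter (where event_desc = 'Miss'), 0)    as miss,
    coalesce(sum(count) filter (where event_desc = 'Rebound'), 0) as rebound
from
    stats
group by
    game_id,
    player_id,
    player_name
order by
    game_id,
    player_id;
```

Each further event_desc gets one more line of the same shape.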

Answer 2: (score: 0)

I'm not sure whether PostgreSQL supports PIVOT the way SQL Server does. In general, though, I use CASE expressions to pivot data.

select
    game_id,
    player_id,
    player_name,
    sum(case when event_desc = 'Shot' then count else 0 end) as shot,
    sum(case when event_desc = 'Miss' then count else 0 end) as Miss
    -- and so forth: add one comma-separated line for every event_desc you want to pivot as a column
from
    current_table
group by
    game_id,
    player_id,
    player_name 

That should solve the problem for you.
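On the PIVOT question raised above: PostgreSQL has no PIVOT keyword, but the tablefunc extension ships the crosstab function the question asks about. The following is only a sketch under stated assumptions: the demo table stats_demo, the synthetic row_key column, and the three chosen categories are made up for illustration, and creating the extension needs sufficient privileges. Because crosstab expects a single row-identifier column, the game and player ids are concatenated into one key and then passed through again as "extra" columns.

```sql
-- tablefunc provides crosstab(); installing it may require elevated rights.
create extension if not exists tablefunc;

-- Hypothetical demo table holding a few rows from the question.
create temp table stats_demo (
    game_id     int,
    player_id   int,
    player_name text,
    event_desc  text,
    count       int
);

insert into stats_demo values
    (1, 1, 'player1', 'Shot',    1),
    (1, 1, 'player1', 'Rebound', 3),
    (1, 2, 'player2', 'Shot',    7),
    (1, 2, 'player2', 'Miss',   10),
    (1, 2, 'player2', 'Rebound', 2);

select
    game_id,
    player_id,
    player_name,
    -- crosstab leaves categories without a source row as NULL
    coalesce(shot, 0)    as shot,
    coalesce(miss, 0)    as miss,
    coalesce(rebound, 0) as rebound
from crosstab(
    -- Source query: row key, extra columns, then category and value last.
    -- It must be ordered so rows with the same key are adjacent.
    $src$
        select game_id::text || '-' || player_id::text as row_key,
               game_id, player_id, player_name,
               event_desc, count
        from stats_demo
        order by 1
    $src$,
    -- Category query: one output column per value, in this order.
    $cat$ values ('Shot'), ('Miss'), ('Rebound') $cat$
) as ct(
    row_key text, game_id int, player_id int, player_name text,
    shot int, miss int, rebound int
)
order by game_id, player_id;
```

The column list after "as ct" has to be spelled out by hand, one entry per category, so with about 20 event types this is not shorter than the CASE or filter forms. Either variant can be wrapped in create view to get the view the question asks for.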