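Plotting a pandas DataFrame with matplotlib, grouping the data by year/month

Date: 2020-05-08 12:54:14

Tags: python pandas matplotlib jupyter

I am using jupyter, pandas and matplotlib to create a plot from the following data.

How can I create a plot that groups the data by month and year along the x-axis, so that it is clearer which year each month belongs to?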

year    month   count
2005    9       40789
2005    10      17998
...
2014    12      2168
2015    1       2286
2015    2       1274
2015    3       1126
2015    4       344

# the column shown above is named 'count', so plot that column
df.plot(kind='bar', x='month', y='count', color='blue', title="Num per year")
plt.show()

2 answers:

Answer 0 (score: 3)

You can give each year its own color.

Create some data:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# here's some random sample data
N = 50
df = pd.DataFrame({'year': np.random.randint(2005, 2015, N),
                   # the upper bound is exclusive, so 13 is needed to include December
                   'month': np.random.randint(1, 13, N),
                   'count': np.random.randint(1, 1500, N)})
df.sort_values(by=['year', 'month'], inplace=True)

Then build a color array that holds one color per row, chosen by the row's year:

# color map based on years
yrs = np.unique(df.year)
# plt.get_cmap replaces cm.get_cmap, which newer matplotlib releases no longer provide
c = plt.get_cmap('tab20', len(yrs))
## probably a more elegant way to do this...
yrClr = np.zeros((len(df.year), 4))
for i, v in enumerate(yrs):
    yrClr[df.year == v, :] = c.colors[i, :]

# then use yrClr for color
df.plot(kind='bar', x='month', y='count', color=yrClr, title="Num per year")

Update: it may also help to combine the month and the year on the x-axis.
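The suggestion of combining month and year on the x-axis could look roughly like the sketch below. It is only an illustration, not the code the answer linked to: the column name year_month and the zero-padded "YYYY-MM" label format are assumptions made here, and the five rows are the last rows of the question's table.

```python
import matplotlib.pyplot as plt
import pandas as pd

# the last five rows of the table in the question
df = pd.DataFrame({
    "year":  [2014, 2015, 2015, 2015, 2015],
    "month": [12, 1, 2, 3, 4],
    "count": [2168, 2286, 1274, 1126, 344],
})

# hypothetical helper column: a zero-padded "YYYY-MM" label,
# so that the labels also sort chronologically as plain strings
df["year_month"] = (df["year"].astype(str) + "-"
                    + df["month"].astype(str).str.zfill(2))

# one bar per row, labelled with both the year and the month
ax = df.plot(kind="bar", x="year_month", y="count",
             title="Count per month", legend=False)
ax.set_xlabel("year-month")
plt.tight_layout()
plt.show()
```

With this kind of label every tick shows both the year and the month, so the reader no longer has to infer the year from the position of the bar.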

Answer 1 (score: 2)

You can use sns.barplot together with hue and dodge:

import seaborn as sns

sns.barplot(data=df, x='year', hue='month', y='count', dodge=True)

Or you can pivot the table and use plot.bar():

(df.pivot_table(index='year', columns='month', 
               values='count', aggfunc='sum')
   .plot.bar()
)

This gives you a grouped bar chart, with one group of bars per year and one bar per month.
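To see why pivoting leads to grouped bars, it can help to look at the pivoted table itself before plotting it. The sketch below is an illustration only: the variable name wide is an assumption, and the rows are the last five rows of the question's table.

```python
import pandas as pd

# the last five rows of the table in the question
df = pd.DataFrame({
    "year":  [2014, 2015, 2015, 2015, 2015],
    "month": [12, 1, 2, 3, 4],
    "count": [2168, 2286, 1274, 1126, 344],
})

# years become the rows (one bar group each),
# months become the columns (one bar per month within a group)
wide = df.pivot_table(index="year", columns="month",
                      values="count", aggfunc="sum")
print(wide)
```

Year/month combinations that do not occur in the data show up as NaN in the pivoted table, and plot.bar() draws no bar for them.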