Changing the frequency of x-axis tick labels for datetime data in a Python bar chart using matplotlib

Time: 2017-02-24 16:47:45

Tags: python csv pandas datetime matplotlib

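I have a script that takes multiple .csv files and outputs multiple bar plots. The data are daily rainfall totals, so the x-axis is the date in the day format %d %m %Y. As it stands, the code tries to include all 365 days in the labels, so the x-axis gets clogged. What code can I use to include only one label per month, in the format "Jan 01"?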

import pandas as pd
import time
import os
import matplotlib.pyplot as plt

files = ['w.pod.csv',
't.pod.csv',
'r.pod.csv',
'n.pod.csv',
'm.pod.csv',
'k.pod.csv',
'j.pod.csv',
'h.pod.csv',
'g.pod.csv',
'c.pod.csv',
'b.pod.csv']

for f in files:
    fn = f.split('.')[0]
    dat = pd.read_csv(f)
    df0 = dat.loc[:, ['TimeStamp', 'RF']]
    # Change time format
    df0["time"] = pd.to_datetime(df0["TimeStamp"])
    df0["day"] = df0['time'].map(lambda x: x.day)
    df0["month"] = df0['time'].map(lambda x: x.month)
    df0["year"] = df0['time'].map(lambda x: x.year)
    df0.to_csv('{}_1.csv'.format(fn), na_rep="0")  # write to csv

    # Combine for daily rainfall
    df1 = pd.read_csv('{}_1.csv'.format(fn), encoding='latin-1',
              usecols=['day', 'month', 'year', 'RF', 'TimeStamp'])
    df2 = df1.groupby(['day', 'month', 'year'], as_index=False).sum()
    df2.to_csv('{}_2.csv'.format(fn), na_rep="0", header=None)  # write to csv

    # parse date (pd.datetime was removed from pandas, so use pd.to_datetime)
    df3 = pd.read_csv('{}_2.csv'.format(fn), header=None, index_col='datetime',
             parse_dates={'datetime': [1,2,3]},
             date_parser=lambda x: pd.to_datetime(x, format='%d %m %Y'))

    def dt_parse(date_string):
        dt = pd.to_datetime(date_string, format='%d %m %Y')
        return dt

    # sort by the datetime index (DataFrame.sort() was removed from pandas)
    df4 = df3.sort_index()
    final = df4.reset_index()

    # rename columns
    final.columns = ['date', 'bleh', 'rf']

    final[['date','rf']].plot(kind='bar')
    plt.suptitle('{} Rainfall 2015-2016'.format(fn), fontsize=20)
    plt.xlabel('Date', fontsize=18)
    plt.ylabel('Rain / mm', fontsize=16)
    plt.savefig('{}.png'.format(fn))

This is an extension of my previous question: Automate making multiple plots in python using several .csv files


1 Answer:

Answer 0 (score: 4)

It is not easy, but this works:

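import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

# sample df with dates of one year; rf values are random integers
np.random.seed(100)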
N = 365
start = pd.to_datetime('2015-02-24')
rng = pd.date_range(start, periods=N)

final = pd.DataFrame({'date': rng, 'rf': np.random.randint(50, size=N)})  
print (final.head())
        date  rf
0 2015-02-24   8
1 2015-02-25  24
2 2015-02-26   3
3 2015-02-27  39
4 2015-02-28  23

fn = 'suptitle'
# rot: rotation of the x-axis labels, in degrees
ax = final.plot(x='date', y='rf', kind='bar', rot=45)
plt.suptitle('{} Rainfall 2015-2016'.format(fn), fontsize=20)
plt.xlabel('Date', fontsize=18)
plt.ylabel('Rain / mm', fontsize=16)
# set a custom date format for the tick labels
ticklabels = final.date.dt.strftime('%Y-%m-%d')
ax.xaxis.set_major_formatter(ticker.FixedFormatter(ticklabels))

# show only every 30th label; the others are hidden
spacing = 30
visible = ax.xaxis.get_ticklabels()[::spacing]
for label in ax.xaxis.get_ticklabels():
    if label not in visible:
        label.set_visible(False)

plt.show()
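
The approach above keeps every 30th label, so the visible dates drift through the month. Since the question asked for one label per month in the "Jan 01" format, a possible variation is to put ticks only on the rows where the day of the month is 1 and format them with '%b %d'. This is a minimal sketch rather than part of the original answer: it reuses the same sample data, relies on the pandas bar plot placing bar i at x position i, and the names month_starts and month_labels are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# same sample data as above: one year of daily values
np.random.seed(100)
N = 365
rng = pd.date_range(pd.to_datetime('2015-02-24'), periods=N)
final = pd.DataFrame({'date': rng, 'rf': np.random.randint(50, size=N)})

ax = final.plot(x='date', y='rf', kind='bar', rot=45)

# a pandas bar plot draws bar i at x position i, so the tick positions
# are simply the row offsets of the first day of each month
is_first = (final['date'].dt.day == 1).to_numpy()
month_starts = np.flatnonzero(is_first)
month_labels = final['date'].dt.strftime('%b %d').iloc[month_starts].tolist()

# keep only those ticks and label them as "Jan 01", "Feb 01", ...
ax.set_xticks(month_starts)
ax.set_xticklabels(month_labels)

plt.xlabel('Date', fontsize=18)
plt.ylabel('Rain / mm', fontsize=16)
plt.tight_layout()
```

To view or save the figure, finish with plt.savefig('out.png'), or with plt.show() when using an interactive backend. Note that '%b' produces the month abbreviation in the current locale.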
