Question

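I am loading CSV data with pandas, where one column holds dates in the format '%a %d.%m.%Y' (e.g. 'Mon 06.02.2017'), and then trying to make some plots whose x-axis is labeled by date.

The problem shows up during plotting: the date labels are wrong. For example, what is 'Mon 06.02.2017' in the CSV / DataFrame is displayed as 'Thu 06.02.0048' on the plot axis.

Here is an MWE. This is the file 'data.csv':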

Mon 06.02.2017  ;  1  ;  2  ;  3
Tue 07.02.2017  ;  4  ;  5  ;  6
Wed 08.02.2017  ;  7  ;  8  ;  9
Thu 09.02.2017  ; 10  ; 11  ; 12
Fri 10.02.2017  ; 13  ; 14  ; 15
Sat 11.02.2017  ; 16  ; 17  ; 18
Sun 12.02.2017  ; 19  ; 20  ; 21
Mon 13.02.2017  ; 22  ; 23  ; 24
Tue 14.02.2017  ; 25  ; 26  ; 27
Wed 15.02.2017  ; 28  ; 29  ; 30
Thu 16.02.2017  ; 31  ; 32  ; 33

And this is the parsing/plotting code, 'plot.py':

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


df = pd.read_csv(
        'data.csv',
        sep=r'\s*;\s*',
        header=None,
        names=['date', 'x', 'y', 'z'],
        parse_dates=['date'],
        date_parser=lambda x: pd.to_datetime(x, format='%a %d.%m.%Y'),
        # infer_datetime_format=True,
        # dayfirst=True,
        engine='python',
)

# DataFrame 'date' Series looks fine
print(df.date)

ax1 = df.plot(x='date', y='x', legend=True)
ax2 = df.plot(x='date', y='y', ax=ax1, legend=True)
ax3 = df.plot(x='date', y='z', ax=ax1, legend=True)

ax1.xaxis.set_minor_locator(mdates.DayLocator(interval=1))
ax1.xaxis.set_minor_formatter(mdates.DateFormatter('%a %d.%m.%Y'))
ax1.xaxis.grid(True, which='minor')

plt.setp(ax1.xaxis.get_minorticklabels(), rotation=45)
plt.setp(ax1.xaxis.get_majorticklabels(), visible=False)
plt.tight_layout()

plt.show()

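Note that the DataFrame.date Series appears to contain the correct dates, so this is probably a matplotlib issue rather than a pandas / parsing error.

In case it matters (although I doubt it), my locale is LC_TIME = en_US.UTF-8.

Also, according to https://www.timeanddate.com/date/weekday.html, the day 06.02.0048 was actually a Tuesday, so the plotted year is not even really the year 0048.

I am really at a loss here. Thanks to anyone willing to look into this.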

Answer 1

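I could not figure out why it is not working, but it seems to have something to do with plotting through pandas rather than with matplotlib alone, and possibly with mdates.DateFormatter.

When I comment out the formatting lines, it seems to start working: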

# ax1.xaxis.set_minor_locator(mdates.DayLocator(interval=1))
# ax1.xaxis.set_minor_formatter(mdates.DateFormatter('%a %d.%m.%Y'))
# ax1.xaxis.grid(True, which='minor')
# 
# plt.setp(ax1.xaxis.get_minorticklabels(), rotation=45)
# plt.setp(ax1.xaxis.get_majorticklabels(), visible=False)

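Pandas' automatic date plotting works fine, but calling any of the matplotlib functions breaks the dates. Commenting out only plt.setp(ax1.xaxis.get_majorticklabels(), visible=False) draws both the pandas and the matplotlib x-axis labels, and the odd 0048 shows up again.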

So the issue persists.
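One possible reading of the 0048, offered as an assumption rather than something established in this thread: pandas' time-series plotting places daily data on the axis as a day count since 1970-01-01, while mdates.DateFormatter (in matplotlib versions of that era) reads axis values as day counts in which 1 means 0001-01-01. The standard-library sketch below only checks that this arithmetic is consistent with the labels reported in the question; the variable names are illustrative.

```python
from datetime import date

# The first date in data.csv, expressed as days since 1970-01-01
# (assumed here to be the number pandas puts on the axis for daily data).
days_since_1970 = (date(2017, 2, 6) - date(1970, 1, 1)).days

# The same number read as a proleptic Gregorian ordinal (1 == 0001-01-01),
# which is assumed to be how the formatter interpreted the axis value.
misread = date.fromordinal(days_since_1970)

print(days_since_1970)      # 17203
print(misread.isoformat())  # 0048-02-06
print(misread.weekday())    # 3, i.e. Thursday (Monday is 0)
```

Under that assumption the misread date is exactly Thursday 06.02.0048, matching the label in the question. The Tuesday reported by timeanddate.com is presumably computed in the Julian calendar, which would explain the differing weekday.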

However, you can work around this by replacing parse_dates=['date'] with index_col=0, creating the matplotlib plot explicitly, and swapping mdates.DateFormatter for ticker.FixedFormatter:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker

df = pd.read_csv(
    'data.csv',
    sep=r'\s*;\s*',
    header=None,
    names=['date', 'x', 'y', 'z'],
    index_col=0,
    date_parser=lambda x: pd.to_datetime(x, format='%a %d.%m.%Y'),
    engine='python'
)

ax = plt.figure().add_subplot(111)
ax.plot(df)

ticklabels = [item.strftime('%d-%m-%y') for item in df.index]
ax.xaxis.set_major_locator(mdates.DayLocator(interval=1))
ax.xaxis.set_major_formatter(ticker.FixedFormatter(ticklabels))

plt.xticks(rotation='90')
ax.xaxis.grid(True, which='major')

plt.tight_layout()

plt.show()
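A different workaround, not the one used above, is to keep plotting through pandas but pass x_compat=True to DataFrame.plot, which asks pandas not to apply its own time-series tick handling so that the axis uses matplotlib date units. The sketch below assumes that this is enough for mdates.DateFormatter to read the axis correctly; the inline CSV string and the Agg backend line are there only to make the example self-contained.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; drop for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Inline copy of the first rows of data.csv.
CSV = """Mon 06.02.2017  ;  1  ;  2  ;  3
Tue 07.02.2017  ;  4  ;  5  ;  6
Wed 08.02.2017  ;  7  ;  8  ;  9
Thu 09.02.2017  ; 10  ; 11  ; 12
"""

df = pd.read_csv(io.StringIO(CSV), sep=r"\s*;\s*", header=None,
                 names=["date", "x", "y", "z"], engine="python")
# Parse the date column after loading instead of inside read_csv.
df["date"] = pd.to_datetime(df["date"], format="%a %d.%m.%Y")

# x_compat=True: pandas leaves the x-axis in matplotlib date units.
ax = df.plot(x="date", y=["x", "y", "z"], x_compat=True)
ax.xaxis.set_major_locator(mdates.DayLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%a %d.%m.%Y"))
ax.figure.autofmt_xdate(rotation=45)

# Force a draw so the tick label texts are populated, then inspect them.
ax.figure.canvas.draw()
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```

With an interactive backend, plt.show() would replace the explicit draw and print at the end.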

Answer 2

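I ran into this problem too, but the root cause was different.

I did some debugging in matplotlib's DateFormatter class to find out what data it was actually working with. It turned out that the pandas query running against postgres was producing date objects rather than timestamp objects. This caused the dates to be interpreted incorrectly, so that the year came out wrong (year 0046 instead of 2018).

The solution was to update the query to cast the time column to a timestamp, after which everything worked.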

SELECT start_time::timestamp at time zone '{{timezone}}' as "Start Time" ...

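That said, I am surprised that the libraries involved are not robust enough to handle the various date representations postgres can produce.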
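If changing the SQL is not an option, the same conversion can presumably be done on the pandas side with pd.to_datetime. This is a minimal sketch, assuming the query result arrives as a column of Python datetime.date objects; the DataFrame below is made-up stand-in data, with the "Start Time" column name taken from the query above.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for a query result whose time column holds
# datetime.date objects (stored by pandas with dtype "object").
df = pd.DataFrame({
    "Start Time": [datetime.date(2018, 5, 1), datetime.date(2018, 5, 2)],
    "value": [1, 2],
})
print(df["Start Time"].dtype)

# Convert the column to a proper datetime64 dtype before plotting.
df["Start Time"] = pd.to_datetime(df["Start Time"])
print(df["Start Time"].dtype)
```

After the conversion the column holds pandas timestamps, which is the representation the SQL cast achieves at the query level.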

Pandas / matplotlib plot with a date axis shows the correct day/month but the wrong weekday/year
