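For some background: I want to create a scatter plot from several dataframes (each read from a csv), where the x values are dates and the y values are water levels.
I've been trying to figure out how to make a scatter plot with the dates, or the index, as the x values. After trying a number of options, this seems to be the "best" error I've got so far: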
KeyError: "None of [DatetimeIndex(['2017-11-04 00:00:00',
'2017-11-04 01:00:00',\n ... '2018-02-26 11:00:00', '2018-02-26
12:00:00'],\n dtype='datetime64[ns]', name='date', length=2749,
freq=None)] are in the [columns]" .
I'm importing the data from a csv file that looks like this:
date, level
2017-10-26 14:00:00, 700.1
2017-10-26 15:00:00, 500.5
2017-10-26 16:00:00, NaN
...
I'm reading the file like this:
df = pd.read_csv("data.csv", parse_dates=['date'],sep='\s*,\s*')
df.set_index('date', inplace=True)
df = df.loc['2017-11-04 00:00:00':]
And this is my attempt at the scatter plot:
ax = df.plot()
ax1 = df.plot(kind='scatter', x=df.index, y='level', color='r')
# ... my other dataframes I'd like to plot on the same graph...
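I'm just starting out with pandas, so apologies for my lack of understanding. I've been fiddling with different ways of importing the csv (sep='\s*,\s*' was one attempt), but to no avail. Any suggestions would be much appreciated, thanks.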
Edit: more complete code
data1.csv:
date,level
2017-10-26 14:00:00,500.1
2017-10-26 15:00:00,600.5
2017-10-26 16:00:00,NaN
2017-10-26 17:00:00,NaN
2017-10-26 18:00:00,NaN
2017-10-26 19:00:00,600.5
2017-10-26 20:00:00,600.5
2017-10-26 21:00:00,700.0
2017-10-26 22:00:00,700.0
data2.csv:
date,level
2017-10-26 15:00:00,600.5
2017-10-26 16:00:00,NaN
2017-10-26 17:00:00,NaN
2017-10-26 18:00:00,NaN
2017-10-26 19:00:00,600.5
2017-10-26 20:00:00,600.5
2017-10-26 21:00:00,900.0
2017-10-26 22:00:00,900.0
2017-10-26 23:00:00,NaN
Code:
import pandas as pd
import warnings
import matplotlib.pyplot as plt
warnings.filterwarnings("ignore")
plt.style.use('fivethirtyeight')
df = pd.read_csv("data1.csv", parse_dates=['date'],sep='\s*,\s*')
df.set_index('date', inplace=True)
df = df.loc['2017-10-26 15:00:00':]
df2 = pd.read_csv("data2.csv", parse_dates=['date'],sep='\s*,\s*')
df2.set_index('date', inplace=True)
df2 = df2.loc[:'2017-10-26 22:00:00']
ax1 = df.plot(kind='scatter', x='date', y='level', color='r')
ax2 = df2.plot(kind='scatter', x='date', y='level', color='g', ax=ax1)
plt.show()
Answer 0 (score: 0)
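In case anyone runs into the same problem, I found a workaround by following the instructions here: pandas scatter plotting datetime
I just added style='o' to a regular line plot, so the points are drawn as markers against the datetime index, like so: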
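import pandas as pd
import matplotlib.pyplot as plt

# A regex separator needs the python parsing engine; the raw string avoids escape warnings
df = pd.read_csv("data1.csv", parse_dates=['date'], sep=r'\s*,\s*', engine='python')
df.set_index('date', inplace=True)
df = df.loc['2017-10-26 15:00:00':]
# style='o' draws markers only, using the datetime index as x
ax = df.plot(style='o')
df2 = pd.read_csv("data2.csv", parse_dates=['date'], sep=r'\s*,\s*', engine='python')
df2.set_index('date', inplace=True)
df2 = df2.loc[:'2017-10-26 22:00:00']
# Draw the second dataframe on the same axes
df2.plot(ax=ax, style='o')
plt.show()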
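As far as I understand it, the KeyError comes from kind='scatter' expecting x and y to be column labels: passing x=df.index makes pandas look the index values up as columns, and after set_index('date') there is no 'date' column left either. An alternative to style='o' is to call matplotlib's scatter directly with the index as x. The following is only a sketch under some assumptions: the inline CSV strings are shortened stand-ins for data1.csv and data2.csv, the load helper is a made-up name, and the Agg backend is chosen only so the script runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Shortened stand-ins for data1.csv and data2.csv from the question
CSV1 = """date,level
2017-10-26 14:00:00,500.1
2017-10-26 15:00:00,600.5
2017-10-26 16:00:00,NaN
2017-10-26 21:00:00,700.0
"""
CSV2 = """date,level
2017-10-26 15:00:00,600.5
2017-10-26 19:00:00,600.5
2017-10-26 22:00:00,900.0
2017-10-26 23:00:00,NaN
"""


def load(text):
    """Hypothetical helper: parse the CSV text and use the date column as the index."""
    return pd.read_csv(io.StringIO(text), parse_dates=["date"], index_col="date")


df = load(CSV1).loc["2017-10-26 15:00:00":]
df2 = load(CSV2).loc[:"2017-10-26 22:00:00"]

fig, ax = plt.subplots()
# Pass the DatetimeIndex itself as x; rows with NaN levels are simply not drawn
ax.scatter(df.index, df["level"], color="r", label="data1")
ax.scatter(df2.index, df2["level"], color="g", label="data2")
ax.set_xlabel("date")
ax.set_ylabel("level")
ax.legend()
fig.autofmt_xdate()  # rotate the date tick labels so they stay readable
```

With real files, load would take a path instead of a string, and plt.show() (or fig.savefig) would go at the end. The advantage over style='o' is mostly that each series gets its own colour and legend label explicitly.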