I have a Pandas DataFrame:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 300 entries, 5220 to 5519
Data columns (total 3 columns):
Date 300 non-null datetime64[ns]
A 300 non-null float64
B 300 non-null float64
dtypes: datetime64[ns](1), float64(2)
memory usage: 30.5 KB
I want to plot the A and B series against Date.
plt.plot_date(data['Date'], data['A'], '-')
plt.plot_date(data['Date'], data['B'], '-')
Then I want to apply fill_between() to the area between the A and B series:
plt.fill_between(data['Date'], data['A'], data['B'],
                 where=data['A'] >= data['B'],
                 facecolor='green', alpha=0.2, interpolate=True)
which outputs:
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs
could not be safely coerced to any supported types according to the casting
rule ''safe''
Does matplotlib accept pandas datetime64 objects in the fill_between() function? Should I convert them to a different date type?
Answer 0 (score: 23)
Pandas registers a converter in matplotlib.units.registry which converts a number of datetime types (such as a pandas DatetimeIndex and NumPy arrays of dtype datetime64) to matplotlib datenums, but it does not handle a Pandas Series with dtype datetime64.
In [67]: import pandas.tseries.converter as converter
In [68]: c = converter.DatetimeConverter()
In [69]: type(c.convert(df['Date'].values, None, None))
Out[69]: numpy.ndarray # converted (good)
In [70]: type(c.convert(df['Date'], None, None))
Out[70]: pandas.core.series.Series # left unchanged
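To see which converters matplotlib itself has registered, the registry can be inspected directly. This is a minimal sketch, assuming a matplotlib version recent enough to register a converter for np.datetime64. It imports only matplotlib, so anything pandas adds to the registry is not shown, and the variable names are illustrative.

```python
import datetime

import numpy as np
import matplotlib.dates  # noqa: F401  (importing this registers the date converters)
import matplotlib.units as munits

# The registry is a dict-like mapping from types to converter objects.
for cls in (datetime.date, datetime.datetime, np.datetime64):
    converter = munits.registry.get(cls)
    print(cls.__name__, "->", type(converter).__name__)

# Ask the registry which converter it would pick for a datetime64 array.
arr = np.array(["2000-01-01", "2000-01-02"], dtype="datetime64[ns]")
picked = munits.registry.get_converter(arr)
print("converter for datetime64 array:", type(picked).__name__)
```

If get_converter() finds nothing for an input type, it returns None and the data is passed through unconverted, which is consistent with the behaviour described in this answer.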
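fill_between checks for a converter and, if one exists, uses it to handle the data. So, as a workaround, you can convert the dates to a NumPy array of datetime64 values.
For example,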
d = data['Date'].values
plt.fill_between(d, data['A'], data['B'],
where=data['A'] >= data['B'],
facecolor='green', alpha=0.2, interpolate=True)
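Put together as a self-contained script, the workaround might look like the following. The sample data (sine and cosine over daily dates) is made up for illustration, and the Agg backend is selected only so the sketch runs without a display. Behaviour differs between pandas and matplotlib versions, so treat this as a sketch rather than a guaranteed fix.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

N = 300
x = np.linspace(0, 2 * np.pi, N)
data = pd.DataFrame({
    "Date": pd.date_range("2000-01-01", periods=N, freq="D"),  # made-up dates
    "A": np.sin(x),
    "B": np.cos(x),
})

# .values strips the Series wrapper and leaves a plain datetime64 ndarray
d = data["Date"].values

fig, ax = plt.subplots()
ax.plot(d, data["A"], "-")
ax.plot(d, data["B"], "-")
poly = ax.fill_between(d, data["A"], data["B"],
                       where=(data["A"] >= data["B"]).to_numpy(),
                       facecolor="green", alpha=0.2, interpolate=True)
fig.autofmt_xdate()
fig.canvas.draw()  # force a render; use plt.show() interactively
```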
Answer 1 (score: 5)
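As WillZ pointed out, Pandas 0.21 broke unutbu's solution. However, converting datetimes to dates can have a significant negative impact on data analysis. This solution currently works and keeps the datetimes: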
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
N = 300
dates = pd.date_range('2000-1-1', periods=N, freq='ms')
x = np.linspace(0, 2*np.pi, N)
data = pd.DataFrame({'A': np.sin(x), 'B': np.cos(x),
                     'Date': dates})
d = data['Date'].dt.to_pydatetime()
plt.plot_date(d, data['A'], '-')
plt.plot_date(d, data['B'], '-')
plt.fill_between(d, data['A'], data['B'],
                 where=data['A'] >= data['B'],
                 facecolor='green', alpha=0.2, interpolate=True)
plt.xticks(rotation=25)
plt.show()
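If Series.dt.to_pydatetime() is unavailable or emits a deprecation warning in your pandas version (an assumption worth checking against your installation), wrapping the column in a DatetimeIndex is one alternative that also yields an array of datetime.datetime objects with the time of day intact. A small sketch with made-up timestamps:

```python
import datetime

import pandas as pd

# Two illustrative timestamps that carry time-of-day information
stamps = pd.Series(pd.to_datetime(["2000-01-01 06:30:00",
                                   "2000-01-02 18:45:00"]))

# DatetimeIndex.to_pydatetime() returns an object ndarray of datetime.datetime
d = pd.DatetimeIndex(stamps).to_pydatetime()

print(type(d).__name__, d.dtype)
print(d[0])  # time of day is preserved
```

The resulting array can be passed as the x argument to plot_date() and fill_between() in the same way as d in the script above.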
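Edit: Following jedi's comment, I set out to determine which of the three options below is fastest:
Method 2 was slightly faster but much more consistent, so I edited the answer above to reflect the best method.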
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import time
N = 300
dates = pd.date_range('2000-1-1', periods=N, freq='ms')
x = np.linspace(0, 2*np.pi, N)
data = pd.DataFrame({'A': np.sin(x), 'B': np.cos(x),
                     'Date': dates})
time_data = pd.DataFrame(columns=['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'])
method1 = []
method2 = []
method3 = []
# Note: time.clock() was removed in Python 3.8; time.perf_counter() is used instead.

# Method 1: convert each Timestamp individually in a list comprehension
for i in range(0, 10):
    start = time.perf_counter()
    for _ in range(0, 500):
        d = [pd.Timestamp(x).to_pydatetime() for x in data['Date']]
        plt.plot_date(d, data['A'], '-')
        plt.plot_date(d, data['B'], '-')
        plt.fill_between(d, data['A'], data['B'],
                         where=data['A'] >= data['B'],
                         facecolor='green', alpha=0.2, interpolate=True)
        plt.xticks(rotation=25)
        plt.gcf().clear()
    method1.append(time.perf_counter() - start)

# Method 2: convert the whole column once and reuse the result
for i in range(0, 10):
    start = time.perf_counter()
    for _ in range(0, 500):
        d = data['Date'].dt.to_pydatetime()
        plt.plot_date(d, data['A'], '-')
        plt.plot_date(d, data['B'], '-')
        plt.fill_between(d, data['A'], data['B'],
                         where=data['A'] >= data['B'],
                         facecolor='green', alpha=0.2, interpolate=True)
        plt.xticks(rotation=25)
        plt.gcf().clear()
    method2.append(time.perf_counter() - start)

# Method 3: convert the column inline at every call site
for i in range(0, 10):
    start = time.perf_counter()
    for _ in range(0, 500):
        plt.plot_date(data['Date'].dt.to_pydatetime(), data['A'], '-')
        plt.plot_date(data['Date'].dt.to_pydatetime(), data['B'], '-')
        plt.fill_between(data['Date'].dt.to_pydatetime(), data['A'], data['B'],
                         where=data['A'] >= data['B'],
                         facecolor='green', alpha=0.2, interpolate=True)
        plt.xticks(rotation=25)
        plt.gcf().clear()
    method3.append(time.perf_counter() - start)
time_data.loc['method1'] = method1
time_data.loc['method2'] = method2
time_data.loc['method3'] = method3
print(time_data)
plt.errorbar(time_data.index, time_data.mean(axis=1), yerr=time_data.std(axis=1))
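The loops above time plotting and conversion together. To isolate the conversion cost, a smaller sketch with timeit could look like this. It is not the benchmark from this answer: pd.DatetimeIndex(...).to_pydatetime() stands in for the vectorized call, the function names are made up, and absolute timings will vary by machine.

```python
import timeit

import pandas as pd

N = 300
dates = pd.Series(pd.date_range("2000-01-01", periods=N, freq="D"))


def per_element():
    # One Timestamp -> datetime conversion per row
    return [pd.Timestamp(x).to_pydatetime() for x in dates]


def vectorized():
    # A single call that converts the whole column
    return pd.DatetimeIndex(dates).to_pydatetime()


for fn in (per_element, vectorized):
    # Five repeats of 100 calls each; report the best run and the spread
    runs = timeit.repeat(fn, number=100, repeat=5)
    print(f"{fn.__name__}: best {min(runs):.4f}s, "
          f"spread {max(runs) - min(runs):.4f}s")
```

Reporting the spread alongside the best run gives a rough sense of consistency, which is the property the edit above singles out for method 2.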
Answer 2 (score: 4)
I ran into this problem after upgrading to Pandas 0.21. My code previously worked fine with fill_between() but broke after the upgrade.
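It turns out that the fix mentioned in @unutbu's answer, which is what I already had, only works if the DatetimeIndex contains date objects rather than datetime objects that carry time information.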
Looking at the example above, what I did to fix it was add the following line before calling fill_between():
d['Date'] = [z.date() for z in d['Date']]
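The trade-off of this fix is that date() discards the time of day, which is the concern raised in Answer 1. A minimal sketch with two made-up timestamps on the same day:

```python
import datetime

import pandas as pd

# Two illustrative timestamps, roughly twelve hours apart on the same day
stamps = pd.Series(pd.to_datetime(["2000-01-01 06:30:00",
                                   "2000-01-01 18:45:00"]))

# Same conversion as the line above: one datetime.date per Timestamp
as_dates = [z.date() for z in stamps]

print(as_dates)
print(as_dates[0] == as_dates[1])  # both rows now share one x position
```

For data sampled at most once per day this loses nothing, but intraday samples would collapse onto the same x position.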