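I have an HDF5 file that I loaded into a DataFrame, but when I try to plot it, nothing shows up on the chart. I checked the new DataFrame and it contains no data. This is my DF (I'm not allowed to post pictures, so please follow the link)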
df1 = pd.DataFrame(df.Price, index = df.Timestamp)
plt.figure()
df1.plot()
plt.show()
The second DataFrame shows NaN in the Price column. What is wrong?
Answer 0: (score: 1)
I think you need set_index on the Timestamp column, then select the Price column and plot:
#convert column to floats
df['Price'] = df['Price'].astype(float)
df.set_index('Timestamp')['Price'].plot()
#if some non numeric data, convert them to NaNs
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')
df.set_index('Timestamp')['Price'].plot()
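You get NaNs if you use the DataFrame constructor because the data is not aligned: the values of the index of df are not the same as the values of the Timestamp column.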
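The alignment behaviour described above can be reproduced on a small frame. This is only a minimal sketch: the three rows of sample data are hypothetical and stand in for the asker's HDF5 contents, which are not shown in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's frame.
df = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2000-01-01", "2000-01-02", "2000-01-03"]),
    "Price": [10.0, 11.5, 9.8],
})

# Passing a Series together with a new index makes pandas reindex the
# Series: the timestamps are looked up among the labels 0, 1, 2 and
# none of them match, so every value becomes NaN.
misaligned = pd.DataFrame(df.Price, index=df.Timestamp)

# set_index keeps each price paired with the timestamp in its own row.
aligned = df.set_index("Timestamp")["Price"]

print(misaligned["Price"].isna().all())
print(aligned.notna().all())
```

Plotting `aligned` then draws the prices against their timestamps, whereas `misaligned` has nothing to draw.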
Answer 1: (score: 0)
You can make this work by adding .values. Alternatively, how about creating a Series instead?
#df1 = pd.DataFrame(df.Price.values, df.Timestamp)
serie = pd.Series(df.Price.values, df.Timestamp)
See the answer here: pandas.Series() Creation using DataFrame Columns returns NaN Data entries
Full example:
import pandas as pd
import numpy as np
import datetime
import matplotlib.pyplot as plt
df = pd.DataFrame(columns=["Price","Timestamp","Random"])
df.Price = np.random.randint(100, size = 10)
df.Timestamp = [
    datetime.datetime(2000, 1, 1) + datetime.timedelta(days=int(i))
    for i in np.random.randint(100, size=10)
]
df.Random = np.random.randint(10, size= 10)
serie = pd.Series(df.Price.values, df.Timestamp)
serie.plot()
plt.show()
The difference:
print("{}\n{}".format(type(df.Price), type(df.Price.values)))
<class 'pandas.core.series.Series'> # does not work
<class 'numpy.ndarray'> # works
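The two types behave differently because a Series carries its own index, which pandas tries to align with the new one, while a plain array has no labels and is paired with the new index by position. A small sketch with hypothetical data, using `to_numpy()` as an equivalent of `.values` here:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2000-01-03", "2000-01-01", "2000-01-02"]),
    "Price": [3, 1, 2],
})

# A Series carries its own index (0, 1, 2); pandas reindexes it against
# the timestamps, finds no matching labels, and fills with NaN.
from_series = pd.Series(df.Price, index=df.Timestamp)

# A plain array has no index, so values are paired with the timestamps
# purely by position.
from_array = pd.Series(df.Price.to_numpy(), index=df.Timestamp)

print(from_series.isna().all())
print(from_array.tolist())
```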