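Thanks in advance for your help!
I'm trying to create a heatmap from time series data. The data starts mid-year, which causes the top row of my heatmap to be shifted to the left so that it doesn't line up with the rest of the plot (shown below). How can I shift the top row over so that the visualization of the data is in sync with the rest of the plot?
(Code provided below)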
import pandas as pd
import matplotlib.pyplot as plt
# link to the data
url1 = 'https://raw.githubusercontent.com/the-datadudes/deepSoilTemperature/master/minotDailyAirTemp.csv'
# load the data into a DataFrame, not a Series
# parse the dates, and set them as the index
df1 = pd.read_csv(url1, parse_dates=['Date'], index_col=['Date'])
# groupby year and aggregate Temp into a list
dfg1 = df1.groupby(df1.index.year).agg({'Temp': list})
# create a wide format dataframe with all the temp data expanded
df1_wide = pd.DataFrame(dfg1.Temp.tolist(), index=dfg1.index)
# plot the data
fig, (ax1) = plt.subplots(ncols=1, figsize=(20, 5))
ax1.matshow(df1_wide, interpolation=None, aspect='auto');
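Answer 0 (score: 2)
The problem comes from the dates in the dataset. If you look at the data, the first record is
`1990-4-24,15.533`
so the 1990 row is shorter than the others and starts in the first column instead of at day 114. To fix this, you have to add (empty) data for 1990-01-01 through 1990-04-23 and remove every Feb 29, so that each year has the same number of days.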
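import numpy as np

# build a NaN-filled frame for the days missing before the first record
rng = pd.date_range(start='1990-01-01', end='1990-04-23', freq='D')
df = pd.DataFrame(index=rng)
df['Temp'] = np.nan  # np.NaN was removed in NumPy 2.0; np.nan works everywhere
# put the padding in front of the real data
frames = [df, df1]
result = pd.concat(frames)
# drop leap days so every full year has exactly 365 values
result = result[~((result.index.month == 2) & (result.index.day == 29))]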
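As a quick check of the padding idea, here is a minimal sketch on synthetic data. It is not the answer's exact code: it swaps `pd.concat` for `DataFrame.reindex` onto a full date range, which should give the same result when the index is a plain daily series. The frame `toy` and its values are made up for illustration and stand in for `df1`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df1 (assumption): daily values starting mid-year
idx = pd.date_range('1990-04-24', '1991-12-31', freq='D')
toy = pd.DataFrame({'Temp': np.arange(len(idx), dtype=float)}, index=idx)

# Reindex onto a full range that starts on Jan 1; the missing days become NaN
full = pd.date_range('1990-01-01', toy.index.max(), freq='D')
padded = toy.reindex(full)

# Drop leap days so every full year has the same number of entries
padded = padded[~((padded.index.month == 2) & (padded.index.day == 29))]

# Every year should now contribute the same number of columns to the heatmap
counts = padded.groupby(padded.index.year).size()
print(counts)
```

If the row counts per year match, the first real observation of the first year lands in the same column as the same calendar day in later years.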
With this data, rebuild the wide frame and plot it again:
dfg1 = result.groupby(result.index.year).agg({'Temp': list})
df1_wide = pd.DataFrame(dfg1['Temp'].tolist(), index=dfg1.index)
# plot the data
fig, (ax1) = plt.subplots(ncols=1, figsize=(20, 5))
ax1.matshow(df1_wide, interpolation=None, aspect='auto');
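The unfilled part of the plot is the result of the NaN values in the dataset. If you want it filled, you can replace the NaN values with the column mean (the mean of that day across years) or with the row mean (the mean of that year). There are other ways to replace NaN values as well. For example, with column means: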
df1_wide = df1_wide.apply(lambda x: x.fillna(x.mean()),axis=0)
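The two fill options can be compared side by side on a small frame. This is only a sketch: `wide` is a made-up stand-in for `df1_wide`, with years as rows and day positions as columns.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df1_wide (assumption): rows are years, columns are days
wide = pd.DataFrame(
    [[np.nan, 2.0, 4.0],
     [1.0, np.nan, 5.0]],
    index=[1990, 1991],
)

# Column mean: each gap takes the mean of the same day across all years
filled_by_column = wide.fillna(wide.mean(axis=0))

# Row mean: each gap takes the mean of that year's observed values
filled_by_row = wide.apply(lambda row: row.fillna(row.mean()), axis=1)

print(filled_by_column)
print(filled_by_row)
```

For temperature data the column mean is usually the more plausible choice, since it fills a missing winter day with other winters rather than with a whole-year average; either way, the filled cells are estimates and not measurements.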