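I have an Excel file and I want to plot a specific range of rows from a specific column. I can access the required columns and rows, but whenever I run the code this error message appears:
x and y must have same first dimension, but have shapes (23,) and (1,)
The first rows of the consumption sheet are: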
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math
start_date = "2017-07-24 00:00:00"
end_date = "2019-03-09 23:00:00"
data_generation2 = pd.read_excel(r"C:\\Users\\user\\Desktop\\Master\\Thesis\\Tarek\\Parent.xlsx", index_col="Timestamp")
data_consumption2 = pd.read_excel(r"C:\\Users\\user\\Desktop\\Master\\Thesis\\Tarek\\Parent.xlsx", index_col="Timestamp", sheetname="Consumption")
data_generation = data_generation2.loc[start_date:end_date]
data_consumption = data_consumption2.loc[start_date:end_date]
data_consumption2["Timestamp"] = pd.to_datetime(data_consumption2["Timestamp"], format="%d/%m/%y %H%M:%S")
df_to_plot = data_consumption2.loc[(data_consumption2.Timestamp >= start_date) & (data_consumption2.Timestamp < end_date)]
df_to_plot.set_index(df_to_plot.Timestamp)
df_to_plot = df_to_plot[df_to_plot.columns[1:]]
I expect the output to be a plot of the first 720 rows of column 2053G against the time column.
Answer 0 (score: 1)
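I'll try to give you some hints.
After reading the data, you should convert the date column (strings) to datetime format. The to_datetime function does this job (see its documentation).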
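As a minimal sketch of that conversion, assuming the timestamps arrive as day/month/year strings (the sample values below are made up; only the column names Timestamp and 2053G come from the question):

```python
import pandas as pd

# Hypothetical sample: timestamps stored as strings, as they may come out of read_excel
raw = pd.DataFrame({
    "Timestamp": ["24/07/17 00:00:00", "24/07/17 01:00:00", "24/07/17 02:00:00"],
    "2053G": [1.5, 2.0, 1.8],
})

# The format string has to match the text exactly, including every colon
raw["Timestamp"] = pd.to_datetime(raw["Timestamp"], format="%d/%m/%y %H:%M:%S")

print(raw.dtypes)
```

If the format string does not match the text, to_datetime raises a ValueError, so it is worth checking one raw value by eye first.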
Then you need to filter by date. Once the column has been converted, this can be done with loc.
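The filtering step might look like the sketch below. The data is hypothetical hourly data; the point is the difference between a boolean mask on a column and label slicing on a DatetimeIndex.

```python
import pandas as pd

# Hypothetical hourly data covering three days
df = pd.DataFrame({
    "Timestamp": pd.date_range("2019-01-01", periods=72, freq=pd.Timedelta(hours=1)),
    "2053G": range(72),
})

start_date = "2019-01-02 00:00:00"
end_date = "2019-01-03 00:00:00"

# Boolean mask on the column: start is included, end is excluded
mask = (df["Timestamp"] >= start_date) & (df["Timestamp"] < end_date)
selected = df.loc[mask]

# Alternative: with a sorted DatetimeIndex, loc slices by label and includes both ends
indexed = df.set_index("Timestamp")
sliced = indexed.loc[start_date:end_date]

print(len(selected), len(sliced))
```

The two selections differ by one row here because label slicing keeps the end timestamp while the mask uses a strict "less than".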
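Now we want to plot the data frame. pandas provides a plot method (see its documentation). Let's try it. By default, it plots all columns. In our case, we want to plot consumption against date. One way to do that is to turn the date column into the index. The set_index method does this for you (see its documentation).
Once that is done, you are ready to plot!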
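A small sketch of these two steps, using made-up data and hypothetical column names House1 and House2. Note that set_index returns a new DataFrame rather than modifying the original.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: one timestamp column and two consumption columns
df = pd.DataFrame({
    "Timestamp": pd.date_range("2019-01-01", periods=48, freq=pd.Timedelta(hours=1)),
    "House1": [i % 24 for i in range(48)],
    "House2": [(i * 2) % 24 for i in range(48)],
})

# set_index returns a new DataFrame; assign the result, otherwise nothing changes
by_time = df.set_index("Timestamp")

# plot() puts the index on the x-axis and draws one line per remaining column
ax = by_time.plot(title="Consumption over time")
ax.set_xlabel("Time")
ax.set_ylabel("Consumption")
# plt.show()  # uncomment to display the figure interactively
```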
Here is an example, assuming the following data frame:
import pandas as pd
import matplotlib.pyplot as plt
start_date = "2019-01-01 00:00:00"
end_date = "2019-01-02 00:00:00"
data_generation2 = pd.read_excel("your_path")
# transform date column (type string) to datetime type
data_generation2["Date"] = pd.to_datetime(data_generation2["Date"], format="%d/%m/%y %H:%M:%S")
print(data_generation2)
# Date House1 House2 House3 House4 ... House17 House18 House19 House20
# 0 2019-01-01 00:00:00 5.087228 6.416912 0.997723 4.094832 ... 0.462354 5.105688 7.504992 4.688377
# 1 2019-01-01 00:01:00 5.343098 3.205145 8.874018 0.567703 ... 5.593759 3.584948 5.503817 7.027469
# 2 2019-01-01 00:02:00 5.894551 6.411896 3.208524 6.582306 ... 9.149066 5.485656 0.820700 6.391030
# 3 2019-01-01 00:03:00 5.674840 6.558950 3.043355 6.759310 ... 2.270176 3.513524 1.833294 9.889432
# 4 2019-01-01 00:04:00 9.022593 4.088398 6.082084 8.095797 ... 9.270887 4.947194 2.505969 7.396191
# 5 2019-01-01 00:05:00 9.217690 1.950666 5.684551 7.782337 ... 6.990156 1.430377 2.269080 7.343031
# ... ...
# 715 2019-01-30 01:09:00 6.892563 0.553806 0.744839 3.151984 ... 9.418145 2.575782 0.801205 4.290434
# 716 2019-01-30 02:00:00 0.894091 8.872986 8.445927 2.781972 ... 0.832707 4.264321 6.972225 4.155494
# 717 2019-01-30 02:01:00 0.079396 5.270916 0.452875 5.248232 ... 5.331906 4.026245 6.598595 1.637325
# 718 2019-01-30 02:02:00 5.287946 4.745940 8.479177 7.713943 ... 0.642650 9.735159 2.499371 7.824247
# 719 2019-01-30 02:03:00 1.376087 6.262784 5.621571 4.937615 ... 0.801079 5.506654 6.878042 2.430062
# [720 rows x 23 columns]
# Filter according to the date interval
df_to_plot = data_generation2.loc[(data_generation2.Date >= start_date) & (data_generation2.Date < end_date)]
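# Use the date column as the index. set_index returns a new frame, so assign the result;
# passing the Series keeps the Date column in place, and the next line drops it.
df_to_plot = df_to_plot.set_index(df_to_plot.Date)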
df_to_plot = df_to_plot[df_to_plot.columns[1:]] # Remove date columns
print(df_to_plot)
# [720 rows x 21 columns]
# House1 House2 House3 ... House17 House18 House19 House20
# 0 5.087228 6.416912 0.997723 ... 0.462354 5.105688 7.504992 4.688377
# 1 5.343098 3.205145 8.874018 ... 5.593759 3.584948 5.503817 7.027469
# 2 5.894551 6.411896 3.208524 ... 9.149066 5.485656 0.820700 6.391030
# 3 5.674840 6.558950 3.043355 ... 2.270176 3.513524 1.833294 9.889432
# 4 9.022593 4.088398 6.082084 ... 9.270887 4.947194 2.505969 7.396191
# 5 9.217690 1.950666 5.684551 ... 6.990156 1.430377 2.269080 7.343031
# ...
# 20 8.908322 7.452523 9.046442 ... 9.786092 6.370266 5.635464 8.560888
# 21 7.666228 1.294614 2.034965 ... 0.091431 3.210088 9.038868 1.492585
# 22 8.266986 6.209395 9.689687 ... 6.500988 6.133358 2.237199 7.289748
# 23 2.687560 6.852061 5.136214 ... 0.815944 8.267767 0.244232 7.705181
# [24 rows x 20 columns]
# create a figure (2 x 2)
fig, axes = plt.subplots(nrows=2, ncols=2)
# Daily chart
ax = df_to_plot.plot(ax=axes[0, 0], title="Daily consumption", legend=False)
ax.set_xlabel("Days")
ax.set_ylabel("Consumption")
# Cumulative chart
ax = df_to_plot.sort_index().cumsum().plot(ax=axes[0, 1], title="Cumulative consumption")
ax.set_xlabel("Days")
ax.set_ylabel("Cumulative consumption")
ax.legend(ncol=3)
# Bar chart for House 1
ax = df_to_plot.House1.plot(ax=axes[1, 0], kind="bar",
title="House 1 daily consumption", rot=1)
ax.set_xlabel("Days")
ax.set_ylabel("Consumption")
# Pie chart
df_to_plot.sum().sort_values().plot(
ax=axes[1, 1], kind="pie", title="Pie chart consumption")
# Main title
fig.suptitle(f"From {start_date} to {end_date} summary")
plt.show()
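The main task left for you is to rebuild your data frame; I can't do that for you. Once that is done, the code above should give you some ideas.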
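For the specific goal in the question (the first 720 rows of column 2053G against time), something like the sketch below may help. The plotting call that raised the error is not shown in the question, so the last part is only a guess at the cause: the message appears whenever the x and y passed to plot have different lengths. The data here is randomly generated; only the names Timestamp and 2053G come from the question.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the "Consumption" sheet, indexed by Timestamp
rng = np.random.default_rng(0)
consumption = pd.DataFrame(
    {"2053G": rng.random(1000)},
    index=pd.date_range("2017-07-24", periods=1000,
                        freq=pd.Timedelta(hours=1), name="Timestamp"),
)

# First 720 rows of one column, selected by position
first_720 = consumption["2053G"].iloc[:720]

fig, ax = plt.subplots()
# x and y come from the same Series, so their lengths always match
ax.plot(first_720.index, first_720.to_numpy())
ax.set_xlabel("Timestamp")
ax.set_ylabel("2053G")

# Passing x and y of different lengths reproduces the kind of error in the question
try:
    ax.plot(first_720.index[:23], first_720.to_numpy()[:1])
    mismatch_message = ""
except ValueError as err:
    mismatch_message = str(err)
print(mismatch_message)
```

Taking both x and y from the same Series (or simply calling first_720.plot()) avoids the mismatch, because the index and the values cannot get out of step.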