Question

我有一个数据集，如下所示。

Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3
16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000
16/12/2006;17:26:00;5.374;0.498;233.290;23.000;0.000;2.000;17.000
16/12/2006;17:27:00;5.388;0.502;233.740;23.000;0.000;1.000;17.000
16/12/2006;17:28:00;3.666;0.528;235.680;15.800;0.000;1.000;17.000
16/12/2006;17:29:00;3.520;0.522;235.020;15.000;0.000;2.000;17.000
16/12/2006;17:30:00;3.702;0.520;235.090;15.800;0.000;1.000;17.000
16/12/2006;17:31:00;3.700;0.520;235.220;15.800;0.000;1.000;17.000
16/12/2006;17:32:00;3.668;0.510;233.990;15.800;0.000;1.000;17.000

我已经使用熊猫将数据获取到DataFrame中。该数据集包含多天的数据，数据集中每行的间隔为1分钟。

我想使用python绘制每天（如第1栏所示）相对于时间（如第2栏所示）的电压的单独图表。我怎样才能做到这一点？

Answer 1

我相信这可以解决问题（我编辑了日期，所以我们有两个日期）

import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline #If you use Jupyter Notebook

df = pd.read_csv('test.csv', sep=';', usecols=['Date','Time','Voltage'])
unique_dates = df.Date.unique()
for date in unique_dates:
   print('Date: ' + date)
   df.loc[df.Date == date].plot.line('Time', 'Voltage')
   plt.show()

您将获得：

Answer 2

txt = '''Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3
16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000
16/12/2006;17:26:00;5.374;0.498;233.290;23.000;0.000;2.000;17.000
16/12/2006;17:27:00;5.388;0.502;233.740;23.000;0.000;1.000;17.000
16/12/2006;17:28:00;3.666;0.528;235.680;15.800;0.000;1.000;17.000
16/12/2006;17:29:00;3.520;0.522;235.020;15.000;0.000;2.000;17.000
16/12/2006;17:30:00;3.702;0.520;235.090;15.800;0.000;1.000;17.000
16/12/2006;17:31:00;3.700;0.520;235.220;15.800;0.000;1.000;17.000
16/12/2006;17:32:00;3.668;0.510;233.990;15.800;0.000;1.000;17.000'''

from io import StringIO
f = StringIO(txt)
df = pd.read_table(f,sep =';' )
plt.plot(df['Time'],df['Voltage'])
plt.show()

给出输出：

Answer 3

X = df.Date.unique() 

for i in X: #iterate over unique days
    temp_df = df[df.Date==i] #get df for specific day
    temp_df.plot(x = 'Time', y = 'Voltage') #plot

如果要更改x值，可以使用

x = np.arange(1, len(temp_df.Time), 1)

Answer 4

在创建 DateTime 变量以处理多天后按小时和分钟分组。您可以过滤特定日期的分组。

txt = '''日期;时间;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3 16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000 16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000 16/12/2006;17:26:00;5.374;0.498;233.290;23.000;0.000;2.000;17.000 16/12/2006;17:27:00;5.388;0.502;233.740;23.000;0.000;1.000;17.000 16/12/2006;17:28:00;3.666;0.528;235.680;15.800;0.000;1.000;17.000 16/12/2006;17:29:00;3.520;0.522;235.020;15.000;0.000;2.000;17.000 16/12/2006;17:30:00;3.702;0.520;235.090;15.800;0.000;1.000;17.000 16/12/2006;17:31:00;3.700;0.520;235.220;15.800;0.000;1.000;17.000 16/12/2006;17:32:00;3.668;0.510;233.990;15.800;0.000;1.000;17.000'''

 from io import StringIO
 f = StringIO(txt)
 df = pd.read_table(f,sep =';' )

 df['DateTime']=pd.to_datetime(df['Date']+"T"+df['Time']+"Z")
 df.set_index('DateTime',inplace=True)

 filter=df['Date']=='16/12/2006'
 grouped=df[filter].groupby([df.index.hour,df.index.minute])['Voltage'].mean()
 grouped.plot()
 plt.show()

如何根据给定时间绘制数据？

4 个答案: