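I have a text file containing a number of lines that I want to read into a pandas DataFrame. Here are a few lines that I copied from it and saved to another text file: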
MTU, Time, Power, Cost, Voltage
MTU1,05/11/2015 19:59:06,4.102,0.62,122.4
MTU1,05/11/2015 19:59:05,4.089,0.62,122.3
MTU1,05/11/2015 19:59:04,4.089,0.62,122.3
MTU1,05/11/2015 19:59:06,4.089,0.62,122.3
MTU1,05/11/2015 19:59:04,4.097,0.62,122.4
MTU1,05/11/2015 19:59:03,4.097,0.62,122.4
MTU1,05/11/2015 19:59:02,4.111,0.62,122.5
MTU1,05/11/2015 19:59:03,4.111,0.62,122.5
MTU1,05/11/2015 19:59:02,4.104,0.62,122.5
MTU1,05/11/2015 19:59:01,4.090,0.62,122.4
MTU1,05/11/2015 19:59:00,4.093,0.62,122.4
MTU1,05/11/2015 19:58:59,4.112,0.62,122.5
MTU1,05/11/2015 19:58:58,4.107,0.62,122.6
MTU1,05/11/2015 19:58:57,4.092,0.62,122.7
Now, when I read the text file in with the following:
energy=pd.read_csv("energy.txt",sep=",")
# Reading in first 5 rows of data.
energy.head()
Out[65]:
I get:
MTU Time Power Cost Voltage
0 MTU1 05/11/15 19:59 4.102 0.62 122.4
1 MTU1 05/11/15 19:59 4.089 0.62 122.3
2 MTU1 05/11/15 19:59 4.089 0.62 122.3
3 MTU1 05/11/15 19:59 4.089 0.62 122.3
4 MTU1 05/11/15 19:59 4.097 0.62 122.4
The problem, I guess, is that the columns are still strings. I used the following to convert them to numbers:
energy=energy.convert_objects(convert_numeric=True)
But when I try to plot the Power variable against Time to see the trend, I get an error:
energy.plot(energy.time,energy.power)
if isinstance(obj, tuple) and is_setter:
1142 return {'key': obj}
-> 1143 raise KeyError('%s not in index' % objarr[mask])
1144
1145 return _values_from_object(indexer)
KeyError: '[ 4.102 4.089 4.089 4.089 4.097 4.097 4.111 4.111 4.104 4.09\n 4.093 4.112 4.107 4.092 4.092 4.109 4.107 4.107 4.092 4.092\n 4.092 4.107 4.109 4.094 4.09 4.103 4.103 4.103 4.11 4.096\n 4.122 4.156 4.154 4.154 4.144 4.15 4.16 4.16 4.163 4.163\n 4.154 4.15 4.157 4.167 4.16 4.149 4.153 4.165 4.166 4.155\n 4.151 4.164 4.172 4.161 4.152 4.16
I think this is because the Power variable still has "\n" attached to some of its values. How can I fix this error?
Answer 0 (score: 1)
With pandas 0.16 this seems to work fine for me. Note, though, that the column names have a space at the start of the name:
In [48]: energy
Out[48]:
MTU Time Power Cost Voltage
0 MTU1 05/11/2015 19:59:06 4.102 0.62 122.4
1 MTU1 05/11/2015 19:59:05 4.089 0.62 122.3
2 MTU1 05/11/2015 19:59:04 4.089 0.62 122.3
3 MTU1 05/11/2015 19:59:06 4.089 0.62 122.3
4 MTU1 05/11/2015 19:59:04 4.097 0.62 122.4
5 MTU1 05/11/2015 19:59:03 4.097 0.62 122.4
6 MTU1 05/11/2015 19:59:02 4.111 0.62 122.5
7 MTU1 05/11/2015 19:59:03 4.111 0.62 122.5
8 MTU1 05/11/2015 19:59:02 4.104 0.62 122.5
9 MTU1 05/11/2015 19:59:01 4.090 0.62 122.4
10 MTU1 05/11/2015 19:59:00 4.093 0.62 122.4
11 MTU1 05/11/2015 19:58:59 4.112 0.62 122.5
12 MTU1 05/11/2015 19:58:58 4.107 0.62 122.6
13 MTU1 05/11/2015 19:58:57 4.092 0.62 122.7
In [49]: energy.columns
Out[49]: Index([u'MTU', u' Time', u' Power', u' Cost', u' Voltage'], dtype='object')
In [50]: energy.plot(x=' Time', y=' Power') # or energy.plot(' Time', ' Voltage')
Out[50]: <matplotlib.axes.AxesSubplot at 0x10847ffd0>
This gives a plot with Time on the x-axis and Power on the y-axis.
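If typing the leading space in every column name is awkward, one option is to strip it while reading. The following is a minimal sketch, not the only way to do it: it inlines a few rows from the question through `io.StringIO` so it runs without `energy.txt`, and it assumes the dates are written month/day/year, which the sample data does not settle. Note also that `convert_objects` was deprecated and removed in later pandas releases; `pd.to_numeric` is the usual replacement.

```python
import io

import pandas as pd

# A few rows from the question, inlined so the sketch runs without energy.txt.
raw = """MTU, Time, Power, Cost, Voltage
MTU1,05/11/2015 19:59:06,4.102,0.62,122.4
MTU1,05/11/2015 19:59:05,4.089,0.62,122.3
MTU1,05/11/2015 19:59:04,4.089,0.62,122.3
"""

# skipinitialspace=True drops the blank that follows each comma, so the
# header becomes 'Time', 'Power', ... instead of ' Time', ' Power', ...
energy = pd.read_csv(io.StringIO(raw), sep=",", skipinitialspace=True)

# Assumption: dates are month/day/year. Swap %m and %d if they are day-first.
energy["Time"] = pd.to_datetime(energy["Time"], format="%m/%d/%Y %H:%M:%S")

# read_csv already parses these values as floats; to_numeric is shown only
# as the replacement for convert_objects when a column does arrive as text.
energy["Power"] = pd.to_numeric(energy["Power"], errors="coerce")

print(energy.columns.tolist())
print(energy.dtypes)
```

With the columns cleaned up this way, `energy.plot(x="Time", y="Power")` works without the leading space. The KeyError in the question most likely comes from passing the column values themselves to `plot`, which expects column labels; the "\n" in the message is just line wrapping in the printed array, not something attached to the values.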