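Problem plotting dates from a NumPy array

Time: 2013-07-04 20:14:49

Tags: python arrays numpy matplotlib

I am plotting a CSV file of weather data. It imports fine in my code, but I am stuck on plotting it. Here is a sample of the CSV data: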

12:00am,171,6,7,52,76,77.1,63.7,28.74,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96
12:01am,192,4,6,52,76,77.1,63.7,28.74,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96
12:02am,197,3,6,52,76,77.1,63.7,28.74,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96
12:03am,175,3,6,52,76,77.1,63.7,28.73,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96
12:04am,194,4,6,52,76,77.1,63.7,28.73,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96
12:05am,148,5,6,52,76,77.1,63.7,28.73,0.00,0.00,0.0,0,63.7,78.1,67.4,56.0,29.96

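I want the time on the x-axis, but I can't get matplotlib to plot it. I tried an approach using xticks; it plotted my y values, but that was all. The x-axis just showed a thick solid line.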

import matplotlib as mpl
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cbook as cbook
from matplotlib.dates import date2num
import datetime as DT
import re

data = np.genfromtxt('FILE.csv', delimiter=',', dtype=None, skip_header=3)
length = len(data)

x = data['f0']
y = data['f7']

fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.set_title("Temperature")    
ax1.set_xlabel('Time')
ax1.set_ylabel('Degrees')


#plt.plot_date(x, y)
plt.show()
leg = ax1.legend()

plt.show()

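I have left out some key parts because I honestly don't know where to start. I checked the data type of my NumPy array and it keeps reporting numpy.ndarray, and I can't find a way to convert it to string or int values for plotting. This is a 24-hour CSV file, and I would like a tick mark every 30 minutes or so. Any ideas?

2 answers:

Answer 0: (score: 1)

Well, it isn't very elegant, but it works. The key is to convert the times stored in x (which are just strings) into datetime objects so that matplotlib can plot them. I wrote a function that performs the conversion and named it get_datetime_from_string.

**Edited: the code is now compatible with Python 2.7 and handles single-digit hours**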

import matplotlib as mpl
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cbook as cbook
from matplotlib.dates import date2num
import datetime as DT
import re

def get_datetime_from_string(time_string):
    ''' Returns a datetime.datetime object

        Args
        time_string: a string of the form 'xx:xxam'
        '''

    # genfromtxt may return byte strings; decode them so slicing and
    # comparisons operate on text (works on Python 2.7 and Python 3).
    if isinstance(time_string, bytes):
        time_string = time_string.decode('utf-8')

    # period is either am or pm
    colon_position = time_string.find(':')
    period = time_string[-2:]
    # 12-hour to 24-hour: 12am -> 0, 12pm -> 12, 1pm -> 13
    hour = int(time_string[:colon_position]) % 12
    if period.lower() == 'pm':
        hour += 12

    minute = int(time_string[colon_position + 1:colon_position + 3])

    return DT.datetime(1,1,1,hour, minute)

data = np.genfromtxt('test.csv', delimiter=',', dtype=None, skip_header=3)
length=len(data)

x=data['f0']
y=data['f7']

datetimes = [get_datetime_from_string(t) for t in x]

fig = plt.figure()

ax1 = fig.add_subplot(111)

ax1.set_title("Temperature")    
ax1.set_xlabel('Time')
ax1.set_ylabel('Degrees')

# label the line so that legend() has an entry to show
plt.plot(datetimes, y, label='Temperature')
leg = ax1.legend()

plt.show()
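
As an alternative to slicing the string by hand, the standard library can parse this time format directly, and matplotlib's date locators can place the ticks every 30 minutes that the question asks for. The following is only a minimal sketch under a few assumptions: parse_clock_time is a hypothetical helper name, the date 2013-01-01 is a placeholder (the file contains no date), the sample values are made up for illustration, and the %p directive assumes an English-style locale.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


def parse_clock_time(text, day=dt.date(2013, 1, 1)):
    """Parse strings such as '12:05am'; `day` is an assumed placeholder date."""
    if isinstance(text, bytes):
        text = text.decode("utf-8")
    # %I is the 12-hour clock and %p the am/pm marker, so 12am maps to hour 0.
    clock = dt.datetime.strptime(text.strip(), "%I:%M%p").time()
    return dt.datetime.combine(day, clock)


# Illustrative values only, not taken from the real file.
times = [parse_clock_time(t) for t in ["12:00am", "12:30am", "1:00am", "1:30am"]]
temps = [63.7, 63.5, 63.2, 63.0]

fig, ax = plt.subplots()
ax.plot(times, temps, label="Temperature")
# Put a major tick on every full and half hour and show it as a clock time.
ax.xaxis.set_major_locator(mdates.MinuteLocator(byminute=[0, 30]))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%I:%M%p"))
fig.autofmt_xdate()  # rotate the tick labels so they do not overlap
ax.legend()
```

Call plt.show() or fig.savefig(...) afterwards to view the result. With a real file, the list passed to parse_clock_time would be the data['f0'] column instead of the hard-coded strings.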

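I kept getting tripped up because I was trying to slice time_string before converting it to utf-8, and that gave me ASCII values. I'm not sure why the conversion helps, but it does.

Answer 1: (score: 1)

pandas is a very useful library for time series analysis, and it has some plotting features built on matplotlib.

pandas uses dateutil internally to parse dates, but the problem is that your file does not contain the date. In the code below I assume you know the date before parsing the file (from the file name, perhaps?).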
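
One likely explanation, assuming the code was first run on Python 3 and that genfromtxt returned the time column as byte strings (which depends on the NumPy version): indexing a bytes object yields integers rather than characters, and bytes never compare equal to str. The snippet below only illustrates that language behaviour; the value of raw is made up.

```python
raw = b"12:05pm"            # stand-in for a bytes field read from the CSV
text = raw.decode("utf-8")  # the same value as a text string

first_raw = raw[0]          # an int: the ASCII code of the character '1'
first_text = text[0]        # a one-character string

# In Python 3 a bytes slice never equals a str, so an am/pm check on the
# undecoded value silently fails.
period_matches_raw = raw[-2:] == "pm"
period_matches_text = text[-2:] == "pm"

print(first_raw, repr(first_text), period_matches_raw, period_matches_text)
```

Decoding first therefore makes both the slicing and the 'pm' comparison behave as the function expects.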

In [125]: import pandas as pd
In [126]: pd.options.display.mpl_style = 'default'
In [127]: import matplotlib.pyplot as plt; import dateutil.parser

In [128]: class DateParser():                                          
   .....:     def __init__(self, datestring):
   .....:         self.datestring = datestring
   .....:     def get_datetime(self, time):    
   .....:         return dateutil.parser.parse(' '.join([self.datestring, time]))
   .....:     

In [129]: dp = DateParser('2013-01-01')

In [130]: df = pd.read_csv('weather_data.csv', sep=',', index_col=0, header=None,
                  parse_dates={'datetime':[0]}, date_parser=dp.get_datetime)

In [131]: df.ix[:, :12] # show the first columns
Out[131]: 
                      1   2   3   4   5     6     7      8   9   10  11  12  
datetime                                                                      
2013-01-01 00:00:00  171   6   7  52  76  77.1  63.7  28.74   0   0   0   0   
2013-01-01 00:01:00  192   4   6  52  76  77.1  63.7  28.74   0   0   0   0   
2013-01-01 00:02:00  197   3   6  52  76  77.1  63.7  28.74   0   0   0   0   
2013-01-01 00:03:00  175   3   6  52  76  77.1  63.7  28.73   0   0   0   0   
2013-01-01 00:04:00  194   4   6  52  76  77.1  63.7  28.73   0   0   0   0   
2013-01-01 00:05:00  148   5   6  52  76  77.1  63.7  28.73   0   0   0   0   

In [132]: ax = df.ix[:,1:3].plot(secondary_y=1)

In [133]: ax.margins(0.04)

In [134]: plt.tight_layout()

In [135]: plt.savefig('weather_data.png')

[Image: weather_data.png, the plot saved by the session above]
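
The session above dates from 2013, and several of the calls it uses (df.ix, the date_parser argument, the display.mpl_style option) have since been deprecated or removed from pandas, as far as I know. The following is a rough sketch of the same idea with current calls, not the answerer's original method: load_weather is a hypothetical helper, the date '2013-01-01' is still an assumption supplied by the caller, and SAMPLE is a shortened stand-in for the real file.

```python
import io

import pandas as pd

# A shortened stand-in for the CSV file; the real file has more columns.
SAMPLE = """12:00am,171,6,7
12:01am,192,4,6
12:02pm,197,3,6
"""


def load_weather(source, date_string):
    """Read the CSV and build a datetime index from an assumed date."""
    df = pd.read_csv(source, header=None)
    # Prepend the known date to the time-of-day column, then parse both together.
    stamps = pd.to_datetime(date_string + " " + df[0], format="%Y-%m-%d %I:%M%p")
    return df.drop(columns=0).set_index(stamps.rename("datetime"))


df = load_weather(io.StringIO(SAMPLE), "2013-01-01")
print(df)
```

With the index in place, something like df[[1, 2]].plot(secondary_y=[2]) should draw two columns against time, much as the original df.ix[:,1:3].plot(secondary_y=1) call did.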