Plotting log file data with Python

Time: 2020-05-22 15:50:01

Tags: python-3.x pandas matplotlib

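First off, I don't normally use any scripting language other than bash. However, I need to take data from a large number of environmental monitoring log files from a server room monitor and plot it as graphs, and I figured Python would work best. I'm using Python 3.7 for this, and have installed it and all the libraries I've needed so far through macports and pip.

I want to end up with at least 7 graphs, each with multiple lines. Four of the graphs are temperature and humidity data for each physical measurement point. Two are for cold and hot air flow, and the last one is for line voltage.

I tried to get started on my own and got reasonably far. I open the log file and extract the data I need. However, getting the data into a graph seems to be beyond me. The data to graph is a date and time stamp as X, and a decimal number (which should always be positive) as Y.

When extracting the date, I use time.strptime and time.mktime to convert it to Unix epoch, which works fine. When extracting the data, I use re.findall to strip out the non-numeric parts. I plan to convert the dates from epoch back to date and time, but that may come later.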
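
As an illustration of the extraction step described above, here is a minimal sketch that parses a single log line into a timestamp, a numeric value, and a location. It is not the code from the question: `parse_line` and `DATE_PATTERN` are hypothetical names, `datetime.strptime` is swapped in for the `time.strptime`/`time.mktime` pair, and `re.search` is used so that the value comes back as one float rather than a list. The field positions assume lines shaped like the sample input.

```python
import re
from datetime import datetime

DATE_PATTERN = "%m-%d-%Y %H:%M:%S"  # same format string as in the question


def parse_line(line):
    """Return (timestamp, value, location) for one log line, or None if it does not fit."""
    fields = line.split()
    if len(fields) < 10:
        return None  # not enough columns to be a data line
    stamp = datetime.strptime(fields[0] + " " + fields[1], DATE_PATTERN)
    # re.search returns a single match object (or None), unlike re.findall,
    # which returns a list of every match.
    match = re.search(r"\d*\.?\d+", fields[4])
    if match is None:
        return None
    return stamp, float(match.group()), fields[9]


sample = "05-03-2020  14:42:59   Air Velocit 0.14m/ Under Floor Air Flow HVAC"
print(parse_line(sample))
```

Returning a float and a datetime at this stage means the values can later be handed to a plotting call without further conversion.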

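When I get to the plotting part, I run into problems.

I first tried plotting the data directly, which gave me the error: TypeError: unhashable type: 'numpy.ndarray'

I also tried using a pandas DataFrame. That gave me the error: TypeError: unhashable type: 'list'

I even tried converting the lists to tuples, both with and without a DataFrame, and both gave the same error.

Based on the list output, I think the problem is with using append for the Y axis values. However, even with Google, I can't seem to find a good enough solution.

Below are the code, the output I see, and the input data. The commented-out lines are from previous runs, where I used them to test individual parts.

Code so far: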

# Import needed libraries
import re
import time

import matplotlib.pyplot as plt
import pandas as pd
#import matplotlib.dates as mpd

# Need to initialize these or append doesn't work
hvacepoch = []
hvacnum = []
endepoch = []
endnum = []

# Known static variables
datepattern = '%m-%d-%Y %H:%M:%S'

# Open the files
coldairfile = open("air-cold.log","r")

# Grab the data and do some initial conversions
for coldairline in coldairfile:

        fields = coldairline.split()

        colddate = fields[0] + " " + fields[1]
#       coldepoch = mpd.epoch2num(int(time.mktime(time.strptime(colddate, datepattern))))
        coldepoch = int(time.mktime(time.strptime(colddate, datepattern)))
        coldnum = re.findall('\d*\.?\d+',fields[4])
        coldloc = fields[9]

        if coldloc == "HVAC":
                hvacepoch.append(coldepoch)
                hvacnum.append(coldnum)

        if coldloc == "Cold":
                endepoch.append(coldepoch)
                endnum.append(coldnum)


# Convert the lists to a tuple. Do I need this?
hvacepocht = tuple(hvacepoch)
hvacnumt = tuple(hvacnum)
endepocht = tuple(endepoch)
endnumt = tuple(endnum)

# Testing output
print(f'HVAC air flow date and time: {hvacepoch}')
print(f'HVAC air flow date and time tuple: {hvacepocht}')
print(f'HVAC air flow numbers: {hvacnum}')
print(f'HVAC air flow numbers tuple: {hvacnumt}')
print(f'Cold end air flow date and time: {endepoch}')
print(f'Cold end air flow date and time tuple: {endepocht}')
print(f'Cold end air flow numbers: {endnum}')
print(f'Cold end air flow numbers tuple: {endnumt}')

# Graph it. How to do for multiple graphs?

# With a Pandas dataframe as a list.
#colddata=pd.DataFrame({'x': endepoch, 'y1': endnum, 'y2': hvacnum })
#plt.plot( 'x', 'y1', data=colddata, marker='', color='blue', linewidth=2, label="Cold Aisle End")
#plt.plot( 'x', 'y2', data=colddata, marker='', color='skyblue', linewidth=2, label="HVAC")

# With a Pandas dataframe as a tuple.
#colddata=pd.DataFrame({'x': endepocht, 'y1': endnumt, 'y2': hvacnumt })
#plt.plot( 'x', 'y1', data=colddata, marker='', color='blue', linewidth=2, label="Cold Aisle End")
#plt.plot( 'x', 'y2', data=colddata, marker='', color='skyblue', linewidth=2, label="HVAC")

# Without a Pandas dataframe as a list.
#plt.plot(hvacepoch, hvacnum, label = "HVAC")
#plt.plot(endepoch, endnum, label = "Cold End")

# Without a Pandas dataframe as a tuple.
#plt.plot(hvacepocht, hvacnumt, label = "HVAC")
#plt.plot(endepocht, endnumt, label = "Cold End")

# Needed regardless
#plt.title('Airflow\nUnder Floor')
#plt.legend()
#plt.show()


# Close the files
coldairfile.close()

Output of the print lines (truncated):

HVAC air flow date and time: [1588531379, 1588531389, 1588531399]
HVAC air flow date and time tuple: (1588531379, 1588531389, 1588531399)
HVAC air flow numbers: [['0.14'], ['0.15'], ['0.15']]
HVAC air flow numbers tuple: (['0.14'], ['0.15'], ['0.15'])
Cold end air flow date and time: [1588531379, 1588531389, 1588531399]
Cold end air flow date and time tuple: (1588531379, 1588531389, 1588531399)
Cold end air flow numbers: [['0.10'], ['0.09'], ['0.07']]
Cold end air flow numbers tuple: (['0.10'], ['0.09'], ['0.07'])

Input (truncated):

05-03-2020  14:42:59   Air Velocit 0.14m/ Under Floor Air Flow HVAC                                   
05-03-2020  14:42:59   Air Velocit 0.10m/ Under Floor Air Flow Cold End                               
05-03-2020  14:43:09   Air Velocit 0.15m/ Under Floor Air Flow HVAC                                   
05-03-2020  14:43:09   Air Velocit 0.09m/ Under Floor Air Flow Cold End                               
05-03-2020  14:43:19   Air Velocit 0.15m/ Under Floor Air Flow HVAC                                   
05-03-2020  14:43:19   Air Velocit 0.07m/ Under Floor Air Flow Cold End                   

2 Answers:

Answer 0: (score: 0)

IIUC, given your log file, you can use pd.read_fwf with specific colspecs:

df = pd.read_fwf('/home/quang/projects/untitled.txt', header=None,
            colspecs=[[0,20], [22,34], [35,39], [42, 54], [54,-1]],   # modify this to fit your needs
            parse_dates=[0],
            names=['time', 'veloc', 'value', 'location', 'type']    # also modify this
           ) 

That gives you a DataFrame like this:

                 time        veloc  value     location               type
0 2020-05-03 14:42:59  Air Velocit   0.14  Under Floor      Air Flow HVAC
1 2020-05-03 14:42:59  Air Velocit   0.10  Under Floor  Air Flow Cold End
2 2020-05-03 14:43:09  Air Velocit   0.15  Under Floor      Air Flow HVAC
3 2020-05-03 14:43:09  Air Velocit   0.09  Under Floor  Air Flow Cold End
4 2020-05-03 14:43:19  Air Velocit   0.15  Under Floor      Air Flow HVAC
5 2020-05-03 14:43:19  Air Velocit   0.07  Under Floor  Air Flow Cold End

You can then plot it with seaborn (imported as sns):

sns.lineplot(data=df, x='time', y='value', hue='type' )

Output: a line plot of value against time, with one line per type (image not included).
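
Since the question asks for several graphs with multiple lines each, the same long-format DataFrame could also be drawn with plain matplotlib subplots. The following is only a sketch under assumptions: the DataFrame is built by hand from the six sample rows rather than read from a file, the column names follow the answer above, and `plot_panel`, `panels`, and the output file name `panels.png` are hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd

# The six sample rows from the question, in the long format shown above.
df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2020-05-03 14:42:59", "2020-05-03 14:42:59",
                "2020-05-03 14:43:09", "2020-05-03 14:43:09",
                "2020-05-03 14:43:19", "2020-05-03 14:43:19",
            ]
        ),
        "value": [0.14, 0.10, 0.15, 0.09, 0.15, 0.07],
        "type": ["Air Flow HVAC", "Air Flow Cold End"] * 3,
    }
)


def plot_panel(ax, frame, title):
    """Draw one line per distinct 'type' value on the given axes."""
    for name, group in frame.groupby("type"):
        ax.plot(group["time"], group["value"], label=name)
    ax.set_title(title)
    ax.legend()


# One (title, frame) pair per graph; other log files would add entries here.
panels = [("Airflow Under Floor", df)]

# squeeze=False keeps axes two-dimensional even when there is a single panel.
fig, axes = plt.subplots(
    nrows=len(panels), squeeze=False, figsize=(8, 4 * len(panels))
)
for ax, (title, frame) in zip(axes[:, 0], panels):
    plot_panel(ax, frame, title)

fig.autofmt_xdate()          # tilt the date labels so they do not overlap
fig.savefig("panels.png")    # hypothetical file name; plt.show() also works
```

Each additional log file would become one more entry in `panels`, which yields one more stacked subplot without changing the plotting code.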

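Answer 1: (score: 0)

I just checked your data, and it looks like the problem is that endnum and hvacnum are not lists of values. They are lists of lists, as shown below: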

In [1]: colddata.head()
Out[1]:
         x        y1      y2
0   1588531379  [0.10]  [0.14]
1   1588531389  [0.09]  [0.15]
2   1588531399  [0.07]  [0.15]

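So when you plot the data, matplotlib doesn't know how to plot those rows. What you can do is use a list comprehension to pull the values out of the inner lists.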

In [2]:
    print(endnum)
    print(hvacnum)
Out[2]:
    [['0.10'], ['0.09'], ['0.07']]
    [['0.14'], ['0.15'], ['0.15']]

In [3]:
    endnum = [i[0] for i in endnum]
    hvacnum = [i[0] for i in hvacnum]
    print(endnum)
    print(hvacnum)
Out[3]:
    ['0.10', '0.09', '0.07']
    ['0.14', '0.15', '0.15']
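
Note that the flattened values above are still strings. A minimal sketch of the remaining steps, assuming the values from the question's output, would convert them to float and the epochs to datetime objects before plotting. The variable names `hvac_values`, `end_values`, `times`, and the output file name `airflow.png` are hypothetical.

```python
import re
from datetime import datetime

import matplotlib.pyplot as plt

# Values as collected in the question: re.findall returns a list for every
# line, so appending its result builds a list of one-element lists of strings.
epochs = [1588531379, 1588531389, 1588531399]
hvacnum = [re.findall(r"\d*\.?\d+", s) for s in ("0.14m/", "0.15m/", "0.15m/")]
endnum = [re.findall(r"\d*\.?\d+", s) for s in ("0.10m/", "0.09m/", "0.07m/")]

# Take the first match from each inner list and convert it to a float.
hvac_values = [float(match[0]) for match in hvacnum]
end_values = [float(match[0]) for match in endnum]

# Convert epoch seconds to datetime objects so the x axis shows dates.
times = [datetime.fromtimestamp(t) for t in epochs]

fig, ax = plt.subplots()
ax.plot(times, hvac_values, label="HVAC")
ax.plot(times, end_values, label="Cold End")
ax.set_title("Airflow\nUnder Floor")
ax.legend()
fig.autofmt_xdate()          # tilt the date labels so they do not overlap
fig.savefig("airflow.png")   # hypothetical file name; plt.show() also works
```

Without the float conversion matplotlib would treat the strings as category labels rather than numbers, so the Y axis would not be scaled numerically.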