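Parsing a text file with Python to extract the integers on given lines

Time: 2016-06-28 14:37:07

Tags: python parsing matplotlib

I have a test suite that writes its results to a text file. Here is a sample: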

File Opened: Tuesday, June 26, 2016, 10:17:13 AM

File Opened: Tuesday, June 26, 2016, 10:17:29 AM
Radio Test BER LOOP BACK successful
Radio Test PAUSE successful
Radio Test BER LOOP BACK successful

File Opened: Tuesday, June 28, 2016, 10:18:11 AM
Bits received                    10152
Bits in error                       117
Access code bit errors     0
Packets received             49
Packets expected            2707
Packets w/ header error  0
Packets w/ CRC error      0
Packets w/ uncorr errors 0
Sync timeouts                  3
==================================
Bits received                    10368
Bits in error                       85
Access code bit errors     0
Packets received             52
Packets expected            2758
Packets w/ header error  0
Packets w/ CRC error      0
Packets w/ uncorr errors 0
Sync timeouts                  1
==================================
Bits received                    10152
Bits in error                       93
Access code bit errors     0
Packets received             49
Packets expected            2707
Packets w/ header error  0
Packets w/ CRC error      0
Packets w/ uncorr errors 0
Sync timeouts                  3

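I'm trying to extract the numbers that follow Bits received and Bits in error, then divide one by the other to get a percentage.

After that, I want to plot the results as a scatter plot with matplotlib.pyplot.

I'm having a hard time getting these numbers out of the file, though... I'm still working out how to parse it.

Either way, I'm just trying to get through this, and I'm sure I'm not doing it in the most elegant way. This seems like it should be a simple task in Python, and I'm probably making it harder than it needs to be.

How would you approach this? Thanks.

2 Answers:

Answer 0: (score: 3)

Create two lists, one for the received data and one for the error data, then loop over the file and parse each line: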

receivedData = []
errorData = []
with open("data.txt") as f:
    for line in f:
        if line.startswith("Bits received"):
            receivedData.append(int(line.split()[-1]))
        elif line.startswith("Bits in error"):
            errorData.append(int(line.split()[-1]))
        else:
            #do normal stuff with other lines
            pass
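
To get from these two lists to the percentages and the scatter plot the question asks about, something like the following might work. It is only a sketch: the helper names error_percentages and plot_percentages are made up for illustration, and it assumes matplotlib is installed and that both lists hold one entry per result block, in the same order.

```python
def error_percentages(received, errors):
    """Return bit error percentages, one per (received, errors) pair."""
    percentages = []
    for rx, err in zip(received, errors):
        if rx == 0:
            # Assumption: skip blocks with no received bits to avoid dividing by zero
            continue
        percentages.append(100.0 * err / rx)
    return percentages


def plot_percentages(percentages):
    """Scatter-plot the percentages against their block index."""
    # Imported here so the helper above works without matplotlib installed
    import matplotlib.pyplot as plt

    plt.scatter(range(len(percentages)), percentages)
    plt.xlabel("Block index")
    plt.ylabel("Bits in error (%)")
    plt.show()


# Example usage with the lists built by the loop above:
# plot_percentages(error_percentages(receivedData, errorData))
```

Keeping the arithmetic separate from the plotting makes it easy to check the numbers before drawing anything.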

Answer 1: (score: 1)

Another simple approach is to use the regular expression library re (https://docs.python.org/2/library/re.html).

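import re
pattern1 = re.compile(r'Bits received\s+(\d+)')  # \d matches any digit character
pattern2 = re.compile(r'Bits in error\s+(\d+)')

with open('path/file.txt', 'r') as f:
    text = f.read()

# search() scans the whole text; match() only matches at the very start,
# and the file starts with "File Opened", so match() would return None
received = int(pattern1.search(text).group(1))
in_error = int(pattern2.search(text).group(1))
# float() avoids integer division on Python 2; only the first block is read
value_of_interest = float(in_error) / received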

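This approach assumes every input file contains both lines. If you can't make that assumption, store each match result first and check that it exists before using it: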

match1 = pattern1.search(text)  # a match object if the pattern is found
if match1:  # None if it's not found
    received = int(match1.group(1))  # group(1) is the first parenthesized group
match2 = pattern2.search(text)
if match2:
    in_error = int(match2.group(1))
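
Note that the code above reads only one pair of values per file, while the sample file contains several result blocks. If you need every block, re.findall may be a better fit. A minimal sketch, assuming "Bits in error" always directly follows "Bits received" as it does in the sample; parse_blocks and error_ratios are hypothetical helper names:

```python
import re

# \s+ also matches the newline between the two lines
BLOCK_PATTERN = re.compile(r'Bits received\s+(\d+)\s+Bits in error\s+(\d+)')


def parse_blocks(text):
    """Return a list of (received, in_error) integer pairs, one per block."""
    return [(int(rx), int(err)) for rx, err in BLOCK_PATTERN.findall(text)]


def error_ratios(text):
    """Return in_error / received for every block with received > 0."""
    return [float(err) / rx for rx, err in parse_blocks(text) if rx > 0]
```

Because the pattern has two groups, findall returns one tuple of strings per block, which keeps each error count paired with its own received count.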