Extracting values from a log file as strings so they can be used in calculations with Python

Time: 2015-11-29 14:07:01

Tags: python parsing

I have a log file, forces.dat, that is continuously updated with force values, one entry per line. The file looks like this:

1.190476e-05    ((6.882904e-04 3.133477e-04 -5.099806e+02) (8.595292e-08 1.222541e-08 -1.198233e-04) (0.000000e+00 0.000000e+00 0.000000e+00)) ((-1.555656e-05 2.712085e-05 2.977440e-06) (4.087154e-09 1.635450e-08 -2.306391e-08) (0.000000e+00 0.000000e+00 0.000000e+00))

The first number is the time, followed by the force values.

I want to use some of these values in calculations while the file grows. I can read the file as follows:

import time

def follow(thefile):
    thefile.seek(0,2)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

if __name__ == '__main__':
    logfile = open("forces.dat","r")
    loglines = follow(logfile)

    for line in loglines:
        print(line)

http://www.dabeaz.com/generators/

I want to get numbers 1, 11, 12 and 13 from each line and assign each of them to a string, so that I can use them to calculate some values.

I can use

line = line.replace()

but

line = line.rsplit('\t', 1)[0]
line = line[:12]

does not work.

1 answer:

Answer 0: (score: 1)

With the help of Python's re module, you can parse the data using a regular expression. For example:

import re

# Suppose line has the data in your question
line = '1.190476e-05    ((6.882904e-04 3.133477e-04 -5.099806e+02) (8.595292e-08 1.222541e-08 -1.198233e-04) (0.000000e+00 0.000000e+00 0.000000e+00)) ((-1.555656e-05 2.712085e-05 2.977440e-06) (4.087154e-09 1.635450e-08 -2.306391e-08) (0.000000e+00 0.000000e+00 0.000000e+00))'

numbers = re.findall(r'[0-9]\.[0-9]+e[+-][0-9]{2}', line)

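numbers then contains the following data, a plain list of the numbers as strings. Note that the pattern does not match a leading minus sign, so negative values such as -5.099806e+02 appear without it: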

['1.190476e-05',
 '6.882904e-04',
 '3.133477e-04',
 '5.099806e+02',
 '8.595292e-08',
 '1.222541e-08',
 '1.198233e-04',
 '0.000000e+00',
 '0.000000e+00',
 '0.000000e+00',
 '1.555656e-05',
 '2.712085e-05',
 '2.977440e-06',
 '4.087154e-09',
 '1.635450e-08',
 '2.306391e-08',
 '0.000000e+00',
 '0.000000e+00',
 '0.000000e+00']
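
Since the question asks for numbers 1, 11, 12 and 13 and the list above loses the minus signs, here is a minimal sketch of one way to keep the sign and pick out those entries. The `-?` prefix, the name `SIGNED_NUMBER`, and the conversion to float are assumptions added for illustration; they are not part of the original answer.

```python
import re

# Sample line copied from the question.
line = '1.190476e-05    ((6.882904e-04 3.133477e-04 -5.099806e+02) (8.595292e-08 1.222541e-08 -1.198233e-04) (0.000000e+00 0.000000e+00 0.000000e+00)) ((-1.555656e-05 2.712085e-05 2.977440e-06) (4.087154e-09 1.635450e-08 -2.306391e-08) (0.000000e+00 0.000000e+00 0.000000e+00))'

# Same pattern as in the answer, plus an optional leading minus sign.
SIGNED_NUMBER = re.compile(r'-?[0-9]\.[0-9]+e[+-][0-9]{2}')

# Convert every match to a float so it can be used in arithmetic.
values = [float(s) for s in SIGNED_NUMBER.findall(line)]

# Numbers 1, 11, 12 and 13 (counting from 1) are list indices 0, 10, 11 and 12.
t, fx, fy, fz = (values[i] for i in (0, 10, 11, 12))
print(t, fx, fy, fz)
```

The names `t`, `fx`, `fy` and `fz` are placeholders; the question does not say what the individual values represent beyond the first one being the time.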

If you are not familiar with regular expressions, let me break the pattern down:

[0-9]     # Match any digit between 0 and 9, followed by
\.        # ... a literal dot character, followed by
[0-9]+    # ... one or more digits, followed by
e         # ... a literal character 'e', followed by
[+-]      # ... a single occurrence of either '+' or '-', followed by
[0-9]{2}  # ... exactly two digits.
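
The breakdown above can also live inside the pattern itself. As a small sketch, the `re.VERBOSE` flag lets the regular expression span several lines with inline comments; the name `NUMBER` and the shortened sample string are chosen here only for illustration.

```python
import re

# The same pattern as in the answer, written with re.VERBOSE so that
# whitespace is ignored and each part can carry its own comment.
NUMBER = re.compile(r"""
    [0-9]      # one digit before the decimal point
    \.         # a literal dot
    [0-9]+     # one or more digits
    e          # a literal 'e'
    [+-]       # the sign of the exponent
    [0-9]{2}   # exactly two exponent digits
""", re.VERBOSE)

# Shortened sample based on the first part of the line in the question.
sample = '1.190476e-05    ((6.882904e-04 3.133477e-04 -5.099806e+02)'
print(NUMBER.findall(sample))
```

As with the one-line version, this pattern does not capture a leading minus sign.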

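A complete reference for the supported regular expression syntax is available in the documentation for Python's re module.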
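
To connect this with the follow() generator from the question, one possible arrangement is a second generator that takes any iterable of lines and yields the selected numbers as floats. This is only a sketch: the function name `selected_values`, the default indices, and the choice to skip lines with too few matches (for example a line that is still being written) are assumptions, not something stated in the question or the answer.

```python
import re

# Sign-aware variant of the answer's pattern (the '-?' is an added assumption).
SIGNED_NUMBER = re.compile(r'-?[0-9]\.[0-9]+e[+-][0-9]{2}')

def selected_values(lines, indices=(0, 10, 11, 12)):
    """Yield the chosen numbers of each line as a tuple of floats."""
    for line in lines:
        found = SIGNED_NUMBER.findall(line)
        if len(found) <= max(indices):
            continue  # skip incomplete or malformed lines
        yield tuple(float(found[i]) for i in indices)

# Demonstration on a fixed list: the line from the question plus a truncated one.
sample_lines = [
    '1.190476e-05    ((6.882904e-04 3.133477e-04 -5.099806e+02) (8.595292e-08 1.222541e-08 -1.198233e-04) (0.000000e+00 0.000000e+00 0.000000e+00)) ((-1.555656e-05 2.712085e-05 2.977440e-06) (4.087154e-09 1.635450e-08 -2.306391e-08) (0.000000e+00 0.000000e+00 0.000000e+00))',
    '1.200000e-05    ((6.882904e-04',
]
for row in selected_values(sample_lines):
    print(row)
```

Because the function accepts any iterable, passing it `follow(logfile)` from the question instead of `sample_lines` should process new lines as they are appended to forces.dat.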