I have a log file, forces.dat, that is continuously updated with force values, one record per line. The file looks like this:
1.190476e-05 ((6.882904e-04 3.133477e-04 -5.099806e+02) (8.595292e-08 1.222541e-08 -1.198233e-04) (0.000000e+00 0.000000e+00 0.000000e+00)) ((-1.555656e-05 2.712085e-05 2.977440e-06) (4.087154e-09 1.635450e-08 -2.306391e-08) (0.000000e+00 0.000000e+00 0.000000e+00))
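The first number is the time; the force values follow it.
I want to compute some quantities from these values while the file is growing. I can read the file like this: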
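import time

def follow(thefile):
    # Start at the end of the file so only newly appended lines are yielded.
    thefile.seek(0, 2)
    while True:
        line = thefile.readline()
        if not line:
            # Nothing new yet; wait briefly before polling again.
            time.sleep(0.1)
            continue
        yield line

if __name__ == '__main__':
    logfile = open("forces.dat", "r")
    loglines = follow(logfile)
    for line in loglines:
        print(line)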
(http://www.dabeaz.com/generators/)
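I want to extract numbers 1, 11, 12 and 13 from each line and assign each of them a name, so that I can compute some values with them.
I can use
line = line.replace()
but
line = line.rsplit('\t', 1)[0]
line = line[:12]
does not work.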
Answer 0 (score: 1)
With the help of Python's re module, you can parse the data using regular expressions. For example:
import re
# Suppose line has the data in your question
line = '1.190476e-05 ((6.882904e-04 3.133477e-04 -5.099806e+02) (8.595292e-08 1.222541e-08 -1.198233e-04) (0.000000e+00 0.000000e+00 0.000000e+00)) ((-1.555656e-05 2.712085e-05 2.977440e-06) (4.087154e-09 1.635450e-08 -2.306391e-08) (0.000000e+00 0.000000e+00 0.000000e+00))'
numbers = re.findall(r'[0-9]\.[0-9]+e[+-][0-9]{2}', line)
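numbers now holds the matches as a plain, easy-to-use list of strings. Note that the pattern does not capture a leading minus sign, so negative values such as -5.099806e+02 appear without it: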
['1.190476e-05',
'6.882904e-04',
'3.133477e-04',
'5.099806e+02',
'8.595292e-08',
'1.222541e-08',
'1.198233e-04',
'0.000000e+00',
'0.000000e+00',
'0.000000e+00',
'1.555656e-05',
'2.712085e-05',
'2.977440e-06',
'4.087154e-09',
'1.635450e-08',
'2.306391e-08',
'0.000000e+00',
'0.000000e+00',
'0.000000e+00']
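Since the question asks for values to calculate with, the strings still need converting to floats, and the signs have to be kept. The following is a minimal sketch under two assumptions that are not in the original answer: the pattern is extended with an optional leading `-?`, and "numbers 1, 11, 12 and 13" are counted from 1, left to right. The names `SIGNED_NUMBER`, `parse_line`, `time_value`, `v11`, `v12` and `v13` are hypothetical.

```python
import re

# Assumption: the answer's pattern plus an optional leading minus sign.
SIGNED_NUMBER = re.compile(r'-?[0-9]\.[0-9]+e[+-][0-9]{2}')

def parse_line(line):
    """Return every number on one log line as a float, signs included."""
    return [float(token) for token in SIGNED_NUMBER.findall(line)]

line = ('1.190476e-05 ((6.882904e-04 3.133477e-04 -5.099806e+02) '
        '(8.595292e-08 1.222541e-08 -1.198233e-04) '
        '(0.000000e+00 0.000000e+00 0.000000e+00)) '
        '((-1.555656e-05 2.712085e-05 2.977440e-06) '
        '(4.087154e-09 1.635450e-08 -2.306391e-08) '
        '(0.000000e+00 0.000000e+00 0.000000e+00))')

values = parse_line(line)

# Positions 1, 11, 12 and 13 from the question, counted from 1 (assumed).
time_value, v11, v12, v13 = (values[i - 1] for i in (1, 11, 12, 13))
```

With the sample line, `values` has 19 entries and `v11` is negative, which the unsigned pattern would have lost.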
If you are not familiar with regular expressions, let me break it down:
\d      # Match any digit between 0 and 9, followed by
\.      # ... a literal dot character, followed by
\d+     # ... one or more digits, followed by
e       # ... a literal character 'e', followed by
[+-]    # ... a single occurrence of either '+' or '-', followed by
\d{2}   # ... exactly two digits.
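The breakdown above can also be kept inside the code itself. As a small sketch, the same pattern can be compiled with the `re.VERBOSE` flag, which ignores whitespace in the pattern and allows `#` comments; the names `NUMBER`, `sample` and `matches` are only illustrative.

```python
import re

# Same pattern as in the answer, spelled out with re.VERBOSE so that each
# piece carries its own comment.
NUMBER = re.compile(r"""
    \d        # a single digit, followed by
    \.        # a literal dot, followed by
    \d+       # one or more digits, followed by
    e         # a literal 'e', followed by
    [+-]      # either '+' or '-', followed by
    \d{2}     # exactly two digits
""", re.VERBOSE)

sample = '1.190476e-05 ((6.882904e-04 3.133477e-04 -5.099806e+02)'
matches = NUMBER.findall(sample)
```

As with the one-line version, this pattern does not include a leading minus sign, so the last match comes back as '5.099806e+02'.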
A complete reference for the supported regular expression syntax is in the documentation of Python's re module.
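To tie this back to the growing file, the regular expression can be applied to each line produced by the `follow` generator from the question. This is only a sketch under stated assumptions: the pattern includes an optional minus sign (not part of the answer's original expression), positions are counted from 1, and `selected_values`, `watch` and `poll_interval` are hypothetical names introduced here.

```python
import re
import time

# Assumption: signs matter, so an optional leading '-' is added to the pattern.
NUMBER = re.compile(r'-?[0-9]\.[0-9]+e[+-][0-9]{2}')

def follow(thefile, poll_interval=0.1):
    """Yield lines appended to an open file, starting from its current end."""
    thefile.seek(0, 2)
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(poll_interval)
            continue
        yield line

def selected_values(lines, positions=(1, 11, 12, 13)):
    """For each line, yield the numbers at the given 1-based positions as floats."""
    for line in lines:
        values = [float(token) for token in NUMBER.findall(line)]
        if len(values) < max(positions):
            continue  # skip lines that do not hold enough numbers
        yield tuple(values[p - 1] for p in positions)

def watch(path='forces.dat'):
    """Print the selected values for every new line written to the file."""
    with open(path, 'r') as logfile:
        for t, a, b, c in selected_values(follow(logfile)):
            print(t, a, b, c)
```

Calling `watch()` would block and print one tuple per new record. Because `selected_values` accepts any iterable of lines, it can also be tried on a list of strings without touching the file. One caveat: if the writer flushes a record in pieces, `readline` may return a partial line, so a more careful version would buffer text until a newline arrives.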