Question

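I'm trying to extract values (floats) from a data file. I only want the first value on each line; the second is the error. (For example, xo @ 9.95322254_0.00108217853 means 9.953... is the value and 0.0010... is the error.)

This is my code: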

import sys
import re

inf = sys.argv[1]
out = sys.argv[2]
f = inf
outf = open(out, 'w')
intensity = []

with open(inf) as f:    
    pattern = re.compile(r"[^-\d]*([\-]{0,1}\d+\.\d+)[^-\d]*")  

    for line in f:
        f.split("\n")
        match = pattern.match(line)
        if match:
            intensity.append(match.group(0))


for k in range(len(intensity)):
    outf.write(intensity[k])

But it doesn't work. The output file is empty. The lines in the data file look like this:

xo_Is 
xo  @  9.95322254`_0.00108217853
SPVII_to_PVII_Peak_type
PVII_m(@, 1.61879`_0.08117)
PVII_h(@, 0.11649`_0.00216)
I @  0.101760618`_0.00190314017

In each case, the first number is the value I want to extract and the second number is the error.

Answer 1

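You're almost there, but your code contains errors that prevent it from running: f.split("\n") fails because a file object has no split method, and match.group(0) returns the whole match rather than the captured number. The following works: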

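import re
import sys

inf, out = sys.argv[1], sys.argv[2]  # input and output paths, as in the question

# Skip leading non-numeric characters, then capture the first signed decimal.
pattern = re.compile(r"[^-\d]*(-?\d+\.\d+)[^-\d]*")

with open(inf) as f, open(out, 'w') as outf:
    for line in f:
        match = pattern.match(line)
        if match:
            outf.write(match.group(1) + '\n')  # group(1) is the captured number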
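
To check the pattern without touching any files, here is a minimal sketch that applies it to the sample lines from the question, held in a list. The helper name extract_first_values is hypothetical and only serves this illustration.

```python
import re

# Same pattern as in the answer above.
pattern = re.compile(r"[^-\d]*(-?\d+\.\d+)[^-\d]*")

def extract_first_values(lines):
    """Return the first decimal number on each line that has one (hypothetical helper)."""
    values = []
    for line in lines:
        match = pattern.match(line)
        if match:
            values.append(match.group(1))
    return values

# Sample lines copied from the question.
sample_lines = [
    "xo_Is ",
    "xo  @  9.95322254`_0.00108217853",
    "SPVII_to_PVII_Peak_type",
    "PVII_m(@, 1.61879`_0.08117)",
    "PVII_h(@, 0.11649`_0.00216)",
    "I @  0.101760618`_0.00190314017",
]

values = extract_first_values(sample_lines)
print(values)
```

Lines without a decimal number, such as xo_Is, produce no match and are skipped, so only the four value lines contribute to the result.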

Answer 2

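I think you should test your pattern on a simple string rather than on a file. That will show where the error is: in the pattern or in the code that parses the file. The pattern looks fine. Also, in most languages I know, group(0) is the entire matched text; to get your number you need group(1). Are you sure f.split("\n") needs to be in there?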
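
As a rough illustration of that suggestion, the sketch below runs the question's original pattern against one line from the data file, taken as a plain string, and compares group(0) with group(1). The variable names are chosen for this example only.

```python
import re

# The pattern exactly as written in the question.
pattern = re.compile(r"[^-\d]*([\-]{0,1}\d+\.\d+)[^-\d]*")

# One line from the question's data file, used as a plain string.
line = "xo  @  9.95322254`_0.00108217853"
match = pattern.match(line)

whole = match.group(0)   # everything the pattern consumed, including the label
first = match.group(1)   # only the captured number

print(repr(whole))
print(repr(first))
```

The pattern does match, so the problem lies in the surrounding file-handling code and in the choice of group, not in the regular expression itself.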

Extracting floats from a data file
