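Extracting data from a file using Python

Time: 2016-03-22 00:59:58

Tags: python

I need to extract data from the lines of a text file. The data is name and score information, formatted as follows: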

Feature_Locations:
   - { x:9.0745818614959717e-01, y:2.8846755623817444e-01,
       z:3.5268107056617737e-01 }
   - { x:1.1413983106613159e+00, y:2.7305576205253601e-01,
       z:4.4357028603553772e-01 }
   - { x:1.7582545280456543e+00, y:2.2776308655738831e-01,
       z:6.6982054710388184e-01 }
   - { x:9.6545284986495972e-01, y:2.8368893265724182e-01,
       z:3.6416915059089661e-01 }
   - { x:1.2183872461318970e+00, y:2.7094465494155884e-01,
       z:4.5954680442810059e-01 }

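This file is generated by another piece of software. Basically, I want to read the data in my program and save the values to separate files, for example "axeX.txt", "axeY.txt", and "axeZ.txt".

I tried this: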

import numpy as np
import matplotlib.pyplot as plt
import re
file = open('data.txt', "r")
for r in file:
    y = re.sub("- {", "",r).split()
    tt = y[:2]
    zz = tt
    st = re.findall(r'\d+', r)
    print(st)
file.close()

Is there a better way, or am I doing something wrong?

2 Answers:

Answer 0: (score: 1)

The input file is in YAML format. I recommend using the PyYAML package to parse it.

import yaml

document = """
Feature_Locations:
   - { x: 9.0745818614959717e-01, y: 2.8846755623817444e-01,
       z: 3.5268107056617737e-01 }
   - { x: 1.1413983106613159e+00, y: 2.7305576205253601e-01,
       z: 4.4357028603553772e-01 }
   - { x: 1.7582545280456543e+00, y: 2.2776308655738831e-01,
       z: 6.6982054710388184e-01 }
   - { x: 9.6545284986495972e-01, y: 2.8368893265724182e-01,
       z: 3.6416915059089661e-01 }
   - { x: 1.2183872461318970e+00, y: 2.7094465494155884e-01,
       z: 4.5954680442810059e-01 }
"""

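# safe_load parses plain data without constructing arbitrary Python objects
locations = yaml.safe_load(document)['Feature_Locations']

# Write one file per axis (axeX.txt, axeY.txt, axeZ.txt), one value per line
for ch in 'XYZ':
    fname = 'axe%s.txt' % ch
    with open(fname, 'w') as fh:
        for item in locations:
            fh.write('%s\n' % item[ch.lower()])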

The input file is slightly malformed: there is no space after the colons. yamllint performs a sanity check and reports the errors.

yamllint inputfile.yaml
inputfile.yaml
  1:1       warning  missing document start "---"  (document-start)
  2:9       error    syntax error: found unexpected ':'

In this case, the input file is easy to fix.

 sed -i 's/:/: /g' inputfile.yaml
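
If you would rather not modify the file in place, the same repair can be sketched in Python before parsing. This is only an illustration: `add_space_after_colons` is a name I made up, and it assumes no value in the file legitimately contains a colon (timestamps or URLs, for instance, would be mangled). Unlike the sed command, it only touches colons that are directly followed by a non-space character, so running it twice changes nothing.

```python
import re

def add_space_after_colons(text):
    """Insert a space after every colon that is directly followed by a non-space character."""
    # The lookahead (?=\S) leaves "key: value" and line-ending colons untouched.
    return re.sub(r':(?=\S)', ': ', text)

# Sample copied from the question's input
raw = (
    "Feature_Locations:\n"
    "   - { x:9.0745818614959717e-01, y:2.8846755623817444e-01,\n"
    "       z:3.5268107056617737e-01 }\n"
)
fixed = add_space_after_colons(raw)
print(fixed)
```

The repaired text can then be handed to the YAML parser instead of the raw file contents.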

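Answer 1: (score: 0)

You can try something like this:

import re

# Read the whole file at once so the patterns can scan every entry
with open('data.txt', 'r') as f:
    s = f.read()

# Capture the value after each key; \s* tolerates an optional space after the colon
x = re.findall(r'x:\s*([^,\s}]+)', s)
y = re.findall(r'y:\s*([^,\s}]+)', s)
z = re.findall(r'z:\s*([^,\s}]+)', s)

# One value per line in each output file
with open('axeX.txt', 'w') as f: f.write('\n'.join(x))
with open('axeY.txt', 'w') as f: f.write('\n'.join(y))
with open('axeZ.txt', 'w') as f: f.write('\n'.join(z))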
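
A possible variation on the regex approach, as a rough sketch: a single pattern that captures the key and the number together, so all three axes are collected in one pass and converted to floats. The names `PATTERN` and `extract_axes` are my own, and the pattern assumes the keys are always the single letters x, y, and z and that the values are plain decimal or scientific-notation numbers, as in the sample.

```python
import re
from collections import defaultdict

# Matches e.g. "x:9.07e-01" or "x: 9.07e-01"; group 1 is the key, group 2 the number.
PATTERN = re.compile(r'\b([xyz]):\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)')

def extract_axes(text):
    """Return a dict mapping 'x', 'y', 'z' to lists of floats in file order."""
    axes = defaultdict(list)
    for key, value in PATTERN.findall(text):
        axes[key].append(float(value))
    return dict(axes)

# Two entries copied from the question's input
sample = """Feature_Locations:
   - { x:9.0745818614959717e-01, y:2.8846755623817444e-01,
       z:3.5268107056617737e-01 }
   - { x:1.1413983106613159e+00, y:2.7305576205253601e-01,
       z:4.4357028603553772e-01 }
"""
axes = extract_axes(sample)
for key, values in axes.items():
    print(key, values)
```

Each list can then be written out the same way as above, for example with '\n'.join(str(v) for v in axes['x']). Note that converting to float and back may not reproduce the original digits exactly; keep the captured strings instead if the text must be preserved as written.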