Extracting data from a specially formatted string with Python

Time: 2016-03-31 18:57:08

Tags: python regex parsing format

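I'm new to Python and am trying to use it to parse strings in a custom output format. The format contains named lists of floats and a named list of float tuples. I wrote a function that does this, but it looks like overkill. How can I do this in a more idiomatic Python way?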

import re

def extract_line(line):
    line = line.lstrip('0123456789@ ')
    measurement_list = list(filter(None, re.split(r'\s*;\s*', line)))
    measurement = {}
    for elem in measurement_list:
        elem_list = list(filter(None, re.split(r'\s*=\s*', elem)))
        name = elem_list[0]
        if name == 'points':
            points = list(filter(None, re.split(r'\s*\(\s*|\s*\)\s*',elem_list[1].strip(' {}'))))
            for point in points:
                p = re.match(r'\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*', point).groups()
                if 'points' not in measurement.keys():
                    measurement['points'] = []
                measurement['points'].append(tuple(map(float,p)))
        else:
            values = list(filter(None, elem_list[1].strip(' {}').split(' ')))
            for value in values:
                if name not in measurement.keys():
                    measurement[name] = []
                measurement[name].append(float(value))
    return measurement

to_parse = '@10 points = { ( 2.96296 , 0.822213 ) ( 3.7037 , 0.902167 ) } ; L = { 5.20086 } ; P = { 3.14815 3.51852 } ;'

print(extract_line(to_parse))

2 answers:

Answer 0: (score: 0)

This:

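import re
# The lookahead (?= ) leaves the trailing space unconsumed, so adjacent
# numbers such as "3.14815 3.51852" are both matched.
a = re.findall(r' ([\d.eE-]+)(?= )', to_parse)
print(list(map(float, a)))
# [2.96296, 0.822213, 3.7037, 0.902167, 5.20086, 3.14815, 3.51852]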

will give you a list of your numbers. Is that what you're looking for?

Answer 1: (score: 0)

You can do this with re.findall:

import re
to_parse = '@10 points = { ( 2.96296 , 0.822213 ) ( 3.7037 , 0.902167 ) } ; L = { 5.20086 } ; P = { 3.14815 3.51852 } ;'

m_list = re.findall(r'(\w+)\s*=\s*{([^}]*)}', to_parse)
measurements = {}
for k,v in m_list:
    if k == 'points':
        elts = re.findall(r'([0-9.]+)\s*,\s*([0-9.]+)', v)
        measurements[k] = [tuple(map(float, elt)) for elt in elts]
    else:
        measurements[k] = [float(x) for x in v.split()]

print(measurements)

Feel free to put it into a function and check whether the key already exists.
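
As a minimal sketch of that suggestion, the snippet above could be wrapped in a function along these lines. The name parse_measurements is hypothetical, and merging the values when a name appears more than once is an assumption, since the answer does not say what should happen to a repeated key.

```python
import re

def parse_measurements(line):
    """Parse 'name = { ... } ;' groups into a dict of lists (hypothetical helper)."""
    measurements = {}
    for name, body in re.findall(r'(\w+)\s*=\s*\{([^}]*)\}', line):
        if name == 'points':
            # Each point is written as "( x , y )"; capture the two numbers.
            pairs = re.findall(r'([0-9.]+)\s*,\s*([0-9.]+)', body)
            values = [tuple(map(float, pair)) for pair in pairs]
        else:
            values = [float(x) for x in body.split()]
        # Assumption: a repeated name extends the existing list instead of
        # overwriting it. setdefault creates the list the first time.
        measurements.setdefault(name, []).extend(values)
    return measurements

to_parse = '@10 points = { ( 2.96296 , 0.822213 ) ( 3.7037 , 0.902167 ) } ; L = { 5.20086 } ; P = { 3.14815 3.51852 } ;'
print(parse_measurements(to_parse))
```

If overwriting a repeated key is preferred, replacing the setdefault line with measurements[name] = values would do that instead.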