Question

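I need a way to read lists stored in a file that may be written in several different ways. I am trying to anticipate every way a user might write a list in a file and interpret each one correctly.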

Here is a sample input file in which the lists are written in different ways.

# in_file.dat

# List 1 (enclosing brackets, comma separated, spaces after commas)
[0.5, 0.2, 0.6, 0.9, 1.5]

# List 2 (enclosing parenthesis, comma separated, no spaces or some spaces after commas)
(0.5,0.2,0.6,0.9, 1.5)

# List 3 (enclosing keys, mixed commas and semi-colons, mixed no-spaces and spaces)
{0.5,0.2,0.6;0.9;1.5}

# List 4 (single item)
[0.6]

# List 5 (space separated elements)
[2.3 5. 0.6 1.2 0.0 3.1]

Each line should be read correctly as a list, giving:

ls_1 = [0.5, 0.2, 0.6, 0.9, 1.5]
ls_2 = [0.5, 0.2, 0.6, 0.9, 1.5]
ls_3 = [0.5, 0.2, 0.6, 0.9, 1.5]
ls_4 = [0.6]
ls_5 = [2.3, 5., 0.6, 1.2, 0.0, 3.1]

My usual way of reading the file is:

# Read data from file.
with open('in_file.dat', "r") as f_dat:
    # Iterate through each line in the file.
    for line in f_dat:
        # Skip comments
        if not line.startswith("#") and line.strip() != '':
            # Read list stored in line.
            ls_X = ??

Is there some general way to force Python to interpret the line as a list?

Answer 1

If you are sure that every line contains only a sequence of numbers, use re:

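import re

lines = []
with open('in_file.dat') as f_dat:
    for l in f_dat:
        # Skip blank lines and comment lines.
        if l.strip() and not l.startswith('#'):
            # Every run of digits and dots is taken as one number.
            lines.append([float(i) for i in re.findall(r'[0-9.]+', l)])
print(lines)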

Hope this is what you were looking for.
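
One limitation of the pattern above is that `[0-9.]+` ignores signs and exponents, so `-1.5` would come back as `1.5`. Here is a minimal sketch of a stricter pattern, assuming the file may also hold signed values or scientific notation (the sample in the question does not). `NUMBER` and `parse_numbers` are names made up for illustration.

```python
import re

# Optionally signed decimals such as 0.5, 5., .5, -1.2 and 3e-4.
NUMBER = re.compile(r'[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?')

def parse_numbers(line):
    """Return every number found in `line` as a float (hypothetical helper)."""
    return [float(tok) for tok in NUMBER.findall(line)]

print(parse_numbers('{0.5,0.2,0.6;0.9;1.5}'))
print(parse_numbers('[2.3 5. 0.6 1.2 0.0 3.1]'))
print(parse_numbers('(-1.5, 2e3)'))
```

Because the brackets and separators are simply skipped, this does not care which enclosing characters or delimiters the user picked, but it also cannot recover nesting.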

Answer 2

Try this:

import re
with open('in_file.dat', "r") as f_dat:
    for line in f_dat:
        if not line.startswith("#") and line.strip() != '':
            # Strip the newline first, then drop the enclosing brackets.
            parts = re.split('[, ;]', line.strip()[1:-1])
            # Discard empty strings and convert the rest to floats.
            ls_X = [float(x) for x in parts if x != ""]
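
The same split-on-delimiters idea can be wrapped in a generator so the file loop and the parsing stay in one place. This is only a sketch: `read_lists` is a hypothetical helper, and `io.StringIO` stands in for the real file so the example runs on its own.

```python
import io
import re

def read_lists(stream):
    """Yield one list of floats per non-comment, non-blank line."""
    for line in stream:
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        # Remove whichever enclosing characters the user chose.
        body = line.strip('[](){}')
        # Split on any run of commas, semicolons or whitespace.
        yield [float(p) for p in re.split(r'[,;\s]+', body) if p]

sample = io.StringIO(
    "# List 3\n"
    "{0.5,0.2,0.6;0.9;1.5}\n"
    "\n"
    "# List 5\n"
    "[2.3 5. 0.6 1.2 0.0 3.1]\n"
)
print(list(read_lists(sample)))
```

With a real file you would pass the open file object instead, e.g. `with open('in_file.dat') as f_dat: lists = list(read_lists(f_dat))`.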

Answer 3

Maybe something like this. It also works for nested structures:

from ast import literal_eval
import re

# Map ';' and ',' to spaces and every bracket style to square brackets.
table = str.maketrans(';,{}()', '  [][]')
with open('file.txt', "r") as f_dat:
    for line in f_dat:
        if not line.startswith("#") and line.strip() != '':
            # Collapse each run of whitespace into a single comma.
            line = re.sub(r'\s+', ',', line.strip().translate(table))
            try:
                print(literal_eval(line))
            except (ValueError, SyntaxError):
                pass

Demo:

>>> !cat file.txt
# in_file.dat

# List 1 (enclosing brackets, comma separated, spaces after commas)
[0.5,    0.2, 0.6,              [0.9, 1.5]]

# List 2 (enclosing parenthesis, comma separated, no spaces or some spaces after commas)
(0.5,0.2,0.6,0.9, 1.5)

# List 3 (enclosing keys, mixed commas and semi-colons, mixed no-spaces and spaces)
{0.5,0.2,0.6;0.9;1.5;{1, [2, 3; 100 200]}}

# List 4 (single item)
[0.6]

# List 5 (space separated elements)
[2.3 5. 0.6 1.2 0.0 3.1 [10 20 {50, [60]}] ]

>>> %run so.py
[0.5, 0.2, 0.6, [0.9, 1.5]]
[0.5, 0.2, 0.6, 0.9, 1.5]
[0.5, 0.2, 0.6, 0.9, 1.5, [1, [2, 3, 100, 200]]]
[0.6]
[2.3, 5.0, 0.6, 1.2, 0.0, 3.1, [10, 20, [50, [60]]]]
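
One case the snippet above skips silently is a space right after an opening bracket, e.g. `[ 0.6 ]`: the space turns into a leading comma, literal_eval raises SyntaxError, and the line is dropped. Below is a rough sketch of a function version that removes those leading commas first. `parse_nested` and `TABLE` are illustrative names, and the extra `re.sub` is my assumption about how to handle that case, not part of the original answer.

```python
import re
from ast import literal_eval

# ';' and ',' become spaces; '{}' and '()' become '[]'.
TABLE = str.maketrans(';,{}()', '  [][]')

def parse_nested(line):
    """Parse one line into a (possibly nested) list (hypothetical helper)."""
    text = line.strip().translate(TABLE)
    # Every run of whitespace becomes exactly one comma.
    text = re.sub(r'\s+', ',', text)
    # Drop commas left behind by spaces after an opening bracket.
    text = re.sub(r'\[,+', '[', text)
    return literal_eval(text)

print(parse_nested('[ 0.6 ]'))
print(parse_nested('{0.5,0.2,0.6;0.9;1.5;{1, [2, 3; 100 200]}}'))
```

Trailing commas before a closing bracket are left alone because Python accepts them in list literals.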

Answer 4

>>> file
'[0.5, 0.2, 0.6, 0.9, 1.5]\n(0.5,0.2,0.6,0.9, 1.5)\n{0.5,0.2,0.6;0.9;1.5}\n[0.6]\n[2.3 5. 0.6 1.2 0.0 3.1]'
>>> for line in file.split('\n'):
...     print(re.split(r"[,\s;]\s*", re.sub(r"[{}()\[\]]", '', line)))
... 
['0.5', '0.2', '0.6', '0.9', '1.5']
['0.5', '0.2', '0.6', '0.9', '1.5']
['0.5', '0.2', '0.6', '0.9', '1.5']
['0.6']
['2.3', '5.', '0.6', '1.2', '0.0', '3.1']
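
Note that the session above returns strings, and a separator written with a space before it (such as `0.5 ; 0.2`) leaves an empty string in the result. A small sketch that reuses the same two patterns, filters the empties, converts to float, and stores the lists under the names used in the question; `tokens_to_floats` and the sample `data` string are made up for illustration.

```python
import re

def tokens_to_floats(line):
    """Split one line with the patterns above and convert to floats."""
    bare = re.sub(r"[{}()\[\]]", "", line)
    # `if t` drops the empty strings produced by ' ; ' style separators.
    return [float(t) for t in re.split(r"[,\s;]\s*", bare) if t]

data = '[0.5, 0.2]\n{0.5 ; 0.2}\n[0.6]'
lists = {'ls_%d' % i: tokens_to_floats(l)
         for i, l in enumerate(data.split('\n'), 1)}
print(lists)
```

A dict keyed by name is used here instead of separate `ls_1`, `ls_2`, ... variables, since the number of lists in the file is not known in advance.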
