Question

My question is different because it does not deal with regular expressions, so I think it is not a duplicate. I am getting this error:

ValueError: invalid literal for float(): 512220      0      20      34.4     
 2.4      0     10010      913        52      0.00

My CSV file looks like this:

512220     0      20       34.4      2.4      0     10010      913        52      0.00
512221     1      30       34.6      2.3      0     10230      910.3      54      0.00
512222     2      50       34.8      2.1      0     10020      932        56      0.00
512223     3      60       35.4      2.5      0     10340      945.5      58      0.00

My code is:

with open(item[1]) as f:
    lines = f.readlines()
    print 'lines', lines
for k, line in enumerate(lines):
    data_temporary = line.strip().split("\r\n")

When I print "line", I get the following:

['512220     0      20       34.4      2.4      0     10010      913        52   
 0.00\n', '512221     1      30       34.6      2.3      0     10230      910.3   
 54      0.00\n', '512222     2      50       34.8      2.1      0     10020     
 932        56      0.00\n', '512223     3      60       35.4      2.5  
 0     10340      945.5      58      0.00'\n]

When I print data_temporary, I get only the following single line:

['160129    29  0000     0      0.04       5.3      2.04  
  0.00     11758      9.13        52      0.00']

I also tried the following command, with the result shown below: data_temporary = line.strip().split(" ")

['512220', '', '', '', '', '', '', '0', '', '', '', '', '', '20', '', '', '', 
 '', '', '', '34.4', '', '', '', '', '', '2.4', '', '', '', '', '', '0', '', '',
 '', '10010', '', '', '', '', '', '913', '', '','', '', '', '52', '', '',
 '', '', '', '0.00']

I tried applying different solutions I found on SO, but could not get them to work. For example, I tried using

  lines = map(lambda l: l.strip().split('\t'), lines) and some others.

I think I have to break the list into strings and then operate on them. Could someone help me solve this so that I can understand it better? Thanks.

Answer 1

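If you iterate over the file with a for loop, you get one line per iteration. You can then call split() on that line, which splits it on whitespace and returns a list.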

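with open('filename.txt', 'r') as f:
    for line in f:
        # split() with no argument splits on any run of whitespace
        data = line.split()
        print(data)

        # convert an individual column, here the fourth one
        z = float(data[3])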

Output:

['512220', '0', '20', '34.4', '2.4', '0', '10010', '913', '52', '0.00']
['512221', '1', '30', '34.6', '2.3', '0', '10230', '910.3', '54', '0.00']
['512222', '2', '50', '34.8', '2.1', '0', '10020', '932', '56', '0.00']
['512223', '3', '60', '35.4', '2.5', '0', '10340', '945.5', '58', '0.00']
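
To see why the original code raised the error, the sketch below compares the two separators on one sample row. It assumes Python 3 (where the message reads "could not convert string to float" rather than "invalid literal for float()"), and the variable names are only illustrative.

```python
# One row from the sample file, with the trailing newline readlines() keeps.
line = "512220     0      20       34.4      2.4      0     10010      913        52      0.00\n"

# The stripped line contains no "\r\n", so split("\r\n") returns
# the whole row as a single string.
whole = line.strip().split("\r\n")
print(len(whole))        # 1

# float() then receives the entire row and fails.
try:
    float(whole[0])
    failed = False
except ValueError:
    failed = True
print(failed)            # True

# split() with no argument splits on any run of whitespace.
fields = line.split()
print(len(fields))       # 10
print(float(fields[3]))  # 34.4
```

In other words, the separator never matched, so float() was handed the full row instead of a single field.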

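Many of the elements look like integers, so I would not recommend converting every field to float. Instead, I would pick out the individual columns and convert each one.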

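I don't know your field names, so I made some up. Here is some code that loads this file into a list of dictionaries, with the fields converted to the appropriate types: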

from pprint import pprint

fields = [ 
    ('id', int),
    ('n', int),
    ('s', int),
    ('a', float),
    ('b', float),
    ('z', int),
    ('n2', int),
    ('top', float),
    ('x', int),
    ('bottom', float),
]

def read_data(path):
    with open(path, 'r') as f:
        for line in f:
            data = line.split()

            res = {}
            for n, field in enumerate(fields):
                name, _type = field
                res[name] = _type(data[n])
            yield res 

pprint(list(read_data('data.txt')))

Output:

[{'a': 34.4,
  'b': 2.4,
  'bottom': 0.0,
  'id': 512220,
  'n': 0,
  'n2': 10010,
  's': 20,
  'top': 913.0,
  'x': 52,
  'z': 0},
 {'a': 34.6,
  'b': 2.3,
  'bottom': 0.0,
  'id': 512221,
  'n': 1,
  'n2': 10230,
  's': 30,
  'top': 910.3,
  'x': 54,
  'z': 0},
 {'a': 34.8,
  'b': 2.1,
  'bottom': 0.0,
  'id': 512222,
  'n': 2,
  'n2': 10020,
  's': 50,
  'top': 932.0,
  'x': 56,
  'z': 0},
 {'a': 35.4,
  'b': 2.5,
  'bottom': 0.0,
  'id': 512223,
  'n': 3,
  'n2': 10340,
  's': 60,
  'top': 945.5,
  'x': 58,
  'z': 0}]
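
If the file may contain malformed rows, one possible variation is to report the offending line instead of letting the ValueError stop the whole read. This is only a sketch for Python 3: parse_lines and FIELDS are hypothetical names, the field layout reuses the made-up names from above, and the second sample row is deliberately corrupted for illustration.

```python
import io

# Hypothetical field layout, mirroring the made-up names used above.
FIELDS = [('id', int), ('n', int), ('s', int), ('a', float), ('b', float),
          ('z', int), ('n2', int), ('top', float), ('x', int), ('bottom', float)]

def parse_lines(lines):
    """Yield (record, error) pairs; exactly one of the two is None."""
    for lineno, line in enumerate(lines, start=1):
        data = line.split()
        if not data:
            continue  # skip blank lines
        if len(data) != len(FIELDS):
            yield None, "line %d: expected %d fields, got %d" % (
                lineno, len(FIELDS), len(data))
            continue
        try:
            record = {name: convert(value)
                      for (name, convert), value in zip(FIELDS, data)}
        except ValueError as exc:
            # Keep the line number so the bad row is easy to find.
            yield None, "line %d: %s" % (lineno, exc)
        else:
            yield record, None

# In-memory stand-in for the data file; the second row has a bad field.
sample = io.StringIO(
    "512220     0      20       34.4      2.4      0     10010      913        52      0.00\n"
    "512221     1      oops     34.6      2.3      0     10230      910.3      54      0.00\n"
)
results = list(parse_lines(sample))
for record, error in results:
    print(record if error is None else error)
```

Passing an open file object instead of the StringIO works the same way, since both yield one line per iteration.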

Answer 2

'' (the empty string) is not a valid value for float().

Try data_temporary = line.split() and see if that works.

Alternatively, use a list comprehension:

values = [float(item) for item in line.split() if item]
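
As a rough illustration of the difference (a sketch assuming Python 3, with a shortened sample row and illustrative variable names), splitting on a single space produces the empty strings that float() rejects, while split() with no argument does not:

```python
line = "512220     0      20       34.4\n"

# Splitting on a single space leaves an empty string for every extra space.
by_space = line.strip().split(" ")
print("" in by_space)  # True

# float("") raises ValueError; the `if item` filter skips those empty strings.
values_filtered = [float(item) for item in by_space if item]

# split() with no argument never yields empty strings, so no filter is needed.
values_plain = [float(item) for item in line.split()]

print(values_filtered == values_plain)  # True
print(values_plain)                     # [512220.0, 0.0, 20.0, 34.4]
```

With line.split(), the `if item` condition is harmless but redundant; it only matters when an explicit separator such as " " is used.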

How to handle ValueError: invalid literal for float() in Python
