Question

There is a .txt file named "array.txt" that contains the following 5x5 array of two-digit numbers:

+-------------------------+

¦ 34 ¦ 21 ¦ 32 ¦ 41 ¦ 25  ¦

+----+----+----+----+-----¦

¦ 14 ¦ 42 ¦ 43 ¦ 14 ¦ 31  ¦

+----+----+----+----+-----¦

¦ 54 ¦ 45 ¦ 52 ¦ 42 ¦ 23  ¦

+----+----+----+----+-----¦

¦ 33 ¦ 15 ¦ 51 ¦ 31 ¦ 35  ¦

+----+----+----+----+-----¦

¦ 21 ¦ 52 ¦ 33 ¦ 13 ¦ 23  ¦

+-------------------------+

I would like a script that reads this file automatically, so that I don't have to write the following by hand:

array = np.matrix([[34,21,32,41,25],[14,42,43,14,31],[54,45,52,42,23],[33,15,51,31,35],[21,52,33,13,23]])

What I have so far is:

import numpy as np
np.loadtxt('array.txt', skiprows=1)

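This returns the error "ValueError: could not convert string to float: b'\xa6'", so it seems it doesn't like the ASCII characters. Is there a function that reads only the numeric values from a text file into an array? Thanks very much for reading; any help would be hugely appreciated.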

Answer 1

Here is how to do it in a line or two:

import re
import numpy as np

numbers = re.compile(r'\d+')

# list(...) is needed on Python 3, where map() returns a lazy iterator
with open("array.txt", "r") as f:
    array = np.array([list(map(int, numbers.findall(line)))
                      for line in f
                      if numbers.search(line) is not None])
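If every row is known to have the same number of columns, a variant of the same idea is to pull all the digit runs out of the whole text at once and reshape the result. This is only a sketch, assuming Python 3 and a fixed width of 5; `parse_fixed_width` and `sample_text` are names made up for illustration, and the sample holds just the first two rows of the table.

```python
import re
import numpy as np

# Hypothetical stand-in for the contents of array.txt (first two rows only)
sample_text = """+-------------------------+
¦ 34 ¦ 21 ¦ 32 ¦ 41 ¦ 25  ¦
+----+----+----+----+-----¦
¦ 14 ¦ 42 ¦ 43 ¦ 14 ¦ 31  ¦
+-------------------------+
"""

def parse_fixed_width(text, ncols):
    """Collect every run of digits in text and fold them into rows of ncols."""
    values = [int(n) for n in re.findall(r'\d+', text)]
    # reshape raises ValueError if the count is not a multiple of ncols
    return np.array(values).reshape(-1, ncols)

arr = parse_fixed_width(sample_text, ncols=5)
print(arr)
```

The reshape doubles as a sanity check: if a row is missing a value, it fails loudly instead of producing a ragged array.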

Answer 2

You can use a regular expression to extract the numbers from each line:

import re, numpy
myFile = 'array.txt'
with open(myFile, 'r') as content:
    # This extracts all word boundary-delimited numbers from each line
    x = [re.findall(r'\b\d+\b', i) for i in content.readlines()]
# Then you keep only those lines that contained a number, convert the
#  matched strings to int, and turn the resulting list into an array
myArray = numpy.array([[int(n) for n in i] for i in x if len(i) > 0])
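If you would rather keep np.loadtxt, it also accepts a list of strings, so one option is to clean the lines first and hand it only the numeric rows. A rough sketch, assuming Python 3 and a reasonably recent NumPy; `load_boxed_table` and `sample_lines` are hypothetical names, and the separator is assumed to be the broken bar shown in the question.

```python
import numpy as np

def load_boxed_table(lines, sep='¦'):
    """Keep rows that start with sep, blank the separators out,
    and let np.loadtxt parse what is left."""
    cleaned = [line.replace(sep, ' ') for line in lines
               if line.lstrip().startswith(sep)]
    return np.loadtxt(cleaned, dtype=int)

# Hypothetical stand-in for the lines of array.txt (first two rows only)
sample_lines = [
    "+-------------------------+",
    "¦ 34 ¦ 21 ¦ 32 ¦ 41 ¦ 25  ¦",
    "+----+----+----+----+-----¦",
    "¦ 14 ¦ 42 ¦ 43 ¦ 14 ¦ 31  ¦",
    "+-------------------------+",
]

table = load_boxed_table(sample_lines)
print(table)
```

With the real file you would pass the open file object instead, for example `with open('array.txt', encoding='utf-8') as f: table = load_boxed_table(f)`. The UTF-8 encoding there is a guess about how the file was saved.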

Answer 3

Assuming these are the only delimiters you will encounter in the .txt file, here is something that works on Python 2.7, as far as I can tell.

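array = []
for line in open('array.txt', 'r').readlines():
    # Data rows start with the UTF-8 bytes of the bar character ('\xc2\xa6')
    if line.startswith('\xc2'):
        # Drop the second byte and turn the first into a comma separator
        line = line.replace('\xa6','').replace('\xc2',',').split(',')
        # Discard empty fields and the trailing newline, then convert to int
        line[:] = [int(x) for x in line if x not in ['','\n']]
        array.append(line)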

Calling print array afterwards on the sample text file you provided produces the following output:

[[34, 21, 32, 41, 25], [14, 42, 43, 14, 31], [54, 45, 52, 42, 23], [33, 15, 51, 31, 35], [21, 52, 33, 13, 23]]

From here you can adapt it to your numpy needs.
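For what it's worth, a Python 3 take on the same split-on-delimiter idea might look like the sketch below. It assumes the lines are read as text (str) rather than bytes, so the bar can be matched as a single character; `rows_from_lines` and `sample_lines` are hypothetical names used only for this example.

```python
import numpy as np

def rows_from_lines(lines, sep='¦'):
    """Split each data row on sep and convert the non-empty cells to int."""
    rows = []
    for line in lines:
        if not line.startswith(sep):
            continue  # border lines such as +----+ carry no data
        cells = [cell.strip() for cell in line.split(sep)]
        rows.append([int(cell) for cell in cells if cell])
    return rows

# Hypothetical stand-in for the lines of array.txt (first two rows only)
sample_lines = [
    "+-------------------------+\n",
    "¦ 34 ¦ 21 ¦ 32 ¦ 41 ¦ 25  ¦\n",
    "+----+----+----+----+-----¦\n",
    "¦ 14 ¦ 42 ¦ 43 ¦ 14 ¦ 31  ¦\n",
    "+-------------------------+\n",
]

array = np.array(rows_from_lines(sample_lines))
print(array)
```

Wrapping the list of lists in np.array (or np.matrix, as in the question) is the only numpy-specific step.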

Python: loading an array from an ASCII-styled .txt file

3 answers: