Question

I have a file with '#' header lines followed by comma-separated data, i.e.:

#blah
#blah
#blah 
1, 2, 3, 4, 5, 6
7, 8, 9,10,11,12
...

I want to skip the '#' lines and convert the data into an array (to build histograms over the columns). This is how I read it:

import numpy as np

filename = 'input.txt'
listin = []
for line in open(filename,'r'):
    li=line.strip()
    if not li.startswith("#"):
        line = line.partition('#')[0]
        #line = line.rstrip()
        listin.append(line.split(','))

data = np.asarray(listin)
SubData=data[:,0]

hist,nbins = np.histogram(SubData)

Error:
TypeError: cannot perform reduce with flexible type

Answer 1

You need to convert the data to a numeric type. Your array contains strings.

listin.append([int(token) for token in line.split(',')])

Also, you should rstrip each line to remove the trailing newline.
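Putting both suggestions together, the reading loop might look like the sketch below. It is only an illustration: the `io.StringIO` object stands in for `input.txt` so the example is self-contained, the names `sample`, `rows`, and `first_column` are made up here, and it assumes every data value is an integer, as in the sample from the question.

```python
import io
import numpy as np

# Stand-in for the contents of input.txt (sample data from the question).
sample = io.StringIO(
    "#blah\n"
    "#blah\n"
    "#blah\n"
    "1, 2, 3, 4, 5, 6\n"
    "7, 8, 9,10,11,12\n"
)

rows = []
for line in sample:
    line = line.strip()  # drop the newline and surrounding whitespace
    # Skip blank lines and '#' header lines.
    if not line or line.startswith("#"):
        continue
    # Convert every token to int so the array gets a numeric dtype.
    rows.append([int(token) for token in line.split(",")])

data = np.asarray(rows)      # 2 rows, 6 columns, integer dtype
first_column = data[:, 0]    # values 1 and 7
hist, bin_edges = np.histogram(first_column)
```

With a real file, replacing `sample` by `open(filename)` should behave the same way. If the values can be non-integers, `float(token)` would be the safer conversion.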

Answer 2

You can also use numpy's loadtxt, like this:

from numpy import loadtxt
data = loadtxt("input.txt", comments="#", delimiter=",", unpack=True)

This returns the data by column:

array([[  1.,   7.],
       [  2.,   8.],
       [  3.,   9.],
       [  4.,  10.],
       [  5.,  11.],
       [  6.,  12.]])

Use unpack=False to load the file by rows instead.
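To make the difference between the two `unpack` settings concrete, here is a small sketch that loads the sample data both ways and then histograms the first column. The `io.StringIO` objects stand in for `input.txt`, and the names `text`, `by_rows`, `by_cols`, and `first_column` are illustrative only.

```python
import io
import numpy as np

# Stand-in for the contents of input.txt (sample data from the question).
text = "#blah\n#blah\n#blah\n1, 2, 3, 4, 5, 6\n7, 8, 9,10,11,12\n"

# unpack=False (the default): one array row per line of the file.
by_rows = np.loadtxt(io.StringIO(text), comments="#", delimiter=",")

# unpack=True: the result is transposed, one array row per file column.
by_cols = np.loadtxt(io.StringIO(text), comments="#", delimiter=",", unpack=True)

# The first file column is by_cols[0], or equivalently by_rows[:, 0].
first_column = by_cols[0]
hist, bin_edges = np.histogram(first_column)
```

Because `loadtxt` returns floats by default, the array can be passed to `np.histogram` directly, without the manual string-to-number conversion.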

Converting a text file with some header lines into an array for a histogram
