Question

I have a file full of numbers:

010101228522 0 31010 3 3 7 7 43 0 2 4 4 2 2 3 3 20.00 89165.30

01010222852313 3 0 0 7 31027 63 5 2 0 0 3 2 4 12 40.10 94170.20

0101032285242337232323 7 710153 9 22 9 9 9 3 3 4 80.52 88164.20

0101042285252313302330302323197 9 5 15 9 15 15 9 9 110.63 98168.80

01010522852617 7 7 3 7 31330 87 6 3 3 2 3 2 5 15 50.21110170.50

...

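I am trying to read this file, but I'm not sure how to go about it. Whether I use the built-in open function, NumPy's loadtxt, or even convert the result to pandas, the file is read as a single column, i.e. its shape is (364, 1). I want the numbers separated into columns, with the spaces replaced by zeros. Any help would be much appreciated. Note that in some places there are two spaces in a row.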

Answer 1

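If you use str.split() on the column contents (which are strings), each string is turned into a list, with the numbers split at every run of whitespace. You can then loop over the items in that list to build a table. I'm not sure whether this fully answers the question.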

str.split():
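As a minimal sketch of what str.split() does here, the snippet below applies it to the first sample line from the question. The variable names are illustrative only.

```python
# First sample line copied from the question.
line = "010101228522 0 31010 3 3 7 7 43 0 2 4 4 2 2 3 3 20.00 89165.30"

# With no argument, split() cuts at every run of whitespace,
# so two consecutive spaces behave like one.
tokens = line.split()

print(len(tokens))
print(tokens[:3])
```

Note that split() only cuts where whitespace exists. A token such as "31010" appears to hold several adjacent values with no space between them, and those stay fused, so this approach alone may not produce the columns the question asks for.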

Answer 2

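So I finally solved my problem. I had to strip each line and then read every "letter" from it; in my case, that means picking out the individual digits from the stripped line and appending them to an array. Here is the code for my solution: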

import pandas as pd

arr = []  # List of rows; each row is a list of single characters
with open('Kp2001', 'r') as f:
    for line in f:
        cnt = line.strip()  # Strip leading/trailing whitespace, including the newline
        # Split the line into individual characters (in my case the individual digits)
        # so Python does not read the whole line as one string
        arr.append(list(cnt))

df = pd.DataFrame(arr)    # Converting to a DataFrame gives one column per character and keeps the spaces in their respective columns
df2 = df.replace(' ', 0)  # Replace the spaces with whatever you like
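The per-character approach above puts every digit in its own column. If the file is a fixed-width format, which the sample rows suggest but the question does not confirm, one alternative is to slice each line by column widths so that multi-digit fields stay together. The sketch below is an assumption-laden illustration: the helper name split_fixed_width and the widths are hypothetical, and the real layout of the file would have to be looked up.

```python
def split_fixed_width(line, widths):
    """Cut `line` into consecutive fields of the given character widths."""
    fields = []
    start = 0
    for width in widths:
        field = line[start:start + width].strip()
        # A field made only of spaces becomes "0", as the question requests.
        fields.append(field if field else "0")
        start += width
    return fields


# Hypothetical layout for illustration only: three 2-character fields
# followed by one 4-character field. This is NOT the file's real layout.
example_widths = [2, 2, 2, 4]
row = split_fixed_width("01 7  12.5", example_widths)
print(row)  # ['01', '7', '0', '12.5']
```

Applying the same function to every stripped-of-newline line of the file and passing the resulting list of rows to pd.DataFrame would give one column per field rather than one per character.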

Reading a file that is detected as a single column
