Question

I have a dataset in text format and would like to analyze it with Python. The data looks like this:

1 1 -0.0007 -0.0004 100.0 518.67 641.82 1589.70 1400.60 14.62 21.61 554.36 2388.06 9046.19 1.30 47.47 521.66 2388.02 8138.62 8.4195 0.03 392 2388 100.00 39.06 23.4190  
1 2 0.0019 -0.0003 100.0 518.67 642.15 1591.82 1403.14 14.62 21.61 553.75 2388.04 9044.07 1.30 47.49 522.28 2388.07 8131.49 8.4318 0.03 392

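I would like to read this data the way I would read a CSV file in Python, with proper column names (which I will define myself). Any ideas on how to do this would be greatly appreciated.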

Answer 1

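You can try reading the file line by line, splitting each line on whitespace, and associating each element of the resulting list with a column name.

You can start with something like this: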

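with open('filename') as f:
    lines = f.readlines()

for line in lines:
    # split() with no argument splits on any run of whitespace
    # and drops the trailing newline
    fields = line.split()
    for el in fields:
        pass  # do stuff with each value here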
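
To make the "associate each element with a column name" step concrete, here is a minimal sketch. The column names and the `parse_line` helper are hypothetical, and the in-memory `sample` stands in for the real file (it holds only the first four fields of the two rows from the question):

```python
import io

# Hypothetical column names; replace them with your own definitions.
column_names = ["unit", "cycle", "setting1", "setting2"]

# Stand-in for the text file: the first four fields of each sample row.
sample = io.StringIO("1 1 -0.0007 -0.0004\n1 2 0.0019 -0.0003\n")


def parse_line(line, names):
    """Split a whitespace-separated line and pair each value with a name."""
    values = [float(v) for v in line.split()]
    return dict(zip(names, values))


# Skip blank lines, then build one dict per row.
rows = [parse_line(line, column_names) for line in sample if line.strip()]
print(rows[1]["setting1"])
```

Note that zip() stops at the shorter of its two inputs, so a row with more values than names is silently truncated; check the lengths first if that matters for your data.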

Answer 2

Read the file and use split() to break each line into columns. You might then load the result into a DataFrame for easier access.

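# Open read-only; the with block closes the file automatically.
with open("myfile.txt", "r") as file1:
    print("Output of Read function is")
    lines = file1.readlines()

print("Col1  Col2  ....")
for line in lines:
    temp = line.split()  # split on any run of whitespace
    if len(temp) >= 2:   # skip blank or too-short lines
        print(temp[0], temp[1])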
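
The suggestion to put the split values into a DataFrame could look roughly like the sketch below. It assumes pandas is installed; the column names are hypothetical, and the in-memory `sample` stands in for myfile.txt with only the first four fields of each row:

```python
import io

import pandas as pd

# Stand-in for myfile.txt: the first four fields of each sample row.
sample = io.StringIO("1 1 -0.0007 -0.0004\n1 2 0.0019 -0.0003\n")

# One list of string fields per non-blank line.
records = [line.split() for line in sample if line.strip()]

# Hypothetical column names; split() yields strings, so convert to float.
df = pd.DataFrame(records, columns=["unit", "cycle", "setting1", "setting2"])
df = df.astype(float)
print(df)
```

Once the data is in a DataFrame, a column can be accessed by name, for example df["setting1"].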

Answer 3

Try read_csv from pandas (documentation) and define the delimiter as a space, like this:

import pandas
data = pandas.read_csv('filename', delimiter=' ')  # 'filename' is a placeholder for the path to your file
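
One caveat: the first sample row in the question ends with trailing spaces, and a single-space delimiter counts every additional space as an empty field. A possible workaround, sketched here with an in-memory stand-in for the file (only the first four fields of each row), is to pass a whitespace regular expression as the separator:

```python
import io

import pandas as pd

# Stand-in for the file: the first row ends with two trailing spaces,
# as the first sample row in the question does.
sample = "1 1 -0.0007 -0.0004  \n1 2 0.0019 -0.0003\n"

# sep=r"\s+" treats any run of whitespace as a single separator.
data = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None)
print(data.shape)
```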

Answer 4

Assuming the text file is temp.txt and the columns are separated by spaces, you can read such a file with the read_csv function:

import pandas as pd
names = ['col1', 'col2', 'col3'] # define your column names here; the length of the list should match the actual number of columns you load into the dataframe
df = pd.read_csv('temp.txt', delimiter=' ', header=None, names=names)  # use header=None to indicate that your file has no header
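
For the data in the question, the first sample row has 26 values, so the names list needs 26 entries. The sketch below generates placeholder names (the unit/cycle/setting/sensor labels are my assumption, not something stated in the question) and reads that row from an in-memory string instead of temp.txt. It uses sep=r"\s+" rather than a single space because the row ends with trailing spaces, which a single-space delimiter would count as extra empty fields:

```python
import io

import pandas as pd

# First sample row from the question, trailing spaces included.
sample = io.StringIO(
    "1 1 -0.0007 -0.0004 100.0 518.67 641.82 1589.70 1400.60 14.62 21.61 "
    "554.36 2388.06 9046.19 1.30 47.47 521.66 2388.02 8138.62 8.4195 0.03 "
    "392 2388 100.00 39.06 23.4190  \n"
)

# Hypothetical names: 2 identifiers + 3 settings + 21 sensors = 26 columns.
names = (
    ["unit", "cycle"]
    + [f"setting{i}" for i in range(1, 4)]
    + [f"sensor{i}" for i in range(1, 22)]
)

df = pd.read_csv(sample, sep=r"\s+", header=None, names=names)
print(df.shape)
```

Comparing len(names) with df.shape[1] after loading is a quick way to confirm that the names line up with the columns.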

How do I read data from a text file in Python?
