I am trying to read from a *.dat file in Python, and the data in my *.dat file looks like this:
1 275
2 264
3 256 275
4 194
5 38 218
6 98
7 10 255
8 157 186
9 210 261
10 141
11 45 130
It basically has three columns, so is there a way to read them line by line or column by column and store them in three different arrays?
Answer 0 (score: 0)
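Before answering, I would like to point out that this is not a question you should be asking on Stack Overflow. It can be answered with a simple Google search and a quick look at the documentation of any machine learning library, or at a simple tutorial on a site like Medium. Stack Overflow is where you should go once you have exhausted all of the most basic avenues (for example, searching Google or reading a library's documentation). In the long run, the habit of exhausting every possible avenue before asking other programmers will help you become a better programmer.
Personally, what I would do is convert the file into a data-friendly format, such as .csv or .xlsx, and then use the widely used data analysis library pandas to parse the data.
For example (the file is named "data.csv"):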
Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigree,Age,Outcome
10,148,84,48,237,37.6,1.001,51,1
3,173,82,48,465,38.4,2.137,25,1
7,129,68,49,125,38.5,0.439,43,1
3,129,92,49,155,36.4,0.968,32,1
1,172,68,49,579,42.4,0.702,28,1
1,71,78,50,45,33.2,0.422,21,0
2,112,78,50,140,39.4,0.175,24,0
1,136,74,50,204,37.4,0.399,24,0
1,122,90,51,220,49.7,0.325,31,1
2,100,70,52,57,40.5,0.677,25,0
1,86,66,52,65,41.3,0.917,29,0
0,162,76,56,100,53.2,0.759,25,1
0,100,88,60,110,46.8,0.962,31,0
0,180,78,63,14,59.4,2.42,25,1
The code to parse the file and put a column into an array would then be:
import pandas as pd
#gets the path - tells the code where to look for said piece of data
path = r"data.csv"
#creates a dataframe from the information in the path (your .csv file)
df = pd.read_csv(path)
#extract the Glucose column; selecting a single column returns a pandas Series
glu_df = df["Glucose"]
#take the values from the Series and store them in a NumPy array
glu_list = glu_df.values
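
For the file in the question itself, where some rows have two values and others three, the conversion step is not strictly needed. Here is a minimal sketch using only the standard library, assuming the values are whitespace-separated integers. The function name `read_columns`, the choice of `None` for a missing third value, and the file name `data.dat` in the comment are my own assumptions. The sample rows are the first five from the question.

```python
import io


def read_columns(lines, n_cols=3):
    """Split whitespace-separated rows into n_cols lists, padding short rows with None."""
    columns = [[] for _ in range(n_cols)]
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for i in range(n_cols):
            # rows with fewer than n_cols values get None in the missing slots;
            # any values beyond n_cols are ignored
            columns[i].append(int(fields[i]) if i < len(fields) else None)
    return columns


# First five rows from the question. For a real file, pass an open file object
# instead, e.g. `with open("data.dat") as f:` (the name "data.dat" is an assumption).
sample = io.StringIO("1 275\n2 264\n3 256 275\n4 194\n5 38 218\n")
first, second, third = read_columns(sample)
print(first)
print(second)
print(third)
```

Each list has one entry per row, so the three lists stay aligned by index. If you prefer pandas, `pd.read_csv` with `sep=r"\s+"`, `header=None`, and three explicit column `names` should give a similar result, with NaN in place of `None`.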