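I need to read multiple CSV files in which the data I'm interested in starts only after n rows. This number n is not constant; it varies from one CSV to the next (so I can't use a fixed skiprows value).
The CSV format looks like this: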
Test: Rate1, "2" , units
specimen: Rectangular, "3", units
Time, Estimate, Load
(s) , (units) , (N)
"1","2","4"
"5","8","12"
Another CSV might be:
Test: Rate1, "2" , units
specimen: Rectangular, "3" , units
value_based : Sample7, "9" , product
Test_condition: controlled, "0" , test
Time, Estimate, Load
(s) , (units) , (N)
"12","6","8"
"18","3","2"
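However, the only columns I'm interested in are Time, Estimate and Load.
I want to do the following:
Use the data under the headers Time, Estimate and Load.
Skip the first row of values ((s) , (units) , (N)), because I want to join those units onto the headers and rename the columns to Time(s), Estimate(units), Load(N).
This is what I have tried: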
with open(file,"r+",newline="") as csvFile:
    dictReader = csv.DictReader(csvFile)
    for row in dictReader:
        print(row["Time"], row["Load"], row["Extension"])

df = pd.read_csv(file,usecols=["Time","Load","Extension"])
print(df["Time"].head(3))
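Please suggest how I should proceed to get the data with the expected headers. Thanks in advance.
Answer 0 (score: 2)
I don't think pandas can determine the correct starting row on its own, but a little preparatory work can find it. For example: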
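import csv
import pandas as pd

filename = 'test.csv'
header_row = ["Time", "Estimate", "Load"]

with open(filename, newline='') as f_csv:
    # start=2 counts the header line and the units line below it,
    # so row_number ends up as the number of lines before the data.
    for row_number, row in enumerate(csv.reader(f_csv), start=2):
        # Strip cells so "Time, Estimate, Load" (with spaces) still matches.
        if [cell.strip() for cell in row] == header_row:
            break
    else:
        raise ValueError(f"header row {header_row} not found in {filename}")

df = pd.read_csv(filename, skiprows=row_number, names=header_row)
print(df)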
This gives:
   Time  Estimate  Load
0     1         2     4
1     5         8    12
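The question also asks for the units row to be joined onto the headers (Time(s), Estimate(units), Load(N)). One possible way to do that, as a minimal sketch rather than a definitive implementation, is to keep reading from the same csv.reader after the header is found. The helper name read_measurements is hypothetical, and the sketch assumes the units row always sits directly below the header row and that every data value is numeric.

```python
import csv

import pandas as pd

WANTED = ["Time", "Estimate", "Load"]  # header cells named in the question


def read_measurements(source, wanted=WANTED):
    """Return a DataFrame with columns like 'Time(s)' built from header + units.

    `source` is an open text file or any other iterable of lines.
    This helper is illustrative only; it is not part of pandas.
    """
    reader = csv.reader(source)

    # Consume the preamble until the header row appears.
    for row in reader:
        if [cell.strip() for cell in row] == wanted:
            break
    else:
        raise ValueError(f"header row {wanted} not found")

    # Assumption: the line directly below the header holds the units.
    units = [cell.strip() for cell in next(reader)]
    names = [name + unit for name, unit in zip(wanted, units)]

    # Everything still left in the reader is data; skip blank lines.
    rows = [[cell.strip() for cell in row] for row in reader if row]
    df = pd.DataFrame(rows, columns=names)

    # csv.reader yields strings, so convert each column to numbers.
    return df.apply(pd.to_numeric)
```

To process several files, each one can be opened with open(path, newline='') and passed to read_measurements, and the resulting frames combined with pd.concat if a single table is wanted.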