Question

I have a CSV file named results_date.csv that looks like this:

Header1 ; Header2 ; Header3
string1=0,1 ; string2=0,2 ; string3=0,3
string4=0,4 ; string5=0,5 ; string6=0,6
..............................
stringX=0,x ; stringY=0,y ; stringZ=0,z
some other big string at the end

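I want to parse this file and save the text of column 1 and column 2 into two arrays, but without the equals sign, the number that follows it, the header row, and the other big string at the end. For example: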

array_for_column1 = ["string1", "string4", ..., "stringX"]
array_for_column2 = ["string2", "string5", ..., "stringY"]

These arrays will be used for a confusion matrix.

I tried this:

#!/usr/bin/python3

import csv

data = csv.reader(open('results_date.csv', 'r'), delimiter=";", quotechar='|')

column1 = []
column2 = []

for row in data:
    column1.append(row[0])
    column2.append(row[1])

print (column1)
print (column2)

But it doesn't work. This code prints only column 1 and raises an error for column 2.

Thanks in advance for your help!

Answer 1

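The last line probably does not contain a ;, so the reader returns a single-element row for it and row[1] raises an IndexError. You can do the following: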

import csv

with open('results_date.csv', 'r') as f:
    data = csv.reader(f, delimiter=";", quotechar='|')
    next(data)  # skip headers

    # transpose rows to columns, keeping only rows that have at least 2 cells
    columns = list(zip(*(row for row in data if len(row) >= 2)))[:2]

    # process cells: split on '=', take the first part, strip whitespace
    column1, column2 = [[s.split('=')[0].strip() for s in c] for c in columns]
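For comparison, here is a minimal sketch of the same idea written as an explicit loop, which some readers may find easier to follow and which also returns two empty lists when the file has no data rows. The function name parse_labels and the in-memory sample are hypothetical, introduced only for illustration; the sample mirrors the layout shown in the question.

```python
import csv
import io


def parse_labels(lines):
    """Return the labels of columns 1 and 2 as two lists.

    `lines` is any iterable of text lines (an open file works too).
    The header row and any row with fewer than 2 cells are skipped.
    """
    reader = csv.reader(lines, delimiter=";")
    next(reader, None)  # skip the header row; no error if the input is empty

    column1, column2 = [], []
    for row in reader:
        if len(row) < 2:
            # e.g. the trailing "some other big string at the end" line
            continue
        # keep only the text before '=', without surrounding spaces
        column1.append(row[0].split("=")[0].strip())
        column2.append(row[1].split("=")[0].strip())
    return column1, column2


# Hypothetical sample data in the same shape as the file in the question.
sample = io.StringIO(
    "Header1 ; Header2 ; Header3\n"
    "string1=0,1 ; string2=0,2 ; string3=0,3\n"
    "string4=0,4 ; string5=0,5 ; string6=0,6\n"
    "some other big string at the end\n"
)
column1, column2 = parse_labels(sample)
print(column1)
print(column2)
```

To run it on the real file, pass the open file object instead of the sample, for example: with open('results_date.csv', newline='') as f: column1, column2 = parse_labels(f)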
