Question

Say I have a CSV file:

Col1,Col2,Col3,Col4
1,2,3,4
1,2,3,4
1,2,3,4
1,2,3,4

I want to collect all the values in a column into an array, do something with it, and then move on to the next column.

So:

# loop through data
Col1 = [1,1,1,1]
# do something
Col2 = [2,2,2,2]
# do something
Col3 = [3,3,3,3]
# do something
Col4 = [4,4,4,4]

The problem with using

data = csv.reader(input_file)
lst = []

for row in data:
    lst.append(row[0])
    # do something with lst

is that I can only do this for the first column.

Answer 1

See this post by Ben Southgate: Extract csv file specific columns to list in Python

import csv

# open the file; newline='' lets the csv module handle line endings
# (the old 'rU' mode was removed in Python 3.11)
with open('test.csv', newline='') as infile:
  # read the file as a dictionary for each row ({header : value})
  reader = csv.DictReader(infile)
  data = {}
  for row in reader:
    for header, value in row.items():
      try:
        data[header].append(value)
      except KeyError:
        data[header] = [value]

His code builds a dictionary that maps each column header to a list of that column's values. You can then access a column like this:

Col1 = data['Col1']

I would have posted this link as a comment, but I don't have enough reputation to comment yet.
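One thing to keep in mind is that csv.DictReader yields every value as a string, so numeric columns need converting before any arithmetic. Here is a minimal sketch of the same idea, assuming the sample data from the question (inlined with io.StringIO so it runs without a file) and using sum() purely as a placeholder for "do something":

```python
import csv
import io

# Sample data from the question, inlined so the sketch runs without a file.
sample = io.StringIO(
    "Col1,Col2,Col3,Col4\n"
    "1,2,3,4\n"
    "1,2,3,4\n"
    "1,2,3,4\n"
    "1,2,3,4\n"
)

data = {}
for row in csv.DictReader(sample):
    for header, value in row.items():
        # DictReader yields strings; int() assumes every column is numeric.
        data.setdefault(header, []).append(int(value))

# Process one column at a time; sum() is only a stand-in operation.
totals = {name: sum(values) for name, values in data.items()}
print(totals)
```

Using setdefault is just a compact alternative to the try/except KeyError pattern; either works.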

Answer 2

I would use numpy to read the whole CSV at once; you can then work with the columns as arrays like this:

import numpy as np
my_data = np.genfromtxt('test.csv', delimiter=',')
for column in my_data.T:
  print(column)

Which gives:

[ 1.  1.  1.  1.]
[ 2.  2.  2.  2.]
[ 3.  3.  3.  3.]
[ 4.  4.  4.  4.]

For a CSV file like this (no header row):

1,2,3,4
1,2,3,4
1,2,3,4
1,2,3,4
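The file in the question does have a header row, which genfromtxt would otherwise try to parse as numbers and turn into NaN values. A small sketch of one way to handle that, assuming the question's sample data (inlined with io.StringIO) and using the skip_header parameter to drop the first line:

```python
import io

import numpy as np

# Sample data from the question, including its header row.
sample = io.StringIO(
    "Col1,Col2,Col3,Col4\n"
    "1,2,3,4\n"
    "1,2,3,4\n"
    "1,2,3,4\n"
    "1,2,3,4\n"
)

# skip_header=1 drops the header line so it is not parsed as NaN values.
my_data = np.genfromtxt(sample, delimiter=",", skip_header=1)

# Transposing turns rows into columns, so each iteration yields one column.
for column in my_data.T:
    print(column.sum())  # sum() is only a stand-in operation
```

The values come back as floats by default; pass dtype=int to genfromtxt if integers are needed.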

Answer 3

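It looks like you can read the file into a list of lists. If so, take a look at the zip function. It takes the lists as arguments and groups their first elements into one tuple, their second elements into another tuple, and so on.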

>>> data = [[1,2,3,4],[1,2,3,4],[1,2,3,4]]
>>> transposed = list(zip(*data))  # list() is needed on Python 3, where zip returns an iterator
>>> transposed
[(1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4)]

As pointed out, numpy can do this (and much more!), but it is an additional package that is not included with Python.
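To connect this to the CSV case, here is a rough sketch that reads the rows with csv.reader and transposes them with zip, using only the standard library. It assumes the sample data from the question (inlined with io.StringIO) and that every value is an integer:

```python
import csv
import io

# Sample data from the question, inlined so the sketch runs without a file.
sample = io.StringIO(
    "Col1,Col2,Col3,Col4\n"
    "1,2,3,4\n"
    "1,2,3,4\n"
    "1,2,3,4\n"
    "1,2,3,4\n"
)

reader = csv.reader(sample)
header = next(reader)  # first line holds the column names
rows = [[int(value) for value in row] for row in reader]

# zip(*rows) groups the i-th element of every row, i.e. the i-th column.
columns = list(zip(*rows))

for name, column in zip(header, columns):
    print(name, column)
```

Each column comes out as a tuple; wrap it in list() if it needs to be modified in place.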

Answer 4

Read the contents into a dictionary:

import csv
import pprint

with open('cols.csv') as input_file:
    reader = csv.reader(input_file)
    col_names = next(reader)
    data = {name: [] for name in col_names}
    for line in reader:
        for pos, name in enumerate(col_names):
            data[name].append(int(line[pos]))

pprint.pprint(data)

Output:

{'Col1': [1, 1, 1, 1],
 'Col2': [2, 2, 2, 2],
 'Col3': [3, 3, 3, 3],
 'Col4': [4, 4, 4, 4]}

How to add a column's data from a CSV file to an array