What is the pythonic way to read CSV file data as rows of namedtuples?

Asked: 2012-01-25 17:27:59

Tags: python csv namedtuple

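What is the best way to take a data file that contains a header row and read that row into a namedtuple, so that the data rows can be accessed by header name?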

I am trying something like this:

import csv
from collections import namedtuple

with open('data_file.txt', mode="r") as infile:
    reader = csv.reader(infile)
    Data = namedtuple("Data", ", ".join(i for i in reader[0]))
    next(reader)
    for row in reader:
        data = Data(*row)

The reader object is not subscriptable, so the code above raises a TypeError. What is the pythonic way to read a file header into a namedtuple?

3 Answers:

Answer 0: (score: 37)

Use:

Data = namedtuple("Data", next(reader))

and omit the line:

next(reader)

Combining this with an iterative version based on martineau's comment below, the example becomes, for Python 2:

import csv
from collections import namedtuple
from itertools import imap

with open("data_file.txt", mode="rb") as infile:
    reader = csv.reader(infile)
    Data = namedtuple("Data", next(reader))  # get names from column headers
    for data in imap(Data._make, reader):
        print data.foo
        # ...further processing of a line...

and for Python 3:

import csv
from collections import namedtuple

with open("data_file.txt", newline="") as infile:
    reader = csv.reader(infile)
    Data = namedtuple("Data", next(reader))  # get names from column headers
    for data in map(Data._make, reader):
        print(data.foo)
        # ...further processing of a line...
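
For anyone who wants to try the Python 3 version without a file on disk, here is a minimal self-contained sketch. The sample data, the column names foo and bar, and the helper name read_namedtuples are assumptions made for illustration; io.StringIO stands in for the opened file.

```python
import csv
import io
from collections import namedtuple

# Hypothetical sample data standing in for data_file.txt.
sample = "foo,bar\n1,2\n3,4\n"

def read_namedtuples(fileobj):
    """Yield one namedtuple per data row, with fields named after the header row."""
    reader = csv.reader(fileobj)
    Data = namedtuple("Data", next(reader))  # next() consumes the header row
    for data in map(Data._make, reader):     # remaining rows are data only
        yield data

rows = list(read_namedtuples(io.StringIO(sample)))
print(rows[0].foo, rows[1].bar)
```

Note that csv.reader does no type conversion, so every field value is a string.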

Answer 1: (score: 22)

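Take a look at csv.DictReader. It does what you are looking for: it takes the column names from the first row, and after that each row is a dictionary, so you can access every column by name.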

If for some reason you still need to access the rows as collections.namedtuple instances, it is easy to convert the dictionaries into namedtuples, like this:

import collections
import csv

with open('data_file.txt') as infile:
    reader = csv.DictReader(infile)
    # reader.fieldnames reads the header row on first access
    Data = collections.namedtuple('Data', reader.fieldnames)
    tuples = [Data(**row) for row in reader]
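
One practical difference from the plain csv.reader approach shows up with short rows. The sketch below uses made-up sample data (columns foo and bar, fed through io.StringIO instead of a file) in which the second data row is missing a value. csv.DictReader fills the missing field with None by default, whereas Data._make on a short row would raise a TypeError.

```python
import collections
import csv
import io

# Hypothetical sample; the second data row is missing its "bar" value.
sample = "foo,bar\n1,2\n3\n"

reader = csv.DictReader(io.StringIO(sample))
Data = collections.namedtuple("Data", reader.fieldnames)
tuples = [Data(**row) for row in reader]

# The short row still converts; its missing field is None.
print(tuples[1])
```

Rows that are longer than the header are a different matter: DictReader stores the extra values under the key None, which Data(**row) cannot accept, so this conversion assumes no row has more fields than the header.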

Answer 2: (score: 0)

I suggest this approach:

import csv
from collections import namedtuple

with open("data.csv", 'r') as f:
    reader = csv.reader(f, delimiter=',')
    Row = namedtuple('Row', next(reader))
    rows = [Row(*line) for line in reader]

If you use Pandas, the solution becomes even more elegant:

import pandas as pd
from collections import namedtuple

data = pd.read_csv("data.csv")
Row = namedtuple('Row', data.columns)
rows = [Row(*row) for index, row in data.iterrows()]

In both cases, you can interact with the records by field name:

for row in rows:
    print(row.foo)
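
All of the approaches above assume that the header names are valid Python identifiers. If a header contains spaces or a keyword, namedtuple raises a ValueError. A possible workaround, sketched here with a made-up header, is to pass rename=True, which replaces each invalid name with a positional one (an underscore followed by the field's index):

```python
import csv
import io
from collections import namedtuple

# Hypothetical header containing a space and a Python keyword.
sample = "id,first name,class\n7,Ada,math\n"

reader = csv.reader(io.StringIO(sample))
# rename=True swaps invalid field names for _1, _2, ... by position.
Row = namedtuple("Row", next(reader), rename=True)
rows = [Row(*line) for line in reader]

print(Row._fields)
print(rows[0].id, rows[0]._1)
```

The trade-off is that renamed columns lose their descriptive names, so cleaning up the header strings yourself may be preferable when the names matter.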