['Date,Open,High,Low,Close,Volume,Adj Close',
'2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69',
'2014-02-11,1180.17,1191.87,1172.21,1190.18,2050800,1190.18',
'2014-02-10,1171.80,1182.40,1169.02,1172.93,1945200,1172.93',
'2014-02-07,1167.63,1177.90,1160.56,1177.44,2636200,1177.44',
'2014-02-06,1151.13,1160.16,1147.55,1159.96,1946600,1159.96',
'2014-02-05,1143.38,1150.77,1128.02,1143.20,2394500,1143.20',
'2014-02-04,1137.99,1155.00,1137.01,1138.16,2811900,1138.16',
'2014-02-03,1179.20,1181.72,1132.01,1133.43,4569100,1133.43']
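I need to create a named tuple for each row in this list of lines. The fields would be the words in the first line, 'Date,Open,High,Low,Close,Volume,Adj Close'. I will then do some calculations and need to add 2 more fields at the end of each namedtuple. Any help on how to do this?
Answer 0 (score: 2)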
from collections import namedtuple
data = ['Date,Open,High,Low,Close,Volume,Adj Close',
'2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69',
'2014-02-11,1180.17,1191.87,1172.21,1190.18,2050800,1190.18',
'2014-02-10,1171.80,1182.40,1169.02,1172.93,1945200,1172.93',
'2014-02-07,1167.63,1177.90,1160.56,1177.44,2636200,1177.44',
'2014-02-06,1151.13,1160.16,1147.55,1159.96,1946600,1159.96',
'2014-02-05,1143.38,1150.77,1128.02,1143.20,2394500,1143.20',
'2014-02-04,1137.99,1155.00,1137.01,1138.16,2811900,1138.16',
'2014-02-03,1179.20,1181.72,1132.01,1133.43,4569100,1133.43']
def convert_to_named_tuples(data):
    # get the names for the named tuple
    field_names = data[0].split(",")
    # these are your two extra custom fields
    field_names.append("extra1")
    field_names.append("extra2")
    # field names can't have spaces in them (they have to be valid Python
    # identifiers, and "Adj Close" isn't)
    field_names = [field_name.replace(" ", "_") for field_name in field_names]
    # you can do this as many times as you like...
    # personally I'd do it manually once at the start and just check you're
    # getting the field names you expect here
    ShareData = namedtuple("ShareData", field_names)
    # unpack the data into the named tuples
    share_data_list = []
    for row in data[1:]:
        fields = row.split(",")
        # placeholders for the two extra fields
        fields += [None, None]
        share_data = ShareData(*fields)
        share_data_list.append(share_data)
    return share_data_list

# check it works
share_data_list = convert_to_named_tuples(data)
for share_data in share_data_list:
    print(share_data)
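The code above leaves the two extra fields as None. Because namedtuples are immutable, one way to fill them in after the calculations is `_replace`, which returns a new tuple. A minimal sketch, assuming the two extra values are a daily range and a daily change (both calculations are illustrative, not from the question):

```python
from collections import namedtuple

# Record type mirroring the answer above; "extra1" and "extra2" are the
# placeholder fields.
ShareData = namedtuple(
    "ShareData",
    ["Date", "Open", "High", "Low", "Close", "Volume", "Adj_Close",
     "extra1", "extra2"],
)

row = "2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69"
share_data = ShareData(*(row.split(",") + [None, None]))

# Hypothetical calculations, chosen only for illustration.
daily_range = float(share_data.High) - float(share_data.Low)
daily_change = float(share_data.Close) - float(share_data.Open)

# _replace builds a new namedtuple with the given fields changed;
# the original share_data is left untouched.
filled = share_data._replace(extra1=daily_range, extra2=daily_change)

print(filled)
```

Since `_replace` creates a new object each time, in a loop you would typically build a new list of the filled tuples rather than trying to modify the old ones.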
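Actually, I think the following version is better, since it converts the fields to the correct types. On the downside, it won't take arbitrary data...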
from collections import namedtuple
from datetime import datetime
# data = [...same as before...]
field_names = ["Date","Open","High","Low","Close","Volume", "AdjClose", "Extra1", "Extra2"]
ShareData = namedtuple("ShareData", field_names)
def convert_to_named_tuples(data):
    share_data_list = []
    for row in data[1:]:
        row = row.split(",")
        fields = (datetime.strptime(row[0], "%Y-%m-%d"),  # date
                  float(row[1]), float(row[2]),           # open, high
                  float(row[3]), float(row[4]),           # low, close
                  int(row[5]),                            # volume
                  float(row[6]),                          # adj close
                  None, None)                             # extras
        share_data = ShareData(*fields)
        share_data_list.append(share_data)
    return share_data_list

# test
share_data_list = convert_to_named_tuples(data)
for share_data in share_data_list:
    print(share_data)
But I agree with the other posts: why use a namedtuple when you could use a class definition?
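The answer does not say what such a class would look like. One possible sketch uses a dataclass (Python 3.7+), which is my own choice here and not something the answer specifies; the field names and the `parse_row` helper are hypothetical:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ShareData:
    date: datetime
    open: float
    high: float
    low: float
    close: float
    volume: int
    adj_close: float
    # Hypothetical extra fields, filled in once the calculations are done.
    extra1: Optional[float] = None
    extra2: Optional[float] = None


def parse_row(row):
    """Convert one comma-separated line into a ShareData instance."""
    parts = row.split(",")
    return ShareData(
        datetime.strptime(parts[0], "%Y-%m-%d"),
        float(parts[1]), float(parts[2]), float(parts[3]), float(parts[4]),
        int(parts[5]),
        float(parts[6]),
    )


record = parse_row("2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69")
# Unlike a namedtuple, the instance is mutable, so the extra field can be
# assigned in place.
record.extra1 = record.high - record.low
print(record)
```

The trade-off is that instances are mutable and no longer behave like tuples, so they cannot be unpacked or indexed positionally.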
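Answer 1 (score: 1)
Is there any special reason you want to use namedtuples? If you want to add fields later, you should probably use a dictionary. If you really do want to go the namedtuple way, you can use placeholders, like this: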
from collections import namedtuple
field_names = data[0].replace(" ", "_").lower().split(",")
field_names += ['placeholder_1', 'placeholder_2']
Entry = namedtuple('Entry', field_names)
list_of_named_tuples = []
mock_data = [None, None]
for row in data[1:]:
    row_data = row.split(",") + mock_data
    list_of_named_tuples.append(Entry(*row_data))
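As a variation on the placeholder approach, on Python 3.7 or later `namedtuple` accepts a `defaults` argument, so the None values do not have to be appended to every row. A minimal sketch, using a shortened copy of the question's data:

```python
from collections import namedtuple

data = [
    "Date,Open,High,Low,Close,Volume,Adj Close",
    "2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69",
    "2014-02-11,1180.17,1191.87,1172.21,1190.18,2050800,1190.18",
]

field_names = data[0].replace(" ", "_").lower().split(",")
field_names += ["placeholder_1", "placeholder_2"]

# defaults applies to the rightmost fields, so the two placeholders
# default to None and can be omitted when building each Entry.
Entry = namedtuple("Entry", field_names, defaults=(None, None))

entries = [Entry(*row.split(",")) for row in data[1:]]
print(entries[0])
```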
If instead you want to parse the data into a list of dictionaries (more pythonic, IMO), you could do this:
field_names = data[0].split(",")
list_of_dicts = [dict(zip(field_names, row.split(','))) for row in data[1:]]
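Edit: note that although you can use dictionaries instead of named tuples for a small dataset like the one in the example, doing so with a large amount of data will give your program a higher memory footprint.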
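The memory difference can be checked roughly with `sys.getsizeof`. This is only a sketch: the exact numbers depend on the Python implementation and version (CPython is assumed here), and `getsizeof` measures only the container, not the strings it refers to.

```python
import sys
from collections import namedtuple

header = "Date,Open,High,Low,Close,Volume,Adj Close"
row = "2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69"

field_names = header.replace(" ", "_").split(",")
values = row.split(",")

Entry = namedtuple("Entry", field_names)
as_tuple = Entry(*values)
as_dict = dict(zip(field_names, values))

# Both containers refer to the same value strings, so comparing the
# container sizes alone isolates the per-row overhead.
tuple_size = sys.getsizeof(as_tuple)
dict_size = sys.getsizeof(as_dict)
print("namedtuple:", tuple_size, "bytes; dict:", dict_size, "bytes")
```

The per-row difference is small, so it only matters once the number of rows gets large.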
Answer 2 (score: 0)
Why not use dictionaries for the data? Then adding extra keys is easy:
dataList = []
keys = myData[0].split(',')
# skip the header row, which only supplies the keys
for row in myData[1:]:
    tempdict = dict()
    for index, value in enumerate(row.split(',')):
        tempdict[keys[index]] = value
    # if your additional values are going to be determined here, then
    # you can do whatever calculations you need and add them;
    # otherwise you can work with this list elsewhere
    dataList.append(tempdict)
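To make the "add them here" step concrete, here is a minimal sketch that computes two extra values per row and stores them under new keys. The key names "Range" and "Change" and the calculations themselves are assumptions for illustration only:

```python
data = [
    "Date,Open,High,Low,Close,Volume,Adj Close",
    "2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69",
    "2014-02-11,1180.17,1191.87,1172.21,1190.18,2050800,1190.18",
]

keys = data[0].split(",")
records = []
for row in data[1:]:
    record = dict(zip(keys, row.split(",")))
    # Hypothetical extra fields: values are still strings at this point,
    # so convert them before doing arithmetic.
    record["Range"] = float(record["High"]) - float(record["Low"])
    record["Change"] = float(record["Close"]) - float(record["Open"])
    records.append(record)

for record in records:
    print(record["Date"], record["Range"], record["Change"])
```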