I have a spreadsheet containing data like this:
Group,Region,Market
G7,EMEA,Germany
G7,NA,Canada
G7,APAC,Japan
What is the most efficient way to capture this information? I am using a dictionary to store it as {Group: {Region: Market}}.
My code is:
try:
    with open(fileName) as sourceFile:
        for line in sourceFile:
            if not headerRow:
                for group, region, market in [line.rstrip().split(",")]:
                    if group in self.REGIONAL_MARKETS:
                        self.REGIONAL_MARKETS[group].update({int(region):market})
                    else:
                        self.REGIONAL_MARKETS.update({group:{int(region):market}})
            headerRow=False
    return self.REGIONAL_MARKETS
except IOError as e:
    print("Invalid File Name. Message = "%(e))
Thanks for your input.
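Answer 0 (score: 1)
Two things:
1. Your try block is too large (shorter is better, because it means more specific error handling); and
2. You can use collections.defaultdict to simplify building the output data structure.
Try something like: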
from collections import defaultdict

data = defaultdict(dict)

try:
    with open(fileName) as sourceFile:
        header = sourceFile.readline()  # skip header
        lines = sourceFile.readlines()  # get the rest of the data
except IOError as e:
    # the original format string had no %s placeholder, so the % would raise TypeError
    print("Invalid File Name. Message = %s" % e)
else:
    for line in lines:
        # don't iterate over a single-element list
        group, region, market = line.rstrip().split(",")
        data[group].update({region: market})  # how is e.g. 'EMEA' an integer?
On the test data, this gives me:
>>> data
defaultdict(<type 'dict'>, {'G7': {'NA': 'Canada',
'EMEA': 'Germany',
'APAC': 'Japan'}})
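To make it concrete why defaultdict removes the if/else from the question, here is a minimal sketch that builds the same structure both ways. The rows list is just the sample data typed in by hand (an assumption for illustration, not read from a file).

```python
from collections import defaultdict

# Sample rows from the question, typed in by hand for illustration.
rows = [
    ("G7", "EMEA", "Germany"),
    ("G7", "NA", "Canada"),
    ("G7", "APAC", "Japan"),
]

# Plain dict: the inner dict must be created the first time a group appears.
plain = {}
for group, region, market in rows:
    if group not in plain:
        plain[group] = {}
    plain[group][region] = market

# defaultdict(dict): a missing group gets an empty dict on first access.
auto = defaultdict(dict)
for group, region, market in rows:
    auto[group][region] = market

# A defaultdict compares equal to a plain dict holding the same items.
print(plain == auto)
```

Both loops end up with the same nested mapping; the defaultdict version simply drops the membership check.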
Also, take a look at csv.DictReader, which will do some of the file handling for you.
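For example, a rough sketch of the csv.DictReader approach might look like the following. The function name load_regional_markets is made up for illustration, and the sample data is fed in through io.StringIO so the snippet runs without a file on disk.

```python
import csv
import io
from collections import defaultdict


def load_regional_markets(source):
    """Build {group: {region: market}} from an open CSV file object.

    Hypothetical helper; assumes the header row is Group,Region,Market.
    """
    data = defaultdict(dict)
    # DictReader consumes the header row and uses it as the keys of each row.
    for row in csv.DictReader(source):
        data[row["Group"]][row["Region"]] = row["Market"]
    return data


# In-memory stand-in for the spreadsheet from the question.
sample = io.StringIO(
    "Group,Region,Market\n"
    "G7,EMEA,Germany\n"
    "G7,NA,Canada\n"
    "G7,APAC,Japan\n"
)
markets = load_regional_markets(sample)
print(dict(markets))
```

With a real file you would pass in the object returned by open(fileName, newline="") instead of the StringIO, and keep the try/except IOError wrapped around just the open call.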