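I understand the basics of parsing .csv files and putting certain rows into lists and/or dictionaries, but I can't crack this one.
The file starts with 9 lines of general information (see the sample below), followed by a detailed list of products and prices.
What I want to do is get this data into a dictionary and then write it to a MySQL database. Can someone suggest how to start adding items to the dictionary after this "header" (line 9)?
Thanks.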
Bill to Client
Billing ID xxxx-xxxx-xxxx
Invoice number 3359680287
Issue date 1/31/2016
Due Date 3/1/2016
Currency EUR
Invoice subtotal 2,762,358.40
VAT (0%) 0
Amount due 2,762,358.40
Account ID Account Order Purchase Order Product Description Quantity Units Amount
xxx-xxx-xxxx Client - Search, GDN, Youtube Client- Google Search Google AdWords Belgium_GDN_january_(FR) 1 Impressions 0.04
xxx-xxx-xxxx Client - Search, GDN, Youtube Client- Google Search Google AdWords UK_GDN_january 392 Impressions 2.92
xxx-xxx-xxxx Client - Search, GDN, Youtube Client- Google Search Google AdWords Poland_GDN_january 12 Impressions 0.05
xxx-xxx-xxxx Client - Search, GDN, Youtube Client Google AdWords Switzerland Family vacation 251 Clicks 4,718.91
xxx-xxx-xxxx Client - Search, GDN, Youtube Client Google
xxx-xxx-xxxx Client - Search, GDN, Youtube Client Google AdWords Invalid activity -16.46
When I try this code:
import csv
with open('test.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=",")
    for row in readCSV:
        print(row[0])
I get this in the terminal:
Bill to
Billing ID
Invoice number
Issue date
Due Date
Currency
Invoice subtotal
VAT (0%)
Amount due
Traceback (most recent call last):
  File "xlwings_test.py", line 7, in <module>
    print(row[0])
IndexError: list index out of range
Answer 0 (score: 1)
import csv
dict1 = {}  # general invoice information
dict2 = {}  # description -> amount for each line item
# Row indices are 0-based and depend on the exact layout of the file.
with open("test.csv", newline="") as f:
    reader = csv.reader(f, delimiter="\t")
    for i, line in enumerate(reader):
        if not line:
            continue  # skip blank lines, which would raise IndexError below
        if i in [3, 4, 5, 9]:
            # Invoice number, Issue date, Due date or Amount due
            prop_name = line[0]
            prop_val = line[1]
            dict1[prop_name] = prop_val
        elif i > 11:
            # Fetch other information like 'description' and 'amount'
            print("Description: " + line[5])
            print("Amount: " + line[-1])
            dict2[line[5]] = line[-1]
print(dict1)
print(dict2)
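Answer 1 (score: 1)
The simplest solution is to split the specific lines on commas and read the amount and description from the end of the resulting list. You are probably getting the error because the file contains blank lines, which cannot be split into fields. Here is the code: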
general_info = dict()
rest_of_file_list = []
row_counter = 0
# Open in text mode so that each row is a str that can be split on commas.
# A plain split(',') assumes that no field contains a comma itself.
with open('test.csv', 'r') as file:
    for row in file:
        if row_counter == 2:
            # invoice row
            general_info['Invoice number'] = row.split(',')[1].rstrip()
        elif row_counter == 3:
            # issue date row
            general_info['Issue date'] = row.split(',')[1].rstrip()
        elif row_counter == 4:
            # due date row
            general_info['Due date'] = row.split(',')[1].rstrip()
        elif row_counter == 8:
            # amount due row
            general_info['Amount due'] = row.split(',')[1].rstrip()
        elif row_counter > 10:
            # last and 4th item from the end of the list are amount and description
            if row and not row.isspace():
                item = dict()
                lista = row.split(',')
                item['Description'] = lista[-4].rstrip()
                item['Amount'] = lista[-1].rstrip()
                rest_of_file_list.append(item)
        row_counter += 1
print(general_info)
print(rest_of_file_list)
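The blank-line problem can also be handled by letting csv.reader do the splitting, since it returns an empty list for a blank line and keeps quoted values such as "2,762,358.40" in one piece. Below is a minimal sketch, not a definitive implementation. It assumes the file is comma-delimited, that values containing commas are quoted, and that the product table starts at the row whose first cell is "Account ID". The function name split_invoice, its table_marker parameter and the SAMPLE text are made up for illustration.

```python
import csv
import io

# Hypothetical sample; the real file layout is an assumption.
SAMPLE = '''Bill to,Client
Invoice number,3359680287
Amount due,"2,762,358.40"

Account ID,Account,Order,Purchase Order,Product,Description,Quantity,Units,Amount
xxx-xxx-xxxx,"Client - Search, GDN, Youtube",Client,,Google AdWords,UK_GDN_january,392,Impressions,2.92
'''

def split_invoice(lines, table_marker="Account ID"):
    """Return (general_info, items) from an iterable of CSV lines."""
    general_info = {}
    items = []
    columns = None  # set once the table header row has been seen
    for row in csv.reader(lines):
        if not row:
            continue  # csv.reader yields [] for a blank line
        if columns is None:
            if row[0] == table_marker:
                columns = row  # everything after this row is a line item
            elif len(row) >= 2:
                general_info[row[0]] = row[1]
        else:
            items.append(dict(zip(columns, row)))
    return general_info, items

general_info, items = split_invoice(io.StringIO(SAMPLE))
print(general_info)
print(items[0]["Description"], items[0]["Amount"])
```

For a real file, pass the open file object instead of the StringIO, for example split_invoice(open("test.csv", newline="")). Looking for the marker row avoids hard-coding row numbers, which shift as soon as a blank line is added or removed.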
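Answer 2 (score: 0)
I suggest reading the general information separately and then parsing the remaining lines as a string with the csv module. For the first part I create a header_attributes dictionary; the rest is read with a csv.DictReader instance.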
import csv
from io import StringIO
CLIENT_PROPERTY_LINE_COUNT = 10
header_attributes = {}
with open("test.csv", newline="") as f:
    # When reading the file, headers are comma separated in the following format: Property, Value.
    # The if inside the for loop is used to ignore blank lines or lines with only one attribute.
    for i in range(CLIENT_PROPERTY_LINE_COUNT):
        # Split on the first comma only, so a value that contains commas stays in one piece.
        splitted_line = f.readline().split(",", 1)
        if len(splitted_line) == 2:
            property_name, property_value = splitted_line
            stripped_property_name = property_name.strip()
            stripped_property_value = property_value.strip().strip('"')
            header_attributes[stripped_property_name] = stripped_property_value
    print(header_attributes)
    # Read the rest of the file as one string.
    account_data = f.read()
# Wrap the string in an in-memory file so that DictReader can parse it.
account_data_memory_file = StringIO(account_data)
account_reader = csv.DictReader(account_data_memory_file)
for account in account_reader:
    print(account['Units'], account['Amount'])