Writing data to Excel with a Python for loop

Asked: 2017-06-23 20:23:55

Tags: python database excel

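I'm currently converting the PDFs in a large folder to text and then writing certain keywords out to an Excel file. Everything works, except that even though the folder contains multiple PDFs, they all overwrite each other starting at cell A1.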

How do I iterate so that the next dictionary goes to the subsequent rows?

custData = {}

def data_grabbing(pdf):
    row = 0
    col = 0
    string = convert_pdf_to_txt(pdf)
    lines = list(filter(bool,string.split('\n')))
    for i in range(len(lines)):
        if 'Lead:' in lines[i]:
            custData['Name'] = lines[i+2]
        elif 'Date:Date:Date:Date:' in lines[i]:
            custData['Fund Manager'] = lines[i+2]
        elif 'Priority:' in lines[i]:
            custData['Industry'] = lines[i+2]
            custData['Date'] = lines[i+1]
            custData['Deal Size']= lines [i+3]
        elif 'DEAL QUALIFYING MEMORANDUM' in lines[i]:
            custData['Owner'] = lines[i+2]
        elif 'Fund Manager' in lines[i]:
            custData['Investment Type'] = lines [i+2]
    print custData
    for item, descrip in custData.iteritems():
        worksheet.write(row, col,     item)
        worksheet.write(row+1, col, descrip)
        col += 1
    row +=2


for myFile in os.listdir(directory):
    if myFile.endswith(".pdf"):
        data_grabbing(os.path.join(directory, myFile))
workbook.close()

1 answer:

Answer 0 (score: 1)

Some of your options are:

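  1. Make row a global and initialize it outside the function (@StevenRumbalski's suggestion).
  2. Make data_grabbing a method of a class, and make row an instance variable.
  3. Pass the current row into your function.

I'll show option #3 (though I would probably prefer #2):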

    custData = {}
    
    def data_grabbing(pdf, row):
        col = 0
        string = convert_pdf_to_txt(pdf)
        lines = list(filter(bool,string.split('\n')))
        for i in range(len(lines)):
            if 'Lead:' in lines[i]:
                custData['Name'] = lines[i+2]
            elif 'Date:Date:Date:Date:' in lines[i]:
                custData['Fund Manager'] = lines[i+2]
            elif 'Priority:' in lines[i]:
                custData['Industry'] = lines[i+2]
                custData['Date'] = lines[i+1]
                custData['Deal Size']= lines [i+3]
            elif 'DEAL QUALIFYING MEMORANDUM' in lines[i]:
                custData['Owner'] = lines[i+2]
            elif 'Fund Manager' in lines[i]:
                custData['Investment Type'] = lines [i+2]
        print custData
        for item, descrip in custData.iteritems():
            worksheet.write(row, col,     item)
            worksheet.write(row+1, col, descrip)
            col += 1
    
    
    cur_row = 0
    for myFile in os.listdir(directory):
        if myFile.endswith(".pdf"):
            data_grabbing(os.path.join(directory, myFile), cur_row)
            cur_row += 2  # each PDF uses two rows: one for keys, one for values
    workbook.close()
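
For comparison, option #2 could look roughly like the sketch below. This is only an illustration under some assumptions: `RowWriter` and `FakeWorksheet` are made-up names, and `FakeWorksheet` is a stand-in that just records calls so the example runs without an Excel library. In the real script you would pass in your actual `worksheet`, which only needs a `write(row, col, value)` method like the one already used above, and call `write_record` once per PDF with the dictionary built for that file.

```python
class RowWriter(object):
    """Writes each record as a key row plus a value row and remembers
    where the next record should start (hypothetical helper)."""

    def __init__(self, worksheet, start_row=0):
        self.worksheet = worksheet
        self.row = start_row  # instance variable, survives between calls

    def write_record(self, record):
        for col, (item, descrip) in enumerate(record.items()):
            self.worksheet.write(self.row, col, item)         # key row
            self.worksheet.write(self.row + 1, col, descrip)  # value row
        self.row += 2  # move past the two rows just written


class FakeWorksheet(object):
    """Stand-in for a real worksheet; stores written cells in a dict."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


sheet = FakeWorksheet()
writer = RowWriter(sheet)
writer.write_record({'Name': 'Acme', 'Owner': 'Pat'})    # made-up sample data
writer.write_record({'Name': 'Globex', 'Owner': 'Sam'})  # lands on rows 2 and 3
```

Because the row counter lives on the object rather than inside the function, the calling loop no longer has to track it.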