Exporting data from pdfplumber to a .csv file

Time: 2020-03-16 07:31:17

Tags: python pdf text-extraction tabula python-pdfreader

I am using pdfplumber to extract text from a PDF, but when I try to export the data with to_csv I get an error. I need help getting the data into a .csv file.

import pdfplumber
import pandas as pd
import numpy as np
import os
import re
from collections import OrderedDict

pdf = pdfplumber.open('C:/Users/Desktop/Mydata.pdf')
page = pdf.pages[1-76]
text = page.extract_text()
text
print(text)



text2 = pd.DataFrame([text])

text2.to_csv("C:\\Users\\Desktop\\MyPDFData\\converted_text.csv")

The exported file does not contain the data; I just get an empty file.

1 Answer:

Answer 0: (score: 0)

You probably don't need pandas. Just open a CSV writer first:

import csv
with open(your_csv_file_name, mode='w', newline='') as export_csv:
    csv_writer = csv.writer(export_csv)
    # One row per line of text; writerow(text) would emit one field per character.
    for line in text.splitlines():
        csv_writer.writerow([line])
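
One thing worth checking in the question's code: pdf.pages[1-76] is not a page range. Python evaluates 1-76 to -75 first, so the expression selects a single page counted from the end of the list, or raises IndexError if the PDF has fewer than 75 pages. If that one page has no extractable text, the result of extract_text() may be empty, which would be consistent with the near-empty CSV. The sketch below uses a plain list as a hypothetical stand-in for pdf.pages to show the difference between that index and a slice:

```python
# A plain list standing in for pdf.pages (hypothetical data, 100 "pages").
pages = [f"page {n}" for n in range(1, 101)]

single = pages[1 - 76]   # same as pages[-75]: one item, counted from the end
first_76 = pages[0:76]   # a slice: the first 76 items (indices 0 through 75)

print(single)                                    # page 26
print(len(first_76), first_76[0], first_76[-1])  # 76 page 1 page 76
```

With pdfplumber the same slice would be pdf.pages[0:76], and each page in it needs its own extract_text() call.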

There is a great page for learning about CSV export:

https://realpython.com/python-csv/?fireglass_rsn=true
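
Putting the pieces together, here is a minimal sketch of one way to get the text of several pages into a CSV file, one row per extracted line. The function names write_lines_to_csv and extract_page_texts, the column names, and the default of 76 pages are assumptions made for illustration; they are not part of pdfplumber or of the original code.

```python
import csv


def write_lines_to_csv(page_texts, csv_path):
    """Write one CSV row per extracted line: (page number, line text)."""
    rows_written = 0
    with open(csv_path, mode="w", newline="", encoding="utf-8") as export_csv:
        csv_writer = csv.writer(export_csv)
        csv_writer.writerow(["page", "line"])  # hypothetical column names
        for page_number, text in enumerate(page_texts, start=1):
            # Pages without extractable text may yield None or an empty string.
            for line in (text or "").splitlines():
                csv_writer.writerow([page_number, line])
                rows_written += 1
    return rows_written


def extract_page_texts(pdf_path, first_n=76):
    """Return the extracted text of the first `first_n` pages of the PDF."""
    import pdfplumber  # imported here so the CSV helper works without it

    with pdfplumber.open(pdf_path) as pdf:
        return [page.extract_text() for page in pdf.pages[:first_n]]
```

With the paths from the question, this would be called as write_lines_to_csv(extract_page_texts('C:/Users/Desktop/Mydata.pdf'), 'C:/Users/Desktop/MyPDFData/converted_text.csv'). The return value is the number of data rows written, so a result of 0 indicates that no text was extracted from any page.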