Question

I am trying to extract tables from a PDF with camelot, but I get this AttributeError. Could you help?

import camelot
import pandas as pd
pdf = camelot.read_pdf("Gordian.pdf")

AttributeError                            Traceback (most recent call last)
----> 1 pdf = camelot.read_pdf("Gordian.pdf")

AttributeError: module 'camelot' has no attribute 'read_pdf'

Answer 1

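Note: if you are using a virtual environment, activate it before doing any of this.

I have run into this error before. There is no bug in your code. The problem is with the camelot installation.

1. Uninstall the currently installed camelot version.

2. Reinstall it with one of the commands below. There are several ways to install camelot; try them one at a time.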

pip install camelot-py
pip install camelot-py[cv]
pip install camelot-py[all]

3. Run your code. Here is a sample:

import camelot

data = camelot.read_pdf("test_file.pdf", pages='all')
print(data)
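Before reinstalling, it can help to see which distributions are actually installed. The sketch below is only a diagnostic and rests on an assumption: that the PyPI name `camelot` refers to a different project from `camelot-py`, so installing the former can leave `import camelot` without a `read_pdf` function. The helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumption: "camelot-py" is the PDF table extractor, while "camelot"
# is an unrelated distribution that provides the same import name.
for name in ("camelot-py", "camelot"):
    print(name, "->", installed_version(name))
```

If `camelot` shows a version and `camelot-py` shows `None`, uninstalling the former and installing the latter is the likely fix.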

Answer 2

Try this: import camelot.io as camelot. It worked for me.
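A minimal sketch of that workaround as a fallback: use the top-level attribute when it exists, and import it from the submodule otherwise. The helper `resolve` is hypothetical, and the commented usage assumes camelot-py is installed and exposes `read_pdf` in `camelot.io`, as this answer reports.

```python
import importlib


def resolve(package, submodule, attr):
    """Return package.attr if present, else package.submodule.attr."""
    module = importlib.import_module(package)
    found = getattr(module, attr, None)
    if found is None:
        # Fall back to the submodule, e.g. camelot.io for read_pdf.
        module = importlib.import_module(f"{package}.{submodule}")
        found = getattr(module, attr)
    return found


# Intended use, assuming camelot-py is installed:
# read_pdf = resolve("camelot", "io", "read_pdf")
# tables = read_pdf("Gordian.pdf")
```

If the fallback is what ends up being used, the installation is probably still worth checking, since this only works around the symptom.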

Answer 3

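Check whether Java is installed on your machine: open a terminal and run "java -version". Without Java you will not be able to read PDFs with tabula, because tabula-py wraps a Java library (Camelot itself does not need Java).

Once Java is installed, install tabula-py with the command pip install tabula-py.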

from tabula.io import read_pdf
tables = read_pdf('file.pdf')  # substitute your file name
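The Java check can also be done from Python before calling tabula. This is a rough sketch: the function name `find_java` is made up here, and it only checks that a `java` executable is on the PATH, not which version it is.

```python
import shutil


def find_java():
    """Return the full path of the java executable on PATH, or None."""
    return shutil.which("java")


java_path = find_java()
if java_path is None:
    print("java not found on PATH; tabula-py will not be able to run")
else:
    print("java found at", java_path)
```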

Answer 4

I gave up trying to get Camelot to read tables in Jupyter Notebooks and installed the following instead:

import sys
# Install into the interpreter that runs this notebook kernel
!{sys.executable} -m pip install tabula-py tabulate

from pathlib import Path

from tabula import read_pdf
from tabulate import tabulate


pdf_path = (
    Path.home()
    / "my_pdf.pdf"
)
df = read_pdf(str(pdf_path), pages=1)
df[0]

Answer 5

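When you install the library, pay attention to where it gets installed, because it may have ended up under a different Python version than the one running your code.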
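One way to check this is to print the interpreter in use and the file that `import camelot` would load. A small sketch, with `module_location` being a name invented for this example:

```python
import importlib.util
import sys


def module_location(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# The interpreter running this code; pip must install into the same one.
print("interpreter:", sys.executable)
# None here means this interpreter cannot see any camelot module at all.
print("camelot:", module_location("camelot"))
```

Running `python -m pip install camelot-py` with that same interpreter, rather than a bare `pip`, keeps the install and the import in the same environment.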
