
We are trying to use tesseract-ocr to extract text content from both regular PDFs and scanned (image-based) PDFs.

We ran into the following problem with PDFs that contain tables: the table content is not extracted correctly.

So far we have tried image_to_string, image_to_data, and OpenCV-based methods.

The sample code we used is:

from PIL import Image
import pytesseract
from pytesseract import image_to_string, image_to_boxes

# Run OCR on the table image and print the recognized text.
text = pytesseract.image_to_string(Image.open('table_number.jpg'))
print(text)
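For reference, the image_to_data attempt can be sketched roughly as below. This is only an illustration of the idea, not the exact code we ran: the helper names `group_words_into_rows` and `ocr_table_rows`, the `row_tolerance` value, and the choice of `--psm 6` are all assumptions made for the example. It groups recognized words into rows by their vertical position, using the `text`, `left`, and `top` fields that pytesseract returns with `Output.DICT`.

```python
from typing import Dict, List


def group_words_into_rows(data: Dict[str, list], row_tolerance: int = 10) -> List[List[str]]:
    """Group OCR words into rows by vertical position.

    `data` is expected in the shape returned by
    pytesseract.image_to_data(..., output_type=Output.DICT).
    `row_tolerance` (in pixels) is a hypothetical tuning value.
    """
    words = []
    for text, left, top in zip(data["text"], data["left"], data["top"]):
        if text.strip():  # skip the empty entries emitted for blocks and lines
            words.append((top, left, text.strip()))
    words.sort()  # top to bottom, then left to right

    rows = []  # each entry: (top of first word in the row, [(left, text), ...])
    for top, left, text in words:
        if rows and abs(top - rows[-1][0]) <= row_tolerance:
            rows[-1][1].append((left, text))
        else:
            rows.append((top, [(left, text)]))

    # Order the cells of every row from left to right.
    return [[text for _, text in sorted(cells)] for _, cells in rows]


def ocr_table_rows(image_path: str) -> List[List[str]]:
    """Run Tesseract on an image and return its words grouped into rows."""
    # Imported lazily so the grouping helper works without Tesseract installed.
    from PIL import Image
    import pytesseract
    from pytesseract import Output

    # "--psm 6" (treat the image as one uniform block of text) is an assumption;
    # other page segmentation modes may suit a given table better.
    data = pytesseract.image_to_data(
        Image.open(image_path), config="--psm 6", output_type=Output.DICT
    )
    return group_words_into_rows(data)
```

Calling `ocr_table_rows('table_number.jpg')` would return one list of words per detected row. This only recovers rows; assigning words to columns would additionally require comparing their `left` positions, and the result still depends on how well Tesseract recognizes the individual cells.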

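The output should contain the table's rows and columns, which so far are not extracted correctly. Please suggest features or methods that would improve the results of extracting table content with tesseract.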