Question

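I have a Plone (v4.1.4) site, installed standalone with the Unified Installer, running collective.documentviewer 2.2.1. Searching for words works fine in scanned doc, xls, OpenOffice, rtf, and pdf files. If an image containing text is uploaded as the Image content type, the document viewer does not support it, even though OCR is checked in the document viewer settings. If the image is uploaded as a File instead, I still cannot search for words that are part of the image, even after enabling the appropriate image formats (gif, png, jpg). I have the required tesseract packages installed on my Linux system, as shown by the following command: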

dpkg -l| grep tesseract
ii  libtesseract3                        3.02.01-6                        i386         Command line OCR tool
ii  tesseract-ocr                        3.02.01-6                        i386         Command line OCR tool
ii  tesseract-ocr-eng                    3.02-2                           all          tesseract-ocr language files for English
ii  tesseract-ocr-equ                    3.02-2                           all          tesseract-ocr language files for equations
ii  tesseract-ocr-osd                    3.02-2                           all          tesseract-ocr language files for script and orientation

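A sample GIF image is attached. For example, I want to search for the word "Lab", which is part of the image. The Text tab does not show the text of the image embedded in this pdf. Please advise.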
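
One way to narrow this down is to check whether tesseract itself can read the word from the image, independent of Plone and collective.documentviewer. The following is only a minimal sketch under stated assumptions: it is a standalone Python 3 script (not run inside the Plone 4.1 instance), the helper names `build_tesseract_command` and `image_contains_word` are hypothetical, and the file name `sample.gif` is a placeholder. It uses the basic `tesseract imagename outputbase -l lang` invocation, which writes the recognized text to `outputbase.txt`.

```python
import os
import shutil
import subprocess
import tempfile


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", image_path, output_base, "-l", lang]


def image_contains_word(image_path, word, lang="eng"):
    """Run tesseract on the image and report whether `word` appears in the OCR text.

    Hypothetical helper for troubleshooting only; it is not part of
    collective.documentviewer.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")

    with tempfile.TemporaryDirectory() as tmp_dir:
        # tesseract appends ".txt" to the output base name it is given.
        output_base = os.path.join(tmp_dir, "ocr")
        subprocess.run(
            build_tesseract_command(image_path, output_base, lang),
            check=True,
        )
        with open(output_base + ".txt", encoding="utf-8") as handle:
            text = handle.read()

    # Case-insensitive match, since OCR output may differ in capitalization.
    return word.lower() in text.lower()
```

Calling `image_contains_word("sample.gif", "Lab")` on the machine that hosts the Plone site would show whether the OCR step alone recognizes the word. If tesseract fails on the file, or the word is not found, the problem probably lies with the tesseract installation or the image rather than with the document viewer settings. Whether GIF input is accepted can depend on how the installed tesseract was built.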

How can I use collective.documentviewer for OCR on images in Plone 4.1.4?

0 answers: