Question

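I am developing an Android application that detects text in PDF files.

First, I tried the Google Cloud Vision API, but it requires OAuth 2.0, so I switched to Firebase ML Kit.

However, when I call the "fromFilePath" method, an NPE occurs.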

val file = getPdfFile()
Log.d(TAG, "file.length: ${file.length()}") // The file size is printed correctly.

// The NPE is thrown by the call below.
val image = FirebaseVisionImage.fromFilePath(context, Uri.fromFile(file))

// Never reached, because the NPE is thrown above.
val detector = FirebaseVision.getInstance()
    .cloudDocumentTextRecognizer

Process: com.youknow.redact, PID: 13122
java.lang.NullPointerException: Attempt to invoke virtual method 'int android.graphics.Bitmap.getWidth()' on a null object reference

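It seems that Firebase ML Kit does not support PDF files, right?

Is there a good workaround?

Is it impossible to recognize text in a PDF file with Firebase ML Kit?

I tested more file formats: JPG and TIFF.

Everything else is the same; only the input file changed. JPG works fine, but TIFF fails with the same error.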

 Caused by: java.lang.NullPointerException: Attempt to invoke virtual method 'int android.graphics.Bitmap.getWidth()' on a null object reference
    at com.google.android.gms.internal.firebase_ml.zzox.zza(Unknown Source)
    at com.google.firebase.ml.vision.common.FirebaseVisionImage.fromFilePath(Unknown Source)

Answer 1

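TIFF is not an officially supported image format on Android, and PDF is a document format, not an image format. See the following link for the list of all supported image formats: https://developer.android.com/guide/topics/media/media-formats#image-formats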
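As a rough guard against the NPE, one option is to look at the first few bytes of the file before handing it to fromFilePath. The sketch below is only an illustration: the names SniffedFormat, sniffFormat, and isDirectlyDecodable are hypothetical, and treating only JPEG and PNG as decodable is a deliberately conservative assumption rather than the full list from the link above.

```kotlin
// Hypothetical helper: classify a file by its leading "magic" bytes.
enum class SniffedFormat { JPEG, PNG, PDF, TIFF, UNKNOWN }

fun sniffFormat(header: ByteArray): SniffedFormat {
    // True when `header` begins with the given unsigned byte values.
    fun startsWith(vararg sig: Int): Boolean =
        header.size >= sig.size &&
            sig.indices.all { i -> (header[i].toInt() and 0xFF) == sig[i] }

    return when {
        startsWith(0xFF, 0xD8, 0xFF) -> SniffedFormat.JPEG
        startsWith(0x89, 0x50, 0x4E, 0x47) -> SniffedFormat.PNG
        startsWith(0x25, 0x50, 0x44, 0x46) -> SniffedFormat.PDF   // "%PDF"
        startsWith(0x49, 0x49, 0x2A, 0x00) -> SniffedFormat.TIFF  // "II*\0", little-endian
        startsWith(0x4D, 0x4D, 0x00, 0x2A) -> SniffedFormat.TIFF  // "MM\0*", big-endian
        else -> SniffedFormat.UNKNOWN
    }
}

// Assumption: only JPEG and PNG are treated as safe to pass on directly.
fun isDirectlyDecodable(format: SniffedFormat): Boolean =
    format == SniffedFormat.JPEG || format == SniffedFormat.PNG
```

With a check like this, a PDF or TIFF input could be rejected or routed to a conversion step instead of crashing inside fromFilePath.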

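[UPDATE] I now understand where the OP's question comes from. Firebase ML Kit supports two types of text recognition:

Text in images of city scenes or landscapes (for example, signs in a street photo)
Text in images of documents

What the OP wants is to recognize text in a PDF "document", and that is not supported.

I think the OP misunderstood what "document" means in the ML Kit context.

To recognize text in a PDF file, you first need to convert the PDF to a bitmap using a third-party library.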
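The conversion step might look roughly like the sketch below. Note that it does not use a third-party library as suggested above; it swaps in the platform's own android.graphics.pdf.PdfRenderer (available from API 21), which may or may not suit every PDF. The function name renderPdfPage and the scale parameter are hypothetical, chosen only for illustration.

```kotlin
import android.graphics.Bitmap
import android.graphics.Color
import android.graphics.pdf.PdfRenderer
import android.os.ParcelFileDescriptor
import java.io.File

// Renders one page of a PDF to a Bitmap with the platform PdfRenderer (API 21+).
// `scale` is a hypothetical parameter: page sizes are reported in points
// (1/72 inch), so scale = 2f gives roughly 144 dpi.
fun renderPdfPage(file: File, pageIndex: Int = 0, scale: Float = 2f): Bitmap {
    val fd = ParcelFileDescriptor.open(file, ParcelFileDescriptor.MODE_READ_ONLY)
    // PdfRenderer takes ownership of the descriptor and closes it on close().
    val renderer = PdfRenderer(fd)
    try {
        val page = renderer.openPage(pageIndex)
        try {
            val bitmap = Bitmap.createBitmap(
                (page.width * scale).toInt(),
                (page.height * scale).toInt(),
                Bitmap.Config.ARGB_8888 // the format PdfRenderer requires
            )
            // Pages can have a transparent background; fill with white first.
            bitmap.eraseColor(Color.WHITE)
            // With no clip and no transform, the page is scaled to fill the bitmap.
            page.render(bitmap, null, null, PdfRenderer.Page.RENDER_MODE_FOR_DISPLAY)
            return bitmap
        } finally {
            page.close()
        }
    } finally {
        renderer.close()
    }
}
```

The resulting bitmap could then be wrapped with FirebaseVisionImage.fromBitmap(bitmap) and passed to the recognizer in place of the fromFilePath call that throws. A higher scale generally helps recognition of small text, at the cost of memory.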
