Good day, everyone!
I am trying to use Apache Tika from Python and get this error:
jnius.JavaException: JVM exception occurred: ä (The system cannot find the file specified)
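Could you help? I am on Windows 10 (x64), and I suspect the problem is related to string encoding between Python and Java, or something similar. Thanks in advance.
The code I am using: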
import os
os.environ['CLASSPATH'] = "tika/tika-app-1.16.jar"
from jnius import autoclass
Tika = autoclass('org.apache.tika.Tika')
Metadata = autoclass('org.apache.tika.metadata.Metadata')
FileInputStream = autoclass('java.io.FileInputStream')
tika = Tika()
meta = Metadata()
file_path = FileInputStream("./content/2.xlsx")
text = tika.parseToString(file_path, meta)
print(text)
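Answer 0 (score: 1)
I know this is late, but I ran into exactly the same problem.
It is caused by the unicode string not being converted correctly from Python to Java, and it can be solved by explicitly creating a Java String
object and passing that instead: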
import os
os.environ['CLASSPATH'] = "tika/tika-app-1.16.jar"
from jnius import autoclass
Tika = autoclass('org.apache.tika.Tika')
Metadata = autoclass('org.apache.tika.metadata.Metadata')
FileInputStream = autoclass('java.io.FileInputStream')
String = autoclass("java.lang.String")  # only autoclass is imported, so call it directly
tika = Tika()
meta = Metadata()
file_path = FileInputStream(String("./content/2.xlsx"))
text = tika.parseToString(file_path, meta)
print(text)
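Since the Java message says the file cannot be found, it may also be worth confirming on the Python side that the path really points at an existing file before it is handed to FileInputStream. The following is only a minimal sketch for that check. The helper name resolve_existing is hypothetical (it is not part of jnius or Tika), and it does not replace the Java String fix above.

```python
from pathlib import Path


def resolve_existing(path_str):
    """Return the absolute form of path_str, or raise if no such file exists.

    Hypothetical helper for illustration only; not part of jnius or Tika.
    """
    path = Path(path_str).resolve()
    if not path.is_file():
        # Fail early in Python with a readable message instead of a JVM exception.
        raise FileNotFoundError(f"No such file: {path}")
    return str(path)
```

With the names from the answer above, it could be used as FileInputStream(String(resolve_existing("./content/2.xlsx"))). If this raises FileNotFoundError, the problem is the path itself rather than the Python-to-Java string conversion.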