Connecting the Java runtime environment to PyCharm to run tabula

Time: 2018-06-05 17:34:13

Tags: pycharm tabula

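I'm trying to extract tables with tabula, but because it was originally written in Java, it tries to access the Java runtime. I have Java installed on my macOS machine, but I guess it isn't configured in PyCharm. So when I run tabula, I get the following error: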

    No Java runtime present, requesting install.
Error: 
Traceback (most recent call last):
  File "/Users/rohank2/salesorderautomation/test.py", line 138, in <module>
    text = pdf_to_text(filename)  # call to pdftotext function
  File "/Users/rohank2/salesorderautomation/test.py", line 53, in pdf_to_text
    df = read_pdf(filename)
  File "/Users/rohank2/Library/Python/3.6/lib/python/site-packages/tabula/wrapper.py", line 87, in read_pdf
    output = subprocess.check_output(args)
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['java', '-jar', '/Users/rohank2/Library/Python/3.6/lib/python/site-packages/tabula/tabula-1.0.2-jar-with-dependencies.jar', '--pages', '1', '--guess', './testdataset/test12.pdf']' returned non-zero exit status 1.

Process finished with exit code 1

This is how I'm trying to access the data:

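from tabula import read_pdf


def pdf_to_text(pdfname):
    # Extract tables with tabula, which shells out to `java -jar ...`
    df = read_pdf(pdfname)  # use the parameter, not a global `filename`
    print(list(df))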
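
Since the traceback shows tabula running `java -jar ...` through `subprocess`, one way to narrow this down is to check which `java` launcher the Python process started by PyCharm actually sees. The sketch below is only a diagnostic under assumptions: `java_status` is a hypothetical helper name, not part of tabula, and it uses only the standard library. On macOS, the message "No Java runtime present, requesting install." usually comes from the system's placeholder `java` launcher, so a launcher can be found on `PATH` and still not be usable.

```python
import os
import shutil
import subprocess


def java_status(java_cmd="java"):
    """Return (path, ok, output) for the Java launcher this process can see.

    Hypothetical helper for diagnosis only; it is not part of tabula.
    """
    path = shutil.which(java_cmd)
    if path is None:
        return None, False, "no '{}' executable found on PATH".format(java_cmd)
    try:
        # `java -version` writes to stderr, so merge stderr into stdout.
        result = subprocess.run(
            [path, "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError as exc:
        return path, False, str(exc)
    return path, result.returncode == 0, result.stdout.strip()


if __name__ == "__main__":
    path, ok, output = java_status()
    print("PATH seen by Python:", os.environ.get("PATH", ""))
    print("JAVA_HOME:", os.environ.get("JAVA_HOME", "(not set)"))
    print("java launcher:", path)
    print("usable:", ok)
    print(output)
```

Running this from the same PyCharm run configuration as `test.py` and again from a terminal should show whether the two environments differ. If `usable` is `False` only inside PyCharm, the likely next step is to install a JDK or to add the JDK's `bin` directory to `PATH` in the run configuration's environment variables.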

0 answers:

No answers