How to execute a jar file from multiple Python threads

Time: 2014-10-30 10:12:11

Tags: python multithreading jar

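In my project I have a jar file (written by another developer) that copies the content of a PDF into a text file. I am trying to execute this jar from multiple Python threads.

After running this script I can see that the text files are created, but each one is 0 KB. Why is the content not copied into the files? When I run the jar from the command line, it works as expected. Can anyone suggest a solution?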

from threading import Thread
import os
import sys
import time
import urllib2
from lxml import etree, html
import re
import Queue
import traceback


def createfile(x):
    try:
        file="test_"+str(x)
        print "java -jar tika-app-1.1.jar -t --encoding=utf8 \"%s\" > \"%s\" "%("C:\\samplefile.pdf",file)
        os.system("java -jar tika-app-1.1.jar -t --encoding=utf8 \"%s\" > \"%s\" "%("C:\tmp\samplefile.pdf",file))
    except Exception,e:
        print "excet",traceback.format_exc()

def process():
    start_time = time.time()  # record the start so the elapsed time can be reported
    try:
        result = Queue.Queue()
        threads = [Thread(target=createfile, args=(x,)) for x in range(1,5)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    except:
        print "exception",traceback.format_exc()
        pass
    end_time = time.time()
    print "Estimate time", end_time - start_time

if __name__ == '__main__':
    process()

My output:

Exception in thread "main" java.net.MalformedURLException: unknown protocol: c
        at java.net.URL.<init>(Unknown Source)
        at java.net.URL.<init>(Unknown Source)
        at java.net.URL.<init>(Unknown Source)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:393)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:101)
Exception in thread "main" java.net.MalformedURLException: unknown protocol: c
        at java.net.URL.<init>(Unknown Source)
        at java.net.URL.<init>(Unknown Source)
        at java.net.URL.<init>(Unknown Source)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:393)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:101)
Exception in thread "main" java.net.MalformedURLException: unknown protocol: c
        at java.net.URL.<init>(Unknown Source)
        at java.net.URL.<init>(Unknown Source)
        at java.net.URL.<init>(Unknown Source)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:393)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:101)
Exception in thread "main" java.net.MalformedURLException: unknown protocol: c
        at java.net.URL.<init>(Unknown Source)
        at java.net.URL.<init>(Unknown Source)
        at java.net.URL.<init>(Unknown Source)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:393)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:101)
Estimate time 1.73799991608

1 answer:

Answer 0 (score: 2)

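You are telling the Java application to read this file: C:<Tab>mp\samplefile.pdf, because \t becomes a tab character in a Python string. The Java application then sees that C: is not followed by / or \ and assumes it must be a URL scheme (like http: or ftp:). When it looks for a URL protocol handler for "c", none exists, hence the exception "unknown protocol: c". Because Java fails before it writes anything, the shell redirection creates the output file but leaves it empty.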
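The effect of the escape can be checked directly. The following is a small sketch in Python 3 (the question uses Python 2, where `\t` is handled the same way); the variable names are made up for illustration. The `\s` of the original literal is written as `\\s` here only because newer Python versions warn about unrecognised escapes.

```python
# "\t" is a recognised escape sequence, so this literal contains a tab.
broken = "C:\tmp\\samplefile.pdf"
# A raw string keeps every backslash exactly as typed.
raw = r"C:\tmp\samplefile.pdf"
# Doubling each backslash produces the same text as the raw string.
escaped = "C:\\tmp\\samplefile.pdf"

print(repr(broken))
print(repr(raw))
print("\t" in broken, "\t" in raw)
```

Printing with `repr` makes the tab visible, which a plain `print` of the command line would hide.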

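To avoid this kind of problem, build the path with os.path.join(). Include the separator in the drive component, because joining a bare 'C:' yields the drive-relative path C:tmp\samplefile.pdf:

inputFile = os.path.join('C:\\', 'tmp', 'samplefile.pdf')

Or use / instead of \; Java on Windows converts these separators when it accesses files.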
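A sketch of both suggestions, written for Python 3 rather than the Python 2 of the question. It uses `ntpath` (the Windows flavour of `os.path`) so that the result is the same on any platform. The helper names `build_command` and `run_to_file` are hypothetical, and passing the command as a list to `subprocess.call` with stdout redirected through a file handle is a swapped-in alternative to the `os.system` call in the question, not something this answer prescribes.

```python
import ntpath
import subprocess

# Joining a bare drive letter gives a drive-relative path (no separator).
relative = ntpath.join("C:", "tmp", "samplefile.pdf")
# Including the separator in the drive component gives an absolute path.
absolute = ntpath.join("C:\\", "tmp", "samplefile.pdf")
# Forward slashes need no escaping at all.
forward = "C:/tmp/samplefile.pdf"


def build_command(pdf_path, jar="tika-app-1.1.jar"):
    """Return the argument list for the Tika call used in the question."""
    return ["java", "-jar", jar, "-t", "--encoding=utf8", pdf_path]


def run_to_file(command, out_path):
    """Run command, send its stdout to out_path, and return the exit code."""
    with open(out_path, "wb") as out:
        return subprocess.call(command, stdout=out)


print(repr(relative))
print(repr(absolute))
print(build_command(forward))
```

In each worker thread of the question's script, a call such as `run_to_file(build_command(forward), "test_1")` could take the place of the `os.system` line. A nonzero return value would then indicate that Java failed, instead of the failure passing silently and leaving an empty file.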