Losing the input stream in Apache Tika

Date: 2014-04-29 14:29:52

Tags: apache-tika

I get an input stream from an HttpRequest and use that same stream to extract metadata, as shown below.

ServletFileUpload upload = new ServletFileUpload();
FileItemIterator iter = upload.getItemIterator(request);

// --- more lines for the iteration and getting the stream ---
InputStream input = item.openStream();

This input is passed to the parser as follows:

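import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

import org.apache.tika.exception.TikaException;
import org.apache.tika.io.TikaInputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.parser.ParserDecorator;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

public Map<String, String> extractMetadata(InputStream is) {
    Map<String, String> map = new HashMap<>();
    // -1 disables the handler's write limit on extracted text
    ContentHandler contentHandler = new BodyContentHandler(-1);
    Metadata metadata = new Metadata();

    Parser parser = new AutoDetectParser();
    ParseContext parseContext = new ParseContext();
    // Register the parser to use for embedded documents
    parseContext.set(Parser.class, new ParserDecorator(parser));

    try {
        // Wraps the caller's stream; parsing reads from it
        TikaInputStream tikaInputStream = TikaInputStream.get(is);
        parser.parse(tikaInputStream, contentHandler, metadata, parseContext);

        // Copy every metadata entry into the result map
        for (String name : metadata.names()) {
            map.put(name, metadata.get(name));
        }
    } catch (IOException | SAXException | TikaException e) {
        map.put("ERROR", "Error while retrieving Metadata");
    }

    return map;
}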

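But when I then try to read from the input stream, its contents are not the same as when I skip the Tika extraction. Does Tika dirty the stream?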
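
For illustration, here is a minimal sketch of the likely cause, using only java.io and no Tika at all. A plain InputStream is generally single-pass, so any code that reads it to the end (a parser included) leaves nothing for the next reader. The class name StreamReuseSketch and the methods readAll and consume are hypothetical names made up for this example; consume merely stands in for whatever reads the stream first. One possible workaround, assuming the upload fits in memory, is to copy the bytes once and hand each reader its own ByteArrayInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class StreamReuseSketch {

    // Copies everything that is still unread in the stream into a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Hypothetical stand-in for a first consumer (for example a parser):
    // it reads the stream to the end and returns the number of bytes seen.
    static int consume(InputStream in) throws IOException {
        return readAll(in).length;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "uploaded file bytes".getBytes(StandardCharsets.UTF_8);

        // Single-pass use: after the first consumer, the same stream is exhausted.
        InputStream shared = new ByteArrayInputStream(payload);
        consume(shared);
        System.out.println("bytes left for second reader: " + readAll(shared).length);

        // Buffered use: copy once, then give every reader a fresh stream.
        byte[] copy = readAll(new ByteArrayInputStream(payload));
        consume(new ByteArrayInputStream(copy));
        System.out.println("bytes left with a fresh stream: "
                + readAll(new ByteArrayInputStream(copy)).length);
    }
}
```

In the code from the question, this would mean buffering the result of item.openStream() before calling extractMetadata and passing a new ByteArrayInputStream over the buffer to each consumer. Holding the whole upload in memory may not suit large files; spooling to a temporary file is an alternative under the same idea.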

0 answers:

No answers