I get an input stream from the HttpRequest and use that same input stream to extract metadata, as shown below.
ServletFileUpload upload = new ServletFileUpload();
FileItemIterator iter = upload.getItemIterator(request);
// ... more lines for the iteration and getting the stream ...
InputStream input = item.openStream();
This input is then passed to the parser, as follows:
public Map<String, String> extractMetadata(InputStream is) {
    Map<String, String> map = new HashMap<>();
    // -1 disables the write limit on the extracted body text
    ContentHandler contentHandler = new BodyContentHandler(-1);
    Metadata metadata = new Metadata();
    Parser parser = new AutoDetectParser();
    ParseContext parseContext = new ParseContext();
    parseContext.set(Parser.class, new ParserDecorator(parser));
    try {
        TikaInputStream tikaInputStream = TikaInputStream.get(is);
        parser.parse(tikaInputStream, contentHandler, metadata, parseContext);
        // Copy every metadata field Tika found into the result map
        for (String name : metadata.names()) {
            map.put(name, metadata.get(name));
        }
    } catch (IOException | SAXException | TikaException e) {
        map.put("ERROR", "Error while retrieving Metadata");
    }
    return map;
}
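But when I read the input stream afterwards, its contents are not the same as when I skip the Tika extraction. Does Tika dirty (consume) the stream?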
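For context, a plain java.io.InputStream can generally be read only once: whatever one consumer reads is gone for the next one. The sketch below is not from the original code and uses only JDK classes. It illustrates that behaviour and one possible workaround, which is to buffer the bytes once and hand each consumer its own stream. The class name StreamReuseSketch and the helper readFully are hypothetical names chosen for illustration, and the ByteArrayInputStream stands in for the stream returned by item.openStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class StreamReuseSketch {

    // Copies everything remaining in the stream into a byte array.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the upload stream (item.openStream() in the question).
        InputStream upload = new ByteArrayInputStream(
                "example upload".getBytes(StandardCharsets.UTF_8));

        // Reading the stream to the end consumes it.
        byte[] bytes = readFully(upload);
        System.out.println("bytes left in original stream: " + upload.available());

        // Each consumer gets an independent stream over the same buffered bytes.
        InputStream forMetadata = new ByteArrayInputStream(bytes);
        InputStream forStorage = new ByteArrayInputStream(bytes);

        byte[] first = readFully(forMetadata);
        byte[] second = readFully(forStorage);
        System.out.println("copies identical: " + java.util.Arrays.equals(first, second));
    }
}
```

Buffering in memory like this only suits uploads of modest size; for large files, writing the upload to a temporary file and opening it twice would be the analogous approach.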