Apache Tika detects the wrong MIME type for a CSV file

Asked: 2017-10-26 17:18:43

Tags: java csv apache-tika file-type probe

I created a .csv file with Excel and wrote the following code using Apache Tika:

public static boolean checkThatMimeTypeIsCsv(InputStream inputStream) throws IOException {
    BufferedInputStream bis = new BufferedInputStream(inputStream);
    AutoDetectParser parser = new AutoDetectParser();
    Detector detector = parser.getDetector();
    Metadata md = new Metadata();
    MediaType mediaType = detector.detect(bis, md);
    return "text/csv".equals(mediaType.toString());
}

public static void main(String[] args) throws IOException {
    System.out.println(checkThatMimeTypeIsCsv(new FileInputStream("Data.csv")));
}

But it returns false.

Is Tika really this bad, or am I missing something?

1 answer:

Answer 0: (score: 1)

Try this. It passes the file name to the detector as a hint:

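import java.io.File;
import org.apache.tika.detect.DefaultDetector;
import org.apache.tika.io.TikaInputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.mime.MediaType;

public static String checkThatMimeTypeIsCsv(String fileName) throws Exception {
    File sourceFile = new File(fileName);
    DefaultDetector detector = new DefaultDetector();
    Metadata metadata = new Metadata();
    // Give the detector the file name so it can use the .csv extension as a hint.
    metadata.set(Metadata.RESOURCE_NAME_KEY, sourceFile.getName());
    // try-with-resources closes the stream once detection is done.
    try (TikaInputStream stream = TikaInputStream.get(sourceFile)) {
        MediaType mediaType = detector.detect(stream, metadata);
        String fileType = mediaType.toString();
        System.out.println(fileType);
        return fileType;
    }
}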
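
As far as I can tell, the part that matters is the resource name in the metadata. A CSV file is plain text with no magic bytes, so a detector that only sees the stream has nothing to tell it apart from any other text file and will most likely report text/plain. Given the file name, it can fall back on the .csv extension. The toy sketch below illustrates that idea using only the JDK. It is not Tika's implementation, and the class NameHintSketch and its methods are hypothetical names made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Toy illustration only: NOT Tika's code. All names here are hypothetical.
final class NameHintSketch {

    // Content-only sniffing: look for a known signature at the start of the data.
    static String detectByContent(byte[] head) {
        if (startsWith(head, new byte[] {'%', 'P', 'D', 'F'})) {
            return "application/pdf";
        }
        if (startsWith(head, new byte[] {'P', 'K', 3, 4})) {
            return "application/zip";
        }
        // CSV has no signature, so from the bytes alone it is just text.
        return "text/plain";
    }

    // Same as above, but refine a generic text result with the file name hint.
    // resourceName may be null when only a raw stream is available.
    static String detect(byte[] head, String resourceName) {
        String byContent = detectByContent(head);
        if ("text/plain".equals(byContent)
                && resourceName != null
                && resourceName.toLowerCase(Locale.ROOT).endsWith(".csv")) {
            return "text/csv";
        }
        return byContent;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] csv = "id,name\n1,Alice\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(detect(csv, null));        // no name hint
        System.out.println(detect(csv, "Data.csv"));  // with name hint
    }
}
```

With the sample bytes in main, the first call prints text/plain and the second prints text/csv. That matches the behaviour in the question, where the detector only received a bare InputStream. The exact behaviour of the real detector may vary between Tika versions, so treat this as an explanation of the principle rather than a description of the library's internals.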