<dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.9</version>
</dependency>
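I tried adding the dependencies below in place of the plain tika-parsers dependency, to override Tika's PDFBox dependency with version 1.6.0, but it does not work: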
<dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.9</version>
<exclusions>
<exclusion>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>1.6.0</version>
</dependency>
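The Tika parsers depend on PDFBox version 1.4.0. I want to change this Apache Tika dependency to PDFBox version 1.6.0. How can I do that in my pom.xml? Here is my pom.xml file. Any suggestions would be appreciated.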
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.xyz.search</groupId>
<artifactId>xyzz-crawler4j</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>qcom-crawler4j</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<repositories>
<repository>
<id>repo-for-dsiutils</id>
<url>http://ir.dcs.gla.ac.uk/~bpiwowar/maven/</url>
</repository>
<repository>
<id>JBoss</id>
<name>jboss-maven2-release-repository</name>
<url>https://oss.sonatype.org/content/repositories/JBoss</url>
</repository>
<repository>
<id>oracle</id>
<url>http://download.oracle.com/maven</url>
</repository>
<repository>
<id>boilerpipe</id>
<url>http://boilerpipe.googlecode.com/svn/repo/</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.0.1</version>
<!-- 4.1.1 -->
</dependency>
<!-- PDFBox version 1.6.0 -->
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
<version>4.0.1</version>
</dependency>
<!-- 4.1 -->
<dependency>
<groupId>it.unimi.dsi</groupId>
<artifactId>fastutil</artifactId>
<version>6.2.2</version>
</dependency>
<dependency>
<groupId>com.sleepycat</groupId>
<artifactId>je</artifactId>
<version>4.0.71</version>
</dependency>
<!-- Boilerpipe -->
<dependency>
<groupId>de.l3s.boilerpipe</groupId>
<artifactId>boilerpipe</artifactId>
<version>1.2.0</version>
</dependency>
<!-- Tika (for non-HTML extractions) -->
<dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-core</artifactId>
<version>0.9</version>
</dependency>
<dependency>
<groupId>xerces</groupId>
<artifactId>xercesImpl</artifactId>
<version>2.8.1</version>
</dependency>
<dependency>
<groupId>nekohtml</groupId>
<artifactId>nekohtml</artifactId>
<version>0.6.5</version>
</dependency>
<dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.9</version>
</dependency>
<!-- I tried adding the dependencies below in place of the tika-parsers dependency just above, to override Tika's PDFBox dependency with 1.6.0, but it does not work. -->
<!-- <dependency>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parsers</artifactId>
<version>0.9</version>
<exclusions>
<exclusion>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>1.6.0</version>
</dependency>
-->
</dependencies>
</project>
Answer 0 (score: 4)
The cleanest approach is probably to add a dependencyManagement section that upgrades the PDFBox version throughout the dependency tree. For example:
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>1.6.0</version>
</dependency>
</dependencies>
</dependencyManagement>
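To make the placement concrete, here is a minimal sketch of how that section might sit in a POM like the one in the question. It assumes the project coordinates from the question and trims everything else; dependencyManagement is a direct child of project, next to dependencies, and tika-parsers is declared without any exclusions.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.xyz.search</groupId>
  <artifactId>xyzz-crawler4j</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <!-- Sets the PDFBox version used wherever pdfbox appears in the dependency tree -->
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>1.6.0</version>
      </dependency>
    </dependencies>
  </dependencyManagement>

  <dependencies>
    <!-- No exclusions here; the managed version above applies to the transitive pdfbox -->
    <dependency>
      <groupId>org.apache.tika</groupId>
      <artifactId>tika-parsers</artifactId>
      <version>0.9</version>
    </dependency>
  </dependencies>
</project>
```

The remaining repositories and dependencies from the question's POM would stay as they are.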
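Note that many Tika parsers are tightly coupled to specific versions of upstream parser libraries such as PDFBox, so you need to test your system if you override such a dependency version.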
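An alternative to forcing the dependency version is to use the latest trunk version of Tika, where the PDFBox dependency is already at version 1.6.0. Also, the Tika 0.10 release, which will use the updated dependency, should be out by early next week.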
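Once that release is available, the upgrade route might look like the fragment below. This is only a sketch: it assumes Tika 0.10 has been published to a repository your build resolves from, and it relies on the statement above that 0.10 depends on PDFBox 1.6.0. Both Tika artifacts from the question's POM are bumped together so their versions stay aligned.

```xml
<!-- Replaces the two Tika 0.9 dependencies; no pdfbox override or exclusion is declared -->
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-core</artifactId>
  <version>0.10</version>
</dependency>
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parsers</artifactId>
  <version>0.10</version>
</dependency>
```

With this route the explicit pdfbox dependency and any dependencyManagement entry for it would no longer be needed for the version bump.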