Apache POI - reading modifies Excel file

Time: 2016-10-07 14:58:26

Tags: java apache-poi

Whenever I open an Excel file using Apache POI, the file gets modified, even though I'm only reading it and not making any changes.

Take this test code as an example.

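import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.MessageDigest;

import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.junit.Assert;
import org.junit.Test;

public class ApachePoiTest {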

    @Test
    public void readingShouldNotModifyFile() throws Exception {
        final File testFile = new File("C:/work/src/test/resources/Book2.xlsx");
        final byte[] originalChecksum = calculateChecksum(testFile);
        Assert.assertTrue("Calculating checksum modified file",
            MessageDigest.isEqual(originalChecksum, calculateChecksum(testFile)));
        try (Workbook wb = WorkbookFactory.create(testFile)) {
            Assert.assertNotNull("Reading file with Apache POI", wb);
        }
        Assert.assertTrue("Reading file with Apache POI modified file",
            MessageDigest.isEqual(originalChecksum, calculateChecksum(testFile)));
    }

    @Test
    public void readingInputStreamShouldNotModifyFile() throws Exception {
        final File testFile = new File("C:/work/src/test/resources/Book2.xlsx");
        final byte[] originalChecksum = calculateChecksum(testFile);
        Assert.assertTrue("Calculating checksum modified file",
            MessageDigest.isEqual(originalChecksum, calculateChecksum(testFile)));
        try (InputStream is = new FileInputStream(testFile); Workbook wb = WorkbookFactory.create(is)) {
            Assert.assertNotNull("Reading file with Apache POI", wb);
        }
        Assert.assertTrue("Reading file with Apache POI modified file",
            MessageDigest.isEqual(originalChecksum, calculateChecksum(testFile)));
    }

    private byte[] calculateChecksum(final File file) throws Exception {
        final MessageDigest md = MessageDigest.getInstance("MD5");
        md.reset();
        try (InputStream is = new FileInputStream(file)) {
            final byte[] bytes = new byte[2048];
            int numBytes;
            while ((numBytes = is.read(bytes)) != -1) {
                md.update(bytes, 0, numBytes);
            }
            return md.digest();
        }
    }
}

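The test readingShouldNotModifyFile always fails, because the file is always modified by Apache POI. When tested on a blank Excel file freshly created with MS Office, Apache POI shrinks the file from 8.1 kb to 6.2 kb and corrupts it.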

Tested with:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.15</version>
</dependency>

and also with version 3.12.

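Can I prevent Apache POI from modifying my file by some means other than passing an InputStream instead of a File? I don't want to pass an InputStream, because I'm concerned about Apache's warning that it needs more memory and places some specific requirements on the InputStream.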

1 answer:

Answer 0: (score: 7)

Your problem is that you're not passing in the readonly flag, so Apache POI defaults to opening the file read/write.

You need to use the overloaded WorkbookFactory.create method which takes a readonly flag, and set that flag to true.

Change the line

try (Workbook wb = WorkbookFactory.create(testFile)) {

to

try (Workbook wb = WorkbookFactory.create(testFile, null, true)) {

and your file will be opened read-only, with no changes made to it.
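To check whether a given way of opening the workbook really leaves the file untouched, the before/after checksum comparison from the question can be pulled into a small helper. The following is only a sketch using the JDK alone: the class name FileUnchangedCheck, the FileAction callback, and the choice of SHA-256 are assumptions for illustration, not part of Apache POI or of the original test.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of Apache POI): reports whether an action
// leaves a file byte-for-byte unchanged.
final class FileUnchangedCheck {

    /** Hypothetical callback type: any code that opens or reads the file. */
    @FunctionalInterface
    public interface FileAction {
        void run(Path file) throws Exception;
    }

    /** Streams the file through SHA-256 so it is never fully loaded into memory. */
    static byte[] sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // Reading through the DigestInputStream updates the digest.
            }
        }
        return md.digest();
    }

    /** Returns true if the file's digest is the same before and after the action runs. */
    public static boolean leavesFileUnchanged(Path file, FileAction action) throws Exception {
        byte[] before = sha256(file);
        action.run(file);
        byte[] after = sha256(file);
        return MessageDigest.isEqual(before, after);
    }
}
```

With POI on the classpath, the action could be something like f -> { try (Workbook wb = WorkbookFactory.create(f.toFile(), null, true)) { } }. Swapping in WorkbookFactory.create(f.toFile()) would then be expected to reproduce the failure described in the question.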