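I want to read password-protected Excel files (.xls and .xlsx) with Apache POI. Instead of the usermodel (org.apache.poi.ss.usermodel), I use the event API for both .xls and .xlsx files to keep memory usage down.
For .xls files I implement HSSFListener and override its processRecord(Record record) method. For .xlsx files I use javax.xml.parsers.SAXParser and org.xml.sax.XMLReader.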
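Since the two formats go down different code paths, it may help to pick the path from the file's leading bytes rather than from its extension. An encrypted .xlsx is stored inside an OLE2 container, like an .xls, and not as a plain ZIP. The following is a minimal sketch using only the JDK; ContainerSniffer and its Kind enum are hypothetical names for illustration and are not part of POI.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not a POI class): guesses the container type from the leading bytes.
final class ContainerSniffer {
    enum Kind { OLE2, ZIP, UNKNOWN }

    // OLE2 compound document signature: .xls files and encrypted .xlsx files.
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // ZIP local file header signature: unencrypted .xlsx files.
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    private ContainerSniffer() {}

    // Reads up to 8 bytes, then rewinds the stream so the real parser can start from byte 0.
    static Kind sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(8);
        byte[] head = new byte[8];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream shorter than 8 bytes
            }
            n += r;
        }
        in.reset();
        if (startsWith(head, n, OLE2_MAGIC)) {
            return Kind.OLE2;
        }
        if (startsWith(head, n, ZIP_MAGIC)) {
            return Kind.ZIP;
        }
        return Kind.UNKNOWN;
    }

    private static boolean startsWith(byte[] head, int len, int[] magic) {
        if (len < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((head[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

To use it on a file, wrap the FileInputStream in a BufferedInputStream so that mark/reset is available. An OLE2 result alone does not tell an .xls apart from an encrypted .xlsx, so the file extension is still needed for that.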
If I use the code below to read an .xls file:
Biff8EncryptionKey.setCurrentUserPassword("password");
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(this.getFileName()));
MissingRecordAwareHSSFListener listener = new MissingRecordAwareHSSFListener(this);
formatListener = new FormatTrackingHSSFListener(listener);
HSSFEventFactory factory = new HSSFEventFactory();
HSSFRequest request = new HSSFRequest();
request.addListenerForAllRecords(formatListener);
rowsReadSet.clear();
factory.processWorkbookEvents(request, fs);
I get this exception:
Exception in thread "Thread-6" org.apache.poi.EncryptedDocumentException: HSSF does not currently support CryptoAPI encryption
at org.apache.poi.hssf.record.FilePassRecord$Rc4KeyData.read(FilePassRecord.java:65)
at org.apache.poi.hssf.record.FilePassRecord.<init>(FilePassRecord.java:193)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:87)
at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:338)
at org.apache.poi.hssf.record.RecordFactoryInputStream$StreamEncryptionInfo.<init>(RecordFactoryInputStream.java:74)
at org.apache.poi.hssf.record.RecordFactoryInputStream.<init>(RecordFactoryInputStream.java:207)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:136)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:103)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:62)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:50)
at com.mycompany.component.reader.MSExcelReader.readxls(MSExcelReader.java:300)
at com.mycompany.component.reader.MSExcelReader.run(MSExcelReader.java:274)
at java.lang.Thread.run(Unknown Source)
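The trace shows the reader running on a worker thread ("Thread-6"). As far as I understand it, Biff8EncryptionKey.setCurrentUserPassword stores the password per thread, so it has to be set on the same thread that calls processWorkbookEvents and should be cleared afterwards. The sketch below illustrates that behaviour with a plain ThreadLocal; PasswordScope is a hypothetical stand-in written for this example, not the POI class.

```java
import java.util.concurrent.atomic.AtomicReference;

// Stand-in for a per-thread password holder; this is NOT the POI class.
final class PasswordScope {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<String>();

    private PasswordScope() {}

    // Password visible to the calling thread, or null if none is set.
    static String current() {
        return CURRENT.get();
    }

    // Runs the task with the password set on the calling thread and always clears it afterwards.
    static void runWith(String password, Runnable task) {
        CURRENT.set(password);
        try {
            task.run();
        } finally {
            CURRENT.remove();
        }
    }

    // Starts a new thread and reports which password that thread sees.
    static String seenByNewThread() throws InterruptedException {
        final AtomicReference<String> seen = new AtomicReference<String>();
        Thread worker = new Thread(new Runnable() {
            public void run() {
                seen.set(current());
            }
        });
        worker.start();
        worker.join();
        return seen.get();
    }
}
```

In my code the password is set and the workbook is processed inside the same method, so this is probably not the cause here, but it rules out one source of "invalid password" errors in threaded readers.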
I will post the .xlsx code once reading .xls works.
I am using JDK 7 and Apache POI 3.11. Can anyone help?
[EDITED]
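Another question: I read here that
POI will not be able to read encrypted workbooks - that means that if you have protected the entire workbook (and not just a sheet), it won't be able to read it. Otherwise, it should work.
Is that true? So with POI 3.11 I cannot read a password-protected file whose password was set for the entire workbook (usually done through Save As -> Tools -> General Options)?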
[EDITED]:
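If I set a sheet password (using the Review -> Protect Sheet ribbon option) and read the file with the event model, it works fine. But if I set a password for the whole workbook (using the Review -> Protect Workbook ribbon option) or set a file password (using Save As -> Tools -> General Options), it fails. Here are the exceptions I get for these two methods:
1. Using the Review -> Protect Workbook ribbon option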
Exception in thread "Thread-6" org.apache.poi.EncryptedDocumentException: Supplied password is invalid for salt/verifier/verifierHash
at org.apache.poi.hssf.record.RecordFactoryInputStream$StreamEncryptionInfo.createDecryptingStream(RecordFactoryInputStream.java:127)
at org.apache.poi.hssf.record.RecordFactoryInputStream.<init>(RecordFactoryInputStream.java:209)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:136)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:103)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:62)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:50)
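I can read this file by supplying Decryptor.DEFAULT_PASSWORD, i.e. the string "VelvetSweatshop" (the default password), as the password. So why can't it be read with the password string I set manually?
2. Using Save As -> Tools -> General Options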
Exception in thread "Thread-6" org.apache.poi.EncryptedDocumentException: HSSF does not currently support CryptoAPI encryption
at org.apache.poi.hssf.record.FilePassRecord$Rc4KeyData.read(FilePassRecord.java:65)
at org.apache.poi.hssf.record.FilePassRecord.<init>(FilePassRecord.java:193)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:87)
at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:338)
at org.apache.poi.hssf.record.RecordFactoryInputStream$StreamEncryptionInfo.<init>(RecordFactoryInputStream.java:74)
at org.apache.poi.hssf.record.RecordFactoryInputStream.<init>(RecordFactoryInputStream.java:207)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:136)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:103)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:62)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:50)
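I used the same password for all three methods above, with MS Office Professional Plus 2013. Why doesn't it work with the first method (Review -> Protect Workbook)? The exception says the password is incorrect. The exception for the second method clearly states that HSSF does not support this encryption, so that one is understandable. But since POI can read a sheet protected with a password (set via Review -> Protect Sheet), I would expect it to also work when the workbook is protected (via Review -> Protect Workbook). Can an expert clarify?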
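Given the observation that the Protect Workbook file opens with the default password but not with mine, one possible workaround is to try the user's password first and fall back to "VelvetSweatshop". This is only a sketch and does not explain why my own password is rejected. PasswordFallback and its Opener callback are hypothetical names; in real code the callback would set the password, run the POI read, and catch EncryptedDocumentException rather than every Exception.

```java
// Hypothetical helper (not part of POI): tries the user's password, then Excel's default one.
final class PasswordFallback {
    // The string the post reports as the value of Decryptor.DEFAULT_PASSWORD.
    static final String DEFAULT_PASSWORD = "VelvetSweatshop";

    // Hypothetical callback: opens the document with the given password or throws if it is rejected.
    interface Opener<T> {
        T open(String password) throws Exception;
    }

    private PasswordFallback() {}

    // Returns the first successful result; rethrows the last failure if every candidate is rejected.
    static <T> T openWithFallback(String userPassword, Opener<T> opener) throws Exception {
        Exception last = null;
        for (String candidate : new String[] {userPassword, DEFAULT_PASSWORD}) {
            try {
                return opener.open(candidate);
            } catch (Exception e) {
                // Too broad for production use; narrow this to the decryption failure type.
                last = e;
            }
        }
        throw last;
    }
}
```

The order matters: a file encrypted with a real password should be opened with it directly, and the default is only tried once that attempt has failed.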
[EDITED]
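OK, I was wrong when I said it works with the usermodel. It does not work with the usermodel either for the two methods listed in the previous edit. I get the same exceptions for method 1 (using the Review -> Protect Workbook ribbon option) and method 2 (using Save As -> Tools -> General Options). I can, however, read a file that has only a sheet password set, and the event model behaves the same way. Please see the sample test case below. I did not find an option to attach Excel files, so I could not include them, but any simple Excel file can be tested with this test case (the assertions will need to change to match the input).
A simple test case covering both the user model and the event model: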
import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;
import junit.framework.TestCase;
import org.apache.poi.hssf.eventusermodel.HSSFEventFactory;
import org.apache.poi.hssf.eventusermodel.HSSFListener;
import org.apache.poi.hssf.eventusermodel.HSSFRequest;
import org.apache.poi.hssf.record.BoundSheetRecord;
import org.apache.poi.hssf.record.NumberRecord;
import org.apache.poi.hssf.record.Record;
import org.apache.poi.hssf.record.crypto.Biff8EncryptionKey;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.NPOIFSFileSystem;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.ss.usermodel.Cell;
/**
 * Tests password-protected .xls files against the HSSF user model
 * and {@link HSSFEventFactory}.
 */
public final class TestHSSFEventFactory extends TestCase {
    // [0] file password (Save As -> Tools -> General Options)
    // [1] sheet password (Review -> Protect Sheet)
    // [2] workbook password (Review -> Protect Workbook)
    private final String[] fileNames = {
        "C:\\XLS\\General_Password.xls",
        "C:\\XLS\\Sheet_Password.xls",
        "C:\\XLS\\Workbook_Password.xls"
    };

    /** Collects every record it receives so the test can inspect them afterwards. */
    private static class MockHSSFListener implements HSSFListener {
        private final List<Record> records = new ArrayList<Record>();

        public MockHSSFListener() {}

        public Record[] getRecords() {
            return records.toArray(new Record[records.size()]);
        }

        public void processRecord(Record record) {
            records.add(record);
        }
    }

    public void testWithPasswordProtectedWorkbooksUserModel() throws Exception {
        // XOR/RC4 decryption for xls
        Biff8EncryptionKey.setCurrentUserPassword("4Sys-Tem");
        NPOIFSFileSystem nfs = new NPOIFSFileSystem(new File(fileNames[2]), true);
        try {
            HSSFWorkbook hwb = new HSSFWorkbook(nfs.getRoot(), true);
            HSSFSheet sheet = hwb.getSheetAt(0);
            // Cells D3, D4 and D5 (zero-based column 3, rows 2 to 4)
            HSSFRow row = sheet.getRow(2);
            Cell cell1 = row.getCell(3);
            row = sheet.getRow(3);
            Cell cell2 = row.getCell(3);
            row = sheet.getRow(4);
            Cell cell3 = row.getCell(3);
            assertEquals("17000.0", cell1.toString());
            assertEquals("7500.0", cell2.toString());
            assertEquals("5000.0", cell3.toString());
        } finally {
            // Clear the password and release the file even if an assertion fails
            Biff8EncryptionKey.setCurrentUserPassword(null);
            nfs.close();
        }
    }

    public void testWithPasswordProtectedWorkbooksEventModel() throws Exception {
        // With the password set, the workbook should be processed normally
        Biff8EncryptionKey.setCurrentUserPassword("4Sys-Tem");
        MockHSSFListener mockListen = new MockHSSFListener();
        FileInputStream in = new FileInputStream(fileNames[2]);
        try {
            POIFSFileSystem fs = new POIFSFileSystem(in);
            HSSFRequest req = new HSSFRequest();
            HSSFEventFactory factory = new HSSFEventFactory();
            req.addListenerForAllRecords(mockListen);
            factory.processWorkbookEvents(req, fs);
        } finally {
            // Clear the password and close the stream even if processing fails
            Biff8EncryptionKey.setCurrentUserPassword(null);
            in.close();
        }

        // Check we got the sheet and the contents
        Record[] recs = mockListen.getRecords();
        assertTrue(recs.length > 50);

        // Expect one sheet with 17000, 7500 and 5000 in cells D3, D4 and D5
        boolean hasSheet = false, hasD3 = false, hasD4 = false, hasD5 = false;
        for (Record r : recs) {
            if (r instanceof BoundSheetRecord) {
                BoundSheetRecord bsr = (BoundSheetRecord) r;
                assertEquals("Trade Data", bsr.getSheetname());
                hasSheet = true;
            }
            if (r instanceof NumberRecord) {
                NumberRecord nr = (NumberRecord) r;
                if (nr.getColumn() == 3 && nr.getRow() == 2) {
                    assertEquals(17000, (int) nr.getValue());
                    hasD3 = true;
                }
                if (nr.getColumn() == 3 && nr.getRow() == 3) {
                    assertEquals(7500, (int) nr.getValue());
                    hasD4 = true;
                }
                if (nr.getColumn() == 3 && nr.getRow() == 4) {
                    assertEquals(5000, (int) nr.getValue());
                    hasD5 = true;
                }
            }
        }
        assertTrue("Sheet record not found", hasSheet);
        assertTrue("Numeric record for D3 not found", hasD3);
        assertTrue("Numeric record for D4 not found", hasD4);
        assertTrue("Numeric record for D5 not found", hasD5);
    }
}