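I am trying to speed up an import from an Excel file into a database. The file is read with Apache POI and persisted via Hibernate and JPA in JBoss 7.1 (using a JTA data source is a specific requirement). At the moment the import is far too slow: 30,000 records take about 3 minutes, and I need to bring that down to about 30 seconds. I am looking for help setting up batch inserts, which I have not tried in my present work.
My persistence.xml is as follows: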
<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.0"
xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://java.sun.com/xml/ns/persistence
http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
<persistence-unit name="primary" transaction-type="JTA">
<jta-data-source>java:jboss/datasources/MySqlDS</jta-data-source>
<properties>
<!-- Properties for Hibernate -->
<property name="hibernate.hbm2ddl.auto" value="update" />
<property name="hibernate.default_catalog" value="myDatabase"/>
<property name="hibernate.dialect" value="org.hibernate.dialect.MySQLDialect" />
<property name="hibernate.show_sql" value="false" />
<property name="hibernate.format_sql" value="false" />
<property name="hibernate.dialect" value="org.hibernate.dialect.MySQL5InnoDBDialect"/>
<property name="hibernate.order_updates" value="true"/>
<property name="hibernate.order_inserts" value="true"/>
<property name="hibernate.jdbc.batch_versioned_data" value="true"/>
<property name="hibernate.jdbc.fetch_size" value="500"/>
<property name="hibernate.jdbc.batch_size" value="500"/>
<property name="hibernate.default_batch_fetch_size" value="16"/>
<property name="hibernate.connection.release_mode" value="auto"/>
<property name="hibernate.cache.region.jbc2.cachefactory" value="java:CacheManager"/>
<property name="hibernate.cache.use_second_level_cache" value="true"/>
<property name="hibernate.cache.use_query_cache" value="false"/>
<property name="hibernate.cache.use_minimal_puts" value="true"/>
<property name="hibernate.cache.region.jbc2.cfg.entity" value="mvcc-entity"/>
<property name="hibernate.cache.region_prefix" value="services"/>
<property name="hibernate.connection.driver_class" value="com.mysql"/>
<property name="hibernate.connection.url" value="jdbc:mysql://localhost:3306/myDatabase"/>
<property name="hibernate.connection.username" value="root"/>
</properties>
</persistence-unit>
</persistence>
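I have an EJB Timer class that is deployed when JBoss starts. It looks for new Excel files and, if it finds any, imports them into the database. This all works fine, it is just slow...
// Listener class
excelReader.loadDatabase(child.getPath());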
// This all works ok
}
}
}
}
**The class that actually persists the file (this is done through JPA)**
@Stateless
@LocalBean
public class ExcelReader implements TableDao {
@PersistenceContext
private EntityManager em;
private HSSFRow row = null;
private HSSFWorkbook wb;
private BaseDataTable baseDataTable;
public void loadDatabase(String path)
{
try
{
FileInputStream latestExcelFile = new FileInputStream(path);
wb = new HSSFWorkbook(latestExcelFile);
} catch (Exception ex) {}
importTheTable();
}
public ExcelReader() {}
public void importTheTable(){
HSSFSheet baseDataTableSheet = wb.getSheetAt(0);
for (int i = 1; i <= baseDataTableSheet.getLastRowNum(); i++)
{
row = baseDataTableSheet.getRow(i);
baseDataTable = new BaseDataTable();
try
{
baseDataTable.setDateTime(row.getCell(0).getDateCellValue());
baseDataTable.setEventId((int) row.getCell(1).getNumericCellValue());
baseDataTable.setCauseClass(parseCauseClass(row.getCell(2).toString()));
baseDataTable.setUeType((int) row.getCell(3).getNumericCellValue());
baseDataTable.setMarket((int) row.getCell(4).getNumericCellValue());
baseDataTable.setOperator((int) row.getCell(5).getNumericCellValue());
baseDataTable.setCellId((int) row.getCell(6).getNumericCellValue());
baseDataTable.setDuration((int) row.getCell(7).getNumericCellValue());
baseDataTable.setCauseCode((int) row.getCell(8).getNumericCellValue());
baseDataTable.setNeVersion(row.getCell(9).toString());
baseDataTable.setImsi(row.getCell(10).getNumericCellValue());
baseDataTable.setHier3Id((row.getCell(11).toString()));
baseDataTable.setHier32Id((row.getCell(12).toString()));
baseDataTable.setHier321Id((row.getCell(13).toString()));
addBaseTableEntry(baseDataTable);
} catch (Exception ex) { System.out.println("Error in excel file"); }
if(i%1000 == 0)
{
em.flush();
em.clear();
}
}
}
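The flush-and-clear pattern in the loop above can be looked at in isolation. The sketch below is only an illustration in plain Java: `Batcher` and `flushAction` are hypothetical names, and the flush action merely stands in for the `em.flush(); em.clear();` pair. It uses a batch size of 500 to match `hibernate.jdbc.batch_size` in the persistence.xml, since the Hibernate batch documentation suggests flushing at the same interval as the JDBC batch size (the loop above flushes every 1000 rows instead).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchFlushSketch {

    /** Buffers items and passes them to a flush action every batchSize items. */
    static final class Batcher<T> {
        private final int batchSize;
        private final Consumer<List<T>> flushAction; // stands in for em.flush(); em.clear();
        private final List<T> pending = new ArrayList<>();

        Batcher(int batchSize, Consumer<List<T>> flushAction) {
            if (batchSize <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.batchSize = batchSize;
            this.flushAction = flushAction;
        }

        /** Adds one item and flushes as soon as a full batch has accumulated. */
        void add(T item) {
            pending.add(item);
            if (pending.size() == batchSize) {
                flush();
            }
        }

        /** Flushes whatever is pending; does nothing if the buffer is empty. */
        void flush() {
            if (pending.isEmpty()) {
                return;
            }
            flushAction.accept(new ArrayList<>(pending));
            pending.clear();
        }
    }

    public static void main(String[] args) {
        List<Integer> flushedSizes = new ArrayList<>();
        Batcher<String> batcher = new Batcher<>(500, batch -> flushedSizes.add(batch.size()));

        for (int i = 1; i <= 30_250; i++) {
            batcher.add("row-" + i);
        }
        batcher.flush(); // final partial batch

        System.out.println("flushes: " + flushedSizes.size());
        System.out.println("last batch size: " + flushedSizes.get(flushedSizes.size() - 1));
    }
}
```

In the real import the flush action would be the `EntityManager` calls, and the explicit final `flush()` corresponds to whatever is still pending when the transaction commits.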
**This is how the EntityManager is created**
@Stateful
@RequestScoped
public class Resources {
@PersistenceContext(type = PersistenceContextType.EXTENDED)
private EntityManager em;
@Produces
public EntityManager getEm() {
return em;
}
}
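This all works fine but is far too slow. I have searched online endlessly and tried applying a UserTransaction to speed up the import, but to no avail. Any help in the right direction would be much appreciated.
Cheers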
Answer 0 (score: 1)
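I don't see any transaction-related annotations, and it looks like each insert (the addBaseTableEntry method, right?) runs in its own transaction, which will be very slow.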
Try adding
@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
to your loadDatabase method.
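Edit: Change the id generation strategy to GenerationType.SEQUENCE or TABLE (whichever suits you). With the IDENTITY generation strategy, every insert has to return the newly generated id, which makes batch inserts impossible. See http://docs.jboss.org/hibernate/core/3.6/reference/en-US/html/batch.html for details.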
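To illustrate why a SEQUENCE or TABLE style generator leaves room for batching, here is a simplified simulation in plain Java. It is not Hibernate's implementation: `CounterTable` and `BlockIdAllocator` are hypothetical names, and the block size of 500 is an assumption chosen to mirror the batch size in the question. The point is that ids are reserved in blocks and are known before any INSERT is sent, so the inserts can be queued and sent together, whereas IDENTITY only yields the id after each individual INSERT has run. On MySQL, TABLE is typically the option to look at, since MySQL has no native sequences.

```java
public class BlockIdAllocatorSketch {

    /** Stands in for the counter row that a table-based generator would update. */
    static final class CounterTable {
        private long nextValue = 1;
        private int roundTrips = 0;

        /** Reserves blockSize ids in one simulated round trip and returns the first. */
        long reserveBlock(int blockSize) {
            roundTrips++;
            long first = nextValue;
            nextValue += blockSize;
            return first;
        }

        int roundTrips() {
            return roundTrips;
        }
    }

    /** Hands out ids from the current block and reserves a new block only when it runs out. */
    static final class BlockIdAllocator {
        private final CounterTable table;
        private final int blockSize;
        private long next;         // next id to hand out
        private long endExclusive; // first id beyond the current block

        BlockIdAllocator(CounterTable table, int blockSize) {
            if (blockSize <= 0) {
                throw new IllegalArgumentException("blockSize must be positive");
            }
            this.table = table;
            this.blockSize = blockSize;
        }

        long nextId() {
            if (next == endExclusive) { // block exhausted (or nothing reserved yet)
                next = table.reserveBlock(blockSize);
                endExclusive = next + blockSize;
            }
            return next++;
        }
    }

    public static void main(String[] args) {
        CounterTable table = new CounterTable();
        BlockIdAllocator ids = new BlockIdAllocator(table, 500);

        long last = 0;
        for (int i = 0; i < 30_000; i++) {
            last = ids.nextId(); // id is available before any insert would be executed
        }

        System.out.println("last id assigned: " + last);
        System.out.println("counter-table round trips: " + table.roundTrips());
    }
}
```

In the entity itself this would correspond to annotating the id field with `@GeneratedValue(strategy = GenerationType.TABLE)`, optionally together with a `@TableGenerator` whose `allocationSize` plays the role of the block size here.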