我现在已经好几天都没有遇到数据库问题了。应用程序在特定的hibernate criteria.list()之后挂起。完全由以下stacktrace:
java.net.SocketInputStream.read(byte[], int, int)
org.postgresql.core.VisibleBufferedInputStream.readMore(int)
org.postgresql.core.VisibleBufferedInputStream.ensureBytes(int)
org.postgresql.core.VisibleBufferedInputStream.read()
org.postgresql.core.PGStream.ReceiveChar()
org.postgresql.core.v3.QueryExecutorImpl.processResults(ResultHandler, int)
org.postgresql.core.v3.QueryExecutorImpl.execute(Query, ParameterList, ResultHandler, int, int, int)
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(Query, ParameterList, int)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(int)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery()
org.hibernate.internal.CriteriaImpl.list()
经过一些研究和测试后,我发现问题不是阻塞查询,而是永远执行的查询。
它是一个带有以下sessionFactory和事务管理器配置的java spring应用程序:
<bean id="dataSource" class="org.apache.commons.dbcp2.BasicDataSource" destroy-method="close">
<property name="driverClassName" value="org.postgresql.Driver" />
<property name="url" value="jdbc:postgresql://localhost:5432/database" />
<property name="username" value="username" />
<property name="password" value="password" />
</bean>
<bean id="sessionFactory"
class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="packagesToScan" value="com.myapp.domain" />
<property name="configLocation" value="/WEB-INF/hibernate.cfg.xml" />
</bean>
<tx:annotation-driven transaction-manager="transactionManager" />
<bean id="transactionManager"
class="org.springframework.orm.hibernate4.HibernateTransactionManager">
<property name="sessionFactory" ref="sessionFactory" />
底层数据库是PostgreSQL,这是当前的hibernate配置
<hibernate-configuration>
<session-factory>
<property name="dialect">org.hibernate.dialect.PostgreSQLDialect</property>
<property name="hbm2ddl.auto">none</property>
<property name="hibernate.cache.region.factory_class">org.hibernate.cache.ehcache.SingletonEhCacheRegionFactory</property>
<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="cache.use_query_cache">true</property>
<property name="hibernate.transaction.factory_class">
org.hibernate.transaction.JDBCTransactionFactory</property>
<property name="show_sql">false</property>
<property name="format_sql">true</property>
<property name="use_sql_comments">false</property>
<property name="order_updates">true</property>
</session-factory>
</hibernate-configuration>
代码中的关键区域是:
private void fillEmptyNames() throws CablewatchException {
List<Device> devicesList = deviceDao.getDevices();
if (devicesList != null) {
for (Device device : devicesList {
String name = deviceDao.getDeviceName(device.getModule().getObjectIdentifier(), device.getSubrack(), device.getSlot());
...
}
}
}
应用程序挂起第二个dao函数&#34; getDeviceName&#34;。实现如下:
@Transactional(timeout=30)
public String getDeviceName(long moduleId, int subrackNr, int slotNr) throws CablewatchException {
Criteria criteria = sessionFactory.getCurrentSession().createCriteria(Device.class).add(Restrictions.eq("module.objectIdentifier", moduleId)).add(Restrictions.eq("subrack",subrackNr)).add(Restrictions.eq("slot",slotNr)).addOrder(Order.desc("objectIdentifier")).setMaxResults(1);
List<Device> devicesList = criteria.list();
if (devicesList != null && !devicesList.isEmpty() && devicesList.get(0) instanceof Device) {
Device device = devicesList.get(0);
return device.getName();
}
return null;
}
我面临的一个细节是,同一段落在Windows下运行正常,所以目前问题只发生在Linux上。
更新: 生成的查询是:
select
this_.objectIdentifier as objectId1_0_9_,
this_.ackId as ackId11_0_9_,
this_.alarmInfoId as alarmIn12_0_9_,
this_.cleared as cleared2_0_9_,
this_.clearedTime as clearedT3_0_9_,
this_.logIndex as logIndex4_0_9_,
this_.module as module5_0_9_,
this_.neId as neId13_0_9_,
this_.occurenceTime as occurenc6_0_9_,
this_.serial as serial7_0_9_,
this_.severityId as severit14_0_9_,
this_.slot as slot8_0_9_,
this_.subrack as subrack9_0_9_,
this_.value as value10_0_9_,
acknowledg2_.objectIdentifier as objectId1_2_0_,
acknowledg2_.comment as comment2_2_0_,
acknowledg2_.username as username3_2_0_,
alarminfo3_.objectIdentifier as objectId1_1_1_,
alarminfo3_.cw_alarmMessage as cw_alarm2_1_1_,
alarminfo3_.cw_alarmOid as cw_alarm3_1_1_,
ne4_.OBJECTIDENTIFIER as OBJECTID1_8_2_,
ne4_.cw_neActive as cw_neAct2_8_2_,
ne4_.cw_neCategory as cw_neCat3_8_2_,
ne4_.cw_neFirmware as cw_neFir4_8_2_,
ne4_.cw_neHasWebInterface as cw_neHas5_8_2_,
ne4_.cw_neInetAddress as cw_neIne6_8_2_,
ne4_.cw_neInfo as cw_neInf7_8_2_,
ne4_.cw_neMacAddress as cw_neMac8_8_2_,
ne4_.cw_neModel as cw_neMod9_8_2_,
ne4_.cw_neName as cw_neNa10_8_2_,
ne4_.cw_neSerial as cw_neSe11_8_2_,
ne4_.cw_neSysDescription as cw_neSy12_8_2_,
ne4_.cw_neType as cw_neTy13_8_2_,
ne4_.cw_installationDate as cw_inst14_8_2_,
ne4_.cw_instance as cw_inst15_8_2_,
ne4_.cw_lastAlarmLogIndex as cw_last16_8_2_,
ne4_.cw_locationId as cw_loca19_8_2_,
ne4_.cw_readCommunity as cw_read17_8_2_,
ne4_.cw_severityId as cw_seve20_8_2_,
ne4_.cw_writeCommunity as cw_writ18_8_2_,
location5_.objectIdentifier as objectId1_5_3_,
location5_.cw_imageName as cw_image2_5_3_,
location5_.cw_locationCity as cw_locat3_5_3_,
location5_.cw_locationCode as cw_locat4_5_3_,
location5_.cw_locationContact as cw_locat5_5_3_,
location5_.cw_locationDescription1 as cw_locat6_5_3_,
location5_.cw_locationDescription2 as cw_locat7_5_3_,
location5_.cw_locationName as cw_locat8_5_3_,
location5_.cw_locationStreet as cw_locat9_5_3_,
location5_.cw_locationType as cw_loca10_5_3_,
location5_.cw_parentLocationId as cw_pare11_5_3_,
location5_.cw_severityId as cw_seve12_5_3_,
location5_.cw_sublocationSeverityId as cw_subl13_5_3_,
location6_.objectIdentifier as objectId1_5_4_,
location6_.cw_imageName as cw_image2_5_4_,
location6_.cw_locationCity as cw_locat3_5_4_,
location6_.cw_locationCode as cw_locat4_5_4_,
location6_.cw_locationContact as cw_locat5_5_4_,
location6_.cw_locationDescription1 as cw_locat6_5_4_,
location6_.cw_locationDescription2 as cw_locat7_5_4_,
location6_.cw_locationName as cw_locat8_5_4_,
location6_.cw_locationStreet as cw_locat9_5_4_,
location6_.cw_locationType as cw_loca10_5_4_,
location6_.cw_parentLocationId as cw_pare11_5_4_,
location6_.cw_severityId as cw_seve12_5_4_,
location6_.cw_sublocationSeverityId as cw_subl13_5_4_,
severity7_.id as id1_15_5_,
severity7_.cw_severityColor as cw_sever2_15_5_,
severity7_.cw_severityName as cw_sever3_15_5_,
severity8_.id as id1_15_6_,
severity8_.cw_severityColor as cw_sever2_15_6_,
severity8_.cw_severityName as cw_sever3_15_6_,
severity9_.id as id1_15_7_,
severity9_.cw_severityColor as cw_sever2_15_7_,
severity9_.cw_severityName as cw_sever3_15_7_,
severity10_.id as id1_15_8_,
severity10_.cw_severityColor as cw_sever2_15_8_,
severity10_.cw_severityName as cw_sever3_15_8_
from
CW_ALARM this_
left outer join
CW_Acknowledgment acknowledg2_
on this_.ackId=acknowledg2_.objectIdentifier
left outer join
CW_ALARMINFO alarminfo3_
on this_.alarmInfoId=alarminfo3_.objectIdentifier
left outer join
CW_NE ne4_
on this_.neId=ne4_.OBJECTIDENTIFIER
left outer join
CW_LOCATION location5_
on ne4_.cw_locationId=location5_.objectIdentifier
left outer join
CW_LOCATION location6_
on location5_.cw_parentLocationId=location6_.objectIdentifier
left outer join
CW_SEVERITY severity7_
on location6_.cw_severityId=severity7_.id
left outer join
CW_SEVERITY severity8_
on location6_.cw_sublocationSeverityId=severity8_.id
left outer join
CW_SEVERITY severity9_
on ne4_.cw_severityId=severity9_.id
left outer join
CW_SEVERITY severity10_
on this_.severityId=severity10_.id
where
this_.neId=?
and this_.subrack=?
and this_.slot=?
and this_.module<>?
order by
this_.objectIdentifier desc limit ?
我是从pgAdmin执行它(并用它们的值替换了参数),它工作正常。以下是查询计划:
"Limit (cost=25819.66..25819.66 rows=1 width=1185)"
" -> Sort (cost=25819.66..25819.66 rows=1 width=1185)"
" Sort Key: this_.objectidentifier"
" -> Nested Loop Left Join (cost=0.00..25819.65 rows=1 width=1185)"
" -> Nested Loop Left Join (cost=0.00..25811.37 rows=1 width=1021)"
" -> Nested Loop Left Join (cost=0.00..25803.09 rows=1 width=857)"
" -> Nested Loop Left Join (cost=0.00..25799.21 rows=1 width=693)"
" -> Nested Loop Left Join (cost=0.00..25795.33 rows=1 width=529)"
" -> Nested Loop Left Join (cost=0.00..25793.45 rows=1 width=464)"
" Join Filter: (ne4_.cw_locationid = location5_.objectidentifier)"
" -> Nested Loop Left Join (cost=0.00..25791.22 rows=1 width=399)"
" Join Filter: (this_.neid = ne4_.objectidentifier)"
" -> Nested Loop Left Join (cost=0.00..25788.76 rows=1 width=225)"
" -> Nested Loop Left Join (cost=0.00..25780.48 rows=1 width=150)"
" Join Filter: (this_.ackid = acknowledg2_.objectidentifier)"
" -> Seq Scan on cw_alarm this_ (cost=0.00..25779.32 rows=1 width=132)"
" Filter: (((module)::text <> ''::text) AND (neid = 471) AND (subrack = (-1)) AND (slot = (-1)))"
" -> Seq Scan on cw_acknowledgment acknowledg2_ (cost=0.00..1.07 rows=7 width=18)"
" -> Index Scan using cw_alarminfo_pkey on cw_alarminfo alarminfo3_ (cost=0.00..8.27 rows=1 width=75)"
" Index Cond: (this_.alarminfoid = objectidentifier)"
" -> Seq Scan on cw_ne ne4_ (cost=0.00..2.45 rows=1 width=174)"
" Filter: (objectidentifier = 471)"
" -> Seq Scan on cw_location location5_ (cost=0.00..2.10 rows=10 width=65)"
" -> Index Scan using cw_location_pkey on cw_location location6_ (cost=0.00..1.87 rows=1 width=65)"
" Index Cond: (location5_.cw_parentlocationid = objectidentifier)"
" -> Index Scan using cw_severity_pkey on cw_severity severity7_ (cost=0.00..3.87 rows=1 width=164)"
" Index Cond: (location6_.cw_severityid = id)"
" -> Index Scan using cw_severity_pkey on cw_severity severity8_ (cost=0.00..3.87 rows=1 width=164)"
" Index Cond: (location6_.cw_sublocationseverityid = id)"
" -> Index Scan using cw_severity_pkey on cw_severity severity9_ (cost=0.00..8.27 rows=1 width=164)"
" Index Cond: (ne4_.cw_severityid = id)"
" -> Index Scan using cw_severity_pkey on cw_severity severity10_ (cost=0.00..8.27 rows=1 width=164)"
" Index Cond: (this_.severityid = id)"
关于Linux和Windows版本的详细信息,我尝试了相同的测试但是没有使用本地数据库postgresql_9.1.13(Debian)我使用远程访问postgresql_9.3.5(Windows)并安装并尝试了postgresql_9 .1.13(Windows)中。两者都正常工作。
我也尝试了从Windows系统到远程postgresql_9.1.13(Debian)的相同代码和另一台带有远程postgresql_9.1.15(Debian)的机器。在这两种情况下都会出现问题。
似乎问题可能出在Linux版本的postgresql_9.1.x上。
提前致谢。
答案 0 :(得分:0)
使用Craig和Clemens提供的命令(显示当前查询,pg_stat_activity,解析分析select,pg_locks等)调试数据库后,我发现它不是一个无限循环但是由于访问超过20.000个条目的运行时间很长。由于ORM层,这个时间甚至延长到超过几个小时。我正在对数据库进行小规模重新设计以优化此问题。
谢谢大家的支持。