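I have a Hibernate/JPA query that performs a simple lookup on an indexed column. The table is fairly large, with about 28 million rows. When run directly against the database the query returns immediately, but through Hibernate + JPA it takes about 20 seconds. The database is PostgreSQL 9.0.4.
Here is a log excerpt containing the query in question. The timestamps show the long delay between issuing the query and opening the ResultSet.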
2011-09-28 06:26:38,019 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to open PreparedStatement (open PreparedStatements: 0, globally: 0)
2011-09-28 06:26:38,019 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.ConnectionManager - opening JDBC connection
2011-09-28 06:26:38,020 [http-thread-pool-8080(5)] DEBUG org.hibernate.SQL - select pafaddress0_.address_id as address1_0_, pafaddress0_.admin1 as admin2_0_, pafaddress0_.admin2 as admin3_0_, pafaddress0_.admin3 as admin4_0_, pafaddress0_.barcode as barcode0_, pafaddress0_.building_name as building6_0_, pafaddress0_.building_number as building7_0_, pafaddress0_.country_name as country8_0_, pafaddress0_.dependent_locality as dependent9_0_, pafaddress0_.dependent_street as dependent10_0_, pafaddress0_.dept_name as dept11_0_, pafaddress0_.double_dep_locality as double12_0_, pafaddress0_.dps as dps0_, pafaddress0_.mailsort as mailsort0_, pafaddress0_.org_name as org15_0_, pafaddress0_.pca_id as pca16_0_, pafaddress0_.pobox as pobox0_, pafaddress0_.postcode as postcode0_, pafaddress0_.postcode1 as postcode19_0_, pafaddress0_.postcode_type as postcode20_0_, pafaddress0_.street as street0_, pafaddress0_.su_org_indicator as su22_0_, pafaddress0_.subbuilding_name as subbuil23_0_, pafaddress0_.posttown as posttown0_, pafaddress0_.trad_county as trad25_0_, pafaddress0_.type as type0_, pafaddress0_.udprn as udprn0_ from paf pafaddress0_ where pafaddress0_.pca_id=? limit ?
2011-09-28 06:27:01,897 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to open ResultSet (open ResultSets: 0, globally: 0)
2011-09-28 06:27:01,900 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to close ResultSet (open ResultSets: 1, globally: 1)
2011-09-28 06:27:01,900 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to close PreparedStatement (open PreparedStatements: 1, globally: 1)
2011-09-28 06:27:01,900 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.ConnectionManager - aggressively releasing JDBC connection
My entity has no complex associations, and nothing in the log indicates what is happening during the long delay.
@Entity
@Table(name = "paf")
@NamedQueries({
@NamedQuery(name = "PafAddress.findByPcaId", query = "SELECT a FROM PafAddress a where a.pcaId=:pcaId ")
})
public class PafAddress implements Serializable, Identifiable<Long> {
private static final long serialVersionUID = -5831210090000717285L;
@Id
@Column(name = "address_id")
private Long id;
private String postcode;
@Column(name = "posttown")
private String town;
@Column(name = "dependent_locality")
private String dependentLocality;
@Column(name = "double_dep_locality")
private String doubleDepLocality;
@Column(name = "street")
private String street;
@Column(name = "dependent_street")
private String dependentStreet;
@Column(name = "building_number")
private String buildingNumber;
@Column(name = "building_name")
private String buildingName;
@Column(name = "subbuilding_name")
private String subbuildingName;
@Column(name = "pobox")
private String pobox;
@Column(name = "dept_name")
private String deptName;
@Column(name = "org_name")
private String orgName;
@Column(name = "udprn")
private String udprn;
@Column(name = "postcode_type")
private String postcodeType;
@Column(name = "su_org_indicator")
private String suOrgIndicator;
@Column(name = "dps")
private String dps;
@Column(name = "postcode1")
private String postcode1;
@Column(name = "admin1")
private String admin1;
@Column(name = "admin2")
private String admin2;
@Column(name = "admin3")
private String admin3;
@Column(name = "trad_county")
private String tradCounty;
@Column(name = "country_name")
private String countryName;
@Column(name = "mailsort")
private String mailsort;
@Column(name = "barcode")
private String barcode;
@Column(name = "type")
private String type;
@Column(name = "pca_id")
private Double pcaId;
public PafAddress() {
}
Any ideas as to why this query performs so badly, or how I can determine the cause?
Answer 0 (score: 1)
I found the problem.
I enabled auto_explain in PostgreSQL for queries taking longer than 1 second, which let me see the query plans in the server log. See http://www.postgresql.org/docs/current/static/auto-explain.html
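For reference, a minimal postgresql.conf sketch of that kind of setup. The 1 second threshold matches the description above; the remaining lines follow the auto_explain documentation and are an assumption about how it was configured here. On 9.0 the custom_variable_classes line is required (it was removed in 9.2), and changing shared_preload_libraries needs a server restart.

```
# postgresql.conf (sketch; exact settings used in this case are assumed)
shared_preload_libraries = 'auto_explain'

# Only needed on PostgreSQL versions before 9.2
custom_variable_classes = 'auto_explain'

# Log the plan of any statement that runs longer than 1 second
auto_explain.log_min_duration = '1s'
```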
For the Hibernate/JPA query:
Limit (cost=0.00..13.12 rows=2 width=469)
-> Seq Scan on paf pafaddress0_ (cost=0.00..943043.74 rows=143753 width=469)
Filter: ((pca_id)::double precision = 7912139::double precision)
For the native database query:
QUERY PLAN
Limit (cost=0.00..7.03 rows=2 width=469)
-> Index Scan using paf_pca_idx on paf pafaddress0_ (cost=0.00..505649.83 rows=143753 width=469)
Index Cond: (pca_id = 7912139::numeric)
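The Hibernate query is slow because a sequential scan is performed instead of an index scan, which appears to be the result of a type mismatch between my entity and the database column. The entity property is a Double while the column type is numeric, and the query plan shows the column itself being cast to double precision. Because the cast is applied to the column rather than to the parameter, the index on pca_id cannot be used for the comparison.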
I changed the database column type to double precision, and the query now performs as expected.
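An alternative direction, untested against this schema, would be to keep the column as numeric and make the bound parameter numeric instead. With JPA that would presumably mean declaring pcaId as java.math.BigDecimal rather than Double. The plain JDBC sketch below illustrates the same idea: setBigDecimal sends a numeric parameter, whereas setDouble sends double precision, which is what forces the cast seen in the slow plan. The class and method names (PafLookup, toNumericParam, findAddressIds) are hypothetical and only for illustration.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original post.
class PafLookup {

    // Convert the id to BigDecimal so the driver can bind it as numeric
    // instead of double precision.
    static BigDecimal toNumericParam(long pcaId) {
        return BigDecimal.valueOf(pcaId);
    }

    // Returns the address_id values of rows matching pca_id, at most `limit` rows.
    // Table and column names are taken from the logged SQL in the question.
    static List<Long> findAddressIds(Connection conn, long pcaId, int limit)
            throws SQLException {
        String sql = "select address_id from paf where pca_id = ? limit ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            // Using ps.setDouble(1, ...) here would send a double precision
            // parameter, the situation shown in the slow plan.
            ps.setBigDecimal(1, toNumericParam(pcaId));
            ps.setInt(2, limit);
            List<Long> ids = new ArrayList<>();
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getLong(1));
                }
            }
            return ids;
        }
    }
}
```

Whichever side is changed, it is worth re-checking the plan in the auto_explain output to confirm that the Filter line with the cast has been replaced by an Index Cond.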
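Answer 1 (score: 0)
If the SQL is the same, the bottleneck must be the process of creating a separate Java object for each record. I assume the query returns a large number of records; you could try limiting that number.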