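When I call the saveAll method of my JpaRepository with a long List<Entity> from the service layer, Hibernate's trace logging shows a single SQL statement being issued per entity.
Can I force it to do a bulk insert (i.e. multi-row) without having to fiddle manually with the EntityManager, transactions, or even raw SQL statement strings?
By multi-row insert I do not just mean going from: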
start transaction
INSERT INTO table VALUES (1, 2)
end transaction
start transaction
INSERT INTO table VALUES (3, 4)
end transaction
start transaction
INSERT INTO table VALUES (5, 6)
end transaction
to:
start transaction
INSERT INTO table VALUES (1, 2)
INSERT INTO table VALUES (3, 4)
INSERT INTO table VALUES (5, 6)
end transaction
but instead to:
start transaction
INSERT INTO table VALUES (1, 2), (3, 4), (5, 6)
end transaction
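To make the target shape concrete, here is a small Java sketch that only builds the text of such a multi-row statement. The class and method names are hypothetical and are not part of Spring or Hibernate; the values are inlined purely to mirror the example above, and real code would bind parameters instead.

```java
import java.util.List;
import java.util.StringJoiner;

public class MultiRowInsertSketch {

    // Builds one INSERT statement whose VALUES clause carries every row.
    // Illustration only: values are inlined, not bound as parameters.
    public static String multiRowInsert(String table, List<int[]> rows) {
        StringJoiner tuples = new StringJoiner(", ");
        for (int[] row : rows) {
            StringJoiner values = new StringJoiner(", ", "(", ")");
            for (int v : row) {
                values.add(Integer.toString(v));
            }
            tuples.add(values.toString());
        }
        return "INSERT INTO " + table + " VALUES " + tuples;
    }

    public static void main(String[] args) {
        List<int[]> rows = List.of(new int[] {1, 2}, new int[] {3, 4}, new int[] {5, 6});
        // Prints: INSERT INTO table VALUES (1, 2), (3, 4), (5, 6)
        System.out.println(multiRowInsert("table", rows));
    }
}
```

This is the statement shape I would like the persistence layer to produce for me, rather than something I want to hand-write in the service.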
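In PROD I'm using CockroachDB, and the difference in performance is significant.
Below is a minimal example that reproduces the problem (H2 for simplicity).
./src/main/kotlin/ThingService.kt: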
package things

import org.springframework.boot.autoconfigure.SpringBootApplication
import org.springframework.boot.runApplication
import org.springframework.web.bind.annotation.RestController
import org.springframework.web.bind.annotation.GetMapping
import org.springframework.data.jpa.repository.JpaRepository
import javax.persistence.Entity
import javax.persistence.Id
import javax.persistence.GeneratedValue

interface ThingRepository : JpaRepository<Thing, Long> {
}

@RestController
class ThingController(private val repository: ThingRepository) {
    @GetMapping("/test_trigger")
    fun trigger() {
        val things: MutableList<Thing> = mutableListOf()
        for (i in 3000..3013) {
            things.add(Thing(i))
        }
        repository.saveAll(things)
    }
}

@Entity
data class Thing (
    var value: Int,
    @Id
    @GeneratedValue
    var id: Long = -1
)

@SpringBootApplication
class Application {
}

fun main(args: Array<String>) {
    runApplication<Application>(*args)
}
./src/main/resources/application.properties:
jdbc.driverClassName = org.h2.Driver
jdbc.url = jdbc:h2:mem:db
jdbc.username = sa
jdbc.password = sa
hibernate.dialect=org.hibernate.dialect.H2Dialect
hibernate.hbm2ddl.auto=create
spring.jpa.generate-ddl = true
spring.jpa.show-sql = true
spring.jpa.properties.hibernate.jdbc.batch_size = 10
spring.jpa.properties.hibernate.order_inserts = true
spring.jpa.properties.hibernate.order_updates = true
spring.jpa.properties.hibernate.jdbc.batch_versioned_data = true
./build.gradle.kts:
import org.jetbrains.kotlin.gradle.tasks.KotlinCompile

plugins {
    val kotlinVersion = "1.2.30"
    id("org.springframework.boot") version "2.0.2.RELEASE"
    id("org.jetbrains.kotlin.jvm") version kotlinVersion
    id("org.jetbrains.kotlin.plugin.spring") version kotlinVersion
    id("org.jetbrains.kotlin.plugin.jpa") version kotlinVersion
    id("io.spring.dependency-management") version "1.0.5.RELEASE"
}

version = "1.0.0-SNAPSHOT"

tasks.withType<KotlinCompile> {
    kotlinOptions {
        jvmTarget = "1.8"
        freeCompilerArgs = listOf("-Xjsr305=strict")
    }
}

repositories {
    mavenCentral()
}

dependencies {
    compile("org.springframework.boot:spring-boot-starter-web")
    compile("org.springframework.boot:spring-boot-starter-data-jpa")
    compile("org.jetbrains.kotlin:kotlin-stdlib-jdk8")
    compile("org.jetbrains.kotlin:kotlin-reflect")
    compile("org.hibernate:hibernate-core")
    compile("com.h2database:h2")
}
Run with:
./gradlew bootRun
Trigger the DB INSERTs:
curl http://localhost:8080/test_trigger
Log output:
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: select thing0_.id as id1_0_0_, thing0_.value as value2_0_0_ from thing thing0_ where thing0_.id=?
Hibernate: call next value for hibernate_sequence
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
Hibernate: insert into thing (value, id) values (?, ?)
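Answer 0 (score: 31)
To get a bulk insert with Spring Boot and Spring Data JPA you need only two things:
1. Set the option spring.jpa.properties.hibernate.jdbc.batch_size to an appropriate value for your needs (for example: 20).
2. Use the saveAll() method of your repository with the list of entities prepared for insertion.
A working example is here.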
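As a rough mental model of what batch_size does, the statements for the entities are grouped into chunks of at most that many before each round trip. The Java sketch below only mimics that grouping; the class and method names are hypothetical, and it assumes (rather than demonstrates) that Hibernate sends one JDBC batch per chunk of same-shaped inserts.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSizeSketch {

    // Splits the items into consecutive chunks of at most batchSize elements.
    // Assumption: each chunk corresponds to one JDBC batch sent to the database.
    public static <T> List<List<T>> chunk(List<T> items, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // 14 values, as in the question's loop over 3000..3013.
        List<Integer> values = new ArrayList<>();
        for (int i = 3000; i <= 3013; i++) {
            values.add(i);
        }
        // With a batch size of 10 this prints 10 and then 4.
        for (List<Integer> batch : chunk(values, 10)) {
            System.out.println(batch.size());
        }
    }
}
```

Under that assumption, the question's 14 entities with batch_size = 10 would need two round trips for the inserts instead of fourteen.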
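Regarding the transformation of the insert statements into something like this:
INSERT INTO table VALUES (1, 2), (3, 4), (5, 6)
This is available in PostgreSQL: you can set the option reWriteBatchedInserts to true in the JDBC connection string:
jdbc:postgresql://localhost:5432/db?reWriteBatchedInserts=true
The JDBC driver will then perform this transformation.
Additional information about batching can be found here.
UPDATED
Demo project in Kotlin: sb-kotlin-batch-insert-demo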
Answer 1 (score: 6)
The underlying issue is the following code in SimpleJpaRepository:
@Transactional
public <S extends T> S save(S entity) {
    if (entityInformation.isNew(entity)) {
        em.persist(entity);
        return entity;
    } else {
        return em.merge(entity);
    }
}
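To see why the branch matters for the entity in the question, here is a simplified Java model of the default "is new" decision as I understand it for an entity with no version attribute that does not implement Persistable: a null id means new, and for a primitive numeric id, 0 means new. The class name and the boolean parameter are hypothetical; this is not the Spring Data source code.

```java
public class IsNewSketch {

    // Simplified model of the default check (an assumption, not the library code):
    // non-primitive id types are new when the id is null,
    // primitive numeric id types are new when the id equals 0.
    public static boolean isNew(Object id, boolean primitiveIdType) {
        if (!primitiveIdType) {
            return id == null;
        }
        if (id instanceof Number) {
            return ((Number) id).longValue() == 0L;
        }
        throw new IllegalArgumentException("Unsupported primitive id type: " + id);
    }

    public static void main(String[] args) {
        System.out.println(isNew(null, false)); // true
        System.out.println(isNew(0L, true));    // true
        System.out.println(isNew(-1L, true));   // false
    }
}
```

Under this model an id that defaults to -1, as in the question's Thing, is never considered new, so save() would take the merge() branch. That would be consistent with the select that precedes each sequence call in the question's log.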
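In addition to the batch size property settings, you have to make sure that SimpleJpaRepository calls persist and not merge. There are a few ways to achieve this. Use an @Id generator that does not query a sequence, for example:
@Id
@GeneratedValue(generator = "uuid2")
@GenericGenerator(name = "uuid2", strategy = "uuid2")
var id: UUID? = null // the uuid2 strategy produces a java.util.UUID, not a Long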
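Or force persist by having your entity implement Persistable and overriding the isNew() call:
@Entity
public class Thing implements Persistable<Long> {
    private int value;
    @Id
    @GeneratedValue
    private Long id;
    // Not stored in the database; tracks whether this instance has been saved or loaded yet.
    @Transient
    private boolean isNew = true;
    @PostPersist
    @PostLoad
    void markNotNew() {
        this.isNew = false;
    }
    @Override
    public Long getId() {
        return id;
    }
    @Override
    public boolean isNew() {
        return isNew;
    }
}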
Or override save(List) and call persist() on the entity manager:
@Repository
public class ThingRepository extends SimpleJpaRepository<Thing, Long> {
    private EntityManager entityManager;
    public ThingRepository(EntityManager entityManager) {
        super(Thing.class, entityManager);
        this.entityManager = entityManager;
    }
    @Transactional
    public List<Thing> save(List<Thing> things) {
        things.forEach(thing -> entityManager.persist(thing));
        return things;
    }
}
The code above is based on the following links:
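Answer 2 (score: 3)
You can configure Hibernate to do bulk DML. Have a look at Spring Data JPA - concurrent Bulk inserts/updates. I think part 2 of that answer could solve your problem:
Enable the batching of DML statements. Enabling batching support results in fewer round trips to the database to insert/update the same number of records.
Quoting from batch INSERT and UPDATE statements: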
hibernate.jdbc.batch_size = 50
hibernate.order_inserts = true
hibernate.order_updates = true
hibernate.jdbc.batch_versioned_data = true
UPDATE: You have to set the Hibernate properties differently in your application.properties file. They live under the namespace spring.jpa.properties.*. An example could look like this:
spring.jpa.properties.hibernate.jdbc.batch_size = 50
spring.jpa.properties.hibernate.order_inserts = true
....
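Answer 3 (score: 0)
All of the approaches mentioned work, but they will be slow, especially if the source of the inserted data is another table. Firstly, even with batch_size>1 the insert operation is executed as multiple SQL queries. Secondly, if the source data lives in another table, you need additional queries to fetch it (in the worst case loading all of it into memory) and then convert it into static bulk inserts. Thirdly, with a separate persist() call for each entity (even with batching enabled), you bloat the entity manager's first-level cache with all of those entity instances.
But Hibernate offers another option. If you use Hibernate as the JPA provider, you can fall back to HQL, which supports bulk inserts natively with a subselect from another table. Example:
Session session = entityManager.unwrap(Session.class);
session.createQuery("insert into Entity (field1, field2) select [...] from [...]")
    .executeUpdate();
Whether this works depends on your ID generation strategy. If Entity.id is generated by the database (for example MySQL auto increment), it executes successfully. If Entity.id is generated by your code (especially true for UUID generators), it fails with an "unsupported id generation method" exception.
In the latter case, however, the problem can be overcome with a custom SQL function. For example, in PostgreSQL I use the uuid-ossp extension, which provides the uuid_generate_v4() function, and I register that function in my custom dialect: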
import org.hibernate.dialect.PostgreSQL10Dialect;
import org.hibernate.dialect.function.StandardSQLFunction;
import org.hibernate.type.PostgresUUIDType;

public class MyPostgresDialect extends PostgreSQL10Dialect {
    public MyPostgresDialect() {
        registerFunction("uuid_generate_v4",
            new StandardSQLFunction("uuid_generate_v4", PostgresUUIDType.INSTANCE));
    }
}
Then I register this class as the Hibernate dialect:
hibernate.dialect=MyPostgresDialect
Finally, I can use this function in the bulk insert query:
Session session = entityManager.unwrap(Session.class);
session.createQuery("insert into Entity (id, field1, field2) " +
        "select uuid_generate_v4(), [...] from [...]")
    .executeUpdate();
Most important is the underlying SQL that Hibernate generates to accomplish this, which is just a single query:
insert into entity ( id, [...] ) select uuid_generate_v4(), [...] from [...]
Answer 4 (score: -2)
First, I added the two configuration settings.
Then I used the saveAll() method:
userInfoRepository.saveAll(userInfoList);
But the log shows the following:
Hibernate: insert into user_info (time_inst, time_upd, adress, age, education, login_name, login_pwd, phone, sex, user_name) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: insert into user_info (time_inst, time_upd, adress, age, education, login_name, login_pwd, phone, sex, user_name) values (?, ?
It does not work.