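I'm using Play 2.5 and Slick 3.1.1, and I'm trying to build the best query for several one-to-many relations plus a one-to-one relation. My database model looks like this: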
case class Accommodation(id: Option[Long], landlordId: Long, name: String)
case class LandLord(id: Option[Long], name: String)
case class Address(id: Option[Long], accommodationId: Long, street: String)
case class ExtraCharge(id: Option[Long], accommodationId: Long, title: String)
And for the output data:
case class AccommodationFull(accommodation: Accommodation, landLord: LandLord, extraCharges:Seq[ExtraCharge], addresses:Seq[Address])
I have written two queries that fetch an accommodation by id:
/** Retrieve an accommodation by id, using several separate queries. */
def findByIdFullMultipleQueries(id: Long): Future[Option[AccommodationFull]] = {
  val q = for {
    (a, l) <- accommodations join landLords on (_.landlordId === _.id)
    if a.id === id
  } yield (a, l)

  for {
    (data) <- db.run(q.result.headOption)
    (ex) <- db.run(extraCharges.filter(_.accommodationId === id).result)
    (add) <- db.run(addresses.filter(_.accommodationId === id).result)
  } yield data.map { accLord => AccommodationFull(accLord._1, accLord._2, ex, add) }
}
/** Retrieve an accommodation by id, using a single join query. */
def findByIdFull(id: Long): Future[Option[AccommodationFull]] = {
  val qr = accommodations.filter(_.id === id).join(landLords).on(_.landlordId === _.id)
    .joinLeft(extraCharges).on(_._1.id === _.accommodationId)
    .joinLeft(addresses).on(_._1._1.id === _.accommodationId)
    .result.map { res =>
      res.groupBy(_._1._1._1.id).headOption.map {
        case (k, v) =>
          val addresses = v.flatMap(_._2).distinct
          val extraCharges = v.flatMap(_._1._2).distinct
          val landLord = v.map(_._1._1._2).head
          val accommodation = v.map(_._1._1._1).head
          AccommodationFull(accommodation, landLord, extraCharges, addresses)
      }
    }
  db.run(qr)
}
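In my tests, the multiple-query version is about 5 times faster than the join version. How can I write a more efficient join query?
=== UPDATE ===
I'm testing on PostgreSQL 9.3 with this data: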
private[bootstrap] object InitialData {
  def landLords = (1L to 10000L).map { id =>
    LandLord(Some(id), s"Good LandLord $id")
  }
  def accommodations = (1L to 10000L).map { id =>
    Accommodation(Some(id), s"Nice house $id", 100 * id, 3, 5, 500, 1l, None)
  }
  def extraCharge = (1L to 10000L).flatMap { id =>
    (1 to 100).map { nr =>
      ExtraCharge(None, id, s"Extra $nr", 100.0)
    }
  }
  def addresses = (1L to 1000L).flatMap { id =>
    (1 to 100).map { nr =>
      Address(None, id, s"Słoneczna 4 - $nr", "17-200", "", "PL")
    }
  }
}
Here are the results of several runs (in ms):
JOIN: 367
MULTI: 146
JOIN: 306
MULTI: 110
JOIN: 300
MULTI: 103
=== UPDATE 2 ===
After adding indexes the results are better, but the multiple-query version is still faster:
def accommodationLandLordIdIndex = index("ACCOMMODATION__LANDLORD_ID__INDEX", landlordId, unique = false)
def addressAccommodationIdIndex = index("ADDRESS__ACCOMMODATION_ID__INDEX", accommodationId, unique = false)
def extraChargeAccommodationIdIndex = index("EXTRA_CHARGE__ACCOMMODATION_ID__INDEX", accommodationId, unique = false)
I ran this test:
val multiResult = (1 to 1000).map { i =>
  val start = System.currentTimeMillis()
  Await.result(accommodationDao.findByIdFullMultipleQueries(i), Duration.Inf)
  System.currentTimeMillis() - start
}
println(s"MULTI AVG Result: ${multiResult.sum.toDouble / multiResult.length}")

val joinResult = (1 to 1000).map { i =>
  val start = System.currentTimeMillis()
  Await.result(accommodationDao.findByIdFull(i), Duration.Inf)
  System.currentTimeMillis() - start
}
println(s"JOIN AVG Result: ${joinResult.sum.toDouble / joinResult.length}")
Here are the results of 2 runs (average ms per call):
MULTI AVG Result: 3.287
JOIN AVG Result: 96.797
MULTI AVG Result: 3.206
JOIN AVG Result: 100.221
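Answer 0 (score: 2)
Postgres does not add indexes for foreign key columns. The multiple-query version uses an index on all three tables (the primary key), whereas the single join query has to scan the joined tables to find the required IDs.
Try adding an index on the accommodationId columns.
Update:
An index would help if these were 1:1 relationships, but these look like 1:many relationships. In that case, a join followed by a distinct filter returns far more data from the database than you need.
For your data model, running multiple queries looks like the right way to handle the data.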
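To make the "more data than you need" point concrete, here is a small in-memory sketch in plain Scala (no Slick or database involved). It assumes one accommodation with 100 extra charges and 100 addresses, as in the question's seed data; the case classes are simplified, hypothetical stand-ins for the real model. Joining two independent 1:many relations on the same parent pairs every extra charge with every address.

```scala
object JoinFanOut {
  // Simplified stand-ins for the question's case classes (illustration only).
  case class ExtraCharge(id: Long, accommodationId: Long, title: String)
  case class Address(id: Long, accommodationId: Long, street: String)

  val accommodationId = 1L
  val extraCharges = (1L to 100L).map(n => ExtraCharge(n, accommodationId, s"Extra $n"))
  val addresses = (1L to 100L).map(n => Address(n, accommodationId, s"Street $n"))

  // Shape of the single-join result: each extra charge is repeated once per address.
  val joinedRows = for {
    e <- extraCharges
    a <- addresses
    if e.accommodationId == a.accommodationId
  } yield (e, a)

  // Shape of the multi-query result: one accommodation/landlord row
  // plus each child table fetched exactly once.
  val multiQueryRows = 1 + extraCharges.size + addresses.size

  // The client-side `distinct` recovers the original lists,
  // but only after all joined rows have been transferred and decoded.
  val distinctCharges = joinedRows.map(_._1).distinct
  val distinctAddresses = joinedRows.map(_._2).distinct

  def main(args: Array[String]): Unit = {
    println(s"rows from the join:      ${joinedRows.size}")   // 10000
    println(s"rows from three queries: $multiQueryRows")      // 201
  }
}
```

Under these assumptions the join ships 10,000 rows to rebuild 200 child records, which is consistent with the join staying slower even after the indexes were added.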
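Answer 1 (score: 1)
I think it depends on your database engine. Slick can generate queries that are not optimal (see the docs), but you need to profile the query at the database level to understand what is happening and optimize it.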
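One possible starting point for that profiling is to print the SQL that Slick generates and then run it by hand under EXPLAIN ANALYZE in psql. The following is a minimal sketch, assuming Slick 3.1.x (where the Postgres driver lives in slick.driver.PostgresDriver) and a hypothetical cut-down ExtraCharges table; the table and column names are assumptions, not the question's real mapping.

```scala
// Slick 3.1.x import; in Slick 3.2+ the equivalent is slick.jdbc.PostgresProfile.api._
import slick.driver.PostgresDriver.api._

object ShowGeneratedSql {
  // Hypothetical minimal table definition, for illustration only.
  class ExtraCharges(tag: Tag) extends Table[(Long, Long, String)](tag, "EXTRA_CHARGE") {
    def id = column[Long]("ID", O.PrimaryKey, O.AutoInc)
    def accommodationId = column[Long]("ACCOMMODATION_ID")
    def title = column[String]("TITLE")
    def * = (id, accommodationId, title)
  }
  val extraCharges = TableQuery[ExtraCharges]

  def main(args: Array[String]): Unit = {
    // `statements` renders the SQL without opening a database connection.
    val action = extraCharges.filter(_.accommodationId === 1L).result
    action.statements.foreach(println)
  }
}
```

The printed statement can be pasted into psql after `EXPLAIN ANALYZE` to see whether the planner uses an index scan or a sequential scan, and how many rows each step produces.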