TL;DR
Since I've realized that INTERVAL isn't the best solution, how can I improve this query:
SELECT * FROM category c WHERE c.createdAt BETWEEN (
select c.createdAt from category c order by c.createdAt desc limit 1
) - INTERVAL 30 day AND (
select c.createdAt from category c order by c.createdAt desc limit 1
) AND c.budgetId = (
select c.budgetId from category c order by c.createdAt desc limit 1
) order by c.`order` desc;
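so that it doesn't miss the row below, and is resilient enough that I don't have to rely on INTERVAL but instead get every category record that shares the budgetId of the last created category?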
+----+--------+--------+----------+-------+----------------------------+----------------------------+
| id | name | type | budgetId | order | updatedAt | createdAt |
+----+--------+--------+----------+-------+----------------------------+----------------------------+
| 1 | Income | INCOME | 1 | 0 | 2019-01-25 07:38:22.000000 | 2018-12-29 13:49:29.414187 |
+----+--------+--------+----------+-------+----------------------------+----------------------------+
1 row in set (0.00 sec)
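One possible direction, offered as a sketch rather than a verified fix: drop the date window and filter only on the budgetId of the most recently created category. The helper below only builds that SQL text. `buildLatestBudgetCategoriesSql` and its `table` parameter are hypothetical names, and the `id` tie-breaker is an assumption added because many rows in the sample share an identical createdAt.

```typescript
// Hypothetical helper: builds a query that selects every category sharing
// the budgetId of the most recently created category, with no INTERVAL.
function buildLatestBudgetCategoriesSql(table: string = 'category'): string {
  // Guard the identifier, since it is interpolated into the SQL text.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(table)) {
    throw new Error(`Invalid table name: ${table}`);
  }
  return [
    `SELECT c.* FROM \`${table}\` c`,
    'WHERE c.budgetId = (',
    // Assumption: id breaks ties between rows with identical createdAt.
    `  SELECT latest.budgetId FROM \`${table}\` latest`,
    '  ORDER BY latest.createdAt DESC, latest.id DESC LIMIT 1',
    ')',
    'ORDER BY c.`order` DESC',
  ].join('\n');
}

console.log(buildLatestBudgetCategoriesSql());
```

Against the sample data further down, the newest row (id 68) has budgetId 1, so a query of this shape should return every budgetId 1 row, including id 1. It has not been run against a real database here.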
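I'm building a budgeting application. One of the technical challenges is that when a user navigates to a "new" month, the system needs to populate the UI with the previously specified categories and line items. As I've found, these can't simply be taken from the previous month, because the user may not have a budget for that month, and not knowing how far back the database query needs to look makes this a challenge for me.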
Data relationship structure:
Budget - a Budget contains many Categories
- Category - a Category contains many Line Items
-- Line Item
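To make the hierarchy concrete, here is a minimal sketch of it as plain TypeScript shapes, together with a helper that copies a previous budget's categories and line items into a new one. Everything here is hypothetical (`BudgetData`, `CategoryData`, `LineItemData`, `seedBudgetFrom`, and the sample values), and carrying the planned amounts over unchanged is an assumption, not a requirement stated above.

```typescript
// Hypothetical plain-data shapes mirroring Budget -> Category -> Line Item.
interface LineItemData {
  label: string;
  planned: number;
  order: number;
}

interface CategoryData {
  name: string | null;
  type: 'INCOME' | 'EXPENSE';
  order: number;
  lineItems: LineItemData[];
}

interface BudgetData {
  id: number;
  categories: CategoryData[];
}

// Copy a previous budget's categories and line items into a new budget.
// Each category and line item becomes a fresh object, so editing the new
// month does not modify the old one.
function seedBudgetFrom(previous: BudgetData, newBudgetId: number): BudgetData {
  return {
    id: newBudgetId,
    categories: previous.categories.map((category) => ({
      ...category,
      lineItems: category.lineItems.map((item) => ({ ...item })),
    })),
  };
}

// Made-up sample data, for illustration only.
const january: BudgetData = {
  id: 1,
  categories: [
    { name: 'Income', type: 'INCOME', order: 0,
      lineItems: [{ label: 'Paycheck', planned: 3000, order: 0 }] },
    { name: 'Savings', type: 'EXPENSE', order: 1, lineItems: [] },
  ],
};

const february = seedBudgetFrom(january, 2);
console.log(february.categories.map((c) => c.name));
```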
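Given the way the data is structured, I'm concerned that as the dataset grows month after month, any query will take longer and longer. That doesn't seem very scalable, but maybe that's just the nature of working with data. Either way, I'd like to optimize wherever I can.
So how do I write a MySQL query that does this?
It sounds simple, but I don't know MySQL very well. Below is my attempt at it. It works, but it has a number of problems that I'm not sure how to solve:
Here is the table's data: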
+----+--------------+---------+----------+-------+----------------------------+----------------------------+
| id | name | type | budgetId | order | updatedAt | createdAt |
+----+--------------+---------+----------+-------+----------------------------+----------------------------+
| 1 | Income | INCOME | 1 | 0 | 2019-01-25 07:38:22.000000 | 2018-12-29 13:49:29.414187 |
| 2 | NULL | INCOME | 7 | 0 | 2018-12-29 13:49:29.336374 | 2018-11-30 13:49:29.414000 |
| 4 | NULL | EXPENSE | 7 | 0 | 2018-12-29 13:49:29.336374 | 2018-12-29 13:49:29.414187 |
| 5 | Savings | EXPENSE | 7 | 0 | 2018-12-29 13:49:29.336374 | 2018-12-29 13:49:29.414187 |
| 6 | NULL | EXPENSE | 7 | 0 | 2018-12-29 13:49:29.336374 | 2018-12-29 13:49:29.414187 |
| 7 | NULL | EXPENSE | 7 | 0 | 2018-12-29 13:49:29.336374 | 2018-12-29 13:49:29.414187 |
| 8 | NULL | EXPENSE | 7 | 0 | 2018-12-29 13:49:29.336374 | 2018-12-29 13:49:29.414187 |
| 9 | NULL | EXPENSE | 7 | 0 | 2018-12-29 13:49:29.336374 | 2018-12-29 13:49:29.414187 |
| 10 | NULL | EXPENSE | 1 | 0 | 2019-01-15 07:29:30.385994 | 2019-01-15 07:29:30.385994 |
| 61 | NULL | EXPENSE | 1 | 0 | 2019-01-30 08:08:29.829840 | 2019-01-30 08:08:29.829840 |
| 62 | asdfasdf | INCOME | 1 | 0 | 2019-02-02 13:32:13.147986 | 2019-02-02 13:32:13.147986 |
| 63 | kjljlklkj | INCOME | 1 | 0 | 2019-02-02 17:20:21.582486 | 2019-02-02 17:20:21.582486 |
| 67 | asdfasfd | INCOME | 1 | 0 | 2019-02-07 07:33:37.426932 | 2019-02-07 07:33:37.426932 |
| 68 | asdfasdf | INCOME | 1 | 0 | 2019-02-07 07:38:36.467545 | 2019-02-07 07:38:36.467545 |
| 69 | Old Income | INCOME | 8 | 0 | 2019-03-22 12:14:25.117000 | 2018-11-29 12:14:32.211000 |
| 70 | Older Income | INCOME | 9 | 0 | 2019-03-22 12:15:14.681000 | 2018-10-29 12:15:20.969000 |
| 71 | | INCOME | 10 | 0 | 2019-03-22 12:16:00.746000 | 2018-09-29 12:15:38.125000 |
| 72 | NULL | INCOME | 11 | 0 | 2019-03-22 12:17:27.445000 | 2018-08-22 12:16:28.689000 |
| 73 | NULL | INCOME | 12 | 0 | 2019-03-22 12:17:30.825000 | 2018-07-22 12:16:39.544000 |
| 74 | NULL | INCOME | 13 | 0 | 2019-03-22 12:17:29.230000 | 2018-06-22 12:16:45.362000 |
| 75 | NULL | INCOME | 14 | 0 | 2019-03-22 12:17:32.142000 | 2018-05-22 12:16:51.574000 |
| 76 | NULL | INCOME | 15 | 0 | 2019-03-22 12:17:33.269000 | 2018-04-22 12:17:00.142000 |
| 77 | NULL | INCOME | 16 | 0 | 2019-03-22 12:17:34.972000 | 2018-03-22 12:17:22.573000 |
+----+--------------+---------+----------+-------+----------------------------+----------------------------+
23 rows in set (0.00 sec)
First, I worked out the MySQL query I wanted to run, keeping in mind the performance concerns mentioned above. I found an excellent answer on the topic, which led me to this query:
SELECT * FROM category c WHERE c.createdAt BETWEEN (
select c.createdAt from category c order by c.createdAt desc limit 1
) - INTERVAL 30 day AND (
select c.createdAt from category c order by c.createdAt desc limit 1
) AND c.budgetId = (
select c.budgetId from category c order by c.createdAt desc limit 1
) order by c.`order` desc;
which returns this data from the same table:
+----+-----------+---------+----------+-------+----------------------------+----------------------------+
| id | name | type | budgetId | order | updatedAt | createdAt |
+----+-----------+---------+----------+-------+----------------------------+----------------------------+
| 10 | NULL | EXPENSE | 1 | 0 | 2019-01-15 07:29:30.385994 | 2019-01-15 07:29:30.385994 |
| 61 | NULL | EXPENSE | 1 | 0 | 2019-01-30 08:08:29.829840 | 2019-01-30 08:08:29.829840 |
| 62 | asdfasdf | INCOME | 1 | 0 | 2019-02-02 13:32:13.147986 | 2019-02-02 13:32:13.147986 |
| 63 | kjljlklkj | INCOME | 1 | 0 | 2019-02-02 17:20:21.582486 | 2019-02-02 17:20:21.582486 |
| 67 | asdfasfd | INCOME | 1 | 0 | 2019-02-07 07:33:37.426932 | 2019-02-07 07:33:37.426932 |
| 68 | asdfasdf | INCOME | 1 | 0 | 2019-02-07 07:38:36.467545 | 2019-02-07 07:38:36.467545 |
+----+-----------+---------+----------+-------+----------------------------+----------------------------+
6 rows in set (0.00 sec)
Then, drawing on some inspiration I found elsewhere, I translated it into TypeORM:
async getCategoriesForLastEnteredMonth() {
  return await this.categoryRepository.createQueryBuilder('c')
    .select('*')
    .where((qb => {
      const subCreatedAtQuery = qb.subQuery()
        .select('c.createdAt').from(Category, 'c').orderBy('c.createdAt', 'DESC')
        .limit(1).getQuery();
      const subBudgetQuery = qb.subQuery()
        .select('c.budgetId').from(Category, 'c').orderBy('c.createdAt', 'DESC')
        .limit(1).getQuery();
      return `c.createdAt BETWEEN (${subCreatedAtQuery}) - INTERVAL 30 day AND (${subCreatedAtQuery}) AND c.budgetId = (${subBudgetQuery})`;
    }))
    .orderBy('c.`order`', 'ASC').getMany();
}
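That returns the results without line items. It would have been enough, but I wanted to go a step further and fetch the line items as well. So I added a left join at the end: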
async getCategoriesForLastEnteredMonth() {
  return await this.categoryRepository.createQueryBuilder('c')
    .select('c.id')
    .addSelect('c.name')
    .addSelect('c.type')
    .addSelect('c.createdAt')
    .addSelect('c.order')
    .addSelect('c.budgetId')
    .where((qb => {
      const subCreatedAtQuery = qb.subQuery()
        .select('c.createdAt').from(Category, 'c').orderBy('c.createdAt', 'DESC')
        .limit(1).getQuery();
      const subBudgetQuery = qb.subQuery()
        .select('c.budgetId').from(Category, 'c').orderBy('c.createdAt', 'DESC')
        .limit(1).getQuery();
      return `c.createdAt BETWEEN (${subCreatedAtQuery}) - INTERVAL 30 day AND (${subCreatedAtQuery}) AND c.budgetId = (${subBudgetQuery})`;
    }))
    .leftJoinAndSelect('c.lineItems', 'li')
    .orderBy('c.`order`', 'ASC').getMany();
}
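Finally, this produced the results I wanted. The only wrinkle is that I only get the categories I expect for budget ID 1 that fall within a certain date range, like so: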
+----+-----------+---------+----------+-------+----------------------------+----------------------------+
| id | name | type | budgetId | order | updatedAt | createdAt |
+----+-----------+---------+----------+-------+----------------------------+----------------------------+
| 10 | NULL | EXPENSE | 1 | 0 | 2019-01-15 07:29:30.385994 | 2019-01-15 07:29:30.385994 |
| 61 | NULL | EXPENSE | 1 | 0 | 2019-01-30 08:08:29.829840 | 2019-01-30 08:08:29.829840 |
| 62 | asdfasdf | INCOME | 1 | 0 | 2019-02-02 13:32:13.147986 | 2019-02-02 13:32:13.147986 |
| 63 | kjljlklkj | INCOME | 1 | 0 | 2019-02-02 17:20:21.582486 | 2019-02-02 17:20:21.582486 |
| 67 | asdfasfd | INCOME | 1 | 0 | 2019-02-07 07:33:37.426932 | 2019-02-07 07:33:37.426932 |
| 68 | asdfasdf | INCOME | 1 | 0 | 2019-02-07 07:38:36.467545 | 2019-02-07 07:38:36.467545 |
+----+-----------+---------+----------+-------+----------------------------+----------------------------+
6 rows in set (0.00 sec)
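I had to use an INTERVAL of 30 days, which is a bit worrying, because the query becomes brittle as soon as the data spans more than 30 days. I feel it should be more resilient than that, but I'm not sure what to change. For example, I can see a row whose budget ID is also 1, yet it is excluded from the results: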
+----+--------+--------+----------+-------+----------------------------+----------------------------+
| id | name | type | budgetId | order | updatedAt | createdAt |
+----+--------+--------+----------+-------+----------------------------+----------------------------+
| 1 | Income | INCOME | 1 | 0 | 2019-01-25 07:38:22.000000 | 2018-12-29 13:49:29.414187 |
+----+--------+--------+----------+-------+----------------------------+----------------------------+
1 row in set (0.00 sec)
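I don't think this is a major problem, since all of the categories and line items generated for a new month from the query results will be created on the same date. Still, it doesn't feel very robust.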
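To see exactly why that row drops out, here is a small in-memory sketch that mirrors the two filters on a subset of the sample rows. It is illustrative only: `Row`, `withinWindow`, and `sameBudget` are hypothetical names, timestamps are truncated to milliseconds and treated as UTC, and no database is involved.

```typescript
// Subset of the sample rows; only the columns the filters need.
interface Row {
  id: number;
  budgetId: number;
  createdAt: string;
}

const rows: Row[] = [
  { id: 1, budgetId: 1, createdAt: '2018-12-29T13:49:29.414Z' },
  { id: 5, budgetId: 7, createdAt: '2018-12-29T13:49:29.414Z' },
  { id: 10, budgetId: 1, createdAt: '2019-01-15T07:29:30.385Z' },
  { id: 61, budgetId: 1, createdAt: '2019-01-30T08:08:29.829Z' },
  { id: 62, budgetId: 1, createdAt: '2019-02-02T13:32:13.147Z' },
  { id: 63, budgetId: 1, createdAt: '2019-02-02T17:20:21.582Z' },
  { id: 67, budgetId: 1, createdAt: '2019-02-07T07:33:37.426Z' },
  { id: 68, budgetId: 1, createdAt: '2019-02-07T07:38:36.467Z' },
];

const DAY_MS = 24 * 60 * 60 * 1000;
const time = (row: Row): number => Date.parse(row.createdAt);

// Most recently created row; its budgetId drives both filters.
const latest = rows.reduce((a, b) => (time(b) > time(a) ? b : a));

// Mirrors the current query: same budget AND within `days` of the latest row.
function withinWindow(days: number): number[] {
  const cutoff = time(latest) - days * DAY_MS;
  return rows
    .filter((r) => r.budgetId === latest.budgetId
      && time(r) >= cutoff && time(r) <= time(latest))
    .map((r) => r.id);
}

// Alternative: same budget only, with no date window.
function sameBudget(): number[] {
  return rows.filter((r) => r.budgetId === latest.budgetId).map((r) => r.id);
}

console.log(withinWindow(30)); // [10, 61, 62, 63, 67, 68]
console.log(sameBudget());     // [1, 10, 61, 62, 63, 67, 68]
```

The 30-day cutoff works out to 2019-01-08 07:38:36, so id 10 (created 2019-01-15) still qualifies while id 1 (created 2018-12-29) does not. Filtering on budgetId alone keeps both.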
I'm using TypeScript, TypeORM, NestJS with Express, and MySQL for the backend. Here are the entities I'm working with:
Category entity
@Entity()
export class Category extends TrackedAndIdentifiedEntity {
  @ManyToOne(type => Budget, budget => budget.categories)
  budget: Budget;

  @Column({ nullable: true })
  name: string;

  @Column('enum', { enum: CategoryType })
  type: CategoryType;

  @Column('integer', { default: 0 })
  order: number;

  @ManyToMany(type => LineItem, { cascade: true })
  @JoinTable()
  lineItems: LineItem[];
}
Line item entity
@Entity()
export class LineItem extends IdentifiedEntity {
  @Column('integer', { default: 0 })
  order: number;

  @Column()
  label: string;

  @Column('decimal', { precision: 19, scale: 4 })
  planned: number;

  @OneToOne(type => Actual, { cascade: true, onUpdate: 'CASCADE' })
  @JoinColumn()
  actual: Actual;

  _createdAt: moment.Moment;
}
In case they're relevant, here are TrackedAndIdentifiedEntity and IdentifiedEntity:
export abstract class TrackedAndIdentifiedEntity extends IdentifiedEntity {
  @UpdateDateColumn()
  updatedAt: Date;

  @CreateDateColumn()
  createdAt: Date;
}

export abstract class IdentifiedEntity {
  @PrimaryGeneratedColumn({ type: 'bigint' })
  id: number;
}
If there's a better way to do this, I'd appreciate any suggestions. Thanks in advance for your help!