In Django, my models look like this example:
    from django.db import models

    class Currency(models.Model):
        name = models.CharField(max_length=3, unique=True)
        full_name = models.CharField(max_length=20)

    class ExchangeRate(models.Model):
        currency = models.ForeignKey('Currency')
        start_date = models.DateField()
        end_date = models.DateField()
        exchange_rate = models.DecimalField(max_digits=12, decimal_places=4)
Let's simplify and assume we have only one currency, and the ExchangeRate table looks like this:
+---------------------+-------------------+------------+------------+---------------+
| currency_from__name | currency_to__name | start_date | end_date   | exchange_rate |
+---------------------+-------------------+------------+------------+---------------+
| PLN                 | USD               | 2014-03-01 | 2014-08-01 | 3.00000       |
| PLN                 | USD               | 2014-08-01 | 2014-12-01 | 6.00000       |
+---------------------+-------------------+------------+------------+---------------+
Please note that this is an example chosen to keep the math simple!
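In this table the data density is one record per month, and a record is valid for one month, for example start_date = 2014.03.01 and end_date = 2014.04.01, so start_date is inclusive and end_date is exclusive.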
I want to calculate the average exchange rate over a time period.
In Django I wrote:
    from datetime import date
    from django.db.models import Avg, F, Q

    start_date = date(2014, 6, 1)
    end_date = date(2014, 9, 1)

    ExchangeRate.objects.all().filter(
        (
            Q(start_date__lt=start_date) &
            Q(end_date__gt=start_date)
        ) | (
            Q(start_date__gte=start_date) &
            Q(start_date__lt=end_date) &
            Q(end_date__gt=start_date)
        )
    ).annotate(
        currency_from_name = 'currency_from__name',
        currency_to_name = 'currency_to__name'
    ).values(  # GROUP BY
        'currency_from_name',
        'currency_to_name'
    ).aggregate(
        F('currency_from_name'),
        F('currency_to_name'),
        Avg('exchange_rate')
    )
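After this query I receive the value 4.5000, which is correct as a plain arithmetic mean but wrong once the time ranges have to be taken into account.
The correct answer is 4.000.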
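To spell out the arithmetic behind the two numbers: the window from 2014-06-01 (inclusive) to 2014-09-01 (exclusive) covers June and July at a rate of 3 and August at a rate of 6. The plain mean treats the two matching rows equally, while weighting each rate by the number of months it covers in the window gives the expected value. The symbols below are introduced only for this illustration.

```latex
\[
\bar{r}_{\text{plain}} = \frac{3 + 6}{2} = 4.5,
\qquad
\bar{r}_{\text{weighted}} = \frac{2 \cdot 3 + 1 \cdot 6}{2 + 1} = 4
\]
```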
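The only solution I came up with is to annotate an extra column with this formula and then compute the average of that column:
where:
Abs is the absolute-value function, abs()
months is a function that counts the months between two dates, months_between()
greater and smaller are functions that pick the greater or the smaller of their arguments, greatest() and least()
ER denotes a column of ExchangeRate, for example F('exchange_rate')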
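The building blocks described above can be sketched in plain Python to check the expected result. This is only an illustration, not necessarily the exact formula from the question: `months_between` and `weighted_average` are hypothetical helpers written for this sketch (they are not Django or PostgreSQL functions), all dates are assumed to fall on the first day of a month, and each rate is weighted by the number of months it overlaps the requested window.

```python
from datetime import date
from fractions import Fraction


def months_between(later, earlier):
    """Whole calendar months from `earlier` to `later` (first-of-month dates assumed)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def weighted_average(records, start, end):
    """Month-weighted average of (rec_start, rec_end, rate) rows over [start, end)."""
    total = Fraction(0)
    for rec_start, rec_end, rate in records:
        overlap_start = max(rec_start, start)  # plays the role of greatest()
        overlap_end = min(rec_end, end)        # plays the role of least()
        months = months_between(overlap_end, overlap_start)
        if months > 0:  # rows outside the window contribute nothing
            total += Fraction(rate) * months
    return total / months_between(end, start)


records = [
    (date(2014, 3, 1), date(2014, 8, 1), 3),
    (date(2014, 8, 1), date(2014, 12, 1), 6),
]
result = weighted_average(records, date(2014, 6, 1), date(2014, 9, 1))
print(float(result))  # 4.0
```

With the two rows from the table, the first row contributes 2 months at 3 and the second 1 month at 6, so the sketch returns 4 rather than 4.5.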
I use PostgreSQL 9.3 and Django 1.8.4.
Maybe there is a simple function for this?
Maybe I am overcomplicating it?
Answer 0: (score: 3)
months_between():
    create function months_of(interval)
    returns int strict immutable language sql as $$
        select extract(years from $1)::int * 12 + extract(month from $1)::int
    $$;

    create function months_between(date, date)
    returns int strict immutable language sql as $$
        select months_of(age($1, $2))
    $$;
average_weight():
    create function average_weight(numeric, date, date, date, date)
    returns numeric(9,2) strict immutable language sql as $$
        -- cast to numeric so the ratio is not truncated by integer division
        select abs(months_between(GREATEST($2, $4), LEAST($3, $5))::numeric / months_between($4, $5)) * $1
    $$;
AverageWeight:
    from django.db.models import FloatField, Func

    class AverageWeight(Func):
        # Calls the average_weight() SQL function defined above
        function = 'average_weight'

        def __init__(self, *expressions):
            super(AverageWeight, self).__init__(*expressions, output_field=FloatField())
    ExchangeRate.objects.all().filter(
        (
            Q(start_date__lt=start_date) &
            Q(end_date__gt=start_date)
        ) | (
            Q(start_date__gte=start_date) &
            Q(start_date__lt=end_date) &
            Q(end_date__gt=start_date)
        )
    ).annotate(
        currency_from_name = 'currency_from__name',
        currency_to_name = 'currency_to__name',
        weight_exchange = AverageWeight(
            F('exchange_rate'),
            start_date,
            end_date,
            F('start_date'),
            F('end_date'),
        )
    ).values(  # GROUP BY
        'currency_from_name',
        'currency_to_name'
    ).aggregate(
        F('currency_from_name'),
        F('currency_to_name'),
        Avg('weight_exchange')
    )
Answer 1: (score: 2)
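The problem with your application is the way you chose to store the exchange rates. So, to answer your question: yes, you are overcomplicating it.
"Math" tells you the average exchange rate is 4.5 because
    (3 + 6) / 2 == 4.5
As long as both rows match the filter, the system gives you the same value no matter which start date or end date you choose.
To address the root cause, let's try a different approach. (For simplicity, I'll leave out the foreign keys and other details that are not relevant to getting the average over a specific date range; you can add them back later.)
Using this model: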
    class ExchangeRate(models.Model):
        currency1 = models.CharField(max_length=3)
        currency2 = models.CharField(max_length=3)
        start_date = models.DateField()
        exchange_rate = models.DecimalField(max_digits=12, decimal_places=4)
and this data:
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-03-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-04-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-05-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-06-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-07-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-08-01', 6);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-09-01', 6);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-10-01', 6);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-11-01', 6);
we can run this query:
    from django.db.models import Avg
    from datetime import date

    first_date = date(2014, 6, 1)
    last_date = date(2014, 9, 1)

    er.models.ExchangeRate.objects.filter(
        start_date__gte = first_date,
        start_date__lt = last_date
    ).aggregate(Avg('exchange_rate'))
and get this output:
{'exchange_rate__avg': 4.0}
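If the existing data is stored as ranges, as in the question, the one-row-per-month layout can be derived from it. The following is a minimal sketch in plain Python (no Django involved), assuming every range starts on the first day of a month; `month_starts` is a hypothetical helper introduced here, and the final filter only mimics the `start_date__gte` / `start_date__lt` lookups used in the query.

```python
from datetime import date


def month_starts(start, end):
    """Yield the first day of each month in the half-open range [start, end)."""
    current = start
    while current < end:
        yield current
        # Advance by one calendar month, rolling over the year in December.
        if current.month == 12:
            current = date(current.year + 1, 1, 1)
        else:
            current = date(current.year, current.month + 1, 1)


# Ranged records as (start_date, end_date, exchange_rate), end exclusive.
ranged = [
    (date(2014, 3, 1), date(2014, 8, 1), 3),
    (date(2014, 8, 1), date(2014, 12, 1), 6),
]

# One (start_date, exchange_rate) row per month, like the INSERT statements.
monthly = [(m, rate) for s, e, rate in ranged for m in month_starts(s, e)]

first_date, last_date = date(2014, 6, 1), date(2014, 9, 1)
selected = [rate for m, rate in monthly if first_date <= m < last_date]
average = sum(selected) / len(selected)
print(len(monthly), average)  # 9 4.0
```

The expansion produces the same nine monthly rows as the inserted data, and the plain average over June, July and August is 4.0.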
Answer 2: (score: 0)
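You should think of this as a weighted average, so what you want to do is compute the weight of each row and then add the weighted values together.
I don't know enough Django to help you there, but in SQL it would be something like this (I can't test it right now, but I think it gives the right idea):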
    SELECT
        SUM((LEAST(end_date, @end_date) - GREATEST(start_date, @start_date)) * exchange_rate)
            / (@end_date - @start_date) AS weighted_avg
    FROM
        ExchangeRate
    WHERE
        (start_date, end_date) OVERLAPS (@start_date, @end_date)
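This uses the OVERLAPS operator to check whether the periods overlap. I'm not sure whether there is an off-by-one error in the weight calculation, but I think that should be handled in the definition of the input variables (@end_date = @end_date - 1).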
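The same day-based weighting can be checked outside the database. Below is a minimal Python sketch, assuming the question's convention that start_date is inclusive and end_date is exclusive; under that assumption, subtracting two dates already gives the number of days covered, so this sketch applies no off-by-one adjustment. `day_weighted_average` is a hypothetical helper, not part of Django or PostgreSQL.

```python
from datetime import date


def day_weighted_average(records, start, end):
    """Day-weighted average of (rec_start, rec_end, rate) rows over [start, end)."""
    total_days = (end - start).days
    weighted_sum = 0.0
    for rec_start, rec_end, rate in records:
        # Overlap of the row's validity period with the requested window.
        overlap_days = (min(rec_end, end) - max(rec_start, start)).days
        if overlap_days > 0:  # non-overlapping rows are skipped, like OVERLAPS
            weighted_sum += overlap_days * rate
    return weighted_sum / total_days


records = [
    (date(2014, 3, 1), date(2014, 8, 1), 3),
    (date(2014, 8, 1), date(2014, 12, 1), 6),
]
result = day_weighted_average(records, date(2014, 6, 1), date(2014, 9, 1))
print(round(result, 3))  # 4.011
```

Because the weights are days rather than months, the result for the question's data is 369 / 92, about 4.011, instead of exactly 4: June and July contribute 61 days at 3 and August contributes 31 days at 6.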