Calculating the average exchange rate for a time period

Time: 2015-09-02 10:52:41

Tags: django postgresql django-models django-queryset django-aggregation

In Django I have models similar to this example:

class Currency(models.Model):
    name = models.CharField(max_length=3, unique=True)
    full_name = models.CharField(max_length=20)


class ExchangeRate(models.Model):
    currency = models.ForeignKey('Currency')
    start_date = models.DateField()
    end_date = models.DateField()
    exchange_rate = models.DecimalField(max_digits=12, decimal_places=4)

To keep things simple, assume we have only one currency pair and the ExchangeRate table looks like this:

+---------------------+-------------------+------------+------------+---------------+
| currency_from__name | currency_to__name | start_date |  end_date  | exchange_rate |
+---------------------+-------------------+------------+------------+---------------+
|        PLN          |        USD        | 2014-03-01 | 2014-08-01 |    3.00000    |
|        PLN          |        USD        | 2014-08-01 | 2014-12-01 |    6.00000    |
+---------------------+-------------------+------------+------------+---------------+

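Please note that this example is simplified to keep the math easy!

In this table the data density is once per month and a record is valid for one month, for example start_date = 2014.03.01 and end_date = 2014.04.01, so start_date is inclusive and end_date is exclusive.

I want to calculate the average exchange rate for the time period:

[2014.06.01 ; 2014.09.01)

This means: >= 2014.06.01 and < 2014.09.01

In Django I wrote: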

from datetime import date

from django.db.models import Avg, F, Q

start_date = date(2014, 6, 1)
end_date = date(2014, 9, 1)

ExchangeRate.objects.all().filter(
        (
            Q(start_date__lt=start_date) & 
            Q(end_date__gt=start_date)
        ) | (
            Q(start_date__gte=start_date) & 
            Q(start_date__lt=end_date) & 
            Q(end_date__gt=start_date) 
        )
).annotate(
    # annotations must be expressions, so wrap the lookups in F()
    currency_from_name=F('currency_from__name'),
    currency_to_name=F('currency_to__name')
).values(  # GROUP BY
    'currency_from_name',
    'currency_to_name'
).annotate(  # one average per currency pair
    Avg('exchange_rate')
)

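After this query I received the value 4.5000, which is correct as a plain arithmetic mean but wrong once you take the time range into account. The correct answer is 4.000.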
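To see where the two numbers come from, here is a minimal sketch in plain Python (it is not part of the original query). The overlap lengths of 2 and 1 months are worked out by hand from the sample table and the requested period, and the variable names are only illustrative.

```python
from fractions import Fraction

# (rate, months of overlap with [2014-06-01, 2014-09-01)) for the two sample rows.
# The overlap counts are assumptions read off the sample table by hand.
rows = [
    (Fraction(3), 2),  # 2014-03-01..2014-08-01 covers June and July
    (Fraction(6), 1),  # 2014-08-01..2014-12-01 covers August
]

# Plain mean: every row counts the same, regardless of how long it applies.
plain_avg = sum(rate for rate, _ in rows) / len(rows)

# Time-weighted mean: every row counts in proportion to its overlap.
weighted_avg = (
    sum(rate * months for rate, months in rows)
    / sum(months for _, months in rows)
)

print(float(plain_avg))     # 4.5
print(float(weighted_avg))  # 4.0
```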

The only solution I came up with is to annotate an extra column with this formula and then calculate the average of that column:

\[
\left| \frac{\mathrm{months}\left(\mathrm{greater}(ER_{start\_date},\ start\_date),\ \mathrm{smaller}(ER_{end\_date},\ end\_date)\right)}{\mathrm{months}(start\_date,\ end\_date)} \right| \cdot ER_{exchange\_rate}
\]

where:

months(a, b) is the number of months between two dates,
greater(a, b) and smaller(a, b) return the later and the earlier of two dates,
the ER prefix marks a column of the ExchangeRate row,
start_date and end_date without a prefix are the bounds of the requested period.

I use PostgreSQL 9.3 and Django 1.8.4.

Maybe there is a simple function for this?
Maybe I am overcomplicating it?

3 Answers:

Answer 0: (score: 3)

1. months_between()

create function months_of(interval)
 returns int strict immutable language sql as $$
  select extract(years from $1)::int * 12 + extract(month from $1)::int
$$;

create function months_between(date, date)
 returns int strict immutable language sql as $$
   select months_of(age($1, $2))
$$;

2. average_weight():

create function average_weight(numeric, date, date, date, date)
 returns numeric(9,2) strict immutable language sql as $$
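   -- $2, $3: requested period; $4, $5: the row's start_date and end_date
   -- cast to numeric to avoid integer division, and divide by the length of the requested period
   select abs(months_between(GREATEST($2, $4), LEAST($3, $5))::numeric / months_between($2, $3)) * $1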
$$;

3. AverageWeight:

from django.db.models import FloatField, Func

class AverageWeight(Func):
    function = 'average_weight'

    def __init__(self, *expressions):
        super(AverageWeight, self).__init__(*expressions, output_field=FloatField())

In your view:

ExchangeRate.objects.all().filter(
        (
            Q(start_date__lt=start_date) & 
            Q(end_date__gt=start_date)
        ) | (
            Q(start_date__gte=start_date) & 
            Q(start_date__lt=end_date) & 
            Q(end_date__gt=start_date) 
        )
).annotate(
    # annotations must be expressions, so wrap the lookups in F()
    currency_from_name=F('currency_from__name'),
    currency_to_name=F('currency_to__name'),
    weight_exchange=AverageWeight(
        F('exchange_rate'),
        start_date,
        end_date,
        F('start_date'),
        F('end_date'),
    )
).values(  # GROUP BY
    'currency_from_name',
    'currency_to_name'
).annotate(
    # each row's term is already scaled by its share of the period,
    # so the terms are summed (Sum from django.db.models), not averaged
    Sum('weight_exchange')
)
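As a sanity check on the weighting, the formula from the question can be mirrored in plain Python. This is only a sketch under assumptions: `months_between` and `weighted_term` are hypothetical helpers written for this illustration (they are not the SQL functions above), all dates are assumed to fall on the first of a month, and the denominator is the length of the requested period, as in the question's formula.

```python
from datetime import date
from fractions import Fraction


def months_between(a, b):
    """Whole calendar months from a to b (assumes first-of-month dates)."""
    return (b.year - a.year) * 12 + (b.month - a.month)


def weighted_term(rate, period_start, period_end, row_start, row_end):
    """One row's contribution: (overlap months / period months) * rate."""
    overlap = months_between(max(period_start, row_start),
                             min(period_end, row_end))
    overlap = max(overlap, 0)  # rows outside the period contribute nothing
    return Fraction(overlap, months_between(period_start, period_end)) * rate


period_start, period_end = date(2014, 6, 1), date(2014, 9, 1)
rows = [
    (3, date(2014, 3, 1), date(2014, 8, 1)),
    (6, date(2014, 8, 1), date(2014, 12, 1)),
]
terms = [
    weighted_term(rate, period_start, period_end, row_start, row_end)
    for rate, row_start, row_end in rows
]
print([float(t) for t in terms])  # [2.0, 2.0]
print(float(sum(terms)))          # 4.0
```

With the sample data each row contributes 2.0, and the contributions add up to the expected 4.0; averaging them instead would give 2.0.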

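Answer 1: (score: 2)

The problem with your application is the way you chose to store the exchange rates. So, to answer your question: yes, you are overcomplicating it.

The "math" tells you the average exchange rate is 4.5 because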

(3 + 6) /2 == 4.5 

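Whichever start date or end date you choose, the system will give you the same value, as long as both rows match the filter.

To address the root cause, let's try a different approach. (For simplicity, I will leave out the foreign keys and other details that are not relevant to getting the average over a specific date range; you can add them back later.)

With this model: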

class ExchangeRate(models.Model):
    currency1 = models.CharField(max_length=3)
    currency2 = models.CharField(max_length=3)
    start_date = models.DateField()
    exchange_rate = models.DecimalField(max_digits=12, decimal_places=4)

and this data:

INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-03-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-04-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-05-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-06-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-07-01', 3);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-08-01', 6);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-09-01', 6);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-10-01', 6);
INSERT INTO exchange_rate_exchangerate(currency1, currency2, start_date, exchange_rate) VALUES ('PLN', 'USD', '2014-11-01', 6);

we can run this query:

from django.db.models import Avg
from datetime import date

first_date = date(2014, 6, 1)
last_date = date(2014, 9, 1)
er.models.ExchangeRate.objects.filter(
    start_date__gte = first_date,
    start_date__lt = last_date

).aggregate(Avg('exchange_rate'))

and get this output:

{'exchange_rate__avg': 4.0}
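If the existing data is stored as ranges, the monthly rows used above could be derived from it. The following is a rough sketch in plain Python rather than a migration: `next_month` and `expand_monthly` are hypothetical helpers, and it assumes every range starts and ends on the first of a month with an exclusive end date.

```python
from datetime import date
from decimal import Decimal


def next_month(d):
    """First day of the month after d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def expand_monthly(start, end, rate):
    """Yield (month_start, rate) for each month in the half-open range [start, end)."""
    current = start
    while current < end:
        yield current, rate
        current = next_month(current)


# The two range rows from the question's sample table.
ranges = [
    (date(2014, 3, 1), date(2014, 8, 1), Decimal("3")),
    (date(2014, 8, 1), date(2014, 12, 1), Decimal("6")),
]
monthly = [row for r in ranges for row in expand_monthly(*r)]

# Same selection as the queryset filter: first_date <= start_date < last_date.
first_date, last_date = date(2014, 6, 1), date(2014, 9, 1)
selected = [rate for start, rate in monthly if first_date <= start < last_date]
average = sum(selected) / len(selected)

print(len(monthly))  # 9, one row per INSERT statement above
print(average)
```

Once every row covers exactly one month, a plain average over the selected rows is already weighted by time, which is why the simple `Avg` aggregate is enough in this layout.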

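Answer 2: (score: 0)

You should treat this as a weighted average, so what you want to do is compute the weighted contribution of each row and then add them together.

I don't know Django well enough to help you there, but in SQL it would be something like this (I can't test it right now, but I think it gives the right idea):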

SELECT SUM((LEAST(end_date, @end_date) - GREATEST(start_date, @start_date)) * exchange_rate) / (@end_date - @start_date) AS weighted_avg
FROM 
  ExchangeRate
WHERE
  (start_date, end_date) OVERLAPS (@start_date, @end_date)

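This uses the OVERLAPS operator to check whether the periods overlap. I'm not sure whether there is an off-by-one error in the weight calculation, but I think that should be handled in the definition of the input variables (@end_date = @end_date - 1).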
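The same day-based weighting can be sketched in plain Python to check the arithmetic. This is an illustration under assumptions, not a tested equivalent of the SQL: `day_weighted_average` is a hypothetical helper, and both the rows and the requested period are treated as half-open intervals (start inclusive, end exclusive), in which case subtracting dates needs no extra one-day adjustment.

```python
from datetime import date


def day_weighted_average(rows, period_start, period_end):
    """Average rate over [period_start, period_end), weighted by overlapping days."""
    total = 0.0
    for row_start, row_end, rate in rows:
        # Days shared by the row and the period; zero or negative means no overlap.
        overlap_days = (min(row_end, period_end) - max(row_start, period_start)).days
        if overlap_days > 0:
            total += overlap_days * rate
    return total / (period_end - period_start).days


rows = [
    (date(2014, 3, 1), date(2014, 8, 1), 3.0),
    (date(2014, 8, 1), date(2014, 12, 1), 6.0),
]
result = day_weighted_average(rows, date(2014, 6, 1), date(2014, 9, 1))
print(round(result, 4))  # 4.0109
```

Weighting by days gives (61 * 3 + 31 * 6) / 92, roughly 4.0109, rather than exactly 4.0, because June, July and August do not have the same number of days. Whether months or days are the right unit depends on how the rates are meant to be applied.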