Complex queries with Django QuerySets

Time: 2019-04-12 01:26:36

Tags: python django sqlite django-queryset

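I'm working on a personal project and am trying to write a complex query that:

  1. Gets every device that belongs to a certain user

  2. Gets every sensor that belongs to each of the user's devices

  3. Gets the last recorded value and timestamp for each of those sensors.

I'm using SQLite and managed to write the query as plain SQL, but for the life of me I can't figure out how to do it in Django. I've looked at other questions and gone through the documentation, to no avail.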

My models:

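from django.contrib.auth.models import AbstractBaseUser
from django.db import models

# Note: Django 2.x requires on_delete on ForeignKey and max_length on CharField.
# The original post omitted them, so CASCADE and 100 below are assumed values.

class User(AbstractBaseUser):
    email = models.EmailField()

class Device(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    name = models.CharField(max_length=100)

class Unit(models.Model):
    name = models.CharField(max_length=100)

class SensorType(models.Model):
    name = models.CharField(max_length=100)
    unit = models.ForeignKey(Unit, on_delete=models.CASCADE)

class Sensor(models.Model):
    gpio_port = models.IntegerField()
    device = models.ForeignKey(Device, on_delete=models.CASCADE)
    sensor_type = models.ForeignKey(SensorType, on_delete=models.CASCADE)

class SensorData(models.Model):
    sensor = models.ForeignKey(Sensor, on_delete=models.CASCADE)
    value = models.FloatField()
    timestamp = models.DateTimeField()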

Here is the SQL query:

SELECT acc.email, 
           dev.name as device_name, 
           stype.name as sensor_type,
           sen.gpio_port as sensor_port,
           sdata.value as sensor_latest_value, 
           unit.name as sensor_units, 
           sdata.latest as value_received_on
FROM devices_device as dev
INNER JOIN accounts_user  as acc on dev.user_id = acc.id
INNER JOIN devices_sensor  as sen on sen.device_id = dev.id
INNER JOIN devices_sensortype as stype on stype.id = sen.sensor_type_id
INNER JOIN devices_unit as unit on unit.id = stype.unit_id
LEFT JOIN (
            SELECT MAX(sd.timestamp) latest, sd.value, sensor_id
            FROM devices_sensordata as sd
            INNER JOIN devices_sensor as s ON s.id = sd.sensor_id
        GROUP BY sd.sensor_id) as sdata on sdata.sensor_id= sen.id
WHERE acc.id = 1
ORDER BY dev.id
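
A note on the subquery above: it selects sd.value next to MAX(sd.timestamp) without aggregating it. SQLite documents that, in this situation, the bare column takes its value from the row that holds the maximum, which is why the query returns the latest value here; other databases may reject the same query. A minimal sketch using Python's sqlite3 module, with a simplified hypothetical sensordata table rather than the real devices_sensordata schema:

```python
import sqlite3

# In-memory database with a simplified, hypothetical sensordata table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sensordata (
        id INTEGER PRIMARY KEY,
        sensor_id INTEGER,
        value REAL,
        timestamp TEXT
    );
    INSERT INTO sensordata (id, sensor_id, value, timestamp) VALUES
        (1, 1, 2, '2019-04-12'),
        (2, 1, 5, '2019-04-11'),
        (3, 2, 3, '2019-04-11');
""")

# "value" is a bare column: SQLite takes it from the row with MAX(timestamp).
latest_rows = conn.execute("""
    SELECT MAX(timestamp) AS latest, value, sensor_id
    FROM sensordata
    GROUP BY sensor_id
    ORDER BY sensor_id
""").fetchall()

print(latest_rows)
```

This behaviour is specific to SQLite, so it is probably not something the Django ORM will reproduce when translating the query.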

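I've been playing around in the Django shell, looking for a way to implement this query with the QuerySet API, but I can't figure it out...

The closest I've managed to get is this: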

>>> sub = SensorData.objects.values('sensor_id', 'value').filter(sensor_id=OuterRef('pk')).order_by('-timestamp')[:1]
>>> Sensor.objects.annotate(data_id=Subquery(sub.values('sensor_id'))).filter(id=F('data_id')).values(...)

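But it has two problems:

  1. It doesn't include sensors that don't have any values in SensorData yet
  2. If I include the SensorData value field in .values(), I start getting previously recorded sensor values as well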

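If someone could show me how to do this, or at least tell me what I'm doing wrong, I would be very grateful!

Thanks!

P.S. Please excuse my grammar and spelling mistakes; I'm writing this in the middle of the night and I'm tired.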

EDIT: Based on the answers, I should clarify: I only need the latest sensor value for each sensor. For example, say sensordata contains:

id | sensor_id | value | timestamp   |
1  | 1         | 2     | <today>     |
2  | 1         | 5     | <yesterday> |
3  | 2         | 3     | <yesterday> |

For each sensor_id, only the latest row should be returned:

id | sensor_id | value | timestamp   |
1  | 1         | 2     | <today>     |
3  | 2         | 3     | <yesterday> |

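Or, if a sensor doesn't have any data in this table yet, I expect the query to return its record with "null" for the value and timestamp (basically the LEFT JOIN in my SQL query).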

EDIT2:

Based on @ivissani's answer, I managed to produce this:

>>> latest_sensor_data = Sensor.objects.annotate(is_latest=~Exists(SensorData.objects.filter(sensor=OuterRef('id'),timestamp__gt=OuterRef('sensordata__timestamp')))).filter(is_latest=True)
>>> user_devices = latest_sensor_data.filter(device__user=1)
>>> for x in user_devices.values_list('device__name','sensor_type__name', 'gpio_port','sensordata__value', 'sensor_type__unit__name', 'sensordata__timestamp').order_by('device__name'):
...     print(x)

It seems to do the job.

Here is the SQL it generates:

SELECT
  "devices_device"."name",
  "devices_sensortype"."name",
  "devices_sensor"."gpio_port",
  "devices_sensordata"."value",
  "devices_unit"."name",
  "devices_sensordata"."timestamp"
FROM
  "devices_sensor"
  LEFT OUTER JOIN "devices_sensordata" ON (
    "devices_sensor"."id" = "devices_sensordata"."sensor_id"
  )
  INNER JOIN "devices_device" ON (
    "devices_sensor"."device_id" = "devices_device"."id"
  )
  INNER JOIN "devices_sensortype" ON (
    "devices_sensor"."sensor_type_id" = "devices_sensortype"."id"
  )
  INNER JOIN "devices_unit" ON (
    "devices_sensortype"."unit_id" = "devices_unit"."id"
  )
WHERE
  (
    NOT EXISTS(
      SELECT
        U0."id",
        U0."sensor_id",
        U0."value",
        U0."timestamp"
      FROM
        "devices_sensordata" U0
      WHERE
        (
          U0."sensor_id" = ("devices_sensor"."id")
          AND U0."timestamp" > ("devices_sensordata"."timestamp")
        )
    ) = True
    AND "devices_device"."user_id" = 1
  )
ORDER BY
  "devices_device"."name" ASC

4 Answers:

Answer 0: (score: 0)

For this kind of query I strongly recommend using Q objects; here is the documentation: https://docs.djangoproject.com/en/2.2/topics/db/queries/#complex-lookups-with-q-objects

Answer 1: (score: 0)

Is this what you're after?

One user with multiple devices

device_ids = Device.objects.filter(user=user).values_list("id", flat=True)
SensorData.objects.filter(sensor__device__id__in=device_ids
                          ).values("sensor__device__name", "sensor__sensor_type__name", 
                                   "value","timestamp").order_by("-timestamp")

One device, one user

SensorData.objects.filter(sensor__device__user=user
                          ).values("sensor__device__name", "sensor__sensor_type__name", 
                                   "value", "timestamp").order_by("-timestamp")

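This queryset will:

1. Get every device that belongs to a certain user

2. Get every sensor that belongs to each of the user's devices (it returns each sensor's sensor_type, though: since Sensor has no name field, I return sensor__sensor_type__name)

3. Get all recorded values (ordered by latest timestamp) and timestamps for each of the user's sensors.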
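
Since this queryset returns every reading, newest first, rather than only the latest one per sensor, one option is to reduce the rows in Python. A minimal sketch, assuming rows shaped like the dicts that .values() returns, already ordered by -timestamp, and carrying a sensor_id key (hypothetical here; the querysets above do not select it):

```python
from datetime import datetime


def latest_per_sensor(rows):
    """Keep the first row seen for each sensor_id; rows must be newest first."""
    latest = {}
    for row in rows:
        # setdefault only stores a row if this sensor has not been seen yet.
        latest.setdefault(row["sensor_id"], row)
    return latest


# Hypothetical sample rows, ordered by timestamp descending.
rows = [
    {"sensor_id": 1, "value": 2.0, "timestamp": datetime(2019, 4, 12)},
    {"sensor_id": 1, "value": 5.0, "timestamp": datetime(2019, 4, 11)},
    {"sensor_id": 2, "value": 3.0, "timestamp": datetime(2019, 4, 11)},
]

latest = latest_per_sensor(rows)
print({sensor_id: row["value"] for sensor_id, row in latest.items()})
```

This still loads every reading into memory and leaves out sensors that have no data, so it only suits small tables.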

Update

Try this:

list_data = []
for _id in device_ids:
    # Filter by device: _id is a Device id, not a User id.
    sensor_data = SensorData.objects.filter(sensor__device__id=_id)
    if sensor_data.exists():
        # latest() returns the single newest reading among this device's sensors.
        data = sensor_data.values("sensor__id", "value", "timestamp", "sensor__device__user__id").latest("timestamp")
        list_data.append(data)

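Answer 2: (score: 0)

Executing raw queries with Django is perfectly fine, especially when they are this complex.

If you want to map the results to models, use this: https://docs.djangoproject.com/en/2.2/topics/db/sql/#performing-raw-queries

Otherwise, see: https://docs.djangoproject.com/en/2.2/topics/db/sql/#executing-custom-sql-directly

Note that in both cases Django performs no checks on the query. This means the security of the query is entirely your responsibility, including sanitizing parameters.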
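
On the point about parameters: the usual approach is to pass values as query parameters instead of formatting them into the SQL string. A minimal sketch using the standard-library sqlite3 module and a hypothetical device table. Note that sqlite3 uses ? placeholders, whereas Django's cursor.execute() and Manager.raw() expect %s placeholders together with a params list:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE device (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT);
    INSERT INTO device (id, user_id, name) VALUES
        (1, 1, 'greenhouse'),
        (2, 2, 'garage');
""")


def device_names_for_user(connection, user_id):
    # The driver binds user_id as a value, so it is never parsed as SQL.
    cursor = connection.execute(
        "SELECT name FROM device WHERE user_id = ? ORDER BY id",
        (user_id,),
    )
    return [name for (name,) in cursor]


safe_result = device_names_for_user(conn, 1)
# A malicious string is compared as a plain value and matches nothing.
injection_result = device_names_for_user(conn, "1 OR 1=1")

print(safe_result)
print(injection_result)
```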

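Answer 3: (score: 0)

Actually, your query is rather simple; the only complex part is establishing which SensorData is the latest for each Sensor. I would do it with annotations and an Exists subquery, in the following way: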

latest_data = SensorData.objects.annotate(
    is_latest=~Exists(
        SensorData.objects.filter(sensor=OuterRef('sensor'),
                                  timestamp__gt=OuterRef('timestamp'))
    )
).filter(is_latest=True)

Then it's just a matter of filtering this queryset by user, as follows:

certain_user_latest_data = latest_data.filter(sensor__device__user=certain_user)

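Now, since you want to retrieve the sensors even when they have no data, this query will not suffice: only SensorData instances are retrieved, and the Sensor and Device have to be accessed through fields. Unfortunately, Django does not allow explicit joins through its ORM. Therefore I suggest the following (and let me say that this is far from ideal from a performance perspective).

The idea is to annotate the Sensor queryset with the specific fields (value and timestamp) of each sensor's latest SensorData, in the following way: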

latest_data = SensorData.objects.annotate(
    is_latest=~Exists(
        SensorData.objects.filter(sensor=OuterRef('sensor'),
                                  timestamp__gt=OuterRef('timestamp'))
    )
).filter(is_latest=True, sensor=OuterRef('pk'))

sensors_with_value = Sensor.objects.annotate(
    latest_value=Subquery(latest_data.values('value')),
    latest_value_timestamp=Subquery(latest_data.values('timestamp'))
)  # This will generate two subqueries...

certain_user_sensors = sensors_with_value.filter(device__user=certain_user).select_related('device__user')

If a Sensor doesn't have any SensorData instances, the annotated fields latest_value and latest_value_timestamp will simply be set to None.
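
In SQL terms, each Subquery annotation becomes a correlated scalar subquery in the SELECT list, and a scalar subquery that returns no rows evaluates to NULL, which is where the None comes from. A rough sqlite3 sketch of that shape, assuming simplified hypothetical tables; it is not the exact SQL Django generates, and it picks the latest row with ORDER BY ... LIMIT 1 rather than the Exists filter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sensor (id INTEGER PRIMARY KEY);
    CREATE TABLE sensordata (
        id INTEGER PRIMARY KEY,
        sensor_id INTEGER REFERENCES sensor(id),
        value REAL,
        timestamp TEXT
    );
    -- Sensor 3 has no readings.
    INSERT INTO sensor (id) VALUES (1), (2), (3);
    INSERT INTO sensordata (id, sensor_id, value, timestamp) VALUES
        (1, 1, 2, '2019-04-12'),
        (2, 1, 5, '2019-04-11'),
        (3, 2, 3, '2019-04-11');
""")

# One correlated scalar subquery per annotated field, as in the answer.
sensors_with_value = conn.execute("""
    SELECT s.id,
           (SELECT d.value FROM sensordata AS d
             WHERE d.sensor_id = s.id
             ORDER BY d.timestamp DESC LIMIT 1) AS latest_value,
           (SELECT d.timestamp FROM sensordata AS d
             WHERE d.sensor_id = s.id
             ORDER BY d.timestamp DESC LIMIT 1) AS latest_value_timestamp
    FROM sensor AS s
    ORDER BY s.id
""").fetchall()

for row in sensors_with_value:
    print(row)
```

The two subqueries are evaluated per sensor row, which illustrates the performance caveat mentioned above.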