How to calculate the median price per unit with PERCENTILE_CONT and GROUP BY id

Time: 2019-01-12 00:34:44

Tags: postgresql percentile

I am using Postgres 9.5 and trying to calculate the median and average price per unit with a GROUP BY id. Here is the query in DBFIDDLE.

Here is the data:

 id | price | units
----+-------+-------
  1 |   100 |    15
  1 |    90 |    10
  1 |    50 |     8
  1 |    40 |     8
  1 |    30 |     7
  2 |   110 |    22
  2 |    60 |     8
  2 |    50 |    11

Here is my query using percentile_cont:

SELECT id,
  ceil(avg(price)) as avg_price,
  percentile_cont(0.5) within group (order by price) as median_price,
  ceil( sum (price) / sum (units) ) AS avg_pp_unit,
  ceil( percentile_cont(0.5) within group (order by price)  / 
        percentile_cont(0.5) within group (order by units) ) as median_pp_unit
FROM t
GROUP by id

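I am fairly sure the averages are calculated correctly. Is this the right way to calculate the median price per unit?

The post below suggests that it is correct (although performance is poor), but I am curious whether the division in the median calculation could skew the result.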

Calculating median with PERCENTILE_CONT and grouping

1 answer:

Answer 0: (score: 1)

  

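The median is the value separating the higher half from the lower half of a data sample (a population or a probability distribution). For a data set, it may be thought of as the "middle" value.   https://en.wikipedia.org/wiki/Median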

So, taken across all eight rows, your median price is 55 and the median of units is 9.


I am not sure what you intend "median price per unit" to mean:

        Sort by price                  Sort by units
  id    |   price   |  units |  | id    |  price  |   units  
 -------|-----------|--------|  |-------|---------|---------- 
      1 | 30        |      7 |  |     1 |      30 | 7        
      1 | 40        |      8 |  |     1 |      40 | 8        
      1 | 50        |      8 |  |     1 |      50 | 8        
 >>>  2 | 50        |     11 |  |     2 |      60 | 8    <<<<    
 >>>  2 | 60        |      8 |  |     1 |      90 | 10   <<<<
      1 | 90        |     10 |  |     2 |      50 | 11       
      1 | 100       |     15 |  |     1 |     100 | 15       
      2 | 110       |     22 |  |     2 |     110 | 22       
        |           |        |  |       |         |          
         (50+60)/2                               (8+10)/2 
          55                                        9        

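If the "price" column already holds a unit price, there is no need to divide 55 by 9. If "price" is the order total, you do need to divide by units: 55/9 = 6.11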
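
To see whether the division can skew the result per id, here is a minimal sketch in Python rather than SQL, so the arithmetic can be checked without a database. It assumes "price" is an order total and uses statistics.median as a stand-in for percentile_cont(0.5); the helper name medians_by_id is made up for this illustration. It compares the ratio of the two medians (what the query in the question computes, before ceil) with the median of the per-row price / units ratios.

```python
from statistics import median

# Sample rows from the question: (id, price, units)
rows = [
    (1, 100, 15), (1, 90, 10), (1, 50, 8), (1, 40, 8), (1, 30, 7),
    (2, 110, 22), (2, 60, 8), (2, 50, 11),
]


def medians_by_id(rows):
    """Return {id: (ratio_of_medians, median_of_ratios)} (hypothetical helper)."""
    groups = {}
    for id_, price, units in rows:
        groups.setdefault(id_, []).append((price, units))

    result = {}
    for id_, pairs in groups.items():
        prices = [p for p, _ in pairs]
        units = [u for _, u in pairs]
        # What the question's query does: median(price) / median(units)
        ratio_of_medians = median(prices) / median(units)
        # Alternative: compute price per unit for each row, then take the median
        median_of_ratios = median(p / u for p, u in pairs)
        result[id_] = (ratio_of_medians, median_of_ratios)
    return result


for id_, (rom, mor) in sorted(medians_by_id(rows).items()):
    print(id_, round(rom, 2), round(mor, 2))
# id 1: both values are 6.25
# id 2: ratio of medians is about 5.45, median of ratios is 5.0
```

For id 1 the two approaches happen to agree, but for id 2 they differ, so dividing one median by another is not in general the same as the median of the per-row unit prices. If the per-row version is what you want, the Postgres equivalent would presumably be an expression along the lines of percentile_cont(0.5) within group (order by price::float8 / units), which I have not run against your fiddle.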