我想为每个不同的hadm_id排一行,这是我的查询:
select
DISTINCT (hadm_id) as Admission_ID,
subject_id as Patient_ID,
icustay_id as Care_Unit_ID ,
intime as In_Time ,
outtime Out_Time,
age as Age,
heartrate_max as Heartrate_max,
-- heartrate_min as Heartrate_min,
sysbp_max as Systolic_Blood_Pressure_max,
-- sysbp_min as Systolic_Blood_Pressure_min,
tempc_max as Temperature_max,
-- tempc_min as Temperature_min,
pao2fio2_vent_min as PAO2_FIO2,
urineoutput as Urine_Otput,
-- bun_min as Blood_Urea_Nitrogen_min,
bun_max as Blood_Urea_Nitrogen_max,
wbc_min as White_Blood_Cells_min,
-- wbc_max as White_Blood_Cells_max,
-- potassium_min as Potassium_min,
potassium_max as Potassium_max,
-- sodium_min as Sodium_min,
sodium_max as Sodium_max,
-- bicarbonate_min as Bicarbonate_min,
bicarbonate_max as Bicarbonate_max,
-- bilirubin_min as Bilirubin_min,
bilirubin_max as Bilirubin_max,
mingcs as mingcs, aids as aids,
hem as hem,
mets as mets,
admissiontype as admissiontype,
resprate_max as resprate_max,
pco2 as pco2,
infection as infection,
organ_dysfunction as organ_dysfunction
from sepsiscategories
-- where hadm_id in (select DISTINCT(hadm_id) from sepsiscategories) and subject_id in (select DISTINCT(subject_id) from sepsiscategories)
-- group by
-- hadm_id,subject_id
order by hadm_id asc
limit 100
结果不是我所期望的:
admission_id | patient_id | care_unit_id | in_time | out_time | age | heartrate_max | systolic_blood_pressure_max | temperature_max | pao2_fio2 | urine_otput |> blood_urea_nitrogen_max | white_blood_cells_min | potassium_max | sodium_max | bicarbonate_max | bilirubin_max | mingcs | aids | hem | mets | admissiontype | resprate_max | pco2 | inf
100003 | 54610 | 209281 | 2150-04-17 15:35:42 | 2150-04-19 14:12:52 | 59.91 | 104 | 146 | 36.7777777777778 | | 2580 | 51 | 13.4 | 5.7 | 133 | 21 | 5.5 | 14 | | | | Medical | 21 | 29 | 0 | 0
100006 | 9895 | 291788 | 2108-04-06 15:50:15 | 2108-04-11 15:18:03 | 48.92 | 127 | 149 | 36.5555572509766 | | 2950 | 16 | 9.6 | 4.4 | 131 | 24 | | 15 | | 1 | | Medical | 29 | 37 | 1 | 0
100006 | 9895 | 291788 | 2108-04-06 15:50:15 | 2108-04-11 15:18:03 | 48.92 | 127 | 149 | 36.5555572509766 | | 2950 | 16 | 9.6 | 4.4 | 131 | 24 | | 15 | | 1 | | Medical | 29 | 40 | 1 | 0
100006 | 9895 | 291788 | 2108-04-06 15:50:15 | 2108-04-11 15:18:03 | 48.92 | 127 | 149 | 36.5555572509766 | | 2950 | 16 | 9.6 | 4.4 | 131 | 24 | | 15 | | 1 | | Medical | 29 | 48 | 1 | 0
100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 30 | 0 | 0
100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 35 | 0 | 0
100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 37 | 0 | 0
100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 39 | 0 | 0
100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 40 | 0 | 0
100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 41 | 0 | 0
100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 44 | 0 | 0
100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 45 | 0 | 0
100009 | 533 | 253656 | 2162-05-17 10:18:31 | 2162-05-19 22:05:14 | 60.80 | 92 | 132 | | 126 | 2490 | 13 | 12.3 | 4.9 | 143 | 24 | | 15 | | | | UnscheduledSurgical | 37 | 59 | 0 | 0
100011 | 87977 | 214619 | 2177-08-29 04:52:21 | 2177-09-09 16:56:35 | 21.50 | 142 | 165 | 38.7222222222222 | 308.333333333333 | 5160 | 12 | 7.6 | 4.5 | 144 | 24 | | 15 | | | | Medical | 30 | 33 | 1 | 0
100011 | 87977 | 214619 | 2177-08-29 04:52:21 | 2177-09-09 16:56:35 | 21.50 | 142 | 165 | 38.7222222222222 | 308.333333333333 | 5160 | 12 | 7.6 | 4.5 | 144 | 24 | | 15 | | | | Medical | 30 | 35 | 1 | 0
100011 | 87977 | 214619 | 2177-08-29 04:52:21 | 2177-09-09 16:56:35 | 21.50 | 142 | 165 | 38.7222222222222 | 308.333333333333 | 5160 | 12 | 7.6 | 4.5 | 144 | 24 | | 15 | | | | Medical | 30 | 47 | 1 | 0
我该如何解决。
答案 0 :(得分:0)
SELECT DISTINCT
用于整个数据集,而不是特定的列。因此,查询返回的所有列中所有值的不同组合将获得一条记录。
由于您使用的是Postgres,因此您可能需要尝试DISTINCT ON
。这样就为您在ON子句中定义的每个组中的一条记录 。 ORDER BY
子句确定每个组应返回哪一行。
在您的查询中:
select
distinct on(hadm_id) hadm_id as Admission_ID,
subject_id as Patient_ID,
icustay_id as Care_Unit_ID ,
...
from sepsiscategories
order by hadm_id, intime
limit 100
这将给您每个hadm_id
一条记录,而该记录具有最小的intime
。您需要为order by
子句使用适合您的用例的正确列。