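I need to count how many times a particular string occurs, but when one ID has the same string several times, it should be counted only once. In other words, I need to count the occurrences of each string that are unique per ID. I'm sure this should be simple, but I don't know what I'm doing. Here is my code so far: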
SELECT
RXNAME as Name,
DUPERSID as ID,
COUNT(RXNAME) as Number
FROM
`OmniHealth.PrescriptionsMEPS`
GROUP BY
ID,
Name
ORDER BY
Number
When I run it, everything is reported with a count of 1. Thanks for your help!
Update: dataset: https://storage.googleapis.com/omnihealth/MepsPrescriptionData.csv
Output when running the code above:
Row Name ID Number
1 SUMATRIPTAN 68896102 1
2 IBUPROFEN 65063102 1
3 PENICILLN VK 66179101 1
4 FUROSEMIDE 63217102 1
5 HYSINGLA ER 70373101 1
6 FUROSEMIDE 76090101 1
7 SKELETAL MUSCLE RELAXANTS 78414101 1
8 AMOXICILLIN 69467103 1
9 TRAMADOL HCL 67667101 1
10 PANTOPRAZOLE 60737102 1
11 CARBAMIDE PEROXIDE 6.5% OTIC SOLN 63990104 1
12 PROMETH/COD 68433101 1
13 AZITHROMYCIN 79045102 1
14 METRONIDAZOL 75414101 1
15 DEXILANT 69625101 1
16 TRAMADOL HCL 66890203 1
17 AZITHROMYCIN 73838101 1
18 COLCRYS 63856102 1
19 PERMETHRIN 62103107 1
20 ACETAMINOPHEN TAB 500 MG 62456102 1
Answer 0: (score: 1)
I'm not sure this is what you are asking for, but if you are looking for a distinct count, use the following:
#standardSQL
SELECT
RXNAME AS Name,
COUNT(DISTINCT DUPERSID) AS Number
FROM `OmniHealth.PrescriptionsMEPS`
GROUP BY 1
ORDER BY Number DESC
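To see the effect without touching the real table, here is a minimal sketch that runs the same aggregation over a small inline table. The `sample` rows and the extra `TotalRows` column are made up for illustration and are not taken from the MEPS dataset.

```sql
#standardSQL
-- Hypothetical sample rows, not from the real dataset.
WITH sample AS (
  SELECT 'IBUPROFEN' AS RXNAME, 101 AS DUPERSID UNION ALL
  SELECT 'IBUPROFEN', 101 UNION ALL   -- same ID and drug again
  SELECT 'IBUPROFEN', 102 UNION ALL
  SELECT 'AMOXICILLIN', 101
)
SELECT
  RXNAME AS Name,
  COUNT(DISTINCT DUPERSID) AS Number,  -- each ID counted once per drug
  COUNT(*) AS TotalRows                -- raw row count, for comparison
FROM sample
GROUP BY Name
ORDER BY Number DESC
```

With these rows, IBUPROFEN should come back with Number = 2 and TotalRows = 3, because ID 101 appears twice but is counted once.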
Answer 1: (score: 0)
Try this. You are grouping by different fields; I think you meant to group by RXNAME.
SELECT
RXNAME as Name,
DUPERSID as ID,
COUNT(RXNAME) as Number
FROM
`OmniHealth.PrescriptionsMEPS`
GROUP BY
ID,
RXNAME
ORDER BY
Number
Answer 2: (score: 0)
I think you want:
SELECT DUPERSID as ID, COUNT(DISTINCT RXNAME) as Number
FROM `OmniHealth.PrescriptionsMEPS`
GROUP BY ID
ORDER BY Number;
This assumes that "the same string" means the same value of RXNAME.
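As a rough sketch of what this query returns, the version below runs it over a few inline rows. The `sample` rows are hypothetical and only stand in for the real table.

```sql
#standardSQL
-- Hypothetical sample rows, not from the real dataset.
WITH sample AS (
  SELECT 'IBUPROFEN' AS RXNAME, 101 AS DUPERSID UNION ALL
  SELECT 'IBUPROFEN', 101 UNION ALL   -- repeated drug for the same ID
  SELECT 'AMOXICILLIN', 101 UNION ALL
  SELECT 'IBUPROFEN', 102
)
-- One row per ID, counting each distinct drug name once.
SELECT DUPERSID AS ID, COUNT(DISTINCT RXNAME) AS Number
FROM sample
GROUP BY ID
ORDER BY Number;
```

Here ID 102 should get Number = 1 and ID 101 should get Number = 2, since the repeated IBUPROFEN row for ID 101 is not counted twice.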