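I am using BigQuery to query the Chicago crime data on Google Cloud Platform. I want to count the number of arrests and non-arrests for each crime type. This kind of count is easy in pandas, but it is not obvious to me how to count the values of a boolean column in BigQuery. Can anyone suggest a way to do this?
Data
The Chicago crime dataset is large, so I can't provide a reproducible example here, but the data is easy to preview: Chicago crime data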
My BigQuery query:
SELECT
primary_type,
count(arrest),
COUNTIF(year = 2015) AS arrests_2015,
COUNTIF(year = 2016) AS arrests_2016
FROM
`bigquery-public-data.chicago_crime.crime`
WHERE
arrest = TRUE
AND year IN (2001,
2018)
AND primary_type NOT IN ('OTHER OFFENSE', ' all non-criminal types')
GROUP BY
primary_type,
arrest
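But this query gives me empty output, and I don't know how to make it work.
Goal:
From the Chicago crime table, I want the total number of arrests and non-arrests for each primary type, for data through the end of 2018, excluding OTHER OFFENSE and all non-criminal types.
How can I correct my BigQuery query to get the expected output? Is there a working query that produces it? Any ideas? Thanks
Answer 0 (score: 2)
The query below should work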
#standardSQL
SELECT
primary_type,
COUNT(arrest) arrest_total,
COUNTIF(year = 2015) AS arrests_2015,
COUNTIF(year = 2016) AS arrests_2016
FROM `bigquery-public-data.chicago_crime.crime`
WHERE arrest = TRUE
AND year BETWEEN 2001 AND 2018
AND primary_type NOT IN ('OTHER OFFENSE', ' all non-criminal types')
GROUP BY primary_type, arrest
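I think your problem is in the line below: it selects only the years 2001 and 2018, not all the years in between, so rows from 2015 and 2016 are never included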
AND year IN (2001, 2018)
So you should use the following instead
AND year BETWEEN 2001 AND 2018
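To make the difference concrete, here is a minimal sketch that runs both predicates against a tiny in-memory SQLite table. The table and its rows are made up for illustration and are not the real Chicago data; SQLite is used only because it ships with Python, and `IN` and `BETWEEN` behave the same way for this purpose in BigQuery standard SQL.

```python
import sqlite3

# Hypothetical toy rows (year, arrest) standing in for the real crime table.
rows = [(2001, 1), (2015, 1), (2015, 0), (2016, 1), (2018, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crime (year INTEGER, arrest INTEGER)")
conn.executemany("INSERT INTO crime VALUES (?, ?)", rows)


def years_matched(predicate):
    """Return the sorted distinct years kept by the given WHERE predicate."""
    sql = f"SELECT DISTINCT year FROM crime WHERE {predicate} ORDER BY year"
    return [r[0] for r in conn.execute(sql)]


# IN tests membership in the listed values; BETWEEN tests an inclusive range.
in_years = years_matched("year IN (2001, 2018)")
between_years = years_matched("year BETWEEN 2001 AND 2018")

print(in_years)       # only the two listed years survive
print(between_years)  # every year in the inclusive range survives
```

With the `IN` filter, no 2015 or 2016 rows reach the aggregation, so a `COUNTIF(year = 2015)` column over that result can only be 0.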
In addition, if you want to include non-arrests, you can use the query below
#standardSQL
SELECT
primary_type,
arrest,
COUNT(arrest) arrest_total,
COUNTIF(year = 2015) AS arrests_2015,
COUNTIF(year = 2016) AS arrests_2016
FROM `bigquery-public-data.chicago_crime.crime`
WHERE year BETWEEN 2001 AND 2018
AND primary_type NOT IN ('OTHER OFFENSE', ' all non-criminal types')
GROUP BY primary_type, arrest
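Note: here I removed WHERE arrest = TRUE and added arrest to the SELECT list
Apart from these adjustments, your initial query was correct
If you want one output row per primary_type, you can use the query below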
#standardSQL
SELECT
primary_type,
COUNTIF(arrest) arrests,
COUNTIF(NOT arrest) non_arrests,
COUNT(arrest) arrest_total,
COUNTIF(year = 2015) AS arrests_2015,
COUNTIF(year = 2016) AS arrests_2016
FROM `bigquery-public-data.chicago_crime.crime`
WHERE year BETWEEN 2001 AND 2018
AND primary_type NOT IN ('OTHER OFFENSE', ' all non-criminal types')
GROUP BY primary_type
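The one-row-per-type idea is conditional aggregation: each row is counted into the arrests or the non-arrests column depending on its own arrest flag. The sketch below reproduces that on a small in-memory SQLite table with made-up rows. SQLite has no COUNTIF, so `SUM(CASE ...)` stands in for it here; that substitution is my assumption for the demo, not part of the answer above.

```python
import sqlite3

# Hypothetical toy rows (primary_type, arrest); not real Chicago data.
rows = [
    ("THEFT", 1), ("THEFT", 0), ("THEFT", 0),
    ("ASSAULT", 1), ("ASSAULT", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crime (primary_type TEXT, arrest INTEGER)")
conn.executemany("INSERT INTO crime VALUES (?, ?)", rows)

# SUM(CASE ...) plays the role of BigQuery's COUNTIF(arrest) / COUNTIF(NOT arrest).
sql = """
SELECT
  primary_type,
  SUM(CASE WHEN arrest = 1 THEN 1 ELSE 0 END) AS arrests,
  SUM(CASE WHEN arrest = 0 THEN 1 ELSE 0 END) AS non_arrests,
  COUNT(arrest) AS arrest_total
FROM crime
GROUP BY primary_type
ORDER BY primary_type
"""
result = conn.execute(sql).fetchall()
for row in result:
    print(row)
```

Because the arrest flag is no longer in GROUP BY, each crime type collapses to a single row and the two conditional counts add up to the total.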
Also, you can "expand" the per-year counts (for example, for 2015) as shown below
COUNTIF(year = 2015 AND arrest) AS arrests_2015,
COUNTIF(year = 2015 AND NOT arrest) AS non_arrests_2015,
Is there a programmatic way to count the arrests for each crime type for every year from 2001 to 2018?
#standardSQL
SELECT
primary_type,
year,
COUNTIF(arrest) arrests,
COUNTIF(NOT arrest) non_arrests,
COUNT(arrest) arrest_total
FROM `bigquery-public-data.chicago_crime.crime`
WHERE year BETWEEN 2001 AND 2018
AND primary_type NOT IN ('OTHER OFFENSE', ' all non-criminal types')
GROUP BY primary_type, year
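The query above returns one row per primary_type and year. If a wide layout with one pair of columns per year is preferred, one possible approach is to generate the COUNTIF columns in a script instead of typing them by hand. The helper below is a hypothetical sketch (the name `build_wide_query` is mine, and the primary_type filter is left out for brevity); it only builds the query text and does not run it.

```python
def build_wide_query(first_year, last_year):
    """Build a BigQuery query string with an arrests/non_arrests column pair per year."""
    columns = []
    for year in range(first_year, last_year + 1):
        columns.append(f"  COUNTIF(year = {year} AND arrest) AS arrests_{year}")
        columns.append(f"  COUNTIF(year = {year} AND NOT arrest) AS non_arrests_{year}")
    select_list = ",\n".join(["  primary_type"] + columns)
    return (
        "SELECT\n"
        f"{select_list}\n"
        "FROM `bigquery-public-data.chicago_crime.crime`\n"
        f"WHERE year BETWEEN {first_year} AND {last_year}\n"
        "GROUP BY primary_type"
    )


query = build_wide_query(2001, 2018)
print(query)
```

The printed text can be pasted into the BigQuery console. For most analysis the long format produced by GROUP BY primary_type, year is easier to work with, so this is mainly useful when a fixed report layout is needed.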