I want to collect citation data by application_number, like this:
SELECT p.application_number AS app, COUNT(c.publication_number) AS Citations
FROM `patents-public-data.patents.publications` AS p, UNNEST(citation) AS c
WHERE p.application_number IN ('CN201510747352')
GROUP BY p.application_number
But it does not work. The link is to the patent page. Can anyone help me?
Answer 0: (score: 1)
The following is for BigQuery Standard SQL:
#standardSQL
SELECT
p.application_number AS app,
SUM((SELECT COUNT(publication_number) FROM UNNEST(citation))) AS Citations
FROM `patents-public-data.patents.publications` AS p
WHERE p.application_number IN ('CN-201510747352-A')
GROUP BY p.application_number
with the result
Row app Citations
1 CN-201510747352-A 14
Note: your original query will work if you use CN-201510747352-A instead of CN201510747352:
#standardSQL
SELECT p.application_number AS app, COUNT(c.publication_number) AS Citations
FROM `patents-public-data.patents.publications` AS p,
UNNEST(citation) AS c
WHERE p.application_number IN ('CN-201510747352-A')
GROUP BY p.application_number
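However, I recommend the query I provided. The reason is that with the comma join on UNNEST(citation), a patent application that has no citations at all is not returned in the output, whereas the suggested query returns it with count = 0.
For example, if you comment out the WHERE clause in both queries, the first (the suggested query) returns 76,073,734 applications, while the second (the original query) returns only 29,489,639.
In this particular use case it may not matter much, but it is worth keeping in mind for your next queries.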
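To make the difference concrete, here is a minimal sketch that runs on inline data rather than the public table. The table name `publications` and the values APP-1, APP-2, PUB-A and PUB-B are made up for illustration; only the shape of the `citation` column (an array of structs with a `publication_number` field) is taken from the queries above.

```sql
#standardSQL
-- Hypothetical inline data: APP-1 has two citations, APP-2 has none.
WITH publications AS (
  SELECT 'APP-1' AS application_number,
         [STRUCT('PUB-A' AS publication_number),
          STRUCT('PUB-B' AS publication_number)] AS citation
  UNION ALL
  SELECT 'APP-2', ARRAY<STRUCT<publication_number STRING>>[]
),
-- Correlated subquery: every row of publications survives, even with an empty array.
with_subquery AS (
  SELECT 'subquery' AS method,
         p.application_number AS app,
         SUM((SELECT COUNT(publication_number) FROM UNNEST(p.citation))) AS Citations
  FROM publications AS p
  GROUP BY p.application_number
),
-- Comma join on UNNEST: rows whose array is empty produce no output rows.
with_comma_join AS (
  SELECT 'comma join' AS method,
         p.application_number AS app,
         COUNT(c.publication_number) AS Citations
  FROM publications AS p, UNNEST(p.citation) AS c
  GROUP BY p.application_number
)
SELECT * FROM with_subquery
UNION ALL
SELECT * FROM with_comma_join
ORDER BY method, app
```

On this sample the subquery method should return both APP-1 (2) and APP-2 (0), while the comma join returns only APP-1 (2), because APP-2 has nothing to unnest.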
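Another question: the query returns 14, which differs from the 7 shown on the original website. Is there a mistake?
7 is the correct answer. Counting each cited publication number only once with COUNT(DISTINCT ...) gives it, see below: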
#standardSQL
SELECT
p.application_number AS app,
COUNT(DISTINCT c.publication_number) Citations
FROM `patents-public-data.patents.publications` AS p,
UNNEST(citation) c
WHERE p.application_number IN ('CN-201510747352-A')
GROUP BY p.application_number
with the result
Row app Citations
1 CN-201510747352-A 7
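Note that the DISTINCT query above uses the comma join again, so applications without citations are dropped from its output. If both properties are wanted (distinct counts and zero-citation rows kept), one option is LEFT JOIN UNNEST. The sketch below uses made-up inline data; the assumption that the same citations appear in more than one row for an application is only one possible way a plain COUNT can come out higher than the distinct count, and is not confirmed for the real table.

```sql
#standardSQL
-- Hypothetical inline data: APP-1 appears in two rows with the same citations,
-- APP-2 has an empty citation array.
WITH publications AS (
  SELECT 'APP-1' AS application_number,
         [STRUCT('PUB-A' AS publication_number),
          STRUCT('PUB-B' AS publication_number)] AS citation
  UNION ALL
  SELECT 'APP-1',
         [STRUCT('PUB-A' AS publication_number),
          STRUCT('PUB-B' AS publication_number)]
  UNION ALL
  SELECT 'APP-2', ARRAY<STRUCT<publication_number STRING>>[]
)
SELECT
  p.application_number AS app,
  -- Counts every citation entry, including repeats across rows.
  COUNT(c.publication_number) AS all_citations,
  -- Counts each cited publication number once; NULLs from empty arrays are ignored.
  COUNT(DISTINCT c.publication_number) AS distinct_citations
FROM publications AS p
LEFT JOIN UNNEST(p.citation) AS c  -- keeps rows whose array is empty (c is NULL)
GROUP BY p.application_number
ORDER BY app
```

On this sample, APP-1 should show 4 for all_citations and 2 for distinct_citations, and APP-2 should show 0 for both.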