I want to collect data by application_number, like this. The actual application number is CN 201510747352.
SELECT c.application_number AS Pub, COUNT(p.publication_number) AS CitedBy
FROM `patents-public-data.patents.publications` AS p, UNNEST(citation) AS c
WHERE c.application_number IN ('CN-201510747352-A')
GROUP BY c.application_number
But it doesn't work. The URL below is the patent page. Can anyone help me? https://patents.google.com/patent/CN105233911B/zh?oq=CN201510747352.8
Answer 0 (score: 1)
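My guess is that a patent can only be cited once its status is "Application", so you should use the application/publication number it has in that state rather than the initial number CN-201510747352.
Also, you not only need a distinct count, you also need to avoid counting the same application more than once when it appears with a -A, -B, or similar suffix. That is why the query below uses the REGEXP_EXTRACT function: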
#standardSQL
SELECT
c.publication_number AS Pub,
COUNT(DISTINCT REGEXP_EXTRACT(p.publication_number, r'(.+-.+)-')) AS CitedByCount
FROM `patents-public-data.patents.publications` AS p,
UNNEST(citation) AS c
WHERE c.publication_number LIKE ('CN-105233911%')
GROUP BY c.publication_number
with the result:
Row Pub CitedByCount
1 CN-105233911-A 10
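To see what the REGEXP_EXTRACT step does on its own, here is a minimal sketch that applies the same pattern to a few hard-coded strings. The three publication numbers are illustrative inputs I made up for the demonstration, not rows taken from the dataset.

```sql
#standardSQL
-- Illustrative inputs only; nothing here reads from patents-public-data.
SELECT
  pub,
  -- Keeps everything before the last hyphen, i.e. drops the kind code (-A, -B, -A1, ...)
  REGEXP_EXTRACT(pub, r'(.+-.+)-') AS without_kind_code
FROM UNNEST(['CN-105233911-A', 'CN-105233911-B', 'US-2016123456-A1']) AS pub
```

The -A and -B variants of the same publication should reduce to the same value, which is why COUNT(DISTINCT ...) over the extracted string counts each citing document once.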
...How can I do this if I only have the application data?
#standardSQL
SELECT
c.publication_number AS Pub,
COUNT(DISTINCT REGEXP_EXTRACT(p.publication_number, r'(.+-.+)-')) AS CitedByCount
FROM `patents-public-data.patents.publications` AS p,
UNNEST(citation) AS c
WHERE c.publication_number IN (
SELECT publication_number
FROM `patents-public-data.patents.publications`
WHERE application_number IN ('CN-201510747352-A')
)
GROUP BY c.publication_number