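Count and SQL again
On SQLite, I have the tables authors, inst, papers and writtenby (their definitions are given with the sample data below).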
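inst is a list of institutions: universities and so on. Each row of writtenby gives a paper, an author, and the institution that author was affiliated with at the time. An author may have more than one institution, in which case the pair (paper_id, author_id) is repeated for each institution. For a given author, I want a list of papers.doi, papers.year and the number of colleagues who wrote each paper with him.
The problem I have is probably that if the author I am looking for appears twice for a given paper (because of two institutions), the count is doubled. For one given paper, the expected result is 2890, the number of rows returned by a SELECT DISTINCT author_id on writtenby for that paper. I tried
SELECT papers.doi, papers.year, count(*) as c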
FROM authors
INNER JOIN writtenby ON authors.author_id = writtenby.author_id
INNER JOIN writtenby AS writtenby_1 ON writtenby.paper_id =
writtenby_1.paper_id
INNER JOIN papers on writtenby_1.paper_id = papers.paper_id
WHERE authors.name ='Beck' AND authors.firstname= 'H P'
GROUP BY papers.doi, papers.year
ORDER BY c DESC
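(With my data: 2890 rows.) Without DISTINCT I would get 3023 rows, and the first query above gives a count of 6046. I tried using DISTINCT in the count clause above, but this still does not work properly.
Can I use count in a subquery? Thanks for your help...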
Check query:
SELECT DISTINCT author_id
FROM writtenby
WHERE paper_id = 4593
Sample data:
-- Make the tables
CREATE TABLE 'authors' (name collate nocase, firstname collate nocase, see_id integer, 'author_id' INTEGER PRIMARY KEY NOT NULL );
CREATE TABLE 'inst' ('name' TEXT NOT NULL, 'country' TEXT NOT NULL , 'see_id' INTEGER, 'inst_id' INTEGER PRIMARY KEY NOT NULL );
CREATE TABLE 'papers' ('doi' TEXT NOT NULL,'year' TEXT NOT NULL, 'paper_id' INTEGER PRIMARY KEY NOT NULL );
CREATE TABLE 'writtenby' ('paper_id' INTEGER NOT NULL, 'author_id' INTEGER NOT NULL, 'inst_id' INTEGER NOT NULL, PRIMARY KEY ('paper_id', 'author_id', 'inst_id'));
-- Insert the data
-- authors : 5 names, one with 2 variants
INSERT INTO 'authors' (name, firstname, see_id, author_id) VALUES ('Doe', 'J', 1, 1);
INSERT INTO 'authors' (name, firstname, see_id, author_id) VALUES ('Klein', 'K', 2, 2);
INSERT INTO 'authors' (name, firstname, see_id, author_id) VALUES ('Lang', 'F', 3, 3);
INSERT INTO 'authors' (name, firstname, see_id, author_id) VALUES ('Rue', 'A De La', 6, 4);
INSERT INTO 'authors' (name, firstname, see_id, author_id) VALUES ('La Rue', 'A De', 6, 5);
INSERT INTO 'authors' (name, firstname, see_id, author_id) VALUES ('De La Rue', 'A', 6, 6);
INSERT INTO 'authors' (name, firstname, see_id, author_id) VALUES ('Smith', 'S', 7, 7);
-- inst 4 name, 2 variants
INSERT INTO 'inst' (name, country, see_id, inst_id) VALUES ('Universite de Paris', 'France', 1, 1);
INSERT INTO 'inst' (name, country, see_id, inst_id) VALUES ('Paris University', 'France', 1, 2);
INSERT INTO 'inst' (name, country, see_id, inst_id) VALUES ('Universite de Lyon', 'France', 3, 3);
INSERT INTO 'inst' (name, country, see_id, inst_id) VALUES ('Univ Freiburg', 'Germany', 4, 4);
INSERT INTO 'inst' (name, country, see_id, inst_id) VALUES ('EPFZ', 'Switzerland', 5, 5);
INSERT INTO 'inst' (name, country, see_id, inst_id) VALUES ('Eidg Techn Hochschule', 'Switzerland', 5, 6);
-- papers: 3 papers
INSERT INTO 'papers' (doi, year, paper_id) VALUES ('doi1', '2017', 1);
INSERT INTO 'papers' (doi, year, paper_id) VALUES ('doi2', '2018', 2);
INSERT INTO 'papers' (doi, year, paper_id) VALUES ('doi3', '2018', 3);
-- paper 1: 4 authors
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (1, 6, 1);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (1, 6, 3);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (1, 1, 5);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (1, 2, 4);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (1, 7, 1);
-- paper 2: 3 authors
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (2, 6, 1);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (2, 6, 3);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (2, 1, 5);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (2, 2, 5);
-- paper 3: 3 authors
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (3, 6, 1);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (3, 2, 4);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (3, 6, 3);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (3, 2, 1);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (3, 3, 4);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (3, 3, 5);
INSERT INTO 'writtenby' (paper_id, author_id, inst_id) VALUES (3, 3, 1);
Both queries give wrong results. The first one:
SELECT papers.doi, papers.year, count(*) as c
FROM authors
INNER JOIN writtenby ON authors.author_id = writtenby.author_id
INNER JOIN writtenby AS writtenby_1 ON writtenby.paper_id =
writtenby_1.paper_id
INNER JOIN papers on writtenby_1.paper_id = papers.paper_id
WHERE authors.name ='De La Rue' AND authors.firstname= 'A'
GROUP BY papers.doi, papers.year
ORDER BY c DESC
which returns
doi3|2018|14
doi1|2017|10
doi2|2018|8
The second query:
SELECT p.doi, p.year, COUNT(w2.author_id) AS cnt
FROM authors a
INNER JOIN writtenby w1
ON a.author_id = w1.author_id
INNER JOIN writtenby w2
ON w1.paper_id = w2.paper_id AND w1.author_id <> w2.author_id
INNER JOIN papers p
ON w2.paper_id = p.paper_id
WHERE
a.name = 'De La Rue' AND a.firstname = 'A'
GROUP BY
p.doi, p.year
ORDER BY
cnt DESC;
François
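Answer 0 (score: 1)
One counting problem I see is in your self-join of the writtenby table: you are not checking that the matching rows have a different author_id. If the author_id is the same, you should not count it. Also, what you want to count for the number of shared authors is the author_id of the second writtenby table. Note that with the INNER JOIN used below, a paper on which the given author has no co-authors is left out of the result; it would show a count of zero only if w2 were joined with a LEFT JOIN.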
SELECT p.doi, p.year, COUNT(w2.author_id) AS cnt
FROM authors a
INNER JOIN writtenby w1
ON a.author_id = w1.author_id
INNER JOIN writtenby w2
ON w1.paper_id = w2.paper_id AND w1.author_id <> w2.author_id
INNER JOIN papers p
ON w2.paper_id = p.paper_id
WHERE
a.name = 'Beck' AND a.firstname = 'H P'
GROUP BY
p.doi, p.year
ORDER BY
cnt DESC;
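The row multiplication described above can be reproduced on the sample data from the question. The following is only a minimal sketch using Python's standard sqlite3 module. The Python driver, the helper name counts, and the shortcut of filtering directly on author_id 6 ('De La Rue', 'A') instead of joining authors are assumptions made for illustration.

```python
import sqlite3

# writtenby rows from the question's sample data: (paper_id, author_id, inst_id)
WRITTENBY = [
    (1, 6, 1), (1, 6, 3), (1, 1, 5), (1, 2, 4), (1, 7, 1),
    (2, 6, 1), (2, 6, 3), (2, 1, 5), (2, 2, 5),
    (3, 6, 1), (3, 2, 4), (3, 6, 3), (3, 2, 1), (3, 3, 4), (3, 3, 5), (3, 3, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE writtenby (paper_id INTEGER, author_id INTEGER, inst_id INTEGER)"
)
conn.executemany("INSERT INTO writtenby VALUES (?, ?, ?)", WRITTENBY)


def counts(extra_condition=""):
    """Self-join writtenby for author 6 and count the joined rows per paper.

    extra_condition is appended to the join condition (hypothetical helper).
    """
    sql = f"""
        SELECT w1.paper_id, COUNT(w2.author_id)
        FROM writtenby w1
        JOIN writtenby w2 ON w1.paper_id = w2.paper_id {extra_condition}
        WHERE w1.author_id = 6
        GROUP BY w1.paper_id
        ORDER BY w1.paper_id
    """
    return conn.execute(sql).fetchall()


plain = counts()                                      # no self-match filter
filtered = counts("AND w1.author_id <> w2.author_id")  # self-matches removed

print(plain)     # [(1, 10), (2, 8), (3, 14)]
print(filtered)  # [(1, 6), (2, 4), (3, 10)]
```

Author 6 has two institution rows per paper, so every matching row is counted twice. The <> filter removes the self-matches, but co-authors are still counted once per institution row on both sides of the join.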
Answer 1 (score: 0)
With the help of Tim Biegeleisen and the sample data, I found that the DISTINCT keyword was missing from the count:
SELECT p.doi, p.year, COUNT(DISTINCT w2.author_id) AS cnt
FROM authors a
INNER JOIN writtenby w1
ON a.author_id = w1.author_id
INNER JOIN writtenby w2
ON w1.paper_id = w2.paper_id
INNER JOIN papers p
ON w2.paper_id = p.paper_id
WHERE
a.name = 'De La Rue' AND a.firstname = 'A'
GROUP BY
p.doi, p.year
ORDER BY
cnt DESC;
This gives the total number of authors on each paper:
doi1 2017 4
doi2 2018 3
doi3 2018 3
With the clause w1.author_id <> w2.author_id added to the join, each count is reduced by one, which gives the number of co-authors.
F.
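Combining both answers, the co-author count can be sketched as below, again with Python's sqlite3 module as an assumed driver. Two things are not in the original thread and are hypothetical: the paper doi4, written by author 6 alone, is added only to show the zero case, and the query filters on author_id 6 directly rather than on the author's name. The second query is one possible way to use count in a subquery, as asked in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE papers (doi TEXT, year TEXT, paper_id INTEGER PRIMARY KEY);
    CREATE TABLE writtenby (paper_id INTEGER, author_id INTEGER, inst_id INTEGER);
    INSERT INTO papers VALUES ('doi1', '2017', 1), ('doi2', '2018', 2),
                              ('doi3', '2018', 3);
    INSERT INTO writtenby VALUES
        (1, 6, 1), (1, 6, 3), (1, 1, 5), (1, 2, 4), (1, 7, 1),
        (2, 6, 1), (2, 6, 3), (2, 1, 5), (2, 2, 5),
        (3, 6, 1), (3, 2, 4), (3, 6, 3), (3, 2, 1),
        (3, 3, 4), (3, 3, 5), (3, 3, 1);
    -- Hypothetical extra paper with a single author (not in the question)
    INSERT INTO papers VALUES ('doi4', '2019', 4);
    INSERT INTO writtenby VALUES (4, 6, 1);
""")

# Join variant: DISTINCT removes institution duplicates, <> removes the
# author himself, LEFT JOIN keeps papers without any co-author.
JOIN_SQL = """
    SELECT p.doi, p.year, COUNT(DISTINCT w2.author_id) AS cnt
    FROM writtenby w1
    JOIN papers p ON p.paper_id = w1.paper_id
    LEFT JOIN writtenby w2
        ON w2.paper_id = w1.paper_id AND w2.author_id <> w1.author_id
    WHERE w1.author_id = ?
    GROUP BY p.doi, p.year
    ORDER BY cnt DESC, p.doi
"""

# Subquery variant: count the distinct other authors per paper.
SUBQUERY_SQL = """
    SELECT p.doi, p.year,
           (SELECT COUNT(DISTINCT w2.author_id)
            FROM writtenby w2
            WHERE w2.paper_id = p.paper_id AND w2.author_id <> ?) AS cnt
    FROM papers p
    WHERE p.paper_id IN (SELECT paper_id FROM writtenby WHERE author_id = ?)
    ORDER BY cnt DESC, p.doi
"""

rows = conn.execute(JOIN_SQL, (6,)).fetchall()
rows_sub = conn.execute(SUBQUERY_SQL, (6, 6)).fetchall()

for row in rows:
    print(row)
# ('doi1', '2017', 3)
# ('doi2', '2018', 2)
# ('doi3', '2018', 2)
# ('doi4', '2019', 0)
```

Both variants return the same rows on this data. The subquery form avoids the self-join multiplication altogether, since each paper is visited once.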