Python sqlite3: select the minimum from a group, with an exception

Time: 2016-07-01 17:24:03

Tags: python sqlite

I'm using sqlite3 in Python to retrieve data from an SQLite table named "Documents" with the fields "Title,Date,Author". For each unique title (group), I want to select the minimum Date together with the Author for that date, unless the author's name is 'foo', in which case I'd like to select the next-earliest author who is not 'foo' while still keeping the earliest date. If every author is 'foo', then returning 'foo' is fine.

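My previous query was "SELECT Title,min(Date),Author FROM Documents GROUP BY Title", which doesn't meet the last requirement, because it simply selects the author of the minimum date regardless of whether that author is 'foo'.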

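I was thinking of using create_aggregate to create an aggregate function that simply filters out 'foo', but I'm not sure how to make sure I get the next-earliest author. A subquery or a CASE expression might also be easier, but I'm not very familiar with those.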

How can I accomplish this?

1 answer:

Answer 0: (score: 2)

If you don't mind using two queries and some Python to do the job, this should work as expected:

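# assumes `cursor` is an open sqlite3 cursor on the database containing "Documents"
# first query to get the min date of each "Title"
query = "SELECT Title, MIN(Date) FROM Documents GROUP BY Title"
min_date_by_title = cursor.execute(query).fetchall()

# then get the earliest author for each "Title", except if it's "foo"
# (SQLite takes the bare Author column from the row that holds MIN(Date))
query = "SELECT Title, Author, MIN(Date) FROM Documents WHERE Author != 'foo' GROUP BY Title"
author_by_title = [(title, author) for title, author, _ in cursor.execute(query).fetchall()]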

# last step: match entries one by one of the two previous results
final_result = []
for title1, date in min_date_by_title:
    for title2, author in author_by_title:
        if title1 == title2:  # same title
            final_result.append([title1, date, author])
            break
    else:  # if we didn't find any match, it means that the only author for this title was 'foo'
        final_result.append([title1, date, 'foo'])

Performance can be improved by using a dictionary (keyed by the distinct titles) to avoid the inner loop.
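
A minimal sketch of that dictionary-based variant follows. The helper name `earliest_by_title`, the in-memory database, and the sample rows are made up for illustration and are not part of the original question. It also relies on SQLite-specific behaviour: when a query contains a single MIN() aggregate, a bare column such as Author is taken from the row that holds the minimum.

```python
import sqlite3


def earliest_by_title(conn):
    """Return {title: (min_date, author)}, preferring the earliest non-'foo' author."""
    # Earliest date per title, regardless of who the author is.
    min_date = dict(conn.execute(
        "SELECT Title, MIN(Date) FROM Documents GROUP BY Title"))

    # Earliest non-'foo' author per title. SQLite fills the bare Author
    # column from the row that holds MIN(Date) (SQLite-specific behaviour).
    author = {
        title: name
        for title, name, _ in conn.execute(
            "SELECT Title, Author, MIN(Date) FROM Documents "
            "WHERE Author != 'foo' GROUP BY Title")
    }

    # Titles missing from `author` were written only by 'foo'.
    return {t: (d, author.get(t, 'foo')) for t, d in min_date.items()}


# Hypothetical sample data, used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Documents (Title TEXT, Date TEXT, Author TEXT)")
conn.executemany(
    "INSERT INTO Documents VALUES (?, ?, ?)",
    [
        ("A", "2016-01-01", "foo"),
        ("A", "2016-02-01", "bar"),
        ("A", "2016-03-01", "baz"),
        ("B", "2016-01-05", "foo"),
        ("C", "2016-04-01", "qux"),
        ("C", "2016-05-01", "foo"),
    ],
)
result = earliest_by_title(conn)
```

With both results held in dictionaries, each title is resolved with a single lookup instead of a scan over the second result list, so the merge step is linear in the number of titles.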