Question

I want to get all users that have 3 or more records in the DB, but those records must not fall on the same date. Example table:

User1 40 12/11/15
User1 33 13/11/16
User1 23 04/09/16
User2 21 30/09/16
User3 12 12/11/16
User3 54 12/11/16
User3 99 04/09/16

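So from this I only want to get User1. User3 has 3 records, but since 2 of them are on the same day, he does not qualify. (It is fine if two or more records share a day, as long as there are records on at least 3 different dates, so if we added a record for User3 on another date, he would meet the requirement too.)

This is my current query: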

SELECT id, price, date
FROM Table
WHERE id
IN (

SELECT id
FROM Table
GROUP BY id
HAVING COUNT( ocust ) >=3
)

It requires 3 or more records, but I don't know how to handle the different-dates requirement.

Answer 1

You just need to add DISTINCT to the COUNT and change ocust (whatever that is) to date:

SELECT id, price, date
FROM Table
WHERE id IN (
    SELECT id
    FROM Table
    GROUP BY id
    HAVING COUNT(DISTINCT date) >= 3  -- Added DISTINCT, changed 'ocust' to 'date'
)
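To check the COUNT(DISTINCT date) approach against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table name purchases, the column types, and the extra User3 row are assumptions made for illustration only (the question's real table name and schema are not given, and Table itself is a reserved word in SQLite).

```python
import sqlite3

# Sample rows from the question as (id, price, date).
rows = [
    ("User1", 40, "12/11/15"),
    ("User1", 33, "13/11/16"),
    ("User1", 23, "04/09/16"),
    ("User2", 21, "30/09/16"),
    ("User3", 12, "12/11/16"),
    ("User3", 54, "12/11/16"),
    ("User3", 99, "04/09/16"),
]

# Hypothetical table name and schema; the original only calls it "Table".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (id TEXT, price INTEGER, date TEXT)")
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)

QUERY = """
SELECT id, price, date
FROM purchases
WHERE id IN (
    SELECT id
    FROM purchases
    GROUP BY id
    HAVING COUNT(DISTINCT date) >= 3
)
"""

def qualifying_users(connection):
    """Return the sorted ids of users with records on at least 3 distinct dates."""
    result = connection.execute(QUERY).fetchall()
    return sorted({row[0] for row in result})

# User3 has 3 records but only 2 distinct dates, so only User1 qualifies.
before = qualifying_users(conn)

# Adding a record for User3 on a third date (made-up price and date)
# should make User3 qualify as well, as the question describes.
conn.execute("INSERT INTO purchases VALUES ('User3', 10, '01/01/17')")
after = qualifying_users(conn)

print(before)
print(after)
```

The dates are stored as plain text here, which is enough for DISTINCT to tell them apart. If the real column holds a timestamp with a time component, it would probably need to be truncated to the day before counting.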
